The BeFree Blog

2015 has brought numerous exciting changes to FalconStor. As an early pioneer and industry leader in innovative, software-defined storage solutions, we often have thoughts and expertise we would like to share. Here at FalconStor, we strive to give IT organizations and customers solutions that provide the flexibility to BE FREE. Our newest platform, FreeStor, is all about delivering the freedom and flexibility to manage storage sprawl and truly unify heterogeneous storage infrastructures. We also like to offer thought-provoking and alternative views on storage challenges, infrastructure, and the industry itself. Check back often for our latest posts, and BE FREE to share your own thoughts and comments. After all, ideas spark other ideas, and community discussion shapes cultures. Let’s share and learn together. | Sincerely – Gary Quinn – CEO

By Pete McCallum – Director, Data Center Solutions Architecture - FalconStor
14 Oct 2015

There was a time, not so long ago, when a storage administrator actually had to know something in order to do their job. There was no automation, auto-tiering, virtualization, API sets, QoS, or analytics; all we had were metaLUNs, concatenated metaLUNs, extent management, and RAID sets. We used to sit at the same lunch table as the UNIX guys who had to write code to open a text editor, and who had never EVER used a mouse.

Yes, it used to be that when something broke or started to slow down, we would fix it by actually going into a console or shell and typing some magic commands to save the day.

These days, managing storage is a very different proposition. We have awesome capabilities emerging from software-defined platform stacks: IO-path QoS, hypervisor- and cloud-agnostic protocols, and scale-out metadata, just to name a few. With all of these advancements, one would tend to think the days of the storage administrator have gone away. And I would tend to agree, to some extent.

No longer is the storage administrator really concerned with fine-grained volume management and provisioning. Today, storage performance almost manages itself. Thin provisioning is less about capacity optimization and more about data mobility. And there is almost as much data about our data as there is data.

In some ways we have converted storage administration into air-traffic control: finding optimal data paths and managing congestion of IO as things scale beyond reason. This is where analytics really comes into play.

In all aspects of IT, administration is taking a back seat to business integration, where knowing what has happened (reporting), combined with what is happening now (monitoring), starts to generate knowledge (analytics) about what is happening in the business. When we add predictive analytics, we add the ability to make not only technology decisions but, ostensibly, business decisions too, which can make a huge difference in meeting market demands and avoiding pitfalls. This moves IT (as well as storage) out of reactive mode and into proactive mode, which is the number one benefit of predictive analytics.

Let’s see how this applies to a business-IT arrangement through a real-world example: month-end close-out of the books at a large company. In the past, an IT department would provide infrastructure sized for the worst-case performance scenario: despite a 3,000 IOPS requirement for 27 days of the month, the 35,000 IOPS month-end churn (lasting about eight hours) pushed the purchase toward an all-flash array at 4x the cost of spinning disk. Because the volumes require a tremendous amount of “swing space” as journals fill and flush, reporting is run against copies of data, and Hadoop clusters scale up to analyze the data sets, almost a PB of storage capacity is required to support 200TB of actual production data. All of this is thick-provisioned across two data centers for redundancy and performance in case of a problem or emergency.

Most of this data would be made available to the business through reporting and monitoring, which would allow an IT architect to decide on a storage and server platform that could handle this kind of load. Manual or semi-manual analysis of many different consoles and systems would merge the data (perhaps into a spreadsheet), where we would find that (apologies if my math is off a little; a rough back-of-the-envelope sketch follows this list):

  • 30% of all data load happens on one day of the month.
  • 90% of the “storage sprawl” is used for other than production data. Of the remaining 10% used for production data, perhaps 2% of that space actually requires the performance.
  • Cost/TB/IOPS is skewed to fit 10% of the capacity (or 0.2% for real!), and 30% of the total load, at 8-20x the cost.
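
As a rough illustration of the spreadsheet math behind these bullets, here is a minimal Python sketch using the figures from the example above. The day and hour splits, and the 1 PB provisioned figure, are assumptions for illustration; the exact percentages it prints will vary with those assumptions and will not match the bullets precisely, but the skew between peak and steady-state, and between production data and provisioned capacity, is visible either way.

```python
# Back-of-the-envelope check of the month-end example above.
# The day/hour splits and provisioned capacity are illustrative assumptions.

STEADY_IOPS = 3_000        # requirement for the other ~27 days
MONTH_END_IOPS = 35_000    # month-end churn
MONTH_END_HOURS = 8        # duration of the month-end spike
DAYS_IN_MONTH = 28         # 27 steady days + 1 month-end day (assumed)

PRODUCTION_TB = 200        # actual production data
PROVISIONED_TB = 1_000     # "almost a PB" provisioned to support it

# Total IO "work" expressed in IOPS-hours.
steady_load = STEADY_IOPS * 24 * (DAYS_IN_MONTH - 1)
month_end_day = STEADY_IOPS * (24 - MONTH_END_HOURS) + MONTH_END_IOPS * MONTH_END_HOURS
total_load = steady_load + month_end_day

print(f"Share of monthly IO on the month-end day: {month_end_day / total_load:.0%}")
print(f"Share of capacity holding production data: {PRODUCTION_TB / PROVISIONED_TB:.0%}")
print(f"Peak-to-steady IOPS ratio: {MONTH_END_IOPS / STEADY_IOPS:.1f}x")
```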

There are far more data correlations that can be made, and many are obviously actionable and meaningful to the business. For example, one could:

  • Right-size the performance load to the actual requirements of the dataset, rather than incurring tremendous expense to meet the worst-case scenario.
  • Manually shift storage performance tiers prior to month-end (or automatically, if the storage platform allows); a sketch of such a scheduled shift follows this list.
  • Thin provision or use non-volatile, mountable snapshots for handling data mining and “copy data” to reduce storage sprawl.
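
To make the second bullet concrete, here is a hypothetical, calendar-driven tier-shift policy. The `promote_volume` and `demote_volume` calls are placeholders for whatever tiering mechanism a given SDS platform exposes; they are not FreeStor APIs, and the volume names and thresholds are invented for the example.

```python
from datetime import date, timedelta

# Hypothetical placeholders for whatever tiering mechanism an SDS platform exposes.
def promote_volume(volume: str, tier: str) -> None:
    print(f"promoting {volume} to {tier}")

def demote_volume(volume: str, tier: str) -> None:
    print(f"demoting {volume} to {tier}")

def days_until_month_end(today: date) -> int:
    """Days remaining until the last calendar day of the month."""
    first_of_next = (today.replace(day=1) + timedelta(days=32)).replace(day=1)
    return ((first_of_next - timedelta(days=1)) - today).days

def apply_tier_policy(volumes: list[str], today: date) -> None:
    """Promote close-out volumes to flash ahead of month-end, demote them afterwards."""
    if days_until_month_end(today) <= 1:    # the day before month-end, or month-end itself
        for v in volumes:
            promote_volume(v, "all-flash")
    elif today.day <= 2:                    # books are closed; fall back to spinning disk
        for v in volumes:
            demote_volume(v, "nearline")

# Example: run once a day from a scheduler such as cron.
apply_tier_policy(["fin_journal_01", "fin_reporting_01"], date.today())
```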

All of these are actionable through a good virtualization platform (like FreeStor) and analytics on platform and application metadata. If we add a truly heterogeneous SDS platform (like FreeStor) that can operate across different performance and platform tiers of storage, we start gaining a breadth of insight into the infrastructure that surpasses anything an administrator could reasonably get their arms around in a day. However, because of the sheer volume and complexity of these capabilities, automation and foresight MUST be imbued into the control plane.

This is where intelligent predictive analytics comes in: it is not about seeing into the future so much as correlating events from the past with current events to adjust capabilities in the present. If I know all the capabilities of my targets (performance, capacity, cache, storage layout for read/write optimization, etc.), and I know the trends in requirements from the source applications, AND I know the capabilities and features of the SDS platform (like FreeStor), then I should be able to correlate events and occurrences into policy-based actions that reconcile security, performance, protection, and cost SLAs with actual point-in-time events in the system. I can then recommend or automate adjustments to IO paths, storage targets, DR strategies, and new operational requests through intelligent predictive analytics.
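
As a toy illustration of that idea, and nothing more, the sketch below turns a short history of observed IOPS plus a known target capability into a recommendation. The linear projection, the 20% demotion threshold, and the function names are all assumptions made for the example; a real analytics engine would correlate far more signals than this.

```python
# Illustrative only: correlate recent history with a known target capability
# to produce a recommendation. Thresholds and the naive linear projection are
# assumptions, not product behavior.

def projected_peak(samples: list[float], horizon: int) -> float:
    """Naive linear projection of recent growth over the next `horizon` periods."""
    if len(samples) < 2:
        return samples[-1] if samples else 0.0
    slope = (samples[-1] - samples[0]) / (len(samples) - 1)
    return samples[-1] + slope * horizon

def recommend_action(samples: list[float], tier_ceiling_iops: float, horizon: int = 3) -> str:
    peak = projected_peak(samples, horizon)
    if peak > tier_ceiling_iops:
        return (f"projected {peak:.0f} IOPS exceeds tier ceiling {tier_ceiling_iops:.0f}: "
                "recommend promoting to a faster tier")
    if peak < 0.2 * tier_ceiling_iops:
        return (f"projected {peak:.0f} IOPS is well under {tier_ceiling_iops:.0f}: "
                "recommend demoting to reduce cost")
    return "no change recommended"

# Month-end history trending toward the 35,000 IOPS spike on a 20,000 IOPS tier.
print(recommend_action([3_000, 9_000, 18_000, 27_000], tier_ceiling_iops=20_000))
```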

All this boils down to operational efficiencies for the business, cost savings in key infrastructure purchasing decisions, better SLA management for business workloads, faster conversion of data into information, and faster time-to-value. I know these are big phrases and promises, but we see it every day. No longer is it enough to be an administrator or an infrastructure architect. No longer is it enough for the CIO to manage a budget and hope systems don’t go down. These days, every aspect of IT is part of the business revenue stream and is a partner in making businesses profitable and efficient. Predictive analytics is a key enabler for this new requirement.

By Pete McCallum – Director, Data Center Solutions Architecture - FalconStor
21 Aug 2015

Let’s face it: embracing new storage technologies and capabilities, and upgrading to new hardware, often results in added complexity and cost. The reality is that when IT equipment, platforms, and applications do not integrate with one another, the resulting “sprawl” of storage islands and silos on disparate systems can be costly, risky, disruptive, and time-consuming. But it does not have to be that way.

Few organizations have the luxury of performing a massive infrastructure replacement or maintaining completely identical infrastructures for primary and secondary storage. Hardware/platform incompatibility, different system generations, different architectures, and different media types can compromise even the most diligent efforts at protecting and replicating business critical data.

A properly architected software-defined storage approach can ease many of these integration and management pains. Implemented at the network fabric layer and abstracted from the underlying hardware, software-defined storage bypasses storage sprawl issues because it standardizes all tools, data services, and management.

Horizontal, software-defined storage deployed across the infrastructure in a common way should accommodate storage silos in geographically dispersed data centers, locally on different storage systems, or across physical and virtual infrastructures. Software-defined storage eliminates the accumulation of point solutions and regards all storage as equal. This enables the delivery of common data services, such as migration, continuity, recovery, and optimization, that can be executed consistently across the entire storage infrastructure. That reduces complexity and the number of silos to manage, and lowers the cost of licensing data services array by array.
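
One way to picture “regarding all storage as equal” is a common interface that every backend implements, so a data service such as migration is written once against the abstraction rather than against each vendor. The sketch below is purely illustrative, assuming a hypothetical interface of my own naming; it is not FreeStor’s actual design or API.

```python
from abc import ABC, abstractmethod

# Illustrative abstraction only: any backend (array, server disk, cloud target)
# implements the same interface, so data services run the same way everywhere.

class StorageBackend(ABC):
    @abstractmethod
    def read_block(self, volume: str, offset: int, length: int) -> bytes: ...

    @abstractmethod
    def write_block(self, volume: str, offset: int, data: bytes) -> None: ...

def migrate(src: StorageBackend, dst: StorageBackend, volume: str,
            size: int, block: int = 1 << 20) -> None:
    """A data service written once against the abstraction, not a vendor API."""
    for offset in range(0, size, block):
        chunk = src.read_block(volume, offset, min(block, size - offset))
        dst.write_block(volume, offset, chunk)
```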

The key to solving the problem is not to solve it at all, but to work with it through a truly horizontal software-defined storage platform that can marry unlike infrastructures, including arrays, servers, hypervisors, and the private or hybrid cloud. It’s time to move the industry forward and BE FREE to eliminate the legacy of silos and infrastructure complexity.
