The 100-year Archive and the Data Preservation Explosion Part Four: Redundant Array of Independent Clouds (RAIC) for Data Resiliency

A Redundant Array of Independent (interchangeable with Inexpensive) Disks (RAID) was groundbreaking. It combines multiple hard drives to improve the mean time between failures (MTBF) versus the MTBF of a single drive. The RAID configuration delivers data redundancy and data survivability. With one drive, a disk failure was a matter of disaster acceptance. With two drives mirroring each other, we get two complete copies (RAID 1) at twice the cost. Mirroring also introduces the concept of disaster mitigation: for an additional cost and a more complex configuration, the probability of data loss is reduced. Talented engineers have delivered multiple flavors of RAID. Levels 0, 1, 10, 5, and 6 are common, and there are several other more esoteric options. Depending on your needs, each scheme, or RAID level, provides a different balance among the key goals: reliability, availability, performance, and capacity.

There are several different RAID options: hardware-based, software-based, and newer mathematical algorithms that technically aren’t RAID but deliver similar capabilities. The software-based and algorithm-based options offer more flexibility to tune them to specific needs. One of the top algorithmic methodologies is called Erasure Coding. There are numerous iterations and a few new approaches, but they all deliver robust RAID-type capabilities.

With virtualization, the greater concept of RAID can reach across entire systems and data centers, which delivers an increased level of reliability. We have moved from 99.999% (5 9s) uptime per year or 5.26 minutes downtime a year in a traditional data center to over 99.9999999% (9 9s) uptime per year or 31.56 milliseconds of downtime a year in major Cloud architectures.
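The downtime figures above follow directly from the availability percentages. As a quick check, here is a minimal sketch that assumes a 365.25-day year (the function name is ours, for illustration only):

```python
# Seconds in a 365.25-day year; this assumption reproduces the figures quoted above.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Return the yearly downtime implied by an availability percentage."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


five_nines = downtime_seconds_per_year(99.999)
nine_nines = downtime_seconds_per_year(99.9999999)

print(f"5 9s: {five_nines / 60:.2f} minutes of downtime per year")
print(f"9 9s: {nine_nines * 1000:.2f} milliseconds of downtime per year")
```

Each additional nine cuts the allowed downtime by a factor of ten, which is why the jump from five nines to nine nines is a move from minutes to milliseconds.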

Today, organizations are moving to the Cloud. Each Cloud provider has an underlying RAID capability to manage its Cloud infrastructure and its customers’ data. Cloud providers manage RAID in similar but not identical ways, and each major Cloud vendor takes a modular approach to building and scaling its architecture. This “Lego” approach allows Cloud vendors to design once, deploy many, and scale rapidly to respond to customers’ growing demands. However, there are a few challenges.

Modular Architectures and the Potato Problem

The modular architectures are cost-efficient and highly scalable, but their modularity is also their weak point. If a bad actor or hostile nation-state finds a cyber threat vector (vulnerability), it can potentially be exploited to infect the entire Cloud system and, in the worst cases, cause the entire system to go down or all data to be corrupted. An analogy for this cyber issue is the terrible human tragedy of the mid-1800s caused by overreliance on a single crop, the potato. The Irish Potato Famine, also known as the Great Hunger, began in 1845 when a fungus-like organism called Phytophthora infestans (or P. infestans) spread rapidly throughout Ireland. The infestation ruined up to one-half of the potato crop that year, and about three-quarters of the crop over the next seven years. Cloud architectures have built in fail-safes for some of the known challenges, but not all challenges are known. It is good to not have all your eggs in a single basket and to have a data copy on a second Cloud.

Hungover Backhoe Driver and the Fiber Cut

Although the title is funny, it has happened. In the late 1990s, Sprint had a single point of failure in its network where several fiber rings converged in south Kansas. Good ol’ Murphy visited, and a backhoe driver dug into and cut the exact junction point that was the single point of failure. In the early 2000s, road construction was being conducted in Silicon Valley near Brokaw Road when a backhoe severed a major fiber line, bringing the valley to its knees. A natural disaster can also be a significant factor in outages. Having two data centers with two fiber carriers, or a data center with a Cloud provider, or two geo-distributed Cloud providers can help mitigate these issues.

Am I really paying to have my Data held Hostage?

The magical journey to the Cloud has not been without harsh realizations. There are numerous arguments for and against Cloud as the cost of Cloud rises. At the end of the day, Cloud is here to stay. However, one issue has most people and organizations hot under the collar: Data Egress penalties.

Organizations pay to have their data hosted by a Cloud Provider, and then have to pay more money to get the organization’s data repatriated. Despite the “it is very costly to transfer” argument, data egress, at the end of all the arguments, is a simple lock-in strategy inhibiting an organization from moving data out of the Cloud provider. It also allows a Cloud provider to price gouge organizations. Personally, I believe it is a lazy product management ideology. If you have an excellent service at a competitive price and continuously improve the product, you don’t need an implied or explicit lock-in strategy to take advantage of customers. Unfortunately, the need to repatriate data is likely instigated by an already costly event, like a compliance audit, regulatory review, eDiscovery subpoena, or other unexpected mandatory event. Data Egress further burdens an organization in an already complicated and costly situation. Is there a way to mitigate data egress? Yes.

Cloud Solutions, Egress Mitigation, and Increased Reliability…

So, we have three significant challenges with Clouds. FalconStor’s engineers have worked to mitigate all three and, in doing so, also help organizations reduce cost and gain options. Our Redundant Array of Independent Clouds (RAIC) technology, which uses our persistent, patent-pending container technology, offers advanced erasure coding, sophisticated data movement, monitoring, and a management console to enable organizations to take back control.

Leveraging the RAID concept, FalconStor has extended the paradigm to multiple Clouds with RAIC. FalconStor uses an individual Cloud like an individual disk drive in a RAID group. Treating six or more different Clouds like separate disk drives, FalconStor takes a “unit of data” or container, breaks it into multiple segments (mini containers), and then stores each segment on an individual Cloud or data center. Simply put, it is a RAID array made of Clouds.

Put another way: FalconStor’s innovative StorSafe product uses Persistent Virtual Storage Containers (VSC) built on Linux containers. A persistent container is filled with data. The VSC can then be divided into multiple mini containers using erasure coding. These mini VSCs can then be distributed and stored across multiple Clouds, multiple on-premises storage systems, or a combination of both.
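To make the split-and-rebuild idea concrete, here is a minimal sketch. It is not FalconStor's implementation: it swaps in single XOR parity (the RAID 5 idea), which survives the loss of only one segment, whereas the erasure coding described in this article tolerates more. All function names are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(container: bytes, data_segments: int) -> List[bytes]:
    """Split a container into equal data segments plus one XOR parity segment."""
    seg_len = -(-len(container) // data_segments)  # ceiling division
    padded = container.ljust(seg_len * data_segments, b"\0")
    segments = [padded[i * seg_len:(i + 1) * seg_len] for i in range(data_segments)]
    return segments + [reduce(xor_bytes, segments)]


def rebuild_missing(segments: List[Optional[bytes]]) -> List[bytes]:
    """Recreate at most one lost segment (None) from the surviving ones."""
    segments = list(segments)  # do not modify the caller's list
    missing = [i for i, seg in enumerate(segments) if seg is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity tolerates only one lost segment")
    if missing:
        survivors = [seg for seg in segments if seg is not None]
        segments[missing[0]] = reduce(xor_bytes, survivors)
    return segments


def reassemble(segments: List[bytes], original_length: int) -> bytes:
    """Join the data segments (all but the parity) and strip the padding."""
    return b"".join(segments[:-1])[:original_length]


container = b"archive record that must survive a century"
stored: List[Optional[bytes]] = list(split_with_parity(container, data_segments=5))
stored[2] = None  # the Cloud holding segment 2 goes dark
restored = reassemble(rebuild_missing(stored), len(container))
print(restored == container)
```

Each of the six resulting segments would live on a different Cloud. Losing any one of them, data or parity, still leaves enough information to rebuild the original container.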

Organizations no longer have to store two (2) separate copies on two separate Clouds and pay double for the 100% data redundancy that a mirrored (RAID 1) configuration requires. Using FalconStor’s RAIC, an equivalent redundancy would require 75% less data overlap and consequently cost 75% less than the mirrored Cloud option. RAIC delivers a new standard in reliability and cost.

What if a Cloud Goes Dark?

Today, if a Cloud goes dark, an entire organization goes dark. We all remember it happening, and it will happen again.

With FalconStor StorSafe, a Cloud going dark or becoming inaccessible would be analogous to a disk drive failing. FalconStor recognizes the failure and issues a rebuild command from the segments on other Clouds or data centers. Your organization is back up and running while others remain dark.

Within a six-segment container, two mini containers can go dark, and FalconStor can still recover 100% of the stored data to rebuild the original container. Or, in another scenario, your Cloud provider “A” raises its storage cost and data egress fees to a very high level. Your organization can issue a secure delete command to the mini container on Cloud “A” and delete all data on Cloud “A”, bringing your cost for that vendor to zero. FalconStor would then automatically issue a rebuild command to your remaining Cloud vendors to rebuild the “sixth” mini container. This allows an organization to decouple itself from a Cloud vendor without incurring egress fees or allowing the Cloud provider to hold its data hostage.
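The six-segment scenario can be checked by brute force. The sketch below assumes, as the text describes, that any four of the six segments are enough to rebuild the container. The Cloud labels and function names are hypothetical.

```python
from itertools import combinations

CLOUDS = ("A", "B", "C", "D", "E", "F")  # hypothetical provider labels
SEGMENTS_NEEDED = 4  # assumption: any four of the six segments rebuild the container


def is_recoverable(dark_clouds) -> bool:
    """True if enough segments remain reachable to rebuild the container."""
    reachable = [cloud for cloud in CLOUDS if cloud not in dark_clouds]
    return len(reachable) >= SEGMENTS_NEEDED


def count_survivable(outage_size: int) -> int:
    """Count the outage combinations of a given size that are still recoverable."""
    return sum(is_recoverable(set(combo)) for combo in combinations(CLOUDS, outage_size))


for size in range(4):
    total = sum(1 for _ in combinations(CLOUDS, size))
    print(f"{size} cloud(s) dark: {count_survivable(size)} of {total} combinations recoverable")
```

Every one- and two-Cloud outage is survivable, and a third simultaneous loss is where data becomes unrecoverable. Deliberately deleting the segment on Cloud “A” is the same as a one-Cloud outage from the array's point of view, which is why the rebuild does not need to read anything back from “A”.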

In a similar scenario, FalconStor StorSafe allows the organization to review all Cloud vendors’ pricing and egress fees. The organization can then decide to recover its data from the four (4) lowest-priced alternatives, repatriating it at the lowest cost, and delete the other two.
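Choosing which Clouds to read from is a simple ranking exercise. The sketch below uses made-up egress prices and a made-up segment size purely for illustration. They are not any vendor's real rates, and the function name is hypothetical.

```python
from typing import Dict, List, Tuple


def cheapest_retrieval_plan(
    egress_per_gb: Dict[str, float],
    segment_gb: float,
    segments_needed: int = 4,
) -> Tuple[List[str], float]:
    """Pick the lowest-priced Clouds to read from and total the egress cost."""
    ranked = sorted(egress_per_gb.items(), key=lambda item: item[1])
    chosen = ranked[:segments_needed]
    cost = sum(price * segment_gb for _, price in chosen)
    return [name for name, _ in chosen], cost


# Hypothetical egress prices in dollars per GB, one segment per Cloud.
prices = {"A": 0.09, "B": 0.00, "C": 0.12, "D": 0.05, "E": 0.08, "F": 0.02}
clouds, total_cost = cheapest_retrieval_plan(prices, segment_gb=250)

print("Read segments from:", ", ".join(clouds))
print(f"Estimated egress cost: ${total_cost:.2f}")
```

Because any four segments are sufficient, the two most expensive Clouds never need to be read at all.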

Alternatively, FalconStor’s partner, Wasabi, could be used as a preferred vendor, as they have a “No Egress Fee” policy in place. I, personally, call it their “We don’t hold our customer’s data Hostage or Gouge Our Customers Policy.”

FalconStor’s Redundant Array of Independent Clouds (RAIC) technology delivers broad data control and management capabilities that, to date, have only been available within a major Cloud vendor. With StorSafe, FalconStor delivers the capability for any organization to have RAIC technology and take control of its data. Since StorSafe and RAIC can span multiple Clouds, organizations are able to increase data accessibility and availability beyond the typical nine nines (9 9s) of a single Cloud, for less than 31.56 milliseconds of downtime a year.

To span Clouds and data centers, a core feature of StorSafe is portability to any S3 cloud or traditional storage system, which gives organizations full control of where they want to move their data today and in the future. The Open Source API that enables container virtualization also delivers the added benefit of futureproofing. Over the 100-year retention period, Clouds, Storage Systems, Operating Systems, and Hardware will change. With StorSafe, you will have access to your data today, tomorrow, and in 2120.

FalconStor ensures your organization is in control of your data in the new Data-Centric Era.

Join The 100-year Archive and the Data Preservation Explosion series to learn more about StorSafe features, functionality, and use cases, as well as the New Data-Centric Era.