MOVING TO A DISAGGREGATED STORAGE MODEL IN MODERN DATA CENTERS
By Farid Yavari
Today, in most data centers, cloud, NoSQL, and analytics infrastructures have largely been deployed on a direct-attached storage (DAS) architecture, generally as a Total Cost of Ownership (TCO)-driven deployment.
The DAS approach binds the compute and storage resources together, preventing independent scaling and tech-refresh cycles. The converged DAS model works very well at smaller scale, but as the infrastructure grows to a substantial size, wasted compute or storage can greatly affect the TCO of the environment. Since the DAS model is constrained by the available slots in a server, scale is limited and often quickly outgrown. In some compute-heavy environments, the servers may already have enough DAS allocated, but the workload needs more Central Processing Units (CPUs). When nodes are added to supply those CPUs, the DAS that comes with them stays unused. In addition, since the compute infrastructure is usually on a more aggressive tech-refresh cycle than the storage, converging them in a single solution limits the flexibility of the tech refresh. There is a trend to disaggregate at least the warm, cold, and archive data from the compute capacity, and to use storage servers in separate racks as Internet Small Computer System Interface (iSCSI) targets to carve out the storage capacity. Hot data, especially if it resides on Solid State Disks (SSDs), is not easily moved to a disaggregated model because of network bandwidth and throughput requirements.
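The stranded-capacity effect described here can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names and every number in the usage note (cores per node, terabytes per node, workload size) are assumptions chosen for the example, not figures from any particular deployment.

```python
import math


def das_nodes_needed(cores_needed, storage_tb_needed, cores_per_node, tb_per_node):
    """Converged DAS: every node adds both cores and disks, so the node
    count is set by whichever resource runs out first."""
    by_cpu = math.ceil(cores_needed / cores_per_node)
    by_storage = math.ceil(storage_tb_needed / tb_per_node)
    return max(by_cpu, by_storage)


def stranded_resources(cores_needed, storage_tb_needed, cores_per_node, tb_per_node):
    """Return (nodes, stranded_cores, stranded_tb) for a converged DAS build."""
    nodes = das_nodes_needed(
        cores_needed, storage_tb_needed, cores_per_node, tb_per_node
    )
    stranded_cores = nodes * cores_per_node - cores_needed
    stranded_tb = nodes * tb_per_node - storage_tb_needed
    return nodes, stranded_cores, stranded_tb
```

For a hypothetical compute-heavy workload needing 3,200 cores and 400 TB on nodes with 32 cores and 24 TB each, `stranded_resources(3200, 400, 32, 24)` gives 100 nodes with no idle cores but 2,000 TB of purchased, unused disk. In a disaggregated design the same 400 TB could be provisioned separately from the 100 compute nodes.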
The disaggregated iSCSI storage servers are basically commodity servers with just enough compute to drive the input/output (IO), and a large amount of storage to act as a pool of dense capacity. They can contain high-performance SSDs, Hard Disk Drives (HDDs), or extremely low-cost, low-performance solutions such as Shingled Magnetic Recording (SMR) drives, depending on the workload performance and price requirements. In some SMR-based storage servers, a very thin layer of Non-Volatile Dual In-line Memory Module (NVDIMM) is used as a buffer to convert random write IOs to sequential ones for better efficiency.
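As a rough illustration of how a small buffer can turn random writes into sequential ones, the sketch below collects writes in memory, then flushes them sorted by block address and merged into contiguous runs. It is a simplified assumption of one possible approach, not a description of any vendor's NVDIMM or SMR firmware: the class name, the `flush_fn` callback, and the plain dictionary standing in for persistent memory are all hypothetical, and a real design would also have to handle SMR zones and power-loss safety.

```python
class WriteCoalescingBuffer:
    """Hypothetical stand-in for an NVDIMM write buffer in front of SMR drives."""

    def __init__(self, capacity_blocks, flush_fn):
        self.capacity = capacity_blocks
        self.flush_fn = flush_fn  # called as flush_fn(start_lba, list_of_blocks)
        self.pending = {}         # lba -> data; a later write replaces an earlier one

    def write(self, lba, data):
        """Absorb a write in any order; flush once the buffer is full."""
        self.pending[lba] = data
        if len(self.pending) >= self.capacity:
            self.flush()

    def flush(self):
        """Emit buffered blocks in ascending LBA order, merged into contiguous runs."""
        runs = []
        for lba in sorted(self.pending):
            if runs and lba == runs[-1][0] + len(runs[-1][1]):
                runs[-1][1].append(self.pending[lba])    # extends the current run
            else:
                runs.append((lba, [self.pending[lba]]))  # starts a new run
        self.pending.clear()
        for start_lba, blocks in runs:
            self.flush_fn(start_lba, blocks)
```

With a capacity of four blocks, writes arriving at addresses 12, 3, 11, and 4 leave the buffer as two sequential runs, one starting at block 3 and one starting at block 11, instead of four scattered single-block writes.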
Some high-performance storage servers accommodate up to 240 terabytes (TB) of all-flash capacity sitting on a 12G Serial Attached SCSI (SAS) backend in 2 Rack Units (RUs), with two separate x86 servers in the same chassis acting as “controllers” and a total of four 40-Gigabit (40G) Ethernet connections (two on each server). There are other examples of very low-cost, all-HDD storage servers with up to 109 6TB 3.5-inch Serial Advanced Technology Attachment (SATA) drives and two single-core x86 controllers with 10G Ethernet connections to the network in a 4 RU stamp.
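A quick back-of-the-envelope calculation shows how differently these two designs balance capacity against network bandwidth. The sketch below uses decimal units and ignores protocol overhead, and it assumes one 10G link per controller on the HDD server, since the exact link count is not stated above. The function names are illustrative.

```python
def raw_capacity_tb(drive_count, tb_per_drive):
    """Raw capacity in TB, before any RAID or erasure-coding overhead."""
    return drive_count * tb_per_drive


def network_gbytes_per_sec(link_count, gbit_per_link):
    """Aggregate line rate in GB/s (8 bits per byte, no protocol overhead)."""
    return link_count * gbit_per_link / 8


def hours_to_read_all(capacity_tb, gbytes_per_sec):
    """Hours needed to move the full capacity over the network at line rate."""
    seconds = capacity_tb * 1000 / gbytes_per_sec  # 1 TB = 1000 GB
    return seconds / 3600
```

Under these assumptions, the all-flash server has 4 x 40G = 20 GB/s of network bandwidth and could stream its 240 TB in roughly 3.3 hours. The HDD server holds `raw_capacity_tb(109, 6)` = 654 TB behind an assumed 2 x 10G = 2.5 GB/s, which works out to about 73 hours. The second design is clearly sized for capacity rather than throughput, which fits the warm, cold, and archive tiers.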
Carving out iSCSI target logical unit numbers (LUNs) in a storage rack and presenting them to various initiators in a different compute rack is a valid disaggregated model for storage in a scale-out architecture. In some instances, using iSCSI Extensions for RDMA (iSER) with routable Remote Direct Memory Access (RDMA) can further improve the throughput and input/output operations per second (IOPS) of the architecture. The added cost of the network upgrade needs to be accounted for; it is usually around 20-25 percent of the total cost of the solution. The storage network needs a minimum of 40G connectivity on the storage servers and 10G connectivity on the initiator side. The network switches need to have extra-large buffers to prevent packet drops, and in many cases priority flow control (PFC), explicit congestion notification (ECN), and quantized congestion notification (QCN) become necessary.
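Two planning numbers follow from this paragraph: what a 20-25 percent network share means in absolute terms, and how heavily a set of 10G initiators oversubscribes a storage server's 40G links. The sketch below is a minimal illustration; the function names, the 1,000,000 budget, and the initiator count in the usage note are assumptions made for the example.

```python
def network_cost_from_share(non_network_cost, network_share):
    """Network cost implied when the network is `network_share` of the TOTAL
    solution cost: total = non_network_cost / (1 - network_share)."""
    if not 0 <= network_share < 1:
        raise ValueError("network_share must be in [0, 1)")
    total = non_network_cost / (1 - network_share)
    return total - non_network_cost


def fan_in_ratio(initiators, initiator_gbit, target_links, target_gbit):
    """Ratio of aggregate initiator bandwidth to aggregate target bandwidth.
    A value above 1.0 means the storage server's links are oversubscribed."""
    return (initiators * initiator_gbit) / (target_links * target_gbit)
```

For example, with 1,000,000 spent on servers and storage, a 20 percent network share implies about 250,000 of network spend and a 25 percent share about 333,000, because the percentage applies to the total rather than to the non-network portion. Similarly, `fan_in_ratio(32, 10, 4, 40)` returns 2.0: thirty-two 10G initiators could offer twice the traffic that four 40G target links can carry, which is the kind of fan-in where deep switch buffers and congestion controls such as PFC and ECN start to matter.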
There are many ways to build a disaggregated storage model, depending on use cases and requirements. In our next blog post, we will cover how a disaggregated model benefits from a properly architected software platform to gain not only value and utility, but also essential features for the applications and the business needs driving them.