As an example of how confusing it can be, some companies whose storage arrays offer only simple RAID functionality market their solutions as a “virtual array”, simply because the RAID component makes multiple underlying physical disks look like a single device. RAID technology is available in almost every storage array, so even though there is some abstraction involved, marketing a simple RAID array as a virtualization solution is not really ethical. Just to confuse matters even more, there are some storage arrays that actually do provide virtualization functions! So in order to understand what a true storage virtualization solution really does, you will need a proper definition of the term.
1. The act of abstracting, hiding, or isolating the internal functions of a storage (sub)system or service from applications, host computers, or general network resources, for the purpose of enabling application or network independent management of storage or data.
2. The application of virtualization to storage services or storage devices for the purpose of aggregating functions or devices, hiding complexity, or adding new capabilities to lower level storage resources.
In other words, when you virtualize storage, you group together multiple heterogeneous storage arrays, which normally have their own proprietary methods of creating and provisioning LUNs, and tie them together into a virtual pool of storage. You can then use a single method to create and provision a LUN from any part of the pool.
Figure: Pooling multiple physical storage devices to create a virtual LUN
So when most people use the term storage virtualization, they are referring to the ability to pool different storage resources together and then create LUNs for servers from the pool. This is a bit different from simply providing RAID functionality in a single storage array, which is why RAID should not be marketed as storage virtualization, even though the technical definition actually fits.
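To make the pooling idea concrete, here is a minimal Python sketch. It is only an illustration under simplifying assumptions: the Array and StoragePool classes, the vendor names, and the first-fit allocation are all hypothetical and do not reflect any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class Array:
    """A physical storage array with some free capacity (hypothetical model)."""
    name: str
    vendor: str
    free_gb: int


@dataclass
class StoragePool:
    """A virtual pool that hides which vendor's array backs each LUN."""
    arrays: list
    luns: dict = field(default_factory=dict)

    def free_gb(self) -> int:
        """Total free capacity across every array in the pool."""
        return sum(a.free_gb for a in self.arrays)

    def create_lun(self, name: str, size_gb: int) -> list:
        """Carve a LUN out of the pool, spilling across arrays if needed.

        Returns the list of (array name, GB taken) pieces backing the LUN.
        """
        if size_gb > self.free_gb():
            raise ValueError("pool does not have enough free capacity")
        pieces, remaining = [], size_gb
        for array in self.arrays:
            if remaining == 0:
                break
            take = min(array.free_gb, remaining)
            if take:
                array.free_gb -= take
                pieces.append((array.name, take))
                remaining -= take
        self.luns[name] = pieces
        return pieces


pool = StoragePool([Array("array-a", "VendorA", 100),
                    Array("array-b", "VendorB", 200)])
pieces = pool.create_lun("db01", 150)
print(pieces)          # [('array-a', 100), ('array-b', 50)]
print(pool.free_gb())  # 150
```

The caller asks the pool for 150 GB and never has to know that the LUN ended up spanning two arrays from different vendors.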
Virtual pools of storage can also be divided up by capabilities or cost. Once divided up, data can be moved around between the pools at any time, even while the applications are still running on the servers. The ability to move data at will provides a method to move data between higher cost production storage and lower cost backup or archive storage. By moving data between these “tiers” of storage as the data ages and becomes less relevant to the company, you are implementing a concept known as Information Lifecycle Management (ILM).
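A tiering policy of this kind can be sketched in a few lines of Python. The Volume class, the tier names, and the 90-day threshold are assumptions made up for illustration, and the sketch only relabels the volume; a real solution would also copy the data to the other pool in the background.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """A volume tagged with the tier it currently lives on (hypothetical model)."""
    name: str
    days_since_last_access: int
    tier: str = "production"


def apply_ilm_policy(volumes, archive_after_days=90):
    """Move volumes that have gone cold to the archive tier; return names moved."""
    moved = []
    for vol in volumes:
        if vol.tier == "production" and vol.days_since_last_access >= archive_after_days:
            vol.tier = "archive"
            moved.append(vol.name)
    return moved


volumes = [Volume("orders-2024", 3), Volume("orders-2019", 400)]
moved = apply_ilm_policy(volumes)
print(moved)  # ['orders-2019']
```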
Some of the benefits of storage virtualization are:
You can view the entire SAN as one pool or many pools of storage, independent of the physical location of the actual storage devices.
You can mask the differences between heterogeneous devices.
You can simplify provisioning, control, and management under a single interface.
You can centralize all storage volume management on the SAN through a single console.
You can dynamically allocate capacity to the applications that need it from any pool, based on the type of storage in the pool. (This means you can put data where it’s supposed to be based on the cost of the disk in the pool.)
You can move data between any devices in any pool while your applications are running.
You can migrate data from older to newer storage at any time while the applications are running, with zero downtime.
You can move the intelligence for things like data replication into the virtual abstraction layer, so you don’t have to buy array-based replication solutions like SRDF, PPRC, or TrueCopy, which can save a lot of money.
You can get better utilization of your storage through things like thin provisioning and adding capacity on demand.
Some virtualization solutions also provide other cool features like data deduplication, data encryption, and data compression.
By using virtual tape solutions, you can eliminate all your physical tapes for backup, and use disks for all data recovery.
One of the main benefits of storage virtualization within the optimized model is the ability to use a single console to provision storage from multiple vendors. Instead of having to learn the provisioning interface that comes with each vendor’s array, you learn just one, which greatly simplifies day-to-day operations.
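Thin provisioning, one of the benefits listed above, is easier to picture with a small example. The sketch below is a toy model, not any vendor's implementation: the ThinVolume class and its block size are assumptions. The volume advertises a large size to the server, but only uses real capacity for blocks that have actually been written.

```python
class ThinVolume:
    """A volume that advertises a large size but allocates blocks on first write."""

    def __init__(self, advertised_blocks: int, block_size: int = 512):
        self.advertised_blocks = advertised_blocks
        self.block_size = block_size
        self._blocks = {}  # block number -> data, only for blocks written so far

    def _check(self, block: int) -> None:
        if not 0 <= block < self.advertised_blocks:
            raise IndexError("block outside the advertised volume size")

    def write(self, block: int, data: bytes) -> None:
        self._check(block)
        self._blocks[block] = data  # capacity is consumed only here

    def read(self, block: int) -> bytes:
        self._check(block)
        # Blocks that were never written read back as zeros.
        return self._blocks.get(block, bytes(self.block_size))

    def allocated_blocks(self) -> int:
        return len(self._blocks)


vol = ThinVolume(advertised_blocks=1_000_000)
vol.write(0, b"boot")
vol.write(42, b"data")
print(vol.allocated_blocks())  # 2
```

The server sees a million-block disk, while the pool has only given up two blocks of real capacity.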
Definition of a Virtual Disk
A virtual disk is the virtual representation of a single physical disk or a pool of physical disks. It presents block-based storage from the underlying abstracted physical devices.
You can create a virtual disk in two ways:
1) You can create a virtual disk by directly mapping a physical disk through the virtual abstraction layer and presenting it as a virtual disk.
2) You can create a virtual disk by combining multiple physical disks into an extent pool, and then mapping out a virtual disk from the underlying pool of storage.
The first method of directly mapping the physical disk back to a host as a virtual disk may not really seem like virtualization, but it provides some interesting advantages. If the virtualization solution has added functionality that the original storage array does not have, like snapshot or replication capabilities, the new virtualized LUN inherits those new abilities from the virtualization solution. Since the physical LUN is directly mapped as a virtual LUN, the original data on the original LUN stays in place. There is no need to migrate data for the data to be virtualized.
The other beautiful thing about being able to directly map a physical LUN as a virtual LUN is that there is a way to back things out if you don’t like it, or if things go wrong with the virtualization solution. The LUN gets all these new abilities, and if the virtualization solution breaks for any reason, you can simply yank out the solution, and your data is still ready to be used. In other words, solutions with direct virtual mapping provide a way to “fall back” to the way everything was prior to installation. Another benefit is the ability to preserve the identity of the original drive, so you can keep your original path failover software.
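Direct mapping can be sketched as a thin pass-through layer. In the hypothetical Python model below (the PhysicalLun and DirectMappedDisk classes are invented for illustration), virtual block N is simply physical block N, the layer adds a snapshot feature the LUN never had, and the data always stays on the original LUN. The snapshot takes a full copy to keep the sketch short; a real product would more likely use a space-saving technique such as copy-on-write.

```python
class PhysicalLun:
    """Stands in for a LUN on an existing array (hypothetical model)."""

    def __init__(self, num_blocks: int):
        self.blocks = [b""] * num_blocks


class DirectMappedDisk:
    """Virtual disk whose block N is block N of one physical LUN."""

    def __init__(self, lun: PhysicalLun):
        self.lun = lun
        self._snapshots = {}

    def read(self, block: int) -> bytes:
        return self.lun.blocks[block]

    def write(self, block: int, data: bytes) -> None:
        # Writes land directly on the physical LUN, so nothing is migrated.
        self.lun.blocks[block] = data

    def snapshot(self, name: str) -> None:
        # Capability added by the virtualization layer, not by the array.
        self._snapshots[name] = list(self.lun.blocks)

    def read_snapshot(self, name: str, block: int) -> bytes:
        return self._snapshots[name][block]


lun = PhysicalLun(4)
lun.blocks[0] = b"existing data"      # data already on the LUN before virtualization

vdisk = DirectMappedDisk(lun)
vdisk.snapshot("before")
vdisk.write(0, b"new data")

print(vdisk.read_snapshot("before", 0))  # b'existing data'
print(lun.blocks[0])                     # b'new data'
```

Because every write goes straight to the physical LUN, removing the DirectMappedDisk layer leaves the LUN holding the current data, which is the "fall back" property described above.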
Extent pool mapping
Creating a virtual disk out of a pool of physical disks is called extent pool mapping. An extent is a contiguous range of blocks on a physical disk, and the virtualization solution keeps a map that records which extents back each part of the virtual disk, much as a file system tracks the extents that make up a file. You don’t need to know any of that stuff to use it though! The virtualization solution does all that for you.
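For the curious, here is a rough Python sketch of what such a mapping might look like. The ExtentPool and VirtualDisk classes and the tiny extent size are assumptions chosen to keep the example readable; real solutions use much larger extents and more careful allocation.

```python
EXTENT_BLOCKS = 4  # blocks per extent; tiny so the example is easy to follow


class ExtentPool:
    """Free extents gathered from several physical disks."""

    def __init__(self, disks: dict):
        # disks maps a disk name to the number of extents on that disk.
        self.free = [(name, i) for name, count in disks.items() for i in range(count)]

    def allocate(self, count: int) -> list:
        if count > len(self.free):
            raise ValueError("not enough free extents in the pool")
        taken, self.free = self.free[:count], self.free[count:]
        return taken


class VirtualDisk:
    """A virtual disk backed by extents that may sit on different physical disks."""

    def __init__(self, pool: ExtentPool, num_extents: int):
        self.extent_map = pool.allocate(num_extents)

    def locate(self, virtual_block: int) -> tuple:
        """Translate a virtual block number to (disk name, physical block)."""
        index, offset = divmod(virtual_block, EXTENT_BLOCKS)
        disk, extent = self.extent_map[index]
        return disk, extent * EXTENT_BLOCKS + offset


pool = ExtentPool({"disk-a": 2, "disk-b": 2})
vdisk = VirtualDisk(pool, 3)
print(vdisk.extent_map)  # [('disk-a', 0), ('disk-a', 1), ('disk-b', 0)]
print(vdisk.locate(9))   # ('disk-b', 1)
```

The server addresses block 9 of one disk, and the extent map quietly redirects that request to block 1 of disk-b.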
I hope this overview helps you understand the concepts and benefits of storage virtualization, and how it can help optimize data management in your organization.