Traditional backup, not to name any products, has a few challenges in modern IT environments. The first is its addiction to copying the same data over and over again, leaving behind a very long trail of tape that becomes challenging to manage, or even to make sense of, even with the best catalogues.
Second, its thirst for CPU and network resources is the reason we have a “backup window.” It can very simply bring all systems to a halt during the backup process. An old friend used to say: if you want to know whether you have a solid network and IT infrastructure, start your backups and watch what happens.
Given the symptoms above, data deduplication seems to be a good remedy, but not all deduplication solutions are created equal. The choice should be based on the type of IT operations you run. The performance of the solution directly affects your backup window. Common sense says that the smaller the backup window, the higher the performance you need, but there is a bit more to it than that. Your deduplication solution should also support different types and techniques of backup operations. One example is how effectively the solution deduplicates multiplexed backup streams!
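To put a number on the relationship between backup window and performance, here is a rough back-of-the-envelope sketch. The function name, the 10 TB data set, and the window lengths are illustrative assumptions, not figures from any product, and the calculation ignores deduplication and compression savings.

```python
def required_throughput_mb_s(data_gb: float, window_hours: float) -> float:
    """Sustained MB/s needed to move data_gb within window_hours.

    Assumes 1 GB = 1000 MB and a perfectly steady transfer rate,
    which real backup jobs rarely achieve.
    """
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    return data_gb * 1000 / (window_hours * 3600)


# Hypothetical example: 10 TB of data, 8-hour window vs. 4-hour window.
eight_hour = required_throughput_mb_s(10_000, 8)
four_hour = required_throughput_mb_s(10_000, 4)
print(f"8-hour window: {eight_hour:.0f} MB/s")
print(f"4-hour window: {four_hour:.0f} MB/s")
```

Halving the window doubles the sustained rate the target has to ingest, which is why the size of the window should drive how much deduplication performance you ask for.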
Also, since the deduplication process itself is a CPU-intensive operation, you should check whether you can exclude some data types from deduplication and apply the process only where it matters. It doesn’t make sense to try to deduplicate encrypted data, for example, as it’s pretty much all unique data. You may also want to exclude other data streams that are not very friendly to the deduplication process, such as archived medical imagery or microscope and telescope data.
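A small experiment shows why encrypted data is a poor candidate. The sketch below is only an illustration, not how any particular product works: it assumes simple fixed-size chunking with SHA-256 fingerprints, uses random bytes as a stand-in for encrypted data, and the `EXCLUDED_TYPES` policy and its type labels are hypothetical.

```python
import hashlib
import os

CHUNK_SIZE = 4096  # assumed fixed chunk size, chosen only for illustration


def dedup_ratio(data: bytes, chunk_size: int = CHUNK_SIZE) -> float:
    """Return (total chunks) / (unique chunks) for fixed-size chunking."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if not chunks:
        return 1.0
    unique = {hashlib.sha256(chunk).digest() for chunk in chunks}
    return len(chunks) / len(unique)


# Hypothetical exclusion policy: data types to skip during deduplication.
EXCLUDED_TYPES = {"encrypted", "medical-imaging", "telescope"}


def should_deduplicate(data_type: str) -> bool:
    """Decide whether a stream of the given (made-up) type is worth deduplicating."""
    return data_type not in EXCLUDED_TYPES


# Ten full backups of the same 64 KiB file: every chunk repeats ten times.
one_backup = os.urandom(64 * 1024)
repeated_backups = one_backup * 10

# Random bytes stand in for encrypted data: practically every chunk is unique.
encrypted_like = os.urandom(640 * 1024)

print("repeated backups:", dedup_ratio(repeated_backups))
print("encrypted-like:  ", dedup_ratio(encrypted_like))
```

The repeated backups collapse to a fraction of their size, while the encrypted-like stream costs the same hashing effort and saves nothing, so skipping it frees CPU for the data that actually benefits.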
And since we are talking about backups: although data deduplication allows you to retain data longer on disk, many organizations are still required to back up data to tape. In that case, you’d want a solution that streamlines the process and integrates seamlessly with your tape infrastructure. Having two separate infrastructures and performing backups twice, once to the deduplication target and again to a tape library, defeats the whole purpose of deduplication.
And to get back to the waste management analogy: in the greatest borough of all, Manhattan, on curbside collection days you see only garbage bags and no containers, and there is definitely a logistical reason for that. So, as with any waste management system, when choosing your data deduplication solution, try to align it with your business and IT goals and operations.
Check out FalconStor’s next data deduplication announcement on April 26th. You may find your solution there!