Otava continues its business continuity video series on data backup and replication by explaining why de-duplication is an important, cost-saving tool for data backup.
Steven: There are a couple of technologies that really can get you the most bang for your buck in backup software. There is the technology of de-duplication and the technology of compression. Compression has been around for a while, and most people know how zip files work. They take a long string of numbers and compress it down using some mathematical formulas. De-duplication has been around for a while too, but implemented properly it can really save a lot of time on your backups, and it can save a lot of space on your backup storage.
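To make the compression half of that concrete, here is a minimal Python sketch using the standard library's zlib module, which implements the same DEFLATE method commonly used in zip files. The sample data is made up for illustration, and nothing here is meant to represent how any particular backup product compresses data.

```python
import zlib

# Made-up sample: a long, highly repetitive string of digits.
original = b"0123456789" * 1000

# Compress the data, then decompress it to show the process is lossless.
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(f"original:   {len(original)} bytes")
print(f"compressed: {len(compressed)} bytes")
print(f"round trip matches: {restored == original}")
```

Repetitive data like this shrinks a great deal; data that is already compressed or random shrinks very little.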
So let’s take, for example, a Microsoft Word document or a Notepad document, and you have two documents. One just has the word ‘Jog’ in it, J-O-G, and the other document has the word ‘Dog’ in it, D-O-G. When you go to back up your system, you have these two Word files or Notepad files in the backup software. If it’s intelligent enough, it can go through and look at these two files and say, “Hey, I have two Notepad files, and one says Jog and one says Dog. The only difference between these files is a J and a D.”
So what good backup software can do is say, “We are going to back up this Jog file first, and then from the second file we are just going to back up the D, because we already have the rest of this data in the system, and we can keep a mapping of how this data is built.” It’s like building a puzzle, and it’s all verified mathematically on the back end. What that allows you to do is drastically cut the length of your backup windows.
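The Jog and Dog example can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any specific backup product works: the DedupStore class and its method names are hypothetical, the chunk size is one byte only so that the letters line up with the example (real systems use much larger chunks), and SHA-256 hashing stands in for the "verified mathematically" step as one common way to do it.

```python
import hashlib


class DedupStore:
    """Toy chunk store for illustration (hypothetical, not a real product's design)."""

    def __init__(self, chunk_size=1):
        self.chunk_size = chunk_size  # assumption: 1 byte, to match the Jog/Dog example
        self.chunks = {}              # hash -> chunk bytes, each stored only once
        self.recipes = {}             # file name -> ordered list of chunk hashes

    def backup(self, name, data):
        """Store only chunks not seen before; return how many bytes were new."""
        new_bytes = 0
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = chunk
                new_bytes += len(chunk)
            recipe.append(digest)
        # The recipe is the "mapping of how this data is built".
        self.recipes[name] = recipe
        return new_bytes

    def restore(self, name):
        """Rebuild a file from its recipe, re-checking each chunk against its hash."""
        pieces = []
        for digest in self.recipes[name]:
            chunk = self.chunks[digest]
            if hashlib.sha256(chunk).hexdigest() != digest:
                raise ValueError("chunk failed verification")
            pieces.append(chunk)
        return b"".join(pieces)


store = DedupStore(chunk_size=1)
first = store.backup("jog.txt", b"Jog")   # J, o and g are all new
second = store.backup("dog.txt", b"Dog")  # only D is new; o and g are reused

print(first, second)
print(store.restore("dog.txt"))
```

The second backup stores a single new byte, and the store ends up holding four unique chunks instead of six, which is the puzzle-building idea described above on a very small scale.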
Without de-duplication, you may have had a backup job for a file server that was two or three terabytes, and it could have taken hours, if not a good part of a day. With a technology that implements de-duplication in a very efficient way, you can take a very intensive six-to-eight-hour backup process and cut it down to 30 to 50 minutes.
So for mission-critical servers that need to be backed up but also need to be online and servicing customers, this is a huge, huge benefit, because you are spending less time backing up the server, and the server can spend its time the way it’s supposed to: servicing requests from customers.