Section 8.8. Simultaneous Backup of One Client to Many Drives


8.8. Simultaneous Backup of One Client to Many Drives

How does a backup product get a very large database (VLDB) or very large system (VLS) onto backup media? This is a slightly different scenario from the one described previously. Multiplexing allows 5 or 10 systems to share the same device, so that backup devices are constantly streaming, and backups can be done in a timely manner. In contrast, the situation that calls for multistreaming is a single system that could not possibly back up its files to a single backup drive in one night. Even if one had a backup drive that is capable of 100 MB/s, that is only 720 GB per hour, or 5.76 TB in an eight-hour periodassuming the drive could be streamed for that long. What if the system contains 20 TB of data? What if the drive is capable only of 25 MB/s because of the network it's running on?

The only way to get that kind of speed is to use several drives simultaneously. (Other combinations would work, of course, but they all require multiple simultaneous backup drives.) However, in order to make that happen, the backup software needs to be able to take the one system and send it to many devices simultaneously. Many products are able to do this. What is important is how they do it.

The most common way that backup software companies tell you to back up a VLDB or VLS is to configure multiple backup definitions, each of which is a subset of the entire system, then run them simultaneously. For example, there could be one backup definition that backs up everything, excluding /home1 and /home2. Then there could be a second backup definition that backs up just /home1 and /home2. The software then would be told to back them up at the same time. That way, two different devices would be operating at the same timepotentially cutting the backup time by as much as half.

This method is not desirable for a few reasons. The first is that it is difficult to figure out where to divide the partitions up, and you'll never get it exactly right. Even if you get it right one day, things will change. The second problem is that it requires you to maintain an include list that must be updated every time they add a new filesystem.

Backups should never be defined this way. If backup definitions must be split up, always start with an "everything except" backup definition. That way, when a new filesystem is added, it will get backed up automatically.


Glad Nobody Asked for That One!

I was using a backup product that required me to create separate backup definitions if I wanted to do one-to-many. The host was so big and my tape drives were so slow that I had to define five separate backup definitions. First, I had a backup definition that included everything except /data1-/data8. Then I had four more, each of which backed up two of the /data<x> filesystems. When the machine's administrator added /data9, it was backed up automatically by the "everything except /data1-8" backup. However, when /data10 and /data11 were added, they didn't get backed up at all.

One day I noticed that I couldn't find any history for the /data10 filesystem. It was only after a frantic call to tech support that I found out the error was mine. When I excluded /data1, what I actually said was, "exclude all filesystems that match the regular expression /data1." Unfortunately, /data1 also matches /data10. For more than two months, there were two filesystems that weren't getting backed up. Luckily, nobody needed a restore. This is why I say to be very careful when excluding filesystems using regular expressions.


A better method is if the backup product automatically creates the multiple jobs for you. If they support this, they usually do it on the filesystem level, creating a job for each filesystem. One challenge with this is a single large filesystem. Due to modern advancements in filesystems, they can now be several terabytes. So now, not only are there several terabytes of data to back up from one system, but all the data resides in one filesystem.

There is no way to back up a multiterabyte system to a single backup drive in one night. The only way to get that kind of speed is to use several drives simultaneously, on four separate channels. However, in order to make that happen, the backup software must be able to take the one filesystem and send it to many devices simultaneously. As of this writing, this is an area that only two products have properly addressed.




Backup & Recovery
Backup & Recovery: Inexpensive Backup Solutions for Open Systems
ISBN: 0596102461
EAN: 2147483647
Year: 2006
Pages: 237

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net