1.4. Wax On, Wax Off: Finding a Balance

Using a system that has no backups is like driving a car 100 miles an hour down a busy road the day after your insurance policy expires. Likewise, having a three-node, highly available cluster for a noncritical application is like having full coverage on your 20-year-old fifth car. Just as insurance plans have different levels of coverage and riders to cover various types of damage, different backup methodologies provide different levels of recoverability.

That Was Close

One memorable moment was when we had a 600 GB file server that hadn't been properly backed up in a while. During a particularly hot weekend, both A/Cs handling the room failed, and temperatures soared. We shut everything down, waited for the A/Cs to be fixed, and started things back up after it cooled off a bit. Sure enough, two disks, physically next to each other in the same RAID4 array, failed. We were narrowly able to avoid total data loss by finding a spare disk and swapping control boards between it and one of the failed drives, which let it spin up and be accessed. We had the vendor courier us replacement drives the next day and then spent a lot of time fixing the backup server.

Theo Van Dinter


1.4.1. Don't Go Overboard

Not all environments need up-to-the-minute data recoverability. For many environments, recovering the systems up to last night's backups is acceptable. For some environments, recovering the system even up to last week or month is OK. Spending thousands of dollars and hundreds of hours implementing the greatest backup solution in the world is a waste if you don't need that level of coverage. This usually is not the problem for most sites; on the contrary, most sites don't spend nearly enough money or effort on their backup and recovery systems. In other cases, however, money may be wasted on unnecessarily elaborate systems.

Recoverability requirements also vary from machine to machine within the same company. The amount of work that would be lost, or the possibility of adversely affecting a customer, may determine these requirements. For example, it may be considered acceptable for an employee or two to lose a day's work spent on a few word processing documents. That is, unless it was the Senior Vice President's assistant who was working on the departmental budget, in which case your mileage may vary. And it would probably be totally unacceptable to lose even one hour's worth of entries in a companywide sales database used by hundreds of people.

The point is that your backup requirements are determined by your recovery requirements. The difficulty comes in finding and using a tool capable of providing the level of recovery that you need. Consider users' home directories for a minute. If they are local to each user's workstation, a loss of one user's disk in the afternoon would mean that one user would lose a few hours of work. However, if user directories are located on an NFS file server that serves thousands of users, you could potentially lose several thousand hours of work if you use only traditional backup tools.
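To put rough numbers on that comparison, here is a minimal sketch. The function name, the 4-hour window, and the 2,000-user head count are assumptions chosen for illustration; the text says only "a few hours" and "thousands of users."

```python
def hours_of_work_at_risk(active_users: int, hours_since_last_backup: float) -> float:
    """Work that would have to be redone if the disk failed right now."""
    return active_users * hours_since_last_backup


# Hypothetical figures: backups ran overnight and the disk fails 4 hours into the day.
workstation = hours_of_work_at_risk(active_users=1, hours_since_last_backup=4)
file_server = hours_of_work_at_risk(active_users=2000, hours_since_last_backup=4)

print(f"Single workstation: {workstation} hours at risk")
print(f"Shared file server: {file_server} hours at risk")
```

The backup schedule is identical in both cases; only the number of people depending on the disk changes, and that is what drives the recovery requirement.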

If the loss of a networked file server is unacceptable, you might want to consider snapshot technology. Snapshot software allows you to take a "picture" of your drive or filesystem at a single point in time and then use that picture to back up that drive or filesystem. If the backup references the drive or filesystem via this snapshot, it will back up a consistent picture of the drive or filesystem as it looked at the time the snapshot was taken. If this kind of functionality is interesting to you, you might consider reading Chapter 7, which describes emulating snapshot functionality with rsync and hard links.
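The hard-link idea can be sketched in a few lines of Python. This is an illustration of the general technique only, not the procedure from Chapter 7 and not a substitute for rsync; the function name `snapshot` and its parameters are hypothetical. Each run produces a complete directory tree, but files that have not changed since the previous run are hard-linked to the earlier copy rather than stored again. Unlike true snapshot software, the copy is not taken at a single instant, so files that change while it runs may be captured in an inconsistent state.

```python
import filecmp
import os
import shutil
from pathlib import Path
from typing import Optional


def snapshot(source: Path, dest: Path, previous: Optional[Path] = None) -> None:
    """Copy `source` into `dest`, hard-linking files unchanged since `previous`."""
    for root, _dirs, files in os.walk(source):
        rel = Path(root).relative_to(source)
        (dest / rel).mkdir(parents=True, exist_ok=True)
        for name in files:
            src = Path(root) / name
            new = dest / rel / name
            old = previous / rel / name if previous is not None else None
            # shallow=False compares file contents, not just size and mtime.
            if old is not None and old.is_file() and filecmp.cmp(src, old, shallow=False):
                os.link(old, new)       # unchanged: share the data already on disk
            else:
                shutil.copy2(src, new)  # new or changed: store a fresh copy
```

A nightly job might call `snapshot(Path("/home"), Path("/backup/tuesday"), previous=Path("/backup/monday"))`, where all three paths are placeholders. Hard links work only within a single filesystem, so the previous and new snapshot directories must live on the same one.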


Sometimes the tool you need comes with your operating system or database platform, but it's just not being used properly. Sometimes backup tools aren't being used at all. For example, if you have a production Oracle database, combining nightly hot backups with archived redo logs gives you up-to-the-minute recoverability. However, if you lose a disk that is part of a database that doesn't back up its transaction logs, you will lose all work since the last cold backup. See Part V for more information.

If you have a production instance of any kind and are not using the transaction logging feature of your database engine, turn on logging as soon as possible!


Therefore, while it is necessary to find the appropriate utility to give you the degree of recoverability that you require, it is also necessary to use it.

1.4.2. Get the Coverage That You Need

Some environments cannot afford even one minute of downtime, and they should pay for the best backup coverage, whatever it costs. This is because of the great loss that they will incur if they ever lose their systems for even a short period (I know of one company that claims that it loses over $1 million a minute when its systems are down). On the other hand, if you are in an environment that can afford downtime, then spending huge amounts of money for an immediately available hot site [*] is a complete waste of money.

[*] A hot site is a place where you have computers standing by to do an immediate recovery of your environment.

Consider Table 1-1. No one should depend on a car, or a computer, without having at least the basic level of coverage. If the only car that you own is uninsured and a drunk driver runs into you and totals it, how would you recover from such a loss? Similarly, if your computer systems have critical information stored on them, how will you recover when a hard drive crashes and all that data is lost? What some people forget is that the opposite of this equation is true as well. If you have a third car that happens to be a 20-year-old (nonclassic) car, you will probably get only liability coverage on it; you could live without that car if it were destroyed today. Spending hundreds of extra dollars a year to insure a $50 car just doesn't make sense. Likewise, if the computers that you are managing are in an environment in which you can do without them for a few days, do you really need hot-swappable, mirrored drives? Pick an appropriate level of protection for your environment.

Table 1-1. Comparing automobile insurance and data protection

| Type of coverage | Automobile insurance | Computer backups |
| --- | --- | --- |
| Minimum coverage | Collision and liability (just keeps you from losing your shirt if you run into someone) | Regular nightly backups (keeps you from losing your job when a disk drive dies) |
| Unexpected disasters | Comprehensive coverage (vandalism, acts of God, etc.) | Journaling filesystems; uninterruptible power supplies (UPSs) |
| Get me driving now | Rental car coverage (you get a car if your car is in the shop due to an accident) | RAID; mirroring; using hot-swap drives; high-availability (HA) system |
| Major disasters | Another company will pick up your policy and replace your car if both your car and your insurance company are destroyed in an earthquake | Sending copies of your backup volumes to off-site storage, in case both your computer room and media library are destroyed; sending your backups via a dedicated network to a large storage system at your off-site storage vendor |
| Maximum protection | The insurance company not only agrees to the conditions listed earlier, but also agrees to store another car of the same model in another state that you can use at any time if all cars in your state are destroyed | Real-time mirroring to a hot-swappable system at another of your sites; sending your backups via either network or courier to a hot-site vendor |


You need to balance the cost of a particular backup implementation against the projected monetary loss of the outage from which it protects you. For example, assume that you are evaluating two backup choices. The first option involves sending copies of your backup volumes to an off-site vendor for storage at a cost of $500 a month. The second option is an immediately available standby machine in another city that receives up-to-the-minute replication data from your production machine; let's say this option costs you $5,000 a month.

Your company is located in Utopia, where no natural disasters ever occur, your disks are all mirrored, and you have determined that a day's worth of downtime would cost only $500. Do you really want to spend $60,000 a year to protect against something that will probably never occur? If something catastrophic happened to your datacenter, wouldn't the day-old, off-site copies serve just as well? Your company would suffer an extra day or so of downtime, but you have already determined that this is affordable. The $6,000-a-year solution is probably much more appropriate for this environment.
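The arithmetic behind that comparison can be written out as a short sketch. The dollar figures come from the example above; the function names and the break-even calculation are illustrative assumptions, and the model deliberately ignores everything except the subscription cost and the stated cost of a day of downtime.

```python
MONTHS_PER_YEAR = 12


def annual_cost(monthly_cost: float) -> float:
    """Yearly price of a backup option billed monthly."""
    return monthly_cost * MONTHS_PER_YEAR


def break_even_days(cheap_monthly: float, expensive_monthly: float,
                    downtime_cost_per_day: float) -> float:
    """Days of downtime per year the pricier option must prevent to pay for itself."""
    extra_per_year = annual_cost(expensive_monthly) - annual_cost(cheap_monthly)
    return extra_per_year / downtime_cost_per_day


off_site_storage = annual_cost(500)      # 6000
standby_machine = annual_cost(5000)      # 60000
days = break_even_days(500, 5000, 500)   # 108.0

print(f"Off-site storage: ${off_site_storage:,.0f} a year")
print(f"Standby machine:  ${standby_machine:,.0f} a year")
print(f"Break-even: {days:.0f} days of avoided downtime a year")
```

Under these assumptions, the standby machine would have to save more than three months of downtime every year to justify the extra $54,000, which is why the cheaper option fits a site where a day's outage costs only $500.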

However, are you protecting yourself from everything that you should be? Are you in an area that is prone to natural disasters and yet have no protection against that sort of event? Maybe you need to consider a different type of off-site storage. If you have a customer base that needs the data on your computers on a regular basis, have you provided for quick recovery in case of a failure? Perhaps you should be considering a hot site or multiple-site mirroring of your database servers. Table 1-1 provides a good overview of the various levels of coverage.

1.4.3. Why the Word "Volume" Instead of "Tape"?

Most backup utilities were originally written to back up to tape. Therefore, most books and online manuals talk about backing up to tape. However, many people are backing up to CDs, magneto-optical disks, or even disk drives. These media types have many advantages, because they act more like disk drives than tape drives. Random access of backup data is easier and you can read them using any block size you wish, because they do not record interrecord gaps like tape drives do.

Since many people no longer use tape, this book uses the more generic word volume whenever appropriate. You'll also find the term backup drive instead of tape drive. Again, that is because the backup drive could be a CD burner or a disk drive. The book uses the words tape and tape drive only when they are necessary and appropriate.

BackupCentral.com has a wiki page for every chapter in this book. Read or contribute updated information about this chapter at http://www.backupcentral.com.





Backup & Recovery
Backup & Recovery: Inexpensive Backup Solutions for Open Systems
ISBN: 0596102461
Year: 2006
Pages: 237
