Selecting a backup strategy depends on the risks that you are willing to take. The risk equation for any computer backup consists of two parts. First, you need to understand what can happen to your data and computers. Disasters range from a corrupted file to the destruction of your main corporate facility. Second, you need to select a backup strategy to address each of these disasters. The strategy (and its cost) varies with the importance of the data, your users' reactions to different disasters, and how fast you need to restore from backup. Finally, you must make sure that you can restore from any backup you create before you really need it.
To understand these parts of the risk equation, you should examine various disaster scenarios and the available levels of data and computer backup.
The loss of even a single file can be a disaster for a user. The loss of a commercial airplane engineering drawing, a master's thesis, or even chapters for a book in production can be a life-changing event.
Information technology managers have to plan for every level of disaster, from the loss of a file to the effects of a nuclear war. (Yes, there are corporate IT managers who create backup plans for a nuclear war.) See Table 14.1 for several basic scenarios.
Scenario                      Backup strategy
Lost user file                Restore from backup of the /home filesystem.
Lost configuration file       Restore from backup of /etc.
Lost application file         Reload from backup, or reinstall the application.
Damaged partition             Restore partition from backup, or use an appropriate level of RAID.
Damaged hard drive            Restore hard drive from backup, or use an appropriate level of hardware RAID.
Damaged computer              Restore data from other computers or tapes/CDs/DVDs on site.
Damaged data facility         Restore from backups stored in a remote location.
Electromagnetic data loss     Restore from nonmagnetic backups.
This is far from a comprehensive list of possible disaster scenarios. For example, problems with a network can be just as difficult, especially if they prevent users from accessing their files or applications on a server. Of course, disaster planning for networks is beyond the scope of this book, but the principles are essentially the same.
You need to decide what data is critical to you. If you're a personal desktop user, you may have just a few critical files, such as documents. You may be able to back up these files every time you change them.
If you're a Linux administrator for a network of computers, you may be willing to spend a lot more money to protect and back up your data. However, with the amount of data stored in a network of computers, it may not be cost-effective to back up everything every night.
The following sections examine what you might do if you use a Linux computer as a personal desktop, administer a regular network, or administer a network where you have very time-sensitive data. What you actually do in practice may vary with the importance of the data and your available resources.
Your needs will also determine how often you do backups of time-sensitive data and the hard drives on a large group of computers.
Not all users back up their computers. Personal desktop users who just use their computers to browse the Internet may not have any irreplaceable data on their systems. For some home users, a disaster is just an inconvenience; all they need to do is reinstall their operating system and connect to the Internet once again. However, if you're a home user who keeps critical data such as financial records on your computer, consider yourself a Linux administrator and read the sections that follow.
In many cases, all these users need are backups of files on their home directories. Backups of configuration files in /etc can also help users restore many customized settings.
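As a minimal sketch of this kind of backup, the following Python example writes a list of directories into a single compressed archive and then lists the archive's contents as a basic check that it can be read back. The function names archive_directories and list_archive are hypothetical, chosen only for illustration, and the sketch assumes the directories are readable by the user running it (reading everything in /etc normally requires root privileges).

```python
import tarfile
from pathlib import Path


def archive_directories(sources, archive_path):
    """Write each existing source directory into one gzip-compressed tar archive."""
    archive_path = Path(archive_path)
    with tarfile.open(archive_path, "w:gz") as tar:
        for source in sources:
            source = Path(source)
            if source.is_dir():
                # Store paths without the leading slash, e.g. "etc/hosts".
                tar.add(source, arcname=str(source).lstrip("/"))
    return archive_path


def list_archive(archive_path):
    """Return the member names in the archive, as a basic readability check."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()
```

A home user might call archive_directories(["/home/yourname", "/etc"], "/mnt/backup/desktop.tar.gz"), where both the user name and the destination path are placeholders. Keep the archive outside the directories being backed up, so that it is not included in itself.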
Some users prefer to back up all files and data on their Linux computers. That way, they can recover from any disaster without spending additional time reconfiguring their systems.
If you're the Linux administrator responsible for a group of computers, timely backups are critical. For example, the data associated with the design of a new airplane evolves constantly.
Though it may not be too difficult to recover data from a lost day of work, the consequences of a lost week or month of design work for an airplane company can be rather expensive. In this case, you might configure a series of nightly backups on larger-capacity media, such as DVDs or tape drives.
In this way, a Linux administrator can help tired engineers recover the data they accidentally deleted. If there's a larger disaster, the administrator can reinstall Linux, along with the appropriate engineering software, and then restore the design files to the appropriate directories.
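Recovering one accidentally deleted file does not require restoring the whole backup. The following Python sketch copies a single named file out of a tar archive; it assumes the nightly backup was saved as a tar archive, and the function name restore_file and the file names in the usage note are hypothetical.

```python
import shutil
import tarfile
from pathlib import Path


def restore_file(archive_path, member_name, destination_dir):
    """Copy one regular file out of a tar archive into destination_dir."""
    destination_dir = Path(destination_dir)
    destination_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:*") as tar:
        # extractfile raises KeyError if the member is not in the archive
        # and returns None if the member is not a regular file.
        source = tar.extractfile(member_name)
        if source is None:
            raise ValueError(f"{member_name} is not a regular file")
        target = destination_dir / Path(member_name).name
        with source, open(target, "wb") as out:
            shutil.copyfileobj(source, out)
    return target
```

For example, restore_file("nightly.tar", "designs/wing.dwg", "/tmp/restored") would place a copy of the file in /tmp/restored, leaving it to the engineer to move it back into place. Writing to a separate directory avoids overwriting any newer version of the file.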
Some computers are used in time-sensitive situations. For example, if you're the Linux administrator for a financial services firm and you are unable to restore the data associated with sales in the stock market, the consequences can be expensive. Time-sensitive information suggests the need for real-time backups, such as those associated with RAID.
In this way, the failure of any hard disk does not affect the operation of the firm. With the use of removable hard disks, RAID data can also be copied and stored in external locations.
The most straightforward backup is of everything on your computer. However, as the amount of data on individual hard disks moves into the hundreds of gigabytes, the amount of time required can stretch into dozens of hours.
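To see how the time adds up, you can divide the amount of data by the sustained speed of the backup device. The following Python sketch does this arithmetic; the function name backup_hours is hypothetical, and the 300GB and 2MB/s figures are illustrative assumptions, not measurements of any particular drive.

```python
def backup_hours(size_gb, throughput_mb_per_s):
    """Estimate the hours needed to copy size_gb at a sustained throughput."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    size_mb = size_gb * 1000  # decimal units: 1 GB = 1000 MB
    seconds = size_mb / throughput_mb_per_s
    return seconds / 3600


# Illustrative figures only: 300 GB at a sustained 2 MB/s.
print(round(backup_hours(300, 2), 1))  # 41.7
```

The estimate ignores compression, media changes, and verification passes, so treat it as a rough lower bound when you plan a backup window.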
Although Linux computers are multitasking, the load associated with a backup can affect performance for your users. That leaves you with two basic choices: back up your entire computer only on occasion (e.g., weekends), or back up only part of your data, such as new files or the /home and /etc directories.
Many Linux administrators use a mix of the two philosophies: a complete backup of each Linux computer, with daily backups of new files. There are two ways to make this happen:
Incremental backup: An incremental backup includes all files that were created or changed since the last backup of any type. Incremental backups are almost always smaller than differential backups. However, restoring a system from incremental backups can be more difficult. It requires the data you saved in the full backup, the latest differential backup (if applicable), and all of the subsequent incremental backups, restored in order.
Differential backup: A differential backup includes all files that were created or changed since the last full backup. As time increases since the last full backup, the size of a differential backup gets progressively larger. Restoring a system requires only the data you saved in the full backup and the latest differential backup.
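The difference between the two types comes down to the reference time used to select files. Under the usual definitions, a differential backup compares against the last full backup, while an incremental backup compares against the most recent backup of any type. The following Python sketch selects files by modification time on that basis; the function names are hypothetical, and it assumes you record the time of each backup yourself as a Unix timestamp.

```python
import os
from pathlib import Path


def changed_since(root, cutoff):
    """Return files under root whose modification time is later than cutoff."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # lstat avoids an error on a dangling symbolic link.
            if path.lstat().st_mtime > cutoff:
                changed.append(path)
    return sorted(changed)


def differential_files(root, last_full_time):
    """Differential: everything changed since the last full backup."""
    return changed_since(root, last_full_time)


def incremental_files(root, last_backup_time):
    """Incremental: everything changed since the most recent backup of any type."""
    return changed_since(root, last_backup_time)
```

With a full backup on Sunday and a backup every night, Wednesday's differential list covers Monday through Wednesday, while Wednesday's incremental list covers only the changes since Tuesday night. Selecting by modification time is a simplification: it does not notice deleted files, and it depends on accurate system clocks.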
Because of the time associated with restoring data, many Linux administrators use some form of RAID. As you'll see later in the chapter, RAID can provide near real-time redundancy for your data.