Making backups of your system is an important way to protect yourself from data corruption or loss in case you have problems with your hardware, or you make a mistake such as deleting important files inadvertently. During your experiences with Linux, you're likely to make quite a few customizations to the system that can't be restored by simply reinstalling from your original installation media. However, if you happen to have your original Linux CD-ROM or DVD-ROM handy, it may not be necessary to back up your entire system. Your original installation media already serve as an
excellent
backup.
Under Linux, as with any Unix-like system, you can make mistakes while logged in as
root
that would make it
impossible
to boot the system or log in later. Many newcomers approach such a problem by reinstalling the system entirely from backup, or
worse
, from scratch. This is seldom, if ever, necessary. In "What to Do in an Emergency," later in this chapter, we talk about what to do in these cases.
If you do experience data loss, it is sometimes possible to recover that data using the filesystem maintenance tools described in "Checking and Repairing Filesystems" in Chapter 10. Unlike some other operating systems, however, it's
generally
not possible to "undelete" a file that has been removed by
rm
or overwritten by a careless
cp
or
mv
command (for example, copying one file over another destroys the file to which you're copying). In these extreme cases, backups are key to recovering from problems.
Backups are usually made to tape, floppy, CD-R(W), or DVD-R(W). None of these media is 100% reliable, although tape, CD-R(W), and DVD-R(W) are more dependable than floppies in the long
term
. These days, with the cost of hard disks plummeting and the capacity increasing, backing up to a hard disk is also an option.
Many tools are available to help you make backups. In the simplest case, you can use a combination of
gzip
(or
bzip2
) and
tar
to back up files from your hard drive to removable media. This is the best method to use when you make only
occasional
backups, say, no more often than once a month.
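As a minimal sketch of that simplest case, the following archives a directory with tar and compresses the result with gzip. The paths are throwaway placeholders created with mktemp so that the example is safe to run as-is; in practice you would substitute your own source directory and a destination on removable media.

```shell
#!/bin/sh
# Sketch only: WORK, SRC, and DEST are placeholder paths for illustration.
set -e
WORK=$(mktemp -d)
SRC="$WORK/src"
DEST="$WORK/backup.tar.gz"
mkdir -p "$SRC"
echo "sample data" > "$SRC/notes.txt"

# -c creates an archive, -f - writes it to standard output,
# and -C changes into $WORK first so member names are relative.
tar -cf - -C "$WORK" src | gzip > "$DEST"

# List the contents to confirm that the archive is readable.
gzip -dc "$DEST" | tar -tf -
```

Replacing gzip with bzip2 (and gzip -dc with bzip2 -dc) gives the bzip2 variant of the same pipeline.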
The idea behind an incremental backup is that it is more efficient to make backups in small steps; you use fewer tapes or CDs, and the weekly and nightly backups are shorter and easier to run. With this method, you have a backup that is at most a day old. If you were to, say,
accidentally
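delete your entire system, you would restore it from backup in order: first the most recent full monthly backup, then each weekly backup made since then, and finally each daily backup made since the last weekly one.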
Depending on the
size
of your system, the full monthly backup might require 4 GB or more of backup storage, often not more than one DVD-R(W) or tape, but quite a few Zip disks. However, the weekly and daily backups would generally require much less storage space. Depending on how your system is used, you might decide to make the weekly backup on Sunday night and not bother with daily backups for the
weekend
.
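The restore order for a monthly, weekly, and daily scheme can be sketched as a small shell function. The archive names used here (monthly.tar, weekly-*.tar, daily-*.tar) are purely hypothetical, and the sketch assumes the directory holds only the archives made since the last full backup, named so that they sort in chronological order.

```shell
# restore_all BACKUPS TARGET
# Extract the full backup first, then the weekly and daily archives, so that
# newer copies of a file overwrite older ones. File names are assumptions.
restore_all() {
    backups=$1
    target=$2
    for archive in "$backups"/monthly.tar \
                   "$backups"/weekly-*.tar \
                   "$backups"/daily-*.tar; do
        # An unmatched pattern stays literal, so skip anything that isn't a file.
        [ -f "$archive" ] || continue
        tar -xf "$archive" -C "$target"
    done
}
```

Calling restore_all /backup /mnt/restore, for example, would rebuild the tree under /mnt/restore (both paths are illustrative).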
One important characteristic that backups should (usually) have is the ability to select individual files from the backup for restoration. This way, if you accidentally delete a single file or
group
of files, you can simply restore those files without having to do a full system restoration. Depending on how you make backups, however, this task will be either very easy or painfully difficult.
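With a plain tar archive, restoring a single file is straightforward, as the following sketch suggests. It builds a small archive in a temporary directory, deletes one file, and then extracts only that member; the directory layout and file names are invented for the example.

```shell
#!/bin/sh
# Sketch only: all paths below are placeholders under a temporary directory.
set -e
WORK=$(mktemp -d)
mkdir -p "$WORK/home/alice"
echo "report" > "$WORK/home/alice/report.txt"
echo "todo"   > "$WORK/home/alice/todo.txt"
tar -cf "$WORK/backup.tar" -C "$WORK" home

# Simulate an accidental deletion.
rm "$WORK/home/alice/report.txt"

# List the archive to find the exact member name ...
tar -tf "$WORK/backup.tar"

# ... then extract just that member, relative to $WORK.
tar -xvf "$WORK/backup.tar" -C "$WORK" home/alice/report.txt
```

The member name given to tar must match the name shown in the listing exactly, including any leading directory components.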
It's also highly desirable to keep the backup media physically separate from the computer. If you choose to back up to hard disk, consider an external USB, FireWire, or SCSI drive. USB and FireWire are particularly nice because they can be easily plugged into or removed from the system as needed. If you choose to back up to a second, internal hard disk, it would be wise to at least keep the disk unmounted when it's not in use, so that if you were to accidentally delete one or more filesystems, your backup would be spared. An important decision you need to make is evaluating the relative importance of your data's safety and recoverability versus the cost and convenience of the backup media you choose, as well as how you use it.
In this section, we talk about the use of
tar
,
gzip
, and a few
related
tools for making backups to CD and tape. We even cover the use of tape
drives
, as well as CD-R in the bargain. These tools allow you to make backups more or less "by hand"; you can automate the process by writing shell scripts and even schedule your backups to run automatically during the night using
cron
. All you have to do is flip tapes. Other software packages provide a nice
menu-driven
interface for creating backups, restoring specific files from backup, and so forth. Many of these packages are, in fact, nice frontends to
tar
and
gzip
. You can decide for yourself what kind of backup system
suits
you best.
27.1.1. Simple Backups
The simplest means of making a backup is to use
tar
to archive all the files on the system or only those files in a set of specific directories. Before you do this, however, you need to decide what files to back up. Do you need to back up every file on the system? This is rarely necessary,
especially
if you have your original installation disks or CD-ROM. If you have made important changes to the system, but everything else is just the way it was found on your installation media, you could get by with only archiving those files you have made changes to. Over time, however, it is difficult to keep track of such changes.
In general, you will be making changes to the system configuration files in
/etc
. There are other configuration files as well, and it can't hurt to archive directories such as
/usr/lib
and
/etc/X11
(which contains the X server configuration files, as we saw in "Installing X.org" in Chapter 16).
You should also back up your kernel sources (if you have upgraded or built your own kernel); these are found in
/usr/src/linux
.
During your Linux
adventures
it's a good idea to keep notes on what features of the system you've made changes to so that you can make
intelligent
choices when making backups. If you're truly
paranoid
, go ahead and back up the whole system; that can't hurt, but the cost of backup media might.
Of course, you should also back up the home directories for each
user
on the system; these are generally found in
/home
. If you have your system configured to receive electronic mail (see "The Postfix MTA" in Chapter 23), you might want to back up the incoming mail files for each user. Many people tend to keep old and "important" electronic mail in their incoming mail spool, and it's not difficult to accidentally corrupt one of these files through a mailer error or other mistake. These files are usually found in
/var/spool/mail
. Of course, this applies only if you are using the local mail system, not if you access mail directly via POP3 or IMAP.
27.1.1.1. Backing up to tape
Assuming you know what files or directories to back up, you're ready to roll. You can use the
tar
command directly, as we saw in "Using tar" in Chapter 12, to make a backup. For example, the command:
tar cvf /dev/qft0 /usr/src /etc /home
archives all the files from
/usr/src
,
/etc
, and
/home
to
/dev/qft0
.
/dev/qft0
is the first "floppy-tape" device, that is, a tape drive that
hangs
off of the floppy controller. Many popular tape drives for the PC use this interface. If you have a SCSI tape drive, the device
names
are
/dev/st0
,
/dev/st1
, and so on, based on the drive number. Those tape drives with another type of interface have their own device names; you can determine these by looking at the documentation for the device driver in the kernel.
You can then read the archive back from the tape using a command such as:
tar xvf /dev/qft0
This is exactly as if you were dealing with a
tar
file on disk, as discussed in "Archive and Compression Utilities" in Chapter 12.
When you use the tape drive, the tape is seen as a stream that may be read from or written to in one direction only. Once
tar
is done, the tape device will be closed, and the tape will rewind. You don't create a filesystem on a tape, nor do you mount it or attempt to access the data on it as files. You simply treat the tape device itself as a single "file" from which to create or extract archives.
Be sure your tapes are formatted before you use them. This ensures that the beginning-of-tape marker and bad-blocks information have been written to the tape. For formatting QIC-80 tapes (those used with floppy-tape drivers), you can use a tool called
ftformat
that is either already included with your distribution or can be downloaded from ftp://sunsite.unc.edu/pub/Linux/kernel/tapes as part of the
ftape
package.
Creating one
tar
file per tape might be
wasteful
if the archive requires only a fraction of the capacity of the tape. To place more than one file on a tape, you must first prevent the tape from rewinding after each use, and you must have a way to position the tape to the
next
file marker, for both
tar
file creation and extraction.
The way to do this is to use the nonrewinding tape devices, which are named
/dev/nqft0
,
/dev/nqft1
, and so on for floppy-tape drivers, and
/dev/nst0
,
/dev/nst1
, and so on for SCSI tapes. When this device is used for reading or writing, the tape will not be rewound when the device is closed (that is, once
tar
has completed). You can then use
tar
again to add another archive to the tape. The two
tar
files on the tape won't have anything to do with each other. Of course, if you later overwrite the first
tar
file, you may overwrite the second file or leave an undesirable gap between the first and second files (which may be interpreted as garbage). In general, don't attempt to replace just one file on a tape that has multiple files on it.
Using the nonrewinding tape device, you can add as many files to the tape as space
permits
. To rewind the tape after use, use the
mt
command.
mt
is a general-purpose command that
performs
a number of functions with the tape drive. The
mt
command, although very useful and powerful, is also
fairly
complicated. There's a lot to keep track of to locate a particular record on the tape, and it's easy to get
confused
. If you're particularly motivated to use your tapes as
efficiently
as possible, read up on
mt
; the
manpage
is quite
concise
. We include a few examples here. The command:
mt -f /dev/nqft0 rewind
rewinds the tape in the first floppy-tape device.
Similarly, the command:
mt -f /dev/nqft0 reten
retensions the tape by winding it to the end and then rewinding it.
When reading files on a multiple-file tape, you must use the nonrewinding tape device with
tar
and the
mt
command to position the tape to the appropriate file.
For example, to skip to the next file on the tape, use the command:
mt -f /dev/nqft0 fsf 1
This skips over one file on the tape; the general form is mt -f device fsf count. Similarly, to skip over two files, use:
mt -f /dev/nqft0 fsf 2
Be sure to use the appropriate nonrewinding tape device with
mt
. Note that this command does not move to "file number two" on the tape; it skips over the next two files based on the current tape position. Just use
mt
to rewind the tape if you're not sure where the tape is currently positioned. You can also skip back; see the
mt
(1) manual page for a complete list of options.
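Putting these pieces together, a multiple-file tape session might look roughly like the following sketch. The function names are invented for illustration, and the default device /dev/nst0 is an assumption; set TAPE to the nonrewinding device that matches your drive. Nothing touches the tape until one of the functions is called.

```shell
# TAPE must be a nonrewinding device; /dev/nst0 is only a default guess.
TAPE=${TAPE:-/dev/nst0}

# Write two independent archives, one after the other, then rewind.
write_two_archives() {
    mt -f "$TAPE" rewind
    tar -cf "$TAPE" /etc     # first file on the tape
    tar -cf "$TAPE" /home    # second file, written right after the first
    mt -f "$TAPE" rewind
}

# List the second archive: rewind, skip one file, then read.
read_second_archive() {
    mt -f "$TAPE" rewind
    mt -f "$TAPE" fsf 1
    tar -tvf "$TAPE"
}
```

Rewinding first in read_second_archive means the fsf count is always relative to the start of the tape, which avoids having to remember the current position.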
27.1.1.2. Backing up to CD-R
You can back up your files to recordable CD perhaps more easily than to tape. Blank CDs are very inexpensive, widely available, readable on just about any system, and much easier to transport and store than tapes. In this section, we show you the basics of writing backups to CD-R, as well as a couple of tricks. Almost all of the techniques that work for CD-Rs work equally well for the various flavors of recordable DVDs that are available.
By far the most common way to write data to a CD is to create a CD image file on your hard disk, then burn that to CD. This is easy to do, but has one
slight
disadvantage
: you need at least 650 or 700 MB of free disk space to create a full-
sized
CD image. On modern systems that generally shouldn't be a problem.
CD-ROMs use the ISO 9660 filesystem standard, which can be mounted and read on just about any operating system in common use today. The program
mkisofs
is a
full-featured
and robust tool for creating such filesystems, which can be used in a number of ways, including burning them to CD-R. The actual burning can be done with
cdrecord
. Both of these programs are usually included with most Linux systems.
Here's how to create an ISO 9660 image and burn it to CD. Let's say you have a directory called
/data
that you want to put on CD. Enter:
# mkisofs -T -r -o /tmp/mycd.iso /data
# cdrecord -v -eject -fs=4M speed=8 dev=0,0,0 /tmp/mycd.iso
Some of the parameters of
cdrecord
are system-specific. You can run
cdrecord -scanbus
to search for the CD burner on your machine. On the machine used for testing the material in this section, the CD burner shows up as device 0,0,0. Even though the author has a 52x burner, he still chooses to record CDs at only 8x, to make sure the drive's buffer doesn't underrun and produce a bad disk. Experiment with your hardware and determine what works; you may or may not be able to burn reliably at higher speeds.
A slick, if somewhat less reliable, way to create a CD without writing an image file first is to simply pipe the output of
mkisofs
directly to
cdrecord
:
mkisofs -T -r /data | cdrecord -v -eject -fs=4M speed=8 dev=0,0,0 -
That's not the only possible optimization. If, for some reason, you wanted to treat a CD like a tape, you could skip creating an ISO 9660 filesystem, and just write a tar file directly to a CD. This won't be mountable as a normal CD, and you won't be able to put it in a Windows system, but if you prefer this, it works:
tar -czf - /data | cdrecord -v -eject -fs=4M speed=8 dev=0,0,0 -
It is important to note that although CD burners are much better than they were a few
years
ago, it's still quite possible to produce a useless disk if there is anything more than a brief interruption in the flow of data from the source to the burner. These problems are even more apparent when burning from a pipeline as in the last two examples. For this reason, we urge you to check your backups after making them!
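One simple way to check a disk burned from an image file is to read it back and compare it byte for byte with the image. The helper below is a sketch of that idea, not a standard tool: the name verify_copy is invented, and it compares only as many bytes as the image contains, on the assumption that the disk may return extra padding after the data.

```shell
# verify_copy IMAGE COPY
# Succeeds if the first size-of-IMAGE bytes of COPY are identical to IMAGE.
# COPY may be a regular file or a device node such as /dev/cdrom (assumed name).
verify_copy() {
    image=$1
    copy=$2
    # The arithmetic expansion strips any padding some wc versions print.
    size=$(( $(wc -c < "$image") ))
    head -c "$size" "$copy" | cmp -s - "$image"
}
```

For the image created earlier, the check would be something like verify_copy /tmp/mycd.iso /dev/cdrom && echo "disk matches image".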
In addition to
cdrecord
,
tar
, and
mkisofs
, there are a large number of other programs available on the Web that provide an easy-to-use frontend for creating backups. Some of them are able to span multiple CDs, or manage a rotation of CD-RW disks. If you find that creating CD backups as we've described here doesn't fit your needs,
chances
are someone has created another program that will work for you.
27.1.1.3. Backing up to hard disks
As hard disks get bigger and cheaper, one problem is that backup media often don't keep up. Now that 500-GB hard disks are available and affordable by normal people, it may have occurred to you that the only thing you can back a disk that big up to is...another disk that big!
Just about all of the techniques you can use for media in general can apply to hard disks, but there are a few special considerations as well.
If you have a disk mounted at
/data
, and you want to back it up to a second hard disk (of equal or greater size) mounted at
/backup
, you could do a
tar
and un-
tar
pipeline, like this:
cd / ; tar -cvf - /data | (cd /backup ; tar -xf -)
If you have room on your backup disk, you can use the remaining space to store incremental backups using the techniques described elsewhere in this chapter. The nice thing about hard disk backups is that you can create any kind of directory structure that makes sense to you. You could have a disk mounted at
/backup
, with subdirectories
/backup/full
and
/backup/incremental
, or any other scheme you choose.
With a combination of
find
,
cron
,
tar
, and
gzip
, you could create a fairly small but powerful script that would mount your backup hard disk,
tar
up the files that have changed since the last time your backup ran, delete the backups older than the last full backup, and
unmount
the backup disk.
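The core of such a script might look like the following sketch. It covers only the "archive what changed since the last run" step; mounting, unmounting, and pruning old archives are left out and would wrap around the call. The function name, the last-run timestamp file, and the archive naming scheme are all assumptions made for the example.

```shell
# incremental_backup SRC DEST
# Archive the files under SRC that changed since the previous run into DEST.
# Prints the path of the new archive, or nothing if no files changed.
incremental_backup() {
    src=$1
    dest=$2
    stamp="$dest/last-run"        # hypothetical timestamp file
    list=$(mktemp)
    touch "$stamp.new"            # record the start time before scanning
    if [ -f "$stamp" ]; then
        find "$src" -type f -newer "$stamp" -print > "$list"
    else
        find "$src" -type f -print > "$list"   # first run: take everything
    fi
    if [ -s "$list" ]; then
        out="$dest/incr-$(date +%Y%m%d-%H%M%S).tar.gz"
        # Keep the old timestamp if tar fails, so nothing is skipped next time.
        tar -czf "$out" -T "$list" || { rm -f "$list" "$stamp.new"; return 1; }
        echo "$out"
    fi
    mv "$stamp.new" "$stamp"
    rm -f "$list"
}
```

A nightly cron job could then run something like mount /backup, incremental_backup /data /backup/incremental, and umount /backup in sequence. The timestamp is taken before the scan so that a file modified while the backup is running is picked up again by the next run.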
27.1.1.4. To compress or not to compress?
There are good arguments both for and against compression of
tar
archives when making backups. The overall problem is that
neither
tar
nor the compression tools
gzip
and
bzip2
are particularly fault-tolerant, no matter how
convenient
they are. Although compression using
gzip
or
bzip2
can greatly reduce the amount of backup media required to store an archive, compressing entire
tar
files as they are written to CD-R or tape makes the backup prone to complete loss if one block of the archive is corrupted, say, through a media error (not uncommon in the case of CD-Rs and tapes). Most compression algorithms,
gzip
and
bzip2
included, depend on the coherency of data across many bytes in order to achieve compression. If any data within a compressed archive is corrupt,
gunzip
may not be able to uncompress the file from that point on, making it completely unreadable to
tar
.
This is much worse than if the
tar
file were uncompressed on the tape. Although
tar
doesn't provide much protection against data corruption within an archive, if there is minimal corruption within a
tar
file, you can usually recover most of the archived files with little trouble, or at least those files up until the corruption occurs. Although far from perfect, it's better than losing your entire backup.
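If you do compress your archives, it is worth testing them right after they are written. The sketch below uses gzip -t, which reads a compressed file and checks its integrity without extracting anything, and then simulates media damage by truncating a copy of the archive. The file names are placeholders under a temporary directory.

```shell
#!/bin/sh
# Sketch only: all files live under a throwaway directory.
set -e
WORK=$(mktemp -d)
echo "important data" > "$WORK/file.txt"
tar -cf - -C "$WORK" file.txt | gzip > "$WORK/backup.tar.gz"

# gzip -t decompresses in memory and verifies the stream's checksum.
if gzip -t "$WORK/backup.tar.gz"; then
    echo "backup.tar.gz: OK"
fi

# Simulate a media error by keeping only the first 20 bytes.
head -c 20 "$WORK/backup.tar.gz" > "$WORK/damaged.tar.gz"
if ! gzip -t "$WORK/damaged.tar.gz" 2>/dev/null; then
    echo "damaged.tar.gz: corrupt"
fi
```

A test like this only tells you that the compressed stream is intact; it does not recover anything from a damaged one.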
A better solution is to use an archiving tool other than
tar
to make backups. Several options are available.
cpio
is an archiving utility that
packs
files together, similar in fashion to
tar
. However, because of the simpler storage method used by
cpio
, it recovers cleanly from data corruption in an archive. (It still doesn't handle errors well on
gzipped files.)
The best solution may be to use a tool such as
afio
.
afio
supports
multivolume
backups and is similar in some respects to
cpio
. However,
afio
includes compression and is more reliable because each individual file is compressed. This means that if data in an archive is corrupted, the damage can be isolated to individual files, instead of to the entire backup.
These tools should be available with your Linux distribution, as well as from all the Internet-based Linux archives. A number of other backup utilities, with varying degrees of popularity and usability, have been developed or ported for Linux. If you're serious about backups, you should look into them.
Among these programs are the
freely
available
taper
,
tob
, and Amanda, as well as commercial programs such as ARKEIA (free for use with up to two computers), BRU, and Arcserve. Lots of free backup tools can also be found at http://www.tucows.com/downloads/Linux/IS-IT/FileManagement/BackupRestore.
27.1.2. Incremental Backups
Incremental backups, as described earlier in this chapter, are a good way to keep your system backups up-to-date. For example, you can make nightly backups of only those files that changed in the last 24 hours, weekly backups of all files that changed in the last week, and monthly backups of the entire system.
You can create incremental backups using the tools mentioned previously:
tar
,
gzip
,
cpio
, and so on. The first step in creating an incremental backup is to produce a list of files that have changed since a certain amount of time ago. You can do this easily with the
find
command.
If you use a special backup program, you will most likely not have to do this; instead, you can simply set an option requesting an incremental backup.
For example, to produce a list of all files that were modified in the last 24 hours, we can use the command:
find / -mtime -1 \! -type d -print > /tmp/filelist.daily
The first argument to
find
is the directory to start from; here,
/
, the root directory. The
-mtime -1
option
tells
find
to locate all files that changed in the last 24 hours.
The
\! -type d
bit is complicated (and optional), but it cuts some unnecessary stuff from your output. It tells
find
to exclude directories from the resulting file list. The ! is a negation operator (meaning here, "exclude files of type d"), but put a backslash in front of it because
otherwise
the shell interprets it as a special character.
The
-print
option causes all filenames matching the search to be printed to standard output. We redirect standard output to a file for later use. Likewise, to locate all files that changed in the last week, use:
find / -mtime -7 -print > /tmp/filelist.weekly
Note that if you use
find
in this way, it traverses all mounted filesystems. If you have a CD-ROM mounted, for example,
find
attempts to locate all files on the CD-ROM as well (which you probably do not wish to back up). The
-xdev
option can be used to limit
find
's traversal to the filesystem on which it starts. Another approach would be to use
find
multiple times with a first argument other than
/
. See the manual page for
find
(1) for details.
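The effect of these options can be tried safely on a scratch directory before pointing find at the root directory. In the sketch below, one file is given an artificially old modification time with touch -t, so that only the other file should appear in the list; the paths are placeholders.

```shell
#!/bin/sh
# Sketch only: operates on a temporary directory rather than /.
set -e
WORK=$(mktemp -d)
echo new > "$WORK/new.txt"
echo old > "$WORK/old.txt"
touch -t 200001010000 "$WORK/old.txt"   # pretend it was last changed in 2000

# -xdev      do not descend into directories on other filesystems
# -mtime -1  modified less than 24 hours ago
# \! -type d leave directories themselves out of the list
find "$WORK" -xdev -mtime -1 \! -type d -print > "$WORK.list"
cat "$WORK.list"
```

The list is written next to the scratch directory rather than inside it, so that the list file does not end up listing itself.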
Now you have produced a list of files to back up. Previously, when using
tar
, we specified the files to archive on the command line. However, this list of files may be too long for a single command line (whose total length is limited by the system), and the list itself is contained within a file.
You can use the
-T
option with
tar
to specify a file containing a list of files for
tar
to back up. In order to use this option, you have to use an alternate syntax to
tar
in which all options are specified explicitly with dashes. For example, to back up the files listed in
/tmp/filelist.daily
to the device
/dev/qft0
, use the following command:
tar -cv -T /tmp/filelist.daily -f /dev/qft0
You can now write a short shell script that automatically produces the list of files and backs them up using
tar
. You can use
cron
to execute the script nightly at a certain time; all you have to do is make sure there's a tape in the drive. You can write similar scripts for your weekly and monthly backups.
cron
is covered in Chapter 10.
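A nightly script along these lines might be as short as the following sketch. The function name and the ROOT, TAPE, and LIST variables are assumptions introduced for the example; the defaults mirror the commands shown above, and each can be overridden from the environment. The block only defines the function, so nothing is scanned or written until it is called.

```shell
#!/bin/sh
# Sketch of a nightly incremental backup. Variable names are illustrative.
ROOT=${ROOT:-/}                     # where find starts
TAPE=${TAPE:-/dev/qft0}             # tape device or archive file to write
LIST=${LIST:-/tmp/filelist.daily}   # where the file list is stored

make_daily_backup() {
    # Files (not directories) changed in the last 24 hours, staying on
    # the filesystem that contains $ROOT.
    find "$ROOT" -xdev -mtime -1 \! -type d -print > "$LIST"
    # Archive exactly the files named in the list.
    tar -cv -T "$LIST" -f "$TAPE"
}

# A crontab entry such as the following (the script path is hypothetical)
# would run the backup at 2:30 every night:
#   30 2 * * * /usr/local/sbin/daily-backup
```

In a real script, a call to make_daily_backup would follow the definition. The weekly variant differs only in using -mtime -7 and a different list file.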