How Do We Back Up the System?


	Solaris™ Operating Environment Boot Camp By David Rhodes, Dominic Butler
	Chapter 22. Backing Up and Restoring the System

Solaris provides us with many different commands for backing up and restoring data. Which one we choose will depend on a number of factors; it might simply be that it is the method that we are already most familiar with. We will look at the main commands and utilities in this section.

Just be aware that they are not compatible or interchangeable with each other, so the command we use to back up the data will determine the command needed to restore data from the backup.

Dd

The most basic of the backup and restore commands is probably dd (which stands for "direct dump"; it was originally to be called "copy and convert," but cc was already taken). It will copy data from one file to another and you can specify the block size, which can affect the speed of the copy.

In the following example, we use dd to back up the data stored in a file called fred (i.e., the contents of fred) into another file called fred.bak:

 hydrogen$ dd if=fred of=fred.bak
 1125+1 records in
 1125+1 records out
 hydrogen$ ls -l fred*
 -rw-rw-r--   1 jgreen   staff     576415 Nov 29 10:02 fred
 -rw-rw-r--   1 jgreen   staff     576415 Nov 29 10:34 fred.bak
 hydrogen$

Here, we literally set the input file (if) to point to the file fred and the output file (of) to fred.bak and let dd get on with making the copy. It will read from the input file one block at a time and write the data to the output file one block at a time. The default block size is 512 bytes. When it has finished, dd reports the number of complete and incomplete blocks that were copied. In this example, dd copied 1,125 complete blocks and one partial block.
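The record counts follow directly from the file size and the block size. The sketch below is only an illustration: it assumes a scratch directory created with mktemp and a zero-filled stand-in for fred of the same size (neither appears in the original session). It predicts the counts with shell arithmetic and then lets dd confirm them.

```shell
# Scratch directory and stand-in file are hypothetical, for illustration only.
workdir=$(mktemp -d)
size=576415        # size of fred in bytes, taken from the ls -l listing
bs=512             # dd's default block size

# Build a zero-filled stand-in for fred of exactly $size bytes.
dd if=/dev/zero of="$workdir/fred" bs="$size" count=1 2>/dev/null

# Predict what dd should report: complete blocks "+" partial blocks.
full=$((size / bs))
if [ $((size % bs)) -ne 0 ]; then partial=1; else partial=0; fi
predicted="${full}+${partial}"
echo "predicted: $predicted records"

# dd writes its statistics to standard error, so capture that stream.
dd if="$workdir/fred" of="$workdir/fred.bak" 2>"$workdir/stats"
cat "$workdir/stats"
```

Dividing 576,415 by 512 gives 1,125 complete blocks with 415 bytes left over, and those 415 bytes form the single partial block.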

To restore the data back from fred.bak we would just reverse the above so that the input file was fred.bak and the output file fred:

 hydrogen$ rm fred
 hydrogen$ dd if=fred.bak of=fred
 1125+1 records in
 1125+1 records out
 hydrogen$ ls -l fred*
 -rw-rw-r--   1 jgreen   staff     576415 Nov 29 10:48 fred
 -rw-rw-r--   1 jgreen   staff     576415 Nov 29 10:34 fred.bak
 hydrogen$

This is very straightforward but only useful if your system only has one file to back up, otherwise you would need to run an instance of dd for each file to be backed up. You may also have noted that the backup of fred will be created in the same directory as fred, which is not the usual way to secure your valuable data. In fact, while we are on the subject of the suitability of dd as a backup tool, surely the cp command would do exactly the same job with less typing.

It is true that if we wanted to copy fred to fred.bak, dd is probably not the obvious tool to use. However, if we wanted to copy fred to tape, that is another matter.

Let's assume that we only have one important file on the system (fred) and we will use dd to back it up. However, we don't want the backup copy to be located elsewhere on the hard disk; we want to back it up to tape instead. Fortunately, this is just as easy to do as the above example, but this time we just need to specify that the output file is the tape device (or rather the file associated with the tape device; see Chapter 17, "Adding SCSI Devices").

 hydrogen# dd if=fred of=/dev/rmt/0
 1125+1 records in
 1125+1 records out
 hydrogen#

Here the data in the file fred is copied to the tape in the tape drive associated with rmt/0 (and now, at last, we have done something that we couldn't with the cp command). The reason we say that the data in fred is copied to tape, rather than just saying fred is copied to tape, is that only the contents of the file are written. Neither the name of the file, nor its permissions, nor anything else held in the inode is stored on the tape with the data (see Chapter 6, "The Filesystem and Its Contents"). To restore the data we would type:

 hydrogen# dd if=/dev/rmt/0 of=sebastobold
 hydrogen# ls -l fred sebastobold
 -rw-rw-r--   1 jgreen   staff     576415 Nov 29 10:48 fred
 -rw-rw-r--   1 root     other     576415 Nov 29 11:14 sebastobold
 hydrogen#

A file called sebastobold is created, which contains an exact copy of the data that was originally stored in the file fred. The ownership and group of the new file are determined by the user that ran the command and the permissions are based on that user's umask. So, even though the original file was owned by the user jgreen, because we restored the data as root the restored file is also owned by root.
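The effect of the umask can be seen without a tape drive. The following sketch assumes a scratch directory and a file name fred.restored, both chosen for illustration. It gives the original file restrictive permissions, copies it with dd, and compares the two modes.

```shell
# Scratch directory and file names are hypothetical, for illustration only.
workdir=$(mktemp -d)

umask 022                               # new files are created with mode 644
echo "important data" > "$workdir/fred"
chmod 600 "$workdir/fred"               # the original is private to its owner

# dd copies only the bytes; the mode of the copy comes from the umask.
dd if="$workdir/fred" of="$workdir/fred.restored" 2>/dev/null

# The first ten characters of ls -l are the file type and permission bits.
orig_mode=$(ls -l "$workdir/fred" | cut -c1-10)
copy_mode=$(ls -l "$workdir/fred.restored" | cut -c1-10)
echo "original: $orig_mode  copy: $copy_mode"
```

The contents of the two files are identical, but the copy carries the permissions of a newly created file rather than those of the original.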

You have probably already come to the conclusion that dd is not the most useful command for backing up a server, as it will only back up one file and it doesn't store any of the file's ownership and permissions details. But, as we saw in Chapter 6, "The Filesystem and Its Contents," each disk slice, and also each disk, has a file associated with it under the /dev directory. If we use this file as the input file to dd, we can make a backup copy of a whole filesystem or disk with one simple command:

 hydrogen# dd if=/dev/rdsk/c0t0d0s2 of=/dev/rmt/0 bs=2048k
 201751+1 records in
 201751+1 records out
 hydrogen#

Here we are backing up the whole of disk c0t0d0 to tape (remember that slice 2 represents the whole disk). We have also increased the block size to 2,048 KB to speed things up a bit. This specifies the amount of data that is transferred with each read and write operation. If you set a small block size, the process will take longer as it is wasting time constantly switching from reading to writing. If you choose to copy from the block device rather than the raw device, you will find that the last block copied is padded with nulls to make it end on a block boundary, so setting a large block size could cause a large amount of wasted space.

It doesn't matter that we aren't saving the owner and permissions of the file /dev/rdsk/c0t0d0s2 because we are storing this information for each of the files that are stored on that disk. If we want to restore the data, we set the input file to be the tape device and the output file to be the disk. We do not need to restore to the disk we actually backed up from; it could be any disk as long as it is big enough to hold all the data. However, if it is bigger than the original disk, we will not be able to access the whole of the new disk, as it will now appear to be the same size as the original. This is because it will now have the disk label from the original disk. Also, we should be aware that if the geometry of the disk we are restoring to is different from the one we backed up from, although it should still work, the performance may not be optimized, so trying to stick with the same type of disk is recommended. Recovery this way is quick and simple, but we can't recover individual files from this backup, just a whole disk at a time.
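The backup and restore round trip can be rehearsed safely with ordinary files standing in for the devices. In this sketch the three image files are assumptions made purely for illustration; on a real system the input and output would be the disk and tape device files.

```shell
# Ordinary files stand in for the devices; all names are hypothetical.
workdir=$(mktemp -d)
disk="$workdir/disk.img"        # stands in for the disk being backed up
tape="$workdir/tape.img"        # stands in for the tape device
newdisk="$workdir/newdisk.img"  # stands in for the replacement disk

# Create a 1 MB "disk" and write some recognizable data at the start.
dd if=/dev/zero of="$disk" bs=1024 count=1024 2>/dev/null
printf 'disk label and filesystem data' | dd of="$disk" conv=notrunc 2>/dev/null

# Back up: disk to tape. Restore: tape to the new disk.
dd if="$disk" of="$tape" bs=64k 2>/dev/null
dd if="$tape" of="$newdisk" bs=64k 2>/dev/null

# The restored image is byte-for-byte identical to the original.
cmp "$disk" "$newdisk" && echo "images match"
```

Because dd copies every byte, including whatever sits at the very start of the image, the copy inherits everything the original held. This is the same reason a restored disk takes on the label of the disk it was backed up from.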

We can also use dd to copy directly from one disk to another, as shown below:

 hydrogen# dd if=/dev/rdsk/c0t0d0s2 of=/dev/rdsk/c1t0d0s2 bs=2048k
 201751+1 records in
 201751+1 records out
 hydrogen#

This will build the disk c1t0d0 as a complete replica of disk c0t0d0. This could be used to duplicate a large number of system disks, which can then be placed in other servers to provide a standard build. Alternatively, if we have a small backup window but lots of room for removable disks, we could back each disk up to an identical one during our backup window, start the services back up again, then remove the duplicates to a safe location. If we need to recover the system, we can simply remove all the live disks and replace them with the backup disks (they were properly labelled, weren't they?). We can then boot up the system and carry on from the point in time at which the backup was taken.

If you look in the man page for dd, you will see that it has many uses other than simply copying data from one file (or device) to another. This includes activities such as converting data from one format to another. In conclusion, dd could not really be described as a general-purpose backup utility, but it is a useful tool to know about and can be used for a number of specific tasks.
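As a small illustration of the conversion features, the sketch below uses the standard conv=ucase operand to convert text to upper case as it is copied. The sample text is made up, and dd is reading standard input and writing standard output because no if= or of= operand is given.

```shell
# Convert lower case to upper case while copying from stdin to stdout.
# 2>/dev/null hides the record counts that dd prints on standard error.
upper=$(echo "hello from fred" | dd conv=ucase 2>/dev/null)
echo "$upper"
```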

Tar

We will now look at the utilities tar and cpio. Both these commands perform similar tasks. System administrators tend to favor one or the other and may often be heard arguing over which is the better of the two. I suspect most prefer the one they learned first, as this is probably the one they also know most about.

The manual pages for tar provide full information on all its available options, of which there are many, so we will get straight on with an example showing how to back up the contents of a directory to tape:

 hydrogen# tar cvf /dev/rmt/0 ./*
 a ./dir1/file1 1K
 a ./dir1/file2 1K
 a ./dir2/file1 2K
 a ./dir2/file2 3K
 <lines removed for clarity>
 hydrogen#

This command says that we are going to create ("c" for create) a backup (or archive) to the file /dev/rmt/0 ("f" for file), which, in this case, is the tape device. We want to see the names of the files as they are backed up ("v" for verbose) and the files we want to back up are all those in the current directory ("*" matches all files). You will notice that the above example also backs up all subdirectories and their contents. If we want to use tar to back up a whole filesystem, we simply need to specify the root directory of the filesystem on the command line.

As an aside, the fact that we used "./*" in the above example may have an unwanted side effect because "./*" will not match files beginning with a dot in the current directory (often referred to as "hidden files"). This is fine if there are no such files. But if you know the directory does contain hidden files and you wish to include them, "." should be used instead of "./*".
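The difference is easy to demonstrate by archiving to an ordinary file instead of a tape. This sketch assumes a scratch directory containing one subdirectory and one hidden file, with names chosen for illustration.

```shell
# Scratch directory and file names are hypothetical, for illustration only.
workdir=$(mktemp -d)
src="$workdir/src"
mkdir -p "$src/dir1"
echo one    > "$src/dir1/file1"
echo secret > "$src/.profile"       # a "hidden" file

# Archive with ./* : the shell expands the pattern before tar runs,
# and the expansion skips names that begin with a dot.
(cd "$src" && tar cf "$workdir/star.tar" ./*)

# Archive with . : tar walks the directory itself and sees everything.
(cd "$src" && tar cf "$workdir/dot.tar" .)

echo "--- archived with ./* ---"; tar tf "$workdir/star.tar"
echo "--- archived with .   ---"; tar tf "$workdir/dot.tar"
```

Only the second listing contains .profile. The omission is caused by the shell's expansion of the pattern rather than by tar itself.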

If we want to check what is on the tape we can run the following command:

 hydrogen# tar tvf /dev/rmt/0
 -rw-rw-r-- 1001/10    781 Oct  7 07:51 1999 ./dir1/file1
 -rw-rw-r-- 1001/10    653 Nov 25 11:40 1999 ./dir1/file2
 -rw-rw-r-- 1001/10   1391 Jun 18 15:00 2001 ./dir2/file1
 -rw-rw-r-- 1001/10   2371 Jun 18 15:00 2001 ./dir2/file2
 <lines removed for clarity>
 hydrogen#

The "t" stands for table; we are asking to see the table of contents of the tape. You will notice that the output is similar to an ls -l listing, but instead of seeing the names of the owner and group, we see the UID and GID.

If we want to recover all the files on the tape, we would type:

 hydrogen# tar xvf /dev/rmt/0
 x ./dir1/file1, 781 bytes, 2 tape blocks
 x ./dir1/file2, 653 bytes, 2 tape blocks
 x ./dir2/file1, 1391 bytes, 3 tape blocks
 x ./dir2/file2, 2371 bytes, 5 tape blocks
 <lines removed for clarity>
 hydrogen#

This would extract ("x" option) all the files on the tape and put them in the current directory. If we only want to recover a few files, we can do so as follows:

 hydrogen# tar xvf /dev/rmt/0 ./dir1/file1 ./dir2/file2
 x ./dir1/file1, 781 bytes, 2 tape blocks
 x ./dir2/file2, 2371 bytes, 5 tape blocks
 hydrogen#

The files are restored relative to the directory you are in when you run the command, unless they are stored on the tape with full path names, in which case they will be restored to their original location. When we supply the names of the files we wish to restore, we have to specify each name exactly as it appears when we list the contents of the tape.
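One way to avoid mistyping a stored name is to take it from the listing itself. The sketch below is an illustration using an archive file in a scratch directory (all names are assumptions). It archives with relative names, looks up the exact stored name, and restores that one file into a different directory.

```shell
# Scratch directory and file names are hypothetical, for illustration only.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/dir1" "$workdir/src/dir2" "$workdir/restore"
echo one > "$workdir/src/dir1/file1"
echo two > "$workdir/src/dir2/file2"

# Archive with relative names so the files can be restored anywhere.
(cd "$workdir/src" && tar cf "$workdir/backup.tar" .)

# List the archive to find the exact stored name, then extract by that name.
name=$(tar tf "$workdir/backup.tar" | grep 'dir1/file1$')
echo "stored as: $name"
(cd "$workdir/restore" && tar xf "$workdir/backup.tar" "$name")

ls -R "$workdir/restore"
```

Because the archive holds relative names, the file is recreated beneath whichever directory the extract is run from, and only the requested file is restored.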

If any files backed up by tar have ACLs set, the file will be backed up but not the ACL. So if you make use of ACLs you need to have a way of reapplying these following a recovery.

When tar backs up files that are linked, it will simply back up one full copy of the file and the names it was linked to. This ensures that space is not wasted in the archive and the files will remain linked when they are restored. Tar is also able to handle symbolic links in the same way.
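The handling of hard links can be checked by comparing inode numbers after a restore. This is a minimal sketch, assuming a scratch directory and the file names original and linked, which are made up for the example.

```shell
# Scratch directory and file names are hypothetical, for illustration only.
workdir=$(mktemp -d)
mkdir -p "$workdir/src" "$workdir/restore"
echo "shared data" > "$workdir/src/original"
ln "$workdir/src/original" "$workdir/src/linked"    # hard link, same inode

# Archive both names, then restore them somewhere else.
(cd "$workdir/src" && tar cf "$workdir/links.tar" .)
(cd "$workdir/restore" && tar xf "$workdir/links.tar")

# Two names that share one inode number are hard links to the same file.
inode_a=$(ls -i "$workdir/restore/original" | awk '{print $1}')
inode_b=$(ls -i "$workdir/restore/linked"   | awk '{print $1}')
echo "original: $inode_a  linked: $inode_b"
```

If the two inode numbers printed are the same, the restored names are still linked to a single copy of the data.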

Cpio

The cpio command does a similar job to tar (it collects the files whose names it is supplied with into an archive) but it is used in a slightly different way. It is often recommended, over tar, as being more compatible between different versions of UNIX (although in practice I have never had any compatibility problems with either tar or cpio).

The cpio command expects the names of the files to be backed up to be passed as a list on its standard input. This means we need a way of producing the list of files we want to back up. This is usually achieved using the find command:

 hydrogen# find /home -print | cpio -ovc -O /dev/rmt/0
 /home
 /home/lost+found
 <lines removed for clarity>
 6142 blocks
 hydrogen#

This command will find all files and directories below the /home directory and pass their names to the cpio command, which will back them up to the tape device. We can restore all the files with the following command:

 hydrogen# cpio -ivcud -I /dev/rmt/0
 /home
 /home/lost+found
 <lines removed for clarity>
 6142 blocks
 hydrogen#

This is a perfectly good way of backing up a directory tree, a filesystem, or even a whole system. But you need to be aware that because the find command presented each file or directory to cpio as a full path name, this is how they were backed up and it is the only practical way of restoring them. In other words, there is no easy way of restoring the files to a location other than their original location. (There is a way of doing it, which is to look up the "-r" option in the cpio man page, but unless you only want to recover one or two files you won't want to go to all that trouble.) If you want the flexibility of restoring files to an alternative location, you should pass relative file and directory names to cpio:

 hydrogen# cd /home
 hydrogen# find . -print | cpio -ovc -O /dev/rmt/0
 .
 ./lost+found
 <lines removed for clarity>
 6142 blocks
 hydrogen#
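Since cpio stores exactly the names it is given, it can help to inspect the list before piping it anywhere. The sketch below runs only the find half of the pipeline, against a scratch directory whose layout is made up for the example; cpio itself is not run.

```shell
# Scratch directory and file names are hypothetical, for illustration only.
workdir=$(mktemp -d)
mkdir -p "$workdir/home/jgreen"
echo data > "$workdir/home/jgreen/notes"

# Absolute starting point: every name begins with the full path.
abs_list=$(find "$workdir/home" -print)

# Relative starting point: every name begins with "." instead.
rel_list=$(cd "$workdir/home" && find . -print)

echo "absolute names:"; echo "$abs_list"
echo "relative names:"; echo "$rel_list"
```

Whichever form of name appears in the list is the form that will be recorded in the archive, and therefore the form that determines where the files land on restore.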

To list the contents of a cpio archive, we would use the "-t" option (as with tar), but we use it in addition to the "-i" option rather than instead of it:

 hydrogen# cpio -ivct -I /dev/rmt/0
 drwxr-xr-x   10 root     root           0 Nov 23 10:45 2001, .
 drwxr-xr-x    2 root     root           0 Jun 28 13:48 2001, lost+found
 <lines removed for clarity>
 hydrogen#

If we wanted to restore a number of files from a cpio archive, we would specify them on the command line just as we did with the tar command:

 hydrogen# cpio -ivcdmu -I /dev/rmt/0 ./dir1/file1 ./dir2/file2
 ./dir1/file1
 ./dir2/file2
 hydrogen#

The options we have used in this example are listed in Table 22.1.

Table 22.1. Cpio Options
Option	Description
i	We are reading data in from the archive, rather than writing out to the archive ("-o" option).
v	This is verbose mode. The name of each file processed is displayed on the standard output. If we used a capital V, cpio would display a dot for each filename instead.
c	We are using an ASCII header (which is always recommended for portability of the archive between platforms).
d	Directories will be created automatically if needed.
m	Retain the original modification time of the file.
u	If the file being restored already exists, unconditionally overwrite it.
I	We use "-I" before the input filename (or "-O" before the output filename). If "-I" is omitted cpio will read from standard input (likewise, if "-O" is omitted, cpio will write to standard output).

The commands tar and cpio are not only used for backing up data; they are also very useful for transferring files from one location on the directory tree to another, or from one machine to another. For example, if we had a group of files that we wanted to copy to other servers on the network, we could ftp each file individually, but this would be time-consuming and may not preserve the ownership and permissions of each file. Instead, we could place the files into a single archive file (using either tar or cpio), ftp that file to each server, and then extract the files at the other end.

Links and ACLs are handled by cpio in exactly the same way as tar handles them.

The cpio command can also be used to copy a directory along with all its subdirectories and files using the "-p" option:

 hydrogen# find . -print | cpio -pvdmu /new_location
 /new_location/.
 /new_location/file1
 /new_location/dir1
 /new_location/dir1/file2
 <lines removed for clarity>
 hydrogen#

As well as copying all the files and directories, the above command will also preserve all ownerships and permissions.

The same action can be undertaken with both tar and ufsdump/ufsrestore, but the commands used are more cumbersome. Examples of both are shown here, although we haven't yet looked at ufsdump and ufsrestore:

 hydrogen# tar cvf - . | (cd /new_directory; tar xvf -)
 a ./dir1/file1 1K
 a ./dir2/file2 1K
 x ./dir1/file1, 802 bytes, 2 tape blocks
 x ./dir2/file2, 649 bytes, 2 tape blocks
 <lines removed for clarity>
 hydrogen# ufsdump 0f - . | (cd /new_directory; ufsrestore rf - )
   DUMP: Writing 32 Kilobyte records
   DUMP: Date of this level 0 dump: Sat 29 Dec 2001 17:00:28 GMT
   DUMP: Date of last level 0 dump: the epoch
   DUMP: Dumping /dev/rdsk/c0t2d0s1 (hydrogen:/var) to standard output.
   DUMP: Mapping (Pass I) [regular files]
   DUMP: Mapping (Pass II) [directories]
   DUMP: Estimated 2844 blocks (1.39MB).
   DUMP: Dumping (Pass III) [directories]
   DUMP: Dumping (Pass IV) [regular files]
   DUMP: 2814 blocks (1.37MB) on 1 volume at 363 KB/sec
   DUMP: DUMP IS DONE
 <lines removed for clarity>
 hydrogen#

Both these examples work in a similar way. They have a backup process writing to one end of a pipe and a restore process reading from the other end. The example using ufsdump/ufsrestore is recommended as being the quickest of the three methods mentioned for copying large quantities of data between two directories. However, because ufsdump likes to work with entire filesystems, if we are not copying a whole filesystem we will get a number of messages complaining that certain directories were not found on the volume. However, these can be ignored since the command will copy what we expected it to.
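The tar form of the pipe can be tried on any pair of directories. The following sketch assumes two scratch directories created for the purpose. It differs from the session above in one small way, which is a choice made here rather than something taken from the original: cd and tar are joined with && instead of a semicolon, so that tar does not run in the wrong directory if the cd fails.

```shell
# Scratch directories and file names are hypothetical, for illustration only.
workdir=$(mktemp -d)
mkdir -p "$workdir/old/dir1" "$workdir/new_directory"
echo data > "$workdir/old/dir1/file1"

# The writer is on the left of the pipe and the reader on the right.
# "-" tells each tar to use standard output or standard input as the archive.
(cd "$workdir/old" && tar cf - .) | (cd "$workdir/new_directory" && tar xf -)

ls -lR "$workdir/new_directory"
```

No intermediate archive file is created; the data flows straight through the pipe from one tar process to the other.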

How Do We Fit More Data on the Tape?

Before we move on and look at the commands ufsdump and ufsrestore in more detail, we'll have a quick diversion and look at data compression. Compression is a way of reducing the amount of storage space required for a given chunk of data by converting it into a form which represents the data more efficiently.

Software Compression

If we wanted to compress the data in a file so it used up less space, we would run a compression program on the file. The program would look through the data and work out how to reduce the space taken up based on a compression algorithm. The compressed version of the file is usually given a specific extension to make it easier to distinguish compressed files from standard files.

Unfortunately, once a file has been compressed it cannot be used in the same way as it was before, because the data contained in the file is now different. If you want to use the data in the file you must uncompress it first, which will make it exactly the same as it was before (which of course means it now takes up the original amount of disk space).

The following example shows how we can compress a file to reduce the amount of space it uses on the disk:

 hydrogen# ls -l
 total 1304
 -rw-r--r--   1 root     other     665205 Dec  9 16:54 fred
 hydrogen# compress fred
 hydrogen# ls -l
 total 216
 -rw-r--r--   1 root     other     110247 Dec  9 16:54 fred.Z
 hydrogen#

There are a number of different commands available that can compress and uncompress files. They are not interchangeable and come in pairs (e.g., compress goes with uncompress and pack goes with unpack). The extension given to the compressed file will help you to see which command was used to perform the compression. The compress command produces a compressed file ending in ".Z" (uppercase), whereas pack produces a file ending in ".z" (lowercase).

We can see that we have managed to save a fair bit of space by compressing this file (it is now roughly one-sixth of its original size), but now that it is compressed we cannot use it unless we uncompress it. The original file was a standard ASCII text file, but if we look at what type of file the compressed version is, using the file command, we see that it is no longer a text file. Consequently, it is no good trying to look at its contents with any of the usual Solaris commands (e.g., cat, vi, pg, or more):

 hydrogen# file fred.Z
 fred.Z:         compressed data block compressed 16 bits
 hydrogen# uncompress fred
 hydrogen# file fred
 fred:           ascii text
 hydrogen# ls -l fred
 -rw-r--r--   1 root     other     665205 Dec  9 16:54 fred
 hydrogen#

The file command looks in the file supplied as an argument and tries to work out what type of file it is. It isn't actually clever enough to know what all the possible file types are; it knows some common file types, but will also make use of the information contained in /etc/magic to help it out. This file contains a list of byte positions along with byte sequences and descriptive text. If a file contains one of the byte sequences at the byte position specified, the text is displayed.

Since the file command doesn't actually know how to tell if a file is compressed, it does a bit of cheating. If we search for the string "compressed" in the magic file, we can see exactly how the file command knows that the file fred.Z is actually compressed:

 hydrogen# grep "compressed data" /etc/magic
 0       string          \037\235        compressed data
 hydrogen#

Here we see that if a file contains a string of the character sequence represented by \037\235 at byte position 0, the file command will print the text "compressed data." The remainder of the output from the file command (shown above) was displayed because other sequences matched other lines in the /etc/magic file.
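The same test can be made by hand with od. This sketch is a rough imitation of that one magic entry, not a description of how the file command is implemented. The function name has_compress_magic and the two sample files are invented for the example, and the ".Z" sample is fabricated from the two magic bytes rather than produced by compress.

```shell
# Scratch directory, file names, and function name are hypothetical.
workdir=$(mktemp -d)

# Stand-in for a .Z file: the two magic bytes followed by arbitrary payload.
# (A real file would come from the compress command.)
printf '\037\235payload' > "$workdir/sample.Z"
printf 'plain text\n'    > "$workdir/sample.txt"

# Succeeds if the first two bytes of the file are octal 037 and 235.
has_compress_magic() {
    magic=$(od -An -to1 -N2 "$1" | tr -d ' \n')
    [ "$magic" = "037235" ]
}

for f in "$workdir/sample.Z" "$workdir/sample.txt"; do
    if has_compress_magic "$f"; then
        echo "$f: compressed data"
    else
        echo "$f: not in compress format"
    fi
done
```

A check like this looks only at the start of the file and says nothing about whether the rest of the data is valid, which is why it can be described as a bit of cheating.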

If we do need to uncompress a file, we may not have enough space on the disk to perform this operation (after all, it is probably lack of disk space that caused us to compress the file in the first place). To save us having to juggle files between filesystems to free up enough space, there is a command provided that will let us look at a compressed file without uncompressing it first. This command is zcat, which is just like the standard cat command, but it will uncompress the data it reads from the file, in memory, writing the uncompressed version to its stdout:

 hydrogen# file archive.Z
 archive.Z:      compressed data block compressed 16 bits
 hydrogen# zcat archive | pg
 first line of the archived file
 <lines removed for clarity>
 last line of the archived file
 hydrogen#

When you use the commands uncompress or zcat, they expect the file to have a ".Z" extension, so you do not need to supply it. The command zcat has a number of useful applications; for example, if you wish to search for a string in a compressed file (in the same way you would use grep), then you could simply pipe the output from zcat into the grep command:

 hydrogen# zcat compressed_file.Z | grep "string"
 <lines removed for clarity>
 hydrogen#

So, how does all this help us to fit more data on the tapes? Well, we could compress each file before we backed it up, but a better way would be to compress the data as it was backed up leaving the version on disk in its original form.

To do this using cpio but compressing the data as it is being backed up we could use the following command:

 hydrogen# find . -print | cpio -ovc | compress >/tmp/backup.cpio
 <lines removed for clarity>
 hydrogen# ls -l /tmp/backup.cpio
 -rw-r--r--   1 root    other    168649 Dec 19 17:13 /tmp/backup.cpio
 hydrogen# file /tmp/backup.cpio
 /tmp/backup.cpio:       compressed data block compressed 16 bits
 hydrogen#

When compress is used without any arguments, it will expect to receive its input as a stream of data on its standard input (stdin). It will compress any data it reads and write the compressed version to its standard output (stdout). The result of the above example is that the file /tmp/backup.cpio contains a compressed version of the cpio archive. If we were backing up to tape, we would simply replace the output file with the name of the tape device (e.g., /dev/rmt/0).

To view the contents of the archive or recover data from the archive, we simply need to put the uncompress command at the beginning of the pipeline:

 hydrogen# uncompress </tmp/backup.cpio | cpio -ivct
 drwxrwxr-x    6 root     sys            0 Apr 29 16:04 2001, .
 drwxrwxr-x    2 adm      adm            0 Oct 18 22:55 2000, log
 drwxrwxr-x    2 adm      adm            0 Oct 18 22:55 2000, passwd
 -rw-r--r--    1 root     bin          360 Dec 19 17:08 2001, utmp
 <lines removed for clarity>
 hydrogen# uncompress </tmp/backup.cpio | cpio -ivc "utmp"
 -rw-r--r--    1 root    bin          360 Dec 19 17:08 2001, utmp
 2755 blocks
 hydrogen#

The first example shows how we can view the contents of the compressed cpio archive, and the second shows how we can extract a file from it.

The following examples show how we can do the same using the tar command (this time we will also show that zcat can be used when reading from the compressed backup file, though zcat expects the file to end in ".Z" and so we will need to rename it):

 hydrogen# tar cvf - . | compress >/tmp/backup.tar
 <lines removed for clarity>
 hydrogen# ls -l /tmp/backup.tar
 -rw-r--r--   1 root     other     169286 Dec 19 17:28 /tmp/backup.tar
 hydrogen# file /tmp/backup.tar
 /tmp/backup.tar:        compressed data block compressed 16 bits
 hydrogen# uncompress </tmp/backup.tar | tar tvf -
 drwxrwxr-x   0/3        0 Apr 29 16:04 2001 ./
 drwxrwxr-x   4/4        0 Oct 18 22:55 2000 ./log/
 drwxrwxr-x   4/4        0 Oct 18 22:55 2000 ./passwd/
 -rw-r--r--   0/2      360 Dec 19 17:08 2001 ./utmp
 <lines removed for clarity>
 hydrogen# mv /tmp/backup.tar /tmp/backup.tar.Z
 hydrogen# zcat /tmp/backup.tar.Z | tar xvf - "./utmp"
 x ./utmp, 360 bytes, 1 tape blocks
 hydrogen#

If we are running short of tape space we can compress the data as shown, which will enable us to get more data on the tape. However, this will increase the overall time taken to perform the backup.

Hardware Compression

Some tape drives have the ability to perform compression themselves by making use of the actual hardware on the tape device. If we can perform the compression this way, we can relieve some of the workload of the server CPU. The other benefit is that we won't need to remember where to put the compress or uncompress commands in our backup and restore scripts.

To make use of hardware compression, we just need to alter the name of the device we use, so if our tape device was /dev/rmt/0 we would use the device /dev/rmt/0c to either write to or read from the tape in compressed mode. Depending on the facilities available on the tape drive, there will be a number of different device names that can be used. Most tape drives will provide as a minimum a rewind device (which is the name we would normally use) and a nonrewind device (which will cause the tape to remain at its current position after use). The nonrewind device will end in the letter "n", for example, /dev/rmt/0n (see Chapter 17, "Adding SCSI Devices").
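The naming scheme lends itself to a small helper in a backup script. The function below is a sketch only: the name tape_device is invented here, and it assumes that the "c" and "n" suffixes can be combined, with "n" last, on drives that support both. Check the entries under /dev/rmt on your own system before relying on any particular name.

```shell
# Build a tape device name from a drive number and two yes/no choices.
# Assumed layout: /dev/rmt/<drive>[c][n]  (function name is hypothetical)
tape_device() {
    drive=$1
    name="/dev/rmt/$drive"
    if [ "$2" = yes ]; then name="${name}c"; fi   # hardware compression
    if [ "$3" = yes ]; then name="${name}n"; fi   # no rewind on close
    echo "$name"
}

tape_device 0 no  no     # rewinding, uncompressed
tape_device 0 yes no     # rewinding, compressed
tape_device 0 no  yes    # nonrewind, uncompressed
tape_device 0 yes yes    # nonrewind, compressed
```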

Ufsdump and Ufsrestore

The commands we have looked at so far have been concerned with backing up or archiving files. With these commands, we simply supply a filename and the file is backed up. However, there are also commands that are designed for backing up and restoring a whole filesystem. The commands that we will look at are probably the ones most commonly used for backing up live systems (where a third-party backup tool is not being used). These are ufsdump (which will back up a filesystem or individual files) and ufsrestore (which will recover either a whole filesystem or individual files). They are based on two older commands, dump and restore, but are designed for use with UFS filesystems.

Full Backup

We will start by looking at how to perform a full backup of a single filesystem, then look at incremental backups, and end up with a single backup script that we can use on all our systems:

 hydrogen# ufsdump 0uf /tmp/var.backup /var
   DUMP: Writing 32 Kilobyte records
   DUMP: Date of this level 0 dump: Sat 08 Dec 10:05:15 2001
   DUMP: Date of last level 0 dump: the epoch
   DUMP: Dumping /dev/rdsk/c0d0s4 (hydrogen:/var) to /tmp/var.backup.
   DUMP: Mapping (Pass I) [regular files]
   DUMP: Mapping (Pass II) [directories]
   DUMP: Estimated 2235926 blocks (1091.76MB).
   DUMP: Dumping (Pass III) [directories]
   DUMP: Dumping (Pass IV) [regular files]
   DUMP: 2235774 blocks (1091.69MB) on 1 volume at 11100 KB/sec
   DUMP: DUMP IS DONE
   DUMP: Level 0 dump on Sat Dec 08 10:05:15 2001
 hydrogen# ls -l /tmp/var.backup
 -rw-rw-r--   1 root    other   1144717312 Dec  8 10:08 /tmp/var.backup
 hydrogen# file /tmp/var.backup
 /tmp/var.backup:  ufsdump archive file
 hydrogen#

This command backs up the whole of the filesystem /var to the file /tmp/var.backup. If we had wanted to write the dump to tape, we would just replace the filename we specified with the filename associated with the tape device (e.g., /dev/rmt/0). The "0" specifies a dump level of 0, which will always result in a full rather than an incremental backup (we will look at dump levels in more detail when we look at incremental backups). The "u" option records the date and level of the dump in the file /etc/dumpdates, and "f" introduces the name of the file or device to write to.

Backing up a filesystem using ufsdump is generally much faster than using either tar or cpio, but we still have the control to either restore the whole filesystem or just the files we wish:

 hydrogen# ufsrestore rf /dev/rmt/0
 hydrogen#

We use the "r" option to specify that we want to restore the entire contents of the tape without any prompting for input from the user. The backup will be restored relative to the current directory and there will be no output (apart from errors and warnings) unless the "v" flag is also specified. We can also restore individual files; ufsrestore actually provides a nice interactive interface for doing so:

 hydrogen# ufsrestore if /dev/rmt/0
 ufsrestore > ls
 .:
  adm/        dt/         news/       sadm/        tmp/
  audit/      log/        nis/        saf/         yp/
  crash/      lost+found/ ntp/        snmp/
  cron/       lp/         opt/        spool/
  dmi/        mail/       preserve/   statmon/
 ufsrestore > cd adm
 ufsrestore > ls
 ./adm:
  acct/       log/        sa/         utmp        wtmp
  aculog      messages    spellhist   utmpx       wtmp.old
  lastlog     passwd/     sulog       vold.log    wtmpx
 ufsrestore > add utmp
 ufsrestore > extract
 You have not read any volumes yet.
 Unless you know which volume your file(s) are on you should start
 with the last volume and work towards the first.
 Specify next volume #: 1
 set owner/mode for '.'? [yn] n
 ufsrestore > quit
 hydrogen#

We can add as many files as we wish before typing "extract," which will then cause ufsrestore to begin recovering those files. When we are asked to "specify next volume," we should enter "1" since we know the file we want is on this tape. If we were recovering from an incremental backup, we might not know which volume contained the file we wanted so we might need to look through more than one.

Backing Up Many Filesystems to One Tape

We can see that ufsdump and ufsrestore have some useful features, but we may have one slight problem. If we use ufsdump to back up the first filesystem onto tape, how do we back up the second of the filesystems without overwriting the first, and so on? The answer to this is to use the nonrewind device. When you access a tape using the standard device (e.g., /dev/rmt/0), as soon as the command that is accessing the device completes, the tape is automatically rewound. But if you use the nonrewind device (e.g., /dev/rmt/0n), then when the command completes the tape will remain at its current position. You can then back up another filesystem; as long as you keep using the nonrewind device, the tape will not be rewound and you can go on adding more filesystems.
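The procedure can be sketched as a short script. To keep the example safe to run on a machine with no tape drive, it only prints the commands it would issue. The list of filesystems and the function name plan_backup are assumptions made for illustration.

```shell
# Dry run: print the commands instead of executing them.
TAPE=/dev/rmt/0              # rewinding device
NRTAPE=/dev/rmt/0n           # nonrewind device for the same drive
FILESYSTEMS="/ /var /home"   # hypothetical list of filesystems to dump

plan_backup() {
    for fs in $FILESYSTEMS; do
        # Each dump leaves the tape positioned just after its own file set.
        echo "ufsdump 0uf $NRTAPE $fs"
    done
    # Rewind once, after the last filesystem has been written.
    echo "mt -f $TAPE rewind"
}

plan_backup
```

Removing the echo from each line would turn the plan into a real backup, at which point the order of FILESYSTEMS also becomes the order of the file sets on the tape.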

When you have finished, the tape will need rewinding before you can access any of the data you have placed on it. The tape can be rewound using the mt command as follows:

 hydrogen# mt -f /dev/rmt/0 rewind
 hydrogen#

This command is specifically instructing the tape device to rewind the tape. However, we have just seen that any command that opens the tape device using the standard device name will cause the tape to automatically rewind when it closes the device file on completion. Therefore, all we need to actually do is open and close the device. This can be achieved simply by typing the following:

 hydrogen# </dev/rmt/0
 hydrogen#

In this example, we haven't actually typed a command, just an instruction for the shell to open the tape device for reading. As we saw in Chapter 5, "Shells," when we press Return at the end of a line the shell scans the line to see what we want it to do. It sees the less-than symbol and knows that we are redirecting standard input. It then closes standard input and opens the file we have specified (in this case, /dev/rmt/0), which therefore receives the file descriptor that standard input had. The shell removes this part of the line and looks to see if there is anything else to do before actually running the command. At this point there is nothing left on the command line, so the shell simply closes the file it opened and gets ready for its next instruction. By the time the close operation completes, the tape has rewound and the shell can display the prompt for the next command. We will therefore notice a delay before the next prompt appears, but when it does we know that the tape is back at the beginning.

Restoring Files from a Tape with Multifile Sets

Backing up many filesystems to a single tape is easy enough. But when it comes to restoring the data, things get slightly more complicated. If we try reading from a tape containing many file sets, we will see that the ufsrestore command reads only the first one and then acts as though it has reached the end of the tape. This is because each file set behaves as though it were a complete tape archive in its own right.

The problem we have then is how to get the tape to go past the end of the first file set and stop at the beginning of the one we want. One way to do this would be to use ufsrestore to display a table of contents of the tape using the nonrewind device. When the command had finished, the tape position would be at the start of the second file set. If we issued the same command again, the tape would be at the beginning of the third file set, and so on.

We now have a way of getting to the position we want the tape to be at, but it is a bit tedious. What we need is a command that takes us straight to the point in the tape that we want to be, without us needing to display the list of files stored on each file set. Fortunately, the command mt allows us to do exactly this:

 hydrogen# mt -f /dev/rmt/0n fsf 3
 hydrogen#

This command will cause mt to wind the tape forward and skip three end-of-file markers, leaving the tape positioned at the beginning of the fourth file set, which will be the fourth filesystem backed up. It is important to remember to use the nonrewind device, otherwise mt will find the correct place but the tape will then be automatically rewound back to the beginning.
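The positioning step can be wrapped up as a small function. This is only a sketch: goto_fileset and the RUN hook are hypothetical names, and it assumes the file sets are numbered from 1 in the order they were written. It rewinds first so that the skip count is always measured from the start of the tape.

```shell
# Sketch only: goto_fileset and RUN are hypothetical names for illustration.
# Set RUN=echo to print the commands instead of running them.

# usage: goto_fileset <nonrewind-device> <file-set-number, counting from 1>
goto_fileset() {
    gf_tape=$1
    gf_set=$2
    case $gf_set in
        ''|*[!0-9]*|0*)
            echo "file set number must be a whole number, 1 or greater" >&2
            return 1 ;;
    esac
    # Start from a known position at the beginning of the tape.
    ${RUN:-} mt -f "$gf_tape" rewind || return 1
    # Reaching file set N means skipping N-1 end-of-file markers.
    gf_skip=$((gf_set - 1))
    if [ "$gf_skip" -gt 0 ]; then
        ${RUN:-} mt -f "$gf_tape" fsf "$gf_skip" || return 1
    fi
}
```

After "goto_fileset /dev/rmt/0n 4", a command such as "ufsrestore tf /dev/rmt/0n" would list the contents of the fourth file set.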

Incremental Backups

We have seen that the amount of tape space required for a backup can be reduced using data compression. Another way of doing this, which also reduces the time taken to perform the backup, is to take an incremental backup. This backs up only the files that have changed since a specific time or event in the past, usually the last backup. An incremental backup is not something we would normally do just once; it would be part of our overall backup strategy.

The process would usually be that a full backup is taken, followed by an incremental backup each night. The first incremental backup would just contain the files that had changed since the full backup was taken. The second incremental could contain all the files that had changed since the full backup, or only those that have changed since they were last backed up.

The former method will result in each incremental backup containing progressively more data, while the latter would ensure that the backup was as small as possible but would increase the work when it came to restoring data. If we wanted to restore the entire contents from an incremental backup (which could be an entire filesystem or an entire system), we would first need to restore the entire contents of the initial full backup then restore the latest version of the files that have changed since the full backup.

If the incremental backup backed up all files that had changed since the full backup every night, then we would only need to restore the entire contents of the latest incremental backup to be sure we had recovered the latest version of everything. However, if our incremental backups only contained the files that had changed since they were previously backed up, we would need to restore all files from every tape used since the initial full backup, starting with the oldest, to be sure that we had recovered everything correctly.

Thus, the trade-off is between having an incremental backup that gets progressively larger each night, but makes it simple to restore from, or an incremental backup that remains small, but will take more effort and a longer time to restore from. A similar problem exists even if you only want to restore a single file. With the former type of incremental backup, we know that the latest version of the file is on either the full backup tape or the latest incremental tape. However, with the latter method, we do not know which tape holds the latest version of the file without trying to restore from each tape in turn (again, starting with the oldest). We could make it easier, however, if we made a list of what files had been backed up each night. We could then search through the lists to see which tape held the latest version of the file in question.
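Searching such lists is easy to script. The sketch below assumes, purely for illustration, that each night's list is saved as a text file with one pathname per line, in a single directory, under a name that sorts by date (for example 2001-11-29.list). The function name latest_list_for and this naming scheme are hypothetical.

```shell
# Sketch only: latest_list_for and the *.list naming scheme are
# hypothetical, chosen for illustration.

# usage: latest_list_for <list-directory> <path-as-recorded-in-the-lists>
# Prints the newest list that contains the path; fails if none does.
latest_list_for() {
    ll_dir=$1
    ll_path=$2
    ll_found=
    # The glob expands in sorted order, so with date-based names the
    # last match is the most recent backup containing the file.
    for ll_list in "$ll_dir"/*.list; do
        [ -f "$ll_list" ] || continue
        # -x matches whole lines only, -F treats the path as a fixed string.
        if grep -qxF -e "$ll_path" "$ll_list"; then
            ll_found=$ll_list
        fi
    done
    [ -n "$ll_found" ] && echo "$ll_found"
}
```

The name of the list that is printed tells us which night's tape to load. Matching whole lines matters here: without it, a search for ./adm/utmp would also match ./adm/utmpx.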

Which form of incremental backup we choose to implement depends on the local requirements. It may well be preferable to keep the backup as small as possible and not worry too much about the extra time taken to restore, since we hope that this is not something we will need to do often.

If we wanted to perform incremental backups using cpio, we would have to manage which files were to be backed up ourselves, since cpio does not know anything about incremental backups. It will simply back up the files that are supplied to it on its standard input. This isn't such a problem though, since we can make use of the "-newer" option of the find command. Assuming that the file named /backup_marker was created at the time of the last full backup, the following command would back up all files that had changed since then:

 hydrogen# find . -newer /backup_marker -print | cpio -ovc > /dev/rmt/0
 <lines removed for clarity>
 hydrogen#
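The marker file itself needs a little care. The following sketch wraps the find step and shows one possible way of refreshing a marker. The function names and the ".pending" suffix are hypothetical. The idea is to stamp the new marker before the backup starts and to put it in place only once the backup has succeeded, so that a file modified while the backup is running is still picked up next time.

```shell
# Sketch only: the function names and the ".pending" suffix are
# hypothetical, chosen for illustration.

# usage: changed_since <marker-file>
# Lists everything under the current directory newer than the marker.
changed_since() {
    cs_marker=$1
    if [ ! -f "$cs_marker" ]; then
        echo "marker $cs_marker not found; take a full backup first" >&2
        return 1
    fi
    find . -newer "$cs_marker" -print
}

# usage: new_marker <marker-file>      (call before the backup starts)
new_marker() {
    touch "$1.pending"
}

# usage: commit_marker <marker-file>   (call only after the backup succeeded)
commit_marker() {
    # mv keeps the timestamp that new_marker recorded.
    mv "$1.pending" "$1"
}
```

The output of changed_since can be piped into cpio exactly as in the command above. Passing it through tee on the way also saves a list of what went onto the tape.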

The same approach works with the tar command, with one adjustment. If tar is given a directory name it archives everything beneath that directory, changed or not, so we exclude directories from the list that find produces:

 hydrogen# tar cvf /dev/rmt/0 $(find . ! -type d -newer /backup_marker -print)
 <lines removed for clarity>
 hydrogen#

As mentioned earlier, ufsdump is aware of the concept of an incremental backup, so we don't need to worry about how we will work out which files to back up. Whether an incremental backup is being performed depends upon the dump level that is selected when the ufsdump command is issued. If the dump level is 0, ufsdump will always do a full backup. If the dump level is in the range 1 to 9, ufsdump will back up all files that have changed since the last backup at a lower dump level was taken.

For example, we might perform a level-0 backup on the first night followed by a level-2 backup the next night. If we perform a level-4 backup the following night, it will only back up files that have changed since the level-2 backup; if we perform a level-1 backup instead, we will get all the files that have changed since the level-0 backup. The concept of dump levels gives us more flexibility in balancing the amount of data backed up against the ease of restoration. As always with incremental backups, the procedure for recovering files will be much simpler if we have a list of the files included on each tape.
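The rule can be checked on paper with a short script. This sketch does not call ufsdump at all. It simply takes a sequence of dump levels for one filesystem, in the order they were taken, and reports which earlier dump each one is relative to. The function name dump_baselines is hypothetical. On a real system ufsdump makes this decision from the records in /etc/dumpdates, which are only updated when the u option is given.

```shell
# Sketch only: dump_baselines is a hypothetical helper that models the
# dump-level rule; it does not read /etc/dumpdates or run ufsdump.

# usage: dump_baselines <level> <level> ...   (levels in the order taken)
dump_baselines() {
    db_n=0
    db_history=
    for db_level in "$@"; do
        db_n=$((db_n + 1))
        db_base=
        db_i=0
        # Scan the earlier dumps in order; the last one found at a lower
        # level is the most recent lower-level dump.
        for db_prev in $db_history; do
            db_i=$((db_i + 1))
            if [ "$db_prev" -lt "$db_level" ]; then
                db_base="dump $db_i (level $db_prev)"
            fi
        done
        if [ "$db_level" -eq 0 ]; then
            echo "dump $db_n (level 0): full backup"
        elif [ -n "$db_base" ]; then
            echo "dump $db_n (level $db_level): changes since $db_base"
        else
            echo "dump $db_n (level $db_level): no earlier lower-level dump in this list"
        fi
        db_history="$db_history $db_level"
    done
}
```

Running "dump_baselines 0 2 4 1" reports that the level-4 dump is relative to the level-2 dump, while the level-1 dump goes all the way back to the level-0 dump.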

