3.1. An Overview

This chapter describes the benefits and pitfalls of several utilities. For all versions of Windows since NT, ntbackup is the only native choice for a traditional backup application, although you should also be familiar with System Restore. Mac OS X users running version 10.4 or later have a number of Unix-based backup tools available to them, including cpio, tar, rsync, and ditto. For commercial Unix systems, dump and restore are quite popular, but they're not considered a viable option on Linux. dump is available on Mac OS, but it doesn't support HFS+. After dump and restore, the native backup utility with the most features is cpio, but it is less user friendly than its cousin tar. tar is incredibly easy to use and is more portable than either dump or cpio. The GNU versions of tar and cpio have much more functionality than either of the native versions. If you have to back up raw devices or perform remote backups with tar or cpio, dd will be your new best friend. Finally, rsync can be used to copy data between filesystems on Windows, Mac OS, Linux, and Unix.

This chapter begins with an overview of each of these backup utilities. It then goes into detail about the syntax for each command for both backup and recovery. Finally, near the end of the chapter, you'll find an invaluable comparison chart that can be used as a quick-reference guide for comparing tar, cpio, and dump.

How Not to Use dump

I went on one gig to fix a client's "email" problem. It turned out not to be an email problem; it was a DNS problem. They also asked me to look at their backups. What I found was appalling. They were doing backups by issuing commands to run dump out of cron:

  • They didn't write a script; they just issued successive dump commands at different time intervals. Subsequent dumps were being executed before the previous one was finished.

  • They used the rewind device driver.

They were amazed they could fit everything on one tape! (With the rewind device, each dump starts at the beginning of the tape and overwrites the one before it.)


3.1.1. How Mac OS Filesystems Are Different

Leon Towns-von Stauber (the author of Chapter 14) contributed this information about Mac OS backups.


What can make Mac OS X backups tricky is the default native filesystem format, HFS+, which is the advanced version of the legacy Macintosh Hierarchical File System. There are significant differences between HFS+ and the Unix File System (UFS), including support for forks (multiple sets of data associated with a single file) and specialized file attributes (such as type, creator, and creation date). While Mac OS X can work with UFS filesystems, the UFS format is not nearly as commonly used as HFS+, nor as well supported by Apple and third-party software vendors.

A utility not designed to handle the unique features of HFS+ can cause backups to go haywire, losing essential forks and attributes, making full restoration impossible. The biggest problem is the resource fork, a set of auxiliary data associated with many kinds of Macintosh files. Despite being frowned upon by Apple since the release of Mac OS X, many applications still use resource forks to store information such as thumbnail icons for image files, and even Apple still uses them to store the contents of aliases, which are the GUI equivalents to symbolic links.

Before Tiger (Mac OS X 10.4), even the Unix-standard native utilities ignored forks and Macintosh attributes. If you're using Mac OS X 10.3 or earlier without third-party tools, your best options are CpMac (an HFS+-aware cp equivalent included with the Developer Tools), ditto (a recursive copying utility that supports resource forks and HFS+ attributes through use of the -rsrc flag), or asr (Apple Software Restore, a volume cloning utility).

Due to the difficulty of making backups of Mac OS X systems before Tiger, a number of Mac OS X-specific variants of standard backup utilities sprang up on the Internet, including hfstar, xtar, hfspax, rsync_hfs, and psync, along with graphical frontends such as RsyncX, PsyncX, and Carbon Copy Cloner. Cross-platform applications such as Amanda and BackupPC also used these tools to support HFS+ backups.

3.1.2. cpio

cpio can be a very powerful backup tool. Its most important feature is its ability to accept the list of files to be backed up from standard input. It's the only native utility that can do this. This feature can be combined with the use of touch files and the find command to create incremental backups.

Unlike dump, however, cpio cannot:

  • Perform incremental backups without the use of touch files and find

  • Leave both atime and ctime unchanged after a backup (see the section "Don't Forget Unix mtime, atime, and ctime" in Chapter 2)

  • Perform an interactive restore, like the -i option in restore
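The touch-file technique mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a production script: the scratch directory, the touch-file name .last_backup, and the archive names are all made up for the example, and the cpio steps are skipped if cpio is not installed.

```shell
#!/bin/sh
# Sketch: incremental backups with find, a touch file, and cpio.
# Every path and filename here is hypothetical.
cpio_work=$(mktemp -d)
mkdir "$cpio_work/data"
echo "one" > "$cpio_work/data/a.txt"
echo "two" > "$cpio_work/data/b.txt"

have_cpio=false
command -v cpio >/dev/null 2>&1 && have_cpio=true

# Full backup: create the touch file, then pipe find's file list into cpio.
touch "$cpio_work/.last_backup"
if $have_cpio; then
    (cd "$cpio_work/data" && find . -type f -print | cpio -o > "$cpio_work/full.cpio" 2>/dev/null)
fi

# Some time later, one file changes.
sleep 1
echo "changed" > "$cpio_work/data/b.txt"

# Incremental backup: list only files modified after the touch file...
(cd "$cpio_work/data" && find . -type f -newer "$cpio_work/.last_backup" -print) > "$cpio_work/incr.list"
cat "$cpio_work/incr.list"

# ...and hand that list to cpio on standard input.
if $have_cpio; then
    (cd "$cpio_work/data" && cpio -o < "$cpio_work/incr.list" > "$cpio_work/incr.cpio" 2>/dev/null)
fi
```

In a real script, you would typically create a new touch file just before each backup starts and keep it for the next run, so that files changed while the backup is running are picked up next time.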

3.1.2.1. Why isn't cpio more popular?

If cpio is so powerful, why is tar more popular? One reason is that the basic operations of tar are much simpler (and more standard) than the same operations in cpio. For example, every version of tar supports tar cf device and tar xf device, whereas cpio sometimes supports the -I and -O options and sometimes does not. If you add up all the cpio options available on all the various versions, you would find more than 40 of them. There are also some arguments that use the same letter but have completely different functions on different versions of Unix. Another reason why tar is more popular is the development of GNU tar. It combines the power of cpio with tar's ease of use.
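To make the comparison concrete, here is a small sketch of the tar cf and tar xf operations just mentioned. The paths are hypothetical, and a regular file stands in for the tape device so the example can run anywhere.

```shell
#!/bin/sh
# Sketch: the basic tar operations. A file takes the place of the
# backup device; all names are made up for the example.
tar_work=$(mktemp -d)
mkdir "$tar_work/src" "$tar_work/restore"
echo "hello" > "$tar_work/src/notes.txt"

# tar cf device files: create (c) an archive on the named device or file (f).
(cd "$tar_work" && tar cf "$tar_work/backup.tar" src)

# tar tf device: list the table of contents.
tar tf "$tar_work/backup.tar"

# tar xf device: extract everything into the current directory.
(cd "$tar_work/restore" && tar xf "$tar_work/backup.tar")
cat "$tar_work/restore/src/notes.txt"
```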

3.1.3. ditto

ditto is found only on Mac OS systems and is normally used to clone one disk to another; it is used in that fashion in Chapter 14. ditto can also be used to create a ZIP or cpio file. Because we use the tool in this book, and it's commonly used in Mac OS environments, it's covered in this chapter.

3.1.4. dd

The dd command is not a backup command used by most people. It is a very low-level command designed for copying bits of information from one place to another. It does not have any knowledge of the structure of the data it is copying; it doesn't need to. Therefore, unlike dump, tar, and cpio, it is not used to copy a group of files to a backup volume. It can copy a single file, a part of a file, a raw partition, or a part of a raw partition, and can even copy data from stdin to stdout while modifying it en route. Again, although it can copy a file, it has no knowledge of the filename or contents once it has done so. It simply copies the bytes that are in the place from which you told it to copy. It then puts those bytes where you told it to put them.
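As a rough illustration of copying a whole file and then part of a file, consider the sketch below. The filenames are hypothetical. The same if=, of=, skip=, and count= operands apply when the input is a raw partition's device file instead of a regular file.

```shell
#!/bin/sh
# Sketch: dd copies bytes with no knowledge of what they mean.
# Filenames are made up for the example.
dd_work=$(mktemp -d)
printf 'ABCDEFGHIJ' > "$dd_work/source.bin"

# Copy the whole file, byte for byte.
dd if="$dd_work/source.bin" of="$dd_work/copy.bin" 2>/dev/null

# Copy part of the file: with a one-byte block size, skip the first
# three blocks of input and then copy the next four.
dd if="$dd_work/source.bin" of="$dd_work/part.bin" bs=1 skip=3 count=4 2>/dev/null

cat "$dd_work/part.bin"   # prints DEFG
```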

Although dd is rather simplistic, it is extremely flexible. It can copy files or partitions regardless of format. It can translate data between two different platforms, such as EBCDIC to ASCII, or big endian to little endian. (The concept of big endian/little endian is explained in detail in the section "The Little Endian That Couldn't" in Chapter 23.) A perfect example of dd's flexibility is the Oracle backup script included in Chapter 16. Oracle data is allowed to be in files in the filesystem or on raw disk partitions. Since the script could not predict which configuration each DBA would use, it used dd, because it could copy both files and raw partitions. That way the DBA can use whichever configuration makes most sense for his application, and the script will automatically back up either configuration. It even backs up a mixed configuration, in which some of the data sits on files and some sits on raw partitions. This is the kind of flexibility dd gives you.
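The conversions described above can be tried on a small scale by running dd as a filter from stdin to stdout. This is only a sketch: conv=swab swaps each pair of adjacent bytes, so it illustrates byte-order conversion for two-byte values only, and the second command assumes a dd that supports the conv=ebcdic and conv=ascii conversions.

```shell
#!/bin/sh
# Sketch: dd modifying data en route from stdin to stdout.

# Swap every pair of adjacent bytes.
swapped=$(printf 'abcd' | dd conv=swab 2>/dev/null)
echo "$swapped"   # prints badc

# Convert ASCII to EBCDIC and back again; the text should survive
# the round trip unchanged.
roundtrip=$(printf 'HELLO' | dd conv=ebcdic 2>/dev/null | dd conv=ascii 2>/dev/null)
echo "$roundtrip"
```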

3.1.5. dump and restore

dump and restore are considered by many to be the most powerful tools in the Unix backup toolbox. dump and restore's differentiating features include being able to back up files without changing their access time and being able to use a mini shell to interactively select the files you want to restore before you begin. dump and restore are relatively sophisticated commands, with simple interfaces whose essential options are the same on most Unix systems. There is a lot of controversy surrounding dump and whether or not it can properly back up an active filesystem. Read more about that in the dump section later in this chapter.

3.1.6. ntbackup

This is the only native tool in Windows that you can use to create a traditional backup, although some people do download and use GNU tar or rsync on their Windows systems. Like the Unix utilities covered in this chapter, it can back up to disk or tape, and you can specify a number of options. You can even save these options in a configuration file and then tell Windows to use that configuration file when ntbackup runs. The configuration file allows you to run automated backups with this tool.

3.1.7. rsync

Think of rsync as an open-source, fancier version of the Unix rcp command that can synchronize two folders even if they're on separate systems. Its basic syntax is essentially the same as rcp, so those familiar with that command should find rsync very easy to understand. Two of the open-source backup products covered in this book use rsync with other tools to provide backup and recovery functionality, so we'll cover its basic functionality in this chapter.
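A minimal sketch of that rcp-style syntax follows. The directories are hypothetical, the example stays on one machine so it can run anywhere, and the rsync steps are skipped if rsync is not installed. The hostname in the final comment is likewise made up.

```shell
#!/bin/sh
# Sketch: rsync [options] source destination, as with rcp.
# All paths are made up for the example.
rs_work=$(mktemp -d)
mkdir -p "$rs_work/src/sub" "$rs_work/dest"
echo "report" > "$rs_work/src/sub/report.txt"

if command -v rsync >/dev/null 2>&1; then
    # -a (archive) recurses and preserves permissions, times, and links.
    # The trailing slash on the source means "the contents of src".
    rsync -a "$rs_work/src/" "$rs_work/dest/"

    # Running it again transfers nothing new, since the trees now match.
    rsync -a "$rs_work/src/" "$rs_work/dest/"

    # The remote form looks just like rcp (hypothetical host):
    #   rsync -a "$rs_work/src/" backuphost:/backups/src/
fi
```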

3.1.8. System Restore

System Restore isn't quite like the other tools in this chapter, but it's important to mention it. Since Windows XP, you can use System Restore to create a snapshot of your system. It backs up a few critical files and your registry, allowing you to roll back your system state to a previous point in time.

3.1.9. tar

The greatest feature of tar is its wide acceptance, which is due in large part to its ease of use. Nearly everyone knows how to read a tar volume. If they don't, it's really easy to show them how. If it is a tar file on disk or even a compressed tar file, programs such as WinZip[*] can automatically decompress it and read what's inside. (WinZip cannot open a cpio archive.) It is also much more portable between Unix platforms than dump or cpio.[†]

[*] WinZip is a registered trademark of Nico Mak Computing, Inc. You can download a demo version from http://www.winzip.com.

[†] The DJGPP project, a port of gcc and the GNU tools and utilities suites to MS-DOS and Windows, made cpio its portable archive standard and has ported both GNU cpio and GNU tar to DOS and Windows as 32-bit executables.

If you need to make a quick backup of a directory or a set of files, it's hard to beat tar's ease of use. However, if you need to make regular backups, you'll be looking for features that the native version of tar does not have. Among other things, you'll want to make incremental backups, leave atime alone, and make sure that you're restoring the proper permissions and ownership of files. To do these sorts of things, you can use GNU tar, or you can look at cpio.
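As a rough sketch of what GNU tar adds, the example below makes a full backup and then an incremental one using its --listed-incremental option. It assumes the tar on your system is GNU tar (the block does nothing otherwise), and the directory, snapshot file, and archive names are hypothetical.

```shell
#!/bin/sh
# Sketch: incremental backups with GNU tar's --listed-incremental.
# All paths are made up for the example.
gt_work=$(mktemp -d)
mkdir "$gt_work/data"
echo "one" > "$gt_work/data/a.txt"
echo "two" > "$gt_work/data/b.txt"

if tar --version 2>/dev/null | grep -q 'GNU tar'; then
    sleep 1
    # Full backup: the snapshot file does not exist yet, so tar
    # archives everything and records what it saw.
    tar -C "$gt_work" --listed-incremental="$gt_work/data.snar" \
        -cf "$gt_work/full.tar" data

    # Later, one file changes.
    sleep 1
    echo "changed" > "$gt_work/data/b.txt"

    # Incremental backup: only what changed since the snapshot was written.
    tar -C "$gt_work" --listed-incremental="$gt_work/data.snar" \
        -cf "$gt_work/incr.tar" data

    # The listing should include b.txt but not a.txt.
    tar -tf "$gt_work/incr.tar"
fi
```

GNU tar also has a -p option for restoring permissions on extraction and an --atime-preserve option; check the documentation for your version before relying on either.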

The explanations of the basic backup utilities that follow are not meant to replace the official documentation for those commands. You should definitely become familiar with the documentation for each command. It may contain anything from minor to major caveats for that particular OS. In some cases, vendors document an extra feature or two. Always stay up to date with the documentation for your backup command, whatever it is.


3.1.10. Other Utilities

This section contains a list of commands that we don't cover in this book for various reasons.

3.1.10.1. asr

asr, for Apple Software Restore, is an imaging utility found only on Mac OS systems. It is used primarily as a bulk-cloning tool, similar to the way Windows customers use the Ghost utility. It is an image-based utility and can be used to copy directly from one hard drive to another or to create a disk image of a hard drive, similar to an ISO file in other operating systems. Such a file carries a .dmg extension.

3.1.10.2. pax

The portable archive exchange, or pax, utility produces a portable archive that conforms to the Archive/Interchange File Format specified in IEEE Std. 1003.1-1988. pax also can read and write a number of other file formats such as tar or cpio and is used by the Mac OS install utility. Like many things in the Unix world, pax has a group of devoted followers that swear it's the best way to go. However, it will not be covered here because most people don't use it.

3.1.10.3. psync, rsyncx, hfstar, xtar, and hfspax

Since Mac OS X was built on top of a Mach Unix kernel, it shipped with a number of Unix-style tools such as tar, cpio, pax, cp, and rsync. Unfortunately, the early Mac OS versions of these tools did not support the concept of a multifork filesystem such as HFS+, and GNU tar didn't support it either.

psync, rsyncx, hfstar, xtar, and hfspax are all tools contributed by the Mac OS community that were designed to overcome the limitations of Mac OS's native tools. psync and rsyncx were written to behave like rsync, but to properly handle resource forks. hfstar and xtar behaved like tar but handled resource forks. Finally, hfspax did the same thing for pax.

As of Mac OS 10.4.x, tar, pax, cp, and rsync all properly handle resource forks using the AppleDouble format. (According to Apple, these commands now use the same API as Spotlight, the Mac OS search tool.) When a file is copied into a format that doesn't support multiple forks, such as tar, cpio, or even a UFS filesystem on a Mac OS system, the tools mentioned here convert the file into two files. The first file contains the data fork, or actual data for the file. The second file is the header file; it stores the resource fork and Finder information. The datafile is stored using the original filename for the file. The header file is stored under the original filename preceded by the string "._":

mydocument.txt ._mydocument.txt

When the multifork file is copied or restored from the nonmultifork format (tar, cpio, UFS) into a multifork format (HFS+), the two files are converted back to a single file with a data fork and a resource fork.




Backup & Recovery: Inexpensive Backup Solutions for Open Systems
ISBN: 0596102461
Year: 2006
Pages: 237
