Section 3.10. Backing Up and Restoring with the dd Utility | Backup & Recovery: Inexpensive Backup Solutions for Open Systems

3.10. Backing Up and Restoring with the dd Utility

As far as backup utilities go, the dd utility is about as featureless as they come. However, it is uniquely suited for certain applications.

3.10.1. Basic dd Options

The basic syntax of dd is as follows:

# dd if=device of=device bs=blocksize

The preceding options are used almost every time you run dd; they are explained in the following sections.

3.10.1.1. Specifying the input file

The if= argument specifies the input file or the file from which dd is going to copy the data. This is the file or raw partition that you are going to back up (e.g., dd if=/dev/dsk/c0t0d0s0 or dd if=/home/file). If you want dd to look at stdin for its data, you don't need this argument.

3.10.1.2. Specifying the output file

The of= argument specifies the output file or the file to which you are sending the data. This could be a file on disk or an optical platter, another raw partition, or a tape drive[*] (e.g., dd of=/backup/file, dd of=/dev/rmt/0n). If you are sending to stdout, you don't need this argument.

[*] Of course, a tape drive is another raw device as well.
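As a minimal sketch of the if= and of= arguments, the following commands copy a small file three ways: with both arguments, with stdin as the input, and with stdout as the output. The scratch directory and filenames are assumptions made up for this illustration, and regular files stand in for the raw devices discussed above.

```shell
# Scratch directory and sample file (hypothetical names, for illustration only).
workdir=$(mktemp -d)
printf 'hello, dd\n' > "$workdir/source"

# Explicit input and output files.
dd if="$workdir/source" of="$workdir/copy1" 2>/dev/null

# No if= argument: dd reads its data from stdin.
dd of="$workdir/copy2" < "$workdir/source" 2>/dev/null

# No of= argument: dd writes its data to stdout.
dd if="$workdir/source" 2>/dev/null > "$workdir/copy3"
```

All three copies should be byte-for-byte identical to the source file; cmp can confirm this.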

3.10.1.3. Specifying the block size

The bs= argument specifies the block size, or the amount of data that will be transferred in one I/O operation. This value is normally expressed in bytes, but in most versions of dd, it can also be specified in kilobytes by adding a k at the end of the number (e.g., 10k). (A block size is different from a blocking factor, which is what dump and tar use. A blocking factor is multiplied by a fixed value known as the minimum block size. A blocking factor of 20 with a minimum block size of 512 gives you an actual block size of 10,240, or 10 K.) Note that if you omit bs=, as often happens when reading from or writing to a pipe, dd falls back to its default block size, which is 512 bytes in POSIX-conforming versions.
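The arithmetic behind a blocking factor can be checked directly in the shell. This sketch computes the block size from the example above and then asks dd to write a single 10k block so the two numbers can be compared. The variable names and the scratch file are assumptions for illustration, and /dev/zero is assumed to be available as a data source.

```shell
# Blocking factor times minimum block size gives the actual block size.
blocking_factor=20
min_block_size=512
block_size=$((blocking_factor * min_block_size))
echo "block size from blocking factor: $block_size"

# Write exactly one block using the k suffix and measure it.
workdir=$(mktemp -d)
dd if=/dev/zero of="$workdir/one_block" bs=10k count=1 2>/dev/null
written=$(wc -c < "$workdir/one_block" | tr -d ' ')
echo "bytes written with bs=10k count=1: $written"
```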

Changing the block size does not affect how the data is physically written to a disk device, such as a file on disk or optical platter. Using a large block size just makes the data transfer more efficient. When writing to a tape device, however, each block becomes a record, and each record is separated by an interrecord gap. Once a tape is written with a certain block size, it must be read with that block size or a multiple of that block size. (For example, if a tape is written with a block size of 1,024, you must use a block size of 1,024 when reading it, or you may use 2,048 or 10,240, which are multiples of 1,024.) Again, this applies only to tape devices, not disk-like devices.

3.10.1.4. Specifying the input and output block sizes separately

When specifying block size with the option bs=, you are specifying both the incoming and outgoing block size. Sometimes you may need different block sizes on each. This is done with the ibs= and obs= options. For example, to read a tape with one block size and create a tape with another, you could issue a command such as this one:

# dd if=/dev/rmt/0 ibs=10k of=/dev/rmt/1 obs=64k
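The effect of separate input and output block sizes can be observed without tape drives by watching the record counts that dd prints on stderr. The sketch below uses regular files as stand-ins for the two tapes; the filenames and sizes are assumptions chosen to keep the numbers small.

```shell
workdir=$(mktemp -d)

# Create a 4,096-byte input file to stand in for the source tape.
dd if=/dev/zero of="$workdir/input" bs=4096 count=1 2>/dev/null

# Read in 512-byte records, write in 2,048-byte records.
# LC_ALL=C keeps the statistics that dd prints in English.
LC_ALL=C dd if="$workdir/input" ibs=512 of="$workdir/output" obs=2048 2> "$workdir/stats"

# The statistics show more (smaller) records in than (larger) records out.
cat "$workdir/stats"
```

The data itself is unchanged; only the size of each read and write differs.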

3.10.1.5. Specifying the number of records to read

The count=n option tells dd how many records (blocks) to read. You can use this to read the first few blocks of a file or tape to see what kind of data it is, for example (see the following section for more information). You can also use it to have dd tell you what block size a tape was written in.
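A short sketch of count= follows, using a regular file in place of a tape or raw device. The filenames are hypothetical; the point is only that dd stops after reading the requested number of blocks.

```shell
workdir=$(mktemp -d)

# An 8 KB sample file to read from.
dd if=/dev/zero of="$workdir/big" bs=1k count=8 2>/dev/null

# Copy only the first two 512-byte blocks.
dd if="$workdir/big" of="$workdir/head" bs=512 count=2 2>/dev/null

head_size=$(wc -c < "$workdir/head" | tr -d ' ')
echo "copied $head_size bytes"
```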

3.10.2. Using dd to Copy a File or Raw Device

You can use dd as a backup command because it can copy the bits in a file or raw device to another location. You can even pipe the bit stream through compress, allowing you to store a compressed copy of the data. (dump, tar, and cpio do not have this capability, although GNU tar does.) The best example of using dd as a backup command is the hot-backup script for Oracle, oraback.sh (see Chapter 16 for more information about oraback.sh). Since Oracle can use both raw partitions and files for its database files, the script cannot predict which command to use. However, dd supports both of them!
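The following is a minimal sketch of piping dd through a compression program and restoring from the result. It is not taken from oraback.sh. It assumes gzip is installed and uses it in place of compress, and the scratch file stands in for a database file or raw partition.

```shell
workdir=$(mktemp -d)

# Sample data file standing in for a database file or raw partition.
dd if=/dev/zero of="$workdir/datafile" bs=64k count=4 2>/dev/null

# Back up: read with dd, compress the bit stream, store the result.
dd if="$workdir/datafile" bs=64k 2>/dev/null | gzip -c > "$workdir/datafile.gz"

# Restore: uncompress and let dd write the bit stream back out.
gzip -dc "$workdir/datafile.gz" | dd of="$workdir/restored" bs=64k 2>/dev/null

# Confirm the restored copy matches the original.
cmp "$workdir/datafile" "$workdir/restored" && echo "restore matches original"
```

The same two pipelines should work whether the if= and of= arguments name a file or a raw device, which is the property the text describes.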

3.10.3. Using dd to Convert Data

The dd command also can be used to convert data from one format to another in one pass.

3.10.3.1. Converting data to go into another command

Again, this is done by using different input and output block sizes (ibs=, obs=). If a command, such as restore, can read only certain block sizes, and you have a volume that was written in another block size, you can use dd to read the volume, and pipe the results of dd into restore.
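The pattern can be sketched as follows. Because restore needs a real dump volume, this illustration substitutes tar as the reading command and a small tar archive on disk as the volume; both substitutions, along with the filenames and block sizes, are assumptions made so that the example can run anywhere.

```shell
workdir=$(mktemp -d)
printf 'sample data\n' > "$workdir/notes.txt"

# Create a small archive to stand in for the backup volume.
tar -cf "$workdir/volume.tar" -C "$workdir" notes.txt

# Read the volume in 10k blocks, hand it to the next command in 512-byte blocks.
listing=$(dd if="$workdir/volume.tar" ibs=10k obs=512 2>/dev/null | tar -tf -)
echo "$listing"
```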

3.10.3.2. Converting data that is in the wrong format

Although you may think of dd as a bit copier, it also can manipulate the format of the data, such as converting between different character sets, upper- and lowercase, and fixed- and variable-length records:

conv=ascii: Converts EBCDIC to ASCII
conv=ebcdic: Converts ASCII to EBCDIC
conv=ibm: Converts ASCII to EBCDIC using the IBM conversion table
conv=lcase: Maps US ASCII alphabetic characters to their lowercase counterparts
conv=ucase: Maps US ASCII alphabetic characters to their uppercase counterparts
conv=swab: Swaps every pair of bytes; can be used to read a volume written in different byte order
conv=noerror: Does not stop processing on an error
conv=sync: Pads every input block to input block size (ibs)
conv=notrunc: Does not truncate the existing file on output
conv=block: Converts the input record to a fixed length specified by cbs
conv=unblock: Converts fixed-length records to variable length
conv=..., ...: Uses multiple conversion methods separated by commas
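A few of these conversions are easy to try on short strings. This sketch feeds small inputs to dd on stdin and captures the converted output; the variable names are made up for illustration.

```shell
# Map lowercase letters to uppercase.
upper=$(printf 'hello\n' | dd conv=ucase 2>/dev/null)
echo "$upper"

# Swap every pair of bytes.
swapped=$(printf 'abcd' | dd conv=swab 2>/dev/null)
echo "$swapped"

# Combine conversions by separating them with commas.
both=$(printf 'abcd' | dd conv=swab,ucase 2>/dev/null)
echo "$both"
```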

3.10.4. Using dd to Determine the Block Size of a Tape

This is kind of a neat trick. If you tell dd to read one block of data and then write it to disk, you can look at the size of that block to see what the block size of the tape is. Since you don't know the block size, start by using the largest block size that your operating system supports for that device, which is usually 128 K or 256 K, although it could be higher:

# dd if=device bs=128k of=/tmp/junk count=1

This tells dd to read data, using a block size of 128 K, until it gets to the first interrecord gap. If the block size is smaller than 128 K, it stops there. If it's bigger than 128 K, dd interprets it as an I/O error and complains. Just increase the block size value and try again. (Try 256 K this time.) This process creates a file called /tmp/junk. The size of that file is the block size of the tape!
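The procedure can be wrapped in a small shell function. first_record_size is a hypothetical helper written for this illustration, not a standard command. Note the limits of the demonstration: on a real tape the result is the size of the first record, but the regular file used here as a stand-in has no interrecord gaps, so dd simply reads up to the requested block size or the end of the file.

```shell
# first_record_size DEVICE [BLOCKSIZE]
# Hypothetical helper: read one block and print how many bytes came back.
first_record_size() {
    device=$1
    max_bs=${2:-128k}
    sample=$(mktemp)
    if ! dd if="$device" of="$sample" bs="$max_bs" count=1 2>/dev/null; then
        # On tape, an error here may mean the record is larger than max_bs.
        rm -f "$sample"
        return 1
    fi
    wc -c < "$sample" | tr -d ' '
    rm -f "$sample"
}

# Stand-in for a tape: a file holding a single 10k "record".
workdir=$(mktemp -d)
dd if=/dev/zero of="$workdir/fake_record" bs=10k count=1 2>/dev/null
first_record_size "$workdir/fake_record"
```

If the function fails against a real tape device, retrying with a larger second argument (for example, 256k) follows the advice given above.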

3.10.5. Using dd to Figure Out the Backup Format

Here's another trick. Use the same command as in the preceding section to create the file /tmp/junk, then issue the command:

# file /tmp/junk

This uses /etc/magic to determine the file type. If it is tar or cpio, it usually comes back and tells you so. If it can't guess the file type, it just says "data," which isn't very helpful.
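Here is a sketch of the whole sequence, using a small tar archive on disk as a stand-in for the tape. The filenames are assumptions, and the file command is invoked only if it is installed; what it reports depends on the local magic database.

```shell
workdir=$(mktemp -d)
printf 'sample data\n' > "$workdir/notes.txt"

# A small archive standing in for the backup volume.
tar -cf "$workdir/volume.tar" -C "$workdir" notes.txt

# Read one block from the volume into a scratch file.
dd if="$workdir/volume.tar" of="$workdir/junk" bs=128k count=1 2>/dev/null

# Ask file to identify the format, if the command is available.
if command -v file >/dev/null 2>&1; then
    file "$workdir/junk"
fi
```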

Another interesting use of dd is to combine it with ssh or rsh. Be sure to read the section "Using ssh or rsh as a Conduit Between Systems" later in this chapter.