Project 28. Archive Files

"How do I squash a directory of files into a single compressed file?"

This project covers the tar command and shows you how to use it to combine a collection of files into a single archive file, how to retrieve those files from an archive, and how to use tar as a file-backup tool.

Make an Archive

Many files can be combined into an archive file for easy distribution or storage. An archive can contain anything from a few named files to a whole directory hierarchy. We'll take a look at creating archives by using the tar command and see how to compress the archive. Then we'll do the reverse: decompress the archive and extract files from it.

Let's make an archive of the files in the directory week1 by using GNU tar, which is the version of tar supplied with Mac OS X. As arguments, tar requires a function followed by function modifiers. To create a new archive file, specify function c for create and modifier f directly followed by a filename for the archive. You may also include modifier v for verbose, which tells tar to list files and directories as they are added to the archive. (Preceding the function and its modifiers with a dash [ - ] is optional.)

$ tar cvf week1.tar week1
week1/
week1/friday.ws
week1/monday.ws
week1/thursday.ws
week1/tuesday.ws
week1/wednesday.ws


We retrieve files from the archive (extract files) and write them to the current directory by specifying function x. When an archive is extracted, tar automatically creates directories as needed to match each extracted file's pathname. If a file's target directory already exists, the file will be extracted into that directory and will overwrite any existing file that shares its name.

$ tar xvf week1.tar week1
week1/
week1/friday.ws
week1/monday.ws
week1/thursday.ws
week1/tuesday.ws
week1/wednesday.ws
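The create and extract steps can be checked end to end with a short script. This is a minimal sketch, not part of the original project: the scratch directory, the file contents, and the restore directory are all assumptions made for illustration. It extracts into a separate directory so that the originals are not overwritten, then compares the two trees.

```shell
#!/bin/sh
# Round-trip sketch: archive a directory, extract it elsewhere, compare.
set -e

work=$(mktemp -d)    # scratch area (hypothetical, not from the project)
cd "$work"

# Build a small stand-in for the week1 directory.
mkdir week1
for day in monday tuesday wednesday thursday friday; do
    echo "tip for $day" > "week1/$day.ws"
done

# c = create, f = write the archive to the named file.
tar cf week1.tar week1

# Extract inside a separate directory; tar recreates week1/ there.
mkdir restore
cd restore
tar xf ../week1.tar
cd ..

# diff -r exits non-zero if the two trees differ in any way.
diff -r week1 restore/week1
echo "round trip OK"
```

Extracting somewhere other than the original location is a cautious habit, because tar overwrites existing files of the same name without asking.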


Tip

Extract just some of the files in an archive by naming the files to extract. You can use shell-style pattern-matching operators, but quote or escape them so that the shell passes them to tar unexpanded. (Refer to Project 11 to learn about pattern matching.)

$ tar xvf week1.tar 'week1/t*'
week1/thursday.ws
week1/tuesday.ws
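Whether tar expands patterns itself varies between implementations; recent GNU tar releases, for example, expect an extra --wildcards option before they will do so. The sketch below sidesteps the question by listing the archive, filtering the names with grep, and passing the matching names to tar explicitly. The scratch directory and file contents are assumptions for illustration, and the approach assumes member names contain no spaces.

```shell
#!/bin/sh
# Selective extraction sketch: pick members by filtering the listing.
set -e

work=$(mktemp -d)    # scratch area (hypothetical)
cd "$work"

mkdir week1
for day in monday tuesday wednesday thursday friday; do
    echo "tip for $day" > "week1/$day.ws"
done
tar cf week1.tar week1
rm -r week1          # remove the originals so only extracted files remain

# Keep only member names that start with week1/t.
members=$(tar tf week1.tar | grep '^week1/t')

# Left unquoted on purpose so each name becomes a separate argument.
# This relies on the names containing no whitespace.
tar xf week1.tar $members

ls week1
```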



To view archive contents, specify function t (for table of contents).

$ tar tf week1.tar
...


The tar command is inherently recursive. Applying it to a directory archives the directory's contents and those of all its subdirectories.

$ tar cvf Sites.tar ~/Sites
tar: Removing leading `/' from member names
...
Users/saruman/Sites/jan/
Users/saruman/Sites/jan/images/
Users/saruman/Sites/jan/images/background/
Users/saruman/Sites/jan/images/background/.DS_Store
Users/saruman/Sites/jan/images/background/shade-left-b.png
...


The strange comment Removing leading `/' from member names is explained in "Understand tar and Pathnames," later in this project.
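Because tar recurses into every subdirectory, housekeeping files such as .DS_Store are swept into the archive too. The sketch below shows the recursion on a small invented tree and uses the --exclude option, which is not covered in this project, to leave .DS_Store files out. The directory layout and file contents are assumptions that loosely mirror the listing above.

```shell
#!/bin/sh
# Recursion sketch: archive a nested tree, skipping .DS_Store files.
set -e

work=$(mktemp -d)    # scratch area (hypothetical)
cd "$work"

# A small nested tree standing in for ~/Sites.
mkdir -p Sites/jan/images/background
echo "<html></html>" > Sites/jan/index.html
echo "image data"    > Sites/jan/images/background/shade-left-b.png
echo "finder data"   > Sites/jan/images/background/.DS_Store

# Naming only the top directory is enough; tar descends into the rest.
# --exclude must appear before the directory it applies to.
tar cf Sites.tar --exclude '.DS_Store' Sites

# List what was stored.
tar tf Sites.tar
```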

Tape Archive?

The tar command got its name from its original purpose, which was to archive onto magnetic tape. It's kept that name but nowadays is used mostly to create archive files; hence the almost-universal application of modifier f followed by a filename.


Compress and Uncompress

To compress and uncompress tar archives, we could apply gzip and friends to the archive files manually, but built-in tar functions spare us that effort. Various modifiers instruct tar to pass archives to gzip, bzip2, or compress automatically:

  • To gzip/gunzip a file, specify modifier z or --gzip. The standard extension for a tar-gzipped file is .tgz.

  • To bzip2/bunzip2 a file, specify modifier j or --bzip2. The standard extension for a tar-bzipped file is .tbz2 or .tbz.

  • To use the older compress, specify modifier Z or --compress. The standard extension for a tar-compressed file is .taZ.

Learn More

Refer to Project 27 to learn about compressing and uncompressing files.


With one of these modifiers, tar compresses the archive as it is created and uncompresses it before files are listed or extracted.

We archive and compress with gzip, using either

$ tar czf week1.tgz week1
$ tar cf week1.tgz --gzip week1


Let's check that the archive is in fact compressed by using the file command.

$ file week1.tgz
week1.tgz: gzip compressed data, from Unix


To uncompress, use either

$ tar xzf week1.tgz
$ tar xf week1.tgz --gzip
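To see what the z modifier buys you, compare a plain archive with a gzipped one. The following is a rough sketch: the scratch directory and the deliberately repetitive file contents are assumptions chosen so that the size difference is easy to see. It also uses gzip -t, which tests a compressed file's integrity without unpacking it.

```shell
#!/bin/sh
# Compression sketch: compare a plain archive with a gzipped one.
set -e

work=$(mktemp -d)    # scratch area (hypothetical)
cd "$work"

# Repetitive text compresses well, which makes the saving visible.
mkdir week1
for day in monday tuesday wednesday thursday friday; do
    yes "tip for $day" | head -n 2000 > "week1/$day.ws"
done

tar cf  week1.tar week1     # plain archive
tar czf week1.tgz week1     # archive passed through gzip

# Confirm the .tgz really is valid gzip data.
gzip -t week1.tgz

# $(( )) strips any padding that wc adds around the number.
plain_size=$(( $(wc -c < week1.tar) ))
gzip_size=$(( $(wc -c < week1.tgz) ))
echo "plain: $plain_size bytes, gzipped: $gzip_size bytes"

# The z modifier works with t as well as with c and x.
tar tzf week1.tgz
```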


We archive and compress with bzip2 by using either

$ tar cjf week1.tbz2 week1
$ tar cf week1.tbz2 --bzip2 week1


Let's check, again using file.

$ file week1.tbz2
week1.tbz2: bzip2 compressed data, block size = 900k


To uncompress, use either

$ tar xjf week1.tbz2
$ tar xf week1.tbz2 --bzip2


Understand tar and Pathnames

It's important to understand the significance that pathnames have when an archive is extracted. It's also important to understand the different behaviors of tar toward relative and absolute pathnames.

Relative Pathnames

A tar archive includes the relative pathname of each file, from the current directory to the directory being archived. Previously, we archived the directory week1 from the directory that contained it (tips). This time, we'll move up one level, out of tips, and archive by specifying tips/week1. Compare this with the example at the start of the project.

$ cd ..
$ tar cf week1.tar tips/week1
$ tar tf week1.tar
tips/week1/
tips/week1/friday.ws
tips/week1/monday.ws
...


You'll notice that the pathname now includes tips/, and when the archive is extracted, it will be written back to tips/week1/ in the current directory, not directly to week1/. This ensures that when an archive is extracted, it will be written back to the same point in the directory hierarchy from which it was archived.

Note that if you were to extract this archive from within tips, rather than from the directory above it where the archive was created, it would be written to tips/week1 below the current directory; that is, to tips/tips/week1.
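The stored pathname depends entirely on how you name the directory when creating the archive. The sketch below creates two archives of the same files and lists each one. The second uses the -C option, which is not covered in this project: it tells tar to change into the named directory before reading the files that follow, so the tips/ prefix is not stored. The scratch directory, archive names, and file contents are assumptions for illustration.

```shell
#!/bin/sh
# Pathname sketch: the same files stored under two different member names.
set -e

work=$(mktemp -d)    # scratch area (hypothetical)
cd "$work"

mkdir -p tips/week1
echo "tip" > tips/week1/monday.ws

# Archived from the parent directory: member names start with tips/.
tar cf with-prefix.tar tips/week1
tar tf with-prefix.tar

# -C tips: change into tips first, so member names start with week1/.
# The archive itself is still written to the original directory.
tar cf no-prefix.tar -C tips week1
tar tf no-prefix.tar
```

Listing an archive with tf before extracting it is a quick way to see where its files will land.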

Absolute Pathnames

If we specify an absolute pathname to tar, the leading slash character is dropped to make the pathname relative.

$ tar cf week1.tar /Users/saruman/Development/tips/week1
tar: Removing leading `/' from member names
$ tar tf week1.tar
Users/saruman/Development/tips/week1/
Users/saruman/Development/tips/week1/friday.ws
...


To extract the files back to their original locations, change to the root directory first.

$ cd /
$ tar xvf /path/to/week1.tar


If you do not move to the root directory, the entire pathname of Users/saruman/Development/tips/week1/ will be created below the current directory as the archive is extracted. If you really do want absolute pathnames in the archive, specify option -P or --absolute-names when creating the archive and when extracting from the archive.

Why is the leading / stripped? If it were not, the archive would always be written starting from the root directory, creating all other directories needed to match the archive pathname. At best, sending the absolute-pathname archive /Users/saruman to a friend would force him to create a directory called /Users/saruman that he doesn't need. At worst, if your friend lacks the permissions needed to create that directory, he will not be able to extract the archive.
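The effect of stripping the leading slash is easy to observe in a scratch area. In this sketch, which uses an invented directory layout, an absolute pathname is given to tar, the member names are listed, and the archive is then extracted somewhere other than the root directory, where the whole pathname is recreated below the current directory.

```shell
#!/bin/sh
# Absolute-pathname sketch: the leading / is dropped from member names.
set -e

work=$(mktemp -d)    # scratch area (hypothetical); mktemp returns an absolute path
cd "$work"

mkdir -p tips/week1
echo "tip" > tips/week1/monday.ws

# Give tar an absolute pathname. Any notice about the leading / goes to
# stderr, which is discarded here to keep the output short.
tar cf abs.tar "$work/tips/week1" 2>/dev/null

# Member names are listed without a leading slash.
tar tf abs.tar

# Extracting away from / rebuilds the full pathname under restore/.
mkdir restore
cd restore
tar xf ../abs.tar
cd ..
find restore -name monday.ws
```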

Make Incremental Backups

We can use tar to make a backup of a directory and write the archive to CD or DVD. You might place the archive on an external drive or mounted server, and in this case, a neat trick uses the tar function update (u) to update the archive periodically. Updating an archive considers only those files that have changed, adding them to the end of the archive. It's obviously quicker and easier to update an existing archive than to create a new one.

In the following example, we create an archive of the directory week1 and then change a couple of files with the vim text editor.

$ tar cf week1.tar week1
$ vim week1/tuesday.ws
$ vim week1/wednesday.ws


Tip

The tar command has many more options; check its man page. Some of the most useful are

-A to add a new archive to an existing archive

-d to report differences between files in an archive and the original files

-r to append files to an archive

--delete to remove specific files from an archive


Next, we update the archive by using the function u. The modifier v gives reassurance that the changed files are detected and added to the archive.

$ tar uvf week1.tar week1
week1/
week1/tuesday.ws
week1/wednesday.ws


Editing and updating again:

$ vim week1/tuesday.ws
$ tar uvf week1.tar week1
week1/
week1/tuesday.ws


If we examine the archive, all the original files, plus the two sets of updates, will be shown.

$ tar tf week1.tar
week1/
week1/friday.ws
week1/monday.ws
week1/saturday.ws
week1/thursday.ws
week1/tuesday.ws
week1/wednesday.ws
week1/
week1/tuesday.ws
week1/wednesday.ws
week1/
week1/tuesday.ws


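When the archive is extracted, tar writes out its members in the order in which they are stored, so each earlier version of tuesday.ws and wednesday.ws is overwritten by the later one, and you end up with the latest versions.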
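The whole update cycle can be rehearsed in a scratch directory. The sketch below is illustrative only: the file names and contents are invented, and touch -t stands in for editing with vim by setting fixed modification times, so that tar sees an unambiguous "newer" file regardless of how quickly the script runs.

```shell
#!/bin/sh
# Update sketch: create an archive, change a file, update, then extract.
set -e

work=$(mktemp -d)    # scratch area (hypothetical)
cd "$work"

mkdir week1
echo "monday v1"  > week1/monday.ws
echo "tuesday v1" > week1/tuesday.ws
# Fixed timestamps (1 Jan 2020) so the later change is clearly newer.
touch -t 202001010000 week1/monday.ws week1/tuesday.ws

tar cf week1.tar week1

# "Edit" one file and give it a later modification time (2 Jan 2020).
echo "tuesday v2" > week1/tuesday.ws
touch -t 202001020000 week1/tuesday.ws

# u = update: append files that are newer than the copy in the archive.
tar uf week1.tar week1

# Count how many copies of tuesday.ws the archive now holds.
tuesday_copies=$(tar tf week1.tar | grep -c 'tuesday.ws')
echo "copies of tuesday.ws in archive: $tuesday_copies"

# Extract elsewhere; the later copy overwrites the earlier one.
mkdir restore
cd restore
tar xf ../week1.tar
cd ..
cat restore/week1/tuesday.ws
```

Because every update appends rather than replaces, the archive grows with each changed file; creating a fresh archive from time to time keeps it compact.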




Mac OS X UNIX 101 Byte-Sized Projects
ISBN: 0321374118
Year: 2003
Pages: 153
Authors: Adrian Mayo
