Compressing and Packaging Files


Compression replaces a file with an encoded version containing fewer bytes. The compressed version of the file saves all the information that was in the original file. The original file can be recovered by undoing the compression procedure.

Compressed files require less storage space but are also less convenient to work with than uncompressed files. Most commands won’t work on compressed files; for example, you can’t edit a text file while it’s compressed. Because of this, compressed files are ideal for backups, which won’t need to be accessed very often. Compression is also used to reduce the size of files being sent over a network or distributed on a web site.

Most UNIX variants provide utilities for compressing files. SVR4-based systems include the pack and compress commands. Other systems, including Linux, provide the gzip command, which is probably the most popular compression utility for UNIX today. It is available for most platforms (including Windows) at http://www.gzip.org/. The command bzip2, a somewhat newer utility that’s very similar to gzip, can be downloaded for various platforms from http://www.bzip.org/.

The compress command is more efficient than pack, meaning that it will almost always create smaller compressed files. Similarly, gzip is more efficient than compress, and bzip2 is generally more efficient than gzip.
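The size differences described here are easy to check on your own files. The following is a minimal sketch, assuming gzip is installed and treating bzip2 as optional; the scratch directory and the file name sample.txt are made up for illustration, and the sizes you see will depend on the data.

```shell
# Build a repetitive sample file in a scratch directory (names are illustrative).
workdir=$(mktemp -d)
sample="$workdir/sample.txt"
i=0
while [ "$i" -lt 2000 ]; do
    echo "line $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > "$sample"

# -c writes the compressed data to standard output, leaving the sample intact.
orig_size=$(wc -c < "$sample" | tr -d ' ')
gzip -c "$sample" > "$sample.gz"
gz_size=$(wc -c < "$sample.gz" | tr -d ' ')
echo "original: $orig_size bytes"
echo "gzip:     $gz_size bytes"

# bzip2 is optional; only measure it if the command is installed.
if command -v bzip2 >/dev/null 2>&1; then
    bzip2 -c "$sample" > "$sample.bz2"
    bz_size=$(wc -c < "$sample.bz2" | tr -d ' ')
    echo "bzip2:    $bz_size bytes"
fi
```

Highly repetitive text like this sample compresses unusually well; files that are already compressed, such as most image formats, may barely shrink at all.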

All UNIX variants include the tar command, which was originally designed for creating tape archives for backups but is now commonly used to “bundle” files, often before compressing them.

pack

The pack command replaces a file with a compressed version. The original file is destroyed, so be sure to make a copy beforehand if you need to save the file. The compressed file has .z appended to the filename, to indicate how it was compressed. To uncompress the file, use the unpack command, with the original filename as the argument.

 $ pack research-data
 pack: research-data: 45.4% Compression
 $ ls research*
 research-data.z
 $ unpack research-data
 unpack: research-data: unpacked
 $ ls research*
 research-data

The second line of this example shows that the file research-data.z is 45.4 percent smaller than research-data. Note that the compressed file is deleted when it is uncompressed. If you want to keep the compressed file, you will need to create a copy.

compress

The compress command works in pretty much the same way as pack. It adds .Z (uppercase) at the end of the compressed filename, instead of the .z (lowercase) that pack uses. The uncompress command will recover the original file. As with pack, compressing or uncompressing a file will delete it, so be sure to make a copy if you need to save the original version.

 $ compress research-data
 $ ls research*
 research-data.Z
 $ uncompress research-data

Note that, unlike pack, compress does not report anything after compressing or uncompressing a file. The -v (verbose) option will cause it to display feedback.

gzip

The gzip command will also replace a file with a compressed version. A file compressed with gzip has the extension .gz. To uncompress the file, use either gzip -d (for decompress), or the command gunzip. As with compress, the -v option will cause gzip and gunzip to display a confirmation after compressing or uncompressing a file.

 $ gzip -v research-data
 research-data:        81.3% -- replaced with research-data.gz
 $ gunzip -v *.gz
 download.gz:          33.6% -- replaced with download
 research-data.gz:     81.3% -- replaced with research-data

gunzip can also be used to decompress .z and .Z files. Some systems (such as Linux) include the command bzip2 (and the related command bunzip2 for decompressing files), which is an alternative to gzip that works in the same way.
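Because gzip replaces the original file, it can help to write the compressed data somewhere else instead. The sketch below uses gzip's -c option, which sends output to standard output, to keep the original, and then confirms that decompression restores identical contents. The scratch directory and the file name restored are assumptions made for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'alpha\nbeta\ngamma\n' > research-data

# -c sends the compressed bytes to standard output, so research-data is kept.
gzip -c research-data > research-data.gz

# -d decompresses; combined with -c the result goes to standard output.
gzip -dc research-data.gz > restored

# cmp prints nothing and returns 0 when the two files are identical.
cmp research-data restored && echo "round trip OK"
ls
```

This avoids the separate copy step that would otherwise be needed to preserve the original.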

Working with Compressed Files

The gzip package comes with a set of tools for working with compressed files. These tools include zcat, zmore, zless, zgrep, and zdiff, which do for compressed files what their counterparts do with ordinary text files.

The zcat command reads files that have been compressed by compress or gzip and prints the uncompressed content to standard output.

The zmore and zless commands work like the more and less commands, printing compressed files in their uncompressed form, one screen at a time.
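As a small illustration of reading a compressed file without unpacking it, the sketch below compresses a sample file and then uses zcat to print it and to feed another command. It assumes the zcat that ships with the gzip package (on some systems zcat belongs to compress and expects a .Z file); the file name notes.txt is made up.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'first line\nsecond line\nthird line\n' > notes.txt
gzip notes.txt            # replaces notes.txt with notes.txt.gz

# Print the uncompressed text; notes.txt.gz itself is not modified.
zcat notes.txt.gz

# Because zcat writes to standard output, it can feed other commands.
line_count=$(zcat notes.txt.gz | wc -l | tr -d ' ')
echo "lines: $line_count"
```

If zcat on your system does not accept .gz files, gzip -dc produces the same output.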

The zgrep command searches a compressed file for lines that match a grep search target, and prints them in uncompressed form. The following finds lines that contain “toss” in the compressed file fulltext.gz.

 $ zgrep toss fulltext.gz
 Your mind is tossing on the ocean;

The zdiff command is based on the diff command, which is described later in this chapter. zdiff reads the files specified as its arguments and prints the result of doing a diff on the uncompressed contents. It can be used to compare two compressed files, or to compare a compressed file to an uncompressed file.
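The following sketch shows zdiff comparing a compressed backup with an uncompressed working copy, before and after the working copy is edited. It assumes the zdiff from the gzip package, which is expected to return the same exit status as diff (0 when the contents match); the file names draft and draft-backup are illustrative.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'one\ntwo\nthree\n' > draft
cp draft draft-backup
gzip draft-backup          # replaces draft-backup with draft-backup.gz

# Compare the compressed backup with the uncompressed working copy.
# zdiff is expected to return diff's exit status: 0 when the contents match.
if zdiff draft-backup.gz draft; then before=same; else before=different; fi

# Change the working copy, then compare again; the differing line is printed.
echo 'four' >> draft
if zdiff draft-backup.gz draft; then after=same; else after=different; fi

echo "before edit: $before, after edit: $after"
```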

tar

As noted previously, two of the most common uses of compression are creating backup files and sending files over a network. In both of these cases, you may have many files that you want to keep together. For example, you may be backing up an entire directory, or e-mailing all of the files for a project. The tar command can be used to “package” a group of files into a single file. It is commonly used on files before compressing them.

The syntax for the tar command is complicated. This section will cover only the basic commands for combining or separating a group of files. More details can be found in the UNIX man page for tar.

To combine files with tar, use the command

 $ tar -cvf mail.tar save sent

This will create a file called mail.tar that contains the files save and sent. (The c option stands for create, v prints the name of each file as it is processed, and f indicates that the next argument is the name of the tar file.) You can list as many files to include as you like, including directories. To package all the files in the directory ~/Project into a tar file, use

 $ tar -cvf projectfiles.tar ~/Project

Note that, unlike the compression tools, tar leaves the original files unchanged. Also, it does not automatically add the .tar extension to the combined file. Unlike most UNIX commands, tar does not require the - in front of options, so tar -cvf could also be written as tar cvf.

To separate a .tar file, use the command

 $ tar -xvf projectfiles.tar

This will extract all of the files from projectfiles.tar. (The x option stands for extract.)
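Before extracting, it is often useful to see what a tar file contains and to unpack it somewhere other than the current directory. The sketch below uses the t option to list the contents and -C to extract into a separate directory; -C is assumed to be supported by your version of tar (GNU and BSD tar both accept it). The directory names Project and restore are illustrative.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir Project restore
echo "draft text" > Project/notes
echo "1,2,3" > Project/data.csv

tar -cf projectfiles.tar Project     # c: create, f: archive filename

# t lists the archive's contents without extracting anything.
tar -tf projectfiles.tar

# x extracts; -C switches to the restore directory first, so the
# original files in Project are not overwritten.
tar -xf projectfiles.tar -C restore
ls restore/Project
```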

Some versions of tar (including the versions found on most Linux systems) have an option to create a .tar file and compress it with gzip in one step. This can be convenient, since tar is commonly used to package files before compressing them. The following command will tar and compress all files starting with cs in the current directory:

 $ tar -cvzf csfiles.tar.gz cs*

These versions of tar can also extract .tar.gz files in a single step. To do this, use the command

 $ tar -xvzf csfiles.tar.gz
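On a version of tar without the z option, much the same result can be had by connecting tar and gzip with a pipe. This is a sketch rather than the only way to do it; the file names cs101 and cs102 and the directory unpacked are made up for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "intro" > cs101
echo "data structures" > cs102
mkdir unpacked

# "f -" makes tar write the archive to standard output, which gzip compresses.
# The files are named explicitly: in a pipeline, a cs* pattern could match
# csfiles.tar.gz itself once the shell has created it for the redirection.
tar -cf - cs101 cs102 | gzip > csfiles.tar.gz

# Reverse direction: gzip -dc decompresses to standard output, and tar reads
# the archive from standard input ("f -") inside the unpacked directory.
gzip -dc csfiles.tar.gz | (cd unpacked && tar -xf -)
ls unpacked
```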




UNIX: The Complete Reference, Second Edition (Complete Reference Series)
ISBN: 0072263369
Year: 2006
Pages: 316
