The main reason for compressing files is to save space. This is analogous to closet space in your home. Often, you can buy something that lets you organize and condense your belongings so that you have room for more things. Compression rests on the same concept. You take something, squeeze it down, and organize it so that you can store more.

You can also compress files to send them to others. This is common with digital photography and today's email clients. Many people have email and digital cameras, and they want to send photos through email. A message carrying large photos will often be rejected because the person you are sending it to may not have enough space on her system, or in her mailbox on the email server, to accept a file that size. Therefore, you will need to compress it. Makes sense, right?

Well, that's all you need to know about why to compress something. Now you need to know the actual mechanics of it. If you happen to be using a system where disk space is restricted and you need to maximize available space, you can use the Unix commands you will learn here. These commands reduce the amount of space your files occupy and let you store more files in the space you are allowed. There are three major compression formats you will use when working with Unix: compress, gzip, and zip.
We will cover both compressing and decompressing (or uncompressing) for each of these formats, as you will need to know how to decompress something that you compressed. Each format has its own set of programs for compressing and uncompressing. For our first example, we shall look at the standard (though rarely used) compress tool that comes with almost every distribution of Unix.
The compress Command
To use this command, specify the file you want to compress: compress <filename>. The compress command is an older Unix command that uses an older compression algorithm. In fact, this tool is not commonly used anymore, but it does exist on just about every version of Unix. Better compression algorithms have since been developed; that's why it has been moved to the side and replaced by tools such as gzip. Files created with the compress command have the file suffix .Z. The compressed file replaces the original in the same directory, and you can see it by using the ls -l command.
The uncompress command reverses the result of a compress command. To use it, issue the command as uncompress <filename.Z>.
Remember learning about how the cat command can be used to read files? The zcat command is a version of cat that reads compressed files rather than normal text files. Using zcat is similar to using compress and uncompress; issue the command as zcat <filename.Z>. Remember, since the file is already compressed, its suffix is .Z.
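The compress, zcat, and uncompress steps described above can be sketched as a short round trip. This is only an illustration: notes.txt is a hypothetical file created for the demo, and because compress is not installed on every modern system, the script checks for it first and skips the steps if it is missing.

```shell
#!/bin/sh
# Round trip with compress, zcat, and uncompress.
# notes.txt is a hypothetical demo file, not something from the lesson.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Build a repetitive text file so there is something worth compressing.
i=0
while [ "$i" -lt 200 ]; do
    echo "line $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > notes.txt

if command -v compress >/dev/null 2>&1; then
    compress notes.txt             # replaces notes.txt with notes.txt.Z
    ls -l notes.txt.Z              # the .Z file is now in the directory
    zcat notes.txt.Z | head -n 2   # read it without uncompressing on disk
    uncompress notes.txt.Z         # restores notes.txt, removes notes.txt.Z
else
    echo "compress is not installed here; skipping the round trip"
fi
```

Note that compress and uncompress each replace their input file, while zcat leaves the .Z file untouched and only writes to standard output.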
The gzip Command
Using the compress command will get you the results you need, but again, the utility is older and does not work as well as newer ones. Also, the Unix version of compress can vary slightly as you go from distribution to distribution. Any variance is not good, as you may not be able to compress with one utility and decompress with another. To make this point clearer, consider why you would use compress: because it is the only thing you either know or have. It is located on your local Unix system and is there for use. What if you wanted to use something that was a little less likely to be proprietary?
The gzip command (the name stands for GNU zip) is the original file compression program for GNU/Linux and is distributed under the GPL (GNU General Public License) for use on virtually all Unix systems. This means that it is free to use and has become a common tool that almost everyone in Unix and Linux environments will use. Current versions of gzip produce files with a .gz extension. The gzip, gunzip, and gzcat commands work essentially identically to the compress/uncompress/zcat suite we just talked about, but gzip is a better utility and less proprietary than older tools such as compress.
To make your life a bit easier, GNU has included the capability to deal with compressed (.Z) files in its gunzip and gzcat utilities. You might find that gzip and gunzip exist on your system, but that gzcat is missing. Some distributions have renamed gzcat to zcat because it handles compressed files as well. When tar (which stands for Tape Archive and will be discussed later) is combined with gzip, the resulting file extension is typically .tgz or .tar.gz; a tar archive compressed with compress ends in .tar.Z.
zip/unzip
As we wind down our coverage of the compression utilities that can be used with Unix, let's cover the last of the commonly seen utilities for compression and decompression. Most PC users, whether familiar with Unix or not, know about Zip files.
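Before turning to zip itself, here is a minimal sketch of the gzip workflow described above. The file name report.txt is hypothetical and is created only for the demo. Because gzcat may be missing or renamed on some systems, the sketch uses gunzip -c to print the compressed file's contents instead.

```shell
#!/bin/sh
# Round trip with gzip and gunzip; report.txt is a hypothetical demo file.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Create a repetitive text file and keep a copy for comparison.
i=0
while [ "$i" -lt 200 ]; do
    echo "entry $i: disk usage report for the shared file server"
    i=$((i + 1))
done > report.txt
cp report.txt report.orig
original_size=$(($(wc -c < report.txt)))

gzip report.txt                       # replaces report.txt with report.txt.gz
compressed_size=$(($(wc -c < report.txt.gz)))
echo "original: $original_size bytes, compressed: $compressed_size bytes"

gunzip -c report.txt.gz | head -n 2   # print contents; the .gz file is untouched
gunzip report.txt.gz                  # restores report.txt, removes report.txt.gz

# Confirm the restored file matches the copy made before compressing.
cmp report.txt report.orig && echo "contents unchanged"
```

How much space is saved depends heavily on the input. Repetitive text like this compresses very well, while data that is already compressed shrinks little or not at all.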
The zip command offers compression that is based on the algorithm from the PC standard PKZip program. The zip and unzip programs work much as you might expect them to: zip <archive.zip> <filename> to compress a file into a zip archive, and unzip <archive.zip> to extract the files from it.
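As a sketch of that usage, the following creates a zip archive, lists its contents, and extracts it into a separate directory. The names archive.zip, photo-notes.txt, and extracted are hypothetical, chosen only for the demo, and the script skips the steps if zip or unzip is not installed.

```shell
#!/bin/sh
# Create, list, and extract a zip archive.
# All file and directory names here are hypothetical demo names.
set -e
workdir=$(mktemp -d)
cd "$workdir"
printf 'photos from the trip are attached\n' > photo-notes.txt

if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    zip archive.zip photo-notes.txt   # adds the file; the original is kept
    unzip -l archive.zip              # list what the archive contains
    mkdir extracted
    unzip archive.zip -d extracted    # extract into a separate directory
    cmp photo-notes.txt extracted/photo-notes.txt && echo "contents match"
else
    echo "zip or unzip is not installed here; skipping"
fi
```

Unlike compress and gzip, zip leaves the original file in place and can hold several files in one archive.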
Creating files using the zip format (which uses the file suffix .zip) for distribution to other Unix users is generally not a good idea, as zip and unzip are not always available to Unix users. These utilities are freely available, so ask your system administrator to install them if you need access to them. If your target, however, is users of Macintosh or Windows computers, zip is a file format that they can most likely read. Both the zip and unzip programs have a number of potentially useful options, a list of which can be displayed by issuing either command followed by the -h option.
In this section of the lesson, we have covered how to compress data, and we lightly touched on the use of the tar command. In the next section, we will dig deeper into the tar command and cover its use.