Comparing Files


Often you need to see whether two files have different contents and, if they do, to list the differences. For example, you may want to compare two versions of a document you're working on to see what you've changed. It is also sometimes useful to be able to tell whether files having the same name in two different directories are simply different copies of the same file, or whether the files themselves are different.

  • cmp, comm, and diff each tell whether two files are the same or different, and they give information about where or how the files differ. The differences among them have to do with how much information they give you, and how they display it.

  • patch uses the list of differences produced by diff, together with an original file, to update the original to include the differences.

  • dircmp tells whether the files in two directories are the same or different.

cmp

The cmp command is the simplest of the file comparison tools. It tells you whether two files differ, and if they do, it reports the position in the file where the first difference occurs. The following example illustrates how it works:

 $ cat note
 Nate,
 Here's the first draft of the plan.
 I think it needs more work.
 $ cat note.more
 Nate,
 Here's the first draft of the new plan.
 I think it needs more work.
 Let me know what you think.
 $ cmp note note.more
 note note.more differ: byte 37, line 2

This output shows that the first difference in the two files occurs at the 37th character, which is in the second line. cmp does not print anything if there are no differences in the files.
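
Because cmp prints nothing when the files match, scripts usually test its exit status instead: 0 means the files are identical and 1 means they differ. The following is a minimal sketch, assuming a POSIX shell; the -s option suppresses all output, and the scratch directory, the file contents, and the helper name same_file are invented for illustration.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'Nate,\nHere is the plan.\n' > note
printf 'Nate,\nHere is the new plan.\n' > note.more
cp note note.copy

# same_file is a hypothetical helper: it succeeds (status 0) only when
# the two files are byte-for-byte identical. cmp -s prints nothing.
same_file() {
    cmp -s "$1" "$2"
}

if same_file note note.copy; then
    result_copy="identical"
else
    result_copy="different"
fi

if same_file note note.more; then
    result_more="identical"
else
    result_more="different"
fi

echo "note vs note.copy: $result_copy"
echo "note vs note.more: $result_more"
```

This form is convenient when you only need a yes-or-no answer and do not care where the first difference occurs.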

comm

The comm (common) command is designed to compare two sorted files and show lines that are the same or different. You can display lines that are found only in the first file, lines found only in the second file, and/or lines that are found in both files.

By default, comm prints its output in three columns: lines unique to the first file, lines unique to the second file, and lines found in both, respectively. The following illustrates how it works, using two files containing lists of cities:

 $ comm cities.1 cities.2
 New York
                 Palo Alto
                 San Francisco
         Santa Monica
                 Seattle

This shows that “New York” is only in the first file, “Santa Monica” only occurs in the second, and “Palo Alto”, “San Francisco”, and “Seattle” are found in both.

The comm command provides options you can use to control which of the three columns it prints. The options -1 and -2 suppress the lines unique to the first and second files, respectively. Use -3 to suppress the lines found in both. These options can be combined. For example, to print only the lines unique to the first file, use -23, like this:

 $ comm -23 cities.1 cities.2
 New York
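
Since comm expects sorted input, unsorted lists have to be passed through sort first. The sketch below assumes a POSIX shell; the file contents and the .sorted filenames are made up for illustration, and LC_ALL=C is set only so that sort and comm agree on the ordering.

```shell
# Use one fixed collation order for both sort and comm.
LC_ALL=C
export LC_ALL

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two deliberately unsorted city lists (illustrative data).
printf 'Seattle\nNew York\nPalo Alto\nSan Francisco\n' > cities.1
printf 'Santa Monica\nSeattle\nSan Francisco\nPalo Alto\n' > cities.2

# comm compares sorted files, so sort each list first.
sort cities.1 > cities.1.sorted
sort cities.2 > cities.2.sorted

# -12 suppresses columns 1 and 2, leaving only the lines found in both files.
common=$(comm -12 cities.1.sorted cities.2.sorted)

# -23 suppresses columns 2 and 3, leaving only the lines unique to the first file.
only_first=$(comm -23 cities.1.sorted cities.2.sorted)

echo "In both files:"
echo "$common"
echo "Only in cities.1:"
echo "$only_first"
```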

diff

The diff command compares two files, line by line, and prints out differences. In addition, for each block of text that differs between the two files, diff tells you how the text from the first file would have to be changed to match the text from the second.

The following example illustrates the diff output for the two note files described earlier:

 $ diff note note.more
 2c2
 < Here's the first draft of the plan.
 ---
 > Here's the first draft of the new plan.
 3a4
 > Let me know what you think.

Lines containing text that is found only in the first file begin with <. Lines containing text found only in the second file begin with >. Dashed lines separate parts of the diff output that refer to different files.

Each section of the diff output begins with a code that indicates what kind of difference the following lines refer to. In the preceding example, the first difference begins with the code 2c2. This tells you that there is a change (c) between line 2 in the first file and line 2 in the second file. The second difference begins with 3a4. The letter a (append) indicates that line 4 in the second file is added following line 3 in the first. Similarly, a d (delete) indicates lines that are found in the first file but not in the second.
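
The d code does not appear in the note example, so here is a small sketch of it, assuming a POSIX shell. The filenames first and second and their contents are invented for illustration. Note that diff exits with status 1 when the files differ, which is not an error.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The second file is the first file with its middle line removed.
printf 'one\ntwo\nthree\n' > first
printf 'one\nthree\n' > second

# Capture both the report and the exit status (0 = same, 1 = different).
status=0
output=$(diff first second) || status=$?

printf '%s\n' "$output"
# prints:
# 2d1
# < two
echo "diff exit status: $status"
```

The code 2d1 says that line 2 of the first file must be deleted to match the second file, and that the deletion falls after line 1 of the second file.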

patch

If you save the output from diff, you can use the patch command to recreate the second file by applying the differences to the first file. The patched version replaces the original file. The following shows how you could patch the file project.c using the difference file diffs.

 $ diff project.c project2.c > diffs
 $ patch project.c diffs

After this pair of commands, the contents of project.c are identical to the contents of project2.c.

The patch command allows you to keep track of successive versions of a file without having to keep all of the intermediate versions. All you need to do is to keep the original version and the output from diff needed to change it into each new version. (This is how some revision control systems store files. See Chapter 24 for an explanation of revision control.)
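
The idea of keeping only the original plus a chain of difference files can be sketched as follows, assuming a POSIX shell with diff and patch installed. The filenames v1, v2, v3, rebuilt, and the .diff files are hypothetical, as are their contents.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Three successive versions of a file (illustrative contents).
printf 'alpha\nbeta\n' > v1
printf 'alpha\nbeta\ngamma\n' > v2
printf 'alpha\nBETA\ngamma\n' > v3

# Save the differences between each pair of successive versions.
# diff exits with status 1 when it finds differences, hence "|| true".
diff v1 v2 > v1-to-v2.diff || true
diff v2 v3 > v2-to-v3.diff || true

# Only v1 and the two difference files are needed from here on.
rm v2

# Rebuild the latest version: start from a copy of the original
# and apply each difference file in order.
cp v1 rebuilt
patch rebuilt v1-to-v2.diff
patch rebuilt v2-to-v3.diff

# v3 was kept only so the result can be checked.
if cmp -s rebuilt v3; then
    rebuilt_ok="yes"
else
    rebuilt_ok="no"
fi
echo "rebuilt matches v3: $rebuilt_ok"
```

The difference files must be applied in the same order in which they were created, because each one describes changes relative to the version before it.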

dircmp

Some versions of UNIX, such as Solaris, include the dircmp command, which compares the contents of two directories and tells you how they differ. The output of dircmp lists the filenames that are unique to each directory. If there are files with the same name in both directories, dircmp tells you whether their contents are the same or different.

The following command compares the contents of ~jcm/Dev with the contents of ~jcm/Dev/Backup:

 $ dircmp ~jcm/Dev ~jcm/Dev/Backup

In addition to comparing two of your own directories, dircmp may be used to compare directories belonging to different users. For example, if two users are working on the same project and each has their own copy of the files, they may need to determine which files are no longer identical.
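
On systems that do not provide dircmp, diff with the -r (recursive) option gives a rough substitute: it reports files that exist in only one directory and shows the differences between files that share a name. This is a different tool from dircmp, with a different output format. The sketch below assumes a POSIX shell; the directory names Dev and Backup and the files in them are invented for illustration.

```shell
# Fixed locale so the report wording does not vary between systems.
LC_ALL=C
export LC_ALL

workdir=$(mktemp -d)
mkdir "$workdir/Dev" "$workdir/Backup"

# main.c differs between the two directories.
printf 'version 2\n' > "$workdir/Dev/main.c"
printf 'version 1\n' > "$workdir/Backup/main.c"

# README is identical in both.
printf 'same text\n' > "$workdir/Dev/README"
cp "$workdir/Dev/README" "$workdir/Backup/README"

# notes exists only in Dev.
printf 'new file\n' > "$workdir/Dev/notes"

# diff -r exits with status 0 if the directories match, 1 if they differ.
status=0
report=$(diff -r "$workdir/Dev" "$workdir/Backup") || status=$?

printf '%s\n' "$report"
echo "diff -r exit status: $status"
```

Identical files, such as README here, are simply left out of the report.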




UNIX: The Complete Reference, Second Edition (Complete Reference Series)
ISBN: 0072263369
Year: 2006
Pages: 316
