4.3. Sorting Files: sort

The sort utility sorts a file in ascending or descending order based on one or more sort fields, and works as described in Figure 4-7.
Individual fields are ordered lexicographically, which means that corresponding characters are compared based on their ASCII value (see man ascii for a list of all characters and their corresponding values). Two consequences of this are that an uppercase letter is "less" than its lowercase equivalent, and a space is "less" than a letter. In the following example, I sorted a text file in ascending order and descending order using the default ordering rule:

    $ cat sortfile              ...list the file to be sorted.
    jan Start chapter 3 10th
    Jan Start chapter 1 30th
     Jan Start chapter 5 23rd
     Jan End chapter 3 23rd
    Mar Start chapter 7 27
     may End chapter 7 17th
    Apr End Chapter 5 1
     Feb End chapter 1 14
    $ sort sortfile             ...sort it.
     Feb End chapter 1 14
     Jan End chapter 3 23rd
     Jan Start chapter 5 23rd
     may End chapter 7 17th
    Apr End Chapter 5 1
    Jan Start chapter 1 30th
    Mar Start chapter 7 27
    jan Start chapter 3 10th
    $ sort -r sortfile          ...sort it in reverse order.
    jan Start chapter 3 10th
    Mar Start chapter 7 27
    Jan Start chapter 1 30th
    Apr End Chapter 5 1
     may End chapter 7 17th
     Jan Start chapter 5 23rd
     Jan End chapter 3 23rd
     Feb End chapter 1 14
    $ _

To sort on a particular field, you must specify the starting field number using a + prefix, followed by the noninclusive stop field number using a - prefix. Field numbers start at index 0. If you leave off the stop field number, all fields following the start field are included. In the next example, I sorted the same text file on the first field only, which is number zero:

    $ sort +0 -1 sortfile       ...sort on first field only.
     Feb End chapter 1 14
     Jan End chapter 3 23rd
     Jan Start chapter 5 23rd
     may End chapter 7 17th

Note that the leading spaces were counted as being part of the first field, which resulted in a strange sorting sequence. Additionally, I would have preferred the months to be sorted in correct order, with "Jan" before "Feb", etc. The -b option ignores leading blanks and the -M option sorts a field based on a month order.
Here's an example that worked better:

    $ sort +0 -1 -bM sortfile   ...sort on first month.
     Jan End chapter 3 23rd
     Jan Start chapter 5 23rd
    Jan Start chapter 1 30th
    jan Start chapter 3 10th
     Feb End chapter 1 14
    Mar Start chapter 7 27
    Apr End Chapter 5 1
     may End chapter 7 17th
    $ _

The example text file was correctly sorted by month, but the dates were still out of order. You may specify multiple sort fields on the command line to deal with this problem. The sort utility first sorts all of the lines based on the first sort specifier, and then uses the second sort specifier to order lines that compared equally by the first specifier. Therefore, to sort the example text file by month and date, it had to be sorted based on the first field and then the fifth. In addition, the fifth field had to be sorted numerically by using the -n option.

    $ sort +0 -1 -bM +4 -n sortfile
    jan Start chapter 3 10th
     Jan End chapter 3 23rd
     Jan Start chapter 5 23rd
    Jan Start chapter 1 30th
     Feb End chapter 1 14
    Mar Start chapter 7 27
    Apr End Chapter 5 1
     may End chapter 7 17th
    $ _

Characters other than spaces often delimit fields. For example, the "/etc/passwd" file contains user information stored in fields separated by colons. You may use the -t option to specify an alternative field separator. In the following example, I sorted a file based on fields separated by : characters.

    $ cat sortfile2             ...look at the test file.
    jan:Start chapter 3:10th
    Jan:Start chapter 1:30th
    Jan:Start chapter 5:23rd
    Jan:End chapter 3:23rd
    Mar:Start chapter 7:27
    may:End chapter 7:17th
    Apr:End Chapter 5:1
    Feb:End chapter 1:14
    $ sort -t: +0 -1 -bM +2 -n sortfile2     ...colon delimiters.
    jan:Start chapter 3:10th
    Jan:End chapter 3:23rd
    Jan:Start chapter 5:23rd
    Jan:Start chapter 1:30th
    Feb:End chapter 1:14
    Mar:Start chapter 7:27
    Apr:End Chapter 5:1
    may:End chapter 7:17th
    $ _

sort contains several other options that are too detailed to describe here; I suggest that you use the man utility to find out more about them.
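The +0 -1 style of field specification used in this section is the older positional form, and some current versions of sort may reject it. As a rough sketch rather than a definitive recipe, the colon-delimited example can also be written with the -k option, which numbers fields from 1 instead of 0 and takes an inclusive stop field. The temporary file and the variable names below are hypothetical, setting LC_ALL=C is my own assumption to force plain ASCII ordering, and -M is assumed to be available (it is a common extension, but not every sort provides it).

```shell
# Assumption: force byte-wise (ASCII) comparisons regardless of the user's locale.
export LC_ALL=C

# Recreate the colon-delimited example file in a temporary location (hypothetical name).
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
jan:Start chapter 3:10th
Jan:Start chapter 1:30th
Jan:Start chapter 5:23rd
Jan:End chapter 3:23rd
Mar:Start chapter 7:27
may:End chapter 7:17th
Apr:End Chapter 5:1
Feb:End chapter 1:14
EOF

# -k1,1M : first field only, compared as a month name   (like +0 -1 with -M)
# -k3,3n : third field only, compared numerically       (like +2 with -n)
sorted=$(sort -t: -k1,1M -k3,3n "$tmpfile")
printf '%s\n' "$sorted"

# Clean up the temporary file.
rm -f "$tmpfile"
```

Because -k fields are numbered from 1, +0 -1 corresponds to -k1,1 and +2 corresponds to a key starting at field 3. Lines that tie on both keys fall back to a comparison of the whole line, which is why the two "23rd" entries still come out in a fixed order.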