6.2 Pipes and Filters


We've seen how to redirect input from a file and output to a file. You can also connect two programs together so that the output from one program becomes the input of the next program. Two or more programs connected in this way form a pipe. To make a pipe, put a vertical bar (|) on the command line between two commands. When a pipe is set up between two commands, the standard output of the command to the left of the pipe symbol becomes the standard input of the command to the right of the pipe symbol. Any two commands can form a pipe as long as the first program writes to standard output and the second program reads from standard input.
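As a rough sketch of what the pipe symbol does, the following compares a two-step version that goes through an intermediate file with the one-step pipe. The sample lines and the temporary file are made up for illustration, and sort (described later in this section) stands in for any program that reads standard input.

```shell
# Hypothetical sample data: three unsorted lines.
tmpfile=$(mktemp)

# Two steps: redirect output to a file, then redirect that file as input.
printf 'pear\napple\nfig\n' > "$tmpfile"
via_file=$(sort < "$tmpfile")

# One step: the pipe connects standard output directly to standard input.
via_pipe=$(printf 'pear\napple\nfig\n' | sort)

echo "$via_pipe"
rm -f "$tmpfile"
```

Both variables end up holding the same sorted text; the pipe simply removes the need for the intermediate file.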

When a program takes its input from another program, performs some operation on that input, and writes the result to the standard output (which may be piped to yet another program), it is referred to as a filter. A common use of filters is to modify output. Just as a common filter culls unwanted items, Unix filters can restructure output.

Most Unix programs can be used to form pipes. Some programs that are commonly used as filters are described in the next sections. Note that these programs aren't used only as filters or parts of pipes. They're also useful on their own.

6.2.1 grep

The grep program searches the contents of files for lines that have a certain pattern. The syntax is:

 grep "pattern"   file(s)   

The name "grep" is derived from the ed (a Unix line editor) command g/re/p, which means "globally search for a regular expression and print all matching lines containing it." A regular expression is either some plain text (a word, for example) or special characters used for pattern matching. When you learn more about regular expressions, you can use them to specify complex patterns of text.

grep understands plain text and that's all. The Find command in the Finder can meaningfully search Microsoft Word data files, for example, but grep knows text only. Feeding it non-text files can produce puzzling and peculiar results. For example, Word files and a lot of other application data contain characters that, when sent to Terminal.app, mess up your display in strange and interesting ways. One way to search such files from the command line is to extract only the printable characters using the strings program (see man strings for details).


The simplest use of grep is to look for a pattern consisting of a single word. It can be used in a pipe so only those lines of the input files containing a given string are sent to the standard output. But let's start with an example reading from files: searching all files in the working directory for a word (say, Unix). We'll use the wildcard * to quickly give grep all filenames in the directory.

 $ grep "Unix" *
 ch01:Unix is a flexible and powerful operating system
 ch01:When the Unix designers started work, little did
 ch05:What can we do with Unix?
 $

When grep searches multiple files, it shows the filename where it finds each matching line of text. Alternatively, if you don't give grep a filename to read, it reads its standard input; that's the way all filter programs work:

 $ ls -l | grep "Jan"
 drwx------   4 taylor  taylor  264  Jan 29 22:33 Movies/
 drwx------   2 taylor  taylor  264  Jan 13 10:02 Music/
 drwx------  95 taylor  taylor  3186 Jan 29 22:44 Pictures/
 drwxr-xr-x   3 taylor  taylor  264  Jan 24 21:24 Public/
 $

First, the example runs ls -l to list your directory. The standard output of ls -l is piped to grep, which outputs only lines that contain the string Jan (that is, files or directories that were last modified in January and any other lines that have the pattern "Jan" within). Because the standard output of grep isn't redirected, those lines go to the Terminal screen.
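The two ways of feeding grep can be tried side by side. This is only a sketch: the scratch directory and the contents of the ch01 and ch05 files are invented here so the example is self-contained.

```shell
# Hypothetical sample files in a scratch directory.
dir=$(mktemp -d)
printf 'Unix is flexible\nnothing here\n' > "$dir/ch01"
printf 'What can we do with Unix?\n' > "$dir/ch05"

# Several files: each matching line is prefixed with its filename.
from_files=$(cd "$dir" && grep "Unix" *)

# No filename: grep filters its standard input, so there is no prefix.
from_stdin=$(cat "$dir/ch01" | grep "Unix")

echo "$from_files"
echo "$from_stdin"
rm -rf "$dir"
```

When grep reads from the pipe it has no filename to report, so only the matching line itself is printed.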

grep options let you modify the search. Table 6-1 lists some of the options.

Table 6-1. Some grep options

Option   Description
-v       Print all lines that do not match pattern.
-n       Print the matched line and its line number.
-l       Print only the names of files with matching lines (lowercase letter "L").
-c       Print only the count of matching lines.
-i       Match either upper- or lowercase.
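To see what each option in Table 6-1 does, you might try something like the following. The notes and other files are hypothetical and are created in a scratch directory just for this sketch.

```shell
# Hypothetical sample files.
dir=$(mktemp -d)
printf 'Unix rocks\nplain line\nunix again\n' > "$dir/notes"
printf 'no match here\n' > "$dir/other"

inverted=$(grep -v "Unix" "$dir/notes")   # lines that do NOT match
numbered=$(grep -n "Unix" "$dir/notes")   # line number, a colon, then the line
count=$(grep -c -i "unix" "$dir/notes")   # count of matches, ignoring case
names=$(cd "$dir" && grep -l "Unix" *)    # only the names of matching files

echo "$inverted"
echo "$numbered"
echo "$count"
echo "$names"
rm -rf "$dir"
```

Options can be combined, as the -c -i line shows: it counts both "Unix" and "unix".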

Next, let's use a regular expression that tells grep to find lines with root, followed by zero or more other characters (abbreviated in a regular expression as .*), then followed by Jan:

 $ ls -l | grep "root.*Jan"
 drwxr-xr-x  12 root    staff   364 Jan  9 20:24 NetInfo/
 $

Note that the regular expression for "zero or more characters," .*, is different from the corresponding filename wildcard *. See Section 4.2 in Chapter 4. We can't cover regular expressions in enough depth here to explain the difference, though more-detailed books do. As a rule of thumb, remember that the first argument to grep is a regular expression; other arguments, if any, are filenames that can use wildcards.
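A small experiment hints at the difference. In a regular expression, a bare * repeats whatever character comes just before it, whereas .* stands for any run of characters. The four input lines below are contrived for this sketch and are not real ls output.

```shell
# Hypothetical input lines.
lines='root staff Jan 9
taylor staff Jan 13
rootJan
rooJan'

# .* is "any run of characters": root, then anything (or nothing), then Jan.
dot_star=$(printf '%s\n' "$lines" | grep "root.*Jan")

# A bare * repeats the preceding character (here, t): "roo", then zero or
# more t's, then "Jan" immediately afterward.
bare_star=$(printf '%s\n' "$lines" | grep "root*Jan")

echo "$dot_star"
echo "$bare_star"
```

The first search finds "root staff Jan 9" and "rootJan"; the second finds "rootJan" and "rooJan" but misses the line that has other text between root and Jan.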


For more about regular expressions, see the references in Section 10.1 in Chapter 10.

6.2.2 sort

The sort program arranges lines of text alphabetically or numerically. The following example sorts the lines in the food file (from Section 5.1.1 in Chapter 5) alphabetically. sort doesn't modify the file itself; it just reads the file and displays the result on standard output (in this case, the Terminal).

 $ sort food
 Afghani Cuisine
 Bangkok Wok
 Big Apple Deli
 Isle of Java
 Mandalay
 Sushi and Sashimi
 Sweet Tooth
 Tio Pepe's Peppers

By default, sort arranges lines of text alphabetically. Many options control the sorting, and Table 6-2 lists some of them.

Table 6-2. Some sort options

Option   Description
-n       Sort numerically (for example, 10 sorts after 2); ignore blanks and tabs.
-r       Reverse the sorting order.
-f       Sort upper- and lowercase together.
+x       Ignore first x fields when sorting.
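The effect of -n, -r, and -f can be checked with a few made-up lines. In this sketch, LC_ALL=C is an added assumption: it pins sort to plain byte order so that the uppercase-before-lowercase default is visible regardless of your locale settings.

```shell
# Default order compares character by character, so 10 sorts before 2.
alpha=$(printf '10\n2\n33\n' | sort)

# -n compares numeric values; adding -r reverses the result.
numeric=$(printf '10\n2\n33\n' | sort -n)
reversed=$(printf '10\n2\n33\n' | sort -n -r)

# In byte order, uppercase letters sort before lowercase; -f folds case.
bytes=$(printf 'apple\nBanana\ncherry\n' | LC_ALL=C sort)
folded=$(printf 'apple\nBanana\ncherry\n' | LC_ALL=C sort -f)

echo "$numeric"
echo "$folded"
```

Without -f, "Banana" comes first in byte order; with -f, the three names appear in ordinary dictionary order.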

More than two commands may be linked up into a pipe. Taking a previous pipe example using grep, we can further sort the files modified in January by order of size. The following pipe uses the commands ls, grep, and sort:

 $ ls -l | grep "Jan" | sort +4n
 drwx------   2  taylor taylor  264  Jan 13 10:02 Music/
 drwx------   4  taylor taylor  264  Jan 29 22:33 Movies/
 drwxr-xr-x   3  taylor taylor  264  Jan 24 21:24 Public/
 drwx------  95  taylor taylor 3186  Jan 29 22:44 Pictures/
 $

This pipe sorts all files in your directory modified in January by order of size, and prints them to the Terminal screen. The sort option +4n skips 4 fields (fields are separated by blanks), then sorts the lines in numeric order. So, the output of ls, filtered by grep, is sorted by the file size (this is the fifth column, starting with 264). The +x form is obsolete; on systems whose sort no longer accepts it, the POSIX equivalent is sort -k 5n, which names the fifth field directly. Both grep and sort are used here as filters to modify the output of the ls -l command. You could print the listing by piping the sort output to your printer command (either lp, lpr, or atprint).
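The same three-stage idea can be sketched without depending on what is in your own directory. The listing below is invented text in the shape of ls -l output, and the example uses -k 5n, the POSIX spelling of the older +4n, on the assumption that your sort may not accept the +x form.

```shell
# Hypothetical lines shaped like `ls -l` output (size is the fifth field).
listing='drwx------  95 taylor taylor 3186 Jan 29 22:44 Pictures/
drwx------   4 taylor taylor  264 Jan 29 22:33 Movies/
-rw-r--r--   1 taylor taylor   42 Feb  2 09:15 notes
drwxr-xr-x   3 taylor taylor 1024 Jan 24 21:24 Public/'

# Keep only the January lines, then sort numerically on the fifth field.
by_size=$(printf '%s\n' "$listing" | grep "Jan" | sort -k 5n)

echo "$by_size"
```

The February line is dropped by grep, and the remaining lines come out smallest first: Movies/, Public/, then Pictures/.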

6.2.3 Piping to a Pager

The less program, which you saw in Section 3.1.13 in Chapter 3, can also be used as a filter. A long output normally zips by you on the screen, but if you run text through less, the display stops after each page or screenful of text (that's why they're called "pagers": they let you see the output page by page).

Let's assume that you have a long directory listing. (If you want to try this example and need a directory with lots of files, use cd first to change to a system directory such as /bin or /usr/bin.) To make it easier to read the sorted listing, pipe the output through less:

 $ cd /bin
 $ ls -l | sort +4n | less
 total 8288
 -r-xr-xr-x  1 root  wheel       9736 27 Aug 04:36 echo
 -r-xr-xr-x  1 root  wheel      10256 27 Aug 04:44 sync
 -r-xr-xr-x  1 root  wheel      10476 27 Aug 05:03 domainname
 ...
 -r-sr-xr-x  1 root  wheel      25248 27 Aug 05:03 rcp
 -r-xr-xr-x  1 root  wheel      27308 27 Aug 04:31 dd

less reads a screenful of text from the pipe (consisting of lines sorted by order of file size), then prints a colon (:) prompt. At the prompt, you can type a less command to move through the sorted text. less reads more text from the pipe, shows it to you, and saves a copy of what it has read, so you can go backward to reread previous text if you want. (The simpler pager program more can't back up while reading from a pipe.) When you're done seeing the sorted text, the q command quits less.

6.2.4 Exercise: Redirecting Input/Output

In the following exercises, you redirect output, create a simple pipe, and use filters to modify output:

Task                                   Command
Redirect output to a file.             ls > files
Change all the letters to uppercase.   tr '[a-z]' '[A-Z]' < files
Sort the output of a program.          ls | sort
Append sorted output to a file.        ls | sort >> files
Display output to the screen.          less files (or more files)
Display long output to the screen.     ls -l /bin | less (or more)
Format and print a file with pr.       pr files | lp or pr files | lpr
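If you'd like to check the redirection tasks without touching your own files, a sketch along these lines may help. It uses printf in place of ls so the contents are predictable; the scratch directory and the fruit names are made up.

```shell
# Work in a scratch directory with a hypothetical "files" file.
dir=$(mktemp -d)
printf 'pear\napple\n' > "$dir/files"           # > creates or overwrites

upper=$(tr '[a-z]' '[A-Z]' < "$dir/files")      # < feeds the file to tr

printf 'fig\nbanana\n' | sort >> "$dir/files"   # >> appends the sorted output
appended=$(cat "$dir/files")

echo "$upper"
echo "$appended"
rm -rf "$dir"
```

Note that >> leaves the original two lines in place and adds the sorted lines after them; it does not re-sort the whole file.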


Learning Unix for Mac OS X Panther
ISBN: 0596006179
EAN: 2147483647
Year: 2003
Pages: 88