Project 6. Use Redirection and Pipelining

Redirection and pipelining are techniques for using Unix commands in combination to perform a particular task. Many of the projects in this book use, and assume knowledge of, redirection and pipelining.

The concepts of redirection and pipelining lie at the core of Unix and take advantage of the central idea that set Unix apart from many other computer operating systems: input and output (IO) are streams of (usually human-readable) characters. This is true for all IO, whether it is file-based, console-based (terminal screen and keyboard), or interprocess communication (communication between running commands). Other operating systems break this symmetry by doing file-based IO in blocks. In Unix, information can flow as it is generated.

The symmetry among file, console, and interprocess IO makes it easy to plug any output stream into any input stream. A Unix command that normally takes input from the keyboard and writes output to the screen, for example, can easily be made to read from and write to a file instead. This is termed redirection. A command can also be told to take its input from the output of a previous command, and in this way, many commands can be chained, each reading the output from its predecessor. This is termed pipelining.

Pipelining enables a task to be performed by combining lots of small, specialized tools, each doing its own thing on the data stream. You don't have to rely on a few monolithic applications, hoping they do exactly what you require. Instead, you combine small tools to create your own customized application, tailored to do exactly as you require.

Much of the skill in getting the most from Unix lies not just in employing the right commands, but also in being able to combine them in the right manner to achieve your goal. This is the practical application of Unix and, in a nutshell, is what this book and most of its 101 projects are about.
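
Redirection and pipelining work without any special effort from the commands involved; the shell does all the work. A command sees three streams:

    standard input (stdin), stream 0, from which the command reads
    standard output (stdout), stream 1, to which the command writes its normal output
    standard error (stderr), stream 2, to which the command writes its error messages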

The shell regards the keyboard as the stdin device and the Terminal screen as the stdout and stderr devices. Normally, these three streams are inherited by the command when it is launched by the shell. When redirection is applied, the shell opens the specified file(s) and reassigns them as stdin, stdout, and stderr as appropriate before executing the command. This process is transparent to the command, which does not treat the streams any differently.

When pipelining is applied, the shell opens a special stream called a pipe, assigning this as stdout of the first command and also as stdin of the second command. A pipe just channels its input to its output.

To use redirection and pipelining, all you need to understand is the shell's syntax. The syntax given here is good for Bourne and Bash.

Redirection of stdout and stderr

We'll get started with redirection by having a command that normally writes to the screen send its output to a file instead. Let's redirect output from the command ls to the file list.txt and then display that file. To specify redirection of stdout to a file, simply use > filename.

    $ ls > list.txt

The file list.txt will contain the text that would otherwise have been written to the Terminal screen. Let's view it using the cat command.

    $ cat list.txt
    Desktop
    ...
    Sites
    list.txt
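
The listing above includes list.txt itself, which follows from the shell opening the output file before it executes ls. A minimal sketch of that behavior, assuming a scratch directory created with mktemp and two stand-in folders (Desktop and Sites are created here only for illustration):

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir Desktop Sites   # stand-ins for the folders shown in the text

# The shell creates list.txt first, then runs ls with stdout attached to it,
# so ls finds list.txt already present in the directory.
ls > list.txt

cat list.txt
```

The file should contain three names, one per line, with list.txt among them.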

The next example demonstrates that error messages go to stderr and not to stdout, then illustrates redirection of stderr, using the syntax 2> filename. We'll start by intentionally generating an error message, using the redirection sequence from the previous exercise to list the contents of a nonexistent directory.

    $ ls zzz > list.txt
    ls: zzz: No such file or directory
    $ cat list.txt

The error message still appears on the Terminal screen, and cat displays nothing: list.txt is empty because the message went to stderr, which was not redirected.

Now add a 2 before the > symbol to redirect stderr to the list.txt file. You'll notice that nothing is written to the Terminal screen this time because we have sent stderr to a file.

    $ ls zzz 2> list.txt

View the text file with cat to confirm that stderr output has been written to it.

    $ cat list.txt
    ls: zzz: No such file or directory

To redirect both stdout and stderr, just combine the two examples above.

    $ ls zzz Sites >out.txt 2>error.txt

Sometimes, you may want to redirect both stdout and stderr to the same file. To accomplish this, first redirect stdout; then merge stderr (stream 2) into stdout (stream 1). The second line shown below is equivalent to the first but reverses the way stdout and stderr are merged.

    $ ls zzz Sites >out.txt 2>&1
    $ ls zzz Sites 2>out.txt 1>&2

In Bash, but not Bourne, there's a shorter syntax:

    $ ls zzz Sites &> out.txt
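
The instruction to redirect stdout first and then merge stderr matters because the shell processes redirections from left to right. The following sketch contrasts the two orders; the scratch directory, the file name stdout-only.txt, and the variable leaked are assumptions made for illustration, and || true is added only because ls returns a failure status when zzz is missing.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir Sites
touch Sites/index.html   # sample content so that ls has something to list

# Order from the text: stdout goes to out.txt, then stderr joins it there.
ls zzz Sites > out.txt 2>&1 || true

# Reversed order: 2>&1 copies wherever stdout points *at that moment*
# (here, the capture performed by $(...)), and only then is stdout
# sent to the file. The error message therefore never reaches the file.
leaked=$(ls zzz Sites 2>&1 > stdout-only.txt) || true

echo "out.txt:";          cat out.txt
echo "stdout-only.txt:";  cat stdout-only.txt
echo "escaped the file:"; echo "$leaked"
```

With the order given in the text, out.txt holds both the error message and the listing; with the order reversed, the file holds only the listing.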

As you may have noticed when we redirected stderr to list.txt, use of the character > causes the specified file to be overwritten if it exists. Use a double arrow, >> or 2>>, if you want to add, or append, output to an existing text file instead.

    $ ls > list.txt
    $ echo "--------------" >> list.txt
    $ ls Sites >> list.txt
    $ cat list.txt
    Desktop
    ...
    Sites
    list.txt
    --------------
    images
    index.html

Try the following experiment. Issue the command cat with no file to read, and redirect stdout to letter.txt. The command cat will wait for you to type something at the keyboard; it assumes that you meant it to read from stdin because you did not specify an input file. When you've finished, type Control-d (Control-d means end of file), and display the file.

    $ cat > letter.txt
    Hello,
    I am writing...
    <Control-d>
    $ cat letter.txt
    Hello,
    I am writing...

Redirection of stdin

The real value of using stdin lies in pipelining, described later in this chapter, but redirection of stdin is useful where a command takes no filenames, expecting to use the keyboard and the screen. A command that normally reads from the keyboard, for example, can be made to read from a file by means of redirection. One such command is tr, which translates one character to another. To see how this works, try this example, which changes every occurrence of the letter i to the letter u. First, let tr read from the keyboard and write to the screen.

    $ tr i u
    big
    bug
    <Control-d>
    $

A typical use for tr is to translate files that have Mac-style end-of-line characters (Return) into files that have Unix-style end-of-line characters (Newline). In this example, stdin is redirected so it's read from mac-file, and stdout is redirected so it's written to unix-file. The character sequence \r represents Return, and \n represents Newline. To specify redirection of stdin, simply use < filename.

    $ tr '\r' '\n' < mac-file > unix-file
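
To try the line-ending conversion without a real Mac-style file to hand, a sample can be generated with printf. This is a small sketch under that assumption; the two lines of sample text are invented, and the file names simply mirror those used above.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a sample file whose lines end in Return (\r) rather than Newline.
printf 'line one\rline two\r' > mac-file

# stdin comes from mac-file, stdout goes to unix-file.
tr '\r' '\n' < mac-file > unix-file

# wc -l counts Newline characters, so it sees no lines before the
# conversion and two lines after it.
wc -l < mac-file
wc -l < unix-file
```

Because tr accepts no filename arguments, redirecting stdin and stdout is the only way to point it at files.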

Pipelining

The concept of standard input and standard output streams lets us chain Unix commands, where a command takes its input from the output of the previous command. Such pipelines are set up by the shell, which arranges for the standard input of command #n to be taken from the standard output of command #(n-1).

Here's an example of pipelining that uses the commands ps and grep. The command ps displays the status of all processes running on your Mac, and grep picks out lines that match a regular expression. First, try ps on its own. Options -cx tell ps to list the names of all commands you are running.

    $ ps -cx
      PID  TT  STAT      TIME COMMAND
      196  ??  Ss     3:02.28 WindowServer
    ...
      389 std  Ss     0:00.16 -bash
      478  p2  S      0:00.12 -bash

Now pipe the results to grep, which we'll tell to pick out lines containing "Terminal". This should leave just the status line for process Terminal.

    $ ps -xc | grep "Terminal"
      476  ??  S      0:24.52 Terminal

The three-digit number at the start of the line is the process identification (PID) of Terminal, a number that changes each time you quit and re-launch it. Let's add another stage to the pipe to return just the PID. We can use the command awk to do this. The command awk displays particular fields of each line it reads. In the following example, we display just field 1 ($1).

    $ ps -xc | grep "Terminal" | awk '{print $1}'
    476

Ordinarily, stderr output is not included in a pipeline. To pipe it too, first combine stderr with stdout by using 2>&1; then pipe as normal.
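
The effect of 2>&1 on a pipeline can be seen by counting the lines that reach the next command. The sketch below assumes an empty scratch directory, so that zzz is certain not to exist; the variable names plain and merged are illustrative.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# stderr bypasses the pipe: the error message is written to the screen
# and wc receives nothing to count.
plain=$(ls zzz | wc -l)

# 2>&1 merges stderr into stdout before the pipe, so the message
# travels down the pipeline and wc counts it.
merged=$(ls zzz 2>&1 | wc -l)

echo "lines piped without 2>&1:" $plain
echo "lines piped with 2>&1:" $merged
```

The first count should be 0 and the second 1, because the single-line error message reaches wc only when stderr has been merged into stdout.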

A Time to Kill

Finally, let's write a command line that will kill (abort) the Terminal. We use the command kill to achieve this. The command kill takes a PID as its argument, which we'll furnish using the pipeline we just built. (It outputs Terminal's PID, you'll recall.) We'll enclose the pipeline sequence in $(), which tells Bash to execute it, write the result back to the command line, and then execute the remainder of the command line.
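
The way $() writes a result back to the command line can be rehearsed safely on canned data before any real process is involved. In this sketch the variable sample holds a copy of the status line shown earlier (PID 476 is sample data, not a live process), and nothing is actually killed.

```shell
# A canned status line in the format printed by ps -xc.
sample='  476  ??  S      0:24.52 Terminal'

# The pipeline inside $() runs first; its output (the first field)
# is substituted into the assignment.
pid=$(printf '%s\n' "$sample" | grep "Terminal" | awk '{print $1}')

# Show the command line that would result, without running kill.
echo "would run: kill $pid"
```

Printing the command with echo first is a cheap way to check what a substitution expands to before committing to it.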

Before we do any actual killing, use echo to demonstrate that the expression enclosed by $() still outputs Terminal's PID.

    $ echo $(ps -xc | grep "Terminal" | awk '{print $1}')
    476

Now run kill.

    $ kill $(ps -xc | grep "Terminal" | awk '{print $1}')

(At this point, Terminal will vanish!)

When Terminal restarts, it will have a different PID, but the command above will still work. As an exercise, write a generic kill command that takes a name instead of a PID. The new command will take one argument, so a Bash function seems an obvious solution. If you would like to have a go at this, you can find all the required information in Project 4.

The pipelining example above shows how a task can be accomplished by combining a few Unix commands and even how you can create your own custom command. This nicely illustrates an important point. To achieve what we did took a lot of Unix knowledge and the ability to pick the correct commands and combine them in the right manner. You could read 500 pages on Unix and still not be able to craft the command line you require from a blank sheet. This is precisely why the majority of projects in this book are task focused. Pick a chapter and a project that looks closest to your needs, and you'll learn how to use Unix, not just about Unix.

Syntax Summary

Table 1.1 and Table 1.2 summarize Bash and Tcsh syntax for redirection and pipelining.