Project 6. Use Redirection and Pipelining

Redirection and pipelining are techniques for using Unix commands in combinations to perform a particular task. Many of the projects in this book use, and assume knowledge of, redirection and pipelining.

The concepts of redirection and pipelining lie at the core of Unix and take advantage of the central idea that set Unix apart from many other computer operating systems: all input and output (IO) is treated as a stream of (usually human-readable) characters. This is true whether the IO is file-based, console-based (Terminal screen and keyboard), or interprocess communication (communication between running commands). Other operating systems break this symmetry by doing file-based IO in blocks. In Unix, information can flow as it is generated. The symmetry among file, console, and interprocess IO makes it easy to plug any output stream into any input stream.

A Unix command that normally takes input from the keyboard and writes output to the screen, for example, can easily be made to read from and write to a file instead. This is termed redirection. A command can also be told to take its input from the output of a previous command, and in this way, many commands can be chained, each reading the output from its predecessor. This is termed pipelining. Pipelining enables a task to be performed by combining lots of small, specialized tools, each doing its own thing on the data stream. You don't have to rely on a few monolithic applications, hoping they do exactly what you require. Instead, you combine small tools to create your own customized application, tailored to do exactly as you require.

Much of the skill in getting the most from Unix lies not just in employing the right commands, but also in being able to combine them in the right manner to achieve your goal. This is the practical application of Unix and, in a nutshell, is what this book and most of its 101 projects are about.

Redirection and pipelining work without any special effort from the commands involved; the shell does all the work. A command sees three streams:

Standard in (stdin or stream 0): The source from which it reads input text.
Standard out (stdout or stream 1): The destination to which it writes output text.
Standard error (stderr or stream 2): The destination to which it writes error message text. The stderr stream exists because it's usually not desirable to mix error messages with normal output.

Learn More

Be sure to learn about shell scripting; it's another tool essential to getting the most from Unix. Refer to Projects 9 and 10, and the projects in Chapter 9.

The shell regards the keyboard as the stdin device and the Terminal screen as the stdout and stderr devices. Normally, these three streams are inherited by the command when it is launched by the shell. When redirection is applied, the shell opens the specified file(s) and reassigns them as stdin, stdout, and stderr as appropriate before executing the command. This process is transparent to the command, which does not treat the streams any differently.

When pipelining is applied, the shell opens a special stream called a pipe, assigning this as stdout of the first command and also as stdin of the second command. A pipe just channels its input to its output.

To use redirection and pipelining, all you need to understand is the shell's syntax. The syntax given here works in both the Bourne shell and Bash.

Redirection of stdout and stderr

We'll get started with redirection by having a command that normally writes to the screen send its output to a file instead. Let's redirect output from the command ls to the file list.txt and then display that file. To specify redirection of stdout to a file, simply use > filename.

$ ls > list.txt

The file list.txt will contain the text that would otherwise have been written to the Terminal screen.

Let's view it using the cat command.

Note

You might notice that the text looks different from ls output sent to the screen. The appearance is different because ls was aware of the redirection and, even though the principles of redirection don't require it, formatted its output differently for use in a text file.

$ cat list.txt
Desktop
...
Sites
list.txt
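The note about ls adapting its output raises the question of how a command can tell that its output has been redirected. One common way is the shell test -t 1, which succeeds only when stream 1 (stdout) is attached to a terminal. The following is a minimal sketch of that mechanism, not a description of how ls itself is implemented; the function name describe_stdout and the file name where.txt are made up for illustration.

```shell
# Hypothetical helper: report what stdout is connected to.
describe_stdout() {
    if [ -t 1 ]; then
        echo "stdout is a terminal"
    else
        echo "stdout is not a terminal"
    fi
}

describe_stdout               # typed at an interactive prompt, reports a terminal
describe_stdout > where.txt   # stdout is now a file, so the other branch runs
cat where.txt                 # shows: stdout is not a terminal
```

Most commands never make this check, which is why redirection normally works without any cooperation from them.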

Note

The stderr stream is stream 2; hence, 2> redirects stderr. Similarly, 1> redirects stdout; > is shorthand for 1>.

The next example demonstrates that error messages go to stderr and not to stdout, then illustrates redirection of stderr, using the syntax 2> filename.

We'll start by intentionally generating an error message, using the redirection sequence from the previous exercise to list the contents of a nonexistent directory.

$ ls zzz > list.txt
ls: zzz: No such file or directory
$ cat list.txt

Now add a 2 before the > symbol to redirect stderr to the list.txt file. You'll notice that nothing is written to the Terminal screen this time because we have sent stderr to a file.

$ ls zzz 2> list.txt

View the text file with cat to confirm that stderr output has been written to it.

$ cat list.txt
ls: zzz: No such file or directory

To redirect both stdout and stderr, just combine the two examples above.

$ ls zzz Sites >out.txt 2>error.txt

Sometimes, you may want to redirect both stdout and stderr to the same file. To accomplish this, first redirect stdout; then merge stderr (stream 2) into stdout (stream 1). The second line shown below is equivalent to the first but reverses the way stdout and stderr are merged.

$ ls zzz Sites >out.txt 2>&1
$ ls zzz Sites 2>out.txt 1>&2
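The order of the two redirections matters, because the shell processes them from left to right and 2>&1 makes stderr a copy of wherever stdout points at that moment. The sketch below shows the effect of getting the order wrong; the function both and the file names right.txt and wrong.txt are made up for illustration.

```shell
# Hypothetical helper: write one line to stdout and one line to stderr.
both() {
    echo "to stdout"
    echo "to stderr" >&2
}

# stdout is sent to the file first, then stderr joins it:
# both lines land in right.txt.
both > right.txt 2>&1

# stderr is pointed at the current stdout (the screen) before stdout
# is moved to the file, so only the stdout line lands in wrong.txt.
both 2>&1 > wrong.txt

cat right.txt
cat wrong.txt
```

If a file you expected to hold error messages turns out to hold only normal output, a reversed order like the second command is a likely cause.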

In Bash, but not Bourne, there's a shorter syntax:

$ ls zzz Sites &> out.txt

Tip

Throw away output or errors by redirecting to /dev/null. This is a special file akin to a bottomless pit.

$ cat janets-tax-return.txt > /dev/null
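Redirecting to /dev/null is most often used to silence one stream while keeping the other. Here is a small sketch, assuming a made-up function noisy that stands in for any command producing both output and warnings; kept.txt is likewise a hypothetical file name.

```shell
# Hypothetical helper: one line of real output, one warning on stderr.
noisy() {
    echo "result"
    echo "warning: something minor" >&2
}

noisy 2> /dev/null               # shows only "result"; the warning is discarded
noisy > /dev/null 2>&1           # shows nothing at all
noisy > kept.txt 2> /dev/null    # keeps the output in a file, drops the warning
cat kept.txt
```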

As you may have noticed when we redirected stderr to list.txt, use of the character > causes the specified file to be overwritten if it exists. Use a double arrow, >> or 2>>, if you want to add, or append, output to an existing text file instead.

$ ls > list.txt
$ echo "--------------" >> list.txt
$ ls Sites >> list.txt
$ cat list.txt
Desktop
...
Sites
list.txt
--------------
images
index.html

Try the following experiment. Issue the command cat with no file to read, and redirect stdout to letter.txt. The command cat will wait for you to type something at the keyboard; it assumes that you meant it to read from stdin because you did not specify an input file. When you've finished, type Control-d (Control-d means end of file), and display the file.

$ cat > letter.txt
Hello, I am writing...
<Control-d>
$ cat letter.txt
Hello, I am writing...

Redirection of stdin

The real value of using stdin lies in pipelining, described later in this chapter, but redirection of stdin is useful where a command takes no filenames, expecting to use the keyboard and the screen. A command that normally reads from the keyboard, for example, can be made to read from a file by means of redirection. One such command is tr, which translates one character to another. To see how this works, try this example, which changes every occurrence of the letter i to the letter u. First, let tr read from the keyboard and write to the screen.

Tip

You can't redirect a command's output back to the file it is reading, because the shell empties (truncates) the output file before the command starts to read it. The following trick, which uses a semicolon to separate two commands on a single line, produces that effect: The command before the semicolon redirects translated output from mac-file to a new file, tmp. When that command completes, the mv command renames tmp to mac-file, overwriting the original file with a translated replacement.

$ tr '\r' '\n' < mac-file > tmp; mv tmp mac-file

$ tr i u
big
bug
<Control-d>
$

A typical use for tr is to translate files that have Mac-style end-of-line characters (Return) into files that have Unix-style end-of-line characters (Newline). In this example, stdin is redirected so it's read from mac-file, and stdout is redirected so it's written to unix-file. The character sequence \r represents Return, and \n represents Newline. To specify redirection of stdin, simply use < filename.

$ tr '\r' '\n' < mac-file > unix-file
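To try the conversion without a real Mac-formatted file to hand, you can build a small one with printf. This is a minimal sketch; the file names mac-sample.txt and unix-sample.txt are made up for illustration.

```shell
# Build a three-line file whose lines end in Return (\r) instead of Newline.
printf 'one\rtwo\rthree\r' > mac-sample.txt

# Translate each Return into a Newline, reading stdin from one file
# and writing stdout to another.
tr '\r' '\n' < mac-sample.txt > unix-sample.txt

# Line-oriented commands now see three separate lines.
cat unix-sample.txt
wc -l < unix-sample.txt
```

Before the translation, wc -l would report zero lines for mac-sample.txt, because it counts Newline characters and the file contains none.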

What's the Difference?

What is the difference between

$ cat letter.txt

and

$ cat < letter.txt

To the user, there is no difference. In the first example, cat opens letter.txt and reads from it. In the second, the shell opens letter.txt, passing it as stdin to cat, while cat, seeing no input filename, reads stdin.
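The difference does become visible with a command that reports the name of the file it was given. A small sketch using wc -l, which counts lines (the file name sample-letter.txt is made up for illustration):

```shell
printf 'Hello,\nI am writing...\n' > sample-letter.txt

# wc opens the file itself, so it knows the name and prints it after the count.
wc -l sample-letter.txt

# The shell opens the file and hands it over as stdin; wc never sees
# a name, so it prints only the count.
wc -l < sample-letter.txt
```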

Pipelining

The concept of standard in and standard out streams lets us chain Unix commands, where a command takes its input from the output of the previous command. Such pipelines are set up by the shell, which arranges for the standard input of command #n to be taken from the standard output of command #(n-1).

Here's an example of pipelining that uses the commands ps and grep. The command ps displays the status of all processes running on your Mac, and grep picks out lines that match a regular expression.

First, try ps on its own. Options -cx tell ps to list the names of all commands you are running.

$ ps -cx
  PID  TT  STAT     TIME COMMAND
  196  ??  Ss    3:02.28 WindowServer
...
  389 std Ss    0:00.16 -bash
  478  p2 S     0:00.12 -bash

Learn More

Project 23 tells you more about using the command grep, and Project 39 covers the command ps.

Projects 60 and 62 tell you more about the command awk.

Now pipe the results to grep, which we'll tell to pick out lines containing "Terminal". This should leave just the status line for process Terminal.

$ ps -xc | grep "Terminal"
  476 ?? S      0:24.52 Terminal

The three-digit number at the start of the line is the process identification (PID) of Terminal, a number that changes each time you quit and re-launch it. Let's add another stage to the pipe to return just the PID. We can use the command awk to do this. The command awk displays particular fields of each line it reads. In the following example, we display just field 1 ($1).

$ ps -xc | grep "Terminal" | awk '{print $1}'
476

Ordinarily, stderr output is not included in a pipeline. To pipe it too, first combine stderr with stdout by using 2>&1; then pipe as normal.
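As a minimal sketch of the difference, the made-up function both below writes one line to each stream, and wc -l counts how many lines actually travel through the pipe.

```shell
# Hypothetical helper: one line to stdout, one line to stderr.
both() {
    echo "normal output"
    echo "error message" >&2
}

# Only stdout enters the pipe: wc counts 1 line, and the error
# message bypasses the pipe and appears on the screen.
both | wc -l

# stderr is merged into stdout before the pipe: wc counts 2 lines.
both 2>&1 | wc -l
```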

Note

Other shells, such as Tcsh, use the syntax `command` instead of $(command).

A Time to Kill

Finally, let's write a command line that will kill (abort) the Terminal. We use the command kill to achieve this. The command kill takes a PID as its argument, which we'll furnish using the pipeline we just built. (It outputs Terminal's PID, you'll recall.) We'll enclose the pipeline sequence in $(), which tells Bash to execute it, write the result back to the command line, and then execute the remainder of the command line.

Tip

Use the Unix manual to find out about the command killall.

Before we do any actual killing, use echo to demonstrate that the expression enclosed by $() still outputs Terminal's PID.

$ echo $(ps -xc | grep "Terminal" | awk '{print $1}')
476

Now run kill.

$ kill $(ps -xc | grep "Terminal" | awk '{print $1}')

(At this point, Terminal will vanish!)

When Terminal restarts, it will have a different PID, but the command above will still work. As an exercise, write a generic kill command that takes a name instead of a PID. The new command will take one argument, so a Bash function seems an obvious solution. If you would like to have a go at this, you can find all the required information in Project 4.
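One possible shape for such a function is sketched here; skip it if you would rather try the exercise first. It is an illustration under stated assumptions, not the book's solution: the names pid_of and killbyname are made up, the filter assumes the five-column ps -xc layout shown earlier, and it lets awk do the matching that grep did in the pipeline.

```shell
# Hypothetical helper: read a "ps -xc"-style listing on stdin and print the
# PID (field 1) of every row whose COMMAND column (field 5) is exactly the
# given name. Names that contain spaces are not handled by this simple test.
pid_of() {
    awk -v name="$1" '$5 == name {print $1}'
}

# Hypothetical function: kill every process of yours with the given name.
# If no process matches, kill is run without a PID and reports an error.
killbyname() {
    kill $(ps -xc | pid_of "$1")
}

# Try the filter on canned data rather than on live processes.
printf '  PID  TT  STAT     TIME COMMAND\n  476  ??  S     0:24.52 Terminal\n  478  p2  S     0:00.12 -bash\n' | pid_of Terminal
```

The last line prints 476. Matching the whole command field, rather than searching the line for a pattern, avoids picking up other processes whose names merely contain the text you asked for.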

The pipelining example above shows how a task can be accomplished by combining a few Unix commands and even how you can create your own custom command. This nicely illustrates an important point. To achieve what we did took a lot of Unix knowledge and the ability to pick the correct commands and combine them in the right manner. You could read 500 pages on Unix and still not be able to craft the command line you require from a blank sheet. This is precisely why the majority of projects in this book are task focused. Pick a chapter and a project that looks closest to your needs, and you'll learn how to use Unix, not just about Unix.

Syntax Summary

Table 1.1 and Table 1.2 summarize Bash and Tcsh syntax for redirection and pipelining.

Table 1.1. Bash Syntax for Redirection and Pipelining
Function	Syntax
redirect stdout	command > filename
redirect stderr	command 2> filename
redirect stdout appending	command >> filename
redirect stderr appending	command 2>> filename
redirect both to the same file	command &> filename
redirect both to different files	command > outfile 2> errorfile
redirect stdin	command < filename
pipe stdout	command1 | command2
pipe both	command1 2>&1 | command2

Table 1.2. Tcsh Syntax for Redirection and Pipelining
Function	Syntax
redirect stdout	command > filename
redirect stderr	(command > /dev/tty) >& filename
redirect stdout appending	command >> filename
redirect stderr appending	(command > /dev/tty) >>& filename
redirect both to the same file	command >& filename
redirect both to different files	(command > outfile) >& errorfile
redirect stdin	command < filename
pipe stdout	command1 | command2
pipe both	command1 |& command2

Just for Fun

Create a trick file.

$ cat haha.txt 2> tmp; mv tmp haha.txt
$ cat haha.txt
cat: haha.txt: No such file or directory