7.1. I/O Redirectors

In Chapter 1, you learned about the shell's basic I/O redirectors: >, <, and |. Although these are enough to get you through 95% of your UNIX life, you should know that bash supports many other redirectors. Table 7-1 lists them, including the three we've already seen. Although some of the rest are broadly useful, others are mainly for systems programmers.

Table 7-1. I/O redirectors
Redirector	Function
cmd1 | cmd2	Pipe; take standard output of cmd1 as standard input to cmd2.
> file	Direct standard output to file.
< file	Take standard input from file.
>> file	Direct standard output to file; append to file if it already exists.
>| file	Force standard output to file even if noclobber is set.
n>| file	Force output to file from file descriptor n even if noclobber is set.
<> file	Use file as both standard input and standard output.
n<> file	Use file as both input and output for file descriptor n.
<< label	Here-document; see text.
n> file	Direct file descriptor n to file.
n< file	Take file descriptor n from file.
n>> file	Direct file descriptor n to file; append to file if it already exists.
>&n	Duplicate standard output to file descriptor n.
<&n	Duplicate standard input from file descriptor n.
n>&m	File descriptor n is made to be a copy of the output file descriptor m.
n<&m	File descriptor n is made to be a copy of the input file descriptor m.
&>file	Directs standard output and standard error to file.
<&-	Close the standard input.
>&-	Close the standard output.
n>&-	Close the output from file descriptor n.
n<&-	Close the input from file descriptor n.
n>&word	If n is not specified, the standard output (file descriptor 1) is used. If the digits in word do not specify a file descriptor open for output, a redirection error occurs. As a special case, if n is omitted, and word does not expand to one or more digits, the standard output and standard error are redirected as described previously.
n<&word	If word expands to one or more digits, the file descriptor denoted by n is made to be a copy of that file descriptor. If the digits in word do not specify a file descriptor open for input, a redirection error occurs. If word evaluates to -, file descriptor n is closed. If n is not specified, the standard input (file descriptor 0) is used.
n>&digit-	Moves the file descriptor digit to file descriptor n, or the standard output (file descriptor 1) if n is not specified. digit is closed after being duplicated to n.
n<&digit-	Moves the file descriptor digit to file descriptor n, or the standard input (file descriptor 0) if n is not specified. digit is closed after being duplicated to n.

Notice that some of the redirectors in Table 7-1 contain a digit n, and that their descriptions contain the term file descriptor; we'll cover that in a little while.

The first two new redirectors, >> and >|, are simple variations on the standard output redirector >. The >> appends to the output file (instead of overwriting it) if it already exists; otherwise it acts exactly like >. A common use of >> is for adding a line to an initialization file (such as .bashrc or .mailrc) when you don't want to bother with a text editor. For example:

$ cat >> .bashrc
alias cdmnt='mount -t iso9660 /dev/sbpcd /cdrom'
^D

As we saw in Chapter 1, cat without an argument uses standard input as its input. This allows you to type the input and end it with CTRL-D on its own line. The alias line will be appended to the file .bashrc if it already exists; if it doesn't, the file is created with that one line.
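The create-or-append behavior of >> can be sketched non-interactively as follows. The scratch file name and the two aliases are hypothetical, chosen only so the example does not touch a real .bashrc.

```bash
#!/bin/bash
# Hypothetical scratch location; a real use would target ~/.bashrc.
rc_file="$(mktemp -d)/demo_rc"

# The file does not exist yet, so >> creates it.
echo "alias ll='ls -l'" >> "$rc_file"

# The file now exists, so >> appends instead of overwriting.
echo "alias la='ls -a'" >> "$rc_file"

# Count the lines to confirm that both survived.
line_count=$(grep -c '' "$rc_file")
echo "lines in file: $line_count"
```

Had the second command used > instead of >>, the file would hold only the second alias.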

Recall from Chapter 3 that you can prevent the shell from overwriting a file with > file by typing set -o noclobber. The >| redirector overrides noclobber; it's the "Do it anyway, dammit!" redirector.
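A minimal sketch of how noclobber and >| interact is shown here. The scratch file and the variable names are assumptions made for illustration.

```bash
#!/bin/bash
# Hypothetical scratch file; any writable path would do.
demo_dir=$(mktemp -d)
target="$demo_dir/out.txt"

echo "first" > "$target"

set -o noclobber

# With noclobber set, plain > refuses to overwrite an existing file.
# The braces let us discard the shell's own error message.
if ! { echo "second" > "$target"; } 2>/dev/null; then
    clobber_blocked=yes
else
    clobber_blocked=no
fi

# >| overrides noclobber and overwrites the file anyway.
echo "third" >| "$target"

set +o noclobber

final_contents=$(cat "$target")
echo "blocked=$clobber_blocked contents=$final_contents"
```

The file never contains "second": the first overwrite attempt fails, and only the >| form gets through.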

The redirector <> is mainly meant for use with device files (in the /dev directory), i.e., files that correspond to hardware devices such as terminals and communication lines. Low-level systems programmers can use it to test device drivers; otherwise, it's not very useful.

The rest of the redirectors will only be useful in special situations and you are unlikely to need them most of the time.

7.1.1. Here-documents

The << label redirector essentially forces the input to a command to be the shell's standard input, which is read until there is a line that contains only label. The input in between is called a here-document. Here-documents aren't very interesting when used from the command prompt; there, one behaves just like the normal use of standard input, except for the label. For example, we could use a here-document to simulate the mail facility: when you send a message to someone with the mail utility, you end the message with a dot (.). Here, the body of the message is saved in a file, msgfile:

$ cat >> msgfile << .
> this is the text of
> our message.
> .

Here-documents are meant to be used from within shell scripts; they let you specify "batch" input to programs. A common use of here-documents is with simple text editors like ed. Task 7-1 is a programming task that uses a here-document in this way.

Task 7-1

The s file command in mail saves the current message in file. If the message came over a network (such as the Internet), then it has several header lines prepended that give information about network routing. Write a shell script that deletes the header lines from the file.

We can use ed to delete the header lines. To do this, we need to know something about the syntax of mail messages; specifically, that there is always a blank line between the header lines and the message text. The ed command 1,/^[ ]*$/d does the trick: it means, "Delete from line 1 until the first blank line." We also need the ed commands w (write the changed file) and q (quit). Here is the code that solves the task:

ed "$1" << EOF
1,/^[ ]*$/d
w
q
EOF

The shell does parameter (variable) substitution and command substitution on text in a here-document, meaning that you can use shell variables and commands to customize the text. A good example of this is the bashbug script, which sends a bug report to the bash maintainer (see Chapter 11). Here is a stripped-down version:

MACHINE="i586"
OS="linux-gnu"
CC="gcc"
CFLAGS=" -DPROGRAM='bash' -DHOSTTYPE='i586' -DOSTYPE='linux-gnu' \
    -DMACHTYPE='i586-pc-linux-gnu' -DSHELL -DHAVE_CONFIG_H   -I. \
    -I. -I./lib -g -O2"
RELEASE="2.01"
PATCHLEVEL="0"
RELSTATUS="release"
MACHTYPE="i586-pc-linux-gnu"

TEMP=/tmp/bbug.$$

case "$RELSTATUS" in
alpha*|beta*)   BUGBASH=chet@po.cwru.edu ;;
*)              BUGBASH=bug-bash@prep.ai.mit.edu ;;
esac

BUGADDR="${1-$BUGBASH}"

UN=
if (uname) >/dev/null 2>&1; then
        UN=`uname -a`
fi

cat > $TEMP <<EOF
From: ${USER}
To: ${BUGADDR}
Subject: [50 character or so descriptive subject here (for reference)]

Configuration Information [Automatically generated, do not change]:
Machine: $MACHINE
OS: $OS
Compiler: $CC
Compilation CFLAGS: $CFLAGS
uname output: $UN
Machine Type: $MACHTYPE

bash Version: $RELEASE
Patch Level: $PATCHLEVEL
Release Status: $RELSTATUS

Description:
        [Detailed description of the problem, suggestion, or complaint.]

Repeat-By:
        [Describe the sequence of events that causes the problem
        to occur.]

Fix:
        [Description of how to fix the problem.  If you don't know a
        fix for the problem, don't include this section.]
EOF

vi $TEMP

mail $BUGADDR < $TEMP

The first eight lines are generated when bashbug is installed. The shell will then substitute the appropriate values for the variables in the text whenever the script is run.

The redirector << has two variations. First, you can prevent the shell from doing parameter and command substitution by surrounding the label in single or double quotes. In the above example, if you used the line cat > $TEMP <<'EOF', then text like $USER and $MACHINE would remain untouched (defeating the purpose of this particular script).
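The difference between a plain label and a quoted one can be sketched as follows. The variable user_name and the path /tmp/report.txt are hypothetical and exist only to give the shell something to substitute.

```bash
#!/bin/bash
user_name="alice"   # hypothetical value for illustration

# Unquoted label: parameter and command substitution are performed.
expanded=$(cat <<EOF
User: $user_name, file: $(basename /tmp/report.txt)
EOF
)

# Quoted label: the text is passed through untouched.
literal=$(cat <<'EOF'
User: $user_name, file: $(basename /tmp/report.txt)
EOF
)

echo "$expanded"
echo "$literal"
```

The first echo prints the substituted line; the second prints the dollar signs and parentheses exactly as typed.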

The second variation is <<-, which deletes leading TABs (but not blanks) from the here-document and the label line. This allows you to indent the here-document's text, making the shell script more readable:

cat > $TEMP <<-EOF
	From: ${USER}
	To: ${BUGADDR}
	Subject: [50 character or so descriptive subject here]

	Configuration Information [Automatically generated,
		do not change]:
	Machine: $MACHINE
	OS: $OS
	Compiler: $CC
	Compilation CFLAGS: $CFLAGS
	...
EOF

Make sure you are careful when choosing your label so that it doesn't appear as an actual input line.

A slight variation on this is provided by the here string. It takes the form <<<word; the word is expanded and supplied on the standard input.
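As a small illustration of the here string, the sketch below feeds a single expanded word to two commands. The variable names and the sample text are assumptions.

```bash
#!/bin/bash
record="alpha beta gamma"   # hypothetical sample data

# read takes the here string as one line of standard input
# and splits it into fields.
read -r first rest <<< "$record"

# Any command that reads standard input can take a here string.
upper=$(tr '[:lower:]' '[:upper:]' <<< "$first")

echo "first=$first rest=$rest upper=$upper"
```

This saves writing a one-line here-document, or an echo piped into the command, when the input is a single word or variable.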

7.1.2. File Descriptors

The next few redirectors in Table 7-1 depend on the notion of a file descriptor. Like the device files used with <>, this is a low-level UNIX I/O concept that is of interest only to systems programmers, and then only occasionally. You can get by with a few basic facts about them; for the whole story, look at the entries for read(), write(), fcntl(), and others in Section 2 of the UNIX manual. You might also wish to refer to UNIX Power Tools by Shelley Powers, Jerry Peek, Tim O'Reilly, and Mike Loukides (O'Reilly).

File descriptors are integers starting at 0 that refer to particular streams of data associated with a process. When a process starts, it usually has three file descriptors open. These correspond to the three standard I/O streams: standard input (file descriptor 0), standard output (1), and standard error (2). If a process opens additional files for input or output, they are assigned to the next available file descriptors, starting with 3.
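Within a script, you can open a descriptor above 2 yourself by giving exec a redirection and no command. The following is a minimal sketch, assuming a hypothetical scratch log file; it uses the n> file, >&n, and n>&- forms from Table 7-1.

```bash
#!/bin/bash
log_file="$(mktemp -d)/fd_demo.log"   # hypothetical scratch file

# Open file descriptor 3 for writing to the log file.
exec 3> "$log_file"

# >&3 makes each command's standard output a copy of descriptor 3.
echo "written through fd 3" >&3
echo "second line" >&3

# Close descriptor 3 when finished.
exec 3>&-

cat "$log_file"
```

Because descriptor 3 stays open between the two echo commands, the second line follows the first in the file rather than overwriting it.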

By far the most common use of file descriptors with bash is in saving standard error in a file. For example, if you want to save the error messages from a long job in a file so that they don't scroll off the screen, append 2> file to your command. If you also want to save standard output, append > file1 2> file2.
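A short sketch of sending the two streams to separate files follows. The function long_job is a hypothetical stand-in for a real command, and the log file names are likewise assumptions.

```bash
#!/bin/bash
work_dir=$(mktemp -d)

# Hypothetical stand-in for a long job: one line on each stream.
long_job() {
    echo "progress: step 1 done"
    echo "warning: disk nearly full" >&2
}

# Standard output goes to out.log, standard error to err.log.
long_job > "$work_dir/out.log" 2> "$work_dir/err.log"

out_text=$(cat "$work_dir/out.log")
err_text=$(cat "$work_dir/err.log")
echo "stdout file: $out_text"
echo "stderr file: $err_text"
```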

This leads to another programming task.

Task 7-2

You want to start a long job in the background (so that your terminal is freed up) and save both standard output and standard error in a single log file. Write a script that does this.

We'll call this script start. The code is very terse:

"$@" > logfile 2>&1 &

This line executes whatever command and parameters follow start. (The command cannot contain pipes or output redirectors.) It sends the command's standard output to logfile.

Then, the redirector 2>&1 says, "send standard error (file descriptor 2) to the same place as standard output (file descriptor 1)." Since standard output is redirected to logfile, standard error will go there too. The final & puts the job in the background so that you get your shell prompt back.
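The order of the two redirectors matters, because the shell processes them from left to right and 2>&1 copies wherever standard output points at that moment. The sketch below illustrates this with a hypothetical function, noisy, that writes one line to each stream.

```bash
#!/bin/bash
work_dir=$(mktemp -d)

# Hypothetical command that writes to both streams.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Correct order: standard output is sent to the file first,
# then standard error becomes a copy of it.
noisy > "$work_dir/both.log" 2>&1

# Reversed order: standard error copies the ORIGINAL standard output
# (captured here by the command substitution), and only afterward is
# standard output sent to the file.
leaked=$(noisy 2>&1 > "$work_dir/only_out.log")

echo "lines in both.log: $(grep -c '' "$work_dir/both.log")"
echo "escaped the file: $leaked"
```

With the reversed order, the error message never reaches the file, which is why the start script puts > logfile before 2>&1.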

As a small variation on this theme, we can send both standard output and standard error into a pipe instead of a file: command 2>&1 | ... does this. (Make sure you understand why.) Here is a script that sends both standard output and standard error to the logfile (as above) and to the terminal:

"$@" 2>&1 | tee logfile &

The command tee takes its standard input and copies it to standard output and the file given as argument.
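The pipe form works because the shell connects standard output to the pipe before it processes 2>&1, so standard error is copied into the pipe as well. Here is a foreground sketch (without the trailing &), in which a hypothetical function noisy stands in for the job and a command substitution stands in for the terminal.

```bash
#!/bin/bash
work_dir=$(mktemp -d)

# Hypothetical command that writes to both streams.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Both streams enter the pipe; tee copies them to logfile and to its
# own standard output, which is captured here instead of shown.
seen_on_terminal=$(noisy 2>&1 | tee "$work_dir/logfile")

echo "$seen_on_terminal"
```

The captured text and the contents of logfile are identical, each holding both lines.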

These scripts have one shortcoming: you must remain logged in until the job completes. Although you can always type jobs (see Chapter 1) to check on progress, you can't leave your terminal until the job finishes, unless you want to risk a breach of security.^[1] We'll see how to solve this problem in the next chapter.

^[1] Don't put it past people to come up to your unattended terminal and cause mischief!

The other file-descriptor-oriented redirectors (e.g., <&n) are usually used for reading input from (or writing output to) more than one file at the same time. We'll see an example later in this chapter. Otherwise, they're mainly meant for systems programmers, as are <&- (force standard input to close) and >&- (force standard output to close).
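As a rough illustration of reading from more than one file at the same time, the sketch below opens two input files on descriptors 3 and 4 and reads them in step. The file names and their contents are hypothetical.

```bash
#!/bin/bash
work_dir=$(mktemp -d)

# Hypothetical input files with one record per line.
printf '%s\n' alice bob > "$work_dir/names"
printf '%s\n' 30 25 > "$work_dir/ages"

# Open each file for input on its own descriptor.
exec 3< "$work_dir/names"
exec 4< "$work_dir/ages"

# <&3 and <&4 make read's standard input a copy of each descriptor,
# so every pass consumes one line from each file.
paired=""
while read -r name <&3 && read -r age <&4; do
    paired="$paired$name=$age "
done

# Close both descriptors.
exec 3<&-
exec 4<&-

echo "$paired"
```

Redirecting the whole loop with a single < file would not work here, since both read commands would then draw from the same file.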

Before we leave this topic, we should just note that 1> is the same as >, and 0< is the same as <. If you understand this, then you probably know all you need to know about file descriptors.
