Builtin Commands


Builtin commands were introduced in Chapter 5. Commands that are built into a shell do not fork a new process when you execute them. This section discusses the type, read, exec, trap, kill, and getopts builtins and concludes with Table 11-6 on page 500, which lists many bash builtins. See Table 9-10 on page 377 for a list of tcsh builtins.

`type`: Displays Information About a Command

The type builtin (use which under tcsh) provides information about a command:

 $ type cat echo who if lt
 cat is hashed (/bin/cat)
 echo is a shell builtin
 who is /usr/bin/who
 if is a shell keyword
 lt is aliased to 'ls -ltrh | tail'

The preceding output shows the files that would be executed if you gave cat or who as a command. Because cat has already been called from the current shell, it is in the hash table (page 878) and type reports that cat is hashed. The output also shows that a call to echo runs the echo builtin, if is a keyword, and lt is an alias.
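In a script it is often handier to get only the kind of command than the full description. The following is a minimal sketch, assuming bash, whose type builtin accepts a -t option that prints a single word (alias, keyword, function, builtin, or file). The classify function name is made up for this illustration.

```shell
#!/bin/bash
# classify is a hypothetical helper, not a standard command.
# It prints the single word that "type -t" reports for a name,
# or "not found" when bash cannot resolve the name at all.
classify() {
    local kind
    kind=$(type -t "$1") || kind="not found"
    echo "$1: $kind"
}

classify echo     # echo is a shell builtin
classify if       # if is a shell keyword
classify cat      # normally a file found through PATH
```

Because type -t returns a nonzero exit status for an unknown name, the same call can be used as a test before a script tries to run an optional utility.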

`read`: Accepts User Input

When you begin writing shell scripts, you soon realize that one of the most common tasks for user-created variables is storing information a user enters in response to a prompt. Using read, scripts can accept input from the user and store that input in variables. See page 361 for information about reading user input under tcsh. The read builtin reads one line from standard input and assigns the words on the line to one or more variables:

 $ cat read1
 echo -n "Go ahead: "
 read firstline
 echo "You entered: $firstline"
 $ read1
 Go ahead: This is a line.
 You entered: This is a line.

The first line of the read1 script uses echo to prompt you to enter a line of text. The -n option suppresses the following NEWLINE, allowing you to enter a line of text on the same line as the prompt. The second line reads the text into the variable firstline. The third line verifies the action of read by displaying the value of firstline. The variable is quoted (along with the text string) in this example because you, as the script writer, cannot anticipate which characters the user might enter in response to the prompt. Consider what would happen if the variable were not quoted and the user entered * in response to the prompt:

 $ cat read1_no_quote
 echo -n "Go ahead: "
 read firstline
 echo You entered: $firstline
 $ read1_no_quote
 Go ahead: *
 You entered: read1 read1_no_quote script.1
 $ ls
 read1   read1_no_quote    script.1

The ls command lists the same words as the script, demonstrating that the shell expands the asterisk into a list of files in the working directory. When the variable $firstline is surrounded by double quotation marks, the shell does not expand the asterisk. Thus the read1 script behaves correctly:

 $ read1
 Go ahead: *
 You entered: *

If you want the shell to interpret the special meanings of special characters, do not use quotation marks.
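The difference is easy to reproduce without typing at a prompt. The following sketch assumes bash (for the <<< here-string) and the mktemp utility; the scratch directory and the filenames in it are arbitrary, and the -r option is added only to keep read from treating backslashes specially.

```shell
#!/bin/bash
# Work in a throwaway directory so the pattern has known filenames to match.
scratch=$(mktemp -d) || exit 1
touch "$scratch/read1" "$scratch/script.1"

# Each command substitution runs in a subshell, so the cd does not
# change the working directory of the calling shell.
unquoted=$(cd "$scratch" && read -r line <<< '*' && echo $line)
quoted=$(cd "$scratch" && read -r line <<< '*' && echo "$line")

echo "unquoted: $unquoted"   # the shell expanded * into filenames
echo "quoted:   $quoted"     # the * survives as typed

rm -r "$scratch"
```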

REPLY

The read builtin has features that can make it easier to use. When you do not specify a variable to receive read's input, bash puts the input into the variable named REPLY. You can use the -p option to prompt the user instead of using a separate echo command. The following read1a script performs exactly the same task as read1:

 $ cat read1a
 read -p "Go ahead: "
 echo "You entered: $REPLY"

The read2 script prompts for a command line and reads the user's response into the variable cmd. The script then attempts to execute the command line that results from the expansion of the cmd variable:

 $ cat read2
 read -p "Enter a command: " cmd
 $cmd
 echo "Thanks"

In the following example, read2 reads a command line that calls the echo builtin. The shell executes the command and then displays Thanks. Next read2 reads a command line that executes the who utility:

 $ read2
 Enter a command: echo Please display this message.
 Please display this message.
 Thanks
 $ read2
 Enter a command: who
 alex      pts/4         Jun 17 07:50  (:0.0)
 scott     pts/12        Jun 17 11:54  (bravo.example.com)
 Thanks

If cmd does not expand into a valid command line, the shell issues an error message:

 $ read2
 Enter a command: xxx
 ./read2: line 2: xxx: command not found
 Thanks

The read3 script reads values into three variables. The read builtin assigns one word (a sequence of nonblank characters) to each variable:

 $ cat read3
 read -p "Enter something: " word1 word2 word3
 echo "Word 1 is: $word1"
 echo "Word 2 is: $word2"
 echo "Word 3 is: $word3"
 $ read3
 Enter something: this is something
 Word 1 is: this
 Word 2 is: is
 Word 3 is: something

When you enter more words than read has variables, read assigns one word to each variable, with all leftover words going to the last variable. Both read1 and read2 assigned the first word and all leftover words to the one variable they each had to work with. In the following example, read accepts five words into three variables, assigning the first word to the first variable, the second word to the second variable, and the third through fifth words to the third variable:

 $ read3
 Enter something: this is something else, really.
 Word 1 is:  this
 Word 2 is:  is
 Word 3 is:  something else, really.

Table 11-4 lists some of the options supported by the read builtin.

Table 11-4. read options
Option	Function
-a aname (array)	Assigns each word of input to an element of array aname.
-d delim (delimiter)	Uses delim to terminate the input instead of NEWLINE.
-e (Readline)	If input is coming from a keyboard, uses the Readline Library (page 305) to get input.
-n num (number of characters)	Reads num characters and returns. As soon as the user types num characters, `read` returns; there is no need to press RETURN.
-p prompt (prompt)	Displays prompt on standard error without a terminating `NEWLINE` before reading input. Displays prompt only when input comes from the keyboard.
-s (silent)	Does not echo characters.
-un (file descriptor)	Uses the integer n as the file descriptor that `read` takes its input from. The command read -u4 arg1 arg2 is equivalent to read arg1 arg2 <&4. See "File Descriptors" (page 470) for a discussion of redirection and file descriptors.
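Several of these options can be tried without a keyboard by supplying the input from a here-string. The following is a minimal sketch, assuming bash; the variable names and sample strings are arbitrary, and -r is added only so that read leaves backslashes alone.

```shell
#!/bin/bash
# -a: split one line into the elements of an array.
read -r -a parts <<< "alpha beta gamma"
echo "${#parts[@]} elements, second is ${parts[1]}"

# -d: stop at the first colon instead of at NEWLINE.
read -r -d : field <<< "root:x:0:0"
echo "field=$field"

# -n: return after three characters without waiting for RETURN.
read -r -n 3 prefix <<< "abcdef"
echo "prefix=$prefix"
```

The -s option is left out of the sketch because its effect (not echoing what is typed) is visible only when input comes from a terminal.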

The read builtin returns an exit status of 0 if it successfully reads any data. It has a nonzero exit status when it reaches the EOF (end of file). The following example runs a while loop from the command line. It takes its input from the names file and terminates after reading the last line from names.

 $ cat names
 Alice Jones
 Robert Smith
 Alice Paulson
 John Q. Public
 $ while read first rest
 > do
 > echo $rest, $first
 > done < names
 Jones, Alice
 Smith, Robert
 Paulson, Alice
 Q. Public, John
 $

The placement of the redirection symbol (<) for the while structure is critical. It is important that you place the redirection symbol at the done statement and not at the call to read.
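To see why the placement matters, compare the two loops in the following sketch. It assumes bash and mktemp, builds its own small names file, and caps the second loop with a counter because that loop would otherwise never end.

```shell
#!/bin/bash
names=$(mktemp) || exit 1
printf '%s\n' "Alice Jones" "Robert Smith" "Alice Paulson" > "$names"

# Redirection on done: the file is opened once for the whole loop.
at_done=0
while read -r first rest
do
    at_done=$((at_done + 1))
done < "$names"

# Redirection on read: the file is reopened on every pass, so read
# sees the first line each time. The counter cap keeps this finite.
at_read=0
while read -r first rest < "$names"
do
    at_read=$((at_read + 1))
    [ "$at_read" -ge 5 ] && break
done

echo "redirect on done: $at_done passes"
echo "redirect on read: stopped after $at_read passes, first is still $first"
rm -f "$names"
```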

optional

Each time you redirect input, the shell opens the input file and repositions the read pointer at the start of the file:

 $ read line1 < names; echo $line1; read line2 < names; echo $line2
 Alice Jones
 Alice Jones

Here each read opens names and starts at the beginning of the names file. In the following example, names is opened once, as standard input of the subshell created by the parentheses. Each read then reads successive lines of standard input.

 $ (read line1; echo $line1; read line2; echo $line2) < names
 Alice Jones
 Robert Smith

Another way to get the same effect is to open the input file with exec and hold it open (refer to "File Descriptors" on page 470):

 $ exec 3< names
 $ read -u3 line1; echo $line1; read -u3 line2; echo $line2
 Alice Jones
 Robert Smith
 $ exec 3<&-
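Holding files open on their own descriptors also lets a loop read from two files in step, which a single redirection of standard input cannot do. The following is a sketch under stated assumptions: bash, mktemp, two made-up input files, and the arbitrary choice of descriptors 3 and 4.

```shell
#!/bin/bash
first_names=$(mktemp) || exit 1
last_names=$(mktemp) || exit 1
printf '%s\n' Alice Robert > "$first_names"
printf '%s\n' Jones Smith  > "$last_names"

# Open each file once and keep it open on its own descriptor.
exec 3< "$first_names" 4< "$last_names"

paired=""
while read -r -u3 first && read -r -u4 last
do
    paired="$paired$last, $first;"
done

exec 3<&- 4<&-        # close both descriptors when done
echo "$paired"
rm -f "$first_names" "$last_names"
```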

`exec`: Executes a Command

The exec builtin (not available in tcsh) has two primary purposes: to run a command without creating a new process and to redirect a file descriptor (including standard input, output, or error) of a shell script from within the script (page 470). When the shell executes a command that is not built into the shell, it typically creates a new process. The new process inherits environment (global or exported) variables from its parent but does not inherit variables that are not exported by the parent. (For more information refer to "Locality of Variables" on page 475.) In contrast, exec executes a command in place of (overlays) the current process.

`exec` versus . (dot)

Insofar as exec runs a command in the environment of the original process, it is similar to the . (dot) command (page 259). However, unlike the . command, which can run only shell scripts, exec can run both scripts and compiled programs. Also, whereas the . command returns control to the original script when it finishes running, exec does not. Finally, the . command gives the new program access to local variables, whereas exec does not.

`exec` runs a command

The exec builtin used for running a command has the following syntax:

 exec command arguments

`exec` does not return control

Because the shell does not create a new process when you use exec, the command runs more quickly. However, because exec does not return control to the original program, it can be used only as the last command that you want to run in a script. The following script shows that control is not returned to the script:

 $ cat exec_demo
 who
 exec date
 echo "This line is never displayed."
 $ exec_demo
 jenny    pts/7    May 30  7:05 (bravo.example.com)
 hls      pts/1    May 30  6:59 (:0.0)
 Mon May 30 11:42:56 PDT 2005
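You can observe the same behavior from a single script by confining exec to a subshell. In the following sketch, which assumes bash and an echo utility somewhere in PATH, the command substitution creates the subshell; exec replaces only that subshell, so the calling shell survives to display what was captured.

```shell
#!/bin/bash
# exec overlays the subshell started by $( ), not the calling shell.
output=$(
    echo "before exec"
    exec echo "run by exec"
    echo "This line is never displayed."
)
echo "$output"
```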

The next example, a modified version of the out script (page 442), uses exec to execute the final command the script runs. Because out runs either cat or less and then terminates, the new version, named out2, uses exec with both cat and less:

 $ cat out2
 if [ $# -eq 0 ]
     then
         echo "Usage: out2 [-v] filenames" 1>&2
         exit 1
 fi
 if [ "$1" = "-v" ]
     then
         shift
         exec less "$@"
     else
         exec cat -- "$@"
 fi

`exec` redirects input and output

The second major use of exec is to redirect a file descriptor including standard input, output, or error from within a script. The next command causes all subsequent input to a script that would have come from standard input to come from the file named infile:

 exec < infile

Similarly the following command redirects standard output and standard error to outfile and errfile, respectively:

 exec > outfile 2> errfile

When you use exec in this manner, the current process is not replaced with a new process, and exec can be followed by other commands in the script.

/dev/tty

When you redirect the output from a script to a file, you must make sure that the user sees any prompts the script displays. The /dev/tty device is a pseudonym for the screen the user is working on; you can use this device to refer to the user's screen without knowing which device it is. (The tty utility displays the name of the device you are using.) By redirecting the output from a script to /dev/tty, you ensure that prompts and messages go to the user's terminal, regardless of which terminal the user is logged in on. Messages sent to /dev/tty are also not diverted if standard output and standard error from the script are redirected.

The to_screen1 script sends output to three places: standard output, standard error, and the user's screen. When it is run with standard output and standard error redirected, to_screen1 still displays the message sent to /dev/tty on the user's screen. The out and err files hold the output sent to standard output and standard error.

 $ cat to_screen1
 echo "message to standard output"
 echo "message to standard error" 1>&2
 echo "message to the user" > /dev/tty
 $ to_screen1 > out 2> err
 message to the user
 $ cat out
 message to standard output
 $ cat err
 message to standard error

The following command redirects the output from a script to the user's screen:

 exec > /dev/tty

Putting this command at the beginning of the previous script changes where the output goes. In to_screen2, exec redirects standard output to the user's screen so the > /dev/tty is superfluous. Following the exec command, all output sent to standard output goes to /dev/tty (the screen). Output to standard error is not affected.

 $ cat to_screen2
 exec > /dev/tty
 echo "message to standard output"
 echo "message to standard error" 1>&2
 echo "message to the user" > /dev/tty
 $ to_screen2 > out 2> err
 message to standard output
 message to the user

One disadvantage of using exec to redirect the output to /dev/tty is that all subsequent output is redirected unless you use exec again in the script.
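One common way to undo such a redirection is to save the original standard output on a spare file descriptor before redirecting it and to copy it back afterward. The following is a minimal sketch, assuming bash and mktemp; it redirects to a temporary file rather than to /dev/tty so that it can run without a terminal, and the use of descriptor 3 is an arbitrary choice.

```shell
#!/bin/bash
log=$(mktemp) || exit 1

exec 3>&1            # save the current standard output on descriptor 3
exec > "$log"        # from here on, standard output goes to the file
echo "captured in the file"
exec 1>&3 3>&-       # restore standard output and close descriptor 3

echo "back on the original standard output"
captured=$(cat "$log")
rm -f "$log"
```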

You can also redirect the input to read (standard input) so that it comes from /dev/tty (the keyboard):

 read name < /dev/tty

 exec < /dev/tty

`trap`: Catches a Signal

A signal is a report to a process about a condition. Linux uses signals to report interrupts generated by the user (for example, pressing the interrupt key) as well as bad system calls, broken pipes, illegal instructions, and other conditions. The trap builtin (tcsh uses onintr) catches, or traps, one or more signals, allowing you to direct the actions a script takes when it receives a specified signal.

This discussion covers six signals that are significant when you work with shell scripts. Table 11-5 lists these signals, the signal numbers that systems often ascribe to them, and the conditions that usually generate each signal. Give the command kill -l, trap -l, or man 7 signal for a list of signal names.

When it traps a signal, a script takes whatever action you specify: It can remove files or finish any other processing as needed, display a message, terminate execution immediately, or ignore the signal. If you do not use trap in a script, any of the six actual signals listed in Table 11-5 (not EXIT, DEBUG, or ERR) terminates the script. Because a process cannot trap a KILL signal, you can use kill -KILL (or kill -9) as a last resort to terminate a script or any other process. (See page 497 for more information on kill.)

The trap command has the following syntax:

 trap ['commands'] [signal]

The optional commands part specifies the commands that the shell executes when it catches one of the signals specified by signal. The signal can be a signal name or number (for example, INT or 2). If commands is not present, trap resets the trap to its initial condition, which is usually to exit from the script.

The trap builtin does not require single quotation marks around commands as shown in the preceding syntax, but it is a good practice to use them. The single quotation marks cause shell variables within the commands to be expanded when the signal occurs, not when the shell evaluates the arguments to trap. Even if you do not use any shell variables in the commands, you need to enclose any command that takes arguments within either single or double quotation marks. Quoting the commands causes the shell to pass to trap the entire command as a single argument.
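The effect of the two kinds of quotation marks shows up when a variable changes after the trap is set. The following sketch assumes bash and uses the EXIT condition so that no real signal is needed; the function and variable names are made up. Each function body is enclosed in parentheses, so it runs in a subshell and its trap fires when that subshell finishes.

```shell
#!/bin/bash
double_quoted() (
    msg=before
    trap "echo $msg" EXIT    # $msg is expanded now, while it holds "before"
    msg=after
)

single_quoted() (
    msg=before
    trap 'echo $msg' EXIT    # $msg is expanded when the trap runs
    msg=after
)

double_quoted    # displays: before
single_quoted    # displays: after
```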

After executing the commands, the shell resumes executing the script where it left off. If you want trap to prevent a script from exiting when it receives a signal but not to run any commands explicitly, you can specify a null (empty) commands string, as shown in the locktty script (page 458). The following command causes the script to ignore signal number 15 and continue running:

 trap '' 15

The following script demonstrates how the trap builtin can catch the terminal interrupt signal (2). You can use SIGINT, INT, or 2 to specify this signal. The script returns an exit status of 1:

 $ cat inter
 #!/bin/bash
 trap 'echo PROGRAM INTERRUPTED; exit 1' INT
 while true
 do
     echo "Program running."
     sleep 1
 done
 $ inter
 Program running.
 Program running.
 Program running.
 CONTROL-C
 PROGRAM INTERRUPTED
 $

: (null) builtin

The second line of inter sets up a trap for the terminal interrupt signal using INT. When trap catches the signal, the shell executes the two commands between the single quotation marks in the trap command. The echo builtin displays the message PROGRAM INTERRUPTED, exit terminates the shell running the script, and the parent shell displays a prompt. If exit were not there, the shell would return control to the while loop after displaying the message. The while loop repeats continuously until the script receives a signal because the true utility always returns a true exit status. In place of true you can use the : (null) builtin, which is written as a colon and always returns a 0 (true) status.

The trap builtin frequently removes temporary files when a script is terminated prematurely so that the files are not left to clutter the filesystem. The following shell script, named addbanner, uses two traps to remove a temporary file when the script terminates normally or owing to a hangup, software interrupt, quit, or software termination signal:

 $ cat addbanner
 #!/bin/bash
 script=$(basename $0)
 if [ ! -r "$HOME/banner" ]
     then
         echo "$script: need readable $HOME/banner file" 1>&2
         exit 1
 fi
 trap 'exit 1' 1 2 3 15
 trap 'rm /tmp/$$.$script 2> /dev/null' 0
 for file
 do
     if [ -r "$file" -a -w "$file" ]
         then
             cat $HOME/banner $file > /tmp/$$.$script
             cp /tmp/$$.$script $file
             echo "$script: banner added to $file" 1>&2
         else
             echo "$script: need read and write permission for $file" 1>&2
         fi
 done

When called with one or more filename arguments, addbanner loops through the files, adding a header to the top of each. This script is useful when you use a standard format at the top of your documents, such as a standard layout for memos, or when you want to add a standard header to shell scripts. The header is kept in a file named ~/banner. Because addbanner uses the HOME variable, which contains the pathname of the user's home directory, the script can be used by several users without modification. If Alex had written the script with /home/alex in place of $HOME and then given the script to Jenny, either she would have had to change it or addbanner would have used Alex's banner file when Jenny ran it (assuming Jenny had read permission for the file).

The first trap in addbanner causes it to exit with a status of 1 when it receives a hangup, software interrupt (terminal interrupt or quit signal), or software termination signal. The second trap uses a 0 in place of signal-number, which causes trap to execute its command argument whenever the script exits because it receives an exit command or reaches its end. Together these traps remove a temporary file whether the script terminates normally or prematurely. Standard error of the second trap is sent to /dev/null for cases in which trap attempts to remove a nonexistent temporary file. In those cases rm sends an error message to standard error; because standard error is redirected, the user does not see this message.
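The same pair of traps can be written with signal names and with a temporary file created by mktemp. The following is a sketch of that variation, not a replacement for addbanner: it assumes bash and the mktemp utility, with_scratch_file is a hypothetical name, and the date command merely stands in for real work.

```shell
#!/bin/bash
# The body runs in a subshell (note the parentheses) so its traps
# do not leak into the calling shell.
with_scratch_file() (
    tmp=$(mktemp) || exit 1
    trap 'exit 1' HUP INT QUIT TERM   # turn these signals into a normal exit
    trap 'rm -f "$tmp"' EXIT          # runs on every exit path
    date > "$tmp"                     # stand-in for real work
    echo "$tmp"                       # report the name for inspection
)

leftover=$(with_scratch_file)
[ -e "$leftover" ] || echo "scratch file was removed"
```

Using rm -f makes redirecting standard error unnecessary, because rm with that option does not complain about a file that is already gone.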

See page 458 for another example that uses trap.

`kill`: Aborts a Process

The kill builtin sends a signal to a process or job. The kill command has the following syntax:

 kill [-signal] PID

where signal is the signal name or number (for example, INT or 2) and PID is the process identification number of the process that is to receive the signal. You can specify a job number (page 125) as %n in place of PID. If you omit signal, kill sends a TERM (software termination, number 15) signal. For more information on signal names and numbers see Table 11-5 on page 494.

The following command sends the TERM signal to job number 1:

 $ kill -TERM %1

Because TERM is the default signal for kill, you can also give this command as kill %1. Give the command kill -l (lowercase "l") to display a list of signal names.

A program that is interrupted often leaves matters in an unpredictable state: Temporary files may be left behind (when they are normally removed), and permissions may be changed. A well-written application traps, or detects, signals and cleans up before exiting. Most carefully written applications trap the INT, QUIT, and TERM signals.

To terminate a program, first try INT (press CONTROL-C, if the job is in the foreground). Because an application can be written to ignore these signals, you may need to use the KILL signal, which cannot be trapped or ignored; it is a "sure kill." Refer to page 693 for more information on kill. See also the related utility killall (page 695).
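A script can send a signal to a process it started and then inspect how that process ended. The following is a minimal sketch, assuming bash on a system where TERM is signal number 15; the sleep command stands in for a long-running job, and $! holds the PID of the most recent background command.

```shell
#!/bin/bash
sleep 60 &               # a stand-in for a long-running job
pid=$!

kill -TERM "$pid"        # same as: kill "$pid"

# wait returns the exit status of the job; for a job ended by a
# signal, bash reports 128 plus the signal number.
status=0
wait "$pid" || status=$?
echo "exit status $status"
```

A status above 128 is therefore a quick way to tell that a job was terminated by a signal rather than exiting on its own.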

`getopts`: Parses Options

The getopts builtin (not available in tcsh) parses command line arguments, thereby making it easier to write programs that follow the Linux argument conventions. The syntax for getopts is

 getopts optstring varname [arg ...]

where optstring is a list of the valid option letters, varname is the variable that receives the options one at a time, and arg is the optional list of parameters to be processed. If arg is not present, getopts processes the command line arguments. If optstring starts with a colon (:), the script takes care of generating error messages; otherwise, getopts generates error messages.

The getopts builtin uses the OPTIND (option index) and OPTARG (option argument) variables to store option-related values. When a shell script starts, the value of OPTIND is 1. Each time getopts locates an argument, it increments OPTIND to the index of the next option to be processed. If the option takes an argument, bash assigns the value of the argument to OPTARG.

To indicate that an option takes an argument, follow the corresponding letter in optstring with a colon (:). The option string dxo:lt:r indicates that getopts should search for d, x, o, l, t, and r options and that the o and t options take arguments.

Using getopts as the test-command in a while control structure allows you to loop over the options one at a time. The getopts builtin checks the option list for options that are in optstring. Each time through the loop, getopts stores the option letter it finds in varname.

Suppose that you want to write a program that can take three options:

A -b option indicates that the program should ignore whitespace at the start of input lines.
A -t option followed by the name of a directory indicates that the program should use that directory for temporary files. Otherwise, it should use /tmp.
A -u option indicates that the program should translate all its output to uppercase.

In addition, the program should ignore all other options and end option processing when it encounters two hyphens (--).

The problem is to write the portion of the program that determines which options the user has supplied. The following solution does not use getopts:

 SKIPBLANKS=
 TMPDIR=/tmp
 CASE=lower
 while [[ "$1" = -* ]] # [[ = ]] does pattern match
 do
     case $1 in
         -b)    SKIPBLANKS=TRUE ;;
         -t)    if [ -d "$2" ]
                    then
                        TMPDIR=$2
                        shift
                    else
                        echo "$0: -t takes a directory argument." >&2
                        exit 1
                fi ;;
         -u)    CASE=upper ;;
         --)    break ;;      # Stop processing options
         *)     echo "$0: Invalid option $1 ignored." >&2 ;;
     esac
     shift
 done

This program fragment uses a loop to check and shift arguments as long as the current argument begins with a hyphen. Until it encounters two hyphens, the program continues to loop through a case statement that checks for possible options. The -- case label breaks out of the while loop. The * case label matches any option; it appears as the last case label to catch any unknown options, displays an error message, and allows processing to continue. On each pass through the loop, the program does a shift to get to the next argument. If an option takes an argument, the program does an extra shift to get past that argument.

The following program fragment processes the same options, but uses getopts:

 SKIPBLANKS=
 TMPDIR=/tmp
 CASE=lower
 while getopts :bt:u arg
 do
     case $arg in
         b)     SKIPBLANKS=TRUE ;;
         t)     if [ -d "$OPTARG" ]
                    then
                        TMPDIR=$OPTARG
                    else
                        echo "$0: $OPTARG is not a directory." >&2
                        exit 1
                fi ;;
         u)     CASE=upper ;;
         :)     echo "$0: Must supply an argument to -$OPTARG." >&2
                exit 1 ;;
         \?)    echo "Invalid option -$OPTARG ignored." >&2 ;;
     esac
 done

In this version of the code, the while structure evaluates the getopts builtin each time it comes to the top of the loop. The getopts builtin uses the OPTIND variable to keep track of the index of the argument it is to process the next time it is called. There is no need to call shift in this example.

In the getopts version of the script the case patterns do not start with a hyphen because the value of arg is just the option letter (getopts strips off the hyphen). Also, getopts recognizes -- as the end of the options, so you do not have to specify it explicitly as in the case statement in the first example.
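Although the loop itself needs no shift, a program that also takes filename operands typically discards the parsed options once the loop ends, using the value getopts leaves in OPTIND. The following is a sketch under stated assumptions: bash, a hypothetical parse_args wrapper function so that the loop can be tried on any argument list, lowercase local variable names chosen for the illustration, and no directory check on the -t argument.

```shell
#!/bin/bash
parse_args() {
    # Resetting OPTIND makes getopts start over on each call.
    local OPTIND=1 arg skipblanks="" tmpdir=/tmp case_mode=lower
    while getopts :bt:u arg
    do
        case $arg in
            b)  skipblanks=TRUE ;;
            t)  tmpdir=$OPTARG ;;
            u)  case_mode=upper ;;
            :)  echo "Must supply an argument to -$OPTARG." >&2; return 1 ;;
            \?) echo "Invalid option -$OPTARG ignored." >&2 ;;
        esac
    done
    shift $((OPTIND - 1))     # discard the options that getopts consumed
    echo "skipblanks=$skipblanks tmpdir=$tmpdir case=$case_mode operands=$*"
}

parse_args -b -t /var/tmp -- -u report.txt
```

In this call the -u that follows the two hyphens is treated as an operand, not as an option, so case remains lower and the operands are -u and report.txt.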

Because you tell getopts which options are valid and which require arguments, it can detect errors in the command line and handle them in two ways. This example uses a leading colon in optstring to specify that you check for and handle errors in your code; when getopts finds an invalid option, it sets varname to ? and OPTARG to the option letter. When it finds an option that is missing an argument, getopts sets varname to : and OPTARG to the option lacking an argument.

The \? case pattern specifies the action to take when getopts detects an invalid option. The : case pattern specifies the action to take when getopts detects a missing option argument. In both cases getopts does not write any error message; it leaves that task to you.

If you omit the leading colon from optstring, both an invalid option and a missing option argument cause varname to be assigned the string ?. OPTARG is not set and getopts writes its own diagnostic message to standard error. Generally this method is less desirable because you have less control over what the user sees when an error is made.
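The two error-handling modes can be compared side by side by calling getopts once with each form of optstring. The following sketch assumes bash; report is a hypothetical helper that takes an optstring followed by the arguments to parse, and it discards the diagnostic that getopts writes when the leading colon is absent.

```shell
#!/bin/bash
# report OPTSTRING ARG...: run getopts once and show what it stored.
report() {
    local OPTIND=1 arg
    getopts "$1" arg "${@:2}" 2>/dev/null
    echo "arg=$arg OPTARG=${OPTARG-unset}"
}

report :t: -x    # leading colon, invalid option
report :t: -t    # leading colon, missing argument
report t: -x     # no leading colon, invalid option
report t: -t     # no leading colon, missing argument
```

With the leading colon the two errors can be told apart (arg is ? or :) and OPTARG names the offending option; without it both calls report arg=? with OPTARG unset.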

Using getopts will not necessarily make your programs shorter. Its principal advantages are that it provides a uniform programming interface and it enforces standard option handling.

A Partial List of Builtins

Table 11-6 lists some of the bash builtins. See "Listing bash builtins" on page 133 for instructions on how to display complete lists of builtins.
