Further Scripting Techniques | UNIX: The Complete Reference, Second Edition (Complete Reference Series)

By now you know most of the important techniques for shell scripting, including various methods of getting input from the user, working with data, and controlling the flow of your scripts with statements like if and for. This section describes techniques that are less common (but still useful), such as how to process command-line options, how to read all the lines in a file, and how to process interrupt signals.

Command-Line Options in Shell Scripts

You already know how to use command-line arguments, such as filenames, with the positional parameters $1, $2, and so on. You could use the positional parameters and a set of if or case statements to handle option flags (as in ls -la) as well, but the command getopts is much easier to use.

getopts parses the options that are given to a script on the command line. It interprets any letter preceded by a minus sign as an option. It allows options to be specified in any order, and options without arguments to be grouped together.

The easiest way to understand getopts is from an example. This example simply reads the command-line options with getopts and prints them to standard output:

 $ cat getoptsExample
 # Look for the command line options a, b, c, and d.
 # The options a and d take arguments, unlike b and c.
 # Print any options that are found.
 while getopts a:bcd: FLAGNAME
 do
     case $FLAGNAME in
     a)  echo "Found -a $OPTARG"
         ;;
     b)  echo "Found -b"
         ;;
     c)  echo "Found -c"
         ;;
     d)  echo "Found -d $OPTARG"
         ;;
     \?) echo "Error: unexpected argument"
         exit 2
         ;;
     esac
 done
 echo "There were $OPTIND options and arguments total."
 # Remove the options from the list of positional parameters.
 shift `expr $OPTIND - 1`
 echo -e "The other command line arguments were:\n$*"

Here’s what it might look like when run:

 $ ./getoptsExample -bc -a "testing options" filename1 filename2
 Found -b
 Found -c
 Found -a testing options
 There were 4 options and arguments total.
 The other command line arguments were:
 filename1 filename2

Here’s how the example works. The line getopts a:bcd: FLAGNAME looks for the options a, b, c, and d. The : after a and d shows that those options take additional arguments. Each time getopts is called, the next option found is saved in FLAGNAME, and any argument for that option is saved in the special variable OPTARG. The case statement checks which option it was, and takes whatever action is appropriate. In this case, the options were printed with echo. More commonly, variables might be set here to indicate which options were chosen and to save their arguments.

If an option that is not in the getopts list is found, FLAGNAME is set to ?. The case statement shown above includes a test for ? (written \? so that the shell does not treat it as a wildcard), which prints an error message and exits.

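The while loop repeats until all the options have been processed. At this point, the special variable OPTIND holds the index of the next command-line argument to be processed, which is one more than the number of command-line words taken up by options and their arguments. (In the sample run OPTIND is 4: the three words -bc, -a, and "testing options" were consumed, and filename1 is the fourth.) The shift command removes OPTIND minus 1 words from the list of positional parameters, so that the remaining command-line arguments can be used as $1, $2, and so on.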
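As a hedged illustration of the more common pattern, in which options set variables instead of being echoed, here is a minimal sketch that wraps getopts in a function. The function name parse_args, the flags -v and -o, and the variables verbose, outfile, and remaining are hypothetical names chosen for this example; they are not part of getopts itself.

```shell
# Hypothetical parser: -v is a simple flag, -o takes an argument.
parse_args() {
    verbose=no      # becomes "yes" if -v is given
    outfile=        # becomes the argument of -o, if given
    OPTIND=1        # reset so the function can be called more than once
    while getopts vo: flag
    do
        case $flag in
        v)  verbose=yes ;;
        o)  outfile=$OPTARG ;;
        \?) return 2 ;;
        esac
    done
    # Drop the words consumed by options; what is left are the operands.
    shift $((OPTIND - 1))
    remaining="$*"
}

parse_args -v -o report.txt file1 file2
echo "verbose=$verbose outfile=$outfile remaining=$remaining"
```

Resetting OPTIND to 1 matters only because the parsing happens inside a function; a script that calls getopts once at the top level can rely on the initial value.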

Using getopts may seem rather daunting, and of course for the majority of scripts it is unnecessary. But once you understand how it works, it’s not too hard to adapt the sample code just shown for use in any script you might write.

Grouping Commands

You can execute a list of commands as a group by enclosing them in parentheses. The commands are executed in their own subshell. For example,

 (cd ~/bin; ls -l)

You can enter this on the command line to list the contents of ~/bin. Because the commands are executed in a subshell, your current directory will not be changed.

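If you want to execute a group of commands in the current shell, enclose them in curly braces instead of parentheses. Unlike parentheses, braces are reserved words rather than operators, so the opening brace must be followed by a space, and the last command must end with a semicolon or a newline before the closing brace.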
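The difference between the two kinds of grouping can be seen in a minimal sketch like the following. The variable names where, after_parens, and after_braces are hypothetical and exist only for this illustration.

```shell
where=outside

( where=inside_parens )     # subshell: the assignment is lost when it ends
after_parens=$where

{ where=inside_braces; }    # current shell: the assignment persists
after_braces=$where

echo "after ( ): $after_parens"
echo "after { }: $after_braces"
```

The same reasoning applies to cd: a directory change inside parentheses is undone when the subshell exits, while one inside braces stays in effect.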

Grouping commands makes it easy to redirect output. For example,

 { date; who; last; } > $LOGFILE

is shorter than

 date > $LOGFILE
 who >> $LOGFILE
 last >> $LOGFILE

Grouping also allows you to redirect output from commands in a pipeline. If you try to redirect standard error like this:

 diff $OLDFILE $NEWFILE | lp 2> errorfile

only error messages from lp will be captured. You can use

 (diff $OLDFILE $NEWFILE | lp) 2> errorfile

to redirect error messages from all the commands in the pipeline.
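To see the effect without a printer, the following sketch replaces diff and lp with two hypothetical shell functions, producer and consumer, each of which writes one line to standard error. The function names and the temporary file path are assumptions made for this example.

```shell
errorfile=${TMPDIR:-/tmp}/group_demo.$$

# Hypothetical stand-ins for diff and lp.
producer() { echo "producer: warning" >&2; echo "data"; }
consumer() { cat > /dev/null; echo "consumer: warning" >&2; }

# Ungrouped: the redirection applies to consumer only.
# (producer's warning still goes to the terminal.)
producer | consumer 2> "$errorfile"
ungrouped_count=$(grep -c warning "$errorfile")

# Grouped: the redirection applies to the whole pipeline.
( producer | consumer ) 2> "$errorfile"
grouped_count=$(grep -c warning "$errorfile")

rm -f "$errorfile"
echo "ungrouped: $ungrouped_count line(s), grouped: $grouped_count line(s)"
```

The counts are compared rather than the file contents because the order in which the two commands write their messages is not guaranteed.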

Reading Each Line in a File

Suppose you want to read the contents of a file one line at a time. For example, you might want to print a line number at the beginning of each line. You could do it like this:

 n=0
 cat $FILE |
   while read LINE
   do
     echo "$n) $LINE"
     n=`expr $n + 1`
   done
 echo "There were $n lines in $FILE."

This uses a pipe to send the contents of $FILE to the read command in the while loop. The loop repeats as long as there are lines to read. The variable n keeps track of the total number of lines.

The problem with this is that, in bash and many other shells, each command in a pipeline is executed in a subshell. Because the while loop is executed in its own subshell, the changes to the variable n are lost when the loop ends. So the last line of the script says that there were 0 lines in the file.

You can fix this by replacing the pipe with input redirection: group the loop with curly braces (so that it gets executed in the current shell), and redirect the contents of $FILE to the group. The new script will look like this:

 n=0
 {
   while read LINE
   do
     echo "$n) $LINE"
     n=`expr $n + 1`
   done
 } < $FILE
 echo "There were $n lines in $FILE."

As before, the lines from $FILE are printed with line numbers, but this time the variable n is updated, so the total number of lines is reported correctly.
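A variation on the same idea, offered as a sketch rather than as the method used above, attaches the redirection directly to done and counts with shell arithmetic instead of expr. It also uses IFS= and read -r so that leading whitespace and backslashes in each line are preserved, and it increments n before printing so that numbering starts at 1. The sample file path and its contents are assumptions made for this example.

```shell
FILE=${TMPDIR:-/tmp}/read_demo.$$
printf 'alpha\n  beta with spaces\ngamma\n' > "$FILE"

n=0
while IFS= read -r LINE
do
    n=$((n + 1))            # count first, so the first line is number 1
    echo "$n) $LINE"
done < "$FILE"              # the loop runs in the current shell
echo "There were $n lines in the sample file."

rm -f "$FILE"
```

Because no pipe is involved, the loop is not run in a subshell and the final value of n is still available after done.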

The trap Command

Some shell scripts create temporary files to store data. These files are typically deleted at the end of the script. But sometimes scripts are interrupted before they finish (e.g., if you hit CTRL-C), in which case these files might be left sitting there. The trap command provides a way to execute a short sequence of commands to clean up before your script is forced to exit.

Ending a process with kill, hitting CTRL-C, or closing your terminal window causes the UNIX system to send an interrupt signal to your script. With trap you can specify which of these signals to look for. The general form of the command is

 trap commands interrupt-numbers

The first argument to trap is the command or commands to be executed when an interrupt is received. The interrupt-numbers are codes that specify the interrupt. The most important interrupts are shown in Table 20–6.

Table 20–6: Interrupt Codes
Number	Interrupt	Meaning
0	Shell Exit	This occurs at the end of a script that is being executed in a subshell. It is not normally included in a trap statement.
1	Hangup	This occurs when you exit your current session (e.g., if you close your terminal window).
2	Interrupt	This happens when you end a process with CTRL-C.
9	Kill	This happens when you use kill -9 to terminate the script. It cannot be trapped.
15	Terminate	This happens if you use kill to terminate the script, as in kill %1.

The trap statement is usually added at the beginning of your script, so that it will be executed no matter when your script is interrupted. It might look something like this:

 trap 'rm tmpfile; exit 1' 1 2 15

In this case, if an interrupt is received, tmpfile will be deleted, and the script will exit with an error code. If you do not include the exit command, the script will not exit. Instead, it will continue executing from the point where the interrupt was received. To ensure that your scripts exit when they are interrupted, always remember to include exit as part of the trap statement. If you forget to do this, you will have to use kill -9 to end your script. Since interrupt 9 cannot be trapped, you can always use CTRL-Z, followed by kill -9 %n (where n is the job number), to end your current process.
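The cleanup behavior can be tried out with a small self-contained sketch. Here a child shell started with sh -c plays the role of the script, and it sends signal 15 to itself to simulate being killed from another terminal. The function name cleanup, the temporary file path, and the variables status and left_behind are hypothetical names used only for this illustration.

```shell
tmpfile=${TMPDIR:-/tmp}/trap_demo.$$

# The child shell stands in for a script that creates a temporary file.
sh -c '
    tmpfile=$1
    cleanup() {
        rm -f "$tmpfile"
        exit 1
    }
    trap cleanup 1 2 15
    echo "scratch data" > "$tmpfile"
    kill -15 $$             # simulate "kill" from another terminal
    echo "not reached"
' trap_demo "$tmpfile"
status=$?

if [ -f "$tmpfile" ]; then left_behind=yes; else left_behind=no; fi
echo "exit status: $status, temp file left behind: $left_behind"
```

Putting the cleanup commands in a function keeps the trap line short and avoids nested quoting when the list of commands grows.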

The xargs Command

One much-used feature of the shell is the capability to connect the output of one program to the input of another using pipes. Sometimes you may want to use the output of one command to define the arguments for another. xargs is a shell programming tool that lets you do this; it is especially useful for constructing lists of arguments and executing commands. This is the general format of xargs:

 xargs [flags] [command [(initial args)]]

xargs takes its initial arguments, combines them with arguments read from the standard input, and uses the combination in executing the specified command. Each command to be executed is constructed from the command, then the initial args, and then the arguments read from standard input.
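The way each command line is assembled can be observed by using echo as the command, as in this minimal sketch. The variable names batched and one_each are hypothetical; -n is a standard xargs option that limits how many arguments from standard input go into each constructed command.

```shell
# One command: the initial argument "initial" followed by everything read.
batched=$(printf '%s\n' one two three | xargs echo initial)

# With -n 1, each constructed command gets a single argument from the input.
one_each=$(printf '%s\n' one two three | xargs -n 1 echo initial)

echo "$batched"
echo "$one_each"
```

The first echo prints a single line, initial one two three, while the second prints three lines, one per constructed command.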

For example, you can use xargs to combine the commands find and grep in order to search an entire directory structure for files containing a particular string. The find command is used to recursively descend the directory tree, and grep is used to search for the target string in all of the files from find.

In the fileswith script that follows, find starts in the current directory (.) and prints on standard output the names of all regular files in the directory and its subdirectories. xargs reads these filenames from its standard input and appends them to grep, its options (-l, -i, -s), and the script's command-line arguments ($*, which is the target pattern) to construct commands of the form grep -l -i -s $* filename .... xargs packs as many filenames as it can into each command line, and continues to construct and execute new command lines until it has used every filename provided to it. The program fileswith prints out the name of each file that has the target pattern in it, so the command fileswith Calvino will print out names of all files that contain the string “Calvino”.

 #
 # fileswith - descend directory structure
 # and print names of files that contain
 # target words specified on the command line.
 #
 find . -type f -print | xargs grep -l -i -s $* 2>/dev/null

The output is a listing of all the files that contain the target phrase:

 $ fileswith Borges
 ./mbox
 ./Notes/books
 ./Scripts/Perl/orbis-tertius.pl
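One caveat: by default xargs splits its input on blanks as well as newlines, so filenames containing spaces are broken into pieces. A possible alternative, shown here as a sketch and not as part of the fileswith script, is to let find run grep itself with -exec ... {} +, which passes each filename as a single argument. The demo directory, the file names, and the variable matches are assumptions made for this example.

```shell
demo_dir=${TMPDIR:-/tmp}/fileswith_demo.$$
mkdir -p "$demo_dir/sub dir"
echo "Borges wrote this" > "$demo_dir/sub dir/a note.txt"
echo "nothing here"      > "$demo_dir/other.txt"

# find hands batches of filenames to grep directly, spaces and all.
matches=$(cd "$demo_dir" && find . -type f -exec grep -l -i -s "Borges" {} +)

rm -rf "$demo_dir"
echo "$matches"
```

Like xargs, the + form of -exec groups many filenames into each grep invocation, so it does not start one process per file.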

xargs itself can take several options, and its use can get rather complicated. The two most commonly used options are:

-i	Each line from standard input is treated as a single argument and inserted into initial args in place of the {} symbols.
-p	Prompt mode. For each command to be executed, print the command, followed by a ?. Execute the command only if the user types y (followed by anything). If anything else is typed, skip the command.

In the following example, move uses xargs to list all the files in a directory ($1) and move each file to a second directory ($2), using the same filename. The -i option to xargs replaces each {} in the script with a filename from the output of ls. The -p option prompts the user before executing each command:

 #
 # move $1 $2 - move files from directory $1 to directory $2,
 # echo mv command, and prompt for "y" before
 # executing command.
 #
 ls $1 | xargs -i -p mv $1/{} $2/{}
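The substitution can be previewed safely by putting echo in front of mv, so that the constructed commands are printed instead of executed. This sketch uses -I {}, the POSIX spelling of the insert option (GNU xargs documents -i as deprecated in its favor), and leaves out -p because prompting needs a terminal. The directory names olddir and newdir, the file names, and the variable planned are hypothetical.

```shell
src=olddir
dst=newdir

# Each input line replaces every {} in the command that is constructed.
planned=$(printf '%s\n' a.txt b.txt | xargs -I {} echo mv "$src/{}" "$dst/{}")

echo "$planned"
```

Once the printed commands look right, removing echo turns the dry run into the real operation.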