File Descriptors


As discussed on page 262, before a process can read from or write to a file it must open that file. When a process opens a file, Mac OS X associates a number (called a file descriptor) with the file. Each process has its own set of open files and its own file descriptors. After opening a file, a process reads from and writes to that file by referring to its file descriptor. When it no longer needs the file, the process closes the file, freeing the file descriptor.

A typical Mac OS X process starts with three open files: standard input (file descriptor 0), standard output (file descriptor 1), and standard error (file descriptor 2). Often those are the only files the process needs. Recall that you redirect standard output with the symbol > or the symbol 1> and that you redirect standard error with the symbol 2>. Although you can redirect other file descriptors, because file descriptors other than 0, 1, and 2 do not have any special conventional meaning, it is rarely useful to do so. The exception is in programs that you write yourself, in which case you control the meaning of the file descriptors and can take advantage of redirection.

Opening a file descriptor

The Bourne Again Shell opens files using the exec builtin as follows:

exec n> outfile
exec m< infile


The first line opens outfile for output and holds it open, associating it with file descriptor n. The second line opens infile for input and holds it open, associating it with file descriptor m.

Duplicating a file descriptor

The <& token duplicates an input file descriptor; use >& to duplicate an output file descriptor. You can duplicate a file descriptor by making it refer to the same file as another open file descriptor, such as standard input or output. Use the following format to open or redirect file descriptor n as a duplicate of file descriptor m:

exec n<&m


Once you have opened a file, you can use it for input and output in two different ways. First, you can use I/O redirection on any command line, redirecting standard output to a file descriptor with >&n or redirecting standard input from a file descriptor with <&n. Second, you can use the read (page 571) and echo builtins. If you invoke other commands, including functions (page 314), they inherit these open files and file descriptors. When you have finished using a file, you can close it with

exec n<&-
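As a minimal sketch of the open, use, and close cycle described above, the following fragment writes two lines through file descriptor 3 and reads them back through file descriptor 4. The scratch file created with mktemp and the variable names are assumptions made for illustration; they do not come from the text.

```bash
#!/bin/bash
# Sketch: open, use, and close numbered file descriptors with exec.
# The scratch file and the variable names are hypothetical.
scratch=$(mktemp)

exec 3> "$scratch"        # open scratch for output on descriptor 3
echo "first line" >&3     # redirect standard output to descriptor 3
echo "second line" >&3
exec 3>&-                 # close descriptor 3

exec 4< "$scratch"        # open scratch for input on descriptor 4
read -r line1 <&4         # each read continues where the previous one stopped
read -r line2 <&4
exec 4<&-                 # close descriptor 4

echo "$line1 / $line2"
rm -f "$scratch"
```

Because the descriptor stays open between commands, the second read picks up the second line instead of starting again at the top of the file.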


When you invoke the shell function in the next example, named mycp, with two arguments, it copies the file named by the first argument to the file named by the second argument. If you supply only one argument, the script copies the file named by the argument to standard output. If you invoke mycp with no arguments, it copies standard input to standard output.

Tip: A function is not a shell script

The mycp example is a shell function; it will not work as you expect if you execute it as a shell script. (It will work: The function will be created in a very short-lived subshell, which is probably of little use.) You can enter this function from the keyboard. If you put the function in a file, you can run it as an argument to the . (dot) builtin (page 261). You can also put the function in a startup file if you want it to be always available (page 315).


function mycp () {
case $# in
    0)
        # zero arguments
        # file descriptor 3 duplicates standard input
        # file descriptor 4 duplicates standard output
        exec 3<&0 4<&1
        ;;
    1)
        # one argument
        # open the file named by the argument for input
        # and associate it with file descriptor 3
        # file descriptor 4 duplicates standard output
        exec 3< $1 4<&1
        ;;
    2)
        # two arguments
        # open the file named by the first argument for input
        # and associate it with file descriptor 3
        # open the file named by the second argument for output
        # and associate it with file descriptor 4
        exec 3< $1 4> $2
        ;;
    *)
        echo "Usage: mycp [source [dest]]"
        return 1
        ;;
esac

# call cat with input coming from file descriptor 3
# and output going to file descriptor 4
cat <&3 >&4

# close file descriptors 3 and 4
exec 3<&- 4<&-
}


The real work of this function is done in the line that begins with cat. The rest of the script arranges for file descriptors 3 and 4, which are the input and output of the cat command, to be associated with the appropriate files.

With its output redirected, the cat utility supports only data forks. Files that have resource forks or other metadata will not be copied properly by this script. For more information see the "Redirection does not support resource forks" tip on page 94.

Optional

The next program takes two filenames on the command line, sorts both, and sends the output to temporary files. The program then merges the sorted files to standard output, preceding each line by a number that indicates which file it came from.

$ cat sortmerg
#!/bin/bash
usage ()
{
if [ $# -ne 2 ]; then
    echo "Usage: $0 file1 file2" 1>&2
    exit 1
    fi
}

# Default temporary directory
: ${TEMPDIR:=/tmp}

# Check argument count
usage "$@"

# Set up temporary files for sorting
file1=$TEMPDIR/$$.file1
file2=$TEMPDIR/$$.file2

# Sort
sort $1 > $file1
sort $2 > $file2

# Open $file1 and $file2 for reading. Use file descriptors 3 and 4.
exec 3<$file1
exec 4<$file2

# Read the first line from each file to figure out how to start.
read Line1 <&3
status1=$?
read Line2 <&4
status2=$?

# Strategy: while there is still input left in both files:
#   Output the line that should come first.
#   Read a new line from the file that line came from.
while [ $status1 -eq 0 -a $status2 -eq 0 ]
    do
        if [[ "$Line2" > "$Line1" ]]; then
            echo -e "1.\t$Line1"
            read -u3 Line1
            status1=$?
        else
            echo -e "2.\t$Line2"
            read -u4 Line2
            status2=$?
        fi
    done

# Now one of the files is at end-of-file.
# Read from each file until the end.
# First file1:
while [ $status1 -eq 0 ]
    do
        echo -e "1.\t$Line1"
        read Line1 <&3
        status1=$?
    done
# Next file2:
while [[ $status2 -eq 0 ]]
    do
        echo -e "2.\t$Line2"
        read Line2 <&4
        status2=$?
    done

# Close and remove both input files
exec 3<&- 4<&-
rm -f $file1 $file2
exit 0



Parameters and Variables

Shell parameters and variables were introduced on page 278. This section adds to the previous coverage with a discussion of array variables, global versus local variables, special and positional parameters, and expanding null and unset variables.

Array Variables

The Bourne Again Shell supports one-dimensional array variables. The subscripts are integers with zero-based indexing (i.e., the first element of the array has the subscript 0). The following format declares and assigns values to an array:

name=(element1 element2 ...)


The following example assigns four values to the array NAMES:

$ NAMES=(max helen sam zach)


You reference a single element of an array as follows:

$ echo ${NAMES[2]}
sam


The subscripts [*] and [@] both extract the entire array but work differently when used within double quotation marks. An @ produces an array that is a duplicate of the original array; an * produces a single element of an array (or a plain variable) that holds all the elements of the array separated by the first character in IFS (normally a SPACE). In the following example, the array A is filled with the elements of the NAMES variable using an *, and B is filled using an @. The declare builtin with the -a option displays the values of the arrays (and reminds you that bash uses zero-based indexing for arrays):

$ A=("${NAMES[*]}")
$ B=("${NAMES[@]}")
$ declare -a
declare -a A='([0]="max helen sam zach")'
declare -a B='([0]="max" [1]="helen" [2]="sam" [3]="zach")'
...
declare -a NAMES='([0]="max" [1]="helen" [2]="sam" [3]="zach")'


From the output of declare, you can see that NAMES and B have multiple elements. In contrast, A, which was assigned its value with an * within double quotation marks, has only one element: A has all its elements enclosed between double quotation marks.

In the next example, echo attempts to display element 1 of array A. Nothing is displayed because A has only one element and that element has an index of 0. Element 0 of array A holds all four names. Element 1 of B holds the second item in the array and element 0 holds the first item.

$ echo ${A[1]}

$ echo ${A[0]}
max helen sam zach
$ echo ${B[1]}
helen
$ echo ${B[0]}
max
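The difference between the two subscripts matters most when an element contains a SPACE. The following sketch counts how many arguments each form produces; the count_args function and the sample names are hypothetical and are used only to make the effect visible.

```bash
#!/bin/bash
# Sketch: how "${array[@]}" and "${array[*]}" differ when an element holds a SPACE.
# count_args and the sample names are hypothetical.
count_args () { echo $#; }

NAMES=("max smith" helen sam)

at_count=$(count_args "${NAMES[@]}")      # one argument per element
star_count=$(count_args "${NAMES[*]}")    # all elements joined into one argument
bare_count=$(count_args ${NAMES[@]})      # unquoted: split again at each SPACE

echo "@: $at_count  *: $star_count  unquoted: $bare_count"
```

With three elements, one of which holds two words, the quoted @ form yields three arguments, the quoted * form yields one, and the unquoted form yields four.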


You can apply the ${#name[*]} operator to array variables, returning the number of elements in the array:

$ echo ${#NAMES[*]}
4


The same operator, when given the index of an element of an array in place of *, returns the length of the element:

$ echo ${#NAMES[1]}
5


You can use subscripts on the left side of an assignment statement to replace selected elements of the array:

$ NAMES[1]=alex
$ echo ${NAMES[*]}
max alex sam zach


Locality of Variables

By default variables are local to the process in which they are declared. Thus a shell script does not have access to variables declared in your login shell unless you explicitly make the variables available (global). Under bash, export makes a variable available to child processes. Under tcsh, setenv (page 353) assigns a value to a variable and makes it available to child processes. The examples in this section use the bash syntax but the theory applies to both shells.

Once you use the export builtin with a variable name as an argument, the shell places the value of the variable in the calling environment of child processes. This call by value gives each child process a copy of the variable for its own use.

The following extest1 shell script assigns a value of american to the variable named cheese and then displays its filename (extest1) and the value of cheese. The extest1 script then calls subtest, which attempts to display the same information. Next subtest declares a cheese variable and displays its value. When subtest finishes, it returns control to the parent process, which is executing extest1. At this point extest1 again displays the value of the original cheese variable.

$ cat extest1
cheese=american
echo "extest1 1: $cheese"
subtest
echo "extest1 2: $cheese"
$ cat subtest
echo "subtest 1: $cheese"
cheese=swiss
echo "subtest 2: $cheese"
$ extest1
extest1 1: american
subtest 1:
subtest 2: swiss
extest1 2: american


The subtest script never receives the value of cheese from extest1, and extest1 never loses the value. Unlike in the real world, a child can never affect its parent's attributes. When a process attempts to display the value of a variable that has not been declared, as is the case with subtest, the process displays nothing; the value of an undeclared variable is that of a null string.

The following extest2 script is the same as extest1 except that it uses export to make cheese available to the subtest script:

$ cat extest2
export cheese=american
echo "extest2 1: $cheese"
subtest
echo "extest2 2: $cheese"
$ extest2
extest2 1: american
subtest 1: american
subtest 2: swiss
extest2 2: american


Here the child process inherits the value of cheese as american and, after displaying this value, changes its copy to swiss. When control is returned to the parent, the parent's copy of cheese retains its original value: american.

An export builtin can optionally include an assignment:

export cheese=american


The preceding statement is equivalent to the following two statements:

cheese=american
export cheese


Although it is rarely done, you can export a variable before you assign a value to it. You do not need to export an already-exported variable a second time after you change its value. For example, you do not usually need to export PATH when you assign a value to it in ~/.bash_profile because it is typically exported in the /etc/profile global startup file.
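The rules in this section can be checked without writing separate script files. The following sketch, which assumes bash is on the search path and uses bash -c as a stand-in for a child script such as subtest, shows a variable before and after export and after a later change in the parent.

```bash
#!/bin/bash
# Sketch: a child process sees a variable only after it is exported,
# and changes made in the child never reach the parent.
# bash -c stands in for a child script; this is an assumption for illustration.
unset cheese                                      # start from a known state

cheese=american
before=$(bash -c 'echo "${cheese:-unset}"')       # child has no copy yet

export cheese
after=$(bash -c 'echo "$cheese"; cheese=swiss')   # child gets a copy, then changes it

cheese=cheddar                                    # no second export is needed
later=$(bash -c 'echo "$cheese"')

echo "before=$before after=$after later=$later parent=$cheese"
```

The child's assignment of swiss is lost when the child exits, while the parent's later change to cheddar is passed on automatically because cheese is already exported.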

Functions

Because functions run in the same environment as the shell that calls them, variables are implicitly shared by a shell and a function it calls.

$ function nam () {
> echo $myname
> myname=zach
> }
$ myname=sam
$ nam
sam
$ echo $myname
zach


In the preceding example, the myname variable is set to sam in the interactive shell. Then the nam function is called. It displays the value of myname it has (sam) and sets myname to zach. The final echo shows that, in the interactive shell, the value of myname has been changed to zach.

Function local variables

Local variables are helpful in a function written for general use. Because the function is called by many scripts that may be written by different programmers, you need to make sure that the names of the variables used within the function do not interact with variables of the same name in the programs that call the function. Local variables eliminate this problem. When used within a function, the typeset builtin declares a variable to be local to the function it is defined in.

The next example shows the use of a local variable in a function. It uses two variables named count. The first is declared and assigned a value of 10 in the interactive shell. Its value never changes, as echo verifies after count_down is run. The other count is declared, using typeset, to be local to the function. Its value, which is unknown outside the function, ranges from 4 to 1, as the echo command within the function confirms.

The example shows the function being entered from the keyboard; it is not a shell script. (See the tip "A function is not a shell script" on page 556.)

$ function count_down () {
> typeset count
> count=$1
> while [ $count -gt 0 ]
> do
> echo "$count..."
> ((count=count-1))
> sleep 1
> done
> echo "Blast Off."
> }
$ count=10
$ count_down 4
4...
3...
2...
1...
Blast Off.
$ echo $count
10


The ((count=count-1)) assignment is enclosed between double parentheses, which cause the shell to perform an arithmetic evaluation (page 585). Within the double parentheses you can reference shell variables without the leading dollar sign ($).
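The following sketch shows the same idea in a form that can be run as a script or pasted at the prompt: two names are declared local with typeset, and one name is deliberately left global so the function can hand back a result. The function name sum_to and the variables total and result are hypothetical.

```bash
#!/bin/bash
# Sketch: typeset inside a function keeps the function's variables private.
# sum_to, total, and result are hypothetical names.
sum_to () {
    typeset count total       # local to sum_to
    count=$1
    total=0
    while [ $count -gt 0 ]
    do
        ((total=total+count))
        ((count=count-1))
    done
    result=$total             # not declared with typeset, so it is shared
}

count=10
total=unchanged
sum_to 4
echo "result=$result count=$count total=$total"
```

After the call, count and total in the calling shell still hold 10 and unchanged, while result holds the sum the function computed.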

Special Parameters

Special parameters enable you to access useful values pertaining to command line arguments and the execution of shell commands. You reference a shell special parameter by preceding a special character with a dollar sign ($). As with positional parameters, it is not possible to modify the value of a special parameter by assignment.

$$: PID Number

The shell stores in the $$ parameter the PID number of the process that is executing it. In the following interaction, echo displays the value of this variable and the ps utility confirms its value. Both commands show that the shell has a PID number of 5209:

$ echo $$
5209
$ ps -a
  PID  TT  STAT      TIME COMMAND
 5209  p1  Ss     0:00.32 -bash
 4168  p1  R+     0:00.00 ps -a


Because echo is built into the shell, the shell does not have to create another process when you give an echo command. However, the results are the same whether echo is a builtin or not, because the shell substitutes the value of $$ before it forks a new process to run a command. Try using the echo utility (/bin/echo), which is run by another process, and see what happens. In the following example, the shell substitutes the value of $$ and passes that value to cp as a prefix for a filename:

$ echo $$
8232
$ cp memo $$.memo
$ ls
8232.memo memo


Incorporating a PID number in a filename is useful for creating unique filenames when the meanings of the names do not matter; it is often used in shell scripts for creating names of temporary files. When two people are running the same shell script, these unique filenames keep them from inadvertently sharing the same temporary file.
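A script might build such a name as in the following sketch. The report prefix and the use of TEMPDIR with a default of /tmp are assumptions chosen for illustration.

```bash
#!/bin/bash
# Sketch: build a temporary filename from the PID of the running shell.
# The "report" prefix and the TEMPDIR default are hypothetical.
: ${TEMPDIR:=/tmp}
tmpfile=$TEMPDIR/report.$$

date > "$tmpfile"          # use the file
wc -l < "$tmpfile"
rm -f "$tmpfile"           # remove it when finished
```

Names built this way are unique among processes running at the same time but are easy to predict; where the mktemp utility is available, it is a more robust way to create a temporary file.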

The following example demonstrates that the shell creates a new shell process when it runs a shell script. The id2 script displays the PID number of the process running it (not the process that called it; the substitution for $$ is performed by the shell that is forked to run id2):

$ cat id2
echo "$0 PID= $$"
$ echo $$
8232
$ id2
./id2 PID= 8362
$ echo $$
8232


The first echo displays the PID number of the interactive shell. Then id2 displays its name ($0) and the PID of the subshell that it is running in. The last echo shows that the PID number of the interactive shell has not changed.

$!

The value of the PID number of the last process that you ran in the background is stored in $! (not available in tcsh). The following example executes sleep as a background task and uses echo to display the value of $!:

$ sleep 60 &
[1] 8376
$ echo $!
8376
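A common use of $! in scripts is to remember a background job so the script can wait for it later. The sketch below uses the wait builtin, which is not covered in this section, and a one-second sleep as a stand-in for real work.

```bash
#!/bin/bash
# Sketch: remember the PID of a background job and wait for it.
# sleep 1 is a stand-in for a longer-running command.
sleep 1 &
bgpid=$!                 # save $! before starting another background job
wait $bgpid              # block until that job finishes
bgstatus=$?              # wait returns the exit status of the job
echo "job $bgpid finished with status $bgstatus"
```

Saving $! in a variable right away matters because the next background command replaces its value.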


$?: Exit Status

When a process stops executing for any reason, it returns an exit status to the parent process. The exit status is also referred to as a condition code or a return code. The $? ($status under tcsh) variable stores the exit status of the last command.

By convention a nonzero exit status represents a false value and means that the command failed. A zero is true and indicates that the command was successful. In the following example, the first ls command succeeds and the second fails:

$ ls es
es
$ echo $?
0
$ ls xxx
ls: xxx: No such file or directory
$ echo $?
1


You can specify the exit status that a shell script returns by using the exit builtin, followed by a number, to terminate the script. If you do not use exit with a number to terminate a script, the exit status of the script is that of the last command the script ran.

$ cat es
echo This program returns an exit status of 7.
exit 7
$ es
This program returns an exit status of 7.
$ echo $?
7
$ echo $?
0


The es shell script displays a message and terminates execution with an exit command that returns an exit status of 7, the user-defined exit status in this script. The first echo then displays the value of the exit status of es. The second echo displays the value of the exit status of the first echo. The value is 0 because the first echo was successful.
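Because every command replaces $?, a script that needs the value later usually copies it into a variable at once. The following sketch uses a hypothetical function, returns_seven, in place of the es script so that it is self-contained.

```bash
#!/bin/bash
# Sketch: save $? immediately, because the next command overwrites it.
# returns_seven is a hypothetical stand-in for the es script.
returns_seven () { return 7; }

returns_seven
saved=$?                 # captured before anything else runs
true
overwritten=$?           # now holds the status of true

if returns_seven; then
    outcome=success
else
    outcome=failure      # a nonzero status counts as false
fi
echo "saved=$saved overwritten=$overwritten outcome=$outcome"
```

The if statement tests the exit status directly, which is why a nonzero status selects the else branch.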

Positional Parameters

The positional parameters comprise the command name and command line arguments. They are called positional because within a shell script, you refer to them by their position on the command line. Only the set builtin (page 568) allows you to change the values of positional parameters with one exception: You cannot change the value of the command name from within a script. The tcsh set builtin does not change the values of positional parameters.

$#: Number of Command Line Arguments

The $# parameter holds the number of arguments on the command line (positional parameters), not counting the command itself:

$ cat num_args
echo "This script was called with $# arguments."
$ num_args sam max zach
This script was called with 3 arguments.


$0: Name of the Calling Program

The shell stores the name of the command you used to call a program in parameter $0. This parameter is numbered zero because it appears before the first argument on the command line:

$ cat abc
echo "The command used to run this script is $0"
$ abc
The command used to run this script is ./abc
$ /Users/sam/abc
The command used to run this script is /Users/sam/abc


The preceding shell script uses echo to verify the name of the script you are executing. You can use the basename utility and command substitution to extract and display the simple filename of the command:

$ cat abc2
echo "The command used to run this script is $(basename $0)"
$ /Users/sam/abc2
The command used to run this script is abc2


$1-$n: Command Line Arguments

The first argument on the command line is represented by parameter $1, the second argument by $2, and so on up to $n. For values of n over 9, the number must be enclosed within braces. For example, the twelfth command line argument is represented by ${12}. The following script displays positional parameters that hold command line arguments:

$ cat display_5args
echo First 5 arguments are $1 $2 $3 $4 $5
$ display_5args jenny alex helen
First 5 arguments are jenny alex helen


The display_5args script displays the first five command line arguments. The shell assigns a null value to each parameter that represents an argument that is not present on the command line. Thus the $4 and $5 variables have null values in this example.

$*

The $* variable represents all the command line arguments, as the display_all program demonstrates:

$ cat display_all
echo All arguments are $*
$ display_all a b c d e f g h i j k l m n o p
All arguments are a b c d e f g h i j k l m n o p


Enclose references to positional parameters between double quotation marks. The quotation marks are particularly important when you are using positional parameters as arguments to commands. Without double quotation marks, a positional parameter that is not set or that has a null value disappears:

$ cat showargs
echo "$0 was called with $# arguments, the first is :$1:."
$ showargs a b c
./showargs was called with 3 arguments, the first is :a:.
$ echo $xx

$ showargs $xx a b c
./showargs was called with 3 arguments, the first is :a:.
$ showargs "$xx" a b c
./showargs was called with 4 arguments, the first is ::.


The showargs script displays the number of arguments ($#) followed by the value of the first argument enclosed between colons. The preceding example first calls showargs with three simple arguments. Next the echo command demonstrates that the $xx variable, which is not set, has a null value. In the final two calls to showargs, the first argument is $xx. In the first case the command line becomes showargs a b c; the shell passes showargs three arguments. In the second case the command line becomes showargs "" a b c, which results in calling showargs with four arguments. The difference in the two calls to showargs illustrates a subtle potential problem that you should keep in mind when using positional parameters that may not be set or that may have a null value.

"$*" versus "$@"

The $* and $@ parameters work the same way except when they are enclosed within double quotation marks. Using "$*" yields a single argument (with SPACEs or the value of IFS [page 288] between the positional parameters), whereas "$@" produces a list wherein each positional parameter is a separate argument. This difference typically makes "$@" more useful than "$*" in shell scripts.

The following scripts help to explain the difference between these two special parameters. In the second line of both scripts, the single quotation marks keep the shell from interpreting the enclosed special characters so they can be displayed as themselves. The bb1 script shows that set "$*" assigns multiple arguments to the first command line parameter:

$ cat bb1
set "$*"
echo $# parameters with '"$*"'
echo 1: $1
echo 2: $2
echo 3: $3
$ bb1 a b c
1 parameters with "$*"
1: a b c
2:
3:


The bb2 script shows that set "$@" assigns each argument to a different command line parameter:

$ cat bb2
set "$@"
echo $# parameters with '"$@"'
echo 1: $1
echo 2: $2
echo 3: $3
$ bb2 a b c
3 parameters with "$@"
1: a
2: b
3: c
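The difference becomes important when a script or function passes its arguments on to another command. The following sketch uses hypothetical functions to count what arrives on the other side and to show "$*" joining its arguments with the first character of IFS.

```bash
#!/bin/bash
# Sketch: pass arguments through a function with "$@" and with "$*".
# All function names are hypothetical.
count_args () { echo $#; }
pass_at ()   { count_args "$@"; }
pass_star () { count_args "$*"; }
join_args () { typeset IFS=,; echo "$*"; }   # "$*" joins with the first character of IFS

at_count=$(pass_at "a b" c)
star_count=$(pass_star "a b" c)
joined=$(join_args a b c)
echo "@: $at_count  *: $star_count  joined: $joined"
```

Here "$@" preserves the two original arguments, including the one that contains a SPACE, whereas "$*" collapses them into one. Declaring IFS with typeset inside join_args keeps the change from affecting the rest of the shell.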


shift: Promotes Command Line Arguments

The shift builtin promotes each command line argument. The first argument (which was $1) is discarded. The second argument (which was $2) becomes the first argument (now $1), the third becomes the second, and so on. Because no "unshift" command exists, you cannot bring back arguments that have been discarded. An optional argument to shift specifies the number of positions to shift (and the number of arguments to discard); the default is 1.

The following demo_shift script is called with three arguments. Double quotation marks around the arguments to echo preserve the spacing of the output. The program displays the arguments and shifts them repeatedly until there are no more arguments left to shift:

$ cat demo_shift
echo "arg1= $1    arg2= $2    arg3= $3"
shift
echo "arg1= $1    arg2= $2    arg3= $3"
shift
echo "arg1= $1    arg2= $2    arg3= $3"
shift
echo "arg1= $1    arg2= $2    arg3= $3"
shift
$ demo_shift alice helen jenny
arg1= alice    arg2= helen    arg3= jenny
arg1= helen    arg2= jenny    arg3=
arg1= jenny    arg2=    arg3=
arg1=     arg2=    arg3=


Repeatedly using shift is a convenient way to loop over all the command line arguments in shell scripts that expect an arbitrary number of arguments. See page 529 for a shell script that uses shift.
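A loop of that kind might look like the following sketch, which tests $# to decide when to stop. The list_args function is hypothetical; the same loop works at the top level of a script.

```bash
#!/bin/bash
# Sketch: use shift to loop over an arbitrary number of arguments.
# list_args is a hypothetical function.
list_args () {
    typeset n=1
    while [ $# -gt 0 ]
    do
        echo "$n: $1"        # always work on the current first argument
        shift                # discard it and promote the rest
        ((n=n+1))
    done
}

listing=$(list_args alice helen jenny)
echo "$listing"
```

Each pass handles $1 and then shifts, so the loop ends when $# reaches zero regardless of how many arguments were supplied.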

set: Initializes Command Line Arguments

When you call the set builtin with one or more arguments, it assigns the values of the arguments to the positional parameters, starting with $1 (not available in tcsh). The following script uses set to assign values to the positional parameters $1, $2, and $3:

$ cat set_it
set this is it
echo $3 $2 $1
$ set_it
it is this


Combining command substitution (page 327) with the set builtin is a convenient way to get standard output of a command in a form that can be easily manipulated in a shell script. The following script shows how to use date and set to provide the date in a useful format. The first command shows the output of date. Then cat displays the contents of the dateset script. The first command in this script uses command substitution to set the positional parameters to the output of the date utility. The next command, echo $*, displays all positional parameters resulting from the previous set. Subsequent commands display the values of parameters $1, $2, $3, and $6. The final command displays the date in a format you can use in a letter or report:

$ date
Wed Jan  5 23:39:18 PST 2005
$ cat dateset
set $(date)
echo $*
echo
echo "Argument 1: $1"
echo "Argument 2: $2"
echo "Argument 3: $3"
echo "Argument 6: $6"
echo
echo "$2 $3, $6"
$ dateset
Wed Jan 5 23:39:25 PST 2005

Argument 1: Wed
Argument 2: Jan
Argument 3: 5
Argument 6: 2005

Jan 5, 2005


You can also use the +format argument to date (page 701) to modify the format of its output.
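The technique used in dateset applies to any line of text, not only to the output of date. The following sketch substitutes a fixed sample line for $(date) so that the result is repeatable; the variable names are hypothetical. It also uses set --, which tells set that everything after the two hyphens is an argument even if it begins with a hyphen.

```bash
#!/bin/bash
# Sketch: split a line of text into positional parameters with set.
# A fixed sample line stands in for $(date) so the result is repeatable.
sample="Wed Jan  5 23:39:25 PST 2005"

set -- $sample           # unquoted on purpose so the shell splits it into words
nfields=$#
letter_date="$2 $3, $6"
echo "$nfields fields: $letter_date"
```

Leaving $sample unquoted is deliberate here: word splitting is what turns the single string into six positional parameters.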

When used without any arguments, set displays a list of the shell variables that are set, including user-created variables and keyword variables. Under bash, this list is the same as that displayed by declare and typeset when they are called without any arguments.

The set builtin also accepts options that let you customize the behavior of the shell (not available in tcsh). For more information refer to "set ±o: Turns Shell Features On and Off" on page 318.

Expanding Null and Unset Variables

The expression ${name} (or just $name if it is not ambiguous) expands to the value of the name variable. If name is null or not set, bash expands ${name} to a null string. The Bourne Again Shell provides the following alternatives to accepting the expanded null string as the value of the variable:

  • Use a default value for the variable.

  • Use a default value and assign that value to the variable.

  • Display an error.

You can choose one of these alternatives by using a modifier with the variable name. In addition, you can use set -o nounset (page 320) to cause bash to display an error and exit from a script whenever an unset variable is referenced.

:- Uses a Default Value

The :- modifier uses a default value in place of a null or unset variable while allowing a nonnull variable to represent itself:

${name:-default}


The shell interprets :- as "If name is null or unset, expand default and use the expanded value in place of name; else use name." The following command lists the contents of the directory named by the LIT variable. If LIT is null or unset, it lists the contents of /Users/alex/literature:

$ ls ${LIT:-/Users/alex/literature}


The default can itself have variable references that are expanded:

$ ls ${LIT:-$HOME/literature}


:= Assigns a Default Value

The :- modifier does not change the value of a variable. You may want to change the value of a null or unset variable to its default in a script, however. You can do so with the := modifier:

${name:=default}


The shell expands the expression ${name:=default} in the same manner as it expands ${name:-default} but also sets the value of name to the expanded value of default. If a script contains a line such as the following and LIT is unset or null at the time this line is executed, LIT is assigned the value /Users/alex/literature:

$ ls ${LIT:=/Users/alex/literature}


: builtin

Shell scripts frequently start with the : (colon) builtin followed on the same line by the := expansion modifier to set any variables that may be null or unset. The : builtin evaluates each token in the remainder of the command line but does not execute any commands. Without the leading colon (:), the shell evaluates and attempts to execute the "command" that results from the evaluation.

Use the following syntax to set a default for a null or unset variable in a shell script (there is a SPACE following the first colon):

: ${name:=default}


When a script needs a directory for temporary files and uses the value of TEMPDIR for the name of this directory, the following line makes TEMPDIR default to /tmp:

: ${TEMPDIR:=/tmp}


:? Displays an Error Message

Sometimes a script needs the value of a variable but you cannot supply a reasonable default at the time you write the script. If the variable is null or unset, the :? modifier causes the script to display an error message and terminate with an exit status of 1:

${name:?message}


You must quote message if it contains SPACEs. If you omit message, the shell displays the default error message (parameter null or not set). Interactive shells do not exit when you use :?. In the following command, TESTDIR is not set so the shell displays on standard error the expanded value of the string following :?. In this case the string includes command substitution for date, with the %T format being followed by the string error, variable not set.

$ cd ${TESTDIR:?$(date +%T) error, variable not set.}
bash: TESTDIR: 16:16:14 error, variable not set.
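The three modifiers can be compared side by side, as in the following sketch. The directory names and the variables litdir, message, and status are hypothetical. Because :? ends a noninteractive shell, the sketch runs that expansion in a subshell and captures the message and exit status instead of letting the whole script terminate.

```bash
#!/bin/bash
# Sketch: the :-, :=, and :? modifiers applied to unset variables.
# The directory names and helper variables are hypothetical.
unset LIT TEMPDIR TESTDIR

litdir=${LIT:-/tmp/literature}    # default used; LIT itself stays unset
: ${TEMPDIR:=/tmp}                # default used and assigned to TEMPDIR

# Run :? in a subshell so the error ends the subshell, not this script.
message=$( ( : "${TESTDIR:?variable not set}" ) 2>&1 )
status=$?

echo "litdir=$litdir LIT=[$LIT] TEMPDIR=$TEMPDIR status=$status"
echo "$message"
```

After these lines run, LIT is still empty, TEMPDIR holds /tmp, and status is nonzero because the subshell was terminated by the :? expansion.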


Builtin Commands

Builtin commands were introduced in Chapter 5. Commands that are built into a shell do not fork a new process when you execute them. This section discusses the type, read, exec, trap, kill, and getopts builtins and concludes with Table 13-6 on page 583, which lists many bash builtins. See Table 9-10 on page 373 for a list of tcsh builtins.

type: Displays Information About a Command

The type builtin (use which under tcsh) provides information about a command:

$ type cat echo who if lt
cat is hashed (/bin/cat)
echo is a shell builtin
who is /usr/bin/who
if is a shell keyword
lt is aliased to 'ls -ltrh | tail'


The preceding output shows the files that would be executed if you gave cat or who as a command. Because cat has already been called from the current shell, it is in the hash table (page 935) and type reports that cat is hashed. The output also shows that a call to echo runs the echo builtin, if is a keyword, and lt is an alias.
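In a script it is often more convenient to get a one-word answer than a sentence. Under bash, type accepts a -t option that prints a single word such as alias, keyword, function, builtin, or file. The following sketch wraps it in a hypothetical function named kind_of.

```bash
#!/bin/bash
# Sketch: classify a command name with the bash type -t option.
# kind_of is a hypothetical helper function.
kind_of () { type -t "$1"; }

echo_kind=$(kind_of echo)
if_kind=$(kind_of if)
self_kind=$(kind_of kind_of)
echo "echo: $echo_kind  if: $if_kind  kind_of: $self_kind"
```

In a noninteractive bash with no aliases defined, the three results are builtin, keyword, and function.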

read: Accepts User Input

When you begin writing shell scripts, you soon realize that one of the most common tasks for user-created variables is storing information a user enters in response to a prompt. Using read, scripts can accept input from the user and store that input in variables. See page 358 for information about reading user input under tcsh. The read builtin reads one line from standard input and assigns the words on the line to one or more variables:

$ cat read1
echo -n "Go ahead: "
read firstline
echo "You entered: $firstline"
$ read1
Go ahead: This is a line.
You entered: This is a line.


The first line of the read1 script uses echo to prompt you to enter a line of text. The -n option suppresses the following NEWLINE, allowing you to enter a line of text on the same line as the prompt. The second line reads the text into the variable firstline. The third line verifies the action of read by displaying the value of firstline. The variable is quoted (along with the text string) in this example because you, as the script writer, cannot anticipate which characters the user might enter in response to the prompt. Consider what would happen if the variable were not quoted and the user entered * in response to the prompt:

$ cat read1_no_quote
echo -n "Go ahead: "
read firstline
echo You entered: $firstline
$ read1_no_quote
Go ahead: *
You entered: read1 read1_no_quote script.1
$ ls
read1   read1_no_quote    script.1


The ls command lists the same words as the script, demonstrating that the shell expands the asterisk into a list of files in the working directory. When the variable $firstline is surrounded by double quotation marks, the shell does not expand the asterisk. Thus the read1 script behaves correctly:

$ read1
Go ahead: *
You entered: *


If you want the shell to interpret the special meanings of special characters, do not use quotation marks.

REPLY

The read builtin has features that can make it easier to use. When you do not specify a variable to receive read's input, bash puts the input into the variable named REPLY. You can use the -p option to prompt the user instead of using a separate echo command. The following read1a script performs exactly the same task as read1:

$ cat read1a
read -p "Go ahead: "
echo "You entered: $REPLY"


The read2 script prompts for a command line and reads the user's response into the variable cmd. The script then attempts to execute the command line that results from the expansion of the cmd variable:

$ cat read2
read -p "Enter a command: " cmd
$cmd
echo "Thanks"


In the following example, read2 reads a command line that calls the echo builtin. The shell executes the command and then displays Thanks. Next read2 reads a command line that executes the who utility:

$ read2
Enter a command: echo Please display this message.
Please display this message.
Thanks
$ read2
Enter a command: who
alex     console  Jun 10 15:20
alex     ttyp1    Jun 16 15:16 (bravo.example.com)
Thanks


If cmd does not expand into a valid command line, the shell issues an error message:

$ read2
Enter a command: xxx
./read2: line 2: xxx: command not found
Thanks


The read3 script reads values into three variables. The read builtin assigns one word (a sequence of nonblank characters) to each variable:

$ cat read3
read -p "Enter something: " word1 word2 word3
echo "Word 1 is: $word1"
echo "Word 2 is: $word2"
echo "Word 3 is: $word3"
$ read3
Enter something: this is something
Word 1 is: this
Word 2 is: is
Word 3 is: something


When you enter more words than read has variables, read assigns one word to each variable, with all leftover words going to the last variable. Both read1 and read2 assigned the first word and all leftover words to the one variable they each had to work with. In the following example, read accepts five words into three variables, assigning the first word to the first variable, the second word to the second variable, and the third through fifth words to the third variable:

$ read3
Enter something: this is something else, really.
Word 1 is:  this
Word 2 is:  is
Word 3 is:  something else, really.
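
The same leftover-words rule applies when the words are separated by something other than whitespace. The following sketch goes beyond the examples above and rests on two assumptions: that the IFS variable, set for the read command only, controls how read splits the line, and that the -r option (which keeps backslashes literal) is available. The record itself is made up, and a here-string (<<<) stands in for keyboard input.

```shell
#!/bin/bash
# A made-up, colon-separated record used only for illustration.
record="alex:501:/Users/alex:/bin/bash:extra:fields"

# IFS=: applies to this read command only, so the line is split on colons.
# The here-string supplies the record as standard input.
IFS=: read -r user uid home rest <<< "$record"

echo "user=$user uid=$uid home=$home"
# The last variable receives everything that is left over, separators included.
echo "rest=$rest"
```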


Table 13-4 lists some of the options supported by the read builtin.

Table 13-4. read options

Option                          Function

-a aname (array)                Assigns each word of input to an element of array aname.

-d delim (delimiter)            Uses delim to terminate the input instead of NEWLINE.

-e (Readline)                   If input is coming from a keyboard, uses the Readline Library (page 304) to get input.

-n num (number of characters)   Reads num characters and returns. As soon as the user types num characters, read returns; there is no need to press RETURN.

-p prompt (prompt)              Displays prompt on standard error without a terminating NEWLINE before reading input. Displays prompt only when input comes from the keyboard.

-s (silent)                     Does not echo characters.

-un (file descriptor)           Uses the integer n as the file descriptor that read takes its input from. The command read -u4 arg1 arg2 is equivalent to read arg1 arg2 <&4. See "File Descriptors" (page 555) for a discussion of redirection and file descriptors.
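
A few of the options in Table 13-4 can be tried without a keyboard by feeding read from a here-string. This is only a sketch: the input strings and variable names are invented for illustration, and the -r option (keep backslashes literal) is an addition that does not appear in the table.

```shell
#!/bin/bash
# -a: split one line into the elements of an array.
read -r -a words <<< "this is something else"
echo "count=${#words[@]} third=${words[2]}"

# -d: stop reading at a delimiter other than NEWLINE (here, a colon).
read -r -d : field <<< "alpha:beta:gamma"
echo "field=$field"

# -n: return after a fixed number of characters.
read -r -n 3 prefix <<< "abcdef"
echo "prefix=$prefix"
```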


The read builtin returns an exit status of 0 if it successfully reads any data. It has a nonzero exit status when it reaches the EOF (end of file). The following example runs a while loop from the command line. It takes its input from the names file and terminates after reading the last line from names.

$ cat names
Alice Jones
Robert Smith
Alice Paulson
John Q. Public
$ while read first rest
> do
> echo $rest, $first
> done < names
Jones, Alice
Smith, Robert
Paulson, Alice
Q. Public, John
$


The placement of the redirection symbol (<) for the while structure is critical. It is important that you place the redirection symbol at the done statement and not at the call to read.

Optional

Each time you redirect input, the shell opens the input file and repositions the read pointer at the start of the file:

$ read line1 < names; echo $line1; read line2 < names; echo $line2
Alice Jones
Alice Jones


Here each read opens names and starts at the beginning of the names file. In the following example, names is opened once, as standard input of the subshell created by the parentheses. Each read then reads successive lines of standard input.

$ (read line1; echo $line1; read line2; echo $line2) < names
Alice Jones
Robert Smith


Another way to get the same effect is to open the input file with exec and hold it open (refer to "File Descriptors" on page 555):

$ exec 3< names
$ read -u3 line1; echo $line1; read -u3 line2; echo $line2
Alice Jones
Robert Smith
$ exec 3<&-
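
Holding files open with exec also makes it possible to read two files in step, one line from each per pass. The following sketch assumes two small files created on the spot in a temporary directory; the filenames first and last, the directory, and the array pairs are all hypothetical.

```shell
#!/bin/bash
# Create two throwaway input files (hypothetical names) in a temporary directory.
dir=$(mktemp -d)
printf 'Alice\nRobert\n' > "$dir/first"
printf 'Jones\nSmith\n'  > "$dir/last"

# Open each file once and hold it open on its own file descriptor.
exec 3< "$dir/first" 4< "$dir/last"

pairs=()
# Each pass reads the next line from each file; the loop ends at the first EOF.
while read -u3 first && read -u4 last; do
    pairs+=("$last, $first")
done

# Close both descriptors and clean up.
exec 3<&- 4<&-
rm -r "$dir"

printf '%s\n' "${pairs[@]}"
```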



exec: Executes a Command

The exec builtin (not available in tcsh) has two primary purposes: to run a command without creating a new process and to redirect a file descriptor (including standard input, output, or error) of a shell script from within the script (page 555). When the shell executes a command that is not built into the shell, it typically creates a new process. The new process inherits environment (global or exported) variables from its parent but does not inherit variables that are not exported by the parent. (For more information refer to "Locality of Variables" on page 560.) In contrast, exec executes a command in place of (overlays) the current process.

exec versus .(dot)

Insofar as exec runs a command in the environment of the original process, it is similar to the .(dot) command (page 261). However, unlike the . command, which can run only shell scripts, exec can run both scripts and compiled programs. Also, whereas the . command returns control to the original script when it finishes running, exec does not. Finally, the . command gives the new program access to local variables, whereas exec does not.

exec runs a command

The exec builtin used for running a command has the following syntax:

exec command arguments


exec does not return control

Because the shell does not create a new process when you use exec, the command runs more quickly. However, because exec does not return control to the original program, it can be used only as the last command that you want to run in a script. The following script shows that control is not returned to the script:

$ cat exec_demo
who
exec date
echo "This line is never displayed."
$ exec_demo
jenny    ttyp1    May 30 7:05 (bravo.example.com)
hls      ttyp2    May 30 6:59
Mon May 30 11:42:56 PDT 2005
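
One way to see that exec overlays the current process rather than starting a new one is to compare process IDs before and after the call. In this sketch (an illustration, not one of the original scripts) an outer bash displays its PID and then uses exec to run a second bash, which displays its own PID.

```shell
#!/bin/bash
# The outer bash prints its PID, then exec overlays it with a second bash that
# prints its own PID. The final echo in the outer command list is never reached.
output=$(bash -c 'echo $$; exec bash -c "echo \$\$"; echo "never displayed"')

# Read the two lines of output into separate variables.
{ read -r first; read -r second; } <<< "$output"

if [ "$first" = "$second" ]; then
    echo "PID $first was kept across exec"
else
    echo "PIDs differ: $first and $second"
fi
```

Because the second shell runs in place of the first, both report the same PID.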


The next example, a modified version of the out script (page 529), uses exec to execute the final command the script runs. Because out runs either cat or less and then terminates, the new version, named out2, uses exec with both cat and less:

$ cat out2
if [ $# -eq 0 ]
    then
        echo "Usage: out2 [-v] filenames" 1>&2
        exit 1
fi
if [ "$1" = "-v" ]
    then
        shift
        exec less "$@"
    else
        exec cat -- "$@"
fi


exec redirects input and output

The second major use of exec is to redirect a file descriptor (including standard input, output, or error) from within a script. The next command causes all subsequent input to a script that would have come from standard input to come from the file named infile:

exec < infile


Similarly the following command redirects standard output and standard error to outfile and errfile, respectively:

exec > outfile 2> errfile


When you use exec in this manner, the current process is not replaced with a new process, and exec can be followed by other commands in the script.

/dev/tty

When you redirect the output from a script to a file, you must make sure that the user sees any prompts the script displays. The /dev/tty device is a pseudonym for the screen the user is working on; you can use this device to refer to the user's screen without knowing which device it is. (The tty utility displays the name of the device you are using.) By redirecting the output from a script to /dev/tty, you ensure that prompts and messages go to the user's terminal, regardless of which terminal the user is logged in on. Messages sent to /dev/tty are also not diverted if standard output and standard error from the script are redirected.

The to_screen1 script sends output to three places: standard output, standard error, and the user's screen. When it is run with standard output and standard error redirected, to_screen1 still displays the message sent to /dev/tty on the user's screen. The out and err files hold the output sent to standard output and standard error.

$ cat to_screen1
echo "message to standard output"
echo "message to standard error" 1>&2
echo "message to the user" > /dev/tty
$ to_screen1 > out 2> err
message to the user
$ cat out
message to standard output
$ cat err
message to standard error


The following command redirects the output from a script to the user's screen:

exec > /dev/tty


Putting this command at the beginning of the previous script changes where the output goes. In to_screen2, exec redirects standard output to the user's screen so the >/dev/tty is superfluous. Following the exec command, all output sent to standard output goes to /dev/tty (the screen). Output to standard error is not affected.

$ cat to_screen2
exec > /dev/tty
echo "message to standard output"
echo "message to standard error" 1>&2
echo "message to the user" > /dev/tty
$ to_screen2 > out 2> err
message to standard output
message to the user


One disadvantage of using exec to redirect the output to /dev/tty is that all subsequent output is redirected unless you use exec again in the script.
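
A common way to "use exec again" is to save the original standard output on a spare file descriptor before redirecting and to restore it afterward. The following sketch assumes that file descriptor 3 is free and uses a temporary file (created with mktemp) in place of /dev/tty so that it can run without a terminal; the variable names are hypothetical.

```shell
#!/bin/bash
log=$(mktemp)          # temporary file standing in for the redirection target

exec 3>&1              # save the current standard output as file descriptor 3
exec > "$log"          # from here on, standard output goes to the file
echo "message to the log"
exec 1>&3 3>&-         # restore standard output and close the saved copy

echo "message to the original standard output"
logged=$(cat "$log")
rm -f "$log"
echo "the log held: $logged"
```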

You can also redirect the input to read (standard input) so that it comes from /dev/tty (the keyboard):

read name < /dev/tty


or

exec < /dev/tty


trap: Catches a Signal

A signal is a report to a process about a condition. Mac OS X uses signals to report interrupts generated by the user (for example, pressing the interrupt key) as well as bad system calls, broken pipes, illegal instructions, and other conditions. The trap builtin (tcsh uses onintr) catches, or traps, one or more signals, allowing you to direct the actions a script takes when it receives a specified signal.

This discussion covers six signals that are significant when you work with shell scripts. Table 13-5 lists these signals, the signal numbers that systems often ascribe to them, and the conditions that usually generate each signal. Give the command kill -l, trap -l, or man 7 signal for a list of signal names.

Table 13-5. Signals

Type                   Name               Number   Generating condition

Not a real signal      EXIT               0        Exit because of exit command or reaching the end of the program (not an actual signal but useful in trap)

Hang up                SIGHUP or HUP      1        Disconnect the line

Terminal interrupt     SIGINT or INT      2        Press the interrupt key (usually CONTROL-C)

Quit                   SIGQUIT or QUIT    3        Press the quit key (usually CONTROL-SHIFT-| or CONTROL-SHIFT-\)

Kill                   SIGKILL or KILL    9        The kill command with the -9 option (cannot be trapped; use only as a last resort)

Software termination   SIGTERM or TERM    15       Default of the kill command

Stop                   SIGTSTP or TSTP    20       Press the suspend key (usually CONTROL-Z)

Debug                  DEBUG              (none)   Executes commands specified in the trap statement after each command (not an actual signal but useful in trap)

Error                  ERR                (none)   Executes commands specified in the trap statement after each command that returns a nonzero exit status (not an actual signal but useful in trap)


When it traps a signal, a script takes whatever action you specify: It can remove files or finish any other processing as needed, display a message, terminate execution immediately, or ignore the signal. If you do not use trap in a script, any of the six actual signals listed in Table 13-5 (not EXIT, DEBUG, or ERR) terminates the script. Because a process cannot trap a KILL signal, you can use kill -KILL (or kill -9) as a last resort to terminate a script or any other process. (See page 580 for more information on kill.)

The trap command has the following syntax:

trap ['commands'] [signal]


The optional commands part specifies the commands that the shell executes when it catches one of the signals specified by signal. The signal can be a signal name or number (for example, INT or 2). If commands is not present, trap resets the trap to its initial condition, which is usually to exit from the script.

The trap builtin does not require single quotation marks around commands as shown in the preceding syntax, but it is a good practice to use them. The single quotation marks cause shell variables within the commands to be expanded when the signal occurs, not when the shell evaluates the arguments to trap. Even if you do not use any shell variables in the commands, you need to enclose any command that takes arguments within either single or double quotation marks. Quoting the commands causes the shell to pass to trap the entire command as a single argument.
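
The difference between the two kinds of quotation marks can be seen by changing a variable after the trap is set. The following sketch is only an illustration: it uses the EXIT trap inside command substitution, so each trap fires when its subshell finishes, and the variable names are hypothetical.

```shell
#!/bin/bash
# Double quotation marks: $msg is expanded when trap is called, so the trap
# command is fixed as "echo before".
double=$( msg=before; trap "echo $msg" EXIT; msg=after )

# Single quotation marks: $msg is expanded when the trap runs, after the
# variable has been changed.
single=$( msg=before; trap 'echo $msg' EXIT; msg=after )

echo "double quotation marks displayed: $double"
echo "single quotation marks displayed: $single"
```

The first trap displays before and the second displays after.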

After executing the commands, the shell resumes executing the script where it left off. If you want trap to prevent a script from exiting when it receives a signal but not to run any commands explicitly, you can specify a null (empty) commands string, as shown in the locktty script (page 543). The following command traps signal number 15, after which the script continues.

trap '' 15


The following script demonstrates how the trap builtin can catch the terminal interrupt signal (2). You can use SIGINT, INT, or 2 to specify this signal. The script returns an exit status of 1:

$ cat inter
#!/bin/bash
trap 'echo PROGRAM INTERRUPTED; exit 1' INT
while true
do
    echo "Program running."
    sleep 1
done
$ inter
Program running.
Program running.
Program running.
CONTROL-C
^CPROGRAM INTERRUPTED
$


:(null) builtin

The second line of inter sets up a trap for the terminal interrupt signal using INT. When trap catches the signal, the shell executes the two commands between the single quotation marks in the trap command. The echo builtin displays the message PROGRAM INTERRUPTED, exit terminates the shell running the script, and the parent shell displays a prompt. If exit were not there, the shell would return control to the while loop after displaying the message. The while loop repeats continuously until the script receives a signal because the true utility always returns a true exit status. In place of true you can use the : (null) builtin, which is written as a colon and always returns a 0 (true) status.

The trap builtin frequently removes temporary files when a script is terminated prematurely so that the files are not left to clutter the filesystem. The following shell script, named addbanner, uses two traps to remove a temporary file when the script terminates normally or owing to a hangup, software interrupt, quit, or software termination signal:

$ cat addbanner
#!/bin/bash
script=$(basename $0)

if [ ! -r "$HOME/banner" ]
    then
        echo "$script: need readable $HOME/banner file" 1>&2
        exit 1
fi

trap 'exit 1' 1 2 3 15
trap 'rm /tmp/$$.$script 2> /dev/null' 0

for file
do
    if [ -r "$file" -a -w "$file" ]
        then
            cat $HOME/banner $file > /tmp/$$.$script
            cp /tmp/$$.$script $file
            echo "$script: banner added to $file" 1>&2
        else
            echo "$script: need read and write permission for $file" 1>&2
        fi
done


When called with one or more filename arguments, addbanner loops through the files, adding a header to the top of each. This script is useful when you use a standard format at the top of your documents, such as a standard layout for memos, or when you want to add a standard header to shell scripts. The header is kept in a file named ~/banner. Because addbanner uses the HOME variable, which contains the pathname of the user's home directory, the script can be used by several users without modification. If Alex had written the script with /Users/alex in place of $HOME and then given the script to Jenny, either she would have had to change it or addbanner would have used Alex's banner file when Jenny ran it (assuming Jenny had read permission for the file).

The first trap in addbanner causes it to exit with a status of 1 when it receives a hangup, software interrupt (terminal interrupt or quit signal), or software termination signal. The second trap uses a 0 in place of signal-number, which causes trap to execute its command argument whenever the script exits because it receives an exit command or reaches its end. Together these traps remove a temporary file whether the script terminates normally or prematurely. Standard error of the second trap is sent to /dev/null for cases in which trap attempts to remove a nonexistent temporary file. In those cases rm sends an error message to standard error; because standard error is redirected, the user does not see this message.

See page 543 for another example that uses trap.

kill: Aborts a Process

The kill builtin sends a signal to a process or job. The kill command has the following syntax:

kill [signal] PID


where signal is the signal name or number (for example, INT or 2) and PID is the process identification number of the process that is to receive the signal. You can specify a job number (page 131) as %n in place of PID. If you omit signal, kill sends a TERM (software termination, number 15) signal. For more information on signal names and numbers see Table 13-5 on page 577.

The following command sends the TERM signal to job number 1:

$ kill -TERM %1


Because TERM is the default signal for kill, you can also give this command as kill %1. Give the command kill -l (lowercase "l") to display a list of signal names.

A program that is interrupted often leaves matters in an unpredictable state: Temporary files may be left behind (when they are normally removed), and permissions may be changed. A well-written application traps, or detects, signals and cleans up before exiting. Most carefully written applications trap the INT, QUIT, and TERM signals.

To terminate a program, first try INT (press CONTROL-C, if the job is in the foreground). Because an application can be written to ignore these signals, you may need to use the KILL signal, which cannot be trapped or ignored; it is a "sure kill." Refer to page 761 for more information on kill. See also the related utility killall (page 763).
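
A script can also send a signal to a background process and then check how that process ended. The following sketch uses sleep as a stand-in for a long-running job; it relies on the bash convention that wait reports 128 plus the signal number for a process terminated by a signal, and on kill -l translating a signal number to its name. The variable names are hypothetical.

```shell
#!/bin/bash
sleep 30 &                   # a stand-in for a long-running background job
pid=$!

kill -TERM "$pid"            # the same as: kill "$pid"

status=0
wait "$pid" || status=$?     # wait returns the exit status of the job

# For a process ended by a signal, the status is 128 plus the signal number.
signum=$((status - 128))
signame=$(kill -l "$signum")
echo "exit status $status: terminated by signal $signum ($signame)"
```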

getopts: Parses Options

The getopts builtin (not available in tcsh) parses command line arguments, thereby making it easier to write programs that follow the Mac OS X argument conventions. The syntax for getopts is

getopts optstring varname [arg...]


where optstring is a list of the valid option letters, varname is the variable that receives the options one at a time, and arg is the optional list of parameters to be processed. If arg is not present, getopts processes the command line arguments. If optstring starts with a colon (:), the script takes care of generating error messages; otherwise, getopts generates error messages.

The getopts builtin uses the OPTIND (option index) and OPTARG (option argument) variables to store option-related values. When a shell script starts, the value of OPTIND is 1. Each time getopts locates an argument, it increments OPTIND to the index of the next option to be processed. If the option takes an argument, bash assigns the value of the argument to OPTARG.

To indicate that an option takes an argument, follow the corresponding letter in optstring with a colon (:). The option string dxo:lt:r indicates that getopts should search for -d, -x, -o, -l, -t, and -r options and that the -o and -t options take arguments.

Using getopts as the test-command in a while control structure allows you to loop over the options one at a time. The getopts builtin checks the option list for options that are in optstring. Each time through the loop, getopts stores the option letter it finds in varname.

Suppose that you want to write a program that can take three options:

  1. A -b option indicates that the program should ignore whitespace at the start of input lines.

  2. A -t option followed by the name of a directory indicates that the program should use that directory for temporary files. Otherwise, it should use /tmp.

  3. A -u option indicates that the program should translate all its output to uppercase.

In addition, the program should ignore all other options and end option processing when it encounters two hyphens (--).

The problem is to write the portion of the program that determines which options the user has supplied. The following solution does not use getopts.

SKIPBLANKS=
TMPDIR=/tmp
CASE=lower
while [[ "$1" = -* ]] # [[ = ]] does pattern match
do
    case $1 in
        -b)    SKIPBLANKS=TRUE ;;
        -t)    if [ -d "$2" ]
                   then
                   TMPDIR=$2
                   shift
               else
                   echo "$0: -t takes a directory argument." >&2
                   exit 1
               fi ;;
        -u)    CASE=upper ;;
        --)    break ;; # Stop processing options
        *)     echo "$0: Invalid option $1 ignored." >&2 ;;
        esac
    shift
done


This program fragment uses a loop to check and shift arguments while the argument is not --. As long as the argument is not two hyphens, the program continues to loop through a case statement that checks for possible options. The -- case label breaks out of the while loop. The * case label recognizes any option; it appears as the last case label to catch any unknown options, displays an error message, and allows processing to continue. On each pass through the loop, the program does a shift to get to the next argument. If an option takes an argument, the program does an extra shift to get past that argument.

The following program fragment processes the same options, but uses getopts:

SKIPBLANKS=
TMPDIR=/tmp
CASE=lower
while getopts :bt:u arg
do
    case $arg in
        b)     SKIPBLANKS=TRUE ;;
        t)     if [ -d "$OPTARG" ]
                   then
                   TMPDIR=$OPTARG
               else
                   echo "$0: $OPTARG is not a directory." >&2
                   exit 1
               fi ;;
        u)     CASE=upper ;;
        :)     echo "$0: Must supply an argument to -$OPTARG." >&2
               exit 1 ;;
        \?)    echo "Invalid option -$OPTARG ignored." >&2 ;;
        esac
done


In this version of the code, the while structure evaluates the getopts builtin each time it comes to the top of the loop. The getopts builtin uses the OPTIND variable to keep track of the index of the argument it is to process the next time it is called. There is no need to call shift in this example.

In the getopts version of the script the case patterns do not start with a hyphen because the value of arg is just the option letter (getopts strips off the hyphen). Also, getopts recognizes -- as the end of the options, so you do not have to specify it explicitly as in the case statement in the first example.

Because you tell getopts which options are valid and which require arguments, it can detect errors in the command line and handle them in two ways. This example uses a leading colon in optstring to specify that you check for and handle errors in your code; when getopts finds an invalid option, it sets varname to ? and OPTARG to the option letter. When it finds an option that is missing an argument, getopts sets varname to : and OPTARG to the option lacking an argument.

The \? case pattern specifies the action to take when getopts detects an invalid option. The : case pattern specifies the action to take when getopts detects a missing option argument. In both cases getopts does not write any error message; it leaves that task to you.

If you omit the leading colon from optstring, both an invalid option and a missing option argument cause varname to be assigned the string ?. OPTARG is not set and getopts writes its own diagnostic message to standard error. Generally this method is less desirable because you have less control over what the user sees when an error is made.

Using getopts will not necessarily make your programs shorter. Its principal advantages are that it provides a uniform programming interface and it enforces standard option handling.
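
After the options have been processed, a program usually needs the arguments that remain. The following sketch wraps the getopts loop in a function and uses OPTIND to discard the options that were parsed. It is a sketch under stated assumptions, not the only arrangement: the function name parse_opts, the array remaining, and the lowercase variable names are hypothetical (lowercase is used here so that the sketch does not change the TMPDIR environment variable), and the check that the -t argument is a directory is omitted to keep the example short.

```shell
#!/bin/bash
# parse_opts is a hypothetical wrapper around the getopts loop.
parse_opts() {
    local OPTIND=1 arg          # start parsing at the function's first argument
    skipblanks= tmpdir=/tmp case_mode=lower
    while getopts :bt:u arg; do
        case $arg in
            b)  skipblanks=TRUE ;;
            t)  tmpdir=$OPTARG ;;
            u)  case_mode=upper ;;
            :)  echo "Must supply an argument to -$OPTARG." >&2
                return 1 ;;
            \?) echo "Invalid option -$OPTARG ignored." >&2 ;;
        esac
    done
    # OPTIND is the index of the first argument that is not an option.
    shift $((OPTIND - 1))
    remaining=("$@")
}

parse_opts -b -t /var/tmp -u -- file1 file2
echo "skipblanks=$skipblanks tmpdir=$tmpdir case_mode=$case_mode"
echo "remaining: ${remaining[*]}"
```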

A Partial List of Builtins

Table 13-6 lists some of the bash builtins. See "Listing bash builtins" on page 138 for instructions on how to display complete lists of builtins.

Table 13-6. bash builtins

Builtin     Function

:           Returns 0 or true (the null builtin; page 579)

. (dot)     Executes a shell script as part of the current process (page 261)

bg          Puts a suspended job in the background (page 274)

break       Exits from a looping control structure (page 543)

cd          Changes to another working directory (page 82)

continue    Starts with the next iteration of a looping control structure (page 543)

echo        Displays its arguments (page 52)

eval        Scans and evaluates the command line (page 374)

exec        Executes a shell script or program in place of the current process (page 574)

exit        Exits from the current shell (usually the same as CONTROL-D from an interactive shell; page 564)

export      Places the value of a variable in the calling environment (makes it global; page 560)

fg          Brings a job from the background into the foreground (page 273)

getopts     Parses arguments to a shell script (page 581)

jobs        Displays list of background jobs (page 273)

kill        Sends a signal to a process or job (page 761)

pwd         Displays the name of the working directory (page 77)

read        Reads a line from standard input (page 571)

readonly    Declares a variable to be readonly (page 282)

set         Sets shell flags or command line argument variables; with no argument, lists all variables (pages 318, 353, and 568)

shift       Promotes each command line argument (page 567)

test        Compares arguments (pages 525 and 871)

times       Displays total times for the current shell and its children

trap        Traps a signal (page 577)

type        Displays how each argument would be interpreted as a command (page 570)

umask       Returns the value of the file-creation mask (page 883)

unset       Removes a variable or function (page 282)

wait        Waits for a background process to terminate (page 377)


Expressions

An expression is composed of constants, variables, and operators that can be processed to return a value. This section covers arithmetic, logical, and conditional expressions as well as operators. Table 13-8 on page 588 lists the bash operators.

Arithmetic Evaluation

The Bourne Again Shell can perform arithmetic assignments and evaluate many different types of arithmetic expressions, all using integers. The shell performs arithmetic assignments in a number of ways. One is with arguments to the let builtin:

$ let "VALUE=VALUE * 10 + NEW"


In the preceding example, the variables VALUE and NEW contain integer values. Within a let statement you do not need to use dollar signs ($) in front of variable names. Double quotation marks must enclose a single argument, or expression, that contains SPACEs. Because most expressions contain SPACEs and need to be quoted, bash accepts ((expression)) as a synonym for let "expression", obviating the need for both quotation marks and dollar signs:

$ ((VALUE=VALUE * 10 + NEW))


You can use either form wherever a command is allowed and can remove the SPACEs if you like. In the following example, the asterisk (*) does not need to be quoted because the shell does not perform pathname expansion on the right side of an assignment (page 280):

$ let VALUE=VALUE*10+NEW


Because each argument to let is evaluated as a separate expression, you can assign values to more than one variable on a single line:

$ let "COUNT = COUNT + 1" VALUE=VALUE*10+NEW


You need to use commas to separate multiple assignments within a set of double parentheses:

$ ((COUNT = COUNT + 1, VALUE=VALUE*10+NEW))


Tip: Arithmetic evaluation versus arithmetic expansion

Arithmetic evaluation differs from arithmetic expansion. As explained on page 325, arithmetic expansion uses the syntax $((expression)), evaluates expression, and replaces $((expression)) with the result. You can use arithmetic expansion to display the value of an expression or to assign that value to a variable.

Arithmetic evaluation uses the let expression or ((expression)) syntax, evaluates expression, and returns a status code. You can use arithmetic evaluation to perform a logical comparison or an assignment.
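
The following sketch puts the two forms side by side. The starting values are arbitrary and the variable size is hypothetical; the comments give the values that result from those starting values.

```shell
#!/bin/bash
VALUE=4 NEW=2 COUNT=0

# Arithmetic evaluation: both forms perform assignments and return a status.
let "VALUE = VALUE * 10 + NEW"                    # VALUE is now 42
((COUNT = COUNT + 1, VALUE = VALUE * 10 + NEW))   # COUNT is 1, VALUE is 422

# Arithmetic expansion: the shell replaces $((...)) with the result.
echo "VALUE=$VALUE COUNT=$COUNT remainder=$((VALUE % 100))"

# Because evaluation returns a status, it can control an if structure.
if ((VALUE > 100)); then
    size=large
else
    size=small
fi
echo "size=$size"
```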


Logical expressions

You can use the ((expression)) syntax for logical expressions, although that task is frequently left to [[expression ]]. The next example expands the age_check script (page 326) to include logical arithmetic evaluation in addition to arithmetic expansion.

$ cat age2
#!/bin/bash
echo -n "How old are you? "
read age
if ((30 < age && age < 60)); then
        echo "Wow, in $((60-age)) years, you'll be 60!"
    else
        echo "You are too young or too old to play."
fi
$ age2
How old are you? 25
You are too young or too old to play.


The test-statement for the if structure evaluates two logical comparisons joined by a Boolean AND and returns 0 (true) if they are both true or 1 (false) otherwise.

Logical Evaluation (Conditional Expressions)

The syntax of a conditional expression is

[[ expression ]]


where expression is a Boolean (logical) expression. You must precede a variable name with a dollar sign ($) within expression. The result of executing this builtin, like the test builtin, is a return status. The conditions allowed within the brackets are almost a superset of those accepted by test (page 871). Where the test builtin uses -a as a Boolean AND operator, [[ expression ]] uses &&. Similarly, where test uses -o as a Boolean OR operator, [[ expression ]] uses ||.

You can replace the line that tests age in the age2 script (preceding) with the following conditional expression. You must surround the [[ and ]] tokens with whitespace or a command terminator, and place dollar signs before the variables. Be aware that within [[ ]] the < and > operators compare their operands as strings, so this form gives the expected result only when the values being compared have the same number of digits:

if [[ 30 < $age && $age < 60 ]]; then


You can also use test's relational operators -gt, -ge, -lt, -le, -eq, and -ne, which compare their operands as integers:

if  [[ 30 -lt $age && $age -lt 60 ]]; then
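
The difference between comparing strings and comparing integers shows up as soon as the two values have different numbers of digits. In the following sketch the age of 100 and the result variables are chosen only for illustration.

```shell
#!/bin/bash
age=100

# Within [[ ]], < compares strings: "30" sorts after "100" because 3 follows 1.
if [[ 30 < $age ]]; then string_result=true; else string_result=false; fi

# -lt compares integers, as does < within (( )).
if [[ 30 -lt $age ]]; then numeric_result=true; else numeric_result=false; fi
if ((30 < age)); then arith_result=true; else arith_result=false; fi

echo "[[ 30 < 100 ]]   is $string_result"
echo "[[ 30 -lt 100 ]] is $numeric_result"
echo "((30 < 100))     is $arith_result"
```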


String comparisons

The test builtin tests whether strings are equal or unequal. The [[ expression ]] syntax adds comparison tests for string operators. The > and < operators compare strings for order (for example, "aa" < "bbb"). The = operator tests for pattern match, not just equality: [[ string = pattern ]] is true if string matches pattern. This operator is not symmetrical; the pattern must appear on the right side of the equal sign. For example, [[ artist = a* ]] is true (= 0), whereas [[ a* = artist ]] is false (= 1):

$ [[ artist = a* ]]
$ echo $?
0
$ [[ a* = artist ]]
$ echo $?
1


The next example uses a command list that starts with a compound condition. The condition tests that the directory bin and the file src/myscript.bash exist. If this is true, cp copies src/myscript.bash to bin/myscript. If the copy succeeds, chmod makes myscript executable. If any of these steps fails, echo displays a message.

$ [[ -d bin && -f src/myscript.bash ]] && cp src/myscript.bash \
bin/myscript && chmod +x bin/myscript || echo "Cannot make \
executable version of myscript"


String Pattern Matching

The Bourne Again Shell provides string pattern-matching operators that can manipulate pathnames and other strings. These operators can delete from strings prefixes or suffixes that match patterns. The four operators are listed in Table 13-7.

Table 13-7. String operators

Operator    Function

#           Removes minimal matching prefixes
##          Removes maximal matching prefixes
%           Removes minimal matching suffixes
%%          Removes maximal matching suffixes

The syntax for these operators is

${varname op pattern}


where op is one of the operators listed in Table 13-7 and pattern is a match pattern similar to that used for filename generation. These operators are commonly used to manipulate pathnames so as to extract or remove components or to change suffixes:

$ SOURCEFILE=/usr/local/src/prog.c
$ echo ${SOURCEFILE#/*/}
local/src/prog.c
$ echo ${SOURCEFILE##/*/}
prog.c
$ echo ${SOURCEFILE%/*}
/usr/local/src
$ echo ${SOURCEFILE%%/*}

$ echo ${SOURCEFILE%.c}
/usr/local/src/prog
$ CHOPFIRST=${SOURCEFILE#/*/}
$ echo $CHOPFIRST
local/src/prog.c
$ NEXT=${CHOPFIRST%%/*}
$ echo $NEXT
local


Here the string-length operator, ${#name}, is replaced by the number of characters in the value of name:

$ echo $SOURCEFILE
/usr/local/src/prog.c
$ echo ${#SOURCEFILE}
21
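The four pattern operators and the string-length operator are often combined to take a pathname apart. The following sketch uses a made-up pathname and hypothetical variable names (dir, file, stem, ext, renamed); it is one way to split a name, not the only one.

```shell
#!/bin/bash
# Hypothetical pathname used only for illustration.
path=/usr/local/src/prog.tar.gz

dir=${path%/*}            # shortest suffix matching /* removed: directory part
file=${path##*/}          # longest prefix matching */ removed: filename part
stem=${file%%.*}          # longest suffix matching .* removed: name without extensions
ext=${file#*.}            # shortest prefix matching *. removed: text after the first dot
renamed=${path%.gz}.bz2   # strip the final suffix and append another

echo "dir=$dir file=$file stem=$stem ext=$ext"
echo "renamed=$renamed"
echo "length of filename: ${#file}"
```

Because these expansions are performed by the shell itself, they avoid starting a separate process for each pathname, which can matter inside a loop.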


Operators

Arithmetic expansion and arithmetic evaluation use the same syntax, precedence, and associativity of expressions as the C language. Table 13-8 lists operators in order of decreasing precedence (priority of evaluation); each group of operators has equal precedence. Within an expression you can use parentheses to change the order of evaluation.

Table 13-8. Operators

Type of operator/operator                         Function

Post
  var++                                           Postincrement
  var--                                           Postdecrement
Pre
  ++var                                           Preincrement
  --var                                           Predecrement
Unary
  -                                               Unary minus
  +                                               Unary plus
Negation
  !                                               Boolean NOT (logical negation)
  ~                                               Complement (bitwise negation)
Exponentiation
  **                                              Exponent
Multiplication, division, remainder
  *                                               Multiplication
  /                                               Division
  %                                               Remainder
Addition, subtraction
  -                                               Subtraction
  +                                               Addition
Bitwise shifts
  <<                                              Left bitwise shift
  >>                                              Right bitwise shift
Comparison
  <=                                              Less than or equal
  >=                                              Greater than or equal
  <                                               Less than
  >                                               Greater than
Equality, inequality
  ==                                              Equality
  !=                                              Inequality
Bitwise
  &                                               Bitwise AND
  ^                                               Bitwise XOR (exclusive OR)
  |                                               Bitwise OR
Boolean (logical)
  &&                                              Boolean AND
  ||                                              Boolean OR
Conditional evaluation
  ?:                                              Ternary operator
Assignment
  =, *=, /=, %=, +=, -=, <<=, >>=, &=, ^=, |=     Assignment
Comma
  ,                                               Comma


Pipe

The pipe token has higher precedence than operators. You can use pipes anywhere in a command that you can use simple commands. For example, the command line

$ cmd1 | cmd2 || cmd3 | cmd4 && cmd5 | cmd6


is interpreted as if you had typed

$ ((cmd1 | cmd2) || (cmd3 | cmd4)) && (cmd5 | cmd6)


Tip: Do not rely on rules of precedence: use parentheses

Do not rely on the precedence rules when you use compound commands. Instead, use parentheses to explicitly state the order in which you want the shell to interpret the commands.


Increment and decrement operators

The postincrement, postdecrement, preincrement, and predecrement operators work with variables. The pre- operators, which appear in front of the variable name as in ++COUNT and --VALUE, first change the value of the variable (++ adds 1; -- subtracts 1) and then provide the result for use in the expression. The post- operators appear after the variable name as in COUNT++ and VALUE--; they first provide the unchanged value of the variable for use in the expression and then change the value of the variable.

$ N=10
$ echo $N
10
$ echo $((--N+3))
12
$ echo $N
9
$ echo $((N++ - 3))
6
$ echo $N
10
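A common use of these operators is counting inside a loop. The sketch below uses hypothetical variable names and a made-up word list. It uses the preincrement form for the stand-alone counter because the exit status of (( expression )) is nonzero when the expression evaluates to 0, which is what a postincrement of a variable holding 0 yields.

```shell
#!/bin/bash
# Count the items in a hypothetical list.
count=0
for word in alpha beta gamma; do
    (( ++count ))      # yields 1, 2, 3, so the exit status is always 0
done
echo "count=$count"

# Contrast the pre- and post- forms within an expression.
n=5
pre=$(( ++n ))         # n becomes 6, and the expression yields 6
post=$(( n++ ))        # the expression yields 6, then n becomes 7
echo "pre=$pre post=$post n=$n"
```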


Remainder

The remainder operator (%) gives the remainder when its first operand is divided by its second. For example, the expression $((15%7)) has the value 1.
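As a small illustration, the remainder operator pairs naturally with integer division. The value 200 and the variable names below are arbitrary.

```shell
#!/bin/bash
# Convert a hypothetical duration in seconds to minutes and seconds.
total=200
minutes=$(( total / 60 ))   # integer division discards the fraction
seconds=$(( total % 60 ))   # remainder is what is left over
echo "$total seconds is $minutes min $seconds sec"
```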

Boolean

The result of a Boolean operation is either 0 (false) or 1 (true).

The && (AND) and || (OR) Boolean operators are called short-circuiting operators. If the result of using one of these operators can be decided by looking only at the left operand, the right operand is not evaluated. The && operator causes the shell to test the exit status of the command preceding it. If the command succeeded, bash executes the next command; otherwise, it skips the remaining commands on the command line. You can use this construct to execute commands conditionally:

$ mkdir bkup && cp -r src bkup


This compound command creates the directory bkup. If mkdir succeeds, the contents of directory src are copied recursively to bkup.

The || separator also causes bash to test the exit status of the first command but has the opposite effect: The remaining command(s) are executed only if the first one failed (that is, exited with nonzero status):

$ mkdir bkup || echo "mkdir of bkup failed" >> /tmp/log


The exit status of a command list is the exit status of the last command in the list. You can group lists with parentheses. For example, you could combine the previous two examples as

$ (mkdir bkup && cp -r src bkup) || echo "mkdir failed" >> /tmp/log


In the absence of parentheses, && and || have equal precedence and are grouped from left to right. The following examples use the true and false utilities. These utilities do nothing and return true (0) and false (1) exit statuses, respectively:

$ false; echo $?
1


The $? variable holds the exit status of the preceding command (page 564). The next two commands yield an exit status of 1 (false):

$ true || false && false
$ echo $?
1
$ (true || false) && false
$ echo $?
1


Similarly the next two commands yield an exit status of 0 (true):

$ false && false || true
$ echo $?
0
$ (false && false) || true
$ echo $?
0


Because || and && have equal precedence, the parentheses in the two preceding pairs of examples do nothing to change the order of operations.
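The left-to-right grouping of command lists differs from the C rules that apply inside arithmetic evaluation, where && binds more tightly than ||. The following sketch contrasts the two; the variable names are hypothetical and the commands are chosen only to make the grouping visible.

```shell
#!/bin/bash
# In a command list, && and || group from left to right, so
#     true || echo A && echo B
# is read as (true || echo A) && echo B, and B is displayed.
list_output=$(true || echo A && echo B)
echo "command list displayed: $list_output"

# Inside arithmetic evaluation the C rules apply: && binds more
# tightly than ||, so 1 || 0 && 0 is read as 1 || (0 && 0).
arith_value=$(( 1 || 0 && 0 ))
echo "arithmetic value: $arith_value"

# Grouping the command list with braces gives the C-like reading;
# nothing is displayed because true short-circuits the whole group.
grouped_output=$(true || { echo A && echo B; })
echo "grouped list displayed: '$grouped_output'"
```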

Because the expression on the right side of a short-circuiting operator may never get executed, you must be careful with assignment statements in that location. The following example demonstrates what can happen:

$ ((N=10,Z=0))
$ echo $((N || ((Z+=1)) ))
1
$ echo $Z
0


Because the value of N is nonzero, the result of the || (OR) operation is 1 (true), no matter what the value of the right side is. As a consequence ((Z+=1)) is never evaluated and Z is not incremented.

Ternary

The ternary operator, ? :, decides which of two expressions should be evaluated, based on the value returned from a third expression:

expression1 ? expression2 : expression3


If expression1 produces a false (0) value, expression3 is evaluated; otherwise, expression2 is evaluated. The value of the entire expression is the value of expression2 or expression3, depending on which one is evaluated. If expression1 is true, expression3 is not evaluated. If expression1 is false, expression2 is not evaluated:

$ ((N=10,Z=0,COUNT=1))
$ ((T=N>COUNT?++Z:--Z))
$ echo $T
1
$ echo $Z
1
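The ternary operator gives a compact way to pick one of two values. This sketch selects the larger of two numbers and limits a percentage to 100; all names and values are made up for illustration.

```shell
#!/bin/bash
# Hypothetical values: pick the larger one, then clamp a percentage.
a=17
b=42
(( max = a > b ? a : b ))
echo "max=$max"

percent=130
(( clamped = percent > 100 ? 100 : percent ))
echo "clamped=$clamped"
```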


Assignment

The assignment operators, such as +=, are shorthand notations. For example, N+=3 is the same as ((N=N+3)).

Other bases

The following commands use the syntax base#n to assign base 2 (binary) values. First v1 is assigned a value of 0101 (5 decimal) and v2 is assigned a value of 0110 (6 decimal). The echo utility verifies the decimal values.

$ ((v1=2#0101))
$ ((v2=2#0110))
$ echo "$v1 and $v2"
5 and 6


Next the bitwise AND operator (&) selects the bits that are on in both 5 (0101 binary) and 6 (0110 binary). The result is binary 0100, which is 4 decimal.

$ echo $(( v1 & v2 ))
4


The Boolean AND operator (&&) produces a result of 1 if both of its operands are nonzero and a result of 0 otherwise. The bitwise inclusive OR operator (|) selects the bits that are on in either 0101 or 0110, resulting in 0111, which is 7 decimal. The Boolean OR operator (||) produces a result of 1 if either of its operands is nonzero and a result of 0 otherwise.

$ echo $(( v1 && v2 ))
1
$ echo $(( v1 | v2 ))
7
$ echo $(( v1 || v2 ))
1


Next the bitwise exclusive OR operator (^) selects the bits that are on in either, but not both, of the operands 0101 and 0110, yielding 0011, which is 3 decimal. The Boolean NOT operator (!) produces a result of 1 if its operand is 0 and a result of 0 otherwise. Because the exclamation point in $(( ! v1 )) is enclosed within double parentheses, it does not need to be escaped to prevent the shell from interpreting the exclamation point as a history event. The comparison operators produce a result of 1 if the comparison is true and a result of 0 otherwise.

$ echo $(( v1 ^ v2 ))
3
$ echo $(( ! v1 ))
0
$ echo $(( v1 < v2 ))
1
$ echo $(( v1 > v2 ))
0
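The base#n syntax and the bitwise operators work together when a value packs several fields into one integer. The sketch below treats a made-up octal number as a set of UNIX-style permission bits; the variable names are hypothetical and no file is examined.

```shell
#!/bin/bash
# Hypothetical permission value written in octal with the base#n syntax.
(( mode = 8#754 ))
echo "mode in decimal: $mode"

# Shift each three-bit group to the right end and mask it off.
(( owner = (mode >> 6) & 7 ))
(( group = (mode >> 3) & 7 ))
(( other = mode & 7 ))
echo "owner=$owner group=$group other=$other"

# Test a single bit: 2#100 is the read bit within a group.
(( other_can_read = (other & 2#100) != 0 ))
echo "other can read: $other_can_read"

# Hexadecimal constants use the same syntax.
hex_value=$(( 16#ff ))
echo "16#ff is $hex_value"
```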


Shell Programs

The Bourne Again Shell has many features that make it a good programming language. The structures that bash provides are not a random assortment. Rather, they have been chosen to provide most of the structural features that are in other procedural languages, such as C or Pascal. A procedural language provides the ability to

  • Declare, assign, and manipulate variables and constant data. The Bourne Again Shell provides string variables, together with powerful string operators, and integer variables, along with a complete set of arithmetic operators.

  • Break large problems into small ones by creating subprograms. The Bourne Again Shell allows you to create functions and call scripts from other scripts. Shell functions can be called recursively; that is, a Bourne Again Shell function can call itself. You may not need to use recursion often, but it may allow you to solve some apparently difficult problems with ease.

  • Execute statements conditionally, using statements such as if.

  • Execute statements iteratively, using statements such as while and for.

  • Transfer data to and from the program, communicating with both data files and users.

Programming languages implement these capabilities in different ways but with the same ideas in mind. When you want to solve a problem by writing a program, you must first figure out a procedure that leads you to a solution, that is, an algorithm. Typically you can implement the same algorithm in roughly the same way in different programming languages, using the same kinds of constructs in each language.

Chapter 8 and this chapter have introduced numerous bash features, many of which are useful for interactive use as well as for shell programming. This section develops two complete shell programs, demonstrating how to combine some of these features effectively. The programs are presented as problems for you to solve along with sample solutions.

A Recursive Shell Script

A recursive construct is one that is defined in terms of itself. Alternatively, you might say that a recursive program is one that can call itself. This may seem circular, but it need not be. To avoid circularity a recursive definition must have a special case that is not self-referential. Recursive ideas occur in everyday life. For example, you can define an ancestor as your mother, your father, or one of their ancestors. This definition is not circular; it specifies unambiguously who your ancestors are: your mother or your father, or your mother's mother or father or your father's mother or father, and so on.

A number of Mac OS X system utilities can operate recursively. See the -R option to the chmod (page 676), chown (page 682), and cp (page 690) utilities for examples.

Solve the following problem by using a recursive shell function:

Write a shell function named makepath that, given a pathname, creates all components in that pathname as directories. For example, the command makepath a/b/c/d should create directories a, a/b, a/b/c, and a/b/c/d. (The mkdir utility supports a -p option that does exactly this. Solve the problem without using mkdir -p.)


One algorithm for a recursive solution follows:

  1. Examine the path argument. If it is a null string or if it names an existing directory, do nothing and return.

  2. If it is a simple path component, create it (using mkdir) and return.

  3. Otherwise, call makepath using the path prefix of the original argument. This step eventually creates all the directories up to the last component, which you can then create with mkdir.

In general, a recursive function must invoke itself with a simpler version of the problem than it was given until it is finally called with a simple case that does not need to call itself. Following is one possible solution based on this algorithm:

makepath

# this is a function
# enter it at the keyboard, do not run it as a shell script
#
function makepath()
{
    if [[ ${#1} -eq 0 || -d "$1" ]]
        then
            return 0       # Do nothing
    fi
    if [[ "${1%/*}" = "$1" ]]
        then
            mkdir $1
            return $?
    fi
    makepath ${1%/*} || return 1
    mkdir $1
    return $?
}


In the test for a simple component (the if statement in the middle of the function), the left expression is the argument after the shortest suffix that starts with a / character has been stripped away (page 587). If there is no such character (for example, if $1 is alex), nothing is stripped off and the two sides are equal. If the argument is a simple filename preceded by a slash, such as /usr, the expression ${1%/*} evaluates to a null string. To make the function work in this case, you must take two precautions: Put the left expression within quotation marks and ensure that the recursive function behaves sensibly when it is passed a null string as an argument. In general, good programs are robust: They should be prepared for borderline, invalid, or meaningless input and behave appropriately in such cases.

By giving the following command from the shell you are working in, you turn on debugging tracing so that you can watch the recursion work:

$ set -o xtrace


(Give the same command, but replace the hyphen with a plus sign (+) to turn debugging off.) With debugging turned on, the shell displays each line in its expanded form as it executes the line. A + precedes each line of debugging output. In the following example, the first line that starts with + shows the shell calling makepath. The makepath function is called from the command line with arguments of a/b/c. Subsequently it calls itself with arguments of a/b and finally a. All the work is done (using mkdir) as each call to makepath returns.

$ makepath a/b/c
+ makepath a/b/c
+ [[ 5 -eq 0 ]]
+ [[ -d a/b/c ]]
+ [[ a/b = \a\/\b\/\c ]]
+ makepath a/b
+ [[ 3 -eq 0 ]]
+ [[ -d a/b ]]
+ [[ a = \a\/\b ]]
+ makepath a
+ [[ 1 -eq 0 ]]
+ [[ -d a ]]
+ [[ a = \a ]]
+ mkdir a
+ return 0
+ mkdir a/b
+ return 0
+ mkdir a/b/c
+ return 0


The function works its way down the recursive path and back up again.

It is instructive to invoke makepath with an invalid path and see what happens. The following example, run with debugging turned on, tries to create the path /a/b, which requires that you create directory a in the root directory. Unless you have permission to write to the root directory, you are not permitted to create this directory.

$ makepath /a/b
+ makepath /a/b
+ [[ 4 -eq 0 ]]
+ [[ -d /a/b ]]
+ [[ /a = \/\a\/\b ]]
+ makepath /a
+ [[ 2 -eq 0 ]]
+ [[ -d /a ]]
+ [[ '' = \/\a ]]
+ makepath
+ [[ 0 -eq 0 ]]
+ return 0
+ mkdir /a
mkdir: cannot create directory '/a': Permission denied
+ return 1
+ return 1


The recursion stops when makepath is denied permission to create the /a directory. The error return is passed all the way back, so the original makepath exits with nonzero status.

Tip: Use local variables with recursive functions

The preceding example glossed over a potential problem that you may encounter when you use a recursive function. During the execution of a recursive function, many separate instances of that function may be active simultaneously. All but one of them are waiting for their child invocation to complete.

Because functions run in the same environment as the shell that calls them, variables are implicitly shared by a shell and a function it calls so that all instances of the function share a single copy of each variable. Sharing variables can give rise to side effects that are rarely what you want. As a rule, you should use typeset to make all variables of a recursive function be local variables. See page 561 for more information.
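The side effect is easy to reproduce. In the following sketch, two hypothetical functions, count_shared and count_local, record the value of a variable named level as each invocation returns. They differ only in whether level is declared with typeset.

```shell
#!/bin/bash
# Hypothetical recursive functions that differ only in the use of typeset.

count_shared () {
    level=$1                 # one copy shared by every invocation
    if (( level > 0 )); then
        count_shared $(( level - 1 ))
    fi
    shared_trace="$shared_trace $level"
}

count_local () {
    typeset level=$1         # private to this invocation
    if (( level > 0 )); then
        count_local $(( level - 1 ))
    fi
    local_trace="$local_trace $level"
}

shared_trace=""
local_trace=""
count_shared 3
count_local 3
echo "shared:$shared_trace"
echo "local: $local_trace"
```

In count_shared the innermost call overwrites level with 0, and every caller sees that value when it resumes. In count_local each invocation gets back its own value (0, 1, 2, and then 3).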


The quiz Shell Script

Solve the following problem using a bash script:

Write a generic multiple-choice quiz program. The program should get its questions from data files, present them to the user, and keep track of the number of correct and incorrect answers. The user must be able to exit from the program at any time with a summary of results to that point.


The detailed design of this program and even the detailed description of the problem depend on a number of choices: How will the program know which subjects are available for quizzes? How will the user choose a subject? How will the program know when the quiz is over? Should the program present the same questions (for a given subject) in the same order each time, or should it scramble them?

Of course, you can make many perfectly good choices that implement the specification of the problem. The following details narrow the problem specification:

  • Each subject will correspond to a subdirectory of a master quiz directory. This directory will be named in the environment variable QUIZDIR, whose default will be ~/quiz. For example, you could have the following directories correspond to the subjects engineering, art, and politics: ~/quiz/engineering, ~/quiz/art, and ~/quiz/politics.

  • Each subject can have several questions. Each question is represented by a file in its subject's directory.

  • The first line of each file that represents a question is the text of the question. If it takes more than one line, you must escape the NEWLINE with a backslash. (This setup makes it easy to read a single question with the read builtin.) The second line of the file is an integer that specifies the number of choices. The next lines are the choices themselves. The last line is the correct answer. Following is a sample question file:

    Who discovered the principle of the lever?
    4
    Euclid
    Archimedes
    Thomas Edison
    The Lever Brothers
    Archimedes

  • The program presents all the questions in a subject directory. At any point the user can interrupt the quiz with CONTROL-C, whereupon the program will summarize the results so far and exit. If the user does not interrupt, the program summarizes the results and exits when it has asked all questions for the chosen subject.

  • The program scrambles the questions in a subject before presenting them.
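The rule about escaping a NEWLINE in the question text relies on the way read treats a backslash at the end of a line. The sketch below writes a made-up two-line question to a temporary file (the filename comes from mktemp and is not part of the quiz program) and reads it back; the variable names are hypothetical.

```shell
#!/bin/bash
# A hypothetical two-line question: the backslash at the end of the first
# line tells read (used without -r) to join it with the next line.
qfile=$(mktemp "${TMPDIR:-/tmp}/question.XXXXXX") || exit 2
cat > "$qfile" <<'EOF'
Which Greek mathematician is credited with \
the principle of the lever?
2
Euclid
Archimedes
Archimedes
EOF

{
    read question       # reads both physical lines as one logical line
    read num_choices    # the next read sees the line that holds the count
} < "$qfile"
rm -f "$qfile"

echo "$question"
echo "number of choices: $num_choices"
```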

Following is a top-level design for this program:

  1. Initialize. This involves a number of steps, such as setting the counts of the number of questions asked so far and the number of correct and wrong answers to zero. It also sets up a trap for CONTROL-C.

  2. Present the user with a choice of subjects and get the user's response.

  3. Change to the corresponding subject directory.

  4. Determine the questions to be asked (that is, the filenames in that directory). Arrange them in random order.

  5. Repeatedly present questions and ask for answers until the quiz is over or is interrupted by the user.

  6. Present the results and exit.

Clearly some of these steps (such as step 3) are simple, whereas others (such as step 4) are complex and worthy of analysis on their own. Use shell functions for any complex step, and use the trap builtin to handle a user interrupt.

Here is a skeleton version of the program with empty shell functions:

function initialize
{
# Initializes variables.
}

function choose_subj
{
# Writes choice to standard output.
}

function scramble
{
# Stores names of question files, scrambled,
# in an array variable named questions.
}

function ask
{
# Reads a question file, asks the question, and checks the
# answer. Returns 1 if the answer was correct, 0 otherwise. If it
# encounters an invalid question file, exit with status 2.
}

function summarize
{
# Presents the user's score.
}

# Main program
initialize                       # Step 1 in top-level design

subject=$(choose_subj)           # Step 2
[[ $? -eq 0 ]] || exit 2         # If no valid choice, exit

cd $subject || exit 2            # Step 3
echo                             # Skip a line
scramble                         # Step 4

for ques in ${questions[*]}; do  # Step 5
    ask $ques
    result=$?
    (( num_ques=num_ques+1 ))
    if [[ $result == 1 ]]; then
        (( num_correct += 1 ))
    fi
    echo                         # Skip a line between questions
    sleep ${QUIZDELAY:=1}
done

summarize                        # Step 6
exit 0


To make reading the results a bit easier for the user, a sleep call appears inside the question loop. It delays $QUIZDELAY seconds (default = 1) between questions.

Now the task is to fill in the missing pieces of the program. In a sense this program is being written backward. The details (the shell functions) come first in the file but come last in the development process. This common programming practice is called top-down design. In top-down design you fill in the broad outline of the program first and supply the details later. In this way you break the problem up into smaller problems, each of which you can work on independently. Shell functions are a great help in using the top-down approach.

One way to write the initialize function follows. The cd command causes QUIZDIR to be the working directory for the rest of the script and defaults to ~/quiz if QUIZDIR is not set.

function initialize ()
{
trap 'summarize ; exit 0' INT    # Handle user interrupts
num_ques=0                       # Number of questions asked so far
num_correct=0                    # Number answered correctly so far
first_time=true                  # true until first question is asked
cd ${QUIZDIR:=~/quiz} || exit 2
}


Be prepared for the cd command to fail. The directory may be unsearchable or conceivably another user may have removed it. The preceding function exits with a status code of 2 if cd fails.
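The := form used with QUIZDIR and QUIZDELAY assigns the default only when the variable is unset or null. The following sketch shows that behavior together with one way to guard a cd command; the function name enter_dir and the values are assumptions made for illustration.

```shell
#!/bin/bash
# ${name:=default} assigns default to name when name is unset or null,
# then expands to the resulting value. The values here are arbitrary.
unset QUIZDELAY
echo "delay is ${QUIZDELAY:=1}"
first=$QUIZDELAY          # the expansion above set the variable to 1

QUIZDELAY=5
echo "delay is ${QUIZDELAY:=1}"
second=$QUIZDELAY         # still 5: an existing value is left alone

# Hypothetical helper: report a failed cd and return status 2 to the caller.
enter_dir () {
    cd "$1" 2>/dev/null || { echo "cannot cd to $1" >&2; return 2; }
}
```

A caller can then write enter_dir "$QUIZDIR" || exit 2 to stop the script when the directory is missing or unsearchable.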

The next function, choose_subj, is a bit more complicated. It displays a menu using a select statement:

function choose_subj ()
{
subjects=($(ls))
PS3="Choose a subject for the quiz from the preceding list: "
select Subject in ${subjects[*]}; do
    if [[ -z "$Subject" ]]; then
        echo "No subject chosen. Bye." >&2
        exit 1
    fi
    echo $Subject
    return 0
done
}


The function first uses an ls command and command substitution to put a list of subject directories in the subjects array. Next the select structure (page 551) presents the user with a list of subjects (the directories found by ls) and assigns the chosen directory name to the Subject variable. Finally the function writes the name of the subject directory to standard output. The main program uses command substitution to assign this value to the subject variable [subject=$(choose_subj)].

The scramble function presents a number of difficulties. In this solution it uses an array variable (questions) to hold the names of the questions. It scrambles the entries in an array using the RANDOM variable (each time you reference RANDOM it has the value of a [random] integer between 0 and 32767):

function scramble ()
{
typeset -i index quescount
questions=($(ls))
quescount=${#questions[*]}         # Number of elements
((index=quescount-1))
while [[ $index > 0 ]]; do
    ((target=RANDOM % index))
    exchange $target $index
    ((index -= 1))
done
}


This function initializes the array variable questions to the list of filenames (questions) in the working directory. The variable quescount is set to the number of such files. Then the following algorithm is used: Let the variable index count down from quescount - 1 (the index of the last entry in the array variable). For each value of index, the function chooses a random value target between 0 and index - 1, inclusive. The command

((target=RANDOM % index))


produces a random value between 0 and index - 1 by taking the remainder (the % operator) when $RANDOM is divided by index. The function then exchanges the elements of questions at positions target and index. It is convenient to do this in another function named exchange:

function exchange ()
{
temp_value=${questions[$1]}
questions[$1]=${questions[$2]}
questions[$2]=$temp_value
}
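A variation on this algorithm, which is not the one the quiz script uses, draws target from 0 through index inclusive by dividing by index + 1, so that an element may also stay where it is. The sketch below is self-contained; the function name shuffle_array and the array name items are hypothetical, and the small bias introduced by taking a remainder of RANDOM is ignored.

```shell
#!/bin/bash
# Variation on scramble: shuffles the global array named items in place.
# target is drawn from 0..index inclusive by using index + 1 as the divisor.
shuffle_array () {
    typeset -i index target
    typeset temp
    index=$(( ${#items[@]} - 1 ))
    while (( index > 0 )); do
        target=$(( RANDOM % (index + 1) ))
        temp=${items[target]}
        items[target]=${items[index]}
        items[index]=$temp
        index=$(( index - 1 ))
    done
    return 0
}

items=(q1 q2 q3 q4 q5)
shuffle_array
echo "${items[@]}"
```

The order displayed changes from run to run, but the array always holds the same five elements.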


The ask function also uses the select structure. It reads the question file named in its argument and uses the contents of that file to present the question, accept the answer, and determine whether the answer is correct. (See the code that follows.)

The ask function uses file descriptor 3 to read successive lines from the question file, whose name was passed as an argument and is represented by $1 in the function. It reads the question into the ques variable and the number of choices into num_opts. The function constructs the variable choices by initializing it to an empty array and successively appending the next choice. Then it sets PS3 to the value of ques and uses a select structure to prompt the user with ques. The select structure places the user's answer in answer, and the function then checks it against the correct answer from the file.
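Reading the data file on its own descriptor leaves standard input free for the user's answers. The following minimal sketch shows the open, read, and close steps in isolation; the temporary file, its contents, and the variable names are made up for illustration.

```shell
#!/bin/bash
# Sketch: read a data file on descriptor 3 so that standard input stays
# free for the user. The file contents and names are hypothetical.
datafile=$(mktemp "${TMPDIR:-/tmp}/fd3demo.XXXXXX") || exit 2
printf '%s\n' "first line" "second line" > "$datafile"

exec 3< "$datafile"        # open the file for input on descriptor 3
read -u3 line1 || exit 2   # each read continues where the last one stopped
read -u3 line2 || exit 2
if read -u3 line3; then    # a third read fails because of end of file
    at_eof=no
else
    at_eof=yes
fi
exec 3<&-                  # close descriptor 3
rm -f "$datafile"

echo "line1=$line1 line2=$line2 at_eof=$at_eof"
```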

The construction of the choices variable is done with an eye toward avoiding a potential problem. Suppose that one answer has some whitespace in it. Then it might appear as two or more arguments in choices. To avoid this problem, make sure that choices is an array variable. The select statement does the rest of the work:

quiz

$ cat quiz
#!/bin/bash

# remove the # on the following line to turn on debugging
# set -o xtrace

#==================
function initialize ()
{
trap 'summarize ; exit 0' INT     # Handle user interrupts
num_ques=0                        # Number of questions asked so far
num_correct=0                     # Number answered correctly so far
first_time=true                   # true until first question is asked
cd ${QUIZDIR:=~/quiz} || exit 2
}

#==================
function choose_subj ()
{
subjects=($(ls))
PS3="Choose a subject for the quiz from the preceding list: "
select Subject in ${subjects[*]}; do
    if [[ -z "$Subject" ]]; then
        echo "No subject chosen. Bye." >&2
        exit 1
    fi
    echo $Subject
    return 0
done
}

#==================
function exchange ()
{
temp_value=${questions[$1]}
questions[$1]=${questions[$2]}
questions[$2]=$temp_value
}

#==================
function scramble ()
{
typeset -i index quescount
questions=($(ls))
quescount=${#questions[*]}        # Number of elements
((index=quescount-1))
while [[ $index > 0 ]]; do
    ((target=RANDOM % index))
    exchange $target $index
    ((index -= 1))
done
}

#==================
function ask ()
{
exec 3<$1
read -u3 ques || exit 2
read -u3 num_opts || exit 2

index=0
choices=()
while (( index < num_opts )) ; do
    read -u3 next_choice || exit 2
    choices=("${choices[@]}" "$next_choice")
    ((index += 1))
done
read -u3 correct_answer || exit 2
exec 3<&-

if [[ $first_time = true ]]; then
    first_time=false
    echo -e "You may press the interrupt key at any time to quit.\n"
fi

PS3=$ques"  "                     # Make $ques the prompt for select
                                  # and add some spaces for legibility.
select answer in "${choices[@]}"; do
    if [[ -z "$answer" ]]; then
            echo  Not a valid choice. Please choose again.
        elif [[ "$answer" = "$correct_answer" ]]; then
            echo "Correct!"
            return 1
        else
            echo "No, the answer is $correct_answer."
            return 0
    fi
done
}

#==================
function summarize ()
{
echo                              # Skip a line
if (( num_ques == 0 )); then
    echo "You did not answer any questions"
    exit 0
fi

(( percent=num_correct*100/num_ques ))
echo "You answered $num_correct questions correctly, out of \
$num_ques total questions."
echo "Your score is $percent percent."
}

#==================
# Main program
initialize                        # Step 1 in top-level design

subject=$(choose_subj)            # Step 2
[[ $? -eq 0 ]] || exit 2          # If no valid choice, exit

cd $subject || exit 2             # Step 3
echo                              # Skip a line
scramble                          # Step 4

for ques in ${questions[*]}; do   # Step 5
    ask $ques
    result=$?
    (( num_ques=num_ques+1 ))
    if [[ $result == 1 ]]; then
        (( num_correct += 1 ))
    fi
    echo                          # Skip a line between questions
    sleep ${QUIZDELAY:=1}
done

summarize                         # Step 6
exit 0
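The effect of the quoting that ask applies to choices can be checked in isolation. The sketch below builds an array the same way ask does and counts the arguments produced by two different expansions; the helper count_args and the sample answers are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical answer list; two of the entries contain spaces.
choices=()
for next_choice in "Euclid" "Thomas Edison" "The Lever Brothers"; do
    choices=("${choices[@]}" "$next_choice")   # append one element
done

# Hypothetical helper that reports how many arguments it received.
count_args () { echo $#; }

quoted=$(count_args "${choices[@]}")    # one argument per array element
unquoted=$(count_args ${choices[*]})    # split again at each space
echo "elements: ${#choices[@]}, quoted: $quoted, unquoted: $unquoted"
```

With the quoted form, select sees three menu items; the unquoted form would present six, one for each word.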


Chapter Summary

The shell is a programming language. Programs written in this language are called shell scripts, or simply scripts. Shell scripts provide the decision and looping control structures present in high-level programming languages while allowing easy access to system utilities and user programs. Shell scripts can use functions to modularize and simplify complex tasks.

Control structures

The control structures that use decisions to select alternatives are if...then, if...then...else, and if...then...elif. The case control structure provides a multiway branch and can be used when you want to express alternatives using a simple pattern-matching syntax.

The looping control structures are for...in, for, until, and while. These structures perform one or more tasks repetitively.

The break and continue control structures alter control within loops: break transfers control out of a loop, and continue transfers control immediately to the top of a loop.

The Here document allows input to a command in a shell script to come from within the script itself.

File descriptors

The Bourne Again Shell provides the ability to manipulate file descriptors. Coupled with the read and echo builtins, file descriptors allow shell scripts to have as much control over input and output as programs written in lower-level languages.

Variables

You assign attributes, such as readonly, to bash variables using the typeset builtin. The Bourne Again Shell provides operators to perform pattern matching on variables, provide default values for variables, and evaluate the length of variables. This shell also supports array variables and local variables for functions and provides built-in integer arithmetic capability, using the let builtin and an expression syntax similar to the C programming language.

Builtins

Bourne Again Shell builtins include type, read, exec, trap, kill, and getopts. The type builtin displays information about a command, including its location; read allows a script to accept user input.

The exec builtin executes a command without creating a new process. The new command overlays the current process, assuming the same environment and PID number of that process. This builtin executes user programs and other Mac OS X commands when it is not necessary to return control to the calling process.

The trap builtin catches a signal sent by Mac OS X to the process running the script and allows you to specify actions to be taken upon receipt of one or more signals. You can use this builtin to cause a script to ignore the signal that is sent when the user presses the interrupt key.

The kill builtin allows you to terminate a running program. The getopts builtin parses command line arguments, making it easier to write programs that follow standard conventions for command line arguments and options.

Utilities in scripts

In addition to using control structures, builtins, and functions, shell scripts generally call utilities. The find utility, for instance, is commonplace in shell scripts that search for files in the system hierarchy and can perform a vast range of tasks, from simple to complex.

A well-written shell script adheres to standard programming practices, such as specifying the shell to execute the script on the first line of the script, verifying the number and type of arguments that the script is called with, displaying a standard usage message to report command line errors, and redirecting all informational messages to standard error.

Expressions

There are two basic types of expressions: arithmetic and logical. Arithmetic expressions allow you to do arithmetic on constants and variables, yielding a numeric result. Logical (Boolean) expressions compare expressions or strings, or test conditions to yield a true or false result. As with all decisions within UNIX shell scripts, a true status is represented by the value zero; false, by any nonzero value.

Exercises

1.

Rewrite the journal script of Chapter 8 (question 5, page 334) by adding commands to verify that the user has write permission for a file named journal-file in the user's home directory, if such a file exists. The script should take appropriate actions if journal-file exists and the user does not have write permission to the file. Verify that the modified script works.

2.

The special parameter "$@" is referenced twice in the out script (page 529). Explain what would be different if the parameter "$*" were used in its place.

3.

Write a filter that takes a list of files as input and outputs the basename (page 550) of each file in the list.

4.

Write a function that takes a single filename as an argument and adds execute permission to the file for the user.

  1. When might such a function be useful?

  2. Revise the function so that it takes one or more filenames as arguments and adds execute permission for the user for each file argument.

  3. What can you do to make the function available every time you log in?

  4. Suppose that, in addition to having the function available on subsequent login sessions, you want to make the function available now in your current shell. How would you do so?

5.

When might it be necessary or advisable to write a shell script instead of a shell function? Give as many reasons as you can think of.

6.

Write a shell script that displays the names of all directory files, but no other types of files, in the working directory.

7.

Write a script to display the time every 15 seconds. Read the date man page and display the time, using the %r field descriptor. Clear the window (using the clear command) each time before you display the time.

8.

Enter the following script named savefiles, and give yourself execute permission to the file:

$ cat savefiles
#! /bin/bash
echo "Saving files in current directory in file savethem."
exec > savethem
for i in *
        do
        echo "==================================================="
        echo "File: $i"
        echo "==================================================="
        cat "$i"
        done


  1. What error message do you get when you execute this script? Rewrite the script so that the error does not occur, making sure the output still goes to savethem.

  2. What might be a problem with running this script twice in the same directory? Discuss a solution to this problem.

9.

Read the bash man or info page, try some experiments, and answer the following questions:

  1. How do you export a function?

  2. What does the hash builtin do?

  3. What happens if the argument to exec is not executable?

10.

Using the find utility, perform the following tasks:

  1. List all files in the working directory and all subdirectories that have been modified within the last day.

  2. List all files that you have read access to on the system that are larger than 1 megabyte.

  3. List the inode numbers of all files in the working directory whose filenames end in .c.

  4. List all files that you have read access to on the root filesystem (startup volume) that have been modified in the last 30 days.

11.

Write a short script that tells you whether the permissions for two files, whose names are given as arguments to the script, are identical. If the permissions for the two files are identical, output the common permission field. Otherwise, output each filename followed by its permission field. (Hint: Try using the cut utility.)

12.

Write a script that takes the name of a directory as an argument and searches the file hierarchy rooted at that directory for zero-length files. Write the names of all zero-length files to standard output. If there is no option on the command line, have the script delete each file only after displaying its name, asking the user for confirmation, and receiving positive confirmation. A -f (force) option on the command line indicates that the script should display the filename but not ask for confirmation before deleting the file.

Advanced Exercises

13.

Write a script that takes a colon-separated list of items and outputs the items, one per line, to standard output (without the colons).

14.

Generalize the script written in exercise 13 so that the character separating the list items is given as an argument to the script. If this argument is absent, the separator should default to a colon.

15.

Write a function named funload that takes as its single argument the name of a file containing other functions. The purpose of funload is to make all functions in the named file available in the current shell; that is, funload loads the functions from the named file. To locate the file, funload searches the colon-separated list of directories given by the environment variable FUNPATH. Assume that the format of FUNPATH is the same as PATH and that searching FUNPATH is similar to the shell's search of the PATH variable.

16.

Rewrite bundle (page 554) so that the script it creates takes an optional list of filenames as arguments. If one or more filenames are given on the command line, only those files should be re-created; otherwise, all files in the shell archive should be re-created. For example, suppose that all files with the filename extension .c are bundled into an archive named srcshell, and you want to unbundle just the files test1.c and test2.c. The following command will unbundle just these two files:

$ bash srcshell test1.c test2.c


17.

What kind of links will the lnks script (page 532) not find? Why?

18.

In principle, recursion is never necessary. It can always be replaced by an iterative construct, such as while or until. Rewrite makepath (page 593) as a nonrecursive function. Which version do you prefer? Why?

19.

Lists are commonly stored in environment variables by putting a colon (:) between each of the list elements. (The value of the PATH variable is a good example.) You can add an element to such a list by catenating the new element to the front of the list, as in

PATH=/opt/bin:$PATH


If the element you add is already in the list, you now have two copies of it in the list. Write a shell function named addenv that takes two arguments: (1) the name of a shell variable and (2) a string to prepend to the list that is the value of the shell variable only if that string is not already an element of the list. For example, the call

addenv PATH /opt/bin


would add /opt/bin to PATH only if that pathname is not already in PATH. Be sure that your solution works even if the shell variable starts out empty. Also make sure that you check the list elements carefully. If /usr/opt/bin is in PATH but /opt/bin is not, the example just given should still add /opt/bin to PATH. (Hint: You may find this exercise easier to complete if you first write a function locate_field that tells you whether a string is an element in the value of a variable.)

20.

Write a function that takes a directory name as an argument and writes to standard output the maximum of the lengths of all filenames in that directory. If the function's argument is not a directory name, write an error message to standard output and exit with nonzero status.

21.

Modify the function you wrote for exercise 20 to descend all subdirectories of the named directory recursively and to find the maximum length of any filename in that hierarchy.

22.

Write a function that lists the number of ordinary files, directories, block special files, character special files, FIFOs, and symbolic links in the working directory. Do this in two different ways:

  1. Use the first letter of the output of ls -l to determine a file's type.

  2. Use the file type condition tests of the [[ expression ]] syntax to determine a file's type.

23.

Modify the quiz program (page 596) so that the choices for a question are randomly arranged.




A Practical Guide to UNIX® for Mac OS® X Users
ISBN: 0131863339
Year: 2005
Pages: 234
