Shell Programs | A Practical Guide to Red HatВ® LinuxВ®: Fedoraв„ў Core and Red Hat Enterprise Linux (3rd Edition)

The Bourne Again Shell has many features that make it a good programming language. The structures that bash provides are not a random assortment. Rather, they have been chosen to provide most of the structural features that are in other procedural languages, such as C or Pascal. A procedural language provides the ability to

Declare, assign, and manipulate variables and constant data. The Bourne Again Shell provides string variables, together with powerful string operators, and integer variables, along with a complete set of arithmetic operators.
Break large problems into small ones by creating subprograms. The Bourne Again Shell allows you to create functions and call scripts from other scripts. Shell functions can be called recursively; that is, a Bourne Again Shell function can call itself. You may not need to use recursion often, but it may allow you to solve some apparently difficult problems with ease.
Execute statements conditionally, using statements such as if.
Execute statements iteratively, using statements such as while and for.
Transfer data to and from the program, communicating with both data files and users.

Programming languages implement these capabilities in different ways but with the same ideas in mind. When you want to solve a problem by writing a program, you must first figure out a procedure that leads you to a solutionthat is, an algorithm. Typically you can implement the same algorithm in roughly the same way in different programming languages, using the same kinds of constructs in each language.

Chapter 9 and this chapter have introduced numerous bash features, many of which are useful for interactive use as well as for shell programming. This section develops two complete shell programs, demonstrating how to combine some of these features effectively. The programs are presented as problems for you to solve along with sample solutions.

A Recursive Shell Script

A recursive construct is one that is defined in terms of itself. Alternatively, you might say that a recursive program is one that can call itself. This may seem circular, but it need not be. To avoid circularity a recursive definition must have a special case that is not self-referential. Recursive ideas occur in everyday life. For example, you can define an ancestor as your mother, your father, or one of their ancestors. This definition is not circular; it specifies unambiguously who your ancestors are: your mother or your father, or your mother's mother or father or your father's mother or father, and so on.

A number of Linux system utilities can operate recursively. See the R option to the chmod, chown, and cp utilities for examples.

Solve the following problem by using a recursive shell function:

Write a shell function named makepath that, given a pathname, creates all components in that pathname as directories. For example, the command makepath a/b/c/d should create directories a, a/b, a/b/c, and a/b/c/d. (The mkdir utility supports a p option that does exactly this. Solve the problem without using mkdir p.)

One algorithm for a recursive solution follows:

Examine the path argument. If it is a null string or if it names an existing directory, do nothing and return.
If it is a simple path component, create it (using mkdir) and return.
Otherwise, call makepath using the path prefix of the original argument. This step eventually creates all the directories up to the last component, which you can then create with mkdir.

In general, a recursive function must invoke itself with a simpler version of the problem than it was given until it is finally called with a simple case that does not need to call itself. Following is one possible solution based on this algorithm:

makepath

# this is a function # enter it at the keyboard, do not run it as a shell script # function makepath() {        if [[ ${#1} -eq 0 || -d "$1" ]]            then               return 0        # Do nothing        fi        if [[ "${1%/*}" = "$1" ]]            then               mkdir $1               return $?        fi        makepath ${1%/*} || return 1        mkdir $1        return $? }

In the test for a simple component (the if statement in the middle of the function), the left expression is the argument after the shortest suffix that starts with a / character has been stripped away (page 942). If there is no such character (for example, if $1 is alex), nothing is stripped off and the two sides are equal. If the argument is a simple filename preceded by a slash, such as /usr, the expression ${1%/*} evaluates to a null string. To make the function work in this case, you must take two precautions: Put the left expression within quotation marks and ensure that the recursive function behaves sensibly when it is passed a null string as an argument. In general, good programs are robust: They should be prepared for borderline, invalid, or meaningless input and behave appropriately in such cases.

By giving the following command from the shell you are working in, you turn on debugging tracing so that you can watch the recursion work:

$ set -o xtrace

(Give the same command, but replace the hyphen with a plus sign (+) to turn debugging off.) With debugging turned on, the shell displays each line in its expanded form as it executes the line. A + precedes each line of debugging output. In the following example, the first line that starts with + shows the shell calling makepath. The makepath function is called from the command line with arguments of a/b/c. Subsequently it calls itself with arguments of a/b and finally a. All the work is done (using mkdir) as each call to makepath returns.

$ makepath a/b/c + makepath a/b/c + [[ 5 -eq 0 ]] + [[ -d a/b/c ]] + [[ a/b = \a\/\b\/\c ]] + makepath a/b + [[ 3 -eq 0 ]] + [[ -d a/b ]] + [[ a = \a\/\b ]] + makepath a + [[ 1 -eq 0 ]] + [[ -d a ]] + [[ a = \a ]] + mkdir a + return 0 + mkdir a/b + return 0 + mkdir a/b/c + return 0

The function works its way down the recursive path and back up again.

It is instructive to invoke makepath with an invalid path and see what happens. The following example, run with debugging turned on, tries to create the path /a/b, which requires that you create directory a in the root directory. Unless you have permission to write to the root directory, you are not permitted to create this directory.

$ makepath /a/b + makepath /a/b + [[ 4 -eq 0 ]] + [[ -d /a/b ]] + [[ /a = \/\a\/\b ]] + makepath /a + [[ 2 -eq 0 ]] + [[ -d /a ]] + [[ '' = \/\a ]] + makepath + [[ 0 -eq 0 ]] + return 0 + mkdir /a mkdir: cannot create directory '/a': Permission denied + return 1 + return 1

The recursion stops when makepath is denied permission to create the /a directory. The error return is passed all the way back, so the original makepath exits with nonzero status.

Tip: Use local variables with recursive functions

The preceding example glossed over a potential problem that you may encounter when you use a recursive function. During the execution of a recursive function, many separate instances of that function may be active simultaneously. All but one of them are waiting for their child invocation to complete.

Because functions run in the same environment as the shell that calls them, variables are implicitly shared by a shell and a function it calls so that all instances of the function share a single copy of each variable. Sharing variables can give rise to side effects that are rarely what you want. As a rule, you should use typeset to make all variables of a recursive function be local variables. See page 917 for more information.

The quiz Shell Script

Solve the following problem using a bash script:

Write a generic multiple-choice quiz program. The program should get its questions from data files, present them to the user, and keep track of the number of correct and incorrect answers. The user must be able to exit from the program at any time with a summary of results to that point.

The detailed design of this program and even the detailed description of the problem depend on a number of choices: How will the program know which subjects are available for quizzes? How will the user choose a subject? How will the program know when the quiz is over? Should the program present the same questions (for a given subject) in the same order each time, or should it scramble them?

Of course, you can make many perfectly good choices that implement the specification of the problem. The following details narrow the problem specification:

Each subject will correspond to a subdirectory of a master quiz directory. This directory will be named in the environment variable QUIZDIR, whose default will be ~/quiz. For example, you could have the following directories correspond to the subjects engineering, art, and politics: ~/quiz/engineering, ~/quiz/art, and ~/quiz/politics. Put the quiz directory in /usr/games if you want all users to have access to it (requires root privileges).
Each subject can have several questions. Each question is represented by a file in its subject's directory.
The first line of each file that represents a question is the text of the question. If it takes more than one line, you must escape the NEWLINE with a backslash. (This setup makes it easy to read a single question with the read builtin.) The second line of the file is an integer that specifies the number of choices. The next lines are the choices themselves. The last line is the correct answer. Following is a sample question file:
```
Who discovered the principle of the lever? 4 Euclid Archimedes Thomas Edison The Lever Brothers Archimedes
```
The program presents all the questions in a subject directory. At any point the user can interrupt the quiz with CONTROL-C, whereupon the program will summarize the results so far and exit. If the user does not interrupt, the program summarizes the results and exits when it has asked all questions for the chosen subject.
The program scrambles the questions in a subject before presenting them.

Following is a top-level design for this program:

Initialize. This involves a number of steps, such as setting the counts of the number of questions asked so far and the number of correct and wrong answers to zero. Sets up to trap CONTROL-C.
Present the user with a choice of subjects and get the user's response.
Change to the corresponding subject directory.
Determine the questions to be asked (that is, the filenames in that directory). Arrange them in random order.
Repeatedly present questions and ask for answers until the quiz is over or is interrupted by the user.
Present the results and exit.

Clearly some of these steps (such as step 3) are simple, whereas others (such as step 4) are complex and worthy of analysis on their own. Use shell functions for any complex step, and use the trap builtin to handle a user interrupt.

Here is a skeleton version of the program with empty shell functions:

function initialize { # Initializes variables. } function choose_subj { # Writes choice to standard output. } function scramble { # Stores names of question files, scrambled, # in an array variable named questions. } function ask { # Reads a question file, asks the question, and checks the # answer. Returns 1 if the answer was correct, 0 otherwise. If it # encounters an invalid question file, exit with status 2. } function summarize { # Presents the user's score. } # Main program initialize                        # Step 1 in top-level design subject=$(choose_subj)            # Step 2 [[ $? -eq 0 ]] || exit 2          # If no valid choice, exit cd $subject || exit 2             # Step 3 echo                              # Skip a line scramble                          # Step 4 for ques in ${questions[*]}; do   # Step 5      ask $ques      result=$?     (( num_ques=num_ques+1 ))     if [[ $result == 1 ]]; then         (( num_correct += 1 ))     fi     echo                          # Skip a line between questions     sleep ${QUIZDELAY:=1} done summarize                         # Step 6 exit 0

To make reading the results a bit easier for the user, a sleep call appears inside the question loop. It delays $QUIZDELAY seconds (default = 1) between questions.

Now the task is to fill in the missing pieces of the program. In a sense this program is being written backward. The details (the shell functions) come first in the file but come last in the development process. This common programming practice is called top-down design. In top-down design you fill in the broad outline of the program first and supply the details later. In this way you break the problem up into smaller problems, each of which you can work on independently. Shell functions are a great help in using the top-down approach.

One way to write the initialize function follows. The cd command causes QUIZDIR to be the working directory for the rest of the script and defaults to ~/quiz if QUIZDIR is not set.

function initialize () { trap 'summarize ; exit 0' INT     # Handle user interrupts num_ques=0                        # Number of questions asked so far num_correct=0                     # Number answered correctly so far first_time=true                   # true until first question is asked cd ${QUIZDIR:=~/quiz} || exit 2 }

Be prepared for the cd command to fail. The directory may be unsearchable or conceivably another user may have removed it. The preceding function exits with a status code of 2 if cd fails.

The next function, choose_subj, is a bit more complicated. It displays a menu using a select statement:

function choose_subj () { subjects=($(ls)) PS3="Choose a subject for the quiz from the preceding list: " select Subject in ${subjects[*]}; do     if [[ -z "$Subject" ]]; then         echo "No subject chosen.  Bye." >&2         exit 1     fi     echo $Subject     return 0 done }

The function first uses an ls command and command substitution to put a list of subject directories in the subjects array. Next the select structure (page 907) presents the user with a list of subjects (the directories found by ls) and assigns the chosen directory name to the Subject variable. Finally the function writes the name of the subject directory to standard output. The main program uses command substitution to assign this value to the subject variable [subject=$(choose_subj)].

The scramble function presents a number of difficulties. In this solution it uses an array variable (questions) to hold the names of the questions. It scrambles the entries in an array using the RANDOM variable (each time you reference RANDOM it has the value of a [random] integer between 0 and 32767):

function scramble () { typeset -i index quescount questions=($(ls)) quescount=${#questions[*]}        # Number of elements ((index=quescount-1)) while [[ $index > 0 ]]; do     ((target=RANDOM % index))     exchange $target $index     ((index -= 1)) done }

This function initializes the array variable questions to the list of filenames (questions) in the working directory. The variable quescount is set to the number of such files. Then the following algorithm is used: Let the variable index count down from quescount 1 (the index of the last entry in the array variable). For each value of index, the function chooses a random value target between 0 and index, inclusive. The command

((target=RANDOM % index))

produces a random value between 0 and index 1 by taking the remainder (the % operator) when $RANDOM is divided by index. The function then exchanges the elements of questions at positions target and index. It is convenient to do this in another function named exchange:

function exchange () { temp_value=${questions[$1]} questions[$1]=${questions[$2]} questions[$2]=$temp_value }

The ask function also uses the select structure. It reads the question file named in its argument and uses the contents of that file to present the question, accept the answer, and determine whether the answer is correct. (See the code that follows.)

The ask function uses file descriptor 3 to read successive lines from the question file, whose name was passed as an argument and is represented by $1 in the function. It reads the question into the ques variable and the number of questions into num_opts. The function constructs the variable choices by initializing it to a null string and successively appending the next choice. Then it sets PS3 to the value of ques and uses a select structure to prompt the user with ques. The select structure places the user's answer in answer, and the function then checks it against the correct answer from the file.

The construction of the choices variable is done with an eye toward avoiding a potential problem. Suppose that one answer has some whitespace in it. Then it might appear as two or more arguments in choices. To avoid this problem, make sure that choices is an array variable. The select statement does the rest of the work:

quiz

$ cat quiz #!/bin/bash # remove the # on the following line to turn on debugging # set -o xtrace #================== function initialize () { trap 'summarize ; exit 0' INT     # Handle user interrupts num_ques=0                        # Number of questions asked so far num_correct=0                     # Number answered correctly so far first_time=true                   # true until first question is asked cd ${QUIZDIR:=~/quiz} || exit 2 } #================== function choose_subj () { subjects=($(ls)) PS3="Choose a subject for the quiz from the preceding list: " select Subject in ${subjects[*]}; do     if [[ -z "$Subject" ]]; then         echo "No subject chosen.  Bye." >&2         exit 1     fi     echo $Subject     return 0 done } #================== function exchange () { temp_value=${questions[$1]} questions[$1]=${questions[$2]} questions[$2]=$temp_value } #================== function scramble () { typeset -i index quescount questions=($(ls)) quescount=${#questions[*]}         # Number of elements ((index=quescount-1)) while [[ $index > 0 ]]; do     ((target=RANDOM % index))     exchange $target $index      ((index -= 1)) done } #================== function ask () { exec 3<$1 read -u3 ques || exit 2 read -u3 num_opts || exit 2 index=0 choices=() while (( index < num_opts )) ; do     read -u3 next_choice || exit 2     choices=("${choices[@]}" "$next_choice")      ((index += 1)) done read -u3 correct_answer || exit 2 exec 3<&- if [[ $first_time = true ]]; then     first_time=false      echo -e "You may press the interrupt key at any time to quit.\n" fi PS3=$ques"  "                     # Make $ques the prompt for select                                   # and add some spaces for legibility. select answer in "${choices[@]}"; do     if [[ -z "$answer" ]]; then             echo  Not a valid choice. Please choose again.          elif [[ "$answer" = "$correct_answer" ]]; then             echo "Correct!"             return 1         else             echo "No, the answer is $correct_answer."             return 0     fi done } #================== function summarize () { echo                               # Skip a line if (( num_ques == 0 )); then      echo "You did not answer any questions"     exit 0 fi (( percent=num_correct*100/num_ques )) echo "You answered $num_correct questions correctly, out of \ $num_ques total questions." echo "Your score is $percent percent." } #================== # Main program initialize                        # Step 1 in top-level design subject=$(choose_subj)             # Step 2 [[ $? -eq 0 ]] || exit 2          # If no valid choice, exit cd $subject || exit 2              # Step 3 echo                               # Skip a line scramble                           # Step 4 for ques in ${questions[*]}; do    # Step 5      ask $ques      result=$?     (( num_ques=num_ques+1 ))     if [[ $result == 1 ]]; then         (( num_correct += 1 ))     fi     echo                          # Skip a line between questions     sleep ${QUIZDELAY:=1} done summarize                          # Step 6 exit 0