Normally, a Qshell script begins execution with the first line and continues until control reaches the end of the script, or until an exit command is found. However, things don't always happen the way they should.
Occasionally, bad things happen to good scripts. This chapter discusses strategies that you can use to both debug and bulletproof your shell scripts.
The first and most widely used script debugging strategy involves tracing the script. Tracing a shell script outputs information about the variable and parameter substitution that occurs. Tracing also lists the commands that Qshell executes while the script is running.
Figure 15.1 shows a shell script that has a rather trivial problem: It produces extra spaces in the output messages.
    cat trace.qsh
    #!/usr/bin/qsh
    l="Now is the time for all good men "
    l="$l to come to the aid of their country"
    num=$(echo $l | wc -w)
    /usr/bin/echo "There are $num words in the quote"
    num=$(echo $l | wc -l)
    /usr/bin/echo "There are $num lines in the quote"

    trace.qsh
    There are 16 words in the quote
    There are 4 lines in the quote
Figure 15.1: This script is used to demonstrate Qshell tracing.
Use the -x (xtrace) option of the qsh interpreter to turn on tracing for the entire shell script. As the script runs, Qshell writes trace output that shows its progress. Each line of trace output is prefixed with the value of the PS4 variable, which is a plus sign (+) unless you change it.
Figure 15.2 demonstrates the problem script from Figure 15.1 with the -x option added to the interpreter. This example shows that the num variable is set to the value 16, with leading spaces coming from the output of the wc -w command.
    cat trace1.qsh
    #!/usr/bin/qsh -x
    l="Now is the time for all good men "
    l="$l to come to the aid of their country"
    num=$(echo $l | wc -w)
    /usr/bin/echo "There are $num words in the quote"
    num=$(echo $l | wc -l)
    /usr/bin/echo "There are $num lines in the quote"

    trace1.qsh
    + l=Now is the time for all good men
    + l=Now is the time for all good men to come to the aid of their country
    + echo Now is the time for all good men to come to the aid of their country
    + wc -w
    + num= 16
    + /usr/bin/echo There are 16 words in the quote
    There are 16 words in the quote
    + echo Now is the time for all good men to come to the aid of their country
    + wc -l
    + num= 4
    + /usr/bin/echo There are 4 lines in the quote
    There are 4 lines in the quote
Figure 15.2: Use the -x option to the qsh interpreter to trace the entire shell script.
Sometimes, tracing the entire script produces too much output. Use the set utility to trace a portion of a script or to turn on tracing in an interactive session. Use set -x to turn on the x (xtrace) option, as shown in Figure 15.3. To turn off the xtrace option, use set +x.
    cat trace2.qsh
    #!/usr/bin/qsh
    l="Now is the time for all good men "
    l="$l to come to the aid of their country"
    set -x
    num=$(echo $l | wc -w)
    set +x
    /usr/bin/echo "There are $num words in the quote"
    num=$(echo $l | wc -l)
    /usr/bin/echo "There are $num lines in the quote"

    trace2.qsh
    + echo Now is the time for all good men to come to the aid of their country
    + wc -w
    + num= 16
    + set +x
    There are 16 words in the quote
    There are 4 lines in the quote
Figure 15.3: Turn on the tracing option (set -x) for part of the script, then turn it off again.
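Once tracing has shown that the padding comes from the output of wc, one possible fix is to let the shell's own word splitting discard it. The following is only a sketch, not one of the original figures. It assumes a POSIX-style shell, and the variable name raw is hypothetical.

```shell
# Sketch only: strip the padding that wc may put in front of its count.
l="Now is the time for all good men "
l="$l to come to the aid of their country"

# Capture the raw count, which may carry leading blanks.
raw=$(echo $l | wc -w)

# An unquoted expansion is split into words, so the blanks are dropped.
# Note that set -- replaces the script's positional parameters.
set -- $raw
num=$1

echo "There are $num words in the quote"
```

Because set -- overwrites the positional parameters, use this approach only after the script has finished with its own arguments.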
Although xtrace is by far the most frequently used tracing option, the set utility supports other trace-related settings that can aid in script debugging. Use these options, shown in Table 15.1, to enhance the debugging of script problems.
Option | Description |
---|---|
-e | The "error exit" option causes the script to exit if a command fails and the exit status of the command has not been tested. The if, elif, while, and until statements and the \|\| and && operators test the exit status of a command. |
-j | The "job trace" option causes Qshell to print a status message with the iSeries' fully qualified job name and the pid (process ID) of each job that is started. |
-l (letter "ell") | The "log commands" option causes Qshell to write each command to a message in the iSeries job log before running the command. The -l option is provided in V5R2. |
-m | The "monitor" option causes Qshell to print a status message when a job completes. |
-t | The "trace" option causes Qshell to write internal trace information to the qsh_trace file in the user's home directory. |
-u | The "unset" option causes Qshell to write an error message to stderr and exit immediately when it expands a shell variable that is not set. |
-v | The "verbose" option causes Qshell to echo all input from stdin back to stderr during processing. |
-x | The "xtrace" option causes Qshell to trace commands (after expansion and substitution). Trace output is preceded by the value of the PS4 variable. |
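To see how the -u and -e options behave without ending your current session, you can turn them on inside a subshell. This is a minimal sketch that assumes a POSIX-style shell. The variable names (no_such_variable, u_status, e_output, e_status) are hypothetical, and the exact exit status values may differ from one shell to another.

```shell
# Sketch only: each option is set inside a subshell, so the exit it
# causes ends the subshell and not the shell running this example.

# -u: expanding a variable that is not set is treated as an error.
( set -u; echo "value: $no_such_variable" ) 2>/dev/null
u_status=$?

# -e: the first failing command whose status is not tested ends the
# subshell, so the echo that follows it never runs.
e_output=$( set -e; false; echo "not reached" )
e_status=$?

echo "set -u subshell exit status: $u_status"
echo "set -e subshell exit status: $e_status, output: '$e_output'"
```

Both subshells are run as plain commands on purpose. Following them with && or || would count as testing the exit status, which is exactly the case that -e does not act on.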
When an unexpected error happens in a process or script, the system sometimes generates a signal. Your script can use signals to handle error conditions. Handling error conditions using signals is transparent to most of the script and helps prevent undetected errors from propagating and causing more serious ones.
The trap and kill utilities allow your scripts or interactive Qshell sessions to deal with signals. The two utilities are related: trap allows a script to receive signals, while kill allows the script to send signals.
To see a list of the signals, use the kill or trap command with the -l option, as shown in Figure 15.4. The -l option was added to trap in V5R2; earlier releases require the kill command.
    kill -l
     1) ABRT      2) FPE       3) ILL       4) INT
     5) SEGV      6) TERM      7) USR1      8) USR2
     9) IO       10) bad trap 11) bad trap 12) KILL
    13) PIPE     14) ALRM     15) HUP      16) QUIT
    17) STOP     18) TSTP     19) CONT     20) CHLD
    21) TTIN     22) TTOU     23) URG      24) POLL
    25) bad trap 26) bad trap 27) WINCH    28) bad trap
    29) bad trap 30) bad trap 31) bad trap 32) BUS
    33) DANGER   34) PRE      35) SYS      36) TRAP
    37) PROF     38) VTALRM   39) XCPU     40) XFSZ
    41) bad trap 42) bad trap 43) bad trap 44) bad trap
    45) bad trap 46) bad trap 47) bad trap 48) bad trap
    49) bad trap 50) bad trap 51) bad trap 52) bad trap
    53) bad trap 54) bad trap 55) bad trap 56) bad trap
    57) bad trap 58) bad trap 59) bad trap 60) bad trap
    61) bad trap 62) bad trap 63) bad trap
    /home/jsmith $
Figure 15.4: Use the kill command with the -l option to view the list of defined signals.
There are 63 possible signals, numbered 1 through 63, but not all are defined or used by the system. The undefined signals are indicated by the text bad trap.
The defined signals have names. You can reference a signal by its name or by its number. It is better to reference signals by name, because they are not assigned the same numbers on all systems. For example, the interrupt signal, INT, is signal 4 in Qshell, but signal 2 in many Unix shells. Scripts that use the signal names are more easily ported to other systems than those that use signal numbers.
You may follow the -l option of the kill utility with a signal name or number. A signal number causes kill to display the signal's name. A signal name causes kill to display the signal's number. This is not true of trap, however. The trap command can include the -l option, but the option cannot be followed by a signal name or number. Figure 15.5 illustrates this use of the kill utility. Following the -l option with a three gives signal 3's name: ILL. Following the -l option with CONT gives the number of the CONT signal: 19.
    kill -l 3
    ILL
    kill -l CONT
    19
Figure 15.5: The kill utility interprets signal names and numbers.
You will not have to deal with most of these signals, because OS/400 and Qshell typically do not generate them. The ones you will probably use most are the INT signal, which is sent when you select option 2 from the System Request menu, and the TERM signal, which is sent when you use kill to terminate a process.
In addition to these signals, there are three "pseudo-signals," listed in Table 15.2. They are called pseudo-signals because they are not caused by interrupts from the operating system. Instead, they are regularly generated events that you can treat as if they were unexpected events. Two of them, DEBUG and ERR, are provided as aids for debugging. The EXIT signal is provided as a way to force final processing, no matter how a script ends.
Name | Number | Description |
---|---|---|
DEBUG | None | Occurs after execution of a command |
ERR | None | Occurs when a utility returns a non-zero exit status |
EXIT | | Occurs when a script or shell ends |
Keep the following technical notes in mind when using signals:
There are good reasons for not allowing every OS/400 job to receive signals. Many system services, processing systems, and other applications were written long before the implementation of signals, Qshell, or Unix-type APIs. Those traditional applications are not written for and cannot tolerate receiving signals; they do not know how to clean up after being interrupted or ended due to a signal.
Forcing release-to-release compatibility problems on source code or applications is anathema on the iSeries, so older, traditional applications are simply not exposed to the iSeries signals infrastructure. A job is exposed to signals only when it directly uses signals, Qshell, or other Unix-type system services. You can be sure that anything that explicitly uses signals, or anything that displays, prints, or retrieves a job's process ID (like the getpid() API or the Qshell special variable $$), will perform the appropriate system initialization so that the job is enabled for signals.
Use the trap utility to control a script's reaction to a signal (or a pseudo-signal). You can ignore a signal or execute one or more commands of your choosing. Here is the syntax of trap :
trap action signal
Replace signal with the names and/or numbers of one or more signals that are to be trapped. Replace action with one of the options listed in Table 15.3. To run commands when one of the signals is received, list one or more commands as the action.
Action | Description |
---|---|
commands or 'commands' or "commands" | Execute a list of commands. |
- (hyphen) or omitted | Reset to default behavior. |
two apostrophes or two quotes | Ignore the signal. |
If there are embedded blanks, surround the command with single quotes or double quotes. If there are two or more commands, separate them with semicolons or, better yet, put them into a subroutine. When the signal is received, Qshell executes the action, then continues with the statement after the one that caused the interrupt, unless the action terminates the script.
Here are a few more things to keep in mind when using the trap utility:
On a Unix-based system with the correct type of terminal (for example, a TTY or Telnet session), the operating system sends the INT signal when the user presses the Control-C key combination. On a 5250 terminal emulator, on the other hand, Control-C is mapped differently. There, option 2 (End Previous Request) of the System Request menu is the closest match. The Qshell terminal generates an INT signal when the "End Previous Request" option is used so that you can interrupt running applications.
In Figure 15.6, the INT signal is trapped. Option 2 of the System Request menu is used when the script is waiting for user input, but it is not visible because it generates no output. The generated INT signal causes the script to run function HandleInt, which sends a message to stderr and ends the script with an exit status of 3. The first two trap commands are identical. Since there are no embedded blanks in the command, the single quotes are optional.
    cat handleint.qsh
    #!/usr/bin/qsh
    # the following two lines are identical
    trap 'HandleInt' INT
    trap HandleInt INT
    function HandleInt
    {
      print -u2 "Script was aborted."
      exit 3
    }
    echo "Do you want to continue?"
    read response
    echo "You said: $response"

    ./handleint.qsh
    Do you want to continue?
    Script was aborted.
Figure 15.6: The INT signal is raised from option 2 of the System Request menu, interrupting the script.
The script in Figure 15.7 begins by creating two work files whose names include the process ID. When the script ends, the EXIT signal is sent, causing the script to delete the work files. Note that there are three places where the script can end, but the rm utility will run, no matter how the script ends.
    cat handleexit.qsh
    #!/usr/bin/qsh
    trap Cleanup EXIT
    function Cleanup
    {
      echo "Cleaning up files"
      rm temp.*.$$
    }
    echo "Starting work file 1" > temp.1.$$
    echo "Starting work file 2" > temp.2.$$
    cat temp.*.$$
    # All done, script ends normally with a call
    # to the exit utility or at the end of the file

    ./handleexit.qsh
    Starting work file 1
    Starting work file 2
    Cleaning up files
Figure 15.7: The EXIT signal is raised when the script ends normally.
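The EXIT action also runs when a script ends early through an explicit exit. The following sketch, which assumes a POSIX-style shell, demonstrates this by running the work inside a command substitution (a subshell) so that the exit does not end the shell you are experimenting in. The variable names output and status are hypothetical.

```shell
# Sketch only: the EXIT action runs even though the subshell ends
# early with a non-zero exit status.
output=$(
  trap 'echo "Cleaning up files"' EXIT
  echo "Working"
  exit 2
)
status=$?

echo "$output"
echo "Exit status: $status"
```

The exit status set by the exit command is preserved, because the action itself does not call exit.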
If any command in the script in Figure 15.8 returns a non-zero exit status, Qshell aborts the script immediately, setting the exit status to 5. The first print command executes, but the second one does not because no command br549 exists.
    cat handleerr.qsh
    #!/usr/bin/qsh
    trap 'echo "Whoa!"; exit 5' ERR
    print 'do something.'
    br549
    print 'do something else'

    ./handleerr.qsh
    do something.
    ./handleerr.qsh: 001-0019 Error found searching for command br549. No such path or directory.
    Whoa!
    print $?
    5
Figure 15.8: Use the ERR pseudo-signal to detect errors as they occur.
The -e (error exit) option setting provides the same basic support as setting up a trap for the ERR signal. The -e option setting provides less flexibility, however, because your script exits immediately with a predefined exit status. Figure 15.9 demonstrates using -e instead of the trap ERR statement.
    cat handleerr2.qsh
    #!/usr/bin/qsh
    set -e
    print 'do something.'
    br549
    print 'do something else'

    ./handleerr2.qsh
    do something.
    ./handleerr2.qsh: 001-0019 Error found searching for command br549. No such path or directory.
    print $?
    127
Figure 15.9: Use the -e (error exit) option to detect errors as they occur.
The script in Figure 15.10 does not include any error-trapping logic. Qshell sends an error message, but still assigns the invalid value to the TradingPartner variable.
    cat trading.qsh
    #!/usr/bin/qsh
    declare -i TradingPartner
    print 'Enter the trading partner ID'
    read TradingPartner
    print $TradingPartner
    exit 0

    ./trading.qsh
    Enter the trading partner ID
    35.8
    read: 001-0032 Number 35.8 is not valid.
    35.8
Figure 15.10: Failure to trap an error can result in subsequent errors.
Contrast the script in Figure 15.10 with the one in Figure 15.11. The new script uses additional error-trapping logic to detect the errors that occur in a block of code. The first trap command tells Qshell to print an error message when any command returns a non-zero status code. The second trap resets the ERR signal to its default action. If the read command encounters a value that is not of the integer type, the usual error message is written to the special file /dev/null, so that the user never sees it. The ERR condition is raised, causing the user to see the message in the trap command.
    cat trading2.qsh
    #!/usr/bin/qsh
    declare -i TradingPartner
    trap 'print "Trading partner ID is a whole number; try again."' ERR
    while true
    do
      print 'Enter the trading partner ID'
      read TradingPartner 2> /dev/null
      if [[ $? -eq 0 ]]
      then
        break
      fi
    done
    # Reset the trap
    trap ERR
    print $TradingPartner
    exit 0

    ./trading2.qsh
    Enter the trading partner ID
    49024G
    Trading partner ID is a whole number; try again.
    Enter the trading partner ID
    49024.5
    Trading partner ID is a whole number; try again.
    Enter the trading partner ID
    49024
    49024
Figure 15.11: Use the ERR pseudo-signal to detect specific error conditions.
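If you prefer not to depend on the ERR pseudo-signal, you can test the input explicitly. The sketch below is a different technique from the one in Figure 15.11: it checks each candidate value with a case pattern instead of relying on declare -i and trap. It assumes a POSIX-style shell, the function name is_whole_number and the variable names are hypothetical, and the input comes from a here-document rather than the keyboard so that the example runs unattended.

```shell
# Sketch only: validate input with a pattern instead of an ERR trap.

# Succeeds only when the argument is a non-empty string of digits.
is_whole_number() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

accepted=""
while read candidate
do
  if is_whole_number "$candidate"
  then
    accepted=$candidate
    break
  fi
  echo "Trading partner ID is a whole number; try again."
done <<EOF
49024G
49024.5
49024
EOF

echo "Accepted: $accepted"
```

In an interactive script, you would remove the here-document so that read takes its input from the user.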
Figure 15.12 demonstrates a script that confirms and optionally ignores the INT signal. In the example output for this script, the user chooses option 2 from the System Request menu while the script is attempting to read the user's name.
    cat confirmint.qsh
    #!/usr/bin/qsh
    trap ConfirmCancel INT
    function ConfirmCancel
    {
      print "You have asked to cancel script $0."
      print "Cancelling this script may result in corrupted data."
      print "If you wish to cancel, type YES."
      read confirmation
      if [[ "$confirmation" == YES ]]
      then
        print "Script $0 was cancelled."
        exit 33
      else
        print "Script $0 is continuing."
      fi
    }
    # CAREFUL. Writing infinite loops like this can
    # cause apparent hangs if a recurring error is detected.
    while true
    do
      print 'What is your name?'
      read name
      if [[ $? -eq 0 ]]
      then
        break
      fi
    done
    echo "Hello $name"

    ./confirmint.qsh
    What is your name?
    You have asked to cancel script ./confirmint.qsh.
    Cancelling this script may result in corrupted data.
    If you wish to cancel, type YES.
    no
    Script ./confirmint.qsh is continuing.
    What is your name?
    Fred
    Hello Fred
Figure 15.12: Use care with using or automating the INT signal.
The first portion of the script defines a trap for the INT signal, which is the signal generated by the ENDRQS (End Request) command, so that if the user chooses option 2 from the System Request menu, Qshell runs function ConfirmCancel. The ConfirmCancel function issues a warning message and requires the user to enter the word YES in capital letters to cancel execution. If the user enters YES, the script sends a cancellation message to stdout and exits with status 33. Otherwise, the script continues execution from the point at which the user chose option 2.
Figure 15.12 also demonstrates a rather subtle but important concept: the distinction between the shell script and the utilities used. Recall that each utility run by a shell script is either a built-in utility processed by Qshell directly, or a regular utility started in a separate job. Notice in Figure 15.12 that there is both a trap handler registered (the ConfirmCancel function) and error-checking logic with a loop for processing the read utility. At the point shown in the figure, the script is processing the "read name" command line.
When the user chooses option 2 from the System Request menu, the interrupt occurs. It first interrupts the read utility, and then causes Qshell to run the trap handler (the ConfirmCancel function). Regardless of the action of the ConfirmCancel function, read has already been interrupted. The script detects the error (the return from read being interrupted), and retries the operation.
The example in Figure 15.13 shows how to disable the ENDRQS (End Request) command by ignoring the INT signal. If the user selects option 2 from the System Request menu, nothing happens. After the "read name" command line, the second trap restores the default action of the ENDRQS command, which is to immediately terminate the script.
    cat ./ignoreint.qsh
    #!/usr/bin/qsh
    # CAREFUL. Writing infinite loops like this can
    # cause apparent hangs if a recurring error is detected.
    while true
    do
      print 'What is your name?'
      trap "" INT
      read name
      # Save the status of read before trap replaces $?
      rc=$?
      trap - INT
      if [[ $rc -eq 0 ]]
      then
        break
      fi
    done
    echo "Hello $name"

    ./ignoreint.qsh
    What is your name?
    Fred
    Hello Fred
Figure 15.13: Use trap to disable an interrupt.
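You can experiment with ignoring a signal without using the System Request menu by having a script send a signal to itself. The following sketch assumes a POSIX-style shell. It uses USR1 rather than INT only because USR1 is convenient to send with kill, and the variable name survived is hypothetical.

```shell
# Sketch only: an empty action tells the shell to ignore the signal.
trap '' USR1

# Send USR1 to this shell. Because the signal is ignored, the shell
# keeps running.
kill -s USR1 $$

survived=yes
echo "Still running after USR1: $survived"

# Restore the default action for USR1.
trap - USR1
```

Run the sketch as a script of its own. Without the first trap command, the default action for USR1 would normally end the script.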
You have seen the way that scripts can handle signals and pseudo-signals through the trap utility. Those same scripts might need to send signals.
Despite its name, the kill utility can send any signal to a job. In its simplest form, kill can be loosely equated to the CL command ENDJOB (End Job), in that it can terminate a process. In this form, it requires only one parameter: the process ID of the job to be terminated. In the absence of other parameters, kill sends the TERM signal (assigned the number 6 in OS/400) to the process, terminating it.
In its other forms, kill sends arbitrary signals to a job or jobs. These signals are used by scripts or programs for various indicators and for more general actions. Other than terminating a job, the second most common use of kill is to tell a job to "refresh" in some fashion. For example, use kill to send a signal that, when received, causes the target job to reread its configuration files and restart with the new configuration.
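The "refresh" idea can be sketched with a handler that runs each time a signal arrives. This example assumes a POSIX-style shell. The function name reload_config and the counter reloads are hypothetical, and the handler only counts the requests where a real script would reread its configuration files.

```shell
# Sketch only: treat USR1 as a request to refresh.
reloads=0

# Hypothetical handler: a real script would reread its configuration here.
reload_config() {
  reloads=$((reloads + 1))
  echo "Reloading configuration (request $reloads)"
}

trap reload_config USR1

# Send the signal to this shell twice. The handler runs after each
# kill command completes, and the script then continues.
kill -s USR1 $$
kill -s USR1 $$

echo "Reload requests handled: $reloads"
```

In practice the signal would come from another job, which would pass the target's process ID to kill in place of $$.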
The syntax of the kill command is as follows:
    kill [-signal] process-id
    kill [-s signal-name] process-id
    kill [-n signal-number] process-id
    kill -l [signal-name]
Details about the options for kill are given in Table 15.4.
Option | Description |
---|---|
-l name / number | List the signals given by the signal-name parameters, or list all signals if no parameters are specified. |
-n number | Send signal number number to the list of process IDs specified. |
-s name | Send signal name name to the list of process IDs specified. |
-signal | A shortcut for the -n or -s option, the signal name or number can be immediately prefixed by a dash to send the signal. |
You may use the -s and -n options to send signals other than TERM. Follow the -n option with a signal number, or follow -s with a signal name. The -l option was demonstrated in Figures 15.4 and 15.5.
Use special care if you're familiar with the kill utility on other platforms. Signal numbers are assigned different values on various platforms. For example, a command like kill -9 won't act as expected, because signal 9 in Qshell is IO and the KILL signal is number 12 (see Figure 15.4). Instead, use the kill utility with the name of the signal, such as kill -KILL.
For each form of the kill utility, the process-id parameter can be given in one of three forms to indicate a particular job:
Figure 15.14 demonstrates a simple script, receiver.qsh, which provides a generic signal-handling tool. Use this script to experiment with the kill utility. The example shown submits the receiver.qsh script in the background and sends the TERM signal (the default signal) using the process ID that Qshell displays when starting jobs in the background.
    cat receiver.qsh
    #!/usr/bin/qsh
    # This script is a generic signal receiver script.
    # It is used to demonstrate handling signals.
    if [ "$1" = "" ]; then
      echo "Usage: receiver.qsh [signal-name]..."
      exit 1
    fi
    trap HandleSignal $*
    function HandleSignal
    {
      echo "Script receiver.qsh received a signal. Ending."
      exit 0
    }
    while true; do
      echo "Script receiver.qsh is trapping signals: $*"
      sleep 10;
    done

    receiver.qsh TERM &
    [1] 194
    Script receiver.qsh is trapping signals: TERM
    kill 194
    Script receiver.qsh received a signal. Ending.
Figure 15.14: Send the TERM signal to a running script using the process ID.
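When a script starts the background job itself, it cannot read the process ID from the screen. The following sketch assumes a POSIX-style shell in which the special variable $! holds the process ID of the most recent background job. The variable names pid and status are hypothetical, and a subshell stands in for receiver.qsh so that the example is self-contained.

```shell
# Sketch only: start a background job that traps TERM, then end it.
(
  trap 'echo "Background job received TERM. Ending."; exit 0' TERM
  while true
  do
    sleep 1
  done
) &

# $! is the process ID of the job that was just started.
pid=$!

# Give the background job time to install its trap.
sleep 2

# Send the TERM signal by name, then wait for the job to finish.
kill -s TERM "$pid"
wait "$pid"
status=$?

echo "Background job ended with status $status"
```

The wait utility returns the exit status of the background job, which here is the status set by the exit command in the trap action.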
The example shown in Figure 15.15 builds slightly on Figure 15.14. In this example, the command lines for the kill utility also use the %number Qshell shortcut to target the running script using the Qshell job number. The Qshell job number and the job name of submitted Qshell jobs can be listed using the jobs utility. Be aware that the Qshell job number and job name differ from the iSeries job number and job name.
    receiver.qsh USR1 USR2 &
    [1] 197
    Script receiver.qsh is trapping signals: USR1 USR2
    kill %1
    jobs
    [1] Terminated receiver.qsh USR1 USR2 &
    receiver.qsh USR1 USR2 &
    [1] 198
    Script receiver.qsh is trapping signals: USR1 USR2
    kill -USR1 %1
    Script receiver.qsh received a signal. Ending.
Figure 15.15: Send the TERM signal and the USR1 signal using the Qshell job-number shortcut.
Using the same receiver.qsh script as Figure 15.14, the TERM signal in Figure 15.15 is first sent to the running script. The script terminates immediately because it does not trap the TERM signal. The second attempt sends the USR1 signal to the running script. Since the script is trapping the USR1 signal, the handler runs and prints out the message.
The example in Figure 15.16 again uses the receiver.qsh script. In this case, the Qshell jobs utility displays the running Qshell jobs. The INT signal is sent to the running script using the %jobname shortcut. Since the receiver.qsh script is trapping the INT signal, it runs the HandleSignal function and exits.
    receiver.qsh INT &
    [1] 202
    Script receiver.qsh is trapping signals: INT
    jobs
    [1] Running receiver.qsh INT &
    kill -INT %receiver
    Script receiver.qsh received a signal. Ending.
Figure 15.16: Send the INT signal using the Qshell job name shortcut.
In this chapter, you've learned about some advanced Qshell programming features: