Command-Line Options | Learning the bash Shell: Unix Shell Programming (In a Nutshell (OReilly))

6.1 Command-Line Options

We have already seen many examples of the positional parameters (variables called 1, 2, 3, etc.) that the shell uses to store the command-line arguments to a shell script or function when it runs. We have also seen related variables like * (for the string of all arguments) and # (for the number of arguments).

Indeed, these variables hold all of the information on the user's command line. But consider what happens when options are involved. Typical UNIX commands have the form command [-options] args, meaning that there can be zero or more options. If a shell script processes the command teatime alice hatter, then $1 is "alice" and $2 is "hatter". But if the command is teatime -o alice hatter, then $1 is -o, $2 is "alice", and $3 is "hatter".

You might think you could write code like this to handle it:

  if [ "$1" = -o ]; then
      code that processes the -o option
      1=$2
      2=$3
  fi
  normal processing of $1 and $2...

But this code has several problems. First, assignments like 1=$2 are illegal because positional parameters cannot be assigned to directly. Even if they were legal, this kind of code limits how many arguments the script can handle, which is very unwise. Furthermore, if this command had several possible options, the code to handle all of them would get very messy very quickly.
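The first problem is easy to confirm. The following is a minimal sketch, not part of the original scripts: the function name try_assign is invented for illustration. It attempts the assignment in a subshell and reports whether the shell accepted it. Because 1 is not a valid variable name, bash treats the word as a command name rather than an assignment, so the attempt fails.

```shell
#!/bin/bash
# try_assign is a hypothetical helper used only for this demonstration.
try_assign() {
    # Run the attempt in a subshell and hide the shell's error message.
    if ( 1=$2 ) 2>/dev/null; then
        echo "assignment accepted"
    else
        echo "assignment rejected"
    fi
}

try_assign alice hatter
```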

6.1.1 shift

Luckily, the shell provides a way around this problem. The command shift performs the function of:

  1=$2
  2=$3
  ...

for every argument, regardless of how many there are. If you supply a numeric argument to shift, it will shift the arguments that many times over; for example, shift 3 has this effect:

  1=$4
  2=$5
  ...
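To watch the effect directly, here is a small sketch. The function demo_shift is an invented name; it prints the first positional parameter and the argument count before and after each shift.

```shell
#!/bin/bash
# demo_shift is a hypothetical function used only to show the effect of shift.
demo_shift() {
    echo "before: \$1=$1 count=$#"
    shift            # drop the first argument
    echo "after shift: \$1=$1 count=$#"
    shift 2          # drop two more arguments at once
    echo "after shift 2: \$1=$1 count=$#"
}

demo_shift a b c d e
```

Each shift also lowers the argument count, so a script can loop until the count reaches zero.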

This leads immediately to some code that handles a single option (call it -o ) and arbitrarily many arguments:

  if [ "$1" = -o ]; then
      process the -o option
      shift
  fi
  normal processing of arguments...

After the if construct, $1, $2, etc., are set to the correct arguments.
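A runnable version of this skeleton might look like the following sketch. It is written as a function so it can be pasted into an interactive shell, and the variable strong is an invented stand-in for whatever -o would actually control.

```shell
#!/bin/bash
# teatime as a function; "strong" is a placeholder setting, not from the book.
teatime() {
    local strong=no
    if [ "$1" = -o ]; then
        strong=yes      # process the -o option
        shift           # discard -o so the real arguments start at $1
    fi
    echo "strong=$strong first=$1 second=$2 count=$#"
}

teatime alice hatter
teatime -o alice hatter
```

With or without -o, the rest of the function sees "alice" as $1 and "hatter" as $2.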

We can use shift together with the programming features we have seen so far to implement simple option schemes. However, we will need additional help when things get more complex. The getopts built-in command, which we will introduce later, provides this help.

shift by itself gives us enough power to implement the -N option to the highest script we saw in Chapter 4 (Task 4-1). Recall that this script takes an input file that lists artists and the number of albums you have by them. It sorts the list and prints out the N highest numbers, in descending order. The code that does the actual data processing is:

  filename=$1
  howmany=${2:-10}
  sort -nr $filename | head -$howmany
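The ${2:-10} expansion supplies the default. A brief sketch of its behavior follows; show_howmany is an invented helper that mirrors only the howmany assignment.

```shell
#!/bin/bash
# show_howmany is a hypothetical helper that mirrors the howmany assignment.
show_howmany() {
    local howmany=${2:-10}   # use $2 if it is set and non-empty, otherwise 10
    echo "$howmany"
}

show_howmany albums          # second argument omitted
show_howmany albums 3        # second argument given
```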

Our original syntax for calling this script was highest filename [-N], where N defaults to 10 if omitted. Let's change this to a more conventional UNIX syntax, in which options are given before arguments: highest [-N] filename. Here is how we would write the script with this syntax:

  if [ -n "$(echo "$1" | grep '^-[0-9][0-9]*$')" ]; then
      howmany=$1
      shift
  elif [ -n "$(echo "$1" | grep '^-')" ]; then
      echo 'usage: highest [-N] filename'
      exit 1
  else
      howmany="-10"
  fi

  filename=$1
  sort -nr "$filename" | head $howmany

This uses the grep search utility to test if $1 matches the appropriate pattern. To do this we provide the regular expression ^-[0-9][0-9]*$ to grep, which is interpreted as "an initial dash followed by a digit, optionally followed by one or more digits." If a match is found, grep prints the matching line and the test is true; otherwise grep prints nothing and processing passes to the elif test. Notice that we have enclosed the regular expression in single quotes to stop the shell from interpreting the $ and *, so that they pass through to grep unmodified.

If $1 doesn't match, we test to see if it's an option at all, i.e., if it matches the pattern - followed by anything else. If it does, then it's invalid; we print an error message and exit with error status. If we reach the final (else) case, we assume that $1 is a filename and treat it as such in the ensuing code. The rest of the script processes the data as before.
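The three-way decision can be tried in isolation. The sketch below applies the same two regular expressions and reports which branch would be taken. The function name classify_arg is invented, and printf is used in place of echo (a deliberate deviation from the script) so that arguments such as -n are not swallowed as echo options.

```shell
#!/bin/bash
# classify_arg is a hypothetical helper that applies the same two grep tests
# as the script and reports which branch would be taken.
classify_arg() {
    if [ -n "$(printf '%s\n' "$1" | grep '^-[0-9][0-9]*$')" ]; then
        echo "count"
    elif [ -n "$(printf '%s\n' "$1" | grep '^-')" ]; then
        echo "invalid option"
    else
        echo "filename"
    fi
}

classify_arg -25
classify_arg -x
classify_arg -2x
classify_arg albums
```

Note that -2x falls into the "invalid option" branch: the first pattern is anchored at both ends, so every character after the dash must be a digit.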

We can extend what we have learned so far to a general technique for handling multiple options. For the sake of concreteness, assume that our script is called alice and we want to handle the options -a , -b , and -c :

  while [ -n "$(echo "$1" | grep '^-')" ]; do
      case $1 in
          -a) process option -a ;;
          -b) process option -b ;;
          -c) process option -c ;;
          *)  echo 'usage: alice [-a] [-b] [-c] args...'
              exit 1 ;;
      esac
      shift
  done
  normal processing of arguments...

This code checks $1 repeatedly as long as it starts with a dash (-). Then the case construct runs the appropriate code depending on which option $1 is. If the option is invalid (i.e., if it starts with a dash but isn't -a, -b, or -c), then the script prints a usage message and exits with an error status.

After each option is processed, the arguments are shifted over. The result is that the positional parameters are set to the actual arguments when the while loop finishes.

Notice that this code is capable of handling options of arbitrary length, not just one letter (e.g., -adventure instead of -a).
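As a sketch of how the skeleton behaves once the placeholders are filled in, the function below records each option in a flag variable. The flag variables and the final echo are assumptions made for illustration, return replaces exit so the function can be tried interactively, and printf replaces echo so that an argument like -n is tested rather than swallowed.

```shell
#!/bin/bash
# A function version of the alice skeleton; the flags are invented placeholders.
alice() {
    local a=off b=off c=off
    while [ -n "$(printf '%s\n' "$1" | grep '^-')" ]; do
        case $1 in
            -a) a=on ;;
            -b) b=on ;;
            -c) c=on ;;
            *)  echo 'usage: alice [-a] [-b] [-c] args...' >&2
                return 1 ;;
        esac
        shift
    done
    echo "a=$a b=$b c=$c args=$*"
}

alice -a -c rabbit queen
```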

6.1.2 Options with Arguments

We need to add one more ingredient to make option processing really useful. Recall that many commands have options that take their own arguments. For example, the cut command, on which we relied heavily in Chapter 4, accepts the option -d with an argument that determines the field delimiter (if it is not the default TAB). To handle this type of option, we just use another shift when we are processing the option.

Assume that, in our alice script, the option -b requires its own argument. Here is the modified code that will process it:

  while [ -n "$(echo "$1" | grep '^-')" ]; do
      case $1 in
          -a) process option -a ;;
          -b) process option -b
              $2 is the option's argument
              shift ;;
          -c) process option -c ;;
          *)  echo 'usage: alice [-a] [-b barg] [-c] args...'
              exit 1 ;;
      esac
      shift
  done
  normal processing of arguments...
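Filled in, the loop might look like the sketch below. The variable barg and the summary echo are invented for illustration, and the code is again wrapped in a function that uses return instead of exit.

```shell
#!/bin/bash
# Function version of the loop; barg holds the argument given to -b.
alice() {
    local a=off barg="" c=off
    while [ -n "$(printf '%s\n' "$1" | grep '^-')" ]; do
        case $1 in
            -a) a=on ;;
            -b) barg=$2     # the word after -b is its argument
                shift ;;    # extra shift to consume that argument
            -c) c=on ;;
            *)  echo 'usage: alice [-a] [-b barg] [-c] args...' >&2
                return 1 ;;
        esac
        shift
    done
    echo "a=$a barg=$barg c=$c args=$*"
}

alice -b tea -a rabbit
```

The shift inside the -b branch removes -b itself; the shift at the bottom of the loop then removes its argument, so the next iteration starts at the following option.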

6.1.3 getopts

So far, we have a complete, but constrained, way of handling command-line options. The above code does not allow a user to combine options with a single dash, e.g., -abc instead of -a -b -c. It also doesn't allow one to specify arguments to options without a space in between, e.g., -barg in addition to -b arg. ^[1]

^[1] Although most UNIX commands allow this, it is actually contrary to the Command Syntax Standard Rules in intro of the User's Manual.

The shell provides a built-in way to deal with multiple complex options without these constraints. The built-in command getopts ^[2] can be used as the condition of the while in an option-processing loop. Given a specification of which options are valid and which require their own arguments, it sets up the body of the loop to process each option in turn.

^[2] getopts replaces the external command getopt, used in Bourne shell programming; getopts is better integrated into the shell's syntax and runs more efficiently. C programmers will recognize getopts as very similar to the standard library routine getopt.

getopts takes two arguments. The first is a string that can contain letters and colons. Each letter is a valid option; if a letter is followed by a colon, the option requires an argument. getopts picks options off the command line and assigns each one (without the leading dash) to a variable whose name is getopts's second argument. As long as there are options left to process, getopts will return exit status 0; when the options are exhausted, it returns exit status 1, causing the while loop to exit.

getopts does a few other things that make option processing easier; we'll encounter them as we examine how to use getopts in this example:

  while getopts ":ab:c" opt; do
      case $opt in
          a)  process option -a ;;
          b)  process option -b
              $OPTARG is the option's argument ;;
          c)  process option -c ;;
          \?) echo 'usage: alice [-a] [-b barg] [-c] args...'
              exit 1 ;;
      esac
  done
  shift $(($OPTIND - 1))
  normal processing of arguments...

The call to getopts in the while condition sets up the loop to accept the options -a, -b, and -c, and specifies that -b takes an argument. (We will explain the : that starts the option string in a moment.) Each time the loop body is executed, it will have the latest option available, without a dash (-), in the variable opt.

If the user types an invalid option, getopts normally prints an unfortunate error message (of the form cmd: getopts: illegal option -- o) and sets opt to ?. However, if you begin the option letter string with a colon, getopts won't print the message. ^[3] We recommend that you specify the colon and provide your own error message in a case that handles ?, as above.

^[3] You can also turn off the getopts messages by setting the environment variable OPTERR to 0. We will continue to use the colon method in this book.
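With the leading colon in place, bash's getopts also reports which option caused the problem: for an invalid option it sets opt to ? and puts the offending letter in OPTARG, and for an option whose required argument is missing it sets opt to : and again puts the option letter in OPTARG. The sketch below shows both cases. The function report_opts and its messages are invented for illustration, and OPTIND is made local and reset so the function can be called more than once.

```shell
#!/bin/bash
# report_opts is a hypothetical function; it only describes what getopts found.
report_opts() {
    local opt OPTARG OPTIND
    OPTIND=1                 # restart option parsing on every call
    while getopts ":ab:c" opt; do
        case $opt in
            a|c) echo "option -$opt" ;;
            b)   echo "option -b with argument $OPTARG" ;;
            \?)  echo "invalid option -$OPTARG" ;;
            :)   echo "option -$OPTARG needs an argument" ;;
        esac
    done
}

report_opts -a -x -b
```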

We have modified the code in the case construct to reflect what getopts does. But notice that there are no more shift statements inside the while loop: getopts does not rely on shifts to keep track of where it is. It is unnecessary to shift arguments over until getopts is finished, i.e., until the while loop exits.

If an option has an argument, getopts stores it in the variable OPTARG, which can be used in the code that processes the option.

The one shift statement left is after the while loop. getopts stores in the variable OPTIND the number of the next argument to be processed; in this case, that's the number of the first (non-option) command-line argument. For example, if the command line were alice -ab rabbit, then $OPTIND would be "3". If it were alice -a -b rabbit, then $OPTIND would be "4".

The expression $(($OPTIND - 1)) is an arithmetic expression (as we'll see later in this chapter) equal to $OPTIND minus 1. This value is used as the argument to shift. The result is that the correct number of arguments are shifted out of the way, leaving the "real" arguments as $1, $2, etc.

Before we continue, now is a good time to summarize everything getopts does:

1. Its first argument is a string containing all valid option letters. If an option requires an argument, a colon follows its letter in the string. An initial colon causes getopts not to print an error message when the user gives an invalid option.

2. Its second argument is the name of a variable that will hold each option letter (without any leading dash) as it is processed.

3. If an option takes an argument, the argument is stored in the variable OPTARG.

4. The variable OPTIND contains the index of the next command-line argument to be processed. After getopts is done, it equals the index of the first "real" argument.
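The four points can be seen together in the following sketch. It is a function version of the alice loop with invented flag variables, and it prints OPTIND before shifting so that the value can be compared across different ways of typing the same options.

```shell
#!/bin/bash
# Function version of the getopts loop; the flags and echo lines are
# placeholders added for illustration.
alice() {
    local opt OPTARG OPTIND
    OPTIND=1                      # restart option parsing on every call
    local a=off barg="" c=off
    while getopts ":ab:c" opt; do
        case $opt in
            a)  a=on ;;
            b)  barg=$OPTARG ;;
            c)  c=on ;;
            \?) echo 'usage: alice [-a] [-b barg] [-c] args...' >&2
                return 1 ;;
        esac
    done
    echo "OPTIND=$OPTIND"
    shift $((OPTIND - 1))         # drop the options, keep the real arguments
    echo "a=$a barg=$barg c=$c args=$*"
}

alice -ab rabbit queen        # combined options, argument in the next word
alice -a -b rabbit queen      # separate options
alice -acbrabbit queen        # argument attached directly to -b
```

All three calls leave "queen" as the only real argument; only the value of OPTIND differs, because the options occupy a different number of words in each case.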

The advantages of getopts are that it minimizes extra code necessary to process options and fully supports the standard UNIX option syntax (as specified in intro of the User's Manual).

As a more concrete example, let's return to our graphics utility (Task 4-2). So far, we have given our script the ability to process various types of graphics files such as PCX files (ending with .pcx), JPEG files (.jpg), XPM files (.xpm), etc. As a reminder, here is what we have coded in the script so far:

  filename=$1

  if [ -z "$filename" ]; then
      echo "procfile: No file specified"
      exit 1
  fi

  for filename in "$@"; do
      ppmfile=${filename%.*}.ppm

      case $filename in
          *.gif) exit 0 ;;
          *.tga) tgatoppm "$filename" > "$ppmfile" ;;
          *.xpm) xpmtoppm "$filename" > "$ppmfile" ;;
          *.pcx) pcxtoppm "$filename" > "$ppmfile" ;;
          *.tif) tifftopnm "$filename" > "$ppmfile" ;;
          *.jpg) djpeg "$filename" > "$ppmfile" ;;
          *)     echo "procfile: $filename is an unknown graphics file."
                 exit 1 ;;
      esac

      outfile=${ppmfile%.ppm}.new.gif
      ppmquant -quiet 256 "$ppmfile" | ppmtogif -quiet > "$outfile"

      rm "$ppmfile"
  done

This script works quite well, in that it will convert the various different graphics files that we have lying around into GIF files suitable for our Web page. However, NetPBM has a whole range of useful utilities besides file converters that we could use on the images. It would be nice to be able to select some of them from our script.

Things we might wish to do to the images for our Web page include changing the size and placing a border around them. We want to make the script as flexible as possible; we will want to change the size of the resulting images and we might not want a border around every one of them, so we need to be able to specify to the script what it should do. This is where the command-line option processing will come in useful.

We can change the size of an image by using the NetPBM utility pnmscale. You'll recall from the last chapter that the NetPBM package has its own format called PNM, the Portable Anymap. The utilities we'll be using to change the size and add borders work on PNM files. Fortunately, our script already converts the various formats we give it into PNM files (actually PPM files in our script, which are full-color instances of PNM). Besides a PNM file, pnmscale also requires some arguments telling it how to scale the image. ^[4] There are various ways to do this, but the one we'll choose is -xysize, which takes a horizontal and a vertical size in pixels for the final image. ^[5]

^[4] We'll also need the -quiet option, which you may already have noticed as an option to the ppmquant and ppmtogif utilities. -quiet suppresses diagnostic output from some NetPBM utilities.

^[5] Actually, -xysize fits the image into a box defined by its arguments without changing the aspect ratio of the image, i.e., without stretching the image horizontally or vertically. For example, if you had an image of size 200 by 100 pixels and you processed it with pnmscale -xysize 100 100, you'd end up with an image of size 100 by 50 pixels.
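The arithmetic behind that footnote can be sketched with shell integer arithmetic. The function fit_box below is an invented helper, not part of NetPBM; it only illustrates the idea of fitting into a box while keeping the aspect ratio, and pnmscale's own rounding may differ from this integer division.

```shell
#!/bin/bash
# fit_box WIDTH HEIGHT BOXW BOXH
# Hypothetical helper: prints the size of the largest image with the same
# aspect ratio that fits inside the box, using integer arithmetic only.
fit_box() {
    local w=$1 h=$2 bw=$3 bh=$4
    if (( w * bh > h * bw )); then
        # The width is the limiting dimension.
        echo "$bw $(( h * bw / w ))"
    else
        # The height is the limiting dimension (or both fit exactly).
        echo "$(( w * bh / h )) $bh"
    fi
}

fit_box 200 100 100 100
```

For the 200 by 100 image in the footnote, the width is the limiting dimension, so the result is 100 by 50.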

The other utility we'll need is pnmmargin, which places a colored border around an image. It takes as arguments the width of the border in pixels and the color of the border.

Our graphics utility will need some options to reflect the ones we have just seen: -s size will specify a size into which the final image will fit (minus any border), -w width will specify the width of the border around the image, and -c color-name will specify the color of the border.

Here is the code for the script procimage that includes the option processing:

  # Set up the defaults
  size=320
  width=1
  colour="-color black"
  usage="Usage: $0 [-s N] [-w N] [-c S] imagefile..."

  while getopts ":s:w:c:" opt; do
      case $opt in
          s)  size=$OPTARG ;;
          w)  width=$OPTARG ;;
          c)  colour="-color $OPTARG" ;;
          \?) echo "$usage"
              exit 1 ;;
      esac
  done

  shift $(($OPTIND - 1))

  # No image files left after the options: print the usage message and quit
  if [ $# -eq 0 ]; then
      echo "$usage"
      exit 1
  fi

  # Process the input files
  for filename in "$@"; do
      ppmfile=${filename%.*}.ppm

      case $filename in
          *.gif) giftopnm "$filename" > "$ppmfile" ;;
          *.tga) tgatoppm "$filename" > "$ppmfile" ;;
          *.xpm) xpmtoppm "$filename" > "$ppmfile" ;;
          *.pcx) pcxtoppm "$filename" > "$ppmfile" ;;
          *.tif) tifftopnm "$filename" > "$ppmfile" ;;
          *.jpg) djpeg "$filename" > "$ppmfile" ;;
          *)     echo "$0: Unknown filetype '${filename##*.}'"
                 exit 1 ;;
      esac

      outfile=${ppmfile%.ppm}.new.gif

      # $colour stays unquoted so it splits into "-color" and the color name
      pnmscale -quiet -xysize $size $size "$ppmfile" |
          pnmmargin $colour $width |
          ppmquant -quiet 256 | ppmtogif -quiet > "$outfile"

      rm "$ppmfile"
  done

The first several lines of this script initialize variables used as default settings. The defaults set the image size to 320 pixels and the border to black with a width of 1 pixel.

The while, getopts, and case constructs process the options in the same way as in the previous example. The code for the first three options assigns the respective argument to a variable (replacing the default value). The last case is a catchall for any invalid options.
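The option handling can be tried without NetPBM installed. The sketch below is a dry run under stated assumptions: procimage_opts is an invented function that parses the same option string as the script but only prints the resulting settings and the remaining file names, and the literal name procimage stands in for $0 in the usage message.

```shell
#!/bin/bash
# procimage_opts is a hypothetical dry-run helper: it parses the same options
# as procimage but only prints the settings and the remaining file names.
procimage_opts() {
    local opt OPTARG OPTIND
    OPTIND=1                      # restart option parsing on every call
    local size=320 width=1 colour="-color black"
    local usage="Usage: procimage [-s N] [-w N] [-c S] imagefile..."
    while getopts ":s:w:c:" opt; do
        case $opt in
            s)  size=$OPTARG ;;
            w)  width=$OPTARG ;;
            c)  colour="-color $OPTARG" ;;
            \?) echo "$usage" >&2
                return 1 ;;
        esac
    done
    shift $((OPTIND - 1))
    if [ $# -eq 0 ]; then         # no image files were given
        echo "$usage" >&2
        return 1
    fi
    echo "size=$size width=$width colour=$colour files=$*"
}

procimage_opts -s 200 -c red alice.jpg hatter.tif
procimage_opts rabbit.gif
```

Options that are not given keep their defaults, which is why the second call reports a size of 320 and a black border.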

The rest of the code works in much the same way as in the previous example except that we have added the pnmscale and pnmmargin utilities to the processing pipeline.

The script also now generates a different filename; it appends .new.gif to the basename. This allows us to process a GIF file as input, applying scaling and borders, and write it out without destroying the original file.

This version doesn't address every issue, e.g., what if we don't want any scaling to be performed? We'll return to this script and develop it further in the next chapter.