6.3. Integer Variables and Arithmetic


The expression $(($OPTIND - 1)) in the last graphics utility example shows another way that the shell can do integer arithmetic. As you might guess, the shell interprets words surrounded by $(( and )) as arithmetic expressions.^[8] Variables in arithmetic expressions do not need to be preceded by dollar signs, though it is not wrong to do so.

^[8] You can also use the older form $[...], but we don't recommend this because it will be phased out in future versions of bash.

Arithmetic expressions are evaluated inside double quotes, like tildes, variables, and command substitutions. We're finally in a position to state the definitive rule about quoting strings: when in doubt, enclose a string in single quotes, unless it contains tildes or any expression involving a dollar sign, in which case you should use double quotes.
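The quoting rule can be tried out with a minimal sketch. The variable n and the message text are made up for this illustration; the point is only that the same characters are evaluated inside double quotes and left alone inside single quotes.

```shell
n=5   # hypothetical variable, used only for this illustration

# Inside double quotes the shell evaluates the arithmetic expression.
double="Twice n is $((n * 2))"

# Inside single quotes the text is kept literally, dollar sign and all.
single='Twice n is $((n * 2))'

echo "$double"
echo "$single"
```

The first echo prints the computed value; the second prints the expression text unchanged.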

For example, the date command on modern versions of UNIX accepts arguments that tell it how to format its output. The argument +%j tells it to print the day of the year, i.e., the number of days since December 31st of the previous year.

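We can use +%j to print a little holiday anticipation message. Because date pads the day number with zeros (e.g., 045) and the shell would read a number with a leading zero as octal, the 10# prefix (base notation, described later in this section) forces base 10:

echo "Only $(( (365 - 10#$(date +%j)) / 7 )) weeks until the New Year"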

We'll show where this fits in the overall scheme of command-line processing in Chapter 7.

The arithmetic expression feature is built into bash's syntax, and was available in the Bourne shell (most versions) only through the external command expr. Thus it is yet another example of a desirable feature provided by an external command being better integrated into the shell. getopts, as we have already seen, is another example of this design trend.

bash arithmetic expressions are equivalent to their counterparts in the Java and C languages.^[9] Precedence and associativity are the same as in C. Table 6-2 shows the arithmetic operators that are supported. Although some of these are (or contain) special characters, there is no need to backslash-escape them, because they are within the $((...)) syntax.

^[9] The assignment forms of these operators are also permitted. For example, $((x += 2)) adds 2 to x and stores the result back in x.

Table 6-2. Arithmetic operators
Operator	Meaning
++	Increment by one (prefix and postfix)
--	Decrement by one (prefix and postfix)
+	Plus
-	Minus
*	Multiplication
/	Division (with truncation)
%	Remainder
**	Exponentiation^[10]
<<	Bit-shift left
>>	Bit-shift right
&	Bitwise and
|	Bitwise or
~	Bitwise not
!	Logical not
^	Bitwise exclusive or
,	Sequential evaluation

^[10] Note that ** is not in the C language.
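A few of the operators from Table 6-2 in action. This is only a sketch: the numbers and variable names are chosen for illustration and are not from the original text.

```shell
# All values below are illustrative.
quotient=$(( 17 / 3 ))     # division truncates: 5
remainder=$(( 17 % 3 ))    # 2
power=$(( 2 ** 10 ))       # 1024
inverted=$(( ~5 ))         # bitwise not: -6

x=3
sum=$(( x += 2 ))          # assignment form: x becomes 5, and the value is 5
last=$(( x++, x * 2 ))     # comma: x becomes 6, then the value is 6 * 2 = 12

echo "$quotient $remainder $power $inverted $sum $last"
```

The comma operator evaluates both subexpressions in order and yields the value of the last one, which is why last ends up as 12 rather than 5.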

The ++ and -- operators are useful when you want to increment or decrement a value by one.^[11] They work the same as in Java and C, e.g., value++ increments value by 1. This is called post-increment; there is also a pre-increment: ++value. The difference becomes evident with an example:

^[11] ++ and -- are not available in versions of bash prior to 2.04.

$ i=0
$ echo $i
0
$ echo $((i++))
0
$ echo $i
1
$ echo $((++i))
2
$ echo $i
2

In both cases the value has been incremented by one. However, in the first case (post-increment) the value of the variable was passed to echo and then the variable was incremented. In the second case (pre-increment) the increment was performed and then the variable passed to echo.

Parentheses can be used to group subexpressions. The arithmetic expression syntax also (as in C) supports relational operators, which yield "truth values" of 1 for true and 0 for false. Table 6-3 shows the relational operators and the logical operators that can be used to combine relational expressions.

Table 6-3. Relational operators
Operator	Meaning
<	Less than
>	Greater than
<=	Less than or equal to
>=	Greater than or equal to
==	Equal to
!=	Not equal to
&&	Logical and
||	Logical or

For example, $((3 > 2)) has the value 1; $(( (3 > 2) || (4 <= 1) )) also has the value 1, since at least one of the two subexpressions is true.

The shell also supports base N numbers, where N can be from 2 to 36. The notation B#N means "N base B". Of course, if you omit the B#, the base defaults to 10.
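The B#N notation can be explored with a small sketch. The helper name to_decimal is hypothetical and exists only for this example; it simply pastes its two arguments into the B#N form inside an arithmetic expression.

```shell
# Hypothetical helper: print the decimal value of digits $2 read in base $1.
to_decimal() {
    echo $(( $1#$2 ))
}

to_decimal 2 1010    # binary 1010
to_decimal 8 17      # octal 17
to_decimal 16 ff     # hexadecimal ff
to_decimal 36 z      # in base 36, z is the digit 35
```

With these inputs the four calls print 10, 15, 255, and 35 respectively.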

6.3.1. Arithmetic Conditionals

In Chapter 5, we saw how to compare strings by the use of [...] notation (or with the test built-in). Arithmetic conditions can also be tested in this way. However, the tests have to be carried out with their own operators. These are shown in Table 6-4.

Table 6-4. Test relational operators
Operator	Meaning
-lt	Less than
-gt	Greater than
-le	Less than or equal to
-ge	Greater than or equal to
-eq	Equal to
-ne	Not equal to

And as with string comparisons, the arithmetic test returns a result of true or false; 0 if true, 1 otherwise. So, for example, [ 3 -gt 2 ] produces exit status 0, as does [ \( 3 -gt 2 \) -o \( 4 -le 1 \) ], but [ \( 3 -gt 2 \) -a \( 4 -le 1 \) ] has exit status 1 since the second subexpression isn't true. (Inside test, -o and -a are the logical "or" and "and" operators.)

In these examples we have had to escape the parentheses and pass them to test as separate arguments. As you can see, the result can look rather unreadable if there are many parentheses.

Another way to make arithmetic tests is to use the $((...)) form to encapsulate the condition. For example: [ $(((3 > 2) && (4 <= 1))) = 1 ]. This evaluates the conditionals and then compares the resulting value to 1 (true).^[12]

^[12] Note that the truth values returned by $((...)) are 1 for true and 0 for false, the reverse of the test and exit statuses.

There is an even neater and more efficient way of performing an arithmetic test: by using the ((...)) construct.^[13] This returns an exit status of 0 if the expression is true, and 1 otherwise.

^[13] ((...)) is not available in versions of bash prior to 2.0.

The above expression using this construct becomes (( (3 > 2) && (4 <= 1) )). This example returns with an exit status of 1 because, as we said, the second subexpression is false.
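The three styles of arithmetic test described in this section can be put side by side. In this sketch the function names are hypothetical; each one asks the same question ("is $1 greater than $2, and is $3 less than or equal to $4?") and reports the answer through its exit status.

```shell
# 1. test with escaped parentheses and -a
both_test()   { [ \( "$1" -gt "$2" \) -a \( "$3" -le "$4" \) ]; }

# 2. $((...)) yields 1 or 0, which is then compared with 1
both_expand() { [ $(( ($1 > $2) && ($3 <= $4) )) = 1 ]; }

# 3. ((...)) sets the exit status directly
both_arith()  { (( ($1 > $2) && ($3 <= $4) )); }

for f in both_test both_expand both_arith; do
    if $f 3 2 4 1; then
        echo "$f: true"
    else
        echo "$f: false"
    fi
done
```

All three report false for the values 3, 2, 4, and 1, because 4 <= 1 does not hold.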

6.3.2. Arithmetic Variables and Assignment

As we saw earlier, you can define integer variables by using declare. You can also evaluate arithmetic expressions and assign them to variables with the use of let. The syntax is:

let intvar=expression

It is not necessary (because it's actually redundant) to surround the expression with $(( and )) in a let statement. let doesn't create a variable of type integer; it only causes the expression following the assignment to be interpreted as an arithmetic one. As with any variable assignment, there must not be any space on either side of the equal sign (=). It is good practice to surround expressions with quotes, since many characters are treated as special by the shell (e.g., *, #, and parentheses); furthermore, you must quote expressions that include whitespace (spaces or TABs). See Table 6-5 for examples.

Table 6-5. Sample integer expression assignments
Assignment	Value of $x
let x=1+4	5
let x='1 + 4'	5
let x='(2+3) * 5'	25
let x='2 + 3 * 5'	17
let x='17 / 3'	5
let x='17 % 3'	2
let x='1<<4'	16
let x='48>>3'	6
let x='17 & 3'	1
let x='17 | 3'	19
let x='17 ^ 3'	18
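The remark that let does not create a variable of type integer can be seen in a short sketch. The variable names are arbitrary; declare -i is the integer declaration mentioned at the start of this section.

```shell
let x='1 + 4'    # x is 5, but x is still an ordinary (string) variable
x=x+1            # plain assignment: x now holds the text "x+1"
echo "$x"

declare -i y     # y has the integer attribute
y='1 + 4'        # the right-hand side is evaluated arithmetically: 5
y=y+1            # so is this one: 6
echo "$y"
```

Only the assignment made through let is arithmetic for x; for y, every assignment is.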

Task 6-1

Here is a small task that makes use of integer arithmetic. Write a script called ndu that prints a summary of the disk space usage for each directory argument (and any subdirectories), both in terms of bytes, and kilobytes or megabytes (whichever is appropriate).

Here is the code:

for dir in "${@:-.}"; do
    if [ -e "$dir" ]; then
        # du -s prints the block count, a TAB, and the name; keep the count
        result=$(du -s "$dir" | cut -f 1)
        let total=$result*1024

        echo -n "Total for $dir = $total bytes"

        if [ $total -ge 1048576 ]; then
            echo " ($((total/1048576)) Mb)"
        elif [ $total -ge 1024 ]; then
            echo " ($((total/1024)) Kb)"
        fi
    fi
done

To obtain the disk usage of files and directories, we can use the UNIX utility du. The default output of du is a list of directories with the amount of space each one uses, and looks something like this:

6       ./toc
3       ./figlist
6       ./tablist
1       ./exlist
1       ./index/idx
22      ./index
39      .

If you don't specify a directory to du, it will use the current directory (.). Each directory and subdirectory is listed along with the amount of space it uses. The grand total is given in the last line.

The amount of space used by each directory and all the files in it is listed in terms of blocks. Depending on the UNIX system you are running on, one block can represent 512 or 1024 bytes. Each file and directory uses at least one block. Even if a file or directory is empty, it is still allocated a block of space in the filesystem.

In our case, we are only interested in the total usage, given on the last line of du's output. To obtain only this line, we can use the -s option of du. Once we have the line, we want only the number of blocks and can throw away the directory name. For this we use our old friend cut to extract the first field.

Once we have the total, we can multiply it by the number of bytes in a block (1024 in this case) and print the result in terms of bytes. We then test to see if the total is greater than the number of bytes in one megabyte (1048576 bytes, which is 1024 x 1024) and if it is, we can print how many megabytes it is by dividing the total by this large number. If not, we see if it can be expressed in kilobytes, otherwise nothing is printed.

We need to make sure that any specified directories exist, otherwise du will print an error message and the script will fail. We do this by using the test for file or directory existence (-e) that we saw in Chapter 5 before calling du.

To round out this script, it would be nice to imitate du as closely as possible by providing for multiple arguments. To do this, we wrap the code in a for loop. Notice how parameter substitution has been used to specify the current directory if no arguments are given.
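The unit selection can be tried without running du at all. The following is a minimal sketch, not part of ndu: show_size is a hypothetical helper that applies the same thresholds to a byte count passed as its argument.

```shell
# Hypothetical helper: format a byte count using the thresholds from ndu.
show_size() {
    local total=$1
    if [ "$total" -ge 1048576 ]; then
        echo "$total bytes ($((total / 1048576)) Mb)"
    elif [ "$total" -ge 1024 ]; then
        echo "$total bytes ($((total / 1024)) Kb)"
    else
        echo "$total bytes"
    fi
}

show_size $(( 39 * 1024 ))      # 39 blocks of 1024 bytes
show_size $(( 3000 * 1024 ))    # 3000 blocks of 1024 bytes
```

Because integer division truncates, 3000 blocks (3072000 bytes) is reported as 2 Mb, not 3.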

As a bigger example of integer arithmetic, we will complete our emulation of the pushd and popd functions (Task 4-8). Remember that these functions operate on DIR_STACK, a stack of directories represented as a string with the directory names separated by spaces. bash's pushd and popd take additional types of arguments, which are:

pushd +n takes the nth directory in the stack (starting with 0), rotates it to the top, and cds to it.
pushd without arguments, instead of complaining, swaps the two top directories on the stack and cds to the new top.
popd +n takes the nth directory in the stack and just deletes it.

The most useful of these features is the ability to get at the nth directory in the stack. Here are the latest versions of both functions:

pushd ( )
{
    dirname=$1
    if [ -n "$dirname" ] && [ \( -d "$dirname" \) -a \( -x "$dirname" \) ]; then
        DIR_STACK="$dirname ${DIR_STACK:-$PWD" "}"
        cd $dirname
        echo "$DIR_STACK"
    else
        echo "still in $PWD."
    fi
}

popd ( )
{
    if [ -n "$DIR_STACK" ]; then
        DIR_STACK=${DIR_STACK#* }
        cd ${DIR_STACK%% *}
        echo "$PWD"
    else
        echo "stack empty, still in $PWD."
    fi
}

To get at the nth directory, we use a while loop that transfers the top directory to a temporary copy of the stack n times. We'll put the loop into a function called getNdirs that looks like this:

getNdirs ( )
{
    stackfront=''
    let count=0
    while [ $count -le $1 ]; do
        target=${DIR_STACK%${DIR_STACK#* }}
        stackfront="$stackfront$target"
        DIR_STACK=${DIR_STACK#$target}
        let count=count+1
    done

    stackfront=${stackfront%$target}
}

The argument passed to getNdirs is the n in question. The variable target contains the directory currently being moved from DIR_STACK to a temporary stack, stackfront. target will contain the nth directory and stackfront will have all of the directories above (and including) target when the loop finishes. stackfront starts as null; count, which counts the number of loop iterations, starts as 0.

The first line of the loop body copies the first directory on the stack to target. The next line appends target to stackfront, and the following line removes target from the stack with ${DIR_STACK#$target}. The last line increments the counter for the next iteration. The entire loop executes n+1 times, for values of count from 0 to n.
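To make the loop concrete, here is a trace of getNdirs on a made-up stack. This is a sketch under stated assumptions: the directory names and the argument 2 are invented, the stack is assumed to end with a space as pushd builds it, and the counter is initialised and incremented with plain assignments rather than let.

```shell
# A copy of getNdirs, lightly adapted for the trace.
getNdirs ( )
{
    stackfront=''
    count=0
    while [ $count -le $1 ]; do
        target=${DIR_STACK%${DIR_STACK#* }}
        stackfront="$stackfront$target"
        DIR_STACK=${DIR_STACK#$target}
        count=$((count + 1))
    done

    stackfront=${stackfront%$target}
}

DIR_STACK="/a /b /c /d "     # hypothetical stack; /a is directory 0
getNdirs 2

echo "target=[$target]"           # directory number 2
echo "stackfront=[$stackfront]"   # the directories above it
echo "DIR_STACK=[$DIR_STACK]"     # what is left of the stack

# Rotating directory 2 to the top means gluing the pieces back together.
rotated="$target$stackfront$DIR_STACK"
echo "rotated=[$rotated]"
```

For this stack the trace ends with target holding "/c ", stackfront holding "/a /b ", and DIR_STACK holding "/d ", so the rotated stack is "/c /a /b /d ".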

When the loop finishes, the directory in $target is the nth directory. The expression ${stackfront%$target} removes this directory from stackfront so that stackfront will contain the n directories above it (numbers 0 through n-1). Furthermore, DIR_STACK now contains the "back" of the stack, i.e., the stack without the first n+1 directories (numbers 0 through n). With this in mind, we can now write the code for the improved versions of pushd and popd:

pushd ( )
{
    if [ $(echo $1 | grep '^+[0-9][0-9]*$') ]; then

        # case of pushd +n: rotate n-th directory to top
        let num=${1#+}
        getNdirs $num

        DIR_STACK="$target$stackfront$DIR_STACK"
        cd $target
        echo "$DIR_STACK"

    elif [ -z "$1" ]; then
        # case of pushd without args; swap top two directories
        firstdir=${DIR_STACK%% *}
        DIR_STACK=${DIR_STACK#* }
        seconddir=${DIR_STACK%% *}
        DIR_STACK=${DIR_STACK#* }
        DIR_STACK="$seconddir $firstdir $DIR_STACK"
        cd $seconddir

    else
        # normal case of pushd dirname
        dirname=$1
        if [ \( -d $dirname \) -a \( -x $dirname \) ]; then
            DIR_STACK="$dirname ${DIR_STACK:-$PWD" "}"
            cd $dirname
            echo "$DIR_STACK"
        else
            echo still in "$PWD."
        fi
    fi
}

popd ( )
{
    if [ $(echo $1 | grep '^+[0-9][0-9]*$') ]; then

        # case of popd +n: delete n-th directory from stack
        let num=${1#+}
        getNdirs $num
        DIR_STACK="$stackfront$DIR_STACK"
        cd ${DIR_STACK%% *}
        echo "$PWD"

    else

        # normal case of popd without argument
        if [ -n "$DIR_STACK" ]; then
            DIR_STACK=${DIR_STACK#* }
            cd ${DIR_STACK%% *}
            echo "$PWD"
        else
            echo "stack empty, still in $PWD."
        fi
    fi
}

These functions have grown rather large; let's look at them in turn. The if at the beginning of pushd checks if the first argument is an option of the form +N. If so, the first body of code is run. The first let simply strips the plus sign (+) from the argument and assigns the result as an integer to the variable num. This, in turn, is passed to the getNdirs function.

The next assignment statement sets DIR_STACK to the new ordering of the list. Then the function cds to the new directory and prints the current directory stack.

The elif clause tests for no argument, in which case pushd should swap the top two directories on the stack. The first four lines of this clause assign the top two directories to firstdir and seconddir, and delete these from the stack. Then, as above, the code puts the stack back together in the new order and cds to the new top directory.

The else clause corresponds to the usual case, where the user supplies a directory name as argument.

popd works similarly. The if clause checks for the +N option, which in this case means "delete the nth directory." A let extracts the N as an integer; the getNdirs function puts the first n directories into stackfront. Finally, the stack is put back together with the nth directory missing, and a cd is performed in case the deleted directory was the first in the list.

The else clause covers the usual case, where the user doesn't supply an argument.
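The +N test and the stripping of the plus sign, which both functions share, can be tried in isolation. The helper plus_n below is hypothetical and only wraps those two steps; it is a sketch, not part of pushd or popd.

```shell
# Hypothetical helper: print N if the argument has the form +N,
# and return a nonzero status otherwise.
plus_n() {
    if [ $(echo "$1" | grep '^+[0-9][0-9]*$') ]; then
        echo "${1#+}"      # strip the leading plus sign
    else
        return 1
    fi
}

for arg in +3 +12 projects ''; do
    if n=$(plus_n "$arg"); then
        echo "'$arg' selects directory number $n"
    else
        echo "'$arg' is not of the form +N"
    fi
done
```

Only +3 and +12 pass the test; a directory name or an empty argument falls through to the other branches, just as in pushd.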

Before we leave this subject, here are a few exercises that should test your understanding of this code:

Implement bash's dirs command and the options +n and -l. dirs by itself displays the list of currently remembered directories (those in the stack). The +n option prints out the nth directory (starting at 0) and the -l option produces a long listing; any tildes (~) are replaced by the full pathname.
Modify the getNdirs function so that it checks for N exceeding the number of directories in the stack and exits with an appropriate error message if true.
Modify pushd, popd, and getNdirs so that they use variables of type integer in the arithmetic expressions.
Change getNdirs so that it uses cut (with command substitution), instead of the while loop, to extract the first N directories. This uses less code but runs more slowly because of the extra processes generated.
bash's versions of pushd and popd also have a -N option. In both cases -N causes the nth directory from the right-hand side of the list to have the operation performed on it. As with +N, it starts at 0. Add this functionality.
Use getNdirs to reimplement the selectd function from the last chapter.

6.3.3. Arithmetic for Loops

Chapter 5 introduced the for loop and briefly mentioned another type of for loop, more akin to the construct found in many programming languages like Java and C. This type of for loop is called an arithmetic for loop.^[14]

^[14] Versions of bash prior to 2.04 do not have this type of loop.

The form of an arithmetic for loop is very similar to those found in Java and C:

for (( initialisation ; ending condition ; update ))
do
        statements...
done

There are four sections to the loop, the first three being arithmetic expressions and the last being a set of statements just as in the standard loop that we saw in the last chapter.

The first expression, initialisation, is evaluated once at the start of the loop; its value is not tested. The loop then evaluates ending condition. If this is true, it executes statements, evaluates update, and repeats the cycle by evaluating ending condition again. The loop continues until ending condition becomes false or the loop is exited via one of the statements.
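The order of evaluation can be seen in a short counting loop. This is a sketch only; count_up is a hypothetical helper. Note that the initialisation i = 0 itself has the value 0, yet the loop still runs, because only the ending condition decides whether the body is executed.

```shell
# Hypothetical helper: print the integers 0 .. n-1 on one line.
count_up() {
    local i out=""
    for (( i = 0; i < $1; i++ )); do
        out="$out$i "
    done
    echo "${out% }"     # drop the trailing space
}

count_up 3
count_up 0              # ending condition is false at once: empty line
```

The first call prints 0 1 2; the second prints an empty line because the body is never entered.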

Usually initialisation is used to set an arithmetic variable to some initial value, update updates that variable, and ending condition tests the variable. Any of the values may be left out in which case they automatically evaluate to true. The following simple example:

for ((;;))
do
        read var
        if [ "$var" = "." ]; then
                break
        fi
done

loops forever reading lines until a line consisting of a . is found. We'll look at using the expressions in an arithmetic for loop in our next task.

Task 6-2

Write a script that uses for loops to print out a multiplication table for the numbers 1 to 12.

This task is best accomplished using nested for loops:

for (( i=1; i <= 12 ; i++ ))
do
        for (( j=1 ; j <= 12 ; j++ ))
        do
                echo -ne "$(( j * i ))\t"
        done
        echo
done

The script begins with a for loop using a variable i; the initialisation clause sets i to 1, the ending condition clause tests i against the limit (12 in our case), and the update clause adds 1 to i each time around the loop. The body of the loop is another for loop, this time with a variable called j. This is identical to the i for loop except that j is being updated.

The body of the j loop has an echo statement where the two variables are multiplied together and printed along with a trailing tab. We deliberately don't print a newline (with the -n option to echo) so that the numbers appear on one line. Once the inner loop has finished a newline is printed so that the set of numbers starts on the next line.
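As a variation on the same nested loops, the inner loop can be moved into a function and the numbers right-aligned with printf instead of separated by tabs. This is a sketch of one alternative, not the book's solution; mult_row is a hypothetical helper and the field width of 4 is an arbitrary choice.

```shell
# Hypothetical helper: build one row of the table for the multiplier $1.
mult_row() {
    local j row=""
    for (( j = 1; j <= 12; j++ )); do
        # %4d right-aligns each product in a field four characters wide
        row="$row$(printf '%4d' $(( j * $1 )))"
    done
    echo "$row"
}

for (( i = 1; i <= 12; i++ )); do
    mult_row $i
done
```

Fixed-width fields keep the columns aligned regardless of the terminal's tab stops.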

Arithmetic for loops are useful when dealing with arrays, which we'll now look at.
