Arrays | Learning the bash Shell: Unix Shell Programming (In a Nutshell (OReilly))

6.4 Arrays

The pushd and popd functions use a string variable to hold a list of directories and manipulate the list with the string pattern-matching operators. Although this is quite efficient for adding or retrieving items at the beginning or end of the string, it becomes cumbersome when attempting to access items that are anywhere else, e.g., obtaining item N with the getNdirs function. It would be nice to be able to specify the number, or index, of the item and retrieve it. Arrays allow us to do this.^[12]

^[12] Support for arrays is not available in versions of bash prior to 2.0.

An array is like a series of slots that hold values. Each slot is known as an element, and each element can be accessed via a numerical index. An array element can contain a string or a number, and you can use it just like any other variable. The indices for arrays start at 0 and continue up to a very large number.^[13] So, for example, the fifth element of array names would be names[4]. Indices can be any valid arithmetic expression that evaluates to a number greater than or equal to 0.

^[13] Actually, up to 599147937791. That's almost six hundred billion, so yes, it's pretty large.

There are several ways to assign values to arrays. The most straightforward way is with an assignment, just like any other variable:

  names[2]=alice
  names[0]=hatter
  names[1]=duchess

This assigns hatter to element 0, duchess to element 1, and alice to element 2 of the array names.

Another way to assign values is with a compound assignment:

 names=([2]=alice [0]=hatter [1]=duchess)

This is equivalent to the first example and is convenient for initializing an array with a set of values. Notice that we didn't have to specify the indices in numerical order. In fact, we don't even have to supply the indices if we reorder our values slightly:

 names=(hatter duchess alice)

bash automatically assigns the values to consecutive elements starting at 0. If we provide an index at some point in the compound assignment, the values get assigned consecutively from that point on, so:

 names=(hatter [5]=duchess alice)

assigns hatter to element 0, duchess to element 5, and alice to element 6.
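To check where the values land, we can print the indices that are in use. The following is a minimal sketch; it relies on the ${!names[*]} expansion, which is not covered in the text above and assumes bash 3.0 or later. The variable indices is our own name.

```shell
#!/usr/bin/env bash
# Values that follow an explicit index continue from that index.
names=(hatter [5]=duchess alice)

# ${!names[*]} expands to the indices in use (assumes bash 3.0 or later).
indices="${!names[*]}"

echo "indices in use: $indices"
echo "element 6: ${names[6]}"
```

With the default IFS, the indices are joined with spaces, so the gap between 0 and 5 is easy to see.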

An array is created automatically by any assignment of these forms. To explicitly create an empty array, you can use the -a option to declare. Any attributes that you set for the array with declare (e.g., the read-only attribute) apply to the entire array. For example, the statement declare -ar names would create a read-only array called names. Every element of the array would be read-only.

An element in an array may be referenced with the syntax ${array[i]}. So, from our last example above, the statement echo ${names[5]} would print the string "duchess". If no index is supplied, array element 0 is assumed.
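The short sketch below illustrates the three ways of referencing an element described so far: an explicit index, no index at all, and an arithmetic expression as the index. The variable i is ours, chosen only for illustration.

```shell
#!/usr/bin/env bash
names=(hatter [5]=duchess alice)

echo "${names[5]}"      # explicit index: element 5

# With no index, element 0 is assumed.
echo "$names"

# The index is evaluated as an arithmetic expression,
# so variables can be used without a leading $.
i=2
echo "${names[i+4]}"    # element 6
```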

You can also use the special indices @ and *. These return all of the values in the array and work in the same way as for the positional parameters. When the array reference is within double quotes, * expands to one word consisting of all the values in the array separated by the first character of the IFS variable, while @ expands each value to a separate word. When unquoted, both of them expand the values of the array to separate words (which are then subject to the usual word splitting). Just as with positional parameters, this is useful for iterating through the values with a for loop:

 for i in "${names[@]}"; do
     echo "$i"
 done
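The difference between the quoted forms of @ and * is easiest to see by counting the words that each one produces. In this sketch, count_words is a hypothetical helper of our own, and the value "mock turtle" is chosen only because it contains a space.

```shell
#!/usr/bin/env bash
names=(hatter "mock turtle" alice)

# Hypothetical helper: prints the number of arguments it receives.
count_words() { echo $#; }

at_count=$(count_words "${names[@]}")      # one word per element
star_count=$(count_words "${names[*]}")    # all elements joined into one word
unquoted_count=$(count_words ${names[@]})  # "mock turtle" is split in two

# The quoted * form joins with the first character of IFS.
# IFS is changed only inside the command substitution's subshell.
joined=$(IFS=,; echo "${names[*]}")

echo "quoted @: $at_count, quoted *: $star_count, unquoted: $unquoted_count"
echo "$joined"
```

This is why the quoted @ form is usually the one to use in a for loop: values that contain spaces stay in one piece.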

Any array elements that are unassigned don't exist; they default to null strings if you explicitly reference them. Therefore, the previous looping example will print out only the assigned elements in the array names. If there were three values at indexes 1, 45, and 1005, only those three values would be printed.
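As a quick illustration of this behavior, the sketch below builds an array with values at indexes 1, 45, and 1005 and counts how many times the loop body runs. The array name sparse and its values are our own.

```shell
#!/usr/bin/env bash
sparse=([1]=one [45]=forty-five [1005]=thousand-five)

visited=0
for value in "${sparse[@]}"; do
    echo "$value"
    visited=$((visited + 1))
done
echo "loop ran $visited times"

# An unassigned element expands to the null string.
echo "element 2 is [${sparse[2]}]"
```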

A useful operator that you can use with arrays is #, the length operator that we saw in Chapter 4. To find out the length of any element in the array, you can use ${#array[i]}. Similarly, to find out how many values there are in the array, use * or @ as the index. So, for names=(hatter [5]=duchess alice), ${#names[5]} has the value 7, and ${#names[@]} has the value 3.
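The following sketch runs the length operator against the same array. It also shows ${#names}, which, following the rule that a missing index means element 0, gives the length of the first element rather than the number of elements.

```shell
#!/usr/bin/env bash
names=(hatter [5]=duchess alice)

echo "${#names[5]}"   # length of "duchess"
echo "${#names[@]}"   # number of assigned elements
echo "${#names[*]}"   # same count, using * as the index
echo "${#names}"      # length of element 0, "hatter"
```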

Reassigning to an existing array with a compound array statement replaces the old array with the new one. All of the old values are lost, even if they were at different indices from the new elements. For example, if we reassigned names to be ([100]=tweedledee tweedledum), the values hatter, duchess, and alice would disappear.
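A minimal sketch of this replacement behavior, using the same values as the text:

```shell
#!/usr/bin/env bash
names=(hatter duchess alice)

# A compound assignment replaces the whole array, not just indices 100 and 101.
names=([100]=tweedledee tweedledum)

echo "count: ${#names[@]}"
echo "element 0 is [${names[0]}]"     # the old value is gone
echo "element 101: ${names[101]}"
```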

You can destroy any element or the entire array by using the unset built-in. If you specify an index, that particular element will be unset. unset names[100], for instance, would remove the value at index 100 (tweedledee in the example above). However, unlike assignment, if you don't specify an index the entire array is unset, not just element 0. You can explicitly specify unsetting the entire array by using * or @ as the index.
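The sketch below unsets one element and then the whole array. Quoting the argument to unset is our own precaution, not something the text requires: it stops the shell from treating names[100] as a filename pattern if a matching file happens to exist in the current directory.

```shell
#!/usr/bin/env bash
names=([100]=tweedledee tweedledum)

# Remove a single element; the quotes prevent filename expansion.
unset 'names[100]'
after_one=${#names[@]}
echo "after unsetting element 100: $after_one value(s) left"

# With no index, the entire array is unset.
unset names
after_all=${#names[@]}
echo "after unsetting the array: $after_all value(s) left"
```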

Let's now look at a simple example that uses arrays to match user IDs to account names on the system. The code takes a user ID as an argument and prints the name of the account plus the number of accounts currently on the system:

  for i in $(cut -f 1,3 -d: /etc/passwd) ; do
      array[${i#*:}]=${i%:*}
  done

  echo "User ID $1 is ${array[$1]}."
  echo "There are currently ${#array[@]} user accounts on the system."

We use cut to create a list from fields 1 and 3 in the /etc/passwd file. Field 1 is the account name and field 3 is the user ID for the account. The script loops through this list using the user ID as an index for each array element and assigns each account name to that element. The script then uses the supplied argument as an index into the array, prints out the value at that index, and prints the number of existing array values.
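The assignment inside the loop packs two pattern-matching operators into one line. To see what each one does, here is a sketch that applies them to a single made-up entry of the kind cut produces; the entry alice:1001 and the variables uid and name are assumptions for illustration, not taken from a real /etc/passwd.

```shell
#!/usr/bin/env bash
# One "name:uid" pair, as produced by cut -f 1,3 -d: (made-up sample).
entry="alice:1001"

uid=${entry#*:}     # delete the shortest match of "*:" from the front
name=${entry%:*}    # delete the shortest match of ":*" from the end

# The user ID becomes the index and the account name the value.
array[$uid]=$name

echo "User ID $uid is ${array[$uid]}."
```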

Some of the environment variables in bash are arrays: DIRSTACK functions as a stack for the pushd and popd built-ins, BASH_VERSINFO is an array of version information for the current instance of the shell, and PIPESTATUS is an array of exit status values for the last foreground pipe that was executed.
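As a brief sketch of two of these arrays in use, the following runs a three-command pipeline and inspects the results. The array statuses is our own name; we copy PIPESTATUS into it straight away because the next command that runs overwrites PIPESTATUS.

```shell
#!/usr/bin/env bash
# A pipeline whose middle command fails.
true | false | true

# Copy the statuses immediately, before another command replaces them.
statuses=("${PIPESTATUS[@]}")

echo "exit statuses: ${statuses[*]}"
echo "number of commands in the pipe: ${#statuses[@]}"

# Element 0 of BASH_VERSINFO holds the major version number.
echo "bash major version: ${BASH_VERSINFO[0]}"
```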

We'll see a further use of arrays when we build a bash debugger in Chapter 9.

To end this chapter, here are some problems relating to what we've just covered:

1. Improve the account ID script so that it checks whether the argument is a number. Also, add a test to print an appropriate message if the user ID doesn't exist.

2. Make the script print out the username (field 5) as well. Hint: this isn't as easy as it sounds. A username can have spaces in it, causing the for loop to iterate on each part of the name.

3. As mentioned earlier, the built-in versions of pushd and popd use an array to implement the stack. Change the pushd, popd, and getNdirs code that we developed in this chapter so that it uses arrays.