Pointers and Multidimensional Arrays


How do pointers relate to multidimensional arrays? Let's look at some examples now to find the answer. To simplify the discussion, you'll examine an array that is much smaller than rain. Suppose you have this declaration:

 int zippo[4][2];  /* an array of arrays of ints */

Then zippo, being the name of an array, is the address of the first element of the array. In this case, the first element of zippo is itself an array of two ints, so zippo is the address of an array of two ints. Let's analyze that further in terms of pointer properties:

Because zippo is the address of the array's first element, zippo equals &zippo[0]. Next, zippo[0] is itself an array of two integers, so zippo[0] equals &zippo[0][0], the address of its first element, an int. In short, zippo[0] is the address of an int-sized object, and zippo is the address of a two-int-sized object. Because both the integer and the array of two integers begin at the same location, both zippo and zippo[0] have the same numeric value.
Adding 1 to a pointer or address yields a value larger by the size of the referred-to object. In this respect, zippo and zippo[0] differ, for zippo refers to an object two ints in size, and zippo[0] refers to an object one int in size. Therefore, zippo + 1 and zippo[0] + 1 do not have the same value.
Dereferencing a pointer or an address (applying the * operator or else the [ ] operator with an index) yields the value represented by the referred-to object. Because zippo[0] is the address of its first element zippo[0][0], *(zippo[0]) represents the value stored in zippo[0][0], an int value. Similarly, *zippo represents the value of its first element, zippo[0], but zippo[0] itself is the address of an int. It's the address &zippo[0][0], so *zippo is &zippo[0][0]. Applying the dereferencing operator to both expressions implies that **zippo equals *&zippo[0][0], which reduces to zippo[0][0], an int. In short, zippo is the address of an address and must be dereferenced twice to get an ordinary value. An address of an address or a pointer to a pointer is an example of double indirection.

Clearly, increasing the number of array dimensions increases the complexity of the pointer view. At this point, most students of C begin realizing why pointers are considered one of the more difficult aspects of the language. You might want to study the preceding points carefully and see how they are illustrated in the following examples. Listing 10.13 is a program that prints addresses.

Listing 10.13 The `zippo1.c` program.

 /* zippo1.c -- addresses */
 #include <stdio.h>
 int main(void)
 {
      int zippo[4][2];

      printf("zippo = %p, zippo[0] = %p\n", zippo, zippo[0]);
      printf("&zippo[0][0] = %p, *zippo = %p\n",
                &zippo[0][0], *zippo);
      return 0;
 }

Here is the output:

 zippo = 0064FDD8, zippo[0] = 0064FDD8
 &zippo[0][0] = 0064FDD8, *zippo = 0064FDD8

The output shows that the address of the two-dimensional array zippo and the address of the one-dimensional array zippo[0] are the same. Each is the address of the corresponding array's first element, and this is the same numerically as &zippo[0][0]. Also, note that zippo and *zippo have the same value.

Nonetheless, there is a difference. On our system, int is 4 bytes. As discussed earlier, zippo[0] and *zippo point to a 4-byte data object. Adding 1 to either should produce a value larger by 4. The name zippo is the address of an array of two ints, so it identifies an 8-byte data object. Therefore, adding 1 to zippo should produce an address 8 bytes larger. Let's modify the program, as shown in Listing 10.14, to check that.

Listing 10.14 The `zippo2.c` program.

 /* zippo2.c -- more zippo info */
 #include <stdio.h>
 int main(void)
 {
     int zippo[4][2];

     printf("zippo = %p, zippo[0] = %p, &zippo[0][0] = %p\n",
             zippo, zippo[0], &zippo[0][0]);
     printf("*zippo = %p\n", *zippo);
     printf("zippo + 1 = %p, zippo[0] + 1 = %p\n",
             zippo + 1, zippo[0] + 1);
     printf("&zippo[0][0] + 1 = %p, *zippo + 1 = %p\n",
             &zippo[0][0] + 1, *zippo + 1);
     printf("*(zippo + 1) = %p\n", *(zippo + 1));
     return 0;
 }

Here is the new output:

 zippo = 0064FDD8, zippo[0] = 0064FDD8, &zippo[0][0] = 0064FDD8
 *zippo = 0064FDD8
 zippo + 1 = 0064FDE0, zippo[0] + 1 = 0064FDDC
 &zippo[0][0] + 1 = 0064FDDC, *zippo + 1 = 0064FDDC
 *(zippo + 1) = 0064FDE0

The results are as we predicted. Adding 1 to zippo moves you from one two-int array to the next array, and adding 1 to zippo[0] moves you from one int to the next (see Figure 10.5). Also, note what happens if you add 2 to zippo[0] (or, equivalently, to *zippo). You would go past the end of the first two-int array to the beginning of the next. The last two examples illustrate that zippo[0], &zippo[0][0], and *zippo are but three different notations for the same thing. All are addresses of the same int.

Figure 10.5. An array of arrays.

Note the difference between *zippo + 1 and *(zippo + 1). The former applies the * first and then adds, but the latter adds and then dereferences. In the first case, because *zippo is the address of an int, adding 1 increases the value by 4 bytes on our system. In the second case, because zippo is the address of a two-int object, 8 is added. This means that zippo + 1 is the address of the second two-int array element, and applying the * operator yields the address of the first element of that array. Therefore, *(zippo + 1) is the address of an int. In particular, it's the address of the element zippo[1][0]. The expression *zippo + 1 is also the address of an int, but it's the address of zippo[0][1]. In short, dereferencing, then adding, moves the address along a row (changes the second index); adding, then dereferencing, moves the address along a column (changes the first index).

Another point to note is that each element of zippo is an array and hence the address of a first element. Therefore, you have these relationships:

 zippo[0] == &zippo[0][0] == *zippo
 zippo[1] == &zippo[1][0] == *(zippo + 1)
 zippo[2] == &zippo[2][0] == *(zippo + 2)
 zippo[3] == &zippo[3][0] == *(zippo + 3)

Applying the * operator to each gives the following results:

 *zippo[0] == zippo[0][0] == **zippo
 *zippo[1] == zippo[1][0] == *(*(zippo + 1))
 *zippo[2] == zippo[2][0] == *(*(zippo + 2))
 *zippo[3] == zippo[3][0] == *(*(zippo + 3))

More generally, you can represent individual elements by using array notation and pointer notation as follows:

 zippo[m][n] == *(*(zippo + m) + n)

The value m, being the index associated with zippo, is added to zippo. The value n, being the index associated with the subarray zippo[m], is added to zippo[m], which is *(zippo + m) in pointer notation. This makes *(zippo + m) + n the address of element zippo[m][n], and applying the * operator yields the contents at that address.

Now suppose you want to declare a pointer variable pz that is compatible with zippo. Such a pointer could be used, for example, in writing a function to deal with zippo-like arrays. Will the type pointer-to-int suffice? No. That type is compatible with zippo[0], which points to a single int, but you want pz to point to an array of ints. Here is what you can do:

 int (* pz)[2];

This statement says that pz is a pointer to an array of two ints. Why the parentheses? Well, [] has a higher precedence than *. Therefore, with a declaration like

 int * pax[2];

you apply the brackets first, making pax an array of two somethings. Next, you apply the *, making pax an array of two pointers. Finally, use the int, making pax an array of two pointers to int. This declaration creates two pointers, but the original version uses parentheses to apply the * first, creating one pointer to an array of two ints.

Functions and Multidimensional Arrays

You might be wondering if you really have to learn about pointers to pointers. If you want to write functions that process two-dimensional arrays, the answer is yes. Mainly, you need to understand pointers well enough to make the proper declarations for function arguments. In the function body itself, you can usually get by with array notation.

Suppose, then, you want to write a function to deal with two-dimensional arrays. You have several choices. You can use a function written for one-dimensional arrays on each subarray, or you can use the same function on the whole array, but treat the whole array as one-dimensional instead of two-dimensional, or you can write a function that explicitly deals with two-dimensional arrays. To illustrate these three approaches, let's take a small two-dimensional array and apply each of the approaches to double the magnitude of each element.

Applying a One-Dimensional Function to Subarrays

To keep things simple, we'll declare junk as a small array of arrays and initialize it. We'll write a function that takes an array address and an array size as arguments and doubles the indicated elements. We'll use a for loop to apply this function to each subarray of junk, and we'll print the array contents. Listing 10.15 shows the program.

Listing 10.15 The `dubarr1.c` program.

 /* dubarr1.c -- doubles array elements */
 #include <stdio.h>
 void dub(int ar[], int size);
 int main(void)
 {
      int junk[3][4] = {
             {2,4,5,8},
             {3,5,6,9},
             {12,10,8,6}
      };
      int i, j;

      for (i = 0; i < 3; i++)
         dub(junk[i], 4);        /* process one subarray at a time */
      for (i = 0; i < 3; i++)
      {
         for (j = 0; j < 4; j++)
            printf("%5d", junk[i][j]);
         putchar('\n');
      }
      return 0;
 }

 void dub(int ar[], int size)   /* or int * ar */
 {
      int i;

      for (i = 0; i < size; i++)
          ar[i] *= 2;
 }

The first for loop in main() uses dub() to process the subarrays junk[0], junk[1], and so on. This approach works because junk[0], junk[1], and so forth, are each one-dimensional arrays. Note that we pass dub() a size parameter of 4 because that is the number of elements in each subarray. Here is the output:

     4    8   10   16
     6   10   12   18
    24   20   16   12

Applying a One-Dimensional Function to a Two-Dimensional Array

In the preceding example, you looked at junk as being an array of three arrays of four ints. For most C implementations, you can also look at junk as being an array of 12 ints. Suppose, for instance, you pass dub() the value junk[0] as an argument. This act initializes the pointer ar in dub() to the address of junk[0][0]. As a result, ar[0] corresponds to junk[0][0], and ar[3] corresponds to junk[0][3]. What about ar[4]? It represents the element following junk[0][3], which is junk[1][0], the first element of the next subarray. In other words, you've finished one row and gone to the beginning of the next. The program in Listing 10.16 continues in this fashion to cover the whole array, with ar[11] representing junk[2][3].

Listing 10.16 The `dubarr2.c` program.

 /* dubarr2.c -- doubles array elements */
 #include <stdio.h>
 void dub(int ar[], int size);
 int main(void)
 {
      static int junk[3][4] = {
             {2,4,5,8},
             {3,5,6,9},
             {12,10,8,6}
      };
      int i, j;

      dub(junk[0], 3*4);         /* treat junk as 12 consecutive ints */
      for (i = 0; i < 3; i++)
      {
         for (j = 0; j < 4; j++)
            printf("%5d", junk[i][j]);
         putchar('\n');
      }
      return 0;
 }

 void dub(int ar[], int size)   /* or int * ar */
 {
      int i;

      for (i = 0; i < size; i++)
          ar[i] *= 2;
 }

In Listing 10.16, note that dub() is unchanged from Listing 10.15. We merely changed the limit to 3*4 (or 12) and used one call to dub() instead of three. (We wrote the limit as 3*4 to emphasize that it is the total number of elements, calculated by multiplying the number of rows by the number of columns.) Does it work? Here is the output:

     4    8   10   16
     6   10   12   18
    24   20   16   12

Because junk has the same numerical value as junk[0], could you use junk instead of junk[0] as the argument to dub()? K&R implementations let you, but ANSI C implementations point out that junk clashes with the prototype, which says the argument should be a pointer-to-int, not a pointer to an array of four ints.

Is this a good approach? Not really, because it treats the junk array as though it were one-dimensional, and that's conceptually wrong. However, it does illustrate that a two-dimensional array is implemented internally as one big array.

Applying a Two-Dimensional Function

Both of the approaches described so far lose track of the column-and-row information. In this application (doubling each element), that information is unimportant, but suppose each row represented a year and each column a month. Then you might want a function to, say, total up individual columns. In that case, the function should have the row and column information available. This can be accomplished by declaring the right kind of formal variable so that the function can process the array properly. In this case, the array junk is an array of three arrays of four ints. As the earlier discussion implied, that means junk is a pointer to an array of four ints, and a function parameter of this type can be declared in this way:

 int (* pj)[4]

Alternatively, if pj is a formal argument to a function, you can declare it this way:

 int pj[][4]

Note that the first set of brackets is empty. The empty brackets identify pj as being a pointer. Such a variable can then be used in the same way as junk. That is what we have done in the next example, shown in Listing 10.17.

Listing 10.17 The `dubarr3.c` program.

 /* dubarr3.c -- doubles array elements */
 #include <stdio.h>
 void dub2(int ar[][4], int size);
 int main(void)
 {
      static int junk[3][4] = {
             {2,4,5,8},
             {3,5,6,9},
             {12,10,8,6}
      };
      int i, j;

      dub2(junk, 3);             /* pass the array and its row count */
      for (i = 0; i < 3; i++)
      {
         for (j = 0; j < 4; j++)
            printf("%5d", junk[i][j]);
         putchar('\n');
      }
      return 0;
 }

 void dub2(int ar[][4], int size)   /* or int (*ar)[4] */
 {
      int i, j;

      for (i = 0; i < size; i++)
           for (j = 0; j < 4; j++)
             ar[i][j] *= 2;
 }

The program in Listing 10.17 passes as arguments junk, which is a pointer to the first array, and 3, the number of rows. The dub2() function then treats ar as an array of arrays of four ints. The number of columns is built into the function, but the number of rows is left open. The same function will work with, say, a 12-by-4 array if 12 is passed as the number of rows. That's because size is the number of elements; however, because each element is an array, or row, size becomes the number of rows.

Note that ar is used in the same fashion as junk is used in main(). This is possible because ar and junk are the same type: pointer to array-of-four-ints.

Here is the output:

     4    8   10   16
     6   10   12   18
    24   20   16   12

Be aware that the following declaration will not work properly:

 int ar[][]; /* faulty declaration */

Recall that the compiler converts array notation to pointer notation. This means, for example, that ar[1] will become *(ar + 1). For the compiler to evaluate this, it needs to know what size of object ar points to. The declaration

 int ar[][4];

says that ar points to an array of four ints, hence to an object 16 bytes long on our system, so ar+1 means "add 16 bytes to the address." With the empty-bracket version, the compiler would not know what to do.

You can also include a size in the other bracket pair, as shown here, but it is ignored:

 void dub2(int ar[3][4], int n);    /* the 3 is ignored */

In general, to declare a pointer corresponding to an N-dimensional array, you must supply values for all but the leftmost set of brackets.
