The General Utilities Library

I l @ ve RuBoard

The general utilities library contains a grab bag of functions, including a random number generator, searching and sorting functions, conversion functions, and memory management functions. Under ANSI C, prototypes for these functions exist in the stdlib.h header file. Appendix F lists all the functions in this family; we'll take a closer look at a few of them now.

The `exit()` and `atexit()` Functions

We've already used exit() explicitly in several examples. In addition, the exit() function is invoked automatically upon return from main() . The ANSI standard has added a couple of nice features that we haven't used yet. The most important addition is that you can specify particular functions to be called when exit() executes. The atexit() function provides this feature by registering the functions to be called on exit; the atexit() function takes a function pointer as its argument. Listing 16.11 shows how this works.

Listing 16.11 The `byebye.c` program.

 /* byebye.c -- atexit() example */ #include <stdio.h> #include <stdlib.h> void sign_off(void); void too_bad(void); int main(void) {     int n;     atexit(sign_off);    /* register the sign_off() function */     puts("Enter an integer:");     if (scanf("%d",&n) != 1)     {         puts("That's no integer!");         atexit(too_bad); /* register the too_bad()  function */         exit(EXIT_FAILURE);     }     printf("%d is %s.\n", n,  (n % 2 == 0)? "even" : "odd");     return 0; } void sign_off(void) {     puts("Thus terminates another magnificent program from");     puts("SeeSaw Software!"); } void too_bad(void) {     puts("SeeSaw Software extends its heartfelt condolences");     puts("to you upon the failure of your program."); }

Here's one sample run:

 Enter an integer:  212  212 is even. Thus terminates another magnificent program from SeeSaw Software!

Here's a second run:

 Enter an integer:  what?  That's no integer! SeeSaw Software extends its heartfelt condolences to you upon the failure of your program. Thus terminates another magnificent program from SeeSaw Software!

Let's look at two main areas: the use of the atexit() and exit() arguments.

Using `atexit()`

Here's a function that uses function pointers! To use the atexit() function, simply pass it the address of the function you want called on exit. Because the name of a function acts as an address when used as a function argument, you use sign_off or too_bad as the argument. Then atexit() registers that function in a list of functions to be executed when exit() is called. ANSI guarantees that you can place at least 32 functions on the list. Each function is added with a separate call to atexit() . When the exit() function is finally called, it executes these functions, with the last function added being executed first.

Notice that both sign_off() and too_bad() were called when input failed, but only sign_off() was called when input worked. That's because the if statement registers too_bad() only if input fails. Also note that the last function registered was the first called.

The functions registered by atexit() should, like sign_off() and too_bad() , be type void functions taking no arguments. Typically, they would perform housekeeping tasks , such as updating a program-monitoring file or resetting environmental variables .

Note that sign_off() is called even when exit() is not called explicitly; that's because exit() is called implicitly when main() terminates.

Using `exit()`

After exit() executes the functions specified by atexit() , it does some tidying of its own. It flushes all output streams, closes all open streams, and closes temporary files created by calls to the standard I/O function tmpfile () . Then exit() returns control to the host environment and, if possible, reports a termination status to the environment. Traditionally, UNIX programs have used 0 to indicate successful termination and non-zero to report failure. UNIX return codes don't necessarily work with all systems, so ANSI C defined a macro called EXIT_FAILURE that can be used portably to indicate failure. Similarly, it defined EXIT_SUCCESS to indicate success, but exit() also accepts 0 for that purpose. Under ANSI C, using the exit() function in a nonrecursive main() function is equivalent to using the keyword return . However, exit() also terminates programs when used in functions other than main() .

Memory Allocation: `malloc()` and `free ()`

All programs have to set aside enough memory to store the data they use. Some of this memory allocation is done automatically. For example, you can declare

 char place[] = "Pork Liver Creek";

and enough memory to store that string is set aside, or you can be more explicit and ask for a certain amount of memory:

 int plates[100];

This declaration sets aside 100 memory locations, each fit to store an int value.

C goes beyond this. You can allot more memory as a program runs. The main tool is the malloc() function, which takes one argument: the number of bytes of memory you want. Then malloc() finds a suitable block of free memory and returns the address of the first byte of that block. Because char represents a byte, malloc() has traditionally been defined as type pointer-to- char . The ANSI C standard, however, uses a new type: pointer-to- void . This type is intended to be a "generic pointer." The malloc() function can be used to return pointers to arrays, structures, and so forth, so normally the return value is type-cast to the proper value. Under ANSI C, you should still typecast for clarity, but assigning a pointer-to- void value to a pointer of another type is not considered a type clash . If malloc() fails to find the required space, it returns the null pointer.

Let's apply malloc() to the task of creating an array. You can use malloc() to request a block of storage. You also need a pointer to keep track of where the block is in memory. For example, consider this code:

 double * ptd; ptd = (double *) malloc(30 * sizeof(double));

This code requests space for 30 type double values and sets ptd to point to the location. Note that ptd is declared as a pointer to a single double and not to a block of 30 double values. Remember that the name of an array is the address of its first element. Therefore, if you make ptd point to the first element of the block, you can use it just like an array name. That is, you can use the expression ptd[0] to access the first element of the block, ptd[1] to access the second element, and so on. As you've learned earlier, you can use pointer notation with array names , and you can use array notation with pointers.

You now have two ways to create an array:

Declare an array and use the array name to access elements.
Declare a pointer, call malloc() , and use the pointer to access elements.

You can use the second method to do something you can't do with a declared array ”create a dynamic array , one that's allocated while the program runs and that you can choose a size for while the program runs. Suppose, for example, that n is an integer variable. You can't do the following:

 double item[n];    /* not allowed if n is a variable */

However, you can do the following:

 ptd = (double *) malloc(n * sizeof(double)); /* okay */

By using malloc() , then, a program can decide what size array is needed and create it while the program runs. Listing 16.12 illustrates this possibility. It assigns the address of block of memory to the pointer ptd , then it uses ptd in the same fashion you would use an array name.

Listing 16.12 The `dyn_arr.c` program.

 /* dyn_arr.c -- dynamically allocated array */ #include <stdio.h> #include <stdlib.h> int main(void) {      double * ptd;      int max;      int number;      int i = 0;      puts("What is the maximum number of type double entries?");      scanf("%d", &max);      ptd = (double *) malloc(max * sizeof (double));      if (ptd == NULL)      {           puts("Memory allocation failed. Goodbye.");           exit(EXIT_FAILURE);      }      /* ptd now points to an array of max elements */      puts("Enter the values (q to quit):");      while (i < max && scanf("%lf", &ptd[i]) == 1)           ++i;      printf("Here are your %d entries:\n", number = i);      for (i = 0; i < number; i++)      {           printf("%7.2f ", ptd[i]);           if (i % 7 == 6)                putchar('\n');      }      if (i % 7 != 0)           putchar('\n');      puts("Done.");      free(ptd);      return 0; }

Here's a sample run. In it, we entered six numbers , but the program processes just five of them because we limited the array size to 5.

 What is the maximum number of entries?  5  Enter the values (q to quit):  20 30 35 25 40 80  Here are your 5 entries:   20.00   30.00   35.00   25.00   40.00 Done.

Let's look at the code. The program finds the desired array size with the following lines:

 puts("What is the maximum number of type double entries?");"" scanf("%d", &max);

Next, the following line allocates enough space to hold the requested number of entries and then assigns the address of the block to the pointer ptd :

 ptd = (double *) malloc(max * sizeof (double));

It's possible that malloc() can fail to procure the desired amount of memory. In that case, the function returns the null pointer, and the program terminates.

 if (ptd == NULL) {      puts("Memory allocation failed. Goodbye.");      exit(EXIT_FAILURE); }

If the program clears this hurdle , then it can treat ptd as though it were the name of an array of max elements, and so it does.

Note the free() function near the end of the program. It frees memory allocated by malloc() . Think of malloc() and free() as managing a pool of memory. Each call to malloc() allocates memory for program use, and each call to free() restores memory to the pool so it can be reused. The free() function frees only the block of memory that its argument points to. The argument to free() should be a pointer to a block of memory allocated by malloc() ; you can't use free() to free memory allocated by other means, such as declaring a structure or array. In this particular example, using free() isn't really necessary, for any allocated memory automatically is freed when the program terminates. In a more complex program, however, the ability to free and reuse memory can be important.

What have you gained by using a dynamic array? Primarily, you've gained program flexibility. Suppose you know that most of the time the program will need no more than 100 elements, but that sometimes it would need 10,000 elements. If you declared an array, you would have to allow for the worst case and declare an array with 10,000 elements. Most of the time, that program would be wasting memory. Then, the one time you need 10,001 elements, the program will fail. You can use a dynamic array to adjust the program to fit the circumstances.

The `calloc ()` Function

Another option for memory allotment is to use calloc() . A typical use looks like this:

 long * newmem; newmem = (long *)calloc(100, sizeof (long));

Like malloc() , calloc() returns a pointer-to- char in its pre-ANSI version and a pointer-to- void under ANSI. You should use the cast operator if you want to store a different type. This new function takes two arguments, both of which should be unsigned integers (type size_t under ANSI). The first argument is the number of memory cells you want. The second argument is the size of each cell in bytes. In our case, long uses 4 bytes, so this instruction sets up 100 4-byte units, using 400 bytes in all for storage.

Using sizeof (long) instead of 4 makes this coding more portable. It will work on those rare systems where long is some size other than 4.

The calloc() function throws in one more feature: It sets all the bits in the block to zero. (Note, however, that on some hardware systems, a floating-point value of is not represented by all bits set to 0.)

The free() function can also be used to free memory allocated by calloc() .

Dynamic memory allocation is the key to many advanced programming techniques. We'll examine some in Chapter 17, "Advanced Data Representation." Your own C library probably offers several other memory-management functions, some portable, some not. You might want to take a moment to look them over.

Storage Classes and Dynamic Memory Allocation

You might be wondering about the connection between storage classes and dynamic memory allocation. Let's look at an idealized model. You can think of a program as dividing its available memory into three separate sections: one for external, static, and static external variables; one for automatic variables; and one for dynamically allocated memory.

The amount of memory needed for the external, static, and external static storage classes is known at compile time, and the data stored in this section is available as long as the program runs. Each variable of these classes comes into being when the program starts and expires when the program ends.

An automatic variable, however, comes into existence when a program enters the block of code containing the variable's definition and expires when its block of code is exited. Therefore, as a program calls functions and as functions terminate, the amount of memory used by automatic variables grows and shrinks. This section of memory is typically handled as a stack. That means new variables are added sequentially in memory as they are created and then are removed in the opposite order as they pass away.

Dynamically allocated memory comes into existence when malloc() or a related function is called, and is freed when free() is called. Memory persistence is controlled by the programmer, not by a set of rigid rules, so a memory block can be created in one function and disposed of in another function. Because of this, the section of memory used for dynamic memory allocation can wind up fragmented , that is, having unused chunks interspersed among active blocks of memory. Using dynamic memory tends to be a slower process than using stack memory, however.

The `qsort ()` Function

The "quick sort" method is one of the most effective sorting algorithms, particularly for larger arrays. Developed by C. A. R. Hoare in 1962, it uses a recursive approach to partition arrays into ever smaller sizes until the element level is reached. First, the array is divided into two parts , with every value in one partition being less than every value in the other partition. This process continues until the array is fully sorted. The quick sort algorithm is an example of the divide-and-conquer form of recursion mentioned in Chapter 9, "Functions."

The name for the C implementation of the quick sort algorithm is qsort() . The qsort() function sorts an array of data objects. It has the following ANSI prototype:

 void qsort (void *base, size_t nmemb, size_t size,         int (*compar)(const void *, const void *));

The first argument is a pointer to the beginning of the array to be sorted. ANSI C permits any data pointer type to be typecast to a pointer-to- void , thus permitting the first actual argument to qsort() to refer to any kind of array.

The second argument is the number of items to be sorted. The prototype converts this value to type size_t . As you might recall from the discussion of fread() and fwrite() in Chapter 12, "File Input/Output," size_t is the integer type returned by the sizeof operator and is defined in the standard header files.

Because qsort() converts its first argument to a void pointer, qsort() loses track of how big each array element is. To compensate, you must tell qsort() explicitly the size of the data object. That's what the third argument is for. For instance, if you are sorting an array of type double , you would use sizeof(double) for this argument.

Finally, qsort() requires a pointer to the function to be used to determine the sorting order. The comparison function should take two arguments: pointers to the two items being compared. It should return a positive integer if the first item should follow the second value, zero if the two items are the same, and a negative integer if the second item should follow the first. The qsort() will use this function, passing it pointer values that it calculates from the other information given to it.

The form the comparison function must take is set forth in the qsort() prototype for the final argument:

 int (*compar)(const void *, const void *)

This states that the final argument is a pointer to a function that returns an int and that takes two arguments, each of which is a pointer to type const void . These two pointers point to the items being compared.

Listing 16.13 and the discussion following it illustrate how to define a comparison function and how to use qsort() . The program creates an array of random floating-point values and sorts the array.

Listing 16.13 The `qsorter.c` program.

 /* qsorter.c -- using qsort to sort groups of numbers */ #include <stdio.h> #include <stdlib.h> #define NUM 40 void fillarray(double ar[], int n); void showarray(const double ar[], int n); int mycomp(const void * p1, const void * p2); int main(void) {     double vals[NUM];     fillarray(vals, NUM);     puts("Random list:");     showarray(vals, NUM);     qsort(vals, NUM, sizeof(double), mycomp);     puts("\nSorted list:");     showarray(vals, NUM);     return 0; } void fillarray(double ar[], int n) {     int index;     for( index = 0; index < n; index++)         ar[index] = (double)rand()/((double) rand() + 0.1); } void showarray(const double ar[], int n) {     int index;     for( index = 0; index < n; index++)     {         printf("%9.4f ", ar[index]);         if (index % 6 == 5)             putchar('\n');     }     if (index % 6 != 0)         putchar('\n'); } /* sort values by increasing */ int mycomp(const void * p1, const void * p2) {     /* need to use pointers to double to access values   */     const double * a1 = p1; /* a1 is proper pointer type */     const double * a2 = p2;     if (*a1 < *a2)         return -1;     else if (*a1 == *a2)         return 0;     else         return 1; }

Here is a sample run:

 Random list:    0.0022    0.2390    1.2191    0.3910    1.1021    0.2027    1.3836   20.2872    0.2508    0.8880    2.2180   25.5033    0.0236    0.9308    0.9911    0.2507    1.2802    0.0939    0.9760    1.7218    1.2055    1.0326    3.7892    1.9636    4.1137    0.9241    0.9971    1.5582    0.8955   35.3843    4.0580   12.0467    0.0096    1.0110    0.8506    1.1530    2.3614    1.5876    0.4825    6.8751 Sorted list:    0.0022    0.0096    0.0236    0.0939    0.2027    0.2390    0.2507    0.2508    0.3910    0.4825    0.8506    0.8880    0.8955    0.9241    0.9308    0.9760    0.9911    0.9971    1.0110    1.0326    1.1021    1.1530    1.2055    1.2191    1.2802    1.3836    1.5582    1.5876    1.7218    1.9636    2.2180    2.3614    3.7892    4.0580    4.1137    6.8751   12.0467   20.2872   25.5033   35.3843

Let's look at two main areas: the use of qsort() and the definition of mycomp() .

Using `qsort()`

The qsort() function sorts an array of data objects. It has the following ANSI prototype:

 void qsort (void *base, size_t nmemb, size_t size,         int (*compar)(const void *, const void *));

The first argument is a pointer to the beginning of the array to be sorted. In this program, the actual argument is vals , the name of an array of double , hence a pointer to the first element of the array. The ANSI prototype causes the vals argument to be typecast to type pointer-to- void . That's because ANSI C permits any data pointer type to be typecast to a pointer-to- void , thus permitting the first actual argument to qsort() to refer to any kind of array.

The second argument is the number of items to be sorted. In Listing 16.13, it is N , the number of array elements. The prototype converts this value to type size_t .

The third argument is the size of each element ” sizeof(double) , in this case.

The final argument is mycomp , the address of the function to be used for comparing elements.

Defining `mycomp()`

As mentioned before, the qsort() prototype mandates the form of the comparison function:

 int (*compar)(const void *, const void *)

This states that the final argument is a pointer to a function that returns an int and that takes two arguments, each of which is a pointer to type const void . We made the prototype for the mycomp() function agree with this prototype:

 int mycomp(const void * p1, const void * p2);

Remember that the name of the function is a pointer to the function when used as an argument, so mycomp matches the compar prototype.

The qsort() function passes the addresses of the two elements to be compared to the comparison function. In this program, then, p1 and p2 are assigned the addresses of two type double values to be compared. Note that the first argument to qsort() refers to the whole array, and the two arguments in the comparison function refer to two elements in the array. There is a problem. To compare the pointed-to values, you need to dereference a pointer. Because the values are type double , you need to dereference a pointer to type double . However, qsort() requires pointers to type void . The way to get around this problem is to declare pointers of the proper type inside the function and initialize them to the values passed as arguments:

 int mycomp(const void * p1, const void * p2) {     /* need to use pointers to double to access values   */     const double * a1 = p1; /* a1 is proper pointer type */     const double * a2 = p2;     if (*a1 < *a2)         return -1;     else if (*a1 == *a2)         return 0;     else         return 1; }

In short, qsort() and the comparison function use void pointers for generality. As a consequence, you have to tell qsort() explicitly how large each element of the array is, and within the definition of the comparison function, you have to convert its pointer arguments to pointers of the proper type for your application.

Let's look at one more example of a comparison function. Suppose you have these declarations:

 struct names {     char first[40];     char last[40]; }; struct names staff[100];

What should a call to qsort() look like? Following the model in Listing 16.13, a call could look like this:

 qsort(staff, 100, sizeof(struct names), comp);

Here, comp is the name of the comparison function. What should this function look like? Suppose you want to sort by last name, and then by first name. You could write the function this way:

 #include <string.h> int comp(const void * p1, const void * p2)   /* mandatory form */ {     const struct names *ps1 = p1; /* get right type of pointer */     const struct names *ps2 = p2;     int res;     res = strcmp(ps1->last, ps2->last);  /* compare last names */     if (res != 0)         return res;     else                          /* last names identical      */         return strcmp(ps1->first, ps2->first); }

This function uses the strcmp() function to do the comparison; its possible return values match the requirements for the comparison function. Note that you need a pointer to a structure to use the -> operator.

I l @ ve RuBoard

The General Utilities Library