Sorting Numbers

I l @ ve RuBoard

Sorting Numbers

One of the most common tasks for a computer is sorting, so we'll develop a program to sort integers. We'll take a black box approach and think in terms of input and output. The overall plan, shown in Figure 13.2, is pretty simple.

Figure 13.2. Sorting program: a black box view.
graphics/13fig02.jpg

At this point, the program is still too vaguely defined for us to begin writing code. The next step is to identify the main tasks that the program must do to accomplish our goals. We can break the example down to three such tasks:

  1. Read the numbers.

  2. Sort them.

  3. Print the sorted numbers.

Figure 13.3 shows this breakdown as we descend from the top level of organization to a more detailed level.

Figure 13.3. Sorting program: peking inside.
graphics/13fig03.jpg

Global Decisions

Before designing the individual modules, we need to make some global decisions. We need to choose a data form and decide what information will be passed to the individual modules.

Data Form

How do we represent a collection of numbers? We could use a collection of variables , one for each number, but that is just too much trouble to even think about. Using an array is an immensely superior approach.

What kind of an array? Type int ? Type double ? We need to know how the program is going to be used. Assuming it is to be used with modestly sized integers, we will use an array of ints to store the numbers we read.

Information Flow

The first module gathers input. It should know where to place the values it reads. It should also know the maximum number of values it can accept and report the actual number of items read. The sorting module should know which array to sort and how many elements are present. The printing module should know which array to use and how many elements to print. These considerations suggest the main() function shown in Listing 13.10.

Listing 13.10 The sort_int.c program.
 /* sort_int.c -- sorts integers */ #include <stdio.h> #define MAXSIZE  100    /* limit to number of integers to sort */ extern int getarray(int ar[], int n); extern void sort(int ar[], int n); extern void print(const int ar[], int n); int main(void) {   int numbers[MAXSIZE];              /* array to hold input    */   int size;                          /* number of input items  */   size = getarray(numbers, MAXSIZE); /* put input into array   */   printf("\nOriginal data (%d values):\n", size);   print(numbers, size);              /* print original array   */   sort(numbers, size);               /* sort the array         */   puts("Sorted data: ");   print(numbers, size);              /* print the sorted array */   return 0; } 

Here we have the skeleton of the program. The function getarray() places the values read into the array numbers and reports how many values were read. That value is assigned to size . The program reports the number of elements entered and displays the original array. Then sort() sorts the array, and print() prints the sorted results.

Now that we have refined our picture of the information flow, we should modify the black box sketch. As Figure 13.4 shows, we now have three black boxes, each with its own input and output. Because we are emphasizing modularity, we have broken the original problem into three smaller, more manageable pieces. We can assign each part to a different programming team, provided that the numbers output by "read 'em" are in the same form that "sort 'em" uses for input.

Figure 13.4. Sorting program: adding details.
graphics/13fig04.jpg

We will apply our efforts to each of the three boxes separately, breaking them down to simpler units until we reach a point where the code is obvious. As we do this, we must pay attention to these important points: data-form choice, error-trapping, and information flow. Let's continue with the example, tackling the reading portion first (see Figure 13.5).

Figure 13.5. Sorting program: the first task.
graphics/13fig05.jpg

Reading Numeric Data

Many programs read numbers, so the ideas we develop here will be useful elsewhere. The general form for this part of the program is clear: Use a loop to read in numbers until all the numbers are read. However, there is more to reading numbers than you might think!

Ending Input

How will the function know when to quit reading numbers? We faced a similar problem in Chapter 10. There, we had a read_array() function that quit reading numbers when the user either simulated EOF or entered a non-numeric value. This time we'll take a different tack. The main change is how we handle non-numeric input. We'll design the function so that it alerts the user when the input is non-numeric and then gives the user the option of trying again.

Why this change? Suppose you are entering data at the keyboard and accidentally enter 1o instead of 10 . Would you rather have the program terminate, forcing you to reenter all the data or have it just reject that one entry and let you continue? It is simpler to rely on the "perfect user theory," which states that the user makes no entry errors. However, we recognize that this theory might not apply to users other than ourselves . Therefore, we opt for the second choice. This checking input for suitability is termed data validation.

Here is one approach:

  1. Read a word.

  2. While not EOF .

  3. If the word is an integer, assign it to an array.

  4. Else skip that word and provide the option of entering a number again.

  5. Terminate if input is non-numeric.

  6. Otherwise , read the next word if the array isn't full.

Note that there are three separate conditions that bring this portion of the program to a close: an EOF signal, filling the array, or entering non-numeric data twice in succession.

The getarray() Function

Now let's look at getarray() , shown in Listing 13.11.

Listing 13.11 The getarray.c file.
 /* getarray.c -- reads in an array */ #include <stdio.h> #define NONUM 0 #define YESNUM 1 int getarray(int array[], int limit) {   int num, status;   int index = 0;     /* array index */   printf("This program stops reading numbers after %d ",          limit);   printf("values\nor if EOF is encountered. First value: ");   status = scanf("%d", &num);   while (index < limit && status != EOF)   {     if (status == YESNUM)     {       array[index++] = num;       printf("%d accepted. ", num);       if (index < limit)            /* if there's room, */       {                             /* get next value   */         printf("Next value: ");         status = scanf("%d", &num);       }     }     else if (status == NONUM)     {       scanf("%*s"); /* dispose of bad input    */       printf("That was no integer! Enter an integer to \n");       printf("continue or non-numeric input to quit: ");       if ((status = scanf("%d", &num)) == NONUM)         break;      /* quit loop if nonnumeric */     }     else     {       printf("Oops! Program should never reach here!\n");       break;     }   }   if (index == limit )    /* report if array gets filled */     printf("All %d elements of the array were filled.\n",         limit);   return(index); } 

This is not a simple function, so we have quite a few points to note.

Explanation

We set up getarray() to handle each of the possible return values for scanf() . Recall that scanf() returns the number of items successfully read. If it finds an int , it returns 1 , or YESNUM . If it finds a non-integer, it returns , or NONUM , and if it finds the end of file, it returns EOF .

An EOF return causes the while loop to end. A YESNUM return results in a valid entry being stored in the awaiting array and the index being incremented. Also, that number is echoed back to the user showing that it was accepted.

A NONUM return initiates a more complex chain of events. The program first disposes of the invalid input with this line:

 scanf("%*s"); /* dispose of bad input    */ 

Recall that the %*s specifier causes scanf() to skip over the next word. If status is NONUM , then the word is not an integer, so we want to skip it. Otherwise, scanf() will remain stuck on that word forever. Next, the program sends the user back for another try. If the user responds with another non-integer, the program uses break to exit the loop, terminating input. If the user responds with an integer or with end-of-file, the program continues to the loop test. The loop test will catch an EOF return; otherwise, the value is added to the array.

There is one more else statement. Logically, the only way this final else statement can be reached is if scanf() returns a value other than YESNUM , NONUM , or EOF , but those are the only values that can be returned, so it seems to be a useless statement. We included it as an example of defensive programming, the art of protecting a program from future fiddling. Someday, we, or someone else, might decide to rewrite getarray() so that, say, it reads two values at a time, making 2 a possible return value. Most likely, we will have forgotten by then ”and they might never have known ”that getarray() assumes there are just three possible responses. Therefore, to help future debugging, we include this final else to trap any new responses that show up.

We use the keyword return to communicate the number of items read, so the function call

 size = getarray(numbers, MAXSIZE); 

assigns a value to size and puts values into the numbers array.

In functions such as this one that involve counters and limits, the most likely place to find an error is at the boundary conditions, where counts reach their limits. Are you going to read a maximum of MAXSIZE numbers, or are you going to be off by 1? You need to pay attention to details such as ++index versus index++ and < versus <= . You also have to keep in mind that arrays start their subscripting with , not 1 . Take a moment to check our code and see whether it works as it should. The easiest way to do that is to imagine that limit is 1 and then walk through the procedure step by step.

Finally, the function informs the user when the array is full so that he or she is not taken by surprise.

Getting a program to interact in a convenient , dependable manner with the user is often the most difficult part of writing code. That is the case with this part of the program. You'll find sort() and print() easier. Let's move on to sort() now.

Sorting the Data

Let's look at main() again from Listing 13.10:

 /* sort_int.c -- sorts integers */ #include <stdio.h> #define MAXSIZE  100    /* limit to number of integers to sort */ extern int getarray(int ar[], int n); extern void sort(int ar[], int n); extern void print(const int ar[], int n); int main(void) {   int numbers[MAXSIZE];              /* array to hold input    */   int size;                          /* number of input items  */   size = getarray(numbers, MAXSIZE); /* put input into array   */   printf("\nOriginal data (%d values):\n", size);   print(numbers, size);              /* print original array   */   sort(numbers, size);               /* sort the array         */   puts("Sorted data: ");   print(numbers, size);              /* print the sorted array */   return 0; } 

You can see that the two arguments to sort() are an array of integers to be sorted and a count of the number of elements to be sorted. The function sorts the numbers but has no return value. We still haven't decided how to do the sorting, so we need to refine this description further.

One obvious decision is the direction of the sort. Are we going to sort from largest to smallest or vice versa? Again, we'll be arbitrary and say that we'll sort from largest to smallest (see Figure 13.6). (We could design a function to do either, but then we would have to develop a way to tell the function which choice we want.)

Figure 13.6. Sorting function: the second task.
graphics/13fig06.jpg

Now let's consider the method we will use to sort. Many sorting algorithms have been developed for computers, but we'll use one of the simplest ”the selection sort.

Here is our plan in pseudocode:

 for n = first to n = next-to-last element,      find largest remaining number and place it in the nth element 

The plan works like this. First, start with n = 1 . Scan the entire array, find the largest number, and swap it with the first element. Next, set n = 2 , and scan all but the first element of the array. Find the largest remaining number, and swap it with the second element. Continue this process until reaching the next-to-last element. Now only two elements are left. Compare them and place the larger in the next-to-last position. This leaves the smallest element of all in the final position.

It looks like a for loop task, but we still have to describe the "find and place" process in more detail. Here is one way to select the largest remaining value. Compare the first and second elements of the remaining array. If the second is larger, swap the two values. Now compare the first element with the third. If the third is larger, swap those two. Each swap moves a larger element to the top. Continue this way until you have compared the first with the last element. When you finish, the largest number is now in the first element of the remaining array. You have sorted the array for the first element, but the rest of the array is in a jumble . Here is the procedure in pseudocode:

 for n - second element to last element,   compare nth element with first element; if nth is greater, swap values 

This process looks like another for loop. It will be nested in the first for loop. The outer loop indicates which array element is to be filled, and the inner loop finds the value to put there. Putting the two parts of the pseudocode together and translating them into C, we get the function in Listing 13.12.

Listing 13.12 The sort.c file.
 /* sort.c -- sorts an integer array in decreasing order */ void sort(int array[], int limit) {    int top, search, temp;    for (top = 0; top < limit -1; top++)        for (search = top + 1; search < limit; search++)             if (array[search] > array[top])             {                  temp = array[search];                  array[search] = array[top];                  array[top] = temp;             } } 

In Listing 13.12, we were clever enough to remember that the first element has for a subscript. Also, we used the swapping technique discussed in Chapter 9. We used top as the subscript for the array element that is to be filled because it is at the top of the unsorted part of the array. The search index roams over the array below the current top element. Now we just have print() left to write.

Printing the Data

The last task is to write the function that will print the sorted numbers (see Figure 13.7). This function, as shown in Listing 13.13, is pretty simple. Because the function does not modify the array, it declares array as const .

Figure 13.7. Printing results: the third task.
graphics/13fig07.jpg
Listing 13.13 The print.c file.
 /* print.c -- prints an array */ #include <stdio.h> void print(const int array[], int limit) {    int index;    for (index = 0; index < limit; index++)    {       printf("%d ", array[index]);       if (index % 10 == 9)          putchar('\n');    }    if (index % 10 != 0)       putchar('\n'); } 

Listing 13.13 uses the modulus operator to control printing newlines:

 if (index % 10 == 9)    putchar('\n'); 

The condition is true when index is 9, 19, 29, and so on. Because array indexing begins with 0, this results in displaying a newline after every tenth element. The function also displays a new line at the end unless the very last thing printed inside the loop was a new line:

 if (index % 10 != 0)       putchar('\n'); 

Suppose a newline is the last thing printed in the loop before it terminates; then the value of the modulus is 9 . However, the for loop update condition increments index , so the value of the modulus is 10 after the loop terminates.

If you want something a little different, such as printing in one column or using fixed-width fields, you can always change this function, leaving the other functions untouched. Similarly, if you found a sorting algorithm that you liked better, you could replace the sort() module. That is one of the advantages of a modular program.

Results

Let's compile and test this package. That is, compile Listing 13.10, Listing 13.11, Listing 13.12, and Listing 13.13 together as a single program or project. To make checking the boundary conditions simpler, we'll temporarily change MAXSIZE to 5 . For the first test, we will feed numbers to the program until it refuses to take more:

 This program stops reading numbers after 5 values or if EOF is encountered. First value: 78 78 accepted. Next value: 85 85 accepted. Next value: ninety That was no integer! Enter an integer to continue or non-numeric input to quit: 90 90 accepted. Next value: 88 88 accepted. Next value: 95 95 accepted. All 5 elements of the array were filled. Original data (5 values): 78 85 90 88 95 Sorted data: 95 90 88 85 78 

Good, it stopped when five numbers were read, and it sorted the results. Now we'll restore the limit to 100 values, recompile, and test to see whether it stops when the end of file is encountered.

 This program stops reading numbers after 100 values or if EOF is encountered. First value: 87 87 accepted. Next value: 88 88 accepted. Next value: 67 67 accepted. Next value: 93 93 accepted. Next value: <Control-Z> Original data (4 values): 87 88 67 93 Sorted data: 93 88 87 67 

Faster than you can say "oikology is the science of housekeeping," the whole enormous array is sorted. Recall that Ctrl+D transmits EOF on UNIX systems. Finally, let's see if consecutive non-numeric responses halt input:

 This program stops reading numbers after 100 values or if EOF is encountered. First value: 56 56 accepted. Next value: 68 68 accepted. Next value: 74 74 accepted. Next value: 54 54 accepted. Next value: no That was no integer! Enter an integer to continue or non-numeric input to quit: q Original data (4 values): 56 68 74 54 Sorted data: 74 68 56 54 

Success wasn't easy, but it was possible. By breaking the problem into smaller parts and by thinking about what information should flow into and out of each part, we reduced the problem to manageable proportions . Furthermore, the individual modules we produced could be used as parts of similar programs.

Comments

One major advantage of the modular design is that it makes it simpler to tinker with the program. For instance, consider the sorting module. After thinking about it, you might decide that you can improve the efficiency by changing the algorithm slightly. As the program moves through the inner loop, instead of immediately promoting each new claimant to the top ranking to the top, just keep track of the index of the claimant. Then, when the inner loop finishes, just swap the current top claimant with the current top position. Or you might want to modify the printing module to print six numbers to a line. With a modular approach, you can fix just that module and leave the rest of the program unchanged. If each function has its own file, you need only recompile the altered one, and then link it to the compiled versions of the other files. Integrated environments do this automatically when you have a project file. That is, they keep track of which files need to be recompiled and which have been unaltered.

I l @ ve RuBoard


C++ Primer Plus
C Primer Plus (5th Edition)
ISBN: 0672326965
EAN: 2147483647
Year: 2000
Pages: 314
Authors: Stephen Prata

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net