The programs of this chapter, and quicksort from the previous chapter, are typical of implementations of divide-and-conquer algorithms. We shall see several algorithms with similar structure in later chapters, so it is worthwhile to take a more detailed look at basic characteristics of these implementations.
Quicksort might perhaps more properly be called a conquer-and-divide algorithm: In a recursive implementation, most of the work for a particular activation is done before the recursive calls. On the other hand, the recursive mergesort has more the spirit of divide and conquer: First, the file is divided into two parts; then, each part is conquered individually. The first problem for which mergesort does processing is a small one; at the finish, the largest subfile is processed. Quicksort starts with processing on the largest subfile and finishes up with the small ones. It is amusing to contrast the algorithms in the context of the management analogy mentioned at the beginning of this chapter: quicksort corresponds to each manager investing effort to make the right decision on how to divide up the task, so the job is complete when the subtasks are done, whereas mergesort corresponds to each manager making a quick arbitrary choice to divide the task in half, then needing to work to cope with the consequences after the subtasks are done.
This difference is manifest in the nonrecursive implementations. Quicksort must maintain a stack, because it has to save large subproblems that are divided up in a data-dependent manner. Mergesort admits a simple nonrecursive version because the way in which it divides the file is independent of the data, so we can rearrange the order in which it processes subproblems to give a simpler program.
We might argue that quicksort is more naturally thought of as a top-down algorithm, because it does work at the top of the recursion tree, then proceeds down to finish the sort. We could contemplate a nonrecursive quicksort that traverses the recursion tree in level order from top to bottom. Thus, a sort makes multiple passes through the array, partitioning files into smaller subfiles. For arrays, this method is not practical, because of the bookkeeping cost of keeping track of the subfiles; for linked lists, it is analogous to bottom-up mergesort.
We have noted that mergesort and quicksort differ on the issue of stability. For mergesort, if we assume that the subfiles have been sorted stably, then we need be sure only that the merge is done in a stable manner, which is easy to arrange. The recursive structure of the algorithm leads immediately to an inductive proof of stability. For an array-based implementation of quicksort, no easy way of doing the partitioning in a stable manner suggests itself, so the possibility of stability is foreclosed even before the recursion comes into play. The straightforward implementation of quicksort for linked lists is, however, stable (see Exercise 7.4).
As we saw in Chapter 5, algorithms with one recursive call reduce to a loop, but algorithms with two recursive calls, like mergesort and quicksort, open up the world of divide-and-conquer algorithms and tree structures, where many of our best algorithms are found. Mergesort and quicksort are worthy of careful study, not just because of their practical importance as sorting algorithms but also because of the insights they give into the nature of recursion, which can serve us well in developing and understanding other recursive algorithms.
8.45 Suppose that mergesort is implemented to split the file at a random position, rather than exactly in the middle. How many comparisons are used by such a method to sort N elements, on the average?
8.46 Study the performance of mergesort when it is sorting strings. How many character comparisons are involved when a large file is sorted, on the average?
8.47 Run empirical studies to compare the performance of quicksort for linked lists (see Exercise 7.4) and top-down mergesort for linked lists (Program 8.7).
Top |