7.4. Sorting and SearchingA sort is an algorithm for ordering an array. Of the many different techniques for sorting an array we discuss two, the bubble sort and the Shell sort. Both require the interchange of values stored in a pair of variables. If varl, var2, and temp are all variables of the same data type (such as all String), then the statements temp = varl varl = var2 var2 = temp assign var1's value to var2 and var2's value to var1. Example 1.
Bubble SortThe bubble sort is an algorithm that compares adjacent items and swaps those that are out of order. If this process is repeated enough times, the list will be ordered. Let's carry out this process on the list Pebbles, Barney, Wilma, Fred, and Dino. The steps for each pass through the list are as follows:
The first time through the list, this process is repeated to the end of the list. This is called the first pass. After the first pass, the last item (Wilma) will be in its proper position. Therefore, the second pass does not have to consider it and so requires one less comparison. At the end of the second pass, the last two items will be in their proper position. (The items that must have reached their proper position have been underlined.) Each successive pass requires one less comparison. After four passes, the last four items will be in their proper positions, and hence, the first will be also.
Example 2. |
The following program alphabetizes the names Pebbles, Barney, Wilma, Fred, and Dino. Sorting the list requires a pair of nested loops. The inner loop performs a single pass, and the outer loop controls the number of passes. Dim person() As String = {"Pebbles", "Barney", "Wilma", "Fred", "Dino"} Private Sub frmFlintstones_Load(...) Handles MyBase.Load 'Display unsorted list For i As Integer = 0 To 4 lstPeople.Items.Add(person(i)) Next End Sub Private Sub btnSort_Click(...) Handles btnSort.Click 'Bubble sort names Dim temp As String [Run, and click the button.]
|
Table 7.5 contains facts about the 10 most populous states with listings in ascending order by state abbreviation. The following program sorts the table in descending order by population. Data are read from a file into an array of structures by the form's Load event procedure. When btnDisplayStats is clicked, the array is sorted based on the population field.
[Run, and click the button.]
|
The bubble sort is easy to understand and program. However, it is too slow for really long lists. The Shell sort, named for its inventor, Donald L. Shell, is much more efficient in such cases. It compares distant items first and works its way down to nearby items. The interval separating the compared items is called the gap. The gap begins at one-half the length of the list and is successively halved until eventually each item is compared with its neighbor as in the bubble sort. The algorithm for a list of n + 1 items referred to as item 0 through item I is as follows.
Begin with a gap of g = Int([n + 14]/2).
Compare items 0 and g, items 1 and 2 +g, 2 and 2+g,..., n-g and n. Swap any pairs that are out of order.
Repeat Step 2 until no swaps are made for gap g.
Halve the value of g.
Repeat Steps 2, 3, and 4 until the value of g is 0.
The Shell sort is illustrated in what follows, in which crossing arrows indicate that a swap occurred:
Initial Gap = Int([Number of Items]/2) = Int(5/2) = 2
Because there was a swap, use the same gap for the second pass.
Again, there was a swap, so keep the current gap.
There were no swaps for the current gap of 2, so
Next Gap = Int([Previous Gap]/2) = Int(2/2) = 1.
Because there was a swap (actually two swaps), keep the same gap.
Because there were no swaps for the current gap, then
Next Gap = Int([Previous Gap]/2) = Int(1/2) = 0,
and the Shell sort is complete.
Notice that the Shell sort required 17 comparisons to sort the list, whereas the bubble sort required only 10 comparisons for the same list. This illustrates the fact that for very short lists, the bubble sort is preferable; however, for lists of 30 items or more, the Shell sort will consistently outperform the bubble sort. Table 7.6 shows the average number of comparisons required to sort arrays of varying sizes.
Array Elements | Bubble Sort Comparisons | Shell Sort Comparisons |
---|---|---|
5 | 10 | 15 |
10 | 45 | 57 |
15 | 105 | 115 |
20 | 190 | 192 |
25 | 300 | 302 |
30 | 435 | 364 |
50 | 1225 | 926 |
100 | 4950 | 2638 |
500 | 124,750 | 22,517 |
1000 | 499,500 | 58,460 |
The following program uses the Shell sort to alphabetize the parts of a running shoe. (See Figure 7.5.) The data are read into an array whose size is large enough to guarantee sufficient space. In the form's Load event procedure, the variable numParts provides the subscripts for the array and serves as a counter. The final value of numParts is available to all procedures because the variable was created in the Declarations section of the Code window. The Sub procedure SortData uses a flag to indicate if a swap has been made during a pass. Figure 7.5. Running shoe.
[Run, and click the button.]
|
Suppose we had an array of 1000 names in alphabetical order and wanted to locate a specific person in the list. One approach would be to start with the first name and consider each name until a match was found. This process is called a sequential search. We would find a person whose name begins with "A" rather quickly, but 1000 comparisons might be necessary to find a person whose name begins with "Z." For much longer lists, searching could be a time-consuming matter. However, there is a method, called a binary search, that shortens the task considerably.
Let us refer to the sought item as quarry. The binary search looks for quarry by determining in which half of the list it lies. The other half is then discarded, and the retained half is temporarily regarded as the entire list. The process is repeated until the item is found. A flag can indicate if quarry has been found.
The algorithm for a binary search of an ascending list is as follows (Figure 7.6 shows the flowchart for a binary search):
At each stage, denote the subscript of the first item in the retained list by first and the subscript of the last item by last. Initially, the value of first is 0, the value of last is one less than the number of items in the list, and the value of flag is False.
Look at the middle item of the current list, the item having the subscript middle = Int((first + last)/2).
If the middle item is quarry, then flag is set to True and the search is over.
If the middle item is greater than quarry, then quarry should be in the first half of the list. So the subscript of quarry must lie between first and middle -1. That is, the new value of last is middle -1.
If the middle item is less than quarry, then quarry should be in the second half of the list of possible items. So the subscript of quarry must lie between middle +1 and last. That is, the new value of first is middle +1.
Repeat Steps 2 through 5 until quarry is found or until the halving process uses up the entire list. (When the entire list has been used up, first > last.) In the second case, quarry was not in the original list.
In the following program the array firm() contains the alphabetized names of up to 100 corporations. The program requests the name of a corporation as input and uses a binary search to determine whether the corporation is in the array. Dim firm(99) As String Dim numFirms As Integer Private Sub frmCompanies_Load(...) Handles MyBase .Load 'Fill array with data from FIRMS.TXT, which contains 'the names of up to 100 companies Dim sr As IO.StreamReader = IO.File.OpenText("FIRMS.TXT") numFirms = 0 Do While (sr.Peek <> -1) And (numFirms < firm.GetUpperBound(0)) firm(numFirms) = sr.ReadLine numFirms += 1 Loop sr.Close() End Sub Private Sub btnSearch_Click(...) Handles btnSearch.Click Dim corp, result As String corp = txtCorporation.Text.ToUpper result = "" BinarySearch(corp, result) 'Display results of search txtResult.Text = corp & " " & result End Sub Sub BinarySearch(ByVal corp As String, ByRef result As String) 'Array firm() assumed already ordered alphabetically 'Binary search of firm() for corp Dim first, middle, last As Integer Dim foundFlag As Boolean = False first = 0 last = numFirms - 1 Do While (first <= last) And (Not foundFlag) middle = CInt(Int((first + last) / 2)) Select Case firm(middle).ToUpper Case corp foundFlag = True [Run, type "IBM" into the text box, and click the button.]
|
Suppose the array contains 100 corporations and the corporation input in Example 5 is in the second half of the array. On the first pass, middle would be assigned Int((0 + 99)/2) = Int(49.5) = 49 and then first would be altered to 49 + 1 = 50. On the second pass, middle would be assigned Int((50 + 99)/2) = Int(74.5) = 74. If the corporation is not the array element with subscript 74, then either last would be assigned 73 or first would be assigned 75, depending on whether the corporation appears before or after the 74th element. Each pass through the loop halves the range of subscripts containing the corporation until the corporation is located.
In Example 5, the binary search merely reported whether an array contained a certain item. After finding the item, its array subscript was not needed. However, if the search had been carried out on an array of structures (as in Table 7.5), the subscript of the found item could be used to retrieve the related information for the other members. This process, called a table lookup, is used in the following example.
The following program uses a binary search procedure to locate the data for a state from Example 3 requested by the user. The program does not include a sort of the text file STATES.TXT, because the file is already ordered alphabetically by city name. Structure District Dim name As String Dim income As Double Dim collegeGrad As Double Dim population As Double End Structure [Run, type "OH" into the masked text box, and click the button.]
|
Suppose that our bubble sort algorithm is applied to an ordered list. The algorithm will still make n - 1 passes through the list. The process could be shortened for some lists by flagging the presence of out-of-order items as in the Shell sort.
In Example 3, an array of structures already ordered by one member was sorted by another member. Usually, an array of structures is sorted by the member to be searched when accessing the member. This member is called the key member.
Suppose that an array of 2000 items is searched sequentiallythat is, one item after anotherin order to locate a specific item. The number of comparisons would vary from 1 to 2000, with an average of 1000. With a binary search, the number of comparisons would be at most 11, because 211 >2000.
The method ToUpper converts all the characters in a string to uppercase. ToUpper is useful in sorting and searching arrays of strings when the alphabetic case (upper or lower) is unimportant. For instance, Example 5 includes ToUpper in the Select Case comparisons, and so the binary search will locate "Mobil" in the array even if the user entered "MOBIL".
Visual Basic has a built-in sort routine that can be used to sort arrays. The statement
Array.Sort(arrayName)
sorts arrayName into ascending order. For instance, in Example 2 the event procedure can be replaced with the following code.
Private Sub btnSort_Click(...) Handles btnSort.Click Dim name() As String = {"Pebbles", "Barney", "Wilma", _ "Fred", "Dino"} Array.Sort(name) 'Display alphabetized list lstNames.Items.Clear()[Page 371] For i As Integer = 0 To 4 lstNames.Items.Add(name(i)) Next End Sub
Array.Sort cannot be used with arrays of structures and therefore is not adequate for our purposes.
1. | The pseudocode for a bubble sort of an array of n items follows. Why is the terminating value of the outer loop n-1 and the terminating value of the inner loop n-j? For j As Integer = 1 To n - 1 For k As Integer = 1 To n - j If [(k-1)st and kth items are out of order] Then [swap them] Next Next | |||||||||
2. | Complete the table that follows by filling in the values of each variable after successive passes of a binary search of an array of 20 items, where the sought item is in the position 13.
|
In Exercises 1 through 4, determine the output displayed in the list box when the button is clicked.
1. | Private Sub btnDisplay_Click(...) Handles btnDisplay.Click Dim p As Integer = 100 Dim q As Integer = 200 Dim temp As Integer = p p = q q = temp lstOutput.Items.Add(p) lstOutput.Items.Add(q) End Sub |
2. | Dim gag() As String = {"Stan"," Oliver"} Private Sub btnDisplay_Click(...) Handles btnDisplay.Click Dim temp As String If gag(1) < gag(0) Then temp = gag(1) |
3. | Private Sub btnDisplay_Click(...) Handles btnDisplay.Click Dim x, y, temp As Double Dim swappedFlag As Boolean = False x = 7 y = 11 If y > x Then temp = x x = y y = temp swappedF1ag = True End If lstOutput.Items.Add(x) lstOutput.Items.Add(y) If swappedFlag Then lstOutput.Items.Add("Numbers interchanged.") End If End Sub |
4. | Dim a(2) As Integer Private Sub btnDisplay_Click(...) Handles btnDisplay.Click Dim temp As Integer For j As Integer = 0 To 1 For k As Integer = 0 To 2 - j If a(k) > a(k + 1) Then temp = a(k) a(k) = a(k + 1) a(k + 1) = temp End If Next Next For j As Integer = 0 To 2 lstOutput.Items.Add(a(j)) Next End Sub Private Sub frmNumbers_Load(...) Handles MyBase.Load Dim sr As IO.StreamReader = IO.File.OpenText("DATA.TXT") For j As Integer = 0 To 2 a(j) = CInt(sr.ReadLine) Next sr.Close() End Sub (Assume that the three lines of the file DATA.TXT contain the following entries: 7, 4, 3.) |
In Exercises 5 and 6, identify the errors.
5. | Dim c(3), d(3) As Integer Private Sub btnDisplay_Click(...) Handles btnDisplay.Click 'Swap two items c(3) = d(3) d(3) = c(3) lstOutput.Items.Add(c(3)) lstOutput.Items.Add(d(3)) End Sub Private Sub frmNumbers_Load(...) Handles MyBase.Load Dim sr As IO.StreamReader = IO.File.OpenText("DATA.TXT") For i As Integer = 0 To 3 c(i) = CInt(sr.ReadLine) d(i) = CInt(sr.ReadLine) Next sr.Close() End Sub (Assume that the eight lines of the file DATA.TXT contain the following entries:1, 2, 3, 4, 5, 6, 7, 8.) |
6. | Dim a(2), b(2) As Integer Private Sub btnDisplay_Click(...) Handles btnDisplay.Click 'Swap the two arrays Dim temp(2) As Integer a = b b = a End Sub Private Sub frmNumbers_Load(...) Handles MyBase.Load Dim sr As IO.StreamReader = IO.File.OpenText("DATA.TXT") For i As Integer = 0 To 2 a(i) = CInt(sr.ReadLine) b(i) = CInt(sr.ReadLine) Next sr.Close() End Sub (Assume that the six lines of the file DATA.TXT contain the following entries:1, 3, 5, 7, 9, 11.) |
7. | Which type of search would be best for the following array?
|
8. | Which type of search would be best for the following array?
|
9. | Consider the items Tin Man, Dorothy, Scarecrow, and Lion in that order. After how many swaps in a bubble sort will the list be in alphabetical order? |
10. | How many comparisons will be made in a bubble sort of six items? |
11. | How many comparisons will be made in a bubble sort of n items? |
12. | Modify the program in Example 2 so that it will keep track of the number of swaps and comparisons and display these numbers before ending. |
13. | Rework Exercise 9 using the Shell sort. |
14. | How many comparisons would be made in a Shell sort of six items if the items were originally in descending order and were sorted in ascending order? |
15. | If a list of six items is already in the proper order, how many comparisons will be made by a Shell sort? |
16. | Suppose a list of 5000 numbers is to be sorted, but the numbers consist of only 1, 2, 3, and 4. Describe a method of sorting the list that would be much faster than either the bubble or Shell sort. |
17. | What is the maximum number of comparisons required to find an item in a sequential search of 16 items? What is the average number of comparisons? What is the maximum number of comparisons required to find an item in a binary search of 16 items? |
18. | Redo Exercise 17 with 2n items, where is any positive integer. |
In Exercises 19 through 26, write a program (or procedure) to complete the stated task.
19. | Exchange the values of the variables x, y, and z so that x has y's value, y has z's value, and z has x's value. | |||||||||||||||||||||||||||||||||||||||||
20. | Display the names of the seven dwarfs in alphabetical order. For the contents of a text file use Doc, Grumpy, Sleepy, Happy, Bashful, Sneezy, and Dopey. | |||||||||||||||||||||||||||||||||||||||||
21. | The nation's capital has long been a popular staging area for political, religious, and other large public rallies, protest marches, and demonstrations. The events in Table 7.7 have drawn the largest crowds, according to estimates from D.C., U.S. Park, or Capitol police. Read the data into an array of structures and display a similar table with the event names in alphabetical order. Note: The data is contained in the file EVENTS.TXT. | |||||||||||||||||||||||||||||||||||||||||
22. | Table 7.8 presents statistics on the five leading sneaker brands. Read the data into an array of structures, and display a similar table with wholesale revenue in descending order. | |||||||||||||||||||||||||||||||||||||||||
23. | Accept 10 words to be input in alphabetical order, and store them in an array. Then accept an 11th word as input, and store it in the array in its correct alphabetical position.
| |||||||||||||||||||||||||||||||||||||||||
24. | An airline has a list of 200 flight numbers (between 1 and 1000) in ascending order in the file FLIGHTS.TXT. Accept a number as input and do a binary search of the list to determine if the flight number is valid. | |||||||||||||||||||||||||||||||||||||||||
25. | Allow a number n to be input by the user. Then accept as input a list of numbers. Place the numbers into an array and apply a bubble sort. | |||||||||||||||||||||||||||||||||||||||||
26. | Rework Exercise 25, but use a flag to detect the presence of out-of-order items as in the Shell sort. | |||||||||||||||||||||||||||||||||||||||||
27. | Write a program that accepts a word as input and converts it into Morse code. The dots and dashes corresponding to each letter of the alphabet are as follows:
| |||||||||||||||||||||||||||||||||||||||||
28. | Write a program that accepts an American word as input and performs a binary search to translate it into its British equivalent. Use the following list of words for data, and account for the case when the word requested is not in the list:
| |||||||||||||||||||||||||||||||||||||||||
29. | Write a program that accepts a student's name and seven test scores as input and calculates the average score after dropping the two lowest scores. | |||||||||||||||||||||||||||||||||||||||||
30. | Suppose letter grades are assigned as follows:
Write a program that accepts a grade as input and displays the corresponding letter. Hint: This problem shows that when you search an array, you don't always look for equality. Set up an array of structures with the member range containing the values 97, 94, 90, 87, 84, ..., 0 and the member letter containing A+, A, A , B+,...,F. gNext, perform a sequential search to find the first structure such that the value of the range member is less than or equal to the input grade. | |||||||||||||||||||||||||||||||||||||||||
31. | The median of a set of n measurements is a number such that half the n measurements fall below the median, and half fall above. If the number of measurements n is odd, the median is the middle number when the measurements are arranged in ascending or descending order. If the number of measurements n is even, the median is the average of the two middle measurements when the measurements are arranged in ascending or descending order. Write a program that requests a number n and a set of n measurements as input and then displays the median. | |||||||||||||||||||||||||||||||||||||||||
32. | Write a program with two buttons labeled Ascending Order and Descending Order that displays the eight vegetables in V8® in either ascending or descending alphabetic order. The vegetables (tomato, carrot, celery, beet, parsley, lettuce, watercress, and spinach) should be stored in a class-level array. |
1. | The outer loop controls the number of passes, one less than the number of items in the list. The inner loop performs a single pass, and the jth pass consists of n j comparisons. | ||||||||||||||||||
2. |
|