A common programming task is to search through a list of data for a specific value. The list might be a list of names , phone numbers, part numbers , or any one of a thousand other possibilities. Quite often, the reason for the search is to get a piece of information that can be used to find additional information associated with the list. For example, a list of names might have an account number tied to it, which then opens the door into a database. Therefore, because list searching is often the front end for other aspects of a program, we want it to be as efficient as possible.
Volumes have been written on how to organize and structure data. For the moment, we'll examine only two very simple ways to search data: a sequential search and a binary search.
The simplest method for search a list of data is to start at the first object in the list and compare it to the object you're looking for. If they match, you're done. If they don't match, you look at the next object for a match. This process of starting at the beginning of the list and proceeding through the list on a one-by-one basis is called a sequential search. Clearly, if you get lucky using a sequential search, you could find the object on the first try. If you're unlucky, you might find it after looking at every object in the list.
On average, a sequential search will examine N /2 objects in a list of N objects before finding a match. Therefore, if you have 100 objects in the list, on average, you'll look at 50 items to find a match. Although this works, it isn't very efficient.
A binary search of a list of data is much more efficient. That's the good news. The bad news is that a binary search requires that the list being searched be in sorted order. To understand how a binary search works, suppose that your friend wants you to guess a number he has in mind that's between 1 and 100. What's the best way to find the number in the fewest guesses? We'll assume that your friend will tell you if your guess is too high, too low, or the correct number.
To illustrate , let's assume that your friend is thinking of the number 77. You should guess 50 on your first try. Your friend responds, "Too low." You now can eliminate half of the numbers in the list because the number must fall between 51 and 100. (We'll assume that your friend doesn't lie.)
With your next guess, you should again try to eliminate half the list, so you would guess 75. Your friend would again respond, "Too low." You now know the number falls between 76 and 100. Again picking the midpoint of the numbers that remain , you guess 88. This time, the response is, "Too high." You now know the number falls between 76 and 87.
Picking the next midpoint yields a guess of 81. The response is, "Too high." The value must fall between 76 and 80. Again selecting the midpoint, you try 78 and the response is, "Too high." Now you know the number is either 76 or 77. You try 77 and get the correct value.
This process of dividing the range in half with each guess is called a binary search. Notice how you were able to zero in on the number with only 6 guesses. If you had used a sequential search, you would have had to make 77 guesses to find the number. When the length of the list is in the thousands or millions of items, the performance gain for the binary search is significant.
Let's take a closer look at the binary search process. Given that our program already sorts the data, let's extend the program to perform a binary search.
We need to add a few more objects to our form, as shown in Figure 14.6.
Figure 14.6. Form and objects for the binary search program.
We added a new label named Label2 , a text box named txtSearch , and a button named btnFind . The other controls are the same as they were in Figure 14.3.
Visual Basic .NET provides a binary search method that you can use on data arrays. Listing 14.4 shows the code for the program.
Listing 14.4 Complete Code Using the BinarySearch() Method
Public Class frmRandomNumbers Inherits System.Windows.Forms.Form Dim Elements, Numbers() As Integer Region " Windows Form Designer generated code " Private Sub btnGen_Click(ByVal sender As System.Object, ByVal e As _ System.EventArgs) Handles btnGen.Click Dim i As Integer ReDim Numbers(Elements) ' Remember if Elements = 10, you actually get 11 (0 10) txtNumList.Text = "" For i = 0 To Elements Numbers(i) = CInt(Rnd() * 100) txtNumList.Text &= CStr(Numbers(i)) & vbCrLf Next i txtSearch.Visible = False Label2.Visible = False btnFind.Visible = False End Sub Private Sub btnExit_Click(ByVal sender As System.Object, ByVal e As _ System.EventArgs) Handles btnExit.Click Me.Dispose() End Sub Private Sub txtElements_Leave(ByVal sender As Object, ByVal e As _ System.EventArgs) Handles txtElements.Leave Elements = CInt(txtElements.Text) End Sub Private Sub btnSort_Click(ByVal sender As System.Object, ByVal e As _ System.EventArgs) Handles btnSort.Click Dim i As Integer Array.Sort(Numbers) txtNumList.Text = "" For i = 0 To Elements txtNumList.Text &= Format(i, "00") & " " & CStr(Numbers(i)) & vbCrLf Next txtSearch.Visible = True Label2.Visible = True btnFind.Visible = True End Sub Private Sub frmRandomNumbers_Load(ByVal sender As System.Object, _ ByVal e As System.EventArgs) Handles MyBase.Load txtSearch.Visible = False Label2.Visible = False btnFind.Visible = False End Sub Private Sub btnFind_Click(ByVal sender As System.Object, ByVal e As _ System.EventArgs) Handles btnFind.Click Dim val As Integer val = CInt(txtSearch.Text) txtSearch.Text &= " = " & CStr(Array.BinarySearch(Numbers, val)) End Sub End Class
There actually isn't much new in the program. In the frmRandomNumbers_Load event, we make the label and text box for the binary search value invisible. We also set the Visibility property for btnFind to False . These minor details simply prevent the user from trying to find a value in the list until after the list of sorted values is calculated and displayed.
The real work in the btnFind Click event is done with only two statements:
val = CInt(txtSearch.Text) txtSearch.Text &= " = " & CStr(Array.BinarySearch(Numbers, val))
The txtSearch text box accepts the user input for the value to find. The second statement then calls the BinarySearch() method of the Array class to find the value entered by the user. The first parameter to the method is the array to search ( Numbers ) and the second parameter is the value to find ( val ).
If BinarySearch() can find a match on val in Numbers() , it returns the index for the match. If the value occurs in the array multiple times, the index returned is for the first appearance in the array. If the value isn't found in the list, a negative value is returned. A sample run is shown in Figure 14.7.
Figure 14.7. Sample run for BinarySearch() array method.
If you look closely at Figure 14.7, you'll notice that there are two values of 71 in the array. In the Search for: text box, you'll see that the BinarySearch() method returned the index for the first 71 it found in the list.
Try running the program yourself to confirm the behavior of the BinarySearch() method. Also try some values that aren't present in the list. What happens in those cases?
Sometimes the array you want to search isn't in sorted order, and there's no reasonable means by which to sort it. That rules out BinarySearch() . Indeed, it may take more time to sort the array than it would just to do a sequential search of it. In such a situation, the IndexOf() method provides a viable alternative to the binary search.
The syntax for the IndexOf() method is
Element = Array.IndexOf( ArrayToSearch, ValueToFind )
The first parameter, ArrayToSearch , is the array that you want to search. The second parameter, ValueToFind , is the value in the array that you want to locate. If the value is in the array, the call returns the index of the element holding the value ( Element ). If the value cannot be found in the array, a negative value for Element is returned.
You might try changing the code presented in Listing 14.4 to use the IndexOf() method instead of BinarySearch() . I leave this as an exercise for the reader.