I l @ ve RuBoard |
The binary search algorithm is fairly simple. You want to see whether a given number is in an ordered list. Check your number against the one in the middle of the list. If it is the number, you were lucky ”stop. If your number is bigger, you might find it in the top half of the list. Try the middle of the top half. If it is smaller, try the bottom half. Keep trying and dividing the list in half until you find the number or the list gets down to a single number. 17.5.1 The First Bug, a Segmentation FaultExample 17-2 uses a binary search to see whether a number can be found in the file numbers .dat . Example 17-2. search/search0.cpp1: /******************************************************** 2: * search -- Search a set of numbers. * 3: * * 4: * Usage: * 5: * search * 6: * you will be asked numbers to lookup * 7: * * 8: * Files: * 9: * numbers.dat -- numbers 1 per line to search * 10: * (Numbers must be ordered) * 11: ********************************************************/ 12: #include <iostream> 13: #include <fstream> 14: #include <cstdlib> 15: #include <cstdio> 16: #include <assert.h> 17: 18: const int MAX_NUMBERS = 1000; // Max numbers in file 19: const char *const DATA_FILE = "numbers.dat";// File with numbers 20: 21: int data[MAX_NUMBERS]; // Array of numbers to search 22: int max_count; // Number of valid elements in data 23: 24: int main( ) 25: { 26: std::ifstream in_file; // Input file 27: int middle; // Middle of our search range 28: int low, high; // Upper/lower bound 29: int search; // number to search for 30: 31: in_file.open(DATA_FILE, std::ios::in); 32: if (in_file.bad( )) { 33: std::cerr << "Error:Unable to open " << DATA_FILE << '\n'; 34: exit (8); 35: } 36: 37: /* 38: * Read in data 39: */ 40: 41: max_count = 0; 42: while (true) { 43: char line[30]; // Line from the input file 44: 45: if (in_file.eof( )) 46: break; 47: 48: in_file.getline(line, sizeof(line)); 49: 50: assert(max_count >= 0); 51: assert(max_count < sizeof(data)/sizeof(data[0])); 52: std::sscanf(line, "0", data[max_count]); 53: if (data[max_count] == -1) 54: break; 55: 56: ++max_count; 57: } 58: 59: while (true) { 60: std::cout << "Enter number to search for or -1 to quit:" ; 61: std::cin >> search; 62: 63: if (search == -1) 64: break; 65: 66: low = 0; 67: high = max_count; 68: 69: while (true) { 70: middle = (low + high) / 2; 71: 72: assert(middle >= 0); 73: assert(middle < sizeof(data)/sizeof(data[0])); 74: if (data[middle] == search) { 75: std::cout << "Found at index " << middle << '\n'; 76: } 77: 78: if (low == high) { 79: std::cout << "Not found\n"; 80: break; 81: } 82: 83: assert(middle >= 0); 84: assert(middle < sizeof(data)/sizeof(data[0])); 85: if (data[middle] < search) 86: low = middle; 87: else 88: high = middle; 89: } 90: } 91: return (0); 92: } Here's our data file: File: numbers.dat 4 6 14 16 17 -1 When we run this program in Unix, the results are: % search Segmentation fault (core dumped) When we run this program under Microsoft Windows, we get an application error (if we're lucky). Either way this is not good. It means something went wrong in our program and the program tried to read memory that wasn't there. The debugger GDB can read this file and help us determine what happened : % gdb search GDB is free software and you are welcome to distribute copies of it under certain conditions; type "show copying" to see the conditions. There is absolutely no warranty for GDB; type "show warranty" for details. GDB 4.12 (m68k-sun-sunos4.0.3), Copyright 1994 Free Software Foundation, Inc... (gdb) run Starting program: /usr/sdo/search/search Program received signal SIGSEGV, Segmentation fault. 0xec46320 in number ( ) (gdb) The debugger tells us we have been killed by a segmentation fault generated from the procedure number . But we don't have a procedure number ! The routine must belong to the C++ library. We now use the where command to find out which function called which function. This report is called a stack trace . (gdb) where #0 0xec46320 in number ( ) #1 0xec45cc2 in _doscan ( ) #2 0xec45b34 in sscanf ( ) #3 0x2400 in main ( ) at search.cpp:52 (gdb) The current function is printed first, then the function that called it, and so on until we reach the outer function main . From this we see that number was called by _doscan , which was called by sscanf . We recognize sscanf as a library routine. The other functions must be subroutines called by sscanf . The last function that had control was the call of sscanf, which was made from line 52 of main . Now we use the list command to take a look at the source for this line: (gdb) list 52 45 if (in_file.eof( )) 46 break; 47 48 in_file.getline(line, sizeof(line)); 49 50 assert(max_count >= 0); 51 assert(max_count < sizeof(data)/sizeof(data[0])); 52 sscanf(line, "%d", data[max_count]); 53 if (data[max_count] == -1) 54 break; 55 56 ++max_count; (gdb) quit The program is running. Quit anyway (and kill it)? (y or n) Y Line 52 caused the problem. Another way of finding the problem is to single-step through the program until the error occurs. First list a section of the program to find a convenient place to put the breakpoint, and then start the execution and single-step process: Script started on Mon Oct 31 10:07:19 1994 % gdb search GDB is free software and you are welcome to distribute copies of it under certain conditions; type "show copying" to see the conditions. There is absolutely no warranty for GDB; type "show warranty" for details. GDB 4.12 (m68k-sun-sunos4.0.3), Copyright 1994 Free Software Foundation, Inc... (gdb) list main 20 const char *const DATA_FILE = "numbers.dat"; // File with nums 21 22 int data[MAX_NUMBERS]; // Array of numbers to search 23 int max_count; // Number of valid elements in data 24 int main( ) 25 { 26 std::ifstream in_file; // Input file 27 int middle; // Middle of our search range 28 int low, high; // Upper/lower bound 29 int search; // Number to search for (gdb) break main Breakpoint 1 at 0x2318: file search.cpp, line 25. (gdb) run Starting program: /usr/sdo/search/search Breakpoint 1, main ( ) at search.cpp:25 26 ifstream in_file; // Input file (gdb) step 31 in_file.open(DATA_FILE, ios::in); (gdb) step 32 if (in_file.bad( )) { (gdb) step 41 max_count = 0; (gdb) step 45 if (in_file.eof( )) (gdb) step 48 in_file.getline(line, sizeof(line)); (gdb) step 50 assert(max_count >= 0); (gdb) step 51 assert(max_count < sizeof(data)/sizeof(data[0])); (gdb) step 52 sscanf(line, "%d", data[max_count]); (gdb) step Program received signal SIGSEGV, Segmentation fault. 0xec46320 in number ( ) (gdb) quit The program is running. Quit anyway (and kill it)? (y or n) y This method, too, points at line 52 as the culprit. On inspection we notice that we forgot to put an ampersand ( & ) in front of the third parameter for std:: sscanf . So we change line 52 from: std::sscanf(line, "%d", data[max_count]); to: std::sscanf(line, "%d", &data[max_count]); and try again.
17.5.2 The Unintended Infinite LoopRather than fix the std::scanf call, we've decided to enter the 21st century and use C++ style I/O calls. The first number in our list is 4, so we try it. This time our output looks like: Enter number to search for or -1 to quit: 4 Found at index 0 Found at index 0 Not found Enter number to search for or -1 to quit: ^C The program should find the number, let us know it's at index 0, and then ask for another number. Instead we get two found messages and one not found message. We know that everything is running smoothly up to the time we get the first found message. After that things go downhill. Getting back into the debugger, we use the list command to locate the found message and put a breakpoint there: % gdb search GDB is free software and you are welcome to distribute copies of it under certain conditions; type "show copying" to see the conditions. There is absolutely no warranty for GDB; type "show warranty" for details. GDB 4.12 (m68k-sun-sunos4.0.3), Copyright 1994 Free Software Foundation, Inc... (gdb) list 69,81 69 while (true) { 70 middle = (low + high) / 2; 71 72 assert(middle >= 0); 73 assert(middle <sizeof(data)/sizeof(data[0])); 74 if (data[middle] == search) { 75 std::cout << "Found at index " << middle << '\n'; 76 } 77 78 if (low == high) { 79 std::cout << "Not found\n"; 80 break; 81 } (gdb) break 75 Breakpoint 1 at 0x249e: file search.cpp, line 71. (gdb) run Starting program: /usr/sdo/search/search Enter number to search for or -1 to quit: 4 Breakpoint 1, main ( ) at search.cpp:71 75 std::cout << "Found at index " << middle << '\n'; (gdb) step Found at index 0 78 if (low == high) { (gdb) step 83 assert(middle >= 0); (gdb) step 84 assert(middle <sizeof(data)/sizeof(data[0])); (gdb) step 85 if (data[middle] < search) (gdb) step 88 high = middle; (gdb) step 70 middle = (low + high) / 2; (gdb) step 72 assert(middle >= 0); (gdb) step 73 assert(middle <sizeof(data)/sizeof(data[0])); (gdb) step 74 if (data[middle] == search) { (gdb) step 75 std::cout << "Found at index " << middle << '\n'; (gdb) step Found at index 0 78 if (low == high) { (gdb) quit The program is running. Quit anyway (and kill it)? (y or n) y The program doesn't exit the loop. Instead it continues with the search. Because the number has already been found, this search results in strange behavior. We are missing a break after the cout . We need to change: if (data[middle] == search) { std::cout << "Found at index " << middle << '\n'; } to: if (data[middle] == search) { std::cout << "Found at index " << middle << '\n'; break; } Making this fix, we try the program again: % search Enter number to search for or -1 to quit: 4 Found at index 0 Enter number to search for or -1 to quit: 6 Found at index 1 Enter number to search for or -1 to quit: 3 Not found Enter number to search for or -1 to quit: 5 program runs forever (or until we abort it) We have a runaway program. This time, instead of setting a breakpoint, we just start running the program. After a few seconds pass and we believe that we are stuck in the infinite loop, we stop the program with a control-C ( ^C ). Normally this would abort the program and return us to the shell prompt. Since we are running with the debugger, it returns control to GDB: % gdb search GDB is free software and you are welcome to distribute copies of it under certain conditions; type "show copying" to see the conditions. There is absolutely no warranty for GDB; type "show warranty" for details. GDB 4.12 (m68k-sun-sunos4.0.3), Copyright 1994 Free Software Foundation, Inc... (gdb) run Starting program: /usr/sdo/search/search Enter number to search for or -1 to quit: 5 ^C Program received signal SIGINT, Interrupt. 0x2500 in main ( ) at search.cpp:79 79 if (data[middle] < search) Now we can use the single-step command to step through the infinite loop, looking at key values along the way. 87 if (data[middle] < search) (gdb) print middle = 0 (gdb) print data[middle] = 4 (gdb) print search = 5 (gdb) step 88 low = middle; (gdb) step 71 middle = (low + high) / 2; (gdb) step 73 assert(middle >= 0); (gdb) step 74 assert(middle <sizeof(data)/sizeof(data[0])); (gdb) step 75 if (data[middle] == search) { (gdb) step 80 if (low == high) { (gdb) step 85 assert(middle >= 0); (gdb) step 86 assert(middle <sizeof(data)/sizeof(data[0])); (gdb) step 87 if (data[middle] < search) (gdb) step 88 low = middle; (gdb) step 71 middle = (low + high) / 2; (gdb) step 73 assert(middle >= 0); (gdb) step 74 assert(middle <sizeof(data)/sizeof(data[0])); (gdb) step 75 if (data[middle] == search) { (gdb) step 80 if (low == high) { (gdb) step 85 assert(middle >= 0); (gdb) step 86 assert(middle <sizeof(data)/sizeof(data[0])); (gdb) step 87 if (data[middle] < search) (gdb) step 88 low = middle; (gdb) step 71 middle = (low + high) / 2; (gdb) step 73 assert(middle >= 0); (gdb) step 74 assert(middle <sizeof(data)/sizeof(data[0])); (gdb) step 75 if (data[middle] == search) { (gdb) step 80 if (low == high) { (gdb) step 85 assert(middle >= 0); (gdb) step 86 assert(middle <sizeof(data)/sizeof(data[0])); (gdb) step 87 if (data[middle] < search) (gdb) step 88 low = middle; (gdb) step 71 middle = (low + high) / 2; (gdb) step 73 assert(middle >= 0); (gdb) step 74 assert(middle <sizeof(data)/sizeof(data[0])); (gdb) step 75 if (data[middle] == search) { (gdb) print low = 0 (gdb) print middle = 0 (gdb) print high = 1 (gdb) print search = 5 (gdb) print data[0] = 4 (gdb) print data[1] = 6 (gdb) quit The program is running. Quit anyway (and kill it)? (y or n) y The problem is that we have reached a point where the following is true: low = 0 middle = 0 high = 1 The item we are searching for falls exactly between elements 0 and 1. Our algorithm has an off-by-one error. Obviously the middle element does not match. If it did, we'd exit with a found at message. So there is no point including the middle element in our new search range. Our code to adjust the interval is: if (data[middle] < search) low = middle; else high = middle; It should be: if (data[middle] < search) low = middle + 1; else high = middle - 1; The full version of the corrected program is shown in Example 17-3. Example 17-3. search/search4.cpp1: /******************************************************** 2: * search -- Search a set of numbers. * 3: * * 4: * Usage: * 5: * search * 6: * you will be asked numbers to lookup * 7: * * 8: * Files: * 9: * numbers.dat -- numbers 1 per line to search * 10: * (Numbers must be ordered) * 11: ********************************************************/ 12: #include <iostream> 13: #include <fstream> 14: #include <cstdlib> 15: #include <cstdio> 16: #include <assert.h> 17: 18: const int MAX_NUMBERS = 1000; // Max numbers in file 19: const char *const DATA_FILE = "numbers.dat";// File with numbers 20: 21: int data[MAX_NUMBERS]; // Array of numbers to search 22: int max_count; // Number of valid elements in data 23: int main( ) 24: { 25: std::ifstream in_file; // Input file 26: int middle; // Middle of our search range 27: int low, high; // Upper/lower bound 28: int search; // number to search for 29: 30: in_file.open(DATA_FILE, std::ios::in); 31: if (in_file.bad( )) { 32: std::cerr << "Error:Unable to open " << DATA_FILE << '\n'; 33: exit (8); 34: } 35: 36: /* 37: * Read in data 38: */ 39: 40: max_count = 0; 41: 42: while (true) { 43: char line[30]; // Line from the input file 44: 45: if (in_file.eof( )) 46: break; 47: 48: in_file.getline(line, sizeof(line)); 49: 50: assert(max_count >= 0); 51: assert(max_count < sizeof(data)/sizeof(data[0])); 52: std::sscanf(line, "0", &data[max_count]); 53: if (data[max_count] == -1) 54: break; 55: 56: ++max_count; 57: } 58: 59: while (true) { 60: std::cout << "Enter number to search for or -1 to quit:" ; 61: std::cin >> search; 62: 63: if (search == -1) 64: break; 65: 66: low = 0; 67: high = max_count; 68: 69: while (true) { 70: if (low >= high) { 71: std::cout << "Not found\n"; 72: break; 73: } 74: middle = (low + high) / 2; 75: 76: assert(middle >= 0); 77: assert(middle < sizeof(data)/sizeof(data[0])); 78: if (data[middle] == search) { 79: std::cout << "Found at index " << middle << '\n'; 80: break; 81: } 82: 83: assert(middle >= 0); 84: assert(middle < sizeof(data)/sizeof(data[0])); 85: if (data[middle] < search) 86: low = middle +1; 87: else 88: high = middle -1; 89: } 90: } 91: return (0); 92: } |
I l @ ve RuBoard |