17.5 Debugging a Binary Search

I l @ ve RuBoard

The binary search algorithm is fairly simple. You want to see whether a given number is in an ordered list. Check your number against the one in the middle of the list. If it is the number, you were lucky ”stop. If your number is bigger, you might find it in the top half of the list. Try the middle of the top half. If it is smaller, try the bottom half. Keep trying and dividing the list in half until you find the number or the list gets down to a single number.

17.5.1 The First Bug, a Segmentation Fault

Example 17-2 uses a binary search to see whether a number can be found in the file numbers .dat .

Example 17-2. search/search0.cpp
 1: /********************************************************  2:  * search -- Search a set of numbers.                   *  3:  *                                                      *  4:  * Usage:                                               *  5:  *      search                                          *  6:  *              you will be asked numbers to lookup     *  7:  *                                                      *  8:  * Files:                                               *  9:  *      numbers.dat -- numbers 1 per line to search     * 10:  *                      (Numbers must be ordered)       * 11:  ********************************************************/ 12: #include <iostream> 13: #include <fstream> 14: #include <cstdlib> 15: #include <cstdio> 16: #include <assert.h> 17:  18: const int MAX_NUMBERS = 1000;   // Max numbers in file  19: const char *const DATA_FILE = "numbers.dat";// File with numbers  20:  21: int data[MAX_NUMBERS];  // Array of numbers to search  22: int max_count;          // Number of valid elements in data  23:  24: int main(  ) 25: { 26:     std::ifstream in_file;      // Input file  27:     int middle;         // Middle of our search range  28:     int low, high;      // Upper/lower bound  29:     int search;         // number to search for  30:  31:     in_file.open(DATA_FILE, std::ios::in); 32:     if (in_file.bad(  )) { 33:         std::cerr << "Error:Unable to open " << DATA_FILE << '\n'; 34:         exit (8); 35:     } 36:  37:     /* 38:      * Read in data  39:      */ 40:  41:     max_count = 0; 42:     while (true) { 43:         char line[30];  // Line from the input file 44:  45:         if (in_file.eof(  )) 46:             break; 47:  48:         in_file.getline(line, sizeof(line)); 49:  50:         assert(max_count >= 0); 51:         assert(max_count < sizeof(data)/sizeof(data[0])); 52:         std::sscanf(line, "0", data[max_count]); 53:         if (data[max_count] == -1) 54:             break; 55:  56:         ++max_count; 57:     } 58:  59:     while (true) { 60:         std::cout << "Enter number to search for or -1 to quit:" ; 61:         std::cin >> search; 62:  63:         if (search == -1) 64:             break; 65:  66:         low = 0; 67:         high = max_count; 68:  69:         while (true) { 70:             middle = (low + high) / 2; 71:  72:             assert(middle >= 0); 73:             assert(middle < sizeof(data)/sizeof(data[0])); 74:             if (data[middle] == search) { 75:                 std::cout << "Found at index " << middle << '\n'; 76:             } 77:  78:             if (low == high) { 79:                 std::cout << "Not found\n"; 80:                 break; 81:             } 82:  83:             assert(middle >= 0); 84:             assert(middle < sizeof(data)/sizeof(data[0])); 85:             if (data[middle] < search) 86:                 low = middle; 87:             else 88:                 high = middle; 89:         } 90:    } 91:    return (0); 92: } 

Here's our data file:

File: numbers.dat

 4  6  14  16  17  -1 

When we run this program in Unix, the results are:

 %  search  Segmentation fault (core dumped) 

When we run this program under Microsoft Windows, we get an application error (if we're lucky).

Either way this is not good. It means something went wrong in our program and the program tried to read memory that wasn't there. The debugger GDB can read this file and help us determine what happened :

 %  gdb search  GDB is free software and you are welcome to distribute copies of it under certain conditions; type "show copying" to see the conditions. There is absolutely no warranty for GDB; type "show warranty" for details. GDB 4.12 (m68k-sun-sunos4.0.3),  Copyright 1994 Free Software Foundation, Inc... (gdb)  run  Starting program: /usr/sdo/search/search  Program received signal SIGSEGV, Segmentation fault. 0xec46320 in number (  ) (gdb) 

The debugger tells us we have been killed by a segmentation fault generated from the procedure number . But we don't have a procedure number ! The routine must belong to the C++ library.

We now use the where command to find out which function called which function. This report is called a stack trace .

 (gdb)  where  #0  0xec46320 in number (  ) #1  0xec45cc2 in _doscan (  ) #2  0xec45b34 in sscanf (  ) #3  0x2400 in main (  ) at search.cpp:52 (gdb) 

The current function is printed first, then the function that called it, and so on until we reach the outer function main . From this we see that number was called by _doscan , which was called by sscanf . We recognize sscanf as a library routine. The other functions must be subroutines called by sscanf . The last function that had control was the call of sscanf, which was made from line 52 of main .

Now we use the list command to take a look at the source for this line:

 (gdb)  list 52  45              if (in_file.eof(  )) 46                  break; 47       48              in_file.getline(line, sizeof(line)); 49       50              assert(max_count >= 0); 51              assert(max_count < sizeof(data)/sizeof(data[0])); 52              sscanf(line, "%d", data[max_count]); 53              if (data[max_count] == -1) 54                  break; 55       56              ++max_count; (gdb)  quit  The program is running.  Quit anyway (and kill it)? (y or n)  Y  

Line 52 caused the problem.

Another way of finding the problem is to single-step through the program until the error occurs. First list a section of the program to find a convenient place to put the breakpoint, and then start the execution and single-step process:

 Script started on Mon Oct 31 10:07:19 1994 %  gdb search  GDB is free software and you are welcome to distribute copies of it under certain conditions; type "show copying" to see the conditions. There is absolutely no warranty for GDB; type "show warranty" for details. GDB 4.12 (m68k-sun-sunos4.0.3),  Copyright 1994 Free Software Foundation, Inc... (gdb)  list   main  20      const char *const DATA_FILE = "numbers.dat"; // File with nums 21       22      int data[MAX_NUMBERS];  // Array of numbers to search  23      int max_count;          // Number of valid elements in data  24      int main(  ) 25      { 26          std::ifstream in_file;   // Input file  27          int middle;         // Middle of our search range  28          int low, high;      // Upper/lower bound  29          int search;         // Number to search for  (gdb)  break main  Breakpoint 1 at 0x2318: file search.cpp, line 25. (gdb)  run  Starting program: /usr/sdo/search/search Breakpoint 1, main (  ) at search.cpp:25 26          ifstream in_file;   // Input file  (gdb)  step  31          in_file.open(DATA_FILE, ios::in); (gdb)  step  32          if (in_file.bad(  )) { (gdb)  step  41          max_count = 0; (gdb)  step  45              if (in_file.eof(  )) (gdb)  step  48              in_file.getline(line, sizeof(line)); (gdb)  step  50              assert(max_count >= 0); (gdb)  step  51              assert(max_count < sizeof(data)/sizeof(data[0])); (gdb)  step  52              sscanf(line, "%d", data[max_count]); (gdb)  step  Program received signal SIGSEGV, Segmentation fault. 0xec46320 in number (  ) (gdb)  quit  The program is running.  Quit anyway (and kill it)? (y or n)  y  

This method, too, points at line 52 as the culprit. On inspection we notice that we forgot to put an ampersand ( & ) in front of the third parameter for std:: sscanf . So we change line 52 from:

 std::sscanf(line, "%d", data[max_count]); 

to:

 std::sscanf(line, "%d", &data[max_count]); 

and try again.

You might wonder why we use the function sscanf when the line:

 in_file >> data[max_count]; 

performs the same function.

The answer is simple. We used sscanf to cause problems. Without the pointer error, we would have nothing to debug. The in_file statement is more reliable, and reliable code has no place in a chapter on debugging.

17.5.2 The Unintended Infinite Loop

Rather than fix the std::scanf call, we've decided to enter the 21st century and use C++ style I/O calls. The first number in our list is 4, so we try it. This time our output looks like:

 Enter number to search for or -1 to quit:  4  Found at index 0  Found at index 0  Not found  Enter number to search for or -1 to quit:  ^C  

The program should find the number, let us know it's at index 0, and then ask for another number. Instead we get two found messages and one not found message. We know that everything is running smoothly up to the time we get the first found message. After that things go downhill.

Getting back into the debugger, we use the list command to locate the found message and put a breakpoint there:

 %  gdb search  GDB is free software and you are welcome to distribute copies of it under certain conditions; type "show copying" to see the conditions. There is absolutely no warranty for GDB; type "show warranty" for details. GDB 4.12 (m68k-sun-sunos4.0.3),  Copyright 1994 Free Software Foundation, Inc... (gdb)  list 69,81  69              while (true) { 70                  middle = (low + high) / 2; 71 72                  assert(middle >= 0); 73                  assert(middle <sizeof(data)/sizeof(data[0]));       74                  if (data[middle] == search) { 75                    std::cout << "Found at index " << middle << '\n'; 76                  } 77       78                  if (low == high) { 79                      std::cout << "Not found\n"; 80                      break; 81                  } (gdb)  break 75  Breakpoint 1 at 0x249e: file search.cpp, line 71. (gdb)  run  Starting program: /usr/sdo/search/search  Enter number to search for or -1 to quit: 4 Breakpoint 1, main (  ) at search.cpp:71 75                    std::cout << "Found at index " << middle << '\n'; (gdb)  step  Found at index 0 78                  if (low == high) { (gdb)  step  83                  assert(middle >= 0); (gdb)  step  84                  assert(middle <sizeof(data)/sizeof(data[0]));       (gdb)  step  85                  if (data[middle] < search) (gdb)  step  88                      high = middle; (gdb)  step  70                  middle = (low + high) / 2; (gdb)  step  72                  assert(middle >= 0); (gdb)  step  73                  assert(middle <sizeof(data)/sizeof(data[0]));       (gdb)  step  74                  if (data[middle] == search) { (gdb)  step  75                    std::cout << "Found at index " << middle << '\n'; (gdb)  step  Found at index 0 78                  if (low == high) { (gdb)  quit  The program is running.  Quit anyway (and kill it)? (y or n)  y  

The program doesn't exit the loop. Instead it continues with the search. Because the number has already been found, this search results in strange behavior. We are missing a break after the cout .

We need to change:

 if (data[middle] == search) {          std::cout << "Found at index " << middle << '\n';      } 

to:

 if (data[middle] == search) {          std::cout << "Found at index " << middle << '\n';          break;      } 

Making this fix, we try the program again:

 %  search  Enter number to search for or -1 to quit:  4  Found at index 0  Enter number to search for or -1 to quit:  6  Found at index 1  Enter number to search for or -1 to quit:  3  Not found  Enter number to search for or -1 to quit:  5   program runs forever (or until we abort it)  

We have a runaway program. This time, instead of setting a breakpoint, we just start running the program. After a few seconds pass and we believe that we are stuck in the infinite loop, we stop the program with a control-C ( ^C ). Normally this would abort the program and return us to the shell prompt. Since we are running with the debugger, it returns control to GDB:

 %  gdb search  GDB is free software and you are welcome to distribute copies of it under certain conditions; type "show copying" to see the conditions. There is absolutely no warranty for GDB; type "show warranty" for details. GDB 4.12 (m68k-sun-sunos4.0.3),  Copyright 1994 Free Software Foundation, Inc... (gdb)  run  Starting program: /usr/sdo/search/search  Enter number to search for or -1 to quit:  5   ^C  Program received signal SIGINT, Interrupt. 0x2500 in main (  ) at search.cpp:79 79                  if (data[middle] < search) 

Now we can use the single-step command to step through the infinite loop, looking at key values along the way.

 87                  if (data[middle] < search) (gdb)  print middle  = 0 (gdb)  print data[middle]  = 4 (gdb)  print search  = 5 (gdb)  step  88                      low = middle; (gdb)  step  71                  middle = (low + high) / 2; (gdb)  step  73                  assert(middle >= 0); (gdb)  step  74                  assert(middle <sizeof(data)/sizeof(data[0]));       (gdb)  step  75                  if (data[middle] == search) { (gdb)  step  80                  if (low == high) { (gdb)  step  85                  assert(middle >= 0); (gdb)  step  86                  assert(middle <sizeof(data)/sizeof(data[0]));       (gdb)  step  87                  if (data[middle] < search) (gdb)  step  88                      low = middle; (gdb)  step  71                  middle = (low + high) / 2; (gdb)  step  73                  assert(middle >= 0); (gdb)  step  74                  assert(middle <sizeof(data)/sizeof(data[0]));       (gdb)  step  75                  if (data[middle] == search) { (gdb)  step  80                  if (low == high) { (gdb)  step  85                  assert(middle >= 0); (gdb)  step  86                  assert(middle <sizeof(data)/sizeof(data[0]));       (gdb)  step  87                  if (data[middle] < search) (gdb)  step  88                      low = middle; (gdb)  step  71                  middle = (low + high) / 2; (gdb)  step  73                  assert(middle >= 0); (gdb)  step  74                  assert(middle <sizeof(data)/sizeof(data[0]));       (gdb)  step  75                  if (data[middle] == search) { (gdb)  step  80                  if (low == high) { (gdb)  step  85                  assert(middle >= 0); (gdb)  step  86                  assert(middle <sizeof(data)/sizeof(data[0]));       (gdb)  step  87                  if (data[middle] < search) (gdb)  step  88                      low = middle; (gdb)  step  71                  middle = (low + high) / 2; (gdb)  step  73                  assert(middle >= 0); (gdb)  step  74                  assert(middle <sizeof(data)/sizeof(data[0]));       (gdb)  step  75                  if (data[middle] == search) { (gdb)  print   low  = 0 (gdb)  print   middle  = 0 (gdb)  print   high  = 1 (gdb)  print   search  = 5 (gdb)  print data[0]  = 4 (gdb)  print data[1]  = 6 (gdb)  quit  The program is running.  Quit anyway (and kill it)? (y or n)  y  

The problem is that we have reached a point where the following is true:

 low = 0  middle = 0  high = 1 

The item we are searching for falls exactly between elements 0 and 1. Our algorithm has an off-by-one error. Obviously the middle element does not match. If it did, we'd exit with a found at message. So there is no point including the middle element in our new search range. Our code to adjust the interval is:

 if (data[middle] < search)                  low = middle;              else                  high = middle; 

It should be:

 if (data[middle] < search)                  low = middle + 1;              else                  high = middle - 1; 

The full version of the corrected program is shown in Example 17-3.

Example 17-3. search/search4.cpp
 1: /********************************************************  2:  * search -- Search a set of numbers.                   *  3:  *                                                      *  4:  * Usage:                                               *  5:  *      search                                          *  6:  *              you will be asked numbers to lookup     *  7:  *                                                      *  8:  * Files:                                               *  9:  *      numbers.dat -- numbers 1 per line to search     * 10:  *                      (Numbers must be ordered)       * 11:  ********************************************************/ 12: #include <iostream> 13: #include <fstream> 14: #include <cstdlib> 15: #include <cstdio> 16: #include <assert.h> 17:  18: const int MAX_NUMBERS = 1000;   // Max numbers in file  19: const char *const DATA_FILE = "numbers.dat";// File with numbers  20:  21: int data[MAX_NUMBERS];  // Array of numbers to search  22: int max_count;          // Number of valid elements in data  23: int main(  ) 24: { 25:     std::ifstream in_file;      // Input file  26:     int middle;         // Middle of our search range  27:     int low, high;      // Upper/lower bound  28:     int search;         // number to search for  29:  30:     in_file.open(DATA_FILE, std::ios::in); 31:     if (in_file.bad(  )) { 32:         std::cerr << "Error:Unable to open " << DATA_FILE << '\n'; 33:         exit (8); 34:     } 35:  36:     /* 37:      * Read in data  38:      */ 39:  40:     max_count = 0; 41:  42:     while (true) { 43:         char line[30];  // Line from the input file 44:  45:         if (in_file.eof(  )) 46:             break; 47:  48:         in_file.getline(line, sizeof(line)); 49:  50:         assert(max_count >= 0); 51:         assert(max_count < sizeof(data)/sizeof(data[0])); 52:         std::sscanf(line, "0", &data[max_count]); 53:         if (data[max_count] == -1) 54:             break; 55:  56:         ++max_count; 57:     } 58:  59:     while (true) { 60:         std::cout << "Enter number to search for or -1 to quit:" ; 61:         std::cin >> search; 62:  63:         if (search == -1) 64:             break; 65:  66:         low = 0; 67:         high = max_count; 68:  69:         while (true) { 70:             if (low >= high) { 71:                 std::cout << "Not found\n"; 72:                 break; 73:             } 74:             middle = (low + high) / 2; 75:  76:             assert(middle >= 0); 77:             assert(middle < sizeof(data)/sizeof(data[0])); 78:             if (data[middle] == search) { 79:                 std::cout << "Found at index " << middle << '\n'; 80:                 break; 81:             } 82:  83:             assert(middle >= 0); 84:             assert(middle < sizeof(data)/sizeof(data[0])); 85:             if (data[middle] < search) 86:                 low = middle +1; 87:             else 88:                 high = middle -1; 89:         } 90:    } 91:    return (0); 92: } 
I l @ ve RuBoard


Practical C++ Programming
Practical C Programming, 3rd Edition
ISBN: 1565923065
EAN: 2147483647
Year: 2003
Pages: 364

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net