Historical Inevitability of Overflow Errors

Overflow errors are fundamental programming errors that are hard to trace. They stem from the nature of the C programming language (the most popular one of all times and nations) or, to be more precise, from its low-level interaction with memory. Arrays are only partially supported, and programmers must work with them with extreme care. There are no tools for automatic bounds checking, and the number of elements in an array cannot be determined from a pointer to it. Zero-terminated strings are a separate issue.

The slightest negligence or an incorrectly implemented argument check results in a potential vulnerability. The main problem is that a fully correct argument check is impossible in principle. For instance, consider a function that determines the length of the string passed to it by reading that string, character by character, until it encounters the terminating zero. What happens if no terminating zero is encountered? In this case, the function will run past the end of the allocated memory block and start processing memory that belongs to something else, which it has no right to access. In the best case, this will raise an exception. In the worst case, confidential data will be accessed.
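As a minimal sketch of the length-scanning problem described above, the function below stops at a caller-supplied limit instead of reading until it happens to find a zero. The name bounded_length and its parameters are hypothetical, chosen for illustration only. Note that the function still trusts the caller to pass a correct cap, so it narrows the problem rather than removing it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: looks for the terminating zero within the first
 * cap bytes of buf. Returns 1 and stores the string length in *len_out
 * if the terminator is found; returns 0 and leaves *len_out untouched
 * if it is not. */
int bounded_length(const char *buf, size_t cap, size_t *len_out)
{
    assert(buf != NULL && len_out != NULL);

    /* memchr never reads more than cap bytes, unlike a plain strlen. */
    const char *end = memchr(buf, '\0', cap);
    if (end == NULL)
        return 0;               /* no terminator inside the block */

    *len_out = (size_t)(end - buf);
    return 1;
}
```

A caller that owns the buffer can pass sizeof buf as cap; a callee that only receives a pointer has no independent way to verify that value.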

It is possible to pass the maximum length of the string buffer in a separate argument; however, no one can guarantee that it will be correct. After all, this argument has to be formed manually, and no one is immune to errors. In short, the called function must rely on the correctness of the arguments passed to it, and it has no way of verifying them itself. Furthermore, a buffer can be allocated only after the length of the data structure to be received has been computed. In other words, the buffer must be allocated dynamically. This rules out allocating it on the stack, because stack buffers have a fixed size, which is defined at compile time. On the other hand, stack buffers are released automatically when the function exits. This relieves programmers of the need to carry out this task and helps them avoid potential problems related to memory leaks.
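The following sketch illustrates sizing a buffer only after the data length is known, which forces the allocation onto the heap. The function name copy_to_heap is hypothetical. The length argument is taken on trust, exactly as the text warns, and the caller becomes responsible for releasing the result.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies src_len bytes from src into a freshly
 * allocated, zero-terminated heap buffer. Returns NULL on failure.
 * The caller must free() the returned pointer. */
char *copy_to_heap(const char *src, size_t src_len)
{
    assert(src != NULL);

    if (src_len == SIZE_MAX)        /* src_len + 1 would wrap to 0 */
        return NULL;

    char *dst = malloc(src_len + 1);
    if (dst == NULL)
        return NULL;

    memcpy(dst, src, src_len);      /* reads exactly src_len bytes */
    dst[src_len] = '\0';
    return dst;
}
```

If src_len is larger than the memory actually backing src, the memcpy call reads out of bounds; nothing inside the function can detect that.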

Dynamic buffers allocated in the heap are less popular, because using them complicates the program structure. Previously, error handling could be reduced to an immediate return; now, before exiting the function, it is necessary to execute special code that releases everything the programmer allocated earlier. Without the goto operator, the most popular target of everyone's criticism (and error-prone in itself), this task can be carried out only with deeply nested if operators, structured exception handlers, macros, or external functions. This not only clutters the listing and obscures the source code but also becomes a source of random errors, which are hard to trace or reproduce.
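One common way to release heap buffers on every exit path is a single cleanup label reached with goto. The sketch below is illustrative only: equal_ignoring_case and lower_copy are hypothetical names, and the comparison task was picked simply because it needs two temporary buffers.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a lowercase heap copy of s, or NULL. */
static char *lower_copy(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s);
    char *copy = malloc(n + 1);
    if (copy == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        copy[i] = (char)tolower((unsigned char)s[i]);
    copy[n] = '\0';
    return copy;
}

/* Returns 1 if a and b match ignoring case, 0 if they differ,
 * and -1 on error. Every path leaves through the cleanup label. */
int equal_ignoring_case(const char *a, const char *b)
{
    int result = -1;
    char *la = NULL;
    char *lb = NULL;

    if (a == NULL || b == NULL)
        goto cleanup;
    la = lower_copy(a);
    if (la == NULL)
        goto cleanup;
    lb = lower_copy(b);
    if (lb == NULL)
        goto cleanup;

    result = (strcmp(la, lb) == 0);

cleanup:
    free(lb);   /* free(NULL) is a no-op, so no extra checks are needed */
    free(la);
    return result;
}
```

Initializing every pointer to NULL before the first goto is what keeps the cleanup code safe; forgetting that step is one of the ways this pattern itself goes wrong.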

Most library functions (such as gets and sprintf) have no means of limiting the length of the returned data and, consequently, easily cause overflow errors. Manuals on security are full of recommendations instructing programmers to avoid such functions and to use their "safe" analogues instead, such as fgets and snprintf, which take the maximum buffer length in a special argument. However, this clutters the program listing with extra arguments and raises natural problems with keeping them synchronized: when working with complex data structures, where a single buffer stores many items, computing the length of the remaining "tail" ceases to be a trivial arithmetic problem, and errors become likely. In addition, the programmer must control the integrity of the processed data. At the least, it is necessary to make sure that the data being processed weren't truncated. At most, it is necessary to make sure that truncation is handled correctly. What could be done in this situation? It is possible to increase the buffer length and call the function again to copy the tail there. However, this is an awkward solution; furthermore, it is always possible to lose the terminating zero.
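The tail-length bookkeeping and truncation check mentioned above might look like the following sketch. The function append_checked is hypothetical. It relies on the standard behavior of vsnprintf, which returns the number of characters the full output would have needed, so a result greater than or equal to the remaining room means the data were truncated.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: appends formatted text to buf at offset *used.
 * cap is the total size of buf. Returns 0 on success and advances *used;
 * returns -1 if the text does not fit, leaving the earlier contents
 * intact and zero-terminated. */
int append_checked(char *buf, size_t cap, size_t *used, const char *fmt, ...)
{
    assert(buf != NULL && used != NULL && fmt != NULL);
    assert(*used < cap);

    size_t room = cap - *used;      /* the remaining "tail", terminator included */

    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf + *used, room, fmt, ap);
    va_end(ap);

    if (n < 0 || (size_t)n >= room) {
        buf[*used] = '\0';          /* discard the partial output */
        return -1;
    }

    *used += (size_t)n;
    return 0;
}
```

Keeping the offset in one variable that only this function updates is a design choice meant to avoid the synchronization errors the text describes; what to do when -1 comes back is still left to the caller.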

In C++, the situation with overflow errors is somewhat better, although there are still lots of problems. Support for dynamic arrays and "transparent" text strings has been implemented at last (which is good), but most implementations of dynamic arrays are slow, and string implementations are slower still. Therefore, in performance-critical situations it is better to abandon them altogether. It is simply impossible to proceed otherwise, because there is only one method of building dynamic arrays of variable length, namely, representing their contents in the form of some referential structure (such as a bidirectional list). For quick access to an arbitrary list element, the list must be indexed, and the index table must be stored somewhere. Thus, reading or writing a single character takes tens of machine instructions and multiple memory access operations. Remember that memory has always been the most serious bottleneck, considerably reducing overall system performance, and that this situation hasn't changed and is unlikely to do so.

Even if the compiler checks array boundaries (this requires one additional memory access operation and three or four machine instructions), this won't solve the problem. In case of overflow, the compiled program won't be able to do anything better than terminate abnormally. Raising an exception is no answer either, because if the programmer forgets to handle the exception (which is most frequently the case), this will allow a Denial-of-Service (DoS) attack. This isn't as dangerous as allowing the intruder to gain full control over the system; nevertheless, such situations must be avoided.
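Written by hand, a boundary check of the kind discussed above could look like this sketch. The checked_array type and the checked_get function are hypothetical. The check itself is cheap; the open question, as the text points out, is what the caller does when it fails.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical array descriptor that carries its own element count. */
typedef struct {
    int    *data;
    size_t  count;      /* number of valid elements in data */
} checked_array;

/* Reads arr->data[index] into *out. Returns 0 on success, or -1 if
 * index is out of range, in which case *out is left unchanged. */
int checked_get(const checked_array *arr, size_t index, int *out)
{
    assert(arr != NULL && out != NULL);

    /* One extra load (count) plus a compare and a branch. */
    if (index >= arr->count)
        return -1;

    *out = arr->data[index];
    return 0;
}
```

An ignored -1 here is no better than an unhandled exception: the overflow is stopped, but the program continues with stale or missing data unless the caller reacts.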

Thus, overflow errors have always existed and are not likely to be eliminated. It is impossible to avoid them, and, because you are forced to coexist with them, it is necessary to study them in more detail.



Shellcoder's Programming Uncovered
ISBN: 193176946X
Year: 2003
Pages: 164
