Sources of Potential Threat

There are only three sources of potential threat:

  • Forcing a vulnerable program to accept fictitious specifiers

  • Inherited misbalance of specifiers

  • Natural overflow of the target buffer if the string is not checked for the maximum allowed length

Enforcement of Fictitious Specifiers

If user input ends up in the formatted output string (which happens frequently) and the specifiers it contains are not filtered (which is usually the case), then intruders can manipulate the formatted output interpreter at their discretion, causing access errors, reading and overwriting memory cells, and even, under the most favorable circumstances, taking full control of the remote system.
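One way to picture the filtering mentioned above is a routine that doubles every percent sign before the text can reach a format string. The following is only a minimal sketch: the name escape_percent and its return convention are hypothetical and do not come from the book.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the book): copies src to dst, doubling
 * every '%' so that a printf-family function prints it literally.
 * Returns 0 on success, -1 if dst (dst_size bytes) is too small. */
int escape_percent(char *dst, size_t dst_size, const char *src)
{
    size_t out = 0;

    assert(dst != NULL && src != NULL);
    for (; *src != '\0'; ++src) {
        size_t need = (*src == '%') ? 2 : 1;

        if (out + need >= dst_size)     /* keep room for the terminator */
            return -1;
        if (*src == '%')
            dst[out++] = '%';           /* "%%" is printed as a single '%' */
        dst[out++] = *src;
    }
    if (out >= dst_size)                /* only possible when dst_size is 0 */
        return -1;
    dst[out] = '\0';
    return 0;
}
```

With this sketch, the input %s%n becomes %%s%%n, which the interpreter prints as text instead of acting on it. Passing the input as an argument to a fixed format string avoids the need for such filtering altogether.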

Consider the following example (Listing 6.2), which will be used many times in this book. Where do you think the vulnerability is located?

Listing 6.2: Demo example of a program vulnerable to various types of overflow errors
 f()
 {
         char buf_in[32], buf_out[32];

         printf("Enter the name:"); gets(buf_in);
         sprintf(buf_out, "hello, %s!\n", buf_in);
         printf(buf_out);
 }

DoS Implementation

For the program provided in Listing 6.2 to terminate abnormally, it is enough to cause an access violation by trying to access an unallocated, nonexistent, or blocked memory cell. This is not difficult. Having encountered the %s specifier, the formatted output interpreter retrieves from the stack the argument that corresponds to it and interprets it as a pointer to a string. If no such argument was passed, the interpreter takes whatever double word it finds at that position on the stack, treats it as a pointer, and reads the memory contents at that address until it encounters a zero or an inaccessible cell. The policy of limiting access to cells varies from operating system to operating system. In particular, when accessing addresses 00000000h to 0000FFFFh and 7FFF0000h to FFFFFFFFh, Windows NT always throws an exception. All other addresses might or might not be available, depending on the state of the heap, stack, and static memory.

Compile the example provided in Listing 6.2 and run it. Instead of the user name, enter the string %s. The program will answer as in Listing 6.3.

Listing 6.3: Program's reaction to the %s specifier
 Enter the name:%s
 hello, hello, %s!\n!

To understand what the second "hello, %s!" string is and where it comes from, it is necessary to analyze the state of the stack at the moment of the call to the printf(buf_out) function. To do so, use any debugger, for instance, the one supplied with Microsoft Visual Studio (Fig. 6.1).

Figure 6.1: Stack status at the moment of the call to the printf function

The 0012FF5Ch double word goes first. (On Intel microprocessors, the least significant byte is located at the smaller address; in other words, all numbers in memory are written from right to left.) This is a pointer that corresponds to the argument of the printf function, which, in turn, corresponds to the buf_out buffer. That buffer contains the unpaired "%s" specifier, which makes the printf function retrieve the next double word from the stack. This double word is garbage left by the previous function. In this particular case, both the pointer and the garbage point to the same buf_out buffer; therefore, no access violation takes place. Instead, the word hello is displayed twice.

Now dig further, popping from the stack the following sequence of addresses: 00408000h (the pointer to the "hello, %s!\n" string), 0012FF3Ch (the pointer to the buf_out buffer), 0012FF3Ch (the same pointer), 0040800Ch (the pointer to the "Enter the name:" string), 73257325h (the contents of the buf_in buffer interpreted as a pointer, which, by the way, is pointing to an unallocated memory cell).

Thus, the first five %s specifiers pass through the interpreter of the formatted output without any problems; however, the sixth will "launch into space." The processor throws an exception, and program execution terminates abnormally (Fig. 6.2). Entering more than six specifiers changes nothing, because those after the sixth are never reached. Note that the address reported by Windows NT is the same one that was predicted.

Figure 6.2: Reaction of the demo program at the sequence of six %s specifiers
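For contrast, here is a minimal sketch of how the same greeting could be produced without letting user input reach the format string. The function names format_greeting and print_greeting are hypothetical; the sketch assumes a C99 library that provides snprintf, and it assumes the name is read with a bounded function such as fgets rather than gets.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats the greeting into out (out_size bytes) without overflowing it.
 * Returns 0 on success, -1 if the greeting did not fit and was truncated. */
int format_greeting(char *out, size_t out_size, const char *name)
{
    int n;

    assert(out != NULL && name != NULL && out_size > 0);
    n = snprintf(out, out_size, "hello, %s!\n", name);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/* Prints the greeting as data: because the format string is the fixed
 * "%s", specifiers inside the greeting are never interpreted.
 * Returns the number of characters printed. */
int print_greeting(FILE *stream, const char *greeting)
{
    return fprintf(stream, "%s", greeting);
}
```

Given the six %s specifiers that crash the demo program, this version simply prints them back as text.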

Peek Implementation

To view the contents of the memory of the vulnerable program, it is possible to use the following specifiers: %x, %d, and %c. Specifiers such as %x and %d retrieve the double word paired to them from the stack and display it in hexadecimal or decimal format, respectively. The %c specifier retrieves the paired double word from the stack, converts it to the single-byte char type, and displays it as a character, discarding the 3 most significant bytes. Of these, the most useful are the %x and %c specifiers.

Every %x specifier displays only one double word, which is located near the stack top (the exact location depends on the prototype of the function being called). Accordingly, N specifiers display 4*N bytes. Because each specifier occupies 2 bytes of input and displays 4 bytes, the maximum depth of viewing is equal to 2*C, where C is the maximum allowed size of user input in bytes. Alas! Reading the entire memory of a vulnerable application is impossible. The hacker will be able to read only a small piece, where some secret data might be encountered if the hacker is lucky enough (for instance, passwords or pointers to them). Anyway, knowing the current pointer position is a good result. I will cover this potential threat in more detail.

Start the demo program and enter the %x specifier. The program will answer as shown in Listing 6.4.

Listing 6.4: Program's reaction to the %X specifier
 Enter the name:%X
 hello, 12FF5C!

Why 12FF5C? Where does it come from? Return to the memory dump (see Fig. 6.1) and you'll see that this is the double word that follows the buf_out argument. It represents the result of activity of the previous function, or, so to say, garbage. However, what is the use of knowing this? The buffer contains the user input, which surely doesn't contain anything interesting. However, this is only the tip of the iceberg. As was already mentioned in Chapter 4, to pass control to the shellcode, the hacker must know its absolute address. In most cases, this address is not known beforehand; however, the %x specifier makes the program display it on the screen.

Now enter several %X specifiers, separating them with blank characters for convenience, even though the separation is not necessary. The program will respond as shown in Listing 6.5.

Listing 6.5: Viewing the memory dump using specifiers
 Enter the name:%X %X %X %X %X %X %X
 hello, 12FF5C 408000 12FF3C 12FF3C 40800C 25205825 58252058!

Pay attention to the last two double words, 25205825 and 58252058. They represent the contents of the user input buffer (the ASCII string %X followed by a blank appears in hexadecimal notation as 25 58 20).
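To check that these double words really are the user input, their bytes can be written out in memory order. The sketch below assumes the little-endian layout described earlier; the function name dword_to_text is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Writes the four bytes of a double word to out in memory order
 * (least significant byte first, as on Intel processors) and adds a
 * terminating zero so that the result can be handled as text. */
void dword_to_text(uint32_t dword, char out[5])
{
    int i;

    assert(out != NULL);
    for (i = 0; i < 4; ++i)
        out[i] = (char)((dword >> (8 * i)) & 0xFF);
    out[4] = '\0';
}
```

Applied to 25205825h and 58252058h, the sketch yields the first eight characters of the input string: a percent sign, X, a blank, and so on.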

The idea consists of forming the pointer to the required memory cell, placing it into the buffer, and then setting the %s specifier against it. This specifier reads the memory until a 0 byte or a prohibited cell is encountered. The 0 byte is not an obstacle, because it is enough to form a new pointer located just after it. Prohibited cells are much more treacherous, because any attempt at accessing one causes an abnormal termination of the program. Until the administrator gets the server up and running again, the attacker will have to wait. After a restart, the location of vulnerable buffers might be different, which renders useless all results that the attacker achieved before. Nothing ventured, nothing gained, but it is unwise to go off the deep end. In other words, the hacker must be careful with the %s specifier; otherwise, nothing but a DoS attack will result.

Assume that the hacker wants to read the memory contents at the 77F86669h address (by doing this, it is possible to determine the operating system version, which varies from computer to computer). The location of the user input buffer is already known: meaningful data start from the sixth double word (see Listing 6.5). Now, the intruder must only prepare the "weapons and ammunition" required for the attack. For example, the attacker might enter the target address, writing it in inverse order and entering non-printable characters using the <ALT> key and numeric keypad. Then the attacker might add five %x, %d, or %c specifiers to skip the five double words that precede the buffer (the contents of these cells are of no importance, so any of these specifiers will do), add some token (for instance, an asterisk or a colon) followed by the string output specifier, and feed the result to the vulnerable program. The token is needed only to quickly determine where the garbage ends and meaningful data begin.

Listing 6.6: Manually viewing the memory dump at the artificially formed pointer
 Enter the name:if<ALT-248>w%C%C%C%C%C:%s
 hello, ifw \ <<:JIF@!

If the string displayed after the colon is converted to hexadecimal format, you'll obtain the following sequence: 8B 46 B3 40 3E B3 00. Where does the zero come from? Well, this is an ASCIIZ string, and zero is the string terminator. If there were no terminating zero here, the %s specifier would display much more information.

This example implements the analogue of the peek Basic function; however, it is limited in its capabilities. The pointer formed at the start of the buffer cannot contain the zero character; therefore, the first 17 MB of the address space will be unavailable for viewing. The pointer formed at the end of the buffer can point at practically any address, because the most significant byte of the address coincides with the terminating zero of the string. However, to access this pointer, the hacker will have to traverse the entire buffer, which is not always possible.
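Both points, the inverse byte order and the zero-byte restriction, can be illustrated with a short sketch. The function names encode_pointer_le and has_zero_byte are hypothetical, and 32-bit addresses are assumed, as in the rest of this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stores a 32-bit address in memory order, least significant byte first. */
void encode_pointer_le(uint32_t addr, unsigned char out[4])
{
    int i;

    assert(out != NULL);
    for (i = 0; i < 4; ++i)
        out[i] = (unsigned char)((addr >> (8 * i)) & 0xFF);
}

/* Returns 1 if any byte of the address is zero. Such an address cannot
 * be placed in the middle of a zero-terminated string, because the zero
 * byte would cut the string short. */
int has_zero_byte(uint32_t addr)
{
    int i;

    for (i = 0; i < 4; ++i)
        if (((addr >> (8 * i)) & 0xFF) == 0)
            return 1;
    return 0;
}
```

For 77F86669h the bytes come out as 69 66 F8 77, that is, the characters i and f, the character with the decimal code 248, and w. The lowest address without any zero byte is 01010101h, which is where the figure of roughly 17 MB comes from.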

The disassembler shows that the demo program contains Microsoft's runtime library notice at the 004053B4h address (Listing 6.7). Is it possible to display it on the screen? As you recall, the beginning of the buffer corresponds to the sixth specifier. Every specifier takes 2 bytes and pops 4 bytes from the stack. Two more bytes are required for the %s specifier that displays the string. How many specifiers is it necessary to pass to the program? Compose a simple linear equation, solve it, and you'll get the result: 12. The first 11 specifiers pop all unneeded information from the stack, and the twelfth one displays the string addressed by the pointer located after them.
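The linear equation can be reconstructed as follows. If N specifiers of 2 bytes each are entered, they occupy 2*N bytes, or N/2 double words, so the pointer that follows them lies in double word 6 + N/2. The last specifier consumes double word N, which gives N = 6 + N/2, and therefore N = 12. The sketch below merely searches for that N; the function name specifiers_needed is hypothetical.

```c
#include <assert.h>

/* Returns the total number N of 2-byte specifiers for which the N-th
 * specifier is paired with the double word that immediately follows the
 * specifiers in the buffer, or 0 if no N up to max_n fits.
 * first_dword is the 1-based stack position where the buffer starts. */
unsigned specifiers_needed(unsigned first_dword, unsigned max_n)
{
    unsigned n;

    assert(first_dword > 0);
    for (n = 2; n <= max_n; n += 2) {   /* even n keeps the pointer aligned */
        unsigned pointer_dword = first_dword + (2 * n) / 4;

        if (pointer_dword == n)
            return n;
    }
    return 0;
}
```

For a buffer that starts at the sixth double word, the search stops at 12, which matches the result given in the text.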

Listing 6.7: Disassembled fragment of the demo program
 .rdata:004053B4 aMicrosoftVisua db 'Microsoft Visual C++ Runtime Library',0

The pointer is formed trivially: it is only necessary to open an ASCII character table (or, as a variant, start HIEW) and convert the value 4053B4h into a character representation. The result is @S followed by the character with the code B4h (decimal 180). Turn it inside out and then feed it to the program, using the <ALT> key and numeric keypad as necessary (Listing 6.8).

Listing 6.8: Forming the pointer at the end of the buffer and displaying it on the screen
 Enter the name:%c%c%c%c%c%c%c%c%c%c%c%s<Alt-180>S@
 hello, \ <<%%%%%%Microsoft Visual C++ Runtime LibraryS@!

Well, it works! Proceeding further in such a way, the hacker will be able to view practically the entire memory allocated to the program. By the way, Unicode functions working with wide characters use a double zero (0000h) for string termination and are tolerant to single zero bytes.

Poke Implementation

The %n specifier writes the number of bytes displayed so far into the memory cell addressed by the pointer that is paired to it, thus allowing hackers to modify memory contents at their discretion. Note that the pointer itself is not modified; it is the cell to which it points that is modified. The cells to be modified must belong to a page with the PAGE_READWRITE attribute; otherwise, an exception will be generated.

Before demonstrating this capability, it is necessary to find a suitable pointer in the stack garbage and read its contents using something like the following string: %x %x %x... (see Listing 6.9). Assume that the 12FF3Ch pointer has been chosen, which points to the user input buffer (buf_in). To reach it, it is necessary to pop two double words from the stack using the %c%c specifiers.

Listing 6.9: Overwriting the cell with the %n specifier
 Enter the name:qwerty%c%c%n
 hello, qwerty\ !

Now it is necessary to determine the number that will be written into the buffer. Only small numbers can be written, because large ones won't fit in the buffer. For definiteness, assume that this is the number 0Fh. Now compute: two characters are displayed by the specifiers that pop unneeded double words from the stack top, and seven are required for the "hello, " string (yes, it also is a participant). The result will be as follows: 0Fh - 02h - 07h == 06h. Thus, it is necessary to enter any six characters, for example, qwerty. It only remains to add the %n specifier and pass the formed string to the program, as in Listing 6.9.
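The arithmetic can be verified safely with a fixed, trusted format string that mirrors what the demo program ends up printing. This is only a sketch: the function name count_before_n is hypothetical, the two %c arguments stand in for whatever double words lie on the stack, and some C runtimes disable %n by default, in which case the call fails instead of storing the count.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the value that %n stores after "hello, ", the typed text, and
 * two characters produced by %c have been output. The format string is
 * a constant here, so nothing is under the control of user input. */
int count_before_n(const char *typed)
{
    char out[64];
    int written = -1;

    assert(typed != NULL);
    snprintf(out, sizeof out, "hello, %s%c%c%n", typed, 'A', 'B', &written);
    return written;
}
```

For the six characters of qwerty, the stored value is 7 + 6 + 2 = 15, or 0Fh.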

Because the buffer is modified after it has been output, it is necessary to use the debugger to prove the modification. Load the program into Microsoft Visual Studio or any other debugger, set the breakpoint at the address 401000 (this is the address of the main function) or move the cursor to it (<Ctrl>+<G>, Address, 401000, <Enter>), then press the <Ctrl>+<F10> combination to skip the start-up code instructions, which are of no interest for the moment.

Trace the program step by step by pressing <F10> (Step Over), enter the specified string when the program prompts you to do so, and continue tracing until the 0040103Ch line is reached, which calls the printf function. Next, go to the memory dump window and enter ESP in the address string, informing the debugger that you need to view the stack contents. After doing this, return to the disassembled code and press <F10> again.

The contents of the user input buffer will change immediately, and the number 0F 00 00 00 written at its beginning will be highlighted in red. Thus, the memory cell has been modified successfully (Fig. 6.3).

Figure 6.3: Demonstration of memory cell overwriting

Recall that if specifiers overlap the user input buffer, hackers can form the pointer on their own, overwriting the chosen memory cells arbitrarily. Well, in a practically arbitrary manner. The limitations imposed on the choice of target addresses are now complemented by limitations imposed on the overwritten value. Note, by the way, that these limitations are stringent.

The lower limit is defined by the number of already-displayed characters (in this case, this is the length of the "hello, " string), and the maximum value is practically unlimited: it is enough to choose a couple of pointers to strings of a suitable length and set the %s specifiers against them. Note, however, that there is no guarantee that such strings will be available. Therefore, it is not always possible to obtain control over the remote machine using formatted output. This is practically unrealistic. However, the hacker will be able to organize an efficient DoS attack. Strings like %n%n%n%n%n... drop the system much more efficiently than %s%s%s%s%s....

Misbalance of Specifiers

Each specifier must have a paired argument. However, "must" doesn't mean that it is obliged to have one. After all, programmers have to manually enter specifiers and arguments, and they tend to err. The compiler will compile such a program normally or perhaps with some warnings. However, programmers also tend to ignore such warnings. But what will happen some time later?

If arguments happen to be more numerous than specifiers, then the "extra" arguments will be ignored. However, if the situation is the opposite, then the formatted output function, not knowing how many arguments it has been passed, will pop the first garbage that it encounters on the stack, and afterwards the events will develop according to the scenario described in the "Enforcement of Fictitious Specifiers" section. The only difference is that the intruder can influence the specifiers only indirectly, or might not be able to do so at all.

Errors of this type are encountered only in programs written by beginners; therefore, they are not urgent. In other words, describing them is not worth the paper.
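A mismatch of this kind can be caught by comparing the number of arguments a format string expects with the number actually passed. The sketch below is a deliberately simplified counter under stated assumptions: it ignores the * width and precision forms and positional arguments, and the name count_conversions is hypothetical. Compilers such as GCC and Clang perform a far more thorough check of literal format strings through their -Wformat warnings.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified count of the arguments that a format string consumes:
 * every '%' that is not part of a "%%" pair starts one conversion.
 * Does not handle '*' width or precision or positional arguments. */
size_t count_conversions(const char *fmt)
{
    size_t count = 0;

    assert(fmt != NULL);
    while (*fmt != '\0') {
        if (*fmt++ != '%')
            continue;               /* ordinary character */
        if (*fmt == '%') {          /* "%%" prints a literal percent sign */
            ++fmt;
            continue;
        }
        if (*fmt == '\0')           /* lone '%' at the end of the string */
            break;
        ++count;
    }
    return count;
}
```

If the count exceeds the number of arguments supplied at the call site, the function will read garbage from the stack, as described above.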

Target Buffer Overflow

The sprintf function is one of the most dangerous C functions, and all security manuals state that it is much better to use its safe analogue, snprintf. Why? Because the nature of formatted output is such that the maximum length of the resulting string is difficult to compute beforehand. Consider, for example, the code presented in Listing 6.10.

Listing 6.10: An example that demonstrates overflow of the target buffer
 f()
 {
         char buf[???];

         sprintf(buf, "Name:%s Age:%02d Weight:%03d Height:%03d\n",
                 name, age, m, h);
         ...
 }

What do you think the size of the required buffer will be? Among the unknown values are the length of the name string and the lengths of the integer variables age, m, and h, which the sprintf function converts into a character representation. At first glance, it seems logical that if you allocate two columns for age, three columns for weight, and three columns for height, then, apart from the length of the name and the length of the fixed text in the format string, only 8 bytes will be required. Is this correct? No! If the string representation of the data doesn't fit the allocated positions, it is automatically extended to avoid truncating the result. In reality, the decimal representation of a 32-bit value of the int type can take up to 11 characters (for example, -2147483648), so the programmer must reserve at least 11 bytes for each number; otherwise, the program will be vulnerable to buffer overflow.
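Instead of estimating the size by hand, a C99 library can be asked for it: snprintf called with a null buffer and a zero size returns the length that the formatted string would have. The following is a minimal sketch, assuming a C99-conforming snprintf; the function name record_size is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the buffer size in bytes, including the terminating zero,
 * needed to hold the formatted record, or 0 if formatting fails. */
size_t record_size(const char *name, int age, int m, int h)
{
    int len;

    assert(name != NULL);
    /* With a null buffer and a zero size, nothing is written; the call
     * only reports how many characters the output would contain. */
    len = snprintf(NULL, 0, "Name:%s Age:%02d Weight:%03d Height:%03d\n",
                   name, age, m, h);
    return (len < 0) ? 0 : (size_t)len + 1;
}
```

For the name Bob with the values 30, 80, and 180, the record needs 39 bytes; replacing the age with -2000000000 raises that to 48, because the two-column field is silently extended to 11 characters.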

Overflow errors of this type occur according to the rules common for all overflowing buffers; therefore, they will not be considered here.



Shellcoder's Programming Uncovered
ISBN: 193176946X
EAN: 2147483647
Year: 2003
Pages: 164
