White Box Testing

In addition to black box testing for overflows, it is important to do code analysis and review. A number of approaches can be used to review the code for overflows:

  • Manual linear review     In manual linear review, the code is reviewed by class or file. The main advantage of this approach is the ability to track review coverage. The main disadvantages include spending time reviewing code that is never called, or never reached by attacker-supplied data, and some difficulty in validating how callers use the code without extra research.

  • Following the input     In input tracing review, the code is reviewed starting at the entry point of the data (the API that reads the network bytes, the file, the infrared port, or other input mechanism). The code is then reviewed, often in the debugger, to follow the data as it is copied, parsed, and output. The primary advantage of this method is that it tends to give higher code review coverage to more exploitable scenarios. One main disadvantage is that it is hard to track which code has been reviewed in the process.

  • Looking for known dangerous functions     In looking for known dangerous functions, the strategy is to take functions or other code constructs that are known to have caused problems in the past and to audit their use in the application. Although this can be an effective way to identify copies of known common issues, it isn't a thorough approach. For example, looking for the strcpy function might find bugs, but it will probably miss loops and other equivalent code that might still have overflows.

  • Automated code review     In automated code review, the strategy is to employ a tool that can analyze the code and point out overruns. Although a number of these tools exist and some are improving in quality, most have fairly sketchy coverage and produce a rather high incidence of false positives. Of all tools, clearly the compiler is in the best position to do analysis of the source code itself. Microsoft Visual Studio 2005 with proper build flags, for example, gives compiler warnings for a number of functions that have been deprecated in favor of more secure versions.

Note  

There are advantages and disadvantages to reviewing other programmers' code versus reviewing your own. The main advantage in reviewing your own code is that you are most familiar with it, so you don't have to research how it works. Conversely, it is precisely that research, and the fresh perspective of a new reviewer, that helps in spotting cases where the author made the same incorrect assumption while reviewing the code as while writing it. In any case, all critical code should be reviewed by programmers who understand buffer overruns and are familiar with how they look in code.

A number of code analysis utilities are beginning to emerge, but two worth mentioning are LCLint and Prefast:

  • LCLint     LCLint is a static code analysis tool that looks through the code for common cases of buffer overruns. For more information, see http://www.usenix.org/events/sec01/full_papers/larochelle/larochelle_html/index.html .

  • Prefast     Prefast is a static code analysis tool provided as part of Visual Studio 2005. For more information, see http://msdn2.microsoft.com/en-us/library/d3bbz7tz(en-US,VS.80).aspx .

Things to Look For

Programmers write overruns without realizing it, and they are looking at the code while they write it. The question then arises, how does looking at the code help find overruns? It doesn't; the key to finding overruns is to stop looking at the code itself and stop trying to make things work. Start looking at how the code handles the data, and start trying to make things break. Instead of asking, "How does this function work?" and "What does this function do?" you should start asking, "How can this function be broken if an attacker reverse engineers it?" and "What assumptions doesn't this function validate that it should?"

Important  

How can anyone claim to have thoroughly reviewed a body of code, or that it contains no overruns, without first understanding what the code does? When you are reviewing code for overruns and encounter functions or references you aren't familiar with, look up how these unfamiliar elements work rather than assuming they are fine as is.

Although we cannot present an encyclopedic algorithm for reviewing code to identify overruns, we can direct you to a few areas to focus on, which include places where data is copied, allocated, parsed, expanded, and freed.

Data Copying

Any time there is a data copy being performed, ask these questions:

  • How long could the actual input data potentially be?

  • What indicates the size of the data? How reliable is that indication? Are sizes specified in bytes or characters? Is there enough room for a null character at the end of the data?

  • Is there any check to make sure the destination buffer actually was allocated?

  • Are counts of bytes and characters signed or unsigned? Have appropriate checks been done to ensure no integer overflows are possible?

The following code is vulnerable. Can you spot why?

   //Function copies a chunk of ANSI data
   //  and makes sure it is null terminated.
   //Returns true if the operation succeeds.
   //Note: This function contains a security bug.
   bool SecureCopyString(char *pDestBuff, size_t DestBuffSizeBytes,
      const char *pSrcBuff, size_t SourceBuffSizeBytes)
   {
      if ((!pDestBuff) || (!pSrcBuff) ||
          (DestBuffSizeBytes < SourceBuffSizeBytes) ||
          (DestBuffSizeBytes == 0) || (SourceBuffSizeBytes == 0))
      {
         return false;
      }
      memcpy(pDestBuff, pSrcBuff, SourceBuffSizeBytes);
      //Does it need to be null terminated?
      if (*(pDestBuff + SourceBuffSizeBytes - 1) != '\0')
      {
         *(pDestBuff + SourceBuffSizeBytes) = '\0';
      }
      return true;
   }

The null byte is sometimes written one byte past the end of the allocated buffer. When SourceBuffSizeBytes equals DestBuffSizeBytes and the last byte copied is not already null, the terminator is written at pDestBuff + SourceBuffSizeBytes, one byte beyond the destination buffer.
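
One possible fix, sketched below, is to insist on room for the terminator up front and then write it at an index that is always in bounds. This is an illustrative rewrite under those assumptions, not the book's official corrected version.

   //Sketch of one possible fix (illustrative): require room for the terminator
   //and write it at an index that is always inside the destination buffer.
   bool SecureCopyString2(char *pDestBuff, size_t DestBuffSizeBytes,
      const char *pSrcBuff, size_t SourceBuffSizeBytes)
   {
      if ((!pDestBuff) || (!pSrcBuff) ||
          (SourceBuffSizeBytes == 0) ||
          (DestBuffSizeBytes <= SourceBuffSizeBytes))  //strictly greater: reserves space for '\0'
      {
         return false;
      }
      memcpy(pDestBuff, pSrcBuff, SourceBuffSizeBytes);
      pDestBuff[SourceBuffSizeBytes] = '\0';            //index is now always in bounds
      return true;
   }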

More Info  

In general, even overruns that overflow the target buffer by one byte are exploitable. For more information about circumstances when similar bugs are exploitable, see The Frame Pointer Overwrite ( http://phrack.org/phrack/55/P55-08 ).

Duplicate Lengths or Size Data

If there is more than one place where the size of the data is stored, analyze whether the allocation and the copy routines use the correct sizes. Can you spot the problem in the following code? Hint: There is at least one.

   typedef struct structString
   {
      wchar_t *pData;
      size_t ulDataLength;
   } PACKETSTRING;

   typedef struct structField
   {
      size_t FieldSize;
      PACKETSTRING Data;
   } PACKETFIELD, *LPPACKETFIELD;

   LPPACKETFIELD CopyPacketField(const LPPACKETFIELD pSrcField)
   {
      if (!pSrcField) return NULL;
      if (pSrcField->FieldSize <
         (sizeof(PACKETFIELD) + pSrcField->Data.ulDataLength))
      {
          return NULL;
      }
      LPPACKETFIELD fldReturn = (LPPACKETFIELD)malloc(pSrcField->FieldSize);
      if (!fldReturn) return NULL;
      memcpy(fldReturn, pSrcField, sizeof(PACKETFIELD));
      fldReturn->Data.pData = (wchar_t*)(fldReturn + 1);
      wmemcpy(fldReturn->Data.pData, pSrcField->Data.pData,
         pSrcField->Data.ulDataLength);
      return fldReturn;
   }

The allocated memory is based on pSrcField->FieldSize, whereas the actual amount of data copied is pSrcField->Data.ulDataLength wide characters. The size check accidentally fails to multiply pSrcField->Data.ulDataLength by sizeof(wchar_t), so a FieldSize that passes the check can still be too small for the wmemcpy. Can you spot another issue? What happens if pSrcField->Data.ulDataLength + sizeof(PACKETFIELD) overflows? If pSrcField->FieldSize is sufficiently small (less than the overflowed sum), a large amount of data will be copied into a small buffer.
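
A hardened version of the size check might look something like the following sketch; the sizeof(wchar_t) multiplication and an explicit integer overflow guard are the key additions. This is an illustration, not the book's official fix (SIZE_MAX comes from <stdint.h>).

   //Sketch of a safer size check inside CopyPacketField (illustrative).
   //Guard the arithmetic before comparing it against FieldSize so a wrapped
   //sum cannot slip past the test.
   if (pSrcField->Data.ulDataLength >
       (SIZE_MAX - sizeof(PACKETFIELD)) / sizeof(wchar_t))
   {
      return NULL;   //header size plus data size would overflow size_t
   }
   if (pSrcField->FieldSize <
       sizeof(PACKETFIELD) + pSrcField->Data.ulDataLength * sizeof(wchar_t))
   {
      return NULL;   //not enough room for the header plus the wide-character data
   }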

How about this code?

   #define min(a,b)            (((a) < (b)) ? (a) : (b))

   bool CopyBuffer(char *pDestBuff, int DestBuffSize,
      const char *pSrcBuff, int SrcBuffSize)
   {
      if ((!pDestBuff) || (!pSrcBuff)) return false;
      if (DestBuffSize <= 0) return false;
      if (SrcBuffSize < 0) SrcBuffSize = min(-SrcBuffSize, DestBuffSize);
      if (SrcBuffSize > DestBuffSize) return false;
      memcpy(pDestBuff, pSrcBuff, SrcBuffSize);
      return true;
   }

This code has a bug when SrcBuffSize is exactly -2147483648. In a nutshell, -2147483648 looks like 1000 0000 0000 0000 0000 0000 0000 0000 in binary. To negate a signed two's complement value, each bit is inverted (0111 1111 1111 1111 1111 1111 1111 1111, which is positive 2147483647), and then the value is incremented by one, which overflows back into the most significant (sign) bit, yielding the original negative number. So -SrcBuffSize is still -2147483648, and the (SrcBuffSize > DestBuffSize) upper bounds check passes because SrcBuffSize is negative. When memcpy is finally called, this huge negative number is converted to its unsigned equivalent, positive 2147483648, and that is how many bytes the computer tries to copy.
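
A tiny test program, sketched here under the assumption of a 32-bit int, makes the wraparound easy to see:

   #include <stdio.h>
   #include <limits.h>

   int main(void)
   {
      int SrcBuffSize = INT_MIN;       //-2147483648 when int is 32 bits
      //Negating INT_MIN is undefined behavior in C; on typical two's complement
      //hardware it simply wraps back to INT_MIN, which is what the bug relies on.
      int negated = -SrcBuffSize;
      printf("%d %d\n", SrcBuffSize, negated);      //prints -2147483648 -2147483648
      printf("%u\n", (unsigned int)SrcBuffSize);    //prints 2147483648: the byte count memcpy sees
      return 0;
   }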

Parsers

Parsers that accept input from untrusted sources are particularly vulnerable to attack. It really pays to understand how your own parsers work, as well as the parsers your program relies on. It is amazing how often the parser programmer assumes the data is validated or arrives only in a certain format, while the parser's caller assumes the parser is robust against attacker-supplied data. A good general rule of thumb for parsers that are opaque to code analysis is to assume the parser is exploitable until proved otherwise.

In-Place Expansion of Data

One special case of overflows involves expansion of data. Examples include ANSI to Unicode conversion, relative path expansion, and various encoding, decoding, and decompression operations.

ANSI/OEM to and from Unicode     The primary mistakes programmers make when converting from ANSI to UCS-2 (Unicode) include failing to null terminate the destination buffer with a full wide null character (two bytes), and allocating memory with malloc by passing in a character count instead of a byte count.

The main issue to look for in converting from UCS-2 to ANSI is the accidental assumption that the ANSI form will always occupy half as much memory as its UCS-2 counterpart. For UCS-2 characters whose ANSI equivalents come from a double-byte character set (DBCS), both forms use two bytes per character. Malicious UCS-2 input that converts to DBCS can therefore lead to overruns.
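
As an illustration of the defensive pattern, and assuming the Win32 conversion APIs, the required buffer size can be queried from the conversion function itself rather than guessed. The helper below is a sketch, not code from the book:

   //Sketch: let the conversion API report the byte count it needs instead of
   //assuming the ANSI form is half the size of the UCS-2 form.
   #include <windows.h>
   #include <stdlib.h>

   char *WideToAnsiAlloc(const wchar_t *pwszIn)
   {
      if (!pwszIn) return NULL;
      //First call asks how many bytes the ANSI/DBCS form needs, terminator included.
      int cbNeeded = WideCharToMultiByte(CP_ACP, 0, pwszIn, -1, NULL, 0, NULL, NULL);
      if (cbNeeded <= 0) return NULL;
      char *pszOut = (char *)malloc(cbNeeded);       //byte count, not character count
      if (!pszOut) return NULL;
      if (!WideCharToMultiByte(CP_ACP, 0, pwszIn, -1, pszOut, cbNeeded, NULL, NULL))
      {
         free(pszOut);
         return NULL;
      }
      return pszOut;                                 //null terminated by the API
   }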

Note  

When we found our first overflow, we were testing a product that used secured Microsoft Jet databases. We attempted to enter a correct long DBCS password, and the product refused to open the database. At first, we thought this was a regular functionality bug. When the developers investigated, however, they discovered that the bug was an exploitable overrun. The programmer had assumed the conversion from Unicode to ANSI would generate a password half as long, so only half of the memory was allocated. When we tried to enter the DBCS password, the conversion wrote past the end of the allocated space because the DBCS characters in their ANSI form each used two bytes, not one.

More Info  

For more information about encodings, see Chapter 12, Canonicalization Issues , and http://www.microsoft.com/typography/unicode/cs.htm .

Relative Path Expansion     Sometimes paths specified simply as ./foo.exe, the short form c:\progra~1, or tokens such as %temp%\foo.tmp are expanded to their full glory, and there isn't enough space allocated for the result.
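
One defensive pattern, sketched here with the Win32 ExpandEnvironmentStrings API as an assumed example, is to ask the expansion routine for the size it needs before allocating the destination:

   //Sketch: query the expanded length first, rather than assuming the expanded
   //path fits in a buffer sized for the unexpanded input.
   #include <windows.h>
   #include <stdlib.h>

   char *ExpandPathAlloc(const char *pszIn)
   {
      if (!pszIn) return NULL;
      DWORD cchNeeded = ExpandEnvironmentStringsA(pszIn, NULL, 0);  //required size, terminator included
      if (cchNeeded == 0) return NULL;
      char *pszOut = (char *)malloc(cchNeeded);
      if (!pszOut) return NULL;
      if (ExpandEnvironmentStringsA(pszIn, pszOut, cchNeeded) == 0)
      {
         free(pszOut);
         return NULL;
      }
      return pszOut;
   }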

Encoding or Decoding     The URL http://www.contoso.com/#%&#$) might be expanded to http://www.contoso.com/%23%25%26%23%24%29, which increases its length some. Perhaps the logic the programmer used was the following:

  1. Look at the URL specified and determine its length.

  2. Is that length too long? If so, stop.

  3. If not, URL escape the input (this can potentially expand the URL up to three or more times its length).

If the programmer didn't check before step 3 that the buffer used to hold the expanded URL was large enough, the expansion might overflow when it takes place.
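
The safe pattern is to size the output for the worst case before escaping: each input byte can become three output characters (for example, # becomes %23). The helper below is a hypothetical sketch, not code from any product discussed here:

   //Sketch: allocate for the worst-case expansion before escaping.  Each input
   //byte can become three output characters ("%XX"), plus one for the terminator.
   #include <stdio.h>
   #include <stdlib.h>
   #include <string.h>
   #include <ctype.h>
   #include <stdint.h>

   char *UrlEscapeAlloc(const char *pszIn)
   {
      if (!pszIn) return NULL;
      size_t cchIn = strlen(pszIn);
      if (cchIn > (SIZE_MAX - 1) / 3) return NULL;    //guard the multiplication
      char *pszOut = (char *)malloc(cchIn * 3 + 1);
      if (!pszOut) return NULL;
      char *p = pszOut;
      for (size_t i = 0; i < cchIn; i++)
      {
         unsigned char c = (unsigned char)pszIn[i];
         if (isalnum(c) || c == '.' || c == '-' || c == '_' || c == '~')
            *p++ = (char)c;                            //unreserved characters pass through
         else
            p += sprintf(p, "%%%02X", c);              //everything else becomes %XX
      }
      *p = '\0';
      return pszOut;
   }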

Failing to Null Terminate

To many, failure to end a string with the null byte might seem like a trivial bug. In practice, however, it hides very effectively. Consider, for example, that functions such as strncpy and RegQueryValueEx claim to end the returned string with a null, most of the time, but not always. To review code effectively, be on the lookout for cases where the developer makes an incorrect assumption about the functions the program calls.
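
For example, strncpy writes a terminator only when the source string (including its null) fits within the count. The snippet below is a minimal illustration of the trap and one blunt safety net:

   //Minimal illustration: strncpy does not null terminate when the source is at
   //least as long as the count.
   #include <string.h>
   #include <stdio.h>

   int main(void)
   {
      char buf[4];
      strncpy(buf, "ABCDEFG", sizeof(buf));   //copies 'A','B','C','D'; no '\0' is written
      //Reading buf as a string at this point runs past the end of the array.
      buf[sizeof(buf) - 1] = '\0';            //blunt safety net: terminate explicitly
      printf("%s\n", buf);                    //prints "ABC"
      return 0;
   }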

Failing to Reset Freed Pointers

It is generally good practice to reset pointers (for example, to NULL) when you are finished with them. That way, code that later writes through a stale pointer will not corrupt a new allocation that happens to reuse the same heap or stack memory. Failing to reset freed pointers can also lead to memory leaks and double-free bugs.
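
A short sketch of the idiom:

   //Free, then immediately clear the pointer: a later accidental use fails fast
   //(dereferencing NULL) instead of corrupting whatever now occupies that memory,
   //and a second free(p) becomes a harmless no-op.
   #include <stdlib.h>

   void Example(void)
   {
      char *p = (char *)malloc(64);
      if (!p) return;
      //... use p ...
      free(p);
      p = NULL;
   }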

Overflow Exploitability

In the process of investigating buffer overruns and trying to exploit them, a number of specific situations arise that present interesting cases. Although we are not trying to present a complete analysis of the topic of exploitability, some discussion is warranted because it is easy to make the wrong assumption about the exploitability of an overrun. The general rule of thumb is that if you can own one byte (or perhaps even fewer bits in some cases) of critical registers, you can usually, through persistence and cleverness, find a way to exploit the overrun.

Why is it important to determine whether an overflow is serious, or how serious it is likely to be? One of the problems with nearly every automated approach to finding overflows is that it tends to generate many candidate issues, many of which aren't necessarily more serious than ordinary crashes or hangs. You might end up reporting 100 issues when all but 3 are really duplicates of the same issue or a failure on the programmer's part to check for null before dereferencing a pointer. Part of your job is to narrow down the number of issues by interpreting how serious the problems really are so that the important issues are prioritized appropriately.

One thing programmers often say is, "Show me the exploit," and all too often the virus writers have more time on their hands and are all too willing. We have seen overflows in which the programmer thought a particular overflow wasn't exploitable because the buffer could be overflowed by only one byte (into EBP), and that byte would always be null. Eventually, the overflow was shown to be exploitable. If it isn't clear whether a bug is exploitable, it is often easier to fix the issue than it is to determine how exploitable it is.

Note  

Arbitrary null byte overwrites (writing a null byte anywhere in memory) are typically exploitable. Some ways attackers might opt to exploit them include overwriting return addresses, base stack pointers, exception handlers, or vtable entries; changing the values of variables in memory (changing true to false); and truncating strings.

Suppose you have a crash and want to know how exploitable it is. The first thing to look for is whether EIP or EBP were controlled in any manner. If so, the overflow is exploitable. The next step is to look at the code or disassembly to see whether the cause of the exception can be identified. If so, that will often clarify how serious the issue is.

Some crashes/exceptions are not directly exploitable, but sometimes the input that generated the crash can be changed to cause a different code path or different conditions that would wind up being exploitable. Such a case is Pizza, an app that reads an untrusted input file and takes an order for a pizza.

The format of the input file is as follows:

  • 1 byte: crust

  • 1 byte: size

  • 1 byte: size of the topping name, followed by the topping name

A sample file looks something like the one shown in Figure 8-19.

Figure 8-19: A binary editor's view of a sample Meat.Pizza file

Running with the Meat.Pizza file results in the following:

 E:\Chapter8\Code\Pizza>Release\Pizza.exe Meat.Pizza
 Reading pizza file.
 Thick crust.
 Medium.
 Sounds delicious!

After editing the file some and retrying, you might discover a crash with the OverHeated.Pizza input file shown in Figure 8-20.

Figure 8-20: OverHeated.Pizza, which causes Pizza.exe to crash

When you debug the crash, you see this dialog box:


Look at the registers.


Where is the current point of execution?


Is this an overrun? Possibly, because this happens only with long data. At first, you might carelessly think this is not exploitable because you are simply writing a null value someplace in memory.

At this point it isn t clear whether this is exploitable. You could take three approaches to clarify:

  • Try changing the content of the data without changing the length to see if you can control where in memory this value is written. This doesn't seem like a valuable approach because it probably isn't very important if you can write the 0x00 someplace else.

  • Follow the disassembly up to see how ESI got its value.

  • Look at the code and debug the crash.

Satisfy your curiosity on the first point, and plug in the Try1.pizza input file in Figure 8-21. Notice you are using different long input to try to determine how the input influences what occurs when Pizza.exe crashes.

Figure 8-21: Changing the content of the input data

When you run the file in Figure 8-21, the following appears when you debug:


Aha! Last time you used aaaaaaa aaaa and crashed trying to write to 0x9EB19D95; this time you used bbbbbbbb bbbb and crashed trying to write to 0x9DB09C93. Those are different places. Hey, wait a minute! 0x9EB19D95 minus 0x9DB09C93 is 0x01010102, which is very close to how much you changed the data! Apparently, you can control where you write this data.

Now turn your attention to the second point mentioned earlier and follow the disassembly up to see how ESI got its value. ESI points to invalid memory when the program crashes. So where is ESI incorrectly set? When you look for where ESI was changed most recently prior to the crash, you'll see this line of code revealed in the Disassembly window:

 0040104C 2B 33 sub esi,dword ptr [ebx] 

This code takes what EBX points to and subtracts it from ESI. By looking at the disassembly, you can see that EBX doesn t change between this instruction and the crash, so the current value should work for your investigation. Look in the Memory window to see what EBX points to:


Somehow the program tried to do math on what it thought was a number but which was actually part of the input string. It looks like either you are overwriting data in memory that you should not be overwriting, or the file parser was expecting a number instead of the data. However, the file format doesn't have 4-byte lengths, so it is most likely the former case.

What happens if attackers can make the data change so that they can work around this crash? What would they change the data to? Well, adding ESI plus [EBX] would give the value of ESI prior to the sub assembly instruction at 0x0040104C:


Judging from the addresses, this memory looks like it might be on the stack; look farther up in memory from there to see if you can find out what the subtracted value might have been.


Presto! It looks like all 34 letter b characters are present, so you can figure out that the location in the data you need to overwrite is EBX (the address of the value pulled in to subtract) minus 0x0012FEC0 (the start of the data). The data will have to look like bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbXXXXbbbb, where XXXX is the value subtracted from ESI. Now, what value will you subtract? Well, 0x0012FEC0 isn't a bad choice because it is in your data and you wouldn't disturb other data. ESI was 0x0012FEF5 as calculated earlier, so you need to subtract (0x0012FEF5 - 0x0012FEC0), or 0x35.

You can try to modify the input:


When you run with this (Try2.pizza), the debugger presents the following dialog box:


This is clearly exploitable. So what happened?

With stack overflows, the architecture of the overrun is as follows.

   Function foo()
   {
      //Overrun happens here.
      //Other code runs.
      //Overrun is exploited when the function returns or some other key event takes place.
   }

Sometimes the other code that runs after the overrun uses values on the stack that were overwritten by the overrun. If an exception handler is not also overwritten, the application might crash in ways that aren't clearly overruns. Remember, when it isn't clear whether the overrun is exploitable, the main thing to focus on is an analysis of the code.

In general, when long input crashes and short input does not crash, you should consider the scenario likely to be exploitable unless good source code analysis proves otherwise. Good source code analysis usually costs more than fixing the overrun.

Unicode Data

Sometimes in the course of looking for an overflow, you might find one where you control the right CPU registers to exploit it, but your data is Unicode encoded (UCS-2) ( http://www.unicode.org ). If the input is a long string aaaaaaa, instead of 0x61616161 being overwritten you might see 0x61006100 or 0x00610061. Although these cases are still fairly easy to exploit, programmers sometimes mistakenly assume otherwise because every other byte is 0x00.

Despite the fact that you can use fancy means to successfully exploit the data even if every other byte is a zero, sometimes you can inject the payload directly into the UCS-2 data, which generally does not require every other byte to be a zero. This works because often Unicode and ASCII data both are stored in the file, or either is accepted. If the program notices the data is Unicode, it does not convert the data. Say, for example, you had data that looked as shown in Figure 8-22 when saved in a file.

Figure 8-22: Example of Unicode data

Why not simply replace the UCS-2 data with the exploit string? No rules suggest that the zeros must be preserved. In this case, UCS-2 is a dream for attackers because single null bytes don't end the string; only the two-byte 0x0000 does. As shown in Figure 8-23, notice that Unicode data does not necessarily have to contain null bytes.

Figure 8-23: Example of exploited Unicode
More Info  

For more information, refer to Creating Arbitrary Shellcode in Unicode Expanded Strings ( http://www.nextgenss.com/papers/unicodebo.pdf ).

Filtered Data

Sometimes when you discover an overflow, the argument might well be, "We don't need to fix that bug because only a handful of characters will ever make it through that network protocol to the weak application." Perhaps. But in many cases, there is an associated encoding mechanism for representing arbitrary data in that subset of characters.

UCS Transformation Format 8 (UTF-8) and other encodings provide another way to encode the exploit so that it contains no null bytes. Often, the data attackers supply can itself tell the parser how that data is formatted. This is covered in greater detail in Chapter 12. Suppose you are testing a popular antivirus product and discover an overrun in how it processes the Content-Disposition e-mail header. If you want to send null data as part of the exploit, you might be able to do just that because Multipurpose Internet Mail Extensions (MIME) allows you to encode the data any way you please as follows:

  =?encoding?q?data?=  

where you can hex-escape unprintable characters by using leading equal signs (=). For example, a space, hex 0x20, would be represented as =20. If you like, you could then encode the entire exploit in UTF-8 by properly escaping all of the characters to work around problems in getting the right bits to the vulnerable program.
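
A small helper like the following sketch (a hypothetical illustration, not the product's code) shows the transformation; every byte outside the printable-safe set becomes an =XX escape:

   //Sketch: MIME "Q" encoding of raw bytes, so arbitrary values (including 0x00)
   //survive transport inside an encoded-word such as =?utf-8?q?...?=.
   #include <stdio.h>

   void QEncode(const unsigned char *pData, size_t cbData, FILE *out)
   {
      for (size_t i = 0; i < cbData; i++)
      {
         unsigned char c = pData[i];
         if (c == ' ')
            fputc('_', out);                 //space is conventionally written as underscore (or =20)
         else if (c >= 33 && c <= 126 && c != '=' && c != '?' && c != '_')
            fputc(c, out);                   //printable characters pass through
         else
            fprintf(out, "=%02X", c);        //everything else is hex escaped, e.g. "=20"
      }
   }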

Other encodings could be interesting, such as base-64 and uuencoding, depending on what the target program supports.


