Binary Auditing Introduction

In order to audit binaries successfully, you must understand compiler-generated code correctly and accurately. There are many compiler code constructs whose purposes aren't intuitive or immediately obvious. This chapter will attempt to introduce binary auditors to most of the standard code constructs as well as non-standard code constructs that are often seen in code, with the hope of making compiled code almost as easy to understand as source code.

Stack Frames

Understanding the stack frame layout of any given function will make understanding the code much easier, and in some cases determining whether a stack-based overflow exists will be much easier as well. Although there are some common stack frame layouts on x86, nothing is standardized, and the layout is compiler-determined. The more common examples will be covered here.

Traditional BP-Based Stack Frames

The most common stack frame layout for functions is the traditional BP-based frame, where the frame pointer register, EBP, holds a constant pointer to the saved frame pointer of the previous stack frame. The frame pointer is also the fixed location relative to which function arguments and local stack variables are accessed.

The prologue for a function using this traditional stack frame looks like the following in Intel notation.

    push ebp              // save the old frame pointer to the stack
    mov ebp, esp          // set the new frame pointer to esp
    sub esp, 5ch          // reserve space for local variables

At this point, local stack variables are located at a negative offset to EBP, and function arguments are located at a positive offset. Because the saved frame pointer occupies [EBP] and the return address occupies [EBP+4], the first function argument is found at EBP+8. IDA Pro will rename the location EBP+8 to EBP+arg_0.

Nearly all references to arguments and local stack variables will be made relative to the frame pointer in functions with this frame type. This stack layout has been very well documented and is the easiest to follow when auditing. Most code generated by MSVC++ and by gcc will make use of this stack frame.
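As a rough illustration of the layout described above, the memory that EBP points at can be modeled as a C structure. This is only a sketch: the structure and function names are hypothetical, and it assumes a 32-bit target in which every argument occupies one 4-byte stack slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of what EBP points at in a traditional 32-bit frame.
   The field names are illustrative; they are not IDA Pro or compiler names. */
struct bp_frame {
    uint32_t saved_ebp;       /* [ebp+0]: caller's frame pointer (push ebp) */
    uint32_t return_address;  /* [ebp+4]: pushed by the call instruction    */
    uint32_t args[];          /* [ebp+8] onward: first argument, second ... */
};

/* EBP-relative offset of the n-th 4-byte argument (n = 0 is the first). */
int bp_arg_offset(int n)
{
    assert(n >= 0);
    return (int)offsetof(struct bp_frame, args) + 4 * n;
}
```

Under these assumptions, bp_arg_offset(0) evaluates to 8, which is the location IDA Pro labels arg_0, and each further argument sits 4 bytes higher.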

Functions without a Frame Pointer

For the sake of optimization, many compilers will generate code that omits the frame pointer, and some will even use the frame pointer register as a general-purpose register. In this case, a function accesses its arguments and local variables relative to the stack pointer, ESP, instead of the frame pointer. Whereas the frame pointer in a traditional stack frame is constant, the stack pointer moves throughout the function, changing every time an operation pushes something onto or pops something off the stack, so the ESP-relative offset of a given variable changes with it. The following example illustrates this.

    this_function:
        push esi
        push edi
        push ebx

        push dword ptr [esp+10h]      // first argument to this_function
        push dword ptr [esp+18h]      // second argument to this_function
        call some_function

When the function is first entered, the first argument is at ESP+4. After three registers are saved (12 bytes), that first argument is at ESP+10h. After the first function argument is pushed as a parameter to some_function, the second function argument, which had been at ESP+14h, is located at ESP+18h.
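The bookkeeping in this example can be written down as a small helper. This is a minimal sketch with a hypothetical function name, assuming 4-byte arguments and that the caller of the helper knows how many bytes have been pushed since function entry.

```c
#include <assert.h>

/* ESP-relative offset of a 4-byte stack argument in a frameless 32-bit
   function. arg_index 0 is the first argument; bytes_pushed is how far ESP
   has moved down since function entry. Hypothetical helper for illustration. */
int esp_arg_offset(int arg_index, int bytes_pushed)
{
    assert(arg_index >= 0 && bytes_pushed >= 0);
    /* The +4 skips the return address that sits at [esp] on entry. */
    return 4 + 4 * arg_index + bytes_pushed;
}
```

With three saved registers, esp_arg_offset(0, 12) gives 10h, and after one more push esp_arg_offset(1, 16) gives 18h, matching the two operands in the listing. Doing this calculation by hand is what is required when a disassembler loses track of the stack pointer.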

IDA Pro makes an attempt to determine the location of the stack pointer at any given place in a function. By doing this, it tries to identify what stack pointer-relative data accesses really refer to. However, when it does not know the calling conventions used by external functions, IDA Pro may get this wrong and create a very confusing disassembly. Sometimes, it may be necessary to manually calculate the location of the stack pointer at a certain point in a function in order to determine the size of stack buffers. Thankfully, this confusion does not happen too often.

Non-Traditional BP-Based Stack Frames

Microsoft Visual Studio .NET 2003 occasionally creates code with a stack frame that uses a constant frame pointer, although not in the traditional sense. The frame pointer is constant, and all accesses to arguments and local stack variables are relative to it, but it does not point to the calling function's saved frame pointer. Instead, it points to a location at a negative offset from where the traditional frame pointer would be. A sample function prologue might look like the following.

    push ebp
    lea ebp, [esp-5ch]
    sub esp, 98h

The first function argument would be located at EBP+64h, instead of the traditional location of EBP+8. The memory range from EBP-3ch to EBP+5ch would be occupied by local stack variables.
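The two offsets quoted above follow directly from the constants in the prologue. The sketch below, using hypothetical helper names, works through the arithmetic for a prologue of the form push ebp / lea ebp, [esp-disp] / sub esp, size, again assuming 4-byte stack slots.

```c
#include <assert.h>

/* After "push ebp", ESP points at the saved EBP. "lea ebp, [esp-disp]" places
   EBP disp bytes below it, so the saved EBP is at [ebp+disp], the return
   address at [ebp+disp+4], and the first argument at [ebp+disp+8]. */
int shifted_first_arg_offset(int disp)
{
    assert(disp >= 0);
    return disp + 8;
}

/* "sub esp, size" moves ESP to ebp + disp - size, which is the lowest
   address available to local variables. */
int shifted_lowest_local_offset(int disp, int size)
{
    assert(disp >= 0 && size >= 0);
    return disp - size;
}
```

For the sample prologue, shifted_first_arg_offset(0x5c) is 64h and shifted_lowest_local_offset(0x5c, 0x98) is -3ch, which gives the EBP-3ch to EBP+5ch range for local variables.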

The Windows Server 2003 operating system was compiled with code that contains this non-traditional BP-based frame, and it can be found throughout system libraries and services. At the time of writing, IDA Pro does not recognize this code construct and will completely misinterpret the local stack frame for functions with this type. Hopefully support for this compiler quirk will be added in the near future.

Calling Conventions

Different functions in an application may use different calling conventions, especially if parts of the application were written in different languages. It is useful to understand the different calling conventions seen in C-based languages. In general, only two calling conventions will be commonly seen in C or C++ code generated by MSVC++ or gcc.

The C Calling Convention

The C calling convention does not only refer to C code, but is a way of passing arguments and restoring the program stack. With this calling convention, function arguments are pushed onto the stack from right to left as they appear in the source code. In other words, the last argument is pushed first and the first argument is the last one pushed prior to the function call. It is up to the calling function to restore its stack pointer after the call returns. An example of the C calling convention is:

 some_function(some_pointer,some_integer);

This function call would look something like the following when using the C calling convention.

    push some_integer
    push some_pointer
    call some_function
    add esp, 8

Note that the second function argument is pushed before the first and that the stack pointer is restored by the calling function. Since this function had two arguments, the stack pointer had to be incremented by 8 bytes. It is also common to see the stack restored by using the x86 instruction POP with the destination of a scratch register. In this example, it would have been possible to restore the stack by doing POP ECX twice, restoring 4 bytes each time.

The Stdcall Calling Convention

The other calling convention commonly seen in C and C++ code is Stdcall. Arguments are passed in the same order as in the C calling convention, with the first function argument being pushed to the stack last before the function call. However, it is generally up to the called function to restore the stack. This is usually done on x86 by using the return instruction that releases stack space. For example, a function that has three arguments and uses the Stdcall calling convention would return with RET 0Ch, releasing 12 bytes from the stack upon return.
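Both conventions release the same number of bytes; they differ only in which side does it. As a small worked example, with a hypothetical helper name and assuming every argument occupies one 4-byte stack slot:

```c
#include <assert.h>

/* Bytes of stack occupied by nargs 4-byte arguments on 32-bit x86. This is
   the operand of "add esp, N" after a C-convention call, or of "ret N" at
   the end of a Stdcall function. Hypothetical helper for illustration. */
int stack_arg_bytes(int nargs)
{
    assert(nargs >= 0);
    return 4 * nargs;
}
```

Two arguments give 8, as in the add esp, 8 seen with the C calling convention, and three arguments give 0Ch, as in RET 0Ch. When auditing, the operand of the return instruction is therefore a quick hint at how many arguments a Stdcall function takes.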

Stdcall is generally more efficient because the calling function does not have to release stack space. However, functions that accept a variable number of arguments, such as printf-like functions, cannot release the stack space taken up by their arguments and therefore cannot use Stdcall. This must be done by the calling function, which knows how many arguments were passed.
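A short variadic function makes the point concrete. In the sketch below (sum_ints is a hypothetical example, not a library function), the callee has no independent knowledge of how many arguments were pushed; it relies entirely on a count supplied by the caller, so only the caller can know how much stack to release.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical variadic function. It learns the number of arguments only
   from the count parameter, so a fixed "ret N" could not be compiled into
   it; the caller has to clean up the stack after each call. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    assert(count >= 0);
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) pushes four values, while sum_ints(0) pushes one, yet both reach the same function body. On 32-bit x86 each call site would be followed by its own stack adjustment.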

Compiler-Generated Code

Compilers can generate much code that can be confusing at first glance. Let's look at some of the common areas in which a compiler will add instructions and at ways that we can recognize these compiler-generated structures.

Function Layouts

The layout of compiler-generated code in a function is somewhat variable. A function will generally begin with a function prologue and end with a function epilogue and a return. However, a function does not necessarily have to end in a return, and it is fairly common to see a function with code after its return instruction. This code will eventually jump back to the return instruction. Although a function may return in many places in the source code, the compiler will often optimize the function so that all of these paths jump to one common return location.

Since Visual Studio 6, the MSVC++ compiler has generated code with very unconventional function layouts. The compiler uses some logic to determine which branches are likely to be taken and which are less likely. Those deemed less likely are taken out of line of the main function and are placed as code fragments at far-off memory locations. These code snippets are often code that deals with uncommon error conditions or unlikely scenarios. However, vulnerabilities often exist in these code fragments, and they should be reviewed when auditing binaries. These code fragments are often indicated by red jump arrows in IDA Pro and have been a common part of MSVC++ compiled code for many years. IDA Pro does not deal properly with these code fragments and will not note accesses to local stack variables within them or graph them correctly.

In highly optimized code, several functions may share code fragments. For example, if several functions return in the same manner and restore the same registers and stack space, it is technically possible for them to share the same function epilogue and return code. However, this is quite uncommon and has only really been seen within NTDLL on Windows NT operating systems.

If Statements

if statements are one of the most common C code constructs and sometimes are very easy to see and interpret in compiled code. They are most often represented by the CMP or TEST instructions, followed by a conditional jump. The following example shows a simple C if statement and its corresponding assembly representation.

C Code:

    int some_int;

    if (some_int != 32)
        some_int = 32;

Compiled Representation (ebp-4 = some_int):

    mov eax, [ebp-4]
    cmp eax, 32
    jz past_next_instruction
    mov dword ptr [ebp-4], 32
    past_next_instruction:

if statements are generally characterized by forward jumps or branches; however, this is not always the case, and the compiler's reorganization of code can wreak havoc with this assumption. In some contexts, it will be very obvious that a conditional branch was an if statement, but in other contexts if statements are difficult to differentiate from other code constructs such as loops. A better understanding of the overall structure of a function should make it clear where if statements are found.
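One detail that often trips up new auditors is that the conditional jump normally tests the opposite of the source-level condition, because the jump skips the body rather than entering it. The following sketch shows the same if statement twice: once as written, and once restructured with goto to mirror the forward jump. The function names are hypothetical.

```c
#include <assert.h>

/* Source-level form of the if statement. */
int set_to_32(int some_int)
{
    if (some_int != 32)
        some_int = 32;
    return some_int;
}

/* The same logic arranged the way compiled code is usually laid out: the
   branch tests the inverse condition and jumps forward past the body. */
int set_to_32_lowered(int some_int)
{
    if (some_int == 32)                 /* cmp eax, 32 / jz ...   */
        goto past_next_instruction;
    some_int = 32;                      /* body of the if         */
past_next_instruction:
    assert(some_int == 32);             /* both paths arrive here */
    return some_int;
}
```

When reading a disassembly, mentally inverting the jump condition recovers the original test: a jz that skips a block corresponds to an if whose condition was "not equal".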

For and While Loops

Loop constructs within an application are a very common place to find vulnerabilities. Recognizing them within binaries is often a key part of auditing. While it's not really possible to absolutely distinguish different types of loops from one another in compiled code, recognizing them functionally within binaries is usually pretty simple. They are generally characterized by a backwards branch or jump that leads to a repeated section of code. The following example illustrates a simple while loop and its compiled representation.

C Code:

    char *ptr, *output, *outputend;

    while (*ptr) {
        *output++ = *ptr++;
        if (output >= outputend)
            break;
    }

Compiled Representation (ecx = ptr, edx = output, ebp+8 = outputend):

    loop_begin:
        mov al, [ecx]
        test al, al
        jz loop_end

        mov [edx], al
        inc ecx
        inc edx

        cmp edx, [ebp+8]
        jae loop_end
        jmp loop_begin
    loop_end:

This code could just as easily have been generated from a simple for loop, which makes it difficult to determine what kind of statement was in the original source code. However, the code's functionality is more important than its original state as source code, and loops like the one shown here are the source of many errors in closed source applications.
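To make the correspondence between the loop and its backward jump explicit, the sketch below rewrites the while loop with goto statements that follow the structure of compiled code, and then gives a for loop that behaves identically. The function names are hypothetical, and both functions return the final value of output so that the two can be compared.

```c
#include <assert.h>

/* goto-structured copy that mirrors the shape of the compiled loop. */
char *copy_lowered(const char *ptr, char *output, char *outputend)
{
    assert(ptr && output && outputend);
loop_begin:
    if (*ptr == '\0')              /* mov al, [ecx] / test al, al / jz  */
        goto loop_end;
    *output++ = *ptr++;            /* mov [edx], al / inc ecx / inc edx */
    if (output >= outputend)       /* cmp edx, [ebp+8] / jae loop_end   */
        goto loop_end;
    goto loop_begin;               /* the backward jump marks the loop  */
loop_end:
    return output;
}

/* A for loop with the same behavior; it could compile to the same code. */
char *copy_for(const char *ptr, char *output, char *outputend)
{
    assert(ptr && output && outputend);
    for (; *ptr != '\0'; ) {
        *output++ = *ptr++;
        if (output >= outputend)
            break;
    }
    return output;
}
```

Note that in both versions the bounds check happens after the byte is written, so a call in which output already equals outputend still writes one byte. Details of this kind are exactly what an auditor looks for in compiled loops.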

Switch Statements

switch statements are generally rather complex constructs in assembly code and can sometimes lead to compiled code that looks a little bit strange. Depending on the compiler and on the actual switch statement, the constructed code might vary quite a lot in structure.

A switch statement can be inefficiently broken down into several if statements, and some compilers will do this in certain situations. The statements themselves may be simpler to understand, and an auditor reading the code may never suspect that the code in question was ever anything but a group of sequential if statements.

If the switch cases are sequential, the compiler will often generate a jump table and index it with the switch case. This is a very efficient way to deal with switches with sequential cases, but is not always possible. An example might look like the following.

C Code:

    int some_int, other_int;

    switch (some_int) {
        case 0:
            other_int = 0;
            break;
        case 1:
            other_int = 10;
            break;
        case 2:
            other_int = 30;
            break;
        default:
            other_int = 50;
            break;
    }

Compiled Representation (some_int = eax, other_int = ebx):

        cmp eax, 2
        ja default_case

        jmp switch_jmp_table[eax*4]
    case_0:
        xor ebx, ebx
        jmp end_switch
    case_1:
        mov ebx, 10
        jmp end_switch
    case_2:
        mov ebx, 30
        jmp end_switch
    default_case:
        mov ebx, 50
    end_switch:

At a read-only location in memory, the data table switch_jmp_table would be found containing the offsets of case_0, case_1, and case_2 sequentially.

IDA Pro does a very good job of detecting switch statements constructed as above, and would indicate very accurately to the user which cases would be triggered by which values.
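The jump-table construct can be approximated in C with a table of function pointers. This is only a sketch: standard C cannot jump to a computed address, so the table below holds hypothetical handler functions that are called rather than jumped to, but the bounds check and the indexing follow the same pattern.

```c
#include <assert.h>

static int case_0(void) { return 0; }
static int case_1(void) { return 10; }
static int case_2(void) { return 30; }

/* Stand-in for switch_jmp_table: one entry per sequential case value. */
static int (*const switch_jmp_table[3])(void) = { case_0, case_1, case_2 };

/* Returns the value that other_int would receive. */
int switch_lowered(int some_int)
{
    /* cmp eax, 2 / ja default_case: "ja" is an unsigned comparison, so
       negative values look like very large numbers and take the default. */
    if ((unsigned int)some_int > 2u)
        return 50;

    /* The index is now known to lie inside the table. */
    assert(some_int >= 0 && some_int <= 2);
    return switch_jmp_table[some_int]();
}
```

The single unsigned comparison is worth noticing when auditing. If a compiled jump table is guarded by a signed comparison, or by no comparison at all, an attacker-controlled index may select an address outside the table.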

In the case where switch case values are not sequentially ordered, they cannot be easily or efficiently used as an index into a jump table. At this point, compilers often use a construct in which the switch value is repeatedly decremented or subtracted from, and a result of zero after a given step indicates that the original value matched the corresponding case. This allows the switch statement to deal efficiently with case values that are numerically distant. For example, if a switch statement was meant to handle the case values 3, 4, 7, and 24, it might do so in the following manner (EAX = case value).

    sub eax, 3
    jz case_three
    dec eax
    jz case_four
    sub eax, 3
    jz case_seven
    sub eax, 17
    jz case_twenty_four
    jmp default

This code would deal correctly with all the possible switch cases, as well as with default values, and is commonly seen in code generated by modern MSVC++ compilers.
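The subtraction chain can be checked by transcribing it into C and comparing it against an ordinary switch statement. In this sketch the function names are hypothetical, each function returns the matched case value, and -1 stands for the default case.

```c
#include <stdint.h>

/* Transcription of the subtract-and-test chain. Unsigned arithmetic wraps
   around in the same way the 32-bit register does. */
int switch_chain(uint32_t eax)
{
    eax -= 3;                   /* sub eax, 3            */
    if (eax == 0) return 3;     /* jz case_three         */
    eax -= 1;                   /* dec eax               */
    if (eax == 0) return 4;     /* jz case_four          */
    eax -= 3;                   /* sub eax, 3            */
    if (eax == 0) return 7;     /* jz case_seven         */
    eax -= 17;                  /* sub eax, 17           */
    if (eax == 0) return 24;    /* jz case_twenty_four   */
    return -1;                  /* jmp default           */
}

/* The switch statement the chain was presumably generated from. */
int switch_reference(uint32_t value)
{
    switch (value) {
        case 3:  return 3;
        case 4:  return 4;
        case 7:  return 7;
        case 24: return 24;
        default: return -1;
    }
}
```

To recover the case values from a disassembly, add up the subtracted constants along the chain: 3, then 3 + 1 = 4, then 4 + 3 = 7, then 7 + 17 = 24.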

memcpy-Like Code Constructs

Many compilers will optimize the memcpy library function into some simple assembly instructions that are much more efficient than a function call. This type of memory copy operation can potentially be the source of buffer overflow vulnerabilities and can easily be recognized within a disassembly. The set of instructions used is the following:

    mov esi, source_address
    mov ebx, ecx
    shr ecx, 2               // length divided by four
    mov edi, eax             // destination address
    repe movsd               // copy four byte blocks
    mov ecx, ebx
    and ecx, 3               // remainder size
    repe movsb               // copy it

In this case, the data is copied from the address in the source register ESI to the address in the destination register EDI. The data is copied in 4-byte blocks initially for the sake of speed by the instruction repe movsd. This copies ECX number of 4-byte blocks from ESI to EDI, which is why the length in ECX is divided by 4. The repe movsb instruction copies the remaining 0 to 3 bytes. The repe prefix shares its encoding with rep, so many disassemblers display these instructions as rep movsd and rep movsb.
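The two-phase copy can be sketched in C as follows. The function name is hypothetical and the variables are named after the registers they stand in for; the inner loop copies four bytes at a time to imitate one movsd step without making alignment assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Copies len bytes as (len / 4) four-byte blocks followed by (len % 4)
   single bytes, imitating the inlined memcpy sequence. */
void *copy_dwords_then_bytes(void *dst, const void *src, size_t len)
{
    unsigned char *edi = dst;
    const unsigned char *esi = src;
    size_t ecx;

    assert(dst != NULL && src != NULL);

    for (ecx = len >> 2; ecx != 0; ecx--) {   /* shr ecx, 2 / movsd loop */
        for (int i = 0; i < 4; i++)           /* one movsd: four bytes   */
            edi[i] = esi[i];
        esi += 4;
        edi += 4;
    }
    for (ecx = len & 3; ecx != 0; ecx--)      /* and ecx, 3 / movsb loop */
        *edi++ = *esi++;

    return dst;
}
```

Nothing in the sequence checks the size of the destination. When auditing an inlined copy, the question is where the length in ECX came from and whether it was validated against the destination buffer before the shr instruction.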

memset is often inlined in exactly the same manner using the repe stosd instruction, with the EAX register holding the fill character replicated into each of its four bytes. memmove is not optimized in this manner due to the possibility of overlapping data regions.

strlen-Like Code Constructs

Like memcpy, the strlen library function is often optimized into some simple x86 assembly instructions by certain compilers. Once again, this saves the overhead introduced by a function call. For those not familiar with compiler-generated code, the strlen code construction may seem strange at first. It generally looks much like the following:

    mov edi, string
    or ecx, 0xffffffff
    xor eax, eax
    repne scasb
    not ecx
    dec ecx

The result of these instructions is that the length of the string is stored in the ECX register. The repne scasb instruction scans from EDI for the character stored in the low byte of EAX, which is zero in this case. For each character this operation examines, it decrements ECX and increments EDI.

At the end of the repne scasb operation, when a null byte is found, EDI is pointing one character past the null byte and ECX is the negative string length minus two, because it started at -1 and was decremented once for every byte examined, including the null byte. A logical NOT of ECX, followed by a decrement, results in the correct string length in ECX. It is common to see a sub edi, ecx immediately following the not ecx instruction, which resets EDI back to its original position.

This code construct will be widely used in any code that handles string data; therefore, you should recognize it and understand how it works.

C++ Code Constructs

Most modern closed source code in operating systems and servers is written in C++. In many respects, this code is constructed in ways very similar to plain C code. The calling conventions are very close, and for compilers that support both C and C++, the same assembly code generation engines are used. However, in certain ways, auditing C++ code is different, and some special cases must be mentioned. In general, auditing binaries composed of C++ code is a little more difficult than that written in plain C; however, with some familiarization it's not that much of a leap.

The this Pointer

The this pointer refers to a specific instance of the class to which the current function (method) belongs. this must be passed to a function by its caller; however, it is not passed as a normal function argument might be. Instead, in code generated by MSVC++, the this pointer is passed in the ECX register. This calling convention used in C++ code is called thiscall. The following code shows an example of a function passing a class pointer to another function.

    push    edi
    push    esi
    push    [ebp+arg_0]
    lea     ecx, [ebx+5Ch]
    call    ?ParseInput@HTTP_HEADERS@@QAEHPBDKPAK@Z

As you can see, a pointer is stored in the ECX register immediately before calling a function. In this case, the value stored in ECX is a pointer to an HTTP_HEADERS object. Since the ECX register is quite volatile, the this pointer is often kept in another register after a function call, but it is almost always passed in the ECX register.
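Since C has no methods, a rough C analogue of this call spells out the object pointer as an explicit first parameter. Everything in the sketch below is hypothetical: the structure layouts, the function names, the parameter types, and the placeholder body are assumptions made for illustration and are not taken from the real HTTP_HEADERS class. The only detail carried over from the listing is that the object lives at offset 5Ch inside a larger structure.

```c
#include <assert.h>
#include <stddef.h>

struct http_headers {                 /* hypothetical stand-in class      */
    int header_count;
};

struct http_request {                 /* hypothetical object held in EBX  */
    unsigned char other_fields[0x5c];
    struct http_headers headers;      /* at offset 5Ch, as in the lea     */
};

/* The explicit self parameter plays the role of the value passed in ECX. */
int http_headers_parse_input(struct http_headers *self, const char *input,
                             unsigned long length, unsigned long *consumed)
{
    assert(self != NULL && input != NULL && consumed != NULL);
    self->header_count++;             /* placeholder body only            */
    *consumed = length;
    return 1;
}

int handle_request(struct http_request *request, const char *input,
                   unsigned long length, unsigned long *consumed)
{
    /* &request->headers is what "lea ecx, [ebx+5Ch]" would compute. */
    return http_headers_parse_input(&request->headers, input, length,
                                    consumed);
}
```

Seen this way, a lea into ECX just before a call is a strong hint that the target is a method and that the offset identifies a member object embedded in a larger structure.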