Goals and Possibilities of Attacks | Shellcoders Programming Uncovered (Uncovered series)

The final goal of any attack is to make the target system carry out some illegal or malicious operations. In other words, these must be operations that cannot be carried out legally. There are at least four methods of implementing an attack:

Reading confidential variables
Modifying confidential variables
Passing control to some secret function of a program
Passing control to the code passed to the victim by the intruder

Reading Confidential Variables

Passwords for accessing confidential information and/or logging into the system are the first candidates for this role. All such passwords are, some way or another, present in the address space of a vulnerable process. Most frequently, they are located at fixed addresses. ("System login" here is interpreted as the combination of user name and password, which provides the possibility of remotely passing control to a vulnerable application.)

In addition, the address space of the vulnerable process contains handles to secret files, sockets, identifiers of TCP/IP connections, and much more. Naturally, outside the current context they have no practical meaning. However, they can be used by the code passed by the intruder to the target system. For example, the intruder can establish an "invisible" TCP/IP connection concealed within the existing one.

Strictly speaking, memory cells storing pointers to other cells are not "secret" ones. However, knowing their contents considerably simplifies the attack; otherwise, the attacker must determine the reference addresses manually. For example, assume that the vulnerable program contains the following code: char *p = malloc(MAX_BUF_SIZE). Here, p is the pointer to the buffer that contains the secret password. Also assume that there is an overflow error in the program, which allows the intruder to read the contents of any cell of the memory space. The entire problem consists of finding that buffer. Scanning the entire heap takes too long and, furthermore, is potentially dangerous, because there exists the possibility of encountering an unallocated memory page, in which case the process would terminate abnormally. Automatic and static variables are more predictable in this respect. Therefore, the attacker must first read the contents of the p pointer and then the password to which it points. Naturally, this is only an example, and the possibilities of read overflow are not limited to it.

Read overflow as such can be implemented by at least the following four mechanisms: "loss" of the terminating zero in string buffers, pointer modification (see the "Pointers and Indexes" section later in this chapter), index overflow (see the same section), and extra specifiers offered to printf and other functions intended for formatted output.

Modifying Secret Variables

Variable modification offers the most possibilities for attack, including the following:

Offering fictitious passwords, file descriptors, TCP/IP identifiers, etc., to the vulnerable program
Modifying variables that control program branching
Manipulating indexes and pointers by passing control using arbitrary addresses (including addresses that contain the code specially prepared by the intruder)

Most frequently, modification of secret variables is implemented using sequential buffer overflow, which usually causes a cascade of side effects. For example, if after the end of the overflowing buffer there is the pointer to some variable, into which something is written after overflow, then the intruder will be able to overwrite any memory cell (except for the cells that are explicitly protected against modification, such as the code section or .rodata section).

Passing Control to Some Secret Program Function

Modification of pointers to the executable code provides the possibility of passing control to any function of a vulnerable program (there are certain problems with passing the arguments, however). Practically any program contains some functions that are available only to root and that provide some capabilities of control over the target system (such as creating a new user account, starting a remote control session, or launching files for execution). In some sophisticated cases, control is passed to the middle of a function (or even to the middle of a machine instruction) to make the processor execute the instructions of the intruder even if the program developer didn't make a provision for anything of the sort.

Passing control is ensured either by changing the program execution logic or by replacing the pointers to the code. Both rely on modification of the program memory cells, which was considered earlier.

Passing Control to the Intruder's Code

This is a variant of the mechanism based on passing control to some secret function of the program. However, this time the role of the secret function is played by code prepared by the intruder and passed to the remote computer in some way. To achieve this goal, it is possible to use the overflowing buffer or any other buffer available to the intruder for direct modification and present in the address space of the vulnerable application (in this case, it must be located at more or less predictable addresses; otherwise, it will be impossible to know where to pass control).

Targets of Overflow Attacks

Overflow can overwrite memory cells of the following types: pointers, scalar variables, and buffers. Objects of the C++ language include both pointers (which point to the table of virtual functions if they are present within the object) and scalar data (if they are present). They do not represent a standalone entity and fit well within the framework of the previously provided classification (more detailed information can be found in Hacker Debugging Uncovered by Kris Kaspersky).

Pointers and Indexes

In classic Pascal and other "regular" languages, there are no pointers. In C/C++, however, pointers are omnipresent. Most frequently, it is necessary to deal with pointers to data; pointers to executable code (such as pointers to virtual functions and pointers to functions loaded from DLLs) are encountered more rarely. Contemporary Pascal (previously associated with the Turbo Pascal compiler and now also associated with Delphi) also cannot be imagined without pointers. Even though pointers are not explicitly supported, all dynamic data structures used within the language framework (including the heap and sparse arrays) entirely rely on pointers.

Pointers are convenient. They make programming easy, illustrative, efficient, and natural. At the same time, pointers are potentially dangerous in all respects. They might cause devastating consequences if handled with malicious purpose by a worm or a hacker. They might be considered deadly weapons. Going slightly ahead, it is necessary to mention that pointers of both types are potentially capable of passing control to unauthorized machine code.

Start with pointers to the executable code. Consider a situation in which the overflowing buffer, buff, is followed by a pointer to a function, which is initialized before the buffer overflow and called after it (it might be called not immediately but after a certain time interval). In this case, you'll see an analogue of the call function or, in other words, an instrument of passing control to any (or practically any) machine address, including the overflowing buffer (in which case control will be passed to the code prepared by the intruder).

Listing 4.3: Vulnerability to buffer overflow that overwrites the pointer to executable code

    code_ptr() {
        char buff[8]; void (*some_func)();
        ...
        printf("passwd:"); gets(buff);
        ...
        some_func();
    }

More detailed information about choosing target addresses will be provided later. For the moment, however, it is time to concentrate on searching for the pointers that are going to be overwritten. The return address of the function located at the bottom of the stack frame comes to mind first. It is necessary to mention, however, that to access this return address you'll need to cross the entire stack frame. No one can guarantee that you'll manage to succeed when carrying out this operation; in addition, lots of protection systems control its integrity (see Chapter 9).

Pointers to objects also represent a popular target of attack. C++ programs usually contain lots of objects, most of which are created by calling the new operator, which returns the pointer to the newly created object. Nonvirtual member functions of the class are called in the same way as normal C functions (in other words, by their actual offset); therefore, they are not vulnerable to attack. Virtual member functions are called in a more complicated way, through the following chain of operations: pointer to the object instance → pointer to the table of virtual functions → pointer to the required virtual function. Pointers to the table of virtual functions are not shared by the class; one is embedded into each object instance, which is stored most frequently in memory and more rarely in register variables. Pointers to objects are placed either in memory or in registers, and many pointers can point to the same object. Among these objects, there might be ones that are located directly after the overflowing buffer.

A table of virtual functions (henceforth, simply a virtual table) doesn't belong to the object instance. On the contrary, it belongs to the class. In other words, there is one virtual table for every class. This is a simplified explanation, because in reality a virtual table is placed into every OBJ file where members of this class are accessed (this is the effect of separate compiling). And, although linkers in most cases successfully eliminate redundant virtual tables, they are duplicated from time to time (these are minor details, however). Depending on the chosen development environment and on the programmer's professional skills, virtual tables are placed either into the .data section (which isn't write protected) or into the .rodata section (available only for reading). The latter case is encountered most frequently.

Now, for simplicity, consider applications containing virtual tables in the .data section. Suppose that the intruder manages to modify one of the elements of the virtual table. Then, if the appropriate virtual function is called, different code will gain control instead of that function. However, this goal is difficult to achieve. As a rule, virtual tables are located at the beginning of the .data section, in other words, before static buffers and far from automatic buffers. Note that it is difficult to specify this location more precisely because, depending on the operating system, the stack might be located either below the .data section or above it. Thus, sequential overflow is not suitable in this case, and the hacker must rely on index overflow, which is rather exotic.

Modifying a pointer to an object or a pointer to a virtual table is much easier, because they are located in the memory area available for modification; furthermore, they usually are located near overflowing buffers.

Modification of the this pointer results in replacement of the object's virtual functions. It is enough to find the pointer to the required function in the memory (or manually form it in the overflowing buffer) and set the this pointer to point at it so that the address of the next virtual function to be called would match the fictitious pointer. From the engineering point of view, this is a complicated operation because, in addition to virtual functions, objects contain variables, which are used actively. Resetting the this pointer changes the contents of these variables; consequently, it is highly probable that the vulnerable program would abnormally terminate much earlier than it calls the fictitious virtual function. It is possible to emulate the entire object; however, there is no guarantee that such an attempt would be successful. These points also relate to pointers to objects because, from the compiler's point of view, they have more common than different features. However, the presence of two dissimilar entities gives some freedom of choice to the attacker. In some cases, it might be preferable to overwrite the this pointer, and in some cases it would be better to overwrite the pointer to an object.

Listing 4.4: Vulnerability to sequential write overflow, with overwriting of the pointer to the virtual table

    class A {
    public:
        virtual void f() { printf("legal\n"); };
    };

    main() {
        char buff[8]; A *a = new A;
        printf("passwd:"); gets(buff); a->f();
    }

Listing 4.5: Disassembled listing of the vulnerable program with brief comments

    .text:00401000 main          proc near           ; CODE XREF: start+AF↓p
    .text:00401000
    .text:00401000 var_14        = dword ptr -14h    ; this
    .text:00401000 var_10        = dword ptr -10h    ; *a
    .text:00401000 var_C         = byte ptr -0Ch
    .text:00401000 var_4         = dword ptr -4
    .text:00401000
    .text:00401000               PUSH  EBP
    .text:00401001               MOV   EBP, ESP
    .text:00401003               SUB   ESP, 14h
    .text:00401003 ; Open the stack frame and reserve 14h bytes
    .text:00401003 ; of the stack memory.
    .text:00401006               PUSH  4
    .text:00401008               CALL  operator new(uint)
    .text:0040100D               ADD   ESP, 4
    .text:0040100D ; Allocate memory for the new instance of object A
    .text:0040100D ; and obtain the pointer.
    .text:00401010               MOV   [EBP + var_10], EAX
    .text:00401010 ; Write the pointer to the object into the var_10 variable.
    .text:00401010 ;
    .text:00401013               CMP   [EBP + var_10], 0
    .text:00401017               JZ    short loc_401026
    .text:00401017 ; Check whether the memory allocation was successful.
    .text:00401017 ;
    .text:00401019               MOV   ECX, [EBP + var_10]
    .text:0040101C               CALL  A::A
    .text:0040101C ; Call the constructor of object A.
    .text:0040101C ;
    .text:00401021               MOV   [EBP + var_14], EAX
    .text:00401021 ; Load the returned this pointer into the var_14 variable.
    .text:00401021 ;
    ...
    .text:0040102D loc_40102D:                       ; CODE XREF: main+24↑j
    .text:0040102D               MOV   EAX, [EBP + var_14]
    .text:00401030               MOV   [EBP + var_4], EAX
    .text:00401030 ; Take the this pointer and hide it in the var_4 variable.
    .text:00401030 ;
    .text:00401033               PUSH  offset aPasswd ; "passwd:"
    .text:00401038               CALL  _printf
    .text:0040103D               ADD   ESP, 4
    .text:0040103D ; Display the input prompt.
    .text:0040103D ;
    .text:00401040               LEA   ECX, [EBP + var_C]
    .text:00401040 ; The overflowing buffer is below
    .text:00401040 ; the object pointer and the
    .text:00401040 ; primary this pointer but above the derived this pointer,
    .text:00401040 ; which makes the latter vulnerable.
    .text:00401040 ;
    .text:00401043               PUSH  ECX
    .text:00401044               CALL  _gets
    .text:00401049               ADD   ESP, 4
    .text:00401049 ; Read the string into the buffer.
    .text:00401049 ;
    .text:0040104C               MOV   EDX, [EBP + var_4]
    .text:0040104C ; Load the vulnerable this pointer into the EDX register.
    .text:0040104C ;
    .text:0040104F               MOV   EAX, [EDX]
    .text:0040104F ; Retrieve the virtual table address.
    .text:0040104F ;
    .text:00401051               MOV   ECX, [EBP + var_4]
    .text:00401051 ; Pass the this pointer to the function.
    .text:00401051 ;
    .text:00401054               CALL  dword ptr [EAX]
    .text:00401054 ; Call the first virtual function of the virtual table.
    .text:00401054 ;
    .text:00401056               MOV   ESP, EBP
    .text:00401058               POP   EBP
    .text:00401059               RETN
    .text:00401059 main          ENDP

Consider a situation in which an overflowing buffer is followed by a pointer p to a scalar variable and by a variable x, whose value is written through this pointer at some point of program execution. (The order in which the two variables follow each other is of no importance; the only issue is making sure that the overflowing buffer would overwrite them both.) Also, assume that, from the moment the buffer overflow takes place, neither the pointer nor the variable is changed or, if they are changed, they are changed predictably. Then, depending on the state of the cells overwriting the original contents of the x and p variables, it will be possible to write any value x to the arbitrary address p, and the vulnerable program will do this itself for the hacker. In other words, the hacker receives an analogue of the poke and PatchByte/PatchWord functions of the Basic and IDA-C languages, respectively. Some limitations might be imposed on the choice of arguments (for example, the gets function doesn't accept a zero character in the middle of the string). However, these limitations are not too stringent, and available capabilities are enough for the intruder to gain control over the system under attack.

Listing 4.6: Vulnerability to sequential write overflow and overwriting a scalar variable and pointer to data

    data_ptr() {
        char buff[8]; int x; int *p;
        printf("passwd:"); gets(buff);
        ...
        *p = x;
    }

The simplest approach is supplying the address of a function that already exists. Passing control directly to the overflowing buffer is much more difficult. This can be carried out in several ways. First, it is possible to find the jmp esp instruction in memory and pass control to it, which will then pass control to the top of the stack frame, slightly below which the shellcode resides. There is only a small chance of reaching the shellcode without damage, having bypassed all the garbage that might be encountered on the way. However, such a chance exists. Second, if the size of the overflowing buffer exceeds the variability of its allocation in memory, it is possible to place a long chain of NOP commands before the shellcode and pass control into the middle in hopes of not missing. This approach was used by the Love San worm, known for frequently missing and crashing the machine without infecting it. Third, if the attacker can influence static buffers located in the data segment (and their addresses are constant), then there won't be any problems with passing control there. After all, nothing requires the shellcode to be located in the overflowing buffer; it can be located anywhere. In addition, no one guarantees that in the course of buffer overflow the function will survive until the return, because everything located after the end of the buffer will be corrupted.

Indexes are a kind of pointer. In other words, these are relative pointers addressed in relation to some base. For example, p[i] can be represented as *(p + i), which practically makes p and i equal in rights.

Figure 4.2: Using NOPs to simplify the penetration into the shellcode limits

Modification of indexes has its strong and weak points. The strong point is that pointers require you to specify the absolute address of the target cell, which usually is unknown, whereas the relative address can be easily computed. Indexes stored in char variables are free from the zero-character problem. Indexes stored in int variables can freely overwrite the cells located above the starting address (in other words, located at lower addresses); in this case, the more significant bytes of the index contain FFh characters, which are more tolerable than zero characters.

However, in contrast to detection of index corruption, which is practically impossible (do not suggest duplicating their values in reserved variables), there are no difficulties with evaluating index correctness before using them. Therefore, most programmers do exactly that (although "most" doesn't mean "all"). Another weak point of indexes is their limited range, which is -128 to 127 bytes for signed char indexes, 0 to 255 bytes for unsigned char indexes, and -2147483648 to 2147483647 bytes for signed 32-bit int indexes.

Listing 4.7: Vulnerability to sequential write overflow with index overwriting

    index_ptr() {
        char *p; char buff[MAX_BUF_SIZE]; int i;
        p = malloc(MAX_BUF_SIZE); i = MAX_BUF_SIZE;
        ...
        printf("passwd:"); gets(buff);
        ...
        // if ((i < 1) || (i > MAX_BUF_SIZE)) error
        while (i--) p[i] = buff[MAX_BUF_SIZE - i];
    }

Scalar Variables

Scalar variables, which are neither indexes nor pointers, are considerably less interesting for attackers, because in most cases their capabilities are limited. However, if there are no other vulnerabilities they will do. Combined use of scalar variables and pointers and indexes was already considered. Now, it is time to explain standalone use of scalar variables.

Consider a case in which the end of the overflowing buffer is directly followed by the bucks variable, initialized before the overflow occurs. After the overflow, this variable is going to be used for computing the sum of money billed to a banking account (not necessarily that of the intruder). Assume that the program carefully checks the input data and doesn't allow input of negative values, but also doesn't control the integrity of the bucks variable. By varying its value as needed, the intruder will be able to easily bypass all checks and limitations.

Listing 4.8: Vulnerability to overflow with overwriting a scalar variable

    var_demo(float *money_account) {
        char buff[MAX_BUF_SIZE]; float bucks = CURRENT_BUCKS_RATE;
        printf("input money:"); gets(buff);
        if (atof(buff) < 0) ... // Error! Enter a positive value
        ...
        *money_account -= (atof(buff) * bucks);
    }

Although this example is somewhat artificial, it is illustrative. Modification of scalar variables allows the intruder to gain control over the system only in emergencies. However, it easily allows intruders to scramble numeric data. This isn't much, but it is something. What cases can be considered emergencies? First, many programs contain debug variables left by the developers. These might be used, for example, to disable an authentication system. Second, there are lots of variables storing initial or maximum allowed values of other variables, such as loop counters. Consider, for example, the following construct: for (a = b; a < c; a++) *p++ = *x++. Modification of the b and c variables will result in overflow of the p buffer, with all possible consequences. It is also possible to invent many other tricks, which are so numerous that it simply doesn't make sense to list them all. Overwriting scalar variables during overflow usually doesn't result in an immediate crash of the program. Therefore, such errors might remain undetected for a long time. So, be careful!

Arrays and Buffers

What interesting thing could be located in buffers? First, there are the strings stored in Pascal format. As you know, these are strings that contain the length field in the beginning. Overwriting this field causes a cascade of secondary overflow events. Vulnerability of buffers containing confidential information was already described. Listing 4.9 shows a practical, although somewhat artificial, example.

Listing 4.9: Vulnerability to sequential write overflow with overwriting another buffer

    buff_demo() {
        char buff[MAX_BUF_SIZE];
        char pswd[MAX_BUF_SIZE];
        ...
        fgets(pswd, MAX_BUF_SIZE, f);
        ...
        printf("passwd:"); gets(buff);
        if (strncmp(buff, pswd, MAX_BUF_SIZE))
            // Wrong password
        else
            // Correct password
    }

Buffers containing names of files to be opened are even more interesting. For example, it is possible to make the application write the confidential data into the file accessible by everyone or make the program open the public file instead of the confidential one. This can be achieved easily because several buffers following each other are not rare.

Specific Features of Different Overflow Types

Overflowing buffers might be in one of the following three locations of the process address space: the stack (also called automatic memory), the data segment (although under Windows 9x/NT, this isn't actually a segment), or the heap (dynamic memory).

Stack overflow is the most common, although its significance is strongly exaggerated. The stack bottom varies from operating system to operating system, and the height of the stack top depends on the previous requests to the program. Thus, the absolute address of automatic variables is not known beforehand. On the other hand, automatic buffers are attractive because the return address from the function lies directly near their end. If this return address is overwritten, then a different branch of the program will gain control. The situation with the heap is much more complicated. Nevertheless, hackers manage to overflow even it.

Stack Overflow

Cases of automatic buffer overflow are the most frequent and the most perfidious. They occur frequently because the size of such buffers is hard-coded at compile time, and procedures of checking the data being processed for correctness are either missing or implemented with blatant errors. They are perfidious because directly near automatic buffers there is the return address, overwriting which allows the intruder to pass control to arbitrary code.

In addition, the stack contains the pointer to the frame of the parent function saved by the compiler before opening the frame of the child function. In general, optimizing compilers supporting the "floating" frame technology do without it, using the stack top pointer as a normal general-purpose register. However, even superficial analysis detects a large number of vulnerable applications with the frame inside; therefore, this technique retains its importance. Modification of the stack frame corrupts addressing of local variables and arguments of the parent function and provides the attacker with the possibility of controlling them as desired. By setting the frame of the parent function to the chosen buffer, the intruder can assign any values to the variables or arguments of the parent function (including knowingly invalid ones, because the check of the arguments' validity is usually carried out before the call to any child functions, and automatic variables usually are not checked for correctness after initialization).

Important

Because after return from the child function, all local variables belonging to it are automatically released, it is not recommended that you use the child buffer for storing variables of the parent function (more precisely, it is possible to do so, but the hacker must be careful). It is better to use the heap, static memory, or automatic memory of the parallel thread influencing it indirectly.

The general scheme of stack memory allocation is shown in Fig. 4.3.

    Free space
    Automatic variables of the child function
    Saved registers
    Stack frame of the parent function
    Return address to the parent function
    Arguments of the child function
    Automatic variables of the parent function
    Saved registers
    Stack frame of the grandparent function
    Return address to the grandparent function
    Arguments of the parent function
    Stack bottom

Figure 4.3: Map of stack memory allocation

Above the stack frame are saved values of the registers, which are restored after exiting from the function. If the parent function stores critical variables in one or more such registers, then the attacker can freely influence them.

Next there is the area occupied by local variables (including the overflowing buffer). Depending on the whim of the compiler, the latter might be located either on the top of the stack frame or in the middle of local variables. Variables located "below" the overflowing buffer might be overwritten during sequential overflow, the most common type of overflow. Variables located "above" the overflowing buffer are overwritten only in the course of index overflow, which is encountered rarely.

Finally, above the stack frame is the free stack space. There is nothing to overwrite here, and this space can be used for auxiliary needs of the shellcode. At the same time, the hacker must bear in mind that, first, the stack size is limited and, second, if one of the sleeping objects of the victim process suddenly wakes up, the contents of the free stack memory will be modified. To avoid such a situation, shellcode must "pull" the ESP register to the top level, thus reserving the required number of memory bytes. Because the stack memory belonging to the thread is allocated dynamically, any attempt to go beyond the limits of the guard page throws an exception. Thus, the hacker must either request no more than 4 KB or read at least one cell from each page being reserved, going from bottom to top. More detailed information on this topic can be found in Advanced Windows by Jeffrey Richter.

Depending on the limitations imposed on the maximum allowed length of the overflowing buffer, either local variables or auxiliary data structures might be overwritten. It is highly possible that the shellcode won't succeed in reaching the return address. Even if the shellcode achieves this, the function may crash long before its completion. Assume that directly after the end of the overflowing string buffer, there is a pointer from which something is read or into which something is written after the overflow. Because buffer overflow inevitably overwrites the pointer, any attempt to read through it causes an immediate exception and, consequently, abnormal termination of the program. It probably will be impossible to overwrite the return address by supplying the correct address to the pointer, because in Windows all addresses guaranteed to be available are located considerably lower than 01010101h, the smallest address that can be inserted into the middle of the string buffer (see Chapter 10 for more details). Thus, buffers located at the bottom of the stack frame are preferred targets for overflow.

After the end of the return address lies the memory area belonging to parent functions and containing arguments of the child function, automatic variables of the parent function, saved registers and the stack frame of the grandparent function, the return address to the grandparent function, etc. (Fig. 4.3). In theory, an overflowing buffer can overwrite all this information (there are such aggressive buffers); however, in practice this is either unneeded or impossible. If the hacker can force the program to accept the correct return address (in other words, the return address pointing to the shellcode or to any address of the "native" code of the program), it will not return to the parent function and all machinations with the parent variables will remain unnoticed. If for some reason it is impossible to supply the correct return address, then, even more so, the parent function won't obtain control.

Reading the parent memory area is much more informative (see "Pointers and Indexes"), because lots of interesting information can be encountered here, including confidential data (such as passwords or credit card numbers), descriptors of secret files that cannot be opened in a normal way, and sockets of established TCP connections that can be used for bypassing firewalls.

Modification of the arguments of the child function is less practical, although sometimes it can be useful. Traditionally, there are lots of pointers among C/C++ arguments. As a rule, these are pointers to data; however, pointers to code can be encountered. From the attacker's point of view, they are the most promising because they allow the intruder to gain control over the program before it crashes. Naturally, pointers to data are also good targets for the attack (especially those that allow writing of fictitious data at forced addresses, in other words, the ones that work like the poke function in Basic). However, to reach these arguments in the course of sequential buffer overflow, it is necessary to pass over the cells that are occupied by the return address.

Overwriting the return address involves one particularly important detail: the return address is an absolute address. Consequently, if the hacker needs to pass control directly to the overflowing buffer, it is necessary either to hope that the overflowing buffer of the vulnerable program will be located at a specific address (which can't be guaranteed) or to search for a mechanism of passing control to the stack top.

The Love San worm solved this problem by replacing the return address with the address of a jmp esp machine instruction located in memory belonging to the operating system. The drawbacks of this approach are obvious. First, it won't work when the overflowing buffer is located below the stack top. Second, the location of the jmp esp instruction is closely tied to the version of the operating system. However, there are no better methods of passing control.

Heap Overflow

Buffers located in dynamic memory are also vulnerable to overflow. Many programmers, lazy by nature, first allocate a fixed-size buffer and only then determine how much memory they actually need. They typically forget to handle correctly the situations in which the buffer turns out to be too small. Buffers of two types are usually encountered in the heap: structure elements and dynamically allocated memory blocks.

Assume that there is a structure called demo in the program, which contains a fixed-size buffer:

Listing 4.10: An example of a structure with an overflowing buffer (the buf member)

 struct demo {
         int  a;
         char buf[8];    /* the overflowing buffer */
         int  b;
 };
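The layout of the structure in Listing 4.10 can be examined without leaving the object. The following sketch is an illustration only: write_at_buf is a hypothetical helper, and it deliberately clamps the write to the end of the structure so that the demonstration stays within defined behavior. It assumes the usual layout in which b follows buf.

```c
#include <stddef.h>
#include <string.h>

struct demo {
    int  a;
    char buf[8];
    int  b;
};

/* Copies len bytes from src to the position of buf inside *d, treating
   the structure as raw bytes. The length is clamped to the end of the
   structure, so the write never leaves the object.
   Returns 1 if the bytes of member b were touched, 0 otherwise. */
int write_at_buf(struct demo *d, const void *src, size_t len)
{
    size_t start = offsetof(struct demo, buf);
    size_t room  = sizeof(struct demo) - start;

    if (len > room)
        len = room;
    memcpy((unsigned char *)d + start, src, len);
    return start + len > offsetof(struct demo, b);
}
```

A write of sizeof(buf) bytes leaves b intact; anything longer starts to change b, while a, which precedes the buffer, is never affected.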

Careless handling of the data being processed (for example, the lack of a required check in the required place) can result in overflow of the buf buffer and, consequently, in overwriting of the variables that follow it. First, these are member variables of the structure itself (in this case, variable b); modifying them follows the rules common to all overflowing buffers. The possibility of overwriting memory cells located beyond the limits of the allocated memory block is less evident. By the way, for buffers that occupy the entire allocated memory block exclusively, this is the only possible strategy. Consider the code in Listing 4.11. In your opinion, is there anything that can overflow?

Listing 4.11: An example of a dynamic memory block vulnerable to overflow

 #define MAX_BUF_SIZE  8
 #define MAX_STR_SIZE  256

 char *p;
 ...
 p = malloc(MAX_BUF_SIZE);
 ...
 /* the limit is MAX_STR_SIZE, but the block holds only MAX_BUF_SIZE bytes */
 strncpy(p, str, MAX_STR_SIZE);
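As a side note to Listing 4.11, strncpy always writes exactly as many bytes as its third argument specifies, padding with zeros when the source string is shorter. The sketch below quantifies the excess and shows one possible bounded alternative. The helper names bytes_past_block and bounded_copy are hypothetical and are not taken from the vulnerable program.

```c
#include <stddef.h>
#include <string.h>

#define MAX_BUF_SIZE  8
#define MAX_STR_SIZE  256

/* Number of bytes that a copy limited to copy_limit bytes writes
   beyond a block of block_size bytes. */
size_t bytes_past_block(size_t block_size, size_t copy_limit)
{
    return copy_limit > block_size ? copy_limit - block_size : 0;
}

/* Copies at most dst_size - 1 characters of src into dst and always
   terminates the result. Returns the number of characters copied. */
size_t bounded_copy(char *dst, size_t dst_size, const char *src)
{
    size_t n;

    if (dst_size == 0)
        return 0;
    n = strlen(src);
    if (n > dst_size - 1)
        n = dst_size - 1;        /* truncate instead of overflowing */
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

With the constants of Listing 4.11, bytes_past_block(MAX_BUF_SIZE, MAX_STR_SIZE) evaluates to 248, which is how far past the end of the block the original call reaches.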

For a long time, it was assumed that in code like Listing 4.11 there was nothing useful to overwrite. At most, it was possible to organize a trivial DoS attack. However, it was thought to be impossible to gain control over the target computer because of the chaotic distribution of dynamic blocks over the memory. The base address of the p block is generally arbitrary, and practically anything can be located beyond its end, including an unallocated memory region. Any attempt at accessing such a region results in an immediate exception, which in turn results in abnormal program termination.

However, this common point of view is erroneous. Nowadays, no one is surprised by overflow of dynamic buffers. This technique has long been used as a universal means of gaining control, and not without success. For example, the much-discussed Slapper worm, which is one of the few worms that infect UNIX machines, propagates in this manner. How is it possible? Consider the propagation mechanism of this worm in more detail.

Allocation and release of dynamic memory take place chaotically, and at any given instant any allocated block can be followed by another block. Even if several memory blocks are allocated sequentially, no one can guarantee that they will be allocated in the same order at every program start-up. This is because the order of memory block allocation depends on the sizes of the released memory buffers and the order in which they were freed. Nevertheless, the layout of the auxiliary data structures that run through dynamic memory as a kind of supporting framework is easily predictable, although it may differ from compiler to compiler.

There are lots of dynamic memory implementations. Different manufacturers use different algorithms. Allocated memory blocks may be organized as a tree or as a singly or doubly linked list, the links of which might be represented either by pointers or by indexes stored at the beginning or end of each allocated block or in a separate data structure. The latter method of implementation is encountered rarely.

Without diving deep into the technical details of the dynamic memory manager, it is possible to say that at least two auxiliary variables are related to each allocated memory block: the pointer (index) to the next block and the block allocation flag. These variables can be located before the allocated block, after it, or in a different location. When releasing the memory block, the free function checks the allocation flag of the next block and, if it is free, joins these two blocks together, updating the pointer. And, where there is a pointer, there practically always is the poke function. In other words, by overwriting the data after the end of the allocated block in strictly-measured doses, it is possible to modify practically any cell of the vulnerable program, for example, by redirecting some pointer to the shellcode.

Consider a dynamic memory organization in which all allocated blocks are connected using doubly linked lists, the pointers of which are located at the beginning of every block (Fig. 4.4). Note that adjacent memory blocks need not reside in adjacent elements of the list. In the course of multiple allocation-and-release operations the list inevitably becomes fragmented, and constant defragmentation of this list is too inconvenient.

Figure 4.4: Dynamic memory blocks

Buffer overflow overwrites auxiliary structures of the next memory block, thus providing the possibility of modifying them (Fig. 4.5). However, what benefit will the attacker receive? Access to the cells of every block is carried out through the pointer returned to the program at the moment of its allocation, not through the "auxiliary" pointer that the intruder is going to overwrite. Auxiliary pointers are used exclusively by malloc/free (and similar functions). Modifying the pointer to the next or previous block allows the intruder to dictate the address of the next block to be allocated, for example, by superimposing it on an available buffer. However, the hacker has no guarantee that this operation will be successful because, when allocating a memory block, the malloc function looks for the most suitable (from its point of view) memory region. As a rule, this is the first free block in a chunk matching the size of the requested one. Thus, there is no guarantee that the desired region will be suitable for it. Briefly speaking, the prospects are not too bright.

Memory block 1
    Pointer to the next block in chunk
    Pointer to the previous block in chunk
    Size
    Status (allocated/free)
    Memory allocated to the block

Memory block 2
    Pointer to the next block in chunk
    Pointer to the previous block in chunk
    Size
    Status (allocated/free)
    Memory allocated to the block

Figure 4.5: Approximate map of dynamic memory allocation
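The map in Fig. 4.5 can be written down as a C structure. This is a rough sketch of the layout described in the text, not the header of any particular library: the names block_header, block_payload, payload_header, and physically_next are hypothetical, and real allocators differ in field order, field widths, and alignment rules.

```c
#include <stddef.h>

/* Auxiliary data placed in front of every block, as in Fig. 4.5. */
struct block_header {
    struct block_header *next;     /* next block in chunk             */
    struct block_header *prev;     /* previous block in chunk         */
    size_t               size;     /* size of the user memory, bytes  */
    int                  is_free;  /* status: allocated (0) or free   */
};

/* The pointer handed to the program points just past the header. */
void *block_payload(struct block_header *h)
{
    return (void *)(h + 1);
}

/* Recovers the header from a pointer previously handed to the program. */
struct block_header *payload_header(void *p)
{
    return (struct block_header *)p - 1;
}

/* Header of the block that physically follows h in memory, assuming
   blocks are packed back to back and size keeps headers aligned. */
struct block_header *physically_next(struct block_header *h)
{
    return (struct block_header *)((unsigned char *)block_payload(h) + h->size);
}
```

Under these assumptions, the first bytes beyond the end of one block's user memory are the next and prev fields of the following block, which is why an overflow reaches them first.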

Release of memory blocks is a different matter. To reduce the fragmentation of the dynamic memory, the free function automatically joins the block currently being released with the next one, provided that the next block is also free. Because adjacent blocks might be located at opposite ends of the list that links them, the free function must remove the foreign block from the chunk before joining it. This is carried out by linking its previous and next neighbors to each other. In pseudocode, this operation appears approximately as follows: the cell addressed by the pointer to the next block in the chunk = the pointer to the previous block in the chunk. Both pointers are read from the header that the overflow has just overwritten, so this is nothing but the analogue of the poke function in Basic, which allows modification of any cell of the vulnerable program.
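The removal step can be sketched in C. This is a minimal model under assumed names (node, unlink_node, unlink_checked), not the code of any specific allocator. It shows why the operation writes to locations named by the header fields themselves, and how a consistency check of the kind some allocators perform before unlinking detects forged links.

```c
#include <stddef.h>

struct node {
    struct node *next;
    struct node *prev;
};

/* Classic removal from a doubly linked list. Both stores go to
   addresses taken from n's own fields, so whoever controls those
   fields controls where the stores land. */
void unlink_node(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Variant that first verifies that both neighbors really point back
   at n. Returns 1 if the node was removed, 0 if the links were
   inconsistent and nothing was written. */
int unlink_checked(struct node *n)
{
    if (n->prev->next != n || n->next->prev != n)
        return 0;
    unlink_node(n);
    return 1;
}
```

With intact links both functions behave identically. If next and prev of a node have been replaced with the addresses of unrelated cells, unlink_node modifies those cells, whereas unlink_checked leaves them untouched.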

More details on this topic can be found in the "Once upon a free()" article published in issue 39h of the Phrack e-zine (http://www.phrack.org). This article is overstuffed with technical details of dynamic memory implementation in different libraries, but it is useful reading.

As a rule, the possibility of writing into the memory is used for modifying the import table to replace some API function that is guaranteed to be called by the vulnerable program soon after the overflow takes place. The fate of the program is sealed, because the integrity of the supporting framework of dynamic memory is already violated and this unstable construction can crash at any moment. It probably will be impossible to pass control to the overflowing buffer, however, because its address isn't known beforehand. The hacker must improvise under these circumstances. First, it is possible to place the shellcode in any other available buffer with a known address (see the next section). Second, among the functions of the vulnerable program it is possible to encounter ones that pass control to a pointer passed to them, along with some argument (conventionally, call such a function the f-function). After that, the only thing that remains for the hacker is to find an API function that takes the pointer to the overflowing buffer and to replace its address with that of the f-function. In C++ programs, with their virtual functions and this pointers, such a situation is not rare, although it cannot be called common. However, when designing shellcode, it is not recommended that you rely on standard solutions. Hackers have to be creative.

Be prepared for the fact that, in some implementations of the heap, indexes might be encountered instead of pointers. In general, indexes are relative addresses counted either from the first byte of the heap or from the current memory cell. The latter case is encountered most frequently (in particular, the library of the Microsoft Visual C++ 6.0 compilers is built exactly in this way). Thus, it is expedient to consider it in more detail. As was already mentioned, absolute addresses of the overflowing buffers are not known beforehand and change unpredictably depending on specific circumstances. However, the addresses of the cells that are the most desirable for modification are absolute addresses. What could be done about this? It is possible to investigate the strategy of allocation and release of memory for the current application to detect the most probable combination, because surely some patterns in assigning addresses to the overflowing buffers can be detected. By carefully testing all possible variants one after another, the attacker sooner or later will succeed in gaining control over the server. However, before this successful attempt, the server might freeze a couple of times, which will disclose the attack and make administrators vigilant.
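The difference between the two kinds of indexes comes down to simple address arithmetic, sketched below. The type heap_index and the three functions are hypothetical names used only for illustration. Addresses are modeled as plain integers, and the index width is an assumption rather than the format of any specific library.

```c
#include <stdint.h>

typedef intptr_t heap_index;   /* signed relative address (assumed width) */

/* Index counted from the first byte of the heap. */
uintptr_t resolve_heap_relative(uintptr_t heap_base, heap_index index)
{
    return heap_base + (uintptr_t)index;
}

/* Index counted from the cell that stores the index itself. */
uintptr_t resolve_self_relative(uintptr_t cell_addr, heap_index index)
{
    return cell_addr + (uintptr_t)index;
}

/* Index that must be stored in the cell at cell_addr for it to refer
   to the absolute address target. */
heap_index self_relative_index(uintptr_t cell_addr, uintptr_t target)
{
    return target >= cell_addr
        ?  (heap_index)(target - cell_addr)
        : -(heap_index)(cell_addr - target);
}
```

The last function makes the difficulty explicit: the index needed to reach a fixed absolute target depends on cell_addr, so it cannot be computed without knowing where the overflowing buffer actually resides.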

Overflowing Buffers in the .data Section

Overflowing buffers located in the .data section (static buffers) represent a goldmine from the intruder's point of view. This is the only type of buffer whose addresses are explicitly specified at link time and are constant for each version of a vulnerable application, no matter under which operating system it is running.

The main issue is that the .data section contains lots of pointers to functions, data, global flags, file descriptors, the heap, file names, text strings, buffers of some library functions, etc. However, to reap all this wealth, the hacker must spend some effort. If the length of the overflowing buffer happens to be strictly limited from above (which most often is the case), the attacker won't gain any advantages.

In addition, in contrast to the stack and the heap, which are guaranteed to contain pointers in specific locations and support universal mechanisms of obtaining control, with static buffers the attacker must rely only on fortune. Overflows of static buffers are rare and always follow a unique scenario, which cannot even be generalized or classified.