Auditing Memory Management

Memory management is a core element of every program, whether it is performed explicitly by the developer or implicitly by the programming language and runtime. To complete your understanding of programming building blocks you need to examine the common issues in managing memory, and the security-relevant impact of mismanagement. The following sections explore these issues and present you with a few tools to help make you more productive in identifying memory management vulnerabilities.

ACC Logs

Errors in memory management are almost always the result of length miscalculations, so one of the first steps in auditing memory management is to develop a good process for identifying them. Some miscalculations stand out, but others are quite easy to miss. A tool called the allocation-check-copy (ACC) log helps you identify even the most subtle ones. An ACC log records any variations in allocation sizes, length checks, and data element copies that occur on a memory block. It is divided into three columns for each memory allocation. The first column describes the size of the memory that's allocated, which can be a formula or a static number if the buffer is statically sized. The second column contains any length checks that data elements are subjected to before being copied into the allocated buffer. The third column lists which data elements are copied into the buffer and the way in which they are copied. Separate copies are listed one after the other. Finally, you can add an optional fourth column, where you note any interesting discrepancies you determined from the information in the other three columns. Look at a sample function in Listing 7-33, and then examine its corresponding ACC log in Table 7-5.

Listing 7-33. Length Miscalculation Example for Constructing an ACC Log

int read_packet(int sockfd)
{
    unsigned int challenge_length, ciphers_count;
    char challenge[64];
    struct cipher *cipherlist;
    int i;

    challenge_length = read_integer(sockfd);
    if(challenge_length > 64)
        return -1;
    if(read_bytes(sockfd, challenge, challenge_length) < 0)
        return -1;
    ciphers_count = read_integer(sockfd);
    cipherlist = (struct cipher *)allocate(ciphers_count *
                  sizeof(struct cipher));
    if(cipherlist == NULL)
        return -1;
    for(i = 0; i < ciphers_count; i++)
    {
        if(read_bytes(sockfd, &cipherlist[i],
                      sizeof(struct cipher)) < 0)
        {
            free(cipherlist);
            return -1;
        }
    }
    ... more stuff here ...
}

Table 7-5. ACC Log
	Allocation	Check	Copy	Notes
`challenge` variable	64	Supplied length is less than or equal to 64 (check is unsigned)	Copies length bytes	Seems like a safe copy; checks are consistent
`cipherlist` variable	`ciphers_count * sizeof (struct cipher)`	N/A	Reads individual ciphers one at a time	Integer overflow if `ciphers_count > 0xFFFFFFFF / sizeof(struct cipher)`

Listing 7-33 shows some code that reads a packet from a fictitious protocol and allocates and reads different elements from the packet. A sample ACC log is shown in Table 7-5.

In the ACC log, you record the specifics of how a buffer is allocated, what length checks are performed, and how data is copied into the buffer. This compact format quickly summarizes how dynamic memory allocations and copies are done and whether they are safe. Notice that the entry for the cipherlist variable mentions that ciphers are copied one at a time. This detail is important when you're determining whether an operation is safe. If this function did a single read of ciphers_count * sizeof(struct cipher), the allocation and copy lengths would be identical, so the code would be safe regardless of whether an integer overflow occurred. Checks sometimes happen before an allocation; if so, you might want to rearrange the first two columns to make the record easier to understand.
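The overflow condition noted for cipherlist in Table 7-5 can be checked with a little arithmetic. The following is a minimal sketch, assuming 32-bit unsigned sizes as in Listing 7-33; the helper names product_wraps and wrapped_product are hypothetical and do not appear in the original code.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if count * elem_size cannot be
 * represented in 32 bits, 0 otherwise. elem_size must be nonzero. */
int product_wraps(uint32_t count, uint32_t elem_size)
{
    return count > UINT32_MAX / elem_size;
}

/* The value a 32-bit allocation size parameter would actually receive. */
uint32_t wrapped_product(uint32_t count, uint32_t elem_size)
{
    return (uint32_t)((uint64_t)count * elem_size);
}
```

If struct cipher were 16 bytes (an assumed size), a ciphers_count of 0x10000000 would wrap to a 0-byte request, yet the read loop would still try to fill 0x10000000 elements one at a time.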

ACC logs are intended to help you identify length checks that could cause problems; however, they aren't a complete assessment of the memory safety of an operation. To understand this point, look at the following example:

    ciphers_count = read_integer(sockfd);
    if(ciphers_count >= ((unsigned int)(~0))
                         /sizeof(struct cipher))
        return -1;
    cipherlist = (struct cipher *)
        allocate(ciphers_count * sizeof(struct cipher));
    if(cipherlist == NULL)
        return -1;

This code has a length check that you would add to your ACC record, but does this mean you can conclude this memory copy is secure? No. This function doesn't use a system allocator to allocate cipherlist; instead, it uses a custom allocate() function. To determine whether this code is secure, you need to consult your allocator scorecard (a tool introduced later in this section) as well. Only then could you conclude whether this allocation is safe.

The following sections present several examples of buffer length miscalculations you can use to test out your ACC logging skills. These examples help expose you to a variety of situations in which length miscalculations occur, so you're comfortable as you encounter similar situations in your own code assessments.

Unanticipated Conditions

Length miscalculations can arise when unanticipated conditions occur during data processing. In the following example, the code is printing some user-supplied data out in hexadecimal:

u_char *src, *dst, buf[1024];

for(src = user_data, dst = buf; *src; src++){
    snprintf(dst, sizeof(buf) - (dst - buf), "%2.2x", *src);
    dst += 2;
}

This developer assumes, however, that snprintf() successfully writes the two bytes into the buffer, because the loop always increments dst by 2 (the dst += 2 statement). If no bytes or only one byte were left in the buffer, dst would be incremented too far, and subsequent calls to snprintf() would be given a negative size argument. This size would be converted to a size_t and, therefore, interpreted as a large positive value, which would allow bytes to be written past the end of the destination buffer.
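One way to remove the assumption is to stop before the remaining space drops below what a single iteration needs. The following is a sketch only; hex_encode is a hypothetical helper, not code from the application being discussed.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: hex-encodes the NUL-terminated bytes at src into dst.
 * Stops as soon as fewer than 3 bytes remain (2 hex digits plus NUL), so the
 * size passed to snprintf() can never wrap. Returns the number of hex
 * characters written, not counting the terminating NUL. */
size_t hex_encode(const unsigned char *src, char *dst, size_t dstsize)
{
    size_t off = 0;

    if (dstsize == 0)
        return 0;
    dst[0] = '\0';

    while (*src != '\0' && dstsize - off >= 3) {
        snprintf(dst + off, dstsize - off, "%2.2x", (unsigned int)*src);
        off += 2;
        src++;
    }
    return off;
}
```

Because the offset only advances after the loop condition has confirmed room for both digits and the terminator, dstsize - off never goes below zero.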

Data Assumptions

Quite often when auditing code dealing with binary data, you see that programmers tend to be more trusting of the content, particularly in applications involving proprietary file formats and protocols. This is because they haven't considered the consequences of certain actions or they assume that only their applications will generate the client data or files. Often developers assume that no one would bother to reverse-engineer the data structures necessary to communicate with their software. History has told a very different story, however. People can, and frequently do, reverse-engineer closed-source products for the purpose of discovering security problems. If anything, researchers are even more willing and prepared to scrutinize complex and proprietary protocols via manual analysis, blackbox testing, and automated fuzzing.

Some of the simplest examples of data assumption errors are those in which developers make assumptions about a data element's largest possible size, even when a length is specified before the variable-length data field! Listing 7-34 shows an example from the NSS library used in Netscape Enterprise (and Netscape-derived Web servers) for handling SSL traffic.

Listing 7-34. Buffer Overflow in NSS Library's ssl2_HandleClientHelloMessage

  csLen         = (data[3] << 8)  | data[4];
  sdLen         = (data[5] << 8)  | data[6];
  challengeLen  = (data[7] << 8)  | data[8];
  cs            = data + SSL_HL_CLIENT_HELLO_HBYTES;
  sd            = cs + csLen;
  challenge     = sd + sdLen;
  PRINT_BUF(7, (ss, "server, client session-id value:", sd,
            sdLen));
  if ((unsigned)ss->gs.recordLen != SSL_HL_CLIENT_HELLO_HBYTES
                   + csLen + sdLen + challengeLen) {
    SSL_DBG((
      "%d: SSL[%d]: bad client hello message, len=%d should=%d",
      SSL_GETPID(), ss->fd, ss->gs.recordLen,
      SSL_HL_CLIENT_HELLO_HBYTES+csLen+sdLen+challengeLen));
    goto bad_client;
  }
  ...
  /* Squirrel away the challenge for later */
  PORT_Memcpy(ss->sec.ci.clientChallenge, challenge,
              challengeLen);

In Listing 7-34, the server takes a length field of challenge data supplied by the client, and then copies that much data from the packet into the ss->sec.ci.clientChallenge buffer, which is statically sized to 32 bytes. The code simply neglects to check whether the supplied length fits in the destination buffer. This simple error is fairly common, even more so in closed-source applications.
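The missing check is small. As a sketch (not the actual NSS fix), a hypothetical copy_challenge helper could refuse any length larger than the 32-byte destination described above; the CHALLENGE_MAX name is an assumption made for this example.

```c
#include <stddef.h>
#include <string.h>

#define CHALLENGE_MAX 32  /* assumed name for the 32-byte destination size */

/* Hypothetical helper: copies len bytes only if they fit in dst.
 * Returns 0 on success, -1 if the supplied length is too large. */
int copy_challenge(unsigned char dst[CHALLENGE_MAX],
                   const unsigned char *src, size_t len)
{
    if (len > CHALLENGE_MAX)
        return -1;
    memcpy(dst, src, len);
    return 0;
}
```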

Order of Actions

Actions that aren't performed in the correct order can also result in length miscalculation. Listing 7-35 shows a subtle example of how this problem could occur.

Listing 7-35. Out-of-Order Statements

int log(int level, char *fmt, ...)
{
    char buf[1024], *ptr = buf, *level_string;
    size_t maxsize = sizeof(buf) - 1;
    va_list ap;
    ...
    switch(level){
        case ERROR:
            level_string = "error";
            break;
        case WARNING:
            level_string = "warning";
            break;
        case FATAL:
            level_string = "fatal";
            break;
        default:
            level_string = "";
            break;
    }
    sprintf(ptr, "[%s]: ", level_string);
    maxsize -= strlen(ptr);
    ptr += strlen(ptr);
    sprintf(ptr, "%s: ", get_time_string());
    ptr += strlen(ptr);
    maxsize -= strlen(ptr);
    va_start(ap, fmt);
    vsnprintf(ptr, maxsize, fmt, ap);
    va_end(ap);
    ...

Listing 7-35 contains an error where it writes the time string, returned from get_time_string(), into the buffer. The ptr variable is incremented to the end of the time string, and then the string length of ptr is subtracted from maxsize. These two operations happen in the wrong order. Because ptr has already been incremented, maxsize is decremented by zero. Therefore, maxsize fails to account for the time string, and a buffer overflow could occur when vsnprintf() is called with the incorrect length.
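Ordering mistakes like this are harder to make when only one offset is tracked and the remaining space is always derived from it. The following sketch illustrates that idea; write_prefix is a hypothetical helper and the time string is passed in rather than obtained from get_time_string().

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "[level]: time: " into buf and returns the
 * number of characters written (excluding the NUL), or 0 if it did not fit. */
size_t write_prefix(char *buf, size_t bufsize,
                    const char *level, const char *timestr)
{
    size_t off = 0;
    int n;

    n = snprintf(buf + off, bufsize - off, "[%s]: ", level);
    if (n < 0 || (size_t)n >= bufsize - off)
        return 0;
    off += (size_t)n;

    n = snprintf(buf + off, bufsize - off, "%s: ", timestr);
    if (n < 0 || (size_t)n >= bufsize - off)
        return 0;
    off += (size_t)n;

    return off;   /* space left for the message is bufsize - off */
}
```

With a single offset there is no second variable to update, so the two statements that were swapped in Listing 7-35 collapse into one.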

Multiple Length Calculations on the Same Input

A common situation that leads to length miscalculations in applications is data being processed more than once at different places in the program, typically with an initial pass to determine the length and then a subsequent pass to perform the data copy. In this situation, the auditor must determine whether any differences exist between the length calculation code fragment and the data copy code fragment. The following code from Netscape Enterprise/Mozilla's NSS library shows code responsible for processing UCS2 data strings. The function iterates through the string and calculates the amount of space needed for output, and if the destination buffer is large enough, the function stores it. Listing 7-36 shows the loop for this calculation.

Listing 7-36. Netscape NSS Library UCS2 Length Miscalculation

PR_IMPLEMENT(PRBool)
sec_port_ucs2_utf8_conversion_function
(
  PRBool toUnicode,
  unsigned char *inBuf,
  unsigned int inBufLen,
  unsigned char *outBuf,
  unsigned int maxOutBufLen,
  unsigned int *outBufLen
)
{
  PORT_Assert((unsigned int *)NULL != outBufLen);
  if( toUnicode ) {
    ..
  } else {
    unsigned int i, len = 0;
    PORT_Assert((inBufLen % 2) == 0);
    if ((inBufLen % 2) != 0) {
      *outBufLen = 0;
      return PR_FALSE;
    }
    for( i = 0; i < inBufLen; i += 2 ) {
      if( (inBuf[i+H_0] == 0x00)
        && ((inBuf[i+H_0] & 0x80) == 0x00) )
        len += 1;
      else if( inBuf[i+H_0] < 0x08 ) len += 2;
      else if( ((inBuf[i+0+H_0] & 0xDC) == 0xD8) ) {
        if( ((inBuf[i+2+H_0] & 0xDC) == 0xDC)
          && ((inBufLen - i) > 2) ) {
          i += 2;
          len += 4;
        } else {
          return PR_FALSE;
        }
      }
      else len += 3;
    }

Note that there's a small variance when the data copy actually occurs later in the same function, as shown in the following code:

    for( i = 0; i < inBufLen; i += 2 ) {
      if( (inBuf[i+H_0] == 0x00)
        && ((inBuf[i+H_1] & 0x80) == 0x00) ) {
        /* 0000-007F -> 0xxxxxx */
        /* 00000000 0abcdefg -> 0abcdefg */
        outBuf[len] = inBuf[i+H_1] & 0x7F;
        len += 1;
      } else if( inBuf[i+H_0] < 0x08 ) {
        /* 0080-07FF -> 110xxxxx 10xxxxxx */
        /* 00000abc defghijk -> 110abcde 10fghijk */
        outBuf[len+0] = 0xC0 | ((inBuf[i+H_0] & 0x07) << 2)
                             | ((inBuf[i+H_1] & 0xC0) >> 6);
        outBuf[len+1] = 0x80 | ((inBuf[i+H_1] & 0x3F) >> 0);
        len += 2;
        ...

Do you see it? When the length calculation is performed, only one byte of output is expected when a NUL byte is encountered in the character stream because the H_0 offset into inBuf is used twice in the length calculation. You can see that the developer intended to test the following byte to see whether the high bit is set but used H_0 instead of H_1. The same mistake isn't made when the actual copy occurs. During the copy operation, the first if clause tests inBuf[i+H_1], so if the following byte has the highest bit set, the second branch runs and two bytes are written to the output buffer. Therefore, by supplying data containing the byte sequence 0x00, 0x80, you can cause more data to be written to the output buffer than was originally anticipated. As it turns out, the vulnerability can't be exploited in Netscape because the output buffer is rather large, and not enough input data can be supplied to overwrite arbitrary memory. Even though the error isn't exploitable, the function still performs a length calculation incorrectly, so it's worth examining.
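A general way to keep the two passes from drifting apart is to have both call the same classification routine. The sketch below is not the NSS code: the helper names are hypothetical, the input is assumed to be big-endian (so the high byte comes first), and surrogate pairs are left out to keep the example short.

```c
#include <stddef.h>

/* Hypothetical helper: number of UTF-8 bytes needed for one UCS-2 code unit
 * given as high byte hi and low byte lo. Surrogate pairs are deliberately
 * not handled in this sketch. */
size_t utf8_bytes_for_ucs2(unsigned char hi, unsigned char lo)
{
    if (hi == 0x00 && (lo & 0x80) == 0x00)
        return 1;               /* 0000-007F */
    if (hi < 0x08)
        return 2;               /* 0080-07FF */
    return 3;                   /* 0800-FFFF */
}

/* Length pass over big-endian UCS-2 input; a copy pass would call the same
 * helper for each unit. in_len is assumed to be even. */
size_t utf8_length(const unsigned char *in, size_t in_len)
{
    size_t i, len = 0;

    for (i = 0; i + 1 < in_len; i += 2)
        len += utf8_bytes_for_ucs2(in[i], in[i + 1]);
    return len;
}
```

With a shared helper, the byte sequence 0x00, 0x80 is counted as two output bytes in the length pass, matching what the copy pass writes.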

Allocation Functions

Problems can occur when allocation functions don't act as the programmer expects. Why would they not act as expected? You supply a size, and the function returns a memory block of that size. It's simple, right? However, code doesn't always behave exactly as expected; when dealing with memory allocations you need to be aware of the unusual cases.

Larger applications often use their own internal memory allocation instead of calling the OS's allocation routines directly. These application-specific allocation routines can range from doing nothing except calling the OS routines (simple wrappers) to complex allocation subsystems that optimize the memory management for the application's particular needs.

You can generally assume that system libraries for memory allocation are used extensively and are presumably quite sound; however, the same can't be said for application-specific allocators because they run the gamut in terms of quality. Therefore, code reviewers must watch for erroneous handling of requests instead of assuming these custom routines are sound. You should audit them as you would any other complex code, by keeping a log of the semantics of these routines and noting possible error conditions and the implications of those errors.

Because allocation routines are so universal and try to achieve much the same purpose from application to application, the following sections cover the most common problems you should watch for.

Is It Legal to Allocate 0 Bytes?

Many code auditors know that requesting an allocation of 0 bytes on most OS allocation routines is legal. A chunk of a certain minimum size (typically 12 or 16 bytes) is returned. This piece of information is important when you're searching for integer-related vulnerabilities. Consider the code in Listing 7-37.

Listing 7-37. Integer Overflow with 0-Byte Allocation Check

char *get_string_from_network(int sockfd)
{
  unsigned int length, read_bytes;
  char *string;
  int n;

  length = get_integer_from_network(sockfd);
  string = (char *)my_malloc(length + 1);
  if(!string)
    return NULL;
  for(read_bytes = 0; read_bytes < length; read_bytes += n){
    n = read(sockfd, string + read_bytes,
                 length - read_bytes);
    if(n < 0){
      free(string);
      return NULL;
    }
  }
  string[length] = '\0';
  return string;
}

In this code, attackers can specify a length that's incremented and passed to my_malloc(). The call to my_malloc() will be passed the value 0 when the length variable contains the maximum integer that can be represented (0xFFFFFFFF), due to an integer overflow. String data of length bytes is then read into the chunk of memory returned by the allocator. If this code called the malloc() or calloc() system allocation routines directly, you could conclude that it's a vulnerability because attackers can cause a large amount of data to be copied directly into a very small buffer, thus corrupting the heap. However, the code isn't using system libraries directly; it's using a custom allocation routine. Here is the code for my_malloc():

void *my_malloc(unsigned int size)
{
    if(size == 0)
        return NULL;
    return malloc(size);
}

Although the allocation routine does little except act as a wrapper to the system library, the one thing it does do is significant: It specifically checks for 0-byte allocations and fails if one is requested. Therefore, the get_string_from_network() function, although not securely coded, isn't vulnerable (or, more accurately, isn't exploitable) to the integer overflow bug explained previously.

The example in Listing 7-37 is very common. Developers often write small wrappers to allocation routines that check for 0-byte allocations as well as wrappers to free() functions that check for NULL pointers. In addition, potential vulnerabilities, such as the one in get_string_from_network(), are common when processing binary protocols or file formats. It is often necessary to add a fixed size header or an extra space for the NUL character before allocating a chunk of memory. Therefore, you must know whether 0-byte allocations are legal, as they can mean the difference between code being vulnerable or not vulnerable to a remote memory corruption bug.

Does the Allocation Routine Perform Rounding on the Requested Size?

Allocation function wrappers nearly always round up an allocation size request to some boundary (8-byte boundary, 16-byte boundary, and so on). This practice is usually acceptable and often necessary; however, if not performed properly it could expose the function to an integer overflow vulnerability. An allocation routine potentially exposes itself to this vulnerability when it rounds a requested size up to the next relevant boundary without performing any sanity checks on the request size first. Listing 7-38 shows an example.

Listing 7-38. Allocator-Rounding Vulnerability

void *my_malloc2(unsigned int size)
{
    if(size == 0)
        return NULL;
    size = (size + 15) & 0xFFFFFFF0;
    return malloc(size);
}

The intention of the rounding statement in this function (size = (size + 15) & 0xFFFFFFF0) is to round up size to the next 16-byte boundary by adding 15 to the request size and then masking out the lower four bits. The function fails to check that size is less than 0xFFFFFFF1, however. If a request size from 0xFFFFFFF1 up to and including 0xFFFFFFFF is passed, the addition overflows a 32-bit unsigned integer and results in a 0-byte allocation. Keep in mind that this function would not be vulnerable if size had been checked against 0 after the rounding operation. Often the difference between vulnerable and safe code is a minor change in the order of events, just like this one.
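The wrap is easy to confirm, and one possible guard is to reject the problem range before rounding. The following is a sketch under the assumption of 32-bit sizes; my_malloc2_checked and round_up_16_unchecked are hypothetical names introduced for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical variant of my_malloc2(): rejects sizes that would wrap when
 * rounded up to a 16-byte boundary. */
void *my_malloc2_checked(uint32_t size)
{
    if (size == 0 || size > 0xFFFFFFF0u)
        return NULL;
    size = (size + 15u) & 0xFFFFFFF0u;
    return malloc(size);
}

/* The unchecked rounding step on its own, exposed to show the wrap. */
uint32_t round_up_16_unchecked(uint32_t size)
{
    return (uint32_t)(size + 15u) & 0xFFFFFFF0u;
}
```

Checking the result against 0 after rounding, as the text suggests, would work equally well; the sketch simply rejects the error domain up front.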

Are Other Arithmetic Operations Performed on the Request Size?

Although rounding up an unchecked request size is the most common error that exposes an allocation routine to integer vulnerabilities, other arithmetic operations could result in integer-wrapping vulnerabilities. The second most common error happens when an application performs an extra layer of memory management on top of the OS's management. Typically, the application memory management routines request large memory chunks from the OS and then divide it into smaller chunks for individual requests. Some sort of header is usually prepended to the chunk and hence the size of such a header is added to the requested chunk size. Listing 7-39 shows an example.

Listing 7-39. Allocator with Header Data Structure

void *my_malloc3(unsigned int size)
{
    struct block_hdr *hdr;
    char *data;

    data = (char *)malloc(size + sizeof(struct block_hdr));
    if(!data)
        return NULL;
    hdr = (struct block_hdr *)data;
    hdr->data_ptr = (char *)(data + sizeof(struct block_hdr));
    hdr->end_ptr = data + sizeof(struct block_hdr) + size;
    return hdr->data_ptr;
}

This simple addition operation introduces the potential for an integer overflow vulnerability that is very similar to the problem in Listing 7-37. In this case, the my_malloc3() function is vulnerable to an integer overflow for any size value greater than 0xFFFFFFFF - sizeof(struct block_hdr), up to and including 0xFFFFFFFF. Any value in this range will result in the allocation of a small buffer for an extremely large length request.

Reallocation functions are also susceptible to integer overflow vulnerabilities because an addition operation is usually required when determining the size of the new memory block to allocate. Therefore, if users can specify one of these sizes, there's a good chance of an integer wrap occurring. Adequate sanity checking is rarely done to ensure the safety of reallocation functions, so code reviewers should inspect carefully to make sure these checks are done. Listing 7-40 shows a function that increases a buffer to make space for more data to be appended.

Listing 7-40. Reallocation Integer Overflow

int buffer_grow(struct buffer *buf, unsigned long bytes)
{
    if(buf->alloc_size - buf->used >= bytes)
        return 0;
    buf->data = (char *)realloc(buf->data,
                                buf->alloc_size + bytes);
    if(!buf->data)
        return 1;
    buf->alloc_size += bytes;
    return 0;
}

The realloc() call in Listing 7-40 contains a potentially dangerous addition operation. If users can specify the bytes value, bytes + buf->alloc_size can be made to wrap, and realloc() returns a small chunk without enough space to hold the necessary data.
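A sanity check on the addition closes this particular hole. The sketch below is one possible shape for it; the struct buffer layout is assumed because the original listing does not show it, and buffer_grow_checked is a hypothetical name. The sketch also keeps the old pointer when realloc() fails, which is a separate cleanup and not part of the overflow discussion.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed layout; the original listing does not show struct buffer. */
struct buffer {
    char  *data;
    size_t alloc_size;
    size_t used;
};

/* Returns 0 on success, 1 on failure (matching Listing 7-40). */
int buffer_grow_checked(struct buffer *buf, size_t bytes)
{
    char *p;

    if (buf->alloc_size - buf->used >= bytes)
        return 0;
    if (bytes > SIZE_MAX - buf->alloc_size)
        return 1;                       /* addition would wrap */
    p = realloc(buf->data, buf->alloc_size + bytes);
    if (p == NULL)
        return 1;                       /* old block is still valid */
    buf->data = p;
    buf->alloc_size += bytes;
    return 0;
}
```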

Are the Data Types for Request Sizes Consistent?

Sometimes allocation functions can behave unexpectedly because of typing issues. Many of the typing issues discussed in Chapter 6 are especially relevant when dealing with allocators, as any mistake in type conversions more than likely results in a memory corruption vulnerability that's readily exploitable.

On occasion, you might come across memory allocators that use 16-bit sizes. These functions are more vulnerable to typing issues than regular allocators because the maximum value they can represent is 65535 bytes, and users are more likely to be able to specify data chunks of this size or larger. Listing 7-41 shows an example.

Listing 7-41. Dangerous Data Type Use

void *my_malloc4(unsigned short size)
{
    if(!size)
        return NULL;
    return malloc(size);
}

The only thing you need to do to trigger a vulnerability is find a place in the code where my_malloc4() can be called with a value larger than 65535 (0xFFFF) bytes. If you can trigger an allocation of a size such as 0x00010001 (which, depending on the application, isn't unlikely), the value is truncated to a short, resulting in a 1-byte allocation.

The introduction of 64-bit systems can also render allocation routines vulnerable. Chapter 6 discusses 64-bit typing issues in more detail, but problems can happen when intermixing long, size_t, and int data types. In the LP64 compiler model, long and size_t data types are 64-bit, whereas int types occupy only 32 bits. Therefore, using these types interchangeably can have unintended and unexpected results. To see how this might be a problem, take another look at a previous example.

void *my_malloc(unsigned int size)
{
    if(size == 0)
        return NULL;
    return malloc(size);
}

As stated previously, this allocation wrapper doesn't do much except check for a 0-length allocation. However, it does one significant thing: It takes an unsigned int parameter, as opposed to a size_t, which is what the malloc() function takes. On a 32-bit system, these data types are equivalent; however, on LP64 systems, they are certainly not. Imagine if this function was called as in Listing 7-42.

Listing 7-42. Problems with 64-Bit Systems

int read_string(int fd)
{
    size_t length;
    char *data;

    length = get_network_integer(fd);
    if(length + 2 < length)
        return -1;
    data = (char *)my_malloc(length + 2);
    ... read data ...
}

The read_string() function specifically checks for integer overflows before calling the allocation routine. On 32-bit systems, this code is fine, but what about 64-bit systems? The length variable in read_string() is a size_t, which is 64 bits. Assuming that get_network_integer() returns an int, look at the integer overflow check more carefully:

    if(length + 2 < length)
        return -1;

On an LP64 system both sides of this expression are 64-bit integers, so the check can only verify that a 64-bit value does not overflow. When my_malloc() is called, however, the result is truncated to 32 bits because that function takes a 32-bit integer parameter. Therefore, on a 64-bit system, this code could pass the first check with a value of 0x100000001, and then be truncated to a much smaller value of 0x1 when passed as a 32-bit parameter.
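The mismatch between the check and the call can be modeled with fixed-width types. This is a sketch for illustration only: uint64_t stands in for an LP64 size_t, uint32_t stands in for the unsigned int parameter of my_malloc(), and both function names are hypothetical.

```c
#include <stdint.h>

/* Mirrors the caller's check with size_t modeled as a 64-bit integer:
 * returns 1 if length + 2 wraps a 64-bit value, 0 otherwise. */
int wraps_64(uint64_t length)
{
    return length + 2 < length;
}

/* Models passing the 64-bit sum to a 32-bit unsigned int parameter. */
uint32_t as_32bit_argument(uint64_t length)
{
    return (uint32_t)(length + 2);
}
```

A length of 0xFFFFFFFF passes the 64-bit check, yet the allocator sees a request for a single byte.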

Whether values passed to memory allocation routines are signed also becomes quite important. Every memory allocation routine should be checked for this condition. If an allocation routine doesn't do anything except pass the integer to the OS, it might not matter whether the size parameter is signed. If the routine is more complex and performs calculations and comparisons based on the size parameter, however, whether the value is signed is definitely important. Usually, the more complicated the allocation routine, the more likely it is that the signed condition of size parameters can become an issue.

Is There a Maximum Request Size?

A lot of the previous vulnerability conditions have been based on a failure to sanity check request sizes. Occasionally, application developers decide to arbitrarily build in a maximum limit for how much memory the code allocates, as shown in Listing 7-43. A maximum request size often thwarts many potential attacks on allocation routines. Code auditors should identify whether a maximum limit exists, as it could have an impact on potential memory corruption vulnerabilities elsewhere in the program.

Listing 7-43. Maximum Limit on Memory Allocation

#define MAX_MEMORY_BLOCK 100000

void *my_malloc5(unsigned int size)
{
    if(size > MAX_MEMORY_BLOCK)
        return NULL;
    size = (size + 15) & 0xFFFFFFF0;
    return malloc(size);
}

The allocator in Listing 7-43 is quite restrictive, in that it allows allocating only small chunks. Therefore, it's not susceptible to integer overflows when rounding up the request size after the size check. If rounding were performed before the size check rather than after, however, the allocator would still be vulnerable to an integer overflow. Also, note whether the size parameter is signed. Had this argument been a signed type, a negative value would evade the maximum size check (and wrap the integer over the 0-boundary during the rounding up that follows the size check).

Is a Different Size Memory Chunk Than Was Requested Ever Returned?

Essentially all integer-wrapping vulnerabilities become exploitable bugs for one reason: A different size memory chunk than was requested is returned. When this happens, there's the potential for exploitation. Although rare, occasionally a memory allocation routine can resize a memory request. Listing 7-44 shows the previous example slightly modified.

Listing 7-44. Maximum Memory Allocation Limit Vulnerability

#define MAX_MEMORY_BLOCK 100000

void *my_malloc6(unsigned int size)
{
    if(size > MAX_MEMORY_BLOCK)
        size = MAX_MEMORY_BLOCK;
    size = (size + 15) & 0xFFFFFFF0;
    return malloc(size);
}

The my_malloc6() function in Listing 7-44 doesn't allocate a block larger than MAX_MEMORY_BLOCK. When a request is made for a larger block, the function resizes the request instead of failing. This is very dangerous when the caller passes a size that can be larger than MAX_MEMORY_BLOCK and assumes it got a memory block of the size it requested. In fact, there's no way for the calling function to know whether my_malloc6() capped the request size at MAX_MEMORY_BLOCK, unless every function that called this one checked to make sure it wasn't about to request a block larger than MAX_MEMORY_BLOCK, which is extremely unlikely. To trigger a vulnerability in this program, attackers simply have to find a place where they can request more than MAX_MEMORY_BLOCK bytes. The request is silently truncated to a smaller size than expected, and the calling routine invariably copies more data into that block than was allocated, resulting in memory corruption.

Allocator Scorecards and Error Domains

When reviewing applications, you should identify allocation routines early during the audit and perform a cursory examination on them. At a minimum, you should address each potential danger area by scoring allocation routines based on the associated vulnerability issues, creating a sort of scorecard. You can use this scorecard as a shorthand method of dealing with allocators so that you don't need to create an extensive audit log. However, you should still search for and note any unique situations that haven't been addressed in your scorecard, particularly when the allocation routine is complex. Take a look at what these allocator scorecards might look like in Table 7-6.

Table 7-6. Allocator Scorecard
Function prototype	`int my_malloc(unsigned long size)`
0 bytes legal	Yes
Rounds to	16 bytes
Additional operations	None
Maximum size	100 000 bytes
Exceptional circumstances	When a request is made larger than 100 000 bytes, the function rounds off the size to 100 000.
Notes	The rounding is done after the maximum size check, so there is no integer wrap there.
Errors	None, only if malloc() fails.

This scorecard summarizes all potential allocator problem areas. There's no column indicating whether values are signed or listing 16-bit issues because you can instantly deduce this information from looking at the function prototype. If the function has internal issues caused by the signed conditions of values, list them in the Notes row of the scorecard. For simple allocators, you might be able to summarize even further to error domains. An error domain is a set of values that, when supplied to the function, generate one of the exceptional conditions that could result in memory corruption. Table 7-7 provides an example of summarizing a single error domain for a function.

Table 7-7. Error Domain
Function prototype	`int my_malloc()`
Error domain	0xFFFFFFF1 to 0xFFFFFFFF
Implication	Integer wrap; allocates a small chunk

Each allocator might have a series of error domains, each with different implications. This shorthand summary is a useful tool for code auditing because you can refer to it and know right away that, if an allocator is called with one of the listed values, there's a vulnerability. You can go through each allocator quickly as it's called to see if this possibility exists. The advantage of this tool is that it's compact, but the downside is you lose some detail. For more complicated allocators you may need to refer to more detailed notes and function audit logs.

Error domain tables can be used with any functions you audit, not just allocators; however, there are some disadvantages. Allocation functions tend to be small and specific, and you more or less know exactly what they do. Allocator scorecards and error domain tables help capture the differences between using system-supplied allocation routines and application-specific ones that wrap them. With other functions that perform more complex tasks, you might lose too much information when attempting to summarize them this compactly.

Double-Frees

Occasionally, developers make the mistake of deallocating objects twice (or more), which can have consequences as serious as any other form of heap corruption. Deallocating objects more than once is dangerous for several reasons. For example, what if a memory block is freed and then reallocated and filled with other data? When the second free() occurs, there's no longer a control structure at the address passed as a parameter to free(), just some arbitrary program data. What's to prevent this memory location from containing specially crafted data to exploit the heap management routines?

There is also a threat if memory isn't reused between successive calls to free(), because the memory block could be entered into the free-block list twice. Later in the program, the same memory block could be returned from two separate allocation requests, and the program might attempt to store two different objects at the same location, possibly allowing arbitrary code to run. This second scenario is less common these days because most memory management libraries (namely, Windows and GNU libc implementations) have updated their memory allocators to ensure that a block passed to free() is already in use; if it's not, the memory allocators don't do anything. However, some OSs have allocators that don't protect against a double-free attack, so bugs of this nature are still considered serious.
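The free-list scenario can be illustrated with a deliberately naive toy allocator. This is a sketch under the assumption of a singly linked LIFO free list with no in-use check; it is not how any real libc allocator is implemented, and toy_free(), toy_alloc(), and double_free_aliases() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Toy fixed-size block: the link field lives inside the block itself. */
struct block {
    struct block *next;
    char payload[32];
};

static struct block *free_list = NULL;

/* Push a block onto the free list without checking whether it is
 * already there. */
static void toy_free(struct block *b)
{
    assert(b != NULL);
    b->next = free_list;
    free_list = b;
}

/* Pop the most recently freed block, or NULL if the list is empty. */
static struct block *toy_alloc(void)
{
    struct block *b = free_list;

    if (b != NULL)
        free_list = b->next;
    return b;
}

/* Returns 1 if freeing one block twice makes two later allocations
 * return the same address. */
static int double_free_aliases(void)
{
    static struct block pool[1];
    struct block *first, *second;

    free_list = NULL;
    toy_free(&pool[0]);
    toy_free(&pool[0]);   /* double free: the block now links to itself */
    first = toy_alloc();
    second = toy_alloc();
    return first != NULL && first == second;
}
```

In this model the second toy_free() makes the block its own successor, so every later allocation hands out the same address and two unrelated objects end up sharing storage.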

When auditing code that makes use of dynamic memory allocations, you should track each path throughout a variable's lifespan to see whether it's accidentally deallocated with the free() function more than once. Listing 7-45 shows an example of a double-free vulnerability.

Listing 7-45. Double-Free Vulnerability

int read_data(int sockfd)
{
    char *data;
    int length, success;

    length = get_short_from_network(sockfd);
    data = (char *)malloc(length+1);
    if(!data)
        return 1;
    read_string(sockfd, data, length);
    switch(get_keyword(data)){
        case USERNAME:
            success = record_username(data);
            break;
        case PASSWORD:
            success = authenticate(data);
            break;
        default:
            error("unknown keyword supplied!\n");
            success = -1;
            free(data);    /* first free, reached only on this path */
    }
    free(data);            /* frees data again after the default case */
    return success;
}

In this example, the default case frees data twice: once inside the switch statement when no valid keyword is identified, and again in the free() call that follows the switch. Although this error seems easy to avoid, complex applications often have subtleties that make these mistakes harder to spot. Listing 7-46 is a real-world example from OpenSSL 0.9.7. The root cause of the problem is the CRYPTO_realloc_clean() function.
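Before turning to the OpenSSL case, here is a minimal sketch of one way the pattern in Listing 7-45 can be avoided: every case only sets the result, and the buffer is released at a single point. The keyword codes, get_keyword_stub(), counted_free(), and handle_line() are hypothetical stand-ins, since the listing does not show its helper functions, and the network read is replaced by a string argument.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical keyword codes and stub parser. */
enum { KW_UNKNOWN, KW_USERNAME, KW_PASSWORD };

static int get_keyword_stub(const char *data)
{
    if (strncmp(data, "USER ", 5) == 0) return KW_USERNAME;
    if (strncmp(data, "PASS ", 5) == 0) return KW_PASSWORD;
    return KW_UNKNOWN;
}

static int free_calls;  /* counts releases so the sketch can be checked */

static void counted_free(void *p)
{
    free_calls++;
    free(p);
}

/* Copies the line to the heap, dispatches on its keyword, and releases
 * the copy exactly once on every path that allocated it. */
static int handle_line(const char *line)
{
    int success;
    size_t length;
    char *data;

    assert(line != NULL);
    length = strlen(line);
    data = malloc(length + 1);
    if (data == NULL)
        return 1;
    memcpy(data, line, length + 1);

    switch (get_keyword_stub(data)) {
    case KW_USERNAME:
    case KW_PASSWORD:
        success = 0;      /* real code would record or authenticate here */
        break;
    default:
        success = -1;     /* error path sets the result but does not free */
        break;
    }

    counted_free(data);   /* the only release of data */
    return success;
}
```

Calling handle_line() with an unrecognized keyword returns -1 and increments free_calls by one, the same as the success paths.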

Listing 7-46. Double-Free Vulnerability in OpenSSL

void *CRYPTO_realloc_clean(void *str, int old_len, int num, const char *file,
               int line)
    {
    void *ret = NULL;

    if (str == NULL)
        return CRYPTO_malloc(num, file, line);

    if (num < 0) return NULL;

    if (realloc_debug_func != NULL)
        realloc_debug_func(str, NULL, num, file, line, 0);
    ret=malloc_ex_func(num,file,line);
    if(ret)
        memcpy(ret,str,old_len);
    OPENSSL_cleanse(str,old_len);
    free_func(str);
    ...
    return ret;
    }

As you can see, the CRYPTO_realloc_clean() function frees the str parameter passed to it, whether it succeeds or fails. This interface is quite unintuitive and can easily lead to double-free errors. The CRYPTO_realloc_clean() function is used internally in a buffer-management routine, BUF_MEM_grow_clean(), which is shown in the following code:

int BUF_MEM_grow_clean(BUF_MEM *str, int len)
    {
    char *ret;
    unsigned int n;

    if (str->length >= len)
        {
        memset(&str->data[len],0,str->length-len);
        str->length=len;
        return(len);
        }
    if (str->max >= len)
        {
        memset(&str->data[str->length],0,len-str->length);
        str->length=len;
        return(len);
        }
    n=(len+3)/3*4;
    if (str->data == NULL)
        ret=OPENSSL_malloc(n);
    else
        ret=OPENSSL_realloc_clean(str->data,str->max,n);
    if (ret == NULL)
        {
        BUFerr(BUF_F_BUF_MEM_GROW,ERR_R_MALLOC_FAILURE);
        len=0;
        }
    else
        {
        str->data=ret;
        str->max=n;
        memset(&str->data[str->length],0,len-str->length);
        str->length=len;
        }
    return(len);
    }

As a result of calling OPENSSL_realloc_clean(), the BUF_MEM_grow_clean() function might actually free its own data element. However, it doesn't set data to NULL when this reallocation failure occurs. This quirky behavior makes a double-free error likely in functions that use BUF_MEM structures. Take a look at this call in asn1_collate_primitive():

        if (d2i_ASN1_bytes(&os,&c->p,c->max-c->p, c->tag,c->xclass)
            == NULL)
            {
            c->error=ERR_R_ASN1_LIB;
            goto err;
            }

        if (!BUF_MEM_grow_clean(&b,num+os->length))
            {
            c->error=ERR_R_BUF_LIB;
            goto err;
            }
    ...
err:
    ASN1err(ASN1_F_ASN1_COLLATE_PRIMITIVE,c->error);
    if (os != NULL) ASN1_STRING_free(os);
    if (b.data != NULL) OPENSSL_free(b.data);
    return(0);
    }

This function attempts to grow the BUF_MEM structure b, but when an error is returned, it frees any resources it has and returns 0. As you know now, if BUF_MEM_grow_clean() fails because of a failure in CRYPTO_realloc_clean(), it frees b.data but doesn't set it to NULL. Therefore, the OPENSSL_free(b.data) call at the err label frees b.data a second time.
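One defensive option is for the grow routine to clear its stale pointer whenever the underlying reallocation has already released the old block. The following is a minimal sketch of that idea, not the OpenSSL fix: struct simple_buf, realloc_clean_model(), simple_buf_grow(), and the force_failure hook are all hypothetical, and a real implementation would wipe memory with a cleanse routine the compiler cannot optimize away rather than plain memset().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a BUF_MEM-like structure. */
struct simple_buf {
    char  *data;
    size_t length;
    size_t max;
};

static int force_failure;  /* test hook: simulates allocation failure */

/* Models the interface described above: the old block is always
 * released, whether or not the new allocation succeeds.
 * Assumes old_len <= new_len. */
static void *realloc_clean_model(void *old, size_t old_len, size_t new_len)
{
    void *ret = force_failure ? NULL : calloc(1, new_len);

    if (ret != NULL)
        memcpy(ret, old, old_len);
    memset(old, 0, old_len);
    free(old);
    return ret;
}

/* Grows the buffer to len bytes. Returns 1 on success, 0 on failure.
 * On failure the stale pointer is cleared, so a caller's cleanup code
 * cannot free it a second time. */
static int simple_buf_grow(struct simple_buf *b, size_t len)
{
    char *ret;

    assert(b != NULL);
    if (len <= b->max) {
        b->length = len;
        return 1;
    }
    if (b->data == NULL)
        ret = force_failure ? NULL : calloc(1, len);
    else
        ret = realloc_clean_model(b->data, b->max, len);

    if (ret == NULL) {
        b->data = NULL;   /* the old block is already gone: forget it */
        b->length = 0;
        b->max = 0;
        return 0;
    }
    b->data = ret;
    b->max = len;
    b->length = len;
    return 1;
}
```

With this arrangement, cleanup code of the form "if (b.data != NULL) free(b.data)" stays correct after a failed grow, because b.data is NULL by then.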

Code auditors should be especially aware of double-frees when auditing C++ code. Sometimes keeping track of an object's internal state is difficult, and unexpected states could lead to double-frees. Be mindful of members that are freed in more than one member function in an object (such as a regular member function and the destructor), and attempt to determine whether the class is ever used in such a way that an object can be destructed when some member variables have already been freed.

Double-free errors can crop up in other ways. Many operating systems' reallocation routines free a buffer that they're supposed to reallocate if the new size for the buffer is 0. This is true on most UNIX implementations. Therefore, if an attacker can cause a call to realloc() with a new size of 0, that same buffer might be freed again later; there's a good chance the buffer that was just freed will be written into. Listing 7-47 shows a simple example.

Listing 7-47. Reallocation Double-Free Vulnerability

#define ROUNDUP(x) (((x)+15) & 0xFFFFFFF0)

int buffer_grow(buffer *buf, unsigned int size)
{
    char *data;
    unsigned int new_size = size + buf->used;

    if(new_size < size)
        return 1;            /* integer overflow */
    data = (char *)realloc(buf->data, ROUNDUP(new_size));
    if(!data)
        return 1;
    buf->data = data;
    buf->size = new_size;
    return 0;
}

int buffer_free(buffer *buf)
{
    free(buf->data);
    free(buf);
    return 0;
}

buffer *buffer_new(void)
{
    buffer *buf;

    buf = calloc(1, sizeof(buffer));
    if(!buf)
        return NULL;
    buf->data = (char *)malloc(1024);
    if(!buf->data){
        free(buf);
        return NULL;
    }
    return buf;
}

This code shows some typical buffer-management routines. From what you have learned about allocation routines, you can identify a couple of interesting characteristics of buffer_grow(). First, it checks for integer overflow when computing the new size, but the rounding is performed after that check. Therefore, whenever size and buf->used are added together and give a result between 0xFFFFFFF1 and 0xFFFFFFFF, the roundup causes an integer overflow, and the value 0 is passed to realloc(). Second, notice that if realloc() fails, buf->data isn't set to a NULL pointer. This is important because when realloc() frees a buffer because of a 0-length parameter, it returns NULL. The following code shows some potential implications:

int process_login(int sockfd)
{
    int length;
    buffer *buf;

    buf = buffer_new();
    length = read_integer(sockfd);
    if(buffer_grow(buf, length) != 0){   /* buffer_grow() returns 1 on failure */
        buffer_free(buf);
        return 1;
    }
    ... read data into the buffer ...
    return 0;
}

The process_login() function attempts to increase the buffer enough to store subsequent data. If the supplied length is large enough to make the integer wrap, the buf->data member is freed twice: once during buffer_grow() when a size of 0 is passed to realloc(), and once more in buffer_free(). This example spans multiple functions for a specific reason: bugs of this nature are often spread out in this way and are less obvious. This bug would be easy to miss if you didn't pay careful attention to how buffer_grow() works (to notice the integer overflow) and to the nuances of how realloc() works.
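A hedged sketch of how buffer_grow() might be hardened follows. It validates the rounded size rather than the unrounded one and refuses a zero-size request, so realloc() is never asked to release the block. The buffer layout shown is an assumption, because Listing 7-47 does not include the structure definition, and buffer_grow_checked() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed layout; the listing does not show the real definition. */
typedef struct {
    char    *data;
    uint32_t used;
    uint32_t size;
} buffer;

#define ROUNDUP(x) (((x) + 15u) & 0xFFFFFFF0u)

/* Returns 0 on success, 1 on failure. On failure buf is unchanged and
 * buf->data is still a live allocation owned by the caller. */
static int buffer_grow_checked(buffer *buf, uint32_t size)
{
    uint32_t new_size, rounded;
    char *data;

    assert(buf != NULL);
    new_size = size + buf->used;
    if (new_size < size)
        return 1;                 /* the addition wrapped */

    rounded = ROUNDUP(new_size);
    if (rounded < new_size || rounded == 0)
        return 1;                 /* rounding wrapped, or zero-size request */

    data = realloc(buf->data, rounded);
    if (data == NULL)
        return 1;                 /* nonzero size: old block is untouched */

    buf->data = data;
    buf->size = new_size;
    return 0;
}
```

Because every failure path returns before realloc() can release anything, a caller that reacts to the error by calling a routine such as buffer_free() releases buf->data exactly once.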