Understanding Why Format Strings Are a Problem

Suppose a programmer wants to use printf or one of its related functions to write out the contents of a buffer named szUntrustedInputBuffer. The most obvious way to do it is this:

 printf(szUntrustedInputBuffer); 

Another way to accomplish the task is this:

 printf("%s", szUntrustedInputBuffer); 

Both of the preceding printf statements accomplish the task. Which of the two is better? The first is much easier and more obvious to code, but consider this: the printf function in its compiled form doesn't know how many parameters it was actually given. Why might that matter? The answer requires a closer look at how format string specifiers work with the stack.

Anatomy of a printf Call

When analyzing the printf stack usage, remember that arguments are placed on the stack from last to first in C. Consider the following code:

 printf("%s", szUntrusted); 

The code translates into assembly that looks roughly equivalent to the following instructions:

 push address of "Contents of szUntrusted"
 push address of "%s"
 call printf

Once the two parameters are pushed onto the stack and the call instruction is processed, the stack looks like the following. (Note this stack is the reverse of the way the stack appeared in the preceding chapter.)

image from book

When more parameters are in the printf call, they are simply pushed onto the stack sooner. For example, look at the following:

 printf("%s ate %d cheeseburgers.", "Chris Gallagher", 1000); 

The stack would look comparable to the following:

image from book

Misinterpreting the Stack

What does printf use at run time to determine how the stack is arranged? Unlike most ordinary functions, printf uses the content of the first parameter (which is the first parameter it pulls off the stack) to interpret what it sees on the stack. Therefore, the content referenced by one stack parameter can dictate the number of parameters and whether each parameter is interpreted as a value or a reference. This behavior makes format string specifiers especially useful for attackers who can inject content into the first parameter of the printf function.

Important  

Format string attacks happen when attackers can inject content into the first parameter of the printf function. By controlling the first parameter of the printf function, the attacker can control the interpretation of the stack by the printf function.
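The idea that the format string alone drives how the arguments are interpreted can be sketched with a tiny variadic function. The name mini_format and its behavior are hypothetical and are not how any real C library implements printf; it handles only %d, %s, and %%. The point to notice is that nothing other than the characters in fmt decides how many arguments are fetched and whether each one is read as an int or as a pointer.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mini formatter (illustration only, not the real printf).
 * Supports only %d, %s and %%. Returns the number of characters written,
 * or -1 on an unsupported specifier or if the output buffer is too small. */
int mini_format(char *out, size_t out_size, const char *fmt, ...)
{
    va_list args;
    size_t used = 0;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    va_start(args, fmt);
    for (const char *p = fmt; *p != '\0'; ++p) {
        char piece[32];
        const char *text = piece;
        size_t len;

        if (*p != '%') {
            /* Ordinary character: copied as-is, no argument consumed. */
            piece[0] = *p;
            piece[1] = '\0';
        } else if (p[1] == 'd') {
            /* The format says "int", so the next argument is read as an int. */
            snprintf(piece, sizeof piece, "%d", va_arg(args, int));
            ++p;
        } else if (p[1] == 's') {
            /* The format says "string", so the next argument is read as a pointer. */
            text = va_arg(args, const char *);
            ++p;
        } else if (p[1] == '%') {
            piece[0] = '%';
            piece[1] = '\0';
            ++p;
        } else {
            va_end(args);
            return -1; /* specifier not supported by this sketch */
        }

        len = strlen(text);
        if (used + len >= out_size) {
            va_end(args);
            return -1; /* output buffer too small */
        }
        memcpy(out + used, text, len + 1);
        used += len;
    }
    va_end(args);
    return (int)used;
}
```

Calling mini_format(buf, sizeof buf, "%s ate %d cheeseburgers.", "Chris Gallagher", 1000) fetches exactly two arguments because the format contains exactly two specifiers. If the format came from an attacker instead, the function would have no way to tell that the arguments it is fetching were never passed.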

You can gain a lot of insight into how this works and how specifying different format string specifiers can affect the stack by comparing how printf views the stack and how the stack is actually configured. The following comparison focuses on this simple case:

 printf(szUntrustedInputBuffer); 

The printf function expects the stack to look as follows:

image from book

For the basic case where szUntrustedInputBuffer references a string with no format specifiers, the stack is actually constructed the way printf expects it to be.

Remember that szUntrustedInputBuffer is the first parameter to printf, which means printf will interpret it as a format. What happens when untrustworthy input data in szUntrustedInputBuffer includes format string specifiers the programmer didn't anticipate?

For the case when szUntrustedInputBuffer contains a single %s format specifier, the printf function expects the stack to be laid out differently:

image from book

The net result is that the call to printf takes what is referenced by the last sizeof(char*) bytes of what precedes it on the stack, interprets it as a null-terminated string, and copies it to the output.

For the case when szUntrustedInputBuffer contains both the %d and %s format specifiers (in that order), the printf function expects the stack to be laid out differently yet:

image from book

If the input data specifies %d%d%s, the %s references an item still farther back on the stack:

image from book
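The cases above differ only in how many arguments printf believes it was given. A rough way to see that number for a given input is to count the conversions in the string. The helper below, count_expected_args, is a hypothetical name introduced for illustration, and it is deliberately simplified: it treats every % that is not part of %% as consuming one argument and ignores details such as * widths, which consume additional arguments in the real function.

```c
#include <stddef.h>

/* Hypothetical helper (not from the original text): estimates how many
 * arguments a printf-style function would try to consume for `fmt`.
 * Simplification: each '%' that is not part of "%%" counts as one
 * argument; '*' widths and precisions are not accounted for. */
int count_expected_args(const char *fmt)
{
    int count = 0;

    for (size_t i = 0; fmt[i] != '\0'; ++i) {
        if (fmt[i] != '%')
            continue;
        if (fmt[i + 1] == '%') {
            ++i;        /* "%%" is a literal percent sign */
        } else if (fmt[i + 1] != '\0') {
            ++count;    /* start of a conversion such as %d or %s */
        }
    }
    return count;
}
```

When printf(szUntrustedInputBuffer) is called, zero additional arguments are passed, so any nonzero result from a check like this means printf would read stack contents that were never intended as its parameters.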

When you look back at the last three examples, it becomes apparent that the contents of szUntrustedInputBuffer determine what memory address printf expects to use as a reference to fill in the %s value in the output. Suppose an attacker wants the printf call to use a value the attacker knows is on the stack, but not at the top of the stack. The attacker could get the desired data to the top of the stack by removing (popping) values off the stack with the necessary number of format string specifiers. If an attacker knows the correct offset to where something interesting is on the stack, the attacker can compute the number of %d and other format string specifiers to inject so that the referenced value appears in the output.
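For testing, inputs of this shape can be generated mechanically. The following is a minimal sketch, assuming a hypothetical helper named build_probe that writes a chosen number of %d specifiers followed by one %s into a caller-supplied buffer. The right count for a real target depends on the platform, the calling convention, and the sizes of the values involved, so treat the output as a family of candidate test inputs to try in a test environment, not as a formula.

```c
#include <string.h>

/* Hypothetical test-input builder (not from the original text).
 * Writes `skip` copies of "%d" followed by a single "%s" into `out`.
 * Returns the length of the resulting string, or -1 if it does not fit. */
int build_probe(char *out, size_t out_size, unsigned skip)
{
    size_t needed = (size_t)skip * 2 + 2;   /* skip * "%d" plus "%s" */

    if (needed + 1 > out_size)              /* +1 for the terminating NUL */
        return -1;

    for (unsigned i = 0; i < skip; ++i)
        memcpy(out + (size_t)i * 2, "%d", 2);

    memcpy(out + (size_t)skip * 2, "%s", 3); /* 3 bytes: '%', 's', NUL */
    return (int)needed;
}
```

With skip set to 0, 1, and 2, the helper produces %s, %d%s, and %d%d%s, which are the three inputs discussed in the preceding examples.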

Overwriting Memory

It turns out there is another format string specifier, %n, that does something quite different from %d and %s. Unlike the other format specifiers, %n causes information to be written to a place in memory specified on the stack. When printf sees %n in the format string, it considers the associated parameter to reference an integer, so it writes the number of characters formatted so far to the address designated by the parameter.

How does that work? Suppose you have the following code:

 int NumberWritten = 0;
 printf("Soda%n", &NumberWritten);

NumberWritten would be 4, one for each of the letters in the word Soda. Similarly, consider this:

 printf("So%nda", &NumberWritten); 

In this case, what would NumberWritten be set to? Two. How about this?

 printf("%d%s%n", 1000, " hamburgers!", &NumberWritten); 

If you counted 16, you are correct.
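The three counts can be checked with a short sketch. The function names are hypothetical, and snprintf into a scratch buffer is used instead of printf so that nothing is written to the screen; the %n behavior is the same. One caveat: some C runtimes disable or restrict %n as a hardening measure (for example, the Microsoft C runtime disables it by default), so this sketch assumes a runtime where %n is permitted in a literal format string.

```c
#include <stdio.h>

/* Hypothetical helpers (not from the original text). Each one formats
 * into a scratch buffer and returns the value that %n stored. */

int count_soda(void)
{
    char out[32];
    int NumberWritten = 0;

    snprintf(out, sizeof out, "Soda%n", &NumberWritten);
    return NumberWritten;   /* characters produced before %n */
}

int count_so_nda(void)
{
    char out[32];
    int NumberWritten = 0;

    /* %n sits in the middle, so only "So" has been produced when it runs. */
    snprintf(out, sizeof out, "So%nda", &NumberWritten);
    return NumberWritten;
}

int count_hamburgers(void)
{
    char out[32];
    int NumberWritten = 0;

    /* "1000" is 4 characters and " hamburgers!" is 12 characters. */
    snprintf(out, sizeof out, "%d%s%n", 1000, " hamburgers!", &NumberWritten);
    return NumberWritten;
}
```

In each function the programmer supplies the address that %n writes to. The danger discussed next arises when the format string, and therefore the choice of which stack value is treated as that address, comes from the attacker.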

Remember from the earlier discussion that malicious input data can cause printf to misinterpret the stack. It turns out that by specifying the correct input format string, an attacker can trick printf into using another attacker-specified value that is also on the stack as the parameter for %n. The result is that when printf processes the %n, it writes the number of characters output so far to a memory address of the attacker's choosing.

Let's return to the basic printf function:

 printf(szUntrustedInputBuffer); 

It is worth a quick look at what happens to the stack when the input data is %d%d%d%d%n:

image from book

Important  

If you control the format string passed to a function that relies on format string specifiers to interpret how the stack is arranged, you can cause that function to read memory and to write values you control anywhere the program itself can write. At that point, you have full control of the program.

Given the basics of how format string bugs work, you must prioritize them as important vulnerabilities and focus on them appropriately during testing. Obviously, testing and finding the bugs are a great first step. This chapter also includes a walkthrough that provides more details and information on countering these real-world problems.
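To tie this back to the two printf forms compared at the start of the section, here is a minimal sketch of the second, safer form. The helper names format_untrusted and output_matches_input are hypothetical, and the sketch formats into a buffer so the result can be inspected. Because the untrusted text is passed as an argument to a fixed "%s" format, any specifiers it contains are copied to the output as ordinary characters and are never interpreted.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original text): formats untrusted
 * text through a fixed "%s" format. Returns the snprintf result, which is
 * the length the full output needs (excluding the terminating NUL). */
int format_untrusted(char *out, size_t out_size,
                     const char *szUntrustedInputBuffer)
{
    return snprintf(out, out_size, "%s", szUntrustedInputBuffer);
}

/* Returns 1 if the formatted output is identical to the input, which is
 * what the "%s" form produces when the scratch buffer is large enough. */
int output_matches_input(const char *szUntrustedInputBuffer)
{
    char out[128];
    int n = format_untrusted(out, sizeof out, szUntrustedInputBuffer);

    return n >= 0
        && (size_t)n < sizeof out
        && strcmp(out, szUntrustedInputBuffer) == 0;
}
```

An input such as %d%d%d%d%n comes back unchanged from format_untrusted, whereas passing the same input directly as the format string would trigger the stack misinterpretation described in this section.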



Hunting Security Bugs
ISBN: 073562187X
EAN: 2147483647
Year: 2004
Pages: 156