Chapter 15: The Art of Fuzzing | Hacking Ubuntu: Serious Hacks Mods and Customizations (ExtremeTech)

General Theory of Fuzzing

One method of fuzzing involves the technique of fault injection (we have dedicated all of Chapter 14 to fault injection). In the software security world, fault injection usually means sending bad data into an application by directly manipulating various API calls within it, usually with some form of debugger or library call interceptor. For example, you could randomly make the malloc() call return NULL (meaning failure), or have every getenv() call return a long string. Most papers and books on the subject talk about instrumenting the executable and then injecting hypothesized anomalies into it. Basically, they make malloc() return zero and then use Venn diagrams to discuss the statistical value of this event. The whole process makes more sense when you're thinking about hardware failures, which do occur randomly. But the types of bugs we're looking for are anything but random events. In terms of finding security bugs, instrumentation is valuable, but usually only when combined with a decent fuzzer, at which point it becomes runtime analysis.

One rather lame but effective example of fault-injection-style fuzzing is sharefuzz, a tool available from www.immunitysec.com. It is a shared library for Solaris or Linux that allows you to test for common local buffer overflows in setuid programs. How often have you seen an advisory that says "TERM=`perl -e 'print "A" x 5000'` ./setuid.binary gets you root!" Well, sharefuzz was designed to render these advisories (even more) pointless by making the process of discovering them completely automatic. To a large extent, it succeeded. During its first week of use, sharefuzz discovered the libsldap.so vulnerability in Solaris, although this was never reported to Sun at the time. The vulnerability was later reported to Sun by another security researcher.

Let's take a closer look at sharefuzz in order to understand its internals.

    /* sharefuzz.c - a fuzzer originally designed for local fuzzing
       but equally good against all sorts of other clib functions.
       Load with LD_PRELOAD on most systems.

       LICENSE: GPLv2 */

    #include <stdio.h>
    #include <stdlib.h>      /* malloc */
    #include <string.h>      /* strcmp, memset */
    #include <sys/select.h>

    /* defines */
    /* #define DOLOCALE */   /* LOCALE FUZZING */
    #define SIZE 11500       /* size of our returned environment */
    #define FUZCHAR 0x41     /* our fuzzer character */

    static char *stuff;
    static char *stuff2;
    static char display[] = "localhost:0"; /* display to return when asked */
    static char mypath[] = "/usr/bin:/usr/sbin:/bin:/sbin";
    static char ld_preload[] = "";

    int select(int n, fd_set *readfds, fd_set *writefds,
               fd_set *exceptfds, struct timeval *timeout)
    {
        printf("SELECT CALLED!\n");
        return 0;            /* the original fell off the end without a value */
    }

    /* Identity calls: log each call and pretend to be an ordinary user. */
    int getuid()
    {
        printf("***getuid!\n");
        return 501;
    }

    int geteuid()
    {
        printf("***geteuid\n");
        return 501;
    }

    int getgid()
    {
        printf("getgid\n");
        return 501;
    }

    int getegid()
    {
        printf("getegid\n");
        return 501;
    }

    int getgid32()
    {
        printf("***getgid32\n");
        return 501;
    }

    int getegid32()
    {
        printf("***getegid32\n");
        return 501;
    }

    /* Getenv fuzzing - modify this as needed to suit your
       particular fuzzing needs */
    char *getenv(const char *environment)
    {
        fprintf(stderr, "GETENV: %s\n", environment);
        fflush(0);

        /* sometimes you don't want to mess with this stuff */
        if (!strcmp(environment, "DISPLAY"))
            return display;

    #if 0
        if (!strcmp(environment, "PATH"))
        {
            return NULL;
            return mypath;
        }
    #endif

    #if 0
        if (!strcmp(environment, "HOME"))
            return "/home/dave";
        if (!strcmp(environment, "LD_PRELOAD"))
            return NULL;
        if (!strcmp(environment, "LOGNAME"))
            return NULL;
        if (!strcmp(environment, "ORGMAIL"))
        {
            fprintf(stderr, "ORGMAIL=%s\n", stuff2);
            return "ASDFASDFsd";
        }
        if (!strcmp(environment, "TZ"))
            return NULL;
    #endif

        fprintf(stderr, "continued to return default\n");
        /* sleep(1); */
        /* return NULL when you don't want to destroy the environment */
        /* return NULL; */
        /* return stuff when you want to return long strings as each variable */
        fflush(0);
        return stuff;
    }

    /* Swallow every attempt by the target to change its environment. */
    int putenv(char *string)
    {
        fprintf(stderr, "putenv %s\n", string);
        return 0;
    }

    int clearenv()
    {
        fprintf(stderr, "clearenv \n");
        return 0;
    }

    int unsetenv(const char *string)
    {
        fprintf(stderr, "unsetenv %s\n", string);
        return 0;
    }

    /* Runs when the library is loaded. With GCC, supplying your own _init
       generally means linking with -nostartfiles. */
    void _init(void)
    {
        stuff = malloc(SIZE);
        stuff2 = malloc(SIZE);
        printf("shared library loader working\n");
        memset(stuff, FUZCHAR, SIZE - 1);
        stuff[SIZE - 1] = 0;
        memset(stuff2, FUZCHAR, SIZE - 1);
        stuff2[1] = 0;       /* as in the original: stuff2 becomes a one-character string */
        /* system("/bin/sh"); */
    }

This program is compiled into a shared library, and then loaded by using LD_PRELOAD (on systems that support it). When loaded, sharefuzz will override the getenv() call and always return a long string. You can set DISPLAY to a valid X Windows display in order to test programs that need to put up a window on the screen.
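As a rough illustration of that workflow, the following Python sketch runs a target once with LD_PRELOAD set and reports whether the process was killed by a signal. The function name run_preloaded and the library path in the usage note are hypothetical, not part of sharefuzz; the sketch only assumes a Linux-like system where LD_PRELOAD is honored for the target.

```python
import os
import signal
import subprocess


def run_preloaded(argv, preload=None, timeout=10):
    """Run argv once, optionally with LD_PRELOAD set, and report how it ended.

    Returns (signal_name_or_None, stderr_text). subprocess reports a child
    killed by signal N as return code -N, which is how a crash is detected.
    """
    env = dict(os.environ)
    if preload is not None:
        # Hypothetical path to the compiled shared library.
        env["LD_PRELOAD"] = os.path.abspath(preload)

    proc = subprocess.run(
        argv,
        env=env,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        timeout=timeout,
    )

    killed_by = None
    if proc.returncode < 0:
        try:
            killed_by = signal.Signals(-proc.returncode).name
        except ValueError:
            # Signal number without a symbolic name on this platform.
            killed_by = str(-proc.returncode)
    return killed_by, proc.stderr.decode("utf-8", "replace")
```

A call such as run_preloaded(["./target"], preload="./sharefuzz.so") would then return the signal name (for example SIGSEGV) together with everything the library logged to stderr, which is worth saving for every run that ends abnormally.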

Ignoring the fact that, in the strictest sense, sharefuzz is an "instrumenting fault injector," we'll briefly go over the process of using sharefuzz. Although sharefuzz is a very limited fuzzer, it clearly illustrates many of the strengths and weaknesses of more advanced fuzzers such as SPIKE, which will be discussed later in this chapter.

Root and commercial fuzzers

Of course, to use LD_PRELOAD on a setuid program, you must be logged in as root, which somewhat changes a fuzzer's behavior. Don't forget that some programs will not drop core, so you probably want to attach to them with gdb. As with any fuzzing process, any unexpected behavior during your fuzz session should be noted and examined later for clues to potential bugs. There are still default setuid Solaris programs that will fall to sharefuzz. We leave finding these to the reader's next lazy afternoon.

For a more polished example of a fuzzer like sharefuzz, but for Windows applications, check out Holodeck (www.sisecure.com/holodeck/). In general, though, fuzzers of this nature (also known as fault injectors) access the program at too primitive a layer to be truly useful for security testing. They leave most questions about the reachability of bugs unanswered and have many problems with false positives. Holodeck costs $5,000 per license; don't waste your money.

Static Analysis versus Fuzzing

Unlike static analysis (of either binaries or source code), a fuzzer that "finds" a security hole has typically also given the user the set of inputs that triggered it. For example, when a process crashes under sharefuzz, we can get a printout that describes which environment variables sharefuzz was fuzzing at the time and exactly which variables might have crashed it. Then we can test each of these manually to see which one caused the overflow.
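One way to turn that printout into a list of suspects is sketched below. It assumes the stderr of the crashed run was captured as text and relies only on the "GETENV: name" lines that the sharefuzz listing prints; the function name crash_candidates and the passthrough parameter are illustrative. DISPLAY is excluded by default because the listing answers it with a fixed value instead of the long string.

```python
import re

# sharefuzz logs every lookup as "GETENV: <name>" on stderr.
GETENV_LINE = re.compile(r"^GETENV: (.*)$")


def crash_candidates(stderr_text, passthrough=("DISPLAY",)):
    """Return the fuzzed variable names seen in the log, in first-seen order.

    Names listed in passthrough were answered normally by the library,
    so they are not candidates for the overflow.
    """
    candidates = []
    for line in stderr_text.splitlines():
        match = GETENV_LINE.match(line)
        if not match:
            continue
        name = match.group(1)
        if name in passthrough or name in candidates:
            continue
        candidates.append(name)
    return candidates
```

Each name returned can then be set to a long string by hand, one at a time, with the library unloaded, to confirm which variable actually triggers the crash.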

Under static analysis, you tend to find an enormous wealth of bugs that may or may not be reachable by input sent to the application externally. Tracking down each bug found during a static analysis session to see if it can actually be triggered is not an efficient or scalable process.

On the other hand, sometimes a fuzzer will find a bug that is not easily reproducible. Double free bugs, or other bugs that require two events to happen in a row, are a good example. This is why most fuzzers send pseudo-random input to their targets and allow for the pseudo-random seed value to be specified by the user in order to replicate a successful session. This mechanism allows a fuzzer to explore a large space by attempting random values, but also allows this process to be completely duplicated later when trying to narrow in on a specific bug.
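The seed mechanism can be sketched in a few lines of Python. This is a minimal illustration, not the design of any particular fuzzer: the field names are placeholders, and the small FUZZ_STRINGS list reuses the two strings quoted in this chapter plus an empty string added for illustration.

```python
import random

# Two strings taken from this chapter, plus an empty string for illustration.
FUZZ_STRINGS = ["A" * 5000, "../" * 5000, ""]


def fuzz_session(seed, fields, rounds):
    """Return the list of (field, payload) test cases for one session.

    A private random.Random(seed) drives every choice, so the same seed
    always produces the same sequence of test cases.
    """
    rng = random.Random(seed)
    cases = []
    for _ in range(rounds):
        field = rng.choice(fields)
        payload = rng.choice(FUZZ_STRINGS)
        cases.append((field, payload))
    return cases
```

If the target falls over during a session started with a given seed, rerunning fuzz_session with that seed replays the identical sequence, including any earlier test cases that set up the state for a two-step bug such as a double free.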

Fuzzing Is Scalable

Static analysis is a very involved, very labor-intensive process. Because static analysis does not determine the reachability of any given bug, a security researcher is left tracing each and every bug to examine it for exploitability. This process does not port to other instances of the program. A bug's exploitability can depend on many things, including program configuration, compiler options, machine architecture, or a number of other variables. In addition, a bug reachable in one version of the program may be completely unreachable in another. But almost inevitably, an exploitable bug will cause an access violation or some other detectable corruption. As hackers, we're typically not interested in non-exploitable bugs or bugs that cannot be reached. Therefore, a fuzzer is perfect for our needs.

We say fuzzing is scalable because a fuzzer built to test SMTP can test any number of SMTP servers (or configurations of the same server), and it will most likely find similar bugs in all of them, if the bugs are present and reachable. This quality makes a good fuzzer worth its weight in gold when you are trying to attack a new system that runs services similar to other systems you have already attacked.

Another reason we say fuzzing is scalable is because the strings with which you locate bugs in one protocol will be similar to strings with which you locate bugs in other protocols. Let's look, for example, at the directory traversal string written in Python.

    print("../" * 5000)

While this string is used to find bugs that will let you pull arbitrary files from particular servers (Web CGI programs, for example), it also exhibits a very interesting bug in modern versions of HelixServer (also known as RealServer). The bug is similar to the following C code snippet, which stores pointers to each directory in a buffer on the stack.

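    extern char *instring;   /* the request path; assumed to be declared elsewhere */

    void example()
    {
        char *ptrs[1024];
        char *c;
        char **p;

        /* store a pointer to every '/' in the input; p is never bounds-checked */
        for (p = ptrs, c = instring; *c != 0; c++)
        {
            if (*c == '/') {
                *p = c;
                p++;
            }
        }
    }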

At the end of this function, we should have a set of pointers to each level in the directory. However, if we have more than 1,024 slashes, we have overwritten the saved frame pointer and the stored return address with pointers into our string. This makes for a great offsetless exploit. In addition, this is one of the few vulnerabilities for which it is useful to write a multiple-architecture shellcode, since no return address is needed and RealServer is available for Linux, Windows, and FreeBSD.
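The arithmetic behind that overflow is easy to check. The sketch below models only the counting done by the C loop (one pointer stored per slash) against the 1,024-entry array; the function names and the pointer_size parameter are illustrative, and the 4-byte default assumes a 32-bit target.

```python
PTRS_CAPACITY = 1024  # number of entries in ptrs[] in the C snippet


def slots_written(path):
    """Number of pointers the loop stores: one for every '/' in the path."""
    return path.count("/")


def slots_past_end(path, capacity=PTRS_CAPACITY):
    """Number of stores that land beyond the end of the array (0 if it fits)."""
    return max(0, slots_written(path) - capacity)


def bytes_past_end(path, pointer_size=4, capacity=PTRS_CAPACITY):
    """Bytes of stack overwritten past the array, for a given pointer size."""
    return slots_past_end(path, capacity) * pointer_size
```

For the traversal string "../" * 5000, the loop stores 5,000 pointers, so 3,976 of them (15,904 bytes with 4-byte pointers) are written beyond the array, and every one of them points back into the attacker's string.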

This particular bug is in the registry code in RealServer. But the fuzzer doesn't need to know that the registry code looks at every URL passed into the handler. All it needs to know is that it will replace every string it sees with a large set of strings it has internally, building on prior knowledge in a beautifully effective way.
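That replacement strategy can be sketched as a small generator. This is a simplified illustration and not the way SPIKE or any other specific fuzzer is implemented: the request is assumed to be already split into string tokens, and mutate_request is a hypothetical name.

```python
def mutate_request(tokens, fuzz_strings):
    """Yield one test case per (token position, fuzz string) pair.

    Each test case is a copy of tokens with exactly one token replaced,
    so the rest of the request stays valid enough to reach deeper code.
    """
    for index in range(len(tokens)):
        for fuzz in fuzz_strings:
            case = list(tokens)
            case[index] = fuzz
            yield case
```

Given the tokens ["GET", "/index.html", "HTTP/1.0"] and a fuzz list containing "../" * 5000, one of the generated requests carries the traversal string in the URL position. The fuzzer never needs to know which code path that URL ends up exercising.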

It's important to note that a large part of building a new fuzzer is going back to old vulnerabilities and testing whether your fuzzer can detect them, and then abstracting the test as far as possible. In this way, you can detect future and unknown vulnerabilities in the same "class" without having to specifically code a test aimed at triggering them. Your personal taste will decide how far you abstract your fuzzer. This gives each fuzzer a personality, as parts of them are abstracted to different levels, and this is part of what differentiates the results of each fuzzer.