Common Architectural Failures | Hacking Ubuntu: Serious Hacks Mods and Customizations (ExtremeTech)

Bypassing Input Validation and Attack Detection

Understanding input validation and knowing how to bypass it are essential skills for the bug hunter. We'll give you a brief overview of the subject to help you understand where mistakes are made and provide you with some useful validation bypass techniques.

Stripping Bad Data

People often use flawed regular expressions to try to limit (or detect) potential attacks. One common application is to strip out input that is known to be bad. If you are defending against SQL Injection you might, for example, write a filter that strips out SQL reserved words such as select, union, where, from, and so on.

For instance, the input string

 ' union select name, password from sys.user$

might become

 ' name, password sys.user$

The stripped string is no longer valid SQL, so it produces an error. Sometimes you can bypass this kind of filter by recursively including bad data within itself, like this:

 ' uniunionon selselectect name, password frfromom sys.user$

Each bad term is included within itself. As the values are stripped out, the enclosing bad data is reconstituted, leaving us with precisely the data we wanted. Obviously this only works when the known bad terms are composed of at least two distinct characters.
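The reconstitution is easy to reproduce. The following is a minimal sketch, assuming a filter that deletes blacklisted words in a single pass; the function names and the word list are hypothetical and are only meant to mirror the example above.

```python
import re

# Hypothetical blacklist, taken from the reserved words named in the text.
BAD_WORDS = re.compile(r"union|select|where|from", re.IGNORECASE)

def strip_once(value: str) -> str:
    """Naive filter: delete every blacklisted word in a single pass."""
    return BAD_WORDS.sub("", value)

def strip_until_stable(value: str) -> str:
    """Stricter variant: repeat the pass until nothing more is removed."""
    while True:
        stripped = strip_once(value)
        if stripped == value:
            return value
        value = stripped

nested = "' uniunionon selselectect name, password frfromom sys.user$"
print(strip_once(nested))          # the inner words are removed, the outer halves join up
print(strip_until_stable(nested))  # repeated passes remove the rejoined words too
```

A single pass leaves the attack string intact, while the repeated pass removes it. Rejecting the input outright instead of stripping it avoids the problem altogether.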

Using Alternate Encodings

The most obvious way of bypassing input validation is to use an alternate encoding of your data. For instance, you might find that the way a Web server or Web application environment behaves depends on how you encode form data. The IIS Unicode encoding specifier %u is a good example. In IIS, these two are equivalent:

www.example.com/%c0%af
www.example.com/%uc0af

Another good example is the treatment of whitespace. You might find that an application treats space characters as delimiters, but not TAB, carriage return, or linefeed characters. In the Oracle TZ_OFFSET overflow, a space will terminate the timezone specifier, but a TAB character will not. We wrote an exploit for this bug that ran a command, and we were having trouble specifying parameters to that command. We quickly modified the exploit to change all spaces to TABs, which worked fine, because most shells treat both spaces and TABs as delimiters.
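The mismatch between the two parsers can be sketched as follows. The read_specifier function is a hypothetical stand-in for any parser that ends a field at the first space, and shlex is used only to approximate how a POSIX-style shell splits words.

```python
import shlex

def read_specifier(data: str) -> str:
    """Hypothetical parser: the field ends at the first space character only."""
    return data.split(" ", 1)[0]

command = "ls -la /tmp"
tabbed = command.replace(" ", "\t")  # same command, TABs instead of spaces

print(repr(read_specifier(command)))  # the field is cut short at the first space
print(repr(read_specifier(tabbed)))   # TABs pass through untouched
print(shlex.split(tabbed))            # shell-style splitting still sees three words
```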

Another classic example was an ISAPI filter that attempted to restrict access to an IIS virtual directory based on certain credentials. The filter would kick in if you requested anything in the /downloads directory (www.example.com/downloads/hot_new_file.zip). Obviously, the first thing to try in order to bypass it is this:

 http://www.example.com/Downloads/hot_new_file.zip

which doesn't work. Then you try this:

 http://www.example.com/%64ownloads/hot_new_file.zip

and the filter is bypassed. You now have full access to the downloads directory without authentication.
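One way to picture this failure, assuming the filter inspects the raw request path while the server decodes it afterward, is the sketch below. Both function names are hypothetical; the real filter's logic is not given in the text.

```python
from urllib.parse import unquote

def filter_blocks(raw_path: str) -> bool:
    """Hypothetical filter: case-insensitive match on the raw, undecoded path."""
    return raw_path.lower().startswith("/downloads/")

def server_resolves(raw_path: str) -> str:
    """The server decodes %XX escapes before mapping the path to a file."""
    return unquote(raw_path)

for path in ("/Downloads/hot_new_file.zip", "/%64ownloads/hot_new_file.zip"):
    print(path, filter_blocks(path), server_resolves(path))
```

The check and the file lookup operate on two different forms of the same path. Decoding first and checking the decoded form closes that gap.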

Using File-Handling Features

Some of the techniques presented in this section apply only to Windows, but you can normally find a way around these kinds of problems on Unix platforms as well. The idea is to trick an application so that one of the following happens:

It believes that a required string is present in a file path.
It believes that a prohibited string is not present in a file path.
It applies the wrong behavior to a file if file handling is based on a file's extension.

Required String Is Present in Path

The first case is easy. In most situations in which you can submit a filename, you can submit a directory name. In an audit we performed, we encountered a situation in which a Web application script would serve files provided that they were in a given constant list. This was implemented by ensuring that the name of one of the specified files:

data/foo.xls
data/bar.xls
data/wibble.xls
data/wobble.xls

appeared in the file_path parameter. A typical request might look like this:

 http://www.example.com/getfile?file_path=data/foo.xls

The interesting thing is that when most file systems encounter a parent path specifier, they don't bother to verify that all the referenced directories exist. Therefore, we were able to bypass the validation by making requests such as:

 http://www.example.com/getfile?file_path=data/foo.xls/../../../etc/passwd
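The flaw can be sketched as a substring test that is never compared with the path that is finally resolved. This sketch assumes the path is normalized lexically (modeled here with posixpath.normpath, which never touches the disk); the function names are hypothetical.

```python
import posixpath

ALLOWED = ["data/foo.xls", "data/bar.xls", "data/wibble.xls", "data/wobble.xls"]

def naive_check(file_path: str) -> bool:
    """Flawed check: an allowed name merely has to appear somewhere in the path."""
    return any(name in file_path for name in ALLOWED)

def stricter_check(file_path: str) -> bool:
    """Compare the normalized path against the list as a whole."""
    return posixpath.normpath(file_path) in ALLOWED

request = "data/foo.xls/../../../etc/passwd"
print(naive_check(request))          # the required string is present
print(posixpath.normpath(request))   # where the path points after normalization
print(stricter_check(request))       # rejected once the whole path is compared
```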

Prohibited String Not Present in Path

This situation is a little trickier, and again, it involves directories. Let's say the file serving script mentioned in the last section allows us to access any file but prohibits the use of parent path specifiers (/../) and additionally prohibits access to a private data directory by checking for this string in the file_path parameter:

 data/private

We can bypass this protection by making such requests as:

 http://www.example.com/getfile?file_path=data/./private/accounts.xls

because the /./ specifier does nothing in the context of a path.
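A short sketch of the same idea, assuming the script performs plain substring tests on the unnormalized parameter (the function name is hypothetical):

```python
import posixpath

def naive_is_allowed(file_path: str) -> bool:
    """Flawed check: look for the prohibited strings in the raw parameter."""
    return "/../" not in file_path and "data/private" not in file_path

request = "data/./private/accounts.xls"
print(naive_is_allowed(request))      # neither prohibited string appears
print(posixpath.normpath(request))    # yet the path resolves into data/private
```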

Incorrect Behavior Based on File Extension

Let's say that Web site administrators tire of people downloading their accounts spreadsheets and decide to apply a filter that prohibits any file_path parameter that ends in .xls. So we try:

 http://www.example.com/getfile?file_path=data/foo.xls/../private/accounts.xls

and it fails. Then we try:

 http://www.example.com/getfile?file_path=data/./private/accounts.xls

and it also fails.

One of the most interesting aspects of the Windows NT NTFS file system is its support for alternate data streams within files, which are denoted by a colon (:) character at the end of the file name and a stream name after that.

We can use this concept to get the accounts data. We simply request:

 http://www.example.com/getfile?file_path=data/./private/accounts.xls::$DATA

and the data is returned to us. The reason this happens is that the "default" data stream in a file is called ::$DATA. We request the same data, but the filename doesn't end in the .xls extension, so the application allows it.

To see this for yourself, run the following on an NT box (in an NTFS volume):

 echo foobar > foo.txt

Then run:

 more < foo.txt::$DATA

and you'll see foobar. In addition to its ability to confuse input validation, this technique also provides a great way to hide data.

A bug in IIS a few years ago let you read the source of ASP pages by requesting something like:

 http://www.example.com/foo.asp::$DATA

This worked for the same reason.

Another trick relating to file extensions in Windows systems is to add one or more trailing dots to the extension. That would make our request to the file serving script become:

 http://www.example.com/getfile?file_path=data/./private/accounts.xls.

In some cases, you'll get the same data. Sometimes the application will think the extension is blank; sometimes it will think the extension is .xls. Again, you can quickly observe this by running

 echo foobar > foo.txt

then

 type foo.txt.

 notepad foo.txt.....
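Both tricks defeat the same kind of check: a test on the last characters of the name. The sketch below models only the filter, not the file system. The normalize_windows_name helper is a hypothetical canonicalization step, written on the assumption that Windows ignores a default-stream suffix and trailing dots or spaces when it opens the file.

```python
def extension_filter_allows(file_path: str) -> bool:
    """Hypothetical filter from the text: reject any path that ends in .xls."""
    return not file_path.lower().endswith(".xls")

def normalize_windows_name(file_path: str) -> str:
    """Assumed canonical form: drop a default-stream suffix, then trailing dots and spaces."""
    suffix = "::$data"
    if file_path.lower().endswith(suffix):
        file_path = file_path[:-len(suffix)]
    return file_path.rstrip(". ")

candidates = [
    "data/./private/accounts.xls",
    "data/./private/accounts.xls::$DATA",
    "data/./private/accounts.xls.",
]
for path in candidates:
    # Second column: the naive verdict. Third column: the verdict after normalizing.
    print(path, extension_filter_allows(path),
          extension_filter_allows(normalize_windows_name(path)))
```

Applying the filter to the normalized name rejects all three requests, which is the behavior the administrators presumably wanted.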

Evading Attack Signatures

Most intrusion detection systems (IDS) rely on some form of signature-based recognition of attacks. In the shellcode field, people have already published much information about nop-equivalence, but I'd like to address the point here briefly, because it's important.

When you write shellcode, you can insert an almost infinite variety of instructions that do nothing in between the instructions meaningful to your exploit. It's important to bear in mind that these instructions need not actually do nothing; they must simply do nothing that is relevant to the state of your exploit. So for example, you might insert a complex series of stack and frame manipulations into your shellcode, interleaving the instructions with the actual instructions that make up your exploit.

It's also possible to come up with an almost infinite number of ways to perform a given shellcode task, such as pushing parameters onto the stack or loading them into registers. It's fairly easy to write a generator that takes one form of the assembler for an exploit and spits out a functionally identical exploit with no code sequences in common.

Defeating Length Limitations

In some cases, a given parameter to an application is truncated to a fixed length. Generally, this is an attempt to guard against buffer overflows, but sometimes it's used in Web applications as a generic defense mechanism against SQL Injection or command execution. There are a number of techniques that can help in this kind of situation.

Sea Monkey Data

Depending on the nature of the data, you might be able to submit some form of input that expands within the application. For example, in most Web-based applications you wind up encoding double-quote characters as:

 &quot;

which is a ratio of six characters to one.

Any character that is likely to be "escaped" in the input is a good candidate for this sort of thing: single quotes, backslashes, pipe characters, and dollar symbols are quite good in this respect.
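The expansion is simple to measure. This sketch assumes an application that checks the length of the raw input and escapes it afterward; the 16-character limit and the function name are hypothetical, and html.escape stands in for whatever escaping routine the application uses.

```python
import html

MAX_LEN = 16  # hypothetical limit applied to the raw input

def accept_then_escape(value: str) -> str:
    """Flawed order: the length is checked before the value is escaped."""
    if len(value) > MAX_LEN:
        raise ValueError("input too long")
    return html.escape(value)

raw = '"' * MAX_LEN
escaped = accept_then_escape(raw)
print(len(raw), len(escaped))  # each double quote grows to the six characters of &quot;
```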

If you're able to submit UTF-8 sequences, submit overly long sequences, because they might be treated as a single character. You might be lucky and come across an application that treats all non-ASCII characters as 16 bits. You might then overflow it by giving it characters that are longer than this, depending on how it calculates string length.

%2e is the URL encoding for (.). However:

 %f0%80%80%ae

and

 %fc%80%80%80%80%ae

are also encodings of (.).
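To see why these sequences count as encodings of the same character, the sketch below implements a deliberately lenient decoder that follows the original multi-byte layout without checking for the shortest form. The decoder is hypothetical and stands in for any legacy component with that weakness; a strict decoder, such as the one built into Python, rejects the overlong forms.

```python
from urllib.parse import unquote_to_bytes

def lenient_utf8_decode(data: bytes) -> str:
    """Decode UTF-8-style sequences without rejecting overlong forms."""
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            length, value = 1, lead
        elif lead >> 5 == 0b110:
            length, value = 2, lead & 0x1F
        elif lead >> 4 == 0b1110:
            length, value = 3, lead & 0x0F
        elif lead >> 3 == 0b11110:
            length, value = 4, lead & 0x07
        elif lead >> 2 == 0b111110:
            length, value = 5, lead & 0x03
        elif lead >> 1 == 0b1111110:
            length, value = 6, lead & 0x01
        else:
            raise ValueError("invalid lead byte")
        # Each continuation byte contributes its low six bits.
        for byte in data[i + 1:i + length]:
            value = (value << 6) | (byte & 0x3F)
        out.append(chr(value))
        i += length
    return "".join(out)

for encoded in ("%2e", "%f0%80%80%ae", "%fc%80%80%80%80%ae"):
    print(encoded, repr(lenient_utf8_decode(unquote_to_bytes(encoded))))
```

A filter that looks for %2e, or for the single byte 0x2e, sees neither of the longer forms, and a length check that counts decoded characters sees one character where four or six bytes were submitted.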

Harmful Truncation: Severing Escape Characters

The most obvious application of this technique is to SQL Injection, although bearing in mind the earlier discussion of canonicalization, it's possible to come up with all sorts of interesting ways of applying the technique wherever delimited or escaped data is used. Running commands from Perl and injecting into an SMTP stream are two possibilities.

Essentially, if data is being both escaped and truncated, you can sometimes break out of the delimited area by ensuring that the truncation occurs in the middle of an escape sequence.

There is an obvious SQL Injection example: If an application that escapes single quotes by doubling them up accepts a username and password, the username is limited to (say) 16 characters, and the password is also limited to 16 characters, the following username/password combination would execute the shutdown command, which shuts down SQL Server:

 Username: aaaaaaaaaaaaaaa'
 Password: ' shutdown

The application attempts to escape the single quote at the end of the username, but the string is then cut to 16 characters, deleting the "escaping" single quote. The result is that the password field can contain some SQL if it begins with a single quote. The query might end up looking like this:

 select * from users where username='aaaaaaaaaaaaaaa'' and password=''' shutdown

Effectively, the username in the query has become:

 aaaaaaaaaaaaaaa' and password='

so the trailing SQL runs, and SQL Server shuts down.
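The string handling can be traced with a short sketch. It assumes the application doubles single quotes and then truncates each field to 16 characters, as described above; the function names and the query template are hypothetical, and no database is involved.

```python
MAX_LEN = 16

def sanitize(value: str) -> str:
    """Flawed order from the text: escape quotes first, then truncate."""
    return value.replace("'", "''")[:MAX_LEN]

def build_query(username: str, password: str) -> str:
    """Hypothetical query template with both fields placed inside quotes."""
    return ("select * from users where username='%s' and password='%s'"
            % (sanitize(username), sanitize(password)))

username = "a" * 15 + "'"
password = "' shutdown"

print(repr(sanitize(username)))  # the doubled quote is cut back to a single quote
print(build_query(username, password))
```

The username becomes 17 characters after escaping, so the cut at 16 removes the second quote of the pair. In this sketch the template's own closing quote still follows the injected text. Truncating before escaping keeps each pair of quotes intact.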

In general, this technique applies to any length-limited data that includes escape sequences. There are obvious applications for this technique in the world of Perl, since Perl applications have a tendency to call out to external scripts.

Multiple Attempts

Even if all you can do is write a single value somewhere in memory, you can normally upload and execute shellcode. Even if you don't have space for a good exploit (perhaps you're overflowing a 32-byte buffer, although that's enough for execve or WinExec, with space left over), you can still execute arbitrary code by writing a small payload into some location in memory. As long as you can do that multiple times, you can build up your exploit at some arbitrary location in memory, and then (once you've got the whole thing uploaded) trigger it, because you already know where it is. This technique is very similar to the excellent non-executable stack exploit technique used when exploiting format string bugs.

This method might even be applicable to a heap overflow situation, although the target process would have to be very good at handling exceptions. You just use your "write anything anywhere" primitive with repeated attempts to build up your payload, and then trigger it by overwriting a function pointer, exception handler address, VPTR, or whatever.
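The bookkeeping behind this approach can be shown with a simulation. Here a bytearray stands in for the target's memory, write4 is a hypothetical primitive that stores one 32-bit value per attempt, and the payload is an inert placeholder string rather than shellcode.

```python
import struct

def write4(memory: bytearray, address: int, value: int) -> None:
    """Simulated primitive: store one 32-bit little-endian value at an offset."""
    struct.pack_into("<I", memory, address, value)

def upload(memory: bytearray, base: int, payload: bytes) -> int:
    """Build the payload at base with one 4-byte write per attempt."""
    padded = payload + b"\x00" * (-len(payload) % 4)
    attempts = 0
    for offset in range(0, len(padded), 4):
        (word,) = struct.unpack_from("<I", padded, offset)
        write4(memory, base + offset, word)
        attempts += 1
    return attempts

memory = bytearray(64)             # stand-in for the target's address space
payload = b"PLACEHOLDER-PAYLOAD"   # inert placeholder bytes
attempts = upload(memory, 16, payload)
print(attempts, bytes(memory[16:16 + len(payload)]))
```

Because the destination is chosen by the writer, its address is known once the last write completes, which is the property the text relies on.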

Context-Free Length Limits

Sometimes a given data item can be submitted multiple times in a given set of input, with the length limit applied to each instance of the data, but with the data then being concatenated into a single item that exceeds the length.

A good example of this is the HTTP Host header field, when taken in the context of Web intrusion prevention technologies. It's not unusual for these products to check each header separately from the others. Apache (for example) will then concatenate the Host headers into one long Host header, effectively bypassing the Host header length limit. IIS does something similar.

You can use this technique in any protocol in which each data item is identified by name, such as SMTP, HTTP parameters, form fields and cookie variables, HTML and XML tag attributes, and (in fact) any function-calling mechanism that accepts parameters by name.
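The gap between the per-item check and the merged value can be sketched as follows. The 32-character limit, the function names, and the comma-separated join are all assumptions made for illustration; they do not describe how any particular product or server is implemented.

```python
MAX_HEADER_LEN = 32  # hypothetical per-header limit enforced by the filter

def filter_accepts(headers) -> bool:
    """Hypothetical filter: every header value is checked on its own."""
    return all(len(value) <= MAX_HEADER_LEN for _, value in headers)

def merge_repeated(headers, name: str) -> str:
    """Hypothetical back end: join repeated headers that share a name."""
    return ", ".join(v for n, v in headers if n.lower() == name.lower())

headers = [("Host", "A" * MAX_HEADER_LEN)] * 4
merged = merge_repeated(headers, "Host")
print(filter_accepts(headers), len(merged))  # each item passes, the merged value does not fit
```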