Abusing URL Encoding

The architects of the HTTP protocol created URL encoding so that nonalphanumeric characters could be carried in URL strings using only the regular alphanumeric characters and symbols present on most keyboards. Certain Web servers can be fooled by nonstandard methods of encoding characters in the URL string. Two of the most significant recent Web server vulnerabilities are attributed to errors in URL decoding.

Unicode Encoding and Code Red's Shell Code

By themselves, Unicode-encoded characters in a URL are no different from regular hexadecimal-encoded ASCII characters. However, the %uXXXX scheme allows a more compact representation of a 16-bit word than do two hexadecimal-encoded ASCII symbols.

The creators of the Code Red worm, which cost companies an estimated $500 million and wreaked havoc on IIS Web servers, used Unicode-encoded bytes to inject shell code into a request to the .IDA handler, causing a buffer overflow condition. The HTTP request made by Code Red is:

/default.ida?NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN%u9090%u6858
%ucbd3%u7801%u9090%u6858%ucbd3%u7801%u9090%u6858%ucbd3
%u7801%u9090%u9090%u8190%u00c3%u0003%u8b00%u531b%u53ff
%u0078%u0000%u00=a

Note how the attackers have encoded the assembly shell code in 16-bit sequences, using Unicode encoding. For example, %u9090 translates to 0x90 0x90, which in turn means two NOP instructions in x86 machine code.
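The mapping from %uXXXX tokens to raw bytes can be sketched in a few lines of Python. This is only an illustration, not the decoder IIS used: the function name decode_percent_u is hypothetical, and the little-endian byte order is an assumption about how 16-bit words are laid out in x86 memory.

```python
import re
import struct


def decode_percent_u(encoded: str) -> bytes:
    """Convert each %uXXXX token into two raw bytes (hypothetical helper)."""
    # Collect every 16-bit word written as %u followed by four hex digits.
    words = [int(h, 16) for h in re.findall(r"%u([0-9a-fA-F]{4})", encoded)]
    # "<H" packs an unsigned 16-bit value low byte first. This ordering is an
    # assumption about how the decoded words sit in x86 memory.
    return b"".join(struct.pack("<H", word) for word in words)


# The first four tokens of the shell code shown in the request above.
sample = "%u9090%u6858%ucbd3%u7801"
print(decode_percent_u(sample).hex())  # prints 90905868d3cb0178
```

For %u9090 the byte order makes no difference, since both halves are 0x90.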

Unicode Vulnerability

In October 2000, Microsoft's IIS was found vulnerable to the "Unicode bug," whereby an illegal Unicode encoding of the "/" character allowed users to craft URLs that could jump outside the Web document directory and invoke the command shell (cmd.exe) within the Windows system directory. An example of such a URL is:

http://192.168.7.21/scripts/..%c0%af../winnt/system32/cmd.exe?/c+dir+d:\

Figure 5-2 shows what would happen if this URL was launched from a browser against 192.168.7.21.

Figure 5-2. Using Unicode to execute commands

So how does this attack work? The sequence "%c0%af" is an illegal Unicode representation of "/". The Web server interprets these bytes as a slash, but only after its normal filtering for directory traversal has already run. The request therefore climbs two directory levels above the /scripts/ directory and targets /winnt/system32/cmd.exe. The /scripts/ directory usually maps to C:\inetpub\scripts. Under normal circumstances, the Web server would never allow a URL to access a location outside the Web document directory (in this case, C:\inetpub). However, the Web server fails to recognize the Unicode representation of "/" when it performs directory location checks. Internally, "..%c0%af../" translates to "../../" and the resource accessed by the Web server becomes C:\inetpub\scripts\..\..\winnt\system32\cmd.exe, which boils down to C:\winnt\system32\cmd.exe and command execution.
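The effect of checking the path before the overlong sequence is decoded can be sketched as follows. This is a simplified model and not IIS's actual logic: WEB_ROOT, resolve, and inside_web_root are hypothetical names, and a plain string replacement stands in for the faulty decoder. Python's ntpath module applies Windows path rules on any platform.

```python
import ntpath

WEB_ROOT = r"C:\inetpub"  # document root named in the text


def resolve(url_path: str) -> str:
    """Map a URL path onto the file system and collapse '..' segments."""
    relative = url_path.lstrip("/").replace("/", "\\")
    return ntpath.normpath(ntpath.join(WEB_ROOT, relative))


def inside_web_root(resolved: str) -> bool:
    """Directory location check: is the resolved path still under WEB_ROOT?"""
    return resolved.lower().startswith(WEB_ROOT.lower() + "\\")


raw = "/scripts/..%c0%af../winnt/system32/cmd.exe"
# Stand-in for the faulty decoder that maps the overlong bytes to "/".
decoded = raw.replace("%c0%af", "/")

# Checked before decoding, "..%c0%af.." is just an odd file name.
print(inside_web_root(resolve(raw)))      # prints True
# After decoding, the same request escapes the document root.
print(resolve(decoded))                   # prints C:\winnt\system32\cmd.exe
print(inside_web_root(resolve(decoded)))  # prints False
```

The check passes on the raw form and fails on the decoded form, which is why the order of decoding and checking matters.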

How does "%c0%af" translate to "/"? For an explanation we need to show how the illegal Unicode representation is constructed. The "/" character's ASCII code in hex is 2F, which is 00101111 in binary. Unicode encoding, or more precisely UTF-8 encoding, allows for character sets larger than 256 symbols and hence more than 8 bits in length. The correct way to represent 2F in UTF-8 format is still 2F. However, it is possible to represent 2F by using a multibyte UTF-8 representation. The character "/" can be represented in single-, double-, and triple-byte UTF-8 encoding formats as follows:

Bytes Used   UTF-8 Format                  "/" in Binary                 Decimal    Hex
1 byte       0xxxxxxx                      00101111                      47         2F
2 bytes      110xxxxx 10xxxxxx             11000000 10101111             49327      C0 AF
3 bytes      1110xxxx 10xxxxxx 10xxxxxx    11100000 10000000 10101111    14713007   E0 80 AF

The x's represent the bit pattern of the character being encoded, filled in from right to left. Hence the UTF-8 double-byte representation of "/" is "C0 AF". In the URL, it is represented as two hex-encoded characters, "%c0%af".
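The bit-filling rule can be checked with a short Python sketch. The function utf8_forced_width is a hypothetical helper written for this illustration. It deliberately produces the padded (overlong) forms that a conforming encoder would never emit.

```python
def utf8_forced_width(code_point: int, n_bytes: int) -> bytes:
    """Encode code_point in exactly n_bytes (1 to 3), padding with zero bits."""
    if n_bytes == 1:
        if code_point > 0x7F:
            raise ValueError("code point does not fit in one byte")
        return bytes([code_point])

    lead_prefix = {2: 0xC0, 3: 0xE0}[n_bytes]  # 110xxxxx or 1110xxxx
    lead_limit = {2: 0x20, 3: 0x10}[n_bytes]   # payload bits in the lead byte

    out = []
    remaining = code_point
    # Fill continuation bytes (10xxxxxx) from right to left, 6 bits each.
    for _ in range(n_bytes - 1):
        out.append(0x80 | (remaining & 0x3F))
        remaining >>= 6
    if remaining >= lead_limit:
        raise ValueError("code point does not fit in the requested width")
    out.append(lead_prefix | remaining)
    return bytes(reversed(out))


slash = 0x2F
print(utf8_forced_width(slash, 1).hex())  # prints 2f
print(utf8_forced_width(slash, 2).hex())  # prints c0af
print(utf8_forced_width(slash, 3).hex())  # prints e080af
```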

The UTF-8 encoding specifications state that "a UTF-8 decoder must not accept UTF-8 sequences that are longer than necessary to encode a character. Any overlong UTF-8 sequence could be abused to bypass UTF-8 substring tests that look only for the shortest possible encoding." IIS failed to observe this guideline, and as a result, the vulnerability allowed thousands of hackers everywhere to run arbitrary commands on IIS servers.
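The difference between a decoder that skips the overlong check and one that enforces it can be sketched like this. lenient_decode_2byte is a hypothetical function that models the faulty behavior and is not taken from IIS. The strict side relies on Python's built-in UTF-8 codec, which rejects overlong sequences.

```python
def lenient_decode_2byte(data: bytes) -> str:
    """Decode a two-byte UTF-8 sequence without rejecting overlong forms."""
    lead, cont = data
    if lead & 0xE0 != 0xC0 or cont & 0xC0 != 0x80:
        raise ValueError("not a two-byte UTF-8 sequence")
    # Join the 5 payload bits of the lead byte with the 6 of the continuation.
    return chr(((lead & 0x1F) << 6) | (cont & 0x3F))


def strict_decode(data: bytes) -> str:
    """Decode with Python's UTF-8 codec, which refuses overlong sequences."""
    return data.decode("utf-8")


overlong_slash = bytes([0xC0, 0xAF])
print(lenient_decode_2byte(overlong_slash))  # prints /

try:
    strict_decode(overlong_slash)
except UnicodeDecodeError as err:
    print("rejected:", err.reason)
```

A filter that looks only for the byte 2F never sees the overlong form, so any lenient decoder placed after that filter reintroduces the slash.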

The same attack also works if triple-byte UTF-8 encoding is used. The following URL is equivalent to the preceding URL:

http://192.168.7.21/scripts/..%e0%80%af../winnt/system32/cmd.exe?/c+dir+d:\

If you're intrigued by the complexities of Unicode and UTF-8 encoding, go to the Unicode and UTF-8 FAQ at http://www.cl.cam.ac.uk/~mgk25/unicode.html.

The Double-Decode or Superfluous Decode Vulnerability

Just when Microsoft was cleaning up the mess caused by the Unicode bug, another vulnerability surfaced in May 2001. It became known as the "Double Decode" or "Superfluous Decode" vulnerability. In many ways, the method of exploitation and the effects are almost identical to those of the Unicode vulnerability. A URL that exploits the double-decode vulnerability looks like this:

http://192.168.7.21/scripts/..%25%32%66../winnt/system32/cmd.exe?/c+dir+d:\

Figure 5-3 shows the output generated by IIS.

Figure 5-3. Double decode technique for executing commands

The "/" character is replaced by the string "%25%32%66". If the preceding URL is decoded once, it results in:

http://192.168.7.21/scripts/..%2f../winnt/system32/cmd.exe?/c+dir+d:\
%25 = "%"
%32 = "2"
%66 = "f"

If this URL is decoded once more, it becomes:

http://192.168.7.21/scripts/../../winnt/system32/cmd.exe?/c+dir+d:\
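The two decoding passes can be reproduced with Python's standard urllib.parse.unquote. This sketch only mirrors the sequence of events described above. The helper has_dotdot_segment is a hypothetical stand-in for a traversal check that runs between the two passes.

```python
from urllib.parse import unquote


def has_dotdot_segment(path: str) -> bool:
    """Hypothetical traversal check: is any path segment exactly '..'?"""
    return ".." in path.split("/")


request_path = "/scripts/..%25%32%66../winnt/system32/cmd.exe"

first_pass = unquote(request_path)
print(first_pass)                       # prints /scripts/..%2f../winnt/system32/cmd.exe
print(has_dotdot_segment(first_pass))   # prints False

second_pass = unquote(first_pass)
print(second_pass)                      # prints /scripts/../../winnt/system32/cmd.exe
print(has_dotdot_segment(second_pass))  # prints True
```

A check performed after the first pass sees nothing suspicious. The traversal appears only once the superfluous second pass has run.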

The string "%25%32%66" isn't the only string that takes advantage of this vulnerability. The following strings, shown with their translations to ASCII, also work.

Encoded Pattern   Hex Representation   ASCII Character
%25%35%63         %5c                  "\"
%25%32f           %2f                  "/"
%252f             %2f                  "/"
%252F             %2F                  "/"
%255C             %5C                  "\"

Many more permutations and combinations are possible. Thus input validation, when missed or done incorrectly, becomes an enormous problem.
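One way to avoid enumerating every permutation is to decode the input until it stops changing and validate only the final form. The sketch below is one possible approach under that assumption, not a complete defense. The names fully_decode and has_traversal and the limit of five rounds are hypothetical choices made for this illustration.

```python
from urllib.parse import unquote


def fully_decode(value: str, max_rounds: int = 5) -> str:
    """Percent-decode repeatedly until the string no longer changes."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            return value
        value = decoded
    # Assumption: input with this many encoding layers is simply refused.
    raise ValueError("too many layers of encoding")


def has_traversal(path: str) -> bool:
    """Treat both slash styles as separators and look for '..' segments."""
    return ".." in path.replace("\\", "/").split("/")


for pattern in ("%25%35%63", "%252f", "%252F", "%255C"):
    print(pattern, "->", repr(fully_decode(pattern)))

probe = "/scripts/..%255c../winnt/system32/cmd.exe"
print(has_traversal(unquote(probe)))       # prints False (one pass only)
print(has_traversal(fully_decode(probe)))  # prints True
```

Validating the fully decoded form means a single check covers every pattern in the table, whatever mix of encodings the attacker chooses.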

To summarize, oversights in the implementation of the URL decoding mechanism lead to huge security vulnerabilities. Those that we just discussed are only two of the most troublesome examples. Many other Web server products can be fooled by unusual URL encoding patterns. The HTTP W3C specifications (http://www.w3c.org/Protocols/) must be followed closely while implementing Web servers if such vulnerabilities are to be avoided.

 



Web Hacking: Attacks and Defense
ISBN: 0201761769
Year: 2005
Pages: 156
