Userland Hooks | Rootkits: Subverting the Windows Kernel

< Day Day Up >

In Windows, there are three subsystems on which most processes depend. They are the Win32, POSIX, and OS/2 subsystems. These subsystems comprise a well-documented set of APIs. Through these APIs, a process can request the aid of the OS. Because programs such as Taskmgr.exe, Windows Explorer, and the Registry Editor rely upon these APIs, they are a perfect target for your rootkit.

For example, suppose an application lists all the files in a directory and performs some operation on them. This application may run in user space as a user application or as a service. Assume further that the application is a Win32 application, which implies it will use Kernel32, User32.dll, Gui32.dll, and Advapi.dll to eventually issue calls into the kernel.

Under Win32, to list all the files in a directory, an application first calls FindFirstFile, which is exported by Kernel32.dll. FindFirstFile returns a handle if it is successful.

This handle is used in subsequent calls to FindNextFile to iterate through all the files and subdirectories in the directory. FindNextFile is also an exported function in Kernel32.dll. To use these functions, the application will load Kernel32.dll at runtime and copy the memory addresses of the functions into its function Import Address Table (IAT). When the application calls FindNextFile, execution in the process jumps to a location in its IAT. Execution in the process then continues to the address of FindNextFile in Kernel32.dll. The same is true for FindFirstFile.

FindNextFile in Kernel32.dll then calls into Ntdll.dll. Ntdll.dll loads the EAX register with the system service number for FindNextFile's equivalent kernel function, which happens to be NtQueryDirectoryFile. Ntdll.dll also loads EDX, with the user space address of the parameters to FindNextFile. Ntdll.dll then issues an INT 2E or a SYSENTER instruction to trap to the kernel. (These traps into the kernel are covered later in this chapter.) This sequence of calls is illustrated in Figure 4-1.

Figure 4-1. FindNextFile execution path.

Because the application loads Kernel32.dll into its private address space between memory addresses 0x00010000 and 0x7FFE0000, your rootkit can directly overwrite any function in Kernel32.dll or in the application's import table as long as the rootkit can access the address space of the target process. This is called API hooking. In our example, your rootkit could overwrite FindNextFile with your own hand-crafted machine code in order to prevent listing of certain files or otherwise change the performance of FindNextFile. The rootkit could also overwrite the import table in the target application so that it points to the rootkit's own function instead of Kernel32.dll's. By hooking APIs, you can hide a process, hide a network port, redirect file writes to a different file, prevent an application from opening a handle to a particular process, and more. In fact, what you do with this technique is largely up to your imagination.

Now that you understand the basic theory of API hooking and what you can accomplish using it, we will detail implementing an API hook in a user process in the following three sections. The first section outlines how an IAT hook works, and the second section describes what an inline function hook is and how it works. The third section covers injecting a DLL into a userland process.

Import Address Table Hooking

The simpler of the two userland hooking processes is called Import Address Table hooking. When an application uses a function in another binary, the application must import the address of the function. Most applications that use the Win32 API do so through an IAT, as noted earlier. Each DLL the application uses is contained in the application's image in the file system in a structure called the IMAGE_IMPORT_DESCRIPTOR. This structure contains the name of the DLL whose functions are imported by the application, and two pointers to two arrays of IMAGE_IMPORT_BY_NAME structures. The IMAGE_IMPORT_BY_NAME structure contains the names of the imported functions used by the application.

When the operating system loads the application in memory, it parses these IMAGE_IMPORT_DESCRIPTOR structures and loads each required DLL into the application's memory. Once the DLL is mapped, the operating system then locates each imported function in memory and overwrites one of the IMAGE_IMPORT_BY_NAME arrays with the actual address of the function. (To learn more about these and other structures in the Windows PE format, see Matt Pietrek's article. ^[1])

^[1] M. Pietrek, "Peering Inside the PE: A Tour of the Win32 Portable Executable File Format," Microsoft Systems Journal, March 1994.

Once your rootkit's hook function is in the application's address space, your rootkit can parse the PE format of the target application in memory and replace the target function's address in the IAT with the address of the hook function. Then, when the function is called, your hook will be executed instead of the original function. Figure 4-2 illustrates the control flow once the IAT is hooked.

Figure 4-2. Normal execution path vs. hooked execution path for an IAT hook.

We will discuss how to get your rootkit into the address space of the target application later in the chapter. For code to hook the IAT of a given binary, see the section titled Hybrid Hooking Approach near the end of this chapter.

You can see from Figure 4-2 that this is a very powerful yet rather simple technique. It does have its drawbacks, though, in that it is relatively easy to discover these types of hooks. On the other hand, hooks like these are used frequently, even by the operating system itself in a process called DLL forwarding. Even if someone is trying to detect a rootkit hook, determining what is a benign hook as opposed to a malicious hook is difficult.

Another problem with this technique has to do with the binding time. Some applications do late-demand binding. With late-demand binding, function addresses are not resolved until the function is called. This reduces the amount of memory the application will use. These functions may not have addresses in the IAT when your rootkit attempts to hook them. Also, if the application uses LoadLibrary and GetProcAddress to find the addresses of functions, your IAT hook will not work.

Inline Function Hooking

The second userland hooking process we will discuss is called inline function hooking. Inline function hooks are much more powerful than IAT hooks. They do not suffer from the problems associated with the binding time of the DLL. When implementing an inline function hook, your rootkit will actually overwrite the code bytes of the target function so that no matter how or when the application resolves the function address, it will still be hooked. This technique can be used in the kernel or in a userland process, but it is more common in userland.

Typically, an inline function hook is implemented by saving the first several bytes of the target function that the hook will overwrite. After the original bytes are saved, an immediate jump is usually placed in the first five bytes of the target function. The jump leads to the rootkit hook. The hook can then call the original function using the saved bytes of the target function that were overwritten. Using this method, the original function will return execution control to the rootkit hook. Now, the hook can alter the data returned by the original function.

The easiest location to use for placement of an inline hook is within the first five bytes of the function. There are two reasons for this. The first concerns the structure of most functions in memory. Most of the functions in the Win32 API begin the same way. This structure is called the preamble. The following block of code is the Assembly language for typical function preambles.

 Pre-XP SP2   Code Bytes          Assembly                   55             push ebp                   8bec           mov ebp, esp                   ...            ... Post-XP SP2  Code Bytes          Assembly                   8bff           mov edi, edi                   55             push ebp                   8bec           mov ebp, esp

It is important to determine which version of the function preamble your rootkit is to overwrite. An unconditional jump to your rootkit hook on the x86 architecture will usually require five bytes. The first byte is for the jmp opcode, and the remaining four bytes are the address of your hook. An illustration of this is provided in Chapter 5.

In the pre-XP SP2 case, you will overwrite the three bytes of the preamble and two bytes of some other instruction. To account for this, your patching function must be able to disassemble the beginning of the function and determine instruction lengths in order to preserve the original function's opcodes. In post-XP SP2, Microsoft has made your job easier. The preamble is exactly five bytes, so you have exactly enough room. Microsoft actually did this to allow for hot patching (insertion of new code without rebooting the machine). Even Microsoft knows how convenient an inline hook is when all the bytes line up properly.

The second reason why the beginning of the target function is usually overwritten is because the deeper into the function the hook is placed, the more you have to worry about code re-entry. The location you are hooking may be called by the target function many times. This can cause undesired results. To simplify matters, your rootkit will want to hook the single ingress point of the function and alter the results of the target function after it has left an egress point.

Your rootkit saves the original function bytes in what is called a trampoline. The jump you place in the target function is called the detour. Your detour calls the trampoline, which jumps to the target function plus five bytes, roughly. When the target function returns to your detour, you can alter the results returned by the target function. Figure 4-3 demonstrates the process. The source function is the code that originally called the target function.

Figure 4-3. Temporal ordering of a detoured function.

More information about how to implement an inline hook is provided in Chapter 5, Runtime Patching. We also encourage you to read the landmark paper on inline function patching from Microsoft Research.^[2]

^[2] G. Hunt and D. Brubacker, "Detours: Binary Interception of Win32 Functions," Proceedings of the Third USENIX Windows NT Symposium, July 1999, pp. 135 43.

Injecting a DLL into Userland Processes

The next three sections discuss userland techniques for getting your rootkit code into the address space of another process. These methods were first documented by Jeffrey Richter.^[3] Once your DLL is loaded into the target process, it can alter the execution path of commonly used APIs.

^[3] J. Richter, "Load Your 32-bit DLL into Another Process's Address Space Using INJLIB," Microsoft Systems Journal/9 No. 5 (May 1994).

Injecting a DLL using the Registry

In Windows NT/2000/XP/2003, there is a Registry key named HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\ Windows\AppInit_DLLs. Your rootkit can set the value of this key to one of its own DLLs that modifies the target process's IAT, or modifies kernel32.dll or ntdll.dll directly. When an application is loaded that uses User32.dll, the DLL listed as the value of this key will also be loaded by User32.dll into the application's address space.

User32.dll loads the DLLs listed in this key with a call to the LoadLibrary function. As each DLL is loaded, its DllMain function is called with the reason of DLL_PROCESS_ATTACH. There are four reasons why a DLL may be loaded into a process's address space, but we are interested only in DLL_PROCESS_ATTACH. Your rootkit should hook whatever functions are its target if the rootkit DLL is being loaded for the first time by the process, which is indicated by DLL_PROCESS_ATTACH. Since DllMain is automatically called and the DLL is in every application's address space that uses User32.dll, which includes most applications (aside from some console applications), your rootkit could easily hook function calls to hide evidence of files, registry keys, etc.

Some sources will tell you there is a drawback to this technique that after a rootkit changes the value of this key, the computer must be rebooted for the value to take effect. However, this is not entirely correct. All the processes created before your rootkit has modified the Registry key will remain uninfected, but all processes created after the Registry key is modified will be injected with your DLL, without rebooting the machine.

Injecting a DLL using Windows Hooks

Applications receive event messages for many events in the computer that relate to the application. For example, an application can receive event messages when a key is typed while one of its windows is active, when a button is pushed, or when the mouse is in focus.

Microsoft defines a function that makes it possible to hook window messages in another process, which will effectively load your rootkit DLL into the address space of that other process.

Suppose the application you are trying to inject your DLL into is called process B. A separate process, call it process A or the rootkit loader, can call SetWindowsHookEx. The function prototype of SetWindowsHookEx as defined by the Microsoft MSDN is listed below.

 HHOOK SetWindowsHookEx(   int idHook,   HOOKPROC lpfn,   HINSTANCE hMod,   DWORD dwThreadId );

Four parameters are indicated. The first parameter is the type of event message that will trigger the hook. An example is WH_KEYBOARD, which installs a hook procedure that monitors keystroke messages. The second parameter identifies the address (in process A of the function) the system should call when a window is about to process the specified message. The virtual-memory address of the DLL that contains this function is the third parameter. The last parameter is the thread to hook. If this parameter is 0, the system hooks all threads in the current Windows desktop.

If process A calls SetWindowsHookEx(WH_KEYBOARD, myKeyBrdFuncAd, myDllHandle, 0), for example, when process B is about to receive a keyboard event process B will load the rootkit DLL specified by myDllHandle that contains the myKeyBrdFuncAd function. Again, this DLL could be the part of your rootkit that hooks the IATs in the process's address space or implements inline hooks. The following code is a template of how your rootkit DLL would be implemented.

 BOOL APIENTRY DllMain(HANDLE hModule,           DWORD ul_reason_for_call,                       LPVOID lpReserved) {    if (ul_reason_for_call == DLL_PROCESS_ATTACH)    {       // YOU CAN ADD CODE HERE TO HOOK ANYTHING       // YOU WOULD LIKE, NOW THAT YOU ARE INJECTED       // INTO THE VICTIM PROCESS ADDRESS SPACE.    }    return TRUE; } __declspec (dllexport) LRESULT myKeyBrdFuncAd (int code,                                              WPARAM wParam,                                              LPARAM lParam) {    // To be nice, your rootkit should call the next-lower    // hook, but you never know what this hook may be.    return CallNextHookEx(g_hhook, code, wParam, lParam); }

Injecting a DLL using Remote Threads

Another way to load your rootkit DLL into the target process is by creating what is called a remote thread in that process. You will need to write a program that will create the thread specifying the rootkit DLL to load. This strategy is similar to that described in the previous section.

Create Remote Thread takes seven parameters:

 HANDLE CreateRemoteThread(          HANDLE hProcess,          LPSECURITY_ATTRIBUTES lpThreadAttributes,          SIZE_T dwStackSize,          LPTHREAD_START_ROUTINE lpStartAddress,          LPVOID lpParameter,          DWORD dwCreationFlags,          LPDWORD lpThreadId );

The first parameter is a handle to the process in which to inject the thread. To get a handle to the target process, your rootkit loader can call OpenProcess with the target Process Identifier (PID). OpenProcess has the following function prototype:

 HANDLE OpenProcess(DWORD dwDesiredAccess,              BOOL     bInheritHandle,              DWORD dwProcessId );

The PID of the target process can be found by using the Taskmgr.exe utility in Windows. Obviously, the PID can also be found programmatically.

Set the second and seventh parameters of CreateRemoteThread to NULL and the third and sixth parameters to 0.

This leaves the two parameters that are the crux of the attack: the fourth and the fifth. Your rootkit loader should set the fourth parameter to the address of LoadLibrary in the target process. You can use the address of LoadLibrary in your current rootkit loader application. Since this address must exist in the target process, this works only if Kernel32.dll, which exports LoadLibrary, is loaded in the target process. To get the address of LoadLibrary, your rootkit loader can call the GetProcAddress function like this:

 GetProcAddress(GetModuleHandle(TEXT( "Kernel32")), "LoadLibraryA").

The above call gets the address of LoadLibrary in the process that is doing the injecting, assuming that Kernel32.dll is at the same base location in the target process. (This is usually the case, because rebasing DLLs costs the operating system more time when loading the DLL into memory, and Microsoft wants to avoid the performance hit that would be caused by rebasing its DLLs.) LoadLibrary has the same format and return type as a THREAD_START_ROUTINE function, so its address can be used as the fourth parameter to CreateRemoteThread.

The last interesting parameter, the fifth, is the address in memory of the argument that will get passed to LoadLibrary. Your rootkit loader cannot just pass a string here, because that would refer to an address in the rootkit loader's address space and therefore be meaningless to the target process. Microsoft has provided two functions that will help the rootkit loader get around this hurdle.

By calling VirtualAllocEx, your rootkit loader can allocate memory in the target process:

 LPVOID VirtualAllocEx(   HANDLE hProcess,   LPVOID lpAddress,   SIZE_T dwSize,   DWORD flAllocationType,   DWORD flProtect );

To write the name of the rootkit DLL to be used in the call to LoadLibrary in the target process, call WriteProcessMemory with the address you received from the call to VirtualAllocEx. The prototype of WriteProcessMemory is:

 BOOL WriteProcessMemory(   HANDLE hProcess,   LPVOID lpBaseAddress,   LPCVOID lpBuffer,   SIZE_T nSize,   SIZE_T* lpNumberOfBytesWritten );

In the preceding overview of userland hooks, we have seen that these hooks are typically IAT or inline function hooks; that in order to implement hooks in userland, you must get access to the target process's address space; and that injecting a DLL or a thread into the target process is a common way to access the target process's address space.

Now that you understand these concepts regarding userland hooks, the following section will introduce kernel hooks.

< Day Day Up >