14. Portable Executable Files

   
  In this chapter, we will discuss executable files. Our main goal is to create a VB application that will display various information about a given executable file, including the file's export table, which is a list of the functions that a DLL makes available to calling programs. This application is called rpiPEInfo.  
   
  Executable files are files that have the Portable Executable File format, or PE file format. According to the documentation:  
  The name "Portable Executable" refers to the fact that the format is not architecture-specific.  
   
  However, this seems to be in contradiction to the presence of a machine flag in the file (discussed later) about which the documentation states:  
  An image file can be run only on the specified machine, or a system emulating it.  
   
  In any case, PE files include EXE and DLL files, as well as OCX, DRV, and other files. Unfortunately, however, the term executable file is often applied just to EXE files. (We will not do this.) Executable files are also often called image files.  
   
  By the way, closely related to PE files are the files that are produced by compilers, such as Visual C++. These are called object files and have the Common Object File Format (or COFF). You will often hear these types of files mentioned in the same sentence with PE files.  
   
  The PE file format is extremely complex, and this is no doubt exacerbated by the fact that the format is poorly documented in the MSDN library. Our intention is just to give a reasonable amount of detail, but by no means a complete description of the format. You should feel free to skim over some of this material and use it as a reference for future needs.  
 
  Module Relocation  
   
  As we have mentioned, each module has a default base address that is stored in the file itself. The default base address for a DLL compiled under Visual C++ is &H10000000 and under Visual Basic it is &H11000000, although these addresses can easily be changed. (In VB, we can change the default base address using the Compile tab of the Project Properties dialog.)  
   
  When a module (DLL, OCX, etc.) is loaded into the address space of a process, Windows attempts to place the module at its default base address. However, if there is another module located at that address, a conflict arises and Windows must relocate the new module.  
   
  There are some interesting and not very well publicized consequences of module relocation that can be important to understand, especially since these consequences can have a negative effect on an application's performance.  
   
  The problem with relocation is this. Executable files contain references to memory addresses. For instance, the object code corresponding to the source code:  
 
  Dim i As Integer
i = 5
 
   
  will contain a reference to a memory location containing data (the number 5). This is a direct memory reference.  
   
  Also, the function call:  
 
  Call AFunction(7)  
   
  is actually a jump to the address of the function AFunction. Since this jump is made relative to the address of the Call instruction, the object code for this function call will contain the offset from the address of the function call to the address of the function itself. This is a relative memory reference. Keep in mind that the function AFunction may be in the same executable file as the call to that function or it may be in another executable file.  
   
  Now, since actual addresses cannot be known at link time (when the executable is constructed), the linker cannot replace the object code for the examples above with the actual addresses. The best it can do is use addresses that are relative to some other location in the module.  
   
  For a direct reference, the address is relative to the default base address of the module. For instance, suppose that the base address of a module is &H10000000. If the memory variable i has offset &H100 (say) from the start of the module, then the machine language instruction to place the value 5 in that location might look something like the following (in assembly language):  
 
  mov dword ptr [10000100h], 5  
   
  Note the presence of the base address. In short, the base address of the executable is hardcoded into the executable.  
   
  A similar problem arises when a call is to a function that is contained in a different module (such as another DLL). If the base address of the other DLL is, say, &H12000000, the jump instruction would be relative to that base address.  
   
  Now, the problem is that there is no guarantee that a module will be loaded at its default base address. In fact, only one module can be loaded at any given address, and, as you can see from our process viewing utility, a process space may have a great many modules. It is not surprising that two modules might have the same default base address.  
   
  Fortunately, Microsoft has been careful about assigning default base addresses to its modules in such a way as to minimize conflicts. You can see this by browsing the rpiEnumProcs utility. However, all modules that we create in VB (or VC++) will have the same default base address unless we specify otherwise.  
   
  In order that a module be relocatable, executable files contain relocation information for the relocatable code in the file, that is, the code that may need adjustment if the file is not loaded at its default base address. This relocation information is also called fixup information or fixups.  
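  The idea behind a fixup can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name apply_fixups and the flat list of offsets are hypothetical, and the real Fix-Up Table in a PE file is more structured than a plain list. The sketch simply adds the difference between the actual and default base addresses to each hardcoded 32-bit address.

```python
import struct

def apply_fixups(image, fixup_offsets, preferred_base, actual_base):
    """Adjust every 32-bit absolute address listed in fixup_offsets.

    image is a bytearray holding the module; fixup_offsets is a
    (hypothetical) flat list of positions that contain hardcoded addresses.
    """
    delta = actual_base - preferred_base
    for offset in fixup_offsets:
        (address,) = struct.unpack_from("<I", image, offset)
        # Keep the result within 32 bits.
        struct.pack_into("<I", image, offset, (address + delta) & 0xFFFFFFFF)
    return image

# A toy "module": 4 bytes of padding, then the hardcoded address &H10000100.
module = bytearray(4) + struct.pack("<I", 0x10000100)
apply_fixups(module, [4], preferred_base=0x10000000, actual_base=0x20000000)
relocated = struct.unpack_from("<I", module, 4)[0]
print(hex(relocated))
```

  Because the module's bytes are modified in place, a module that is relocated in one process can no longer share those bytes with a process that loaded it at the default address.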
   
  Now, if a module needs to be relocated for a particular process, its relocatable code needs to be changed for that process, but not for any other process that is using this module. Suppose, for instance, that Process1 has Module1 loaded into its address space at Module1's default address. Process2 wants to load Module1, but it already has a module loaded at Module1's default base address. Thus, Module1 will need to be relocated in Process2's address space, which will necessitate some changes to the relocatable code in Module1. Since Process1 cannot tolerate any changes to the code for Module1, Windows must make a new physical copy of Module1 so that it can perform the necessary fixups and map the module into the virtual address space of Process2. This is done using the copy-on-write mechanism that we discussed earlier.  
   
  The point, of course, is not only that this copy-on-write process requires additional processor time, but also that the new physical copy of the module uses additional precious physical memory. Thus, it behooves us to attempt to minimize base address conflicts when creating modules.  
 
  The PE File Format  
   
  Now we are ready to discuss the format of an executable file. Recall that our main goal is to create the rpiPEInfo application that displays various information about a given executable file, including the file's export table.  
   
  The overall structure of a PE file is:  
   
  PE file header  
     MS-DOS stub  
     PE signature  
     COFF file header (also sometimes called the PE header!)  
     Optional header  
   
  Section table (table of section headers)  
   
  Sections  
   
  Figure 14-1 shows the overall structure in more detail.  
   
  The PE File Header  
   
  A PE file begins with a PE file header, which consists of the following items:  
   
  MS-DOS stub  
   
  PE signature  
   
  COFF file header  
   
  Optional header  
   
  MS-DOS stub  
   
  The PE file header begins with the MS-DOS stub, which is an actual DOS application that by default just prints the message: "This program cannot be run in DOS mode" when the image is run in DOS mode.  
   
  The PE signature  
   
  Following the DOS stub is the PE signature. The PE signature lies at the file offset specified at location &H3C and currently consists of the four bytes:  
 
  "P"/"E"/NULL/NULL  
   
  (Of course, there is nothing to prevent a non-PE file from having this signature.)  
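  As a rough sketch of the signature check, the following Python fragment reads the 4-byte offset stored at location &H3C and compares the four bytes found there with "P", "E", NULL, NULL. The function name find_pe_signature and the synthetic sample buffer are our own inventions for illustration; this is not the code used in rpiPEInfo.

```python
import struct

def find_pe_signature(data):
    """Return the file offset of the PE signature, or None if it is absent."""
    if len(data) < 0x40:
        return None  # too short to hold the field at &H3C
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] == b"PE\0\0":
        return pe_offset
    return None

# Synthetic stand-in for a DOS stub: &H80 bytes whose field at &H3C
# says that the signature lives at offset &H80.
stub = bytearray(0x80)
struct.pack_into("<I", stub, 0x3C, 0x80)
sample = bytes(stub) + b"PE\0\0"
print(find_pe_signature(sample))
```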
  Figure 14-1. Structure of a PE file
 
   
  COFF file header  
   
  The COFF file header appears immediately after the PE signature. (The COFF header appears after the signature in a PE file, but at the beginning of a COFF file.) This header contains a variety of information about the file, as shown in Table 14-1.  
Table 14-1. The COFF File Header for a PE File
(Offsets are measured from the start of the COFF header; sizes are in bytes.)
Offset  Size  Field                 Description
0       2     Machine               Required processor type. This value is 0 for unknown and &H14C for Intel and compatible processors.
2       2     NumberOfSections      Number of sections (i.e., number of entries in section table).
4       4     TimeDateStamp         Time and date the file was created.
8       4     PointerToSymbolTable  File offset of the COFF symbol table or 0 if no table is present.
12      4     NumberOfSymbols       Number of entries in the symbol table. This can be used to locate the string table, which immediately follows the symbol table.
16      2     SizeOfOptionalHeader  Size of the PE file optional header.
18      2     Characteristics       Attributes of the file.


   
  The Characteristics flag holds various file information:  
 
  IMAGE_FILE_RELOCS_STRIPPED (&H0001)
Indicates that the file does not contain base relocations and must therefore be loaded at its preferred (default) base address.
 
 
  IMAGE_FILE_EXECUTABLE_IMAGE (&H0002)
Indicates that the image file is valid and can be run. If this flag is not set, it generally indicates a linker error.
 
 
  IMAGE_FILE_AGGRESSIVE_WS_TRIM (&H0010)
Aggressively trim working set. (I do not know how this is done, however.)
 
 
  IMAGE_FILE_LARGE_ADDRESS_AWARE (&H0020)
The application can handle addresses greater than 2GB.
 
 
  IMAGE_FILE_BYTES_REVERSED_LO (&H0080)
Indicates that the little-endian memory storage scheme is being used; that is, the least significant byte of a 16-bit word is stored first in memory, followed by the most significant byte. (We have encountered this issue earlier in the book.) All Intel-style PCs use little-endian. (Macintosh computers use big-endian, however.)
 
 
  IMAGE_FILE_BYTES_REVERSED_HI (&H8000)
Big-endian storage is used.
 
 
  IMAGE_FILE_32BIT_MACHINE (&H0100)
The PC is based on a 32-bit-word architecture.
 
 
  IMAGE_FILE_DEBUG_STRIPPED (&H0200)
Debugging information has been removed from the image file.
 
 
  IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP (&H0400)
If an attempt is made to run the image file from removable media, the file is copied and run from the pagefile (swap file) instead.
 
 
  IMAGE_FILE_SYSTEM (&H1000)
The image file is a system file, not a user program.
 
 
  IMAGE_FILE_DLL (&H2000)
The image file is a DLL.
 
 
  IMAGE_FILE_UP_SYSTEM_ONLY (&H4000)
File should be run only on a UP (uniprocessor) machine.
 
 
  IMAGE_FILE_LINE_NUMS_STRIPPED (&H0004)
COFF line numbers have been removed.
 
 
  IMAGE_FILE_LOCAL_SYMS_STRIPPED (&H0008)
COFF symbol table entries for local symbols have been removed.
 
 
  IMAGE_FILE_16BIT_MACHINE (&H0040)
Reserved.
 
   
  The rpiPEInfo utility checks for the PE file signature. Finding this signature, the code deciphers the COFF header characteristics flag.  
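  To make the layout of Table 14-1 concrete, here is a minimal Python sketch that unpacks the 20-byte COFF header and lists the names of the Characteristics bits that are set. The helper name parse_coff_header and the sample header bytes are assumptions made for this illustration; the flag values are those listed above.

```python
import struct

# The seven fields of the 20-byte COFF header, in the order of Table 14-1.
COFF_HEADER = struct.Struct("<HHIIIHH")

CHARACTERISTICS = {
    0x0001: "IMAGE_FILE_RELOCS_STRIPPED",
    0x0002: "IMAGE_FILE_EXECUTABLE_IMAGE",
    0x0004: "IMAGE_FILE_LINE_NUMS_STRIPPED",
    0x0008: "IMAGE_FILE_LOCAL_SYMS_STRIPPED",
    0x0010: "IMAGE_FILE_AGGRESSIVE_WS_TRIM",
    0x0020: "IMAGE_FILE_LARGE_ADDRESS_AWARE",
    0x0040: "IMAGE_FILE_16BIT_MACHINE",
    0x0080: "IMAGE_FILE_BYTES_REVERSED_LO",
    0x0100: "IMAGE_FILE_32BIT_MACHINE",
    0x0200: "IMAGE_FILE_DEBUG_STRIPPED",
    0x0400: "IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP",
    0x1000: "IMAGE_FILE_SYSTEM",
    0x2000: "IMAGE_FILE_DLL",
    0x4000: "IMAGE_FILE_UP_SYSTEM_ONLY",
    0x8000: "IMAGE_FILE_BYTES_REVERSED_HI",
}

def parse_coff_header(data, offset=0):
    """Unpack a COFF header found at the given offset within data."""
    (machine, number_of_sections, time_date_stamp, pointer_to_symbol_table,
     number_of_symbols, size_of_optional_header,
     characteristics) = COFF_HEADER.unpack_from(data, offset)
    return {
        "Machine": machine,
        "NumberOfSections": number_of_sections,
        "TimeDateStamp": time_date_stamp,
        "SizeOfOptionalHeader": size_of_optional_header,
        "Characteristics": characteristics,
        # Names of the flags that are set, in ascending bit order.
        "Flags": [name for bit, name in sorted(CHARACTERISTICS.items())
                  if characteristics & bit],
    }

# A made-up header: Intel machine, 3 sections, flags &H2102.
sample_header = COFF_HEADER.pack(0x14C, 3, 0, 0, 0, 224, 0x2102)
info = parse_coff_header(sample_header)
print(info["Flags"])
```

  In a real file, the offset passed to the function would be the position of the PE signature plus 4.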
   
  The optional header  
   
  The optional header (which is not optional in a PE file, by the way) supplies information to the loader, that is, to the Windows system program that is responsible for loading the executable into memory. Unfortunately, this optional header is also referred to as the PE header (as opposed to the PE file header)!  
   
  The optional header has three parts, which we describe next.  
   
  Part 1: standard fields. The standard fields are described in Table 14-2. (Again, don't worry about all of the details of this and other tables. We present them here generally for reference purposes.)  
Table 14-2. Standard Fields in an Optional Header
(Offsets are measured from the start of the optional header; sizes are in bytes.)
Offset  Size  Field Name               Description
0       2     Magic                    An unsigned integer identifying the state of the image file. The most common value is &H10B, indicating a normal executable file.
2       1     MajorLinkerVersion       Linker major version number.
3       1     MinorLinkerVersion       Linker minor version number.
4       4     SizeOfCode               Size of the code (text) section of the PE file, or the sum of all code sections if there are multiple sections.
8       4     SizeOfInitializedData    Size of the initialized data section of the PE file, or the sum of all such sections if there are multiple data sections.
12      4     SizeOfUninitializedData  Size of the uninitialized data section (BSS) of the PE file, or the sum of all such sections if there are multiple BSS sections.
16      4     AddressOfEntryPoint      Offset of the entry point of the PE file relative to the image's base address when it is loaded into memory. For program images, this is the starting address. Optional for DLLs.
20      4     BaseOfCode               Offset of the beginning of the code section relative to the image's base address when loaded into memory.
24      4     BaseOfData               Offset of the beginning of the data section relative to the image's base address when loaded into memory.


   
  Part 2: Windows-specific fields. The next 21 fields in the optional header contain additional information needed by the Windows linker and loader. Table 14-3 describes these fields.  
Table 14-3. Windows-Specific Fields
(Offsets are measured from the start of the optional header; sizes are in bytes.)
Offset  Size  Field Name                   Description
28      4     ImageBase                    The preferred base address of the image when loaded into memory. This must be aligned on an allocation boundary (a multiple of 64KB). The default for DLLs is &H10000000. The default for Windows NT or Windows 9x EXE files is &H00400000.
32      4     SectionAlignment             Alignment (in bytes) of the PE file's sections when loaded into memory. The default is the page size for the architecture (4KB for Intel).
36      4     FileAlignment                Alignment (in bytes) for the raw data of the sections in the image file.
40      2     MajorOperatingSystemVersion  Major version number of required OS.
42      2     MinorOperatingSystemVersion  Minor version number of required OS.
44      2     MajorImageVersion            Major version number of image.
46      2     MinorImageVersion            Minor version number of image.
48      2     MajorSubsystemVersion        Major version number of subsystem.
50      2     MinorSubsystemVersion        Minor version number of subsystem.
52      4     Reserved
56      4     SizeOfImage                  Size, in bytes, of the image, including all headers.
60      4     SizeOfHeaders                Combined size of MS-DOS stub, PE header, and section headers rounded up to a multiple of FileAlignment.
64      4     Checksum                     An image file checksum, used to catch errors in the file.
68      2     Subsystem                    The subsystem required to run this image. See text discussion.
70      2     DllCharacteristics           DLL characteristics.
72      4     SizeOfStackReserve           Size of stack to reserve.
76      4     SizeOfStackCommit            Size of stack to commit.
80      4     SizeOfHeapReserve            Size of local heap space to reserve.
84      4     SizeOfHeapCommit             Size of local heap space to commit.
88      4     LoaderFlags                  Obsolete.
92      4     NumberOfRvaAndSizes          Number of data-directory entries in the remainder of the Optional Header.


   
  The subsystem field can be one of the following values:  
 
  IMAGE_SUBSYSTEM_UNKNOWN (0)
Unknown subsystem
 
 
  IMAGE_SUBSYSTEM_NATIVE (1)
Used for device drivers and native Windows NT processes
 
 
  IMAGE_SUBSYSTEM_WINDOWS_GUI (2)
Windows graphical-mode (GUI) image
 
 
  IMAGE_SUBSYSTEM_WINDOWS_CUI (3)
Windows character-mode (CUI) image, that is, runs in a text-based console window
 
 
  IMAGE_SUBSYSTEM_POSIX_CUI (7)
Posix image
 
 
  IMAGE_SUBSYSTEM_WINDOWS_CE_GUI (9)
Runs under Windows CE
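
  The first 96 bytes of the optional header (Tables 14-2 and 14-3) can be unpacked in one step. The following Python sketch does so and reports a few of the fields, including a readable name for the Subsystem value. The function name summarize_optional_header and the sample values are hypothetical and serve only to illustrate the layout.

```python
import struct

# 9 standard fields (28 bytes) followed by 21 Windows-specific fields (68 bytes).
OPTIONAL_HEADER = struct.Struct("<HBBIIIIII" + "IIIHHHHHHIIIIHHIIIIII")

# Positions, within the unpacked tuple, of the fields reported below.
MAGIC, ENTRY_POINT, IMAGE_BASE, SUBSYSTEM, NUMBER_OF_RVA_AND_SIZES = 0, 6, 9, 22, 29

SUBSYSTEMS = {
    0: "IMAGE_SUBSYSTEM_UNKNOWN",
    1: "IMAGE_SUBSYSTEM_NATIVE",
    2: "IMAGE_SUBSYSTEM_WINDOWS_GUI",
    3: "IMAGE_SUBSYSTEM_WINDOWS_CUI",
    7: "IMAGE_SUBSYSTEM_POSIX_CUI",
    9: "IMAGE_SUBSYSTEM_WINDOWS_CE_GUI",
}

def summarize_optional_header(data, offset=0):
    """Unpack the fixed part of an optional header and pick out a few fields."""
    fields = OPTIONAL_HEADER.unpack_from(data, offset)
    image_base = fields[IMAGE_BASE]
    return {
        "Magic": fields[MAGIC],
        "AddressOfEntryPoint": fields[ENTRY_POINT],
        "ImageBase": image_base,
        # ImageBase must be a multiple of 64KB.
        "ImageBaseAligned": image_base % 0x10000 == 0,
        "Subsystem": SUBSYSTEMS.get(fields[SUBSYSTEM], "unrecognized value"),
        "NumberOfRvaAndSizes": fields[NUMBER_OF_RVA_AND_SIZES],
    }

# Made-up sample values for a GUI DLL at the default base address.
values = [0] * 30
values[MAGIC] = 0x10B
values[IMAGE_BASE] = 0x10000000
values[SUBSYSTEM] = 2
values[NUMBER_OF_RVA_AND_SIZES] = 16
summary = summarize_optional_header(OPTIONAL_HEADER.pack(*values))
print(summary["Subsystem"])
```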
 
   
  Part 3: Data Directory. Each PE file contains several tables and strings that are required by Windows. The Data Directory is a table that describes the location and size of these resources. It will prove very important to us in creating the rpiPEInfo program. The entries in the Data Directory are called Data Directories.  
   
  Each entry in the Data Directory has the form defined in the following typedef and is thus 8 bytes long:  
 
  typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD RVA;
    DWORD Size;   // Size in bytes of item
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
 
   
  The RVA field is the relative virtual address of the data item (table or string). This confusing field needs some explanation. First, here is what the documentation says:  
 
  RVA (Relative Virtual Address)
In an image file, an RVA is always the address of an item once loaded into memory, with the base address of the image file subtracted from it. The RVA of an item will almost always differ from its position within the file on disk (File Pointer).
 
 
  VA (Virtual Address)
Same as RVA (see above), except that the base address of the image file is not subtracted. The address is called a "Virtual Address" because Windows NT creates a distinct virtual address space for each process, independent of physical memory. For almost all purposes, a virtual address should be considered just an address. A virtual address is not as predictable as an RVA, because the loader might not load the image at its preferred location.
 
   
  This seems clear enough, and would lead us to think that we could just add the RVA to the base address of the image file when loaded into memory to get the address of the table. However, this does not seem to work. Fortunately, as we will see in the rpiPEInfo program, there is an API function called ImageRvaToVa that, to quote the documentation:  
  locates a relative virtual address (RVA) within the image header of a file that is mapped as a file and returns the virtual address of the corresponding byte in the file.  
   
  (I suppose that if all we needed to do was add the RVA to the base address to get the VA, then the ImageRvaToVa function, which requires the base address, would not be needed. So something else is probably going on here.)  
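  A likely explanation is that a file "mapped as a file" keeps its on-disk layout, in which each section sits at its file offset rather than at its RVA, so an RVA has to be translated through the section headers (described later in this chapter, in Table 14-5). The following Python sketch shows that translation under this assumption. The function name rva_to_file_offset and the sample section values are hypothetical; we are not claiming that this is how ImageRvaToVa is implemented.

```python
def rva_to_file_offset(rva, sections):
    """Translate an RVA into a file offset using section header values.

    sections is a list of tuples:
    (VirtualAddress, VirtualSize, PointerToRawData, SizeOfRawData)
    """
    for virtual_address, virtual_size, pointer_to_raw_data, size_of_raw_data in sections:
        span = max(virtual_size, size_of_raw_data)
        if virtual_address <= rva < virtual_address + span:
            # Same distance from the start of the section, but on disk.
            return pointer_to_raw_data + (rva - virtual_address)
    return None  # the RVA does not fall inside any section

# Made-up section values: code at RVA &H1000, export data at RVA &H3000.
sample_sections = [
    (0x1000, 0x180, 0x400, 0x200),
    (0x3000, 0x090, 0x800, 0x200),
]
print(hex(rva_to_file_offset(0x3010, sample_sections)))
```

  Adding the result to the address at which the file is mapped would then give a usable pointer to the item.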
   
  In any case, Table 14-4 shows the standard Data Directory. Note that this table is not fixed in length and may grow. The NumberOfRvaAndSizes field in the optional header (Table 14-3) gives the number of entries in the Data Directory.  
Table 14-4. Data Directory
(Offsets are measured from the start of the optional header; sizes are in bytes.)
Offset  Size  Field                    Description
96      8     Export Table             Export Table address and size.
104     8     Import Table             Import Table address and size.
112     8     Resource Table           Resource Table address and size.
120     8     Exception Table          Exception Table address and size.
128     8     Certificate Table        Attribute Certificate Table address and size.
136     8     Base Relocation Table    Base Relocation Table address and size.
144     8     Debug                    Debug data starting address and size.
152     8     Architecture             Architecture-specific data address and size.
160     8     Global Ptr               Relative virtual address of the global pointer register. Size member of this structure is set to 0.
168     8     TLS Table                Thread Local Storage (TLS) Table address and size.
176     8     Load Config Table        Load Configuration Table address and size.
184     8     Bound Import             Bound Import Table address and size.
192     8     IAT                      Import Address Table address and size.
200     8     Delay Import Descriptor  Address and size of the Delay Import Descriptor.
208     16    Reserved
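
  Reading the Data Directory amounts to reading NumberOfRvaAndSizes at offset 92 of the optional header and then that many 8-byte (RVA, Size) pairs starting at offset 96. Here is a minimal Python sketch; the function name read_data_directory and the sample buffer are assumptions made for illustration only.

```python
import struct

DIRECTORY_NAMES = [
    "Export Table", "Import Table", "Resource Table", "Exception Table",
    "Certificate Table", "Base Relocation Table", "Debug", "Architecture",
    "Global Ptr", "TLS Table", "Load Config Table", "Bound Import", "IAT",
    "Delay Import Descriptor",
]

def read_data_directory(optional_header):
    """Return a list of (name, rva, size) tuples from an optional header."""
    (count,) = struct.unpack_from("<I", optional_header, 92)
    # Never read past the end of the buffer, whatever the count claims.
    count = min(count, (len(optional_header) - 96) // 8)
    entries = []
    for index in range(count):
        rva, size = struct.unpack_from("<II", optional_header, 96 + 8 * index)
        name = DIRECTORY_NAMES[index] if index < len(DIRECTORY_NAMES) else "Reserved"
        entries.append((name, rva, size))
    return entries

# Made-up optional header: 16 entries, with an export table at RVA &H3000.
header = bytearray(96 + 16 * 8)
struct.pack_into("<I", header, 92, 16)
struct.pack_into("<II", header, 96, 0x3000, 0x80)
directory = read_data_directory(bytes(header))
print(directory[0])
```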


   
  The Section Table  
   
  After the optional header (and hence the PE file header itself), we find the section table, each entry of which is a section header. The main bulk of a PE file is contained in sections, of which these entries are the headers. Note that all of the headers come next, followed by all of the sections (rather than a header followed immediately by its section). Each entry of the section table gives, among other things, the offset and size of the corresponding section.  
   
  Note that the section table immediately follows the optional header, and this position is the only way to locate it. (The size of the optional header is specified in the COFF file header.) Also, the number of entries in the section table is given by the NumberOfSections field in the COFF file header. (Entries in the section table are numbered starting from 1.)  
   
  Each section header (section table entry) has the format shown in Table 14-5 and is 40 bytes in size.  
Table 14-5. Format of a Section Table Entry (Section Header)
(Offsets are measured from the start of the section header; sizes are in bytes.)
Offset  Size  Field                 Description
0       8     Name                  An 8-byte, null-padded ASCII string.
8       4     VirtualSize           Total size of the section when loaded into memory. If this value is greater than SizeOfRawData, the section is zero-padded.
12      4     VirtualAddress        The address of the first byte of the section, when loaded into memory, relative to the image base.
16      4     SizeOfRawData         Size of the initialized data on disk.
20      4     PointerToRawData      File pointer to section's first page within the file. This must be a multiple of the FileAlignment value from the optional header.
24      4     PointerToRelocations  0 for PE files.
28      4     PointerToLinenumbers  File pointer to beginning of line-number entries for the section. Set to 0 if there are no COFF line numbers.
32      2     NumberOfRelocations   0 for PE files.
34      2     NumberOfLinenumbers   Number of line-number entries for the section.
36      4     Characteristics       Flags describing the section's characteristics.
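
  The section table can be read as an array of 40-byte records. The Python sketch below unpacks each record according to Table 14-5. The function name parse_section_table and the sample table are hypothetical; in a real file, the starting offset would be the offset of the optional header plus SizeOfOptionalHeader, and the count would be NumberOfSections.

```python
import struct

# One 40-byte section header, in the field order of Table 14-5.
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")

def parse_section_table(data, offset, number_of_sections):
    """Return one dictionary per section header."""
    sections = []
    for index in range(number_of_sections):
        (name, virtual_size, virtual_address, size_of_raw_data,
         pointer_to_raw_data, _pointer_to_relocations, _pointer_to_linenumbers,
         _number_of_relocations, _number_of_linenumbers,
         characteristics) = SECTION_HEADER.unpack_from(
            data, offset + index * SECTION_HEADER.size)
        sections.append({
            # The name is null-padded, not necessarily null-terminated.
            "Name": name.rstrip(b"\0").decode("ascii", "replace"),
            "VirtualSize": virtual_size,
            "VirtualAddress": virtual_address,
            "SizeOfRawData": size_of_raw_data,
            "PointerToRawData": pointer_to_raw_data,
            "Characteristics": characteristics,
        })
    return sections

# Two made-up section headers packed back to back.
table = (SECTION_HEADER.pack(b".text", 0x180, 0x1000, 0x200, 0x400, 0, 0, 0, 0, 0)
         + SECTION_HEADER.pack(b".edata", 0x90, 0x3000, 0x200, 0x800, 0, 0, 0, 0, 0))
sections = parse_section_table(table, 0, 2)
print([s["Name"] for s in sections])
```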


   
  Sections  
   
  An image file typically has some or all of the following sections:  
   
  Text section (executable code)  
   
  The text section is named .text and contains the executable code for the file.  
   
  Data sections (.bss, .rdata, and .data)  
   
  The data sections are named .bss, .rdata, and .data. The .bss section contains uninitialized data, including static variables. The .rdata section contains read-only data, such as literal strings and constants. All other variables are stored in the .data section.  
   
  Resource section  
   
  This section, named .rsrc, contains information about the resources used by the file.  
   
  Relocation section  
   
  This section, named .reloc, contains the Fix-Up Table. This table contains entries for all fixups for the file. We discussed fixups when we discussed module relocation earlier in this chapter. Basically, fixups are required to make address adjustments in an image file based on its actual load (base) address, which may differ from its default load address.  
   
  Export section  
   
  The export data section, named .edata, contains information about exported functions and global variables. In particular, the export section begins with a table called the Export Directory Table, shown in Table 14-6.  
Table 14-6. The Export Directory Table
(Offsets are measured from the start of the table; sizes are in bytes.)
Offset  Size  Field                     Description
0       4     Export flags              Reserved.
4       4     Time/date stamp           Time and date the export data was created.
8       2     Major version             Major version number.
10      2     Minor version             Minor version number.
12      4     Name RVA                  RVA of the ASCII string containing the name of the DLL.
16      4     Ordinal base              Starting ordinal number for exports as listed in the Export Address Table.
20      4     Address table entries     Number of entries in the Export Address Table.
24      4     Number of name pointers   Number of entries in the Name Pointer Table (and Ordinal Table).
28      4     Export address table RVA  RVA of the Export Address Table.
32      4     Name pointer RVA          RVA of the Export Name Pointer Table.
36      4     Ordinal table RVA         RVA of the Ordinal Table.


   
  Incidentally, the Export Address Table contains the addresses of the exported functions in the PE file, but we are only interested in the names of these functions. In fact, our interest centers on the tables shown in Figure 14-2.  
   
  Figure 14-2. The Export Name tables
 
   
  Note that offset 32 of the Export Directory Table holds the address of the Export Name Pointer Table (relative to the image base). This table is an array of pointers into the Export Name Table, which, at last, contains the (null-terminated) names of the exported functions. (Since these names may vary in length, we need a pointer to each name.)  
   
  As it happens, not all DLLs that export functions appear to have an export section (at least not one named .edata). This means that we cannot simply search for an export section in order to find the Export Directory Table. We must take a different approach in the rpiPEInfo application. Fortunately, the export table data directory entry (see Table 14-4) seems always to be valid, so we can use this entry to get the RVA of the Export Directory Table.  
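
  The walk from the Export Directory Table to the exported names can be sketched as follows in Python. The function names, the rva_to_offset parameter, and the toy image are hypothetical. By default the sketch assumes that the buffer is laid out as the image would be in memory, so that an RVA can be used directly as an index; for a file read straight from disk, a translation function would have to be supplied.

```python
import struct

def read_c_string(data, offset):
    """Read a null-terminated ASCII string starting at offset."""
    end = data.index(b"\0", offset)
    return data[offset:end].decode("ascii", "replace")

def export_names(data, export_directory_offset, rva_to_offset=lambda rva: rva):
    """Return the exported names, given the position of the Export Directory Table."""
    # Offset 24: number of name pointers; offset 32: RVA of the Name Pointer Table.
    (number_of_names,) = struct.unpack_from("<I", data, export_directory_offset + 24)
    (name_pointer_rva,) = struct.unpack_from("<I", data, export_directory_offset + 32)
    table = rva_to_offset(name_pointer_rva)
    names = []
    for index in range(number_of_names):
        # Each entry of the Name Pointer Table is the RVA of one name.
        (name_rva,) = struct.unpack_from("<I", data, table + 4 * index)
        names.append(read_c_string(data, rva_to_offset(name_rva)))
    return names

# A toy image: directory at 0, name pointers at &H40, names at &H60 and &H70.
image = bytearray(0x100)
struct.pack_into("<I", image, 24, 2)
struct.pack_into("<I", image, 32, 0x40)
struct.pack_into("<II", image, 0x40, 0x60, 0x70)
image[0x60:0x65] = b"Alpha"
image[0x70:0x74] = b"Beta"
names = export_names(bytes(image), 0)
print(names)
```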
   
  Import section  
   
  This section, called .idata, contains information about functions that are imported by the file. Let us take a closer look at this section. Figure 14-3 shows the details. (Incidentally, in the Hint/Name Table, the documentation says that the name begins at offset 4 in each entry, but experimentation seems to indicate that the correct offset is 2.)  
   
  Figure 14-3. Import table section details
 
   
  The section begins with an Import Directory Table. There is one 20-byte entry in this table for each DLL from which the image file imports functions. Let us refer to these DLLs as import DLLs. Each entry in this table has two important items for our purposes: the RVA (relative virtual address) of the Import Lookup Table for the corresponding import DLL and the RVA of a string that gives the name of this import DLL.  
   
  Each entry in the Import Lookup Table for an import DLL is a 32-bit field. There is one entry for each imported function from this import DLL. The high-order bit of each entry signals whether to import the function by ordinal number, that is, by position (bit = 1) or by name (bit = 0). For an import by name, the remaining 31 bits form the RVA into the Hint/Name Table.  
   
  Each entry in the Hint/Name Table contains the actual name of the function (not a pointer to the name). The name follows a 2-byte hint and so starts at offset 2, in agreement with the experiments mentioned earlier. Thus, the entry lengths in the Hint/Name Table will vary. An entry is padded with another byte (if necessary) only to make the entry's length even, so that the next entry starts on an even boundary.  
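
  Decoding these two tables can be sketched in Python as follows. The sketch assumes the usual convention, in which a set high-order bit marks an import by ordinal (with the ordinal in the low 16 bits) and a clear bit marks an import by name, and it assumes that the name follows a 2-byte hint. The function names and sample bytes are hypothetical.

```python
import struct

ORDINAL_FLAG = 0x80000000

def decode_lookup_entry(entry):
    """Classify one 32-bit Import Lookup Table entry."""
    if entry & ORDINAL_FLAG:
        return ("ordinal", entry & 0xFFFF)
    # Otherwise the low 31 bits are the RVA of a Hint/Name Table entry.
    return ("name", entry & 0x7FFFFFFF)

def read_hint_name(data, offset):
    """Return (hint, name, entry_length) for the Hint/Name entry at offset."""
    (hint,) = struct.unpack_from("<H", data, offset)
    end = data.index(b"\0", offset + 2)
    name = data[offset + 2:end].decode("ascii", "replace")
    length = end + 1 - offset   # hint + name + terminating null
    if length % 2:
        length += 1             # pad byte so the next entry starts on an even boundary
    return hint, name, length

# A made-up entry: hint 7, name "Beep", followed by one pad byte.
entry_bytes = struct.pack("<H", 7) + b"Beep\0" + b"\0"
result = read_hint_name(entry_bytes, 0)
print(result)
```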
 
  Example: Getting PE File Information  
   
  Now that we have a basic understanding of the format of a PE file, we can create the rpiPEInfo utility. Figure 14-4 shows the utility, as it dissects the file COMCTL32.DLL.  
   
  Figure 14-4. The rpiPEInfo utility
 
   
  The complete source code is on the accompanying CD, so we will just cover some of the highlights. As we have seen, the PE file itself contains relative virtual addresses (RVAs) for most items. We also remarked that these addresses are relative to the file's image in memory, which is not the same as its disk image. Accordingly, the approach taken in the rpiPEInfo utility is first to map the executable file in question into memory and then translate relative virtual addresses (RVAs) into virtual addresses (VAs).  
   
  To map a PE file into memory, we can use the MapAndLoad function exported by the IMAGEHLP.DLL library:  
 
  BOOL MapAndLoad(
   IN LPSTR ImageName,
   IN LPSTR DllPath,
   OUT PLOADED_IMAGE LoadedImage,
   IN BOOL DotDll,
   IN BOOL ReadOnly);
 
   
  or, in VB:  
 
  Public Declare Function MapAndLoad Lib "Imagehlp.dll" ( _
   ByVal ImageName As String, _
   ByVal DLLPath As String, _
   LoadedImage As LOADED_IMAGE, _
   ByVal DotDLL As Long, _
   ByVal ReadOnly As Long) As Long
 
   
  There is also a corresponding UnMapAndLoad function:  
 
  BOOL UnMapAndLoad(
  IN PLOADED_IMAGE LoadedImage
);
 
   
  or, in VB:  
 
  Public Declare Function UnMapAndLoad Lib "Imagehlp.dll" ( _
   LoadedImage As LOADED_IMAGE) As Long
 
   
  The MapAndLoad function maps an image file into virtual memory and fills a LOADED_IMAGE structure with all sorts of useful stuff:  
  Public Type LOADED_IMAGE     ' 48 bytes (46 bytes packed)
   ModuleName As Long
   hFile As Long
   MappedAddress As Long      ' Base address of mapped file
   pFileHeader As Long        ' Pointer to IMAGE_PE_FILE_HEADER
   pLastRvaSection As Long    ' Pointer to first COFF section header
                              ' (section table)??
   NumberOfSections As Long
   pSections As Long          ' Pointer to first COFF section header
                              ' (section table)??
   Characteristics As Long    ' Image characteristics value
   fSystemImage As Byte
   fDOSImage As Byte
   Links As LIST_ENTRY        ' 2 longs
   SizeOfImage As Long
End Type
   
  For us, the important information is the base address of the loaded image (MappedAddress) and the pointer to the PE file header (pFileHeader), which, incidentally, the documentation also refers to as the NT headers.  
   
  To get a VA from an RVA, we use the function:  
 
  LPVOID ImageRvaToVa(
  IN PIMAGE_PE_FILE_HEADER NtHeaders,
  IN LPVOID Base,
  IN DWORD Rva,
  IN OUT PIMAGE_SECTION_HEADER *LastRvaSection
);
 
   
  or, in VB:  
 
  Public Declare Function ImageRvaToVa Lib "Imagehlp.dll" ( _
   ByVal NTHeaders As Long, _
   ByVal Base As Long, _
   ByVal rva As Long, _
   ByVal LastRvaSection As Long) As Long
 
   
  Note that this function requires not only the RVA, but also a pointer to the PE file header (NT headers) as well as the base address of the loaded image. This is further evidence that more is going on in translating an RVA to a VA than just adding the base address. (The last parameter to ImageRvaToVa is not important.) This function returns the VA.  
   
  The Structures  
   
  There are several structures (user-defined types) that we require. These structures can be found in the Winnt.h include file and reflect various tables in the PE file. (We changed a few names to bring them into conformance with the documentation on PE files.)  
   
  In the rpiPEInfo program, we must declare the structures from the bottom up, in order to avoid forward reference error messages from VB. It will help to refer to Figure 14-1.  
   
  We begin with a Data Directory entry, which is used to get the RVA for the export and import tables:  
 
  ' Data Directory entry
Public Type IMAGE_DATA_DIRECTORY  ' 8 bytes
   RVA As Long
   size As Long
End Type
 
   
  Next comes the optional header, with all three parts. The constant IMAGE_NUMBEROF_DIRECTORY_ENTRIES (see the last member below) is defined, in one of the include files, to be equal to 16, even though the documentation says that the number of Data Directory entries is not fixed.  
 
  ' Optional header (all three parts)
Public Type IMAGE_OPTIONAL_HEADER     ' 232 bytes
 
 

Page 256
 
     'Standard fields.
   Magic As Integer
   MajorLinkerVersion As Byte
   MinorLinkerVersion As Byte
   SizeOfCode  As Long
   SizeOfInitializedData  As Long
   SizeOfUninitializedData  As Long
   AddressOfEntryPoint  As Long
   BaseOfCode  As Long
   BaseOfData  As Long

   'NT additional fields.
   ImageBase As Long
   SectionAlignment As Long
   FileAlignment  As Long
   MajorOperatingSystemVersion  As Integer
   MinorOperatingSystemVersion  As Integer
   MajorImageVersion  As Integer
   MinorImageVersion  As Integer
   MajorSubsystemVersion  As Integer
   MinorSubsystemVersion  As Integer
   Win32VersionValue  As Long
   SizeOfImage  As Long
   SizeOfHeaders  As Long
   CheckSum  As Long
   Subsystem  As Integer
   DllCharacteristics  As Integer
   SizeOfStackReserve  As Long
   SizeOfStackCommit  As Long
   SizeOfHeapReserve  As Long
   SizeOfHeapCommit  As Long
   LoaderFlags  As Long
   NumberOfRvaAndSizes  As Long     '96

   ' Data directories
   DataDirectory(0 To IMAGE_NUMBEROF_DIRECTORY_ENTRIES) _
      As IMAGE_DATA_DIRECTORY   ' 17*8 + 96 = 232
End Type
 
   
  Next comes the PE file header without the MS-DOS stub:  
 
  ' PE File header without MS-DOS stub
Public Type IMAGE_PE_FILE_HEADER   ' 256 bytes
   Signature  As Long                        ' 4 bytes -- PE signature
   FileHeader As IMAGE_COFF_HEADER            ' 20 bytes -- COFF header
   OptionalHeader As IMAGE_OPTIONAL_HEADER    ' 232 bytes
End Type
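
  The byte counts in these comments can also be used to locate a Data Directory entry directly in the file. The following sketch assumes the 32-bit layout declared above (4-byte signature, 20-byte COFF header, 96 bytes of fixed optional header fields, 8 bytes per entry). The constant names and the function DataDirectoryOffset are hypothetical and are not part of the rpiPEInfo source:

```vb
' Sizes taken from the declarations above (32-bit PE layout)
Private Const SIZEOF_PE_SIGNATURE As Long = 4
Private Const SIZEOF_COFF_HEADER As Long = 20
Private Const SIZEOF_OPTIONAL_FIXED As Long = 96
Private Const SIZEOF_DATA_DIRECTORY As Long = 8

' Hypothetical helper: file offset of Data Directory entry lIndex,
' given the file offset of the PE signature
Public Function DataDirectoryOffset(ByVal lSigOffset As Long, _
   ByVal lIndex As Long) As Long

   DataDirectoryOffset = lSigOffset + SIZEOF_PE_SIGNATURE + _
      SIZEOF_COFF_HEADER + SIZEOF_OPTIONAL_FIXED + _
      SIZEOF_DATA_DIRECTORY * lIndex
End Function
```

  For instance, if the PE signature is at offset &H80, then the entry with index 0 starts at offset &HF8 and the entry with index 1 starts at offset &H100.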
 
   
  We also declared the COFF header itself (it supplies the FileHeader member above), but didn't use it directly:  
 
  ' COFF File header
Public Type IMAGE_COFF_HEADER     ' 20 bytes
   Machine As Integer
   NumberOfSections As Integer
   TimeDateStamp As Long
   PointerToSymbolTable As Long
   NumberOfSymbols As Long
   SizeOfOptionalHeader As Integer
   Characteristics As Integer
End Type
 
   
  Finally, we have the Export Directory Table:  
 
  ' Export Directory table
Public Type IMAGE_EXPORT_DIRECTORY_TABLE  ' 40 bytes
   Characteristics As Long
   TimeDateStamp As Long
   MajorVersion As Integer
   MinorVersion As Integer
   Name As Long
   Base As Long
   NumberOfFunctions As Long
   NumberOfNames As Long          ' We need this one
   pAddressOfFunctions As Long
   ExportNamePointerTableRVA As Long    ' We need this one
   pAddressOfNameOrdinals As Long
End Type
 
   
  We declared a few other structures, as you can see by looking at the source code.  
   
  Getting Version Information  
   
  The first function we call when the user clicks on a file name is GetVersionInfo. This function uses several functions from the VERSION.DLL library: GetFileVersionInfoSize, GetFileVersionInfo, VerLanguageName, and VerQueryValue. These functions are used primarily by setup programs.  
   
  The GetFileVersionInfo function fills a buffer with version information about the file. To get at this information, we use VerQueryValue:  
 
  Public Declare Function VerQueryValue Lib "version.dll" Alias "VerQueryValueA" ( _
   pBlock As Byte, _
   ByVal lpSubBlock As String, _
   lplpBuffer As Long, _
   puLen As Long _
) As Long
 
   
  passing various descriptive strings in the lpSubBlock parameter. For instance, passing the string "\" returns the root block, which is a pointer to the following structure (MS stands for most significant and LS for least significant):  
 
  Public Type VS_FIXEDFILEINFO
   dwSignature As Long
   dwStrucVersion As Long
   dwFileVersionMS As Long
   dwFileVersionLS As Long
   dwProductVersionMS As Long
   dwProductVersionLS As Long
   dwFileFlagsMask As Long
   dwFileFlags As Long
   dwFileOS As Long
   dwFileType As Long
   dwFileSubtype As Long
   dwFileDateMS As Long
   dwFileDateLS As Long
End Type
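
  By convention, a version number is packed into the MS and LS members as four 16-bit parts: major and minor in the MS value, build and revision in the LS value. The following helpers sketch one way to unpack them in VB, where a Long is signed. The names HiWord, LoWord, and FormatVersion are hypothetical and are not taken from the rpiPEInfo source:

```vb
' Upper 16 bits of a Long, as a value from 0 to 65535
Public Function HiWord(ByVal dw As Long) As Long
   HiWord = ((dw And &HFFFF0000) \ &H10000) And &HFFFF&
End Function

' Lower 16 bits of a Long, as a value from 0 to 65535
Public Function LoWord(ByVal dw As Long) As Long
   LoWord = dw And &HFFFF&
End Function

' Build a dotted version string from an MS/LS pair
Public Function FormatVersion(ByVal dwMS As Long, _
   ByVal dwLS As Long) As String

   FormatVersion = HiWord(dwMS) & "." & LoWord(dwMS) & "." & _
      HiWord(dwLS) & "." & LoWord(dwLS)
End Function
```

  For instance, FormatVersion(&H50002, &H10003) returns the string 5.2.1.3. The same function could be applied to dwFileVersionMS and dwFileVersionLS, or to the corresponding product version members.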
 
   
  There is a tricky part to getting the company name: we first need to search the translation array that specifies the languages that are supported by the file. Once we have found the language and code page identifiers for English, we can pass VerQueryValue the subblock:  
 
  "\StringFileInfo\" & sLangID & sCodePageID & "\CompanyName"  
   
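  In the subblock string, the language and code page identifiers each appear as four hexadecimal digits. The following sketch shows one way to build the string from numeric identifiers. The function names Hex4 and VersionSubBlock are hypothetical, and the values used in the example (language &H409 for U.S. English and code page 1252) are only a common combination, not something every file is guaranteed to contain:

```vb
' Format the low 16 bits of n as exactly four hex digits
Public Function Hex4(ByVal n As Long) As String
   Hex4 = Right$("0000" & Hex$(n And &HFFFF&), 4)
End Function

' Hypothetical helper: build the subblock name for one string value
Public Function VersionSubBlock(ByVal lLangID As Long, _
   ByVal lCodePage As Long, ByVal sValueName As String) As String

   VersionSubBlock = "\StringFileInfo\" & Hex4(lLangID) & _
      Hex4(lCodePage) & "\" & sValueName
End Function
```

  With these values, VersionSubBlock(&H409, 1252, "CompanyName") returns \StringFileInfo\040904E4\CompanyName.
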
  Getting File Characteristics  
   
  The next step is to open the file and look for the PE signature. According to the PE documentation, the offset of the PE signature is found at offset &H3C of the file. The following code in the function GetPEFileChars reads the 4-byte offset stored there, then reads the 4 bytes at that offset and checks them for the correct signature (the + 1 appears because Get numbers file positions starting from 1):  
 
  'Check for PE file signature
Get #fr, &H3C + 1, bSigOffset
Get #fr, bSigOffset + 1, lSignature
If Not lSignature = &H4550 Then       ' PE\0\0 backwards in memory
   Close fr
   txtDetails = txtDetails & vbCrLf & "   No PE signature"
   Exit Function
End If
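
  For reference, the same check can be packaged as a stand-alone function. This is only a sketch and not the code from GetPEFileChars. The function name HasPESignature is hypothetical, and the test for the MS-DOS "MZ" signature and the bounds checks are additions that the fragment above does not make. A missing file raises the usual VB run-time error:

```vb
' Returns True if sFile starts with "MZ" and has the PE signature
' at the offset stored at &H3C
Public Function HasPESignature(ByVal sFile As String) As Boolean
   Dim fr As Integer
   Dim iMZ As Integer
   Dim lSigOffset As Long
   Dim lSignature As Long

   fr = FreeFile
   Open sFile For Binary Access Read As #fr

   If LOF(fr) >= &H40 Then
      Get #fr, 1, iMZ                   ' "MZ" reads as &H5A4D
      Get #fr, &H3C + 1, lSigOffset     ' Offset of the PE signature

      If iMZ = &H5A4D Then
         If lSigOffset >= 0 And lSigOffset <= LOF(fr) - 4 Then
            Get #fr, lSigOffset + 1, lSignature
            HasPESignature = (lSignature = &H4550&)
         End If
      End If
   End If

   Close #fr
End Function
```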
 
   
  It is then a straightforward matter to get the characteristics flag from the COFF header as well as some other data, such as the section names. There does seem to be a little problem here, however. You may notice that some section names are a bit strange, as in Figure 14-4. The number of sections is retrieved from the COFF header and must be correct, since it is used to calculate the offset of the Section Table. Whatever.  
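
  One thing worth checking in cases like this (it may or may not explain the figure) is how the name field is converted to a string. The PE documentation describes a section name as an 8-byte field that is padded with nulls, so a name of exactly 8 characters has no terminating null. The following sketch converts such a field without reading past it. The function name SectionNameToString is hypothetical:

```vb
' Convert a null-padded section name field (at most 8 bytes)
' to a VB string
Public Function SectionNameToString(bName() As Byte) As String
   Dim i As Long
   Dim cRead As Long
   Dim s As String

   For i = LBound(bName) To UBound(bName)
      If cRead = 8 Then Exit For          ' Field is at most 8 bytes
      If bName(i) = 0 Then Exit For       ' Padding starts here
      s = s & Chr$(bName(i))
      cRead = cRead + 1
   Next

   SectionNameToString = s
End Function
```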
   
  Getting Export Names  
   
  The function used to get the exports is where the fun really starts. This function is called only for PE files. Here is an overview.  
   
  First, we map the file (sFile) into memory and load a LOADED_IMAGE structure:  
 
  Dim loadimage As LOADED_IMAGE
lret = MapAndLoad(sFile, "", loadimage, True, True)
 
   
  From this, we can get the base address of the loaded image:  
 
  baseaddr = loadimage.MappedAddress  
   
  Next, we copy the PE file header to our own variable, so that we can access its members:  
 
  Dim peheader As IMAGE_PE_FILE_HEADER
CopyMemory ByVal VarPtr(peheader), ByVal loadimage.pFileHeader, LenB(peheader)
 
   
  Next, we retrieve the VA from the RVA of the first Data Directory, which is the Data Directory for the Export Directory Table. The constant IMAGE_DIRECTORY_ENTRY_EXPORT is defined to be 0, the index of the first Data Directory:  
 
  rvaExportDirTable = peheader.OptionalHeader. _
   DataDirectory(IMAGE_DIRECTORY_ENTRY_EXPORT).RVA

vaExportDirTable = ImageRvaToVa(loadimage.pFileHeader, _
   loadimage.MappedAddress, rvaExportDirTable, 0&)
 
   
  Again we make a copy into our own structure:  
 
  ' Export directory
Dim exportdir As IMAGE_EXPORT_DIRECTORY_TABLE
CopyMemory ByVal VarPtr(exportdir), ByVal vaExportDirTable, LenB(exportdir)
 
   
  From this copy (see Figure 14-2), we can get the number of exported names:  
 
  cNames = exportdir.NumberOfNames  
   
  Now, exportdir.ExportNamePointerTableRVA is the RVA for the Export Name Pointer Table (see Figure 14-2), so we get the VA for this table as follows:  
 
  ExportNamePointerTableVA = ImageRvaToVa(loadimage.pFileHeader, _
   loadimage.MappedAddress, exportdir.ExportNamePointerTableRVA, 0&)
 
   
  Now we can simply march through the Export Name Pointer Table, collecting the target strings:  
 
  ' Start at the beginning of names
pNextAddress = ExportNamePointerTableVA

lvExports.ListItems.Clear

For i = 0 To cNames - 1

   ' Get the next address (RVA of export name)
   VBGetTarget lNextAddress, pNextAddress, 4

   ' Convert address of this name from RVA to VA
   lNextAddress = ImageRvaToVa(loadimage.pFileHeader, _
      loadimage.MappedAddress, lNextAddress, 0&)

   ' Convert ANSI string to BSTR
   sName = LPSTRtoBSTR(lNextAddress)
   lvExports.ListItems.Add , , sName

   ' Point to next address in table
   pNextAddress = pNextAddress + 4

Next
 
   
  Finally, we call UnMapAndLoad.  
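
  The loop relies on two helpers whose definitions are not shown in this section: VBGetTarget, which reads a 4-byte value from an address, and LPSTRtoBSTR, which reads a null-terminated ANSI string. To make those two steps concrete, here is a rough stand-in that works on a Byte array holding a copy of the file, using array indexes in place of addresses. The names ReadLong and ReadAnsiZ are hypothetical, and this is not the code used by rpiPEInfo:

```vb
' Read a little-endian 4-byte value starting at index lPos
Public Function ReadLong(b() As Byte, ByVal lPos As Long) As Long
   Dim lTop As Long

   lTop = b(lPos + 3)
   If lTop >= 128 Then lTop = lTop - 256   ' Stay within a signed Long

   ReadLong = lTop * &H1000000 + b(lPos + 2) * &H10000 + _
      b(lPos + 1) * &H100& + b(lPos)
End Function

' Read a null-terminated ANSI string starting at index lPos
Public Function ReadAnsiZ(b() As Byte, ByVal lPos As Long) As String
   Dim s As String

   Do While lPos <= UBound(b)
      If b(lPos) = 0 Then Exit Do
      s = s & Chr$(b(lPos))
      lPos = lPos + 1
   Loop

   ReadAnsiZ = s
End Function
```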
   
  Getting Import Names  
   
  Getting imports requires a different approach, since the table structure is different. Here is an overview. Once again we map and load the file. This could have been done once for both imports and exports, but it seemed easier to follow the code by separating the two tasks. In a similar manner as for exports, we get the VA of the Import Directory Table (see Figure 14-3):  
 
  rvaImportDirTable = peheader.OptionalHeader. _
   DataDirectory(IMAGE_DIRECTORY_ENTRY_IMPORT).RVA

' Call RvaToVa to get VA from RVA
vaImportDirTable = ImageRvaToVa(loadimage.pFileHeader, _
   loadimage.MappedAddress, rvaImportDirTable, 0&)
 
   
  Then we need to cycle through the entries of this table, getting the Import Lookup Table and DLL name for each entry, until we encounter a null entry. For each non-NULL entry, the following Do loop gathers the import names:  
 
  Do
   VBGetTarget LookupTableEntry, pLookupTableEntry, 4
   If LookupTableEntry = 0 Then Exit Do

   ' Check most significant bit
   ' If 1 (entry is negative) then skip, since the import
   ' is by ordinal, not by name
   If LookupTableEntry >= 0 Then

      cNames = cNames + 1
      ' Mask MSB
      LookupTableEntry = LookupTableEntry And &H7FFFFFFF

      ' Convert RVA to VA to get address of function name
      pImportFunctionName = ImageRvaToVa(loadimage.pFileHeader, _
         loadimage.MappedAddress, LookupTableEntry, 0&)

      ' Name is at offset 2 in entry
      sFunctionName = LPSTRtoBSTR(pImportFunctionName + 2)

      Set li = lvImports.ListItems.Add()
      li.Text = sFunctionName
      li.ListSubItems.Add , , sDLLName

   End If

   ' Next entry
   pLookupTableEntry = pLookupTableEntry + 4
Loop
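
  The test on the most significant bit can be isolated in a small helper. The following sketch is an illustration only and is not part of the rpiPEInfo source. It assumes 32-bit lookup table entries, as in the loop above, and the name DescribeLookupEntry is hypothetical. According to the PE documentation, when the most significant bit is set, the ordinal is held in the low 16 bits of the entry:

```vb
' Classify one 32-bit Import Lookup Table entry
Public Function DescribeLookupEntry(ByVal lEntry As Long) As String
   If lEntry = 0 Then
      ' A null entry marks the end of the table
      DescribeLookupEntry = "end of table"
   ElseIf lEntry < 0 Then
      ' MSB set: import by ordinal, held in the low 16 bits
      DescribeLookupEntry = "ordinal " & (lEntry And &HFFFF&)
   Else
      ' MSB clear: entry is the RVA of a hint/name entry
      DescribeLookupEntry = "hint/name RVA &H" & Hex$(lEntry)
   End If
End Function
```

  For instance, DescribeLookupEntry(&H80000010) returns the string ordinal 16, whereas DescribeLookupEntry(&H2A3C) reports an RVA of &H2A3C.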
 



Win32 API Programming with Visual Basic
ISBN: 1565926315
Year: 1999
Pages: 31
Authors: Steven Roman
