14. Portable Executable Files | Win32 API Programming with Visual Basic


14. Portable Executable Files


		In this chapter, we will discuss executable files. Our main goal is to create a VB application that will display various information about a given executable file, including the file's export table, which is a list of the functions that the DLL makes available to calling programs. This application is called rpiPEInfo.


		Executable files are files that have the Portable Executable File format, or PE file format. According to the documentation:


		The name "Portable Executable" refers to the fact that the format is not architecture-specific.


		However, this seems to be in contradiction to the presence of a machine flag in the file (discussed later) about which the documentation states:


		An image file can be run only on the specified machine, or a system emulating it.


		In any case, PE files include EXE and DLL files, as well as OCX, DRV, and other files. Unfortunately, however, the term executable file is often applied just to EXE files. (We will not do this.) Executable files are also often called image files.


		By the way, closely related to PE files are the files that are produced by compilers, such as Visual C++. These are called object files and have the Common Object File Format (or COFF). You will often hear these types of files mentioned in the same sentence with PE files.


		The PE file format is extremely complex, and this is no doubt exacerbated by the fact that the format is poorly documented in the MSDN library. Our intention is just to give a reasonable amount of detail, but by no means a complete description of the format. You should feel free to skim over some of this material and use it as a reference for future needs.



	Module Relocation


		As we have mentioned, each module has a default base address that is stored in the file itself. The default base address for a DLL compiled under Visual C++ is &H10000000 and under Visual Basic it is &H11000000, although these addresses can easily be changed. (In VB, we can change the default base address using the Compile tab of the Project Properties dialog.)


		When a module (DLL, OCX, etc.) is loaded into the address space of a process, Windows attempts to place the module at its default base address. However, if there is another module located at that address, a conflict arises and Windows must relocate the new module.


		There are some interesting and not very well publicized consequences of module relocation that can be important to understand, especially since these consequences can have a negative effect on an application's performance.


		The problem with relocation is this. Executable files contain references to memory addresses. For instance, the object code corresponding to the source code:


		Dim i As Integer
		i = 5


		will contain a reference to a memory location containing data (the number 5). This is a direct memory reference.


		Also, the function call:


		Call AFunction(7)


		is actually a jump to the address of the function AFunction. Since this jump is made relative to the address of the Call instruction, the object code for this function call will contain the offset from the address of the function call to the address of the function itself. This is a relative memory reference. Keep in mind that the function AFunction may be in the same executable file as the call to that function or it may be in another executable file.


		Now, since actual addresses cannot be known at link time (when the executable is constructed), the linker cannot replace the object code for the examples above with the actual addresses. The best it can do is use addresses that are relative to some other location in the module.


		For a direct reference, the address is relative to the default base address of the module. For instance, suppose that the base address of a module is &H10000000. If the memory variable i has offset &H100 (say) from the start of the module, then the machine language instruction to place the value 5 in that location might look something like the following (in assembly language):


		mov dword ptr [10000100], 5


		Note the presence of the base address. In short, the base address of the executable is hardcoded into the executable.


		A similar problem arises when a call is to a function that is contained in a different module (such as another DLL). If the base address of the other DLL is, say, &H12000000, the jump instruction would be relative to that base address.


		Now, the problem is that there is no guarantee that a module will be loaded at its default base address. In fact, only one module can be loaded at any given address, and, as you can see from our process viewing utility, a process space may have a great many modules. It is not surprising that two modules might have the same default base address.


		Fortunately, Microsoft has been careful about assigning default base addresses to its modules in such a way as to minimize conflicts. You can see this by browsing the rpiEnumProcs utility. However, all modules that we create in VB (or VC++) will have the same default base address unless we specify otherwise.


		In order that a module be relocatable, executable files contain relocation information for the relocatable code in the file, that is, the code that may need adjustment if the file is not loaded at its default base address. This relocation information is also called fixup information or fixups.
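
		The effect of a fixup can be illustrated with a small sketch. It is written in Python rather than VB only so that it can be run anywhere, and it is not how the Windows loader is implemented. The function name apply_fixups and the flat list of fixup offsets are assumptions made for illustration; the real relocation data in a PE file is organized differently.

```python
import struct

def apply_fixups(image: bytearray, fixup_offsets, preferred_base: int, actual_base: int) -> None:
    """Add the load delta to each 32-bit absolute address stored in the image."""
    delta = actual_base - preferred_base
    for off in fixup_offsets:
        (addr,) = struct.unpack_from("<I", image, off)
        struct.pack_into("<I", image, off, (addr + delta) & 0xFFFFFFFF)

# A toy 8-byte "module": the second dword holds the absolute address of a
# variable at offset &H100 from the preferred base &H10000000.
module = bytearray(struct.pack("<II", 0, 0x10000100))
apply_fixups(module, [4], preferred_base=0x10000000, actual_base=0x12340000)
patched = struct.unpack_from("<I", module, 4)[0]
print(hex(patched))  # prints 0x12340100
```

		Only the addresses listed as fixups are touched; if the module loads at its preferred base, the delta is zero and nothing changes.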


		Now, if a module needs to be relocated for a particular process, its relocatable code needs to be changed for that process, but not for any other process that is using this module. Suppose, for instance, that Process1 has Module1 loaded into its address space at Module1's default address. Process2 wants to load Module1, but it already has a module loaded at Module1's default base address. Thus, Module1 will need to be relocated in Process2's address space, which will necessitate some changes to the relocatable code in Module1. Since Process1 cannot tolerate any changes to the code for Module1, Windows must make a new physical copy of Module1 so that it can perform the necessary fixups and map the module into the virtual address space of Process2. This is done using the copy-on-write mechanism that we discussed earlier.


		The point, of course, is not only that this copy-on-write process requires additional processor time, but also that the new physical copy of the module uses additional precious physical memory. Thus, it behooves us to attempt to minimize base address conflicts when creating modules.



	The PE File Format


		Now we are ready to discuss the format of an executable file. Recall that our main goal is to create the rpiPEInfo application that displays various information about a given executable file, including the file's export table.


		The overall structure of a PE file is:


		PE file header

			MS-DOS stub

			PE signature

			COFF file header (also sometimes called the PE header!)

			Optional header

		Section table (table of section headers)

		Sections


		Figure 14-1 shows the overall structure in more detail.


		*The PE File Header*


		A PE file begins with a PE file header, which consists of the following items:


		MS-DOS stub


		PE signature


		COFF file header


		Optional header


		MS-DOS stub


		The PE file header begins with the MS-DOS stub, which is an actual DOS application that by default just prints the message: "This program cannot be run in DOS mode" when the image is run in DOS mode.


		The PE signature


		Following the DOS stub is the PE signature. The PE signature lies at the file offset specified at location &H3C and currently consists of the four bytes:


		"P"/"E"/NULL/NULL


		(Of course, there is nothing to prevent a non-PE file from having this signature.)
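
		As a rough illustration, the signature check can be sketched as follows (in Python, not the VB used by rpiPEInfo). The function name find_pe_signature is hypothetical. Two details are additions not stated in the text: the value stored at &H3C is read as a 4-byte little-endian integer, and the stub is expected to begin with the letters "MZ".

```python
import struct

def find_pe_signature(data: bytes):
    """Return the file offset of the PE signature, or None if it is absent."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # The dword at &H3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        return None
    return pe_offset

# Synthetic file: an "MZ" stub whose field at &H3C points to offset &H80.
fake = bytearray(0x84)
fake[0:2] = b"MZ"
struct.pack_into("<I", fake, 0x3C, 0x80)
fake[0x80:0x84] = b"PE\0\0"
sig_offset = find_pe_signature(bytes(fake))
```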



		Figure 14-1. Structure of a PE file


		COFF file header


		The COFF file header appears immediately after the PE signature. (The COFF header appears after the signature in a PE file, but at the beginning of a COFF file.) This header contains a variety of information about the file, as shown in Table 14-1.


Table 14-1. The COFF File Header for a PE File
Offset Into COFF Header	Size in Bytes	Field	Description
0	2	Machine	Required processor type. This value is 0 for unknown and &H14C for Intel and compatible processors.
2	2	NumberOfSections	Number of sections (i.e., number of entries in section table).
4	4	TimeDateStamp	Time and date the file was created.
8	4	PointerToSymbolTable	File offset of the COFF symbol table or 0 if no table is present.
12	4	NumberOfSymbols	Number of entries in the symbol table. This can be used to locate the string table, which immediately follows the symbol table.
16	2	SizeOfOptionalHeader	Size of the PE file optional header.
18	2	Characteristics	Attributes of the file.
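
		The layout in Table 14-1 maps directly onto a fixed 20-byte record. The following Python sketch (an illustration only, not the rpiPEInfo code) unpacks such a record; the names CoffHeader and parse_coff_header are hypothetical, and the sample values are invented.

```python
import struct
from collections import namedtuple

CoffHeader = namedtuple(
    "CoffHeader",
    "Machine NumberOfSections TimeDateStamp PointerToSymbolTable "
    "NumberOfSymbols SizeOfOptionalHeader Characteristics",
)

# Little-endian field sizes from Table 14-1: 2+2+4+4+4+2+2 = 20 bytes.
COFF_FORMAT = "<HHIIIHH"

def parse_coff_header(data: bytes, offset: int) -> CoffHeader:
    """Unpack the COFF file header that starts at the given offset."""
    return CoffHeader(*struct.unpack_from(COFF_FORMAT, data, offset))

# Invented sample: Intel machine, 4 sections, 224-byte optional header.
sample = struct.pack(COFF_FORMAT, 0x14C, 4, 0, 0, 0, 224, 0x2102)
hdr = parse_coff_header(sample, 0)
```

		In a real file, the offset passed in would be the position of the PE signature plus 4.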


		The Characteristics flag holds various file information:


		IMAGE_FILE_RELOCS_STRIPPED (&H0001) Indicates that the file does not contain base relocations and must therefore be loaded at its preferred (default) base address.


		IMAGE_FILE_EXECUTABLE_IMAGE (&H0002) Indicates that the image file is valid and can be run. If this flag is not set, it generally indicates a linker error.


		IMAGE_FILE_AGGRESSIVE_WS_TRIM (&H0010) Aggressively trim working set. (I do not know how this is done, however.)


		IMAGE_FILE_LARGE_ADDRESS_AWARE (&H0020) The application can handle addresses greater than 2GB.


		IMAGE_FILE_BYTES_REVERSED_LO (&H0080) Indicates that the little-endian memory storage scheme is being used; that is, the least significant byte of a 16-bit word is stored first in memory, followed by the most significant byte. (We have encountered this issue earlier in the book.) All Intel-style PCs use little-endian. (Macintosh computers use big-endian, however.)


		IMAGE_FILE_BYTES_REVERSED_HI (&H8000) Big-endian storage is used.


		IMAGE_FILE_32BIT_MACHINE (&H0100) The PC is based on a 32-bit-word architecture.



		IMAGE_FILE_DEBUG_STRIPPED (&H0200) Debugging information has been removed from the image file.


		IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP (&H0400) If an attempt is made to run the image file from removable media, the file is copied and run from the pagefile (swap file) instead.


		IMAGE_FILE_SYSTEM (&H1000) The image file is a system file, not a user program.


		IMAGE_FILE_DLL (&H2000) The image file is a DLL.


		IMAGE_FILE_UP_SYSTEM_ONLY (&H4000) File should be run only on a uniprocessor (UP) machine, that is, a machine with a single processor.


		IMAGE_FILE_LINE_NUMS_STRIPPED (&H0004) COFF line numbers have been removed.


		IMAGE_FILE_LOCAL_SYMS_STRIPPED (&H0008) COFF symbol table entries for local symbols have been removed.


		IMAGE_FILE_16BIT_MACHINE (&H0040) Reserved.


		The rpiPEInfo utility checks for the PE file signature. Finding this signature, the code deciphers the COFF header characteristics flag.
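
		Deciphering the flag amounts to testing each bit in turn. A minimal Python sketch follows (the actual utility does this in VB; the function name decode_characteristics is hypothetical). The bit values are those listed above.

```python
CHARACTERISTICS = {
    0x0001: "IMAGE_FILE_RELOCS_STRIPPED",
    0x0002: "IMAGE_FILE_EXECUTABLE_IMAGE",
    0x0004: "IMAGE_FILE_LINE_NUMS_STRIPPED",
    0x0008: "IMAGE_FILE_LOCAL_SYMS_STRIPPED",
    0x0010: "IMAGE_FILE_AGGRESSIVE_WS_TRIM",
    0x0020: "IMAGE_FILE_LARGE_ADDRESS_AWARE",
    0x0040: "IMAGE_FILE_16BIT_MACHINE",
    0x0080: "IMAGE_FILE_BYTES_REVERSED_LO",
    0x0100: "IMAGE_FILE_32BIT_MACHINE",
    0x0200: "IMAGE_FILE_DEBUG_STRIPPED",
    0x0400: "IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP",
    0x1000: "IMAGE_FILE_SYSTEM",
    0x2000: "IMAGE_FILE_DLL",
    0x4000: "IMAGE_FILE_UP_SYSTEM_ONLY",
    0x8000: "IMAGE_FILE_BYTES_REVERSED_HI",
}

def decode_characteristics(value: int):
    """Return the names of all flags set in value, lowest bit first."""
    return [name for bit, name in sorted(CHARACTERISTICS.items()) if value & bit]

# &H2102 = executable image + 32-bit machine + DLL
flags = decode_characteristics(0x2102)
```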


		The optional header


		The optional header (which is not optional in a PE file, by the way) supplies information to the loader, that is, to the Windows system program that is responsible for loading the executable into memory. Unfortunately, this optional header is also referred to as the PE header (as opposed to the PE file header)!


		The optional header has three parts, which we describe next.


		*Part 1: standard fields.* The standard fields are described in Table 14-2. (Again, don't worry about all of the details of this and other tables. We present them here generally for reference purposes.)

Table 14-2. Standard Fields in an Optional Header
Offset Into Optional Header	Size in Bytes	Field Name	Description
0	2	Magic	An unsigned integer identifying the state of the image file. The most common value is &H10B, indicating a normal executable file.
2	1	MajorLinkerVersion	Linker major version number.
3	1	MinorLinkerVersion	Linker minor version number.
4	4	SizeOfCode	Size of the code (text) section of the PE file, or the sum of all code sections if there are multiple sections.
8	4	SizeOfInitializedData	Size of the initialized data section of the PE file, or the sum of all such sections if there are multiple data sections.
12	4	SizeOfUninitializedData	Size of the uninitialized data section (BSS) of the PE file, or the sum of all such sections if there are multiple BSS sections.
16	4	AddressOfEntryPoint	Offset of the entry point of the PE file relative to the image's base address when it is loaded into memory. For program images, this is the starting address. Optional for DLLs.
20	4	BaseOfCode	Offset of the beginning of the code section relative to the image's base address when loaded into memory.
24	4	BaseOfData	Offset of the beginning of the data section relative to the image's base address when loaded into memory.


		*Part 2: Windows-specific fields.* The next 21 fields in the optional header contain additional information needed by the Windows linker and loader. Table 14-3 describes these fields.

Table 14-3. Windows-Specific Fields
Offset Into Optional Header	Size in Bytes	Field Name	Description
28	4	ImageBase	The preferred base address of the image when loaded into memory. This must be aligned on an allocation boundary (a multiple of 64KB). The default for DLLs is &H10000000. The default for Windows NT or Windows 9x EXE files is &H00400000.
32	4	SectionAlignment	Alignment (in bytes) of the PE file's sections when loaded into memory. The default is the page size for the architecture (4KB for Intel).
36	4	FileAlignment	Alignment (in bytes) for the raw data of the sections in the image file.
40	2	MajorOperatingSystemVersion	Major version number of required OS.
42	2	MinorOperatingSystemVersion	Minor version number of required OS.
44	2	MajorImageVersion	Major version number of image.
46	2	MinorImageVersion	Minor version number of image.
48	2	MajorSubsystemVersion	Major version number of subsystem.
50	2	MinorSubsystemVersion	Minor version number of subsystem.
52	4	Reserved
56	4	SizeOfImage	Size, in bytes, of the image, including all headers.
60	4	SizeOfHeaders	Combined size of MS-DOS stub, PE header, and section headers rounded up to a multiple of FileAlignment.
64	4	Checksum	An image file checksum, used to catch errors in the file.
68	2	Subsystem	The subsystem required to run this image. See text discussion.
70	2	DllCharacteristics	DLL characteristics.
72	4	SizeOfStackReserve	Size of stack to reserve.
76	4	SizeOfStackCommit	Size of stack to commit.
80	4	SizeOfHeapReserve	Size of local heap space to reserve.
84	4	SizeOfHeapCommit	Size of local heap space to commit.
88	4	LoaderFlags	Obsolete.
92	4	NumberOfRvaAndSizes	Number of Data Directory entries in the remainder of the optional header.



		The subsystem field can be one of the following values:


		IMAGE_SUBSYSTEM_UNKNOWN (0) Unknown subsystem


		IMAGE_SUBSYSTEM_NATIVE (1) Used for device drivers and native Windows NT processes


		IMAGE_SUBSYSTEM_WINDOWS_GUI (2) Windows graphical-mode (GUI) image


		IMAGE_SUBSYSTEM_WINDOWS_CUI (3) Windows character-mode (CUI) image, that is, runs in a text-based console window


		IMAGE_SUBSYSTEM_POSIX_CUI (7) Posix image


		IMAGE_SUBSYSTEM_WINDOWS_CE_GUI (9) Runs under Windows CE
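
		To show how the offsets in Tables 14-2 and 14-3 are used, here is a Python sketch (not the rpiPEInfo code) that picks a few fields out of an optional header and translates the subsystem value into one of the names above. The function name read_optional_fields and the sample values are invented, and the offsets assume the 32-bit header layout given in the tables.

```python
import struct

SUBSYSTEMS = {
    0: "IMAGE_SUBSYSTEM_UNKNOWN",
    1: "IMAGE_SUBSYSTEM_NATIVE",
    2: "IMAGE_SUBSYSTEM_WINDOWS_GUI",
    3: "IMAGE_SUBSYSTEM_WINDOWS_CUI",
    7: "IMAGE_SUBSYSTEM_POSIX_CUI",
    9: "IMAGE_SUBSYSTEM_WINDOWS_CE_GUI",
}

def read_optional_fields(opt: bytes) -> dict:
    """Read selected fields; opt must start at the Magic field."""
    (magic,) = struct.unpack_from("<H", opt, 0)
    (image_base,) = struct.unpack_from("<I", opt, 28)
    (subsystem,) = struct.unpack_from("<H", opt, 68)
    (dir_count,) = struct.unpack_from("<I", opt, 92)
    return {
        "Magic": magic,
        "ImageBase": image_base,
        "Subsystem": SUBSYSTEMS.get(subsystem, "other ({})".format(subsystem)),
        "NumberOfRvaAndSizes": dir_count,
    }

# Invented 96-byte header: a GUI DLL at the default DLL base address.
opt = bytearray(96)
struct.pack_into("<H", opt, 0, 0x10B)
struct.pack_into("<I", opt, 28, 0x10000000)
struct.pack_into("<H", opt, 68, 2)
struct.pack_into("<I", opt, 92, 16)
fields = read_optional_fields(bytes(opt))
```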


		*Part 3: Data Directory.* Each PE file contains several tables and strings that are required by Windows. The Data Directory is a table that describes the location and size of these resources. It will prove very important to us in creating the rpiPEInfo program. The entries in the Data Directory are called Data Directories.


		Each entry in the Data Directory has the form defined in the following typedef and is thus 8 bytes long:


		typedef struct _IMAGE_DATA_DIRECTORY {
		   DWORD RVA;
		   DWORD Size;    // Size in bytes of item
		} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;


		The RVA field is the relative virtual address of the data item (table or string). This confusing field needs some explanation. First, here is what the documentation says:


		RVA (Relative Virtual Address) In an image file, an RVA is always the address of an item once loaded into memory, with the base address of the image file subtracted from it. The RVA of an item will almost always differ from its position within the file on disk (File Pointer).


		VA (Virtual Address) Same as RVA (see above), except that the base address of the image file is not subtracted. The address is called a "Virtual Address" because Windows NT creates a distinct virtual address space for each process, independent of physical memory. For almost all purposes, a virtual address should be considered just an address. A virtual address is not as predictable as an RVA, because the loader might not load the image at its preferred location.



		This seems clear enough, and would lead us to think that we could just add the RVA to the base address of the image file when loaded into memory to get the address of the table. However, this does not seem to work. Fortunately, as we will see in the rpiPEInfo program, there is an API function called ImageRvaToVa that, to quote the documentation:


		locates a relative virtual address (RVA) within the image header of a file that is mapped as a file and returns the virtual address of the corresponding byte in the file.


		(I suppose that if all we needed to do was add the RVA to the base address to get the VA, then the ImageRvaToVa function, which requires the base address, would not be needed. The phrase "mapped as a file" is the clue: when a file is mapped byte for byte as it lies on disk, rather than loaded as an image, each section sits at its file offset instead of at its virtual address, so an RVA must first be located within the section that contains it.)
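
		One plausible reading of "mapped as a file" is that the translation goes through the section headers described later in this chapter: find the section whose address range contains the RVA, then swap that section's virtual address for its file pointer. The Python sketch below illustrates that idea. It is an assumption about what such a translation involves, not a description of ImageRvaToVa taken from the documentation, and the names Section and rva_to_file_offset are hypothetical.

```python
from collections import namedtuple

Section = namedtuple(
    "Section",
    "name virtual_address virtual_size pointer_to_raw_data size_of_raw_data",
)

def rva_to_file_offset(rva, sections):
    """Return the file offset for an RVA, or None if no section contains it."""
    for s in sections:
        span = max(s.virtual_size, s.size_of_raw_data)
        if s.virtual_address <= rva < s.virtual_address + span:
            return rva - s.virtual_address + s.pointer_to_raw_data
    return None

# Invented section table: RVAs and file offsets deliberately differ.
sections = [
    Section(".text", 0x1000, 0x2F00, 0x0400, 0x3000),
    Section(".rdata", 0x4000, 0x0A00, 0x3400, 0x0A00),
]
offset = rva_to_file_offset(0x4010, sections)  # &H4010 - &H4000 + &H3400
```

		Adding the result to the address at which the file is mapped would then give a usable pointer, which would explain why simply adding the RVA to the base address fails.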


		In any case, Table 14-4 shows the standard Data Directory. Note that this table is not fixed in length and may grow. The NumberOfRvaAndSizes field in the optional header (Table 14-3) gives the number of entries in the Data Directory.

Table 14-4. Data Directory
Offset Into Optional Header	Size in Bytes	Field	Description
96	8	Export Table	Export Table address and size.
104	8	Import Table	Import Table address and size.
112	8	Resource Table	Resource Table address and size.
120	8	Exception Table	Exception Table address and size.
128	8	Certificate Table	Attribute Certificate Table address and size.
136	8	Base Relocation Table	Base Relocation Table address and size.
144	8	Debug	Debug data starting address and size.
152	8	Architecture	Architecture-specific data address and size.
160	8	Global Ptr	Relative virtual address of the global pointer register. Size member of this structure is set to 0.
168	8	TLS Table	Thread Local Storage (TLS) Table address and size.
176	8	Load Config Table	Load Configuration Table address and size.
184	8	Bound Import	Bound Import Table address and size.
192	8	IAT	Import Address Table address and size.
200	8	Delay Import Descriptor	Address and size of the Delay Import Descriptor.
208	16	Reserved
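
		Because every entry is an 8-byte RVA/size pair starting at offset 96, the Data Directory can be read with a simple loop driven by NumberOfRvaAndSizes. The following Python sketch shows the idea (it is not the rpiPEInfo code; read_data_directory and the sample values are invented for illustration).

```python
import struct

DIRECTORY_NAMES = [
    "Export Table", "Import Table", "Resource Table", "Exception Table",
    "Certificate Table", "Base Relocation Table", "Debug", "Architecture",
    "Global Ptr", "TLS Table", "Load Config Table", "Bound Import", "IAT",
    "Delay Import Descriptor",
]

def read_data_directory(opt: bytes):
    """Return (name, rva, size) tuples; opt starts at the optional header."""
    (count,) = struct.unpack_from("<I", opt, 92)   # NumberOfRvaAndSizes
    entries = []
    for i in range(count):
        start = 96 + 8 * i
        if start + 8 > len(opt):
            break                                  # truncated header: stop
        rva, size = struct.unpack_from("<II", opt, start)
        name = DIRECTORY_NAMES[i] if i < len(DIRECTORY_NAMES) else "Reserved"
        entries.append((name, rva, size))
    return entries

# Invented optional header with 16 entries, two of them filled in.
opt = bytearray(96 + 16 * 8)
struct.pack_into("<I", opt, 92, 16)
struct.pack_into("<II", opt, 96, 0x5000, 0x120)    # Export Table
struct.pack_into("<II", opt, 104, 0x6000, 0x50)    # Import Table
directory = read_data_directory(bytes(opt))
```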


		*The Section Table*


		After the optional header (and hence the PE file header itself), we find the section table, each entry of which is a section header. The main bulk of a PE file is contained in sections, of which these entries are the headers. Note that all of the headers come first, followed by all of the sections (rather than a header followed immediately by its section). Each entry of the section table gives, among other things, the offset and size of the corresponding section.


		Note that the section table immediately follows the optional header, which is the only way to locate the section table. (The size of the optional header is specified in the COFF file header.) Also, the number of entries in the section table is given by the NumberOfSections field in the COFF file header. (Entries in the section table are numbered starting from 1.)


		Each section header (section table entry) has the format shown in Table 14-5 and is 40 bytes in size.

Table 14-5. Format of a Section Table Entry (Section Header)
Offset	Size in Bytes	Field	Description
0	8	Name	An 8-byte, null-padded ASCII string.
8	4	VirtualSize	Total size of the section when loaded into memory. If this value is greater than SizeOfRawData, the section is zero-padded.
12	4	VirtualAddress	The address of the first byte of the section, when loaded into memory, relative to the image base.
16	4	SizeOfRawData	Size of the initialized data on disk.
20	4	PointerToRawData	File pointer to section's first page within the file. This must be a multiple of the FileAlignment value from the optional header.
24	4	PointerToRelocations	0 for PE files.
28	4	PointerToLinenumbers	File pointer to beginning of line-number entries for the section. Set to 0 if there are no COFF line numbers.
32	2	NumberOfRelocations	0 for PE files.
34	2	NumberOfLinenumbers	Number of line-number entries for the section.
36	4	Characteristics	Flags describing the section's characteristics.
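
		A section table laid out as in Table 14-5 can be walked 40 bytes at a time. Here is a Python sketch of that walk (an illustration, not the VB code from the utility); parse_section_table is a hypothetical name and the sample header is invented.

```python
import struct

# Field sizes from Table 14-5: 8 + 6*4 + 2*2 + 4 = 40 bytes per entry.
SECTION_FORMAT = "<8sIIIIIIHHI"

def parse_section_table(data: bytes, table_offset: int, count: int):
    """Return a list of dicts, one per section header."""
    sections = []
    for i in range(count):
        f = struct.unpack_from(SECTION_FORMAT, data, table_offset + 40 * i)
        sections.append({
            "Name": f[0].rstrip(b"\0").decode("ascii", errors="replace"),
            "VirtualSize": f[1],
            "VirtualAddress": f[2],
            "SizeOfRawData": f[3],
            "PointerToRawData": f[4],
            "Characteristics": f[9],
        })
    return sections

# One invented .text header; the 8s format pads the name with nulls.
raw = struct.pack(SECTION_FORMAT, b".text", 0x2F00, 0x1000, 0x3000,
                  0x400, 0, 0, 0, 0, 0x60000020)
table = parse_section_table(raw, 0, 1)
```

		In a real file, table_offset would be the offset of the optional header plus SizeOfOptionalHeader, and count would be NumberOfSections, both taken from the COFF file header.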


		*Sections*


		An image file typically has some or all of the following sections:


		Text section (executable code)


		The text section is named .text and contains the executable code for the file.


		Data sections (.bss, .rdata, and .data)


		The data sections are named .bss, .rdata, and .data. The .bss section contains uninitialized data, including static variables. The .rdata section contains read-only data, such as literal strings and constants. All other variables are stored in the .data section.


		Resource section


		This section, named .rsrc, contains information about the resources used by the file.


		Relocation section


		This section, named .reloc, contains the Fix-Up Table. This table contains entries for all fixups for the file. We discussed fixups when we discussed module relocation earlier in this chapter. Basically, fixups are required to make address adjustments in an image file based on its actual load (base) address, which may differ from its default load address.


		Export section


		The export data section, named .edata, contains information about exported functions and global variables. In particular, the export section begins with a table called the Export Directory Table, shown in Table 14-6.

Table 14-6. The Export Directory Table
Offset	Size in Bytes	Field	Description
0	4	Export flags	Reserved.
4	4	Time/date stamp	Time and date the export data was created.
8	2	Major version	Major version number.
10	2	Minor version	Minor version number.
12	4	Name RVA	RVA of the ASCII string containing the name of the DLL.
16	4	Ordinal base	Starting ordinal number for exports as listed in the Export Address Table.
20	4	Address table entries	Number of entries in the Export Address Table.
24	4	Number of name pointers	Number of entries in the Name Pointer Table (and Ordinal Table).
28	4	Export address table RVA	RVA of the Export Address Table.
32	4	Name pointer RVA	RVA of the Export Name Pointer Table.
36	4	Ordinal table RVA	RVA of the Ordinal Table.


		Incidentally, the Export Address Table contains the address of the exported functions in the PE file, but we are only interested in the names of these functions. In fact, our interest centers on the tables shown in Figure 14-2.


		Figure 14-2. The Export Name tables


		Note that at offset 32 of the Export Directory Table is the address of the Export Name Pointer Table (relative to the image base). This table is an array of pointers into the Export Name Table, which, at last, contains the (null-terminated) names of the exported functions. (Since these names may vary in length, we need a pointer to each name.)
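
		The two-step lookup (name pointer table, then the names themselves) can be sketched in Python as follows. This is an illustration rather than the rpiPEInfo code: export_names and read_cstring are hypothetical names, and for simplicity the synthetic buffer is arranged so that every RVA equals its offset in the buffer. With a real file, an RVA-to-address translation would be passed in as rva_to_offset.

```python
import struct

def read_cstring(buf: bytes, offset: int) -> str:
    """Read a null-terminated ASCII string starting at offset."""
    end = buf.index(b"\0", offset)
    return buf[offset:end].decode("ascii")

def export_names(buf: bytes, dir_offset: int, rva_to_offset=lambda rva: rva):
    """List exported names using the Export Directory Table at dir_offset."""
    (count,) = struct.unpack_from("<I", buf, dir_offset + 24)         # name pointers
    (name_ptr_rva,) = struct.unpack_from("<I", buf, dir_offset + 32)  # pointer table
    table = rva_to_offset(name_ptr_rva)
    names = []
    for i in range(count):
        (name_rva,) = struct.unpack_from("<I", buf, table + 4 * i)
        names.append(read_cstring(buf, rva_to_offset(name_rva)))
    return names

# Synthetic layout: 40-byte directory, pointer table at &H28, names at &H30.
buf = bytearray(0x40)
struct.pack_into("<I", buf, 24, 2)
struct.pack_into("<I", buf, 32, 0x28)
struct.pack_into("<II", buf, 0x28, 0x30, 0x36)
buf[0x30:0x3B] = b"Alpha\0Beta\0"
names = export_names(bytes(buf), 0)
```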


		As it happens, not all DLLs that export functions appear to have an export section (at least not one named .edata). This means that we cannot simply search for an export section in order to find the Export Directory Table. We must take a different approach in the rpiPEInfo application. Fortunately, the export table data directory entry (see Table 14-4) seems always to be valid, so we can use this entry to get the RVA of the Export Directory Table.


		Import section


		This section, called .idata, contains information about functions that are imported by the file. Let us take a closer look at this section. Figure 14-3 shows the details. (Incidentally, in the Hint/Name Table, the documentation says that the name begins at offset 4 in each entry, but experimentation seems to indicate that the correct offset is 2.)


		Figure 14-3. Import table section details


		The section begins with an Import Directory Table. There is one 20-byte entry in this table for each DLL from which the image file imports functions. Let us refer to these DLLs as import DLLs. Each entry in this table has two important items for our purposes: the RVA (relative virtual address) of the Import Lookup Table for the corresponding import DLL and the RVA of a string that gives the name of this import DLL.


		Each entry in the Import Lookup Table for an import DLL is a 32-bit field. There is one entry for each imported function from this import DLL. The high-order bit of each entry signals whether to import the function by ordinal number, that is, by position (bit = 1) or by name (bit = 0). When the function is imported by name, the remaining 31 bits form the RVA of its entry in the Hint/Name Table.


		Each entry in the Hint/Name Table contains the actual name of the function (not a pointer to the name) starting at offset 2, immediately after a 2-byte hint. Thus, the entry lengths in the Hint/Name Table will vary. An entry is padded with another byte (if necessary) only to make the entry's length even, so that the next entry starts on an even boundary.
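
		The following Python sketch (illustrative only; the function names are hypothetical) decodes one Import Lookup Table entry and one Hint/Name Table entry. It follows the published PE/COFF specification, in which a set high-order bit means import by ordinal, the ordinal occupies the low 16 bits, and the name begins at offset 2 after a 2-byte hint.

```python
import struct

ORDINAL_FLAG = 0x80000000

def decode_lookup_entry(entry: int):
    """Classify a 32-bit Import Lookup Table entry."""
    if entry & ORDINAL_FLAG:
        return ("ordinal", entry & 0xFFFF)
    return ("name_rva", entry & 0x7FFFFFFF)

def read_hint_name(buf: bytes, offset: int):
    """Return (hint, name, entry_length) for a Hint/Name Table entry."""
    (hint,) = struct.unpack_from("<H", buf, offset)
    end = buf.index(b"\0", offset + 2)
    name = buf[offset + 2:end].decode("ascii")
    length = 2 + len(name) + 1      # hint + name + terminating null
    length += length % 2            # pad to an even length
    return hint, name, length

# Invented entries: 2 + 11 + 1 = 14 bytes (already even); 2 + 4 + 1 = 7 -> 8.
entry_a = read_hint_name(struct.pack("<H", 0x1A) + b"CreateFileA\0", 0)
entry_b = read_hint_name(struct.pack("<H", 3) + b"Beep\0\0", 0)
```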


	Example: Getting PE File Information


		Now that we have a basic understanding of the format of a PE file, we can create the rpiPEInfo utility. Figure 14-4 shows the utility as it dissects the file COMCTL32.DLL.


		Figure 14-4. The rpiPEInfo utility


		The complete source code is on the accompanying CD, so we will just cover some of the highlights. As we have seen, the PE file itself contains relative virtual addresses (RVAs) for most items. We also remarked that these addresses are relative to the file's image in memory, which is not the same as its disk image. Accordingly, the approach taken in the rpiPEInfo utility is first to map the executable file in question into memory and then translate relative virtual addresses (RVAs) into virtual addresses (VAs).


		To map a PE file into memory, we can use the MapAndLoad function exported by the IMAGEHLP.DLL library:


		BOOL MapAndLoad(
		   IN LPSTR ImageName,
		   IN LPSTR DllPath,
		   OUT PLOADED_IMAGE LoadedImage,
		   IN BOOL DotDll,
		   IN BOOL ReadOnly
		);


		or, in VB:


		Public Declare Function MapAndLoad Lib "Imagehlp.dll" ( _
		   ByVal ImageName As String, _
		   ByVal DLLPath As String, _
		   LoadedImage As LOADED_IMAGE, _
		   ByVal DotDLL As Long, _
		   ByVal ReadOnly As Long) As Long


		There is also a corresponding UnMapAndLoad function:


		BOOL UnMapAndLoad( IN PLOADED_IMAGE LoadedImage );


		or, in VB:


		Public Declare Function UnMapAndLoad Lib "Imagehlp.dll" ( _
		   LoadedImage As LOADED_IMAGE) As Long


		The MapAndLoad function maps an image file into virtual memory and fills a LOADED_IMAGE structure with all sorts of useful stuff:

		Public Type LOADED_IMAGE         ' 48 bytes (46 bytes packed)
		   ModuleName As Long
		   hFile As Long
		   MappedAddress As Long         ' Base address of mapped file
		   pFileHeader As Long           ' Pointer to IMAGE_PE_FILE_HEADER
		   pLastRvaSection As Long       ' Pointer to first COFF section header (section table)??
		   NumberOfSections As Long
		   pSections As Long             ' Pointer to first COFF section header (section table)??
		   Characteristics As Long       ' Image characteristics value
		   fSystemImage As Byte
		   fDOSImage As Byte
		   Links As LIST_ENTRY           ' 2 longs
		   SizeOfImage As Long
		End Type


		For us, the important information is the base address of the loaded image (MappedAddress) and the pointer to the PE file header (pFileHeader), which, incidentally, the documentation also refers to as the NT headers.



		To get a VA from an RVA, we use the function:


		LPVOID ImageRvaToVa(
		   IN PIMAGE_PE_FILE_HEADER NtHeaders,
		   IN LPVOID Base,
		   IN DWORD Rva,
		   IN OUT PIMAGE_SECTION_HEADER *LastRvaSection
		);


		or, in VB:


		Public Declare Function ImageRvaToVa Lib "Imagehlp.dll" ( _
		   ByVal NTHeaders As Long, _
		   ByVal Base As Long, _
		   ByVal rva As Long, _
		   ByVal LastRvaSection As Long) As Long


		Note that this function requires not only the RVA, but also a pointer to the PE file header (NT headers) as well as the base address of the loaded image. This is further evidence that more is going on in translating an RVA to a VA than just adding the base address. (The last parameter to ImageRvaToVa is not important.) This function returns the VA.


		*The Structures*


		There are several structures (user-defined types) that we require. These structures can be found in the Winnt.h include file and reflect various tables in the PE file. (We changed a few names to bring them into conformance with the documentation on PE files.)


		In the rpiPEInfo program, we must declare the structures from the bottom up, in order to avoid forward reference error messages from VB. It will help to refer to Figure 14-1.


		We begin with a Data Directory entry, which is used to get the RVA for the export and import tables:


		' Data Directory entry
		Public Type IMAGE_DATA_DIRECTORY   ' 8 bytes
		   RVA As Long
		   size As Long
		End Type


		Next comes the optional header, all three parts. The constant IMAGE_NUMBEROF_DIRECTORY_ENTRIES (see the last member below) is defined, in one of the include files, to be equal to 16, even though the documentation says that the number of Data Directory entries is not fixed.


		' Optional header (all three parts)
		Public Type IMAGE_OPTIONAL_HEADER   ' 232 bytes
		   ' Standard fields.
		   Magic As Integer
		   MajorLinkerVersion As Byte
		   MinorLinkerVersion As Byte
		   SizeOfCode As Long
		   SizeOfInitializedData As Long
		   SizeOfUninitializedData As Long
		   AddressOfEntryPoint As Long
		   BaseOfCode As Long
		   BaseOfData As Long
		   ' NT additional fields.
		   ImageBase As Long
		   SectionAlignment As Long
		   FileAlignment As Long
		   MajorOperatingSystemVersion As Integer
		   MinorOperatingSystemVersion As Integer
		   MajorImageVersion As Integer
		   MinorImageVersion As Integer
		   MajorSubsystemVersion As Integer
		   MinorSubsystemVersion As Integer
		   Win32VersionValue As Long
		   SizeOfImage As Long
		   SizeOfHeaders As Long
		   CheckSum As Long
		   Subsystem As Integer
		   DllCharacteristics As Integer
		   SizeOfStackReserve As Long
		   SizeOfStackCommit As Long
		   SizeOfHeapReserve As Long
		   SizeOfHeapCommit As Long
		   LoaderFlags As Long
		   NumberOfRvaAndSizes As Long      ' 96
		   ' Data directories
		   DataDirectory(0 To IMAGE_NUMBEROF_DIRECTORY_ENTRIES) _
		      As IMAGE_DATA_DIRECTORY       ' 17*8 + 96 = 232
		End Type


		Next comes the PE file header without the MS-DOS stub:


		' PE File header without MS-DOS stub
		Public Type IMAGE_PE_FILE_HEADER            ' 256 bytes
		   Signature As Long                        ' 4 bytes -- PE signature
		   FileHeader As IMAGE_COFF_HEADER          ' 20 bytes -- COFF header
		   OptionalHeader As IMAGE_OPTIONAL_HEADER  ' 232 bytes
		End Type


		We also declared the COFF header itself, but didn't use it:


		' COFF File header
		Public Type IMAGE_COFF_HEADER   ' 20 bytes
		   Machine As Integer
		   NumberOfSections As Integer
		   TimeDateStamp As Long
		   PointerToSymbolTable As Long
		   NumberOfSymbols As Long
		   SizeOfOptionalHeader As Integer
		   Characteristics As Integer
		End Type


		Finally, we have the Export Directory Table:


		' Export Directory table
		Public Type IMAGE_EXPORT_DIRECTORY_TABLE ' 40 bytes
		   Characteristics As Long
		   TimeDateStamp As Long
		   MajorVersion As Integer
		   MinorVersion As Integer
		   Name As Long
		   Base As Long
		   NumberOfFunctions As Long
		   NumberOfNames As Long ' We need this one
		   pAddressOfFunctions As Long
		   ExportNamePointerTableRVA As Long ' We need this one
		   pAddressOfNameOrdinals As Long
		End Type


		We declared a few other structures, as you can see by looking at the source code.


		*Getting Version Information*


		The first function we call when the user clicks on a file name is GetVersionInfo. This function uses several functions from the VERSION.DLL library: GetFileVersionInfoSize, GetFileVersionInfo, VerLanguageName, and VerQueryValue. These functions are used primarily by setup programs.


		The GetFileVersionInfo function fills a buffer with version information about the file. To get at this information, we use VerQueryValue:


		Public Declare Function VerQueryValue Lib "version.dll" _
		   Alias "VerQueryValueA" ( _
		   pBlock As Byte, _
		   ByVal lpSubBlock As String, _
		   lplpBuffer As Long, _
		   puLen As Long _
		) As Long


		passing various descriptive strings in the lpSubBlock parameter. For instance, passing the string "\" returns the root block, which is a pointer to the following structure (MS stands for most significant and LS for least significant):


		Public Type VS_FIXEDFILEINFO
		   dwSignature As Long
		   dwStrucVersion As Long
		   dwFileVersionMS As Long
		   dwFileVersionLS As Long
		   dwProductVersionMS As Long
		   dwProductVersionLS As Long
		   dwFileFlagsMask As Long
		   dwFileFlags As Long
		   dwFileOS As Long
		   dwFileType As Long
		   dwFileSubtype As Long
		   dwFileDateMS As Long
		   dwFileDateLS As Long
		End Type
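		Each MS/LS pair packs four 16-bit numbers into two Longs, so a version such as dwFileVersionMS/dwFileVersionLS is usually displayed as four dotted fields. As a small illustration in Python (the function name format_version is hypothetical, and the sample values are invented):

```python
def format_version(ms, ls):
    """Combine the MS and LS halves of a version into 'a.b.c.d'."""
    # Mask to 32 bits so that negative signed values (as a VB Long
    # would hold them) are handled the same way as unsigned ones.
    ms &= 0xFFFFFFFF
    ls &= 0xFFFFFFFF
    return "%d.%d.%d.%d" % (ms >> 16, ms & 0xFFFF, ls >> 16, ls & 0xFFFF)

# Invented sample values, not read from a real file
print(format_version(0x00060000, 0x17700001))
```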


		There is a tricky part to getting the company name: we first need to search the translation array that specifies the languages that are supported by the file. Once we have found the language and code page identifiers for English, we can pass VerQueryValue the subblock:


		"\StringFileInfo\" & sLangID & sCodePageID & "\CompanyName"

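		To make the shape of that subblock string concrete, here is a hedged Python sketch. It assumes that each translation array entry is a 16-bit language identifier followed by a 16-bit code page, and that each is written as four hexadecimal digits; the function name company_name_subblock is our own.

```python
import struct

def company_name_subblock(translation_entry):
    """Build the CompanyName subblock from one 4-byte translation entry."""
    # Assumed entry layout: language ID (16 bits), then code page (16 bits)
    lang_id, code_page = struct.unpack("<HH", translation_entry)
    return "\\StringFileInfo\\%04X%04X\\CompanyName" % (lang_id, code_page)

# U.S. English (0x0409) with code page 1252, packed as it would appear in the array
entry = struct.pack("<HH", 0x0409, 1252)
print(company_name_subblock(entry))
```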

		*Getting File Characteristics*


		The next step is to open the file and look for the PE signature. According to the PE documentation, the file offset of the PE signature is stored at offset &H3C of the file. The following code in the function GetPEFileChars reads that offset, then gets the 4 bytes located there and checks them for the correct signature. (The + 1 is needed because byte positions in VB's Get statement start at 1, not 0.)


		' Check for PE file signature
		Get #fr, &H3C + 1, bSigOffset
		Get #fr, bSigOffset + 1, lSignature
		If Not lSignature = &H4550 Then ' PE\0\0 backwards in memory
		   Close fr
		   txtDetails = txtDetails & vbCrLf & " No PE signature"
		   Exit Function
		End If
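		The same check can be sketched in Python on a block of bytes. This is an illustration rather than the rpiPEInfo code: the name has_pe_signature and the synthetic header are made up, and the sketch reads the value at &H3C as a 32-bit little-endian number.

```python
import struct

def has_pe_signature(data):
    """True if data holds 'PE\\0\\0' at the file offset stored at 0x3C."""
    if len(data) < 0x40:
        return False
    (sig_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[sig_offset:sig_offset + 4] == b"PE\x00\x00"

# Minimal synthetic header for illustration:
# 'MZ', zero padding, signature offset 0x40 stored at 0x3C, then the signature
stub = bytearray(0x44)
stub[0:2] = b"MZ"
struct.pack_into("<I", stub, 0x3C, 0x40)
stub[0x40:0x44] = b"PE\x00\x00"
print(has_pe_signature(bytes(stub)))
```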


		It is then a straightforward matter to get the characteristics flag from the COFF header as well as some other data, such as the section names. There does seem to be a little problem here, however. You may notice that some section names are a bit strange, as in Figure 14-4. The number of sections is retrieved from the COFF header and must be correct, since it is used to calculate the offset of the Section Table. Whatever.


		*Getting Export Names*


		The function used to get the exports is where the fun really starts. This function is called only for PE files. Here is an overview.


		First, we map the file (sFile) into memory and load a LOADED_IMAGE structure:


		Dim loadimage As LOADED_IMAGE
		lret = MapAndLoad(sFile, "", loadimage, True, True)


		From this, we can get the base address of the loaded image:


		baseaddr = loadimage.MappedAddress



		Next, we copy the PE file header to our own variable, so that we can access its members:


		Dim peheader As IMAGE_PE_FILE_HEADER
		CopyMemory ByVal VarPtr(peheader), ByVal loadimage.pFileHeader, 256


		Next, we retrieve the VA from the RVA of the first Data Directory, which is the Data Directory for the Export Directory Table. The constant IMAGE_DIRECTORY_ENTRY_EXPORT is defined to be 0, the index of the first Data Directory:


		rvaExportDirTable = peheader.OptionalHeader. _
		   DataDirectory(IMAGE_DIRECTORY_ENTRY_EXPORT).RVA
		vaExportDirTable = ImageRvaToVa(loadimage.pFileHeader, _
		   loadimage.MappedAddress, rvaExportDirTable, 0&)
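		ImageRvaToVa hides the arithmetic involved. A rough Python sketch of the idea follows, under the assumption that the conversion finds the section whose address range contains the RVA and then offsets into that section's raw data in the file. The function name rva_to_file_offset and the section values are hypothetical; adding the result to the mapped base address would give the VA.

```python
def rva_to_file_offset(rva, sections):
    """Translate an RVA to a file offset, or return None if no section holds it.

    sections is a list of (virtual_address, size, pointer_to_raw_data)
    tuples, one per entry in the Section Table.
    """
    for virtual_address, size, pointer_to_raw_data in sections:
        if virtual_address <= rva < virtual_address + size:
            return pointer_to_raw_data + (rva - virtual_address)
    return None

# Invented section layout for illustration only
sections = [(0x1000, 0x800, 0x400), (0x2000, 0x200, 0xC00)]
print(hex(rva_to_file_offset(0x2010, sections)))
```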


		Again we make a copy into our own structure:


		' Export directory
		Dim exportdir As IMAGE_EXPORT_DIRECTORY_TABLE
		CopyMemory ByVal VarPtr(exportdir), ByVal vaExportDirTable, LenB(exportdir)


		From this copy (see Figure 14-2), we can get the number of exported names:


		cNames = exportdir.NumberOfNames


		Now, exportdir.ExportNamePointerTableRVA is the RVA for the Export Name Pointer Table (see Figure 14-2), so we get the VA for this table as follows:


		ExportNamePointerTableVA = ImageRvaToVa(loadimage.pFileHeader, _
		   loadimage.MappedAddress, exportdir.ExportNamePointerTableRVA, 0&)


		Now we can simply march through the Export Name Pointer Table, collecting the target strings:


		' Start at the beginning of names
		pNextAddress = ExportNamePointerTableVA
		' Get the next address (to export name)
		VBGetTarget lNextAddress, pNextAddress, 4
		lvExports.ListItems.Clear
		For i = 0 To cNames - 1
		   ' Convert address of this name from RVA to VA
		   lNextAddress = ImageRvaToVa(loadimage.pFileHeader, _
		      loadimage.MappedAddress, lNextAddress, 0&)
		   ' Convert ANSI string to BSTR
		   sName = LPSTRtoBSTR(lNextAddress)
		   lvExports.ListItems.Add , , sName
		   ' Point to next address in table
		   pNextAddress = pNextAddress + 4
		   ' Get the address
		   VBGetTarget lNextAddress, pNextAddress, 4
		Next
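		The walk through the Export Name Pointer Table can also be sketched in Python on a buffer of bytes. Everything here is illustrative: the names read_cstring and export_names are our own, the buffer is synthetic, and the RVA-to-offset conversion is passed in as a function (an identity mapping below, which is not how real PE files behave).

```python
import struct

def read_cstring(data, offset):
    """Read a NUL-terminated ANSI string starting at offset."""
    end = data.index(b"\x00", offset)
    return data[offset:end].decode("ascii")

def export_names(data, name_table_offset, count, rva_to_offset):
    """Walk the Export Name Pointer Table and return the exported names."""
    names = []
    for i in range(count):
        # Each table entry is a 4-byte RVA pointing at one name
        (name_rva,) = struct.unpack_from("<I", data, name_table_offset + 4 * i)
        names.append(read_cstring(data, rva_to_offset(name_rva)))
    return names

# Synthetic buffer: two names followed by a two-entry pointer table.
# RVAs equal file offsets here, purely to keep the example small.
strings = b"Alpha\x00Beta\x00"
table = struct.pack("<II", 0, 6)
image = strings + table
print(export_names(image, len(strings), 2, lambda rva: rva))
```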


		Finally, we call UnMapAndLoad.


		*Getting Import Names*


		Getting imports requires a different approach, since the table structure is different. Here is an overview. Once again we map and load the file. This could have been done once for both imports and exports, but the code seemed easier to follow with the two tasks separated. As with exports, we get the VA of the Import Directory Table (see Figure 14-3):


		rvaImportDirTable = peheader.OptionalHeader. _
		   DataDirectory(IMAGE_DIRECTORY_ENTRY_IMPORT).RVA
		' Call RvaToVa to get VA from RVA
		vaImportDirTable = ImageRvaToVa(loadimage.pFileHeader, _
		   loadimage.MappedAddress, rvaImportDirTable, 0&)


		Then we need to cycle through the entries of this table, getting the Import Lookup Table and DLL name for each entry, until we encounter a null entry. For each non-NULL entry, the following Do loop gathers the import names:


		Do
		   VBGetTarget LookupTableEntry, pLookupTableEntry, 4
		   If LookupTableEntry = 0 Then Exit Do
		   ' Check most significant bit (a set MSB makes the Long negative)
		   ' If 1 then skip, since the import is by ordinal, not by name
		   If LookupTableEntry >= 0 Then
		      cNames = cNames + 1
		      ' Mask MSB
		      LookupTableEntry = LookupTableEntry And &H7FFFFFFF
		      ' Convert RVA to VA to get address of function name
		      pImportFunctionName = ImageRvaToVa(loadimage.pFileHeader, _
		         loadimage.MappedAddress, LookupTableEntry, 0&)
		      ' Name is at offset 2 in entry
		      sFunctionName = LPSTRtoBSTR(pImportFunctionName + 2)
		      Set li = lvImports.ListItems.Add()
		      li.Text = sFunctionName
		      li.ListSubItems.Add , , sDLLName
		   End If
		   ' Next entry
		   pLookupTableEntry = pLookupTableEntry + 4
		Loop
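		The test on the most significant bit is the heart of this loop. As a small Python illustration for 32-bit lookup entries (the name decode_lookup_entry and the sample values are hypothetical), an entry with the top bit set is an import by ordinal, and otherwise the low 31 bits are the RVA of the entry that holds the function name:

```python
ORDINAL_FLAG = 0x80000000  # most significant bit of a 32-bit lookup entry

def decode_lookup_entry(entry):
    """Classify one Import Lookup Table entry as ('ordinal', n) or ('name', rva)."""
    # Mask to 32 bits so a negative signed value (as in a VB Long) also works
    entry &= 0xFFFFFFFF
    if entry & ORDINAL_FLAG:
        return ("ordinal", entry & 0xFFFF)
    return ("name", entry & 0x7FFFFFFF)

# Invented sample entries
print(decode_lookup_entry(0x80000010))
print(decode_lookup_entry(0x00003456))
```

		In the VB code the same test is written as LookupTableEntry >= 0, because a Long with its top bit set is negative.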