2. Preliminaries

In this chapter, we cover some preliminary topics that are required for the remainder of the book.
Character Codes
A character code is just a mapping from a set of consecutive integers (starting with 0) to a set of characters. The Windows API uses several different character codes.
ASCII
The original ASCII (American Standard Code for Information Interchange) character code contains 128 characters, including the familiar Roman letters, Arabic digits, and various punctuation marks, mapped to the first 128 nonnegative integers. These codes need only 7 bits but are usually stored as 8-bit binary numbers. When IBM introduced the PC in 1981, it extended the ASCII character code by adding an additional 128 characters, including some foreign language characters, a very few mathematical symbols, and several graphics symbols for making boxes and the like on a character mode display. This 256-character code is known as the extended ASCII code.
ANSI
Microsoft adopted the American National Standards Institute (ANSI) character code for use with Windows 1.0 (released in 1985). The first 128 characters and their codes are the same as in ASCII, but the upper 128 characters are different. For some reason, not all of the upper 128 code values were mapped to characters. I have yet to hear a reasonable explanation for this seemingly unnecessary waste of precious resources.

In 1987, Microsoft introduced the idea of a code page, which is just a mapping from characters to numbers. IBM's original extended ASCII character code became code page 437 or MS-DOS Latin US. All code pages have the same lower 128 characters (in the same places) but the upper 128 characters vary from page to page.
Although code pages proliferated, they did not solve the real problem, which is that 256 characters are simply not enough to satisfy the needs of the entire world! For instance, the Chinese and Japanese languages each have more than 20,000 characters, and there are many hundreds of mathematical and scientific symbols.
DBCS
To accommodate large character sets, the awkward double-byte character set (or DBCS) was devised. To say that DBCS is awkward is to be overly kind. We could say a lot more about a character map in which some characters have 8-bit codes whereas others have 16-bit codes! It is not hard to imagine the problems that such a character code creates. For instance, there is no way to determine the number of characters in a binary string from its length!
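To make the counting problem concrete, here is a minimal sketch of what it takes to count the characters in a DBCS string. The lead-byte ranges are an assumption chosen for illustration (they follow the Shift-JIS convention; other DBCS code pages use other ranges), and the names isLeadByte and countDbcsChars are hypothetical helpers, not Win32 functions.

```cpp
#include <cstddef>
#include <string>

// Assumption for illustration: lead bytes follow the Shift-JIS convention
// (0x81-0x9F and 0xE0-0xFC). Other DBCS code pages use other ranges.
bool isLeadByte(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Count characters (not bytes) by walking the string from the start.
std::size_t countDbcsChars(const std::string& bytes)
{
    std::size_t count = 0;
    std::size_t i = 0;
    while (i < bytes.size()) {
        // A lead byte consumes the following byte as well.
        bool twoByte = isLeadByte(static_cast<unsigned char>(bytes[i]))
                       && i + 1 < bytes.size();
        i += twoByte ? 2 : 1;
        ++count;
    }
    return count;
}
```

Under these assumptions, a 4-byte string made up of one 1-byte character, one 2-byte character, and another 1-byte character contains only 3 characters, and the only way to discover this is to scan the string from the beginning.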
Fortunately, DBCS is supported only in versions of Windows designed specifically for countries that require this code. We will ignore it completely, but you will see it mentioned in the Win32 documentation.
Unicode
The Unicode project began in 1988. Unicode is a 2-byte character code, allowing for 65,536 distinct characters, which is enough for the foreseeable future. Version 2.0 of the standard includes 38,885 characters. Microsoft uses the term wide character as a synonym for Unicode character, although others use the term to represent any 2-byte character code.
The first 256 Unicode characters are the same as in the ISO 8859-1 (Latin-1) character code, whose first 128 characters are in turn the same as in ASCII, with the high-order byte of each codeword set to 0. Beyond this, several blocks of contiguous code values are mapped to blocks of related symbols, with lots of room in between blocks for future expansion. For instance, the Greek alphabet lies within the range &H370-&H3FF, along with some other characters (the lowercase alpha has code number &H3B1).
The Unicode Consortium is responsible for overseeing the development of the Unicode character set and providing technical information about the code. The Consortium cooperates with the ISO in refining the Unicode specification and expanding the character set. The Consortium consists of major computer corporations (such as IBM, Apple, Hewlett-Packard, and Xerox), software companies (such as Microsoft, Adobe, Lotus, and Netscape), international agencies, universities, and even some individuals. You can find more information about Unicode and the Unicode Consortium at http://www.unicode.org/unicode/contents.html.

It is important to note that when a 2-byte integer (such as a Unicode codeword) is stored in memory, the low-order byte is stored first. For instance, the Unicode codeword &H0041 will appear in memory as 41 00.
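The byte order just described can be sketched in a few lines of C++. The function name toLittleEndianBytes is a hypothetical helper introduced for this illustration. It builds the byte sequence explicitly, so it shows the Intel layout regardless of the machine on which it runs.

```cpp
#include <array>
#include <cstdint>

// Lay out a 16-bit codeword the way an Intel (little-endian) machine
// stores it: low-order byte first, high-order byte second.
std::array<std::uint8_t, 2> toLittleEndianBytes(std::uint16_t codeword)
{
    return {{ static_cast<std::uint8_t>(codeword & 0xFF),
              static_cast<std::uint8_t>(codeword >> 8) }};
}
```

For the codeword 0x0041 this yields the bytes 41 00, and for the lowercase alpha (0x03B1) it yields B1 03.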
Unicode Support Under Windows
Windows NT uses Unicode as its native character code. In other words, Windows NT was designed specifically to use Unicode. It also supports ANSI for compatibility purposes. However, Windows 9x does not support Unicode, except in some special cases. In particular, Unicode is used for all OLE-related API functions and for a handful of other API functions. As I mentioned earlier, this is one of Windows 9x's major shortcomings.
On the other hand, Visual Basic uses Unicode internally to represent strings (whether running on Windows NT or Windows 9x). The lack of full support for Unicode under Windows 9x has significant consequences, as we shall see in Chapter 6, Strings.
Parameters and Arguments
Throughout the book, I will try to make the proper distinction between a parameter and an argument. A parameter is a placeholder that is used in the declaration of a function; an argument is the actual object being passed to the function. Thus, the parameter appears in the declaration of the function, whereas the argument appears in the call to the function.
IN and OUT Parameters
Generally speaking, a parameter to a function can be used for one of two non-mutually exclusive purposes:
To send a value to the function
To return a value from the function
A parameter that is used to send a value to a function is called an IN parameter, and one that is used to return a value is called an OUT parameter. If a parameter functions in both capacities, it is referred to as an IN/OUT parameter. The words IN and OUT are occasionally used in the documentation, but this is rare.
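As a rough illustration of the three roles, consider the following C++ sketch. The functions divide and addTo are hypothetical examples written for this purpose; they are not part of the Win32 API.

```cpp
// dividend, divisor: IN parameters (values sent to the function)
// pRemainder:        OUT parameter (the function stores a result there)
// Returns the quotient; assumes divisor is not zero.
int divide(int dividend, int divisor, int *pRemainder)
{
    *pRemainder = dividend % divisor;
    return dividend / divisor;
}

// pTotal: IN/OUT parameter (read on entry, updated on exit)
void addTo(int *pTotal, int amount)
{
    *pTotal = *pTotal + amount;
}
```

Calling divide(17, 5, &r) returns 3 and leaves 2 in r. Note that an OUT or IN/OUT parameter must be passed as an address, since the function needs to know where to put the returned value.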
ByVal and ByRef
As you no doubt know, the difference between parameter passing by value and by reference can be summarized as follows:
Using ByVal asks VB to pass the value of the argument, whereas using ByRef asks VB to pass the address of (that is, a pointer to) the argument.

It is particularly important to have a clear understanding of the differences between passing by value and passing by reference, because when dealing with API function calls, a slip here will generally mean a GPF.
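The same distinction can be seen from the C++ side, which is where API functions live. The following is a minimal sketch with hypothetical function names: the first function receives a copy of the argument, as with ByVal, and the second receives its address, as with ByRef.

```cpp
// Receives a copy of the caller's value (the analog of ByVal).
void incrementByValue(int n)
{
    n = n + 1;      // changes only the local copy
}

// Receives the address of the caller's variable (the analog of ByRef).
void incrementByAddress(int *pn)
{
    *pn = *pn + 1;  // changes the caller's variable through the pointer
}
```

If x is 5, then incrementByValue(x) leaves x unchanged, whereas incrementByAddress(&x) changes x to 6. A function that expects an address but receives a value will treat that value as an address, which is how a slip of this kind leads to a GPF.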
Dynamic Link Libraries
The Windows API functions are implemented in dynamic link libraries (DLLs), so it might be a good idea to briefly discuss DLLs and how they work. We will have much more to say about DLLs later in the book.
A dynamic link library, or DLL for short, is an executable file that contains one or more exportable functions, that is, functions that can be called from another executable (EXE or DLL). In many ways, DLLs are much simpler than EXE files, since they do not contain code to manage a graphical interface or to process Windows messages, for instance.
To clarify the terminology, an executable file, also called an image file, is a file that adheres to the Portable Executable (PE) file format specification. These files include EXE and DLL files, as well as OCX, DRV, and other files. To confuse matters, the term executable file is often used to refer to EXE files only. (We will devote an entire chapter to the PE file format.)
Unfortunately, Visual Basic cannot create traditional DLLs. It is capable of creating a very useful form of DLL called an ActiveX server. However, these DLLs do not export functions in the traditional manner. Rather, they export Automation objects, along with their properties and methods. For this reason, they are also called Automation servers. We will speak no more about these special types of DLLs.
To say that DLLs are ubiquitous under the Windows operating system seems like an incredible understatement. For instance, the Windows NT system that I am using to write this book currently has no fewer than 1,029 distinct DLLs on the hard disk, taking about 93 megabytes of disk space!
Windows uses several DLLs to house the Win32 API functions. In fact, most of the more than 2,000 functions in the Win32 API are contained in three DLLs:
KERNEL32.DLL
Exports about 700 functions that manage memory, processes, and threads
USER32.DLL
Exports about 600 functions that control the user interface, such as creating windows and sending messages
GDI32.DLL
Exports about 400 functions for drawing graphical images, displaying text, and manipulating fonts
In addition to these DLLs, Windows also includes several other DLLs for more specialized tasks. Following are some examples.

COMDLG32.DLL
Exports about 20 functions for controlling the Windows common dialogs
LZ32.DLL
Exports a dozen functions for file compression and expansion
ADVAPI32.DLL
Exports about 400 functions for object security and registry manipulation
WINMM.DLL
Exports about 200 multimedia-related functions
Export Tables
Needless to say, a DLL is useless if we cannot determine what functions it exports. (It also helps to have some good documentation!) Each DLL contains a table of names of the functions that the DLL exports. This is called the export table of the DLL. (Some functions are exported from a DLL by position only, but we won't worry about that.)
Also, each DLL has a table that lists the external functions that are called by the DLL. This is called the import table of the DLL.
It may surprise you to learn that it is not easy to view the export or import tables of a DLL. It is almost as though there were a conspiracy to hide this information. In particular, Visual Basic has no tools to view these tables. Visual C++ comes with a program called DUMPBIN.EXE that can be used for this purpose, and sometimes the Windows QuickView utility will show these tables for a DLL.
You might argue that there is no point in knowing just the names of a DLL's exportable functions, because without information about the functions' parameters and return values (that is, how to use the functions), the functions are not very useful. Nevertheless, there are times when the "documentation" forgets to report which of several DLLs exports a given function! In these cases, we can benefit by searching export tables. Also, sometimes the import table of a DLL will give us an idea as to how a certain feature of the DLL is implemented.
In any case, one of the main applications that we will discuss in this book is a PE file information utility, whose main window is shown in Figure 2-1. (The application is included on the CD that accompanies this book.) Writing this application will give us some valuable experience dealing with the Win32 API, and the application may come in handy from time to time as well.
The Role of DLLs: Dynamic Linking
Windows applications, whether written in Visual Basic or Visual C++, are extremely complex, far too complex to be completely self-contained. Indeed, a Windows application requires a great many external functions, including Win32 API functions and various VB or C runtime functions that are generally contained in precompiled code modules or code libraries of some form.
Figure 2-1. PE file information utility
Generally speaking, there are two ways to incorporate external code into an application. The simplest method is to incorporate the external code directly into the application's executable at creation time (that is, at link time). This is referred to as static linking.
Static linking has both advantages and disadvantages. Among its advantages is simplicity, for it does produce self-contained applications. The problem with this is that a Windows application that contained all code necessary to be entirely self-contained would be prohibitively large.
Static linking also resolves versioning problems, since the executable carries with it its own version of all necessary code. But that also has a downside. If an error is found in an external code module, the executable will need to be relinked with the corrected library. The main problem, however, is that static linking promotes duplication of code. The same code library may end up in several dozen different applications on the same machine.

An alternative to static linking is dynamic linking. In this case, a single external code module can service multiple applications. Simply put, an application is linked with references to external functions in external dynamic link libraries. As we will see, a single physical copy of a DLL can be mapped into the address space of several applications at the same time. In this way, there is only one copy of the DLL in physical memory, and yet each application thinks it has its own copy of the DLL in its own memory space. Thus, calling the DLL's functions is just as efficient as calling code within the application itself. Indeed, in a very real sense, the DLL becomes part of the application.
We will elaborate on these issues in Chapter 13, Windows Memory Architecture, so don't worry too much about it now.
Some C++ Syntax
As mentioned earlier, we will have many occasions to look at the VC++ declarations of API functions. For this and other reasons, we will need to be familiar with a little bit of C++ language syntax. (We will not be concerned, however, with the object-oriented aspects of C++.)
The Basics
Here are some basic facts about C++ language syntax:
C++ uses the double slash (//) to signal that the rest of the line is a comment. This is the analog of Visual Basic's apostrophe.
Extra whitespace (spaces and carriage returns) is ignored in C++. In particular, no line continuation character is required. For example, the function declaration:
VOID CopyMemory(PVOID Destination, CONST VOID *Source, DWORD Length);
is equivalent to the more readable:
VOID CopyMemory(
  PVOID Destination,   // pointer to address of copy destination
  CONST VOID *Source,  // pointer to address of block to copy
  DWORD Length         // size, in bytes, of block to copy
);
This formatting also allows us to add comments to each parameter declaration, a feature that would be very useful in documenting VB declarations.
Almost all lines of C++ code end with a semicolon. Curly braces ({}) are used to enclose multiline blocks of code.
C++ is a case-sensitive language. This also applies to all of the Win32 API function names!
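The following short sketch pulls these facts together. The function names are hypothetical and were chosen only to show that names differing in case are distinct.

```cpp
// Two distinct functions: C++ treats total and Total as different names.
int total(int a, int b) { return a + b; }
int Total(int a, int b) { return a * b; }

// Whitespace and line breaks are ignored, so a declaration may span
// several lines without a continuation character.
int addThree(
    int a,   // first addend
    int b,   // second addend
    int c    // third addend
)
{   // curly braces enclose the multiline function body
    int partial = total(a, b);
    return partial
           + c;          // one statement, two lines, one semicolon
}
```

Here total(2, 3) is 5 but Total(2, 3) is 6, a difference that Visual Basic, which is not case-sensitive, could not express.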

Declaring Variables
A variable that is declared in Visual Basic in the form:
Dim VarName as VarType
is declared in C++ using the more concise syntax:
VarType VarName;
For instance:
Dim x as Long       ' declare a long
becomes:
int x;           // declare an integer
(In 32-bit VC++, an int is 4 bytes in size, the same as a VB Long.)
Declaring Arrays
To declare an array, such as an integer array with 100 elements, we write:
int iArr[100];
Note, however, that the index for this array ranges from 0 to 99, so that (unlike in VB) the value iArr[100] is invalid. Note also that VC++ uses square brackets rather than parentheses for array indices.
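As a small sketch of zero-based indexing, the following hypothetical function fills a 100-element array and adds up its elements. Both loops run from index 0 through index 99 and never touch iArr[100].

```cpp
// Sum the elements of a 100-element array; valid indices are 0 through 99.
int sumFirstHundred()
{
    int iArr[100];
    for (int i = 0; i < 100; i++) {
        iArr[i] = i + 1;        // iArr[0] = 1, ..., iArr[99] = 100
    }
    int sum = 0;
    for (int i = 0; i < 100; i++) {
        sum += iArr[i];
    }
    return sum;
}
```

The function returns 5050, the sum of the integers from 1 to 100.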
Declaring Functions
A function that is declared in Visual Basic in the form:
Function FName(Para1 as Type1, Para2 as Type2, ...) as ReturnType
is declared in C++ using the more concise syntax:
ReturnType FName(Type1 Para1, Type2 Para2, ...);
For example:
Function Sum(x as Long, y as Long) As Long
becomes:
int Sum(int x, int y);
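For completeness, here is a sketch of how the declaration of Sum relates to its definition. The body shown is the obvious one and is included only for illustration.

```cpp
int Sum(int x, int y);      // declaration: ends with a semicolon

int Sum(int x, int y)       // definition: the body replaces the semicolon
{
    return x + y;
}
```

The Win32 documentation generally shows only the declaration form, since the definitions are compiled into the DLLs.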
Pointers
Simply put, a pointer is a memory address. Under Win32, all memory addresses are 32 bits long. A pointer variable (often just called a pointer) is a variable of type pointer, that is, a variable that the compiler (VB or VC++) interprets as holding an address. Figure 2-2 shows a pointer variable.

Figure 2-2. A pointer
In this figure, Var is a variable of some type or other (integer, long, character, whatever). Its contents are yy yy and its address is bbbb. The variable pVar is a pointer variable. It contains the address of Var and so, like all pointer variables, has length 32 bits. We say that pVar points to Var and that Var is the target of the pointer. If Var has type Integer, for example, we would say that pVar is an integer pointer.
Pointers and pointer variables are extremely powerful objects, and Win32 makes extensive use of them. Accordingly, VC++ fully supports pointers and pointer operations. An example of the influence of pointers on the VC++ language can be seen in pointer arithmetic, which can seem quite strange to the uninitiated.
The following code (whose syntax we will discuss in more detail later) declares an integer pointer pi and then prints the value of both pi and pi+1:
int i = 1;     // declare integer and initialize
int *pi;       // declare pointer to integer
pi = &i;       // set pointer to point to integer (& is VC++ AddressOf operator)
cout << pi << " / " << pi+1;      // print value of pi and pi+1
Surprisingly, the output is:
0x0012FF78 / 0x0012FF7C
(0x is the prefix that VC++ uses to indicate hex.) Note that pi + 1 is actually 4 larger than pi!
The reason for this is that the VC++ compiler understands both that pi points to (holds the address of) a 4-byte integer and that the only reason to add 1 to a pointer is to point to the next item (in this case, 4-byte integer) in memory, whose address is 4 greater than the address of the current integer. Hence, it adds 4 (not 1) to the value of the pointer! You can probably see that, while this may seem strange at first, it can be very useful, because it allows us to simply add 1 to any pointer to point to the next item of whatever the target type is: integer, long, character, etc.
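We can check this behavior for any target type with a short sketch. The template function stepInBytes is a hypothetical helper that measures, in bytes, the distance between a pointer and that pointer plus 1.

```cpp
#include <cstddef>

// Distance in bytes between p and p + 1 for a pointer to type T.
template <typename T>
std::size_t stepInBytes(T *p)
{
    const char *here = reinterpret_cast<const char *>(p);
    const char *next = reinterpret_cast<const char *>(p + 1);
    return static_cast<std::size_t>(next - here);
}
```

For a pointer to an int the result is sizeof(int), for a pointer to a char it is 1, and for a pointer to a double it is sizeof(double). In each case, adding 1 moves the pointer by exactly one item of the target type.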
Unlike VC++, Visual Basic makes every effort to hide pointers from the programmer. The reason is that programming with pointers can be dangerous in the sense that the accidental (or deliberate) misuse of a pointer can easily result in an attempt to access protected memory. In fact, as we will see in Chapter 13, the Windows operating system deliberately "wastes" some memory address space in order to safely detect the accidental use of the null pointer, that is, a pointer to the 0 address. For instance, it is a relatively common mistake to forget to initialize a pointer variable, since this requires defining a target variable, as we did in the previous code. If we forget to initialize a pointer, the result may be a null pointer (or, for a local variable, a pointer to an arbitrary address).
In any case, since Win32 uses pointers constantly, we will need to deal with pointers in VB. As it happens, there are some tricks that make this possible. We will be able to do just about anything necessary except call a function using a pointer to that function (which is very easy to do in VC++).
Declaring a pointer in VC++ is easy. It is done using the syntax:
targetdatatype *pointervariable;
(A space before or after the * is optional, but at least one of them should be included for clarity.) For instance, to declare a pointer to an integer, we write:
int *pi;
It is possible to declare a pointer variable without knowing the target data type. This is done by writing:
void *pWhatever;
Incidentally, as we will see, VC++ is replete with synonyms for data types. Thus, you may also see this declaration in the equivalent form:
LPVOID pWhatever;
where LPVOID stands for long pointer to a void.
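A void pointer cannot be used to get at its target directly, because the compiler does not know the target's size or type. The following sketch, with a hypothetical function name, shows the usual remedy, which is to convert the pointer back to a typed pointer first.

```cpp
// Read an int through an untyped pointer. The caller must guarantee
// that pWhatever really points to an int.
int readIntThroughVoid(void *pWhatever)
{
    int *pi = static_cast<int *>(pWhatever);  // restore the target type
    return *pi;
}
```

This is the price of the flexibility: the compiler can no longer check that the pointer matches the data, so the responsibility falls on the programmer.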
By Indirections, Find Directions Out
The asterisk (*) is known as the indirection operator. It is used both to define a pointer variable (as above) and to get at the value pointed to by a pointer variable, that is, the value of the target variable. For instance, the following code:
int i;      // declare integer
int *pi;    // declare pointer
pi = &i;    // set pointer to address of integer
i = 5;      // place value in integer variable

// output to console
cout << "Pointer: " << pi << " Target: " << *pi;
produces the following output to a DOS window:
Pointer: 0x0012FF78 Target: 5
Note that in the last line, the expression *pi means "that which is pointed to by the pointer pi" or more succinctly, "the target of pi."

Note also that the & operator, called the address-of operator, returns the address of its operand. Thus:
&var
is the address of the variable var. This operator is very useful for filling in pointer variables.
Pointers in Visual Basic
The indirection and address-of operators are the key to using pointers in VC++. As it happens, the address-of operator has an undocumented counterpart in Visual Basic and the indirection operator can be faked without too much trouble through the use of the API function CopyMemory.
Visual Basic has an undocumented function called VarPtr, left over from the days of QuickBasic. The QuickBasic documentation says that VarPtr returns the offset of a variable and VarSeg returns the segment. The days of segmented addressing are mercifully gone, but VarPtr seems to have survived. It now simply returns the address of a variable, that is:
VarPtr(var)
is the address of the variable var. If you are not comfortable using an undocumented VB function, you can also use the function rpiVarPtr from the rpiAPI.dll library on the CD that accompanies this book. The expression:
rpiVarPtr(var)
is equivalent to VarPtr(var). However, for reasons we will discuss in Chapter 6, rpiVarPtr does not work on string variables. (This has to do with the translation from Unicode to ANSI that is performed automatically by VB when it calls an external function with a string parameter.)
Incidentally, you might be interested in the C code for this function:
int WINAPI rpiVarPtr(int pVar)
{
   return pVar;
}
This function just returns its argument. By declaring this function in VB as:
Public Declare Function rpiVarPtr Lib "rpiAPI.dll" ( _
   ByRef pVar As Any _
) As Long
the argument is passed to the function by reference. That is, VB passes the address of the argument to this function, which is unceremoniously returned as a VB long.
Note that using As Any in the declaration of rpiVarPtr prevents VB from checking the data type of the parameter, so we can pass it a variable of any type. This means we don't need to have rpiVarPtrByte, rpiVarPtrInteger, and rpiVarPtrLong functions, for instance. (Again, rpiVarPtr does not work on strings.)
To do indirection, we need some way to retrieve the target of a pointer. The simplest way to do this is to use the CopyMemory API function, which copies bytes of memory from one address to another address. We will discuss the details of how to use CopyMemory in general and how to do indirection in particular in the next chapter. There are also some functions to do indirection in the rpiAPI.dll library that accompanies this book.
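The idea behind this kind of faked indirection can be sketched in C++ without any Windows-specific code. In the sketch below, std::memcpy stands in for CopyMemory, and addressOf and targetOf are hypothetical names playing the roles of VarPtr and of the indirection operator, respectively. This is only an outline of the technique, not the code from rpiAPI.dll.

```cpp
#include <cstdint>
#include <cstring>

// The counterpart of VarPtr: return a variable's address as an integer.
std::uintptr_t addressOf(const void *pVar)
{
    return reinterpret_cast<std::uintptr_t>(pVar);
}

// Fake indirection: copy sizeof(int) bytes from the given address into
// a local variable. std::memcpy stands in for CopyMemory here.
int targetOf(std::uintptr_t address)
{
    int result = 0;
    std::memcpy(&result, reinterpret_cast<const void *>(address), sizeof result);
    return result;
}
```

If i holds the value 5, then targetOf(addressOf(&i)) is also 5. The address travels as an ordinary integer, just as it does in a VB Long, and the target is recovered by copying bytes rather than by using an indirection operator.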



WIN32 API Programming with Visual Basic
ISBN: 1565926315
Year: 1999
Pages: 31
Authors: Steven Roman
