Calling Unmanaged Code with PInvoke | Advanced .NET Programming

Calling Unmanaged Code with P/Invoke

We are now going to examine how you can call unmanaged code using the platform invocation (P/Invoke) mechanism. The CLR supports P/Invoke natively, so there is little new to learn in terms of concepts: calling unmanaged code through P/Invoke looks much the same in IL as it does in high-level languages, apart from the obvious syntactical differences. We will, however, spend a little time looking at what actually happens under the hood.

We will illustrate P/Invoke by developing a small sample called PInvoke, which uses the platform invocation mechanism to display a message box using the Windows API MessageBox() function. In real life, of course, you wouldn't use P/Invoke to do this, because you can more easily use System.Windows.Forms.MessageBox. However, this sample is good for illustrating the principles, which you can then apply to calling other API functions that have no managed equivalents.

The native MessageBox() function has the following C/C++ signature:

 int MessageBox(
    HWND hWnd,           // Handle to owner window
    LPCTSTR lpText,      // Text in message box
    LPCTSTR lpCaption,   // Message box title
    UINT uType           // Message box style
 );

The first parameter (hWnd) is a Windows handle, which indicates any parent window of the message box. The second and third parameters are pointers to C-style unmanaged strings. The final parameter, uType, is a 32-bit unsigned integer (= unsigned int32 in IL) which indicates the type of message box required and the buttons on it. For example, a value of zero here indicates the message box should just have an OK button. A value of 1 (which we will use) indicates it should have OK and Cancel buttons. The return type is a 32-bit integer that indicates which button the user pressed to quit the message box.

So that's what a message box looks like to unmanaged code. This is how we define a managed wrapper method for it:

 .method public static pinvokeimpl("user32.dll" winapi) int32 MessageBox(
    native int hWnd, string text, string caption, unsigned int32 type)
 {
 }

This IL corresponds to this C# code:

 [DllImport("user32.dll")]
 extern static int MessageBox(IntPtr hWnd, string text, string caption,
                              uint type);

In this code I've replaced the names of the parameters in the native MessageBox() method with .NET-style names, and the native types with suitable corresponding managed types (for example, LPCTSTR with string). Notice too that all imported functions must be declared static.

The important new keyword here is pinvokeimpl. This keyword indicates that we are not going to supply an implementation for this method, but are instead asking the .NET runtime to track down the native implementation in the indicated DLL and to wrap suitable marshaling/data conversion code around it. C# has no keyword equivalent to IL's pinvokeimpl; instead, you apply the DllImportAttribute attribute to the method declaration. How this gets converted into pinvokeimpl when C# code is compiled is something we'll examine later in this chapter when we cover attributes.

The pinvokeimpl keyword must be followed by parentheses in which we supply the filename of the DLL that implements the function. We also need to indicate the calling convention of this function, in this case winapi.

If you're not familiar with calling conventions, don't worry too much. Calling conventions are rules governing the precise details of how parameters are passed to methods in memory, and whether the caller or the callee is responsible for cleaning up any memory allocated for the parameters. For purely managed code, that is all handled by the CLR, so you don't have to worry about it. In native code, the calling convention will still normally be handled by compilers, so developers don't need to worry about the details. However, whereas the CLR has a single way of doing things, native code on Windows uses, for historical reasons, several different calling conventions, such as cdecl and stdcall, so if we are going to call a native method from managed code, we need to tell the CLR which calling convention to use. In almost all cases when using P/Invoke for Windows API functions this will be winapi, which selects the platform's default convention (stdcall on Windows). If a particular function takes a different calling convention, this will be indicated in the documentation.
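To make the calling-convention choice concrete, here is a minimal C# sketch, assuming a Windows machine. The class names NativeMethods and CallingConventionDemo are invented for illustration, and puts() from the C runtime library msvcrt.dll is an assumed example of a function that uses cdecl rather than the Windows API convention.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical holder class for the imports; the name is illustrative only.
static class NativeMethods
{
    // Windows API function: Winapi selects the platform default convention.
    // It is also what DllImport assumes when CallingConvention is omitted.
    [DllImport("user32.dll", CallingConvention = CallingConvention.Winapi)]
    public static extern int MessageBox(IntPtr hWnd, string text,
                                        string caption, uint type);

    // C runtime function (assumed example): the C runtime uses cdecl,
    // so the declaration has to say so explicitly.
    [DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl)]
    public static extern int puts(string s);
}

static class CallingConventionDemo
{
    static void Main()
    {
        // Windows only: the DLL is located and loaded on the first call.
        NativeMethods.puts("Hello from the C runtime");
    }
}
```

Declaring the wrong convention typically does not show up at compile time; the mismatch only surfaces when the call is made, which is why it pays to check the documentation for each function.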

Native methods don't have metadata, which means there is no way for the CLR to obtain any information on what data a given method is expecting. The only information the CLR can extract from the unmanaged DLL is what address the method is located at. So instead, you the developer have to look up the parameter types and calling convention in the documentation, and then tell the CLR what to expect by supplying an appropriate list of arguments in the declaration of the pinvokeimpl method. The arguments you indicate are of course managed types, but the CLR has its own list of rules for how it converts managed types to unmanaged types. It's up to you to choose a managed type that will be converted into the correct unmanaged type for the method. In the above example, I picked native int (System.IntPtr) as the first parameter to MessageBox() because I know that native int is the type that will be correctly marshaled to the native type, HWND. The full list of conversions is documented in MSDN.
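As a rough illustration of this look-it-up-and-declare process, the following C# sketch wraps a different API function, GetCurrentProcessId(), whose documented native signature is DWORD GetCurrentProcessId(void). The class name TypeMappingDemo is hypothetical, and the sketch assumes it is run on Windows.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

static class TypeMappingDemo
{
    // Native signature, taken from the Windows API documentation:
    //   DWORD GetCurrentProcessId(void);
    // DWORD is a 32-bit unsigned integer, so uint is the managed type
    // that marshals to it without any conversion.
    [DllImport("kernel32.dll")]
    static extern uint GetCurrentProcessId();

    static void Main()
    {
        uint nativeId = GetCurrentProcessId();            // via P/Invoke
        int managedId = Process.GetCurrentProcess().Id;   // via the class library

        // Both values identify the same process, so they should agree.
        Console.WriteLine("P/Invoke: {0}, managed: {1}", nativeId, managedId);
    }
}
```

Nothing checks the declaration against the DLL: if the managed signature does not match what the native function expects, the CLR has no way of noticing, so the responsibility for getting the types right is entirely yours.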

The CLR will convert System.String instances to C-style strings and figure out the appropriate pointer to pass to the native code. Numeric items such as int32 will be passed without conversion. User-defined structs will be marshaled by breaking up into the individual fields and marshaling each field separately. The main work involved in the marshaling process (other than converting strings) is to make sure that the fields are laid out in memory in the way expected by the native function. And the real benefit to you is that you can use .NET types.
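The field-by-field layout can be observed directly with the Marshal class. The sketch below is only an illustration: the Point struct is a hypothetical type (it mirrors the shape of the Windows POINT structure), and instead of calling a native function it marshals the struct into a block of unmanaged memory and reads the raw fields back.

```csharp
using System;
using System.Runtime.InteropServices;

// Sequential layout keeps the fields in declaration order,
// which is what a native function expecting this struct would rely on.
[StructLayout(LayoutKind.Sequential)]
struct Point
{
    public int X;
    public int Y;
}

static class StructMarshalingDemo
{
    static void Main()
    {
        Point p = new Point();
        p.X = 10;
        p.Y = 20;

        // Size of the struct as seen by unmanaged code: two 32-bit fields.
        int size = Marshal.SizeOf(typeof(Point));
        IntPtr buffer = Marshal.AllocHGlobal(size);
        try
        {
            // Copy the managed struct into unmanaged memory, field by field.
            Marshal.StructureToPtr(p, buffer, false);

            // Read the raw fields back at their unmanaged offsets.
            int x = Marshal.ReadInt32(buffer, 0);
            int y = Marshal.ReadInt32(buffer, 4);
            Console.WriteLine("size={0}, x={1}, y={2}", size, x, y);
            // prints "size=8, x=10, y=20"
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```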

There is one extra task that the CLR will do: there are a number of API functions that come in two versions - an ANSI version and a Unicode version. MessageBox() is one of those. User32.dll actually contains two functions, MessageBoxA() and MessageBoxW() (W stands for "wide" and indicates the Unicode version). P/Invoke can identify the correct version to be called. It's possible to specify the version in the pinvokeimpl declaration, but if this information is missing (as is the case for our sample), the CLR will instead work using whichever marshaling flag was applied to the definition of the type in which the pinvokeimpl method was defined. That's just the ansi, unicode, or autochar flag that we described in the last chapter.
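In C#, the same choice is expressed through the CharSet field of DllImportAttribute rather than through an IL flag. The following sketch, which assumes Windows and uses the invented names CharSetDemo and MessageBoxUnicode, shows both approaches: letting the runtime pick the ANSI entry point, and naming the Unicode entry point explicitly.

```csharp
using System;
using System.Runtime.InteropServices;

static class CharSetDemo
{
    // CharSet.Ansi is also the C# default: strings are marshaled as ANSI
    // and the runtime resolves the call to MessageBoxA().
    [DllImport("user32.dll", CharSet = CharSet.Ansi)]
    static extern int MessageBox(IntPtr hWnd, string text,
                                 string caption, uint type);

    // Naming the Unicode export explicitly. ExactSpelling stops the
    // runtime from searching for A/W suffixed variants of the name.
    [DllImport("user32.dll", EntryPoint = "MessageBoxW",
               CharSet = CharSet.Unicode, ExactSpelling = true)]
    static extern int MessageBoxUnicode(IntPtr hWnd, string text,
                                        string caption, uint type);

    static void Main()
    {
        MessageBox(IntPtr.Zero, "Marshaled as ANSI", "MessageBoxA", 0);
        MessageBoxUnicode(IntPtr.Zero, "Marshaled as Unicode", "MessageBoxW", 0);
    }
}
```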

Let's now have a look at the code for the PInvoke sample (as usual for clarity I haven't shown the .assembly directives and so on):

 .method public static pinvokeimpl("user32.dll" winapi) int32 MessageBox(
    native int hWnd, string text, string caption, int32 type)
 {
 }

 .namespace Wrox.AdvDotNet.PInvokeDemo
 {
    .class public auto ansi EntryPoint extends [mscorlib]System.Object
    {
       .method static void Main() cil managed
       {
          .maxstack 4
          .entrypoint
          ldc.i4.0
          ldstr    "Hello, World"
          ldstr    "Hello"
          ldc.i4.1
          call     int32 MessageBox(native int, string, string, int32)
          pop
          ret
       }
    }
 }

The EntryPoint class specifies the ansi flag, which means that string instances will be marshaled to ANSI strings. This is the option you will normally use if you know that your code is to run on Windows 9x or if for some reason the unmanaged functions you are calling specifically expect ANSI strings (as might be the case for some third-party components). In most cases, on later versions of Windows, you'll get better performance by specifying unicode or autochar, but I'll stick with ansi here so that the sample illustrates some real data conversion. This means that for our sample, the CLR will make sure that it is MessageBoxA() that is ultimately invoked.

I've defined the MessageBox() wrapper method outside of the namespace. You don't have to do this, but personally I think it can make things clearer for P/Invoke methods. In any case, namespaces apply only to types, and have no effect on global functions.

The code to invoke MessageBox() in the Main() method is relatively simple. We just load the arguments onto the stack in the order, and with the types, specified by our wrapper. Note that we're not interested in the return value from MessageBox() here, so we just pop this value off the stack before returning. This involves another new IL instruction, pop, which simply removes the top value from the stack and discards it. We have to use pop here to get rid of the return value from the MessageBox() call, because the stack must be empty when returning from a void method.
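For comparison, here is a sketch of what a C# version of the same sample might look like, assuming Windows. C# has no global functions, so the wrapper has to live inside the class, and the OkCancel constant is a name invented here for readability.

```csharp
using System;
using System.Runtime.InteropServices;

namespace Wrox.AdvDotNet.PInvokeDemo
{
    class EntryPoint
    {
        // No CharSet is given, so strings are marshaled as ANSI by default,
        // matching the ansi flag used in the IL sample.
        [DllImport("user32.dll")]
        static extern int MessageBox(IntPtr hWnd, string text,
                                     string caption, uint type);

        // Hypothetical constant: a type value of 1 asks for
        // OK and Cancel buttons.
        const uint OkCancel = 1;

        static void Main()
        {
            // The return value is ignored, so the C# compiler emits a pop
            // after the call, just as we wrote by hand in the IL version.
            MessageBox(IntPtr.Zero, "Hello, World", "Hello", OkCancel);
        }
    }
}
```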

Running the sample from the command line gives this result:

[Screenshot: a message box captioned "Hello" containing the text "Hello, World", with OK and Cancel buttons]

One interesting point about the above code is that it actually passes type-safety verification, despite its use of a P/Invoke method. Running peverify on the compiled sample gives:

 Microsoft (R) .NET Framework PE Verifier Version 1.0.3705.0
 Copyright (C) Microsoft Corporation 1998-2001. All rights reserved.

 All Classes and Methods in pinvokedemo.exe Verified

The reason for this is that type safety only measures what is happening in the managed code, so calling into unmanaged code using P/Invoke doesn't formally affect verifiability. However, in order to call unmanaged code, code must have the UnmanagedCode security permission (SecurityPermissionFlag.UnmanagedCode), which will obviously only be granted to trusted code - so in practice this does not cause a security loophole.