Working with Unmanaged Code | Professional .NET Framework 2.0 (Programmer to Programmer)

There is plenty of non-COM code out there written in C++ that you might need to interoperate with, not the least important of which is the Win32 platform. Furthermore, you might have a number of old C++ applications and libraries that you wish to extend (without rewriting) or utilize from your new managed applications. You have a few options:

C++/CLI is a much enhanced language and set of tools when compared to v1.0's Managed C++ technology. As with MC++, you can use the so-called "It Just Works" (IJW) techniques. This permits you to recompile your old, unmanaged C++ with the new compilers, exposing the functions to managed code or even mixing managed code in with it.
P/Invoke is a managed-to-unmanaged bridge through which you may make native function calls. The CLR handles all of the data marshaling and function calling conventions for you.

C++/CLI is a huge topic in and of itself. And yet it's very easy for those familiar with the C++ language to ramp up and become productive using the tools, especially if you have Visual Studio. Refer to the "Further Reading" section for some additional resources. The remainder of this section will focus on an overview of P/Invoke.

Platform Invoke (P/Invoke)

The Platform Invoke (P/Invoke) technology is built right into the runtime to enable managed programs to invoke ordinary dynamically linked unmanaged code. It's the logical equivalent of linking against a DLL in C++ for routines exported with __declspec(dllexport). The result of linking against an ordinary DLL in the Microsoft C++ compiler is an executable that contains small proxy stubs which, when invoked, redirect to the actual code at runtime. P/Invoke is very similar, except that the CLR is responsible for loading, binding, and making necessary transformations between data types as a function is called. As is the case with pure unmanaged code, the OS will share code with multiple processes accessing that DLL simultaneously.

Declaring P/Invoke Signatures

To import an exported symbol from a DLL for use from managed code, you must declare a static method with the extern keyword in C# (pinvokeimpl in IL) and annotate it with the DllImportAttribute found in the System.Runtime.InteropServices namespace. C++ users likewise use the DllImportAttribute, and must additionally mark the method signature as extern "C". VB users simply use the Declare keyword, and the DllImportAttribute is added automatically for them.

When a P/Invoke method is called by managed code, the CLR will resolve the DLL specified, load it if it's not already in process, and route the function call through the P/Invoke marshaling layer to transform arguments according to its default marshaling rules and the options set on the import attribute, invoke the DLL code using the correct calling convention, and lastly perform any output or return value marshaling.

For example, kernel32.dll exports a function GetDiskFreeSpaceEx, which reports the amount of free disk space remaining on a drive. It accepts a single input parameter to specify the drive and fills in three output parameters containing the free space information. To call it, we must first declare the P/Invoke signature (in C#):

 [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Auto)]
 static extern bool GetDiskFreeSpaceEx(string lpDirectoryName,
     out ulong lpFreeBytesAvailable,
     out ulong lpTotalNumberOfBytes,
     out ulong lpTotalNumberOfFreeBytes);

Notice that we first reference the DLL in which the function is defined, along with a couple other properties. We'll look at the properties and what they mean in just a moment. Then we define the signature with the appropriate data types, ensuring that it is static and marked with the extern keyword. In IL, this is represented using the pinvokeimpl declaration:

 .method private hidebysig static
         pinvokeimpl("kernel32.dll" autochar lasterr winapi)
         bool  GetDiskFreeSpaceEx(string lpDirectoryName,
                                 [out] uint64& lpFreeBytesAvailable,
                                 [out] uint64& lpTotalNumberOfBytes,
                                 [out] uint64& lpTotalNumberOfFreeBytes)
         cil managed preservesig { }

It's common practice to create static classes that contain a logical grouping of the P/Invoke functions you intend to use. This helps to avoid duplication, for example where you've defined multiple GetDiskFreeSpaceEx P/Invoke declarations inside a single assembly. It also makes it obvious in managed code when you're calling a Win32 function, assuming that you've named the class something obvious like Win32NativeFunctions. The .NET Framework generally follows this pattern.
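The grouping pattern described above might look like the following minimal sketch. The class name Win32NativeFunctions is the one suggested in the text; the DiskReport wrapper and its GetFreeBytes method are hypothetical names introduced only for illustration, and the code assumes it runs on Windows, where kernel32.dll is available.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

// One static class holds every P/Invoke declaration the assembly needs,
// so each native function is declared exactly once.
internal static class Win32NativeFunctions
{
    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Auto)]
    internal static extern bool GetDiskFreeSpaceEx(
        string lpDirectoryName,
        out ulong lpFreeBytesAvailable,
        out ulong lpTotalNumberOfBytes,
        out ulong lpTotalNumberOfFreeBytes);

    [DllImport("kernel32.dll")]
    internal static extern uint GetTickCount();
}

// Hypothetical caller: the class name at the call site makes it obvious
// that a Win32 function is being invoked.
internal static class DiskReport
{
    internal static ulong GetFreeBytes(string directory)
    {
        ulong free, total, totalFree;
        if (!Win32NativeFunctions.GetDiskFreeSpaceEx(
                directory, out free, out total, out totalFree))
        {
            // Translate the cached Win32 error code into an exception.
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }
        return free;
    }
}
```

Keeping the declarations internal and exposing only managed wrappers such as GetFreeBytes is one reasonable design choice; it confines the interop details to a single place.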

The function can then be invoked just like any other static method. The runtime performs all of the necessary translations for you:

 ulong freeBytesAvail;
 ulong totalNumOfBytes;
 ulong totalNumOfFreeBytes;
 if (!GetDiskFreeSpaceEx(@"C:", out freeBytesAvail,
     out totalNumOfBytes, out totalNumOfFreeBytes))
 {
     // GetLastWin32Error returns a Win32 error code, not an HRESULT,
     // so use GetHRForLastWin32Error to obtain the equivalent HRESULT.
     Console.Error.WriteLine("Error occurred: {0}",
         Marshal.GetExceptionForHR(Marshal.GetHRForLastWin32Error()).Message);
 }
 else
 {
     Console.WriteLine("Free disk space:");
     Console.WriteLine("    Available bytes : {0}", freeBytesAvail);
     Console.WriteLine("    Total # of bytes: {0}", totalNumOfBytes);
     Console.WriteLine("    Total free bytes: {0}", totalNumOfFreeBytes);
 }

It turns out that you can get essentially the same information by using the DriveInfo class in the BCL. Care to guess how it gets this information? That's right: It P/Invokes to the GetDiskFreeSpaceEx function! While much of the Framework makes use of Win32 in this manner, there is actually a large portion of the underlying platform that you cannot access in managed code directly. In such cases, P/Invoking is the easiest way to get at those specialized platform functions.
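For comparison, here is a minimal sketch of the managed route through System.IO.DriveInfo. The DriveReport class and its Print method are hypothetical names; the three properties used are the ones that correspond to the three output parameters of GetDiskFreeSpaceEx.

```csharp
using System;
using System.IO;

internal static class DriveReport
{
    // Prints the same three figures as the P/Invoke example,
    // for every drive that is currently ready.
    internal static void Print()
    {
        foreach (DriveInfo drive in DriveInfo.GetDrives())
        {
            if (!drive.IsReady)
                continue; // e.g. an empty removable drive

            Console.WriteLine("Free disk space on {0}:", drive.Name);
            Console.WriteLine("    Available bytes : {0}", drive.AvailableFreeSpace);
            Console.WriteLine("    Total # of bytes: {0}", drive.TotalSize);
            Console.WriteLine("    Total free bytes: {0}", drive.TotalFreeSpace);
        }
    }
}
```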

Note

At the time of this writing, there is a web site, www.pinvoke.net, called the P/Invoke Wiki, which lists standard Win32 functions and their associated DllImport signatures. It also offers tools such as Visual Studio plug-ins that help to make interoperating with Win32 significantly easier. It even tells you when there's a managed code version of a specific Win32 API that you should consider using instead. Check it out!

Signature Options

We set a couple of interesting properties in the above P/Invoke declaration using the DllImportAttribute. In addition to the location of the function, some of the available options are:

BestFitMapping and ThrowOnUnmappableChar: Best fit mapping is a process whereby on non-Unicode platforms (Win9x and WinME) extended characters are converted to an approximate representation. This prevents all extended characters from getting converted to "?," which is the default for all non-ANSI characters. Sometimes this can result in a dangerous conversion, for example extended characters that map to path separator characters. Since this could cause a path to get past string-based security checks only to be altered by the marshaler after the fact, turning on ThrowOnUnmappableChar is advised. It will throw an exception in such dangerous cases. In general, however, it's advised to leave BestFitMapping off. It is known to cause subtle, hard-to-detect security exploits.
CharSet and ExactSpelling: These two properties work in conjunction with each other to determine the manner in which string arguments are marshaled and which function is selected at runtime. Win32 uses a common naming convention of ending the function name in A and W for ANSI and Unicode strings, respectively. If you specify ExactSpelling = false — the default in C++ and C# — the marshaler will select the ANSI or Unicode version based on the CharSet specified.

For example, if you specified Foo as the function name and CharSet = Unicode, P/Invoke would look for a function named FooW first; if that exists, it would bind to it and marshal the characters as Unicode; otherwise, it would search for Foo and, if it found a match, also marshal the string as Unicode. Similarly, if you specified Foo with CharSet = Ansi, P/Invoke would first look for a function Foo (notice the difference in search order); if it found that, it would bind to it and marshal strings as ANSI; otherwise, it would look for FooA and similarly marshal strings as ANSI. Lastly, if CharSet = Auto, it would select one of the above behaviors based on the platform. Except on Win9x, which deals in ANSI by default, Auto means Unicode.

If you erroneously marshal a string as ANSI when the function expects Unicode, for example, you're likely to end up with garbage output. Furthermore, using anything but Unicode could result in some nasty security bugs. Please refer to Chapter 8 on Internationalization for some character conversion issues to watch out for, such as Turkish Is and Cyrillic Es.
EntryPoint: Specifying the EntryPoint to refer to the DLL function you are mapping to enables you to name the managed function differently. For example, if you were P/Invoking to a kernel32.dll function called Foo, but wanted to refer to it as Bar, you could set the EntryPoint = "Foo" and name your static extern function Bar. The P/Invoke layer would bind to kernel32!Foo, yet you could refer to it as Bar throughout your managed program. This is just a convenient mechanism to make your use of APIs clearer.
SetLastError: If you ask the P/Invoke marshaler to set the last error field, it will capture the error code that results from making the function call (i.e., the value Win32's GetLastError would report) and store it in a cached field for later access. You can retrieve it by calling Marshal.GetLastWin32Error. This is necessary because a component in the CLR can itself cause a Win32 error to occur during the outbound marshaling, which would overwrite the real last error and make it hard to debug a failing function.

There are a few other properties available. Please refer to the SDK for details.
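The options described above can be combined in a single declaration. The following is a minimal sketch, assuming Windows; OptionExamples, MillisecondsSinceBoot, GetFileAttributesAnsi, and DeleteOrThrow are hypothetical managed names, while GetTickCount, DeleteFileW, and GetFileAttributesA are real kernel32.dll exports.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

internal static class OptionExamples
{
    // EntryPoint: bind to kernel32!GetTickCount but expose it under a
    // different managed name.
    [DllImport("kernel32.dll", EntryPoint = "GetTickCount")]
    internal static extern uint MillisecondsSinceBoot();

    // CharSet + ExactSpelling: bind to exactly DeleteFileW and marshal the
    // string as Unicode, with no A/W name probing.
    // SetLastError: cache the Win32 error code for GetLastWin32Error.
    [DllImport("kernel32.dll", EntryPoint = "DeleteFileW",
        CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = true)]
    internal static extern bool DeleteFile(string lpFileName);

    // BestFitMapping / ThrowOnUnmappableChar: if an ANSI entry point must be
    // used, refuse lossy conversions instead of silently approximating them.
    [DllImport("kernel32.dll", EntryPoint = "GetFileAttributesA",
        CharSet = CharSet.Ansi, ExactSpelling = true,
        BestFitMapping = false, ThrowOnUnmappableChar = true,
        SetLastError = true)]
    internal static extern uint GetFileAttributesAnsi(string lpFileName);

    // Surfaces the cached last error as a managed exception.
    internal static void DeleteOrThrow(string path)
    {
        if (!DeleteFile(path))
            throw new Win32Exception(Marshal.GetLastWin32Error());
    }
}
```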

Bridging Type Systems

When working with unmanaged code — whether it's COM or native libraries written in C++ — there is a type system gap that must be bridged. For example, a string to the .NET Framework is not the same thing as a string in C++; the closest thing in C++ to an object reference is a void*, or perhaps an IUnknown* pUnk in COM; and certainly custom CTS types will be a challenge to map across the boundary. Just about the only thing that remains the same is (unboxed) integers. Even longs are different on the CLR than in C++. Because of these sometimes subtle disconnects between representations, if you wish to share data across bits of managed and unmanaged code there is often a marshaling cost associated with it. Those types that map directly are called blittable types.

Marshaling performs the transformation to the bits such that data instances can be used on both sides of the fence. This might be a simple bit-for-bit copy from one data structure to another, but just as well might involve a complete reorganization of the contents of a data structure as the copy occurs. This translation adds overhead, so if performance is important to you — as is the case with many unmanaged scenarios — you should pay close attention and make an attempt to cut down on costs.

You can instead choose to share a pointer to a data structure across boundaries to enable the code to interpret bits manually, sometimes reducing the cost of marshaling substantially. As you saw earlier, the System.IntPtr type wraps a native-sized pointer and can be marshaled to unmanaged code for this purpose. We'll take a look at GCHandle later on, which can be used to ensure the GC doesn't move around managed objects while unmanaged code is actively using pointers to them.

The table below shows some of the common mappings between Windows, C++ native types and the CTS types, in addition to noting which types are blittable:

CLR	Blittable	Windows	Unmanaged C++

Boolean	No	BOOL	int
Byte	Yes	BYTE	unsigned char
Char	No	CHAR	char
Double	Yes	DOUBLE	double
Int16	Yes	SHORT	short
Int32	Yes	INT, LONG	int, long
IntPtr	Yes	HANDLE	void*
Single	Yes	FLOAT	float
String	No	LPCSTR, LPCWSTR	const char*, const wchar_t*
String (reference)	No	LPSTR, LPWSTR	char*, wchar_t*
UInt16	Yes	WORD	unsigned short
UInt32	Yes	DWORD, UINT	unsigned long, unsigned int
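A type is blittable only when its managed and marshaled representations are identical. One rough way to see the difference, sketched below, is to compare the managed size reported by sizeof with the default marshaled size reported by Marshal.SizeOf. The BlittableCheck class and its members are hypothetical names used only for this illustration.

```csharp
using System;
using System.Runtime.InteropServices;

internal static class BlittableCheck
{
    // Default marshaled size of a type, in bytes.
    internal static int MarshaledSize(Type type)
    {
        return Marshal.SizeOf(type);
    }

    internal static void Print()
    {
        // Int32 has the same layout on both sides: blittable.
        Console.WriteLine("Int32  : managed {0}, marshaled {1}",
            sizeof(int), MarshaledSize(typeof(int)));

        // Boolean is 1 byte in managed memory but marshals to a
        // 4-byte Win32 BOOL by default: not blittable.
        Console.WriteLine("Boolean: managed {0}, marshaled {1}",
            sizeof(bool), MarshaledSize(typeof(bool)));

        // Char is a 2-byte UTF-16 code unit in managed memory but marshals
        // to a 1-byte ANSI character by default: not blittable.
        Console.WriteLine("Char   : managed {0}, marshaled {1}",
            sizeof(char), MarshaledSize(typeof(char)));
    }
}
```

A size match alone does not prove that a type is blittable, but a mismatch such as the ones for Boolean and Char shows that the marshaler has conversion work to do.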

Saving on Marshaling Costs

Because marshaling can be expensive (it can add tens of native instructions per argument for simple native function calls), you want to eliminate as much superfluous marshaling cost as possible. In practice, this means cutting down on data being sent across a boundary. The actual technique you use to accomplish this can vary, depending on your scenario. Here are some generalized tips that you might consider:

Send only what you need. If you only use 2 arguments out of 10 most of the time, for example, offer an overload or version of your function that takes only 2 arguments. You can then pass the default values for the other 8 arguments explicitly on the other side of the boundary.
Use shared memory whenever possible, for example when performing in-process interoperation. If you are passing a pointer to a block of shared memory, the marshaling costs are extremely low. Usually a pointer maps identically from managed to unmanaged code. If the unmanaged code knows how to interpret a data structure or raw block of memory, it might be more efficient than trying to write a custom marshaler that blits data back and forth.
Cache data on the receiving side of the boundary. If you are making frequent function calls and supplying the same data over and over again, you might be paying the marshaling penalty more than you need. Consider sending the data once, caching it in the memory space on the other side of the boundary, and just reusing the same instance. You need a way to tell the unmanaged code to release the memory when you're done. Since most interoperation happens in-process, this has the negative consequence of storing the same data structure more than once. It's the age-old tradeoff between memory footprint and execution time.
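As one illustration of the "send only what you need" tip, a P/Invoke declaration can accept IntPtr for optional output parameters that the caller never reads, so the marshaler has nothing to copy back for them. The sketch below assumes Windows and relies on the fact that the output parameters of GetDiskFreeSpaceEx are optional and may be null; SlimDiskSpace and GetAvailableBytes are hypothetical names.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

internal static class SlimDiskSpace
{
    // Same native function as before, but the two outputs we do not need
    // are declared as raw pointers so that we can pass null (IntPtr.Zero).
    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
    private static extern bool GetDiskFreeSpaceEx(
        string lpDirectoryName,
        out ulong lpFreeBytesAvailable,
        IntPtr lpTotalNumberOfBytes,
        IntPtr lpTotalNumberOfFreeBytes);

    // Managed wrapper that asks only for the value it actually uses.
    internal static ulong GetAvailableBytes(string directory)
    {
        ulong available;
        if (!GetDiskFreeSpaceEx(directory, out available,
                IntPtr.Zero, IntPtr.Zero))
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }
        return available;
    }
}
```

The saving for two 64-bit integers is small; the same idea matters more when the skipped arguments are strings, arrays, or structures that would otherwise be converted on every call.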

While these are technology agnostic, the techniques to implement them are very similar.

Pinning Using GC Handles

Sometimes you might actually want to marshal across pointers to managed memory. You might want to do this for two primary reasons: (1) the data is mutable, and you'd like changes made by unmanaged code to be visible to the managed caller; or (2) the performance overhead of copying entire data structures while passing across the boundary is prohibitively expensive and dominates the computation. Enabling this assumes that the managed and unmanaged code both have access to some shared block of memory. There is one subtle problem, however: interacting with the GC.

When the GC runs, it will compact the heap to reduce fragmentation and improve performance. This process moves objects around in memory, updating any existing pointers to refer to the new location. Without updating the pointers, they would no longer refer to the correct object, and instead would point at the same location in memory that is now occupied by another object (or perhaps decommitted entirely). The GC takes care of this task transparently for managed references using its reference tracking algorithm, but once a pointer is sent off to unmanaged code land it no longer knows who is holding on to it. And furthermore, if managed code no longer references the object, the pointer it sent to unmanaged code might not be considered when attempting to locate roots.

To solve the problem, GC handles are used to encapsulate pointers to managed objects that are sent to unmanaged code. This enables two things:

All GC handles are reported to the GC as roots, meaning that the GC considers the objects they refer to as reachable until they are no longer protected by a handle.
Marshaling a pointer automatically pins the object referenced. Pinning instructs the GC not to move the object around during compaction. There is one obvious downside to pinning. The GC moves memory blocks for a reason: fragmentation can reduce the memory locality of your objects and cause your program's working set to remain large even though there is a sufficiently large number of free blocks to reclaim memory (if it weren't for that darn pinned object). Fragmentation can even cause large allocations to fail, which can cause your heap size to grow even larger or, in extreme cases under memory stress, result in OutOfMemoryExceptions when your total amount of noncontiguous free memory is actually large enough to handle the allocation.

You can also explicitly control the pinning of objects using the System.Runtime.InteropServices.GCHandle class. Call the Alloc static factory method, passing the object to be pinned along with a GCHandleType value of Pinned. This pins the object until you invoke Free on the GCHandle instance returned by Alloc. If you intend to store a reference to a managed object for longer than the duration of an unmanaged function call, yet the managed code that sent it is going to drop its reference after handing it over, using a GCHandle is a great way to enable this scenario. Be careful to unpin your object when you are done. As discussed in Chapter 3, pinning can dramatically reduce the performance of the garbage collection algorithm.
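Explicit pinning with GCHandle can be sketched as follows. PinningSketch and RoundTrip are hypothetical names, and Marshal.WriteByte stands in for unmanaged code that would write through the raw pointer; no native library is actually called.

```csharp
using System;
using System.Runtime.InteropServices;

internal static class PinningSketch
{
    // Pins a managed array, writes through its raw address, and returns
    // the first element to show that the write is visible to managed code.
    internal static byte RoundTrip()
    {
        byte[] buffer = new byte[] { 1, 2, 3, 4 };

        // While the handle is allocated, the GC will neither move nor
        // collect the array.
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            // This address stays valid until Free is called and could be
            // handed to unmanaged code as an IntPtr.
            IntPtr address = handle.AddrOfPinnedObject();

            // Stand-in for unmanaged code writing through the pointer.
            Marshal.WriteByte(address, 0, 42);

            return buffer[0];
        }
        finally
        {
            // Always unpin, even if the call fails, so the GC can compact
            // this region of the heap again.
            handle.Free();
        }
    }
}
```

Placing Free in a finally block keeps the pin as short-lived as possible, which limits the fragmentation cost described above.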