COM Interoperability | Professional .NET Framework 2.0 (Programmer to Programmer)

The Component Object Model (COM) was once the reusable component technology. The .NET Framework and CLR are its natural descendants. But as with any technology, life continues well after its replacement has come onto the scene. There is a great deal of COM code out in the wild already, and indeed some people are still writing it. For example, new APIs are available in Windows Vista to make working with Really Simple Syndication (RSS) simpler; these APIs are built on COM. The designers of the CLR had backward and forward integration in mind when it came to COM. In fact, the runtime was originally called things such as COM+ 2.0, the COM+ Runtime, and so forth. Moreover, considering that many of the CLR architects and founders came from the COM team, it shouldn't be a surprise that the runtime and Framework bend over backward to accommodate COM.

A Quick COM Refresher

Before COM, the world of reusable components was quite simple (albeit not very successful). You wrote code in C++ and exported your interesting functions via dynamic-link libraries (DLLs) that clients linked against. This provided for some level of reuse and decoupling between the DLL and user programs. Some form of independent versioning was possible at least — especially if you decided to perform dynamic loading at runtime based on policy — but it was hard.

People had to deal explicitly with the subtle nuances and not-so-subtle headaches of cross-compiler integration, binary compatibility, and versioning. The dynamic programming, resource management, and usability features people really wanted were notably absent (or difficult to achieve). Everybody who wanted to do these things had to write and maintain their own rat's nest of code, making interoperability between discrete solutions near impossible. And furthermore, most solutions were C++ specific. VB was on the rise, and enabling cross-language interoperability was definitely at the top of many engineers' minds.

And thus, COM was born in the year 1993. In the beginning, it was simply the foundation of the Object Linking and Embedding (OLE) technology, but it grew to become much, much more over time. COM solved a few common problems:

Separation of interface from implementation in a binary compatible manner. With this innovation, one could version the implementation entirely independent of the interface, meaning that clients didn't need to recompile and wouldn't crash at runtime because they made assumptions about implementation layout. In addition to that, it forced people to think about encapsulation. You simply couldn't easily write code that relied on clients knowing internal implementation details without encountering versioning nightmares.
A consistent definition and utilization of runtime type identity (RTTI) and polymorphism. A common base interface IUnknown coupled with QueryInterface (a COM-aware replacement for C++'s dynamic_cast) enabled runtime dispatch based on the dynamic capabilities of an object and supplied yet another version resiliency mechanism. Code could take advantage of new features if they existed or fall back in a predictable manner if they didn't.
Standardized object and resource management mechanisms. The notion of ownership over a resource often gets fuzzy, and leads to situations where destructors and copy constructors in C++ can cause confusion over when precisely it's appropriate to delete an object. COM uses a reference counting scheme to reliably release resources only when there are no live pointers available. AddRef increments this count, QueryInterface auto-increments when it returns new references to an object, and Release decrements this count. When the reference count is zero, most implementations will make a call to delete this. (This, of course, only works if somebody is willing to abide by some "simple" rules around when to add or release references. It furthermore requires that the type's implementer write the correct reference counting logic.)
And lastly, language independence. The Interface Definition Language (IDL) provided a facility for defining language-independent contracts for COM objects, from which could be generated lots of language-specific boilerplate. Most Visual Basic users wrote COM code without even knowing it thanks to an intelligent IDE, compiler, and runtime. Automation made VB-style programming richer. And COM's tie-in with OLE means that COM objects can be easily accessed, manipulated, and scripted from business applications such as Microsoft Office.
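
The reference counting and QueryInterface rules described above can be sketched in portable C++. This is a minimal illustration rather than real COM: IUnknownLike, IGreeter, Greeter, CreateGreeter, and LiveGreeters are hypothetical names, interfaces are identified by strings instead of GUIDs, and QueryInterface returns a pointer instead of an HRESULT.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for IUnknown. Real COM identifies interfaces by GUID
// and reports success through an HRESULT; strings and raw pointers are used
// here only to keep the sketch portable.
struct IUnknownLike {
    virtual void* QueryInterface(const std::string& iid) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    virtual ~IUnknownLike() = default;  // clients must go through Release
};

struct IGreeter : IUnknownLike {
    virtual std::string Greet() = 0;
};

// Counts live Greeter instances so the point of destruction is observable.
inline int& LiveGreeters() {
    static int count = 0;
    return count;
}

class Greeter final : public IGreeter {
public:
    Greeter() { ++LiveGreeters(); }

    void* QueryInterface(const std::string& iid) override {
        // Every interface pointer handed out carries its own reference.
        if (iid == "IUnknownLike") {
            AddRef();
            return static_cast<IUnknownLike*>(this);
        }
        if (iid == "IGreeter") {
            AddRef();
            return static_cast<IGreeter*>(this);
        }
        return nullptr;  // unsupported: the caller can fall back predictably
    }

    unsigned long AddRef() override { return ++refs_; }

    unsigned long Release() override {
        assert(refs_ > 0 && "Release called more often than AddRef");
        unsigned long remaining = --refs_;
        if (remaining == 0) delete this;  // last reference is gone
        return remaining;
    }

    std::string Greet() override { return "hello"; }

private:
    ~Greeter() { --LiveGreeters(); }  // private: only Release may delete
    unsigned long refs_ = 1;          // the creator owns the first reference
};

// Factory function; stands in for COM activation in this sketch.
inline IUnknownLike* CreateGreeter() { return new Greeter(); }
```

A client would call CreateGreeter, ask for "IGreeter" through QueryInterface (falling back if it receives a null pointer), and call Release exactly once on each pointer it was given; the object deletes itself on the last Release.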

So in short: COM was great. It solved a lot of common problems of the day and was the first major step Microsoft took toward an intelligent runtime that provided rich type system and resource management facilities. Without COM, it's doubtful that the .NET Framework and CLR would have ever happened. If they had, they likely would have looked very different.

Distributed COM, Transactions, Etc. (COM+)

Because COM placed such an emphasis on decoupling interface from implementation, a natural extension to the existing programming model was to enable COM objects to be hosted inside a centralized server. A new network protocol, Distributed COM (DCOM), was invented to permit remote access by clients over the network. Specifically, the client was able to interact with a smart proxy mimicking the COM interface; its implementation transparently sent and received data using the DCOM protocol. This is a Remote Procedure Call (RPC)-like technology and is very similar to others such as the Common Object Request Broker Architecture (CORBA).

Distributed access to COM components brought with it a whole host of mechanisms to satisfy increasing requirements for building reliable and robust systems. Object lifetime, activation, and deactivation were suddenly more important, as were subjects such as pooling and complex resource management. Apartments likewise became more integral because of an increased likelihood that a single COM object would be accessed in parallel (which on the clients of the day was seldom a practical problem). Single-threaded apartments (STAs) also started to show their ugly side, destroying the scalability of server applications that relied on STA components. Distributed transactions were later introduced to facilitate isolated and atomic updates to components and integration with other transacted resources (e.g., databases, LDAP). This meant that multiple COM objects and even multiple clients and servers could participate in the same transaction. Microsoft Transaction Server (MTS) is the technology that fuels this functionality.

COM+ is a term coined years later to refer to the combination of the above technologies: start with COM, add in distributed programming using DCOM, mix in the ability to manage component lifetime and transactionality using MTS, and you've got COM+. These technologies played a key role in the evolution of both the CLR and distributed communication technologies, leading to web services (SOAP, WSDL, etc.) and implementations of service-oriented technology such as Windows Communication Foundation.

Enterprise Services is an extension to COM+ and integrates with the .NET Framework quite nicely. Its fundamental components are located in the System.EnterpriseServices namespace and System.EnterpriseServices.dll assembly. There are several books at the end of this chapter in the "Further Reading" section that I recommend you read if you're interested in knowing more about the Enterprise Services technology.

Backward Interoperability

Managed code running on the CLR is able to make use of COM quite easily. Much like types in the CTS, COM components are self-descriptive. All interfaces are fully described by their type library. A type library is an IDL definition of the COM interfaces supported, function-calling conventions, and attributes specifying GUIDs and other COM-recognized annotations. Given a type library, a client can generate the code necessary to call a specific implementation of a COM interface. This is the same technique VB uses, for example, to hide the use of COM from its users. The .NET Framework takes a similar approach.

Type libraries can be supplied in one of two fashions:

As standalone files: These will end in a .TLB extension, and are produced when an IDL file gets compiled by the MIDL.EXE SDK utility. Type library files are stored in binary format. You can examine their contents using utilities such as the COM/OLE TypeLib browser (OleView.exe) that ships with Visual Studio, for example.
As embedded resources inside another PE file (e.g., DLL, EXE, or OCX): This takes advantage of the capability of PE files on Win32 to embed resources other than just code and makes distribution simpler. There will be a segment inside the file that contains the type library in binary format. Many COM-aware tools, OleView.exe included, recognize this and enable you to extract and work with it just as you would a standalone TLB file.

The OleView.exe utility enables you to view the metadata exported by the library, regardless of which of the two formats it is contained within. It can be found installed with Visual Studio (in <vsroot>\Common7\Tools\Bin) or alternatively can be downloaded individually from http://msdn.microsoft.com.

Generating Managed Code Proxies for COM

The System.Runtime.InteropServices.TypeLibConverter class generates proxy types that managed code can use to communicate with COM objects. The SDK utility TLBIMP.EXE (type library importer) uses this library in its implementation. There are several configuration options that control the resulting namespace, assembly name, and so forth, but they have been omitted here for brevity. Please consult the SDK for such details. Visual Studio also uses the same technique when you select Add Reference for your project and navigate to the COM tab.

TypeLibConverter's ConvertTypeLibToAssembly function does all the magic. It takes as input a type library and produces as output a COM Interop Assembly that contains very simple proxies that expose the COM interface and forward method calls to the CLR. The runtime generates a Runtime Callable Wrapper (RCW) for each COM object defined in the library; RCWs are simple wrappers that know how to work with the underlying COM calling conventions. They also take care of marshaling data into and out of COM code. The implementation of RCWs actually lives inside the CLR itself. The CLR furthermore takes care of initializing the COM context (CoInitializeEx), creating instances of components (CoCreateInstance), performing the right reference counting (AddRef, Release), transitioning between apartments, and pumping and dispatching messages, among other things.

The proxies contained inside the Interop Assembly contain very little code. A rough sketch of how they are used is shown in Figure 11-1.

Figure 11-1: COM Interop Assembly generation and use in action.

The resulting COM Interop Assembly exports a set of regular CLR types that you can then use from your managed applications. As you can see in the diagram above, the CLR's RCWs actually make the invocation on the COM instance.

There is also the idea of a Primary Interop Assembly (PIA) to help solve one primary problem: each client who generates an ordinary Interop Assembly gets its own unique (and incompatible) copy of the managed proxies. If you have two portions of an application (perhaps two different libraries) and want to share components, you need a PIA. A PIA, therefore, serves as a single, machine-wide, authoritative Interop Assembly that clients should use, but aren't required to, when interoperating with a specific COM library. The only criteria for a PIA are that it be digitally signed and that it be annotated with the PrimaryInteropAssemblyAttribute. PIAs are typically installed in the GAC.

Working with the Proxies

The first thing you'll notice if you inspect the resulting assembly is that there will be an interface for each COM interface and an associated concrete class for the implementation of each interface. The naming convention is Foo and FooClass for the interface and class, respectively. These types are also annotated with attributes specifying their COM GUIDs, CLSIDs, and ProgIds where available. As with ordinary interfaces, you must use the class for instantiation and can use the interface for calling related methods on the COM interface.

Managing Reference Counts

Most of IUnknown's AddRef and Release magic is hidden underneath the interoperability services that the CLR provides. References are added and released as your instances cross boundaries. But the CLR will maintain at least one reference until your RCW is finalized. This will keep the COM object alive as long as the RCW managed object is usable, to prevent accidentally trying to use the underlying object after it's been deleted.

As we've already covered, finalization on an object happens at some indeterminate point after it is no longer in use. This is a far cry from the explicit pUnk->Release in C++ and Set pUnk = Nothing in VB. In most cases this behavior is fine. But as is the case with our discussion of finalization and Dispose above in the context of other resource management, sometimes you'll need to speed up the process for scarce resources. This might be the case, for example, if the COM object holds on to critical system resources, pools instances that are limited, and so forth. In such cases, you can use the Marshal.ReleaseComObject function to perform the final Release. After calling this function on a COM instance, it will be entirely unusable from managed code (the underlying COM object has been deleted at this point).

For example:

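    // Assumes: using System.Runtime.InteropServices; and a reference to the
    // ADODB Interop Assembly (Connection, ConnectionClass, Command, CommandClass).
    Connection cn = new ConnectionClass();
    try
    {
        cn.Open("MyConnectionString", "foo", "bar", 0);
        Command cmd = new CommandClass();
        try
        {
            cmd.ActiveConnection = cn;
            cmd.CommandText = "SELECT ...";
            // ...
        }
        finally
        {
            // Release the command's COM object now instead of at finalization.
            Marshal.ReleaseComObject(cmd);
        }
    }
    finally
    {
        if (cn.State != 0)
            cn.Close();
        // Release the connection's COM object now as well.
        Marshal.ReleaseComObject(cn);
    }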

In this code, we allocate a new ADO Connection, open it, create a new Command, and then do some database operations. Ordinarily, you wouldn't see the inner finally block, which calls ReleaseComObject on the command, but without it the COM object will stay alive until the GC sees that it is no longer in use and finalizes it. The outer finally would normally just make a call to Close as is shown here, but the additional call to ReleaseComObject forces the connection COM object itself to be deleted sooner. Note that this is a dangerous practice. COM objects will delete other objects when their reference count hits 0; this might, surprisingly, make another RCW that you still have a reference to immediately unusable.
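
The trade-off between waiting for finalization and releasing early can be sketched in portable C++. This is only an analogy, not how the CLR implements RCWs: NativeObject and Wrapper are hypothetical names, the wrapper's destructor stands in for finalization, and ReleaseNow plays the role of Marshal.ReleaseComObject.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical reference-counted native object (a stand-in for a COM object).
class NativeObject {
public:
    // destroyedFlag lets callers observe when the object deletes itself.
    explicit NativeObject(bool* destroyedFlag) : destroyed_(destroyedFlag) {}

    void AddRef() { ++refs_; }

    void Release() {
        assert(refs_ > 0 && "Release called more often than AddRef");
        if (--refs_ == 0) {
            *destroyed_ = true;
            delete this;
        }
    }

    int Value() const { return 42; }

private:
    ~NativeObject() = default;  // only Release may delete
    int refs_ = 1;              // the creator owns the first reference
    bool* destroyed_;
};

// Hypothetical wrapper playing the role of an RCW: it holds one reference
// until it is released explicitly or destroyed.
class Wrapper {
public:
    explicit Wrapper(NativeObject* obj) : obj_(obj) {}
    Wrapper(const Wrapper&) = delete;
    Wrapper& operator=(const Wrapper&) = delete;

    ~Wrapper() { ReleaseNow(); }  // stands in for finalization

    // Give up the reference early; the wrapper is unusable afterwards.
    void ReleaseNow() {
        if (obj_) {
            obj_->Release();
            obj_ = nullptr;
        }
    }

    int Value() const {
        if (!obj_) throw std::logic_error("wrapper used after release");
        return obj_->Value();
    }

private:
    NativeObject* obj_;
};
```

Calling ReleaseNow frees the native object immediately, and any later call through the same wrapper fails, which mirrors why an RCW cannot be used after ReleaseComObject has performed the final release.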

You can monitor the number of active RCWs from the Performance Monitoring tool (perfmon.exe). If you wonder why the active RCW count seems to grow even though you know your code isn't actively using any COM objects, it might be that the GC isn't kicking in when you expected it to. You might consider inserting calls to ReleaseComObject to reduce pressure.

Exceptions and HRESULTs

One major improvement over working with COM in C++ is that you needn't check HRESULTs after each method call. The CLR performs these checks for you and converts any failures to managed exceptions. This happens automatically on every method invocation.

The Marshal class's GetExceptionForHR and GetHRForException allow you to perform such translations manually. If a translation does not exist for a given HRESULT, the CLR will transform it into a System.Runtime.InteropServices.COMException, at which point the original HRESULT can be retrieved by accessing its ErrorCode property. There are actually quite a few COM-related methods on the Marshal class, but most are used internally by the runtime and by sophisticated developers who want to work with the inner workings of COM marshaling and interoperability.
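
To make the translation concrete, here is a rough C++ sketch of the check-and-throw pattern that the CLR spares you from writing by hand. It is an illustration under stated assumptions, not the runtime's actual mapping table: ComError, ThrowIfFailed, and OpenConnection are hypothetical names, and the constants are portable stand-ins that mirror the values in winerror.h (so this sketch should not be combined with the Windows headers, which define the same names as macros).

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Portable stand-ins for the Windows definitions (values match winerror.h).
using HRESULT = std::int32_t;
constexpr HRESULT S_OK         = 0;
constexpr HRESULT S_FALSE      = 1;
constexpr HRESULT E_FAIL       = static_cast<HRESULT>(0x80004005);
constexpr HRESULT E_INVALIDARG = static_cast<HRESULT>(0x80070057);

// The high bit marks failure, so failures are negative as signed values.
constexpr bool Failed(HRESULT hr) { return hr < 0; }

// Hypothetical analogue of COMException: it keeps the original code so the
// caller can still inspect it, much like the ErrorCode property.
class ComError : public std::runtime_error {
public:
    explicit ComError(HRESULT hr)
        : std::runtime_error("COM call failed"), code_(hr) {
        assert(Failed(hr) && "ComError should only wrap failure codes");
    }
    HRESULT ErrorCode() const { return code_; }

private:
    HRESULT code_;
};

// Translates a failure into an exception: a specific exception type where a
// mapping is known, and the generic ComError otherwise.
inline void ThrowIfFailed(HRESULT hr) {
    if (!Failed(hr)) return;  // S_OK, S_FALSE, and other success codes
    if (hr == E_INVALIDARG) throw std::invalid_argument("E_INVALIDARG");
    throw ComError(hr);
}

// Hypothetical COM-style method that reports errors through its return value.
inline HRESULT OpenConnection(const std::string& connectionString) {
    return connectionString.empty() ? E_INVALIDARG : S_OK;
}
```

With such a helper, a call site shrinks to ThrowIfFailed(OpenConnection(text)), and unmapped failures surface as a ComError whose ErrorCode returns the original HRESULT.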

Interoperating without Proxies

The .NET Framework enables you to work with COM libraries without going through the trouble of creating an Interop Assembly. OLE Automation enables this dynamic style of programming, which is much like the way in which VB6 made use of COM.

First, the Type class offers two methods to obtain a reference to a CLR proxy type backed by an RCW that can forward invocations to a COM instance. Both are static: GetTypeFromProgID enables you to use the COM object's user-friendly ProgID to obtain a reference; similarly, GetTypeFromCLSID retrieves a Type based on a COM object's class ID. For example, ADO's Connection object has a ProgID of ADODB.Connection; this is significantly easier to remember, type, and maintain than its CLSID, which is a GUID. Both of these methods offer overloads accepting a server name for remote instantiation (using COM+).

Once a Type is obtained, you must use late-bound invocation to instantiate and call methods on instances. Instantiation is done through the Activator.CreateInstance static method. Simply pass the Type as the argument, and it will generate a new RCW instance for you. Similarly, Type.InvokeMember performs function calls and property accesses. InvokeMember does a name-based lookup of the method or property requested and calls it through the Automation (e.g., IDispatch.Invoke) infrastructure. All of this gets resolved at runtime.
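
The essence of name-based, late-bound invocation can be sketched in portable C++. This is a toy model and not the Automation infrastructure itself: LateBoundObject, Register, MakeCommand, and the member names SetCommandText and GetCommandText are hypothetical, and arguments are plain strings rather than variants.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical late-bound object: members are looked up by name at call
// time, loosely analogous to resolving a name and then calling Invoke.
class LateBoundObject {
public:
    using Args = std::vector<std::string>;
    using Member = std::function<std::string(const Args&)>;

    void Register(const std::string& name, Member member) {
        assert(member && "member must be callable");
        members_[name] = std::move(member);
    }

    // Name-based lookup; a misspelled name is only detected here, at run time.
    std::string InvokeMember(const std::string& name, const Args& args = {}) {
        auto it = members_.find(name);
        if (it == members_.end())
            throw std::runtime_error("unknown member: " + name);
        return it->second(args);
    }

private:
    std::map<std::string, Member> members_;
};

// Builds a toy "command" object with a settable and gettable text value.
inline LateBoundObject MakeCommand() {
    auto text = std::make_shared<std::string>();  // state shared by both members
    LateBoundObject cmd;
    cmd.Register("SetCommandText", [text](const LateBoundObject::Args& a) {
        *text = a.at(0);
        return std::string();
    });
    cmd.Register("GetCommandText", [text](const LateBoundObject::Args&) {
        return *text;
    });
    return cmd;
}
```

Because the compiler never sees the member names, a call such as InvokeMember("SetCommandTxt", ...) builds without complaint and fails only when it executes, which is the main cost of this style.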

C# and VB differ in the code they permit you to write. Because the RCW is generated at runtime, code that uses it cannot be bound statically. The C# compiler can't emit code to bind to types and methods because the RCW type doesn't even exist at compile time! But in VB (assuming that Option Strict is off, which is what permits late binding), you can make method calls and property accesses, and the runtime library handles the transformation into the same code you'd have written by hand in C#.

For example, this is what the ADO Connection code snippet shown above would look like without an Interop Assembly in C# (and without the explicit Marshal.ReleaseComObject calls):

    // Assumes: using System; and using System.Reflection; (for BindingFlags).
    Type cnType = Type.GetTypeFromProgID("ADODB.Connection");
    object cn = Activator.CreateInstance(cnType);
    try
    {
        object[] args = new object[] { "MyConnectionString", "foo", "bar", 0 };
        cnType.InvokeMember("Open", BindingFlags.InvokeMethod, null, cn, args);
        Type cmdType = Type.GetTypeFromProgID("ADODB.Command");
        object cmd = Activator.CreateInstance(cmdType);
        cmdType.InvokeMember("ActiveConnection", BindingFlags.SetProperty,
            null, cmd, new object[] { cn });
        cmdType.InvokeMember("CommandTxt", BindingFlags.SetProperty, null,
            cmd, new object[] { "SELECT ..." });
        // ...
    }
    finally
    {
        if ((int)cnType.InvokeMember("State",
                BindingFlags.GetProperty, null, cn, null) != 0)
            cnType.InvokeMember("Close", BindingFlags.InvokeMethod, null, cn, null);
    }

InvokeMember takes a string representing the member to invoke, a BindingFlags enumeration value indicating what type of member we are accessing, an optional Binder argument (passing null means "default," which is what we want in this case), the target COM object proxy used for invocation, and the arguments to the member (null means none, which is for properties and/or 0-argument methods). Notice how ugly this code looks! And furthermore, look at how many strings are used, each of which is a potential typo that will fail at runtime. (There's an intentional typo in that block of code … Can you easily spot it? You will when it throws an exception!)

The same code in VB looks nicer but still suffers from the potential to fail at runtime:

    ' Late-bound calls on Object require Option Strict Off.
    Option Strict Off

    Dim cnType As Type = Type.GetTypeFromProgID("ADODB.Connection")
    Dim cn As Object = Activator.CreateInstance(cnType)
    Try
        cn.Open("MyConnectionString", "foo", "bar", 0)
        Dim cmdType As Type = Type.GetTypeFromProgID("ADODB.Command")
        Dim cmd As Object = Activator.CreateInstance(cmdType)
        cmd.ActiveConnection = cn
        cmd.CommandText = "SELECT ... "
        ' ...
    Finally
        If cn.State <> 0 Then
            cn.Close()
        End If
    End Try

Clearly interoperating with COM is a deep topic. If you're interested in serious interoperability, please consult the "Further Reading" section. There are several great books and online resources on the topic.

Forward Interoperability

Much like backward interoperability, the .NET Framework enables COM clients to call into newer .NET Framework code. This is called forward interoperability.

Hosting the CLR in Process

At a high level, COM code can host the CLR using the CorBindToRuntimeEx function and ICLRRuntimeHost interface. These are called the hosting APIs, and have been referenced on and off throughout this chapter. A detailed discussion is outside the scope of this book. But this code shows a brief example of some C++ that starts up the runtime:

    #include "stdafx.h"
    #include <stdio.h>
    #include <stdlib.h>
    #include "mscoree.h"

    int _tmain(int argc, _TCHAR* argv[])
    {
        ICLRRuntimeHost *pClrHost = NULL;

        // Bind to the runtime.
        HRESULT hrCorBind;
        if (S_OK != (hrCorBind = CorBindToRuntimeEx(
            NULL,   // Load the latest CLR version available
            L"wks", // Workstation GC ("wks" or "svr" overrides)
            0,      // No flags needed
            CLSID_CLRRuntimeHost,
            IID_ICLRRuntimeHost,
            (PVOID*)&pClrHost)))
        {
            fprintf(stderr, "Bind to runtime failed (0x%08lx)",
                (unsigned long)hrCorBind);
            exit(-1);
        }

        // Construct our host control object. MyCustomHostControl is a
        // user-supplied IHostControl implementation (not shown here).
        IHostControl *pHostControl = new MyCustomHostControl(pClrHost);
        if (!pHostControl)
        {
            fprintf(stderr, "Host control allocation failed");
            exit(-1);
        }
        pClrHost->SetHostControl(pHostControl);

        // Now, begin the CLR.
        HRESULT hrStart;
        if (S_OK != (hrStart = pClrHost->Start()))
        {
            if (hrStart == S_FALSE)
            {
                // OK; simply means the runtime has already started.
                _ASSERTE(!L"Runtime already started");
            }
            else
            {
                fprintf(stderr, "Runtime startup failed (0x%08lx)",
                    (unsigned long)hrStart);
                exit(-1);
            }
        }

        // And execute the program.
        DWORD retVal = -1;
        HRESULT hrExecute;
        if (S_OK != (hrExecute = pClrHost->ExecuteInDefaultAppDomain(
            L"foo.dll", L"StartupType", L"Start", L"...", &retVal)))
        {
            fprintf(stderr, "Execution of managed code failed (0x%08lx)",
                (unsigned long)hrExecute);
            exit(-1);
        }

        // Stop the CLR and clean up.
        pClrHost->Stop();
        pClrHost->Release();
        return (int)retVal;
    }

Notice that we use the mscoree.h header file; we also link the program with mscoree.lib, which is where many of the functions above are defined. CorBindToRuntimeEx actually loads the CLR in process. It returns a pointer to the ICLRRuntimeHost COM interface, through which we can configure policies (e.g., with SetHostControl) and control execution of the CLR. We then start the runtime and execute code using the Start and ExecuteInDefaultAppDomain functions.

Please refer to the "Further Reading" section for follow-up resources if you'd like to do serious development with the hosting interfaces.

Generating Type Libraries from Managed Code

The CLR also permits you to expose managed types to COM clients. The SDK utility TLBEXP.EXE is much like the TLBIMP.EXE program, except that it does the reverse: It uses the TypeLibConverter.ConvertAssemblyToTypeLib function to generate a type library containing interfaces and metadata about the managed classes in the assembly it was run against. Your TLB can be embedded in your assembly much like unmanaged DLLs and OCXs. This is only one part of the process, however, as making a fully executable COM implementation also entails creating new keys in the registry.

The REGASM.EXE tool adds the necessary keys to the registry when you run it against your assembly. It tells clients to use your assembly as the COM server, in other words the in-process implementation of the COM interface your TLB describes. You can also use REGASM.EXE to generate a new TLB by passing it the /tlb switch. These are distinct activities but are both necessary. Both of these tools allow you to customize the TLB generation and server registration process. For example, executing REGASM.EXE /regfile:mytlb.reg mytlb.dll will analyze your managed library mytlb.dll and provide a registry file which, when executed, adds the keys to your registry.

After these steps, COM clients can call through the generated interfaces. The managed code COM server that is registered uses COM Callable Wrappers (CCWs), much like RCWs, for exposing managed objects out to unmanaged clients. Please consult the "Further Reading" section and .NET Framework SDK documentation for details.

COM Visibility

When the above programs analyze your assembly, you might wonder how they figure out what to export as part of the COM callable interface. The simple answer is all public types and their members by default. But you can control this by adding System.Runtime.InteropServices.ComVisibleAttribute to your assembly, classes, interfaces, structs, and/or members, passing either true or false to the constructor to indicate whether the specific component is visible to COM or not. Visual Studio actually adds this to most project types automatically as part of the AssemblyInfo.cs file, and specifies a value of false.

COM visibility can be defined at various levels. You can declare an entire assembly as being COM visible, for example, or individual types. When you apply the attribute, all lexically contained components are also affected unless they too have a ComVisibleAttribute applied to them. This means that you can set your entire assembly to ComVisible(false) and mark individual types as ComVisible(true) to only export those select types. Likewise, you can attribute COM visibility at the member level.

One brief word of caution: it is usually safer to not export types as COM visible by default. Once you export your APIs and data structures, you enable the possibility that a client will take a binary dependency on the format you've chosen. In plain words: you cannot change the structure of your type at all, meaning no additional fields or methods. If you're in the business of shipping reusable APIs, this means you will be limited in how you can innovate in the future (well, unless you don't care about causing your clients unexpected access violations simply because they called your code from COM). When in doubt, stick [assembly: ComVisible(false)] in your assembly and enable individual types as needed.