The Atoms of COM


COM can be broken down into several discrete chunks, or atoms. The atoms of COM include interfaces, implementations (COM classes and their associated class objects and servers), and apartments. Understanding each piece by itself is necessary for understanding COM as a whole. Let's start with interfaces—probably the single most important atom of COM.

Interfaces

To help us understand the importance of interfaces, let's construct a hypothetical scenario. Imagine you're a late-night-up-in-the-room-above-the-garage developer trying to make it big by writing a component that almost everyone will want. You survey the computer landscape and notice that most of the computer users around are using office productivity applications such as word processors and spreadsheets. Imagine further that you've developed a spelling checker in C++ that is vastly superior to the ones that ship with the standard word processors and spreadsheets on the market.

A Spelling-Checker Component Example

Because you're a C++ developer, you maintain the worldview that everybody else uses C++. So you decide to develop the spelling-checker component in C++, as shown in the following code. The class definition and body might look something like this:

    // checker.h
    struct tagTEXTBLOB {
        unsigned long nSizeIs;
        char* pBuffer;
    };

    class CSpellChecker {
        static int m_nRefCount; // How many others are using this?
        LPTEXTBLOB m_lpText;    // LPTEXTBLOB is defined elsewhere.
    public:
        CSpellChecker(LPTEXTBLOB lpText);
        virtual ~CSpellChecker();
        void CheckIt();
    };

    // checker.cpp
    #include "checker.h"

    // A static data member's definition must repeat its type.
    int CSpellChecker::m_nRefCount = 0;

    CSpellChecker::CSpellChecker(LPTEXTBLOB lpText) {
        m_lpText = lpText;
    }

    CSpellChecker::~CSpellChecker() {
        m_lpText = NULL;
    }

    void CSpellChecker::CheckIt() {
        // Parse the text blob, looking up each word,
        //  making corrections when necessary.
    }

CSpellChecker is a regular C++ class that has a constructor, a destructor, static member data, regular member data, and some member functions. There's nothing really special about it. Clients can use CSpellChecker as they do any other class. Here's an example of how a client might use the spelling-checker class to check the spelling in a view:

    // EditView.h
    #include "checker.h"

    void CEditorView::OnCheckSpelling() {
        LPTEXTBLOB lpTextBlob = GetRawText();
        if (lpTextBlob) {
            CSpellChecker spellChecker(lpTextBlob);
            spellChecker.CheckIt();
        }
    }

So far, there's nothing extraordinary about this code. It's just a regular C++ class like so many others you've probably seen. Once you've developed the C++ class, your next goal is to make it available to everyone else. If you're going to retire early, you've got to get as many folks as possible to use your spelling checker. But that's easy, right? With so many office productivity applications out there, tapping into this huge, receptive market should be a breeze. All you need to do is to get your spelling checker incorporated into some software and collect a small royalty for each copy sold. At this point, figuring out a decent distribution mechanism is your key to success.

Try Static Linking

If you distribute your spelling-checker component as a library, office productivity vendors can use static linking to add your spelling-checker library to their applications. (This is the time-tested, traditional way.) However, distributing your library this way has two downsides.

The first disadvantage to the static-linking approach is redundancy. Because your spelling checker is so awesome, many vendors will undoubtedly decide to license it. Then as those vendors release their applications, customers will start buying the applications and installing them on their machines. If someone installs five different applications (each of which uses your spelling checker), that person has implicitly installed five copies of your spelling checker on his or her disk. That's fine if you own some stock in the mass storage industry, but for most of us, this redundancy chews up valuable disk space that we'd rather use for games.

The second disadvantage to the static-linking approach is that the spelling checker becomes "glued" to your client's application. Static linking is fine until you have to change the spelling checker's functionality for some reason. Perhaps you've found a great way to enhance the spelling-checker algorithm. Or perhaps, unfortunately, you've located a bug that causes your spelling checker to format the user's hard disk whenever certain words are encountered. Obviously, regardless of whether you're improving your product or fixing an error in it, you're going to need to release an updated version of your spelling checker. Your clients will have to rebuild and reissue their applications to accommodate the new version. From both a logistical and a marketing perspective, this situation is bad business and will likely make your clients very cranky.

Static linking has been used successfully for a long time as a way to distribute software. For example, most framework libraries were distributed that way until recently. Static linking used to be OK because a few years ago C++ libraries and frameworks were smaller than they are today. In addition, frameworks were much less popular than they are now. Linking library or framework code into an application wasn't a big deal back then because the libraries were smaller and most applications were written in the native language of Microsoft Windows: C and the Windows Software Development Kit (SDK).

These days, libraries and frameworks have assumed a prominent position in the typical software developer's toolbox. Major vendors use these libraries to get their applications out to market faster. (Just check out Microsoft's Paintbrush applet, which uses MFC, or Quyen's NetViz, which uses the Object Windows Library [OWL].)

Unfortunately, libraries are beginning to consume huge quantities of hard disk real estate. Such selfish resource consumption might be OK for one or two applications. But it's becoming a big problem now that many vendors are using the same (very large) libraries (such as your spelling checker).

Let's take MFC as an example. MFC as a framework can add significantly to the size of your code. A half-megabyte here and there isn't a whole lot these days—until you start multiplying it by the number of applications that use MFC. You can imagine what it would be like if each application carried around its own copy of MFC. Your hard disk would contain many redundant copies of MFC. Fortunately, most well-written applications aren't statically linked to MFC.

Dynamic Linking to the Rescue

The solution to the problems inherent in static linking is a technology called dynamic linking. Dynamic linking isn't a new idea. In fact, it's the cornerstone of Windows itself. Windows is really just a collection of dynamic-link libraries (DLLs). DLLs are pieces of executable code that sit on your hard disk waiting to be called. When client code requires the services of a DLL, the client code can load the DLL and link to the functions at run time (instead of at compile time and link time). That way, only one copy of a given library resides on a disk at any particular time. All the clients of the DLL simply share that one copy of the library, freeing up disk space and memory resources for other things.

Traditionally, DLLs have exported individual functions as entry points. This entry-point system is exactly how the Windows API works. All those gazillion API functions listed in the SDK manuals really just describe entry points into one of the several DLLs in Windows. The drawback is that this arrangement involves the client in several housekeeping steps to use the DLL. For example, think about using a Graphics Device Interface (GDI) object such as a pen in a straight C/SDK application. You first have to call CreatePen to get a handle to a pen. Then you use the pen to draw stuff. Finally, you need to call DeleteObject when you're done with the pen.

But wait: C++ is supposed to resolve this sort of problem. Indeed, one of the main benefits of C++ is its ability to group functionality into related pieces called classes. That's what frameworks such as MFC and OWL do. (For example, the MFC class CWnd pretty much wraps all the HWND-based API functions.) In addition, C++ constructors and destructors are supposed to perform all the setup and cleanup functions (such as CreatePen and DeleteObject). Naturally, you'd like to provide your spelling checker's functionality through a C++ class so that your clients can take advantage of these C++ features.
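The constructor/destructor pairing described above can be sketched in a few lines. Real GDI calls need the Windows headers, so this sketch substitutes two hypothetical stand-in functions, AcquirePen and ReleasePen, for the create and cleanup calls; PenHandle and CPenWrapper are likewise illustrative names, not part of MFC or OWL.

```cpp
#include <cassert>

// Hypothetical stand-ins for a handle-based C API (not real GDI calls).
typedef int PenHandle;
static int g_nNextHandle = 1;
static int g_nLivePens = 0;   // How many handles are outstanding?

PenHandle AcquirePen() { ++g_nLivePens; return g_nNextHandle++; }
void ReleasePen(PenHandle /*hPen*/) { --g_nLivePens; }

// The wrapper performs setup in its constructor and cleanup in its
// destructor, so the client cannot forget the cleanup call.
class CPenWrapper {
    PenHandle m_hPen;
public:
    CPenWrapper() : m_hPen(AcquirePen()) {}
    ~CPenWrapper() { ReleasePen(m_hPen); }
    CPenWrapper(const CPenWrapper&) = delete;            // One owner per handle.
    CPenWrapper& operator=(const CPenWrapper&) = delete;
    PenHandle Get() const { return m_hPen; }
};

// Returns the number of live handles while the wrapper is in scope.
int DrawSomething() {
    CPenWrapper pen;
    assert(pen.Get() != 0);
    return g_nLivePens;
}   // pen's destructor releases the handle here.
```

After DrawSomething returns, the handle count is back to zero without the caller having written any cleanup code.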

C++ and DLLs

Exporting the C++ class wholesale is probably the easiest way to expose the spelling-checker functionality. Inprise, Microsoft, and Symantec all support the following keywords for exporting entire classes from a DLL. The code for exporting the CSpellChecker class from a DLL looks like this:

    class __declspec(dllexport) CSpellChecker {
        static int m_nRefCount;
        LPTEXTBLOB m_lpText; // LPTEXTBLOB is defined elsewhere.
    public:
        CSpellChecker(LPTEXTBLOB lpText);
        virtual ~CSpellChecker();
        void CheckIt();
    };

Notice that the only difference in this class definition is the addition of __declspec(dllexport). When you export a class this way, all its member functions and static data members are added to the DLL's export list. Clients who want to use the CSpellChecker class need only include the header file in their source code and make sure the spelling-checker DLL is available in the path.

Wow, that was easy! Is there a hitch? Well, yes, a couple of problems arise.

The Downside of Exporting C++ Classes

Imagine you start the marketing extravaganza for your spelling checker and someone licenses the DLL. You've written the DLL using Microsoft Visual C++. Your first client happens to develop software using Microsoft Visual C++ too, so you won't face a problem here. Then your next client wants to use Inprise's version of C++ as a development platform. Unfortunately, this client can't use the DLL. Here's why.

One of the strengths of C++ is that it employs type-safe linking to enable function overloading. When a C++ class is compiled into object code, the names of the class member functions become mangled; that is, the names become decorated with all kinds of information indicating their return types and signatures. C++ decorates the class members to ensure that the client code and the object code link correctly. This feature is known as type-safe linking, and it's a good thing. However, the folks in New Jersey (Bjarne et al.) only defined the language features (such as type-safe linking)—they couldn't force compiler vendors to implement a feature in a certain way. The folks in Santa Monica, Redmond, and Scotts Valley are free to implement type-safe linking any way they choose.

Here are examples of how each vendor mangles the function and static data symbols of the CSpellChecker class definition shown earlier:

Symantec's Mangling

    ??0CSpellChecker@@QAE@PAX@Z
    ??1CSpellChecker@@UAE@XZ
    ??4CSpellChecker@@QAEAAV0@ABV0@@Z
    ??_GCSpellChecker@@UAEPAXI@Z
    ??_RCSpellChecker@@QAEAAV0@ABV0@@Z
    ?CheckIt@CSpellChecker@@QAEXXZ
    ?m_nRefCount@CSpellChecker@@0HA

Microsoft's Mangling

    ??0CSpellChecker@@QAE@ABV0@@Z
    ??0CSpellChecker@@QAE@PAX@Z
    ??1CSpellChecker@@UAE@XZ
    ??4CSpellChecker@@QAEAAV0@ABV0@@Z
    ??_7CSpellChecker@@6B@
    ??_ECSpellChecker@@UAEPAXI@Z
    ??_GCSpellChecker@@UAEPAXI@Z
    ?CheckIt@CSpellChecker@@QAEXXZ
    ?m_nRefCount@CSpellChecker@@0HA

Inprise's Mangling

    @CSpellChecker@$bctr$qpv
    @CSpellChecker@$bdtr$qv
    @CSpellChecker@CheckIt$qv
    @CSpellChecker@m_nRefCount

The compilers and linkers use this scheme to make sure that all the parameters are passed correctly and safely. Obviously, trying to link the Inprise-built DLL with a Symantec-built client will result in linking errors because each compiler uses a different name-mangling scheme.

Solving the Problem with Ordinals

One way around this name-mangling problem is to use ordinals. You can assign ordinal numbers to each exported member using a DEF file, thereby producing an import library for the compiler. Then the client code can refer to each member via its ordinal. This approach solves the problem of name mangling, but it introduces a huge maintenance overhead because you need to produce one import library for each compiler you support (because each compiler vendor probably uses a different name-mangling scheme).

Oops—Some More Problems

The type-safe linkage problem isn't the only problem you need to tackle. Other problems arise as you evolve the class. For example, what if you decide to change the spelling checker, say, to add a feature that allows the user to cache the 10 most frequently looked-up words? To implement this optimization, you'll need to add some data to your class. Another interesting aspect of C++ comes into play when you make such a change.

Perhaps the new class definition now looks like this:

    class __declspec(dllexport) CSpellChecker {
        static int m_nRefCount;
        LPTEXTBLOB m_lpText; // LPTEXTBLOB is defined elsewhere.
        LPSTR lpszFrequentWords[10];
    public:
        CSpellChecker(LPTEXTBLOB lpText) {
            // Initialize.
        }
        virtual ~CSpellChecker() {
            // Tear down.
        }
        void CheckIt() {
            // Do the checking.
        }
        void AddToFrequentWordList(LPSTR lpszWord) {
            // Cache frequent words.
        }
    };

By adding data to your class, you've changed the class's size. The class is now 40 bytes larger on a 32-bit Intel-based machine (ten 4-byte pointers). In addition, you've potentially changed the class's layout. The important point to keep in mind here is that when you write client code that uses a C++ class, the client is quietly aware of the class layout. Although C++ provides syntax for making members private, protected, or public, the semantics don't apply at run time. The client code isn't supposed to know anything about the C++ class layout, but it does. The client code understands the entire layout of the class even if the client can access only certain members using C++ code. Remember, older clients have already coded against the layout of a specific class. Using old client code and new DLL code (or new client code against old DLL code) will likely result in a horrific program crash, increased support costs, and lost sales. The upshot is that if you reissue the DLL, all your clients have to recompile their code and reissue their applications as well.
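The size change is easy to observe. The following sketch declares two illustrative versions of the class side by side (the names CSpellCheckerV1 and CSpellCheckerV2, the opaque TextBlob type, and the helper functions are assumptions made for this example) and measures the difference. On mainstream compilers the growth is ten pointers' worth of bytes: 40 on a 32-bit target, 80 on a 64-bit one.

```cpp
#include <cassert>
#include <cstddef>

struct TextBlob;   // Opaque stand-in for the text blob type.

// The class as the first clients compiled against it.
class CSpellCheckerV1 {
    TextBlob* m_lpText;
public:
    CSpellCheckerV1() : m_lpText(0) {}
    virtual ~CSpellCheckerV1() {}
    bool HasText() const { return m_lpText != 0; }
};

// The same class after the frequent-word cache is added.
class CSpellCheckerV2 {
    TextBlob* m_lpText;
    char* m_lpszFrequentWords[10];
public:
    CSpellCheckerV2() : m_lpText(0), m_lpszFrequentWords() {}
    virtual ~CSpellCheckerV2() {}
    bool HasText() const { return m_lpText != 0; }
    bool HasCachedWord() const { return m_lpszFrequentWords[0] != 0; }
};

// A client built against V1 reserves sizeof(CSpellCheckerV1) bytes for each
// object; code built against V2 expects this many bytes more.
std::size_t LayoutGrowthInBytes() {
    assert(sizeof(CSpellCheckerV2) > sizeof(CSpellCheckerV1));
    return sizeof(CSpellCheckerV2) - sizeof(CSpellCheckerV1);
}
```

An old client that constructs the object on its own stack or heap hands the new constructor too little memory, which is one concrete way the mismatch turns into a crash.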

Unfortunately, this situation brings us back to the problem of distribution that we were trying to solve by using dynamic linking. This solution isn't really much better than static linking. What if one of your clients doesn't want to recompile its application, but the other ones do? That means the people who buy applications from multiple vendors have to maintain two copies of a DLL that do pretty much the same thing (except that the new one is a bit faster). And you might be faced with this problem every time you reissue the DLL to fix a bug or make an improvement.

Ever wonder why you might have a bunch of different versions of the MFC DLL (such as MFC30.DLL, MFC40.DLL, MFC42.DLL, and so on) on your machine? You have this assortment because the DLL versions of MFC use this technique of exporting classes wholesale. Every time the folks in Redmond change the class size and layout, they have to reissue a new DLL. In addition, if you develop your software using the Symantec or the Inprise version of C++, you need to ship the corresponding version of the MFC DLL with your application. (For example, Symantec has a DLL named SMFC42.DLL.)

So you seem to be stuck between a rock and some hard places at this point. If you make the spelling checker available through static linking, you bloat your client's applications. If you use DLLs in the normal manner (that is, one export per function), you impose a good deal of overhead on your clients in terms of setup and cleanup code (in addition to not taking advantage of the strong features of C++). Providing the spelling-checker functionality via a C++ class exported from a DLL is OK as long as (1) everybody agrees on a single C++ compiler, and (2) the class size and layout never change. Unfortunately, these two conditions don't exist in the real world. But don't worry. There's a better way to develop software than using standard class-based programming—namely, using a discipline called interface-based programming. As we just saw when we were trying to distribute C++ classes, class-based programming implicitly couples the client to the DLL in several ways; that is, name-mangling and class-layout issues come into play. Interface-based programming separates the interface of a class from its implementation, thereby reducing the coupling between the client code and the DLL.

Interface-Based Programming

Although C++ has been a great tool for developing entire Windows-based applications during the past few years, it falls short when used to distribute object-oriented software components. The main reason for this limitation is that many C++ features are compiler-dependent and therefore source code bound. But remember that we're in the age of components now. We're trying to make it possible for anyone to purchase any component and be able to hook up to it easily.

When you export a class from a DLL, you explicitly export all the class's member functions, static data, and layout information to the client. You can probably get away with this method as long as you and all your clients are willing to use a single compiler forever. However, if your clients choose a different compiler or your compiler vendor decides to change the name-mangling scheme or class layout in memory, you're hosed. You'll have to recompile and redistribute everything.

Computer science 101

One of the first principles computer science instructors usually teach is the notion of establishing an interface and then holding it constant. If an interface is constant, you can switch the implementation around as much as you like without breaking code written to that interface. Although C++ has syntactic mechanisms for hiding various portions of C++ classes (using the public, private, and protected keywords), this mechanism breaks down as soon as you try to export the class from a DLL. When you export a class wholesale from a DLL, you implicitly provide all sorts of non-interface-related information to the client that can vary from compiler to compiler (or even from one version to another of a single compiler).

The problem we're trying to solve is to get the client's interface to a C++ class to remain constant so that we can exchange implementation details whenever we want to. For example, those implementation details might include a compiler vendor's name-mangling scheme or class layout in memory. In addition, we might want to shield ourselves from our own modifications to a C++ class that might inadvertently change the object's layout in memory. Fortunately, C++ has a mechanism for dealing with this situation: the abstract base class.

An abstract base class is simply a group of function signatures. In C++, pure abstract base classes—classes that have no data members and whose functions are all pure virtual—are expressed like this:

    class PureAbstract {
    public:
        virtual void Function1() = 0;
        virtual void Function2() = 0;
    };

These classes have three characteristics:

Every function is virtual.

Every function is without implementation. (That's what the " = 0" is for.)

They don't contain any data members.

At first glance, abstract base classes are strange beasts. They are C++ classes, but you can't instantiate them. However, you can derive new classes from abstract base classes and instantiate the derived classes as long as you implement the functions defined in them.

For example, any self-respecting C++ compiler will complain if you try this:

    class PureAbstract {
    public:
        virtual void Function1() = 0;
        virtual void Function2() = 0;
    };

    PureAbstract* pAbstract;
    pAbstract = new PureAbstract; // Error: PureAbstract can't be instantiated.

But this is OK:

    class PureAbstract {
    public:
        virtual void Function1() = 0;
        virtual void Function2() = 0;
    };

    class DerivedFromPureAbstract :
        public PureAbstract {
    public:
        virtual void Function1() { /* Implementation. */ }
        virtual void Function2() { /* Implementation. */ }
    };

    DerivedFromPureAbstract* pDerived;
    pDerived = new DerivedFromPureAbstract;

You're probably wondering, "Why would I ever want to use these classes?" All the mainstream C++ literature tells us to take classes and inherit the functionality we need from them and tweak the new class to our liking. But an abstract base class by itself does nothing; it has no functionality for a derived class to inherit. In short, you don't get all the free functionality that C++ is famous for. So what good are these classes? It turns out that abstract base classes are the way to hold a class interface constant in C++ (which sounds like it might be useful for solving the funky DLL problem we're facing).
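To make "holding the interface constant" concrete, here is a minimal sketch in which one piece of client code works unchanged against two different implementations. The names DictionaryInterface, CTinyDictionary, CBiggerDictionary, and CountKnownWords are invented for this illustration and don't appear elsewhere in the spelling-checker example.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// The interface: function signatures only, no data members.
class DictionaryInterface {
public:
    virtual bool IsKnownWord(const char* szWord) = 0;
};

static_assert(std::is_abstract<DictionaryInterface>::value,
              "an interface can't be instantiated directly");

// First implementation: knows a single word.
class CTinyDictionary : public DictionaryInterface {
public:
    virtual bool IsKnownWord(const char* szWord) {
        return std::strcmp(szWord, "component") == 0;
    }
};

// Second implementation: different data and layout, same interface.
class CBiggerDictionary : public DictionaryInterface {
    const char* m_words[3];
public:
    CBiggerDictionary() {
        m_words[0] = "atom";
        m_words[1] = "component";
        m_words[2] = "interface";
    }
    virtual bool IsKnownWord(const char* szWord) {
        for (int i = 0; i < 3; ++i)
            if (std::strcmp(szWord, m_words[i]) == 0) return true;
        return false;
    }
};

// Client code: written once against the interface, unaware of which
// concrete class (or what layout) sits behind it.
int CountKnownWords(DictionaryInterface& dict,
                    const char* const* words, int nWords) {
    assert(words != 0);
    int nKnown = 0;
    for (int i = 0; i < nWords; ++i)
        if (dict.IsKnownWord(words[i])) ++nKnown;
    return nKnown;
}
```

Swapping CTinyDictionary for CBiggerDictionary changes the answer CountKnownWords gives, but not a single line of CountKnownWords itself.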

Let's see what it would take to hold the spelling-checker class's interface constant so that we can switch around the implementation without breaking the clients.

Separating the interface from the implementation

One way to separate the interface from its implementation is to describe a class simply as a collection of functions. After all, that's really what a C++ class is: just a bunch of functions that operate on data. Sure, in C++ we can expose data members publicly (even though it's often better design not to). We can always hold data in a C++ class and provide access to the data through accessor functions. So what we're really doing with C++ classes is describing functionality that might or might not have data associated with it.

Now imagine thinking hard about what you'd like to expose in a class and coming up with a complete set of functions that describes your C++ class. That set of functions becomes the abstract base class from which you derive your implementation.

Recall the spelling-checker implementation. (This is the code that goes in the DLL.)

    // checker.h
    class __declspec(dllexport) CSpellChecker {
        static int m_nRefCount; // How many others are using this?
        LPTEXTBLOB m_lpText;    // LPTEXTBLOB is defined elsewhere.
    public:
        CSpellChecker(LPTEXTBLOB lpText);
        virtual ~CSpellChecker();
        void CheckIt();
    };

    // checker.cpp
    int CSpellChecker::m_nRefCount = 0;

    CSpellChecker::CSpellChecker(LPTEXTBLOB lpText) {
        m_lpText = lpText;
    }

    CSpellChecker::~CSpellChecker() {
        m_lpText = NULL;
    }

    void CSpellChecker::CheckIt() {
        // Parse the text blob, looking up each word,
        //  making corrections when necessary.
    }

By exporting the spelling-checker class wholesale from the DLL, the client can use the new operator to create an instance of the class. This approach works only if the stars are aligned, the gods aren't maligned, and both the client and the DLL were developed using the same compiler, because the client has coded to the entire class definition. We're trying to make it so that the client doesn't have to code to the class definition (because doing so brings in all the compiler-specific junk we're trying to avoid). To expose the class in a layout-independent and compiler-independent way, we need to develop a pure abstract base class representing the spelling-checker class to which the client can code.

For example, here's what the spelling-checker interface might look like:

    // checkeri.h
    class SpellCheckerInterface {
    public:
        virtual void CheckIt() = 0;
    };

Notice how the SpellCheckerInterface class differs from the original CSpellChecker class definition. Every function is virtual, the data members are absent, and = 0 follows each function declaration. To shield the client from everything compiler-dependent, the interface exposes only what the client really needs: a way to call the CheckIt function. SpellCheckerInterface has neither a constructor nor a destructor because constructors and destructors introduce compiler dependencies. We'll see how to deal with those issues in a moment.

To attach the interface to some working C++ code, just derive the spelling-checker class from SpellCheckerInterface like this:

    // checker.h
    #include "checkeri.h"

    class CSpellChecker : public SpellCheckerInterface {
        static int m_nRefCount; // How many others are using this?
        LPTEXTBLOB m_lpText;    // LPTEXTBLOB is defined elsewhere.
    public:
        CSpellChecker(LPTEXTBLOB lpText) {
            // Initialize.
        }
        virtual ~CSpellChecker() {
            // Tear down.
        }
        void CheckIt() {
            // Do what it takes.
        }
    };

CSpellChecker is the concrete class that actually implements the spelling-checker functionality. The only difference between this class and the original CSpellChecker definition is that CSpellChecker now inherits from SpellCheckerInterface. This inheritance adds SpellCheckerInterface's functions to CSpellChecker and promises the compiler an implementation of CheckIt.

Construction and destruction

We need to attend to one last detail: constructing and destroying the class. Consider again what we're trying to do: we're trying to move all the C++-specific details to the DLL side of the client/DLL boundary so that the client only needs to worry about accessing the pure functionality. C++ object construction and destruction both depend on compiler-specific C++ class information. For example, the new operator has to know how to call the object's constructor, which implies that the new operator must have knowledge about name mangling (a potential problem when going beyond DLL boundaries). In addition, the constructor has to know the address of the object's vtable so that it can put that address in the object's vpointer (an operation that requires class layout knowledge). As for the destructor, the name-mangling problem rears its ugly head again. So we still need to come up with some substitutes for the new operator and the delete operator and somehow manage the construction and destruction of objects.

What we really need to do is to move object construction and destruction behind the DLL wall so that all the compiler-dependent stuff stays in one place. The obvious way to handle object construction is to export a function from the DLL. For example, here's a bit of code that would work well as a constructor for the spelling checker. Notice that what the client receives is not the CSpellChecker object itself but, rather, a pointer to the SpellCheckerInterface.

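    #include "checkeri.h"
    #include "checker.h"

    // extern "C" turns off C++ name mangling for this function. WINAPI matches
    // the calling convention the client uses; exporting the plain name
    // (for example, through a DEF file) lets GetProcAddress find it.
    extern "C" SpellCheckerInterface* WINAPI
    ConstructSpellChecker(LPTEXTBLOB lpText) {
        return (SpellCheckerInterface*)
            new CSpellChecker(lpText);
    }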

The DLL exports this function. Clients can acquire a pointer to this function by calling the standard Windows API functions LoadLibrary and GetProcAddress. Once a client gets this function pointer, the client can call the function whenever it requires the spelling-checker functionality.

Before moving on, you need to understand an interesting bit of C++ chicanery that's going on here. Notice how the function casts the result of the new operator (which is a CSpellChecker pointer) to a pointer to the abstract base class (SpellCheckerInterface*). This cast might look a bit odd, but it's done for a reason. One not-so-well-known fact about C++ is that casting an object pointer to one of its pure abstract base classes yields a pointer to that pure abstract base class, which, as far as the client is concerned, is just a pointer to a function table. Just what we want!
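The factory-plus-cast idea can be tried out in a single source file, without a DLL. The sketch below makes several assumptions for the sake of a self-contained example: TEXTBLOB is given a concrete definition, a global counter (g_nChecksRun) stands in for real spell checking so that the call is observable, and a hypothetical DestroySpellChecker function frees the object because the interface has no cleanup function at this point in the discussion. It also shows that the derived-to-base conversion happens implicitly; an explicit cast produces the same pointer.

```cpp
#include <cassert>

struct TEXTBLOB {
    unsigned long nSizeIs;
    char* pBuffer;
};
typedef TEXTBLOB* LPTEXTBLOB;

class SpellCheckerInterface {
public:
    virtual void CheckIt() = 0;
};

static int g_nChecksRun = 0;   // Illustration only: makes the call observable.

class CSpellChecker : public SpellCheckerInterface {
    LPTEXTBLOB m_lpText;
public:
    CSpellChecker(LPTEXTBLOB lpText) : m_lpText(lpText) {}
    virtual ~CSpellChecker() {}
    virtual void CheckIt() { if (m_lpText) ++g_nChecksRun; }
};

// In a real DLL this would be the exported entry point.
SpellCheckerInterface* ConstructSpellChecker(LPTEXTBLOB lpText) {
    CSpellChecker* pConcrete = new CSpellChecker(lpText);
    SpellCheckerInterface* pInterface = pConcrete;   // Implicit conversion.
    assert(pInterface == static_cast<SpellCheckerInterface*>(pConcrete));
    return pInterface;
}

// Hypothetical cleanup entry point, also living on the "DLL side."
void DestroySpellChecker(SpellCheckerInterface* pInterface) {
    delete static_cast<CSpellChecker*>(pInterface);
}

// The "client": it sees only SpellCheckerInterface, never CSpellChecker.
int RunFactoryDemo() {
    char szText[] = "teh component";
    TEXTBLOB blob = { sizeof(szText) - 1, szText };
    SpellCheckerInterface* pSpellChecker = ConstructSpellChecker(&blob);
    pSpellChecker->CheckIt();
    DestroySpellChecker(pSpellChecker);
    return g_nChecksRun;
}
```

Notice that RunFactoryDemo would compile just as well if the definition of CSpellChecker were hidden in another module; only the two free functions and the interface need to be visible.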

Using the spelling-checker object

Using the spelling checker requires going to the Win32 API and explicitly loading the DLL and calling the constructor function. Here's how a client might acquire and use the SpellCheckerInterface:

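    #include <windows.h>
    #include "checkeri.h"

    typedef SpellCheckerInterface*
        (WINAPI* LPSPELLCHECKERCTOR)(LPTEXTBLOB);

    void UseSpellChecker(LPTEXTBLOB lpText) {
        SpellCheckerInterface* pSpellChecker = NULL;
        LPSPELLCHECKERCTOR lpSpellCheckCtor = NULL;
        HINSTANCE hInstance;

        hInstance = LoadLibrary("c:\\spellingchecker.dll");
        if (hInstance == NULL)
            return; // The DLL couldn't be loaded.

        // GetProcAddress returns a generic FARPROC, so a cast is required.
        lpSpellCheckCtor = (LPSPELLCHECKERCTOR)
            GetProcAddress(hInstance, "ConstructSpellChecker");
        if (lpSpellCheckCtor == NULL)
            return; // The entry point wasn't found.

        pSpellChecker = lpSpellCheckCtor(lpText);
        pSpellChecker->CheckIt();
    }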

This code doesn't venture beyond the realm of regular DLL coding techniques. UseSpellChecker first declares a pointer to a SpellCheckerInterface object. (The compiler sees this object simply as a function table.) In addition, UseSpellChecker declares a pointer to a function prototyped as the DLL's entry point used for obtaining spelling-checker interfaces. UseSpellChecker calls the Windows API function GetProcAddress to get the address of the DLL's ConstructSpellChecker function and then calls ConstructSpellChecker to get a SpellCheckerInterface. Once UseSpellChecker has the interface, it can use the interface to check the spelling of the block of text.

Destroying the spelling-checker object

As you can see from the preceding code, the spelling-checker object is allocated but never freed. How do you handle object destruction in a case like this?

Destroying a C++ object isn't a simple prospect: it involves deallocating memory properly and calling the object's destructor. Again, memory allocation and object construction policies aren't something we want to share across the client/DLL boundary. We have to somehow ask the DLL to destroy the object. At first, it seems as though we might be able to export a single destructor function (as an analogue to the constructor function). However, that would be fairly inconvenient. A better way to handle the destruction is to add one more member function to the interface so that the client can ask the object to delete itself when it's done using the object. So the SpellCheckerInterface gets a new function named DeleteMe, as shown here:

    // checkeri.h
    class SpellCheckerInterface {
    public:
        virtual void DeleteMe() = 0;
        virtual void CheckIt() = 0;
    };

The implementation code looks like this:

    void CSpellChecker::DeleteMe() {
        delete this;
    }

This code might look funny, but it's perfectly legal C++ syntax. Calling delete with this simply calls the object's destructor and then deallocates any memory the object used in the normal C++ way. The important point to realize is that this activity is happening on the DLL side of the client/DLL boundary, thereby decoupling the DLL from the client.
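Here is a self-contained sketch of the DeleteMe pattern that tracks object lifetime with a counter. The counter (g_nLiveCheckers), the parameterless ConstructSpellChecker, and the choice to make the destructor private are assumptions made for this illustration; the private destructor simply ensures that DeleteMe is the only way to destroy the object.

```cpp
#include <cassert>

class SpellCheckerInterface {
public:
    virtual void DeleteMe() = 0;
    virtual void CheckIt() = 0;
};

static int g_nLiveCheckers = 0;   // Illustration only: tracks live objects.

class CSpellChecker : public SpellCheckerInterface {
    // Private: nobody outside the class can write "delete p" or create
    // the object on the stack, so DeleteMe is the single exit.
    virtual ~CSpellChecker() { --g_nLiveCheckers; }
public:
    CSpellChecker() { ++g_nLiveCheckers; }
    virtual void CheckIt() { /* Do what it takes. */ }
    virtual void DeleteMe() { delete this; }
};

// Stand-in for the DLL's exported constructor function.
SpellCheckerInterface* ConstructSpellChecker() {
    return new CSpellChecker;
}

// The "client": allocation and deallocation both happen in the
// implementation's code, never in the client's.
int RunLifetimeDemo() {
    SpellCheckerInterface* pSpellChecker = ConstructSpellChecker();
    assert(g_nLiveCheckers == 1);
    pSpellChecker->CheckIt();
    pSpellChecker->DeleteMe();
    pSpellChecker = 0;   // The pointer is invalid after DeleteMe.
    return g_nLiveCheckers;
}
```

The one rule the client must follow is never to touch the pointer after calling DeleteMe, which is why the sketch clears it immediately.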

Now let's finish the example. Here's how a client might use and destroy your spelling-checker object:

    #include "checkeri.h"

    void UseSpellChecker(LPTEXTBLOB lpText) {
        /* Use the exported function */
        /* to construct the object.  */
        pSpellChecker->CheckIt();
        pSpellChecker->DeleteMe();
    }

Immutable interfaces

At this point, you know that an interface is just a collection of function signatures. (C++ represents these interfaces as abstract base classes.) You also know that the only way for client code to talk to an object is through the interface. Let's add one more ingredient: the idea that interfaces should remain immutable. Once you've decided which functions to include in the spelling-checker interface and clients start coding to the interface, the spelling-checker interface should never change. Of course, this immutability means that if a client can get a spelling-checker interface pointer, the client can count on having a well-known way to talk to the spelling-checker object.

Now you're probably thinking back and trying to recall the last project you worked on whose programmatic interface didn't change eventually. You probably can't think of one. The notion of holding the interface constant is all well and good: an ideal we should all aim for. But hey, we're in the real world, and software changes. What should we do when we need to add functionality or change an interface to a class?

The answer is to create a completely new interface with a new name. Imagine that you want to add a new function to your spelling-checker interface. To avoid breaking existing clients by changing the existing interface, you simply need to create a new interface. For example, let's say you want to add a function to get synonyms for a specific word:

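    // Inherit publicly so that a SpellCheckerInterface2 pointer can be
    // used wherever a SpellCheckerInterface pointer is expected.
    class SpellCheckerInterface2 : public SpellCheckerInterface {
    public:
        // DeleteMe and CheckIt come from SpellCheckerInterface.
        virtual char* GetSynonyms(char* szWord) = 0;
    };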

To implement this interface, you simply apply SpellCheckerInterface2 to the concrete class:

    // checker.h
    #include "checkeri.h"

    class CSpellChecker : public SpellCheckerInterface2 {
        static int m_nRefCount; // How many others are using this?
        LPTEXTBLOB m_lpText;    // LPTEXTBLOB is defined elsewhere.
    public:
        CSpellChecker(LPTEXTBLOB lpText) {
            // Initialize.
        }
        virtual ~CSpellChecker() {
            // Tear down.
        }
        void CheckIt() {
            // Do what it takes.
        }
        void DeleteMe() {
            delete this;
        }
        char* GetSynonyms(char* szWord) {
            // Generate synonyms.
            return 0; // Placeholder until synonyms are generated.
        }
    };

The problem is that you've now enabled two ways to talk to the spelling-checker class: through the SpellCheckerInterface base class and through the SpellCheckerInterface2 interface. To solve this problem, should you just add another entry point into the DLL (perhaps a function named ConstructSpellChecker2) to get the second version of the interface?

The way to fix this problem is to provide a well-known way to get new interfaces from an existing interface. For example, imagine that SpellCheckerInterface and SpellCheckerInterface2 look like this:

    class SpellCheckerInterface {
    public:
        virtual void* GetInterface(char* szInterfaceName) = 0;
        virtual void DeleteMe() = 0;
        virtual void CheckIt() = 0;
    };

    class SpellCheckerInterface2 : public SpellCheckerInterface {
    public:
        // GetInterface, DeleteMe, and CheckIt
        //  come from SpellCheckerInterface.
        virtual char* GetSynonyms(char* szWord) = 0;
    };

Once the client gets a SpellCheckerInterface, the client has a way of navigating to a second interface on the object by calling GetInterface. For example, here's how some client code might use GetInterface:

    void UseSpellChecker(SpellCheckerInterface* pSpellChecker) {
        pSpellChecker->CheckIt();
        SpellCheckerInterface2* pSpellChecker2;
        // GetInterface returns void*, so the client casts the result.
        pSpellChecker2 = static_cast<SpellCheckerInterface2*>(
            pSpellChecker->GetInterface("SpellCheckerInterface2"));
        if (pSpellChecker2) {
            char* pszSynonyms = pSpellChecker2->GetSynonyms("component");
            // Use the synonyms.
        }
    }

When you think about it, this is a pretty good solution for the versioning problem we encountered earlier. Because SpellCheckerInterface never changes, clients can always count on certain function signatures in the interface. Older clients don't break. If you decide to add new functionality to the spelling-checker object, you just create a new interface. The GetInterface function (always available at the top of the interfaces) provides a way for you to acquire more interfaces as necessary.

Here's how you would implement the new version of the object:

    // checker.h
    #include "checkeri.h"

    class CSpellChecker : public SpellCheckerInterface2 {
        static int m_nRefCount; // How many others are using this?
        LPTEXTBLOB m_lpText;    // LPTEXTBLOB is defined elsewhere.
    public:
        CSpellChecker(LPTEXTBLOB lpText) {
            // Initialize.
        }
        virtual ~CSpellChecker() {
            // Tear down.
        }
        void* GetInterface(char* pszInterfaceName) {
            // stricmp returns 0 when the strings match.
            if (stricmp(pszInterfaceName, "SpellCheckerInterface") == 0) {
                return static_cast<SpellCheckerInterface*>(this);
            } else if (stricmp(pszInterfaceName,
                "SpellCheckerInterface2") == 0) {
                return static_cast<SpellCheckerInterface2*>(this);
            } else {
                return 0;
            }
        }
        void DeleteMe() {
            delete this;
        }
        void CheckIt() {
            // Do what it takes.
        }
        char* GetSynonyms(char* szWord) {
            // Generate synonyms.
            return 0; // Placeholder until synonyms are generated.
        }
    };

The only difference in this implementation is the addition of the GetInterface function. Notice how this function is implemented. GetInterface examines the string passed in by the client. If the object implements the interface that the client requested, the object performs a static cast on its own this pointer. This code looks kind of funny, but it's perfectly legal. When the pointer to the concrete class is cast to one of its base classes, the C++ compiler shears into the object and yields a pointer to that base's portion of the object, which begins with the vptr (which just so happens to point to a vtable representing the interface functions).

Orthogonal interfaces

Although at times you might want to extend an interface by adding or changing functions, at other times you need to add completely new independent interfaces to your class. For example, imagine you want to add an interface that persists the most frequently used words into a file. You might write an interface that looks like this:

    class PersistMFUInterface {
    public:
        virtual void* GetInterface(char* szInterfaceName) = 0;
        virtual void DeleteMe() = 0;
        virtual void PersistMFUWords(char* pszFileName) = 0;
    };

Adding a new interface to an implementation is simply a matter of adding the abstract base class to the inheritance list and modifying GetInterface to work properly, like so:

    // checker.h
    #include "checkeri.h"

    class CSpellChecker : public SpellCheckerInterface2,
                          public PersistMFUInterface {
        static int m_nRefCount; // How many others are using this?
        LPTEXTBLOB m_lpText;    // LPTEXTBLOB is defined elsewhere.
    public:
        CSpellChecker(LPTEXTBLOB lpText) {
            // Initialize.
        }
        virtual ~CSpellChecker() {
            // Tear down.
        }
        void* GetInterface(char* pszInterfaceName) {
            // stricmp returns 0 when the strings match.
            if (stricmp(pszInterfaceName, "SpellCheckerInterface") == 0) {
                return static_cast<SpellCheckerInterface*>(this);
            } else if (stricmp(pszInterfaceName,
                "SpellCheckerInterface2") == 0) {
                return static_cast<SpellCheckerInterface2*>(this);
            } else if (stricmp(pszInterfaceName,
                "PersistMFUInterface") == 0) {
                return static_cast<PersistMFUInterface*>(this);
            } else {
                return 0;
            }
        }
        void DeleteMe() {
            delete this;
        }
        void CheckIt() {
            // Do what it takes.
        }
        char* GetSynonyms(char* szWord) {
            // Generate synonyms.
            return 0; // Placeholder until synonyms are generated.
        }
        void PersistMFUWords(char* pszFileName) {
            // Save the most frequently used words to
            //  a file denoted by pszFileName.
        }
    };

One interesting side effect of implementing C++ classes with multiple interfaces is that you create the potential for the client to refer to the C++ class more than once. Unfortunately, this possibility causes problems for the delete function. The problem is serious enough that we need to handle the object's lifetime a bit differently.

Object lifetime

The possibility that multiple interfaces can refer to a class causes a problem for the delete function. For example, consider the following code:

    void UseSpellChecker(SpellCheckerInterface* pSpellChecker) {
        pSpellChecker->CheckIt();
        SpellCheckerInterface2* pSpellChecker2;
        // GetInterface returns void*, so the client casts the result.
        pSpellChecker2 = static_cast<SpellCheckerInterface2*>(
            pSpellChecker->GetInterface("SpellCheckerInterface2"));
        if (pSpellChecker2) {
            char* pszSynonyms = pSpellChecker2->GetSynonyms("component");
            // Should you call pSpellChecker2->DeleteMe here?
        }
        PersistMFUInterface* pPersistMFU;
        pPersistMFU = static_cast<PersistMFUInterface*>(
            pSpellChecker->GetInterface("PersistMFUInterface"));
        if (pPersistMFU) {
            pPersistMFU->PersistMFUWords("c:\\MFUWords.txt");
            // Should you call pPersistMFU->DeleteMe here?
        }
    }

Unfortunately, it's unclear when the delete function should be called. One way to solve this problem is to use standard reference counting. So instead of including a delete function on the interface, you might include an AddReference function and a ReleaseReference function. Consider these new interfaces:

    class SpellCheckerInterface {
    public:
        virtual void* GetInterface(char* szInterfaceName) = 0;
        virtual void AddReference() = 0;
        virtual void ReleaseReference() = 0;
        virtual void CheckIt() = 0;
    };

    class SpellCheckerInterface2 : public SpellCheckerInterface {
    public:
        // GetInterface, AddReference, ReleaseReference,
        //  and CheckIt come from SpellCheckerInterface.
        virtual char* GetSynonyms(char* szWord) = 0;
    };

    class PersistMFUInterface {
    public:
        virtual void* GetInterface(char* szInterfaceName) = 0;
        virtual void AddReference() = 0;
        virtual void ReleaseReference() = 0;
        virtual void PersistMFUWords(char* pszFileName) = 0;
    };

The AddReference and ReleaseReference functions are there to help the spelling-checker object know how many references clients currently hold to it.

The final version of the spelling checker (with the reference counting) now looks like this:

    // checker.h
    #include "checkeri.h"

    class CSpellChecker : public SpellCheckerInterface2,
                          public PersistMFUInterface {
        static int m_nRefCount; // How many others are using this?
        LPTEXTBLOB m_lpText;    // LPTEXTBLOB is defined elsewhere.
        DWORD m_dwRefCount;     // Outstanding interface references.
    public:
        CSpellChecker(LPTEXTBLOB lpText) {
            m_dwRefCount = 0;
            // Initialize.
        }
        virtual ~CSpellChecker() {
            // Tear down.
        }
        void AddReference() {
            m_dwRefCount++;
        }
        void ReleaseReference() {
            m_dwRefCount--;
            if (m_dwRefCount == 0)
                delete this;
        }
        void* GetInterface(char* pszInterfaceName) {
            // stricmp returns 0 when the strings match.
            if (stricmp(pszInterfaceName, "SpellCheckerInterface") == 0) {
                AddReference();
                return static_cast<SpellCheckerInterface*>(this);
            } else if (stricmp(pszInterfaceName,
                "SpellCheckerInterface2") == 0) {
                AddReference();
                return static_cast<SpellCheckerInterface2*>(this);
            } else if (stricmp(pszInterfaceName,
                "PersistMFUInterface") == 0) {
                AddReference();
                return static_cast<PersistMFUInterface*>(this);
            } else {
                return 0;
            }
        }
        void CheckIt() {
            // Do what it takes.
        }
        char* GetSynonyms(char* szWord) {
            // Generate synonyms.
            return 0; // Placeholder until synonyms are generated.
        }
        void PersistMFUWords(char* pszFileName) {
            // Save the most frequently used words to
            //  a file denoted by pszFileName.
        }
    };

Notice that AddReference simply bumps the object's reference counter up by 1. ReleaseReference takes the reference counter down by 1. When the count reaches 0, the object deletes itself. The revised client code looks like this:

    void UseSpellChecker(SpellCheckerInterface* pSpellChecker) {
        pSpellChecker->CheckIt();
        SpellCheckerInterface2* pSpellChecker2;
        // GetInterface returns void*, so the client casts the result.
        pSpellChecker2 = static_cast<SpellCheckerInterface2*>(
            pSpellChecker->GetInterface("SpellCheckerInterface2"));
        if (pSpellChecker2) {
            char* pszSynonyms = pSpellChecker2->GetSynonyms("component");
            pSpellChecker2->ReleaseReference();
        }
        PersistMFUInterface* pPersistMFU;
        pPersistMFU = static_cast<PersistMFUInterface*>(
            pSpellChecker->GetInterface("PersistMFUInterface"));
        if (pPersistMFU) {
            pPersistMFU->PersistMFUWords("c:\\MFUWords.txt");
            pPersistMFU->ReleaseReference();
        }
    }

Now instead of deleting the object wholesale, the client simply releases the interface pointers once the client has finished using them.
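To watch the reference count rise and fall, it can help to trace the whole protocol in one small, self-contained program. The following is only a sketch under a few assumptions that aren't part of the chapter's listings: a hypothetical CreateSpellChecker factory that hands out the first pointer already counted, a hypothetical g_liveObjects counter that lets us observe destruction, const char* parameters, and the portable strcmp in place of stricmp.

```cpp
#include <cstring>

// Hypothetical counter (not in the chapter) so the sketch can observe
// when the object really goes away.
static int g_liveObjects = 0;

class SpellCheckerInterface {
public:
    virtual void* GetInterface(const char* szInterfaceName) = 0;
    virtual void AddReference() = 0;
    virtual void ReleaseReference() = 0;
    virtual void CheckIt() = 0;
};

class PersistMFUInterface {
public:
    virtual void* GetInterface(const char* szInterfaceName) = 0;
    virtual void AddReference() = 0;
    virtual void ReleaseReference() = 0;
    virtual void PersistMFUWords(const char* pszFileName) = 0;
};

class CSpellChecker : public SpellCheckerInterface,
                      public PersistMFUInterface {
    unsigned long m_dwRefCount;
public:
    CSpellChecker() : m_dwRefCount(0) { ++g_liveObjects; }
    virtual ~CSpellChecker() { --g_liveObjects; }

    void AddReference() { ++m_dwRefCount; }
    void ReleaseReference() {
        if (--m_dwRefCount == 0)
            delete this;
    }
    void* GetInterface(const char* szInterfaceName) {
        // Every pointer handed out is counted before it leaves the object.
        if (std::strcmp(szInterfaceName, "SpellCheckerInterface") == 0) {
            AddReference();
            return static_cast<SpellCheckerInterface*>(this);
        }
        if (std::strcmp(szInterfaceName, "PersistMFUInterface") == 0) {
            AddReference();
            return static_cast<PersistMFUInterface*>(this);
        }
        return 0;
    }
    void CheckIt() {}
    void PersistMFUWords(const char*) {}
};

// Hypothetical factory: the first interface pointer is counted too.
SpellCheckerInterface* CreateSpellChecker() {
    CSpellChecker* p = new CSpellChecker;
    p->AddReference();
    return p;
}

// Returns the number of live objects once every pointer is released.
int UseAndRelease() {
    SpellCheckerInterface* pSpellChecker = CreateSpellChecker();  // count 1
    pSpellChecker->CheckIt();
    PersistMFUInterface* pPersistMFU = static_cast<PersistMFUInterface*>(
        pSpellChecker->GetInterface("PersistMFUInterface"));       // count 2
    if (pPersistMFU) {
        pPersistMFU->PersistMFUWords("MFUWords.txt");
        pPersistMFU->ReleaseReference();                           // count 1
    }
    pSpellChecker->ReleaseReference();                             // count 0
    return g_liveObjects;
}
```

Counting the very first pointer in the factory is a design choice of this sketch. It keeps the rule uniform: whoever receives an interface pointer owes exactly one ReleaseReference call for it.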

The Upshot

So why would you ever want to spend the extra time writing abstract base classes? Wouldn't it be far more convenient simply to write a C++ class, export it from a DLL, and have the client link to the import library? This scenario actually works fine for small projects that are done in-house. In that setting, you can control which version of which compiler you choose. However, the operative word here is small. We've worked on projects where we've used the class-export technique and have found that it can cause significant problems on large projects (even when they're done in-house). When large projects use lots of DLLs that are coupled because of the DLL class-export mechanism, just adding one tiny variable to one tiny class can force a recompile of the entire project! That can sometimes take hours. (Sure is a great time to catch up on Dilbert, though!) And if you forget to compile one of the DLLs, your program might inadvertently crash because the client and the DLL don't agree on the layout of the class. Remember, for a program to work, all the bits and bytes have to be in exactly the right order.

These problems with DLLs (that is, trying to expose C++ classes from DLLs using the convenient __declspec(dllexport) statement) compound when you try to distribute your DLLs to other clients that might or might not share the same compiler, forcing you to create separate import libraries for each compiler. And this is in addition to the problems related to changing the size of the classes.

The technique outlined above (writing abstract base classes) might seem a bit extreme and might appear to introduce a bit of overhead into your DLLs. However, the benefits of not having to worry about recompiling huge libraries of code or breaking applications just because of a minor change in a DLL far exceed the up-front cost of somehow separating a C++ class's interface from its implementation.

If this programming technique involving abstract base classes makes sense to you, you're 85 percent of the way toward understanding COM. COM is built on this idea of separating an object's interface from its implementation, which is a real key to object-oriented software components. You simply need to exercise a bit of discipline.

COM Interfaces

COM is an integration and distribution technology based on the principle of interface-based programming. The main tenet of interface-based programming is that clients talk to software objects using well-defined interfaces (as opposed to talking to the software objects directly). For example, SDK-based Windows programming is a style of interface-based programming. Think about it—when you write software to talk to someone using an interface element on the screen, you don't talk to Windows directly. You communicate through a set of API functions, passing around a window handle. As a programmer, you don't know anything about what's behind that magic number represented by the window handle. However, you do have a well-defined way of making the software do what you want it to.

The C++ programming style of using abstract base classes is a style of interface-based programming. The client has a well-defined way of talking to the software. In this case, the interface is simply a table of function pointers.
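To make the idea of a function table concrete, here's a minimal sketch that builds one by hand instead of letting the compiler do it. It only illustrates the layout that compilers typically produce for abstract base classes; the names SpellCheckerVtbl, SpellCheckerObject, MakeSpellChecker, and CheckTwice are hypothetical and don't come from COM or from this chapter's listings.

```cpp
// A hand-built "interface": a table of function pointers, plus an object
// whose first member points at that table.
struct SpellCheckerObject;

struct SpellCheckerVtbl {
    int (*CheckIt)(SpellCheckerObject* self);
    int (*CountChecks)(SpellCheckerObject* self);
};

struct SpellCheckerObject {
    const SpellCheckerVtbl* vptr;  // Plays the role of the compiler's vptr.
    int checks;                    // Implementation state behind the table.
};

// The implementations the table points to.
static int CheckItImpl(SpellCheckerObject* self) { return ++self->checks; }
static int CountChecksImpl(SpellCheckerObject* self) { return self->checks; }

// One shared table for every object of this "class".
static const SpellCheckerVtbl g_spellCheckerVtbl = {
    CheckItImpl, CountChecksImpl
};

SpellCheckerObject MakeSpellChecker() {
    SpellCheckerObject obj = { &g_spellCheckerVtbl, 0 };
    return obj;
}

// Client code: every call goes through the table, never to the
// implementation functions directly.
int CheckTwice(SpellCheckerObject* p) {
    p->vptr->CheckIt(p);
    p->vptr->CheckIt(p);
    return p->vptr->CountChecks(p);
}
```

Because the client touches only the table, the functions behind it can change freely as long as the table's shape stays put.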

An interesting point to note is that C++ developers have always had the ability to write interface-based programs using C++ abstract base classes. COM just formalizes interface-based programming. Although the concept of interface-based programming is the most important point of COM, COM throws in some extra goodies. Those goodies are what we'll look at next.

Interfaces are immutable

When developing COM-based software, you'll spend a lot of your time concentrating on the interfaces that clients and objects are going to use to talk to each other. In COM, interfaces come first and then are followed up with implementations (as opposed to C++, where developers traditionally think about interfaces and implementation together). COM interfaces are expressed conveniently in C++ using abstract base classes, just as described earlier in this chapter.

As you saw in the spelling-checker example, COM defines a specific rule governing interfaces: interfaces are immutable. So once you create an interface and publish it (that is, the interface is used widely), you won't be able to change it. The advantage to this approach is that client code can count on an interface remaining constant. That way, clients don't break as a result of interface changes (because the interface never changes). Of course, software development is a dynamic process, and interfaces do change. At some point, you'll want to change your objects by changing an existing interface. In COM, you do that by adding a new interface. COM provides a way for the client to acquire those new interfaces—we'll go over that technique in a moment.

Just as in the spelling-checker example, COM interfaces are named. Instead of being named by human-readable strings, however, COM uses a numbering scheme to name interfaces. The numbers used to name interfaces are GUIDs.

GUIDs

When you work with interfaces, you must name them. Keep in mind these two points when it comes time to name your interfaces:

Interfaces remain immutable for all time.

Interfaces are going to be distributed all over the world.

These two facts mean that the naming scheme has to produce unique names. COM uses the Distributed Computing Environment (DCE) naming scheme. Specifically, COM borrows DCE's use of 128-bit numbers to identify unique elements. These 128-bit numbers are known as GUIDs, or globally unique identifiers. (GUIDs are also known as UUIDs, or universally unique identifiers.) For example, the following number is a GUID:

{1D5BA865-1F95-11d2-8CAA-E8C677DDD893}

Whenever you need to name something uniquely within COM, you can use a GUID. (In fact, you can use a GUID whenever you need a unique name for anything—GUIDs are often a great way to name kernel objects.) GUIDs are pretty easy to come by. The Windows API includes a function named CoCreateGuid for generating GUIDs. In addition, Visual C++ comes with a useful utility named GUIDGEN.EXE that generates GUIDs in various formats.

The GUID-production algorithm first relies on the network card installed in the computer (generating numbers that are unique in space). The GUID-production algorithm also relies on the system clock (giving a number unique in time). Finally, the algorithm also uses a couple of other persistent counters that ensure the GUIDs are unique.

The first place you'll bump into GUIDs is in naming interfaces. Again, because interfaces are unique items, they're named by GUIDs. When you need interfaces, you'll ask for them using these GUIDs.

NOTE
GUIDs are also called IIDs (interface IDs), CLSIDs (class IDs), LIBIDs (library IDs), and CATIDs (category IDs), depending on the item being identified. You'll run into some of these other terms later in the book.
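To see how a 128-bit name breaks down in code, here's a rough sketch of a GUID-like structure with comparison and printing helpers. The Guid type and the IsEqualGuid and GuidToString helpers are hypothetical stand-ins written for this example (they assume a 32-bit unsigned int); they only mimic the field split that the usual textual form suggests and aren't the Windows definitions.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical stand-in for a GUID: 128 bits split into the four groups
// you see in the textual form (32 + 16 + 16 + 8x8 bits).
struct Guid {
    unsigned int   Data1;     // Assumes a 32-bit unsigned int.
    unsigned short Data2;
    unsigned short Data3;
    unsigned char  Data4[8];
};

// The same structure goes by different names depending on what it names.
typedef Guid Iid;    // Interface ID
typedef Guid Clsid;  // Class ID

bool IsEqualGuid(const Guid& a, const Guid& b) {
    return a.Data1 == b.Data1 && a.Data2 == b.Data2 && a.Data3 == b.Data3 &&
           std::memcmp(a.Data4, b.Data4, sizeof a.Data4) == 0;
}

// Formats a Guid in the familiar brace-and-hyphen form, in uppercase hex.
std::string GuidToString(const Guid& g) {
    char buf[64];
    std::snprintf(buf, sizeof buf,
        "{%08X-%04X-%04X-%02X%02X-%02X%02X%02X%02X%02X%02X}",
        g.Data1, g.Data2, g.Data3,
        g.Data4[0], g.Data4[1], g.Data4[2], g.Data4[3],
        g.Data4[4], g.Data4[5], g.Data4[6], g.Data4[7]);
    return buf;
}

// The example GUID shown earlier in the text.
const Guid kExampleGuid = { 0x1D5BA865, 0x1F95, 0x11D2,
    { 0x8C, 0xAA, 0xE8, 0xC6, 0x77, 0xDD, 0xD8, 0x93 } };
```

Comparing two GUIDs is just a field-by-field comparison, which is why asking an object for an interface by GUID is cheap compared with comparing interface names as strings.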

IUnknown

Recall from the spelling-checker example that each interface started with a function named GetInterface. This function allows clients to widen their connections to objects at run time. In addition, each interface included a function to notify an object that new references are being made to it or that existing references are being removed. When you think about it, that's the kind of functionality all objects need in a binary object model. COM includes these facilities within a single well-known interface named IUnknown, which looks like this:

    struct IUnknown {
        virtual HRESULT QueryInterface(REFIID riid, void** ppv) = 0;
        virtual ULONG AddRef() = 0;   // Returns the new reference count.
        virtual ULONG Release() = 0;  // Returns the new reference count.
    };

Here's the trick to understanding IUnknown:

Every COM object implements this functionality.

Every COM interface makes this functionality available.

The first part is easy—every COM class you see will have code to handle QueryInterface, AddRef, and Release. The second part implies that every COM interface will start with these three functions.

Again, IUnknown embodies the functionality that more or less every object needs. The interface is called IUnknown because when you get a pointer to it, you don't know anything about the object behind the pointer. It's your job to query and find out about the interfaces the object supports. There are lots of ways to guess what's behind the pointer. For example, the vendor who provided this object might be able to give you some documentation telling you what interfaces the object supports.

NOTE
You should keep in mind one detail about QueryInterface's signature. Notice that the requested interface is returned as an out parameter (a parameter that a function uses to return information), not a conventional return value. The reason for the out parameter is that the COM remoting mechanism requires remotable functions to return HRESULTs. HRESULTs are 32-bit rich error codes that indicate success or failure of a method, the area in which a failure occurred (remote procedure call, or RPC; Automation; storage; and so on), and a specific error code.
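The way an HRESULT packs its three pieces of information into 32 bits can be sketched with a few helper functions. These are hypothetical stand-ins written for illustration (on Windows you'd use the definitions in the platform headers instead), and the 13-bit facility mask is an assumption of this sketch. The sketch relies on the layout described above: the top bit signals failure, the low 16 bits carry the specific code, and the facility sits in between.

```cpp
#include <cstdint>

// Hypothetical stand-in for HRESULT: a signed 32-bit value.
typedef std::int32_t Hresult;

// Facility numbers used in this sketch.
const unsigned kFacilityNull = 0;  // General-purpose codes.
const unsigned kFacilityRpc  = 1;  // Remote procedure call failures.

// Packs severity (1 = failure), facility, and code into one value.
inline Hresult MakeHresult(unsigned severity, unsigned facility,
                           unsigned code) {
    return static_cast<Hresult>((severity << 31) | (facility << 16) | code);
}

// A negative value means the top (severity) bit is set.
inline bool Succeeded(Hresult hr) { return hr >= 0; }
inline bool Failed(Hresult hr)    { return hr < 0; }

// The facility starts at bit 16 (masked to 13 bits in this sketch).
inline unsigned Facility(Hresult hr) {
    return (static_cast<unsigned>(hr) >> 16) & 0x1FFF;
}

// The specific code is the low 16 bits.
inline unsigned Code(Hresult hr) {
    return static_cast<unsigned>(hr) & 0xFFFF;
}

const Hresult kOk          = 0;                                 // S_OK
const Hresult kNoInterface = MakeHresult(1, kFacilityNull, 0x4002);
                                                  // 0x80004002, E_NOINTERFACE
```

Testing for failure by sign alone is why client code can check a call's outcome without knowing every possible error code in advance.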

The separation between interface and implementation is so fundamental to COM that there's even a separate language for describing interfaces: Interface Definition Language.

Interface Definition Language (IDL)

Microsoft IDL (MIDL) exists to specifically describe interfaces in unambiguous terms. In conventional C and C++ development (nondistributed development), everybody's swimming in the same pool; that is, client software and object software share the same memory and resources such as the stack. These situations permit a lot more wiggle room, allowing for such conveniences as open-ended arrays.

Remember that one of COM's overriding goals is to make it easy to distribute software. We've looked only at the DLL distribution mechanism so far, but COM also makes it easy to write objects that can be distributed over a network. Obviously, when you've got client code sitting on one machine calling object code on another machine, the client and the object no longer share the same calling context. Something (in this case, the remoting layer) has to pick up the calling context of the client and transfer that calling context over to the object machine. This activity requires interfaces to be defined clearly and specifically. That's why IDL exists. When you describe interfaces using IDL, there's no wiggle room. You describe the calling context exactly so that it can be set up anywhere it needs to go.

IDL looks like an attribute-extended version of C. You describe "things" in IDL. "Things" in this context include interfaces, libraries, parameters, and so forth. Each "thing" in IDL can be preceded by an attribute. For example, when you describe interface functions, you can provide explicit instructions to the remoting layer about the size and direction of the parameters. We'll see an example in just a moment.

Microsoft's IDL compiler even uses the C preprocessor. IDL supports a rich set of primitive data types (short, long, IUnknown*, and so on). You can use IDL to compose your own structures out of the primitive data types. Although the primary purpose of IDL is to describe interfaces, IDL is also useful for describing what interfaces you can expect to find in a COM class.

Here's an example of some IDL—the spelling-checker interfaces described in IDL-ese:

This IDL file describes three distinct interfaces: ISpellChecker, ISpellChecker2, and IPersistMFU. The first line of the IDL imports the standard COM definitions. This line is akin to including WINDOWS.H in your Windows program. Then come the interface definitions. Notice how each interface is preceded by attributes (found in the square brackets). Each interface is named by a GUID. Also notice how the parameters have attributes applied to them that indicate the direction of the parameters. You'll see an interesting parameter type listed in some of the function signatures: the BSTR type. The BSTR type is a Unicode string preceded by length data. These parameters are used for Visual Basic compatibility.

Finally, the IDL code has a library statement. In IDL, the library statement tells the compiler to build a type library, or a binary database, including the interface definitions. The type library usually accompanies the DLL (or EXE) housing the COM classes. The coclass statement in IDL indicates which interfaces the client can reasonably expect to retrieve from the COM class.

When developing COM software, IDL is where you start. Again, one of the most important tenets of COM programming is that class implementations should be separated from their interfaces. You need to treat interfaces as separate entities, and having a separate language to define interfaces simply reinforces that requirement. In addition, compiling the IDL generates lots of useful products that your object will need throughout its lifetime. For example, compiling the IDL provides C/C++ header files you can use to implement the interfaces. (The header files include abstract base classes defining the interfaces.) Compiled IDL also produces a file containing symbolic definitions for the GUIDs mentioned in the IDL, along with the network glue (the proxy/stub code) and the type library so loved by the Java and Visual Basic clients.

How clients use interfaces

To see how COM interfaces are used, let's take a look at how we can alter the previous spelling-checker example by using the COM versions of the spelling-checker interfaces. COM's protocol for using interfaces goes like this: you first acquire an interface using some means (perhaps you call an API). You use the interface for as long as you need to. Then you release the interface. Here's an example of acquiring, using, and releasing interfaces:

As in the abstract base class example, the client only knows how to talk to the interfaces (instead of talking directly to the implementations). Notice that the UseSpellChecker function accepts a pointer to an ISpellChecker interface. Don't worry about how to acquire that pointer yet. We'll go over that shortly. Next up, we'll examine how to tie the interfaces to an implementation.

Implementations (COM Classes)

Once you've defined your interfaces, you'll want to attach them to an implementation; that is, you'll want to write a COM class implementing those interfaces. COM classes are bodies of code that implement COM interfaces. A single COM class might implement several interfaces. In fact, a full-blown Microsoft ActiveX control implements more than a dozen COM interfaces.

As in the abstract base class example, the easiest way to attach the interfaces to a concrete class is to use multiple inheritance of the interfaces you want to implement. (This isn't the only way to wire up a COM class—we'll look at other ways when we cover advanced interface composition techniques in Chapter 8.) The formula for implementing a COM class is to inherit a concrete C++ class from the interfaces you want to implement. Then just implement the union of all interface functions on your C++ class, like so:

This class is strikingly similar to the one in the abstract base class example. The only real change is that the interface function signatures return HRESULTs (so you can remote the interface if you choose to), and the name of the function for retrieving more interfaces is QueryInterface.
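For readers who want to try the pattern without the platform headers, here's a portable sketch of a class built on multiple inheritance of interfaces, with QueryInterface, AddRef, and Release implemented once on the concrete class. Everything in it is a stand-in chosen for illustration: Guid, Hresult, IUnknownLike, ISpellCheckerLike, IPersistMFULike, and the two made-up interface IDs are assumptions, not the real COM declarations or this chapter's actual listing.

```cpp
#include <cstring>

// Portable stand-ins (assumptions) for GUID and HRESULT.
struct Guid {
    unsigned int data1; unsigned short data2, data3; unsigned char data4[8];
};
inline bool operator==(const Guid& x, const Guid& y) {
    return x.data1 == y.data1 && x.data2 == y.data2 && x.data3 == y.data3 &&
           std::memcmp(x.data4, y.data4, 8) == 0;
}
typedef int Hresult;
const Hresult kOk = 0;
const Hresult kNoInterface = static_cast<Hresult>(0x80004002u);

// Same value as the real IID_IUnknown; the other two IDs are made up.
const Guid IID_UnknownLike  = { 0, 0, 0, { 0xC0, 0, 0, 0, 0, 0, 0, 0x46 } };
const Guid IID_SpellChecker = { 1, 0, 0, { 0, 0, 0, 0, 0, 0, 0, 1 } };
const Guid IID_PersistMFU   = { 2, 0, 0, { 0, 0, 0, 0, 0, 0, 0, 2 } };

struct IUnknownLike {
    virtual Hresult QueryInterface(const Guid& riid, void** ppv) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};
struct ISpellCheckerLike : IUnknownLike {
    virtual Hresult CheckIt() = 0;
};
struct IPersistMFULike : IUnknownLike {
    virtual Hresult PersistMFUWords(const char* pszFileName) = 0;
};

class CSpellChecker : public ISpellCheckerLike, public IPersistMFULike {
    unsigned long m_cRef;
public:
    CSpellChecker() : m_cRef(0) {}
    virtual ~CSpellChecker() {}

    Hresult QueryInterface(const Guid& riid, void** ppv) {
        if (riid == IID_UnknownLike)
            // Two bases derive from IUnknownLike, so pick one path.
            *ppv = static_cast<IUnknownLike*>(
                       static_cast<ISpellCheckerLike*>(this));
        else if (riid == IID_SpellChecker)
            *ppv = static_cast<ISpellCheckerLike*>(this);
        else if (riid == IID_PersistMFU)
            *ppv = static_cast<IPersistMFULike*>(this);
        else {
            *ppv = 0;
            return kNoInterface;
        }
        AddRef();  // Every pointer handed out is counted.
        return kOk;
    }
    unsigned long AddRef() { return ++m_cRef; }
    unsigned long Release() {
        unsigned long c = --m_cRef;
        if (c == 0) delete this;
        return c;
    }
    Hresult CheckIt() { return kOk; }
    Hresult PersistMFUWords(const char*) { return kOk; }
};

// Returns the count reported when the second pointer is released.
unsigned long DemoQuery() {
    ISpellCheckerLike* pChecker = new CSpellChecker;
    pChecker->AddRef();                                     // count 1
    unsigned long seen = 0;
    void* pv = 0;
    if (pChecker->QueryInterface(IID_PersistMFU, &pv) == kOk) {  // count 2
        IPersistMFULike* pPersist = static_cast<IPersistMFULike*>(pv);
        pPersist->PersistMFUWords("MFUWords.txt");
        seen = pPersist->Release();                         // count 1
    }
    pChecker->Release();                                    // count 0
    return seen;
}
```

Note the double cast for the IUnknownLike case: because both interfaces derive from it, the class must choose which base path represents the object's identity.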

So far, you've seen how to implement COM classes using C++. Remember that the main precept in COM is the separation of the interface from its implementation. Once you understand that, your class is almost ready to participate in the COM infrastructure. For the spelling-checker object to work in the real world, it needs two more things: a class object and a server.

Class Objects

COM classes are always paired with class objects. (This is a requirement for playing in the COM game.) Class objects are COM objects—they implement interfaces, as do all other COM classes. However, they have a special place inside the COM architecture.

You can think of a class object as a meta-class for your COM class. It's a singleton-type COM class that is paired with the real COM class. For example, imagine a COM server with three COM classes in it. That server would also contain three class objects—one for each kind of COM class in the server. COM class objects generally serve two purposes. First, COM class objects are usually responsible for activating the classes to which they're paired. They almost always accomplish this by implementing an interface named IClassFactory. Clients ultimately end up using the IClassFactory interface to create instances of COM classes.

The second purpose of COM class objects is to serve as the static data area for a COM class. The nature of the class object is that it is global and static. A COM class object's lifetime begins at the same time as the server's lifetime. A class object's life extends beyond the life of the COM object it represents. This longevity makes the class object the ideal place to store static data or implement a static interface (similar to the static modifier in C++).

The key function to notice within IClassFactory is the CreateInstance function. Notice how closely CreateInstance resembles QueryInterface—there's a GUID to identify the interface and a place to put the interface pointer. The first parameter to CreateInstance is called the controlling unknown used for COM aggregation. Don't worry about it now. We'll take a closer look at it when we look at COM identity and composing COM classes using ATL in Chapter 8. Here's an example of a class object for the spelling-checker object. This class object implements the IClassFactory interface.

This class object is just like the other COM classes we've seen except for the way reference counting is done. COM class objects are usually global to the server as opposed to being created on the heap. This means that a class object doesn't need to worry about deleting itself—it will go away when the server goes away.
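The difference in lifetime handling is easier to see side by side. The following is a simplified sketch, not the real IClassFactory declaration: interface IDs are plain enum values standing in for GUIDs, and the names IClassFactoryLike, CSpellCheckerClassObject, g_classObject, and g_liveCheckers are hypothetical. The spelling-checker object deletes itself when its count reaches 0, whereas the class object is a global whose AddRef and Release have nothing to do.

```cpp
typedef int Hresult;  // Stand-in for HRESULT.
const Hresult kOk = 0, kNoInterface = -1, kNoAggregation = -2;

// Plain integers standing in for GUIDs, to keep the sketch short.
enum InterfaceId { kIidUnknown, kIidClassFactory, kIidSpellChecker };

struct IUnknownLike {
    virtual Hresult QueryInterface(InterfaceId iid, void** ppv) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};
struct ISpellCheckerLike : IUnknownLike {
    virtual Hresult CheckIt() = 0;
};
struct IClassFactoryLike : IUnknownLike {
    virtual Hresult CreateInstance(IUnknownLike* pUnkOuter,
                                   InterfaceId iid, void** ppv) = 0;
};

static int g_liveCheckers = 0;  // Hypothetical: observes object lifetimes.

class CSpellChecker : public ISpellCheckerLike {
    unsigned long m_cRef;
public:
    CSpellChecker() : m_cRef(0) { ++g_liveCheckers; }
    virtual ~CSpellChecker() { --g_liveCheckers; }
    Hresult QueryInterface(InterfaceId iid, void** ppv) {
        if (iid == kIidUnknown)
            *ppv = static_cast<IUnknownLike*>(this);
        else if (iid == kIidSpellChecker)
            *ppv = static_cast<ISpellCheckerLike*>(this);
        else { *ppv = 0; return kNoInterface; }
        AddRef();
        return kOk;
    }
    unsigned long AddRef() { return ++m_cRef; }
    unsigned long Release() {
        unsigned long c = --m_cRef;
        if (c == 0) delete this;  // Heap object: cleans up after itself.
        return c;
    }
    Hresult CheckIt() { return kOk; }
};

class CSpellCheckerClassObject : public IClassFactoryLike {
public:
    Hresult QueryInterface(InterfaceId iid, void** ppv) {
        if (iid == kIidUnknown)
            *ppv = static_cast<IUnknownLike*>(this);
        else if (iid == kIidClassFactory)
            *ppv = static_cast<IClassFactoryLike*>(this);
        else { *ppv = 0; return kNoInterface; }
        return kOk;
    }
    // Global object: nothing to count and nothing to delete.
    unsigned long AddRef() { return 2; }
    unsigned long Release() { return 1; }

    Hresult CreateInstance(IUnknownLike* pUnkOuter,
                           InterfaceId iid, void** ppv) {
        *ppv = 0;
        if (pUnkOuter) return kNoAggregation;  // Not supported in this sketch.
        CSpellChecker* p = new CSpellChecker;
        p->AddRef();                        // Hold the object during the query.
        Hresult hr = p->QueryInterface(iid, ppv);
        p->Release();                       // Deletes p if the query failed.
        return hr;
    }
};

// Lives as long as the "server" does.
static CSpellCheckerClassObject g_classObject;

// Creates two checkers and reports how many were alive at once.
int CreateTwoAndCount() {
    void* pv1 = 0;
    void* pv2 = 0;
    g_classObject.CreateInstance(0, kIidSpellChecker, &pv1);
    g_classObject.CreateInstance(0, kIidSpellChecker, &pv2);
    int live = g_liveCheckers;
    static_cast<ISpellCheckerLike*>(pv1)->Release();
    static_cast<ISpellCheckerLike*>(pv2)->Release();
    return live;
}
```

The AddRef and Release pair around the query in CreateInstance is a common guard: if the requested interface isn't supported, the final Release drops the count to 0 and the half-created object cleans itself up.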

So far, we've looked at how COM separates interfaces from their implementations and at how to attach a COM interface to a COM implementation (a COM class). We've also seen a COM class object—the instance-less area of a COM class. Our next stop is COM servers, where you'll find out how to house COM classes inside real code modules.

COM Servers

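COM objects obviously need to live somewhere: they live inside COM servers. One of the key features of COM is that it supports two fundamental localities:

In-process servers (DLLs), which load into the client's address space.

Out-of-process servers (EXEs), which run in a separate process on the same machine or on a remote machine.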

Another key feature of COM is that clients call in-process objects as easily as they call remote objects. The remoting layer is well defined and completely invisible to the client. What's more, the same object code can live in either an in-process server or an out-of-process server. Because the differences necessary for supporting different localities are easily isolated, you can use the same source code to write in-process and out-of-process objects.

The Client Side

Let's first look at COM servers from the client side. The client side is easy. Remember that COM is a binary object model. Instead of calling the new operator to create objects (as you do in C++), you call an API for activating objects, which COM supports.

Before calling any COM functions, a thread needs to call CoInitialize to load the COM infrastructure (and to enter an apartment, as we'll see in a moment). Once a thread calls CoInitialize, the thread is free to call COM APIs, including the activation APIs we're about to look at.

The first way to create COM objects is to retrieve the class object, ask for the IClassFactory interface, and call CreateInstance from the IClassFactory interface. Here's the prototype for CoGetClassObject:

The first parameter for CoGetClassObject is the GUID of the implementation you're looking for. The second parameter represents the locality of the server. The locality is represented by some bitwise flags that can be any of the following values OR'd together: CLSCTX_INPROC_SERVER, CLSCTX_INPROC_HANDLER, CLSCTX_LOCAL_SERVER, CLSCTX_REMOTE_SERVER, CLSCTX_ALL, and CLSCTX_SERVER. The third parameter is a structure containing the name of the remote machine (if applicable) and authorization information. Finally, the last two parameters are the QueryInterface signature—a GUID naming an interface and a place to put the interface pointer. When clients want to talk to the class object, they usually (but not always) ask for the IClassFactory interface.

Using CoGetClassObject is the most flexible way to activate objects. Once you get the class object, you can request any interface you want (not just IClassFactory). That way, you can use other interfaces to activate the actual object. The downside of using CoGetClassObject is that it takes more than one round-trip to activate the object. If you want to create several instances of the spelling-checker object, the performance will be better if you get the class object once and then ask the class object to manufacture multiple objects for you.

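COM provides a shortcut for activating objects: CoCreateInstance. CoCreateInstance wraps the calls to CoGetClassObject and IClassFactory::CreateInstance. Here's the prototype for CoCreateInstance:

 STDAPI CoCreateInstance(REFCLSID rclsid, LPUNKNOWN pUnkOuter, DWORD dwClsContext, REFIID riid, LPVOID* ppv);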

As with CoGetClassObject, the first parameter for CoCreateInstance is the GUID of the implementation you're looking for. The second parameter is the controlling unknown used for aggregation. The third parameter represents the locality requested by the client. Finally, the fourth and fifth parameters are the QueryInterface signature (a GUID representing the requested interface and a place to put the interface pointer).

This means of activating objects is less flexible because each call creates only a single object. However, it takes only one round-trip to create that object. If you want to create several instances of the spelling-checker object this way, performance suffers because every call repeats the class-object lookup that CoGetClassObject would have done once.
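The trade-off between the two activation routes can be sketched with a toy model. Nothing here is real COM: Runtime, ClassObject, and the two counting functions are hypothetical stand-ins, and the model counts only class-object lookups, not network round-trips.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

struct SpellChecker {};  // stand-in for the activated object

// Stand-in for a class object exposing IClassFactory::CreateInstance.
struct ClassObject {
    std::unique_ptr<SpellChecker> createInstance() {
        return std::make_unique<SpellChecker>();
    }
};

// Stand-in for the COM runtime: finds class objects and counts the lookups.
class Runtime {
public:
    explicit Runtime(const std::string& clsid) { classes_[clsid]; }

    // Plays the role of CoGetClassObject.
    ClassObject* getClassObject(const std::string& clsid) {
        ++lookups;
        auto it = classes_.find(clsid);
        return it == classes_.end() ? nullptr : &it->second;
    }

    // Plays the role of CoCreateInstance: lookup plus CreateInstance.
    std::unique_ptr<SpellChecker> createInstance(const std::string& clsid) {
        ClassObject* factory = getClassObject(clsid);
        return factory ? factory->createInstance() : nullptr;
    }

    int lookups = 0;

private:
    std::map<std::string, ClassObject> classes_;
};

// Route 1: fetch the class object once, then manufacture n objects.
int lookupsViaClassObject(int n) {
    Runtime rt("SpellChecker");
    ClassObject* factory = rt.getClassObject("SpellChecker");
    for (int i = 0; i < n; ++i) factory->createInstance();
    return rt.lookups;
}

// Route 2: call the shortcut n times.
int lookupsViaShortcut(int n) {
    Runtime rt("SpellChecker");
    for (int i = 0; i < n; ++i) rt.createInstance("SpellChecker");
    return rt.lookups;
}
```

In this model, creating five objects costs one lookup through the class object and five lookups through the shortcut, which is the pattern the two paragraphs above describe.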

So just what happens behind CoGetClassObject and CoCreateInstance? These two functions are responsible for locating and activating the servers the client requests. They work somewhat differently depending on the locality of the server requested, but the client doesn't care.

The Server Side: DLLs

A COM server is simply a code module that houses a COM class and its class object. Again, COM servers come in two flavors: DLLs and EXEs. Although the actual code for the COM classes doesn't vary much, the code for the server will vary depending on whether it's a DLL or an EXE. Let's start with DLL servers.

Four exported functions distinguish a COM DLL from a normal, everyday DLL: DllGetClassObject, DllCanUnloadNow, DllRegisterServer, and DllUnregisterServer. Of these four functions, only DllGetClassObject is absolutely required. If you want to be a good COM citizen when you write servers, however, you'll implement all of them. (Your ATL-based COM servers will implement all these functions.)

When a client calls CoGetClassObject using the GUID that identifies your object, the service control manager (SCM) searches the Registry for that GUID. COM looks under the Registry key HKCR (HKEY_CLASSES_ROOT) for the CLSID (short for class ID) key. The CLSID key contains all the COM classes registered on the machine. This key is just a list of GUIDs. If the GUID representing the spelling-checker class is listed and the GUID has a subkey named InProcServer32 that contains a value that points to a DLL, COM assumes that is the DLL containing the implementation. The SCM loads the DLL and looks for a distinguished entry point named DllGetClassObject. Here's the signature for DllGetClassObject:

 STDAPI DllGetClassObject(REFCLSID rclsid, REFIID riid, LPVOID* ppv);

The SCM simply forwards the CLSID and the interface ID (IID) requested by the client, along with the pointer that receives the interface. It's the DLL's responsibility to provide an interface to the class object if the requested class and interface are available. Here's how you might implement DllGetClassObject for the spelling-checker object:
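What follows is a minimal sketch of that idea, written as portable C++ so that it compiles without the Windows SDK. The Guid struct, the HResult alias, and the lowercase function names are stand-ins for the SDK's GUID, HRESULT, and DllGetClassObject. The error codes and the IClassFactory IID carry their real SDK values, but the spelling checker's CLSID is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

using HResult = std::int32_t;
constexpr HResult kOk = 0;                                                  // S_OK
constexpr HResult kNoInterface = static_cast<HResult>(0x80004002);          // E_NOINTERFACE
constexpr HResult kClassNotAvailable = static_cast<HResult>(0x80040111);    // CLASS_E_CLASSNOTAVAILABLE

// Same layout as the SDK's GUID: 16 bytes, no padding.
struct Guid {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t  data4[8];
};

inline bool sameGuid(const Guid& a, const Guid& b) {
    return std::memcmp(&a, &b, sizeof(Guid)) == 0;
}

// Hypothetical CLSID for the spelling checker (not a registered GUID).
constexpr Guid kClsidSpellChecker = {0x11111111, 0x2222, 0x3333, {1, 2, 3, 4, 5, 6, 7, 8}};
// IID_IClassFactory: {00000001-0000-0000-C000-000000000046}.
constexpr Guid kIidClassFactory = {0x00000001, 0x0000, 0x0000, {0xC0, 0, 0, 0, 0, 0, 0, 0x46}};

// Stand-in class object; a real one would implement IClassFactory.
struct SpellCheckerClassObject {
    HResult queryInterface(const Guid& iid, void** ppv) {
        if (sameGuid(iid, kIidClassFactory)) {
            *ppv = this;
            return kOk;
        }
        *ppv = nullptr;
        return kNoInterface;
    }
};

// The class object lives as a global in the DLL.
SpellCheckerClassObject g_spellCheckerClassObject;

// Plays the role of DllGetClassObject.
HResult dllGetClassObject(const Guid& clsid, const Guid& iid, void** ppv) {
    *ppv = nullptr;
    if (!sameGuid(clsid, kClsidSpellChecker)) {
        return kClassNotAvailable;  // this DLL doesn't house that class
    }
    return g_spellCheckerClassObject.queryInterface(iid, ppv);
}
```

A DLL that houses several classes would compare the incoming CLSID against each one in turn before giving up.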

Notice that the spelling-checker class object is a global variable in the DLL source code. This is fine because it suits the purpose of the class object: to be the static area of a COM class. When the SCM calls into DllGetClassObject, DllGetClassObject runs through the list of available COM classes (there's only one in this case) and queries the matching class object for the interface the client requested.

The other issue to tackle with the COM DLL is the issue of lifetime management. When you've got a client process space that's loaded with DLLs, that client will often want to remove DLLs when they're no longer needed. The Win32 API includes a function named FreeLibrary that complements the call to LoadLibrary used to load the COM DLL into the client's process space. Keep in mind, however, that a COM DLL might be serving multiple objects simultaneously. It would be very rude (and would likely crash the client process) to remove a DLL via FreeLibrary while the DLL is still in use. For this reason, COM has a specific unloading scheme for its DLLs. That's where the second distinguished entry point, DllCanUnloadNow, comes in.

COM DLLs usually maintain a global reference count on themselves. This count goes up in three cases:

DllCanUnloadNow examines this global reference count and returns S_OK if the reference count is 0 (the DLL is not serving any objects) or returns S_FALSE if the reference count is nonzero.
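Here's a minimal portable sketch of that check. The names are stand-ins for the real entry point and HRESULT constants, and the two events that bump the count in this model (an object being alive and a LockServer-style call) are assumptions chosen for illustration.

```cpp
#include <atomic>
#include <cassert>

constexpr long kOk = 0;     // S_OK
constexpr long kFalse = 1;  // S_FALSE

// The DLL-wide reference count.
std::atomic<long> g_moduleLocks{0};

void lockModule()   { ++g_moduleLocks; }
void unlockModule() { --g_moduleLocks; }

// Assumption: every live object keeps the DLL pinned.
struct SpellChecker {
    SpellChecker()  { lockModule(); }
    ~SpellChecker() { unlockModule(); }
};

// Assumption: models IClassFactory::LockServer(TRUE/FALSE).
void lockServer(bool lock) {
    if (lock) lockModule();
    else      unlockModule();
}

// Plays the role of DllCanUnloadNow.
long dllCanUnloadNow() {
    return g_moduleLocks.load() == 0 ? kOk : kFalse;
}
```

The count is atomic in this sketch because objects in the DLL might be created and destroyed on different threads.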

The system calls DllCanUnloadNow whenever a client calls the COM API function CoFreeUnusedLibraries. CoFreeUnusedLibraries goes to each DLL that's loaded and asks the DLL if it can unload by calling into DllCanUnloadNow. If the DLL can unload, the system frees the library.
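The sweep can be pictured with a toy model like the one below. LoadedDll and freeUnusedLibraries are hypothetical names, and the real CoFreeUnusedLibraries works on module handles rather than a vector, so treat this as a sketch of the control flow only.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Stand-in for a loaded COM DLL and its DllCanUnloadNow entry point.
struct LoadedDll {
    std::string path;
    std::function<bool()> canUnloadNow;  // true plays the role of S_OK
};

// Asks every loaded DLL whether it can go, and drops the ones that agree.
// Returns the paths that were freed.
std::vector<std::string> freeUnusedLibraries(std::vector<LoadedDll>& loaded) {
    std::vector<std::string> freed;
    for (auto it = loaded.begin(); it != loaded.end();) {
        if (it->canUnloadNow()) {
            freed.push_back(it->path);
            it = loaded.erase(it);  // models FreeLibrary on that DLL
        } else {
            ++it;
        }
    }
    return freed;
}
```

A DLL that still serves objects answers "no" and simply stays loaded until a later sweep.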

The last two functions that distinguish a COM DLL from a regular DLL are DllRegisterServer and DllUnregisterServer. By implementing these two functions, you turn your DLL into a self-registering DLL. "Self-registering" means that the DLL is responsible for putting the required entries into the Registry. So far, the only entry we've seen is the HKCR\CLSID\{Some Guid}\InProcServer32 key. We'll see others as we move further into ATL. DllRegisterServer and DllUnregisterServer are usually called by installation programs. Both functions use the Win32 Registry API to insert and remove Registry entries.
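To show the shape of self-registration without touching a real Registry, here's a sketch that uses a std::map as a stand-in for HKCR. The key string, the friendly name, and the function signatures are all hypothetical; the real entry points take no parameters and call the Win32 Registry API.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for HKEY_CLASSES_ROOT: key path -> default value.
using FakeRegistry = std::map<std::string, std::string>;

// Hypothetical key for the spelling checker; a real key holds a GUID.
const std::string kClsidKey = "CLSID\\{hypothetical-spell-checker-guid}";

// Plays the role of DllRegisterServer.
void dllRegisterServer(FakeRegistry& hkcr, const std::string& dllPath) {
    hkcr[kClsidKey] = "SpellChecker Class";
    hkcr[kClsidKey + "\\InProcServer32"] = dllPath;  // where the SCM finds the DLL
}

// Plays the role of DllUnregisterServer: removes what was added.
void dllUnregisterServer(FakeRegistry& hkcr) {
    hkcr.erase(kClsidKey + "\\InProcServer32");
    hkcr.erase(kClsidKey);
}
```

The point of the pairing is symmetry: whatever registration writes, unregistration should remove.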

The Server Side: EXEs

COM objects residing within EXEs are activated in a slightly different way than are COM objects residing within DLLs. Let's take a look at how COM EXE servers work under the hood.

Again, the client activates COM objects residing within EXEs in the same way it activates COM objects residing within DLLs: the client just calls CoGetClassObject or CoCreateInstance. When the SCM looks for the EXE version of the server, however, the SCM searches the Registry for the LocalServer32 key under the requested CLSID. When the SCM finds the path to the server, the SCM launches the server. The first task the server performs once it's launched is to register its class objects with the SCM by calling the COM API CoRegisterClassObject. This run-time registration makes the class objects available to the SCM so that the SCM can hand interfaces to class objects over to the client. The client then uses the class object to activate the real object. Once the EXE server has registered its class objects with the SCM, the server spins a message pump until it receives a WM_QUIT message.
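The run-time registration step can be modeled as a small table keyed by a registration cookie. CoRegisterClassObject does hand back a cookie that CoRevokeClassObject later consumes, but the ClassTable type, its method names, and the string class IDs below are hypothetical simplifications.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

struct ClassObject {};  // stand-in for a class object

// Stand-in for the SCM's table of running class objects.
class ClassTable {
public:
    // Models CoRegisterClassObject: returns a cookie for later revocation.
    std::uint32_t registerClassObject(const std::string& clsid, ClassObject* object) {
        std::uint32_t cookie = nextCookie_++;
        entries_.emplace(cookie, Entry{clsid, object});
        return cookie;
    }

    // Models CoRevokeClassObject: returns false for an unknown cookie.
    bool revokeClassObject(std::uint32_t cookie) {
        return entries_.erase(cookie) == 1;
    }

    // What the SCM does when a client asks for the class.
    ClassObject* find(const std::string& clsid) const {
        for (const auto& entry : entries_) {
            if (entry.second.clsid == clsid) return entry.second.object;
        }
        return nullptr;
    }

private:
    struct Entry {
        std::string clsid;
        ClassObject* object;
    };
    std::map<std::uint32_t, Entry> entries_;
    std::uint32_t nextCookie_ = 1;
};
```

Once the server revokes its registration, the table no longer hands that class object to clients.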

As with DLL servers, EXE servers must manage their own lifetimes. The main difference between DLL lifetimes and EXE lifetimes is that the server gets a lock on an EXE lifetime by calling CoRegisterClassObject. This call is equivalent to the DLL putting a lock on the server every time a reference to a class object is made. Remember that the SCM is going to hold on to the class object. Otherwise, the server increments its reference count every time it hands out an interface pointer.

Unlike DLL servers, which need to be unloaded from the outside, EXE servers are responsible for removing themselves when they are no longer needed. EXE servers self-delete by posting a WM_QUIT message to themselves at the appropriate time (when the server reference count drops to 0). This message causes the EXE server to fall out of the message pump, revoke its class objects, and clear out.
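Here's a portable sketch of an EXE server's life, from startup to shutdown. ExeServer, Message, and the trace strings are hypothetical; a real server would call CoInitialize, CoRegisterClassObject, GetMessage/DispatchMessage, CoRevokeClassObject, and CoUninitialize at the marked steps. One simplification to keep in mind: a real message pump blocks while its queue is empty, whereas this model stops when the queue runs dry so that the example always terminates.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

enum class Message { Work, Quit };

class ExeServer {
public:
    void lock() { ++locks_; }

    // When the last lock goes away, the server posts a quit message to itself.
    void unlock() {
        if (--locks_ == 0) queue_.push(Message::Quit);
    }

    void post(Message m) { queue_.push(m); }

    // Runs the server and returns a trace of the steps it performed.
    std::vector<std::string> run() {
        std::vector<std::string> trace;
        trace.push_back("initialize COM");          // CoInitialize
        trace.push_back("register class object");   // CoRegisterClassObject
        while (!queue_.empty()) {                    // the message pump
            Message m = queue_.front();
            queue_.pop();
            if (m == Message::Quit) break;           // WM_QUIT ends the pump
            trace.push_back("dispatch message");
        }
        trace.push_back("revoke class object");     // CoRevokeClassObject
        trace.push_back("uninitialize COM");        // CoUninitialize
        return trace;
    }

private:
    long locks_ = 0;
    std::queue<Message> queue_;
};
```

Messages that arrive after the quit message are never dispatched, which is why the server must not post the quit until its lock count really is 0.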

Notice how the thread first calls CoInitialize, registers the spelling checker's class object, and then runs the message loop. Before the server ends, it revokes the class object registration from the SCM by calling CoRevokeClassObject.

Finally, EXE servers are responsible for publishing the correct entries in the Registry. They do this by looking for /RegServer on the command line when they run. If an EXE server detects the /RegServer switch on the command line, the EXE server just plugs the proper entries in the Registry and leaves—the server doesn't run a message pump or engage in any other activity.
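The command-line check can be sketched as follows. StartupMode and chooseStartupMode are hypothetical names, and accepting either a leading slash or dash and ignoring case are assumptions made for this example rather than requirements stated above.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum class StartupMode { RegisterAndExit, RunServer };

// Case-insensitive comparison of two ASCII strings.
bool equalsIgnoreCase(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i]))) {
            return false;
        }
    }
    return true;
}

// Decides what the server should do based on its command line.
StartupMode chooseStartupMode(const std::vector<std::string>& args) {
    for (const std::string& arg : args) {
        // Assumption: both "/RegServer" and "-RegServer" are accepted.
        if (arg.size() > 1 && (arg[0] == '/' || arg[0] == '-') &&
            equalsIgnoreCase(arg.substr(1), "RegServer")) {
            return StartupMode::RegisterAndExit;  // write Registry entries, then leave
        }
    }
    return StartupMode::RunServer;  // register class objects and pump messages
}
```

In the register-and-exit mode the server would write its LocalServer32 entry and return from main without ever registering a class object.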

The final issue we'll explore is how COM handles threading and remote communication.

Apartments

COM is all about allowing as many different kinds of developers as possible to share software. This goal is no small feat, given the variety of development tools and environments in use. For example, it's perfectly reasonable to expect a Visual Basic program to use components written in C++, and vice versa. That kind of sharing is already a done deal because Visual Basic provides a mapping between the Visual Basic language and COM interfaces. Some other issues come into play, however, when developers using other development languages want to share software. One of those issues is thread safety, or making sure data and objects don't get messed up when accessed by multiple threads.
