Examining the COM Infrastructure | Programming Distributed Applications with COM and Microsoft Visual Basic 6.0

What do you know about COM so far? You know that a client must communicate with an object through an interface, and that a client binds to an object at run time and invokes methods through vTable pointers. You also know that COM clients learn about objects and interfaces by examining type libraries at compile time. These rules are the foundation on which COM is built.

But think about the following questions: How does a client identify a specific coclass or interface? How does a client create a COM object if it can't use the class name? How does a client obtain the first interface reference to an object? The COM infrastructure must address these questions as well.

The COM library is a set of DLLs and EXEs installed on any COM-enabled computer. These components are part of Windows NT 4 and Windows 95, but future versions of COM+ will decouple them from the operating system. Client applications interact with these components through the COM library, an API that is accessible to low-level programmers. Much of the COM library is exposed through OLE32.DLL. C++ programmers must make direct calls to this library, but Visual Basic programmers are shielded from this DLL by the run-time layer. Figure 3-3 shows the differences in the layers between a COM application written in C++ and one written in Visual Basic.

Client applications must call upon the services provided by the COM library to create and connect to objects. The sequence of object creation must be carefully orchestrated because a client must bind to an interface, not to a class. This leads to a catch-22: How can a client create an object when it doesn't know the definition of the creatable class? The following sections describe exactly how COM makes this possible.

Figure 3-3. C++ programmers talk to the COM library directly. Visual Basic programmers are shielded from this library by the Visual Basic run-time layer.

Globally Unique Identifiers (GUIDs)

COM coclasses and interfaces are identified by a globally unique identifier (GUID), which is a 128-bit integer. Approximately 3.4 x 10³⁸ values are possible for a 128-bit integer, so it's safe to assume that an effectively unlimited supply of these identifiers is available for use in COM. Most compilers and databases don't support 128-bit integers, so GUIDs are usually stored in other formats. C++ programmers use a data structure that represents a 128-bit value with a set of smaller integral values. A GUID can also be expressed as a string of 32 hexadecimal digits, which makes it somewhat readable. This string format is also used to store GUIDs in the Windows registry. Here is an example of what a GUID looks like in this format:

 {C46C1BE0-3C52-11D0-9200-848C1D000000}
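To make the two representations concrete, here is a minimal sketch of a GUID held as a set of smaller integral values and formatted as the registry-style string. The `Guid` structure and the `guid_to_string` helper are portable stand-ins written for this illustration; they mirror the field layout of the Windows GUID structure but are not the COM library's own declarations.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Portable stand-in for the Windows GUID structure: 128 bits stored as one
// 32-bit field, two 16-bit fields, and an array of eight bytes.
struct Guid {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t  data4[8];
};

static_assert(sizeof(Guid) == 16, "a GUID occupies exactly 128 bits");

// Format a Guid in the brace-delimited string form used in the registry.
// (Hypothetical helper name, not a COM library function.)
std::string guid_to_string(const Guid& g) {
    char buf[39];  // 38 visible characters plus the terminating NUL
    const int written = std::snprintf(
        buf, sizeof buf,
        "{%08lX-%04X-%04X-%02X%02X-%02X%02X%02X%02X%02X%02X}",
        static_cast<unsigned long>(g.data1),
        static_cast<unsigned>(g.data2), static_cast<unsigned>(g.data3),
        static_cast<unsigned>(g.data4[0]), static_cast<unsigned>(g.data4[1]),
        static_cast<unsigned>(g.data4[2]), static_cast<unsigned>(g.data4[3]),
        static_cast<unsigned>(g.data4[4]), static_cast<unsigned>(g.data4[5]),
        static_cast<unsigned>(g.data4[6]), static_cast<unsigned>(g.data4[7]));
    assert(written == 38);  // braces, 32 hex digits, and four hyphens
    (void)written;
    return std::string(buf);
}

// The example GUID from the text, expressed as a structure.
const Guid kExampleGuid = {0xC46C1BE0, 0x3C52, 0x11D0,
                           {0x92, 0x00, 0x84, 0x8C, 0x1D, 0x00, 0x00, 0x00}};
```

Calling `guid_to_string(kExampleGuid)` reproduces the string shown above, which shows that the structure and the string are two views of the same 128-bit value.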

The COM library supplies a function named CoCreateGuid, which is used to generate a new GUID. The function relies on an algorithm that uses information such as the unique identifier from the computer's network card and system clock to create a GUID that is guaranteed to be unique across time and space. C++ programmers use a utility named GUIDGen.exe to create GUIDs in the development environment. This allows them to cut and paste GUIDs into IDL and C++ source code. Visual Basic programmers never have to worry about this. The Visual Basic IDE generates GUIDs behind the scenes whenever they're needed.

GUIDs are used in many places in COM, but you should start by examining their use with interfaces and coclasses. Each COM interface has an associated GUID called an interface ID (IID). Each coclass has an associated GUID called a class ID (CLSID). When you examine IDL, you'll notice that each interface and coclass has a Universally Unique Identifier (UUID) attribute. Don't let the UUID attribute confuse you. UUIDs and GUIDs are the same thing.

 [ uuid(3B46B8A8-CA17-11D1-920B-709024000000) ]
 interface _IDog {
     // methods
 };

 [ uuid(3B46B8AB-CA17-11D1-920B-709024000000) ]
 coclass CBeagle {
     // interfaces
 };

These CLSIDs and IIDs are compiled into a server's type library. A GUID becomes the physical name for an interface or a coclass. When a client application is compiled against the type library, these GUIDs are also compiled into the client's binary image. This enables the client application to ask for a specific coclass and interface whenever it needs to create and bind to an object at run time.

Visual Basic does a pretty good job of hiding GUIDs from programmers. When you reference a type library in a Visual Basic project, you simply use the friendly names of interfaces and coclasses in your code. Visual Basic reads the required GUIDs out of the type library at compile time and builds them into the EXE or DLL when you choose the Make command from the File menu.

When you create a COM server, Visual Basic also hides the GUIDs from you. It automatically generates GUIDs for your interfaces and coclasses on the fly whenever you build a server using the Make command. This is convenient, but it would be nice if Visual Basic offered a little more flexibility. It doesn't allow you to take a specific GUID and associate it with an interface or a coclass. For instance, you might have a cool designer GUID like this:

 "{DEADBEEF-BADD-BADD-BADD-2BE2DEF4BEDD}"

Unfortunately, the Visual Basic IDE won't let you assign this GUID to one of the coclasses in your Visual Basic server. Visual Basic also requires that you build your server projects with an appropriate compatibility mode setting. If you don't, your GUIDs can be changed from build to build. Chapter 5 talks about compatibility in greater depth.

COM Activation

A client application can discover the CLSID of a coclass as well as which interfaces it supports at compile time through the type library. However, COM requires that no other dependencies be built between clients and coclasses. The client application must use a supported interface when it binds to and communicates with an object created from the coclass. This act of loading and binding to an object is called activation. A client can activate an object with some help from the COM library if it knows the CLSID and the IID of a supported interface.

Activation support must be built into COM's infrastructure because the client is never allowed to create an object using a visible concrete class definition from the server. If this were the case, the client would be required to see the class definition and know about the object's data layout at compile time. Instead, COM puts the responsibility of creating the object on the server that holds the coclass definition. The infrastructure support supplied by COM plays the role of middleman. It takes an activation request from the client and forwards it to the server.

The COM component that assists activation is the Service Control Manager (SCM), which is affectionately called "the scum" by savvy COM programmers. The SCM is a systemwide service that resides in RPCSS.EXE, as shown in Figure 3-3. (Don't confuse the SCM with the Windows NT Service Control Manager, which is used to start and manage Windows NT Services.)

A client application interacts with the SCM through OLE32.DLL. A C++ programmer can activate an object by calling a function named CoCreateInstance. Visual Basic programmers activate objects by using the New operator followed by a coclass name. Visual Basic translates a call to the New operator into a call to CoCreateInstance. In both C++ and Visual Basic, the SCM is passed the CLSID of the desired object and the IID of the interface that the client will use to connect to the object.

When the client passes the CLSID, the SCM uses configuration information in the Windows registry to locate the server's binary image. This is typically a DLL or an EXE. This means that a COM server requires an associated set of registry entries, including a physical path to its location. Each COM server is responsible for adding its configuration information to the registry when asked. Chapter 5 describes the use of the registry in COM in greater depth, but for now just assume that the SCM can always locate the server on the hard disk.
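The lookup can be pictured with a small sketch. An in-memory map stands in for the registry here, and every name (`Registry`, `ServerEntry`, `register_server`, `locate_server`) is hypothetical. On a real system the entries live under HKEY_CLASSES_ROOT\CLSID, where an InprocServer32 or LocalServer32 subkey holds the path to a DLL or an EXE, and the SCM reads them through the registry API rather than a C++ map.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical in-memory stand-in for the registry entries of a COM server.
enum class ServerKind { InProcessDll, OutOfProcessExe };

struct ServerEntry {
    ServerKind  kind;
    std::string path;  // physical location of the server's binary image
};

// Maps a CLSID in string form to the server that implements it.
using Registry = std::map<std::string, ServerEntry>;

// What a server does when it is asked to register itself.
void register_server(Registry& registry, const std::string& clsid,
                     ServerKind kind, const std::string& path) {
    assert(!path.empty());  // a registration without a path is useless
    registry[clsid] = ServerEntry{kind, path};
}

// What the SCM does with a CLSID: find the entry, or report failure
// by returning a null pointer.
const ServerEntry* locate_server(const Registry& registry,
                                 const std::string& clsid) {
    Registry::const_iterator it = registry.find(clsid);
    return it == registry.end() ? nullptr : &it->second;
}
```

A CLSID that was never registered yields a null pointer, which corresponds to an activation request that fails because the SCM can't find the server.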

What happens in an activation request? Here's the play-by-play. When a Visual Basic application calls the New operator on a coclass that is defined in a COM DLL, the following happens:

1. The Visual Basic run-time library calls CoCreateInstance and passes the SCM the requested CLSID and the IID.
2. The SCM locates the server (loading the server if necessary).
3. The SCM calls a well-known entry point in the server and passes it the CLSID and the IID.
4. The server creates the object of the type specified by the CLSID.
5. The server returns an interface reference of the type specified by the IID back to the SCM.
6. The SCM forwards the interface reference back to the client.
7. The client is bound to the object.
8. The SCM is no longer needed and therefore drops out of the picture.
9. The client invokes methods on the object.
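The steps above can be sketched as a toy model in portable C++. This is not the real COM API: the `Scm` class, the `dog_server::activate` entry point, the `Bark` method, and the use of strings for the CLSID and IID are all assumptions made for illustration. The point is only that the client and the SCM never see the concrete class; the server alone creates the object.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

// String stand-ins for the IDs in the IDL above; real COM passes binary GUIDs.
const std::string kClsidBeagle = "{3B46B8AB-CA17-11D1-920B-709024000000}";
const std::string kIidDog      = "{3B46B8A8-CA17-11D1-920B-709024000000}";

// Common base so that any interface reference can travel through the SCM.
struct Interface { virtual ~Interface() {} };

// The only type the client compiles against (Bark is a hypothetical method).
struct IDog : Interface { virtual std::string Bark() = 0; };

namespace dog_server {
// The concrete class is visible only inside the server.
class CBeagle : public IDog {
public:
    std::string Bark() override { return "Woof"; }
};

// Hypothetical well-known entry point (steps 3 through 5).
std::shared_ptr<Interface> activate(const std::string& clsid,
                                    const std::string& iid) {
    if (clsid != kClsidBeagle || iid != kIidDog) return nullptr;
    return std::make_shared<CBeagle>();
}
}  // namespace dog_server

using EntryPoint = std::function<std::shared_ptr<Interface>(
    const std::string&, const std::string&)>;

// The matchmaker: it knows where servers are, not what they contain.
class Scm {
public:
    void register_server(const std::string& clsid, EntryPoint entry) {
        servers_[clsid] = entry;
    }
    std::shared_ptr<Interface> create_instance(const std::string& clsid,
                                               const std::string& iid) {
        std::map<std::string, EntryPoint>::iterator it = servers_.find(clsid);
        if (it == servers_.end()) return nullptr;  // step 2 failed
        return it->second(clsid, iid);             // steps 3 through 6
    }
private:
    std::map<std::string, EntryPoint> servers_;
};

// What a call to New boils down to on the client side (steps 1, 7, and 9).
std::string client_calls_new(Scm& scm) {
    std::shared_ptr<IDog> dog = std::dynamic_pointer_cast<IDog>(
        scm.create_instance(kClsidBeagle, kIidDog));
    assert(dog);         // the client is now bound to the object
    return dog->Bark();  // the SCM plays no part in this call
}
```

Once `create_instance` returns, the client holds the interface reference directly, so later method calls go straight to the object.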

As you can see, the SCM is really just a matchmaker. Once it binds a client to an object, it's no longer needed. The client and the object can have a long and fruitful relationship. However, for this architecture to work properly, the SCM must have a predefined way of interacting with the server. Every COM server must therefore provide support for object activation by exposing a well-known entry point through which the SCM can make activation requests.

Class Factories

The rules for server-side activation support are defined in the COM Specification. COM uses a common software technique known as the factory pattern, in which the code that actually creates the object is contained in the same binary file as the class being instantiated. This eliminates the need for the client or the SCM to know about the class definition behind the object being created. The key advantage to this technique is that it allows class authors to revise their code without worrying about client dependencies such as an object's data layout.

When the SCM interacts with a server to activate an object, it must acquire a reference to a special type of object called a class factory, which is an agent that creates instances of a class associated with a specific CLSID. A COM server must provide a class factory object for each creatable coclass. When the SCM receives an activation request, it must acquire a reference to the appropriate class factory object. It does this in different ways depending on whether the server code is in an in-process DLL or an out-of-process EXE. Figure 3-4 shows how a single class factory object can be used to create many instances of a particular coclass.
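A minimal sketch of this arrangement follows. `IClassFactoryLike` is a simplified counterpart of COM's IClassFactory interface, and `get_class_object` loosely plays the role that the DllGetClassObject export plays for an in-process server. The signatures are deliberately reduced, and `scm_activate` and `Bark` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>

const std::string kClsidBeagle = "{3B46B8AB-CA17-11D1-920B-709024000000}";

struct IDog {
    virtual ~IDog() {}
    virtual std::string Bark() = 0;  // hypothetical method
};

// Simplified counterpart of COM's IClassFactory interface.
struct IClassFactoryLike {
    virtual ~IClassFactoryLike() {}
    virtual std::unique_ptr<IDog> CreateInstance() = 0;
};

namespace dog_server {
class CBeagle : public IDog {
public:
    std::string Bark() override { return "Woof"; }
};

// The creation code lives in the same binary as the class it creates.
class BeagleFactory : public IClassFactoryLike {
public:
    std::unique_ptr<IDog> CreateInstance() override {
        return std::unique_ptr<IDog>(new CBeagle);
    }
};

// Entry point for the SCM: one factory object per creatable coclass.
IClassFactoryLike* get_class_object(const std::string& clsid) {
    static BeagleFactory beagle_factory;
    return clsid == kClsidBeagle ? &beagle_factory : nullptr;
}
}  // namespace dog_server

// What the SCM does: obtain the factory, then ask it for an object.
std::unique_ptr<IDog> scm_activate(const std::string& clsid) {
    IClassFactoryLike* factory = dog_server::get_class_object(clsid);
    if (factory == nullptr) return nullptr;  // no such creatable coclass
    std::unique_ptr<IDog> object = factory->CreateInstance();
    assert(object != nullptr);  // this plain factory always creates an object
    return object;
}
```

Every call to `scm_activate` with the same CLSID reaches the same factory object but yields a distinct `CBeagle` instance.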

Every COM server, including those built with Visual Basic, must provide class factories for the SCM. When you build an ActiveX DLL or an ActiveX EXE, Visual Basic transparently creates a class factory for each public creatable class. Visual Basic creates class factories in a reasonable and boilerplate fashion. You can't influence how Visual Basic does this. You can't even see the class factories. You have to take it on faith that they are there. Visual Basic also automatically builds the required entry points for the SCM so that it can get at your class factories.

Many C++ programmers have written code for a class factory. Anyone who has done this manually will tell you that it is a tedious undertaking. COM frameworks for C++ programmers such as ATL and MFC provide the boilerplate code for creating these class factories. The Visual Basic team has used a similar technique to hide the code for dealing with class factories from its programmers. You should see this as a good thing. You'll never deal with a class factory in a Visual Basic application. However, this convenience does pose a few limitations.


Figure 3-4. The SCM must interact with a class factory object to create instances of a particular coclass. This design allows the code that is responsible for the creation of objects to remain in the same binary file as the coclass being instantiated.

C++ programmers who create class factories have more flexibility. A sophisticated class factory can service an activation request by locating an existing object instead of creating a new one. This can allow a DLL to implement an optimized form of object pooling. Visual Basic, on the other hand, doesn't give you any flexibility. Activation requests to your server always result in the creation of a new object.
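The difference can be illustrated with two hand-written factories. This is only a sketch of the idea of handing out an existing object; it is not a full pooling scheme, and `PlainFactory`, `SharingFactory`, and the use of `std::shared_ptr` in place of COM reference counting are assumptions made for brevity.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct IDog {
    virtual ~IDog() {}
    virtual std::string Bark() = 0;  // hypothetical method
};

class CBeagle : public IDog {
public:
    std::string Bark() override { return "Woof"; }
};

// Plain behavior, comparable to what Visual Basic generates:
// every activation request produces a new object.
class PlainFactory {
public:
    std::shared_ptr<IDog> CreateInstance() {
        return std::make_shared<CBeagle>();
    }
};

// A more elaborate factory: it services a request with an existing
// object whenever one is still alive.
class SharingFactory {
public:
    std::shared_ptr<IDog> CreateInstance() {
        std::shared_ptr<IDog> existing = cached_.lock();
        if (existing) return existing;  // reuse the live object
        std::shared_ptr<IDog> fresh = std::make_shared<CBeagle>();
        cached_ = fresh;                // remember it without owning it
        assert(!cached_.expired());
        return fresh;
    }
private:
    std::weak_ptr<IDog> cached_;
};
```

Two requests to `PlainFactory` return two different objects, while two overlapping requests to `SharingFactory` return the same one.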

C++ programmers can also work directly with class factories on the client side of an activation request. They have techniques available for doing things such as optimizing the creation of many objects at once. Unfortunately, no reasonable technique is available to Visual Basic programmers to use a class factory object in a client application. Advanced use of a class factory must be done in C++.
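One such client-side technique is to obtain the class factory once and then ask it for many objects, instead of making a separate trip through the SCM for each one. In real COM the functions involved are CoCreateInstance and CoGetClassObject; the sketch below replaces them with a toy `Scm` class that merely counts how often the client goes through it, so the names and signatures are illustrative assumptions.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct IDog {
    virtual ~IDog() {}
    virtual std::string Bark() = 0;  // hypothetical method
};

class CBeagle : public IDog {
public:
    std::string Bark() override { return "Woof"; }
};

class BeagleFactory {
public:
    std::unique_ptr<IDog> CreateInstance() {
        return std::unique_ptr<IDog>(new CBeagle);
    }
};

// Toy SCM that counts how many times a client goes through it.
class Scm {
public:
    // Rough counterpart of CoCreateInstance: one trip per object.
    std::unique_ptr<IDog> create_instance() {
        ++lookups_;
        return factory_.CreateInstance();
    }
    // Rough counterpart of CoGetClassObject: one trip, after which
    // the client talks to the factory directly.
    BeagleFactory& get_class_object() {
        ++lookups_;
        return factory_;
    }
    int lookups() const { return lookups_; }
private:
    BeagleFactory factory_;
    int lookups_ = 0;
};

std::vector<std::unique_ptr<IDog>> create_one_by_one(Scm& scm, int count) {
    assert(count >= 0);
    std::vector<std::unique_ptr<IDog>> dogs;
    for (int i = 0; i < count; ++i) dogs.push_back(scm.create_instance());
    return dogs;
}

std::vector<std::unique_ptr<IDog>> create_in_bulk(Scm& scm, int count) {
    assert(count >= 0);
    std::vector<std::unique_ptr<IDog>> dogs;
    BeagleFactory& factory = scm.get_class_object();  // single trip
    for (int i = 0; i < count; ++i) dogs.push_back(factory.CreateInstance());
    return dogs;
}
```

Creating five objects one by one costs five trips through the toy SCM, while the bulk version costs one regardless of how many objects are created.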

As it turns out, Visual Basic doesn't suffer much from its inability to work directly with class factories. In most cases, a client that calls New really wants a new object. Visual Basic does a great job of hiding the requirement of COM activation. What's more, many environments (such as MTS running under Windows NT 4) require a plain vanilla implementation of class factories such as the one provided by Visual Basic. C++ programmers who create fancy class factories will find that they can't run their servers in the MTS environment.

What Happens After Activation?

After the client is bound to an object, the SCM is no longer needed. At this point, the object must provide a certain base level of functionality to the client. In addition to implementing each method in every supported interface, an object must manage its own lifetime and allow clients to move back and forth between the various interfaces it supports.
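As a rough preview of those two responsibilities, the sketch below shows an object that counts its own references and lets a client move from one interface to another. The method names QueryInterface, AddRef, and Release are the ones COM uses, but the signatures are simplified, the interface IDs are plain strings, and `IWonderDog`, `FetchSlippers`, `live_objects`, and `create_beagle` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <string>

const std::string kIidDog       = "IDog";        // string stand-ins for IIDs
const std::string kIidWonderDog = "IWonderDog";

// Simplified counterpart of the base interface every COM object supports.
struct IUnknownLike {
    virtual ~IUnknownLike() {}
    virtual bool QueryInterface(const std::string& iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};

struct IDog : IUnknownLike { virtual std::string Bark() = 0; };
struct IWonderDog : IUnknownLike { virtual std::string FetchSlippers() = 0; };

class CBeagle final : public IDog, public IWonderDog {
public:
    static int live_objects;  // lets the example observe destruction

    CBeagle() : refs_(1) { ++live_objects; }  // creator holds one reference
    ~CBeagle() override { --live_objects; }

    // Lets a client move between the interfaces the object supports.
    bool QueryInterface(const std::string& iid, void** out) override {
        assert(out != nullptr);
        if (iid == kIidDog) {
            *out = static_cast<IDog*>(this);
        } else if (iid == kIidWonderDog) {
            *out = static_cast<IWonderDog*>(this);
        } else {
            *out = nullptr;
            return false;
        }
        AddRef();  // each reference handed out is counted
        return true;
    }
    unsigned long AddRef() override { return ++refs_; }

    // The object manages its own lifetime: it destroys itself when the
    // last reference is released.
    unsigned long Release() override {
        const unsigned long remaining = --refs_;
        if (remaining == 0) delete this;
        return remaining;
    }

    std::string Bark() override { return "Woof"; }
    std::string FetchSlippers() override { return "Slippers"; }

private:
    unsigned long refs_;
};

int CBeagle::live_objects = 0;

// Stand-in for activation: hands the client its first interface reference.
IDog* create_beagle() { return new CBeagle; }
```

A client that obtains an `IDog` reference can ask for `IWonderDog`, release the first reference, and keep using the second; the object goes away only when the last reference is released.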

The next chapter covers these issues and explains some of the other significant responsibilities of COM objects. You'll see how an object can service a less sophisticated group of clients through a mechanism known as automation. You'll also see what makes interprocess communication possible between a client and an object.