Taking COM out of Process | Programming Distributed Applications with COM and Microsoft Visual Basic 6.0 (Programming/Visual Basic)

So far, this book has described the interaction between client and object only in the context of a single process under a single thread of execution. When a client is bound to an in-process object, it can directly invoke methods through the use of function pointers that are stored in a vTable. The interaction is very efficient because the client code and the object code share one thread, one call stack, and one set of memory addresses. Unfortunately, when the object runs in another process, none of those resources can be shared.

The function pointers stored in vTables have no meaning across process boundaries. A client can't use a remote function pointer to access an object in another process. How then can COM remote a method call from the client's process to the object's process? COM makes remote communication possible with a pair of helper objects called the proxy and the stub.

Figure 4-4 shows how the proxy and the stub are deployed. The proxy runs in the client's process, while the stub runs in the object's process. The proxy and the stub establish a communication channel using remote procedure calls (RPCs) as the interprocess mechanism. The channel passes data back and forth during remote method execution. The act of serializing method parameters for transmission through the proxy/stub architecture is known as marshaling.

Figure 4-4. COM's remoting architecture requires that a proxy/stub layer be introduced between the client and the object. The proxy and the stub establish an RPC channel between them to communicate with each other.

When the client invokes a method on the proxy, the proxy forwards the request to the stub. To properly transmit this request, the proxy must marshal the method's inbound parameters to the stub. When the stub receives the request, it unmarshals the inbound parameters and locally performs the call on the object. After the object has completed the method, the stub prepares a response packet that includes outbound parameters and a return value. The data is then marshaled back to the proxy. The proxy unmarshals the data in the response packet and returns control back to the client.
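
The round-trip just described can be modeled in a few lines of Python. This is only a conceptual sketch, not COM code: the Calculator, Proxy, and Stub classes are hypothetical names, JSON stands in for COM's wire format, and a direct function call stands in for the RPC channel.

```python
import json


class Calculator:
    """The real object, living in the 'server' process (hypothetical class)."""

    def add(self, a, b):
        return a + b


class Stub:
    """Sits beside the object: unmarshals requests and makes the local call."""

    def __init__(self, target):
        self._target = target

    def handle(self, request_bytes):
        # Unmarshal the inbound parameters from the request packet.
        request = json.loads(request_bytes.decode("utf-8"))
        method = getattr(self._target, request["method"])
        # Perform the call locally on the real object.
        result = method(*request["args"])
        # Marshal the return value into a response packet.
        return json.dumps({"result": result}).encode("utf-8")


class Proxy:
    """Sits beside the client: looks like the object, forwards every call."""

    def __init__(self, channel):
        # Any callable that carries bytes to the stub; it stands in for RPC.
        self._channel = channel

    def add(self, a, b):
        # Marshal the inbound parameters into a request packet.
        request = json.dumps({"method": "add", "args": [a, b]}).encode("utf-8")
        response_bytes = self._channel(request)
        # Unmarshal the response and hand the result back to the client.
        return json.loads(response_bytes.decode("utf-8"))["result"]


stub = Stub(Calculator())
proxy = Proxy(stub.handle)
total = proxy.add(2, 3)  # the client calls the proxy as if it were the object
```

Only bytes cross the boundary between Proxy and Stub in this model, which is the essential constraint that marshaling exists to satisfy.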

The best part about this remoting architecture is that neither the client nor the object can tell that it is being remoted. The client thinks that the proxy is the object. The object thinks that the stub is the client. This allows COM programmers to write code for both clients and objects without regard to whether the objects will be activated from an in-process server or an out-of-process server. This powerful feature is known as location transparency.

There is a proxy/stub pair for each connected interface. This allows a client and an object to have two or more proxy/stub pairs connecting them at once. It makes sense that the proxy/stub pair is associated with the interface because the interface describes the methods that need to be remoted. With an interface definition stored in a type library, COM can determine the exact manner in which the data should be marshaled to the object and back. This is why IDL allows you to specify parameters such as [in], [out], and [in, out]. Unfortunately, Visual Basic doesn't support COM [out] parameters in its current release. Chapter 6 describes these parameter attributes in greater detail and shows you how to efficiently marshal your data among processes.
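
To see why the direction attributes matter, consider the following Python sketch. It is an illustration under assumed names (METHOD_PARAMS and both helper functions are hypothetical, and real type libraries are not Python lists): given the direction of each parameter, it works out which values must travel in the request and which must travel in the response.

```python
# Hypothetical description of one method's parameters and their directions.
METHOD_PARAMS = [
    ("customer_id", "in"),
    ("balance", "out"),
    ("notes", "in, out"),
]


def request_fields(params):
    """Names of the parameters whose values travel from proxy to stub."""
    return [name for name, direction in params if "in" in direction.split(", ")]


def response_fields(params):
    """Names of the parameters whose values travel from stub back to proxy."""
    return [name for name, direction in params if "out" in direction.split(", ")]


to_object = request_fields(METHOD_PARAMS)
to_client = response_fields(METHOD_PARAMS)
```

In this model only an [in, out] parameter is copied in both directions, which suggests why choosing the narrowest direction that works keeps the packets small.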

Responsibilities of the Proxy and the Stub

The proxy and the stub have their work cut out for them. They must work together to give both the client and the object the perception that they're running in a single process on a single thread. They create this illusion by constructing a call stack in the object's process that is identical to the one in the client's process. Any data sitting on the call stack in the client's process must be marshaled to the object's process. What's more, any pointers on the client's call stack require the proxy to marshal the data that the pointer refers to. The stub is responsible for unmarshaling all the data and setting up the call stack, which might include pointers to data that doesn't live on the stack.

As you can imagine, the code that accomplishes the marshaling behind a proxy/stub pair can become quite complicated. Luckily, COM provides a system service called the universal marshaler that automatically builds the proxy/stub code at run time. It does this by examining interface definitions in a type library. When an interface reference is exported from the object's process, the universal marshaler builds and loads a stub object. When the interface reference is imported into a client process, the universal marshaler creates a proxy and binds it to the client. The communication channel that is established between the proxy and the stub can thus remote method requests between the client and the object.
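
The idea of generating proxy/stub code from an interface description at run time can be sketched in Python. This is a loose analogy rather than a picture of the universal marshaler's internals: TYPE_LIBRARY, build_proxy, build_stub, and the Dog class are all hypothetical, and a plain function call stands in for the channel.

```python
# Hypothetical "type library": interface name -> list of method names.
TYPE_LIBRARY = {"IDog": ["bark", "roll_over"]}


class Dog:
    """The real object (hypothetical class)."""

    def bark(self):
        return "Woof"

    def roll_over(self, times):
        return "rolled " + str(times)


def build_stub(target, method_names):
    """Return a dispatcher that invokes only the interface's methods."""
    allowed = set(method_names)

    def dispatch(name, args):
        if name not in allowed:
            raise AttributeError(name)
        return getattr(target, name)(*args)

    return dispatch


def build_proxy(interface_name, method_names, send):
    """Create a proxy class at run time with one forwarding method per entry."""

    def make_method(name):
        def method(self, *args):
            return send(name, args)  # forward the call over the 'channel'

        return method

    namespace = {name: make_method(name) for name in method_names}
    return type(interface_name + "Proxy", (object,), namespace)()


stub = build_stub(Dog(), TYPE_LIBRARY["IDog"])
proxy = build_proxy("IDog", TYPE_LIBRARY["IDog"], stub)
```

Neither the proxy nor the stub was written by hand for IDog; both were derived from the interface description, which is the point of the analogy.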

Out-of-Process Considerations

You should note two important performance-related points about out-of-process COM. The first is that out-of-process method calls take much longer than in-process calls. Generally, you can expect an out-of-process call to take at least 1000 times longer than an in-process call with direct vTable binding. The proxy/stub layer always requires thread switching and marshaling, so it adds a significant amount of overhead.

The second key point is that objects you create with Visual Basic can be passed only by reference and never by value. Don't be fooled into thinking that you can simply pass a Visual Basic object from one machine to another. Your methods can define arguments that are object references but not actual objects. The current version of Visual Basic lets you put the ByVal keyword in front of object types in argument definitions, but these arguments are still interpreted with pass-by-reference semantics. When you have a reference to an out-of-process object, access to each method or property in the object requires an expensive round-trip.
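
The cost of property-by-property access can be made concrete by counting round-trips. The Python sketch below is an assumption-laden illustration: CountingChannel and Customer are hypothetical classes, and the counter merely stands in for the expense of each trip across the proxy/stub layer.

```python
class Customer:
    """A hypothetical out-of-process object with three properties."""

    def __init__(self):
        self._name, self._phone, self._city = "Ann", "555-0100", "Seattle"

    def get_name(self):
        return self._name

    def get_phone(self):
        return self._phone

    def get_city(self):
        return self._city

    def get_all(self):
        # One method that returns every field in a single response.
        return (self._name, self._phone, self._city)


class CountingChannel:
    """Stands in for the proxy/stub layer and counts each round-trip."""

    def __init__(self, target):
        self._target = target
        self.round_trips = 0

    def call(self, name, *args):
        self.round_trips += 1
        return getattr(self._target, name)(*args)


# Reading three properties one at a time costs three round-trips.
chatty = CountingChannel(Customer())
fields = (chatty.call("get_name"), chatty.call("get_phone"), chatty.call("get_city"))

# Fetching the same data with one method costs a single round-trip.
chunky = CountingChannel(Customer())
same_fields = chunky.call("get_all")
```

Both clients end up with the same data, but the first one pays for the proxy/stub layer three times.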

Out-of-process objects created with Visual Basic are always bound with proxies and stubs built by the universal marshaler. This technique for automatically binding a client to an out-of-process object from the information in a type library is known as standard marshaling. Many programmers using languages other than Visual Basic also prefer standard marshaling because it's easy to use and it's part of a service provided by COM's infrastructure.

C and C++ programmers can forgo standard marshaling in favor of custom marshaling. Those who are willing to write their own marshaling code can optimize the communication channel in ways that are impossible with standard marshaling. For instance, a programmer can implement pass-by-value semantics with custom marshaling code. The downside to custom marshaling is that it requires using C or C++ on the object side.

Out-of-Process Activation

It's time to revisit object activation and look at how it is accomplished with an out-of-process server as opposed to an in-process server. It's also important to understand how activation differs between a local out-of-process server and a remote out-of-process server.

The Service Control Manager (SCM) must acquire a reference to a class factory object in every activation request, but this occurs in quite a different way when the server runs in its own process. With an in-process server, the SCM connects to a class factory object through a well-known entry point exposed by the DLL. Because an out-of-process server can't expose an entry point the way a DLL can, COM must have another way for the SCM to acquire a class factory object reference.

When an out-of-process server is launched, it must register a class factory object for each of its creatable coclasses with the COM library. The SCM maintains a machinewide internal table called the class table, which holds the class factory object references for every registered CLSID. The SCM can scan through this table and retrieve a reference to any local class factory object that has been registered.

When the SCM receives an activation request for a CLSID that is implemented in a local server, it looks through the class table to determine whether the CLSID has already been registered. If it finds the CLSID in the class table, the server is up and running. If the CLSID hasn't been registered, the SCM launches the server process and waits for the server to register itself. After the server has registered its CLSID, the SCM can revisit the class table and acquire the needed reference to the class factory object. After the SCM connects to a class factory object, it asks the server to create an instance in a manner similar to the in-process scenario.
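
The activation sequence above can be summarized in a short Python model. It is a sketch under stated assumptions, not the SCM's actual implementation: the Scm and ClassFactory classes, the launcher callback, and the placeholder CLSID string are all hypothetical, and launching a server is reduced to calling a function that registers a class factory.

```python
class ClassFactory:
    """Creates instances of one class on request."""

    def __init__(self, cls):
        self._cls = cls

    def create_instance(self):
        return self._cls()


class Scm:
    """Toy model of the SCM's class table lookup (hypothetical class)."""

    def __init__(self, server_launchers):
        self._launchers = server_launchers  # CLSID -> callable that starts the server
        self.class_table = {}               # CLSID -> registered class factory
        self.launch_count = 0

    def register_class_factory(self, clsid, factory):
        # Called by a server as it starts up.
        self.class_table[clsid] = factory

    def activate(self, clsid):
        if clsid not in self.class_table:
            # Server isn't running yet: launch it and let it register itself.
            self.launch_count += 1
            self._launchers[clsid](self)
        # The class factory is now in the table; ask it for an instance.
        return self.class_table[clsid].create_instance()


class Dog:
    def bark(self):
        return "Woof"


CLSID_DOG = "{hypothetical-dog-clsid}"  # placeholder, not a real GUID


def launch_dog_server(scm):
    # On startup, the server registers a class factory for its coclass.
    scm.register_class_factory(CLSID_DOG, ClassFactory(Dog))


scm = Scm({CLSID_DOG: launch_dog_server})
first = scm.activate(CLSID_DOG)   # launches the server, then creates an object
second = scm.activate(CLSID_DOG)  # server already registered; no launch needed
```

The second activation finds the CLSID already in the class table, so the model launches the server only once.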

After the SCM creates the out-of-process object, it must bind the object to the client using a proxy/stub pair. When the object exports an interface reference to the SCM, the SCM calls on the universal marshaler to create the stub object. Then, when the SCM imports the interface reference into the client application, it calls on the universal marshaler to create a proxy object that binds the client and the object together.

Once you've come this far, the conceptual differences between activation in a local server and a remote server are not overly complicated. When the local SCM determines that the CLSID in an activation request lives on a different computer, it dials across the network and establishes a connection with a remote SCM. Interhost communication requires that the activation request be passed through an authentication/authorization layer (which is covered in Chapter 8). What's important to see here is that the remote SCM activates the object in an out-of-process server that is local to itself. The remote SCM goes through the same activation process described above.

The only real change is that the interface reference must be marshaled from one computer to another. The interface reference is exported from the remote server process in the same manner as for a local server. When the interface reference is unmarshaled into the client process, the proxy is populated with enough information to get back to a specific stub for a specific object on a specific host. Once again, none of these details are the concern of either the client or the object.

Note that this binding process requires that the type library holding the interface definition be installed on both machines. It's common practice to produce a stand-alone type library for installation on client machines that must build proxies to remote objects.

The Value of Location Transparency

This process of binding a remote object sounds complicated, but the SCM takes care of it. A client doesn't have to concern itself with the details of in-process activation versus out-of-process activation. The client requests a specific CLSID and is then bound to the object (or something that feels like the object). After the binding takes place, the client goes about its business by invoking methods and accessing properties. The client perceives that the object is close by, but that doesn't have to be the case.

In the out-of-process scenario, the object also perceives that the client is in the same process. This means that the details of in-process versus out-of-process activation are hidden from object code as well as the client. The ability of programmers to write client code and object code for an in-process relationship and have the code work automatically across process boundaries is one of the most powerful features of COM. Figure 4-5 shows three different ways to deploy a server without changing any code for the client or the object.

COM's ability to seamlessly remote objects is known as location transparency. It eliminates the need for programmers to be concerned with the grungy details of interprocess communication. It also means that objects can be redeployed around the network with little impact on code. You can redirect a client that is programmed to activate a certain CLSID from an in-process DLL so that it activates a remote object by making just a few minor modifications to the registry. You don't have to rewrite a single line of code.
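
A small Python model can illustrate how configuration, rather than code, decides where an object lives. The value names InprocServer32, LocalServer32, and RemoteServerName are real registry names, but everything else here is an assumption made for illustration: the dictionary layout, the lookup order, the file and machine names, and the resolve_server and client_code functions are all hypothetical simplifications.

```python
CLSID_DOG = "{hypothetical-dog-clsid}"  # placeholder, not a real GUID


def resolve_server(registry, clsid):
    """Decide where a CLSID should be activated (simplified lookup order)."""
    entry = registry[clsid]
    if "RemoteServerName" in entry:
        return ("remote", entry["RemoteServerName"])
    if "LocalServer32" in entry:
        return ("local", entry["LocalServer32"])
    return ("in-process", entry["InprocServer32"])


def client_code(registry):
    # The client names only the CLSID; it never mentions a location.
    return resolve_server(registry, CLSID_DOG)


# Initial deployment: the class is served from an in-process DLL.
registry = {CLSID_DOG: {"InprocServer32": "dogserver.dll"}}
before = client_code(registry)

# Redeployment: only the configuration changes, not the client code.
registry[CLSID_DOG] = {"RemoteServerName": "server01"}
after = client_code(registry)
```

The client_code function is identical in both cases; only the registry entry it consults has changed.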

You can take code that you have written for a class in a COM DLL and use it in a COM EXE without making any modifications. However, the mere fact that your code compiles and allows you to serve up objects doesn't mean that it's efficient. Code written for in-process objects might not scale when it is deployed in an out-of-process object. Many coding techniques that work well in process can lead to unacceptable performance when the proxy/stub layer is introduced. Chapter 6 explains the importance of designing interfaces that work efficiently across the proxy/stub layer.

Figure 4-5. Location transparency eliminates the need for COM programmers to be concerned with the details of interprocess communication.