RPC and COM | Programming Distributed Applications with COM and Microsoft Visual Basic 6.0

From the early days of computer networks, some programmers were determined to make code on one computer cause the execution of code on another. At the time, interhost communication was extremely demanding, and only hardcore systems-level programmers were up to the task. Executing a function on a separate computer typically required code that passed messages between the two machines. This code was almost always written with a specific network protocol and a specific set of hardware in mind.

In the 1980s, it was generally acceptable to write remote networking code in this fashion. However, porting a distributed application to a different protocol or a different hardware platform was expensive. As distributed programming became more popular, people started looking for ways to decouple application code from the code that remoted execution across computer boundaries. Since companies wanted to shed dependencies on both hardware platforms and network protocols, the need for a more generic solution was great.

A standards group called the Open Software Foundation (OSF) set out to create a specification to solve this problem. The group's goal was to eliminate the need to hardcode platform and network protocol dependencies into distributed application code. The fruit of the group's labors was a specification for RPC as part of the Distributed Computing Environment (DCE). The RPC specification gives programmers a way to write application code that's free of these costly dependencies. The most notable advantage of this specification is that code written for distributed applications can be used on a variety of platforms and protocols with little or no change.

The RPC specification requires programmers to define remote calls by writing a description in an RPC-specific version of Interface Definition Language (IDL). A set of calls is defined in RPC IDL inside an RPC interface. An RPC interface is simply a group of global functions that defines the calling syntax of each remote procedure call. Each parameter in every procedure is defined with a specific data type and a direction. (This talk of interfaces and IDL should sound familiar to you at this point.)

An RPC client application must establish a connection with the server at run time by negotiating a binding protocol with a server process. Once binding has taken place, the two communicate through the RPC channel. Every procedure call involves moving data back and forth across the channel. An RPC request transmits a procedure call and its inbound parameters from the client application to the server. After the server executes the procedure, the RPC response transmits the procedure's return value and any outbound parameters back to the client application. When a client application makes a synchronous RPC call, it must wait for the response to come back from across the network before it can do anything else.
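The request/response exchange described above can be sketched in a few lines of Python. This is only an illustration and not the DCE RPC wire format: the `Divide` procedure, the `Param` class, the JSON encoding, and the in-process `server_dispatch` call that stands in for the network are all assumptions made for the example. What it shows is that the request carries the procedure name and the inbound parameters, the response carries the return value and the outbound parameters, and the client does nothing else until the response arrives.

```python
import json
from dataclasses import dataclass


@dataclass
class Param:
    name: str
    direction: str  # "in" or "out", as declared in the interface definition


# Hypothetical interface description: procedure name -> parameter list.
INTERFACE = {
    "Divide": [Param("dividend", "in"), Param("divisor", "in"), Param("remainder", "out")],
}


def server_divide(dividend, divisor):
    """Server-side procedure: returns (return value, outbound parameters)."""
    return dividend // divisor, {"remainder": dividend % divisor}


PROCEDURES = {"Divide": server_divide}


def server_dispatch(request_bytes):
    """Unpack a request, run the procedure, and pack the response."""
    request = json.loads(request_bytes)
    result, out_params = PROCEDURES[request["proc"]](**request["in"])
    return json.dumps({"return": result, "out": out_params}).encode()


def call(proc, **args):
    """Client side: send only the inbound parameters, then block for the response."""
    inbound = {p.name: args[p.name] for p in INTERFACE[proc] if p.direction == "in"}
    request_bytes = json.dumps({"proc": proc, "in": inbound}).encode()
    response_bytes = server_dispatch(request_bytes)  # stands in for the network round trip
    response = json.loads(response_bytes)
    return response["return"], response["out"]


quotient, outs = call("Divide", dividend=17, divisor=5)
print(quotient, outs["remainder"])  # prints "3 2"
```

Because `call` returns only after `server_dispatch` has finished, it behaves like a synchronous RPC call: the caller is blocked for the full round trip.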

MS-RPC

As OSF DCE RPC became an industry standard for distributed computing, Microsoft created its own version for Windows platforms. Microsoft's implementation of the OSF specification is called MS-RPC. MS-RPC has been integrated into all 32-bit Windows platforms and promises to provide Windows connectivity to many non-Windows platforms.

Developers use MS-RPC by first creating an IDL source file and feeding it to the Microsoft IDL (MIDL) compiler. The MIDL compiler generates C source code files that have the necessary RPC code stubs for both the client and the server. This code is compiled into client applications and server applications. As a result, the client application can call the functions, which are automatically remoted to the server application. Once again, the big win here is that applications don't require networking code with platform or protocol dependencies.

RPC has one significant problem: It doesn't offer the elegance of object-oriented programming. It's largely based on a procedural paradigm in which clients and servers communicate in terms of global functions. While the RPC specification does provide a few object-oriented extensions, it is fair to say that few of these extensions have made it into mainstream use. RPC needs an object-oriented wrapper in order to achieve the higher levels of productivity that application programmers are accustomed to. What's more, MS-RPC requires low-level programming, so it's difficult to access from higher-level tools such as Visual Basic.

When Microsoft engineers were deciding how to make COM interprocess-capable, they saw that RPC had much to offer. RPC was already ubiquitous in the industry, and MS-RPC had been integrated into every significant Windows platform. The engineers knew that RPC could be valuable to COM because it enables a client application to connect to a server across the network and execute a procedure call. COM desperately needed this functionality to participate in distributed programming. COM also had much to offer RPC. A mapping of COM interfaces gives RPC an object-oriented wrapper and makes writing interprocess code much easier. Figure 8-1 shows how the connection is set up.

Figure 8-1. The proxy and the stub are generated with an RPC-IDL compiler. They communicate with one another across an RPC channel.

Microsoft's version of IDL includes COM extensions that let you map a COM interface to an RPC interface. The MIDL compiler can build proxy/stub code by examining the definition of a COM interface. As you saw earlier, the proxy and the stub can communicate by establishing an RPC channel. The RPC layer allows a client application to establish a connection with an out-of-process COM object. It also supplies the threading support on which COM builds its model of apartments. It turns out that the concurrency model supplied by RPC is very reasonable. As you saw in Chapter 7, the RPC layer is important to both single-threaded apartments and multithreaded apartments. The threading support provided by RPC allows a server process to service multiple calls concurrently.

The RPC layer is transparent to Visual Basic programmers. This means that when you create an out-of-process COM server with Visual Basic, the COM interface-to-RPC mapping is taken care of behind the scenes. The universal marshaler contains its own IDL compiler. It reads an interface definition from a type library and generates the required proxy/stub code at run time. This allows RPC and COM to work together to provide locality independence. In this respect, you can say that interfaces are the key to seamless distribution in COM. You saw this stated earlier in this book, but I reiterate it here because it's so significant.
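To make the idea of run-time proxy generation more concrete, here is a rough Python analogy. It is not how the universal marshaler is implemented. The `Calculator` class, the `Stub` class, the `make_proxy` function, and the `ICALCULATOR` list (which stands in for an interface definition read from a type library) are all hypothetical names invented for this sketch. The point is that a proxy can be built from nothing more than an interface description, and that the client calls it as if it were the real object.

```python
class Calculator:
    """The real object, living on the 'server' side."""

    def add(self, a, b):
        return a + b


class Stub:
    """Receives (method name, arguments) messages and calls the real object."""

    def __init__(self, target):
        self._target = target

    def dispatch(self, method, args):
        return getattr(self._target, method)(*args)


def make_proxy(interface_methods, stub):
    """Build a proxy object at run time from a list of method names."""

    def make_forwarder(method):
        def forwarder(self, *args):
            # A real channel would marshal this call across a process
            # or machine boundary; here it is a direct function call.
            return stub.dispatch(method, args)

        return forwarder

    members = {name: make_forwarder(name) for name in interface_methods}
    return type("Proxy", (object,), members)()


# Hypothetical interface description, standing in for a type library entry.
ICALCULATOR = ["add"]

proxy = make_proxy(ICALCULATOR, Stub(Calculator()))
print(proxy.add(2, 3))  # prints 5
```

The client code never names `Calculator` or `Stub`. It sees only the methods listed in the interface, which is the sense in which interfaces make the location of the object irrelevant to the caller.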

Activating Across the Network

Let's take a rest from all this low-level talk about network protocols and the MIDL compiler for a moment and talk about why the interprocess capabilities provided by RPC are so important. They're important because they let you deploy a set of business objects in the middle tier of a distributed application, as shown in Figure 8-2.

Figure 8-2. Client applications rely on Distributed COM to communicate with business objects running in the middle tier.

In an N-tier architecture, a client application running in the presentation tier activates and uses objects distributed throughout the network. Distributed COM must provide an infrastructure in which two machines can coordinate to load a remote object and bind it back to the client application. Distributed COM must also provide a security layer so that client applications can use objects only after they have been authenticated and authorized.

Let's examine what happens during remote activation. Recall that in local activation, when an out-of-process object is activated, the SCM finds the running server process that contains the object, launching it first if necessary. Once it finds the server, it goes through the loading and binding sequence described in Chapter 4. Remote activation is similar to local activation except that it requires two sessions of the SCM running on two different computers. When the SCM running on the client machine determines that the CLSID implementation lives on a separate host, it sends an activation request across the network to the SCM on the server's computer. The client-side SCM passes the requested CLSID and IID as in any other activation request.

The SCM on the server machine can activate the object just as it does when it handles a local activation request. That means that the server-side SCM scans through its Registry looking for a local COM server that serves up objects of the requested CLSID. The SCM launches the server process if this is required. Once the server-side SCM loads the local object, it coordinates with the client-side SCM to move an interface reference across the network. When the interface reference reaches the client machine, the client-side SCM binds it to the client application, as shown in Figure 8-3. After the object is bound to the client application, no additional support is required from the SCM on either machine. The client application can invoke remote calls on the object all by itself.

Figure 8-3. Remote activation is coordinated between two separate sessions of the SCM. If the client application has been authenticated and authorized, the server-side SCM performs a local activation and returns an interface reference.
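The activation sequence can be modeled with two small Python classes. This is a simplified sketch under several assumptions: `ClientSCM`, `ServerSCM`, `Customer`, the string identifiers, and the host name `host-b` are hypothetical, the "Registry" is a plain dictionary, the object itself stands in for the marshaled interface reference, and authentication and authorization are left out entirely.

```python
class Customer:
    """A hypothetical business object served from the middle tier."""

    interfaces = {"ICustomer"}

    def name(self):
        return "Acme"


class ServerSCM:
    """Server-side SCM: looks up the CLSID locally and activates the object."""

    def __init__(self, registry):
        self.registry = registry  # CLSID -> callable that creates an object
        self.launched = set()     # CLSIDs whose server "process" is running

    def activate(self, clsid, iid):
        factory = self.registry.get(clsid)
        if factory is None:
            raise LookupError(f"class {clsid} is not registered")
        self.launched.add(clsid)  # launch the server if it is not already running
        obj = factory()
        if iid not in obj.interfaces:
            raise LookupError(f"{clsid} does not support {iid}")
        return obj  # stands in for the interface reference sent back to the client


class ClientSCM:
    """Client-side SCM: forwards the activation request to the right host."""

    def __init__(self, remote_hosts, server_scms):
        self.remote_hosts = remote_hosts  # CLSID -> host name
        self.server_scms = server_scms    # host name -> ServerSCM

    def activate(self, clsid, iid):
        host = self.remote_hosts[clsid]   # the implementation lives on another host
        return self.server_scms[host].activate(clsid, iid)


server = ServerSCM({"CLSID_Customer": Customer})
client = ClientSCM({"CLSID_Customer": "host-b"}, {"host-b": server})

customer = client.activate("CLSID_Customer", "ICustomer")
print(customer.name())  # prints "Acme"; this call no longer involves either SCM
```

Notice that both SCM objects are used only inside `activate`. Once the reference has been returned, calls such as `customer.name()` go straight to the object, which mirrors the way the SCM drops out of the picture after binding.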

What happens if a client application crashes while holding outstanding object references? The object must have a way to determine that its client application has expired. You saw in Chapter 5 that the client can discover that an object has died by inspecting an HRESULT. However, an object needs a little more assistance to determine whether the client has passed away.

COM provides the infrastructure for distributed garbage collection. The system can determine that a client with an outstanding object reference has died. When the system discovers this, it informs the object by calling Release. This means that distributed objects can be released from memory when their clients crash. Garbage collection is especially important in a distributed application that will run for months at a time.

COM's mechanism for distributed garbage collection is based on the client machine pinging the server machine with a notification that says, "I'm still alive." The client pings the server every two minutes. If the server doesn't hear from the client for six minutes (that's three missed pings), the server informs the object that the client has died.
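The timing rule is simple arithmetic: a two-minute ping period multiplied by three missed pings gives a six-minute timeout. The Python sketch below applies that rule with a simulated clock. The `PingMonitor` class and the client names are hypothetical, and times are passed in as plain numbers of seconds so that the example doesn't have to wait six real minutes.

```python
PING_PERIOD = 120          # seconds between pings (two minutes)
MISSED_PINGS_ALLOWED = 3   # missed pings before the client is presumed dead
TIMEOUT = PING_PERIOD * MISSED_PINGS_ALLOWED  # 360 seconds, or six minutes


class PingMonitor:
    """Server-side bookkeeping: remembers when each client last pinged."""

    def __init__(self):
        self.last_ping = {}  # client id -> time of the last ping, in seconds

    def ping(self, client, now):
        self.last_ping[client] = now

    def expired_clients(self, now):
        """Return the clients that have been silent for the full timeout."""
        dead = [c for c, t in self.last_ping.items() if now - t >= TIMEOUT]
        for c in dead:
            del self.last_ping[c]  # the system would now release their references
        return dead


monitor = PingMonitor()
monitor.ping("client-a", now=0)
monitor.ping("client-b", now=0)

# client-a keeps pinging every two minutes; client-b has crashed.
for t in (120, 240, 360):
    monitor.ping("client-a", now=t)

dead = monitor.expired_clients(now=360)
print(TIMEOUT, dead)  # prints "360 ['client-b']"
```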

The ping algorithm has been optimized to avoid excessive network traffic. Pings aren't sent for individual interface references or for individual objects. Instead, the system transmits a single machinewide ping from the client to the server with information about every connection between the two machines. Note that each ping doesn't transmit all the information about every outstanding interface reference. Instead, it transmits only the information about what has changed since the last ping. This "delta" algorithm significantly reduces what needs to be sent across the network.
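Here is one way to picture the delta idea in Python. It is a sketch of the general technique and makes no claim about the actual layout of a COM ping packet. The `ClientPinger` and `ServerPingSet` classes and the reference names are hypothetical. Each ping lists only the references added and removed since the previous ping, and the server applies those changes to the set it already holds.

```python
class ClientPinger:
    """Client-machine bookkeeping for references held on one server machine."""

    def __init__(self):
        self.current = set()   # references held right now
        self.reported = set()  # references the server already knows about

    def add_ref(self, ref):
        self.current.add(ref)

    def release(self, ref):
        self.current.discard(ref)

    def build_ping(self):
        """Return only what changed since the last ping."""
        added = self.current - self.reported
        removed = self.reported - self.current
        self.reported = set(self.current)
        return {"add": sorted(added), "remove": sorted(removed)}


class ServerPingSet:
    """Server-machine view of what one client machine is holding."""

    def __init__(self):
        self.refs = set()

    def apply(self, ping):
        self.refs |= set(ping["add"])
        self.refs -= set(ping["remove"])


client_pinger = ClientPinger()
server_set = ServerPingSet()

client_pinger.add_ref("obj1")
client_pinger.add_ref("obj2")
first = client_pinger.build_ping()   # both references are new
server_set.apply(first)

second = client_pinger.build_ping()  # nothing changed, so the ping is empty
server_set.apply(second)

client_pinger.release("obj1")
client_pinger.add_ref("obj3")
third = client_pinger.build_ping()   # one addition and one removal
server_set.apply(third)

print(first, second, third, sorted(server_set.refs))
```

The second ping carries no reference information at all, yet it still tells the server that the client machine is alive. That is where the savings come from when a client holds many references that rarely change.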