The basic idea behind message queues is simple. A queue is a named, ordered repository of messages. Figure 11-1 shows how a queue is typically used in a distributed application. A message is a request or some other type of notification that is sent from one application to another in the same system.
A sender application creates and prepares a message by setting various properties in the message header and packing parameterized information into a payload, which is called the message body. After preparing the message, the sender application writes it to the queue. A receiver application reads the message and removes it from the queue so that it's processed only once. The receiver application then interprets the request, unpacks the parameters from the body, and carries out whatever processing is required.
Message queues also support the concept of a reader application. This type of application can examine a message in a queue without removing it. This is known as peeking. Peeking allows an application to be selective about the messages it removes from a queue.
Any application can assume more than one of these roles. It's common for an application to act as both a reader and a receiver on a single queue. The application can look at what's in the queue and remove only those messages that meet certain criteria. It is also common for one application to receive messages from one queue while sending messages to a second queue.
Figure 11-1. A sender application creates and prepares a message and then writes it to the queue. A receiver application removes the message from the queue, interprets the message's request, and executes whatever processing the message requires.
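The sender, receiver, and reader roles described above can be sketched with a simple in-memory queue. This is only a conceptual stand-in, not the MSMQ API: a real queue would be named, persistent, and shared across machines, and the message structure shown here (a dictionary with `label` and `body` fields) is an assumption for illustration.

```python
from collections import deque

# In-memory stand-in for a named message queue.
queue = deque()

# Sender role: prepare messages (header properties plus a body) and
# write them to the queue.
queue.append({"label": "OrderRequest", "body": "SKU-1001, qty 2"})
queue.append({"label": "AuditNotice", "body": "login event"})

# Reader role: peek at the head of the queue without removing anything.
head = queue[0]
print(head["label"])          # the message remains in the queue

# Receiver role: remove only the messages that meet certain criteria,
# so each is processed exactly once.
orders = [m for m in queue if m["label"] == "OrderRequest"]
for m in orders:
    queue.remove(m)

print(len(queue))             # the audit notice is still queued
```

Note that the same piece of code plays both the reader role (peeking) and the receiver role (selective removal), which mirrors the point above that one application commonly assumes more than one role on a single queue.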
A message queue is a valuable building block in a distributed system because it allows applications to submit requests and send information to other applications in a connectionless, asynchronous manner. In some ways, message passing is like RPC, and in other ways it's very different. As you've seen throughout this book, COM uses RPC to issue interprocess requests between clients and objects. Let's compare using messages to using RPC for communicating between applications in a distributed system.
Every RPC actually requires two distinct messages. A request message that carries inbound parameters is sent from the client to the object. After executing the method implementation, the object sends a response message to the client that carries the return value and output parameters. One of the best things about Distributed COM and RPC is that they hide the complexities of passing these messages between the client and the object. You simply create an object and invoke a method. You don't have to think about sending and receiving messages. The proxy and the stub work together with the RPC layer to marshal the data back and forth and control the flow of execution. COM and RPC make sending the request, executing the method implementation, and receiving the response seem like a single operation to the caller.
While COM and RPC simplify interprocess communication, they have a few notable limitations because RPC bundles the request message, the method execution, and the response message into one indivisible operation. Queues let you overcome some of the shortcomings of RPC, but programming with queues requires more work because you have to explicitly send and receive messages.
Let's look at five common problems with RPC that you can solve by using message queues. The first problem is that the client's thread of control is blocked while the object executes a method call. In other words, method calls based on COM and RPC are synchronous. If an object takes considerable time to process a method call, the client's thread is held hostage by the underlying RPC layer until it receives the object's response. If you use a message queue, a client can post an asynchronous request to the queue. The client doesn't have to wait for the server's response; it can continue its work immediately after submitting the request. Furthermore, the server application that receives the client's message can process the request and send an asynchronous message to a response queue being monitored by the client. While this style of asynchronous programming adds complexity to the interaction between a client application and a server, it can increase efficiency.
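The asynchronous request/response pattern just described might look like the following sketch. The request and response queues here are in-process `queue.Queue` objects standing in for named system queues, and the doubling "work" the server performs is a placeholder for real processing.

```python
import queue
import threading

# Stand-ins for a request queue and a client-monitored response queue.
request_q = queue.Queue()
response_q = queue.Queue()

def server():
    # Server: receive a request, process it, and post an
    # asynchronous message to the response queue.
    msg = request_q.get()
    result = msg["amount"] * 2          # placeholder for real processing
    response_q.put({"id": msg["id"], "result": result})

threading.Thread(target=server, daemon=True).start()

# Client: post the request and continue working immediately --
# its thread of control is never blocked by the server.
request_q.put({"id": 42, "amount": 10})
other_work = sum(range(5))              # client keeps doing useful work

# Later, the client collects the response from the queue it monitors.
reply = response_q.get(timeout=5)
print(reply["result"])
```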
A second problem with RPC is that it requires an established connection between the client and the server. Both the client application and the server must be on line at the same time for the application as a whole to be operational. For example, if the server is off line, the client can't submit a request. Likewise, if the client is off line, the server can't process any requests. In essence, neither side can get any work done unless the other side is up and running. This poses an unacceptable constraint for many distributed applications. Think about what this means to a large N-tier information system. A middle-tier application or a database management system (DBMS) might go off line due to a system crash or scheduled maintenance. If the system is based on RPC, clients must wait until all the servers come back on line before they can resume making requests.
A queue can solve this problem because it acts as a buffer between the client application and the server. The client application can continue to send request messages to a queue regardless of whether the server is on line. When the server comes back on line, it can resume responding to the requests that have accumulated. A server can also continue its work after client applications have gone off line. The queue acts as a buffering mechanism that allows either side to accomplish its work in the absence of the other.
A third problem with RPC is that a client application must make a connection to a specific server. RPC has no built-in capacity to distribute the load of processing client requests across a group of servers. If you need load balancing in an RPC-based system, you must typically write code that directs some users to one server and other users to another server. However, most load balancing schemes used in RPC-style applications are vulnerable to problems because they partition load by user rather than by request: the users connected to one server might submit many requests while the users connected to a second server submit none. One server becomes overloaded while the other sits idle. A queue can provide a much better solution.
Figure 11-2 shows a queue-based approach to load balancing. If every client application sends requests to a single queue, a group of servers can work together to process these messages. The queue acts as a central clearinghouse for every request in the application. One server will never be overloaded while another server sits idle.
Figure 11-2. A queue provides an easy way to balance the processing load across a group of servers. This style of load balancing is less complex and more efficient than the algorithms used in most RPC-based systems.
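The clearinghouse arrangement in Figure 11-2 can be sketched with a pool of server threads draining one shared queue. Whichever server is free takes the next message, so no server is overloaded while another sits idle. The squaring "work" and the `None` shutdown sentinel are assumptions for this sketch.

```python
import queue
import threading

# One shared request queue feeding a group of servers.
requests = queue.Queue()
results = queue.Queue()

def server(server_id):
    while True:
        msg = requests.get()
        if msg is None:                 # sentinel: shut this server down
            break
        results.put(msg * msg)          # placeholder for real processing

# Two servers work together to process messages from the same queue.
workers = [threading.Thread(target=server, args=(i,)) for i in range(2)]
for w in workers:
    w.start()

# Every client request goes to the single queue.
for n in range(6):
    requests.put(n)
for _ in workers:
    requests.put(None)
for w in workers:
    w.join()

processed = sorted(results.get() for _ in range(6))
print(processed)
```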
A fourth problem with RPC is that all requests are processed on a first come, first served basis. There is no way to prioritize calls. A high-priority request must wait its turn if low-priority requests were submitted ahead of it. A queue can solve this problem by assigning priority levels to messages. Messages with a higher priority level are placed at the head of the queue, while lower-priority messages are placed at the tail. The server can thus respond to the most important messages first.
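Priority-based ordering can be sketched with Python's `queue.PriorityQueue`. One caveat: `PriorityQueue` returns the *smallest* item first, so the sketch negates the priority level to make higher-priority messages come out at the head. The priority levels and message labels are illustrative.

```python
import queue

# Messages with a higher priority level should reach the head of the
# queue. PriorityQueue pops the smallest tuple first, so negate the
# priority level to reverse the order.
q = queue.PriorityQueue()

q.put((-1, "low: nightly report"))      # priority level 1
q.put((-7, "high: fraud alert"))        # priority level 7
q.put((-4, "normal: order update"))     # priority level 4

# The server responds to the most important message first.
first = q.get()[1]
print(first)
```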
The fifth (and arguably the most significant) problem with RPC is that it is vulnerable to failures that lead to inconsistent results in OLTP applications. Let's look at what can go wrong when a base client invokes an RPC-based method call to run an MTS transaction. There are three possible cases for failure. First, the method call's request message might fail between the client and the server. Second, the MTS application might crash while the method call is executing a transaction. In both of these scenarios, the intended changes of the transaction are not committed. The third failure scenario is that the method call's response message to the client might fail or be lost after the method call has successfully run and committed the transaction.
So here's the trouble. What if a client submits a transaction through RPC but doesn't get a successful response? The client application has no way of knowing whether the transaction has been committed. If the client submits the same request a second time, the transaction might be committed a second time. As you can see, this creates a problem that RPC can't solve by itself.
Transactional message queues provide a way to submit a transaction request with exactly-once delivery semantics. You can run a transaction with exactly-once semantics by breaking it down into three distinct phases in which messages are sent and received from transactional queues. Figure 11-3 shows these three phases.
Figure 11-3. Running a transaction with exactly-once semantics involves using a transactional queue in three distinct phases.
First, the client submits a message to a queue known as the request queue. In the second phase, the server carries out three steps within a single high-level transaction. It receives the message from the request queue, processes the transaction requested by the client, and writes a message to a response queue.
In this second phase, the server must successfully accomplish all three steps, or the high-level transaction will be rolled back. If the high-level transaction in phase two is rolled back, the client's message is returned to the request queue. This means that if phase two fails, the application that's running transactions can start phase two from scratch by receiving the message from the request queue a second time. When phase two completes successfully, it means the requested transaction has been committed. It also means that you'll find a corresponding message in the response queue. In the third phase, the client application receives a message from the response queue indicating that the transaction has been successfully committed.
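The three phases above can be sketched as follows. Plain deques stand in for transactional queues, and a dictionary stands in for the database; the crucial point is that a real transactional queue makes the receive/process/send steps of phase two a single atomic unit, which this sketch simulates with a snapshot-and-restore rollback.

```python
from collections import deque

request_q = deque()
response_q = deque()
balances = {"acct": 100}                # stand-in for a database

# Phase 1: the client submits a message to the request queue.
request_q.append({"id": 7, "acct": "acct", "debit": 30})

def phase_two(fail=False):
    # Phase 2: receive the request, run the transaction, and send a
    # response. All three steps commit or roll back together.
    msg = request_q.popleft()
    snapshot = balances[msg["acct"]]
    try:
        balances[msg["acct"]] -= msg["debit"]
        if fail:
            raise RuntimeError("server crashed mid-transaction")
        response_q.append({"id": msg["id"], "status": "committed"})
    except RuntimeError:
        balances[msg["acct"]] = snapshot    # undo the partial work
        request_q.appendleft(msg)           # message returns to the queue

phase_two(fail=True)        # a failure leaves the request queued, unchanged
phase_two()                 # retrying phase two from scratch succeeds

# Phase 3: the client receives confirmation from the response queue.
reply = response_q.popleft()
print(reply["status"], balances["acct"])
```

Because a failed phase two returns the message to the request queue intact, the server can safely retry without the client ever resubmitting, which is what gives the scheme its exactly-once semantics.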
This example doesn't address all the complexities you'll face in a transactional application, but it reveals the essence of why message passing is superior to RPC. Queues give the client the ability to determine which of three possible states a request is in. A request can be waiting to be processed, currently being processed, or finished being processed. If the client sees that the original message is still in the request queue, the request hasn't been processed. If the client finds a corresponding message in the response queue, it knows the request has been processed. If there isn't a message in either queue, the client knows the request is currently being processed. RPC isn't as good as message passing because it doesn't allow the client to determine the exact status of a request.
Now that you know how queues can improve your distributed applications, let's put this knowledge to work using MSMQ. As you program with MSMQ, you'll see that it requires more effort than writing COM-based method calls. When you use Distributed COM and RPC, you basically get all of your interprocess communication for free. When you use MSMQ, you have to invest more time in both application design and programming. However, when you consider the limitations of RPC, MSMQ is well worth the time and energy you invest.