Key Considerations for Distributed Components | Microsoft® .NET Distributed Applications: Integrating XML Web Services and .NET Remoting (Pro-Developer)

One of the most harmful myths in software development is that server-side components should act the same way as local objects. Experienced developers realize that a distributed application design must begin and end with practical consideration of the existing technologies. Otherwise, it's all too easy to create an elegant, carefully planned architecture that is fundamentally unworkable. Usually these flawed designs are partly implemented before their shortcomings are discovered, and they quickly reach a bottleneck that can't be overcome.

One common mistake is to look at distributed components as full partners in object-oriented design. Unfortunately, distributed programming represents a compromise between networking technology and object-oriented practices, not a union. Some of these limitations can be overcome with ingenious workarounds (and many articles demonstrate these sorts of coding hacks), but performance almost always suffers.

The next two sections consider two fundamental differences between local code and server code and the design implications of these differences.

Cross-Process and Cross-Computer Communication

One key difference between local and remote code is that remote object communication takes a nontrivial amount of time. This is true whether you are calling a time-consuming method or just setting a simple property. Interacting with an object in another application domain can be hundreds of times slower than interacting with an in-process component. Interacting with a component on another computer or a remote Web server can easily be another order of magnitude slower, even over the fastest possible network connections.

From the developer's point of view, calling a method in a remote object is just as easy as invoking a local method. However, you should always keep in mind that the code you write is just an abstraction over a complex infrastructure that creates a network connection, dispatches a request, and waits for a response. That means if you design your remote object in a traditional style, with numerous properties that need to be set in conjunction, you add extra latency, increase network traffic, and slow down the performance of the entire system.

If the client needs to set four properties before calling a method, for example, that translates into four network calls and four small but measurable delays. The result is a sluggish client application that only grows slower as the load increases. The costs become even higher if you need to create a transaction that spans more than one object interaction because the Microsoft Distributed Transaction Coordinator (DTC) service must be invoked.
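The cost of a property-heavy interface can be made visible with a small counting sketch. The OrderProxy class below is hypothetical and performs no remoting at all; it simply increments a counter wherever a real proxy would make a network round trip, so the two calling styles can be compared side by side.

```vb
Imports System

' Hypothetical stand-in for a remote proxy. Each member access bumps a counter
' in place of a real network round trip; no remoting takes place.
Public Class OrderProxy
    Public RoundTrips As Integer = 0

    Public WriteOnly Property CustomerID() As Integer
        Set(ByVal value As Integer)
            RoundTrips += 1
        End Set
    End Property

    Public WriteOnly Property ProductID() As Integer
        Set(ByVal value As Integer)
            RoundTrips += 1
        End Set
    End Property

    Public WriteOnly Property Quantity() As Integer
        Set(ByVal value As Integer)
            RoundTrips += 1
        End Set
    End Property

    Public WriteOnly Property RushDelivery() As Boolean
        Set(ByVal value As Boolean)
            RoundTrips += 1
        End Set
    End Property

    ' Property-based style: called after the four properties are set.
    Public Sub Submit()
        RoundTrips += 1
    End Sub

    ' Method-based style: all four values travel in a single call.
    Public Sub SubmitOrder(ByVal custID As Integer, ByVal prodID As Integer, _
      ByVal qty As Integer, ByVal rush As Boolean)
        RoundTrips += 1
    End Sub
End Class

Module Demo
    Sub Main()
        Dim chatty As New OrderProxy()
        chatty.CustomerID = 42
        chatty.ProductID = 7
        chatty.Quantity = 3
        chatty.RushDelivery = True
        chatty.Submit()
        Console.WriteLine("Property-based style: {0} round trips", chatty.RoundTrips)

        Dim chunky As New OrderProxy()
        chunky.SubmitOrder(42, 7, 3, True)
        Console.WriteLine("Single-method style: {0} round trip", chunky.RoundTrips)
    End Sub
End Module
```

The property-based client pays for five round trips (four property sets plus the method call), while the single-method client pays for one. With a real remote object, each of those round trips carries the full latency of a cross-process or cross-computer call.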

The communication problem is one of the reasons that remote objects are typically designed as a collection of utility functions (like an old-fashioned DLL) rather than as an abstraction over a real-world entity. Another reason is state.

The Question of State

The other difference between remote objects and local objects is that remote objects rarely retain in-memory information, or state. Some developers have discovered the problems of stateful server components the hard way by creating components that work perfectly well at first but bring the server to a near standstill as the number of clients grows and the memory load soars. Even a small amount of information can become a heavy burden when multiplied by hundreds or thousands of clients, particularly if they don't signal when they are complete and allow the required cleanup.

Using up server memory isn't the only problem with stateful server-side components. If you store state information in server memory, you effectively tie that instance of the object to a single machine and make it impossible to load-balance the system. If you try to load-balance a stateful system, a client request might be routed to a different computer that doesn't have the correct in-memory information and the state data will be lost.

The two remote object technologies in .NET, XML Web services and .NET Remoting, handle the issues of state very differently. .NET Remoting encourages you to create single-call stateless objects, but it also provides support for stateful client-activated and singleton objects. XML Web services are inherently stateless. State can be kept in server memory using the ASP.NET session state service, but there are several limitations:

The client needs to receive an HTTP cookie and maintain it. If the cookie is lost, the client fails to return it, or communication does not take place over HTTP, state management is not possible.
The state information is tied to a cookie, which contains a randomly generated unique string for the client. Programmatically, state is managed through a key-based dictionary collection. There is no way to associate state with a specific class instance or even with a specific class.
The state information is held in perpetuity unless you programmatically remove it or the session times out (typically after 20 minutes without any use).

Interestingly, ASP.NET resolves the load-balancing dilemma by allowing multiple computers to store state information on a single designated server or in a shared database. This solution is far from perfect, however. For one thing, it sacrifices one of the main benefits of in-memory state: performance. By storing state information in another process or on another computer, you ensure that calls to store or retrieve a piece of state data will execute much more slowly. Most of the examples in this book avoid session state and its limitations or use a more flexible alternative such as caching (discussed in Chapter 12).

The Difference Between Stateless Singleton and Single-Call Objects

Here's a question you can use to puzzle a seasoned .NET guru: If you create a stateless object for .NET Remoting, is there any difference between single-call and singleton activation types?

You'll remember that with singleton objects, all clients access the same object. In fact, multiple clients can access the object at the same time because .NET uses a thread pool to handle concurrent requests. As you saw in Chapter 7, if multiple clients try to access a member variable in a singleton object at the same time, you need to add some synchronization code to make sure nothing goes wrong. But here's the twist: In a stateless object, there are no member variables. Therefore, there are no possible threading conflicts!

The answer to this riddle is that a stateless class performs about the same whether it is a singleton object or a single-call object. In fact, the singleton object might even perform a little better because .NET doesn't need to create the object for each new client request. The reason Microsoft recommends that you use single-call objects rather than singleton objects is to ensure that you follow stateless design practices, which will make your life simpler in the long run.
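On the server side, the choice between the two activation types comes down to a single argument at registration time. The following host is a minimal sketch under several assumptions: the StatelessService class, the port number 8080, and the URI are all made up for illustration, and the two-argument RegisterChannel overload shown here requires .NET Framework 2.0 or later (on version 1.x, the second argument is omitted). The project needs a reference to System.Runtime.Remoting.dll.

```vb
Imports System
Imports System.Runtime.Remoting
Imports System.Runtime.Remoting.Channels
Imports System.Runtime.Remoting.Channels.Tcp

' Hypothetical stateless remotable class: no member variables,
' so concurrent requests have nothing to conflict over.
Public Class StatelessService
    Inherits MarshalByRefObject

    Public Function Add(ByVal a As Integer, ByVal b As Integer) As Integer
        Return a + b
    End Function
End Class

Module Host
    Sub Main()
        ' Listen for TCP requests on an arbitrary port (assumption: 8080 is free).
        ChannelServices.RegisterChannel(New TcpChannel(8080), False)

        ' SingleCall creates a new object for every request. Swapping in
        ' WellKnownObjectMode.Singleton would make all clients share one
        ' instance; for a class with no member variables the observable
        ' behavior is the same.
        RemotingConfiguration.RegisterWellKnownServiceType( _
          GetType(StatelessService), "StatelessService", _
          WellKnownObjectMode.SingleCall)

        Console.WriteLine("Host running. Press Enter to stop.")
        Console.ReadLine()
    End Sub
End Module
```

Because the class holds no member variables, changing SingleCall to Singleton requires no synchronization code and no other changes to the component.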

Are Remote Objects Really Objects?

What I've said so far should suggest to you that remote objects aren't exactly objects, at least not in the traditional object-oriented sense of the word. A good example of this difference is an imaginary Account class. Listing 10-1 shows how you might implement this class in a traditional object-oriented application.

Listing 10-1 A stateful account class

    Public Class Account
        ' For simplicity's sake we use public variables instead of
        ' full property procedures.
        Public AccountID As Integer
        Public Balance As Decimal

        Public Sub Update()
            ' (Update this record in the database.)
        End Sub

        Public Sub Insert()
            ' (Insert a new account record using the information in this
            '  class.)
        End Sub

        Public Sub Delete()
            ' (Delete the database record that matches this class.)
        End Sub

        Public Sub Fill()
            ' (Get the database record that matches this class.)
        End Sub
    End Class

To transfer money between accounts, the client can follow a sequence like the one shown in Listing 10-2.

Listing 10-2 Using the stateful account class

    Dim AccA As New Account(), AccB As New Account()

    ' Retrieve the balance for account 1001.
    AccA.AccountID = 1001
    AccA.Fill()

    ' Retrieve the balance for account 1002.
    AccB.AccountID = 1002
    AccB.Fill()

    ' Transfer the money.
    Dim Transfer As Decimal = 150
    AccA.Balance += Transfer
    AccB.Balance -= Transfer

    ' Save both records.
    AccA.Update()
    AccB.Update()

This code is clean and straightforward. Unfortunately, it's hopelessly inefficient if the Account class is a remote object. The problems are many:

In total, there are at least eight remote calls (assuming no action is taken when the objects are first created), and more if each += or -= on Balance is counted as a separate read and write. Even though the actual message size is small, the basic latency of a cross-process or cross-machine call is not trivial.
The account objects are stateful. Therefore, they each retain the account number and balance information in server memory. Although this problem is not apparent at design time and under a small load, with thousands of clients performing the same tasks it can become a disaster.
The account updates must be atomic, but nothing in this design makes them so. To ensure that the updates either succeed or fail together, you need to create a client-side transaction. The only way to support a client-side transaction in an object is through COM+, which isn't nearly as efficient as using a database transaction.

The ideal stateless version of this object might be called AccountUtility. It would provide straightforward, targeted methods that are centralized around common tasks, such as UpdateAccount and TransferFunds. Listing 10-3 shows an example.

Listing 10-3 A stateless AccountUtility class

    Public Class AccountUtility
        Public Sub UpdateAccount(accountID As Integer, balance As Decimal)
            ' (Update the account record in the database.)
        End Sub

        Public Sub InsertAccount(accountID As Integer, balance As Decimal)
            ' (Insert a new account record.)
        End Sub

        Public Sub Delete(accountID As Integer)
            ' (Delete the corresponding database record.)
        End Sub

        Public Sub TransferFunds(accountIDFrom As Integer, _
          accountIDTo As Integer, transferAmount As Decimal)
            ' (Start a database transaction, and modify the two accounts.)
        End Sub
    End Class

It's impossible to claim that the AccountUtility represents a real-world entity. You can't even state that AccountUtility represents the actions you can take with an account because it includes a method that spans more than one account (TransferFunds) and it might well include other methods that perform related updates to several different database records. AccountUtility is a group of related functionality, not a true object. In other words, AccountUtility is a service provider.

Listing 10-4 shows the same fund-transfer task using the AccountUtility class.

Listing 10-4 Using the stateless AccountUtility class

    Dim AccUtility As New AccountUtility()
    AccUtility.TransferFunds(1001, 1002, 150)

There are a number of benefits in this rewritten version:

There is only one remote call.
This can dramatically increase the client's performance and cut down on network traffic.
No memory is held in server state.
This ensures scalability but also means that every subsequent method call will need to resubmit the same parameters (such as an account ID), which can become quite tedious and increase the network traffic of the total system.
The transfer logic happens on the server, not the client.
This allows the server-side component to use an ordinary database transaction. It also helps separate the business logic from the presentation code (a principle we'll return to later in this chapter).
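The body of TransferFunds is left as a comment in Listing 10-3. One way it might be filled in with an ordinary ADO.NET transaction is sketched below. This is an illustration, not the book's implementation: the connection string, the Accounts table, and its AccountID and Balance columns are all assumptions, and the AdjustBalance helper is invented for the example. Running it requires a SQL Server database with a matching table.

```vb
Imports System
Imports System.Data
Imports System.Data.SqlClient

Public Class AccountUtility
    ' Hypothetical connection string; adjust for a real server and database.
    Private Const ConnectionString As String = _
      "Data Source=localhost;Initial Catalog=Bank;Integrated Security=SSPI"

    Public Sub TransferFunds(ByVal accountIDFrom As Integer, _
      ByVal accountIDTo As Integer, ByVal transferAmount As Decimal)
        If transferAmount <= 0 Then
            Throw New ArgumentOutOfRangeException("transferAmount")
        End If

        Using con As New SqlConnection(ConnectionString)
            con.Open()
            Dim tran As SqlTransaction = con.BeginTransaction()
            Try
                ' Both updates run inside the same database transaction.
                AdjustBalance(con, tran, accountIDFrom, -transferAmount)
                AdjustBalance(con, tran, accountIDTo, transferAmount)
                tran.Commit()
            Catch
                ' Undo any partial work, then let the caller see the error.
                tran.Rollback()
                Throw
            End Try
        End Using
    End Sub

    ' Hypothetical helper: adds delta to one account's balance.
    ' Assumes a table named Accounts with AccountID and Balance columns.
    Private Shared Sub AdjustBalance(ByVal con As SqlConnection, _
      ByVal tran As SqlTransaction, ByVal accountID As Integer, _
      ByVal delta As Decimal)
        Dim cmd As New SqlCommand( _
          "UPDATE Accounts SET Balance = Balance + @Delta " & _
          "WHERE AccountID = @AccountID", con, tran)
        cmd.Parameters.Add("@Delta", SqlDbType.Money).Value = delta
        cmd.Parameters.Add("@AccountID", SqlDbType.Int).Value = accountID

        ' Treat a missing account as an error so the transfer rolls back.
        If cmd.ExecuteNonQuery() <> 1 Then
            Throw New ArgumentException("Account not found: " & accountID.ToString())
        End If
    End Sub
End Class
```

Because both updates share one connection and one SqlTransaction, they succeed or fail together without involving COM+ or the DTC, and the client still makes only the single TransferFunds call.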

One of the chief drawbacks to stateless programming is that each method call typically requires a lot of information. In the preceding example, this isn't a problem because only two pieces of information are associated with an account. However, consider the less attractive method that might be required to create a new customer record:

    Public Sub InsertCustomer(firstName As String, lastName As String, _
      userID As String, password As String, email As String, _
      streetAddress As String, city As String, postalCode As String, _
      country As String)
        ' (Code omitted)
    End Sub

This is clearly some messy code. Worst of all, because all these parameters are strings, the client program could easily make a mistake by submitting information in the wrong order, which wouldn't generate an exception but would lead to invalid information in the database.

To overcome these issues, we generally deal with structures that encapsulate several related pieces of information. However, these structures differ dramatically from the traditional approach shown with the Account class because they include only data, not functionality. Often, these classes have names that end with Info or Details to highlight this distinction. This convention is used in many of the examples in this book and some Microsoft case studies, but it is by no means universal.

Here's a typical AccountDetails class that a client might use to pass information to an InsertAccount method:

    Public Class AccountDetails
        Public AccountID As Integer
        Public Balance As Decimal
    End Class

Of course, this is the approach you've already seen in the early chapters of this book. It divides a traditional stateful object into a stateless information package and a remote service provider class. The information package is never run out of process; it's just a method of delivering information. The service provider, on the other hand, is entirely stateless and built out of utility functions, ensuring optimum performance.
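The same split can be applied to the unwieldy InsertCustomer method shown earlier. The sketch below is illustrative only: CustomerDetails and CustomerUtility are hypothetical names, only a few of the nine fields are shown, and a shared in-memory list stands in for the database table so that the example runs without a server. The Serializable attribute is what allows .NET Remoting to copy the information package to the server by value.

```vb
Imports System
Imports System.Collections.Generic

' Data-only information package (hypothetical). It carries no behavior.
<Serializable()> _
Public Class CustomerDetails
    Public FirstName As String
    Public LastName As String
    Public Email As String
    Public City As String
    Public Country As String
End Class

' Stateless service provider (hypothetical name).
Public Class CustomerUtility
    Inherits MarshalByRefObject

    ' Stand-in for the database table; a real component would run an
    ' INSERT statement here and keep nothing in memory.
    Private Shared ReadOnly Table As New List(Of CustomerDetails)()

    ' Stores the customer and returns the number of rows in the stand-in table.
    Public Function InsertCustomer(ByVal customer As CustomerDetails) As Integer
        If customer Is Nothing Then
            Throw New ArgumentNullException("customer")
        End If
        SyncLock Table
            Table.Add(customer)
            Return Table.Count
        End SyncLock
    End Function
End Class

Module Demo
    Sub Main()
        ' The package is filled in locally, field by field and by name,
        ' so values cannot silently land in the wrong position.
        Dim customer As New CustomerDetails()
        customer.FirstName = "Ana"
        customer.LastName = "Lima"
        customer.Email = "ana@example.com"
        customer.City = "Lisbon"
        customer.Country = "Portugal"

        ' Only this call would cross the network in a remote scenario.
        Dim utility As New CustomerUtility()
        Dim rows As Integer = utility.InsertCustomer(customer)
        Console.WriteLine("Rows stored: {0}", rows)
    End Sub
End Module
```

Setting the five fields costs nothing in network terms because the package lives in the client's process. The only remote interaction is the single InsertCustomer call that delivers it.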