A Brief History
In the broadest sense, a distributed application is one in which the application processing is divided among two or more machines. This division of processing implies that the data involved is also distributed.
A number of distributed application solutions predate .NET Remoting. These early technologies are the foundation from which many lessons about distributed computing were learned, and .NET Remoting is their latest incarnation.
Properly managing complexity is an essential part of developing all but the most trivial software applications. One of the most fundamental techniques for managing this complexity is organizing code into related units of functionality. You can apply this technique at many levels by organizing code into procedures; procedures into classes; classes into components; and components into larger, related subsystems. Distributed applications greatly benefit from—and in many cases help enforce—this concept because modularity is required to distribute code to various machines. In fact, the broad categories of distributed architectures mainly differ in the responsibilities assigned to different modules and their interactions.
Client/server is the earliest and most fundamental of distributed architectures. In broad terms, client/server is simply a client process that requests services from a server process. The client process typically is responsible for the presentation layer (or user interface). This layer includes validating user input, dispatching calls to the server, and possibly executing some business rules. The server then acts as an engine—fulfilling client requests by executing business logic and interoperating with resources such as databases and file systems. Often many clients communicate with a single server. Although this book is about distributed application development, we should point out that client and server responsibilities generally don’t have to be divided among multiple machines. The separation of application functionality is a good design approach for processes running on a single machine.
Client/server applications are also referred to as two-tier applications because the client talks directly to the server. Two-tier architectures are usually fairly easy to implement but tend to have limited scalability. In the past, developers frequently discovered the need for n-tier designs this way: An application ran on a single machine, and someone decided it needed to be distributed, perhaps to service more than one client, to gate access to a shared resource, or to take advantage of a more powerful server machine. The first attempt was usually based on a two-tier design; the prototype worked fine, and all was considered well. As more clients were added, things started to slow down a bit. Adding even more clients brought the system to its knees. Next, the server's hardware was upgraded in an attempt to fix the problem, but this was an expensive option and only delayed confronting the real problem.
A possible solution to this problem is to change the architecture to use a three-tier or n-tier design. Figure 1-1 shows how three-tier architectures involve adding a middle tier to the system to perform a variety of tasks. One option is to put business logic in the middle tier. In this case, the middle tier checks the client-supplied data for consistency and works with the data based on the needs of the business. This work could involve collaborating with a data tier or performing in-memory calculations. If all goes well, the middle tier commonly submits its results to a data tier for storage or returns results to the client. The key strength of this design is a granular distribution of processing responsibilities.
Figure 1-1. Three-tier architecture
Even if more than one tier of an n-tier system is located on the same machine, a logical separation of system functions can be beneficial. Developers or administrators can maintain the tiers separately, swap them out altogether, or migrate them to separate machines to accommodate future scalability needs. This is why three-tier (or really n-tier) architectures are optimal for scalability as well as flexibility of software maintenance and deployment.
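As a concrete illustration of the middle tier's role described above, here is a minimal sketch in Python. All names (`submit_order`, `data_tier_store`, the 5% bulk discount) are hypothetical, invented purely for illustration: the middle tier checks client-supplied data for consistency, performs an in-memory business calculation, and submits the result to a stand-in data tier.

```python
# A sketch of a middle tier with hypothetical names: the presentation
# tier calls submit_order(), the middle tier validates and applies an
# assumed business rule, and a data-tier stub stores the result.

STORAGE = []  # stand-in for a real data tier (e.g., a database table)

def data_tier_store(order):
    # Data tier: persist the order and return a record id.
    STORAGE.append(order)
    return len(STORAGE)

def submit_order(customer, quantity, unit_price):
    # Consistency checks on client-supplied data.
    if not customer:
        raise ValueError("customer is required")
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    # In-memory business calculation (the bulk discount is assumed
    # purely for illustration).
    total = quantity * unit_price
    if quantity >= 100:
        total *= 0.95
    order = {"customer": customer, "quantity": quantity,
             "total": round(total, 2)}
    # Hand the validated result to the data tier for storage.
    return data_tier_store(order)

record_id = submit_order("Contoso", 100, 2.0)
print(record_id)  # 1
```

Because the validation, business rule, and storage live in separate functions, each piece could later be maintained, swapped out, or migrated to a separate machine, which is exactly the flexibility the tiered model promises.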
The preceding distributed architectures have clear roles for each of the tiers. Client/server tiers can easily be labeled as either master/slave or producer/consumer. Tiers in an n-tier model tend to fall into roles such as presentation layer, business layer, or data layer. This needn’t always be the case, however. Some designs benefit from a more collaborative model in which the lines between client and server are blurred. Workgroup scenarios are constructed this way because the main function of these distributed applications is to share information and processing.
A pure peer-to-peer design is composed of many individual nodes with no centralized server, as shown in Figure 1-2. Without a well-known main server, there must be a mechanism that enables peers to find each other. This is usually achieved through broadcast techniques or predefined configuration settings.
Figure 1-2. Peer-to-peer architecture
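To make the discovery problem concrete, here is a small sketch (not any particular product's mechanism) of a peer announcing itself over UDP on an assumed, preconfigured well-known port. On a real LAN the announcement would go to the subnet broadcast address after enabling `SO_BROADCAST`; loopback is used here only so the sketch is self-contained.

```python
# Sketch of broadcast-style peer discovery. DISCOVERY_PORT is an
# assumed value that every peer is preconfigured with.
import socket

DISCOVERY_PORT = 47000

def start_listener():
    # Each peer listens on the agreed-upon discovery port.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", DISCOVERY_PORT))
    sock.settimeout(5)
    return sock

def announce(name):
    # A joining peer announces itself. On a real LAN this would
    # target the broadcast address with SO_BROADCAST enabled.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(f"HELLO {name}".encode(), ("127.0.0.1", DISCOVERY_PORT))
    sock.close()

listener = start_listener()
announce("peer-a")
data, addr = listener.recvfrom(1024)
print(data.decode())  # HELLO peer-a
```

The alternative mentioned in the text, predefined configuration, simply replaces the announcement step with a static list of peer addresses that each node reads at startup.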
The Internet is usually considered a classic client/server architecture with a monolithic Web server servicing many thin clients. But the Internet has also given rise to some quasi-peer-to-peer applications, such as Napster and Gnutella. These systems allow collaborative sharing of data between peer machines. These peers use a centralized server for peer discovery and lookup, as shown in Figure 1-3. Although they're not pure peer-to-peer architectures, these hybrid models usually scale much better than a completely decentralized peer model and deliver the same collaborative benefits.
Figure 1-3. Peer-to-peer architecture with centralized lookup server
Even multitier designs can benefit from loosening the client-server role distinction. It’s common for a client module to also be a server and for a server to be a client. We’ll discuss this blurring of client-server roles when we look at client callbacks and events in Chapter 3, “Building Distributed Applications with .NET Remoting.”
The various distributed architectures we discussed have been implemented over the years by using a variety of technologies. Although these architectures are tried and true, the big improvements in distributed application development have been in the technology. Compared to the tools and abstractions used to develop distributed applications 10 years ago, today’s developers have it made! Today we can spend a lot more time on solving business problems than on constructing an infrastructure just to move data from machine to machine. Let’s look at how far we’ve come.
Sockets

Sockets are one of the fundamental abstractions of modern network applications. Sockets shield programmers from the low-level details of a network by making the communication look like stream-based I/O. Although sockets provide full control over communications, they require too much work for building complex, full-featured distributed applications. Using stream-based I/O for data communications means that developers have to construct message-passing systems and build and interpret streams of data. This kind of work is too tedious for most general-purpose distributed applications. What developers need is a higher-level abstraction, one that gives the illusion of making a local function or procedure call.
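To see why stream-based I/O pushes this work onto the developer, consider a minimal Python sketch of the framing chore: TCP delivers an undifferentiated byte stream, so the application itself must mark message boundaries, here with a 4-byte length prefix. (The framing scheme is an assumption for illustration; `socketpair` stands in for a real network connection.)

```python
# Sketch of the message-framing work the text describes: each message
# is sent as a 4-byte big-endian length header followed by the payload.
import socket
import struct

def send_message(sock, payload: bytes):
    # Build the stream: length header, then the payload bytes.
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exact(sock, n: int) -> bytes:
    # A stream read may return fewer bytes than requested, so loop.
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf

def recv_message(sock) -> bytes:
    # Interpret the stream: read the header, then exactly that many bytes.
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)

# Demonstration over a loopback socket pair.
server, client = socket.socketpair()
send_message(client, b"first")
send_message(client, b"second")
print(recv_message(server))  # b'first'
print(recv_message(server))  # b'second'
```

Every byte of this framing, buffering, and interpretation logic is infrastructure rather than business logic, which is precisely why a higher-level abstraction is so attractive.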
Remote Procedure Calls
The Distributed Computing Environment (DCE) of the Open Group (formerly the Open Software Foundation) defined, among other technologies, a specification for making remote procedure calls (RPC). With RPC, given proper configuration and data type constraints, developers could enable remote communications by using many of the same semantics as a local procedure call. RPC introduced several fundamental concepts that are the basis for all modern distributed technologies, including DCOM, CORBA, Java RMI, and now .NET Remoting. Here are some of these basic concepts:
- Stubs
These pieces of code run on the client and the server and make the remote procedure calls appear as though they're local. For example, client code calls procedures in the stub that look exactly like the ones implemented on the server. The stub then forwards the call to the remote procedure.
- Marshaling
This is the process of passing parameters from one context to another. In RPC, function parameters are serialized into packets for transmission across the wire.
- Interface Definition Language (IDL)
This language provides a standard means of describing the calling syntax and data types of remotely callable procedures independent of any specific programming language. An IDL isn't needed for Java RMI because this distributed application technology supports only one language: Java.
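The stub and marshaling concepts above can be illustrated with a small, language-neutral sketch, written here in Python with JSON standing in for a real wire format; all names are hypothetical. The client-side stub `add` looks exactly like the server's procedure, marshals its parameters into a packet, and unmarshals the reply. The transport is simulated by a direct call so the sketch stays self-contained.

```python
# Sketch of the stub-and-marshaling idea. In real RPC the packet would
# cross the network; here server_dispatch is called directly.
import json

def server_add(a, b):
    # The real procedure, implemented on the "server."
    return a + b

def server_dispatch(packet: bytes) -> bytes:
    # Server-side stub: unmarshal the request, invoke the procedure,
    # and marshal the result for the trip back.
    request = json.loads(packet)
    result = server_add(*request["args"])
    return json.dumps({"result": result}).encode()

def add(a, b):
    # Client-side stub: its signature looks exactly like the remote
    # procedure's. It marshals the parameters into a packet...
    packet = json.dumps({"proc": "add", "args": [a, b]}).encode()
    # ...forwards the call (across the wire, in a real system)...
    reply = server_dispatch(packet)
    # ...and unmarshals the reply for the caller.
    return json.loads(reply)["result"]

print(add(2, 3))  # 5
```

In a real RPC system, both stubs would be generated automatically from an IDL description of `add`, which is what frees the developer from writing this plumbing by hand.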
RPC represented a huge leap forward in making remote communications friendlier than socket programming. Over time, however, the industry moved away from procedural programming and toward object-oriented development. It was inevitable that distributed object technologies wouldn’t be far behind.