This section lays a foundation for application design. It contains very little code; instead, I talk about distributed application architecture, sometimes known as enterprise application architecture. You need a solid understanding of this to get a good feeling for how all the pieces of the book fit together.
The term distributed is used throughout this chapter, and then a good bit in the rest of the book. What I'm referring to is a system of disparately located resources; for example, a server in New York is part of an application that is based in Los Angeles. It interoperates just as if it were in the same building. This notion of distant interoperation is also known as location transparency.
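As a rough illustration of location transparency, the sketch below hides where a service runs behind a single interface. All of the names (IStockService, LocalStock, RemoteStockProxy, canShip) are hypothetical, and the "remote" hop is only simulated; a real system would marshal the call over DCOM or a similar transport.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical service contract; callers never learn where it runs.
class IStockService {
public:
    virtual ~IStockService() = default;
    virtual int unitsInStock(const std::string& sku) const = 0;
};

// Runs in the caller's own process.
class LocalStock : public IStockService {
public:
    int unitsInStock(const std::string& sku) const override {
        auto it = stock_.find(sku);
        return it == stock_.end() ? 0 : it->second;
    }
private:
    std::map<std::string, int> stock_{{"widget", 12}};
};

// Stands in for a server elsewhere. The forwarding is simulated here;
// a real proxy would marshal the call over the network.
class RemoteStockProxy : public IStockService {
public:
    explicit RemoteStockProxy(std::unique_ptr<IStockService> server)
        : server_(std::move(server)) {}
    int unitsInStock(const std::string& sku) const override {
        return server_->unitsInStock(sku);  // pretend network hop
    }
private:
    std::unique_ptr<IStockService> server_;
};

// Client code is identical for local and remote services.
bool canShip(const IStockService& service, const std::string& sku, int wanted) {
    return service.unitsInStock(sku) >= wanted;
}
```

Because canShip sees only IStockService, it cannot tell a LocalStock from a RemoteStockProxy, which is the point of the term.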
Only a short time ago, developers talked about Distributed interNet Applications Architecture (DNA). It was one of the hot buzzwords. DNA was about creating scalable and robust applications that ran within a distributed Microsoft environment, or more often, within a browser via the Internet (or an intranet). In its never-ending quest for improvements, though, Microsoft now is delivering the next-generation technology for building distributed applications. It is known as the .NET platform, or the .NET framework.
The .NET platform includes technologies such as Visual C++, Visual C# (Microsoft's new object-oriented language), ASP.NET, COM+, ADO.NET, and XML. These technologies are integrated nearly seamlessly in Visual Studio .NET.
The .NET platform is an abstraction. There is no specification for building these applications such as there is for COM/DCOM/COM+, and there are no logo requirements or rules for regulating .NET compliance. Microsoft, however, does promote .NET as a robust framework for building scalable, multitiered Internet applications.
The Core of Distributed Applications
There are usually three tiers or layers of services. Each is bounded by well-defined interfaces that export public behavior only. They are user interface and navigation, business processes, and integrated storage.
User Interface and Navigation
Development tools, such as Visual C++, reside at the user-interface level because they are so popular for creating graphical interaction environments with rich controls (combo boxes, buttons, scrollbars, and so on).
The business-process tier is the heart or hot zone of distributed applications. Herein live the core server-side products that enable developers to focus on the solution and not the implementation details. Most of the plumbing required for managing transactions, for example, is provided by COM+. If queuing services are required, COM+ exports a very simple interface to make this happen with minimal development and configuration effort. Finally, to connect the application to the Internet, an HTTP server, such as Microsoft's IIS, must exist. IIS does a lot more than interpret HTTP.
The application also must be capable of providing a solution, so the necessary logic and rules are present in this tier. Encapsulating the rules in components is preferable, but not required. Components allow for reusability and interoperability across platforms, but as long as the business rules can be activated and information passed back and forth between tiers, any programming technology can be used.
Not all storage must be tied to a database. Many forms of physical storage are acceptable and sometimes required. With the strong ties to the Internet, e-mail or multimedia binary streams (such as audio or video) are strong contenders for primary data sources or sinks. A file system also can be used as storage when relational data is not required. Log files, for example, are typically text files stored on the file system that are incrementally built as an application runs.
When data must be associated with other data, a relational database such as SQL Server 2000 can be used through Microsoft's Universal Data Access mechanisms. ActiveX Data Objects for .NET (ADO.NET) are perfect for distributed applications because of their component nature. Other methods of storage, such as collaboration streams, can exist, too.
A Functional Overview of Distributed Applications
Having discussed the general .NET framework and some of the tools and products it encompasses, you can dive into more detail with a specific (but strictly hypothetical) example.
You begin the journey into an application with a browser. For the .NET platform, a browser usually means Microsoft Internet Explorer (IE). Microsoft has ported IE to many platforms, such as Macintosh, Windows, Solaris, and SunOS, with others in the works. IE is the ideal target browser because of its wide platform support, its integration with COM+ (ActiveX), and its relentless pursuit of world dominance in the browser wars.
Internet Information Server (IIS)
The next point of contact is Microsoft's Internet Information Server (IIS), a critical server-side product that makes the .NET platform possible. One of its main goals is to serve HTML pages over the HTTP protocol efficiently and reliably. IIS is not the only HTTP server out there. Competitors such as Apache and Netscape have long been in the server business and have produced commendable, inexpensive (or free) HTTP servers, but their main focus has been to serve static content as fast as possible. With IIS, however, you get much more than an optimized content server.
ASP.NET adds dynamics to HTML, a static markup language that by itself does not accommodate interaction with a user. In the past, HTML relied on external server-side Common Gateway Interface (CGI) programs or Perl scripts that would generate HTML based on some input. This gave the illusion of interaction, but at the cost of user-observed latency because many round trips were required.
This was the situation circa 1995. Since then, a whole slew of new technologies have changed the way developers write HTML-based applications. It now is possible to execute code based on user input on the client itself, without having to repeatedly query a server. This increases reliability and performance but adds coupling to the application.
ASP.NET encapsulates the complicated task of trying to create the illusion of a dynamic, user-driven application when only static, canned data exists. ASP.NET enables a Web application to create individual environments controlled by the user. This is not a new concept. CGI has been around since HTML and has provided a limited form of dynamic Web content. If you've ever programmed with CGI, you are aware of its delicate and awkward user interface. While ASP took CGI to a new level, ASP.NET takes ASP to the next level and creates an easy-to-program bridge between Web users and server components.
The addition of ASP.NET has provided an even easier level of control of Internet applications. Internet applications are virtually indistinguishable in functionality from their Win32 or similar GUI counterparts. Consider the following ASP.NET code, which creates a COM object, locates an interface, calls a method, and stores a result in a variable, all with only two lines of code:
MyObj = Server.CreateObject("Object.ImyInterface")
Result = MyObj.MyMethod(argument1, argument2)
COM+ Transaction Management
Below IIS live the COM+ transaction services (built on Microsoft Transaction Server, or MTS). With this technology, a client will never see beyond ASP.NET, so the COM+ transaction services are playing a role strictly behind the scenes. And the COM+ transaction services do much more than their name implies. They certainly do a great job of automatically coordinating transactions within and across components and databases, but they also manage the caching and pooling of resources, producing scalability as a free side effect.
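To make the idea of resource pooling concrete, here is a minimal sketch of a pool that reuses idle resources instead of creating new ones. It is not how COM+ implements pooling; Connection and ConnectionPool are hypothetical names, and the sketch ignores threading entirely.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical expensive resource, such as a database connection.
struct Connection {
    int id;
};

// Minimal pool: hands out idle connections and creates new ones only
// when none are idle. Not thread-safe; a real pool would need locking.
class ConnectionPool {
public:
    std::unique_ptr<Connection> acquire() {
        if (!idle_.empty()) {
            std::unique_ptr<Connection> conn = std::move(idle_.back());
            idle_.pop_back();
            return conn;
        }
        return std::unique_ptr<Connection>(new Connection{nextId_++});
    }

    // Returns a connection to the pool so a later caller can reuse it.
    void release(std::unique_ptr<Connection> conn) {
        idle_.push_back(std::move(conn));
    }

    int created() const { return nextId_; }           // total ever created
    std::size_t idle() const { return idle_.size(); }  // currently unused

private:
    std::vector<std::unique_ptr<Connection>> idle_;
    int nextId_ = 0;
};
```

A client that acquires, releases, and acquires again receives the same connection the second time, so the cost of creating it is paid only once.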
COM+ transaction services normally are used as a container for components in the .NET framework.
COM+ Messaging and SQL Server
If your application must asynchronously communicate with another application or data store over a high-latency network, the COM+ messaging services (built on Microsoft Message Queue, or MSMQ) can be of tremendous help. These services coordinate the complex interactions involved in the delivery, reception, and acknowledgment of synchronous or asynchronous messages. Components themselves can be messages and can be automatically managed. The developer need only provide the necessary implementation for the COM+ IPersist interfaces and let the messaging services take over in a reliable and fault-tolerant manner.
For distributed applications that must communicate with each other remotely, or even within a local machine, COM+ messaging services relieve the COM+ programmer from the burden of low-level communication details such as WinSock, spawning threads, and concurrency control.
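The essence of queued, asynchronous communication can be sketched in a few lines. The code below is an in-process stand-in, not the MSMQ or COM+ API; Message, MessageQueue, and drain are hypothetical names, and a real queuing service adds persistence, acknowledgment, and network transport.

```cpp
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

// Hypothetical message: a label plus a text payload.
struct Message {
    std::string label;
    std::string body;
};

// In-process stand-in for a message queue.
class MessageQueue {
public:
    // The sender returns immediately; no receiver has to be running.
    void send(const Message& m) { pending_.push(m); }

    // The receiver takes the oldest message, if any. Returns false
    // when the queue is empty.
    bool receive(Message& out) {
        if (pending_.empty()) return false;
        out = pending_.front();
        pending_.pop();
        return true;
    }

    std::size_t size() const { return pending_.size(); }

private:
    std::queue<Message> pending_;
};

// Receiver side: drains whatever has accumulated and returns the
// message bodies in arrival order.
std::vector<std::string> drain(MessageQueue& q) {
    std::vector<std::string> bodies;
    Message m;
    while (q.receive(m)) bodies.push_back(m.body);
    return bodies;
}
```

The sender and receiver never call each other directly, so either side can be slow or temporarily absent without blocking the other.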
XML is a way of representing data. It's platform independent, and provides several other advantages, such as being easy to send as a payload to remote servers. This book teaches you about XML, and in the chapters that show you how to build full applications, you'll use it as the data exchange medium.
So, where does Visual Studio fit into the .NET platform? It provides a very rich set of tools that work together in the development of distributed applications. This book covers XML, COM+, and ASP.NET. Specifically, I'll talk about creating and using XML documents, creating and using COM+ components, and creating and using ASP.NET. These three technologies work well to form the basis for distributed applications.
Designing Distributed Applications
Although XML, COM+, and ASP.NET are groundbreaking new technologies that surely will change the way software is developed in the near future, they are no silver bullet. The technologies by themselves do not guarantee robust and reliable architectures, or even ones that work, for that matter. Good design principles are critical for any software system, regardless of the model or framework on which it is based. It is essential that developers spend sufficient time analyzing the problem to be solved (the what), and then more time designing a working solution (the how).
The analysis and requirement-gathering phases of software engineering, albeit critical, are not related to programming, so they are not covered in this book. Design, however, does impact .NET components and merits a section of its own. If a solid foundation is not created during the design phase of an application's development, the application is unlikely to be successful over time. When it comes time to modify or augment the application, the lack of design will be evident, and the application likely will need a rewrite.
The section titled "Ad-Hoc Design" lays the groundwork on which all .NET programming should stand.
So far, this chapter has already laid down a road map. The task now at hand is to correctly follow this map and build reliable, maintainable, and scalable applications with it. The design principles discussed in the following sections add value to all .NET solutions.
An emphasis is placed on multitiered design for both its success in software design in general and for its seamless adaptation to .NET solutions. Traditional design methods are covered first, then three-tiered models, and finally multiple tiers in the sections that follow.
In general, trial-and-error or nonexistent design techniques are likely to fail when you are developing software. Surprisingly, even when lack of design has been proven to cause late and over-budget projects that do not work or are never delivered, software developers continue to ignore this part of the development cycle. It seems as if there are never enough resources or days in the week for this seemingly empty and intangible phase.
A flawed design becomes much more apparent as the application evolves. Although it is true that there are great benefits to component-based software, all advantages quickly disappear if components are assembled into monolithic, rigid structures. It is very difficult (and indeed, sometimes impossible) to scale or upgrade components without a rewrite unless they have been properly designed. Add to that the training investment that programming distributed applications requires, and the pressure rises even higher. A failed component-based project will not be as forgiving as a traditional monolithic application. It's easier to fix a monolithic application, so other developers who have to fix problems with your project in the future will have an easier time of it. With all of its complexity, .NET might seem shrouded in mystery to the uninitiated. It is very easy to blame .NET technology in general when an ill-designed project fails.
Even when the technology is used correctly, components themselves don't solve all known software problems; in fact, they introduce a few of their own. If you've ever worked with DCOM, you can probably attest to this statement. XML, COM+, and ASP.NET take time to learn. They require that you invest time in a considerable setup and configuration phase not normally associated with other programming paradigms or languages. Their benefits, however, surpass all these costs with intangible rewards such as reusability, maintainability, and scalability.
Fundamental Application Boundaries: Presentation, Logic, and Data Services
Designing software with components is slightly different from designing with other traditional methods. First, components are inherently independent, standalone entities. Components communicate with each other only through known public interfaces. A process of discovery reliably tests what a component can and cannot do.
If components are grouped into one monolithic, tightly coupled pile, they lose all their intrinsic benefits and add maintenance complexity. You must view components as heterogeneous sets of tiers or layers, where each component has something in common with the other but solves a different problem.
The idea of partitioning the work is not new. One highly successful model is known as the three-tiered model in the client/server community. You learn it here as a primary design guideline for all distributed programming.
When I say a system is tightly coupled, I am referring to its interdependencies.
A coupled architecture is rigid and inflexible, but easy to design and implement. Coupled architectures are synonymous with monoliths because they are regarded as single entities with little or no interchangeable objects or components, very much like huge blocks of granite.
Tightly coupled applications are difficult to maintain. When it is time to modify a monolithic system, the development team might find they are painted into a corner. Any modification to the system, however minor, is risky. A single change in a line of code in one function in one module can have a domino effect that causes another function in a separate module to break, causing the entire system to fail in strange and unpredictable ways.
Coupling can be divided into many forms. The most important are architectural, intra-procedural, inter-procedural, and inter-modular. They are listed here in order of greater impact on the system:
A loosely coupled architecture, on the other hand, is one composed of independent, interoperable components or objects. Changes to the architecture are isolated to one component at a time. The probability of a domino effect as described earlier is almost zero. However, loosely coupled architectures incur a performance and development time tradeoff. They take considerably more effort to design, significant communication overhead is introduced by the components, and in general they are harder to implement.
Nonetheless, the benefits of loosely coupled systems far outweigh their cost.
One of the most successful models to emerge from the client/server world has been the three-tiered model. A tier, or layer, is a collection or set of independent homogeneous objects (each solving a small problem) that together solve a larger but common problem.
Consider a typical PC application today. It is likely to interact with a user, process some data, and perhaps persist its state somewhere; hence three tiers.
The three tiers are generally known as presentation, business logic, and data services.
Monolithic Versus Three-Tiered Design
You are very likely to encounter the three-tiered design model under different names at different times (integrated storage rather than data services, or navigation rather than presentation, for example), but the concept in any case is the same. Within each tier live components that in turn contain public interfaces to the private functions or methods that do the actual work.
The underlying principle of COM+ is having rigid implementation boundaries encapsulated by well-known public interfaces. Tiers are no different. The interfaces for a tier are more general and at a coarser level of granularity than those for a component, but encapsulation is still the foundation.
A particular tier should know nothing whatsoever about its adjacent tiers other than their exposed public interfaces. From a procedural standpoint, this indifference seems like a restriction, but it's really a liberating mechanism. Herein lies the strength of the model: Changes in one tier have minimal impact on the others. This rule puts the architecture in a comfortable position to be easily expanded and freely upgraded with time.
Communication across tiers should exist only through public interfaces in a well-designed three-tiered architecture. When tiers are loosely coupled, it is very simple to swap out components (or entire tiers) to adapt to changing requirements without demanding a rewrite or system retest. For this reason, tiers should be completely unaware and carefree about the implementation of adjacent layers. A tier should see only the public interfaces of its immediate neighbors.
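The rule that a tier sees only the public interfaces of its immediate neighbor can be sketched as follows. This is a deliberately tiny example under my own assumptions; IAccountStore, IAccountRules, and the other names are hypothetical, and the in-memory store merely stands in for real data services.

```cpp
#include <map>
#include <string>

// Data services tier: public interface only.
class IAccountStore {
public:
    virtual ~IAccountStore() = default;
    virtual double balance(const std::string& account) const = 0;
};

// Business logic tier: public interface only.
class IAccountRules {
public:
    virtual ~IAccountRules() = default;
    virtual bool canWithdraw(const std::string& account, double amount) const = 0;
};

// One possible data tier implementation; it could be swapped for a
// database-backed one without touching the tiers above.
class InMemoryStore : public IAccountStore {
public:
    double balance(const std::string& account) const override {
        auto it = balances_.find(account);
        return it == balances_.end() ? 0.0 : it->second;
    }
private:
    std::map<std::string, double> balances_{{"alice", 100.0}};
};

// Business tier implementation: knows IAccountStore, not InMemoryStore.
class AccountRules : public IAccountRules {
public:
    explicit AccountRules(const IAccountStore& store) : store_(store) {}
    bool canWithdraw(const std::string& account, double amount) const override {
        return amount > 0.0 && amount <= store_.balance(account);
    }
private:
    const IAccountStore& store_;
};

// Presentation tier: talks to the business tier only and never
// touches the data tier directly.
std::string withdrawalMessage(const IAccountRules& rules,
                              const std::string& account, double amount) {
    return rules.canWithdraw(account, amount) ? "Approved" : "Declined";
}
```

Replacing InMemoryStore with another IAccountStore implementation requires no change to AccountRules or withdrawalMessage, which is what loose coupling between tiers buys you.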
This highly effective and universal model has been around for years but is surprisingly uncommon outside the client/server world (one machine stores data, another reports it). With the advent of COM+ (interface-driven) and .NET, the three-tiered model is seeing even greater use.
The main tiers in the three-tiered model can be described as follows:
An application does not require a database to benefit from a three-tiered model. Any kind of persistent storage can be placed here (file systems, e-mail, multimedia streams, and so on), away from the presentation and logic layers, so as to increase the potential for maintainability and evolution of the system in the future.
Keeping Tiers Balanced
Imagine a well-balanced three-tiered architecture sitting atop a delicate fulcrum. On the left rests all the presentation code, in the middle the business logic, and on the right any data services.
The business logic middle tier is likely to carry most of the weight in a system, but placing too much weight on either extreme can be disastrous for future evolution. Putting too much weight on either side tilts the balanced objects and causes the application to collapse.
On the presentation side, this could mean adding data-aware or data-bound controls that talk directly to the data-access tier. In turn, relying heavily on database-specific logic mechanisms, such as stored procedures or triggers, on the data services tier has the opposite effect, tipping the balance to the data services side.
A Solid, Robust Design
Now consider a solid three-tiered design that rests on an immovable foundation of logic, thereby eliminating the fulcrum and the chance of disturbing the balance. A square represents a rigid foundation, unshaken by future upgrades or modifications to any tier. This rigid foundation is achieved by the clear separation of presentation and data logic from the business rules. As long as the problems the application is trying to solve are oblivious to where the data (if any) comes from, or how it is presented, the application will remain scalable and robust with time.
The three-tiered architecture is not just for client/server environments. Virtually all applications can be divided into at least three tiers as described. (As an exception, however, consider device drivers and other low-level hardware manipulation programs where the extra three-tiered design effort can be counter-productive.)
Whether you are building a small Win32-based application or a full-blown, Web-enabled e-commerce system, these principles are applied equally. No matter how complex the problem, software always deals with data in some form (or there would be no work to do) and can always be separated by behavior into distinctive layers.
As applications evolve and become more complex, sometimes it is necessary to break one particular tier into two or more. This results in multitiered, or n-tiered, architectures. Although the fundamental interaction with a user might be the same (present data, gather data, and store data), it is sometimes easier to break the three basic tiers into several pieces.
For large complex applications with dozens of developers, it is much easier to work with an n-tiered architecture than a three-tiered one because the work can be carried out separately in finer granularity.
You might be asking yourself, "Why not keep one tier, as before, and two components that handle HTML and Win32, respectively, adding interfaces and methods as required?" The answer is that Win32 can be so complex that doing so would require a large set of homogeneous Win32 components to coexist with another large set of homogeneous HTML components. Win32 is inherently different from HTML processing (non-linear and event-driven rather than linear and static).
As you might recall, a tier is defined as a collection of independent homogeneous objects that work together to solve a common problem. By breaking down the tiers, and thus the number of components in each tier, you can greatly reduce the development and testing complexity of an application in a team environment. Independently developing and maintaining 5 components is much easier than doing so with 20.
The more components you have in a tier, the more the tier approaches a monolithic model (too much coupling), defeating the purpose of tiered architectures. In the case of HTML and Win32 as presentation choices, separating them into two tiers is a sound logical choice.
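A small sketch shows what splitting presentation into two tiers looks like in practice. The names here (OrderSummary, summarize, renderHtml, renderText) are hypothetical, and the plain-text renderer only stands in for a Win32 front end, which would be far larger.

```cpp
#include <string>

// Business tier (hypothetical): shared by every presentation tier.
struct OrderSummary {
    std::string customer;
    int items;
};

OrderSummary summarize(const std::string& customer, int items) {
    return OrderSummary{customer, items};
}

// Presentation tier 1: HTML output for a browser.
std::string renderHtml(const OrderSummary& s) {
    return "<p>" + s.customer + ": " + std::to_string(s.items) + " items</p>";
}

// Presentation tier 2: plain text, standing in for a desktop GUI.
std::string renderText(const OrderSummary& s) {
    return s.customer + ": " + std::to_string(s.items) + " items";
}
```

Each renderer can grow, be tested, and be staffed independently, while both depend on the same business-tier result.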
It is essential to set a threshold for the minimum number of components a tier should contain. If you were to make each component its own tier, you would be back where you started (all components, no homogeneous organization). When designing an application, use tiers to your advantage but be frugal in their quantity.
Moving down from the top presentation tiers are four separate tiers that normally would be contained in the business tier. If you break them apart according to general functional goals, it is much easier to understand what the application is doing: validating data, crunching data, securing data (encryption, authentication, and so on), and transacting data. Each of these actions certainly could be contained in a single tier, but doing so would crowd the tier and render it unmanageable, a contradiction of the model's main purpose.
A Data Validation/Shaping tier contains many different, but functionally similar, components. Each one solves a different problem within the validation domain of the application. Although validation can be considered a business-tier functionality, the validation components have little or no relation to crunching the data itself.
The next section talks about where your tiers should be physically placed in a deployment configuration.
Local or Distributed?
Do not be fooled by the apparent distribution of the disconnected layers in a multitiered architecture. Programming with tiers does not imply a distributed relationship or even the presence of a database. It is a road map to follow and can be used for any application. Some applications do not even require a presentation layer. They can be services, for example, that listen for non-interactive requests. Likewise, a data services tier can be absent if the application doesn't involve persistence (as in the calculator usually found under the Accessories menu in Windows).
Good Design Techniques
Fortunately, sound design principles are very easy to apply to component-based software. This book does not examine the more involved object-oriented approaches that exist, but instead covers tried-and-true techniques for designing multitiered architectures. In order of importance, they are as follows:
From these techniques or principles, you probably can see a top-down approach. It is extremely valuable to start from the general and narrow down into the details.
Abstract the Application into Tiers
All useful software is created to solve one or more problems, which can always be divided or separated into subproblems. With components, it is important to split large and sophisticated problems into smaller, more manageable ones. By using this divide-and-conquer approach, you ensure that the problem as a whole will be solved, and you also create a robust and easily maintainable architecture. It is much easier to modify and upgrade a single subproblem component than it is to try to tackle an entire monolithic problem, with the usual retesting and debugging.
This phase is a little more challenging and requires foresight (and hindsight!). It's probably the most fun and creative phase of all because it brings together past experiences, current technology, and analytical problem solving. Again, although not all applications can seamlessly fit the three-tiered model, they can always be logically separated into this abstraction at some level.
Start with presentation. Will the application interact with a user through a graphic interface or is it a service that fulfills requests blindly? Creating intuitive and user-friendly GUIs is not a trivial task. All too often you see GUIs that are plagued with inconsistencies from screen to screen or are too busy. This only compounds the problems when you are working with browsers and their lack of rich control sets or state.
You won't learn GUI design in this book, but it is worth mentioning that it should not be taken lightly. Remember that although the presentation tier is just the tip of the iceberg, it is the one and only communication link between a user and what constitutes the application.
The tiny presentation tier can be sitting on the shoulders of a huge sophisticated business tier, but if it's not exporting a user-friendly working environment, or is inconsistent, the user might dismiss the entire application.
After you have decided what visual or nonvisual technology to use (Win32, browser, console, and so on), separate the presentation into functional units. One component could handle menu options, another tabs, and yet another ToolTips. Or in a browser, one component can be delegated to track context menus that change from page to page, whereas others can be in charge of tracking combo and text boxes. The more you separate your presentation into standalone objects, the easier it is to maintain and upgrade.
From my own experience, I think it is safe to say that the presentation tier undergoes the most changes. It is impossible to satisfy all users on how interaction should occur, but you can find common ground that satisfies the majority of users and reaches a compromise.
The other two tiers should follow the same idea. The business tier is in charge of validating and processing. That's two components. A validation component can be in charge of filtering any data before it is passed on for processing. The processing component, in turn, does not have to worry about bad data because it has already been validated. This arrangement removes the burden of having to carry out both actions simultaneously, which can cause errors.
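The split between a validation component and a processing component might look like the sketch below. The class names and the rule being checked (quantities must be non-negative) are my own assumptions, chosen only to keep the example short.

```cpp
#include <vector>

// Validation component: rejects bad data before it reaches processing.
// The rule used here (non-negative quantities) is only an example.
class QuantityValidator {
public:
    bool isValid(const std::vector<int>& quantities) const {
        for (int q : quantities) {
            if (q < 0) return false;
        }
        return true;
    }
};

// Processing component: assumes its input has already been validated.
class OrderProcessor {
public:
    int totalUnits(const std::vector<int>& quantities) const {
        int total = 0;
        for (int q : quantities) total += q;
        return total;
    }
};

// Business tier entry point: validate first, then process.
// Returns -1 when validation fails.
int processOrder(const std::vector<int>& quantities) {
    QuantityValidator validator;
    if (!validator.isValid(quantities)) return -1;
    return OrderProcessor().totalUnits(quantities);
}
```

OrderProcessor contains no defensive checks at all, which keeps it simple; the price is that nothing may call it without going through the validator first.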
If your application requires data access, then the third (but by no means final) tier can be in charge of all data access. If your application communicates with a typical relational database, you can have two components: query and update, for example, each with its set of interfaces for carrying out vendor-specific SQL commands. Avoid using stored procedures whenever possible. They couple the underlying physical database with the rest of the application at the sometimes negligible reward of increased performance. If you move the logic of stored procedures into components instead, the processing is generic and can be applied to any database technology in the future.
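Here is one way an update component could carry a rule that might otherwise live in a stored procedure or trigger. Everything in it is an assumption made for illustration: IDatabase is a hypothetical abstraction rather than a real driver API, RecordingDatabase is a test double, and the table and column names are invented.

```cpp
#include <string>
#include <vector>

// Hypothetical minimal database abstraction; any vendor's driver could
// sit behind it.
class IDatabase {
public:
    virtual ~IDatabase() = default;
    virtual void execute(const std::string& sql) = 0;
};

// Test double that records statements instead of running them.
class RecordingDatabase : public IDatabase {
public:
    void execute(const std::string& sql) override { log.push_back(sql); }
    std::vector<std::string> log;
};

// Update component: the rule "stock never drops below zero" lives here
// rather than in a vendor-specific stored procedure or trigger.
class StockUpdate {
public:
    explicit StockUpdate(IDatabase& db) : db_(db) {}

    // Returns false, and issues no SQL, when the rule would be violated.
    bool remove(int productId, int current, int quantity) {
        if (quantity <= 0 || quantity > current) return false;
        // Integers only, so plain concatenation is safe here; text values
        // would call for parameterized statements.
        db_.execute("UPDATE stock SET units = " +
                    std::to_string(current - quantity) +
                    " WHERE product_id = " + std::to_string(productId));
        return true;
    }

private:
    IDatabase& db_;
};
```

The SQL that reaches the database is plain and portable, so moving to a different database product means writing a new IDatabase implementation and nothing else.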
Interfaces are at the heart of COM+. They are powerful abstractions that enable you to separate advertised behavior from internal implementation. An interface should describe only what public services an object offers. The private state of an object should never be disclosed through public interfaces.
An interface in COM+ is the binding contract between a component and its clients. At its core, an interface is a collection of semantically similar methods or functions accessed through a vtable pointer at runtime.
An interface itself does not have any functionality. It merely points to the implementation. Interfaces can be reused (polymorphically or with inheritance) across components and upgraded. New methods can be added, but old ones should always remain.
When designing interfaces, keep in mind that all the methods contained in an interface should have something in common. There is no limit to how many methods an interface supports, but a good rule of thumb is to keep the number below 10. This keeps the interface from becoming unmanageable and monolithic, defeating the purpose of COM+.
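In plain C++, an interface of this kind is an abstract class made up of pure virtual functions, which the compiler dispatches through a vtable. The sketch below is not COM+ (there is no IUnknown or QueryInterface), and the names ICalculator, ICalculator2, and Calculator are hypothetical. It shows one common way to add methods while leaving the old ones in place: derive a second interface from the first.

```cpp
// Version 1 of a hypothetical interface: behavior only, no state.
class ICalculator {
public:
    virtual ~ICalculator() = default;
    virtual int add(int a, int b) const = 0;
    virtual int subtract(int a, int b) const = 0;
};

// Version 2 adds a method; every version 1 method remains available.
class ICalculator2 : public ICalculator {
public:
    virtual int multiply(int a, int b) const = 0;
};

// The implementation and any private state stay hidden behind the
// interfaces.
class Calculator : public ICalculator2 {
public:
    int add(int a, int b) const override { return a + b; }
    int subtract(int a, int b) const override { return a - b; }
    int multiply(int a, int b) const override { return a * b; }
};

// A client written against version 1 keeps working with newer objects.
int legacyClient(const ICalculator& calc) {
    return calc.subtract(calc.add(2, 3), 1);
}
```

Note that both interfaces stay small and that their methods clearly belong together, in line with the rule of thumb above.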
Implement the Components
Finally, after you have created a conceptual model separated across tiers and interfaces, it is time to implement the behavior. Depending on the complexity of the application, this can be the quickest part of all. With the interfaces serving as a blueprint, and with the confidence of a sound design architecture, the requirements of the application can be implemented in code.
In applications, be they multitiered or monolithic, a whole slew of requirements impose constraints on the finished product. Sometimes, the requirements are conflicting or require bargaining tradeoffs. Size versus speed, for example, has historically been a major tradeoff facing software architects. With memory prices constantly dropping, this particular tradeoff is not as relevant today as it was a decade ago. Many new tradeoffs have emerged to take its place. These are discussed next.
Many real and measurable forces act on the overall shape of any modern-day multitiered architecture. The following are some of the most important:
It is unrealistic to assume new projects will use all the latest high-tech tools, programming languages, and server-side products, setting aside all existing technology investments. In the real world, significant resources have been invested into software development; hardware, software, training, and skyrocketing IS salaries all add up to a huge expense for any corporation.
Fortunately, programming with the .NET platform does not require a major IS overhaul. It is possible to develop successful .NET-based applications inexpensively and without having to throw in all the bells and whistles Microsoft has to offer. For example, expensive database products such as SQL Server 2000 might not be necessary if a corporation already owns another database.
Software projects are consistently late and over budget. There are many reasons for this, but implementation and integration issues always seem to emerge as primary culprits. I'm not offering any solutions; I'm actually looking for them. If I find them, I'll broadcast them loud and clear.
In today's world of distributed computing and Internet applications, memory is cheap. Reliability, not size, matters. But reliability alone is not sufficient to justify an application to a modern user. As a developer, you must aim for several key intangibles. The following list includes the most important ones:
How effective is a software system if it cannot be upgraded or evolved over time? If there is one lesson we have learned from the Y2K problem, it is that computer programmers should be more judicious and critical of the assumptions they make about the future. Foresight and flexibility during design and implementation will be greatly rewarded. A narrow and specific solution usually must be rewritten several times as new features are requested and unexpected requirements arise.
Rather than attack the problem in the particular, strive to solve the problem in general with additional rules for the specific problem at hand. All too often, developers come back to old code and wish they'd put a little extra effort into decoupling a large procedure or function. Having to spend valuable time in a partial or complete rewrite is no fun.
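One way to solve a problem in general and then layer on rules for the specific case is to pass the rules in as data. The sketch below does this for a made-up pricing problem; PriceRule and applyRules are hypothetical names, and pricing is merely a convenient example.

```cpp
#include <functional>
#include <vector>

// A pricing rule takes the running price and returns an adjusted price.
using PriceRule = std::function<double(double)>;

// General solution: apply whatever rules are supplied, in order.
double applyRules(double basePrice, const std::vector<PriceRule>& rules) {
    double price = basePrice;
    for (const PriceRule& rule : rules) price = rule(price);
    return price;
}
```

A half-price sale followed by a flat shipping charge then becomes two small rules pushed onto the vector, and a new requirement next year means adding a rule rather than rewriting applyRules.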