Web Service Delivery Architecture


Figure 3-1 shows a typical architecture for delivery of services on the Web.

Figure 3-1. Web Service Delivery Architecture


This server farm uses the three-tier application model that is typical of large-scale systems. The three tiers are as follows:

  • Web servers, which maintain connections with client browsers and other client devices, parse and handle input from them, format data to be sent to them, serve unchanging (static) web pages, and are often responsible for maintaining transaction context.

  • Application servers, which run the major transaction and dynamic web page generation systems, as well as any specialized applications for the end users. They often run specialized transaction-processing operating systems that simplify programming for scalability and availability.

  • Database servers, which handle the large back-end databases needed by larger Web systems.

Because the three tiers are loosely coupled, each tier can grow independently of the others, and interconnections can be used to increase availability.
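That loose coupling can be pictured as three narrow interfaces stacked on top of one another. The following is a minimal Python sketch of the division of labor; all class and method names, and the tiny catalog it queries, are hypothetical illustrations rather than any particular product:

```python
# Minimal sketch of the three-tier division of labor. Names and data
# are invented for illustration only.

class DatabaseTier:
    """Back-end database servers: hold the large persistent data sets."""
    def __init__(self):
        self._catalog = {"sku-100": {"name": "widget", "price": 9.99}}

    def query(self, sku):
        return self._catalog.get(sku)


class ApplicationTier:
    """Application servers: run transactions and generate dynamic content."""
    def __init__(self, database):
        self.database = database   # loose coupling: only the interface matters

    def price_quote(self, sku):
        item = self.database.query(sku)
        return None if item is None else f"{item['name']}: ${item['price']:.2f}"


class WebTier:
    """Web servers: parse client input, format output, serve static pages."""
    def __init__(self, application):
        self.application = application

    def handle_request(self, path):
        if path.startswith("/quote/"):
            quote = self.application.price_quote(path.split("/")[-1])
            return ("200 OK", quote) if quote else ("404 Not Found", "")
        return ("200 OK", "<html>static page</html>")


# Because each tier talks to the next only through a narrow interface,
# any one tier can be replicated behind its own distributor without
# changing the other two.
web = WebTier(ApplicationTier(DatabaseTier()))
print(web.handle_request("/quote/sku-100"))
```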

Above the three server tiers in Figure 3-1 are the load distributor, which distributes incoming requests among the web servers, the firewall, and the Internet access router.
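The text does not specify how the load distributor chooses a web server; simple round-robin rotation is one common policy, and a minimal sketch of it (with hypothetical server names) looks like this:

```python
from itertools import cycle

# Minimal sketch of round-robin request distribution across the web tier.
# Round-robin is only one possible policy; real load distributors also
# consider server health and current load. Server names are hypothetical.
web_servers = ["web-1.example.com", "web-2.example.com", "web-3.example.com"]
_rotation = cycle(web_servers)

def distribute(request_id):
    """Assign an incoming request to the next web server in rotation."""
    return request_id, next(_rotation)

for req in range(5):
    print(distribute(req))   # requests 0..4 wrap around the three servers
```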

In Figure 3-1, the primary server farm is multi-homed; it's connected to two different Internet Service Providers (ISPs) to increase availability. The primary server farm usually also includes ancillary devices, such as the authoritative Domain Name System (DNS) server, which provides the key records for mapping the site's Internet host names to Internet numeric addresses, and server-side caches, which can be used to relieve the serving systems of highly repetitive work by storing the results of commonly repeated requests. The end user's Quality of Experience (QoE) depends on much more than the primary server farm's performance, however. Multiple server farms, caching devices, content distribution networks, third-party content providers, and the DNS may also be involved.
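A server-side cache of the kind just described can be as simple as storing responses keyed by the request and expiring them after a short time. The sketch below assumes a 30-second lifetime and a request-string key purely for illustration:

```python
import time

# Minimal sketch of a server-side response cache for highly repetitive
# requests. The TTL and the request-string key are illustrative assumptions.
CACHE_TTL_SECONDS = 30
_cache = {}   # request -> (expiry_time, response)

def expensive_render(request):
    """Stand-in for the real dynamic-page generation on the app servers."""
    return f"<html>result for {request}</html>"

def cached_response(request):
    now = time.monotonic()
    hit = _cache.get(request)
    if hit is not None and hit[0] > now:
        return hit[1]                       # served from cache: no back-end work
    response = expensive_render(request)
    _cache[request] = (now + CACHE_TTL_SECONDS, response)
    return response
```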

Most large systems rely, often indirectly, on multiple, distributed server farms. Some enterprises have multiple locations from which they provide their basic content, and they use geographic distribution technologies to try to direct end users to the server farm that will respond the fastest. For example, it's impossible to deliver rapid web page downloads in Asia from a server system in New York City; enterprises that have a large user base in Asia must, therefore, have some server systems on that side of the Pacific. Geographic distribution is critical to providing good QoE, though it's difficult to locate an end user with great precision by using that end user's Internet address. Obtaining detailed knowledge of location, while very important for some applications and for some performance situations, can be quite tricky.
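One common way to implement that steering is to map the client's address to a coarse region and answer with the regional farm. The sketch below uses invented documentation prefixes and farm names, and deliberately ignores the real-world imprecision of address-to-location mapping noted above:

```python
import ipaddress

# Minimal sketch of steering a client toward a regional server farm based
# on its source address. Prefixes, regions, and farm names are invented;
# real geo-IP data is far larger and not always precise.
REGION_PREFIXES = {
    "asia": [ipaddress.ip_network("203.0.113.0/24")],
    "north_america": [ipaddress.ip_network("198.51.100.0/24")],
}
REGIONAL_FARMS = {
    "asia": "farm-tokyo.example.com",
    "north_america": "farm-nyc.example.com",
}
DEFAULT_FARM = "farm-nyc.example.com"

def choose_farm(client_ip):
    addr = ipaddress.ip_address(client_ip)
    for region, prefixes in REGION_PREFIXES.items():
        if any(addr in prefix for prefix in prefixes):
            return REGIONAL_FARMS[region]
    return DEFAULT_FARM        # unknown location: fall back to a default farm

print(choose_farm("203.0.113.42"))   # -> farm-tokyo.example.com
```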

Caching devices are used to store frequently requested data inside the network, at the server location, or within the end user's local network to decrease both network traffic and the time needed to locate and display data. These devices are often provided free of charge for web sites to use, but configuring web pages for use with remote caching can be complex. Precisely evaluating the QoE that caching produces at an end user's location requires remote measurement facilities.
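Much of that page configuration amounts to choosing appropriate HTTP caching headers for each kind of response. A minimal sketch using Python's standard http.server follows; the max-age values are illustrative assumptions, not recommendations:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Minimal sketch of marking responses as cacheable (or not) so that
# network and local caches know what they may store.
class CacheAwareHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path.startswith("/static/"):
            body = b"<html>static page</html>"
            cache_control = "public, max-age=86400"    # safe to cache for a day
        else:
            body = b"<html>dynamic page</html>"
            cache_control = "no-store"                 # never cache per-user pages
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Cache-Control", cache_control)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), CacheAwareHandler).serve_forever()
```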

A Content Distribution Network (CDN) is a service that uses a large network of remote caches to provide much more control of caching than is available using free caching. A CDN can provide prepositioning of content such as a major advertising campaign; it also provides the ability to cache download files and streaming media, which are usually not stored by public caches. A CDN gives the content owner direct, immediate control over remotely cached content. A CDN can also supply differentiated content to end users, based on their location.
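The extra control a CDN offers can be pictured as an edge cache that the content owner can populate and purge explicitly, and whose keys include the end user's region. The following is a sketch under those assumptions, with hypothetical class and method names:

```python
# Minimal sketch of CDN-style control: explicit prepositioning and purging
# of edge copies, and content varied by the requester's region.
class EdgeCache:
    def __init__(self, origin_fetch):
        self.origin_fetch = origin_fetch   # callable: (path, region) -> content
        self.store = {}                    # (path, region) -> content

    def preposition(self, path, region, content):
        """Push content to the edge ahead of demand (e.g., a new campaign)."""
        self.store[(path, region)] = content

    def purge(self, path):
        """Immediately remove every regional copy of a path."""
        self.store = {k: v for k, v in self.store.items() if k[0] != path}

    def get(self, path, region):
        key = (path, region)
        if key not in self.store:
            self.store[key] = self.origin_fetch(path, region)   # pull on miss
        return self.store[key]

edge = EdgeCache(lambda path, region: f"{path} rendered for {region}")
edge.preposition("/campaign", "asia", "campaign page, Asian edition")
print(edge.get("/campaign", "asia"))
```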

Most web sites use third-party content providers for some advertising or even basic site content. Many stock-trading sites, for example, use a third-party provider for stock price graphs that are visually embedded in their web pages. Even though such content comes from third-party providers, the end user usually does not realize that it originates from different sites. If there are performance problems, the site owner is blamed, not the third-party provider.

Finally, the end user can't even find the web site if there are problems with the performance of the DNS. DNS is a worldwide hierarchy of server systems configured as a distributed directory; to resolve a name, it must reach the web site's authoritative record and interpret that (often complex) record correctly. DNS information can then be cached in the DNS's own dedicated system of distributed caching servers, with some control from the web site's owner. Without measurement from end-user locations, problems with the DNS often go undetected until irate end users call to complain that the site is offline. The site may be completely accessible from the site owner's intranet, but completely inaccessible from large areas of the Internet.
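A first-order check of DNS health from an end-user location is simply to time name resolution itself. A minimal sketch using the standard library follows; the host name is a placeholder, and the lookup uses whatever resolver (and resolver caches) the local system is configured with, which is exactly what an end user would experience:

```python
import socket
import time

# Minimal sketch of timing DNS resolution from a measurement point.
# "www.example.com" is a placeholder host name.
def time_dns_lookup(hostname):
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(hostname, None)
        elapsed = time.monotonic() - start
        addresses = sorted({info[4][0] for info in infos})
        return elapsed, addresses
    except socket.gaierror as exc:
        return time.monotonic() - start, f"lookup failed: {exc}"

print(time_dns_lookup("www.example.com"))
```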

All the traffic between the end user and the various server farms, caching devices, content distribution networks, and DNS servers travels over the Internet's complex mesh of backbones and peering points, which are the locations at which different organizations interconnect their backbones. The routing tables, used to direct Internet traffic, are so complex that the routing software cannot consider fluctuations in transit time when making routing decisions. If it did, the Internet would be saturated by routing table update messages, and the router CPUs would be saturated by the calculations required. The result is that routing through the Internet is often suboptimal, and traffic often heads for congested areas and peering points instead of traveling around them.

Routers do attempt to reroute around failed pieces of the Internet; they just don't usually reroute around congested pieces. Delays can build quickly at congestion points, and packets can be lost or duplicated as routers try to recover from their problems. To add to the complexity, the route between any pair of endpoints is almost always different in the two directions.

For all Internet and Web situations, you can see that measurement of the performance as seen by the end user must be available to detect QoE problems occurring in the complex Web-serving systems. Those measurements must also be quickly available and must be credible; otherwise, the web site's owner won't be able to use them to get an Internet service provider to fix a problem. Of course, some problems, such as a very localized difficulty with an ISP's bank of dial-in modems, are beyond the scope of responsibility of a web site's owner, even though that web site's availability appears to be affected. In such cases, which occur constantly, measurements from standard locations and the use of public performance index measurements can be used to reassure management (and the end user, if necessary) that the problem is a local one and is beyond the direct responsibility of the owners or operators of the web site.
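At its simplest, such an end-user measurement only needs to time a complete fetch, from name lookup through the last byte of the response. The sketch below uses the standard library; the URL is a placeholder, and a real measurement service would also fetch embedded objects and record response codes and the probe's location:

```python
import time
import urllib.request

# Minimal sketch of an active end-user measurement: time a complete page
# fetch from the location where the probe runs.
def probe(url, timeout=10):
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read()
        return {"url": url, "ok": True,
                "seconds": time.monotonic() - start, "bytes": len(body)}
    except Exception as exc:
        return {"url": url, "ok": False,
                "seconds": time.monotonic() - start, "error": str(exc)}

print(probe("http://www.example.com/"))
```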



