Web Site Topologies


Now that we've discussed all the parts, let's put them together to build a web site. High-volume, high-availability web sites deploy across multiple machines and processes for both reliability and workload management purposes. As the request volume increases over time, the web site grows (scales) by adding more resources to meet the demand. Successful scaling requires planning and forethought. A good web site configuration maximizes flexibility while avoiding single points of failure. (A component is a single point of failure if its outage results in most or all of the web site becoming unavailable.)

Most large web sites plan for scalability and redundancy by isolating key software components to dedicated machines. While you might find it useful to run an HTTP server, an application server, and a database on one machine for development purposes, most web sites give each of these components a dedicated machine in production. Splitting these functions across their own servers delivers several benefits:

  • The applications do not contend with each other for system resources.

  • Each application runs on a machine tuned to its particular performance needs. For example, a database server may require faster DASD (disk storage) than an application server.

  • The web site gains configuration flexibility. You may place the HTTP server inside the DMZ, while moving the application server and database behind the DMZ for additional security. Many companies also place business databases within their own firewalled networks to protect them from unauthorized internal access.

Of course, exceptions to this rule exist. Some companies prefer to run a few large server machines with multiple web site components sharing each machine. This approach requires a good understanding of the performance characteristics of each application sharing the box, or an unusually big box. In fact, running web sites on mainframe equipment is becoming increasingly popular for many reasons. However, that's a topic for another book.

Most web sites set up a basic topology similar to the one shown in Figure 3.2. A firewall restricts access to the DMZ containing the HTTP server, and another firewall restricts intranet access to only those machines with network addresses within the DMZ. Sometimes, as noted above, another firewall exists between the application server and the database server for additional security. The HTTP server delivers static content from within the DMZ and passes dynamic requests to the application server behind the second firewall. This very flexible configuration allows specific server tuning for maximum throughput based on the type of content each machine handles. Likewise, none of the web site components contends with another application for CPU, memory, or other machine resources. Assuming sufficient network capacity and proper component tuning, nothing hinders this configuration from achieving top performance.

Despite these positives, this setup is not without problems. If any of the servers involved fails, the web site cannot deliver dynamic content. Likewise, if we want to upgrade some software or add memory to a server, the web site becomes unavailable until we finish the upgrade.

Perhaps we need more capacity to handle increasing request volumes; how do we increase the web site's capacity? Duplicating web site resources eliminates potential failure points and allows us to add capacity as required to meet growing demand. Let's cover two of the most common ways to duplicate resources and grow a web site: vertical and horizontal scaling.

Vertical Scaling

As we discussed in Chapter 1, one type of vertical scaling increases processes to better utilize the web site's hardware. To scale vertically, we create more application server instances on a single machine. (These instances may contain web containers, EJB containers, or both, as required.) Figure 3.12 shows a web site using two application server instances on one server machine. The plug-in provided by the application server manages the flow of requests to the two instances (sometimes called clones), including any affinity routing required by the traffic. In this case, the plug-in acts as a load balancer for the two application server instances. Vertical scaling also provides limited failure protection for the web site. If a software bug causes a failure in one of the instances, the other instance remains available to serve requests. Obviously, if the server machine experiences a failure, both instances also fail.

Figure 3.12. Vertical scaling of the application server

graphics/03fig12.gif
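The plug-in's rotation across clones can be sketched as simple round-robin dispatch. This is an illustrative sketch only, not the vendor's actual plug-in code; the class name and the instance addresses are assumptions:

```java
import java.util.List;

// Sketch of round-robin dispatch across application server instances
// (clones) on one machine. The plug-in itself is vendor-specific; this
// just illustrates the routing idea.
public class RoundRobinDispatcher {
    private final List<String> instances; // e.g., "localhost:9081", "localhost:9082"
    private int next = 0;

    public RoundRobinDispatcher(List<String> instances) {
        this.instances = instances;
    }

    // Each call returns the next clone in rotation.
    public synchronized String route() {
        String target = instances.get(next);
        next = (next + 1) % instances.size();
        return target;
    }
}
```

With two instances configured, successive requests alternate between them, which is how the plug-in spreads load when no affinity is required.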

This type of vertical scaling provides performance benefits only if the server machine has enough capacity for both instances. If a single application server instance drives your CPU to 90% utilization, you gain nothing by burdening the machine with another instance. In this case, the instances contend for the CPU, which usually results in worse performance. For very large machines (12 processors or more), vertical scaling allows the web site to better utilize the box's processing power. Again, a single JVM cannot fully utilize the processing power of a very large server machine. However, with vertical cloning, multiple JVMs (in the form of application server instances) obtain higher CPU utilization. As long as other resources, such as memory and NIC capacity, remain unconstrained, vertical scaling on large machines improves performance.

As we mentioned earlier, vertical scaling rarely provides perfect scaling. Adding two application server instances seldom produces a perfect doubling of throughput. Vertical scaling allows you to get more out of your large server boxes, but it may not allow you to exercise their full potential, because of machine resource limitations and hardware architecture issues. In many cases, we need more hardware to grow our web site.

Horizontal Scaling

Horizontal scaling increases web site hardware and the processes executing on the hardware. Scaling usually adds similar servers with similar configurations (operating systems, installed applications, and so on) to the web site. Most large web sites try to keep their servers and server configuration homogeneous throughout. This reduces maintenance overhead and simplifies capacity planning.

Figure 3.13 gives an example of a web site implementing horizontal scaling. The site uses multiple HTTP server machines and application server machines to support the traffic volume. The load balancer provides an entry point to the web site, and distributes the requests across the HTTP servers. The end user never knows that multiple machines or servers exist behind the web site's URL; the load balancer hides the complexity of the web site. Each application server in this diagram runs one or more application server instances. Some web sites combine vertical and horizontal scaling by using multiple server machines and multiple application server instances on each machine. If this web site requires affinity routing, a combination of load balancer affinity and plug-in affinity makes it work. The load balancer routes consistently to the same HTTP server, and the plug-in routes consistently to the same application server instance. Some plug-ins, however, automatically route to the right application server instance and thus do not require load balancer affinity.

Figure 3.13. A small web site using horizontal scaling

graphics/03fig13.gif
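The affinity routing described above can be sketched as a deterministic mapping from session ID to instance: the same session always lands on the same application server instance, so any session state cached there stays reachable. The hashing scheme and names below are illustrative assumptions, not the actual plug-in algorithm:

```java
import java.util.List;

// Sketch of affinity ("sticky") routing: a given session ID always maps
// to the same application server instance. Real plug-ins typically encode
// the instance identity in the session cookie instead of hashing.
public class AffinityRouter {
    private final List<String> instances;

    public AffinityRouter(List<String> instances) {
        this.instances = instances;
    }

    // floorMod keeps the index non-negative even for negative hash codes.
    public String route(String sessionId) {
        int index = Math.floorMod(sessionId.hashCode(), instances.size());
        return instances.get(index);
    }
}
```

Because the mapping is deterministic, every tier that applies it (load balancer or plug-in) sends a session's requests down the same path.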

We refer to a group of server machines as a cluster. The machines in a cluster, as well as the software running on them, work together to support a seamless web site visit for the end user. The small cluster pictured in Figure 3.13 includes duplicate load balancers and databases to provide for failover. If one machine or service in this cluster fails, another takes its place to avoid interrupting service to the end user. Ideally, failover happens instantaneously, but in practice all but the most sophisticated systems lose some requests during the transition.

Note, however, that failover only works if the remaining machines and software contain enough capacity to cover the work of the failed component. If all your application servers routinely run at 100% CPU utilization, a failure of one application server machine leaves insufficient capacity to support your total traffic volume at the same level of performance. Failover preparedness requires extra capacity: You need at least one additional machine beyond your capacity projections for each web site component. Keep your network infrastructure in mind as you grow your web site, and add the necessary capacity to support your expansion. Also, don't forget the impact of the expansion on back-end servers, such as your database. More web site capacity means more database activity, so give the database the capacity it needs to handle web application requests.
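The "at least one additional machine" rule amounts to N+1 sizing: compute how many machines the peak load needs, then add one spare. A minimal sketch, where the request rates and per-machine capacity figures are hypothetical:

```java
// Sketch of N+1 capacity sizing for a web site tier: size the tier for
// peak load, then add one machine so a single failure (or a maintenance
// outage) does not leave the tier short of capacity.
public class CapacityPlanner {
    public static int machinesNeeded(double peakRequestsPerSec,
                                     double perMachineCapacity) {
        // Machines required to carry the peak load...
        int forLoad = (int) Math.ceil(peakRequestsPerSec / perMachineCapacity);
        // ...plus one extra for failover.
        return forLoad + 1;
    }
}
```

For example, a peak of 900 requests per second against machines rated at 400 requests per second each needs three machines for the load, so the tier should run four.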

Horizontal scaling typically produces near linear scaling. That is, doubling the web site's hardware usually yields close to double the throughput. Of course, we base this statement on some very important assumptions:

  • Other web site systems (network, database, and so on) have enough capacity to support the expanded web site. Otherwise, increase the capacity of these systems to prevent bottlenecks.

  • Your web applications scale. Some web application designs prohibit growing the application across multiple application server instances. Test your application in a clustered environment prior to growing the production web site.
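One common design that prohibits clustering is state held in ordinary object fields inside a single JVM. In the sketch below, each `HitCounter` stands in for one application server instance; the class and method names are illustrative, not from the book:

```java
// Sketch of why JVM-local state breaks horizontal scaling: each instance
// below represents one application server JVM. A count kept in an instance
// field is visible only to the JVM that recorded it, so no single instance
// ever sees the site-wide total.
public class HitCounter {
    private int hits = 0;

    // Records a request handled by this instance only.
    public int recordHit() {
        return ++hits;
    }

    public int getHits() {
        return hits;
    }
}
```

In a cluster, such a count belongs in a shared store (a database or distributed cache) rather than in JVM memory; otherwise each instance reports only its own fraction of the traffic.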

We've kept our web site purposefully simple for this discussion. Very large sites often develop very complex topologies. Within large corporations, the corporate web site often consists of many smaller web sites tied together through a series of load balancers and routers. Growing these web sites, or even adding new features, sometimes proves very difficult.

Choosing between a Few Big Machines or Many Smaller Machines

Web site teams frequently struggle with hardware selection. A common debate centers on the merits of operating the web site on a few very large servers versus using many small servers. Let's consider a few guidelines to help you with this decision.

Get More of What You Already Have

If your production environment already uses large machines, and your staff knows how to install, configure, and maintain these boxes, then add additional machines of this caliber. Staffing and training costs usually outweigh any other considerations.

More Machines Means More Labor

Smaller machines cost less, but dozens of small servers require more administration than a single large machine. This cost difference often dwarfs the equipment savings for large installations. Consider additional staffing as part of your analysis.

Machine Selection Impacts Licensing Fees

Software pricing sometimes drives equipment purchase decisions. Some software packages charge per CPU, while others charge per installation. Depending on the licensing fees involved, picking a machine to minimize licensing fees sometimes generates significant savings.

Don't Forget Failover Costs

A web site usually needs at least three application server machines. Having only two machines leaves a thin margin for failover, particularly if you take one server down for maintenance or upgrades. Two large machines often handle normal loads just fine but create tricky maintenance problems, especially for round-the-clock web sites.

Best Practices

So, what do we recommend? A configuration similar to that shown in Figure 3.13 is a good start. Each web site component receives a dedicated machine, giving us more flexibility. As we stated earlier, most web sites configure a minimum of two machines for each major component (HTTP server and application server). Three machines per component give you some buffer to perform maintenance and upgrades while still providing failover capacity. Of course, you need to weigh availability requirements against the additional costs. Many web sites need far more than two or three machines to support their traffic volumes. Regardless of your capacity sizing, consider additional capacity for failover and maintenance concerns.

Earlier we discussed proxy servers. If your web site serves predominantly static content, adding a proxy server to your site may give you tremendous performance and capacity benefits. Place the proxy server as close to the users as possible, preferably just inside the first (outer) firewall. Proxy servers reduce the traffic within the interior of your web site and reduce the burden on your HTTP servers.

Avoid using your application server as a static content server. We realize this advice is a bit controversial, given the new J2EE packaging schemes (.ear and .war files); however, the HTTP server remains the best component for serving static content. Serving static content from the HTTP server places the content closer to the requester and reduces the burden on the application server. Also, modern HTTP servers deliver static content much faster than an application server. Most web pages contain many static elements; optimizing their delivery improves overall web site performance.

Some sites direct static content requests to HTTP servers dedicated to serving static content. For instance, requests to the main web site have the form

http://www.enterprise.com/myservlet,

while requests for graphics have the form

http://graphics.enterprise.com/mystuff.gif.

This allows a completely different set of HTTP servers to satisfy these requests and reduces the contention for listeners on the HTTP servers supporting requests for dynamic pages. Also, because the HTTP servers that handle only static content don't require an application server plug-in, they respond even faster.
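The host split can be implemented by generating page links that point image references at the dedicated static-content host. A minimal sketch using the example host names above; the rewriting rule itself (keying on the `.gif` extension) is an illustrative assumption:

```java
// Sketch of splitting static and dynamic requests across hosts: image
// references are pointed at a dedicated static-content HTTP server farm,
// while everything else goes to the main site. Host names follow the
// example URLs in the text.
public class StaticHostRewriter {
    private static final String STATIC_HOST = "http://graphics.enterprise.com";
    private static final String MAIN_HOST = "http://www.enterprise.com";

    // Send .gif requests to the static-content servers;
    // all other paths go to the main (dynamic) site.
    public static String rewrite(String path) {
        if (path.endsWith(".gif")) {
            return STATIC_HOST + path;
        }
        return MAIN_HOST + path;
    }
}
```

Because the split happens in the URLs the browser receives, no extra routing logic is needed at the HTTP servers themselves; DNS directs each host name to its own server pool.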

Finally, we want to show you a configuration using EJBs in addition to servlets and JSPs. Some enterprises implement their business logic solely in EJBs, with the view and a thin controller layer in the presentation tier, deployed in the DMZ. This works especially well when both internal and Internet applications share some business logic. Figure 3.14 shows such a configuration, where the Internet applications reside in the DMZ and connect to EJB containers behind the DMZ. Of course, for security purposes, both the application servers and the EJB components often lie behind the DMZ, leaving only the HTTP server between the firewalls.

Figure 3.14. A DMZ configuration with servlets, JSPs, and EJBs

graphics/03fig14.gif



Performance Analysis for Java Web Sites
ISBN: 0201844540
Year: 2001
Pages: 126
