Lesson 1: Introduction to Capacity Planning

The goal of any Web site is to present users with a high-quality experience. When users face slow response times, timeouts and errors, and broken links, they become frustrated and often turn to other sites to find what they’re looking for. To prevent this, you must provide an infrastructure that can handle not only average levels of demand but also peak levels and beyond. Capacity planning makes it possible to calculate how much hardware is necessary to meet users’ demands. These calculations allow you to identify bottlenecks in the network design that can cause performance degradation and lead to poor user experiences. You can then modify the design or implement the changes needed to eliminate these bottlenecks. However, before you can plan your network’s capacity requirements, you must be familiar with several basic concepts that are important to your understanding of the issues involved in capacity planning. This lesson introduces you to these concepts by describing how network traffic, performance, availability, and scalability are all factors in the capacity planning process.

After this lesson, you will be able to

  • Identify and describe issues that are relevant to capacity planning

Estimated lesson time: 25 minutes

Traffic

When a browser requests information from a Web server, it first establishes a Transmission Control Protocol (TCP) connection with the server. The browser then sends its request over the connection, and the server sends pages in response. This interchange of incoming requests and outgoing responses is referred to as traffic.

Traffic is only partly predictable. It tends to occur in bursts and clumps. For example, many sites might experience activity peaks at the beginning and end of the workday but have lower levels of activity in the middle of the day. In addition, the size of these peaks might vary from day to day. Not surprisingly, a direct relationship exists between the amount of traffic and the network bandwidth necessary to support that traffic. The more visitors to your site and the larger the pages provided by the server, the more network bandwidth is required.

Suppose your Web site is connected to the Internet through a DS1/T1 line, which can transmit data at 1.536 megabits per second (Mbps). The Web server serves static, text-only Hypertext Markup Language (HTML) pages that average 5 kilobytes (KB) each. Each transmission carries packet overhead that contains header information. For a 5-KB file, the overhead might average 30 percent of the file’s size; for larger files, the overhead accounts for a smaller percentage of the network traffic.

Figure 7.1 illustrates how a load is distributed over a T1 line for a 5-KB page request.

Figure 7.1 - Transmitting a 5-KB page over a T1 line

Table 7.1 shows the traffic generated by a typical request for a 5-KB page. Many of the figures are only estimates. The exact number of bytes sent varies for each request.

Table 7.1 Traffic Generated by 5-KB Page

Traffic Type        Bytes Sent
TCP connection      180 (approximately)
GET request         256 (approximately)
5-KB file           5,120
Protocol overhead   1,364 (approximately)
TOTAL               6,920 (55,360 bits)

The standard transmission rate for a T1 line is 1.544 Mbps. Under normal conditions, the available bandwidth is 1.536 Mbps, or 1,536,000 bits per second. (The available bandwidth is reduced because 8,000 bits per second are used to frame the bit stream.) To determine the maximum rate of pages per second that your network can support, divide the bits per second of the T1 line by the bits generated by a 5-KB page request. In this case you’d divide 1,536,000 by 55,360, which comes to a maximum transmission rate of 27.7 pages per second.
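
To make the arithmetic reusable, here is a minimal Python sketch of that calculation (the function name is mine; the constants come from Table 7.1 and the T1 figures above):

    # Maximum page rate = available link bandwidth / bits per page request.

    T1_AVAILABLE_BPS = 1_536_000   # 1.544 Mbps minus 8 Kbps of framing
    BITS_PER_PAGE = 6_920 * 8      # 6,920 bytes per 5-KB page request (Table 7.1) = 55,360 bits

    def max_pages_per_second(link_bps: float, bits_per_page: int) -> float:
        """Upper bound on pages served per second, ignoring latency and processing."""
        return link_bps / bits_per_page

    print(f"{max_pages_per_second(T1_AVAILABLE_BPS, BITS_PER_PAGE):.1f} pages per second")  # 27.7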

Of course, if you add graphics to your files, the results will be considerably different. In addition, if each page has several images, if the images are relatively large, or if the pages contain other multimedia content, download time will be much longer.

Given a page of moderate size and complexity, there are several ways to serve more pages per second:

  • Remove some images from the page.
  • Use smaller pictures if you currently send large ones or compress the existing pictures. (If they’re already compressed, compress them further.)
  • Offer reduced-size images with links to larger ones and let the user choose.
  • Use images of a file type that’s inherently compact, such as a .gif or a .jpg file, to replace inherently large file types, such as a .tif file.
  • Connect to the network by using a faster interface or by using multiple interfaces. Note that this option resolves the issue at the server but not necessarily at the client.

A site that serves primarily static HTML pages, especially those with simple structure, is likely to run out of network bandwidth before it runs out of processing power. On the other hand, a site that performs a lot of dynamic page generation or that acts as a transaction or database server uses more processor cycles and can create bottlenecks in its processor, memory, disk, or network.

Client-Side Network Capacity

Server-side network capacity isn’t the only factor to consider when determining bandwidth limitations. The client computer is limited by its connection to the Internet. It can take a considerable amount of time, relative to a server’s output, for a browser to download a page.

Suppose you want to download a page that, including overhead, totals about 90 KB (about 720 kilobits). Ignoring latencies, which typically add a few seconds before any of the data arrives, it takes roughly 25 seconds to download 720 kilobits through a 28.8 kilobits-per-second (Kbps) connection if everything is working perfectly. Figure 7.2 illustrates the process of downloading the 90-KB page.
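
Figures like this are easy to check by dividing the page size in bits by the connection speed. The following illustrative Python sketch, using the 720-kilobit page from the text, does so for a few common client speeds:

    # Best-case download time = page size in bits / connection speed.
    # Latency is ignored, as in the text.

    PAGE_BITS = 720_000   # 90-KB page including overhead ("about 720 kilobits")

    def download_seconds(page_bits: int, link_bps: float) -> float:
        """Seconds to transfer page_bits over a link of link_bps, best case."""
        return page_bits / link_bps

    for speed_bps in (28_800, 56_000, 128_000, 640_000):
        print(f"{speed_bps / 1000:6.1f} Kbps: {download_seconds(PAGE_BITS, speed_bps):5.1f} seconds")
    # 28.8 Kbps takes 25.0 seconds; faster links cut the time proportionally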

If any blocking or bottlenecking is going on at the server, if the network is overloaded and slow, or if the user’s connection is slower than 28.8 Kbps, the download will take longer. For example, a poor phone line connection can affect the download rate.

If the client computer has a higher-bandwidth connection on an intranet, the download time should be much shorter. If your Web site is on the Internet, however, you can’t count on a majority of users having faster connections until the next wave of connection technology becomes well established. Although many users now use 56 Kbps modems, many (if not most) telephone lines are too noisy to allow full-speed connections with 56 Kbps modems. In some areas cable modem and Digital Subscriber Line (DSL) technologies are being used extensively. For these reasons, you can’t predict which connection speeds users will have when they visit your site.

Figure 7.2 - Downloading a 90-KB page (including overhead) through a 28.8 Kbps modem

Table 7.2 lists the relative speeds of several network interface types, based on a 5-KB text-only page. Numbers of pages transmitted at speeds faster than the standard Ethernet rate of 10 Mbps are rounded.

Table 7.2 Network Interface Speeds

Connection Type                                   Connection Speed       5-KB Pages Sent per Second
Dedicated Point-to-Point Protocol/Serial Line
Internet Protocol (PPP/SLIP) using a modem        28.8 Kbps              Roughly half of 1 page
Frame Relay or fast modem                         56 Kbps                Almost 1 page
Integrated Services Digital Network (ISDN)        128 Kbps               Just over 2 pages
Typical digital subscriber line (DSL)             640 Kbps               Almost 11 pages
Digital signal level 1 (DS1)/T1                   1.536 Mbps             26 pages
10-Mb Ethernet                                    8 Mbps (best case)     (Up to) 136 pages
Digital signal level 3 (DS3)/T3                   44.736 Mbps            760 pages
Optical carrier 1 (OC1)                           51.84 Mbps             880 pages
100-Mb Ethernet                                   80 Mbps (best case)    (Up to) 1,360 pages
Optical carrier 3 (OC3)                           155.52 Mbps            2,650 pages
Optical carrier 12 (OC12)                         622.08 Mbps            10,580 pages
1-Gbps Ethernet                                   800 Mbps (best case)   (Up to) 13,600 pages
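
You can approximate the table’s last column by dividing each connection speed by the bits in a page, as in the hedged sketch below (the names are mine; using the 55,360-bit page from Table 7.1 yields figures slightly above the table’s rounded values, which appear to assume somewhat more per-page overhead):

    # Approximate Table 7.2: pages per second = connection speed / bits per page.
    # The 55,360-bit page comes from Table 7.1; results land slightly above the
    # table's rounded, more conservative figures.

    BITS_PER_PAGE = 55_360

    LINK_SPEEDS_BPS = {
        "28.8 Kbps modem":          28_800,
        "56 Kbps Frame Relay":      56_000,
        "128 Kbps ISDN":           128_000,
        "640 Kbps DSL":            640_000,
        "DS1/T1":                1_536_000,
        "10-Mb Ethernet (best)": 8_000_000,
        "DS3/T3":               44_736_000,
    }

    for name, bps in LINK_SPEEDS_BPS.items():
        print(f"{name:<24}{bps / BITS_PER_PAGE:10.1f} pages/sec")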

Server-Side Network Capacity

It takes about 53 connections at 28.8 Kbps to saturate a DS1/T1 line (1,536 Kbps ÷ 28.8 Kbps ≈ 53). If no more than 53 clients simultaneously request a 90-KB page (including overhead) and the server can keep up with the requests, all of the clients will receive the page in about 25 seconds (ignoring the typical delays).

If 100 clients simultaneously request that same page, however, the total number of bits to be transferred is 100 × 720,000, or 72 megabits. At 1.536 Mbps, it takes about 47 seconds (72,000,000 ÷ 1,536,000 ≈ 46.9) for that many bits to travel down a DS1/T1 line. At that point the server’s network connection, not the client’s, is the limiting factor.

Figure 7.3 shows the relationship between concurrent connections and saturation for DS1/T1 and DS3/T3 lines, assuming all clients are using a modem transmission speed of 28.8 Kbps and are always connected. A DS3/T3 line carries nearly 45 Mbps, about 30 times as much capacity as a DS1/T1 line, and it takes more than 1,500 clients at 28.8 Kbps to saturate its bandwidth. Moreover, the increase in download time for each new client is much smaller on a DS3/T3 line. When there are 2,000 simultaneous 28.8 Kbps connections, for example, it still takes less than 33 seconds for a client to download the page.

Figure 7.3 - Download time versus server network bandwidth

The data shown in Figure 7.3 assumes that the server is capable of the relevant processing and of handling 2,000 simultaneous connections. That’s not the same as handling 2,000 simultaneous users: users occasionally stop to read or think and typically spend only a modest percentage of their time downloading, except while receiving streaming multimedia content. Because of this difference between users and connections, the number of users that Internet Information Services (IIS) 5.0 can support is larger than the figures would seem to indicate. A Web server on a DS1/T1 line can typically handle several hundred users connecting at 28.8 Kbps, and with a DS3/T3 line the number typically climbs to 5,000 or more. Although these numbers are derived from actual servers, you can expect performance to vary with differing content types and user needs and with the number and type of services being performed by a particular computer.
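
The bandwidth sharing behind Figure 7.3 follows from the same arithmetic: once aggregate demand exceeds the line rate, each connection’s effective speed is the line rate divided by the number of connections. The following sketch (illustrative only; the function name is hypothetical) reproduces the figures above:

    # Per-client download time when the server's line, not the client modem,
    # may be the bottleneck: each client gets the slower of its own 28.8 Kbps
    # link and a fair share of the server's line.

    PAGE_BITS = 720_000   # 90-KB page including overhead

    def shared_download_seconds(clients: int, line_bps: float, client_bps: float = 28_800) -> float:
        """Best-case seconds per client under fair sharing of the server line."""
        effective_bps = min(client_bps, line_bps / clients)
        return PAGE_BITS / effective_bps

    print(f"T1,    53 clients: {shared_download_seconds(53, 1_536_000):.1f} s")     # ~25 s (client-limited)
    print(f"T1,   100 clients: {shared_download_seconds(100, 1_536_000):.1f} s")    # ~47 s (line-limited)
    print(f"T3, 2,000 clients: {shared_download_seconds(2000, 44_736_000):.1f} s")  # ~32 s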

Essentially, the network performance differences shown here scale linearly, and the scaling continues at larger data-transfer rates. If you have two DS3/T3 lines, for example, you can serve approximately twice as many clients as you can with one, provided that you have enough processor power to keep up with user demand and that no bottlenecks prevent your servers from maximizing their processing power.

Performance

The performance of Web applications is critical in determining the site’s capacity. Testing is the only way to find out a Web application’s capacity and performance. The Web Capacity Analysis Tool (WCAT) and Web Application Stress Tool (WAST) utilities (included on the Windows 2000 Server Resource Kit companion CD) are useful testing tools. Before writing an application, however, it’s useful to have a sense of the performance capabilities of different Web application types. In IIS 5.0, Internet Server Application Programming Interface (ISAPI) applications running as in-process dynamic-link libraries (DLLs) generally offer the best performance, followed by Active Server Pages (ASP) applications and then Common Gateway Interface (CGI) applications.

For most applications, the recommendation is to use scripting in ASP pages to call server-side components. This strategy offers performance comparable to that of ISAPI, with the advantage of more rapid development and easier maintenance.

Table 7.3 shows the results of performance tests run on a beta version of IIS 5.0. The application ran on uniprocessor and multiprocessor kernels for Secure Sockets Layer (SSL) connections and non-SSL connections. The hardware and software used for the tests are described below.

Table 7.3 Performance Testing of IIS 5.0

Test                      Non-SSL 1 CPU  Non-SSL 2 CPUs  Non-SSL 4 CPUs  SSL 1 CPU  SSL 2 CPUs  SSL 4 CPUs
ISAPI in-process                    517             723             953         50          79         113
ISAPI out-of-process                224             244             283         48          76          95
CGI                                  46              59              75         29          33          42
Static file (FILE8k.TXT)          1,109           1,748           2,242         48          80         108
ASP in-process                       60             107             153         38          59          83
ASP out-of-process                   50              82             109         28          43          56

The figures given in Table 7.3 are the actual numbers of pages per second that were served during testing. Each test repeatedly fetched the same 8-KB file. Note that different computer types will provide different performance for the same test. In addition, the performance of different application types depends greatly on the application’s task. For these tests the task was a relatively light load, so the differences among the various methods are maximized. Heavier tasks result in performance differences that are smaller than those reflected in the table.

The following hardware and software were used for the test:

  • The servers were Compaq Proliant 6500 (4 × Pentium Pro 200 MHz) computers with 512 MB of 60-nanosecond (ns) RAM.
  • The clients were Gateway Pentium II machines, 350 MHz with 64 MB RAM.
  • The network was configured as follows:

    • Clients used one Intel Pro100+ 10/100-Mbps network adapter card.
    • The server used four Intel Pro100+ 10/100-Mbps network adapter cards.
    • Four separate networks were created to distribute the workload evenly for the server, with four clients per network. Two Cisco Catalyst 2900 switches were used, each having two Virtual LANs (VLANs) programmed.

  • The following software was used:

    • The server was configured with Windows 2000 Advanced Server and IIS 5.0.
    • The clients were configured with Windows 2000 Professional.
    • WAST was used to test the site.

These tests were conducted with out-of-the-box computers and programs; no registry changes or other performance enhancements were applied.

Availability

Some sites can afford to fail or go offline; others can’t. Many financial institutions, for example, require 99.999 percent availability or better. Site availability takes two forms: the site must remain online if one of its servers crashes, and it must remain online while information is being updated or backed up. As discussed in earlier chapters, you can achieve site availability through the use of redundant services, components, and network connections. For example, services can be made redundant through the use of Network Load Balancing (NLB) and the Cluster service.

Availability is an important consideration when planning your network’s capacity requirements. You must first determine what level of availability you’re trying to achieve. How many nines do you need? Although 99.999 percent might be ideal, it might not be realistic, or even necessary, for your organization. In other words, how much can your organization afford to spend to ensure that your network can always meet your peak capacity requirements?
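
To put the number of nines in concrete terms, the short sketch below converts an availability percentage into allowable downtime per year; the figures are standard arithmetic rather than numbers from this lesson:

    # Allowable downtime per year for a given availability percentage.

    MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600

    for availability in (99.0, 99.9, 99.99, 99.999):
        downtime = MINUTES_PER_YEAR * (1 - availability / 100)
        print(f"{availability:7.3f}% allows {downtime:8.1f} minutes of downtime per year")
    # 99.999 percent ("five nines") allows only about 5.3 minutes per year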

You must determine whether the problems caused by your site’s unavailability justify the expense of keeping it online. If you require 99.999 percent availability or better, you must ensure that users are never turned away because the site has reached its capacity. If, on the other hand, you need only 99 percent availability or less, you might be less concerned about the site occasionally becoming unavailable when it reaches its capacity limits, particularly if peak demand is rare.

Scalability

The scalability of your site is a primary consideration when you’re ready to upgrade it to improve availability, increase the number of concurrent users, or decrease its latency for faster response times. A site’s scalability goes hand in hand with its availability. Upgrading your site shouldn’t lead to unplanned or unnecessary downtime. You should take two types of scaling into consideration when upgrading your site: scaling up and scaling out. Scalability is discussed in more detail in Lesson 3: "Planning Network Capacity."

Lesson Summary

Four factors are important to capacity planning: network traffic, performance, availability, and scalability. Traffic is the interchange of incoming requests and outgoing responses between two computers. Traffic is often unpredictable and occurs in bursts and clumps. To determine the maximum rate of pages per second that your network can support, you should divide the bits per second of the network connection (such as a T1 line) by the bits generated for the page request. A server’s capacity isn’t the only factor to consider when determining bandwidth limitations. The client computer is limited by its connection to the Internet. The performance of Web applications is critical in determining the site’s capacity. Testing is the only way to find out the capacity and performance of a Web application. You can use the WCAT and WAST utilities to test an application. Different computer types will provide different performance for the same test, and the performance of different application types depends greatly on the application’s task. Availability is an important consideration when planning your network’s capacity requirements. You must determine whether the problems caused by your site’s unavailability offset the expense of trying to keep it online. The scalability of your site is a primary consideration when you’re ready to upgrade it in order to improve availability, increase the number of concurrent users, or decrease its latency for faster response times.


