Lesson 2: Calculating User Costs

You can determine the appropriate capacity level for your Web site by measuring the number of visitors the site receives and the demand each user places on the server and then calculating the computing resources (CPU, RAM, disk space, and network bandwidth) that are necessary to support current and future usage levels. Site capacity is determined by the number of users, server capacity and configuration of hardware and software, and site content. As the site attracts more users, capacity must increase or users will experience performance degradation. By upgrading the computing infrastructure, you can increase the site’s capacity, thereby allowing more users, more complex content, or a combination of the two. However, before you can plan your network’s capacity requirements or determine an upgrade strategy, you must be able to calculate the costs imposed on the system by each user. This lesson explains how you can calculate those costs so that you can plan your network’s capacity.

After this lesson, you will be able to

• Calculate costs of CPU, memory, and disk usage
• Calculate network bandwidth

Overview of Calculating Costs

At its most basic level, capacity planning can be expressed as a simple equation:

Number of supported users = hardware capacity ÷ load on hardware per user

In this equation, number of supported users refers to concurrent users and hardware capacity refers to both server and network capacity.
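As a sketch, the equation can be expressed in a few lines of code; the capacity and per-user figures below are hypothetical, chosen only to illustrate the arithmetic.

```python
# Basic capacity-planning equation:
#   number of supported users = hardware capacity / load on hardware per user
# Both quantities must be in the same unit (megacycles per second here).

def supported_users(hardware_capacity_mc: float, load_per_user_mc: float) -> int:
    """Return the number of concurrent users the hardware can support."""
    if load_per_user_mc <= 0:
        raise ValueError("load per user must be positive")
    return int(hardware_capacity_mc / load_per_user_mc)

# Hypothetical figures: 800 MC of capacity, 0.5 MC of load per user.
print(supported_users(800, 0.5))  # 1600 concurrent users
```

Note that halving the load per user doubles the supported users, which is why reducing per-user load and adding hardware capacity are interchangeable levers.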

Generally, capacity planning is based on two concepts:

• You can decrease the load that each user puts on the hardware, or you can increase the number of supported users. This is done through planning, programming, and configuring the site content to make more efficient use of computing resources.
• You can configure the site infrastructure to increase hardware capacity, or you can increase the number of supported users. Options include scaling the hardware out (adding more servers) or up (upgrading the existing servers).

If you want to increase the complexity of the site’s content, thereby increasing the load on hardware per user, and still maintain the number of supported users, then you must increase the site’s hardware capacity. This supports either the scaling up or scaling out decision. However, if you want to be able to support more users, you must either simplify the site content or increase hardware capacity. This supports the decision to decrease the load on hardware per user.

Operational Parameters

The primary benchmark to use when determining whether a Web site is operating at an acceptable level is latency: how long a user has to wait for a page to load once a request has been made. Note that although some servers may be capable of handling every request they receive, the load on the servers might create unacceptable wait times, requiring a better-performing solution if the site is to operate efficiently and at a level of service users are willing to accept.

In general, static content such as plain HTML pages and graphics doesn't contribute to server latency nearly as much as dynamic content such as ASP pages or other content that requires database lookups. Even when a Web server can deliver a large number of ASP pages per second, the turnaround time per ASP page can be unacceptable. The chart shown in Figure 7.4 illustrates the latency experienced by users of a four-processor Web server as the number of users and ASP requests increases.

Figure 7.4 - ASP pages per second versus latency

This site's capacity is between 700 and 800 concurrent shoppers. Note that the wait time rises dramatically as the number of users exceeds 800, which is unacceptable. This server's performance peaks at just over 16 ASP requests per second; at that point, users wait roughly 16 seconds for their pages because of extensive context switching.

Calculating Cost Per User

You should take five steps to determine your cost per user: analyzing the typical user, calculating CPU cost, calculating memory cost, calculating disk cost, and calculating network cost. In order to illustrate each of these steps, a fictitious company—Northwind Traders—is used to demonstrate how user costs are calculated. Sample test results, used to simulate capacity testing through such means as the WAST test utility, are provided as needed, along with any additional data necessary to calculate cost per user.

Calculating user costs is an involved and detailed process. The purpose of this lesson is merely to give you an overview of that process and try to illustrate, through the use of examples, how the basic calculations are made. It’s strongly recommended that you study additional sources so that you have a complete understanding of the process of calculating costs.

Analyzing the Typical User

The first step you need to take in calculating user costs is to determine how the typical user will use your site. By determining the operations that users perform and how often they perform them, you can estimate how much demand a user places on the system. For the purpose of this lesson, the term operation refers to all files that are included in the processing of the primary page. This can include graphic files, ASP include files, or other supporting files. To the user it seems like a single operation or page.

To do this, you must compile a user profile by analyzing the site’s usage log files. The more accurate the user profile, the more accurate capacity planning will be. For this reason, it’s better to use logs gathered over a long period of time (at least a week) to obtain accurate averages.

You should gather the following types of data:

• The number of visitors the site receives
• The number of hits each page receives
• The rate at which transactions take place

You can use the number of visits to each page to profile typical operations for the site. Table 7.4 provides a profile report for a typical user of the Northwind Traders Web site. The table contains the simulated results gathered from test data.

Table 7.4 Typical User Profile for Northwind Traders Web Site

Operation     Operations per Session    Operations per Second (Frequency)    Percent of Total
Add Item              0.24                      0.00033                            2.00%
—                     0.02                      0.00003                            0.17%
—                     0.04                      0.00006                            0.33%
—                     0.75                      0.00104                            6.25%
Default               1.00                      0.00139                            8.33%
Listing               2.50                      0.00347                           20.83%
Lookup                0.75                      0.00104                            6.25%
New                   0.25                      0.00035                            2.08%
Product               4.20                      0.00583                           35.00%
Search                1.25                      0.00174                           10.42%
Welcome               1.00                      0.00139                            8.33%
TOTAL                12.0                       0.01667                           99.99%

This table shows the 11 operations that account for nearly 100 percent of the hits received by the entire site. On big sites, the load might be distributed over a larger set of operations. As a rule, you should generate a report that lists the pages or operations responsible for at least 90 percent of the site’s total hit count.
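As a sketch of where the frequency column comes from: dividing the profile's 12.0 operations per session by its 0.01667 operations per second implies an average session length of about 720 seconds. That session length is inferred from the table's totals rather than stated in the lesson.

```python
# Reconstruct the "operations per second" (frequency) column of the user
# profile from the per-session counts and the implied session length.
SESSION_SECONDS = 720  # inferred: 12.0 ops/session / 0.01667 ops/sec ≈ 720 s

ops_per_session = {
    "Add Item": 0.24,
    "Default": 1.00,
    "Listing": 2.50,
    "Product": 4.20,
}

for name, per_session in ops_per_session.items():
    frequency = per_session / SESSION_SECONDS
    print(f"{name}: {frequency:.5f} operations/sec")
# Matches the table: 0.00033, 0.00139, 0.00347, and 0.00583 respectively.
```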

Calculating CPU Cost

In a typical environment, the Web servers place more demand on the CPU than the data servers do. As a result, this example focuses on measuring the CPU capacity on the Web servers. However, you should perform these calculations on all types of servers in the site where CPU power might become a bottleneck.

Page requests per second and CPU use grow with the number of users. Once CPU use reaches its maximum, however, adding users lowers the number of page requests processed per second. Therefore, the page-request rate at the point at which CPU use reaches 100 percent is the maximum.

Before you can calculate an operation’s CPU cost, you need to know the following information:

• Page throughput (requests per second)
• CPU utilization (percentage of available CPU at optimum page throughput)
• Number of times a page is used per operation (request per operation)
• Upper bound of your CPU

You can calculate an operation's cost by multiplying the number of pages by the cost per page. This calculation is based on megacycles (MC). The MC is a unit of processor work; 1 MC is equal to 1 million CPU cycles. As a unit of measure, the MC is useful for comparing performance between processors because it's hardware independent. For example, a dual-processor 400 MHz Pentium II has a total capacity of 800 MC per second.

The first step in calculating the CPU cost per user is to calculate the CPU cost per operation (in MC). You can use the following formula to calculate the cost:

CPU usage ÷ Requests per second × Requests per operation = Cost per operation

To calculate the CPU usage, use the following formula:

CPU utilization × Number of CPUs × Speed of the CPUs (in MHz) = CPU usage

For example, suppose your computer is a dual-processor 400 MHz Pentium II. A browse operation results in 11.5 requests per second with CPU utilization of 84.10 percent. There are two page requests per operation. You’d first determine the CPU usage, as follows:

0.8410 × 2 × 400 = 672.8

You can then determine the CPU usage per operation, as follows:

672.8 ÷ 11.5 × 2 = 117.01 MC
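The two formulas above can be combined in a short script that reproduces the lesson's browse-operation example:

```python
# CPU cost of an operation, in megacycles (MC).
def cpu_usage_mc(utilization: float, num_cpus: int, speed_mhz: float) -> float:
    """CPU usage = utilization x number of CPUs x CPU speed (MHz)."""
    return utilization * num_cpus * speed_mhz

def cost_per_operation_mc(usage_mc: float, requests_per_sec: float,
                          requests_per_op: int) -> float:
    """Cost per operation = CPU usage / requests per second x requests per operation."""
    return usage_mc / requests_per_sec * requests_per_op

# Dual-processor 400 MHz Pentium II at 84.10% utilization,
# 11.5 requests/sec, 2 page requests per browse operation.
usage = cpu_usage_mc(0.8410, 2, 400)
print(round(usage, 1))                                  # 672.8 MC
print(round(cost_per_operation_mc(usage, 11.5, 2), 2))  # 117.01 MC
```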

Table 7.5 provides the CPU cost for each of the main operations of the Northwind Traders Web site. The figures are based on a dual-processor 400 MHz Pentium II.

Table 7.5 CPU Costs per Operation

Operation     CPU Utilization    Requests/Sec    Requests per Operation    CPU Cost per Operation (MC)
Add Item          96.98%             23.31                 2                        66.57
—                 94.31%             18.48                 7                       285.79
—                 95.86%             22.29                 4                       137.62
—                 91.73%             16.81                 1                        43.65
Default           98.01%            102.22                 1                         7.67
Listing           91.87%             21.49                 1                        34.20
Lookup            99.52%             75.40                 2                        21.19
New               96.61%             65.78                 2                        23.50
Product           94.81%             18.23                 1                        41.61
Search            95.11%             37.95                 2                        40.10
Welcome           96.97%            148.93                 1                         5.21

Once you’ve determined each operation’s CPU cost, you can calculate the CPU cost per user by using the following formula:

Cost per operation × Operations per second = Cost per user

For example, the cost of an Add Item operation is 66.57 MC. Based on the profile of the typical user, you know that the number of operations per second is 0.00033. So you’d use the following calculation to determine the cost per typical user:

66.57 × 0.00033 ≈ 0.0220 MC
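The same per-user calculation in code, using the Add Item figures from the lesson:

```python
# CPU cost per user = cost per operation (MC) x operations per second.
def cpu_cost_per_user_mc(cost_per_op_mc: float, ops_per_sec: float) -> float:
    return cost_per_op_mc * ops_per_sec

# Add Item: 66.57 MC per operation at a frequency of 0.00033 ops/sec.
print(f"{cpu_cost_per_user_mc(66.57, 0.00033):.4f} MC")  # 0.0220 MC
```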

Table 7.6 shows the CPU usage for the typical user for each operation.

Table 7.6 CPU Costs per User

Operation     CPU Cost per Operation (MC)    Operations per Second (Frequency)    CPU Usage per User (MC)
Add Item             66.57                          0.00033                            0.0220
—                   285.79                          0.00006                            0.0171
—                   137.62                          0.00104                            0.1431
—                    43.65                          0.00003                            0.0013
Default               7.67                          0.00139                            0.0107
Listing              34.20                          0.00347                            0.1187
Lookup               21.19                          0.00104                            0.0220
New                  23.50                          0.00035                            0.0082
Product              41.61                          0.00583                            0.2426
Search               40.10                          0.00174                            0.0698
Welcome               5.21                          0.00139                            0.0072
TOTAL                                                                                  0.6627

The total indicates that the cost of the total user profile is 0.6627 MC per user. This number reflects the cost of an average user performing the operations described by the user profile. You can use this number to estimate the site’s capacity, based on the assumed user profile.

For example, suppose the upper bound for your dual-processor 400 MHz Pentium II is 526 MC. The cost of 100 concurrent users is 100 × 0.6627 = 66.27 MC. The cost for 790 users is 523.53 MC. Both figures are within the upper bound. However, more than 790 concurrent users would exceed your CPU capacity.
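A sketch of this capacity check in code, using the lesson's 526 MC upper bound and the 0.6627 MC total per-user cost:

```python
# Estimate CPU-bound site capacity from the total per-user cost.
UPPER_BOUND_MC = 526        # usable CPU capacity of the dual 400 MHz server
COST_PER_USER_MC = 0.6627   # total CPU cost per user from the profile

def cpu_load_mc(users: int) -> float:
    return users * COST_PER_USER_MC

print(round(cpu_load_mc(100), 2))              # 66.27 MC
print(round(cpu_load_mc(790), 2))              # 523.53 MC, still under 526
print(int(UPPER_BOUND_MC / COST_PER_USER_MC))  # 793 -> about 790 users maximum
```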

If you can't upgrade or add processors to increase your CPU capacity, you can take steps such as the following to improve CPU efficiency:

• Limit connections. Consider reducing the maximum number of connections that each IIS 5.0 service accepts. Although limiting connections can result in connections that are blocked or rejected, it helps ensure that accepted connections are processed promptly.

Calculating Memory Cost

Some Web services, such as IIS 5.0, run in a pageable user-mode process. In IIS 5.0, this process is called Inetinfo (INETINFO.EXE). Because the process is pageable, the system can remove part or all of it from RAM and write it to disk if there isn't enough free memory.

If part of the process is paged to disk, the service’s performance suffers, as shown in Figure 7.5.

Figure 7.5 - Paging a process to disk

It’s very important to make sure that your server or servers have enough RAM to keep the entire process in memory at all times. The Web, File Transfer Protocol (FTP), and Simple Mail Transfer Protocol (SMTP) services in IIS run in the Inetinfo process. Each of the current connections is also given about 10 KB of memory in the Inetinfo working set (assuming that the application is running in-process).

The working set of the Inetinfo process should be large enough to contain the IIS object cache, data buffers for IIS 5.0 logging, and the data structures that the Web service uses to track its active connections.

You can use System Monitor to monitor the working set of INETINFO.EXE. Because ISAPI DLLs run in the out-of-process pool by default (unless you've changed that setting), you'll need to monitor them separately from INETINFO.EXE. Be aware, too, that out-of-process counter information is added together, which makes it difficult to single out any one process or application. (If your site uses custom ISAPI DLLs, those DLLs should incorporate their own counters so that you can monitor them individually.)

You should log this data for several days. You can use performance logs and alerts in System Monitor to identify times of unusually high and low server activity.

If the system has sufficient memory, it can maintain enough space in the Inetinfo working set so that IIS 5.0 rarely needs to perform disk operations. One indicator of memory sufficiency is how much the size of the Inetinfo process working set varies in response to general memory availability on the server. Make sure to examine data collected over time, because these counters display the last value observed rather than an average.

IIS 5.0 relies on the operating system to store and retrieve frequently used Web pages and other files from the File System Cache. The File System Cache is particularly useful for servers of static Web pages, because Web pages tend to be used in repeated, predictable patterns.

If cache performance is poor when the cache is small, use the data you’ve collected to deduce the reason that the system reduced the cache size. Note the available memory on the server and the processes and services running on the server, including the number of simultaneous connections supported.

When you add physical memory to your server, the system allocates more space to the file system cache. A larger cache is almost always more efficient, but typically it’s a case of diminishing returns—each additional megabyte of memory adds less efficiency than the previous one. You must decide where the trade-off point is: the point at which adding more memory gets you so little improvement in performance that it ceases to be worthwhile.

Servers running IIS 5.0, like other high-performance file servers, benefit from ample physical memory. Generally, the more memory you add, the more the servers use and the better they perform. IIS 5.0 requires a minimum of 64 MB of memory, but at least 128 MB is recommended. If you’re running memory-intensive applications, your server could require a much larger amount of memory to run optimally. For example, most of the servers that service the MSNBC Web site have at least 1 GB of memory.

Because memory usage doesn't relate directly to the number of concurrent users but rather to the content of the site (caching, out-of-process DLLs, and so on), a cost per user can't be calculated aside from the 10 KB per connection. Instead, you should monitor the following:

• The amount of Inetinfo that’s paged out to disk
• Memory usage during site operation
• The efficiency of the cache utilization, or the cache-hit ratio
• The number of times the cache is flushed
• The number of page faults that occur
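The one per-user memory figure the lesson does provide, roughly 10 KB of Inetinfo working set per in-process connection, supports a rough sizing sketch:

```python
# Approximate Inetinfo working-set overhead for concurrent connections,
# at about 10 KB each (in-process applications only).
KB_PER_CONNECTION = 10

def connection_overhead_mb(connections: int) -> float:
    return connections * KB_PER_CONNECTION / 1024

print(round(connection_overhead_mb(1000), 1))  # ~9.8 MB for 1,000 connections
```

This covers only connection overhead; the object cache, logging buffers, and connection-tracking structures listed above still have to fit in the working set.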

Calculating Disk Cost

Web services such as IIS 5.0 write their logs to disk, so there’s usually some disk activity, even when clients are hitting the cache 100 percent of the time. Under ordinary circumstances, disk activity, other than that generated by logging, serves as an indicator of issues in other areas. For example, if your server needs more RAM, you’ll see a lot of disk activity because there are many hard page faults. But there will also be a lot of disk activity if your server houses a database or your users request many different pages.

Since IIS caches most pages in memory, the disk system is rarely a bottleneck as long as the Web servers have sufficient installed memory. However, the Microsoft SQL Server computer does read and write to the disk on a frequent basis. SQL Server also caches data but uses the disk a lot more than IIS. For that reason, the capacity testing for Northwind Traders focuses on the SQL Server computer. However, you should calculate capacity on all servers where disk activity could become a bottleneck. You can use a tool such as System Monitor to record a site’s disk activity while a WAST script is running for each operation.

This section is concerned with disk utilization and the effects of read and write operations. Determining whether you have adequate disk storage is a process separate from this one. Storage requirements are discussed in Lesson 3, "Planning Network Capacity."

For the SQL Server computer used by Northwind Traders, the percentage of disk utilization is based on a calibration of a maximum of 280 random seeks per second. For example, when the Pentium II server generates 2.168 Add Item operations, the SQL Server computer performs 9.530 disk seeks (for a disk utilization of 3.404 percent). You should calculate disk cost by dividing disk seeks per second by operations per second (which you’ll have determined as part of the user profile). In this case the Add Item operation generates 4.395 disk seeks per shopper operation.

You can use the following equations to determine disk costs:

Disk reads per second ÷ Operations per second = Disk read cost per operation

Disk writes per second ÷ Operations per second = Disk write cost per operation
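Both equations have the same shape, so one helper covers reads and writes. The figures below reproduce the Add Item example from the preceding paragraph:

```python
# Disk cost per operation = disk seeks (reads or writes) per second
#                           / operations per second.
MAX_SEEKS_PER_SEC = 280  # calibrated maximum for the SQL Server disk

def disk_cost_per_operation(seeks_per_sec: float, ops_per_sec: float) -> float:
    return seeks_per_sec / ops_per_sec

# Add Item: 9.530 disk seeks/sec while running 2.168 operations/sec.
print(round(disk_cost_per_operation(9.530, 2.168), 3))  # 4.396 seeks/operation
print(round(9.530 / MAX_SEEKS_PER_SEC * 100, 3))        # 3.404 percent utilization
```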

Table 7.7 illustrates the results from calculating disk reads on the Northwind Traders site.

Table 7.7 Disk Read Costs per Operation

Operation      Disk Reads per Second    Operations per Second    Percentage of Disk Utilization    Disk Cost per Operation
Add Item              9.530                    2.168                       3.404%                          4.396
—                     7.050                    8.728                       2.518%                          0.808
Checkout             19.688                    0.903                       7.031%                         21.803
Clearitems            8.956                    9.384                       3.199%                          0.954
Default               0.248                   28.330                       0.089%                          0.009
Delitem               4.628                    3.633                       1.653%                          1.274
Listing               0.148                    5.533                       0.053%                          0.027
Lookup                0.063                   12.781                       0.023%                          0.005
Lookup_new            9.275                   12.196                       3.313%                          0.760
Main                  0.120                    8.839                       0.043%                          0.014
Browse                0.103                    6.033                       0.037%                          0.017
Search                0.100                    8.205                       0.036%                          0.012
Welcome               0.080                   31.878                       0.029%                          0.003

Once you’ve calculated the disk cost per read operation, you must calculate the disk costs of write operations.

You can then use your calculations for read and write operations to calculate your disk cost per user per second. As in your CPU cost calculations, multiply your cost per operation by operations per second. For example, if your disk cost for a Default read operation is 0.0009 and your usage for that operation is 0.003804 operations per second, your disk cost per user for that operation is 0.0000034 disk reads per second.
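The same multiplication in code, using the Default-page figures from the paragraph above:

```python
# Disk cost per user = disk cost per operation x operations per second.
def disk_cost_per_user(cost_per_op: float, ops_per_sec: float) -> float:
    return cost_per_op * ops_per_sec

# Default read operation: cost 0.0009 at 0.003804 operations per second.
print(f"{disk_cost_per_user(0.0009, 0.003804):.7f}")  # 0.0000034
```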

Once you’ve calculated the cost per operation, you can add those costs together to arrive at the total disk load per user per second. You can then use that number to determine your disk system’s capacity, which will be based on the load supported by your particular disk configuration.

Calculating Network Cost

Network bandwidth is another important resource that can become a bottleneck. You can calculate total network cost from the sum of the costs of the individual operations. However, two network costs are associated with each shopper operation: the connection between the Web client and the Web server and the connection between the SQL Server computer and the Web server. Sites that are more complex can have more types of connections, depending on the number of servers and the site’s architecture.

On a switched Ethernet LAN, traffic is isolated so network costs aren’t added together. On an unswitched Ethernet LAN, network traffic is cumulative so network costs are added together.

When a user performs an operation, the action generates network traffic between the Web server and the Web client, as well as between the Web server and the data server (if a database needs to be accessed).

For example, suppose the usage profile for the Add Item operation for Northwind Traders shows 0.000293 operations per second. The network cost of Add Item is 5.627 KBps per operation between the Web client and the Web server and 129.601 KBps between the Web server and the SQL Server computer, as shown in Figure 7.6. Most of the traffic generated by the Add Item operation is between the Web server and the SQL Server database.

Figure 7.6 - Network costs of an Add Item operation

You can figure out the network cost per user per operation by using the following formula:

(Operations/sec. × Web network cost) + (Operations/sec. × Data network cost) = Cost per user per second

For example, you’d use the following calculation to determine the cost per user per second for the Add Item operation:

(0.000293 × 5.627) + (0.000293 × 129.601) = 0.039622
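The network formula in code, reproducing the Add Item result:

```python
# Network cost per user per second =
#   (ops/sec x Web network cost) + (ops/sec x data network cost), in KBps.
def network_cost_per_user(ops_per_sec: float, web_kbps: float,
                          data_kbps: float) -> float:
    return ops_per_sec * web_kbps + ops_per_sec * data_kbps

# Add Item: 0.000293 ops/sec; 5.627 KBps client<->Web server,
# 129.601 KBps Web server<->SQL Server.
print(round(network_cost_per_user(0.000293, 5.627, 129.601), 6))  # 0.039622
```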

Table 7.8 shows the total bytes transmitted per operation (total network cost per user per second) on an unswitched Ethernet LAN. Web network cost represents the bytes transmitted per operation between the Web client and the Web server. Data network cost represents the bytes transmitted per operation between the SQL Server computer and the Web server.

Table 7.8 Network Costs for Northwind Traders

Operation       Usage Profile (Ops/Sec)    Web Network Cost    Data Network Cost    Cost per User per Second (Web Server)    Cost per User per Second (SQL Server)    Total Cost per User per Second
Add Item               0.000293                  5.627              129.601                    0.001649                                 0.037973                           0.039622
—                      0.000183                 24.489               55.215                    0.004481                                 0.010104                           0.014586
Default                0.003804                  1.941                0                        0.007384                                 0                                  0.007384
Listing                0.000421                 25.664               23.134                    0.010805                                 0.009739                           0.020544
—                      0.000288                 17.881                1.380                    0.00515                                  0.000397                           0.005547
Product                0.006102                 21.548               21.051                    0.131486                                 0.128453                           0.259939
Register               0.000176                  5.627              129.601                    0.00099                                  0.02281                            0.0238
—                      0.000170                 20.719               10.725                    0.003522                                 0.001823                           0.005345
Search (Good)          0.002391                 20.719               10.725                    0.049539                                 0.025643                           0.075183
TOTAL                                                                                                                                                                     0.45195

The network cost per user is 0.45195 KBps. You can use this figure to calculate the total network traffic. For example, if 100 concurrent users are accessing your site, the total network traffic is 45.195 KBps. For 10,000 users, the total traffic is 4,519.5 KBps, and for 20,000 users, the total traffic is 9,039 KBps.

If the network is a Carrier Sense Multiple Access with Collision Detection (CSMA/CD) Ethernet network running at 100 Mbps, or 12.5 megabytes per second (MBps), collisions will cause network congestion. For this reason, you shouldn’t push network utilization above 36 percent, which means no more than 4.5 MBps on the network. The Northwind Traders network reached the 4.5 MBps threshold at about 10,000 users, which is the site’s capacity. At 20,000 users, the network will become congested due to excessive collisions and a bottleneck will result.
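A sketch of the congestion check, assuming the lesson's 36 percent utilization ceiling on a 100 Mbps CSMA/CD network:

```python
# Compare total network traffic with the collision-avoidance ceiling:
# 100 Mbps = 12.5 MBps, of which at most 36% (about 4.5 MBps) should be used.
COST_PER_USER_KBPS = 0.45195       # total network cost per user (Table 7.8)
CEILING_KBPS = 12.5 * 1000 * 0.36  # about 4,500 KBps

for users in (100, 10_000, 20_000):
    total_kbps = users * COST_PER_USER_KBPS
    status = "OK" if total_kbps <= CEILING_KBPS else "congested"
    print(users, round(total_kbps, 1), status)
# 10,000 users is already just past the ~4,500 KBps ceiling, which is why
# the lesson puts the site's capacity at about 10,000; 20,000 far exceeds it.
```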

Remember to measure network traffic for the entire site and not just for individual servers.

As your site grows, network capacity can become a bottleneck, especially on sites where the ASP content is relatively simple (low CPU load) and the content (like static HTML or pictures) is relatively large. A few servers can easily serve the content to thousands of users, but the network might not be equipped to handle it. In some cases most of the traffic on the network flows between the Web server and the SQL Server computer.

Lesson Summary

You calculate the cost each user imposes on a site by first profiling the typical user from the usage logs and then working through each resource in turn. CPU cost is expressed in megacycles: multiply each operation's cost by its frequency and sum the results to get the total CPU cost per user, which you can compare against the processor's upper bound. Memory cost can't be tied directly to user counts, so you monitor the Inetinfo working set, cache efficiency, and paging instead. Disk cost comes from dividing disk reads and writes per second by operations per second, and network cost from the bytes transmitted between the Web client, the Web server, and the data server. Together, these per-user costs tell you how many concurrent users the current hardware and network can support.
