6.4 Building a Performance Model

Recall that the main question in this case study is how many PDF and ZIP files can be downloaded concurrently while still satisfying a given service level agreement. A closed multiclass queuing network (QN) model is used to answer this question.

As specified in Section 6.2, the Web server consists of one CPU and four identical disks. To complete the parameterization of the model, the concurrency level for each class (i.e., file type) and the service demands at each device are needed.

6.4.1 Computing Concurrency Levels

From the preliminary workload analysis in Section 6.3, two customer classes are identified: downloads of PDF files and downloads of ZIP files.

The data from the log in Section 6.2 is used to estimate the mix of concurrent PDF versus ZIP file downloads. This mix represents the concurrency levels of the PDF and ZIP customer classes, respectively. The results from Section 5.7.4 are used here to compute the average concurrency level for each type of file. According to Eq. (5.7.27), the concurrency levels are computed as

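Equation 6.4.5

$$\overline{N}_{\text{PDF}} = \frac{\sum_{i} e_{i,\text{PDF}}}{T}$$

Equation 6.4.6

$$\overline{N}_{\text{ZIP}} = \frac{\sum_{i} e_{i,\text{ZIP}}}{T}$$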

where e_i,PDF and e_i,ZIP are the elapsed times of PDF and ZIP file downloads indicated in the WSData.XLS workbook, respectively. T is the measurement interval (i.e., 200 sec in the case of this log). Rounding the concurrency levels to the nearest integer yields 4 PDF downloads and 16 ZIP downloads. This ratio of 1:4 between these two types of files is used in the model to apportion the total customer population between these two classes.
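The concurrency computation described above can be sketched in a few lines of Python. This is a minimal illustration only: the elapsed times below are hypothetical stand-ins, not the values from the WSData.XLS workbook, and the function name `average_concurrency` is an assumption made for this example. Only the 200 sec measurement interval comes from the text.

```python
def average_concurrency(elapsed_times_sec, interval_sec):
    """Average concurrency level: sum of elapsed times divided by the interval T."""
    if interval_sec <= 0:
        raise ValueError("interval_sec must be positive")
    return sum(elapsed_times_sec) / interval_sec


T = 200.0  # measurement interval in seconds, as stated for this log

# Hypothetical elapsed times in seconds (NOT the WSData.XLS values).
pdf_elapsed = [120.0, 150.0, 90.0, 160.0, 200.0]
zip_elapsed = [180.0] * 18

n_pdf = average_concurrency(pdf_elapsed, T)
n_zip = average_concurrency(zip_elapsed, T)

# Round to the nearest integer to obtain the population mix for the model.
pdf_customers = round(n_pdf)
zip_customers = round(n_zip)
print(pdf_customers, zip_customers)
```

With these made-up inputs the rounded mix happens to match the 4:16 split reported in the text; the real figures depend on the logged elapsed times.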

6.4.2 Computing Service Demands

To complete the model parameterization, the service demands at the CPU and disks have to be computed for each customer class. The service demands are a function of the file sizes. To estimate these demands, the analyst conducts an experiment on a machine similar to the production Web server. A test server consisting of a single CPU and a single disk is sufficient. A set of n dummy files is created for each of six file sizes (e.g., 10 KB, 100 KB, 200 KB, 500 KB, 800 KB, and 1000 KB) and posted on the test server. Then, for each file size, the n files of that size are downloaded from another machine while the CPU and disk utilizations of the test server are measured. The service demands at the CPU and disk for a given file size are then estimated from the Service Demand Law (see Chapter 3) as U_CPU/(n/T) and U_disk/(n/T), respectively, where T is the time taken to download all n files of that size, so that n/T is the measured throughput.
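As a rough illustration of the Service Demand Law calculation, the sketch below divides a measured utilization by the measured throughput n/T. The utilizations, file count, and download time are hypothetical numbers chosen for the example, and `service_demand` is an assumed helper name, not something defined in the case study.

```python
def service_demand(utilization, completions, interval_sec):
    """Service Demand Law: D = U / X, with throughput X = completions / interval."""
    if completions <= 0 or interval_sec <= 0:
        raise ValueError("completions and interval_sec must be positive")
    throughput = completions / interval_sec  # downloads per second
    return utilization / throughput          # seconds of service per download


# Hypothetical measurements for one file size (not from the experiment in the text):
n = 100        # number of files downloaded
T = 50.0       # seconds taken to download all n files
u_cpu = 0.20   # measured CPU utilization on the test server
u_disk = 0.60  # measured disk utilization on the test server

d_cpu_ms = service_demand(u_cpu, n, T) * 1000.0
d_disk_ms = service_demand(u_disk, n, T) * 1000.0
print(d_cpu_ms, d_disk_ms)
```

Repeating this for each of the six file sizes would give one (file size, demand) point per device, which is the kind of data the regression in the following paragraphs is fitted to.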

The results for the CPU times are plotted in Fig. 6.3, which shows the total CPU time (i.e., the CPU service demand) needed to download a file of each size. The data points obtained in the experiment are connected by the dotted line in the graph. A trend line is added to these points using MS Excel (i.e., by right-clicking on the dotted line and selecting Add Trend Line). A linear trend line is selected because visual inspection indicates a linear relationship between CPU time and file size. The resulting linear regression (i.e., the trend line) gives the relationship

Equation 6.4.7

where CPUTime is in msec and FileSize is in KB. The R² value (i.e., the coefficient of determination reported by MS Excel) obtained in this linear regression is 0.9969. The closer this coefficient is to one, the better the regression line fits the experimental data. As a rule of thumb, an R² value above 0.95 indicates that the regression line adequately models the observed data.
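The trend-line step can also be reproduced outside a spreadsheet. The following is a minimal ordinary least-squares sketch in plain Python that returns the slope, intercept, and R². The file sizes are the six sizes named in the text, but the CPU times are hypothetical placeholders, not the measurements behind Fig. 6.3, so the fitted coefficients here are not those of Eq. (6.4.7).

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept; also returns R^2."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Coefficient of determination: 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


file_sizes_kb = [10, 100, 200, 500, 800, 1000]       # sizes named in the text
cpu_times_ms = [1.2, 10.1, 21.5, 51.0, 84.2, 103.9]  # hypothetical measurements

slope, intercept, r_squared = fit_line(file_sizes_kb, cpu_times_ms)
print(f"CPUTime = {slope:.4f} * FileSize + {intercept:.4f}, R^2 = {r_squared:.4f}")
```

The same function could be applied to the I/O time measurements; checking `r_squared` against the 0.95 rule of thumb mirrors the visual check done with the spreadsheet trend line.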

Figure 6.3. CPU Time (in msec) vs. file size (in KB).

graphics/06fig03.gif

The same procedure is used to discover the relationship between the total I/O time (i.e., the disk demand) and the file size. The data points and trend line are shown in Fig. 6.4. The linear regression yields the equation

Equation 6.4.8

with a coefficient of determination equal to 0.9946.

Figure 6.4. I/O Time (in msec) vs. file size (in KB).

graphics/06fig04.gif

The linear regression equations (i.e., Eqs. (6.4.7) and (6.4.8)) are used to compute the CPU and disk demands for each class for the production server as follows:

PDF files: The average size of a downloaded PDF file is 377.6 KB (i.e., from Fig. 6.1). Using Eq. (6.4.7), the service demand at the CPU for this class is

Equation 6.4.9

From the case study specification in Section 6.2, PDF files are stored on disks 1 and 2 and access to these files is evenly balanced between these disks. As a result, the total I/O time is equally split between these disks. Thus, the service demands for PDF files at disks 1 and 2, using Eq. (6.4.8), are:

Equation 6.4.10

Since no PDF files are stored on disks 3 and 4,

Equation 6.4.11

$$D_{\text{disk3},\text{PDF}} = D_{\text{disk4},\text{PDF}} = 0$$
ZIP files: The average size of a downloaded ZIP file is 1155.6 KB (i.e., from Fig. 6.2). Using Eq. (6.4.7), the service demand at the CPU for this class is

Equation 6.4.12

ZIP files are stored evenly across disks 3 and 4. Thus, the service demands for ZIP files at disks 3 and 4 are:

Equation 6.4.13

Since no ZIP files are stored on disks 1 and 2,

Equation 6.4.14

$$D_{\text{disk1},\text{ZIP}} = D_{\text{disk2},\text{ZIP}} = 0$$

Thus, the service demands for the base QN model, referred to as the original configuration, are summarized in Table 6.1.

Table 6.1. Service Demands (in msec) for the Original Configuration

Resource    Class PDF    Class ZIP
CPU              39.4        120.8
Disk 1           77.1          0.0
Disk 2           77.1          0.0
Disk 3            0.0        235.8
Disk 4            0.0        235.8
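The way the per-class demands in Table 6.1 follow from a CPU time, a total I/O time, and the disk placement of each file type can be sketched as below. The CPU times are the Table 6.1 entries. The total I/O times are back-computed here as twice the per-disk entries (an assumption of this sketch, since the regression coefficients themselves are not restated), and `class_demands` is a hypothetical helper name.

```python
ALL_DISKS = ["Disk 1", "Disk 2", "Disk 3", "Disk 4"]


def class_demands(cpu_time_ms, total_io_time_ms, disks_used, all_disks=ALL_DISKS):
    """Split the total I/O time evenly over the disks holding this class's files."""
    if not disks_used:
        raise ValueError("a class must use at least one disk")
    per_disk = total_io_time_ms / len(disks_used)
    demands = {"CPU": cpu_time_ms}
    for disk in all_disks:
        # Disks that store no files of this class get a zero service demand.
        demands[disk] = per_disk if disk in disks_used else 0.0
    return demands


# CPU times from Table 6.1; total I/O times back-computed as 2 x the per-disk value.
pdf_demands = class_demands(39.4, 2 * 77.1, ["Disk 1", "Disk 2"])
zip_demands = class_demands(120.8, 2 * 235.8, ["Disk 3", "Disk 4"])

for resource in ["CPU"] + ALL_DISKS:
    print(f"{resource:8s} {pdf_demands[resource]:8.1f} {zip_demands[resource]:8.1f}")
```

Keeping the disk placement as an explicit argument makes it straightforward to recompute the demand matrix if the files were spread over the disks differently.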
