Recall that the main question in this case study is to determine the maximum number of PDF and ZIP files that can be downloaded concurrently while satisfying a given service level agreement. A closed multiclass QN model is used to answer this question.
As specified in Section 6.2, the Web server consists of one CPU and four identical disks. To complete the parameterization of the model, the concurrency level for each class (i.e., file type) and the service demands at each device are needed.
6.4.1 Computing Concurrency Levels
From the preliminary workload analysis in Section 6.3, two customer classes are identified: downloads of PDF files and downloads of ZIP files.
The data from the log in Section 6.2 is used to estimate the mix of concurrent PDF versus ZIP file downloads. This mix represents the concurrency levels of the PDF and ZIP customer classes, respectively. The results from Section 5.7.4 are used here to compute the average concurrency level for each type of file. According to Eq. (5.7.27), the concurrency levels are computed as
$$\bar{N}_{PDF} = \frac{\sum_i e_{i,PDF}}{T}, \qquad \bar{N}_{ZIP} = \frac{\sum_i e_{i,ZIP}}{T}$$
where $e_{i,PDF}$ and $e_{i,ZIP}$ are the elapsed times of the individual PDF and ZIP file downloads recorded in the WSData.XLS workbook, respectively, and $T$ is the measurement interval (i.e., 200 sec in the case of this log). Rounding the concurrency levels to the nearest integer yields 4 PDF downloads and 16 ZIP downloads. This ratio of 1:4 between these two types of files is used in the model to apportion the total customer population between these two classes.
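The computation above can be sketched in a few lines of Python. This is a minimal illustration, not the workbook's actual procedure: the function names `average_concurrency` and `split_population` are hypothetical, and the elapsed times are made-up values, not the data in WSData.XLS.

```python
def average_concurrency(elapsed_times, interval):
    """Average number of downloads in progress: sum of elapsed times / T."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    return sum(elapsed_times) / interval


def split_population(total, n_pdf, n_zip):
    """Apportion a total customer population in the ratio n_pdf : n_zip."""
    if n_pdf + n_zip <= 0:
        raise ValueError("concurrency levels must sum to a positive value")
    pdf = round(total * n_pdf / (n_pdf + n_zip))
    return pdf, total - pdf


# Hypothetical elapsed times in seconds (NOT the values from the log).
pdf_elapsed = [40.0, 55.5, 60.5, 44.0]
zip_elapsed = [150.0, 175.0, 125.0, 150.0]
T = 200.0  # measurement interval in seconds

n_pdf = average_concurrency(pdf_elapsed, T)
n_zip = average_concurrency(zip_elapsed, T)
print(round(n_pdf), round(n_zip))

# With the 4:16 mix reported in the text, a population of 50 customers
# would be split into 10 PDF and 40 ZIP customers.
print(split_population(50, 4, 16))
```

Keeping the mix fixed while the total population varies is what allows the model to be solved for increasing loads until the service level agreement is violated.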
6.4.2 Computing Service Demands
To complete the model parameterization, the service demands at the CPU and the disks have to be computed for each customer class. The service demands are a function of the file sizes. To estimate these demands, the analyst conducts an experiment using a machine similar to the production Web server. A test server consisting of a single CPU and a single disk is sufficient. A set of n dummy files for each of six file sizes (e.g., 10 KB, 100 KB, 200 KB, 500 KB, 800 KB, and 1000 KB) is created and posted on the test server. Then, for each file size, the n files of that size are downloaded from another machine while the CPU and disk utilizations of the test server are measured. The estimated CPU and disk service demands for a given file size are then obtained from the Service Demand Law (see Chapter 3) as $U_{CPU}/(n/T)$ and $U_{disk}/(n/T)$, respectively, where $T$ is the time taken to download all n files of the selected file size and n/T is therefore the measured throughput.
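As a rough illustration of the Service Demand Law used in this experiment, the sketch below computes a demand from a measured utilization, a file count, and a download time. The function name `service_demand` and all the measurements are hypothetical; they are not results from the test server described above.

```python
def service_demand(utilization, completions, interval):
    """Service Demand Law: D = U / X, with throughput X = completions / interval."""
    if completions <= 0 or interval <= 0:
        raise ValueError("completions and interval must be positive")
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be a fraction between 0 and 1")
    throughput = completions / interval  # files per second
    return utilization / throughput      # seconds of service per file


# Hypothetical run: n = 100 files of one size downloaded in T = 50 sec,
# with the CPU 20% busy and the disk 40% busy during the run.
n, T = 100, 50.0
d_cpu = service_demand(0.20, n, T)
d_disk = service_demand(0.40, n, T)
print(f"CPU demand: {d_cpu * 1000:.1f} msec, disk demand: {d_disk * 1000:.1f} msec")
```

Repeating this calculation once per file size gives the data points to which a trend line can then be fitted.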
The results for the CPU times are plotted in Fig. 6.3, which shows the total CPU time (i.e., the CPU service demand) needed to download a file of each size. The data points obtained in the experiment are shown in the graph by the dotted line. A trend line is added to these points using MS Excel (i.e., by right-clicking on the dotted line and selecting Add Trend Line). A linear trend line is selected because visual inspection indicates a linear relationship between CPU time and file size. The linear regression (i.e., trend line) performed generates the relationship
where CPUTime is in msec and FileSize is in KB. The $R^2$ value (i.e., the coefficient of determination provided by MS Excel) obtained in this linear regression is 0.9969. The closer this coefficient is to one, the better the regression line fits the experimental data. As a rule of thumb, an $R^2$ value above 0.95 indicates that the regression line adequately models the observed data.
Figure 6.3. CPU Time (in msec) vs. file size (in KB).
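The text performs the regression with the MS Excel trend-line feature. For readers who prefer to script it, the following sketch fits an ordinary least-squares line and computes the coefficient of determination in plain Python. The function name `linear_fit` is hypothetical, and the data points are invented for illustration; they are not the measurements plotted in Fig. 6.3.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept; also returns R^2."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical measurements: file size in KB, CPU time in msec.
file_size_kb = [10, 100, 200, 500, 800, 1000]
cpu_time_ms = [1.1, 2.0, 3.1, 5.9, 9.1, 11.0]

slope, intercept, r2 = linear_fit(file_size_kb, cpu_time_ms)
print(f"CPUTime = {slope:.4f} * FileSize + {intercept:.4f}  (R^2 = {r2:.4f})")
```

The same function applies unchanged to the I/O time measurements, since only the dependent variable differs.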
The same procedure is used to discover the relationship between the total I/O time (i.e., the disk demand) and the file size. The data points and trend line are shown in Fig. 6.4. The linear regression yields the equation
with a coefficient of determination equal to 0.9946.
Figure 6.4. I/O Time (in msec) vs. file size (in KB).
The linear regression equations (i.e., Eqs. (6.4.7) and (6.4.8)) are used to compute the CPU and disk demands for each class for the production server as follows: