Consider a large software company that uses an internal Web server to allow its programmers, testers, and documentation personnel to download two types of files: 1) PDF files containing documents and manuals, and 2) ZIP files containing software files (e.g., source code and executables). The Web server has one CPU and four identical disks. PDF files are stored on disks 1 and 2 in such a way that accesses to these files are evenly distributed between the two disks. Similarly, ZIP files are stored on disks 3 and 4 in a way that balances the load on these two disks. The main question of interest in this case study is: "What is the maximum number of PDF and ZIP file downloads that can be in progress concurrently while still satisfying a prespecified SLA?"
The Web log contains one entry for each downloaded file, including its type and size. The Log worksheet of the MS Excel workbook WSData.XLS contains 1,000 entries for file downloads captured over 200 seconds during a peak hour. The first six entries in this worksheet are shown below:
File Type   Size (KB)   Elapsed Time (sec)
PDF         303         1.43
ZIP         1233        5.81
ZIP         1077        5.08
PDF         315         1.48
ZIP         1240        5.84
PDF         413         1.95
. . .       . . .       . . .
The elapsed time column is the total time spent at the server to download the associated file. This time can be recorded in the HTTP log. For example, the elapsed time is captured in Microsoft's Internet Information Server (IIS) by selecting the "Time Taken" field in the Extended Logging Option.
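As a rough illustration of how such a log might be summarized, the following Python sketch groups entries by file type and computes the count, mean size, and mean elapsed time for each type. It uses only the six sample entries listed above, so the averages it prints describe that sample and not the full 1,000-entry log. The function names summarize and overall_throughput are hypothetical and chosen for this example; they are not part of the workbook.

```python
from collections import defaultdict

# First six entries of the Log worksheet, as listed in the text:
# (file type, size in KB, elapsed time in seconds).
SAMPLE_LOG = [
    ("PDF", 303, 1.43),
    ("ZIP", 1233, 5.81),
    ("ZIP", 1077, 5.08),
    ("PDF", 315, 1.48),
    ("ZIP", 1240, 5.84),
    ("PDF", 413, 1.95),
]


def summarize(entries):
    """Return per-type count, mean size (KB), and mean elapsed time (sec)."""
    groups = defaultdict(list)
    for file_type, size_kb, elapsed_sec in entries:
        groups[file_type].append((size_kb, elapsed_sec))

    summary = {}
    for file_type, rows in groups.items():
        n = len(rows)
        summary[file_type] = {
            "count": n,
            "mean_size_kb": sum(size for size, _ in rows) / n,
            "mean_elapsed_sec": sum(elapsed for _, elapsed in rows) / n,
        }
    return summary


def overall_throughput(num_entries, period_sec):
    """Average number of downloads completed per second over the period."""
    return num_entries / period_sec


if __name__ == "__main__":
    # Per-type statistics for the six-entry sample only.
    for file_type, stats in sorted(summarize(SAMPLE_LOG).items()):
        print(file_type, stats)
    # 1,000 downloads logged over 200 seconds, as stated in the text.
    print("downloads/sec:", overall_throughput(1000, 200))
```

Applied to the full worksheet, the same grouping would give the per-type download counts and average elapsed times for the 200-second measurement interval.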