3.2 Basic Performance Results

This section presents the approach known as operational analysis [1], used to establish relationships among quantities based on measured or known data about computer systems. To see how the operational approach might be applied, consider the following motivating problem.

Motivating problem: Suppose that during an observation period of 1 minute, a single resource (e.g., the CPU) is observed to be busy for 36 sec. A total of 1800 transactions are observed to arrive at the system. The total number of observed completions is also 1800 transactions (i.e., as many completions as arrivals occurred in the observation period). What is the performance of the system (e.g., the mean service time per transaction, the utilization of the resource, the system throughput)?

Prior to solving this problem, some commonly accepted operational analysis notation is required for the measured data. The following is a partial list of such measured quantities:

T: length of time in the observation period
K: number of resources in the system
B_i: total busy time of resource i in the observation period T
A_i: total number of service requests (i.e., arrivals) to resource i in the observation period T
A₀: total number of requests submitted to the system in the observation period T
C_i: total number of service completions from resource i in the observation period T
C₀: total number of requests completed by the system in the observation period T

From these known measurable quantities, called operational variables, a set of derived quantities can be obtained. A partial list includes the following:

S_i: mean service time per completion at resource i; S_i = B_i/C_i
U_i: utilization of resource i; U_i = B_i/T
X_i: throughput (i.e., completions per unit time) of resource i; X_i = C_i/T

λ_i: arrival rate (i.e., arrivals per unit time) at resource i; λ_i = A_i/T
X₀: system throughput; X₀ = C₀/T
V_i: average number of visits (i.e., the visit count) per request to resource i; V_i = C_i/C₀

Using the notation above, the motivating problem can be formally stated and solved in a straightforward manner using operational analysis. The measured quantities are:

T = 60 sec
K = 1 resource
B₁ = 36 sec
A₁ = A₀ = 1800 transactions
C₁ = C₀ = 1800 transactions

Thus, the derived quantities are

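Mean service time: S₁ = B₁/C₁ = 36/1800 = 0.02 sec per transaction
Utilization: U₁ = B₁/T = 36/60 = 0.6 = 60%
Arrival rate: λ₁ = A₁/T = 1800/60 = 30 tps
System throughput: X₀ = C₀/T = 1800/60 = 30 tps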
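The same calculation can be sketched in a few lines of Python. This is only an illustration of the definitions listed above; the function name `derive_quantities` and the dictionary keys are hypothetical choices, not part of the operational analysis notation.

```python
def derive_quantities(T, B, A, C, C0):
    """Derive operational quantities for a single resource.

    T: observation period (sec), B: busy time (sec),
    A: arrivals at the resource, C: completions from the resource,
    C0: completions from the system.
    """
    return {
        "S": B / C,        # mean service time per completion
        "U": B / T,        # utilization
        "X": C / T,        # resource throughput
        "lambda": A / T,   # arrival rate
        "X0": C0 / T,      # system throughput
        "V": C / C0,       # visit count
    }


# Measured data from the motivating problem (hypothetical variable name).
result = derive_quantities(T=60, B=36, A=1800, C=1800, C0=1800)
```

With these inputs, `result["U"]` is 0.6 and `result["X0"]` is 30 transactions per second, matching the hand calculation.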

Chapter 2 discussed the need to consider multiple class models to account for transactions with different demands on the various resources. The notation presented above can be easily extended to the multiple class case by considering that R is the number of classes and by adding the class number r (r = 1, ···, R) to the subscript. For example, U_i,r is the utilization of resource i due to requests of class r and X_0,r is the throughput of class r requests.

The subsections that follow discuss several useful relationships between operational variables, called operational laws.

3.2.1 Utilization Law

As seen above, the utilization of a resource is defined as U_i = B_i/T. Dividing the numerator and denominator of this ratio by the number of completions from resource i, C_i, during the observation interval, yields

Equation 3.2.1

U_i = B_i/T = (B_i/C_i)/(T/C_i)

The ratio B_i/C_i is simply the average time that the resource was busy for each completion from resource i, i.e., the average service time S_i per visit to the resource. The ratio T/C_i is just the inverse of the resource throughput X_i. Thus, the relation known as the Utilization Law can be written as:

Equation 3.2.2

U_i = S_i x X_i

If the number of completions from resource i during the observation interval T is equal to the number of arrivals in that interval, i.e., if C_i = A_i, then X_i = λ_i and the relationship given by the Utilization Law becomes U_i = S_i x λ_i.

If resource i has m servers, as in a multiprocessor, the Utilization Law becomes U_i = (S_i x X_i)/m. The multiclass version of the Utilization Law is U_i,r = S_i,r x X_i,r.

Example 3.1.

The bandwidth of a communication link is 56,000 bps and it is used to transmit 1500-byte packets that flow through the link at a rate of 3 packets/second. What is the utilization of the link?

Start by identifying the operational variables provided or that can be obtained from the measured data. The link is the resource (K = 1) for which the utilization is to be computed. The throughput of that resource, X₁, is 3 packets/second. What is the average service time per packet? In other words, what is the average transmission time? Each packet has 1,500 bytes x 8 bits/byte = 12,000 bits. Thus, it takes (12,000 bits)/(56,000 bits/sec) = 0.214 sec to transmit a packet over this link. Therefore, S₁ = 0.214 sec/packet. Using the Utilization Law, we compute the utilization of the link as S₁ x X₁ = 0.214 x 3 = 0.642 = 64.2%.
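As a minimal sketch, the Utilization Law (including the m-server form) can be written as a one-line function and applied to the link of Example 3.1. The function name `utilization` and the constant names are hypothetical.

```python
def utilization(service_time, throughput, servers=1):
    """Utilization Law: U = S * X / m for a resource with m servers."""
    return service_time * throughput / servers


LINK_BPS = 56_000           # link bandwidth in bits/sec
PACKET_BITS = 1_500 * 8     # 1500-byte packets, in bits

service_time = PACKET_BITS / LINK_BPS              # sec per packet
link_utilization = utilization(service_time, throughput=3)
```

Without intermediate rounding the result is about 0.643; the 0.642 in the text comes from rounding the service time to 0.214 sec before multiplying.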

Example 3.2.

Consider a computer system with one CPU and three disks used to support a database server. Assume that all database transactions have similar resource demands and that the database server is under a constant load of transactions. Thus, the system is modeled using a single-class closed QN, as indicated in Fig. 3.1. The CPU is resource 1 and the disks are numbered from 2 to 4. Measurements taken during one hour provide the number of transactions executed (13,680), the number of reads and writes per second on each disk and their utilization, as indicated in Table 3.1. What is the average service time per request on each disk? What is the database server's throughput?

Figure 3.1. Closed QN model of a database server.

Table 3.1. Data for Example 3.2
Disk	Reads Per Second	Writes Per Second	Total I/Os Per Second	Utilization
1	24	8	32	0.30
2	28	8	36	0.41
3	40	10	50	0.54

The throughput of each disk, denoted by X_i (i = 2, 3, 4), is the total number of I/Os per second, i.e., the sum of the number of reads and writes per second. This value is indicated in the fourth column of the table, where disks 1, 2, and 3 correspond to resources 2, 3, and 4, respectively. Using the Utilization Law, the average service time is computed as S_i = U_i/X_i. Thus, S₂ = U₂/X₂ = 0.30/32 = 0.0094 sec, S₃ = U₃/X₃ = 0.41/36 = 0.0114 sec, and S₄ = U₄/X₄ = 0.54/50 = 0.0108 sec.

The throughput, X₀, of the database server is given by X₀ = C₀/T = 13,680 transactions/3,600 seconds = 3.8 tps.
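The calculations of Example 3.2 can be reproduced with the short sketch below. The dictionary layout and variable names are assumptions made for illustration; the numbers are those of Table 3.1.

```python
# Table 3.1: disk number -> (total I/Os per second, utilization)
disks = {1: (32, 0.30), 2: (36, 0.41), 3: (50, 0.54)}

# Utilization Law rearranged: S = U / X, in seconds per I/O
service_times = {disk: u / x for disk, (x, u) in disks.items()}

# System throughput X0 = C0 / T: 13,680 transactions in one hour
x0 = 13_680 / 3_600
```

Rounded to four decimal places, the service times are 0.0094, 0.0114, and 0.0108 sec, and `x0` is 3.8 tps.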

3.2.2 Service Demand Law

Service demand is a fundamental concept in performance modeling. The notion of service demand is associated with both a resource and a set of requests using the resource. The service demand, denoted as D_i, is defined as the total average time spent by a typical request of a given type obtaining service from resource i. Throughout its existence, a request may visit several devices, possibly multiple times. However, for any given request, its service demand is the sum of all service times during all visits to a given resource. When considering various requests using the same resource, the service demand at the resource is computed as the average, for all requests, of the sum of the service times at that resource. Note that, by definition, service demand does not include queuing time since it is the sum of service times. If different requests have very different service times, using a multiclass model is more appropriate. In this case, define D_i,r as the service demand of requests of class r at resource i.

To illustrate the concept of service demand, consider six transactions that each perform three I/Os on a disk. The service time, in msec, for each I/O and each transaction is given in Table 3.2. The last row shows the sum of the service times over all I/Os for each transaction. The average of these sums is 36.2 msec. This is the service demand on this disk due to the workload generated by the six transactions.

Table 3.2. Service times in msec for six requests
	Transaction No.
I/O No.	1	2	3	4	5	6
1	10	15	13	10	12	14
2	12	12	12	11	13	12
3	11	14	11	11	11	13
Sum	33	41	36	32	36	39
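The averaging behind Table 3.2 can be sketched as follows. The variable names are hypothetical; the data are the service times from the table, grouped by transaction.

```python
# Table 3.2: service times in msec, one inner list per transaction,
# listing the three I/Os of that transaction in order.
service_times_ms = [
    [10, 12, 11],  # transaction 1
    [15, 12, 14],  # transaction 2
    [13, 12, 11],  # transaction 3
    [10, 11, 11],  # transaction 4
    [12, 13, 11],  # transaction 5
    [14, 12, 13],  # transaction 6
]

# Total service time each transaction received at the disk.
per_transaction = [sum(times) for times in service_times_ms]

# Service demand: average of the per-transaction totals.
service_demand_ms = sum(per_transaction) / len(per_transaction)
```

The per-transaction totals reproduce the last row of the table, and their average is 217/6, or about 36.2 msec.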

Service demands are important because, along with workload intensity parameters, they are input parameters for QN models. Fortunately, there is an easy way to obtain service demands from resource utilizations and system throughput. By multiplying the utilization U_i of a resource by the measurement interval T one obtains the total time the resource was busy. If this time is divided by the total number of completed requests, C₀, the average amount of time that the resource was busy serving each request is derived. This is precisely the service demand. So,

Equation 3.2.3

D_i = (U_i x T)/C₀ = U_i/(C₀/T) = U_i/X₀

This relationship is called the Service Demand Law, which can also be written as D_i = V_i x S_i, by definition of the service demand (and since D_i = U_i/X₀ = (B_i/T)/(C₀/T) = B_i/C₀ = (C_i x S_i)/C₀ = (C_i/C₀) x S_i = V_i x S_i). In many cases, it is not easy to obtain the individual values of the visit counts and service times. However, Eq. (3.2.3) indicates that the service demand can be computed directly from the device utilization and system throughput. The multiclass version of the Service Demand Law is D_i,r = U_i,r/X_0,r = V_i,r x S_i,r.

Example 3.3.

A Web server is monitored for 10 minutes and its CPU is observed to be busy 90% of the monitoring period. The Web server log reveals that 30,000 requests are processed in that interval. What is the CPU service demand of requests to the Web server?

The observation period T is 600 (= 10 x 60) seconds. The Web server throughput, X₀, is equal to the number of completed requests C₀ divided by the observation interval; X₀ = 30,000/600 = 50 requests/sec. The CPU utilization is U_CPU = 0.9. Thus, the service demand at the CPU is D_CPU = U_CPU/X₀ = 0.9/50 = 0.018 seconds/request.

Example 3.4.

What are the service demands at the CPU and the three disks for the database server of Example 3.2 assuming that the CPU utilization is 35% measured during the same one-hour interval?

Remember that the database server's throughput was computed to be 3.8 tps. Applying the Service Demand Law to the CPU utilization and to the utilization values for the three disks shown in Table 3.1 yields: D_CPU = 0.35/3.8 = 0.092 sec/transaction, D_disk1 = 0.30/3.8 = 0.079 sec/transaction, D_disk2 = 0.41/3.8 = 0.108 sec/transaction, and D_disk3 = 0.54/3.8 = 0.142 sec/transaction.
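Examples 3.3 and 3.4 apply the same formula to different measurements, which the following sketch makes explicit. The function name `service_demand` and the dictionary keys are hypothetical.

```python
def service_demand(utilization, system_throughput):
    """Service Demand Law: D = U / X0, in seconds per request."""
    return utilization / system_throughput


# Example 3.3: Web server CPU, 30,000 requests in 600 seconds.
web_cpu_demand = service_demand(0.9, 30_000 / 600)

# Example 3.4: database server with X0 = 3.8 tps.
X0 = 3.8
utilizations = {"cpu": 0.35, "disk1": 0.30, "disk2": 0.41, "disk3": 0.54}
demands = {name: service_demand(u, X0) for name, u in utilizations.items()}
```

Note that no visit counts or per-visit service times are needed; utilization and system throughput are enough.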

3.2.3 The Forced Flow Law

There is an easy way to relate the throughput of resource i, X_i, to the system throughput, X₀. Assume for the moment that every transaction that completes from the database server of Example 3.2 performs an average of two I/Os on disk 1. That is, suppose that for every one visit that the transaction makes to the database server, it visits disk 1 an average of two times. What is the throughput of that disk in I/Os per second? Since 3.8 transactions complete per second (i.e., the system throughput, X₀) and each one performs two I/Os on average on disk 1, the throughput of disk 1 is 7.6 (= 2.0 x 3.8) I/Os per second. In other words, the throughput of a resource (X_i) is equal to the average number of visits (V_i) made by a request to that resource multiplied by the system throughput (X₀). This relation is called the Forced Flow Law:

Equation 3.2.4

X_i = V_i x X₀

The multiclass version of the Forced Flow Law is X_i,r = V_i,r x X_0,r.

Example 3.5.

What is the average number of I/Os on each disk in Example 3.2?

The value of V_i for each disk i, according to the Forced Flow Law, can be obtained as X_i/X₀. The database server throughput is 3.8 tps and the throughput of each disk in I/Os per second is given in the fourth column of Table 3.1. Thus, V_disk1 = X_disk1/X₀ = 32/3.8 = 8.4 visits to disk 1 per database transaction. Similarly, V_disk2 = X_disk2/X₀ = 36/3.8 = 9.5 and V_disk3 = X_disk3/X₀ = 50/3.8 = 13.2.
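The sketch below computes the visit counts of Example 3.5 and, as a cross-check, confirms that the two forms of the Service Demand Law (D_i = V_i x S_i and D_i = U_i/X₀) agree for the data of Table 3.1. All names are hypothetical.

```python
def visit_count(resource_throughput, system_throughput):
    """Forced Flow Law rearranged: V = X_i / X0."""
    return resource_throughput / system_throughput


X0 = 3.8  # database server throughput in tps

# Table 3.1: disk number -> (total I/Os per second, utilization)
disks = {1: (32, 0.30), 2: (36, 0.41), 3: (50, 0.54)}

visits = {d: visit_count(x, X0) for d, (x, u) in disks.items()}

# Cross-check: V * S (with S = U / X) should equal U / X0.
demand_from_visits = {d: visits[d] * (u / x) for d, (x, u) in disks.items()}
demand_from_util = {d: u / X0 for d, (x, u) in disks.items()}
```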

3.2.4 Little's Law

Conceptually, Little's Law [2] is quite simple and intuitively appealing. We describe the result by way of an analogy. Consider a pub. Customers arrive at the pub, stay for a while, and leave. Little's result states that the average number of folks in the pub (i.e., the queue length) is equal to the departure rate of customers from the pub times the average time each customer stays in the pub (see Fig. 3.2).

Figure 3.2. Little's Law.

This result applies across a wide range of assumptions. For instance, consider a deterministic situation where a new customer walks into the pub every hour on the hour. Suppose that each arriving customer finds three other customers already in the pub, and that the bartender regularly kicks out the customer who has been there the longest, every hour at the half hour. Thus, a new customer will enter at 9:00, 10:00, 11:00, ..., and the oldest remaining customer will be booted out at 9:30, 10:30, 11:30, .... It is clear that the average number of persons in the pub will be 3.5, since 4 customers will be in the pub for the first half hour of every hour and only 3 customers will be in the pub for the second half hour of every hour. The departure rate of customers at the pub is one customer per hour. The time spent in the pub by any customer is 3.5 hours. Thus, via Little's Law:

average number in the pub = departure rate x average time in the pub, i.e., 3.5 customers = 1 customer/hour x 3.5 hours

Also, it does not matter which customer the bartender kicks out. For instance, suppose that the bartender chooses a customer at random to kick out. We leave it as an exercise to show that the average time spent in the pub in this case would also be 3.5 hours. [Hint: the time a customer spends in the pub is one half hour with probability 0.25, one and a half hours with probability (0.75)(0.25) = 0.1875 (i.e., the customer avoided the bartender the first time around, but was chosen the second), two and a half hours with probability (0.75)(0.75)(0.25), and so on.]
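The hint can be checked numerically by summing the series it describes. The sketch below is a numerical sanity check under the hint's assumptions (each eviction picks a given customer with probability 0.25), not a proof; the function name and the truncation at 200 terms are arbitrary choices.

```python
def expected_stay_hours(p_kick=0.25, terms=200):
    """Sum (k + 0.5) * (1 - p)^k * p over k = 0, 1, ..., terms - 1.

    k is the number of evictions the customer survives before being
    chosen; each survived eviction adds one hour to the stay.
    """
    total = 0.0
    for k in range(terms):
        stay = k + 0.5                        # hours spent in the pub
        prob = (1 - p_kick) ** k * p_kick     # survive k times, then chosen
        total += stay * prob
    return total


mean_stay = expected_stay_hours()

# Little's Law with one departure per hour gives the mean population.
mean_population = 1.0 * mean_stay
```

The truncated sum is numerically indistinguishable from 3.5 hours, so the mean population is again 3.5 customers.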

Little's Law is quite general and requires few assumptions. In fact, Little's Law holds as long as customers are not destroyed or created. For example, if there is a fight in the pub and someone gets killed or if a pregnant woman goes into the pub and gives birth, Little's Law does not hold.

Little's Law applies to any "black box", which may contain an arbitrary set of components. Whether the box contains a single resource (e.g., a single CPU, a single pub) or a complex system (e.g., the Internet, a city full of pubs and shops), Little's Law holds. Thus, Little's Law can be restated as

Equation 3.2.5

average number of customers in a box = departure rate from the box x average time spent in the box

For example, consider the single server queue of Fig. 3.3. Let the designated box be the server only, excluding the queue. Applying Little's Law, the average number of customers in the box is interpreted as the average number of customers in the server. The server will either have a single customer who is utilizing the server, or the server will have no customer present. The probability that a single customer is utilizing the server is equal to the server utilization. The probability that no customer is present is equal to the probability that the server is idle.

Figure 3.3. Single server.

Thus, the average number of customers in the server equals

Equation 3.2.6

1 x U_i + 0 x (1 - U_i)

This is simply the server's utilization, U_i; that is, the average number of customers in the server, N^s, equals the server's utilization. The departure rate at the server (i.e., the departure rate from the box) equals the server throughput. The average time spent by a customer at the server is simply the mean service time of the server. Thus, with this interpretation of Little's Law, U_i = S_i x X_i. This result is simply the Utilization Law!

Now consider that the box includes both the waiting queue and the server. The average number of customers in the box (waiting queue + server), denoted by N_i, is equal, according to Little's Law, to the average time spent in the box, which is the response time R_i, times the throughput X_i. Thus, N_i = R_i x X_i. Equivalently, by measuring the average number of customers in a box and measuring the output rate (i.e., the throughput) of the box, the response time can be calculated by taking the ratio of these two measurements.

Finally, by considering the box to include just the waiting line (i.e., the queue but not the server), Little's Law indicates that N_i^w = W_i x X_i, where N_i^w is the average number of customers in the queue and W_i the average waiting time in the queue prior to receiving service.

Example 3.6.

Consider the database server of Example 3.2 and assume that during the same measurement interval the average number of database transactions in execution was 16. What was the response time of database transactions during that measurement interval?

The throughput of the database server was already determined as being 3.8 tps. Apply Little's Law and consider the entire database server as the box. The average number in the box is the average number N of concurrent database transactions in execution (i.e., 16). The average time in the box is the average response time R desired. Thus, R = N/X₀ = 16/3.8 = 4.2 sec.
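Little's Law and its rearrangement for response time can be sketched as two small functions. The function names are hypothetical; the inputs are those of Example 3.6 and of the pub analogy.

```python
def mean_number_in_box(throughput, mean_time):
    """Little's Law: N = X * R."""
    return throughput * mean_time


def mean_time_in_box(avg_number_in_box, throughput):
    """Little's Law rearranged: R = N / X."""
    return avg_number_in_box / throughput


# Example 3.6: 16 concurrent transactions at 3.8 tps.
response_time = mean_time_in_box(16, 3.8)

# Pub analogy: 1 departure per hour, 3.5 hours per customer.
pub_population = mean_number_in_box(1.0, 3.5)
```

The choice of box only changes what the three quantities mean (server only, queue only, or queue plus server); the arithmetic is the same in each case.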

3.2.5 Interactive Response Time Law

Consider an interactive system composed of M clients, each sitting at their own workstation and interactively accessing a common database server system. Clients work independently and alternate between "thinking" (i.e., composing requests for the server) and waiting for a response from the server. The average think time is denoted by Z and the average response time is R. See Fig. 3.4. The think time is defined as the time that elapses from when a client receives a reply to a request until a subsequent request is submitted. The response time is the time elapsed between successive think times by a client.

Figure 3.4. Interactive computer system.

Let M̄ and N̄ be the average number of clients thinking and the average number of clients waiting for a response, respectively. By viewing clients as moving between workstations and the database server, depending upon whether or not they are in the think state, M̄ and N̄ represent the average number of clients at the workstations and at the database server, respectively. Clearly, M̄ + N̄ = M, since a client is either in the think state or waiting for a reply to a submitted request. By applying Little's Law to the box containing just the workstations,

Equation 3.2.7

M̄ = X₀ x Z

since the average number of requests submitted per unit time (throughput of the set of clients) must equal the number of completed requests per unit time (system throughput X₀). Similarly, by applying Little's Law to the box containing just the database server,

Equation 3.2.8

N̄ = X₀ x R

where R is the average response time. By adding Eqs. (3.2.7) and (3.2.8),

Equation 3.2.9

M̄ + N̄ = M = X₀ x (Z + R)

With a bit of algebra,

Equation 3.2.10

R = M/X₀ - Z

This is an important formula known as the Interactive Response Time Law.

Example 3.7.

If 7,200 requests are processed during one hour by an interactive computer system with 40 clients and an average think time of 15 sec, the average response time is

Equation 3.2.11

R = M/X₀ - Z = 40/(7200/3600) - 15 = 40/2 - 15 = 5 sec
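A minimal sketch of the Interactive Response Time Law, applied to the numbers of Example 3.7, is shown below. The function and variable names are hypothetical.

```python
def interactive_response_time(clients, system_throughput, think_time):
    """Interactive Response Time Law: R = M / X0 - Z."""
    return clients / system_throughput - think_time


# Example 3.7: 7,200 requests in one hour, 40 clients, 15 sec think time.
x0 = 7_200 / 3_600                                  # requests per second
response_time = interactive_response_time(40, x0, 15)
```

Here `x0` is 2 requests/sec and `response_time` is 5 sec.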

Example 3.8.

A client/server system is monitored for one hour. During this time, the utilization of a certain disk is measured to be 50%. Each request makes an average of two accesses to this disk, which has an average service time equal to 25 msec. Considering that there are 150 clients and that the average think time is 10 sec, what is the average response time?

The known quantities are: U_disk = 0.5, V_disk = 2, S_disk = 0.025 sec, M = 150, and Z = 10 sec. From the Utilization Law,

U_disk = S_disk x X_disk

Thus, X_disk = 0.5/0.025 = 20 requests/sec. From the Forced Flow Law,

X₀ = X_disk/V_disk = 20/2 = 10 requests/sec

Finally, from the Interactive Response Time Law,

R = M/X₀ - Z = 150/10 - 10 = 5 sec

The multiclass version of the Interactive Response Time Law is R_r = M_r/X_0,r - Z_r. Figure 3.5 summarizes the main relationships discussed in the previous sections.
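Example 3.8 chains three operational laws, which the following sketch spells out step by step. The variable names are hypothetical; the inputs are the known quantities of the example.

```python
# Known quantities from Example 3.8.
u_disk = 0.5        # disk utilization
v_disk = 2          # visits to the disk per request
s_disk = 0.025      # disk service time per visit, in seconds
clients = 150       # M
think_time = 10     # Z, in seconds

# Utilization Law rearranged: X_disk = U_disk / S_disk
x_disk = u_disk / s_disk

# Forced Flow Law rearranged: X0 = X_disk / V_disk
x0 = x_disk / v_disk

# Interactive Response Time Law: R = M / X0 - Z
response_time = clients / x0 - think_time
```

Each step uses only measured or previously derived quantities, which is typical of how the laws are combined in practice.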

Figure 3.5. Summary of Operational Laws.
