15.2 Modeling High Service Time Variability | Performance by Design: Computer Capacity Planning By Example

Examples of high variability abound in computer systems. For instance, the Web shows extreme variability in workload characteristics, including file sizes and file popularity [15]. As another example, it has been shown that network arrivals, measured by packet interarrival times, are not exponentially distributed [28]. The authors also show that Poisson models underestimate the burstiness of TCP traffic. The question one may ask is: what are the sources of variability in computer systems? They are various. Here are some examples: sizes of data objects [27], files stored on Web servers [15], flow lengths traveling through the Internet, files stored in file systems [17], and process lifetimes. It is important to be able to model high service time variability, because increasing variability usually affects the performance of a system. As pointed out in [6], even a small amount of HTTP traffic burstiness degrades the throughput of a Web server. Results shown in Ref. [19] illustrate that the queuing dynamics within IP routers can be qualitatively different depending on whether the network traffic exhibits variability.

The variability of a dataset can be captured by the coefficient of variation (CV), defined as the ratio of standard deviation (s) to the mean (m).

Equation 15.2.1

The characteristics of the distribution function of a random variable show if the random variable has high variability. For example, the tail of a distribution (i.e., the probability that the random variable is greater than a given value) shows the likelihood of large observations. Distributions whose tail decline with a power law are called heavy-tailed. Power-laws are expressions of the form y ~ x^a, where a is a constant and x and y are the measures of interest. Measurements of several systems (e.g., Internet, WWW, Ethernet traffic, file systems) show numbers that exhibit a pattern known as "heavy-tailed" [15, 17, 27]. This means that very large numbers could be present in samples of measurements, such as number of bytes transmitted, file sizes and CPU times. These extremely large numbers do affect the performance of a system and should not be ignored. A random variable X follows a heavy-tailed distribution if P[X > x] = kx^a L(x) where L(x) is a slowly varying function. A Pareto distribution is a special case of heavy-tailed distribution where the number of events larger than x is an inverse power of x. Among other implications, a heavy-tailed distribution presents a great degree of variability, and a non-negligible probability of high sample values. Common distributions, such as the exponential distribution, decay much faster than heavy-tailed distributions. In a log-log plot, x^a is a straight line with inclination a. Depending on the value of a, a power law distribution may have infinite mean (if a < 1) or infinite variance (if a < 2). In practice, these stastics do not occur because values are limited by measured data. Once the importance of variability is recognized, this section presents techniques to analyze models with non-exponential FCFS queues. These techniques were consolidated by Eager, Sorin, and Vernon in [18].

15.2.1 The Decomposition Approach

Consider a closed single-class queuing model with N customers and a FCFS queuing center G that has exponentially distributed service time (see Fig. 15.1). Let S_G be the average service time at this service center. When the service time of service center G is not exponentially distributed and has a coefficient of variation (CV_G), the response time at G can be approximated by

Equation 15.2.2

where U_G(N) is the utilization of the service center G, n_G(N) is the average queue length at G, and L is an estimate [31] (given below) of the mean residual service time of the customer in service at an arrival instant, considering that arrivals do occur at random points in time. Note that Eq. (15.2.2) uses the Schweitzer [35] approximation for the response time equation already discussed in Chapter 13.

Equation 15.2.3

The approximation of Eq. (15.2.2) was found to be too pessimistic for centers that have high coefficient of variation for the service time [18].

Figure 15.1. FCFS service center with high service time variability.

graphics/15fig01.gif

An alternative approach to solving QN models with high service time variability was proposed in [41]. First of all, the service center G with high service time variability is synthesized as having a hyperexponential service time distribution as shown in Fig. 15.2. With probability p_S, customers who visit service center G experience a "short" exponentially distributed service time with average S_S. With probability p_L = 1 p_S, customers have a "long" exponentially distributed service time with average S_L. The problem now is given an empirical service time distribution at service center G with average service time S_G and coefficient of variation CV_G find the values of p_S, p_L, S_S and S_L that match S_G and CV_G. It can be easily shown (see Exercise 15.1) that

Equation 15.2.4

Equation 15.2.5

graphics/15equ25.gif

This system of equations has three unknowns (p_S, S_S, and S_L) and two equations. To solve it, one needs to guess one of the values. A good variable to guess is the probability p_S.

Figure 15.2. Model of a queue with high service time variability.

graphics/15fig02.gif

One can now explain, with the help of Fig. 15.3, how the approach proposed in [41] works. The QN on the left of the figure is replaced by two QNs. In each one of them, the hyperexponential service center is replaced by an exponential service center with average service time S_S in one case and S_L in the other. These two QNs are product form QNs and can be solved by the methods discussed previously in the book. Based directly on the theory of near-complete decomposability [14], the performance metrics of the original network (i.e., the one with the hyperexponential service center) can be obtained as the weighted averages of the performance metrics of the two product-form QNs. The weights are the probabilities p_S and p_L. The decomposition approach captures well the behavior of models with high service time variability. This technique yields results within 15% of the exact values over the large region of the parameter space [18]. A disadvantage of this method is its complexity for solving models that contain many centers with high service time variability.

Figure 15.3. Model of a QN with a two-stage hyperexponential service center.

graphics/15fig03.gif

15.2.2 New MVA-based Decomposition Approach

Instead of decomposing the model into two independent QN models, the approach described in this section applies the decomposition principle only to each individual service center with high service time variability [14]. This is done by modifying the MVA equations for these service centers only and keeping the regular MVA equations for the other service centers.

Let G be a service center with high service time variability and let D_S and D_L be the service demands at G considering the short and long service times discussed earlier. Then, the residence time at G can be written as

Equation 15.2.6

where p_L and p_S are as before and and are computed as follows

Equation 15.2.7

Equation 15.2.8

Equation 15.2.9

graphics/15equ29.gif

Equation 15.2.10

graphics/15equ210.gif

Eqs. (15.2.7) and (15.2.8) are the well-known MVA residence time equations with the difference that they use the proper service demand (long or short) and the proper expressions for queue length given in Eqs. (15.2.9) and (15.2.10). The queue length equations for n_L(N) and n_S(N) follow the well-known MVA queue length equations with the difference that, in each case, the throughput (i.e., N divided by the response time) includes the short or long residence time plus the residence time, R_rest(N), in the rest of the QN.

Example 15.1.

Consider a mail server with a powerful processor and one large disk subsystem. The analyst wants to model the system to analyze the future performance of the server under different workload scenarios. When choosing the abstraction to model the mail server, it was decided to use a closed QN model because during peak times the mail server has a constant (i.e., maximum) number of processes handling incoming and outgoing messages. A study of mail server workloads [9] shows that message size statistics exhibit a large coefficient of variation. Assuming the processor time to process a message is proportional to the size of the message, it is necessary to include high service time variability at the processor in the model. Assume that the following input parameters were obtained through measurements: CV_p = 4, N = 200, D_p = 0.0060 sec, and D_d = 0.0071 sec, where "p" and "d" stand for processor and disk, respectively. The question is: what are the throughput and response time of the mail server?

Consider two solution techniques. The first ignores the high variability for the service time at the processor and the other uses the modeling approach described earlier.

The processor is modeled as an exponential FCFS center.
Using the regular MVA equations (see Chapter 12), yields the following results for response time and throughput: 1.42 sec and 141 messages/sec, respectively.
Using the decomposition technique to model processor service time.
Assume that p_S » p_L and set p_L = 0.1. The solution of Eqs. (15.2.4) and (15.2.5) yields D_L = 0.01436 sec and D_S = 0.000136 sec. Using these values, one solves two MVA models, as shown in Fig. 15.3, yielding the following response times: R_L = 2.872 sec and R_S = 1.42 sec. Thus, the response time in the QN with high variability is equal to 0.1 x 2.872+0.9 x 1.42 = 1.57 and the throughput is equal to 133.7 messages/sec. Note that the exponential service time model overestimates the performance of the mail server.