7.4 Computational methods for queuing network solutions


In Chapters 5 and 6 and the previous sections of this chapter, we introduced probability theory and analysis techniques for performing classical queuing system analysis. Those analyses, however, tend to be complex even for simple systems. In an effort to rectify this situation, three alternative analysis methods have emerged.

The first method, from Buzen [9], gives a technique for finding the normalization constant that is required for the solution of certain product form networks. The method does not require the solution of the normalization summation described in Chapter 5. Instead, it uses an iterative solution, which is simpler to implement.

The next method, from Buzen and Denning [10], introduced a methodology for assessing how well a given set of assumptions matches the system under analysis. In addition, they defined the performance quantities of interest in terms of their operational relationships in the system under study. For this reason, this kind of analysis is known as operational analysis.

The third analysis method, from Reiser and Lavenberg [11], attempts to simplify the analysis of queuing networks. By using the mean waiting time and mean queue size, in conjunction with Little's result, the solution of a system of queues can asymptotically approach the exact solution, with simpler computational requirements. This type of analysis is called mean value analysis.

In this section, we will discuss these methods and models. Some of the results are specific to the type of model used, while others are more general. The specific model cases, however, can be used to approximate a given system or portion of a system and to obtain an initial feeling for the system's actual behavior.

7.4.1 Central server model

The central server model, shown in Figure 7.17, was originally proposed as a model for jobs in a multiprogramming computer. It is a closed network, and we assume that a constant number, K, of jobs is always in process. In the original model, programs are serviced by the CPU (server 1) and then are routed to one of M - 1 I/O devices (servers 2 through M). After receiving I/O service, the program again queues for CPU time. If a program completes execution, it is rerouted into the CPU queue to start another job, thereby keeping the number of jobs in the system equal to K. This can be thought of as a system in which there is always a job waiting to enter the system at the CPU queue, but it will not do so until a job completes. The actual jobs in the system, therefore, may vary over time, but the number in circulation at any given time remains constant.

Figure 7.17: Central server model.

The central server model can be adapted to represent other systems besides a CPU and its associated I/O devices. For instance, we could choose server 1 to represent a multichannel DMA controller and servers 2 through M to represent the output channels. Or, we could adjust the branching probabilities to represent a system in which the jobs remain constant and never complete (i.e., P1 = 0). This could be useful for a dedicated I/O server. Alternatively, we could choose one of the servers 2 through M to represent an idle period for a job or I/O channel.

Although we may be able to formulate a central server model that somewhat reflects the actual situation, the match may not be precise. The benefit of this model, however, is the computational simplicity of many of its important performance parameters. Next, we will develop the computational model for this queuing network.

In the central server model, the servers are assumed to have exponential service time distributions. As shown in Figure 7.17, the exit from the central server has several branches, each with an associated branching probability, Pi. There are a total of K customers (jobs) in the system at any time. Let us define the state of the system:

(7.100) 

where ki denotes the number of customers in queue i. Thus:

(7.101) 

If we define B(i) as the probability of going to server i after service at server 1, and we let B(1) = 1, then:

(7.102) 

Using the same techniques described previously for closed queuing networks, we can obtain the state probabilities as:

(7.103) 

This equation for the state probabilities is said to be in product form, and it can be solved by finding the normalization constant, G(K), as outlined in the earlier sections of this chapter. The methods described there, however, require the solution of M simultaneous equations. An alternative method with fewer computations is as follows:

Let:

(7.104) 

(7.105) 

Let the states for the summation include all states where equation (7.101) holds, and stipulate that ki ≥ 0. Define another function, g(k, m), for a system with m queues instead of M. The following, then, is true:

(7.106) 

where k is the total number of customers in the system with M queues. Thus, we can further define g(k, m) as:

(7.107) 

We can break up the right-hand summation as:

(7.108) 

For the first summation in equation (7.108), if we always have at least one customer in queue m, we can think of the system as having k - 1 customers circulating through m queues. We must also remove the product term that relates to customers in queue m. Similarly, in the second summation, if queue m is always empty, then we can think of the system as having m - 1 queues and k customers. Thus, equation (7.108) becomes:

(7.109) 

The two summations can be rewritten, using equations (7.105) and (7.106), as:

(7.110) 

For k = 0, and for m = 1, equation (7.110) becomes:

(7.111) 

(7.112) 

We now have a set of initial conditions (equations [7.111] and [7.112]) and a recursive relationship (equation [7.110]) for calculating the values up to g(K, M) = G(K). We can then use equations (7.103) and (7.104) to calculate the state probabilities. The computation proceeds as follows:

Suppose we have a network similar to that shown in Figure 7.17, where μ = 3, μ1 = 0.9, μ2 = 0.5, μ3 = 0.9, p1 = 0.7, p2 = 0.2, and p3 = 0.1. Furthermore, suppose that there are K = 2 customers in the system. From equation (7.111), we know that:

  • g(0,1) = 1

  • g(0,2) = 1

  • g(0,3) = 1

and from equation (7.112) we know that:

  • g(0,1) = 1

  • g(1,1) = 1

  • g(2,1) = 1

We can arrange these values in a grid, as shown in Figure 7.18. The computation proceeds one row at a time and ends up with a value for g(2,3) = G(2).

Figure 7.18: G(k) grid calculation.

For example, to calculate g(1,2), we would proceed as follows:

Thus, for Figure 7.18, the normalization constant is equal to 0.6. With two customers, we can now calculate the state probabilities using equation (7.103).
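The recurrence of equation (7.110), together with the initial conditions, lends itself to a simple tabular computation. The following Python sketch illustrates the idea; the function name and the list x of per-server relative load terms (with x[0] = 1 for server 1, matching the all-ones first column of the grid in Figure 7.18) are illustrative assumptions, not the book's notation:

```python
def convolution_table(x, K):
    """Build the g(k, m) grid for Buzen's convolution algorithm.
    Recurrence: g(k, m) = g(k, m-1) + x[m] * g(k-1, m), with
    initial conditions g(0, m) = 1 and g(k, 1) = x[0] ** k.
    Returns a (K+1) x M table; g[K][M-1] is the normalization
    constant G(K)."""
    M = len(x)
    g = [[0.0] * M for _ in range(K + 1)]
    for m in range(M):
        g[0][m] = 1.0                          # g(0, m) = 1 for all m
    for k in range(1, K + 1):
        g[k][0] = x[0] ** k                    # single-queue column
        for m in range(1, M):
            g[k][m] = g[k][m - 1] + x[m] * g[k - 1][m]
    return g
```

The computation fills the grid one row at a time, exactly as in Figure 7.18, and requires only O(KM) multiplications rather than an enumeration of all states.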

Buzen [9] gives several expressions for performance measures that are based upon this general computational structure. One of these measures is the device utilization, Ui, for server i. Normally, we define the utilization of a device as the sum of the state probabilities where there is at least one customer at server i.

(7.113) 

(7.114) 

Using reasoning similar to that used for equation (7.109), we can treat equation (7.114) as a system with one fewer customer, multiplied by the factor that accounts for always having a customer at queue i. Thus, we get:

(7.115) 

so:

(7.116) 

We have already calculated the values for G(K) and G(K - 1), as in Figure 7.18, so calculating the utilization of a device is straightforward. From this, we can find the throughput of device i as:

(7.117) 

Looking back to equation (7.114), we can extend equation (7.116) to find the probability that the queue length at server i will be greater than or equal to some value n:

(7.118) 

so that:

(7.119) 

Applying equation (7.118) to the case where the queue length is n + 1, we can obtain the probability of n customers in queue i.

(7.120) 

(7.121) 

Now that we have derived an expression for the probability of having n customers in the queue, we can use equation (5.66) to obtain an expression for expected queue length:

(7.122) 

We can now expand and collect terms, keeping in mind that G(k) = 0 for k < 0, to get:

(7.123) 

The expected delay through a queue, then, can be found from Little's result, using the throughput and expected queue length values just found:

(7.124) 
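Once the values G(0) through G(K) are available from the grid computation, the measures above follow directly. The sketch below uses the standard convolution-algorithm expressions corresponding to equations (7.116), (7.117), (7.123), and (7.124); the relative load terms x and service rates mu are the same illustrative assumptions as before:

```python
def server_measures(x, mu, G):
    """Per-server measures from the normalization constants G[0..K].
    U_i = x_i G(K-1)/G(K)                        (eq. 7.116)
    X_i = mu_i U_i                               (eq. 7.117)
    Nq_i = sum_{n=1..K} x_i**n G(K-n) / G(K)     (eq. 7.123)
    W_i = Nq_i / X_i, by Little's result         (eq. 7.124)"""
    K = len(G) - 1
    U = [xi * G[K - 1] / G[K] for xi in x]
    X = [ui * mi for ui, mi in zip(U, mu)]
    Nq = [sum(xi ** n * G[K - n] for n in range(1, K + 1)) / G[K]
          for xi in x]
    W = [nq / xt for nq, xt in zip(Nq, X)]
    return U, X, Nq, W
```

As a sanity check, the mean queue lengths Nq_i always sum to K, since every one of the K circulating customers is at some queue.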

These techniques allow the efficient computation of the important statistics for the model shown in Figure 7.17. Next, we will discuss another, slightly more general computational method: mean value analysis.

7.4.2 Mean value analysis

The analyses described so far have all calculated, at one point or another, expressions for queue length distributions or state probabilities. This is acceptable for rather simple systems involving few customers and few queues. In the following discussion, we will explore an iterative method for finding some of the performance measures of interest without calculating the aforementioned distributions. The drawback is that the analysis yields only the mean values of certain performance measures.

The techniques we are interested in here apply to closed queuing networks that have a product form solution for the state probabilities. The solutions are based on the assumption that a customer, arriving at any queue in a closed system that is at steady state, experiences a wait in the queue that is equivalent to the steady-state wait time for that queue with the arriving customer removed. Thus, the behavior of a network with one more customer is based upon the behavior before its arrival. This assumption leads to an iterative algorithm where the steady-state performance characteristics for the system with n + 1 customers are derived from the characteristics with n customers, which are derived from a system with n - 1 customers, and so on down to one customer.

The general algorithm, then, allows us to compute average values for queue length, throughput, server utilization, and wait time by starting with an expression for one customer in the system and working up to any number of customers.

The main theorem behind mean value analysis states that the mean wait time for customers at any server in the network is related to the solution of the same network with one fewer customer. In conjunction with the wait time, we apply Little's result to obtain the total network throughput and then apply it again to each individual server to get the average queue length at each server.

The expression for wait time, related to a network with one fewer customer, is given as:

(7.125) 

where μ is the mean service time required by a customer at the server. The quantities W(k) and Nq(k - 1) denote the mean wait time at the queue for a system with k customers and the mean number of customers in the queue with k - 1 customers in the system, respectively. This expression holds for systems with a first-come, first-served queuing discipline and a single, constant-rate server at each queue.

Next, we can apply Little's result to find the mean throughput for the network:

(7.126) 

where Vi is the visit ratio for server i, considered relative to all other servers. The visit ratio values will be discussed later in this chapter.

Finally, we can use Little's result on the servers to compare the average queue lengths:

(7.127) 

Now we have a new expression for mean queue length that we can use in equation (7.125) to start another iteration.

The general procedure, then, is to start with an empty system (K = 0) and iterate equations (7.125) through (7.127) until we reach the desired value of K. For one iteration, we calculate the values for each queue system in the network before passing on to the next equation. Figure 7.19 shows a simple network to illustrate the technique. In the example, if we start with 0 customers, we obtain the following quantities from equations (7.125) through (7.127). In the following expressions, the subscripts denote the queue/server pair that the measure is associated with. The general iteration algorithm is as follows:

Figure 7.19: Network for mean value analysis.

The first iteration is:

(7.128) 

The second iteration is:

(7.129) 

The visit ratios, Vi, are obtained as follows. Pick a server and set its visit ratio Vi to 1. Next, formulate the equation for each queue's visit ratio by looking at all queues that feed it: equate the sum of the feeder visit ratios, each multiplied by its respective branching probability, to the visit ratio of the queue they feed. Continue this process for each queuing system until we have m relationships in m unknowns, where m is the number of queuing systems. We can then solve this system of equations to obtain the desired visit ratios. Note that the visit ratios are relative quantities. For Figure 7.19, the visit ratios would be calculated as follows:

(7.130) 
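The procedure amounts to solving a small linear system. One way to sketch it is by fixed-point iteration, pinning one server's ratio at 1 as the text describes; the routing-matrix representation and convergence behavior here are illustrative assumptions (convergence requires a well-behaved routing chain):

```python
def visit_ratios(P, ref=0, iters=500):
    """Solve V_i = sum_j V_j * P[j][i] by fixed-point iteration,
    with the visit ratio of server `ref` pinned at 1.
    P[j][i] is the branching probability from server j to server i."""
    n = len(P)
    v = [1.0] * n
    for _ in range(iters):
        v = [sum(v[j] * P[j][i] for j in range(n)) for i in range(n)]
        v = [vi / v[ref] for vi in v]      # visit ratios are relative
    return v
```

For a small network the same system can, of course, be solved directly by hand, as the text suggests.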

The algorithm is iterated until we reach the desired network population, where we can calculate the mean performance measures for the network.
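The three-step iteration of equations (7.125) through (7.127) can be written compactly. The following is a minimal sketch, assuming single constant-rate FCFS servers as in the text; the mean service times S, visit ratios V, and function name are illustrative:

```python
def mva(S, V, K):
    """Exact mean value analysis for a closed product-form network.
    S[i]: mean service time at server i; V[i]: visit ratio of server i;
    K: customer population. Returns (network throughput, per-server
    mean wait times, per-server mean queue lengths)."""
    M = len(S)
    Nq = [0.0] * M                               # empty system (k = 0)
    for k in range(1, K + 1):
        # eq. (7.125): wait seen by an arrival with k customers present
        W = [S[i] * (1.0 + Nq[i]) for i in range(M)]
        # eq. (7.126): Little's result applied to the whole network
        X = k / sum(V[i] * W[i] for i in range(M))
        # eq. (7.127): Little's result applied to each queue
        Nq = [X * V[i] * W[i] for i in range(M)]
    return X, W, Nq
```

Each pass computes the values for every queue before moving to the next equation, exactly as the text prescribes, so reaching population K costs K iterations over the M queues.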

7.4.3 Operational analysis

The methods for performing queuing analysis given in the beginning of the chapter provide close approximations to the actual systems they represent. It can be contended, however, that the assumption that the various distributions and relationships we use accurately represent their real-world counterparts cannot be proven absolutely. Furthermore, the stochastic models studied earlier yield relationships that cannot be proven to hold during any particular observation period.

Operational analysis, on the other hand, is based upon the observation of basic, measurable quantities that can be combined into operational relationships. Furthermore, the observation period for which the system is analyzed is finite. The assumption is that the basic quantities, called operational variables, are measurable (at least in theory). The basic operational variables that are commonly found in operational analysis are as follows:

  • T = the observation period length

  • A = the number of arrivals during the observation period

  • B = the server busy time during the observation period

  • C = the number of job completions during the observation period

In addition to the measurable quantities, there are several fundamental relationships that define other useful quantities. These are as follows:

  • λ = arrival rate = A/T

  • X = completion rate = C/T

  • U = server utilization = B/T

  • S = mean service time per job = B/C

Several operational identities are also defined that hold under certain operational assumptions. The first, which relates server utilization to the completion rate and mean service time, is derived as follows:

(7.131) 

This relationship holds in all cases and is thus referred to as an operational law.

If we assume that the system is in steady-state equilibrium, we can state that the number of arrivals and the number of completions during a given observation period will be equal (i.e., the flow balance assumption). This statement may not always be true, but it can be measured and verified for any observation period of interest. Thus, it is called an operational theorem. From this, we can derive another relationship, which holds when the system is in flow balance:

(7.132) 

One advantageous property of operational analysis is that the technique can be used for open and closed networks. The one condition, however, that must be met is that the network under consideration must be what is called operationally connected. That is, no server may be idle during the entire observation period.

For a closed system, we know the number of jobs in circulation in the network and we find the system throughput at a particular point in the network. Other quantities can then be derived using that throughput as a starting point. In an open system, the throughput at the exit from the network is assumed to be known, and we use this to find the number of customers at the queues.

Let's look now at some basic operational quantities. Suppose that we have an arbitrary network that is observed over a period of T. For each queue system in the network, we observe and collect the following data:

  • Ai = number of arrivals at queuing system i

  • Bi = busy time of server i

  • Cij = number of times a job goes directly from server i to server j's queue

Jobs that arrive from an external source or that leave to an external sink are denoted by A0i and Ci0. The number of arrivals to and departures from the system are given by:

(7.133) 

(7.134) 

and the total number of completions at server i is given as:

(7.135) 

From the basic measured quantities defined previously, several other performance quantities can be derived, as follows:

Utilization of server i:

Ui = Bi/T

Mean service time of server i:

Si = Bi/Ci

Output rate of server i:

Xi = Ci/T

Routing frequency from server i to j:

qij = Cij/Ci
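These derived quantities are simple ratios of the measured counters. A minimal sketch (the function and parameter names are illustrative, not from the text):

```python
def derived_quantities(T, B, C, Cij):
    """Operational quantities for one observation period of length T.
    B[i]: busy time of server i; C[i]: completions at server i;
    Cij[i][j]: jobs routed directly from server i to server j."""
    U = [b / T for b in B]                       # utilization U_i = B_i / T
    S = [b / c for b, c in zip(B, C)]            # mean service time S_i = B_i / C_i
    X = [c / T for c in C]                       # output rate X_i = C_i / T
    q = [[cij / C[i] for cij in row]             # routing frequency q_ij = C_ij / C_i
         for i, row in enumerate(Cij)]
    return U, S, X, q
```

Note that U_i = X_i S_i holds identically for these ratios, which is the utilization law of equation (7.131).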

We can represent the job completion rate of such a system as:

(7.136) 

and the utilization of any server i as:

(7.137) 

If we think of the wait time at a particular server i at each increment of time during the observation period as the sum of the service times of the customers ahead of the new arrival, the total wait time accumulated for all jobs in the system over the period is:

(7.138) 

The average queue length at the server in question is given as:

(7.139) 

and the response time of the server system is given as:

(7.140) 

Combining equations (7.139) and (7.140), we obtain the operational equivalent of Little's result:

(7.141) 

If the system under study is in steady state, so that we have flow balance, we assume that the arrival rate to a queuing system is equal to the completion rate of that same system. We can also derive the server throughput rate for any server, j, as:

(7.142) 

We can obtain the same expression as equation (7.136), generalized for any server i:

(7.143) 

The relationship derived yields a unique solution if applied to an open system, because the input throughput, X, is known. In a closed system, equation (7.143) will yield relative throughput rates, because we do not know the absolute value of X0.

Buzen [9] defines the visit ratio Vi as the number of times a particular server, i, will be visited, relative to a given number of inputs. We can express this quantity as the ratio of the throughput at server i to the total input throughput:

(7.144) 

If we assume that the flow of jobs in the network is balanced, we can set V0 = 1 (since all jobs pass through the network input) and solve for all of the other visit ratios using the following expression:

(7.145) 

Also, knowing the throughput of any server in the network allows us to find the throughput of any other server through a combination of equations (7.144) and (7.145).

Now let's look at the total time a job remains in the system as a function of each server's throughput and average queue length. The total wait time for any job arriving at any server depends on how many jobs are ahead of the new one in the queue and on the rate that jobs get completed by the server. At each server, we can use Little's result equation (7.141) in combination with equation (7.144) to obtain:

(7.146) 

If we then sum equation (7.146) over all servers in the network, we obtain a general expression that can be interpreted as Little's result applied to the total system:

(7.147) 

where the number of jobs in the system at any time is simply the sum of all jobs at the network's servers:

(7.148) 

So we have:

(7.149) 

The left-hand side of equation (7.149) can be thought of as an application of Little's result to the system as a whole; thus, we define the system response time as:

(7.150) 

The final topic that we will cover under operational analysis is bottleneck analysis in a closed system. In every network, one of the queuing systems will eventually be unable to keep up with increased service demands as the number of jobs in the network increases. This particular server will subsequently determine the maximum throughput rate of the network as a whole. A device is considered to be saturated (i.e., unable to process jobs any faster) when its utilization reaches one. In this case, its throughput is inversely proportional to its service time, since there will always be a job in service.

(7.151) 

If we combine equations (7.137) and (7.144), we can express the relative utilization of any two servers in the network as follows:

(7.152) 

Note that the ratios of the server utilizations do not depend upon the throughput of either server; the ratio remains constant independent of system load. Thus, the device with the largest value of ViSi will become the network's bottleneck as load increases.

It is possible, then, to find the maximum possible system throughput when the bottleneck is in saturation. Since, for bottleneck server b, throughput is equal to the inverse of the service time, we can combine equations (7.144) and (7.151) to obtain the maximum system throughput:

(7.153) 

The network response time, in the saturation case, is given by equation (7.150) as:

(7.154) 

and is thus limited by the bottleneck server.
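Since the ratio U_i/U_j is fixed by V_i S_i / V_j S_j regardless of load, locating the bottleneck reduces to taking a maximum over the service demands V_i S_i. A sketch (names are illustrative):

```python
def bottleneck(V, S):
    """Return the index of the bottleneck server and the saturation
    throughput bound X_max = 1 / (V_b * S_b), as in equation (7.153).
    V[i]: visit ratio of server i; S[i]: mean service time of server i."""
    demand = [v * s for v, s in zip(V, S)]   # service demand V_i * S_i
    b = max(range(len(demand)), key=demand.__getitem__)
    return b, 1.0 / demand[b]
```

The server with the largest product V_i S_i is the bottleneck, and no population increase can push system throughput past the returned bound.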

Buzen and Denning [10] extend the operational results discussed previously to systems with load-dependent behavior. Also, an earlier proposal for operational analysis of queuing networks can be found in [12].



Computer Systems Performance Evaluation and Prediction
ISBN: 1555582605
Year: 2002