15.6 Modeling ForkJoin Queuing Networks | Performance by Design: Computer Capacity Planning By Example

15.6 Modeling Fork/Join Queuing Networks

Parallelism and concurrency in computer systems can be modeled by fork-join systems. When entering a concurrent processing stage, an arriving request forks into subtasks that are executed independently either on different service centers or on the same service center. Upon completing execution, each subtask waits at the join point for its sibling tasks to complete execution. The fork operation starts new concurrent subtasks. The join operation forces one subtask to wait for sibling subtasks.

Fork/join models have been analyzed in the context of computer systems and communication networks. Fork/join queuing networks are used to model disk arrays under synchronous workloads [38]. The cache, the array controller, and the disks are represented by individual queuing service centers. The controller is modeled by a fork/join queue. Each disk is modeled by a queuing server of this fork/join queue. A request to the controller is forked into subrequests that are assigned to the disks of the disk array. An I/O request to the disk array is complete when all subrequests are complete, i.e., when they all meet at the join point.

Performance models of modern servers and operating systems typically involve solving a set of non-product form closed QNs with fork/join structures. In these models, fork/join stations model the synchronization constraints between processes at different stages of execution. Obtaining the exact solution for these networks is difficult and may be costly in the case of systems with a large state space. Therefore, approximate methods are needed. This section presents the approximate solution technique proposed in [39, 40] for queuing models with fork/join synchronization.

Consider that the QN consists of one or more interconnected subsystems as illustrated in the example of Fig. 15.12. Two types of subsystems are considered:

Figure 15.12. Fork and join.

graphics/15fig12.gif

Serial subsystem: consists of a single queuing service center like the ones discussed throughout the book.
Parallel subsystem: consists of k (k > 1) parallel service centers in a fork/join arrangement. That is, a request to a parallel subsystem is split into k independent sub-requests, one for each of the k service centers. The original request completes when all subrequests complete, i.e., when all of them arrive at the join point. Thus, the time spent by a request in a parallel subsystem is the maximum of the times spent in each of the k service centers. The average service time of a request is assumed to be the same at each of the k service centers of a parallel subsystem. Note that a serial subsystem is a special case of a parallel subsystem when k = 1.

Consider a closed fork/join queuing network with the following parameters:

I: number of subsystems in the queuing network model.
S_i: average service time at subsystem i.
k_i: number of parallel service centers at subsystem i.
R_i(n): response time at subsystem i when there are n requests in the queuing network.
n): residence time at subsystem i when there are n requests in the queuing network.
V_i: average number of visits per request to subsystem i.

The approximate fork/join MVA solution method is based on adapting the response time equation as indicated by Eq. (15.6.37), which is an approximation derived from the arrival theorem [30].

Equation 15.6.37

where is the average queue length at subsystem i when there is one less customer in the QN and H_{k_i} is the k_i-th harmonic number defined as [Note: when k = 1, Eq. (15.6.37) becomes the exact MVA response time equation since H₁ = 1.] According to Eq. (15.6.37), when there is only one request in the queuing network the average response time at a parallel subsystem with k_i service centers is S_iH_{k_i}. The harmonic number accounts for the synchronization time at the joining point. The rationale for this approximation comes from the fact that the expectation of a random variable defined as the maximum of k exponentially distributed random variables with average S is equal to SH_k. Performance measures for a fork/join queuing network are then obtained through the following approximate MVA equations:

Equation 15.6.38

Equation 15.6.39

graphics/15equ639.gif

Equation 15.6.40

The relative error of this approximation was shown to be less than 5% for the majority of the cases analyzed [39].

Example 15.6.

A database server consists of a powerful processor and two disk array subsystems, 1 and 2. The disk arrays operate under synchronous workload, i.e., transactions are blocked until their I/O requests complete service. A disk array improves I/O performance by striping data across multiple disks [38]. The disk array subsystems consist of k₁ and k₂ disks, respectively. A transaction alternates between using the processor and the I/O subsystem until it completes its execution. Previous measurements indicate that the average response time is 0.34 seconds when 10 transactions are executing concurrently. One of the SLAs for the database server is that the average response time should not exceed 1 second. Before increasing the concurrency level to 50 transactions, the performance analyst wants to predict the new response time to verify whether or not the response time SLA will continue to be satisfied.

The model parameters are as follows. The processor service demand is equal to 0.01 sec and the service demands for each disk of disk arrays 1 and 2 are 0.02 and 0.03, respectively. With the purpose of keeping the example simple, the number of parallel disks in the disk arrays is set to small values, i.e., k₁ = 2 and k₂ = 3. The queuing model of the database server has three subsystems, the processor and the two storage systems, with two and three disks, respectively. The model is closed with n transactions in execution. Typically a disk array has three components, the cache, the controller and the disks. The disk array in this example is modeled by separate and parallel services centers, such that caching effects as well as controller overheads are implicitly represented in the disk service demands. Each disk array is modeled by parallel subsystem with a fork/join structure, where each disk is a queuing server.

The parameters for the QN model are n = 50, I = 3 (i.e., three subsystems), k₀ = 1 (serial execution at the processor), k₁ = 2, k₂ = 3 (parallel execution at the disk arrays), D_processor = 0.01 sec, D_{disk array 1} = 0.02 sec, and D_{disk array 2} = 0.03 sec. The solution of the algorithm presented in this section provides an average response time of 1.53 seconds, which exceed the SLA of 1 second (see the MS Excel workbook Ch15-Ex15-6.XLS).