8.6 Adding More Servers | Performance by Design: Computer Capacity Planning By Example

The original online auction site model assumes a single Web server, a single application server, and a single database server. In general, there may be many identical servers in each tier. Consider a simple case of N_ws identical Web servers and a perfect load balancer that sends exactly 1/N_ws of the traffic to each Web server. Thus, the average arrival rate of requests at each Web server is λ/N_ws, where λ is the overall arrival rate of requests to the site. In the following analysis, the application and database servers are ignored and a single class of requests is considered. The generalization to multiple classes is straightforward.

One way of modeling the N_ws-server situation is to replicate all queues that represent the devices (e.g., CPU, disks) of a Web server in the QN model so that there are N_ws of each device in the QN model. Given the assumption of identical Web servers and perfect load balancing, a simpler approach can be followed where all N_ws Web servers are represented in the QN model by a single equivalent Web server. The model is constructed so that the average response time of the set of N_ws servers is the same as that of the single equivalent server. Figure 8.5 illustrates this situation.

Figure 8.5. Single Web server equivalent to multiple Web servers.

The response time of a request that goes through server j (j = 1, ···, N_ws) of Fig. 8.5(a) is the same as the response time of the single equivalent Web server of Fig. 8.5(b) with an arrival rate equal to λ/N_ws. (The two models have the same service demands at the CPU and disk devices.) The response time at the single equivalent server of Fig. 8.5(b) is given by

Equation 8.6.17

$$R = \sum_{i=1}^{K} \frac{D_i}{1 - (\lambda / N_{ws})\, D_i}$$

where K is the number of devices (i.e., CPU and disk) and D_i (i = 1, ···, K) is the service demand of a request at device i. Note that the term (λ/N_ws) D_i is the utilization of device i according to the Service Demand Law. The generalization of Eq. (8.6.17) to multiple classes is

Equation 8.6.18

$$R_r = \sum_{i=1}^{K} \frac{D_{i,r}}{1 - \sum_{s=1}^{R} (\lambda_s / N_{ws})\, D_{i,s}}$$

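where R_r is the average response time of class r requests, λ_r is the average arrival rate of requests of class r, and D_i,r is the service demand of a class r request at device i. The denominator subtracts from 1 the total utilization of device i over all R classes, i.e., the sum over all classes s of (λ_s/N_ws) D_i,s.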
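Eqs. (8.6.17) and (8.6.18) translate directly into a few lines of code. The following is a minimal Python sketch, not code from the book; the function names, argument layout, and the choice to raise an error when a device saturates (utilization of 1 or more) are assumptions made for illustration.

```python
from typing import List, Sequence


def response_time_single_class(lam: float, n_servers: int,
                               demands: Sequence[float]) -> float:
    """Eq. (8.6.17): response time at the single equivalent server.

    lam       -- overall arrival rate to the tier (requests/sec)
    n_servers -- number of identical servers (N_ws)
    demands   -- service demand D_i (sec) at each of the K devices
    """
    per_server_rate = lam / n_servers
    total = 0.0
    for d in demands:
        utilization = per_server_rate * d  # Service Demand Law
        if utilization >= 1.0:
            raise ValueError("device saturated: utilization >= 1")
        total += d / (1.0 - utilization)
    return total


def response_times_multiclass(lams: Sequence[float], n_servers: int,
                              demands: Sequence[Sequence[float]]) -> List[float]:
    """Eq. (8.6.18): per-class response times.

    lams    -- arrival rate of each class (requests/sec)
    demands -- demands[i][r] is D_{i,r}: demand of class r at device i (sec)
    """
    response = [0.0] * len(lams)
    for device_demands in demands:
        # Total utilization of this device over all classes.
        utilization = sum(l / n_servers * d
                          for l, d in zip(lams, device_demands))
        if utilization >= 1.0:
            raise ValueError("device saturated: utilization >= 1")
        for r, d in enumerate(device_demands):
            response[r] += d / (1.0 - utilization)
    return response
```

With a single class, `response_times_multiclass([lam], n, [[d] for d in demands])` reduces to `response_time_single_class(lam, n, demands)`, mirroring the way Eq. (8.6.18) reduces to Eq. (8.6.17).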

As an example, consider the service demands of Table 8.4 and an overall session start rate γ of 11 sessions/sec. Consistent with the example in Section 8.3, assume 25% of type A customers and 75% of type B customers. The arrival rates for each type of request are then λ_home = 11.0 requests/sec, λ_search = 13.71 requests/sec, λ_view = 2.22 requests/sec, λ_login = 3.68 requests/sec, λ_create = 1.10 requests/sec, and λ_bid = 2.10 requests/sec.

Consider three Web servers instead of one. The utilizations of the CPU and disk at the single equivalent Web server are given by

graphics/216equ01.gif

and

graphics/217equ02.gif

Then, the response time of each of the six classes of requests at the Web server tier is computed using Eq. (8.6.18) as

graphics/217equ03.gif
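The calculation for three Web servers can be scripted as sketched below. The arrival rates are the ones given above, but the service demands are hypothetical placeholders chosen only for illustration: they are not the values of Table 8.4, so the printed results will not match the book's numbers. Substitute the Table 8.4 demands to reproduce the example.

```python
# Arrival rates (requests/sec) from the example in the text.
arrival_rates = {
    "home": 11.0, "search": 13.71, "view": 2.22,
    "login": 3.68, "create": 1.10, "bid": 2.10,
}

# HYPOTHETICAL Web server service demands (sec); NOT the Table 8.4 values.
cpu_demand = {
    "home": 0.010, "search": 0.010, "view": 0.010,
    "login": 0.050, "create": 0.010, "bid": 0.020,
}
disk_demand = {
    "home": 0.010, "search": 0.030, "view": 0.050,
    "login": 0.010, "create": 0.050, "bid": 0.040,
}

N_WS = 3  # number of identical Web servers


def device_utilization(demand):
    """Total utilization of one device at the single equivalent server."""
    return sum(arrival_rates[r] / N_WS * demand[r] for r in arrival_rates)


u_cpu = device_utilization(cpu_demand)
u_disk = device_utilization(disk_demand)

# Eq. (8.6.18): sum of residence times at the CPU and the disk.
response = {
    r: cpu_demand[r] / (1.0 - u_cpu) + disk_demand[r] / (1.0 - u_disk)
    for r in arrival_rates
}

print(f"U_cpu = {u_cpu:.4f}, U_disk = {u_disk:.4f}")
for r, value in response.items():
    print(f"R_{r} = {value:.4f} sec")
```

Changing `N_WS` and rerunning shows how the utilizations, and therefore the response times, fall as servers are added.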

The same approach of replacing all servers of the Web tier by a single equivalent Web server can be applied to the application and database tiers.
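As a rough illustration of applying the equivalent-server idea tier by tier, the sketch below evaluates Eq. (8.6.17) once per tier and adds the results. It assumes a single class of requests, that every request visits all three tiers, and perfect load balancing within each tier. The server counts, service demands, and arrival rate are hypothetical values, not data from the book.

```python
def tier_response_time(lam, n_servers, demands):
    """Eq. (8.6.17) applied to one tier of n_servers identical servers."""
    total = 0.0
    for d in demands:
        utilization = lam / n_servers * d
        if utilization >= 1.0:
            raise ValueError("device saturated: utilization >= 1")
        total += d / (1.0 - utilization)
    return total


# HYPOTHETICAL configuration: (number of servers, [CPU demand, disk demand] in sec).
tiers = {
    "web": (3, [0.010, 0.020]),
    "app": (2, [0.020, 0.010]),
    "db": (1, [0.010, 0.020]),
}

lam = 30.0  # hypothetical overall arrival rate (requests/sec)

per_tier = {
    name: tier_response_time(lam, n, demands)
    for name, (n, demands) in tiers.items()
}
site_response_time = sum(per_tier.values())

for name, value in per_tier.items():
    print(f"{name}: {value:.4f} sec")
print(f"total: {site_response_time:.4f} sec")
```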