13.2 The Need for Multiple-Class Models | Performance by Design: Computer Capacity Planning By Example

There are various motivations for constructing multiple-class models to capture the features of heterogeneous workloads. These models can be used to represent different quality of service (QoS) and SLA requirements for the different workload classes. Multiple-class models capture the priorities and special services that each workload class requires.

As a more detailed example, consider the widespread use of electronic mail for personal communication. The need to provide reliable and high-quality mail service motivates providers to conduct comprehensive capacity planning analysis for their mail servers. A mail server can be viewed as a collection of several underlying subsystems, including storage buffers, network routers, and processing servers. A typical problem faced by service providers is how to size a mail server (i.e., choose the amount of storage capacity, number of network ports, and processor speed) in the most cost-effective way. Profiles of e-mail clients exhibit dramatic differences in their resource demands, by factors of 50 or more between light and heavy e-mail users [5, 20]. Representing such disparate profiles by a single workload class does injustice to both light and heavy users. For example, approximating one heavy user and one light user by two medium users results in an inaccurate workload representation: the identities of the heavy user and the light user are lost, replaced by two nonexistent medium users. Hence, mail server performance models need different classes of requests to provide meaningful results.
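The loss of identity can be made concrete with a small numerical sketch. It assumes a single load-independent queueing resource in an open model, for which the standard product-form result gives a per-class residence time of D_r / (1 - U), where U is the total utilization. The arrival rates and service demands are hypothetical values chosen only to reflect the factor of 50 mentioned above; they are not measurements from the text.

```python
def residence_times(arrival_rates, demands):
    """Per-class residence time at one load-independent queueing resource.

    Uses the open product-form result R_r = D_r / (1 - U),
    where U is the sum over classes of arrival_rate * demand.
    """
    utilization = sum(lam * d for lam, d in zip(arrival_rates, demands))
    if utilization >= 1.0:
        raise ValueError("resource is saturated (utilization >= 1)")
    return [d / (1.0 - utilization) for d in demands]


# Hypothetical workload: light users first, heavy users second.
arrival_rates = [20.0, 1.0]   # messages per second
demands = [0.01, 0.5]         # seconds of service per message (factor of 50)

# Two-class view: each class keeps its own demand.
per_class = residence_times(arrival_rates, demands)

# Single-class view: every message gets the same average demand.
total_rate = sum(arrival_rates)
avg_demand = sum(lam * d for lam, d in zip(arrival_rates, demands)) / total_rate
single_class = residence_times([total_rate], [avg_demand])[0]

print("light users :", per_class[0])
print("heavy users :", per_class[1])
print("single class:", single_class)
```

Both views load the resource equally, but the single-class figure falls between the two per-class values and describes neither group.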

The choice of the workload abstraction and the corresponding number of classes are key steps in performance modeling. For example, different Service Level Agreements (SLAs) are usually imposed on different workload classes [14]. SLAs represent guarantees regarding the quality of service provided by a system. An SLA for one workload class could state that 90% of all messages to local users are delivered to the target mailbox within 60 seconds. An SLA for a different workload class could state that 98% of messages sent to remote mail servers are received by the remote server within 90 seconds.
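A percentile-style SLA of this kind can be checked against measured delivery times with a few lines of code. The following is a minimal sketch: the function name `sla_met` and the sample delivery times are illustrative assumptions, while the thresholds (60 and 90 seconds) and target fractions (90% and 98%) come from the two SLAs above.

```python
def sla_met(delivery_times_sec, threshold_sec, required_fraction):
    """Return True if at least required_fraction of the observed
    delivery times are within threshold_sec (inclusive)."""
    if not delivery_times_sec:
        raise ValueError("no observations to evaluate")
    within = sum(1 for t in delivery_times_sec if t <= threshold_sec)
    return within / len(delivery_times_sec) >= required_fraction


# Hypothetical measurements, one list per workload class (seconds).
local_times = [5, 12, 30, 45, 58, 20, 8, 61, 15, 40]
remote_times = [10, 80, 95, 30, 60]

local_ok = sla_met(local_times, threshold_sec=60, required_fraction=0.90)
remote_ok = sla_met(remote_times, threshold_sec=90, required_fraction=0.98)

print("local SLA met :", local_ok)
print("remote SLA met:", remote_ok)
```

Keeping the measurements separated by class is what makes the check possible; pooled delivery times could not be tested against two different targets.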

The accuracy of the system model is strongly influenced by the number of workload classes chosen. Too few classes can lead to inaccurate generalizations, whereas too many classes lead to excessive detail and complexity. As an example, consider the case of a Web-based shopping system. In the performance analysis of the system, three workload classes are considered: cacheable transactions, noncacheable transactions, and search transactions [2]. The transaction requests are grouped into classes based on their impact on the performance of the system. Cacheable requests have their responses stored in an application server cache and consequently demand less processor and disk time than noncacheable requests. Processing a noncacheable request requires approximately 100 times more CPU time than is required when the response is in the cache [2]. Thus, each class of transactions uses the system resources differently and experiences very different response times. Single-class models are unable to answer important performance questions related to specific workload classes, because they cannot single out differences among groups of transactions. Single-class models are effective in capturing global behavior but are limited in their ability to predict the behavior of individual (and often important) groups.

Although multiple-class models are more useful and natural for describing workloads of real systems, they present problems to the modeler. For instance, it is difficult to obtain parameters (e.g., multiclass service demands and multiclass visit ratios) for models with multiple classes. Usually, monitoring tools do not provide measurements on a per-class basis. Inferences (sometimes wild guesses) have to be made to parameterize each workload class and to apportion the system overhead among the classes. As a result, it is more difficult to obtain accurate parameters for multiple-class models than for single-class ones.
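One common way to make such an inference, sketched here as an assumption rather than as the method prescribed by the text, is to split a device's total measured utilization among the classes in proportion to some per-class weight (for example, CPU seconds taken from an accounting log) and then apply the Service Demand Law, D = U / X, to each class. The class names, weights, and throughputs below are hypothetical.

```python
def apportion_demands(total_utilization, class_weights, class_throughputs):
    """Estimate per-class service demands at one device.

    total_utilization : measured utilization of the device (0..1),
                        including overhead not attributed to any class
    class_weights     : class -> relative share of device usage
    class_throughputs : class -> completions per second
    """
    total_weight = sum(class_weights.values())
    demands = {}
    for name, weight in class_weights.items():
        # Share of the total utilization attributed to this class.
        class_utilization = total_utilization * weight / total_weight
        # Service Demand Law: D = U / X.
        demands[name] = class_utilization / class_throughputs[name]
    return demands


# Hypothetical inputs for a CPU that is 60% busy overall.
weights = {"A": 30.0, "B": 90.0}       # e.g., accounted CPU seconds
throughputs = {"A": 10.0, "B": 2.0}    # transactions per second

cpu_demands = apportion_demands(0.60, weights, throughputs)
print(cpu_demands)
```

Because the weights are normalized, the estimated per-class utilizations always add up to the measured total, so unaccounted overhead is spread across the classes in the same proportions. Whether that is a fair split depends on the system being measured.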

Example 13.1.

An explicit SLA defines the expectations between application clients and service providers. Increased expectations are associated with increased costs for meeting those expectations. Thus, an SLA expresses a direct relationship between a class of customers and the service demands (and the related costs) of their applications. Some customers require very short response times for critical applications and are willing to pay more for these specific transactions.

Suppose that the manager of a data center service provider is negotiating an SLA with a client representing a financial company for three types of applications: risk portfolio analysis, purchase transactions, and browsing transactions. The initial proposal states that the average response time for risk portfolio analysis is to be 3 hours and that the client is willing to pay 15 dollars per execution for this service. Purchase transactions are to have an average system response time (i.e., from when a client submits a purchase request until a purchase verification message is returned) of less than 1 second, and each transaction will cost 50 cents. Browsing transactions are to have an average response time of less than 2 seconds and will cost 10 cents each.

Before agreeing to the SLA, the data center manager needs to know whether the currently installed capacity can accommodate the proposed new services for the financial company client. This is an important step of the process because the SLA may also specify financial penalties if the response times are not met. How should the data center's performance analyst specify the workload? A single-class workload description does not provide adequate detail for analyzing the SLA. Instead, the performance analyst specifies a multiple-class workload model as follows.

Class 1: The risk portfolio analysis is modeled by a closed class, consisting of a set of background processes, defined by the service demands (i.e., processor and disks) and the number of processes in execution during "the peak hour."
Class 2: The online purchase transactions are modeled by an open class, defined by the service demands (i.e., processor and disks) and the average arrival rate during "a peak minute."
Class 3: The browsing transactions are modeled by an open class, defined by service demands (i.e., processor and disks) and an average arrival rate during "a peak minute."
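The three class definitions above can be written down as a small data structure. This is only a sketch: the type names, populations, arrival rates, and service demands are all hypothetical placeholders, since the example does not give numerical parameters. The helper computes the utilization that the open classes alone place on each device, which must stay below 1 for a mixed model to have a steady-state solution.

```python
from dataclasses import dataclass


@dataclass
class ClosedClass:
    name: str
    population: int      # processes in execution during the peak hour
    demands: dict        # device -> service demand in seconds


@dataclass
class OpenClass:
    name: str
    arrival_rate: float  # requests per second during a peak minute
    demands: dict        # device -> service demand in seconds


def open_utilization(open_classes):
    """Utilization per device due to the open classes only."""
    util = {}
    for cls in open_classes:
        for device, demand in cls.demands.items():
            util[device] = util.get(device, 0.0) + cls.arrival_rate * demand
    return util


# Hypothetical parameter values for the three classes.
risk = ClosedClass("risk portfolio analysis", population=4,
                   demands={"cpu": 1800.0, "disk": 900.0})
purchase = OpenClass("purchase", arrival_rate=5.0,
                     demands={"cpu": 0.05, "disk": 0.08})
browsing = OpenClass("browsing", arrival_rate=20.0,
                     demands={"cpu": 0.01, "disk": 0.02})

util = open_utilization([purchase, browsing])
stable = all(u < 1.0 for u in util.values())
print(util, "stable:", stable)
```

The closed class is described by a population rather than an arrival rate, which is why it needs a separate type and takes no part in this check.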

The three-class model is solved using the techniques presented in this chapter, and the response times are obtained. Based on the predicted response times, the data center manager will know whether the currently installed capacity is sufficient for the contract. If it is not, the data center will have to increase its capacity to meet the new performance objectives before agreeing to the SLA contract. In this case, management will incur new hardware acquisition costs, which will have to be prorated against the revenue generated by the new transactions.
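The proration can be roughed out with simple arithmetic. In the sketch below, the per-execution prices are those of the proposed SLA (15 dollars, 50 cents, and 10 cents), while the daily volumes and the hardware cost are hypothetical. It treats all revenue as available to pay for the hardware, ignoring operating costs and any SLA penalties, so it gives an optimistic lower bound on the payback period.

```python
# Prices per execution, taken from the proposed SLA (dollars).
PRICES = {"risk": 15.00, "purchase": 0.50, "browsing": 0.10}


def daily_revenue(daily_volumes):
    """Revenue per day given class -> executions per day."""
    return sum(PRICES[name] * count for name, count in daily_volumes.items())


def payback_days(hardware_cost, daily_volumes):
    """Days of revenue needed to cover a one-time hardware cost."""
    return hardware_cost / daily_revenue(daily_volumes)


# Hypothetical volumes and upgrade cost.
volumes = {"risk": 8, "purchase": 10_000, "browsing": 50_000}
revenue = daily_revenue(volumes)
days = payback_days(50_600.0, volumes)

print("daily revenue:", revenue)
print("payback days :", days)
```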