Goal, Theme, and Approach | Performance by Design: Computer Capacity Planning By Example

The idea of writing this book originated from our observation that some of the most fundamental concepts and methods related to performance modeling and analysis are largely unknown to most Information Technology (IT) practitioners. As a result, many IT systems are designed and built without sufficient consideration given to their non-functional requirements such as performance, availability, and reliability. More often than not, performance is an afterthought. Performance testing is done when a system is nearing completion. At that late stage, it may be necessary to conduct a major redesign to correct any performance problems that arise. This approach is inefficient, expensive, time-consuming, and professionally irresponsible.

A major goal of this book is to give those who design and build computer systems a performance engineering framework that can be applied throughout a computer system's life cycle. This framework is quantitative, rigorous, and based on the theory of queuing networks. Some of our readers may not be interested in the details behind the models. For that reason, we divided the book into two parts: the practice of performance engineering (Part I) and the theory of performance engineering (Part II).

Part I presents many examples and case studies supported by algorithms implemented in Visual Basic modules attached to the various MS Excel workbooks provided with the book. Five complete case studies are inspired by real-world problems: a database service sizing, a Web service capacity planning situation, a data center's cost and availability analysis, the sizing of an e-business service, and the performance engineering of a future help desk application. After reading Part I, the reader should be able to 1) identify the sources of potential performance problems of a computer system and 2) build and solve performance models to answer what-if questions regarding competing hardware and software alternatives. The models that you build can be solved with the tools provided with the book.

Part II presents the theory of performance engineering in an intuitive, example-driven manner. Before the algorithms and methods are formalized, the basic ideas are derived from first principles applied to examples. The readers of Part II are exposed to the most important techniques for solving 1) Markov models, 2) open and closed multiclass queuing networks using exact and approximate methods, and 3) non-product form queuing networks that represent software contention, blocking, high service time variability, priority scheduling, and fork and join systems.

Throughout Part I, references to specific techniques and methods of Part II tie the two parts of this text together.

Who Should Read This Book

Information technology professionals must ensure that the systems under their management provide an acceptable quality of service to their users. Managers must avoid the pitfalls of inadequate capacity and meet users' performance expectations in a cost-effective manner. Performance engineers, system administrators, software engineers, network administrators, capacity planners and analysts, managers, consultants, and other IT professionals will benefit from reading parts of the book or all of it. Its practical, yet sound and formal, approach provides the basis for understanding modern and complex networked environments.

This book can also be used as a textbook for senior undergraduate and graduate courses in Computer Science and Computer Engineering. Exercises are provided at the end of all fifteen chapters. At the undergraduate level, the book is a good starting point to motivate students to learn the important implications of, and solutions to, performance problems. An undergraduate course would concentrate on the first part of the book, i.e., the practice of performance engineering. At the graduate level, it can be used in System Performance Evaluation courses. This book offers a theoretical and practical foundation in performance modeling. The book can also be used as a supplement for systems courses, including Operating Systems, Distributed Systems, and Networking, both at the undergraduate and graduate levels.

Book Organization

Part I: The Practice of Performance Engineering

Chapter 1 introduces several properties and metrics used to assess the quality of IT systems. Such metrics include response time, throughput, availability, reliability, security, scalability, and extensibility. The chapter also discusses the various phases of the life cycle of a computer system and shows the importance of addressing QoS issues early on in the design stage as opposed to after the system is deployed.

Chapter 2 presents the qualitative aspects of the performance engineering framework used in this book. The framework is based on queuing networks. The chapter uses examples of practical systems to introduce the various aspects of such queuing networks.

Chapter 3 focuses on the quantitative aspects of the queuing network framework and introduces the input parameters and performance metrics that can be obtained from these models. The notions of service times, arrival rates, service demands, utilization, queue lengths, response time, throughput, and waiting time are discussed. The chapter also introduces Operational Analysis, a set of basic quantitative relationships between performance quantities.
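
As a rough illustration of the kind of relationship Operational Analysis deals with, the Python sketch below encodes three widely known operational laws: the Utilization Law, the Service Demand Law, and Little's Law. The function names and the measurement values are hypothetical; they are not taken from the book or its Excel workbooks.

```python
def utilization(device_throughput, service_time):
    """Utilization Law: U = X * S (fraction of time the device is busy)."""
    return device_throughput * service_time


def service_demand(device_utilization, system_throughput):
    """Service Demand Law: D = U / X0 (device time consumed per transaction)."""
    return device_utilization / system_throughput


def littles_law(throughput, response_time):
    """Little's Law: N = X * R (average number of requests in the system)."""
    return throughput * response_time


if __name__ == "__main__":
    # Hypothetical measurements for one disk over a monitoring interval.
    disk_throughput = 40.0      # I/O operations per second
    disk_service_time = 0.0125  # seconds per I/O operation
    system_throughput = 5.0     # transactions per second
    response_time = 2.0         # seconds per transaction

    u_disk = utilization(disk_throughput, disk_service_time)
    d_disk = service_demand(u_disk, system_throughput)
    n_in_system = littles_law(system_throughput, response_time)

    print(f"disk utilization      : {u_disk:.3f}")
    print(f"disk service demand   : {d_disk:.3f} sec/transaction")
    print(f"transactions in system: {n_in_system:.1f}")
```

Each law needs only quantities that can be measured directly on a running system, which is why relationships of this kind are a convenient starting point for parameterizing a model.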

Chapter 4 presents a practical performance engineering methodology and describes its steps: specification of the system's performance goals, understanding the current environment, characterization of the workload, development of a performance model, validation of the performance and workload models, workload forecasting, and cost-performance analysis of competing alternatives.

Chapter 5 uses a complete case study of a database service sizing to introduce the issue of obtaining input parameters to performance models from measurement data. The chapter also discusses various types of software monitors and their use in the data collection process.

Chapter 6 uses a Web server capacity planning case study to introduce several important concepts in performance engineering, including the determination of confidence intervals, the computation of service demands from the results of experiments, the use of linear regression, and comparison of alternatives through analytic modeling and through experimentation.
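
To make two of these concepts concrete, here is a minimal Python sketch of a confidence interval for a sample mean and of a least-squares line fit. It is an illustration under stated assumptions, not the procedure used in the case study: the interval uses the normal approximation (small samples would normally call for Student's t distribution), and the function names and sample data are hypothetical.

```python
import math
import statistics


def mean_confidence_interval(samples, confidence=0.95):
    """Confidence interval for the mean, using the normal approximation."""
    n = len(samples)
    mean = statistics.mean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * std_err
    return mean - half_width, mean + half_width


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


if __name__ == "__main__":
    # Hypothetical response times (seconds) from repeated experiments.
    response_times = [0.42, 0.39, 0.47, 0.44, 0.40, 0.45, 0.43, 0.41]
    low, high = mean_confidence_interval(response_times)
    print(f"95% CI for mean response time: [{low:.3f}, {high:.3f}] sec")

    # Hypothetical data: CPU time (msec) versus requested file size (KB).
    sizes = [10, 20, 40, 80]
    cpu_ms = [1.2, 1.9, 3.1, 5.8]
    slope, intercept = fit_line(sizes, cpu_ms)
    print(f"CPU time ~= {slope:.4f} * size + {intercept:.4f} msec")
```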

Chapter 7 applies performance modeling techniques to address the issue of properly sizing a data center. This sizing is part of the system design process and focuses on the number of machines and the number and skill level of maintenance personnel to achieve desired levels of availability.

Chapter 8 shows, through an online auction site case study, how performance models are used to analyze the scalability of multi-tiered e-business services. The workload of these services is characterized at the user level. User models are used to characterize the way customers navigate through the various e-business functions during a typical visit to an e-commerce site. This user-level characterization is mapped to a request-level characterization used by queuing network models. The models are used for capacity planning and performance prediction of various what-if scenarios.

Chapter 9 discusses Software Performance Engineering through a complete case study of the development and sizing of a new help desk application. The chapter explains how parameters for performance models can be estimated at various stages of a system's life cycle. The solutions of these models provide valuable feedback to developers at the early stages of a new application under development.

Part II: The Theory of Performance Engineering

Chapter 10 presents a basic, practical, and working knowledge of Markov models through easy-to-understand examples. Then, the general solution to birth-death Markov chains is presented.
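
For readers who prefer code to formulas, the following Python sketch computes the steady-state probabilities of a finite birth-death chain using the standard product-form result, in which the probability of state k is proportional to the product of the birth-to-death rate ratios along the path from state 0 to state k. The function name, argument layout, and example rates are hypothetical.

```python
def birth_death_steady_state(birth_rates, death_rates):
    """Steady-state probabilities of a finite birth-death Markov chain.

    birth_rates[i] is the rate from state i to state i + 1.
    death_rates[i] is the rate from state i + 1 to state i.
    Returns a list p where p[k] is the probability of state k.
    """
    if len(birth_rates) != len(death_rates):
        raise ValueError("need one death rate for every birth rate")

    # Unnormalized weight of state k: product of birth/death ratios up to k.
    weights = [1.0]
    for lam, mu in zip(birth_rates, death_rates):
        weights.append(weights[-1] * lam / mu)

    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical three-state chain: arrivals at rate 1, service at rate 2.
    probs = birth_death_steady_state([1.0, 1.0], [2.0, 2.0])
    for state, p in enumerate(probs):
        print(f"P[state {state}] = {p:.4f}")
```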

Chapter 11 discusses the most important results for single queuing station systems. The results presented include M/M/1, M/G/1, M/G/1 with server vacation, M/G/1 with non-preemptive priority, M/G/1 with preemptive resume priority, G/G/1 approximation, and G/G/c approximation. Computer and network related examples illustrate the use of the results.
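
Two of the simplest results in this family can be sketched in a few lines of Python: the M/M/1 average response time and the M/G/1 average response time obtained from the Pollaczek-Khinchine formula. These are the textbook formulas, written here with hypothetical function names and example values rather than the book's own notation.

```python
def mm1_response_time(arrival_rate, service_time):
    """Average response time of an M/M/1 queue: R = S / (1 - rho)."""
    rho = arrival_rate * service_time
    if rho >= 1.0:
        raise ValueError("queue is unstable: utilization must be below 1")
    return service_time / (1.0 - rho)


def mg1_response_time(arrival_rate, service_time, cv_service):
    """Average response time of an M/G/1 queue (Pollaczek-Khinchine).

    cv_service is the coefficient of variation of the service time
    (1 for exponential service, 0 for deterministic service).
    """
    rho = arrival_rate * service_time
    if rho >= 1.0:
        raise ValueError("queue is unstable: utilization must be below 1")
    waiting = rho * service_time * (1.0 + cv_service ** 2) / (2.0 * (1.0 - rho))
    return service_time + waiting


if __name__ == "__main__":
    # Hypothetical server: 8 requests/sec, 0.1 sec average service time.
    print(f"M/M/1 response time: {mm1_response_time(8.0, 0.1):.3f} sec")
    print(f"M/D/1 response time: {mg1_response_time(8.0, 0.1, 0.0):.3f} sec")
```

With a coefficient of variation of 1 the M/G/1 expression reduces to the M/M/1 result, which is a quick sanity check on the formula.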

Chapter 12 reconstructs single class Mean Value Analysis (MVA) from first principles and illustrates the use of the technique through a detailed example. The special case of balanced systems is discussed.
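
The core recursion of exact single-class MVA for a closed network of load-independent queuing devices is short enough to sketch in Python. This is the standard algorithm as commonly stated in the literature, not a transcription of the book's Visual Basic code; the function name, parameters, and example values are hypothetical.

```python
def single_class_mva(demands, n_customers, think_time=0.0):
    """Exact MVA for a closed, single-class network of queuing devices.

    demands[i] is the service demand at device i (seconds per transaction).
    Returns (throughput, response_time, queue_lengths) at n_customers.
    """
    if n_customers < 1:
        raise ValueError("n_customers must be at least 1")

    queue_lengths = [0.0] * len(demands)  # empty network with 0 customers
    for n in range(1, n_customers + 1):
        # An arriving customer sees the queue of a network with n - 1 customers.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue_lengths)]
        response_time = sum(residence)
        throughput = n / (think_time + response_time)
        # Little's Law applied to each device.
        queue_lengths = [throughput * r for r in residence]
    return throughput, response_time, queue_lengths


if __name__ == "__main__":
    # Hypothetical balanced system: two devices with equal service demands.
    x, r, q = single_class_mva([1.0, 1.0], n_customers=3)
    print(f"throughput    = {x:.3f} transactions/sec")
    print(f"response time = {r:.3f} sec")
    print(f"queue lengths = {[round(v, 3) for v in q]}")
```

For a balanced system with K devices of equal demand D and no think time, the recursion yields a throughput of N / (D (N + K - 1)), which gives a convenient closed-form check for the code.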

Chapter 13 generalizes the results of Chapter 12 to the case of multiclass queuing networks. Results and algorithms for open, closed, and mixed queuing networks are presented. Both exact and approximate techniques are discussed.

Chapter 14 extends the results of Chapter 13 to the case of load-dependent devices. These extended models address situations where a device's service rate varies with the queue size.

Chapter 15 discusses approximations to the Mean Value Analysis technique to deal with blocking, high variability of service times, priority scheduling, software contention, and fork and join.

Acknowledgments

Daniel Menascé would like to thank his students and colleagues at George Mason University for providing a stimulating work environment. He would also like to thank his mother and late father for their love and guidance in life. Special recognition goes to his wife, Gilda, a very special person, whose love and companionship make all the difference in the world. His children Flavio and Juliana have tremendously enriched his life from the moment they were born.

Virgilio Almeida would like to thank his colleagues and students at UFMG. He would also like to thank CNPq (the Brazilian Council for Scientific Research and Development), which provided partial support for his research work. Virgilio would also like to express his gratitude to his family, parents (in memoriam), brothers, and many relatives and friends. His wife Rejane and sons Pedro and André have always been a source of continuous encouragement and inspiration.

Larry Dowdy would like to gratefully thank the students and faculty at the University of Leeds, England, particularly David Dubas-Fisher, Jonathan Galliano, Michael Harwood, and Gurpreet Sohal for their feedback, support, and insights. In addition, the Vanderbilt students in CS284 provided many constructive and insightful comments.

From the Same Authors

Capacity Planning for Web Services: Metrics, Models, and Methods, D. A. Menascé and V. A. F. Almeida, Prentice Hall, Upper Saddle River, NJ, 2002.
Scaling for E-Business: Technologies, Models, Performance, and Capacity Planning, D. A. Menascé and V. A. F. Almeida, Prentice Hall, Upper Saddle River, NJ, 2000.
Capacity Planning for Web Performance: Metrics, Models, and Methods, D. A. Menascé and V. A. F. Almeida, Prentice Hall, Upper Saddle River, NJ, 1998.
Capacity Planning and Performance Modeling: From Mainframes to Client-Server Systems, D. A. Menascé, V. A. F. Almeida, L. W. Dowdy, Prentice Hall, Upper Saddle River, NJ, 1994.

Book's Web Site and Authors' Addresses

The Web site at www.cs.gmu.edu/~menasce/perfbyd/ will be used to keep the readers informed about new developments related to the book and to store the various Excel workbooks described in the book. Some of the Excel workbooks are password protected. The password is 2004.

The authors' e-mail, postal addresses, and Web sites are:

Professor Daniel A. Menascé
Department of Computer Science, MS 4A5
George Mason University
Fairfax, VA 22030-4444
United States
(703) 993-1537
menasce@cs.gmu.edu
www.cs.gmu.edu/faculty/menasce.html

Professor Virgilio A. F. Almeida
Department of Computer Science
Universidade Federal de Minas Gerais
P.O. Box 920
31270-010
Belo Horizonte, MG
Brazil
+55 31 3499-5887
virgilio@dcc.ufmg.br
www.dcc.ufmg.br/~virgilio

Professor Larry W. Dowdy
Department of Electrical Engineering and Computer Science
Vanderbilt University
Station B, Box 1679
Nashville, TN 37235
(615) 322-3031
larry.dowdy@vanderbilt.edu
www.vuse.vanderbilt.edu/~dowdy/persinfo.html

We hope that you will enjoy reading this book as much as we enjoyed writing it!

Daniel Menascé,
Virgilio Almeida, and
Larry Dowdy