2.1 Introduction | Performance by Design: Computer Capacity Planning By Example

Performance and scalability are much easier to guarantee if they are taken into account at system design time. Treating performance as an afterthought (i.e., as something that can be tested for compliance after a system has been developed) usually leads to frustration. In this chapter, we begin to develop a framework that computer system designers can use to think about performance at design time. This framework is based on the observation that computer systems, including software systems, are composed of a collection of resources (e.g., processors, disks, communication links, process threads, critical sections, database locks) that are shared by various requests (e.g., transactions, Web requests, batch processes). Usually, several requests run concurrently, and many of them may want to access the same resource at the same time. Since resources have a finite capacity to perform work (e.g., a CPU can only execute a finite number of instructions per second, a disk can only transfer a certain number of bytes per second, and a communication link can only transmit a certain number of bits per second), waiting lines often build up in front of these resources, in the same way that queues form at bank tellers or supermarket cashiers.

Thus, the framework used in this book is based on queuing models of computer systems. These models view a computer system as a collection of interconnected queues, that is, a network of queues. This chapter describes the qualitative aspects of the framework, while the following chapters introduce more quantitative characteristics. A general discussion of modeling is presented before queuing models are introduced.
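
To make the idea of a waiting line concrete, the following minimal Python sketch shows how a queue builds up at a single resource of finite capacity. It is an illustration only and is not taken from the book: the function name simulate_fifo, the fixed service time, and the evenly spaced arrivals are all simplifying assumptions chosen for clarity.

```python
def simulate_fifo(arrival_times, service_time):
    """Return the waiting time of each request at one first-come,
    first-served resource.

    Hypothetical helper for illustration: every request is assumed to
    need the same fixed service_time, and arrival_times must be sorted.
    """
    waits = []
    free_at = 0.0  # time at which the resource next becomes idle
    for arrival in arrival_times:
        start = max(arrival, free_at)  # wait only if the resource is busy
        waits.append(start - arrival)
        free_at = start + service_time
    return waits


# Assumed workload: one request arrives every 1.0 time units.
arrivals = [float(i) for i in range(5)]

# The resource needs 0.5 time units per request: it keeps up, nobody waits.
light = simulate_fifo(arrivals, service_time=0.5)

# The resource needs 1.5 time units per request: it falls behind,
# and each new request waits longer than the one before it.
heavy = simulate_fifo(arrivals, service_time=1.5)

print(light)  # [0.0, 0.0, 0.0, 0.0, 0.0]
print(heavy)  # [0.0, 0.5, 1.0, 1.5, 2.0]
```

When requests arrive faster than the resource can serve them, the waiting time grows with every arrival. Real workloads have variable arrival and service times, so waiting lines can form even when the resource is, on average, fast enough.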