Addressing performance problems at the end of system development is a common industrial practice that can lead to using more expensive hardware than originally specified, time-consuming performance-tuning procedures, and, in some extreme cases, a complete system redesign [3]. It is therefore important to consider performance as an integral part of a computer system's life cycle and not as an afterthought. The methods used to assure that QoS requirements are met, once a system is developed, are part of the discipline called Performance Engineering (PE) [16]. This section discusses the seven phases of the life cycle of any IT system: requirements analysis and specification, design, development, testing, deployment, operation, and evolution, as illustrated in Fig. 1.6. The inputs and outputs of each phase are discussed, the tasks involved in each phase are described, and QoS issues associated with each phase are addressed.

Figure 1.6. System life cycle.
1.3.1 Requirements Analysis and Specification

During this phase of the life cycle of a computer system, the analysts, in conjunction with users, gather information about what they want the system to do. The result of this analysis is a requirements specification document that is divided into two main parts:
1.3.2 System Design

System design is the stage in which the question "How will the requirements be met?" is answered. In this phase, the system architecture is designed, the system is broken down into components, major data structures, including files and databases, are designed, algorithms are selected and/or designed, and pseudocode for the major system components is written. It is also during this phase that the interfaces between the various components are specified. These interfaces may be of different types, including local procedure calls, Remote Procedure Calls (RPC), and message exchanges of various types. The current trend in software engineering is to reuse as many proven software solutions as possible. While this approach is very attractive from the point of view of shortening the duration of the design and development phases, it may pose risks in terms of performance. Designs that perform well in one type of environment and under a certain type of workload may perform very poorly in other settings. For example, a search engine used in a low-volume online retailer may perform very poorly when used in an e-commerce site that receives millions of requests per day. As the workload intensity scales up, different techniques, different algorithms, and different designs may have to be adopted to satisfy the non-functional requirements. A key recommendation is that special care be given to the non-functional requirements at the design stage, since decisions made at this stage are more likely to have a strong impact on system performance, availability, reliability, and security. Moreover, problems caused by poor decisions made at this stage are much more expensive and time consuming to correct than those generated by decisions made at the later stages of the development life cycle.
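The risk of reusing a design beyond the workload it was built for can be made concrete with a small sketch. The catalog, its size, and the two lookup strategies below are hypothetical, chosen only to illustrate the point: a linear scan that is perfectly adequate for a low-volume retailer degrades with catalog size, while an indexed (hash-based) design stays near-constant.

```python
import time

def build_catalog(n):
    """Hypothetical product catalog: ids mapped to titles."""
    return {i: f"product-{i}" for i in range(n)}

def linear_lookup(items, key):
    # Design reused from a low-volume setting: scan every record.
    for k, v in items:
        if k == key:
            return v
    return None

def indexed_lookup(index, key):
    # Design chosen with the scaled-up workload in mind: hash index.
    return index.get(key)

catalog = build_catalog(100_000)
as_list = list(catalog.items())

# Both designs meet the functional requirement (same answer)...
assert linear_lookup(as_list, 99_999) == indexed_lookup(catalog, 99_999)

# ...but the scan's cost per request grows with catalog size.
t0 = time.perf_counter()
for _ in range(100):
    linear_lookup(as_list, 99_999)
linear_time = time.perf_counter() - t0

t0 = time.perf_counter()
for _ in range(100):
    indexed_lookup(catalog, 99_999)
indexed_time = time.perf_counter() - t0

print(f"linear: {linear_time:.4f}s  indexed: {indexed_time:.4f}s")
```

Both functions satisfy the functional requirement; only a measurement under a realistic workload intensity reveals that one of them violates the non-functional ones.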
It is also common at the design stage to make decisions related to the adoption of third-party components such as messaging middleware, search engines, directory services, and transaction processing software. Again, it is important to evaluate the performance impact of each of the third-party solutions on overall system performance. Credible performance evaluation is a non-trivial task, one that is addressed by techniques in this text.

1.3.3 System Development

During this phase, the various components of the system are implemented. Some may be completely new creations, others may be adapted from existing similar components, and others may just be reused without modification from other system implementations. Components are then interconnected to form the system. Just as there are many possible ways to design a system that meets the requirements, there are also many different implementation decisions, left open at the design stage, that can significantly affect performance. For example, it may be left to the development phase to decide how a particular search to a database will be implemented. The developer must not only make sure that the query returns the correct answer but also that its performance will be acceptable when the query is submitted to a production database with potentially millions of records as opposed to a small test database. As components are developed, they should be instrumented to facilitate data gathering for the testing phase and for the QoS monitoring that takes place during system operation. It should be easy to selectively turn on and off the instrumentation code of components to avoid unnecessary overhead generated by data collection.

1.3.4 System Testing

System testing usually occurs concurrently with system development. As components become available, they can be tested in isolation. This is called unit testing.
Then, tested components are put together into subsystems which are further tested until the entire system meets its specification requirements. It is common for a significant amount of effort to be invested in testing the functional requirements while not enough resources are devoted to the testing of the non-functional requirements such as performance, scalability, availability, and security. When performance is tested before deployment, the usual approach is to conduct load testing [10, 12]. In this case, scripts of typical transactions are constructed and executed on the system while its performance is measured. These scripts can simulate an increasing number of users, called virtual users. While testing is an important part of a computer system life cycle, it is not possible to anticipate or test all possible scenarios because of time and budget constraints. Therefore, virtually every moderate to complex system is deployed without being fully tested for both functional and non-functional requirements. To reduce the chance that flaws go unnoticed, one must use design and development techniques that attempt to build correct, reliable, secure, and well-performing systems from the ground up. The techniques and methods described in this book provide system designers with the proper mindset needed to incorporate performance into the design. In the remaining chapters of this book we provide a framework that can be used by system designers and developers to understand the performance implications and consequences of their design and implementation decisions. The issue of how to build secure systems from the early stages is still an open problem and is discussed in [4].

1.3.5 System Deployment

After a system has been tested, usually in a controlled environment, it is deployed for use.
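The load-testing approach described above can be sketched in a few lines. The numbers of virtual users and requests, and the `transaction` stub, are hypothetical; a real load-testing tool would replace the stub with scripted requests against the system under test:

```python
import random
import statistics
import threading
import time

def transaction():
    """Stub for one scripted transaction; replace with a real request."""
    time.sleep(random.uniform(0.001, 0.005))

def virtual_user(n_requests, samples, lock):
    # Each virtual user executes the transaction script repeatedly,
    # measuring the response time of every request.
    for _ in range(n_requests):
        start = time.perf_counter()
        transaction()
        elapsed = time.perf_counter() - start
        with lock:
            samples.append(elapsed)

samples, lock = [], threading.Lock()
users = [threading.Thread(target=virtual_user, args=(10, samples, lock))
         for _ in range(5)]          # 5 concurrent virtual users
for u in users:
    u.start()
for u in users:
    u.join()

mean_ms = statistics.mean(samples) * 1000
p95_ms = sorted(samples)[int(0.95 * len(samples))] * 1000
print(f"{len(samples)} requests, mean={mean_ms:.2f} ms, p95={p95_ms:.2f} ms")
```

Ramping up the number of virtual users while watching the mean and tail response times is precisely how load tests expose the workload intensity at which the non-functional requirements break.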
During system deployment, many configuration parameters (e.g., maximum number of TCP connections, maximum number of threads, timeout periods, database connection pool size) have to be set for optimal performance. The models described in this book can be used to predict the performance of a computer system under different configuration scenarios so that a proper set of values can be selected for the workload conditions expected to be seen in the field.

1.3.6 System Operation

A system in operation has to be constantly monitored to check if the QoS requirements are being met. Examples of features that should be monitored include:
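As a minimal illustration of predicting performance under different configuration scenarios, the sketch below uses the standard M/M/1 queuing result R = S / (1 - U), with utilization U = λ·S (a simple instance of the kinds of models developed later in the book). The arrival rate and service times are hypothetical:

```python
def mm1_response_time(arrival_rate, service_time):
    """M/M/1 average response time: R = S / (1 - U), where U = lambda * S."""
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        raise ValueError("unstable configuration: utilization must be below 1")
    return service_time / (1.0 - utilization)

# Compare two hypothetical configurations of the same server at 40 req/s:
# a tuning change cuts the average service time from 20 ms to 15 ms.
for service_time in (0.020, 0.015):
    r = mm1_response_time(arrival_rate=40.0, service_time=service_time)
    print(f"S = {service_time * 1000:.0f} ms -> R = {r * 1000:.1f} ms")
```

A 25% reduction in service time lowers utilization from 0.8 to 0.6 and cuts the predicted response time from 100 ms to 37.5 ms, showing why such models help select configuration values before the workload arrives in the field.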
During system operation, it may be necessary to change the values of the various configuration parameters to adapt to the evolving nature of the system workload so that the QoS requirements are continuously met. Methods to dynamically control the QoS of complex networked computer systems have been described in [2, 13].

1.3.7 System Evolution

Most IT systems need to evolve after they have been in operation for some time due to many different factors, which may include environmental changes or the need to satisfy new user requirements. For example, new laws and regulations may be enacted requiring existing systems to evolve in order to be compliant with them. For instance, the U.S. Health Insurance Portability and Accountability Act (HIPAA) of 1996 triggered many changes in IT systems that support the health care industry. Another example of evolution would be for an e-commerce site to provide access to wireless devices. System evolution may interfere in non-trivial ways with existing functionality. For instance, an online bookstore may decide to sell CDs and DVDs. The additional workload of requests for CDs and DVDs will share the IT infrastructure that supports the book-selling services. An important question to answer is whether the existing resources will be able to support the old and new workloads while still meeting the QoS requirements for both of them. Predictive models of computer performance are needed to answer these types of questions. This book discusses the use of such predictive models in Part I and the models themselves in Part II.
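A first-cut answer to the question of whether existing resources can absorb a new workload comes from the Utilization Law: a resource's utilization is the sum, over all workload classes, of each class's arrival rate times its service demand at that resource, and the resource remains feasible only while that sum stays below 1. The arrival rates and service demands below are hypothetical numbers for the bookstore example:

```python
def utilization(arrival_rates, service_demands):
    """Utilization Law: U = sum over classes r of lambda_r * D_r."""
    return sum(lam * d for lam, d in zip(arrival_rates, service_demands))

# Hypothetical figures for a shared database server:
# books workload at 5 tps, new CD/DVD workload at 3 tps.
book_demand, cd_demand = 0.08, 0.10   # seconds of service per transaction

u_before = utilization([5.0], [book_demand])
u_after = utilization([5.0, 3.0], [book_demand, cd_demand])

print(f"utilization before: {u_before:.2f}, after: {u_after:.2f}")
# The server stays feasible only while U < 1; here utilization rises
# from 0.40 to 0.70, so it can absorb the new workload, though queues
# (and hence response times) for both workloads will grow.
```

This feasibility check says nothing yet about whether response-time requirements are still met at the higher utilization; answering that requires the multiclass queuing models developed in Part II.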