1.3 System Life Cycle

Addressing performance problems at the end of system development is a common industrial practice that can lead to using more expensive hardware than originally specified, time-consuming performance-tuning procedures, and, in some extreme cases, a complete system redesign [3]. It is therefore important to consider performance as an integral part of a computer system's life cycle and not as an afterthought. The methods used to ensure that QoS requirements are met once a system is developed are part of the discipline called Performance Engineering (PE) [16].

This section discusses the seven phases of the life cycle of any IT system: requirements analysis and specification, design, development, testing, deployment, operation, and evolution, as illustrated in Fig. 1.6. The inputs and outputs of each phase are discussed, the tasks involved in each phase are described, and QoS issues associated with each phase are addressed.

Figure 1.6. System life cycle.


1.3.1 Requirements Analysis and Specification

During this phase of the life cycle of a computer system, the analysts, in conjunction with users, gather information about what the users want the system to do. The result of this analysis is a requirements specification document that is divided into two main parts:

  • Functional requirements: The functional requirements specify the set of functions the system must provide with the corresponding inputs and outputs as well as the interaction patterns between the system and the outside world (users). For example, the functional requirements of an online bookstore could indicate that the site must provide a search function that allows users to search for books based on keywords, ISBN, title, and authors. The specification indicates how the results of a search are displayed back to the user. The functional requirements usually include information about the physical environment and technology to be used to design and implement the system. In the same example, the specification could say that the online bookstore site should use Web servers based on UNIX and Apache and that it should also provide access to wireless users using the Wireless Application Protocol (WAP) [19].

  • Non-functional requirements: The non-functional requirements deal mainly with the QoS requirements expected from the system. Issues such as performance, availability, reliability, and security are specified as part of the non-functional requirements. A qualitative and quantitative characterization of the workload must be given so that the QoS requirements can be specified for specific workload types and levels. For example, a non-functional requirement could specify that "at peak periods, the online bookstore is expected to receive 50 search requests/sec and respond within 2 seconds to 95% of the requests."
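
A requirement stated this way can be checked directly against measurements. The following is a minimal sketch, assuming that response times (in seconds) have been collected over a known observation window during a peak period. The function names and the choice of the nearest-rank percentile method are illustrative assumptions, not something prescribed by the text.

```python
def percentile(samples, p):
    """Return the p-th percentile (integer p, 1..100) by the nearest-rank method."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Nearest rank, computed with integer arithmetic to avoid rounding surprises.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def check_requirement(response_times, window_seconds,
                      peak_rate=50.0, limit_seconds=2.0, pct=95):
    """Compare measurements with the example requirement from the text.

    Defaults mirror the example: 50 requests/sec at peak, and 95% of
    the requests answered within 2 seconds.
    """
    rate = len(response_times) / window_seconds
    observed = percentile(response_times, pct)
    return {
        "arrival_rate": rate,                      # requests per second observed
        "at_peak_level": rate >= peak_rate,        # was the load at the stated peak?
        "percentile_value": observed,              # e.g., 95th-percentile response time
        "meets_limit": observed <= limit_seconds,  # QoS requirement satisfied?
    }
```

Reporting the observed arrival rate alongside the percentile matters because the requirement only applies at a stated workload level: a system that responds quickly at 5 requests/sec has not been shown to meet the requirement.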

1.3.2 System Design

System design is the stage in which the question "How will the requirements be met?" is answered. In this phase, the system architecture is designed, the system is broken down into components, major data structures, including files and databases, are designed, algorithms are selected and/or designed, and pseudo code for the major system components is written. It is also during this phase that the interfaces between the various components are specified. These interfaces may be of different types, including local procedure calls, Remote Procedure Calls (RPC), and message exchanges of various types.

The current trend in software engineering is to reuse as many proven software solutions as possible. While this approach is very attractive from the point of view of shortening the duration of the design and development phases, it may pose risks in terms of performance. Designs that perform well in one type of environment and under a certain type of workload may perform very poorly in other settings. For example, a search engine used in a low volume online retailer may perform very poorly when used in an e-commerce site that receives millions of requests per day. As the workload intensity scales up, different techniques, different algorithms, and different designs may have to be adopted to satisfy the non-functional requirements.

A key recommendation is that special care be given to the non-functional requirements at the design stage, since decisions made at this stage are more likely to have a strong impact on system performance, availability, reliability, and security. Moreover, problems caused by poor decisions made at this stage are much more expensive and time-consuming to correct than those generated by decisions made at the later stages of the development life cycle.

It is also common at the design stage to make decisions related to the adoption of third-party components such as messaging middleware, search engines, directory services, and transaction processing software. Again, it is important to evaluate the performance impact of each of the third-party solutions on overall system performance. Credible performance evaluation is a non-trivial task, one that is addressed by techniques in this text.

1.3.3 System Development

During this phase, the various components of the system are implemented. Some may be completely new creations, others may be adapted from existing similar components, and others may just be reused without modification from other system implementations. Components are then interconnected to form the system. As there are many possible ways to design a system that meets the requirements, there are also many different implementation decisions, left open at the design stage, that can significantly affect performance. For example, it may be left to the development phase to decide how a particular search to a database will be implemented. The developer must not only make sure that the query returns the correct answer but also that its performance will be acceptable when the query is submitted to a production database with potentially millions of records as opposed to a small test database.

As components are developed, they should be instrumented to facilitate data gathering for the testing phase and for the QoS monitoring that takes place during system operation. It should be easy to selectively turn on and off the instrumentation code of components to avoid unnecessary overhead generated by data collection.

1.3.4 System Testing

System testing usually occurs concurrently with system development. As components become available, they can be tested in isolation. This is called unit testing. Then, tested components are put together into subsystems which are further tested until the entire system meets its specification requirements.

It is common for a significant amount of effort to be invested in testing the functional requirements while not enough resources are devoted to the testing of the non-functional requirements such as performance, scalability, availability, and security. When performance is tested before deployment, the usual approach is to conduct load testing [10, 12]. In this case, scripts of typical transactions are constructed and executed on the system while its performance is measured. These scripts can simulate an increasing number of users, called virtual users.
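
The idea of scripted virtual users can be sketched with threads, as below. This is a simplified illustration rather than a substitute for a load-testing tool: `run_load_test` is a hypothetical name, and `transaction` stands for any callable that executes one typical transaction against the system under test.

```python
import threading
import time


def run_load_test(transaction, virtual_users, requests_per_user):
    """Execute `transaction` from concurrent virtual users and summarize the results."""
    response_times = []
    lock = threading.Lock()

    def user_script():
        # Each virtual user submits its transactions one after another.
        for _ in range(requests_per_user):
            start = time.perf_counter()
            transaction()
            elapsed = time.perf_counter() - start
            with lock:
                response_times.append(elapsed)

    threads = [threading.Thread(target=user_script) for _ in range(virtual_users)]
    wall_start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    wall = time.perf_counter() - wall_start

    completed = len(response_times)
    return {
        "completed": completed,
        "throughput": completed / wall if wall > 0 else 0.0,  # requests per second
        "mean_response_time": sum(response_times) / completed if completed else 0.0,
    }
```

Calling the function repeatedly with a growing `virtual_users` value (for example 1, 2, 4, 8) and comparing throughput and mean response time across runs shows how the system behaves as the load increases.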

While testing is an important part of a computer system life cycle, it is not possible to anticipate or test all possible scenarios because of time and budget constraints. Therefore, virtually every moderate to complex system is deployed without being fully tested for both functional and non-functional requirements. To reduce the chance that flaws go unnoticed, one must use design and development techniques that attempt to build correct, reliable, secure, and well-performing systems from the ground up. The techniques and methods described in this book provide system designers with the proper mindset needed to incorporate performance into the design.

In the remaining chapters of this book we provide a framework that can be used by system designers and developers to understand the performance implications and consequences of their design and implementation decisions. The issue of how to build secure systems from the early stages is still an open problem and is discussed in [4].

1.3.5 System Deployment

After a system has been tested, usually in a controlled environment, it is deployed for use. During system deployment, many configuration parameters (e.g., maximum number of TCP connections, maximum number of threads, timeout periods, database connection pool size) have to be set for optimal performance. The models described in this book can be used to predict the performance of a computer system under different configuration scenarios so that a proper set of values can be selected for the workload conditions expected to be seen in the field.
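
As an illustration of how a model can guide such a setting, the sketch below treats a pool of identical threads or connections as the servers of an M/M/m queue and uses the Erlang C formula to estimate mean response time. The assumptions of Poisson arrivals and exponentially distributed service times are introduced here for the sake of the example and are not stated in the text, and the function names are hypothetical.

```python
import math


def mmm_response_time(arrival_rate, service_rate, servers):
    """Mean response time (seconds) of an M/M/m queue via the Erlang C formula."""
    offered_load = arrival_rate / service_rate  # a = lambda / mu
    if offered_load >= servers:
        raise ValueError("unstable: offered load must be below the number of servers")
    # Probability that an arriving request has to wait for a free server.
    tail = (offered_load ** servers / math.factorial(servers)
            * servers / (servers - offered_load))
    head = sum(offered_load ** k / math.factorial(k) for k in range(servers))
    p_wait = tail / (head + tail)
    mean_wait = p_wait / (servers * service_rate - arrival_rate)
    return mean_wait + 1.0 / service_rate


def smallest_pool(arrival_rate, service_rate, target_seconds, max_servers=100):
    """Smallest pool size whose predicted mean response time meets the target."""
    if target_seconds <= 1.0 / service_rate:
        return None  # no pool size can beat the mean service time itself
    first = math.floor(arrival_rate / service_rate) + 1  # smallest stable size
    for m in range(first, max_servers + 1):
        if mmm_response_time(arrival_rate, service_rate, m) <= target_seconds:
            return m
    return None
```

For example, `smallest_pool(50.0, 10.0, 0.12)` asks for the smallest pool that keeps the predicted mean response time at or below 0.12 seconds when 50 requests/sec arrive and each request needs 0.1 seconds of service on average. Real configuration decisions would use models fitted to measured workload data.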

1.3.6 System Operation

A system in operation has to be constantly monitored to check if the QoS requirements are being met. Examples of features that should be monitored include:

  • Workload: determination of peak periods during which the system is subject to higher workload intensity levels, determination of the characteristics of the arrival process of requests (e.g., does the workload exhibit extreme bursts?), and detection of unusual patterns that could indicate security attacks such as Denial of Service (DoS) attacks. Part of the workload monitoring process includes a characterization of the global workload into "similar" types of requests. This is important since the performance of the system depends on the types of requests it receives.

  • External Performance Metrics: measurement of user-perceived satisfaction and statistics (e.g., mean, standard deviation, percentiles) relative to response time, throughput, and probability that requests are rejected. When establishing monitoring procedures, it is important to keep in mind that for some applications (e.g., Web-based applications), the response time perceived by a user depends not only on the system (the Web site in that case) but also on the user's geographical location, the bandwidth of the Internet connection, the time of day, and the performance characteristics of the local machine.

  • Internal Performance Metrics: identification of internal factors that aid in the diagnosis of performance failures and bottleneck detection. Examples include the utilization of processors, storage devices, and networks, and the number of requests waiting in the various software and hardware queues. The amount of information collected this way can easily become overwhelming. Care must be taken to efficiently and effectively organize, collect, and report such internal performance metrics. There are several monitoring and performance management tools that provide good filtering, visualization, and alarm-based reporting capabilities. Some of the tools use data-mining techniques to find useful correlations between internal and external metrics.

  • Availability: determination of the percentage of time that a system is available to service requests. This is usually done by external monitoring agents that send requests to a system at regular intervals to determine if the system is responsive. Availability determination may be done at various levels. Consider for example an online bookstore that has several Web servers and a load balancer that distributes incoming HTTP requests to the Web servers. The load balancer may periodically send "heart-beat" pings to each server to check its network connectivity. In addition to this, there may be software agents running at computers spread over several geographical regions that send search requests to the online bookstore at regular intervals. The latter type of monitoring is useful to check the availability of the service as a whole, including the entire site and the networking infrastructure that connects users to the site. It is important that such pings are infrequent enough so as not to interfere with the normal workload, but are frequent enough to provide accurate information in order for corrective action to be taken in a timely fashion.
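
The external-agent approach to availability can be sketched as follows. The names are hypothetical: `send_request` stands for whatever probe an agent issues (for example, a search request to the online bookstore) and is expected to return a true value when the system responds correctly.

```python
import time


def run_probes(send_request, count, interval_seconds, sleep=time.sleep):
    """Issue `count` probes at regular intervals; return one True/False per probe."""
    results = []
    for i in range(count):
        try:
            results.append(bool(send_request()))
        except Exception:
            # A probe that raises (timeout, connection error) counts as a failure.
            results.append(False)
        if i < count - 1:
            sleep(interval_seconds)
    return results


def availability(probe_results):
    """Fraction of probes that found the system responsive."""
    if not probe_results:
        raise ValueError("no probe results")
    return sum(1 for ok in probe_results if ok) / len(probe_results)
```

The `interval_seconds` parameter embodies the trade-off described above: a long interval keeps probe traffic from interfering with the normal workload, while a short one detects outages sooner. Note that this estimate is the fraction of successful probes, which approximates the percentage of time the system is available only when probes are evenly spaced.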

During system operation, it may be necessary to change the values of the various configuration parameters to adapt to the evolving nature of the system workload so that the QoS requirements are continuously met. Methods to dynamically control the QoS of complex networked computer systems have been described in [2, 13].

1.3.7 System Evolution

Most IT systems need to evolve after they have been in operation for some time due to many different factors that may include environmental changes or the need to satisfy new user requirements. For example, new laws and regulations may be enacted requiring existing systems to evolve in order to be compliant with them. For instance, the U.S. Health Insurance Portability and Accountability Act (HIPAA) of 1996 triggered many changes in IT systems that support the health care industry. Another example of evolution would be for an e-commerce site to provide access to wireless devices.

System evolution may interfere in non-trivial ways with existing functionality. For instance, an online bookstore may decide to sell CDs and DVDs. The additional workload of requests for CDs and DVDs will share the IT infrastructure that supports the book-selling services. An important question to answer is whether the existing resources will be able to support the old and new workloads while still meeting the QoS requirements for both of them. Predictive models of computer performance are needed to answer these types of questions. This book discusses the use of such predictive models in Part I and the models themselves in Part II.

Performance by Design. Computer Capacity Planning by Example
ISBN: 0130906735
Year: 2003
Pages: 166