11.4 Performance metrics

In studies involving computer systems we will typically be interested in many such measures, not simply one for the system as a whole. The computer systems we will model are composed of subsystems, components, and users. Each has its own measures, and each provides a different lens into the performance of the system as a whole. Some metrics will be systemwide or global, while others will be localized or individual. There are cases where optimizing an individual metric will affect the global metric and other times when it will have little or no effect. Also, the different levels may have different primary goals. We may be looking for high utilization at one level and low utilization at another, depending on the goals for the system and for the individual components making it up. Metrics for modeling must therefore be chosen at several levels so that the true system performance can be analyzed appropriately.

The modeler must first determine the full set of metrics available for a study. These metrics must then be examined for variability, redundancy, and completeness. If a metric has low variability, it may be assumed to be static and removed from the list of measures to consider. If one variable provides the same information as another, one of them should be dropped. For example, the queue length is equal to the number in service plus the number waiting for service, so we need not track all three. Completeness means making sure the chosen set of variables reflects the behavior of the real system as fully as possible.
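The variability and redundancy checks described above can be sketched in a few lines. This is only an illustration under stated assumptions: the coefficient of variation as the variability test, Pearson correlation as the redundancy test, the thresholds `min_cv` and `max_corr`, and the sample data are all hypothetical choices, not something the text prescribes.

```python
from statistics import mean, pstdev


def coefficient_of_variation(samples):
    """Standard deviation divided by the mean (0.0 if the mean is zero)."""
    m = mean(samples)
    return pstdev(samples) / abs(m) if m != 0 else 0.0


def pearson(xs, ys):
    """Pearson correlation of two equal-length sample lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def screen_metrics(metrics, min_cv=0.05, max_corr=0.95):
    """Keep metrics that vary enough and do not duplicate one already kept.

    min_cv and max_corr are illustrative thresholds (assumptions).
    """
    kept = []
    for name, samples in metrics.items():
        if coefficient_of_variation(samples) < min_cv:
            continue  # low variability: treat as static
        if any(abs(pearson(samples, metrics[k])) > max_corr for k in kept):
            continue  # carries the same information as a kept metric
        kept.append(name)
    return kept


# Hypothetical observations taken at five sampling instants.
observations = {
    "waiting": [2, 5, 1, 7, 3],
    "in_service": [1, 1, 1, 1, 1],       # never changes
    "in_system": [3, 6, 2, 8, 4],        # waiting + in_service
    "response_ms": [40, 60, 90, 55, 120],
}
kept = screen_metrics(observations)
print(kept)
```

With this data the constant `in_service` series is dropped as static and `in_system` is dropped as redundant with `waiting`. Completeness still has to be judged by the modeler, since no statistic on the collected samples can reveal a measure that was never collected.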

When modeling computer systems, there are many commonly encountered performance metrics. The most common are response time (sometimes called speed, turnaround time, or reaction time), throughput (sometimes called capacity or bandwidth), utilization (sometimes referred to as efficiency or busyness), reliability, and cost/performance ratio.

11.4.1 Response time

Response time is broadly defined as the time interval between a user's request for service and the service's return of results, as shown in Figure 11.1. In reality this definition is overly simplistic. Many more components on both sides of the request/response make up the true measure. If we think about the same transaction processing system we used in our previous example, we begin with the user inputting the transaction. We assume this is a single step, but it can take much longer if the user is working through an interactive interface to the transactional service. The database system must then set up the appropriate data structures and provide resources for the transaction to execute. The transaction is then executed by the database engine. Finally, the transaction completes processing, and the results are prepared and sent off, as shown in Figure 11.2. This sequence of steps, while more complete than the simplistic model, is still only a partial representation of the full transaction processing cycle in a commercial database system.

Figure 11.1: Typical response time measurement.

Figure 11.2: Transaction processing response partitioning.

Each of these steps is a response time component, a subpart of the total transaction response time, just as queue wait time and service time together make up the job time in a queuing model.

The response time for a computer system will typically increase as the load increases. Many measures have been developed to provide rules of thumb for such scenarios. One, called the stretch factor, is computed as the expected response time over the expected service time, or:

\[ \text{stretch factor} = \frac{E[\text{response time}]}{E[\text{service time}]} \tag{11.1} \]

This measure is depicted in Figure 11.3. In most real systems we wish this stretch factor to have a value of approximately 5. If the factor rises above this value, waiting times are growing in relation to service times, which implies higher utilization and, therefore, lower availability of the resource.

Figure 11.3: Stretch factor compared with utilization.
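As a rough illustration of the stretch factor, the sketch below computes it directly from measured times and also shows how it would grow with utilization in one particular model. The second function assumes a single-server M/M/1 queue, where the expected response time is the service time divided by (1 - utilization); that model and the function names are assumptions made for illustration, not something the text specifies.

```python
def stretch_factor(expected_response_time, expected_service_time):
    """Expected response time divided by expected service time."""
    if expected_service_time <= 0:
        raise ValueError("service time must be positive")
    return expected_response_time / expected_service_time


def mm1_stretch_factor(utilization):
    """Stretch factor under an assumed M/M/1 queue: 1 / (1 - utilization)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return 1.0 / (1.0 - utilization)


# Measured case: 250 ms average response, 50 ms average service time.
measured = stretch_factor(250, 50)

# Model case: how the factor grows as an (assumed) M/M/1 server gets busier.
curve = {rho: mm1_stretch_factor(rho) for rho in (0.0, 0.5, 0.75, 0.9)}

print(measured)
for rho, factor in curve.items():
    print(f"utilization {rho:.2f} -> stretch factor {factor:.2f}")
```

Under the M/M/1 assumption the factor stays close to 1 at light load and climbs steeply as utilization approaches 1, which is the general shape Figure 11.3 describes.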

11.4.2 Throughput

The throughput is a measure of the number of items (e.g., transactions) that receive service (e.g., complete transaction execution) over some predefined period of time. For the transaction system we have been discussing, this would be measured as transactions per second, or TPS. For a computer system's CPU, the measure is MIPS, or millions of instructions per second. In communications systems it may be MPS for messages per second or BPS for bits per second. Throughput, as with response time, will grow as additional load is placed on a system. However, unlike response time, throughput will reach a maximum and may then begin to degrade, as shown in Figure 11.4. In this figure the throughput increases over a wide range of load, slows as we approach a saturation point, and levels off at some maximal level. At a critical point in the load, where the response time has begun to increase sharply, the throughput begins to degrade below the maximum. Such curves are typical of computer systems where there is inadequate service capacity for the presented load. We want to keep throughput near its peak, but not too far into the saturation region, so that resources stay available for spikes in load.

Figure 11.4: Throughput curves versus response curves.
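Measuring throughput as defined above amounts to counting completions inside an observation window and dividing by the window length. The following is a minimal sketch; the function name, the half-open window convention, and the timestamps are hypothetical.

```python
def throughput(completion_times, window_start, window_end):
    """Completions per unit time inside [window_start, window_end)."""
    if window_end <= window_start:
        raise ValueError("window must have positive length")
    completed = sum(
        1 for t in completion_times if window_start <= t < window_end
    )
    return completed / (window_end - window_start)


# Hypothetical transaction completion timestamps, in seconds.
completions = [0.2, 0.9, 1.5, 2.1, 2.8, 3.7, 4.4, 5.0]

# Seven transactions complete in the five-second window [0, 5).
tps = throughput(completions, 0.0, 5.0)
print(f"{tps:.2f} TPS")
```

Repeating the measurement at increasing offered loads and plotting the results would give a curve of the kind shown in Figure 11.4.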

11.4.3 Efficiency

Another important measure is efficiency. This measure is related to utilization and throughput. It is the ratio of the actual throughput to the maximum achievable throughput:

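\[ \text{efficiency} = \frac{\text{actual throughput}}{\text{maximum achievable throughput}} \tag{11.2} \]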

If we have a processor rated at 100 megaflops (millions of floating-point operations per second) and, when run in a testbed, we measure 90 megaflops, the processor's efficiency is 90 percent. Efficiency can also be measured for multiple-resource systems. One common use is in examining the performance speedup gained by moving from one processor to n processors. Efficiency in this class of environment is calculated as the ratio of the throughput of the n-device system to the throughput of a single device; in theory, n devices deliver n times the single-device throughput, so the ratio equals n.

In Figure 11.5 we see that the theoretical efficiency of adding more processors is a linear curve with an efficiency equal to the number of devices applied. The real measured curve tells a much different story. The measured efficiency is not linear and falls further below the theoretical curve as more devices are added. This is due to the added overhead involved in keeping the processors effectively utilized in performing tasks.

Figure 11.5: Multiprocessor efficiency curve.
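The two uses of efficiency can be sketched as follows. The first function reproduces the 100-megaflop example; the second computes the multiprocessor ratio of n-device throughput to single-device throughput, whose theoretical value is n. The measured throughput figures are hypothetical numbers chosen only to show the gap between the theoretical and measured curves.

```python
def efficiency(actual_throughput, max_throughput):
    """Actual throughput as a fraction of the maximum achievable throughput."""
    if max_throughput <= 0:
        raise ValueError("maximum throughput must be positive")
    return actual_throughput / max_throughput


def multiprocessor_ratio(throughput_n, throughput_1):
    """Throughput of an n-device system relative to a single device."""
    if throughput_1 <= 0:
        raise ValueError("single-device throughput must be positive")
    return throughput_n / throughput_1


# Processor rated at 100 megaflops, measured at 90 megaflops.
single = efficiency(90, 100)

# Hypothetical measured throughput (megaflops) for n = 1, 2, 4, 8 processors.
measured = {1: 100, 2: 190, 4: 340, 8: 560}
ratios = {n: multiprocessor_ratio(t, measured[1]) for n, t in measured.items()}

print(f"single-processor efficiency: {single:.0%}")
for n, ratio in ratios.items():
    # The theoretical ratio is n; ratio / n is the fraction of it achieved.
    print(f"n={n}: theoretical {n}, measured {ratio:.1f}, fraction {ratio / n:.2f}")
```

In this made-up data the fraction of the theoretical ratio actually achieved shrinks as processors are added, which is the behavior the measured curve in Figure 11.5 illustrates.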

11.4.4 Utilization, reliability, and availability

The utilization of a resource is a measure of how busy the resource is. It is computed as the time the resource is busy servicing clients divided by the entire time period:

\[ \text{utilization} = \frac{\text{time busy servicing clients}}{\text{total time period}} \tag{11.3} \]

The goal in most systems is not to saturate resources (i.e., keep them 100 percent busy) but to balance the utilization so that no device is more heavily utilized than another. In principle this is the goal, but in reality this is very difficult to achieve. Utilization is an important measure when examining systems. Different devices in the system have different average utilization values. For example, processors typically will be highly utilized, while memory, disks, and other peripheral devices will all have smaller fractional use time.
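Utilization can be computed from a log of busy intervals by summing the busy time that falls inside the observation period and dividing by the period length. The sketch below assumes the busy intervals do not overlap one another (a single resource serving one client at a time); the function name and the interval data are hypothetical.

```python
def utilization(busy_intervals, period_start, period_end):
    """Fraction of [period_start, period_end] during which the resource is busy.

    Assumes the (start, end) busy intervals do not overlap each other.
    """
    total = period_end - period_start
    if total <= 0:
        raise ValueError("observation period must have positive length")
    busy = 0.0
    for start, end in busy_intervals:
        # Count only the part of the interval inside the observation period.
        overlap = min(end, period_end) - max(start, period_start)
        if overlap > 0:
            busy += overlap
    return busy / total


# Hypothetical busy intervals for one disk, in seconds.
disk_busy = [(0.0, 2.0), (3.0, 4.5), (6.0, 9.0)]
disk_utilization = utilization(disk_busy, 0.0, 10.0)
print(f"disk utilization: {disk_utilization:.0%}")
```

Running the same calculation for each device over a common observation period gives the per-device values needed to judge how evenly the load is balanced.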

Other important measures in analyzing computer systems include system reliability and system availability. Reliability is a measure of the probability of errors or of the typical time between errors. Most computer systems are fairly reliable, with hardware being more reliable than software. The availability of a system is measured in relation to its reliability. A highly reliable system will be available most of the time. An unreliable system will have periods of downtime, where the system is not running or is running erroneously. In the case of failures, another important metric is the mean time to repair, or MTTR. The MTTR indicates how long, on average, the system will be unavailable after an error. If errors can be quantified and predicted, we can also develop metrics such as mean time to failure, or MTTF.
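One common way to tie these quantities together, not stated in the text above and so offered here as an assumption, is the steady-state availability relation MTTF / (MTTF + MTTR): the fraction of each failure-and-repair cycle during which the system is up. A minimal sketch with hypothetical figures:

```python
def availability(mttf, mttr):
    """Steady-state availability, assuming the relation MTTF / (MTTF + MTTR)."""
    if mttf < 0 or mttr < 0 or mttf + mttr == 0:
        raise ValueError("MTTF and MTTR must be non-negative and not both zero")
    return mttf / (mttf + mttr)


# Hypothetical system: fails every 990 hours on average, 10 hours to repair.
a = availability(990, 10)
print(f"availability: {a:.2%}")
```

The relation makes the point of the paragraph concrete: availability improves either by failing less often (larger MTTF) or by recovering faster (smaller MTTR).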

A final measure used by systems analysts when comparing systems or components is the cost versus performance ratio. This measure is useful in determining which of multiple systems, having the same relative performance, is a better buy. The cost in this case includes, minimally, the hardware and software but also may include licensing, installation, maintenance, and even operations.
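A cost/performance comparison can be sketched as total cost divided by a chosen performance measure. The cost categories follow the paragraph above; the system names, dollar amounts, and the choice of TPS as the performance measure are hypothetical.

```python
def cost_per_unit_performance(costs, performance):
    """Total cost divided by a performance measure (e.g., dollars per TPS)."""
    if performance <= 0:
        raise ValueError("performance must be positive")
    return sum(costs.values()) / performance


# Hypothetical candidate systems with itemized costs in dollars.
systems = {
    "A": {"costs": {"hardware": 30000, "software": 12000, "maintenance": 8000},
          "tps": 400},
    "B": {"costs": {"hardware": 45000, "software": 15000, "maintenance": 12000},
          "tps": 480},
}

ratios = {
    name: cost_per_unit_performance(s["costs"], s["tps"])
    for name, s in systems.items()
}
best = min(ratios, key=ratios.get)

for name, ratio in ratios.items():
    print(f"system {name}: {ratio:.2f} dollars per TPS")
print(f"better buy: {best}")
```

Which cost items to include is itself a modeling decision; leaving out maintenance or operations can change the ranking.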
