It should be clear by now that defining the necessary level of performance requirements is a critical first step in the validation process. Once you have defined the requirements, you can identify a set of tests that should be conducted to measure performance. These tests should be conducted at various points during the development process to ensure that you are within striking distance of your performance requirements. As the application approaches completion, you can begin validating performance in the context of a test environment resembling the environment in which the application will ultimately be deployed. If your tests indicate that your performance requirements are not being met, you should conduct a series of controlled experiments to locate performance bottlenecks. You then remove the bottlenecks until your goals are met, as illustrated in Figure 13-1.
Figure 13-1. The performance validation process.
We will look at each of these steps in the sections that follow.
We examined performance requirements briefly in Chapter 7. Here we'll get into the specifics of how to define a "good" performance requirement. Keep in mind that you should define your performance requirements up front, before you begin development and debugging. To define a good performance requirement, you must identify any project constraints, specify the services performed by the application, and specify the load on the application; you then use this information to select tangible metrics to measure and to determine the specific values that must be achieved for those metrics.
Some aspects of your project cannot be changed to improve performance. Some of these constraints were identified in Chapter 7 as deployment and security requirements. You might also have constraints on your schedule or your choice of development tools or technologies. For example, your application might need to be deployed by a certain date to meet certain contractual obligations. Your development team might have Microsoft Visual Basic expertise but no C++ expertise, making it impractical to develop components using C++. You might face some hardware constraints as well, particularly for user workstations. Whatever your constraints are, be sure to document them. These are factors that you will hold constant during performance tuning. If you cannot achieve satisfactory performance within these constraints, you might need to ask your management or your customers to revisit the constraints.
You should also start thinking about the aspects of your project that are not constrained. These are the factors you can modify during the tuning process to see whether performance can be improved. For example, can you implement your components in a different language? Can you use different data access technologies? Do you really need transactions? Can you add machines to your application topology? These questions can help you identify ways to remove bottlenecks in your system.
Before you can measure performance, you need to determine what you are measuring the performance of. Applications typically provide one or more services to users. These services typically correspond to scenarios in the functional specification, as discussed in Chapter 7. Usually, each of these scenarios can be described as a set of transactions. Even if transactions are not involved, a sequence of interactions with the user takes place for each scenario. You should define the semantics of each performance-sensitive scenario—that is, define precisely what the user does and what the application service does in response, including how the service accesses databases and other system services. These definitions will drive the tests you use to measure performance.
In addition to defining which services should be measured, you should specify how often these services are used. For example, for the Island Hopper application, you might expect users browsing classified ads to account for 75 percent of system usage and users placing classified ads to account for 25 percent of system usage, with negligible usage of other services. Accurate estimates of the usage of various application services will help you create tests that closely mimic the expected usage of the system, improving the accuracy of your performance test results.
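A test driver can reproduce such a usage mix with weighted random selection. The following sketch uses the 75/25 Island Hopper split estimated above; the scenario names are illustrative, not part of the application:

```python
import random

# Estimated usage mix for the Island Hopper application.
# Weights mirror the 75%/25% estimate above; names are hypothetical.
USAGE_MIX = {
    "browse_ads": 0.75,   # users browsing classified ads
    "place_ad":   0.25,   # users placing classified ads
}

def pick_scenario(rng=random):
    """Choose the next scenario to drive, weighted by expected usage."""
    scenarios = list(USAGE_MIX)
    weights = [USAGE_MIX[s] for s in scenarios]
    return rng.choices(scenarios, weights=weights, k=1)[0]

# Over many picks, the simulated mix converges on the estimate.
counts = {s: 0 for s in USAGE_MIX}
for _ in range(10_000):
    counts[pick_scenario()] += 1
```

A driver built this way issues requests in roughly the expected proportions, so throughput measured during the test reflects the mix you expect in production.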
You also need to estimate the load on the application. A common way to measure load is by identifying the number of clients that will use the application. A related measure is think time. Think time is the elapsed time between receiving a reply to one request and submitting the next request. For example, in the Island Hopper application, you might estimate that it takes about 60 seconds for a user to enter all the information required to place a classified ad. This would be the think time for the "Place a Classified Ad" scenario.
You should also consider how the load varies over time. For some applications, the load will remain fairly constant. Other applications will exhibit varying loads. For example, a payment processing application might have heavier usage during the week payments are due. An insurance claims application would have heavier load when a natural disaster such as a hurricane or tornado occurs. A help desk application might have heavy load in the month following the release of a software upgrade. Using information about how the load varies over time, you can determine the peak and average loads on the system. Your performance requirements can be based on either or both of these measures.
Once you have identified constraints, services, and load, you need to define the specific performance goals, or requirements, for the application. First select the specific metrics you will measure. One common metric is total system throughput, in terms of transactions per second (TPS). This quantity is measured for a given mix of service requests (that is, transactions) and a given user load. Another common metric is response time, which is the elapsed time between submitting a request and receiving a reply. Response time metrics are often specified at a certain percentile—for example, you might specify that 95 percent of all requests must respond in less than one second.
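A percentile requirement like this is straightforward to evaluate from recorded response times. A minimal sketch, assuming response times have been collected in seconds (the sample data is invented):

```python
def percentile_response(times, pct):
    """Return the response time at the given percentile (0-100)."""
    ordered = sorted(times)
    # Index of the smallest value such that pct percent of samples are <= it.
    k = max(0, int(round(pct / 100.0 * len(ordered))) - 1)
    return ordered[k]

def meets_requirement(times, pct=95, limit=1.0):
    """True if pct percent of requests responded within `limit` seconds."""
    return percentile_response(times, pct) <= limit

# Invented sample: 19 fast responses and one slow outlier.
samples = [0.2] * 19 + [3.0]
```

Note that the single 3-second outlier does not cause the 95th-percentile check to fail, which is exactly why percentile metrics are preferred over worst-case response time.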
After you select the appropriate metrics, you need to specify the required values for those metrics. These values should be realistic measures of the necessary performance of the application—"as fast as possible" is almost never the correct answer. A simple way to determine the TPS requirement is to divide the number of clients by the think time. For example, in the Island Hopper application we might determine that on average the application needs to support 1200 simultaneous clients with a 60-second think time. This gives us a value of 20 TPS for average load. Response time measures should take user expectations into account. In the Island Hopper application, once a user submits a classified ad, we might determine that the user will not wait longer than 5 seconds before deciding that the application is not working correctly. So we might specify the response time requirement as 95 percent response within 5 seconds over a 28.8 Kbps modem connection.
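The rule of thumb above is simple enough to capture directly. A sketch using the Island Hopper numbers from the text:

```python
def required_tps(clients, think_time_seconds):
    """Estimate required throughput: each client submits roughly one
    request per think-time interval, so load = clients / think time."""
    return clients / think_time_seconds

# Island Hopper average load: 1200 simultaneous clients,
# 60-second think time.
avg_tps = required_tps(1200, 60)
```

The same calculation applied to the estimated peak client count gives the peak-load TPS requirement.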
After you have identified the specific performance requirements, you can begin testing whether your application meets those requirements. It is important that you eliminate as many variables as possible from your tests. For example, bugs in your program can create the appearance of a performance problem. To compare test results from different performance test passes, you must be sure that your application is working correctly. It is especially important to retest application functionality if you have modified the implementation of a component or an application as part of the tuning process. Be sure that your application passes its functional tests before running performance tests. In addition to unexpected application changes, you should confirm that no unexpected changes have occurred in hardware, network traffic, software configuration, system services, and so on.
To tune performance, you need to keep accurate and complete records of each test pass. You must record the exact system configuration, especially any changes from previous test passes. You should record both the raw data and the calculated results from performance monitoring tools. These records not only help you determine whether you have met your goals, but can also help you identify the potential causes of performance problems down the road.
During each test pass, you should run exactly the same set of performance tests—otherwise you won't know whether any difference in results is due to the tests or to changes in your application. Try to automate as much of the performance test set as possible to eliminate operator differences.
During performance testing, you will measure and record values for the metrics specified in your performance goals. You should also ensure that the conditions defined in your goals for think time, transaction mix, and so on are met. Within these constraints, you should make the testing as realistic as possible. For example, you'll want to test how the application performs when many clients are accessing it simultaneously. Usually, you will not be able to run a reproducible test with multiple clients. However, you can simulate multiple clients in a reproducible manner using a multithreaded test application, in which each thread represents one client. If your application accesses a database, you'll want to ensure that the database contains a realistic number of records and that your tests use random (but valid) values for data entry. If you use a small test database, the effects of caching in the database server will give you unrealistic test results. You might also obtain unrealistic results if data is entered or accessed in unrealistic ways. For example, it's unlikely that new data would be created in alphabetical order on the primary key.
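The multithreaded approach described above can be sketched in a few lines. This illustrates the pattern only (it is not the MTS Performance Toolkit harness): each thread plays one client, repeatedly submitting a request via a caller-supplied stub and then sleeping for the think time.

```python
import threading
import time

def simulate_client(client_id, requests, think_time, submit, results):
    """One thread = one simulated client."""
    for _ in range(requests):
        start = time.perf_counter()
        submit(client_id)                       # stand-in for the real request
        results.append(time.perf_counter() - start)
        time.sleep(think_time)                  # think time between requests

def run_load_test(clients=5, requests=3, think_time=0.01,
                  submit=lambda cid: None):
    results = []    # list.append is thread-safe in CPython
    threads = [
        threading.Thread(target=simulate_client,
                         args=(cid, requests, think_time, submit, results))
        for cid in range(clients)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results  # one response time per simulated request

times = run_load_test()
```

Because the client count, request count, and think time are ordinary parameters, the same driver can be rerun with identical inputs on every test pass, which is what makes the results comparable between passes.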
The MTS Performance Toolkit provides sample test harnesses you can use as models for building automated test harnesses for your own applications. These sample test harnesses demonstrate how to collect TPS and response-time metrics, as well as how to simulate multiple clients using multiple threads. You'll usually want to build a test harness that lets you specify the transaction mix, think time, number of clients, and so on as input parameters. However, the rules for creating realistic random data will probably be encoded within the test harness itself.
In addition to sample test harnesses, the MTS Performance Toolkit provides a great deal of information about the process of validating performance of MTS applications. You should read the MTS Performance Toolkit documentation as a supplement to this chapter. A beta version of the toolkit is contained in the MTSPERF.EXE file, which is located in the \Performance directory on the companion CD. Check the COM home page, http://www.microsoft.com/com/, or the Microsoft Platform SDK for updates.
Once you have created a test harness for driving your application, you should document all the invariant conditions for running the tests. At a minimum, these conditions should include the input parameters required to run the test harness. You should also document how to set up a "clean" database for running the test—that is, a database that does not contain changes made by a previous test pass—and the machine configurations used for the test. Usually, you will want to run the test harness on a separate machine from the MTS application, as this more closely approximates a production environment.
After you have defined your performance goals and developed your performance tests, you should run the tests once to establish a baseline. The more closely your test environment resembles the production environment, the more confidence you can have that the application will perform acceptably after deployment. Remember, you should try to create a realistic test environment right from the beginning.
If you're lucky, the baseline performance will meet your goals and you won't need to do any tuning. More likely, the baseline performance will not be satisfactory. However, by documenting the initial test environment and the baseline results, you have a solid foundation for your tuning efforts.
If the baseline performance is not acceptable, you can often improve performance by scaling out your application. Scaling out means adding MTS server machines to your application topology to distribute the client load. You can also scale out by adding database servers and partitioning data access.
To add MTS server machines to your application topology, you install the same MTS packages on all the machines and then generate a client install program for each machine. By distributing each machine's client install program to a specific set of users, you statically allocate those users to a particular MTS server. Partitioning the client load across multiple machines in this way will usually improve overall performance substantially.
When multiple MTS servers or database servers are used in a transactional application, writing the Microsoft Distributed Transaction Coordinator (MS DTC) logs can add significant overhead, decreasing your application's performance. As we'll see in the section "Transaction Bottlenecks" later in this chapter, you can improve performance by using a single MS DTC for all machines.
Scaling out is often an attractive solution to meeting performance goals because it improves performance without requiring any code changes in your application. The cost of adding hardware to the application topology is usually much less than the development and testing costs associated with changing application code, especially relative to the performance gains. Scaling out might increase administrative overhead, but again the performance benefits typically outweigh the administrative costs.
If your performance requirements are not met after you scale out, or if scaling out is not an option, you should use the data from your test results to identify bottlenecks in the system and form a hypothesis about their cause. Sometimes the test data is not sufficient to form a hypothesis and you will need to run additional tests using other performance monitoring tools to isolate the cause of the bottleneck. Some commonly used tools for monitoring the performance of MTS-based applications are Microsoft Windows Task Manager, the Transaction Statistics pane in the MTS Explorer, Microsoft Windows Performance Monitor (PerfMon), and the Visual Studio Analyzer.
The Performance tab in Task Manager, shown in Figure 13-2, provides information about CPU and memory usage on a particular machine. The Processes tab provides information about CPU and memory usage by all the processes on that machine. You can use this information to determine at a high level where bottlenecks might be located.
Figure 13-2. The Performance tab in Task Manager.
The Transaction Statistics pane in MTS Explorer, shown in Figure 13-3, can also be used to collect high-level information about your application's performance—that is, if the application uses transactions. You can determine how many transactions commit or abort during a test pass, as well as the minimum, maximum, and average response times. Note that these response times are for transactions only, not for the entire end-to-end scenario you probably want to measure. Also, these statistics do not distinguish between different types of applications in your system. However, you can use this information to get a rough idea of how distributed transactions impact your application's overall performance.
Figure 13-3. The Transaction Statistics pane in MTS Explorer.
PerfMon, shown in Figure 13-4, is a useful tool for identifying bottlenecks and suggesting possible causes. PerfMon is a GUI application that lets you observe various performance counters on a Windows NT system. Performance counters measure the throughput, queue lengths, congestion, and other metrics associated with devices and applications. Although MTS itself does not currently provide any performance counters, you can use performance counters for devices such as memory, disks, and the CPU to identify many bottlenecks. System applications such as SQL Server also provide performance counters that can help identify bottlenecks.
Figure 13-4. The Windows Performance Monitor.
You will probably want to chart a standard set of performance counters for every performance test. The most common performance problems in MTS applications are due to insufficient RAM, insufficient processor capacity, disk access bottlenecks, and database hotspots. Table 13-1 describes a set of performance counters you can use to identify these common bottlenecks.
Table 13-1. Performance counters for identifying common bottlenecks.
| Counter | Meaning | Interpretation |
| --- | --- | --- |
| Memory: Page Faults/Sec | Rate at which the processor handles page faults | Sustained page fault rates over 5/sec indicate that the system has insufficient RAM. |
| Physical Disk: % Disk Time | Percentage of elapsed time that the selected disk drive is busy servicing read or write requests | Percentages over 85%, in conjunction with Avg. Disk Queue Length over 2, might indicate disk bottlenecks, if insufficient RAM is not causing the disk activity. |
| Physical Disk: Avg. Disk Queue Length | Average number of read and write requests queued during the sampling interval | Queue lengths over 2, in conjunction with % Disk Time over 85%, might indicate disk bottlenecks, if insufficient RAM is not causing the disk activity. |
| System: % Total Processor Time | Percentage of time the processors are busy doing useful work | Percentages consistently over 80% indicate CPU bottlenecks. |
| System: Processor Queue Length | Instantaneous count of threads waiting for processor cycles | Queue lengths greater than 2 generally indicate processor congestion. |
| SQL Server: Cache Hit Ratio | Percentage of time that SQL Server finds data in its cache | Percentages less than 80% indicate that insufficient RAM has been allocated to SQL Server. |
| SQLServer-Locks: Total Blocking Locks | Number of locks blocking other processes | High counts can indicate database hot spots. |
Many other performance counters are available; you can find information about these counters in PerfMon, the Platform SDK, the Windows NT Resource Kit, and other Microsoft resource kits.
The Visual Studio Analyzer, included with Microsoft Visual Studio 6.0, can also be used to monitor performance counters. In addition, it can be used to monitor events related to your application's components and communication between components. Both COM and MTS fire events that the Visual Studio Analyzer can capture, thus helping you to identify performance bottlenecks related to your component implementations. For example, you can identify method calls that are consistently slow.
Visual Studio 6.0 provides extensive documentation on using the Visual Studio Analyzer.
After you have collected data using the performance monitoring tools, you should know whether a bottleneck exists and what is causing it. Based on your hypothesis about the cause of a bottleneck, you need to devise and implement a solution to the problem. Sometimes this process is easy, but often the performance data does not give a clear indication of how the problem might be fixed. In this case, you might need to conduct a number of experiments, changing one aspect of the application or test environment at a time and observing how each change impacts performance. As you gain more experience with performance tuning, you'll begin to see common problems and solutions. Some of the common problems the Microsoft COM team has identified in its performance work are listed in the following section, "Common Bottlenecks."

After you change the application or test environment to eliminate the bottleneck, you should retest the application to verify that the bottleneck has indeed been eliminated. If the change has no impact or makes performance worse, you should undo the change and try something else.