17.2 Performance Versus Load Testing | Programming Jakarta Struts, 2nd Edition

There are many different types of software testing: functional, unit, integration, white box, black box, regression, and so on. Performance and load testing are among the most important types of testing, but they usually get the least amount of attention. There are two general reasons for this neglect. The first reason is that developers typically wait until the very end of the development cycle to start testing the performance of the application, and the end of the cycle is when you have the least amount of time for testing. It is true, however, that it's not always practical to conduct performance testing during every phase of development. Early phases tend to focus on the architecturally significant pieces, and there may not be enough of the application built to test its performance. You should, however, gather some preliminary performance measurements as early as possible.

Another reason that performance and load testing don't get much attention is that they're honestly hard to do. While there are many tools on the market, both free and commercial, it's not always easy to use these tools to detect problems. The tools must be able to simulate many simultaneous users of a system, but that involves understanding what virtual users are,^[2] what the different threading models are, and how they affect performance and load. Also, you must be able to look at the results and determine whether or not they are acceptable. All of this can be overwhelming to the average developer. This is part of what keeps developers from conducting the tests; they just don't understand the necessary steps, or how and where to get started. Many organizations house a separate team that is solely responsible for performance testing.

^[2] Virtual users are simulated users that testing applications use to simulate the impact of multiple users on a system without requiring "real" users to sit around testing the application. A user session is recorded for each virtual user and can be played back as if a real user were using the application.

Although performance, load, and stress testing are related, they are not the same thing and are not carried out in exactly the same manner. Performance testing involves executing the functional behavior of the application and timing how long each task takes to complete. The amount of time that a single task takes to finish is known as its response time. If you execute the same task many times and average the individual response times, the result is its average response time.
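The averaging described above can be sketched in a few lines of Java. This is only an illustration, assuming Java 8 or later: the class name `ResponseTimer`, its methods, and the sleeping stand-in request are hypothetical and are not part of Struts or the Storefront application. A real test would replace the stand-in with an actual request to the application.

```java
/** Hypothetical helper that times a task repeatedly and averages the results. */
final class ResponseTimer {

    /** Runs the task the given number of times and returns the mean elapsed time in milliseconds. */
    static double averageResponseMillis(Runnable task, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start; // response time of this run
        }
        return totalNanos / 1_000_000.0 / runs;
    }

    /** Stand-in for a real request (an assumption): it simply sleeps for the given time. */
    static Runnable sleepingRequest(long millis) {
        return () -> {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
    }

    public static void main(String[] args) {
        double avg = averageResponseMillis(sleepingRequest(20), 10);
        System.out.printf("Average response time: %.1f ms%n", avg);
    }
}
```

Timing each run separately, rather than timing the whole loop, keeps the individual response times available if you later want to report the minimum, maximum, or a percentile as well as the average.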

The average response time for the signin action in the Storefront application, for example, is roughly 250 milliseconds.^[3] This result is for a single user. You should always conduct the initial performance testing using a single user to get a baseline. If there are performance bottlenecks for a single user of the system, you can bet that these problems will have an impact when multiple users start logging in. In general, the shorter the response time, the faster the application appears to its users. This end-to-end time can also be thought of as the transaction time for the function being tested.

^[3] The tests were conducted on a Pentium III 750 MHz machine with 1 GB of memory and all tiers collocated on the same box.

Based on the response time, you can come up with a rough throughput estimate. Throughput is the number of transactions that can occur in a set amount of time. The theoretical throughput calculated from a single user's response time will probably differ from the throughput you see under real loads. Thanks to multiprocessing and other hardware and software features, an application can often achieve a higher throughput when more hardware and software resources are added. The extra resources let the application process more transactions per time period, which increases the throughput numbers.
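As a rough illustration of that calculation, the sketch below derives a theoretical single-user throughput from an average response time. It assumes the user issues requests back to back with no think time, which is an idealization. The class and method names are hypothetical.

```java
/** Hypothetical helper for a back-of-the-envelope throughput estimate. */
final class ThroughputEstimate {

    /** Theoretical transactions per second for one user sending requests back to back. */
    static double singleUserTps(double avgResponseMillis) {
        if (avgResponseMillis <= 0) {
            throw new IllegalArgumentException("response time must be positive");
        }
        return 1000.0 / avgResponseMillis;
    }

    /** Converts transactions per second to transactions per minute. */
    static double perMinute(double tps) {
        return tps * 60.0;
    }

    public static void main(String[] args) {
        double tps = singleUserTps(250.0);                    // 250 ms average response time
        System.out.println(tps + " tps");                     // prints "4.0 tps"
        System.out.println(perMinute(tps) + " per minute");   // prints "240.0 per minute"
    }
}
```

With the 250-millisecond signin figure, the estimate works out to 4 transactions per second for a single user. Treat this as a starting point only, since measured throughput under concurrent load can be higher or lower.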

Load testing is analogous to volume testing. This type of testing is performed to see how the application will react to a higher user load on the system. During this type of testing, you can adjust the hardware and software configurations to determine which configuration gives the best throughput for a given user load. Load testing is usually hard to conduct because you are constantly going back and forth, adjusting system configurations to see which one gives you a higher throughput. No application can sustain an infinite user load. The idea is to try to maximize the number of concurrent users with an acceptable average response time.
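To make the idea of a user load concrete, here is a minimal sketch of a load run in which each simulated user is a thread that issues a fixed number of requests. It is not a substitute for a real load-testing tool. It assumes Java 8 or later, and the class `LoadTestSketch`, its `Result` type, and the sleeping stand-in request are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: runs the same request from several concurrent simulated users. */
final class LoadTestSketch {

    /** Summary of one load run. */
    static final class Result {
        final int users;
        final double avgResponseMillis;
        final double throughputTps;

        Result(int users, double avgResponseMillis, double throughputTps) {
            this.users = users;
            this.avgResponseMillis = avgResponseMillis;
            this.throughputTps = throughputTps;
        }
    }

    static Result run(int users, int requestsPerUser, Runnable request) {
        ExecutorService pool = Executors.newFixedThreadPool(users);
        try {
            List<Callable<Long>> virtualUsers = new ArrayList<>();
            for (int u = 0; u < users; u++) {
                virtualUsers.add(() -> {
                    long total = 0;
                    for (int i = 0; i < requestsPerUser; i++) {
                        long start = System.nanoTime();
                        request.run();
                        total += System.nanoTime() - start;
                    }
                    return total; // summed response time for this simulated user
                });
            }
            long wallStart = System.nanoTime();
            List<Future<Long>> futures = pool.invokeAll(virtualUsers); // waits for all users
            long wallNanos = System.nanoTime() - wallStart;

            long totalNanos = 0;
            for (Future<Long> f : futures) {
                totalNanos += f.get();
            }
            int totalRequests = users * requestsPerUser;
            double avgMillis = totalNanos / 1_000_000.0 / totalRequests;
            double tps = totalRequests / (wallNanos / 1_000_000_000.0);
            return new Result(users, avgMillis, tps);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("load run interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("simulated user failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    /** Stand-in for a real request (an assumption): it simply sleeps for the given time. */
    static Runnable sleepingRequest(long millis) {
        return () -> {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
    }

    public static void main(String[] args) {
        Result r = run(10, 5, sleepingRequest(20));
        System.out.printf("%d users: avg %.1f ms, %.1f tps%n",
                r.users, r.avgResponseMillis, r.throughputTps);
    }
}
```

Running the same sketch against different hardware or software configurations, and comparing the reported average response time and throughput, mirrors the back-and-forth adjustment described above.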

Throughput is usually measured in transactions per second (tps), but it can also be measured per minute, per hour, and so on. Armed with response times and throughput numbers, you can make intelligent decisions about how the application should be configured to ensure the best performance and scalability for the users. You should share these numbers with the network engineers, so they'll understand how much network bandwidth the application might require.

Stress testing is the next logical extension. Stress testing is essentially load testing using peak loads. When conducting stress tests, you really are trying to stress the application to its limits to see how it reacts, how efficiently the memory is used, and what other types of problems will surface.
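One way to look for that limit is to raise the simulated user count in steps until the average response time is no longer acceptable. The sketch below shows only the ramp logic. The measurement itself is passed in as a function, so the class `StressRamp`, its parameters, and the linear model used in `main` are hypothetical stand-ins and not measured data.

```java
import java.util.function.IntToDoubleFunction;

/** Hypothetical sketch: steps up the user load until response time exceeds a limit. */
final class StressRamp {

    /**
     * Returns the highest user count tried whose average response time stayed
     * within limitMillis, or 0 if even the first step exceeded the limit.
     */
    static int maxAcceptableUsers(IntToDoubleFunction avgResponseMillisAt,
                                  int step, int maxUsers, double limitMillis) {
        if (step <= 0 || maxUsers <= 0) {
            throw new IllegalArgumentException("step and maxUsers must be positive");
        }
        int lastAcceptable = 0;
        for (int users = step; users <= maxUsers; users += step) {
            double avg = avgResponseMillisAt.applyAsDouble(users);
            if (avg > limitMillis) {
                break; // the application no longer meets the response-time target
            }
            lastAcceptable = users;
        }
        return lastAcceptable;
    }

    public static void main(String[] args) {
        // Synthetic model for demonstration only: 100 ms base plus 2 ms per user.
        IntToDoubleFunction model = users -> 100.0 + 2.0 * users;
        int max = maxAcceptableUsers(model, 50, 1000, 500.0);
        System.out.println("Highest acceptable load: " + max + " users"); // prints 200 for this model
    }
}
```

In practice the function passed in would drive a load-testing tool at the requested user count and return the measured average, and you would also watch memory use and error rates at each step.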

Stressing your application under a heavy simulated load offers many benefits. In particular, stress testing allows you to:

Identify bottlenecks in the application under a large user load before they occur in the production environment.
Control risks and costs by predicting scalability and performance limits.
Increase uptime and availability of the application through proper resource planning.
Avoid missing go-live dates due to unexpected performance and scalability problems.

Performance, load, and stress testing should be performed on an application to get the complete picture. They can point to parts of the application that might become bottlenecks, both under normal loads and as the number of users climbs.