Load testing simulates real-world scenarios on your applications and is the only way to quantitatively measure your application's performance and how factors such as network environments, code optimizations, and resource changes affect scalability. Load testing validates application performance, assists in performance tuning, and determines system load capacity.
Interpretation of Load Test Data
It's important that you understand how to interpret the data that you get from load testing your applications. Although entire books could be written based just on how to interpret load test results, the important thing to understand is that this testing should point out the key weaknesses in your application.
Do you have particular pages or sections of your applications that are bottlenecks, slowing down performance in the rest of the application? If so, this should show up in the results of your load tests.
Are you finding that your application doesn't scale as linearly as you'd expect it to? Perhaps that's due to improper use of shared resources that are starting to cause problems as concurrency increases. These issues make themselves evident as load testing progresses.
Keep your eyes open for any unusual or unexpected application behavior as you test. In a perfect world, you'd expect your application to respond incrementally more slowly as you increase load; there should be no point at which the response time suddenly spikes. If that happens, you have a concurrency issue that you need to investigate further and resolve before you take your application live.
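One way to catch that kind of spike is to scan your measured response times programmatically rather than eyeball a chart. The helper below is an illustrative sketch, not part of any load-testing tool; the threshold `factor` and the sample data in the usage note are assumptions chosen for illustration:

```python
def find_spike(loads, response_times, factor=2.0):
    """Scan (load level, response time) pairs from a load test and return
    the first load level at which response time jumps by more than
    `factor` times the previous measurement, or None if degradation
    stays gradual (the healthy case)."""
    for prev, curr, load in zip(response_times, response_times[1:], loads[1:]):
        if prev > 0 and curr / prev > factor:
            return load  # sudden spike: likely a concurrency issue
    return None
```

For example, measurements of 0.2, 0.3, 0.4, then 3.5 seconds at 50, 100, 150, and 200 users would flag 200 users as the point where response time spiked rather than degraded gradually.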
Finding the Sweet Spot
"Finding the sweet spot" is a phrase you might hear used by people who have done load testing on several applications. Essentially, it means making the most of your hardware resources (pushing them to their upper limits) without giving in to unbearable response times from a user's perspective.
Prior to load testing, you need to determine the maximum acceptable response time for any page within your application while it is running under peak load. Peak load is defined as the maximum amount of load that your application can handle before it becomes essentially unresponsive and practically inoperable.
When determining that maximum response time, keep in mind that you're trying to push the hardware to the limit of its usefulness. Thus, saying that "three seconds is as long as anyone should have to wait" is a little unreasonable.
In the load testing in which this author has been involved, an acceptable maximum response time under peak load could come in anywhere from 30 to 60 seconds. If it is any longer than this, the vast majority of your users are going to bail out.
The key to successful load testing is developing and maintaining a load testing methodology. Load testing should be systematic and focused. It should simulate real-world scenarios that effectively emulate user behaviors, browsers, and connection speeds. Although there are several models from which to choose, this part of the chapter will focus on the following: isolation testing, stress testing, and endurance testing. You should use these at different stages of your application's life cycle.
Isolation testing involves focusing on a particular section or functionality of your application. This may be testing the responsiveness of a search engine or measuring the response times of a shopping cart checkout process. Isolation testing is most useful during early stages of development when you are trying to identify potential bugs in sections of code. It can also be useful during QA before deploying new sections of code.
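As a sketch of the idea, the Python helper below times a single isolated operation repeatedly and summarizes the samples. The `operation` callable is a stand-in for whatever section of the application you are isolating (a search query, a checkout step); the summary statistics chosen here are an assumption, not a prescribed methodology:

```python
import time
import statistics

def isolate(operation, iterations=50):
    """Repeatedly invoke one isolated piece of functionality and
    summarize its response times in seconds."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        operation()  # the single section under test, nothing else
        samples.append(time.perf_counter() - start)
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }
```

Because only one section of code runs in the loop, any slowness in the summary points directly at that section rather than at the application as a whole.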
Stress testing is the application of continuous user load, at maximum speed, on your system without any think times between transactions or page views. A hit is the actual delivery of a page; a view is the actual scanning or reading of that page. Think times are the time a user spends viewing the rendered page.
Stress testing disables the loading of images and other HTML resources (such as external style sheets) to eliminate the time incurred waiting for these to render. After all, not only are images usually the last items to load on a page, but they also add to the document weight, which increases the response time.
The goal of stress testing is to stress the system limits. This determines the maximum number of simultaneous requests the application can handle, and it determines the server's load threshold. Stress testing is most useful when applied to identified paths throughout your completed application.
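The shape of a stress run can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a load-testing tool: the `fetch_page` callable is a placeholder for however you issue a request to the page under test, and the key detail is that workers issue requests back to back, with no think time between them:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def stress(fetch_page, users, requests_per_user):
    """Drive `users` concurrent workers against the system at maximum
    speed and report raw throughput plus any errors encountered."""
    def worker():
        errors = 0
        for _ in range(requests_per_user):
            try:
                fetch_page()  # back to back: no think time between requests
            except Exception:
                errors += 1
        return errors

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [pool.submit(worker) for _ in range(users)]
        errors = sum(f.result() for f in futures)
    elapsed = time.perf_counter() - start
    total = users * requests_per_user
    return {"requests": total, "errors": errors, "throughput": total / elapsed}
```

Ramping `users` upward across successive runs, and watching for the point where errors appear or throughput stops climbing, approximates the server's load threshold described above.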
Recommended periods for stress testing include the following:
Endurance testing applies real-world user load over a period of 24 hours or longer. This helps determine the following facts:
Endurance testing is most useful before deployment. Because you are applying a real-world user load over a length of time, this test is useful in determining an optimal server configuration for hosting your application.
The most useful model in determining overall site scalability is stress testing. Isolation tests identify problem areas in code; endurance tests identify problem areas in system stability over time. Stress tests thoroughly assess system resources for pure responsiveness and are useful in determining the correct user load and system settings for your endurance tests.
Just as there are varying load testing models, there are many variations in load testing methodologies. All the major testing tool vendors have their own methodologies that their consultants utilize and that their trainers teach. Similarly, IT department administrators, developers, and Internet service providers (ISPs) also have their own methodologies for measuring a server's load capacity. These methodologies vary as widely as the sites they are used to measure. However, no matter what tool you decide to use or what model you employ, there is a correct way to load test a ColdFusion Server.
The proper way to load test a ColdFusion Server is to test a virtual user path through your application that emulates real users. This probably sounds similar to what you are already doing; however, if you have more than one ColdFusion application on your server, your test script must explore paths along each application and do so simultaneously. This is required because although you may have a virtual site or server for each application, there is only one ColdFusion Server to service CFM requests. That means ColdFusion's pool of simultaneous requests services requests from all these virtual sites in a FIFO (First In, First Out) fashion. If the metrics you gather do not consider data for all these applications, you should consider your data incomplete or "dirty." Thus, load testing and optimizing one application does not buy you much if the other applications on the server perform poorly and you have not considered them in your testing and optimizations.
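One simple way to honor that requirement in a test harness is to assign each virtual user a path through one of the server's applications, weighted by each application's expected share of real-world traffic, so that every application exercises the shared ColdFusion Server in the same run. The sketch below illustrates the scheduling step only; the path names and weights are hypothetical:

```python
import random

def schedule_virtual_users(paths, weights, n_users, seed=None):
    """Assign each virtual user one application path, chosen with
    probability proportional to that application's traffic share,
    so all applications on the shared server are hit simultaneously."""
    rng = random.Random(seed)  # seedable for repeatable test runs
    return [rng.choices(paths, weights=weights, k=1)[0] for _ in range(n_users)]

# Hypothetical example: two applications share one ColdFusion Server,
# with the storefront expected to receive three times the intranet's traffic.
assignments = schedule_virtual_users(
    ["storefront checkout", "intranet search"], [3, 1], n_users=100, seed=42
)
```

Because both paths run in the same test, the metrics you gather reflect contention for the single pool of simultaneous requests rather than one application measured in artificial isolation.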
Other Performance Considerations
When determining the maximum amount of load your application can handle, it can be helpful to look at some of the other ways that you can improve the performance of your application to help achieve the best results possible during your testing. Some of these methods are discussed in the following list:
Of course, beyond these methods, there's the tried and true "throw hardware at it" method. If you need to be able to serve more users concurrently, but you simply cannot squeeze any more performance out of your application or current hardware, you can always explore clustering your application or database servers to pool horsepower or divide load among several servers.
This strategy is a perfectly acceptable method for increasing application performance, and you'll find that, if your traffic grows for any length of time, it will be a method that you'll have to eventually explore.
There are many methods for calculating the amount of hardware that you'll need to scale to a certain level of load. Although we don't endorse any particular method, you'll find that this is an area where your load-testing data will come in handy. For example, if your load testing shows that your current hardware can adequately support 200 concurrent connections, you should be able to extrapolate that servicing 1,000 simultaneous users will require a cluster of five similar servers.
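That extrapolation is a simple ceiling division, sketched below. Note the built-in assumption: scaling is treated as near-linear, and in practice load-balancing and clustering overhead will erode that, so treat the result as a lower bound rather than a guarantee:

```python
import math

def servers_needed(target_users, users_per_server):
    """Extrapolate cluster size from measured single-server capacity,
    assuming near-linear scaling across the cluster."""
    return math.ceil(target_users / users_per_server)

# 1,000 target users / 200 users per server -> a cluster of 5 servers
```

In the example above, a server measured at 200 concurrent connections yields a five-server cluster for 1,000 simultaneous users; a target of 1,001 would round up to six.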