Production Simulation Requirements


A performance tool simulates user load against your web site or test system. These automated performance tools work better than human testers in almost all cases. First of all, you cannot round up a thousand or so humans to run test scenarios against a large web site. Secondly, even if your web site supports significantly smaller user loads (20 or 30 concurrent users, for example), live testers typically do not simulate real world users well. Live testers tend to get "click-happy" and move through the web site very quickly. Thus, they usually generate much shorter think times than found in production. An automated tool gives you more control over the simulation, and allows you to test larger loads with fewer resources.

A performance test tool (sometimes also called a load driver) executes your test scripts. As discussed in Chapter 7, most load drivers provide capture and replay capabilities to partially automate the creation of test scripts. This technology allows you to "record" a typical browser interaction with your web site and then automatically replay this same interaction. As shown in Figure 8.1, the tool typically sits between the browser and the server and "listens" to the traffic flow to create the initial script. However, recorded scripts usually need additional customization to adequately simulate real users. Making these changes requires programming skill as well as time.

Figure 8.1. Script recording


Performance test tools vary significantly in capability and price. To choose the right tool, begin by understanding the key features of your site and the goals of your test. For example, let's think back to Chapter 5 and the distinguishing features for some of the different types of web sites, as summarized in Table 8.1. Let's map some of the differences to test tool requirements for each major web site type.

Table 8.1. Some Performance Characteristics of Common Web Site Types
Web Site Type | User Pressure | Throughput Dependent | Response Time Requirements | Caching Potential
------------- | ------------- | -------------------- | -------------------------- | -----------------
Financial     | High          | High                 | High                       | Medium
B2B           | High          | Medium               | Medium                     | Low
e-Commerce    | High          | High                 | High                       | High

Financial Web Site

A financial web site experiences high user pressure and high throughput. These web sites receive large numbers of users, some of whom stay on the site all day, as well as large transaction volumes. Security plays a large role in these sites. Therefore, a tool simulating load for a financial web site must support the following requirements:

  1. Simulates many concurrent users

  2. Simulates both long user sessions and a large number of short, frequent visits

  3. Simulates a mixed workload of many queries and a smaller number of buy/sell transactions

  4. Simulates secure transactions

  5. Supports high transaction throughput

  6. Simulates spikes in traffic

B2B Web Site

A B2B web site experiences high user pressure but typically lower throughput dependencies. Users on a B2B site frequently stay on all day, with multiple, but infrequent, query and update interactions. The web site secures transaction traffic. To simulate a B2B workload, look for a tool satisfying these requirements:

  1. Simulates many logged-on users

  2. Simulates very long user sessions

  3. Simulates few transactions with long wait times

  4. Simulates secure transactions

e-Commerce Web Site

An e-Commerce site receives more concurrent users than the other site types and must support a high transaction rate. Users typically visit the site briefly and consider site response time of paramount importance. Browse interactions significantly outnumber actual purchase transactions. These sites require security only for purchase and account-related transactions. A tool needs to support the following to simulate an e-Commerce workload:

  1. Simulates many users

  2. Simulates a mix of short and long user sessions

  3. Simulates a mixed workload of queries and buy/sell transactions

  4. Simulates secure transactions

  5. Simulates browsing/searching large catalog databases of items

The web site types discussed here share some common workload simulation requirements: All the sites require a way to simulate logged-on user sessions, a mixed workload of read and update transactions, and security.

However, not every site listed here shares every requirement. Your site may also need special performance test features to meet its requirements. For example, if you performance test an e-Commerce site with a large database, you may need a tool that supports easily importing selections from this database into your scripts. Let's take an in-depth look at some of these different tool considerations for workload simulation.

Users

User simulation is paramount to a successful performance test. Every load driver tool claims the ability to simulate many users. However, some tools may simulate only a limited number of unique users. Based on your test plan, determine how many users you need to simulate, and also consider the users you need for future growth of the web site. (Remember, your test scripts represent an investment. Make sure your tool satisfies your projected load testing needs for some time to come.)

For example, an e-Commerce site typically needs more simulated users than a B2B site. The number of virtual users required contributes significantly to the overall cost of test hardware and software. Clearly, the more realistic the test, the better; however, simulating all of the expected users may simply cost too much.

The key differentiators between tools regarding user simulation include

  • Think time support

  • User ramp-up and load control

  • Accurate browser simulation, including browser caching

  • Accurate cookie support, including the ability to dump the cookie cache between script iterations

  • Hardware requirements and platform support

  • Multiple driver machine support

Think Time

The most realistic user simulations include think time simulation. Some basic load driver tools, such as Apache Bench, do not provide the ability to simulate think time. However, almost all high-end tools provide think time support, although they differ in the flexibility they provide for adjusting these times. If simulating think time is a high-priority requirement for your site, look for a tool that allows you to change think times without manually updating your scripts.

In order to lower test costs, you may simulate your load by using fewer simulated users with little or no think time. This often generates the desired throughput with fewer user licenses. For example, if your market research projects 1,000 concurrent users, and your usability testing shows an average think time of 10 seconds, this equates to 100 active users at any point in time (assuming each request takes about one second to service):

    1 request / 10 seconds per user × 1,000 concurrent users = 100 requests/second

In this case, using 100 simulated clients without think time produces the same active load as 1,000 users with think time.
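To make the arithmetic concrete, here is a minimal calculation sketch in plain Java, not tied to any test tool. The one-second average response time is our assumption; substitute your own measurements.

```java
// Sketch: how many no-think-time virtual users produce the same request
// pressure as the full user population? (Assumed numbers, per the example.)
public class ThinkTimeEquivalence {
    public static void main(String[] args) {
        int concurrentUsers = 1_000;       // projected production users
        double thinkTimeSeconds = 10.0;    // average think time from usability tests
        double responseTimeSeconds = 1.0;  // assumed average response time

        // Each user submits roughly one request per think-time interval.
        double requestsPerSecond = concurrentUsers / thinkTimeSeconds;       // 100 req/s

        // A no-think-time user issues a new request as soon as the previous
        // response returns, so it sustains 1/responseTime requests per second.
        double virtualUsersNeeded = requestsPerSecond * responseTimeSeconds; // 100 users

        System.out.printf("Target load: %.0f requests/second%n", requestsPerSecond);
        System.out.printf("No-think-time virtual users needed: %.0f%n", virtualUsersNeeded);
    }
}
```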

Important! This technique proves ineffective for many web sites using HTTP sessions in their web applications. In the first test, your application accesses 1,000 HTTP sessions in memory, whereas in the second test, it accesses only 100 HTTP sessions. Also, the HTTP sessions time out faster using this technique than in a full user test. The case study in Chapter 14 provides an example of testing with reduced think time. Of course, if you use this technique, look for a test tool that allows you to easily reduce or eliminate think times during test execution.

Ability to Control User Load and Ramp-Up

Beyond how many users the tool supports, also consider if the tool allows you to add simulated users during a test run. Adding users to a running test helps simulate spikes in user loading as well as patterns of increasing load. Most high-end performance test tools allow you to specify how many users to start and when to add additional users. For example, when running a 1,000-user test, you might start 100 users and add another 100 users every minute until reaching 1,000 users. This gives your site a gradual "warm-up" period.
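Most tools handle this scheduling for you; if yours does not, the underlying mechanism is simple enough to sketch. The following is a minimal, hypothetical ramp-up driver in plain Java, not any vendor's API; VirtualUser is a placeholder for whatever replays your recorded script.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: start 100 virtual users, then add 100 more each minute until
// reaching 1,000, giving the site a gradual warm-up period.
public class RampUpDriver {
    public static void main(String[] args) {
        final int batchSize = 100;
        final int targetUsers = 1_000;
        final AtomicInteger started = new AtomicInteger();
        final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);

        scheduler.scheduleAtFixedRate(() -> {
            for (int i = 0; i < batchSize && started.get() < targetUsers; i++) {
                new Thread(new VirtualUser()).start();
                started.incrementAndGet();
            }
            if (started.get() >= targetUsers) {
                scheduler.shutdown(); // ramp-up complete; users keep running
            }
        }, 0, 1, TimeUnit.MINUTES);
    }
}

// Hypothetical virtual user: loops through a recorded test script.
class VirtualUser implements Runnable {
    @Override
    public void run() { /* replay the recorded script here */ }
}
```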

Figure 8.2 shows a sample LoadRunner controller screen. In this example, we configured LoadRunner to run a "Ramp Up" schedule, adding 10 virtual users every minute until reaching a total of 100 users. (LoadRunner allows customization of these ramp-up steps and durations.) Of course, if your load driver tool does not provide this kind of user management flexibility, consider simulating ramp-up or spike patterns by using multiple load driver instances. For example, simulate a "spike" against a financial site by starting another test system late in the test to generate additional load. However, this approach requires manual collation of the data returned from the test tool instances. Do not underestimate the difficulty of this task.

Figure 8.2. Example LoadRunner ramp-up. © 2002 Mercury Interactive Corporation.


Browser Simulation and Caching

During your comparison shopping, consider whether the tools correctly simulate typical browser behaviors, including the ability to clear cached items between script iterations. Some tools use clever strategies to simulate browser caching. For example, SilkPerformer does not actually maintain a browser cache (just think of how much disk space a cache for each simulated user might require), but rather remembers each user's "cached" items by name and does not request them again.
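A minimal sketch of this by-name strategy, assuming a placeholder fetch method in place of the tool's actual HTTP engine:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: simulate a browser cache by remembering URLs rather than content.
class SimulatedBrowserCache {
    private final Set<String> cachedUrls = new HashSet<>();

    // Issue the request only if this "browser" has not already cached the item.
    void request(String url) {
        if (cachedUrls.add(url)) {
            fetch(url); // placeholder for the actual HTTP GET
        }
        // else: cache hit -- a real browser would not touch the network
    }

    // Clearing the set between iterations simulates an empty cache (new user).
    void clearCache() {
        cachedUrls.clear();
    }

    private void fetch(String url) { /* HTTP request goes here */ }
}
```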

Usually, high-end tools provide several cache clearing options for different simulated users. If your web site focuses on support for a specific browser, check for the proper simulation support for this browser.

Network Address Management

If your web site uses load balancing, look for a tool compatible with your load balancer. As discussed in Chapter 3, some load balancing solutions rely on the client's IP address. During load tests, a single load generator may simulate hundreds of different clients. If each of these clients receives the same IP address, you defeat the load balancer's routing scheme, and all your traffic goes to the same server machine within your test cluster.

To work around this problem, some test teams use multiple test client machines, each with a unique IP address. Also, many performance tools manage simulated users across multiple client machines automatically and provide consolidated reporting in this configuration. However, other tools do not, leaving you to manually consolidate reports from each client machine.

If you test IP address load balancing for your site, give careful consideration to purchasing a load driver with IP address simulation capabilities or with the ability to spread simulated users over multiple client machines.
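For illustration, IP address simulation boils down to binding each virtual user's outgoing connections to a different local address already aliased onto the driver machine's network interface. Here is a sketch using the standard java.net.Socket constructor that accepts a local address; the aliasing itself happens at the operating system level:

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;

// Sketch: give each virtual user its own source IP so that IP-based load
// balancing spreads the simulated traffic across the test cluster.
public class PerUserSourceAddress {
    static Socket connect(String serverHost, int serverPort, String localIp)
            throws IOException {
        InetAddress localAddress = InetAddress.getByName(localIp);
        // Local port 0 lets the operating system pick an ephemeral port.
        return new Socket(serverHost, serverPort, localAddress, 0);
    }
}
```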

Cookie Support

If your web application uses cookies to associate users with HTTP session data, select a test tool that handles cookies for each simulated user. [2] Also, make sure the tool allows you to dump the user's cookie cache after each script iteration. (This allows you to simulate a new user for each iteration.)

[2] See Stephan Asboeck, Load Testing for eConfidence .

The current editions of most test tools support cookies in some fashion. Make sure the support meets your needs and requires little or no customization on your part to function properly.
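As an illustration of the behavior to look for, the sketch below uses the JDK's own CookieManager: one cookie jar per simulated user, replaced between iterations to simulate a fresh user. A real test tool provides this internally.

```java
import java.net.CookieManager;
import java.net.CookiePolicy;

// Sketch: per-user cookie handling with a dump between script iterations.
class SimulatedUserSession {
    private CookieManager cookieJar = new CookieManager(null, CookiePolicy.ACCEPT_ALL);

    void runScriptIteration() {
        // ... replay the recorded requests, routing them through cookieJar
        // so the server's session cookie sticks to this user only ...
    }

    // Dumping the cookie cache makes the next iteration look like a brand-new
    // user: no session cookie, so the server creates a new HTTP session.
    void becomeNewUser() {
        cookieJar = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
    }
}
```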

Hardware and Platform Support

Don't overlook the hardware requirements for your planned simulated user load. Many tools require lots of hardware capacity, especially memory. As a rule of thumb, plan a comparable client driver to match every system under test (SUT) in your environment. For example, if your test cluster contains two four-way application server machines, plan to include two four-way performance test client machines in your configuration. Of course, use this as a general rule, but confirm these estimates with the actual recommendations published by your performance test tool vendor. Also, keep in mind that these guidelines vary depending on the complexity of the test, the data returned, the degree of error checking performed by the test tool, and other factors. Ask your tool provider for assistance with hardware recommendations.

Surprisingly, simple tests with high transaction throughput often require more client hardware capacity than more complex tests (tests generating more server processing per request). A workload of primarily static content generates a higher throughput rate than a complex, transaction-oriented workload, and that higher rate puts more stress on the test client machines. For example, a mostly static workload may generate 2,000 transactions per second, while a complex, transaction-oriented workload running on the same hardware may generate only 100 transactions per second. The same client simulator works harder handling 2,000 transactions per second than it does handling only 100.

For this reason, the client hardware requirements often increase if you test less complex web site logic. For example, if you test an e-Commerce site with lots of static element caching, you may need more client hardware than when testing a B2B site with the same simulated user load. The rapid request and return rate of the static data from the e-Commerce site stands to overwhelm the test client machines.

Also, as you shop for a performance tool, consider the hardware platforms it supports. Performance test tools typically execute against web sites on any hardware platform, just as a browser interacts with web sites deployed on multiple hardware platforms. For example, if your web site uses a UNIX-based platform, you don't necessarily need to limit your performance tool to this platform as well. Most likely, a performance tool running on a Microsoft Windows-based system tests your web site just as well as one running on a UNIX platform.

Many load drivers use test controllers and agent systems to simulate the user load. Vendors often support only a Microsoft Windows-based test controller while supporting the agent simulator on many different platforms, including Microsoft Windows and UNIX systems. For example, Mercury LoadRunner requires a Microsoft Windows platform for its controller, but the controller manages test agents running on Microsoft Windows or several UNIX variants. Keep these important platform requirements in mind as you plan the hardware and skills required for your load tests.

Scripts

As we mentioned earlier, your test scripts represent a significant investment. When you select a test tool, consider how the tool supports script development, customization, and maintenance as part of your selection criteria. Remember, no standard exists for test scripts, so scripts do not port between different performance test tools. Once you make an investment in recording and customizing test scripts, switching test tools becomes difficult.

The key differentiators between tools' script support include

  • Dynamic data support

    • Parameter substitution

    • Dynamic web page parsing

  • Scripting language (learning curve and skill levels required for use)

  • Ability to build a scenario from individual scripts and to specify their weighting

  • Ability to reuse scripts in production monitoring

Dynamic Data Support

Going back to one of our Chapter 7 discussions, well-written test scripts simulate a typical user's selection process. The script also incorporates enough variation and dynamic data selection to simulate a breadth of interaction consistent with large user volumes. Repetitive scripts (for example, a script driving a single, unchanging URL request over and over again) often miss out on the most important aspects of performance testing. For example, multiple users reading and writing different records from the same tables sometimes cause database contention. Repetitive scripts never generate this level of contention, so they never reveal this potentially serious problem during the test. Also, realistic scripts properly exercise the various caches throughout the system.

Look at the tool's capability to add parameters based on random numbers or file input. Low-end script drivers typically drive the same sequence of URLs over and over again, and frequently lack the support required for highly dynamic, HTTP session driven sites. Dynamic data handling marks one of the key differentiators between low-end and high-end performance test tools. A free tool such as Apache Bench excels at driving a sequence of URLs, is easy to use, and requires minimal hardware. Many "shareware" load drivers exhibit similar characteristics. However, the free tools quickly lose their appeal when seriously testing dynamic applications. For example, if your site supports a user login, look for a load driver that simulates different user IDs.
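To show what parameter substitution does behind the scenes, here is a minimal sketch; the data file name and URL template are hypothetical stand-ins for your site's values:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Sketch: substitute a randomly chosen user ID from a data file into a
// recorded URL, so each virtual user logs in as someone different.
public class ParameterSubstitution {
    public static void main(String[] args) throws IOException {
        List<String> userIds = Files.readAllLines(Path.of("userids.txt"));
        String recordedUrl = "https://test.example.com/login?uid={USERID}";

        String uid = userIds.get(ThreadLocalRandom.current().nextInt(userIds.size()));
        System.out.println(recordedUrl.replace("{USERID}", uid));
    }
}
```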

If you plan to use a low-end driver, consider developing custom application logic for better dynamic user simulation. IBM uses this approach with the WebSphere Performance Benchmark Sample (Trade2). IBM developed the TradeScenarioServlet to front-end the actual application processing. This scenario servlet actually performs the random user generation and dynamically generates the workload mix. This approach requires additional programming resources to develop the equivalent of the scenario servlet, but simplifies the load generation tool requirements. (Also, this approach requires server-side resources to drive the test load.) The load generation tool needs to handle cookies, but does not need sophisticated dynamic data handling capabilities. IBM took this approach in order to easily share the benchmark with customers using different load generators.

Of course, high-end performance tools support sophisticated dynamic data handling capabilities. The high-end tools perform the same function as the scenario servlet within their client-side processing. In general, we recommend finding a performance test tool that handles your dynamic data requirements as opposed to writing your own. Developing test scripts with a reputable performance test tool usually requires less resource investment in the long run and allows you to more easily manage and change the test scripts over time.

Most high-end load driver tools provide the capabilities required to script web applications like Pet Store or Trade2. However, the scripts normally require customization in order to adequately test the dynamic capabilities. During your tool selection process, we recommend actually prototyping scripts with the different tools under consideration to determine if they support your needs.

Parsing Dynamic Web Pages

Dynamic web pages often vary in content and length depending on the request parameters. For example, just like Pet Store, many web sites include a search function. The search function returns a dynamically generated page (usually of web links) based on the search value submitted by the users. Every search potentially generates a different results page with a different number of results links. These pages often require customized scripts to process and select from the dynamic links returned. In fact, in Chapter 7 we showed you two examples of dynamic link processing using both SilkPerformer and LoadRunner.
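In outline, dynamic link processing amounts to parsing the returned results page, collecting the links, and selecting one at random for the next request. Here is a simplified sketch; real tools use their own page parsers rather than a regular expression:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: pull href values out of a dynamically generated results page and
// choose one at random, as a script following a search result would.
public class DynamicLinkPicker {
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]+)\"");

    static String pickRandomLink(String html) {
        List<String> links = new ArrayList<>();
        Matcher matcher = HREF.matcher(html);
        while (matcher.find()) {
            links.add(matcher.group(1));
        }
        if (links.isEmpty()) {
            return null; // empty result pages happen; the script must cope
        }
        return links.get(ThreadLocalRandom.current().nextInt(links.size()));
    }
}
```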

If you require dynamic link processing, confirm the performance tools you're considering support this feature. In our experience, each vendor provides a range of capabilities in this area, and often the support involves complex and highly skilled script programming. Regardless of the tool you select, take a class or use consulting resources to assist in the initial development of scripts using this function. (In fact, to produce the scripts for our examples, we received help from both Mercury Interactive and Segue to exploit their advanced features.)

Scripting Language and Skills

In addition to script recording, most performance tools offer test script customization support. However, the level of skill required to accomplish the customization varies among the tool vendors. Some tools support simple application customization using the tool's GUIs; more sophisticated application scripting frequently requires custom script programming. The scripting language used by the tools varies from proprietary scripting languages to JavaScript or C.

As we mentioned earlier, no test script standards exist. This leaves the tool vendors free to select or build any scripting language for their scripts. Before you purchase a performance test tool, understand the effort and skills required to create and maintain test scripts.

Building and Weighting Scenarios

As we discussed in Chapter 7, during runtime your test scripts must represent the actual production usage patterns of your web site. For an e-Commerce web site, the workload mix usually consists of far more browsing users than users making purchases. We reflect this same workload mix when we test the e-Commerce site. Simulating different workload mixes is important, especially for larger web sites. Low-end tools typically execute a single workload for all users. Most high-end tools allow you to assign different scripts to different users. For example, if you run a 1,000-user test, use these tools to specify 300 of the users to run a search scenario, 650 to run a browse scenario, and 50 to run a purchase scenario. Figure 8.3 gives an example of defining a weighted test run using SilkPerformer.

Figure 8.3. Example of SilkPerformer V workload configuration. © 2002 Segue Software, Inc.


Many of the test tools also provide capabilities to build more complex scripts from smaller, individual scripts. For example, some tools allow you to define a series of scripts to log on, perform a random (or specified) number of browses, and purchase a random selection of items. The ability to isolate function into separate scripts makes it easier to optimize critical paths. For an e-Commerce site, the buy path may constitute a small percentage of the overall workload, but the test plan usually specifies an aggressive response time target for this path. Isolating the buy path to a small, atomic script gives you better control over its execution in the overall workload mix, and, depending on the tool, tighter monitoring of its performance.
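As a sketch of what scenario weighting amounts to, the following assigns each virtual user a scenario according to the example mix above (30% search, 65% browse, 5% purchase):

```java
import java.util.concurrent.ThreadLocalRandom;

// Sketch: weight scenario assignment to match the production workload mix.
public class WeightedScenarioMix {
    enum Scenario { SEARCH, BROWSE, PURCHASE }

    static Scenario assign() {
        int roll = ThreadLocalRandom.current().nextInt(100);
        if (roll < 30) return Scenario.SEARCH;  // 30% of users search
        if (roll < 95) return Scenario.BROWSE;  // 65% browse
        return Scenario.PURCHASE;               // 5% buy
    }
}
```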

Reusing Test Scripts in Production

Frequently, the same scripts used to performance test your web site also come in handy for monitoring and troubleshooting your production site. Periodically, you might run the Search script against your production site under limited load to check the search path's production performance. Several test tool vendors now offer production monitoring tools that use the same scripts developed for load testing. See Appendix C for more information on production monitoring offerings.

Automation and Centralized Control

Consider how well the test tool manages the test execution and results collection. How much of this process does the tool automate, and how much does it require you to do manually?

The key differentiators for automated and central control include

  • Controlling multiple client drivers from a central location or machine

  • Consolidating results from multiple clients into a single report

  • Command line or remote test control

While these differentiators do not actually influence your test results, they make a big difference in the effort required to run your tests. If your test tool requires lots of manual intervention to execute and collect results, the added manual overhead reduces the number of tests you execute every day. The more work you do, the longer it takes to complete the performance tests. Small performance tests (defined by a small test environment, few test scripts, and few simulated users) generally do not require highly automated test runtime management.

Controlling Multiple Client Drivers

If you need multiple test client machines to simulate the user load, consider test tools with support for managing the simulated users distributed over these machines. Figure 8.4 shows a test environment using four client machines to generate the test load. The load generators send requests on an isolated network to the systems under test. The performance test tool provides a central controller to manage the four client machines. The communications between the controller and the load generators traverse a separate network from the actual performance test traffic so as not to interfere with test performance.

Figure 8.4. Testing environment


Typically, a controller assigns a set of test scripts to a set of virtual users, starts these virtual users on distributed client machines, and monitors the users as the test executes. After the test completes, the controller consolidates the data from all the simulated users. Centralized control is common in the high-end load drivers; most low-end tools do not provide this function. If one test machine suffices for your testing in the foreseeable future, this function is not a requirement. However, most large web sites require multiple test clients and a test controller to manage their test runs.

Running Tests via Command Line or Remotely

Some organizations run their performance tests from remote locations. Often the performance tests run from labs miles away from the actual test servers. Also, many organizations write automated test control scripts using the test tool's command line interface to control remote or off-shift test runs. If you need these capabilities, check carefully with the performance tool vendor. Many high-end test tools support remote execution but do not provide a command line interface. (High-end tools use sophisticated GUIs to support their functionality. These functions do not translate easily into a command line interface.)

Conversely, many of the low-end tools provide command line access. However, make sure the tool provides all the functions you need to support the test.

Pricing and Licensing

The price for load driver tools ranges from no cost (such as for the Apache tool) to hundreds of thousands of dollars. You may be in for a shock at the price demanded for high-end load generation tools. A July 2001 [3] study comparing prices for tools from Quest, RadView, Segue, Mercury Interactive, Empirix, and Compuware found entry-level prices for 50 or 100 virtual users ranging from $7,500 to $33,075, plus additional costs to simulate more users.

[3] "Web Testing Tools Pricing Comparison for the Big Six Suppliers." Retrieved December 27, 2001, from <http://www.red-gate.com/web_testing_tool_prices.htm>.

The base costs and licensing terms quickly become significant considerations when evaluating test tools.

Key differentiators for pricing and licensing include

  • Licensing costs (flat fees or based on per simulated user)

  • Group sharing of users across multiple controller machines

  • Short-term client licenses

  • Existing vendor relationship

Licensing Costs

Many of the high-end load driver products charge a flat fee for each controller installed, which includes a small number of virtual users. Increasing the number of virtual users beyond the base amount usually generates additional "per virtual user" licensing fees. Large tests with many virtual users often require very expensive licenses.

In pricing tools, consider the number of users you need to simulate and the cost of simulating these users with the various available tools. Some vendors also offer "unlimited user" licenses. While typically quite expensive, these unlimited licenses may be a good solution for some organizations.

Ability to Share Users across Multiple Controller Machines

Some tools allow you to buy a large number of users and share them as needed across a set of controllers on a "check-out/check-in" basis. Consider this flexible licensing option if your organization supports many test environments and test teams. Of course, for smaller organizations, this licensing option may provide little benefit.

Short-Term Licenses

Several vendors offer short-term licenses (sometimes known as "per day" licenses). These licenses allow you to pay for more virtual users only as you need them over a limited period of time. Early in your testing, you need only a few virtual users to begin tuning a subset of your test environment, and you may cut the think time in your test scripts to generate request pressure with fewer virtual users. Later, however, you may require significantly more users for a scalability test. Also, some organizations insist on a full user test using realistic think times before approving the web site for production release. During these times, you require more virtual users.

Short-term licenses allow you to purchase virtual users for exactly the time you need them. Often this strategy proves more cost effective than buying licenses for virtual users you seldom use. Just be sure to adequately test the environment and scripts beforehand, so as not to waste the short-term license in debugging and setup issues.

Existing Vendor Relationships

Before you purchase a performance test tool, check with your company's software acquisition group. Some organizations enter into agreements with performance tool vendors to provide tools or virtual users at a negotiated cost. Using these agreements usually saves money over purchasing the tool through regular channels.

Also, avoid conflicts with products already approved by your organization. For example, if your company uses functional test or production monitoring tools from another vendor, make sure your performance test tool selection works with these other tools.


