Let’s be honest: performance tuning doesn’t sound like fun. It’s probably the last thing on your list of things to do when developing an application—if it’s even on the list at all. Many developers ignore performance work until they are faced with a critical problem. Maybe this is because so many things contribute to the overall performance of an application that it’s difficult to know where to begin. Or maybe developers have difficulty knowing when they have reached the end of the performance work.
In this chapter, we explore how to analyze the performance of an ASP.NET application, discuss what steps can be taken to identify and solve performance issues, and walk through some development best practices that can help you avoid bottlenecks. Performance tuning, in which you analyze and measure the behavior of the application as it is being developed, can be interesting, challenging, and very rewarding: you get to see the results of your work directly, provided you don't get caught up in minor details. Look first for the biggest performance issues, where your work will pay off the most. This chapter will help you sort out where you can get those payoffs.
Performance tuning is an iterative process. The first step is to analyze the performance of the application. You can’t know where to focus your efforts until the behavior of the application is understood. Also, you can’t accurately gauge the impact of your changes without being able to reproduce the measurements again and again.
Once you gather performance data about an application, you might be tempted to implement multiple changes simultaneously. However, you might find that when you implement a large set of changes and then repeat the performance analysis, you actually see less improvement than you originally expected. Even if you do get significant improvements, you won’t know which changes contributed to them. The iterative process is important because it enables you to understand the impact of each change and validate each change separately. A bad change made at the same time as a good one won’t be recognized if both changes are measured together.
You can measure the performance of an ASP.NET application using three primary means: performance counters, profiling, and throughput measurements. These approaches generate solid and reproducible data, so we can measure the behavior of an application through multiple iterations. For all of them, the basic requirements are the same: you must load the application with requests and measure the results.
You must have a controlled test environment so that changes in performance measurements reflect the changes you made to the application, not other demands being placed on the server during your testing. You don’t want to spend hours chasing down a dip in performance, only to find that the test machine is getting abnormally high file-sharing traffic from peers on the network.
First, establish a set of hardware dedicated to the task. The ideal is to test on hardware identical to the production systems, but that isn't strictly necessary as long as the architecture is the same. For example, if the production environment consists of front-end Web servers that utilize a back-end database, your test environment should, too. When the hardware differs, there is room for error in extrapolating the capabilities of the deployment environment from numbers gathered in the test environment. Do not use a lack of duplicate hardware as an excuse for not testing, but understand that the results will not be exact when processor speeds and quantities of memory differ.
The hardware dedicated to performance testing should include enough machines acting as clients to generate a significant amount of load. For Web applications that face the Internet instead of internal corporate traffic, make sure you include a variety of user agents and some random requests. The volume of garbage requests that bombards a publicly accessible server can be staggering. All popular load-generation tools include options for specifying various user-agent strings.
Tip | When simulating real-world load, use various pseudo-random user-agent strings and completely random requests with large query strings and quantities of posted data. Live servers must endure this type of load, so accurate analysis should include malformed requests along with the valid ones. |
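The following C# fragment is a minimal sketch of the idea, not a replacement for a real load tool: it issues GET requests with pseudo-random user-agent strings and randomized query strings. The URL and the agent list are placeholders for your own test environment.

```csharp
// A minimal load-generation sketch: repeated GETs with pseudo-random
// User-Agent strings. The target URL is a placeholder for your own
// dedicated test server.
using System;
using System.IO;
using System.Net;

class UserAgentLoadSketch
{
    static readonly string[] UserAgents = {
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
        "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)",
        "Mozilla/5.0 (X11; U; Linux i686)",
        "UnknownCrawler/1.0"   // deliberately odd agent, like real-world junk traffic
    };

    static void Main()
    {
        Random random = new Random();
        for (int i = 0; i < 100; i++)
        {
            // Vary the query string so requests are not all served
            // from a single cache entry.
            string url = "http://testserver/app/page.aspx?rnd=" + random.Next();
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            request.UserAgent = UserAgents[random.Next(UserAgents.Length)];

            using (WebResponse response = request.GetResponse())
            using (StreamReader reader = new StreamReader(response.GetResponseStream()))
            {
                reader.ReadToEnd();   // drain the response so the connection is reusable
            }
        }
    }
}
```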
The raw throughput number is the key metric used to gauge Web application performance. This metric is the average number of requests per second that the Web server can manage effectively. The other number that is central when discussing throughput is the response time: how long does it take for the request issued by the browser to return results? To really understand response time, you typically divide the measurement into two separate measurements: time-to-first-byte and time-to-last-byte. Time-to-first-byte is calculated as the number of milliseconds that elapse from the moment the client browser makes a request to the moment the first byte of the response is received. Add to that number the time that passes between the first byte and the last byte of the received response and you have time-to-last-byte. Figure 9-1 shows the round trip between client and server.
Figure 9-1: Round trip between client and server
You might assume that the time between the first byte and the last byte is very short. Normally it will be, so large time-to-first-byte or time-to-last-byte times should be investigated. When output is buffered, the time between the first byte and the last byte is a rough measure of output size. This timing can also be affected by network bottlenecks. Some performance measuring tools allow you to specify that a selection of clients behave as though they are on slow links to better simulate the real-world mix of connection speeds. A long time-to-first-byte measurement can indicate that the page is performing extensive processing, is blocking on back-end resources, or that the server is overloaded and undergoing excessive context switching.
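If you want to see roughly how these two numbers are derived, the following C# sketch times a single request by hand. The URL is a placeholder, and real load tools report these measurements averaged over many thousands of requests.

```csharp
// A rough sketch of measuring time-to-first-byte and time-to-last-byte
// for one request. The first successful read from the response stream
// approximates the first byte; draining the stream gives the last byte.
using System;
using System.IO;
using System.Net;

class ResponseTimingSketch
{
    static void Main()
    {
        DateTime start = DateTime.UtcNow;
        HttpWebRequest request =
            (HttpWebRequest)WebRequest.Create("http://testserver/app/page.aspx");

        using (WebResponse response = request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        {
            byte[] buffer = new byte[4096];

            // First read: approximates time-to-first-byte.
            int bytesRead = stream.Read(buffer, 0, buffer.Length);
            TimeSpan firstByte = DateTime.UtcNow - start;

            // Drain the rest of the response for time-to-last-byte.
            while (bytesRead > 0)
            {
                bytesRead = stream.Read(buffer, 0, buffer.Length);
            }
            TimeSpan lastByte = DateTime.UtcNow - start;

            Console.WriteLine("TTFB: {0} ms, TTLB: {1} ms",
                firstByte.TotalMilliseconds, lastByte.TotalMilliseconds);
        }
    }
}
```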
Note | The time-to-first-byte measurement translates directly into how the user perceives the performance of the Web application. Even when throughput numbers look adequate, the user might feel that the response time is inadequate. The user is not concerned with how many requests the server can manage; she just notices how fast her request is handled. |
To measure the way an application performs under load, you obviously must subject it to real-world activity. We can't very well enlist hundreds of friends and co-workers to help out, so we use load-generation tools. The cost of these software packages varies significantly. The free tools lack some of the features available in the more expensive retail performance analysis software, but we won't take the time here to dissect the various features of each tool. For many applications, a free tool like the Microsoft Web Application Stress Tool will suffice. This tool is available at http://www.microsoft.com/technet/treeview/default.asp?url=/technet/itsolutions/intranet/downloads/webstres.asp. The Web Application Stress Tool is particularly suited to static and relatively static sites, where client-side manipulation of variables for subsequent pages is not central to the application. After the Web Application Stress Tool was released, Microsoft incorporated load generation directly into the Application Center Test program, which better handles recording user actions and managing ViewState round trips between client and server.
Tip | Become proficient at using the load generating tool you choose. This might seem like obvious advice, but we can’t emphasize it enough. Almost all test packages include various parameters that can be customized to vary the mix of client connection speeds, SSL connections, user-agents, and security credentials. To effectively utilize the tool and thus measure Web application performance accurately, you must understand how to vary the settings in accordance with your application requirements. |
The better load-generation tools can gather performance counters and log data to give an accurate report of how the server and application behaved for a given set of requests. The scripts used to generate the load can be replayed, which is very useful for maintaining a controlled test environment after you make adjustments to the application.
When you are finished with a session of throughput measurements, you should have a set of numbers with average time-to-first-byte and time-to-last-byte measurements for all pages of your application. Does that include those rarely used error pages or privacy policy description pages? Yes. After all, a single poorly behaving page, however infrequently it is requested, can seriously affect overall application performance. A user won't care that filling the shopping basket is easy if performance issues during checkout prevent him from completing the transaction. Obviously, the focus of your performance tuning will be on the mainstream process and the most frequently requested pages, but the data for the other pages should be collected so that you don't have any surprises later on.
For the most part, Microsoft Windows performance counters operate in one of two ways: they report a total count of a particular piece of data over time, or they keep running averages of samples provided by the application or service. If an accumulating counter returns to zero and starts climbing again, the associated service might have failed and restarted. If an averaging counter deviates significantly from its typical values, even briefly, some resource might have become constrained. These anomalies must be investigated because they can be evidence of stress in the application and can translate into poor user experiences.
The Microsoft common language runtime (CLR) includes a feature referred to as side-by-side support, which allows multiple versions of an application to run on the same machine using different versions of the same library. This feature is surfaced in the performance counters. Performance counter objects are supplied by the operating system and by individual processes as well as by the common language runtime. Each performance object provides a set of counters that can include a list of object instances as well as a total counter that aggregates all such instances. Both ASP.NET and CLR performance counters include object names that correspond to the most recently installed version and do not state a version explicitly. Also present are full names that include specific version numbers. In most scenarios you will focus on the latest version, but you might find that you need to work in a side-by-side environment, so be aware of this distinction in the names. In the following sections, we describe some of the key counters to collect when examining the performance of an ASP.NET application.
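To see both naming styles on a given machine, you can enumerate the installed counter categories. The following sketch simply prints the ASP.NET-related category names; the exact version suffixes it prints depend on what is installed.

```csharp
// List the ASP.NET-related performance counter categories, which should
// include both unversioned names (such as "ASP.NET Applications") and
// version-specific counterparts.
using System;
using System.Diagnostics;

class CategoryListSketch
{
    static void Main()
    {
        PerformanceCounterCategory[] categories =
            PerformanceCounterCategory.GetCategories();
        foreach (PerformanceCounterCategory category in categories)
        {
            if (category.CategoryName.StartsWith("ASP.NET"))
            {
                Console.WriteLine(category.CategoryName);
            }
        }
    }
}
```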
In the Performance console, the Processor performance object provides the % Processor Time counter, which indicates what percentage of CPU time is spent actively processing data. (For information on using the Performance console, search for Performance in Help and Support.) In the normal operations of a desktop PC, the CPU is used a relatively small percentage of the time. When you use tools to put stress load on a Web server, CPU utilization can easily be driven to 100 percent. If requests are queuing while CPU utilization remains low, there might be contention for a back-end or operating system resource.
The Requests/Sec counter, available under the ASP.NET Applications performance object in the Performance console, indicates how many ASP.NET requests per second the Web server is handling. Generate enough load to bring the requests-per-second counter to a fairly steady state; then gradually add more load to learn where the CPU or some other resource becomes a bottleneck as the number of requests serviced per second increases.
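Load tools typically collect these counters for you, but you can also sample them programmatically. The sketch below reads Requests/Sec alongside % Processor Time; note that rate and percentage counters need a priming read before NextValue returns meaningful numbers.

```csharp
// Sample ASP.NET and CPU counters once per second while a load test runs.
// "__Total__" aggregates all ASP.NET application instances; "_Total"
// aggregates all processors.
using System;
using System.Diagnostics;
using System.Threading;

class CounterSampleSketch
{
    static void Main()
    {
        PerformanceCounter requests = new PerformanceCounter(
            "ASP.NET Applications", "Requests/Sec", "__Total__");
        PerformanceCounter cpu = new PerformanceCounter(
            "Processor", "% Processor Time", "_Total");

        // Priming reads: rate counters are computed from two samples.
        requests.NextValue();
        cpu.NextValue();

        for (int i = 0; i < 30; i++)
        {
            Thread.Sleep(1000);
            Console.WriteLine("Requests/Sec: {0:F1}  CPU: {1:F1}%",
                requests.NextValue(), cpu.NextValue());
        }
    }
}
```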
The Errors Total counter, also available under the ASP.NET Applications performance object in the Performance console, tracks the total error count. Check it to ensure that the application is behaving as expected and that stress scripts are generating load correctly. It’s amazing how fast some error paths execute when compared with fully functioning pages. A change to configuration or stress scripts can leave you inadvertently measuring how fast the Web server can return a cached error message, which is probably not what you set out to count.
The # Of Exceps Thrown counter keeps a running total of exceptions thrown. This counter is available under the .NET CLR Exceptions performance object in the Performance console. Some code within ASP.NET includes exception-handling logic, and the application logic might include some exceptional, but not unheard of, conditions as well. However, if the number is growing rapidly, you should understand why. Perhaps exceptions are being used too heavily within the code, or redirections and transfers (which are handled by the runtime with exceptions) are being used excessively, as the sketch below illustrates.
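Response.Redirect ends the response by throwing a ThreadAbortException unless you use the overload that leaves the response running. The fragment below is a sketch assumed to live inside a Page-derived class; UserIsAuthorized is a hypothetical stand-in for real authorization logic.

```csharp
// Avoiding the ThreadAbortException that Response.Redirect throws by
// default, which otherwise inflates the # of Exceps Thrown counter.
using System;
using System.Web.UI;

public class CheckoutPage : Page
{
    protected override void OnLoad(EventArgs e)
    {
        base.OnLoad(e);

        if (!UserIsAuthorized())
        {
            // Passing false for endResponse skips the Response.End call,
            // so no ThreadAbortException is thrown for this redirect.
            Response.Redirect("login.aspx", false);
            return;   // the page must skip its remaining work itself
        }

        // ... normal page processing ...
    }

    private bool UserIsAuthorized()
    {
        return false;   // hypothetical; replace with real logic
    }
}
```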
In the ideal world, the Application Restarts counter, under ASP.NET in the Performance console, remains at zero, but remember that modifications to configuration files, compilations exceeding the limit set by the numRecompilesBeforeAppRestart attribute of the compilation element in configuration data, and modifications to the bin directory all cause restarts. An application restart can have a short-term effect on performance that is easily visible to the user.
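As a sketch of the relevant configuration, the fragment below raises the recompilation limit in web.config. The default is 15; the value shown here is only an example, and the right number depends on how often pages on your site change.

```xml
<!-- Hypothetical web.config fragment: allow more page recompilations
     before ASP.NET restarts the application. -->
<configuration>
  <system.web>
    <compilation numRecompilesBeforeAppRestart="50" />
  </system.web>
</configuration>
```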