Page Caching Using the OutputCache Directive

The most familiar programming target in ASP.NET is the page, such as default.aspx. The page is the usual target of a request, such as an HTTP GET for http://www.asp.net/default.aspx, and is responsible for generating the response. Internally, the page executes logic and writes output, such as HTML, WML markup, or XML, to a series of memory buffers. By default, these memory buffers, also called response buffers, are flushed when the page completes execution, but this behavior is configurable. Output buffering can be controlled at the page level using <%@ Page Buffer="[true/false]" %> or at the application level by setting <pages buffer="[true/false]" /> in a configuration file. By default, buffering is set to true.

The page output caching feature of ASP.NET allows the contents of the response buffers to be copied to memory before being sent to the client. When output caching is enabled, that in-memory copy is kept, and the page is said to be output cached. On subsequent requests, rather than executing the page to fulfill the request, ASP.NET can write the cached contents directly to the output stream; page buffering settings do not affect the output cache. This process is illustrated in Figure 6-1.

Figure 6-1: HttpRuntime request and response

Let’s review in detail what is happening in Figure 6-1. An HTTP request is made for an ASP.NET page (1), for example, default.aspx. The request is handled by the ASP.NET HttpRuntime. ASP.NET determines whether the request can be satisfied from the output cache: (2a) either the request cannot be satisfied from the output cache; or (2b) the request can be satisfied from the output cache, and the contents from the output cache are written directly back to the response stream. ASP.NET also determines whether the page being requested is already parsed and compiled (3a and 3b).

An instance of the requested page is created, and the Render method is called (4) for the page to render its contents. ASP.NET determines whether the contents of the rendered page can be stored in the output cache: if they cannot, the response is written back to the response stream (5a); if they can, the response is stored in the output cache and written back to the response stream (5b).
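
The flow in Figure 6-1 can be sketched in a few lines of plain C#. This is only a simplified model and not the HttpRuntime implementation: the class name OutputCacheFlowSketch, the renderPage delegate, and the cacheable flag are assumptions made for illustration, and expiration and compilation (steps 3a and 3b) are left out.

```csharp
using System;
using System.Collections.Generic;

// Simplified model of the request flow in Figure 6-1 (hypothetical names).
public sealed class OutputCacheFlowSketch
{
    // Stands in for the output cache: URL -> rendered response.
    private readonly Dictionary<string, string> cache =
        new Dictionary<string, string>();

    // Counts how often the page actually executed.
    public int RenderCount { get; private set; }

    public string HandleRequest(string url, Func<string> renderPage, bool cacheable)
    {
        string cached;
        if (cache.TryGetValue(url, out cached))
        {
            // (2b) Served from the cache; the page never executes.
            return cached;
        }

        // (2a) Cache miss: (4) execute the page and render its output.
        RenderCount++;
        string output = renderPage();

        // (5b) Store the response only if the page asked to be cached.
        if (cacheable)
        {
            cache[url] = output;
        }

        // (5a)/(5b) Either way, the response goes back to the client.
        return output;
    }
}
```

Calling HandleRequest twice for the same URL with cacheable set to true executes the delegate only once; the second call returns the stored string, which is the behavior that makes cached responses comparable to static HTML.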

The ability for ASP.NET to write directly to the response stream from memory means that responses served from the cache are incredibly fast—in some ways, this is equivalent to sending static HTML.

To put this scenario in perspective, consider a page in the ASP.NET forums (http://www.asp.net/Forums/) that calls a common but costly stored procedure, with the results simply bound to a DataGrid: it serves about 60 requests per second. With page output caching enabled, the number of requests served jumps to approximately 480 per second, about 8 times the throughput.

The OutputCache Directive

Page output caching follows a common pattern found in the .NET Framework: programming model factoring. A page can be instructed to output cache itself using either the OutputCache page directive or the APIs found on Response.Cache. The Response.Cache APIs are used to programmatically manage page output caching. For example, the following page OutputCache directive:

<%@ OutputCache Duration="60" VaryByParam="none" %>

is equivalent to the page output Cache API:

public void Page_Load(Object sender, EventArgs e) {
    Response.Cache.SetExpires(DateTime.Now.AddSeconds(60));
    Response.Cache.SetCacheability(HttpCacheability.Public);
    Response.Cache.SetValidUntilExpires(true); 
}

Following is an HTTP exchange for a page that does not use page output caching (the examples include HTTP headers only). These are the HTTP request headers:

GET /test.aspx HTTP/1.1
Host: rhoward-laptop
Accept: */*

These are the HTTP response headers:

HTTP/1.1 200 OK
Server: Microsoft-IIS/5.1
Date: Thu, 17 Apr 2003 15:49:38 GMT
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 865

The same request with page output caching enabled yields much different results. These are the HTTP response headers:

HTTP/1.1 200 OK
Server: Microsoft-IIS/5.1
Date: Thu, 17 Apr 2003 15:53:06 GMT
Cache-Control: public
Expires: Thu, 17 Apr 2003 15:55:05 GMT
Last-Modified: Thu, 17 Apr 2003 15:53:05 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 865

Note that the Cache-Control header changed from private to public, and that Expires and Last-Modified headers were added. In addition to caching the page in memory on the server, ASP.NET sends the appropriate HTTP cache headers when it output caches a page. Table 6-2 describes these headers in more detail. You can read more about HTTP in Hypertext Transfer Protocol–HTTP/1.1 (RFC 2616), available at http://www.ietf.org/rfc/rfc2616.txt.

Table 6-2: HTTP Cache Headers
HTTP Header	Description
Cache-Control	Specifies how the servers that take part in returning the requested document to the browser may participate in caching. The Location attribute in the OutputCache directive of ASP.NET is used to control this header. (This attribute is discussed later in the chapter.)
	Here are several of the most commonly used values returned by this header:
	public: Any server or browser can cache the response.
	private: Cacheable only by the browser or client that made the request.
	no-cache: Whenever the document is requested, the request must go directly to the server that originated the response.
	Other values returned by this header can be found in RFC 2616, available at http://www.ietf.org/rfc/rfc2616.txt.
Expires	If the response can be cached, the Expires header specifies a point in time at which the response can no longer be cached. When the Duration attribute is set in the OutputCache directive of ASP.NET, the setting affects the Expires HTTP header.
Last-Modified	This is the point in time at which the document was last modified, for example, when the document was last saved.

Note

For more details on the Cache-Control header, see Chapter 8 of Web Proxy Servers by Ari Luotonen (published by Prentice Hall).

Important

Use the page directives when possible. There is less risk of introducing bugs in your application because the OutputCache directive is declarative.

The two samples we looked at earlier, the OutputCache page directive and the Cache APIs, accomplish the same page output caching behavior: both cache the page for 60 seconds and do not use any VaryBy parameters. (We’ll discuss the VaryBy options in a moment.) However, the page output Cache API sample needs two additional method calls to achieve parity with the directive:

SetCacheability
SetValidUntilExpires

We’ll talk about these methods, as well as several others supported by the page output Cache APIs, after our discussion of the page OutputCache directive.

Using the Page OutputCache Directive

The page OutputCache directive is designed to be a simple technique for enabling page output caching. It successfully addresses 95 percent of page output caching scenarios. For special cases, such as an ETag HTTP header generated by pages, the page output cache APIs should be utilized.

Note

An ETag, or entity tag, is an HTTP header sent with the served document to uniquely identify a specific version of the page. Cache servers can query the origin server to determine whether a cached document is still valid by comparing the cached document’s entity tag to the entity tag returned from the origin server.

The following code shows the syntax for the OutputCache directive:

<%@ OutputCache Duration="[seconds]" 
                VaryByParam="[none or parameter list]"
                [Optional attributes] %>

The Duration and VaryByParam attributes are required when using the OutputCache directive. If they are not specified, a detailed exception is thrown when the page is compiled. Figure 6-2 shows the error that indicates the VaryByParam attribute is missing.

Figure 6-2: Parser error from missing VaryByParam attribute

The most common use of the OutputCache directive is to output cache a page for a duration of time, for example, output caching a page used to display sales reports for a duration of 12 hours. The following OutputCache directive would accomplish this:

<%@ OutputCache Duration="43200" VaryByParam="none" %>

As stated earlier, the VaryByParam attribute is a required attribute that must be set when using the directive. When not used, its value must be set to none.

Although we specified Duration with 43200 (12 hours), there is no guarantee that the output cached page would remain in the cache for this entire period of time. Earlier we discussed how a cached item could be evicted from the cache when memory needs to be reclaimed. In the case of an output cached page, the page would simply be evicted, and on the next request, the page would fully re-execute and be re-inserted into the cache. No exception occurs when this happens; it’s a normal and expected occurrence. Auto-eviction allows the server to optimize itself depending upon the current load. An ASP.NET application performance object is available in the Windows Performance monitor. A Cache API misses counter increments when a page marked as cacheable, or other items requested from the cache, cannot be served from the cache.

Storing an output cached page for a period of time is straightforward unless the cache is dealing with more complex requests. For example, a simple HTTP GET request (with no querystring parameters) is assumed with the directive here:

<%@ OutputCache Duration="43200" VaryByParam="none" %>

What happens for HTTP POST requests, in which parameters are sent via the POST body, or for HTTP GET requests, in which parameters are sent via the querystring?

When building dynamic Web applications, the parameters passed via the POST body or the querystring represent significant data that might affect how the page is displayed. For example, when looking at the www.asp.net site, you’ll notice that we pass the querystring parameters tabindex and tabid. The URL /default.aspx?tabindex=0&tabid=1 tells the server to load the controls to display the home page, as shown in Figure 6-3.

Figure 6-3: The www.asp.net site

Similarly, the URL /default.aspx?tabindex=2&tabid=31 tells the server to load the Control Gallery (Figure 6-4):

Figure 6-4: The www.asp.net site Control Gallery

Your assumption might be that this page could not be output cached because default.aspx can have different output that is determined by the parameters sent to it. However, this assumption is incorrect: the ability to vary the cache by request parameters is one of the unique advantages of the ASP.NET output cache. In fact, the ASP.NET page output cache supports several vary-by options for scenarios in which parameters or other data affect how the page is to be cached.

Varying Cached Pages by Parameters

An output cached page can be varied by a number of different conditions. Internally, when the page output cache is varied, the cache stores different versions of the page in memory. For example, the ASP.NET output cache is capable of storing different contents for a single page by varying parameters:

http://www.asp.net/Default.aspx?tabindex=0&tabid=1 
http://www.asp.net/Default.aspx?tabindex=5&tabid=42 
http://www.asp.net/Default.aspx?tabindex=2&tabid=31

To support this scenario, we need to use the VaryByParam attribute, which we had previously set to none.

Important

If VaryByParam is not used, why is it required and why is its value set to none? The decision was made to force the developer to add VaryByParam with a value of none to clearly indicate that the page was not varying by any parameters. Requests with parameters sent to an output cached page using VaryByParam with none will not be resolved by the output cache and are treated as misses.

The following directive varies the cache by the tabindex and tabid parameters:

<%@ OutputCache Duration="43200"
                VaryByParam="tabindex;tabid" %>

When you set the VaryByParam values to tabindex and tabid, the output cache will store and retrieve different versions of the requested page from cache, or execute the page if it is not found in the cache. This behavior is shown in Figure 6-5.

Figure 6-5: Cache in HttpModule

Note

A single parameter can be specified, for example, VaryByParam="tabindex". Multiple parameters to be varied by must be semicolon-separated, for example, VaryByParam="tabindex;tabid".

Multiple versions of the page reside in the cache, and if the requested version is found in the cache, the contents from the cache are sent back as the response. Otherwise, the request is executed normally as if it were not cached.

The VaryByParam attribute is powerful because it allows the developer to author a single page to be output cached, which can further be constrained by the parameters that affect how the page is to be displayed. Using VaryByParam, we can build highly specialized pages and still guarantee that we can take advantage of the output caching feature for increased performance.

Tip

Varying the output cache by various parameters is very useful. However, here is a good rule of thumb to keep in mind: the more specific the request, the less likely it is that the request can be satisfied from the cache. If the page’s output is highly user-specific, such as an e-commerce check-out page, the output cached page can only be used again by that same user in the same condition (in contrast to output caching the page used to display product information). When items are stored in the cache and cannot be used again, the cache is a wasted resource.

The VaryByParam attribute supports three settings:

none Vary by no parameters. Requests with either a query string or POST parameters cannot be satisfied from the cache.
[Param1] or [Param1;Param2] Vary by the named parameters, which are sent in either the query string or the POST body of the request. Multiple names are semicolon-separated.
* A special option to vary by all parameters (vs. naming each parameter individually).

Tip

Do not use VaryByParam with * unless absolutely necessary. Any arbitrary data passed in the query string or POST body will affect how many versions of the output cached page are created, potentially filling memory with many pages that can’t be used again.
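
To make the three settings concrete, here is a minimal sketch of how a cache key could be derived from a request under each of them. It is a model written for this discussion, not the key format ASP.NET uses internally; the names VaryByParamSketch, BuildKey, and Lookup are hypothetical, and returning null is simply our way of marking a request that, per the description above, cannot be satisfied from the cache.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of VaryByParam; not the ASP.NET key format.
public static class VaryByParamSketch
{
    // Returns a key identifying the cached version for this request,
    // or null when the request cannot be satisfied from the cache.
    public static string BuildKey(string path,
                                  IDictionary<string, string> parameters,
                                  string varyByParam)
    {
        if (string.Equals(varyByParam, "none", StringComparison.OrdinalIgnoreCase))
        {
            // "none": only requests without parameters are cache candidates.
            return parameters.Count == 0 ? path.ToLowerInvariant() : null;
        }

        IEnumerable<string> names;
        if (varyByParam == "*")
        {
            // "*": every parameter that was sent becomes part of the key.
            names = parameters.Keys;
        }
        else
        {
            // Named list: only the listed parameters are part of the key.
            names = varyByParam.Split(';');
        }

        // Normalize and sort so that parameter order does not matter.
        IEnumerable<string> parts = names
            .Select(n => n.Trim().ToLowerInvariant())
            .Where(n => n.Length > 0)
            .Distinct()
            .OrderBy(n => n, StringComparer.Ordinal)
            .Select(n => n + "=" + Lookup(parameters, n));

        return path.ToLowerInvariant() + "?" + string.Join("&", parts);
    }

    // Case-insensitive parameter lookup; a missing parameter yields "".
    private static string Lookup(IDictionary<string, string> parameters, string name)
    {
        foreach (KeyValuePair<string, string> pair in parameters)
        {
            if (string.Equals(pair.Key, name, StringComparison.OrdinalIgnoreCase))
            {
                return pair.Value;
            }
        }
        return "";
    }
}
```

With VaryByParam="tabindex;tabid", an extra parameter such as a tracking value does not change the key, so the request still maps to an existing entry. With *, the same extra parameter produces a new key, which is why arbitrary data in the query string can fill the cache with entries that are never reused.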

In addition to varying the cache by the query string or POST parameters, the output cache allows for two other vary by conditions:

VaryByHeader Varies cache entries by HTTP headers
VaryByCustom Varies cache entries by the browser type or by user code

Varying by HTTP Headers

Varying the output cached result of a page based on parameters sent to the page is very powerful, but the page can also be varied by the HTTP headers that are available when the request is made.

When a desktop browser such as Microsoft Internet Explorer 6 makes an HTTP request for a resource stored on a Web server, the client sends several HTTP headers along with the request. Following are the applicable headers Internet Explorer 6 sends for a standard HTTP GET request:

Accept-Language
User-Agent
Cookie

The Accept-Language header is used by the client to indicate the language that the client prefers. In the case of my browser, the language set is EN-US, which means United States English. A request from the United Kingdom might be EN-GB, from France FR-FR, from Japan JA-JP, and so on.

Applications are often developed to support globalization and localization, that is, changing content or display based on the locale or language native to the user. Users can specify their language interactively through the application, such as selecting an option from a drop-down list, or the application can intelligently choose which language to use based on the Accept-Language client header.

The www.asp.net site does not support multiple languages, but if it did have its content stored in both French and Japanese, the site could still output cache its pages varying by the tabindex and tabid parameters and also varying by the Accept-Language header:

<%@ OutputCache Duration="43200"
                VaryByParam="tabindex;tabid"
                VaryByHeader="Accept-Language" %>

The maximum number of versions of the page that can be stored in the output cache based on the current settings follows this formula:

[distinct values of tabindex] * [distinct values of tabid] * [supported languages]

As you can clearly see, the output cached version of the page is becoming more and more specific; also more and more entries must be kept in the cache. Keep in mind that an entry is created only after it is first requested, so if no requests are made for FR-FR, for example, no entry would appear in the cache.
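
As a worked example of the formula, the sketch below compares the upper bound with the number of entries that a given set of requests would actually create. The figures used in the usage note (6 tabindex values, 6 tabid values, 3 languages) are assumptions chosen for illustration, not measurements from the www.asp.net site, and the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for reasoning about output cache growth.
public static class CacheEntryCountSketch
{
    // Upper bound from the formula: every combination is requested.
    public static long UpperBound(int tabIndexValues, int tabIdValues, int languages)
    {
        return checked((long)tabIndexValues * tabIdValues * languages);
    }

    // Entries actually created: one per distinct combination requested.
    // Each request is given as { tabindex, tabid, language }.
    public static int EntriesCreated(IEnumerable<string[]> requests)
    {
        var keys = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string[] request in requests)
        {
            keys.Add(string.Join("|", request));
        }
        return keys.Count;
    }
}
```

With the assumed figures, UpperBound(6, 6, 3) is 108 possible entries. If the only requests seen so far are two for tabindex 0, tabid 1 in EN-US and one for tabindex 2, tabid 31 in FR-FR, EntriesCreated reports 2, because entries appear only when a combination is first requested.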

The second HTTP header of interest, User-Agent, is used to identify the type of browser to the server. For example, this is the user agent string for Internet Explorer 6:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)

Note

If you have the .NET Framework installed and are using Internet Explorer, a .NET CLR [version #] string will be added as part of the User-Agent header. This can be useful for users who are downloading .NET applications because you can determine whether they also need to download the .NET Framework.

Earlier we said that the more specific the request, the less likely it is that the request will be satisfied from the cache. Constraints such as Accept-Language are common: we can expect multiple requests to specify EN-US. However, there are many different browser types (versioned by both major and minor version numbers), and in some cases the User-Agent can contain even more data, such as which version of the .NET Framework the client has installed. The cache varies by the entire contents of the header, so User-Agent is a poor choice for VaryByHeader: the value differs so much from client to client that few requests would share a cache entry.

However, just because using VaryByHeader with User-Agent is a bad choice for varying the output cached by browser type does not mean we can’t vary by browser type! To vary by browser type, we use a special vary by option: VaryByCustom.

Varying by Browser Type

ASP.NET supports a rich server control model that allows developers to declaratively add programmable elements to their page using special XML tags. These server controls go through a life cycle, shown in Figure 6-6.

Figure 6-6: Server control rendering events

Server controls have an event life cycle and eventually render contents into the response stream to be sent back to the client. Part of this life cycle involves ASP.NET providing the server controls with information about the request, such as the type of browser, for example, Internet Explorer 6; or type of device, for example, a phone supporting WML. The server control then uses this information to determine what markup should be rendered.

For example, a server control might possess the ability to render standard HTML, DHTML, or WML based on the browser or device requesting the page that the control is used within. To successfully cache this page, we need to vary the output based on the type of device requesting the page. We already have this information in the User-Agent header, but we concluded that the User-Agent header is not a good vary-by candidate. Because we still want to vary by browser type, a special VaryByCustom attribute was created.

The VaryByCustom attribute can be used to either vary the output cache entries by browser type and major version or allow the user to specify a custom vary by option. To vary by the browser type and major version, we simply specify the following:

<%@ OutputCache Duration="43200"
                VaryByParam="tabindex;tabid"
                VaryByHeader="Accept-Language"
                VaryByCustom="browser" %>

Tip

Page output cache directives are additive, and you should plan to use more than just the required VaryByParam for pages containing server controls that behave differently for different browser types. Otherwise, inconsistencies will occur, as Internet Explorer DHTML could potentially be sent to a Netscape 4 browser (if the output cache is not being varied by browser type).
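
The difference between varying by the whole User-Agent header and varying by browser type and major version can be illustrated with a small sketch. The regular expression below is a deliberately naive stand-in that recognizes only Internet Explorer style strings; ASP.NET itself determines the browser through its browser capabilities support, not through this code. The class and method names are hypothetical, and apart from the Internet Explorer 6 string quoted earlier, the sample User-Agent values in the usage note are assumed for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical comparison of two vary-by strategies.
public static class BrowserKeySketch
{
    // Naive pattern: matches "MSIE <major>.<minor>" only.
    private static readonly Regex Msie = new Regex(@"MSIE (\d+)\.(\d+)");

    // Key made of browser type and major version, e.g. "IE6".
    public static string ByBrowser(string userAgent)
    {
        Match match = Msie.Match(userAgent);
        return match.Success ? "IE" + match.Groups[1].Value : "Unknown";
    }

    // Number of cache entries that a set of requests would create
    // when each request is reduced to a key by keyOf.
    public static int CountDistinct(IEnumerable<string> userAgents,
                                    Func<string, string> keyOf)
    {
        var keys = new HashSet<string>();
        foreach (string userAgent in userAgents)
        {
            keys.Add(keyOf(userAgent));
        }
        return keys.Count;
    }
}
```

Given three Internet Explorer 6 clients whose User-Agent strings differ only in Windows version or installed .NET CLR version, CountDistinct with the identity function (the whole header, as VaryByHeader would use it) yields three entries, whereas CountDistinct with ByBrowser yields one.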

Varying by User-Defined Conditions

So far we’ve examined three distinct vary by options supported by the OutputCache directive. It was our team’s belief when designing this feature that the OutputCache directive would address the majority of output caching scenarios developers would face. However, one last piece of vary-by extensibility was added just in case we didn’t cover all the scenarios: the ability to override the behavior of VaryByCustom.

VaryByCustom accepts a string, just as the other vary by conditions do. Earlier we stated that if the string is browser, ASP.NET varies the cache by browser type and major version. Under the covers, ASP.NET is calling a virtual method that the application class inherits from HttpApplication:

public virtual string HttpApplication.GetVaryByCustomString(HttpContext context, string custom)

It is the responsibility of this method to perform the appropriate actions when browser is specified. However, this API is marked as virtual and thus can be overridden, allowing the developer to customize the output cache VaryByCustom behavior.

Overriding the default behavior of VaryByCustom means that we can vary by any custom condition. For example, we could vary our cache by browser minor version as well.

The following syntax is used when overriding GetVaryByCustomString in C#. Because the method is defined on HttpApplication, the override belongs in global.asax (or in the class that global.asax inherits from):

override public string GetVaryByCustomString(HttpContext context,
                                             String arg) {
    // Implementation  
}

The following syntax is used when overriding GetVaryByCustomString in Microsoft Visual Basic .NET:

Public Overrides Function GetVaryByCustomString( _
                 context As HttpContext, arg As String) As String
   ' Implementation  
End Function

When GetVaryByCustomString is overridden, ASP.NET uses the overridden method instead of the default implementation. The method accepts two parameters. The first is an instance of HttpContext, which contains all the details about the current request. The second parameter is the string value set in the VaryByCustom OutputCache directive.

To vary by a custom scenario, such as caching the page based only on the minor version of the requesting browser, you would use this code:

<%@ OutputCache Duration="60"
                VaryByParam="none"
                VaryByCustom="MinorVersion" %>  
--- Page Content Here (not shown) ---

To vary by a custom scenario in global.asax, you would use this code:

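<script runat="server">
    public override string GetVaryByCustomString(
        HttpContext context, string arg) {

        // Assume the argument follows the same pattern as the other
        // vary-by attributes: a semicolon-separated list of names.
        string[] varyByArgs = arg.Split(';');
        string customVaryByString = "";

        // Append one key fragment for each name we recognize.
        foreach (string varyByArg in varyByArgs) {
            switch (varyByArg) {
                case "MinorVersion":
                    customVaryByString += "minorVersion=" +
                        context.Request.Browser.MinorVersion.ToString() + ";";
                    break;
            }
        }

        // When no custom name was recognized, defer to the default
        // implementation, which handles the built-in "browser" value.
        if (customVaryByString.Length == 0) {
            return base.GetVaryByCustomString(context, arg);
        }

        return customVaryByString;
    }
</script>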

This code must reside within global.asax (or the HttpApplication-derived class behind it), and GetVaryByCustomString must return a string that is unique to each variation you want cached separately. The returned string value is used to create the key to the output cached page.

Controlling Where the Page Is Cached

The final OutputCache attribute, Location, is used to control who can cache a copy of the response generated by ASP.NET. It is shown in the next code snippet. Note that you are unlikely to use this attribute unless you are using other caching hardware within your network.

<%@ OutputCache Duration="43200"
                VaryByParam="none"
                Location="Client" %>

Valid values for Location are as follows:

Any Indicates that any downstream caching application is allowed to cache the generated response from ASP.NET. Any is the default value for Location.
Client Indicates that the browser can store the page in its local browser cache. When the user navigates using the Back and Forward buttons, the browser can satisfy these requests without a request to the server.
Downstream Indicates that downstream clients, such as browsers or proxy caches, can cache the document; however, the document is not cached by the server. This setting is useful when you want to guarantee that any requests to the origin server are generated dynamically. However, if the request was made through a proxy server, the proxy server has the first chance to satisfy the request.
Server Indicates that the response is cached only by the server and no downstream caching clients or proxies can cache the response. This setting is useful when you want to ensure cache consistency throughout the network by not allowing any proxy servers to cache the contents of the request.
None Indicates that the page cannot be stored in any cache.

Tip

Don’t use the Location attribute unless you completely understand how it works. In the majority of cases, it is unnecessary.
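
As a rough aid for reasoning about the Location values listed above, the following sketch maps each value to the Cache-Control header we would expect a client to see and to whether the server keeps its own copy. The mapping reflects our reading of how the Location values correspond to the HttpCacheability settings and should be treated as an assumption to verify against the headers your own application sends; the LocationEffect and LocationSketch types are hypothetical and are not part of ASP.NET.

```csharp
using System;

// Hypothetical summary of what a Location value implies.
public sealed class LocationEffect
{
    public readonly string CacheControl;  // expected Cache-Control value
    public readonly bool CachedOnServer;  // does the server keep a copy?

    public LocationEffect(string cacheControl, bool cachedOnServer)
    {
        CacheControl = cacheControl;
        CachedOnServer = cachedOnServer;
    }
}

public static class LocationSketch
{
    // Assumed mapping; confirm by inspecting real response headers.
    public static LocationEffect For(string location)
    {
        switch (location.Trim().ToLowerInvariant())
        {
            case "any":        // server, proxies, and browser may cache
                return new LocationEffect("public", true);
            case "client":     // only the requesting browser may cache
                return new LocationEffect("private", false);
            case "downstream": // proxies and browser, but not the server
                return new LocationEffect("public", false);
            case "server":     // server only; clients are told not to cache
                return new LocationEffect("no-cache", true);
            case "none":       // nothing caches the response
                return new LocationEffect("no-cache", false);
            default:
                throw new ArgumentException("Unknown Location value: " + location);
        }
    }
}
```

For example, LocationSketch.For("Client") describes a response marked private that the server does not store, which matches the Cache-Control value seen earlier for a page without output caching.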

Now that we’ve covered how to use page output caching through the OutputCache directive, let’s examine how to use page output caching using the page output Cache APIs surfaced from Response.Cache.