Caching dynamic content | High-Volume Web Sites Team - More about High-Volume Web Sites

The key issues in caching dynamic content are determining what should be cached, where caching should take place, and how to invalidate cached data.

What should be cached?

A candidate for dynamic content caching is content or data that changes, yet remains stable long enough for meaningful reuse to occur. If the access frequency is high, as with pricing information for a popular stock, then even a short period of stability may be enough to benefit from caching dynamic content.
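
The effect of a short period of stability can be estimated with a small simulation. The sketch below is illustrative only: the class name, method name, and traffic figures are assumptions, and it supposes evenly spaced requests for a single item and a fixed entry lifetime.

```java
// Hypothetical sketch: estimates how much back-end work a short-lived cache entry saves.
final class HitRatioEstimate {

    /**
     * Simulates evenly spaced requests for one item and counts how many must
     * go to the back end because the cached copy is missing or has reached
     * ttlMillis in age. Assumes requestsPerSecond divides 1000 evenly.
     */
    static long backendFetches(long requestsPerSecond, long windowSeconds, long ttlMillis) {
        long intervalMillis = 1000 / requestsPerSecond; // gap between requests
        long totalRequests = requestsPerSecond * windowSeconds;
        long fetches = 0;
        long cachedAt = -1; // -1 means nothing has been cached yet
        for (long i = 0; i < totalRequests; i++) {
            long now = i * intervalMillis;
            if (cachedAt < 0 || now - cachedAt >= ttlMillis) {
                fetches++;      // miss: regenerate the content
                cachedAt = now; // and cache the fresh copy
            }
        }
        return fetches;
    }

    public static void main(String[] args) {
        long fetches = backendFetches(100, 60, 2000);
        System.out.println("back-end fetches: " + fetches);
    }
}
```

Under these assumptions, 100 requests per second for one minute with a two-second lifetime sends 30 of 6,000 requests to the back end.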

All dynamic Web pages consist of smaller and simpler page fragments. Some fragments are static (such as headers and footers), while others are dynamic (such as fragments containing stock quotes or sports scores). Breaking a page into fragments or components makes effective caching possible for any page, even a highly dynamic page. The goal of creating fragments or components is to maximize fragment reusability and cache utilization.

For example, the personalized page shown in Figure 5-1 contains user-specific elements applicable to only one person. Not much benefit can be realized by caching this whole page.

Figure 5-1: Example of a dynamic page containing personalized data

Figure 5-2 shows that when the same page is broken down into fragments based on reusability and cacheability, some or all of the fragments (for example, headers, footers, navigation bars for all users; targeted pricing and advertising for user groups) may become reusable and cacheable for a larger audience. Only fragments that are not cacheable need to be fetched from the back-end, thereby reducing server-side workload and improving performance.

Figure 5-2: Example of the dynamic page fragmented for caching
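
The fragmentation idea can be sketched in a few lines of plain Java. This is a minimal illustration, not the WebSphere Servlet/JSP Result Cache: the class, the cache keys, and the fragment contents are all hypothetical. Shared fragments are cached by key, group-targeted fragments are cached per group, and the user-specific fragment is always generated.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical sketch of fragment-level caching; not a WebSphere API.
final class FragmentPage {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    final AtomicInteger backendCalls = new AtomicInteger();

    /** Returns the cached fragment for key, generating it once on a miss. */
    String cached(String key, Supplier<String> generator) {
        return cache.computeIfAbsent(key, k -> {
            backendCalls.incrementAndGet();
            return generator.get();
        });
    }

    /** Always generated: user-specific content has no reuse across users. */
    String uncached(Supplier<String> generator) {
        backendCalls.incrementAndGet();
        return generator.get();
    }

    /** Assembles a page from shared, group-level, and user-level fragments. */
    String render(String user, String group) {
        String header = cached("header", () -> "[header]");
        String greeting = uncached(() -> "[hello " + user + "]");
        String pricing = cached("pricing:" + group, () -> "[pricing for " + group + "]");
        String footer = cached("footer", () -> "[footer]");
        return header + greeting + pricing + footer;
    }

    public static void main(String[] args) {
        FragmentPage page = new FragmentPage();
        System.out.println(page.render("alice", "gold"));
        System.out.println(page.render("bob", "gold"));
        System.out.println("back-end calls: " + page.backendCalls.get());
    }
}
```

In this sketch, rendering the page for two users in the same group costs five fragment generations instead of eight, because the header, footer, and group pricing are reused on the second request.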

Web pages should be fragmented to cache dynamic content effectively within the enterprise infrastructure and at the content distribution network. However, in some cases, even caching the most granular, final formatted fragment is not sufficient. In such cases, caching at the raw data level is the next, finer-grained technique that can be used.

Caching a final formatted whole page (such as HTML or XML), a final formatted fragment, or a piece of unformatted raw data, each, in its own way, contributes to the ultimate benefit of caching dynamic content. The WebSphere solution for caching dynamic content offers features that enable dynamic content to be cached at various granularities, namely whole pages, fragments, and raw data. These important features are Servlet/JSP Result Cache and Command Cache.

Where should caching take place?

In theory, dynamic content should be cached as close to the user as possible. In practice, other factors, such as security and user-specific data, may influence the choice of the best place to cache dynamic content.

Web page design also plays an important role in determining where dynamic data is cached. Personalized pages are one example. Although personalized, such pages typically contain a mix of dynamic data: user-specific and non-user-specific, locale-sensitive, and security-sensitive and non-security-sensitive. To maximize the benefit of caching dynamic content, these types of data should be fragmented as finely as possible so that they can be cached independently at different locations. For example, fragments or components that are neither user-specific nor security-sensitive are generally useful to many users, and thus can be cached in a more public space, closer to users. Security-sensitive data should be cached behind the enterprise firewall, yet as close to the edge of the enterprise as possible.
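
The placement rule in the preceding paragraph can be written as a small decision function. This is only a sketch: the type and method names are hypothetical, and the handling of user-specific but non-sensitive fragments is an assumption, since the text does not prescribe a location for them.

```java
// Hypothetical sketch: choosing a cache location for a page fragment.
final class FragmentPlacement {

    enum CacheLocation { PUBLIC_EDGE, INSIDE_FIREWALL }

    static CacheLocation locate(boolean userSpecific, boolean securitySensitive) {
        if (securitySensitive) {
            // Sensitive data stays behind the enterprise firewall.
            return CacheLocation.INSIDE_FIREWALL;
        }
        if (userSpecific) {
            // Assumption: fragments with no cross-user reuse are also kept
            // inside the enterprise rather than in a public cache.
            return CacheLocation.INSIDE_FIREWALL;
        }
        // Shared, non-sensitive fragments can be cached closer to users.
        return CacheLocation.PUBLIC_EDGE;
    }

    public static void main(String[] args) {
        System.out.println("header: " + locate(false, false));
        System.out.println("account balance: " + locate(true, true));
    }
}
```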

In a multi-tier e-business environment (Figure 5-3), WebSphere Application Server dynamic cache service can be activated at the business logic layer, the presentation layer, or both. It can also control external caches on servers, such as IBM WebSphere Edge Server and IBM HTTP Server. When external caching is enabled, the cache matches pages with their uniform resource identifiers (URIs) and exports matching pages to the external cache. The contents can then be served from the external cache instead of the application server, which saves resources and improves performance. Additionally, WebSphere Application Server dynamic cache service's Replication and Invalidation Support extends the cost effectiveness of caching dynamic content by enabling cache sharing and cache replication in an environment with multiple tiers and multiple servers. Finally, WebSphere Application Server's Edge of Network Caching Support expands the application caches into the network.

Figure 5-3: Examples of two-, three-, and four-tiered infrastructures

How is the cache invalidated?

The biggest challenge when caching dynamic content is to guarantee the freshness, consistency, and accuracy of the content. This requires efficient and comprehensive mechanisms for identifying and updating pages, fragments, and data that are no longer valid, a process called invalidation.

WebSphere Application Server dynamic cache service provides invalidation techniques that are rule-based, time-based, group-based, and programmatic. It can also invalidate the remote caches that are configured as its external caches. Additionally, the dynamic cache service uses a built-in, standards-based Java Message Service (JMS) messaging system that is lightweight and high-performance to propagate invalidation requests among servers efficiently.
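
Three of these invalidation styles (time-based, group-based, and programmatic) can be illustrated with a small in-memory cache. The sketch below is not the WebSphere dynamic cache API: every class and method name is hypothetical, the cache is not thread-safe, and rule-based invalidation and propagation between servers are left out. The clock is injected so that expiry can be demonstrated without waiting.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.LongSupplier;

// Hypothetical sketch of three invalidation styles; single-threaded use only.
final class InvalidatingCache {
    private static final class Entry {
        final String value;
        final long expiresAt;
        Entry(String value, long expiresAt) { this.value = value; this.expiresAt = expiresAt; }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final Map<String, Set<String>> groups = new HashMap<>(); // group id -> keys
    private final LongSupplier clock; // current time in milliseconds

    InvalidatingCache(LongSupplier clock) { this.clock = clock; }

    /** Stores a value with a lifetime and optional group memberships. */
    void put(String key, String value, long ttlMillis, String... groupIds) {
        entries.put(key, new Entry(value, clock.getAsLong() + ttlMillis));
        for (String g : groupIds) {
            groups.computeIfAbsent(g, x -> new HashSet<>()).add(key);
        }
    }

    /** Time-based: an entry that has reached its expiry is dropped on read. */
    String get(String key) {
        Entry e = entries.get(key);
        if (e == null) return null;
        if (clock.getAsLong() >= e.expiresAt) {
            entries.remove(key);
            return null;
        }
        return e.value;
    }

    /** Programmatic: the application removes one entry explicitly. */
    void invalidate(String key) { entries.remove(key); }

    /** Group-based: removes every entry registered under the group id. */
    void invalidateGroup(String groupId) {
        Set<String> keys = groups.remove(groupId);
        if (keys == null) return;
        for (String k : keys) entries.remove(k);
    }

    public static void main(String[] args) {
        long[] now = {0};
        InvalidatingCache cache = new InvalidatingCache(() -> now[0]);
        cache.put("quote:IBM", "100", 1000, "quotes");
        cache.put("header", "<h1>", 1000);
        cache.invalidateGroup("quotes");
        System.out.println(cache.get("quote:IBM")); // removed with its group
        now[0] = 1000;
        System.out.println(cache.get("header"));    // lifetime has elapsed
    }
}
```

Grouping entries under a shared identifier lets one update, such as a new price feed, invalidate every dependent fragment in a single call, while the lifetime acts as a safety net for entries that nothing invalidates explicitly.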
