7.4 Optimizing Hierarchies

Remember that a primary motivation for cache hierarchies is to find cache hits in your neighbors. Usually, a cache uses one of the intercache protocols described in Chapter 8 to predict neighbor hits. In some cases, though, we can identify requests that could not possibly result in a cache hit. For these, it makes sense to immediately forward the request to the origin server rather than go through a neighbor cache. In most cases, a direct connection to the origin server is faster than a cache miss through a parent. In addition to reducing latency for end users, this technique also reduces the load placed on upper layers of a hierarchy.

Identifying requests that should bypass the hierarchy is relatively straightforward. The most common case is requests that have uncachable responses. An easy way to find uncachable responses is by looking at the request method. Recall from Section 2.2 that only GET method requests are cachable by default. Many of the other methods are never cachable, and POST is cachable only if specifically allowed. Thus, a common heuristic is to send all non-GET requests (with one exception) directly to origin servers.

The exception to this rule is the TRACE method. The purpose of the TRACE method is to enable someone to discover the sequence of caches (or proxies) between the user and an origin server. It is similar to the traceroute program used to show IP routing paths. Even though TRACE responses are not cachable, a proxy cache should forward a TRACE request as though it were a GET, so the proper forwarding path can be shown.
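The method heuristic, including the TRACE exception, can be sketched in a few lines. This is only an illustration and is not taken from any particular cache product; the function name `should_bypass_for_method` is hypothetical, and it assumes the method is passed exactly as it appears on the request line.

```python
def should_bypass_for_method(method: str) -> bool:
    """Return True if a request with this method should go straight to the origin.

    Hypothetical helper for illustration only.
    """
    # GET may produce a hit in a neighbor cache, so it stays in the hierarchy.
    # TRACE is never cachable, but it must follow the normal forwarding path
    # so that the path it reports is the one a GET would take.
    return method not in ("GET", "TRACE")
```

With this sketch, a POST or PUT request is sent directly to the origin server, while GET and TRACE requests are forwarded through the hierarchy as usual.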

We saw additional heuristics for identifying uncachable requests in Section 2.2.7. Perhaps the most common is to look for "cgi" or "?" in the URL. Even though there is some small probability that such a request can be cached, it's probably still better to bypass the hierarchy. You may lose one cache hit out of 1,000 requests, but you'll gain much more in better response times.
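As a rough sketch of the URL heuristic, the check can be a plain substring test. The function name `looks_dynamic` is hypothetical, and the two patterns are simply the ones mentioned above; a real cache would likely make the list configurable.

```python
def looks_dynamic(url: str) -> bool:
    """Return True if the URL contains "cgi" or "?" anywhere.

    Hypothetical helper for illustration only.
    """
    # A plain substring match is deliberately crude: it also matches "cgi"
    # inside a hostname or an unrelated path component.
    return "cgi" in url or "?" in url
```

Because the test is so coarse, it occasionally bypasses the hierarchy for a request that could have been a hit, which is the trade-off described above.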

Requests that require authentication always have uncachable responses, unless the response includes the public directive. At the present time, responses to authenticated requests almost never have the public directive, so this is another good way to identify uncachable requests.
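One simple way to recognize an authenticated request is to look for an Authorization header in the request. The sketch below assumes the request headers are available as a dictionary mapping names to values; the function name `is_authenticated_request` is hypothetical.

```python
def is_authenticated_request(headers: dict) -> bool:
    """Return True if the request carries an Authorization header.

    Hypothetical helper for illustration only. `headers` is assumed to be
    a mapping of request header names to values.
    """
    # HTTP header names are case-insensitive, so compare in lowercase.
    return any(name.lower() == "authorization" for name in headers)
```

A cache using this check would send such requests directly to the origin server, accepting that it loses the rare response that a shared cache would have been allowed to store.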

We might be tempted to bypass the hierarchy for requests that have a no-cache directive because they always force a cache miss. Bypassing the hierarchy for these requests is probably a bad idea. Let's assume that a parent cache has stored a response that is now out of date but is believed to be fresh. If a user clicks the Reload button, the browser includes the no-cache directive in its request. If this request doesn't go through the parent, then the out-of-date response does not get updated. Thus, a no-cache request should still go through a cache hierarchy, but it must never be sent to a sibling cache.
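The handling of no-cache requests can be sketched as a filter on the list of candidate neighbors. This is an illustration under stated assumptions: neighbors are represented as hypothetical (name, relationship) tuples, request headers are a dictionary, and the function names are invented for this example.

```python
def has_no_cache(headers: dict) -> bool:
    """Return True if the request includes a no-cache directive."""
    for name, value in headers.items():
        # no-cache may arrive in Cache-Control (HTTP/1.1) or Pragma (HTTP/1.0).
        if name.lower() in ("cache-control", "pragma"):
            directives = [d.strip().lower() for d in value.split(",")]
            if "no-cache" in directives:
                return True
    return False


def candidate_neighbors(headers: dict, neighbors: list) -> list:
    """Return the neighbors this request may be forwarded to.

    `neighbors` is a list of (name, relationship) tuples, where the
    relationship is "parent" or "sibling" (a hypothetical representation).
    """
    if has_no_cache(headers):
        # Keep the request in the hierarchy so that a stale copy in a
        # parent gets refreshed, but exclude siblings.
        return [n for n in neighbors if n[1] == "parent"]
    return list(neighbors)
```

For example, given one parent and one sibling, a request carrying `Cache-Control: no-cache` would be eligible for the parent only, while an ordinary request would remain eligible for both.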

Web Caching
ISBN: 156592536X
EAN: N/A
Year: 2001
Pages: 160
