15.12 Caches

     

Web browsers have been caching pages and images for years. If a logo is repeated on every page of a site, the browser normally loads it from the remote server only once, stores it in its cache, and reloads it from the cache whenever it's needed rather than returning to the remote server each time. Several HTTP headers, including Expires and Cache-Control, can control caching.

Java 1.5 finally adds the ability to cache data to the URL and URLConnection classes. By default, Java 1.5 does not cache anything, but you can create your own cache by subclassing the java.net.ResponseCache class and installing it as the system default. Whenever the system tries to load a new URL through a protocol handler, it first looks for it in the cache. If the cache returns the desired content, the protocol handler doesn't need to connect to the remote server. However, if the requested data is not in the cache, the protocol handler downloads it. After it has done so, it puts its response into the cache so the content is more quickly available the next time that URL is loaded.

Two abstract methods in the ResponseCache class store and retrieve data from the system's single cache:

 public abstract CacheResponse get(URI uri, String requestMethod,
   Map<String,List<String>> requestHeaders) throws IOException
 public abstract CacheRequest put(URI uri, URLConnection connection)
   throws IOException

The put( ) method returns a CacheRequest object that wraps an OutputStream into which the protocol handler will write the data it reads. CacheRequest is an abstract class with two methods, as shown in Example 15-11.

Example 15-11. The CacheRequest class
 package java.net;

 public abstract class CacheRequest {
   public abstract OutputStream getBody( ) throws IOException;
   public abstract void abort( );
 }

The getBody( ) method in the subclass should return an OutputStream that points into the cache's data store for the URI that was passed to the put( ) method. For instance, if you're storing the data in a file, then you'd return a FileOutputStream connected to that file. The protocol handler will copy the data it reads onto this OutputStream. If a problem arises while copying (e.g., the server unexpectedly closes the connection), the protocol handler calls the abort( ) method. This method should then remove from the cache any data that has been stored for this request.
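The file-based approach just described might look something like the following. This is only a minimal sketch: the FileCacheRequest class name, the getFile( ) accessor, and the choice of one file per cached URI are assumptions made for illustration, not part of the java.net API.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.CacheRequest;

// Hypothetical file-backed CacheRequest. The class name and the
// one-file-per-URI layout are assumptions made for this sketch.
class FileCacheRequest extends CacheRequest {

  private final File file;
  private OutputStream out;

  FileCacheRequest(File file) {
    this.file = file;
  }

  // The protocol handler writes the response body onto this stream.
  @Override
  public synchronized OutputStream getBody() throws IOException {
    if (out == null) out = new FileOutputStream(file);
    return out;
  }

  // Called when copying fails: close the stream and discard the partial file.
  @Override
  public synchronized void abort() {
    try {
      if (out != null) out.close();
    } catch (IOException ex) {
      // Nothing useful to do here; the file is deleted below anyway.
    }
    file.delete();
  }

  // Custom accessor so a matching CacheResponse can find the stored data.
  File getFile() {
    return file;
  }
}
```

A matching CacheResponse would then open a FileInputStream on the same file, so the two classes only need to share the File object.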

Example 15-12 demonstrates a basic CacheRequest subclass that passes back a ByteArrayOutputStream. Later the data can be retrieved using getData( ), a custom method in this subclass that simply returns the bytes Java wrote onto the OutputStream this class supplied. An obvious alternative strategy would be to store results in files and use a FileOutputStream instead.

Example 15-12. A basic CacheRequest subclass
 import java.net.*;
 import java.io.*;
 import java.util.*;

 public class SimpleCacheRequest extends CacheRequest {

   // in-memory store for the response body
   ByteArrayOutputStream out = new ByteArrayOutputStream( );

   public OutputStream getBody( ) throws IOException {
     return out;
   }

   // discard everything written so far
   public void abort( ) {
     out = null;
   }

   // returns the cached bytes, or null if the request was aborted
   public byte[] getData( ) {
     if (out == null) return null;
     else return out.toByteArray( );
   }

 }

The get( ) method retrieves the data and headers from the cache and returns them wrapped in a CacheResponse object. It returns null if the desired URI is not in the cache, in which case the protocol handler loads the URI from the remote server as normal. Again, CacheResponse is an abstract class that you have to implement in a subclass. Example 15-13 summarizes this class. It has two methods, one to return the body of the cached response and one to return its headers. When caching the original response, you need to store both. The headers should be returned in an unmodifiable map with keys that are the HTTP header field names and values that are lists of values for each named HTTP header.

Example 15-13. The CacheResponse class
 package java.net;

 public abstract class CacheResponse {
   public abstract InputStream getBody( ) throws IOException;
   public abstract Map<String,List<String>> getHeaders( ) throws IOException;
 }

Example 15-14 shows a simple CacheResponse subclass that is tied to a SimpleCacheRequest . In this example, shared references pass data from the request class to the response class. If we were storing responses in files, we'd just need to share the filenames instead. Along with the SimpleCacheRequest object from which it will read the data, we must also pass the original URLConnection object into the constructor. This is used to read the HTTP header so it can be stored for later retrieval. The object also keeps track of the expiration date (if any) provided by the server for the cached representation of the resource.

Example 15-14. A basic CacheResponse subclass
 import java.net.*;
 import java.io.*;
 import java.util.*;

 public class SimpleCacheResponse extends CacheResponse {

   private Map<String,List<String>> headers;
   private SimpleCacheRequest request;
   private Date expires;

   public SimpleCacheResponse(SimpleCacheRequest request, URLConnection uc)
     throws IOException {

     this.request = request;

     // deliberate shadowing; we need to fill the map and
     // then make it unmodifiable
     Map<String,List<String>> headers = new HashMap<String,List<String>>( );
     String value = "";
     for (int i = 0;; i++) {
       String name = uc.getHeaderFieldKey(i);
       value = uc.getHeaderField(i);
       if (value == null) break;
       List<String> values = headers.get(name);
       if (values == null) {
         values = new ArrayList<String>(1);
         headers.put(name, values);
       }
       values.add(value);
     }
     // getExpiration( ) returns 0 if the server sent no Expires header
     long expiration = uc.getExpiration( );
     if (expiration != 0) {
       this.expires = new Date(expiration);
     }
     this.headers = Collections.unmodifiableMap(headers);
   }

   public InputStream getBody( ) throws IOException {
     // getData( ) returns null if the request was aborted
     byte[] data = request.getData( );
     if (data == null) throw new IOException("Cached data was discarded");
     return new ByteArrayInputStream(data);
   }

   public Map<String,List<String>> getHeaders( )
    throws IOException {
     return headers;
   }

   public boolean isExpired( ) {
     if (expires == null) return false;
     else {
       Date now = new Date( );
       return expires.before(now);
     }
   }

 }

Finally, we need a simple ResponseCache subclass that passes SimpleCacheRequests and SimpleCacheResponses back to the protocol handler as requested. Example 15-15 demonstrates such a simple class that stores a finite number of responses in memory in one big ConcurrentHashMap.

Example 15-15. An in-memory ResponseCache
 import java.net.*;
 import java.io.*;
 import java.util.*;
 import java.util.concurrent.*;

 public class MemoryCache extends ResponseCache {

   private Map<URI, SimpleCacheResponse> responses
     = new ConcurrentHashMap<URI, SimpleCacheResponse>( );
   private int maxEntries = 100;

   public MemoryCache( ) {
     this(100);
   }

   public MemoryCache(int maxEntries) {
     this.maxEntries = maxEntries;
   }

   public CacheRequest put(URI uri, URLConnection uc)
    throws IOException {

     // returning null tells the protocol handler not to cache this response
     if (responses.size( ) >= maxEntries) return null;

     String cacheControl = uc.getHeaderField("Cache-Control");
     if (cacheControl != null && cacheControl.indexOf("no-cache") >= 0) {
       return null;
     }

     SimpleCacheRequest request = new SimpleCacheRequest( );
     SimpleCacheResponse response = new SimpleCacheResponse(request, uc);

     responses.put(uri, response);
     return request;

   }

   public CacheResponse get(URI uri, String requestMethod,
     Map<String,List<String>> requestHeaders)
    throws IOException {

     SimpleCacheResponse response = responses.get(uri);
     // check expiration date
     if (response != null && response.isExpired( )) {
       // remove by key; the map is keyed by URI, not by response
       responses.remove(uri);
       response = null;
     }
     return response;

   }

 }

Once a ResponseCache like this one is installed, Java's HTTP protocol handler always uses it, even when it shouldn't. The cache code itself therefore needs to check the expiration dates on anything it has stored and watch out for Cache-Control header fields. The key value of concern is no-cache. If you see this string in a Cache-Control header field, it means any resource representation is valid only momentarily and any cached copy is likely to be out of date almost immediately, so you really shouldn't store it at all.
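The substring test in Example 15-15 would also match a directive that merely contains the text no-cache. A slightly stricter check can split the header value into its comma-separated directives first. The following is a sketch only: the CacheControlCheck class and forbidsCaching( ) method are hypothetical names, and treating no-store the same way as no-cache is an addition that goes beyond the example above. It also does not handle commas inside quoted directive arguments.

```java
import java.util.Locale;

// Hypothetical helper; the class and method names are assumptions
// made for this sketch and are not part of java.net.
class CacheControlCheck {

  // Returns true if a Cache-Control header value contains a directive
  // that should keep the response out of the cache.
  static boolean forbidsCaching(String cacheControl) {
    if (cacheControl == null) return false;
    for (String part : cacheControl.split(",")) {
      // Directive names are case-insensitive.
      String directive = part.trim().toLowerCase(Locale.ROOT);
      // Drop an optional argument, as in no-cache="Set-Cookie".
      int equals = directive.indexOf('=');
      if (equals >= 0) directive = directive.substring(0, equals).trim();
      if (directive.equals("no-cache") || directive.equals("no-store")) {
        return true;
      }
    }
    return false;
  }
}
```

In a put( ) method, the result of uc.getHeaderField("Cache-Control") could be passed to this method, with put( ) returning null whenever it answers true.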

Each retrieved resource stays in the map until it expires. This example waits for an expired document to be requested again before it deletes it from the cache. A more sophisticated implementation could use a low-priority thread to scan for expired documents and remove them to make way for others. Instead of or in addition to this, an implementation might cache the representations in a queue and remove the oldest documents or those closest to their expiration date as necessary to make room for new ones. An even more sophisticated implementation could track how often each document in the store was accessed and expunge only the oldest and least-used documents.
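One simple variation on these ideas is least-recently-used eviction, which is not quite the same as tracking access counts but needs very little code. The sketch below builds it on java.util.LinkedHashMap in access order with an overridden removeEldestEntry( ) method. The LruStore class and its methods are hypothetical names chosen for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded store that evicts the least recently used entry.
// The class name and its methods are assumptions made for this sketch.
class LruStore<K, V> {

  private final Map<K, V> map;

  LruStore(final int maxEntries) {
    // accessOrder = true keeps entries ordered from least to most
    // recently accessed; the wrapper makes the map safe to share.
    this.map = Collections.synchronizedMap(
        new LinkedHashMap<K, V>(16, 0.75f, true) {
          @Override
          protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            // Called after each insertion; true evicts the eldest entry.
            return size() > maxEntries;
          }
        });
  }

  V get(K key) {
    return map.get(key); // counts as an access
  }

  void put(K key, V value) {
    map.put(key, value);
  }

  boolean contains(K key) {
    return map.containsKey(key); // does not count as an access
  }

  int size() {
    return map.size();
  }
}
```

A cache built this way never has to refuse new entries when it is full; it simply drops whichever response has gone unused the longest.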

I've already mentioned that you could implement this on top of the filesystem instead of the Java Collections API. You could also store the cache in a database, and you could do a lot of less-common things as well. For instance, you could redirect requests for certain URLs to a local server rather than a remote server halfway around the world, in essence using a local web server as the cache. Or a ResponseCache could load a fixed set of files at launch time and then only serve those out of memory. This might be useful for a server that processes many different SOAP requests, all of which adhere to a few common schemas that can be stored in the cache. The abstract ResponseCache class is flexible enough to support all of these and other usage patterns.
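The fixed-set idea can be sketched roughly as follows. PreloadedCache is a hypothetical class name, and the headers it returns (a status line stored under the null key plus Content-Length) are an assumption about what a protocol handler is likely to want, mirroring the layout URLConnection.getHeaderFields( ) uses. A real implementation would store the headers that came with each document.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.net.CacheRequest;
import java.net.CacheResponse;
import java.net.ResponseCache;
import java.net.URI;
import java.net.URLConnection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical cache that serves a fixed set of documents from memory
// and never stores anything new.
class PreloadedCache extends ResponseCache {

  private final Map<URI, byte[]> documents;

  PreloadedCache(Map<URI, byte[]> documents) {
    // Copy the map so later changes by the caller have no effect.
    this.documents = new HashMap<URI, byte[]>(documents);
  }

  @Override
  public CacheResponse get(URI uri, String requestMethod,
      Map<String, List<String>> requestHeaders) {
    final byte[] data = documents.get(uri);
    // Only answer GET requests for documents that were preloaded.
    if (data == null || !"GET".equals(requestMethod)) return null;
    return new CacheResponse() {
      @Override
      public InputStream getBody() {
        return new ByteArrayInputStream(data);
      }

      @Override
      public Map<String, List<String>> getHeaders() {
        // Assumed minimal header set; the null key holds the status line.
        Map<String, List<String>> headers =
            new HashMap<String, List<String>>();
        headers.put(null, Collections.singletonList("HTTP/1.1 200 OK"));
        headers.put("Content-Length",
            Collections.singletonList(String.valueOf(data.length)));
        return Collections.unmodifiableMap(headers);
      }
    };
  }

  @Override
  public CacheRequest put(URI uri, URLConnection connection) {
    return null; // fixed set: decline to cache anything else
  }
}
```

Because put( ) always returns null, every URL outside the preloaded set is fetched from the network each time it is requested.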

Regrettably, Java only allows one cache at a time. To install a cache or retrieve the one currently installed, use the static ResponseCache.setDefault( ) and ResponseCache.getDefault( ) methods:

 public static ResponseCache getDefault( )
 public static void setDefault(ResponseCache responseCache)

These get and set the single cache used by all programs running within the same Java virtual machine. For example, this one line of code installs Example 15-15 in an application:

 ResponseCache.setDefault(new MemoryCache( )); 
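Because there is only one cache per virtual machine, code that installs a cache temporarily may want to remember the previous one and put it back afterward. Here is a minimal sketch of that pattern. The CacheInstaller class, its swap( ) method, and the do-nothing NullCache are all hypothetical names used for illustration.

```java
import java.net.CacheRequest;
import java.net.CacheResponse;
import java.net.ResponseCache;
import java.net.URI;
import java.net.URLConnection;
import java.util.List;
import java.util.Map;

// Hypothetical helper for swapping the system-wide cache.
class CacheInstaller {

  // Stand-in cache that never stores or returns anything.
  static class NullCache extends ResponseCache {
    @Override
    public CacheResponse get(URI uri, String requestMethod,
        Map<String, List<String>> requestHeaders) {
      return null; // always a cache miss
    }

    @Override
    public CacheRequest put(URI uri, URLConnection connection) {
      return null; // decline to cache
    }
  }

  // Installs the given cache and returns the one it replaced,
  // which is null if no cache was installed.
  static ResponseCache swap(ResponseCache replacement) {
    ResponseCache previous = ResponseCache.getDefault();
    ResponseCache.setDefault(replacement);
    return previous;
  }
}
```

Calling swap( ) a second time with the value it returned restores the earlier state; passing null to setDefault( ) removes the cache altogether.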



Java Network Programming, Third Edition
ISBN: 0596007213
Year: 2003
Pages: 164
