8.4 Cookies


Cookies are an atrocious hack perpetrated on the browsing world by Netscape. They are completely contrary to the web architecture. They attempt to graft state onto the deliberately stateless HTTP protocol. Statelessness in HTTP was not a mistake or a design flaw. It was a deliberate design decision that helped the Web scale to the enormous size it's reached today.

On the server side, cookies are never necessary and always a bad idea. There is always a cleaner, simpler, more scalable solution that does not involve cookies. Sadly, a lot of server-side developers don't know this and go blindly forward developing web sites that require client-side developers to support cookies.

Prior to Java 1.5, cookies can be supported only by direct manipulation of the HTTP header. When a server sets a cookie, it includes a Set-Cookie field like this one in the HTTP header:

 Set-Cookie: user=elharo 

This sends the browser a cookie with the name "user" and the value "elharo". The value of this field is limited to the printable ASCII characters (because HTTP header fields are limited to the printable ASCII characters). Furthermore, the names may not contain commas, semicolons, or whitespace.
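The two restrictions just described can be checked mechanically. The following is only a minimal sketch; the class name CookieSyntax and its method names are invented for illustration, and "printable ASCII" is assumed to mean the characters 0x20 through 0x7E.

```java
final class CookieSyntax {

    // True if every character is printable ASCII (assumed here to be
    // the range 0x20 through 0x7E inclusive).
    static boolean isPrintableAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x20 || c > 0x7E) return false;
        }
        return true;
    }

    // True if the name is non-empty, printable ASCII, and contains
    // no commas, semicolons, or whitespace.
    static boolean isLegalName(String name) {
        if (name.length() == 0 || !isPrintableAscii(name)) return false;
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == ',' || c == ';' || Character.isWhitespace(c)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isLegalName("user"));
        System.out.println(isLegalName("user name"));
    }
}
```

A client could run a check like this before storing a cookie and simply discard anything that fails.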

A later version of the spec, RFC 2965, uses a Set-Cookie2 HTTP header instead. The most obvious difference is that this version of the cookie spec requires a version attribute after the name=value pair, like so:

 Set-Cookie2: user=elharo; Version=1 

The Version attribute simply indicates the version of the cookie spec in use. Version 1 and the unmarked original version zero are the only ones currently defined. Some servers will send both Set-Cookie and Set-Cookie2 headers. If so, the value in Set-Cookie2 takes precedence if a client understands both. Set-Cookie2 also allows cookie values to be quoted so they can contain internal whitespace. For example, this sets the cookie with the name food and the value "chocolate ice cream".

 Set-Cookie2: food="chocolate ice cream"; Version=1 

The quotes are just delimiters. They are not part of the attribute value. However, the attribute values are still limited to printable ASCII characters.
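Stripping the delimiting quotes is a one-line job. This sketch assumes the value has already been separated from its name; the class and method names are hypothetical, and escaped quotes inside the value are not handled.

```java
final class CookieValues {

    // Remove one pair of surrounding double quotes, if present.
    // Values that are not quoted are returned unchanged.
    static String unquote(String value) {
        if (value.length() >= 2
                && value.startsWith("\"") && value.endsWith("\"")) {
            return value.substring(1, value.length() - 1);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(unquote("\"chocolate ice cream\""));
    }
}
```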

When requesting a document from the same server, the client echoes that cookie back in a Cookie header field in the request it sends to the server:

 Cookie: user=elharo 

If the original cookie was set by Set-Cookie2, this begins with a $Version attribute:

 Cookie: $Version=1;user=elharo 

The $ sign helps distinguish between cookie attributes and the main cookie name=value pair.

The client's job is simply to keep track of all the cookies it's been sent, and send the right ones back to the original servers at the right time. However, this is a little more complicated because cookies can have attributes identifying the expiration date, path, domain, port, version, and security options.

For example, by default a cookie applies to the server it came from. If a cookie is originally set by www.foo.example.com, the browser will only send the cookie back to www.foo.example.com. However, a site can also indicate that a cookie applies within an entire subdomain, not just at the original server. For example, this response sets a user cookie for the entire foo.example.com domain:

 Set-Cookie: user=elharo;Domain=.foo.example.com 

The browser will echo this cookie back not just to www.foo.example.com but also to lothar.foo.example.com, eliza.foo.example.com, enoch.foo.example.com, and any other host somewhere in the foo.example.com domain. However, a server can only set cookies for domains it immediately belongs to. www.foo.example.com cannot set a cookie for www.oreilly.com, example.com, or .com, no matter how it sets the domain. (In practice, there have been a number of holes and workarounds for this, with severe negative impacts on user privacy.)
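Both halves of the domain rule can be sketched in a few lines: which hosts a cookie is sent back to, and which domains a server is allowed to claim. The class CookieDomains and its methods are hypothetical names, and the "immediately belongs to" test is modeled simply as "the host minus its first label", which is an assumption rather than the full rule from the specification.

```java
import java.util.Locale;

final class CookieDomains {

    // Should a cookie scoped to cookieDomain be sent to this host?
    // A leading dot means "this domain and every host inside it".
    static boolean domainMatches(String host, String cookieDomain) {
        String h = host.toLowerCase(Locale.ROOT);
        String d = cookieDomain.toLowerCase(Locale.ROOT);
        if (d.startsWith(".")) {
            return h.endsWith(d) || h.equals(d.substring(1));
        }
        return h.equals(d);
    }

    // May the server at this host set a cookie for cookieDomain?
    // Only its own name or its immediate parent domain is accepted.
    static boolean maySet(String host, String cookieDomain) {
        String h = host.toLowerCase(Locale.ROOT);
        String d = cookieDomain.toLowerCase(Locale.ROOT);
        if (h.equals(d)) return true;
        if (!d.startsWith(".")) d = "." + d;
        int dot = h.indexOf('.');
        return dot != -1 && h.substring(dot).equals(d);
    }

    public static void main(String[] args) {
        System.out.println(domainMatches("lothar.foo.example.com", ".foo.example.com"));
        System.out.println(maySet("www.foo.example.com", ".example.com"));
    }
}
```

Because the comparison includes the leading dot, a host such as badfoo.example.com does not match .foo.example.com even though the two strings share a suffix.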

If the cookie was set by Set-Cookie2, the client will include the domain that was originally set, like so:

 Cookie: $Version=1; user=elharo;  $Domain=.foo.example.com 

However, if it's a version zero cookie, the domain is not echoed back.

Beyond domains, cookies are scoped by path, so they're used for some directories on the server, but not all. The default scope is the original URL and any subdirectories. For instance, if a cookie is set for the URL http://www.cafeconleche.org/XOM/, the cookie also applies in http://www.cafeconleche.org/XOM/apidocs/, but not in http://www.cafeconleche.org/slides/ or http://www.cafeconleche.org/. However, the default scope can be changed using a Path attribute in the cookie. For example, this next response sends the browser a cookie with the name "user" and the value "elharo" that applies only within the server's /restricted subtree, not on the rest of the site:

 Set-Cookie2: user=elharo; Version=1; Path=/restricted

When requesting a document in the subtree /restricted from the same server, the client echoes that cookie back. However, it does not use the cookie in other directories on the site. Again, if and only if the cookie was originally set with Set-Cookie2, the client will include the Path that was originally set, like so:

 Cookie: $Version=1; user=elharo; $Path=/restricted
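Path scoping comes down to a prefix comparison plus a rule for the default path. The sketch below uses hypothetical names (CookiePaths, pathMatches, defaultPath). It also assumes that a match must end on a directory boundary, so /restricted does not cover /restrictedarea; that is stricter than a plain startsWith( ) test.

```java
import java.net.URI;

final class CookiePaths {

    // Does a request for requestPath fall inside cookiePath?
    static boolean pathMatches(String requestPath, String cookiePath) {
        if (requestPath == null || requestPath.length() == 0) requestPath = "/";
        if (!requestPath.startsWith(cookiePath)) return false;
        if (requestPath.length() == cookiePath.length()) return true;
        if (cookiePath.endsWith("/")) return true;
        // The character right after the prefix must start a new path segment.
        return requestPath.charAt(cookiePath.length()) == '/';
    }

    // Default scope: the directory of the URL that set the cookie.
    static String defaultPath(URI uri) {
        String path = uri.getPath();
        if (path == null || path.length() == 0) return "/";
        int lastSlash = path.lastIndexOf('/');
        return lastSlash <= 0 ? "/" : path.substring(0, lastSlash + 1);
    }

    public static void main(String[] args) {
        URI setter = URI.create("http://www.cafeconleche.org/XOM/");
        String scope = defaultPath(setter);
        System.out.println(scope);
        System.out.println(pathMatches("/XOM/apidocs/", scope));
        System.out.println(pathMatches("/slides/", scope));
    }
}
```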

A cookie can include both a domain and a path. For instance, this cookie applies in the /restricted path on any servers within the example.com domain:

 Set-Cookie2: user=elharo; Version=1; Path=/restricted; Domain=.example.com

The order of the different cookie attributes doesn't matter, as long as they're all separated by semicolons and the cookie's own name and value come first. However, this isn't true when the client is sending the cookie back to the server. In this case, the path must precede the domain, like so:

 Cookie: $Version=1;user=elharo; $Path=/restricted;$Domain=.foo.example.com 

A version zero cookie can be set to expire at a certain point in time by setting the expires attribute to a date in the form Wdy, DD-Mon-YYYY HH:MM:SS GMT. Weekday and month are given as three-letter abbreviations. The rest are numeric, padded with initial zeros if necessary. In the pattern language used by java.text.SimpleDateFormat, this is "E, dd-MMM-yyyy k:m:s 'GMT'". For instance, this cookie expires at 3:23 P.M. on December 21, 2005:

 Set-Cookie: user=elharo; expires=Wed, 21-Dec-2005 15:23:00 GMT 

The browser should remove this cookie from its cache after that date has passed.
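Here is one way the expires attribute might be parsed and checked. Treat it as a sketch: the class CookieDates is hypothetical, and it deliberately departs from the pattern quoted above in three ways. It uses HH:mm:ss rather than k:m:s (k counts hours from 1 to 24), it pins the locale to Locale.US so the English abbreviations parse everywhere, and it sets the time zone to GMT, because a quoted 'GMT' in a pattern is only literal text and does not change the zone used for parsing.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class CookieDates {

    private static final String PATTERN = "EEE, dd-MMM-yyyy HH:mm:ss 'GMT'";

    // SimpleDateFormat is not thread-safe, so build a fresh one per call.
    private static SimpleDateFormat newFormat() {
        SimpleDateFormat format = new SimpleDateFormat(PATTERN, Locale.US);
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        return format;
    }

    static Date parseExpires(String value) {
        try {
            return newFormat().parse(value);
        } catch (ParseException ex) {
            throw new IllegalArgumentException("Bad expires date: " + value);
        }
    }

    static String formatExpires(Date date) {
        return newFormat().format(date);
    }

    static boolean isExpired(Date expires, Date now) {
        return now.after(expires);
    }

    public static void main(String[] args) {
        Date expires = parseExpires("Wed, 21-Dec-2005 15:23:00 GMT");
        System.out.println(formatExpires(expires));
        System.out.println(isExpired(expires, new Date()));
    }
}
```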

Set-Cookie2 uses a Max-Age attribute that sets the cookie to expire after a certain number of seconds have passed instead of at a specific moment. For instance, this cookie expires one hour (3,600 seconds) after it's first set:

 Set-Cookie2: user="elharo"; Version=1; Max-Age=3600

The browser should remove this cookie from its cache after this amount of time has elapsed.
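Since Max-Age is relative, a client has to remember when the cookie arrived and do the arithmetic itself. A minimal sketch, with hypothetical class and method names and with all times given in milliseconds since the epoch:

```java
import java.util.Date;

final class MaxAgeExpiry {

    // Absolute expiry time for a cookie received at receivedAtMillis.
    static Date expiryFor(long receivedAtMillis, long maxAgeSeconds) {
        return new Date(receivedAtMillis + maxAgeSeconds * 1000L);
    }

    // A cookie is treated as expired once its lifetime has fully elapsed,
    // so Max-Age=0 is expired immediately.
    static boolean isExpired(long receivedAtMillis, long maxAgeSeconds,
                             long nowMillis) {
        return nowMillis >= receivedAtMillis + maxAgeSeconds * 1000L;
    }

    public static void main(String[] args) {
        long received = System.currentTimeMillis();
        System.out.println(expiryFor(received, 3600));
        System.out.println(isExpired(received, 3600, received + 60000L));
    }
}
```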

Because cookies can contain sensitive information such as passwords and session keys, some cookie transactions should be secure. Exactly what secure means in this context is not specified. Most of the time, it means using HTTPS instead of HTTP, but whatever it means, each cookie can have a secure attribute with no value, like so:

 Set-Cookie: key=etrogl7*;Domain=.foo.example.com; secure 

Browsers are supposed to refuse to send such cookies over insecure channels.
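On the client side this rule is a single test before a cookie is attached to a request. The sketch below assumes, as most clients do, that "secure" means the https scheme; the class and method names are hypothetical.

```java
import java.net.URI;

final class SecureCookies {

    // A cookie without the secure attribute may go anywhere. One with it
    // is sent only when the target URI uses https (an assumption about
    // what "secure" means, since the spec leaves it open).
    static boolean maySend(boolean secureCookie, URI target) {
        if (!secureCookie) return true;
        return "https".equalsIgnoreCase(target.getScheme());
    }

    public static void main(String[] args) {
        System.out.println(maySend(true, URI.create("http://www.foo.example.com/")));
        System.out.println(maySend(true, URI.create("https://www.foo.example.com/")));
    }
}
```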

Finally, in addition to path, domain, and time, version 1 cookies can be scoped by port. This isn't common, but clients are required to support it. The Port attribute contains a quoted, comma-separated list of the port numbers to which the cookie applies:

 Set-Cookie2: user=elharo; Version=1; Path=/restricted; Port="8080,8000"

When the client returns the cookie, the order is always path, domain, and port, like so:

 Cookie: $Version=1; user=elharo; $Path=/restricted; $Domain=.foo.example.com; $Port="8080,8000"
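The fixed ordering is easy to get right if the header is always assembled by one routine. This is a sketch only; the class CookieHeaders and its method are hypothetical, and any attribute passed as null is simply left out.

```java
final class CookieHeaders {

    // Build the value of a version 1 Cookie request header. The order is
    // $Version, then name=value, then $Path, $Domain, and $Port.
    static String toCookieHeader(String name, String value,
                                 String path, String domain, String portList) {
        StringBuilder sb = new StringBuilder("$Version=1; ");
        sb.append(name).append('=').append(value);
        if (path != null) sb.append("; $Path=").append(path);
        if (domain != null) sb.append("; $Domain=").append(domain);
        if (portList != null) {
            sb.append("; $Port=\"").append(portList).append('"');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("Cookie: " + toCookieHeader(
            "user", "elharo", "/restricted", ".foo.example.com", null));
    }
}
```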

Multiple cookies can be set in one header by separating the name-value pairs with commas. For example, this Set-Cookie header assigns the cookie named user the value "elharo" and the cookie named zip the value "10003":

 Set-Cookie: user=elharo, zip=10003 

Each cookie set in this way can also contain attributes. For example, this Set-Cookie header scopes the user cookie to the path /restricted and the zip cookie to the path /weather:

 Set-Cookie: user=elharo; path=/restricted, zip=10003; path=/weather 
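Splitting such a header is trickier than calling split(","), because a version zero expires date contains a comma of its own. One possible approach, sketched here with hypothetical names, is to split only on commas that are followed by something shaped like name=. It assumes cookie values are not quoted strings containing commas.

```java
import java.util.ArrayList;
import java.util.List;

final class SetCookieSplitter {

    // Split only on commas followed by a "name=" token. The comma inside
    // "expires=Wed, 21-Dec-2005 ..." is followed by a day number, not a
    // name=value pair, so it is left alone.
    private static final String COOKIE_SEPARATOR = ",(?=\\s*[^;,=\\s]+=)";

    static List<String> split(String headerValue) {
        List<String> cookies = new ArrayList<String>();
        for (String part : headerValue.split(COOKIE_SEPARATOR)) {
            String trimmed = part.trim();
            if (trimmed.length() > 0) cookies.add(trimmed);
        }
        return cookies;
    }

    public static void main(String[] args) {
        String header = "user=elharo; path=/restricted, zip=10003; path=/weather";
        for (String cookie : split(header)) {
            System.out.println(cookie);
        }
    }
}
```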

I've left out a couple of less important details like comments that don't matter much in practice. If you're interested, complete details are available in RFC 2965, HTTP State Management Mechanism.

That's how cookies work behind the scenes. In theory, this is all transparent to the user. In practice, the most sophisticated users routinely disable, filter, or inspect cookies to protect their privacy and security so cookies are not guaranteed to work.

Let's wrap this all up in a neat class called Cookie, shown in Example 8-11, with appropriate getter methods for the relevant properties and a factory method that parses HTTP header fields that set cookies. We'll need this in a minute because even as of Java 1.5 there's nothing like this in the standard JDK.

Example 8-11. A cookie class
package com.macfaq.http;

import java.net.URI;
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class Cookie {

  private String  version = "0";
  private String  name;
  private String  value;
  private URI     uri;
  private String  domain;
  private Date    expires;
  private String  path;
  private boolean secure = false;

  private static DateFormat expiresFormat
   = new SimpleDateFormat("E, dd-MMM-yyyy k:m:s 'GMT'");

  static {
    // the quoted 'GMT' in the pattern is only literal text, so the
    // time zone has to be set explicitly
    expiresFormat.setTimeZone(TimeZone.getTimeZone("GMT"));
  }

  // prevent instantiation
  private Cookie( ) {}

  public static Cookie bake(String header, URI uri)
   throws CookieException {

    try {
      String[] attributes = header.split(";");
      String nameValue = attributes[0].trim( );
      Cookie cookie = new Cookie( );
      cookie.uri = uri;
      cookie.name = nameValue.substring(0, nameValue.indexOf('='));
      cookie.value = nameValue.substring(nameValue.indexOf('=')+1);
      cookie.path = "/";
      cookie.domain = uri.getHost( );

      if (attributes[attributes.length-1].trim( ).equals("secure")) {
        cookie.secure = true;
      }

      for (int i=1; i < attributes.length; i++) {
        nameValue = attributes[i].trim( );
        int equals = nameValue.indexOf('=');
        if (equals == -1) continue;
        String attributeName = nameValue.substring(0, equals);
        String attributeValue = nameValue.substring(equals+1);

        if (attributeName.equalsIgnoreCase("domain")) {
          String uriDomain = uri.getHost( );
          if (uriDomain.equals(attributeValue)) {
            cookie.domain = attributeValue;
          }
          else {
            if (!attributeValue.startsWith(".")) {
              attributeValue = "." + attributeValue;
            }
            // a server may only set cookies for its immediate parent domain
            uriDomain = uriDomain.substring(uriDomain.indexOf('.'));
            if (!uriDomain.equals(attributeValue)) {
              throw new CookieException(
               "Server tried to set cookie in another domain");
            }
            cookie.domain = attributeValue;
          }
        }
        else if (attributeName.equalsIgnoreCase("path")) {
          cookie.path = attributeValue;
        }
        else if (attributeName.equalsIgnoreCase("expires")) {
          cookie.expires = expiresFormat.parse(attributeValue);
        }
        else if (attributeName.equalsIgnoreCase("Version")) {
          if (!"1".equals(attributeValue)) {
            throw new CookieException("Unexpected version " + attributeValue);
          }
          cookie.version = attributeValue;
        }
      }
      return cookie;
    }
    catch (CookieException ex) {
      // already describes the problem; don't wrap it a second time
      throw ex;
    }
    catch (Exception ex) {
      // ParseException, StringIndexOutOfBoundsException etc.
      throw new CookieException(ex);
    }

  }

  public boolean isExpired( ) {
    if (expires == null) return false;
    Date now = new Date( );
    return now.after(expires);
  }

  public String getName( ) {
    return name;
  }

  public boolean isSecure( ) {
    return secure;
  }

  public URI getURI( ) {
    return uri;
  }

  public String getVersion( ) {
    return version;
  }

  // should this cookie be sent when retrieving the specified URI?
  public boolean matches(URI u) {

    if (isExpired( )) return false;

    // check the host as well as the path so that a cookie is
    // never sent to a server outside the domain that set it
    String host = u.getHost( );
    if (host == null) return false;
    if (domain.startsWith(".")) {
      if (!host.endsWith(domain)
       && !host.equals(domain.substring(1))) return false;
    }
    else if (!host.equalsIgnoreCase(domain)) return false;

    String path = u.getPath( );
    if (path == null || path.length( ) == 0) path = "/";
    return path.startsWith(this.path);

  }

  public String toExternalForm( ) {
    StringBuffer result = new StringBuffer(name);
    result.append("=");
    result.append(value);
    if ("1".equals(version)) {
      result.append(" Version=1");
    }
    return result.toString( );
  }

}
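Example 8-11 relies on a CookieException class that isn't shown in this section. Judging only from how it's used above (one constructor taking a message, one taking the exception that caused the failure), a stand-in might look like the following. This is an assumption about its shape, not the author's actual class, and in a real project it would live in the same package as Cookie.

```java
// Hypothetical stand-in for the CookieException used by Example 8-11.
class CookieException extends Exception {

    // Used when the header is well formed but breaks a cookie rule.
    CookieException(String message) {
        super(message);
    }

    // Used to wrap a lower-level failure such as a ParseException.
    CookieException(Throwable cause) {
        super(cause);
    }
}
```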

Prior to Java 1.5, the only way to support cookies is by direct inspection of the relevant HTTP headers. The URL class does not support this, but the URLConnection class introduced in Chapter 15 does. Java 1.5 adds a new java.net.CookieHandler class that makes this process somewhat easier. You provide a subclass of this abstract class where Java will store all cookies retrieved through the HTTP protocol handler. Once you've done this, when you access an HTTP server through a URL object and the server sends a cookie, Java automatically puts it in the system default cookie handler. When the same VM instance goes back to that server, it sends the cookie.

I'm writing this section based on betas of Java 1.5. While the information about how cookies are handled in HTTP should be accurate, it's entirely possible a few of the Java details may change by the time Java 1.5 is released. Be sure to compare what you read here with the latest documentation from Sun.

The CookieHandler class is summarized in Example 8-12. As you can see, there are two abstract methods to implement, get( ) and put( ). When Java loads a URL from a server that sets a cookie, it passes the URI it was loading and the complete HTTP headers of the server response to the put( ) method. The handler can parse the details out of these headers and store them somewhere. When Java tries to load an HTTP URL from a server, it passes the URI and the request HTTP header to the get( ) method to see if there are any cookies in the store for that URI. Sadly, you have to implement the parsing and storage code yourself. CookieHandler is an abstract class that does not do this for you, even though it's pretty standard stuff.

Example 8-12. CookieHandler
package java.net;

public abstract class CookieHandler {

  public CookieHandler( )

  public abstract Map<String,List<String>> get(
   URI uri, Map<String,List<String>> requestHeaders)
   throws IOException
  public abstract void put(
   URI uri, Map<String,List<String>> responseHeaders)
   throws IOException

  public static CookieHandler getDefault( )
  public static void          setDefault(CookieHandler handler)

}
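To see the contract in action before writing a real store, it can help to install a handler that does nothing but record what it's handed. The class below is a hypothetical illustration, not part of the JDK: it collects the raw values of any Set-Cookie or Set-Cookie2 fields passed to put( ) and never returns any cookies from get( ).

```java
import java.net.CookieHandler;
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class RecordingCookieHandler extends CookieHandler {

    // Raw header values seen so far, in arrival order.
    final List<String> seen = new ArrayList<String>();

    // This handler never sends cookies back.
    @Override
    public Map<String, List<String>> get(URI uri,
            Map<String, List<String>> requestHeaders) {
        return Collections.emptyMap();
    }

    // Record every Set-Cookie and Set-Cookie2 value. The map may contain
    // a null key (for the status line), so the key is checked first.
    @Override
    public void put(URI uri, Map<String, List<String>> responseHeaders) {
        for (Map.Entry<String, List<String>> entry : responseHeaders.entrySet()) {
            String key = entry.getKey();
            if (key != null && (key.equalsIgnoreCase("Set-Cookie")
                    || key.equalsIgnoreCase("Set-Cookie2"))) {
                seen.addAll(entry.getValue());
            }
        }
    }

    public static void main(String[] args) {
        RecordingCookieHandler handler = new RecordingCookieHandler();
        CookieHandler.setDefault(handler);
        System.out.println(CookieHandler.getDefault() == handler);
    }
}
```

Once the handler is installed with setDefault( ), any HTTP connection opened through the URL class in the same VM should pass its response headers through put( ).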

A subclass is most easily implemented by delegating the hard work to the Java Collections API, as Example 8-13 demonstrates. Since CookieHandler is only available in Java 1.5 anyway, I took the opportunity to show off some new features of Java 1.5, including generic types and enhanced for loops. This implementation limits itself to version 0 cookies, which are far and away the most common kind you'll find in practice. If version 1 cookies ever achieve broad adoption, it should be easy to extend these classes to support them.

Example 8-13. A CookieHandler implemented on top of the Java Collections API
package com.macfaq.http;

import java.io.IOException;
import java.net.*;
import java.util.*;

public class CookieStore extends CookieHandler {

  private List<Cookie> store = new ArrayList<Cookie>( );

  public Map<String,List<String>> get(URI uri,
   Map<String,List<String>> requestHeaders)
   throws IOException {

    Map<String,List<String>> result = new HashMap<String,List<String>>( );
    StringBuffer cookies = new StringBuffer( );
    // an explicit Iterator lets expired cookies be removed during the
    // loop without triggering a ConcurrentModificationException
    Iterator<Cookie> iterator = store.iterator( );
    while (iterator.hasNext( )) {
      Cookie cookie = iterator.next( );
      if (cookie.isExpired( )) {
        iterator.remove( );
      }
      else if (cookie.matches(uri)) {
        // version 0 cookies are separated by semicolons in the request
        if (cookies.length( ) != 0) cookies.append("; ");
        cookies.append(cookie.toExternalForm( ));
      }
    }

    if (cookies.length( ) > 0) {
      List<String> temp = new ArrayList<String>(1);
      temp.add(cookies.toString( ));
      result.put("Cookie", temp);
    }

    return result;

  }

  public void put(URI uri, Map<String,List<String>> responseHeaders)
   throws IOException {

    List<String> setCookies = responseHeaders.get("Set-Cookie");
    // most responses don't set any cookies at all
    if (setCookies == null) return;

    for (String next : setCookies) {
      try {
        Cookie cookie = Cookie.bake(next, uri);
        // Is a cookie with this name and URI already in the list?
        // If so, we replace it
        for (Cookie existingCookie : store) {
          if (cookie.getURI( ).equals(existingCookie.getURI( )) &&
           cookie.getName( ).equals(existingCookie.getName( ))) {
            // safe only because we leave the loop immediately
            store.remove(existingCookie);
            break;
          }
        }
        store.add(cookie);
      }
      catch (CookieException ex) {
        // Server sent malformed header;
        // log and ignore
        System.err.println(ex);
      }
    }

  }

}

When storing a cookie, the responseHeaders argument to the put( ) method contains the complete HTTP response header sent by the server. From this you need to extract any header fields that set cookies (basically, just Set-Cookie and Set-Cookie2). The key to this map is the field name (Set-Cookie or Set-Cookie2). The value of the map entry is a list of the cookies set in that field, with each cookie a separate member of the list. That is, Java divides the header field value along the commas to split up several cookies and passes each one in as a separate entry in the list.

In the other direction, when getting a cookie it's necessary to consider not only the URI but the path for which the cookie is valid. Here, that check is delegated to the Cookie class itself via the matches( ) method. This is hardly the most efficient implementation possible. For each cookie, the store does a linear search through all available cookies. A more intelligent implementation would index the list by URIs and domains, but for simple purposes this solution suffices without being overly complex. A more serious limitation is that this store is not persistent. It lasts only until the driving program exits. Most web browsers would want to store the cookies in a file so they could be reloaded when the browser was relaunched. Nonetheless, this class is sufficient to add basic cookie support to the simple web browser. All that's required is to add this one line at the beginning of the main( ) method in Example 8-5:

 CookieHandler.setDefault(new com.macfaq.http.CookieStore( )); 
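Readers on Java 6 or later have another option that postdates the Java 1.5 betas this section was written against: the JDK now ships a concrete handler, java.net.CookieManager, along with java.net.HttpCookie for parsing. The sketch below feeds a header to a CookieManager by hand rather than over the network; the wrapper class and method names are hypothetical, but CookieManager, its put( ) method, and getCookieStore( ).getCookies( ) are standard API.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.HttpCookie;
import java.net.URI;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BuiltInCookieDemo {

    // Hand one Set-Cookie value to a fresh CookieManager, as if it had
    // arrived in a response from uri, and return what the manager stored.
    static List<HttpCookie> storeAndList(URI uri, String setCookieValue) {
        CookieManager manager = new CookieManager();
        Map<String, List<String>> headers = new HashMap<String, List<String>>();
        headers.put("Set-Cookie", Collections.singletonList(setCookieValue));
        try {
            manager.put(uri, headers);
        } catch (IOException ex) {
            throw new IllegalStateException(ex);
        }
        return manager.getCookieStore().getCookies();
    }

    public static void main(String[] args) {
        URI uri = URI.create("http://www.foo.example.com/");
        for (HttpCookie cookie : storeAndList(uri, "user=elharo")) {
            System.out.println(cookie.getName() + "=" + cookie.getValue());
        }
    }
}
```

In a real program the equivalent of the one-line installation above would be CookieHandler.setDefault(new CookieManager( )).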

Java Network Programming, Third Edition
ISBN: 0596007213
Year: 2003
Pages: 164
