16.2 The URLStreamHandler Class

     

The abstract URLStreamHandler class is a superclass for classes that handle specific protocols, for example, HTTP. You rarely call the methods of the URLStreamHandler class directly; they are called by other methods in the URL and URLConnection classes. By overriding the URLStreamHandler methods in your own subclass, you teach the URL class how to handle new protocols. Therefore, I'll focus on overriding the methods of URLStreamHandler rather than calling them.

16.2.1 The Constructor

You do not create URLStreamHandler objects directly. Instead, when a URL is constructed with a protocol that hasn't been seen before, Java asks the application's URLStreamHandlerFactory to create the appropriate URLStreamHandler subclass for the protocol. If that fails, Java guesses at the fully package-qualified name of the URLStreamHandler class and uses Class.forName( ) to attempt to construct such an object. This means each concrete subclass should have a no-args constructor. The single constructor for URLStreamHandler doesn't take any arguments:

 public URLStreamHandler( ) 

Because URLStreamHandler is an abstract class, this constructor is never called directly; it is only called from the constructors of subclasses.
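To make the factory route a little more concrete, here is a minimal sketch of a URLStreamHandlerFactory that supplies a handler for a single protocol and declines all others by returning null. The protocol name demo and the class names DemoHandler and DemoHandlerFactory are hypothetical; they are invented for this illustration and are not part of the book's examples.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.URLStreamHandlerFactory;

// Hypothetical handler for an invented "demo" protocol. It relies on the
// implicit no-args constructor, which in turn calls URLStreamHandler().
class DemoHandler extends URLStreamHandler {
  @Override
  protected URLConnection openConnection(URL u) throws IOException {
    throw new IOException("demo: connections are not implemented in this sketch");
  }
}

// Hypothetical factory: it knows one protocol and declines everything else.
class DemoHandlerFactory implements URLStreamHandlerFactory {
  @Override
  public URLStreamHandler createURLStreamHandler(String protocol) {
    if ("demo".equalsIgnoreCase(protocol)) {
      return new DemoHandler();
    }
    return null; // not ours; Java looks for a handler some other way
  }
}
```

An application would install such a factory once with URL.setURLStreamHandlerFactory(new DemoHandlerFactory()). That method can be called at most once per virtual machine.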

16.2.2 Methods for Parsing URLs

The first responsibility of a URLStreamHandler is to split a string representation of a URL into its component parts and use those parts to set the various fields of the URL object. The parseURL( ) method splits the URL into parts, possibly using setURL( ) to assign values to the URL's fields. It is very difficult to imagine a situation in which you would call parseURL( ) directly; instead, you override it to change the behavior of the URL class.

16.2.2.1 protected void parseURL(URL u, String spec, int start, int limit)

This method parses the String spec into a URL u. All characters in the spec string before start should already have been parsed into the URL u. Characters after limit are ignored. Generally, the protocol will have already been parsed and stored in u before this method is invoked, and start will be adjusted so that it starts with the character after the colon that delimits the protocol.

The task of parseURL( ) is to set u's protocol, host, port, file, and ref fields. It can assume that any parts of the String that are before start and after limit have already been parsed or can be ignored.

The parseURL( ) method that Java supplies assumes that the URL looks more or less like an http or other hierarchical URL:

 protocol://www.host.com:port/directory/another_directory/file#fragmentID 

This works for ftp and gopher URLs. It does not work for mailto or news URLs and may not be appropriate for any new URL schemes you define. If the protocol handler uses URLs that fit this hierarchical form, you don't have to override parseURL( ) at all; the method inherited from URLStreamHandler works just fine. If the URLs are completely different, you must supply a parseURL( ) method that parses the URL completely. However, there's often a middle ground that can make your task easier. If your URL looks somewhat like a standard URL, you can implement a parseURL( ) method that handles the nonstandard portion of the URL and then calls super.parseURL( ) to do the rest of the work, setting the start and limit arguments to indicate the portion of the URL that you didn't parse.
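To see the inherited parseURL( ) at work on a hierarchical URL, the following sketch defines a handler that overrides nothing but openConnection( ) and passes it directly to the URL constructor, so no factory is needed. The demo protocol, the example.com host, and the class names are all made up for this illustration. The three-argument URL constructor used here is deprecated in recent Java releases but is still available.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

// Hypothetical handler that inherits parseURL() unchanged.
class HierarchicalHandler extends URLStreamHandler {
  @Override
  protected URLConnection openConnection(URL u) throws IOException {
    throw new IOException("not implemented in this sketch");
  }
}

class DefaultParseDemo {

  // Builds a URL with an explicit handler; wraps the checked exception.
  static URL parse(String spec) {
    try {
      return new URL(null, spec, new HierarchicalHandler());
    } catch (MalformedURLException ex) {
      throw new IllegalArgumentException(ex);
    }
  }

  public static void main(String[] args) {
    URL u = parse("demo://www.example.com:8080/dir/file.txt#part2");
    System.out.println("protocol: " + u.getProtocol()); // demo
    System.out.println("host:     " + u.getHost());     // www.example.com
    System.out.println("port:     " + u.getPort());     // 8080
    System.out.println("file:     " + u.getFile());     // /dir/file.txt
    System.out.println("ref:      " + u.getRef());      // part2
  }
}
```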

For example, a mailto URL looks like mailto:elharo@metalab.unc.edu. First, you need to figure out how to map this into the URL class's protocol, host, port, file, and ref fields. The protocol is clearly mailto. Everything after the @ can be the host. The hard question is what to do with the username. Since a mailto URL really doesn't have a file portion, we will use the URL class's file field to hold the username. The ref can be set to the empty string or null. The parseURL( ) method that follows implements this scheme:

 public void parseURL(URL u, String spec, int start, int limit) {

   String protocol = u.getProtocol( );
   String host = "";
   int port = u.getPort( );
   String file = ""; // really username
   String fragmentID = null;
   if (start < limit) {
     String address = spec.substring(start, limit);
     int atSign = address.indexOf('@');
     if (atSign >= 0) {
       host = address.substring(atSign+1);
       file = address.substring(0, atSign);
     }
   }
   this.setURL(u, protocol, host, port, file, fragmentID);
 }

Rather than borrowing an unused field from the URL object, it's possibly a better idea to store protocol-specific parts of the URL, such as the username, in fields of the URLStreamHandler subclass. The disadvantage of this approach is that such fields can be seen only by your own code; in this example, you couldn't use the getFile( ) method in the URL class to retrieve the username. Also, because Java normally reuses a single handler object for every URL with a given protocol, a field like this holds only the most recently parsed value. Here's a version of parseURL( ) that stores the username in a field of the Handler subclass. When the connection is opened, the username can be copied into the MailtoURLConnection object that results. That class would provide some sort of getUserName( ) method:

 String username = "";

 public void parseURL(URL u, String spec, int start, int limit) {

   String protocol = u.getProtocol( );
   String host = "";
   int port = u.getPort( );
   String file = "";
   String fragmentID = null;
   if (start < limit) {
     String address = spec.substring(start, limit);
     int atSign = address.indexOf('@');
     if (atSign >= 0) {
       host = address.substring(atSign+1);
       this.username = address.substring(0, atSign);
     }
   }
   this.setURL(u, protocol, host, port, file, fragmentID);
 }

16.2.2.2 protected String toExternalForm(URL u)

This method puts the pieces of the URL u (that is, its protocol, host, port, file, and ref fields) back together in a String. A class that overrides parseURL( ) should also override toExternalForm( ). Here's a toExternalForm( ) method for a mailto URL; it assumes that the username has been stored in the URL's file field:

 protected String toExternalForm(URL u) {
   return "mailto:" + u.getFile( ) + "@" + u.getHost( );
 }

Since toExternalForm( ) is protected, you probably won't call this method directly. However, it is called by the public toExternalForm( ) and toString( ) methods of the URL class, so any change you make here is reflected when you convert URL objects to strings.
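The following sketch puts parseURL( ) and toExternalForm( ) together in one handler and checks that a mailto URL survives the round trip from string to URL and back. It stores the username in the file field, as described above, but calls the nine-argument setURL( ) to avoid the deprecated six-argument form. The class names MailtoFileHandler and MailtoRoundTrip are hypothetical, and passing the handler straight to the URL constructor is only a convenience for the demonstration.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

// Sketch of a mailto handler that keeps the username in the file field.
class MailtoFileHandler extends URLStreamHandler {

  @Override
  protected void parseURL(URL u, String spec, int start, int limit) {
    String host = "";
    String file = ""; // really the username
    if (start < limit) {
      String address = spec.substring(start, limit);
      int atSign = address.indexOf('@');
      if (atSign >= 0) {
        host = address.substring(atSign + 1);
        file = address.substring(0, atSign);
      }
    }
    // authority, userInfo, query, and ref are simply left null here
    setURL(u, u.getProtocol(), host, u.getPort(), null, null, file, null, null);
  }

  @Override
  protected String toExternalForm(URL u) {
    return "mailto:" + u.getFile() + "@" + u.getHost();
  }

  @Override
  protected URLConnection openConnection(URL u) throws IOException {
    throw new IOException("not implemented in this sketch");
  }
}

class MailtoRoundTrip {

  static URL parse(String spec) {
    try {
      return new URL(null, spec, new MailtoFileHandler());
    } catch (MalformedURLException ex) {
      throw new IllegalArgumentException(ex);
    }
  }

  public static void main(String[] args) {
    URL u = parse("mailto:elharo@metalab.unc.edu");
    System.out.println(u.getFile()); // elharo
    System.out.println(u.getHost()); // metalab.unc.edu
    System.out.println(u);           // mailto:elharo@metalab.unc.edu
  }
}
```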

16.2.2.3 protected void setURL(URL u, String protocol, String host, int port, String authority, String userInfo, String path, String query, String fragmentID) // Java 1.3

This method sets the protocol, host, port, authority, userInfo, path, query, and ref fields of the URL u to the given values. parseURL( ) uses this method to set these fields to the values it has found by parsing the URL. You need to call this method at the end of the parseURL( ) method when you subclass URLStreamHandler.

This method is a little flaky, since the host, port, and user info together make up the authority. In the event of a conflict between them, they're all stored separately, but the host, port, and user info are used in preference to the authority when deciding which site to connect to.

This is actually quite relevant to the mailto example, since mailto URLs often have query strings that indicate the subject or other header; for example, mailto:elharo@metalab.unc.edu?subject=JavaReading. Here the query string is subject=JavaReading. Rewriting the parseURL( ) method to support mailto URLs in this format, the result looks like this:

 public void parseURL(URL u, String spec, int start, int limit) {

   String protocol    = u.getProtocol( );
   String host        = "";
   int port           = u.getPort( );
   String file        = "";
   String userInfo    = null;
   String query       = null;
   String fragmentID  = null;
   if (start < limit) {
     String address = spec.substring(start, limit);
     int atSign = address.indexOf('@');
     int questionMark = address.indexOf('?');
     int hostEnd = questionMark >= 0 ? questionMark : address.length( );
     if (atSign >= 0) {
       host = address.substring(atSign+1, hostEnd);
       userInfo = address.substring(0, atSign);
     }
     if (questionMark >= 0 && questionMark > atSign) {
       query = address.substring(questionMark + 1);
     }
   }
   String authority = "";
   if (userInfo != null) authority += userInfo + '@';
   authority += host;
   if (port >= 0) authority += ":" + port;
   this.setURL(u, protocol, host, port, authority, userInfo, file,
     query, fragmentID);
 }
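To check what the URL getter methods report after this kind of parsing, here is a small self-contained sketch. It follows the same scheme as the method above, with one deliberate difference: it only looks for the question mark after the @ sign. The class names MailtoQueryHandler and MailtoQueryDemo are hypothetical, and the handler is handed directly to the URL constructor purely for the sake of the demonstration.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

// Sketch of a mailto handler that fills in userInfo, host, and query.
class MailtoQueryHandler extends URLStreamHandler {

  @Override
  protected void parseURL(URL u, String spec, int start, int limit) {
    String host = "";
    String userInfo = null;
    String query = null;
    int port = u.getPort();
    if (start < limit) {
      String address = spec.substring(start, limit);
      int atSign = address.indexOf('@');
      // Assumption of this sketch: search for '?' only after the '@'
      int questionMark = address.indexOf('?', Math.max(atSign, 0));
      int hostEnd = questionMark >= 0 ? questionMark : address.length();
      if (atSign >= 0) {
        host = address.substring(atSign + 1, hostEnd);
        userInfo = address.substring(0, atSign);
      }
      if (questionMark >= 0) {
        query = address.substring(questionMark + 1);
      }
    }
    String authority = (userInfo == null ? "" : userInfo + "@") + host;
    if (port >= 0) authority += ":" + port;
    setURL(u, u.getProtocol(), host, port, authority, userInfo, "", query, null);
  }

  @Override
  protected URLConnection openConnection(URL u) throws IOException {
    throw new IOException("not implemented in this sketch");
  }
}

class MailtoQueryDemo {

  static URL parse(String spec) {
    try {
      return new URL(null, spec, new MailtoQueryHandler());
    } catch (MalformedURLException ex) {
      throw new IllegalArgumentException(ex);
    }
  }

  public static void main(String[] args) {
    URL u = parse("mailto:elharo@metalab.unc.edu?subject=JavaReading");
    System.out.println(u.getUserInfo());  // elharo
    System.out.println(u.getHost());      // metalab.unc.edu
    System.out.println(u.getQuery());     // subject=JavaReading
    System.out.println(u.getAuthority()); // elharo@metalab.unc.edu
  }
}
```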

16.2.2.4 protected int getDefaultPort( ) // Java 1.3

The getDefaultPort() method returns the default port for the protocol, e.g., 80 for HTTP. The default implementation of this method simply returns -1, but each subclass should override that with the appropriate default port for the protocol it handles. For example, here's a getDefaultPort() method for the finger protocol that normally operates on port 79:

 public int getDefaultPort( ) {
   return 79;
 }

As well as providing the right port for finger, overriding this method also makes getDefaultPort( ) public. Although there's only a default implementation of this method in Java 1.3, there's no reason you can't provide it in your own subclasses in any version of Java. You simply won't be able to invoke it polymorphically from a reference typed as the superclass.
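In Java 1.4 and later, the URL class has its own public getDefaultPort( ) method, which asks the handler for this value. The sketch below shows the effect of the override on a URL built with a hypothetical FingerHandler; the effectivePort( ) helper and the example.com URLs are assumptions made for the illustration.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

// Hypothetical finger handler that only supplies the default port.
class FingerHandler extends URLStreamHandler {

  @Override
  public int getDefaultPort() {
    return 79;
  }

  @Override
  protected URLConnection openConnection(URL u) throws IOException {
    throw new IOException("not implemented in this sketch");
  }
}

class DefaultPortDemo {

  static URL parse(String spec) {
    try {
      return new URL(null, spec, new FingerHandler());
    } catch (MalformedURLException ex) {
      throw new IllegalArgumentException(ex);
    }
  }

  // The explicit port if the URL has one, otherwise the protocol's default.
  static int effectivePort(URL u) {
    return u.getPort() != -1 ? u.getPort() : u.getDefaultPort();
  }

  public static void main(String[] args) {
    URL plain = parse("finger://example.com/elharo");
    URL custom = parse("finger://example.com:7979/elharo");
    System.out.println(plain.getPort());       // -1, no port in the URL
    System.out.println(effectivePort(plain));  // 79
    System.out.println(effectivePort(custom)); // 7979
  }
}
```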

16.2.2.5 protected InetAddress getHostAddress(URL u) // Java 1.3

The getHostAddress( ) method returns an InetAddress object pointing to the server in the URL. This requires a DNS lookup, and the method does block while the lookup is made. However, it does not throw any exceptions. If the host can't be located, whether because the URL does not contain host information, because of a DNS failure, or because of a SecurityException, it simply returns null. The default implementation of this method is sufficient for any reasonable case. It shouldn't be necessary to override it.
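The null-instead-of-exception behavior can be observed with a small sketch. Since getHostAddress( ) is protected, the hypothetical ExposedHandler below widens it to public solely so the demonstration can call it. The URLs use a numeric IP address as the host, for which no DNS query is needed, and a URL with no host at all.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

// Hypothetical handler that makes the inherited method callable from outside.
class ExposedHandler extends URLStreamHandler {

  @Override
  public InetAddress getHostAddress(URL u) {
    return super.getHostAddress(u);
  }

  @Override
  protected URLConnection openConnection(URL u) throws IOException {
    throw new IOException("not implemented in this sketch");
  }
}

class HostAddressDemo {

  static final ExposedHandler HANDLER = new ExposedHandler();

  // Parses the URL with the inherited parseURL() and resolves its host.
  static InetAddress resolve(String spec) {
    try {
      URL u = new URL(null, spec, HANDLER);
      return HANDLER.getHostAddress(u);
    } catch (MalformedURLException ex) {
      throw new IllegalArgumentException(ex);
    }
  }

  public static void main(String[] args) {
    // A literal IP address is converted without a DNS lookup
    System.out.println(resolve("demo://127.0.0.1/index.html"));
    // No host information in the URL: null rather than an exception
    System.out.println(resolve("demo:/no/host/here"));
  }
}
```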

16.2.2.6 protected boolean hostsEqual(URL u1, URL u2) // Java 1.3

The hostsEqual( ) method determines whether the two URLs refer to the same server. This method does use DNS to look up the hosts. If the DNS lookups succeed, it can tell that, for example, http://www.ibiblio.org/Dave/this-week.html and ftp://metalab.unc.edu/pub/linux/distributions/debian/ are the same host. However, if the DNS lookup fails for any reason, then hostsEqual( ) falls back to a simple case-insensitive string comparison, in which case it would think these were two different hosts.

The default implementation of this method is sufficient for most cases. You probably won't need to override it. The only case I can imagine where you might want to is if you were trying to make mirror sites on different servers appear equal.

16.2.2.7 protected boolean sameFile(URL u1, URL u2) // Java 1.3

The sameFile( ) method determines whether two URLs point to the same file. It does this by comparing the protocol, host, port, and file. The files are considered to be the same only if each of those four pieces is the same. The file here is the value returned by getFile( ), which includes the query string if there is one; only the fragment identifier is not considered. Furthermore, the hosts are compared by the hostsEqual( ) method so that www.ibiblio.org and metalab.unc.edu can be recognized as the same if DNS can resolve them. This is similar to the sameFile( ) method of the URL class. Indeed, that sameFile( ) method just calls this sameFile( ) method.

The default implementation of this method is sufficient for most cases. You probably won't need to override it. You might perhaps want to do so if you need a more sophisticated test that converts paths to canonical paths or follows redirects before determining whether two URLs have the same file part.

16.2.2.8 protected boolean equals(URL u1, URL u2) // Java 1.3

The final equality method tests the entire URL, including protocol, host, port, file (the path plus any query string), and fragment identifier. All five of these must be equal for the two URLs to be considered equal. Everything except the fragment identifier is compared by the sameFile( ) method, so overriding that method changes the behavior of this one. The fragment identifiers are compared by simple string equality. Since the sameFile( ) method uses hostsEqual( ) to compare hosts, this method does too. Thus, it performs a DNS lookup if possible and may block. The equals( ) method of the URL class calls this method to compare two URL objects for equality. Again, you probably won't need to override this method. The default implementation should suffice for most purposes.
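The difference between sameFile( ) and equals( ) shows up as soon as two URLs differ only in their fragment identifiers. The following sketch uses a hypothetical PlainHandler that inherits all the comparison methods unchanged, and it uses a numeric IP address as the host so that the comparison does not depend on DNS. The demo protocol and the file names are made up.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

// Hypothetical handler that inherits hostsEqual(), sameFile(), and equals().
class PlainHandler extends URLStreamHandler {
  @Override
  protected URLConnection openConnection(URL u) throws IOException {
    throw new IOException("not implemented in this sketch");
  }
}

class EqualityDemo {

  static final PlainHandler HANDLER = new PlainHandler();

  static URL parse(String spec) {
    try {
      return new URL(null, spec, HANDLER);
    } catch (MalformedURLException ex) {
      throw new IllegalArgumentException(ex);
    }
  }

  public static void main(String[] args) {
    URL top = parse("demo://127.0.0.1/index.html#top");
    URL bottom = parse("demo://127.0.0.1/index.html#bottom");
    URL other = parse("demo://127.0.0.1/other.html#top");

    System.out.println(top.sameFile(bottom)); // true: fragments are ignored
    System.out.println(top.equals(bottom));   // false: fragments differ
    System.out.println(top.sameFile(other));  // false: files differ
  }
}
```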

16.2.2.9 protected int hashCode(URL u) // Java 1.3

A URLStreamHandler can change the default hash code calculation by overriding this method. You should do this if you override equals( ), sameFile( ), or hostsEqual( ) to make sure that two equal URL objects will have the same hash code, and two unequal URL objects will not have the same hash code, at least to a very high degree of probability.
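As one possible illustration, here is a sketch of a handler that makes a mirror site compare equal to its primary site by overriding hostsEqual( ), and that overrides hashCode( ) to match. The host names www.example.org and mirror.example.org, the demo protocol, and the class names are all hypothetical. Note that this version compares host names as strings and never consults DNS, which is a simplification chosen for the sketch.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.util.Locale;
import java.util.Objects;

// Hypothetical handler that treats one mirror host as equal to the primary.
class MirrorHandler extends URLStreamHandler {

  // Maps the (invented) mirror name onto the (invented) primary name.
  private static String canonicalHost(URL u) {
    String host = u.getHost() == null ? "" : u.getHost().toLowerCase(Locale.ROOT);
    return host.equals("mirror.example.org") ? "www.example.org" : host;
  }

  @Override
  protected boolean hostsEqual(URL u1, URL u2) {
    return canonicalHost(u1).equals(canonicalHost(u2));
  }

  // Built from the same pieces equals() compares, using the canonical host.
  @Override
  protected int hashCode(URL u) {
    int port = u.getPort() != -1 ? u.getPort() : getDefaultPort();
    return Objects.hash(u.getProtocol().toLowerCase(Locale.ROOT),
        canonicalHost(u), port, u.getFile(), u.getRef());
  }

  @Override
  protected URLConnection openConnection(URL u) throws IOException {
    throw new IOException("not implemented in this sketch");
  }
}

class MirrorDemo {

  static final MirrorHandler HANDLER = new MirrorHandler();

  static URL parse(String spec) {
    try {
      return new URL(null, spec, HANDLER);
    } catch (MalformedURLException ex) {
      throw new IllegalArgumentException(ex);
    }
  }

  public static void main(String[] args) {
    URL primary = parse("demo://www.example.org/index.html");
    URL mirror = parse("demo://MIRROR.example.org/index.html");
    System.out.println(primary.equals(mirror));                  // true
    System.out.println(primary.hashCode() == mirror.hashCode()); // true
  }
}
```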

16.2.3 A Method for Connecting

The second responsibility of a URLStreamHandler is to create a URLConnection object appropriate to the URL. This is done with the abstract openConnection( ) method.

16.2.3.1 protected abstract URLConnection openConnection(URL u) throws IOException

This method must be overridden in each subclass of URLStreamHandler. It takes a single argument, u, which is the URL to connect to. It returns an unopened URLConnection, directed at the resource u points to. Each subclass of URLStreamHandler should know how to find the right subclass of URLConnection for the protocol it handles.

The openConnection( ) method is protected, so you usually do not call it directly; it is called by the openConnection( ) method of the URL class. The URL u that is passed as an argument is the URL that needs a connection. Subclasses override this method to handle a specific protocol. The subclass's openConnection( ) method is usually extremely simple; in most cases, it just calls the constructor for the appropriate subclass of URLConnection. For example, a URLStreamHandler for the mailto protocol might have an openConnection( ) method that looks like this:

 protected URLConnection openConnection(URL u) throws IOException {
   return new com.macfaq.net.www.protocol.mailto.MailtoURLConnection(u);
 }

Example 16-1 demonstrates a complete URLStreamHandler for mailto URLs. The name of the class is Handler, following Sun's naming conventions. It assumes the existence of a MailtoURLConnection class.

Example 16-1. A mailto URLStreamHandler
 package com.macfaq.net.www.protocol.mailto;

 import java.net.*;
 import java.io.*;
 import java.util.*;

 public class Handler extends URLStreamHandler {

   protected URLConnection openConnection(URL u) throws IOException {
     return new MailtoURLConnection(u);
   }

   public void parseURL(URL u, String spec, int start, int limit) {

     String protocol    = u.getProtocol( );
     String host        = "";
     int    port        = u.getPort( );
     String file        = ""; // really username
     String userInfo    = null;
     String authority   = null;
     String query       = null;
     String fragmentID  = null;

     if (start < limit) {
       String address = spec.substring(start, limit);
       int atSign = address.indexOf('@');
       if (atSign >= 0) {
         host = address.substring(atSign+1);
         file = address.substring(0, atSign);
       }
     }

     // For Java 1.2 comment out this next line
     this.setURL(u, protocol, host, port, authority,
                 userInfo, file, query, fragmentID);

     // In Java 1.2 and earlier uncomment the following line:
     // this.setURL(u, protocol, host, port, file, fragmentID);
   }

   protected String toExternalForm(URL u) {
     return "mailto:" + u.getFile( ) + "@" + u.getHost( );
   }
 }
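To show how the handler and the connection class fit together without the real MailtoURLConnection, the following sketch pairs a trivial handler with a stand-in connection class. SketchMailtoConnection is a hypothetical placeholder that does no actual mail handling; it exists only so that the demonstration compiles and runs. The handler here inherits parseURL( ), since the point is the connection, not the parsing.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

// Hypothetical stand-in for a real MailtoURLConnection.
class SketchMailtoConnection extends URLConnection {

  SketchMailtoConnection(URL u) {
    super(u);
  }

  @Override
  public void connect() throws IOException {
    // A real implementation would contact a mail server here
    connected = true;
  }

  boolean isConnected() {
    return connected;
  }
}

// The handler's only job in this sketch is to pick the connection class.
class SketchMailtoHandler extends URLStreamHandler {
  @Override
  protected URLConnection openConnection(URL u) throws IOException {
    return new SketchMailtoConnection(u);
  }
}

class OpenConnectionDemo {

  // URL.openConnection() delegates to the handler's protected method.
  static URLConnection open(String spec) {
    try {
      URL u = new URL(null, spec, new SketchMailtoHandler());
      return u.openConnection();
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
  }

  public static void main(String[] args) throws IOException {
    SketchMailtoConnection c =
        (SketchMailtoConnection) open("mailto:elharo@metalab.unc.edu");
    System.out.println(c.isConnected()); // false: returned unopened
    c.connect();
    System.out.println(c.isConnected()); // true
  }
}
```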

16.2.3.2 protected URLConnection openConnection(URL u, Proxy p) throws IOException // Java 1.5

Java 1.5 overloads the openConnection( ) method to allow you to specify a proxy server for the connection. The java.net.Proxy class (also new in Java 1.5) encapsulates the address of a proxy server. Rather than connecting to the host directly, this URLConnection connects to the specified proxy server, which relays data back and forth between the client and the server. Protocols that do not support proxies can simply ignore the second argument.

Normally connections are opened with the usual proxy server settings within that VM. Calling this method is only necessary if you want to use a different proxy server. If you want to bypass the usual proxy server and connect directly instead, pass the constant Proxy.NO_PROXY as the second argument.
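Here is a sketch of a handler that overrides both versions of openConnection( ) and simply records which proxy it was given. The class names, the demo protocol, and the proxy.example.com address are hypothetical. The proxy address is created with InetSocketAddress.createUnresolved( ) so that building the Proxy object does not itself trigger a DNS lookup.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

// Hypothetical connection that remembers the proxy it should use.
class ProxyAwareConnection extends URLConnection {

  private final Proxy proxy; // null means "use the VM's usual settings"

  ProxyAwareConnection(URL u, Proxy proxy) {
    super(u);
    this.proxy = proxy;
  }

  Proxy getProxy() {
    return proxy;
  }

  @Override
  public void connect() throws IOException {
    // A real implementation would connect directly or through the proxy
    connected = true;
  }
}

class ProxyAwareHandler extends URLStreamHandler {

  @Override
  protected URLConnection openConnection(URL u) throws IOException {
    return new ProxyAwareConnection(u, null);
  }

  @Override
  protected URLConnection openConnection(URL u, Proxy p) throws IOException {
    return new ProxyAwareConnection(u, p);
  }
}

class ProxyDemo {

  // URL.openConnection(Proxy) delegates to the two-argument handler method.
  static ProxyAwareConnection open(String spec, Proxy proxy) {
    try {
      URL u = new URL(null, spec, new ProxyAwareHandler());
      return (ProxyAwareConnection) u.openConnection(proxy);
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
  }

  public static void main(String[] args) {
    Proxy http = new Proxy(Proxy.Type.HTTP,
        InetSocketAddress.createUnresolved("proxy.example.com", 8080));
    System.out.println(open("demo://example.com/", http).getProxy().type());
    System.out.println(open("demo://example.com/", Proxy.NO_PROXY).getProxy().type());
  }
}
```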



Java Network Programming
Java Network Programming, Third Edition
ISBN: 0596007213
Year: 2003
Pages: 164
