16.1 What Is a Protocol Handler?

     

The way the URL, URLStreamHandler, URLConnection, and URLStreamHandlerFactory classes work together can be confusing. Everything starts with a URL, which represents a pointer to a particular Internet resource. Each URL specifies the protocol used to access the resource; typical values for the protocol include mailto, http, and ftp. When you construct a URL object from the URL's string representation, the constructor strips the protocol field and passes it to the URLStreamHandlerFactory. The factory's job is to take the protocol, locate the right subclass of URLStreamHandler for the protocol, and create a new instance of that stream handler, which is stored as a field within the URL object. Each application has at most one URLStreamHandlerFactory; once the factory has been installed, attempting to install another will throw an Error.

Now that the URL object has a stream handler, it asks the stream handler to finish parsing the URL string and create a subclass of URLConnection that knows how to talk to servers using this protocol. URLStreamHandler subclasses and URLConnection subclasses always come in pairs; the stream handler for a protocol always knows how to find an appropriate URLConnection for its protocol. It is worth noting that the stream handler does most of the work of parsing the URL. The format of the URL, although standard, depends on the protocol; therefore, it must be parsed by a URLStreamHandler, which knows about a particular protocol, and not by the URL object, which is generic and has no knowledge of specific protocols. This also means that if you are writing a new stream handler, you can define a new URL format that's appropriate to your task.

New URL schemes should be defined only for genuinely new protocols. They should not be defined for different uses of existing protocols. The iTunes Music Store itms scheme and the RSS feed scheme are examples of what not to do. Both of these should use http.


The URLConnection class, which you learned about in the previous chapter, represents an active connection to an Internet resource. It is responsible for interacting with the server. A URLConnection knows how to generate requests and interpret the headers that the server returns. The output from a URLConnection is the raw data requested with all traces of the protocol (headers, etc.) stripped, ready for processing by a content handler.

In most applications, you don't need to worry about URLConnection objects and stream handlers; they are hidden by the URL class, which provides a simple interface to the functionality you need. When you call the openStream( ) and getContent( ) methods of the URL class, you are really calling the getInputStream( ) and getContent( ) methods of a URLConnection that the URL object opens for you. We have seen that interacting directly with a URLConnection can be convenient when you need a little more control over communication with a server, such as when downloading binary files or posting data to a server-side program.
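To make that delegation concrete, here is a minimal sketch that reads the same resource twice: once through the URL shortcut (named openStream( ) in the java.net API) and once through an explicitly opened URLConnection. The class name UrlDelegationDemo and its helper methods are invented for illustration, and a temporary file: URL stands in for a real server so the example needs no network. It also assumes Java 9 or later for readAllBytes( ).

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; none of these names come from the JDK.
class UrlDelegationDemo {

    // Reads the whole resource through the URL shortcut method.
    static String readViaUrl(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    // Reads the same resource by asking for the URLConnection explicitly.
    static String readViaConnection(URL url) throws IOException {
        URLConnection connection = url.openConnection();
        try (InputStream in = connection.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    // Writes the text to a temporary file and returns a file: URL for it.
    static URL sampleUrl(String text) throws IOException {
        Path file = Files.createTempFile("handler-demo", ".txt");
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));
        file.toFile().deleteOnExit();
        return file.toUri().toURL();
    }
}
```

Both methods should return the same text for the same URL; the second form is the one to reach for when you need to set request properties or inspect headers before reading.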

However, the URLConnection and URLStreamHandler classes are even more important when you need to add new protocols. By writing subclasses of these classes, you can add support for standard protocols such as finger, whois, or NTP that Java doesn't support out of the box. Furthermore, you're not limited to established protocols with well-known services. You can create new protocols that perform database queries, search across multiple Internet search engines, view pictures from binary newsgroups, and more. You can add new kinds of URLs as needed to represent the new types of resources. Furthermore, Java applications can be built so that they load new protocol handlers at runtime. Unlike current browsers such as Mozilla and Internet Explorer, which contain explicit knowledge of all the protocols and content types they can handle, a Java browser can be a relatively lightweight skeleton that loads new handlers as needed. Supporting a new protocol just means adding some new classes in predefined locations, not writing an entirely new release of the browser.

What's involved in adding support for a new protocol? As I said earlier, you need to write two new classes: a subclass of URLConnection and a subclass of URLStreamHandler. You may also need to write a class that implements the URLStreamHandlerFactory interface. The URLConnection subclass handles the interaction with the server, converts anything the server sends into an InputStream, and converts anything the client sends into an OutputStream. This subclass must implement the abstract method connect( ); it may also override the concrete methods getInputStream( ), getOutputStream( ), and getContentType( ).
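As a rough sketch of the URLConnection half, the class below implements connect( ) and overrides getInputStream( ) and getContentType( ). Everything about it is an assumption made for illustration: the class name DemoURLConnection, the invented "demo" protocol, and especially the behavior, which simply hands the URL's path back as plain text instead of talking to a server.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical connection for an invented "demo" protocol. Instead of
// contacting a server, it serves the URL's path back as plain text.
class DemoURLConnection extends URLConnection {

    private byte[] body;

    DemoURLConnection(URL url) {
        super(url);
    }

    // The one abstract method. A real handler would open a socket here
    // and exchange protocol messages with the server.
    @Override
    public void connect() throws IOException {
        if (connected) {
            return; // calling connect() twice is harmless
        }
        body = getURL().getPath().getBytes(StandardCharsets.UTF_8);
        connected = true; // protected field inherited from URLConnection
    }

    // Connects lazily, then exposes the "response" as a stream.
    @Override
    public InputStream getInputStream() throws IOException {
        connect();
        return new ByteArrayInputStream(body);
    }

    @Override
    public String getContentType() {
        return "text/plain";
    }
}
```

Because the class never touches the network, you can try it out by constructing it directly around any URL object, for example `new DemoURLConnection(URI.create("http://www.example.com/hello").toURL())`, and reading its input stream.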

The URLStreamHandler subclass parses the string representation of the URL into its separate parts and creates a new URLConnection object that understands that URL's protocol. This subclass must implement the abstract openConnection( ) method, which returns the new URLConnection to its caller. If the String representation of the URL doesn't look like a standard hierarchical URL, you should also override the parseURL( ) and toExternalForm( ) methods.
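A stream handler for a hierarchical URL can be very short, because the inherited parseURL( ) already understands the scheme://host:port/path?query form. The sketch below assumes an invented "demo" protocol and an arbitrary default port of 9999; the class name DemoHandler is likewise hypothetical, and the connection it returns is only a stub.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

// Hypothetical handler for an invented "demo" protocol.
class DemoHandler extends URLStreamHandler {

    // Port reported when the URL doesn't name one (arbitrary choice).
    @Override
    protected int getDefaultPort() {
        return 9999;
    }

    // The one abstract method: build a connection for this URL.
    // A real handler would return its paired URLConnection subclass.
    @Override
    protected URLConnection openConnection(URL url) throws IOException {
        return new URLConnection(url) {
            @Override
            public void connect() {
                connected = true; // stub: nothing to connect to
            }
        };
    }
}
```

Passing the handler straight to the three-argument constructor, as in `new URL(null, "demo://host.example/some/path?q=1", new DemoHandler())`, is a convenient way to exercise it without installing anything globally. Recent JDK releases mark the URL constructors as deprecated, so expect a compiler warning there.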

Finally, you may need to create a class that implements the URLStreamHandlerFactory interface. The URLStreamHandlerFactory helps the application find the right protocol handler for each type of URL. The URLStreamHandlerFactory interface has a single method, createURLStreamHandler( ), which returns a URLStreamHandler object. This method must find the appropriate subclass of URLStreamHandler given only the protocol (e.g., ftp); that is, it must understand the package and class-naming conventions used for stream handlers. Since URLStreamHandlerFactory is an interface, you can place the createURLStreamHandler( ) method in any convenient class, perhaps the main class of your application.
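Here is one way such a factory might look. This sketch hard-codes a single invented protocol, "demo", rather than searching packages by naming convention, and it returns null for everything else so that Java can continue with its own lookup. The class name DemoHandlerFactory and the stub handler inside it are assumptions for illustration only.

```java
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.URLStreamHandlerFactory;

// Hypothetical factory that knows exactly one invented protocol.
class DemoHandlerFactory implements URLStreamHandlerFactory {

    @Override
    public URLStreamHandler createURLStreamHandler(String protocol) {
        if ("demo".equalsIgnoreCase(protocol)) {
            // Stub handler; a real factory would return its own subclass.
            return new URLStreamHandler() {
                @Override
                protected URLConnection openConnection(URL url) {
                    return new URLConnection(url) {
                        @Override
                        public void connect() {
                            connected = true;
                        }
                    };
                }
            };
        }
        // Unknown protocol: returning null lets Java keep looking elsewhere.
        return null;
    }
}
```

A factory that supports many protocols could instead build a class name from the protocol string and load it reflectively; the fixed if test here just keeps the sketch short.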

When it first encounters a protocol, Java looks for URLStreamHandler classes in this order:

  1. First, Java checks to see whether a URLStreamHandlerFactory is installed. If it is, the factory is asked for a URLStreamHandler for the protocol.

  2. If a URLStreamHandlerFactory isn't installed or if Java can't find a URLStreamHandler for the protocol, Java looks in the packages named in the java.protocol.handler.pkgs system property for a sub-package that shares the protocol name and a class called Handler. The value of this property is a list of package names separated by a vertical bar ( | ). Thus, to indicate that Java should seek protocol handlers in the com.macfaq.net.www and org.cafeaulait.protocols packages, you would add this line to your properties file:

     java.protocol.handler.pkgs=com.macfaq.net.www|org.cafeaulait.protocols

    To find an FTP protocol handler (for example), Java first looks for the class com.macfaq.net.www.ftp.Handler . If that's not found, Java next tries to instantiate org.cafeaulait.protocols.ftp.Handler .

  3. Finally, if all else fails, Java looks for a URLStreamHandler named sun.net.www.protocol.name.Handler, where name is replaced by the name of the protocol; for example, sun.net.www.protocol.ftp.Handler.

    In the early days of Java (circa 1995), Sun promised that protocols could be installed at runtime from the server that used them. For instance, in 1996, James Gosling and Henry McGilton wrote: "The HotJava Browser is given a reference to an object (a URL). If the handler for that protocol is already loaded, it will be used. If not, the HotJava Browser will search first the local system and then the system that is the target of the URL." ( The Java Language Environment, A White Paper , May 1996, http://java.sun.com/docs/white/langenv/HotJava.doc1.html) However, the loading of protocol handlers from web sites was never implemented, and Sun doesn't talk much about it anymore.
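The naming convention in steps 2 and 3 of the lookup order can be sketched as a small function that lists the class names Java would try once the factory has been ruled out. This is only a model of the order described above, not the JDK's actual code; the class HandlerLookupOrder and its method are invented for illustration, and nothing here loads or instantiates a class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that models the handler naming convention.
class HandlerLookupOrder {

    // Returns candidate handler class names in the order they would be
    // tried. pkgs is the value of java.protocol.handler.pkgs and may be
    // null or empty if the property isn't set.
    static List<String> candidateClassNames(String protocol, String pkgs) {
        List<String> names = new ArrayList<>();
        if (pkgs != null && !pkgs.isEmpty()) {
            // The property is a vertical-bar-separated list of packages.
            for (String prefix : pkgs.split("\\|")) {
                String trimmed = prefix.trim();
                if (!trimmed.isEmpty()) {
                    names.add(trimmed + "." + protocol + ".Handler");
                }
            }
        }
        // Last resort: the default package for built-in handlers.
        names.add("sun.net.www.protocol." + protocol + ".Handler");
        return names;
    }
}
```

For the ftp protocol and the property value com.macfaq.net.www|org.cafeaulait.protocols, the method yields com.macfaq.net.www.ftp.Handler, then org.cafeaulait.protocols.ftp.Handler, then sun.net.www.protocol.ftp.Handler.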


Most of the time, an end user who wants to permanently install an extra protocol handler in a program such as HotJava will place the necessary classes in the program's class path and add the package prefix to the java.protocol.handler.pkgs property. However, a programmer who just wants to add a custom protocol handler to their program at compile time will write and install a URLStreamHandlerFactory that knows how to find their custom protocol handlers. The factory can tell an application to look for URLStreamHandler classes in any place that's convenient: on a web site, in the same directory as the application, or somewhere in the user's class path.

When each of these classes has been written and compiled, you're ready to write an application that uses the new protocol handler. Assuming that you're using a URLStreamHandlerFactory, pass the factory object to the static URL.setURLStreamHandlerFactory() method like this:

 URL.setURLStreamHandlerFactory(new MyURLStreamHandlerFactory( )); 

This method can be called only once in the lifetime of an application. If it is called a second time, it will throw an Error. Untrusted code will generally not be allowed to install factories or change the java.protocol.handler.pkgs property. Consequently, protocol handlers are primarily of use to standalone applications such as HotJava; Netscape and Internet Explorer use their own native C code instead of Java to handle protocols, so they're limited to a fixed set of protocols.
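The once-only rule is easy to observe. The following sketch wraps the installation call so that a second attempt reports failure instead of propagating the Error. The class FactoryInstallDemo, its stub handler, and the "demo" protocol are all invented for illustration. Catching Error this broadly is done only to keep the example short; in real code you would more likely arrange to install the factory exactly once at startup.

```java
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.URLStreamHandlerFactory;

// Hypothetical demo of installing a factory and hitting the once-only rule.
class FactoryInstallDemo {

    // Stub handler for the invented "demo" protocol.
    static class DemoStubHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL url) {
            return new URLConnection(url) {
                @Override
                public void connect() {
                    connected = true;
                }
            };
        }
    }

    // Factory that handles only "demo" and defers everything else.
    static URLStreamHandlerFactory demoFactory() {
        return protocol -> "demo".equals(protocol) ? new DemoStubHandler() : null;
    }

    // Returns true if the factory was installed, false if the JVM
    // already had one and rejected the call with an Error.
    static boolean tryInstall(URLStreamHandlerFactory factory) {
        try {
            URL.setURLStreamHandlerFactory(factory);
            return true;
        } catch (Error alreadyInstalled) {
            return false;
        }
    }
}
```

In a fresh JVM the first call to tryInstall( ) should succeed, after which `new URL("demo://host.example/x")` constructs normally and any further call to tryInstall( ) returns false.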

To summarize, here's the sequence of events:

  1. The program constructs a URL object.

  2. The constructor uses the arguments it's passed to determine the protocol part of the URL, e.g., http.

  3. The URL( ) constructor tries to find a URLStreamHandler for the given protocol like this:

    1. If the protocol has been used before, the URLStreamHandler object is retrieved from a cache.

    2. Otherwise, if a URLStreamHandlerFactory has been set, the protocol string is passed to the factory's createURLStreamHandler( ) method.

    3. If the protocol hasn't been seen before and there's no URLStreamHandlerFactory, or the factory can't supply a handler for the protocol, the constructor attempts to instantiate a URLStreamHandler object named protocol.Handler in one of the packages listed in the java.protocol.handler.pkgs property.

    4. Failing that, the constructor attempts to instantiate a URLStreamHandler object named protocol.Handler in the sun.net.www.protocol package.

    5. If any of these attempts succeed in retrieving a URLStreamHandler object, the URL constructor sets the URL object's handler field. If none of the attempts succeed, the constructor throws a MalformedURLException.

  4. The program calls the URL object's openConnection( ) method.

  5. The URL object asks the URLStreamHandler to return a URLConnection object appropriate for this URL. If there's any problem, an IOException is thrown. Otherwise, a URLConnection object is returned.

  6. The program uses the methods of the URLConnection class to interact with the remote resource.

Instead of calling openConnection( ) in step 4, the program can call getContent( ) or openStream( ). In this case, the URLStreamHandler still instantiates a URLConnection object of the appropriate class. However, instead of handing the URLConnection object itself back to the program, the URL object returns the result of the URLConnection's getContent( ) or getInputStream( ) method.
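One practical consequence of step 3 is that you can find out whether a protocol is supported simply by trying to construct a URL that uses it. The helper below is a small sketch of that idea; the class and method names are invented. Keep in mind that MalformedURLException can also signal a plain syntax problem, such as a string with no protocol at all, so a false result doesn't always mean a missing handler.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper; the names are not part of the JDK.
class ProtocolCheck {

    // Returns true if this JVM could construct a URL from the string,
    // which implies it found a stream handler for the protocol.
    static boolean hasHandler(String spec) {
        try {
            new URL(spec); // the handler lookup happens in the constructor
            return true;
        } catch (MalformedURLException noHandlerOrBadSyntax) {
            return false;
        }
    }
}
```

With no custom factory installed, `ProtocolCheck.hasHandler("http://www.example.com/")` returns true, while a made-up scheme such as nosuchprotocol://www.example.com/ returns false.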



Java Network Programming, Third Edition
ISBN: 0596007213
Year: 2003
Pages: 164
