The java.net.URL class represents a Uniform Resource Locator such as http://www.cafeaulait.org/books/javaio2/. Each URL unambiguously identifies the location of a resource on the Internet. The URL class has six constructors. All are declared to throw MalformedURLException, a subclass of IOException.
public URL(String url) throws MalformedURLException
public URL(String protocol, String host, String file) throws MalformedURLException
public URL(String protocol, String host, int port, String file) throws MalformedURLException
public URL(String protocol, String host, int port, String file, URLStreamHandler handler) throws MalformedURLException
public URL(URL context, String url) throws MalformedURLException
public URL(URL context, String url, URLStreamHandler handler) throws MalformedURLException
Each constructor throws a MalformedURLException if its arguments do not specify a valid URL. Often, this means a particular Java implementation does not have the right protocol handler installed, rather than that the URL is syntactically wrong. Given a complete absolute URL such as http://www.poly.edu/schedule/fall2006/bgrad.html#cs, you construct a URL object like so:
URL u = null;
try {
  u = new URL("http://www.poly.edu/schedule/fall2006/bgrad.html#cs");
}
catch (MalformedURLException ex) {
  // this shouldn't happen for a syntactically correct http URL
}
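You can check the protocol-handler point directly. The following is a minimal sketch, not one of the book's numbered examples: it tries to construct a URL for each of several schemes and reports which ones the running VM accepts. The class name SchemeCheck, the host www.example.com, and the scheme nosuchscheme are made up for the demonstration. Constructing a URL object does not open a connection, so no network access is needed. On Java 20 and later the compiler warns that the URL constructors are deprecated, but they still work.

```java
import java.net.MalformedURLException;
import java.net.URL;

class SchemeCheck {

    // Returns true if this VM can construct a URL from the given string,
    // that is, the syntax is acceptable and a protocol handler is installed.
    static boolean isSupported(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException ex) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] specs = {
            "http://www.example.com/",
            "https://www.example.com/",
            "file:///tmp/data.txt",
            "nosuchscheme://www.example.com/"  // hypothetical scheme with no handler
        };
        for (String spec : specs) {
            System.out.println(spec + " -> " + isSupported(spec));
        }
    }
}
```

The first three schemes should be accepted by any standard VM; the invented one should be rejected even though its syntax is fine.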
You can also construct the URL object by passing its pieces to the constructor:
URL u = new URL("http", "www.poly.edu", "/schedule/fall2006/bgrad.html#cs");
You don't normally need to specify a port for a URL. Most protocols have default ports. For instance, the default HTTP port is 80. When you do need to name the port explicitly, typically because a server listens on a nonstandard one, you can use this constructor:
URL u = new URL("http", "www.poly.edu", 80, "/schedule/fall2006/bgrad.html#cs");
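To see how explicit and default ports interact, here is a small sketch. It assumes a hypothetical host, www.example.com, and a made-up helper named effectivePort; neither comes from the original text. The getPort( ) method returns -1 when the URL names no port, and getDefaultPort( ) returns the default for the URL's protocol.

```java
import java.net.MalformedURLException;
import java.net.URL;

class PortDemo {

    // Returns the port a connection would use: the explicit port if the
    // URL names one, otherwise the default port for the URL's protocol.
    static int effectivePort(URL u) {
        int port = u.getPort();  // -1 when the URL has no explicit port
        return port != -1 ? port : u.getDefaultPort();
    }

    // Convenience overload that parses the URL first (hypothetical helper).
    static int effectivePort(String spec) {
        try {
            return effectivePort(new URL(spec));
        } catch (MalformedURLException ex) {
            throw new IllegalArgumentException(ex);
        }
    }

    public static void main(String[] args) throws MalformedURLException {
        URL implicit = new URL("http", "www.example.com", "/index.html");
        URL explicit = new URL("http", "www.example.com", 8080, "/index.html");

        System.out.println(implicit + " uses port " + effectivePort(implicit)); // 80
        System.out.println(explicit + " uses port " + effectivePort(explicit)); // 8080
    }
}
```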
Finally, many HTML files contain relative URLs. The last two constructors create URLs relative to a given URL and are particularly useful when parsing HTML. For example, the following code creates a URL pointing to the file 08.html, taking the rest of the URL from u1:
URL u1 = new URL("http://www.cafeaulait.org/course/week12/07.html");
URL u2 = new URL(u1, "08.html");
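The same constructor handles the other kinds of relative links you meet in HTML. The sketch below resolves several of them against one base URL; the class name RelativeDemo, the helper resolve, and the relative paths other than 08.html are invented for illustration. Nothing is downloaded, since the resolution is done purely on the strings.

```java
import java.net.MalformedURLException;
import java.net.URL;

class RelativeDemo {

    // Resolves a link found in the document at base, returning the
    // absolute URL as a string.
    static String resolve(String base, String relative) {
        try {
            URL context = new URL(base);
            return new URL(context, relative).toString();
        } catch (MalformedURLException ex) {
            throw new IllegalArgumentException(ex);
        }
    }

    public static void main(String[] args) {
        String base = "http://www.cafeaulait.org/course/week12/07.html";

        // Same directory: http://www.cafeaulait.org/course/week12/08.html
        System.out.println(resolve(base, "08.html"));
        // Parent directory: http://www.cafeaulait.org/course/week11/01.html
        System.out.println(resolve(base, "../week11/01.html"));
        // Server root: http://www.cafeaulait.org/index.html
        System.out.println(resolve(base, "/index.html"));
        // An absolute URL ignores the context: http://www.oreilly.com/
        System.out.println(resolve(base, "http://www.oreilly.com/"));
    }
}
```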
Once a URL object has been constructed, you can retrieve its data in two ways. The openStream( ) method returns a raw stream of bytes from the source. The getContent( ) method returns a Java object that represents the data. When you call getContent( ), Java looks for a content handler that matches the MIME type of the data. It is the openStream( ) method that is of concern in this book.
The openStream( ) method makes a socket connection to the server and port specified in the URL. It returns an input stream from which you can read the data at that URL. Any headers that come before the actual data or file requested are stripped off before the stream is opened. You only get the raw data.
public InputStream openStream( ) throws IOException
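Because openStream( ) returns an ordinary InputStream, you read it like any other stream. The following sketch collects everything a URL delivers into a byte array. To keep it runnable without a network connection, it writes a temporary file and reads it back through a file: URL; that choice, along with the names OpenStreamDemo, readAll, and roundTrip, is an assumption of this illustration. The readAll method itself would work unchanged on an http URL.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class OpenStreamDemo {

    // Reads every byte the URL's stream delivers, closing the stream afterward.
    static byte[] readAll(URL u) throws IOException {
        try (InputStream in = u.openStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            for (int c = in.read(); c != -1; c = in.read()) {
                out.write(c);
            }
            return out.toByteArray();
        }
    }

    // Writes text to a temporary file, then reads it back through a file: URL.
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("openstream", ".txt");
            try {
                Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
                URL u = tmp.toUri().toURL();
                return new String(readAll(u), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("Hello from a file: URL"));
    }
}
```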
Example 5-1 shows you how to connect to a URL entered on the command line, download its data, and copy that to System.out.
Example 5-1. The URLTyper program
import java.net.*;
import java.io.*;

public class URLTyper {

  public static void main(String[] args) throws IOException {

    if (args.length != 1) {
      System.err.println("Usage: java URLTyper url");
      return;
    }

    InputStream in = null;
    try {
      URL u = new URL(args[0]);
      in = u.openStream();
      // Copy the resource to System.out one byte at a time
      for (int c = in.read(); c != -1; c = in.read()) {
        System.out.write(c);
      }
      // write(int) does not necessarily flush, so push out any buffered bytes
      System.out.flush();
    }
    catch (MalformedURLException ex) {
      System.err.println(args[0] + " is not a URL Java understands.");
    }
    finally {
      if (in != null) in.close();
    }
  }
}
For example, here are the first few lines you see when you connect to http://www.oreilly.com/:
$ java URLTyper http://www.oreilly.com/
oreilly.com -- Welcome to O'Reilly Media, Inc. -- computer books, software conferences, online publishing...
Most network connections, even on LANs, are slower and less reliable sources of data than files. Connections across the Internet are even slower and less reliable, and connections through a modem are slower and less reliable still. One way to enhance performance under these conditions is to buffer the data: to read as much data as you can into a temporary storage array inside the class, then parcel it out as needed. In the next chapter, you'll learn about the BufferedInputStream class that does exactly this.
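The underlying idea can be sketched without any special class: instead of asking the stream for one byte at a time, ask it to fill an array and then work from the array. The code below is an illustration of that bulk-read technique, not the BufferedInputStream implementation. The class name ChunkedCopy, the method names, and the 8192-byte array size are arbitrary choices made for this sketch, and an in-memory stream stands in for the network so the example runs anywhere.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

class ChunkedCopy {

    // Copies in to out through an intermediate array and returns the
    // number of bytes copied. Each read() call may fill the array only
    // partially, so the count n is used for the matching write.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];  // size is an arbitrary choice
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    // Runs copy() over in-memory streams (hypothetical helper for the demo).
    static byte[] copyInMemory(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            copy(new ByteArrayInputStream(data), out);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = new byte[20000];
        Arrays.fill(data, (byte) 'x');
        System.out.println(copyInMemory(data).length + " bytes copied");
    }
}
```

To use this with a URL, you would pass the stream returned by openStream( ) as the first argument to copy.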
Untrusted code running under the control of a security manager (e.g., an applet that runs inside a web browser) is normally allowed to connect only to the host it was downloaded from. This host can be determined from the URL returned by the getCodeBase( ) method of the Applet class. Attempts to connect to other hosts throw security exceptions. You can create URLs that point to other hosts, but you may not download data from them using openStream( ) or any other method. (This security restriction for applets applies to any network connection, regardless of how you get it.)