The java.net.HttpURLConnection class is an abstract subclass of URLConnection; it provides some additional methods that are helpful when working specifically with HTTP URLs:

    public abstract class HttpURLConnection extends URLConnection

In particular, it contains methods to get and set the request method, decide whether to follow redirects, get the response code and message, and figure out whether a proxy server is being used. It also includes several dozen mnemonic constants matching the various HTTP response codes. Finally, it overrides the getPermission() method from the URLConnection superclass, although it doesn't change the semantics of this method at all.

Since this class is abstract and its only constructor is protected, you can't directly create instances of HttpURLConnection. However, if you construct a URL object using an http URL and invoke its openConnection( ) method, the URLConnection object returned will be an instance of HttpURLConnection. Cast that URLConnection to HttpURLConnection like this:

    URL u = new URL("http://www.amnesty.org/");
    URLConnection uc = u.openConnection( );
    HttpURLConnection http = (HttpURLConnection) uc;

Or, skipping a step, like this:

    URL u = new URL("http://www.amnesty.org/");
    HttpURLConnection http = (HttpURLConnection) u.openConnection( );

There's another HttpURLConnection class in the undocumented sun.net.www.protocol.http package, a concrete subclass of java.net.HttpURLConnection that actually implements the abstract connect( ) method:

    public class HttpURLConnection extends java.net.HttpURLConnection

There's little reason to access this class directly. It doesn't add any important methods that aren't already declared in java.net.HttpURLConnection or java.net.URLConnection. However, any URLConnection you open to an http URL will be an instance of this class.

15.11.1 The Request Method

When a web client contacts a web server, the first thing it sends is a request line.
Typically, this line begins with GET and is followed by the name of the file that the client wants to retrieve and the version of the HTTP protocol that the client understands. For example:

    GET /catalog/jfcnut/index.html HTTP/1.0

However, web clients can do more than simply GET files from web servers. They can POST responses to forms. They can PUT a file on a web server or DELETE a file from a server. And they can ask for just the HEAD of a document. They can ask the web server for a list of the OPTIONS supported at a given URL. They can even TRACE the request itself. All of these are accomplished by changing the request method from GET to a different keyword. For example, here's how a browser asks for just the header of a document using HEAD:

    HEAD /catalog/jfcnut/index.html HTTP/1.1
    User-Agent: Java/1.4.2_05
    Host: www.oreilly.com
    Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
    Connection: close

By default, HttpURLConnection uses the GET method. However, you can change this with the setRequestMethod( ) method:

    public void setRequestMethod(String method) throws ProtocolException

The method argument should be one of these seven case-sensitive strings:

- GET
- POST
- HEAD
- PUT
- OPTIONS
- DELETE
- TRACE

If it's some other method, then a java.net.ProtocolException, a subclass of IOException, is thrown. However, it's generally not enough to simply set the request method. Depending on what you're trying to do, you may need to adjust the HTTP header and provide a message body as well. For instance, POSTing a form requires you to provide a Content-length header. We've already explored the GET and POST methods. Let's look at the other five possibilities.

Some web servers support additional, nonstandard request methods. For instance, Apache 1.3 also supports CONNECT, PROPFIND, PROPPATCH, MKCOL, COPY, MOVE, LOCK, and UNLOCK. However, Java doesn't support any of these.

15.11.1.1 HEAD

The HEAD function is possibly the simplest of all the request methods. It behaves much like GET. However, it tells the server only to return the HTTP header, not to actually send the file. The most common use of this method is to check whether a file has been modified since the last time it was cached. Example 15-9 is a simple program that uses the HEAD request method and prints the last time a file on a server was modified.

Example 15-9.
Get the time when a URL was last changed

    import java.net.*;
    import java.io.*;
    import java.util.*;

    public class LastModified {

      public static void main(String args[]) {

        for (int i = 0; i < args.length; i++) {
          try {
            URL u = new URL(args[i]);
            HttpURLConnection http = (HttpURLConnection) u.openConnection( );
            // Ask only for the header; the body is never sent
            http.setRequestMethod("HEAD");
            System.out.println(u + " was last modified at "
             + new Date(http.getLastModified( )));
          } // end try
          catch (MalformedURLException ex) {
            System.err.println(args[i] + " is not a URL I understand");
          }
          catch (IOException ex) {
            System.err.println(ex);
          }
          System.out.println( );
        } // end for

      } // end main

    } // end LastModified

Here's the output from one run:

    D:\JAVA\JNP3\examples> java LastModified http://www.ibiblio.org/xml/
    http://www.ibiblio.org/xml/ was last modified at Thu Aug 19 06:06:57 PDT 2004

It wasn't absolutely necessary to use the HEAD method here. We'd have gotten the same results with GET. But if we used GET, the entire file at http://www.ibiblio.org/xml/ would have been sent across the network, whereas all we cared about was one line in the header. When you can use HEAD, it's much more efficient to do so.

15.11.1.2 OPTIONS

The OPTIONS request method asks what options are supported for a particular URL. If the request URL is an asterisk (*), the request applies to the server as a whole rather than to one particular URL on the server. For example:

    OPTIONS /xml/ HTTP/1.1
    User-Agent: Java/1.4.2_05
    Host: www.ibiblio.org
    Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
    Connection: close

The server responds to an OPTIONS request by sending an HTTP header with a list of the commands allowed on that URL.
For example, when the previous command was sent, here's what Apache responded:

    Date: Thu, 21 Oct 2004 18:06:10 GMT
    Server: Apache/1.3.4 (Unix) PHP/3.0.6 mod_perl/1.17
    Content-Length: 0
    Allow: GET, HEAD, POST, PUT, DELETE, CONNECT, OPTIONS, PATCH, PROPFIND, PROPPATCH, MKCOL, COPY, MOVE, LOCK, UNLOCK, TRACE
    Connection: close

The list of legal commands is found in the Allow field. However, in practice these are just the commands the server understands, not necessarily the ones it will actually perform on that URL. For instance, let's look at what happens when you try the DELETE request method.

15.11.1.3 DELETE

The DELETE method removes a file at a specified URL from a web server. Since this request is an obvious security risk, not all servers will be configured to support it, and those that are will generally demand some sort of authentication. A typical DELETE request looks like this:

    DELETE /javafaq/2004march.html HTTP/1.1
    User-Agent: Java/1.4.2_05
    Host: www.ibiblio.org
    Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
    Connection: close

The server is free to refuse this request or ask for identification. For example:

    Date: Thu, 19 Aug 2004 14:32:15 GMT
    Server: Apache/1.3.4 (Unix) PHP/3.0.6 mod_perl/1.17
    Allow: GET, HEAD, POST, PUT, DELETE, CONNECT, OPTIONS, PATCH, PROPFIND, PROPPATCH, MKCOL, COPY, MOVE, LOCK, UNLOCK, TRACE
    Connection: close
    Transfer-Encoding: chunked
    Content-Type: text/html
    content-length: 313

    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <HTML><HEAD>
    <TITLE>405 Method Not Allowed</TITLE>
    </HEAD><BODY>
    <H1>Method Not Allowed</H1>
    The requested method DELETE is not allowed for the URL /javafaq/2004march.html.<P>
    <HR>
    <ADDRESS>Apache/1.3.4 Server at www.ibiblio.org Port 80</ADDRESS>
    </BODY></HTML>

Even if the server accepts this request, its response is implementation-dependent. Some servers may delete the file; others simply move it to a trash directory. Others simply mark it as not readable. Details are left up to the server vendor.
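As a rough illustration of the exchange described above, the following sketch sends a DELETE request through HttpURLConnection and reports how the server reacted. The class name DeleteProbe and the helper method summarize( ) are hypothetical names invented for this sketch; only the HttpURLConnection calls and constants come from the standard library. Be careful where you point it: a server that honors DELETE may really remove the resource.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class DeleteProbe {

    // Hypothetical helper: turns a status code and Allow header into a short report.
    static String summarize(int code, String allow) {
        if (code == HttpURLConnection.HTTP_BAD_METHOD) {
            return "DELETE refused (405); Allow: " + allow;
        }
        if (code == HttpURLConnection.HTTP_UNAUTHORIZED) {
            return "authentication required (401)";
        }
        if (code >= 200 && code < 300) {
            return "server accepted DELETE (" + code + ")";
        }
        return "unexpected response " + code;
    }

    // Sends the DELETE request and summarizes the server's answer.
    static String tryDelete(URL u) throws IOException {
        HttpURLConnection http = (HttpURLConnection) u.openConnection();
        http.setRequestMethod("DELETE");
        try {
            int code = http.getResponseCode();
            return summarize(code, http.getHeaderField("Allow"));
        } finally {
            http.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            System.out.println(arg + ": " + tryDelete(new URL(arg)));
        }
    }
}
```

Keeping the reporting logic in summarize( ) separate from the network call makes it possible to check the handling of each response code without contacting a server.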
15.11.1.4 PUT

Many HTML editors and other programs that want to store files on a web server use the PUT method. It allows clients to place documents in the abstract hierarchy of the site without necessarily knowing how the site maps to the actual local filesystem. This contrasts with FTP, where the user has to know the actual directory structure as opposed to the server's virtual directory structure. Here's how a browser might PUT a file on a web server:

    PUT /hello.html HTTP/1.0
    Connection: Keep-Alive
    User-Agent: Mozilla/4.6 [en] (WinNT; I)
    Pragma: no-cache
    Host: www.ibiblio.org
    Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
    Accept-Encoding: gzip
    Accept-Language: en
    Accept-Charset: iso-8859-1,*,utf-8
    Content-Length: 364

    <!doctype html public "-//w3c//dtd html 4.0 transitional//en">
    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
    <meta name="Author" content="Elliotte Rusty Harold">
    <meta name="GENERATOR" content="Mozilla/4.6 [en] (WinNT; I) [Netscape]">
    <title>Mine</title>
    </head>
    <body>
    <b>Hello</b>
    </body>
    </html>

As with deleting files, allowing arbitrary users to PUT files on your web server is a clear security risk. Generally, some sort of authentication is required and the server must be specially configured to support PUT. The details are likely to vary from server to server. Most web servers do not include full support for PUT out of the box. For instance, Apache requires you to install an additional module just to handle PUT requests.

15.11.1.5 TRACE

The TRACE request method asks the server to echo back the HTTP header that it received from the client. The main reason for requesting this information is to see what any proxy servers between the server and client might be changing.
For example, suppose this TRACE request is sent:

    TRACE /xml/ HTTP/1.1
    Hello: Push me
    User-Agent: Java/1.4.2_05
    Host: www.ibiblio.org
    Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
    Connection: close

The server should respond like this:

    Date: Thu, 19 Aug 2004 17:50:02 GMT
    Server: Apache/1.3.4 (Unix) PHP/3.0.6 mod_perl/1.17
    Connection: close
    Transfer-Encoding: chunked
    Content-Type: message/http
    content-length: 169

    TRACE /xml/ HTTP/1.1
    Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
    Connection: close
    Hello: Push me
    Host: www.ibiblio.org
    User-Agent: Java/1.4.2_05

The first six lines are the server's normal response HTTP header. The lines from TRACE /xml/ HTTP/1.1 on are the echo of the original client request. In this case, the echo is faithful, although out of order. However, if there were a proxy server between the client and server, it might not be.

15.11.2 Disconnecting from the Server

Recent versions of HTTP support what's known as Keep-Alive. Keep-Alive enhances the performance of some web connections by allowing multiple requests and responses to be sent in a series over a single TCP connection. A client indicates that it's willing to use HTTP Keep-Alive by including a Connection field in the HTTP request header with the value Keep-Alive:

    Connection: Keep-Alive

However, when Keep-Alive is used, the server can no longer close the connection simply because it has sent the last byte of data to the client. The client may, after all, send another request. Consequently, it is up to the client to close the connection when it's done.

Java marginally supports HTTP Keep-Alive, mostly by piggybacking on top of browser support. It doesn't provide any convenient API for making multiple requests over the same connection.
However, in anticipation of a day when Java will better support Keep-Alive, the HttpURLConnection class adds a disconnect( ) method that allows the client to break the connection:

    public abstract void disconnect( )

In practice, you rarely if ever need to call this.

15.11.3 Handling Server Responses

The first line of an HTTP server's response includes a numeric code and a message indicating what sort of response is made. For instance, the most common response is 200 OK, indicating that the requested document was found. For example:

    HTTP/1.1 200 OK
    Date: Fri, 20 Aug 2004 15:33:40 GMT
    Server: Apache/1.3.4 (Unix) PHP/3.0.6 mod_perl/1.17
    Last-Modified: Sun, 06 Jun 1999 16:30:33 GMT
    ETag: "28d907-657-375aa229"
    Accept-Ranges: bytes
    Content-Length: 1623
    Connection: close
    Content-Type: text/html

    <HTML>
    <HEAD>
    rest of document follows...

Another response that you're undoubtedly all too familiar with is 404 Not Found, indicating that the URL you requested no longer points to a document. For example:

    HTTP/1.1 404 Not Found
    Date: Fri, 20 Aug 2004 15:39:16 GMT
    Server: Apache/1.3.4 (Unix) PHP/3.0.6 mod_perl/1.17
    Last-Modified: Mon, 20 Sep 1999 19:25:05 GMT
    ETag: "5-14ab-37e68a11"
    Accept-Ranges: bytes
    Content-Length: 5291
    Connection: close
    Content-Type: text/html

    <html>
    <head>
    <title>Lost ... and lost</title>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
    </head>
    <body bgcolor="#FFFFFF">
    <div align="left">
    <h1>404 FILE NOT FOUND</h1>
    Rest of error message follows...

There are many other, less common responses. For instance, code 301 indicates that the resource has permanently moved to a new location and the browser should redirect itself to the new location and update any bookmarks that point to the old location.
For example:

    HTTP/1.1 301 Moved Permanently
    Date: Fri, 20 Aug 2004 15:36:44 GMT
    Server: Apache/1.3.4 (Unix) PHP/3.0.6 mod_perl/1.17
    Location: http://www.ibiblio.org/javafaq/books/beans/index.html
    Connection: close
    Content-Type: text/html

    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <HTML><HEAD>
    <TITLE>301 Moved Permanently</TITLE>
    </HEAD><BODY>
    <H1>Moved Permanently</H1>
    The document has moved <A HREF="http://www.ibiblio.org/javafaq/books/beans/index.html">here</A>.<P>
    <HR>
    <ADDRESS>Apache/1.3.4 Server at www.ibiblio.org Port 80</ADDRESS>
    </BODY></HTML>

The first line of this response is called the response message. It will not be returned by the various getHeaderField( ) methods in URLConnection. However, HttpURLConnection has a method to read and return just the response message. This is the aptly named getResponseMessage():

    public String getResponseMessage( ) throws IOException

Often all you need from the response message is the numeric response code. HttpURLConnection also has a getResponseCode( ) method to return this as an int:

    public int getResponseCode( ) throws IOException

HTTP 1.0 defines 16 response codes. HTTP 1.1 expands this to 40 different codes. While some numbers, notably 404, have become slang almost synonymous with their semantic meaning, most of them are less familiar. The HttpURLConnection class includes 36 named constants representing the most common response codes. These are summarized in Table 15-3.

Table 15-3. The HTTP 1.1 response codes

Code | Meaning | HttpURLConnection constant
1XX | Informational. |
100 | The server is prepared to accept the request body and the client should send it; a new feature in HTTP 1.1 that allows clients to ask whether the server will accept a request before they send a large amount of data as part of the request. | N/A
101 | The server accepts the client's request in the Upgrade header field to change the application protocol; e.g., from HTTP 1.0 to HTTP 1.1. | N/A
2XX | Request succeeded. |
200 | The most common response code. If the request method was GET or POST, the requested data is contained in the response along with the usual headers. If the request method was HEAD, only the header information is included. | HTTP_OK
201 | The server has created a resource at the URL specified in the body of the response. The client should now attempt to load that URL. This code is sent only in response to POST requests. | HTTP_CREATED
202 | This rather uncommon response indicates that a request (generally from POST) is being processed, but the processing is not yet complete, so no response can be returned. However, the server should return an HTML page that explains the situation to the user and provides an estimate of when the request is likely to be completed, and, ideally, a link to a status monitor of some kind. | HTTP_ACCEPTED
203 | The resource representation was returned from a caching proxy or other local source and is not guaranteed to be up to date. | HTTP_NOT_AUTHORITATIVE
204 | The server has successfully processed the request but has no information to send back to the client. This is normally the result of a poorly written form-processing program on the server that accepts data but does not return a response to the user. | HTTP_NO_CONTENT
205 | The server has successfully processed the request but has no information to send back to the client. Furthermore, the client should clear the form to which the request is sent. | HTTP_RESET
206 | The server has returned the part of the document the client requested using the byte range extension to HTTP, rather than the whole document. | HTTP_PARTIAL
3XX | Relocation and redirection. |
300 | The server is providing a list of different representations (e.g., PostScript and PDF) for the requested document. | HTTP_MULT_CHOICE
301 | The resource has moved to a new URL. The client should automatically load the resource at this URL and update any bookmarks that point to the old URL. | HTTP_MOVED_PERM
302 | The resource is at a new URL temporarily, but its location will change again in the foreseeable future; therefore, bookmarks should not be updated. | HTTP_MOVED_TEMP
303 | Generally used in response to a POST form request, this code indicates that the user should retrieve a document other than the one requested (as opposed to a different location for the requested document). | HTTP_SEE_OTHER
304 | The If-Modified-Since header indicates that the client wants the document only if it has been recently updated. This status code is returned if the document has not been updated. In this case, the client should load the document from its cache. | HTTP_NOT_MODIFIED
305 | The Location header field contains the address of a proxy that will serve the response. | HTTP_USE_PROXY
307 | Almost the same as code 302, a 307 response indicates that the resource has moved to a new URL, although it may move again to a different URL in the future. The client should automatically load the page at this URL. | N/A
4XX | Client error. |
400 | The client request to the server used improper syntax. This is rather unusual in normal web browsing but more common when debugging custom clients. | HTTP_BAD_REQUEST
401 | Authorization, generally a username and password, is required to access this page. Either a username and password have not yet been presented or the username and password are invalid. | HTTP_UNAUTHORIZED
402 | Not used today, but may be used in the future to indicate that some sort of digital cash transaction is required to access the resource. | HTTP_PAYMENT_REQUIRED
403 | The server understood the request, but is deliberately refusing to process it. Authorization will not help. This might be used when access to a certain page is denied to a certain range of IP addresses. | HTTP_FORBIDDEN
404 | This most common error response indicates that the server cannot find the requested resource. It may indicate a bad link, a document that has moved with no forwarding address, a mistyped URL, or something similar. | HTTP_NOT_FOUND
405 | The request method is not allowed for the specified resource; for instance, you tried to PUT a file on a web server that doesn't support PUT or tried to POST to a URI that only allows GET. | HTTP_BAD_METHOD
406 | The requested resource cannot be provided in a format the client is willing to accept, as indicated by the Accept field of the request HTTP header. | HTTP_NOT_ACCEPTABLE
407 | An intermediate proxy server requires authentication from the client, probably in the form of a username and password, before it will retrieve the requested resource. | HTTP_PROXY_AUTH
408 | The client took too long to send the request, perhaps because of network congestion. | HTTP_CLIENT_TIMEOUT
409 | A temporary conflict prevents the request from being fulfilled; for instance, two clients are trying to PUT the same file at the same time. | HTTP_CONFLICT
410 | Like a 404, but makes a stronger assertion about the existence of the resource. The resource has been deliberately deleted (not moved) and will not be restored. Links to it should be removed. | HTTP_GONE
411 | The client did not send a required Content-length field in the request HTTP header. | HTTP_LENGTH_REQUIRED
412 | A condition for the request that the client specified in the request HTTP header is not satisfied. | HTTP_PRECON_FAILED
413 | The body of the client request is larger than the server is able to process at this time. | HTTP_ENTITY_TOO_LARGE
414 | The URI of the request is too long. This is important to prevent certain buffer overflow attacks. | HTTP_REQ_TOO_LONG
415 | The server does not understand or accept the MIME content-type of the request body. | HTTP_UNSUPPORTED_TYPE
416 | The server cannot send the byte range the client requested. | N/A
417 | The server cannot meet the client's expectation given in an Expect request-header field. | N/A
5XX | Server error. |
500 | An unexpected condition occurred that the server does not know how to handle. | HTTP_SERVER_ERROR, HTTP_INTERNAL_ERROR
501 | The server does not have a feature that is needed to fulfill this request. A server that cannot handle POST requests might send this response to a client that tried to POST form data to it. | HTTP_NOT_IMPLEMENTED
502 | This code is applicable only to servers that act as proxies or gateways. It indicates that the proxy received an invalid response from a server it was connecting to in an effort to fulfill the request. | HTTP_BAD_GATEWAY
503 | The server is temporarily unable to handle the request, perhaps due to overloading or maintenance. | HTTP_UNAVAILABLE
504 | The proxy server did not receive a response from the upstream server within a reasonable amount of time, so it can't send the desired response to the client. | HTTP_GATEWAY_TIMEOUT
505 | The server does not support the version of HTTP the client is using (e.g., the as-yet-nonexistent HTTP 2.0). | HTTP_VERSION

Example 15-10 is a revised source viewer program that now includes the response message. The lines added since SourceViewer2 are in bold.

Example 15-10.
A SourceViewer that includes the response code and message

    import java.net.*;
    import java.io.*;

    public class SourceViewer3 {

      public static void main (String[] args) {

        for (int i = 0; i < args.length; i++) {
          try {
            // Open the URLConnection for reading
            URL u = new URL(args[i]);
            HttpURLConnection uc = (HttpURLConnection) u.openConnection( );
            int code = uc.getResponseCode( );
            String response = uc.getResponseMessage( );
            System.out.println("HTTP/1.x " + code + " " + response);
            // Header 0 is the status line, so start at 1 and stop at the first gap
            for (int j = 1; ; j++) {
              String header = uc.getHeaderField(j);
              String key = uc.getHeaderFieldKey(j);
              if (header == null || key == null) break;
              System.out.println(key + ": " + header);
            } // end for
            InputStream in = new BufferedInputStream(uc.getInputStream( ));
            // chain the InputStream to a Reader
            Reader r = new InputStreamReader(in);
            int c;
            while ((c = r.read( )) != -1) {
              System.out.print((char) c);
            }
          }
          catch (MalformedURLException ex) {
            System.err.println(args[i] + " is not a parseable URL");
          }
          catch (IOException ex) {
            System.err.println(ex);
          }
        } // end for

      } // end main

    } // end SourceViewer3

The only thing this program doesn't read that the server sends is the version of HTTP the server is using. There's currently no method to return that. If you need it, you'll just have to use a raw socket instead. Consequently, in this example, we just fake it as "HTTP/1.x", like this:

    % java SourceViewer3 http://www.oreilly.com
    HTTP/1.x 200 OK
    Server: WN/1.15.1
    Date: Mon, 01 Nov 1999 23:39:19 GMT
    Last-modified: Fri, 29 Oct 1999 23:40:06 GMT
    Content-type: text/html
    Title: www.oreilly.com -- Welcome to O'Reilly & Associates! -- computer books, software, online publishing
    Link: <mailto:webmaster@ora.com>; rev="Made"

    <HTML>
    <HEAD>
    ...

15.11.3.1 Error conditions

On occasion, the server encounters an error but returns useful information in the message body nonetheless.
For example, when a client requests a nonexistent page from the www.ibiblio.org web site, rather than simply returning a 404 error code, the server sends the search page shown in Figure 15-2 to help the user figure out where the missing page might have gone.

Figure 15-2. IBiblio's 404 page

The getErrorStream( ) method returns an InputStream containing this data or null if no error was encountered or no data returned:

    public InputStream getErrorStream( ) // Java 1.2

In practice, this isn't necessary. Most implementations will return this data from getInputStream() as well.

15.11.3.2 Redirects

The 300-level response codes all indicate some sort of redirect; that is, the requested resource is no longer available at the expected location but it may be found at some other location. When encountering such a response, most browsers automatically load the document from its new location. However, this can be a security risk, because it has the potential to move the user from a trusted site to an untrusted one, perhaps without the user even noticing.

By default, an HttpURLConnection follows redirects. However, the HttpURLConnection class has two static methods that let you decide whether to follow redirects:

    public static boolean getFollowRedirects( )
    public static void setFollowRedirects(boolean follow)

The getFollowRedirects( ) method returns true if redirects are being followed, false if they aren't. With an argument of true, the setFollowRedirects( ) method makes HttpURLConnection objects follow redirects. With an argument of false, it prevents them from following redirects. Since these are static methods, they change the behavior of all HttpURLConnection objects constructed after the method is invoked. The setFollowRedirects( ) method may throw a SecurityException if the security manager disallows the change. Applets especially are not allowed to change this value.

Java has two methods to configure redirection on an instance-by-instance basis.
These are:

    public boolean getInstanceFollowRedirects( ) // Java 1.3
    public void setInstanceFollowRedirects(boolean followRedirects) // Java 1.3

If setInstanceFollowRedirects( ) is not invoked on a given HttpURLConnection, that HttpURLConnection simply follows the default behavior as set by the class method HttpURLConnection.setFollowRedirects( ).

15.11.4 Proxies

Many users behind firewalls or using AOL or other high-volume ISPs access the web through proxy servers. The usingProxy( ) method tells you whether the particular HttpURLConnection is going through a proxy server:

    public abstract boolean usingProxy( ) // Java 1.3

It returns true if a proxy is being used, false if not. In some contexts, the use of a proxy server may have security implications.

15.11.5 Streaming Mode

Every request sent to an HTTP server has an HTTP header. One field in this header is the Content-length; that is, the number of bytes in the body of the request. The header comes before the body. However, to write the header you need to know the length of the body, which you may not have yet. Normally the way Java solves this Catch-22 is by caching everything you write onto the OutputStream retrieved from the HttpURLConnection until the stream is closed. At that point, it knows how many bytes are in the body so it has enough information to write the Content-length header.

This scheme is fine for small requests sent in response to typical web forms. However, it's burdensome for responses to very long forms or some SOAP messages. It's very wasteful and slow for medium-to-large documents sent with HTTP PUT. It's much more efficient if Java doesn't have to wait for the last byte of data to be written before sending the first byte of data over the network. Java 1.5 offers two solutions to this problem. If you know the size of your data (for instance, you're uploading a file of known size using HTTP PUT), you can tell the HttpURLConnection object the size of that data.
If you don't know the size of the data in advance, then you can use chunked transfer encoding instead. In chunked transfer encoding, the body of the request is sent in multiple pieces, each with its own separate content length. To turn on chunked transfer encoding, just pass the size of the chunks you want to the setChunkedStreamingMode( ) method before you connect the URL:

    public void setChunkedStreamingMode(int chunkLength) // Java 1.5

Java will then use a slightly different form of HTTP than the examples in this book. However, to the Java programmer the difference is irrelevant. As long as you're using the URLConnection class instead of raw sockets and as long as the server supports chunked transfer encoding, it should all just work without any further changes to your code. However, not all servers support chunked encoding, though most of the late-model, major ones do. Even more importantly, chunked transfer encoding does get in the way of authentication and redirection. If you're trying to send chunked files to a redirected URL or one that requires password authentication, an HttpRetryException will be thrown. You'll then need to retry the request at the new URL or at the old URL with the appropriate credentials; and this all needs to be done manually without the full support of the HTTP protocol handler you normally have. Therefore, don't use chunked transfer encoding unless you really need it. As with most performance advice, this means you shouldn't implement this optimization until measurements prove the non-streaming default is a bottleneck.

If you do happen to know the size of the request data in advance, Java 1.5 lets you optimize the connection by providing this information to the HttpURLConnection object. If you do this, Java can start streaming the data over the network immediately. Otherwise, it has to cache everything you write in order to determine the content length, and only send it over the network after you've closed the stream.
If you know exactly how big your data is, pass that number to the setFixedLengthStreamingMode( ) method:

    public void setFixedLengthStreamingMode(int contentLength)

Java will use this number in the Content-length HTTP header field. However, if you then try to write more or less than the number of bytes given here, Java will throw an IOException. Of course, that will happen later, when you're writing data, not when you first call this method. The setFixedLengthStreamingMode( ) method itself will throw an IllegalArgumentException if you pass in a negative number, or an IllegalStateException if the connection is connected or has already been set to chunked transfer encoding. (You can't use both chunked transfer encoding and fixed-length streaming mode on the same request.)

Fixed-length streaming mode is transparent on the server side. Servers neither know nor care how the Content-length was set as long as it's correct. However, like chunked transfer encoding, streaming mode does interfere with authentication and redirection. If either of these is required for a given URL, an HttpRetryException will be thrown; you have to manually retry. Therefore, don't use this mode unless you really need it.
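To make the two streaming modes concrete, here is a minimal sketch of a PUT upload that picks fixed-length streaming when the size is known and falls back to chunked transfer encoding otherwise. The class name StreamingPut, its configure( ) and put( ) methods, the "negative length means unknown" convention, and the 4096-byte chunk size are all assumptions made for this illustration. It also uses the long overload of setFixedLengthStreamingMode( ), which was added in Java 7; on Java 1.5 you would pass an int instead.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class StreamingPut {

    // Chooses a streaming mode; must be called before the connection is opened.
    // A negative length means "size unknown" (a convention of this sketch only).
    static void configure(HttpURLConnection http, long length) throws IOException {
        http.setRequestMethod("PUT");
        http.setDoOutput(true);
        if (length >= 0) {
            http.setFixedLengthStreamingMode(length); // long overload, Java 7+
        } else {
            http.setChunkedStreamingMode(4096); // arbitrary chunk size
        }
    }

    // Streams the file to the target URL and returns the server's response code.
    static int put(URL target, Path file) throws IOException {
        HttpURLConnection http = (HttpURLConnection) target.openConnection();
        configure(http, Files.size(file));
        try (InputStream in = Files.newInputStream(file);
             OutputStream out = http.getOutputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n); // sent as written, not cached
            }
        }
        try {
            // May throw HttpRetryException if a redirect or authentication is needed
            return http.getResponseCode();
        } finally {
            http.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the target URL, args[1] the local file to upload
        System.out.println(put(new URL(args[0]), Paths.get(args[1])));
    }
}
```

Because an HttpRetryException is a kind of IOException, a caller that wants to retry manually would catch it around the call to put( ) and resend the request at the new location or with credentials.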