2.1 HTTP Basics

Java Servlet Programming, 2nd Edition > 2. HTTP Servlet Basics > 2.1 HTTP Basics

 
< BACKCONTINUE >

2.1 HTTP Basics

Before we can even show you a simple HTTP servlet, we need to make sure that you have a basic understanding of how the protocol behind the Web, HTTP, works. If you're an experienced CGI programmer (or if you've done any serious server-side web programming), you can safely skip this section. Better yet, you might skim it to refresh your memory about the finer points of the GET and POST methods. If you are new to the world of server-side web programming, however, you should read this material carefully, as the rest of the book is going to assume that you understand HTTP. For a more thorough discussion of HTTP and its methods, see HTTP Pocket Reference by Clinton Wong (O'Reilly).

2.1.1 Requests, Responses, and Headers

HTTP is a simple, stateless protocol. A client, such as a web browser, makes a request, the web server responds, and the transaction is done. When the client sends a request, the first thing it specifies is an HTTP command, called a method, that tells the server the type of action it wants performed. This first line of the request also specifies the address of a document (a URL) and the version of the HTTP protocol it is using. For example:

GET /intro.html HTTP/1.0

This request uses the GET method to ask for the document named intro.html, using HTTP Version 1.0. After sending the request, the client can send optional header information to tell the server extra information about the request, such as what software the client is running and what content types it understands. This information doesn't directly pertain to what was requested, but it could be used by the server in generating its response. Here are some sample request headers:

User-Agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows 95) Accept: image/gif, image/jpeg, text/*, */*

The User-Agent header provides information about the client software, while the Accept header specifies the media (MIME) types that the client prefers to accept. (We'll talk more about request headers in the context of servlets in Chapter 4.) After the headers, the client sends a blank line, to indicate the end of the header section. The client can also send additional data, if appropriate for the method being used, as it is with the POST method that we'll discuss shortly. If the request doesn't send any data, it ends with an empty line.

After the client sends the request, the server processes it and sends a response. The first line of the response is a status line specifing the version of the HTTP protocol the server is using, a status code, and a description of the status code. For example:

HTTP/1.0 200 OK

This status line includes a status code of 200, which indicates that the request was successful, hence the description OK. Another common status code is 404, with the description Not Found as you can guess, this means that the requested document was not found. Chapter 5 discusses common status codes and how you can use them in servlets, while Appendix D, provides a complete list of HTTP status codes.

After the status line, the server sends response headers that tell the client things like what software the server is running and the content type of the server's response. For example:

Date: Saturday, 23-May-00 03:25:12 GMT Server: Tomcat Web Server/3.2 MIME-version: 1.0 Content-type: text/html Content-length: 1029 Last-modified: Thursday, 7-May-00 12:15:35 GMT

The Server header provides information about the server software, while the Content-type header specifies the MIME type of the data included with the response. (We'll also talk more about response headers in Chapter 5.) The server sends a blank line after the headers, to conclude the header section.

If the request was successful, the requested data is then sent as part of the response. Otherwise, the response may contain human-readable data that explains why the server couldn't fulfill the request.

2.1.2 GET and POST

When a client connects to a server and makes an HTTP request, the request can be of several different types, called methods. The most frequently used methods are GET and POST. Put simply, the GET method is designed for getting information (a document, a chart, or the results from a database query), while the POST method is designed for posting information (a credit card number, some new chart data, or information that is to be stored in a database). To use a bulletin board analogy, GET is for reading and POST is for tacking up new material. GET is the method used when you type a URL directly into your browser or click on a hyperlink; either GET or POST can be used when submitting an HTML form.

The GET method, although it's designed for reading information, can include as part of the request some of its own information that better describes what to get such as an x, y scale for a dynamically created chart. This information is passed as a sequence of characters appended to the request URL in what's called a query string . Placing the extra information in the URL in this way allows the page to be bookmarked or emailed like any other. Because GET requests theoretically shouldn't need to send large amounts of information, some servers limit the length of URLs and query strings to about 240 characters.

The POST method uses a different technique to send information to the server because in some cases it may need to send megabytes of information. A POST request passes all its data, of unlimited length, directly over the socket connection as part of its HTTP request body. The exchange is invisible to the client. The URL doesn't change at all. Consequently, POST requests cannot be bookmarked or emailed or, in some cases, even reloaded. That's by design information sent to the server, such as your credit card number, should be sent only once. POST also provides a bit of extra security when sending sensitive information because the server's access log that records all URL accesses won't record the submitted POST data.

In practice, the use of GET and POST has strayed from the original intent. It's common for long parameterized requests for information to use POST instead of GET to work around problems with overly long URLs. It's also common for simple forms that upload information to use GET because, well why not, it works! Generally, this isn't much of a problem. Just remember that GET requests, because they can be bookmarked so easily, should not be allowed to cause a change on the server for which the client could be held responsible. In other words, GET requests should not be used to place an order, update a database, or take an explicit client action in any way.

2.1.3 Other Methods

In addition to GET and POST, there are several other lesser-used HTTP methods. There's the HEAD method, which is sent by a client when it wants to see only the headers of the response, to determine the document's size, modification time, or general availability. There's also PUT, to place documents directly on the server, and DELETE, to do just the opposite. These last two aren't widely supported due to complicated policy issues. The TRACE method is used as a debugging aid it returns to the client the exact contents of its request. Finally, the OPTIONS method can be used to ask the server which methods it supports or what options are available for a particular resource on the server.


Last updated on 3/20/2003
Java Servlet Programming, 2nd Edition, © 2001 O'Reilly

< BACKCONTINUE >


Java servlet programming
Java Servlet Programming (Java Series)
ISBN: 0596000405
EAN: 2147483647
Year: 2000
Pages: 223

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net