These days many web pages are not served from static files on the hard drive. Instead, the server generates them dynamically to meet user requests. The content may be pulled from a database or generated algorithmically by a program. Indeed, the actual page delivered to the client may contain data combined from several different sources. In Java, such server-side programs are often written using servlets or JavaServer Pages (JSP). They can also be written with other languages, such as C and Perl, or other frameworks, such as ASP and PHP. The concern in this book is not so much with how these programs are written as with how your programs communicate with them. One advantage of HTTP is that it really doesn't matter how the other side of the connection is written, as long as it speaks the same basic HTTP protocol.
The simplest server-side programs run without any input from the user. From the viewpoint of the client, these programs are accessed like any other web page and aren't of much concern to this book. The difference between a web page produced by a program that takes no input and a web page written in static HTML is all on the server side. When writing clients, you don't need to know or care whether the web server is feeding you a file or the output of some program it ran. Your interface to the server is the same in either case.
A slightly more complex server-side program processes user input from HTML forms. A web form is essentially just a way of collecting input from the user, dividing it into neat pieces, and passing those pieces to some program on the server. A client written in Java can perform the same function, either by asking the user for input in its own GUI or by providing its own unique information.
HTTP provides a standard, well-understood, and well-supported means for Java applets and applications to talk to remote systems; therefore, I will cover how to use Java both to send data to the server and to receive data from it. There are other ways for Java programs to talk to servers, including Remote Method Invocation (RMI) and SOAP. However, RMI is slow and SOAP is quite complex. By way of contrast, HTTP is mature, robust, better supported across multiple platforms and web servers, and better understood in the web development community.
Example 3-1 and Figure 3-3 show a simple form with two fields that collects a name and an email address. The values the user enters in the form are sent back to the server when the user presses the "Submit Query" button. The program to run when the form data is received is /cgi/reg.pl; the program is specified in the ACTION attribute of the FORM element. The URL in this attribute is usually a relative URL, as it is in this example.
Example 3-1. A simple form with input fields for a name and an email address
<HTML>
<HEAD>
<TITLE>Sample Form</TITLE>
</HEAD>
<BODY>
<FORM METHOD=GET ACTION="/cgi/reg.pl">
<PRE>
Please enter your name: <INPUT NAME="username" SIZE=40>
Please enter your email address: <INPUT NAME="email" SIZE=40>
</PRE>
<INPUT TYPE="SUBMIT">
</FORM>
</BODY>
</HTML>
Figure 3-3. A simple form
The web browser reads the data the user types and encodes it in a simple fashion. The name of each field is separated from its value by the equals sign (=). Different fields are separated from each other by an ampersand (&). Each field name and value is x-www-form-urlencoded; that is, any non-ASCII or reserved characters are replaced by a percent sign followed by hexadecimal digits giving the value for that character in some character set. Spaces are a special case because they're so common. Instead of being encoded as %20, they become the + sign. The plus sign itself is encoded as %2b. For example, the data from the form in Figure 3-3 is encoded as:
username=Elliotte+Harold&email=elharo%40macfaq.com
This is called the query string.
There are two methods by which the query string can be sent to the server: GET and POST. If the form specifies the GET method, the browser attaches the query string to the URL it sends to the server. Forms that specify POST send the query string on an output stream. The form in Example 3-1 uses GET to communicate with the server, so it connects to the server and sends the following command:
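A Java client can build the same kind of query string itself. The following is a minimal sketch, assuming Java 10 or later (for the URLEncoder.encode overload that takes a Charset) and assuming UTF-8 as the character set; the class name QueryStringSketch and the sample values are illustrative only.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, named here only for illustration.
class QueryStringSketch {

    // Joins name=value pairs with '&'. Each name and value is encoded
    // separately so that '=' and '&' inside the data cannot be confused
    // with the separators.
    static String encode(Map<String, String> fields) {
        StringBuilder query = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (query.length() > 0) {
                query.append('&');
            }
            // Assumption: UTF-8 is the character set the server expects.
            query.append(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8));
            query.append('=');
            query.append(URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return query.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order, as a form would.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("username", "Elliotte Harold");
        fields.put("email", "elharo@macfaq.com");
        // Prints: username=Elliotte+Harold&email=elharo%40macfaq.com
        System.out.println(encode(fields));
    }
}
```

URLEncoder writes its hexadecimal digits in uppercase, so a literal plus sign comes out as %2B rather than %2b; the two forms are equivalent.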
GET /cgi/reg.pl?username=Elliotte+Harold&email=elharo%40macfaq.com HTTP/1.0
The server uses the path component of the URL to determine which program should handle this request. It passes the query string's set of name-value pairs to that program, which normally takes responsibility for replying to the client.
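From the client side, a GET submission amounts to appending the query string to the program's URL and reading whatever comes back. The sketch below illustrates that under several assumptions: the host www.example.com is a placeholder, the class and method names are hypothetical, and the response is assumed to be UTF-8 text. The network call only runs when --fetch is passed, so the URL handling can be tried offline.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, named here only for illustration.
class GetFormSketch {

    // Appends an already-encoded query string to the program's URL.
    static String withQuery(String programUrl, String queryString) {
        return programUrl + "?" + queryString;
    }

    // Issues the GET request and returns the response body as text.
    // Assumption: the server replies with UTF-8 text.
    static String fetch(String url) throws IOException {
        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                URI.create(url).toURL().openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
        }
        return body.toString();
    }

    public static void main(String[] args) throws IOException {
        // www.example.com stands in for a real server.
        String url = withQuery("http://www.example.com/cgi/reg.pl",
                "username=Elliotte+Harold&email=elharo%40macfaq.com");
        URI parsed = URI.create(url);
        // The path selects the program; the query carries the form data.
        System.out.println("Path:  " + parsed.getRawPath());
        System.out.println("Query: " + parsed.getRawQuery());
        if (args.length > 0 && args[0].equals("--fetch")) {
            System.out.println(fetch(url));
        }
    }
}
```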
With the POST method, the web browser sends the usual headers and follows them with a blank line (two successive carriage return/linefeed pairs) and then sends the query string. If the form in Example 3-1 used POST, it would send this to the server:
POST /cgi/reg.pl HTTP/1.0
Content-type: application/x-www-form-urlencoded
Content-length: 50

username=Elliotte+Harold&email=elharo%40macfaq.com
There are many different form tags in HTML that produce pop-up menus, radio buttons, and more. However, although these input widgets appear different to the user, the format of data they send to the server is the same. Each form element provides a name and an encoded string value.
Because GET requests include all necessary information in the URL, they can be bookmarked, linked to, spidered, googled, and so forth. The results of a POST request cannot. This is deliberate. GET is intended for noncommittal actions, like browsing a static web page. POST is intended for actions that commit to something. For example, adding items to a shopping cart should be done with GET, because this action doesn't commit; you can still abandon the cart. However, placing the order should be done with POST because that action makes a commitment. This is why browsers ask you if you're sure when you go back to a page that uses POST (as shown in Figure 3-4). Reposting data may buy two copies of a book and charge your credit card twice.
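To make the layout of a POST request concrete, here is a rough sketch in Java. The rawRequest method assembles the text of a minimal HTTP/1.0 request by hand, computing Content-length from the body; the post method sends the same body through HttpURLConnection, which writes the request line and headers itself. The class name, method names, and sample data are assumptions for illustration, and the network call runs only if a target URL is supplied on the command line.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, named here only for illustration.
class PostFormSketch {

    // Builds the text of a minimal HTTP/1.0 POST request: request line,
    // headers, a blank line, then the query string as the body.
    static String rawRequest(String path, String queryString) {
        // An encoded query string is pure ASCII, so bytes equal characters.
        int length = queryString.getBytes(StandardCharsets.US_ASCII).length;
        return "POST " + path + " HTTP/1.0\r\n"
                + "Content-type: application/x-www-form-urlencoded\r\n"
                + "Content-length: " + length + "\r\n"
                + "\r\n"
                + queryString;
    }

    // Sends the query string as the request body and returns the status code.
    static int post(String url, String queryString) throws IOException {
        byte[] body = queryString.getBytes(StandardCharsets.US_ASCII);
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true); // this request carries a body
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        conn.setFixedLengthStreamingMode(body.length); // sets Content-Length
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body);
        }
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        String query = "username=Elliotte+Harold&email=elharo%40macfaq.com";
        System.out.println(rawRequest("/cgi/reg.pl", query));
        if (args.length > 0) {
            // args[0] is a URL you supply; none is assumed here.
            System.out.println("Status: " + post(args[0], query));
        }
    }
}
```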
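Whatever widget produced a value, the receiving program sees only encoded name-value pairs. As an illustration of the receiving side, the following sketch splits a query string back into its fields. It assumes Java 10 or later and UTF-8 encoding; the class name and the size and color fields are hypothetical. It also simplifies by keeping only the last value when a name repeats, which a real form handler would need to treat more carefully.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, named here only for illustration.
class FormDataSketch {

    // Splits on '&' and '=' first and decodes afterward, so an encoded
    // separator inside a value is not mistaken for a real one.
    static Map<String, String> decode(String queryString) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : queryString.split("&")) {
            if (pair.isEmpty()) {
                continue; // ignore empty segments such as "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            // Simplification: a repeated name overwrites the earlier value.
            fields.put(URLDecoder.decode(name, StandardCharsets.UTF_8),
                       URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return fields;
    }

    public static void main(String[] args) {
        // "size" and "color" stand in for a pop-up menu and a radio button.
        Map<String, String> fields =
                decode("username=Elliotte+Harold&size=large&color=red");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            System.out.println(field.getKey() + " -> " + field.getValue());
        }
    }
}
```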
Figure 3-4. Repost confirmation
In practice, POST is vastly overused on the web today. Any safe operation that does not commit the user to anything should use GET rather than POST. Only operations that commit the user should use POST.