HTTP


To understand Web services using HTTP, you need to understand how HTTP works. HTTP is a simple protocol. Its messages are plain text, so they are easy to read and can travel over almost any transport medium you can think of. HTTP communication consists of a series of messages sent between a client and a server. These are classified as request and response messages. The HTTP standard defines several different request methods. The most common are GET and POST.

The GET Method

When the user types a URL into the address bar of a Web browser, a GET request is usually issued. If, for example, the user requests the file http://www.myduckjokes.com/latest.htm, the Web browser creates an HTTP packet, consisting of a request header, and sends it to the server whose IP address the host name in the URL resolves to. The header looks something like this:

    GET /latest.htm HTTP/1.1
    Host: www.myduckjokes.com
    Content-Type: text/html
    {blank line}

An HTTP packet consists of the following parts:

  • An initial line
  • Zero or more header lines
  • A blank line (a line that contains nothing but its line break, formally a carriage return and line feed)
  • An optional message body (such as a file, query data, or query output)
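The four parts listed above can be pulled apart with a few lines of code. The following Python sketch is only an illustration: the function name split_message is our own invention, and the sketch assumes the whole packet is already in memory as a string.

```python
def split_message(raw):
    """Split a raw HTTP message into (initial_line, header_lines, body)."""
    # Normalize CRLF line endings to LF so either style can be handled.
    text = raw.replace("\r\n", "\n")
    # The first blank line separates the header from the optional body.
    head, _, body = text.partition("\n\n")
    lines = head.split("\n")
    # The first line is the initial line; the rest are header lines.
    return lines[0], lines[1:], body


request = (
    "GET /latest.htm HTTP/1.1\r\n"
    "Host: www.myduckjokes.com\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
)
initial_line, header_lines, body = split_message(request)
print(initial_line)
print(header_lines)
print(repr(body))
```

A GET request normally carries no body, so the last value printed here is an empty string.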

The initial line indicates the purpose of the packet. In this case, the packet is a GET request instructing the server to access the file located at /latest.htm. The version of HTTP (1.1 in this example) finishes this line.

The next lines in the header indicate the parameters of the request. These consist of key-value pairs: the name of the parameter starts a line, followed by a colon and space, followed by the value of the parameter. The parameter ends with a line break (formally, a carriage return followed by a line feed).
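To make the layout of the initial line and the key-value pairs concrete, here is a small Python sketch that reads them into variables. The function name parse_request_head is hypothetical, and the sketch assumes the header has already been broken into individual lines.

```python
def parse_request_head(initial_line, header_lines):
    """Return (method, path, version, headers) for a request header."""
    # The initial line holds three space-separated fields.
    method, path, version = initial_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        # Split at the first colon only, since values may contain colons.
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


method, path, version, headers = parse_request_head(
    "GET /latest.htm HTTP/1.1",
    ["Host: www.myduckjokes.com", "Content-Type: text/html"],
)
print(method, path, version)
print(headers["host"])
```

Storing the names in lowercase is a design choice of this sketch; it lets the caller look up Host, host, or HOST in the same way.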

An HTTP request packet can contain dozens or even hundreds of parameters. It's in this packet that the browser can send the browser type, operating system type and version, and cookie information to a server. Obviously, the larger your HTTP packet is, the more time sending the request will take. The header ends with two newline characters in a row. All text after these newline characters is considered the body of the HTTP packet.

The packet is received by the HTTP server (also called a Web server), which tries to resolve the filename (/latest.htm). If possible, the HTTP server will find the page, perform any necessary processing, and return the resulting HTML as an HTTP response package. Listing 8-1 shows that the response package has an HTTP header, followed by two newline characters and then the payload. The payload is usually an HTML document.

Listing 8-1. An HTTP response packet.

 

HTTP Response

    HTTP/1.1 200 OK
    Date: Fri, 31 Dec 1999 23:59:59 GMT
    Content-Type: text/html
    Content-Length: 283

    <HTML>
    <HEAD>
    <TITLE>Latest Duck-Bar Jokes</TITLE>
    </HEAD>
    <BODY>
    <H1>Latest Duck-Bar Jokes</H1>
    <P>A duck walks into a bar. The bartender says, "We don't serve
    ducks here." The duck says, "That's OK, I don't really like duck,
    anyway. How about a beer?"</P>
    <P></P>
    </BODY>
    </HTML>

Notice that the response packet has information about itself in the first line of the header, followed by a series of key-value pairs that help the Web browser understand what to do with the packet.

The first line contains a status code followed by a short message (200 and OK in Listing 8-1). The most common status code is 200, which means "Package was delivered OK." Other common status codes are 404, which means that some resource wasn't found, and 500, which means that the server generated an error. The status code is meant to be readable by a computer; the message is meant to be readable by a human. Status codes are grouped into ranges by their first digit:

  • 1xx Indicates an informational message only
  • 2xx Indicates success of some kind
  • 3xx Redirects the client to another URL
  • 4xx Indicates an error on the client's part
  • 5xx Indicates an error on the server's part
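Because the first digit of the status code determines its range, a client can classify a response with simple integer arithmetic. The Python sketch below shows one way to do it; the names parse_status_line and status_class, and the wording of the labels, are our own and not part of HTTP.

```python
# Labels for the five ranges, keyed by the first digit of the status code.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def parse_status_line(line):
    """Split a status line into (version, code, message)."""
    # The message may contain spaces, so split at most twice.
    version, code, message = line.split(" ", 2)
    return version, int(code), message


def status_class(code):
    """Map a numeric status code to the label of its range."""
    return STATUS_CLASSES.get(code // 100, "unknown")


version, code, message = parse_status_line("HTTP/1.1 404 Not Found")
print(code, message, "->", status_class(code))
```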

All responses except those with 100-level status (but including error responses) must include the Date: header. All time values in HTTP use Greenwich Mean Time.
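As an illustration of the time format, the following Python sketch builds a Date: header with the standard library's email.utils.formatdate function, whose usegmt flag produces the GMT form that HTTP uses. The timestamp is chosen only to match the one in Listing 8-1.

```python
from datetime import datetime, timezone
from email.utils import formatdate

# One second before midnight on the last day of 1999, expressed in UTC.
moment = datetime(1999, 12, 31, 23, 59, 59, tzinfo=timezone.utc)

# usegmt=True ends the string with "GMT" rather than a numeric offset.
date_header = "Date: " + formatdate(moment.timestamp(), usegmt=True)
print(date_header)
```

Calling formatdate with only usegmt=True formats the current time, which is what a server would normally send.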

The POST Method

When the user hits a Submit button on a Web page, information is probably being posted to a Web server somewhere. HTML has elements that capture information from the user and send it to a Web server. In this example, we'll use two INPUT elements to retrieve data from a user, and we'll send that data to a server using the POST method. The file 1040.htm in Listing 8-2 shows a typical page that captures information from a user. You can find this file in the folder /Samples/Ch08 on the companion CD.

Listing 8-2. An HTML document containing a FORM element.

 

1040.htm

    <HTML>
    <HEAD>
    <TITLE>1040 Super E-Z Form</TITLE>
    </HEAD>
    <BODY STYLE="font-family:Verdana;">
    <H1>1040 Super E-Z Form</H1>
    <FORM METHOD="POST" ACTION="/cgi-bin/processForm.pl">
    <TABLE>
      <TR>
        <TD>Enter your social security number:</TD>
        <TD><INPUT TYPE="TEXT" NAME="socSecNum"></TD>
      </TR>
      <TR>
        <TD>How much did you make last year?</TD>
        <TD><INPUT TYPE="TEXT" NAME="income"></TD>
      </TR>
      <TR>
        <TD>Press "Calculate" for verdict</TD>
        <TD><INPUT TYPE="SUBMIT" VALUE="Calculate"></TD>
      </TR>
    </TABLE>
    </FORM>
    </BODY>
    </HTML>

Notice the FORM element. This element is used to combine user interface elements such as INPUT boxes, TEXTAREA fields, and SUBMIT buttons. In this example, FORM has two attributes, METHOD and ACTION. The METHOD attribute indicates that we are going to use the HTTP POST method to submit the data in the form to the Web server. The ACTION attribute contains the pathname of the script that's going to process this form data on the Web server. The .pl extension in the filename contained in the path indicates that we are going to execute a Perl script in the /cgi-bin directory. The Perl script will interpret the name-value pairs in the posted HTTP packet and (we hope) do something useful.

Loading the code in Listing 8-2 in a Web browser results in the view shown in Figure 8-1. The user has filled in the values.


Figure 8-1. An HTML form document in a Web browser.

When the user clicks the Calculate button, the Web browser creates the HTTP packet shown in Listing 8-3.

Listing 8-3. An HTTP package containing information captured from an HTML form.

 

HTTP Information

    POST /cgi-bin/processForm.pl HTTP/1.1
    Host: www.irs.gov
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 34

    socSecNum=123-45-6789&income=80000

The Content-Type header tells the server that the body of this packet is URL-encoded form data that the server will need to decode before routing it to the application. The Content-Length header gives the size of the body in bytes. The body contains the name of each input field, followed by an equal sign and the value the user typed in. An ampersand (&) separates each name-value pair.
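The encoding step that the browser performs can be imitated in a few lines. The Python sketch below uses the standard library's urllib.parse.urlencode function to build the body and then assembles the whole packet; the function name build_post is hypothetical, and the field values are the ones from the example form.

```python
from urllib.parse import urlencode


def build_post(path, host, fields):
    """Assemble a form-encoded POST packet as a single string."""
    # urlencode joins name=value pairs with "&" and escapes unsafe characters.
    body = urlencode(fields)
    head = [
        "POST {} HTTP/1.1".format(path),
        "Host: {}".format(host),
        "Content-Type: application/x-www-form-urlencoded",
        # Content-Length counts the bytes of the body only.
        "Content-Length: {}".format(len(body.encode("ascii"))),
    ]
    # A blank line separates the header from the body.
    return "\r\n".join(head) + "\r\n\r\n" + body


packet = build_post(
    "/cgi-bin/processForm.pl",
    "www.irs.gov",
    {"socSecNum": "123-45-6789", "income": "80000"},
)
print(packet)
```

Computing Content-Length from the encoded body, rather than typing it in by hand, keeps the header and the body from drifting apart when the form values change.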

When the server gets the package, it parses the values in the body and passes that information to the script. The script then takes over processing and is able to return a result through the HTTP server.
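The decoding that happens on the server side is the mirror image of the encoding. The form in this chapter is handled by a Perl script; purely for illustration, the Python sketch below shows the same parsing step with the standard library's urllib.parse.parse_qs function. The second body, with its plus sign and percent escape, is a made-up example.

```python
from urllib.parse import parse_qs

body = "socSecNum=123-45-6789&income=80000"

# parse_qs returns a list of values per name, because a name may repeat.
parsed = parse_qs(body)

# This form sends each name once, so keep only the first value.
fields = {name: values[0] for name, values in parsed.items()}
print(fields["socSecNum"], fields["income"])

# Encoded characters are decoded: "+" becomes a space, "%25" becomes "%".
print(parse_qs("name=Daffy+Duck&rate=50%25"))
```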



XML and SOAP Programming for BizTalk Servers
XML and SOAP Programming for BizTalk(TM) Servers (DV-MPS Programming)
ISBN: 0735611262
EAN: 2147483647
Year: 2000
Pages: 150
