The Hypertext Transfer Protocol (HTTP) is an application-level protocol that defines the information interchange between HTTP clients, commonly known as Web browsers, and HTTP servers, commonly referred to as Web servers. HTTP uses a request-response message mechanism for communication between a client and a server. An HTTP client opens a connection and sends a request message to an HTTP server; the server then returns a response message, usually containing the resource that was requested. After delivering the response, the server closes the connection, unless a persistent connection is in use, as HTTP/1.1 allows. HTTP is a stateless protocol; that is, it does not maintain any connection information across requests.
Three versions of HTTP have been used on the Internet since 1990: 0.9, 1.0, and 1.1. The HTTP version number consists of a major and a minor part. A minor number change implies the addition of some field values that do not change the general message-parsing algorithm. Major numbers are changed when the format of the message is altered.
The first version of HTTP was HTTP/0.9. It was a simple protocol for ASCII data transfer across a TCP/IP-based network. The next version, HTTP/1.0, was defined by RFC 1945. It brought significant changes to the original protocol by allowing both requests and responses to contain metadata describing the data being transferred, as well as additional modifiers. The requests and responses are based on the message format defined by the Multipurpose Internet Mail Extensions (MIME) in RFC 1521 and later in RFC 2045. For example, the introduction of the Content-Type header allowed multimedia data transfer from the server to the client, so that client software could display not only ASCII text but also other media types, such as images, sound, and video. This is what made GUI desktop browsers as we know them today possible.
Both request and response messages consist of an initial line, zero or more header lines, a blank line (i.e., a CRLF, carriage return followed by line feed, by itself), and an optional message-body. Initial lines for requests and responses are different. Header lines are usually different as well; however, some headers may be used in both requests and responses. The general format for request and response messages is as follows:
    <initial line>
    Header1: value1
    Header2: value2
    ...
    HeaderN: valueN

    <optional message-body>
Initial lines and headers should end in CRLF; CR and LF here mean ASCII values 13 and 10.
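The layout described above can be sketched in a few lines of Python. This is a minimal illustration, not a complete implementation: the function name build_message and the Accept header used in the example are assumptions chosen for demonstration and do not come from the specification text quoted here.

```python
CRLF = "\r\n"  # ASCII 13 (CR) followed by ASCII 10 (LF)


def build_message(initial_line, headers, body=b""):
    """Serialize a message as: initial line, header lines, blank line, body.

    `headers` is a list of (name, value) pairs; `body` is raw bytes.
    (Hypothetical helper, for illustration only.)
    """
    lines = [initial_line]
    lines += [f"{name}: {value}" for name, value in headers]
    # Every line ends in CRLF; one extra CRLF forms the blank separator line.
    head = CRLF.join(lines) + CRLF + CRLF
    return head.encode("ascii") + body


if __name__ == "__main__":
    # The Accept header here is only an example value.
    print(build_message("GET /page.html HTTP/1.0", [("Accept", "text/html")]))
```

Note that a message with no headers and no body still ends with the blank line, so the initial line is followed by two CRLF pairs.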
A typical example of an HTTP/1.0 request-response interchange is shown in Figure A-1. In this example, a browser sends a request to a server asking for a file called page.html located in the root folder of the Web server. The initial line of the HTTP request has three parts, separated by spaces: a method name, the local path of the requested resource, and the version of HTTP being used, as follows:
GET /page.html HTTP/1.0
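As a rough sketch of how a server might split such a request line into its three parts, consider the following Python fragment. The names RequestLine and parse_request_line are hypothetical, and the validation is deliberately minimal.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str   # e.g. GET
    path: str     # local path of the requested resource
    version: str  # e.g. HTTP/1.0


def parse_request_line(line: str) -> RequestLine:
    """Split a request line into method, path, and HTTP version."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        raise ValueError(f"expected 3 space-separated parts, got {len(parts)}")
    method, path, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError(f"unexpected version field: {version!r}")
    return RequestLine(method, path, version)


if __name__ == "__main__":
    print(parse_request_line("GET /page.html HTTP/1.0\r\n"))
    # RequestLine(method='GET', path='/page.html', version='HTTP/1.0')
```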
Figure A-1. HTTP/1.0 client-server interaction
When the server gets the request, it retrieves the file, forms a response, and sends the response back to the requesting client as follows:
    HTTP/1.0 200 OK
    Content-Type: text/html

    <html> ... </html>
The initial response line is called the status line. It has three parts separated by spaces: the HTTP version, a status code, and a reason phrase describing the status code. The Content-Type header specifies the media type of the data being sent to the client in the response's message body in the format of the MIME type. Because the server is sending back an HTML file, the MIME type is text/html. The body of the message contains the requested resource; the header lines and the body are separated by a blank line (CRLF).
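The structure of a response can be illustrated with a small parser. The sketch below assumes the whole response is already in memory as bytes and that the status line always carries a reason phrase; the function name parse_response is hypothetical, and real clients handle many more cases (repeated headers, missing reason phrases, and so on).

```python
def parse_response(raw: bytes):
    """Split a raw response into status-line parts, headers, and body."""
    # The first blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: version, status code, reason phrase (which may contain spaces).
    version, status, reason = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()

    return version, int(status), reason, headers, body


if __name__ == "__main__":
    raw = b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html> ... </html>"
    version, status, reason, headers, body = parse_response(raw)
    print(version, status, reason)   # HTTP/1.0 200 OK
    print(headers["content-type"])   # text/html
    print(body)                      # b'<html> ... </html>'
```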
HTTP/1.1 is defined by RFC 2616. It extends the functionality of HTTP/1.0 by adding support for virtual server hosting, persistent connections between the client and the server, caching, hierarchical proxies, and gateways. For example, an HTTP/1.1 request must include the Host header that identifies a Web server, as shown in Figure A-2.
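To make the Host requirement concrete, the following sketch builds a minimal HTTP/1.1 GET request. The function name build_http11_get and the host www.example.com are illustrative assumptions. The Connection: close header is also an addition made here for simplicity: it asks the server to close the connection after the response rather than keep it open.

```python
def build_http11_get(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request that includes the Host header."""
    if not host:
        raise ValueError("HTTP/1.1 requests must carry a Host header")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",      # identifies which Web server (virtual host) is meant
        "Connection: close",  # assumption: opt out of a persistent connection
    ]
    # Each line ends in CRLF; a final empty line terminates the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


if __name__ == "__main__":
    # www.example.com is a placeholder host name.
    print(build_http11_get("www.example.com", "/page.html").decode("ascii"))
```

Because several virtual servers may share one IP address, the Host header is what lets the receiving machine decide which site the request path refers to.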
Figure A-2. HTTP/1.1 client-server interaction
NOTE: Modern browsers, as well as HTTP proxy servers, include support for both HTTP/1.0 and 1.1. However, users can easily disable HTTP/1.1 on their client software if they want. Also, it is possible to configure a proxy server to use only HTTP/1.0. These factors mean that Web servers need to support both HTTP/1.1 and HTTP/1.0 so that they are able to communicate with the widest possible variety of requesting clients.
The rest of this appendix looks into the formats, methods, header fields, and status codes for HTTP requests and responses. It includes partial quotes from RFC 2616. For the detailed specification, refer to the full version of RFC 2616.