3.7 HTTP Headers

This section briefly covers every HTTP header mentioned in this book, plus a few bonus headers. You can skim through it now to familiarize yourself with the wealth of headers and what they can do, or refer back to this section later when you encounter an unfamiliar header.

Beyond the HTTP headers listed here, more headers are defined in the HTTP/1.1 specification and elsewhere. Many clients and servers also use custom headers that aren't standardized and aren't even intended to be recognized by other software. HTTP/1.1 specifies that unknown headers must be ignored (unless they begin with Content-). Examples are given for most headers.

3.7.1 Accept-Ranges Header

The Accept-Ranges header allows the server to advertise in an OPTIONS response whether it supports range requests and how it can measure ranges. The only value defined in the HTTP specification for this header is bytes, indicating that the server can accept byte-range requests and return partial resource contents.

When clients know that the server supports range requests, clients can download large files piece by piece. Since a dropped connection is more likely to occur in a long download than a short one merely due to increased time spent, this feature is frequently used to break a large download into chunks. Usually, if a ranged request fails, only that ranged request needs to be reissued, rather than all of them.

 
 Accept-Ranges: bytes 

Note that the Accept-Ranges header may have different values depending on what resource is the target of the OPTIONS request. A dynamically generated resource might not support byte-range requests, whereas a static resource in the same directory would.

The Accept-Ranges header also appears on other kinds of responses. Apache 1.3 sends the Accept-Ranges header in response to GET requests.
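The piece-by-piece download described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the response headers are assumed to be available as a plain dictionary keyed by the exact name Accept-Ranges, and the byte ranges use the HTTP/1.1 form bytes=first-last, in which the last position is inclusive.

```python
def supports_byte_ranges(response_headers):
    """Return True if the (hypothetical) header dict advertises byte ranges."""
    value = response_headers.get("Accept-Ranges", "")
    units = [unit.strip().lower() for unit in value.split(",")]
    return "bytes" in units


def plan_byte_ranges(total_length, chunk_size):
    """Return Range header values that together cover total_length bytes."""
    if total_length < 0 or chunk_size <= 0:
        raise ValueError("total_length must be >= 0 and chunk_size must be > 0")
    ranges = []
    for start in range(0, total_length, chunk_size):
        # The last byte position in a range is inclusive.
        end = min(start + chunk_size, total_length) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges


if __name__ == "__main__":
    if supports_byte_ranges({"Accept-Ranges": "bytes"}):
        for value in plan_byte_ranges(1882, 512):
            print("Range:", value)
```

If one of the planned requests fails, only that range needs to be requested again, which is the benefit the text describes.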

3.7.2 Allow Header

The Allow header is used mostly in OPTIONS responses. It can also be used in a PUT request and response, but it rarely is in practice. This header lists the methods that are currently supported by the resource addressed in the request. This value can change over time because it may depend not only on the type of the resource but also on its current state.

 
 Allow: GET, HEAD, OPTIONS, TRACE 

When an OPTIONS * request is used to find out the server's general feature set, the Allow header in the response should include every method known to the server, if possible.

3.7.3 Authorization Header

The Authorization header allows the client to provide user authentication credentials to the server in an HTTP request.

In the following example of Basic syntax, the word BASIC comes first so that the server knows how to interpret the header. In Basic syntax, the rest of the header value is simply the result of base-64 encoding the username and password joined by a colon (:).

 
 Authorization: BASIC bGlzYTp0ZXN0 
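As a minimal sketch of the encoding just described, the following Python builds and decodes a Basic value. The function names are hypothetical, and the use of UTF-8 for the credentials is an assumption made for this example. The scheme name is written BASIC to match the example above; scheme names are compared case-insensitively.

```python
import base64


def basic_authorization(username, password):
    """Build an Authorization header value using Basic syntax."""
    # Assumption for this sketch: credentials are encoded as UTF-8.
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"BASIC {token}"


def decode_basic(header_value):
    """Recover (username, password) from a Basic Authorization value."""
    scheme, _, token = header_value.strip().partition(" ")
    if scheme.upper() != "BASIC":
        raise ValueError("not a Basic Authorization value")
    decoded = base64.b64decode(token).decode("utf-8")
    # Only the first colon separates the username from the password.
    username, _, password = decoded.partition(":")
    return username, password


if __name__ == "__main__":
    print(decode_basic("BASIC bGlzYTp0ZXN0"))
```

Decoding the example value yields the username lisa and the password test, which shows that Basic syntax only encodes the credentials and does not encrypt them.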

The Digest syntax and mechanism are much more complicated. Both authentication mechanisms are described in a companion document to the HTTP specification, "HTTP Authentication: Basic and Digest Access Authentication" [RFC2617].

3.7.4 Accept-* Headers

With the OPTIONS response, we've already seen the major mechanisms allowing clients to get information about the supported features of a Web server. What we haven't seen yet is how servers learn about client support and user preferences. A series of headers on requests provides this kind of information (see Table 3-2). Usually, these headers appear on GET requests, but they can be used more widely whenever the client expects a response body (e.g., a POST request).

The Accept-Ranges header is defined in Section 3.7.1 because it usually appears in OPTIONS responses.

3.7.5 Content-* Headers

Whenever an HTTP message contains a body, it must use at least some headers to provide information about the body, most commonly Content-Length and Content-Type (see Table 3-3).

3.7.6 Conditional Headers

The conditional headers allow clients to put conditional statements on the request such that the server will process the request only if the conditions are met.

Accept Headers on Every Request

The designers of HTTP wanted the protocol to be stateless. A stateless protocol is one that never or rarely requires a server to cache or save information from one request to use in a later request. Since the server can't remember what functions the client supports, the client must put this information in every request.

Putting extra information on every request trades off bandwidth for state. The designers of HTTP were gambling that the ability for servers to reply quickly and to many users made up for the "wasteful" use of bandwidth. The Accept headers are a good example of using bandwidth in an attempt to improve performance.

On Web servers that use cookies extensively to look up state, the advantage may be lost. Still, it remains possible to write a very fast server that doesn't maintain any state, doesn't use cookies, and handles many requests flexibly according to the header values used.


Table 3-2. HTTP Accept-* Headers, Descriptions, and Examples

Accept-Charset

Lists the character sets that the client is capable of handling (comma-separated).

 Accept-Charset: iso-8859-5, unicode-1-1 

Accept-Encoding

Lists the body encodings that the client is capable of handling (comma-separated). The only common values for this header denote the Unix "compress" encoding and the GNU "gzip" encoding, both of which provide compression. Note that Netscape 6.2 requests gzip encoding even though it does not handle the result properly: it fails to unzip the response before handing it on to the user or file system. This has been fixed in later versions.

 Accept-Encoding: compress, gzip 

Accept-Language

Lists the human languages that the user prefers or can handle, in comma-separated values. Usually this kind of information is derived from the locale of the client software or user preference settings, rather than functional capabilities of the client software. We show here an example that includes preference information, with French (fr) being the most preferred language, followed by English (en) and then German (de), as indicated by decreasing q values. The language code is separated from the q value by a semicolon. No q value implies a q value of 1 (maximum).

 Accept-Language: fr, de;q=0.5, en;q=0.7 
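The q-value ordering in the Accept-Language entry can be sketched as follows. The function name is hypothetical, and the parsing is deliberately simple: it assumes well-formed input and treats a missing q value as 1, as described above.

```python
def parse_accept_language(value):
    """Return (language, q) pairs ordered from most to least preferred."""
    preferences = []
    for item in value.split(","):
        parts = [part.strip() for part in item.split(";")]
        language = parts[0]
        if not language:
            continue
        q = 1.0  # no q value implies the maximum, 1
        for param in parts[1:]:
            name, _, raw = param.partition("=")
            if name.strip().lower() == "q":
                q = float(raw)
        preferences.append((language, q))
    # sorted() is stable, so languages with equal q keep their header order.
    return sorted(preferences, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for language, q in parse_accept_language("fr, de;q=0.5, en;q=0.7"):
        print(language, q)
```

For the example header, this orders the languages as fr, en, de, matching the preference order given in the description.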

If-Match, If-None-Match, and If-Range all compare ETag values known to the client to current ETag values for resources on the server. The client may store ETags in its resource cache or remember the ETag for a file that it is editing. These may only be used by the client if the server supports ETags, of course. Servers usually indicate ETag support by returning the ETag in GET and PUT responses. These headers are useful for maintaining caches but absolutely essential in distributed authoring scenarios to avoid accidental overwrites.

Table 3-3. HTTP Content-* Headers, Descriptions, and Examples

Content-Type

The media type, MIME type, or file type of the response body. Usually text/html for Web pages.

 Content-Type: text/html; charset=UTF-8 

Legal MIME types and character sets are both registered with the Internet Assigned Numbers Authority (IANA). The UTF-8 transformation for Unicode is defined in [RFC2279].

Content-Length

The length of the response body, given as an integer number of octets. When no body is present, Content-Length may be 0, or the header may be omitted. See Section 3.2.8 for other cases concerning message length.

 Content-Length: 1882 

Content-Language

Shows the language(s) that the body is in. If the sender does not know the language or considers that the content is appropriate for any preferred language, this header is omitted.

 Content-Language: en-US 

Content-Encoding

Lists any additional encodings on the body. For example, if the body was compressed, the server would list the compression algorithm used. This may include several sequential encodings in the order they were applied. If no encodings were used, this header is omitted.

 Content-Encoding: gzip 

Content-Location

The preferred address or location for the resource. This may be an absolute URL or a relative URL (relative to the server/host that responds to the request). This is most useful for resources that have multiple variants, for example, documents that exist in several languages.

 Content-Location: /hr/recruiting/job-1229-fr.txt 

Content-MD5

An MD5 digest of the response body, provided so that the client can check the integrity of the response. This could guard against transmission errors.

 Content-MD5: Q2hlY2sgSW50ZWdyaXR5IQ== 

The Content-MD5 header is discussed in [RFC1864].

Content-Range

When a partial response body is sent, this indicates how much and what part of the body was sent.

 Content-Range: bytes 0-511/1882 
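The integrity check described in the Content-MD5 entry of Table 3-3 can be sketched as follows. Per [RFC1864], the header value is the base-64 encoding of the 128-bit MD5 digest of the body. The function names here are hypothetical.

```python
import base64
import hashlib


def content_md5(body):
    """Return the Content-MD5 value for a body given as bytes."""
    digest = hashlib.md5(body).digest()  # 16 raw bytes, not the hex string
    return base64.b64encode(digest).decode("ascii")


def body_matches(body, header_value):
    """Check a received body against the Content-MD5 value sent with it."""
    return content_md5(body) == header_value.strip()


if __name__ == "__main__":
    body = b"<html><body>Hello</body></html>"
    header = content_md5(body)
    print("Content-MD5:", header)
    print("intact:", body_matches(body, header))
    print("corrupted:", body_matches(body + b"x", header))
```

A check like this catches accidental transmission errors only. It does not protect against deliberate tampering, because anyone who can alter the body can also recompute the header.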

If-Modified-Since and If-Unmodified-Since serve mostly the same purpose as If-Match and If-None-Match, but these headers use the last-modified timestamp instead and can only handle one value. Comparing timestamps is a less reliable technique (suffering from clock skew, time zone, and granularity problems [Krishnamurthy99]), but it was the only technique specified in HTTP/1.0 and the only technique supported on some Web servers. ETags and the headers to compare ETags were added in HTTP/1.1. For clients, these headers are a less reliable fallback mechanism if the server doesn't support ETags. The server will probably supply either the ETag or the Last-Modified date when responding to GET or PUT requests.

In the conditional headers that accept more than one value, multiple values are separated by commas. For If-Match and If-None-Match, * is a special value indicating "any ETag." When a request fails due to a conditional header, the response is normally 412 Precondition Failed; a GET or HEAD request that fails If-None-Match or If-Modified-Since receives 304 Not Modified instead. If-Range is a further exception (see Table 3-4).

Table 3-4. HTTP Conditional Headers, Descriptions, and Examples

If-Match

The server compares the requested resource to all ETags in the If-Match header. The request fails if the resource fails to match any of the ETags. This allows the client to upload a new instance of a page only if the page hasn't changed since the last instance was downloaded.

 If-Match: "10880-22388" 

If-None-Match

The request fails if the requested resource matches any of the ETags in this header. If-None-Match: * is a special construct that will cause the request to fail if the requested resource already exists, because * can match any ETag. This header allows the client to download a resource only if the resource is different from any of the client's cached instances.

 If-None-Match: "10880-22388", "10880-21271" 

If-Modified-Since

The request fails if the requested resource has not been modified since the date given in the header value.

 If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT 

If-Unmodified-Since

The request fails if the requested resource has been modified since the date given.

 If-Unmodified-Since: Sat, 29 Oct 1994 19:43:31 GMT 

If-Range

This header is used with the Range header when the client has part of a resource and wants to download the rest of the resource while making sure that the part already received has not changed. This header can use either ETags like If-Match (the preferred method) or dates like If-Modified-Since. If the condition succeeds, a partial resource is returned (the part specified in the Range header). If the condition fails, the entire resource is returned.

 If-Range: "10880-22388" 
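The ETag comparisons in Table 3-4 can be sketched from the server's point of view. This is a simplified illustration, not a complete implementation: the function names are hypothetical, ETags are compared as exact strings (weak ETags are not handled), ETags are assumed not to contain commas, and a current ETag of None stands for a resource that does not exist.

```python
def parse_etag_list(value):
    """Split a comma-separated ETag header value into individual entries."""
    return [tag.strip() for tag in value.split(",") if tag.strip()]


def if_match_passes(header_value, current_etag):
    """If-Match passes only if the resource matches one of the listed ETags."""
    if current_etag is None:
        return False  # nothing exists, so nothing can match (not even *)
    tags = parse_etag_list(header_value)
    return "*" in tags or current_etag in tags


def if_none_match_passes(header_value, current_etag):
    """If-None-Match passes only if the resource matches none of the ETags."""
    if current_etag is None:
        return True
    tags = parse_etag_list(header_value)
    return "*" not in tags and current_etag not in tags


def if_range_result(header_etag, current_etag):
    """If-Range never fails the request; it only selects the body returned."""
    return "partial" if header_etag.strip() == current_etag else "entire"


if __name__ == "__main__":
    current = '"10880-22388"'
    print(if_match_passes('"10880-22388"', current))
    print(if_none_match_passes('"10880-22388", "10880-21271"', current))
    print(if_range_result('"10880-22388"', current))
```

A client that sends PUT with If-None-Match: * relies on the existence check in the second function: the request fails when any resource already exists at that URL, which prevents an accidental overwrite.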

3.7.7 Connection Header

The client uses the Connection header to indicate whether it would like to maintain the TCP connection after the server's response. The reasons to do this are discussed in Section 3.5. The only valid values for the Connection header are:

 
 Connection: Close
 Connection: Keep-Alive

When the server sees Close in the Connection request header, it must close the connection after sending the response. The Keep-Alive value leaves the decision up to the server: it may keep the connection open as the client requests, or it may close it. If the server keeps the connection open after the first response, it can still decide to close the connection later, after any subsequent response.

3.7.8 User-Agent Header

The User-Agent header identifies the HTTP client software acting on behalf of the user. Web browsers usually show the version and platform of the browser software.

MSDAIPP User Agent

Some applications use a lower layer that translates application requests from some API into WebDAV requests. For example, Word and Excel send PUT requests with the same User-Agent value. The value identifies a software layer that makes HTTP requests based on ODBC calls.

 

 User-Agent: Microsoft Data Access Internet Publishing Provider DAV 1.1


Sometimes, HTTP servers use the value of the User-Agent header to decide what features the client can support and to customize the format or contents of a response. However, this practice is problematic for a couple of reasons:

  • Client software can change behavior from one release to another, or even with security patches, but may not necessarily update the User-Agent string.

  • When a server limits its features based on which User-Agent strings it receives, this encourages browsers to mimic other browsers in order to try to get a certain behavior out of the server. This can be seen in the many browsers that put MSIE and Mozilla in their User-Agent strings. What kind of browser advertises this User-Agent value[3]?

     Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)

    [3] Both MSIE 5.5 and some versions of Mozilla use this exact User-Agent string with only the operating system varying.

  • Many browsers allow users to change the value the User-Agent header returns [Mozdev03].

Because of these problems, whenever new client features must be identified, this should be done in other headers that can be standardized and used by other clients that support the same features.

3.7.9 Expect Header

The client uses the Expect header to check whether a request is likely to succeed before sending the whole body. See Section 3.5.3.

3.7.10 Cache Control Headers

Several headers exist in order to help keep caches up to date and performing well. These header values are used by caching software to decide whether to cache resources, for whom, and for how long. All of these headers are used in server responses (see Table 3-5).

Table 3-5. Cache Control Headers, Descriptions, and Examples

Cache-Control

This header can contain values like no-cache to help caches know whether to keep a cached response body. Other possible values are private to indicate that the resource isn't publicly readable and public to indicate the inverse. The full listing of possible values and what they mean can be found in the HTTP specification.

 Cache-Control: no-cache 

Expires

The Expires header provides a date after which the document is considered "stale" and should not be used from a cache. Instead, the cache should fail the request or get a more recent copy from the server.

 Expires: Thu, 01 Dec 1994 16:00:00 GMT 

Pragma

Although the Pragma header is obsolete (replaced by the more flexible Cache-Control header), it's still used on many servers to handle older clients and proxies.

 Pragma: no-cache 

WebDAV servers may not find the HTTP cache control mechanisms sufficient, because WebDAV servers are more likely to have files with restricted permissions. A file that is publicly readable may later have its permissions changed, so it would be nice if the file wasn't cached anywhere. A file that is not publicly readable shouldn't be cached at an intermediary, because that intermediary can't be trusted to restrict permissions the same way the WebDAV server does. Thus, many WebDAV servers make every attempt to disable caching. This is most effectively done with the addition of two headers to responses with bodies. The two headers are both needed in practice to handle HTTP/1.0 and HTTP/1.1 software.

 
 Cache-Control: no-cache
 Pragma: no-cache

Even when the resource is publicly readable, it's hard for a WebDAV server to know how long the client can safely cache the resource, because a WebDAV resource may be changed at any moment.
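A server that wants to disable caching as described above needs to add both headers to every response that has a body. A minimal sketch, assuming response headers are held in a plain dictionary and using a hypothetical function name:

```python
def disable_caching(headers):
    """Return a copy of the response headers with caching disabled."""
    updated = dict(headers)
    updated["Cache-Control"] = "no-cache"  # understood by HTTP/1.1 software
    updated["Pragma"] = "no-cache"         # fallback for HTTP/1.0 software
    return updated


if __name__ == "__main__":
    response_headers = disable_caching({"Content-Type": "text/html"})
    for name, value in response_headers.items():
        print(f"{name}: {value}")
```

Returning a copy rather than modifying the input is just a design choice for this example; it keeps the original header set available for responses that may be cached.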

3.7.11 Date

The Date header is required on most server responses and should contain the date the response was generated. The server may choose not to send the Date header with certain status codes (100, 101, 500, 503) or if it is unable to provide a reasonable approximation of the current time.

 
 Date: Tue, 15 Nov 1994 08:12:31 GMT 
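Producing and reading a value in this date format can be sketched with Python's standard email.utils module, which handles the same date syntax. The function name http_date is hypothetical. Naive datetime values passed to it are interpreted as local time by astimezone, so passing timezone-aware values is the safer choice.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def http_date(moment=None):
    """Format a datetime (default: now) as an HTTP date in GMT."""
    if moment is None:
        moment = datetime.now(timezone.utc)
    # usegmt=True requires a UTC datetime and emits "GMT" instead of "+0000".
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


if __name__ == "__main__":
    generated = datetime(1994, 11, 15, 8, 12, 31, tzinfo=timezone.utc)
    print("Date:", http_date(generated))
    # Parsing the header value gives back a timezone-aware datetime.
    print(parsedate_to_datetime("Tue, 15 Nov 1994 08:12:31 GMT"))
```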

3.7.12 Range

Clients can request only part of a resource body by specifying the range in bytes, provided the server supports this feature (as shown in the OPTIONS response, discussed in Section 3.3.7). This can be useful to resume an interrupted download, as long as the client makes sure that the file being downloaded has not changed since the first byte range was downloaded.

The following example shows the client request for the tail end of a resource (bytes 93300 to the end of the document). The syntax also supports requests for a range of bytes at the beginning or middle of the resource contents.

 
 GET /users/alice/report.doc HTTP/1.1
 Range: bytes=93300-
 If-Match: "etag11083"
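HTTP/1.1 writes a byte range as bytes=first-last, where either end may be omitted: bytes=93300- means "from byte 93300 to the end," and bytes=-500 means "the final 500 bytes." The sketch below shows how a server might resolve such a value against the length of a resource. The function name is hypothetical, and the sketch handles only a single range with well-formed numbers.

```python
def resolve_byte_range(range_value, resource_length):
    """Return inclusive (start, end) byte positions for a single byte range."""
    unit, _, spec = range_value.partition("=")
    if unit.strip().lower() != "bytes" or "," in spec:
        raise ValueError("this sketch handles a single bytes range only")
    first, _, last = spec.strip().partition("-")
    if first == "":
        # Suffix form: the final N bytes of the resource.
        start = max(resource_length - int(last), 0)
        end = resource_length - 1
    else:
        start = int(first)
        # An omitted or oversized last position means "to the end".
        end = int(last) if last else resource_length - 1
        end = min(end, resource_length - 1)
    if start > end:
        raise ValueError("range cannot be satisfied for this resource")
    return start, end


if __name__ == "__main__":
    print(resolve_byte_range("bytes=93300-", 100000))
    print(resolve_byte_range("bytes=0-511", 1882))
    print(resolve_byte_range("bytes=-500", 1882))
```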

3.7.13 Location

The server uses the Location header to tell the client the real address of the resource. The most common use is when the client requests a URL that the server has mapped to another URL, for example, if the collection

http://www.example.com/hr/

maps to the default page

http://www.example.com/hr/index.html

When the client sends a GET request for the first URL, the server returns the body of the resource addressed by the second URL. The server also includes the Location header with the correct URL for the client to use in the future, particularly for PUT and DELETE requests:

 
 Location: http://www.example.com/hr/index.html 

WebDAV servers can also use this header in response to any method.

3.7.14 Server

The HTTP server can identify the software it is running by using a string in the Server header.

 
 Server: Tomcat Web Server/3.3 Final ( JSP 1.1; Servlet 2.2 ) 

3.7.15 Upgrade

The Upgrade header is used to upgrade from HTTP/1.1 to another version of HTTP or even a completely unrelated protocol. It's not commonly used, partly because it was added with future versions of HTTP in mind, and we're not there yet. A server should ignore this header if it does not offer upgrade opportunities. The following example is completely fictitious:

 
 Upgrade: HTTP/2.0, IRC/6.9, RTA/x11 

3.7.16 WWW-Authenticate

The WWW-Authenticate header appears frequently on 401 Unauthorized responses. When the method fails because the user isn't authenticated, the server needs to challenge the client to authenticate. The server must provide some information in this challenge: at a minimum, the server needs to specify the authentication mechanisms it supports and the server the user should try to log in to. A sample Basic authentication challenge is shown here, and an example of a Digest authentication challenge appears in Section 3.6.2.

 
 WWW-Authenticate: Basic realm="example.com" 
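A client has to split a challenge like this one into the mechanism name and its parameters before it can respond. The following is a rough sketch with a hypothetical function name. It assumes the header value holds a single challenge and does not handle escaped quotes inside quoted values.

```python
import re

# A parameter is name=value, where the value is quoted or a bare token.
_PARAM = re.compile(r'(\w+)\s*=\s*(?:"([^"]*)"|([^\s,]+))')


def parse_challenge(value):
    """Split a WWW-Authenticate value into (scheme, parameters)."""
    scheme, _, rest = value.strip().partition(" ")
    params = {}
    for name, quoted, bare in _PARAM.findall(rest):
        params[name.lower()] = quoted if quoted else bare
    return scheme, params


if __name__ == "__main__":
    scheme, params = parse_challenge('Basic realm="example.com"')
    print(scheme, params)
```

For the sample challenge, the scheme is Basic and the only parameter is realm with the value example.com.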


WebDAV. Next Generation Collaborative Web Authoring
ISBN: 130652083
EAN: N/A
Year: 2003
Pages: 146
