5.5 Step 2: Receiving Request Messages

As the data arrives on connections, the web server reads out the data from the network connection and parses out the pieces of the request message (Figure 5-5).

Figure 5-5. Reading a request message from a connection


When parsing the request message, the web server does the following (a short parsing sketch appears after the list):

                Parses the request line, looking for the request method, the specified resource identifier (URI), and the version number,[3] each separated by a single space, and ending with a carriage-return line-feed (CRLF) sequence[4]

[3] The initial version of HTTP, called HTTP/0.9, does not support version numbers. Some web servers support missing version numbers, interpreting the message as an HTTP/0.9 request.

[4] Many web servers support LF or CRLF as end-of-line sequences, because some clients mistakenly send LF as the end-of-line terminator.

                Reads the message headers, each ending in CRLF

                Detects the end-of-headers blank line, ending in CRLF (if present)

                Reads the request body, if any (length specified by the Content-Length header)
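
The exact code differs from server to server, but the steps above can be illustrated with a minimal Python sketch. It assumes the complete request message is already in a byte buffer, and the function name parse_request is invented for illustration rather than taken from any particular server:

def parse_request(raw: bytes):
    """Parse a buffered HTTP request: request line, headers, optional body."""
    head, _, rest = raw.partition(b"\r\n\r\n")       # blank line marks end of headers
    lines = head.split(b"\r\n")

    # Request line: method, URI, and version, separated by single spaces.
    method, uri, version = lines[0].decode("ascii").split(" ")

    # Message headers, each of which ended in CRLF before the split above.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.decode("ascii").partition(":")
        headers[name.strip().lower()] = value.strip()

    # Request body, if any, with its length given by Content-Length.
    length = int(headers.get("content-length", 0))
    return method, uri, version, headers, rest[:length]

An HTTP/0.9 request would carry no version token, and a lenient parser would also accept a bare LF terminator (see footnotes [3] and [4]); this sketch handles only the well-formed CRLF case.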

When parsing request messages, web servers receive input data erratically from the network. The network connection can stall at any point. The web server needs to read data from the network and temporarily store the partial message data in memory until it receives enough data to parse it and make sense of it.
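
One way to handle this erratic arrival, sketched here for a blocking socket and only the header portion of the message, is to append each received chunk to a buffer until the end-of-headers sequence shows up:

import socket

def read_request_head(conn: socket.socket) -> bytes:
    """Accumulate partial data in memory until the end-of-headers blank line arrives."""
    buffer = b""
    while b"\r\n\r\n" not in buffer:
        chunk = conn.recv(4096)          # the network may deliver only part of the message
        if not chunk:                    # the client closed the connection mid-request
            raise ConnectionError("connection closed before headers were complete")
        buffer += chunk
    return buffer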

5.5.1 Internal Representations of Messages

Some web servers also store the request messages in internal data structures that make the message easy to manipulate. For example, the data structure might contain pointers to and lengths of each piece of the request message, and the headers might be stored in a fast lookup table so the specific values of particular headers can be accessed quickly (Figure 5-6).
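
One possible internal representation is sketched below as a Python dataclass; the field and method names are invented for illustration, and the "fast lookup table" is simply a dictionary keyed by lowercased header name:

from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class ParsedRequest:
    # Pieces of the request line
    method: str
    uri: str
    version: str
    # Fast lookup table: lowercased header name -> header value
    headers: Dict[str, str] = field(default_factory=dict)
    body: bytes = b""

    def header(self, name: str, default: Optional[str] = None) -> Optional[str]:
        """Return the value of a particular header quickly, if present."""
        return self.headers.get(name.lower(), default)

With such a structure in hand, later processing steps can ask for a particular header's value without rescanning the raw message text.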

Figure 5-6. Parsing a request message into a convenient internal representation


5.5.2 Connection Input/Output Processing Architectures

High-performance web servers support thousands of simultaneous connections. These connections let the web server communicate with clients around the world, each with one or more connections open to the server. Some of these connections may be sending requests rapidly to the web server, while other connections trickle requests slowly or infrequently, and still others are idle, waiting quietly for some future activity.

Web servers constantly watch for new web requests, because requests can arrive at any time. Different web server architectures service requests in different ways, as Figure 5-7 illustrates:

Single-threaded web servers (Figure 5-7a)

Single-threaded web servers process one request at a time until completion. When the transaction is complete, the next connection is processed. This architecture is simple to implement, but during processing, all the other connections are ignored. This creates serious performance problems and is appropriate only for low-load servers and diagnostic tools like type-o-serve.
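
A single-threaded accept loop might look like the following sketch, where handle_request is a trivial stand-in for whatever processing a real server performs:

import socket

def handle_request(conn: socket.socket) -> None:
    # Stand-in for real request processing: read the request head, send a fixed reply.
    conn.recv(4096)
    conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n")

def serve_single_threaded(port: int = 8080) -> None:
    listener = socket.create_server(("", port))
    while True:
        conn, _addr = listener.accept()     # every other connection waits here
        with conn:
            handle_request(conn)            # one request processed to completion

While handle_request runs, no other connection is accepted or read, which is exactly the limitation described above.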

Multiprocess and multithreaded web servers (Figure 5-7b)

Multiprocess and multithreaded web servers dedicate multiple processes or higher-efficiency threads to process requests simultaneously.[5] The threads/processes may be created on demand or in advance.[6] Some servers dedicate a thread/process for every connection, but when a server processes hundreds, thousands, or even tens of thousands of simultaneous connections, the resulting number of processes or threads may consume too much memory or system resources. Thus, many multithreaded web servers put a limit on the maximum number of threads/processes (a worker-pool sketch follows the footnotes).

[5] A process is an individual program flow of control, with its own set of variables. A thread is a faster, more efficient version of a process. Both threads and processes let a single program do multiple things at the same time. For simplicity of explanation, we treat processes and threads interchangeably. But, because of the performance differences, many high-performance servers are both multiprocess and multithreaded.

[6] Systems where threads are created in advance are called "worker pool" systems, because a set of threads waits in a pool for work to do.
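
A worker-pool variant can be sketched with Python's standard thread pool; the pool size is arbitrary, and handle_request is the same stand-in used in the single-threaded sketch:

import socket
from concurrent.futures import ThreadPoolExecutor

def serve_with_thread_pool(port: int = 8080, max_workers: int = 32) -> None:
    listener = socket.create_server(("", port))
    # A bounded pool of worker threads waits for connections to process,
    # capping the number of simultaneous threads.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while True:
            conn, _addr = listener.accept()
            pool.submit(handle_request, conn)   # handle_request: stand-in from the sketch above

Capping max_workers bounds the number of simultaneous threads, which is the resource limit mentioned above.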

Multiplexed I/O servers (Figure 5-7c)

To support large numbers of connections, many web servers adopt multiplexed architectures. In a multiplexed architecture, all the connections are simultaneously watched for activity. When a connection changes state (e.g., when data becomes available or an error condition occurs), a small amount of processing is performed on the connection; when that processing is complete, the connection is returned to the open connection list for the next change in state. Work is done on a connection only when there is something to be done; threads and processes are not tied up waiting on idle connections.
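
Python's selectors module gives one way to sketch such a loop; the per-connection work is reduced to a fixed reply so the example stays short, and nothing here reflects any specific server's implementation:

import selectors
import socket

def serve_multiplexed(port: int = 8080) -> None:
    sel = selectors.DefaultSelector()
    listener = socket.create_server(("", port))
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)

    while True:
        # Block until at least one registered connection changes state.
        for key, _events in sel.select():
            sock = key.fileobj
            if sock is listener:                      # a new connection has arrived
                conn, _addr = listener.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
            else:                                     # data (or EOF) on an existing connection
                data = sock.recv(4096)
                if data:
                    # A small amount of work, then back to watching all connections;
                    # a real server would buffer and parse rather than reply blindly.
                    sock.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n")
                else:
                    sel.unregister(sock)
                    sock.close()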

Multiplexed multithreaded web servers (Figure 5-7d)

Some systems combine multithreading and multiplexing to take advantage of multiple CPUs in the computer platform. Multiple threads (often one per physical processor) each watch the open connections (or a subset of the open connections) and perform a small amount of work on each connection.
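
A rough sketch of the combination, assuming a platform that supports SO_REUSEPORT (Linux and the BSDs) so that each thread can own its own listening socket and selector for the same port:

import selectors
import socket
import threading

def multiplexed_loop(port: int) -> None:
    # One selector per thread; SO_REUSEPORT (platform-specific) lets every thread
    # own its own listening socket for the same port and accept independently.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    listener.bind(("", port))
    listener.listen()
    listener.setblocking(False)

    sel = selectors.DefaultSelector()
    sel.register(listener, selectors.EVENT_READ)
    while True:
        for key, _events in sel.select():
            sock = key.fileobj
            if sock is listener:                      # new connection handed to this thread
                conn, _addr = listener.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
            else:                                     # a little work, then back to the loop
                data = sock.recv(4096)
                if data:
                    sock.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n")
                else:
                    sel.unregister(sock)
                    sock.close()

def serve_multiplexed_multithreaded(port: int = 8080, num_threads: int = 4) -> None:
    threads = [threading.Thread(target=multiplexed_loop, args=(port,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

Each thread does a small amount of work per ready connection and returns to its select loop, just as in the single-loop case.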

Figure 5-7. Web server input/output architectures


 


