Tuning Web Servers

Many of the approaches to tuning other servers given in other chapters are appropriate for web servers as well. For example, increasing the size of the send and receive socket buffers via sysctl or setsockopt, as described in Chapter 12, "Network Tuning," is useful. Similarly, increasing the size of the accept queue helps prevent requests from being dropped by the operating system before the web server even sees them. In this section, we focus on tuning that is done for web servers in particular. This is in addition to tuning that is done on the operating system.

Tuning for All Web Servers

These changes are useful for all web servers, regardless of the architectural model.

The following parameter increases the number of TCP SYN packets that the server can queue before SYNs are dropped:

 echo 30000 > /proc/sys/net/ipv4/tcp_max_syn_backlog 

Web servers typically have a large number of TCP connections in the TIME-WAIT state. The following parameter increases the number of connections that are allowed in that state:

 echo 2000000 > /proc/sys/net/ipv4/tcp_max_tw_buckets 

The following parameter sets the maximum number of packets that can be queued in the network core (below the IP layer). A larger queue allows more memory to be used for incoming packets that would otherwise be dropped.

 echo 50000 > /proc/sys/net/core/netdev_max_backlog 
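
The echo commands above take effect immediately but are lost at the next reboot. As a minimal sketch, assuming the system applies /etc/sysctl.conf at boot (typical, but distribution dependent), the following Python helper converts the /proc/sys paths used above into sysctl.conf lines. The function names are hypothetical and exist only for this illustration.

```python
# Sketch: turn the /proc/sys settings shown above into sysctl.conf lines.
# proc_path_to_key and sysctl_conf_lines are illustrative names, not part
# of any standard tool.

PROC_SYS_PREFIX = "/proc/sys/"

# Values copied from the echo commands in this section.
SETTINGS = {
    "/proc/sys/net/ipv4/tcp_max_syn_backlog": 30000,
    "/proc/sys/net/ipv4/tcp_max_tw_buckets": 2000000,
    "/proc/sys/net/core/netdev_max_backlog": 50000,
}


def proc_path_to_key(path):
    """Map a /proc/sys path to the dotted name that sysctl uses."""
    if not path.startswith(PROC_SYS_PREFIX):
        raise ValueError("not a /proc/sys path: " + path)
    return path[len(PROC_SYS_PREFIX):].replace("/", ".")


def sysctl_conf_lines(settings):
    """Return 'key = value' lines suitable for appending to sysctl.conf."""
    return [f"{proc_path_to_key(p)} = {v}" for p, v in settings.items()]


if __name__ == "__main__":
    for line in sysctl_conf_lines(SETTINGS):
        print(line)
```

The values themselves should still be checked against the memory available on the server; the sketch only changes where the settings are recorded.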

Apache

Apache's main configuration file is httpd.conf. Several parameters in that file can be modified to improve performance.

The following parameter sets the upper bound on the number of processes that Apache can have running concurrently:

 MaxClients 150 

Larger values allow larger numbers of clients to be served simultaneously. Very large values may require Apache to be recompiled. This change should be used with care, because large numbers of processes can cause excessive overhead.
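
One way to pick a value is to cap MaxClients at the number of Apache processes that fit in the memory left over after the operating system and other services take their share. This is a rule of thumb assumed for this example, not something Apache enforces. A rough sketch with hypothetical figures:

```python
# Sketch: estimate an upper bound for MaxClients from memory figures.
# The rule of thumb and all numbers below are illustrative assumptions;
# measure the per-process size on the real system before relying on it.


def estimate_max_clients(total_mb, reserved_mb, per_process_mb):
    """Return how many Apache processes fit in memory without swapping."""
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    available_mb = total_mb - reserved_mb
    if available_mb <= 0:
        return 0
    return int(available_mb // per_process_mb)


if __name__ == "__main__":
    # Hypothetical server: 2048 MB RAM, 512 MB kept for the OS and other
    # services, about 8 MB resident per Apache process.
    print("MaxClients", estimate_max_clients(2048, 512, 8))
```

With these hypothetical figures, 1536 MB is available and the estimate is 192 processes. The estimate only bounds memory use; it says nothing about the scheduling overhead that large numbers of processes can cause.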

The following parameter sets the number of requests that a single Apache process will serve on a client's connection before it closes that connection:

 MaxKeepAliveRequests 100 

Opening and closing connections consumes CPU cycles, so it is better for the server to provide as many responses on a single connection as possible. Therefore, larger numbers are better. Setting this value to 0 removes the limit, so the server never closes a connection because of the number of requests served on it.
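
The effect of this limit can be illustrated with simple arithmetic. The sketch below assumes a single client that keeps its connection open and sends requests back to back, with no timeouts; the function name is hypothetical.

```python
# Sketch: connections needed to carry a number of requests from one client
# under a given MaxKeepAliveRequests value. Purely illustrative arithmetic
# that ignores timeouts and client-side connection limits.
import math


def connections_needed(requests, max_keepalive_requests):
    """Return the connection count; 0 means no per-connection limit."""
    if requests <= 0:
        return 0
    if max_keepalive_requests == 0:
        return 1
    return math.ceil(requests / max_keepalive_requests)


if __name__ == "__main__":
    print(connections_needed(250, 100))  # limit of 100 requests
    print(connections_needed(250, 0))    # no limit
```

Under these assumptions, 250 requests need three connections with a limit of 100 and only one connection with a limit of 0.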

The following parameter determines the number of requests an individual Apache process will serve before it dies (and is reforked). 0 implies that the number is unlimited, but on some systems with memory leaks in the operating system, setting this to a nonzero value keeps the memory leak under control.

 MaxRequestsPerChild 0 

The following parameters determine the minimum and maximum number of idle processes that Apache keeps in anticipation of new requests coming in:

 MinSpareServers 5
 MaxSpareServers 10

The idea is that it is cheaper to keep a live process idle than to fork a new process in response to a new request arrival. For highly loaded sites, you might want to increase these values.

Flash and Other Event-Driven Servers

A common problem event-driven servers have is that they use a large number of file descriptors and thus can run out of these descriptors if the maximum is not increased.

The following is an sh (shell) command that increases the number of open files a process may have:

 ulimit -n 16384 

The process (in this case, the web server) must also be modified to take advantage of large numbers of descriptors:

 #include <bits/types.h>
 #undef  __FD_SETSIZE
 #define __FD_SETSIZE 16384

The default FD_SETSIZE on Linux 2.4 is only 1024.
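
A process can also inspect and raise its own descriptor limit at startup rather than relying on the shell. The following is a minimal sketch using Python's standard resource module, assuming a Unix-like system; the helper names are hypothetical.

```python
# Sketch: raise this process's soft open-file limit, the in-process
# counterpart of "ulimit -n". Assumes a Unix-like system where Python's
# standard resource module is available.
import resource


def clamp_target(target, soft, hard):
    """Pick a new soft limit: never lower it, never exceed the hard limit."""
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited; nothing to raise
    new_soft = max(soft, target)
    if hard != resource.RLIM_INFINITY:
        new_soft = min(new_soft, hard)
    return new_soft


def raise_nofile_limit(target):
    """Raise RLIMIT_NOFILE toward target and return the resulting soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    new_soft = clamp_target(target, soft, hard)
    if new_soft != soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("soft limit is now", raise_nofile_limit(16384))
```

An unprivileged process can raise its soft limit only up to the hard limit, which is why the sketch clamps the target. Raising the limit does not change FD_SETSIZE, so a server built around select() still needs the recompilation described above.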

Tux

The following parameter determines the number of active simultaneous connections:

 echo 20000 > /proc/sys/net/tux/max_connect 

The following parameter sets the maximum number of connections waiting in Tux's accept queue:

 echo 8192 > /proc/sys/net/tux/max_backlog 

The following parameter disables logging of requests to disk:

 echo 0 > /proc/sys/net/tux/logging 

Performance Tools for Evaluating Web Servers

Many tools, known as workload generators, are available for evaluating the performance of web servers. These programs run on client machines, emulate client behavior, construct HTTP requests, and send them to the server. A workload generator can typically vary the volume of requests it generates, called the load, and measure how the server behaves in response to that load. Performance metrics include request latency (how long an individual response takes to come back from the server) and throughput (how many responses a server can generate per second).
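
The two metrics can be computed from per-request timings as follows. This is a sketch only: the (start, end) sample format, in seconds, is an assumption made for the example and is not the output format of any particular tool.

```python
# Sketch: compute request latency and throughput from request timings.
# Each sample is a (start, end) pair in seconds; this format is an
# assumption for the example, not any specific tool's output.


def mean_latency(samples):
    """Average time between sending a request and receiving its response."""
    if not samples:
        raise ValueError("no samples")
    return sum(end - start for start, end in samples) / len(samples)


def throughput(samples):
    """Responses completed per second over the whole measurement window."""
    if not samples:
        raise ValueError("no samples")
    window = max(end for _, end in samples) - min(start for start, _ in samples)
    if window <= 0:
        raise ValueError("measurement window must be positive")
    return len(samples) / window


if __name__ == "__main__":
    # Four hypothetical requests, some overlapping in time.
    samples = [(0.0, 0.5), (0.25, 0.75), (1.0, 1.5), (1.5, 2.0)]
    print("mean latency (s):", mean_latency(samples))
    print("throughput (responses/s):", throughput(samples))
```

Note that the two metrics are measured differently: latency is averaged per request, whereas throughput is taken over the full measurement window, so overlapping requests raise throughput without lowering latency.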

Perhaps the most commonly used tool is SPECWeb99. This tool is distributed by the Standard Performance Evaluation Corporation (SPEC) nonprofit organization, whose web site is www.spec.org. This tool is probably the most-cited benchmark, and it is used for marketing purposes by server vendors such as IBM, Sun, and Microsoft. Unfortunately, the tool costs money, although it is available freely to member institutions such as IBM. The benchmark is intended to capture the main performance characteristics that have been observed in web servers, such as the size distribution and popularity of files requested. The tool is considered a macro-benchmark in that it is meant to measure whole system performance.

Another frequently used tool is httperf from HP Labs, which is freely available under an open-source license. This tool is highly configurable, allowing you to stress isolated components of a web server, for example, how well a server handles many idle connections. Thus, it is used more as a microbenchmark.

Many other tools exist for evaluating web server performance, including SURGE, WebBench, and WaspClient, but describing them all is outside the scope of this chapter. Many of these options for stressing, testing, and measuring servers are freely available.

Also useful to web site operators are log analysis tools. These tools look through the logs generated by the server and report information such as how many visits a site received over a period of time and where the visitors came from. Performance can be optimized when the operator understands how visitors are using a site. Logs are typically kept in a standard format called the Apache Common Log format. Many commercial tools are available; however, two freely available open-source tools are analog and webalizer.
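
To give a sense of what such tools do internally, here is a minimal sketch that parses lines in the Common Log Format and counts requests per client host. It is not how analog or webalizer are implemented; the function names and the sample line are illustrative.

```python
# Sketch: parse one line in Common Log Format and count requests per host.
# The regular expression covers the usual seven fields; real logs may need
# a more tolerant parser.
import re
from collections import Counter

CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = CLF_PATTERN.match(line)
    return match.groupdict() if match else None


def requests_per_host(lines):
    """Count successfully parsed requests for each client host."""
    counts = Counter()
    for line in lines:
        fields = parse_clf_line(line)
        if fields is not None:
            counts[fields["host"]] += 1
    return counts


if __name__ == "__main__":
    sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
              '"GET /apache_pb.gif HTTP/1.0" 200 2326')
    print(parse_clf_line(sample))
    print(requests_per_host([sample, "not a log line"]))
```

Lines that do not match the pattern are skipped rather than treated as errors, which keeps a single malformed entry from stopping the analysis.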

    Performance Tuning for Linux Servers
    ISBN: 0137136285
    EAN: 2147483647
    Year: 2006
    Pages: 254