Basic Apache Configuration


Even if you need to use advanced features on your Web site, you should begin by getting Apache operating on a basic level. Once Apache can serve static Web pages (that is, those that don't use advanced features like scripting), you can begin to tweak the configuration to do the more advanced things you need it to do. Basic Apache configuration involves running the server and setting fundamental options in the server's configuration files. You should also understand something of Apache modules, which are extensions that handle specific types of tasks. Fortunately, most distributions ship with an Apache configuration that works with few or no changes, but you may need to tweak some of these features to customize your server for your particular needs.

Understanding Apache Configuration Files

The Apache configuration file is usually called httpd.conf. Different distributions use different locations for the file, but the format is the same. Caldera and SuSE store the file in /etc/httpd; Debian and Slackware use /etc/apache (Slackware provides a sample file called /etc/apache/httpd.conf.default that you must rename and modify); and Mandrake, Red Hat, and TurboLinux use /etc/httpd/conf/.

Whatever the location, httpd.conf consists of comments, which begin with pound signs (#), and configuration option lines, which take the following form:

  Directive Value  

The Directive is the name of the configuration option you want to adjust, such as Timeout or StartServers. The Value may be a number, a filename, or some other arbitrary string. Some directives allow you to set several suboptions. These are indicated by directive names enclosed in angle brackets (<>), as follows:

 <Directory /home/httpd/html>
     Options FollowSymLinks
     AllowOverride None
 </Directory>

The final line uses the same directive name as the first, but without any options, and preceded by a slash (/) to indicate that this is the end of the directive block.
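As a rough illustration of how these pieces fit together, the following sketch combines a comment, two simple directives, and one block directive. The numeric values and the private subdirectory are illustrative assumptions, not recommendations from this chapter.

```apache
# Lines that begin with a pound sign are comments; Apache ignores them.

# Simple directives: a directive name followed by its value.
# (The values shown are illustrative only.)
Timeout 300
StartServers 5

# A block directive: the suboptions apply only to the named directory.
# The "private" subdirectory is a hypothetical example.
<Directory /home/httpd/html/private>
    Options None
    AllowOverride None
</Directory>
```

Note that the closing line repeats the directive name with a leading slash and carries no value of its own.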

Some additional Apache configuration files may be important in some situations. These are normally stored in the same directory as httpd.conf, and they include the following:

  • access.conf: This is essentially a supplemental configuration file. It's set in httpd.conf with the AccessConfig directive. The access.conf file has traditionally been used for <Directory> directives, which determine how Apache treats access to the specified directory. Many configurations today leave this file empty, or use AccessConfig to point Apache to /dev/null for this file, effectively disabling it.

  • mime.types: HTTP relies on a file type identification system known as the Multipurpose Internet Mail Extensions (MIME) to allow a Web server to inform a Web browser how to treat a file. For instance, text/plain identifies a file as containing plain text, and image/jpeg identifies a Joint Photographic Experts Group (JPEG) graphics file. The mime.types file contains a mapping of MIME types to filename extensions. For instance, the .txt and .asc filename extensions are associated with the text/plain MIME type. If these mappings aren't set appropriately, Web browsers may become confused when confronted with certain file types. The default file works well for most materials you're likely to place on a Web page, but you may need to edit or add mappings if you want to serve unusual file types.

  • magic: This file provides another way for Apache to determine a file's MIME type. Apache can examine the file's contents to look for telltale signs of the file's type. Many file types have certain key, or magic, byte sequences, and the magic file lists these, converted to a plain-text format so that the file can be edited with a text editor. It's best to leave this file alone unless you understand its format, though, and that format is beyond the scope of this chapter.
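The httpd.conf lines that tie these supplemental files to the main configuration might look something like the sketch below. The file paths are assumptions that vary by distribution, and the AddType line uses a made-up MIME type and extension purely to show the syntax; AddType is a directive not covered in this section, offered here as one way to add a mapping without editing mime.types itself.

```apache
# Disable the legacy supplemental file by pointing it at /dev/null.
AccessConfig /dev/null

# Where Apache finds the MIME type-to-extension map.
# (Path is an assumption; check your distribution's layout.)
TypesConfig /etc/mime.types

# Add one extra mapping from within httpd.conf.
# (The type and extension are hypothetical examples.)
AddType application/x-example .xmp

# Content-based type detection, used only if the module is present.
# (The relative path is an assumption.)
<IfModule mod_mime_magic.c>
    MIMEMagicFile conf/magic
</IfModule>
```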

Standalone versus Super Server Configuration

Chapter 4, "Starting Servers," describes different methods of running servers. Apache can be run in any of the ways discussed in that chapter: through a super server, a SysV initialization script, or a custom startup script. Most distributions use a SysV startup script or a custom startup script, because these methods of running Apache cause the server to run continuously, and therefore to respond quickly to incoming requests. You may elect to run Apache from a super server if you like, though, and in fact Debian gives you the option of running Apache either way when you install the package. Running Apache from a super server results in slower responses to incoming Web page requests, because the super server must launch Apache for each request. The Apache developers also recommend against this configuration.

TIP

The delay caused by running a Web server from a super server can be reduced or eliminated by using a slimmer Web server, such as thttpd, or a kernel-based Web server. Therefore, if you want to use a super server for security reasons, you might want to seriously consider a slimmer Web server than Apache.


Although Chapter 4 covers running servers from a super server or standalone, there is one Apache-specific option you must set: ServerType. This Apache configuration file option can be set to standalone or inetd. If you don't set this option correctly, Apache may behave erratically or fail to respond to requests. If you want to change your configuration, be sure to adjust the configuration file, disable the former startup method, and enable the new startup method. For instance, to convert from a SysV startup to running Apache from inetd, you should change the configuration file, use the SysV startup script to shut down Apache, disable the SysV startup script, edit /etc/inetd.conf to enable Apache, and restart inetd. If you forget one of these steps, you may find that Apache doesn't work correctly, or continues to work with the old configuration.
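In httpd.conf, the choice comes down to a single line. The sketch below shows the standalone setting active and the inetd alternative commented out; only one of the two should be in effect at a time, and the matching startup method must be changed separately, as described above.

```apache
# Run continuously from a SysV or custom startup script (the usual case).
ServerType standalone

# Alternative: let a super server launch Apache for each request.
# If you switch to this line, also disable the startup script and
# enable Apache in /etc/inetd.conf.
#ServerType inetd
```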

NOTE

Some distributions call the Apache executable apache, and others call it httpd. If you change your startup script or want to shut down Apache directly, you may need to check both names.


Setting Common Configuration Options

The default Apache configuration works on most systems. After installing the server and starting it, Apache will serve files from its default directory (usually /home/httpd/html; consult the upcoming section, "Setting Server Directory Options," for more details). This directory normally contains a default set of files that announce that an Apache server is present but unconfigured. You'll almost certainly want to replace these files with the files that make up your own Web site, as described in the upcoming section, "Producing Something Worth Serving."

There are a few general-purpose Apache options you might want to adjust to affect its overall behavior. These include the following:

  • ServerType: This directive has already been mentioned, but it deserves reiteration. If you change how you run Apache, you must adjust this option to fit: either standalone or inetd.

  • User and Group: Every Linux server runs as a particular user and group. You can tell Apache to run as a particular user and group with these directives. Most distributions set Apache to run as the user nobody or as a custom user with few privileges, to reduce the potential for damage should a cracker find a way to get Apache to do things you'd rather it not do. It's generally best to leave these options alone.

NOTE

As a security measure, most Apache binaries are compiled so that they can't be run as root.


  • ServerTokens: This directive controls how much information Apache gives callers about the platform on which it runs. Most distributions set it to ProductOnly, which provides no information about the OS on which Apache is running. You can set it to Min, OS, or Full to provide increasing levels of information, but this is usually best left at ProductOnly.

WARNING

Don't assume that setting ServerTokens to ProductOnly will keep your OS choice hidden. Crackers can use traffic analysis tools to infer information about your OS (mainly whether or not you're running Linux, and perhaps the kernel version number). Other servers may also provide clues about what OS or distribution you're running.


  • MinSpareServers and MaxSpareServers: When run in standalone mode, Apache starts up several instances of itself in order to provide quick responses to incoming HTTP requests. Each instance can handle a single request. These directives set the minimum and maximum number of these "spare" servers that run at any given time. If fewer than MinSpareServers are running and unused, the master Apache process starts another. If more than MaxSpareServers are running and unused, spares are killed to bring the number in line. Setting these numbers too low can result in slow responses when the load spikes on a heavily used server, while setting them too high can result in reduced performance if the server lacks sufficient memory to handle them all. Most distributions set defaults of about 5 and 10. You can experiment with lower values if your server is used very lightly, or higher values if your server is heavily used. Note that the total number of Apache processes that run at any given moment may be higher than MaxSpareServers, because some of these may be connected to clients, and so are not spares. A busy Web site, or one whose traffic spikes periodically, may need a lot of swap space to handle all the server instances. If the MaxSpareServers value is high, this may increase the need for memory, and hence swap space.

  • MaxClients: This directive sets the total number of clients who may connect to the system at any one time. The default is usually about 150, but you can adjust it up or down to suit your hardware and traffic. Setting this value too high can cause your system's performance to degrade if your site becomes very popular, but setting it too low can keep clients from connecting to your site. As with MaxSpareServers, a high MaxClients value may require you to have a lot of swap space or memory, should your traffic level rise.

NOTE

The number of connections set in MaxClients is not the same as the number of Web browsers Apache supports. Individual Web browsers can open multiple connections (up to 8), and each consumes one of the connections allocated via MaxClients.


  • Listen: By default, Apache binds to port 80 on all active network interfaces. You can bind it to additional ports or interfaces with this directive. For instance, Listen 192.168.34.98:8080 causes Apache to listen to port 8080 on the interface associated with the 192.168.34.98 address. Listen 8000 binds Apache to port 8000 on all interfaces.

  • BindAddress: If your system has multiple network interfaces, you can bind Apache to just one interface by using this directive. For instance, BindAddress 192.168.34.98 binds Apache to the interface associated with 192.168.34.98. BindAddress * is the default, which binds Apache to all interfaces.

TIP

If you need to run Apache on a workstation for local use only, you can use BindAddress 127.0.0.1 to keep it from being accessible to other computers. You'll have to use http://127.0.0.1 or http://localhost as your URL when accessing Apache locally, though.


  • Port: This directive tells Apache which port it should listen to. The default is 80.

  • ServerAdmin: You should specify the e-mail address at which you can be reached with this directive. The default is usually webmaster, which you can alias to your regular user account on the server using your mail server's alias feature, as described in Chapter 19, "Push Mail Protocol: SMTP." This e-mail address isn't normally apparent to users, but it's returned with some types of error messages.

  • ServerName: You can set this directive to your computer's true DNS hostname, if that differs from the hostname configured into the computer by default.

  • DefaultType: If Apache can't determine the MIME type of a file based on its extension or magic sequence, as described earlier in "Understanding Apache Configuration Files," it returns the MIME type specified by the DefaultType directive. This is normally text/plain, but you might want to change it if your Web site hosts many files of a particular type that might not always be properly identified.

  • HostnameLookups: This option can be set to On or Off, and it determines whether or not Apache looks up and logs the hostnames of the clients that connect to it. Having hostname information may be convenient when you're analyzing log files, as described in the upcoming section, "Analyzing Server Log Files," but performing the lookups takes some time and network resources, so you might prefer to forgo using this feature.

  • LogLevel: Apache logs information on its activities. You can set the amount of information it sends to its error log by setting this directive to debug, info, notice, warn, error, crit, alert, or emerg, in decreasing order of the amount of information logged. The default is usually warn. This setting does not affect the access logs.

  • CustomLog: This directive takes two options: the name of a log file and the format of information sent to that log file. The log file in question holds access logs, that is, information on what systems have requested Web pages. The format may be common, agent, referer, or combined. For still more flexibility, the LogFormat directive lets you create your own log file format. You can use multiple CustomLog directives to create multiple log files.
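As a minimal sketch of the process and network options in this list, the following fragment uses the typical defaults quoted above. The group name is an assumption (some distributions use a different low-privilege group), and the Listen lines are commented out because they are an alternative way to specify addresses and ports rather than something to combine blindly with BindAddress and Port.

```apache
# Run with few privileges. (The group name is an assumption.)
User nobody
Group nobody

# Reveal as little as possible about the platform.
ServerTokens ProductOnly

# Keep between 5 and 10 idle server processes ready for new requests,
# and accept at most 150 simultaneous connections.
MinSpareServers 5
MaxSpareServers 10
MaxClients 150

# Bind to a single interface on the standard port.
BindAddress 192.168.34.98
Port 80

# Alternative form: name address and port explicitly with Listen.
#Listen 192.168.34.98:8080
#Listen 8000
```

On a lightly used server you might experiment with lower spare-server values; on a busy one, watch memory and swap use before raising MaxClients.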

These are the major general-purpose configuration options in httpd.conf. Upcoming sections describe some additional options, and still more are esoteric or specialized options that are beyond the scope of this chapter. You should consult the Apache documentation or a book on Apache to learn more about such directives.
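The identification and logging options described above might be set as in the following sketch. The e-mail address is a hypothetical one built from this chapter's example domain, and the log file path is an assumption that differs between distributions. The LogFormat line shows the conventional definition of the common format, in case your configuration file doesn't already define it.

```apache
# Contact address returned with some error messages (hypothetical).
ServerAdmin webmaster@threeroomco.com

# The server's true DNS hostname.
ServerName www.threeroomco.com

# MIME type to report when nothing else identifies the file.
DefaultType text/plain

# Skip DNS lookups on clients to save time and network resources.
HostnameLookups Off

# Error log verbosity; does not affect access logs.
LogLevel warn

# Define a format nickname, then log accesses using it.
# (The log file path is an assumption.)
LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog /var/log/httpd/access_log common
```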

Setting Server Directory Options

URLs consist of two to four components:

  • The protocol: The http://, ftp://, or similar component of the URL specifies the protocol to be used. This chapter discusses Web servers, which deal primarily with http:// URLs. (Secure sites use https://.)

  • The hostname: The hostname component of the URL is the same as the hostname for the computer on which the Web server runs. For instance, if the URL is http://www.threeroomco.com/thepage/index.html, the Web server's hostname is www.threeroomco.com. (A single computer can have multiple hostnames by setting up multiple DNS A address records or CNAME aliases, as described in Chapter 18.)

  • The filename: An HTTP request is, at its core, a request for a file transfer. Following the hostname in the URL is a filename, often associated with a directory name. For instance, in http://www.threeroomco.com/thepage/index.html, the file, including its directory reference, is thepage/index.html. Note that, although there is a slash (/) separating the hostname from the filename, that slash doesn't indicate that the filename reference is relative to the root of the Linux filesystem; it's relative to the root of the Web site's files directory. If the filename is omitted, most Web servers return a default Web page, as specified by the DirectoryIndex directive, described shortly.

  • Additional information: Some URLs include additional information specific to a URL type. For instance, HTML Web pages can include position anchors, which are specified by a pound sign and anchor name, and FTP URLs can include a username and password.

There are several Apache configuration options that let you set the directories in which you can store files for the Web server. There are also variant forms of addressing you can use in URLs to indicate which of several alternate directories Apache is to use for retrieving files. If you don't set these options correctly, some or all of your Web pages won't appear in the way you expect. The relevant httpd.conf options include the following:

  • ServerRoot: This directive sets the root of the directory tree in which Apache's own binary files reside. On most Linux installations, this defaults to "/usr", and you shouldn't change this setting.

  • DocumentRoot: Apache looks in the directory specified by this directive for static Web page files. The default is usually "/home/httpd/html" or something similar. (The directory name is normally enclosed in quote marks in the httpd.conf file.)

WARNING

Do not include a trailing slash (/) in your DocumentRoot directive. Although this is a valid way to refer to directories, it can cause Apache to misbehave.


  • UserDir: If the filename specified by a Web browser begins with a tilde (~), Apache interprets the first component of the filename as a username and attempts to locate the file in a subdirectory of the user's home directory. The UserDir directive specifies the name of the subdirectory used for this access. For instance, if UserDir is set to public_html, and if a remote user types http://www.threeroomco.com/~abrown/photos.html into a Web browser, then Apache attempts to return the public_html/photos.html file in abrown's home directory. If this directive is set to disabled, user directories are disabled. You can disable only some user directories by following disabled with a list of usernames to be disabled. This directive is often enclosed in an <IfModule> directive, which checks to see that the appropriate Apache modules for handling user directories are loaded. (The next section, "Loading Apache Modules," describes modules.)

  • DirectoryIndex: Some URLs don't end in a filename; they end in a directory name (often followed by a single slash). When Apache receives such a URL, it first tries to locate a default index file, the name of which you specify with the DirectoryIndex directive. Most distributions set this to index.html by default, but you can change this if you like. For instance, with this setting, if a user enters a URL of http://www.threeroomco.com/public/, Apache returns the public/index.html file from the DocumentRoot directory. You can provide the names of several index files, and Apache will look for them in the order listed, returning the first one it finds. This is often done if Apache handles CGI forms or other non-HTML files.

Most distributions' Apache packages create reasonable defaults for directory and file handling. You may want to check your configuration files to learn where you should place your Web site's files. If you prefer to place the files elsewhere, you can of course change the default settings. You might also want to change the index filename, particularly if you're setting up an Apache server to replace another Web server that used a different index filename.
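The directory options described above might appear together as in the sketch below, with comments showing how the example URLs from this section would map to files. The index.cgi name and the choice to disable root's user directory are illustrative assumptions.

```apache
# Root of the tree holding Apache's own files; normally left alone.
ServerRoot "/usr"

# Static pages live here. Note: no trailing slash.
DocumentRoot "/home/httpd/html"

# http://www.threeroomco.com/~abrown/photos.html
#   maps to public_html/photos.html in abrown's home directory.
<IfModule mod_userdir.c>
    UserDir public_html
    # Disable user directories for selected accounts (illustrative).
    UserDir disabled root
</IfModule>

# http://www.threeroomco.com/public/
#   maps to /home/httpd/html/public/index.html
# A second name is tried only if the first isn't found.
# (index.cgi is a hypothetical second index file.)
DirectoryIndex index.html index.cgi
```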

Loading Apache Modules

One of Apache's strengths is that it's an extensible Web server. Programmers and administrators can write modules that extend its capabilities, without touching the Apache source code or recompiling Apache itself. These modules can add features such as access control mechanisms, parsing extended information provided by clients, and so on. In fact, a great deal of Apache's standard functionality comes in the form of modules that come with the server.

If you check your httpd.conf file, chances are you'll see references to modules. These use the LoadModule directive, and they look like this:

 LoadModule mime_module        lib/apache/mod_mime.so 

This directive gives the module's internal name (mime_module in this example) and the filename of the external module file itself (lib/apache/mod_mime.so). In this example, the module filename is referenced relative to the ServerRoot, although you can also provide an absolute path if you prefer.

It's possible to build modules directly into the main Apache binary. To find out what modules are permanently available in this way, type httpd -l or apache -l, as appropriate. In some cases, modules built into the Apache binary or loaded via LoadModule need to be activated in the Apache configuration file. This is done with the AddModule directive, thus:

 AddModule mod_mime.c 

You provide the module's source code filename as the value for this directive. Some distributions' Apache configuration files include both LoadModule and AddModule directives for important modules.
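A configuration that pairs the two directives might look roughly like this sketch. The module file paths follow the lib/apache layout used in the earlier example and are assumptions for your system.

```apache
# Load the shared-object files (paths relative to ServerRoot).
LoadModule mime_module    lib/apache/mod_mime.so
LoadModule userdir_module lib/apache/mod_userdir.so

# Activate each module, naming its source code file.
AddModule mod_mime.c
AddModule mod_userdir.c
```

Each LoadModule line names the internal module name and the file to load; the matching AddModule line names the corresponding source file.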

Frequently, you won't need to add to the standard Apache module configuration; the default configuration file loads the modules that are most commonly used. In fact, you might want to disable certain modules to eliminate features that might be abused, such as the ability to handle CGI. Unfortunately, it's not always easy to tell what modules can be safely removed from a configuration.
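One cautious way to disable a module is to comment out its lines rather than delete them, so the change is easy to reverse. The sketch below assumes CGI support is provided by a loadable mod_cgi at the path shown; if the module is compiled into your Apache binary, these lines may not exist in your file.

```apache
# CGI support disabled by commenting out both directives.
# (The module path is an assumption for your system.)
#LoadModule cgi_module lib/apache/mod_cgi.so
#AddModule mod_cgi.c
```

After a change like this, restart Apache and check that the pages you rely on still work, since other parts of the configuration may depend on the module.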

If Apache doesn't do something you require of it, you might want to investigate adding a module to do the trick. One Web site you might want to visit in this case is the Apache Module Register, http://modules.apache.org. You can search for modules others have written by typing in a keyword; the site returns a list of modules, including links to the module maintainers' Web sites.



Advanced Linux Networking
ISBN: 0201774232
EAN: 2147483647
Year: 2002
Pages: 203
