Once you've installed the desired Apache packages, your server should be ready to serve web pages to the local computer. All you need to do is start the httpd service and direct your web browser to the localhost address.
But a web server doesn't do you much good unless you can call its web pages from other computers. In this chapter, we'll analyze the main Apache configuration file, httpd.conf, in some detail.
These settings are based on the specifications of the Hypertext Transfer Protocol (HTTP) standards version 1.1. We provide only a brief overview of Apache 2.0; for more information, see Linux Apache Web Server Administration, Second Edition (Sybex, 2002).
Tip | If you install the httpd-manual-* RPM, you'll get a full Apache manual in HTML format in the /var/www/manual directory. |
Once you've installed the Apache packages that you need, starting Apache is easy. As with other services described throughout this book, all you need to do is start the applicable script from the /etc/rc.d/init.d directory. In this case, the following command should work nicely:
# service httpd start
If you still have the default Apache configuration file, you'll probably see the following message:
Starting httpd: httpd: Could not determine the server's fully qualified domain name, using 127.0.0.1 for ServerName
Now you can open the browser of your choice to the localhost address. This is also known as the loopback IP address, which as defined in Chapter 20 is 127.0.0.1. Figure 30.1 shows the result in the Mozilla web browser.
You'll also want to use a command such as chkconfig, as described in Chapter 13, to make sure Apache starts the next time you start Linux at an appropriate runlevel. For example, the following command starts the Apache daemon, httpd, whenever you start Linux in runlevel 2, 3, or 5:
# chkconfig --level 235 httpd on
Now you're ready to start customizing the Apache configuration.
The main Apache configuration file, httpd.conf, is located in the /etc/httpd/conf directory. It is split into three sections. In the global environment section, you can configure the basic settings for this web server. In the main server configuration section, you'll set up the basic defaults for any websites on your server. The Virtual Hosts section allows you to set up several different websites on your Apache server, even if you have only one IP address.
Note | There were originally three main configuration files for Apache: access.conf, srm.conf, and httpd.conf, all located in the same directory. While later versions of Apache 1.3.x incorporated the information from access.conf and srm.conf in httpd.conf, at least blank versions of access.conf and srm.conf were still required by the server. Apache 2.0.x no longer needs these extra configuration files. |
Commands in the Apache configuration file are known as directives. In the following sections, we'll analyze the directives from the default Apache httpd.conf installed with Red Hat Linux 9 in some detail. You can read the file for yourself; it includes many other useful comments.
Commands with a pound sign (#) in front are commented out in the default Apache configuration file. If you're learning about Apache for the first time, experiment a bit. Set up some website files on your computer. Use the directory specified by the DocumentRoot directive, which is by default /var/www/html. Try out some of these commands, restart the httpd daemon, and examine the changes for yourself. You might be surprised at what you can do.
We'll look at each of the directives in the global environment section of the default version of the Apache httpd.conf configuration file. Variables in this section apply to all Virtual Hosts that you might configure on this server. There are basic parameters, detailed parameters related to different clients, port settings, pointers to other configuration files, and module locations.
Note | If a directive is set to 0, it normally means you're setting no limit on that directive. For example, if you set Timeout to 0, connections from a client browser are kept open indefinitely. |
The following directive gives users of your website some basic information about your software. While the following command tells users that your web server is Apache on a Unix-style system, other commands are possible, as described in Table 30.3:
ServerTokens OS
Directive | Description |
---|---|
ServerTokens Prod | Identifies the web server as Apache |
ServerTokens Min | Identifies Apache and its version number |
ServerTokens OS | Identifies Apache, its version number, and the type of operating system |
ServerTokens Full | Identifies Apache, its version number, the type of operating system, and compiled modules |
The ServerRoot directive identifies the directory with configuration, error, and log files:
ServerRoot "/etc/httpd"
If you run ls -l /etc/httpd, you'll see links to the real location of certain directories; for example, /etc/httpd/logs is linked to the /var/log/httpd directory.
Apache includes parent and child processes for different connections. The ScoreBoardFile parameter helps these processes communicate with each other. Otherwise, the communication is through active memory.
#ScoreBoardFile run/httpd.scoreboard
Tip | I normally avoid activating the ScoreBoardFile parameter; it's required only for certain architectures, which do not include Red Hat Linux 9. |
You might note that run is a relative subdirectory. The full directory name is based on the ServerRoot directive; in other words, /etc/httpd/run.
The PidFile specifies the file where Apache records the process identifier (PID):
PidFile run/httpd.pid
If computers are having trouble communicating on your network, you need a Timeout value to keep Apache from hanging. The Timeout directive specifies a stop value in seconds.
Timeout 300
Normally, multiple requests are allowed through each connection. The following command disables this behavior:
KeepAlive Off
If the KeepAlive directive is on, you can regulate the number of requests per connection with the MaxKeepAliveRequests directive:
MaxKeepAliveRequests 100
Once a connection is made between Apache and someone's web browser, the KeepAliveTimeout directive specifies the number of seconds to wait for the next client request:
KeepAliveTimeout 15
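As a minimal sketch of how these three directives work together, the following fragment turns persistent connections on. The values of 100 and 15 are the ones shown above; changing KeepAlive to On is an assumption made for illustration, not a setting taken from the default file.

```apache
# Allow several requests over one connection (the default file sets Off)
KeepAlive On
# Close the connection after 100 requests (0 would set no limit)
MaxKeepAliveRequests 100
# Wait up to 15 seconds for the next request on an open connection
KeepAliveTimeout 15
```

With KeepAlive set to Off, the other two directives have no effect.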
Apache includes a number of Multi-Processing Modules (MPMs). These MPMs fall into three categories:
Prefork MPMs are suited to process-based web servers; they are appropriate if your Apache modules do not require separate threads. This imitates the behavior of Apache 1.3.x.
Worker MPMs support both types of modules; however, they should not be used if you're using Apache 1.3 modules, since threads can cause problems.
Per-child MPMs support websites for clients that need different user IDs.
Note | MPMs are flexible; specific modules are available for Windows NT (mpm_winnt) and Novell NetWare (mpm_netware) networks. |
There are a number of common directives that you can specify in each of these MPM categories.
When Apache is started, the StartServers directive sets the number of available child server processes ready for users who want your web pages:
StartServers 8
Once Apache is started, requests from other users may come in. If the number of unused server processes falls below the MinSpareServers directive, additional httpd processes are started automatically:
MinSpareServers 5
When traffic goes down, the MaxSpareServers directive determines the maximum number of httpd processes that are allowed to run idle:
MaxSpareServers 20
You can limit the number of client requests that your web server handles at the same time with the MaxClients directive:
MaxClients 150
You can also limit the number of requests that each child server process handles before it is replaced, with the MaxRequestsPerChild directive:
MaxRequestsPerChild 1000
Apache 2.0 servers can start new threads for each request. The MinSpareThreads directive is similar to MinSpareServers; it allows Apache to handle a surge of additional requests:
MinSpareThreads 25
When the number of requests goes down, Apache monitors the number of spare threads; if they exceed the MaxSpareThreads directive, some are killed:
MaxSpareThreads 75
Every child process can create several threads to handle requests from each user of your website. The ThreadsPerChild directive sets the number of threads created by each child process when it starts:
ThreadsPerChild 25
You can limit the number of requests that each child process handles during its life with the MaxRequestsPerChild directive (the value of 0 shown here sets no limit):
MaxRequestsPerChild 0
You can also limit the number of threads allowed for each child process with the MaxThreadsPerChild directive:
MaxThreadsPerChild 20
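In an Apache 2.0 httpd.conf file, directives like these are typically grouped in <IfModule> containers, so that each group applies only when the matching MPM is compiled into the server. The following is a sketch of that layout using the values shown above; the StartServers value in the worker group is an assumption, since the text gives only the prefork value.

```apache
# Applies only when the prefork MPM is compiled into httpd
<IfModule prefork.c>
    StartServers          8
    MinSpareServers       5
    MaxSpareServers      20
    MaxClients          150
    MaxRequestsPerChild 1000
</IfModule>

# Applies only when the worker MPM is compiled into httpd
<IfModule worker.c>
    # StartServers 2 is an assumed value for this sketch
    StartServers          2
    MaxClients          150
    MinSpareThreads      25
    MaxSpareThreads      75
    ThreadsPerChild      25
    MaxRequestsPerChild   0
</IfModule>
```

Only one MPM is active in a given httpd binary, so only one of these groups takes effect.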
You can set Apache to listen for requests on only certain IP addresses and/or TCP/IP ports with the Listen directive. The default httpd.conf file includes the following directives:
#Listen 12.34.56.78:80
Listen 80
If you have more than one network adapter, you can also limit Apache to certain networks; for example, the following directive only listens to the network adapter with an IP address of 192.168.13.64 on TCP/IP port 80:
Listen 192.168.13.64:80
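You can use more than one Listen directive at a time. The following sketch combines the two forms described above; port 8080 is a hypothetical choice for this example.

```apache
# Accept standard HTTP requests on every network adapter
Listen 80
# Also accept requests on port 8080, but only through one adapter
Listen 192.168.13.64:8080
```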
Note | The Listen directive supersedes the BindAddress and Port directives from Apache version 1.3.x. |
As we noted earlier, there are other configuration files associated with the Apache 2.0.x server. By default, they're in the /etc/httpd/conf.d directory. Normally, file locations are determined by the ServerRoot directive, which is set to /etc/httpd, and the Include directive shown here:
Include conf.d/*.conf
When you need a module in Apache, it should be loaded in the httpd.conf configuration file. Normally, modules are listed in the following format:
LoadModule module_type location
For example, the following directive loads the module named access_module from the ServerRoot modules subdirectory, /etc/httpd/modules. You will find that this is linked to the actual directory with Apache modules: /usr/lib/httpd/modules.
LoadModule access_module modules/mod_access.so
Several modules are listed in the default httpd.conf file; Table 30.4 offers a brief description. The modules are listed in the same order as they appear in the file.
Module | Description |
---|---|
access_module | Supports access control based on an identifier, such as a computer name or IP address. |
auth_module | Allows authentication (usernames and passwords) with text files. |
auth_anon_module | Lets users have anonymous access to areas that require authentication. |
auth_dbm_module | Supports authentication with DBM (database management) files. |
auth_digest_module | Sets authentication with MD5 digests. |
include_module | Includes SSI (server-side includes) data for dynamic web pages. |
log_config_module | Sets logging of requests to the server. |
env_module | Allows control of the environment that is passed to CGI (Common Gateway Interface) scripts and SSI pages. |
mime_magic_module | Sets Apache to define the file type from a look at the first few bytes of the contents. |
cern_meta_module | Supports additional meta-information (extra HTTP headers) for a web page, stored in separate files in the style of the original CERN web server (CERN is the European Laboratory for Particle Physics). |
expires_module | Lets Apache set an expiration date for the page, to support a web browser refresh request. |
headers_module | Allows control of HTTP request and response headers. |
usertrack_module | Supports user tracking with cookies. |
unique_id_module | Sets a unique identifier for each request. |
setenvif_module | Allows Apache to set environment variables based on request characteristics, such as the type of web browser. |
mime_module | Associates the filename extension, such as .txt, with a content type, language, encoding, or handler. |
dav_module | Supports web-based distributed authoring and versioning functionality. |
status_module | Gives information on server performance and activity. |
autoindex_module | Allows the listing of files in a web directory. |
asis_module | Sends files without adding extra headers. |
info_module | Supports user access to server configuration information. |
dav_fs_module | Supports the dav_module by storing its files on the local filesystem. |
vhost_alias_module | Allows dynamically configured Virtual Hosts. |
negotiation_module | Sets Apache to match content, such as language, to the settings from the browser. |
dir_module | Supports viewing of files in Apache directories. |
imap_module | Configures imagemap file directives (not related to e-mail). |
actions_module | Lets you run CGI scripts based on the file type or request method. |
speling_module | Allows for small mistakes in requested document names (ironically, the module name is misspelled). |
userdir_module | Supports access to user-specific directories. |
alias_module | Sets up redirected URLs. |
rewrite_module | Supports rewriting of URLs. |
proxy_module | Sets up a proxy server for Apache. |
proxy_ftp_module | Allows proxy server support for FTP data. |
proxy_http_module | Allows proxy server support for HTTP data. |
proxy_connect_module | Required for proxy server connect requests. |
cgi_module | Configures running of CGI scripts. |
cgid_module | Supports running of CGI scripts with an external daemon. |
One of the more interesting modules is the info_module; as you'll see toward the end of the next section, it supports a detailed view of your Apache server configuration in your browser at localhost/server-info.
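As a sketch of how that view might be switched on, the following container assigns the server-info handler to the /server-info URL. It assumes that info_module is loaded, and the restriction to the local computer is an assumption added for this example.

```apache
# Requires: LoadModule info_module modules/mod_info.so
<Location /server-info>
    SetHandler server-info
    # Deny everyone, then allow only the local computer
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>
```

Since this page reveals your whole configuration, it is worth keeping the access list short.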
Before we move on to configuring Virtual Hosts, let's take a look at the next section in the httpd.conf configuration file, which includes the default directives for Apache. While you can set different settings for many of these directives, you do need to know the defaults in this section. We analyze the basic settings in this part of the httpd.conf file in order.
Note | This is a very long section; you may want to take a break if you're in the habit of reading through a complete section at a time. |
As determined by the User and Group directives, the Apache daemon, httpd, is assigned a specific user and group name here and in /etc/passwd and /etc/group:
User apache
Group apache
With web pages generated by Apache, there is a listing for an administrative contact, as determined by the ServerAdmin directive:
ServerAdmin root@localhost
If you have an administrative website for your web server, you'll want to set it with the ServerName directive. If you don't have a fully qualified domain name in a DNS server, use the IP address.
#ServerName new.host.name:80
If you activate this directive, it will normally be superseded by the name you set for each Virtual Host.
Technically, every URL that points to a directory, such as http://www.Sybex.com/, is supposed to have a trailing slash. But I never remember to put it in. When the slash is missing, Apache redirects the browser to the same URL with the slash added. The UseCanonicalName directive determines which hostname goes into that redirected URL: when it is set to On, Apache uses the name specified by the ServerName directive; when it is set to Off, as in the standard httpd.conf file, Apache uses the hostname and port supplied by the client.
UseCanonicalName Off
The root directory for your web server is specified by the DocumentRoot directive:
DocumentRoot "/var/www/html"
Next, we look at the default permissions for users within directories accessible through your server's websites. They're set up by the <Directory /> container, which applies to the root ( / ) directory of the filesystem and therefore sets the defaults for every directory below it, including the DocumentRoot:
<Directory />
    Options FollowSymLinks
    AllowOverride None
</Directory>
The Options directive determines which server features are available in that directory. It can be set to several different values, as described in Table 30.5. The AllowOverride directive determines which settings an .htaccess file, such as a list of users or computers allowed to see certain files, may override; with the AllowOverride None setting, Apache doesn't even look at the .htaccess file.
Value | Description |
---|---|
All | Supports all settings except MultiViews . |
ExecCGI | Allows the running of CGI scripts. |
FollowSymLinks | Lets requests follow symbolically linked files or directories. |
Includes | Allows the use of server-side includes (SSI). |
IncludesNOEXEC | Allows SSIs, but no CGIs. |
Indexes | If there is no index.html type file, sets up Apache to return a list of files in that directory. The names of index files are specified by the DirectoryIndex directive. |
MultiViews | Supports content negotiation, such as between web pages in different languages. |
SymLinksIfOwnerMatch | Follows symbolic links if the target file or directory is owned by the same user. |
An .htaccess file is a distributed configuration file that you can use to configure individual directories on a website. It is a common way to implement restricted access to a specific directory.
An .htaccess file isn't necessary in most cases; you can configure access on a per-directory basis in the main Apache configuration file, httpd.conf. In the default version of the main Apache configuration file, look for <Directory> containers. Observe how the restrictions vary for different directories.
However, if you have a large number of websites on your server, such as the personal web pages associated with many ISPs, you may want to use .htaccess files to let individual users regulate access to web pages in their home directories. You can set up a standard scheme to read .htaccess files, as described later in the "User Directory Permissions" section.
If you want to implement distributed configuration files, you can make them a little more secure. Look for the AccessFileName directive in httpd.conf. Assign a hidden file name other than .htaccess. Also see the "Access Control" section later in this chapter.
Next, we'll look at the default permissions in httpd.conf for the /var/www/html directory, as specified by the following container:
<Directory "/var/www/html">
The following Options directive supports redirection via symbolic links and the listing of files in the current directory if there is no index.html type file (look ahead to Figure 30.2 for an example):
Options Indexes FollowSymLinks
As we mentioned in the previous section, the AllowOverride directive specifies the types of directives in the .htaccess file; the following option doesn't even look at .htaccess:
AllowOverride None
Finally, there are access control directives; the following looks for an Allow and then a Deny directive for this directory, in order:
Order allow,deny
Allow from all
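Put together, the container described above would look something like the following sketch. The closing </Directory> tag does not appear in the excerpts above, but every container needs one; the comments are added here for explanation.

```apache
<Directory "/var/www/html">
    # Follow symbolic links; list files when there is no index page
    Options Indexes FollowSymLinks
    # Ignore any .htaccess files in this directory
    AllowOverride None
    # Evaluate Allow directives first, then Deny directives
    Order allow,deny
    Allow from all
</Directory>
```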
Now the httpd.conf file adds a couple more directives for users that access the top directory of your website, also known as DocumentRoot :
<LocationMatch "^/$">
    Options -Indexes
    ErrorDocument 403 /error/noindex.html
</LocationMatch>
The <LocationMatch "^/$"> container looks a little strange; the regular expression ^/$ matches only the URL /, so the commands therein (Options and ErrorDocument) apply to requests for the top level of your website.
The Options -Indexes directive prohibits the listing of files, courtesy of the - in front of the Indexes setting. If no index.html page is available, the ErrorDocument directive returns the noted error web page to the user. The /error/noindex.html location is a URL path, not a path based on the ServerRoot directive; an Alias directive in the default httpd.conf file maps /error/ to the /var/www/error directory, so the file is /var/www/error/noindex.html.
Oddly enough, the noindex.html file is the "Test Page" that is shown when Apache starts without the pages associated with a real website. It's shown back in Figure 30.1.
You can set up web pages in your users' home directories. They are disabled by default with the following command:
UserDir disable
You can replace that command with the following:
UserDir public_html
Assume you have a user named ez, and she has a set of web page files in the /home/ez/public_html directory. Also, assume that your website is named www.example.abc. You need to set appropriate permissions:
# chmod 711 /home/ez
# chmod 755 /home/ez/public_html
# chmod 744 /home/ez/public_html/*
Then when you direct your browser to www.example.abc/~ez, you will be able to see any index.html web page that you might have stored in the /home/ez/public_html directory.
You can further regulate access to web pages and files in users' home directories. Look at the following sample commands from the default httpd.conf file:
#<Directory /home/*/public_html>
#    AllowOverride FileInfo AuthConfig Limit
#    Options MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec
#    <Limit GET POST OPTIONS>
#        Order allow,deny
#        Allow from all
#    </Limit>
#    <LimitExcept GET POST OPTIONS>
#        Order deny,allow
#        Deny from all
#    </LimitExcept>
#</Directory>
If you activate these commands, Apache allows you to browse the files in the public_html subdirectory, as described later in the "Directory Listings" section.
As described earlier, the AllowOverride directive relates to the access information that Apache reads from an individual .htaccess file. The different parameters associated with this directive are shown in Table 30.6. All descriptions refer to the commands that you can use in an .htaccess file on a per-directory basis.
Parameter | Description |
---|---|
AuthConfig | Supports the use of authorization directives |
FileInfo | Lets you configure various document types |
Indexes | Permits you to configure indexing of the directory |
Limit | Supports access control restrictions, such as deny and allow |
The Options directive described in Table 30.5 supports content negotiation, file indexing, following symbolic links, and support for SSIs but not CGIs.
The Limit directive sets options for users who want to send ( POST ) and receive ( GET ) files from the user home directory; the LimitExcept directive denies the use of all other access commands.
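When AllowOverride includes AuthConfig, a user can password-protect a directory with her own .htaccess file. The following is a sketch only; the directory, the realm name, and the password file path are all hypothetical, and the password file would have to be created separately, for example with the htpasswd utility.

```apache
# Hypothetical file: /home/ez/public_html/private/.htaccess
AuthType Basic
AuthName "Private Files"
# Password file kept outside the public_html tree (hypothetical path)
AuthUserFile /home/ez/.htpasswd-web
# Any user listed in the password file may log in
Require valid-user
```

Apache reads this file on each request to that directory, so no restart is needed after a change.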
When users navigate to your website, they're actually looking in a directory. The DirectoryIndex directive tells Apache the types of web pages to send back to the website user:
DirectoryIndex index.html index.html.var
The index.html document is a standard home page file used by many websites; index.html.var is one way to set up a dynamic home page. You can look at an example of .var files in the /var/www/error directory. Open those files in the text editor of your choice. You'll see standard error messages.
As described in the sidebar ".htaccess Files," you can configure access control files on individual directories. By default, it's the hidden file .htaccess; you can set a different filename with the AccessFileName directive:
AccessFileName .htaccess
The following Files directive ensures that any file that starts with .ht is not viewable by users who are browsing your website:
<Files ~ "^\.ht">
    Order allow,deny
    Deny from all
</Files>
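If you assign a different file name with AccessFileName, the ^\.ht pattern may no longer match it. The following sketch uses .siteaccess, a hypothetical name chosen for this example, and adds a second Files container to keep that file hidden from browsers.

```apache
# Use a hypothetical name in place of .htaccess
AccessFileName .siteaccess

# Keep the renamed file from being served to browsers
<Files ".siteaccess">
    Order allow,deny
    Deny from all
</Files>
```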
While the MIME (Multipurpose Internet Mail Extensions) standard was originally created for sending binary files over e-mail, it works for web pages as well. For example, you can configure your browser to open the PDF reader of your choice if you navigate to a PDF file on the Internet. The standard translation between MIME types and file extensions is listed through the TypesConfig directive:
TypesConfig /etc/mime.types
Many files do not have extensions such as .pdf or .doc. You can set the DefaultType directive to specify display options on a browser. If you use text files, the following standard should work well:
DefaultType text/plain
Alternatively, if most of your files are in binary format, you could end up sending dozens of pages of gibberish to your users unless you changed this directive to something like:
DefaultType application/octet-stream
If the extension doesn't provide a clue, you can use the MIMEMagicFile directive, which uses the mod_mime_magic module defined in Table 30.4:
<IfModule mod_mime_magic.c>
#   MIMEMagicFile /usr/share/magic.mime
    MIMEMagicFile conf/magic
</IfModule>
Remember, the location of a relative path such as conf/magic is based on the ServerRoot directive. In other words, this section points to the MIMEMagicFile at /etc/httpd/conf/magic.
There is one more related directive, toward the end of the httpd.conf file. The AddType directive allows you to override the configuration as defined by TypesConfig in /etc/mime.types:
AddType application/x-tar .tgz
Apache logs can be very large. If you're running a large commercial website, you could easily collect hundreds of megabytes of log data every day. The choices you make for log data could easily overload your system.
Normally, HostnameLookups is set to Off; otherwise, Apache will look for the fully qualified domain name of every requesting user. Don't turn it on unless you have reliable access to a DNS server and the network capacity to handle that volume of information.
HostnameLookups Off
You can set the locations of different log files. The ErrorLog directive, as you'd expect, sets the location of the error_log file. With the given value of ServerRoot, the following log file is located in the /etc/httpd/logs directory:
ErrorLog logs/error_log
You can control the types of messages sent to the ErrorLog file; available values for the LogLevel directive (debug, info, notice, warn, error, crit, alert, emerg) are similar to the priorities used in the system logging configuration file, /etc/syslog.conf, back in Chapter 13.
LogLevel warn
Information about each request is logged in a specific format, as defined by the following LogFormat directives:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent
Each of these lines defines one of four named formats: combined, common, referer, and agent.
The variables associated with LogFormat are described in Table 30.7. A substantial number of additional variables are available, which you can review in the mod_log_config.html file in the /var/www/manual/mod directory. Other request fields are per the standards of the World Wide Web Consortium, at www.w3.org/Protocols/HTTP/HTRQ_Headers.html.
Variable | Description |
---|---|
%a | Remote IP address. |
%b | Bytes sent (not including HTTP headers). |
%h | Remote host. |
%l | Remote log name. |
%r | First line of the client request. |
%s | Request status. |
%t | Time. |
%u | Remote user. |
referer | Notes the page where someone clicked on a link. (Yes, in Apache, referer is not spelled correctly.) |
user-agent | Notes the client program, such as Mozilla. |
You can set the location of several other types of logs, as defined through the CustomLog directive. You can set this up within one of your Virtual Hosts, so the owners of individual websites on your server can get their own log files:
# CustomLog logs/access_log common
CustomLog logs/access_log combined
#CustomLog logs/referer_log referer
#CustomLog logs/agent_log agent
#CustomLog logs/access_log combined
These lines specify the location of your log files. Based on the default ServerRoot, that's /etc/httpd/logs. The actual information that's sent to each log file is based on the referenced LogFormat. For example, the active CustomLog directive refers to the combined format, which you might recall is:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
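You can also define a format of your own and use it for one Virtual Host. The following is a sketch under several assumptions: the brief format name, the www.example.abc host, and all file and directory names are hypothetical, and the NameVirtualHost line assumes name-based Virtual Hosts on port 80.

```apache
# A hypothetical format built from the variables in Table 30.7
LogFormat "%a %t \"%r\" %>s %b" brief

NameVirtualHost *:80
<VirtualHost *:80>
    ServerName www.example.abc
    DocumentRoot /var/www/html/example
    # Separate log files for this website only
    CustomLog logs/example_access_log brief
    ErrorLog logs/example_error_log
</VirtualHost>
```

As with the other log files, the relative logs/ path is based on the ServerRoot directive.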
The httpd.conf file can add one element to dynamically generated web pages, depending on the ServerSignature directive. Normally it's set as follows:
ServerSignature On
When ServerSignature is set to On , you might see a message similar to the following at the bottom of dynamically generated web pages:
Apache/2.0.40 Server at localhost Port 80
Alternatively, if you substitute EMail for On, you'll get a hyperlink from the name of the computer, in this case, localhost, to the e-mail address of the server administrator, as defined by the ServerAdmin directive.
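If you would rather reveal as little as possible about your software, you could combine this directive with the ServerTokens directive described earlier. The following sketch shows one such combination; it is an illustration, not the setting in the default file.

```apache
# Identify the server only as Apache in response headers
ServerTokens Prod
# Omit the signature line from server-generated pages
ServerSignature Off
```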
You can use the Alias directive to set up a link between a directory in the URL and a directory on your computer. For example, the first Alias directive in the default httpd.conf file links the /icons/ subdirectory from a URL:
Alias /icons/ "/var/www/icons/"
to the /var/www/icons/ directory on the web server. This is also a good place to specify the permissions associated with /var/www/icons/ .
<Directory "/var/www/icons">
    Options Indexes MultiViews
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>
These permissions allow users to read the contents of the directory, unless there's a DirectoryIndex file such as index.html, and support content negotiation, such as different languages, via MultiViews.
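You can follow the same pattern for a directory of your own. In the following sketch, the /downloads/ URL path and the /var/www/downloads directory are hypothetical names chosen for this example.

```apache
# Map the URL path /downloads/ to a directory outside DocumentRoot
Alias /downloads/ "/var/www/downloads/"

<Directory "/var/www/downloads">
    # List the files, since no index page is expected here
    Options Indexes
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>
```

Pairing each Alias with its own Directory container keeps the permissions next to the mapping they belong to.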
If you've installed the httpd-manual-* RPM, the following default Alias directive already includes the Apache manual on your website, at the /manual URL:
Alias /manual "/var/www/manual"
The first argument of an Alias directive is a URL path, not a directory based on the ServerRoot directive, so there's no need to change it. The following lines set permissions for the noted directory, and specify the location of the Web-based Distributed Authoring and Versioning (WebDAV) lock database:
<Directory "/var/www/manual">
    Options Indexes FollowSymLinks MultiViews
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>
<IfModule mod_dav_fs.c>
    # Location of the WebDAV lock database.
    DAVLockDB /var/lib/dav/lockdb
</IfModule>
Scripts in httpd.conf refer to programs that are run through the web server. Apache starts in the default httpd.conf file with a ScriptAlias directive, which is a specialized Alias for scripts:
ScriptAlias /cgi-bin/ "/var/www/cgi-bin/"
Some scripts require access to the CGI daemon, which is defined by the ScriptSock directive:
<IfModule mod_cgid.c>
    ScriptSock run/httpd.cgid
</IfModule>
Once again, this is a good opportunity to define the permissions associated with the scripts associated with your websites:
<Directory "/var/www/cgi-bin">
    AllowOverride None
    Options None
    Order allow,deny
    Allow from all
</Directory>
Note how these permissions don't allow the use of .htaccess but support script execution by all users.
If you change website names, you'll want to redirect users. For example, the following default Redirect directive takes users who navigate to your /bears directory to www.mommabears.com:
# Redirect permanent /bears http://www.mommabears.com
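The Redirect directive accepts an optional status keyword before the URL path. The following sketch shows two of them; the paths and the www.example.abc address are hypothetical.

```apache
# Tell browsers that a directory has moved for good (status 301)
Redirect permanent /oldsite http://www.example.abc/newsite
# Send browsers elsewhere for now (status 302)
Redirect temp /status.html http://www.example.abc/maintenance.html
```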
Sometimes you want to see the files in a directory. For example, Figure 30.2 illustrates the files in the /home/mike/public_html directory, based on the UserDir directives described earlier, in the "User Directory Permissions" section.
The IndexOptions directive determines how index files are shown in client web browsers. For example, the default IndexOptions line
IndexOptions FancyIndexing VersionSort NameWidth=*
configures FancyIndexing, for icons and file sizes; VersionSort, which sorts numbers such as RPM versions in a specific order; and a NameWidth as large as needed for the filenames in the directory.
Speaking of icons, a list of icons is available for different file types and extensions. These icons are shown with a file list, assuming you have set IndexOptions FancyIndexing as defined in the previous section. There are three basic AddIcon* directives:
AddIconByEncoding (CMP,/icons/compressed.gif) x-compress x-gzip
The AddIconByEncoding directive shown here applies to compressed binary files. Several AddIconByType directives are also included for four different file types:
AddIconByType (TXT,/icons/text.gif) text/*
AddIconByType (IMG,/icons/image2.gif) image/*
AddIconByType (SND,/icons/sound2.gif) audio/*
AddIconByType (VID,/icons/movie.gif) video/*
Finally, there is a series of AddIcon directives that associate a specific icon with different filename extensions:
AddIcon /icons/binary.gif .bin .exe
AddIcon /icons/binhex.gif .hqx
AddIcon /icons/tar.gif .tar
AddIcon /icons/world2.gif .wrl .wrl.gz .vrml .vrm .iv
AddIcon /icons/compressed.gif .Z .z .tgz .gz .zip
AddIcon /icons/a.gif .ps .ai .eps
AddIcon /icons/layout.gif .html .shtml .htm .pdf
AddIcon /icons/text.gif .txt
AddIcon /icons/c.gif .c
AddIcon /icons/p.gif .pl .py
AddIcon /icons/f.gif .for
AddIcon /icons/dvi.gif .dvi
AddIcon /icons/uuencoded.gif .uu
AddIcon /icons/script.gif .conf .sh .shar .csh .ksh .tcl
AddIcon /icons/tex.gif .tex
AddIcon /icons/bomb.gif core
AddIcon /icons/back.gif ..
AddIcon /icons/hand.right.gif README
AddIcon /icons/folder.gif ^^DIRECTORY^^
AddIcon /icons/blank.gif ^^BLANKICON^^
These AddIcon directives are straightforward. For example, if Apache sees a file with an .exe extension, it adds the /icons/binary.gif icon as a label for that particular file. But this list is not comprehensive; there is a DefaultIcon directive for files with unknown extensions:
DefaultIcon /icons/unknown.gif
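To see how the AddIcon and DefaultIcon directives work together, here is a minimal Python sketch. It is a simplified model, not Apache's own code: the function name icon_for is hypothetical, only a few of the suffixes listed above are included, and the sketch assumes that a plain "filename ends with this suffix" test is a fair approximation of the matching.

```python
# Simplified model of AddIcon / DefaultIcon lookup (not Apache's actual code).
# A handful of suffix-to-icon pairs taken from the AddIcon list above.
ADD_ICON = {
    ".bin": "/icons/binary.gif",
    ".exe": "/icons/binary.gif",
    ".tar": "/icons/tar.gif",
    ".gz": "/icons/compressed.gif",
    ".txt": "/icons/text.gif",
    ".c": "/icons/c.gif",
}
DEFAULT_ICON = "/icons/unknown.gif"  # value of the DefaultIcon directive


def icon_for(filename: str) -> str:
    """Return the icon whose suffix ends the filename, else the default icon."""
    for suffix, icon in ADD_ICON.items():
        if filename.endswith(suffix):
            return icon
    return DEFAULT_ICON


if __name__ == "__main__":
    for name in ("setup.exe", "archive.tar.gz", "notes.xyz"):
        print(name, "->", icon_for(name))
```

A file such as notes.xyz, which matches none of the listed suffixes, falls through to the DefaultIcon value.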
If you like, you can activate the following AddDescription directives to give users a bit more information about files with specific extensions:
#AddDescription "GZIP compressed document" .gz
#AddDescription "tar archive" .tar
#AddDescription "GZIP compressed tar archive" .tgz
You can set up directories with various HTML files. For example, the HeaderName directive specifies a file to put before the file list; the ReadmeName directive specifies a file to put after the file list.
ReadmeName README.html
HeaderName HEADER.html
The IndexIgnore directive sets Apache to avoid listing the noted files in any directory list. Note how the default value includes the HEADER.html and README.html files.
IndexIgnore .??* *~ *# HEADER* README* RCS CVS *,v *,t
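If you want a feel for which files these wildcard patterns hide, the following Python sketch applies them to a sample directory listing. It assumes that Python's fnmatch module is a close enough stand-in for Apache's own wildcard matching; the function name visible_entries and the sample filenames are made up for illustration.

```python
from fnmatch import fnmatchcase

# Patterns copied from the IndexIgnore line above.
INDEX_IGNORE = [".??*", "*~", "*#", "HEADER*", "README*", "RCS", "CVS", "*,v", "*,t"]


def visible_entries(names):
    """Return the names that match none of the ignore patterns."""
    return [
        name
        for name in names
        if not any(fnmatchcase(name, pattern) for pattern in INDEX_IGNORE)
    ]


if __name__ == "__main__":
    listing = ["index.html", "README.html", "HEADER.html", ".htaccess",
               "notes.txt~", "main.c,v", "CVS", "photo.jpg"]
    print(visible_entries(listing))
```

In this sample listing, only index.html and photo.jpg survive the filter; the header and readme files, hidden files, editor backups, and version-control leftovers are all dropped.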
Some browsers can read and automatically decompress certain files in your website directories. All you need to do is specify the encoding associated with certain filename extensions by using the AddEncoding directive:
AddEncoding x-compress Z
AddEncoding x-gzip gz tgz
Multilingual websites include web pages in multiple languages. The DefaultLanguage directive defines the language associated with all web pages that aren't already labeled. The following inactive directive specifies the Dutch language:
# DefaultLanguage nl
You can set up web pages in different languages, as defined by the AddLanguage directive. For example, index.html.cz is a web page associated with the Czech language:
AddLanguage cz .cz
Other language codes are listed in Table 30.8.
Code | Language |
---|---|
ca | Catalan |
cz | Czech |
da | Danish |
de | German |
en | English |
el | Modern Greek |
es | Spanish |
et | Estonian |
fr | French |
he | Hebrew |
hr | Croatian |
it | Italian |
ja | Japanese |
kr | Korean |
ltz | Luxembourgeois |
nl | Dutch (Netherlands) |
nn | Norwegian Nynorsk |
no | Norwegian |
pl | Polish |
pt | Portuguese |
pt-br | Brazilian Portuguese |
ru | Russian |
sv | Swedish |
tw | Chinese * |
zh-tw | Chinese |
* Anyone who follows the political situation in China in any depth will understand that the designation of tw as Chinese has caused some controversy. As I understand it, the people behind Apache are in the process of converting all Chinese AddLanguage codes to zh.
A web browser should tell the web server the preferred language. However, when this doesn't work, the LanguagePriority directive sets the preferred language:
LanguagePriority en da nl et fr de el it ja kr no pl pt pt-br ltz ca es sv tw
This works hand in hand with the ForceLanguagePriority directive. As defined in the default httpd.conf file, it uses the LanguagePriority directive list to select from languages acceptable to the client web browser. If no acceptable language page is available, the first item on the LanguagePriority list (in this case, English) is used.
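The selection logic just described can be sketched in a few lines of Python. This is a rough model under stated assumptions, not Apache's negotiation code: it ignores the quality values a real browser can attach to each language, the function name choose_language is hypothetical, and the priority list is a shortened copy of the one shown above.

```python
# Shortened copy of the LanguagePriority list shown above.
LANGUAGE_PRIORITY = ["en", "da", "nl", "et", "fr", "de"]


def choose_language(accepted, available, priority=LANGUAGE_PRIORITY):
    """Pick a language variant for a page.

    accepted:  language codes the browser says it accepts
    available: language codes for which a variant of the page exists
    """
    acceptable = [lang for lang in accepted if lang in available]
    if acceptable:
        # Several acceptable variants: prefer the one ranked highest
        # in the priority list.
        ranked = [lang for lang in priority if lang in acceptable]
        return ranked[0] if ranked else acceptable[0]
    # Nothing acceptable: fall back to the first priority language
    # that actually has a variant.
    for lang in priority:
        if lang in available:
            return lang
    return None


if __name__ == "__main__":
    print(choose_language(["fr", "de"], {"en", "de", "fr"}))
    print(choose_language(["ja"], {"en", "fr"}))
```

The first call models a browser that accepts French and German; the second models a browser whose only accepted language has no matching page, so the sketch falls back to the top of the priority list.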
Many languages don't work too well unless you have the right set of characters. Most language characters have been organized into different ISO character sets. The default, which works for English and a number of similar languages, is ISO-8859-1. It's forced into the default websites for Apache with the following directive:
AddDefaultCharset ISO-8859-1
Several other character sets are available, as defined by the following AddCharset directives. For more information on these character sets, see www.iana.org/assignments/character-sets.
AddCharset ISO-8859-1 .iso8859-1 .latin1
AddCharset ISO-8859-2 .iso8859-2 .latin2 .cen
AddCharset ISO-8859-3 .iso8859-3 .latin3
AddCharset ISO-8859-4 .iso8859-4 .latin4
AddCharset ISO-8859-5 .iso8859-5 .latin5 .cyr .iso-ru
AddCharset ISO-8859-6 .iso8859-6 .latin6 .arb
AddCharset ISO-8859-7 .iso8859-7 .latin7 .grk
AddCharset ISO-8859-8 .iso8859-8 .latin8 .heb
AddCharset ISO-8859-9 .iso8859-9 .latin9 .trk
AddCharset ISO-2022-JP .iso2022-jp .jis
AddCharset ISO-2022-KR .iso2022-kr .kis
AddCharset ISO-2022-CN .iso2022-cn .cis
AddCharset Big5 .Big5 .big5
# For Russian, more than one charset is used (depends on client, mostly):
AddCharset WINDOWS-1251 .cp-1251 .win-1251
AddCharset CP866 .cp866
AddCharset KOI8-r .koi8-r .koi8-ru
AddCharset KOI8-ru .koi8-uk .ua
AddCharset ISO-10646-UCS-2 .ucs2
AddCharset ISO-10646-UCS-4 .ucs4
AddCharset UTF-8 .utf8
AddCharset GB2312 .gb2312 .gb
AddCharset utf-7 .utf7
AddCharset utf-8 .utf8
AddCharset big5 .big5 .b5
AddCharset EUC-TW .euc-tw
AddCharset EUC-JP .euc-jp
AddCharset EUC-KR .euc-kr
AddCharset shift_jis .sjis
You can map filename extensions to a specific handler. For example, the following commented AddHandler directive activates CGI script handling for files with the .cgi extension, assuming you also have set the Options ExecCGI directive for the subject directory:
#AddHandler cgi-script .cgi
The following commented directive makes sure that files that already have HTTP headers don't get processed:
#AddHandler send-as-is asis
To activate commented directives, remove the comment mark (#) in httpd.conf in the text editor of your choice.
This directive processes image map files:
AddHandler imap-file map
Finally, this directive supports .var files, which are associated with finding the language specified by a web browser client:
AddHandler type-map var
Part of the process includes output filters. For example, the following AddOutputFilter directive looks in web pages with .shtml extensions for Server Side Includes:
AddOutputFilter INCLUDES .shtml
On a web server, if you have an error, you get a message associated with a specific web page. Figure 30.3 illustrates the error message associated with the HTTP 404 error code, also known as the "file not found" error.
The default error directory is /var/www/error; the following Alias directive associates the error directory with those files:
Alias /error/ "/var/www/error/"
The following modules provide for content negotiation and SSIs in the web pages in the /var/www/error/ directory:
<IfModule mod_negotiation.c>
<IfModule mod_include.c>
The following permissions on the /var/www/error directory set the stage for error messages in English, Spanish, German, and French, in that order. You can read more about the other directives in the "Directory Index" section earlier in this chapter.
<Directory "/var/www/error">
    AllowOverride None
    Options IncludesNoExec
    AddOutputFilter Includes html
    AddHandler type-map var
    Order allow,deny
    Allow from all
    LanguagePriority en es de fr
    ForceLanguagePriority Prefer Fallback
</Directory>
This works hand in hand with HTTP error codes. The page a user sees depends on the error code and the web page defined by the following ErrorDocument directives:
ErrorDocument 400 /error/HTTP_BAD_REQUEST.html.var
ErrorDocument 401 /error/HTTP_UNAUTHORIZED.html.var
ErrorDocument 403 /error/HTTP_FORBIDDEN.html.var
ErrorDocument 404 /error/HTTP_NOT_FOUND.html.var
ErrorDocument 405 /error/HTTP_METHOD_NOT_ALLOWED.html.var
ErrorDocument 408 /error/HTTP_REQUEST_TIME_OUT.html.var
ErrorDocument 410 /error/HTTP_GONE.html.var
ErrorDocument 411 /error/HTTP_LENGTH_REQUIRED.html.var
ErrorDocument 412 /error/HTTP_PRECONDITION_FAILED.html.var
ErrorDocument 413 /error/HTTP_REQUEST_ENTITY_TOO_LARGE.html.var
ErrorDocument 414 /error/HTTP_REQUEST_URI_TOO_LARGE.html.var
ErrorDocument 415 /error/HTTP_SERVICE_UNAVAILABLE.html.var
ErrorDocument 500 /error/HTTP_INTERNAL_SERVER_ERROR.html.var
ErrorDocument 501 /error/HTTP_NOT_IMPLEMENTED.html.var
ErrorDocument 502 /error/HTTP_BAD_GATEWAY.html.var
ErrorDocument 503 /error/HTTP_SERVICE_UNAVAILABLE.html.var
ErrorDocument 506 /error/HTTP_VARIANT_ALSO_VARIES.html.var
When a web browser asks for a web page, it tells Apache what kind of browser it is. The BrowserMatch directive helps you customize the response to different web browsers:
BrowserMatch "Mozilla/2" nokeepalive
BrowserMatch "MSIE 4\.0b2;" nokeepalive downgrade-1.0 force-response-1.0
BrowserMatch "RealPlayer 4\.0" force-response-1.0
BrowserMatch "Java/1\.0" force-response-1.0
BrowserMatch "JDK/1\.0" force-response-1.0
The first two directives create special responses for older browsers; Mozilla/2 corresponds to Netscape 2.x, and MSIE 4\.0b2 corresponds to a beta release of Microsoft Internet Explorer 4.0. These browsers do not conform to the current HTTP 1.1 standard. The last three directives force HTTP 1.0-level responses to the specified web browsers.
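Each BrowserMatch pattern is a regular expression that is tested against the browser's User-Agent string. The following Python sketch mimics that test with the re module. It is an illustration only: the function name env_for and the sample User-Agent strings are assumptions, and Python's regular expression syntax is not guaranteed to match Apache's in every detail.

```python
import re

# (pattern, variables) pairs copied from the BrowserMatch lines above.
BROWSER_MATCH = [
    (r"Mozilla/2", ["nokeepalive"]),
    (r"MSIE 4\.0b2;", ["nokeepalive", "downgrade-1.0", "force-response-1.0"]),
    (r"RealPlayer 4\.0", ["force-response-1.0"]),
    (r"Java/1\.0", ["force-response-1.0"]),
    (r"JDK/1\.0", ["force-response-1.0"]),
]


def env_for(user_agent: str) -> set:
    """Collect the variables set by every pattern found in the User-Agent."""
    env = set()
    for pattern, variables in BROWSER_MATCH:
        if re.search(pattern, user_agent):
            env.update(variables)
    return env


if __name__ == "__main__":
    # Hypothetical User-Agent strings, for illustration only.
    print(env_for("Mozilla/2.02 (Win95; I)"))
    print(env_for("Mozilla/5.0 (X11; Linux)"))
```

A browser that matches none of the patterns gets an empty set, which corresponds to Apache's normal HTTP 1.1 behavior.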
There is a special issue with Microsoft WebFolders, which does not properly handle WebDAV databases. This issue is addressed with the following BrowserMatch directives:
BrowserMatch "Microsoft Data Access Internet Publishing Provider" redirect-carefully
BrowserMatch "^WebDrive" redirect-carefully
Apache can report on its own status and configuration through special server-report handlers. For example, the following stanza, when activated, gives you the current status of Apache:
#<Location /server-status>
#    SetHandler server-status
#    Order deny,allow
#    Deny from all
#    Allow from .your-domain.com
#</Location>
I would activate it with the following commands; otherwise, the Deny from all command would stop all traffic to the http://servername/server-status address. In this case, my LAN is on the 192.168.13.0/24 network:
<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 192.168.13.0/24
</Location>
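To check which client addresses this stanza lets through, you can model the Order deny,allow logic with Python's ipaddress module. This is a sketch under stated assumptions, not Apache's access-control code; the function name may_view_status and the sample client addresses are made up for illustration.

```python
import ipaddress

# Network from the "Allow from" line in the stanza above.
ALLOWED_NETWORK = ipaddress.ip_network("192.168.13.0/24")


def may_view_status(client_ip: str) -> bool:
    """Model 'Order deny,allow', 'Deny from all', 'Allow from <network>'."""
    # With Order deny,allow the Deny rules are evaluated first, and a
    # matching Allow rule then overrides them.
    denied = True  # Deny from all
    allowed = ipaddress.ip_address(client_ip) in ALLOWED_NETWORK
    return allowed or not denied


if __name__ == "__main__":
    for ip in ("192.168.13.20", "192.168.14.20"):
        print(ip, may_view_status(ip))
```

Only addresses inside 192.168.13.0/24 get through; everything else is caught by Deny from all.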
You can see the result from another computer on my LAN through a different web browser in Figure 30.4.
You can get similar reports on your Apache configuration when you properly activate the following commands:
#<Location /server-info>
#    SetHandler server-info
#    Order deny,allow
#    Deny from all
#    Allow from .your-domain.com
#</Location>
These commands are direct from the default httpd.conf file; remember to set Allow from your_network_address, similar to what I did in the previous stanza. When you do, you can see the results remotely, as shown in Figure 30.5.
Apache includes its own proxy server. You can set Apache to cache and serve requested web pages to local networks or to all users. The basic commands are shown here; I've changed them a bit to apply the proxy server to my LAN with a network address of 192.168.13.0/24:
#<IfModule mod_proxy.c>
#ProxyRequests On
#
#<Proxy *>
#    Order deny,allow
#    Deny from all
#    Allow from 192.168.13.0/24
#</Proxy>
If you have multiple proxy servers, you should activate the following ProxyVia directive, which supports searches through a chain of proxy servers using HTTP 1.1:
#ProxyVia On
A proxy server has no purpose unless you configure a cache. Table 30.9 describes the series of special directives associated with caches. If you set up a proxy server, you may want to change some of these settings; for example, you may want a CacheSize larger than 5KB:
#CacheRoot "/etc/httpd/proxy"
#CacheSize 5
#CacheGcInterval 4
#CacheMaxExpire 24
#CacheLastModifiedFactor 0.1
#CacheDefaultExpire 1
#NoCache a-domain.com another-domain.edu joes.garage-sale.com
Directive | Description |
---|---|
CacheDefaultExpire | Sets the time to cache a document, in seconds. |
CacheGcInterval | Configures the time between attempts to clear old data from a cache, in hours. |
CacheLastModifiedFactor | Sets the expiration time for files in the cache. If there is no expiration date and time associated with a web page, Apache sets it relative to the amount of time since the last known change to that page. |
CacheMaxExpire | Selects the maximum time in seconds to cache a document. |
CacheRoot | Configures the default directory with the proxy server cache. |
CacheSize | Sets the size of the cache, in kilobytes. |
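The CacheLastModifiedFactor calculation is easier to follow with numbers. The Python sketch below multiplies the time since a page last changed by the factor and then applies a maximum, which is how I understand the estimate to work. The function name estimated_expiry is hypothetical, and treating the cap as 24 hours is an assumption made only for this illustration; check the units that apply to your version of Apache.

```python
from datetime import timedelta


def estimated_expiry(age_since_modified: timedelta,
                     factor: float = 0.1,
                     max_expire: timedelta = timedelta(hours=24)) -> timedelta:
    """Estimate how long to cache a page that carries no expiration time."""
    # The longer a page has gone unchanged, the longer it is assumed
    # to stay valid, up to the configured maximum.
    estimate = age_since_modified * factor
    return min(estimate, max_expire)


if __name__ == "__main__":
    print(estimated_expiry(timedelta(hours=10)))
    print(estimated_expiry(timedelta(days=30)))
```

With a factor of 0.1, a page last changed 10 hours ago is kept for about 1 hour, while a page that has not changed in 30 days would be kept for 72 hours if the cap did not cut it back.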
One of the strengths of Apache 2.0.x is its ability to set up multiple websites on a single IP address. This is possible with the concept of Virtual Hosts.
Older versions of Apache supported only IP-based Virtual Hosts, which required separate IP addresses for each website configured through your Apache server. Apache 2.0.x supports name-based Virtual Hosts.
In this scheme, DNS servers map multiple domain names, such as www.mommabears.com and www.sybex.com, to the same IP address, such as 10.111.123.45. You can set up httpd.conf to recognize the different domain names and serve the appropriate website.
Note | You can't always use the name-based scheme; it doesn't work if you need a secure (SSL) part of your website, such as to support e-commerce. It also has problems with older clients, such as Netscape 2.0 and Internet Explorer 4.0 browsers. These browsers cannot handle a lot of information associated with the current HTTP 1.1 standard. |
The following code is an example of how to configure two Virtual Hosts, in this case for www.sybex.com and www.mommabears.com:
NameVirtualHost *
This NameVirtualHost directive listens to requests to all IP addresses on the local computer. Alternatively, you can substitute the actual IP address for the * in this section:
<VirtualHost *>
    ServerAdmin webmaster@sybex.com
    DocumentRoot /www/site1/sybex.com
    ServerName sybex.com
    ErrorLog logs/sybex.com-error_log
    CustomLog logs/sybex.com-access_log common
</VirtualHost>
The directives in the www.sybex.com <VirtualHost *> container supersede any settings made earlier in the httpd.conf file. You can customize each Virtual Host by adding the directives of your choice:
<VirtualHost *>
    ServerAdmin webmaster@mommabears.com
    DocumentRoot /www/site2/mommabears.com
    ServerName mommabears.com
    ErrorLog logs/mommabears.com-error_log
    CustomLog logs/mommabears.com-access_log common
</VirtualHost>
As you can see, the settings for the mommabears.com website are similar; remember, relative directories depend on the ServerRoot directive.
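Name-based Virtual Hosts work because the browser sends the requested hostname in its Host header. The Python sketch below shows the idea: look up the hostname among the configured ServerName values and, if nothing matches, use the first Virtual Host listed. It is a simplified model rather than Apache's matching code; the function name document_root_for is hypothetical, and the sketch ignores directives such as ServerAlias.

```python
# Virtual Hosts in the order they appear in httpd.conf (values from the text).
VIRTUAL_HOSTS = [
    ("sybex.com", "/www/site1/sybex.com"),
    ("mommabears.com", "/www/site2/mommabears.com"),
]


def document_root_for(host_header: str) -> str:
    """Return the DocumentRoot of the Virtual Host whose ServerName matches."""
    # Drop an optional port and ignore case; hostnames are case-insensitive.
    hostname = host_header.split(":")[0].lower()
    for server_name, document_root in VIRTUAL_HOSTS:
        if hostname == server_name:
            return document_root
    # No match: the first Virtual Host listed acts as the default.
    return VIRTUAL_HOSTS[0][1]


if __name__ == "__main__":
    for host in ("mommabears.com", "SYBEX.com:80", "unknown.example"):
        print(host, "->", document_root_for(host))
```

The order of the containers matters, since a request for a hostname that you did not configure lands on the first one.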
There are a number of Apache module-specific configuration files in the /etc/httpd/conf.d directory, installed through some of the module RPMs described earlier in the "Packages" section. They are included in the basic Apache configuration courtesy of the Include conf.d/*.conf directive in the main httpd.conf file. These module files are summarized in Table 30.10.
File | Description |
---|---|
auth_mysql.conf | Supports access to a MySQL database; the default version of this file includes various authentication commands. |
auth_pgsql.conf | Supports access to a PostgreSQL database; the default version of this file includes various authentication commands. |
perl.conf | Incorporates a Perl interpreter; supports the use of Perl commands and scripts. |
php.conf | Incorporates a PHP scripting language interpreter. |
python.conf | Configures a Python interpreter; allows the use of Python commands and scripts. |
ssl.conf | Adds Secure Socket Layer (SSL) support; uses TCP/IP port 443 by default. Includes several directives for certificates and encryption methods. |
If you're unable to make a connection to a website configured on an Apache web server, you can check a number of things. Before you begin, check the network. The most common problem on any network is physical; for example, it's good to inspect connectors and cables. Then, check connectivity using commands such as ping; for more information, see Chapter 21.
Once you're sure that your network is operational, the next step is to see if Apache is running. Start with the following command:
# service httpd status
You should see a message such as:
httpd (pid 3464 3463 3462 3461 3460 3459 3458) is running
This tells you that a number of Apache (httpd) daemons are running; the number depends on httpd.conf directives such as StartServers. If you're having a problem, there are three other fairly common messages:
httpd is stopped
This is fairly simple; try a service httpd start command. Rerun the service httpd status command. You might also see the following message:
httpd is dead but pid file exists
In this example, Apache can't start, in part because there is an httpd.pid file in the /var/run directory. This can happen after a power failure (assuming you don't have an uninterruptible power supply) where Linux never got a chance to erase the httpd.pid file. Try deleting the file and then run the service httpd start command. Rerun the service httpd status command. You might now see the following message:
httpd dead but subsys locked
That tells us something else is going wrong. It's time to inspect the log files.
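If you check Apache from a script, the status messages described above can be sorted into a few cases. The following Python sketch is one way to do it. It assumes the messages read exactly as shown in this section; the function name classify_status and the state labels are made up for illustration.

```python
import re


def classify_status(message: str):
    """Return (state, pids) for one line of 'service httpd status' output."""
    running = re.match(r"httpd \(pid ([\d ]+)\) is running", message)
    if running:
        return "running", [int(pid) for pid in running.group(1).split()]
    if "is stopped" in message:
        return "stopped", []          # try: service httpd start
    if "pid file exists" in message:
        return "stale-pid-file", []   # remove httpd.pid, then start again
    if "subsys locked" in message:
        return "check-logs", []       # inspect the log files
    return "unknown", []


if __name__ == "__main__":
    state, pids = classify_status(
        "httpd (pid 3464 3463 3462 3461 3460 3459 3458) is running")
    print(state, len(pids))
```

For the sample message shown earlier, the sketch reports a running server with seven httpd processes.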
The default location for your Apache log files as defined in httpd.conf is /etc/httpd/logs; however, you'll find this directory linked to a more standard location for log files, /var/log/httpd. Remember, you have the freedom to put log files in a different directory by using CustomLog directives in a Virtual Host container.
Read the log files in this directory for clues. The variety of errors that you might find is beyond the scope of this book; however, many of the log entries are self-explanatory.
The Apache web server includes its own syntax checker. The following command checks the syntax of the main configuration file, httpd.conf. If there is a problem, the command
# httpd -t
often identifies the line number with the problem, such as a misspelled directive. Alternatively, the following command starts Apache in debug mode, which can help you identify additional problems:
# httpd -X
Sometimes messages just aren't getting through to your web server. That may mean that you forgot to let in messages through the standard HTTP port (80) in the firewall. Run an iptables -L command to list current firewall rules. Refer to Chapter 22 for more information on this command.
As described with the various firewall utilities (Chapters 3, 4, and 19), you can set up firewalls that automatically allow data through the HTTP port. Remember, if you also serve secure web pages, you should also open up the associated port. In this case, for HTTPS, that is port 443. Standard TCP/IP port numbers are defined in /etc/services.