Understanding Apache Administration
Apache is an extremely large piece of software that has hundreds of configuration options and possible setups. A number of books have been written about Apache. This section looks at the most common attributes that can be configured and how they affect your system. It is not meant to be a complete reference to Apache. Version 2.0 of Apache is available from http://www.apache.org, but it is not yet distributed with Tiger. The current shipping version, 1.3.33, is discussed here.
Controlling the Apache Process
You can start, stop, and restart Apache at any time using the /usr/sbin/apachectl utility. For example, to restart the server, type
brezup:jray jray $ sudo /usr/sbin/apachectl restart
Table 23.1 documents all the available apachectl options.
An interface problem occurs with Apple's use of the Personal Web Sharing metaphor when it is applied to Apache. Each user has his own directory. When web sharing is turned on for one user, it is activated for everyone.
If the computer is a multiuser system and others have the administrative capability to control Apache, you cannot be certain whether web sharing is on or off. The only ways to guarantee that your files aren't being displayed on the Web are to manually disable viewing files using Apache configuration directives, remove the files from your ~/Sites directory, or set their permissions so that they are readable by the owner only.
Apache Configuration File Locations
Apple has done an excellent job of making the Apache web server configuration manageable for machines with large numbers of personal websites. Instead of a single monolithic configuration, like the standard Linux or Windows installation, the server can be configured on two different levels:
By splitting up the configuration, the administrator can quickly adjust web permissions on a given account. To edit the user or system configuration, you must either log in (or su) as root or use sudo.
Basic Apache Configuration Directives
Apache approaches configuration in a very object-oriented manner. The configuration files are XML-like, but not compliant, so don't attempt to edit them using the plist editor. Apache calls each configuration option a directive. The two types of configuration directives are
The global options can fall anywhere within the server configuration file. If you're running a heavy-traffic website, you'll definitely want to change the defaults. By default, Apache starts only one server and keeps a maximum of five running at any given time. These numbers do not enable the server to quickly adapt to increased server load.
Table 23.2 documents some of the most important configuration directives contained in the /etc/httpd/httpd.conf file. They are listed in the order in which you're likely to encounter them in the httpd.conf file.
This is only a partial list of commonly used global directives. For a complete list, visit Apache's website.
The second type of Apache directives are container based. These directives control how Apache serves a certain group of files. Files are chosen based on pattern, location (URL), or directory and are denoted by a start and end tag in the Apache configuration file. For example, the /etc/httpd/users/ configuration files define a container consisting of each user's Sites directory. This is the configuration file created for my jray user account (in my case, that file would be /etc/httpd/users/jray.conf):
<Directory "/Users/jray/Sites/">
    Options Indexes MultiViews
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>
In this example, the directory /Users/jray/Sites is the container. Web pages within this container can use the Indexes and MultiViews options. The Order and Allow directives control who has access to the files within this container, and AllowOverride None prevents .htaccess files in the directory from overriding these settings. Access control is explained in more detail in the "Using Password Protection Features" section later.
Besides a directory container, there are other constructs that can also be added to the configuration file(s):
Within the container objects, the administrator can add a number of directives to control access to the contents or what special features are available in that location. Table 23.3 includes the container directives you'll encounter most often. We're going to explicitly set up password protection and virtual hosting shortly because this can be a bit tricky just going on the directive definitions alone.
Common Apache Configuration Modifications
To get a handle on configuration, let's take a look at a few different directives in use. Remember that, because of their global nature, global directives can be used anywhere within the /etc/httpd/httpd.conf configuration file, whereas container-based directives must be used within a container as described earlier.
Creating URL Aliases (Global)
As you build a complex web server, you'll probably want to spread files out and organize them using different directories. This can lead to extremely long URLs, such as /mydocuments/work/project1/summary/data/. A URL this bulky is inconvenient if it is commonly accessed or publicly advertised.
Thankfully, you can shorten long URLs by creating an alias. Aliases work in a manner similar to the Tiger Finder aliases. A short name is given that, when accessed, will automatically retrieve files from another location. To alias the long data URL to something shorter, such as /data/, you would use the following command:
Alias /data/ /mydocuments/work/project1/summary/data/
Aliases can be used to access files anywhere on the server, not just within the server document root. Obviously, the files in the alias directory need to be readable by the Apache process owner.
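To make the mapping concrete, here is a minimal Python sketch of how an alias translates a requested URL path into a file path. It is a simplified model for illustration only, not Apache's own code, and the function name resolve_alias is hypothetical. Because the remainder of the URL is appended directly to the target, an alias that ends in a slash should point at a target that also ends in a slash.

```python
from typing import Optional


def resolve_alias(url_path: str, alias: str, target: str) -> Optional[str]:
    """Return the file path for url_path, or None if the alias does not apply.

    Simplified model: the alias prefix is swapped for the target and the
    rest of the URL path is appended unchanged.
    """
    if not url_path.startswith(alias):
        return None
    remainder = url_path[len(alias):]
    # Without a trailing slash on the alias, match whole path segments only,
    # so that an alias of /data does not also capture /database.
    if not alias.endswith("/") and remainder and not remainder.startswith("/"):
        return None
    return target + remainder


if __name__ == "__main__":
    target = "/mydocuments/work/project1/summary/data/"
    print(resolve_alias("/data/report.html", "/data/", target))
    print(resolve_alias("/other/report.html", "/data/", target))
```

The first call maps the short URL onto the long directory; the second returns None because the request does not start with the alias.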
Redirecting from One URL to Another (Global)
Websites change. URLs change. For established websites, changing the location of a single page can be a nightmare for users: bookmarks break and advertised URLs fail. Although this might seem trivial to experienced web surfers, some users might not be persistent enough to figure out where the page has gone.
Many websites put a redirection page up in place of the missing page. This type of redirection relies on a browser tag to take the user to another URL after a set timeout period. This is effective for most modern browsers, but it takes several seconds between loading the original page and the redirection. In addition, a page needs to be created for each location that might be accessed by the client. This could be hundreds of pages!
A simpler, faster, neater way is to use the Redirect directive. Redirect forces the client browser to transfer to a different URL before the original page even opens. Entire URL structures can be redirected using a single command. The destination URL can even be on a remote server!
For example, if you've decided to move all the files under a URL called /ourcatalog/toys to a new server with the URL www.mynewstoreonline.com/toys, you could use
RedirectPermanent /ourcatalog/toys http://www.mynewstoreonline.com/toys
If a user attempted to access the URL /ourcatalog/toys/cooltoy1.html, he would immediately be transferred to http://www.mynewstoreonline.com/toys/cooltoy1.html.
Using redirects is more reliable and transparent for the end user. Avoid using HTML-based redirects, and rely on the Apache RedirectPermanent directive to hide changes in the structure of your website.
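The prefix rewriting performed by a redirect can be sketched in a few lines of Python. This is an illustrative model under simplifying assumptions, not Apache's implementation; the function name redirect_permanent is hypothetical. It returns the HTTP status code 301 (the "moved permanently" status sent by RedirectPermanent) together with the new location.

```python
from typing import Optional, Tuple


def redirect_permanent(url_path: str, old_prefix: str,
                       new_url: str) -> Optional[Tuple[int, str]]:
    """Return (301, new location), or None if the request is not redirected."""
    if url_path == old_prefix:
        return 301, new_url
    # Match on whole path segments: /ourcatalog/toys/... is redirected,
    # but /ourcatalog/toystore is left alone.
    if url_path.startswith(old_prefix.rstrip("/") + "/"):
        return 301, new_url + url_path[len(old_prefix):]
    return None


if __name__ == "__main__":
    print(redirect_permanent("/ourcatalog/toys/cooltoy1.html",
                             "/ourcatalog/toys",
                             "http://www.mynewstoreonline.com/toys"))
```

Everything after the old prefix is carried over to the new URL, which is why a single directive can relocate an entire directory tree.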
Increasing and Customizing Logging Capability (Global)
Apache on Tiger stores its log files in the directory /var/log/httpd. By default, there are two logs: access_log and error_log.
The access_log file contains a record of what remote computers have accessed Apache, what they asked for, and when they did it. For example:
18.104.22.168 - - [02/Sep/2003:16:49:47 -0400] "GET /extimage/images/26_thumb.jpg HTTP/1.1" 200 27012
22.214.171.124 - - [02/Sep/2003:16:49:47 -0400] "GET /extimage/images/27_thumb.jpg HTTP/1.1" 200 35793
126.96.36.199 - - [02/Sep/2003:16:49:47 -0400] "GET /extimage/images/28_thumb.jpg HTTP/1.1" 200 26141
188.8.131.52 - - [02/Sep/2003:16:49:47 -0400] "GET /extimage/images/30_thumb.jpg HTTP/1.1" 200 29316
184.108.40.206 - - [02/Sep/2003:16:49:47 -0400] "GET /extimage/images/29_thumb.jpg HTTP/1.1" 200 33626
This log excerpt shows five requests for .jpg images from the Apache server. Five fields are stored with each log entry:
This style of access log is known as the common log format. Log formats are completely customizable using the global LogFormat directive. The common format is defined as
LogFormat "%h %l %u %t \"%r\" %>s %b" common
Each element that begins with % (such as %h) denotes a value to be stored in the log file. The \" is an escaped quote, meaning that a literal quote will also be stored in that location. You can build a log format using the following:
You define a log format by using the LogFormat directive, a string containing the format elements, and a name for the format. For example, to define a log format called mylog that stores only the hostname of the remote client for each request, you would use
LogFormat "%h" mylog
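As a rough illustration of how a format string turns into a log line, the following Python sketch expands the single-letter % elements of a format using values from one request. It is a toy model, not Apache's logging code: the function name render_log_line and the sample request values are invented for the example, and elements with arguments (such as those used by the combined format) are not handled.

```python
import re

# The common format as it appears in httpd.conf, including escaped quotes.
COMMON = r'%h %l %u %t \"%r\" %>s %b'


def render_log_line(fmt: str, fields: dict) -> str:
    """Expand each %-element in fmt; unknown values are logged as '-'."""
    text = fmt.replace('\\"', '"')  # an escaped quote becomes a literal quote
    return re.sub(r"%>?[A-Za-z]",
                  lambda match: str(fields.get(match.group(0), "-")),
                  text)


if __name__ == "__main__":
    # Hypothetical values for a single request.
    request = {
        "%h": "192.168.0.5",
        "%t": "[02/Sep/2003:16:49:47 -0400]",
        "%r": "GET /index.html HTTP/1.1",
        "%>s": 200,
        "%b": 1234,
    }
    print(render_log_line(COMMON, request))
    print(render_log_line("%h", request))
```

With the mylog format ("%h"), only the client address is written for each request.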
Except for custom solutions, you'll be best served by one of Apache's default log formats. Although the common log is common, it probably isn't the best thing for doing extensive reporting. A better choice is Apache's combined log format. The combined log format includes referer and user-agent strings with each request. Most web analysis packages use the combined log style.
To activate a log format, use the CustomLog directive, followed by the pathname for the log file and the name of the log format. To activate the combined log format, uncomment the following line within the /etc/httpd/httpd.conf file:
CustomLog "/private/var/log/httpd/access_log" combined
Log files are an important part of any web server. They can provide important data on the popular pages of the server, errors that have occurred, and how much traffic your system is getting. We will look at an easy way to provide log analysis later in "Interpreting Web Server Log Files."
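For a quick look at what a log contains, a short script is often enough. The Python sketch below parses lines in the common log format and counts how often each path was requested. The regular expression, the function names, and the sample entries (which use addresses from the reserved documentation range) are assumptions made for this example.

```python
import re
from collections import Counter

# One common-log-format entry: host ident user [time] "request" status bytes
COMMON_LOG = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMMON_LOG.match(line)
    return match.groupdict() if match else None


def count_paths(lines):
    """Count requests per URL path, skipping lines that cannot be parsed."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        parts = entry["request"].split()  # method, path, protocol
        if len(parts) >= 2:
            counts[parts[1]] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '192.0.2.10 - - [02/Sep/2003:16:49:47 -0400] "GET /index.html HTTP/1.1" 200 27012',
        '192.0.2.11 - - [02/Sep/2003:16:49:48 -0400] "GET /index.html HTTP/1.1" 200 27012',
        '192.0.2.10 - - [02/Sep/2003:16:49:49 -0400] "GET /about.html HTTP/1.1" 404 -',
    ]
    for path, hits in count_paths(sample).most_common():
        print(hits, path)
```

To run it against a real log, pass an open file such as /var/log/httpd/access_log to count_paths in place of the sample list.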
Using Password Protection Features
Password-protecting a directory is extremely simple. For example, suppose that a user wants to password-protect his entire public website for development purposes. The first step is to set up a username and password file that will contain the login information for those who are allowed to access the resource. This is accomplished using htpasswd. There are two steps to the process: First, create a new password file with a single user; second, add additional usernames/passwords to it.
To create a new file, use the syntax htpasswd -c <pathname> <initial username>. For example,
brezup:jray jray $ htpasswd -c /Users/jray/webpasswords jray
New password:
Re-type new password:
Adding password for user jray
A new password file (/Users/jray/webpasswords) is created, and the initial user jray is added.
Subsequent users can be added by calling htpasswd -b <pathname> <username> <password>:
brezup:jray jray $ htpasswd -b /Users/jray/webpasswords testuser testpass
Adding password for user testuser
The password file now has two entries: the initial jray user, and testuser.
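The password file is plain text, with one username:encrypted-password pair per line. If you want to check who is in it without reading the hashes by eye, a few lines of Python will do. This is only a sketch: the function name list_users is hypothetical, and the hash strings in the sample are placeholders rather than real htpasswd output.

```python
def list_users(password_file_text):
    """Return the usernames found in the text of an htpasswd-style file."""
    users = []
    for line in password_file_text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        users.append(line.split(":", 1)[0])
    return users


if __name__ == "__main__":
    # Placeholder contents; a real file holds encrypted passwords here.
    sample = "jray:PLACEHOLDER-HASH-1\ntestuser:PLACEHOLDER-HASH-2\n"
    print(list_users(sample))
```

For the file created above, you would read /Users/jray/webpasswords and pass its contents to list_users.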
Next, create a directory container that encompasses the files that need to be protected. Because this example is protecting a personal website, the container already exists as a <username>.conf file in /etc/httpd/users:
<Directory "/Users/jray/Sites/">
    Options Indexes MultiViews ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>
To this directory container, add AuthType, AuthName, AuthUserFile, and Require directives. You must be root or using sudo to edit the file:
<Directory "/Users/jray/Sites/">
    AuthType Basic
    AuthName "John's Development Site"
    AuthUserFile /Users/jray/webpasswords
    Require valid-user
    Options Indexes MultiViews ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>
The AuthUserFile is set to the name of the password file created with htpasswd, whereas the Require valid-user directive allows any user in the password file to gain access to the protected resource. To activate the authentication, use sudo /usr/sbin/apachectl restart:
brezup:jray jray $ sudo /usr/sbin/apachectl restart
/usr/sbin/apachectl restart: httpd restarted
Attempting to access the /Users/jray/Sites directory (~jray) now opens an HTTP authentication dialog, as seen in Figure 23.2.
Figure 23.2. The directory is now password-protected.
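It helps to know what the browser actually sends after the dialog is filled in. With AuthType Basic, the username and password are joined with a colon, Base64-encoded, and placed in an Authorization request header. The Python sketch below shows both directions; the function names are hypothetical, and the credentials are the sample testuser account from this section.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of the Authorization header for Basic authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header_value):
    """Recover (username, password) from a Basic Authorization header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


if __name__ == "__main__":
    header = basic_auth_header("testuser", "testpass")
    print(header)
    print(decode_basic_auth(header))
```

Because Base64 is an encoding rather than encryption, anyone who can observe the request can recover the password. Basic authentication is fine for keeping casual visitors out of a development site, but it should not be relied on to protect sensitive data over an untrusted network.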
Authenticating Against User Accounts with mod_auth_apple
Using the basic Apache authentication is fine in many cases, but you might find yourself wanting to protect resources based on actual user accounts on your computer. Although it's simple enough to create a password file for each user, these passwords will not be updated as users update their Tiger passwords. The real solution is to provide an authentication mechanism by which resources could be protected by actual system accounts and system passwords. This is entirely possible courtesy of Apple's mod_auth_apple Apache module.
The module is included with Tiger Server by default, and Tiger client users can download and install it with very little trouble. There are two components to the install: First, a missing header file, Security/checkpw.h, must be copied from the Darwin CVS repository or a mirror; then the source code for mod_auth_apple can be downloaded and installed.
The header file can be downloaded directly from Apple at http://developer.apple.com/darwin/projects/darwin/darwinserver/, but you'll need to register before downloading. Alternatively, download the header from http://www.opendarwin.org/cgi-bin/cvsweb.cgi/src/Security/checkpw/. After downloading, make a new directory /usr/include/Security, and copy the header file to the new location:
brezup:jray jray $ curl -O "http://www.opendarwin.org/cgi-bin/cvsweb.cgi/~checkout~/src/Security/checkpw/checkpw.h"
brezup:jray jray $ sudo mkdir /usr/include/Security
brezup:jray jray $ sudo mv checkpw.h /usr/include/Security/
Next, download the latest mod_auth_apple package from http://developer.apple.com/darwin/projects/darwin/darwinserver/, unarchive it, and then enter the source distribution directory. (Note: To download mod_auth_apple, you will need to create an Apple ID if you don't already have one.)
brezup:jray jray $ tar zxf mod_auth_apple-XS-10.3.tgz
brezup:jray jray $ cd mod_auth_apple
Be sure to check the Apple README file for installation instructions; they might change between versions. The instructions shown here are modified from Apple's directions so that the software configures automatically. Use make followed by sudo apxs -i -a mod_auth_apple.so to compile and install the module:
brezup:jray mod_auth_apple $ make
/usr/sbin/apxs -c -Wc,"-traditional-cpp -Wno-four-char-constants -F/System/Library/PrivateFrameworks -DUSE_CHKUSRNAMPASSWD" -Wl,"-bundle_loader /usr/sbin/httpd -framework Security" -o mod_auth_apple.so mod_auth_apple.c
gcc -DDARWIN -DUSE_HSREGEX -DUSE_EXPAT -I../lib/expat-lite -g -Os -pipe -DHARD_SERVER_LIMIT=2048 -DEAPI -DSHARED_MODULE -I/usr/include/httpd -traditional-cpp -Wno-four-char-constants -F/System/Library/PrivateFrameworks -DUSE_CHKUSRNAMPASSWD -c mod_auth_apple.c
...
brezup:jray mod_auth_apple $ sudo apxs -i -a mod_auth_apple.so
[activating module 'apple_auth' in /private/etc/httpd/httpd.conf]
cp mod_auth_apple.so /usr/libexec/httpd/mod_auth_apple.so
chmod 755 /usr/libexec/httpd/mod_auth_apple.so
cp /private/etc/httpd/httpd.conf /private/etc/httpd/httpd.conf.bak
cp /private/etc/httpd/httpd.conf.new /private/etc/httpd/httpd.conf
rm /private/etc/httpd/httpd.conf.new
The mod_auth_apple module is now compiled and installed. Using it is identical to the examples we've already seen for Basic authentication, except no password file is needed. For example, to protect my Sites directory so that only my account (jray) can access it, I would use the following in my /etc/httpd/users/jray.conf file:
<Directory "/Users/jray/Sites/">
    AuthType Basic
    AuthName "John's Development Site"
    Require user jray
    Options Indexes MultiViews ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>
To verify against any account on the machine, replace Require user jray with Require valid-user. Alternatively, to validate against a group, Require group <groupname> could be employed.
Restricting Access by Network
To create more stringent control over the users who can access a given resource, use Allow and Deny to set up networks that should or shouldn't have access to portions of your website. This is extremely useful for setting up intranet sites that should only be accessible by a given subnet. For example, assume that you want to restrict access to a resource from everyone except the subnet 192.168.0.x. The following rules define the access permissions:
Allow from 192.168.0.0/255.255.255.0
Deny from all
Because there isn't an ordering specified, what really happens with these rules is ambiguous. Is the connection allowed because of the allow statement? Or denied because all the connections are denied?
To solve the problem, insert the Order directive:
Order Deny,Allow
Allow from 192.168.0.0/255.255.255.0
Deny from all
With this ordering, an incoming connection is first compared to the deny list. Because Deny from all is specified, every address matches this rule. However, the Allow directive is evaluated last and has the final say, so any connection from the network 192.168.0.0 with the subnet mask 255.255.255.0 is allowed.
Using different orderings and different Allow/Deny lists, you can lock down a website to only those people who should have access, or disable troublesome hosts that misuse the site.
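If the interaction between Order, Allow, and Deny still seems slippery, it can help to model it. The following Python sketch is a simplified model of the evaluation for IP-based rules only (hostnames and partial addresses are not handled), and the function names are hypothetical. It captures the key idea: whichever list is evaluated last wins, and the default for unmatched clients is the opposite of the first list.

```python
import ipaddress


def _matches(address, rules):
    """True if address falls inside any rule ('all' or network/netmask)."""
    ip = ipaddress.ip_address(address)
    for rule in rules:
        network = ipaddress.ip_network("0.0.0.0/0" if rule == "all" else rule)
        if ip in network:
            return True
    return False


def is_allowed(address, order, allow, deny):
    """Simplified model of Order/Allow/Deny evaluation for one client address."""
    order = order.replace(" ", "").lower()
    if order == "deny,allow":
        allowed = True               # clients matching neither list get in
        if _matches(address, deny):
            allowed = False
        if _matches(address, allow):
            allowed = True           # Allow is evaluated last and wins
    elif order == "allow,deny":
        allowed = False              # clients matching neither list are refused
        if _matches(address, allow):
            allowed = True
        if _matches(address, deny):
            allowed = False          # Deny is evaluated last and wins
    else:
        raise ValueError("unsupported Order value: " + order)
    return allowed


if __name__ == "__main__":
    allow = ["192.168.0.0/255.255.255.0"]
    deny = ["all"]
    print(is_allowed("192.168.0.42", "Deny,Allow", allow, deny))
    print(is_allowed("10.1.2.3", "Deny,Allow", allow, deny))
    print(is_allowed("192.168.0.42", "Allow,Deny", allow, deny))
```

With the rules from this section, an intranet client is admitted and an outside client is refused. Swapping the order to Allow,Deny locks everyone out, because Deny from all is then evaluated last.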
Creating Virtual Hosts
A virtual host is a unique container object, in that it can define an entirely separate web space unrelated to the main Apache website or user sites. For example, the three domains poisontooth.com, vujevich.com, and shadesofinsanity.com are all being served from a single computer. To the end user, these appear to be different and unique hosts. To Apache, however, they're just different directories on the same hard drive.
There are two types of virtual hosts, name-based and IP-based, as described in the following list:
To set up a virtual host, you must first have an IP address and a domain name assigned for the host. If you're using name-based hosts, you will have a single IP address but multiple hostnames. Your ISP or network administrator should be able to help set up this information.
There are only two differences in the Apache configuration of name-based and IP-based virtual hosts. Name-based hosts must include the NameVirtualHost directive, whereas IP-based hosts will need to use Listen to inform Apache of all the available addresses.
Let's take a look at two different ways to configure the virtual hosts www.mycompany.com and www.yourcompany.com. First, we'll use name-based hosting.
Assume that both mycompany and yourcompany domain names point to the IP address 192.168.0.100. To configure name-based virtual hosts, you could add the following directives to the end of the /etc/httpd/httpd.conf file:
NameVirtualHost 192.168.0.100
<VirtualHost 192.168.0.100>
    ServerName www.mycompany.com
    DocumentRoot /Users/jray/mycompany
    ServerAdmin firstname.lastname@example.org
</VirtualHost>
<VirtualHost 192.168.0.100>
    ServerName www.yourcompany.com
    DocumentRoot /Users/jray/yourcompany
    ServerAdmin email@example.com
</VirtualHost>
The NameVirtualHost directive sets up the IP address that Apache will expect multiple domain name requests to come in on. The two VirtualHost containers define the basic properties of the two sites: what their real domain names are, where the HTML documents are loaded from, and the email address for the person who runs the site.
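Name-based hosting works because the browser includes the requested hostname in a Host header with every request. The Python sketch below models the lookup for the two sites in this example. It is an illustration rather than Apache's code, and the function name document_root_for is hypothetical. When no ServerName matches, the first virtual host listed for the address acts as the default.

```python
# (ServerName, DocumentRoot) pairs in the order they appear in httpd.conf.
VIRTUAL_HOSTS = [
    ("www.mycompany.com", "/Users/jray/mycompany"),
    ("www.yourcompany.com", "/Users/jray/yourcompany"),
]


def document_root_for(host_header):
    """Pick the DocumentRoot for a request based on its Host header."""
    # Drop an optional :port suffix and ignore case when comparing names.
    hostname = host_header.split(":", 1)[0].strip().lower()
    for server_name, document_root in VIRTUAL_HOSTS:
        if hostname == server_name:
            return document_root
    # No match: fall back to the first virtual host listed.
    return VIRTUAL_HOSTS[0][1]


if __name__ == "__main__":
    print(document_root_for("www.yourcompany.com"))
    print(document_root_for("WWW.MYCOMPANY.COM:80"))
    print(document_root_for("unknown.example.com"))
```

Because of that fallback, the order of the VirtualHost containers matters: put the site you want visitors to see by default first.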
Creating this same setup using IP-based hosts doesn't require much additional effort. For this sample configuration, assume that www.mycompany.com has the address 192.168.0.100 and that www.yourcompany.com uses 192.168.0.101. The configuration becomes
Listen 192.168.0.100
Listen 192.168.0.101
<VirtualHost 192.168.0.100>
    ServerName www.mycompany.com
    DocumentRoot /Users/jray/mycompany
    ServerAdmin firstname.lastname@example.org
</VirtualHost>
<VirtualHost 192.168.0.101>
    ServerName www.yourcompany.com
    DocumentRoot /Users/jray/yourcompany
    ServerAdmin email@example.com
</VirtualHost>
This time, the Listen directive is used to tell Apache to watch for incoming web connections on both of the available IP addresses. The VirtualHost containers remain the same, except they now use different IP addresses for the two different sites.
Virtual hosting provides an important capability to the web server. Although Tiger Server offers a GUI configuration tool for it, the Apache distribution included in the standard version of Tiger is every bit as powerful. It just takes a bit of manual editing to get things done!