The first generation of web servers was designed to handle the contents of a single site. The standard way of hosting several websites on the same machine was to install and configure a separate web server instance for each site. As the Internet grew, so did the need for hosting multiple websites, and a more efficient solution was developed: virtual hosting. Virtual hosting allows a single instance of Apache to serve different websites, identified by their domain names or IP addresses. IP-based virtual hosting means that each of the domains is assigned a different IP address; name-based virtual hosting means that several domains share a single IP address.
Web clients use the Domain Name System (DNS) to translate hostnames into IP addresses and vice versa. Several mappings are possible: one to one (each hostname has its own IP address), one to many (a single hostname is assigned several IP addresses), and many to one (several hostnames share a single IP address).
By the Way
When a single hostname is mapped to several IP addresses, a DNS server usually can be configured to respond with a different IP address for each DNS query, which helps to distribute the load. This is known as round robin DNS. However, if you have the opportunity to use a load-balancing device instead of relying on a DNS server, doing so avoids the problems that can arise when you tie your web server to your DNS server. In particular, a load balancer eliminates the possibility that high traffic to your web server will bring down your DNS server as well.
IP-Based Virtual Hosting
The simplest virtual host configuration is when each host is assigned a unique IP address. Each IP address maps the HTTP requests that Apache handles to separate content trees in their own VirtualHost containers, as shown in the following snippet:
Listen 192.168.128.10:80
Listen 192.168.129.10:80
<VirtualHost 192.168.128.10:80>
    DocumentRoot /usr/local/apache2/htdocs/host1
</VirtualHost>
<VirtualHost 192.168.129.10:80>
    DocumentRoot /usr/local/apache2/htdocs/host2
</VirtualHost>
If a DocumentRoot is not specified for a given virtual host, the global setting, specified outside any <VirtualHost> section, will be used. In the previous example, each virtual host has its own DocumentRoot. When a request arrives, Apache uses the destination IP address to direct the request to the appropriate host. For example, if a request comes for IP 192.168.128.10, Apache returns the documents from /usr/local/apache2/htdocs/host1.
If the host operating system cannot resolve an IP address used as the VirtualHost container's name and there's no ServerName directive, Apache will complain at server startup time that it can't map the IP addresses to hostnames. This complaint is not a fatal error. Apache will still run, but the error indicates that there might be some work to be done with the DNS configuration so that web browsers can find your server. A fully qualified domain name (FQDN) can be used instead of an IP address as the VirtualHost container name and the Listen directive binding (if the domain name resolves in DNS to an IP address configured on the machine and Apache can bind to it).
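As a minimal sketch of both points, the first container below adds a ServerName so that Apache does not have to look up the IP address, and the second uses a fully qualified domain name as the container name. The hostnames host1.example.com and host2.example.com are assumptions chosen for illustration, and the second form works only if that name resolves to an address configured on this machine.

```apache
# Hypothetical hostnames; the IP addresses follow the earlier snippet.
Listen 192.168.128.10:80
Listen 192.168.129.10:80

# An explicit ServerName spares Apache the reverse lookup of the IP address.
<VirtualHost 192.168.128.10:80>
    ServerName host1.example.com
    DocumentRoot /usr/local/apache2/htdocs/host1
</VirtualHost>

# An FQDN as the container name; it must resolve to a local address at startup.
<VirtualHost host2.example.com:80>
    ServerName host2.example.com
    DocumentRoot /usr/local/apache2/htdocs/host2
</VirtualHost>
```

Using IP addresses in the container name keeps server startup independent of DNS, which is why the first form is usually the safer choice.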
Name-Based Virtual Hosts
As a way to mitigate the consumption of IP addresses for virtual hosts, the HTTP/1.1 protocol version introduced the Host: header, which enables a browser to specify the exact host for which the request is intended. This allows several hostnames to share a single IP address. Most browsers nowadays provide HTTP/1.1 support.
By the Way
Although Host: usage was standardized in the HTTP/1.1 specification, some older HTTP/1.0 browsers also provided support for this header.
Listing 29.1 shows a typical set of request headers from the Mozilla Firefox browser. If the URL were entered with a port number, it would be part of the Host header contents as well.
Listing 29.1. Request Headers
Apache uses the Host: header for configurations in which multiple hostnames share a single IP address (the many-to-one scenario outlined earlier in this chapter); hence the term name-based virtual hosts.
The NameVirtualHost directive enables you to specify IP address and port combinations on which the server will receive requests for name-based virtual hosts. This directive is required for name-based virtual hosts in Apache 2.2 and earlier; Apache 2.4 no longer needs it and treats any address and port combination that appears in several VirtualHost containers as name-based. Listing 29.2 has Apache dispatch all connections to 192.168.128.10 based on the Host header contents.
Listing 29.2. Name-Based Virtual Hosts
For every hostname that resolves to 192.168.128.10, Apache can support another name-based virtual host. If a request arrives at that IP address for a hostname that is not included in the configuration file, say host3.example.com, Apache simply associates the request with the first container in the configuration file; in this case, host1.example.com. The same behavior applies to requests that are not accompanied by a Host header: whichever container is first in the configuration file is the one that gets the request.
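A configuration of the kind described here could look like the following sketch. It is an illustration under assumptions, not a reproduction of the listing: host2.example.com and the directory names are hypothetical, while the address and host1.example.com come from the text.

```apache
# Tells Apache to dispatch on the Host header for this address.
NameVirtualHost 192.168.128.10
Listen 192.168.128.10:80

# First container: also receives requests with an unknown or missing Host header.
<VirtualHost 192.168.128.10>
    ServerName host1.example.com
    DocumentRoot /usr/local/apache2/htdocs/host1
</VirtualHost>

# Hypothetical second host sharing the same IP address.
<VirtualHost 192.168.128.10>
    ServerName host2.example.com
    DocumentRoot /usr/local/apache2/htdocs/host2
</VirtualHost>
```

Because the first container doubles as the fallback, it is worth deciding deliberately which site should be listed first.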
An end user from the example.com domain might have his machine set up with example.com as his default domain. In that case, he might direct his browser to http://host1/ instead of the fully qualified http://host1.example.com/. The Host header would simply have host1 in it instead of host1.example.com. To make sure that the correct virtual host container gets the request, you can use the ServerAlias directive as shown in Listing 29.3.
Listing 29.3. The ServerAlias Directive
In fact, you can give ServerAlias a space-separated list of other names that might show up in the Host header so that you don't need a separate VirtualHost container with a bunch of common directives just to handle all the name variants.
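A container using ServerAlias in this way might look like the sketch below. The short name host1 comes from the example above; www.host1.example.com is a hypothetical extra alias added only to show the space-separated form, and the fragment assumes it sits inside a name-based configuration for 192.168.128.10.

```apache
<VirtualHost 192.168.128.10>
    ServerName host1.example.com
    # Short name plus a hypothetical extra alias, separated by spaces.
    ServerAlias host1 www.host1.example.com
    DocumentRoot /usr/local/apache2/htdocs/host1
</VirtualHost>
```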
HTTP/1.1 requires the Host header. If the protocol version is identified as 1.1 in the HTTP request line, the request must be accompanied by a Host header. In the early days of name-based virtual hosts, Host headers were considered a trade-off: Fewer IP resources were required, but legacy browsers that did not send Host headers were still in use and, therefore, could not access all of the server's virtual hosts. Today, that is not a consideration; there is no statistically significant number of such legacy browsers in use.
Mass Virtual Hosting
In the previous listings, the DocumentRoot directives follow a simple pattern:
/usr/local/apache2/htdocs/hostname
where hostname is the hostname portion of the fully qualified domain name used in the virtual host's ServerName. For just a few virtual hosts, this configuration is fine. But what if there are dozens, hundreds, or even thousands of these virtual hosts? The configuration file can become difficult to maintain. Apache provides a good solution for cookie-cutter virtual hosts with mod_vhost_alias. You can configure Apache to map the virtual host requests to separate content trees with pattern-matching rules in the VirtualDocumentRoot directive. This functionality is especially useful for ISPs that want to provide a virtual host for each one of their users. The following example provides a simple mass virtual host configuration:
NameVirtualHost 192.168.128.10
Listen 192.168.128.10:80
VirtualDocumentRoot /usr/local/apache2/htdocs/%1
The %1 token used in this example's VirtualDocumentRoot directive is replaced by the first portion of the fully qualified domain name (FQDN). The mod_vhost_alias directives have a language for mapping FQDN components to filesystem locations, including individual characters within the FQDN.
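To make the mapping language more concrete, the sketch below shows a few other mod_vhost_alias tokens, each annotated with the directory it would produce for a request to host1.example.com (a hostname assumed for illustration). The lines are alternatives; only one VirtualDocumentRoot would be active in a real configuration.

```apache
# %0 is the whole name: /usr/local/apache2/htdocs/host1.example.com
VirtualDocumentRoot /usr/local/apache2/htdocs/%0

# %2 is the second dot-separated part: /usr/local/apache2/htdocs/example
VirtualDocumentRoot /usr/local/apache2/htdocs/%2

# %1.1 is the first character of the first part: /usr/local/apache2/htdocs/h/host1
VirtualDocumentRoot /usr/local/apache2/htdocs/%1.1/%1
```

The last form is a common way to spread a large number of hosts over several directories instead of placing thousands of entries in one.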
If we eliminated all the VirtualHost containers and simplified our configuration to the one shown here, the server would serve requests for any subdirectories created in the /usr/local/apache2/htdocs directory. If the hostname portion of the FQDN is matched as a subdirectory, Apache will look there for content when it translates the request to a filesystem location.
Although virtual hosts normally inherit directives from the main server context, some of them, such as Alias directives, do not get propagated. For instance, the virtual hosts will not inherit this filesystem mapping:
Alias /icons /usr/local/apache2/icons
The FollowSymLinks flag for the Options directive is also disabled in this context. However, a variant of the ScriptAlias directive is supported.
The VirtualScriptAlias directive shown in the following snippet treats requests for any resources under /cgi-bin as containing CGI scripts:
NameVirtualHost 192.168.128.10
Listen 192.168.128.10:80
VirtualDocumentRoot /usr/local/apache2/htdocs/%1/docs
VirtualScriptAlias /usr/local/apache2/htdocs/%1/cgi-bin
Note that cgi-bin is a special token for that directive: only requests whose URL path begins with /cgi-bin/ are treated as CGI, so a path such as /cgi/ won't work.
For IP-based virtual hosting needs, there are variants of these directives: VirtualDocumentRootIP and VirtualScriptAliasIP.
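The IP variants interpolate the IP address on which the request arrived instead of the hostname. As a hedged sketch, assuming one content tree per address under the same htdocs directory used earlier, a request arriving on 192.168.128.10 would be looked up under /usr/local/apache2/htdocs/192.168.128.10; the directory layout itself is an assumption.

```apache
Listen 192.168.128.10:80
Listen 192.168.129.10:80

# %0 here stands for the whole IP address of the server end of the connection.
VirtualDocumentRootIP /usr/local/apache2/htdocs/%0/docs
VirtualScriptAliasIP  /usr/local/apache2/htdocs/%0/cgi-bin
```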