Section 6.3 Scouting Out Apache (httpd) Problems | Real World Linux Security Prentice Hall Ptr Open Source Technology Series

6.3 Scouting Out Apache (httpd) Problems

Several issues specific to Apache itself are addressed here, though some of them apply to those using other Web servers. One of the reasons security is such a problem with a Web server is that it is one of the few programs that will talk with anyone in the world who connects to its port; sendmail would be another.

Although telnet, FTP, and others will talk to anyone, their interface is quite limited and well defined. The HTTP protocol, on the other hand, will allow practically anyone to pass practically any 8-bit byte sequence to any CGI that he wants. Commonly, the CGIs are not written by those who are experts in security, nor are they "standard" programs that get the very wide distribution and analysis that, say, sendmail or FTP would.

Refer to "Special Techniques for Web Servers" on page 284, including "Do Not Trust CGIs" on page 285. As part of your "not trusting CGIs" policy, the critical Apache files and directories must be of an owner and mode that the CGIs cannot alter.

6.3.1 Apache Ownership and Permissions

Apache ownership and permissions are important because Apache must be started as root so that it may open privileged TCP port 80. This is one reason some servers run httpd on port 8080. Port 8080 is not privileged since it is above 1023.

Use the User and Group directives in httpd.conf so that the forked httpd daemons run as a user other than root. The nobody and httpd account names are popular. Do not run as nobody if any other server also runs as nobody, because a compromise of one server would then expose the other.
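A minimal httpd.conf fragment for this might look like the following. It assumes that an unprivileged account and group, both named httpd here, already exist on the system; substitute whatever names your site uses.

```apache
# Assumed account and group names; the forked daemons switch to these
# after the parent, started as root, has opened port 80 and the logs.
User httpd
Group httpd
```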

Apache never, ever should be run as root (except during startup to open port 80 and the log files); even some documentation incorrectly claims that it should be run as root.

There are several typical locations for the base of your Apache tree, including /httpd, /usr/local/apache, and /home/httpd. Some sites will be configured for the logs to be in /var/log/httpd. Any of these or other similar locations are fine; it is the permissions and ownership that are important.

Assuming that the base of your Apache tree is /httpd, the following directories all should have a UID and GID of 0 and be mode 755:

 /httpd
 /httpd/bin
 /httpd/conf
 /httpd/logs
 /httpd/html
 /httpd/cgi-bin
 /httpd/icons

The daemon itself, /httpd/httpd, should be mode 511 and have a UID and GID of 0. Because you typically set the defines desired and build a custom httpd, different sites will have different binaries. For this reason it is recommended that httpd not be world-readable to prevent rogues from learning about what options you have allowed. The log files should be owned by root and mode 600 to prevent a rogue from truncating them after doing mischief.

For many sites, the cgi-bin directory and the programs in it should be owned by root and be mode 755. This will prevent "trusted" CGIs from being replaced by someone who has cracked another, less secure CGI. Having the CGI scripts and programs themselves owned by root increases security, but it prevents nonroot users from installing updated versions of CGIs. Watch out for set-UID permissions on these CGIs.

At some sites, this would be an excellent way to cause a developer to come to the SysAdmin to get the script (or program) installed and allows the SysAdmin to inspect (audit) the CGI for possible problems. For sites where this is not appropriate, an excellent alternative would be to have the CGIs and cgi-bin be owned by a third UID (not root or the httpd owner).

6.3.2 Server Side Includes

Server side includes (SSI) allow "cool" stuff such as displaying the current date when clients look at your organization's Web site. A few backprimes (`), an invocation of date, and this is done. However, this might cause security problems too.

If allowed to run unrestricted, SSI allow any arbitrary Linux program to be invoked from a Web page. A writer creating a Web page might be using an SSI technique, read in a magazine or seen on the Web, that lacks security. Because it is "only a Web page," its security may get less scrutiny than that of a CGI, even though the danger is the same.

One excellent solution, where appropriate, is to use the

 IncludesNOEXEC

option to the Options directive in httpd.conf for Apache. This still permits SSI but disables their exec feature, so a Web page cannot invoke programs.
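As a sketch, the option might be applied to the document tree as follows, assuming the /httpd/html directory used in this section's examples. Note that Apache spells the option as the single word IncludesNOEXEC.

```apache
# /httpd/html is the assumed document tree from this section.
<Directory /httpd/html>
    # Allow server side includes but disable "exec cmd" and "exec cgi".
    Options IncludesNOEXEC
</Directory>
```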

6.3.3 ScriptAlias

Another issue is where CGI programs are allowed to exist. It should be considered mandatory to use the ScriptAlias directive to limit CGI program locations to one or more particular locations. The usage is

 ScriptAlias fake_name real_name

where fake_name is what is specified in HTML as the path and real_name is where the CGIs really are located.

More than one ScriptAlias may be used; the following is typical:

 ScriptAlias /cgi-bin/ /httpd/cgi-bin/

It is important that the cgi-bin directory not be located under the HTML documents tree. This will prevent crackers from viewing your programs simply by browsing to them.

6.3.4 Preventing Users from Altering System-Wide Settings

Danger level: four

It is important to prevent users from creating their own .htaccess files with which they could alter the global parameters that affect security. To prevent them from doing this, put the following in httpd.conf before putting in the directives for individual directories.

 <Directory />
     AllowOverride None
     Options None
     allow from all
 </Directory>

6.3.5 Controlling What Directories Apache May Access

Danger level: four

By default, Apache will access any directory that it has permissions on. Although Apache should be operating under a unique UID and GID, you still do not want it to access any file with world-read permission (004). You can prohibit Apache from accessing the RCS directories used by the revision control system (source code control software). The following would be typical directives in httpd.conf.

 <Directory />
     Order deny,allow
     Deny from all
 </Directory>
 <Directory /home/*/public_html/RCS>
     Order deny,allow
     Deny from all
 </Directory>
 <Directory /home/*/public_html>
     Order deny,allow
     Allow from all
 </Directory>
 <Directory /httpd/html/RCS>
     Order deny,allow
     Deny from all
 </Directory>
 <Directory /httpd/html>
     Order deny,allow
     Allow from all
 </Directory>

6.3.6 Controlling What File Extensions Apache May Access

Danger level: three

Unless told otherwise, Apache will access all files under the directories that it is allowed to use. This may be changed with Files directives. They are processed after the Directory directives and .htaccess files and before the Location directives. A first argument of "~" enables regular expressions, in which "." matches any single character, "*" matches zero or more repetitions of the preceding item, and "$" matches the end of the file name. A backslash removes the special property of the following character.

Building on the previous example, these directives will prevent browsers from reading files ending in "~", .swp, .tar, or .tgz.

 <Files ~ "~$">
     Order deny,allow
     Deny from all
 </Files>
 <Files ~ "\.(swp|tar|tgz)$">
     Order deny,allow
     Deny from all
 </Files>
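Before relying on patterns like these, it can help to try them against sample file names. The C sketch below does so with the POSIX extended regular expression routines from <regex.h>. These stand in for Apache's own matcher, which is an assumption, and the function names are hypothetical.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when name matches the POSIX extended regular
 * expression in pattern; false on no match or a bad pattern. */
bool name_matches(const char *pattern, const char *name)
{
    regex_t re;
    bool hit;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    hit = (regexec(&re, name, 0, NULL, 0) == 0);
    regfree(&re);
    return hit;
}

/* True when either pattern from the Files example matches name. */
bool would_be_denied(const char *name)
{
    return name_matches("~$", name) ||
           name_matches("\\.(swp|tar|tgz)$", name);
}
```

Under these patterns a name such as site.tar.gz is not matched, because the second pattern requires the name to end immediately after swp, tar, or tgz. A site that keeps compressed archives in its document tree might want to extend the list.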

6.3.7 Miscellaneous

Danger level: three

The directive

 <Location />

will override a

 <Directory />
     Order deny,allow
     Deny from all
 </Directory>

if it is present.

Assuming that you are using at least version 1.3 of Apache, the following is recommended strongly in httpd.conf:

 UserDir disabled root

It is possible to operate Apache in a chroot environment, but setting this up properly is a lot of trouble and probably not worth the effort for most sites. RPM's

 --root dir

option does support this.

A preferred solution is picking a directory, such as /httpd, for Apache to operate under, as the example shows.

PHP has had a lot of security problems. Either avoid it or track patches daily.

6.3.8 Database Draining

One way in which Web clients will use a Web site inappropriately is to try to obtain a substantial portion of a database by repeated lookups. This would be where you offer a service but do not want to give away all of your data to anybody that wants it. I call this draining a database. An example would be someone looking up information on every employee in your company as a prelude to raiding your company, that is, trying to hire them away. This would be a case where your Webmistress provides employee information to aid the company's customers, vendors, and, perhaps, friends of the employees.

Other examples would be someone offering a map generation service or offering information on a city's leisure activities, such as clubs and restaurants. These sites welcome consumer use but do not want someone to drain (copy) their databases and put up competing sites or otherwise not pay them a negotiated sum for the valuable data.

The Sunset Computer, http://www.cavu.com/sunset.html, that Mike O'Shaughnessy and I provide as a public service was drained of most of its data in 1999 by someone coming in through a major company on another continent. In this case, the person had the Sunset Computer look up every combination of three-letter airport identifiers automatically, no doubt to get data on the world's airports. This would be more than 17,000 hits.

Because e-mail to us is generated when invalid combinations are tried, this also generated a massive DoS of e-mail, disk space, and bandwidth, as well as being a criminal intrusion and violation of our copyright. Unfortunately, that e-mail address was not checked for mail very often at that time and so the intruder got most of the database before we blocked access. This problem has been fixed in the program. Despite wonderful cooperation from the abuse team at that company, the perpetrator was never found. The company assured us that if he had been caught, he surely would have been sacked.

Were his actions legal? Probably not. In the U.S., as in many other countries, access to any computer system is "by permission only." Violation is punishable by imprisonment. The mere presence of Apache listening on port 80 does not constitute blanket permission to "get" whatever can be gotten from the site.

What can be done to prevent database draining? First, display a prominent Use Policy and copyright notice. It should be displayed in an obvious place, such as the submission form or the results page. This prevents people from rationalizing that it is acceptable "because the site does not say it is not." Additionally, it will scare away some due to fear of a criminal prosecution, job dismissal, etc.

A "No unauthorized use" message for other services such as telnet and FTP is an excellent idea too. The files to put this message in are /etc/issue.net and /home/ftp/welcome.msg, respectively. They should be owned by root and have mode 644. The Use Policy and copyright notice also aid in dealing with the problem, if it occurs, both with the SysAdmins of the offending site and with law enforcement. I found from experience that the SysAdmin at the offender's site or ISP will take you more seriously if you can say, "He violated our displayed Use Policy."

Without such a displayed policy it becomes "Well, I do not think one hit every minute constitutes a DoS attack." This quote is from a SysAdmin whose user created and ran a Java program that did a lookup against the Sunset Computer every minute (1440 times a day) for several months before we detected him, contacted the SysAdmin, and blocked the site. Although not clever enough to cache the data, he was clever enough to switch to using an alternate site to interrogate us as soon as we blocked the first site. Because the Sunset program will provide data for an entire year with one request, there was no need for more than a single request. Instead, his would generate more than 500,000 requests per year.

This capability of providing more than one piece of data at a time is a valuable feature for some sites to cut down on bandwidth. Other sites want to dole out data in dribbles to encourage people to keep coming back and this is one recommended technique. The phrase "All rights reserved." in the copyright notice is necessary for copyright protection in some countries. You also need to consider the laws of whatever country a perpetrator might be in. If the data is dynamically generated, the copyright date should reflect the year that the data was generated or displayed, not the year that the program was last modified.

The Sunset Computer's Use Policy, which was not reviewed by an attorney and which you are welcome to adapt, presently is:

You may freely use these results (of not more than 20 different airports and not more than 50 total Forms submissions). We are not liable for errors, etc., especially since we are not charging you for the data.

Do not use for navigation.

Second, put some fake data in the database to detect whether someone does steal it and so that you can prove it. Map makers have been adding fake streets and fake towns to their maps for decades in order to prove a copyright violation more easily. The Sunset database did (and does) have some fake entries that will allow proving any copyright violation should the data turn up.

Third, and most difficult, come up with a strategy for detecting in real time or near real time that someone is draining your database. This can be nontrivial and resource intensive. The optimum solution is to have a separate database (or separate table) that logs access counts by hostname, or by IP address for those clients that do not supply a hostname.

Recognize that you will get large numbers of hits from the servers of large ISPs such as AOL. Even so, AOL has a number of servers, so if there are many hits in a short period of time from one of AOL's servers, they probably come from a single individual. You might need a provision for a higher threshold for these high-volume sites. The blocking then can be done either in the application (CGI) or with an entry in httpd.conf. If the blocking is done in the application, a custom message could be generated including, possibly, an offer to allow further service for a monetary payment. Blocking via an entry in httpd.conf is discussed in "Kicking Out Undesirables" on page 282.

Clearly, logging every individual client system to hit a busy site will result in a large database. There are some alternatives that will do a reasonable job in many cases. One would be to keep counts on the most recent X unique client systems to access the database and take action for any whose count reaches a threshold. The action could be automatically adding that site to a separate blocked site list that the application uses and even generating e-mail to abuse@bad_domain.com and to yourself. You even could look up the domain in the whois database of networksolutions.com and generate e-mail to the domain's technical contact.
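The "most recent X unique client systems" idea can be sketched in C as a small fixed-size table. Everything named here is hypothetical: the table size, the threshold, and the choice to evict the least recently seen client when the table is full. The table also lives only in memory, so a nonpersistent CGI would have to keep it in a file or shared memory between requests.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_CLIENTS 4     /* "X" in the text; tiny for illustration */
#define ADDR_LEN    256   /* room for a hostname or an IP address */
#define THRESHOLD   3     /* assumed request limit per client */

struct client {
    char addr[ADDR_LEN];
    unsigned count;
    unsigned long last_seen;   /* logical time of the latest request */
};

struct tracker {
    struct client slot[MAX_CLIENTS];
    int used;
    unsigned long clock;
};

/* Records one request from addr. Returns true when that client's
 * count has reached THRESHOLD, meaning action should be taken. */
bool record_request(struct tracker *t, const char *addr)
{
    int i, victim = 0;

    t->clock++;
    for (i = 0; i < t->used; i++) {
        if (strcmp(t->slot[i].addr, addr) == 0) {
            t->slot[i].count++;
            t->slot[i].last_seen = t->clock;
            return t->slot[i].count >= THRESHOLD;
        }
    }
    if (t->used < MAX_CLIENTS) {
        victim = t->used++;            /* take a free slot */
    } else {
        /* Table full: reuse the slot of the least recently seen client. */
        for (i = 1; i < MAX_CLIENTS; i++)
            if (t->slot[i].last_seen < t->slot[victim].last_seen)
                victim = i;
    }
    strncpy(t->slot[victim].addr, addr, ADDR_LEN - 1);
    t->slot[victim].addr[ADDR_LEN - 1] = '\0';
    t->slot[victim].count = 1;
    t->slot[victim].last_seen = t->clock;
    return THRESHOLD <= 1;
}
```

The caller starts with a zeroed struct tracker and passes each request's REMOTE_HOST or REMOTE_ADDR value. When the function returns true, the application could add the client to its blocked list and generate the e-mail described above.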

A very simple solution that is adequate for many would be to generate e-mail on, perhaps, every 100th request. If you receive two in a row from the same site, you would study the logs and, if they indicate abuse, block that site. If your use counter is displayed to clients and you report every nth use, do not report when the count is an exact multiple of n. In other words, if you will report every 100th request, do not use the 100th one, because some may be smart enough to hit you 95 times, wait a few minutes, and then resume until the displayed count gets up to 95 again. Instead, pick an arbitrary offset, such as 37, and report uses 37, 137, 237, etc.
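A sketch of this sampling scheme in C follows. The offset of 37 comes from the text; the function names and the idea of remembering the address from the previous report are assumptions made for illustration.

```c
#include <stdbool.h>
#include <string.h>

#define REPORT_INTERVAL 100   /* report one request in every hundred */
#define REPORT_OFFSET    37   /* arbitrary offset, as in the text */

/* True for uses 37, 137, 237, and so on. */
bool should_report(unsigned long use_count)
{
    return use_count % REPORT_INTERVAL == REPORT_OFFSET;
}

/* Remembers which client the previous report named. */
struct report_state {
    char last_addr[256];
};

/* Call once per report. Returns true when this report names the same
 * client as the previous one, which suggests studying the logs. */
bool repeat_offender(struct report_state *s, const char *addr)
{
    bool repeat = (strcmp(s->last_addr, addr) == 0);

    strncpy(s->last_addr, addr, sizeof s->last_addr - 1);
    s->last_addr[sizeof s->last_addr - 1] = '\0';
    return repeat;
}
```

As with any state kept by a nonpersistent CGI, the report_state contents would need to be saved somewhere between requests.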

For CGIs that do not maintain a use counter, an essentially random counter such as the process ID could be used. Even for those applications that do have a usage counter, using a random number will make it harder for someone to outsmart it. In C the following will work for nonpersistent CGIs.

 /* getpid() is declared in <unistd.h>; report() stands for your own routine */
 if (!(getpid() % 100))
         report();

For C shell scripts, the following will work:

 # $$ is the shell's Process ID (PID)
 if (($$ % 100) == 0) then
         echo "$REMOTE_ADDR = $REMOTE_HOST 1%" \
           | /bin/Mail -s '1%' webmistress@pentacorp.com
 endif

6.3.9 Kicking Out Undesirables

Even if you run a noncommercial public service Web site, someone in the world is going to abuse it. Large sites and those of large corporations and government agencies should expect a lot of abuse. This abuse will include those trying to crack security, those using the site in various inappropriate ways, including database draining and abusing any bulletin boards, surveys, etc., and those that use it excessively. Suppose you want to block the domain cracker.com, the host trouble.somecorp.com, and some site whose IP address, 216.247.56.62, does not resolve to a name. The following entries would be added to your httpd.conf file:

 # Controls who can get stuff from this server.
 order allow,deny
 deny from .cracker.com
 deny from trouble.somecorp.com
 deny from 216.247.56.62
 allow from all

These entries should be added inside each of the sections starting <Directory something>, where something would be /httpd/html and /httpd/cgi-bin in our example. Then send a hangup signal to the parent httpd daemon to cause it to reread its configuration. This signal may be sent with

 killall -HUP httpd
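For illustration, one such section might look like the following once the entries are in place. It assumes the /httpd/html document tree from the earlier examples, and the same entries would be repeated in the section for /httpd/cgi-bin. With an order of allow,deny, a client that matches any deny line is refused even though "allow from all" also matches it.

```apache
# Assumed document tree; repeat the entries for /httpd/cgi-bin.
<Directory /httpd/html>
    # Controls who can get stuff from this server.
    Order allow,deny
    Deny from .cracker.com
    Deny from trouble.somecorp.com
    Deny from 216.247.56.62
    Allow from all
</Directory>
```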

6.3.10 Links to Your Site

Danger level: two

Some sites do not want other sites to link to them or to certain of their pages, and some have threatening language aimed at those that would have links to them. In the summer of 2000, there was a ruling by a U.S. federal judge in Los Angeles that a site having links to another site, including a competitor, is legal so long as two conditions are met:^[3]

Users know whose site they are on.
One company's page is not a duplication of another's page.

The plaintiff's argument of unfair business competition was dismissed by the judge. The plaintiff is appealing.

^[3] Ticketmaster Online-CitySearch Inc. vs. Tickets.com. Reported by USA Today, June 7, 2000.

Your CGI programs can use the $HTTP_REFERER environment variable to see if the referring page is acceptable. Although it can be spoofed easily, checking it will keep out the unwanted opportunists. If the proper referring pages have unique (varying) URLs, the spoofing becomes very hard. Cookies and SSL may be used for this purpose too.
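In a C CGI, the referrer check might be sketched as follows. The allowed prefix is a hypothetical placeholder for your own site, and a real program would probably accept several prefixes. As noted above, the header is easily spoofed, so this only keeps out the casual opportunist.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical prefix that every acceptable referring page starts
 * with. The trailing slash keeps look-alike host names such as
 * www.example.com.evil.test from passing the check. */
#define ALLOWED_PREFIX "http://www.example.com/"

/* referer is the value of the HTTP_REFERER environment variable,
 * or NULL when the client did not send one. */
bool referer_is_acceptable(const char *referer)
{
    if (referer == NULL)
        return false;
    return strncmp(referer, ALLOWED_PREFIX, strlen(ALLOWED_PREFIX)) == 0;
}
```

The CGI would call referer_is_acceptable(getenv("HTTP_REFERER")), with getenv() coming from <stdlib.h>, and refuse service or show a different page when the result is false.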
