Authorization and authentication are common requirements for many websites. Authentication establishes the identity of parties in a communication. You can authenticate yourself through something you know such as a password or a cookie, through something tangible such as an ID card or a key, through something intrinsically part of you such as your fingerprint or your retina, or through any combination of these elements. In the context of a website, authentication is usually restricted to the use of passwords and certificates.
Authorization deals with protecting access to resources. You can authorize access based on several factors, such as the IP address the user is coming from, the user's browser type, the content the user is trying to access, or who the user is (previously determined via authentication).
Apache includes several modules that provide authentication and access control and that can be used to protect both dynamic and static content. You can either use one of these modules or implement your own access control at the application level and provide customized login screens, single sign-on, and other advanced functionality.
Users are authenticated for tracking or authorization purposes. The HTTP specification provides two authentication mechanisms: basic and digest. In both cases, the process is the following:
In the basic authentication scheme, the username and password are transmitted in clear text, as part of the HTTP request headers. This poses a security risk because an attacker could easily peek at the conversation between server and browser, learn the username and password, and reuse them freely afterward.
The digest authentication provides increased security because it transmits a digest instead of the clear text password. The digest is based on a combination of several parameters, including the username, password, and request method. The server can calculate the digest on its own and check that the client knows the password, even when the password itself is not transmitted over the network.
By the Way
A digest algorithm is a mathematical operation that takes a text and returns another text, a digest, which uniquely identifies the original one. A good digest algorithm should make sure that, at least for practical purposes, different input texts produce different digests and that the original input text cannot be derived from the digest. MD5 is the name of a commonly used digest algorithm.
Unfortunately, although the specification has been available for some time, only very recent browsers support digest authentication. This means that for practical purposes, digest authentication is restricted to scenarios in which you have control over the browser software of your clients, such as in a company intranet.
In any case, for both digest and basic authentication, the requested information itself is transmitted unprotected over the network. A better choice to secure access to your website involves using the HTTP over SSL protocol, as described in Chapter 30, "Setting Up a Secure Web Server."
User Management Methods
When the authentication module receives the username and password from the client, it needs to verify that they are valid against an existing repository of users. The usernames and passwords can be stored in a variety of ways, including the file-and database-based mechanisms provided for by Apache. Third-party modules provide support for additional mechanisms such as Lightweight Directory Access Protocol (LDAP) and Network Information Services (NIS).