LDAP and the Web: A Case Study | The ABCs of LDAP: How to Install, Run, and Administer LDAP Services


In the first part of this chapter, we discovered the possibilities of accessing an LDAP server using a Web browser, and we learned how to interconnect these two important Internet services. In this second part of the chapter, we will use these technologies in a concrete application.

Web technology has revolutionized software development over the past five years, and there are no signs that evolution of this technology is coming to an end. Because LDAP is an internetworking protocol, it is also an important part of the World Wide Web. LDAP facilitates data distribution throughout networks, not only in an enterprise's intranet but also over the Internet.

LDAP is particularly useful for the extranet because it allows you to push data out of your intranet in a defined way. The replication capacity of LDAP is of particular importance. In this section, we will see a robust framework that allows you to push data out of the enterprise to your extranet, where your customers can access this information. The LDAP protocol also allows you to receive data from your customers, data that was previously collected via your company's Web site, accessible from the Internet.

This second section of the chapter provides a case study for such an environment. It contains a number of solutions and demonstrates that LDAP can go well beyond the classic phone-book application. This is not to imply that the phone-book application is trivial. In most cases, a phone-book application is the first step toward "LDAP enabling" a company.

First, we will briefly review the requirements for the application: a Web site offering a number of applications for its users. The administration and authentication of the users are handled through LDAP, which is where the collaboration of Web services and LDAP comes into play. After learning the requirements, we will see the architecture of the application. The proposed solution could have been simpler, but we decided to include security requirements in the case study. We then briefly review the schema of the directory, without going into much detail. After that, we will see how a Web server can use LDAP authentication and describe how to access a directory using a normal Web browser. Finally, we will learn about the design of a broker connecting different data sources.

Requirements

Note that this case study is far from complete. We tried to keep it as general as possible so that it would apply to a wide range of application requirements. That said, let us move on to the list of requirements.

Assume that we have a medium-sized company and that this company needs to install an Internet Web server. Further, assume that this Internet Web server hosts the company's entire Web site, which holds different types of content:

Publicly available data
Data available after subscription
Confidential data

The Web server needs to know the identity of the user if it is to grant access to privileged content (the last two content types). The Web server may also require encrypted traffic between itself and its clients, i.e., the Web browsers.

The Web server should also run applications, e.g., discussion groups, a news server, information-retrieval systems, file upload and download utilities, and similar applications. All of these Web applications need to know the identity of the users trying to connect with them. Consequently, there are several categories of users:

Anonymous users: Users navigating the Web and accessing publicly available data on the Web site. The user does not need to deliver any credentials.
Authenticated users (low level): Users that the system recognizes. They have access to all unrestricted information. The only purpose of authentication at this level is to deliver the user to a familiar environment, such as "my" site, with predefined user preferences (filtered news on the company server, site setup, etc.). The server requires authentication, but the conversation need not be encrypted because the data is available to everyone, including anonymous users.
Authenticated users (high level): Users that have access to restricted data. In this case, the conversation should be encrypted. The information is not confidential, so the Web server can rely on the authentication procedure to let the user access the data. An example would be a company that, for legal reasons, has to restrict access to certain information.
Authenticated users accessing sensitive data: These users could be internal persons needing important data on the Internet for their daily work, e.g., employees in the sales or marketing departments. Business partners might also need to exchange sensitive information via the Internet, e.g., business-to-business (B2B) applications.
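The four categories above can be read as a small policy table: does the category require a log-in, and does it require an encrypted connection? The following is a minimal sketch of that reading, not part of the case study itself. The key names and the `may_serve` helper are hypothetical, and treating the "sensitive data" category as requiring encryption is an assumption drawn from the description of the high-level category.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessLevel:
    label: str
    needs_login: bool       # must the user present credentials?
    needs_encryption: bool  # must the conversation run over HTTPS?


# Hypothetical keys for the four user categories described in the text.
LEVELS = {
    "anonymous": AccessLevel("Anonymous users", False, False),
    "low": AccessLevel("Authenticated users (low level)", True, False),
    "high": AccessLevel("Authenticated users (high level)", True, True),
    # Assumption: sensitive data needs at least what the high level needs.
    "sensitive": AccessLevel("Authenticated users, sensitive data", True, True),
}


def may_serve(level_key: str, logged_in: bool, over_https: bool) -> bool:
    """Return True if a request satisfies the requirements of the given level."""
    level = LEVELS[level_key]
    if level.needs_login and not logged_in:
        return False
    if level.needs_encryption and not over_https:
        return False
    return True
```

A table like this keeps the decision in one place, so a finer restriction (for example, by time of day) could later be added as one more field.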

Where does the directory server get the information that restricts what the user can do and see? Some users are logged in automatically using information held in the company database. Other users can subscribe themselves. You can configure these new subscriptions to become active immediately or delay their status pending a manual or filtered review.

These are just some general restrictions, but you can further refine the restrictions to suit your needs. For example, you might want to restrict a particular user group to connect only at a certain time of the day.

LDAP Internet Environment

Now that we have an idea of what we want to achieve, we are ready to design an architecture that will meet our needs. Exhibit 5 shows the picture. The two towns represent the Internet users who connect via browser to your company Web site. However, they do not link directly to your site. Upon opening a connection to your Web site at http://www.LdapAbc.org, the user is connected to the Web server through a firewall. A discussion about the configuration of the firewall would go far beyond the scope of this book. However, it is worth mentioning that the firewall has the job of keeping out traffic that the site is not designed for. Remember, the firewall cannot keep out bad guys!

Exhibit 5: Architecture of LDAP-Handled Web Site

Because we only need access to the Web server, we will forbid all protocols except HTTP. We also enable HTTPS, i.e., the secured HTTP protocol, which places SSL/TLS as a security layer between HTTP and the insecure TCP/IP protocol stack. In the same way, the LDAP directory server can use SSL/TLS as a security layer between itself and TCP/IP. We will also handle the file upload and download features via HTTP. The fewer protocols the firewall has to handle, the better its performance. You can always add a new protocol later if you really need it.

Once the request from the user's Web browser arrives at the Web server, the user gets the "welcome page." At this point, she might click on a link pointing to a location that requires authentication. As mentioned previously, we will use LDAP to authenticate the user. Of course, the Web server also has to speak the LDAP protocol. We could handle this in an application, but we will also be serving static pages, for which we do not want to use any scripting at all. Therefore, it is better to leave authentication to the server. In a later section, "LDAP Authentication and the Web Server," we will see how. For now, just assume that the server uses the LDAP protocol to communicate with the LDAP server. The Web server therefore acts as an LDAP client. The Web server and the two LDAP servers are in the so-called demilitarized zone (DMZ), the zone between the firewall and the intranet. Inside the DMZ, the Web server and the LDAP server can speak LDAP. The firewall does not pass the LDAP protocol, so the LDAP server cannot be contacted directly from the Internet.

If the user is unknown, she can subscribe herself on the LDAP server. However, if the firewall does not forward the LDAP protocol, how can the user subscribe herself on the LDAP server? She can do so using an LDAP gateway. We will learn about a very flexible gateway later, in the section "LDAP-HTTP Gateway."

When the user subscribes to the LDAP server, nothing happens until a broker mediates the request. There is a broker in the intranet that looks from time to time to see if there is something new on the LDAP server out in the DMZ. If there is, the broker decides what to do. A number of actions are possible: subscribe the user; ask the administrator what to do; or drop the subscription. The broker gets this information from its configuration files. We will learn more about this in the section "LDAP Application Broker."
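As a rough illustration of how the broker might pick one of the three actions from its configuration, here is a minimal sketch. The configuration format, the `[actions]` section, and the idea of keying the decision on the kind of group requested are all assumptions made for this example; the text only says that the broker takes its decisions from configuration files.

```python
import configparser

# Hypothetical configuration: one action per kind of group, plus a fallback.
CONFIG_TEXT = """
[actions]
lists = subscribe
sites = ask_admin
default = drop
"""


def load_policy(text: str) -> dict:
    """Read the action table from configuration text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["actions"])


def decide(policy: dict, group_type: str) -> str:
    """Return 'subscribe', 'ask_admin', or 'drop' for a pending request."""
    return policy.get(group_type, policy["default"])


policy = load_policy(CONFIG_TEXT)
print(decide(policy, "lists"))  # a request for a discussion group
print(decide(policy, "depts"))  # not listed, so the fallback applies
```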

Once the broker has handled the request correctly, the new entry, if any, should go into the directory. For security reasons, we keep a copy inside the enterprise. This copy is also used to put the changes onto the LDAP gateway. The LDAP gateway pushes the updates it gets onto the read-only LDAP server in the DMZ. The HTTP server obtains authentication information from the read-only LDAP server.

The broker has yet another function. It observes changes in the directory containing the data of the employees. When something changes, it replicates this change into the LDAP directory inside the intranet. The directory holds the userIDs of the employees. It does not hold the passwords. For security reasons, the employees should not use the same password for both the Internet LDAP database and the company's intranet.

As you may have noticed, the DMZ contains two LDAP databases. One directory is read-only and is used to deliver authentication information to the Web server. The second directory is used to hold the subscription requests and the modify requests of the users. This directory is configured to allow users to insert and modify their individual entries. The broker is used to update the master directory within the intranet. The intranet directory is then used to update, via replication, the read-only directory in the DMZ. This somewhat complicated architecture is used for security reasons. The directory that grants access to users is reachable in the DMZ, but it is almost in a read-only state: only one user on one machine can update it. For everyone else, it is read-only. This directory is a replica of the directory on the LDAP gateway.

Now that we have an overview of the architecture, let us have a look at the details, beginning with the Web server and its LDAP module. This example uses the open-source Web server implementation from Apache. Open-source software has the big advantage of allowing programmers to look at the source code to see exactly what is going on in the program. You can tune this open-source Web server until it meets your requirements exactly. Furthermore, you can learn a lot from this implementation and can reuse this knowledge, even if you decide to use a commercial Web server.

LDAP Directory

Let us have a very brief look at how we could organize the data in the directory. We will use the standard inetOrgPerson object class because, based on our requirements, we are not interested in any attributes that this object class cannot hold.

Because we will also define groups of persons accessing a particular Web site or a particular discussion group, we will use a number of groupOfUniqueNames object classes to hold the group information.

We also want to distinguish between external persons (customers) and internal persons (employees). By assigning an organizational unit to each category, we can easily identify all customers or all employees having access to the Web server.

We have on our Web server different Web sites to protect, different discussion groups with defined access, and we want to differentiate the access of the employees. For example, we do not want a member of the sales group to gain access to the site reserved for the marketing group, and the marketing people should not have access to the sites of the sales group. It is possible to differentiate the groups into much finer categories. If you need to do so, you can simply add further complexity to the DIT. Exhibit 6 shows the DIT we will use in this example.

Exhibit 6: DIT of Web Site Directory

A big problem is always the decision of how to name the entries for the individual users. You need to make the entries unique in the directory, but you also need to assign unique user credentials to each user to guarantee that no two users have the same log-in information. The easiest way to do this is to use a unique userID. How to generate the userID is the next problem. You could let the user choose one, or you could use an automatic procedure that generates the userID, perhaps from surname and name. The decision is up to you. You could also use the e-mail address of the user, which should be unique. We will use the e-mail address because this can be handled consistently for employees and customers. However, this means that the customer must have an e-mail account. You can solve this problem by offering him an e-mail account if he does not already have one.
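To make the naming decision concrete, the sketch below builds a distinguished name for a user from the e-mail address. It assumes that the `mail` attribute is used as the naming attribute of the entry and that users live under `ou=customers` or `ou=employees` below `dc=LdapAbc,dc=org`; Exhibit 6 may arrange this differently. The helper names are hypothetical. The escaping follows the mandatory rules of the LDAP string representation of distinguished names (RFC 4514).

```python
BASE_DN = "dc=LdapAbc,dc=org"


def escape_rdn_value(value: str) -> str:
    """Escape a string for use as an attribute value inside a DN (RFC 4514)."""
    out = []
    last = len(value) - 1
    for i, ch in enumerate(value):
        if ch in ',+"\\<>;':
            out.append("\\" + ch)      # characters that must always be escaped
        elif ch == "#" and i == 0:
            out.append("\\#")          # a leading '#' must be escaped
        elif ch == " " and i in (0, last):
            out.append("\\ ")          # leading or trailing space
        elif ch == "\0":
            out.append("\\00")         # the null character, as a hex pair
        else:
            out.append(ch)
    return "".join(out)


def user_dn(mail: str, internal: bool) -> str:
    """Build the DN of a user entry; assumes 'mail' is the naming attribute."""
    ou = "employees" if internal else "customers"
    return f"mail={escape_rdn_value(mail)},ou={ou},{BASE_DN}"


print(user_dn("ReinhardVoglmaier@LdapAbc.org", internal=True))
```

Because the same function serves both branches of the tree, employees and customers are named consistently, which is the reason given above for choosing the e-mail address.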

As you see, we have three different types of groups plus two organizational units that specify the origin of the user:

Discussion groups, contained in the organizational unit labeled "lists" (ou=lists)
Groups to limit access to Web sites, contained in the organizational unit labeled "sites" (ou=sites)
Groups inside your enterprise, contained in the organizational unit labeled "depts," deriving from departments (ou=depts)
Users from outside the enterprise (ou=customers)
Users from inside the enterprise (ou=employees)

With the first three group types, you can later answer such questions as "How many discussion groups are hosted by our Web server?" Also using these group types, you can develop queries that will provide useful site statistics.
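A question such as "How many discussion groups are hosted by our Web server?" translates into a one-level search below `ou=lists` for entries of object class `groupOfUniqueNames`; the answer is the number of entries returned. The sketch below only assembles the corresponding LDAP URL (in the format of RFC 4516) and does not contact a server. The function names are hypothetical, and the host name is the one used elsewhere in this example.

```python
from urllib.parse import quote


def ldap_search_url(host, base_dn, attributes, scope, search_filter):
    """Assemble an LDAP URL of the form ldap://host/base?attrs?scope?filter."""
    return "ldap://{}/{}?{}?{}?{}".format(
        host,
        quote(base_dn, safe="=,"),            # percent-encode anything unusual
        ",".join(attributes),
        scope,                                # "base", "one", or "sub"
        quote(search_filter, safe="=()&!*"),
    )


def groups_url(ou: str) -> str:
    """URL listing the names of all groups directly below the given unit."""
    return ldap_search_url(
        "ldap1.LdapAbc.org",
        f"ou={ou},dc=LdapAbc,dc=org",
        ["cn"],
        "one",
        "(objectClass=groupOfUniqueNames)",
    )


print(groups_url("lists"))
```

The same helper called with "sites" or "depts" yields the queries for the other two group types.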

LDAP Authentication and the Web Server

Recall that, for this example, we are using the open-source Web server Apache, available from http://httpd.apache.org. If we are to let the Web server authenticate the user, we also need the module that allows authentication against LDAP. This is also available as open-source software. You can find many such modules using the database on the Apache Web site. Look at http://modules.apache.org to get the list of modules available. At this link, you also will find a utility to search the module database. The module used in this example is the "auth_ldap" module written by Dave Carrigan. The software and documentation for this module are available at http://www.rudedog.org. This software works with Netscape (SUN) and OpenLDAP libraries. It also allows access to LDAP over SSL.

We used the OpenLDAP libraries to compile the auth_ldap module into the Apache Web server. Because we will later also be using the HTTPS protocol, we will need to set up a secure connection using SSL. The first thing to do is to install OpenSSL, available at http://www.openssl.org. Once OpenSSL is installed, you can install Apache, adding the auth_ldap module. You will need to install two Apache versions: one with SSL and one without SSL. Once you have successfully installed the two Apache Web servers, you can set up the authentication methods.

Before configuring the Web server to use LDAP authentication, we must first establish which types of authentication we need:

We need to check, using the userID, whether the user is known by the system, i.e., whether the mail address corresponds to a valid user of the system.
We need to check, using the mail address, whether the user is part of a particular group.
We may need to give access to only a few persons, without defining for them a particular group.

The following subsections show how to configure these three login categories.

Check Whether the User Is Known by the System

Because we have two different types of users, customers and employees, we will need to distinguish between them. The configuration must supply three pieces of information: the LDAP server to contact, the base DN under which the entries live, and the attribute against which the log-in name supplied by the user is matched.

 AuthLDAPURL ldap://ldap1.LdapAbc.org/dc=LdapAbc,dc=org?mail
 require valid-user

These directives grant access to every user contained in the subtree dc=LdapAbc,dc=org. The mail address is used for login. If you only want to log in customers, you would use:

 AuthLDAPURL ldap://ldap1.LdapAbc.org/ou=customers,dc=LdapAbc,dc=org?mail
 require valid-user

Likewise, if you only want to grant access to employees, you would use:

 AuthLDAPURL ldap://ldap1.LdapAbc.org/ou=employees,dc=LdapAbc,dc=org?mail
 require valid-user
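To see which parts of the URL carry which piece of information, the following sketch splits an AuthLDAPURL value into the server, the base DN, and the log-in attribute. It is an illustration written for this text, not the parsing code of the auth_ldap module; the function name and the returned dictionary keys are hypothetical.

```python
from urllib.parse import unquote, urlsplit


def parse_auth_ldap_url(url: str) -> dict:
    """Split ldap://host/basedn?attribute into its three pieces."""
    parts = urlsplit(url)
    if parts.scheme not in ("ldap", "ldaps"):
        raise ValueError(f"not an LDAP URL: {url!r}")
    # Everything after the first '?' is the query; further fields (scope,
    # filter) would follow after additional '?' separators.
    fields = parts.query.split("?")
    return {
        "host": parts.netloc,
        "base_dn": unquote(parts.path.lstrip("/")),
        "attribute": fields[0],
    }


print(parse_auth_ldap_url(
    "ldap://ldap1.LdapAbc.org/ou=customers,dc=LdapAbc,dc=org?mail"
))
```

Only the base DN differs between the customer and the employee variants; the server and the attribute stay the same.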

Accept Only Members of Particular Groups

In this example, assume that we want to limit access to this Web page to members of the discussion group named "LDAP Fundamentals." Because we accept persons who are members of a discussion group, the organizational unit is "ou=lists":

 AuthLDAPURL ldap://ldap1.LdapAbc.org/dc=LdapAbc,dc=org?mail
 require group cn=LDAP Fundamentals, ou=lists, dc=LdapAbc, dc=org

Again, we can limit access to those members who are employees with the following definition:

 AuthLDAPURL ldap://ldap1.LdapAbc.org/ou=employees,dc=LdapAbc,dc=org?mail
 require group cn=LDAP Fundamentals, ou=lists, dc=LdapAbc, dc=org

or limit access to group members who are customers:

 AuthLDAPURL ldap://ldap1.LdapAbc.org/ou=customers,dc=LdapAbc,dc=org?mail
 require group cn=LDAP Fundamentals, ou=lists, dc=LdapAbc, dc=org

Accept Only a Particular User

Here we accept only the user with the e-mail address ReinhardVoglmaier@LdapAbc.org. Thus, the definition looks like this:

 AuthLDAPURL ldap://ldap1.LdapAbc.org/dc=LdapAbc,dc=org?mail
 require user ReinhardVoglmaier@LdapAbc.org

That is all you need to do to authenticate the different types of users in our environment. The log-in name the user provided is then available to your applications through the environment variable REMOTE_USER, which is set by the Web server.
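An application started by the Web server can pick up the authenticated name from its environment. The sketch below shows this for a CGI-style program; the function names are hypothetical, and the greeting is only a placeholder for whatever the application does with the identity.

```python
import os


def current_user(environ=os.environ):
    """Return the authenticated log-in name, or None for anonymous requests."""
    return environ.get("REMOTE_USER") or None


def greeting(environ=os.environ) -> str:
    """Build a one-line greeting based on who is logged in."""
    user = current_user(environ)
    if user is None:
        return "Welcome, guest"
    return f"Welcome back, {user}"


if __name__ == "__main__":
    # Minimal CGI response: header, blank line, body.
    print("Content-Type: text/plain")
    print()
    print(greeting())
```

Passing the environment as a parameter keeps the functions easy to try out without a running Web server.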

LDAP-HTTP Gateway

At this point in our example, we can authenticate existing users to allow them access to protected information, and we can give the Web server the information of who is making a particular request. Remember that we also wanted to allow the user to sign up and insert her name into the database for a particular group. Exhibit 7 shows the relevant part of Exhibit 5 in greater detail.

Exhibit 7: Web Server Speaking with an LDAP Server

The user is connected via browser to the Web server, so the Web server and the browser are using the HTTP protocol for communication. Recall that this was the only protocol we allowed through the firewall. To subscribe to a group, the user fills out a form on the Web server. Once the form is filled out and submitted, the Web server launches an application that contacts the LDAP server, speaking LDAP, and asks the LDAP server to insert a new entry in the directory. The application labeled "LDAP-enabled" in Exhibit 7 can be a CGI script written in Perl, a page written in PHP, or a Java servlet. The whole architecture shown in Exhibit 8 is called an HTTP-LDAP gateway.
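The step from a submitted form to a new directory entry can be sketched as follows. The example turns a URL-encoded form body into an LDIF "add" record that an LDAP client could then send to the server. The form field names (`givenName`, `sn`, `mail`), the use of `mail` as naming attribute, and the default base DN are assumptions made for this illustration, and the code does not talk to an LDAP server itself.

```python
import base64
from urllib.parse import parse_qs

DN_SPECIALS = set(',+"\\<>;=#')


def ldif_line(attr: str, value: str) -> str:
    """Emit one LDIF line, base64-encoding values that are not LDIF-safe."""
    safe = (
        value != ""
        and value.isascii()
        and value.isprintable()
        and value[0] not in " :<"
        and not value.endswith(" ")
    )
    if safe:
        return f"{attr}: {value}"
    encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return f"{attr}:: {encoded}"


def form_to_ldif(body: str, base_dn: str = "ou=customers,dc=LdapAbc,dc=org") -> str:
    """Convert a submitted sign-up form into an LDIF add record."""
    form = {key: values[0].strip() for key, values in parse_qs(body).items()}
    missing = [name for name in ("givenName", "sn", "mail") if not form.get(name)]
    if missing:
        raise ValueError(f"missing form fields: {missing}")
    mail = form["mail"]
    # Keep the sketch simple: refuse addresses that would need DN escaping.
    if DN_SPECIALS & set(mail):
        raise ValueError("mail contains characters that would need DN escaping")
    lines = [
        ldif_line("dn", f"mail={mail},{base_dn}"),
        "changetype: add",
        "objectClass: inetOrgPerson",
        ldif_line("cn", f"{form['givenName']} {form['sn']}"),
        ldif_line("sn", form["sn"]),
        ldif_line("givenName", form["givenName"]),
        ldif_line("mail", mail),
    ]
    return "\n".join(lines) + "\n"


print(form_to_ldif("givenName=Ada&sn=Example&mail=ada%40example.org"))
```

Whether the application is a CGI script, a PHP page, or a servlet, the same two steps remain: validate what the user typed, then express it as a directory entry.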

Exhibit 8: Architecture of HttpLdap Gateway (Developed by Jon Roberts, Mentata Systems.)

Because we want to allow the users to sign up and also to modify their own entries, we must provide a form to update the information contained in the directory. Thus, we need to configure the LDAP server to permit users to change only their own entries. If you compare the work you have to do to implement this functionality on an RDBMS with the work you need to do so using LDAP, you will see that the LDAP server comes in very handy in these activities.

Once developed, the LDAP-HTTP gateway can also come in handy within the intranet. Recall that the broker brings in data from the Internet and puts it on the directory within the intranet. However, this directory also has to be maintained. For example, there could be requests from users to reset their passwords or similar activities. To keep up with directory maintenance, you need an administration interface. You could write an application using your preferred development system and the corresponding LDAP library. But if you use the same application logic that you used outside the intranet, you can avoid reinventing the wheel. Furthermore, you can save time later on during software maintenance if you use a similar architecture for both applications.

Instead of writing our own solution, we will use a ready-to-use framework in our example. Luckily, there is an open-source project written in Java. The framework is developed and maintained by Jon Roberts, the proprietor of Mentata Systems. It is available for download from http://www.mentata.com together with excellent documentation. The software is based on the standard servlet technology. You need a Web server that can handle servlets, e.g., Tomcat. Tomcat is part of the open-source Jakarta project.

Exhibit 8 shows the architecture of this framework. On one end is the Web server, which delivers pure HTML to the browser, thus allowing the use of a standard browser. On the other end is the LDAP server speaking the LDAP protocol. The Web server holds the user interface in the form of static pages. The dynamic part is handled by Java server pages (JSP) and Java servlets that are held within Tomcat, which is also known as a "servlet container." The Web server contacts Tomcat as soon as it needs the dynamic part to be executed. For access on the LDAP side, an LDAP software development kit (SDK) is needed. Together with the servlet container, the SDK forms the base for two packages written in Java: the gateway extends the servlet classes by adding LDAP support, and the LDAP-HTTP package offers the classes that can be used by the final application. In the application, you have nothing else to do other than override the methods of the LDAP-HTTP classes to obtain the needed functionality.

LDAP Application Broker

The last piece missing in our design is the broker, which mediates transactions between the outside directory, the inside copy of the directory, and the database containing the employee data. We propose a simple design: a central broker, configured by a simple ASCII configuration file, that speaks with agents that contact the different servers. In our particular case, we have two LDAP servers — one in the DMZ where the updates are read from and one in the intranet where the updates are written to — and one RDBMS, which also receives updates. Exhibit 9 shows an overview of the architecture.

Exhibit 9: Architecture of the LDAP Application Broker

We could have avoided writing an application of our own by using a commercial metadirectory product. However, the broker architecture adds flexibility to the design that we will appreciate later when we may need to add new functionality to our LDAP sites.

To keep things simple, the design is broken down into components. Each component is called an agent, and there are local and remote agents. The remote agents are responsible for collecting the freshly modified data or writing back the data modified by another data source. The local agents keep in contact with the remote agents and interact with the broker. The broker then puts two local agents in contact, one to receive the data and one to deliver the data. The broker negotiates these transactions among the different local agents based on its configuration file.
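The interplay of broker and local agents can be sketched in a few lines. In this illustration, each local agent is reduced to two lists (changes reported by its remote agent and changes handed over for writing), and the routes stand in for what the broker would read from its configuration file. The class names, the agent names, and the shape of a change record are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LocalAgent:
    """Stands between the broker and one remote agent (one data source)."""
    name: str
    pending: list = field(default_factory=list)    # reported by the remote agent
    delivered: list = field(default_factory=list)  # handed over for writing

    def collect(self) -> list:
        """Hand the pending changes to the broker and clear the queue."""
        changes, self.pending = self.pending, []
        return changes

    def deliver(self, changes: list) -> None:
        """Accept changes that the remote agent should write to its server."""
        self.delivered.extend(changes)


class Broker:
    """Puts pairs of local agents in contact according to its routes."""

    def __init__(self, routes):
        self.routes = routes  # list of (source name, target name) pairs
        self.agents = {}

    def register(self, agent: LocalAgent) -> None:
        self.agents[agent.name] = agent

    def run_once(self) -> int:
        """Move all pending changes along the routes; return the number moved."""
        sources = {source for source, _ in self.routes}
        collected = {name: self.agents[name].collect() for name in sources}
        handed_over = 0
        for source, target in self.routes:
            self.agents[target].deliver(collected[source])
            handed_over += len(collected[source])
        return handed_over


# Hypothetical setup: updates read in the DMZ go to the intranet LDAP and the RDBMS.
broker = Broker([("dmz-ldap", "intranet-ldap"), ("dmz-ldap", "rdbms")])
for name in ("dmz-ldap", "intranet-ldap", "rdbms"):
    broker.register(LocalAgent(name))
broker.agents["dmz-ldap"].pending.append({"op": "add", "mail": "a@example.org"})
print(broker.run_once())
```

Collecting from each source once before delivering lets one source feed several targets, which is the flexibility the broker design aims for.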

The remote agent that updates a directory is easy to write. It only has to do the normal add, delete, and modify operations. The remote agent can interact with the RDBMS using any number of libraries that facilitate contact with an RDBMS. The remote agent that gets the modifications in the LDAP repository simply uses the log file to see whether an update in the directory has occurred. As soon as the remote agent notices an update, it contacts a local agent.
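A remote agent that watches a log file mainly has to remember how far it has already read. The sketch below shows one way to do this; the class name and the one-change-per-line log format are assumptions, since the format of the log is not described here.

```python
class LogWatcher:
    """Remembers its position in a log file and returns only new lines."""

    def __init__(self, path: str):
        self.path = path
        self.offset = 0  # number of bytes already processed

    def poll(self) -> list:
        """Return the complete lines appended since the previous call."""
        with open(self.path, "rb") as log:
            log.seek(self.offset)
            data = log.read()
        # Consume only up to the last newline, so that a line still being
        # written by the server is picked up on a later call.
        end = data.rfind(b"\n") + 1
        self.offset += end
        text = data[:end].decode("utf-8")
        return [line for line in text.splitlines() if line.strip()]
```

Calling `poll()` from time to time and passing each returned line to a local agent gives the behavior described above: as soon as an update shows up in the log, it is forwarded.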

The broker uses a dispatcher that continues to create new local agents. This dispatcher uses a pool of active dispatcher processes. It always has a configurable number of dispatcher processes to contact if work has to be done.
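The idea of a pool of configurable size can be illustrated with the standard thread pool. This sketch uses threads as a stand-in for the dispatcher processes mentioned above, and the `dispatch` function and its parameters are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def dispatch(jobs, handler, pool_size: int = 4) -> list:
    """Run handler(job) for every job on a pool of pool_size workers.

    Results come back in the order of the jobs, regardless of which
    worker finished first.
    """
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        return list(pool.map(handler, jobs))


# Example: three pending updates handled by a pool of two workers.
print(dispatch(["add a", "modify b", "delete c"], str.upper, pool_size=2))
```

The pool size plays the role of the configurable number of workers that are kept ready for incoming work.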

This section has shown you how to approach the job of combining information from different data sources. As you have seen, you can achieve a lot with an easy-to-implement solution. A project to implement this broker architecture had already begun at the time of this writing. This work is also in the form of an open-source project. Sources and documentation can be downloaded from http://www.sourceforge.net.
