Chapter 5: Distributed Architectures | The ABCs of LDAP: How to Install, Run, and Administer LDAP Services


Overview

Until now, we have seen only simple directory architectures: a single directory server holding the whole directory and serving a number of clients. However, life is rarely so simple, and you probably have a somewhat more complex implementation in place.

A single directory server is not always sufficient to meet your requirements. Sometimes it is necessary to keep a copy of the same data at different locations to optimize access speed. There are many good reasons to do so, and we will discuss them later in greater detail. For now, let us review a few without further comment. One reason could be to bring the server closer to the clients. This is helpful in a wide-area network (WAN) with slow connections between the local-area networks (LANs) that make it up. Another reason could be load balancing, where the load of an overloaded directory server is spread over two or more machines to increase throughput. Whatever the reason for distributing the directory over several physical servers, the pieces should appear to the clients as a single logical directory server.

LDAP, which runs over TCP/IP, is well suited to a distributed architecture. You can even design your LDAP architecture to cross enterprise boundaries. In this chapter, we will learn how.

Let us begin with a familiar example, the Web browser. HTTP, which also runs over TCP/IP, is a very well known example of such a distributed architecture. Consider the myriad Web servers holding a wealth of data; clearly, not all of that data can be held on a single server. Furthermore, the same data may be available from different places. This redundancy allows the user to choose the nearest location. It also allows the user to contact another server should the first one become unavailable. You can design the architecture so that the search for an available server becomes automatic (round-robin architecture or proxy). You can also cross enterprise boundaries, with your links pointing to Web servers outside your enterprise, if the configuration of the firewall allows this. The underlying concepts are the same for LDAP.
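The automatic search for an available server can be sketched in a few lines. The following is only an illustration of the round-robin idea with failover, not the way any particular LDAP product implements it. The class name `RoundRobinPool`, the `send` callback, and the host names are assumptions made for this example; no real LDAP library is called.

```python
from typing import Callable, Iterable


class RoundRobinPool:
    """Hand out replica servers in turn, skipping the ones that fail."""

    def __init__(self, servers: Iterable[str]) -> None:
        self._servers = list(servers)
        if not self._servers:
            raise ValueError("at least one server is required")
        self._next = 0  # index of the server to try first on the next request

    def request(self, send: Callable[[str], str]) -> str:
        """Try each server once, starting at the round-robin position."""
        count = len(self._servers)
        start = self._next
        self._next = (start + 1) % count  # the next request starts one further
        for offset in range(count):
            server = self._servers[(start + offset) % count]
            try:
                return send(server)
            except ConnectionError:
                continue  # server unavailable: move on to the next replica
        raise ConnectionError("no replica is reachable")


def fake_send(server: str) -> str:
    """Stand-in for a real query; pretends that one replica is down."""
    if server == "ldap2.example.com":
        raise ConnectionError(server)
    return f"answered by {server}"


if __name__ == "__main__":
    # Hypothetical replicas that all hold the same directory data.
    pool = RoundRobinPool(
        ["ldap1.example.com", "ldap2.example.com", "ldap3.example.com"]
    )
    for _ in range(3):
        print(pool.request(fake_send))
```

In this sketch the client never needs to know which replica answered. The same idea can sit in a proxy in front of the servers instead of in the client.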

Data distribution is closely tied to LDAP's nature as a network protocol. Indeed, directory services are one piece of the distributed computing environment (DCE), which was developed by the Open Software Foundation (OSF), an organization that later became part of The Open Group. DCE version 1.0 was released in 1992 and contained the following major services:

Distributed time services
Distributed file services
Remote procedure call (RPC)
Directory services

The distributed computing environment has the goal of coordinating resources and applications in large heterogeneous networks. A heterogeneous network is made up of computers on different hardware platforms running different operating systems. The TCP/IP and OSI protocol suites provide the basis of the communication between these computers.

The administration of such a DCE requires standards for making a number of services available over the whole network. One such service is the time service. Timing is very important when a large number of computers are linked together. Another very important service is our directory service. As we have seen in previous chapters, directory services are used to hold very different kinds of information. Naming services (e.g., the Domain Name System [DNS]), the Network Information Service (NIS), and the distribution of configuration data (for example, alias maps for sendmail) are examples of information that should be administered at one point only but made available all over the network. To distribute directory services effectively, reliably, and efficiently, you need two tools: replication and partitioning.

It frequently happens that a company's intranet has a number of different data repositories on different platforms. The reason is mostly historical and derives from the way the applications were born. It is quite common for each application to use its own repository. It is also common for these applications to store the same or similar data. These applications have normally grown up separately from each other and have a life of their own. In this way, a number of information islands are created in the enterprise, each of them using different hardware and software. As a result, almost the same data is held on different systems in different ways. This not only keeps administration efforts high; it also makes it difficult to obtain a single, consistent view of the data. The installation of a directory could potentially solve this problem, but it is not always possible to migrate all applications to use a directory for data storage.

If you have such a distributed system, for whatever reason, you will need to optimize the management of and access to the data. What you have to do is find a way of consolidating the data. This chapter addresses the issue of how to "hold together" such distributed systems. It consists of two parts, corresponding to the two reasons why systems are distributed:

To divide (partition) the same architecture for a gain in performance or availability or both. The division is planned and optimized for a given situation.
To have the same data on different architectures (replication) for application needs. The division has historical reasons, slows down performance, and increases administration efforts. Sometimes you have to live with this situation, even if it is not optimized. However, we will see that there are strategies for improving this situation too.
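To make the two terms concrete, here is a minimal sketch of how a client or proxy might decide which servers to contact when the directory is both partitioned and replicated. It assumes, purely for illustration, that each partition is identified by the suffix of a subtree and is held by one or more replicas. The suffixes, the host names, and the helper names `normalize` and `servers_for` are hypothetical.

```python
from typing import Dict, List

# Hypothetical layout: each subtree suffix is one partition, and each
# partition is held by one or more replica servers.
PARTITIONS: Dict[str, List[str]] = {
    "dc=example,dc=com": ["ldap-hq1.example.com", "ldap-hq2.example.com"],
    "ou=sales,dc=example,dc=com": ["ldap-sales.example.com"],
}


def normalize(dn: str) -> List[str]:
    """Split a DN into lowercased RDNs, ignoring spaces around commas.

    Simplification: escaped commas inside attribute values are not handled.
    """
    return [rdn.strip().lower() for rdn in dn.split(",") if rdn.strip()]


def servers_for(dn: str) -> List[str]:
    """Return the replicas of the partition with the longest matching suffix."""
    target = normalize(dn)
    best_len = 0
    best_servers: List[str] = []
    for suffix, servers in PARTITIONS.items():
        parts = normalize(suffix)
        is_suffix = len(parts) <= len(target) and target[-len(parts):] == parts
        if is_suffix and len(parts) > best_len:
            best_len, best_servers = len(parts), servers
    if not best_servers:
        raise LookupError(f"no partition holds {dn!r}")
    return best_servers


if __name__ == "__main__":
    print(servers_for("uid=jdoe,ou=sales,dc=example,dc=com"))
    print(servers_for("uid=asmith, ou=People, dc=example, dc=com"))
```

Choosing the longest matching suffix means that an entry under the sales subtree is served by the sales partition, while every other entry falls back to the servers that hold the rest of the tree.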
