Analyzing Your Namespace Needs | Understanding and Deploying LDAP Directory Services (2nd Edition)

Now that you have some idea what you're going to do with your namespace after you define it, it's time to turn our attention to the design process itself. The first step in designing a namespace is to understand your needs. Do you need a flat namespace or a hierarchical one? What attributes should you use to name entries? Do you have replication or partitioning needs that may affect the design of your namespace? What about access control? What applications will the directory be supporting? Are your needs constant, or might they change over time? These questions and others need to be answered before you can have confidence in your namespace design.

This section takes you through the major decisions you'll need to make when designing your namespace. Keep in mind that, like many other design problems, namespace design involves a series of trade-offs, such as giving up some administrative convenience in exchange for future flexibility. As we examine each of these trade-offs, we'll try to point out what you gain and what you sacrifice at each step. At the end of this chapter, we provide a checklist summary of the issues you should consider during the design process.

Choosing a Suffix

Your directory may have only a local scope, or it may be part of a larger, or even a global, directory system. In either case, one of the first choices you have to make when designing your namespace is determining the suffix below which your namespace will live. Picture your namespace as a tree: A suffix is the name of the entry at the top of the subtree you're designing.

Although LDAP places no restrictions on the suffix you use, three methods are commonly used. All the methods base the suffix on the name of your organization so that it's likely to be unique. This practice allows your directory to coexist with other LDAP servers, should the need arise (imagine what would happen if your company merged with another company).

The first method, and the one we recommend, is the technique described in RFC 2247 that maps a DNS domain name to a DN. In summary, the technique is to separate the components of the domain name, prepend "dc=" to each, and join them with commas. So, for example, the DNS domain name example.com would map to the DN dc=example, dc=com. This method has the advantage that your domain name is already guaranteed to be unique because it was assigned by an Internet domain name registrar. Therefore, the suffix you derive from that domain name should also be unique, assuming that other organizations are following the same convention. Beneath the suffix entry, you are free to divide the namespace however you see fit. Netscape Directory Server 6 and Microsoft Active Directory both use this method as their default for naming suffixes, although both allow you to override the default and invent your own suffix.
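The mapping just described is mechanical enough to sketch in a few lines of Python. This is a minimal illustration, not a full RFC 2247 implementation; the function name domain_to_suffix is our own, and the result is written without a space after each comma, as in the DNs used later in this section.

```python
def domain_to_suffix(domain: str) -> str:
    """Map a DNS domain name to a DN in the RFC 2247 style."""
    # Drop a trailing dot (fully qualified form) and split into labels.
    labels = [label for label in domain.strip().rstrip(".").split(".") if label]
    if not labels:
        raise ValueError("empty domain name")
    # Prepend "dc=" to each label and join the results with commas.
    return ",".join(f"dc={label}" for label in labels)


print(domain_to_suffix("example.com"))  # dc=example,dc=com
```

A subdomain such as eng.example.com simply yields one more dc component at the front of the suffix.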

The second common method is to use your organization's DNS name. If your organization's domain name is example.com, then the suffix for your directory will be o=example.com (the o attribute signals that the entry probably has the organization object class). This method also has the benefit of leveraging your already unique domain name. However, it deviates from the standard set by RFC 2247, and we do not recommend it.

The third method is to use the X.500 model for choosing your suffix. In this model you choose a suffix that can plug into the global X.500 directory. In the United States, the X.500 hierarchy has the c=US entry at the top, with entries for each of the U.S. states and territories directly beneath c=US. Organizational entries reside beneath each state entry. The RDN of the suffix entry is named with the o (organization) attribute, and the name of the company, as registered with the state or federal government, is used.

If the company is named Example Electronics, Inc., and is incorporated in the state of Delaware, the suffix will be o=Example Electronics\, Inc., st=Delaware, c=US. This DN is cumbersome for two reasons. First, it is long (47 characters versus 18 for the RFC 2247-derived suffix). Second, the RDN contains a comma, which must be escaped. Although good client software hides DNs from users, directory administrators frequently need to type them. For these reasons, we recommend against using X.500-style suffixes unless you know that you need to participate in the global X.500 directory.
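To see where the backslash comes from, here is a small Python sketch that escapes an attribute value before it is placed in a DN. It is a simplified helper of our own (escape_rdn_value is not a library function) that covers the characters commonly treated as special in the string form of a DN; check your directory software's documentation for the exact rules it applies.

```python
def escape_rdn_value(value: str) -> str:
    """Backslash-escape characters that have special meaning inside a DN."""
    special = set(',+"\\<>;')
    out = []
    for i, ch in enumerate(value):
        if ch in special:
            out.append("\\" + ch)
        elif ch == "#" and i == 0:
            out.append("\\#")          # a leading '#' is special
        elif ch == " " and i in (0, len(value) - 1):
            out.append("\\ ")          # so are leading and trailing spaces
        else:
            out.append(ch)
    return "".join(out)


suffix = ", ".join([
    "o=" + escape_rdn_value("Example Electronics, Inc."),
    "st=Delaware",
    "c=US",
])
print(suffix)       # o=Example Electronics\, Inc., st=Delaware, c=US
print(len(suffix))  # 47
```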

How Suffixes Work

It may help you to understand how a suffix is used by a directory server when the server is servicing a typical directory operation. For example, suppose that a client wants to modify the entry uid=bjensen,dc=example,dc=com. When a Netscape Directory Server 6 server receives a modification request, it compares the DN in the request to the directory suffixes it holds to determine whether the entry to be modified is beneath one of the suffixes held by the server. If the server holds the suffix dc=example,dc=com or the suffix dc=com, the modification proceeds.

If the server does not hold either of these suffixes, it can do one of three things. It might refer the client to a different directory server that does hold the requested data. It might forward the operation to the directory server that holds the requested data. Or the directory server might simply return an error, assuming that the requested entry does not exist. The specific behavior depends on the directory's configuration.
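The suffix comparison can be sketched as follows. This is an illustration of the idea only, not how any particular server implements it: the helper names are ours, DNs are compared by simple lowercasing, and a result of None stands for the case in which the server must refer, forward, or return an error.

```python
from typing import List, Optional


def split_dn(dn: str) -> List[str]:
    """Split a DN into RDN strings on commas that are not backslash-escaped."""
    parts, current, escaped = [], [], False
    for ch in dn:
        if escaped:
            current.append(ch)
            escaped = False
        elif ch == "\\":
            current.append(ch)
            escaped = True
        elif ch == ",":
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    # Simplification: compare case-insensitively by lowercasing everything.
    return [p.lower() for p in parts if p]


def find_suffix(dn: str, held_suffixes: List[str]) -> Optional[str]:
    """Return the longest held suffix that the DN falls under, or None."""
    target = split_dn(dn)
    best, best_len = None, 0
    for suffix in held_suffixes:
        parts = split_dn(suffix)
        if parts and len(parts) <= len(target) and target[-len(parts):] == parts:
            if len(parts) > best_len:
                best, best_len = suffix, len(parts)
    return best


held = ["dc=example,dc=com", "o=example.com"]
print(find_suffix("uid=bjensen,dc=example,dc=com", held))  # dc=example,dc=com
print(find_suffix("uid=bjensen,dc=other,dc=org", held))    # None
```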

If you are designing a namespace for your department, which is only one part of a larger tree designed for the company, your suffix will name the entry at the top of your department's tree. For example, the Engineering department at Example Electronics might choose the suffix ou=Engineering,dc=example,dc=com. Figure 9.8 shows examples of suffixes.

Figure 9.8. Examples of Directory Suffixes

Often a directory server may hold more than one suffix. You may want to use this capability to design a service with multiple suffixes if you have two or more directory trees of information that do not have a natural common root.

Flat and Hierarchical Schemes

The next choice you need to make when you're designing a namespace is whether to use a flat or hierarchical scheme, and if you choose a hierarchical scheme, what type of hierarchy to construct. Of course, this is not a binary decision; your real decision is how much hierarchy and what type to introduce. As a guiding design principle, you should strive to make your namespace as flat as possible.

Name changes are typically one of the more burdensome administrative tasks of running a directory, inconvenient for both administrators and users. The flatter a namespace is, the less likely names are to change. All other things being equal, one would expect the likelihood of a name change to be proportional to the number of components in the name with the potential for change. The more hierarchical a namespace is, the more components it has and the longer the names are. The longer a name is, the more likely it is to change. Thus, shorter, flatter names will change less frequently. Figure 9.9 shows an example of a flat namespace that requires only short names.

Figure 9.9. A Flat Namespace That Minimizes Name Changes

Tip

Make your namespace as flat as possible, while still meeting your other needs concerning topology, replication, and access control. Flat names change less and are easier to administer. Long names introduce needless complexity and administrative burden.

Of course, there are equally valid reasons to introduce a certain amount of hierarchy into a namespace. As described in the previous section, hierarchy may be required to enable data partitioning among multiple servers, replication, and certain kinds of access control. In addition, hierarchy can be useful to applications that want to browse the directory, although these applications are often better served by construction of virtual directory browsing views using attributes such as seeAlso, which can refer to other directory entries.

If you anticipate a centralized directory small enough to exist on a single machine, there is no need to introduce hierarchy to enable data distribution. Such a directory is not as constraining as it may sound. An average-sized Pentium-class machine can handle a directory on the order of millions of entries (depending on your directory implementation, of course), and it could be replicated to several other machines to handle additional client search load.

Another reason to introduce hierarchy is to enable the distribution of administrative authority via access control. For example, suppose that you want to allow an administrator from the Marketing department to have control over marketing entries, the Engineering administrator to have control over engineering entries, and so on. Many directory products allow this kind of administrative distribution of control only at branch points in your namespace. With such a system, different access control rules cannot easily be applied to the same subtree.

Some modern systems allow the setting of access control on the basis of directory content rather than the directory namespace. With Netscape Directory Server 6, for example, you can define a single access control rule stating that the Engineering administrator has access to all entries that have an attribute value indicating that they belong to the Engineering department (for example, ou=engineering ). Carefully examine your chosen directory implementation's access control capabilities to make sure you understand how they will restrict your namespace design, or look for software that supports your preferred design.
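The difference between the two styles of rule can be sketched in Python. This is a conceptual illustration only; it does not reproduce any server's access control syntax, the function names are ours, and entries are modeled as plain dictionaries.

```python
def under_subtree(dn: str, subtree: str) -> bool:
    """Namespace-based rule: the entry must sit at or below a branch point."""
    dn, subtree = dn.lower(), subtree.lower()
    return dn == subtree or dn.endswith("," + subtree)


def matches_content(entry: dict, attr: str, value: str) -> bool:
    """Content-based rule: the entry must carry a given attribute value."""
    return value.lower() in (v.lower() for v in entry.get(attr, []))


# An entry in a flat namespace (the ou=People branch is an assumed name).
entry = {"dn": "uid=babs,ou=People,dc=example,dc=com", "ou": ["Engineering"]}

# No Engineering branch exists, so a subtree rule cannot select this entry...
print(under_subtree(entry["dn"], "ou=Engineering,dc=example,dc=com"))  # False
# ...but a content rule equivalent to the filter (ou=engineering) can.
print(matches_content(entry, "ou", "engineering"))                     # True
```

With content-based rules available, the namespace can stay flat while administrative authority is still divided by department.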

If you do need to introduce hierarchy in your namespace, try to do so sparingly and in a way that avoids problematic name changes as much as possible. The reason for needing hierarchy in the first place may nullify much of your flexibility. For example, if you need hierarchy to distribute authority to different departments, there is not much hope in avoiding a name change when a user changes departments.

However, name changes can be avoided if you are able to design your hierarchy on the basis of information that is not connected to directory information that is likely to change. For example, you could base your hierarchy on the types of objects in each tree, with one area of the tree for people, another for groups, and so on. It is unlikely, to say the least, that an entry would need to move from one area of the hierarchy to another with a scheme like this. This kind of partitioning can make replication easier in some cases as well. Figure 9.10 shows an example of this kind of hierarchical namespace.

Figure 9.10. A Hierarchical Namespace in Which Data Is Not Likely to Change
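A short Python sketch shows why such a scheme keeps names stable. The branch names ou=People and ou=Groups are assumptions made for illustration, as is the helper build_dn.

```python
SUFFIX = "dc=example,dc=com"
# Assumed branch names, one per type of object.
CONTAINERS = {"person": "ou=People", "group": "ou=Groups"}


def build_dn(kind: str, naming_attr: str, value: str) -> str:
    """Build a DN whose location depends only on the type of object."""
    return f"{naming_attr}={value},{CONTAINERS[kind]},{SUFFIX}"


babs = {"dn": build_dn("person", "uid", "babs"), "ou": ["Engineering"]}
print(babs["dn"])  # uid=babs,ou=People,dc=example,dc=com

# A department change updates an attribute; the DN is untouched.
dn_before = babs["dn"]
babs["ou"] = ["Marketing"]
print(babs["dn"] == dn_before)  # True
```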

Naming Attributes

After you've decided on the basic structure of your namespace and the level of hierarchy, you need to decide on attributes to use when you're naming entries. The attribute you should use depends on the type of entry you are naming and other requirements at your site. In this section we present some general principles that you can apply to naming all kinds of entries.

The only requirements imposed on naming attributes by the LDAP model are these:

The RDN of the entry must be chosen from one or more of the entry's attributes.
The RDN must be unique among all its sibling entries (other entries that have the same parent).

Although these are the only restrictions imposed by the LDAP model, we suggest that you adopt a policy ensuring that all RDNs for people are unique across your entire directory. This policy has the benefit that, even if you need to move an entry to a new location in your namespace, its name will not clash with another entry's name.
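The difference between the LDAP requirement and the stricter policy can be sketched as a check over a list of DNs. This is a simplified illustration with helper names of our own; it assumes the first RDN contains no escaped commas and compares names case-insensitively.

```python
from collections import defaultdict


def split_first_rdn(dn: str):
    """Return (rdn, parent_dn), both lowercased for comparison."""
    rdn, _, parent = dn.partition(",")
    return rdn.strip().lower(), parent.strip().lower()


def find_clashes(dns):
    """Report RDN clashes among siblings and across the whole directory."""
    by_parent = defaultdict(list)  # (parent, rdn) -> DNs: the LDAP requirement
    by_rdn = defaultdict(list)     # rdn -> DNs: the stricter suggested policy
    for dn in dns:
        rdn, parent = split_first_rdn(dn)
        by_parent[(parent, rdn)].append(dn)
        by_rdn[rdn].append(dn)
    sibling = {k: v for k, v in by_parent.items() if len(v) > 1}
    directory_wide = {k: v for k, v in by_rdn.items() if len(v) > 1}
    return sibling, directory_wide


dns = [
    "uid=babs,ou=Engineering,dc=example,dc=com",
    "uid=babs,ou=Marketing,dc=example,dc=com",
    "uid=ted,ou=Marketing,dc=example,dc=com",
]
sibling, directory_wide = find_clashes(dns)
print(len(sibling))            # 0
print(sorted(directory_wide))  # ['uid=babs']
```

Both babs entries are legal under the LDAP model because they have different parents, but moving either one into the other's branch would cause a clash.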

To meet this additional restriction, there are two approaches you can take: (1) You can name entries using an existing name that is already guaranteed to be unique, or (2) you can generate your own unique names for entries. We discuss these two approaches next.

Naming Entries by Using Existing Unique Names

Your organization may already have a method for assigning unique names to users. Many companies assign a unique user ID to an employee when she is hired, and the employee uses this ID to log in to various computing services throughout the company. If the user IDs assigned by this organizational process are known to be unique, they can serve as the naming attributes for users in your directory.

If you are fortunate enough to have such a method already deployed, we suggest that you use the unique name assigned by this process as the uid attribute of each user's entry, and that you name the entry with the uid attribute.

For example, suppose that during the process of being hired at Example Electronics, Inc., Barbara Jensen chose the login name "babs." Using our suggestion, the uid attribute of Barbara's entry would be babs, and the entry would be named by that attribute. Assuming that Barbara works in the Engineering department, that a separate branch exists for that department, and that RFC 2247-style naming is in use, her entry's DN would be uid=babs,ou=Engineering,dc=example,dc=com.

Because the login ID "babs" has been guaranteed to be unique by an external organizational process, we can be certain that no other "babs" will end up in another branch of our directory. Thus we will not need to worry about name clashes if Barbara moves to a different department within the company.

Be careful when choosing existing unique identifiers for users. Some naturally occurring naming attributes may be sensitive, and you may not want to use them for fear of unintentionally revealing information that should not be revealed. A U.S. Social Security number is a good example of such an attribute. If you use it to name your people entries, you guarantee uniqueness, but at the expense of publishing everybody's Social Security number, a practice guaranteed to make you highly unpopular.

Naming Entries by Constructing New Unique Identifiers

If your organization does not currently have a process for assigning login names to people, you have more work to do.

You might consider creating such a process yourself. Creating a unique login ID for each new employee is not all that difficult, and it is beneficial. You can guide users in choosing a unique ID by allowing them to propose an ID and checking whether the name is in use by searching the directory. If the name is not in use, the user's entry can be created. Otherwise, the name clash is reported to the user, and a new name can be chosen.
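The propose-and-check loop might look like the following sketch. The set ids_in_use is a stand-in for a directory search on the uid attribute; a real deployment would query the directory instead, and the function name is our own.

```python
def claim_login_id(proposed: str, ids_in_use: set) -> bool:
    """Reserve the proposed ID if it is free; report a clash otherwise."""
    candidate = proposed.strip().lower()
    if not candidate:
        raise ValueError("login ID must not be empty")
    if candidate in ids_in_use:
        return False             # clash: ask the user to pick another name
    ids_in_use.add(candidate)    # stand-in for creating the user's entry
    return True


ids_in_use = {"babs", "ted"}
print(claim_login_id("Babs", ids_in_use))     # False
print(claim_login_id("bjensen", ids_in_use))  # True
```

Note that checking and then adding are two separate steps here; if several people can propose names at once, the step that actually creates the entry should still be prepared for a clash.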

Another approach to guaranteeing uniqueness is to artificially make the attribute you have chosen unique, perhaps by appending a number. For example, suppose that you choose the cn attribute to name entries and have two entries with the same parent that would otherwise both be named cn=Barbara Jensen. You could append a number to one or both of the names, making cn=Barbara Jensen 1 and cn=Barbara Jensen 2 the names of the two entries.
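A minimal sketch of the numbering approach follows. It takes the "one of the names" option: a name that is still free is left alone, and only a clashing name receives a number. The function name and the list of existing names are illustrative assumptions.

```python
def make_unique_cn(cn: str, existing) -> str:
    """Return cn unchanged if free, else cn with the first unused number."""
    taken = {name.lower() for name in existing}
    if cn.lower() not in taken:
        return cn
    n = 1
    while f"{cn} {n}".lower() in taken:
        n += 1
    return f"{cn} {n}"


siblings = ["Barbara Jensen"]
second = make_unique_cn("Barbara Jensen", siblings)
print(second)  # Barbara Jensen 1
siblings.append(second)
print(make_unique_cn("Barbara Jensen", siblings))  # Barbara Jensen 2
```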

Although this approach may have some aesthetic value, it has drawbacks. In our experience, users generally dislike having their names changed in any way, even for such a clear administrative reason. It may well be better to use something more arbitrary with no value to the user. The scheme is also more difficult to maintain because it requires an external mechanism to manage the process of making names unique. Be sure to pilot any user-naming decisions with your user community; it's often difficult to predict what they will like and dislike. This is another good reason to hide DNs from your users.

The LDAP model also allows the use of multiple attributes from an entry to form a multivalued RDN. The idea behind this capability is to use the additional attributes to distinguish entries that otherwise would have the same name. For example, suppose that you have two users named Barbara Jensen, one in the California office and the other in the Michigan office. Using multivalued RDNs, you could distinguish between these two entries by naming one cn=Barbara Jensen + L=California and the other cn=Barbara Jensen + L=Michigan.

This practice tends to lead to long, complicated names that change frequently (what if either Barbara moves?). Also, some directory implementations, such as Netscape Directory Server 6, do not fully support multivalued RDNs. For these reasons, we strongly discourage their use and encourage you instead to use one of the other naming conflict resolution strategies discussed.

A final approach you might consider is to make up a meaningless identifier that is unique. For example, you might generate a random number and use that as the cn attribute for a new entry. Although this approach has no aesthetic appeal, it meets the requirements for uniqueness, and if your LDAP clients hide entry names from users, there's no reason not to adopt this approach.
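One way to generate such an identifier is sketched below, using a random UUID from the Python standard library in place of a plain random number. The ou=People branch is an assumed name.

```python
import uuid


def new_opaque_id() -> str:
    """Return a random 32-character hexadecimal identifier."""
    return uuid.uuid4().hex


entry_id = new_opaque_id()
dn = f"cn={entry_id},ou=People,dc=example,dc=com"
print(dn)
```

A random 128-bit value makes accidental collisions extremely unlikely, but the create operation should still treat an existing name as an error rather than assume uniqueness.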

Application Considerations

Most people do not design and run a directory service for its own sake. Typically, the directory is required to support one or more directory-enabled applications. The requirements these applications place on the namespace and other aspects of your design are important design considerations. After all, if your directory does not satisfy the requirements of the applications driving its deployment, your chances of postdeployment employment are small.

Multivalued RDNs and Client Complexity

Multivalued RDNs pose a difficult problem for LDAP client software writers. Often LDAP client software needs to compare two DNs for equivalence. Multivalued RDNs make this difficult because each RDN is a set, according to X.500 standards (on which the LDAP standards are based). In mathematical terms, a set is an unordered collection of items. This means that the individual attributes that make up a multivalued RDN may appear in any order in the RDN. For example, the name cn=Barbara Jensen + L=California,dc=example,dc=com refers to the same entry as the name L=California + cn=Barbara Jensen,dc=example,dc=com. Clients that need to compare DNs need to be able to understand this. In our experience, few clients properly handle this situation. This is another reason we recommend that you avoid using multivalued RDNs.
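The set semantics described in this sidebar can be sketched in Python by treating each RDN as a frozenset of attribute-value pairs while keeping the RDNs themselves in order. This is a simplified comparison under stated assumptions: values are compared case-insensitively, and hexadecimal escapes and attribute-specific matching rules are ignored. The helper names are our own.

```python
def split_unescaped(text: str, sep: str):
    """Split on sep wherever it is not preceded by a backslash escape."""
    parts, current, escaped = [], [], False
    for ch in text:
        if escaped:
            current.append(ch)
            escaped = False
        elif ch == "\\":
            current.append(ch)
            escaped = True
        elif ch == sep:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def normalize_dn(dn: str):
    """Turn a DN into a tuple of RDNs, each an unordered set of pairs."""
    rdns = []
    for rdn in split_unescaped(dn, ","):
        pairs = set()
        for ava in split_unescaped(rdn, "+"):
            attr, _, value = ava.partition("=")
            pairs.add((attr.strip().lower(), value.strip().lower()))
        rdns.append(frozenset(pairs))
    return tuple(rdns)  # RDN order matters; order within an RDN does not


def same_dn(a: str, b: str) -> bool:
    return normalize_dn(a) == normalize_dn(b)


print(same_dn("cn=Barbara Jensen + L=California,dc=example,dc=com",
              "L=California + cn=Barbara Jensen,dc=example,dc=com"))  # True
print(same_dn("dc=com,dc=example", "dc=example,dc=com"))              # False
```

A client that compares DNs as plain strings would treat the first pair as different entries, which is exactly the mistake described above.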

The requirements that an application can place on your directory are as varied as the applications themselves. Lest you become dismayed and think that anticipating the needs of an endless parade of different applications is a lost cause, consider the following.

First, focusing on the needs of directory applications existing or being deployed in your organization today will probably provide you with a fairly representative cross section of requirements. Make sure that you understand these needs as well as possible before you consider yourself finished with your directory design (see Chapter 6, Defining Your Directory Needs). Piloting your directory on a smaller, test-scale deployment is also a good idea.

Second, some general principles you can follow will help prepare you for that future parade of directory-enabled applications. These principles are important to keep in mind both when you're designing your directory and when you're writing a directory-enabled application. For more information, see Chapter 21, Developing New Applications.

A well-written directory-enabled application makes a concerted effort to assume as little as possible about the directory service it will access. An application should be configurable and capable of adapting to new namespaces, new types of acceptable queries, schema differences, nonstandard port numbers, new host names, and more. Of course, not all applications can provide this kind of flexibility. How can you design your namespace to anticipate as many of these problems as possible?

If your existing needs allow it, one good approach is to use a standard namespace design, such as the RFC 2247 style of naming we described earlier in this chapter. Because this naming method is documented in a standards document, it's likely that application vendors will support it.

Another good approach is to be conservative when picking the attribute used to name entries. Try to use a standard attribute such as cn or uid. If you're considering creating a new attribute to hold the naming value (for example, employeeID), consider placing the value of the employeeID attribute in the cn or another standard attribute, and then use that attribute for naming instead of employeeID. Although this approach might seem aesthetically unpleasant, applications that assume standard attributes for the namespace will not become confused.
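As a sketch of this advice, the following Python helper copies the employee ID into the standard uid attribute and names the entry with it. The entry is modeled as a dictionary; the function name, the sample ID, and the ou=People branch are assumptions, and employeeID stands for a site-defined attribute that would require a schema extension.

```python
def build_person_entry(employee_id: str, full_name: str,
                       parent: str = "ou=People,dc=example,dc=com"):
    """Return (dn, attributes) for an entry named by a standard attribute."""
    attrs = {
        "cn": [full_name],
        "uid": [employee_id],         # standard attribute used for naming
        "employeeID": [employee_id],  # site-defined attribute, kept for lookups
    }
    return f"uid={employee_id},{parent}", attrs


dn, attrs = build_person_entry("e12345", "Barbara Jensen")
print(dn)  # uid=e12345,ou=People,dc=example,dc=com
```

An application that assumes entries are named by uid can work with this directory without knowing that employeeID exists.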

Namespace and other directory design choices like this are common. When starting from scratch, you can often afford to make things aesthetically pleasing as well as functional. But it is rare, unfortunately, that you will be able to start completely from scratch without worrying about any existing applications.

Administrative Considerations of Naming Attributes and RDNs

When designing your namespace, consider the effect the namespace will have on common administrative tasks. For example, when you're adding an entry, can the naming attributes be generated automatically, or is it a manual process? When entries are deleted, can their naming attributes be reused, or should they forever be reserved for the deleted entry? What effect will a name change have? How often are names likely to change? What other things depend on the namespace? The answers to these questions are seldom independent of the design decisions discussed in the previous sections. In this section we discuss the administrative implications of those decisions.

If your organization already has a unique identifier assigned to each user (for example, an employee number, login name, or user ID), it may make sense to use it as the value of the naming attribute. This saves you the administrative burden of devising and maintaining another unique identifier, and it is a good solution for naming user entries.

Other entries, perhaps for printers, groups, or other entities, are another matter. In either case, using an existing attribute type can eliminate another small administrative task: defining a new attribute type to use when naming entries. It also reduces the likelihood that a less-than-intelligent client could be confused by an unknown attribute.

Maintenance of the naming attribute is also a consideration. Whether or not the directory is the ultimate source of authority, the problem of reusability must be addressed. Depending on your policies, namespace identifiers like user login names might be (1) assigned only once and never reassigned, (2) reassigned after a suitable interval, or (3) reassigned immediately. Whatever policy you choose, it must be enforceable.

Most directory software does not support an out-of-the-box namespace reuse policy. Instead you have to enforce such a policy either through an external administrative agent or through an extension to the directory software itself. For example, if a user leaves your organization, you may want to avoid assigning the same login name to a new user for a few months. One way to do this is to mark the terminated employee's entry in a special way (possibly by marking it with a special objectclass attribute value and removing the password). Because the entry is still in the directory, it prevents the name from being reused, but it is impossible to authenticate as that user. It's also useful to store the employee's termination date in the entry so that an automated task can clean up these deletion records.
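The reuse policy just described can be sketched as an external administrative task. The directory is modeled as a dictionary keyed by login name; the marker object class terminatedPerson, the terminationDate attribute, and the 90-day hold period are all hypothetical choices made for illustration.

```python
from datetime import date, timedelta

HOLD_PERIOD = timedelta(days=90)  # assumed length of "a few months"


def retire_entry(entry: dict, today: date) -> None:
    """Mark an entry as terminated instead of deleting it."""
    # "terminatedPerson" and "terminationDate" are hypothetical schema names.
    entry.setdefault("objectClass", []).append("terminatedPerson")
    entry.pop("userPassword", None)  # no password, so no one can authenticate
    entry["terminationDate"] = [today.isoformat()]


def purge_expired(directory: dict, today: date) -> list:
    """Delete retired entries whose hold period has passed; return their IDs."""
    expired = [
        uid for uid, entry in directory.items()
        if "terminationDate" in entry
        and today - date.fromisoformat(entry["terminationDate"][0]) >= HOLD_PERIOD
    ]
    for uid in expired:
        del directory[uid]
    return expired


directory = {"babs": {"objectClass": ["person"], "userPassword": ["secret"]}}
left_on = date(2024, 1, 15)
retire_entry(directory["babs"], left_on)
print("babs" in directory)                                     # True
print(purge_expired(directory, left_on + timedelta(days=30)))  # []
print(purge_expired(directory, left_on + timedelta(days=90)))  # ['babs']
```

While the retired entry remains, any attempt to create a new entry with the same name fails, which is what enforces the hold period.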

Almost inevitably, names will change for various reasons. If you choose a naming attribute that has any significance other than its uniqueness, it can change. If you choose a naming attribute that is related to a real-world attribute of the entity being named, or if you choose a hierarchical namespace whose upper components could change, names will also change. The consequences of a name change should be considered carefully. How much trouble will the change cause, and what is the likelihood of its occurrence?

Tip

Beware of thinking that a name change will be the exceptional case, so rare that you would not mind handling such occurrences through even a tedious manual process. Our experience shows that such trouble has a habit of occurring more frequently than you might imagine. You also need to consider what will happen as your directory grows (a prediction likely to come true). What may seem uncommon in a small directory can become a downright nuisance in a large one.

Privacy Considerations

Directory names are usually public information available to anyone who can access the directory. Trying to control access to names via your directory's access control mechanism can often lead to difficulties. In practical terms, it's not possible to hide directory names from clients, no matter how advanced the access control capabilities in your server are. For this reason, you must carefully consider the privacy implications of your namespace design. Your goal should be not to divulge any information through the namespace that you do not intend to divulge.

For example, if you design a namespace for your people entries based on organizational hierarchy, you reveal the part of your organization in which an entry (and presumably the corresponding person) resides. The same problem holds true for many other hierarchical namespace designs.

As described earlier, the attribute you choose to name your entries may itself be considered sensitive. We saw an example earlier involving the use of Social Security numbers as naming attributes. Publishing them in entry names would clearly be a bad idea, which is why we recommended against it.

There are other, more subtle privacy concerns as well. For example, using the cn attribute containing a person's name to name entries has a host of implications. A person's gender can often be inferred from his or her name, as can other information such as nationality or ethnicity. Moreover, a name is often enough to gain other information (such as an address or phone number) from other publicly available sources.

Care should be taken to protect privacy and to ensure that unwanted disclosure of information is minimized. Keep in mind that things obviously acceptable to you may be completely unacceptable to some of your users. For example, you might not mind disclosing your name or even your address to everyone in your company or the world. However, one of your users who might be concerned about potential harassment or stalking, or even worse, might feel differently.

Try to design a namespace that is free of such considerations, and be prepared to make exceptions for people who have legitimate concerns with any design you come up with. It may be a good idea to involve your Legal department to help interpret legal issues associated with directory information privacy. Directory privacy is covered in more detail in Chapter 12, Privacy and Security Design.

Anticipating the Future

Finally, as difficult as it may be, you must try to anticipate the future when designing your namespace. The reason is simple: Redesigning a namespace is a costly and inconvenient process that you want to avoid. Because none of us have a crystal ball, the best we can do is try to avoid common situations in which namespace changes are required.

The question naturally arises, therefore, about the kinds of situations that precipitate a namespace redesign. Here are some of the more common situations:

Choosing the wrong naming attribute can easily lead to a namespace redesign. For example, if you name entries with the cn attribute using a value of first name followed by last name, what do you do when two people have the same name? Either a namespace redesign is required or you must be prepared to artificially make one of the names unique, as described previously.
If your directory starts out under central administrative control, but later you decide to delegate control of a portion of the data, a namespace redesign may be required. As mentioned earlier, some access control implementations do not allow delegation except at subtree boundaries. The same is true for replication and partitioning of the data.
If you choose a hierarchical namespace with a hierarchy based on a geographical, organizational, or other scheme that is likely to change, constant namespace redesigns, both big and small, may haunt you. It is best to avoid this situation altogether from the start. If you choose to reflect your organizational hierarchy, for example, a namespace redesign will be required each time your company reorganizes. For some reorganization-happy companies, this can be a problem!

Although no one can accurately predict the future, you can use some defensive namespace design tactics to minimize your risk. Choosing a flat namespace is one such tactic. Subdividing your namespace on the basis of unchanging information (perhaps into areas for people, groups, and devices) permits redesigns in one area that do not affect the others.