2.1 The Active Directory | Microsoft Exchange Server 2003 Administrators Pocket Consultant

Messaging people know what a directory is because they have had to deal with many different versions of directories over the years. Exchange has had its own directory, Lotus Notes has its address book, and the Sun iPlanet mail server has an LDAP directory.

From a messaging standpoint, the ideal directory is:

Available: If an application depends on a directory, it must be there all the time. If the application is distributed around an enterprise, often on a global level, then the directory must also be distributed and available wherever user populations or servers are found.
Secure: Most access to a directory is in read operations, with a relatively small proportion of writes. Any access must be secure. Users must be able to see information they are authorized to access but nothing else. Only privileged users or applications should be able to update the directory.
Scalable: Many Exchange servers support small communities, but some support very large corporate populations that span several hundred thousand mailboxes. As a directory gets larger, its responsiveness must be consistent.
Accessible: The directory should be open to as many clients as possible through the provision of different client interfaces. Once they get in, all clients should be able to perform roughly equivalent operations against the directory at comparable speeds. In other words, no client should be favored or penalized in relation to another.

The Internet has driven the definition of many standards to the benefit of everyone. LDAP and SMTP are obvious examples. During the debates to formulate standards, the question of how to define a directory has been considered many times. Here is a widely quoted description that comes from the "Introduction to Slapd and Slurpd," published by the University of Michigan:^[1]

"A directory is like a database but tends to contain more descriptive, attribute-based information. The information in a directory is generally read much more often than it is written. As a consequence, directories don't usually implement the complicated transaction or rollback schemes regular databases use for doing high-volume complex updates. Directory updates are typically simple all-or-nothing changes, if they are allowed at all. Directories are tuned to give quick response to high-volume lookup or search operations. They may have the ability to replicate information widely in order to increase availability and reliability while reducing response time. When directory information is replicated, temporary inconsistencies between the replicas may be OK, as long as they get in sync eventually."

It is good to compare this description of the characteristics of the ideal directory with Microsoft's implementation of the AD. In most cases, Microsoft has done a good job of matching up the AD against the definitions, at least on a feature-by-feature level. Implementation is the true test of any software, and there is no doubt that a bad implementation will ruin any product.

From Windows 2003, AD supports application partitioning to allow applications to create their own section of the directory and decide what data to hold and how and where it replicates. Some applications generate configuration and other data that is unsuited to the replication and control mechanisms used by a general-purpose directory such as AD, so it is good to have the ability to exercise more control. For example, if you install AD-integrated DNS in a new Windows 2003 deployment, DNS uses its own partition to avoid the need to replicate DNS information to GCs. However, neither Exchange 2000 nor Exchange 2003 uses this feature.

2.1.1 Multiple forests or just one

When Microsoft initially released Exchange 2000, best practice for AD designs was very simple. Exchange supports a single organization per AD forest, so Microsoft and system integrators told customers that they should deploy a single AD forest and use that as the foundation of their deployment. Another major point of best practice was a focus on reducing the overall number of domains to simplify the infrastructure in response to the fact that most companies had deployed far too many Windows NT domains.

Some customers required multiple Exchange organizations, perhaps because they wanted to keep separate email systems for different operating units within the company or, less often, because they wanted to use different AD schemas. However, because no utilities existed to integrate the AD forests to create a single view of the Exchange organizations, not many companies took this route.

Over time, system designers realized that the restriction of one organization per forest limited the flexibility of Exchange in a period of great corporate volatility when mergers and acquisitions were the order of the day. Another influence was the realization that the forest, not the domain as had been assumed, is the only true security boundary for Windows. Since 2000, we have seen increasing interest in running multiple Exchange organizations and a corresponding requirement for utilities to bridge the gap between the organizations. Microsoft's normal answer to this need includes products such as Microsoft Metadirectory Services (MMS), utilities to synchronize public folders between organizations, and documentation describing how to implement Exchange in a multiforest environment. Other companies have their own utilities. For example, HP's LDAP Directory Synchronization Utility^[2] (LDSU) synchronizes entries from any LDAP-compliant directory, and you can use LDSU to synchronize entries between different instances of the AD.

If you start an AD design today, you may consider running two or more Exchange organizations and therefore two or more instances of AD. It is true that the larger the company, the more obvious the need to consider the multiforest option. If you are in this situation, you should take the time to read the latest opinions on the subject, including the white papers available from Microsoft. For now, we can summarize the situation as follows:

The Exchange architecture assumes that servers run inside a single forest, so you attain full functionality immediately by deploying this model. Therefore, always commence a design by assuming that you will use a single instance of the AD and a single Exchange organization. This is always the best option when you operate a highly centralized management model where you can maintain tight control over security and access to sensitive systems such as domain controllers.
Only consider a multiforest deployment when you have a genuine need for such an implementation. For example, you may have an administrative model that gives great control to many units within the company, all of which want to manage their own servers and applications. Remember that multiple forests increase complexity and introduce the need for synchronization that does not exist in a single forest. In addition, you will probably need to deploy additional hardware (servers) to form the core of each forest. You may also need to deploy multiple instances of add-on applications to serve each forest. For example, a single instance of the BlackBerry Enterprise Server may be sufficient to serve users who have mailboxes on multiple servers in a small Exchange organization. With multiple organizations, you would need multiple BlackBerry Enterprise Servers. Management software may not be able to monitor and control servers in multiple organizations.

We can assume that the number of multiforest deployments will increase in the coming years and that Microsoft will respond with better tools and better flexibility in both Windows and Exchange. The basic building block remains the AD, so that is where we need to head next.

2.1.2 Components of the Active Directory

The AD is core to Windows and provides the repository for the operating system to store essential information required to run itself as well as dependent applications such as Exchange. As shown in Figure 2.1, many different objects can be stored in the AD, including:

Users and groups
Security credentials such as X.509 certificates
Information about computers and applications
Configuration data, including information about Windows sites
DNS routing information

Figure 2.1: The contents of the Active Directory.

The schema defines the information held in the AD, including objects and their attributes. Applications can extend the schema to define their own objects or add properties to existing objects. Exchange relies on the AD for all user management, including the creation, modification, and deletion of mailboxes, and therefore extends the default schema to add support for messaging properties, such as email addresses, the store where a user's mailbox is located, and so on. You can add messaging attributes to any object that can be mail enabled (users, contacts, public folders, and groups), although the same set of attributes is not available for each object type. Section 2.8.3 discusses how you can change the schema.

2.1.3 Moving to a Windows namespace

The AD uses a hierarchical namespace. From Windows 2000 onward, DNS provides the default naming service for Windows, so Windows aligns the AD namespace with DNS. All objects in the directory have distinguished names based on the DNS namespace as implemented within a company. This is quite different from Windows NT, which uses a flat namespace.
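The alignment between DNS names and distinguished names can be illustrated with a short sketch. It assumes the usual AD convention in which each label of the DNS domain name becomes a `DC=` component of the distinguished name. The domain name `sales.example.com`, the user name, and the helper function names are hypothetical, and the sketch does not handle the escaping of special characters in names.

```python
def domain_to_dn(dns_name: str) -> str:
    """Map a DNS domain name to the distinguished name of the AD domain.

    Assumes the usual convention: one DC= component per DNS label.
    """
    labels = [label for label in dns_name.strip(".").split(".") if label]
    if not labels:
        raise ValueError("empty DNS domain name")
    return ",".join(f"DC={label}" for label in labels)


def object_dn(common_name: str, dns_name: str, container: str = "CN=Users") -> str:
    """Build the DN of an object held in a container of the given domain.

    Illustrative only: special characters in common_name are not escaped.
    """
    return f"CN={common_name},{container},{domain_to_dn(dns_name)}"


if __name__ == "__main__":
    # Hypothetical domain and user, chosen only for illustration.
    print(domain_to_dn("sales.example.com"))
    print(object_dn("Jane Doe", "sales.example.com"))
```

Because the distinguished name is derived from the DNS name, every object carries its position in the hierarchy with it, which is exactly what a flat Windows NT namespace cannot express.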

Figure 2.2 illustrates the difference. The Windows NT namespace comes from a set of three master account domains (in this case, named dom1, dom2, and dom3). We also find that each master account domain has a set of resource domains beneath it. Each of the resource domains has a separate namespace, and one-way trust relationships form the link between each resource domain and its master account domain. Most large Exchange 5.5 deployments use this type of Windows NT domain structure to isolate the Exchange servers from the master account domains and ensure that access to the servers is limited to accounts that hold the necessary permissions in the resource domain.

Figure 2.2: Windows NT and AD namespaces.

2.1.4 Forests, trees, and domains

DNS provides the foundation for the Windows namespace. A single Windows domain is a very simple namespace, but most organizations will have a namespace consisting of one or more trees of Windows domains. Windows defines a domain tree as a hierarchical organization of domains joined by trust relationships. Trees may be joined together to form a single forest, but in some cases the trees will be kept separate and may not be joined at all.

Figure 2.3 illustrates a design for an AD forest that spans three trees: one each for Compaq, Tandem, and Digital, the companies that merged in the 1996-1998 period to form Compaq Computer Corporation. Of course, Compaq merged with HP in 2002, and the AD designers had a chance to perform a Windows NT migration all over again, because HP still used Windows NT. After looking at various options, they made the decision to continue using the same AD forest and use their well-proven migration techniques to move users and other objects over into the forest. However, the details of how Compaq and HP came together in a technical sense around a common AD forest are enough material for a separate book, so for the moment we can use the original situation at Compaq to explore some of the design issues that you have to consider when you deploy AD.

Figure 2.3: AD forest and trees.

The domains in each tree have a contiguous namespace, meaning that the namespace has a common root based on the name given to each domain at its level within the tree. Thus, the us.compaq.com domain shares a common namespace with compaq.com, as does the sales.us.compaq.com domain. Windows defines a forest as a collection of trees joined by Kerberos trust relationships. The three trees for Compaq, Tandem, and Digital form a forest, but each tree has its own namespace, which it does not share with the other trees. Thus, we say that a forest has a discontiguous namespace. Each of the trees maintains its own DNS database and namespace, but when you join the trees to form the forest, you can manage it as a single entity, providing you have the appropriate permissions. All the domains in a forest share a common configuration and schema, and the different domains replicate partial details of all of their objects to Global Catalog servers so that we can establish a single view of objects across the entire forest.
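The difference between a contiguous tree and a discontiguous forest can be sketched as a simple grouping exercise. The code below is only an illustration of the naming rule, not of how Windows itself builds a forest: a domain belongs to a tree when its DNS name ends with the name of the tree root. The names `tandem.com` and `digital.com` are assumed here to stand for the other two tree roots, and the function names are hypothetical.

```python
def is_descendant(child: str, ancestor: str) -> bool:
    """Return True if child sits below ancestor in a contiguous namespace."""
    child_labels = child.lower().strip(".").split(".")
    ancestor_labels = ancestor.lower().strip(".").split(".")
    return (
        len(child_labels) > len(ancestor_labels)
        and child_labels[-len(ancestor_labels):] == ancestor_labels
    )


def group_into_trees(domains):
    """Group the domains of a forest into trees, keyed by tree root.

    Shorter names are processed first so that each root is seen
    before the domains beneath it.
    """
    names = sorted(
        {d.lower().strip(".") for d in domains},
        key=lambda name: (name.count("."), name),
    )
    trees = {}
    for name in names:
        root = next((r for r in trees if is_descendant(name, r)), None)
        if root is None:
            trees[name] = [name]      # a new tree with its own namespace
        else:
            trees[root].append(name)  # shares the namespace of its root
    return trees


if __name__ == "__main__":
    # tandem.com and digital.com are assumed names for the other tree roots.
    forest = ["compaq.com", "us.compaq.com", "sales.us.compaq.com",
              "tandem.com", "digital.com"]
    for root, members in group_into_trees(forest).items():
        print(root, "->", members)
```

Each key in the result is a tree with a contiguous namespace; the fact that the keys share no common suffix is what makes the namespace of the forest discontiguous.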

When the Compaq designers looked at the right AD structure for the organization, they could have decided to keep separate namespaces for each operating company and unify everything through the forest. However, the namespace was only one issue to consider when deciding what type of design to use. In other situations, such as when an ASP needs to host multiple companies within a single AD, you find that the opposite applies and multiple namespaces are used, one for each company. Other important factors included the need to unify the three companies with a common network built on the class "A" IP address space owned by Digital^[3] and a desire to replace three different Windows NT designs with a common approach.

After considering all of the alternatives, Compaq decided to build a brand new Windows infrastructure instead of upgrading the three Windows NT implementations and attempting to unify objects in a forest of domains. Another factor that drove this decision is that a straight migration of many Windows NT infrastructures does not result in particularly good AD designs. This is understandable when you consider that many of the features exploited by administrators (such as resource domains) are workarounds to compensate for limitations in Windows NT, such as its limited security model. Other large companies took similar decisions when the time came for them to move to Windows 2000 and, in general, the approach of building a new Windows infrastructure and then cloning user accounts when they need to move is a good and valid method of implementing Windows. However, if you run a small company that has a single domain and just a few servers, a direct upgrade is the right path for you to take.

When finally deployed, Compaq's new Windows infrastructure used the cpqcorp.net namespace, retaining the existing compaq.com name for external communications. Servers received cpqcorp.net names as they joined the domain, and users started to use the new namespace as their accounts moved over.

It is worth emphasizing that the decision to use two separate namespaces for internal and external names has nothing whatsoever to do with Windows. Compaq made the decision for administrative convenience to avoid any confusion between internal and external systems and to make it easier to configure proxies for browsers. Thus, because I am located in Europe, I log on to the emea.cpqcorp.net domain. Anyone with experience in the DNS naming convention can scan a name like this and know that the domain is a child of cpqcorp.net, which is under the .net root of the overall DNS namespace. Note that the namespaces used by Windows and DNS can be different. At HP, we now have an hpqcorp.net DNS namespace running with the cpqcorp.net Windows namespace.

2.1.5 Domain renaming

According to the documentation, you can rename a domain after installation if you run a native-mode Windows 2003 forest. In other words, you can rename a domain in the midst of a deployment without having to reinstall the operating system on all of the controllers in a domain. However, you cannot use this feature in any domain that supports an Exchange server, because Microsoft was not able to do the testing required to determine the impact of a domain rename on its configuration data held in the AD. Microsoft is certainly aware that companies sometimes need to rename domains, especially in times of mergers and acquisitions. You can, therefore, expect to see future Microsoft support for domain renames even in domains that include Exchange, perhaps as early as Exchange 2003 SP1 or SP2.

2.1.6 The Global Catalog

You can think of the Global Catalog (GC) as a special form of a DC that contains information replicated from every domain in a forest. The GC maintains a collection of every object in the AD, and the copy of the AD maintained on a GC is the closest comparison to the entries held in the "old" Exchange 5.5 Directory Store.

The GC contains full information about every object in its own domain plus partial information about the objects in other domains. You can update the objects from the GC's own domain, but the objects from other domains are read only. Windows defines the partial attribute set replicated by GCs by properties of the attributes set in the AD schema. You can modify attributes to include or exclude them in the replication set, always taking the impact on replication into account.
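A small model may help to picture what a GC holds. The sketch below is a simplification under stated assumptions: objects are plain dictionaries, the set of replicated attributes is a hypothetical three-attribute subset, and the function name `gc_view` is invented for illustration. In the real schema, membership of the partial attribute set is controlled by a flag on each attribute definition (`isMemberOfPartialAttributeSet`), not by a list in code.

```python
# Hypothetical subset of attributes marked for replication to GCs.
PARTIAL_ATTRIBUTE_SET = {"cn", "displayName", "mail"}


def gc_view(objects, home_domain, partial_set=PARTIAL_ATTRIBUTE_SET):
    """Model the copy of the directory held by a GC in home_domain.

    Objects from the GC's own domain keep every attribute and are
    writable. Objects from other domains keep only the attributes in
    the partial attribute set and are read only.
    """
    view = []
    for obj in objects:
        local = obj["domain"] == home_domain
        attributes = {
            name: value
            for name, value in obj["attributes"].items()
            if local or name in partial_set
        }
        view.append(
            {"domain": obj["domain"], "attributes": attributes, "writable": local}
        )
    return view


if __name__ == "__main__":
    # Sample data invented for illustration.
    directory = [
        {"domain": "emea.example.net",
         "attributes": {"cn": "Anna", "mail": "anna@example.net", "homeDirectory": "\\\\srv1\\anna"}},
        {"domain": "americas.example.net",
         "attributes": {"cn": "Bob", "mail": "bob@example.net", "homeDirectory": "\\\\srv2\\bob"}},
    ]
    for entry in gc_view(directory, "emea.example.net"):
        print(entry)
```

Adding an attribute to the partial set makes it visible on every GC in the forest, which is why any such change has to be weighed against the extra replication traffic it generates.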

From a Windows perspective, you can perform updates to the AD schema only at the server that acts as the schema master for the forest. The default set of replicated attributes includes the information necessary to allow the GC to serve as a GAL for Exchange and its clients. The GC also holds information about the membership of universal groups, which can contain objects from any domain in the forest. During authentication, Windows performs a lookup against a GC to build a complete security ticket for a client. Of course, if you only operate a single domain, then all DCs are GCs and Windows can execute the lookup for universal group membership against the same controller. See section 2.6 for further information on how Exchange uses DCs and GCs.

The first DC in a forest is automatically a GC, and you can subsequently promote any other DC to become a GC by modifying the NTDS Settings object of the server to mark it as a GC. This action forces the controller to publish a special "_gc" service record into DNS to advertise the fact that it now can act as a GC and to begin the process of requesting information from other domains in the forest. At the same time, the server enables port 3268 for read-only access and starts a special "listener thread" in order to respond to replication messages sent to the GC from every other domain in the forest.
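To make the "_gc" record and port 3268 more concrete, the following sketch builds the names that a client would use to find and contact a GC. It assumes the standard locator convention in which GC service records are registered as `_gc._tcp.<forest root>` (optionally qualified by site under `_sites`). The host name, the site name, and the function names are hypothetical, and the code only formats strings; it performs no DNS or LDAP operations.

```python
from typing import Optional

GC_LDAP_PORT = 3268  # read-only GC port mentioned in the text


def gc_srv_name(forest_root: str, site: Optional[str] = None) -> str:
    """Return the DNS name of the SRV record that advertises GCs.

    Assumes the usual convention: _gc._tcp.<forest root>, or
    _gc._tcp.<site>._sites.<forest root> for a site-specific lookup.
    """
    forest_root = forest_root.strip(".")
    if site:
        return f"_gc._tcp.{site}._sites.{forest_root}"
    return f"_gc._tcp.{forest_root}"


def gc_ldap_url(host: str) -> str:
    """Return an LDAP URL that targets the GC port on a given server."""
    return f"ldap://{host}:{GC_LDAP_PORT}"


if __name__ == "__main__":
    # The site and server names below are invented for illustration.
    print(gc_srv_name("cpqcorp.net"))
    print(gc_srv_name("cpqcorp.net", site="Dublin"))
    print(gc_ldap_url("gc01.emea.cpqcorp.net"))
```

A query sent to port 3268 is answered from the forest-wide, read-only copy held by the GC, whereas a query sent to the normal LDAP port of the same server is answered from the server's own domain.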

Typically, the GC satisfies queries that might be forest wide. For example, universal groups can contain users from any domain in a forest, so the membership of a universal group can only be resolved by consulting the GC, which you can browse to find objects that might not belong to a specific domain. You could find an object by drilling down through each domain in each tree, but it is obviously more convenient to consult the index provided by the GC.

Exchange clients depend on the GAL, so the role of the GC is tremendously important within an Exchange organization. Without easy access to a GC, clients will not be able to consult the GAL, but the dependency on the GC extends even deeper to the heart of Exchange. The Routing Engine validates mail addresses against the GC when it decides how to dispatch messages across available routes, so if replication is incomplete, you may find that you cannot send messages to some users whose mailboxes are located in domains that have problems replicating data to the GC.

^[1] Slapd is the standalone LDAP daemon, while Slurpd is the standalone LDAP update replication daemon. You can find the paper by using any of the common search engines. A good example is at http://www.aeinc.com/aeslapd/slapdadmin/1.html.

^[2] Search HP's Web site at www.hp.com for the latest information about LDSU.

^[3] Digital was one of the original companies on the Internet, and as a result held a complete class "A" network. All IP addresses that begin with 16 now belong to Compaq. Network 15 was then added to the mix when HP and Compaq merged in 2002, and a network redesign is under way to decide how best to use these network addresses.
