Replication Issues | Windows .NET Server 2003 Domains & Active Directory

The Active Directory service is a distributed network database, and synchronization of its replicas stored on different domain controllers is vital in order for the whole service to work. Replication issues are one of the main sources of trouble for an administrator; that is why we will consider some replication-related topics in more detail.

Intra- and Inter-Site Replication

The concept of the site, introduced with Active Directory domains, significantly affects both the methods of replicating directory partitions (within a site and between sites) and the replication transports (protocols) used.

Replication Transports

The following table lists the rules applicable to the various transports used for different types of replication:

Directory partitions	Within a site	Between sites (the same domain)	Between sites (different domains)
Domain naming context	RPC over IP	"IP" (RPC over IP)	–
Configuration, Schema, Global Catalog	RPC over IP	"IP" (RPC over IP)	"IP" (RPC over IP) or SMTP over IP
Compression	Uncompressed	Compressed (by default)	Compressed (by default)

"RPC over IP" enables high-speed, synchronous replication.

"IP" (as accepted in the Active Directory Sites and Services snap-in) enables a low-speed, point-to-point, synchronous replication for all directory partitions.

"SMTP over IP" provides low-speed, asynchronous replication between sites. It supports only Configuration, Schema, and Global Catalog replication, and it requires an enterprise Certification Authority (CA) to be installed.
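The transport rules above can be condensed into a small lookup function. The following Python sketch is only an illustration of the table: the function name `allowed_transports` and the partition labels are hypothetical and do not correspond to any Active Directory API.

```python
from typing import List

# Partitions that are not tied to a single domain (labels chosen for this sketch).
FOREST_WIDE = {"configuration", "schema", "global catalog"}


def allowed_transports(partition: str, same_site: bool, same_domain: bool) -> List[str]:
    """Return the transports the table permits for one replication link."""
    name = partition.lower()
    if name != "domain" and name not in FOREST_WIDE:
        raise ValueError(f"unknown partition: {partition}")
    if same_site:
        # Within a site the table lists only uncompressed RPC over IP
        # and does not distinguish between domains.
        return ["RPC over IP"]
    if name == "domain":
        # A domain naming context is replicated between sites only
        # inside the same domain, and never over SMTP.
        return ["IP"] if same_domain else []
    # Configuration, Schema, and Global Catalog data may also use SMTP
    # when the replication partners belong to different domains.
    return ["IP"] if same_domain else ["IP", "SMTP"]


print(allowed_transports("Schema", same_site=False, same_domain=False))
```

An empty list means that the table offers no transport for that combination.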

Important

It is not possible to change the default replication transport for connections generated automatically by the directory system. You may either accept the protocol used or mark the connection as manually created. (In the Active Directory Sites and Services snap-in, such connections will have GUID names, whereas for user-created connections you can specify any name.)

Normal Replication Intervals

There are two default methods of replicating object changes in Active Directory forests:

Change notification is usually used between DCs within a site. If a DC updates an object attribute, it sends a notification to its first replication partner after a specified time interval (5 minutes by default). The partner then "pulls" the changes from the originating DC. You can change the default interval (300 seconds) by modifying the Replicator notify pause after modify (secs) value under the HKLM\SYSTEM\CurrentControlSet\Services\NTDS\Parameters registry key. The originating DC notifies each subsequent replication partner after the interval specified by the Replicator notify pause between DSAs (secs) registry value (30 seconds by default). On Windows .NET domain controllers, these values are not defined, and the defaults are used. Therefore, to change an interval, you must first create the corresponding registry value.
Changes are replicated between sites according to a schedule (configured for site links and connections). You can configure change notifications between sites too. (Microsoft, however, does not generally recommend such a practice.) To do so, point to the Sites | Inter-Site Transports | IP node in the Active Directory Sites and Services snap-in, open the Properties window, and check the Ignore schedules box on the General tab.
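To make the intra-site notification timing concrete, the following Python sketch computes when each replication partner is notified after an originating change. It assumes the two intervals behave exactly as described above (one initial pause, then a fixed pause before each further partner); the function name is hypothetical.

```python
from typing import List


def notification_times(partner_count: int,
                       pause_after_modify: int = 300,
                       pause_between_dsas: int = 30) -> List[int]:
    """Seconds after the change at which each partner is notified.

    The defaults mirror the default registry intervals quoted in the text.
    """
    if partner_count < 0:
        raise ValueError("partner_count must not be negative")
    return [pause_after_modify + i * pause_between_dsas
            for i in range(partner_count)]


# Three partners with default settings.
print(notification_times(3))  # [300, 330, 360]
```

With the defaults, a DC with three partners finishes sending notifications six minutes after the change; the partners then pull the changes themselves.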

Tip

Use ReplMon.exe for monitoring various replication parameters. Select a monitored server and run the Generate Status Report command from the context menu.

If you experience unrecoverable problems after modification of the replication structure, you can delete all disturbing connections in the Active Directory Sites and Services snap-in and re-create them by starting the Knowledge Consistency Checker (KCC) for each affected server. To do so, right click an NTDS Settings object and select the All Tasks | Check Replication Topology command from the context menu.

See also the "Replication Issues" section in Appendix B.

Urgent Replication

Certain events on DCs are replicated immediately rather than at predefined intervals. This is known as urgent replication. The following events on any domain controller trigger urgent replication between DCs in the same site:

Setting an account lockout after a certain number of failed user logon attempts
Changing a Local Security Authority (LSA) secret
Changing the RID Master FSMO role owner

(If a change notification is configured between sites, the urgent replication can be propagated to other sites.)

Urgent replication in Active Directory domains is not initiated by the following events:

Changing the Account Lockout Policy
Changing the domain Password Policy
Changing the password on a machine account
Changing inter-domain trust passwords

If a user changes the password at a specific DC, that DC attempts to urgently replicate the changes to the PDC Emulator. The updated password is then normally replicated to other DCs located in the same site. If the user is repeatedly authenticated by a DC that has not yet received the updated password, this DC refers to the PDC Emulator to check the user credentials.
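The fallback to the PDC Emulator can be pictured with a toy model. The Python sketch below is not how a domain controller is implemented; it represents each DC's password store as a plain dictionary (an assumption made purely for illustration) and shows why a user with a freshly changed password is still authenticated by a DC that has not yet received the update.

```python
from typing import Dict


def authenticate(user: str, password: str,
                 local_dc: Dict[str, str],
                 pdc_emulator: Dict[str, str]) -> bool:
    """Check credentials locally, then fall back to the PDC Emulator."""
    if local_dc.get(user) == password:
        return True
    # The local replica may be stale, so the PDC Emulator, which receives
    # password changes urgently, is consulted before the logon is rejected.
    return pdc_emulator.get(user) == password


stale_dc = {"jsmith": "old-secret"}       # has not received the change yet
pdc = {"jsmith": "new-secret"}            # updated by urgent replication

print(authenticate("jsmith", "new-secret", stale_dc, pdc))  # True
print(authenticate("jsmith", "wrong", stale_dc, pdc))       # False
```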

Replication of Group Policy Objects

A Group Policy Object (GPO) consists of two parts. One part is located in Active Directory (the DS part), and the other part is stored on the hard disk in the SYSVOL volume (the Sysvol part). Hence, GPOs are replicated in two ways: by normal Active Directory replication and by File Replication Service (FRS). The success of one replication does not mean that the other has no problems, which is why you should monitor the consistency of GPOs. For that purpose, use tools such as Active Directory Replication Monitor (ReplMon.exe; see a server's Status Report) or GPOTool.exe, both of which display the DS version and the Sysvol version separately for each GPO. (See the detailed description of GPOTool.exe in Chapter 15, "Group Policy Tools.")
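Once the DS and Sysvol version numbers have been collected (for example, copied by hand from a ReplMon.exe status report or from GPOTool.exe output), the consistency check itself is a simple comparison. The Python sketch below assumes the versions are already available as a dictionary; the function name and the sample GPO names and numbers are hypothetical, and the code does not parse the output of either tool.

```python
from typing import Dict, List, Tuple


def find_mismatched_gpos(versions: Dict[str, Tuple[int, int]]) -> List[str]:
    """Return the names of GPOs whose DS and Sysvol versions differ.

    `versions` maps a GPO name to a (ds_version, sysvol_version) pair.
    """
    return sorted(name for name, (ds, sysvol) in versions.items()
                  if ds != sysvol)


# Hypothetical sample data for one domain controller.
sample = {
    "Default Domain Policy": (12, 12),
    "Staff Desktop Policy": (7, 6),   # Sysvol part lags behind the DS part
}
print(find_mismatched_gpos(sample))  # ['Staff Desktop Policy']
```

A mismatch that persists after both replication mechanisms have had time to converge is the case worth investigating.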

Monitoring Replication

You may wish to monitor replication events both in a test and a field environment. (The only difference may be in the level of detail given for registering events.) To fulfill this task, it is possible to use the event logs and performance counters.

Logging Replication Events

You can use the Directory Service event log for monitoring such events as the moments of replication request completion, the number, total size, and names of replicated attributes, and so on. The granularity level of logged events is set through the system registry (see below).

Set the 5 Replication Events value at the HKLM\SYSTEM\CurrentControlSet\Services\NTDS\Diagnostics registry key equal to 3 or 4 (the difference between the cases will be discussed later). This will help you to see all replication requests, the sequence of replicated directory partitions, and the result of the requests. (Two domain controllers from the same domain — NETDC3 and NETDC4 — are used in the following examples.) The following two events are logged after each directory partition has been successfully replicated (NETDC4 asks NETDC3 for the changes):

    Event Type: Information
    Event Source: NTDS Replication
    Event Category: Replication
    Event ID: 1060
    ...
    User: Everyone
    Computer: NETDC4
    Description:
    Internal event: The directory replication agent (DRA) call completed
    successfully.
    - - - - -
    Event Type: Information
    Event Source: NTDS Replication
    Event Category: Replication
    Event ID: 1488
    ...
    User: NETDOM\administrator
    Computer: NETDC4
    Description:
    Internal event: The Directory Service completed the sync request with
    status code 0.

Any replicated information is logged as an event similar to the following (NETDC3 asks NETDC4 for the outbound changes):

    Event Type: Information
    Event Source: NTDS Replication
    Event Category: Replication
    Event ID: 1073
    ...
    User: NETDOM\NETDC3$
    Computer: NETDC4
    Description:
    Internal event: The directory replication agent (DRA) got changes
    returning 2 objects, 2448 bytes total and entries up to update sequence
    number (USN) 100225, with extended return 0.
    - - - - -
    Event Type: Information
    Event Source: NTDS Replication
    Event Category: Replication
    Event ID: 1490
    ...
    User: NETDOM\NETDC3$
    Computer: NETDC4
    Description:
    Internal event: The Directory Service finished gathering outbound
    changes with the following results:
    Object Update USN: 100225
    Attribute Filter USN: 100225
    Object Count: 2
    Byte Count: 2448
    Extended Operation Result: 0
    Status: 0
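If you prefer to set the diagnostic level by importing a .reg file rather than editing the registry by hand, the file can be generated with a few lines of Python. This is a sketch only: the helper name is hypothetical, the script merely produces text in the standard .reg format, and nothing is written to the registry until you import the resulting file on the DC yourself.

```python
# Registry key named in the text, written out in full for the .reg format.
DIAGNOSTICS_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS\Diagnostics"


def reg_file_for_dword(key_path: str, value_name: str, data: int) -> str:
    """Build the text of a .reg file that sets one DWORD value."""
    if not 0 <= data <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit into 32 bits")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{key_path}]\n"
        f'"{value_name}"=dword:{data:08x}\n'
    )


# Logging level 3 for replication events.
print(reg_file_for_dword(DIAGNOSTICS_KEY, "5 Replication Events", 3))
```

Generating the same file with the value 0 gives a quick way to switch the extra logging off again after troubleshooting.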

To see replication events, set the registry value to at least level 3. Level 4 allows you to track replication of each changed attribute. (Use this level for debugging only! If many objects are replicated, the number of events in the Directory Service log may be huge. This also significantly affects the performance of the DC.) You can easily find all replicated data by the Event ID. A message (ID 1239) similar to the following is written in the log for all unchanged attributes of the replicated objects:

    Event Type: Information
    Event Source: NTDS Replication
    Event Category: Replication
    Event ID: 1239
    ...
    User: NETDOM\NETDC3$
    Computer: NETDC4
    Description:
    Property 20002 (whenCreated) of object CN=John
    Smith,OU=Staff,DC=net,DC=dom (GUID 3b2653cf-76e6-4d35-abf5-ec4c78fad8ee)
    is not being sent to DSA a9d28d8e-e681-449f-b1be-38dadf6f4c06 because
    its up-to-date vector implies the change is redundant.

If an attribute was changed, it is replicated to another DC, and a message with the ID 1240 will appear in the log:

    Event Type: Information
    Event Source: NTDS Replication
    Event Category: Replication
    Event ID: 1240
    ...
    User: NETDOM\NETDC3$
    Computer: NETDC4
    Description:
    Property d (description) of object CN=John Smith,OU=Staff,DC=net,DC=dom
    (GUID 3b2653cf-76e6-4d35-abf5-ec4c78fad8ee) is being sent to DSA
    a9d28d8e-e681-449f-b1be-38dadf6f4c06.

You can filter out events that do not have ID 1240, and quickly check all replicated objects and changed attributes.
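When the log has been exported, the same filtering can be scripted. The Python sketch below assumes the events are already available as (event ID, description) pairs; how you export them is up to you, and the function name is hypothetical. The regular expression is written against the wording of the sample ID 1240 message shown above and may need adjusting if the message text differs on your system.

```python
import re
from typing import Iterable, List, Tuple

# Matches the description of an ID 1240 event as shown in the sample above.
SENT_PATTERN = re.compile(
    r"Property\s+\S+\s+\((?P<attribute>[^)]+)\)\s+of\s+object\s+"
    r"(?P<dn>.+?)\s+\(GUID\s+[0-9a-fA-F-]+\)\s+is\s+being\s+sent\s+to\s+DSA",
    re.DOTALL,
)


def changed_attributes(events: Iterable[Tuple[int, str]]) -> List[Tuple[str, str]]:
    """Return (object DN, attribute name) pairs for all ID 1240 events."""
    result = []
    for event_id, description in events:
        if event_id != 1240:
            continue  # skip ID 1239 and every other event
        match = SENT_PATTERN.search(description)
        if match:
            # Collapse line breaks that the log viewer inserted into the DN.
            dn = " ".join(match.group("dn").split())
            result.append((dn, match.group("attribute")))
    return result


sample_events = [
    (1239, "Property 20002 (whenCreated) of object CN=John\n"
           "Smith,OU=Staff,DC=net,DC=dom (GUID 3b2653cf-76e6-4d35-abf5-ec4c78fad8ee)\n"
           "is not being sent to DSA ..."),
    (1240, "Property d (description) of object CN=John Smith,OU=Staff,DC=net,DC=dom\n"
           "(GUID 3b2653cf-76e6-4d35-abf5-ec4c78fad8ee) is being sent to DSA ..."),
]
print(changed_attributes(sample_events))
```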

Using the Performance Counters

Performance counters are very useful to monitor replication events, especially the replication traffic. To start monitoring, run the Performance snap-in (from the Administrative Tools group). Select System Monitor and click the Add button on the taskpad. Select NTDS in the Performance object list and add the counters shown in Fig. 6.1.

(The Report View seems to be the most useful in this case.) All counters have zero values immediately after startup of the DC. (The NTDS performance object has a great number of counters, and I have selected only some of them that are related to replication.)

Fig. 6.1: Some of the performance counters important for monitoring replication

Note

As you can see in Fig. 6.1, compression of replication traffic (both inbound and outbound) can reach a ratio of 12 to 1 (ratios of up to 20 to 1 are also possible). If the replicated block of information is not large enough (less than 32 Kbytes), inter-site traffic is not compressed.
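The compression ratio is simply the byte count before compression divided by the byte count after compression. The Python sketch below assumes you have read both numbers from the corresponding "Before Compression" and "After Compression" NTDS counters; the function name and the sample figures are hypothetical.

```python
from typing import Optional


def compression_ratio(bytes_before: int, bytes_after: int) -> Optional[float]:
    """Return the before/after ratio, or None if nothing was compressed yet."""
    if bytes_before < 0 or bytes_after < 0:
        raise ValueError("byte counts must not be negative")
    if bytes_after == 0:
        return None  # counters are still zero, e.g. right after DC startup
    return bytes_before / bytes_after


# Hypothetical counter readings giving a 12 to 1 ratio.
print(compression_ratio(120_000, 10_000))  # 12.0
```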

You can also create a custom MMC console that will contain a copy of the System Monitor Control for each selected domain controller. (Start an MMC console, select ActiveX Control in the Add Standalone Snap-in window, and find the System Monitor Control in the Control type list. Repeat this procedure for each desired DC. Then, select the performance counters from different DCs and add them to the appropriate ActiveX controls.) See also the "Managing Replication" section in Chapter 11, "Verifying Network and Distributed Services."

The DRA Sync Requests Successful value should normally equal the DRA Sync Requests Made value. However, these values usually differ on the DC(s) booted first, because their replication partners may not yet be operational. Note the values once all DCs in the network are online and have replicated. From that moment on, the difference between these values must not increase.
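The rule about the two sync request counters can be expressed as a comparison against a recorded baseline. The following Python sketch assumes you take the counter readings yourself (for example, from the Report View); the class and function names are hypothetical.

```python
from typing import NamedTuple


class SyncCounters(NamedTuple):
    """One reading of the two DRA sync request counters."""
    requests_made: int
    requests_successful: int

    @property
    def gap(self) -> int:
        # Requests that were made but have not (yet) completed successfully.
        return self.requests_made - self.requests_successful


def gap_has_grown(baseline: SyncCounters, current: SyncCounters) -> bool:
    """True if the difference has increased since the baseline reading."""
    return current.gap > baseline.gap


# Baseline taken when all DCs were online and replicated (sample numbers).
baseline = SyncCounters(requests_made=120, requests_successful=117)
print(gap_has_grown(baseline, SyncCounters(150, 147)))  # False
print(gap_has_grown(baseline, SyncCounters(150, 140)))  # True
```

A growing gap only indicates that some requests failed since the baseline; the Directory Service event log is still needed to find out which ones and why.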

The DRA Pending Synchronizations counter displays the number of replication requests that have not been completed yet (any result — a success or a failure — is possible).

The DRA Inbound/Outbound Bytes Not Compressed (Within Site) Since Boot counters register both intra- and inter-domain replication traffic, provided this traffic has not been compressed (i.e., replication block size does not exceed 32 KB).