Chapter 10: Diagnosing and Maintaining Domain Controllers

The two system tools described in this chapter treat domain controllers as the components that implement Active Directory as a distributed network service. For Active Directory to function properly as a whole, all of its elements (the Directory System Agents, or DSAs, located on domain controllers) must be in good operational condition. An administrator therefore needs tools for testing the actual state of every DC, as well as repair tools for emergencies in which the Active Directory database needs restoring, cleaning up, or other "low-level" operations.

Domain Controller Diagnostic Tool (DCdiag.exe) (ST)

This utility is a set of specialized tests that allows an administrator to give a DC a quick "check-up" and locate possible problems. DCdiag verifies the serviceability of a DC as well as its relations (connectivity, trusts, replication issues, etc.) with the other DCs that are its replication partners. Thus, the utility primarily checks parts of Active Directory at a functional rather than a logical level (i.e., aspects such as data consistency and semantic contents are not examined).

It is a good idea to run DCdiag after, and even before, installing domain controllers (see the description of the DcPromo and RegisterInDns tests in Chapter 5, "Installing Active Directory"). It is fairly typical, especially in small networks, for an administrator to install Active Directory on the first DC, see no faults or error messages, and conclude that the domain will work correctly. Problems usually begin when the administrator adds clients to the domain or installs a second DC (in the same or a different domain). It then turns out that the first (forest root) DC was configured incorrectly; very often the cause is the DNS service (on a Windows 2000/.NET Server or a third-party server). In such a situation, DCdiag (especially in conjunction with NetDiag.exe) could have located many potential problems from the beginning.

The DCdiag utility can generate very large output, especially with enterprise-wide tests, so you should use log files for subsequent analysis of the results. The diagnostic messages are quite informative and very often name the problem plainly, so you need only eliminate it without any further analysis.

The Windows .NET version of DCdiag is sufficiently documented. You can use the built-in Help (dcdiag /?) as a quick reference on all parameters. (The built-in Help seems to be more accurate than Support Tools Help.) The Windows 2000 Server Resource Kit contains a great deal more information on how to use this utility. We shall discuss some of the most interesting features of DCdiag and examples of their use.

Note 

For the Windows 2000 environment, it is recommended that you download the updated version of DCdiag from http://www.microsoft.com/downloads/release.asp?ReleaseID=22939.

You can run DCdiag from any network computer and test any DC in the forest. Some tests can be performed under a normal user account (Replications, NetLogons, and ObjectsReplicated cannot), but to get the full functionality of DCdiag, you must either be logged on as an administrator (or even as an enterprise administrator) or provide an administrator's credentials with the command.

The New Version of DCdiag

In comparison with Windows 2000, the Windows .NET version of DCdiag provides a few new tests:

CrossRefValidation

CheckSDRefDom

VerifyReplicas

VerifyReferences

VerifyEnterpriseReferences

 

These tests are especially informative when used with the /a and /e parameters, which force the testing of all DCs in the current site or in the forest, respectively. Using these tests, you can quickly locate faulty elements of the replication infrastructure (invalid cross-references, improper security descriptors, directory partitions that are not fully instantiated, etc.), particularly when many application directory partitions are deployed in your enterprise.

Standard Full Test

First, let us look at the output produced when all DCdiag tests pass. This output shows how the tests are structured and which tests the DCdiag tool contains. The Topology, CutoffServers, and OutboundSecureChannels tests are omitted by default, so you must run them explicitly or use the /c (comprehensive) parameter.

In the Windows 2000 environment, DCdiag runs faster if you specify the DC's DNS name rather than its NetBIOS name. (If the DC name is omitted, your current logon server is implied.) The /a or /e parameters specified in a command allow you to run a selected test on every DC in a site or in the forest.

If you want to retrieve the maximum possible amount of information from DCdiag, use the /v (verbose) parameter with any test. The /d parameter includes everything that /v does and, in addition, produces plenty of information (pDsInfo) on your forest configuration. With these parameters, pipe the output through more, or redirect it to a file in the current or another folder with the /f parameter. (The pDsInfo section is never written to that file: use a pipe to save that information.)
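When DCdiag is run from a script, it can help to assemble the command line in one place. The following Python sketch does only that; the function name and its arguments are hypothetical, and the /f:LogFile form is an assumption about the switch syntax that you should confirm with dcdiag /?.

```python
from typing import List, Optional


def build_dcdiag_command(
    server: Optional[str] = None,
    tests: Optional[List[str]] = None,
    scope: Optional[str] = None,
    verbose: bool = False,
    log_file: Optional[str] = None,
) -> List[str]:
    """Assemble a dcdiag argument list from the switches described in the text.

    scope is a hypothetical convenience argument: "site" adds /a,
    "enterprise" adds /e, and None tests only the selected DC.
    """
    command = ["dcdiag"]
    if server:
        command.append(f"/s:{server}")
    for test in tests or []:
        command.append(f"/test:{test}")
    if scope == "site":
        command.append("/a")
    elif scope == "enterprise":
        command.append("/e")
    elif scope is not None:
        raise ValueError("scope must be 'site', 'enterprise', or None")
    if verbose:
        command.append("/v")
    if log_file:
        # Assumed syntax: the log file name follows /f and a colon.
        command.append(f"/f:{log_file}")
    return command


if __name__ == "__main__":
    print(" ".join(build_dcdiag_command("netdc3.net.dom", verbose=True,
                                        log_file="dcdiag.log")))
```

The resulting list can be passed to subprocess.run on a computer where DCdiag is installed; the sketch itself never starts the tool.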

Here is a typical command for testing a DC:

     C:\>dcdiag /s:netdc3.net.dom [| more] 

The command's output will be similar to:

     Domain Controller Diagnosis
     Performing initial setup:
        Done gathering initial info.
     Doing initial required tests
        Testing server: NET-Site\NETDC3
           Starting test: Connectivity
              ...........................NETDC3 passed test Connectivity
     Doing primary tests
        Testing server: NET-Site\NETDC3
           Starting test: Replications
              ...........................NETDC3 passed test Replications
           Starting test: NCSecDesc
              ...........................NETDC3 passed test NCSecDesc
           Starting test: NetLogons
              ...........................NETDC3 passed test NetLogons
           Starting test: Advertising
              ...........................NETDC3 passed test Advertising
           Starting test: KnowsOfRoleHolders
              ...........................NETDC3 passed test KnowsOfRoleHolders
           Starting test: RidManager
              ...........................NETDC3 passed test RidManager
           Starting test: MachineAccount
              ...........................NETDC3 passed test MachineAccount
           Starting test: Services
              ...........................NETDC3 passed test Services
           Starting test: ObjectsReplicated
              ...........................NETDC3 passed test ObjectsReplicated
           Starting test: frssysvol
              ...........................NETDC3 passed test frssysvol
           Starting test: kccevent
              ...........................NETDC3 passed test kccevent
           Starting test: systemlog
              ...........................NETDC3 passed test systemlog
           Starting test: VerifyReferences
              ...........................NETDC3 passed test VerifyReferences
        Running partition tests on : Schema
           Starting test: CrossRefValidation
              ...........................Schema passed test CrossRefValidation
           Starting test: CheckSDRefDom
              ...........................Schema passed test CheckSDRefDom
        Running partition tests on : Configuration
           Starting test: CrossRefValidation
              ...........................Configuration passed test CrossRefValidation
           Starting test: CheckSDRefDom
              ...........................Configuration passed test CheckSDRefDom
        Running partition tests on : net
           Starting test: CrossRefValidation
              ...........................net passed test CrossRefValidation
           Starting test: CheckSDRefDom
              ...........................net passed test CheckSDRefDom
        Running enterprise tests on : net.dom
           Starting test: Intersite
              ...........................net.dom passed test Intersite
           Starting test: FsmoCheck
              ...........................net.dom passed test FsmoCheck
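Because every test ends with a line of the form "...target passed test Name" or "...target failed test Name", a saved report is easy to summarize. The Python sketch below relies only on that line shape as shown in the output above; the function and variable names are hypothetical, and other DCdiag versions may format the line differently.

```python
import re
from typing import Dict

# A result line: a run of dots, the tested target, the outcome, the test name.
RESULT_LINE = re.compile(r"\.{3,}\s*(\S+) (passed|failed) test (\S+)")


def summarize_results(report: str) -> Dict[str, Dict[str, str]]:
    """Map each tested target (DC, partition, or forest) to {test: outcome}."""
    summary: Dict[str, Dict[str, str]] = {}
    for target, outcome, test in RESULT_LINE.findall(report):
        summary.setdefault(target, {})[test] = outcome
    return summary


SAMPLE_REPORT = """
   Testing server: NET-Site\\NETDC3
      Starting test: Connectivity
         ...........................NETDC3 passed test Connectivity
      Starting test: Replications
         ...........................NETDC3 failed test Replications
   Running enterprise tests on : net.dom
      Starting test: FsmoCheck
         ...........................net.dom passed test FsmoCheck
"""

if __name__ == "__main__":
    for target, tests in summarize_results(SAMPLE_REPORT).items():
        failed = [name for name, outcome in tests.items() if outcome == "failed"]
        print(target, "failed:", ", ".join(failed) if failed else "none")
```

The sample report is invented for illustration and is shorter than a real run.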

The first two sections, Initial setup and Initial required tests, are always executed, even if you specify only one test. You may first run the full test to find any existing problems, and then run selected tests in verbose mode to get a detailed diagnosis.

In practice, it is handy to run DCdiag with the /q parameter. If the DC is working properly, DCdiag will not display any messages at all, so you do not need to worry about anything. Otherwise, only the failed tests will be reported.
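The two-step routine (a quiet run first, then verbose runs of whatever failed) can be sketched in Python as follows. The sketch assumes that the quiet report still contains the usual "failed test" result lines and that each failure belongs to a server test; a failed partition or enterprise test names a partition or forest rather than a DC, so its follow-up command would need a real server name. The function name is hypothetical.

```python
import re
from typing import List

# Assumed shape of a failure line: dots, the target, "failed test", the test name.
FAILED_LINE = re.compile(r"\.{3,}\s*(\S+) failed test (\S+)")


def follow_up_commands(quiet_report: str) -> List[str]:
    """Turn each failure in a quiet-mode report into a verbose re-run command."""
    commands = []
    for target, test in FAILED_LINE.findall(quiet_report):
        commands.append(f"dcdiag /s:{target} /test:{test} /v")
    return commands


if __name__ == "__main__":
    report = "   ..........................NETDC2 failed test Connectivity\n"
    for command in follow_up_commands(report):
        print(command)
```

An empty report yields an empty list, which corresponds to the healthy case described above.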

Error and diagnostic messages (in verbose mode) are very descriptive, so it is unnecessary to give many examples.

Testing DNS Registration and Accessibility of DCs

The mandatory Connectivity test is executed on every DCdiag test run. This test verifies the most important functions of a domain controller: whether all DNS resource records are registered on the preferred DNS server, and whether DCs are pingable and accessible via LDAP/RPC. For example, using the /a or /e parameters, you can quickly find all DCs that are not responding. Output similar to the following will be displayed:

     C:\>dcdiag /test:connectivity /e
     ...
        Testing server: NET-Site\NETDC2
           Starting test: Connectivity
              Server NETDC2 resolved to this IP address 192.168.0.2,
              but the address couldn't be reached(pinged), so check the
              network.
              The error returned was: Win32 Error 11010
              This error more often means that the targeted server is
              shutdown or disconnected from the network
              ..........................NETDC2 failed test Connectivity
     ...

The next output sample illustrates errors with DNS registration of the selected DC:

     ...
     Doing initial required tests
        Testing server: NET-Site\NETDC1
           Starting test: Connectivity
              The host 02c2b1f6-e9b6-4e64-91f6-3a54b087bacc._msdcs.net.dom
              could not be resolved to an IP address. Check the DNS server,
              DHCP, server name, etc. Although the Guid DNS name (02c2b1f6-
              e9b6-4e64-91f6-3a54b087bacc._msdcs.net.dom) couldn't be
              resolved, the server name (netdc1.net.dom) resolved to the IP
              address (192.168.0.1) and was pingable. Check that the IP
              address is registered correctly with the DNS server.
              ...........................NETDC1 failed test Connectivity
     Doing primary tests
        Testing server: NET-Site\NETDC1
           Skipping all tests, because server NETDC1 is
           not responding to directory service requests
     ...

As you can see from the foregoing messages, none of the other tests will even be started if there are any errors in the Connectivity section of DCdiag. You should first eliminate any existing issues before you continue the diagnosis. To locate the problem shown, use the netdiag /test:DNS command, and check all warnings in the "DNS test" section of the output data.
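The GUID-based name in the message above has the form guid._msdcs.forest-root. As a rough illustration, the following Python sketch builds that name and asks the local resolver for it; the helper names are hypothetical, and the lookup goes through whatever resolver the operating system is configured with, which is not necessarily the DNS server that DCdiag queried.

```python
import socket
from typing import Optional


def dsa_guid_dns_name(dsa_guid: str, forest_root: str) -> str:
    """Build the GUID-based DNS name of a DC: <guid>._msdcs.<forest root>."""
    guid = dsa_guid.strip("{}").lower()
    return f"{guid}._msdcs.{forest_root.strip('.').lower()}"


def resolve(name: str) -> Optional[str]:
    """Return the IPv4 address the local resolver gives for name, or None."""
    try:
        return socket.gethostbyname(name)
    except OSError:
        # socket.gaierror (name not found) is a subclass of OSError.
        return None


if __name__ == "__main__":
    name = dsa_guid_dns_name("02c2b1f6-e9b6-4e64-91f6-3a54b087bacc", "net.dom")
    print(name, "->", resolve(name) or "not resolved")
```

If the GUID-based name does not resolve while the ordinary host name does, the situation matches the second output sample.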

Verifying Replication

DCdiag helps an administrator diagnose replication problems quite well. Let us suppose that a site contains three domain controllers, one of which is refusing to replicate with its partners. The following command will test all DCs and check replication issues on each DC:

     C:\>dcdiag /test:Replications /a /v 

Only failed replication events will be included in the resulting report. The output of this command is the following:

     Domain Controller Diagnosis
     Performing initial setup:
        * Verifying that the local machine netdc1, is a DC.
        * Connecting to directory service on server netdc1.
        * Collecting site info.
        * Identifying all servers.
        * Identifying all NC cross-refs.
        * Found 3 DC(s). Testing 3 of them.
        Done gathering initial info.
     Doing initial required tests
        Testing server: NET-Site\NETDC1
           Starting test: Connectivity
              * Active Directory LDAP Services Check
              * Active Directory RPC Services Check
              ...........................NETDC1 passed test Connectivity
        Testing server: NET-Site\NETDC3
     ...
        Testing server: NET-Site\NETDC2
     ...
     Doing primary tests
        Testing server: NET-Site\NETDC1
           Starting test: Replications
              * Replications Check
              [Replications Check,NETDC1] A recent replication attempt failed:
                 From NETDC3 to NETDC1
                 Naming Context: DC=net,DC=dom
                 The replication generated an error (8456):
                 Win32 Error 8456
                 The failure occurred at 2002-05-11 20:50:59.
                 The last success occurred at 2002-05-11 19:48:32.
                 6 failures have occurred since the last success.
                 Replication has been explicitly disabled through the server options.
              [Replications Check,NETDC1] A recent replication attempt failed:
                 From NETDC3 to NETDC1
                 Naming Context: CN=Configuration,DC=net,DC=dom
     ...
              [Replications Check,NETDC1] A recent replication attempt failed:
                 From NETDC3 to NETDC1
                 Naming Context: CN=Schema,CN=Configuration,DC=net,DC=dom
     ...
              *Replication Latency Check
     [Information on all directory partitions stored on that DC is reported here.]
                 DC=ForestDnsZones,DC=net,DC=dom
                    Latency information for 1 entries in the vector were
                    ignored. 0 were retired Invocations. 1 were either:
                    read-only replicas and are not verifiably latent, or dc's
                    no longer replicating this nc. 0 had no latency
                    information (Win2K DC).
                 DC=DomainDnsZones,DC=net,DC=dom
     ...
                 CN=Schema,CN=Configuration,DC=net,DC=dom
     ...
                 CN=Configuration,DC=net,DC=dom
     ...
                 DC=net,DC=dom
     ...
                 DC=subdom,DC=net,DC=dom
     ...
              ...........................NETDC1 passed test Replications
     ...
        Testing server: NET-Site\NETDC3
           Starting test: Replications
              * Replications Check
              [Replications Check,NETDC3] Outbound replication is disabled.
              To correct, run "repadmin /options NETDC3 -DISABLE_OUTBOUND_REPL"
              ...........................NETDC3 failed test Replications
     ...
        Testing server: NET-Site\NETDC2
           Starting test: Replications
              * Replications Check
              Skipping server NETDC3, because it has outbound replication disabled
              Skipping server NETDC3, because it has outbound replication disabled
              *Replication Latency Check
     ...
              ...........................NETDC2 passed test Replications
     ...

As you can see from the test output, DCdiag provides comprehensive information about failed connections for each DC in the site and on each directory partition.
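When a report contains many failed connections, the individual failure records can be collected into a table. The Python sketch below is one possible way to do that, assuming the record layout shown in the output above; the class and function names are hypothetical, and records that are cut short in a report (no error line or failure count) are skipped.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class ReplicationFailure:
    source: str
    destination: str
    naming_context: str
    error_code: int
    failure_count: int


# One complete failure record, from its header to the failure count.
FAILURE_RECORD = re.compile(
    r"A recent replication attempt failed:\s*"
    r"From (?P<source>\S+) to (?P<destination>\S+)\s*"
    r"Naming Context: (?P<nc>\S+)\s*"
    r"The replication generated an error \((?P<code>\d+)\):"
    r".*?(?P<count>\d+) failures have occurred since the last success\.",
    re.DOTALL,
)


def parse_replication_failures(report: str) -> List[ReplicationFailure]:
    """Extract every complete replication failure record from a report."""
    return [
        ReplicationFailure(
            source=m.group("source"),
            destination=m.group("destination"),
            naming_context=m.group("nc"),
            error_code=int(m.group("code")),
            failure_count=int(m.group("count")),
        )
        for m in FAILURE_RECORD.finditer(report)
    ]


SAMPLE_REPORT = """
   [Replications Check,NETDC1] A recent replication attempt failed:
      From NETDC3 to NETDC1
      Naming Context: DC=net,DC=dom
      The replication generated an error (8456):
      Win32 Error 8456
      The failure occurred at 2002-05-11 20:50:59.
      The last success occurred at 2002-05-11 19:48:32.
      6 failures have occurred since the last success.
"""

if __name__ == "__main__":
    for failure in parse_replication_failures(SAMPLE_REPORT):
        print(failure)
```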

If the local DC has not received replication information from a number of DCs within the configured latency interval (24 hours by default; see Appendix B), messages similar to the following ones will be included in the output:

     REPLICATION-RECEIVED LATENCY WARNING
     NETDC1: Current time is 2002-05-10 19:24:28.
        CN=Schema,CN=Configuration,DC=net,DC=dom
           Last replication recieved from NETDC3 at 2002-05-09 19:16:55.
        CN=Configuration,DC=net,DC=dom
           Last replication recieved from NETDC2 at 2002-05-09 19:48:56.
     ...
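To see how stale each partition is, the timestamps in such a warning can be subtracted from the reported current time. The following Python sketch assumes the message layout shown above (including the tool's own spelling "recieved") and uses hypothetical names throughout.

```python
import re
from datetime import datetime, timedelta
from typing import Dict, Tuple

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"
# Date and time may be separated by extra whitespace when the line is wrapped.
STAMP = r"(\d{4}-\d{2}-\d{2})\s+(\d{2}:\d{2}:\d{2})"
CURRENT_TIME = re.compile(r"Current time is " + STAMP)
# The partition name sits on the line before the "Last replication" message.
LAST_RECEIVED = re.compile(
    r"(\S+)\s+Last replication rec(?:ie|ei)ved from (\S+)\s+at\s+" + STAMP
)


def replication_ages(report: str) -> Dict[Tuple[str, str], timedelta]:
    """Map (partition, source DC) to the time elapsed since the last replication."""
    now_match = CURRENT_TIME.search(report)
    if now_match is None:
        return {}
    now = datetime.strptime(" ".join(now_match.groups()), TIME_FORMAT)
    ages = {}
    for partition, source, day, clock in LAST_RECEIVED.findall(report):
        received = datetime.strptime(f"{day} {clock}", TIME_FORMAT)
        ages[(partition, source)] = now - received
    return ages


SAMPLE_WARNING = """
   REPLICATION-RECEIVED LATENCY WARNING
   NETDC1: Current time is 2002-05-10 19:24:28.
      CN=Schema,CN=Configuration,DC=net,DC=dom
         Last replication recieved from NETDC3 at 2002-05-09 19:16:55.
      CN=Configuration,DC=net,DC=dom
         Last replication recieved from NETDC2 at 2002-05-09 19:48:56.
"""

if __name__ == "__main__":
    for (partition, source), age in replication_ages(SAMPLE_WARNING).items():
        print(f"{partition} from {source}: {age}")
```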

To obtain additional information on replication topology, you can also use the repadmin /showconn command that verifies whether all required connections between DCs were created.

Testing Application Directory Partitions

Application directory partitions, which first appeared in Windows .NET, can generate specific replication problems, and the Windows .NET version of DCdiag offers new tests for troubleshooting such issues. The following example illustrates the VerifyReplicas test, which checks whether all application partitions have replicas stored on the DCs specified as replica servers for these partitions. The problem presented here was caused by missing permissions for the Enterprise Domain Controllers group on the DC=ForestDnsZones,DC=net,DC=dom partition, and could easily be detected by the NCSecDesc test. Without the proper permissions, the NETDC2 and NETDC3 domain controllers could not create replicas of that partition and, therefore, could not generate the appropriate replication connections.

     C:\>dcdiag /test:VerifyReplicas /a
     Domain Controller Diagnosis
     ...
     Doing initial required tests
     ...
     Doing primary tests
        Testing server: NET-Site\NETDC1
           Starting test: VerifyReplicas
              ...........................NETDC1 passed test VerifyReplicas
        Testing server: NET-Site\NETDC3
           Starting test: VerifyReplicas
              This NC (DC=ForestDnsZones,DC=net,DC=dom) is supposed to be
              replicated to this server, but has not been replicated yet. This
              could be because the replica set changes haven't replicated here
              yet. If this problem persists, check replication of the
              Configuration Partition to this server.
              ...........................NETDC3 failed test VerifyReplicas
        Testing server: NET-Site\NETDC2
           Starting test: VerifyReplicas
     ...
              ...........................NETDC2 failed test VerifyReplicas
     ...

Enterprise Wide Tests

Enterprise tests check many elements that are vital for an enterprise (forest) to work: intersite links, bridgehead servers, FSMO role owners and their accessibility, etc.

It is not a good idea to run the Intersite test on one DC only, so you must include the /a (current site) or /e (entire enterprise) parameters.

Here is a snippet of the test output for two sites (NET-Site and Remote-Site) in the forest net.dom:

     C:\>dcdiag /test:Intersite /e /v
     ...
        Running enterprise tests on : net.dom
           Starting test: Intersite
              Doing intersite inbound replication test on site NET-Site:
                 Locating & Contacting Intersite Topology Generator (ISTG) ...
                    The ISTG for site NET-Site is: NETDC3.
                 Checking for down bridgeheads ...
                    Bridghead Remote-Site\NETDC2 is up and replicating fine.
                    Bridghead NET-Site\NETDC1 is up and replicating fine.
                 Doing in depth site analysis ...
                    All expected sites and bridgeheads are replicating into
                    site NET-Site.
              Doing intersite inbound replication test on site Remote-Site:
                 Locating & Contacting Intersite Topology Generator (ISTG) ...
                    The ISTG for site Remote-Site is: NETDC2.
                 Checking for down bridgeheads ...
                    Bridghead NET-Site\NETDC1 is up and replicating fine.
                    Bridghead Remote-Site\NETDC2 is up and replicating fine.
                 Doing in depth site analysis ...
                    All expected sites and bridgeheads are replicating into
                    site Remote-Site.
              ........................... net.dom passed test Intersite
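In a forest with many sites, it may be convenient to reduce the Intersite report to a list of ISTG holders and healthy bridgeheads. The Python sketch below does this under the assumption that the messages look like those in the output above (the tool itself prints "Bridghead"; the pattern accepts either spelling). All names in the sketch are hypothetical.

```python
import re
from typing import Dict, List, Tuple

ISTG_LINE = re.compile(r"The ISTG for site (\S+) is: (\S+?)\.?\s*$", re.MULTILINE)
# Site and server are separated by a backslash, as in NET-Site\NETDC1.
BRIDGEHEAD_UP = re.compile(r"Bridge?head ([^\\\s]+)\\(\S+) is up and replicating fine")


def istg_by_site(report: str) -> Dict[str, str]:
    """Map each site name to the DC reported as its ISTG."""
    return dict(ISTG_LINE.findall(report))


def bridgeheads_up(report: str) -> List[Tuple[str, str]]:
    """Return the distinct (site, server) pairs reported as replicating fine."""
    return sorted(set(BRIDGEHEAD_UP.findall(report)))


SAMPLE_REPORT = r"""
   Doing intersite inbound replication test on site NET-Site:
      The ISTG for site NET-Site is: NETDC3.
      Bridghead Remote-Site\NETDC2 is up and replicating fine.
      Bridghead NET-Site\NETDC1 is up and replicating fine.
   Doing intersite inbound replication test on site Remote-Site:
      The ISTG for site Remote-Site is: NETDC2.
      Bridghead NET-Site\NETDC1 is up and replicating fine.
      Bridghead Remote-Site\NETDC2 is up and replicating fine.
"""

if __name__ == "__main__":
    print(istg_by_site(SAMPLE_REPORT))
    print(bridgeheads_up(SAMPLE_REPORT))
```

Each bridgehead appears once per tested site in the report, so the sketch removes the duplicates before returning the list.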



Windows .NET Server 2003 Domains & Active Directory
ISBN: 1931769001
Year: 2002
Pages: 154
