On a larger scale than a single DC is the issue of restoring the entire AD or large pieces of it. Previously, we discussed nonauthoritative restore, which was a simple backup and restore from media or just letting a DC sync from other DCs. This section discusses the authoritative restore, which allows you to roll the AD back to the state it was in at a point in time in the past. Also included are special considerations for restoration of operations masters (Flexible Single Master Operations [FSMO] role holders), recovery from accidental mass deletion of objects, recovery of the NTDS.DIT database, and recovery of the entire forest.

Note: When objects are authoritatively restored, the objects and their attributes overwrite the tombstoned objects created when the objects were initially deleted.

Authoritative Restore

Authoritative restoration of the AD can be done for the entire AD, a tree in the AD (such as a single domain or OU [Organizational Unit]), or an individual object that was accidentally deleted. The concept is that you have identified a source DC that holds the correct version of certain objects, and you want all other DCs to replicate from this source. The source version of each object forcefully replaces all other versions on the other DCs, so this should be done only when other options have failed to repair the situation.

An authoritative restore in a domain with at least two replica DCs is actually a merge operation. A copy of the AD at some point in the past is restored to a DC. This restores all deleted and modified objects to the point in time of the backup. Objects created after the backup will be replicated from the other DCs. This is a good feature in that you restore the deleted objects and keep the ones created in the meantime.
This process is accomplished by identifying a well-connected DC, restoring a backup from tape or other media that has the state of the AD you want to force on the existing DCs, and then using the NTDSUtil authoritative restore feature. Authoritative restore, as well as the restore from media, has to be performed in DS Repair Mode. For example, suppose we installed an AD-enabled application. We installed it on Wednesday and it's now Friday. We want to roll back to the AD as of Tuesday. Assuming we performed a full backup on Tuesday, we could perform the following steps:
You can see the results of raising the Update Sequence Numbers (USNs) by executing the Repadmin command:

    Repadmin /showmeta dc=company,dc=com

The output of that command is shown here with an explanation following:

    38 entries.
    Loc.USN  Originating DC                        Org.USN  Org.Time/Date Ver        Attribute
    =======  ====================================  =======  =======================  =========
    65538    61c413db-23fe-414e-9d46-6c881d4eabc4  65538    2004-02-14 18:20:200001  dc
    74640    61c413db-23fe-414e-9d46-6c881d4eabc4  75554    2004-02-14 23:43:200001  msDS-PerUserTrustTombstonesQuota
    74640    61c413db-23fe-414e-9d46-6c881d4eabc4  75554    2004-02-14 23:43:200001  msDS-AllUsersTrustQuota
    74640    61c413db-23fe-414e-9d46-6c881d4eabc4  75554    2004-02-14 23:43:200001  msDS-PerUserTrustQuota
    4098     61c413db-23fe-414e-9d46-6c881d4eabc4  4098     2003-10-31 22:32:08 1    msDS-Behavior-Version
    74640    61c413db-23fe-414e-9d46-6c881d4eabc4  75554    2004-02-14 23:43:200001  ms-DS-MachineAccountQuota
    74640    61c413db-23fe-414e-9d46-6c881d4eabc4  75554    2004-02-14 23:43:200001  gPOptions
    74640    61c413db-23fe-414e-9d46-6c881d4eabc4  75554    2004-02-14 23:43:200006  gPLink
    74640    61c413db-23fe-414e-9d46-6c881d4eabc4  75554    2004-02-14 23:43:200001  isCriticalSystemObject
    74640    61c413db-23fe-414e-9d46-6c881d4eabc4  75554    2004-02-14 23:43:200001  objectCategory
    74640    61c413db-23fe-414e-9d46-6c881d4eabc4  75554    2004-02-14 23:43:200001  wellKnownObjects
    74640    61c413db-23fe-414e-9d46-6c881d4eabc4  75554    2004-02-14 23:43:200001

Note that in the Ver (version) column, which runs into the time stamp in this listing, there are objects with a version number of 200001, indicating that these objects were authoritatively restored from media that was 2 days old (2 * 100,000 + existing version of 1 = 200,001).

The process described here restores the entire AD. Microsoft KB article 241594, "How to Perform an Authoritative Restore to a Domain Controller in Windows 2000," contains excellent information regarding what can and can't be restored and what objects, such as those in the Configuration container, should and shouldn't be restored.
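The version arithmetic in the Repadmin output can be sketched in a few lines of Python. This is only an illustration, assuming the default increase of 100,000 per day of backup age stated in the text; the function name and its parameters are hypothetical and are not part of NTDSUtil or Repadmin.

```python
DEFAULT_INCREASE_PER_DAY = 100_000  # default version increase stated in the text


def restored_version(existing_version, backup_age_days,
                     increase_per_day=DEFAULT_INCREASE_PER_DAY):
    """Return the version an attribute would show after an authoritative restore.

    Hypothetical helper: models only the arithmetic described in the text,
    existing version + (age of the backup in days * increase per day).
    """
    if existing_version < 0 or backup_age_days < 0 or increase_per_day < 0:
        raise ValueError("all arguments must be non-negative")
    return existing_version + backup_age_days * increase_per_day


# Attributes at version 1, restored from a backup that is 2 days old.
print(restored_version(1, 2))  # 200001
# gPLink was at version 6 in the same listing.
print(restored_version(6, 2))  # 200006
```

The third parameter mirrors the idea of overriding the default increase; how NTDSUtil applies an override internally is not modeled here.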
Note that KB article 241594 applies to Windows Server 2003 as well. If you are doing an authoritative restore, you should read this KB article and associated KB articles. Remember, authoritative restore is the Big Hammer approach to repairing the AD. Don't hit your head with it!

Authoritative Restore: Subtrees and Individual Objects

This method restores specific component(s) of AD and marks them as authoritative for the directory. This method is the most commonly used because there are few occasions when the entire directory needs to be restored. The syntax to restore the Marketing OU that is in the parent NorthAmerica OU in the Company.Com domain is

    Authoritative Restore: restore subtree OU=Marketing,OU=NorthAmerica,DC=company,DC=com

Exit from NTDSUtil and reboot the DC. This server is now the authoritative AD DC for the Marketing OU and changes will be replicated. Individual objects can be restored in the same manner using the Restore Object option and specifying the DN of the object:

    Authoritative Restore: restore object CN=olseng,OU=users,DC=company,DC=com

Authoritative Restore and Override Version Increase

Note in Figure 11.4 that each of the restore options has an associated option to override version increase, such as Restore database verinc %d. This permits you to override the version increase of 100,000. One use for this would be if you suspect that a version increase of 100,000 is insufficient. This can happen when you have to run the authoritative restore twice, either because something goes wrong the first time (it doesn't finish, there is a power outage, and so on) or as a normal consequence of having to restore user accounts before groups. For instance, to set the version increase to 250,000, you could enter the command:

    Authoritative Restore: Restore database verinc 250000

Recovery from Accidental Massive Object Deletion

One of the most common reasons to have to restore all or part of an AD is errors by Administrators.
In fact, in a disaster recovery presentation, Microsoft indicated that it was getting a surprisingly large number of support calls due to large numbers of objects (usually users) being deleted. Microsoft also indicated that the number one way to protect your AD against this is to be careful to whom you give delete privileges. Although some of these operations ask a couple of times if you are sure, many Admins still go through with a destructive approval. In all fairness to those who have done this, there isn't anyone in the IT industry who hasn't been in one directory, thought they were in another, and deleted the directory. In my days as a VMS Administrator, I actually set up security to deny myself delete privilege so I'd have to change it to perform a delete. Some of the common ways to accidentally delete valid objects include
There are some things you can do to reduce the risk of accidental object deletion, such as:
Warning: At the writing of this book, the Lag Site is a new concept. Using the Lag DCs for testing introduces the possibility of putting a Lag DC back online and propagating bad data as a result of the test. Test this procedure and establish rigid testing procedures to ensure you don't populate the production domain with bad data. For instance, suppose that testing a script shows it inadvertently deletes the wrong set of users. Plugging this DC into the production domain will propagate those deletions. However, because the Lag Site replicates only once a week, you will have some window of safety even if this should happen.

When you plan for the recovery of user objects, there is an important concept to understand: the back link issue. This makes restoration of user objects much more complicated than it might seem. The issue and how to successfully recover deleted user objects is detailed in the next section.

Restoring Objects and the Back Link Issue

Restoring user objects isn't as simple as it might seem. In Windows 2000 and Windows Server 2003 domains and forests whose functional level has not been raised to the Windows Server 2003 level, the group membership is treated as a single attribute. The groups a user is a member of are not stored with the user; instead, each membership is stored as a link to the user object in an attribute of the group (the member attribute). However, a group cannot store a link to a user object if that object doesn't exist on the DC. When you authoritatively restore both users and groups at the same time, this can be critical because an Administrator doesn't have control over which objects get replicated to the partner DCs first. If the group happens to be replicated first, then a user's membership can be dropped on the partner DCs because they do not yet have the user object in their database.
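The ordering problem can be illustrated with a small simulation. This is a toy model under stated assumptions, not how AD replication is implemented: a partner DC is represented by a set of users and a dictionary of groups, and the hypothetical function replicate drops any member link whose target user is not yet present, as described above. The object names are examples only.

```python
def replicate(objects_in_order):
    """Apply restored objects to an empty partner DC in the order given.

    Each object is a (kind, name, members) tuple. A group's member link is
    kept only if the user it points to already exists on the partner DC.
    """
    users = set()
    groups = {}
    for kind, name, members in objects_in_order:
        if kind == "user":
            users.add(name)
        elif kind == "group":
            groups[name] = [m for m in members if m in users]
    return users, groups


user = ("user", "olseng", [])
group = ("group", "Marketing", ["olseng"])

# User arrives first: the membership link survives.
_, groups_user_first = replicate([user, group])
print(groups_user_first)   # {'Marketing': ['olseng']}

# Group arrives first: the link is dropped because the user is not there yet.
_, groups_group_first = replicate([group, user])
print(groups_group_first)  # {'Marketing': []}
```

In the second run the user object still arrives, but nothing re-adds the dropped link, which is why the order of restoration matters.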
Thus, the user account must be restored before the group in order to safely restore the group membership (at least within the same domain). This becomes a problem if you are restoring a large section of the AD, such as the entire directory or an OU that contains both users and groups. Microsoft recommends doing an authoritative restore twice to restore deleted user accounts and groups. The first pass tries to restore both users and groups, but the group memberships will fail if the users have not all been replicated first. Running the authoritative restore a second time restores the groups. Note that if your AD design places users and groups in separate OUs so that the user OU can be restored before the group OU, then a single restoration of each OU (first the user OU, then the group OU) is sufficient.

The Problem with Nonrestored Object Links

If accidentally deleted objects are not restored with their associated cross-domain links between users and groups in a multiple-domain environment, there are ramifications for the administration of the domain. After authoritatively restoring accidentally deleted users, the links to groups whose memberships span multiple domains are not restored. In addition, when users that are members of groups in other domains, or groups that are nested in groups in other domains, are restored from backup, other object links will not be restored. These include manager/directReports (in the "Organization" tab of the user properties) and managedBy/managedObjects in objects such as computers and printers. Loss of group membership can have serious consequences, such as
These circumstances are true for Windows 2000 and Windows Server 2003. Note that both Windows 2000 and upgraded Windows Server 2003 forests also have this trouble even in the same domain. That is, the user-to-group object links do not get restored when only restoring the user objects. New group memberships that were added to a Windows Server 2003 forest after switching to the highest forest functional level (enabling Link Value Replication [LVR]) will be restored correctly within the same domain.

Administration Anomalies

The functionality of these object links even in a healthy domain has certain anomalies that make administration difficult in a multiple-domain environment. If a user is a member of a Domain Local Group (DLG) that is hosted in another domain, the Administrator of the user's home domain can't see that group in the user's Member Of properties. For example, Figure 11.6 shows the membership of the DLG Amer-DLG1 in the Qamericas.Qtest.cpqcorp.net domain. Note that there are several users in the group who are members of the root domain Qtest.cpqcorp.net, one of them being Gary Olsen. However, in Figure 11.7, viewing Gary's Member Of tab in his user account properties, the Amer-DLG1 group is not listed. Only groups in the Qtest root domain are shown.

Figure 11.6. Membership of the DLG, Amer-DLG1. Note that Gary Olsen is a member.
Figure 11.7. Listing of user Gary Olsen's group memberships. Note that the DLG Amer-DLG1 in the child domain is not in the list.
In addition, there are problems when attempting to view memberships of a user in a universal group that is hosted in another domain. If you are connected to a GC when you view the user properties, you can see the universal group listed. However, if you are connected to a DC in the user's home domain, you can't see the membership. If you are a system administrator, these issues are probably frustrating. In an attempt to make this behavior consistent, Microsoft filtered the results in Windows Server 2003 so you couldn't see this membership even if connected to a GC. Administrators complained and Microsoft put it back in SP1.

Another object link that gets lost is managedBy. This allows a user or contact (name, e-mail, phone, and so on) to be responsible for a resource (such as a computer or printer), allowing others to know who the primary contact for that machine is. Unfortunately, you can't get a list of objects that a given user is responsible for, and it's difficult to see the managedObjects if the user manages objects in a different domain in the forest. These links also fail to be restored correctly with accidentally deleted objects. The manager/directReports link, defined in the Organization tab, can't see users from another domain, and restoration of the objects won't restore these links.

To summarize, there are basically two problems. A user's group memberships are stored as links in the AD database, and these links are not restored correctly in multiple-domain forests. This problem exists for other links such as manager, directReports, and managedBy. Authoritative restore does not correctly restore these links in multidomain forests, and Windows 2000 even has issues restoring object links in the same domain. Additionally, the default snap-ins, such as AD Users and Computers, will not support viewing these group memberships across domains.
Hewlett Packard recently developed a tool called Active Directory Link Recovery Manager (ADLRM) that provides a way to save and restore these links, as described in the next section.

Active Directory Link Recovery Manager (ADLRM)

Guido Grillenmeier, of HP, was one of the first to discover this relationship of the object links when performing a restore for a customer. He has published internal HP articles on the subject and, with Walter Knopf, developed the ADLRM tool, which can be used to save, restore, and manage these links through a GUI-based console. ADLRM specifically addresses the problems noted previously in this section regarding the unrestored cross-domain links and other object links by storing them in a SQL (or Microsoft Database Engine [MSDE]) database and providing a GUI-based interface for management. ADLRM's capability to view and manage inter-domain links, cross-domain links, and memberships of users and groups in a DLG is powerful for Administrators because these links are otherwise invisible. For example, all the links for a user can be displayed, as seen in Figure 11.8.

Figure 11.8. A user's group membership links can be displayed in the ADLRM tool; this information is not otherwise viewable.

The ADLRM tool has the following capabilities:
It is important to note that ADLRM is not intended to be a backup tool or to replace the system state backup process. It is an add-on tool to repair a hole in the process of authoritatively restoring AD objects. You can see a block diagram of the processes used by ADLRM in Figure 11.9. The two core components, the collector service and the console, run on the same machine, but separately. The ADLRM collector service is the heart of the system and stores the links from all domains in the forest in a central SQL database (see Figure 11.10). Features and recommendations include
Figure 11.9. Block diagram showing processes in the ADLRM tool.

Figure 11.10. The collector service of the ADLRM tool shows all links for all domains.

Using ADLRM in combination with proper recovery practices using authoritative restore provides the best possibility of success in restoring accidentally deleted objects in single as well as multiple-domain forests. It is important to note that if your disaster recovery strategy is to reanimate tombstones, as provided by Windows Server 2003, not all of the attributes are recovered because the tombstone only contains a subset of the attributes. ADLRM does not take care of backing up or restoring these attributes; this must be done by other means.

Microsoft has recently released KB article 840001, "How to Restore Deleted User Accounts and Their Group Memberships in Active Directory." This article is quite detailed regarding the issue of restoring deleted objects. Note that issues such as this one are found and solved as a natural consequence of using AD. It is advisable, then, to monitor Microsoft's Web site for updates on this and other issues. Microsoft KB article 840001 provides information in six important areas:
ADLRM can also be used to repair the LVR-specific attributes that are not updated correctly in a Windows 2000 to Windows Server 2003 upgrade. The tool can be used to repair the LVR status of groups as well as the manager or managedBy attributes. The ADLRM tool is available from HP by going to the Web site at http://TheADLRMWebsite.com.

Recovery of SYSVOL

Authoritative and nonauthoritative restores of SYSVOL are not related to the authoritative and nonauthoritative restore procedures described here. Refer to Chapter 5 for details on recovery of the SYSVOL tree. Remember that the system state includes the SYSVOL tree.

The restoration of the SYSVOL tree when using the IFM feature in Windows Server 2003 has some caveats. I know of a customer who had performed a DCPromo using the IFM operation to save bandwidth over the network. However, the customer noticed that at the end of the promotion, in spite of the fact that SYSVOL is part of the system state that was used as a source for the promotion, a full sync was performed from an existing DC to the newly promoted DC. Because SYSVOL was about 250MB, this was a problem. The customer finally resolved it by clearing the ADM cache, but the problem centers around the fact that the system state must be restored to the same volume that you specify to host SYSVOL in the DCPromo UI. For example, you can't restore system state to drive C: and then specify the C: drive for the NTDS.DIT and logs, and the D: drive for SYSVOL. In this case, SYSVOL will not be sourced from the restored system state. If you want to place the database (NTDS.DIT) and the logs on one volume and SYSVOL on another, you must follow a defined process, which is described in Microsoft KB article 311078, "Install From Media."

Recovery of Operations Masters

Operations masters, sometimes called FSMO role holders, have unique characteristics that make them more important than the ordinary vanilla DC.
When one of these DCs becomes unavailable, the method of restoration depends on the FSMO role it holds, the type of role, how long it will be before the DC is restored to the network (if ever), and how long your environment can live without that role. The options for recovering these DCs are as follows:
One way to provide for recovery of FSMO role holders is to identify a DC as the Standby FSMO. This DC should have sufficient resources (memory, disk, and so forth) to be a role holder and should be in a site with good connectivity so that it is always up-to-date. Thus, when it's necessary to seize an FSMO role, you can seize it to this Standby FSMO. Repadmin provides an excellent way to determine whether the DC that you want to seize a role to is up-to-date in replication with the /showvector option. For instance, suppose the role holders are as follows (the listing has the format of netdom query fsmo output):

    Schema owner          hpqnet-dc5.hpqnet.qtest.cpqcorp.net
    Domain role owner     hpqnet-dc3.hpqnet.qtest.cpqcorp.net
    PDC role              hpqnet-dc3.hpqnet.qtest.cpqcorp.net
    RID pool manager      hpqnet-dc1.hpqnet.qtest.cpqcorp.net
    Infrastructure owner  hpqnet-dc3.hpqnet.qtest.cpqcorp.net
    The command completed successfully.

HPQnet-DC3 holds the domain naming master, the PDC Emulator, and the infrastructure master roles. To find out who has the most updated replication from HPQnet-DC3, we use the Repadmin /showvector command, specifying the DN of the domain followed by the name of the DC we want to look at. In this case, there are three other DCs: HPQnet-DC9, HPQnet-DC5, and HPQnet-DC1.
So, we execute the command for each DC and observe the USN for HPQnet-DC3:

    C:\>repadmin /showvector dc=hpqnet,dc=qtest,dc=cpqcorp,dc=net HPqnet-dc1.hpqnet.qtest.cpqcorp.net >fsmodc1.txt
    Dublin\HPQNET-DC9 @ USN 3745860 @ Time 2004-02-18 03:43:08
    Brussels\HPQNET-DC5 @ USN 2201360 @ Time 2004-02-18 03:38:25
    Alpharetta\HPQNET-DC3 @ USN 2871861 @ Time 2004-02-18 03:54:10
    Seattle\HPQNET-DC1 @ USN 1570576 @ Time 2004-02-18 04:25:09

    C:\>repadmin /showvector dc=hpqnet,dc=qtest,dc=cpqcorp,dc=net HPqnet-dc5.hpqnet.qtest.cpqcorp.net >fsmodc1.txt
    Dublin\HPQNET-DC9 @ USN 3746087 @ Time 2004-02-18 04:10:54
    Brussels\HPQNET-DC5 @ USN 2201478 @ Time 2004-02-18 04:24:28
    Alpharetta\HPQNET-DC3 @ USN 2871959 @ Time 2004-02-18 04:07:10
    Seattle\HPQNET-DC1 @ USN 1570573 @ Time 2004-02-18 03:58:12

    C:\>repadmin /showvector dc=hpqnet,dc=qtest,dc=cpqcorp,dc=net HPqnet-dc9.hpqnet.qtest.cpqcorp.net >fsmodc1.txt
    Dublin\HPQNET-DC9 @ USN 3746189 @ Time 2004-02-18 04:24:47
    Brussels\HPQNET-DC5 @ USN 2201478 @ Time 2004-02-18 04:23:23
    Alpharetta\HPQNET-DC3 @ USN 2872067 @ Time 2004-02-18 04:22:09
    Seattle\HPQNET-DC1 @ USN 1570573 @ Time 2004-02-18 03:58:12

The USN for DC3 as it appears on DC1, DC5, and DC9 is shown in Table 11.1. HPQnet-DC9 has the highest USN and thus is most up to date with HPQnet-DC3.

Table 11.1. Comparison of HPQnet-DC3's USN on the Three Other DCs

    DC examined    USN recorded for HPQnet-DC3
    HPQnet-DC1     2871861
    HPQnet-DC5     2871959
    HPQnet-DC9     2872067
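The comparison can also be done with a short script rather than by eye. The following is a minimal sketch, assuming each line of saved /showvector output has the form Site\DC @ USN n @ Time ... as in the listings above; the function names usn_for and best_candidate are hypothetical helpers, not part of Repadmin.

```python
import re

# One line of /showvector output as shown in the text:
#   Alpharetta\HPQNET-DC3 @ USN 2871861 @ Time 2004-02-18 03:54:10
LINE_RE = re.compile(r"^[^\\]+\\(?P<dc>\S+)\s+@\s+USN\s+(?P<usn>\d+)\s+@\s+Time\s+.+$")


def usn_for(showvector_output, source_dc):
    """Return the USN that one DC's output records for source_dc."""
    for line in showvector_output.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group("dc").lower() == source_dc.lower():
            return int(match.group("usn"))
    raise ValueError(f"{source_dc} not found in output")


def best_candidate(outputs, source_dc):
    """Return the DC whose output shows the highest USN for source_dc."""
    return max(outputs, key=lambda dc: usn_for(outputs[dc], source_dc))


# The HPQNET-DC3 and HPQNET-DC9 lines from each of the three listings above.
outputs = {
    "HPQNET-DC1": r"""Dublin\HPQNET-DC9 @ USN 3745860 @ Time 2004-02-18 03:43:08
Alpharetta\HPQNET-DC3 @ USN 2871861 @ Time 2004-02-18 03:54:10""",
    "HPQNET-DC5": r"""Dublin\HPQNET-DC9 @ USN 3746087 @ Time 2004-02-18 04:10:54
Alpharetta\HPQNET-DC3 @ USN 2871959 @ Time 2004-02-18 04:07:10""",
    "HPQNET-DC9": r"""Dublin\HPQNET-DC9 @ USN 3746189 @ Time 2004-02-18 04:24:47
Alpharetta\HPQNET-DC3 @ USN 2872067 @ Time 2004-02-18 04:22:09""",
}

print(best_candidate(outputs, "HPQNET-DC3"))  # HPQNET-DC9
```

A higher recorded USN only shows which DC has received the most changes from the failed role holder; it does not replace checking that replication to the candidate is otherwise healthy.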
Because HPQnet-DC9 has the highest USN recorded for HPQnet-DC3, HPQnet-DC9 is the best candidate to seize the FSMO roles held by HPQnet-DC3.

Roles and Implications of Loss

Whether or not you immediately recover a role holder depends on the function of the role and how long you can get along without it. Here is a summary of this functionality.
To determine whether you need to seize the role, ask the following questions:
If you answer yes to all three questions, then you should seize the role. If you answer no to any of them, then wait for the DC to come back online and don't seize the role. In addition, you should never restore a DC holding the RID master.

Note: Additional information on FSMO role holders and their function and replacement is available in the "FSMO Placement" section of Chapter 6.

Disaster Recovery Q&A

To reinforce some of these concepts, let's do a Q&A. See if you pass the test!
Note: Remember that cross-domain group memberships won't be recovered in the restore. This was covered earlier in this section.

Database Recovery

Another aspect of disaster recovery is the recovery and maintenance of the AD database. This section describes some basics of the database architecture, how the associated transaction logs work in a write operation, and recovery procedures. Also included are some tips on defragmenting and running integrity checks on the database and the file structure. Figure 11.11 shows a conceptual diagram of the AD architecture. A thorough description of the layers in this architecture is given in Chapter 8 of my Windows 2000 book, Windows 2000 Active Directory Design and Deployment (New Riders, 2000).

Figure 11.11. AD architecture.
The Directory Store: NTDS.DIT File

The data or directory store is the NTDS.DIT file. This should be familiar to anyone who has installed a DC. One of the questions posed during the final stages of the UI portion of DCPromo is for a desired location for the NTDS.DIT file. This is the database of the AD in one file. It's important to plan and predict the potential size of this file to obtain adequate disk space. Experimentation with large numbers of AD objects (tens of millions) suggests that performance is enhanced by putting the NTDS.DIT file on a different disk than the logs and SYSVOL share, not only for storage reasons, but also for performance on disk access. These should all be separate from the system disk.

Note: The NTDS.DIT file can be moved to a different location with the NTDSutil.exe program using the move db command when a DC is booted to the DSRM. See Microsoft KB article 257420, "HOW TO: Move the Ntds.dit File or Log Files," for details.

Other Files

As shown in Figure 11.11, there are three other kinds of files in addition to the NTDS.DIT file: the log files, the EDB.CHK file, and the TEMP.EDB file. The EDB logs contain information not yet saved in the NTDS.DIT file. EDB.CHK is referred to as the checkpoint file and holds the current transaction. The EDB.CHK file is used to restore log files. The TEMP.EDB file holds the transactions that are in progress as a new log file is being created.

The AD Write Operation

Figure 11.12 illustrates how a write operation is processed in the AD. In step 1, the client requestor performs a write operation. This could be an Administrator adding a user. When this operation occurs, shown in step 2, the LSASS process saves that data in a log buffer in memory. In step 3 (two arrows in the figure), the data is then saved in both the log file and in a memory cache. If the transaction is successful on both the log file and memory cache, then the data is saved in the NTDS.DIT file, as shown in step 4.

Figure 11.12. The AD write operation is essentially a four-step process.
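The ordering of the four steps can be sketched as a toy model. This illustrates only the sequence described above and is not the actual implementation of LSASS or the database engine; the class ToyDirectoryStore and its attribute names are hypothetical, with plain Python lists and dictionaries standing in for the log buffer, the log file, the memory cache, and NTDS.DIT.

```python
class ToyDirectoryStore:
    """Toy model of the four-step write path described in the text."""

    def __init__(self):
        self.log_buffer = []  # step 2: in-memory log buffer
        self.log_file = []    # step 3: stands in for the EDB log on disk
        self.cache = {}       # step 3: in-memory cache
        self.database = {}    # step 4: stands in for NTDS.DIT

    def write(self, key, value):
        # Step 1: the client submits a write (for example, adding a user).
        record = (key, value)
        # Step 2: the data is saved in the log buffer in memory.
        self.log_buffer.append(record)
        # Step 3: the data is saved in both the log file and the memory cache.
        self.log_file.append(record)
        self.cache[key] = value
        logged = record in self.log_file
        cached = self.cache.get(key) == value
        # Step 4: only if both succeeded is the data saved in the database.
        if logged and cached:
            self.database[key] = value
            self.log_buffer.remove(record)
            return True
        return False


store = ToyDirectoryStore()
print(store.write("cn=olseng", {"telephoneNumber": "555-0100"}))  # True
print(store.database)  # {'cn=olseng': {'telephoneNumber': '555-0100'}}
```

Because every write reaches the log before it reaches the database, the log can later be used to rebuild changes that never made it into the database file.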
With a basic understanding of the Windows Server 2003 system architecture, the layered AD architecture, and how the AD writes information to the AD data store, we can turn our attention to managing the AD database and issues such as predicting the database size, which was noted previously in this chapter.

AD Database Management

In the previous section, we saw how the physical database, cache, and log files were used during the AD write operation. We will now turn our attention to the management of the AD database to determine predictable resource demands to aid the architect in determining resource allocation. It's important to understand the structure of the database tables. Objects, such as users and printers, are recorded in the database tables as rows. Attributes, such as a user's address and phone number, are recorded as columns.

Backing Up and Restoring the AD

The previous discussion of how the AD writes data to the data store described interaction with a number of log files that probably seemed redundant. They are redundant by design. By writing data to these logs and verifying successful transactions before the data is finally written to the NTDS.DIT file, the AD protects data integrity. The logs also provide a method of restoring data after a system crash to bring the AD to the last state recorded in the logs. The AD restoration operation is shown in Figure 11.13. If the AD terminates unexpectedly, such as with a power outage or system crash, changes held in memory cannot be written to the database on disk, and a recovery is attempted on the next boot. When the system starts up again, the log files are read sequentially and the changes recorded in them are applied to the database to bring it to the state it was in when the AD terminated.

Figure 11.13. Flow chart of AD recovery after an unplanned shutdown or crash.
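The recovery step can be sketched as a simple log replay. This is a toy model, assuming each log record is a (key, value) write and that a checkpoint is just the position in the log from which replay starts; the function recover and its parameters are hypothetical and do not reflect the on-disk format of the EDB logs or EDB.CHK.

```python
def recover(database, log_records, checkpoint=0):
    """Replay logged writes in order, starting at the checkpoint position.

    database: dict holding the state that reached disk before the crash.
    log_records: list of (key, value) writes in the order they were logged.
    checkpoint: index of the first record not known to be in the database;
                0 means start from the oldest record available.
    """
    recovered = dict(database)
    for key, value in log_records[checkpoint:]:
        recovered[key] = value
    return recovered


# Only the first write reached the database before the crash.
on_disk = {"cn=alice": "phone=111"}
log = [
    ("cn=alice", "phone=111"),
    ("cn=bob", "phone=222"),
    ("cn=alice", "phone=333"),
]

state = recover(on_disk, log, checkpoint=1)
print(state)  # {'cn=alice': 'phone=333', 'cn=bob': 'phone=222'}
```

In this model, replaying from the oldest record instead of the checkpoint gives the same final state; the checkpoint only saves the work of reapplying writes that are already in the database.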
The log file in which this information is stored is named edbXXXXX.log, where XXXXX is a sequential number starting at 00001. Following is a directory listing of the edb logs from a DC.

    Directory of C:\WINNT\NTDS

    02/19/2000  07:49p      <DIR>          .
    02/19/2000  07:49p      <DIR>          ..
    12/29/1999  12:19a      <DIR>          Drop
    02/16/2000  07:54p               8,192 edb.chk
    02/19/2000  07:49p          10,485,760 edb.log
    02/19/2000  07:49p          10,485,760 edb00001.log
    02/11/2000  07:38p          10,502,144 ntds.dit
    01/24/2000  10:31p          10,485,760 res1.log
    01/24/2000  10:31p          10,485,760 res2.log
    02/11/2000  07:38p           2,113,536 temp.edb
                   7 File(s)     54,566,912 bytes
                   3 Dir(s)     990,056,448 bytes free

Note that the EDB log files are sequentially numbered in hex. This numbering is used to restore the data in order. The edb.log file is the current log file being written. Note that in this listing there is an edb00001.log. This was the first edb.log file written. It was filled to the maximum size, closed, and saved as edb00001.log; then, a new edb.log was started. The two "reserve" logs shown in the directory listing, res1.log and res2.log, are simply placeholders that reserve disk space to be used by the log files for a controlled shutdown if disk space is exhausted. The EDB logs, as well as the res1.log and res2.log files, are always the same size: 10MB.

Obviously, there is a need to limit the size of the logs to fit within the physical storage limits. Windows Server 2003 enables circular logging, which overwrites the oldest log file after a certain number of log files are created. This is the default. If circular logging is turned off, the Administrator must delete the old EDB logs manually to manage disk space. The res1.log and res2.log files are not needed when circular logging is enabled. Edb.chk is the checkpoint file that knows from where in the log file the recovery process should start.
If this is missing or not accessible, the recovery would have to start with the oldest log file it could find and determine where to begin to write data to the information store.

Defragmentation and Integrity Checking

Defragmentation consists of online and offline operations. Online defragmentation takes place automatically as part of the garbage-collection process that purges expired tombstones. Online defragmentation releases more space for the database to use, but doesn't release any space back to the system. You should do an offline defragmentation on a regular basis. Compacting the NTDS.DIT database can make a significant difference in the time it takes to replicate. HP's current NTDS.DIT size of a GC is about 7.2GB compressed (depending on the domain of the GC) and more than 9GB uncompressed. This is important in backup and recovery times, or when promoting a new DC or GC. As noted previously in Chapter 1, the IFM feature considerably reduces the time required to promote a GC. Refer to Microsoft KB articles 229602, "Defragmentation of the Active Directory Database," and 232122, "Performing Offline Defragmentation of the Active Directory Database."

If you suspect that the database is corrupt, you can run the Semantic Database Analysis option in NTDSUtil. You must be booted into DSRM to do this, but I've seen a lot of problems caused by database corruption that were fixed by running this option. There is a Fixup option that fixes any problems it encounters as well as an option to just report the results. I've never seen this hurt the database; it can only help. Refer to Microsoft KB article 315136, "HOW TO: Complete a Semantic Database Analysis for the Active Directory Database by Using NTDSUtil.exe."

Recovery of a Forest

The absolute worst-case scenario in AD disaster recovery is to have to recover an entire forest from backups. This scenario should also be considered as part of your disaster recovery plan.
Typically, this plan would allow for a duplicate set of backup tapes sent to an offsite facility, probably managed by a separate company that specializes in this work. The frequency of this delivery is up to you; once a day is probably too much work, so perhaps send the weekly full backup tapes. The plan should specify that this third-party company has the capability to get new hardware and restore the forest. Make sure that you not only define the procedure, but also test it to make sure it will work. Again, a backup without validation is a waste of time and an incredible risk. You might deploy such a plan for a number of reasons:
Note: Restoring a forest from backup will lose all the changes in the AD since that backup was made. Make sure you include a plan to re-create the objects created between the backup date and the date of the disaster.

Disaster Recovery Site

Many companies employ a disaster recovery site (DRS) in which a DC from each domain is in a physical location separate from the company's buildings. A high-speed link connects the DRS with the hub site of the normal infrastructure, or perhaps additional links and associated replication topology are built so that this DRS is always up to date. Thus, if you do lose all your DCs in all your sites, you still have a nucleus in the DRS, which is much easier to restore from than backup tapes, and with far less potential loss of data. Figure 11.14 is the topology of one customer I worked with who employed this concept. This shows the replication topology employed to ensure the DRS had preferential replication from the corporate hub site.

Figure 11.14. AD replication topology map for a Disaster Recovery Site (DRS).

Forest Recovery from Media

There are valid reasons for needing to recover an entire forest from media, but there will seldom be an occasion to do so. Barring a natural disaster that destroys every DC in every domain in the infrastructure, there are certainly better and safer ways of recovery. Remember you are betting the company's entire infrastructure on this operation, and there are a number of dependencies such as:
Preparing for a complete forest recovery involves a great deal of planning and testing to make sure the plan works. After you develop the plan, test it by taking the backup media and restoring it to new hardware. Because you can't guarantee that the hardware you restore it to will be identical to the hardware it's running on now, make sure you follow Microsoft's procedure for restoring to dissimilar hardware, detailed in KB article 263532, "How to Perform a Disaster Recovery Restoration of Active Directory on a Computer with a Different Hardware Configuration."

Again, this is a procedure to be used in a "life or death" scenario. If there is no other possible way to recover your AD and Microsoft advises it, then use the procedure documented in Microsoft's whitepaper "Best Practice Recommendation for Recovering your Active Directory Forest" located at http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=3EDA5A79-C99B-4DF9-823C-933FEBA08CFE. Even during the writing of this book, some processes of disaster recovery have changed per Microsoft's recommendations, so you should make sure you are acquainted with the most recent Microsoft recommendations.

Warning: Be aware of the incredible impact that a recovery of the entire forest could have on a company, especially a large global enterprise with multiple domains and perhaps tens of thousands of user accounts. Of course, if all your DCs are in a building that burns down and you have no other sites, then you have no choice. But barring physical destruction of all DCs in the forest, do the following: