Restoring data in AD is flexible, and the method you choose depends on what has been lost or corrupted. In the simplest and most common case, a single DC or GC server needs to be restored. This could be caused by
When a DC becomes unavailable, the decision of how to restore it depends on whether (and when) it will come back online, whether the failure is a hardware or software issue that is not AD-related, or whether the failure is due to an AD failure, such as broken replication. If the failure is a hardware issue, Windows Server 2003, like Windows 2000, provides Safe Mode startup and the Recovery Console, which allows a low-level boot, permitting the Administrator to replace drivers, make Registry changes, or make other repairs. The Recovery Console is an option available in Windows Server 2003 Setup that can be initialized by following these steps:
The next section describes how to recover DCs and GCs if AD problems exist.

Repairing and Restoring DCs and GCs

One of the things we have learned supporting Windows 2000, and that Microsoft actually perpetuated, was to use demotion of a DC as a common troubleshooting technique. I know it sounds kind of lazy, but if you have made a reasonable attempt to fix a DC that is having trouble with replication, AD database corruption, and so on, and if the problem seems isolated to one DC, the easiest and most effective repair is to demote the DC and repromote it back into the domain. The only data lost would be objects created on that DC or GC that had not replicated to any others. Those objects would have to be re-created or restored from media if they had been backed up. Obviously, if the problem is rampant, you need to spend some time and find the cause, but if the problem is a missing object, a corrupt ntds.dit file, or a missing service principal name (and a reasonable effort to repair it has failed), just repromote it. I've seen Administrators spend days on a problem that a demote-repromote action would have solved in a few hours.

This rule applies to GCs as well, but GCs typically hold considerably more data, have to sync with other GCs, and might take much longer to replicate. Demoting a GC has other repercussions if Exchange is deployed because the Global Address List (GAL) is held in the GC. Exchange clients will find another GC to get the GAL from, but performance will be reduced if the local GAL is removed and they have to connect to a remote GC.

The strategy for recovering DCs and GC servers depends on the scenario: recovering a DC when other DCs exist in the domain, recovering a GC when other GCs exist in the forest, or recovering a DC or GC when no others exist in the domain.

Nonauthoritative Restore

A nonauthoritative restore is simply the restoring of a DC from backup.
When the DC completes the restore process, it will get updates from its peer DCs via normal AD replication. The process to perform this restore operation is as follows:
After rebooting, the system will replicate with its replication partner to get changes made since the backup that was just used to restore the system state.

Note: A nonauthoritative restore of AD automatically executes a nonauthoritative restore of SYSVOL; therefore, no additional steps are required. SYSVOL is included in the system state backup. Restoration of SYSVOL using the Install From Media feature requires certain precautions noted in the Microsoft KB article 311078, "Install from Media," to promote replica Windows Server 2003 DCs. Recovery using the Install From Media feature is covered later in this chapter as well.

Effects of Tombstonelifetime

As explained in Chapter 5, "Active Directory Logical Design," tombstonelifetime is a forest setting that defines how long a deleted object remains in the deleted objects container before it is purged from the AD. Within this timeframe, the deleted object (tombstone) is replicated to other DCs to inform them of the deletion. After the tombstonelifetime has expired, the garbage collector removes the deleted objects from the AD.

The valid lifetime of any backup media used to restore AD is equal to the TombstoneLifetime parameter, which is, by default, 60 days. Chapter 5 explained that if a DC or GC came back online after 60 days, it might contain objects that had been deleted in the meantime and since purged from the AD. It then tries to replicate them again because its replication partners no longer have them in their copy of the AD. This causes orphaned objects to be propagated in the AD, which then breaks AD replication. Windows Server 2003 and Windows 2000 SP3+ provide ways to prevent and repair this situation. However, restoring a DC or GC from backup media that is more than tombstonelifetime days old will have the same effect: those old purged objects will be replicated. It is not recommended to change this value.
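The backup-age rule described above can be expressed as a small check. The following is a minimal Python sketch, not a Microsoft tool: the function name `backup_is_restorable` and its parameters are hypothetical, and the 60-day default simply mirrors the default TombstoneLifetime stated in the text. In practice you would use the value actually configured in your forest rather than assume the default.

```python
from datetime import date

# Default stated in the text; the value configured in your forest may differ.
DEFAULT_TOMBSTONE_LIFETIME_DAYS = 60


def backup_is_restorable(backup_date: date, restore_date: date,
                         tombstone_lifetime_days: int = DEFAULT_TOMBSTONE_LIFETIME_DAYS) -> bool:
    """Return True if the backup is younger than the tombstone lifetime.

    Hypothetical helper: a backup older than the tombstone lifetime may hold
    objects whose tombstones have already been purged, so restoring it could
    reintroduce them. A backup exactly at the limit is treated as too old,
    which is a deliberately conservative assumption.
    """
    if restore_date < backup_date:
        raise ValueError("restore_date cannot be earlier than backup_date")
    age_days = (restore_date - backup_date).days
    return age_days < tombstone_lifetime_days


if __name__ == "__main__":
    # A 45-day-old backup and a 74-day-old backup against the 60-day default.
    print(backup_is_restorable(date(2004, 1, 1), date(2004, 2, 15)))
    print(backup_is_restorable(date(2004, 1, 1), date(2004, 3, 15)))
```

A check like this could run before a restore job is started, so that an expired tape is rejected before it reaches a DC.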
That said, I have seen statements by Microsoft recommending a Tombstone Lifetime of 120 days to mitigate the effects of a short lifetime.

Disaster Recovery of AD on Different Hardware

It is possible to recover AD onto servers whose hardware configuration differs from that of the servers on which the backup was performed. Obviously, the closer you can get to the same hardware, the easier the restore will be, but it can be done. This procedure is intended for an off-site recovery plan. Many organizations contract with a company that specializes in data storage and recovery, which in turn stores a copy of all backups in an off-site facility. In the event a disaster wipes out all of the DCs, this vendor could take the backup tapes, obtain new hardware, and restore the backups to the new hardware, which might be different than the original hardware that hosted the DC.

Microsoft has published an article on how to do this in KB article 263532, "How to perform a disaster recovery restoration of Active Directory on a computer with a different hardware configuration," which contains step-by-step instructions. At the writing of this book, this KB article specifies Windows 2000, and there are no specific instructions for Windows Server 2003, so I'd advise you to watch Microsoft's site for updates. In selecting the hardware, the new server must meet the following criteria:
Note that you can always upgrade the hardware (add processors, disks, and so on) after you complete the recovery.

Manual Demotion of a DC/GC

One of the problems in restoring a single DC by demoting and repromoting via DCPromo is that DCPromo can fail. For instance, one of the most frequent reasons to repromote a DC is failure of replication. If replication is broken, DCPromo won't be able to replicate the changes (to remove this DC's objects from the AD on its partners), so you are stuck. Not long after Windows 2000 was released, Microsoft came up with a method to manually demote a DC. Although complex, the method worked. Microsoft never published the method because it hadn't been fully tested, but Microsoft would step you through the procedure if you logged a case and a manual demotion was required to solve the problem.

Warning: Manual demotion of a DC removes the DC from the domain (deletes the computer object) and leaves it as a standalone server in a workgroup. If the DC is also an Exchange server, manually demoting it requires resetting security afterward. See the "Caveats" section (coming up) for details on how to fix this. Likewise, any application relying on domain security could be affected by manual demotion. Be sure to determine whether you have apps that fall into this category and determine the necessary recovery procedures.

Manually Demoting a DC

Windows Server 2003 and Windows 2000 SP4 and later provide an official way to force a DC to be demoted to a standalone server:
This procedure deletes the domain computer account and puts the server in a workgroup, unjoined from the domain. To rebuild this standalone server as a DC with the same name, perform the following steps:
Note: If you repromote the DC using a different computer name, you can promote the DC back into the domain without waiting for end-to-end replication. Also, if you try to promote a machine back into the domain with the same DC name before replication completes, it will eventually work. Refer to the Replication section in Chapter 5 for additional details.

Caveats

The first caveat is critical to understand before you do this. Unlike a normal DC demotion that bumps the DC back to a member server, manual demotion puts it in a workgroup. In the process, it deletes the computer account. If the DC is only a DC, that's no big deal: when DCPromo is run again, the computer account will be re-created. However, if the DC is also hosting an application that uses the computer account, it will likely break the application. A good example is Exchange. Manually demoting a DC that is also an Exchange server removes the computer account that contains the security Access Control Entries (ACEs) Exchange needs to work. If you DCPromo the machine back into the domain, the account will be re-created, but the ACEs will not be there.

The computer account for the Exchange server is granted Full Control on the Exchange server configuration object of the same name:

CN=ServerName,CN=Servers,CN=AdminGroupName,CN=Administrative Groups,CN=OrgName,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=domain,DC=com

The permissions for the computer account should also flow down to the child objects. To restore them, open the ADSIedit tool, available in the Windows Server 2003 Support Tools located in the \Support directory on the Windows Server 2003 CD. Go to Start, Run, and enter ADSIedit.msc to open the snap-in. Browse to the Distinguished Name (DN) path just noted. Figure 11.3 shows how we drilled down to the rights for Exchange server ALFNADRLAB5 in the ALFMSLAB.Local domain, in the Exchange Organization ALFMESSAGELAB, and a member of the FirstSite administrative group.
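To make the DN pattern above concrete, here is a minimal Python sketch that assembles the path from its variable parts. The function name `exchange_server_dn` and its parameter names are hypothetical and exist only for illustration. The sketch assumes a single-domain forest, as in the lab example (in a multi-domain forest the Configuration container sits under the forest root domain), and it assumes the names contain no characters that would need escaping in a DN.

```python
def exchange_server_dn(server: str, admin_group: str, org: str, dns_domain: str) -> str:
    """Build the DN of an Exchange server configuration object.

    Follows the pattern quoted in the text. Assumes a single-domain forest
    and names without DN special characters (commas, plus signs, and so on).
    """
    # Turn "alfmslab.local" into ["DC=alfmslab", "DC=local"].
    domain_components = [f"DC={label}" for label in dns_domain.split(".") if label]
    parts = [
        f"CN={server}",
        "CN=Servers",
        f"CN={admin_group}",
        "CN=Administrative Groups",
        f"CN={org}",
        "CN=Microsoft Exchange",
        "CN=Services",
        "CN=Configuration",
        *domain_components,
    ]
    return ",".join(parts)


if __name__ == "__main__":
    # Values taken from the lab example in the text.
    print(exchange_server_dn("ALFNADRLAB5", "FirstSite", "ALFMESSAGELAB", "alfmslab.local"))
```

With the lab values, the result is the same object that the ADSIedit drill-down reaches, so the string can be used to double-check that you have browsed to the right place.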
For this example, we expanded the following folders:

Configuration [alfnadrlab5.alfmslab.local]
CN=Configuration,DC=alfmslab,DC=local
CN=Services
CN=Microsoft Exchange
CN=ALFMESSAGELAB
CN=Administrative Groups
CN=FirstSite
CN=Servers
CN=ALFNADRLAB5

Right-click CN=ALFNADRLAB5 to get the Properties page. Select the Security tab, and then click the Advanced button at the bottom of the page. In the Advanced Security Settings dialog box, select the machine account, in this case ALFNADRLAB5$, and then click the Edit button. In the Permission Entry for ALFNADRLAB5$ dialog box, check the Full Control right (all boxes for all rights should then be checked). In the Apply Onto field, select This Object and All Child Objects. This is an important step, because if this is left as This Object Only, Exchange won't work (no mail received or sent).

Preventing Disaster: The Lag Site

Errors, inconsistencies, and corruption in AD can be corrected by an authoritative restore, described in the following section, which rolls the AD back to the version that existed when the last backup was completed. However, this has repercussions in that data and changes made since the backup can be lost. In addition, you can't restore the schema other than by restoring the entire forest as described in the "Recovery of a Forest" section in this chapter. Utilizing a Lag Site can conceivably mitigate these disaster scenarios. Think of it as an almost-real-time backup.

The concept of a Lag Site is to schedule the replication frequency of one or more sites so that the site doesn't replicate for several days, purposely keeping the associated DCs several days behind in replication. Of course, you don't want to have a bunch of DCs in normal sites in this condition, so a special site is created with a DC in it, and the replication frequency on the site link to that site is configured with a long time period. The lag site would have the following characteristics:
As with everything, there are some drawbacks:
Currently, there are a number of companies, including HP, either using Lag Sites or considering their use. Of course, this should be tested and thoroughly analyzed to ensure it is a good solution for your company.
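The timing logic behind a Lag Site can be sketched in a few lines of Python. This is an illustration of the reasoning only, not a tool that queries AD: the function names, the idea of tracking a "last inbound replication" timestamp, and the five-day interval in the example are all assumptions made for the sketch. It answers two questions an administrator would ask after an unwanted change: does the lag-site DC still hold the data as it was before the change, and how long is it until that DC is scheduled to replicate and pick the change up?

```python
from datetime import datetime, timedelta


def lag_dc_predates_change(last_inbound_replication: datetime,
                           unwanted_change_time: datetime) -> bool:
    """True if the lag-site DC last replicated before the unwanted change."""
    return last_inbound_replication < unwanted_change_time


def time_until_next_replication(last_inbound_replication: datetime,
                                lag_interval: timedelta,
                                now: datetime) -> timedelta:
    """Time remaining before the lag-site DC is due to replicate again.

    Returns a zero timedelta if the next scheduled replication is already due.
    """
    next_replication = last_inbound_replication + lag_interval
    return max(next_replication - now, timedelta(0))


if __name__ == "__main__":
    # Hypothetical timeline: lag DC replicated on May 1, bad change on May 3.
    last_sync = datetime(2004, 5, 1, 2, 0)
    bad_change = datetime(2004, 5, 3, 9, 30)
    now = datetime(2004, 5, 3, 12, 0)

    if lag_dc_predates_change(last_sync, bad_change):
        remaining = time_until_next_replication(last_sync, timedelta(days=5), now)
        print(f"Lag DC still holds the earlier data; next replication due in {remaining}")
```

The second function matters because the window is finite: once the lag-site DC replicates, it receives the unwanted change like every other DC.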