Lesson 3: Designing a Disaster Recovery Strategy

Despite your best efforts to protect your systems from failure and ensure high availability to your Web clients, problems can arise and you might find yourself trying to recover from some sort of disaster, such as a destructive virus, system failure, theft or sabotage, or a natural disaster like a fire or flood. As a result, you should have a recovery plan in place in order to minimize loss of data and services. When designing a recovery strategy, you must prepare for disasters by taking steps that help to ensure a smooth recovery. For example, you should prepare Setup disks, Startup disks, and Emergency Repair Disks (ERDs) for your systems so that they’re ready to use should a disaster occur. In this lesson you’ll learn about the steps that you need to take in order to design a disaster recovery strategy.

After this lesson, you will be able to

  • Plan a disaster recovery strategy

Estimated lesson time: 25 minutes

Disaster Recovery

A disaster is any situation that causes a serious disruption in your system’s services. A disaster can result in data loss or machine failure, making your system unavailable to users and applications. As a result, an organization must prepare for a possible disaster by developing a disaster recovery strategy. You should take the following steps when designing a disaster recovery strategy: prepare recovery systems, collect configuration and system information, test system components, test recovery systems, and document recovery procedures.

Preparing Recovery Systems

In the event of a disaster, you must have several systems in place that allow you to perform a smooth recovery operation. To prepare for a possible disaster, you should create the Windows 2000 Setup disks, Startup disks, and ERDs. You should also back up your data on a regular basis to guard against the loss of any critical system state data, files, or other data important to your system.

Creating Windows 2000 Setup Disks, Startup Disks, and ERDs

In some types of failure you might not be able to access certain systems in order to repair an installation or reinstall Windows 2000. For example, the computer might not support a bootable CD-ROM, or you might not be able to access directories or files. In these situations you can often use the Windows 2000 Setup disks, the Startup disks, or the ERD to access system resources or reinstall Windows 2000. In some cases, you could use these disks in conjunction with each other to repair a system. For example, you can start a repair process by using the Setup disks and then repair the problem by using the ERD.

To prepare for the possibility of a disaster, you should create all three types of disks so that they’re on hand just in case. Each type of disk is described in Table 11.5.

Table 11.5 Windows 2000 Emergency Disks

Type of Disk | Description
Setup disks | The Windows 2000 Setup disks consist of four floppy disks that allow you to access your computer in case of system failure. You can use the Setup disks to start Setup, the Recovery Console, and the Emergency Repair Process. The Setup disks allow you to access your system on computers that can’t be started from the CD-ROM drive. You can create the Setup disks by running the makebt32 utility on the Windows 2000 Server installation CD-ROM.
Startup disks | Each Windows 2000 Startup disk is unique to the system for which it is created. It allows you to access a drive with a faulty startup sequence. The disk can access a drive that’s configured with Windows NT file system (NTFS) or the file allocation table (FAT) file system. You can use the Startup disk to help with problems that involve a corrupted boot sector, a corrupted Master Boot Record (MBR), a virus infection, a missing or corrupt NTLDR file or NTDETECT.COM file, or an incorrect NTBOOTDD.SYS file. You cannot use the Startup disk to fix incorrect or corrupted device drivers that have been installed into the Windows 2000 System directory or to fix startup problems that occur after the boot loader starts. To create a Startup disk, copy the NTLDR, NTDETECT.COM, and BOOT.INI files from your hard drive to the floppy disk. The disk should also include a copy of the correct device driver for your hard disk drive. You should create a Startup disk for each computer that you want to protect.
ERD | The ERD helps you repair problems with your system files (if they’re accidentally erased or become corrupt), your startup environment (if you have a multiple-boot system), or the partition boot sector on the boot volume. You can create the ERD by using Windows Backup. You should create an ERD for each computer that you want to protect.
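
The file-copy step for a Startup disk can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name copy_startup_files and both paths passed to it are hypothetical, and the sketch does not copy the hard disk device driver that the table also calls for. It copies the three files named above and reports any that it could not find, so that an incomplete disk is not mistaken for a usable one.

```python
import shutil
from pathlib import Path

# File names taken from the Startup disk description in Table 11.5.
STARTUP_FILES = ("NTLDR", "NTDETECT.COM", "BOOT.INI")


def copy_startup_files(system_root, target):
    """Copy the startup files from system_root to target.

    Returns the list of file names that were not found in system_root.
    Both arguments are directories chosen by the caller (hypothetical here).
    """
    system_root = Path(system_root)
    target = Path(target)
    target.mkdir(parents=True, exist_ok=True)

    missing = []
    for name in STARTUP_FILES:
        source = system_root / name
        if source.is_file():
            # copy2 also preserves timestamps, which helps when comparing disks later.
            shutil.copy2(source, target / name)
        else:
            missing.append(name)
    return missing
```

A call such as copy_startup_files("C:/", "A:/") would be the typical use, assuming the system partition is C: and the floppy drive is A:. An empty return value means that all three files were copied.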

Backing Up Your Data

The only way that you can ensure that your data is protected is to back up that data on a regular schedule. If you don’t back up the data, you might not be able to recover important information or settings when problems occur. Regular backups prevent data loss and damage caused by disk drive failures, power outages, virus infections, and other computer-related problems.
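
One way to confirm that backups really do happen on a regular schedule is to compare the time of each computer’s last successful backup against the interval you have chosen. The sketch below assumes that you already record those times somewhere; the function name overdue_backups, the one-day default interval, and the computer names in the usage note are illustrative assumptions, not part of the Backup utility.

```python
from datetime import timedelta


def overdue_backups(last_backup, now, max_age=timedelta(days=1)):
    """Return the names of computers whose last backup is older than max_age.

    last_backup maps a computer name to the datetime of its most recent
    successful backup, or to None if the computer has never been backed up.
    now is passed in explicitly so the check is easy to test and repeat.
    """
    overdue = []
    for name in sorted(last_backup):
        finished = last_backup[name]
        if finished is None or now - finished > max_age:
            overdue.append(name)
    return overdue
```

For example, with hypothetical servers WEB1 and WEB2, passing {"WEB1": datetime(2001, 6, 1, 23, 0), "WEB2": None} and a now value of datetime(2001, 6, 2, 6, 0) flags only WEB2, because WEB1 was backed up seven hours earlier.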

Windows 2000 includes the Backup utility (shown in Figure 11.11), which allows you to back up programs and files, restore previously backed up data, and create an ERD.

Figure 11.11 - The Backup tab of the Windows 2000 Backup utility

The Backup utility is integrated with the core Windows 2000 distributed services, which means that you can use Backup to back up system state data. System state data includes the following types of information:

  • Boot files and all files protected by Windows File Protection (WFP)
  • Active Directory service
  • Sysvol
  • Certificate Services
  • Cluster database
  • Registry
  • Performance counter configuration
  • Component Services Class registration database

You can use the Backup utility to copy data to a tape drive, logical drive, removable disk, or an entire library of disks or tapes.

Collecting Configuration and System Information

In addition to preparing your recovery systems, you should maintain a record of various types of information that will help you restore your system should a disaster occur. Your documentation should be thorough and complete, and it should be stored in a safe and accessible location. Table 11.6 provides an overview of the types of information that you should maintain.

Table 11.6 Maintaining System Information

Type of Information | Description
Hardware configurations | Include information about each computer (such as type, model, serial number, basic input/output system [BIOS], complementary metal-oxide semiconductor [CMOS], and network adapters). Also include information about the disk subsystem. Record such details as type of disk, type of adapter, configured volumes, and volume sizes.
Software configurations | Include information about kits, tools, and add-ons that have been installed. Record software configuration information and backups for each computer. Information should include the applications and the volumes on which they’re installed, licensing information, and installed service packs and hot fixes. Be sure to include special settings, such as video mode settings, if they’re important to a particular machine.
Computer names and IP addresses | For each computer, record the computer name and, if applicable, the static IP address.
Domains | For each computer, record which domain that computer belongs to.
Local administrative passwords | Record in a safe location the local administrative password that was used when the backup was created.
Miscellaneous documentation | Include any other information that might be necessary when restoring a system, such as vendor documentation, internal documents, and contact information.
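
The categories in Table 11.6 map naturally onto a per-computer record, which makes it easy to spot gaps before a disaster exposes them. The following is a minimal sketch, not a prescribed format: the class name SystemRecord, its field names, and the choice of which fields are optional are all assumptions made for illustration. The sketch also assumes that you record where the administrative password is kept rather than the password itself.

```python
from dataclasses import dataclass, field, fields


@dataclass
class SystemRecord:
    """Recovery documentation for one computer, loosely following Table 11.6."""
    computer_name: str = ""
    static_ip: str = ""               # empty if the address is assigned dynamically
    domain: str = ""
    hardware: dict = field(default_factory=dict)   # model, serial number, BIOS, disks
    software: dict = field(default_factory=dict)   # applications, service packs, licenses
    admin_password_location: str = ""  # where the password is stored, not the password
    misc_documents: list = field(default_factory=list)


# Fields that may legitimately be empty (an assumption of this sketch).
OPTIONAL_FIELDS = {"static_ip", "misc_documents"}


def missing_fields(record):
    """Return the names of required fields that are still empty."""
    return [
        f.name
        for f in fields(record)
        if f.name not in OPTIONAL_FIELDS and not getattr(record, f.name)
    ]
```

Running missing_fields over every record during a periodic review gives a short list of what still has to be collected for each computer.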

Testing System Components

Another important step in developing a recovery strategy is to test your components to try to predict failure situations and to practice recovery procedures. You should stress test all functionality in your system. This includes internal components such as hard disks, controllers, processors, and RAM, as well as external components, such as routers, bridges, switches, cables, and connectors.

You should try to simulate the following situations when you stress test your system:

  • Heavy network loads
  • Heavy disk I/O
  • Heavy use of file and application servers
  • Large numbers of users simultaneously logged on
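
The last scenario in this list, many users active at once, can be approximated with a small thread-based driver. This is a rough sketch under stated assumptions: simulate_users is a hypothetical helper, and request stands for whatever callable exercises your system (for example, a function that fetches a page from a test server). A real stress test would normally use a dedicated load-testing tool.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def simulate_users(request, users, requests_per_user=10):
    """Call request() from `users` threads at once and summarize the outcome.

    request is any zero-argument callable supplied by the caller; an
    exception raised by it is counted as one failed request.
    """
    def one_user(_user_index):
        failures = 0
        for _attempt in range(requests_per_user):
            try:
                request()
            except Exception:
                failures += 1
        return failures

    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=users) as pool:
        failed = sum(pool.map(one_user, range(users)))
    elapsed = time.perf_counter() - started

    return {
        "total": users * requests_per_user,
        "failed": failed,
        "seconds": elapsed,
    }
```

Increasing users between runs while watching the failed count and the elapsed time gives a first impression of where the system starts to degrade.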

Testing Recovery Systems

Once you’ve created your recovery systems (Windows 2000 Setup disks, Startup disks, and ERDs, as well as a data backup system), you should practice recovering from disasters that can occur. Your practice should include using the Setup disks, Startup disks, and ERDs, as well as Safe Mode and the Recovery Console. Practicing will help you determine how long a recovery process will take, whether you’ve backed up all the data that needs to be backed up, and whether you’ve collected all the configuration and system information necessary to recover from a disaster.

Your testing should help you determine which recovery procedure you should use in certain situations. For example, you might find that in some circumstances you can use Safe Mode to recover, while in other situations you must use the Setup disks along with the ERD.

Your testing should include scenarios that represent the most common causes of unexpected downtime. At a minimum, you should perform the following types of recoveries:

  • Restoring data from backups
  • Rebuilding redundant array of independent disks (RAID) volumes
  • Promoting member servers to domain controllers to replace a failed domain controller
  • Replacing components, such as hard disks, adapters, and power supplies
  • Recovering MBRs and boot sectors
  • Restoring Windows 2000 system files

You should test your recovery procedures before bringing a new computer or server into production.
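
When you practice restoring data from backups, it helps to verify the result rather than assume it. The sketch below compares a restored directory tree against the original by hashing every file. The function names and the directory arguments are hypothetical, and the approach assumes that the original data is still available for comparison, which is true in a drill but often not after a real disaster.

```python
import hashlib
from pathlib import Path


def file_digests(root):
    """Map each file's path (relative to root) to the SHA-256 digest of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # read_bytes loads the whole file; very large files would need chunked reads.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            digests[path.relative_to(root).as_posix()] = digest
    return digests


def compare_restore(original_root, restored_root):
    """Return (missing, changed) for a restore drill.

    missing: files present in the original but absent from the restore.
    changed: files present in both whose contents differ.
    """
    original = file_digests(original_root)
    restored = file_digests(restored_root)
    missing = sorted(set(original) - set(restored))
    changed = sorted(
        name for name in original
        if name in restored and original[name] != restored[name]
    )
    return missing, changed
```

Two empty lists mean that every original file came back unchanged. Timing the restore and the comparison together also gives a first estimate of how long that recovery procedure takes.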

Testing recovery procedures goes hand in hand with training personnel. When administrators practice recovering from a disaster, they’re being trained in how to handle disasters should they occur. Properly trained personnel can reduce both the likelihood and the severity of failures.

Documenting Recovery Procedures

Your disaster recovery strategy should include step-by-step procedures for recovering from different types of failures. You can use these procedures to test new computers before putting them into production, to train administrators and operators, and to create an operations handbook. You should update your procedures when you change your system configuration, when you install a new operating system, or when you change the utilities that you use to maintain your system.
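
Keeping procedures current is easier if each one records when it was last reviewed and which system it covers. The following sketch flags procedures that were last reviewed before the most recent change to their system. The function name stale_procedures and the dictionary keys are assumptions chosen for illustration; any record format with the same three pieces of information would work.

```python
def stale_procedures(procedures, changes):
    """Return the titles of procedures reviewed before their system last changed.

    procedures: list of dicts with the keys "title", "system", and
                "reviewed" (a datetime.date).
    changes:    dict mapping a system name to the date of its latest change
                (configuration, operating system, or maintenance utilities).
    """
    stale = []
    for procedure in procedures:
        changed_on = changes.get(procedure["system"])
        # A system with no recorded change cannot make a procedure stale.
        if changed_on is not None and procedure["reviewed"] < changed_on:
            stale.append(procedure["title"])
    return stale
```

For instance, a procedure for a hypothetical server WEB1 that was reviewed on March 1 is reported as stale if WEB1’s configuration changed on April 15, while a procedure reviewed on May 1 is not.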

Making a Decision

When designing a disaster recovery strategy, you should prepare recovery systems, collect configuration and system information, test system components, test recovery systems, and document recovery procedures. Table 11.7 provides an overview of the factors that you should consider for each step that you should take when preparing for possible disasters.

Table 11.7 Disaster Recovery

Step | Considerations
Preparing recovery systems | Recovery systems include the Windows 2000 Setup disks, Startup disks, and ERDs. Recovery systems also include any critical system state data, files, or other data important to your system that’s backed up on a regularly scheduled basis.
Collecting configuration and system information | Information necessary to restore a system should a disaster occur includes information about hardware configurations, software configurations, computer names and IP addresses, domains, local administrative passwords, and other documentation to support recovery efforts.
Testing system components | You should conduct stress tests to test all functionality in your system, including internal components as well as external ones. Tests should try to simulate heavy network loads, heavy disk I/O, heavy use of file and application servers, and large numbers of simultaneous users.
Testing recovery systems | You should use your recovery systems (Windows 2000 Setup disks, Startup disks, and ERDs, as well as a data backup system) to practice recovering from disasters that can occur.
Documenting recovery procedures | Your disaster recovery strategy should include step-by-step procedures for recovering from different types of failures.

Recommendations

An effective disaster recovery strategy relies heavily on the steps you take before a disaster occurs. If you wait until you have a problem, it can be too late to figure out what information you’re missing or whether you’ve backed up all the data that needs to be preserved. To prepare a strategy, you should follow the five steps outlined in this lesson: prepare recovery systems, collect configuration and system information, test system components, test recovery systems, and document recovery procedures.

Example: Preparing the Recovery Systems for Coho Vineyard

Coho Vineyard is implementing a disaster recovery strategy to prepare for any disasters that might affect its Web site. The company maintains a small Web site that hosts static content only and supports Anonymous access. The site uses two IIS server computers that are configured as a Network Load Balancing (NLB) cluster. As part of its disaster recovery strategy, Coho Vineyard is preparing the following recovery systems:

  • Windows 2000 Setup floppy disks. Administrators prepare the Setup disks by using the MAKEBT32.EXE utility on the Windows 2000 Server installation CD-ROM. The utility is located in the Bootdisk directory on the CD-ROM. The administrators create four disks: Windows 2000 Setup Boot Disk, Windows 2000 Setup Disk #2, Windows 2000 Setup Disk #3, and Windows 2000 Setup Disk #4.
  • Windows 2000 Startup floppy disks. For each Windows 2000 Server computer, the administrators create a Startup floppy disk (or disks) by copying the NTLDR, NTDETECT.COM, and BOOT.INI files from the root directory to a floppy disk.
  • Windows 2000 ERDs. For each Windows 2000 Server computer, the administrators create an ERD by using the Emergency Repair Disk Wizard in the Backup utility. The ERD contains information about the current Windows system settings for the specific computer.
  • Backed-up data. Administrators use the Backup utility to create backup jobs for each computer and define a backup schedule for each of those jobs. For each Windows 2000 Server computer, they back up everything on that computer, including the system state data.

Lesson Summary

A disaster is any situation that causes a serious disruption in your system’s services. When designing a disaster recovery strategy, you should prepare recovery systems, collect configuration and system information, test system components, test recovery systems, and document recovery procedures. Recovery systems include the Windows 2000 Setup disks, Startup disks, and ERDs. Recovery systems also include the critical system state data, files, or other data important to your system that’s backed up on a regularly scheduled basis. In addition to preparing the recovery systems, you should maintain a record of various types of information that will help you restore your system should a disaster occur, including information about hardware and software configurations; domains, computer names and IP addresses; local administrative passwords; and miscellaneous documentation. You should also test your internal and external components to try to predict failure situations and to practice recovery procedures. Once you’ve created your recovery systems, you should practice recovering from disasters that can occur. Your practice should include using the Setup disks, Startup disks, and ERDs, as well as Safe Mode and the Recovery Console. Finally, your disaster recovery strategy should include step-by-step procedures for recovering from different types of failures.



Microsoft Corporation - MCSE Training Kit. Designing Highly Available Web Solutions with Microsoft Windows 2000 Server Technologies
MCSE Training Kit (Exam 70-226): Designing Highly Available Web Solutions with Microsoft Windows 2000 Server Technologies (MCSE Training Kits)
ISBN: 0735614253
Year: 2001
Pages: 103
