Troubleshooting Boot and Startup Problems

This section presents approaches to solving problems that can occur during the Windows startup process as a result of hard disk corruption, file corruption, missing files, and third-party driver bugs. First, we describe three Windows boot-problem recovery modes: last known good, safe mode, and the Recovery Console. Then we present common boot problems, their causes, and approaches to solving them. The solutions refer to last known good, safe mode, the Recovery Console, and other tools that ship with Windows.

Last Known Good

Last known good (LKG) is a useful mechanism for getting a system that crashes during the boot process back to a bootable state. Because the system's configuration settings are stored in HKLM\System\CurrentControlSet\Control and driver and service configuration is stored in HKLM\System\CurrentControlSet\Services, changes to these parts of the registry can render a system unbootable. For example, if you install a device driver that has a bug that crashes the system during the boot, you can press the F8 key during the boot and select last known good from the resulting menu. The system marks the control set that it was using to boot the system as failed by setting the Failed value of HKLM\System\Select and then changes HKLM\System\Select\Current to the value stored in HKLM\System\Select\LastKnownGood. It also updates the symbolic link HKLM\System\CurrentControlSet to point at the LastKnownGood control set. Because the new driver's key is not present in the Services subkey of the LastKnownGood control set, the system will boot successfully.
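The bookkeeping on the Select key can be modeled in a few lines of Python. This is only a sketch of the behavior described above, using a plain dictionary in place of the registry. The function names are hypothetical, and the ControlSetNNN naming of the link target is an assumption based on how control sets are usually named.

```python
def boot_last_known_good(select):
    """Return the Select values as they would look after choosing LKG."""
    updated = dict(select)
    # The control set that was being used to boot is marked as failed.
    updated["Failed"] = select["Current"]
    # The system then boots from the control set recorded as last known good.
    updated["Current"] = select["LastKnownGood"]
    return updated


def current_control_set_target(select):
    """Name of the control set that the CurrentControlSet link points at.

    Assumes control sets are named ControlSet001, ControlSet002, and so on.
    """
    return "ControlSet%03d" % select["Current"]


# Illustrative values: the system was booting from set 1, and set 2 is LKG.
before = {"Current": 1, "Failed": 0, "LastKnownGood": 2}
after = boot_last_known_good(before)
print(after, current_control_set_target(after))
```

The function returns a copy rather than changing its argument so that the before and after states can be compared.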

Safe Mode

Perhaps the most common reason Windows systems become unbootable is that a device driver crashes the machine during the boot sequence. Because software or hardware configurations can change over time, latent bugs can surface in drivers at any time. Windows offers a way for an administrator to attack the problem: booting in safe mode. Safe mode is a concept Windows borrows from Consumer Windows: a boot configuration that consists of the minimal set of device drivers and services. By relying on only the drivers and services that are necessary for booting, Windows avoids loading third-party and other nonessential drivers that might crash.

When Windows boots, you press the F8 key to enter a special boot menu that contains the safe-mode boot options. You typically choose from three safe-mode variations: Safe Mode, Safe Mode With Networking, and Safe Mode With Command Prompt. Standard safe mode comprises the minimum number of device drivers and services necessary to boot successfully. Networking-enabled safe mode adds network drivers and services to the drivers and services that standard safe mode includes. Finally, safe mode with command prompt is identical to standard safe mode except that Windows runs the command prompt application (Cmd.exe) instead of Windows Explorer as the shell when the system enables GUI mode.

Windows includes a fourth safe mode, Directory Services Restore mode, which is different from the standard and networking-enabled safe modes. You use Directory Services Restore mode to boot the system into a mode where the Active Directory directory service of a domain controller is offline and unopened. This allows you to perform repair operations on the database or restore it from backup media. All drivers and services, with the exception of the Active Directory service, load during a Directory Services Restore mode boot. In cases where you can't log on to a system because of Active Directory database corruption, this mode enables you to repair the corruption.

Driver Loading in Safe Mode

How does Windows know which device drivers and services are part of standard and networking-enabled safe mode? The answer lies in the HKLM\SYSTEM\CurrentControlSet\Control\SafeBoot registry key. This key contains the Minimal and Network subkeys. Each subkey contains more subkeys that specify the names of device drivers or services or of groups of drivers. For example, the vga.sys subkey identifies the VGA display device driver that the startup configuration includes. The VGA display driver provides basic graphics services for any PC-compatible display adapter. The system uses this driver as the safe-mode display driver in lieu of a driver that might take advantage of an adapter's advanced hardware features but that might also prevent the system from booting. Each subkey under the SafeBoot key has a default value that describes what the subkey identifies; the vga.sys subkey's default value is "Driver".

The Boot file system subkey has as its default value "Driver Group". When developers design a device driver's installation script, they can specify that the device driver belongs to a driver group. The driver groups that a system defines are listed in the List value of the HKLM\SYSTEM\CurrentControlSet\Control\ServiceGroupOrder key. A developer specifies a driver as a member of a group to indicate to Windows at what point during the boot process the driver should start. The ServiceGroupOrder key's primary purpose is to define the order in which driver groups load; some driver types must load either before or after other driver types. The Group value beneath a driver's configuration registry key associates the driver with a group.

Driver and service configuration keys reside beneath HKLM\SYSTEM\CurrentControlSet\Services. If you look under this key, you'll find the VgaSave key for the VGA display device driver, which you can see in the registry is a member of the Video Save group. Any file system drivers that Windows requires for access to the Windows system drive are in the Boot file system group. If the system drive is NTFS, the NTFS driver is part of this group. (The value of Group under the Ntfs key is Boot file system.) Otherwise, the Fastfat file system driver (which supports FAT12, FAT16, and FAT32 drives in Windows) is part of this group. Other file system drivers are part of the File system group, which the standard and networking-enabled safe-mode configurations also include.

When you boot into a safe-mode configuration, the boot loader (Ntldr) passes an associated switch to the kernel (Ntoskrnl.exe) as a command-line parameter, along with any switches you've specified in the Boot.ini file for the installation you're booting. If you boot into any safe mode, Ntldr passes the /SAFEBOOT: switch. Ntldr appends one or more additional strings to /SAFEBOOT:, depending on which type of safe mode you select. For standard safe mode, Ntldr appends MINIMAL, and for networking-enabled safe mode, it adds NETWORK. Ntldr adds MINIMAL(ALTERNATESHELL) for safe mode with command prompt and DSREPAIR for Directory Services Restore mode.

The Windows kernel scans boot parameters in search of the safe-mode switches early during the boot and sets the internal variable InitSafeBootMode to a value that reflects the switches the kernel finds. The kernel writes the InitSafeBootMode value to the registry value HKLM\SYSTEM\CurrentControlSet\Control\SafeBoot\Option\OptionValue so that user-mode components, such as the SCM, can determine what boot mode the system is in. In addition, if the system is booting safe mode with command prompt, the kernel sets the HKLM\SYSTEM\CurrentControlSet\Control\SafeBoot\Option\UseAlternateShell value to 1. The kernel records the parameters that Ntldr passes to it in the value HKLM\SYSTEM\CurrentControlSet\Control\SystemStartOptions.
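The switch handling described above can be illustrated with a small parser. This is a sketch, not the kernel's code. The function name and the result layout are hypothetical, and the numeric mode codes are an assumption taken from the SAFEBOOT_* constants in the Windows driver headers. As a convenience, tokens are accepted with or without a leading slash.

```python
# Assumed numeric codes (SAFEBOOT_MINIMAL, SAFEBOOT_NETWORK, SAFEBOOT_DSREPAIR).
SAFE_MODES = {"MINIMAL": 1, "NETWORK": 2, "DSREPAIR": 3}
ALTERNATE_SHELL_SUFFIX = "(ALTERNATESHELL)"


def parse_start_options(options):
    """Scan a boot-option string for the safe-mode and boot-log switches."""
    result = {"safe_mode": 0, "alternate_shell": False, "boot_log": False}
    for token in options.upper().split():
        token = token.lstrip("/")
        if token == "BOOTLOG":
            result["boot_log"] = True
        elif token.startswith("SAFEBOOT:"):
            value = token[len("SAFEBOOT:"):]
            if value.endswith(ALTERNATE_SHELL_SUFFIX):
                # Safe mode with command prompt: MINIMAL(ALTERNATESHELL).
                result["alternate_shell"] = True
                value = value[:-len(ALTERNATE_SHELL_SUFFIX)]
            # Unknown strings leave the mode at 0 (not a safe-mode boot).
            result["safe_mode"] = SAFE_MODES.get(value, 0)
    return result


print(parse_start_options("/fastdetect /SAFEBOOT:MINIMAL(ALTERNATESHELL) /BOOTLOG"))
```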

When the I/O manager kernel subsystem loads device drivers that HKLM\SYSTEM\CurrentControlSet\Services specifies, the I/O manager executes the function IopLoadDriver. When the Plug and Play manager detects a new device and wants to dynamically load the device driver for the detected device, the Plug and Play manager executes the function IopCallDriverAddDevice. Both these functions call the function IopSafeBootDriverLoad before they load the driver in question. IopSafeBootDriverLoad checks the value of InitSafeBootMode and determines whether the driver should load. For example, if the system boots in standard safe mode, IopSafeBootDriverLoad looks for the driver's group, if the driver has one, under the Minimal subkey. If IopSafeBootDriverLoad finds the driver's group listed, IopSafeBootDriverLoad indicates to its caller that the driver can load. Otherwise, IopSafeBootDriverLoad looks for the driver's name under the Minimal subkey. If the driver's name is listed as a subkey, the driver can load. If IopSafeBootDriverLoad can't find the driver group or driver name subkeys, the driver can't load. If the system boots in networking-enabled safe mode, IopSafeBootDriverLoad performs the searches on the Network subkey. If the system doesn't boot in safe mode, IopSafeBootDriverLoad lets all drivers load.
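The decision that IopSafeBootDriverLoad makes can be expressed as a short function. The following is a simplified model of the lookup order described above, not the kernel routine itself. The SafeBoot key is represented as nested dictionaries, the function name is hypothetical, and the entries examplenet.sys and thirdparty.sys are made up for illustration.

```python
def safe_boot_driver_load(safe_mode, safeboot_key, name, group=None):
    """Decide whether a driver may load.

    safe_mode is "Minimal", "Network", or None when not booting in safe mode.
    """
    if safe_mode is None:
        return True  # Outside safe mode, all drivers load.
    # Subkey names under SafeBoot\Minimal or SafeBoot\Network.
    allowed = {entry.lower() for entry in safeboot_key.get(safe_mode, ())}
    # First look for the driver's group, if it has one.
    if group is not None and group.lower() in allowed:
        return True
    # Otherwise look for the driver's own name.
    return name.lower() in allowed


minimal = {
    "vga.sys": "Driver",
    "Boot file system": "Driver Group",
    "File system": "Driver Group",
}
safeboot = {
    "Minimal": minimal,
    # Hypothetical network entry added on top of the minimal set.
    "Network": dict(minimal, **{"examplenet.sys": "Driver"}),
}

print(safe_boot_driver_load("Minimal", safeboot, "Ntfs.sys", "Boot file system"))
print(safe_boot_driver_load("Minimal", safeboot, "thirdparty.sys"))
```

Comparisons are case-insensitive in this sketch because registry key names are not case-sensitive.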

A loophole exists regarding the drivers that safe mode excludes from a boot: Ntldr, rather than the kernel, loads any drivers with a Start value of 0 in their registry key, which specifies loading the drivers at boot time. Ntldr doesn't check the SafeBoot registry key because it assumes that any driver with a Start value of 0 is required for the system to boot successfully. As a result, Ntldr loads all boot-start drivers (and later Ntoskrnl starts them).

Safe-Mode-Aware User Programs

When the service control manager (SCM) user-mode component (which Services.exe implements) initializes during the boot process, the SCM checks the value of HKLM\SYSTEM\CurrentControlSet\Control\SafeBoot\Option\OptionValue to determine whether the system is performing a safe mode boot. If so, the SCM mirrors the actions of IopSafeBootDriverLoad. Although the SCM processes the services listed under HKLM\SYSTEM\CurrentControlSet\Services, it loads only services that the appropriate safe-mode subkey specifies by name. You can find more information on the SCM initialization process in the section "Services" in Chapter 4.

Userinit (\Windows\System32\Userinit.exe) is another user-mode component that needs to know whether the system is booting in safe mode. Userinit, the component that initializes a user's environment when the user logs on, checks HKLM\SYSTEM\CurrentControlSet\Control\SafeBoot\Option\UseAlternateShell. If this value is set, Userinit runs the program specified as the user's shell in the value HKLM\SYSTEM\CurrentControlSet\Control\SafeBoot\AlternateShell rather than executing Explorer.exe. Windows writes the program name Cmd.exe to the AlternateShell value during installation, making the Windows command prompt the default shell for safe mode with command prompt. Even though command prompt is the shell, you can type Explorer.exe at the command prompt to start Windows Explorer, and you can run any other GUI program from the command prompt as well.

How does an application determine whether the system is booting in safe mode? By calling the Windows GetSystemMetrics(SM_CLEANBOOT) function. Batch scripts that need to perform certain operations when the system boots in safe mode look for the SAFEBOOT_OPTION environment variable because the system defines this environment variable only when booting in safe mode.
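Both checks can be tried from a script. The sketch below uses hypothetical function names. It reads SAFEBOOT_OPTION from the environment and, on Windows only, calls GetSystemMetrics through ctypes. The index 67 for SM_CLEANBOOT and the meaning of the return values (0 for a normal boot, 1 for safe mode, 2 for safe mode with networking) come from the Windows SDK documentation rather than from this chapter.

```python
import os
import sys

SM_CLEANBOOT = 67  # GetSystemMetrics index, per the Windows SDK headers


def safe_mode_from_environment(environ=os.environ):
    """Return the SAFEBOOT_OPTION string, or None when it is not defined."""
    return environ.get("SAFEBOOT_OPTION")


def safe_mode_from_metrics():
    """Return GetSystemMetrics(SM_CLEANBOOT), or None when not on Windows."""
    if sys.platform != "win32":
        return None
    import ctypes  # imported here so the sketch still loads elsewhere
    return ctypes.windll.user32.GetSystemMetrics(SM_CLEANBOOT)


if safe_mode_from_environment() is not None:
    print("Booted in safe mode:", safe_mode_from_environment())
else:
    print("SAFEBOOT_OPTION is not defined")
```

Passing the environment as a parameter makes the first function easy to exercise without actually rebooting into safe mode.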

Boot Logging in Safe Mode

When you direct the system to boot into safe mode, Ntldr hands the string specified by the /BOOTLOG option to the Windows kernel as a parameter, together with the parameter that requests safe mode. When the kernel initializes, it checks for the presence of the boot log parameter, whether or not any safe-mode parameter is present. If the kernel detects a boot log string, the kernel records the action the kernel takes on every device driver it considers for loading. For example, if IopSafeBootDriverLoad tells the I/O manager not to load a driver, the I/O manager calls IopBootLog to record that the driver wasn't loaded. Likewise, after IopLoadDriver successfully loads a driver that is part of the safe-mode configuration, IopLoadDriver calls IopBootLog to record that the driver loaded. You can examine boot logs to see which device drivers are part of a boot configuration.

Because the kernel wants to avoid modifying the disk until Chkdsk executes, late in the boot process, IopBootLog can't simply dump messages into a log file. Instead, IopBootLog records messages in the HKLM\SYSTEM\CurrentControlSet\BootLog registry value. As the first user-mode component to load during a boot, the Session Manager (\Windows\System32\Smss.exe) executes Chkdsk to ensure the system drives' consistency and then completes registry initialization by executing the NtInitializeRegistry system call. The kernel takes this action as a cue that it can safely open a log file on the disk, which it does, invoking the function IopCopyBootLogRegistryToFile. This function creates the file Ntbtlog.txt in the Windows system directory (\Windows by default) and copies the contents of the BootLog registry value to the file. IopCopyBootLogRegistryToFile also sets a flag for IopBootLog that lets IopBootLog know that writing directly to the log file, rather than recording messages in the registry, is now OK. The following output shows the partial contents of a sample boot log:

Service Pack 1
3 30 2004 14:05:21.500
Loaded driver \WINDOWS\system32\ntoskrnl.exe
Loaded driver \WINDOWS\system32\hal.dll
Loaded driver \WINDOWS\system32\KDCOM.DLL
Loaded driver \WINDOWS\system32\BOOTVID.dll
Loaded driver ACPI.sys
Loaded driver \WINDOWS\System32\DRIVERS\WMILIB.SYS
Loaded driver pci.sys
Loaded driver isapnp.sys
Loaded driver intelide.sys
Loaded driver \WINDOWS\System32\DRIVERS\PCIIDEX.SYS
Loaded driver MountMgr.sys
Loaded driver ftdisk.sys
Loaded driver dmload.sys
Loaded driver dmio.sys

Microsoft (R) Windows 2000 (R) Version 5.0 (Build 2195)
2 11 2000 10:53:27.500
Loaded driver \WINNT\System32\ntoskrnl.exe
Loaded driver \WINNT\System32\hal.dll
Loaded driver \WINNT\System32\BOOTVID.DLL
Loaded driver ACPI.sys
Loaded driver \WINNT\System32\DRIVERS\WMILIB.SYS
Loaded driver pci.sys
Loaded driver isapnp.sys
Loaded driver compbatt.sys
Loaded driver \WINNT\System32\DRIVERS\BATTC.SYS
Loaded driver intelide.sys
Loaded driver \WINNT\System32\DRIVERS\PCIIDEX.SYS
Loaded driver pcmcia.sys
Loaded driver ftdisk.sys
Loaded driver Diskperf.sys
Loaded driver dmload.sys
Loaded driver dmio.sys
...
Did not load driver \SystemRoot\System32\Drivers\lbrtfdc.SYS
Did not load driver \SystemRoot\System32\Drivers\Sfloppy.SYS
Did not load driver \SystemRoot\System32\Drivers\i2omgmt.SYS
Did not load driver Media Control Devices
Did not load driver Communications Port
Did not load driver Audio Codecs
...
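Because every entry in a boot log begins with either "Loaded driver" or "Did not load driver", a log is easy to summarize with a script. The following sketch uses a hypothetical function name and works on text that has already been read from Ntbtlog.txt. How the file is encoded on disk is not covered here, so decoding is left to the caller.

```python
LOADED_PREFIX = "Loaded driver "
SKIPPED_PREFIX = "Did not load driver "


def parse_boot_log(text):
    """Split boot log text into (loaded, skipped) lists of driver names."""
    loaded, skipped = [], []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(LOADED_PREFIX):
            loaded.append(line[len(LOADED_PREFIX):])
        elif line.startswith(SKIPPED_PREFIX):
            skipped.append(line[len(SKIPPED_PREFIX):])
        # Header lines (version, date) and blank lines are ignored.
    return loaded, skipped


sample = """\
2 11 2000 10:53:27.500
Loaded driver \\WINNT\\System32\\ntoskrnl.exe
Loaded driver ACPI.sys
Did not load driver \\SystemRoot\\System32\\Drivers\\Sfloppy.SYS
Did not load driver Audio Codecs
"""
loaded, skipped = parse_boot_log(sample)
print(len(loaded), "loaded;", len(skipped), "not loaded")
```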

Recovery Console

Safe mode is a satisfactory fallback for systems that become unbootable because a device driver crashes during the boot sequence, but in some situations a safe-mode boot won't help the system boot. For example, if a driver that prevents the system from booting is a member of a Safe group, safe-mode boots will fail. Another example of a situation in which safe mode won't help the system boot is when a third-party driver, such as a virus scanner driver, that loads at the boot prevents the system from booting. (Boot-start drivers load whether or not the system is in safe mode.) Other situations in which safe-mode boots will fail are when a system module or critical device driver file that is part of a safe-mode configuration becomes corrupt or when the system drive's Master Boot Record (MBR) is damaged. You can get around these problems by using the Windows Recovery Console. The Recovery Console allows you to boot into a limited command-line shell from the Windows CD or boot disks to repair an installation without having to boot the installation.

When you boot a system from the Windows CD or boot disks, you eventually see a screen that gives you the choice of either installing Windows or repairing an existing installation. If you choose to repair an installation, the system prompts you to insert the Windows CD (if it isn't already loaded in the system's CD drive) and then to choose between two repair options: to start the Recovery Console or to initiate the emergency repair process. If you press the F10 key at the Setup Welcome screen, you bypass the menu options and take a shortcut directly to the Recovery Console.

When you start the Recovery Console, it gives you a list of Windows NT and Windows installations to choose from that it compiled when it scanned the computer's hard disks. After you make a selection, the system prompts you to enter the Administrator account password to log on to the installation as the administrator. If you successfully log on, the system puts you into a command shell that is similar to an MS-DOS environment. The command set is flexible and lets you perform simple file operations (such as copy, rename, and delete), enable and disable services and drivers, and even repair MBRs and boot records. However, the Recovery Console won't let you access directories other than root directories, the system directory of the installation you logged on to, or directories on removable drives such as CDs and 3.5-inch floppy disks unless local security policy settings stored in the SECURITY hive of the Registry of the installation into which you log in permit it. This prohibition provides a certain level of security for data that an administrator might not usually be able to access. You can override this restriction by using the Local Security Policy editor (secpol.msc) to configure the Recovery Console settings in the Security Options folder of Local Policies when the system is booted normally.

The Recovery Console uses the native Windows system call interface to perform file I/O to support commands such as Cd, Rename, and Move. The Enable and Disable commands, which let you change the startup modes of device drivers and services, work differently. For example, when you tell the Recovery Console that you want to disable a device driver, it reaches into the installation's Services key and manipulates the Start value of the specified driver's key, changing the value to SERVICE_DISABLED. The next time the installation boots, that device driver won't load. (The Recovery Console also loads the SYSTEM hive [\Windows\System32\Config\System] for the installation you log on to. This hive contains the information stored in the HKLM\SYSTEM\CurrentControlSet\Services registry key.)
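The effect of the Disable command on a driver's key can be modeled as follows. This is a sketch with the Services key represented as a dictionary. The function name and the ExampleDrv entry are hypothetical. The numeric start types are the standard SERVICE_* values from the Windows headers, of which the text above names only SERVICE_DISABLED and the boot-start value 0.

```python
# Standard start types for the Start value of a driver or service key.
SERVICE_BOOT_START = 0
SERVICE_SYSTEM_START = 1
SERVICE_AUTO_START = 2
SERVICE_DEMAND_START = 3
SERVICE_DISABLED = 4


def disable_driver(services_key, name):
    """Set the driver's Start value to SERVICE_DISABLED.

    Returns the previous Start value so that it can be restored later.
    """
    entry = services_key[name]
    previous = entry["Start"]
    entry["Start"] = SERVICE_DISABLED
    return previous


services = {"ExampleDrv": {"Start": SERVICE_BOOT_START}}  # hypothetical driver
old_start = disable_driver(services, "ExampleDrv")
print("previous Start:", old_start, "new Start:", services["ExampleDrv"]["Start"])
```

Returning the previous value is a design choice of this sketch: keeping a note of the old start type is what makes it possible to re-enable the driver with the same setting.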

When you boot from the Windows CD or the boot disks, by the time the system gives you the choice to install or repair Windows, the CD has booted a copy of the Windows kernel, including all necessary supporting device drivers (for example, NTFS or FAT drivers, SCSI drivers, a video driver). On x86 systems, the Txtsetup.sif file in the I386 directory of the Windows CD guides the boot from the CD; the file contains directives that identify which files need to load and where the files are located on the CD. Just as when you boot Windows from a hard disk, the first user-mode program the kernel executes is Session Manager (Smss.exe), located in the I386\System32 folder. The Session Manager that Windows Setup uses differs from the standard-installation Session Manager. The former component presents you with the menus that let you install or repair Windows and the menu that asks you what type of repair you want to perform. If you're installing Windows, Session Manager is the component that guides you through choosing a partition to install to and copies files to the hard disk.

When you run the Recovery Console, Session Manager loads and starts two device drivers that implement the Recovery Console: Spcmdcon.sys and Setupdd.sys. Spcmdcon.sys presents an interactive command prompt and performs high-level command processing. Setupdd.sys is a support driver that gives Spcmdcon.sys a set of functions that let Spcmdcon.sys manage disk partitions, load registry hives, and display and manage video output. Setupdd.sys also communicates with disk drivers to manage disk partitions and uses basic video support built into the Windows kernel to display messages on the screen.

When you choose an installation to log on to and the Recovery Console accepts your password, the Recovery Console must validate your logon attempt, even though the installation's Windows security subsystem isn't up and running. Thus, the Recovery Console alone must determine whether your password matches the system's Administrator account. The Recovery Console's first step in this process is to use Setupdd.sys to load the installation's Security Accounts Manager (SAM) registry hive, which stores password information, from the hard disk. The SAM hive resides in \Windows\System32\Config\Sam. After loading the hive, the Recovery Console locates the system key in the installation's registry and uses the system key to decrypt the in-memory copy of the SAM. SAM hive encryption is a feature introduced in Windows NT 4 Service Pack 3 that adds protection against MS-DOS-based password snoopers who try to read passwords directly out of a hive file.

Next, the Recovery Console (Spcmdcon.sys) locates the Administrator account password in the SAM, and in the final authentication step, the Recovery Console uses the MD5 hash algorithm, the same algorithm that the Windows logon process uses, to hash the password entered and compares the hash against the hashed password that the SAM stores. If the Recovery Console finds a match, the system considers you logged on. If the Recovery Console doesn't find a match, the system denies you access to the Recovery Console.

Solving Common Boot Problems

This section describes problems that can occur during the boot process, along with their symptoms, causes, and approaches to solving them. To help you locate a problem that you might encounter, the problems are organized according to the place in the boot at which they occur.

MBR Corruption
  • Symptoms A system that has Master Boot Record (MBR) corruption will execute the BIOS power-on self test (POST), display BIOS version information or OEM branding, switch to a black screen, and then hang. Depending on the type of corruption the MBR has experienced, you might see one of the following messages: "Invalid Partition Table," "Error Loading Operating System," or "Missing Operating System."

  • Cause The MBR can become corrupt because of hard-disk errors, disk corruption as a result of a driver bug while Windows is running, or intentional scrambling as a result of a virus.

  • Resolution Boot into the Recovery Console and execute the fixmbr command. This command replaces the executable code in the MBR. Unfortunately, it does not repair the partition table. The only way to restore a damaged partition table is to restore it from a backup copy or to use a third-party disk-corruption repair tool.
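The distinction between the MBR's executable code and its partition table can be made concrete with a sketch that operates on a 512-byte sector held in memory. It assumes the classic MBR layout, which is general background rather than something this chapter states: boot code in the first 440 bytes, the partition table at offsets 446 through 509, and the signature bytes 0x55 0xAA at the end. The function name is hypothetical, and nothing here reads or writes a real disk or reproduces what fixmbr actually does.

```python
SECTOR_SIZE = 512
BOOT_CODE_SIZE = 440          # assumed size of the executable code area
PARTITION_TABLE_START = 446   # assumed offset of the 64-byte partition table
SIGNATURE_OFFSET = 510
SIGNATURE = b"\x55\xaa"


def replace_boot_code(sector, new_code):
    """Return a copy of the sector with only the boot code replaced."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR sector must be exactly 512 bytes")
    if len(new_code) > BOOT_CODE_SIZE:
        raise ValueError("boot code does not fit in the code area")
    code = new_code.ljust(BOOT_CODE_SIZE, b"\x00")
    # Everything after the code area, including the partition table,
    # is carried over unchanged; a damaged table stays damaged.
    return code + sector[BOOT_CODE_SIZE:SIGNATURE_OFFSET] + SIGNATURE


old = bytes(PARTITION_TABLE_START) + b"\x80" * 64 + SIGNATURE
new = replace_boot_code(old, b"\xfa\x33\xc0")
print(new[PARTITION_TABLE_START:SIGNATURE_OFFSET] == old[PARTITION_TABLE_START:SIGNATURE_OFFSET])
```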

Boot Sector Corruption
  • Symptoms Boot sector corruption can look like MBR corruption where the system hangs after BIOS POST at a black screen, or you might see the messages "A disk read error occurred," "NTLDR is missing," or "NTLDR is compressed" displayed in a black screen.

  • Cause The boot sector can become corrupt because of hard disk errors, disk corruption as a result of a driver bug while Windows is running, or intentional scrambling as a result of a virus.

  • Resolution Boot into the Recovery Console and execute the fixboot command. This command rewrites the boot sector of the volume that you specify. You should execute the command on both the system and boot volumes if they are different.

Boot.ini Misconfiguration
  • Symptom After BIOS POST, you'll see a message that begins "Windows could not start because of a computer disk hardware configuration problem," "Could not read from selected boot disk," or "Check boot path and disk hardware."

  • Cause The Boot.ini file has been deleted, is corrupted, or no longer references the boot volume because the addition of a partition has changed the Advanced RISC Computing (ARC) name of the volume.

  • Resolution Boot into the Recovery Console, and execute the bootcfg /rebuild command. This command has the Recovery Console scan each volume looking for Windows installations. When it discovers an installation, it asks you whether it should add it to Boot.ini as a boot option and what name it should display for the installation in the boot menu.

System File Corruption
  • Symptoms There are several ways the corruption of system files (which include executables, drivers, and DLLs) can manifest. One way is with a message on a black screen after BIOS POST that says, "Windows could not start because the following file is missing or corrupt," followed by the name of a file and a request to re-install the file. Another way is with a blue screen crash during the boot with the text, "STOP: 0xC0000135 {Unable to Locate Component}."

  • Causes The volume on which a system file is located is corrupt or one or more system files have been deleted or become corrupt.

  • Resolution Boot into the Recovery Console, and execute the chkdsk command. Chkdsk will attempt to repair volume corruption. If Chkdsk does not report any problems, obtain a backup copy of the system file in question. One place to check is in the \Windows\System32\DllCache directory, in which Windows places copies of many system files for access by Windows File Protection. (See the "Windows File Protection" sidebar.) If you cannot find a copy of the file there, see if you can locate a copy from another system in the network. Note that the backup file must be from the same Service Pack or hot fix as the file that you are replacing.

In some cases, multiple system files are deleted or become corrupt, so the repair process can involve multiple reboots and boot failures as you repair the files one by one. If you believe the system file corruption to be extensive, you should consider restoring the system from a backup image, such as one generated by Automated System Recovery (ASR). When you run Windows Backup (located in the System Tools folder under Accessories in the Start menu), you can generate an ASR backup image, which includes all the files on the system and boot volumes, plus a floppy disk on which it stores information about the system's disks and volumes. To restore a system from an ASR backup, boot from the Windows setup media and press F2 when prompted.

If you do not have a backup from which to restore, a last resort is to execute a Windows repair install: boot from the Windows setup media, and follow the wizard as if you were going to perform a new installation. The wizard will ask you whether you want to perform a repair or fresh install. When you tell it that you want to repair, Setup reinstalls all system files, leaving your application data and registry settings intact.

Windows File Protection

In addition to its role as the interactive logon interface and Session Manager, Winlogon also implements Windows File Protection (WFP). WFP, which is implemented in the two DLLs \Windows\System32\Sfc.dll and \Windows\System32\Sfc_os.dll, monitors several directories for changes to key drivers, executables, and DLLs, including most subdirectories under \Windows, using the native API version of ReadDirectoryChangesW. When WFP sees that a change has occurred to a system file listed in \Windows\System32\Sfcfiles.dll (and you can use the Strings utility from http://www.sysinternals.com to see the files listed in Sfcfiles.dll), it checks to see whether the file is digitally signed by Microsoft (a process for which you can find more information in the "Driver Installation" section of Chapter 9). If the file is digitally signed by Microsoft, WFP allows the change and copies the file to the WFP backup directory. By default, the backup directory is \Windows\System32\DllCache, although that can be overridden by defining the Registry value HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Winlogon\SFCDllCacheDir. Hot fixes and service packs always install Microsoft-signed system files.

If the file modification results in a file that isn't Microsoft-signed, WFP replaces the modification with the backup version of the file from the DllCache subdirectory. If Winlogon can't find a backup version in that directory, it checks in the network install path if the system was installed using a network install or in the setup media (prompting for insertion) if the install was from local media.


System Hive Corruption
  • Symptoms If the System registry hive (which is discussed along with hive files in the "Registry" section of Chapter 4) is missing or corrupted, NTLDR will display the message, "Windows could not start because the following file is missing or corrupt: \WINDOWS\SYSTEM32\CONFIG\SYSTEM," on a black screen after the BIOS POST.

  • Causes The System registry hive, which contains configuration information necessary for the system to boot, has become corrupt or has been deleted.

  • Resolution Boot into the Recovery Console, and execute the chkdsk command on the boot volume to correct any volume corruption. If the problem is not corrected, obtain a backup of the System registry hive. If you have made ASR backups of the system or have used the Windows Backup utility to make backups of system state (an option in the backup UI), copies of the registry hives from the most recent backup are stored in \Windows\Repair, so copy the file named System to \Windows\System32\Config.

If you're running Windows XP and System Restore is enabled (System Restore is discussed in Chapter 12), you can often obtain a more recent backup of the registry hives, including the System hive, from the most recent restore point. However, you may not be able to access the directory in which restore points are stored, \System Volume Information, from within the Recovery Console. Windows XP Service Pack 1 versions of the Recovery Console allow access to that directory, but older versions do not unless the system's local security policy allows it. You can override the restriction if necessary by using the Local Security Policy Editor to change Recovery Console settings, as described earlier. You can also use third-party tools to gain access to other directories. If you can access the restore point directories, you can follow these steps to get at their registry hives:

  1. Navigate to the directory whose name begins with "_restore" under the \System Volume Information directory of the boot volume.

  2. Locate the RP subdirectory with the highest number as its suffix (for example, "RP173").

  3. Copy the file named _REGISTRY_MACHINE_SYSTEM from the snapshot subdirectory to \Windows\System32\Config\System.

  4. Reboot.
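Step 2 calls for a numeric comparison of the RP suffixes, which a plain alphabetical sort would get wrong (RP9 sorts after RP173 as text). The following sketch picks the right subdirectory from a list of names and builds the relative path of the hive copy from step 3. The function names are hypothetical, and the sketch only manipulates strings; the Recovery Console itself cannot run scripts.

```python
import ntpath
import re


def latest_restore_point(names):
    """Return the RP<n> name with the highest numeric suffix, or None."""
    best = None
    for name in names:
        match = re.fullmatch(r"RP(\d+)", name, re.IGNORECASE)
        if match is None:
            continue  # not a restore point subdirectory
        number = int(match.group(1))
        if best is None or number > best[0]:
            best = (number, name)
    return best[1] if best else None


def system_hive_backup(rp_name):
    """Relative path of the System hive copy inside a restore point."""
    return ntpath.join(rp_name, "snapshot", "_REGISTRY_MACHINE_SYSTEM")


names = ["RP9", "RP173", "RP20", "drivetable.txt"]  # illustrative listing
print(system_hive_backup(latest_restore_point(names)))
```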

Another option is to try to repair the corruption using the Microsoft ChkReg tool. The tool attempts to automatically repair registry corruption, and it runs from the Windows XP setup floppy disks. You can find the tool and instructions on how to use it at http://www.microsoft.com/downloads/details.aspx?displaylang=en&familyid=56d3c201-2c68-4de8-9229-ca494362419c.

If you haven't made backups, don't have access to restore points, and the ChkReg tool doesn't fix the corruption (or the hives are missing), you can use the copy of the System hive stored in \Windows\Repair as a last resort. Windows Setup makes a copy of the System hive after it completes an installation, so you will lose system configuration changes and device driver installations made since then.

Post Splash Screen Crash or Hang
  • Symptoms Problems that occur after the Windows splash screen displays, the desktop appears, or you log in fall into this category and can appear as a blue screen crash or a hang, where the entire system is frozen or the mouse cursor tracks the mouse but the system is otherwise unresponsive.

  • Causes These problems are almost always a result of a bug in a device driver, but they can sometimes be the result of corruption of a registry hive other than the System hive.

  • Resolution You can take several steps to try to correct the problem. The first thing you should try is the last known good configuration. Last known good (LKG), which is described earlier in this chapter and in the "Services" section of Chapter 4, consists of the registry control set that was last used to boot the system successfully. Because a control set includes core system configuration and the device driver and services registration database, using a version that does not reflect changes or newly installed drivers or services might avoid the source of the problem. You access last known good by pressing the F8 key early in the boot process to access the same menu from which you can boot into safe mode.

As stated earlier in the chapter, when you boot into LKG, the system saves the control set that you are avoiding and labels it as the failed control set. When LKG makes a system bootable, you can use the failed control set to determine what was causing the boot to fail: export the contents of the current control set of the successful boot and of the failed control set to .reg files, and then compare the two files. You do this by using Regedit's export functionality, which you access under the File menu (or under the Registry menu if you are running Windows 2000):

  1. Run Regedit, and select HKLM\System\CurrentControlSet.

  2. Select Export from the File menu, and save to a file named good.reg.

  3. Open HKLM\System\Select, read the value of Failed, and select the subkey named HKLM\System\ControlSetXXX, where XXX is the value of Failed.

  4. Export the contents of the control set to bad.reg.

  5. Use Wordpad (which is found under Accessories in the Start menu) to globally replace all instances of "CurrentControlSet" in good.reg with "ControlSet".

  6. Use Wordpad to replace all instances of "ControlSetXXX" (where XXX is the value of Failed) in bad.reg with "ControlSet".

  7. Run Windiff from the Support Tools, and compare the two files.

The differences between a failed control set and a good one can be numerous, so you should focus your examination on changes beneath the Control subkey as well as under the Parameters subkeys of drivers and services registered in the Services subkey. Ignore changes made to Enum subkeys of driver registry keys in the Services branch of the control set.

If the problem you're experiencing is caused by a driver or service that was present on the system since before the last successful boot, LKG will not make the system bootable. Similarly, if a problematic configuration setting changed outside the control set or was made before the last successful boot, LKG will not help. In those cases, the next option to try is safe mode (described earlier in this section). If the system boots successfully in safe mode and you know that a particular driver was causing the normal boot to fail, you can disable the driver by using the Device Manager (accessible from the Hardware tab of the System Control Panel applet). To do so, select the driver in question and choose Disable from the Action menu. If you're running Windows XP or Windows Server 2003, you recently updated the driver, and you believe that the update introduced a bug, you can instead roll back the driver to its previous version, also with the Device Manager. To restore a driver to its previous version, double-click the driver to open its properties dialog box and click Roll Back Driver on the Driver tab.

On Windows XP systems with System Restore enabled, an option when LKG fails is to roll back all system state (as defined by System Restore) to a previous point in time. Safe mode detects the existence of restore points, and when they are present it will ask you whether you want to log in to the installation to perform a manual diagnosis and repair or launch the System Restore Wizard. Using System Restore to make a system bootable again is attractive when you know the cause of a problem and want the repair to be automatic or when you don't know the cause but do not want to invest time to determine the cause.

If System Restore is not an option or you want to determine the cause of a crash during the normal boot and the system boots successfully in safe mode, attempt to obtain a boot log from the unsuccessful boot by pressing F8 to access the special boot menu and choosing the boot logging option. As described earlier in this chapter, Session Manager (\Windows\System32\Smss.exe) saves a log of the boot to \Windows\ntbtlog.txt. The log includes a record of the device drivers that the system loaded and those it chose not to load, so you'll obtain a boot log if the crash or hang occurs after Session Manager initializes. When you reboot into safe mode, the system appends new entries to the existing boot log. Extract the portions of the log file that refer to the failed boot attempt and to the safe-mode boot into separate files. Strip out lines that contain the text "Did not load driver", and then compare the two files with a text comparison tool such as Windiff. One by one, disable the drivers that loaded during the normal boot but not in the safe-mode boot until the system boots successfully again. (Then re-enable the drivers that were not responsible for the problem.)
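The boot-log comparison described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a documented log format: it assumes each driver entry begins with "Loaded driver " or "Did not load driver " followed by the driver's path, and that any other non-blank line is a header that marks the start of a new boot's entries. The function names split_boots, loaded_drivers, and suspects are hypothetical.

```python
import ntpath  # parses backslash-separated paths on any platform

LOADED = "Loaded driver "
SKIPPED = "Did not load driver "


def split_boots(log_text):
    """Split an appended boot log into one list of driver lines per boot."""
    boots = []
    in_header = True  # assumption: a non-driver line starts a new boot section
    for raw in log_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith((LOADED, SKIPPED)):
            if in_header:
                boots.append([])
                in_header = False
            boots[-1].append(line)
        else:
            in_header = True
    return boots


def loaded_drivers(boot_lines):
    """Return the file names of drivers that loaded, ignoring skipped ones."""
    return {
        ntpath.basename(line[len(LOADED):]).lower()
        for line in boot_lines
        if line.startswith(LOADED)
    }


def suspects(normal_boot, safe_boot):
    """Drivers that loaded in the normal boot but not in the safe-mode boot."""
    return sorted(loaded_drivers(normal_boot) - loaded_drivers(safe_boot))
```

With the log read into a string, split_boots returns the boots in the order they were logged, so the failed normal boot and the later safe-mode boot can be passed to suspects to get the list of drivers to disable one at a time. Comparing by file name rather than full path is a design choice made here so that differently written paths for the same driver do not show up as differences. Depending on the system, the log file may need to be opened with a UTF-16 encoding.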

If you cannot obtain a boot log from the normal boot (for instance, because the system is crashing before Session Manager initializes), if the system also crashes during the safe-mode boot, or if a comparison of boot logs from the normal and safe-mode boots does not reveal any significant differences (for example, when the driver that's crashing the normal boot starts after Session Manager initializes), the next tool to try is the Driver Verifier combined with crash dump analysis. (See Chapter 14 for more information on both these topics.)

    Microsoft Windows Internals
    Microsoft Windows Internals (4th Edition): Microsoft Windows Server 2003, Windows XP, and Windows 2000
    ISBN: 0735619174
    EAN: 2147483647
    Year: 2004
    Pages: 158
