Troubleshooting Startup Problems

Windows Server 2003 is certainly the most reliable version of Windows, possessing a level of robustness that its predecessors, even Windows 2000 and Windows XP, lack. Does this mean startup problems cannot occur in Windows Server 2003? No, it doesn't. No existing operating system can be considered crash- or corruption-proof, not even specialized operating systems used by military organizations. Any operating system can be rendered unbootable, and the newest release of Windows is no exception. This issue becomes extremely important in a large-scale corporate network, which depends on the availability of the network operating system (OS), particularly on network servers. You must be prepared to handle anything that goes terribly wrong, so that you don't feel helpless when it does. A significant part of this chapter is dedicated to troubleshooting startup problems. These, I admit, are the most frustrating ones, especially if your system won't boot when you have a lot of work to do. So, what should you do in an emergency? First, don't panic. Next, try to detect what is preventing the operating system from booting.

Since the boot sequence in Windows Server 2003 closely resembles that in Windows 2000/XP, most (but not all) techniques described here can be applied to all three versions. A detailed description of the Windows Server 2003 boot sequence was provided in Chapter 6. Therefore, I will provide only a short explanation of the boot sequence here, and then proceed with problem detection and troubleshooting.

Table 12.1 lists Windows Server 2003 startup phases with brief descriptions of the processes that take place at each stage of the normal boot process.

Table 12.1: Windows Server 2003 Startup Process

Startup stage

Description (x86-based systems)


POST routine

CPU initiates the system board POST routines. POST routines of the individual adapters start after the motherboard POST is accomplished successfully.

Initial startup process

The system searches for a boot device according to the boot order setting stored in CMOS. If the boot device is a hard disk, Ntldr starts.

Operating system load

Ntldr switches the CPU to protected mode, starts the file system, then reads the contents of the Boot.ini file. This information determines the startup options and initial boot menu selections.

Hardware detection and configuration selection

Ntdetect.com gathers basic hardware configuration data and passes this information to Ntldr. If more than one hardware profile exists, Windows XP and Windows Server 2003 attempt to use the correct one for the current configuration. Notice that if your computer is ACPI-compliant, Windows XP or Windows Server 2003 ACPI functionality will be used for device enumeration and initialization. (More information on this topic was provided in Chapter 5.)

Kernel loading

Ntldr passes the information collected by Ntdetect.com to Ntoskrnl.exe. Ntoskrnl then loads the kernel, HAL, and registry information. A status bar at the bottom portion of the screen indicates progress.

Operating system logon process

Networking-related components (such as TCP/IP) load asynchronously with other services, and the Begin Logon prompt appears on screen. After a user logs on successfully, Windows updates the Last Known Good Configuration information to reflect the current state.

New devices are detected by Plug and Play

If Windows XP or Windows Server 2003 detects new devices, they are assigned system resources. The operating system extracts the required driver files from the Driver.cab file. If this file is not found, Windows XP or Windows Server 2003 prompts the user to provide them. Device detection occurs asynchronously with the operating system logon process.
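
To make the role of Boot.ini in the "Operating system load" stage more concrete, here is a minimal Python sketch that reads the boot menu settings from a sample file. The sample contents (the ARC path, timeout, and switches) are an illustrative assumption rather than a copy of any real system's file, and parse_boot_ini is a hypothetical helper name, not part of Windows.

```python
import configparser

# Illustrative Boot.ini contents; the ARC path and switches are assumptions.
SAMPLE_BOOT_INI = r"""
[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS

[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003" /fastdetect
"""


def parse_boot_ini(text):
    """Return (timeout, default ARC path, {ARC path: menu text and switches})."""
    # Only "=" separates keys from values; interpolation is disabled so that
    # "%" characters in a value are taken literally.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep the original case of the ARC paths
    parser.read_string(text)
    loader = parser["boot loader"]
    entries = dict(parser["operating systems"])
    return int(loader["timeout"]), loader["default"], entries


timeout, default, entries = parse_boot_ini(SAMPLE_BOOT_INI)
print("timeout:", timeout)
print("default:", default)
for arc_path, description in entries.items():
    print(arc_path, "->", description)
```

A check like this can reveal a default entry that has no matching line in the [operating systems] section, which is one way Boot.ini ends up pointing to the wrong partition.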

Diagnosing Startup Problems

Fortunately, boot failures in Windows XP and Windows Server 2003 are rare, especially if you perform regular maintenance and take preventive measures against disaster. However, problems still can arise. As with any other operating system, they might be caused both by hardware malfunctions and by software errors. If the problem is severe enough, the system stops booting and displays an error message. A brief list of error messages and their meanings is presented in Table 12.2. Although this list is by no means comprehensive, it covers the most common problems that can cause startup failures of Windows NT-based operating systems, including Windows 2000, Windows XP, and Windows Server 2003.

Table 12.2: Startup Problem Symptoms

Startup problem symptom

Possible cause


The POST routine emits a series of beeps and displays error messages, for example:

   Hard disk error.
   Hard disk absent/failed.

The system self-test routines stopped because of improperly installed devices.

To recover from hardware problems, carefully review the documentation supplied with your system and perform the basic hardware checks. Verify that all cables are attached correctly and all internal adapters are installed properly. Make sure that all peripheral devices (such as keyboards) necessary to complete the POST without error messages are installed and functioning. If applicable, verify that you have correctly configured all jumpers or dual in-line package (DIP) switches. Jumpers and DIP switches are especially important for hard disks. Run diagnostic software to detect the hardware malfunction, and replace the faulty device.

Unfortunately, the topic of troubleshooting hardware problems goes beyond the range of problems discussed in this book. It deserves a separate comprehensive volume. However, I can recommend some resources on the topic that would help you make sense of the BIOS error codes:

  • BIOS Survival Guide, available at http://burks.bton.ac.uk/burks/pcinfo/hardware/bios_sg/bios_sg.htm

  • Definitions and Solutions for BIOS Error Beeps and Messages/Codes, available at http://www.earthweb.com

CMOS or NVRAM settings are not retained

The CMOS memory is faulty, data is corrupt, or the battery needs replacing.

Master boot record (MBR)-related error messages similar to the following:

   Missing operating system.
   Insert a system diskette and restart the system.

The MBR is corrupt.

The easiest method of recovering the damaged MBR is provided by Recovery Console (the methods of starting Recovery Console were discussed in Chapter 2). Once you are in Recovery Console, use the FIXMBR command to repair the MBR.

The FIXMBR command uses the following syntax:

   Fixmbr [device_name] 

The parameter device_name specifies the drive on which you need to repair the damaged MBR. For example:

   fixmbr \Device\HardDisk0

If the device_name parameter is omitted, the new MBR will be written to the boot device, from which your primary system is loaded. Notice that you'll be prompted to confirm your intention to continue if an invalid partition table is detected.

Partition table-related error message similar to the following:

   Invalid partition table.
   A disk-read error occurred.

The partition table is invalid.

You can recover from this problem using the DiskProbe Resource Kit utility or any third-party low-level disk editor. Note that to be able to recover this way, you must create a backup copy of the MBR beforehand. (You can use the DiskProbe tool for this purpose.) Detailed information on this topic can be found in the Resource Kit documentation.

If the MBR on the disk used to start Windows is corrupt, most likely you will be unable to start Windows XP or Windows Server 2003 (and, consequently, DiskProbe). Therefore, before proceeding any further, you'll need to start Recovery Console to replace the damaged MBR.

Boot failure caused by disk or file system corruption, not related to damaged MBR or partition table

Start Recovery Console and run the CHKDSK command to repair the disk. If this proves to be insufficient, you will need to take additional actions to fully recover the damaged file system.

Windows XP or Windows Server 2003 cannot start after you have installed another operating system

The Windows XP or Windows Server 2003 boot sector was overwritten by the other operating system's setup program.

Recovery Console provides the FIXBOOT command that enables you to restore the overwritten boot sector.

Missing Boot.ini, Ntoskrnl.exe, or Ntdetect.com files (x86-based systems)

Required startup files are missing or damaged, or entries in the Boot.ini are pointing to the wrong partition.

Start the Recovery Console and use available commands, such as REN, DEL, or COPY, to restore working copies of boot files.

Bootstrap loader error messages similar to the following:

   Couldn't find loader
   Please insert another disk.

Ntldr is missing or corrupt.

If Ntldr or any other file required to boot the system is missing or corrupt, start Recovery Console and copy the required file.

Windows NT-based OS cannot start and displays message similar to the one provided below:

   Windows could not start because the following file is missing or corrupt:
   \WINNT\SYSTEM32\CONFIG\SYSTEM
   You can attempt to repair this file by starting Windows Setup using the original Setup floppy disk or CD-ROM.
   Select 'r' at the first screen to repair.

This and similar error messages specifying different file names indicate that the boot failure was caused by a damaged registry hive(s) or by invalid registry settings.

First, try to boot using the safe mode startup option. If your attempt has failed, try the Last Known Good boot option. If you still can't boot successfully, start Recovery Console and use the COPY command to restore known good registry files (for example, those located in the %SystemRoot%\Repair folder) to the %SystemRoot%\System32\Config folder.

If the problem is related to settings for a specific service or driver, you also may be able to use Recovery Console's DISABLE command to disable the offending service or driver.

Boot failure caused by a video display driver problem

Use the safe mode startup option, then repair or replace the driver.

Boot failure caused by service or driver initialization

As a first line of defense, try to boot in safe mode and disable the offending service or driver. If your attempt fails, start Recovery Console and use the LISTSVC and DISABLE commands to identify and disable the service or driver that prevents Windows from booting. If you have a working copy of the system registry, you can use the Recovery Console's COPY command to restore the system registry.

Boot failure caused by invalid file attributes set on system files or folders

Start Recovery Console and use the ATTRIB command to restore the correct attributes.

Boot failure caused by unknown system startup event

Try to boot into the safe mode. If this is not successful, try to use the Boot Logging startup option. Then start Recovery Console and use the TYPE command on the resulting log file to identify the failed initialization event.

Stop messages appear

Many software or hardware issues can cause these messages. In addition to official Microsoft documentation (such as the long list of common STOP messages usually supplied in the Resource Kit documentation), there are other useful resources on troubleshooting STOP messages. One such resource can be found at http://www.aumha.org/kbestop.htm

As mentioned in Chapter 6, all Windows NT-based systems generate system messages known as blue screens, or "Blue Screens of Death", if they encounter serious errors which they can't correct. If Windows stops loading, the blue screen also may appear to prevent further data corruption. If the STOP message appears during system startup, it's likely that the cause of the problem is among the following:

  • The user installed third-party software that's destroyed part of the system registry (that is, the HKEY_LOCAL_MACHINE root key). This may happen if the application tries to install a new service or driver. The blue screen will appear, informing the user that the registry or one of its hives couldn't be loaded.

  • The user incorrectly modified the hardware configuration and, as a result, one of the critical system files was overwritten or corrupted.

  • The user installed a new service or system driver that is incompatible with the hardware, causing the blue screen to appear after rebooting. Strictly speaking, it's the attempt to load an incompatible file that leads to the corruption of a correct system file.
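
When you collect STOP messages from several machines, it can help to extract the stop code and its four parameters programmatically before looking them up in the documentation. The following Python sketch shows one way to do this. The function name parse_stop_line and the regular expression are my own assumptions, and the sample line is the manually generated STOP 0xE2 message quoted later in this chapter.

```python
import re

# "*** STOP: <code> (<p1>, <p2>, <p3>, <p4>)" as shown on the blue screen.
STOP_PATTERN = re.compile(r"\*\*\* STOP: (0x[0-9A-Fa-f]{8}) \(([^)]*)\)")


def parse_stop_line(line):
    """Return (stop code, [parameters]) as integers, or None if no match."""
    match = STOP_PATTERN.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return code, params


sample = "*** STOP: 0x000000E2 (0x00000000, 0x00000000, 0x00000000, 0x00000000)"
code, params = parse_stop_line(sample)
print(hex(code), params)
```

The sketch only splits the line into numbers; interpreting a particular code still requires the STOP message references mentioned above.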

Note 

One of the drawbacks of Windows NT 4.0 and earlier versions of the Windows NT operating system was shared system files, which could be overwritten during installation of incompatible third-party software. Starting with Windows 2000, this drawback was eliminated by the addition of appropriate protection for critical system files. This functionality was discussed in Chapter 6. If you wish to avoid startup problems, I recommend that you regularly use these tools.
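
Returning to the MBR and partition table rows of Table 12.2: if you have a 512-byte copy of the first disk sector saved to a file (for example, the MBR backup that I recommended creating with DiskProbe), a few basic sanity checks can be scripted. The Python sketch below relies on the standard MBR layout (four 16-byte partition entries starting at offset 446 and the 0x55 0xAA signature in the last two bytes). The function name inspect_mbr and the synthetic sample sector are assumptions made for illustration; reading the sector from a real disk is outside the scope of the sketch.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446
BOOT_SIGNATURE = b"\x55\xaa"
# Boot flag, 3 skipped CHS bytes, type, 3 skipped CHS bytes, first LBA, sector count.
ENTRY_FORMAT = "<B3xB3xII"


def inspect_mbr(sector):
    """Return (signature_ok, list of non-empty partition entries)."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR sector is exactly 512 bytes")
    signature_ok = sector[510:512] == BOOT_SIGNATURE
    partitions = []
    for index in range(4):
        start = PARTITION_TABLE_OFFSET + 16 * index
        boot_flag, part_type, first_lba, sectors = struct.unpack(
            ENTRY_FORMAT, sector[start:start + 16]
        )
        if part_type != 0:  # type 0 marks an unused entry
            partitions.append({
                "index": index,
                "active": boot_flag == 0x80,
                "type": part_type,
                "first_lba": first_lba,
                "sectors": sectors,
            })
    return signature_ok, partitions


# Synthetic sector: one active partition of type 0x07 (used by NTFS).
sample = bytearray(SECTOR_SIZE)
sample[446:462] = struct.pack(ENTRY_FORMAT, 0x80, 0x07, 63, 2048)
sample[510:512] = BOOT_SIGNATURE
ok, parts = inspect_mbr(bytes(sample))
print(ok, parts)
```

A missing signature or an empty partition list in a saved sector suggests that the backup itself is unusable, which is worth knowing before you try to write it back with a disk editor.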

Parallel Installation of the Operating System

What else can be done to provide universal troubleshooting tools for startup problems? A traditional method of increasing the probability of quick and easy recovery became popular among users of Windows NT 4.0 and earlier. This method is known as "parallel installation of the operating system". The parallel installation is another copy of the Windows NT-based operating system installed on the same computer in a different installation folder, preferably on a hard disk different from the primary installation. If the main operating system (the one you use most frequently) fails to boot, an additional copy of the operating system will allow quick access to NTFS volumes, system files, and registry hives. Another method of providing access to NTFS volumes after system failures is to use the NTFSDOS utility, which will be discussed in Chapter 14.

Note 

Parallel OS installations weaken the system security; like NTFSDOS, parallel installations provide a backdoor to your main operating system. Thus, from a reliability and recoverability point of view, both parallel OS installations and NTFSDOS are beneficial. From a security point of view, they're not ideal methods.

Although the introduction of Recovery Console has significantly reduced the need for parallel OS installation during a system recovery operation, this additional safeguard should not be dismissed altogether. Despite the power of Recovery Console, you may wish to continue using parallel OS installation on the most critical servers running Windows 2000 or Windows Server 2003. For example, it gives you the ability to quickly reset permissions on the primary installation's %Systemroot% folder if the permissions are configured incorrectly. This is desirable because Recovery Console does not provide an easy way of resetting such permissions. Furthermore, despite all of its impressive new capabilities and power, Recovery Console remains a limited command-line environment. For example, it doesn't allow you to run a GUI-based registry editor or backup utility, or any other application that requires full GUI-based functionality. If the primary installation becomes inaccessible and you need to access this type of application to restore it, then parallel OS installation will be an enormous relief.

Note 

If you need a more advanced, GUI-based version of Recovery Console with built-in Registry Editor, I'd like to draw your attention to ERD Commander 2002 from Aelita Software. This utility will be covered in more detail in Chapter 14.

You should install the parallel OS in advance; the procedure is time-consuming, and you may be short of time when problems occur. Note that you need to install only the minimum set of options in the parallel OS. To make a parallel installation more useful, consider placing it on a disk partition other than the one where the primary OS is installed. This improves the chances that the parallel OS installation will be accessible if the primary installation's boot partition is damaged severely.

Additional Hardware Profiles

In addition to a parallel installation of the operating system, there's another method of performing quick recovery. If you experiment with various hardware devices and aren't sure if the device you're going to install is listed in Hardware Compatibility List (HCL), you may want to use additional hardware profiles for the system recovery. Proceed as follows:

  1. Before installing a new device that may cause a problem, create a new ERD (Windows 2000) or prepare for Automated System Recovery (ASR) (Windows XP and Windows Server 2003). Then back up the system registry using one of the methods described in Chapter 2. The ERD (or ASR backup) and registry backup copies will be useful.

  2. Create a new hardware profile. Launch the System applet in Control Panel, go to the Hardware tab, and click the Hardware Profiles button. The Hardware Profiles window will open (Fig. 12.1). Click the Copy button and create a new hardware profile by copying one of the existing profiles. It's best to give hardware profiles "speaking names" that explain their purpose (for example, Working for the current hardware profile, which is free of errors, and Experimental for the new hardware profile, where you'll try out the new device). In the Hardware profiles selection group, set the Wait until I select a hardware profile radio button.

    Figure 12.1: Before installing a new device that isn't listed in the HCL, create an additional hardware profile.

  3. Check if the hardware profiles are working. Try to start the operating system using each of them.

  4. Start the computer using the Working profile and try to install the new device and its drivers using the Hardware Wizard. If the system prompts you to reboot the computer, don't reboot the system immediately. Start Device Manager, find the newly installed device in the list, and select the Properties command from the context menu. You'll see the General tab of the properties window for this device. If the newly installed device is incompatible, you'll immediately see that "something is wrong" (despite the phrase "This device is working properly" displayed in the Device status field, as shown in Fig. 12.2). For example, this device may be marked as an Unknown device that may cause problems. To avoid possible problems, disable this device in the current hardware profile by selecting the Do not use this device in the current hardware profile (disable) option from the Device usage list. The device will be disabled in the current hardware profile, but it will remain enabled in the experimental hardware profile.

    Figure 12.2: Although the Device Manager states This device is working properly, a newly installed device will cause problems. Disable it in the current hardware profile

  5. Now reboot the system and select the experimental hardware profile (where the problem device is enabled). Do you see the "Blue Screen of Death"? If you do, restart the computer and select the working hardware profile instead. Because the device is disabled in that profile, in most cases you'll be able to boot the system.

Note 

I recommend that you always have a working hardware profile that contains no errors and enables no problem devices. This profile often provides an easier means of recovering a system with configuration problems than the Advanced startup menu.

How Can I See the "Blue Screen of Death"?

Have you ever seen the "Blue Screen of Death"? If you haven't, most people will consider you a lucky person. What... you're curious to see what it is? Well, here you are!

Windows 2000, Windows XP, and Windows Server 2003 have one undocumented function that allows you to generate an artificial STOP error (blue screen) and manually create a crash dump (Memory.dmp). The STOP screen that appears after using this feature will contain the following message:

   *** STOP: 0x000000E2 (0x00000000, 0x00000000, 0x00000000, 0x00000000)
   The end-user manually generated the crashdump.

By default, this feature is disabled. To enable it, you'll need to edit the registry and reboot the computer. Open the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters registry key, add the REG_DWORD CrashOnCtrlScroll value, and set it to 1.

After rebooting the system, you'll be able to manually "crash" the system. To view the "Blue Screen of Death," press and hold the right <Ctrl> key, and press the <Scroll Lock> key twice.
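
If you need to enable this feature on several test machines, you may prefer to import the setting from a .reg file instead of editing the registry by hand. The Python sketch below merely builds the text of such a file in the standard "Windows Registry Editor Version 5.00" format. The function name build_reg_file is an assumption of mine, and saving the text to a file and importing it with Registry Editor is left to you.

```python
KEY_PATH = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters"


def build_reg_file(key_path, value_name, dword_value):
    """Return .reg file text that sets a single REG_DWORD value."""
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        f"[{key_path}]\r\n"
        # REG_DWORD data is written as eight hexadecimal digits.
        f'"{value_name}"=dword:{dword_value:08x}\r\n'
    )


reg_text = build_reg_file(KEY_PATH, "CrashOnCtrlScroll", 1)
print(reg_text)
```

As with any registry change, back up the registry first, and remember that the setting takes effect only after a reboot.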

How to Recreate a Missing ASR Floppy Disk

If you have tried all of these troubleshooting options and nothing has helped, you may decide to run the Automated System Recovery (ASR) process. What should you do if the ASR diskette is missing? Does this mean everything is lost? No, it doesn't, provided that the backup media storing your ASR backup set is intact. Using it, you can recreate the missing ASR floppy disk.

The Asr.sif and Asrpnp.sif files contained on the ASR diskette are ASCII files that can be viewed or edited with any text editor, such as Notepad.exe. These files also can be extracted from the ASR backup set and copied to a floppy disk that can be used for an ASR procedure. You can use Backup Utility supplied with Windows Server 2003, Windows XP, or even the version supplied with Windows 2000.

To recreate a missing ASR diskette:

  1. Format a 1.44 megabyte (MB) floppy disk and insert the disk into the floppy disk drive of any computer running Windows 2000, Windows XP, or Windows Server 2003.

  2. In System Tools, start the Backup program. If it starts in wizard mode, switch to the advanced mode and go to the Restore and Manage Media tab (Windows XP and Windows Server 2003) or to the Restore tab (Windows 2000). Insert your backup media with the ASR backup set into the backup device and select the Catalog a backup file command from the Tools menu. When the next window appears, specify the path to the backup copy that you require. (Use the Browse button if necessary.)

  3. Select the backup media containing the required ASR backup set. Expand the Automated System Recovery Backup Set option corresponding to the ASR disk you need to recreate.

  4. Expand the Windows folder, then the Repair folder, and select the following files from the Repair folder: Asr.sif, Asrpnp.sif, and Setup.log (Fig. 12.3). In the Restore files to field, select Alternate location. In the Alternate location field, specify the path to the root of your floppy drive (for example, "A:\").

    Figure 12.3: Recreating the missing ASR floppy disk

  5. Click Next. The other options in this wizard are not mandatory and do not affect the transfer of files to the floppy disk. When the wizard is finished, the files are copied to the specified location. The ASR floppy disk is ready if you need to perform an ASR restore operation.

Note 

The Asr.sif and Asrpnp.sif files must reside in the root folder of the floppy disk to be used during an ASR restore operation.
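
Before you put the recreated diskette away, it is worth confirming that both files really are in its root folder. The Python sketch below performs this check for any folder path. The function name missing_asr_files is hypothetical, and the demonstration uses a temporary folder as a stand-in for the A:\ drive.

```python
import os
import tempfile

REQUIRED_ASR_FILES = ("asr.sif", "asrpnp.sif")


def missing_asr_files(root):
    """Return the required ASR files that are absent from the folder `root`."""
    # Compare names case-insensitively, as file names on a floppy disk are.
    present = {
        name.lower()
        for name in os.listdir(root)
        if os.path.isfile(os.path.join(root, name))
    }
    return [name for name in REQUIRED_ASR_FILES if name not in present]


# Demonstration: a temporary folder that contains only Asr.sif.
with tempfile.TemporaryDirectory() as fake_floppy:
    open(os.path.join(fake_floppy, "Asr.sif"), "w").close()
    demo_missing = missing_asr_files(fake_floppy)
    print(demo_missing)
```

On a real system you would pass the floppy drive's root path instead of the temporary folder; an empty list means both files are in place.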



Windows Server 2003 Registry
Unicode Explained
ISBN: 1931769214
EAN: 2147483647
Year: 2005
Pages: 129
