Hack 89. Resolve Common Boot and Startup Problems

Malicious crackers, overenthusiastic software updates, or simple hardware failures can prevent you from rebooting or accessing a system. The first thing to do is to relax and try a few standard tips and tricks to get your ailing system back on its feet.

Sooner or later (usually just before one of your users is about to submit her thesis, or you have a meeting to present the IT strategy document you've been working on for weeks), you'll find that attempting to boot one of your systems results in a variety of cryptic error messages, a blinking cursor, or a graphical user interface that won't accept any keyboard or mouse input. In other words, not the standard Linux login you're used to at all. Of course, you have backups of your critical files elsewhere, but if your system isn't running for one reason or another, backups are just a distant security blanket. In all likelihood, your data is still present on the host formerly known as "your desktop machine," but you just can't boot the box to get to it. What's a girl to do?

Depending on the types of errors you're seeing, you may need anything from a crash course in BIOS settings to a PhD in the use of fsck and its friends, or simply some way of booting your system and accessing your data quickly. This hack discusses some of the standard tips and tricks for trying to get your box running on its own. If the tips in this hack aren't sufficient, see "Rescue Me!" [Hack #90] for the big hammer: creating a bootable CD containing a Linux distribution that provides the tools you need to repair an ailing Linux box. You can then use the tools on that CD to repair your filesystems, recover partitions, and perform the other hacks listed at the end of this one to get your system back and booting on its own.

10.2.1. Check BIOS Settings

If your system doesn't boot at all, the first thing to check is whether it's actually finding the device from which you expect it to boot. If you've recently added a disk to your system or changed its hardware configuration in any way, chances are that your BIOS settings are simply wrong. For example, I have a 64-bit server with a variety of removable drives that boots off an internal disk. For some reason, each time I add, remove, or change one of the removable drives, the BIOS forgets that it's supposed to boot off an internal SATA drive and insists on trying to boot from one of my music archives or one of the disks containing user home directories. Crap.

The standard symptoms of a system that has become confused about its boot settings are a blinking cursor after the system has tried to initiate the boot process, or a message saying something like "No bootable devices found." To make sure that your system is actually attempting to boot from the right device, you'll have to investigate its Basic Input/Output System (BIOS) settings.

On many systems, either there's a boot splash screen that hides the command needed to enter the BIOS, or the display comes up after this information has already been displayed. Most modern systems enable you to access their BIOS settings by pressing the Delete key (the one in the cluster of keys with Home, End, Page Up, and Page Down) as soon as the system powers up. The system will still perform some initial checks, but it will then display a BIOS settings screen. If pressing Delete does not provide access to your system's BIOS, other popular keys/key combinations to try (in order) are F2, F1, F3, F10, Esc, Ctrl-Alt-Esc, Ctrl-Alt-Insert, and Ctrl-Alt-S. One of these should give you access to your system's BIOS, though trying them all can be somewhat tedious and time-consuming.

Most modern x86 boxes feature one of a small number of different BIOS types. Two of the more popular BIOS types are the different Award BIOS screens shown in Figures 10-1 and 10-2.

Figure 10-1. An Award BIOS with vertical menus


In the BIOS shown in Figure 10-1, the boot settings are stored in the Advanced Settings screen, which you can navigate to using the down arrow key. Press Return to display this screen once its name is highlighted. On the Advanced Settings screen, use the down arrow key to navigate to the First Boot Device entry, and press Return to display your choices. Use the arrow keys to select the entry corresponding to your actual boot drive, and press Return. You can then press the Escape key to exit this screen, and press F10 to save the new settings, exit the BIOS settings screen, and reboot.

Figure 10-2. An Award BIOS with horizontal menus


In the BIOS shown in Figure 10-2, the boot settings are stored in the Boot screen, which you can navigate to using the right arrow key. Press Return to display this screen once its name is highlighted. On the Boot screen, use the down arrow key to navigate to the Hard Drive entry, and press Return to display a list of available drives. You can then highlight the correct drive using the arrow keys and press Return to select it. Once the correct hard drive is selected, you can use the plus symbol to move that entry to be the first bootable device, and then press F10 to save the new settings, exit the BIOS settings screen, and reboot.

If the BIOS boot settings for the system on which you're having problems appear to be correct, this is probably not the root of your problem, and you should change these settings only as a last resort. Changing too many variables at one time is a normal reaction to an unbootable system, but it's rarely the right one.


Depending on the types and configuration of the drives in your system, you may have to experiment a bit with BIOS boot device settings before your system will boot correctly. If the BIOS doesn't find a drive that you know to be physically present, the drive may have failed, in which case there isn't all that much you can do without drive-specific hardware recovery techniques that are outside the scope of this book. If the BIOS finds the drive but you can't read the disk's partition table using the rescue CD, see "Recover Lost Partitions" [Hack #93] for suggestions about recreating the partition table. If the partition table is fine but you can't mount or repair one or more partitions, see "Recover Data from Crashed Disks" [Hack #94] for suggestions about recovering data from the disk.

10.2.2. Fixing Runlevel or X Window System Problems

Most Linux distributions nowadays provide some sort of free online update service. These are great for keeping your system up to date with the newest, brightest, shiniest software available for your distribution. If you get a bogus update, however, they can also incapacitate your system, and some of the more common bogus updates that I've seen are updates to the X Window System (for X.org or, in the past, XFree86). Unfortunately, the fix that corrects someone else's problem may bring your GUI to its knees, leaving it unable to accept keyboard or mouse input. If you can't get your X Window System display to respond to keyboard or mouse input, try the following:

  • Switch to another virtual console by pressing Ctrl-Alt-F1 or Ctrl-Alt-F2, log in there, and edit /etc/inittab to start at another runlevel until you can correct the problem. The specific inittab line you are looking for is:

     id:5:initdefault: 

    You need to change the 5 to another runlevel (usually 3). Some distributions, such as Ubuntu and Gentoo, merely require you to stop the display manager from running, which usually means removing the xdm, gdm, or kdm service from the boot process. Once you've done this, reboot.

  • Go to another machine and SSH or telnet into the system where you're having problems. Once logged in, su and edit /etc/inittab to start at another runlevel (usually 3) until you can correct the problem. Reboot.

  • If you can't do either of the previous suggestions (for example, if no other machine is handy or you've disabled virtual consoles and gettys to optimize performance), use the information provided later in this hack to reboot in single-user mode. You can then edit /etc/inittab to start at another runlevel until you can correct the problem. Reboot.

Once you're in a nongraphical runlevel, you can perform repair tasks such as running filesystem repair utilities, repairing your X Window System configuration, and so on.
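
The inittab change described in the list above can be sketched as follows. This is only an illustration: it works on a scratch copy of a one-line sample file rather than the real /etc/inittab, and the set_default_runlevel function is a hypothetical helper, not something your distribution provides.

```shell
# Work on a scratch copy so the sketch is safe to run anywhere; on a real
# system the file would be /etc/inittab (edit it as root and keep a backup).
inittab=$(mktemp)
printf '%s\n' '# sample inittab fragment' 'id:5:initdefault:' > "$inittab"

# Hypothetical helper: rewrite only the initdefault line, whatever
# runlevel it currently names.
set_default_runlevel() {
    # $1 = file to edit, $2 = new default runlevel
    sed "s/^id:[0-9]:initdefault:/id:$2:initdefault:/" "$1" > "$1.new" &&
        mv "$1.new" "$1"
}

set_default_runlevel "$inittab" 3
new_default=$(grep '^id:' "$inittab")
echo "$new_default"
```

Once X is working again, running the same helper with 5 instead of 3 would restore the graphical default.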

10.2.3. Regenerating a Default X Window System Configuration File

If you can boot your system successfully in a nongraphical runlevel but cannot start the X Window System automatically or manually, your configuration file may simply be hosed (in technical terms). Whether this happened because you've installed an updated version of the X Window System, your root filesystem took a hit and the file was deleted, or you've "fine-tuned" your configuration files to the point where X won't start any more, you can start from scratch by generating a default X Window System configuration file that you can then use as a starting point to correct the problems you're seeing. Both the X.org and XFree86 implementations of the X Window System provide a -configure option that enables you to generate a default configuration file. Depending on which X Window System server you have on your Linux system, log in as root and execute one of the following two commands to generate a default configuration file:

 # Xorg -configure
 # XFree86 -configure

These commands cause the X server to probe your graphics hardware and generate a default X Window System configuration file in the /root directory called xorg.conf.new or XF86Config.new. You can then test this generic configuration file by starting your X server with the following command:

 # X -config  /root/filename  

If the X server starts correctly, make a backup copy of your existing X configuration file, replace it with the new one, and resume normal use or fine-tuning. One common failing is that X won't start because it can't detect your mouse. If this happens, check the InputDevice section of the configuration file you created for the value of the Device option. If this is simply /dev/mouse, try changing it to /dev/input/mice and restarting X using the updated configuration file.
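
Here's a rough sketch of that backup, replace, and mouse fix sequence. It runs entirely in a scratch directory that stands in for /root and your X configuration directory, and the sample InputDevice section (including the Mouse0 identifier) is an assumption about what a generated file might contain, so check your own file before copying anything.

```shell
# A scratch directory stands in for the real locations so this is harmless.
work=$(mktemp -d)
cat > "$work/xorg.conf.new" <<'EOF'
Section "InputDevice"
    Identifier "Mouse0"
    Driver     "mouse"
    Option     "Device" "/dev/mouse"
EndSection
EOF
echo 'old settings' > "$work/xorg.conf"   # pretend this is the current config

# 1. Point the mouse at /dev/input/mice in the generated file.
sed 's|"/dev/mouse"|"/dev/input/mice"|' "$work/xorg.conf.new" \
    > "$work/xorg.conf.fixed"

# 2. Back up the current configuration before replacing it.
cp "$work/xorg.conf" "$work/xorg.conf.bak"
cp "$work/xorg.conf.fixed" "$work/xorg.conf"

mouse_line=$(grep '"Device"' "$work/xorg.conf")
echo "$mouse_line"
```

On a real system you would test the fixed file with the X -config command shown earlier before installing it over your existing configuration.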

If you're having problems starting or configuring X in general, your video card may use a chipset that is not yet supported by the version of the X Window System that you're using. If this happens, you can try using a lowest common denominator as a fallback. The generic VESA (Video Electronics Standards Association) driver is supported by most cards and should enable X to work at lower resolutions on almost any system with graphical capabilities. To use VESA, simply set the Driver line in your Device section to be vesa.
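
As a sketch of the VESA fallback, the following rewrites the Driver line of a sample Device section. The section contents (the Card0 identifier and the nv driver) are placeholders chosen for illustration, and the edit is made on a temporary file rather than on your real configuration.

```shell
# Sample Device section; the identifier and original driver are placeholders.
conf=$(mktemp)
cat > "$conf" <<'EOF'
Section "Device"
    Identifier "Card0"
    Driver     "nv"
EndSection
EOF

# Change the Driver value only between Section "Device" and EndSection,
# so Driver lines in other sections (such as InputDevice) are left alone.
sed '/^Section "Device"/,/^EndSection/ s/^\([[:space:]]*Driver[[:space:]]*\)"[^"]*"/\1"vesa"/' \
    "$conf" > "$conf.vesa"

driver_line=$(grep 'Driver' "$conf.vesa")
echo "$driver_line"
```

Restricting the substitution to the Device section matters because a full configuration file usually has Driver lines for the keyboard and mouse as well.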


10.2.4. Booting to Single-User Mode

If you're having problems booting to a specified runlevel, you may need to boot to single-user mode in order to repair your system. This can happen for a number of reasons, most commonly because of filesystem consistency problems, but also because of things such as the failure of any of the low-level system initialization scripts.

If you're using the GRUB bootloader, press any key to interrupt the standard GRUB boot process, use the arrow keys to select the kernel you want to boot, and press the e key to edit the boot options for that kernel. Select the line containing the actual boot options (the line that begins with kernel), press e again to edit that command line, and append the word single to the end of the command line. You can then press b to boot with those boot options. Your system will go through the standard boot process but will end either at a root shell prompt or at a prompt for your root password before starting that shell.

If you're still using the LILO bootloader, you can do the same thing by entering the name of the boot stanza that you want to boot (usually linux), followed by a space and the -s directive. Again, you should get a root shell prompt or a request for the root password in a few seconds.
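
To make the GRUB edit concrete, this sketch shows what the kernel line looks like before and after you append single. It is purely illustrative: the real change is typed into GRUB's interactive editor and applies only to that one boot, and the stanza below (device names, kernel path, and root partition) is a made-up example.

```shell
# A made-up GRUB stanza; your title, devices, and paths will differ.
stanza=$(mktemp)
cat > "$stanza" <<'EOF'
title Linux
    root (hd0,0)
    kernel /vmlinuz ro root=/dev/sda1
    initrd /initrd.img
EOF

# Append "single" to the end of the kernel line only, which is the same
# change you would make by hand after pressing e on that line.
sed '/^[[:space:]]*kernel[[:space:]]/ s/$/ single/' "$stanza" > "$stanza.single"

kernel_line=$(grep 'kernel' "$stanza.single")
echo "$kernel_line"
```

The LILO equivalent needs no editing at all: you simply type the stanza name followed by -s at the boot: prompt.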

If you're having problems starting a single-user shell, there may still be a problem in some low-level aspect of your boot process, or (gasp) you may have forgotten or be unable to supply the root password. In this case, see "Bypass the Standard Init Sequence for Quick Repairs" [Hack #91] for a quick way of bypassing the /sbin/init process and starting a shell directly.

10.2.5. Resolving Filesystem Consistency Problems

When a system doesn't boot because it claims that one or more of your partitions is inconsistent and therefore needs to be repaired, you're in luck. It's hard to see disk corruption as a good thing, but it beats some of the alternatives. At least your system found the boot sector, booted off the right drive, and got to the point where it found enough applications to try to check your filesystems.

One of the most common problems when booting a system is resolving filesystem consistency problems encountered during boot time. When you shut down a system normally, the system automatically unmounts all of its filesystems, marking them as "clean" so that it can recognize that they are in a consistent state when you next boot the system. If a system crashes for some reason, the filesystems are not marked as clean and must therefore be checked for consistency and correctness the next time you boot the system. Different types of filesystems each have their own filesystem consistency verification and repair utilities. In most cases, your system will automatically run these for you as part of the boot process and will correct any filesystem consistency problems that these utilities detect. Sometimes, however, you're not so lucky, and you'll have to run these utilities manually to correct serious filesystem problems.

If you're using the XFS filesystem, all the vanilla repair utility does is return true, since it expects that the XFS filesystem can correctly replay the journal and fix any problems as part of its mount process. If that's not the case, you can find yourself in single-user mode if the boot and root partitions are OK. If not, see "Rescue Me!" [Hack #90] for information about getting a rescue CD, because you're going to need it.

The details of manually running each filesystem's consistency-checking utility are outside the scope of this hack, but it's at least useful to know which utility to use if you have to manually repair a filesystem. Table 10-1 shows the filesystem consistency utilities that you use to manually repair various types of Linux filesystems.

Table 10-1. Repair utilities for different Linux filesystems

Filesystem    Utility
ext2, ext3    e2fsck
JFS           jfs_fsck
reiserfs      reiserfsck
XFS           xfs_check, xfs_repair


In the case of the XFS filesystem, xfs_check is a shell script that simply identifies problems in a specified filesystem; you must then use the xfs_repair utility to correct them.
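
If you find yourself looking this up under pressure, the mapping in Table 10-1 can be captured in a small shell function. This is just a convenience sketch; repair_tool_for is a hypothetical name, and it only prints the name of the utility rather than running anything.

```shell
# Hypothetical helper: print the repair utility from Table 10-1 for a
# given filesystem type, or fail if the type isn't in the table.
repair_tool_for() {
    case "$1" in
        ext2|ext3) echo e2fsck ;;
        jfs)       echo jfs_fsck ;;
        reiserfs)  echo reiserfsck ;;
        xfs)       echo xfs_repair ;;   # after inspecting with xfs_check
        *)         echo "no repair utility listed for: $1" >&2
                   return 1 ;;
    esac
}

tool=$(repair_tool_for ext3)
echo "$tool"
```

The filesystem type for each partition is recorded in the third field of /etc/fstab. Whichever utility you end up with, run it only against a partition that is not mounted.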

10.2.6. See Also

  • RIP home page: http://www.tux.org/pub/people/kent-robotti/looplinux/rip/

  • "Rescue Me!" [Hack #90]

  • "Bypass the Standard Init Sequence for Quick Repairs" [Hack #91]



Linux Server Hacks (Vol. 2)