7.3. I Can't Boot Because of a Kernel Panic

One of the most feared problems in the world of Unix or Linux is the kernel panic, when the system stops completely during the boot process. The computer won't respond to any input, save the power switch. This is where your backups, rescue modes, or rescue media can be a lifesaver; see Chapter 6 for how you can prepare for this situation.

A number of problems can cause a kernel panic, many of which occur when you try to recompile or install a new kernel. During the boot process, if Linux can't find the hard drive, the partitions, or initial RAM disk files, you'll get a kernel panic. But kernel panics aren't limited to these issues.

Unless there's corruption on your disk or some other hardware fault, kernel panics generally come from a recent change to a key component of the boot sequence, or from a driver or power problem. Common culprits include:

  • The bootloader (GRUB or LILO)

  • A recompiled kernel

  • A new Initial RAM disk

  • Partition changes associated with the root (/) or /boot directories

  • Power problems

  • Troublesome drivers, especially those created for other systems

Record the messages that the console displays immediately before your kernel panic. Review what you did just before the kernel panic, especially with respect to the preceding list. These actions can give you hints to your problems. If you still can't figure out the problem, use these messages as keywords for a search for similar problems with search engines such as http://www.yahoo.com or http://groups.google.com.

7.3.1. Sample Panic Messages and Their Possible Meanings

Here's a typical example of a kernel panic:

 VFS: Cannot open root device "hda6" or unknown-block(0,0)
 Please append a correct "root=" boot option
 Kernel panic: VFS: Unable to mount root fs on unknown-block(0,0)

This problem is caused by an error in the bootloader configuration file. The Virtual File System (VFS) could not find some filesystem such as root (/) or /boot.

One possible cause is the confusing nature of the GRUB configuration file. For example, suppose you see the following directive in /boot/grub/grub.conf or /boot/grub/menu.lst:

 root (hd0,5) 

You might think this points to the /boot directory on /dev/hda5. But as you should know from "Rooting Out the Bootloader" in Chapter 6, GRUB numbers partitions starting from zero, so this directive actually tells your computer to look for the /boot directory on /dev/hda6.
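The numbering rule can be sketched as a small shell function. This is only an illustration: the name grub_to_linux is hypothetical, and it assumes IDE disks named /dev/hdX and that GRUB's drive order matches the Linux order (hd0 is hda, hd1 is hdb), which /boot/grub/device.map may override on a real system.

```shell
# Hypothetical helper: translate a GRUB device such as (hd0,5) into a
# Linux IDE device name, assuming hd0 -> hda, hd1 -> hdb, and so on.
grub_to_linux() {
    spec=$(printf '%s' "$1" | tr -d '()')   # "(hd0,5)" -> "hd0,5"
    disk=${spec%%,*}                          # "hd0"
    part=${spec#*,}                           # "5"
    disk_num=${disk#hd}                       # "0"
    # Pick the drive letter: 0 -> a, 1 -> b, ...
    letter=$(printf 'abcdefgh' | cut -c $((disk_num + 1)))
    # GRUB counts partitions from 0; Linux counts them from 1.
    printf '/dev/hd%s%s\n' "$letter" $((part + 1))
}

grub_to_linux '(hd0,5)'   # prints /dev/hda6
```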

Another example shown here is slightly misleading. This error message might suggest that there is a problem with the /sbin/init command, which is the first process (process 1) always run by the system:

 Warning: unable to open an initial console
 Kernel panic - not syncing: No init found. Try passing init= option to kernel

In fact, this problem is not directly related to init. My computer could not find init because the bootloader pointed to the wrong partition for the top-level root (/) directory. The root directory on my system was on /dev/hda7, but the bootloader configuration file pointed to /dev/hda6, as shown here.

 kernel  /vmlinuz-2.6.8-mj1 root=/dev/hda6 

If you have a separate partition for the /boot directory, a mislocated partition could lead to a similar kernel panic message.
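When you suspect a mismatch like this, it can help to list every root= argument in the bootloader configuration and compare it with the partition that really holds the root (/) directory. The following is a minimal sketch, assuming a GRUB configuration file in the menu.lst or grub.conf format; the function name list_root_args is hypothetical.

```shell
# Hypothetical helper: print the root= value from each kernel line
# of a GRUB configuration file (menu.lst or grub.conf).
list_root_args() {
    # Keep only lines whose first word is "kernel", then print every
    # field that starts with "root=", without the "root=" prefix.
    awk '$1 == "kernel" {
             for (i = 2; i <= NF; i++)
                 if ($i ~ /^root=/) print substr($i, 6)
         }' "$1"
}

# Example (the path is an assumption about where a rescue system
# has mounted your /boot partition):
#   list_root_args /mnt/boot/grub/menu.lst
```

Each device it prints should be the partition where you can actually find /sbin/init once that partition is mounted.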

Another possible cause of panics in Debian is the links named /vmlinuz and /initrd.img. Debian creates these links in the top-level root (/) directory, pointing to files in /boot. If the links are broken or point to the wrong locations, you might get the following message:

 pivot_root: No such file or directory
 /sbin/init: 426: cannot open dev/console: No such file
 Kernel panic: Attempted to kill init!

Naturally, you can address this problem either by linking the noted files from the top-level root (/) directory to the right locations in the /boot directory or by revising the menu.lst configuration file to point directly to /boot.
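Before relinking anything, you may want to see which links are actually broken. The function below is a rough sketch, with the hypothetical name check_links; it relies on the readlink command, which is present on typical Linux systems. Note that if you run it from a rescue system and a link uses an absolute target, the target is resolved against the rescue system's root, so you may need to chroot into the mounted partition first.

```shell
# Hypothetical helper: report whether each given path is a symbolic
# link and whether the file it points to exists.
check_links() {
    for link in "$@"; do
        if [ ! -L "$link" ]; then
            echo "$link: not a symbolic link"
        elif [ -e "$link" ]; then        # -e follows the link
            echo "$link: ok -> $(readlink "$link")"
        else
            echo "$link: BROKEN -> $(readlink "$link")"
        fi
    done
}

# Example: check_links /vmlinuz
```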

Another panic is related to the following message:

 Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(3,7) 

While this appears similar to previous messages related to misplaced partitions, it is actually caused by a missing initial RAM disk file. Look at your menu.lst file. It should point you to an initrd file in the /boot directory. If you don't find the cited initrd file, you may need to re-create it with the mkinitrd command.
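That check can be automated along the following lines. This is a sketch under stated assumptions: check_initrd is a hypothetical name, the configuration is in the menu.lst format, and the second argument is the directory where you have mounted the partition that the GRUB root directive names (the initrd paths in the file are relative to that partition).

```shell
# Hypothetical helper: for each initrd line in a GRUB configuration
# file, check that the cited file exists under the given mount point.
check_initrd() {
    config=$1      # for example, /mnt/boot/grub/menu.lst
    mountdir=$2    # where the partition holding the initrd is mounted
    awk '$1 == "initrd" { print $2 }' "$config" |
    while read -r path; do
        if [ -f "$mountdir$path" ]; then
            echo "found:   $path"
        else
            echo "missing: $path"
        fi
    done
}
```

Any path reported as missing is a candidate for re-creation before you reboot.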

From these examples, we see that the cause may not be directly related to the error message. If you have some experience, you may recognize some of these messages. Otherwise, the best approach is to analyze the files and directories associated with the boot process, with the help of books such as this one.

If you haven't made any recent changes to your kernel, check your power supply and fans. Hardware doesn't last forever, and the lack of sufficient ventilation could cause your system to stop with a kernel panic. Dust can also affect ventilation and heat transfer, especially in non-clean room environments.


7.3.2. Reviewing the Rescue Process After a Panic

"Dual-Boot Recovery" in Chapter 6 describes how to use a rescue CD or other medium to boot a system; after a system panic, the process is straightforward. Try each of the following steps to boot a system. They're ordered by increasing levels of difficulty:

  • If you have more than one kernel configured in your bootloader, try them all. If a different kernel works, you may have a corrupt kernel, initial RAM disk, or an error in the bootloader configuration file.

  • If you have a rescue disk or CD customized for your system, try that next. Such disks are designed to boot your system in your current configuration. At that point, you can connect to any backups that you might have to recover a previously working configuration.

  • Use the rescue mode customized for your distribution. If you have the Red Hat/Fedora installation CD, it searches for and mounts your existing partitions. SUSE's installation disk and the Debian from Scratch CD install familiar tools that can help you mount and diagnose any problems you may have.

  • Boot with a CD-based Linux distribution such as Knoppix. It includes a complete Linux distribution, including specialized tools designed to help you rescue a system.

"I Lost the Root Password" in Chapter 6 describes booting into single user mode. Unfortunately, if you have a kernel panic, your system has usually stopped before it could boot into this useful runlevel.

7.3.3. Rescuing from a Kernel Panic

Once you've started your system using some emergency or rescue disk, review what you've done since your last successful boot. If you've changed a kernel, revised a bootloader, created a new initial RAM disk, or revised the partition associated with your root (/) or /boot directories, that could be the cause of your kernel panic.

The cure, then, is to reverse what you've done recently. If applicable, restore the original kernel, initial RAM disk, /boot or root (/) partition, or bootloader. Alternatively, restore the key parts of your system from a backup. Once you've gone back to your previous working configuration, test the result.

If you revise a key part of the boot system, you should test the result as soon as possible by rebooting Linux. If there are problems, you can restore your system while your memory of the changes you've made is fresh. You should also document the changes you've made.
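One lightweight way to keep that record is to save a dated copy of each boot file before you edit it, along with a one-line note. The function below is only a sketch; the name backup_with_note and the .bak and .changes.log suffixes are hypothetical choices, not a convention of any distribution.

```shell
# Hypothetical helper: keep a dated copy of a file before editing it,
# and append a one-line note to a change log stored next to it.
backup_with_note() {
    file=$1
    note=$2
    stamp=$(date +%Y%m%d-%H%M%S)
    # -p preserves the original's permissions and timestamps.
    cp -p "$file" "$file.$stamp.bak" || return 1
    printf '%s %s: %s\n' "$stamp" "$file" "$note" >> "$file.changes.log"
}

# Example:
#   backup_with_note /boot/grub/menu.lst "changed root= to /dev/hda7"
```

If the next boot ends in a panic, the dated copy gives you something to restore from a rescue disk, and the log reminds you what changed.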




Linux Annoyances for Geeks
Linux Annoyances for Geeks: Getting the Most Flexible System in the World Just the Way You Want It
ISBN: 0596008015
EAN: 2147483647
Year: 2004
Pages: 144
Authors: Michael Jang
