8.2. My Hard Drive Is Failing and I Need a Backup Fast

It's best to configure a regular backup of your entire system. But hard drives are large. Gigabytes of data take time to copy. So you can't be blamed for avoiding backups as long as possible. (That is, until there is a hard drive failure.) While you might have configured backups for those workstations that you administer, other people might not have been so farsighted and may look to you as a Linux geek when they hit the inevitable disk problem. Thus, you may be asked to recover the data of a less experienced Linux user who forgot to back up his hard drive.

The techniques listed in this annoyance may or may not work for you, depending on the level of damage to your hard drive. I can testify, though, that without these techniques, I would have spent several days reloading programs onto my laptop computer.


8.2.1. Symptoms

One symptom of an imminent hard drive failure is the following message, which you might see during the Power On Self Test (POST) process:

 1720 - S.M.A.R.T Hard Drive detects imminent failure(Failing Attr:05h) Please back up the contents of the hard drive and run HDD self test in F2 setup 

While you could run the HDD self-test, chances are good that if you see this message, your hard drive is about to fail. So you should take steps right away to recover what you can.
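
If the system still boots, you can match the attribute number from the POST message against the drive's own S.M.A.R.T. table. What follows is only a minimal sketch. It assumes that the smartmontools package (which provides smartctl) is installed and that the failing drive is /dev/hda, as in the examples later in this annoyance. The helper name attr_hex_to_dec is made up for illustration. Attribute 05h corresponds to ID 5, which drives normally report as the reallocated sector count.

```shell
#!/bin/sh
# Convert the hexadecimal attribute from the POST message (for example "05h")
# to the decimal ID that smartctl prints in its attribute table.
# attr_hex_to_dec is a hypothetical helper name, not part of smartmontools.
attr_hex_to_dec() {
    hex=${1%[hH]}          # strip the trailing "h"
    printf '%d\n' "0x$hex" # let printf interpret the value as hexadecimal
}

attr_hex_to_dec 05h        # prints the decimal attribute ID

# Assumed usage on the failing system (run as root; /dev/hda is an assumption):
#   smartctl -H /dev/hda    # overall health verdict
#   smartctl -A /dev/hda    # attribute table; look for the ID printed above
```

A rising raw value for that attribute is consistent with the warning and is one more reason to start the backup right away.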

The first thing you should do is mark the bad blocks; we've described this process in the previous annoyance.

At this point, I assume you've applied the fsck command to your system, tried the regular backup techniques described in Chapter 2, and marked the bad blocks with the techniques described in the previous annoyance. Even so, commands such as dd or tar still fail because they stop with errors when they hit bad blocks.

First and foremost, save the files that you can't live without. Next, proceed with an emergency backup of the entire hard drive, described in the next section.

8.2.2. Configuring an Emergency Backup

To explain what you should do to back up a failing drive in terms as concrete and easy to follow as possible, I'll revisit a recent frightening day when my laptop hard drive failed, and describe the steps I took. It should not be hard for you to apply the lesson to another disk failure. I start with a narrative, followed by a step-by-step description of what I did to recover and transfer my data to a new hard drive. While this may be repetitive, if your hard drive is failing, it's important to get these steps right the first time.

When the symptoms described in earlier sections showed me that my laptop hard drive was failing, my first step was to save the critical files that I absolutely needed. But that was not enough. I had spent several hours configuring Debian on this laptop computer and would have been really annoyed if I had to start over. I needed an emergency backup.

Fortunately, I had a large external IEEE 1394 (FireWire) hard drive, which had plenty of space for my Debian partitions. Generally, most distributions with Linux kernel 2.6 have no problems with IEEE 1394 hard drives.

I bought another hard drive to replace the one currently on my laptop. It turned out that I could get a significantly larger drive for just a little more money. This made things easier because I could specify slightly larger partitions than I had on my old disk, rather than spend a lot of effort trying to re-create each partition at exactly the same size. Once you realize that you need a new hard drive, you may want to order it as soon as possible, as shipping can take time.

Because my hard drive seemed ready to fail, I needed to minimize the stress on that drive. I also needed a magic tool that could ignore the errors associated with the bad blocks on my drive while copying the partitions or all the files within them.

What I needed was a Linux distribution that recognized my IEEE 1394 hard drive, included a magic backup tool, and could be loaded directly from a CD. From previous experience, I knew that when I boot Knoppix with kernel 2.6, it recognizes and allows me to partition, format, and mount my IEEE 1394 hard drive. If that didn't work, I knew Knoppix recognized my network card; I could have backed up my partitions over my network.
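
I did not need the network fallback, so the following is only a sketch of how it might look, not what I actually ran. It swaps in plain dd with the conv=noerror,sync options as a stand-in for a rescue tool, and it assumes that an SSH server is reachable from Knoppix. The helper name stream_partition, the host backuphost, the account user, and the path /backup are all placeholders.

```shell
#!/bin/sh
# stream_partition is a hypothetical helper: it reads a partition (or any
# file) and writes it to standard output so it can be piped over the network.
# conv=noerror keeps dd going after a read error; sync pads each short or
# failed block with zeros so that offsets in the image stay aligned.
stream_partition() {
    dd if="$1" bs=4096 conv=noerror,sync 2>/dev/null
}

# Assumed usage from Knoppix (backuphost, user, and the path are placeholders):
#   stream_partition /dev/hda1 | ssh user@backuphost 'cat > /backup/xppro.img'
```

Because of the sync option, an image of a partition whose size is not a multiple of 4096 bytes ends up padded with zeros to the next multiple.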

As for the magic tool, current versions of Knoppix include the dd_rescue command. As it's designed to ignore errors such as bad blocks on a partition, it was what I needed at that moment. For more information on dd_rescue, see http://www.garloff.de/kurt/linux/ddrescue/.

I booted my system with a Knoppix CD. Because it loaded Linux and the associated utilities onto a RAM disk, it minimized the stress on my hard drive. If you have a different magic tool, you may be able to use another CD-based distribution such as Ubuntu or SUSE Live CD.

Next, I loaded and mounted my backup media. I formatted my external hard drive to the ext3 filesystem. Knoppix recognizes standard external drives and network connections, generally with little difficulty.

As of this writing, the current version of Knoppix is 4.0, which supports booting with the Linux 2.6 kernel. If you're using modern backup devices such as external IEEE 1394 or USB 2.0 drives, use kernel version 2.6. It provides better support. If you're using an older Knoppix CD, the default may be associated with Linux kernel 2.4; you may be able to start with Linux kernel 2.6 by entering knoppix26 at the boot: prompt.
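
To confirm which kernel a live CD actually booted, you can inspect the output of uname -r. The following is a minimal sketch; the function name kernel_is_26_or_newer is made up for illustration, and it assumes a release string that starts with the usual major.minor numbers.

```shell
#!/bin/sh
# kernel_is_26_or_newer is a hypothetical helper: it succeeds when the given
# release string (as printed by uname -r) is version 2.6 or later.
kernel_is_26_or_newer() {
    major=${1%%.*}              # text before the first dot
    rest=${1#*.}                # text after the first dot
    minor=${rest%%.*}           # text up to the next dot
    minor=${minor%%[!0-9]*}     # drop any non-numeric suffix
    [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 6 ]; }
}

if kernel_is_26_or_newer "$(uname -r)"; then
    echo "kernel $(uname -r): 2.6 or later"
else
    echo "kernel $(uname -r): older than 2.6; consider booting with knoppix26"
fi
```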


After formatting partitions on my IEEE 1394 drive, I rebooted into Knoppix to make sure the new partitions were properly written. Most of these commands require superuser mode, but when you boot Knoppix from a CD, no root password is required. Finally, I could use dd_rescue to save the data I could, and then write that data to the new laptop hard drive.

Before you start, make sure you have the following available:

  • Space on another hard drive to save your data. A remote or portable hard drive can work for this purpose.

  • A reliable connection to the hard drive you're using for backup. Network and even IEEE 1394 cables can come loose.

  • A replacement hard drive, suitable for your system.

  • A Linux installation that you can boot directly from a CD, such as Knoppix. Make sure it has tools such as dd_rescue.

  • Appropriate tools to replace the physical hard drive.
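
Part of the checklist above can be verified from a shell once the live CD has booted. The following is a rough sketch under a few assumptions: the helper names have_tool and free_kb are made up for illustration, the list of tools is only the set mentioned in this annoyance, and /mnt/sda1 is the backup mount point used in the steps that follow.

```shell
#!/bin/sh
# have_tool and free_kb are hypothetical helper names used for illustration.

# Succeeds if the named command is on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Prints the free space, in kilobytes, on the filesystem holding the path.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Report which of the tools mentioned in this annoyance are available.
for tool in dd_rescue fdisk mkfs.ext3; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done

# /mnt/sda1 is the backup mount point assumed here; fall back to the
# current directory if it does not exist yet.
target=/mnt/sda1
[ -d "$target" ] || target=.
echo "free space on $target: $(free_kb "$target") KB"
```

Compare the reported free space with the partition sizes you record from fdisk before you start copying.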

Now that I had the basic story and the tools I needed, I took the following steps to rescue my laptop hard drive:

  1. I saved the files that I absolutely needed to a different computer. These included personal files, data files, and perhaps configuration files. In case the backup didn't work, I would have preserved at least these files.

  2. I used the tools described in the previous annoyance, including badblocks and fsck, to mark bad blocks.

  3. I booted my laptop with a Knoppix CD. Knoppix boots to a K Desktop Environment (KDE) session, with icons for the partitions on my laptop and IEEE 1394 hard drives.

  4. Based on the output from an fdisk -l /dev/hda command, I recorded the sizes of the partitions that I wanted to back up.

  5. I used QTParted (http://qtparted.sourceforge.net) on Knoppix to make room on my IEEE 1394 hard drive.

  6. I ran the fdisk utility to create a partition (/dev/sda1) large enough for my failing laptop hard drive partitions. I then formatted this partition to the ext3 filesystem:

     mkfs -t ext3 /dev/sda1 

  7. After rebooting to make sure the partition table reflected the new configuration, I mounted the new ext3 partition on my IEEE 1394 hard drive. Knoppix makes appropriate mount points available; the following command uses the Knoppix default for the noted partition:

     mount /dev/sda1 /mnt/sda1 

  8. I used the dd_rescue command to back up each partition on my hard drive. Most of the commands took less than an hour to copy a partition to the specified image file (.img). As you might guess from the filenames I chose, my laptop had a dual boot of Windows XP and Debian Linux:

     dd_rescue /dev/hda1 /mnt/sda1/xppro.img
     dd_rescue /dev/hda5 /mnt/sda1/debboot.img
     dd_rescue /dev/hda6 /mnt/sda1/debhome.img

  9. However, the initial results were less than perfect. The dd_rescue command got stuck when I tried to back up the partition with my top-level root (/) directory. Fortunately, dd_rescue -r was able to read my partition backward, skipping over the errors on that drive. While it took hours, I was pleased to see that it saved all of my data:

     dd_rescue -r /dev/hda7 /mnt/sda1/debroot.img 

    If you aren't able to save all of your partitions with the dd_rescue command, there's one more option. The dd_rhelp project, based at http://www.kalysto.org/utilities/dd_rhelp, uses other techniques to move past bad sectors more quickly.


  10. I then exited from Knoppix. Standard commands such as poweroff can be used because the Knoppix Live CD/DVD provides password-free access to superuser mode.

  11. I disconnected power to my laptop computer.

  12. Next, I installed the new hard drive. Be careful. If your computer is still under warranty, take care to follow your manufacturer's instructions. In any case, take care to avoid static when handling any computer parts.

  13. I navigated to my laptop's BIOS menu to make sure it correctly detected the new hard drive.

  14. I restarted Knoppix once again.

  15. I partitioned and formatted my new hard drive. Because it was larger, it was easy for me to make sure that each partition that I created was equal to or larger than the one on my previous hard drive. I partitioned the disk with the fdisk command and formatted with the mkfs.ext3 command.

  16. I was finally able to restore data from the disk images that I created. As no special handling was required, I was able to use the dd command:

     dd if=/mnt/sda1/xppro.img of=/dev/hda1
     dd if=/mnt/sda1/debboot.img of=/dev/hda5
     dd if=/mnt/sda1/debhome.img of=/dev/hda6
     dd if=/mnt/sda1/debroot.img of=/dev/hda7

    Alternatively, I could have mounted these images and saved the individual files. This would take full advantage of any additional space I could spare on the new hard drive. For example, I could have used the following commands to restore the files from the debboot.img file (assuming /mnt/back existed):

     mount -o loop /mnt/sda1/debboot.img /mnt/back/
     mount /dev/hda5 /mnt/hda5
     cp -ar /mnt/back/* /mnt/hda5/

  17. Before rebooting, I wrote my bootloader to the MBR. I'll show you how to do so in the next annoyance.
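
Before writing an image back with dd in step 16, it is worth confirming that the new partition is at least as large as the image. The following is a minimal sketch, not a record of what I ran. The helper names image_fits and file_bytes are made up for illustration, and the commented usage assumes that the blockdev command from util-linux is available on the live CD.

```shell
#!/bin/sh
# image_fits is a hypothetical helper: it succeeds when an image of the first
# size (in bytes) fits in a partition of the second size (in bytes).
image_fits() {
    [ "$1" -le "$2" ]
}

# file_bytes is a hypothetical helper: it prints the size of a file in bytes.
file_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Example with made-up sizes: a 1 MB image and a 2 MB partition.
if image_fits 1048576 2097152; then
    echo "image fits"
fi

# Assumed usage as root from Knoppix, before running dd (device and image
# names follow the examples in the steps above):
#   img=$(file_bytes /mnt/sda1/debboot.img)
#   part=$(blockdev --getsize64 /dev/hda5)
#   image_fits "$img" "$part" && dd if=/mnt/sda1/debboot.img of=/dev/hda5
```

If the check fails, enlarge the target partition or restore the individual files from the mounted image instead.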



Linux Annoyances for Geeks: Getting the Most Flexible System in the World Just the Way You Want It
ISBN: 0596008015
Year: 2004
Pages: 144
Authors: Michael Jang