Corrupt FFS Partitions


If you have a system crash while writing to disk, the disk is considered dirty. It's in a kind of limbo: The operating system has requested that information be written to disk, but the data is not yet completely written out. Part of the data block may have been written, the inode might have been edited but the data not written, or any combination of the two. You need to identify and resolve these inconsistencies before the system will let you mount a disk read-write. OpenBSD includes a powerful FFS checking tool, fsck(8).

When a rebooting system finds a dirty disk partition, it automatically checks the disk and tries to clean everything up. Any data that was not written to the disk before the failure is lost, of course, but fsck does its best to clean up the data that remains. If successful, everything should be right where you left it - except for that unwritten data.

Failed Automatic Fscks

Occasionally the automatic check will fail during a reboot, and you'll be left staring at a single-user prompt asking you to run fsck manually. At this point, you have a few choices: run fsck, run fsck in automatic mode, back up the damaged partition, or debug the file system.

Running Fsck

If you enter "fsck", fsck will check every block and inode on the disk. It will probably find any number of blocks that have become disassociated from their inodes and will make a good guess as to how they fit together and how they should be attached. This can take quite a while on the huge disks that are so common these days.

When fsck finds a problem that it isn't absolutely sure about, it will ask you if you want to perform a fix it suggests. You have two choices, yes or no. If you answer "y," fsck will rebuild the disassociated file and place it in a lost+found directory on the partition, such as /usr/lost+found. If you answer "n," the file will be lost. Files in the lost+found directory have a number for a name. Use grep(1) to scan these files for missing data.
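As a rough sketch of that last step, the helper below lists every recovered file that contains a string you remember from the missing data. The function name search_lost_found, the example path, and the search string are all hypothetical; the sketch also assumes a grep(1) that supports -R for recursive searches.

```shell
#!/bin/sh
# Hypothetical helper: list recovered files under a lost+found directory
# that contain a given string. Both arguments are supplied by the caller.
search_lost_found() {
    dir=$1        # for example, /usr/lost+found
    pattern=$2    # a string you expect to find inside the missing file

    # -R: descend into recovered directories
    # -l: print only the names of matching files
    # -F: treat the pattern as a fixed string, not a regular expression
    grep -RlF -- "$pattern" "$dir"
}

# Example call (path and search string are placeholders):
# search_lost_found /usr/lost+found "ListenAddress"
```

Because the recovered files have only numbers for names, searching for a distinctive string from the file's contents is usually the quickest way to tell them apart.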

Running Fsck in Automatic Mode

If your disk was in the middle of a very busy operation when the system failed, you could end up with many, many disassociated files. Rather than spending an hour typing "y" over and over again to tell fsck(8) to attempt to recover these files, you can run "fsck -y" at the single-user prompt. This tells fsck(8) to assume that you're answering "y" to every question.

In most cases, this works well. It is possible, however, for the entire contents of the disk to migrate to the lost+found directory thanks to fsck -y, and recovery becomes difficult at that point.
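One cautious approach, sketched here under assumptions, is to look at the damage before letting fsck answer "yes" to everything. The -n flag tells fsck(8) to answer "no" to its questions and not open the file system for writing, so a first pass with -n only reports problems. The function name preview_then_repair, the FSCK variable, and the device name in the example call are hypothetical.

```shell
#!/bin/sh
# Sketch: preview the damage, then decide whether to run "fsck -y".
# FSCK defaults to the real fsck(8); it is a variable only so the
# command can be swapped out when experimenting.
FSCK=${FSCK:-fsck}

preview_then_repair() {
    dev=$1    # hypothetical device name, such as /dev/sd0g

    # Report-only pass: -n answers "no" to every repair question.
    "$FSCK" -n "$dev"

    echo "Run fsck -y on $dev now? [y/N]"
    read -r answer
    case $answer in
        y|Y) "$FSCK" -y "$dev" ;;              # accept every suggested fix
        *)   echo "Leaving $dev untouched." ;;
    esac
}

# Example call (device name is a placeholder):
# preview_then_repair /dev/sd0g
```

If the report-only pass shows an alarming number of problems, that is a reasonable moment to stop and consider backing up the partition before repairing it.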

Backing Up the File System

You can use dump(8) to grab a copy of the damaged file system and place it somewhere for further work. This gives you the luxury of being able to try various ways to restore the data, while leaving the possibility of starting over. Chances are, if you have to try this, you didn't have an adequate backup process in the first place.
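A minimal sketch of such a backup follows. It assumes you have somewhere healthy and writable to put the dump file, such as another disk; the function name backup_damaged_fs, the DUMP variable, and both paths in the example call are hypothetical. Note that dump(8) may itself struggle with a badly corrupted file system.

```shell
#!/bin/sh
# Sketch: take a full dump of a damaged file system to a file elsewhere.
# DUMP defaults to the real dump(8); it is a variable only so the
# command can be swapped out when experimenting.
DUMP=${DUMP:-dump}

backup_damaged_fs() {
    src=$1     # hypothetical raw device, such as /dev/rsd0g
    dest=$2    # file on a different, healthy, writable file system

    # -0: full (level 0) dump
    # -a: let dump size the output automatically
    # -f: write the dump to the named file
    "$DUMP" -0a -f "$dest" "$src"
}

# Example call (both paths are placeholders):
# backup_damaged_fs /dev/rsd0g /mnt/usr.dump
```

You can later unpack the dump file with restore(8) onto a scratch file system and experiment there, leaving the original copy untouched.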

File System Debugging

OpenBSD includes the powerful file system debugging tools fsdb(8) and clri(8), which allow a skilled user to debug the file system and redirect files to their proper locations. These tools work at the block-and-inode level, and many partitions have hundreds of thousands of both. You need a very good understanding of FFS to use these tools, so they are not an option for most people. Sadly, the best way to learn them is to fsdb(8) and clri(8) your way through a few corrupt file systems.




Absolute OpenBSD: Unix for the Practical Paranoid
ISBN: 1886411999
Year: 2005
Pages: 298
