Checking and Repairing Filesystems with fsck



The fsck (Filesystem Consistency Check) program is the equivalent of Microsoft ScanDisk and other disk utilities, at least as far as its role in the boot process and its interactive nature go. The fsck program runs at boot time just before mounting the filesystems out of /etc/fstab, to make sure that all the filesystems are "clean" and eligible for mounting. This is the "preen" mode that is called for with the -p option. But fsck also exists to repair any inconsistencies that it finds and to clean filesystems that have not been marked "clean" by a proper shutdown method.
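To see which filesystems the boot-time check would consider, you can read /etc/fstab yourself. The following is a minimal sketch, not part of fsck: the function name fsck_candidates is made up for illustration, and it relies only on the fstab convention that the sixth field is the pass number, with zero (or a missing field) meaning "do not check."

```sh
# Hypothetical helper: list fstab entries that have a nonzero pass number,
# in the order of that number. Output columns: pass number, device, mount point.
fsck_candidates() {
    # $1: path to an fstab-format file (defaults to /etc/fstab)
    awk '$1 !~ /^#/ && NF >= 6 && $6 > 0 { print $6, $1, $2 }' "${1:-/etc/fstab}" |
        sort -n
}
```

Running fsck_candidates with no argument reads the live /etc/fstab; passing a path lets you try it on a copy first.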

The most likely place you'll encounter fsck is at boot time, no matter what role it plays in your life. In the happiest circumstances, it runs invisibly just after all the devices have been identified, and all you see on the console are a few lines like this:

/dev/ad0s1e: 103469 files, 858450 used, 9066025 free (25777 frags, 1130031 blocks, 0.3% fragmentation)


Note

Don't worry about the "fragmentation" figure that fsck prints out. It looks pretty dire, but be aware that even fragmentation of 2 to 3 percent (which is the highest you'll likely see) is minuscule compared to the kind of fragmentation that occurs under Windows. It is not unusual to see a DOS/VFAT disk with 50 percent or more fragmentation; that is why defragmenting utilities sell so well in the desktop market. UNIX filesystems, however, are designed with mechanisms to keep related sectors together on the fly, so fragmentation is kept to a minimum. You won't ever need to defragment a UNIX hard drive. See "Blocks, Files, and Inodes," later in this chapter, for a closer look at the mechanics of data storage and fragmentation.
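If you are curious how the numbers on the summary line relate to one another, the arithmetic can be checked by hand. The sketch below rests on an assumption drawn from the sample output rather than from the fsck documentation: that "free" counts fragments (loose frags plus whole blocks times the fragments per block) and that the percentage is roughly the loose frags divided by used plus free. The function name fsck_summary is hypothetical.

```sh
# Hypothetical helper: read fsck summary lines on stdin and report the
# fragments-per-block ratio and fragmentation percentage implied by the counts.
fsck_summary() {
    awk '/fragmentation/ {
        for (i = 2; i <= NF; i++) {
            v = $(i - 1)
            gsub(/[^0-9]/, "", v)          # strip "(" and other punctuation
            if ($i ~ /^used/)   used = v
            if ($i ~ /^free/)   free = v
            if ($i ~ /^frags/)  frags = v
            if ($i ~ /^blocks/) blocks = v
        }
        if (blocks > 0)
            printf "frags_per_block=%d fragmentation=%.1f%%\n",
                (free - frags) / blocks, 100 * frags / (used + free)
    }'
}
```

Feeding it the sample line shown earlier gives 8 fragments per block and 0.3% fragmentation, which agrees with what fsck itself printed.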


You might run into trouble, however, if the system has not been shut down cleanly: if power has failed or someone has hit the power switch without running shutdown first. UNIX filesystems keep track of their structural information by writing that metadata to the disk in a synchronous manner, which may take multiple write cycles. If the system goes down while it's in the middle of the write sequence, the metadata becomes corrupted, and the filesystem cannot be used until it is made consistent again. This is what fsck is for.

If a filesystem is brought up in this "unclean" state, fsck drops into its investigative mode. It then walks through the filesystem block by block, examining the metadata and making sure it is consistent. This can take a very long time, depending on the size of the filesystem and the speed of the disk. When fsck finds an inconsistency that it cannot repair automatically while guaranteeing data integrity (see man fsck for details), fsck prompts you about whether you want to fix it. In most cases, you do. However, if you are being prompted, the inconsistency is most likely severe enough that some data has been lost. Usually this is only the file or files being written at the time of the crash, so the loss tends to be fairly small.

After fsck finishes running, you may be dropped to a # prompt. Type boot to continue booting or reboot to go through the entire boot process again. Going through a reboot might be a good idea, just to make sure it will come up cleanly without intervention. You don't want to lock the server in a cabinet and drive away, only to have it not come up the next time it crashes.

Boot time is not the only place for fsck, though. You also can run it from the command line at any time on mounted filesystems, although it's a bad idea to do so when the system is fully up and running! It's important that the filesystem in question not be changing while you're trying to give it a consistency check. If you have to run fsck on one of your system's main disk partitions, take the precaution of dropping to single-user mode:

# shutdown +5


This command closes down multiuser mode five minutes after you issue the command. Naturally, everything from this point on has to be done at the physical console. You can't remotely administer the system in single-user mode!

Tip

You can also shut down the system immediately, rather than after a pause, with the shutdown now command.


With the system in this quiescent state, you can now use fsck to your heart's content. This may be necessary if, during runtime, a message in your dmesg output (part of the daily monitoring scripts that get sent to root) reports a bad inode or file descriptor, and you want to go directly to the root of the problem without rebooting. After you run fsck on one or all of your devices (use syntax such as fsck -p /dev/ad1s1g), you can then simply exit from the single-user shell (type exit) to bring up the rest of the multiuser system.

One case in which it is safe to use fsck while still in multiuser mode is when you're trying to mount a second disk with a noncritical or new filesystem that you're attempting to add. The fsck that runs at boot time will only check the filesystems that are listed in /etc/fstab. Rather than adding the new device to the fstab file and rebooting, you can simply try to mount the device. If the mount fails, telling you that you need to run fsck, do so with the syntax shown previously. Then try mounting the disk again. This mount can be done safely in multiuser mode because nobody will be writing to a device that hasn't yet been mounted.
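The mount, check, and mount-again sequence just described can be wrapped in a small shell function. This is only a sketch under stated assumptions: the name mount_with_fsck is made up, it treats any mount failure as a reason to run fsck, and it gives up if fsck -p reports a problem it cannot fix on its own.

```sh
# Hypothetical helper: try to mount a not-yet-mounted device; if that fails,
# run fsck in preen mode on it and try the mount once more.
mount_with_fsck() {
    dev=$1    # for example /dev/ad1s1h
    mnt=$2    # for example /mnt
    if mount "$dev" "$mnt"; then
        return 0
    fi
    # Assumption: the mount failed because the filesystem is not clean.
    fsck -p "$dev" || return 1    # preen could not repair it; stop here
    mount "$dev" "$mnt"
}
```

You would call it as mount_with_fsck /dev/ad1s1h /mnt. If it returns a nonzero status, run fsck on the device by hand so you can answer its prompts.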

Note

FreeBSD's fsck is similar in functionality to that of the fsck used on similar operating systems; however, it does lack one or two nice interface features, such as the progress bar on the Linux fsck. Be assured, though, that the core features behave almost exactly the same way.


Journaling Filesystems and Soft Updates

Many different solutions to the synchronous-write issues that lead to fragmentation and lost file pointers have been developed. You hear a lot these days about journaling (or logging) filesystems like Ext3FS and Journaled HFS+, for example, which keep a log of all write operations before they are executed. This log dramatically speeds up fsck because it no longer needs to comb the entire filesystem; it knows where the inconsistencies are and how to fix them.

FreeBSD does not include support for journaling filesystems; what it does have, though, is Soft Updates. Whereas journaling filesystems work by maintaining a log file of write actions, Soft Updates (which has been built into the GENERIC, or default, kernel since FreeBSD 4.5, and is enabled on all newly created disk partitions) provides a different technique that offers the same kind of benefits. Soft Updates uses precalculated, ordered writes to eliminate the need for an external log; at the same time, Soft Updates protects the integrity of the metadata to provide filesystem consistency as good as or better than that offered by journaling. It has performance advantages over journaling as well; a filesystem can be brought up immediately at boot time, and the consistency checking is done afterward through the use of automated snapshots in a background task.

Soft Updates can be enabled on any or all of your filesystems. A toggle option in the Disk Label Editor (the utility you use to divide your disk into subpartitions and assign them to different mount points, as discussed in the section "Creating the Disk Labels," in Chapter 20) lets you set Soft Updates on whichever partitions, or filesystems, you wish. Soft Updates is particularly effective when used on filesystems that contain frequently changing data, such as /var and /usr.

An optional daemon called diskcheckd supports Soft Updates. Installable from the ports or packages, diskcheckd runs in the background and performs periodic filesystem integrity scans, thus dramatically reducing the reliance on fsck at boot time and the risks associated with abrupt shutdowns. The configuration file is /usr/local/etc/diskcheckd.conf; see man diskcheckd for instructions for and examples of using this file. Any errors that diskcheckd finds are logged through the syslogd service, described in the section "The System Logger (syslogd) and the syslog.conf File," in Chapter 14, "System Configuration and Startup Scripts." While diskcheckd is running, you can use ps to view its progress:

# ps -ax | grep diskcheckd
  251  ??  Ss     0:00.28 diskcheckd: ad0 13.26% (diskcheckd)
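If you want just the device name and the percentage from that ps line, for instance to feed a monitoring script, a short awk filter will do. This is a sketch that assumes the process title always has the form shown above ("diskcheckd:", then the device, then a percentage); the function name diskcheckd_progress is hypothetical.

```sh
# Hypothetical helper: read ps output on stdin and print "device percent"
# for each diskcheckd process title of the form "diskcheckd: ad0 13.26%".
diskcheckd_progress() {
    awk '{
        for (i = 1; i < NF - 1; i++)
            if ($i == "diskcheckd:") {
                pct = $(i + 2)
                sub(/%$/, "", pct)    # drop the trailing percent sign
                print $(i + 1), pct
            }
    }'
}
```

Use it as ps -ax | diskcheckd_progress. Lines that do not contain the "diskcheckd:" title are ignored.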


You can find more information on Soft Updates and comparisons between it and journaling filesystems at http://www.mckusick.com/softdep/ and http://www.ece.cmu.edu/~ganger/papers/CSE-TR-254-95/.

Using fsck to Recover a Damaged Super Block

Chances are that you'll never have to use this technique. However, according to Murphy's Law, the best way to ensure a certain kind of disaster never happens is to prepare for it.

A common kind of filesystem corruption is a damaged super block, the block that contains the critical data for a device's filesystem. This corruption occurs when the system, for whatever reason, cannot read the device's super block, located on sectors 16 through 31 at the beginning of the device. The super block is such an indispensable part of a filesystem that FreeBSD keeps an alternate super block at the beginning of every cylinder group. This way, if your main super block becomes corrupted, dozens of backups throughout the device are available for you to use. The first alternate is always at block 32, and the rest are at regular intervals throughout the disk, but are much less easily predictable.

Let's say you try to mount a filesystem that you know is otherwise valid: for example, a removable hard disk that worked the last time you had it in the machine. Upon issuing the mount command, you get the following error:

/dev/ad1s1h on /mnt: Incorrect super block


As dire as the situation sounds, it is easily dealt with by using fsck:

# fsck /dev/ad1s1h
** /dev/ad1s1h
BAD SUPER BLOCK: MAGIC NUMBER WRONG
LOOK FOR ALTERNATE SUPERBLOCKS? [yn] y
USING ALTERNATE SUPERBLOCK AT 32
** Last Mounted on /home2
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
148 files, 15660 used, 7038840 free (208 frags, 879829 blocks, 0.0% fragmentation)
UPDATE STANDARD SUPERBLOCK? [yn] y
***** FILE SYSTEM WAS MODIFIED *****


FreeBSD makes dealing with a "damaged super block" error easy. Other platforms make you use command-line options to specify where the alternate super block is that you want to use, but the one at 32 is really the only one you know for sure is there. Using fsck in this manner, the first available alternate super block is copied over the primary one, and you should be able to mount the filesystem cleanly.

Note

You can determine where all the super blocks on the device are (if you're really interested) by using the newfs utility, with the -N parameter (which prints out the filesystem's stats without actually making any changes to the disk). Here's an example:

# newfs -N /dev/ad1s1h
Warning: 2672 sector(s) in last cylinder unallocated
/dev/ad1s1h: 14558608 sectors in 3555 cylinders of 1 tracks, 4096 sectors
        7108.7MB in 223 cyl groups (16 c/g, 32.00MB/g, 7936 i/g)
super-block backups (for fsck -b #) at:
 32, 65568, 131104, 196640, 262176, 327712, 393248, 458784,
 524320, 589856, 655392, 720928, 786464, 852000, 917536, 983072,
 1048608, 1114144, 1179680, 1245216, 1310752, 1376288, 1441824, 1507360,
 1572896, 1638432, 1703968, 1769504, 1835040, 1900576, 1966112, 2031648,
 2097184, 2162720, 2228256, 2293792, 2359328, 2424864, 2490400, 2555936,
 2621472, 2687008, 2752544, 2818080, 2883616, 2949152, 3014688, 3080224,
 3145760, 3211296, 3276832, 3342368, 3407904, 3473440, 3538976, 3604512,
 3670048, 3735584, 3801120, 3866656, 3932192, 3997728, 4063264, 4128800,
 4194336, 4259872, 4325408, 4390944, 4456480, 4522016, 4587552, 4653088,
 4718624, 4784160, 4849696, 4915232, 4980768, 5046304, 5111840, 5177376,
 5242912, 5308448, 5373984, 5439520, 5505056, 5570592, 5636128, 5701664,
 5767200, 5832736, 5898272, 5963808, 6029344, 6094880, 6160416, 6225952,
 6291488, 6357024, 6422560, 6488096, 6553632, 6619168, 6684704, 6750240,
 6815776, 6881312, 6946848, 7012384, 7077920, 7143456, 7208992, 7274528,
 7340064, 7405600, 7471136, 7536672, 7602208, 7667744, 7733280, 7798816,
 7864352, 7929888, 7995424, 8060960, 8126496, 8192032, 8257568, 8323104,
 8388640, 8454176, 8519712, 8585248, 8650784, 8716320, 8781856, 8847392,
 8912928, 8978464, 9044000, 9109536, 9175072, 9240608, 9306144, 9371680,
 9437216, 9502752, 9568288, 9633824, 9699360, 9764896, 9830432, 9895968,
 9961504, 10027040, 10092576, 10158112, 10223648, 10289184, 10354720, 10420256,
 10485792, 10551328, 10616864, 10682400, 10747936, 10813472, 10879008, 10944544,
 11010080, 11075616, 11141152, 11206688, 11272224, 11337760, 11403296, 11468832,
 11534368, 11599904, 11665440, 11730976, 11796512, 11862048, 11927584, 11993120,
 12058656, 12124192, 12189728, 12255264, 12320800, 12386336, 12451872, 12517408,
 12582944, 12648480, 12714016, 12779552, 12845088, 12910624, 12976160, 13041696,
 13107232, 13172768, 13238304, 13303840, 13369376, 13434912, 13500448, 13565984,
 13631520, 13697056, 13762592, 13828128, 13893664, 13959200, 14024736, 14090272,
 14155808, 14221344, 14286880, 14352416, 14417952, 14483488, 14549024


As you can see, plenty of backups are available, but all except for the one at 32 are at odd locations. fsck does all the dirty work for you. However, it does let you specify a certain super block if you don't want to let it pick one automatically. For example, fsck -b 2490400 /dev/ad1s1h is the command to use the super block at sector 2490400.
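The locations in that listing are not as odd as they look: each one is 65536 sectors past the previous one, which matches the 32.00MB cylinder group size if you assume 512-byte sectors. The sketch below computes the location for a given cylinder group on this particular disk; the function name backup_superblock is hypothetical, and the first offset (32) and the step (65536) are taken from the sample newfs output, so other filesystems will have different values.

```sh
# Hypothetical helper: print the backup super-block location for cylinder
# group number $1 (counting from 0), using the spacing seen in the sample
# newfs -N listing. Both constants are specific to that example disk.
backup_superblock() {
    first=32        # first alternate super block
    step=65536      # sectors per cylinder group: 32MB / 512 bytes (assumed)
    echo $(( first + step * $1 ))
}
```

For instance, backup_superblock 38 yields 2490400, the value passed to fsck -b in the example above.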





FreeBSD 6 Unleashed
ISBN: 0672328755
Year: 2006
Pages: 355
Authors: Brian Tiemann
