Hack 70. Reduce Restart Times with Journaling Filesystems

Large disks and filesystem problems can drag down the boot process unless you're using a journaling filesystem. Linux gives you plenty to choose from.

Computer systems can only successfully mount and use filesystems if they can be sure that all of the data structures in each filesystem are consistent. In Linux and Unix terms, consistency means that all of the disk blocks that are actually used in some file or directory are marked as being in use, all deleted blocks aren't linked to anything other than the list of free blocks, all directories in the filesystem actually have parent directories, and so on. This check is done by filesystem consistency check applications, the best known of which is the standard Linux/Unix fsck application. Each filesystem has its own version of fsck (with names like fsck.ext3, fsck.jfs, fsck.reiserfs, and so on) that understands and "does the right thing" for that particular filesystem.

When filesystems are mounted as part of the boot process, they are marked as being in use ("dirty"). When a system is shut down normally, all its on-disk filesystems are marked as being consistent ("clean") when they are unmounted. When the system reboots, filesystems that are marked as being clean do not have to be checked before they are mounted, which saves lots of time in the boot process. However, if they are not marked as clean, the laborious filesystem consistency check process begins. Because today's filesystems are often quite large and therefore contain huge chains of files, directories, and subdirectories, each using blocks in the filesystem, verifying the consistency of each filesystem before mounting it is usually the slowest part of a computer's boot process. Avoiding filesystem consistency checks is therefore the dream of every sysadmin and a goal of every system or filesystem designer. This hack explores the basic concepts of how a special type of filesystem, known as a journaling filesystem, expedites system restart times by largely eliminating the need to check filesystem consistency when a system reboots.
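As a rough illustration of the clean/dirty marker described above: on ext2 and ext3 filesystems, the marker appears in the "Filesystem state" field printed by tune2fs -l. The sketch below only extracts that field from text on standard input. The helper name fs_state is made up for this example, and the sample text is hand-written in the shape of tune2fs -l output rather than captured from a real device.

```shell
# fs_state: hypothetical helper that pulls the "Filesystem state" field
# out of `tune2fs -l` style output supplied on standard input.
fs_state() {
    awk -F: '/^Filesystem state:/ {
        sub(/^[ \t]+/, "", $2)   # drop the padding after the colon
        print $2
        exit
    }'
}

# Hand-written sample (an assumption, not output from a real system).
sample='Filesystem volume name:   <none>
Filesystem state:         clean
Filesystem OS type:       Linux'

printf '%s\n' "$sample" | fs_state
```

On a live system you would feed it real output instead, along the lines of tune2fs -l /dev/filesystem | fs_state, run as root.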

8.3.1. Journaling Filesystems 101

Some of the more inspired among us may keep a journal to record what's happening in our lives. These come in handy if we want to look back and see what was happening to us at a specific point in time. Journaling filesystems operate in a similar manner, writing planned changes to a filesystem in a special part of the disk, called a journal or log, before actually applying them to the filesystem. (This is hard to do in a personal journal unless you're psychic.) There are multiple reasons journaling filesystems record changes in a log before applying them, but the primary reason for this is to guarantee filesystem consistency.

Using a log enforces consistency, because sets of planned changes are grouped together in the log and are replayed transactionally against the filesystem. When they are successfully applied to the filesystem, the filesystem is consistent, and all of the changes in the set are removed from the log. If the system crashes while transactionally applying a set of changes to the filesystem, the entries remain present in the log and are applied to the filesystem as part of mounting that filesystem when the system comes back up. Therefore, the filesystem is always in a consistent state or can almost always quickly be made consistent by replaying any pending transactions.
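The replay idea can be sketched with a toy model. Nothing here reflects the on-disk format of any real filesystem: the journal and state files, their line formats, and the function name replay_journal are all invented for illustration. The sketch also assumes that a transaction without a commit record is simply ignored at replay time, which goes slightly beyond what the text above describes.

```shell
# replay_journal JOURNAL STATE
# Toy model only. JOURNAL holds lines "set TXN KEY VALUE" and "commit TXN";
# STATE holds "KEY VALUE" lines standing in for filesystem metadata.
# Prints the state that results from replaying every committed transaction.
replay_journal() {
    awk '
        FILENAME == ARGV[1] {            # first file: the journal
            if ($1 == "commit") committed[$2] = 1
            if ($1 == "set") { n++; txn[n] = $2; key[n] = $3; val[n] = $4 }
            next
        }
        NF >= 2 { state[$1] = $2 }       # second file: the current state
        END {
            for (i = 1; i <= n; i++)     # apply changes in log order
                if (txn[i] in committed) state[key[i]] = val[i]
            for (k in state) print k, state[k]
        }
    ' "$1" "$2" | sort
}

work=$(mktemp -d)
# Simulated crash: transaction 1 was only half applied to the "filesystem".
printf '%s\n' 'inode7 used' 'block42 free' > "$work/state"
# Transaction 1 is complete in the log; transaction 2 never committed.
printf '%s\n' \
    'set 1 inode7 used' \
    'set 1 block42 used' \
    'commit 1' \
    'set 2 inode8 used' > "$work/journal"

replay_journal "$work/journal" "$work/state"
rm -rf "$work"
```

Replaying transaction 1 a second time is harmless because each change just sets a value, which is why a half-applied set of changes can be safely reapplied from the log when the filesystem is mounted.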

I say "almost always" because a journaling filesystem can't protect you from bad blocks appearing on your disks or from general hardware failures, which can cause filesystem corruption or loss. See "Recover Lost Partitions" [Hack #93], "Recover Data from Crashed Disks" [Hack #94], and "Repair and Recover ReiserFS Filesystems" [Hack #95] for some suggestions if fsck doesn't work for you.


8.3.2. Journaling Filesystems Under Linux

Linux offers a variety of journaling filesystems, preintegrated into the primary kernel code. Depending on the Linux distribution that you are using, these may or may not be compiled into your kernel or available as loadable kernel modules. Filesystems are activated in the Linux kernel on the File Systems pane of your favorite kernel configuration mechanism, accessed via make xconfig or (for luddites) make menuconfig. The options for the XFS journaling filesystem are grouped together on a separate pane, XFS Support.
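Before rebuilding anything, it may be worth checking what the running kernel already offers. The sketch below looks for a filesystem type in /proc/filesystems, which lists the types currently registered with the kernel. The function name fs_supported is hypothetical, and the optional second argument exists only so that the check can be pointed at a saved copy of the listing.

```shell
# fs_supported TYPE [LISTING]
# Succeeds if TYPE appears in a /proc/filesystems-style listing.
# Each line there is an optional "nodev" flag followed by the type name,
# so the type is always the last field.
fs_supported() {
    awk -v want="$1" '$NF == want { found = 1 } END { exit !found }' \
        "${2:-/proc/filesystems}"
}

for fs in ext3 jfs reiserfs xfs; do
    if fs_supported "$fs"; then
        echo "$fs: registered with the running kernel"
    else
        echo "$fs: not registered (it may still exist as an unloaded module)"
    fi
done
```

A type that is built as a loadable module only shows up in this listing once the module has been loaded, so a "not registered" result is a prompt to look further rather than proof that support is missing.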

The journaling filesystems that are integrated into the Linux kernel at the time this book was written are the following:


ext3

ext3 adds high-performance journaling capabilities to the standard Linux ext2 filesystem on which it's based. Existing ext2 filesystems can easily be converted to ext3, as explained later in this hack.


JFS

The Journaled File System (JFS) was originally developed by International Business Machines (IBM) for use on their OS/2 and AIX systems. JFS is a high-performance journaling filesystem that allocates disk space as needed from pools of available storage in the filesystem (known as allocation groups) and therefore creates inodes as needed, rather than preallocating everything as traditional Unix/Linux filesystems do. This provides fast storage allocation and also removes most limitations on the number of inodes (and therefore files and directories) that can be created in a JFS filesystem.


ReiserFS

Written by Hans Reiser and others with the financial support of companies such as SUSE, Linspire, mp3.com, and many others, ReiserFS is a high-performance, space-efficient journaling filesystem that is especially well suited to filesystems that contain large numbers of files. ReiserFS was the first journaling filesystem to be integrated into the Linux kernel code and has therefore been popular and stable for quite a while. It is the default filesystem type on Linux distributions such as SUSE Linux.


Reiser4

Written by Hans Reiser and others with the financial support of the Defense Advanced Research Projects Agency (DARPA), Reiser4 is the newest of the journaling filesystems discussed in this hack. Reiser4 is a very high-performance, transactional filesystem that further increases the extremely efficient space allocation provided by ReiserFS. It is also designed to be extended through plug-ins that can add new features without changing the core code.


XFS

Contributed to Linux by Silicon Graphics, Inc. (SGI), XFS (which doesn't really stand for anything) is a very high-performance journaling filesystem that dynamically allocates space and creates inodes as needed (like JFS), and supports a special (optional) real-time section for files that require high-performance, real-time I/O. The combination of these features provides a fast filesystem without significant limitations on the number of inodes (and therefore files and directories) that can be created in an XFS filesystem.

Each of these filesystems has its own consistency checker, filesystem creation tool, and related administrative tools. Even if your kernel supports the new type of filesystem that you've selected, make sure that your systems also include its administrative utilities, installed separately through your distribution's package manager, or you're in for a bad time the next time you reboot and a filesystem check is required.
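One quick way to spot missing utilities is to look for the mkfs.filesystem-type and fsck.filesystem-type helpers on your PATH. This is only a sketch: the function name missing_fs_tools is made up, and it assumes the tools follow that naming pattern and live in a directory on the PATH of the user running the check (on many systems that means running it as root so that /sbin is included).

```shell
# missing_fs_tools TYPE
# Prints the names of the mkfs.TYPE and fsck.TYPE helpers that cannot be
# found on PATH; prints nothing if both are present.
missing_fs_tools() {
    for tool in "mkfs.$1" "fsck.$1"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

for fs in ext3 jfs reiserfs xfs; do
    missing=$(missing_fs_tools "$fs")
    [ -z "$missing" ] || echo "$fs: missing" $missing
done
```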

The purpose of this hack is to explain why journaling filesystems are a good idea for most of the local storage that is attached to the systems you're responsible for, and to provide some tips about integrating journaling filesystems into existing systems. I can't really say more about these here without turning this hack into a tome on Linux filesystems, which I already wrote a few years ago (Linux Filesystems, SAMS Publishing), though it's now somewhat dated. All of these journaling filesystems are well established and have been used on Linux systems for a few years. Reiser4 is the newest of these and is therefore the least time-tested, but Hans assures us all that no one does software engineering like the Namesys team.

8.3.3. Converting Existing Filesystems to Journaling Filesystems

Traditional Linux systems use the ext2 filesystem for local filesystems. Because the journaling filesystems available for Linux all use their own allocation and inode/storage management mechanisms, the only journaling Linux filesystem that you can begin using with little effort is the ext3 filesystem, which was designed to be compatible with ext2.

To convert an existing ext2 filesystem to an ext3 filesystem, all you have to do is add a journal and tell your system that it is now an ext3 filesystem so that it will start using the journal. The command to create a journal on an existing ext2 filesystem (you must be root or use sudo) is the following:

 # tune2fs -j /dev/filesystem

If you create a journal on a mounted ext2 filesystem, it will initially be created as the file .journal in the root of the filesystem and will automatically be hidden when you reboot or remount the filesystem as an ext3 filesystem.


You will then need to update /etc/fstab so that the mount command mounts your converted filesystem as an ext3 filesystem, and then reboot to verify that all is well.
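The /etc/fstab change amounts to editing the type field (the third column) of the entry for the converted filesystem. The sketch below prints an edited copy instead of changing anything in place. The function name fstab_set_type, the sample entries, and the device names are all invented for the example.

```shell
# fstab_set_type MOUNTPOINT NEWTYPE [FSTAB]
# Prints FSTAB with the type field (column 3) of the entry mounted at
# MOUNTPOINT changed to NEWTYPE. The file itself is not modified.
fstab_set_type() {
    awk -v mp="$1" -v newtype="$2" '
        /^[ \t]*#/ || NF < 3 { print; next }   # keep comments and blanks
        $2 == mp { $3 = newtype }               # rewrites this line with single spaces
        { print }
    ' "${3:-/etc/fstab}"
}

# Hand-written sample fstab (hypothetical devices and mount points).
sample=$(mktemp)
cat > "$sample" <<'EOF'
# device     mountpoint  type  options   dump pass
/dev/hda1    /           ext3  defaults  1 1
/dev/hda3    /home       ext2  defaults  1 2
EOF

fstab_set_type /home ext3 "$sample"
rm -f "$sample"
```

Printing to standard output leaves room to review the result, and to keep a backup copy of the original file, before anything replaces the real /etc/fstab.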

In general, if you want to begin using any of the non-ext3 journaling filesystems discussed in this chapter with any existing system, you'll need to do the following:

  • Build support for that journaling filesystem into your Linux kernel, make it available as a loadable kernel module, or verify that it's already supported in your existing kernel.

  • Make sure you update the contents of any initial RAM disk used during the boot process to include any loadable kernel modules for the new filesystem(s) that you are using.

  • Install the administrative tools associated with the new filesystem type, if they aren't already available on your system. These include a minimum of new mkfs.filesystem-type and fsck.filesystem-type utilities, and may also include new administrative and filesystem repair utilities.

  • Manually convert your existing filesystems to the new journaling filesystem format by creating new partitions or logical volumes that are at least as large as your existing filesystems, formatting them using the new filesystem format, and recursively copying the contents of your existing filesystems into the new ones.

  • Go to single-user mode, unmount your existing filesystems, and update the entries in /etc/fstab to reflect the new filesystem types (and the new disks/volumes where they are located unless you're simply replacing an existing disk with one or more new ones).
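The format-and-copy step in the list above might look something like the following dry run, which only prints commands and executes none of them. The function name plan_migration, the placeholder device /dev/NEWDEVICE, and the mount point /mnt/newhome are all hypothetical. The copy uses cp -ax from GNU coreutils, which is one reasonable choice among several, not the only way to do a recursive copy.

```shell
# plan_migration FSTYPE NEW_DEVICE OLD_MOUNT NEW_MOUNT
# Dry run: prints the commands for formatting the new volume and copying
# the old contents onto it. Nothing is executed.
plan_migration() {
    printf 'mkfs -t %s %s\n' "$1" "$2"
    printf 'mkdir -p %s\n' "$4"
    printf 'mount -t %s %s %s\n' "$1" "$2" "$4"
    # -a preserves ownership, modes, and timestamps; -x stays on one filesystem
    printf 'cp -ax %s/. %s/\n' "$3" "$4"
}

plan_migration xfs /dev/NEWDEVICE /home /mnt/newhome
```

Reading the printed plan before running any of it by hand is a cheap guard against pointing mkfs at the wrong device.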

When migrating the contents of existing partitions and volumes to new partitions and volumes in different filesystem formats, always back up everything first and test each of the new partitions before wiping out its predecessor. Forgetting any of the steps in the previous list can turn your well-intentioned system improvement experience into a restart nightmare if your system won't boot correctly using its sexy new filesystems.

8.3.4. Summary

Journaling filesystems can significantly improve system restart times, provide more efficient use of the disk space available on your partitions or volumes, and often even increase general system performance. I personally tend to use ext3 for system filesystems such as / and /boot, since this enables me to use all of the standard ext2 filesystem repair utilities if these filesystems become corrupted. For local storage on SUSE systems, I generally use ReiserFS, because that's the default there and it's great for system partitions (such as your mail and print queues) because of its super-efficient allocation.

I tend to use XFS for physical partitions on Linux distributions other than SUSE Linux, because I've used it for years on Linux and SGI boxes, it has always been stable in my experience, and the real-time section of XFS filesystems is way cool. I generally use ext3 on logical volumes because the dynamic allocation mechanisms used by JFS and XFS and ReiserFS's tree-balancing algorithms place extra overhead on the logical volume subsystem. They all still work fine on logical volumes, of course.

8.3.5. See Also

  • "Recover Lost Partitions" [Hack #93]

  • "Recover Data from Crashed Disks" [Hack #94]

  • "Repair and Recover ReiserFS Filesystems" [Hack #95]

  • man tune2fs

  • ext3 home page: http://e2fsprogs.sourceforge.net/ext2.html

  • JFS home page: http://jfs.sourceforge.net

  • ReiserFS/Reiser4 home page: http://www.namesys.com

  • XFS home page: http://oss.sgi.com/projects/xfs/



Linux Server Hacks (Vol. 2)