Hack 61. Quick and Dirty NAS

Combining LVM, NFS, and Samba on new file servers is a quick and easy solution when you need more shared disk resources.

Network Attached Storage (NAS) and Storage Area Networks (SANs) aren't making as many people rich nowadays as they did during the dot-com boom, but they're still important concepts for any system administrator. SANs depend on high-speed disk and network interfaces, and they're responsible for the increasing popularity of other magic acronyms such as iSCSI (Internet Small Computer Systems Interface) and AoE (ATA over Ethernet), which are cool, up-and-coming technologies for transferring block-oriented disk data over fast Ethernet interfaces. On the other hand, NAS is quick and easy to set up: it just involves hanging new boxes with shared, exported storage on your network.

"Disk use will always expand to fill all available storage" is one of the immutable laws of computing. It's sad that it's as true today, when you can pick up a 400-GB disk for just over $200, as it was when I got my CS degree and the entire department ran on some DEC-10s that together had a whopping 900 MB of storage (yes, I am old). Since then, every computing environment I've ever worked in has eventually run out of disk space. And let's face itadding more disks to existing machines can be a PITA (pain in the ass). You have to take down the desktop systems, add disks, create filesystems, mount them, copy data around, reboot, and then figure out how and where you're going to back up all the new space.

This is why NAS is so great. Need more space? Simply hang a few more storage devices off the network and give your users access to them. Many companies made gigabucks off this simple concept during the dot-com boom (more often by selling themselves than by selling hardware, but that's beside the point). The key for us in this hack is that Linux makes it easy to assemble your own NAS boxes from inexpensive PCs and add them to your network for a fraction of the cost of preassembled, nicely painted, dedicated NAS hardware. This hack is essentially a meta-hack, in which you can combine many of the tips and tricks presented throughout this book to save your organization money while increasing the control you have over how you deploy networked storage, and thus your general sysadmin comfort level. Here's how.

6.7.1. Selecting the Hardware

Like all hardware purchases, what you end up with is contingent on your budget. I tend to use inexpensive PCs as the basis for NAS boxes, and I'm completely comfortable with basing NAS solutions on today's reliable, high-speed EIDE drives. The speed of the disk controller(s), disks, and network interfaces is far more important than the CPU speed. This is not to say that recycling an old 300-MHz Pentium as the core of your NAS solutions is a good idea, but any reasonably modern 1.5-GHz or greater processor is more than sufficient. Most of what the box will be doing is serving data, not playing Doom. Thus, motherboards with built-in graphics are also fine for this purpose, since fast, hi-res graphics are equally unimportant in the NAS environment.

In this hack, I'll describe minimum requirements for hardware characteristics and capabilities rather than making specific recommendations. As I often say professionally, "Anything better is better." That's not me taking the easy way out; it's me ensuring that this book won't be outdated before it actually hits the shelves.


My recipe for a reasonable NAS box is the following:

  • A mini-tower case with at least three external, full-height drive bays (four is preferable) and a 500-watt or greater power supply with the best cooling fan available. If you can get a case with mounting brackets for extra cooling fans on the sides or bottom, do so, and purchase the right number of extra cooling fans. This machine is always going to be on, pushing at least four disks, so it's a good idea to get as much power and cooling as possible.

  • A motherboard with integrated video hardware, at least 10/100 onboard Ethernet (10/100/1000 is preferable), and USB or FireWire support. Make sure that the motherboard supports booting from external USB (or FireWire, if available) drives, so that you won't have to waste a drive bay on a CD or DVD drive. If at all possible, on-board SATA is a great idea, since that will enable you to put the operating system and swap space on an internal disk and devote all of the drive bays to storage that will be available to users. I'll assume that you have on-board SATA in the rest of this hack.

  • A 1.5-GHz or better Celeron, Pentium 4, or AMD processor compatible with your motherboard.

  • 256 MB of memory.

  • Five removable EIDE/ATA drive racks and trays, hot-swappable if possible. Four are for the system itself; the extra one gives you a spare tray to use when a drive inevitably fails.

  • One small SATA drive (40 GB or so).

  • Four identical EIDE drives, as large as you can afford. At the time I'm writing this, 300-GB drives with 16-MB buffers cost under $150. If possible, buy a fifth drive as a spare, plus two more for backup purposes.

  • An external CD/DVD USB or FireWire drive for installing the OS.

I can't really describe the details of assembling the hardware because I don't know exactly what configuration you'll end up purchasing, but the key idea is that you put a drive tray in each of the external bays, with one of the IDE/ATA drives in each, and put the SATA drive in an internal drive bay. This means that you'll still have to open up the box to replace the system disk if it ever fails, but it enables you to maximize the storage that this system makes available to users, which is its whole reason for being. Putting the EIDE/ATA disks in drive trays means that you can easily replace a failed drive without taking down the system if the trays are hot-swappable. Even if they're not, you can bounce a system pretty quickly if all you have to do is swap in another drive and you already have a spare tray available.

At the time I wrote this, the hardware setup cost me around $1000 (exclusive of the backup hard drives) with some clever shopping, thanks to http://www.pricewatch.com. This got me a four-bay case; a motherboard with onboard GigE, SATA, and USB; four 300-GB drives with 16-MB buffers; hot-swappable drive racks; and a few extra cooling fans.

6.7.2. Installing and Configuring Linux

As I've always told everyone (regardless of whether they ask), I always install everything, regardless of which Linux distribution I'm using. I personally prefer SUSE for commercial deployments, because it's supported, you can get regular updates, and I've always found it to be an up-to-date distribution in terms of supporting the latest hardware and providing the latest kernel tweaks. Your mileage may vary. I'm still mad at Red Hat for abandoning everyone on the desktop, and I don't like GNOME (though I install it "because it's there" and because I need its libraries to run Evolution, which is my mailer of choice due to its ability to interact with Microsoft Exchange). Installing everything is easy. We're building a NAS box here, not a desktop system, so 80% of what I install will probably never be used, but I hate to find that some tool I'd like to use isn't installed.

To install the Linux distribution of your choice, attach the external CD/DVD drive to your machine and configure the BIOS to boot from it first and the SATA drive second. Put your installation media in the external CD/DVD drive and boot the system. Install Linux on the internal SATA drive. As discussed in "Reduce Restart Times with Journaling Filesystems" [Hack #70], I use ext3 for the /boot and / partitions on my systems so that I can easily repair them if anything ever goes wrong, and because every Linux distribution and rescue disk in the known universe can handle ext2/ext3 partitions. There are simply more ext2/ext3 tools out there than there are for any other filesystem. You don't have to partition or format the drives in the bays; we'll do that after the operating system is installed and booting.

Done installing Linux? Let's add and configure some storage.

6.7.3. Configuring User Storage

Determining how you want to partition and allocate your disk drives is one of the key decisions you'll need to make, because it affects both how much space your new NAS box will be able to deliver to users and how maintainable your system will be. To build a reliable NAS box, I use Linux software RAID to mirror the master on the primary IDE interface to the master on the secondary IDE interface and the slave on the primary IDE interface to the slave on the secondary IDE interface. I put them in the case in the following order (from the top down): master primary, slave primary, master secondary, and slave secondary. Having a consistent, specific order makes it easy to know which is which since the drive letter assignments will be a, b, c, and d from the top down, and also makes it easy to know in advance how to jumper any new drive that I'm swapping in without having to check.

By default, I then set up Linux software RAID and LVM so that the two drives on the primary IDE interface are in a logical volume group [Hack #47].

On systems with 300-GB disks, this gives me 600 GB of reliable, mirrored storage to provide to users. If you're less nervous than I am, you can skip the RAID step and just use LVM to deliver all 1.2 TB to your users, but backing that up will be a nightmare, and if any of the drives ever fail, you'll have 1.2 TB worth of angry, unproductive users. If you need 1.2 TB of storage, I'd strongly suggest that you spend the extra $1000 to build a second one of the boxes described in this hack. Mirroring is your friend, and it doesn't get much more stable than mirroring a pair of drives to two identical drives.
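
If you want to see roughly what this looks like on the command line, here's a hedged sketch using mdadm and the LVM tools. The device names (/dev/hda and /dev/hdb on the primary IDE interface, /dev/hdc and /dev/hdd on the secondary) and the volume group name data are just examples; adjust them to match your hardware, and see [Hack #47] for the full procedure.

 # Assumes each drive already has a single full-size partition of type "fd"
 # (Linux raid autodetect). Mirror primary master to secondary master, and
 # primary slave to secondary slave:
 mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/hda1 /dev/hdc1
 mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/hdb1 /dev/hdd1

 # Combine the two mirrors into a single LVM volume group named "data":
 pvcreate /dev/md0 /dev/md1
 vgcreate data /dev/md0 /dev/md1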

If you experience performance problems and you need to export filesystems through both NFS and Samba, you may want to consider simply making each of the drives on the main IDE interface its own volume group, keeping the same mirroring layout, and exporting each drive as a single filesystem: one for SMB storage for your Windows users and the other for your Linux/Unix NFS users.


The next step is to decide how you want to partition the logical storage. This depends on the type of users you'll be delivering this storage to. If you need to provide storage to both Windows and Linux users, I suggest creating separate partitions for SMB and NFS users. The access patterns for the two classes of users and the different protocols used for the two types of networked filesystems are different enough that it's not a good idea to export a filesystem via NFS and have other people accessing it via SMB. With separate partitions they're still both coming to the same box, but at least the disk and operating system can cache reads and handle writes appropriately and separately for each type of filesystem.
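
As a hypothetical example, here's how you might split the data volume group from the earlier sketch into one logical volume for your NFS users and another for your SMB users; the names and sizes are purely illustrative, and creating the filesystems themselves is covered below:

 # Carve the volume group into two logical volumes, one per protocol:
 lvcreate -L 300G -n nfs_home data
 lvcreate -L 300G -n smb_share data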

Getting insights into the usage patterns of your users can help you decide what type of filesystem you want to use on each of the exported filesystems [Hack #70]. I'm a big ext3 fan because so many utilities are available for correcting problems with ext2/ext3 filesystems.

Regardless of the type of filesystem you select, you'll want to mount it using noatime to minimize file and filesystem updates due to access times. Creation time (ctime) and modification time (mtime) are important, but I've never cared much about access time and it can cause a big performance hit in a shared, networked filesystem. Here's a sample entry from /etc/fstab that includes the noatime mount option:

 /dev/data/music   /mnt/music   xfs   defaults,noatime   0 0 

Similarly, since many users will share the filesystems in your system, you'll want to create the filesystem with a relatively large log. For ext3 filesystems, the size of the journal is always at least 1,024 filesystem blocks, but larger logs can be useful for performance reasons on heavily used systems. I typically use a log of 64 MB on NAS boxes, because that seems to give the best tradeoff between caching filesystem updates and the effects of occasionally flushing the logs. If you are using ext3, you can also specify the journal flush/sync interval using the commit=number-of-seconds mount option. Higher values help performance, and anywhere between 15 and 30 seconds is a reasonable value on a heavily used NAS box (the default value is 5 seconds). Here's how you would specify this option in /etc/fstab:

 /dev/data/writing /mnt/writing ext3 defaults,noatime,commit=15 0 0 
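
The 64-MB journal itself is specified when you create the filesystem. Here's a hedged sketch using mkfs.ext3 (and tune2fs, if the filesystem already exists); the device name is just an example:

 # Create an ext3 filesystem with a 64-MB journal:
 mkfs.ext3 -J size=64 /dev/data/writing

 # Or, on an existing (cleanly unmounted) ext3 filesystem, recreate the journal:
 tune2fs -O ^has_journal /dev/data/writing
 tune2fs -j -J size=64 /dev/data/writing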

A final consideration is how to back up all this shiny new storage. I generally let the RAID subsystem do my backups for me by shutting down the systems weekly, swapping out the mirrored drives with a spare pair, and letting the RAID system rebuild the mirrors automatically when the system comes back up. Disk backups are cheaper and less time-consuming than tape [Hack #50], and letting RAID mirror the drives for you saves you the manual copy step discussed in that hack.
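
When you bring the box back up after swapping drives, the md driver should resync the mirrors on its own, but you can also kick off (and watch) a rebuild by hand with mdadm. The device names below are illustrative:

 # Tell the RAID subsystem the old partition is gone, add the replacement
 # (the drive tray keeps the device name the same), and watch the rebuild:
 mdadm --manage /dev/md0 --fail /dev/hdc1
 mdadm --manage /dev/md0 --remove /dev/hdc1
 mdadm --manage /dev/md0 --add /dev/hdc1
 cat /proc/mdstat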

6.7.4. Configuring System Services

Fine-tuning the services running on the soon-to-be NAS box is an important step. Turn off any services you don't need [Hack #63]. The core services you will need are an NFS server, a Samba server, a distributed authentication mechanism, and NTP. It's always a good idea to run an NTP server [Hack #22] on networked storage systems to keep the NAS box's clock in sync with the rest of your environment; otherwise, you can get some weird behavior from programs such as make.
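
On the SysV-style distributions mentioned in this hack (SUSE, Red Hat, and friends), chkconfig is a quick way to prune boot-time services and make sure the ones you need are enabled. Service names vary by distribution, so treat these as examples:

 # See what currently starts in runlevel 3, and trim the fat:
 chkconfig --list | grep '3:on'
 chkconfig cups off
 chkconfig bluetooth off

 # Make sure the services a NAS box actually needs come up at boot
 # (on Red Hat-style systems the names are nfs and ntpd):
 chkconfig nfsserver on
 chkconfig smb on
 chkconfig ntp on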

You should also configure the system to boot in a non-graphical runlevel, which is usually runlevel 3 unless you're a Debian fan. I also typically install Fluxbox [Hack #73] on my NAS boxes and configure X to automatically start that rather than a desktop environment such as GNOME or KDE. Why waste cycles?
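
On a SysV init system, the default runlevel lives in /etc/inittab; the initdefault line should look like the following (Debian-based systems differ, as noted above):

 # /etc/inittab - boot to the non-graphical multiuser runlevel:
 id:3:initdefault: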

"Centralize Resources Using NFS" [Hack #56] explained setting up NFS and "Share Files Across Platforms Using Samba" [Hack #60] shows the same for Samba. If you don't have Windows users, you have my congratulations, and you don't have to worry about Samba.

The last step involved in configuring your system is to select the appropriate authentication mechanism so that you have the same users on the NAS box as you do on your desktop systems. This is completely dependent on the authentication mechanism used in your environment in general. Chapter 1 of this book discusses a variety of available authentication mechanisms and how to set them up. If you're working in an environment with heavy dependencies on Windows for infrastructure such as Exchange (shudder!), it's often best to bite the bullet and configure the NAS box to use Windows authentication. The critical point for NAS storage is that your NAS box must share the same UIDs, users, and groups as your desktop systems, or your users are going to have problems using the new storage provided by the NAS box. One round of authentication problems is generally enough for any sysadmin to fall in love with a distributed authentication mechanism; which one you choose depends on how your computing environment has been set up in general and what types of machines it contains.
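
If you do end up authenticating against Windows, the usual approach is Samba's winbind. The following smb.conf fragment is only a hedged sketch: the realm, workgroup, and ID ranges are placeholders, and you'll still need to join the domain (net ads join) and add winbind to /etc/nsswitch.conf:

 [global]
     security = ads
     realm = EXAMPLE.COM
     workgroup = EXAMPLE
     idmap uid = 10000-20000
     idmap gid = 10000-20000
     winbind enum users = yes
     winbind enum groups = yes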

6.7.5. Deploying NAS Storage

The final step in building your NAS box is to actually make it available to your users. This involves creating some number of directories for the users and groups who will be accessing the new storage. For Linux users and groups who are focused on NFS, you can create top-level directories for each user and automatically mount them for your users using the NFS automounter and a similar technique to that explained in [Hack #57], wherein you automount your users' NAS directories as dedicated subdirectories somewhere in their accounts. For Windows users who are focused on Samba, you can do the same thing by setting up an [NAS] section in the Samba server configuration file on your NAS box and exporting your users' directories as a named NAS share.
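
Here's a hedged sketch of both halves, with host names, paths, and share names that are purely illustrative. On the NAS box, export the NFS volume and define the Samba share; on the Linux clients, a simple automounter map brings each user's directory in under /nas:

 # /etc/exports on the NAS box:
 /mnt/nfs_home    *.example.com(rw,sync,no_subtree_check)

 # /etc/auto.master on each Linux client:
 /nas    /etc/auto.nas

 # /etc/auto.nas on each Linux client (a wildcard map - one entry serves all users):
 *    -rw,soft,intr    nasbox:/mnt/nfs_home/&

 # [NAS] share in smb.conf on the NAS box:
 [NAS]
     path = /mnt/smb_share
     browseable = yes
     writable = yes
     valid users = @users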

6.7.6. Summary

Building and deploying your own NAS storage isn't really hard, and it can save you a significant amount of money over buying an off-the-shelf NAS box. Building your own NAS systems also helps you understand how they're organized, which simplifies maintenance, repairs, backups, and even the occasional but inevitable replacement of failed components. Try it; you'll like it!

6.7.7. See Also

  • "Combine LVM and Software RAID" [Hack #47]

  • "Centralize Resources Using NFS" [Hack #56]

  • "Share Files Across Platforms Using Samba" [Hack #60]

  • "Reduce Restart Times with Journaling Filesystems" [Hack #70]


