Walt Ligon and Rob Ross
Ever more frequently, users of clusters find themselves in an interesting situation: it isn't the processors, communication network, or memory that is limiting their application; it is the storage system. This might force users to checkpoint less frequently than they would like, might limit the resolution of output visualization data, or might prevent the use of out-of-core solutions needed for the largest of problems. Worse still, the I/O hardware in the system may be perfectly adequate for the user's needs yet be used ineffectively by one of the many software layers involved.
A lot of mystery surrounds I/O solutions in clusters today. For this reason we have rewritten this chapter for the second edition. We begin by covering what we believe are some of the most important issues in parallel I/O systems: parallel access patterns, parallel I/O system components and architectures, and consistency semantics. Knowing how parallel I/O systems operate, and the issues involved, is useful when tuning the I/O performance of an application on a particular system or when choosing an I/O solution to match expected workloads. This material builds on many preceding chapters, including the I/O hardware discussion in Chapter 2, the local and distributed file system discussion in Chapter 3, and the network hardware discussion in Chapter 4.
Following this more general discussion, we delve into PVFS, covering some of its quirks, its management and tuning, and approaches for narrowing down the source of problems that may crop up. Finally, we discuss some critical issues for parallel file systems and how PVFS2, the next-generation parallel file system being developed by the PVFS team, attempts to address them.
These are very interesting times for parallel file systems on Linux clusters. As we are writing this chapter, the Lustre, PVFS2, and GPFS groups are all bringing new parallel file systems to the Linux cluster environment. The relative success of each of these is not likely to be known for quite some time, but we can certainly hope that at least one of these projects will result in a new, high-performance parallel file system designed to operate on systems with thousands of nodes (and, we hope, more!).