It seems likely that Beowulf-style cluster computing will continue to grow, due to considerations of both supply (costs will continue to decrease, driven by commodity markets) and demand (more applications will come into existence and evolve to exploit parallelism to meet their computing resource requirements). As the use of clusters grows, we will see even more "integration vendors" that bundle pre-assembled hardware with increasingly professional software to provide turnkey solutions. At the same time, those seeking the most economical solutions will still be able to create their own quite capable parallel computers from components available at the nearest mall and software they can download for free. A wonderful thing about Beowulf computing is that the same technology underlies both approaches.
The amazing increases in CPU clock rates will continue, at least for the next few years, following the "doubling every 18 months" prediction of Moore's law. However, Moore's law, which is really an observation about the rate at which the size of features such as transistors shrinks on a silicon wafer, cannot hold true indefinitely. If nothing else, feature sizes are rapidly approaching the dimensions of a single atom, beyond which no further reduction will be possible (even if a gate can be built from a single atom). One possibility is to increase CPU power by using on-chip parallelism; a number of research groups are already exploring such approaches. In some ways, these CPUs become little clusters. Other approaches explore different architectures, concentrating on a memory-centric, rather than processor-centric, computing model. Whatever approach is taken, we can expect that CPUs will continue to increase rapidly in performance.
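To make the compounding implied by "doubling every 18 months" concrete, here is a minimal sketch; the 1 GHz starting clock rate is an illustrative assumption, not a figure from any vendor:

```python
# Sketch: growth factor under "doubling every 18 months" (Moore's law pace).
# The 18-month doubling period comes from the text; the 1 GHz base clock
# rate is a hypothetical starting point chosen for illustration.

def growth_factor(years, doubling_period_years=1.5):
    """Growth factor after `years`, doubling every `doubling_period_years`."""
    return 2 ** (years / doubling_period_years)

base_clock_ghz = 1.0  # illustrative assumption
for years in (1.5, 3.0, 6.0):
    f = growth_factor(years)
    print(f"after {years:4.1f} years: x{f:.0f} -> {base_clock_ghz * f:.0f} GHz")
```

Six years at this pace is four doublings, a sixteenfold increase, which is why even modest deviations from the 18-month period compound into large differences over a decade.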
The development with the greatest potential impact on the range of applications that can exploit cluster computing would be for interconnection networks to begin improving at the pace of Moore's law. Historically this has not been the case, but recent developments are encouraging. The early parallel computer networks ran at relatively low speeds. (The Intel iPSC 1 used the original Ethernet to connect nodes based on the Intel 80286 CPU.) Speeds increased rapidly through the era of the Intel Paragon and the ASCI Red machine, which provided more than 100 MByte/second of bandwidth between nodes. It is unfortunate that these early networks were never commoditized into high-performance system area networks. The one exception is Myrinet, which grew out of Chuck Seitz's pioneering work on the Cosmic Cube, a machine that can be viewed as the ancestor of all cluster computers because it used commodity CPUs as its building blocks.
One solution to the problem of commodity, multi-vendor high-performance networking may be InfiniBand. The original goals for InfiniBand included doubling bandwidth at the same rate as Moore's law: every 18 months. Unlike latency, which is constrained by the speed of light to no less than about 1 ns/foot (roughly 3 ns/meter), increasing bandwidth is an engineering problem. InfiniBand vendors are just beginning to sell large-scale switches this year. Time will tell whether InfiniBand achieves its promise and provides a suitably cost-effective cluster network. Software will also be required; fortunately, support for MPI is already available, both as part of the MPICH project and through MVAPICH (nowlab.cis.ohio-state.edu/projects/mpi-iba).
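The speed-of-light floor on latency quoted above is easy to verify with a short calculation; the 10-meter machine-room distance below is an illustrative assumption, and real copper or fiber links propagate signals at only about two-thirds of c, so actual wire latency is somewhat worse than this bound:

```python
# Lower bound on one-way signal latency imposed by the speed of light
# in vacuum. Real cables propagate at roughly 2/3 c, so this is a floor.
# The 10 m machine-room distance is an illustrative assumption.

C_VACUUM_M_PER_S = 299_792_458  # speed of light in vacuum

def min_latency_ns(distance_m, speed_m_per_s=C_VACUUM_M_PER_S):
    """Lower bound on one-way latency over `distance_m`, in nanoseconds."""
    return distance_m / speed_m_per_s * 1e9

print(f"{min_latency_ns(1.0):.2f} ns per meter")      # about 3.3 ns/m
print(f"{min_latency_ns(0.3048):.2f} ns per foot")    # about 1 ns/ft
print(f"{min_latency_ns(10.0):.1f} ns across 10 m")
```

Note that this bound applies only to the wire; in practice, end-to-end message latency on clusters of this era was dominated by software and adapter overheads measured in microseconds, not the tens of nanoseconds the physics allows.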
This year (2003), multiple Linux clusters are being installed with more than a thousand nodes each. Even larger "Beowulf-like" systems are coming soon, such as the 10,000-CPU Red Storm machine from Cray and the 64,000-CPU BG/L from IBM. These will be among the most powerful computers in the world when they are installed. While you can't buy all of their components at the corner electronics store, many of the topics covered in this book are relevant to their design, system software, programming, use, and management. And in the future, the technology used in these machines may become more generally available.
One interesting open-source effort is the Scalable Systems Software project (www.scidac.org/ScalableSystems), in which a number of groups are collaborating on a component architecture, with well-defined interfaces expressed in XML, for the systems software (schedulers, process managers, monitors, accounting systems, and so on) of large systems. The component structure allows alternative component implementations to evolve independently and interact with other, separately developed components.
Nodes developed for the game market have become capable enough to run Linux and thus to serve as cluster building blocks; a number of sites have assembled clusters from Sony PlayStation 2s. Continuing downward in size, as we noted above, increasing transistor densities are leading toward clusters that fit on a single chip. You can already buy small clusters that fit in a single PC tower; desktop clusters will become commonplace in the next few years. And someday soon, you may have a cluster in your PDA or cell phone.