Understanding Basic Clustering Concepts


Clusters are meant to run "jobs" exclusively for a single user, often without interaction. A job scheduler manages the tasks, and libraries and programs manage the behavior of each task. It's basically that simple.

Why cluster systems? Some tasks are simply too large for one machine, so multiple systems each calculate part of a sizeable project or problem. Here is a simple example: I built a six-system (12-processor) cluster. The systems have Intel Pentium III 866MHz processors and 512MB of RAM per processor (not fast by today's standards). Configuring ClusterKnoppix on the entire cluster for the first time took about 15 minutes. As a test, I used a graphics rendering package called POV-Ray to render a benchmark file called skyvase.pov. (POV-Ray rendering is often referred to as an embarrassingly parallel application; it runs easily on a cluster because the process involves no cross-communication or similar complexities.) Running the benchmark on a single system took 1 minute, 24.5 seconds. Running the same benchmark across all 12 processors took 34 seconds, roughly 2.5 times faster. The point is pretty obvious: If the problem that you're trying to solve fits the parallel computing model, you'll see a dramatic increase in computational speed and power.
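To make the arithmetic and the "embarrassingly parallel" idea concrete, here is a minimal Python sketch. It is not how POV-Ray or ClusterKnoppix actually divides work; the function names speedup and split_rows, the 480-row image, and the one-chunk-per-processor split are all assumptions made for illustration. It computes the speedup from the two timings quoted above and shows one way to hand each worker an independent slice of image rows.

```python
def speedup(baseline_seconds, parallel_seconds):
    """Return how many times faster the parallel run was than the baseline."""
    return baseline_seconds / parallel_seconds


def split_rows(total_rows, workers):
    """Divide image rows into contiguous, non-overlapping chunks.

    Hypothetical helper: each chunk could be rendered by a different
    worker without talking to any other worker, which is what makes
    the job "embarrassingly parallel".
    """
    base, extra = divmod(total_rows, workers)
    chunks = []
    start = 0
    for i in range(workers):
        # The first `extra` workers take one additional row each.
        size = base + (1 if i < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


if __name__ == "__main__":
    # Timings from the skyvase.pov test: 1 min 24.5 s versus 34 s.
    print(f"speedup: {speedup(84.5, 34.0):.2f}x")

    # Assumed 480-row image split across the 12 processors.
    for worker, rows in enumerate(split_rows(480, 12)):
        print(f"worker {worker}: rows {rows.start}-{rows.stop - 1}")
```

The sketch does not compute a per-processor efficiency figure. The text does not say how many processors the single-system run used, so any such figure would depend on an assumed baseline.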

A Linux cluster is a bunch of systems that share nothing; you can't read the memory on processor 0 on machine foo from processor 0 on machine bar. You could object and say, "No, but there is fancy hardware MemFooBar that will do shared memory" (often called shmem), or "But I have a massive 64-way SMP system running Linux and I can see all the memory from any processor. Some expert you are, Mr. Numbskull." That's not the kind of cluster that's being discussed here. This chapter is tackling cheap, ubiquitous PC-quality systems networked together with Ethernet. The most you're going to share is an NFS filesystem.
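The share-nothing idea can be imitated on one machine, with the caveat that this is only an analogy: two operating-system processes stand in for machines foo and bar, and a pipe stands in for the Ethernet link. The sketch below assumes Python 3.7 or later; CHILD_CODE and sum_in_separate_process are hypothetical names. The child process cannot see the parent's variables, so the data has to be sent to it explicitly and the answer sent back the same way.

```python
import json
import subprocess
import sys

# Program run by the child process. It has its own memory, so the only
# data it knows about is what arrives on its standard input.
CHILD_CODE = """
import json, sys
numbers = json.load(sys.stdin)
print(json.dumps({"total": sum(numbers)}))
"""


def sum_in_separate_process(numbers):
    """Send `numbers` to a separate Python process and return its sum.

    Nothing is shared: the list is serialized, copied over a pipe,
    and the result is copied back.
    """
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=json.dumps(numbers),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)["total"]


if __name__ == "__main__":
    print(sum_in_separate_process([1, 2, 3, 4]))
```

On a real cluster the copying happens over the network rather than a local pipe, but the consequence is the same: any data a node needs has to be shipped to it explicitly.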

This sort of cluster, one made up of independent machines, was once complicated to set up and maintain. Now, thanks to Knoppix-derived distributions, you can set one up in a matter of minutes. In this chapter, you set up clusters using ParallelKnoppix and ClusterKnoppix, and then run various applications on them.



Hacking Knoppix (ExtremeTech)
ISBN: 0764597841
Year: 2007
Pages: 118
