As just described, clusters may exist in many different forms. The most common cluster types are:
High availability (HA)
High performance computing (HPC)
Horizontal scaling (HS)
Note that the boundaries between these cluster types are somewhat indistinct; an actual cluster often has the properties of, or provides the functions of, more than one of these types.
High-availability clusters are typically built to provide a fail-safe environment through redundancy; that is, they provide a computing environment where the failure of one or more components (hardware, software, or networking) does not significantly affect the availability of the application or applications being used. In the simplest case, two computers are configured identically with access to shared storage. During normal operation, the application environment executes on one system, while the other stands by, ready to take over running the application in the case of a failure.
When a failure does occur, the second system takes over the appropriate resources (storage, networking address, and so on). This process is typically called failover. The second system then completely replaces the failed system, and the end users need not know that their applications are running on a different physical machine.
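The active/standby logic described above can be sketched in a few lines. This is a minimal illustration, not a real HA stack: the heartbeat check and the resource takeover are stand-ins for the mechanisms a product such as Heartbeat or Pacemaker would provide, and all names here (`StandbyNode`, `failover`, and so on) are hypothetical.

```python
class StandbyNode:
    """Toy standby system that monitors an active peer and fails over."""

    def __init__(self, check):
        self.check = check      # callable: True while the active node is healthy
        self.active = False     # becomes True once this node takes over

    def monitor(self, max_checks):
        """Poll the active node up to max_checks times; fail over on failure."""
        for _ in range(max_checks):
            if not self.check():
                self.failover()
                return True     # a failover occurred
        return False

    def failover(self):
        # A real cluster would claim the shared storage and the service
        # IP address here, then start the application locally.
        self.active = True


# Simulate an active node whose heartbeat fails on the third check.
beats = iter([True, True, False])
standby = StandbyNode(check=lambda: next(beats))
occurred = standby.monitor(max_checks=5)
print(occurred, standby.active)  # → True True
```

The key design point is that the standby holds no resources while the active node is healthy; it only acquires them at failover time, which is why shared storage must be accessible from both machines.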
High performance computing clusters are designed to use parallel computing techniques to apply more processor power in order to develop a solution for a given problem. There are many examples of this in the scientific computing arena where multiple low-cost processors are used in parallel to perform a large number of operations. This is referred to as parallel computing or parallelism. In How to Build a Beowulf: A Guide to the Implementation and Application of PC Clusters (Scientific and Engineering Computation), by Sterling, et al, parallelism is defined as "the ability of many independent threads of control to make progress simultaneously toward the completion of a task."
High performance clusters are typically made up of a large number of computers. Designing a high performance cluster is a challenging process that needs careful attention throughout the entire lifecycle of the solution. A typical solution lifecycle includes five basic phases: Requirements Analysis, Design, Implementation, Configuration, and Maintenance. IBM takes this a step further with its Business Transformation Management System (BTMS); through this and other systems, IBM ensures that each customer receives a true end-to-end solution.
Here is a list of some key areas that must be taken into consideration during the entire lifecycle of your project. We then examine how the IBM Cluster 1350, used in conjunction with CSM and GPFS, addresses these issues:
The subsystems that will enable inter-process communication between the nodes and enable the coordination of a parallel workload
Configuration of parallel, concurrent, and high performance access to the same file system(s)
Management of a large number of computers
Addressing the areas listed above can be an overwhelming task, especially if you are required to build an HPC solution from scratch. IBM addressed this need with the release of the IBM Cluster 1350, a fully integrated, turnkey solution for the HPC market that targets each of these concerns. The Cluster 1350 is built upon a solid foundation and incorporates matched hardware, software, and integrated subsystems.
The value of the Cluster 1350 goes beyond other basic systems by automating many of the most difficult processes associated with typical HPC and Beowulf configurations.
IBM takes the Cluster 1350 solution one step further by providing both CSM and GPFS. These products allow for a much higher level of automation and are discussed in greater detail later in this redbook.
The goal of high performance clustering is to present what appears to be a single "virtual" system to any given process or task. When the cluster is configured properly, the process or task has no idea that its workload is being divided into smaller, more manageable pieces and then delegated for simultaneous execution by many or all of the compute nodes.
Horizontal scaling clusters are used to provide a single interface to a set of resources that can arbitrarily grow (or shrink) in size over time. The most common example of this is a Web server farm. In this example, a single interface (URL) is provided, but requests coming in through that interface can be allocated across a large set of servers providing higher capacity and the ability to manage the end-user experience through functions such as load balancing.
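The single-interface idea described above can be sketched with a simple round-robin dispatcher. This is an illustration only: real farms use dedicated load balancers with health checks and weighted policies, and the server names below (`web1`, `web2`, `web3`) are hypothetical.

```python
from itertools import cycle


class RoundRobinFarm:
    """One interface hiding a server pool that can grow over time."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._rotation = cycle(self.servers)

    def dispatch(self, request):
        """Route a request to the next server in rotation."""
        return next(self._rotation)

    def add_server(self, server):
        """Grow the farm; rotation restarts over the enlarged pool."""
        self.servers.append(server)
        self._rotation = cycle(self.servers)


farm = RoundRobinFarm(["web1", "web2"])
routes = [farm.dispatch(f"GET /page{i}") for i in range(4)]
print(routes)  # → ['web1', 'web2', 'web1', 'web2']

# The interface is unchanged when capacity is added.
farm.add_server("web3")
print(len(farm.servers))  # → 3
```

Because clients only ever see the single dispatch interface (in practice, one URL), servers can be added, removed, or replaced behind it without any client-visible change.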
Of course, this kind of cluster also provides significant redundancy. If one server out of a large farm fails, the failure will likely be transparent to the users, so this model also has many of the attributes of a high-availability cluster. Likewise, because the work is shared among many nodes, it is also a form of high performance computing.