24.3 A High Availability Cluster


In this section, we discuss the Technology Infrastructure we use to minimize the "brief time" during which our application may be unavailable. At the heart of this Technology Infrastructure will be a High Availability Cluster. This is a collection of hardware and software components designed to enable us to maximize the uptime of our systems and applications by eliminating as many Single Points Of Failure (SPOF) as we can "afford." When I say "afford," you should be thinking of the total cost of a failure: the time the device will be unavailable, including the time to replace, update, and bring the new device into operation. This can be quite a considerable time for devices such as CPUs and memory, especially when we consider that at present we must shut down HP-UX to replace a faulty processor or memory module. Before we look at a basic High Availability cluster, I will clear up some assumptions:

  1. You have been careful in selecting competent and experienced Support Partners to assist you in achieving your goals in relation to High Availability.

  2. You have reviewed and updated, where appropriate, your IT processes in order to ensure that everyone involved in your High Availability initiative is aware, informed, and trained in his or her role and responsibilities.

I have always been advised not to make assumptions because an assumption "only makes an ASS out of an UMPTION," but in this case I am going to persevere with the above two assumptions. This leaves us with the third of our Pillars of High Availability, the "Technology Infrastructure."
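
To put some rough numbers on the "total cost of a failure" mentioned above, here is a minimal sketch using the standard availability formula, MTBF / (MTBF + MTTR). All of the figures are hypothetical placeholders; substitute your own failure rates, repair times, and hourly outage costs.

    # A rough sketch of the downtime arithmetic behind "what we can afford."
    # All figures are hypothetical placeholders, not measured values.

    hours_per_year = 8760.0
    mtbf_hours = 4380.0      # assumed mean time between failures (one failure every six months)
    mttr_hours = 6.0         # assumed time to replace, update, and make the device operational
    cost_per_hour = 10000.0  # assumed cost to the business of one hour of downtime

    availability = mtbf_hours / (mtbf_hours + mttr_hours)
    annual_downtime_hours = (1 - availability) * hours_per_year
    annual_outage_cost = annual_downtime_hours * cost_per_hour

    print(f"Availability          : {availability:.4%}")
    print(f"Downtime per year     : {annual_downtime_hours:.1f} hours")
    print(f"Cost of that downtime : ${annual_outage_cost:,.0f}")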

Let's look at "clustering" as a technology in itself. We in the High Availability Clustering arena are not unique in employing redundant components and resources to achieve a task. Understanding some of the other clustering technologies will allow us to see where our approach to clustering "fits in" and how other vendors approach and use clustering to meet their own goals.

At present, there are no formal rules as to what constitutes a cluster, which leads to many misconceptions and misunderstandings. Clustering is essentially a collection of interconnected complete computer systems collaborating to achieve a particular end goal. The individual components of the cluster lose their individuality because, from a user's perspective, they appear as a single computing resource instead of a collection of independent machines. As a result of combining multiple complete computer systems into a single "whole," we can immediately identify two aspirations for a cluster, by virtue of the fact that it is employing more hardware:

  • Parallel and/or high performance processing

  • Application availability

These are by no means the only two aspirations, but they are broad enough to allow us to look at some key technologies currently available and ascertain which category or categories a particular technology aspires to.

  • High Performance Cluster: This is where we take a (large) collection of individual nodes and "glue" them together using (normally) proprietary cabling and interfaces. The nodes are "coupled" in such a way that they run their own instance of the operating system but provide a single interface to the user. Events are managed via a central coordinator of the "coupling facility." Individual nodes will need access to a common pool of IO devices, possibly via a Storage Area Network (SAN). The underlying operating system will have to provide some form of cluster-wide filesystem or allow cluster-wide access to raw disk space. This is unlike a Massively Parallel Processing (MPP) system that normally "glues" together nodes at the CPU/memory level with a separate IO system used by all. An MPP will run one instance of the operating system. NOTE: Whole computers can be made into MPP systems, but this could be seen as parallel processing in a cluster; consider, for example, HP's Scalable Computing Architecture, which takes a collection of large computers, e.g., V-Class, and links them together with a high-speed interconnect, e.g., HyperFabric. Sophisticated message passing is used to synchronize events between elements in the cluster. One instance of HP-UX is running across all nodes.

    Six of the Top 10 (http://www.top500.org) fastest computers in the world are running as high-performance clusters.

    Hewlett Packard offers the HyperPlex product to provide high-performance parallel processing. HP has worked in conjunction with the company Platform Computing, utilizing their LSF suite of products designed specifically for long-running, compute-bound applications.

  • Load-Leveling Cluster: In this type of cluster, we are giving the user a view of a single computing resource insofar as submitting jobs to be executed is concerned. The key component here is the load-leveling software. This will provide a single interface to the user as well as distribute the workload over all nodes in the cluster. Each node is running its own instance of the operating system; in fact, in some installations, the operating systems can be heterogeneous. We hope to achieve high throughput because we have multiple machines to run our jobs, as well as high availability because the loss of a single node means that we can rerun a particular job on another node. This will require all nodes to have access to all data and applications necessary to run a job. IBM's LoadLeveler software is one example. (A minimal sketch of this dispatch-and-rerun idea appears after this list.)

  • Database Cluster: Many database vendors now offer the facility to have a single database managed from several nodes, e.g., Oracle Parallel Server. The user simply sends a request to the database management processes, but in this case the processes can be running concurrently on multiple nodes. Consistency is maintained by sophisticated message passing between the database management processes themselves. Each node is running its own instance of the operating system. All nodes in the cluster are connected to the disks holding the database and will have to provide either a cluster-wide filesystem or simple raw disk access managed by the database processes themselves.

  • Web Server Cluster: In this solution, we have a "front end" Web server that receives all incoming requests from the intranet/Internet. Instead of servicing these thousands of requests itself, this "dispatcher" will send individual requests to a "farm" of actual Web servers, each of which constructs a response and sends it back to the originator directly. In effect, the "dispatcher" is acting as a "load balancer." Some solutions simply send requests to backend servers on a round-robin basis. Others are more "content aware" and send requests based on the likely response times. (A minimal round-robin dispatcher is sketched after this list.)

    Local Director from Cisco Systems, ACEdirector from Alteon, and the HP e-Commerce Traffic Director Server Appliance SA8220 are examples of Web-cluster solutions. One thing to be aware of is the redundancy in these solutions; Local Director, for example, is a hardware-based solution, and that device could now be your Single Point Of Failure. While these solutions achieve higher performance in responding to individual requests, we may need to investigate whether an individual solution can also support improved availability through redundancy of devices.

  • Storage Clusters: A common problem for systems is being limited in the number of devices they can attach to. Interfaces like PCI are common these days, but to have a server with 10+ interfaces is not so common because the individual costs of such servers are normally high. A solution would be to have a collection of small "blade" servers all connected to a common pool of storage devices. Today, we would call this pool of storage devices a SAN, a storage area network. The use of the acronym SAN predates what we would now call a SAN. Back in the late 1980s and early 1990s, Tandem (as it was known then) developed a product known as ServerNet, which was based on the concept known as a System Area Network (SAN). Here, we have inter-device links as well as inter-host links. All nodes can, if necessary, see all devices. Centralized storage management has benefits for ease of management, for example, performing a backup to a tape device. Because we have inter-device communication, we can simply instruct a "disk" to send IO to a "tape" without going through the system/memory bus of the individual host. Today, we would call this a "serverless backup." We gain higher throughput in this instance, although we need to be careful that the communications medium used has the bandwidth and low latency needed to respond to multiple requests from multiple hosts. High availability is not necessarily achieved because individual devices, e.g., disks and tapes, are still Single Points Of Failure.

  • High Availability Clusters: In these clusters, we are providing a computing resource that is able to sustain failures that would normally render an individual machine unusable. Here, we are looking at the types of outages we normally experience and trying to alleviate the possibility of such an outage affecting our ability to continue processing. Outages come in two main forms: planned and unplanned. Planned outages include software and hardware upgrades, maintenance, or regulatory conditions that effectively mean you know in advance that you are going to have to stop processing. If we know about them in advance, we can send out for some pizza, because the systems will be unavailable. Unplanned outages are the difficult ones: power failures, failed hardware, software bugs, human error, natural disasters, and terrorism, to name only a few. We don't know in advance when they are going to happen, but Murphy's Law says, "If it can go wrong, it will go wrong." With a High Availability Cluster, we are trying, in essence, to alleviate the impact of unplanned outages. Because the focus of these clusters is on high availability, it is unlikely that we will concentrate on high performance; however, some would argue that, having greatly improved availability, we will see an overall increase in throughput simply by virtue of the fact that users can use the systems for longer. Hewlett Packard's Serviceguard, IBM's HACMP, and Veritas Cluster Services are all examples of High Availability Clusters. (A minimal heartbeat-and-failover sketch appears after this list.)

  • Single System Image (SSI): An SSI is a conceptual notion of providing, at some level, a single instance of "a thing." All of the designs we have listed above can be viewed as SSIs, but it depends on what level of abstraction you view them from. A simple load-leveling batch scheduler can be seen as an SSI from the perspective of a user submitting batch jobs. A Web cluster is an SSI because the user at home has no concept of who, what, or how many devices are capable of responding to his query; he gets a single response. Database users submit their queries and their screens are updated accordingly, regardless of which node performed their query. Nuclear scientists at Los Alamos National Laboratory perform their massive nuclear weapon simulations unaware of which nodes are processing their individual calculations. As we can see here, these SSIs appear to operate at different levels. This is true for all SSIs. There are three main levels at which an SSI can exist: application, operating system (kernel), and hardware. An individual SSI can support multiple levels, each building on the other. Hewlett Packard's (formerly Compaq's, formerly DEC's) TruCluster technology is an example. We have a cluster at the operating system and application levels, providing the cluster administrator with a "single image" of the operating system files held on a central machine. Installing and updating operating system software happens only once, on the SSI files. The individual systems themselves are independent nodes running effectively their own instance of the operating system.

    The SSI also has a concept known as a "boundary." In the case of TruCluster, the boundary would be at the operating system level. Anything performed outside the boundary obliterates the idea of a unified computing resource; for example, in the case of TruCluster, performing hardware upgrades on an individual node exposes the "individuality" of single nodes in the cluster.
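
Returning to the Load-Leveling Cluster described above, the following is a minimal sketch of the dispatch-and-rerun idea: submit a job to the least-loaded node, and rerun it elsewhere if that node fails. The node names, load figures, and the run_job() placeholder are all hypothetical and are not the behaviour of LoadLeveler, LSF, or any other product.

    # A minimal load-leveling sketch: jobs go to the least-loaded node and are
    # rerun on another node if the first one fails. Everything here is a
    # hypothetical placeholder.

    load_average = {"nodeA": 0.8, "nodeB": 2.5, "nodeC": 1.1}   # assumed current loads

    def pick_node(exclude=()):
        """Choose the least-loaded node, optionally skipping failed ones."""
        candidates = {n: l for n, l in load_average.items() if n not in exclude}
        return min(candidates, key=candidates.get)

    def run_job(job, node):
        """Placeholder for actually executing the job on the chosen node."""
        print(f"Running {job} on {node}")
        return node != "nodeA"          # pretend nodeA fails while running the job

    def submit(job):
        node = pick_node()
        if not run_job(job, node):      # node failed: rerun the job elsewhere
            run_job(job, pick_node(exclude={node}))

    submit("payroll_batch")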
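
The Web Server Cluster "dispatcher" can be sketched just as briefly. The version below uses plain round-robin distribution; the backend names are hypothetical, and a "content-aware" dispatcher would instead pick the backend it expects to respond fastest.

    # A minimal round-robin dispatcher sketch; backend names are hypothetical.
    from itertools import cycle

    backends = cycle(["web01", "web02", "web03"])   # assumed farm of Web servers

    def dispatch(request):
        """Hand the incoming request to the next backend in turn."""
        backend = next(backends)
        print(f"{request} -> {backend}")
        return backend

    for req in ["GET /", "GET /catalog", "GET /basket", "GET /checkout"]:
        dispatch(req)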
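
Finally, the heartbeat-and-failover behaviour at the core of a High Availability Cluster can be sketched as follows. This is not Serviceguard's (or HACMP's) actual protocol; the node name, the timeout value, and the start_package() action are hypothetical stand-ins for what the real cluster software does.

    # A minimal heartbeat/failover sketch: if the peer node stops sending
    # heartbeats, take over its application package locally.
    import time

    HEARTBEAT_TIMEOUT = 10.0            # assumed: seconds of silence before declaring the peer dead
    last_heartbeat = {"node1": time.time()}

    def heartbeat_received(node):
        """Record that a heartbeat packet arrived from the peer node."""
        last_heartbeat[node] = time.time()

    def start_package(name):
        """Placeholder for starting the application package on this node."""
        print(f"Starting package '{name}' on this node (failover).")

    def monitor(peer, package):
        """Fail the package over if the peer has gone silent for too long."""
        if time.time() - last_heartbeat[peer] > HEARTBEAT_TIMEOUT:
            start_package(package)

    # Simulate a peer whose last heartbeat was 30 seconds ago.
    last_heartbeat["node1"] = time.time() - 30
    monitor("node1", "oracle_pkg")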


