The Serviceguard name describes a set of high-availability and disaster-tolerant solutions. These are all based on the Serviceguard HA product but provide different features depending on the goals of the solution. We will provide a very brief overview of why you might need a high-availability solution, then cover the products.

NOTE: This section only provides an overview of the Serviceguard suite. More details, and some usage examples, are provided in Chapter 16, "Serviceguard."

High Availability and Disaster Tolerance

In this section we provide a brief overview of what high availability is, how it differs from disaster tolerance, and why these solutions are so important.

Why Are High Availability and Disaster Tolerance Important?

There are many sources of information about how costly a major service outage can be. The cost of a failure varies based on the industry and the nature of the application. A large number of factors affect the true cost of a failure. These include:
You probably already know which applications in your environment would cause the company serious financial problems if they failed. The type of high availability or disaster tolerance you choose to implement will often depend on the cost of implementing the failover technology compared to the likely cost of a failure. We will describe the various options you have so you can make an educated decision.

High Availability vs. Disaster Tolerance

High availability typically requires providing redundancy within a datacenter to maintain service when there are hardware or software failures. This can also help minimize the damage done by human errors, which account for about 40% of all application failures. Service can normally be restored in only a few minutes. Disaster tolerance involves providing redundancy between separate datacenters so that service can be restored quickly in the event of a major disaster, which might include a fire, a flood, an earthquake, or terrorism. Service can typically be restored in anywhere from tens of minutes to a few hours. There is a third solution, sometimes called disaster recovery. This typically involves sending staff to a separate facility with backup tapes. The disaster recovery facility might have spare equipment, or it may have systems that are used for lower-priority work so they can be repurposed in the event of a disaster. Service recovery using this method can take days to weeks, depending on how similar the spare systems are to the original production systems.

Components of High-Availability Technology

Many components are necessary to provide a highly available infrastructure. Some are hardware, some are software, some are architecture, and some are processes. Some examples of the components include:
Now let's take a look at the anatomy of a Serviceguard cluster.

HP High-Availability Solutions

Installing high-availability software on a system is not sufficient to achieve high availability. It is also important that the hardware be configured to tolerate a failure. This involves setting up redundant paths to all your mission-critical data and applications.

Serviceguard Concepts

A number of critical concepts will help you understand the rest of this section. These include:
Hardware Architecture

High availability generally involves redundancy. The more redundant components you have, the higher the availability you will be able to achieve. Clearly, there is a point of diminishing returns. Figure 13-16 shows the architecture of a reasonable middle ground.

Figure 13-16. An Example of the Architecture of High-Availability Hardware

In this example, each system has dual network connections for data and a separate heartbeat LAN that also provides connectivity to the quorum server. In addition, each system has dual Fibre Channel cards, each of which is connected to a separate Fibre Channel switch, each of which is in turn connected to one or more storage devices. The key is that no single failure will cause a failover. A failover would require the failure of two matched components on the same system. Even then, since all the nodes in the cluster are connected to the same storage and networks, any of the other nodes can take over for the failed system.

Software Architecture

Now let's talk about the Serviceguard product. The key features of the Serviceguard cluster product include:
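To make the hardware layout in Figure 13-16 more concrete, the following is a minimal sketch of the kind of cluster configuration file that describes it. The cluster name, node names, LAN interfaces, addresses, and quorum-server host are hypothetical, and the parameter values are illustrative only:

    CLUSTER_NAME            prod_cluster        # hypothetical cluster name
    QS_HOST                 qs-host             # quorum server reachable over the heartbeat LAN
    QS_POLLING_INTERVAL     300000000           # microseconds; illustrative value

    NODE_NAME node1
      NETWORK_INTERFACE lan0
        STATIONARY_IP   192.168.10.1            # data LAN
      NETWORK_INTERFACE lan1
        HEARTBEAT_IP    192.168.20.1            # dedicated heartbeat LAN

    NODE_NAME node2
      NETWORK_INTERFACE lan0
        STATIONARY_IP   192.168.10.2
      NETWORK_INTERFACE lan1
        HEARTBEAT_IP    192.168.20.2

    HEARTBEAT_INTERVAL      1000000             # microseconds; illustrative value
    NODE_TIMEOUT            2000000             # microseconds; illustrative value

Such a file is normally generated with cmquerycl and then verified and applied with cmcheckconf and cmapplyconf, rather than written from scratch.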
We will discuss some of these features in more detail later when we describe the hardware and software architectures of a Serviceguard cluster. One other thing to consider when designing your cluster is how you want the applications to behave after they have failed over, particularly whether you want idle hardware available to accept the workload without any performance impact. There are three primary cluster types:
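How a given application behaves after a failure is largely controlled by its package definition. The following is a minimal sketch of a legacy-style failover package; the package, node, script, and service names are hypothetical, and only a few of the available parameters are shown:

    PACKAGE_NAME        pkg_db                  # hypothetical package name
    PACKAGE_TYPE        FAILOVER

    NODE_NAME           node1                   # preferred node, listed first
    NODE_NAME           node2                   # adoptive node

    AUTO_RUN            YES                     # start the package when the cluster forms
    FAILOVER_POLICY     CONFIGURED_NODE         # fail over in the order the nodes are listed
    FAILBACK_POLICY     MANUAL                  # do not move back automatically after repair

    RUN_SCRIPT          /etc/cmcluster/pkg_db/pkg_db.cntl
    HALT_SCRIPT         /etc/cmcluster/pkg_db/pkg_db.cntl

    SERVICE_NAME        db_monitor              # process Serviceguard monitors for this package
    SUBNET              192.168.10.0            # subnet the package IP address belongs to

In an active/standby design, the adoptive node sits idle until a failover occurs; in busier designs the same parameters apply, but a failed-over package competes with whatever workload is already running on the adoptive node.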
Protecting Against Split Brain

One serious concern in a clustered environment is ensuring that applications that can run on multiple systems don't start up on more than one system at a time. This is a phenomenon called "split brain," in which a cluster breaks up into multiple smaller clusters, each of which thinks the rest of the cluster has failed. Figure 13-17 shows how this can occur.

Figure 13-17. How a Split-Brain Cluster Could Corrupt Data

This picture shows that the network connectivity to two of the nodes in the cluster is lost. However, there is no loss of connectivity to the storage. Therefore, node C thinks that node A has failed and starts the failover process. This includes connecting to the disks for A and starting the package. If this were to occur, both of these nodes would be running the application and both would be connected to the same storage. This could cause data corruption. The Serviceguard product goes to great lengths to ensure that this can never happen. The first thing Serviceguard does is attempt to re-form the cluster whenever it detects a failure. This must happen before any packages are failed over. The re-formation will fail unless the new cluster contains at least half of the systems that were in the original cluster. Because no two portions of the cluster can each contain more than half of the original nodes, Serviceguard can never re-form more than one cluster. This resolves the split-brain problem for all but one special case: when the cluster splits into two equal halves. Serviceguard uses several mechanisms as tie-breakers to avoid this situation. The first is a cluster lock disk on HP-UX and a cluster lock LUN on Linux; these are used for smaller clusters. For larger clusters, a quorum server is used instead. In all cases the disk, LUN, or quorum server is connected to all of the nodes in the cluster. If the cluster re-forms with exactly half of the nodes from the original cluster, it will attempt to acquire the cluster lock. If it succeeds, it will re-form the cluster and start any packages that were on the nodes that are no longer in the cluster. If it is unable to get the lock, it shuts down any packages and disconnects them from the shared storage. This ensures that no two nodes are ever connected to the same shared storage at the same time.

Serviceguard Manager

Serviceguard also has a management utility called Serviceguard Manager. This is a Java graphical user interface for managing your Serviceguard clusters. Figure 13-18 shows some screenshots of Serviceguard Manager.

Figure 13-18. Screenshots of Serviceguard Manager

In this figure you can see that the left-hand pane shows a list of clusters and the right-hand pane provides status details for whichever cluster you have selected on the left. The details pane shows the cluster, the nodes in the cluster, and the packages running on each node. You can also see the status of each node and each package. Mini-icons appear next to a package whenever something of interest is happening with it, such as package startup or shutdown, or the lack of an available failover node for the package. From the GUI you can perform virtually any operation you might want on a cluster. Some examples include:
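The same day-to-day operations are also available from the command line on any cluster node. A brief sketch, assuming a hypothetical package named pkg_db and nodes named node1 and node2:

    cmviewcl -v                    # show cluster, node, and package status
    cmhaltpkg pkg_db               # halt the package
    cmrunpkg -n node2 pkg_db       # start the package on a specific node
    cmmodpkg -e pkg_db             # re-enable package switching after a manual move
    cmhaltnode -f node1            # halt a node, moving its packages elsewhere
    cmrunnode node1                # return the node to the cluster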
Serviceguard Manager works by connecting to a daemon called the object manager, which is configured to monitor one or more clusters. You connect to the object manager by logging in from Serviceguard Manager. You can control who is allowed to log in to the object manager by editing the /etc/cmclnodelist file on the node running the object manager. As you saw earlier in this chapter, this GUI has been integrated with the new VSE management suite. In the first release of the Virtualization Manager, the integration provides a context-sensitive launch of the current Serviceguard Manager product. This integration will become tighter in future releases.

HP Disaster-Tolerant Solutions

High-availability clusters are intended to provide nearly immediate recovery from a single point of failure. This is achieved through redundant hardware and Serviceguard software that recover from component or node failures, and such clusters are typically implemented in a single datacenter. For truly mission-critical applications, where any sustained outage poses a significant risk to the business, it is important to guard against multiple points of failure as well. Disaster-tolerant clusters are capable of restoring service even after multiple failures or a massive single failure in the primary datacenter. These solutions replicate the application data and provide the ability to move the application to an entirely different datacenter in a different part of the building, a different part of the city, or another city. The distance between the datacenters depends on the types of disasters you are trying to guard against and the technology used to replicate the data between the datacenters. We will discuss three types of disaster-tolerant clusters in this section:
Extended Distance Cluster

An Extended Distance Cluster, sometimes called an Extended Campus Cluster, runs a single cluster across multiple datacenters with high-speed networking between them. The distance between the datacenters depends on the technology used for data replication. Table 13-2 lists the technologies and the distances they support.
It is possible to set up a two-datacenter solution or a three-datacenter solution. The key difference between them is that the two-datacenter solution is implemented with dual cluster lock disks, while the three-datacenter solution uses a quorum server in the third datacenter.

Two-Datacenter Extended Distance Cluster

Because the two-datacenter solution requires the use of dual cluster lock disks, the size of the cluster is limited to a maximum of four nodes. In addition, the cluster must be split evenly between the sites; you can have a two-node cluster or a four-node cluster. Figure 13-19 shows the layout of a two-site cluster.

Figure 13-19. A Two-Site Extended Distance Serviceguard Cluster

You must have at least two network paths between the datacenters, and three different paths are recommended, to ensure continuous access to the two cluster lock disks should there be a network failure. Application data is mirrored between the two primary datacenters, and you must ensure that the mirrored copies reside in different datacenters.

Three-Datacenter Extended Distance Cluster

The three-datacenter solution is architecturally very similar to the two-datacenter solution. The third site serves as a tie-breaker in case connectivity is lost between the other two sites. Figure 13-20 shows the layout of a three-site cluster.

Figure 13-20. A Three-Site Extended Distance Serviceguard Cluster

The third site can run either two nodes that are part of the cluster, called arbitrator nodes, or a quorum server. The two arbitrator nodes are cluster members but cannot share the disks in either of the primary datacenters. They either run no packages at all, or they run packages that fail over locally but not to either of the other datacenters. Alternatively, the third site can run a quorum server. This brings two advantages: more usable nodes and lower overhead. Because arbitrator nodes are cluster members that cannot run any packages, using them limits the cluster to a maximum of 14 package-running nodes, or seven in each primary datacenter; a quorum server is not a cluster member, so it does not consume any of those node slots. In addition, the overhead of the quorum server is quite low, so it can run on a small Linux server or on a server that is running other workloads.

Metrocluster

The main difference between an Extended Distance Cluster and a Metrocluster is that the data replication between the two primary sites is handled by an EMC Symmetrix, HP XP, or EVA disk array. Figure 13-21 provides an example architecture for a Metrocluster.

Figure 13-21. An Example Three-Site Serviceguard Metrocluster

Notice in this figure that the CA/SRDF link is the primary difference between this architecture and the Extended Distance Cluster. There are two Metrocluster products: Metrocluster/CA for HP disk arrays and Metrocluster/SRDF for EMC. The "CA" in "Metrocluster/CA" stands for Continuous Access, the data-replication product available with the XP and EVA disk arrays from HP. The "SRDF" in "Metrocluster/SRDF" stands for Symmetrix Remote Data Facility, the data-replication product available from EMC for the EMC Symmetrix disk arrays. The Metrocluster/CA product is a set of scripts and utilities that simplifies the integration of the CA functionality with the Extended Distance Cluster. The Metrocluster/SRDF product is a similar set of utilities designed to assist in integrating the EMC SRDF product. For more information on how to set up either of these products, see the "Designing Disaster Tolerant High Availability Clusters" document available on http://docs.hp.com.
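Returning to the two Extended Distance Cluster tie-breaking approaches described above, the choice shows up directly in the cluster configuration file. The fragments below are a hedged sketch; the volume group, device, node, and host names are hypothetical, and the values are illustrative only:

    # Two-datacenter layout: one cluster lock disk in each datacenter
    FIRST_CLUSTER_LOCK_VG     /dev/vglock1
    SECOND_CLUSTER_LOCK_VG    /dev/vglock2
    NODE_NAME dc1node1
      FIRST_CLUSTER_LOCK_PV   /dev/dsk/c4t0d0   # lock disk in datacenter 1
      SECOND_CLUSTER_LOCK_PV  /dev/dsk/c6t0d0   # lock disk in datacenter 2
    # ...remaining node stanzas repeat the same pattern...

    # Three-datacenter layout: quorum server at the third site instead of lock disks
    QS_HOST                   qs.site3.example.com
    QS_POLLING_INTERVAL       300000000         # microseconds; illustrative value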
Continentalclusters

Datacenters that are more than 100 kilometers apart require a very different architecture. The primary difference between an Extended Distance Cluster and a Continentalclusters configuration is that with the Extended Distance Cluster you have a single cluster with nodes in multiple datacenters, while with Continentalclusters there are two separate clusters. The solution allows one cluster to take over operation of the critical packages from the other cluster in the event of a disaster that takes down an entire cluster. The Continentalclusters product is a set of utilities that monitor geographically remote clusters, plus a command to start the critical packages on the recovery cluster if the primary cluster is lost. An example of the architecture of Continentalclusters is shown in Figure 13-22.

Figure 13-22. An Example Serviceguard Continental Cluster

As you can see in Figure 13-22, there are two distinct clusters running in separate datacenters connected by a wide-area network (WAN). Although the figure shows an active-standby configuration, it is not necessary for the recovery cluster to be idle. A supported configuration is one in which both clusters run packages under normal circumstances and each cluster monitors the health of the other. Continentalclusters uses physical or logical data replication with disk arrays, just as Metrocluster does. Because of the higher likelihood that a spurious network error will cause a false alarm, Continentalclusters does not automate the failover. However, the product does provide a single command to initiate the failover manually. The process of failover would include:
One thing to consider with Continentalclusters is that because the two clusters are physically distinct and managed separately, you will need administration processes in place to keep the versions of the applications on the primary and recovery clusters in sync, so that there are no surprises when a recovery is required. This is true of all high-availability clusters, but it may be more of a challenge in this case because of the distance between the datacenters that Continentalclusters supports.
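As a rough illustration of the manual recovery flow just described, the sketch below assumes the Continentalclusters status and recovery commands (cmviewconcl and cmrecovercl); the exact steps depend on your configuration, and the recovery command should only be run after you have confirmed that the primary cluster is truly down:

    cmviewconcl                  # check the status of the primary and recovery clusters
    # ...confirm with operations staff that the primary datacenter is actually lost...
    cmrecovercl                  # start the recovery packages on the recovery cluster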