2.6 How clusters work




Clusters work primarily through a mechanism called polling. Polling is similar to pinging: a message is sent to a target device and the result is examined. If a successful response is received, the poll succeeds; otherwise, the target is assumed to have a problem. All servers participating in the clustered configuration continually poll one another to determine whether each is working; if something is wrong, the failover mechanisms leap into action. Polling relies on the cluster interconnect system, the LAN, and the device controllers.
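
The polling idea can be sketched in a few lines of Python. This is purely illustrative rather than any vendor's implementation; the host and port arguments and the PING/ALIVE exchange are assumptions invented for the example.

    import socket

    def poll(host, port, timeout=2.0):
        """Probe a peer node and report whether it answered in time.

        Hypothetical wire protocol: the peer is assumed to answer
        b"PING" with b"ALIVE".
        """
        try:
            with socket.create_connection((host, port), timeout=timeout) as s:
                s.settimeout(timeout)
                s.sendall(b"PING")
                return s.recv(16) == b"ALIVE"
        except OSError:
            # Unreachable, refused, or timed out: the poll has failed.
            return False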

Figure 2.16 illustrates a two-node cluster configuration with a cluster interconnect, a LAN connection, and a shared SCSI bus. The cluster interconnect is used for polling to determine whether the other node is available, while the SCSI bus gives both nodes access to the shared disk resources. Once an initial signal of failure is detected over the interconnect, a second validation is performed over the LAN connection before the failover operation is attempted.

Figure 2.16: Cluster configuration with cluster interconnect and SCSI bus.

The interconnect device polls the other node in the cluster to see whether that node is available. If the node, or server, does not respond within the period defined by the heartbeat timeout parameter, the polling node tries to reach it through the LAN. Whether this LAN poll fails or succeeds indicates whether the problem lies with the target node itself or with the interconnect device.
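
The two-stage check just described might look like the following sketch, which reuses the poll() helper from the earlier example; the peer attributes and the timeout value are illustrative assumptions, not values from any particular cluster product.

    HEARTBEAT_TIMEOUT = 10.0  # seconds; the real value varies by platform

    def locate_problem(peer):
        """Distinguish a failed node from a failed interconnect."""
        # First attempt: poll over the dedicated cluster interconnect.
        if poll(peer.interconnect_addr, peer.port, timeout=HEARTBEAT_TIMEOUT):
            return "node reachable"
        # Second attempt: poll over the public LAN. Success here suggests
        # the interconnect, not the target node, is at fault.
        if poll(peer.lan_addr, peer.port, timeout=HEARTBEAT_TIMEOUT):
            return "interconnect suspected"
        return "node suspected"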

If the polling node determines through the interconnect and LAN polling mechanisms that the other node in the cluster is offline, it takes over that node's disk resources in order to keep them available to the network. When the disabled node comes back online, the substitute relinquishes the appropriated disk resources to their rightful owner as soon as the no-longer "missing" node tries to access them. This process is automatic and is part of the failover feature of a clustered configuration.

For a failover to occur, the following conditions must be met (a code sketch follows the list):

  • The interconnect poll must fail.

  • The LAN poll must fail.

  • The polling node must succeed in taking control of the missing cluster node's disk resources.
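
Put together, the three conditions amount to a short decision routine. In this hedged sketch, poll() is the helper from earlier and take_disk_resources() is a placeholder for whatever shared-storage reservation mechanism (for example, a SCSI reservation) the cluster software actually uses.

    def should_fail_over(peer):
        """Return True only when all three failover conditions hold."""
        if poll(peer.interconnect_addr, peer.port):  # interconnect poll OK
            return False
        if poll(peer.lan_addr, peer.port):           # LAN poll OK
            return False
        # Both polls failed; failover proceeds only if this node can
        # actually take control of the missing node's disk resources.
        return take_disk_resources(peer.disks)       # placeholder helper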

2.6.1 Cluster failover

The nodes in a clustered configuration communicate via the cluster interconnect, through a heartbeat mechanism. For example, in a two-node clustered configuration, node 1 checks the availability of node 2, and node 2 likewise checks the availability of node 1. Different clustered configurations use different heartbeat intervals to verify that the other nodes in the cluster are present. For example, Sun and Linux clusters default the heartbeat to 2 seconds, meaning that every 2 seconds the heartbeat mechanism verifies that the other nodes in the cluster are up and running.

When one of the nodes does not respond to another member of the cluster after several attempts (based on the heartbeat timeout value, which varies between operating systems), the cluster manager times out the heartbeat check against the unresponsive node and declares it unavailable. The heartbeat timeout value also varies among clustered configurations: it is 12 seconds on a Sun cluster, for example, and 10 seconds on a Linux cluster. Consequently, if the heartbeat mechanism does not receive a favorable response within the stipulated period, the node that did not respond is declared failed.
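
The interplay of the heartbeat interval and the heartbeat timeout can be expressed as a small monitoring loop. The sketch below uses the defaults quoted above (a 2-second interval and a 10-second, Linux-style timeout) together with the earlier poll() helper; declare_failed() is a placeholder for the cluster manager's action.

    import time

    HEARTBEAT_INTERVAL = 2.0   # Sun and Linux default: check every 2 s
    HEARTBEAT_TIMEOUT = 10.0   # 10 s on Linux clusters, 12 s on Sun

    def monitor(peer):
        last_seen = time.monotonic()
        while True:
            if poll(peer.interconnect_addr, peer.port):
                last_seen = time.monotonic()   # heartbeat answered
            elif time.monotonic() - last_seen > HEARTBEAT_TIMEOUT:
                declare_failed(peer)           # placeholder action
                return
            time.sleep(HEARTBEAT_INTERVAL)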

Once a failure has been ascertained, the cluster manager's next step is to reconfigure the remaining nodes into a new configuration. During this process, the remaining nodes drop the failed node from the node/member list, and the activities of checking, load balancing, processing, and so on continue among the reconfigured cluster group.
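
Conceptually, the reconfiguration step is just the formation of a new membership list without the failed node, as in this sketch; rebalance_workload() stands in for whatever cluster-specific redistribution actually takes place.

    def reconfigure(members, failed_node):
        """Form a new cluster incarnation that excludes the failed node."""
        survivors = [m for m in members if m is not failed_node]
        # Checking, load balancing, and processing continue among the
        # surviving members only.
        rebalance_workload(survivors)   # placeholder for real work
        return survivors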

After the reconfiguration process has completed, the cluster manager starts the recovery process. During recovery, all user and system activities are brought to a new state that accounts for the failed instance; this includes removing or completing any activities left unfinished by processes that were running on the failed node.

At this stage, any database present starts its own recovery process. A later chapter takes a more detailed look at how the cluster manager and the database (such as Oracle) coordinate this type of activity.


