17.4 Cluster Partition


When communication failures occur, it is the connection manager's job to make sure that only one cluster is formed (or maintained). If a cluster were to "partition" into more than one active cluster, members in separate partitions could write to the same shared storage without coordinating with each other, and data corruption could result.

To prevent this situation, the connection manager detects the problem and reconfigures the cluster to remove any member that lacks complete communications connectivity to the other cluster members. Any member that cannot fully communicate with the other cluster members is prevented from joining the cluster and, in many cases, will panic.

Below are the possible panic strings that could be seen as a result of a cluster partition:

  • "CNX MGR: partition action"

  • "CNX MGR: this node removed from cluster"

  • "CNX MGR: phase1 form: cluster already formed"

  • "CNX MGR: restart requested to resynchronize with cluster with quorum"

  • "CNX MGR: rcnx_status: restart requested to resynchronize with cluster with quorum"

  • "CNX QDISK: configuration error. Qdisk in use by cluster of different name."

  • "CNX QDISK: configuration error. Qdisk written by cluster of different name."

  • "CNX QDISK: Yielding to foreign owner without quorum."

  • "CNX QDISK: Yielding to foreign owner with provisional quorum."

  • "CNX QDISK: Yielding to foreign owner with quorum."
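When reviewing a crash, it can help to check saved console or crash output against the strings above. The following is a minimal sketch in Python, not a TruCluster tool: the function name `find_partition_panics` and the sample log line are made up for illustration, and only the panic strings themselves come from the list above.

```python
from typing import List

# Panic strings copied from the list above.
PARTITION_PANIC_STRINGS = (
    "CNX MGR: partition action",
    "CNX MGR: this node removed from cluster",
    "CNX MGR: phase1 form: cluster already formed",
    "CNX MGR: restart requested to resynchronize with cluster with quorum",
    "CNX MGR: rcnx_status: restart requested to resynchronize with cluster with quorum",
    "CNX QDISK: configuration error. Qdisk in use by cluster of different name.",
    "CNX QDISK: configuration error. Qdisk written by cluster of different name.",
    "CNX QDISK: Yielding to foreign owner without quorum.",
    "CNX QDISK: Yielding to foreign owner with provisional quorum.",
    "CNX QDISK: Yielding to foreign owner with quorum.",
)


def find_partition_panics(log_text: str) -> List[str]:
    """Return every known partition panic string that occurs in log_text."""
    return [s for s in PARTITION_PANIC_STRINGS if s in log_text]


# Hypothetical log line; the real console format may differ.
sample = "panic (cpu 0): CNX QDISK: Yielding to foreign owner with quorum."
matches = find_partition_panics(sample)
print(matches)
```

A plain substring match is enough here because none of the ten strings is contained in another.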

For example, a cluster partition might occur if the cable for the CI breaks in a two-member cluster that has a quorum disk configured, as shown in Figure 17-4.

Figure 17-4: Cluster Partition

Since each member would be unable to communicate with the other via the CI, and since each can gather enough votes to attain quorum, a partition is possible. Member1 has a cluster expected votes (CEV) value of 3, which means 2 votes are needed to attain quorum. Member1 can count 2 votes (1 for itself and 1 because it can claim the quorum disk), and the same holds true for member2. If both members reached quorum independently, two active clusters would exist at once. Fortunately, the CNX is ready for this type of scenario: one member will panic, probably with one of the last two panic strings listed above, and the member that has claimed the quorum disk will continue to function.
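The vote arithmetic in this example can be checked with a short sketch. It assumes the quorum formula usually given for TruCluster, quorum votes = round_down((CEV + 2) / 2), which agrees with the figure of 2 votes for a CEV of 3 used above. The function names are illustrative only and are not part of any TruCluster interface.

```python
def quorum_votes(cluster_expected_votes: int) -> int:
    """Votes needed for quorum, assuming round_down((CEV + 2) / 2)."""
    return (cluster_expected_votes + 2) // 2


def has_quorum(current_votes: int, cluster_expected_votes: int) -> bool:
    """True if the votes a member can count meet or exceed quorum."""
    return current_votes >= quorum_votes(cluster_expected_votes)


cev = 3                   # two members (1 vote each) plus a quorum disk (1 vote)
needed = quorum_votes(cev)

# With the CI cable broken, each member counts its own vote plus the
# quorum disk vote it believes it can claim.
member1_votes = 1 + 1
member2_votes = 1 + 1

print("votes needed for quorum:", needed)
print("member1 reaches quorum:", has_quorum(member1_votes, cev))
print("member2 reaches quorum:", has_quorum(member2_votes, cev))
```

Both checks succeed, which is exactly the condition under which a partition could form. Only one member can actually claim the quorum disk, so the other member loses its second vote and panics.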




TruCluster Server Handbook (HP Technologies)
ISBN: 1555582591
Year: 2005
Pages: 273
