Chapter 22: Cluster Maintenance and Recovery


Overview

Clusters are great when things are working well. They are great even when things aren't working so well, since high availability and robustness are among a cluster's main benefits. Still, we need to be prepared for some bad situations. For example, what do you do if you lose an entire member boot disk (which affects only a single member), or if you lose the cluster-wide cluster_root, cluster_usr, and cluster_var file systems? What if something happens to your data file systems?

These types of problems should be extremely rare if you've followed our advice and the advice given in the TruCluster Server documentation, and built your cluster with no single point of failure. But sometimes bad things still happen. For example, an errant "rm *" in the wrong place can cause extensive damage that may only be repaired by restoring the affected file systems. We'll tackle some of these problems and show you how to work your way out of a few tight situations. In addition, we'll show how to change some of the characteristics of your cluster, such as the IP address and the cluster interconnect.

We will cover the following:

  • Backup and Restore of Critical Cluster File Systems (Section 22.1)

  • Replacing HBA and/or HSx Controllers (Section 22.2)

  • Installing Customer Specific Patches (Section 22.3)

  • Multi-Path Storage (Section 22.4)

  • I/O Barriers and the cleanPR Command (Section 22.5)

  • How to Replace a Failed Quorum Disk (Section 22.6)

  • Migrating from MC to LAN Cluster Interconnect (and vice versa) (Section 22.7)

  • Name and Address Changes (Section 22.8)

  • References (Section 22.9)




TruCluster Server Handbook (HP Technologies)
ISBN: 1555582591
Year: 2005
Pages: 273
