27.8 Discussing the Process of Rolling Upgrades within a Cluster

     

When we talk about rolling upgrades, we mean upgrading hardware components, software components, and applications, loading patches, and upgrading the version of Serviceguard itself. The "rolling" part of the name reflects the fact that we are trying to minimize downtime for our applications: we roll the applications onto other nodes in the cluster while the original node is being upgraded. There are no Serviceguard-specific commands to perform upgrades; the process of upgrading software is normally controlled via swinstall. What we need to ensure is that we have adequate capacity within our cluster to run all current packages while the upgrade is in progress; some nodes may have to run multiple packages during this time. A simple checklist for carrying out an upgrade would look something like this:

  • Move any package(s) off the node to be upgraded (cmhaltpkg, cmrunpkg).

  • Halt cluster services on the node to be upgraded (cmhaltnode).

  • Ensure that the node does not rejoin the cluster after a reboot (AUTOSTART_CMCLD=0).

  • Upgrade the node.

  • Apply any relevant patches.

  • Rejoin the cluster (cmrunnode).

  • Ensure that the node rejoins the cluster after a reboot (AUTOSTART_CMCLD=1), if applicable.

  • Repeat for all nodes in the cluster.
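As a rough illustration, the checklist can be written as a small shell function that prints the steps for one node in order. This is only a sketch: nothing is executed, the function name print_upgrade_plan and the node and package names in the usage note are hypothetical, and the steps that have no single command (setting AUTOSTART_CMCLD, the upgrade itself, patching) appear as comment lines for the operator.

```shell
#!/bin/sh
# print_upgrade_plan NODE ADOPTIVE_NODE PACKAGE...
# Prints, in checklist order, the steps for upgrading NODE.
# Nothing is executed; the output is a plan for an operator to follow.
# (The function name and its arguments are illustrative assumptions.)
print_upgrade_plan() {
    pl_node=$1
    pl_adoptive=$2
    shift 2

    # Move each package off the node to be upgraded.
    for pl_pkg in "$@"; do
        echo "cmhaltpkg $pl_pkg"
        echo "cmrunpkg -n $pl_adoptive $pl_pkg"
    done

    # Take the node out of the cluster and keep it out across reboots.
    echo "cmhaltnode $pl_node"
    echo "# set AUTOSTART_CMCLD=0 on $pl_node"

    # The upgrade itself is a manual step.
    echo "# upgrade $pl_node (for example with swinstall)"
    echo "# apply any relevant patches on $pl_node"

    # Bring the node back into the cluster.
    echo "cmrunnode $pl_node"
    echo "# set AUTOSTART_CMCLD=1 on $pl_node, if applicable"
}
```

Calling print_upgrade_plan node1 node2 pkgA would list the steps for moving pkgA to node2 and upgrading node1. The final checklist item, repeating the procedure for every node, corresponds to calling the function once per node.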

This last point is crucial because running a cluster with different versions of Serviceguard is not advisable. We must try to upgrade all nodes in the cluster as quickly as possible. While the cluster can run quite happily with mixed versions of Serviceguard, changes to the binary cluster configuration file are not allowed while it does so. Once we understand this major limitation, we will see why upgrades must be performed on all nodes in a timely fashion. Here is a list of limitations that apply to the cluster during an upgrade:

  • Cluster configuration files cannot be updated until all nodes are at the same version of the operating system.

    This is an absolute. If you try to make any modifications, you will receive errors from commands such as cmcheckconf and cmapplyconf.

  • All Serviceguard commands must be issued from the node with the latest version of Serviceguard.

    You may be in a cluster where the binary file was created on an older version of Serviceguard. A new node brought into the cluster may be on a newer version. While commands from a newer version can understand an old version of the binary cluster configuration file, the reverse is not necessarily the case.

  • Only two versions of Serviceguard can be running in the cluster during a rolling upgrade.

    It is too much to ask Serviceguard to understand and interoperate with more than two versions of the software at once.

  • Binary configuration files may be incompatible.

    Software that uses the binary cluster configuration file should execute the command /usr/sbin/convert after being installed. This converts an older binary configuration file to the new format. You should check with the software's installation instructions whether this is performed automatically or whether you have to perform the step manually.

  • Rolling upgrades can only be carried out on configurations that have not been modified since the last time the cluster was started.

  • Serviceguard cannot be removed from a node while a cluster is being upgraded.

  • Any new features of Serviceguard cannot be utilized until all nodes are running that version of software.

  • Keep kernels consistent.

    In some cases, it may be necessary to modify some kernel parameters in order for a node to run a particular application. This should be done only under the guidance of the operating system/application supplier.

  • Hardware configurations cannot be modified.

    Most hardware changes would require a node to be rebooted, which is something we try to avoid during an upgrade. A specific example where this would be a real problem is changing one of our LAN cards. The MAC address of a LAN card is compiled into the binary cluster configuration file. This is necessary because Serviceguard polls LAN cards at every NETWORK_POLLING_INTERVAL. If we were to change a LAN card, and hence the MAC address, we would need to recompile and redistribute the binary cluster configuration file, i.e., cmgetconf, cmcheckconf, and cmapplyconf. As stated earlier, changes to the binary cluster configuration file are not allowed during a rolling upgrade.
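
Two of the limitations above (no configuration changes until every node matches, and no more than two versions at once) lend themselves to a simple pre-flight check. The following is a minimal sketch, assuming you keep a hand-maintained inventory of "node version" pairs; the function name classify_versions and the inventory format are hypothetical and are not part of Serviceguard.

```shell
#!/bin/sh
# classify_versions: reads "node version" pairs on stdin (an assumed,
# hand-maintained inventory format) and prints one word describing the mix:
#   uniform      all nodes agree; configuration changes are possible
#   rolling      two versions; upgrade in progress, configuration is frozen
#   unsupported  more than two versions in the cluster
#   empty        no usable input
classify_versions() {
    # Count the distinct values in the second column.
    cv_count=$(awk 'NF >= 2 { print $2 }' | sort -u | wc -l | tr -d '[:space:]')
    case $cv_count in
        0) echo "empty" ;;
        1) echo "uniform" ;;
        2) echo "rolling" ;;
        *) echo "unsupported" ;;
    esac
}
```

Feeding the function one line per node, with whatever version string you record for each, yields "rolling" for as long as the upgrade is only partly complete, which is a reminder not to attempt cmcheckconf or cmapplyconf yet.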

Considering these limitations, we need to plan a rolling upgrade with great care. Most customers I know upgrade one node first and perform significant testing of the cluster and applications on that one node. The plan you construct for performing the upgrade should be extensively tested as well. It is crucial in your planning to work out a drop-dead time for the upgrade. This is a time by which, if the upgrade has not reached an important milestone, you must back out the upgrade and return the system to the state it was in before the upgrade started. If we can return the node to its original state, we can at least return the cluster to its original state and work out what went wrong during the upgrade process itself. If the important milestone has been reached by the drop-dead time, it is likely that you will have applications running on their original node in a timely fashion. If all goes well, you can schedule all the other nodes to be upgraded as soon as possible.
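
The drop-dead decision can be stated as a small rule. The sketch below is only an illustration of that rule; the function name upgrade_decision, its arguments, and the use of elapsed minutes as the unit are all assumptions made for the example.

```shell
#!/bin/sh
# upgrade_decision ELAPSED_MIN DROP_DEAD_MIN MILESTONE_REACHED
# ELAPSED_MIN       minutes since the upgrade window opened
# DROP_DEAD_MIN     minutes into the window at which the drop-dead time falls
# MILESTONE_REACHED "yes" once the agreed milestone has been reached
# Prints "continue", "keep-working", or "back-out".
# (Name, arguments, and units are illustrative assumptions.)
upgrade_decision() {
    if [ "$3" = "yes" ]; then
        echo "continue"          # milestone reached: carry on with the plan
    elif [ "$1" -lt "$2" ]; then
        echo "keep-working"      # milestone not reached, but time remains
    else
        echo "back-out"          # drop-dead time passed without the milestone
    fi
}
```

For example, upgrade_decision 190 180 no prints "back-out", because the milestone was still outstanding ten minutes after the drop-dead time.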



HP-UX CSE(c) Official Study Guide and Desk Reference
Year: 2006
Pages: 434
