Without further ado, let's show you how to perform a rolling upgrade, the stages of which are described in the following flow diagram (see Figure 25-1).
Figure 25-1: The Rolling Upgrade Flow
Preparing the cluster for the rolling upgrade includes several parts. Each of these parts must be completed before you can continue to the next stage of the rolling upgrade.
The "lead member" is the first member of the cluster to be rolled. Normally, it is good practice to select the cluster member that is serving the root (/), /usr, and /var file systems. The following provides an example of how to determine which cluster member is serving these file systems.
Let's find out which cluster member or system is serving /, /usr, and /var:
# cfsmgr / /usr /var

Domain or filesystem name = /
Server Name = molari
Server Status : OK

Domain or filesystem name = /usr
Server Name = molari
Server Status : OK

Domain or filesystem name = /var
Server Name = molari
Server Status : OK
Or, you can use our cfs script.
# cfs | grep -E "cluster_root|cluster_usr|cluster_var"
molari    /       cluster_root#root    AdvFS
molari    /usr    cluster_usr#usr      AdvFS
molari    /var    cluster_var#var      AdvFS
If one or more of these file systems (/, /usr, or /var) is being served by another cluster member, you can easily relocate this file system using the cfsmgr command.
# cfsmgr /var

Domain or filesystem name = /var
Server Name = sheridan
Server Status : OK

# cfsmgr -h sheridan -r -a SERVER=molari /var

# cfsmgr /var

Domain or filesystem name = /var
Server Name = molari
Server Status : OK
Now that we know which system is the server for these file systems, let's determine which memberid it is in the cluster.
# hostname
molari

# sysconfig -q generic memberid
generic:
memberid = 1
Or you can use the hwmgr(8) command:
# hwmgr -view cluster
Member ID    State    Member HostName
---------    -----    ---------------
    1         UP      molari (localhost)
    2         UP      sheridan
In our example, the system known as molari, which is also cluster memberid 1, will be the lead member for the purposes of the rolling upgrade.
In general, it is a good Systems Administration practice never to perform any type of major systems maintenance without first having a good and verifiable backup of your system.
Note: We have noticed that while most Systems Administrators remember to back up /, /usr, and /var, the majority forget about each cluster member's boot disk. Please, always remember to back up each cluster member's boot disk when you back up your cluster.
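As a rough sketch of what such a backup might look like (the tape device name and the two-member layout are assumptions for illustration; vdump(8) is the usual AdvFS backup tool), one loop can cover both the clusterwide file systems and each member's boot partition:

```shell
#!/bin/sh
# Illustrative sketch only: back up the clusterwide file systems AND each
# member's boot partition. The tape device and member list are assumptions
# for a two-member cluster; substitute your own. Remove the leading 'echo'
# to actually run vdump(8).
TAPE=/dev/tape/tape0_d1    # hypothetical tape device

for fs in / /usr /var \
          /cluster/members/member1/boot_partition \
          /cluster/members/member2/boot_partition
do
    echo vdump -0 -u -f "$TAPE" "$fs"
done
```

Drop the `echo` only once the device and member list match your own cluster.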
New TruCluster Server software often takes advantage of the latest features that can only be found in the latest systems firmware updates from HP. It is essential to have the latest and greatest systems firmware in place before actually operating your cluster on the newly upgraded TruCluster Server software.
When does the systems firmware need to be upgraded? At any time before actually running the new TruCluster Server software. To minimize downtime, the lead member's firmware should be upgraded early in the Install Stage, while the firmware on each of the other cluster members should be upgraded before completion of the Roll Stage (see section 25.5.2.5 for a recommendation as to when to upgrade the systems firmware for the other cluster members). For more information on installing systems firmware, please see Chapter 5 on Installing Tru64 UNIX.
Warning: A firmware upgrade of the KGPSA host bus adapter may alter the Fibre Channel (FC) node name.
During the testing required to produce this chapter, we came across an issue and a warning from HP Support that has to do with upgrading firmware on the KGPSA.
Upgrading the firmware on the KGPSA may change the FC node name. "This upgrade cannot be taken lightly as the change in the FC node name changes connection status on the HSG80s on the Tru64 UNIX platforms." If the storage system for the cluster "is using Selective Storage Presentation (setting entries into the Access_Enable table), then all these new connections will have to be renamed and the old connections deleted."[3]
The following is an example of these changes in the connections and how to correct them:
SHOW CONNECTION before the KGPSA firmware upgrade:
BL5-HSG1> SHOW CONNECTION

Name        Operating system   Controller   Port   Address   Status    Offset
MOLARI1B01  TRU64_UNIX         THIS         2      000008    OL this   00
            HOST_ID=1000-0000-C921-92C9   ADAPTER_ID=1000-0000-C921-92C9
MOLARI1A01  TRU64_UNIX         THIS         2      000002    OL this   00
            HOST_ID=1000-0000-C921-93B0   ADAPTER_ID=1000-0000-C921-93B0
SHOW CONNECTION after the firmware upgrade:
BL5-HSG1> SHOW CONNECTION

Name        Operating system   Controller   Port   Address   Status    Offset
!NEWCON02   WINNT              THIS         2      000008    OL this   00
            HOST_ID=2000-0000-C921-92C9   ADAPTER_ID=1000-0000-C921-92C9
!NEWCON03   WINNT              THIS         2      000002    OL this   00
            HOST_ID=2000-0000-C921-93B0   ADAPTER_ID=1000-0000-C921-93B0
MOLARI1B01  TRU64_UNIX         THIS         2      000008    Offline   00
            HOST_ID=1000-0000-C921-92C9   ADAPTER_ID=1000-0000-C921-92C9
MOLARI1A01  TRU64_UNIX         THIS         2      000002    Offline   00
            HOST_ID=1000-0000-C921-93B0   ADAPTER_ID=1000-0000-C921-93B0
Notice that the two new connections, !NEWCON02 and !NEWCON03, are the same KGPSA adapters that appear in MOLARI1B01 and MOLARI1A01, but the firmware upgrade has changed the Fibre Channel node name (the HOST_ID) on the connections. The system's old connections are now OFFLINE, and the system cannot access any of the units that are presented with Selective Storage Presentation[4]. In a cluster, and especially during a rolling upgrade, this is not a desirable situation, but it can easily be resolved.
SHOW CONNECTION after re-editing the connection entries:
BL5-HSG1> SHOW CONNECTION

Name        Operating system   Controller   Port   Address   Status    Offset
MOLARI1B01  TRU64_UNIX         THIS         2      000008    OL this   00
            HOST_ID=2000-0000-C921-92C9   ADAPTER_ID=1000-0000-C921-92C9
MOLARI1A01  TRU64_UNIX         THIS         2      000002    OL this   00
            HOST_ID=2000-0000-C921-93B0   ADAPTER_ID=1000-0000-C921-93B0
Notice that the old connections MOLARI1B01 and MOLARI1A01 were deleted. The new connection !NEWCON02 was renamed to MOLARI1B01, and !NEWCON03 was renamed to MOLARI1A01.
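We did not preserve the exact controller session, but the cleanup follows this general pattern on the HSG80 CLI. The command forms below are a sketch from the ACS CLI, not a captured transcript; please verify them against your ACS version's documentation before use:

```
BL5-HSG1> DELETE MOLARI1B01
BL5-HSG1> DELETE MOLARI1A01
BL5-HSG1> RENAME !NEWCON02 MOLARI1B01
BL5-HSG1> RENAME !NEWCON03 MOLARI1A01
BL5-HSG1> SET MOLARI1B01 OPERATING_SYSTEM=TRU64_UNIX
BL5-HSG1> SET MOLARI1A01 OPERATING_SYSTEM=TRU64_UNIX
```

Since the new connections came up with an operating system of WINNT, the two SET commands return them to TRU64_UNIX, matching the final SHOW CONNECTION output.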
One final word about firmware before we proceed. If any of the systems in your cluster has an EISA bus, and most EV5- and EV56-based Alpha systems did, you must run the EISA Configuration Utility (ECU). The ECU is third-party software that was usually shipped on diskette with your system hardware. If you do not have a current ECU diskette, please contact your HP support representative for a replacement copy before starting the rolling upgrade.
# clu_upgrade -v check setup 1
Retrieving cluster upgrade status
This verifies the following:
There is no rolling upgrade in progress.
All cluster members are running the same version of the operating system and cluster software.
There are no cluster members running on tagged files[5].
There is enough free disk space in which to perform a rolling upgrade of the cluster.
During the Setup Stage, the following subtasks are performed:
The rolling upgrade log is created and initialized. This file can be found in /cluster/admin/clu_upgrade.log.
The "clu_upgrade -v check setup" command is reissued.
A set of tagged files is created in preparation for the rolling upgrade. Tagged files are used so that the cluster can operate on two different versions of the operating system and cluster software at the same time. This is why there must be enough free disk space to perform a rolling upgrade.
For all cluster members except the lead member, the /etc/sysconfigtab attribute generic:rolls_ver_lookup is set to 1. This allows the cluster members that have not yet been "rolled" to use the tagged files to operate.
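You can confirm this attribute on a not-yet-rolled member with sysconfig(8); the output below is what we would expect given the description above, not a captured session:

```
# sysconfig -q generic rolls_ver_lookup
generic:
rolls_ver_lookup = 1
```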
Up to this point, the Setup Stage does the same tasks whether you plan to perform an Update Installation, an NHD Installation, or a Patch Kit Installation.
What's a tagged file and why should you care? As we stated in the previous subsection, tagged files are created in preparation for the rolling upgrade. Their primary purpose is to enable the cluster to operate on two different versions of the operating system and cluster software at once. Is it magic or just a clever trick? You be the judge!
First, a tagged file is usually created in the same directory as its original file, with .Old.. prepended to its name. Each tagged file also has an AdvFS property, named DEC_VERSION_TAG, set on it.

Next, if a cluster member's /etc/sysconfigtab generic:rolls_ver_lookup attribute is set to 1, pathname resolution checks whether a .Old.. copy of the requested file exists and whether that copy has the DEC_VERSION_TAG property set on it. If both of these conditions are met, any attempt to use the file is redirected to the .Old.. copy. See, it's magic!
Well, now that we have you thoroughly confused, let's try to simplify matters with an example. If you execute the command /usr/sbin/dump on a member that has not been rolled, then what actually gets executed is /usr/sbin/.Old..dump. Executing the same command on a member that has been rolled will execute the newly updated /usr/sbin/dump command.
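The renaming rule itself is simple enough to sketch in a few lines of shell; the helper name below is ours for illustration, not a TruCluster command:

```shell
#!/bin/sh
# Hypothetical helper: given a pathname, print the tagged-file name that a
# non-rolled member is redirected to, per the .Old.. naming described above.
tagged_name() {
    dir=`dirname "$1"`
    base=`basename "$1"`
    echo "${dir}/.Old..${base}"
}

tagged_name /usr/sbin/dump    # prints /usr/sbin/.Old..dump
```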
You may find that this feature of allowing two different versions of the operating system and cluster software to co-exist will come in handy particularly when it comes to testing and verifying that your user applications work on the new version of the software.
The Setup Stage for an Update Installation also copies the cluster kit from the mounted TruCluster software installation CD to /var/adm/update/TruClusterKit. This is done so that the cluster kit will be accessible during the Install Stage and the Roll Stage.
Please note that if your existing cluster is at TruCluster Server version 5.0A or 5.1, you will see slightly different output from the "clu_upgrade setup 1" command than you would on TruCluster Server version 5.1A.
The following is the sample output of the "clu_upgrade setup 1" command, taken from a TruCluster Server version 5.1 system:
# clu_upgrade setup 1
This is the cluster upgrade program.
You have indicated that you want to perform the 'setup' stage of the upgrade.
Do you want to continue to upgrade the cluster? [yes]: yes

What type of upgrade will be performed?

     1) Rolling upgrade using the installupdate command
     2) Rolling patch using the dupatch command
     3) Both a rolling upgrade and a rolling patch
     4) Exit cluster software upgrade

Enter your choice: 1

Enter the full pathname of the cluster kit mount point ['???']: /cdrom1/TruCluster

A cluster kit has been found in the following location:
        /cdrom1/TruCluster/kit/
This kit has the following version information:
        'Tru64 UNIX TruCluster(TM) Server Software V5.1A (Rev 1312)'
Is this the correct cluster kit for the update being performed? [yes]: yes

Checking inventory and available disk space.
Copying cluster kit '/cdrom1/TruCluster/kit/' to '/var/adm/update/TruClusterKit/'.
The next sample output is what you would see at the Setup Stage if you were performing an Update Installation on TruCluster Server version 5.1A:
# clu_upgrade setup 1
This is the cluster upgrade program.
You have indicated that you want to perform the 'setup' stage of the upgrade.
Do you want to continue to upgrade the cluster? [yes]: yes

What type of rolling upgrade will be performed?

 Selection   Type of Upgrade
----------------------------------------------------------------------
     1       An upgrade using the installupdate command
     2       A patch using the dupatch command
     3       A new hardware delivery using the nhd_install command
     4       All of the above
     5       None of the above
     6       Help
     7       Display all options again
----------------------------------------------------------------------
Enter your Choices (for example, 1 2 2-3): 1

You selected the following rolling upgrade options: 1
Is that correct? (y/n) [y]: y
Please note that as TruCluster Server version 5.1A is the latest version, we have yet to test the Update Installation to a later version of TruCluster Server.
As of this writing, not much has been documented on what is required for performing a rolling upgrade of a cluster for a New Hardware Delivery. Based on available information, when a rolling upgrade for an NHD Installation is performed, at the Setup Stage, the NHD installation kit is copied from its source media to /var/adm/update/NHDKit for accessibility during the Install Stage. For more information on the New Hardware Delivery kit, please see Compaq's Tru64 UNIX New Hardware Delivery Release Notes and Installation Instructions when this document becomes available.
Unlike the Setup Stage for either an Update Installation or an NHD Installation, the Setup Stage for a Patch Kit does not do anything additional in terms of copying files for greater accessibility.
Again, if your existing cluster is at TruCluster Server version 5.0A or 5.1, you will see slightly different output from the "clu_upgrade setup 1" command than you would on TruCluster Server version 5.1A.
The following sample output of "clu_upgrade setup 1" command comes from a TruCluster Server version 5.1 system:
# clu_upgrade setup 1
This is the cluster upgrade program.
You have indicated that you want to perform the 'setup' stage of the upgrade.
Do you want to continue to upgrade the cluster? [yes]: yes

What type of upgrade will be performed?

     1) Rolling upgrade using the installupdate command
     2) Rolling patch using the dupatch command
     3) Both a rolling upgrade and a rolling patch
     4) Exit cluster software upgrade

Enter your choice: 2
The following is what you would see at the Setup Stage if you were performing a Patch Installation on TruCluster Server version 5.1A:
# clu_upgrade setup 1
This is the cluster upgrade program.
You have indicated that you want to perform the 'setup' stage of the upgrade.
Do you want to continue to upgrade the cluster? [yes]: yes

What type of rolling upgrade will be performed?

 Selection   Type of Upgrade
----------------------------------------------------------------------
     1       An upgrade using the installupdate command
     2       A patch using the dupatch command
     3       A new hardware delivery using the nhd_install command
     4       All of the above
     5       None of the above
     6       Help
     7       Display all options again
----------------------------------------------------------------------
Enter your Choices (for example, 1 2 2-3): 2

You selected the following rolling upgrade options: 2
Is that correct? (y/n) [y]: y
The Setup Stage has been known to take a very long time, over two hours, on clusters with more than two members or on older AlphaServer models such as the AlphaServer 2100. The reason is that system files are being copied into tagged file sets: the more cluster members there are, the more system files need to be copied, and the longer the stage takes to complete. Please have patience during this stage of the process.
The following is sample output from the Setup Stage from a two-member cluster. You should receive output similar to this on completion of this stage for your cluster:
Backing up member-specific data for member: 1
.......
Creating tagged files.
.............
The cluster upgrade 'setup' stage has completed successfully.
Reboot all cluster members except member: '1'
The 'setup' stage of the upgrade has completed successfully.
At this point, all cluster members except the lead member must be rebooted. As soon as the other cluster members come up after the reboot, they will be running on the tagged files.
If the firmware on each of these other cluster members has not been upgraded, we highly recommend that you take this opportunity to upgrade each system's firmware.
The Preinstall Stage is executed on the lead member of the cluster only after the other cluster members have been rebooted at the end of the Setup Stage. The following subtasks are performed during this stage:
The cluster is confirmed to be ready to proceed with the upgrade by verifying that all other members are running on the tagged files and that the lead member is not.
An on-disk backup of the lead member's member-specific system files is made.
The tagged files are verified and matched against their inventory files.
The following is sample output from the Preinstall Stage for an Update Installation:
# clu_upgrade preinstall
This is the cluster upgrade program.
You have indicated that you want to perform the 'preinstall' stage of the upgrade.
Do you want to continue to upgrade the cluster? [yes]: yes

Checking tagged files.
......................................................
The cluster upgrade 'preinstall' stage has completed successfully.

On the lead member, perform the following steps before running the
installupdate command:

        # shutdown -h now
        >>> boot -fl s

When the system reaches single-user mode run the following commands:

        # init s
        # bcheckrc
        # update
        # kloadsrv
        # lmf reset

See the Tru64 UNIX Installation Guide for detailed information on using
the installupdate command.

The 'preinstall' stage of the upgrade has completed successfully.
This next sample output is from the Preinstall Stage for a Patch Kit installation:
# clu_upgrade preinstall
This is the cluster upgrade program.
You have indicated that you want to perform the 'preinstall' stage of the upgrade.
Do you want to continue to upgrade the cluster? [yes]: yes

Checking tagged files.
......................................................
The cluster upgrade 'preinstall' stage has completed successfully.
You can now run the dupatch command on the lead member.
Now that we have all the preliminaries out of the way, we are finally ready to get this show on the road. The Install Stage of the rolling upgrade is where we actually start upgrading the software. All of the previous stages have been in preparation for this one.
All the tasks of the Install Stage are executed on the lead member. The commands that can be executed for these tasks are installupdate(8), dupatch(8), and/or nhd_install(8). See Table 25-3 for the combinations of tasks that can be performed.
Table 25-3: TruCluster Server Rolling Upgrade Tasks

                                               Supported Version
  Task                                        V5.0A  V5.1  V5.1A  Command(s)
  ------------------------------------------  -----  ----  -----  ----------------
  Update Installation                           √      √     √    installupdate
  Patch Kit Installation                        √      √     √    dupatch
  Update Installation and                       √      √     √    1. installupdate
    Patch Kit Installation                                        2. dupatch
  New Hardware Delivery (NHD)                                √    nhd_install
    Kit Installation
  NHD Kit Installation and                                   √    1. nhd_install
    Patch Kit Installation                                        2. dupatch
  Update Installation, NHD Kit                               √    1. installupdate
    Installation, and Patch Kit                                   2. nhd_install
    Installation [*]                                              3. dupatch

  [*] This is only supported if you have previously installed the NHD kit
      on a TruCluster Server version 5.1A cluster.
As of this writing, NHD installation kits have not been made available. Therefore, while the nhd_install command is mentioned here, no example of nhd_install output is provided.
For more information on this topic, we refer you to Compaq's Tru64 UNIX New Hardware Delivery Release Notes and Installation Instructions when they become available.
Let's now follow the individual steps required to perform the Install Stage for an Update Installation:
Shut down the lead member.
# shutdown -hs now
System going down IMMEDIATELY
...
Update the system firmware on the lead cluster member. Please review the Warning in section 25.5.1.3.
Boot to single-user mode.

P00>>> boot -fl s
...
Loading vmunix ...
...
INIT: SINGLE-USER MODE
...
Manually run the bcheckrc(8) command to check and mount all file systems.
# bcheckrc
Checking device naming: Passed.
Checking local filesystems
Mounting / (root)
user_cfg_pt: reconfigured
root_mounted_rw: reconfigured
Mounting /cluster/members/member1/boot_partition (boot filesystem)
user_cfg_pt: reconfigured
root_mounted_rw: reconfigured
user_cfg_pt: reconfigured
dsfmgr: NOTE: updating kernel basenames for system at /
    scp kevm tty00 tty01 lp0 dmapi scp0 dsk0 dsk1 dsk2 dsk3 dsk4 dsk5 dsk6
    dsk7 dsk8 dsk9 dsk10 floppy0 cdrom0 dsk13
Mounting local filesystems
exec: /sbin/mount_advfs -F 0x14000 cluster_root#root /
cluster_root#root on / type advfs (rw)
exec: /sbin/mount_advfs -F 0x4000 cluster_usr#usr /usr
cluster_usr#usr on /usr: Device busy
exec: /sbin/mount_advfs -F 0x4000 cluster_var#var /var
cluster_var#var on /var: Device busy
...
Execute the kloadsrv(8) command to start the kernel load server daemon, the update(8) command to flush file system data from memory to disk, and finally the swapon(8) command with the -a option to make all swap space available.
# kloadsrv
# update
# swapon -a
Make sure that all Software License Product Authorization Keys (PAKs) are active by resetting the License Management Facility (LMF).
# lmf reset
Combine OSF-USR ALS-NQ-2000NOV03-90
   with OSF-USR UNIX-SERVER-IMPLICIT-USER
Now let's start the Update Installation by executing the installupdate(8) command. We recommend using the "-nogui" flag because it takes less time to complete, and time is especially important when it involves anything that may impact users. For example, in one of our four-member ES40 clusters, an installupdate from V5.0A to V5.1 took 30 minutes longer with the GUI option than without.
# /sbin/installupdate -nogui /dev/disk/cdrom0c
Searching for distribution media...
Checking for installed supplemental hardware support...
Completed check for installed supplemental hardware support

*** START UPDATE INSTALLATION (Thu Nov 8 11:02:38 PST 2001) ***
FLAGS: -nogui

Checking for retired hardware...done.
Initializing new version information (OSF)...done
Initializing new version information (TCR)...done

Update Installation has detected the following update installable
products on your system:

        Tru64 UNIX T5.1A-4 Operating System (Rev 1278)
        Tru64 UNIX TruCluster(TM) Server Software X5.1A-4 (Rev 619)

These products will be updated to the following versions:

        Tru64 UNIX V5.1A Operating System (Rev 1885)
        Tru64 UNIX TruCluster(TM) Server Software V5.1A (Rev 1312)

It is recommended that you update your system firmware and perform a
complete system backup before proceeding.

A log of this update installation can be found at /var/adm/smlogs/update.log.

Do you want to continue the Update Installation? (y/n) []: y
As the system firmware has already been updated on the lead member, we can continue with the installupdate.
For our installation, while we want to select the kernel components, we do not have an interest in archiving obsolete files as we find that they are not very useful and take up valuable space. If you have accounting running, you may want to run a report on them first to see if anyone is using them and if not, then delete them.
Do you want to select optional kernel components? (y/n) [n]: y

Do you want to archive obsolete files? (y/n) [n]: n
The check for conflicting software has found software subsets that are either not compatible with or will not be upgraded by this Update Installation. These software subsets are identified and will need to be reinstalled after the rolling upgrade has been completed.
*** Checking for conflicting software ***
--------------------------------------------------------------------------------
The following software may require reinstallation after the Update
Installation is completed:

        COMPAQ C++ Version 6.3 for COMPAQ UNIX Systems
        DEC C++ Class Libraries Version 4.0 for Tru64 UNIX
        DECevent Development Enhancement Tools for Tru64 UNIX

Do you want to continue the Update Installation? (y/n) [y]: y
This section of the installupdate command will allow us to select which kernel options we would like for the new kernel that will eventually be built from the Update Installation. The selections that we have made support the environment in which our cluster is operating. Each Systems Administrator should determine the kernel options that best support his system's unique environment.
...
*** KERNEL OPTION SELECTION ***

 Selection   Kernel Option
--------------------------------------------------------------
     1       System V Devices
     2       NTP V3 Kernel Phase Lock Loop (NTP_TIME)
     3       Kernel Breakpoint Debugger (KDEBUG)
     4       Packetfilter driver (PACKETFILTER)
     5       IP-in-IP Tunneling (IPTUNNEL)
     6       IP Version 6 (IPV6)
     7       Point-to-Point Protocol (PPP)
     8       STREAMS pckt module (PCKT)
     9       Data Link Bridge (DLPI V2.0 Service Class 1)
    10       X/Open Transport Interface (XTISO, TIMOD, TIRDWR)
    11       Digital Versatile Disk File System (DVDFS)
    12       ISO 9660 Compact Disc File System (CDFS)
    13       Audit Subsystem
    14       All of the above
    15       None of the above
    16       Help
    17       Display all options again
--------------------------------------------------------------
Enter your choices, choose an overriding action or press <Return>
to confirm previous selections.

Choices (for example, 1 2 4-6): 1 2 3 4 8 11 12 13

You selected the following kernel options:

        System V Devices
        NTP V3 Kernel Phase Lock Loop (NTP_TIME)
        Kernel Breakpoint Debugger (KDEBUG)
        Packetfilter driver (PACKETFILTER)
        STREAMS pckt module (PCKT)
        Digital Versatile Disk File System (DVDFS)
        ISO 9660 Compact Disc File System (CDFS)
        Audit Subsystem

Is that correct? (y/n) [y]: y
A check is then made for file type conflicts. If obsolete files are detected, you are given the option to archive them, view them, or continue with the Update Installation. In our case, we were not really interested in archiving or viewing obsolete files. You may choose to do otherwise.
*** Checking for file type conflicts ***
Working....

Obsolete files are files that were shipped with the previous version of
the operating system that the current version does not require. Obsolete
files are removed during the Update Installation. To save any of these
files, archive them now.

        File Administration Menu
        ------------------------
        a) Archive Files
        v) View List of Files
        x) Return to Previous Menu

        Enter your choice: x

Continuing update install...
The Update Installation again checks to make sure that we have enough space in our file systems. As you can see, the designers of Tru64 UNIX software are being very careful to ensure that there is indeed enough space to perform an Update Installation.
*** Checking file system space ***

Update Installation is now ready to begin software load. Please check
the /var/adm/smlogs/update.log file for errors after the installation
is complete.

Do you want to continue the Update Installation? (y/n) [n]: y
The new version of the Operating System files is copied to a predetermined area for faster and easier access during the roll of the other cluster members. The upgraded Tru64 UNIX Operating System is then loaded.
Copying the new version of the operating system files to
/var/adm/update/OSKit. This information will be used by the clu_upgrade
command to roll the remaining cluster members and should not be
modified in any way. This operation may take a while.

Working....

*** Load Tru64 UNIX V5.1A Operating System (Rev 1885) Software Subsets ***
...
*** Starting protofile merges for Tru64 UNIX V5.1A Operating System (Rev 1885) ***
...
*** Finished protofile merges for Tru64 UNIX V5.1A Operating System (Rev 1885) ***
Finally, the newly upgraded TruCluster Server software subsets are installed and loaded.
*** Load Tru64 UNIX TruCluster(TM) Server Software V5.1A (Rev 1312) Software Subsets ***

3 subsets will be installed.

Loading subset 1 of 3 ...
TruCluster Migration Components
   Copying from /var/adm/update/TruClusterKit (disk)
   Verifying

Loading subset 2 of 3 ...
TruCluster Reference Pages
   Copying from /var/adm/update/TruClusterKit (disk)
   Verifying

Loading subset 3 of 3 ...
TruCluster Base Components
   Copying from /var/adm/update/TruClusterKit (disk)
Working....Thu Nov 8 13:09:18 PST 2001
   Verifying

3 of 3 subsets installed successfully.

*** Starting protofile merges for Tru64 UNIX TruCluster(TM) Server Software V5.1A (Rev 1312) ***
*** Finished protofile merges for Tru64 UNIX TruCluster(TM) Server Software V5.1A (Rev 1312) ***
*** Starting configuration merges for Update Install ***
...
Update Installation complete with loading of subsets.

Rebooting system with Compaq Computer Corporation Tru64 UNIX V5.1A
generic kernel for configuration phase...

Removing temporary update installation files...done.
...
The lead cluster member will reboot. When the system comes back up, the software subsets are configured on member0 and then on member1. In this instance, member1 is the designated lead cluster member.
Note: You may notice that some of the configuration messages state that software subsets are being configured on member0. This refers only to the directory /cluster/members/member0 on the cluster_root file system, not to a cluster member with memberid 0.
After all of the software subsets have been configured, a new kernel is built for the lead cluster member. The new kernel is copied into place, and the system is rebooted again.
...
rebooting.... (transferring to monitor)
...
The system is ready.

Compaq Tru64 UNIX V5.1A (Rev. 1885) (molari.gene.com) console

login: root
Password:

****************************************************************************
The cluster is currently in a rolled state and the software versions that
are available are different depending on which cluster member you are on.
Additional information about the exact state of the system can be obtained
using the /usr/sbin/clu_upgrade command.
****************************************************************************
At this point in the Update Installation, we are just about done with this stage. The next step will be to verify that the Install Stage is complete and successful. See section 25.5.5.1.
We've seen the individual steps required to perform the Install Stage for an Update Installation. Now let's see what it takes to perform the Install Stage of a Patch Kit Installation:
Update the system firmware on the lead cluster member. Please review the Warning in section 25.5.1.3.
We are now ready to install the Patch Kit. Again, the dupatch command must be executed on the lead member, pointing it at the location where your Patch Kit resides.
Warning: While you have a choice of installing Patch Kits in either multi-user or single-user mode, we agree with HP in recommending that Patch Kit installations be performed in single-user mode. Doing so reduces the risk of another Systems Administrator causing unintentional issues.
As the contents of every Patch Kit vary from release to release, we won't bore you with the details here but instead refer you to the Patch Kit's Summary and Release Notes for expanded information. It should also be noted that Patch Kits have been known to take anywhere from thirty minutes to two hours to install. This depends on the number of patches in the Patch Kit and the type of server you are attempting to patch.
Once the dupatch command is complete, reboot the lead cluster member.
The Postinstall Stage verifies that the Install Stage, that is, the Update Installation and/or the Patch Kit Installation and/or the NHD Kit Installation, has completed successfully. This stage must be performed on the lead member.
# clu_upgrade postinstall
This is the cluster upgrade program.
You have indicated that you want to perform the 'postinstall' stage of the upgrade.
Do you want to continue to upgrade the cluster? [yes]: yes
The 'postinstall' stage of the upgrade has completed successfully.
We would now recommend that you test the newly upgraded software before rolling the other cluster members to this new version.
First, let's test the Cluster File System by relocating a cluster file system between individual cluster members. In this example, we have just completed the installupdate on the server molari. The server sheridan has not been rolled to the new version of the TruCluster Server software yet.
# cfsmgr -v -a server / Domain or filesystem name = / Server Name = sheridan Server Status : OK
# cfsmgr -h sheridan -r -a SERVER=molari / Recovering filesystem mounted at / to this node (member id 1) Recovery to this node (member id 1) complete for filesystem mounted at /
# cfsmgr -v -a server /
Domain or filesystem name = /
Server Name = molari
Server Status : OK
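A check like the one above can be scripted. The sketch below is illustrative only: cfsmgr is mocked with a shell function so the sketch runs anywhere, and the helper name cfs_server is our own invention. On a real cluster you would delete the mock and parse the output of the real cfsmgr(8).

```shell
# Mock of "cfsmgr -v -a server <fs>" output; $4 is the file system argument.
cfsmgr() {
    printf 'Domain or filesystem name = %s\nServer Name = molari\nServer Status : OK\n' "$4"
}

# Print the member currently serving the file system given as $1.
cfs_server() {
    cfsmgr -v -a server "$1" | awk -F' = ' '/Server Name/ {print $2}'
}

if [ "$(cfs_server /)" = "molari" ]; then
    echo "relocation OK: / is served by molari"
fi
```

The same helper can be called for /usr and /var to confirm all three file systems are served by the lead member.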
Let's test Cluster Application Availability management. In this example, we will be relocating the CAA service for cluster_lockd from sheridan to molari.
# caa_relocate cluster_lockd -c molari
Attempting to stop 'cluster_lockd' on member 'sheridan'
Stop of 'cluster_lockd' on member 'sheridan' succeeded.
Attempting to start 'cluster_lockd' on member 'molari'
cluster NFS Locking: cluster rpc.statd started
cluster rpc.lockd started
Start of 'cluster_lockd' on member 'molari' succeeded.
# caa_stat cluster_lockd
NAME=cluster_lockd
TYPE=application
TARGET=ONLINE
STATE=ONLINE on molari
So what have we really tested here? Well, we have verified that the new TruCluster Server software we upgraded to works the same as the old software.
The next step would be to test the application software on the newly upgraded TruCluster Server software. Let's face it, this is probably the most important part of this chapter – testing and verifying that everything is okay from the standpoint of the user application software. You do not have TruCluster Server installed because it's really cool… or maybe you do… but because of the advantages it provides to you and your user community. These advantages do not mean very much if your users' application software does not work properly.
We strongly recommend that you test your individual applications on the new TruCluster Server software before continuing any further. Please make sure that they run the same on the new TruCluster Server software as they did on the old. Your next question is probably, "How can you compare if you are now running on the new software?" While the lead cluster member is running the new TruCluster Server software, all of the other, non-rolled cluster members are still operating on the old TruCluster Server software.
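A mixed-version cluster makes a side-by-side comparison possible. The sketch below is a hypothetical smoke test: run_check and the health-check command are stand-ins, and echo mocks the remote execution so the sketch runs anywhere. On a real cluster you might rsh or ssh to each member and run your application's own check.

```shell
# Stand-in for: rsh "$1" /usr/local/bin/app_healthcheck (hypothetical command)
run_check() {
    echo "app OK on $1"
}

# Normalize away the hostname so the two results can be compared directly.
old_result=$(run_check sheridan | sed 's/on .*/on HOST/')   # non-rolled member
new_result=$(run_check molari   | sed 's/on .*/on HOST/')   # rolled lead member

if [ "$old_result" = "$new_result" ]; then
    echo "application behaves the same on old and new software"
fi
```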
While the lead cluster member was upgraded during the Install Stage, upgrades to the remaining cluster members are performed during the Roll Stage. The Roll Stage is performed individually on each of the remaining cluster members – one at a time and in single-user mode.
The "clu_upgrade roll" command performs the following:
Verifies that the member to be rolled is in single-user mode, is not the lead cluster member, and has not been rolled yet.
Backs up all the member-specific files for the member to be rolled.
Sets up it(8) scripts that will be executed on reboot. These it scripts actually perform the installation and update of the new software.
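The eligibility checks above can be pictured as a small shell sketch. The values are hard-coded stand-ins: on a real member the memberid would come from "sysconfig -q generic memberid", the run level from who(1), and the roll state from clu_upgrade's own bookkeeping.

```shell
LEAD_MEMBER=1
this_member=2       # stand-in for this member's memberid
runlevel=S          # stand-in for the current run level (S = single-user)
already_rolled=no   # stand-in for this member's roll state

# Mirror the three preconditions clu_upgrade roll verifies.
can_roll() {
    [ "$runlevel" = "S" ] || { echo "member must be in single-user mode"; return 1; }
    [ "$this_member" -ne "$LEAD_MEMBER" ] || { echo "lead member is not rolled"; return 1; }
    [ "$already_rolled" = "no" ] || { echo "member already rolled"; return 1; }
    echo "member $this_member is eligible to roll"
}
can_roll
```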
The resulting output of the "clu_upgrade roll" command very much mirrors what was done during the Install Stage. If a version update was performed, then the output from the "clu_upgrade roll" will look a great deal like the output from the installupdate command. The same will be true for the installation of a Patch Kit.
Now let's show you what really happens during the Roll Stage:
First we need to shut down this cluster member to upgrade the system firmware. Again, please review the Warning in section 25.5.1.3.
# shutdown -hs now
...
Halting processes ...
...
Next we need to boot the system into single-user mode. Once that is done, we use bcheckrc to check and mount all the file systems and the "lmf reset" command to reactivate all the License PAKs.
P0>>> boot -fl 0
INIT: SINGLE-USER MODE
#
# /sbin/bcheckrc
# lmf reset
Combine OSF-USR ALS-NQ-2000NOV03-99 with OSF-USR UNIX-SERVER-IMPLICIT-USER
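The single-user preparation steps can be collected into a script fragment. In this sketch the Tru64 commands are mocked with shell functions so it runs anywhere; on a real member you would call the real /sbin/bcheckrc and lmf(8) instead.

```shell
bcheckrc() { echo "file systems checked and mounted"; }   # mock of /sbin/bcheckrc
lmf()      { echo "license PAKs reset"; }                 # mock of lmf(8)

prepare_member() {
    bcheckrc  || return 1   # check and mount local and cluster file systems
    lmf reset || return 1   # re-activate the License PAKs
    echo "member ready for clu_upgrade roll"
}
prepare_member
```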
Now let's start the roll or upgrade of this cluster member. Please note that one of the first tasks performed is backing up the member-specific files.
The output from rolling a cluster member after a version update installation will probably look very familiar. It should, as this is basically what occurred during the installupdate with a few differences at the end. For the sake of not being too redundant, we will note only the differences in this example.
# clu_upgrade roll
This is the cluster upgrade program.
You have indicated that you want to perform the 'roll' stage of the upgrade.
Do you want to continue to upgrade the cluster? [yes]: yes
Backing up member-specific data for member: 2 ...

*** START UPDATE INSTALLATION (Thu Nov 8 13:39:22 PST 2001) ***

Checking for installed supplemental hardware support...
Completed check for installed supplemental hardware support
Checking for retired hardware...done.
Initializing new version information (OSF)...done
Initializing new version information (TCR)...done
Initializing the list of member specific files for member2...done

Update Installation has detected the following update installable products on your system:

    Tru64 UNIX T5.1A-4 Operating System (Rev 1278)
    Tru64 UNIX TruCluster(TM) Server Software X5.1A-4 (Rev 619)

These products will be updated to the following versions:

    Tru64 UNIX V5.1A Operating System (Rev 1885)
    Tru64 UNIX TruCluster(TM) Server Software V5.1A (Rev 1312)

It is recommended that you update your system firmware and perform a complete system backup before proceeding.

A log of this update installation can be found at /var/adm/smlogs/update.log.

Do you want to continue the Update Installation? (y/n) []: y
Do you want to select optional kernel components? (y/n) [n]: y
Do you want to archive obsolete files? (y/n) [n]: n
FLAGS:

*** Checking for conflicting software ***

The following software may require reinstallation after the Update Installation is completed:

    COMPAQ C++ Version 6.3 for COMPAQ UNIX Systems
    DEC C++ Class Libraries Version 4.0 for Tru64 UNIX
    DECevent Development Enhancement Tools for Tru64 UNIX

Do you want to continue the Update Installation? (y/n) [y]: y
*** Determining installed Operating System software ***
*** Determining installed Tru64 UNIX TruCluster(TM) Server Software X5.1A-4 (Rev 619) software ***
Working....
*** Determining kernel components ***
*** KERNEL OPTION SELECTION ***
...
*** KERNEL OPTION SELECTION ***
Selection   Kernel Option
--------------------------------------------------------------
    1       System V Devices
    2       NTP V3 Kernel Phase Lock Loop (NTP_TIME)
    3       Kernel Breakpoint Debugger (KDEBUG)
    4       Packetfilter driver (PACKETFILTER)
    5       IP-in-IP Tunneling (IPTUNNEL)
    6       IP Version 6 (IPV6)
    7       Point-to-Point Protocol (PPP)
    8       STREAMS pckt module (PCKT)
    9       Data Link Bridge (DLPI V2.0 Service Class 1)
   10       X/Open Transport Interface (XTISO, TIMOD, TIRDWR)
   11       Digital Versatile Disk File System (DVDFS)
   12       ISO 9660 Compact Disc File System (CDFS)
   13       Audit Subsystem
   14       All of the above
   15       None of the above
   16       Help
   17       Display all options again
--------------------------------------------------------------
Choices (for example, 1 2 4-6): 1 2 3 4 8 11 12 13
You selected the following kernel options:
    System V Devices
    NTP V3 Kernel Phase Lock Loop (NTP_TIME)
    Kernel Breakpoint Debugger (KDEBUG)
    Packetfilter driver (PACKETFILTER)
    STREAMS pckt module (PCKT)
    Digital Versatile Disk File System (DVDFS)
    ISO 9660 Compact Disc File System (CDFS)
    Audit Subsystem
Is that correct? (y/n) [y]: y
*** Checking for file type conflicts ***
*** Checking for obsolete files ***
*** Checking file system space ***

Update Installation is now ready to begin modifying the files necessary to reboot the cluster member off of the new OS. Please check the /var/adm/smlogs/update.log and /var/adm/smlogs/it.log files for errors after the installation is complete.

Do you want to continue the Update Installation? (y/n) [n]: y

*** Starting configuration merges for Update Install ***
Up to this point, it would be rather hard to differentiate this output from the output from installupdate. This next section of output is unique for the Roll Stage.
The critical files needed for reboot have been moved into place. The system will now reboot with the generic kernel for Compaq Computer Corporation Tru64 UNIX V5.1A and complete the rolling upgrade for this member (member2).
The 'roll' stage has completed successfully.

This member must be rebooted in order to run with the newly installed software.
Do you want to reboot this member at this time? []: yes
You indicated that you want to reboot this member at this time. Is that correct? [yes]: yes

The 'roll' stage of the upgrade has completed successfully.
As soon as the cluster member finishes rebooting, it configures the member-specific software subsets for member2.
After the configuration of all the software subsets is complete, a new kernel will be built for this cluster member. The new kernel will be copied in place, and the system is rebooted.
Saving /sys/conf/SHERIDAN as /sys/conf/SHERIDAN.bck

The system will now automatically build a kernel with the selected options and then reboot. This can take up to 15 minutes, depending on the processor type.

*** PERFORMING KERNEL BUILD ***
...
System rebooting
This cluster member is now running on the new TruCluster Server software.
The output of a roll of a cluster member after a Patch Kit Installation is much simpler than a roll after an Update Installation.
# clu_upgrade roll
This is the cluster upgrade program.
You have indicated that you want to perform the 'roll' stage of the upgrade.
Do you want to continue to upgrade the cluster? [yes]: yes
Backing up member-specific data for member: 2 ....
...
The 'roll' stage has completed successfully.
This member must be rebooted in order to run with the newly installed software.
Do you want to reboot this member at this time? []: yes
You indicated that you want to reboot this member at this time. Is that correct? [yes]: yes
After the cluster member is rebooted, the newly patched software subsets are installed and configured. Finally, a new kernel is built and copied into place. The cluster member is rebooted again, and this time, when the system comes back up, it will do so on the newly patched TruCluster Server software.
The Roll Stage is not complete until each and every cluster member, except the lead cluster member, is rolled. If a cluster member goes down and cannot be rebooted before all cluster members are rolled, it is recommended that this cluster member be deleted from the cluster. You can always add this cluster member back after the Rolling Upgrade is complete and this system is repaired.
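The "is everyone rolled?" check can be sketched as follows. The member lists are hard-coded stand-ins; on a real cluster, clu_upgrade's status reporting tells you each member's roll state, and the lead member needs no roll.

```shell
all_members="2 3"      # non-lead members in this example (lead member is 1)
rolled_members="2 3"   # members that have completed the roll

pending=""
for m in $all_members; do
    case " $rolled_members " in
        *" $m "*) ;;                    # this member has rolled
        *) pending="$pending $m" ;;     # still waiting on this member
    esac
done

if [ -z "$pending" ]; then
    echo "all members rolled; safe to proceed to the switch stage"
else
    echo "still pending:$pending"
fi
```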
The Switch Stage is where we actually turn on any new software features installed during the Install Stage. From the Install Stage until the completion of the Roll Stage, the cluster was actually operating on two different versions of the operating system and TruCluster Server software. One of the ways the cluster manages this mixed-version operation is by keeping the active features of the different software versions as compatible as possible. It does so by effectively "turning off" or disabling any and all newly installed features until the entire cluster is at the same version of the software.
In detail, let's see what happens when the "clu_upgrade switch" command is executed:
First, it verifies that all cluster members have been rolled and that they are all operating off the same version of the operating system and the TruCluster Server software.
The new version ID of the operating system and cluster software is then set in each cluster member's /etc/sysconfigtab file. This version ID corresponds to the running kernel.
The "clu_upgrade switch" command is executed in multi-user mode and can be run from any cluster member; however, it is executed only once, and on only one node of the cluster.
# clu_upgrade switch
This is the cluster upgrade program.
You have indicated that you want to perform the 'switch' stage of the upgrade.
Do you want to continue to upgrade the cluster? [yes]: yes
Initiating version switch on cluster members
...
The cluster upgrade 'switch' stage has completed successfully.
All cluster members must be rebooted before running the 'clean' command.
After the "clu_upgrade switch" command completes, every member in the cluster must be rebooted, one at a time. For the convenience of the System Administrator and the users, these reboots can be spread out over time.
Caution | It should be noted that as soon as the Switch Stage is completed, or the "switch is thrown," you can no longer issue any "clu_upgrade undo" commands. |
The Clean Stage is the final stage of the Rolling Upgrade. The "clu_upgrade clean" command performs the following:
Verifies that the Switch Stage has been completed.
Removes all the tagged (.Old..) files.
Removes all the on-disk backups that were created by the clu_upgrade command.
Removes the Kit installation directories: /var/adm/update/TruClusterKit, /var/adm/update/OSKit, and/or /var/adm/update/NHDKit.
Creates a directory for the upgrade just completed in /cluster/admin/clu_upgrade/history/release_version. This directory contains the log files for each stage of the upgrade.
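As a sanity check after the clean completes, you can confirm that the history directory and its stage logs exist. In this sketch a temporary directory stands in for the real /cluster/admin/clu_upgrade path, and V5.1A and roll.log are assumed names for this example.

```shell
# Build a stand-in history tree so the sketch runs anywhere.
hist_root=$(mktemp -d)/cluster/admin/clu_upgrade/history
mkdir -p "$hist_root/V5.1A"
: > "$hist_root/V5.1A/roll.log"    # stand-in for a per-stage log file

# The actual check: history directory present and at least one stage log.
if [ -d "$hist_root/V5.1A" ] && [ -f "$hist_root/V5.1A/roll.log" ]; then
    echo "upgrade history and stage logs are in place"
fi
```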
The following is an example of output from the "clu_upgrade clean" command:
# clu_upgrade clean
This is the cluster upgrade program.
You have indicated that you want to perform the 'clean' stage of the upgrade.
Do you want to continue to upgrade the cluster? [yes]: yes
Deleting tagged files.
....................................................................
Removing back-up and kit files

The Update Administration Utility is typically run after an update installation to manage the files that are saved during an update installation.

Do you want to run the Update Administration Utility at this time? [yes]: yes
The Update Installation Cleanup utility is used to clean up backup files created by Update Installation. Update Installation can create two types of files: .PreUPD and .PreMRG. The .PreUPD files are copies of unprotected customized system files as they existed prior to running Update Installation. The .PreMRG files are copies of protected system files as they existed prior to running Update Installation.
At this point, the cluster is now operating on the newly upgraded software.
[3]Compaq Support Blitz TD 2807-C.
[4]For more information on Selective Storage Presentation, we refer you to the Compaq StorageWorks manual for the HSG80 Controller (ACS Manual).
[5]For more information on tagged files, please see section 25.5.2.1.
[6]This is a feature of TruCluster Server version 5.1A or later.