25.5 The Rolling Upgrade



Without further ado, let's show you how to perform a rolling upgrade, the stages of which are described in the following flow diagram (see Figure 25-1).

Figure 25-1: The Rolling Upgrade Flow

25.5.1 Preparation Stage

Preparing the cluster for the rolling upgrade includes several parts. Each of these parts must be completed before you can continue to the next stage of the rolling upgrade.

25.5.1.1 Determine which Cluster Member will be the Lead Member

The "lead member" is chosen as the first member of the cluster to be rolled. Normally, it is good practice to select the cluster member that is serving the root file system, the /usr file system, and the /var file system. The following provides an example of how to determine which cluster member is serving these file systems.

Let's find out which cluster member or system is serving /, /usr, and /var:

 # cfsmgr / /usr /var
 Domain or filesystem name = /
 Server Name = molari
 Server Status : OK
 Domain or filesystem name = /usr
 Server Name = molari
 Server Status : OK
 Domain or filesystem name = /var
 Server Name = molari
 Server Status : OK

Or, you can use our cfs script.

 # cfs | grep -E "cluster_root|cluster_usr|cluster_var"
 molari    /       cluster_root#root    AdvFS
 molari    /usr    cluster_usr#usr      AdvFS
 molari    /var    cluster_var#var      AdvFS

If one or more of these file systems (/, /usr, or /var) is being served by another cluster member, you can easily relocate this file system using the cfsmgr command.

 # cfsmgr /var
 Domain or filesystem name = /var
 Server Name = sheridan
 Server Status : OK

 # cfsmgr -h sheridan -r -a SERVER=molari /var

 # cfsmgr /var
 Domain or filesystem name = /var
 Server Name = molari
 Server Status : OK
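If several file systems need to move, the relocation can be scripted. The following is a dry-run sketch of our own (it only prints the cfsmgr commands it would run); the member names LEAD and CURRENT are assumptions for our example cluster, so substitute your own:

```shell
#!/bin/sh
# Dry-run sketch: print the cfsmgr relocation commands that would move
# /, /usr, and /var to the lead member. LEAD and CURRENT are assumptions
# taken from the example in the text -- substitute your own member names.
LEAD=molari
CURRENT=sheridan
for fs in / /usr /var; do
    echo "cfsmgr -h $CURRENT -r -a SERVER=$LEAD $fs"
done
```

Remove the echo once you have confirmed the printed commands are what you intend.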

Now that we know which system is the server for these file systems, let's determine which memberid it is in the cluster.

 # hostname
 molari

 # sysconfig -q generic memberid
 generic:
 memberid = 1

Or you can use the hwmgr(8) command:

 # hwmgr -view cluster
 Member ID      State   Member HostName
 ---------      -----   ---------------
    1            UP     molari (localhost)
    2            UP     sheridan

In our example, the system known as molari, which is also cluster memberid 1, will be the lead member for the purposes of the rolling upgrade.

25.5.1.2 Back up the Cluster

In general, it is a good Systems Administration practice never to perform any type of major systems maintenance without first having a good and verifiable backup of your system.

Note

We have noticed that while most Systems Administrators remember to back up /, /usr, and /var, the majority of folks forget about each cluster member's boot disk. Please, always remember to back up each cluster member's boot disk when you back up your cluster.
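A dry-run sketch of our own may help make the point; it only prints the vdump commands a full backup of a two-member cluster might use. The domain list and the tape device name are assumptions for illustration:

```shell
#!/bin/sh
# Dry-run sketch: print a vdump command for every cluster filesystem,
# including each member's boot partition (the piece most often forgotten).
# The filesystem list and tape device are assumptions for a two-member
# cluster -- adjust both for your site before removing the echo.
TAPE=/dev/tape/tape0_d1
for fs in / /usr /var \
          /cluster/members/member1/boot_partition \
          /cluster/members/member2/boot_partition; do
    echo "vdump -0 -u -f $TAPE $fs"
done
```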

25.5.1.3 Update the Systems Firmware

New TruCluster Server software often takes advantage of the latest features that can only be found in the latest systems firmware updates from HP. It is essential to have the latest and greatest systems firmware in place before actually operating your cluster on the newly upgraded TruCluster Server software.

When does the systems firmware need to be upgraded? At any time before actually running the new TruCluster Server software. To minimize downtime, the lead cluster member's firmware should be upgraded early in the Install Stage; each of the other cluster members' firmware should be upgraded before completion of the Roll Stage (see section 25.5.2.6 for a recommendation as to when to upgrade the systems firmware for the other cluster members). For more information on installing systems firmware, please see Chapter 5 on Installing Tru64 UNIX.

Warning

Firmware upgrade of the KGPSA host bus adapter may alter Fibre Channel (FC) node name.

During the testing required to produce this chapter, we came across an issue and a warning from HP Support that has to do with upgrading firmware on the KGPSA.

Upgrading the firmware on the KGPSA may change the FC node name. "This upgrade cannot be taken lightly as the change in the FC node name changes connection status on the HSG80s on the Tru64 UNIX platforms." If the storage system for the cluster "is using Selective Storage Presentation (setting entries into the Access_Enable table), then all these new connections will have to be renamed and the old connections deleted."[3]

The following is an example of these changes in the connections and how to correct them:

  • SHOW CONNECTION before the KGPSA firmware upgrade:

     BL5-HSG1> SHOW CONNECTION
     Name         Operating system   Controller   Port   Address   Status    Offset
     MOLARI1B01   TRU64_UNIX         THIS         2      000008    OL this   00
                  HOST_ID=1000-0000-C921-92C9   ADAPTER_ID=1000-0000-C921-92C9
     MOLARI1A01   TRU64_UNIX         THIS         2      000002    OL this   00
                  HOST_ID=1000-0000-C921-93B0   ADAPTER_ID=1000-0000-C921-93B0

  • SHOW CONNECTION after the firmware upgrade:

     BL5-HSG1> SHOW CONNECTION
     Name         Operating system   Controller   Port   Address   Status    Offset
     !NEWCON02    WINNT              THIS         2      000008    OL this   00
                  HOST_ID=2000-0000-C921-92C9   ADAPTER_ID=1000-0000-C921-92C9
     !NEWCON03    WINNT              THIS         2      000002    OL this   00
                  HOST_ID=2000-0000-C921-93B0   ADAPTER_ID=1000-0000-C921-93B0
     MOLARI1B01   TRU64_UNIX         THIS         2      000008    Offline   00
                  HOST_ID=1000-0000-C921-92C9   ADAPTER_ID=1000-0000-C921-92C9
     MOLARI1A01   TRU64_UNIX         THIS         2      000002    Offline   00
                  HOST_ID=1000-0000-C921-93B0   ADAPTER_ID=1000-0000-C921-93B0

    Notice that the two new connections, !NEWCON02 and !NEWCON03, are the same KGPSA adapters that appear in MOLARI1B01 and MOLARI1A01, but the firmware upgrade has changed the Fibre Channel Node Name (the HOST_ID) on the connections. The system's old connections are now OFFLINE, and the system cannot access any of the units that are presented with Selective Storage Presentation[4]. In a cluster, and especially during a rolling upgrade, this is not a desirable situation, but it can easily be resolved.

  • SHOW CONNECTION after re-editing the connection entries:

     BL5-HSG1> SHOW CONNECTION
     Name         Operating system   Controller   Port   Address   Status    Offset
     MOLARI1B01   TRU64_UNIX         THIS         2      000008    OL this   00
                  HOST_ID=2000-0000-C921-92C9   ADAPTER_ID=1000-0000-C921-92C9
     MOLARI1A01   TRU64_UNIX         THIS         2      000002    OL this   00
                  HOST_ID=2000-0000-C921-93B0   ADAPTER_ID=1000-0000-C921-93B0

    Notice that the old connections MOLARI1B01 and MOLARI1A01 were deleted. The new connection !NEWCON02 was renamed to MOLARI1B01, and !NEWCON03 was renamed to MOLARI1A01.

One final word about firmware before we proceed. If any of the systems in your cluster has an EISA bus, and most EV5- and EV56-based AlphaServer systems did, you must run the EISA Configuration Utility (ECU). The ECU is usually shipped on diskette with your system hardware. If you do not have a current ECU diskette, please contact your HP support representative for a replacement copy before starting the rolling upgrade.

25.5.1.4 Determine if the Cluster is Ready to be Upgraded

 # clu_upgrade -v check setup 1
 Retrieving cluster upgrade status

This verifies the following:

  • There is no rolling upgrade in progress.

  • All cluster members are running the same version of the operating system and cluster software.

  • There are no cluster members running on tagged files[5].

  • There is enough free disk space in which to perform a rolling upgrade of the cluster.

25.5.2 Setup Stage

During the Setup Stage, the following subtasks are performed:

  • The rolling upgrade log is created and initialized. This file can be found in /cluster/admin/clu_upgrade.log.

  • The "clu_upgrade -v check setup" command is reissued.

  • A set of tagged files is created in preparation for the rolling upgrade. Tagged files are used so that the cluster can operate on two different versions of the operating system and cluster software at the same time. This is why there must be enough free disk space to perform a rolling upgrade.

  • For all cluster members except the lead member, the /etc/sysconfigtab attribute generic:rolls_ver_lookup is set to 1. This allows the cluster members that have not yet been "rolled" to use the tagged files to operate.

Up to this point, the Setup Stage does the same tasks whether you plan to perform an Update Installation, an NHD Installation, or a Patch Kit Installation.

25.5.2.1 A Note about Tagged Files and How They Work

What's a tagged file and why should you care? As we stated in the previous subsection, tagged files are created in preparation for the rolling upgrade. Their primary purpose is to enable the cluster to operate on two different versions of the operating system and cluster software at once. Is it magic or just a clever trick? You be the judge!

First, a tagged file is usually created in the same directory as its original file, and each tagged file has an AdvFS property, DEC_VERSION_TAG, set on it.

Next, if a cluster member's /etc/sysconfigtab generic:rolls_ver_lookup attribute is set to 1, pathname resolution checks whether a copy of the file with .Old.. prepended to its name exists and whether that copy has the DEC_VERSION_TAG property set on it. If both conditions are met, any attempt to use the file is redirected to the .Old.. copy. See, it's magic!

Well, now that we have you thoroughly confused, let's try to simplify matters with an example. If you execute the command /usr/sbin/dump on a member that has not been rolled, then what actually gets executed is /usr/sbin/.Old..dump. Executing the same command on a member that has been rolled will execute the newly updated /usr/sbin/dump command.

You may find that this feature of allowing two different versions of the operating system and cluster software to co-exist will come in handy particularly when it comes to testing and verifying that your user applications work on the new version of the software.
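To make the redirection concrete, here is a toy model of our own in plain shell. The real redirection happens inside the kernel during pathname resolution (keyed off the DEC_VERSION_TAG AdvFS property, which a shell script cannot see); this sketch only mimics the .Old.. lookup so you can watch the idea in action:

```shell
#!/bin/sh
# Toy model of tagged-file lookup. This is our own illustration only --
# the real redirection is done by the kernel when generic:rolls_ver_lookup=1
# and the .Old.. copy carries the DEC_VERSION_TAG property.
resolve() {
    dir=$(dirname "$1"); base=$(basename "$1")
    if [ -e "$dir/.Old..$base" ]; then
        echo "$dir/.Old..$base"    # un-rolled member: the tagged copy wins
    else
        echo "$1"                  # rolled member: the new file is used
    fi
}
demo=$(mktemp -d)
touch "$demo/dump" "$demo/.Old..dump"
resolve "$demo/dump"    # prints the .Old..dump path
rm -rf "$demo"
```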

25.5.2.2 Setup Stage for an Update Installation

The Setup Stage for an Update Installation also copies the cluster kit from the mounted TruCluster software installation CD to /var/adm/update/TruClusterKit. This is done so that the cluster kit will be accessible during the Install Stage and the Roll Stage.

Please note that if your existing cluster is at TruCluster Server version 5.0A or at TruCluster Server version 5.1, you will see a slightly different version of output from the "clu_upgrade setup 1" command than if you were using TruCluster Server version 5.1A.

The following is the sample output of the "clu_upgrade setup 1" command, taken from a TruCluster Server version 5.1 system:

 # clu_upgrade setup 1
 This is the cluster upgrade program.
 You have indicated that you want to perform the 'setup' stage of the upgrade.
 Do you want to continue to upgrade the cluster? [yes]: yes

 What type of upgrade will be performed?
 1) Rolling upgrade using the installupdate command
 2) Rolling patch using the dupatch command
 3) Both a rolling upgrade and a rolling patch
 4) Exit cluster software upgrade
 Enter your choice: 1
 Enter the full pathname of the cluster kit mount point ['???']: /cdrom1/TruCluster

 A cluster kit has been found in the following location:
 /cdrom1/TruCluster/kit/
 This kit has the following version information:
 'Tru64 UNIX TruCluster(TM) Server Software V5.1A (Rev 1312)'
 Is this the correct cluster kit for the update being performed? [yes]: yes

 Checking inventory and available disk space.
 Copying cluster kit '/cdrom1/TruCluster/kit/' to '/var/adm/update/TruClusterKit/'.

The next sample output is what you would see at the Setup Stage if you were performing an Update Installation on TruCluster Server version 5.1A:

 # clu_upgrade setup 1
 This is the cluster upgrade program.
 You have indicated that you want to perform the 'setup' stage of the upgrade.
 Do you want to continue to upgrade the cluster? [yes]: yes

 What type of rolling upgrade will be performed?

    Selection    Type of Upgrade
 ----------------------------------------------------------------------
       1         An upgrade using the installupdate command
       2         A patch using the dupatch command
       3         A new hardware delivery using the nhd_install command
       4         All of the above
       5         None of the above
       6         Help
       7         Display all options again
 ----------------------------------------------------------------------
 Enter your Choices (for example, 1 2 2-3): 1

 You selected the following rolling upgrade options: 1
 Is that correct? (y/n) [y]: y

Please note that as TruCluster Server version 5.1A is the latest version, we have yet to test the Update Installation to a later version of TruCluster Server.

25.5.2.3 Setup Stage for a New Hardware Delivery (NHD)[6]

As of this writing, not much has been documented on what is required for performing a rolling upgrade of a cluster for a New Hardware Delivery. Based on available information, when a rolling upgrade for an NHD Installation is performed, at the Setup Stage, the NHD installation kit is copied from its source media to /var/adm/update/NHDKit for accessibility during the Install Stage. For more information on the New Hardware Delivery kit, please see Compaq's Tru64 UNIX New Hardware Delivery Release Notes and Installation Instructions when this document becomes available.

25.5.2.4 Setup Stage for Installation of a Patch Kit

Unlike the Setup Stage for either an Update Installation or an NHD Installation, the Setup Stage for a Patch Kit does not do anything additional in terms of copying files for greater accessibility.

Again, if your existing cluster is at TruCluster Server version 5.0A or at TruCluster Server version 5.1, you will see a slightly different version of output from the "clu_upgrade setup 1" command than if you were using TruCluster Server version 5.1A.

The following sample output of "clu_upgrade setup 1" command comes from a TruCluster Server version 5.1 system:

 # clu_upgrade setup 1
 This is the cluster upgrade program.
 You have indicated that you want to perform the 'setup' stage of the upgrade.
 Do you want to continue to upgrade the cluster? [yes]: yes

 What type of upgrade will be performed?
 1) Rolling upgrade using the installupdate command
 2) Rolling patch using the dupatch command
 3) Both a rolling upgrade and a rolling patch
 4) Exit cluster software upgrade
 Enter your choice: 2

The following is what you would see at the Setup Stage if you were performing a Patch Installation on TruCluster Server version 5.1A:

 # clu_upgrade setup 1
 This is the cluster upgrade program.
 You have indicated that you want to perform the 'setup' stage of the upgrade.
 Do you want to continue to upgrade the cluster? [yes]: yes

 What type of rolling upgrade will be performed?

    Selection    Type of Upgrade
 ----------------------------------------------------------------------
       1         An upgrade using the installupdate command
       2         A patch using the dupatch command
       3         A new hardware delivery using the nhd_install command
       4         All of the above
       5         None of the above
       6         Help
       7         Display all options again
 ----------------------------------------------------------------------
 Enter your Choices (for example, 1 2 2-3): 2

 You selected the following rolling upgrade options: 2
 Is that correct? (y/n) [y]: y

25.5.2.5 Verification of Setup Stage

The Setup Stage has been known to take a very long time (over two hours) on clusters with more than two members, or on older AlphaServer models such as the AlphaServer 2100. The reason is that system files are being copied into tagged file sets: the more cluster members, the more system files that must be copied, and the longer it takes to complete. Please have patience during this stage of the process.

The following is sample output from the Setup Stage from a two-member cluster. You should receive output similar to this on completion of this stage for your cluster:

 Backing up member-specific data for member: 1
 .......
 Creating tagged files.
 .............
 The cluster upgrade 'setup' stage has completed successfully.
 Reboot all cluster members except member: '1'
 The 'setup' stage of the upgrade has completed successfully.

At this point, all cluster members except the lead member must be rebooted. As soon as the other cluster members come up after the reboot, they will be running on the tagged files.
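Once the other members come back up, you may want to confirm that each of them is indeed running with generic:rolls_ver_lookup set to 1. The following dry-run sketch of our own only prints the commands it would run; the member list and the use of rsh are assumptions for our two-member cluster, so substitute your own names and remote-shell mechanism:

```shell
#!/bin/sh
# Dry-run sketch: print the commands that would confirm each non-lead
# member rebooted onto the tagged files. The member list and rsh are
# assumptions -- substitute your own member names and remote shell.
for member in sheridan; do
    echo "rsh $member sysconfig -q generic rolls_ver_lookup"
done
```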

25.5.2.6 One More Thing on the Setup Stage Before You Reboot

If the firmware on each of these other cluster members has not been upgraded, we highly recommend that you take this opportunity to upgrade each system's firmware.

25.5.3 Preinstall Stage

The Preinstall Stage is executed on the lead member of the cluster only after the other cluster members have been rebooted at the end of the Setup Stage. The following subtasks are performed during this stage:

  • Confirm that the cluster is ready to proceed with the upgrade by verifying that all members are running on the tagged files and that the lead member is not.

  • An on-disk backup of the lead member's member-specific system files is made.

  • The tagged files are verified and matched against their inventory files.

The following is sample output from the Preinstall Stage for an Update Installation:

 # clu_upgrade preinstall
 This is the cluster upgrade program.
 You have indicated that you want to perform the 'preinstall' stage of the upgrade.
 Do you want to continue to upgrade the cluster? [yes]: yes
 Checking tagged files.
 ......................................................
 The cluster upgrade 'preinstall' stage has completed successfully.
 On the lead member, perform the following steps before running the
 installupdate command:
 # shutdown -h now
 >>> boot -fl s
 When the system reaches single-user mode run the following commands:
 # init s
 # bcheckrc
 # update
 # kloadsrv
 # lmf reset
 See the Tru64 UNIX Installation Guide for detailed information on using
 the installupdate command.
 The 'preinstall' stage of the upgrade has completed successfully.

This next sample output is from the Preinstall Stage for a Patch Kit installation:

 # clu_upgrade preinstall
 This is the cluster upgrade program.
 You have indicated that you want to perform the 'preinstall' stage of the upgrade.
 Do you want to continue to upgrade the cluster? [yes]: yes
 Checking tagged files.
 ......................................................
 The cluster upgrade 'preinstall' stage has completed successfully.
 You can now run the dupatch command on the lead member.

25.5.4 Install Stage

Now that we have all the preliminaries out of the way, we are finally ready to get this show on the road. The Install Stage of the rolling upgrade is where we actually get to start upgrading the software. All stages previous to this have been in preparation of this stage.

All the tasks of the Install Stage are executed on the lead member. Depending on the type of upgrade, the commands executed are installupdate(8), dupatch(8), and/or nhd_install(8)[7]. See Table 25-3 for the combinations of tasks that can be performed.

Table 25-3: TruCluster Server Rolling Upgrade Tasks

 Task                                                     Command(s)
 -------------------------------------------------------  -----------------
 Update Installation                                      installupdate
 Patch Kit Installation                                   dupatch
 Update Installation and Patch Kit Installation           1. installupdate
                                                          2. dupatch
 New Hardware Delivery (NHD) Kit Installation             nhd_install
 New Hardware Delivery (NHD) Kit Installation and         1. nhd_install
 Patch Kit Installation                                   2. dupatch
 Update Installation, New Hardware Delivery (NHD)         1. installupdate
 Kit Installation, and Patch Kit Installation[*]          2. nhd_install
                                                          3. dupatch

[*] Supported only if you have previously installed the NHD kit on a TruCluster Server version 5.1A cluster.

As of this writing, NHD installation kits have not been made available. Therefore, while the nhd_install command is mentioned here, no example of nhd_install output is provided.

For more information on this topic, we refer you to Compaq's Tru64 UNIX New Hardware Delivery Release Notes and Installation Instructions when they become available.

25.5.4.1 Install Stage for an Update Install

Let's now follow the individual steps required to perform the Install Stage for an Update Installation:

  1. Shut down the lead member.

      # shutdown -hs now
      System going down IMMEDIATELY
      ...
  2. Update the system firmware on the lead cluster member. Please review the Warning in section 25.5.1.3.

  3. Boot to single user mode.

      P00>>> boot -fl s
      ...
      Loading vmunix ...
      ...
      INIT: SINGLE-USER MODE
      ...
  4. Manually run the bcheckrc(8) command to check and mount all file systems.

      # bcheckrc
      Checking device naming:
           Passed.
      Checking local filesystems
      Mounting / (root)
      user_cfg_pt: reconfigured
      root_mounted_rw: reconfigured
      Mounting /cluster/members/member1/boot_partition (boot filesystem)
      user_cfg_pt: reconfigured
      root_mounted_rw: reconfigured
      user_cfg_pt: reconfigured
      dsfmgr: NOTE: updating kernel basenames for system at /
           scp kevm tty00 tty01 lp0 dmapi scp0 dsk0 dsk1 dsk2 dsk3 dsk4 dsk5
           dsk6 dsk7 dsk8 dsk9 dsk10 floppy0 cdrom0 dsk13
      Mounting local filesystems
      exec: /sbin/mount_advfs -F 0x14000 cluster_root#root /
      cluster_root#root on / type advfs (rw)
      exec: /sbin/mount_advfs -F 0x4000 cluster_usr#usr /usr
      cluster_usr#usr on /usr: Device busy
      exec: /sbin/mount_advfs -F 0x4000 cluster_var#var /var
      cluster_var#var on /var: Device busy
      ...

  5. Execute the kloadsrv(8) command to start the kernel load server daemon, the update(8) command to flush data from memory to disk, and finally swapon(8) with the -a option to make all swap space available.

      # kloadsrv
      # update
      # swapon -a
  6. Make sure that all Software License Product Authorization Keys (PAKs) are active by resetting the License Management Facility (LMF).

      # lmf reset
      Combine OSF-USR ALS-NQ-2000NOV03-90 with OSF-USR UNIX-SERVER-IMPLICIT-USER

  7. Now let's start the Update Installation by executing the installupdate(8) command. We recommend using the "-nogui" flag because it takes less time to complete, and time is especially important for anything that may impact users. For example, on one of our four-member ES40 clusters, an installupdate from V5.0A to V5.1 took 30 minutes longer with the GUI option than without.

      # /sbin/installupdate -nogui /dev/disk/cdrom0c
      Searching for distribution media...
      Checking for installed supplemental hardware support...
      Completed check for installed supplemental hardware support
      *** START UPDATE INSTALLATION (Thu Nov 8 11:02:38 PST 2001) ***
          FLAGS: -nogui
      Checking for retired hardware...done.
      Initializing new version information (OSF)...done
      Initializing new version information (TCR)...done
      Update Installation has detected the following update installable
      products on your system:
                 Tru64 UNIX T5.1A-4 Operating System (Rev 1278)
                 Tru64 UNIX TruCluster(TM) Server Software X5.1A-4 (Rev 619)
      These products will be updated to the following versions:
                 Tru64 UNIX V5.1A Operating System (Rev 1885)
                 Tru64 UNIX TruCluster(TM) Server Software V5.1A (Rev 1312)
      It is recommended that you update your system firmware and perform a
      complete system backup before proceeding.
      A log of this update installation can be found at
      /var/adm/smlogs/update.log.
      Do you want to continue the Update Installation? (y/n) []: y

    As the system firmware has already been updated on the lead member, we can continue with the installupdate.

  8. For our installation, we want to select the kernel components, but we have no interest in archiving obsolete files; we find they are not very useful and take up valuable space. If you have accounting running, you may want to run a report first to see if anyone is using these files and, if not, delete them.

      Do you want to select optional kernel components? (y/n) [n]: y
      Do you want to archive obsolete files? (y/n) [n]: n

  9. The check for conflicting software has found four software subsets that are not compatible or will not be upgraded with this Update Installation. These software subsets are identified and will need to be reinstalled after the Rolling Upgrade has been completed.

      *** Checking for conflicting software ***
      ------------------------------------------------------------------------
      The following software may require reinstallation after the Update
      Installation is completed:
             COMPAQ C++ Version 6.3 for COMPAQ UNIX Systems
             DEC C++ Class Libraries Version 4.0 for Tru64 UNIX
             DECevent
             Development Enhancement Tools for Tru64 UNIX
      Do you want to continue the Update Installation? (y/n) [y]: y

  10. This section of the installupdate command will allow us to select which kernel options we would like for the new kernel that will eventually be built from the Update Installation. The selections that we have made support the environment in which our cluster is operating. Each Systems Administrator should determine the kernel options that best support his system's unique environment.

      ...
      *** KERNEL OPTION SELECTION ***
          Selection   Kernel Option
      --------------------------------------------------------------
           1          System V Devices
           2          NTP V3 Kernel Phase Lock Loop (NTP_TIME)
           3          Kernel Breakpoint Debugger (KDEBUG)
           4          Packetfilter driver (PACKETFILTER)
           5          IP-in-IP Tunneling (IPTUNNEL)
           6          IP Version 6 (IPV6)
           7          Point-to-Point Protocol (PPP)
           8          STREAMS pckt module (PCKT)
           9          Data Link Bridge (DLPI V2.0 Service Class 1)
          10          X/Open Transport Interface (XTISO, TIMOD, TIRDWR)
          11          Digital Versatile Disk File System (DVDFS)
          12          ISO 9660 Compact Disc File System (CDFS)
          13          Audit Subsystem
          14          All of the above
          15          None of the above
          16          Help
          17          Display all options again
      --------------------------------------------------------------
      Enter your choices, choose an overriding action or press <Return>
      to confirm previous selections.
      Choices (for example, 1 2 4-6): 1 2 3 4 8 11 12 13

      You selected the following kernel options:
             System V Devices
             NTP V3 Kernel Phase Lock Loop (NTP_TIME)
             Kernel Breakpoint Debugger (KDEBUG)
             Packetfilter driver (PACKETFILTER)
             STREAMS pckt module (PCKT)
             Digital Versatile Disk File System (DVDFS)
             ISO 9660 Compact Disc File System (CDFS)
             Audit Subsystem
      Is that correct? (y/n) [y]: y
  11. A check is then made for file type conflicts. If obsolete files are detected, you are given the option to archive them, view them, or continue with the Update Installation. In our case, we were not really interested in archiving or viewing obsolete files. You may choose to do otherwise.

      *** Checking for file type conflicts ***
          Working....
      Obsolete files are files that were shipped with the previous version of
      the operating system that the current version does not require.
      Obsolete files are removed during the Update Installation. To save any
      of these files, archive them now.

          File Administration Menu
          ------------------------
          a) Archive Files
          v) View List of Files
          x) Return to Previous Menu

          Enter your choice: x
      Continuing update install...

  12. The Update Installation again checks to make sure that we have enough space in our file systems. As you can see, the designers of Tru64 UNIX software are being very careful to ensure that there is indeed enough space to perform an Update Installation.

      *** Checking file system space ***
      Update Installation is now ready to begin software load.
      Please check the /var/adm/smlogs/update.log file for errors after the
      installation is complete.
      Do you want to continue the Update Installation? (y/n) [n]: y

  13. The new version of the Operating System files is copied to a predetermined area for faster and easier access during the roll of the other cluster members. The upgraded Tru64 UNIX Operating System is then loaded.

      Copying the new version of the operating system files to
      /var/adm/update/OSKit. This information will be used by the clu_upgrade
      command to roll the remaining cluster members and should not be
      modified in any way. This operation may take a while.
        Working....
      *** Load Tru64 UNIX V5.1A Operating System (Rev 1885) Software Subsets ***
      ...
      *** Starting protofile merges for Tru64 UNIX V5.1A Operating System (Rev 1885) ***
      *** Finished protofile merges for Tru64 UNIX V5.1A Operating System (Rev 1885) ***

  14. Finally, the newly upgraded TruCluster Server software subsets are installed and loaded.

      *** Load Tru64 UNIX TruCluster(TM) Server Software V5.1A (Rev 1312) Software Subsets ***
      3 subsets will be installed.
      Loading subset 1 of 3 ...
      TruCluster Migration Components
         Copying from /var/adm/update/TruClusterKit (disk)
         Verifying
      Loading subset 2 of 3 ...
      TruCluster Reference Pages
         Copying from /var/adm/update/TruClusterKit (disk)
         Verifying
      Loading subset 3 of 3 ...
      TruCluster Base Components
         Copying from /var/adm/update/TruClusterKit (disk)
             Working....Thu Nov 8 13:09:18 PST 2001
         Verifying
      3 of 3 subsets installed successfully.
      *** Starting protofile merges for Tru64 UNIX TruCluster(TM) Server Software V5.1A (Rev 1312) ***
      *** Finished protofile merges for Tru64 UNIX TruCluster(TM) Server Software V5.1A (Rev 1312) ***
      *** Starting configuration merges for Update Install ***
      ...
      Update Installation complete with loading of subsets.
      Rebooting system with Compaq Computer Corporation Tru64 UNIX V5.1A
      generic kernel for configuration phase...
      Removing temporary update installation files...done.
      ...

  15. The lead cluster member will reboot. When the system comes back up, the software subsets are configured on member0 and then on member1. In this instance, member1 is the designated lead cluster member.

    Note

    You may notice that some of the configuration messages state that software subsets are being configured on member0. This refers only to the directory /cluster/members/member0 on the cluster_root file system, not to a cluster member with memberid 0.

  16. After the configuration of all the software subsets, a new kernel will be built for the lead cluster member. The new kernel will be copied in place and the system is again rebooted.

      ...
      rebooting.... (transferring to monitor)
      ...
      The system is ready.

      Compaq Tru64 UNIX V5.1A (Rev. 1885) (molari.gene.com)
      console login: root
      Password:
      ************************************************************************
      The cluster is currently in a rolled state and the software versions
      that are available are different depending on which cluster member you
      are on. Additional information about the exact state of the system can
      be obtained using the /usr/sbin/clu_upgrade command.
      ************************************************************************

  17. At this point in the Update Installation, we are just about done with this stage. The next step is to verify that the Install Stage completed successfully. See section 25.5.5.1.

25.5.4.2 Install Stage for a Patch Kit Installation

We've seen the individual steps required to perform the Install Stage for an Update Installation. Now let's see what it takes to perform the Install Stage of a Patch Kit Installation:

  1. Update the system firmware on the lead cluster member. Please review the Warning in section 25.5.1.3.

  2. We are now ready to install the Patch Kit. Again, the following commands must be executed on the lead member.

    • If installing the Patch Kit from multi-user mode:

        # cd /usr/patch_kit      (or wherever your Patch Kit is located)
        # ./dupatch
    • If installing the Patch Kit from single-user mode:

        # shutdown -sh now
        # bcheckrc
        # rcinet start
        # cd /usr/patch_kit      (or wherever your Patch Kit is located)
        # ./dupatch

    Warning

    While you have a choice of installing Patch Kits in either multi-user or single-user mode, we agree with HP's recommendation that Patch Kit installations be performed in single-user mode. Doing so reduces the risk of another system administrator causing unintentional problems.

    As the contents of every Patch Kit vary from release to release, we won't bore you with the details here but instead refer you to the Patch Kit's Summary and Release Notes for expanded information. It should also be noted that Patch Kits have been known to take anywhere from thirty minutes to two hours to install, depending on the number of patches in the Patch Kit and the type of server you are patching.

  3. Once the dupatch command is complete, reboot the lead cluster member.

25.5.5 Postinstall Stage

The Postinstall Stage verifies that the Install Stage has completed and that the Update Installation, Patch Kit Installation, and/or NHD Kit Installation completed successfully. This stage must be performed on the "lead" member.

 # clu_upgrade postinstall
 This is the cluster upgrade program.
 You have indicated that you want to perform the 'postinstall' stage of
 the upgrade.
 Do you want to continue to upgrade the cluster? [yes]: yes

 The 'postinstall' stage of the upgrade has completed successfully. 

We would now recommend that you test the newly upgraded software before rolling the other cluster members to this new version.

25.5.5.1 Testing the Newly Upgraded TruCluster Server Software

First, let's test the Cluster File System by relocating a cluster file system between individual cluster members. In this example, we have just completed the installupdate on the server molari. The server sheridan has not been rolled to the new version of the TruCluster Server software yet.

 # cfsmgr -v -a server /
   Domain or filesystem name = /
   Server Name = sheridan
   Server Status : OK

 # cfsmgr -h sheridan -r -a SERVER=molari /
 Recovering filesystem mounted at / to this node (member id 1)
 Recovery to this node (member id 1) complete for filesystem mounted at /

 # cfsmgr -v -a server /
   Domain or filesystem name = /
   Server Name = molari
   Server Status : OK
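If you relocate several file systems during testing, the check above can be scripted. The following is a minimal sketch that parses cfsmgr-style output to confirm which member is serving a file system; the `check_server` helper name is our own invention, not a TruCluster command, and here it is fed a captured sample rather than live cfsmgr output.

```shell
#!/bin/sh
# Sketch: verify which member serves a file system by parsing
# cfsmgr-style output. On a real cluster you would pipe in:
#   cfsmgr -v -a server / | check_server molari
# "check_server" is a hypothetical helper, not a TruCluster command.

check_server() {
    # $1 = expected server name; cfsmgr output arrives on stdin
    actual=$(awk -F'= ' '/Server Name/ {print $2}')
    if [ "$actual" = "$1" ]; then
        echo "OK: served by $actual"
    else
        echo "FAIL: served by $actual, expected $1"
    fi
}

# Sample output captured after the relocation shown in the text:
check_server molari <<'EOF'
  Domain or filesystem name = /
  Server Name = molari
  Server Status : OK
EOF
```

The same pattern extends to a loop over /, /usr, and /var when you want to confirm all three are back on the lead member.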

Let's test Cluster Application Availability management. In this example, we will be relocating the CAA service for cluster_lockd from sheridan to molari.

 # caa_relocate cluster_lockd -c molari
 Attempting to stop 'cluster_lockd' on member 'sheridan'
 Stop of 'cluster_lockd' on member 'sheridan' succeeded.
 Attempting to start 'cluster_lockd' on member 'molari'
 cluster NFS Locking:
          cluster rpc.statd started
          cluster rpc.lockd started
 Start of 'cluster_lockd' on member 'molari' succeeded.
 # caa_stat cluster_lockd
 NAME=cluster_lockd
 TYPE=application
 TARGET=ONLINE
 STATE=ONLINE on molari
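Because caa_stat emits simple KEY=VALUE lines, verifying the relocation also scripts easily. This is a sketch under the same caveat as before: `caa_online_on` is a hypothetical helper name, and the sample input is the caa_stat output shown above rather than a live query.

```shell
#!/bin/sh
# Sketch: confirm a CAA resource is ONLINE on the expected member by
# parsing caa_stat's KEY=VALUE output. On a live cluster you would run:
#   caa_stat cluster_lockd | caa_online_on molari
# "caa_online_on" is a hypothetical helper, not part of TruCluster.

caa_online_on() {
    # $1 = expected member; caa_stat output arrives on stdin
    if grep "^STATE=ONLINE on $1\$" >/dev/null; then
        echo ONLINE
    else
        echo "NOT ONLINE"
    fi
}

# Sample caa_stat output from the text:
echo 'NAME=cluster_lockd
TYPE=application
TARGET=ONLINE
STATE=ONLINE on molari' | caa_online_on molari
```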

So what have we really tested here? We have verified that the new TruCluster Server software we upgraded to behaves the same as the old software.

25.5.5.2 Testing Application Software on the Newly Upgraded TruCluster Server Software

The next step is to test the application software on the newly upgraded TruCluster Server software. Let's face it, this is probably the most important part of this chapter: testing and verifying that everything is okay from the standpoint of the user application software. You do not have TruCluster Server installed because it is really cool (or maybe you do), but because of the advantages it provides to you and your user community. These advantages do not mean very much if your users' application software does not work properly.

We strongly recommend that you test your individual applications on the new TruCluster Server software before continuing any further. Please make sure that they run the same on the new TruCluster Server software as they did on the old. Your next question is probably, "How can you compare if you are now running on the new software?" While the lead cluster member is running the new TruCluster Server software, all the other non-rolled cluster members are still operating on the old TruCluster Server software.

25.5.6 Roll Stage

While the lead cluster member was upgraded during the Install Stage, upgrades to the remaining cluster members are performed during the Roll Stage. The Roll Stage is performed individually on each of the remaining cluster members – one at a time and in single-user mode.

The "clu_upgrade roll" command performs the following:

  • Verifies that the member to be rolled is in single-user mode, is not the lead cluster member, and has not been rolled yet.

  • Backs up all the member-specific files for the member to be rolled.

  • Sets up it(8) scripts that will be executed on reboot. These it scripts actually perform the installation and update of the new software.
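The eligibility checks in the list above can be sketched as a small function. This is purely illustrative: the real clu_upgrade command inspects the live cluster itself, whereas here the state is passed in as arguments, and the `can_roll` name is our own invention.

```shell
#!/bin/sh
# Sketch of the eligibility checks "clu_upgrade roll" performs, modeled
# as a standalone function. Arguments stand in for live cluster state:
#   $1 = current runlevel ("S" for single-user mode)
#   $2 = this member's memberid
#   $3 = the lead member's memberid
#   $4 = space-separated list of memberids already rolled
# "can_roll" is a hypothetical name, not a TruCluster command.

can_roll() {
    [ "$1" = "S" ]   || { echo "not in single-user mode"; return 1; }
    [ "$2" != "$3" ] || { echo "lead member rolls in the Install Stage"; return 1; }
    for m in $4; do
        [ "$2" != "$m" ] || { echo "member $2 already rolled"; return 1; }
    done
    echo "member $2 is eligible to roll"
}

can_roll S 2 1 "1"    # member 2, lead is member 1, only 1 rolled so far
```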

The output of the "clu_upgrade roll" command closely mirrors what was done during the Install Stage. If a version update was performed, the output from "clu_upgrade roll" will look a great deal like the output from the installupdate command. The same is true for the installation of a Patch Kit.

Now let's show you what really happens during the Roll Stage:

First we need to shut down this cluster member to upgrade the system firmware. Again, please review the Warning in section 25.5.1.3.

 # shutdown -hs now
 ...
 Halting processes ...
 ...

Next we need to boot the system into single-user mode. Once this is done, we use bcheckrc to check and mount all the file systems, and the "lmf reset" command to reset all the License PAKs and make them active.

 P0>>> boot -fl 0
 INIT: SINGLE-USER MODE
 #
 # /sbin/bcheckrc 

 # lmf reset
 Combine OSF-USR ALS-NQ-2000NOV03-99 with OSF-USR UNIX-SERVER-IMPLICIT-USER

Now let's start the roll or upgrade of this cluster member. Please note that one of the first tasks performed is backing up the member-specific files.

25.5.6.1 Rolling a Cluster Member after the installupdate Program

The output from rolling a cluster member after a version update installation will probably look very familiar. It should, as this is basically what occurred during the installupdate with a few differences at the end. For the sake of not being too redundant, we will note only the differences in this example.

 # clu_upgrade roll
 This is the cluster upgrade program.
 You have indicated that you want to perform the 'roll' stage of the
 upgrade.
 Do you want to continue to upgrade the cluster? [yes]: yes

 Backing up member-specific data for member: 2
 ...
 *** START UPDATE INSTALLATION (Thu Nov 8 13:39:22 PST 2001) ***

     Checking for installed supplemental hardware support...
 Completed check for installed supplemental hardware support
 Checking for retired hardware...done.
 Initializing new version information (OSF)...done
 Initializing new version information (TCR)...done
 Initializing the list of member specific files for member2...done

 Update Installation has detected the following update installable
 products on your system:

        Tru64 UNIX T5.1A-4 Operating System (Rev 1278)
        Tru64 UNIX TruCluster(TM) Server Software X5.1A-4 (Rev 619)

 These products will be updated to the following versions:

        Tru64 UNIX V5.1A Operating System (Rev 1885)
        Tru64 UNIX TruCluster(TM) Server Software V5.1A (Rev 1312)

 It is recommended that you update your system firmware and perform a
 complete system backup before proceeding.

 A log of this update installation can be found at
 /var/adm/smlogs/update.log.

 Do you want to continue the Update Installation? (y/n) []: y

 Do you want to select optional kernel components? (y/n) [n]: y 

 Do you want to archive obsolete files? (y/n) [n]: n

 FLAGS:

 *** Checking for conflicting software ***

 The following software may require reinstallation after the Update
 Installation is completed:

            COMPAQ C++ Version 6.3 for COMPAQ UNIX Systems
            DEC C++ Class Libraries Version 4.0 for Tru64 UNIX
            DECevent
            Development Enhancement Tools for Tru64 UNIX

 Do you want to continue the Update Installation? (y/n) [y]: y

 *** Determining installed Operating System software ***
 *** Determining installed Tru64 UNIX TruCluster(TM) Server Software
     X5.1A-4 (Rev 619) software ***
      Working....
 *** Determining kernel components ***
 *** KERNEL OPTION SELECTION ***

 ...
 *** KERNEL OPTION SELECTION ***

     Selection   Kernel Option
 --------------------------------------------------------------
         1       System V Devices
         2       NTP V3 Kernel Phase Lock Loop (NTP_TIME)
         3       Kernel Breakpoint Debugger (KDEBUG)
         4       Packetfilter driver (PACKETFILTER)
         5       IP-in-IP Tunneling (IPTUNNEL)
         6       IP Version 6 (IPV6)
         7       Point-to-Point Protocol (PPP)
         8       STREAMS pckt module (PCKT)
         9       Data Link Bridge (DLPI V2.0 Service Class 1)
        10       X/Open Transport Interface (XTISO, TIMOD, TIRDWR)
        11       Digital Versatile Disk File System (DVDFS)
        12       ISO 9660 Compact Disc File System (CDFS)
        13       Audit Subsystem
        14       All of the above
        15       None of the above
        16       Help
        17       Display all options again
 --------------------------------------------------------------
 Choices (for example, 1 2 4-6): 1 2 3 4 8 11 12 13

 You selected the following kernel options:

        System V Devices
        NTP V3 Kernel Phase Lock Loop (NTP_TIME)
        Kernel Breakpoint Debugger (KDEBUG)
        Packetfilter driver (PACKETFILTER)
        STREAMS pckt module (PCKT)
        Digital Versatile Disk File System (DVDFS)
        ISO 9660 Compact Disc File System (CDFS)
        Audit Subsystem

 Is that correct? (y/n) [y]: y

 *** Checking for file type conflicts ***
 *** Checking for obsolete files ***
 *** Checking file system space ***

 Update Installation is now ready to begin modifying the files necessary
 to reboot the cluster member off of the new OS.  Please check the
 /var/adm/smlogs/update.log and /var/adm/smlogs/it.log files for errors
 after the installation is complete.

 Do you want to continue the Update Installation? (y/n) [n]: y

 *** Starting configuration merges for Update Install ***

Up to this point, it would be rather hard to differentiate this output from the output from installupdate. This next section of output is unique for the Roll Stage.

 The critical files needed for reboot have been moved into place.  The
 system will now reboot with the generic kernel for Compaq Computer
 Corporation Tru64 UNIX V5.1A and complete the rolling upgrade for this
 member (member2).

 The 'roll' stage has completed successfully.
 This member must be rebooted in order to run with the newly installed
 software.
 Do you want to reboot this member at this time? []: yes
 You indicated that you want to reboot this member at this time.
 Is that correct? [yes]: yes
 The 'roll' stage of the upgrade has completed successfully.

As soon as the cluster member finishes rebooting, it configures the individual software subsets for member2.

After the configuration of all the software subsets is complete, a new kernel is built for this cluster member. The new kernel is copied into place, and the system is rebooted.

 Saving /sys/conf/SHERIDAN as /sys/conf/SHERIDAN.bck

 The system will now automatically build a kernel
      with the selected options and then reboot.  This can take
      up to 15 minutes, depending on the processor type.

 *** PERFORMING KERNEL BUILD ***
 ...
 System rebooting

This cluster member is now running on the new TruCluster Server software.

25.5.6.2 Rolling a Cluster Member after the dupatch Program

The output of a roll of a cluster member after a Patch Kit Installation is much simpler than a roll after an Update Installation.

 # clu_upgrade roll
 This is the cluster upgrade program.
 You have indicated that you want to perform the 'roll' stage of the
 upgrade.
 Do you want to continue to upgrade the cluster? [yes]: yes

 Backing up member-specific data for member: 2 ....
 ...
 The 'roll' stage has completed successfully.
 This member must be rebooted in order to run with the newly installed
 software.
 Do you want to reboot this member at this time? []: yes

 You indicated that you want to reboot this member at this time.
 Is that correct? [yes]: yes

After the cluster node is rebooted, the newly patched software subsets are installed and configured. Finally, a new kernel is built and copied into place. The cluster member is rebooted again, and this time, when the system comes back up, it will do so on the newly patched TruCluster Server software.

25.5.6.3 A Final Word on Rolling Cluster Members

The Roll Stage is not complete until every cluster member, except the lead cluster member, has been rolled. If a cluster member goes down and cannot be rebooted before all cluster members are rolled, it is recommended that this cluster member be deleted from the cluster. You can always add it back after the Rolling Upgrade is complete and the system is repaired.

25.5.7 Switch Stage

The Switch Stage is where we actually turn on any new software features installed during the Install Stage. From the completion of the Install Stage until the completion of the Roll Stage, the cluster is actually operating on two different versions of the operating system and TruCluster Server software. One of the ways the cluster manages this is by keeping active features as compatible as possible between the different versions: any and all newly installed features are effectively "turned off," or disabled, until the entire cluster is at the same version of the software.

In detail, let's see what happens when the "clu_upgrade switch" command is executed:

  • First, it verifies that all cluster members have been rolled and that they are all operating off the same version of the operating system and the TruCluster Server software.

  • The new version ID of the operating system and cluster software is then set in each cluster member's /etc/sysconfigtab file. This version ID corresponds to the running kernel.

The "clu_upgrade switch" command is executed in multi-user mode and can be run from any cluster member, but it is executed only once, on a single node of the cluster.

 # clu_upgrade switch
 This is the cluster upgrade program.
 You have indicated that you want to perform the 'switch' stage of the
 upgrade.
 Do you want to continue to upgrade the cluster? [yes]: yes

 Initiating version switch on cluster members
 ...
 The cluster upgrade 'switch' stage has completed successfully.
 All cluster members must be rebooted before running the 'clean' command.

After the "clu_upgrade switch" command completes, every member in the cluster must be rebooted, one at a time. For the convenience of the system administrator and the users, these reboots can be spread out over time.
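A minimal sketch of sequencing those reboots is shown below. The member names are the examples from this chapter, the remote-shell mechanism is an assumption, and the DRYRUN guard only prints the commands; on a real cluster you would clear it, reboot one member, and wait for it to rejoin before touching the next.

```shell
#!/bin/sh
# Dry-run sketch: reboot cluster members one at a time after the switch
# stage. Member names (molari, sheridan) and the use of rsh are
# illustrative assumptions, not prescribed by clu_upgrade.
DRYRUN=echo    # clear this variable to actually issue the commands

for member in molari sheridan; do
    $DRYRUN rsh "$member" shutdown -r now
    # ...wait here for $member to rejoin the cluster before continuing...
done
```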

Caution

It should be noted that as soon as the Switch Stage is completed and the "switch is thrown," you can no longer issue any "clu_upgrade undo" commands.

25.5.8 Clean Stage

The Clean Stage is the final stage of the Rolling Upgrade. The "clu_upgrade clean" command performs the following:

  • Verifies that the Switch Stage has been completed.

  • Removes all the tagged (.Old..) files.

  • Removes all the on-disk backups that were created by the clu_upgrade command.

  • Removes the Kit installation directories: /var/adm/update/TruClusterKit, /var/adm/update/OSKit, and/or /var/adm/update/NHDKit.

  • Creates a directory for the upgrade just completed in /cluster/admin/clu_upgrade/history/release_version. This directory contains the log files for each stage of the upgrade.
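After the clean stage runs, it is cheap to spot-check that the tagged files named in the first bullet are really gone. The sketch below demonstrates the idea against a scratch directory rather than a live cluster; the directory path and file names are made up for the example.

```shell
#!/bin/sh
# Sketch: verify that tagged (.Old..) files were removed. On a real
# cluster you would point find at the file systems clu_upgrade tagged;
# here we simulate one leftover tagged file in a scratch directory.

scratch=/tmp/tagcheck.$$
mkdir -p "$scratch"
touch "$scratch/.Old..vmunix" "$scratch/vmunix"   # one tagged, one normal

before=$(find "$scratch" -name '.Old..*' | wc -l | tr -d ' \t')
rm -f "$scratch"/.Old..*                          # what the clean stage does
after=$(find "$scratch" -name '.Old..*' | wc -l | tr -d ' \t')

echo "tagged files before: $before, after: $after"
rm -rf "$scratch"
```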

The following is an example of output from the "clu_upgrade clean" command:

 # clu_upgrade clean
 This is the cluster upgrade program.
 You have indicated that you want to perform the 'clean' stage of the
 upgrade.
 Do you want to continue to upgrade the cluster? [yes]: yes

 Deleting tagged files.
 ....................................................................
 Removing back-up and kit files

 The Update Administration Utility is typically run after an update
 installation to manage the files that are saved during an update
 installation.

 Do you want to run the Update Administration Utility at this time? [yes]: yes

 The Update Installation Cleanup utility is used to clean up backup
 files created by Update Installation.  Update Installation can create
 two types of files: .PreUPD and .PreMRG.

 The .PreUPD files are copies of unprotected customized system files as
 they existed prior to running Update Installation.

 The .PreMRG files are copies of protected system files as they existed
 prior to running Update Installation.

At this point, the cluster is now operating on the newly upgraded software.

[3]Compaq Support Blitz TD 2807-C.

[4]For more information on Selective Storage Presentation, we refer you to the Compaq StorageWorks manual for the HSG80 Controller (ACS Manual).

[5]For more information on tagged files, please see section 25.5.2.1.

[6]This is a feature of TruCluster Server version 5.1A or later.




TruCluster Server Handbook
TruCluster Server Handbook (HP Technologies)
ISBN: 1555582591
Year: 2005
Pages: 273
