A Simple Example | HP-UX 11i Systems Administration Handbook and Toolkit (2nd Edition)

Let's go through a simple ServiceGuard example. Although this example is simple, it demonstrates all of the fundamental steps in setting up a ServiceGuard cluster and getting an application package running. This example is derived from some labs that are part of the HP ServiceGuard training. If you think ServiceGuard applies to your environment after reviewing this example, I would recommend taking the ServiceGuard I and II courses to gain a thorough understanding of ServiceGuard. The example consists of the following three high-level steps:

Create a shared volume group, vg01.
Create a ServiceGuard cluster.
Create an application package.

Let's now proceed to the first of the three steps and create a shared volume group.

Set Up a Shared Volume Group

Our goal in this section is to have a shared volume group that both of the systems in our ServiceGuard cluster can access. Each of the two systems starts with HP-UX installed and a volume group vg00 that is exclusive to each system; that is, it is not shared. We'll name our shared volume group vg01.

vg01 will consist of two disks with mirrored data to provide data protection. The disks are also on two different buses to provide protection against a bus failure. Background on the Logical Volume Manager commands we issue to create the shared volume group is covered in Chapter 3, so I won't provide a lot of comments along with the following steps to create and share vg01. The two systems used throughout the upcoming examples have long system names, so I just use a prompt with ny3c15 for the first system and ny3c16 for the second system in the cluster.

The first step is to create the directory and group file for vg01. Next we run ioscan to view all of the disks on the system. We then run vgdisplay to see which disk is used by vg00 so that we don't use it for vg01. After selecting the two disks that we want to be in vg01, we run pvcreate on them and include them in vg01 with vgcreate, as shown in the following steps:

ny3c15# mkdir /dev/vg01
ny3c15# mknod /dev/vg01/group c 64 0x010000
ny3c15# ioscan -funC disk
Class     I  H/W Path    Driver      S/W State H/W Type  Description
=====================================================================
disk      0  8/0.5.0     sdisk       CLAIMED   DEVICE    SEAGATE ST34572WC
                         /dev/dsk/c0t5d0   /dev/rdsk/c0t5d0
disk      1  8/0.8.0     sdisk       CLAIMED   DEVICE    SEAGATE ST34371W
                         /dev/dsk/c0t8d0   /dev/rdsk/c0t8d0
disk      2  8/4.13.0    sdisk       CLAIMED   DEVICE    SEAGATE ST32171W
                         /dev/dsk/c1t13d0   /dev/rdsk/c1t13d0
disk      3  8/4.14.0    sdisk       CLAIMED   DEVICE    SEAGATE ST32171W
                         /dev/dsk/c1t14d0   /dev/rdsk/c1t14d0
disk      4  8/4.15.0    sdisk       CLAIMED   DEVICE    SEAGATE ST32171W
                         /dev/dsk/c1t15d0   /dev/rdsk/c1t15d0
disk      5  8/8.1.0     sdisk       CLAIMED   DEVICE    SEAGATE ST32171W
                         /dev/dsk/c2t1d0   /dev/rdsk/c2t1d0
disk      6  8/8.10.0    sdisk       CLAIMED   DEVICE    SEAGATE ST32171W
                         /dev/dsk/c2t10d0   /dev/rdsk/c2t10d0
disk      7  8/8.11.0    sdisk       CLAIMED   DEVICE    SEAGATE ST32171W
                         /dev/dsk/c2t11d0   /dev/rdsk/c2t11d0
disk      8  8/16/5.2.0  sdisk       CLAIMED   DEVICE    CD-ROM XM-5701TA
                         /dev/dsk/c3t2d0   /dev/rdsk/c3t2d0
ny3c15# vgdisplay -v /dev/vg00 | tail -9
--- Physical volumes ---
   PV Name                     /dev/dsk/c0t5d0
   PV Status                   available
   Total PE                    1023
   Free PE                     38
   Autoswitch                  On
ny3c15# pvcreate -f /dev/rdsk/c1t13d0
Physical volume "/dev/rdsk/c1t13d0" has been successfully created.
ny3c15# pvcreate -f /dev/rdsk/c2t1d0
Physical volume "/dev/rdsk/c2t1d0" has been successfully created.
ny3c15# vgcreate /dev/vg01 /dev/dsk/c1t13d0 /dev/dsk/c2t1d0
Volume group "/dev/vg01" has been successfully created.
Volume Group configuration for /dev/vg01 has been saved in /etc/lvmconf/vg01.conf
ny3c15# strings /etc/lvmtab
/dev/vg00
-=8%#
/dev/dsk/c0t5d0
/dev/vg01
-==sW
/dev/dsk/c1t13d0
/dev/dsk/c2t1d0
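The group file's minor number in the listing follows the pattern 0xNN0000, where NN is a number unique to each volume group (01 for vg01). As a minimal sketch of that pattern, the following shell functions derive the minor number and print, without running, the creation commands used above. The function names vg_group_minor and print_vg_setup are hypothetical, and the sketch assumes volume groups are named vgNN after their number.

```shell
#!/usr/bin/sh
# Sketch only: helper names are hypothetical and nothing here touches LVM.

# Print the group-file minor number (0xNN0000) for a volume group number.
vg_group_minor() {
    printf '0x%02x0000\n' "$1"
}

# Print the commands used to create a volume group, without running them.
# Usage: print_vg_setup <vg-number> <disk> [<disk> ...]   (disk = cXtYdZ)
print_vg_setup() {
    num=$1; shift
    vg=$(printf 'vg%02d' "$num")      # assumed naming: vg01, vg02, ...
    echo "mkdir /dev/$vg"
    echo "mknod /dev/$vg/group c 64 $(vg_group_minor "$num")"
    disks=""
    for d in "$@"; do
        echo "pvcreate -f /dev/rdsk/$d"    # raw device for pvcreate
        disks="$disks /dev/dsk/$d"         # block device for vgcreate
    done
    echo "vgcreate /dev/$vg$disks"
}
```

Calling print_vg_setup 1 c1t13d0 c2t1d0 prints the same mkdir, mknod, pvcreate, and vgcreate commands that were typed by hand in the listing, which makes it easy to review the plan before running anything.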

Note that all of the steps performed in this procedure so far were run on system ny3c15. Now we make vg01 inactive with vgchange, run vgexport to create the map file /tmp/lvm_map, and copy it to the second system. We then create /dev/vg01 and run vgimport on the second system, as shown in the following procedure:

ny3c15# vgchange -a n vg01
Volume group "vg01" has been successfully changed.
ny3c15# vgexport -p -s -m /tmp/lvm_map /dev/vg01
ny3c15# rcp /tmp/lvm_map ny3c16:/tmp/lvm_map
---------------------------------
ny3c16# mkdir /dev/vg01
ny3c16# mknod /dev/vg01/group c 64 0x010000
ny3c16# vgimport -s -m /tmp/lvm_map /dev/vg01
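The export, copy, and import steps all have to name the same map file, so one way to avoid a mismatch is to hold the name in a single variable. The following is a sketch under that assumption: print_vg_share is a hypothetical helper that only prints the commands for both nodes, and its default map-file name is made up for illustration.

```shell
#!/usr/bin/sh
# Sketch only: prints the commands for sharing a volume group; runs nothing.
# The function name and the default map-file name are hypothetical.
# Usage: print_vg_share <vg> <minor> <remote-node> [<map-file>]
print_vg_share() {
    vg=$1; minor=$2; node=$3
    map=${4:-/tmp/${vg}_map}     # one name for export, copy, and import
    echo "# on the first node"
    echo "vgchange -a n $vg"
    echo "vgexport -p -s -m $map /dev/$vg"
    echo "rcp $map $node:$map"
    echo "# on $node"
    echo "mkdir /dev/$vg"
    echo "mknod /dev/$vg/group c 64 $minor"
    echo "vgimport -s -m $map /dev/$vg"
}
```

For example, print_vg_share vg01 0x010000 ny3c16 /tmp/lvm_map prints the sequence shown in the listing, with the same minor number used for the group file on both systems.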

We now have vg01 set up on both of the systems we'll use for our ServiceGuard cluster. The next section covers creating the ServiceGuard cluster.

Creating an MC/ServiceGuard Cluster

Now that we have our shared volume group vg01 set up, we can use it in our ServiceGuard cluster. The following procedure includes several ServiceGuard commands. I'll include comments for some of the cluster-related commands since they have not yet been covered. Assuming that the ServiceGuard software is loaded, we can create the file /etc/cmcluster/cmclnodelist to include the two nodes in our cluster and then query these two nodes with cmquerycl to create a template cluster configuration file:

ny3c15# cat /etc/cmcluster/cmclnodelist     # also copy (rcp) this to ny3c16
ny3c15            root
ny3c16            root
ny3c15# cd /etc/cmcluster
ny3c15# cmquerycl -n ny3c15 -n ny3c16 -C conf.ascii
Warning: The disk at /dev/dsk/c3t2d0 on node ny3c15 does not have an ID.
Warning: The disk at /dev/dsk/c3t2d0 on node ny3c16 does not have an ID.
Warning: Disks which do not have IDs cannot be included in the topology description.
Use pvcreate(1m) to give a disk an ID.
Node Names:    ny3c15
               ny3c16
Bridged networks:
1       lan0           (ny3c15)
        lan1           (ny3c15)
2       lan0           (ny3c16)
        lan1           (ny3c16)
IP subnets:
156.153.202.0          lan0  (ny3c15)
                       lan0  (ny3c16)
Possible Heartbeat IPs:
156.153.202.0                     156.153.202.105     (ny3c15)
                                  156.153.202.106     (ny3c16)
Possible Cluster Lock Devices:
/dev/dsk/c1t13d0   /dev/vg01            30 seconds
/dev/dsk/c2t1d0    /dev/vg01            30 seconds
LVM volume groups:
/dev/vg00               ny3c15
/dev/vg01               ny3c15
                        ny3c16
/dev/vg00               ny3c16
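Both the node list and the cmquerycl command line are driven by the same set of node names. As a small sketch, the following hypothetical helpers (make_nodelist and print_cmquerycl are names invented here) build the node-list entries and the query command from one list of nodes. The commands are only printed, and the root user is assumed for every entry, as in the listing.

```shell
#!/usr/bin/sh
# Sketch only: helper names are hypothetical; nothing is written or executed.

# Print one "node  root" entry per node, in the format shown in the listing.
make_nodelist() {
    for node in "$@"; do
        printf '%s\troot\n' "$node"
    done
}

# Print the cmquerycl command for the given output file and nodes.
# Usage: print_cmquerycl <output-file> <node> [<node> ...]
print_cmquerycl() {
    out=$1; shift
    args=""
    for node in "$@"; do
        args="$args -n $node"
    done
    echo "cmquerycl$args -C $out"
}
```

Redirecting the output of make_nodelist ny3c15 ny3c16 into the node-list file on each system, and then running the command printed by print_cmquerycl conf.ascii ny3c15 ny3c16, reproduces the steps above.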

You can see that cmquerycl provides a lot of information about the initial cluster components. The file /etc/cmcluster/conf.ascii is a template on which the cluster is based. You can edit this file to make any changes you like, such as changing the name of the cluster (which we'll call web_server in our example), the maximum number of application packages that can be configured, and many other parameters. The following listing shows the conf.ascii file for our example:

# cat conf.ascii
# **********************************************************************
# ********* HIGH AVAILABILITY CLUSTER CONFIGURATION FILE ***************
# ***** For complete details about cluster parameters and how to    ****
# ***** set them, consult the cmquerycl(1m) manpage or your manual. ****
# **********************************************************************
# Enter a name for this cluster. This name will be used to identify the
# cluster when viewing or manipulating it.
CLUSTER_NAME             web_server
# Cluster Lock Device Parameters. This is the volume group that
# holds the cluster lock which is used to break a cluster formation
# tie. This volume group should not be used by any other cluster
# as cluster lock device.
FIRST_CLUSTER_LOCK_VG    /dev/vg01
# Definition of nodes in the cluster.
# Repeat node definitions as necessary for additional nodes.
NODE_NAME                ny3c15
  NETWORK_INTERFACE      lan0
    HEARTBEAT_IP         156.153.202.105
  NETWORK_INTERFACE      lan1
  FIRST_CLUSTER_LOCK_PV  /dev/dsk/c1t13d0
# List of serial device file names
# For example:
# SERIAL_DEVICE_FILE     /dev/tty0p0
# Possible standby Network Interfaces for lan0: lan1.
NODE_NAME                ny3c16
  NETWORK_INTERFACE      lan0
    HEARTBEAT_IP         156.153.202.106
  NETWORK_INTERFACE      lan1
  FIRST_CLUSTER_LOCK_PV  /dev/dsk/c1t13d0
# List of serial device file names
# For example:
# SERIAL_DEVICE_FILE     /dev/tty0p0
# Possible standby Network Interfaces for lan0: lan1.
# Cluster Timing Parmeters (microseconds).
HEARTBEAT_INTERVAL       1000000
NODE_TIMEOUT             2000000
# Configuration/Reconfiguration Timing Parameters (microseconds).
AUTO_START_TIMEOUT       600000000
NETWORK_POLLING_INTERVAL 2000000
# Package Configuration Parameters.
# Enter the maximum number of packages which will be configured in the cluster.
# You can not add packages beyond this limit.
# This parameter is required.
MAX_CONFIGURED_PACKAGES  7
# List of cluster aware Volume Groups. These volume groups will
# be used by package applications via the vgchange -a e command.
# For example:
# VOLUME_GROUP /dev/vgdatabase.
# VOLUME_GROUP /dev/vg02.
VOLUME_GROUP             /dev/vg01
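The timing parameters in conf.ascii are given in microseconds, as the comments in the file state, so HEARTBEAT_INTERVAL 1000000 is a one-second heartbeat and NODE_TIMEOUT 2000000 is a two-second timeout. The following sketch converts any such keyword to seconds. conf_seconds is a hypothetical helper, and it assumes each keyword and its value are separated by whitespace on one line.

```shell
#!/usr/bin/sh
# Sketch only: conf_seconds is a hypothetical helper name.
# Read a cluster configuration file on stdin and print the value of one
# timing keyword converted from microseconds to seconds.
# Exits non-zero if the keyword is not set in the file.
# Usage: conf_seconds <KEYWORD> < conf.ascii
conf_seconds() {
    awk -v key="$1" '
        # Comment lines start with "#", so their first field never matches.
        $1 == key && $2 ~ /^[0-9]+$/ { print $2 / 1000000; found = 1 }
        END { exit !found }
    '
}
```

Running conf_seconds AUTO_START_TIMEOUT against the file above reports 600, that is, ten minutes for the cluster to form automatically.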

Now that we've made the changes to conf.ascii, we'll run cmcheckconf, which will perform a verification of our changes. With no errors found in our verification, we can apply our configuration with cmapplyconf after we activate vg01. We can then bring up the cluster with cmruncl, as shown in the following listing:

ny3c15# cmcheckconf -C /etc/cmcluster/conf.ascii
Begin cluster verification...
Adding node ny3c15 to cluster web_server.
Adding node ny3c16 to cluster web_server.
Verification completed with no errors found.
Use the cmapplyconf command to apply the configuration.
ny3c15# vgchange -a y vg01
Activated volume group
Volume group "vg01" has been successfully changed.
ny3c15# cmapplyconf -C /etc/cmcluster/conf.ascii
Begin cluster verification...
Adding configuration to node ny3c15
Adding configuration to node ny3c16
Adding node ny3c15 to cluster web_server.
Adding node ny3c16 to cluster web_server.
Completed the cluster creation.
ny3c15# cmruncl
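The order of these commands matters: the configuration should be applied only if verification succeeds, and the cluster started only if the apply succeeds. A minimal sketch of that gating, using a hypothetical run_steps helper that stops at the first failing command, might look like this. It assumes each command reports failure through a non-zero exit status.

```shell
#!/usr/bin/sh
# Sketch only: run_steps is a hypothetical helper name.
# Run each command line in order and stop at the first one that fails.
# Usage: run_steps "<command line>" ["<command line>" ...]
run_steps() {
    for step in "$@"; do
        echo "+ $step"
        sh -c "$step" || {
            echo "step failed: $step" >&2
            return 1
        }
    done
}
```

On the first node this could be called as run_steps "cmcheckconf -C /etc/cmcluster/conf.ascii" "vgchange -a y vg01" "cmapplyconf -C /etc/cmcluster/conf.ascii" "cmruncl", so that a verification error prevents the later steps from running.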

We received no errors and everything ran smoothly, so we can run cmviewcl to view our cluster and see if indeed it has formed:

ny3c15# cmviewcl -v
CLUSTER      STATUS
web_server   up
  NODE         STATUS       STATE
  ny3c15       up           running
    Network_Parameters:
    INTERFACE    STATUS       PATH         NAME
    PRIMARY      up           8/16/6       lan0
    STANDBY      up           8/20/5/1     lan1
  NODE         STATUS       STATE
  ny3c16       up           running
    Network_Parameters:
    INTERFACE    STATUS       PATH         NAME
    STANDBY      up           8/20/5/1     lan1
    PRIMARY      up           8/16/6       lan0
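Reading the node status by eye works for two nodes, but the same check can be scripted. The following sketch parses cmviewcl -v output from standard input. It assumes the layout shown above, in which each NODE/STATUS header line is followed by a line giving the node name and its status. The helper names node_status and all_nodes_up are hypothetical.

```shell
#!/usr/bin/sh
# Sketch only: helper names are hypothetical, and the parsing assumes the
# cmviewcl -v layout shown in the listing.

# Print "node status" pairs from cmviewcl -v output read on stdin.
node_status() {
    awk '
        NF == 0 { next }                          # skip blank lines
        grab    { print $1, $2; grab = 0 }        # line after a NODE header
        $1 == "NODE" && $2 == "STATUS" { grab = 1 }
    '
}

# Succeed only if at least one node is listed and every node is "up".
all_nodes_up() {
    node_status | awk '
        { n++ }
        $2 != "up" { bad = 1 }
        END { exit (n == 0 || bad) }
    '
}
```

With these defined, cmviewcl -v | all_nodes_up returns success for the output above, since both ny3c15 and ny3c16 report up.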

The cmviewcl output indicates that our two nodes are up and running smoothly. At this point we have a fully operational ServiceGuard cluster; however, there are no applications that are protected by the cluster. Now let's proceed to the third step, where we'll cover creating an application package.

Create a Highly Available Package

The procedure to set up a package is lengthy, so I'll show only the key parts of the configuration for the Web server used in this example.

The first step for our particular example is to add an IP address for the Web server to the /etc/hosts file. After this is complete, we need to perform some LVM-related work to create vg02, which will be used as a shared volume group for our highly available package. I'll show only the commands issued to create vg02, since the procedure is similar to the one used for vg01 earlier. The main difference is that after vg02 is created, a mirrored logical volume is put on it that will be used for our highly available application.

ny3c15# pvcreate -f /dev/rdsk/c1t14d0
Physical volume "/dev/rdsk/c1t14d0" has been successfully created.
ny3c15# pvcreate -f /dev/rdsk/c2t10d0
Physical volume "/dev/rdsk/c2t10d0" has been successfully created.
ny3c15# mkdir /dev/vg02
ny3c15# mknod /dev/vg02/group c 64 0x020000
ny3c15# vgcreate /dev/vg02 /dev/dsk/c1t14d0 /dev/dsk/c2t10d0
Volume group "/dev/vg02" has been successfully created.
Volume Group configuration for /dev/vg02 has been saved in /etc/lvmconf/vg02.conf
ny3c15# lvcreate -L 200 -n stk /dev/vg02
Logical volume "/dev/vg02/stk" has been successfully created with
character device "/dev/vg02/rstk".
Logical volume "/dev/vg02/stk" has been successfully extended.
Volume Group configuration for /dev/vg02 has been saved in /etc/lvmconf/vg02.conf
ny3c15# lvextend -m 1 /dev/vg02/stk
The newly allocated mirrors are now being synchronized.
This operation will take some time. Please wait ....
Logical volume "/dev/vg02/stk" has been successfully extended.
Volume Group configuration for /dev/vg02 has been saved in /etc/lvmconf/vg02.conf
ny3c15# newfs -F vxfs /dev/vg02/rstk
    version 3 layout
    204800 sectors, 204800 blocks of size 1024, log size 1024 blocks
    unlimited inodes, 204800 data blocks, 203656 free data blocks
    7 allocation units of 32768 blocks, 32768 data blocks
    last allocation unit has 8192 data blocks
    first allocation unit starts at block 0
    overhead per allocation unit is 0 blocks
ny3c15# mkdir /opt/STK
ny3c15# mount /dev/vg02/stk /opt/STK
ny3c15# bdf
Filesystem          kbytes    used   avail %used Mounted on
/dev/vg00/lvol3      86016   42282   41006   51% /
/dev/vg00/lvol1      67733   17057   43902   28% /stand
/dev/vg00/lvol10    200704  189924   10780   95% /var
/dev/vg00/lvol9     471040  434239   34544   93% /usr
/dev/vg00/lvol8    1134592  924276  197172   82% /u01
/dev/vg00/lvol7      49152    1522   44657    3% /tmp
/dev/vg00/lvol6     577536  449719  119887   79% /opt
/dev/vg00/lvol5     172032  139466   30538   82% /home
/dev/vg00/lvol4     659456  595106   60350   91% /CD_OPS816
/dev/vg02/stk       204800    1157  190923    1% /opt/STK
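The sizes in this listing agree with one another: lvcreate -L takes a size in megabytes, and 200 MB is 204800 KB, which is the figure that newfs and bdf report. Adding one mirror copy with lvextend -m 1 doubles the disk space consumed. The arithmetic can be sketched as follows. The helper names are hypothetical, and the 4 MB extent size is an assumption (it is the usual default when vgcreate is run without -s, and can be confirmed with vgdisplay).

```shell
#!/usr/bin/sh
# Sketch only: helper names are hypothetical; the 4 MB physical extent
# size is an assumed default, not a value taken from the listing.

# Convert a logical volume size in MB to the KB figure reported by bdf.
lv_kbytes() {
    echo $(( $1 * 1024 ))
}

# Physical extents consumed by a logical volume.
# Usage: lv_extents <size-mb> <mirror-copies> [<pe-size-mb>]
lv_extents() {
    size=$1; copies=$2; pe=${3:-4}
    le=$(( (size + pe - 1) / pe ))      # logical extents, rounded up
    echo $(( le * (copies + 1) ))       # original plus each mirror copy
}
```

Under those assumptions, lv_kbytes 200 gives 204800, and lv_extents 200 1 gives 100 extents spread across the two disks in vg02.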

The /opt/STK file system will be used for our highly available application. Both systems in the cluster will have access to vg02, since it is part of our highly available application.

Next, we install our application in /opt/STK and customize it. The Web server configuration file is customized to include some information peculiar to the application as it will run in our example. Then the httpd.conf file is updated to include the name of the package, which we'll call web_stk.

After our application is installed and customized on /dev/vg02, we can proceed with some additional LVM work to make vg02 accessible on the second system in our cluster:

ny3c16# mkdir /dev/vg02
ny3c16# mknod /dev/vg02/group c 64 0x020000
ny3c15# umount /opt/STK
ny3c15# vgchange -a n vg02
ny3c15# vgexport -p -s -m /tmp/lvm_map /dev/vg02
Beginning the export process on Volume Group "/dev/vg02".
/dev/dsk/c1t14d0
/dev/dsk/c2t10d0
ny3c15# rcp /tmp/lvm_map ny3c16:/tmp/lvm_map

On the second system, we'll import the volume group and access it to ensure that it works:

ny3c16# vgimport -s -m /tmp/lvm_map /dev/vg02
vgimport: Warning:  Volume Group belongs to different CPU ID.
Can not determine if Volume Group is in use on another system. Continuing.
Warning: A backup of this volume group may not exist on this machine.
Please remember to take a backup using the vgcfgbackup command after activating the volume group.
ny3c16# vgchange -a y vg02
Activated volume group
Volume group "vg02" has been successfully changed.
ny3c16# mkdir /opt/STK
ny3c16# mount -o ro /dev/vg02/stk /opt/STK
ny3c16# bdf
Filesystem          kbytes    used   avail %used Mounted on
/dev/vg00/lvol3      86016   41978   41293   50% /
/dev/vg00/lvol1      67733   17057   43902   28% /stand
/dev/vg00/lvol10    200704  189896   10808   95% /var
/dev/vg00/lvol9     471040  434286   34500   93% /usr
/dev/vg00/lvol8    1134592  924276  197172   82% /u01
/dev/vg00/lvol7      49152    1522   44657    3% /tmp
/dev/vg00/lvol6     577536  449719  119887   79% /opt
/dev/vg00/lvol5     172032  139466   30538   82% /home
/dev/vg00/lvol4     659456  595106   60350   91% /CD_OPS816
/dev/vg02/stk       204800   93528  104318   47% /opt/STK
ny3c16# umount /opt/STK
ny3c16# vgchange -a n vg02
Volume group "vg02" has been successfully changed.

Note that some of the previous commands were performed on ny3c15 and others on ny3c16. The read-only mount at the end is a test to ensure that vg02 is accessible on system ny3c16.

Now we're ready to perform some package-related work. First we'll create and edit the configuration file for the Web server package, which we'll call web_stk:

ny3c15# mkdir /etc/cmcluster/web_stk
ny3c15# cd /etc/cmcluster/web_stk
ny3c15# cmmakepkg -p web_stk.conf
Package control script is created.
This file must be edited before it can be used.

The file we've created is a template that we can edit to include all of the characteristics of our package. The -p option specifies that a package configuration template is to be created. Shortly, we'll use the cmmakepkg command with the -s option to create a control script.

The web_stk.conf file needs to include all of the information related to our package. We provide a lot of information related to the package, such as the names of the two nodes, the run and halt scripts for the package, the service name, and so on. The following is the full listing of the file, with arrows appearing next to the lines that were modified to include information specific to our cluster:

# **********************************************************************
# ****** HIGH AVAILABILITY PACKAGE CONFIGURATION FILE (template) *******
# **********************************************************************
# ******* Note: This file MUST be edited before it can be used. ********
# * For complete details about package parameters and how to set them, *
# * consult the MC/ServiceGuard or MC/LockManager manpages or manuals. *
# **********************************************************************
# Enter a name for this package.  This name will be used to identify the
# package when viewing or manipulating it.  It must be different from
# the other configured package names.
PACKAGE_NAME       web_stk  <--
# Enter the failover policy for this package. This policy will be used
# to select an adoptive node whenever the package needs to be started.
# The default policy unless otherwise specified is CONFIGURED_NODE.
# This policy will select nodes in priority order from the list of
# NODE_NAME entries specified below.
#
# The alternative policy is MIN_PACKAGE_NODE. This policy will select
# the node, from the list of NODE_NAME entries below, which is
# running the least number of packages at the time this package needs
# to start.
FAILOVER_POLICY    CONFIGURED_NODE
# Enter the failback policy for this package. This policy will be used
# to determine what action to take when a package is not running on
# its primary node and its primary node is capable of running the
# package. The default policy unless otherwise specified is MANUAL.
# The MANUAL policy means no attempt will be made to move the package
# back to its primary node when it is running on an adoptive node.
#
# The alternative policy is AUTOMATIC. This policy will attempt to
# move the package back to its primary node whenever the primary node
# is capable of running the package.
FAILBACK_POLICY    MANUAL
# Enter the names of the nodes configured for this package.  Repeat
# this line as necessary for additional adoptive nodes.
# Order IS relevant.  Put the second Adoptive Node AFTER the first
# one.
# Example : NODE_NAME  original_node
#           NODE_NAME  adoptive_node
NODE_NAME          ny3c15  <--
NODE_NAME          ny3c16  <--
# Enter the complete path for the run and halt scripts.  In most cases
# the run script and halt script specified here will be the same script,
# the package control script generated by the cmmakepkg command.  This
# control script handles the run(ning) and halt(ing) of the package.
# If the script has not completed by the specified timeout value,
# it will be terminated.  The default for each script timeout is
# NO_TIMEOUT.  Adjust the timeouts as necessary to permit full
# execution of each script.
# Note: The HALT_SCRIPT_TIMEOUT should be greater than the sum of
# all SERVICE_HALT_TIMEOUT specified for all services.
RUN_SCRIPT         /etc/cmcluster/web_stk/web_stk.cntl  <--
RUN_SCRIPT_TIMEOUT NO_TIMEOUT
HALT_SCRIPT        /etc/cmcluster/web_stk/web_stk.cntl  <--
HALT_SCRIPT_TIMEOUT NO_TIMEOUT
# Enter the SERVICE_NAME, the SERVICE_FAIL_FAST_ENABLED and the
# SERVICE_HALT_TIMEOUT values for this package.  Repeat these
# three lines as necessary for additional service names.  All
# service names MUST correspond to the service names used by
# cmrunserv and cmhaltserv commands in the run and halt scripts.
#
# The value for SERVICE_FAIL_FAST_ENABLED can be either YES or
# NO.  If set to YES, in the event of a service failure, the
# cluster software will halt the node on which the service is
# running.  If SERVICE_FAIL_FAST_ENABLED is not specified, the
# default will be NO.
#
# SERVICE_HALT_TIMEOUT is represented in the number of seconds.
# This timeout is used to determine the length of time (in
# seconds) the cluster software will wait for the service to
# halt before a SIGKILL signal is sent to force the termination
# of the service.  In the event of a service halt, the cluster
# software will first send a SIGTERM signal to terminate the
# service.  If the service does not halt, after waiting for the
# specified SERVICE_HALT_TIMEOUT, the cluster software will send
# out the SIGKILL signal to the service to force its termination.
# This timeout value should be large enough to allow all cleanup
# processes associated with the service to complete.  If the
# SERVICE_HALT_TIMEOUT is not specified, a zero timeout will be
# assumed, meaning the cluster software will not wait at all
# before sending the SIGKILL signal to halt the service.
#
# Example: SERVICE_NAME                   DB_SERVICE
#          SERVICE_FAIL_FAST_ENABLED      NO
#          SERVICE_HALT_TIMEOUT           300
#
# To configure a service, uncomment the following lines and
# fill in the values for all of the keywords.
#
SERVICE_NAME                   web_stk.mon  <--
SERVICE_FAIL_FAST_ENABLED      NO
SERVICE_HALT_TIMEOUT           60
# Enter the network subnet name that is to be monitored for this package.
# Repeat this line as necessary for additional subnet names.  If any of
# the subnets defined goes down, the package will be switched to another
# node that is configured for this package and has all the defined subnets
# available.
SUBNET             156.153.202.0  <--
# The following keywords (RESOURCE_NAME, RESOURCE_POLLING_INTERVAL, and
# RESOURCE_UP_VALUE) are used to specify Package Resource Dependencies.  To
# define a Package Resource Dependency, a RESOURCE_NAME line with a fully
# qualified resource path name, and one or more RESOURCE_UP_VALUE lines are
# required.  A RESOURCE_POLLING_INTERVAL line (how often in seconds the resource
# is to be monitored) is optional and defaults to 60 seconds.  An operator and
# a value are used with RESOURCE_UP_VALUE to define when the resource is to be
# considered up.  The operators are =, !=, >, <, >=, and <=, depending on the
# type of value.  Values can be string or numeric.  If the type is string, then
# only = and != are valid operators.  If the string contains whitespace, it
# must be enclosed in quotes.  String values are case sensitive.  For example,
#
#                                       Resource is up when its value is
#                                       --------------------------------
#       RESOURCE_UP_VALUE  = UP              "UP"
#       RESOURCE_UP_VALUE  != DOWN           Any value except "DOWN"
#       RESOURCE_UP_VALUE  = "On Course"     "On Course"
#
# If the type is numeric, then it can specify a threshold, or a range to
# define a resource up condition.  If it is a threshold, then any operator
# may be used.  If a range is to be specified, then only > or >= may be used
# for the first operator, and only < or <= may be used for the second operator.
# For example,
#                                       Resource is up when its value is
#                                       --------------------------------
#       RESOURCE_UP_VALUE  = 5               5                   (threshold)
#       RESOURCE_UP_VALUE  > 5.1             greater than 5.1    (threshold)
#       RESOURCE_UP_VALUE  > -5 and < 10     between -5 and 10   (range)
#
# Note that "and" is required between the lower limit and upper limit
# when specifying a range.  The upper limit must be greater than the lower
# limit.  If RESOURCE_UP_VALUE is repeated within a RESOURCE_NAME block, then
# they are inclusively OR'd together.  Package Resource Dependencies may be
# defined by repeating the entire RESOURCE_NAME block.
#
# Example : RESOURCE_NAME               /net/interfaces/lan/status/lan0
#           RESOURCE_POLLING_INTERVAL   120
#           RESOURCE_UP_VALUE           = RUNNING
#           RESOURCE_UP_VALUE           = ONLINE
#
#           Means that the value of resource /net/interfaces/lan/status/lan0
#           will be checked every 120 seconds, and is considered to
#           be 'up' when its value is "RUNNING" or "ONLINE".
#
# Uncomment the following lines to specify Package Resource Dependencies.
#
#RESOURCE_NAME              <Full_path_name>
#RESOURCE_POLLING_INTERVAL  <numeric_seconds>
#RESOURCE_UP_VALUE          <op> <string_or_numeric> [and <op> <numeric>]
# The default for PKG_SWITCHING_ENABLED is YES. In the event of a
# failure, this permits the cluster software to transfer the package
# to an adoptive node.  Adjust as necessary.
PKG_SWITCHING_ENABLED      YES
# The default for NET_SWITCHING_ENABLED is YES.  In the event of a
# failure, this permits the cluster software to switch LANs locally
# (transfer to a standby LAN card).  Adjust as necessary.
NET_SWITCHING_ENABLED      YES
# The default for NODE_FAIL_FAST_ENABLED is NO.  If set to YES,
# in the event of a failure, the cluster software will halt the node
# on which the package is running.  Adjust as necessary.
NODE_FAIL_FAST_ENABLED     NO
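Because the order of the NODE_NAME lines decides which node is primary and which is adoptive, it can help to list them, and to confirm that the keywords we edited are actually present, before going further. The following is a sketch with hypothetical helper names (pkg_nodes and pkg_missing_keywords). The set of keywords it checks is chosen here for illustration and is not a list of everything ServiceGuard requires.

```shell
#!/usr/bin/sh
# Sketch only: helper names and the checked keyword list are illustrative.

# Print the configured nodes in priority order (primary first) from a
# package configuration file read on stdin. Commented examples start
# with "#" and are therefore ignored.
pkg_nodes() {
    awk '$1 == "NODE_NAME" { print $2 }'
}

# Print any of the checked keywords that are not set in the file on stdin.
pkg_missing_keywords() {
    awk '
        BEGIN { n = split("PACKAGE_NAME NODE_NAME RUN_SCRIPT HALT_SCRIPT", req, " ") }
        { seen[$1] = 1 }
        END {
            for (i = 1; i <= n; i++)
                if (!(req[i] in seen)) print req[i]
        }
    '
}
```

For the file above, pkg_nodes < web_stk.conf lists ny3c15 followed by ny3c16, and pkg_missing_keywords < web_stk.conf prints nothing.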

The web_stk.conf file references a control script, which we'll name web_stk.cntl, through its RUN_SCRIPT and HALT_SCRIPT entries. This script contains the functions for running and halting the package. It is produced with the following command:

 ny3c15#  cmmakepkg -s web_stk.cntl

The -s option creates a template package control script. In this file we'll add all of the information important to the package, such as the volume group, service name, and package start and stop routines. Again, since this is only a template, we'll have to include important information in it. Because this is a long file, I've included only the portion of the file up through the customer-defined functions. This does not imply that the other lines are not important, because the script won't work without them:

#"(#) A.11.05    $Revision: 82.3 $ $Date: 98/11/03 10:56:49 $"
# **********************************************************************
# *                                                                    *
# *        HIGH AVAILABILITY PACKAGE CONTROL SCRIPT (template)         *
# *                                                                    *
# *       Note: This file MUST be edited before it can be used.        *
# *                                                                    *
# **********************************************************************
# UNCOMMENT the variables as you set them.
# Set PATH to reference the appropriate directories.
PATH=/sbin:/usr/bin:/usr/sbin:/etc:/bin
# VOLUME GROUP ACTIVATION:
# Specify the method of activation for volume groups.
# Leave the default ("VGCHANGE="vgchange -a e") if you want volume
# groups activated in exclusive mode. This assumes the volume groups have
# been initialized with 'vgchange -c y' at the time of creation.
#
# Uncomment the first line (VGCHANGE="vgchange -a e -q n"), and comment
# out the default, if your disks are mirrored on separate physical paths,
#
# Uncomment the second line (VGCHANGE="vgchange -a e -q n -s"), and comment
# out the default, if your disks are mirrored on separate physical paths,
# and you want the mirror resynchronization to ocurr in parallel with
# the package startup.
#
# Uncomment the third line (VGCHANGE="vgchange -a y") if you wish to
# use non-exclusive activation mode. Single node cluster configurations
# must use non-exclusive activation.
#
# VGCHANGE="vgchange -a e -q n"
# VGCHANGE="vgchange -a e -q n -s"
# VGCHANGE="vgchange -a y"
VGCHANGE="vgchange -a e"        # Default
# VOLUME GROUPS
# Specify which volume groups are used by this package. Uncomment VG[0]=""
# and fill in the name of your first volume group. You must begin with
# VG[0], and increment the list in sequence.
#
# For example, if this package uses your volume groups vg01 and vg02, enter:
#         VG[0]=vg01
#         VG[1]=vg02
#
# The volume group activation method is defined above. The filesystems
# associated with these volume groups are specified below.
#
VG[0]=vg02
# FILESYSTEMS
# Specify the filesystems which are used by this package. Uncomment
# LV[0]=""; FS[0]=""; FS_MOUNT_OPT[0]="" and fill in the name of your first
# logical volume, filesystem and mount option for the file system. You must
# begin with LV[0], FS[0] and FS_MOUNT_OPT[0] and increment the list in
# sequence.
#
# For example, if this package uses the file systems pkg1a and pkg1b,
# which are mounted on the logical volumes lvol1 and lvol2 with read and
# write options enter:
#          LV[0]=/dev/vg01/lvol1; FS[0]=/pkg1a; FS_MOUNT_OPT[0]="-o rw"
#          LV[1]=/dev/vg01/lvol2; FS[1]=/pkg1b; FS_MOUNT_OPT[1]="-o rw"
#
# The filesystems are defined as triplets of entries specifying the logical
# volume, the mount point and the mount options for the file system. Each
# filesystem will be fsck'd prior to being mounted. The filesystems will be
# mounted in the order specified during package startup and will be unmounted
# in reverse order during package shutdown. Ensure that volume groups
# referenced by the logical volume definitions below are included in
# volume group definitions above.
#
LV[0]=/dev/vg02/stk; FS[0]=/opt/STK; FS_MOUNT_OPT[0]=""
# FILESYSTEM UNMOUNT COUNT
# Specify the number of unmount attempts for each filesystem during package
# shutdown.  The default is set to 1.
LV_UMOUNT_COUNT=1
# IP ADDRESSES
# Specify the IP and Subnet address pairs which are used by this package.
# Uncomment IP[0]="" and SUBNET[0]="" and fill in the name of your first
# IP and subnet address. You must begin with IP[0] and SUBNET[0] and
# increment the list in sequence.
#
# For example, if this package uses an IP of 192.10.25.12 and a subnet of
# 192.10.25.0 enter:
#          IP[0]=192.10.25.12
#          SUBNET[0]=192.10.25.0 # (netmask=255.255.255.0)
#
# Hint: Run "netstat -i" to see the available subnets in the Network field.
#
# IP/Subnet address pairs for each IP address you want to add to a subnet
# interface card.  Must be set in pairs, even for IP addresses on the same
# subnet.
#
IP[0]=156.153.202.93            # package IP address
SUBNET[0]=156.153.202.0
# SERVICE NAMES AND COMMANDS.
# Specify the service name, command, and restart parameters which are
# used by this package. Uncomment SERVICE_NAME[0]="", SERVICE_CMD[0]="",
# SERVICE_RESTART[0]="" and fill in the name of the first service, command,
# and restart parameters. You must begin with SERVICE_NAME[0], SERVICE_CMD[0],
# and SERVICE_RESTART[0] and increment the list in sequence.
#
# For example:
#          SERVICE_NAME[0]=pkg1a
#          SERVICE_CMD[0]="/usr/bin/X11/xclock -display 192.10.25.54:0"
#          SERVICE_RESTART[0]=""  # Will not restart the service.
#
#          SERVICE_NAME[1]=pkg1b
#          SERVICE_CMD[1]="/usr/bin/X11/xload -display 192.10.25.54:0"
#          SERVICE_RESTART[1]="-r 2"   # Will restart the service twice.
#
#          SERVICE_NAME[2]=pkg1c
#          SERVICE_CMD[2]="/usr/sbin/ping"
#          SERVICE_RESTART[2]="-R" # Will restart the service an infinite
#                                    number of times.
#
# Note: No environmental variables will be passed to the command, this
# includes the PATH variable. Absolute path names are required for the
# service command definition.  Default shell is /usr/bin/sh.
#
SERVICE_NAME[0]=web_stk.mon
SERVICE_CMD[0]=/etc/cmcluster/web_stk/web_stk.mon
SERVICE_RESTART[0]=""
# DTC manager information for each DTC.
# Example: DTC[0]=dtc_20
#DTC_NAME[0]=
# START OF CUSTOMER DEFINED FUNCTIONS
# This function is a place holder for customer define functions.
# You should define all actions you want to happen here, before the service is
# started.  You can create as many functions as you need.
function customer_defined_run_cmds
{
# ADD customer defined run commands.
#: # do nothing instruction, because a function must contain some command.
        /usr/local/bin/httpd
        test_return 51
}
# This function is a place holder for customer define functions.
# You should define all actions you want to happen here, before the service is
# halted.
function customer_defined_halt_cmds
{
# ADD customer defined halt commands.
#: # do nothing instruction, because a function must contain some command.
        if [ -f /usr/local/etc/httpd/logs/httpd.pid ]
        then
                pid=$(cat /usr/local/etc/httpd/logs/httpd.pid)
                if ps -p $pid > /dev/null 2>&1
                then
                        kill $pid
                fi
        fi
        test_return 52
}
# END OF CUSTOMER DEFINED FUNCTIONS
# START OF RUN FUNCTIONS

The web_stk, srm.conf, and httpd.conf files all have to be copied to the second system in the cluster and made executable there.

There is also a monitoring script that we produced to print messages regarding the system on which the package is running.
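The monitoring script itself is not listed here. As a rough illustration only, the following is a minimal sketch of what such a service monitor might look like. It assumes the service counts as up while the process named in the httpd PID file (the one used in customer_defined_halt_cmds) is alive. The function names pid_alive and monitor, the polling interval, and the message text are all hypothetical and are not taken from the author's web_stk.mon.

```shell
# Hypothetical sketch of a service monitor in the spirit of web_stk.mon.
# Assumption: the service is "up" while the PID in the PID file is alive.

# pid_alive PIDFILE: succeed if the file exists and names a running process.
pid_alive() {
    [ -f "$1" ] || return 1
    pid=$(cat "$1")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# monitor PIDFILE INTERVAL: print the hosting node while the process lives,
# then return non-zero once the process is gone.
monitor() {
    while pid_alive "$1"; do
        echo "$(date) - package is running on $(uname -n)"
        sleep "$2"
    done
    echo "$(date) - process from $1 is gone on $(uname -n)" >&2
    return 1
}

# Example call (commented out because it blocks while httpd is alive):
# monitor /usr/local/etc/httpd/logs/httpd.pid 30
```

The loop only returns when the watched process disappears, which fits the usual pattern of a service command that stays in the foreground for as long as the application is healthy.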

After all of these scripts have been produced, they have to be copied to the second system. All of the files have to be executable.
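A forgotten execute bit is an easy mistake to make when copying files between nodes. The following is a minimal sketch of a check you could run on each node after the copy. The function name check_executable is hypothetical, and the directory and file names in the example call are assumptions based on the paths shown in this section.

```shell
# check_executable DIR FILE...: report every package file that is missing or
# not executable; succeed only if all of them are in place.
check_executable() {
    dir=$1
    shift
    status=0
    for f in "$@"; do
        if [ ! -x "$dir/$f" ]; then
            echo "not executable or missing: $dir/$f"
            status=1
        fi
    done
    return $status
}

# Example (assumed file names), run on each node after copying:
# check_executable /etc/cmcluster/web_stk web_stk web_stk.mon
```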

Next we mark the volume group vg02 as a cluster volume group using the -c y option as shown below:

 ny3c15# vgchange -a n vg02
 vgchange: Volume group "vg02" has been successfully changed.
 #
 ny3c15# vgchange -c y vg02
 Performed Configuration change.
 Volume group "vg02" has been successfully changed.

The -c option sets the cluster membership of the volume group, and the y argument marks vg02 as a member of the cluster. Note that we first deactivated vg02 with -a n before changing its cluster membership.

Now that we've made all of the necessary modifications to files, we can run the package commands shown below:

 ny3c15# cmcheckconf -P web_stk.conf
 Begin package verification...
 Verification completed with no errors found.
 Use the cmapplyconf command to apply the configuration.
 ny3c15# cmapplyconf -P web_stk.conf
 Begin package verification...
 Modify the package configuration ([y]/n)? y
 Completed the cluster update.
 ny3c15# cmmodpkg -e web_stk
 root@ny3c15   [/etc/cmcluster/web_stk]

In the previous three commands we checked our configuration with cmcheckconf, applied it with cmapplyconf, and then enabled our package with cmmodpkg and the -e option. You can also name the specific nodes on which to enable a package with:

 cmmodpkg -e -n nodename -n nodename package_name

You may have to issue this form of the command after your package has automatically switched to the backup node and then the backup node fails. At this point the package is running on the first system and we can issue cmviewcl -v to see the details concerning our cluster and the web server package:

 ny3c15# cmviewcl -v
 CLUSTER      STATUS
 web_server   up
   NODE         STATUS       STATE
   ny3c15       up           running
     Network_Parameters:
     INTERFACE    STATUS       PATH         NAME
     PRIMARY      up           8/16/6       lan0
     STANDBY      up           8/20/5/1     lan1
     PACKAGE      STATUS       STATE        PKG_SWITCH   NODE
     web_stk      up           running      enabled      ny3c15
       Policy_Parameters:
       POLICY_NAME     CONFIGURED_VALUE
       Failover        configured_node
       Failback        manual
       Script_Parameters:
       ITEM       STATUS   MAX_RESTARTS  RESTARTS   NAME
       Service    up                  0         0   web_stk.mon
       Subnet     up                                156.153.202.0
       Node_Switching_Parameters:
       NODE_TYPE    STATUS       SWITCHING    NAME
       Primary      up           enabled      ny3c15       (current)
       Alternate    up           enabled      ny3c16
   NODE         STATUS       STATE
   ny3c16       up           running
     Network_Parameters:
     INTERFACE    STATUS       PATH         NAME
     STANDBY      up           8/20/5/1     lan1
     PRIMARY      up           8/16/6       lan0
 root@ny3c15   [/etc/cmcluster/web_stk]
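When you only want to know which node owns a package, the full cmviewcl -v listing is more than you need. The following is a minimal sketch that reads cmviewcl -v output on standard input and prints the NODE column of the package row. The function name package_node is hypothetical, and the parsing assumes the column layout shown in the listing above (a PACKAGE header row followed directly by five-column data rows), which may differ on other ServiceGuard releases.

```shell
# package_node PKG: read "cmviewcl -v" output on stdin and print the node
# that currently owns PKG, taken from the row under the PACKAGE header.
# Exits non-zero if PKG is not found.
package_node() {
    awk -v pkg="$1" '
        $1 == "PACKAGE" { in_pkg = 1; next }   # header of a package table
        in_pkg && NF == 5 {                     # data rows under that header
            if ($1 == pkg) { print $5; found = 1 }
            next
        }
        { in_pkg = 0 }                          # any other line ends the table
        END { exit !found }
    '
}

# Example: cmviewcl -v | package_node web_stk
```

For a halted package the same row carries the word unowned in the NODE column, so the sketch prints unowned in that case.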

web_stk is running on ny3c15 at this time. Next we'll halt the package on ny3c15 with cmhaltpkg, run it manually on ny3c16 with cmrunpkg, and run cmviewcl after each step to see the package move to ny3c16:

 ny3c15# cmhaltpkg web_stk
 Halting package web_stk.
 cmhaltpkg  : Successfully halted package web_stk.
 cmhaltpkg  : Completed successfully on all packages specified.
 root@ny3c15   [/etc/cmcluster/web_stk]
 ny3c15# cmviewcl -v
 CLUSTER      STATUS
 web_server   up
   NODE         STATUS       STATE
   ny3c15       up           running
     Network_Parameters:
     INTERFACE    STATUS       PATH         NAME
     PRIMARY      up           8/16/6       lan0
     STANDBY      up           8/20/5/1     lan1
   NODE         STATUS       STATE
   ny3c16       up           running
     Network_Parameters:
     INTERFACE    STATUS       PATH         NAME
     STANDBY      up           8/20/5/1     lan1
     PRIMARY      up           8/16/6       lan0
 UNOWNED_PACKAGES
     PACKAGE      STATUS       STATE        PKG_SWITCH   NODE
     web_stk      down         halted       disabled     unowned
       Policy_Parameters:
       POLICY_NAME     CONFIGURED_VALUE
       Failover        configured_node
       Failback        manual
       Script_Parameters:
       ITEM       STATUS   NODE_NAME    NAME
       Subnet     up       ny3c15       156.153.202.0
       Subnet     up       ny3c16       156.153.202.0
       Node_Switching_Parameters:
       NODE_TYPE    STATUS       SWITCHING    NAME
       Primary      up           enabled      ny3c15
       Alternate    up           enabled      ny3c16
 root@ny3c15   [/etc/cmcluster/web_stk]
 #
 ny3c15# cmrunpkg -n ny3c16 -v web_stk
 ny3c15# cmviewcl -v
 #
 CLUSTER      STATUS
 web_server   up
   NODE         STATUS       STATE
   ny3c15       up           running
     Network_Parameters:
     INTERFACE    STATUS       PATH         NAME
     PRIMARY      up           8/16/6       lan0
     STANDBY      up           8/20/5/1     lan1
   NODE         STATUS       STATE
   ny3c16       up           running
     Network_Parameters:
     INTERFACE    STATUS       PATH         NAME
     STANDBY      up           8/20/5/1     lan1
     PRIMARY      up           8/16/6       lan0
     PACKAGE      STATUS       STATE        PKG_SWITCH   NODE
     web_stk      up           running      disabled     ny3c16
       Policy_Parameters:
       POLICY_NAME     CONFIGURED_VALUE
       Failover        configured_node
       Failback        manual
       Script_Parameters:
       ITEM       STATUS   MAX_RESTARTS  RESTARTS   NAME
       Service    up                  0         0   web_stk.mon
       Subnet     up                                156.153.202.0
       Node_Switching_Parameters:
       NODE_TYPE    STATUS       SWITCHING    NAME
       Primary      up           enabled      ny3c15
       Alternate    up           enabled      ny3c16       (current)
 root@ny3c15   [/etc/cmcluster/web_stk]

In this example we manually halted the package on ny3c15 and ran it on ny3c16. The first cmviewcl output shows the package halted and unowned, and the second shows it running on ny3c16.
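The manual move can be collected into one small function. The following is a minimal sketch, not a tested production procedure. The names run, move_package, and DRY_RUN are hypothetical. The final cmmodpkg -e step is an assumption on my part: the cmviewcl output above shows PKG_SWITCH as disabled after the manual move, so the sketch re-enables switching on the assumption that you want automatic failover restored.

```shell
# run CMD...: execute the command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# move_package PKG TARGET_NODE: halt PKG, start it on TARGET_NODE, then
# re-enable package switching. Each step runs only if the previous one
# succeeded.
move_package() {
    run cmhaltpkg "$1" &&
    run cmrunpkg -n "$2" "$1" &&
    run cmmodpkg -e "$1"
}

# Example: preview the commands without touching the cluster.
# DRY_RUN=1; move_package web_stk ny3c16
```

Running the preview first lets you confirm the package and node names before the package is actually halted.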

The automatic failover from ny3c15 to ny3c16 also worked with this package. When ny3c15 failed, the package automatically moved to ny3c16 as we had designed it.

Should you encounter problems, the log files in the package directory provide useful troubleshooting information. One of the log files has the path /etc/cmcluster/appname/appname.log. Depending on the error encountered, the log may give you the line number in your script where the error was found. The following messages were in the log file for our application and clearly pointed to a problem with our IP address, which we were then able to fix:

 ########### Node "ny3c15": Starting package at Tue Jul 23 14:18:51 EDT 2002 ###########
 Jul 23 14:18:51 - "ny3c15": Activating volume group vg02 with exclusive option.
 Activated volume group in Exclusive Mode.
 Volume group "vg02" has been successfully changed.
 Jul 23 14:18:52 - Node "ny3c15": Checking filesystems:
    /dev/vg02/stk
 file system is clean - log replay is not required
 Jul 23 14:18:54 - Node "ny3c15": Mounting /dev/vg02/stk at /opt/STK
 Jul 23 14:18:54 - Node "ny3c15": Adding IP address 156.153.202.105 to subnet 156.153.202.0
 cmmodnet  : Unable to verify IP address 156.153.202.105.
 cmmodnet  : 156.153.202.105 might already be configured as a heartbeat IP address.
                   ERROR:  Function add_ip_address
                   ERROR:  Failed to add IP address to subnet
 Jul 23 14:18:54 - Node "ny3c15": Remove IP address 156.153.202.105 from subnet 156.153.202.0
 cmmodnet  : Unable to complete command : Device busy
                   ERROR:  Function remove_ip_address
                   ERROR:  Failed to remove 156.153.202.105
 Jul 23 14:18:55 - Node "ny3c15": Unmounting filesystem on /dev/vg02/stk
                    WARNING:   Running fuser to remove anyone using the file system directly.
 /dev/vg02/stk:
 Jul 23 14:18:56 - Node "ny3c15": Deactivating volume group vg02
 Deactivated volume group in Exclusive Mode.
 Volume group "vg02" has been successfully changed.
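In a longer log the ERROR and WARNING lines are easy to miss among the routine messages. The following is a minimal sketch that pulls them out with their line numbers. The function name log_errors is hypothetical, and the log path in the example call simply follows the appname pattern given above, so adjust it to your own package.

```shell
# log_errors FILE: print each ERROR or WARNING line of a package log,
# prefixed with its line number and with leading whitespace removed.
# Exits non-zero if no such line is found.
log_errors() {
    awk '/ERROR:|WARNING:/ { sub(/^[ \t]+/, ""); print NR ": " $0; n++ }
         END { exit !n }' "$1"
}

# Example (assumed path): log_errors /etc/cmcluster/web_stk/web_stk.log
```

The line numbers make it easy to jump back into the log and read the messages just before each error, such as the cmmodnet lines in the listing above.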

This example covered all of the fundamentals for working with ServiceGuard. To fully understand the intricacies of ServiceGuard, I recommend taking the HP Education Center courses on this and related topics. When working with high availability solutions, you have to be sure of your implementation and not take any chances.