23.6 Resource Attributes Revisited


We were giving some thought to the topic of resource attributes, considering the tables presented earlier in the chapter (see section 23.4.4), and several interesting questions popped into our collective mind (scary, huh?). Now we're sure that our publishers are just shaking their heads as the page count of this chapter continues to grow unchecked by the mad scientist-like minds of your humble authors, but we're compelled to think in weird and random ways – sorry, dear publishers, sorry. What questions exactly? Read on.

In the following examples, the dependent resource is memberUP; the supporting cast of resources consists of nicUP (a network resource) and powerUP (an application resource).

23.6.1 Dependencies – Optional or Required?

  • If a resource is dependent on another resource, what happens if the resource it is depending on goes away or relocates?

    To answer this question, let's consider two scenarios:

    1. powerUP is an OPTIONAL_RESOURCE of memberUP.

       # caa_stat -t -v memberUP powerUP
       Name         Type         R/RA       F/FT      Target    State   Host
       ---------------------------------------------------------------------------
       memberUP     application  0/1        0/0       OFFLINE   OFFLINE
       powerUP      application  0/1        0/0       ONLINE    ONLINE  molari

      Update memberUP's profile so that powerUP is an OPTIONAL_RESOURCE.

       # caa_profile -update memberUP -l powerUP 

      Update the CAA registry with the updated memberUP profile and start memberUP.

       # caa_register -u memberUP && caa_start memberUP
       Attempting to start 'memberUP' on member 'molari'
       Start of 'memberUP' on member 'molari' succeeded.

      Okay, both resources are running on molari. Let's stop powerUP.

       # caa_stop powerUP
       Attempting to stop 'powerUP' on member 'molari'
       Stop of 'powerUP' on member 'molari' succeeded.

      Did this adversely affect memberUP?

       # caa_stat -t -v memberUP powerUP
       Name       Type         R/RA     F/FT      Target       State   Host
       -------------------------------------------------------------------------
       memberUP   application  0/1      0/0       ONLINE       ONLINE  molari
       powerUP    application  0/1      0/0       OFFLINE      OFFLINE

      Should it have? No, powerUP is "optional"; therefore, memberUP will not stop (or even relocate). If powerUP were "required," however...

    2. powerUP is a REQUIRED_RESOURCE of memberUP.

      The first thing we need to do here is modify memberUP's profile, stop memberUP, and update the registry.

       # caa_profile -update memberUP -l "" -r powerUP ; caa_stop memberUP
       Attempting to stop 'memberUP' on member 'molari'
       Stop of 'memberUP' on member 'molari' succeeded.

       # caa_register -u memberUP ; caa_start memberUP
       Attempting to start 'powerUP' on member 'molari'
       Start of 'powerUP' on member 'molari' succeeded.
       Attempting to start 'memberUP' on member 'molari'
       Start of 'memberUP' on member 'molari' succeeded.

      Note that powerUP started as well. This is due to the fact that memberUP now requires powerUP to be running on the same member. Let's see what happens if we attempt to relocate or stop powerUP.

       # caa_relocate powerUP
       sheridan : Resource memberUP (application) is running on molari
       Resource powerUP has placement error.

       # caa_stop powerUP
       Resources depending on powerUP are running
       Resource powerUP has placement error.

      No joy. Can we force powerUP to relocate or stop? The "-f" option to the caa_relocate, caa_start, and caa_stop commands allows CAA to stop and/or start the other resources involved in a dependency so that the task can succeed. For example, if we wanted to relocate memberUP to molari while powerUP, which memberUP requires, was running on sheridan, CAA would stop powerUP as well as memberUP and start them both on molari.

       # caa_relocate powerUP -f
       Attempting to stop 'memberUP' on member 'molari'
       Stop of 'memberUP' on member 'molari' succeeded.
       Attempting to stop 'powerUP' on member 'molari'
       Stop of 'powerUP' on member 'molari' succeeded.
       Attempting to start 'powerUP' on member 'sheridan'
       Start of 'powerUP' on member 'sheridan' succeeded.
       Attempting to start 'memberUP' on member 'sheridan'
       Start of 'memberUP' on member 'sheridan' succeeded.

       # caa_stat -t -v memberUP powerUP
       Name      Type         R/RA     F/FT       Target      State   Host
       ------------------------------------------------------------------------
       memberUP  application  0/1      0/0        ONLINE      ONLINE  sheridan
       powerUP   application  0/1      0/0        ONLINE      ONLINE  sheridan

       # caa_stop powerUP -f
       Attempting to stop 'memberUP' on member 'sheridan'
       Stop of 'memberUP' on member 'sheridan' succeeded.
       Attempting to stop 'powerUP' on member 'sheridan'
       Stop of 'powerUP' on member 'sheridan' succeeded.

       # caa_stat -t -v memberUP powerUP
       Name      Type         R/RA     F/FT       Target      State    Host
       ------------------------------------------------------------------------
       memberUP  application  0/1      0/0        OFFLINE     OFFLINE
       powerUP   application  0/1      0/0        OFFLINE     OFFLINE

      So we were able to successfully relocate and stop powerUP forcefully; however, memberUP was also relocated and stopped.

      Finally, to complete our answer to the first question, let's add nicUP as a REQUIRED_RESOURCE, update the CAA registration database, start memberUP, and then unplug the network associated with nicUP.

       # caa_profile -update memberUP -l "" -r "powerUP nicUP" 
       # caa_register -u memberUP ; caa_start memberUP
       Attempting to start 'powerUP' on member 'sheridan'
       Start of 'powerUP' on member 'sheridan' succeeded.
       Attempting to start 'nicUP' on member 'sheridan'
       Start of 'nicUP' on member 'sheridan' succeeded.
       Attempting to start 'memberUP' on member 'sheridan'
       Start of 'memberUP' on member 'sheridan' succeeded.

      Prior to unplugging the network, you may want to start evmwatch (1) to monitor the failover activity.

       # evmwatch -A -t "@name [@priority]\n@@\n" 

      Note that use of the "\n" in the "-t" option was introduced in V5.1A.

      As there will be many events generated, the output is not shown. NIFF will mark the network interface as down, and this will generate an event (sys.unix.hw.net.niff.down). In Table 23-2 we listed the EVM events that CAA subscribes to; note that sys.unix.hw.net.niff.down is one of the events that the network resource monitor is interested in. As a result, CAA initiates a relocation of memberUP and powerUP.

       # caa_stat -t -v memberUP powerUP nicUP
       Name      Type         R/RA      F/FT    Target     State    Host
       ----------------------------------------------------------------------
       memberUP  application  0/1       0/0     ONLINE     ONLINE   molari
       nicUP     network       -        0/2     ONLINE     ONLINE   molari
       nicUP     network       -        0/2     ONLINE     OFFLINE  sheridan
       powerUP   application  0/1       0/0     ONLINE     ONLINE   molari

  • If you have a resource that depends on multiple resources, what happens if these resources are not all running on the same member? Where does the dependent resource start?

    This is merely an extension of our last example. Let's stop memberUP, relocate powerUP to sheridan, and then start memberUP and see what happens.

     # caa_stop memberUP ; caa_relocate powerUP -c sheridan
     Attempting to stop 'memberUP' on member 'molari'
     Stop of 'memberUP' on member 'molari' succeeded.
     Attempting to stop 'powerUP' on member 'molari'
     Stop of 'powerUP' on member 'molari' succeeded.
     Attempting to start 'powerUP' on member 'sheridan'
     Start of 'powerUP' on member 'sheridan' succeeded.

     # caa_start memberUP
     molari : Resource powerUP (application) is already running on sheridan
     sheridan : Resource nicUP (network) is not available on sheridan
     Resource memberUP has placement error.

    We get an error. Okay, let's try to force it.

     # caa_start memberUP -f
     Attempting to stop 'powerUP' on member 'sheridan'
     Stop of 'powerUP' on member 'sheridan' succeeded.
     Attempting to start 'powerUP' on member 'molari'
     Start of 'powerUP' on member 'molari' succeeded.
     Attempting to start 'nicUP' on member 'molari'
     Start of 'nicUP' on member 'molari' succeeded.
     Attempting to start 'memberUP' on member 'molari'
     Start of 'memberUP' on member 'molari' succeeded.

    That works. What if the resources were not required but optional? Then memberUP would start on the member with the most optional resources available.
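    When you repeat these experiments, it can help to ask "are all of these resources ONLINE on the same member?" without eyeballing the table. The following is a minimal sketch that reads the `caa_stat -t -v` layout shown above from standard input. The helper names resource_host and same_member are our own inventions (they are not part of CAA), and the column positions are assumed to match the seven-column output in this section.

    ```shell
    # resource_host NAME
    # Reads `caa_stat -t -v` style rows on stdin and prints each member on
    # which NAME is ONLINE (column 6 is State, column 7 is Host).
    # Hypothetical helper, not a CAA command.
    resource_host() {
        awk -v name="$1" '$1 == name && $6 == "ONLINE" { print $7 }'
    }

    # same_member MEMBER NAME...
    # Succeeds only when every NAME is ONLINE on MEMBER according to the
    # rows read from stdin.
    same_member() {
        member=$1; shift
        rows=$(cat)
        for res in "$@"; do
            printf '%s\n' "$rows" | resource_host "$res" | grep -qx "$member" || return 1
        done
    }
    ```

    On a cluster member this could be used as `caa_stat -t -v memberUP powerUP nicUP | same_member molari memberUP powerUP nicUP`, which returns success only when all three resources are ONLINE on molari.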

23.6.2 Dependencies Versus Placement Policy

  • If a resource is favored to a member, but the resource it depends on is running on a non-favored member, where does the dependent resource start?

    If the resource has its placement policy set to "favored", but the optional resources are on a non-favored member, the resource will start on the favored member. Below is a list of commands you can use to verify this assertion. No output is shown.

     # caa_stop memberUP
     # caa_profile -update memberUP -p favored -h sheridan -r "" -l "powerUP nicUP"
     # caa_register -u memberUP
     # caa_start memberUP
     # caa_stat -t -v memberUP powerUP nicUP

    If the resource has its placement policy set to "favored", but the required resources are on a non-favored member, the resource will start on the non-favored member. You can use the same list of commands from the previous example; just swap the values of the "-l" and "-r" options in the caa_profile command (you will not need the "-p" or "-h" options, since these do not need to be modified a second time).

     # caa_profile -update memberUP -l "" -r "powerUP nicUP"

  • If a resource is restricted to a member, but the resource it depends on is running on a member that is not in the HOSTING_MEMBERS list, where does the dependent resource start?

    This is a slight modification to our previous example. If the resource has its placement policy set to "restricted", but the optional resources are on a member not in the HOSTING_MEMBERS list, the resource will start on a member that is in the HOSTING_MEMBERS list.

     # caa_stop memberUP
     # caa_profile -update memberUP -p restricted -h sheridan -r "" -l "powerUP nicUP"
     # caa_register -u memberUP
     # caa_start memberUP
     # caa_stat -t -v memberUP powerUP nicUP

    If the resource has its placement policy set to "restricted", but the required resources are on a member not in the HOSTING_MEMBERS list, let's see what happens.

Stop memberUP. Modify memberUP's profile as follows:

  • PLACEMENT=restricted

  • HOSTING_MEMBERS=sheridan

  • REQUIRED_RESOURCES=powerUP nicUP

 # caa_stop memberUP
 # caa_profile -update memberUP -p restricted -h sheridan -l "" -r "powerUP nicUP"

Update the CAA registry with memberUP's modified profile and start memberUP. Note the required resources are currently running on molari, but memberUP is now restricted to sheridan.

 # caa_register -u memberUP ; caa_start memberUP
 molari : Resource memberUP (application) cannot run on molari
 sheridan : Resource powerUP (application) is already running on molari
 Resource memberUP has placement error.

Can it be forcefully started? No.

 # caa_start memberUP -f
 molari : Resource memberUP (application) cannot run on molari
 sheridan : Resource nicUP (network) is not available on sheridan
 Resource memberUP has placement error.

Can it be forcefully started if we plug our network cable back in so that nicUP will go ONLINE? Yes.

 # caa_start memberUP -f
 Attempting to stop 'powerUP' on member 'molari'
 Stop of 'powerUP' on member 'molari' succeeded.
 Attempting to start 'powerUP' on member 'sheridan'
 Start of 'powerUP' on member 'sheridan' succeeded.
 Attempting to start 'nicUP' on member 'sheridan'
 Start of 'nicUP' on member 'sheridan' succeeded.
 Attempting to start 'memberUP' on member 'sheridan'
 Start of 'memberUP' on member 'sheridan' succeeded.

 # caa_stat -t -v memberUP powerUP nicUP
 Name      Type          R/RA     F/FT    Target     State    Host
 ----------------------------------------------------------------------
 memberUP  application   0/1      0/0     ONLINE     ONLINE   sheridan
 nicUP     network        -       0/2     ONLINE     ONLINE   molari
 nicUP     network        -       0/2     ONLINE     ONLINE   sheridan
 powerUP   application   0/1      0/0     ONLINE     ONLINE   sheridan

Note, though, that if we unplug the cable again, nicUP will be set to OFFLINE and memberUP will be stopped.

 # caa_stat -t -v memberUP nicUP powerUP
 Name      Type          R/RA     F/FT    Target     State    Host
 ----------------------------------------------------------------------
 memberUP  application   0/1      0/0     ONLINE     OFFLINE
 nicUP     network        -       0/2     ONLINE     ONLINE   molari
 nicUP     network        -       1/2     ONLINE     OFFLINE  sheridan
 powerUP   application   0/1      0/0     ONLINE     ONLINE   sheridan

Will memberUP automatically start if the cable is plugged in again? Since the target state remains ONLINE, yes. Plug in the cable.

 # caa_stat -t -v memberUP powerUP nicUP
 Name      Type          R/RA     F/FT    Target     State    Host
 ----------------------------------------------------------------------
 memberUP  application   0/1      0/0     ONLINE     ONLINE   sheridan
 nicUP     network        -       0/2     ONLINE     ONLINE   molari
 nicUP     network        -       0/2     ONLINE     ONLINE   sheridan
 powerUP   application   0/1      0/0     ONLINE     ONLINE   sheridan
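
The four outcomes observed in this section can be condensed into a small decision table. The sketch below is only a summary of what we saw when starting memberUP without "-f"; it is not CAA's placement algorithm, and predict_start is a hypothetical name of our own. It also ignores the forced case, where "-f" may move the required resources to the hosting member if they are available there.

```shell
# predict_start POLICY HOSTING_MEMBER DEP_TYPE DEP_MEMBER
# Prints the member on which the dependent resource is expected to start
# (without -f), or "placement error". Summarizes this section's
# observations only; hypothetical helper, not part of CAA.
predict_start() {
    policy=$1; hosting=$2; dep_type=$3; dep_member=$4
    case "$policy" in
        favored|restricted) ;;
        *) echo "unknown policy: $policy" >&2; return 2 ;;
    esac
    # Optional resources did not override the placement policy, and there
    # is no conflict when the dependencies already run on the hosting member.
    if [ "$dep_type" = optional ] || [ "$dep_member" = "$hosting" ]; then
        echo "$hosting"
        return 0
    fi
    case "$policy" in
        favored)    echo "$dep_member" ;;      # required resources win
        restricted) echo "placement error" ;;  # observed without -f
    esac
}
```

For instance, `predict_start favored sheridan required molari` prints molari, matching the "favored" plus required case described above.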

23.6.3 Active Placement

Use the ACTIVE_PLACEMENT attribute only if the application resource should be relocated immediately when a more favored member becomes available. This attribute should be used with caution as it can cause your users to be disrupted more often than necessary.

For example, consider a two-member cluster where an application resource is favored to member1. If member1 were to become unavailable for whatever reason, the application resource would relocate to member2. Restarting the application resource on member2 will take at least a few seconds (but could take up to several minutes), thus disrupting your users' ability to get work accomplished.

The good news is that the application resource will be restarted automatically. The bad news is that it is not necessarily transparent.

If ACTIVE_PLACEMENT is set to the value of 1, when member1 reboots, CAA will automatically relocate the application resource back to member1 (the favored member). This means that the users would once again be disrupted.

What would happen if member1 experienced a problem that caused it to go down again and again? Your users would be continuously disrupted as the machine crashed and rebooted.

An alternative to setting the ACTIVE_PLACEMENT attribute to 1 is to have the cluster administrator relocate the application resource to the favored member at a time when the relocation will cause the least disruption. The REBALANCE attribute exists specifically to address this scenario.
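
If you prefer to keep the relocation in the administrator's hands, one way to picture it is a small wrapper that only proposes the relocation during a quiet period. This is a rough sketch under our own assumptions: the 01:00 to 05:00 window, and the helper names in_window and relocate_if_quiet, are hypothetical, and the function merely prints the caa_relocate command (in the form used earlier in this section) rather than running it.

```shell
# in_window HOUR START END
# Succeeds when HOUR falls in [START, END); also handles windows that
# wrap past midnight (for example 22 to 4).
in_window() {
    hour=$1; start=$2; end=$3
    if [ "$start" -le "$end" ]; then
        [ "$hour" -ge "$start" ] && [ "$hour" -lt "$end" ]
    else
        [ "$hour" -ge "$start" ] || [ "$hour" -lt "$end" ]
    fi
}

# relocate_if_quiet RESOURCE MEMBER HOUR
# Prints the relocation command when HOUR is inside the assumed
# 01:00-05:00 quiet window; prints nothing otherwise.
relocate_if_quiet() {
    if in_window "$3" 1 5; then
        echo "caa_relocate $1 -c $2"
    fi
}
```

An administrator could call `relocate_if_quiet memberUP member1 "$(date +%H)"` from a scheduled job and review (or execute) the command it prints.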

23.6.4 Polling an Application Resource

One of the optional functions of the Resource Manager is its ability to check the status of an application resource every CHECK_INTERVAL seconds. In order to take advantage of this feature, the action script must have a "check" entry point written to check the application's status. If the check is unsuccessful, the application is not running; therefore, CAA will run the action script with the "start" entry point to restart the application.

This "check" entry point can be as simple as performing a ps (1) and searching for the command name or PID or as complex as contacting the application and performing some function to determine its status. The exact implementation is left to the cluster administrator or developer in charge of writing the action script.
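
To make the simple end of that spectrum concrete, here is a minimal sketch of an action script's entry points built around a PID file. Everything specific in it is an assumption for illustration: the PIDFILE location, the function names, and the use of sleep as a stand-in for the real application. We also assume the usual convention that an exit status of 0 means success.

```shell
#!/bin/sh
# Hypothetical action script skeleton. PIDFILE and the "sleep" stand-in
# application are illustrative assumptions, not CAA defaults.
PIDFILE=${PIDFILE:-/tmp/myapp.pid}

app_start() {
    sleep 300 &                 # stand-in for launching the real application
    echo $! > "$PIDFILE"
}

app_stop() {
    [ -f "$PIDFILE" ] && kill "$(cat "$PIDFILE")" 2>/dev/null
    rm -f "$PIDFILE"
}

app_check() {
    # Succeeds only if the recorded PID still names an existing process.
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

# Dispatch on the entry point name passed as the first argument.
action() {
    case "$1" in
        start) app_start ;;
        stop)  app_stop ;;
        check) app_check ;;
        *)     echo "usage: {start|stop|check}" >&2; return 1 ;;
    esac
}
```

A complete script would end with `action "$1"; exit $?` so that the result of the chosen entry point becomes the script's exit status. Note that a PID test like this only shows that the process exists; it says nothing about whether the application is responding.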

Note the template.scr script in the /var/cluster/caa/template directory performs a simple check using the ps and grep commands. This is the default script used by the caa_profile and "sysman caa" commands.

Note

Although the "check" that is performed in the template.scr is simple, it is inefficient and does not handle the scenario where the application is hung. Therefore, we strongly recommend that you rewrite your action script's "check" entry point to probe the application to verify that it is actually running.
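
One way to approach such a probe is to run an application-level test command under a time limit, so that a hung application counts as a failed check instead of stalling it. The sketch below uses only standard shell job control. The name run_with_timeout is our own, and the probe command you pass to it (for example, a client that sends a trivial request to the application) is something you would have to supply.

```shell
# run_with_timeout SECONDS CMD [ARGS...]
# Succeeds only if CMD exits 0 before SECONDS elapse; a hung CMD is
# killed and reported as a failure. Hypothetical helper, not part of CAA.
run_with_timeout() {
    limit=$1; shift
    "$@" &
    probe_pid=$!
    # Watchdog: kill the probe if it is still running after $limit seconds.
    ( sleep "$limit"; kill "$probe_pid" 2>/dev/null ) >/dev/null 2>&1 &
    watchdog_pid=$!
    rc=0
    wait "$probe_pid" || rc=$?     # non-zero if the probe failed or was killed
    kill "$watchdog_pid" 2>/dev/null || true
    return "$rc"
}
```

A "check" entry point could then be as short as `run_with_timeout 5 /path/to/your_probe`, where the probe path is a placeholder for whatever command exercises your application.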




TruCluster Server Handbook
TruCluster Server Handbook (HP Technologies)
ISBN: 1555582591
EAN: 2147483647
Year: 2005
Pages: 273
