Chapter Review on PRM

     

Before anyone points out that we haven't looked at disk IO management with PRM, I am going to leave that for you to investigate. Currently, PRM supports IO bandwidth management only with LVM volume groups, and IO shares are imposed at the Volume Group level. In reality, this means we can impose IO shares only when more than one application shares a Volume Group; most of the time, applications live in their own Volume Group, making IO shares meaningless.

PRM is a powerful tool for managing workloads. It operates at the CPU, memory, and IO level. I have used it extensively in environments where multiple applications can be running on a single server. One of my favorite implementations of PRM is in a Serviceguard environment where all the applications are configured in PRM on all the nodes in the cluster. When the applications are running on their own nodes, they don't even notice PRM, because the default behavior is not to cap CPU. It's only when multiple applications end up on a single node that the PRM scheduler makes its presence known, restricting shares based on the installed configuration. The tricky part of that configuration is making your application users aware that when Serviceguard dictates that multiple applications run on a single node, overall application performance will be affected. It may be that we need to draw up Service Level Agreements that include performance metrics both for situations when nodes are running a single application and when nodes are running multiple applications. Deciding how individual applications are prioritized can also be tricky, but it is interesting to manage. I have been involved in installations where part of the process of moving a package to another node is to reconfigure PRM, using configuration files that have previously been set up on the individual nodes; see the sketch following this paragraph. The options are seemingly limitless. The beauty of PRM is the simplicity of its design and the simplicity of its configuration.
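As a hedged illustration of that last point, here is the kind of fragment that might be added to a package's start-up logic so that a node's own PRM configuration is re-applied whenever a package lands on it. The file names and directory layout are invented for illustration; only prmconfig -i (initialize PRM from /etc/prmconf) is relied on, so check prmconfig(1) on your release before using anything like this.

# Hedged sketch only: re-apply this node's PRM configuration when a package
# starts here. The per-node file name below is hypothetical.
NODE=$(hostname)
PRM_CONF=/etc/cmcluster/pkg1/prmconf.${NODE}

if [ -f "$PRM_CONF" ]
then
    cp "$PRM_CONF" /etc/prmconf          # make it the active PRM configuration
    /opt/prm/bin/prmconfig -i            # (re)initialize PRM from /etc/prmconf
fi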

Now we take a look at WorkLoad Manager (WLM). WLM is an alternative to PRM, although at its heart it uses the PRM scheduler.

11.10.3 WorkLoad Manager (WLM)

WorkLoad Manager is a tool that is at home managing a large number of complex workloads. A fundamental difference between PRM and WLM is the idea of prioritizing workloads. Whereas with PRM we are usually managing a number of intensive online applications, WLM is quite happy taking a varied workload and applying a set of rules to it: simple priority-based resource allocation, share entitlements based on application-generated metrics, and tuning the resources allocated to applications based on measured application performance rather than simple CPU, memory, and disk statistics. WLM can even add processing power dynamically in a partition configuration (currently only with unbound CPUs in an adjacent vPar within a single server/nPar). WLM is an extensive product with many options, and we don't have the luxury of looking at all the permutations. What we can do is take our current PRM configuration and look at how WLM would manage a similar workload. We can also look at a cornerstone of WLM: goal-based Service Level Objectives. This is where we integrate into WLM our own scripts and programs that interface with our applications in order to measure some level of application performance. These metrics are fed into WLM's arbitration system, whereby more system resources can be allocated to applications based not only on simple share entitlements but also on feedback from the applications themselves. Before we can achieve any of this, we need to create a configuration file that WLM can understand; a skeleton of the main building blocks follows.
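Before looking at a real file, here is a generic skeleton, not taken from the running system, showing how the pieces we will meet fit together. The group, metric, and script names are placeholders; the keywords themselves (prm, slo, pri, entity, mincpu, maxcpu, goal, tune, coll_argv, and wlmrcvdc) are the ones used in the examples that follow.

# Generic WLM configuration skeleton; all names are placeholders.
prm {
    groups = GroupA : 2;                       # workload groups, as in PRM
}

slo slo_GroupA {
    pri    = 1;                                # priority used during arbitration
    entity = PRM group GroupA;                 # the group this SLO requests CPU for
    mincpu = 10;                               # smallest CPU request WLM will make
    maxcpu = 50;                               # largest CPU request WLM will make
    goal   = metric MY_METRIC < 100;           # optional goal driven by application data
}

tune MY_METRIC {
    coll_argv = wlmrcvdc /path/to/my_collector.sh;   # how the metric reaches WLM
}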

11.10.3.1 THE WLM CONFIGURATION FILE

The initial problem we have with WLM is that it doesn't come with a default configuration file. That's because there are so many potential starting points for a WLM configuration. There are a number of excellent example configuration files in the /opt/wlm/examples/wlmconf directory:

 

root@hpeos003[] cd /opt/wlm/examples/wlmconf
root@hpeos003[wlmconf] ll
total 106
-r--r--r--   1 root       sys        2178 Mar 21  2002 .install_test.wlm
-r--r--r--   1 root       sys        3309 Mar 21  2002 README
-r--r--r--   1 root       sys        4600 Mar 21  2002 distribute_excess.wlm
-r--r--r--   1 root       sys        2306 Mar 21  2002 enabling_event.wlm
-r--r--r--   1 root       sys        3191 Mar 21  2002 entitlement_per_procem
-r--r--r--   1 root       sys        2282 Mar 21  2002 fixed_entitlement.wlm
-r--r--r--   1 root       sys        3998 Mar 21  2002 manual_entitlement.wlm
-r--r--r--   1 root       sys        2680 Mar 21  2002 metric_condition.wlm
-r--r--r--   1 root       sys        5011 Mar 21  2002 performance_goal.tempe
-r--r--r--   1 root       sys        3530 Mar 21  2002 stretch_goal.wlm
-r--r--r--   1 root       sys        2579 Mar 21  2002 time_activated.wlm
-r--r--r--   1 root       sys        5645 Mar 21  2002 twice_weekly_boost.wlm
-r--r--r--   1 root       sys        1487 Mar 21  2002 usage_goal.wlm
-r--r--r--   1 root       sys        3940 Mar 21  2002 user_application_recom
root@hpeos003[wlmconf]

With WLM A.02.00, there is a configuration wizard, /opt/wlm/bin/wlmcw. I'll let you play with that in your own time. Personally, the way I have approached WLM is first to understand how PRM operates in at least a general context. I think this is important because it is the PRM scheduler that WLM utilizes to allocate resources; it's just that WLM can be seen to be more intelligent than a simple PRM configuration. The number and extent of the available configuration files can sometimes be a bit daunting. The online manual available at http://docs.hp.com is an excellent resource when you are first configuring WLM (there are PDF and PostScript versions of this, along with further documentation supplied with the product under the /opt/wlm/share/doc directory). If you have an existing PRM configuration file, it can be used as an initial starting point to build an equivalent WLM configuration file. The only issue WLM has with the PRM configuration file is the inclusion of the PRM_SYS group (the group that root is assigned to by default). If we take that group out of the PRM file, WLM will quite happily produce a configuration file that we can use to get our initial WLM configuration off the ground; a hedged sketch of removing those entries appears below. The converted file is going to be my starting point for these examples.
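As a hedged sketch (assuming PRM_SYS appears literally on its group and user records, as it does in a default PRM configuration file), pruning those entries could be as simple as:

# Keep a copy of the original, then drop every record mentioning PRM_SYS.
cp /etc/prmconf /etc/prmconf.orig
grep -v PRM_SYS /etc/prmconf.orig > /etc/prmconf

With that done, the conversion itself is a one-liner: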

 

root@hpeos003[] wlmprmconf /etc/prmconf /etc/wlmconf
root@hpeos003[]
root@hpeos003[] ll /etc/wlmconf
-rw-rw-rw-   1 root       sys           2673 Nov 26 06:03 /etc/wlmconf
root@hpeos003[]

I have called my WLM configuration file wlmconf simply because it seemed like a convenient name. When we start looking at the WLM configuration file, it is quite different in layout from a PRM configuration. There are quite a lot of comments in the file produced. Here are the pertinent entries:

 

root@hpeos003[] more /etc/wlmconf
...
#
slo slo_QUARRY {
    pri = 1;
    entity = PRM group QUARRY;
    mincpu = 6;
    maxcpu = 6;
}

slo slo_ADMIN {
    pri = 1;
    entity = PRM group ADMIN;
    mincpu = 4;
    maxcpu = 4;
}

slo slo_IT {
    pri = 1;
    entity = PRM group IT;
    mincpu = 2;
    maxcpu = 2;
}

slo slo_OTHERS {
    pri = 1;
    entity = PRM group OTHERS;
    mincpu = 100;
    maxcpu = 100;
}

#
# PRM configuration
#
prm {
    groups = QUARRY : 2,
             ADMIN : 3,
             IT : 4,
             OTHERS : 1;

    users = daemon : OTHERS,
            bin : OTHERS,
            sys : OTHERS,
            adm : OTHERS,
            uucp : OTHERS,
            lp : OTHERS,
            nuucp : OTHERS,
            hpdb : OTHERS,
            oracle : OTHERS,
            www : OTHERS,
            webadmin : OTHERS,
            charlesk : OTHERS,
            smbnull : OTHERS,
            sshd : OTHERS,
            ids : OTHERS,
            mysql : OTHERS,
            haydes : OTHERS,
            betty : OTHERS,
            fred : QUARRY,
            barney : ADMIN,
            wilma : IT;

    apps = ADMIN : "/home/haydes/bin/*adm*",
           IT : "/home/haydes/bin/comp-it*",
           IT : "/home/haydes/bin/itDB" "itDB*",
           QUARRY : "/home/haydes/bin/quarry*",
           QUARRY : "/home/haydes/bin/rdr_quarry" "rdr_quarry*";

    memweight = OTHERS : 1,
                QUARRY : 1,
                ADMIN : 1,
                IT : 1;

    gminmem = OTHERS : 5,
              QUARRY : 26,
              ADMIN : 52,
              IT : 15;

    gmaxmem = OTHERS : 5,
              QUARRY : 26,
              ADMIN : 52,
              IT : 15;
}
root@hpeos003[]

I don't think the syntax is difficult to understand, so I won't go through it laboriously. Essentially, the wlmprmconf utility has transformed the PRM structures into WLM structures. Here are a few things to note:

  • At its heart, WLM is a goal-based system. Our configuration lists a number of Service Level Objectives (SLOs) without a defined goal. As such, these are known as entitlement-based SLOs, where mincpu and maxcpu represent the SLO's current CPU request for its workload. These are not hard limits; hard limits can be defined in a group definition using gmincpu and gmaxcpu (see the sketch after this list).

  • While shares do not need to add up to 100, a single CPU is considered to have 100 shares available. As such, if a group requires 80 percent of a CPU, allocate it 80 shares. On an eight-CPU system, if you wanted to ensure that a group was allocated two CPUs' worth of processing, you would allocate it 200 shares.

  • WLM uses an allocation policy known as a rising-tide allocation policy: a single CPU share is initially allocated to all groups, and WLM then arbitrates further allocation based on defined goals. In this way, groups rise toward their allocation requests according to SLO priority. Arbitration is an internal feature of WLM and occurs every WLM interval (60 seconds by default).
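To make the first two points concrete, here is a hedged fragment (the BATCH group and all the numbers are invented for illustration on an eight-CPU server) showing hard CPU limits applied with gmincpu/gmaxcpu alongside an SLO that requests two CPUs' worth of shares:

# Hedged illustration only; BATCH is an invented group.
prm {
    groups  = BATCH : 5;
    gmincpu = BATCH : 20;        # hard floor: never fewer than 20 shares (0.2 CPU)
    gmaxcpu = BATCH : 400;       # hard ceiling: never more than 4 CPUs' worth
}

slo slo_BATCH {
    pri    = 2;
    entity = PRM group BATCH;
    mincpu = 200;                # request two CPUs' worth (2 x 100 shares)
    maxcpu = 200;
}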

Knowing these pieces of information, I propose to start up this configuration with our haydes application and observe any differences in behavior.

If we are going to use WLM, we need to ensure that PRM is not activated explicitly at reboot; we disable PRM in /etc/rc.config.d/prm and enable WLM via /etc/rc.config.d/wlm (WLM_ENABLE=1). WLM will activate the PRM scheduler as part of its startup procedure.

 

root@hpeos003[] prmmonitor
PRM resource manager(s) disabled. (PRM-003)
root@hpeos003[]

I can check the syntax of my new configuration file before starting the WLM daemons:

 

root@hpeos003[] wlmd -c /etc/wlmconf
root@hpeos003[] wlmd -a /etc/wlmconf
root@hpeos003[] prmmonitor 1 1
PRM configured from file:  /var/opt/wlm/tmp/wmprmBAAa03695
File last modified:        Wed Nov 26 14:21:29 2003

HP-UX hpeos003 B.11.11 U 9000/800    11/26/03

Wed Nov 26 14:21:47 2003    Sample:  1 second
CPU scheduler state:  Enabled, CPU cap ON

                                              CPU      CPU
PRM Group                       PRMID   Entitlement     Used
____________________________________________________________
OTHERS                              1        88.00%    0.00%
QUARRY                              2         6.00%    0.00%
ADMIN                               3         4.00%    0.00%
IT                                  4         2.00%    0.00%

Wed Nov 26 14:21:47 2003    Sample:  1 second
Memory manager state:  Enabled    Paging:  No

                                           Memory  Upper
PRM Group                 PRMID  Entitlement  Bound    Usage   Procs  Stops
_______________________________________________________________________________
OTHERS                        1        7.00%            0.71%      1
QUARRY                        2       26.00%            0.00%      0
ADMIN                         3       52.00%            0.00%      0
IT                            4       15.00%            0.00%      0

PRM application manager state:  Enabled  (polling interval: 30 seconds)
root@hpeos003[]

One of the most crucial things to note is that, by default, WLM enables CPU capping. This means that our applications will be limited to their entitlements and nothing more. Here is the haydes application running under this configuration:

 

root@hpeos003[] ps -fu haydes
     UID   PID  PPID  C    STIME TTY       TIME COMMAND
  haydes  3914     1 25 15:35:27 ?        0:02 itDB
  haydes  3909     1 13 15:35:26 ?        0:01 adminDB
  haydes  3917     1 42 15:35:28 ?        0:03 rdr_quarry
  haydes  3913     1  0 15:35:27 ?        0:00 comp-it3
  haydes  3915     1 41 15:35:27 ?        0:03 quarryDB
  haydes  3906     1  0 15:35:26 ?        0:00 FLUSH_admin
  haydes  3912     1  0 15:35:27 ?        0:00 comp-it2
  haydes  3910     1  0 15:35:27 ?        0:00 adminRPT
  haydes  3908     1 16 15:35:26 ?        0:01 TRANS_adm.ship
  haydes  3907     1 13 15:35:26 ?        0:02 RECV_adm.ship
  haydes  3911     1  0 15:35:27 ?        0:00 comp-it1
root@hpeos003[] prmmonitor 1 1
PRM configured from file:  /var/opt/wlm/tmp/wmprmBAAa03888
File last modified:        Wed Nov 26 15:35:14 2003

HP-UX hpeos003 B.11.11 U 9000/800    11/26/03

Wed Nov 26 15:36:54 2003    Sample:  1 second
CPU scheduler state:  Enabled, CPU cap ON

                                              CPU      CPU
PRM Group                       PRMID   Entitlement     Used
____________________________________________________________
OTHERS                              1        88.00%    0.00%
QUARRY                              2         6.00%    5.99%
ADMIN                               3         4.00%    3.99%
IT                                  4         2.00%    2.00%

Wed Nov 26 15:36:54 2003    Sample:  1 second
Memory manager state:  Enabled    Paging:  No

                                           Memory  Upper
PRM Group                 PRMID  Entitlement  Bound    Usage   Procs  Stops
_______________________________________________________________________________
OTHERS                        1        7.00%            0.48%      1
QUARRY                        2       26.00%            0.03%      2
ADMIN                         3       52.00%           41.15%      5
IT                            4       15.00%           11.92%      4

PRM application manager state:  Enabled  (polling interval: 30 seconds)
root@hpeos003[]

We need to note two things about our share allocations. First, the allocation is as specified in our wlmconf file (6, 4, 2, and so on). Second, CPU capping is in use: the default behavior of WLM is not to distribute any spare CPU cycles to active groups, and excess CPU is allocated to the OTHERS group by default. If we want to turn off CPU capping, we need to implement a tuning feature known as distribute_excess, which distributes excess CPU cycles among groups other than the OTHERS group. I have two things to do here:

  • I will rework my share entitlements to align with the idea that one CPU has 100 shares.

  • I will utilize the distribute_excess global tuning directive to allow applications to use spare CPU cycles when other applications are shut down.

     

root@hpeos003[] vi /etc/wlmconf
...
slo slo_QUARRY {
    pri = 1;
    entity = PRM group QUARRY;
    mincpu = 50;
    maxcpu = 50;
}

slo slo_ADMIN {
    pri = 1;
    entity = PRM group ADMIN;
    mincpu = 30;
    maxcpu = 30;
}

slo slo_IT {
    pri = 1;
    entity = PRM group IT;
    mincpu = 15;
    maxcpu = 15;
}

slo slo_OTHERS {
    pri = 1;
    entity = PRM group OTHERS;
    mincpu = 5;
    maxcpu = 5;
}
...
tune {
    distribute_excess = 1;
}
root@hpeos003[]

Unfortunately, I need to kill and restart the daemon for these changes to take effect:

 

root@hpeos003[] wlmd -k
root@hpeos003[] wlmd -c /etc/wlmconf
root@hpeos003[] wlmd -a /etc/wlmconf
root@hpeos003[] prmmonitor 1 1
PRM configured from file:  /var/opt/wlm/tmp/wmprmBAAa04070
File last modified:        Wed Nov 26 16:01:19 2003

HP-UX hpeos003 B.11.11 U 9000/800    11/26/03

Wed Nov 26 16:01:57 2003    Sample:  1 second
CPU scheduler state:  Enabled, CPU cap ON

                                              CPU      CPU
PRM Group                       PRMID   Entitlement     Used
____________________________________________________________
OTHERS                              1         5.00%    0.00%
QUARRY                              2        50.00%   50.60%
ADMIN                               3        30.00%   29.77%
IT                                  4        15.00%   14.88%

Wed Nov 26 17:35:14 2003    Sample:  1 second
Memory manager state:  Enabled    Paging:  No

                                           Memory  Upper
PRM Group                 PRMID  Entitlement  Bound    Usage   Procs  Stops
_______________________________________________________________________________
OTHERS                        1        7.00%            0.62%      1
QUARRY                        2       26.00%           20.61%      3
ADMIN                         3       52.00%           41.26%      6
IT                            4       15.00%           11.94%      5

PRM application manager state:  Enabled  (polling interval: 30 seconds)
root@hpeos003[]

This appears to be operating as we would expect.

11.10.3.2 SPECIFYING A GOAL

At the moment, our configuration behaves in a similar fashion to PRM. However, WLM allows us to specify performance goals for applications, which WLM will then try to achieve. The idea is that we write a script or program (or use a WLM Toolkit) that extracts some performance-related metrics from our application. WLM uses this information as a basis for retuning the application's resource allocation. In this way, WLM can distribute resources in a more intelligent manner; if an application is already achieving its performance goals, why give it more resources?

At the heart of specifying a goal is our interaction with the application itself. We need to sit down with our application administrators and users and try to determine what constitutes adequate performance for our applications. We will then reference this goal in our SLO:

 

root@hpeos003[] vi /etc/wlmconf
...
slo slo_QUARRY {
    pri = 1;
    entity = PRM group QUARRY;
    mincpu = 5;
    maxcpu = 60;
    goal = metric TONNAGE < 200000;
}
...
root@hpeos003[]

The metric named TONNAGE now needs a corresponding tune statement, which configures how we gather the statistics from the QUARRY application:

 

root@hpeos003[] vi /etc/wlmconf
...
tune TONNAGE {
    coll_argv = wlmrcvdc /q01/local/scripts/wlm/quarry_weigbridge.sh;
}
...
root@hpeos003[]

The reference to the wlmrcvdc command is necessary because my program/script, quarry_weighbridge.sh, has not been written using WLM API calls. This sets up a WLM rendezvous point called TONNAGE to which data can be sent from my application analysis program. Inside the shell script is code that interrogates the application and uses the wlmsend command to send information back to the WLM rendezvous point.

 

root@hpeos003[] more /q01/local/scripts/wlm/quarry_weigbridge.sh
...
                wlmsend TONNAGE $RESP
                echo $RESP > $COUNTER
                ;;
esac
...
root@hpeos003[]
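The fragment above is only part of the real script. As a hedged, minimal sketch of the same idea (quarry_tonnage_query is a placeholder for whatever interrogates the real application), a collector looping under wlmrcvdc might look like this:

#!/sbin/sh
# Minimal sketch of a data collector for the TONNAGE rendezvous point.
# The helper command below is hypothetical; the loop simply forwards its
# output to WLM via wlmsend at a fixed interval.

INTERVAL=60                     # seconds between samples

while true
do
    RESP=$(/q01/local/scripts/wlm/quarry_tonnage_query)   # hypothetical helper
    if [ -n "$RESP" ]
    then
        wlmsend TONNAGE "$RESP"                           # hand the value to WLM
    fi
    sleep $INTERVAL
done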

This whole process can be time-consuming as you come to terms with what is and isn't important as far as WLM priorities, mincpu, gmincpu, wlmsend, distribute_excess, and so on are concerned. There is no easy answer except understanding your application workload requirements, studying the documentation carefully, and trying out some test configurations with the help of the example files under /opt/wlm/examples/wlmconf. The examples I have shown you are from a working system; it took several weeks to fine-tune the details while we came to understand the finer points of the applications.

One last thought before we leave WLM: there is an interesting goal you can set up known as a usage goal (goal = usage _CPU;), whereby WLM compares a group's actual CPU usage with its current allocation. If a group is not using its entire allocation, WLM can decrease that allocation and use the spare capacity elsewhere. It builds a little more intelligence into the arbitration process. Here is an example of using usage goals:

 

root@hpeos003[] cat /etc/wlmconf
...
slo slo_QUARRY {
    pri = 1;
    entity = PRM group QUARRY;
    mincpu = 5;
    maxcpu = 70;
    goal = usage _CPU;
}

slo slo_ADMIN {
    pri = 2;
    entity = PRM group ADMIN;
    mincpu = 5;
    maxcpu = 40;
    goal = usage _CPU;
}

slo slo_IT {
    pri = 2;
    entity = PRM group IT;
    mincpu = 5;
    maxcpu = 20;
    goal = usage _CPU;
}

tune {
    distribute_excess = 1;
    wlm_interval = 30;
}
root@hpeos003[]

In the grand scheme of things, this is actually a rather simple configuration but one that worked well. Here is a typical day's output from prmmonitor:

 

root@hpeos003[] prmmonitor 1 1
PRM configured from file:  /var/opt/wlm/tmp/wmprmBAAa05836
File last modified:        Wed Nov 26 22:27:36 2003

HP-UX hpeos003 B.11.11 U 9000/800    11/26/03

Wed Nov 26 22:29:16 2003    Sample:  1 second
CPU scheduler state:  Enabled, CPU cap ON

                                              CPU      CPU
PRM Group                       PRMID   Entitlement     Used
____________________________________________________________
OTHERS                              1         5.00%    0.00%
QUARRY                              2        70.00%   70.93%
ADMIN                               3        13.00%   12.99%
IT                                  4        12.00%   11.99%

Wed Nov 26 22:29:16 2003    Sample:  1 second
Memory manager state:  Enabled    Paging:  No

                                           Memory  Upper
PRM Group                 PRMID  Entitlement  Bound    Usage   Procs  Stops
_______________________________________________________________________________
OTHERS                        1        7.00%            0.40%      1
QUARRY                        2       26.00%            0.14%      4
ADMIN                         3       52.00%            8.69%      6
IT                            4       15.00%            0.14%      4

PRM application manager state:  Enabled  (polling interval: 30 seconds)
root@hpeos003[]

When setting up WLM, I would also suggest that you start the daemon with arguments to log metric and SLO data to the file /var/opt/wlm/wlmdstats. The command line would be this:

 

root@hpeos003[] wlmd -a /etc/wlmconf -l metric,slo
root@hpeos003[]

In this way, you can monitor how WLM is adjusting entitlements based on your various SLO goals and priorities.
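As a hedged example of what that monitoring might look like (the exact field layout of wlmdstats varies between WLM releases, so this simply filters raw lines rather than parsing columns):

# Watch how WLM adjusts the QUARRY group and the TONNAGE metric over time.
tail -f /var/opt/wlm/wlmdstats | grep -e QUARRY -e TONNAGE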

11.10.3.3 HELP IS AT HAND: WLM TOOLKITS

While you are trying to uncover the intricacies of your application and how to implement it within a WLM environment, help is at hand in the shape of the WLM Toolkits. These are ready-to-run WLM configuration files, together with data-collector programs and scripts that can feed data into WLM rendezvous points to assist the WLM arbitration process. The current list of Toolkits includes:

  • WLM Oracle Database Toolkit

  • WLM Pay Per Use Toolkit

  • WLM Apache Toolkit

  • WLM BEA WebLogic Server Toolkit

  • WLM SNMP Toolkit

  • WLM Duration Management Toolkit

  • WLM SAS Toolkit

The most up-to-date toolkits can be downloaded free of charge from http://software.hp.com. The files are installed under the /opt/wlm/toolkits directory:

 

root@hpeos003[] ll /opt/wlm/toolkits/
total 8
-r--r--r--   1 root       sys           1533 Mar 21  2002 README
dr-xr-xr-x   7 bin        bin           1024 Nov 26 05:41 apache
dr-xr-xr-x   2 bin        bin             96 Nov 26 05:41 doc
dr-xr-xr-x   6 bin        bin             96 Nov 26 05:41 duration
dr-xr-xr-x   2 bin        bin             96 Nov 26 05:41 man
dr-xr-xr-x   7 bin        bin           1024 Nov 26 05:41 oracle
dr-xr-xr-x   6 bin        bin             96 Nov 26 05:41 sas
dr-xr-xr-x   5 bin        bin             96 Nov 26 05:41 snmp
dr-xr-xr-x   5 bin        bin             96 Nov 26 05:41 utility
root@hpeos003[]

They are well documented and easy to follow. You have no excuses.


