8.3 Monitoring the Domino system

Domino has many server monitoring features and tools that work together to inform you about the status, performance, and use of the Domino system. The Domino server provides services and tasks that create and report information about the Domino system in the form of statistics and events.

Domino generates statistics and events that you can use to monitor the activity of your servers. They indicate whether the system is running smoothly, or signal you when there is a malfunction. You can monitor the availability and performance of your system using the Domino Administrator, the Domino Console, or the Web Administrator.

8.3.1 System monitoring tools

The Domino Administrator includes system-monitoring tools that you use to configure, view, and track the Domino system:

Monitoring databases - store monitoring documents, information, and results.

  • The Monitoring Configuration database (EVENTS4.NSF) stores the documents you can use to set up monitoring. It also includes information about statistics, statistic thresholds, and event messages.

    EVENTS4.NSF is used to configure Notes server event handling, statistic monitoring, ACL monitoring, and replication monitoring. It contains the names of all statistics monitored by the server, and thresholds for producing event records. It also contains error and status messages from the server. This database can be used to:

    • Configure Notes server event handling

    • Look up information about a specific statistic or event message

    • Set statistic thresholds

    • Assign types and severities to server events

  • The Monitoring Results database (STATREP.NSF) stores the gathered statistics reports and can be configured to store information about logged events. The Statistic Collector task gathers statistics for one or more servers. It loads automatically on a server if it is listed in the task line (ServerTasks=Replica,Router,AMgr,AdminP,...) of the notes.ini file.

There are two ways to set up statistics collection:

  1. Start the Statistic Collector task on each server, to collect its own statistics and create the report in the local Monitoring Results database

  2. Start the Statistic Collector on one server that you set up to collect statistics from one or more servers and create reports in a specified "centralized" Monitoring Results database
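Because the Statistic Collector loads automatically only when it is listed in the task line of notes.ini, it can be useful to check that line on each server. The following is a minimal sketch that reads the ServerTasks line from the text of a notes.ini file. The sample file content is hypothetical, and the task name "Collect" is an assumption that you should verify against your own server.

```python
def server_tasks(ini_text):
    """Return the task names listed on the ServerTasks line of a notes.ini."""
    for line in ini_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "servertasks":
            return [task.strip() for task in value.split(",") if task.strip()]
    return []


def has_task(ini_text, task_name):
    """Case-insensitive check for one task name on the ServerTasks line."""
    return task_name.lower() in (task.lower() for task in server_tasks(ini_text))


# Hypothetical notes.ini excerpt. "Collect" is assumed to be the name of the
# Statistic Collector task; confirm it before relying on this check.
sample = "ServerTasks=Replica,Router,AMgr,AdminP,Collect\n"
print(server_tasks(sample))
print(has_task(sample, "Collect"))
```

The check matches only the exact ServerTasks key, so other settings whose names merely begin with the same text are ignored.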

Monitoring Configuration documents - define and configure what constitutes an event and how the event is handled.

You can also customize the messages that appear on the console when an event occurs. To configure an event, you need to determine the type of event, the severity level, and how you want it handled. You configure your events using the Event Generator and Event Handler documents.

  • Event generators describe the condition that must be met for an event to be generated. Event generators gather information by monitoring a task or a statistic, or by probing a server for access or connectivity.

    Each event generator has a specified threshold or condition which, when met, causes an event to be created. The event is passed to the Event Monitor task, which checks whether an event handler has been defined. The Domino Administrator includes a set of default event generators, which are listed in the Event Generators view of the Monitoring Configuration database (EVENTS4.NSF).

  • Event handlers describe what happens when the event occurs; for example, you can mail a notification of the event to a file or an administrator, or you can log the event to the log file (LOG.NSF). If you want to know about an event, you must have an Event Handler document; otherwise, the event is not recorded.
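The relationship between event generators, the Event Monitor task, and event handlers can be sketched as a small model. This is an illustration only, not Domino code: the class names, the "greater than" threshold comparison, and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Event:
    event_type: str
    severity: str
    message: str


@dataclass
class Handler:
    event_type: str
    severity: str
    action: str  # for example "mail" or "log"


def generate(stat_name, value, threshold, event_type, severity):
    """Event generator: create an event only when the threshold is exceeded."""
    if value > threshold:
        message = f"{stat_name} = {value} exceeds {threshold}"
        return Event(event_type, severity, message)
    return None


def route(event, handlers):
    """Event Monitor: return the actions of every handler matching the event."""
    return [
        handler.action
        for handler in handlers
        if handler.event_type == event.event_type
        and handler.severity == event.severity
    ]


# Hypothetical threshold and handler for illustration.
handlers = [Handler("Mail", "Warning (High)", "mail")]
event = generate("Mail.TransferThreads.Active", 12, 10, "Mail", "Warning (High)")
if event is not None:
    print(route(event, handlers))
```

An empty result from route corresponds to the case where no Event Handler document exists, so the event is not recorded.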

The Monitoring Configuration database (EVENTS4.NSF) includes default event handlers for server tasks. However, you can easily disable a default event handler and replace it with a customized one. To manually create an Event Handler document in the Monitoring Configuration database (EVENTS4.NSF), do the following:

  1. From the Domino Administrator, click the Configuration tab, and open the Monitoring Configuration view.

  2. Open the Event Handlers - All view, and click New Event Handler.

  3. On the Basics tab, in the Server to monitor field, choose which server.

  4. Under Notification trigger, select an option (for example, Any event that matches a criteria).

  5. Complete the fields on the Event tab: event type, event severity, and message text.

  6. Click the Action tab, then select the notification method and select the enablement option.

Note 

The Event Monitor task (formerly known as the Event task) starts automatically when you start the server, and it must run on all servers that you want to monitor.

Server tasks - collect and record information about the Domino system. The Event Monitor task determines if an Event Handler has been configured for the event, and if so, routes the event to the specified person, database, or server-management program for processing.

The Statistic Collector task gathers Domino server statistics and creates statistics reports in the Monitoring Results database (STATREP.NSF) or in another database that you specify. The ISpy task executes TCP server and mail-routing event generators. However, we recommend that you use ISpy sparingly, since it does have a performance impact.

8.3.2 Server availability

You can check and monitor the availability of your Domino servers by using the Domino Console, Domino Administrator, or the Web Administrator. In addition, the Domino server monitor will provide you a way to display real-time statistics, as well as a visual representation of the status of servers and server tasks that you are monitoring.

You can also set up an event generator or probe to check on the availability of your servers on a regular basis. If you are currently using any third-party tools, you can check if they are also available for Domino on Linux.

Domino Console

In the Domino Console, you can filter events and processes to check the status and availability of your server. You can filter server messages according to their severity. You can specify which server events the Domino Console displays in the console window.

By default, the Domino Console shows all server events. To allow filtering of the event types that you want displayed, expand the bottom panel of Domino Console by clicking the up/down arrow that is directly above the Domino Command field; or from the View menu, select Event Filter.

If you want to display only error messages, then in the Command panel, select only the type of server events you want to show in the Domino console window (see Figure 8-9 on page 185). For example:

  • Fatal

  • Failure

  • Warning (High)

Figure 8-9: Domino Console filter messages
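The effect of the severity filter can be illustrated with a short sketch. It is not part of the Domino Console; the event messages are hypothetical, and "Normal" is assumed as an example of a severity that is left unselected.

```python
# Severities to show; these three are the ones selected in the text above.
SHOWN = {"Fatal", "Failure", "Warning (High)"}


def filter_events(events, shown=SHOWN):
    """Keep only the (severity, message) pairs whose severity is selected."""
    return [event for event in events if event[0] in shown]


# Hypothetical console events; "Normal" is assumed here as an example of a
# severity that the filter hides.
events = [
    ("Normal", "Database server started"),
    ("Failure", "Unable to connect to remote server"),
    ("Fatal", "Server task terminated abnormally"),
]
for severity, message in filter_events(events):
    print(f"{severity}: {message}")
```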

For more information on server events, refer to the Domino Administrator 6 Help database.

Domino Server Monitor

In the Domino Administrator, with the Domino Server Monitor, you can view all your servers or just a subset of servers. With the server monitor, you can:

  • View server monitor statistics by timeline or by state

  • Display past error states

  • Add or remove a server to monitor

  • Add or remove server tasks or statistics from a selected server or from all servers

Note 

The Domino server monitor is not available in the Web Administrator.

Start and stop the Domino server monitor manually, as follows:

  1. From the Domino Administrator, click the Server - Monitoring tab.

  2. Click the Green arrow in the upper-right of the task screen.

  3. To stop the server monitor, click Stop.

Each server and server task displays a status indicator that identifies its current state: running, not running, or not responding, as shown in Figure 8-10. You can use the option "Display past states reporting errors exclusively" to view only error states.

Figure 8-10: Domino Server monitor

Note 

The Domino server monitor does not start by default. Change the monitoring defaults in the Administration Preferences to start it automatically.

Domino Server event generator

Another way to check for the availability of your Domino Server is to create a Domino Server event generator. There are numerous server probes or events you can define; you have to decide which ones are useful in your environment.

You can set up your "probe" or server event as follows:

  1. From the Domino Administrator, click the Configuration tab, and then open the Monitoring Configuration view.

  2. Open the Event Generators - Domino Server Response view, and then click New Domino Server Event Generator.

We defined two kinds of probes:

  1. To open a database to check if the server is running and alive, as shown in Figure 8-11 on page 187

    Figure 8-11: Server Availability probe

  2. Sending mail to check if the router task is working, as shown in Figure 8-12 on page 188

    Figure 8-12: Mail probe

Of course, both probes also check the network connection between the servers.

We set up a "probe" server event that checked the connectivity and port status of the two servers, DomServA/ITSO and DomServC/ITSO, by opening the readme.nsf file every six minutes (the default is every three minutes). Do not select a large file, since it takes a long time to open.
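If you also want a quick connectivity check from outside Domino, a plain TCP connection test can complement the probe. The sketch below is not the Domino probe itself: it only confirms that a port accepts connections and does not open a database. Port 1352 is the registered default port for NRPC; the host name in the usage comment is hypothetical.

```python
import socket


def port_is_open(host, port=1352, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, time-outs, and name resolution errors.
        return False


# Example usage (hypothetical host name):
#     port_is_open("domserva.example.com")
```

A successful connection shows only that the network path and the listening port are available, so it is a weaker test than opening a database on the server.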

Set up the mail probe as follows:

  1. Make sure the ISpy task is running.

  2. From the Domino Administrator, click the Configuration tab, and then open the Monitoring Configuration view.

  3. Open the Event Generators -> Mail view, and click New Mail Routing Event Generator.

  4. On the Basics tab, complete the fields listed in Table 8-4.

    Table 8-4: Mail probe - basic settings

    | Field | Action |
    |---|---|
    | All Domino servers in the domain will probe themselves | Do one: check this option to have each server probe only its local mail box, or uncheck this option to probe specified servers. |
    | Recipient | Enter the address of the recipient for which you want to check the mail route or use the drop-down box to select a recipient from a Domino Directory or Address Book. Do not enter more than one user and do not enter a group name. |
    | Probing servers (source) | Select the name of the server from which to start the probe. |
    | Show intermediate hop times | Enable this option to track intermediate hop times. |

  5. Click the Probe tab and complete the fields shown in Table 8-5.

    Table 8-5: Mail probe - probe settings

    | Field | Action |
    |---|---|
    | Send interval | Enter the number of minutes between probes. The default is 15. |
    | Time-out threshold | Enter the number of minutes the probing server (source) waits for a response before logging a failure. |

  6. Click the Other tab, complete fields shown in Table 8-6, and then click Save and Close.

    Table 8-6: Mail probe - miscellaneous settings

    | Field | Action |
    |---|---|
    | On time-out, generate a Mail event of severity | Select the severity level. |
    | Create a new event handler for this event | Click this button to launch the Event Notification Wizard and create an event handler. |

8.3.3 Performance monitoring

Domino generates statistics that you can use to monitor your system performance and utilization. You can use the Domino and platform statistics or Tivoli® Analyzer for Domino to determine the health of your servers.

When you run Linux under VM, remember that all performance data depends on the resources dedicated to Linux at any specific point in time. VM keeps changing the amount of CPU and memory, so each time Linux tries to find out what percentage of memory is allocated, the total it measures against may have changed. Any absolute value should therefore be correct, but any relative (percentage) value may not be.

Domino statistics and VM

When you run Domino under VM, be aware that all the system resources that Domino and Linux are measuring are virtual resources. So VM makes Linux think it has nnn MB of memory or xx CPUs. If those values were static, everything would be fine, but they are constantly changing. Typically, you define ranges for memory and CPU that VM can allocate to a guest. VM reallocates them if another guest has a higher demand for resources.
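A short calculation shows why percentages are unreliable when the total changes. The numbers below are hypothetical: the same absolute memory use yields different utilization percentages depending on how much memory the guest has at the moment of measurement.

```python
def pct_util(used_mb, total_mb):
    """Utilization as a percentage of the total available at sample time."""
    return 100.0 * used_mb / total_mb


# Hypothetical values: 150 MB in use, measured against two different totals.
used = 150
for total in (256, 200):
    print(f"{used} MB of {total} MB = {pct_util(used, total):.1f}%")
```

With 150 MB in use, the result is about 58.6% against 256 MB but 75.0% against 200 MB, although the absolute value did not change.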

Domino statistics

Domino gathers statistics that show the status of processes currently running on the system. For example, the statistic "Mail.TransferThreads.Active" indicates the number of active mail threads. This tells you something about the status of the router. You use these statistics, along with the predetermined statistics thresholds, to monitor both your Domino system and platform statistics.

In addition to platform statistics, you can display other Domino statistics by issuing Show Stat <statisticname>. You can also use load stats <servername> to create statistics on demand for a remote server, and show statistics server to display the complete list of statistics for the server. Here are some of the statistics that you can use for performance monitoring:

  • Server.Trans.PerMinute.Peak is the peak number of transactions during a one-minute interval since the server started. It is not the peak during this interval. You are given the date and time this occurred.

  • Server.Users is the number of users with connections to the server at the end of the interval when statistics were collected.

  • Server.Users.Active15Min is the number of active users in a 15-minute interval.

There are also counters for the number of calendar and mail requests, and also agent processes. These may be of use if you are investigating the average profile of your mail users.

Just as you create an event generator for a Domino system statistic, you also create an event generator for a health statistic. Then, when the statistic does not meet the defined threshold, an event is generated.

For an event to be created, however, you must enable statistic alarms. Then, the first time a statistic alarm is reported, an event is generated and reported in STATREP.NSF. In addition to an alarm, you can create an event handler to notify you of the event.

To enable statistic alarms:

  1. From the Domino Administrator, choose File - Preferences - Administration Preferences.

  2. Click Statistics, and then check: Check statistic alarms while monitoring or charting statistics.

  3. For: Check alarms every <n> minutes (greater than monitoring poll interval), enter a value that is greater than the server polling value. The default is 15.

Coordinating Linux, VM, and Domino data collection

To get Domino to collect records at the same time as VM and Linux, you must start the statistics task at the desired time, since it does not run at a specific time, but only after a defined period of time has elapsed.

If you wish to combine Domino statistics with Linux or VM data, for example, you will probably need to extract the Domino statistics into a file. You can then use currently available tools to analyze the data.

For example, from the Domino Administrator, you can use the Domino Server monitor and statistics charts to view graphical representations of system status. From the Domino Console, you can view a representation that uses your predefined colors and text attributes to illustrate the status of a process.
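When you extract Domino statistics into a file for combination with Linux or VM data, a timestamped CSV row per sample is one convenient layout. The following is a minimal sketch under that assumption; the function name, the selected statistics, and the sample values are hypothetical, and how you obtain the statistics is outside its scope.

```python
import csv
import io
from datetime import datetime, timezone


def append_stats(out, stats, names, when=None):
    """Write one CSV row: an ISO timestamp followed by the selected statistics."""
    when = when or datetime.now(timezone.utc)
    row = [when.isoformat()] + [stats.get(name, "") for name in names]
    csv.writer(out).writerow(row)


# Hypothetical sample written to an in-memory buffer instead of a real file.
buffer = io.StringIO()
append_stats(
    buffer,
    {"Server.Users": 42},
    ["Server.Users", "Server.Users.Active15Min"],
    datetime(2003, 8, 26, 18, 46, 29, tzinfo=timezone.utc),
)
print(buffer.getvalue(), end="")
```

Statistics that are missing from a sample are written as empty fields, so every row keeps the same columns and can be joined with other data on the timestamp.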

Processing Domino statistics

To process Domino statistics, you can:

  • View them in the statrep database.

  • View them by value in the Administrator client, under the Statistics tab (Figure 8-13).


    Figure 8-13: Statistics for free and total Disk space

  • View them graphically as shown in Figure 8-14. You find this under the Performance tab of the Server tab of the administration client.

    Figure 8-14: Graphical representation - Domino Server statistics

Platform statistics

Platform statistics are now available for Linux on zSeries. They provide another way to gain insight into the combined behavior of Domino and Linux. Performance information is gathered from Linux and stored as Domino statistics that can be collected and processed just like any other Domino stats.

However, when you are running Domino on Linux under z/VM, data depends on the resources dedicated to Linux at any specific point in time, so the memory and system statistics will not reflect your configuration. If you are running Domino "natively", the statistics will be correct.

Platform statistics are collected continuously by the Statistic Collector. You can view these statistics from the Domino Administrator in the Statistics tab, as shown in Figure 8-15.

Figure 8-15: Platform statistics

The following are the statistics that are gathered:

  • Logical disk - Statistics for individual disks and total percent use of all disks

  • Memory - Statistics showing memory allocation and use, including available memory

  • Network - Statistics for individual network adapters and cumulatively for all the network adapters on the system

  • Paging file - Statistics that show use of paging files

  • System - Statistics on the information captured; for example, a summary of system CPU use and queue length

  • Time - The time that platform stats were last collected, and the sampling interval in minutes

No disk space is consumed by enabling platform statistics, since no log files are created. As with Domino statistics, disk space is used only if you log platform statistics to the log file or to the Monitoring Results database (STATREP.NSF). The amount of disk space used depends on the frequency of capture.

You can also use the show stat platform command from the Domino console to view all platform statistics, or just a subset.

  • Show Stat Platform displays a complete list of platform statistics.

  • Show Stat platform.logicaldisk.* displays all the platform statistics in the logical disk group.

Example 8-3 shows an extract of the results obtained when we issued the Show Stat platform.logicaldisk.* command from the Domino Console.

Example 8-3: Logical disk

    Show Stat platform.logicaldisk.*
      Platform.LogicalDisk.10.AssignedName = dasdl
      Platform.LogicalDisk.10.AvgQueueLen = 0.1
      Platform.LogicalDisk.10.AvgQueueLen.Avg = 0.3
      Platform.LogicalDisk.10.AvgQueueLen.Peak = 2.49
      Platform.LogicalDisk.10.PctUtil = 0.7
      Platform.LogicalDisk.10.PctUtil.Avg = 1.92
      Platform.LogicalDisk.10.PctUtil.Peak = 11.43
      Platform.LogicalDisk.10.ServiceTime = 23.73
      Platform.LogicalDisk.10.ServiceTime.Avg = 31.97
      Platform.LogicalDisk.10.ServiceTime.Peak = 2,623.08
      Platform.LogicalDisk.11.AssignedName = dasdm
      Platform.LogicalDisk.11.AvgQueueLen = 0.1
      Platform.LogicalDisk.11.AvgQueueLen.Avg = 0.24
      Platform.LogicalDisk.11.AvgQueueLen.Peak = 2.95
      Platform.LogicalDisk.11.PctUtil = 0.68
      Platform.LogicalDisk.11.PctUtil.Avg = 1.79
      Platform.LogicalDisk.11.PctUtil.Peak = 10.43
      Platform.LogicalDisk.11.ServiceTime = 21.58
      Platform.LogicalDisk.11.ServiceTime.Avg = 32.48
      Platform.LogicalDisk.11.ServiceTime.Peak = 3,850
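Output in this "Name = value" form is straightforward to process with a script once it has been captured as text. The sketch below parses such lines into a dictionary; it assumes that a comma in a numeric value is a thousands separator, as in the peak service times shown in Example 8-3, and the helper name is our own.

```python
def parse_stats(text):
    """Parse 'Name = value' lines into a dict; numeric values become floats."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(" = ")
        if not sep:
            continue  # skip the command echo, headings, and blank lines
        value = value.strip()
        try:
            # Assumes commas are thousands separators, as in "2,623.08".
            stats[name.strip()] = float(value.replace(",", ""))
        except ValueError:
            stats[name.strip()] = value  # keep non-numeric values as text
    return stats


sample = """Show Stat platform.logicaldisk.*
  Platform.LogicalDisk.10.AssignedName = dasdl
  Platform.LogicalDisk.10.ServiceTime.Peak = 2,623.08
  Platform.LogicalDisk.11.AssignedName = dasdm
  Platform.LogicalDisk.11.ServiceTime.Peak = 3,850
"""
stats = parse_stats(sample)
print(stats["Platform.LogicalDisk.10.ServiceTime.Peak"])
```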

Important: 

If iostat is not installed on the system, platform stats will not display the disk stats. iostat is typically delivered in the sysstat package on Linux. Refer to 2.6.2, "Linux sysstat package" on page 20 for more information on this subject.

DomServA was running on a Linux guest under VM and had two CPUs and 256 MB of memory. Example 8-4 is an extract of the results we received from show stat platform.

Example 8-4: show stat platform display

    Memory
      Platform.Memory.PagesPerSec = 0
      Platform.Memory.RAM.AvailMBytes = 3
      Platform.Memory.RAM.AvailMBytes.Avg = 2
      Platform.Memory.RAM.AvailMBytes.Min = 2
      Platform.Memory.RAM.AvailMBytes.Peak = 9
      Platform.Memory.RAM.PctUtil = 98
      Platform.Memory.RAM.TotalMBytes = 186
    Paging
      Platform.PagingFile.Free.SizeMBytes = 419
      Platform.PagingFile.Total.PctUtil = 13
      Platform.PagingFile.Total.PctUtil.Avg = 10
      Platform.PagingFile.Total.PctUtil.Peak = 13
      Platform.PagingFile.Total.SizeMBytes = 484
    System
      Platform.System.ContextSwitchesPerSec = 1,738
      Platform.System.ContextSwitchesPerSec.Avg = 1,712
      Platform.System.ContextSwitchesPerSec.Min = 1,479
      Platform.System.ContextSwitchesPerSec.Peak = 2,565
      Platform.System.PctCombinedCpuUtil = 2
      Platform.System.PctCombinedCpuUtil.Avg = 9.86
      Platform.System.PctCombinedCpuUtil.Peak = 45
      Platform.System.PctTotalPrivilegedCpuUtil = 1
      Platform.System.PctTotalPrivilegedCpuUtil.Avg = 4.5
      Platform.System.PctTotalPrivilegedCpuUtil.Peak = 21
      Platform.System.PctTotalUserCpuUtil = 1
      Platform.System.PctTotalUserCpuUtil.Avg = 5.35
      Platform.System.PctTotalUserCpuUtil.Peak = 24
    Time
      Platform.Time.LastSample = 08/26/2003 18:46:29 EDT
      Platform.Time.SampleRateInMins = 1
    503 statistics found
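The percentage values in Example 8-4 can be cross-checked against the absolute values. The sketch below assumes that utilization is simply the used share of the total, rounded to a whole number; this formula is our assumption about how the values relate, not a documented description of how Domino computes them.

```python
def pct_used(total_mb, free_mb):
    """Used share of the total, as a rounded whole-number percentage."""
    return round(100 * (total_mb - free_mb) / total_mb)


# Values taken from Example 8-4.
print(pct_used(186, 3))    # RAM: TotalMBytes = 186, AvailMBytes = 3
print(pct_used(484, 419))  # Paging: Total.SizeMBytes = 484, Free.SizeMBytes = 419
```

Under this assumption the results are 98 and 13, which agree with the reported Platform.Memory.RAM.PctUtil and Platform.PagingFile.Total.PctUtil values.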

Note 

When collecting statistics from a partitioned server, Domino collects platform statistics that pertain to the system as a whole, not to an individual server.

Statistics and reporting database

If you run the reporter task on the server, a set of reports will be created at scheduled intervals. These are put in the statistics and reporting database statrep.nsf. You can specify the interval between records. For information on setting up the reporter task, refer to the Domino 6 Administration Help database.

Domino 6 includes a number of improvements that make it easier for administrators to view server reporting information.

| Feature | Description | Benefit |
|---|---|---|
| Statistic Charting | Administrators can now see current and historical graphing of Domino and platform statistics from the Administrator client. | Administrators can assess server performance and behavior in a historical context and in real-time. |
| Database monitoring | Reorganized DB monitor form, single-click creation of new database usage, activity, replication, and ACL monitors. | Simplified access to monitoring. |
| Event Description | Right-click access to more detailed information about console messages. | Contextual access to information. |

Activity logging

Activity logging can be used to collect information about the activity in your enterprise. You can use this information to charge users for the amount they use your system, monitor usage, conduct resource planning, and determine if clustering would improve the efficiency of your system.

Domino writes the activity logging information in the Domino log file (LOG.NSF). To create activity logging reports, you write a Notes API program to access the information in the log file. You can also view the activity logging information by using Activity Analysis.

In a hosted environment, enable activity logging on all of your ASP servers. These are the servers used to house and maintain your hosted organizations.

You configure activity logging by editing the Configuration Settings document.

  1. From the Domino Administrator, click the Configuration tab.

  2. In the Task pane, expand Server and click Configurations.

  3. In the Results pane, select the Configuration Settings document you want, and click Edit Configuration.

  4. On the Configuration Settings document, click the Activity Logging tab.

  5. Select Activity logging is enabled.

  6. In the Enabled logging types field, select the types of activity you want to log.

  7. (Optional) To increase or decrease the frequency of creating Checkpoint records, change the checkpoint interval.

  8. (Optional) To automatically create Notes session and Notes database Checkpoint records every day at midnight, select: Log checkpoint at midnight.

  9. (Optional) To automatically create Notes session and Notes database Checkpoint records every day at the beginning and end of a specific time period, select: Log checkpoints for prime shift and then specify the times for the Prime shift interval.

  10. Click Save and Close.

  11. (Optional) If you are logging activity for LDAP Add and Modify operations, and want to change the amount of information logged in the Attributes field from the default of 4096 bytes, follow the steps in the topic "Limiting the amount of attribute information logged for LDAP Add and LDAP Modify activity."

Tivoli Analyzer for Lotus Domino

IBM Tivoli Analyzer for Lotus Domino takes advantage of the new Lotus Domino 6 statistics and activity measurements, and runs seamlessly inside the Domino Administrator. It generates comprehensive, detailed statistics and measurements of the server's activity.

Tivoli Analyzer for Lotus Domino includes two integrated system-management tools:

  • Activity Trends, which provides data collection, data exploration, and resource balancing

  • The Server Health Monitor, which offers real-time assessment and recommendations for server performance

We list and describe these tools in Table 8-7 on page 195.

Table 8-7: Tivoli Analyzer for Lotus Domino tools

| Feature | Description | Benefit |
|---|---|---|
| Activity Trends Reporting | This reporting capability lets administrators analyze server workload by user, database, and protocol. They can also determine growth rate trends. | Administrators can use this reporting to predict life span of existing disk and processing resources. They can also identify the databases responsible for server activity. In addition, Activity Trends Reporting automates creation and execution of workload redistribution and server decommissioning plans. |
| Server Health Monitoring | Health Monitoring tracks blends of Domino and OS statistics from within the Domino Administrator client to determine overall server health. Incorporated drill-down technology allows an administrator to find specific counters (for example, CPU, disk queue) responsible for reduced performance, and to suggest both immediate and long-term corrective actions. | This tool helps administrators determine, at any time, just how well a server is performing. It isolates trouble spots automatically and suggests corrective actions. |

Tivoli Analyzer for Lotus Domino is a separate product offering from Tivoli Systems. You can obtain more information about this tool at:

  • http://www.ibm.com/software/tivoli/r/analyzerfordomino

The Server Health Monitor is part of the IBM Tivoli Analyzer for Lotus Domino. It integrates with the Domino Server monitor, which is part of the Domino Administration client. You can use it for monitoring and troubleshooting the performance of your Domino server. It automatically calculates health statistics and compares those statistics to predefined thresholds. It also reports on the overall health of the server.

To set up the Server Health Monitor, complete these steps:

  1. Install the IBM Tivoli Analyzer for Lotus Domino.

    1. Make sure you have installed the Domino Administrator.

    2. Run the install program (SETUP.EXE) from the Tivoli Analyzer directory.

  2. Start the Domino server monitor.

To view server health:

  1. Make sure you have enabled the Server Health Monitor in Administration Preferences, started the Domino Server Monitor, and allowed the monitor to run for a few minutes longer than the specified polling interval.

  2. From the Domino Administrator, click the Server - Monitoring tab.

  3. In the Health column (shown as "Hea" in Figure 8-16 on page 196), the Server Health Monitor uses these icons to indicate the server's overall health:

    Green thermometer

    The server's overall health rating is Healthy. All server components are within the appropriate range.

    Yellow thermometer

    The server's overall health rating is Warning. One or more server components being monitored are approaching unacceptably poor levels of performance.

    Red thermometer

    The server's overall health rating is Critical. One or more server components being monitored are failing to perform within acceptable tolerance levels.

    Figure 8-16: View server health

The Server Health Monitor reports a health rating for each server being monitored and for all enabled individual server components, such as the following (see Figure 8-17):

  • CPU, disk, memory, and network utilization

  • NRPC name lookup

  • Mail delivery latency

  • Server response

Figure 8-17: Health monitoring

If the server health rating is Warning or Critical, a health report is generated in the Health Monitoring database (DOMMON.NSF) locally on the Domino Administration client. The health report makes short-term and long-term recommendations for tuning the server and returning the performance status to a Healthy state.
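The way component ratings combine into an overall rating can be pictured with a simple model. This sketch is not the Server Health Monitor's actual calculation: it assumes that higher values are worse, that each component has a warning and a critical threshold, and that the overall rating is the worst component rating. The threshold values are hypothetical.

```python
# Ratings ordered from best to worst.
RATINGS = ["Healthy", "Warning", "Critical"]


def rate(value, warning, critical):
    """Rate one component, assuming that higher values are worse."""
    if value >= critical:
        return "Critical"
    if value >= warning:
        return "Warning"
    return "Healthy"


def overall(component_ratings):
    """Overall health is assumed to be the worst rating of any component."""
    return max(component_ratings, key=RATINGS.index)


# Hypothetical utilization percentages checked against thresholds of 70 and 90.
components = {"CPU": 35, "Disk": 93, "Memory": 72}
ratings = {name: rate(value, 70, 90) for name, value in components.items()}
print(ratings)
print(overall(ratings.values()))
```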

To view the Health reports:

  1. From the Domino Administrator, click the Server - Monitoring tab.

  2. From the menu, choose Monitoring - Switch to Health Reports.

In the following health report (Figure 8-18 on page 197), CPU, disk, and network utilization are displaying a critical rating, since the Disk Utilization component is reporting critically poor performance. The overall health report contains short- and long-term recommendations, as shown in Figure 8-19 on page 198.

Figure 8-18: Current Health Report

Figure 8-19: Overall Health Report

Server Health Monitor does not include threshold values specific to Linux on zSeries; it automatically picked up the thresholds for Linux/Intel, which do not include CPU, disk, and memory. Although the CPU and disk utilization values were low, they were reported as "critical", and we were not able to raise the thresholds. At the time of writing, you cannot modify these specific thresholds, although this capability is expected to become available. However, you can modify the other thresholds (for example, server response and mail delivery latency).

The Index Thresholds view in the Health Monitoring database (DOMMON.NSF) displays the threshold values for each platform.

To modify a threshold value:

  1. From the Domino Administrator, click the Server - Monitoring tab.

  2. From the menu, choose Monitoring - Display Health Reports.

  3. Under Configuration, choose Index Thresholds.

  4. Choose the operating system whose threshold you want to change, and choose Edit Threshold Document.

  5. Change the value for the Warning Threshold or Critical Threshold.

  6. Click OK.






IBM Lotus Domino 6.5 for Linux on zSeries Implementation
ISBN: 0738491748
EAN: 2147483647
Year: 2003
Pages: 162
Authors: IBM Redbooks
