

An Introduction to Monitoring

It's likely that your directory service is (or will be) a vital part of your computing infrastructure. Users may depend on it for things such as login and address books, and applications may depend on it for such things as access control and email delivery. Failure or unavailability of the directory can result in downtime for users and applications, which translates into lost time and money. By monitoring your directory service, you can learn of outages as soon as they occur.

With more sophisticated, proactive monitoring strategies, you can also anticipate problems before they result in an outage or degraded service. Information you gather from this type of monitoring can be used to fine-tune your directory server software. For example, proactive monitoring may alert you to the need to change directory configuration parameters to optimize performance for common queries. Proactive monitoring can also provide data that helps you optimize your management procedures. For example, an increase in the number of updates handled by your directory may signal the need for more frequent backups.

A monitoring system consists of three conceptual modules (see Figure 18.1), described in the following list:

Figure 18.1. A conceptual overview of monitoring.

  • The Device and Application Probing module is responsible for periodically checking the status of the monitored devices, hosts, and applications. When a device fails a test, an event is generated that describes the device and the test that was performed.

  • The Event Correlation module receives these events and correlates them to determine the root cause. It then suppresses any events that occurred only as a consequence of other events. For example, a network router might fail, temporarily making all devices, hosts, and applications beyond it inaccessible. Events for everything behind the router would be suppressed because they are probably false alarms. After suppressing any inappropriate events, the module constructs one or more alerts and sends them to the Notification module.

  • The Notification module receives alerts and notifies the appropriate persons who can remedy the problem. Alternatively, the Notification module might arrange for an automated system to take some remedial action such as restarting a server process or rebooting a failed server.

The monitoring system shown in Figure 18.1 is a conceptual model that we use to frame our discussion of directory monitoring. Any of the modules' functions could be performed by humans or a software program; however, if you use a commercially available network management system (NMS), you'll probably find that it implements all of these functions for you.
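The three modules can be sketched in a few lines of Python. This is only an illustration of the conceptual model, not the design of any particular NMS: the `Event` class, the `UPSTREAM` topology table, and the host names are all hypothetical, and each check result is assumed to arrive as a `(target, test, passed)` tuple.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    target: str  # device, host, or application that failed a test
    test: str    # name of the test that was performed


# Hypothetical topology: each target maps to the device it is reached through.
UPSTREAM = {
    "router-a": None,
    "ldap-replica-1": "router-a",
    "mail-gateway": "router-a",
}


def probe(check_results):
    """Device and Application Probing: emit one Event per failed test."""
    return [Event(target, test) for target, test, passed in check_results if not passed]


def correlate(events, upstream=UPSTREAM):
    """Event Correlation: keep root causes, suppress consequences."""
    failed = {e.target for e in events}
    alerts = []
    for e in events:
        # Walk toward the network core; stop at the first failed ancestor.
        parent = upstream.get(e.target)
        while parent is not None and parent not in failed:
            parent = upstream.get(parent)
        if parent is None:  # no failed ancestor, so this is a root cause
            alerts.append(f"ALERT: {e.target} failed {e.test}")
    return alerts


def notify(alerts, send=print):
    """Notification: hand each alert to a person or an automated system."""
    for alert in alerts:
        send(alert)


if __name__ == "__main__":
    results = [
        ("router-a", "ping", False),
        ("ldap-replica-1", "ldap-search", False),
        ("mail-gateway", "smtp-connect", True),
    ]
    notify(correlate(probe(results)))
```

In the usage at the bottom, the failed directory replica sits behind the failed router, so only the router produces an alert.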

The most basic type of monitoring detects when the directory (or a part of it) is unavailable, perhaps because a server machine has crashed or has become unreachable as a result of a network failure. These directory failures are hard failures; that is, a part of the directory has failed completely.

Other types of directory problems can result in degraded performance. For example, looping electronic mail can cause the load on the directory to increase dramatically as the messaging servers attempt to deliver the looping mail. A more advanced monitoring tool could conceivably detect the increased load on the directory and alert a system administrator, who could take corrective action. A complete monitoring system should be able to detect hard failures and also detect when performance drops below an acceptable level.

In addition to detecting hard failures and unacceptable performance degradation, a well-designed monitoring solution also provides you with valuable information on performance trends. Such proactive monitoring can help you anticipate problems before they become serious enough for your users to notice.
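One way to picture the difference between a hard failure, degraded performance, and a worrying trend is to classify a short history of probe response times. The sketch below is an assumption-laden illustration: the function name, the `None` convention for an unanswered probe, the five-sample window, and the 80 percent warning fraction are all invented for the example and should be tuned to your own service levels.

```python
from statistics import mean


def classify(samples, acceptable_seconds, warn_fraction=0.8, window=5):
    """Classify recent probe results.

    samples is a list of response times in seconds, oldest first;
    None means the probe received no answer at all (assumed convention).
    """
    if not samples:
        raise ValueError("at least one sample is required")
    if samples[-1] is None:
        return "hard failure"  # the latest probe got no response
    recent = [s for s in samples[-window:] if s is not None]
    average = mean(recent)
    if average > acceptable_seconds:
        return "degraded"  # answering, but slower than acceptable
    if average > warn_fraction * acceptable_seconds:
        return "approaching limit"  # trend worth investigating early
    return "ok"
```

The "approaching limit" state is where proactive monitoring pays off: the directory is still within its acceptable response time, but an administrator can investigate before users notice.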

Methods of Monitoring

There are a number of ways to monitor a directory service. Following are the various types of monitoring you should consider:

  • Monitoring with Simple Network Management Protocol (SNMP).   Although SNMP has found its widest application in the management of networking hardware such as switches, hubs, and routers, it is also possible to use SNMP to monitor and manage application processes running on server computers. SNMP allows a management application to monitor the status of an entity on the network. It's also possible for a management application to be asynchronously notified via the SNMP trap mechanism when some sort of problem occurs (if a server process terminates unexpectedly, for example).

  • Probing the directory via LDAP.   One of the most straightforward and useful ways to monitor your directory service is to probe it by connecting to it as a client and issuing LDAP requests. For example, a simple probing tool might connect to a directory server and issue a search request for a given entry. If the entry is returned within a reasonable span of time, the directory is considered functional. If not, the probing tool can report a failure.

  • Monitoring operating system performance data.   Most modern operating systems (OSs) provide tools to query their operating parameters. This type of information can help you identify when your directory server performance is suffering because of an OS problem.

  • Indirect monitoring.   Monitoring the applications that use the directory gives you more of an end-user view of the reliability and responsiveness of your system.

  • Log File Analysis.   You can automatically scan the directory service's log files for messages that indicate an error condition, and you can watch for conditions that signal a performance problem. Log file analysis is also a good way to perform proactive monitoring, in which you identify undesirable performance trends and telltale signs of impending problems before they are noticed by your users.

Later in this chapter we discuss each of these five approaches in detail and provide specific examples of each.
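As a preview of the LDAP probing approach, the following minimal sketch shows the pass/fail logic described in the list above: search for one entry and check that it comes back within a reasonable time. To stay independent of any particular LDAP client library, the search itself is passed in as `search_entry`, a hypothetical zero-argument callable that returns the entry or `None`. The two-second limit is likewise an assumption.

```python
import time


def probe_directory(search_entry, max_seconds=2.0, clock=time.monotonic):
    """Run one search and report whether the directory looks functional.

    search_entry: hypothetical callable that performs the LDAP search for
    a single known entry and returns it, or None if nothing was found.
    Note that this function only measures elapsed time; it cannot
    interrupt a search that hangs, so the client library's own timeout
    should also be set.
    """
    start = clock()
    try:
        entry = search_entry()
    except Exception as exc:  # connection refused, timeout, bind failure, ...
        return {"ok": False,
                "reason": f"search raised {type(exc).__name__}",
                "seconds": clock() - start}
    elapsed = clock() - start
    if entry is None:
        return {"ok": False, "reason": "entry not returned", "seconds": elapsed}
    if elapsed > max_seconds:
        return {"ok": False, "reason": "response too slow", "seconds": elapsed}
    return {"ok": True, "reason": "entry returned in time", "seconds": elapsed}
```

In practice `search_entry` would wrap a base-scope search issued through whatever LDAP client library you already use. Passing it in as a callable also makes the probe easy to test without a live server.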

General Monitoring Principles

Before we discuss specific monitoring methods, let's take a moment to introduce some general principles that apply to all methods.

Monitoring Unobtrusively

You should always understand the implications of your monitoring strategy. It's possible for a poorly designed monitoring system to adversely affect performance if it places a heavy load on the directory service. In general, you should strive to make the monitoring as unobtrusive as possible while still providing the information you need.

How do you make your monitoring unobtrusive? You should use the available method that is the most lightweight but gives you the needed information. For example, if you probe the directory, retrieving a single entry is probably sufficient; it's unnecessary to retrieve many entries. You should also perform the probe no more often than necessary to implement your desired responsiveness. For example, you can discover problems sooner if you probe the directory every five seconds, but that may be overkill; it's probably reasonable to probe every minute, or even every five minutes, depending on the level of service you are expected to provide to your users.
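The trade-off between probe frequency and responsiveness is simple arithmetic, sketched below. The function name and the optional per-probe timeout are assumptions for illustration. The worst case assumes a failure occurs immediately after a successful probe, so it goes unnoticed until the next probe runs and times out.

```python
def probe_cost(interval_seconds, timeout_seconds=0.0):
    """Return (probes per day, worst-case seconds before a failure is noticed)."""
    probes_per_day = 86400 / interval_seconds
    worst_case_detection = interval_seconds + timeout_seconds
    return probes_per_day, worst_case_detection


if __name__ == "__main__":
    for interval in (5, 60, 300):
        per_day, delay = probe_cost(interval)
        print(f"every {interval:>3}s: {per_day:>7.0f} probes/day, "
              f"noticed within {delay:.0f}s")
```

Probing every five seconds costs 17,280 searches per day per probe, compared with 1,440 at one-minute intervals and 288 at five-minute intervals. The extra load buys, at most, a few minutes of earlier warning.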

One Failure Causing Other Failures

It's also possible that if a failure occurs, it may trigger other alerts in your monitoring system. For example, if one of a set of replicated servers becomes unavailable, the load on the remaining servers may increase as clients reapportion themselves among the remaining replicas. If this occurs, you can try to reduce the load on the remaining servers by disabling noncritical applications or by bringing additional replicas online. In any case, such an event signals the need for additional capacity to provide some headroom should it happen again.

Keeping a Problem History

You should strive to design your monitoring system so that it provides you with a reliable history of problems. For example, if you use a commercial network management system that logs alerts in a standard format, you might periodically extract the directory-related alerts and archive them in a central location. These extracted logs can help you identify trends that you can use to plan for expansion, and they let you demonstrate your ever-improving reliability figures to management (or hide the figures if they don't show improvement!).
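The extraction step might look something like the sketch below. The alert format is purely an assumption made for the example (an ISO-style timestamp, a subsystem word, then free text); a real NMS will have its own format, and the sample lines are invented.

```python
from collections import Counter


def directory_alerts_by_month(lines, subsystem="directory"):
    """Count alerts for one subsystem per calendar month.

    Assumed line format: '<YYYY-MM-DDThh:mm:ss> <subsystem> <message>'.
    Lines that do not match are ignored.
    """
    counts = Counter()
    for line in lines:
        parts = line.split(maxsplit=2)
        if len(parts) < 2 or parts[1] != subsystem:
            continue
        counts[parts[0][:7]] += 1  # 'YYYY-MM' prefix of the timestamp
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    sample = [  # hypothetical alert log lines
        "2002-01-04T09:21:00 directory replica unreachable",
        "2002-01-19T23:02:10 network router-a link down",
        "2002-01-30T14:45:33 directory search latency high",
        "2002-02-11T03:10:05 directory replica unreachable",
    ]
    print(directory_alerts_by_month(sample))
```

Monthly counts like these are the raw material for the trend analysis and reliability reporting described above.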

Having a Plan

Finally, for every type of failure you can anticipate, you should create a written action plan to share with all operators and support personnel who might be the first to learn of a failure. It's also a good idea to have a default action plan to be followed when an unanticipated error occurs. Action plans are covered in more detail in the "Taking Action" section of this chapter.



Understanding and Deploying LDAP Directory Services,  2002 New Riders Publishing

2002, O'Reilly & Associates, Inc.



Understanding and Deploying LDAP Directory Services
Understanding and Deploying LDAP Directory Services (2nd Edition)
ISBN: 0672323168
Year: 1997
Pages: 245
