Sophisticated resource monitoring is easy with SNMP. By simply adding the correct line to your snmpd.conf file, you can add useful information to the MIB that can then be accessed by the Mon daemon from the cluster node manager.
On the CD-ROM in the chapter17 directory, you'll find a file called snmpd.conf.example that contains the following configuration to monitor both disk space and processes:
# GENERAL SNMP INFORMATION
syslocation "Room 133"
syscontact alert@domain.com
rwcommunity private
rocommunity public
authtrapenable 1
trapcommunity trapServ
trapsink localhost
trap2sink localhost

# DISK SPACE MONITORING
# Disk partitions should have at least 100 MB of
# freespace.
disk / 100000
disk /var 100000
disk /home 10000
disk /usr 10000

# PROCESS MONITORING
# System init
proc init
# Job Scheduling and system time.
proc atd
proc ntpd
# System logging
proc syslogd
# Start services based on in-bound connections
proc xinetd
# Remote machine login
proc sshd
# Printing
proc lpd
# Web services (may be used in cluster environment
# for monitoring servers inside the cluster).
proc httpd
# NFS Client software
proc portmap
proc lockd
proc rpc.statd
# Optional Automounter daemon
# proc automount
# NIS Client software
proc ypbind
Note | The crond daemon is not included here because we will only run the cron daemon on one node. (See Chapter 19.) |
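Before copying a file like this into place, it can help to see at a glance which directives are active. The following is a minimal sketch, not part of the CD-ROM: the function name list_monitored is hypothetical, and it assumes the disk threshold is expressed in kB, which is what the 100 MB comment in the listing implies.

```shell
# list_monitored FILE   (hypothetical helper, not shipped with Net-SNMP)
# Print every active "proc" and "disk" directive in an snmpd.conf-style file.
# Commented-out lines such as "# proc automount" are skipped because their
# first field is "#", not the directive name.
list_monitored() {
    awk '
        $1 == "proc" { print "proc", $2 }
        $1 == "disk" { print "disk", $2, $3 " kB minimum" }
    ' "$1"
}

# Example, using the file name from this section:
#   list_monitored snmpd.conf.example
```

Matching on the first field keeps the check simple and avoids false hits on comments that merely mention a daemon name.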
Copy this file into place as /etc/snmp/snmpd.conf and restart the snmpd daemon. Then check the /var/log/messages file for any errors reported when snmpd starts.
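Once snmpd is running, one way to confirm that the proc entries actually reached the MIB is to walk the process table and look for entries flagged as failed. The sketch below makes several assumptions: the function name failed_procs is hypothetical, the table is UCD-SNMP-MIB::prTable, and snmpwalk prints symbolic names in its default output format (for example, "prErrorFlag.2 = INTEGER: error(1)"). If your output looks different, the patterns will need adjusting.

```shell
# failed_procs   (hypothetical helper)
# Read snmpwalk output for UCD-SNMP-MIB::prTable on stdin and print the name
# of every monitored process whose prErrorFlag is set.
failed_procs() {
    awk '
        # Remember each process name by its table index.
        $1 ~ /prNames\./     { split($1, p, "."); name[p[2]] = $NF }
        # Record the index of every row whose error flag is raised.
        $1 ~ /prErrorFlag\./ { split($1, p, "."); if ($NF ~ /^(error\(1\)|1)$/) bad[p[2]] = 1 }
        END { for (i in bad) print name[i] }
    ' | sort
}

# Example, using the read-only community from the configuration above
# and SNMP version 2c:
#   snmpwalk -v 2c -c public localhost UCD-SNMP-MIB::prTable | failed_procs
```

An empty result means every listed process was found. A name in the output means snmpd is monitoring that process but could not find it running.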
Mon only needs to be run on the cluster node manager. The netsnmp-proc.monitor script is also included on the CD-ROM (modified to support establishing the SNMP session using version 2 of the protocol).[3] Copy it to the /usr/lib/mon/mon.d directory and make sure to set the execute permissions on the file:
#cp /mnt/CDROM/chapter17/netsnmp-proc.monitor /usr/lib/mon/mon.d/
#chmod 755 /usr/lib/mon/mon.d/netsnmp-proc.monitor
The CD-ROM also includes another mon.cf configuration file named mon.cf.2:
alertdir = /usr/lib/mon/alert.d
mondir = /usr/lib/mon/mon.d
logdir = /usr/lib/mon/logs
histlength = 500
dtlogging = yes
dtlogfile = /usr/lib/mon/logs/dtlog

hostgroup clusternodes localhost clnode1 clnode2 clnode3

watch clusternodes
    service cluster-ping-check
        interval 30s
        monitor fping.monitor
        depend clusternodes:cluster-ping-check
        period wd {Su-Sa}
            alert mail.alert alert@domain.com
            upalert mail.alert alert@domain.com
            alertevery 1h
    service disk-space-check
        interval 15m
        monitor netsnmp-freespace.monitor -t 10000000
        depend clusternodes:cluster-ping-check
        period wd {Su-Sa}
            alert mail.alert alert@domain.com
            upalert mail.alert alert@domain.com
    service process-check
        interval 30s
        monitor netsnmp-proc.monitor -t 10000000
        depend clusternodes:cluster-ping-check
        period wd {Su-Sa}
            alert mail.alert alert@domain.com
            upalert mail.alert alert@domain.com
    service telnet-check
        description telnet to servers in cluster
        interval 2m
        monitor telnet.monitor
        depend clusternodes:cluster-ping-check
        period wd {Sun-Sat}
            alertevery 1h
You may want to add additional (hostgroup) entries to this file in order to monitor the LDAP server,[4] the DNS server, and the mail server using additional monitor scripts available from the download site.
Note | The SNMP monitors are being passed the -t 10000000 argument (a timeout in microseconds), which means that the SNMP response can take up to ten seconds. |
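If you change the timeout, the arithmetic is easy to get wrong by a factor of ten. The helper below is a small sketch that assumes the -t value is given in microseconds, which is what the ten-second figure in the note implies. The function name usec_to_sec is hypothetical.

```shell
# usec_to_sec MICROSECONDS   (hypothetical helper)
# Convert a timeout given in microseconds to whole seconds.
usec_to_sec() {
    echo $(( $1 / 1000000 ))
}

# Example: check the value used in mon.cf.2
#   usec_to_sec 10000000
```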
This configuration file introduces the syntax necessary to create a hierarchical relationship between the things being monitored. The line:
depend clusternodes:cluster-ping-check
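when inserted in each service paragraph, causes Mon to suppress that service's alerts if a cluster node does not respond to a ping, so that an unreachable node does not also trigger a separate alert from every other check.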
Note | See the Mon man page for the complete description of the depend syntax. (You can use any Perl expression that evaluates as true or false to create complex dependencies.) |
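As the configuration file grows, it is easy to add a service and forget its depend line. The following sketch lists any service paragraph that has none. It is not part of Mon: the function name services_without_depend is hypothetical, and it assumes each service starts with a line whose first word is "service" and that a dependency is a line whose first word is "depend", as in the mon.cf.2 listing.

```shell
# services_without_depend FILE   (hypothetical helper, not part of Mon)
# Print the name of every service paragraph that contains no "depend" line.
services_without_depend() {
    awk '
        # Report the previous service if it never declared a dependency.
        function report() { if (svc != "" && !dep) print svc }
        $1 == "service" { report(); svc = $2; dep = 0 }
        $1 == "depend"  { dep = 1 }
        END { report() }
    ' "$1"
}

# Example, using the file name from this section:
#   services_without_depend mon.cf.2
```

No output means every service in the file carries a depend line.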
The chapter17 directory on the CD-ROM contains a script called mon that can be placed in the /etc/init.d directory and then added to the normal boot process using the chkconfig command (on Red Hat systems).
#cp /mnt/CDROM/chapter17/mon /etc/init.d/mon
#chmod 755 /etc/init.d/mon
#chkconfig --add mon
With the new configuration files in place, restart the SNMP service on each cluster node and the Mon service on the cluster node manager.[5] On each cluster node, enter:
#service snmpd restart
On the cluster node manager, start Mon in debug mode to make sure your configuration file is working properly (use /usr/lib/mon/mon -d). Once you are sure that Mon works, start it using the init script with the command:
#service mon start
or
#/etc/init.d/mon start
Note | If you receive an error message that the fping monitor could not be located, see the instructions earlier in this chapter for specifying the path to the fping binary in the /usr/lib/mon/mon.d/fping.monitor script. |
[3]You can also download the script from http://ftp.kernel.org/pub/software/admin/mon/contrib/monitors/netsnmp/.
[4]Requires the Net::LDAP package (which requires the Convert::ASN1 package) from CPAN. Here is a sample mon entry to use the ldap.monitor to look up the account admin in the LDAP database for basedn yourdomain.com: monitor ldap.monitor -basedn "dc=yourdomain,dc=com" -filter="uid=admin" -attribute=uid -value=admin
[5]In Chapter 19, we will discuss how to run a command on all cluster nodes.