Responding to Problems Automatically


Traditionally, systems are monitored so that problems are seen immediately and the appropriate administrator is notified. The administrator would then go to the site where the system is located and perform the maintenance needed to return the system to a usable state. Modern monitoring systems can not only alert administrators to problems but also react to system events and run commands that attempt to fix the problem on their own. In this way, problems can be responded to automatically. Often the system itself can attempt simple fixes. If these fixes fail, the problem can be escalated to an administrator who can handle it in person.
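The try-then-escalate flow described above can be sketched roughly as follows. This is a minimal illustration, not the behavior of any particular monitoring product: the names `respond`, `check`, `fixes`, and `escalate` are hypothetical and stand in for whatever health check, repair commands, and notification mechanism a given system provides.

```python
from typing import Callable, Iterable


def respond(check: Callable[[], bool],
            fixes: Iterable[Callable[[], None]],
            escalate: Callable[[str], None]) -> bool:
    """Try each automated fix in order; escalate if the check still fails.

    check    -- hypothetical health check, returns True when the system is healthy
    fixes    -- hypothetical repair actions, tried one at a time
    escalate -- hypothetical callback that notifies an administrator
    """
    if check():
        return True  # nothing is wrong, so nothing to do

    for fix in fixes:
        try:
            fix()
        except Exception as exc:
            # A fix that raises must not stop the remaining attempts.
            print(f"fix {getattr(fix, '__name__', fix)!r} raised: {exc}")
        if check():
            return True  # the problem was resolved automatically

    # Every automated attempt failed: hand the problem to a person.
    escalate("automated fixes failed; manual intervention required")
    return False
```

The function re-runs the check after every fix so that it stops as soon as the system is healthy again, and it only involves an administrator once the list of fixes is exhausted.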

Reactive Monitoring Systems Are No Replacement for Qualified Technical Resources

Although many monitoring systems are quite sophisticated, they are only as good as the administrator who configured their responses. Even then, reactive monitoring systems are no replacement for qualified technical resources.


Triggering External Scripts

One of the simplest ways to allow a system to repair itself in the event of simple problems is by triggering an external script. External scripts also enable the clever administrator to extend the abilities of the monitoring system. Rather than reacting to an event in a fixed way, an external script can call other programs to perform more advanced tasks, such as determining which administrator is on call and paging that person rather than paging the same person every time. External scripts also enable an administrator to stack actions for a single event, such as triggering a pager, sending e-mails to multiple recipients, or trying to restart a series of services.
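As a rough sketch of both ideas, the script below looks up the on-call administrator from a rota instead of a fixed address, and runs a stack of actions for one event. Everything here is an assumption made for illustration: the rota layout, the addresses, and the function names `on_call` and `run_actions` are hypothetical, and the actions are plain callables standing in for real paging, e-mail, or restart commands.

```python
from datetime import date
from typing import Callable, Dict, List, Sequence

# Hypothetical rota: ISO weekday (1 = Monday ... 7 = Sunday) -> contact.
ROTA: Dict[int, str] = {
    1: "alice@companyxyz.com",
    2: "alice@companyxyz.com",
    3: "bob@companyxyz.com",
    4: "bob@companyxyz.com",
    5: "carol@companyxyz.com",
}


def on_call(day: date, rota: Dict[int, str], default: str) -> str:
    """Return the contact for the given day, or the default if none is listed."""
    return rota.get(day.isoweekday(), default)


def run_actions(actions: Sequence[Callable[[str], None]], event: str) -> List[str]:
    """Run every action for an event and return a list of failure descriptions.

    One failing action (for example, a pager gateway that is down) does not
    prevent the remaining actions from running.
    """
    failures: List[str] = []
    for action in actions:
        try:
            action(event)
        except Exception as exc:
            failures.append(f"{getattr(action, '__name__', 'action')}: {exc}")
    return failures
```

Collecting failures rather than stopping at the first one keeps the stacked actions independent, so a failed page still leaves the e-mail and restart attempts in place.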

External Scripts Might Be Prevented

Some external scripts might be blocked by OS-level security settings. MAPISEND, for example, is not allowed by default with Outlook 2000 SP-1 and higher, because the default security settings don't allow an external script to use a MAPI profile without user intervention. This behavior can be excluded for a system in the Exchange configuration. Always be aware of these types of limitations and test scripts before they are used in production.


External scripts often require additional programs to carry out everything an administrator would typically want to happen. One of the most useful things an administrator can do is to send an e-mail message from the command line. Programs like Mapisend.exe can be used to send messages to different recipients describing the situation that has occurred.

 
 mapisend -u Outlook -p password -r recipient@companyxyz.com -s "%hostname% is down" -m "at %time% on %date%"

Services Recovery and Notification

Windows Server 2003 has a built-in function that allows services not only to attempt to restart themselves, but also to alert the system in some way that a restart has occurred. On the Recovery tab of a service's Properties there are options for what the system should do on the first, second, and subsequent failures. There is also an option to determine when to reset the failure counter. The options are Take No Action, Restart the Service, Run a Program, or Restart the System. Although restarting the service might seem very tempting, it is preferable to run an external program instead. This external program should restart the service, but it should also alert you that the service has been restarted. If the monitoring system in use doesn't natively detect service failure, the program could send an SNMP trap, send a message to your system, or trigger an e-mail to alert you that the event has occurred. This built-in functionality gives you great flexibility in integrating with almost any type of monitoring system.
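A program suitable for the Run a Program option might look something like the sketch below. It is only an illustration under stated assumptions: it restarts the service with the Windows `sc start` command, and the `notify` callback is a hypothetical placeholder for whichever alert is appropriate (an SNMP trap, an e-mail, or a message to the monitoring system). The `run` parameter is also an assumption added so the command runner can be swapped out when trying the code on a machine without the service.

```python
import subprocess
from typing import Callable, Sequence


def restart_and_notify(service: str,
                       notify: Callable[[str], None],
                       run: Callable[[Sequence[str]], int] = subprocess.call) -> bool:
    """Restart a Windows service and report the outcome.

    service -- the service name as registered with Windows
    notify  -- hypothetical alert callback (SNMP trap, e-mail, and so on)
    run     -- executes a command and returns its exit code
    """
    # `sc start <name>` asks the Service Control Manager to start the service.
    exit_code = run(["sc", "start", service])
    restarted = exit_code == 0

    if restarted:
        notify(f"{service} failed and was restarted")
    else:
        notify(f"{service} failed; restart returned exit code {exit_code}")
    return restarted
```

The alert is sent in both cases, so a restart that silently succeeds is still visible to the administrator, which is the reason for preferring a program over the plain Restart the Service option.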

Append an Entry to a Log File

Have the program that restarts the service append an entry to a log file, passing an error message along with the %time% and %date% parameters. You can then check a single file to determine when and how often a service is failing and restarting. Collected over a long stretch of time, this information can be a very useful troubleshooting tool.
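The logging tip above could be implemented along these lines. This is a sketch with assumed details: the tab-separated line layout and the function names `log_restart` and `count_restarts` are hypothetical, and Python's `datetime` is used for the timestamp in place of the %time% and %date% parameters.

```python
from datetime import datetime
from typing import Optional


def log_restart(path: str, service: str, message: str,
                now: Optional[datetime] = None) -> None:
    """Append one timestamped line per restart to a shared log file."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
    # Mode "a" appends, so earlier entries are preserved.
    with open(path, "a", encoding="utf-8") as log:
        log.write(f"{stamp}\t{service}\t{message}\n")


def count_restarts(path: str, service: str) -> int:
    """Count how many logged entries belong to the given service."""
    with open(path, encoding="utf-8") as log:
        # The service name is the second tab-separated field on each line.
        return sum(1 for line in log if line.split("\t")[1:2] == [service])
```

Keeping one line per event in a fixed layout makes it easy to answer the two questions the tip raises, when a service failed and how often, with a simple scan of a single file.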




Microsoft Windows Server 2003 Insider Solutions
ISBN: 0672326094
Year: 2003
Pages: 325