Recovering from System Problems

When a server fails and cannot be repaired immediately, high availability cluster software such as MC/ServiceGuard can reduce the resulting downtime and keep services available. MC/ServiceGuard detects the failure of an application and automatically restarts it on another system. It can complete this detection and recovery in under one minute, which saves the downtime you would otherwise spend noticing the failure and restarting the application by hand.
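The detect-and-restart idea can be illustrated with a small, self-contained sketch. This is not how MC/ServiceGuard is implemented or configured; the `Node` and `FailoverMonitor` classes, the `max_missed` threshold, and the node names are all hypothetical, chosen only to show the general pattern of counting missed health checks and moving the application to a healthy standby.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A hypothetical cluster member that can host the application."""
    name: str
    healthy: bool = True


@dataclass
class FailoverMonitor:
    """Illustrative monitor: fail over after max_missed failed checks in a row."""
    nodes: List[Node]
    active: Node
    max_missed: int = 3          # assumed threshold, not a product default
    missed: int = 0
    log: List[str] = field(default_factory=list)

    def poll(self, app_responding: bool) -> Node:
        """Record one health check result and return the node now hosting the app."""
        if app_responding:
            self.missed = 0      # a good check resets the failure count
            return self.active
        self.missed += 1
        if self.missed >= self.max_missed:
            self._fail_over()
        return self.active

    def _fail_over(self) -> None:
        """Mark the active node failed and move the app to the first healthy node."""
        self.active.healthy = False
        standby = next((n for n in self.nodes if n.healthy), None)
        if standby is None:
            raise RuntimeError("no healthy node left to host the application")
        self.log.append(f"{self.active.name} -> {standby.name}")
        self.active = standby
        self.missed = 0


if __name__ == "__main__":
    primary, standby = Node("node-a"), Node("node-b")
    monitor = FailoverMonitor(nodes=[primary, standby], active=primary)
    for responding in (True, False, False, False):
        current = monitor.poll(responding)
    print(current.name)  # prints "node-b" after three missed checks
```

Requiring several consecutive missed checks before failing over is a common way to avoid moving an application because of a single transient glitch; the trade-off is a slightly longer detection time.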

However, even though the kernel can mask certain failures and high availability software can move applications to redundant servers, you must ultimately repair the failed components. For hardware problems, Support Tool Manager can provide fast diagnosis on HP-UX systems, and SyMON can be used in Solaris environments.

You may need to find a software or firmware patch to fix your problem. For HP-UX, you can obtain patch information from the HP Web site by following links to support information. Customized patch bundles are available for customers with HP software support agreements. For Sun Solaris, you can access patch information over the Web through a service called SunSolve Online.

Recovering from a security violation by a malicious intruder may be more difficult. The system administrator may need to restore the system from backup tapes made when it was in a known good state.

UNIX Fault Management: A Guide for System Administrators
ISBN: 013026525X
Year: 1999
Pages: 90