Summary


Real-time operations comprise a set of key functions that must operate within tight time constraints. Information flows into the real-time operations system from the instrumentation manager (the source of alert data) and from the SLA statistics modules, which provide time-sliced measurements of performance. The real-time operations system then processes these inputs in an attempt to improve the mean time between failures (MTBF), possibly by using proactive techniques to predict failures. At the same time, it tries to help the operations staff decrease the mean time to repair (MTTR) when a failure actually occurs.
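To make the two metrics concrete, the following is a minimal sketch of how MTBF and MTTR could be computed from a list of outages in an observation window. It uses the common definitions (uptime divided by the number of failures, and downtime divided by the number of failures). The Outage record, the mtbf_mttr function, and the sample figures are assumptions made for illustration, not part of any SLA statistics module described in the text.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    # Hypothetical record: hours measured from the start of the observation window.
    start: float
    end: float


def mtbf_mttr(outages, window_hours):
    """Return (MTBF, MTTR) in hours for the given observation window."""
    if not outages:
        raise ValueError("at least one outage is required")
    downtime = sum(o.end - o.start for o in outages)
    uptime = window_hours - downtime
    count = len(outages)
    # MTBF: average operating time per failure; MTTR: average repair time per failure.
    return uptime / count, downtime / count


# Invented sample data: three outages in a 30-day (720-hour) window.
outages = [Outage(100.0, 102.0), Outage(400.0, 401.0), Outage(650.0, 653.0)]
mtbf, mttr = mtbf_mttr(outages, window_hours=720.0)
availability = mtbf / (mtbf + mttr)
print(f"MTBF={mtbf} h, MTTR={mttr} h, availability={availability:.4f}")
```

In this sketch, raising MTBF or lowering MTTR both raise the availability ratio, which is why the real-time operations system works on both at once.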

Reactive management, used to decrease MTTR, is based on triage and root-cause analysis. Triage tries to identify the responsible organization very quickly, in the hope that it can apply its specialized tools and knowledge to fix the situation. Root-cause analysis is a more detailed, technically intensive process that assists in diagnosing the situation.
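Triage as described here can be pictured as a fast lookup from the failing component to the group that owns it. The sketch below assumes a hypothetical naming convention in which a component's prefix identifies its owner; the rule table, team names, and the triage function are illustrative only.

```python
# Hypothetical ownership rules: component-name prefix -> responsible organization.
TRIAGE_RULES = [
    ("db-", "database team"),
    ("web-", "web operations"),
    ("net-", "network operations"),
]


def triage(component, default="operations desk"):
    """Return the organization that should look at an alert for this component."""
    for prefix, owner in TRIAGE_RULES:
        if component.startswith(prefix):
            return owner
    # No rule matched: hand the alert to a general queue rather than guess.
    return default


print(triage("db-primary"))
print(triage("cache-1"))
```

The point of keeping triage this simple is speed: it does not diagnose anything, it only decides who should start diagnosing.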

Root-cause analysis uses sophisticated methods of filtering and correlating input data, possibly combined with a model of the system being managed, to make reasonable suggestions about the cause of a performance problem.
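The combination of filtering, correlation, and a system model can be sketched as follows. This is one simple approach under stated assumptions, not the method of any particular product: repeated alerts from the same component are filtered out, and the remaining alerts are correlated against a hypothetical dependency model so that only components with no alerting direct dependency are suggested as likely causes. The component names, the DEPENDS_ON map, and both functions are invented for the example.

```python
# Hypothetical model of the managed system: component -> components it depends on.
DEPENDS_ON = {
    "web-frontend": ["app-server"],
    "app-server": ["database", "auth-service"],
    "database": [],
    "auth-service": [],
}


def filter_alerts(alerts, window_seconds=60):
    """Keep an alert only if its component has been quiet for more than window_seconds."""
    last_seen = {}
    kept = []
    for timestamp, component in sorted(alerts):
        previous = last_seen.get(component)
        if previous is None or timestamp - previous > window_seconds:
            kept.append((timestamp, component))
        last_seen[component] = timestamp
    return kept


def suggest_root_causes(alerts, depends_on):
    """Suggest alerting components whose direct dependencies are not alerting."""
    alerting = {component for _, component in alerts}
    return sorted(
        component
        for component in alerting
        if not any(dep in alerting for dep in depends_on.get(component, []))
    )


# Invented alert stream: (seconds since first alert, component).
raw_alerts = [
    (0, "database"),
    (5, "app-server"),
    (7, "web-frontend"),
    (20, "database"),
    (30, "web-frontend"),
]
filtered = filter_alerts(raw_alerts)
suspects = suggest_root_causes(filtered, DEPENDS_ON)
print(filtered)
print(suspects)
```

Because the sketch checks only direct dependencies, it produces suggestions rather than a verdict, which matches the role the text gives to root-cause analysis.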

Active responses can then be used to handle routine problems or even predicted problems so that system operators can concentrate on more complex issues.
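One way to picture active responses is a table that maps routine problem types to automated handlers, with everything else escalated to an operator. The sketch below is illustrative only: the alert fields, the problem kinds, and the handlers (which merely return a description instead of acting on a real system) are all assumptions.

```python
def clear_temp_files(alert):
    # Placeholder handler: a real one would act on the affected host.
    return f"cleared temporary files on {alert['component']}"


def restart_process(alert):
    # Placeholder handler: a real one would restart the hung process.
    return f"restarted process on {alert['component']}"


# Hypothetical mapping from routine problem kinds to automated responses.
ACTIVE_RESPONSES = {
    "disk_full": clear_temp_files,
    "process_hung": restart_process,
}


def respond(alert):
    """Handle routine alerts automatically; escalate anything unrecognized."""
    handler = ACTIVE_RESPONSES.get(alert["kind"])
    if handler is None:
        message = f"operator review needed for {alert['kind']} on {alert['component']}"
        return ("escalated", message)
    return ("automated", handler(alert))


print(respond({"kind": "disk_full", "component": "web-1"}))
print(respond({"kind": "latency_spike", "component": "db-primary"}))
```

Only the second alert reaches an operator, which is the division of labor the paragraph above describes.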




Practical Service Level Management. Delivering High-Quality Web-Based Services
ISBN: 158705079X
Year: 2003
Pages: 128
