SELF-HEALING


A self-healing capability detects abnormal operation of systems and business processes, in either predictive or reactive mode, and then initiates appropriate corrective action. It needs to achieve this without disrupting users. Balancing workload and transaction volume is one example of such an action. The Tivoli availability management portfolio provides tools that assist in the autonomic monitoring of system health and performance. These tools also collect metrics from numerous resources so that system data can be filtered, correlated, and analyzed. Based on this analysis, autonomic actions can be taken to halt problems, sometimes before they occur. Autonomic features are provided at multiple levels to proactively manage the IT infrastructure.
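
The detect-then-correct cycle described above can be sketched in a few lines of Python. This is only an illustration of the idea in reactive mode, not Tivoli code: the metric names, thresholds, and the `rebalance_workload` and `restart_service` actions are all hypothetical, and the actions merely return a description of what a real system would do.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """A hypothetical self-healing rule: act when a metric exceeds a threshold."""
    metric: str
    threshold: float
    action: Callable[[str, float], str]


def rebalance_workload(metric: str, value: float) -> str:
    # Placeholder: a real system would shift work to other servers here.
    return f"rebalance workload ({metric}={value})"


def restart_service(metric: str, value: float) -> str:
    # Placeholder: a real system would restart the failing component here.
    return f"restart service ({metric}={value})"


def evaluate(sample: Dict[str, float], rules: List[Rule]) -> List[str]:
    """Compare one sample of monitored data against every rule and
    return the corrective actions that were triggered."""
    actions = []
    for rule in rules:
        value = sample.get(rule.metric)
        if value is not None and value > rule.threshold:
            actions.append(rule.action(rule.metric, value))
    return actions


# Assumed thresholds, chosen only for the example.
RULES = [
    Rule("cpu_utilization", 0.90, rebalance_workload),
    Rule("error_rate", 0.05, restart_service),
]

print(evaluate({"cpu_utilization": 0.97, "error_rate": 0.01}, RULES))
```

Only the CPU rule fires for this sample, so a single rebalancing action is returned. A predictive variant would evaluate a forecast of the metric rather than its latest value.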

The software tools available in Tivoli for self-healing include:

  • Tivoli Enterprise Console

    The Enterprise Console collates error reports, derives the root cause of a problem, and initiates corrective action. Its event server and correlation engine enable cross-resource correlation of events observed from the hardware, the running applications, and the network devices. Events from multiple resources can be analyzed in real time to separate the problems that merit immediate attention from lower-priority events. After a problem is identified, the system takes self-healing action by responding automatically. As an option, support staff can be included to verify the response.

  • Tivoli Switch Analyzer

    The Switch Analyzer correlates network device errors to their root cause without user intervention. It is a Layer 2 (OSI data link layer) network management solution that provides automated Layer 2 discovery. It identifies the relationships between devices, including Layer 2 and Layer 3 devices. During a network event storm, it filters out extraneous events to isolate the true cause of the problem.

  • Tivoli NetView

    NetView® helps enable self-healing by discovering TCP/IP networks, displaying network topologies, correlating and managing events and SNMP traps, monitoring network health, and gathering performance data. Router fault isolation technology quickly identifies and focuses on the root cause of a network error and initiates corrective actions.

  • Tivoli Business Systems Manager

    The Business Systems Manager collects real-time operating data from distributed application components and resources across the enterprise and provides a comprehensive view of the IT infrastructure components that make up different business solutions. It contains technologies that analyze how an outage would affect a line of business, critical business process, or service-level agreement (SLA).

  • Tivoli Systems Automation S/390

    Systems Automation for the IBM S/390 mainframe manages real-time problems in the context of an enterprise's business priorities. It provides monitoring and management of critical system resources, such as processors, subsystems, the Sysplex Timer, and coupling facilities. It supports self-healing by providing mechanisms to reconfigure a processor's partitions, perform power-on reset and IML of processors, IPL operating systems (even automatically), investigate and respond to I/O configuration errors, and restart or stop applications if failures occur.

  • Tivoli Risk Manager

    The Risk Manager enables self-healing by assessing potential security threats and automating responses, such as server reconfiguration, security patch deployment, and account revocation. This helps enable system administrators who are not security experts to monitor and assess security risks in real time with a high degree of integrity and confidence across an organization's multiple security checkpoints. This product contains technology developed by IBM Research.

  • Tivoli Monitoring for Applications, Tivoli Monitoring for Databases, and Tivoli Monitoring for Middleware

    This family of products minimizes vulnerability by discovering, diagnosing, and reacting to disruptions automatically. It provides monitoring solutions and a local automation capability through a set of Proactive Analysis Components. A sophisticated resource model engine allows for local filtering of monitored data, raising events when specific conditions are met. Local rules can be encoded to take immediate corrective action, providing automatic recovery for server failures.

  • Tivoli Storage Resource Manager

    The Storage Resource Manager automatically identifies potential problems and executes policy-based actions to help prevent or resolve storage issues, minimize storage costs, and provide application availability. It can scan and discover storage resources in the IT environment. It supports autonomic policy-based automation for the allocation of storage quotas and storage space; monitors file systems; and provides reports on capacity and storage asset utilization.
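
The root-cause correlation attributed above to the Enterprise Console and the Switch Analyzer can be illustrated with a small topology-based filter. The sketch below is an assumption about how such filtering might work in principle, not the products' actual algorithm: it treats a failed device as a symptom whenever something upstream of it has also failed. The `UPSTREAM` topology and all device names are invented for the example.

```python
from typing import Dict, List, Optional, Set

# Hypothetical topology: each device maps to the device it depends on
# (None marks the top of the hierarchy).
UPSTREAM: Dict[str, Optional[str]] = {
    "core-switch": None,
    "switch-a": "core-switch",
    "switch-b": "core-switch",
    "server-1": "switch-a",
    "server-2": "switch-a",
    "server-3": "switch-b",
}


def root_causes(failed: Set[str],
                upstream: Dict[str, Optional[str]]) -> List[str]:
    """Return the failed devices that have no failed device upstream.

    Failures below another failure are treated as symptoms and
    filtered out, which is how an event storm is reduced to the
    few events worth acting on.
    """
    roots = []
    for device in failed:
        parent = upstream.get(device)
        masked = False
        # Walk up the dependency chain looking for a failed ancestor.
        while parent is not None:
            if parent in failed:
                masked = True
                break
            parent = upstream.get(parent)
        if not masked:
            roots.append(device)
    return sorted(roots)


# An event storm: one switch and both servers behind it report errors.
storm = {"switch-a", "server-1", "server-2"}
print(root_causes(storm, UPSTREAM))
```

Here the three events collapse to `switch-a`, the single device whose repair would clear the other two alarms. A production correlation engine would also have to account for timing, redundant paths, and incomplete topology data, none of which this sketch attempts.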

Figure 15.3 summarizes the self-healing products within the Tivoli product suite.

Figure 15.3. A summary of the self-healing products in the Tivoli suite.

Software that can repair itself must be a valuable contribution to future IT operations, datacenters, and corporate applications.

Autonomic Computing
ISBN: 013144025X
Year: 2004
Pages: 254
Authors: Richard Murch
