Various support tools report errors and faults, provide configuration information, and aid troubleshooting for hardware components, including the CPU, system memory, and tape devices. Some of these support tools also monitor software configurations to track changes.

Support Tool Manager

HP's Support Tool Manager (STM) provides access to a set of tools for verifying and troubleshooting HP-UX system hardware. These online diagnostic tools let you determine device status, get configuration information, and diagnose hardware problems. The tools are available through a GUI or through commands, and can be invoked automatically at periodic intervals.

STM discovers the hardware devices on a system and can diagnose memory errors, Low-Priority Machine Check (LPMC) errors, I/O driver errors, Logical Volume Manager (LVM) errors, and over-temperature events. Memory errors include single-bit errors and page deallocation events.

STM includes the Automatic Configuration Mapper, shown in Figure 4-8, which gives a graphical view of your hardware configuration using color-coded icons, showing device status as well as logical relationships, such as the peripherals connected to an I/O card. Each icon on the map represents a hardware device. These icons display the device type, device identifier, device path, last active tool, and test status (from the last active tool). You can launch other STM tools from this view as well.

Figure 4-8. STM Configuration Mapper showing the latest status of the CPU and memory.
The Information tool provides product identifier information, product description, hardware path, vendor name, firmware revision, and error log statistics, including read errors, which can be used to trend and anticipate problems. This tool also checks onboard log information, and can be used to track configuration changes.

Several other tools under STM perform varying levels of testing to stress a device or determine and diagnose problems:
STM enables an operator to run a module on several devices simultaneously. In addition, the operator can start diagnostic tests on more than one system from within the user interface.

STM provides both configuration and fault monitoring capabilities for the system. STM tools detect the same errors as the EMS Hardware Monitors, but the EMS Hardware Monitors report them in real time. After getting an EMS event, you can run STM to further diagnose a problem.

STM can diagnose local or remote systems. It is available on HP-UX releases 10.01 and later. STM replaces the Sherlock diagnostics. The software (product number B4708AA) is distributed on the HP-UX Diagnostic/IPR Media.

HP Predictive Support

HP Predictive Support detects and predicts system-related faults. When problem conditions are detected, notification is sent to the HP Response Center. This level of care is meant for customers with special support contracts with Hewlett-Packard.

The Predictive Support software proactively monitors the system and automatically reports information back to the HP Response Center via modem access. Because the HP Response Center is available 24 hours a day, 7 days a week, this procedure can lead to a quick response to problems.

The Predictive Support software focuses on system event information for memory and I/O devices. Error logs are analyzed daily, and potential problems are diagnosed. Because the software warns of potential problems in advance, scheduled maintenance can replace the unplanned downtime associated with a failed component.

Predictive Support uses a set of rules on a managed node to determine when events should be sent to the HP Response Center. These rules can be updated periodically by downloading new ones from HP. Event correlation ensures that duplicate messages are suppressed and that the Response Center is not repeatedly warned of the same root problem.

Predictive Support analyzes on-board logs, system logs, and memory logs.
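The combination of rules and event correlation described above can be illustrated with a small Python sketch. It is not Predictive Support's actual rule format or algorithm, which the text does not describe. The `Event` fields, the `RULES` table with its minimum counts, and the use of a (device, kind) pair as a stand-in for "the same root problem" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    device: str  # device name or hardware path (hypothetical format)
    kind: str    # hypothetical event kind, e.g. "io_error"
    count: int   # occurrences seen in the analyzed log window


# Hypothetical rule set: minimum count per event kind before reporting.
RULES = {"memory_single_bit": 10, "io_error": 3}


def events_to_report(events, rules=RULES, already_reported=None):
    """Return events that satisfy a rule and were not reported before."""
    seen = set() if already_reported is None else already_reported
    report = []
    for ev in events:
        threshold = rules.get(ev.kind)
        if threshold is None or ev.count < threshold:
            continue  # no rule for this kind, or below the rule's minimum
        key = (ev.device, ev.kind)  # assumed proxy for "same root problem"
        if key in seen:
            continue  # duplicate of an earlier notification: suppress it
        seen.add(key)
        report.append(ev)
    return report
```

Passing the same `already_reported` set on each daily run would keep a repeated fault on one device from generating a second notification, which is the effect the text attributes to event correlation.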
The software can automatically dial the HP Response Center to transmit error data and logs, or the system administrator can initiate modem transmission. Similarly, Predictive Support software updates, which include new rules for generating predictive events, can be triggered automatically or controlled by the administrator. Configuration and administration are controlled through a menu-driven interface.

System logs are scanned for I/O errors and LPMCs. Logged data is analyzed for trends associated with specific disk or tape devices, such as correctable errors. LPMC records are analyzed for internal cache errors. Memory logs are also scanned to look for error rates exceeding specified thresholds.

The Response Center determines where a failed device is located, its model number, its manufacturer, and its serial number, so that repairs can be made. This information is sent in the failure notification messages.

HP Predictive Support does not help with other areas of system monitoring, such as resource and performance management. Also, the software runs only on HP-UX systems.

HA Observatory

HA Observatory is a suite of tools used to detect and quickly diagnose system problems. The products include the Configuration Tracker, which keeps track of the server's software configuration, Network Node Manager, and HP Predictive Support. A support system and network router are also maintained at the customer site.

HA Observatory relies on HP Predictive Support to report hardware failures. In addition, configuration information collected by the Configuration Tracker is available. The Configuration Tracker generates and maintains a snapshot of the configuration so that it can detect software configuration changes.

HA Observatory uses a secure network link to HP's High Availability Support Center from a special system at the customer site. This support system, an HP 9000 Series 700 workstation, collects system configuration information from key servers and can be used to view network status and topology information. Hardware failure notifications and configuration information can be sent to HP. When permitted by the customer, HP support engineers can access the customer servers over the secure link to gather additional information.

HA Observatory is supported only on HP-UX systems and is available only to customers with BCS and CCS support contracts.
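The snapshot approach attributed to the Configuration Tracker above, keeping a saved view of the configuration and comparing it with the current one, can be sketched in Python as follows. This is a minimal illustration only: the snapshot layout (a mapping from item name to version string) and the function name `diff_snapshots` are assumptions, not the tool's real data model.

```python
def diff_snapshots(old, new):
    """Compare two {item: version} snapshots and classify the changes."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots taken at two points in time.
previous = {"pkg_a": "1.0", "pkg_b": "2.3"}
current = {"pkg_a": "1.1", "pkg_c": "0.9"}
changes = diff_snapshots(previous, current)
```

Storing each snapshot and diffing it against the next one is enough to show what was added, removed, or upgraded between two collection runs.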