Error/Fault Data for Switch Environmental Characteristics

The following section looks at the switch environmental characteristics such as power supply statistics, temperature status, and fan status. We'll identify some MIBs as well as some show commands to identify the data points for these variables.

MIB Variables for Voltages (Power Supply) and Fan

From CISCO-STACK-MIB, the following variables provide power supply (voltage) data for switches:

  • chassisPs1Type

  • chassisPs1Status

  • chassisPs1TestResult

  • chassisPs2Type

  • chassisPs2Status

  • chassisPs2TestResult

These MIB objects indicate the type of each power supply installed in the chassis and its status. Each power supply, 1 or 2, has its own set of objects.

The status objects report an ok status, a minorFault status, or a majorFault status. If the status is not ok, the value of the corresponding TestResult object gives more detailed information about the power supply's failure condition(s). Polling these variables is more flexible than actively polling the router's environmental variables: if an alarm is triggered, you can poll the TestResult objects to find out why the failure occurred. A status of minorFault or majorFault sets the corresponding chassisMinorAlarm or chassisMajorAlarm, which in turn triggers the chassisAlarmOn SNMP trap.
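
The decision flow described above can be sketched roughly as follows. This is only an illustration, not Cisco tooling: the function name assess_power_supply and the keys of the returned dictionary are hypothetical, and the status is passed in as the symbolic name used in the text rather than fetched over SNMP.

```python
# Fault statuses named in the text, mapped to the chassis alarm object each one sets.
ALARM_FOR_STATUS = {
    "minorFault": "chassisMinorAlarm",
    "majorFault": "chassisMajorAlarm",
}

def assess_power_supply(ps_number, status):
    """Decide what to do with one polled chassisPs<N>Status value (hypothetical helper)."""
    if ps_number not in (1, 2):
        raise ValueError("the text describes objects for power supplies 1 and 2 only")
    alarm = ALARM_FOR_STATUS.get(status)
    return {
        "healthy": status == "ok",
        "alarm_object": alarm,
        # A minor or major alarm is what triggers the chassisAlarmOn trap.
        "expect_trap": "chassisAlarmOn" if alarm else None,
        # Any status other than ok is a reason to read the TestResult object.
        "poll_next": None if status == "ok" else f"chassisPs{ps_number}TestResult",
    }

print(assess_power_supply(2, "majorFault")["poll_next"])
```

A management script built along these lines would poll the two status objects on a schedule and read the TestResult object only when a status is not ok.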

Table 10-19 summarizes the recommended baseline thresholds for voltages:

Table 10-19. Power-Supply-Monitored Voltage Thresholds
Parameter   Alarm (Low)   Normal               Alarm (High)
+5V         < 4.74V       4.74V to 5.26V       > 5.26V
+12V        < 11.40V      11.40V to 12.60V     > 12.60V
+24V        < 20.00V      20.00V to 30.00V     > 30.00V
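
As a small worked example, the thresholds in Table 10-19 can be applied to a measured value as shown below. The function name classify_voltage and the return labels are hypothetical; only the limits come from the table.

```python
# Normal ranges copied from Table 10-19: rail -> (low limit, high limit) in volts.
VOLTAGE_LIMITS = {
    "+5V": (4.74, 5.26),
    "+12V": (11.40, 12.60),
    "+24V": (20.00, 30.00),
}

def classify_voltage(rail, measured):
    """Return 'alarm-low', 'normal', or 'alarm-high' for a measured rail voltage."""
    low, high = VOLTAGE_LIMITS[rail]
    if measured < low:
        return "alarm-low"
    if measured > high:
        return "alarm-high"
    return "normal"

print(classify_voltage("+12V", 11.0))  # prints alarm-low
```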

Related objects from CISCO-STACK-MIB are the following:

  • chassisMinorAlarm

  • chassisMajorAlarm

  • chassisFanStatus

  • chassisFanTestResult

Voltage and Fan Information via show system

The show system command reports the power supply status (PS1-Status and PS2-Status) and the fan status. The normal status of a power supply is ok, or none if no redundant power supply is installed. The failure values are fan failed and faulty, either of which triggers a major or minor alarm. The normal status for the fan is ok; anything else indicates a fault with the fan. Example 10-19 shows output from a show system command.

Example 10-19 Obtaining power supply and fan status information with show system.
 Switch>show system
 PS1-Status PS2-Status Fan-Status Temp-Alarm Sys-Status Uptime d,h:m:s Logout
 ---------- ---------- ---------- ---------- ---------- -------------- ---------
 ok (A)     none (A)   ok (B)     off        ok         4,23:06:16     20 min

 PS1-Type   PS2-Type   Modem   Baud  Traffic Peak Peak-Time
 ---------- ---------- ------- ----- ------- ---- -------------------------
 WS-C5508 (C) none (C) disable  9600   0%      0% Wed Apr 21 1999, 15:57:24

 System Name              System Location          System Contact
 ------------------------ ------------------------ ------------------------

Following are annotated highlights of Example 10-19:

A The "PS-Status" columns display the current state of the power supplies installed in the switch. The possible values are ok, none, fan failed, or faulty.

B The "Fan-Status" column displays the current state of the fan installed in the switch. The possible values are ok, faulty, or other.

C The "PS-Type" columns display the kind of power supplies installed in the switch chassis. If no redundancy is used, one of the types will be "none".
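
If the show system output is collected by a script, the status columns can be checked automatically. The following is a minimal sketch that assumes the raw output (without the callout letters) has already been captured as text; the function names are hypothetical, and the sample row is the status line from Example 10-19.

```python
import re

def parse_fixed_width_row(header, ruler, values):
    """Slice one row of fixed-width output, using the dash runs as column spans."""
    row = {}
    for match in re.finditer(r"-+", ruler):
        start, end = match.span()
        row[header[start:end].strip()] = values[start:end].strip()
    return row

def power_and_fan_faults(row):
    """List the columns whose value is outside the normal states named in the text."""
    faults = [c for c in ("PS1-Status", "PS2-Status") if row.get(c) not in ("ok", "none")]
    if row.get("Fan-Status") != "ok":
        faults.append("Fan-Status")
    return faults

# First three lines of the status block, without the callout letters.
HEADER = "PS1-Status PS2-Status Fan-Status Temp-Alarm Sys-Status Uptime d,h:m:s Logout"
RULER = "---------- ---------- ---------- ---------- ---------- -------------- ---------"
VALUES = "ok         none       ok         off        ok         4,23:06:16     20 min"

row = parse_fixed_width_row(HEADER, RULER, VALUES)
print(power_and_fan_faults(row))  # prints [] for this healthy sample
```

Slicing by the positions of the dashes, rather than splitting on whitespace, keeps multiword values such as fan failed or 20 min in one column.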

SNMP Traps for Voltage and Fan Information

From CISCO-STACK-MIB TRAPS, two SNMP traps are relevant to voltage and fan information:

  • chassisAlarmOn

  • chassisAlarmOff

A chassisAlarmOn trap signifies that the agent entity has detected that the chassisTempAlarm, chassisMinorAlarm, or chassisMajorAlarm object in this MIB has transitioned to the on(2) state. The generation of this trap can be controlled by the sysEnableChassisTraps object in this MIB or by using the CLI command set snmp trap enable chassis.

A chassisAlarmOff trap signifies that the agent entity has detected that the chassisTempAlarm, chassisMinorAlarm, or chassisMajorAlarm object in this MIB has transitioned to the off(1) state. The generation of this trap can be controlled by the sysEnableChassisTraps object in this MIB or by using the CLI command set snmp trap enable chassis.
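
A trap receiver typically needs to know which of the three alarm objects caused the trap. The sketch below assumes the varbinds have already been decoded into a dictionary of object name to integer value; that dictionary shape and the function name active_alarms are hypothetical. The values off(1) and on(2) come from the text above, while critical(3) is an assumption based on chassisTempAlarm being described as off, on, or critical.

```python
# off(1) and on(2) are stated in the trap descriptions.
# critical(3) is assumed here and applies to chassisTempAlarm only.
STATE_NAMES = {1: "off", 2: "on", 3: "critical"}
ALARM_OBJECTS = ("chassisTempAlarm", "chassisMinorAlarm", "chassisMajorAlarm")

def active_alarms(varbinds):
    """Return {object name: state} for every alarm object that is not off."""
    active = {}
    for name in ALARM_OBJECTS:
        state = STATE_NAMES.get(varbinds.get(name), "unknown")
        if state in ("on", "critical"):
            active[name] = state
    return active

sample = {"chassisTempAlarm": 1, "chassisMinorAlarm": 2, "chassisMajorAlarm": 1}
print(active_alarms(sample))  # prints {'chassisMinorAlarm': 'on'}
```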

Syslog Messages for Voltage and Fan Information[1]

[1] Message and Recovery Procedures; http://www.cisco.com/univercd/cc/td/doc/product/lan/cat5000/rel_4_5/sys_msg/emsg.htm

Table 10-20 summarizes the syslog messages from the switch that relate to the voltage and fan statistics.

Table 10-20. Syslog Messages for Switch Voltage and Fan Information
Message
    Explanation

SYS-2-PS_OK: Power supply [dec] okay
    This message indicates that the power supply has been turned on or has returned to a proper state; [dec] is the power supply number.

SYS-2-PS_FAIL: Power supply [dec] Failed
    This message indicates that the power supply [dec] failed; [dec] is the power supply number. Replace the indicated power supply.

SYS-2-PS_FANFAIL: Power supply [dec] fan failed
    This message indicates that the power supply [dec] fan failed; [dec] is the power supply number. Replace the indicated power supply fan.

SYS-2-PS_NFANFAIL: Power supply [dec] and power supply fan failed
    This message indicates that the power supply [dec] and power supply fan failed; [dec] is the power supply number. Replace the indicated power supply and fan.

SYS-2-FAN_OK: Fan okay
    This message indicates that the chassis fan tray was plugged back in or returned to a proper state.

SYS-2-FAN_FAIL: Fan failed
    This message indicates that the chassis fan failed. Replace the fan.
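
A syslog collector can turn the messages in Table 10-20 into recommended actions. The following sketch matches only the facility, severity, and mnemonic shown in the table; the function name, the returned dictionary, and the assumption that the line may carry an arbitrary prefix (such as a timestamp) are illustrative, not taken from the source.

```python
import re

# Recommended actions paraphrased from Table 10-20.
ACTIONS = {
    "PS_OK": "none; power supply is on or has returned to a proper state",
    "PS_FAIL": "replace the indicated power supply",
    "PS_FANFAIL": "replace the indicated power supply fan",
    "PS_NFANFAIL": "replace the indicated power supply and fan",
    "FAN_OK": "none; chassis fan tray is back in a proper state",
    "FAN_FAIL": "replace the fan",
}

MESSAGE_RE = re.compile(r"SYS-(\d)-([A-Z_]+):\s*(.*)")

def parse_environment_syslog(line):
    """Return details for a power supply or fan message, or None for anything else."""
    match = MESSAGE_RE.search(line)
    if not match or match.group(2) not in ACTIONS:
        return None
    severity, mnemonic, text = match.groups()
    number = re.search(r"Power supply (\d+)", text)
    return {
        "severity": int(severity),
        "mnemonic": mnemonic,
        "power_supply": int(number.group(1)) if number else None,
        "action": ACTIONS[mnemonic],
    }

print(parse_environment_syslog("SYS-2-PS_FAIL: Power supply 2 Failed")["action"])
```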

MIB Variables for Temperature

From CISCO-STACK-MIB, the chassisTempAlarm object indicates the temperature alarm status as off, on, or critical. Refer to Table 10-16 for the temperature thresholds.

The temperature alarm status is typically not polled actively because, unlike the power supply and fan objects, it is carried as a varbind in the chassisAlarmOn SNMP trap. Treat this object just like the router environmental MIB objects: rely on the SNMP trap to determine the temperature status, following the principle "only notify me when it is an issue."

Table 10-21 shows the recommended baseline threshold for temperature:

Table 10-21. Processor-Monitored Temperature Thresholds [2]
Parameter   Normal            Alarm
Airflow     10° C to 55° C    > 55° C

[2] Catalyst 5000 Series Power Supply Configuration Notes; http://www.cisco.com/univercd/cc/td/doc/product/lan/cat5000/cnfg_nts/hw_cns/2236_01.htm
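
For completeness, the airflow threshold in Table 10-21 can be checked the same way as the voltage thresholds. The function name and labels are hypothetical; the table defines an alarm only above 55° C, so readings below 10° C are reported separately rather than as an alarm.

```python
NORMAL_AIRFLOW_C = (10, 55)  # normal range from Table 10-21, in degrees Celsius

def classify_airflow(temp_c):
    """Return 'alarm', 'normal', or 'below-normal' for an airflow temperature."""
    low, high = NORMAL_AIRFLOW_C
    if temp_c > high:
        return "alarm"
    if temp_c < low:
        return "below-normal"  # the table defines no alarm for this case
    return "normal"

print(classify_airflow(58))  # prints alarm
```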

Temperature Information via show system

Example 10-20 emphasizes the temperature alarm status (Temp-Alarm) as displayed in show system output.

Example 10-20 Obtaining temperature information with show system.
 Switch>show system
 PS1-Status PS2-Status Fan-Status Temp-Alarm Sys-Status Uptime d,h:m:s Logout
 ---------- ---------- ---------- ---------- ---------- -------------- ---------
 ok         none       ok         off (A)    ok         4,23:06:16     20 min

 PS1-Type   PS2-Type   Modem   Baud  Traffic Peak Peak-Time
 ---------- ---------- ------- ----- ------- ---- -------------------------
 WS-C5508   none       disable  9600   0%      0% Wed Apr 21 1999, 15:57:24

 System Name              System Location          System Contact
 ------------------------ ------------------------ ------------------------

The "Temp-Alarm" column (A) is either "on" or "off". The normal state is off. If it is on, look for a chassisAlarmOn SNMP trap whose chassisTempAlarm varbind has a value of either on or critical.

SNMP Traps for Temperature Information

The chassisAlarmOn and chassisAlarmOff traps can be useful in obtaining temperature information. See the earlier section, "SNMP Traps for Voltage and Fan Information," for a description of these traps.

Syslog Messages for Temperature Information

Table 10-22 summarizes syslog messages that provide temperature information for switches.

Table 10-22. Syslog Messages for Switch Temperature Information
Message
    Explanation

SYS-0-TEMP_CRITOK: Temp critical okay
    This message indicates the temperature is under 50° C (122° F); this message applies only to the redundant supervisor engine in the switch.

SYS-0-TEMP_CRITFAIL: Temp critical Failure
    This message indicates the temperature is above 70° C (158° F). The system automatically powers down after five minutes; this message applies only to the redundant supervisor engine in the switch. The recommended actions are to power down the system and contact your technical support representative.

SYS-0-TEMP_CRITRECOVER: Temp critical Recovered
    This message indicates the temperature dropped below 70° C (158° F), and the automatic powerdown was canceled; this message applies only to the redundant supervisor engine in the switch.

SYS-2-TEMP_HIGHOK: Temp high okay
    This message indicates that the temperature returned to a normal state, 20° C to 40° C (68° F to 104° F).

SYS-2-TEMP_HIGHFAIL: Temp high failure
    This message indicates that the temperature is between 40° C and 50° C (104° F and 122° F); this message applies only to the redundant supervisor engine in the switch.



Performance and Fault Management: A Practical Guide to Effectively Managing Cisco Network Devices (Cisco Press Core Series)
ISBN: 1578701805