15.4 Troubleshooting the Chassis


It is important to be able to check the health of the overall router and of its individual components at will. This section provides the commands needed to perform this task and explains the various qualifiers for each command. Bookmark this section as you work with Juniper Networks routers until you know the commands by heart.

15.4.1 Environmental Monitoring

This section addresses how to monitor the temperature, alarms, and general environment of the router. It is important to be able to do this both locally and remotely because there is often no one present at the router. With more and more networks becoming centrally managed, remote monitoring is a critical component of overall network health control.

15.4.1.1 Remote Craft Interface Monitoring on the M40, M40e, and M160 Routers

When you cannot be in front of the router you wish to troubleshoot, it is handy to be able to view remotely the information currently displayed on that router's craft interface. Note that the LCD data applies only to M40, M40e, and M160 routers, whose craft interfaces are equipped with an LCD display. The following command can be used in JUNOS view privilege mode from a terminal session to retrieve this data:

 lab@Chicago> show chassis craft-interface 

This command would offer output similar to that displayed below. Notice that there are no alarms and no known problems on this M20 router. Because there is no LCD screen on the craft interface of the M20 router, this command is the only way to discover this information. There are FPCs present in every available slot, again, with no problems.

 Red alarm:     LED off, relay off
 Yellow alarm:  LED off, relay off
 Host OK LED:   On
 Host fail LED: Off
 FPCs     0  1  2  3
 -------------------
 Green    *  *  *  *
 Red      .  .  .  .
 LCD screen:
   New York
   Up: 10+7:05:32
   Temperature OK

Each router type has a different output for this command. Please refer to your JUNOS software configuration manual or go online to the Juniper Networks Web site at www.juniper.net/techpubs/software.html.
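
When this output is collected from many routers, it can help to reduce it to a small data structure. The following Python sketch parses text shaped like the sample above. It assumes the layout shown there and assumes that an asterisk marks a lit LED and a period an unlit one. The function name parse_craft_interface is hypothetical, and because each router type formats this output differently, the patterns would need adjusting for other models.

```python
import re

# Sample text copied from the example output in this section.
SAMPLE = """\
Red alarm:     LED off, relay off
Yellow alarm:  LED off, relay off
Host OK LED:   On
Host fail LED: Off
FPCs     0  1  2  3
-------------------
Green    *  *  *  *
Red      .  .  .  .
"""

def parse_craft_interface(text):
    """Extract alarm LED/relay states and lit FPC LEDs from captured output."""
    result = {"alarms": {}, "fpcs": {}}
    slots = []
    for line in text.splitlines():
        alarm = re.match(r"\s*(Red|Yellow) alarm:\s*LED (\w+), relay (\w+)", line)
        if alarm:
            color, led, relay = alarm.groups()
            result["alarms"][color.lower()] = {"led": led, "relay": relay}
            continue
        fields = line.split()
        if fields and fields[0] == "FPCs":
            # Header row listing the FPC slot numbers.
            slots = [int(s) for s in fields[1:]]
        elif fields and fields[0] in ("Green", "Red") and slots:
            # Assumption: '*' means the LED is lit, '.' means it is off.
            for slot, mark in zip(slots, fields[1:]):
                if mark == "*":
                    result["fpcs"][slot] = fields[0].lower()
    return result

status = parse_craft_interface(SAMPLE)
print(status["alarms"])
print(status["fpcs"])
```

For the sample text, both alarm entries report the LED and relay as off, and every FPC slot maps to green.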

15.4.1.2 Chassis Alarms

We mentioned chassis alarms earlier in Section 15.3.1. You can also gather this information remotely. A chassis alarm is defined as an alarm that originates in a router component, such as the power supply. If a power supply fails, a chassis alarm will be generated. Of course, on the M40, M40e, and M160 models, which are equipped with an LCD display on the craft interface, you can view these alarms if you are standing at the router. By using the CLI from a local or remote terminal session, you can gather information on active chassis alarms by using the command:

 lab@Chicago> show chassis alarms 

Tables 15-1 and 15-2 show the output you will see for each type of alarm for models M5, M10, M20, and M40 and for model M160.

Table 15-1. Chassis Alarm CLI Output for Models M5, M10, M20, and M40
CLI Output | Alarm Type
fan-name stopped spinning | Fan failure
fan-name removed | Fan removed
Too few fans installed or working | Too many fans inoperable
temperature-sensor temperature sensor failed | Failure of temperature sensor
A temperature sensor exceeds 54 °C | Temperature too high
Power supply power-supply-name not providing power | Power-supply failure
Power supply power-supply-name 3.3V failed | Power-supply failure
Power supply power-supply-name 5V failed | Power-supply failure
Craft interface not responding | Craft-interface failure
Too many unrecoverable errors | FPC or PIC [*] errors (per slot information)
Too many recoverable errors | FPC or PIC [*] errors (per slot information)

[*] PIC errors are shown in the case of an M5 or M10 router.

Table 15-2. Chassis Alarm CLI Output for Model M160
CLI Output | Alarm Type
RED ALARM - fan-name Failure | Fan failure
YELLOW ALARM - fan-name Removed | Fan removed
RED ALARM - Too many fans missing or failing | Fan(s) missing
YELLOW ALARM - Temperature Warm | Internal chassis temperature greater than 65 °C
RED ALARM - Temperature Hot | Internal chassis temperature greater than 75 °C
RED ALARM - Temperature sensor failure | Internal chassis temperature-sensor failure
YELLOW ALARM - PEM pem-number Removed | Power supply removed
RED ALARM - PEM pem-number High Temperature | Power supply too hot
RED ALARM - PEM pem-number Output Failure | Power-supply output failed
RED ALARM - PEM pem-number Input Failure | Power-supply input failed
RED ALARM - SFM sfm-number Failure | Switch fabric module failed
RED ALARM - SFM sfm-number Removed | Switch fabric module removed
RED ALARM - Host host-number Failure | Host module failed
RED ALARM - Host host-number Removed | Host module removed
YELLOW ALARM - Craft Failure | Craft-interface failure
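
The two temperature rows of Table 15-2 amount to a simple threshold rule. The Python sketch below restates that rule for use in, say, a monitoring script that already has a temperature reading. The 65 °C and 75 °C thresholds come from the table. The function name and return format are hypothetical, and the router raises these alarms itself, so this only illustrates the logic.

```python
WARM_THRESHOLD_C = 65  # Table 15-2: yellow alarm above this temperature
HOT_THRESHOLD_C = 75   # Table 15-2: red alarm above this temperature

def classify_m160_temperature(celsius):
    """Return (severity, condition) for an internal chassis temperature."""
    if celsius > HOT_THRESHOLD_C:
        return ("RED", "Temperature Hot")
    if celsius > WARM_THRESHOLD_C:
        return ("YELLOW", "Temperature Warm")
    return (None, "Temperature OK")

for reading in (40, 70, 80):
    print(reading, classify_m160_temperature(reading))
```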

15.4.2 Power-Supply Monitoring

It is important to check the power supplies on the routers regularly. Although all Juniper Networks M-Series routers have redundant power supplies, it is critical that you know the status of every power supply at all times. If one power supply is nonfunctional and the standby (which has become the master) then fails, the router fails completely; regular checks help you head off that outcome.

There are three ways to monitor the power supply for operational status:

  1. Using the show chassis craft-interface command, as described in Section 15.4.1.1

  2. Using information gained from SNMP traps on the NMS, as described earlier in Section 15.3.2

  3. Using visual inspection, which is our primary focus in this section

Visual inspection can provide the status of the system LEDs, as was discussed earlier in Section 15.3.1, and, on M40, M40e, and M160 models only, the output on the LCD display of the craft interface. If the power supply is functioning normally, you should see a solid green OK LED illuminated on the power-supply faceplate, no system alarms, and no trouble indicated in the LCD display of the craft interface. Table 15-3 offers some tips on troubleshooting a power supply that is not functioning normally.

15.4.3 Control Board Monitoring

It is at least as important to monitor the health of the system control boards as it is to monitor any other system component. These boards are integral to the router's functioning, as you learned in Chapter 3. When troubleshooting the router, take careful note of the condition of the control boards. Here are a few tips:

  • If the LEDs are all faintly illuminated, the control board may not be properly seated. Try reseating and securing the board.

  • If the LED does not indicate normal operation on the control board (more information is provided for each model in Sections 15.4.3.1 to 15.4.3.4), the board may not be making solid contact with the backplane. Try tightening the top and bottom screws to see if the problem resolves. If it doesn't, try reseating and securing the control board.

  • If the control board doesn't appear to be functioning normally, the craft interface may be malfunctioning as well. You may receive alarms about fans, impellers, or other system components when in fact these components are working properly. If you encounter these issues, troubleshoot the control board.

The following sections address how to determine the status of the system control boards for each type of Juniper Networks router by visually inspecting the LEDs.

Table 15-3. Power-Supply Troubleshooting
Symptom | Possible Cause | Possible Solution | Applicable Router Models
Status Fail LED is red | Power supply has failed | Try replacing it. If a new power supply works, contact Juniper Networks to return the faulty power supply. If the new power supply does not work, contact the JTAC. | M20, M40
Status OK LED is green and blinking | Power supply has not yet initialized | Router may not be fully initialized. Wait a few minutes. | M20, M40
Power supply has no power; no LEDs are lit | Power supply may have failed because temperature threshold exceeded or loss of power from source | Check for alarms. Check condition and status of power cables, UPS, and power source. | All models
OUTPUT OK LED is blue and blinking | Power supply has failed | Try replacing it. If a new power supply works, contact Juniper Networks to return the faulty power supply. If the new power supply does not work, contact the JTAC. | M5, M10, M160

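
The rows of Table 15-3 can also be kept as a lookup for a runbook or help-desk tool. The Python sketch below transcribes the table into a dictionary keyed by the observed LED and its appearance. The key spellings ("green blinking", "all off") and the diagnose function are hypothetical conventions chosen for this illustration, not anything defined by JUNOS.

```python
# Shared solution text for the two "power supply has failed" rows of Table 15-3.
REPLACE_SUPPLY = (
    "Try replacing it. If a new power supply works, contact Juniper Networks "
    "to return the faulty power supply. If the new power supply does not work, "
    "contact the JTAC."
)

# (LED label, appearance) -> (applicable models, possible cause, possible solution).
# A models value of None means the row applies to all models.
TABLE_15_3 = {
    ("FAIL", "red"): ({"M20", "M40"}, "Power supply has failed", REPLACE_SUPPLY),
    ("OK", "green blinking"): (
        {"M20", "M40"},
        "Power supply has not yet initialized",
        "Router may not be fully initialized. Wait a few minutes.",
    ),
    (None, "all off"): (
        None,
        "Temperature threshold exceeded or loss of power from source",
        "Check for alarms. Check condition and status of power cables, "
        "UPS, and power source.",
    ),
    ("OUTPUT OK", "blue blinking"): (
        {"M5", "M10", "M160"}, "Power supply has failed", REPLACE_SUPPLY
    ),
}

def diagnose(model, led, appearance):
    """Return (cause, solution) for a symptom, or None if the table has no match."""
    row = TABLE_15_3.get((led, appearance))
    if row is None:
        return None
    models, cause, solution = row
    if models is not None and model not in models:
        return None
    return cause, solution

print(diagnose("M20", "OK", "green blinking"))
print(diagnose("M160", "OUTPUT OK", "blue blinking"))
```

A symptom that the table does not list for the given model returns None, which a caller should treat as "not covered by the table" rather than as a healthy power supply.
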
15.4.3.1 M5 and M10 Control Board

The M5 and M10 routers do not have a control board.

15.4.3.2 M20 Control Board

The SSB in the M20 model has five LED indicators on its front faceplate:

  1. The amber OFFLINE LED is illuminated and solid when the SSB is offline.

  2. The green ONLINE LED is illuminated and solid when the SSB is online.

  3. If the SSB is acting as master, the blue MASTER LED is illuminated and solid.

  4. The left-most green STATUS LED blinks faintly when the SSB is operating and grows brighter when many exception packets are being processed.

  5. The right-most green STATUS LED flashes rapidly when I/O activity is occurring.

15.4.3.3 M40 Control Board

The M40 model has an SCB on which there are four LED indicators:

  1. The green ACTIVE LED flashes rapidly when there is I/O activity on the SCB.

  2. The green RUN LED blinks slowly when the SCB processor is handling exception packets. Normally rather faint, this LED becomes brighter when many exception packets are being processed.

  3. The two amber LEDs, STAT1 and STAT2, flash when internal diagnostics are running.

15.4.3.4 M160 Control Board

The M160 model has two types of control module: the SFM and the PCG. The health and operational status of both components can be discovered through a visual inspection of their LEDs. The SFM has two LEDs on the front faceplate: green for OK and amber for FAIL. As with the LEDs on other components, solid green indicates an operational status, blinking green indicates that the component is still initializing, and solid amber indicates a component failure. If an SFM fails, try swapping it out with a spare SFM. If the new SFM works, contact Juniper Networks for replacement of the faulty module.

The PCG is located in the rear of the chassis beside the routing engine and has three LED indicators:

  1. The blue MASTER LED is illuminated and solid when the PCG is acting as master.

  2. The green OK LED is illuminated and solid when the PCG is in a normal operational state and blinks when the PCG is initializing.

  3. The amber FAIL LED is illuminated and solid when the PCG has failed to operate normally.
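
The OK and FAIL LED conventions described for the SFM and PCG follow the same pattern, which can be sketched as a small decision function, for example to standardize what a field technician records. The function name and the state strings "off", "solid", and "blinking" are hypothetical. Combinations that the text does not describe are reported as unknown rather than guessed.

```python
def interpret_m160_leds(ok, fail):
    """Map observed OK (green) and FAIL (amber) LED states to a component status.

    Each argument is one of "off", "solid", or "blinking".
    """
    if fail == "solid":
        return "failed"        # solid amber FAIL: component failure
    if ok == "solid":
        return "operational"   # solid green OK: normal operation
    if ok == "blinking":
        return "initializing"  # blinking green OK: still initializing
    return "unknown"           # combination not described in this section

print(interpret_m160_leds("solid", "off"))
print(interpret_m160_leds("blinking", "off"))
print(interpret_m160_leds("off", "solid"))
```

The FAIL LED is checked first so that a board showing both LEDs lit is treated as failed, which is the more cautious reading.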



Juniper Networks Reference Guide. JUNOS Routing, Configuration, and Architecture
ISBN: 0201775921
Year: 2002
Pages: 176