When the likely location of the problem has been identified, the first step is to physically examine and test the suspect device. This will identify the problem's cause in a surprising number of troubleshooting situations. Sometimes, the problem is caused by something as simple as a loose component; other times, something is damaged.
Once a suspect device is identified, it should be physically inspected. This is routine procedure. (Even when a suspect or troubled device is in a remote location, a contact person is sent to make an inspection.) Earlier, we stated that most troubleshooting tasks are done from the administrator's desk, and that's true. Most tasks are done from the administrator's PC or an NMS console. However, there's no substitute for actually looking at a device to see what's going on. The two parts of inspecting a device are reading its LEDs and inspecting the device's components.
If the device is still online, the first thing to do is to read the LEDs (light-emitting diodes). You probably recognize LEDs as those blinking lights on the front of many electronic devices. Virtually all network devices have LEDs to assist in troubleshooting. The LED bank arrangement follows the device layout:
Access devices with a bank of ports on the front, with a twisted-pair cable plugged into each port using an RJ-45 phone-style jack. Products from the Cisco Catalyst 2800 to the Catalyst 4500 Switch fit this description. There is usually one LED per port.
Motherboard-based routers with LAN segments plugged into the back, usually through twisted-pair cable, but also fiber-optic cables for uplinks. The Cisco 7200 Router fits this description. LEDs on these devices appear behind smoked-plastic panels on the front of these boxes.
High-end routers and switches of the bus-and-blade configuration, again with networks plugged into the back, both fiber-optic and twisted-pair cable. The Cisco 7500 Router and Catalyst 6500 Switch fit this description. LEDs on these devices appear both behind smoked-plastic panels on the front and on the blades (card modules) themselves on the back (remember, a blade is basically an entire router or switch on a board).
LEDs are also called activity lights. Each LED on an access device represents a host. Router and LAN switch LEDs represent entire LAN segments.
LEDs blink and change colors according to the port's status. Green means okay, and orange means the port is coming up. If the port is down, its LED goes dark. The port's LED blinks when packets are passing through it. A common practice is to press Reset to see what happens. LEDs temporarily go orange or even red if they encounter trouble during the power cycle. They will eventually go green, but the temporary error condition may indicate a nonfatal configuration error.
The rule is that if an activity light is green, the line is good and the problem must stem from some type of configuration problem. If the light is orange, the line is operating but malfunctioning. If the activity light is off, the line is down.
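The rule of thumb above can be written as a small lookup. This is only an illustrative sketch: the names `Led`, `diagnose_port`, and `ports_needing_attention` are hypothetical, and the LED states are assumed to be typed in by whoever is looking at the device rather than read from any Cisco interface.

```python
from enum import Enum


class Led(Enum):
    """The three activity-light states covered by the rule of thumb."""
    GREEN = "green"
    ORANGE = "orange"
    OFF = "off"


# Interpretation of each state, paraphrasing the rule stated in the text.
LED_DIAGNOSIS = {
    Led.GREEN: "line is good; suspect a configuration problem",
    Led.ORANGE: "line is operating but malfunctioning",
    Led.OFF: "line is down",
}


def diagnose_port(led: Led) -> str:
    """Return the rule-of-thumb diagnosis for one port's activity light."""
    return LED_DIAGNOSIS[led]


def ports_needing_attention(port_leds: dict[str, Led]) -> list[str]:
    """List the ports whose light is not green, in the order given."""
    return [port for port, led in port_leds.items() if led is not Led.GREEN]
```

For example, passing `{"port1": Led.GREEN, "port2": Led.OFF}` to `ports_needing_attention` returns only `port2`, the one port whose line the rule says is down.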
The next step is to physically inspect the device itself. Start by making sure the device is offline, and then remove the cover from the top of the device chassis and inspect the interior, looking for the following:
Loose connections: Look for any loosely attached card (module) or cable. Reseat any that are found.
New cards: If you know any card to be new, reseat it into its connection several times. New cards are more prone to oxidation or carbon film buildup on their backplane connections.
Burned or damaged parts: Look for any burned wires, ribbon cables, or cards. Also look at the backplane to see if it's okay. Closely inspect the wires leading to the device's power supply. Also look for any crimped wires.
Dirty device interior: If the device has dust and lint in the interior, turn off the device and clean it. Devices can accumulate a lot of foreign substances from the air in dusty or dirty environments, which sometimes can affect performance.
After completing the inspection, try rebooting the device to see if power-cycling it will fix the problem. One important caution: Don't change anything in the configuration. Doing so before rebooting can make it difficult to determine the problem's source afterward; it only adds more variables to the mix.
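One way to honor this caution is to confirm that the configuration text has not changed before power-cycling. The sketch below assumes the configuration was captured as plain text twice, once as a baseline and once just before the reboot; how it is captured is outside this example, and the function names `config_changes` and `safe_to_reboot` are hypothetical.

```python
import difflib


def config_changes(baseline: str, current: str) -> list[str]:
    """Return the added (+) and removed (-) lines between two config texts."""
    diff = list(
        difflib.unified_diff(
            baseline.splitlines(),
            current.splitlines(),
            fromfile="baseline",
            tofile="current",
            lineterm="",
            n=0,  # no context lines, so only real changes are reported
        )
    )
    # The first two entries are the ---/+++ file headers; "@@" lines mark hunks.
    return [line for line in diff[2:] if not line.startswith("@@")]


def safe_to_reboot(baseline: str, current: str) -> bool:
    """True only when nothing in the configuration text has changed."""
    return not config_changes(baseline, current)
```

If `safe_to_reboot` returns `False`, the lines from `config_changes` show what was altered, so those changes can be backed out before the reboot adds another variable.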
If no severe problem was found while inspecting the device, the next step is to try a power-cycle test to see how it responds. Power-cycle means to turn a device off and then turn it on again, which you probably know as the cure-all for Microsoft Windows. As we saw in Chapter 3, rebooting devices can tell a lot about the status of a device, and, in some cases, it even makes the problem go away.
When you reboot, if the configuration in memory is mismatched with the hardware, a variety of problems can ensue. Ports might hang, bus timeout errors may occur, and so on. If the device reboots and prompts for a password, the circuitry and memory are working properly. Some major symptoms and probable causes are outlined in Table 15-8.
Bad power supply; blown fuse; bad breaker; bad power switch; bad backplane
Bad or miswired power supply; bad (or poorly seated) processor card; bad memory board; bad IOS image in NVRAM; shorted wires
Symptom: Partial or constant reboot. Probable causes: Bad processor, controller, or interface card; bad backplane; bad power supply; bad microcode
Symptom: No cards show up in boot display. Probable causes: Bad processor, controller, or interface card; bad backplane; cards not seated in backplane; bad power supply
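A symptom-to-cause table like this lends itself to a simple lookup. The following is a minimal sketch that includes only the two symptoms named in the table above, with their causes copied as listed; the names `REBOOT_SYMPTOMS`, `probable_causes`, and `common_causes` are hypothetical.

```python
# Symptom -> probable causes, copied from the two named symptoms in the table.
REBOOT_SYMPTOMS = {
    "partial or constant reboot": [
        "bad processor, controller, or interface card",
        "bad backplane",
        "bad power supply",
        "bad microcode",
    ],
    "no cards show up in boot display": [
        "bad processor, controller, or interface card",
        "bad backplane",
        "cards not seated in backplane",
        "bad power supply",
    ],
}


def probable_causes(symptom: str) -> list[str]:
    """Look up a symptom, ignoring case and surrounding whitespace."""
    return REBOOT_SYMPTOMS.get(symptom.strip().lower(), [])


def common_causes(*symptoms: str) -> list[str]:
    """Causes shared by every listed symptom, in the first symptom's order."""
    if not symptoms:
        return []
    first, *rest = (probable_causes(s) for s in symptoms)
    return [cause for cause in first if all(cause in other for other in rest)]
```

When a device shows both symptoms, `common_causes` narrows the list to the causes the two rows share, which is a reasonable place to start checking.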
When hardware problems this extreme are encountered, it's time to call in support from Cisco or a third-party maintenance organization with which your enterprise has contracted. Typically, devices are shipped into the maintenance center for bench repair. Only end-user enterprises with spare parts, Cisco-trained personnel, and proper instruments attempt to repair networking hardware devices in-house.