41.3. Hardware Problems

One of the truisms of networking is that the majority of problems occur at the physical layer. Problems can include a failed NIC, a hub that has lost power, or bad cabling. A discussion of the relevant applications for troubleshooting physical networking issues follows.

41.3.1. Physical Connection Issues

Although the LPI Exam focuses mostly on local configuration issues that can be discovered and resolved by using applications such as ifconfig and hostname, it is important to understand that other devices may be causing problems.

41.3.1.1. Cabling

Although it is relatively unusual, cables can weaken and wires can sever. Some offices have wiring routed beneath carpets that receive substantial foot traffic, and users may roll over cables with their chairs. When troubleshooting such cabling problems, consider the following steps:

  1. Obtain a working system and attach it to the cable in question. If the cable works, you know you have a problem with the Linux system.

  2. Check for loose or broken cable connectors. If a connector is broken, the cable may be only partially inserted, causing failed or intermittent connections.
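
Before swapping cables, it can help to ask the kernel whether the NIC sees a link at all. The following is a minimal sketch, assuming a Linux system that exposes the carrier attribute under /sys/class/net. The function name link_state and its sysfs_root parameter are illustrative choices, not part of any standard tool.

```python
from pathlib import Path


def link_state(iface, sysfs_root="/sys/class/net"):
    """Return 'up', 'down', or 'unknown' for a network interface.

    Reads <sysfs_root>/<iface>/carrier, which the kernel sets to 1 when
    the NIC detects a link and to 0 when it does not.
    """
    carrier = Path(sysfs_root) / iface / "carrier"
    try:
        value = carrier.read_text().strip()
    except OSError:
        # The interface does not exist, or it is administratively down
        # (reading carrier then fails), so no conclusion can be drawn.
        return "unknown"
    if value == "1":
        return "up"
    if value == "0":
        return "down"
    return "unknown"


if __name__ == "__main__":
    print("eth0 link:", link_state("eth0"))
```

A result of "down" on an interface that is configured and enabled points toward the cable, the connector, or the device at the other end rather than toward the Linux configuration.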

41.3.1.2. Failed networking devices

Hubs, switches, and routers are generally quite reliable, but it is possible that an intervening device, rather than the Linux system, has failed. One way to confirm your suspicion that a hub or switch has failed is to use a crossover cable to connect the affected system directly to another working system. A crossover cable is essentially a standard LAN cable in which the transmit and receive wire pairs are swapped at one end, which lets two systems communicate directly with each other. If the two systems can communicate, it is likely that the hub or switch serving the system you originally suspected is the source of the problem.

When inspecting a hub or switch, look for the following:


Whether the device has power

Most hubs and all switches in a larger network are active devices. If they do not have power, you won't have a network connection.


Disconnected cables

The system's cable may have been simply disconnected and needs to be reconnected.


Steadily blinking lights

If lights are blinking steadily and continuously or remain on steadily, the device is experiencing a problem. Sometimes power surges cause devices to fail. Try powering it down and back on or simply replacing it.


Activated warning lights

Some hubs and switches have warning lights that will indicate a problem. Look for them, then take steps to solve the problem.


Misconfigured hardware settings

In one case, a Linux administrator noticed that a button on a hub had been pressed, causing one of its ports to act as a straight-through connection rather than as a normal hub port. Simply releasing the button solved the problem. Check the hub for any other improperly set switches.

41.3.2. Problems with the Interface Card

If a NIC completely fails, it will fail to initialize at boot or to respond to the ifup command. The lspci and usbview commands can help you determine whether the NIC has been recognized as valid hardware. If these commands do not show that the system recognizes the NIC, consider installing a new one. But the problem may also lie somewhere other than the NIC. Make sure that you know what your system is telling you by consulting log files and reviewing screen output.

Finally, you can inspect the lights that ship with most NICs. These lights may seem to be useless, but they can help you determine whether the NIC is receiving power. If the lights are flashing randomly, it is likely that the device is receiving traffic. If you find that the lights are blinking steadily, it is likely the device has a configuration problem. If you find that one or more lights are simply staying on constantly, you likely have a hardware configuration problem. Of course, if the lights are not turned on at all, the NIC has not been recognized by your system's bus or has completely failed.

To solve such problems, make sure that the NIC is recognized by the Linux server. Consult the distribution's Hardware Compatibility List (HCL). You may find that you will have to get a new NIC.
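
As a rough illustration of checking whether the system recognizes a PCI NIC, the sketch below runs lspci and keeps only the lines that describe network hardware. It assumes that lspci (from the pciutils package) is installed and that the device class appears in its output as "Ethernet controller" or "Network controller". The function names are hypothetical helpers written for this example.

```python
import subprocess


def network_controllers(lspci_text):
    """Return the lspci lines that describe Ethernet or other network controllers."""
    keywords = ("ethernet controller", "network controller")
    return [
        line.strip()
        for line in lspci_text.splitlines()
        if any(keyword in line.lower() for keyword in keywords)
    ]


def detected_nics():
    """Run lspci and return the matching lines, or [] if lspci cannot be run."""
    try:
        result = subprocess.run(
            ["lspci"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return []
    return network_controllers(result.stdout)


if __name__ == "__main__":
    nics = detected_nics()
    if nics:
        print("\n".join(nics))
    else:
        print("No PCI network controller reported (or lspci is not available)")
```

An empty result on a machine with a PCI NIC installed suggests that the card is not being seen on the bus at all, which is a hardware question rather than a driver or configuration question. USB adapters do not appear in lspci output and would need a separate check.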

41.3.3. Reviewing Screen Output

Do not simply focus on the NIC's lights or on the system's log files. Some systems are configured to report critical problems directly to the screen. In other cases, problems experienced by the NIC can cause warning messages to be printed on the screen, even though the system is not specially configured for this.

In most cases, these messages appear on the screen while the system boots. They can indicate that the system is delaying the interface's initialization, or they can report errors in transmission and reception.
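
Boot messages scroll past quickly, so it is often easier to capture them as text and filter them afterward. The following sketch assumes you already have the message text in a string (for example, saved dmesg output) and simply keeps the lines that name a given interface. The function name interface_messages is an illustrative choice.

```python
import re


def interface_messages(log_text, iface="eth0"):
    """Return the lines of log_text that mention iface as a whole word.

    The word-boundary match keeps a search for eth1 from also matching
    names such as eth10.
    """
    pattern = re.compile(r"\b" + re.escape(iface) + r"\b")
    return [line for line in log_text.splitlines() if pattern.search(line)]
```

Matching on the interface name alone is deliberately crude. Messages that name only the driver, and not the interface, will be missed, so treat an empty result as a reason to read the full output rather than as proof that nothing went wrong.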

41.3.4. Changes to the Kernel and /etc/modules

When a Linux system scans for PCI devices at boot time, it recognizes the devices it finds in the order that it finds them. The first card recognized becomes eth0, the second card recognized becomes eth1, and so forth. Recognition involves the act of detecting the hardware and assigning any drivers and modules. Sometimes, a seemingly innocuous change to the system can cause problems with PCI-based network devices.

In some cases, changes to the /etc/modules.conf file can cause devices to go undetected or to be detected in a different order. In one case, an application completely unrelated to networking rewrote the /etc/modules.conf file on a dual-NIC Linux router and inadvertently changed the order in which modules were loaded, which in turn changed the order in which the NICs were recognized. The NIC that had been recognized as eth0 for two years was suddenly recognized as eth1. Because this system was a router, the eth0 device was configured to masquerade connections, whereas the eth1 device was not.

Normally, this would not be a problem, except that the company's ISP required all Internet-facing network interfaces to register their MAC addresses. Now the system was recognizing an unregistered card as eth0, and the ISP would not recognize the new eth1 card as an Internet-addressable device. So a seemingly simple change to the /etc/modules.conf file caused serious networking problems for the company.

Updating the kernel can also sometimes change the order in which PCI devices are scanned, with results similar to those in the previous example.

Finally, if for some reason a NIC's driver is loaded at a different time from previous boots, this NIC may be recognized earlier or later than before. As a result, the NIC may be assigned a different name.

Solving the problems discussed in this section is relatively trivial once you know what caused them. In the first instance, simply editing the /etc/modules.conf file and restoring the previous module order solved the problem. For the second problem (a kernel update), physically swapping the NICs would work. For the third problem (a driver loading at a different time), you can either swap the NICs or change the point in the boot process at which the drivers are loaded. To do the latter, you would have to either reconfigure some boot scripts or possibly use an application such as YaST or netconfig.

Whatever solution you choose, it is important to understand that a seemingly unrelated change to the system can cause a ripple effect.
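
One way to notice this kind of renaming early is to record which MAC address belongs to which interface name while the system is known to be good, then compare that record against the current state. The sketch below assumes a Linux system that exposes each interface's MAC address in /sys/class/net/<name>/address. The function names and the format of the saved record (a dictionary of name to MAC address) are assumptions made for this example.

```python
from pathlib import Path


def current_macs(sysfs_root="/sys/class/net"):
    """Map each interface name to the MAC address reported by sysfs."""
    macs = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return macs
    for entry in sorted(root.iterdir()):
        try:
            macs[entry.name] = (entry / "address").read_text().strip().lower()
        except OSError:
            continue  # not an interface directory, or not readable
    return macs


def renamed_interfaces(expected, actual):
    """Return {name: (expected_mac, actual_mac)} for names whose MAC differs.

    expected is a record made while the system was known to be good;
    actual is the current mapping, for example from current_macs().
    A name that is missing from actual is reported with None as its MAC.
    """
    changes = {}
    for name, mac in expected.items():
        found = actual.get(name)
        if found is None or found != mac.lower():
            changes[name] = (mac.lower(), found)
    return changes
```

Calling renamed_interfaces(saved_record, current_macs()) after a kernel update or a change to the module configuration returns an empty dictionary when every name still refers to the same card. In the dual-NIC router example, both eth0 and eth1 would be reported, each showing the other's MAC address.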

41.3.5. Checking Log Files

If you have an interface experiencing problems, you will likely be able to read about it in the /var/log/messages or /var/log/syslog files. You can also review dmesg output to view the contents of the kernel message buffer. Understand, however, that this buffer can be overwritten, resulting in incomplete information from the kernel.

The buffer is a fixed-size ring buffer (sometimes called a "round-robin buffer"): when the kernel needs more space, it overwrites the oldest entries first. The latest messages are always at the end of the buffer, so even when early messages have been lost, you will at least be able to read the most recent ones.
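
The overwrite behavior described above can be imitated in a few lines. This is only a simplified model: it limits the buffer by number of messages, whereas the real kernel buffer is limited by size in bytes. The function name make_message_buffer is a hypothetical helper for this example.

```python
from collections import deque


def make_message_buffer(capacity):
    """Create a fixed-size buffer that drops its oldest entry when full."""
    return deque(maxlen=capacity)


if __name__ == "__main__":
    buffer = make_message_buffer(3)
    for number in range(1, 6):
        buffer.append(f"message {number}")
    # Messages 1 and 2 were displaced by later arrivals; 3 to 5 survive.
    print(list(buffer))  # ['message 3', 'message 4', 'message 5']
```

The practical consequence is the same as with dmesg: on a busy or long-running system, the earliest boot messages may no longer be in the buffer, so the log files on disk are the better source for them.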




LPI Linux Certification in a Nutshell
ISBN: 0596005288
EAN: 2147483647
Year: 2004
Pages: 257