Error/Fault Detection

The most fundamental fault detection for an Ethernet interface is the same as for any system interface: determining whether the interface is operational. Changes in interface state trigger linkUp or linkDown traps. You can also monitor ifOperStatus. For further details on this type of fault monitoring, refer to the section on system interfaces in Chapter 12, "Monitoring System Interfaces." The rest of this section discusses Ethernet-specific fault management.

Ethernet errors come in a variety of flavors, but all represent a framing error of some sort. There is either an FCS error or the frame is too short or too long. Following are the basic Ethernet errors and a brief explanation of each:

  • CRC or FCS errors: The Frame Check Sequence (FCS) is a 32-bit cyclic redundancy check (CRC) appended to the end of each transmitted Ethernet frame. If the receiving station detects an error in the FCS, one or more of the bits in the Ethernet frame are in error and the frame is discarded. CRC errors should occur very rarely. Ethernets (according to the IEEE 802.3 specification) should have an error rate no greater than 10^-8. On an Ethernet transmitting 1500-byte packets at 100 percent load, that translates to one errored frame out of approximately 82 million frames. Cable problems, poor connections, or faulty interfaces can cause CRC errors. If you detect such errors, it is time to use Layer 1/Layer 2 test equipment, such as a cable tester, to determine whether the cables and connections are up to specification.

  • Alignment Errors: The frame does not have an integer number of octets and has a bad frame check sequence. These errors indicate a faulty transmitter or a cable problem.

  • Runts or Fragments: Runts or fragments are frames on the wire that are shorter than 64 bytes and that usually have an invalid FCS. Seeing fragments is normal because they can be the result of collisions.

  • Jabbers: A jabber is a frame longer than 1518 bytes with a bad FCS. Jabbers are usually due to a malfunctioning interface.

  • Collisions: These are not errors; they are normal occurrences on shared Ethernet segments. You do want to monitor them, however, because a higher-than-normal collision count can indicate a congested segment or other problems on the segment.
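The definitions in the list above can be sketched as a small classifier. This is only an illustration in Python, not how an interface chip implements the checks. The names `fcs_is_valid` and `classify_frame` are hypothetical, and the sketch assumes the frame is supplied without preamble and that the FCS is the standard CRC-32 (as computed by `zlib.crc32`) stored least significant byte first.

```python
import zlib

MIN_FRAME_BYTES = 64    # shortest legal frame, FCS included
MAX_FRAME_BYTES = 1518  # longest legal frame, FCS included


def fcs_is_valid(frame: bytes) -> bool:
    """Recompute the CRC-32 over all but the last 4 bytes and compare it to them."""
    if len(frame) < 4:
        return False
    body, fcs = frame[:-4], frame[-4:]
    # Assumption: the 4 FCS bytes hold the CRC least significant byte first.
    return zlib.crc32(body) == int.from_bytes(fcs, "little")


def classify_frame(length_bits: int, fcs_ok: bool) -> str:
    """Map a received frame to the categories described in the list above."""
    whole_octets, extra_bits = divmod(length_bits, 8)
    if whole_octets < MIN_FRAME_BYTES:
        return "runt"
    if whole_octets > MAX_FRAME_BYTES:
        return "too long" if fcs_ok else "jabber"
    if not fcs_ok:
        # A bad FCS plus a non-integer octet count is an alignment error.
        return "alignment error" if extra_bits else "FCS error"
    return "ok"


# Build a minimum-size frame, then flip one bit to simulate line corruption.
body = bytes(60)
good = body + zlib.crc32(body).to_bytes(4, "little")
bad = bytes([good[0] ^ 0x01]) + good[1:]

for frame in (good, bad):
    print(classify_frame(len(frame) * 8, fcs_is_valid(frame)))
```

A single flipped bit is enough to change the computed CRC, which is why even marginal cabling shows up in the FCS counters.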

Collisions Caused by Duplex Configuration Errors

Although collisions are not technically errors, a very common cause of collisions on twisted-pair links is a duplex-configuration error. Clause 28 of the IEEE 802.3u standard defines Auto-Negotiation, the process by which two devices negotiate a common link speed and duplex configuration. However, interoperability between vendors on this process is notoriously poor. It is common for two devices from different vendors to negotiate different duplex configurations, leaving one side at half-duplex and the other at full-duplex. Such a configuration error causes abnormally high collision rates and CRC errors. The solution is to manually configure both ends of the link to the proper link speed and duplex.

Just as with traffic monitoring, very similar data can come from different sources. The next few sections discuss various sources of fault and error data.

MIB Variables for Ethernet Errors

The following list shows the MIB objects you can poll to collect the statistics on Ethernet interface errors:

  • ifInErrors from RFC 2233: For an Ethernet interface, this counter is the sum of three error conditions: alignment, giants, and FCS errors.

  • dot3StatsAlignmentErrors, dot3StatsFrameTooLongs, dot3StatsFCSErrors from RFC 2358: These three variables summed together equal ifInErrors.

  • etherStatsCRCAlignErrors, etherStatsJabbers from RFC 1757: These variables are available only on the 2500 routers and Catalyst switches. Summed together, they equal ifInErrors.

For Ethernet, any framing error or data corruption is bad because it causes the MAC layer to discard the frame. Any request for retransmission of the lost frame must come after timeouts from the upper-layer protocols. Even very small numbers of framing errors can cause major degradation in performance.

Most framing errors and data corruptions are due to a physical layer problem such as a faulty interface or bad cable. In general, it is best to monitor ifInErrors because it is the sum of the main types of framing errors.
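Because these objects are counters, what matters is the change between two polls rather than the absolute value. The following Python sketch shows one way to turn two samples into an error ratio. It assumes the values have already been collected by some SNMP poller (not shown), that the counters are 32 bits wide, and that `ifInUcastPkts` and `ifInNUcastPkts` are polled alongside `ifInErrors`; the function names and sample numbers are hypothetical.

```python
COUNTER32_MODULUS = 2 ** 32


def counter_delta(previous: int, current: int) -> int:
    """Difference between two Counter32 samples, allowing for a single wrap."""
    return (current - previous) % COUNTER32_MODULUS


def input_error_ratio(prev: dict, curr: dict) -> float:
    """Fraction of frames received in the interval that were counted as errors."""
    errors = counter_delta(prev["ifInErrors"], curr["ifInErrors"])
    delivered = sum(
        counter_delta(prev[name], curr[name])
        for name in ("ifInUcastPkts", "ifInNUcastPkts")
    )
    total = errors + delivered
    return errors / total if total else 0.0


# Two hypothetical polls; ifInErrors wraps past 2**32 between them.
first_poll = {"ifInErrors": 4294967290, "ifInUcastPkts": 1000, "ifInNUcastPkts": 50}
second_poll = {"ifInErrors": 4, "ifInUcastPkts": 1940, "ifInNUcastPkts": 100}

print(counter_delta(first_poll["ifInErrors"], second_poll["ifInErrors"]))
print(input_error_ratio(first_poll, second_poll))
```

The modulo arithmetic handles only one wrap per interval, so the polling interval must be short enough that a counter cannot wrap twice between samples.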

MIB Variables for Ethernet Collisions

The following list shows the MIB objects you can poll to collect the statistics on Ethernet collisions:

  • dot3StatsSingleCollisionFrames, dot3StatsMultipleCollisionFrames from RFC 2358: These counters are available on either switches or routers. As their names indicate, they are the number of frames that encountered either a single collision or multiple collisions before transmission was possible.

  • etherStatsCollisions from RFC 1757: The total number of collisions (single or multiple) on a given interface.

The RMON collision counter is easier to monitor because it is one object with the complete count of all collisions, but it is available only on 2500 routers and Catalyst switches. On the 2500 routers, however, the LANCE Ethernet chip detects collisions only when it is transmitting, so do not use this counter on a 2500 router to get collision counts for the whole segment. For other routers, the dot3 MIB objects are the best choice. Remember that collisions are natural on shared or half-duplex Ethernet. The presence of collisions does not mean there is a problem. It is important to baseline the collision rate on a given segment and then watch for sudden, unexplained increases.

However, on a full-duplex segment, you should see no collisions. The presence of collisions on a full-duplex segment often means that one interface on the link is configured for half-duplex and the other for full-duplex.
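The two rules above (compare against a baseline on half-duplex links, expect zero on full-duplex links) can be sketched in Python as follows. The function names, the sample rates, and the three-standard-deviation threshold are assumptions chosen for illustration, not values from any standard or from the counters' definitions.

```python
from statistics import mean, stdev
from typing import Sequence


def collision_rate(collisions: int, frames_out: int) -> float:
    """Collisions per transmitted frame over one polling interval."""
    return collisions / frames_out if frames_out else 0.0


def assess_collisions(
    duplex: str, rate: float, baseline: Sequence[float], sigmas: float = 3.0
) -> str:
    """Flag any collisions on full duplex, or a jump above baseline on half duplex."""
    if duplex == "full":
        return "possible duplex mismatch" if rate > 0 else "ok"
    if len(baseline) < 2:
        return "no baseline yet"
    # Assumed threshold: baseline mean plus a few standard deviations.
    threshold = mean(baseline) + sigmas * stdev(baseline)
    return "above baseline" if rate > threshold else "ok"


# Hypothetical history of collision rates on a shared segment.
history = [0.010, 0.012, 0.011, 0.009, 0.010]

print(assess_collisions("half", collision_rate(11, 1000), history))
print(assess_collisions("half", collision_rate(50, 1000), history))
print(assess_collisions("full", collision_rate(1, 1000), history))
```

A fixed multiple of the standard deviation is the simplest possible threshold; a production system would more likely baseline by time of day as well.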

CLI Commands for Ethernet Errors

For routers, the best show command for examining Ethernet interface errors is the show interface command. It shows the current state of the interface in detail. Example 13-4 provides sample output for show interface.

Example 13-4 Using show interface to get error information on a router's Ethernet interfaces.
 nms-7010a#sh int fa0/0
 FastEthernet0/0 is up, line protocol is up
   Hardware is cyBus FastEthernet Interface, address is 0060.5490.f800 (bia 0060.5490.f800)
   MTU 1500 bytes, BW 100000 Kbit, DLY 100 usec, rely 255/255, load 1/255
   Encapsulation ARPA, loopback not set, keepalive set (10 sec), fdx, 100BaseTX/FX
   ARP type: ARPA, ARP Timeout 04:00:00
   Last input 00:00:01, output 00:00:01, output hang never
   Last clearing of "show interface" counters 00:43:40
   Queueing strategy: fifo
   Output queue 0/40, 0 drops; input queue 0/75, 0 drops
   5 minute input rate 2376 bits/sec, 27 packets/sec
   5 minute output rate 1000 bits/sec, 7 packets/sec
      3653980 packets input, 269895525 bytes, 0 no buffer
      Received 3499 broadcasts, 0 runts [A], 0 giants [B]
      1119 input errors [D], 1119 CRC [C], 540 frame [G], 0 overrun, 0 ignored, 0 abort
      0 watchdog, 744 multicast
      0 input packets with dribble condition detected [H]
      1507771 packets output, 161101301 bytes, 0 underruns
      0 output errors, 0 collisions [E], 2 interface resets [F]
      0 babbles, 0 late collision [I], 0 deferred [J]
      0 lost carrier, 0 no carrier
      0 output buffer failures, 0 output buffers swapped out

The annotated information in Example 13-4 is as follows:

A runts: The number of input packets discarded because they were less than 64 bytes long.

B giants: Equivalent to jabbers, the number of packets discarded because they were greater than 1518 bytes long.

C CRC: The number of input frames where the checksum calculated by the router does not match the checksum at the end of the frame.

D input errors: The total number of input errors.

E collisions: The number of frames that had to be retransmitted because of a collision.

F interface resets: The number of times the interface has been reset either by an internal error condition or through an administrative shutdown.

G frame: The number of frames received with a CRC error and a non-integral number of octets. Could be the result of a collision or a faulty interface.

H dribble condition: The device received a frame that was slightly too long, but the frame is accepted and forwarded. The counter is for information only.

I late collision: An error indicating that something in the Ethernet is out of specification. Either the cable is too long or perhaps there are too many repeaters.

J deferred: A packet has not been transmitted due to an excessive number of collisions.
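When the counters described above have to be collected from the CLI rather than through SNMP, the relevant numbers can be pulled out of the text with a regular expression. The Python sketch below assumes the command output has already been captured as a string by some other means; the function name is hypothetical, and the sample text reuses lines from Example 13-4.

```python
import re

# Counter names exactly as they appear in the "show interface" output.
COUNTER_PATTERN = re.compile(
    r"\b(\d+) (input errors|CRC|frame|runts|giants|collisions|"
    r"late collision|deferred|interface resets)\b"
)

SAMPLE_OUTPUT = """\
     Received 3499 broadcasts, 0 runts, 0 giants
     1119 input errors, 1119 CRC, 540 frame, 0 overrun, 0 ignored, 0 abort
     0 output errors, 0 collisions, 2 interface resets
     0 babbles, 0 late collision, 0 deferred
"""


def parse_error_counters(text: str) -> dict:
    """Return a {counter name: value} mapping for the counters of interest."""
    return {name: int(value) for value, name in COUNTER_PATTERN.findall(text)}


counters = parse_error_counters(SAMPLE_OUTPUT)
for name, value in counters.items():
    print(f"{name}: {value}")
```

Output formats differ between platforms and software releases, so a pattern like this should be checked against the actual device before it is relied on.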

For Catalyst switches, the best command to examine Ethernet interface errors is the show port counters command, as illustrated in Example 13-5.

Example 13-5 Using show port counters to get error information for a switch's Ethernet interfaces.
 nms-5505a (enable) show port counters 1/1
 Port   Align-Err[A]  FCS-Err[B]  Xmit-Err[C]  Rcv-Err[D]  UnderSize[E]
 -----  ------------  ----------  -----------  ----------  ------------
 1/1    0             0           0            0           0

 Port   Single-Col[F]  Multi-Coll[G]  Late-Coll[H]  Excess-Col[I]  Carri-Sen  Runts[J]  Giants[K]
 -----  -------------  -------------  ------------  -------------  ---------  --------  ---------
 1/1    0              0              0             0              0          0         -

 Last-Time-Cleared
 --------------------------
 Thu Jan 21 1999, 16:55:02

A Align-Err: The number of frames that do not have an integer number of octets and have an incorrect frame check sequence.

B FCS-Err: The number of frames with an incorrect frame check sequence.

C Xmit-Err: The internal transmit buffer is full.

D Rcv-Err: The internal receive buffer is full.

E UnderSize: Frames smaller than 64 bytes with a good FCS.

F Single-Col: The number of times the port had a single collision before transmitting the frame.

G Multi-Coll: The number of times the port had more than one collision before transmitting the frame. Note that this counter does not record how many collisions actually occurred while trying to transmit the frame, only that there was more than one.

H Late-Coll: An error indicating that the Ethernet is out of specification. A cable is too long or there are too many repeaters.

I Excess-Col: The number of frames that were dropped because the port saw 16 sequential collisions attempting to transmit that one frame.

J Runts: Frames less than 64 bytes long and with a bad FCS.

K Giants: Frames greater than 1518 bytes long and with a bad FCS, the same as a jabber.
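The show port counters output is columnar, so a script can pair each header with the value printed beneath it. The Python sketch below assumes the output has been captured as text and that every table consists of a header line starting with "Port", a line of dashes, and one line of values; the function name is hypothetical, and the sample reuses the values from Example 13-5.

```python
SAMPLE_OUTPUT = """\
Port  Align-Err  FCS-Err    Xmit-Err   Rcv-Err    UnderSize
----- ---------- ---------- ---------- ---------- ---------
 1/1           0          0          0          0         0

Port  Single-Col Multi-Coll Late-Coll  Excess-Col Carri-Sen Runts     Giants
----- ---------- ---------- ---------- ---------- --------- --------- ---------
 1/1           0          0          0          0         0         0         -
"""


def parse_port_counters(text: str) -> dict:
    """Pair each column header with the value below it, table by table."""
    counters = {}
    lines = [line for line in text.splitlines() if line.strip()]
    # Slide a three-line window: header, dashed rule, values.
    for header, rule, values in zip(lines, lines[1:], lines[2:]):
        if header.split()[0] != "Port" or not rule.lstrip().startswith("-"):
            continue
        names, fields = header.split(), values.split()
        for name, field in zip(names[1:], fields[1:]):
            # A "-" means the counter is not reported for this port.
            counters[name] = int(field) if field.isdigit() else None
    return counters


for name, value in parse_port_counters(SAMPLE_OUTPUT).items():
    print(f"{name}: {value}")
```

Splitting on whitespace works here because neither the headers nor the values contain spaces; output with multi-word headers would need fixed column offsets instead.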

Syslog Messages Relating to Ethernet Errors

Table 13-2 outlines several common System Error messages and their general causes. Each message is specific to the chipset used on the Ethernet interface. Please refer to the "Cisco IOS Software System Error Messages" guide for details on each individual message.

Table 13-2. Ethernet System Error Messages

Messages: AMDP2-1-MEMERR, AMDP2_FE-3-SPURIDON, AMDP2_FE-1-DISCOVER, AMDP2_FE-1-INITFAIL, AMDP2_FE-3-UNDERFLO, DEC21140-1-DISCOVER, DEC21140-3-ERRINT, LANCE-4-BABBLE, LANCE-3-BADCABLE, LANCE-1-MEMERR

Explanation: These messages usually indicate a problem on the device such as faulty interface hardware, software problems, or memory problems.

Messages: AMDP2_FE-5-COLL, AMDP2_FE-5-LATECOLL, DEC21140-5-COLL, ETHERNET-1-TXERR, LANCE-5-COLL

Explanation: These types of messages are most likely the result of a duplex mismatch or just general congestion on the line. A sudden flurry of these messages may also indicate cabling problems.
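A syslog collector can apply the two groupings from Table 13-2 automatically. The Python sketch below is one possible approach: it assumes each log line contains the message identifier in the form %FACILITY-SEVERITY-MNEMONIC, and the function name and the placeholder message text in the sample lines are hypothetical.

```python
import re

DEVICE_FAULT = {
    "AMDP2-1-MEMERR", "AMDP2_FE-3-SPURIDON", "AMDP2_FE-1-DISCOVER",
    "AMDP2_FE-1-INITFAIL", "AMDP2_FE-3-UNDERFLO", "DEC21140-1-DISCOVER",
    "DEC21140-3-ERRINT", "LANCE-4-BABBLE", "LANCE-3-BADCABLE", "LANCE-1-MEMERR",
}
LINK_PROBLEM = {
    "AMDP2_FE-5-COLL", "AMDP2_FE-5-LATECOLL", "DEC21140-5-COLL",
    "ETHERNET-1-TXERR", "LANCE-5-COLL",
}

# Identifier format assumed: %FACILITY-SEVERITY-MNEMONIC
MESSAGE_ID = re.compile(r"%([A-Z0-9_]+-\d-[A-Z0-9_]+)")


def classify_syslog(line: str) -> str:
    """Return the Table 13-2 grouping for a syslog line."""
    match = MESSAGE_ID.search(line)
    if not match:
        return "unrecognized"
    identifier = match.group(1)
    if identifier in DEVICE_FAULT:
        return "device hardware, software, or memory problem"
    if identifier in LINK_PROBLEM:
        return "duplex mismatch, congestion, or cabling"
    return "not in Table 13-2"


# The text after each colon is a placeholder, not real device output.
for line in ("%LANCE-5-COLL: (message text)", "%AMDP2-1-MEMERR: (message text)"):
    print(classify_syslog(line))
```

Counting how many link-group messages arrive per interval would be a natural extension, because a sudden flurry is the condition the table singles out.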


Performance and Fault Management: A Practical Guide to Effectively Managing Cisco Network Devices (Cisco Press Core Series)
ISBN: 1578701805