Troubleshooting EGP

 

The earlier section "Shortcomings of EGP" discussed several reasons why EGP cannot be used in complex inter-AS topologies. An unexpected benefit is that, because it forces a simple topology, EGP is easy to troubleshoot.

As with any routing protocol, the first step in troubleshooting EGP is examining the routing tables. If a required route is missing or an unwanted route is present, the routing tables should lead you to the source of the problem. Because the EGP metrics have very little meaning, using the routing tables for troubleshooting is greatly simplified in comparison with other routing protocols.

When examining EGP configurations, remember that the gateway must have some sort of neighbor statement, either explicit or neighbor any, for every neighbor. Understanding the use of the network statement, and how it differs from the network statement used with IGPs, is also important.

The debug ip egp transactions command, used several times in the "Operation of EGP" section, is a very useful troubleshooting tool. The output of this command reveals all the important information in all the EGP messages being exchanged between neighbors.

Interpreting the Neighbor Table

An examination of the EGP neighbor table using show ip egp will tell you about the state and configuration of a gateway's neighbors. Example 1-18 displayed the output of this command; Example 1-22 shows some additional output from the show ip egp command that examines Stan's neighbor table.

Example 1-22 show ip egp Command Output Displays Information Useful for Troubleshooting EGP Peers
Stan# show ip egp
Local autonomous system is 65501
 EGP Neighbor     FAS/LAS    State    SndSeq RcvSeq Hello  Poll j/k Flags
*192.168.18.2    65501/65501 UP  2:08   3227     43    60   180   4 Temp, Act
*192.168.16.2    65502/65501 UP  6d17   3233   3233    60   180   4 Temp, Act
Stan#

You can see in Stan's neighbor table that neighbor 192.168.18.2 is an interior neighbor, because the FAS and LAS are the same (65501). The state of the neighbor is shown, as is its uptime. Whereas 192.168.18.2 has been up for just over 2 hours, 192.168.16.2 has been up for 6 days and 17 hours. The present sequence number being used by the gateway for each neighbor is shown, as is the present sequence number being used by the neighbor.
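
The interior/exterior check described above can be sketched in a few lines of Python. This is only an illustration: the function name parse_neighbors and the regular expression are assumptions, and the column layout is inferred solely from the two neighbor lines in Example 1-22, so other IOS releases may format the table differently.

```python
import re

# Neighbor lines copied from Example 1-22.
SAMPLE = """\
*192.168.18.2    65501/65501 UP  2:08   3227     43    60   180   4 Temp, Act
*192.168.16.2    65502/65501 UP  6d17   3233   3233    60   180   4 Temp, Act
"""

# Column layout assumed from Example 1-22 only.
ROW = re.compile(
    r"^\*?(?P<addr>\d+\.\d+\.\d+\.\d+)\s+"
    r"(?P<fas>\d+)/(?P<las>\d+)\s+"
    r"(?P<state>\S+)\s+(?P<uptime>\S+)\s+"
    r"(?P<snd_seq>\d+)\s+(?P<rcv_seq>\d+)\s+"
    r"(?P<hello>\d+)\s+(?P<poll>\d+)\s+"
    r"(?P<jk>\d+)\s+(?P<flags>.+)$"
)


def parse_neighbors(text):
    """Return one dict per neighbor line; 'interior' is True when FAS == LAS."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # skip headers, prompts, and blank lines
        row = match.groupdict()
        row["interior"] = row["fas"] == row["las"]
        row["flags"] = [flag.strip() for flag in row["flags"].split(",")]
        rows.append(row)
    return rows


if __name__ == "__main__":
    for n in parse_neighbors(SAMPLE):
        kind = "interior" if n["interior"] else "exterior"
        print(n["addr"], kind, n["state"], n["uptime"], n["flags"])
```

Run against Stan's table, the sketch reports 192.168.18.2 as interior and 192.168.16.2 as exterior, matching the reading given above.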

After the Hello and Poll intervals, the number of neighbor reachability messages that have been received in the past four Hello intervals is recorded. This number is used to determine whether a neighbor should be declared Up or Down, based on two values known as the j and k thresholds. The j threshold specifies the number of neighbor reachability messages that must be received during four Hello intervals before a Down neighbor is declared Up. The k threshold specifies the minimum number of neighbor reachability messages that must be received within four Hello intervals to prevent an Up neighbor from being declared Down. The thresholds, shown in Table 1-9, differ for active and passive neighbors.

Table 1-9. EGP j and k Thresholds
Threshold  Active  Passive  Description
j          3       1        Neighbor Up threshold
k          1       4        Neighbor Down threshold
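
As a rough illustration of how the thresholds in Table 1-9 drive the Up/Down decision, the following Python sketch applies them to the count of neighbor reachability messages received in the last four Hello intervals. The function name next_state and the exact comparisons (at least j to come Up, fewer than k to go Down) are assumptions drawn from the wording of the preceding paragraph, not from any implementation.

```python
# Threshold values taken from Table 1-9.
THRESHOLDS = {
    "active": {"j": 3, "k": 1},
    "passive": {"j": 1, "k": 4},
}


def next_state(state, received, mode):
    """Return the neighbor state after one evaluation.

    state    -- "Up" or "Down"
    received -- reachability messages seen in the last four Hello intervals
    mode     -- "active" or "passive" (role of the local gateway)

    Assumption: a Down neighbor comes Up at `received >= j`, and an Up
    neighbor goes Down at `received < k`, following the prose description.
    """
    limits = THRESHOLDS[mode]
    if state == "Down" and received >= limits["j"]:
        return "Up"
    if state == "Up" and received < limits["k"]:
        return "Down"
    return state


if __name__ == "__main__":
    # Stan is the active neighbor and shows a count of 4 in Example 1-22.
    print(next_state("Up", 4, "active"))
```

Under these assumptions, an active gateway such as Stan keeps a neighbor Up as long as at least one reachability message arrives in four Hello intervals.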

The next field (Flags) in Example 1-22 specifies whether the neighbor is permanent or temporary. Permanent neighbors are neighbors that have been explicitly configured with a neighbor statement, whereas temporary neighbors have been implicitly peered under the neighbor any statement. In Example 1-22, you can see that both of Stan's neighbors are temporary; this fits with the configuration of Stan discussed earlier, in which there is a single neighbor any statement. Comparing Example 1-22 with Example 1-18, you might find it interesting that although Stan sees Ollie (192.168.18.2) as a temporary neighbor, Ollie sees Stan (192.168.18.1) as a permanent neighbor. An examination of Ollie's configuration in Example 1-23 shows why.

Example 1-23 Neighbor Configuration of Router Ollie
autonomous-system 65501
!
router egp 0
 network 192.168.19.0
 network 192.168.22.0
 network 192.168.18.0
 neighbor 192.168.19.3
 neighbor 192.168.19.3 third-party 192.168.19.2
 neighbor 192.168.19.2
 neighbor 192.168.19.2 third-party 192.168.19.3
 neighbor 192.168.18.1
 neighbor any

The explicit neighbor 192.168.18.1 causes Ollie to classify Stan as a permanent neighbor.
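
The permanent/temporary rule can be expressed as a small Python sketch that scans a configuration for neighbor statements. The function name classify_neighbor and its return strings are hypothetical; the configuration text is Ollie's from Example 1-23, and the one-line neighbor any configuration stands in for Stan's, which is described above but not reproduced here.

```python
# Ollie's EGP configuration, as listed in Example 1-23.
OLLIE_CONFIG = """\
autonomous-system 65501
!
router egp 0
 network 192.168.19.0
 network 192.168.22.0
 network 192.168.18.0
 neighbor 192.168.19.3
 neighbor 192.168.19.3 third-party 192.168.19.2
 neighbor 192.168.19.2
 neighbor 192.168.19.2 third-party 192.168.19.3
 neighbor 192.168.18.1
 neighbor any
"""

# Stand-in for Stan's configuration (assumed: a single "neighbor any").
STAN_CONFIG = "router egp 0\n neighbor any\n"


def classify_neighbor(config_text, neighbor_addr):
    """Return "permanent", "temporary", or None if peering is not allowed."""
    explicit = set()
    any_allowed = False
    for line in config_text.splitlines():
        words = line.split()
        if len(words) >= 2 and words[0] == "neighbor":
            if words[1] == "any":
                any_allowed = True
            else:
                explicit.add(words[1])
    if neighbor_addr in explicit:
        return "permanent"  # matched an explicit neighbor statement
    if any_allowed:
        return "temporary"  # accepted only through "neighbor any"
    return None


if __name__ == "__main__":
    print(classify_neighbor(OLLIE_CONFIG, "192.168.18.1"))  # Ollie's view of Stan
    print(classify_neighbor(STAN_CONFIG, "192.168.18.2"))   # Stan's view of Ollie
```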

The last field indicates whether the local router is the active or the passive neighbor. Example 1-22 shows that Stan is the active neighbor for both of its peer relationships, so you would expect Ollie to show that it is the passive neighbor. Example 1-18 bears out this assumption and also indicates that Ollie is the active neighbor for all of its other peer relationships. This is also to be expected, because AS 65501 is lower than the other AS numbers.

Case Study: Converging at the Speed of Syrup

A distinct characteristic of EGP is that nothing happens quickly. The neighbor acquisition process is slow, and the advertisement of network changes is almost glacial. As a result, you might sometimes mistakenly assume that there is a problem where none exists (except for the problematic nature of EGP itself). For example, suppose users in AS 65503 of Figure 1-13 complain that they cannot reach network 172.17.0.0 in AS 65502. When you examine Groucho's routing table, there is a route to 172.17.0.0 (see Example 1-24), but a ping to a known address on that network fails. You might be led to believe that traffic to the network is being misrouted, or black holed.

A clue to the problem is shown in Ollie's routing table (see Example 1-25). Notice that a new update for network 172.17.0.0 has not been received in more than 16 minutes, but the route entry for the network is still valid and is still being advertised to Ollie's neighbors.

Example 1-24 Groucho in Figure 1-13 Has a Route to 172.17.0.0, but the Network Is Unreachable
Groucho# show ip route
Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
       i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, * - candidate default
Gateway of last resort is 192.168.19.1 to network 0.0.0.0
E    10.0.0.0 [140/4] via 192.168.19.1, 00:01:23, Ethernet0
E    192.168.16.0 [140/4] via 192.168.19.1, 00:01:23, Ethernet0
E    192.168.17.0 [140/4] via 192.168.19.1, 00:01:23, Ethernet0
C    192.168.19.0 is directly connected, Ethernet0
C    192.168.20.0 is directly connected, Loopback0
E    192.168.21.0 [140/4] via 192.168.19.1, 00:01:24, Ethernet0
E    192.168.22.0 [140/1] via 192.168.19.1, 00:01:24, Ethernet0
E    172.16.0.0 [140/4] via 192.168.19.1, 00:01:24, Ethernet0
E    172.17.0.0 [140/4] via 192.168.19.1, 00:01:24, Ethernet0
E    172.18.0.0 [140/4] via 192.168.19.1, 00:01:24, Ethernet0
E*   0.0.0.0 0.0.0.0 [140/4] via 192.168.19.1, 00:01:24, Ethernet0
Groucho# ping 172.17.3.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.17.3.1, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)
Groucho#

Example 1-25 New Network Updates Are Not Being Advertised
Ollie# show ip route
Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
       i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, * - candidate default
       U - per-user static route, o - ODR
Gateway of last resort is not set
E    10.0.0.0/8 [140/1] via 192.168.22.2, 00:01:20, Serial1
E    192.168.16.0/24 [140/1] via 192.168.18.1, 00:01:13, Serial0
E    192.168.17.0/24 [140/4] via 192.168.18.1, 00:16:14, Serial0
C    192.168.18.0/24 is directly connected, Serial0
C    192.168.19.0/24 is directly connected, Ethernet0
E    192.168.20.0/24 [140/1] via 192.168.19.2, 00:02:06, Ethernet0
E    192.168.21.0/24 [140/1] via 192.168.22.2, 00:01:21, Serial1
C    192.168.22.0/24 is directly connected, Serial1
E    172.16.0.0/16 [140/4] via 192.168.18.1, 00:01:13, Serial0
E    172.17.0.0/16 [140/4] via 192.168.18.1, 00:16:14, Serial0
E    172.18.0.0/16 [140/1] via 192.168.19.3, 00:01:59, Ethernet0
Ollie#
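
Spotting routes like this by eye is easy in a small table, but the same check can be sketched in Python: flag any EGP route whose age exceeds the 180-second Poll interval shown in Example 1-22. The function name stale_egp_routes and the regular expression are assumptions based only on the line format in Example 1-25, and ages printed in any format other than hh:mm:ss are simply skipped.

```python
import re

# A few lines copied from Ollie's routing table in Example 1-25.
SAMPLE = """\
E    192.168.16.0/24 [140/1] via 192.168.18.1, 00:01:13, Serial0
E    192.168.17.0/24 [140/4] via 192.168.18.1, 00:16:14, Serial0
C    192.168.18.0/24 is directly connected, Serial0
E    192.168.20.0/24 [140/1] via 192.168.19.2, 00:02:06, Ethernet0
E    172.16.0.0/16 [140/4] via 192.168.18.1, 00:01:13, Serial0
E    172.17.0.0/16 [140/4] via 192.168.18.1, 00:16:14, Serial0
E    172.18.0.0/16 [140/1] via 192.168.19.3, 00:01:59, Ethernet0
"""

# Matches EGP ("E") entries only; the layout is assumed from Example 1-25.
ROUTE = re.compile(
    r"^E\*?\s+(?P<prefix>\S+).*?via\s+(?P<next_hop>[\d.]+),\s+"
    r"(?P<h>\d+):(?P<m>\d+):(?P<s>\d+),"
)


def stale_egp_routes(table_text, update_interval_s=180):
    """Return (prefix, age_in_seconds) for EGP routes older than one interval."""
    stale = []
    for line in table_text.splitlines():
        match = ROUTE.match(line.strip())
        if match is None:
            continue  # connected routes, headers, prompts
        age = int(match["h"]) * 3600 + int(match["m"]) * 60 + int(match["s"])
        if age > update_interval_s:
            stale.append((match["prefix"], age))
    return stale


if __name__ == "__main__":
    for prefix, age in stale_egp_routes(SAMPLE):
        print(f"{prefix} last updated {age} seconds ago")
```

On the sample lines, the sketch flags 172.17.0.0/16 and also 192.168.17.0/24, which shows the same 00:16:14 age through the same next hop.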

Stan has not included network 172.17.0.0 in the past five update messages to Ollie. There is no black hole problem here; network 172.17.0.0 has just become unreachable due to a disconnected Ethernet interface on a router in AS 65502. EGP will not declare a route down until it has failed to receive six consecutive updates for the route. Couple this with an update interval of 180 seconds, and you will see that EGP will take 18 minutes to declare a route down. Only then will it stop including the network in its own updates. In the internetwork of Figure 1-13, 54 minutes will pass between the time the exterior gateway of AS 65502 declares network 172.17.0.0 down and the time Groucho declares the network down!
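
The arithmetic behind these figures can be checked with a short Python sketch. The constant and function names are illustrative, and treating the 54-minute figure as three successive 18-minute timeouts is an inference from the numbers above rather than something stated explicitly.

```python
UPDATE_INTERVAL_S = 180      # update (Poll) interval, in seconds
MISSED_UPDATES_TO_FAIL = 6   # consecutive updates that must be missed


def minutes_to_declare_down(gateways=1):
    """Minutes until the last of `gateways` successive gateways drops the route.

    Assumption: each gateway starts its own timeout only after the gateway
    before it stops advertising the route, so the delays simply add up.
    """
    per_gateway_s = MISSED_UPDATES_TO_FAIL * UPDATE_INTERVAL_S
    return gateways * per_gateway_s // 60


if __name__ == "__main__":
    print(minutes_to_declare_down(1))  # one gateway
    print(minutes_to_declare_down(3))  # three gateways in sequence
```

One gateway yields 18 minutes (6 x 180 seconds = 1080 seconds), and three in sequence yield the 54 minutes quoted for Figure 1-13.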



Routing TCP/IP (Vol. 2, 2001)
Year: 2004
Pages: 182
