Troubleshooting OSPF

 

Troubleshooting OSPF can sometimes be daunting, especially in a large internetwork. However, a routing problem with OSPF is no different than a routing problem with any other routing protocol; the cause will be one of the following:

  • Missing route information

  • Inaccurate route information

An examination of the route table is still the primary source of troubleshooting information. Using the show ip ospf database command to examine the various LSAs will also yield important information. For example, if a link is unstable, the LSA advertising it will change frequently. This condition is reflected in a sequence number that is conspicuously higher than that of the other LSAs. Another sign of instability is an LSA whose age never gets very high.
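The sequence-number check described above can be sketched in a few lines of Python. This is only an illustration: the link-state IDs and sequence numbers are hypothetical values typed in by hand, the function name suspect_lsas is invented here, and the factor of 10 is an arbitrary threshold rather than anything defined by OSPF.

```python
from statistics import median

INITIAL_SEQ = 0x80000001  # sequence number of a newly originated LSA


def suspect_lsas(lsas, factor=10):
    """Return link-state IDs whose origination count dwarfs the median.

    lsas maps a link-state ID to the sequence number displayed for it.
    factor is an arbitrary illustrative threshold, not a standard value.
    """
    # Number of times each LSA has been originated or refreshed so far.
    counts = {lsid: seq - INITIAL_SEQ + 1 for lsid, seq in lsas.items()}
    typical = median(counts.values())
    return sorted(lsid for lsid, n in counts.items() if n > factor * typical)


# Hypothetical values, not taken from any figure in this chapter.
sample = {
    "192.168.30.10": 0x80000004,
    "192.168.30.20": 0x80000005,
    "192.168.30.30": 0x80000003,
    "192.168.30.40": 0x800001A2,
}
print(suspect_lsas(sample))
```

A flagged LSA is only a starting point; the age of the same LSA, sampled a few times, will show whether it is still being reoriginated.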

Keep in mind that the LS database of every router within an area is the same. So unless you suspect that the database itself is being corrupted on some routers, you can examine the LS database for the entire area by examining a single router's LS database. Another good practice is to keep a copy (hard or soft) of the link state database for each area.

Note

Troubleshooting router configuration


When examining an individual router's configuration, consider the following:

  • Do all interfaces have the correct addresses and masks?

  • Do the network area statements have the correct inverse masks to match the correct interfaces?

  • Do the network area statements put all interfaces into the correct areas?

  • Are the network area statements in the correct order?

Note

Troubleshooting adjacencies


When examining adjacencies (or the lack thereof), consider these questions:

  • Are Hellos being sent from both neighbors?

  • Are the timers set the same between neighbors?

  • Are the optional capabilities set the same between neighbors?

  • Are the interfaces configured on the same subnet (that is, do the address/mask pairs belong to the same subnet)?

  • Are the neighboring interfaces of the same network type?

  • Is a router attempting to form an adjacency with a neighbor's secondary address?

  • If authentication is being used, is the authentication type the same between neighbors? Are the passwords and (in the case of MD5) the keys the same? Is authentication enabled on all routers within the area?

  • Are any access lists blocking OSPF?

  • If the adjacency is across a virtual link, is the link configured within a stub area?
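Several of these questions come down to comparing the parameters that each neighbor carries in its Hello packets. The sketch below shows that comparison under simplifying assumptions: the HelloParams class and hello_mismatches function are invented for illustration, the values are hypothetical, and the exception that the mask is not compared on point-to-point and virtual links is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HelloParams:
    area_id: str
    network_mask: str
    hello_interval: int   # seconds
    dead_interval: int    # seconds
    auth_type: int        # 0 = none, 1 = simple password, 2 = MD5
    stub_area: bool       # True when the interface's area is a stub area


CHECKED_FIELDS = (
    "area_id",
    "network_mask",
    "hello_interval",
    "dead_interval",
    "auth_type",
    "stub_area",
)


def hello_mismatches(local, remote):
    """List the parameter names on which two neighbors disagree."""
    return [f for f in CHECKED_FIELDS if getattr(local, f) != getattr(remote, f)]


# Hypothetical values for two routers sharing a link.
router_a = HelloParams("0.0.0.1", "255.255.255.0", 10, 40, 0, False)
router_b = HelloParams("0.0.0.0", "255.255.255.0", 10, 40, 0, False)
print(hello_mismatches(router_a, router_b))
```

An empty list does not prove that an adjacency will form; access lists, secondary addresses, and mismatched passwords or keys are not modeled here.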

If a neighbor or adjacency is suspected of being unstable, adjacencies can be monitored with the command debug ip ospf adj. However, this command can often present more information than you want, as Figure 9.94 shows. Not only are the state changes of a neighbor recorded in great detail, but regular Hello processing is recorded. If monitoring is to be performed over an extended period, this wealth of information can overflow a router's internal logging buffers. Beginning with IOS 11.2, adjacencies can be monitored by adding the command ospf log-adjacency-changes under a router's OSPF configuration. This command will keep a simpler log of adjacency changes, as shown in Figure 9.95.

Figure 9.94. This debug output from debug ip ospf adj shows the result of temporarily disconnecting and then reconnecting a neighbor's Ethernet interface.

graphics/09fig94.gif

Figure 9.95. These logging messages, resulting from the command ospf log-adjacency-changes, show the same neighbor failure as depicted in Figure 9.94, but with much less detail.

graphics/09fig95.gif

Note

Troubleshooting the link state database


If you suspect that a link state database is corrupted or that two databases are not synchronized, you can use the show ip ospf database database-summary command to observe the number of LSAs in each router's database. For a given area, the number of each LSA type should be the same in all routers. Next, the command show ip ospf database will show the checksums for every LSA in a router's database. Within a given area, each LSA's checksum should be the same in every router's database. Verifying this status can be excruciatingly tedious for all but the smallest databases. Luckily, there are MIBs [31] that can report the sum of a database's checksums to an SNMP management platform. If all databases in an area are synchronized, this sum should be the same for each database.

[31] Namely, ospfExternLsaCksumSum and ospfAreaLsaCksumSum.
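The checksum-sum comparison can be illustrated with a short Python sketch. It assumes the per-LSA checksums have already been collected from each router into a dictionary; the router names, LSA keys, and checksum values are hypothetical, and the function names are invented here.

```python
from collections import Counter


def checksum_sum(database):
    """Add up the checksums of every LSA in one router's area database."""
    return sum(database.values())


def routers_out_of_step(databases):
    """Return the routers whose checksum sum differs from the most common sum."""
    sums = {router: checksum_sum(db) for router, db in databases.items()}
    majority, _ = Counter(sums.values()).most_common(1)[0]
    return sorted(router for router, total in sums.items() if total != majority)


# Keys are (LSA type, link-state ID, advertising router); all values hypothetical.
reference = {
    (1, "1.1.1.1", "1.1.1.1"): 0x3A9C,
    (1, "2.2.2.2", "2.2.2.2"): 0x91F0,
    (2, "10.0.0.2", "2.2.2.2"): 0x07B4,
}
stale = dict(reference)
stale[(1, "2.2.2.2", "2.2.2.2")] = 0x5D21  # an older copy of one LSA

databases = {"R1": reference, "R2": dict(reference), "R3": stale}
print(routers_out_of_step(databases))
```

Differing sums show that two databases are not synchronized. Equal sums are strong evidence of synchronization but not proof, because different checksums can add up to the same total.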

Note

Troubleshooting area-wide problems


When examining an area-wide problem, keep in mind the following issues:

  • Is the ABR configured correctly?

  • Are all routers configured for the same area type? For example, if the area is a stub area, all routers must have the area stub command.

  • If address summarization is configured, is it correct?

Note

Troubleshooting performance


If performance is a problem, check the memory and CPU utilization on the routers. If memory utilization is above 70%, the link state database may be too large; if CPU utilization is consistently above 60%, instabilities may exist in the topology. If memory and/or CPU surpasses the 50% mark, the network administrator should begin an analysis of the cause of the performance stress and, based on the results of the analysis, should begin planning corrective upgrades.

Stub areas and address summarization can help both to reduce the size of the link state database and to contain instabilities. The processing of LSAs, not the SPF algorithm, puts the most burden on an OSPF router. Taken individually, type 1 and type 2 LSAs would be more processor intensive than summary LSAs. However, type 1 and 2 LSAs tend to be grouped, whereas summary LSAs are sent in individual packets. As a result, in reality summary LSAs are more processor intensive.

The following case studies demonstrate the most frequently used techniques and tools for troubleshooting OSPF.

Case Study: An Isolated Area

Intra-area packets can be routed within area 1 of Figure 9.96, but all attempts at inter-area communications fail. Suspicion should immediately fall on area 1's ABR. This suspicion is reinforced by the fact that the Internal Routers have no router entry for an ABR (Figure 9.97).

Figure 9.96. The end systems and routers within area 1 can communicate, but no traffic is being passed to or from area 0.

graphics/09fig96.gif

Figure 9.97. The command show ip ospf border-routers checks the internal route table of the internal routers. No router entry for an ABR is shown.

graphics/09fig97.gif

The next step is to verify that the physical link to the ABR is operational and that OSPF is working properly. The same Internal Router's neighbor table (Figure 9.98) shows that the neighbor state of the ABR is full, indicating that an adjacency exists. In fact, the ABR is the DR for the Token Ring network. The existence of an adjacency confirms that the link is good and that OSPF Hellos are being exchanged with the proper parameters.

Figure 9.98. The neighbor table of router National indicates that the ABR (1.1.1.1) is fully adjacent.

graphics/09fig98.gif

Other evidence relevant to the problem can be found in National's database and its route table. The database (Figure 9.99) contains only Router (type 1) and Network (type 2) LSAs. No Network Summary (type 3) LSAs, which advertise destinations outside of the area, are recorded. At the same time, there are LSAs originated by Whitney (1.1.1.1). This information again indicates that Whitney is adjacent but is not passing information from area 0 into area 1.

Figure 9.99. National's link state database also shows that Whitney is adjacent, but is not advertising inter-area destinations.

graphics/09fig99.gif

The only destinations outside of area 1 in National's route table (Figure 9.100) are the serial links attached to Whitney. Yet another clue is revealed here: The route entries are tagged as intra-area routes (O); if they were in area 0, as Figure 9.96 shows they should be, they would be tagged as inter-area routes (O IA). The problem is apparently on the area 0 side of the ABR.

Figure 9.100. Whitney is advertising the subnets of its serial interfaces, but they are being advertised as intra-area destinations.

graphics/09fig100.gif

An examination of Whitney's serial interfaces (Figure 9.101) reveals the problem, if not the cause of the problem. Both interfaces, which should be in area 0, are instead in area 1. Both interfaces are connected to topological neighbors (Louvre and Hermitage), but no OSPF neighbors are recorded. Error messages are being displayed regularly, indicating that Whitney is receiving Hellos from Louvre and Hermitage; those Hellos have their Area fields set to zero, causing a mismatch.

Figure 9.101. Whitney's serial interfaces are configured in area 1 instead of area 0; this configuration is causing error messages when area 0 Hellos are received.

graphics/09fig101.gif

Whitney's OSPF configuration is:

 
router ospf 8
 network 172.16.0.0 0.0.255.255 area 1
 network 172.16.113.0 0.0.0.255 area 0

At first glance, this configuration may appear to be fine. However, recall from the first configuration case study that the network area commands are executed consecutively. The second network area command affects only interfaces that do not match the first command. With this configuration, all interfaces match the first network area command and are placed into area 1. The second command is never applied.

A correct configuration is:

 
router ospf 8
 network 172.16.192.0 0.0.0.255 area 1
 network 172.16.113.0 0.0.0.255 area 0

There are, of course, several valid configurations. The important point is that the first network area command must be specific enough to match only the address of the area 1 interface, and not the addresses of the area 0 interfaces.
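The first-match behavior described in this case study can be modeled with a short Python sketch. It is a simplified model of the behavior as described here, not of any particular IOS release; the function names are invented, and the two interface addresses are assumptions because Whitney's actual addresses appear only in the figures.

```python
import ipaddress


def matches(address, base, wildcard):
    """True when address equals base on every bit the inverse mask marks with 0."""
    addr = int(ipaddress.IPv4Address(address))
    net = int(ipaddress.IPv4Address(base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (addr & care) == (net & care)


def assign_area(address, statements):
    """Return the area of the first network statement that matches, else None."""
    for base, wildcard, area in statements:
        if matches(address, base, wildcard):
            return area
    return None


# The two configurations from this case study, in the order they are entered.
faulty = [("172.16.0.0", "0.0.255.255", 1), ("172.16.113.0", "0.0.0.255", 0)]
corrected = [("172.16.192.0", "0.0.0.255", 1), ("172.16.113.0", "0.0.0.255", 0)]

# Hypothetical interface addresses: one LAN interface, one serial interface.
lan, serial = "172.16.192.1", "172.16.113.5"

for name, statements in (("faulty", faulty), ("corrected", corrected)):
    print(name, assign_area(lan, statements), assign_area(serial, statements))
```

With the faulty statements both interfaces land in area 1; with the corrected statements the serial interface falls through to the second statement and is placed in area 0.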

Case Study: Misconfigured Summarization

Figure 9.102 shows a backbone area and three attached areas. To reduce the size of the link state database and to increase the stability of the internetwork, summarization will be used between areas.

Figure 9.102. The summary addresses shown for each area will be advertised into area 0. Area 0 will also be summarized into the other areas.

graphics/09fig102.gif

The individual subnets of the three nonbackbone areas are summarized with the addresses shown in the figure. For example, a few of the subnets of area 1 may be:

172.16.192.0/29

172.16.192.160/29

172.16.192.248/30

172.16.217.0/24

172.16.199.160/29

172.16.210.248/30

Figure 9.103 shows that these subnet addresses can all be summarized with 172.16.192.0/19.

Figure 9.103. A few of the subnet addresses that are summarized with 172.16.192.0/19. The bold type indicates the network bits of each address.

graphics/09fig103.gif
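The same check can be made without writing out the binary, using Python's standard ipaddress module. This is only a convenience sketch; the subnets are the six listed above, and the helper name covered_by is invented here.

```python
import ipaddress

summary = ipaddress.ip_network("172.16.192.0/19")

area1_subnets = [
    "172.16.192.0/29",
    "172.16.192.160/29",
    "172.16.192.248/30",
    "172.16.217.0/24",
    "172.16.199.160/29",
    "172.16.210.248/30",
]


def covered_by(subnet, summary_net):
    """True when every address of subnet falls inside summary_net."""
    return ipaddress.ip_network(subnet).subnet_of(summary_net)


for subnet in area1_subnets:
    print(subnet, covered_by(subnet, summary))
```

A 19-bit mask fixes the first three bits of the third octet, so 172.16.192.0/19 covers third-octet values 192 through 223.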

Whitney's configuration is:

 
router ospf 8
 network 172.16.192.0 0.0.0.255 area 1
 network 172.16.113.0 0.0.0.255 area 0
 area 1 range 172.16.192.0 255.255.224.0
 area 0 range 172.16.113.0 255.255.224.0

The other three ABRs are configured similarly. Each ABR will advertise the summary address of its attached non-backbone area into area 0 and will also summarize area 0 into the non-backbone area.

Figure 9.104 shows that there is a problem. When the route table of one of area 1's Internal Routers is examined, area 0 is not being summarized properly (area 1's internal subnets are not shown, for clarity). Although the summary addresses for areas 2 and 3 are present, the individual subnets of area 0 are in the table instead of its summary address.

Figure 9.104. The individual subnets of area 0, instead of the expected summary address, are recorded in the route table of one of area 1's internal routers.

graphics/09fig104.gif

You can see that the area range command for area 0 is the problem when you examine the three subnets of area 0 in binary (Figure 9.105).

Figure 9.105. The subnets of area 0, the configured summary mask, and the correct summary address.

graphics/09fig105.gif

The problem is that the summary address specified in the area range command (172.16.113.0) is more specific than the accompanying mask (255.255.224.0). The correct address to use with the 19-bit mask is 172.16.96.0:

 
router ospf 8
 network 172.16.192.0 0.0.0.255 area 1
 network 172.16.113.0 0.0.0.255 area 0
 area 1 range 172.16.192.0 255.255.224.0
 area 0 range 172.16.96.0 255.255.224.0
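The relationship between a range address and its mask can be checked mechanically. The following Python sketch uses the standard ipaddress module to find the address that agrees with a given mask; the function names are invented for illustration.

```python
import ipaddress


def aligned_range(address, mask):
    """Return the network obtained by zeroing the bits of address outside mask."""
    return ipaddress.ip_network(f"{address}/{mask}", strict=False)


def is_aligned(address, mask):
    """True when address has no bits set beyond the mask."""
    return str(aligned_range(address, mask).network_address) == address


print(is_aligned("172.16.113.0", "255.255.224.0"))
print(aligned_range("172.16.113.0", "255.255.224.0"))
print(is_aligned("172.16.192.0", "255.255.224.0"))
```

In the third octet, 113 is 01110001 in binary and 224 is 11100000; keeping only the masked bits leaves 01100000, or 96, which is why 172.16.96.0 is the address that belongs with the 19-bit mask.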

Figure 9.106 shows the resulting route table. There are other options for area 0's summary address. For example, 172.16.113.0/24 and 172.16.113.0/27 are both legitimate. The most appropriate summary address depends on the priorities of the internetwork design. In the case of the internetwork of Figure 9.102, 172.16.96.0/19 might be selected for consistency: all summary addresses have a 19-bit mask. On the other hand, 172.16.113.0/27 might be selected for better scalability; five more subnets can be added to the backbone under this summary address, leaving a wider range of addresses to be used elsewhere in the internetwork.

Figure 9.106. Area 0 is now being summarized correctly.

graphics/09fig106.gif



Routing TCP[s]IP (Vol. 11998)