Troubleshooting OSPF can sometimes be daunting, especially in a large internetwork. However, a routing problem with OSPF is no different than a routing problem with any other routing protocol; the cause will be one of the following:
An examination of the route table is still the primary source of troubleshooting information. Using the show ip ospf database command to examine the various LSAs will also yield important information. For example, if a link is unstable, the LSA advertising it will change frequently. This condition is reflected in a sequence number that is conspicuously higher than that of the other LSAs. Another sign of instability is an LSA whose age never gets very high.
Keep in mind that the LS database of every router within an area is the same. So unless you suspect that the database itself is being corrupted on some routers, you can examine the LS database for the entire area by examining a single router's LS database. Another good practice is to keep a copy (hard or soft) of the link state database for each area.
Note: Troubleshooting router configuration
When examining an individual router's configuration, consider the following:
Note: Troubleshooting adjacencies
When examining adjacencies (or the lack thereof), consider these questions:
If a neighbor or adjacency is suspected of being unstable, adjacencies can be monitored with the command debug ip ospf adj. However, this command can often present more information than you want, as Figure 9.94 shows. Not only are the state changes of a neighbor recorded in great detail, but regular Hello processing is recorded. If monitoring is to be performed over an extended period, this wealth of information can overflow a router's internal logging buffers. Beginning with IOS 11.2, adjacencies can be monitored by adding the command ospf log-adjacency-changes under a router's OSPF configuration. This command will keep a simpler log of adjacency changes, as shown in Figure 9.95.
Figure 9.94. This debug output from debug ip ospf adj shows the result of temporarily disconnecting and then reconnecting a neighbor's Ethernet interface.
Figure 9.95. These logging messages, resulting from the command ospf log-adjacency-changes, show the same neighbor failure as depicted in Figure 9.94, but with much less detail.
Note: Troubleshooting the link state database
If you suspect that a link state database is corrupted or that two databases are not synchronized, you can use the show ip ospf database database-summary command to observe the number of LSAs in each router's database. For a given area, the number of each LSA type should be the same in all routers. Next, the command show ip ospf database will show the checksums for every LSA in a router's database. Within a given area, each LSA's checksum should be the same in every router's database. Verifying this status can be excruciatingly tedious for all but the smallest databases. Luckily, there are MIBs [31], which can report the sum of a database's checksums to an SNMP management platform. If all databases in an area are synchronized, this sum should be the same for each database.
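The checksum-sum comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not of any MIB or SNMP interface: the router names are borrowed from the case studies, while the LSA keys, the checksum values, and the helper names checksum_totals and group_by_total are hypothetical.

```python
from collections import defaultdict

def checksum_totals(databases):
    """Return {router: sum of LSA checksums} for the databases of one area."""
    return {router: sum(lsas.values()) for router, lsas in databases.items()}

def group_by_total(databases):
    """Group routers by checksum total; more than one group signals a mismatch."""
    groups = defaultdict(list)
    for router, total in checksum_totals(databases).items():
        groups[total].append(router)
    return dict(groups)

# Hypothetical area 1 databases: (LSA type, link state ID) -> checksum.
reference = {
    ("router", "1.1.1.1"): 0x3A9C,
    ("router", "2.2.2.2"): 0x51F0,
    ("network", "10.1.1.1"): 0x0B17,
}
area1 = {
    "Whitney": dict(reference),
    "National": dict(reference),
    # One LSA differs on this router, so its total differs as well.
    "Suspect": {**reference, ("router", "2.2.2.2"): 0x51F1},
}

for total, routers in group_by_total(area1).items():
    print(f"checksum total {total:#x}: {', '.join(routers)}")
```

Routers that land in different groups hold different databases. Matching totals are a strong hint of synchronization rather than proof, since different sets of checksums can in principle add up to the same sum.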
Note: Troubleshooting area-wide problems
When examining an area-wide problem, keep in mind the following issues:
Note: Troubleshooting performance
If performance is a problem, check the memory and CPU utilization on the routers. If memory utilization is above 70%, the link state database may be too large; if CPU utilization is consistently above 60%, instabilities may exist in the topology. If memory and/or CPU surpasses the 50% mark, the network administrator should begin an analysis of the cause of the performance stress and, based on the results of the analysis, should begin planning corrective upgrades.
Stub areas and address summarization can help both to reduce the size of the link state database and to contain instabilities.
The processing of LSAs, not the SPF algorithm, puts the most burden on an OSPF router. Taken individually, type 1 and type 2 LSAs would be more processor intensive than summary LSAs. However, type 1 and 2 LSAs tend to be grouped, whereas summary LSAs are sent in individual packets. As a result, in reality summary LSAs are more processor intensive.
The following case studies demonstrate the most frequently used techniques and tools for troubleshooting OSPF.
Case Study: An Isolated Area
Intra-area packets can be routed within area 1 of Figure 9.96, but all attempts at inter-area communications fail. Suspicion should immediately fall on area 1's ABR. This suspicion is reinforced by the fact that the Internal Routers have no router entry for an ABR (Figure 9.97).
Figure 9.96. The end systems and routers within area 1 can communicate, but no traffic is being passed to or from area 0.
Figure 9.97. The command show ip ospf border-routers checks the internal route table of the internal routers. No router entry for an ABR is shown.
The next step is to verify that the physical link to the ABR is operational and that OSPF is working properly. The same Internal Router's neighbor table (Figure 9.98) shows that the neighbor state of the ABR is full, indicating that an adjacency exists. In fact, the ABR is the DR for the Token Ring network. The existence of an adjacency confirms that the link is good and that OSPF Hellos are being exchanged with the proper parameters.
Figure 9.98. The neighbor table of router National indicates that the ABR (1.1.1.1) is fully adjacent.
Other evidence relevant to the problem can be found in National's database and its route table. The database (Figure 9.99) contains only Router (type 1) and Network (type 2) LSAs. No Network Summary (type 3) LSAs, which advertise destinations outside of the area, are recorded. At the same time, there are LSAs originated by Whitney (1.1.1.1). This information again indicates that Whitney is adjacent but is not passing information from area 0 into area 1.
Figure 9.99. National's link state database also shows that Whitney is adjacent, but is not advertising inter-area destinations.
The only destinations outside of area 1 in National's route table (Figure 9.100) are the serial links attached to Whitney. Yet another clue is revealed here: The route entries are tagged as intra-area routes (O); if they were in area 0, as Figure 9.96 shows they should be, they would be tagged as inter-area routes (O IA). The problem is apparently on the area 0 side of the ABR.
Figure 9.100. Whitney is advertising the subnets of its serial interfaces, but they are being advertised as intra-area destinations.
An examination of Whitney's serial interfaces (Figure 9.101) reveals the problem, if not the cause of the problem. Both interfaces, which should be in area 0, are instead in area 1. Both interfaces are connected to topological neighbors (Louvre and Hermitage), but no OSPF neighbors are recorded. Error messages are being displayed regularly, indicating that Whitney is receiving Hellos from Louvre and Hermitage; those Hellos have their Area fields set to zero, causing a mismatch.
Figure 9.101. Whitney's serial interfaces are configured in area 1 instead of area 0; this configuration is causing error messages when area 0 Hellos are received.
Whitney's OSPF configuration is:
At first glance, this configuration may appear to be fine. However, recall from the first configuration case study that the network area commands are executed consecutively. The second network area command affects only interfaces that do not match the first command. With this configuration, all interfaces match the first network area command and are placed into area 1. The second command is never applied. A correct configuration is:
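The matching behavior described above can be modeled with a short Python sketch. It is a model of the first-match rule as this case study states it, not a reproduction of Whitney's configuration: the interface names, the addresses, the wildcard masks, and the helper names matches and assign_areas are all hypothetical.

```python
import ipaddress

def matches(address, wildcard, interface_ip):
    """True if interface_ip matches address under a wildcard (inverse) mask."""
    addr = int(ipaddress.IPv4Address(address))
    wild = int(ipaddress.IPv4Address(wildcard))
    ip = int(ipaddress.IPv4Address(interface_ip))
    care = ~wild & 0xFFFFFFFF  # bits that must be identical
    return (addr & care) == (ip & care)

def assign_areas(statements, interfaces):
    """Place each interface in the area of the first statement it matches."""
    return {
        name: next((area for a, w, area in statements if matches(a, w, ip)), None)
        for name, ip in interfaces.items()
    }

# Hypothetical ABR: one LAN interface meant for area 1, two serials meant for area 0.
interfaces = {"TokenRing0": "10.1.1.1", "Serial0": "10.0.0.1", "Serial1": "10.0.0.5"}

# (address, wildcard mask, area), in configured order.
too_broad = [("10.0.0.0", "0.255.255.255", 1), ("10.0.0.0", "0.0.0.255", 0)]
specific = [("10.1.1.1", "0.0.0.0", 1), ("10.0.0.0", "0.0.0.255", 0)]

print(assign_areas(too_broad, interfaces))  # every interface lands in area 1
print(assign_areas(specific, interfaces))   # serial interfaces land in area 0
```

With the too-broad first statement, the second statement is never reached for any interface, which is the symptom seen on Whitney.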
There are, of course, several valid configurations. The important point is that the first network area command must be specific enough to match only the address of the area 1 interface, and not the addresses of the area 0 interfaces.
Case Study: Misconfigured Summarization
Figure 9.102 shows a backbone area and three attached areas. To reduce the size of the link state database and to increase the stability of the internetwork, summarization will be used between areas.
Figure 9.102. The summary addresses shown for each area will be advertised into area 0. Area 0 will also be summarized into the other areas.
The individual subnets of the three nonbackbone areas are summarized with the addresses shown in the figure. For example, a few of the subnets of area 1 may be:
172.16.192.0/29
172.16.192.160/29
172.16.192.248/30
172.16.217.0/24
172.16.199.160/29
172.16.210.248/30
Figure 9.103 shows that these subnet addresses can all be summarized with 172.16.192.0/19.
Figure 9.103. A few of the subnet addresses that are summarized with 172.16.192.0/19. The bold type indicates the network bits of each address.
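The same check can be made with Python's standard ipaddress module. The sketch below is only an illustration using the example subnets listed above; the helper name all_covered is hypothetical. It prints the third octet of each subnet in binary so that the three leading bits shared under the 19-bit mask are visible.

```python
import ipaddress

summary = ipaddress.ip_network("172.16.192.0/19")
subnets = [
    "172.16.192.0/29",
    "172.16.192.160/29",
    "172.16.192.248/30",
    "172.16.217.0/24",
    "172.16.199.160/29",
    "172.16.210.248/30",
]

def all_covered(summary_net, subnet_strings):
    """True if every listed subnet falls inside the summary address."""
    return all(
        ipaddress.ip_network(s).subnet_of(summary_net) for s in subnet_strings
    )

for s in subnets:
    net = ipaddress.ip_network(s)
    third_octet = net.network_address.packed[2]
    # A /19 mask covers the first two octets plus the top 3 bits of the third.
    print(f"{s:<20} third octet {third_octet:08b}")

print("all covered:", all_covered(summary, subnets))
```

Every third octet begins with the bits 110, so all six subnets share the 19-bit prefix 172.16.192.0.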
Whitney's configuration is:
The other three ABRs are configured similarly. Each ABR will advertise the summary address of its attached non-backbone area into area 0 and will also summarize area 0 into the non-backbone area.
Figure 9.104 shows that there is a problem. When the route table of one of area 1's Internal Routers is examined, area 0 is not being summarized properly (area 1's internal subnets are not shown, for clarity). Although the summary addresses for areas 2 and 3 are present, the individual subnets of area 0 are in the table instead of its summary address.
Figure 9.104. The individual subnets of area 0, instead of the expected summary address, are recorded in the route table of one of area 1's internal routers.
You can see that the area range command for area 0 is the problem when you examine the three subnets of area 0 in binary (Figure 9.105).
Figure 9.105. The subnets of area 0, the configured summary mask, and the correct summary address.
The problem is that the summary address specified in the area range command (172.16.113.0) is more specific than the accompanying mask (255.255.224.0). The correct address to use with the 19-bit mask is 172.16.96.0:
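The arithmetic behind that correction can be checked with a short Python sketch using the standard ipaddress module. It illustrates only the address-and-mask calculation, not router behavior; the helper names is_aligned and align_summary are hypothetical.

```python
import ipaddress

def is_aligned(address, mask):
    """True if address has no bits set beyond those covered by mask."""
    try:
        ipaddress.ip_network(f"{address}/{mask}", strict=True)
        return True
    except ValueError:
        return False

def align_summary(address, mask):
    """Return the network obtained by zeroing the bits of address outside mask."""
    return ipaddress.ip_network(f"{address}/{mask}", strict=False)

configured = ("172.16.113.0", "255.255.224.0")

# Third octet: 113 = 01110001, mask 224 = 11100000, AND = 01100000 = 96.
print(is_aligned(*configured))      # False: 113 has bits set outside the mask
print(align_summary(*configured))   # 172.16.96.0/19
```

The bitwise AND of the third octet with the mask clears the five low-order bits of 113, which leaves 96 and yields the summary address 172.16.96.0 for the 19-bit mask.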
Figure 9.106 shows the resulting route table. There are other options for area 0's summary address. For example, 172.16.113.0/24 and 172.16.113.0/27 are both legitimate. The most appropriate summary address depends on the priorities of the internetwork design. In the case of the internetwork of Figure 9.99, 172.16.96.0/19 might be selected for consistency: all summary addresses have a 19-bit mask. On the other hand, 172.16.113.0/27 might be selected for better scalability; five more subnets can be added to the backbone under this summary address, leaving a wider range of addresses to be used elsewhere in the internetwork.
Figure 9.106. Area 0 is now being summarized correctly.