Classless Interdomain Routing


The invention of autonomous systems and exterior routing protocols solved the early scalability problems on the Internet in the 1980s. However, by the early 1990s the Internet was beginning to present a different set of scalability problems, including the following:

  • Explosion of the Internet routing tables. The exponentially growing routing tables were becoming increasingly unmanageable both by the routers of the time and the people who managed them. The mere size of the tables was burden enough on Internet resources, but day-to-day topological changes and instabilities added heavily to the load.

  • Depletion of the Class B address space. In January 1993, 7133 of the 16,382 available Class B addresses had been assigned; at 1993 growth rates, the entire Class B address space would be depleted in less than 2 years (as cited in RFC 1519).

  • The eventual exhaustion of the entire 32-bit IP address space.

Classless interdomain routing (CIDR) provides a short-term solution to the first two problems. Another short-term solution is network address translation (NAT), discussed in Chapter 4, "Network Address Translation." These solutions were intended to buy the Internet architects enough time to create a new version of IP with enough address space for the foreseeable future. That initiative, known as IP Next Generation (IPng), resulted in the creation of IPv6, with a 128-bit address format. IPv6, discussed in Chapter 8, "IP Version 6," is the long-term solution to the third problem. Interestingly, CIDR and NAT have been so successful that few people place as much urgency on the migration to IPv6 as they once did.

CIDR is merely a politically sanctioned address summarization scheme that takes advantage of the hierarchical structure of the Internet. So before discussing CIDR further, a review of summarization and classless routing, and a look at the modern Internet, are in order.

A Summarization Summary

Summarization, or route aggregation (discussed extensively in Routing TCP/IP, Volume I), is the practice of advertising a contiguous set of addresses with a single, less-specific address. Basically, summarization/route aggregation is accomplished by reducing the length of the subnet mask until it masks only the bits common to all the addresses being summarized. In Figure 2-1, for example, the four subnets (172.16.100.192/28, 172.16.100.208/28, 172.16.100.224/28, and 172.16.100.240/28) are summarized with the single aggregate address 172.16.100.192/26.

Figure 2-1. Route Aggregation

graphics/02fig01.gif
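
A quick way to verify an aggregate such as the one in Figure 2-1 is with Python's standard ipaddress module. The following minimal sketch collapses the four /28 subnets into the shortest prefix that covers exactly their combined range:

import ipaddress

subnets = [ipaddress.ip_network(n) for n in (
    "172.16.100.192/28", "172.16.100.208/28",
    "172.16.100.224/28", "172.16.100.240/28")]

# Merge the contiguous /28s into the shortest covering prefix.
summary = list(ipaddress.collapse_addresses(subnets))
print(summary)    # [IPv4Network('172.16.100.192/26')]

# The 26-bit mask keeps only the bits common to all four subnets,
# so each /28 is contained in the aggregate.
aggregate = ipaddress.ip_network("172.16.100.192/26")
print(all(s.subnet_of(aggregate) for s in subnets))    # True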

Many networkers who view summarization as a difficult topic are surprised to learn that they use summarization daily. What is a subnet address, after all, other than a summarization of a contiguous group of host addresses? For example, the subnet address 192.168.5.224/27 is the aggregate of host addresses 192.168.5.224/32 through 192.168.5.255/32. (The "host address" 192.168.5.224/32 is, of course, the address of the data link itself.) The key characteristic of a summary address is that its mask is shorter than the masks of the addresses it is summarizing. The ultimate summary address is the default address, 0.0.0.0/0, commonly written as just 0/0. As the /0 indicates, the mask has shrunk until no network bits remain; the address is the aggregate of all IP addresses.

Summarization can also cross class boundaries. For example, the four Class C networks (192.168.0.0, 192.168.1.0, 192.168.2.0, and 192.168.3.0) can all be summarized with the aggregate address 192.168.0.0/22. Notice that the aggregate, with its 22-bit mask, is no longer a legal Class C address. Therefore, to support the aggregation of major class network addresses, the routing environment must be classless.
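
The same check works across the classful boundary; a short sketch along these lines shows that the 22-bit aggregate covers all four Class C networks even though it is not itself a legal Class C prefix:

import ipaddress

class_c_nets = [ipaddress.ip_network("192.168.%d.0/24" % i) for i in range(4)]
aggregate = ipaddress.ip_network("192.168.0.0/24").supernet(new_prefix=22)

print(aggregate)                                            # 192.168.0.0/22
print(all(n.subnet_of(aggregate) for n in class_c_nets))    # True
print(aggregate.prefixlen)    # 22 -- shorter than the 24-bit Class C mask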

Classless Routing

Classless routing features two aspects:

  • Classlessness can be a characteristic of a routing protocol.

  • Classlessness can be a characteristic of a router.

Classless routing protocols carry, as part of the routing information, a description of the network portion of each advertised address. The network portion of a network address is commonly referred to as the address prefix. An address prefix can be described by including an address mask, by including a length field that indicates how many bits of the address are prefix bits, or by including only the prefix bits themselves in the update (see Figure 2-2). The classless IP routing protocols are RIP-2, EIGRP, OSPF, Integrated IS-IS, and BGP-4.

Figure 2-2. Advertising an Address Prefix with a Classless Routing Protocol

graphics/02fig02.gif

A classful router records destination addresses in its routing table as major class networks and subnets of those networks. When it performs a route lookup, it first looks up the major class network address and then tries to find a match in its list of subnets under that major address. A classless router ignores address classes and merely attempts a "longest match." That is, for any given destination address, it chooses the route that matches the most bits of the address. Take the routing table of Example 2-1, for instance, which shows several variably subnetted IP networks. If the router is classless, it attempts to find the longest match for each destination address.

Example 2-1 A Routing Table Containing Several Variably Subnetted IP Networks
Cleveland# show ip route
Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
       i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, * - candidate default

Gateway of last resort is 192.168.2.130 to network 0.0.0.0

O E2 192.168.125.0 [110/20] via 192.168.2.2, 00:11:19, Ethernet0
O    192.168.75.0 [110/74] via 192.168.2.130, 00:11:19, Serial0
O E2 192.168.8.0 [110/40] via 192.168.2.18, 00:11:19, Ethernet1
     192.168.1.0 is variably subnetted, 3 subnets, 3 masks
O E1    192.168.1.64 255.255.255.192
           [110/139] via 192.168.2.134, 00:11:20, Serial1
O E1    192.168.1.0 255.255.255.128
           [110/139] via 192.168.2.134, 00:00:34, Serial1
O E2    192.168.1.0 255.255.255.0
           [110/20] via 192.168.2.2, 00:11:20, Ethernet0
     192.168.2.0 is variably subnetted, 4 subnets, 2 masks
C       192.168.2.0 255.255.255.240 is directly connected, Ethernet0
C       192.168.2.16 255.255.255.240 is directly connected, Ethernet1
C       192.168.2.128 255.255.255.252 is directly connected, Serial0
C       192.168.2.132 255.255.255.252 is directly connected, Serial1
O E2 192.168.225.0 [110/20] via 192.168.2.2, 00:11:20, Ethernet0
O E2 192.168.230.0 [110/20] via 192.168.2.2, 00:11:21, Ethernet0
O E2 192.168.198.0 [110/20] via 192.168.2.2, 00:11:21, Ethernet0
O E2 192.168.215.0 [110/20] via 192.168.2.2, 00:11:21, Ethernet0
O E2 192.168.129.0 [110/20] via 192.168.2.2, 00:11:21, Ethernet0
O E2 192.168.131.0 [110/20] via 192.168.2.2, 00:11:21, Ethernet0
O E2 192.168.135.0 [110/20] via 192.168.2.2, 00:11:21, Ethernet0
O*E2 0.0.0.0 0.0.0.0 [110/1] via 192.168.2.130, 00:11:21, Serial0
O E2 192.168.0.0 255.255.0.0 [110/40] via 192.168.2.18, 00:11:22, Ethernet1
Cleveland#

If the router receives a packet with a destination address of 192.168.1.75, several entries in the routing table match the address: 192.168.0.0/16, 192.168.1.0/24, 192.168.1.0/25, and 192.168.1.64/26. The entry 192.168.1.64/26 is chosen (see Example 2-2) because it matches 26 bits of the destination address, the longest match.

Example 2-2 A Packet with a Destination Address of 192.168.1.75 Is Forwarded Out Interface S1
Cleveland# show ip route 192.168.1.75
Routing entry for 192.168.1.64 255.255.255.192
  Known via "ospf 1", distance 110, metric 139, type extern 1
  Redistributing via ospf 1
  Last update from 192.168.2.134 on Serial1, 06:46:52 ago
  Routing Descriptor Blocks:
  * 192.168.2.134, from 192.168.7.1, 06:46:52 ago, via Serial1
      Route metric is 139, traffic share count is 1

A packet with a destination address of 192.168.1.217 will not match 192.168.1.64/26, nor will it match 192.168.1.0/25. The longest match for this address is 192.168.1.0/24, as demonstrated in Example 2-3.

Example 2-3 The Router Cannot Match 192.168.1.217 to a More-Specific Subnet, So It Matches the Network Address 192.168.1.0/24
Cleveland# show ip route 192.168.1.217
Routing entry for 192.168.1.0 255.255.255.0
  Known via "ospf 1", distance 110, metric 20, type extern 2, forward metric 10
  Redistributing via ospf 1
  Last update from 192.168.2.2 on Ethernet0, 06:48:18 ago
  Routing Descriptor Blocks:
  * 192.168.2.2, from 10.2.1.1, 06:48:18 ago, via Ethernet0
      Route metric is 20, traffic share count is 1

The longest match that can be made for destination address 192.168.5.3 is the aggregate address 192.168.0.0/16, as demonstrated in Example 2-4.

Example 2-4 Packets Destined for 192.168.5.3 Do Not Match a More-Specific Subnet or Network, and Therefore Match the Supernet 192.168.0.0/16
Cleveland# show ip route 192.168.5.3
Routing entry for 192.168.0.0 255.255.0.0, supernet
  Known via "ospf 1", distance 110, metric 139, type extern 1
  Redistributing via ospf 1
  Last update from 192.168.2.18 on Ethernet1, 06:49:26 ago
  Routing Descriptor Blocks:
  * 192.168.2.18, from 192.168.7.1, 06:49:26 ago, via Ethernet1
      Route metric is 139, traffic share count is 1

Finally, a destination address of 192.169.1.1 will not match any of the network entries in the routing table, as demonstrated in Example 2-5. However, packets with this destination address are not dropped, because the routing table of Example 2-1 contains a default route. The packets are forwarded to next-hop router 192.168.2.130.

Example 2-5 No Match Is Found in the Routing Table for 192.169.1.1; Packets Destined for This Address Are Forwarded to the Default Address, Out Interface S0
Cleveland# show ip route 192.169.1.1
% Network not in table
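
The longest-match selections shown in Examples 2-2 through 2-5 can be emulated with a few lines of Python. The sketch below keeps only the relevant prefixes from Example 2-1, abbreviates each next hop to its outgoing interface, and simply chooses the matching prefix with the greatest length:

import ipaddress

# Relevant prefixes from Example 2-1; the remaining entries are omitted.
routes = {
    "0.0.0.0/0":       "Serial0",    # gateway of last resort
    "192.168.0.0/16":  "Ethernet1",
    "192.168.1.0/24":  "Ethernet0",
    "192.168.1.0/25":  "Serial1",
    "192.168.1.64/26": "Serial1",
}
table = {ipaddress.ip_network(p): intf for p, intf in routes.items()}

def longest_match(destination):
    """Return the most specific route covering the destination, if any."""
    dest = ipaddress.ip_address(destination)
    candidates = [net for net in table if dest in net]
    if not candidates:
        return None
    best = max(candidates, key=lambda net: net.prefixlen)
    return best, table[best]

for addr in ("192.168.1.75", "192.168.1.217", "192.168.5.3", "192.169.1.1"):
    print(addr, "->", longest_match(addr))

# 192.168.1.75  -> 192.168.1.64/26, Serial1   (Example 2-2)
# 192.168.1.217 -> 192.168.1.0/24,  Ethernet0 (Example 2-3)
# 192.168.5.3   -> 192.168.0.0/16,  Ethernet1 (Example 2-4)
# 192.169.1.1   -> 0.0.0.0/0,       Serial0   (no specific route; the packet
#                  follows the default, as described for Example 2-5)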

Beginning with IOS 11.3, Cisco routers are classless by default. Prior to this release, the IOS defaults were classful. You can change the default with the ip classless command.

The routing table in Example 2-1 and the associated examples demonstrate another characteristic of longest-match routing: a route to an aggregate address does not necessarily point to every member of the aggregate. Figure 2-3 shows the vectors of the routes in Examples 2-2 through 2-5.

Figure 2-3. The Vectors of Routes in the Routing Table of Example 2-1

graphics/02fig03.gif

You can consider network 192.168.1.0/24 an aggregate of all its subnets; Figure 2-3 shows that the route to this network address directs packets out interface E0. Yet routes to two of its subnets, 192.168.1.0/25 and 192.168.1.64/26, point out a different interface, S1.

NOTE

In fact, 192.168.1.64/26 is itself a member of 192.168.1.0/25. The fact that there are distinct routes for these two addresses, both pointing out S1, hints that they are advertised by separate routers somewhere upstream.


Likewise, 192.168.1.0/24 is a member of the aggregate 192.168.0.0/16, but the route to that less-specific address is out E1. The least-specific route, 0.0.0.0/0, which is an aggregate of all other addresses, is out S0. Because of longest-match routing, packets to subnets 192.168.1.64/26 and 192.168.1.0/25 are forwarded out S1, whereas packets to other subnets of network 192.168.1.0/24 are forwarded out E0. Packets with destination addresses beginning with 192.168, other than 192.168.1, are forwarded out E1, and packets whose destination addresses do not begin with 192.168 are forwarded out S0.

Summarization: The Good, the Bad, and the Asymmetric

Summarization is a great tool for conserving network resources, from the amount of memory required to store the routing table to the amount of network bandwidth and router horsepower necessary to transmit and process routing information. Summarization also conserves network resources by "hiding" network instabilities.

For example, the network in Figure 2-4 has a flapping route: a route that, due to a bad physical connection or router interface, keeps transitioning down and up and down again.

Figure 2-4. A Flapping Route Can Destabilize the Entire Network

graphics/02fig04.gif

Without summarization, every time subnet 192.168.1.176/28 goes up or down, the information must be conveyed to every router in the corporate internetwork. Each of those routers, in turn, must process the information and adjust its routing table accordingly. If router Nashville advertises all the upstream routes with the aggregate address 192.168.1.128/25, however, changes to any of the more-specific subnets are not advertised past that router. Nashville is the aggregation point; the aggregate continues to be stable even if some of its members are not.

The price to be paid for summarization is a reduction in routing precision. In Example 2-6, interface S1 of the router in Figure 2-3 has failed, causing the routes learned from the neighbor on that interface to become invalid. Instead of being dropped, a packet that would normally be forwarded out S1, such as one with a destination address of 192.168.1.75, now matches the next-best route, 192.168.1.0/24, and is forwarded out interface E0. (Compare this to Example 2-2.)

Example 2-6 A Failed Route Can Lead to Inaccurate Packet Forwarding
Cleveland#
%LINEPROTO-5-UPDOWN: Line protocol on Interface Serial1, changed state to down
%LINK-3-UPDOWN: Interface Serial1, changed state to down
Cleveland# show ip route 192.168.1.75
Routing entry for 192.168.1.0 255.255.255.0
  Known via "ospf 1", distance 110, metric 20, type extern 2, forward metric 10
  Redistributing via ospf 1
  Last update from 192.168.2.2 on Ethernet0, 00:00:20 ago
  Routing Descriptor Blocks:
  * 192.168.2.2, from 10.2.1.1, 00:00:20 ago, via Ethernet0
      Route metric is 20, traffic share count is 1
Cleveland#
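
The fallback behavior in Example 2-6 follows directly from longest-match selection: once the Serial1 prefixes (192.168.1.64/26 and 192.168.1.0/25) are withdrawn, the next-longest covering prefix wins. A minimal sketch of the remaining lookup:

import ipaddress

# Prefixes still covering 192.168.1.75 after the Serial1 routes are lost.
remaining = [ipaddress.ip_network(p) for p in
             ("0.0.0.0/0", "192.168.0.0/16", "192.168.1.0/24")]

dest = ipaddress.ip_address("192.168.1.75")
best = max((n for n in remaining if dest in n), key=lambda n: n.prefixlen)
print(best)    # 192.168.1.0/24 -- the packet now follows the Ethernet0 route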

This imprecision may or may not be a problem, depending on what the rest of the internetwork looks like. Continuing with the example, suppose the next-hop router 192.168.2.2 still has a route entry to 192.168.1.64/26 via the router Cleveland, either because the internetwork has not yet converged or because the route was statically entered. In this case, a routing loop occurs. On the other hand, some router reachable via Cleveland's E0 interface may have a "back door" route to subnet 192.168.1.64/26 that should be used only if the primary route, via Cleveland's S1, becomes invalid. In this second case, the route to 192.168.1.0/24 has been designed as a backup route, and the behavior shown in Example 2-6 is intentional.

Figure 2-5 shows an internetwork in which a loss of routing precision can cause a different sort of problem. Here, routing domain 1 is connected to routing domain 2 by routers in San Francisco and Atlanta. What defines these domains is unimportant for the example. What is important is that all the networks in domain 1 can be summarized with the address 172.16.192.0/18, and all the networks in domain 2 can be summarized with the address 172.16.128.0/18.

Figure 2-5. When Multiple Routers Are Advertising the Same Aggregate Addresses, Loss of Routing Precision Can Become a Problem

graphics/02fig05.gif

Rather than advertise individual subnets, Atlanta and San Francisco advertise the summary addresses into the two domains. If a host on Dallas' subnet 172.16.227.128/26 sends a packet to a host on Seattle's subnet 172.16.172.32/28, the packet most likely is routed to Atlanta, because that is the closest router advertising domain 2's summary route. Atlanta forwards the packet into domain 2, and it arrives at Seattle. When the host on subnet 172.16.172.32/28 sends a reply, Seattle forwards that packet to San Francisco, the closest router advertising the summary route 172.16.192.0/18.

The problem here is that the traffic between the two subnets has become asymmetric: Packets from 172.16.227.128/26 to 172.16.172.32/28 take one path, whereas packets from 172.16.172.32/28 to 172.16.227.128/26 take a different path. Asymmetry occurs because the Dallas and Seattle routers do not have complete routes to each other's subnets. They have only routes to the routers advertising the summaries and must forward packets based on those routes. In other words, the summarization at San Francisco and Atlanta has hidden the details of the internetworks behind those routers.

Asymmetric traffic can be undesirable for several reasons. First, internetwork traffic patterns become unpredictable, making baselining, capacity planning, and troubleshooting more problematic. Second, link usage can become unbalanced. The bandwidth of some links can become saturated, while other links are underutilized. Third, a distinct variation can occur in the delay times of outgoing traffic and incoming traffic. This delay variation can be detrimental to some delay-sensitive applications such as voice and live video.

The Internet: Still Hierarchical After All These Years

Although the Internet has grown away from the single-backbone architecture of the ARPANET described in Chapter 1, it retains a certain hierarchical structure. At the lowest level, Internet subscribers connect to an Internet service provider (ISP). In many cases, that ISP is one of many small providers in the local geographic area (called local ISPs). For example, there are presently almost 200 ISPs in Colorado's 303 area code. These local ISPs in turn are the customers of larger ISPs that cover an entire geographic region such as a state or a group of adjacent states. These larger ISPs are called regional service providers. Examples in Colorado are CSD Internet and Colorado Supernet. The regional service providers, in turn, connect to large ISPs with high-speed (DS-3 or OC-3 or better) backbones spanning a national or global area. These largest providers are the network service providers and include companies such as MCI/WorldCom (UUNET), SprintNet, Cable & Wireless, Concentric Network, and PSINet. More commonly, these various providers are referred to as Tier III, Tier II, and Tier I providers, respectively.

Figure 2-6 shows how these different types of ISPs are interrelated. In each case, a subscriber, whether an end user or a lower-level service provider, connects to a higher-level service provider at that ISP's Point of Presence (POP). A POP is just a nearby router to which the subscriber can connect via dialup or a dedicated local loop. At the highest level, the network service providers interconnect via network access points (NAPs). A NAP is a LAN or switch, typically Ethernet, FDDI, or ATM, across which different providers can exchange routes and data traffic.

Figure 2-6. ISP/NAP Hierarchy

graphics/02fig06.gif

As Table 2-1 shows, some NAPs are known by names such as Commercial Internet Exchange (CIX), Federal Internet Exchange (FIX), and Metropolitan Area Exchange (MAE, originally called Metropolitan Area Ethernets, a creation of Metropolitan Fiber Systems, Inc.). CIX, FIX, and MAE-East were early experiments to connect backbones; based on the experience gained from these connection points, the National Science Foundation implemented the first four NAPs in 1994 as part of the decommissioning of the NSFnet.

Table 2-1. Well-Known Network Access Points in the United States
NAP Location Maintained By
New York NAP [*] Pennsauken, New Jersey Sprint
Chicago NAP [*] Chicago, Illinois Ameritech and Bellcore
San Francisco NAP [*] San Francisco, California Pacific Bell
Big East NAP Bohemia, New York ICS Network Systems
MAE-West San Jose, California MCI/WorldCom
MAE-East [*] Washington, DC MCI/WorldCom
MAE-LA Los Angeles, California MCI/WorldCom
MAE-Houston Houston, Texas MCI/WorldCom
MAE-Dallas Dallas, Texas MCI/WorldCom
MAE-New York New York City, New York MCI/WorldCom
MAE-Chicago Chicago, Illinois MCI/WorldCom
FIX-East College Park, Maryland University of Maryland
FIX-West Moffett Field, California NASA Ames Research Center
CIX Santa Clara, California Wiltel
Digital PAIX Palo Alto, California Digital Equipment Corporation

[*] One of the original four NSF NAPs

In addition to the major NAPs shown in Table 2-1, where the NSPs come together, there are many smaller NAPs. These usually interconnect smaller regional providers. Examples of regional NAPs are Seattle Internet eXchange (SIX) and the New Mexico network access point.

In conjunction with the formation of the NAPs, the NSF funded the Routing Arbiter (RA) project. One of the duties of the RA is to promote Internet stability and manageability. To this end, the RA proposed a database (the RADB, or Routing Arbiter Database) of routes (topology) and policies (preferred paths) from the service providers. The database is maintained at NAPs on a route server, a UNIX workstation or server running BGP. Rather than peering with every other router at the NAP, each provider's router peers with only the route server. Routes and policies are communicated to the server, which uses a sophisticated database language called RIPE-181 to process and maintain the information. The appropriate routes are then passed to the other routers.

Although the route server speaks BGP and processes routes, it does not perform packet forwarding. Instead, its updates inform routers of the best next-hop router that is directly reachable across the NAP. You are already familiar with this concept from the discussion in Chapter 1 of EGP third-party neighbors. By making one-to-many peering feasible rather than many-to-many peering, route servers increase the stability, manageability, and throughput of traffic through the NAPs.

The NAPs and the RA project proved that the competing network service providers could cooperate to provide manageable connectivity and stability to the Internet. As a result, the NSF ceased funding of the route servers and NAPs on January 1, 1997, and turned the operations over to the commercial interests. Although publicly funded Internet research continues with such projects as Internet2, GigaPOPs, and the very high-speed Backbone Network Service (vBNS), the present Internet can be considered a commercial operation.

A result of the transition to commercial control of the Internet is that the topology of the modern Internet is far from the tidy picture drawn by the preceding paragraphs. The largest service providers, driven by financial, competitive, and policy interests, generally choose to peer directly rather than peer through route servers. The peering also takes place at many levels, rather than just at the top level shown in Figure 2-6.

When two or more service providers agree to share routes across a NAP, either directly or through a route server, they enter into a peering agreement. A peering agreement may be established directly between two providers (a bilateral peering agreement) or between a group of similar-sized providers (a multilateral peering agreement, or MLPA). Traffic patterns play a major role in determining the financial nature of the agreement. If the traffic between the peering partners is reasonably balanced in both directions, money usually does not exchange hands. The peering is equitable for the two partners. However, if the traffic is heavier in one direction than in the other across the peering point, as is the case when a small provider peers with a larger provider, the small provider usually must pay for the peering privilege. The rationale here is that the small provider benefits more from the peering than the larger provider.

Another factor muddling the Internet picture is the location of peering points. NAPs in which many providers come together, such as the ones listed in Table 2-1, are public peering sites. In addition to these public sites, service providers have created hundreds of smaller NAPs at sites where they find themselves co-located with other service providers. The peering agreements at such sites are usually private agreements between two or a few providers. Private peering is encouraged because it helps relieve congestion at the national NAPs, adds to route diversity, and can decrease delay for some traffic.

Another fact hinting that real life is not as tidy as Figure 2-6 suggests is that many national and regional service providers also sell local Internet access, in direct competition with the local ISPs. The "starting point" of the route traces in Example 2-7, for example, is a dial-in POP belonging to Concentric Network, a backbone provider. Regional service providers also frequently have a presence at the backbone NAPs. They might connect to one or more network service providers across the NAP, or they might connect to other regional service providers across the NAP, bypassing any network service provider.

The route traces in Example 2-7 show a little of the Internet backbone structure. Both traces originated from a Concentric Network POP in Denver. In the first trace, the packets traverse Concentric Network's backbone to MAE-East, where they connect to the BBN Planet backbone (lines 3 and 4). The packets traverse BBN Planet's backbone to a Tier II NAP shared by BBN and US West in Minneapolis (lines 10 and 11) and then are passed to the US West destination.

Example 2-7 Route Traces from a Concentric Network POP in Denver
--- traceroute to www.uswest.com (205.215.207.54),
    30 hops max, 18 byte packets
 1  (  207.155.168.5)  ts003e01.den-co.concentric.net  174 ms
 2  (  207.155.168.1)  rt001e0102.den-co.concentric.net.168.155.207.IN-ADDR.ARPA  162 ms
 3  (   207.88.24.29)  us-dc-wash-core1-a1-0d12.rtr.concentric.net  385 ms
 4  (   192.41.177.2)  maeeast2.bbnplanet.net  225 ms
 5  (       4.0.1.93)  p2-2.vienna1-nbr2.bbnplanet.net  232 ms
 6  (      4.0.3.130)  p3-1.nyc4-nbr2.bbnplanet.net  222 ms
 7  (       4.0.5.26)  p1-0.nyc4-nbr3.bbnplanet.net  223 ms
 8  (      4.0.3.121)  p2-1.chicago1-nbr1.bbnplanet.net  235 ms
 9  (       4.0.5.89)  p10-0-0.chicago1-br1.bbnplanet.net  239 ms
10  (       4.0.2.18)  h1-0.minneapol1-cr1.bbnplanet.net  258 ms
11  (    4.0.246.254)  h1-0.uswest-mn.bbnplanet.net  260 ms
12  (207.225.159.221)  207.225.159.221  249 ms
13  ( 205.215.207.54)  www.uswc.uswest.net  258 ms
____________________________________________________________
--- traceroute to www.rmi.net (166.93.8.30),
    30 hops max, 18 byte packets
 1  (  207.155.168.5)  ts003e01.den-co.concentric.net  152 ms
 2  (  207.155.168.1)  rt001e0102.den-co.concentric.net.168.155.207.IN-ADDR.ARPA  161 ms
 3  (   207.88.24.21)  207.88.24.21  190 ms
 4  (   207.88.0.253)  us-ca-scl-core1-f9-0.rtr.concentric.net  189 ms
 5  (   207.88.0.178)  207.88.0.178  206 ms
 6  ( 144.228.207.73)  sl-gw18-chi-5-1-0-T3.sprintlink.net  210 ms
 7  (  144.232.0.217)  sl-bb11-chi-3-3.sprintlink.net  216 ms
 8  (  144.232.0.174)  sl-bb5-chi-4-0-0.sprintlink.net  211 ms
 9  (   144.232.8.85)  sl-bb7-pen-5-1-0.sprintlink.net  225 ms
10  (   144.232.5.53)  sl-bb10-pen-1-3.sprintlink.net  236 ms
11  (   144.232.5.62)  sl-nap1-pen-4-0-0.sprintlink.net  228 ms
12  (  192.157.69.13)  p219.t3.ans.net  263 ms
13  ( 140.223.60.209)  f1-1.t60-6.Reston.t3.ans.net  264 ms
14  (  140.223.65.17)  h12-1.t64-0.Houston.t3.ans.net  286 ms
15  (  140.223.25.14)  h13-1.t80-1.St-Louis.t3.ans.net  283 ms
16  (  140.223.25.29)  h14-1.t24-0.Chicago.t3.ans.net  292 ms
17  (   140.223.9.18)  h14-1.t96-0.Denver.t3.ans.net  309 ms
18  ( 140.222.96.122)  f1-0.c96-10.Denver.t3.ans.net  313 ms
19  (  207.25.224.14)  h1-0.enss3191.t3.ans.net  306 ms
20  (  166.93.46.246)  166.93.46.246  305 ms
21  (    166.93.8.30)  www.rmi.net  285 ms

The packets in the second trace take a pretty thorough tour of the United States before arriving at their destination, a few miles from their origination. First, they follow Concentric's backbone through a router in California (line 4) and then to the Chicago NAP, where they connect to the Sprint backbone (line 6). The packets are routed to the New York NAP in Pennsauken, New Jersey, where they are passed to the ANS backbone (lines 11 and 12). They then visit routers in Reston, Houston, St. Louis, and Chicago (again), and finally arrive back in Denver.

Like the packets in the last trace, we have taken a rather lengthy and circuitous route to get back to the topic at hand, CIDR.

CIDR: Reducing Routing Table Explosion

Given the somewhat hierarchical structure of the Internet, you can see how the structure lends itself to an address summarization scheme. At the top layers, large blocks of contiguous Class C addresses are assigned by the Internet Assigned Numbers Authority (IANA) to the various addressing authorities around the globe, known as the regional IP registries. Currently, there are three regional registries. The regional registry for North and South America, the Caribbean, and sub-Saharan Africa is the American Registry for Internet Numbers (ARIN). ARIN also is responsible for assigning addresses to the global network service providers. The regional registry for Europe, the Middle East, northern Africa, and parts of Asia (the area of the former Soviet Union) is the Réseaux IP Européens (RIPE). The regional registry for the rest of Asia and the Pacific nations is the Asia Pacific Network Information Center (APNIC).

NOTE

ARIN was spun off of the InterNIC (run by Network Solutions, Inc.) in 1997 to separate the management of IP addresses and domain names.


Table 2-2 shows the original scheme for assigning Class C addresses to the regions these registries serve, although some of the allocations are now outdated. As Example 2-8 demonstrates, the blocks labeled "Others" are now being assigned. The regional registries, in turn, assign portions of these blocks to the large service providers or to local IP registries. Generally, the blocks assigned at this level are no smaller than 32 contiguous Class C addresses (and are usually larger). Concentric Network has been assigned the block 207.155.128.0/17, for example, which includes the equivalent of 128 contiguous Class C addresses (see Example 2-8).

Table 2-2. CIDR Address Allocation by Geographic Region
Region Address Range
Multiregional 192.0.0.0–193.255.255.255
Europe 194.0.0.0–195.255.255.255
Others 196.0.0.0–197.255.255.255
North America 198.0.0.0–199.255.255.255
Central/South America 200.0.0.0–201.255.255.255
Pacific Rim 202.0.0.0–203.255.255.255
Others 204.0.0.0–205.255.255.255
Others 206.0.0.0–207.255.255.255
Example 2-8 When a WHOIS Is Performed on the Address 207.155.128.5 from Example 2-7, the Address Is Shown as Part of a /17 CIDR Block Assigned to Concentric Network
--- looking up 207.155.128.5
--- performing WHOIS on "207.155.128.5", please wait...
--- contacting host whois.arin.net
--- smart query on "207.155.128"
Concentric Research Corp. (NETBLK-CONCENTRIC-CIDR)
   10590 N. Tantau Ave.
   Cupertino, CA  95014

   Netname: CONCENTRIC-CIDR
   Netblock: 207.155.128.0 - 207.155.255.255
   Maintainer: CRC

   Coordinator:
      DNS and IP ADMIN  (DIA-ORG-ARIN)  hostmaster@CONCENTRIC.NET
      (408) 342-2800 Fax- (408) 342-2810

   Domain System inverse mapping provided by:

   NAMESERVER3.CONCENTRIC.NET    206.173.119.72
   NAMESERVER2.CONCENTRIC.NET    207.155.184.72
   NAMESERVER1.CONCENTRIC.NET    207.155.183.73
   NAMESERVER.CONCENTRIC.NET     207.155.183.72

   Record last updated on 13-Feb-97.
   Database last updated on 29-Jan-99 16:12:40 EDT.

The service providers receiving these blocks assign them in smaller blocks to their subscribers. If those subscribers are themselves ISPs, they can again break their blocks into smaller blocks. The obvious advantage of assigning these blocks of Class C addresses, called CIDR blocks, comes when the blocks are summarized back up the hierarchy. For more information on how addresses are assigned throughout the Internet, see RFC 2050 (www.isi.edu/in-notes/rfc2050.txt).

To illustrate, suppose Concentric Network assigns to one of its subscribers a portion of its 207.155.128.0/17 block, consisting of 207.155.144.0/20. If that subscriber is an ISP, it may assign a portion of that block, say 207.155.148.0/22, to one of its own subscribers. That subscriber advertises its /22 (read "slash twenty-two") block back to its ISP. That ISP in turn summarizes all of its subscribers to Concentric Network with the single aggregate 207.155.144.0/20, and Concentric Network summarizes its subscribers into the NAPs to which it is attached with the single aggregate 207.155.128.0/17.
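
The containment relationships in this allocation chain are easy to confirm; the following minimal sketch checks that the subscriber's /22 nests inside the ISP's /20, which in turn nests inside Concentric's /17:

import ipaddress

provider   = ipaddress.ip_network("207.155.128.0/17")   # Concentric Network
isp        = ipaddress.ip_network("207.155.144.0/20")   # downstream ISP
subscriber = ipaddress.ip_network("207.155.148.0/22")   # that ISP's subscriber

print(subscriber.subnet_of(isp))       # True
print(isp.subnet_of(provider))         # True
# One /17 advertisement therefore covers the equivalent of 128 Class C networks.
print(provider.num_addresses // 256)   # 128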

The advertisement of a single aggregate to the higher-level domain is obviously preferable to advertising possibly hundreds of individual addresses. But an equally important benefit is the stability such a scheme adds to the Internet. If the state of a network in a low-level domain changes, that change is felt only up to the first aggregation point and no further.

Table 2-3 shows the different sizes of CIDR blocks, their equivalent size in Class C networks, and the number of hosts each block can represent.

Table 2-3. CIDR Block Sizes
CIDR Block Prefix Size Number of Equivalent Class C Addresses Number of Possible Host Addresses
/24 1 254
/23 2 510
/22 4 1022
/21 8 2046
/20 16 4094
/19 32 8190
/18 64 16,382
/17 128 32,766
/16 256 65,534
/15 512 131,070
/14 1024 262,142
/13 2048 524,286
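
The values in Table 2-3 follow from two simple expressions: a /p block spans 2^(24-p) Class C equivalents and 2^(32-p) - 2 usable host addresses (the network and broadcast addresses are excluded). A short sketch that regenerates the table:

for prefix in range(24, 12, -1):          # /24 down to /13, as in Table 2-3
    class_c_equivalents = 2 ** (24 - prefix)
    host_addresses = 2 ** (32 - prefix) - 2
    print("/%d  %d  %d" % (prefix, class_c_equivalents, host_addresses))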

CIDR: Reducing Class B Address Space Depletion

The depletion of Class B addresses was due to an inherent flaw in the design of the IP address classes. A Class C address provides 254 host addresses, whereas a Class B address provides 65,534 host addresses. That's a wide gap. Before CIDR, if your company needed 500 host addresses, a Class C address would not have served your needs. You probably would have requested a Class B address, even though you would be wasting 65,000 host addresses. With CIDR, your needs can be met with a /23 block. The host addresses that would have otherwise been wasted have been conserved.
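
Picking the right block size is just a matter of finding the longest prefix that still leaves enough usable host addresses. A minimal sketch for the 500-host case in the text:

needed = 500
# Longest prefix p for which 2^(32 - p) - 2 usable hosts still cover the need.
prefix = max(p for p in range(8, 31) if 2 ** (32 - p) - 2 >= needed)
print(prefix)    # 23 -- a /23 block (510 hosts) fits, where a /24 (254) cannot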

Difficulties with CIDR

Although CIDR has proven successful in slowing both the growth of Internet routing tables and the depletion of Class B addresses, it also has presented some problems for the users of CIDR blocks.

The first problem is one of portability. If you have been given a CIDR block, the addresses are most likely part of a larger block assigned to your ISP. Suppose, however, that your ISP is not living up to your expectations or contractual agreements, or you have just gotten a more attractive offer from another ISP. A change of ISPs most likely means you must re-address. It's unlikely that an ISP will allow a subscriber to keep its assigned block when the subscriber moves to a new provider. Aside from an ISP's being unwilling to give away a portion of its own address space, regional registries strongly encourage the return of address space when a subscriber changes ISPs.

For an end user, re-addressing carries varying degrees of difficulty. The process is probably the easiest for those who use private address space within their routing domain and network address translation (see Chapter 4) at the edges of the domain. In this case, only the "public-facing" addresses have to be changed, with minimal impact on the internal users. At the other extreme are end users who have statically assigned public addresses to all their internal network devices. These users have no choice but to visit every device in the network to re-address.

Even if the end user is using the CIDR block throughout the domain, the pain of re-addressing can be somewhat reduced by the use of DHCP (or BOOTP). In this case, the DHCP scopes must be changed and users must reboot, but only the statically addressed network devices, such as servers and routers, must be individually re-addressed.

The problem is much amplified if you are an ISP rather than an end user and you want to change your upstream service provider. Not only must your own internetwork be renumbered, but so must any of your subscribers to whom you have assigned a portion of your CIDR block.

CIDR also presents a problem to anyone who wants to connect to multiple service providers. Multihoming (discussed in more depth later in this chapter) is used for redundancy so that an end user or ISP is not vulnerable to the failure of a single upstream service provider. The trouble is that if your addresses are taken from one ISP's block, you must advertise those addresses to the second provider.

Figure 2-7 shows what can happen. Here, the subscriber has a /23 CIDR block that is part of ISP1's larger /20 block. When the subscriber attaches to ISP2, he wants to ensure that traffic from the Internet can reach him through either ISP1 or ISP2. To make this happen, he must advertise his /23 block through ISP2. The trouble arises when ISP2 advertises the /23 block to the rest of the world. Now all the routers "out there" have a route to 205.113.48.0/20 advertised by ISP1 and a route to 205.113.50.0/23 advertised by ISP2. Any packets destined for the subscriber are forwarded on the more-specific route, and as a result, almost all traffic from the Internet to the subscriber is routed through ISP2, including traffic from sources that are geographically much closer to the subscriber through ISP1.

Figure 2-7. Incoming Internet Traffic Matches the Most-Specific Route

graphics/02fig07.gif
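
The routing behavior in Figure 2-7 is, again, just longest-match selection at work. The sketch below uses a hypothetical destination host (205.113.51.10, chosen only because it falls inside the subscriber's /23) to show why the Internet prefers ISP2's advertisement:

import ipaddress

isp1_aggregate = ipaddress.ip_network("205.113.48.0/20")   # advertised by ISP1
subscriber     = ipaddress.ip_network("205.113.50.0/23")   # advertised by ISP2

print(subscriber.subnet_of(isp1_aggregate))    # True: the /23 is inside ISP1's block

dest = ipaddress.ip_address("205.113.51.10")   # hypothetical host at the subscriber
matches = [n for n in (isp1_aggregate, subscriber) if dest in n]
print(max(matches, key=lambda n: n.prefixlen)) # 205.113.50.0/23 -> traffic enters via ISP2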

In Figure 2-7, it is even possible for the 205.113.50.0/23 route to be advertised into ISP1 from the Internet. This shouldn't happen, because most ISPs set route filters to prevent their own routes from reentering their domain. However, there are no guarantees that ISP1 is filtering properly. If the more-specific route should leak in from the Internet, traffic from ISP1's other subscribers could traverse the Internet and ISP2 to 205.113.50.0/23 rather than take the more-direct path.

For the subscriber to be multihomed, ISP1 must advertise the more-specific route in addition to its own CIDR block (see Figure 2-8). Most service providers will not agree to this arrangement, because it means "punching a hole" in their own CIDR block (sometimes called address leaking). In addition to reducing the overall effectiveness of CIDR, advertising a more-specific route of its own CIDR block carries an administrative burden for the ISP.

Figure 2-8. ISP1 "Punches a Hole" in Its CIDR Block

graphics/02fig08.gif

Although Figures 2-7 and 2-8 show ISP1 as having only a single connection to the Internet, in most cases an ISP has many connections to higher-level providers and at NAPs. At each of these connections, the provider must reconfigure its router to advertise the more-specific route in addition to the CIDR block, and possibly must modify all its incoming route filters. Administration is also complicated by the fact that ISP1 and ISP2 have to closely coordinate their efforts to ensure that the subscriber's /23 block is advertised correctly. Because ISP1 and ISP2 are competitors, either or both might be resistant to working so closely together.

Even if the subscriber in Figure 2-8 can get ISP1 and ISP2 to agree to advertise its own /23 block, there is another obstacle. Some Tier I providers accept only prefixes of /19 or shorter, to keep the backbone-level routing tables under control. If ISP1 or ISP2 or both get their Internet connectivity from one of these network service providers, they cannot advertise the subscriber's /23. The practice of filtering any CIDR address with a prefix longer than /19 has become so well-known that a /19 prefix is commonly referred to as a globally routable address. The implication here is that if you advertise a longer CIDR prefix, say a /21 or /22, your prefix might not be advertised to all parts of the Internet. Remember that any parts of the Internet that do not know how to reach you are essentially unreachable by you.

NOTE

Many Tier I providers have relaxed their /19 rules recently in response to increased subscriber complaints.


A possible solution for the multihomed subscriber in Figure 2-8 is to obtain a provider-independent address space (also known as a portable address space). That is, the subscriber can apply for a block that is not a part of either ISP1's or ISP2's CIDR block; both ISPs can advertise the subscriber's block without interfering with their own address space. Since the formation of ARIN, obtaining a provider-independent block is somewhat easier than it was under the InterNIC. ARIN strongly encourages you to seek an address space first from your provider and second from your provider's provider; obtaining a provider-independent address space directly from ARIN is a last resort. Even then, you still face difficulties.

First, if you want to multihome, it is likely that your present address space was obtained from your original ISP. Changing to a provider-independent address space means renumbering, with all the difficulties already discussed. (Of course, if you obtained your IP address space in the pre-CIDR days, you are already provider-independent, making the question moot.)

Second, the registries assign address space based on justified need, not on long-term predicted need. This policy means that you probably will be allocated "just enough" space to fit your present needs and a three-month predicted need. From there, you have to justify a further allocation by proving that you are efficiently using the original space. For example, ARIN requires proof of address utilization by one of two means: the use of the Shared WHOIS Project (SWIP) or the use of a Referral WHOIS Server (RWHOIS). SWIP, most commonly used, is the practice of adding WHOIS information to a SWIP template and e-mailing it to ARIN. To use RWHOIS, you establish an RWHOIS server on your premises that ARIN can access for WHOIS information. In both cases, the WHOIS information establishes proof that you have efficiently used, and are approaching exhaustion of, your present address space.

Of course, you still have a problem if you cannot justify obtaining a globally routable (/19) address space. The bottom line is that CIDR allocation rules make multihoming a difficult problem for small subscribers and ISPs. The following section discusses multihoming in more detail, along with some alternative topologies.


