IPv6 Functionality

 

A number of functions designed as a part of IPv6 must be implemented by any node said to support IPv6, including the following:

  • ICMPv6

  • Neighbor discovery

  • Stateless autoconfiguration

  • Anycast

  • Multicast

  • MTU path discovery (recommended)

These functions are the basis of IPv6, in most cases enhancing the capabilities of IPv4.

Another feature of IPv6 is the ability to assign multiple addresses to any interface, easing the problem of prefix renumbering. Not only can any IPv6 interface have multiple addresses in multiple prefixes, but two nodes on a link can also communicate directly, regardless of the prefix to which they belong.

This functionality is discussed in detail in this section. Cisco routers are configured and command output is examined to help you understand the IPv6 functionality.

Enabling IPv6 Capability on a Cisco Router

IPv6 (disabled by default) is enabled on the Cisco router by issuing the following global command:

  ipv6 unicast-routing [ table-count   num  ] 

Cisco's support enables multiple routing tables. One routing table is enabled by default. Multiple tables enable the network administrator to have more control over routing entry lookups. Longest match routing is no longer the only rule. If multiple tables are enabled, the forwarding algorithm searches the routing tables in increasing order until a usable route is found.

The next step in configuring IPv6 is to enable an IPv6 interface and enable autoconfiguration, or to configure an address. The following section discusses autoconfiguration.

The interface subcommand to enable the interface for IPv6 and configure the interface with an address is as follows:

  ipv6 address   ipv6address/prefix-length  [  link-local  ] 

The interface subcommand to enable an interface without a specific address configured is as follows:

  ipv6 enable  

The router autoconfigures a link-local unicast address as part of enabling the interface.

Two routers, Falcon and Eagle, both reside on a single Ethernet link and have the configurations shown in Example 8-1.

Example 8-1 Enabling IPv6 on Two Routers That Reside on a Single Ethernet Link
  Falcon
  ipv6 unicast-routing
  !
  interface Ethernet0
   ipv6 enable
  ______________________________________________________________________
  Eagle
  ipv6 unicast-routing
  !
  interface Ethernet0
   ipv6 enable

Note that the configurations in Example 8-1 are identical.

The command to display the state of IPv6 on the interface, as well as relevant interface information, is as follows:

  show ipv6 interface   interface-type number  

Example 8-2 shows partial output from the show ipv6 interface command, which displays the Ethernet interfaces' MAC addresses and the IPv6 state and link-local addresses automatically configured on the interfaces.

Example 8-2 show ipv6 interface ethernet 0 Is Used to View IPv6 Interface Information
  Falcon# sh int e 0
  Ethernet0 is up, line protocol is up
    Hardware is Lance, address is 0000.0c0a.2c51 (bia 0000.0c0a.2c51)
  Falcon# show ipv6 interface ethernet 0
  Ethernet0 is up, line protocol is up
    IPv6 is enabled, link-local address is FE80::200:CFF:FE0A:2C51
  _______________________________________________________________________
  Eagle# sh int e 0
  Ethernet0 is up, line protocol is up
    Hardware is Lance, address is 0000.0c76.5b7c (bia 0000.0c76.5b7c)
  Eagle# show ipv6 interface ethernet 0
  Ethernet0 is up, line protocol is up
    IPv6 is enabled, link-local address is FE80::200:CFF:FE76:5B7C

Notice that Falcon's MAC address 0000.0C0A.2C51 creates the link-local address FE80::200:CFF:FE0A:2C51, and Eagle's MAC address 0000.0C76.5B7C creates the link-local address FE80::200:CFF:FE76:5B7C. Also note that IPv6 is enabled.
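The mapping from MAC address to link-local address seen in Example 8-2 can be reproduced with a short Python sketch. It assumes the modified EUI-64 rule (invert the universal/local bit of the first octet and insert FFFE in the middle of the MAC address), which this section does not spell out; the function name is illustrative only.

```python
import ipaddress

def link_local_from_mac(mac: str) -> ipaddress.IPv6Address:
    """Derive a link-local address from a 48-bit MAC (modified EUI-64, assumed)."""
    # Accept Cisco dotted (0000.0c0a.2c51), colon, or hyphen notation.
    digits = mac.replace(".", "").replace(":", "").replace("-", "")
    octets = bytearray(bytes.fromhex(digits))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02  # invert the universal/local bit
    # Insert FF:FE between the two halves of the MAC address.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    prefix = bytes.fromhex("fe80") + bytes(6)  # FE80::/64
    return ipaddress.IPv6Address(prefix + interface_id)

print(link_local_from_mac("0000.0c0a.2c51"))  # fe80::200:cff:fe0a:2c51
```

Python prints the address in lowercase; it is the same address that the router displays as FE80::200:CFF:FE0A:2C51.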

ICMPv6

ICMPv6 is integral to IPv6. Every node that implements IPv6 must fully implement ICMPv6. ICMPv6 is a modified version of ICMP for IPv4. Error reporting and many IPv6 functions, such as MTU path discovery and neighbor discovery, utilize ICMPv6. Error messages are discussed here.

The ICMPv6 packet follows the IPv6 header or one of the extension headers and is identified by the IPv6 Next-Header value of 58 in the immediately preceding header. (This is not the same value used by IP to identify ICMP for IPv4.) Informational and error messages are identified by the high-order bit in the ICMP Type field. An error message has a zero in the high-order bit of the ICMP Type field. An ICMP error message includes as much of the offending packet as possible without making the ICMP message larger than the minimum IPv6 MTU, 1280 bytes.
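The two rules in the preceding paragraph, classifying a message by the high-order bit of the Type field and capping an error message at 1280 bytes, can be sketched as follows. The function names are hypothetical, and the size calculation assumes that the error message carries no extension headers and uses an 8-byte ICMPv6 error header.

```python
ICMPV6_NEXT_HEADER = 58      # Next-Header value that identifies ICMPv6
MIN_IPV6_MTU = 1280          # minimum IPv6 MTU, in bytes
IPV6_HEADER_LEN = 40         # fixed IPv6 header
ICMPV6_ERROR_HEADER_LEN = 8  # assumed: type, code, checksum, 4 type-specific bytes

def is_error_message(icmp_type: int) -> bool:
    """Error messages have a zero in the high-order bit of the Type field."""
    if not 0 <= icmp_type <= 255:
        raise ValueError("ICMP type is an 8-bit value")
    return (icmp_type & 0x80) == 0

def offending_bytes_to_include(offending_packet_len: int) -> int:
    """How much of the offending packet fits without exceeding 1280 bytes."""
    room = MIN_IPV6_MTU - IPV6_HEADER_LEN - ICMPV6_ERROR_HEADER_LEN
    return min(offending_packet_len, room)

print(is_error_message(133))             # False: type 133 is informational
print(offending_bytes_to_include(1500))  # 1232
```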

The following error messages are discussed:

  • Destination Unreachable

  • Packet Too Big

  • Time Exceeded

  • Parameter Problem

Destination Unreachable errors are sent when a node cannot forward the packet for some reason other than congestion. The node sends an error message to the source of the packet, with a code indicating the following:

  • No route to the destination (0)

  • Access is administratively prohibited (1)

  • Address unreachable (3)

  • Port unreachable (4)

A node sends a Packet Too Big message when the size of the packet exceeds the MTU on the link. In IPv6, fragmentation is not performed by routers, as it is in IPv4. Only the source node performs fragmentation. The MTU of the link that caused the error is included in the packet. The Packet Too Big message is sent regardless of whether the IPv6 destination is unicast or multicast. The message is used by the MTU path discovery process.
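A minimal simulation can illustrate how a source might use Packet Too Big messages to converge on a usable packet size. The list of link MTUs and the function name are invented for illustration, and the sketch ignores timers and the later re-probing for a larger MTU.

```python
def discover_path_mtu(link_mtus, first_hop_mtu):
    """Return (path_mtu, sizes_tried) for a path described by its link MTUs."""
    size = first_hop_mtu
    sizes_tried = []
    while True:
        sizes_tried.append(size)
        # The first link whose MTU is smaller than the packet triggers
        # a Packet Too Big message that reports that link's MTU.
        reported = next((mtu for mtu in link_mtus if mtu < size), None)
        if reported is None:
            return size, sizes_tried  # the packet fits on every link
        size = reported  # the source, not a router, reduces the packet size

print(discover_path_mtu([1500, 1400, 1300], 1500))
# (1300, [1500, 1400, 1300])
```

Each retry uses a strictly smaller size, so the loop always terminates.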

When the IPv6 hop limit reaches zero, an ICMP Time Exceeded message is sent. A zero hop limit usually indicates a routing loop.

An ICMP Parameter Problem message is sent if a node finds a problem with part of an IP header or an extension header. A pointer to the location in the offending header is included in the error message. An error code identifies the type of problem encountered:

  • Erroneous header field encountered (0)

  • Unrecognized next-header type encountered (1)

  • Unrecognized IPv6 option encountered (2)

Neighbor Discovery

The Neighbor Discovery (ND) protocol addresses many problems related to nodes on a single link. It provides the functionality for serverless automatic configuration, router discovery, prefix discovery, address resolution, neighbor unreachability detection, link MTU discovery, next-hop determination, and duplicate address detection. With IPv4, a combination of many protocols, including DHCP, ICMP router discovery, a routing protocol, and ARP, is required to provide only some of this functionality. ND uses ICMPv6 to perform these tasks. ND is intended to improve on the IPv4 processes by integrating them all into ICMPv6, a required component of IPv6.

When a node is initialized, it must know a few things before it begins communicating:

  • It must know its own address.

  • It must know its own prefix information so that it can figure out how to send packets to nodes located in other prefixes.

  • It must know about any routers on the link.

  • It needs to know how to determine the next hop in the path to a destination.

  • It needs to know how to obtain the link-level address associated with a known network layer address.

  • It needs to know how large a packet it can send.

To make communication run a lot smoother, a node should know some other things:

  • It should be able to detect when a neighbor is no longer reachable so that it does not send packets to that neighbor.

  • It should know about neighbors on its link.

  • It should know whether the address it is trying to use is in use already by another node on the link.

  • It needs to know what other prefixes are assigned to nodes on the same link.

  • It should be able to redirect traffic to a better next-hop node, if one exists, for any destination.

ND defines five ICMPv6 packets to provide IPv6 nodes with the information they must and should know before communicating:

  • Router Solicitation (RS): Multicast by a node when it wants routers to send a Router Advertisement immediately instead of waiting for the next scheduled advertisement. An initializing node may send the Router Solicitation so that it can immediately learn about configuration parameters and about the existence of routers on the link.

  • Router Advertisement (RA): Sent periodically or in response to a solicitation. Routers advertise their presence, as well as provide information necessary for a node to configure itself.

  • Neighbor Solicitation (NS): Enables a node to determine the link layer address of a neighbor or to determine whether the neighbor is still reachable via a cached link layer address. Also enables a node to determine whether a duplicate IP address exists on the link.

  • Neighbor Advertisement (NA): Sent in response to Neighbor Solicitations, or unsolicited if a node's link layer address changes.

  • Redirect: Sent by routers to redirect traffic to a better first hop on the link.

Each message is an ICMP packet with a defining type. The ICMP packet contains type-specific information. Each type of message may also contain one or more TLV options.

ND provides the basis for stateless autoconfiguration, that is, automatic configuration without a configuration server. Router Advertisements provide the information necessary for node configuration. Autoconfiguration is discussed fully in the section "Autoconfiguration."

Router Solicitation

Hosts send Router Solicitations when they want to receive a Router Advertisement right away; they do not want to wait for the periodic advertisement. An initializing host sends an RS so that it can quickly learn the information it needs for configuration.

An RS is an ICMP packet of type 133. Its source address is an address assigned to the sending host's interface. If no address has yet been assigned, it is the unspecified address, 0:0:0:0:0:0:0:0. The destination is typically the all-routers multicast address. The RS also may contain an option with the sender's link layer address. The link layer address must not be included if the source address is the unspecified address.

Router Advertisements

Routers advertise their presence on a link and provide the information necessary for a node to configure itself. The RA is multicast to the link-scope all-nodes multicast group.

An RA is an ICMP packet of type 134. Its source IP address is the link-local address of the sending router, and the destination address is either the address of a node that sent a Router Solicitation or the link-scope all-nodes multicast address. The hop limit must be set to 255. In this case, the hop limit is not used to stop routers from forwarding the packet. A value of 1 would do that, because a router that receives the packet decrements the hop limit and drops the packet when the hop limit reaches 0. Instead, the value of 255 ensures that no off-link device can send RAs in an attempt to disrupt traffic flow. If an off-link device does send an RA, the RA traverses a router, which automatically decrements the hop-limit value, rendering the packet invalid. One of the ways that the receiving node validates the packet is by verifying that the hop limit is 255. IPv4 does not use this method of ensuring that the packet could not possibly have traversed a router.
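The hop-limit check reduces to a single comparison. The sketch below, with hypothetical function names, shows why a packet that has crossed even one router can no longer pass the test.

```python
ND_HOP_LIMIT = 255  # value the sender must place in the hop limit field

def hop_limit_after(routers_traversed: int, initial: int = ND_HOP_LIMIT) -> int:
    """Each forwarding router decrements the hop limit by one."""
    return max(initial - routers_traversed, 0)

def arrived_from_on_link(received_hop_limit: int) -> bool:
    """A receiver accepts the message only if the hop limit is still 255."""
    return received_hop_limit == ND_HOP_LIMIT

print(arrived_from_on_link(hop_limit_after(0)))  # True: sender is on-link
print(arrived_from_on_link(hop_limit_after(1)))  # False: crossed one router
```

Because 255 is the largest value the 8-bit field can hold, an off-link sender cannot choose a starting value that arrives as 255.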

An RA contains a Router Lifetime. The Router Lifetime informs nodes how long they should consider the router as a default. The time is in units of seconds, with a maximum value of 18.2 hours. A value of 0 means that the router is not a default candidate and should not appear on any host's default router list.

A host receiving RAs builds a default router list. All routers that advertised RAs with non-zero valued Router Lifetimes appear in the default router list. The entry for a router's Router Lifetime value in the default list is updated with each subsequent RA received. If an RA contains a zero-valued Router Lifetime for an already listed router, the host immediately removes the router from the default list (an improvement over IPv4). IPv4 hosts have to be manually configured with default router lists. Some IPv4 hosts run a routing protocol, such as RIP, to dynamically learn this information, and some run the ICMP Router Discovery Protocol (IRDP). Neither RIP nor IRDP is implemented on all IPv4 hosts, however.
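The list maintenance described above can be sketched as a small class. The class and method names are hypothetical, time is passed in explicitly as seconds to keep the example deterministic, and the expiry of unrefreshed entries is an assumption of this sketch.

```python
class DefaultRouterList:
    """Sketch of a host's default router list, keyed by router address."""

    def __init__(self):
        self._expiry = {}  # router address -> absolute expiry time (seconds)

    def process_ra(self, router: str, router_lifetime: int, now: float) -> None:
        if not 0 <= router_lifetime <= 0xFFFF:
            raise ValueError("Router Lifetime is a 16-bit value in seconds")
        if router_lifetime == 0:
            # Zero lifetime: not a default candidate, remove immediately.
            self._expiry.pop(router, None)
        else:
            # Each RA refreshes the lifetime of the entry.
            self._expiry[router] = now + router_lifetime

    def routers(self, now: float) -> list:
        """Routers whose lifetime has not yet run out."""
        return [r for r, expiry in self._expiry.items() if expiry > now]

routers = DefaultRouterList()
routers.process_ra("fe80::1", 1800, now=0)
routers.process_ra("fe80::1", 0, now=20)  # router withdraws itself
print(routers.routers(now=21))            # []
```

The 16-bit lifetime field is also where the 18.2-hour maximum comes from: 65,535 seconds is about 18.2 hours.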

An RA also contains a Reachable Time and a Retransmit Timer. The Reachable Time informs hosts how long to assume a neighbor is alive after receiving a reachability confirmation from that neighbor. This information is used in the Neighbor Unreachability detection process. The Retransmit Timer is the time, in milliseconds, between subsequent Neighbor Solicitation messages. It is used in the address resolution and the Neighbor Unreachability detection processes.

Two bits found in the RA packet, the Managed Address (M) bit and the Other Stateful Configuration (O) bit, inform a host how it should configure itself. If the M bit is set, the host configures its address using the stateful autoconfiguration protocol, such as DHCP, in addition to any addresses configured with stateless autoconfiguration. If the O bit is set, hosts use the stateful autoconfiguration protocol to configure other information besides the address. IPv4 hosts are manually configured to indicate whether they should learn their IP configuration information via DHCP. Automatically providing this information to hosts on a link via router advertisements minimizes the amount of static configuration information contained in hosts, easing future reconfiguration efforts. The autoconfiguration methods are discussed in the section "Autoconfiguration."
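As a rough illustration, a host's reading of the two bits might look like the following. The bit positions (M as the high-order bit of the RA flags octet, O as the next bit) are taken from the RA format in RFC 2461 rather than from this text, and the function and key names are hypothetical.

```python
M_FLAG = 0x80  # Managed Address configuration bit (assumed position)
O_FLAG = 0x40  # Other Stateful Configuration bit (assumed position)

def configuration_methods(ra_flags: int) -> dict:
    """Decode how the RA tells a host to configure itself."""
    return {
        # M set: obtain addresses statefully, in addition to any
        # addresses configured with stateless autoconfiguration.
        "stateful_addresses": bool(ra_flags & M_FLAG),
        # O set: obtain non-address information statefully.
        "stateful_other_info": bool(ra_flags & O_FLAG),
    }

print(configuration_methods(0x40))
# {'stateful_addresses': False, 'stateful_other_info': True}
```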

The options that may be present in the RA are the source link layer address, the MTU, and prefix information. Including the source link layer address of the router in the RA eliminates the need for hosts to perform the address resolution protocol on default routers. A router may elect not to include the link layer address. The MTU option enables centralized control of the MTU that hosts on a link use. This option is used mainly for links with a variable MTU but may be used on other links. The value is set in the router, which then enables the configuration of all the hosts on the link. The prefix information is used to inform other nodes of on-link prefixes and for address autoconfiguration. A host that knows of all the prefixes that are configured on a link forwards traffic more knowledgeably. A multihomed host can choose the closest interface to any known on-link destination prefix. A nonmultihomed host uses the prefix list to assist in next-hop detection.

The prefix information option contains data that is used for both on-link determination and stateless autoconfiguration. It contains the actual prefix and the length of the prefix, which ranges from 0 to 128 bits. It also contains bits that indicate whether the prefix is to be used for on-link determination or for address configuration. When the L bit is set, you can use the prefix for on-link determination. When it is not set, the advertisement makes no statement about whether the prefix is on-link or off-link. The A bit, when set, indicates that you can use the prefix for stateless address configuration.

The prefix option also contains a Valid Lifetime value and a Preferred Lifetime value. The Valid Lifetime indicates, in seconds, how long a prefix is valid for purposes of on-link determination. The lifetime is relative to the time the packet was sent. An advertised Valid Lifetime value of zero indicates that the prefix is no longer valid. The Preferred Lifetime is the number of seconds that the address automatically configured from the prefix can remain "preferred." A preferred address on an interface is one that any node can actively use for communication. A Preferred Lifetime of zero means the addresses configured with the prefix must be deprecated. A deprecated address is one that is used to maintain existing connections, but it should not be used to initiate new connections if a preferred address exists. A lifetime of all ones indicates infinity. You can use a prefix for both on-link determination and configuration. The two types of addresses are discussed further in the section "Autoconfiguration."
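The effect of the two lifetimes on an autoconfigured address can be sketched as a function of elapsed time. The function name, the state labels, and the sample lifetimes are illustrative, and the sketch assumes 32-bit lifetime fields, so "all ones" is 0xFFFFFFFF.

```python
INFINITE = 0xFFFFFFFF  # a lifetime of all ones means infinity (32 bits assumed)

def address_state(elapsed: int, preferred_lifetime: int, valid_lifetime: int) -> str:
    """Classify an address by the seconds elapsed since the RA was sent."""
    if valid_lifetime != INFINITE and elapsed >= valid_lifetime:
        return "invalid"      # the prefix is no longer valid
    if preferred_lifetime != INFINITE and elapsed >= preferred_lifetime:
        return "deprecated"   # keep existing connections, avoid new ones
    return "preferred"        # free to use for any communication

# Hypothetical lifetimes: preferred for 7 days, valid for 30 days.
PREFERRED, VALID = 7 * 86400, 30 * 86400
print(address_state(86400, PREFERRED, VALID))       # preferred
print(address_state(10 * 86400, PREFERRED, VALID))  # deprecated
print(address_state(31 * 86400, PREFERRED, VALID))  # invalid
```

With this model, advertising a Preferred Lifetime of zero deprecates the address at once, and a Valid Lifetime of zero invalidates the prefix at once, matching the description above.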

Neighbor Solicitation

Neighbor Solicitation messages are used to obtain the link layer address of a neighbor, as well as to provide link layer addresses and to verify the reachability of a neighbor. An NS is an ICMP packet of type 135. The source address of the IP packet is an address assigned to the soliciting node's interface, typically its link-local address, or the unspecified address during duplicate address detection. The destination is the solicited-node multicast address associated with the target IP address in the case of link layer determination, and the unicast address of the target in the case of reachability verification. The hop limit is 255. As in the RA, a hop limit of 255 in the received NS ensures that the packet has not traversed a router. If the packet had traversed a router, the hop limit would be some value less than 255. A field indicating the target address is also included in the NS.

The source link layer address option may be included in the NS. If the NS is attempting to find a target link layer address, and the NS is therefore multicast on the link, the source link layer address must be included in the packet. This inclusion minimizes the occurrence of address resolution packets on the link.
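The solicited-node multicast address used as the NS destination can be computed from the target address. The sketch below assumes the standard mapping, which this section does not define: the low-order 24 bits of the target are appended to the prefix FF02::1:FF00:0/104. The function name is illustrative.

```python
import ipaddress

def solicited_node_multicast(target: str) -> ipaddress.IPv6Address:
    """Map a unicast target to its solicited-node multicast address."""
    low24 = int(ipaddress.IPv6Address(target)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)

# Eagle's link-local address from Example 8-2.
print(solicited_node_multicast("FE80::200:CFF:FE76:5B7C"))  # ff02::1:ff76:5b7c
```

Only nodes whose addresses share those 24 bits listen to the group, so a multicast NS disturbs far fewer nodes than an IPv4 ARP broadcast.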

Neighbor Advertisement

A Neighbor Advertisement is sent in response to an NS or is unsolicited to immediately propagate new information, such as a change in a node's link layer address. NA is an ICMP packet of type 136. The source address is any valid unicast address assigned to the sending interface. For solicited advertisements, the destination is the source address of the solicitation, or, if the solicitation's source address is the unspecified address, it is the all-nodes multicast address. Unsolicited advertisements are typically sent to the all-nodes multicast address. The NA contains a Solicited flag (S) bit. It is set when the NA is in response to an NS. The hop limit is 255. The target address is the same target address from the solicitation. This is the address for which a link layer address is sought. For an unsolicited advertisement, this is the IP address whose link layer address has changed. The NA may include the target link layer address option. Unsolicited advertisements sent to inform nodes of the advertiser's new link layer address include this option with the value of the new link layer address.

The solicited NA is analogous to the IPv4 ARP reply. The unsolicited NA, however, is an added feature. One NA multicast to the all-nodes address, informing other nodes of a link layer address change, replaces many ARP requests and replies broadcast on an IPv4 network when ARP caches time out and a new link layer address is sought for a well-used device.

Redirect

Routers send Redirect messages to inform a host of a better first hop to the destination. The better first hop could be a different router or it could be the destination itself. If the destination is a neighbor of the source, even if the source and destination nodes belong to different prefixes, the router can redirect the traffic so that they communicate directly (an enhancement of IPv4 ICMP). IPv4 ICMP Redirect messages are sent by a router when an alternative router on the same link as the source host has a better path to the destination host or network. It does not redirect traffic if the better first hop is the destination itself. This feature enables hosts on the same data link but assigned different prefixes to communicate directly, without having to hop through a router.

The Redirect message's source address is the link-local address of the router. The destination is the source address of the redirected packet. The hop limit is 255.

The target IP address and the destination address also are included in the ICMP packet. If the better first hop is a router, the target address is the link-local address of that router. If the better first hop is the actual destination, the target address is the IP address of that destination. The ICMP destination address is the destination IP address of the traffic being redirected. Note that if the better first hop is the destination itself, both these fields will contain the same address.

The Redirect message may contain the target link layer address option. This enables hosts to discover the link layer address without relying on address resolution.

Part of the IP packet that caused the Redirect message might be included as an option as well. The Redirect message includes as much of the IP packet as possible, without causing the Redirect packet to exceed 1280 bytes.

Next-Hop Discovery

A host that has a packet to send must first determine what next hop to use. If a packet was previously sent to the destination, the next hop might be stored in a destination cache. If this is the first packet to a destination, the next hop is discovered by comparing the destination address with the host's on-link prefix list. A packet to an on-link destination is sent directly to that destination node. An off-link destination is sent to a default router. An IPv4 node, however, must send all traffic destined to a subnet other than its own to a router. If the destination is on the same link as the source, but on a different subnet, the router forwards the traffic back onto the link. The traffic traverses the link twice.
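The lookup order described above (destination cache first, then the on-link prefix list, then a default router) can be sketched as follows. The function name and the sample prefixes are hypothetical, and returning None when no default router is known is a simplification of this sketch, not behavior taken from the text.

```python
import ipaddress

def next_hop(destination, destination_cache, on_link_prefixes, default_routers):
    """Return the next-hop address for a destination, caching the result."""
    dest = ipaddress.IPv6Address(destination)
    if dest in destination_cache:
        return destination_cache[dest]  # a packet was sent here before
    if any(dest in ipaddress.IPv6Network(p) for p in on_link_prefixes):
        hop = dest  # on-link: send directly to the destination node
    elif default_routers:
        hop = ipaddress.IPv6Address(default_routers[0])  # off-link
    else:
        return None  # simplification: no default router is known
    destination_cache[dest] = hop
    return hop

cache = {}
prefixes = ["2001:db8:1::/64", "2001:db8:2::/64"]  # two prefixes on one link
print(next_hop("2001:db8:2::5", cache, prefixes, ["fe80::1"]))     # 2001:db8:2::5
print(next_hop("2001:db8:ffff::1", cache, prefixes, ["fe80::1"]))  # fe80::1
```

A host addressed in 2001:db8:1::/64 reaches 2001:db8:2::5 directly because both prefixes are in its on-link list, which is the case where an IPv4 host would have sent the traffic through a router.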

Whether the next hop is the destination itself or a default router, the link layer address of the next hop must be identified.

Address Resolution

Address resolution is performed by nodes looking for a link layer address associated with a known IP address. The address resolution process uses Neighbor Solicitation and Neighbor Advertisement. A node with packets to send to a destination IP address first checks its neighbor cache to see whether an entry already exists. If it does not, the node creates an entry for the IP address, with a state of INCOMPLETE. The node then sends a Neighbor Solicitation to the solicited-node multicast address of the IP address in question. The source address of the solicitation is a unicast address and is either the source address of the node initiating the traffic or the source address of a router searching for the destination on a link remote from the source node. The packet also includes the source link-level address, if one is available.

A node that receives a Neighbor Solicitation from a unicast address, destined to an address that is assigned to its interface, responds with a Neighbor Advertisement indicating its own link-level address.

When the soliciting node receives a responding Neighbor Advertisement, it updates its neighbor cache entry with the target's link-level address and changes its state from INCOMPLETE to REACHABLE.

NOTE

For a complete description of the different possible reactions, see RFC 2461.[3]


Neighbor Unreachability Detection

If the node with which another node is communicating fails, there is little benefit in detecting the failure before the upper layers do. If a router in the path to the destination fails, however, there may be an alternative router to use, and it would be extremely helpful to be able to detect that failure before the upper-layer protocol does.

Neighbor reachability is verified in one of two ways: from hints from the upper-layer protocols or from responses to Neighbor Solicitations. Forward-direction communication must be possible for a neighbor to be reachable. Reachability is verified if forward progress is being made by an upper-layer protocol. If forward progress is being made in a TCP connection, for example, as indicated by new acknowledgements being received for data sent or by new data being received in response to a sent acknowledgement, reachability is verified. If forward progress is being made end to end, it also is being made to the next-hop router, and reachability to the router is confirmed.

Some upper-layer protocols, such as UDP, do not provide such hints. If no verification can be received from upper-layer protocols, the node actively probes neighbors to determine their reachability state. A node sends Neighbor Solicitations to the cached link layer address of the neighbor in question and waits for Neighbor Advertisements. A node sends a Neighbor Advertisement with the solicited bit set only if it received a Neighbor Solicitation. If a node receives a Neighbor Advertisement with the solicited bit set, the node can be certain that its neighbor received the NS that it sent, and therefore forward-direction communication exists. These probes are sent in conjunction with traffic. If no traffic is being sent to a node, no probes are sent to the node.

A neighbor cache stores information about neighbors, including the IP address, link layer address, and reachability state. Table 8-9 lists the possible reachability states.

Table 8-9. Neighbor Reachability States

State       Description
----------  -----------
INCOMPLETE  Address resolution is in progress. An NS has been sent, but no reply has yet been received.
REACHABLE   Forward-direction communication has been verified within the past 30 seconds.
STALE       An entry in the neighbor cache has not been verified as reachable within the past 30 seconds. An unsolicited Neighbor Advertisement message will add an entry to the cache for the sender of the message, with state STALE. No action is required until traffic is sent to the STALE entry.
DELAY       No reachable verification has been received within the past 30 seconds, and a packet has been sent to the specified neighbor within the past 5 seconds. If no positive confirmation is received within 5 seconds of entering DELAY state, send an NS and change the state to PROBE.
PROBE       An NS has been sent to verify reachability. No NA has yet been received.

An entry in the neighbor cache is INCOMPLETE initially. After the link layer address for the entry has been learned, and forward-direction communication has been verified, the state changes to REACHABLE. The state remains REACHABLE as long as the forward-direction communication continues to be verified.

When no reachability confirmation is received from a REACHABLE neighbor, its state changes to STALE. An unsolicited RA or NA received from a node puts an INCOMPLETE entry into the neighbor cache, which immediately transitions to STALE. An unsolicited advertisement does not provide any information about forward communication. The entries remain STALE until traffic is sent to that neighbor.

As soon as a packet is sent to the neighbor, its state changes to DELAY, and a timer is set to 5 seconds in the neighbor cache for the entry. The packet is sent to the cached link layer address, even though it is STALE. If the timer expires before any reachability confirmation is received, the state changes to PROBE. If reachability is confirmed, the state changes to REACHABLE.

Upon entering PROBE state, an NS is sent to the cached link layer address of the neighbor. Solicitations continue to be sent every second in the absence of a response, even if no additional data packets are sent. If no response is received for 1 second after three solicitations have been sent, the entry should be deleted from the cache.
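The state changes described in the preceding paragraphs can be summarized as a small transition table. This is a sketch only: the event names are invented for illustration, timers are reduced to events, and any event not listed leaves the state unchanged.

```python
from enum import Enum

class State(Enum):
    INCOMPLETE = "INCOMPLETE"
    REACHABLE = "REACHABLE"
    STALE = "STALE"
    DELAY = "DELAY"
    PROBE = "PROBE"

DELETED = None  # marker: the entry is removed from the neighbor cache

# (current state, hypothetical event name) -> next state
TRANSITIONS = {
    (State.INCOMPLETE, "solicited_na"): State.REACHABLE,
    (State.INCOMPLETE, "unsolicited_advertisement"): State.STALE,
    (State.REACHABLE, "reachable_time_expired"): State.STALE,
    (State.STALE, "packet_sent"): State.DELAY,
    (State.DELAY, "reachability_confirmed"): State.REACHABLE,
    (State.DELAY, "delay_timer_expired"): State.PROBE,  # after 5 seconds
    (State.PROBE, "solicited_na"): State.REACHABLE,
    (State.PROBE, "probes_unanswered"): DELETED,  # three NSs, no reply
}

def step(state, event):
    """Apply one event; unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)

def replay(state, events):
    """Return the sequence of states visited while applying the events."""
    history = [state]
    for event in events:
        state = step(state, event)
        history.append(state)
        if state is DELETED:
            break
    return history

path = replay(State.INCOMPLETE, ["unsolicited_advertisement", "packet_sent",
                                 "delay_timer_expired", "solicited_na"])
print([s.value for s in path])
# ['INCOMPLETE', 'STALE', 'DELAY', 'PROBE', 'REACHABLE']
```

Keeping the transitions in a dictionary makes it easy to compare the sketch against Table 8-9 row by row.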

Example 8-3 shows output from the debug ipv6 icmp and debug ipv6 nd commands and shows a router's neighbor cache state going from INCOMPLETE to REACHABLE, through all the intermediate states. Example 8-3 also displays the output from the show ipv6 neighbor command, which displays the neighbor cache. The output of the show ipv6 neighbor command provides the IPv6 address, its age, its link layer address (if known), its state, and the interface through which it is known.

Example 8-3 debug Output Showing Neighbor Reachability State Changes
  Falcon# debug ipv6 icmp
  ICMP packet debugging is on
  Falcon# debug ipv6 nd
  ICMP Neighbor Discovery events debugging is on
  10:58:08: ICMPv6-ND: Received RA from FE80::200:CFF:FE76:5B7C on
    Ethernet0
  10:58:08: ICMPv6-ND: INCMP created: FE80::200:CFF:FE76:5B7C
  10:58:08: ICMPv6-ND: INCMP -> STALE: FE80::200:CFF:FE76:5B7C
  Falcon# show ipv6 nei
  IPv6 Address                         Age MAC Address    State Interface
  FE80::200:CFF:FE76:5B7C                2 0000.0c76.5b7c STALE Ethernet0
  11:01:13: ICMPv6: Received echo request from FE80::200:CFF:FE76:5B7C
  11:01:13: ICMPv6: Sending echo reply to FE80::200:CFF:FE76:5B7C
  11:01:13: ICMPv6-ND: STALE -> DELAY: FE80::200:CFF:FE76:5B7C
  11:01:19: ICMPv6-ND: DELAY -> PROBE: FE80::200:CFF:FE76:5B7C
  11:01:19: ICMPv6-ND: Sending NS for FE80::200:CFF:FE76:5B7C on Ethernet0
  11:01:19: ICMPv6-ND: Received NA for FE80::200:CFF:FE76:5B7C on Ethernet0
    from FE80::200:CFF:FE76:5B7C
  11:01:19: ICMPv6-ND: PROBE -> REACH: FE80::200:CFF:FE76:5B7C
  Falcon# show ipv6 nei
  IPv6 Address                         Age MAC Address    State Interface
  FE80::200:CFF:FE76:5B7C                0 0000.0c76.5b7c REACH Ethernet0

Falcon receives an RA from Eagle's link-local address FE80::200:CFF:FE76:5B7C. An INCOMPLETE entry is created in Falcon's cache, which immediately turns STALE, because the RA is unsolicited. At this point, the neighbor cache is queried. The entry does indeed say the address is STALE. Eagle's link layer address is known.

A couple of minutes later, Eagle pings Falcon, as shown by the received echo request. Falcon replies to Eagle, sending the echo reply to the stored link layer address. Because the router sends a packet to a STALE entry, however, it must change the state to DELAY to see whether it can verify the forward-direction communication path. The router cannot verify this with ICMP packets, so it changes the state to PROBE and sends an NS to see whether it can get reachability verification by probing Eagle. Eagle sends an NA. The debug does not show that the solicited bit is set in the NA. After receiving the NA and verifying communication, Falcon changes the state of Eagle's entry to REACH.

The neighbor unreachability detection process enables a host to redirect traffic to an alternative router if its default router fails. It detects the failure of the default router and then chooses another router to which to forward its traffic. Potentially, this can all occur before the upper-layer protocol or application times out. IPv4 hosts might never detect that the default router has failed. An upper-layer protocol or application will time out if the router fails. The IPv4 host will likely attempt to use the dead router to reestablish a connection. Some IPv4 hosts might know of multiple default routers and could choose the second router through which to reestablish the connection.

Default Router Selection

A host chooses one router (out of possibly many) from its default router list when the destination is off-link and there is no existing cached entry for the destination or when an existing default router appears to be failing. Normally, a default router is chosen the first time traffic to a particular destination requires it. The information is cached and used for subsequent traffic.

The default router selection process uses the default router list and the neighbor cache. Any router that is not known to be unreachable (that is, any router not in the INCOMPLETE state) has preference when becoming the default router. If multiple routers are in any state other than INCOMPLETE, the router selection process either returns the same router or returns routers from this list in a round-robin fashion, depending on the implementation.
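One possible shape for the selection logic is sketched below. The function name and the start parameter are hypothetical, and treating a router with no neighbor cache entry the same as an INCOMPLETE one is an assumption of the sketch.

```python
def select_default_router(default_routers, neighbor_states, start=0):
    """Pick a default router, preferring any state other than INCOMPLETE.

    default_routers: ordered list of router addresses
    neighbor_states: router address -> state name from the neighbor cache
    start: index at which to begin scanning (advance it for round-robin)
    """
    count = len(default_routers)
    order = [default_routers[(start + i) % count] for i in range(count)]
    for router in order:
        # Assumption: a router with no cache entry counts as INCOMPLETE.
        if neighbor_states.get(router, "INCOMPLETE") != "INCOMPLETE":
            return router
    # No preferred router exists; fall back to the first candidate, if any.
    return order[0] if order else None

routers = ["fe80::1", "fe80::2", "fe80::3"]
states = {"fe80::1": "INCOMPLETE", "fe80::2": "STALE", "fe80::3": "REACHABLE"}
print(select_default_router(routers, states))           # fe80::2
print(select_default_router(routers, states, start=2))  # fe80::3
```

An implementation that always returns the same router would call the function with a fixed start, and a round-robin implementation would advance start after each selection.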

If a next-hop router appears to be failing, the neighbor unreachability detection process will detect it. If it indeed has failed, the router entry is deleted from the neighbor cache. Next-hop detection and address resolution are repeated, and an available next-hop router is used.

Case Study: Default Router Failure and Communication Recovery

A host transfers a file from a remote server using FTP. The host sends the traffic to its on-link default router. The host continues to receive ACKs for data sent, so the host knows that its default router must be reachable. In mid-session, the router fails. The host stops receiving ACKs. The host can no longer verify forward-direction communication through hints from the TCP layer, so it changes the router's state to STALE. It still attempts to send packets, so the state changes to DELAY. After 5 seconds, the host still has not received positive confirmation of the router's reachability state, so it changes the state to PROBE and sends NS. The router does not respond and therefore gets deleted from the host's neighbor cache.

The host still tries to send packets, but it no longer has a next-hop entry to which to send them. It sees that the prefix of the destination is off-link and retrieves a default router from its stored list. It puts the router into its neighbor cache with an INCOMPLETE state, if an entry does not already exist, and attempts to resolve its link-layer address by sending an NS. When the new router responds with an NA, positive reachability is confirmed, and traffic begins flowing through the new router.
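The state changes in this case study can be traced with a small transition table. The sketch below is a simplification: the event names and the `DELETED` marker are hypothetical labels chosen for illustration, and timers are represented only as events.

```python
DELETED = "DELETED"  # hypothetical marker for an entry removed from the cache

# (current state, event) -> next state; event names are illustrative only
TRANSITIONS = {
    ("REACHABLE", "no_confirmation"): "STALE",    # upper-layer hints stop arriving
    ("STALE", "packet_sent"): "DELAY",            # host still has traffic to send
    ("DELAY", "delay_expired"): "PROBE",          # 5 seconds pass, NS is sent
    ("PROBE", "probes_unanswered"): DELETED,      # router never answers
    ("INCOMPLETE", "na_received"): "REACHABLE",   # address resolution succeeds
}
CONFIRMABLE = {"REACHABLE", "STALE", "DELAY", "PROBE"}

def next_state(state, event):
    """Return the neighbor cache state after one event."""
    if event == "reachability_confirmed" and state in CONFIRMABLE:
        return "REACHABLE"
    return TRANSITIONS.get((state, event), state)

def run(state, events):
    """Apply a sequence of events, stopping once the entry is deleted."""
    for event in events:
        if state == DELETED:
            break
        state = next_state(state, event)
    return state
```

Feeding the failed router's entry the events `no_confirmation`, `packet_sent`, `delay_expired`, and `probes_unanswered` walks it through STALE, DELAY, and PROBE to deletion, while the replacement router's entry moves from INCOMPLETE to REACHABLE on `na_received`.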

Duplicate Address Detection

All nodes perform duplicate address detection before assigning a unicast address to an interface, including on an initializing interface. Detection is performed regardless of whether the address is assigned via stateless, stateful, or manual configuration. It is not performed for anycast addresses. The address to be assigned to the interface is called "tentative" while the duplicate address detection process is taking place.

Before sending a solicitation, the interface joins the all-nodes multicast group to ensure that the node receives Neighbor Advertisements from any node already using the address and joins the solicited-node multicast group for the tentative address to ensure that if another node is attempting to begin using the address, both nodes will learn of each other's presence.

The node sends a Neighbor Solicitation message, with the tentative IP address as the target. The source address is the unspecified address, and the destination is the tentative address's solicited-node multicast address. By default, one solicitation is sent.

Any neighbor that is already assigned the address receives the solicitation and sends a Neighbor Advertisement in reply. The target specified in the advertisement is the tentative address. The destination address is the all-nodes multicast address, because the solicitation came from the unspecified address and cannot be answered by unicast. If a node receives this Neighbor Advertisement, and the target address is the interface's tentative address, the address is a duplicate and must not be assigned to the interface. Some IPv4 hosts perform a duplicate address detection process before assigning an IP address to an interface. Not all do, however, allowing an interface with a duplicate address to potentially disrupt existing traffic flows.
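The addressing of the duplicate address detection solicitation can be illustrated with Python's standard `ipaddress` module. The solicited-node multicast address is formed by appending the low-order 24 bits of the tentative address to the prefix FF02::1:FF00:0/104; that layout comes from the IPv6 addressing architecture rather than from this section. The function names are hypothetical.

```python
import ipaddress

def solicited_node_multicast(address):
    """Return the solicited-node multicast address for a unicast address."""
    low24 = int(ipaddress.IPv6Address(address)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)

def dad_solicitation(tentative):
    """Describe the Neighbor Solicitation sent for a tentative address."""
    return {
        "source": "::",  # the unspecified address
        "destination": str(solicited_node_multicast(tentative)),
        "target": str(ipaddress.IPv6Address(tentative)),
    }

print(dad_solicitation("FE80::200:CFF:FE0A:2C51"))
```

For the link-local address used in this chapter's examples, the solicitation is therefore sent to ff02::1:ff0a:2c51.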

Autoconfiguration

Because network manageability is so crucial to the success of any network, processes to facilitate it need to be built in to the protocol. Networks with hosts that have static configurations, manually entered, are difficult to manage when changes are necessary. Many tools ease the management burden of IPv4 networks, such as DHCP to minimize the amount of static configuration, but they are not required elements of the protocol. IPv6 nodes can automatically configure themselves, with or without the help of a DHCP server, making host configuration changes much easier.

Router Advertisements are used to tell hosts how to configure themselves. The RA contains two bits that tell the hosts whether to use a configuration server and, if so, whether information other than addresses should be obtained from the server. The Managed Address Configuration (M) bit, if set, tells hosts to use a stateful address configuration protocol, such as DHCP, to configure their addresses. Stateless autoconfiguration of addresses also occurs on the host. The Other Stateful Configuration (O) bit tells the host to use the stateful configuration protocol to configure information other than the address. IPv4 hosts, on the other hand, are statically configured to use DHCP with a specific DHCP server if the IP address and other configuration are to be obtained dynamically. Otherwise, the configuration is all entered manually.
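As a rough illustration of how a host might read these two bits, the sketch below extracts them from the raw bytes of an ICMPv6 Router Advertisement. It assumes the standard RA layout, in which the flags byte follows the Type, Code, Checksum, and Cur Hop Limit fields; the function name is hypothetical.

```python
RA_TYPE = 134   # ICMPv6 type for Router Advertisement
M_FLAG = 0x80   # Managed Address Configuration
O_FLAG = 0x40   # Other Stateful Configuration

def ra_config_flags(icmpv6_message):
    """Return the M and O bits from an ICMPv6 Router Advertisement."""
    if len(icmpv6_message) < 16 or icmpv6_message[0] != RA_TYPE:
        raise ValueError("not a Router Advertisement")
    flags = icmpv6_message[5]  # byte after type, code, checksum, cur hop limit
    return {"managed": bool(flags & M_FLAG), "other": bool(flags & O_FLAG)}
```

A host that sees `managed` set would start stateful address configuration in addition to the stateless process; one that sees only `other` set would use the server for information such as DNS server addresses.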

Stateless Autoconfiguration

Through a combination of what a node knows (its interface identifier) and what a router knows (the prefixes assigned to a link), a node can configure its own IP address. No server is needed to establish basic IP connectivity. This works on any multicast-capable interface.

Upon interface initialization, a node generates a link-local address for that interface. The link-local address is the interface's identifier concatenated with the well-known link-local prefix FE80::. The rightmost zeros of the link-local prefix are replaced with the interface ID, forming a 128-bit address. Note that interface IDs are typically 64 bits, but not always.

Link-local prefix FE80:0:0:0:0:0:0:0 and interface ID 200:CFF:FE0A:2C51 form link-local address FE80:0:0:0:200:CFF:FE0A:2C51.

If the interface ID is more than 118 bits long, it cannot be concatenated with the link-local FP, which is 10 bits long. The autoconfiguration will fail, and the interface will have to be configured manually.
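The construction can be reproduced in a short sketch. It assumes the interface ID is derived from a 48-bit MAC address using the modified EUI-64 method (FFFE inserted in the middle and the universal/local bit inverted). The MAC address 00:00:0c:0a:2c:51 used in the usage line is inferred from the interface ID shown above and is not stated in the text; the function names are hypothetical.

```python
import ipaddress

def eui64_interface_id(mac):
    """Derive a 64-bit interface ID from a 48-bit MAC address."""
    octets = bytes.fromhex(mac.replace(":", "").replace("-", "").replace(".", ""))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Invert the universal/local bit and insert FFFE between the two halves.
    eui = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]
    return int.from_bytes(eui, "big")

def link_local_address(interface_id, id_bits=64):
    """Replace the rightmost zeros of FE80:: with the interface ID."""
    if id_bits > 118:
        raise ValueError("interface ID does not fit after the 10-bit prefix")
    if interface_id >= 1 << id_bits:
        raise ValueError("interface ID is longer than id_bits")
    prefix = int(ipaddress.IPv6Address("fe80::"))
    return ipaddress.IPv6Address(prefix | interface_id)

print(link_local_address(eui64_interface_id("00:00:0c:0a:2c:51")))
```

The `id_bits` check mirrors the 118-bit limit described above; with the usual 64-bit interface ID it never triggers.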

The node does not immediately assign the generated link-local address to the interface. First, it must determine whether a duplicate address exists. The node initiates the duplicate address detection process.

A node that learns that its generated address is not unique must be configured manually. One way to configure the node is to configure an alternate interface ID. This way, the node can still participate in the stateless autoconfiguration process and automatically configure each of its required addresses plus any assigned unicast and multicast addresses. The alternative to configuring an interface ID is to manually configure IPv6 addresses on the interface. Such a configuration could be a large administrative task, given the number of addresses that must be configured on the interface.

When the node is satisfied that no duplicate address exists, it assigns the address to the interface.

At this point, basic IP level connectivity exists. IPv6 hosts on a link with no router can now communicate with each other. No manual network layer configuration is required in the hosts to enable this communication.

Example 8-4 shows the minimal basic configurations for both Falcon and Eagle.

Example 8-4 Minimal Basic Configurations for Falcon and Eagle to Enable IPv6 Communication
  Falcon
  ipv6 unicast-routing
  !
  interface Ethernet0
   ipv6 enable
  _______________________________________________________________________
  Eagle
  ipv6 unicast-routing
  !
  interface Ethernet0
   ipv6 enable

Pinging Falcon's link-local address from Eagle shows that communication exists, as demonstrated by the output in Example 8-5.

Example 8-5 Verifying Communication Between Falcon and Eagle from Falcon's Link-Local Address
  Eagle# ping ipv6 fe80::200:cff:fe0a:2c51
  Type escape sequence to abort.
  Sending 5, 100-byte ICMP Echos to FE80::200:CFF:FE0A:2C51, timeout is 2 seconds:
  !!!!!
  Success rate is 100 percent (5/5), round-trip min/avg/max = 60/68/80 ms

Both routers and hosts perform all the steps of the stateless autoconfiguration process discussed so far, to enable the basic IP connectivity. Every interface must create a link-local address. Duplicate address detection is performed for all unicast addresses prior to assigning them to an interface, regardless of whether the IPv6 address is configured via stateless autoconfiguration, stateful autoconfiguration, or manually, except as discussed in the following paragraph.

Hosts, not routers, continue the autoconfiguration process. The host sends an "all-routers" multicast solicitation to find a router on the link. All routers respond with Router Advertisements. The RA may tell the host to use stateful autoconfiguration to configure addresses and other information. The host uses the prefix information marked for address configuration to create a site-local address. To create a site-local address, the site-local FP, the prefix, and the interface ID are concatenated. A host is not required to perform the duplicate address detection process when assigning its site-local address. The theory is that the process just verified that the link-local address is unique. This means that the interface identifier is unique to the link. Because the site-local address assigns a different prefix to the same interface identifier, the site-local address is also unique. Globally aggregatable addresses are generated and assigned using the same method.

The RA also provides on-link prefix information. This information is a list of prefixes and prefix lengths, marked as on-link prefixes, that the host uses to build its prefix list. The prefix list is used by the host to determine whether a destination node is on-link or off, and therefore whether it needs to use a default router to send the traffic.
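The on-link decision amounts to a prefix match against the prefix list, as the following sketch shows. The function name is hypothetical, and the sketch treats link-local destinations as always on-link, which is how Neighbor Discovery handles the link-local prefix.

```python
import ipaddress

def is_on_link(destination, prefix_list):
    """Return True if the destination matches a prefix in the host's prefix list."""
    dest = ipaddress.IPv6Address(destination)
    if dest.is_link_local:
        return True  # the link-local prefix is always considered on-link
    return any(dest in ipaddress.IPv6Network(prefix) for prefix in prefix_list)

prefixes = ["fec0::/64", "fec0:0:0:2::/64"]
print(is_on_link("fec0:0:0:2:200:cff:fe76:5b7c", prefixes))  # on-link, send directly
print(is_on_link("fec0:0:0:3::1", prefixes))                 # off-link, use a default router
```

A destination that fails the match is handed to the default router selection process described earlier.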

A host can also configure its MTU size based on information contained in the RA.

Stateful autoconfiguration is required to configure other information, such as the DNS server.

Use the following interface subcommands to configure a router to advertise a prefix with specific values, to set the managed configuration flag, and to set the other configuration flag:

  ipv6 nd prefix-advertisement 2001:ABAB::/48 3000 3000 onlink autoconfig
  ipv6 nd managed-config-flag
  ipv6 nd other-config-flag

The prefix 2001:ABAB::/48 is advertised with a Valid Lifetime of 3000 seconds and a Preferred Lifetime of 3000 seconds, to be used as both an on-link advertisement and autoconfiguration.

The autoconfiguration process occurs on each interface of a node whenever the interface becomes enabled. A multihomed node performs autoconfiguration on each interface independently. An interface is enabled upon the following:

  • Initialization of the interface at system startup

  • The interface is re-enabled after an interface failure or after being temporarily disabled by a system administrator

  • The interface attaches to a link for the first time

  • The interface becomes enabled after being administratively down

Stateful Autoconfiguration

Stateful autoconfiguration may be used in conjunction with stateless autoconfiguration. DHCP provides stateful autoconfiguration for IPv4. A modified DHCP for IPv6 implementation could take advantage of a number of IPv6 features, enhancing the capabilities of DHCP.

NOTE

The Dynamic Host Configuration working group has published a draft for DHCP for IPv6 titled "draft-ietf-dhc-dhcpv6-15.txt."


A configuration server allocates addresses and other information, such as DNS server address, to requesting hosts. The addresses are associated with a Valid and Preferred Lifetime, just as are the prefixes used for stateless autoconfiguration. A server with the capability to request that all hosts revalidate their assigned addresses can use the lifetime values to renumber networks.

Renumbering

Site renumbering will still occur, even with the abundance of IP addresses. Address prefixes are strictly maintained, and an address assigned to a site might need to be recalled occasionally. Or the site might want to change ISPs, which would require a prefix change, just as it does today with IPv4 if a company changes ISPs. IPv6 was not designed to eliminate this phenomenon, but it is designed to make renumbering easier.

An address is in one of two states: preferred or deprecated. A host should always attempt to communicate using a preferred address. A deprecated address should be used only as the source address if using a preferred address will cause an existing connection to be disrupted. If two hosts have a TCP connection established using preferred addresses, for example, and one host's address changes to deprecated, if the host switches to a new preferred address, the connection will fail.

Host renumbering is simplified by the use of preferred and deprecated addresses. The time that an address remains preferred is set and modified in Router Advertisements, which are periodically sent on the link and are processed by every node on the link. New prefixes can be added to the Router Advertisements, thus adding new addresses to the interfaces, and old ones can be deprecated and removed. A similar mechanism can be used to renumber hosts using a configuration server. The server may multicast a request to all nodes, asking them to reconfirm their assigned addresses. The hosts will query the configuration server and obtain addresses with modified lifetime values, deprecating existing addresses or assigning new preferred addresses. The robustness of this renumbering mechanism depends on the Router Advertisements and stateful messages reaching all hosts on a link. Consider the following case study taken from RFC 2461.

Case Study: Renumbering a Network

A prefix is advertised with a lifetime of two months. On August 1, it is determined that the prefix must be changed and not used by September 1. The prefix advertisement can be changed so that its lifetime is two weeks, and then made smaller as the date approaches September 1, until the prefix is eventually advertised with a lifetime of zero, thereby invalidating the address. Consider, however, that a host is disconnected from the network on July 31. If it is plugged in again after September 1, it still thinks the old prefix is valid until September 30. The only way to force a host to discontinue using a prefix that was previously advertised with a long lifetime is to send an RA with a shorter lifetime. The routers must continue to send the RA with the lifetime value of zero until October 1 to ensure that any host that is disconnected before the change and reconnected before its two-month lifetime expiration does not use the invalid prefix.

In general, a router should continue to advertise a zero lifetime until such a time as any host that was disconnected, when reconnected, will not use an old prefix. Note that infinite lifetimes advertised by routers cause a problem when trying to renumber links and when hosts are connected and disconnected frequently.
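The dates in the case study follow from simple arithmetic, sketched below. The year (2001) and the treatment of "two months" as 61 days are assumptions made only so that the calculation can be run; the function names are hypothetical.

```python
from datetime import date, timedelta

def stale_prefix_expiry(last_ra_seen, valid_lifetime_days):
    """Date until which a disconnected host still considers the prefix valid."""
    return last_ra_seen + timedelta(days=valid_lifetime_days)

def advertise_zero_until(last_long_lifetime_ra, valid_lifetime_days):
    """First date on which routers can stop advertising a zero lifetime."""
    return stale_prefix_expiry(last_long_lifetime_ra, valid_lifetime_days) + timedelta(days=1)

# Host unplugged on July 31, having last seen a two-month (61-day) lifetime.
print(stale_prefix_expiry(date(2001, 7, 31), 61))   # 2001-09-30
print(advertise_zero_until(date(2001, 7, 31), 61))  # 2001-10-01
```

The same calculation shows why an infinite lifetime is troublesome: there is no date after which the zero-lifetime advertisement can safely be withdrawn.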

Renumbering routers takes a lot of planning if communication is to be maintained. Routers communicate together, and hosts communicate with routers via the router's link-local address. This communication is independent of any assigned prefix. The nodes will therefore continue to communicate with the routers' link-local addresses, regardless of what global address is assigned to the link.

DNS implications are the same for IPv6 as they are for renumbering under IPv4. Either the new addresses need to be manually entered into the DNS databases prior to the new addresses being usable, or dynamically updated DNS servers (DDNS) should be implemented.

Routing

The preceding section discussed how an IPv6 node discovers the information required to forward a packet to neighbors and to next-hop routers if the destination is not on-link. Now routing issues are discussed to show different ways the IPv6 packet can be routed through a larger network.

MTU Path Discovery

The MTU is required to be at least 1280 bytes long on every link in an IPv6 network. However, the recommended size is 1500 bytes or larger. Any link that cannot handle a packet this large is required to provide link-level fragmentation. IP-level fragmentation is performed only by the source node, not by routers along the packet's path. Nodes are not required to implement MTU path discovery, but it is recommended. A node not implementing MTU path discovery uses an MTU equal to the minimum IPv6 MTU, 1280 bytes. A source node that implements MTU path discovery can take advantage of the largest possible packet, and possibly gain higher performance. Path discovery works for both unicast and multicast destinations.

MTU path discovery utilizes ICMP Packet Too Big error messages. A node sending traffic initially assumes the path MTU (PMTU) is equal to the MTU of its attached link. Any node along the delivery path that detects that it cannot deliver the packet over a link with a smaller MTU sends a Packet Too Big ICMP error message, which includes the size of its link MTU, and drops the big packet. The source node receives the ICMP error and reduces the size of the packets it is sending to the MTU value included in the error message. The process is likely to be repeated with nodes further down the delivery path. Figure 8-14 demonstrates the PMTU discovery.

Figure 8-14. PMTU Discovery Process

graphics/08fig14.gif

The Token Ring-connected PC begins by sending a packet of size 4500 B. The packet reaches a router with an MTU of 1500 B on the link to the delivery path. The router sends an ICMP Packet Too Big message back to the host, includes its MTU of 1500 B, and drops the original packet. The PC creates a smaller packet, of size 1500 B. The first router passes it on. The next router's link to the delivery path has an MTU of 1280 B. It sends an ICMP Packet Too Big message back to the host and drops the packet. The PC then sends a packet of size 1280 B, which is forwarded through both routers.
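The exchange in Figure 8-14 can be simulated in a few lines. This is a simplified sketch: each element of `downstream_link_mtus` stands for the MTU of one link along the delivery path, and a Packet Too Big message is modeled simply as returning that link's MTU to the source. The function name is hypothetical.

```python
IPV6_MIN_MTU = 1280

def discover_path_mtu(first_hop_mtu, downstream_link_mtus):
    """Return the discovered PMTU and the list of packet sizes tried."""
    if any(mtu < IPV6_MIN_MTU for mtu in downstream_link_mtus):
        raise ValueError("every IPv6 link must support at least 1280 bytes")
    pmtu = first_hop_mtu          # initial assumption: MTU of the attached link
    attempts = [pmtu]
    while True:
        # The first link too small for the packet triggers Packet Too Big.
        too_big = next((mtu for mtu in downstream_link_mtus if mtu < pmtu), None)
        if too_big is None:
            return pmtu, attempts  # the packet reached the destination
        pmtu = too_big             # adopt the MTU reported in the error
        attempts.append(pmtu)

print(discover_path_mtu(4500, [1500, 1280]))  # (1280, [4500, 1500, 1280])
```

The loop terminates because each Packet Too Big message strictly reduces the packet size, and no link may report less than the IPv6 minimum.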

MTU path discovery works with multicast destination addresses as well as unicast. A multicast packet branches off into many paths. Any node along any path may send the Packet Too Big message. The minimum value of the set of PMTUs determines the size of the packets sent.

RIPng

RIPng (ng stands for "next generation") is based on RIP version 2 (RIP-2). None of the operational procedures, timers, or stability functions have been changed. RIPng is RIP-2, modified to support the larger IP addresses and multiple addresses on each interface of IPv6. The UDP port number for RIPng is 521. RIPng does not support both IPv4 and IPv6 and is therefore not backward-compatible with RIP-2.

NOTE

Chapter 7, "Routing Information Protocol Version 2," of Routing TCP/IP, Volume I, discusses RIP version 2.


Figure 8-15 shows the RIPng message format. The basic structure is very similar to RIP-2.

Figure 8-15. RIPng Message Format

graphics/08fig15.gif

The RIPng message fields are defined as follows (with lengths shown in bytes):

  • Command is set to either 1, signifying a request, or 2, signifying a response.

  • Version is currently 1.

The rest of the message contains the list of route table entries (RTEs). Figure 8-16 shows the format of the RTEs.

Figure 8-16. RIPng Route Table Entry Format

graphics/08fig16.gif

The fields in the RTE format are defined as follows:

  • IPv6 Prefix is the 128-bit IPv6 address prefix.

  • Route Tag is identical to RIP-2, which provides a field for tagging external routes or routes that have been redistributed into the RIPng process.

  • Prefix Length specifies the significant part of the address prefix.

  • Metric is the same as in RIP-2, a hop count value between 1 and 15, inclusive.

The number of routes that a RIPng update can contain depends on the link MTU, the number of octets of header information preceding the RIPng message, the size of the RIPng header, and the size of a route table entry (RTE). The formula for determining the number of RTEs in a single update is as follows:

  Number of RTEs = INT[(MTU - length of IPv6 headers - UDP header length - RIPng header length) / RTE size]


The number of RTEs directly relates to the link MTU and the length of the IP headers, UDP header, and RIPng header.
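A quick calculation makes the relationship concrete. The default sizes below are assumptions based on the standard formats: a 40-byte IPv6 header with no extension headers, an 8-byte UDP header, a 4-byte RIPng header, and a 20-byte RTE. The function name is hypothetical.

```python
def max_rtes(link_mtu, ipv6_headers_len=40, udp_header_len=8,
             ripng_header_len=4, rte_len=20):
    """Number of whole RTEs that fit in one RIPng update on a link."""
    payload = link_mtu - ipv6_headers_len - udp_header_len - ripng_header_len
    return payload // rte_len  # integer division discards a partial RTE

print(max_rtes(1500))  # 72 on an Ethernet link
print(max_rtes(1280))  # 61 on a link with the minimum IPv6 MTU
```

Any extension headers in front of the UDP header would be added to `ipv6_headers_len` and reduce the count accordingly.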

Each RIP-2 RTE contains a Next-Hop field associated with it, specifying a better next-hop address than the address of the advertising router. IPv6 addresses are so large that this would almost double the size of the RTE. RIPng specifies a single next-hop RTE that applies to all the following RTEs until the end of the message or until the existence of another next-hop RTE. The next-hop RTE in Figure 8-17 shows that the Route-Tag field and prefix field must contain all zeros. The metric value will be 0xFF. A value of 0:0:0:0:0:0:0:0 in the Address field indicates the next-hop is the originator of the RIPng advertisement.

Figure 8-17. Next-Hop RTE

graphics/08fig17.gif

The next-hop address must be the link-local address of the next-hop router. If the address is not a link-local address, the receiver of the advertisement treats the packet as if the address prefix value is 0:0:0:0:0:0:0:0.
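The RTE layout and the next-hop rules can be sketched with Python's `struct` module. The sketch assumes the 20-byte RTE layout of a 16-byte prefix, a 2-byte route tag, a 1-byte prefix length, and a 1-byte metric; the function names are hypothetical.

```python
import ipaddress
import struct

RTE_FORMAT = "!16sHBB"   # prefix, route tag, prefix length, metric (20 bytes)
NEXT_HOP_METRIC = 0xFF   # marks an RTE as a next-hop RTE

def pack_rte(prefix, route_tag=0, metric=1):
    """Encode an ordinary route table entry."""
    net = ipaddress.IPv6Network(prefix)
    return struct.pack(RTE_FORMAT, net.network_address.packed,
                       route_tag, net.prefixlen, metric)

def pack_next_hop_rte(next_hop="::"):
    """Encode a next-hop RTE; the tag and prefix length fields are zero."""
    return struct.pack(RTE_FORMAT, ipaddress.IPv6Address(next_hop).packed,
                       0, 0, NEXT_HOP_METRIC)

def effective_next_hop(rte):
    """Receiver side: a next hop that is not link-local is treated as ::."""
    addr_bytes, _tag, _length, metric = struct.unpack(RTE_FORMAT, rte)
    if metric != NEXT_HOP_METRIC:
        raise ValueError("not a next-hop RTE")
    addr = ipaddress.IPv6Address(addr_bytes)
    return addr if addr.is_link_local else ipaddress.IPv6Address("::")
```

A result of `::` from `effective_next_hop` means the receiver uses the originator of the advertisement as the next hop for the RTEs that follow.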

Periodic and triggered RIPng responses must remain local to a link; they must not traverse a router. Both periodic updates and triggered updates must have the router's link-local address as the source of the advertisement and the IPv6 hop limit equal to 255. The hop limit of 255 ensures that the advertisement has not traversed a router, because a router decrements the hop limit of every packet. The destination multicast address is the all-rip-routers multicast address FF02::9.
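A receiver can enforce these two rules with a simple check, sketched below. The function name is hypothetical, and the sketch covers only the source-address and hop-limit tests described here, not the full set of validity checks a RIPng implementation performs.

```python
import ipaddress

def is_link_local_update(source, hop_limit):
    """Accept an update only if it provably originated on the local link."""
    from_link_local = ipaddress.IPv6Address(source).is_link_local
    # A hop limit below 255 means at least one router forwarded the packet.
    return from_link_local and hop_limit == 255
```

For example, an update from FE80::200:CFF:FE76:5B7C with a hop limit of 255 passes, whereas the same update arriving with a hop limit of 254 is rejected.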

The Cisco router is capable of running multiple RIPng processes. The routing process is enabled as an interface subcommand:

  ipv6 rip tag enable

The command must be enabled on any interface addressed with a prefix that needs to be advertised in the RIPng update. Multiple processes are distinguished by the tag. Currently, up to four processes are supported. Each process must use a unique UDP port number. A single process can use the default value, 521. The port number must be modified for subsequent processes; otherwise, the new process will not start up. The global command to modify the UDP port number and the multicast address used by RIPng is as follows:

  ipv6 rip tag port udp-port multicast-group multicast-address

More than one process can use the same multicast address. If this command is not given, the default port number, 521, and the default multicast address, FF02::9, are used.

Unlike RIP-2, for which the global command router rip is required to enable the routing protocol, no global commands are required to enable RIPng.

Optional global commands control the entire RIPng process, affecting all configured interfaces. Global commands are available to disable or enable split-horizon and poison reverse, modify UDP port numbers and RIPng multicast addresses, change default timers, change the administrative distance, and redistribute static routes. Most of these functions are also available with RIP-2.

Table 8-10 lists the available global commands.

Table 8-10. RIPng Global Commands
Command                                                             Description
[no] ipv6 rip tag port udp-port multicast-group multicast-address   Configures the RIP routing process to use the specified UDP port and multicast address.
[no] ipv6 rip tag table table-number                                Assigns the specified routing table to the RIP process. Default is table 0. Note that only table 0 will be used for IPv6 unicast packet forwarding.
[no] ipv6 rip tag distance distance-value                           Sets the administrative distance for this process. Default is 120.
[no] ipv6 rip tag timers update expire holddown garbage-collect     Modifies the RIPng timers for this process. The values indicate seconds. Default values are 30 180 180 120.
[no] ipv6 rip tag redistribute static                               Advertises static routes into IPv6 as if they were directly connected.
[no] ipv6 rip tag split-horizon                                     Performs split-horizon processing of updates. This is on by default.
[no] ipv6 rip tag poison-reverse                                    Performs poison-reverse processing of updates. This is off by default.

Additional RIPng interface subcommands are also available. There are interface subcommands to initiate the advertisement of default routes on updates out the specific interface, to summarize routes advertised out the interface, to apply input and output filters to updates received or sent from the interface, and to change the metric-offset for routes received on the interface. All these functions are available with RIP-2. Table 8-11 lists the interface subcommands.

Table 8-11. RIPng Interface Subcommands
Command                                           Description
[no] ipv6 rip tag enable                          Configures RIPng routing on an interface.
[no] ipv6 rip tag default-information originate   Originates the default route (0::0/0) and includes it in updates sent from this interface.
[no] ipv6 rip tag default-information only        Originates the default route (0::0/0). Suppresses sending any routes except the default route on this interface.
[no] ipv6 rip tag summary-address prefix/length   Summarizes routing information. If the first length bits of a route match the given prefix, the prefix will be advertised instead. Multiple routes are thus replaced by a single route whose metric is the lowest metric of the multiple routes. You may use this command multiple times.
[no] ipv6 rip tag input-filter name               Applies a simple access list to RIP routing updates received on the interface.
[no] ipv6 rip tag output-filter name              Applies a simple access list to RIP routing updates generated on the interface.
[no] ipv6 rip tag metric-offset number            Changes the metric-offset of a route entering the routing table. Default is 1. Value may be between 1 and 16.

A simple network diagram along with the routers' configurations helps illustrate the minimal router configurations needed to run RIPng (see Figure 8-18).

Figure 8-18. Simple RIPng Network

graphics/08fig18.gif

RIPng is configured on both routers, on the Ethernet link and the serial link. Example 8-6 shows the router configurations.

Example 8-6 Configuring RIPng on Routers Falcon and Eagle
  Falcon
  ipv6 unicast-routing
  no ipv6 rip birdbath split-horizon
  !
  !
  interface Ethernet0
   no ip address
   no ip directed-broadcast
   ipv6 enable
   ipv6 address FEC0::/64 eui-64
   ipv6 address FEC0::1:0:0:0:0/64 eui-64
   ipv6 address FEC0::2:0:0:0:0/64 eui-64
   ipv6 rip birdbath enable
  !
  _______________________________________________________________________
  Eagle
  ipv6 unicast-routing
  no ipv6 rip birdbath split-horizon
  !
  !
  interface Ethernet0
   no ip address
   no ip directed-broadcast
   ipv6 address FEC0::/64 eui-64
   ipv6 address FEC0::2:0:0:0:0/64 eui-64
   ipv6 address FEC0::3:0:0:0:0/64 eui-64
   ipv6 rip birdbath enable
  !
  interface Serial1
   ipv6 address FEC0::A:0:0:0:1/126
   ipv6 rip birdbath enable
  !

The two routers share two common prefixes: FEC0::/64 and FEC0::2:0:0:0:0/64. Each also is configured with a third prefix. To enable the routers to advertise their noncommon prefix to each other, split-horizon has been disabled. RIPng is enabled on the Ethernet ports and on Eagle's serial1. The process name is birdbath.

Example 8-7 shows Falcon's routing table.

Example 8-7 IPv6 Routing Table Showing RIPng-Learned Routes
  Falcon# show ipv6 route
  IPv6 Routing Table - 9 entries
  Codes: C - Connected, L - Local, S - Static, R - RIP, B - BGP
  Timers: Uptime/Expires
  L FE80::/64 [0/0]
    via ::, Null0, 01:37:41/never
  L FEC0::200:CFF:FE0A:2C51/128 [0/0]
    via FEC0::200:CFF:FE0A:2C51, Ethernet0, 01:20:58/never
  C FEC0::/64 [0/0]
    via FEC0::200:CFF:FE0A:2C51, Ethernet0, 01:20:58/never
  L FEC0::1:200:CFF:FE0A:2C51/128 [0/0]
    via FEC0::1:200:CFF:FE0A:2C51, Ethernet0, 01:01:36/never
  C FEC0::1:0:0:0:0/64 [0/0]
    via FEC0::1:200:CFF:FE0A:2C51, Ethernet0, 01:01:36/never
  L FEC0::2:200:CFF:FE0A:2C51/128 [0/0]
    via FEC0::2:200:CFF:FE0A:2C51, Ethernet0, 01:00:21/never
  C FEC0::2:0:0:0:0/64 [0/0]
    via FEC0::2:200:CFF:FE0A:2C51, Ethernet0, 01:00:21/never
  R FEC0::3:0:0:0:0/64 [120/2]
    via FE80::200:CFF:FE76:5B7C, Ethernet0, 00:00:08/00:02:51
  R FEC0::A:0:0:0:0/126 [120/2]
    via FE80::200:CFF:FE76:5B7C, Ethernet0, 00:00:08/00:02:51

The routing table in Example 8-7 shows that the prefixes configured on Falcon's Ethernet port are connected. Eagle's Ethernet prefix FEC0::3:0:0:0:0/64 and serial prefix FEC0::A:0:0:0:0/126 are learned via the RIPng process.

RIPng is still a very easy protocol to implement, and the introduction of multiple processes adds a little more flexibility over RIP-2; however, the drawbacks still exist, as detailed in Chapter 7 of Volume I. For instance, it still has a small maximum hop count, limiting the size of network that can run the protocol.

OSPF for IPv6

OSPF for IPv6 modifies OSPFv2 in many ways to support the larger IPv6 addresses and the changes in protocol semantics between IPv4 and IPv6. Cisco IOS does not yet support OSPF for IPv6. The fundamental mechanisms (flooding, DR election, area support, SPF, and so on) have remained unchanged. IPv6 OSPF operates directly over IPv6. The preceding header's next-header value is 89. The following functions have not changed in OSPF for IPv6:

  • Both versions of the protocol support the same packet types: Hello, Database Description, Link-State Request, Link-State Update, and Link-State Acknowledgment packets, although some, such as the Hello packet, have been modified.

  • Hello packets are exchanged to discover neighbor information.

  • Adjacency selection and establishment.

  • The interface state machine, including the states that interfaces traverse as well as designated router election process.

  • The neighbor state machine, including the states that neighbors traverse before becoming adjacent.

  • Link state database aging.

NOTE

Chapter 9, "Open Shortest Path First," of Routing TCP/IP, Volume I, discusses OSPF version 2.


Some mechanisms have changed. The changes result from the desire to make OSPF network-protocol-independent (and therefore more extensible), the new address format, explicitly specified flooding scope, and interface support of multiple addresses and prefixes. The OSPF protocol has become network-protocol-independent. The version number has changed from 2 to 3, and so the protocol is referred to in the remainder of this chapter as OSPFv3. This section addresses the changes made to the protocol.

Links Rather Than Subnets

IPv6 nodes communicate over links, not subnets. They can have multiple addresses and prefixes configured on interfaces connected to the link and can communicate with other nodes on the link, independent of the subnet being used. OSPFv3 focuses on links rather than subnets as OSPFv2 does. A router interface sending an OSPF packet no longer needs to reside on the same subnet as the router interface receiving the packet, because IPv6 OSPF runs per link rather than per subnet.

Addressing Semantics Removed

Addressing semantics have been removed from OSPF packets and LSAs, thus creating a network-protocol-independent core within OSPFv3. This leads the way for a future multiprotocol OSPF. Many OSPFv2 packets and LSAs contain IPv4 addresses, representing router IDs, area IDs, or LSA link state IDs. OSPFv3 router IDs, area IDs, and LSA link state IDs are still expressed using 32 bits, so they cannot be represented by an IPv6 address (although they can be represented by a portion of the address). OSPFv2 broadcast and NBMA networks list neighbors by IP address. OSPFv3 neighbors are known solely by their router IDs. Other OSPFv2 LSAs, such as Router-LSAs and Network-LSAs, contain IP addresses; the IP addresses are used to represent the network topology in the link state database. OSPFv3 Router-LSAs and Network-LSAs express topological information only; they describe the network topology in a network-protocol-independent manner. Instead of using IP addresses to identify links, OSPFv3 uses interface IDs. Every interface on a router is assigned a unique interface ID. Some implementations may use the MIB-II ifIndex. The MIB-II ifIndex is discussed in RFC 2233, "The Interfaces Group MIB using SMIv2." Neighbors and designated routers are identified by router IDs, which are no longer IP addresses. IPv6 addresses are contained only in the LSA payloads carried by Link-State Update packets.

LSA Flooding Scope and Unknown LSA Types

The flooding scope of LSAs has been generalized. In OSPFv2, the LSA type determines the flooding scope; each type is associated with its own scope. In OSPFv3, the flooding scope is explicitly encoded in the LSA header. An OSPFv3 router that does not recognize the LSA type still knows how to flood the LSA. The scope can be link-local, area, or AS. OSPFv3 allows routers to have differing capabilities. Routers are no longer required to drop received LSAs with unknown types. Flooding scope, handling of unknown types, and LSA type are encoded in an expanded LSA Type field in the header. The upper 3 bits encode the flooding scope and the handling of unknown types. The handling bit tells the router either to treat the unknown LSA as having link-local flooding scope or to store and flood the LSA as if it were known. The router can do the latter because of the encoded flooding scope. Tables 8-12 and 8-13 display the flooding scope values and the values associated with the handling of unknown LSAs.

Table 8-12. Flooding Scope Values and Descriptions

  Flooding Scope Value (Binary)    Description
  00                               Link-local scoping. Flooded only on the link it is originated on.
  01                               Area scoping. Flooded to all routers in the originating area.
  10                               AS scoping. Flooded to all routers in the AS.
  11                               Reserved.

Table 8-13. Values Indicating the Handling of Unknown LSA Types

  Handling of Unknown LSA Value (Binary)    Description
  0                                         Treat the LSA as if it has link-local flooding scope.
  1                                         Store and flood the LSA as if the type is understood.

Explicitly coded flooding scope facilitates the integration of new OSPF features into an existing network.

Multiple OSPF Instances per Link

Multiple OSPFv3 protocol processes can run on a single link. This proves useful when multiple areas need to share a single link (see Figure 8-19). The instance ID in OSPFv3 packet headers enables this functionality.

Figure 8-19. Two Routers Share a Link, and Two Areas Need to Run on the Single Link; Multiple OSPF Protocol Processes per Link Enables This

graphics/08fig19.gif

In Figure 8-19, Area 1 has four routers and Area 2 has four routers. The two remote routers in Area 1 have primary links to Router A, with backup links to Router B. The two remote routers in Area 2 have just the opposite: primary links to Router B and backup links to Router A. Both Area 1 and Area 2 must run between Routers A and B over a single Ethernet link. You can accomplish this with OSPFv3, but not with OSPFv2.

Another case is when multiple companies or independent subsidiaries of a company running OSPF share a single link and use the link to communicate with each other. The link may belong to one of the companies, which uses it to connect to all independent organizations. A subset of the companies may want to peer, excluding the other companies. A more common practice is to use BGP to interconnect the independent organizations. However, an organization with OSPF expertise and no BGP expertise may dictate that the interconnecting protocol be OSPF. Figure 8-20 illustrates this scenario.

Figure 8-20. OSPF Routers Share a Common Link; a Subset of the Routers Shares an OSPF Process

graphics/08fig20.gif

In Figure 8-20, routers Poodle and Lab peer and share a common OSPFv3 process. Routers Lab, Terrier, and Collie also peer, sharing a different OSPFv3 process. Lab's link is configured with both OSPFv3 process identifiers.

OSPF's Use of Link-Local Addresses

Because link-local addresses are configured on every active IPv6 router link, OSPFv3 uses these link-local addresses as the source address of protocol packets and as contents of the Link-LSA (described in the section "New LSAs and LSA Changes"). Link-local addresses, by definition, all share the same IPv6 prefix (FE80::/64). OSPFv3 nodes can therefore easily communicate and form adjacencies regardless of the prefix assigned for their site-local or global aggregatable addresses. Link-local addresses are used within LSAs to identify links on a router without associating the link with a particular IP address, keeping the topology information independent of the network protocol in use.

Removal of Authentication

Authentication has been removed from OSPF for IPv6. IPv6 has integrity, authenticity, and confidentiality mechanisms built in to the network layer of the protocol. OSPFv3 operates directly on top of this layer. OSPFv3 improved its efficiency by removing the authentication information from its headers. Networks that do not require routing security no longer have to process the headers. Networks that do require routing security can use the Authentication Header and Encapsulating Security Payload extension headers at the IP layer.

New LSAs and LSA Changes

Although most of the functionality has remained unchanged, some OSPFv2 LSA fields have been modified, and LSAs have been renamed in OSPFv3. New LSAs have been added to OSPF to carry IPv6 addresses and next-hop information.

The OSPFv2 LSA header contained these fields: Age, Options, Type, Link State ID, Advertising Router, Sequence Number, Checksum, and Length. The OSPFv3 LSA removed the Options field from the header, expanded it from 8 to 24 bits, and moved it to the body of Router-LSAs, Network-LSAs, Inter-Area-Router-LSAs, and Link-LSAs. The Type field expanded to 16 bits, using the space originally occupied by the Options field. The rest of the header remains unchanged.

The LSA Type field is composed of unknown type handling, flooding scope, and LSA type bits. Figure 8-21 displays the LSA Type field.

Figure 8-21. The OSPFv3 LSA Type Field

graphics/08fig21.gif

The U bit specifies the handling of unknown LSA types. S2 and S1 indicate the flooding scope.

Handling of unknown LSA types has changed. IPv4 OSPF discarded LSAs of unknown type. This discarding is undesirable in OSPFv3 because of the desire to mix routers of varying capabilities on a single link. If the designated router supports fewer options than other routers on the link, full functionality will not be available.

Table 8-14 lists the link type values for each LSA.

Table 8-14. Link Type Values for Each OSPFv3 LSA

  LSA Function Code    Value     LSA Type
  1                    0x2001    Router-LSA
  2                    0x2002    Network-LSA
  3                    0x2003    Inter-Area-Prefix-LSA
  4                    0x2004    Inter-Area-Router-LSA
  5                    0x4005    AS-External-LSA
  6                    0x2006    Group-Membership-LSA
  7                    0x2007    Type-7-LSA
  8                    0x0008    Link-LSA
  9                    0x2009    Intra-Area-Prefix-LSA

From Table 8-14, you can see that the two OSPFv2 summary LSAs have been renamed, and there are two additional LSAs: Link-LSA and Intra-Area-Prefix-LSA. You also can see the flooding scope and handling of each type. All the listed types have a U bit set to 0, which indicates that if the type is unknown to any receiving router, it should treat the LSA as if it has link-local flooding scope. If the router does recognize the type, it floods the LSA according to the S2 and S1 bits. A Router-LSA type 0x2001, for instance, has S2S1 value 01 (binary). The LSA gets flooded to all routers within the area. The AS-External-LSA has a value of 10 (binary) and gets flooded to all routers in the AS.
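The decoding just described can be sketched in a few lines of Python. This is only an illustration, not router code: the function names are invented for this example, and the bit layout (the U bit on top, then S2 and S1, with the remaining 13 bits holding the function code) is inferred from the 16-bit Type field and the values in Table 8-14.

```python
# Flooding scope names keyed by the S2S1 bits (Table 8-12).
SCOPES = {0b00: "link-local", 0b01: "area", 0b10: "AS", 0b11: "reserved"}


def decode_ls_type(ls_type):
    """Split a 16-bit OSPFv3 LSA Type into (U bit, scope name, function code)."""
    u_bit = (ls_type >> 15) & 0x1       # handling of unknown LSA types
    s2s1 = (ls_type >> 13) & 0x3        # flooding scope bits S2 and S1
    function_code = ls_type & 0x1FFF    # remaining 13 bits
    return u_bit, SCOPES[s2s1], function_code


def effective_scope(ls_type, recognized):
    """Return the scope a router floods with, given whether it knows the type."""
    u_bit, scope, _ = decode_ls_type(ls_type)
    if recognized or u_bit == 1:
        return scope            # flood according to S2 and S1
    return "link-local"         # unknown type with U = 0 (Table 8-13)


if __name__ == "__main__":
    for value in (0x2001, 0x4005, 0x0008):
        print(hex(value), decode_ls_type(value))
```

Note that a router that does not recognize the AS-External-LSA type (0x4005, U = 0) would fall back to link-local flooding in this sketch, which matches the behavior described for Table 8-13.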

The type-3 Network Summary-LSAs of OSPFv2 have been renamed Inter-Area-Prefix-LSAs. Remember that these LSAs are used by an Area Border Router to advertise networks external to an area.

Type-4 ASBR Summary-LSAs have been renamed Inter-Area-Router-LSAs. These LSAs are originated by Area Border Routers and advertise ASBRs external to an area.

The LSA Options field expanded from 8 to 24 bits in OSPFv3. The field is present in Hello packets, database description packets, and certain LSAs (Router-LSAs, Network-LSAs, Inter-Area-Router-LSAs, Link-LSAs). The Options field enables routers to inform each other of their supported (or not supported) optional capabilities, allowing routers of mixed capabilities to exist within an OSPF routing domain. The action taken when routers do not support the same capabilities depends on the option.

The following 6 bits of the Options field have been defined:

  • V6: If the bit is clear, the router participates in topology distribution but is not used to forward transit IPv6 packets.

  • E: As in OSPFv2, E is set when the originating router is capable of accepting AS External LSAs. E = 0 in all LSAs originated within a stub area. The bit also is used in Hello packets, indicating the interface's capability to send and receive AS External LSAs. Neighboring routers with mismatched E bits do not become adjacent, ensuring that all routers in an area support stub capabilities equally.

  • MC: The bit is set when the originating router is capable of forwarding IP multicast packets. MOSPF uses this bit.

  • N: Used only in Hello packets. A set N bit indicates the originating router's support for NSSA External LSAs. If N = 0, the originating router does not send or accept these NSSA External LSAs. Neighboring routers with mismatched N bits do not become adjacent, ensuring that all routers in an area support NSSA capabilities equally. If N = 1, E must be 0.

  • R: A set Router bit indicates that the router is active. If the R bit is clear, an OSPF speaker can participate in topology distribution without being used to forward transit traffic. This could be used by a multihomed node that wants to participate in routing but does not want to act as a router, forwarding packets between its interfaces. The V6 bit specializes the R bit. If the R bit is set, but the V6 bit is clear, the node does not forward IPv6 datagrams, but it does forward datagrams belonging to another protocol.

  • DC: This bit is set when the originating router is capable of supporting OSPF over demand circuits.
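As a rough sketch of how these bits might be tested in software, the following Python fragment decodes an Options value and applies the E, N, R, and V6 rules listed above. The bit positions are taken from the OSPF for IPv6 specification (RFC 2740), where V6 is the least significant bit; they do not appear in this chapter. The function names are hypothetical.

```python
# Bit masks per RFC 2740; only the low 6 bits of the 24-bit field are defined.
OPTION_BITS = {"V6": 0x01, "E": 0x02, "MC": 0x04, "N": 0x08, "R": 0x10, "DC": 0x20}


def decode_options(options):
    """Return the names of the option bits set in an Options value."""
    return {name for name, mask in OPTION_BITS.items() if options & mask}


def hello_compatible(local, neighbor):
    """Neighbors with mismatched E or N bits do not become adjacent."""
    mask = OPTION_BITS["E"] | OPTION_BITS["N"]
    return (local & mask) == (neighbor & mask)


def forwards_ipv6_transit(options):
    """Transit IPv6 forwarding requires both the R bit and the V6 bit."""
    needed = OPTION_BITS["R"] | OPTION_BITS["V6"]
    return (options & needed) == needed


if __name__ == "__main__":
    print(sorted(decode_options(0x13)))
```

A value of 0x10 (R set, V6 clear) illustrates the last case in the list: the node is an active router but is not used for transit IPv6 traffic.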

Comparing these bits to the 6 defined bits in the OSPFv2 Options field (T, E, MC, N/P, EA, DC), you can see that there have been some changes. Type of service (ToS) is not supported in OSPFv3, so the T bit has been replaced. The N bit is still used only in Hello packets. The P bit is part of another set of options in OSPFv3, the prefix options associated with each advertised prefix. The OSPFv2 EA bit indicates the support of External Attribute LSAs. External Attribute LSAs are proposed as an alternative to running Internal BGP (iBGP) to transport BGP information across an OSPF domain. External Attribute LSAs have not been implemented, nor have any drafts or RFCs been published. Even without the options bit to define the EA capability, however, External Attribute LSAs could still be supported by OSPFv3, as an additional LSA type, with specified flooding scope and unknown LSA type handling.

The new Link-LSA is used to exchange IPv6 prefix and address information between routers on a single link. It is also used by a router to advertise a set of options to associate with the Network-LSA that will be originated for the link. The Link-LSA provides the router's link-local address and the list of prefixes to associate with the link. The LSA is multicast to all routers on a link. The options that are advertised by the Network-LSA are the logical OR of the options sent by all routers in the Link-LSA.
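The logical OR described here is simple enough to show directly. The sketch below assumes each router's Link-LSA options are available as plain integers; the function name and the sample values are illustrative only.

```python
from functools import reduce
from operator import or_


def network_lsa_options(link_lsa_options):
    """Combine the Options values from every Link-LSA on the link with bitwise OR."""
    return reduce(or_, link_lsa_options, 0)


if __name__ == "__main__":
    # Three routers on the link advertising different option sets.
    print(hex(network_lsa_options([0x13, 0x33, 0x01])))
```

An option bit therefore appears in the Network-LSA if at least one router on the link advertised it.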

There is another new LSA, called the Intra-Area-Prefix-LSA. This LSA carries IPv6 prefix information that in OSPFv2 was carried in Router-LSAs and Network-LSAs. It is used by a router to advertise address prefixes assigned to the router itself, such as attached stub networks and attached transit networks.

The OSPFv3 LSAs that contain prefix information always carry the prefix length, prefix options, and prefix address. The Prefix Options field is an 8-bit field describing capabilities associated with the prefix. The following four options are defined:

  • NU: A set "no-unicast" bit excludes the prefix from unicast routing calculations.

  • LA: The set "local-address" bit indicates that the prefix is actually an IPv6 address of the advertising router.

  • MC: A set "multicast-capable" bit indicates that the prefix should be included in multicast routing calculations.

  • P: The "propagate" bit is set on NSSA prefixes that should be re-advertised at the NSSA area border.

Each prefix is advertised with the 8-bit Prefix Options field that serves as input to the various routing calculations. The options could indicate that certain prefixes should be excluded or that others should not be propagated.
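To make the role of the prefix options concrete, here is a small Python sketch that filters a list of advertised prefixes before a unicast calculation. The bit positions come from RFC 2740 (NU is the least significant bit) rather than from this chapter, and the function names and sample prefixes are hypothetical.

```python
# Prefix option masks per RFC 2740.
NU, LA, MC, P = 0x01, 0x02, 0x04, 0x08


def unicast_candidates(prefixes):
    """Keep the prefixes whose NU (no-unicast) bit is clear."""
    return [prefix for prefix, options in prefixes if not options & NU]


def readvertise_at_nssa_border(options):
    """The P bit marks NSSA prefixes to be re-advertised at the area border."""
    return bool(options & P)


if __name__ == "__main__":
    advertised = [
        ("2001:db8:1::/64", 0),     # ordinary prefix
        ("2001:db8:2::/64", NU),    # excluded from unicast calculations
        ("2001:db8::1/128", LA),    # an address of the advertising router
    ]
    print(unicast_candidates(advertised))
```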

BGP-4 Multiprotocol Extensions

Additions made to BGP-4 are not specific to IPv6. They also include support for other protocols, such as IPX. The multiprotocol additions to BGP-4 are discussed here as they relate to IPv6. Multiprotocol BGP (MBGP) is discussed in Chapter 7, "Large-Scale IP Multicast Routing."

Three pieces of BGP-4 information are IPv4-specific:

  • The next-hop attribute

  • The AGGREGATOR attribute

  • The network layer reachability information (NLRI)

At the time of this writing, it is assumed that every BGP-4 speaker will maintain at least one IPv4 address. The AGGREGATOR attribute will continue to use this address. Refer back to Chapter 2 for more information about the AGGREGATOR attribute. The additions to BGP-4 therefore address only the NEXT_HOP attribute and the NLRI. Furthermore, because the next-hop information is used to forward packets to a set of destinations and is used only when adding NLRI, not when withdrawing routes, the next-hop information has been added to the reachable NLRI updates.

Two new attributes are defined to support multiple protocols over BGP: the multiprotocol-reachable NLRI (MP-REACH-NLRI) and the multiprotocol-unreachable NLRI (MP-UNREACH-NLRI). Both attributes are optional and nontransitive, meaning that a BGP process that does not recognize the attribute can quietly ignore it and does not pass it along to its other peers.

As the name suggests, the multiprotocol-reachable NLRI attribute describes the reachable destinations. The attribute contains information about the network layer protocol to which the addresses belong and the next-hop address used to forward packets destined for the contained list of destination prefixes. Each MP-REACH-NLRI Update message includes one next-hop address and a list of associated NLRIs. The NLRI is a 2-tuple of the form <length, prefix> in which length is the length of the prefix in bits and prefix is the reachable IPv6 address prefix.
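The <length, prefix> tuple can be illustrated with a short Python sketch. The encoding shown (a 1-octet length in bits, followed by only as many prefix octets as the length requires) follows the BGP multiprotocol extensions specification, not anything stated in this chapter, and the function names are invented for the example.

```python
import ipaddress


def encode_nlri(prefix):
    """Encode an IPv6 prefix as a length octet plus the minimum number of prefix octets."""
    network = ipaddress.IPv6Network(prefix)
    length = network.prefixlen
    octets = (length + 7) // 8              # round up to a whole octet
    return bytes([length]) + network.network_address.packed[:octets]


def decode_nlri(data):
    """Rebuild the IPv6 prefix from one encoded <length, prefix> tuple."""
    length = data[0]
    octets = (length + 7) // 8
    padded = data[1:1 + octets] + bytes(16 - octets)   # restore trailing zeros
    return ipaddress.IPv6Network((ipaddress.IPv6Address(padded), length))


if __name__ == "__main__":
    encoded = encode_nlri("200a:0:0:1::/124")
    print(len(encoded), decode_nlri(encoded))
```

Short prefixes therefore take little room in an Update, while the /124 prefix used later in this section needs all 16 address octets.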

The next hop is the address to be used by BGP speakers when forwarding packets destined to an associated address prefix. Looking back at Chapter 2, the default rules for the next-hop attribute are as follows:

  • If the advertising router and receiving router are in different autonomous systems (external peers), the NEXT_HOP is the IP address of the advertising router's interface.

  • If the advertising router and the receiving router are in the same autonomous system (internal peers), and the NLRI of the update refers to a destination within the same autonomous system, the NEXT_HOP is the IP address of the neighbor that advertised the route.

  • If the advertising router and the receiving router are internal peers and the NLRI of the update refers to a destination in a different AS, the NEXT_HOP is the IP address of the external peer from which the route was learned.

For IPv6, the rules are more specific because of the defined scopes of IPv6 addresses. An IPv6 BGP router advertises the global address of the next-hop router, possibly followed by its link-local address. The link-local address is included only if the BGP speaker shares a common data link with both the node identified in the Next-Hop field and the peer to which the Update message is being sent. In all other cases, only the global address is included in the Next-Hop field.
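The IPv6 next-hop rule reduces to a single condition, sketched below in Python. The function name, its parameters, and the addresses are hypothetical; the sketch also assumes the speaker always includes the link-local address when the rule permits it, although the rule only says when it may be included.

```python
def mp_next_hops(global_addr, link_local_addr,
                 shares_link_with_next_hop, shares_link_with_peer):
    """Return the addresses to place in the Next-Hop field of an IPv6 update."""
    if shares_link_with_next_hop and shares_link_with_peer:
        # All three nodes are on one data link: both addresses are useful.
        return [global_addr, link_local_addr]
    # Otherwise the link-local address is meaningless to the peer.
    return [global_addr]


if __name__ == "__main__":
    # Illustrative addresses only.
    print(mp_next_hops("2001:db8::1", "fe80::1", True, True))
    print(mp_next_hops("2001:db8::1", "fe80::1", True, False))
```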

A network diagram, router configuration, and command output illustrate that the configuration and output of commands for MBGP using IPv6 closely resemble those used for IPv4.

Figure 8-22 shows a simple BGP router topology.

Figure 8-22. Simple BGP Network

graphics/08fig22.gif

Maple and Aspen are EBGP peers. Oak and Aspen are EBGP peers. All three routers are on the same Fast Ethernet segment. Oak and Pine are IBGP peers.

Aspen advertises NLRI, learned from Maple, to Oak. It includes Maple's address as the next-hop information, so Oak can send any traffic directly to Maple instead of making the extra hop through Aspen. Because the three routers share a Fast Ethernet segment, both Maple's global and link-local addresses are included in the update.

Oak advertises Maple's NLRI information to Pine. The next-hop address is Maple's global address. The link-local address is removed. Example 8-8 shows the configuration of Oak and Aspen.

Example 8-8 BGP Router Configurations
  Oak
  interface fastethernet 0
   description Oak to Aspen (e-bgp)
   ipv6 address 200A::2:0:0:0:1/64
  !
  interface serial 0
   description Oak to Pine (iBGP)
   ipv6 address 200A:0:0:10::1/124
  !
  interface serial 1
   description IGP link
   ipv6 address 200A:0:0:1::1/124
  !
  router bgp 100
   neighbor 200A::2:0:0:0:2 remote-as 300
   neighbor 200A:0:0:10::2 remote-as 100
   !
   address-family ipv6
   neighbor 200A::2:0:0:0:2 activate
   neighbor 200A:0:0:10::2 activate
   network 200a:0:0:1::/124
   exit-address-family
  _______________________________________________________________________
  Aspen
  interface fastethernet 0
   description Oak to Aspen (e-bgp)
   ipv6 address 200A::2:0:0:0:2/64
  !
  router bgp 300
   neighbor 200A::2:0:0:0:1 remote-as 100
   !
   address-family ipv6
   neighbor 200A::2:0:0:0:1 activate
   exit-address-family

Oak's FastEthernet address is 200A::2:0:0:0:1/64, as you can see in the interface subcommand, and Oak is Aspen's EBGP neighbor. Oak also has an IGP link addressed with the 200A:0:0:1::/124 prefix. Example 8-9 displays the state of the BGP neighbors, a BGP update being sent from Oak to Aspen about the IGP prefix, and the entry in Aspen's routing table.

Example 8-9 Output from BGP Commands
  Aspen# show bgp ipv6 nei
  BGP neighbor is 200A::2:0:0:0:1,  remote AS 100, external link
    BGP version 4, remote router ID 172.16.255.1
    BGP state = Established, up for 00:00:18
    Last read 00:00:18, hold time is 180, keepalive interval is 60 seconds
    Neighbor capabilities:
      Route refresh: advertised and received
      Address family IPv6 Unicast: advertised and received
    Received 40 messages, 0 notifications, 0 in queue
    Sent 51 messages, 0 notifications, 0 in queue
    Route refresh request: received 0, sent 0
    Minimum time between advertisement runs is 30 seconds

   For address family: IPv6 Unicast
    BGP table version 2, neighbor version 1
    Index 1, Offset 0, Mask 0x2
    1 accepted prefixes consume 64 bytes
    Prefix advertised 0, suppressed 0, withdrawn 0

    Connections established 4; dropped 3
    Last reset 00:00:43, due to User reset
  Connection state is ESTAB, I/O status: 1, unread input bytes: 0
  Local host: 200A::2:0:0:0:2, Local port: 11015
  Foreign host: 200A::2:0:0:0:1, Foreign port: 179

You can see that the command output is very similar to that of IPv4. In fact, because this routing protocol is MBGP, and not a new version of BGP for IPv6, the only additions you would expect in the output are the address family type denoting IPv6 and the IPv6 address formats. The output shows exactly this: the address family value has been added, and the addresses are in IPv6 format. The TCP port number is the same, 179.

The Anycast Process

Anycast is a mechanism used to route packets to one of many identically addressed nodes. The identically addressed nodes might be a group of servers offering a well-known service to clients, or a group of routers belonging to an ISP, which requires that traffic pass through one of its anycast-addressed routers. A node addresses the IP packet to the single anycast address of the group. The node learns the next hop for the address just as it would for a unicast address. If the anycast address is on-link, the node performs the address resolution process. The first response is added to the neighbor cache. If the address is off-link, the packet is forwarded to the nearest destination based on the routing protocol's measure of distance. There will be a prefix that contains the set of anycast nodes in a domain. For instance, all nodes using the anycast address FEC0::A:FDFF:FFFF:FFFF:FFFE/64 reside within the FEC0:0:0:A::/64 prefix. All these anycast nodes must be advertised as host routes within the domain addressed with this prefix. A node uses the metric of the host route to determine the closest anycast node. If there are many anycast groups, many anycast nodes within each group, and very large containing domains, the routing tables within those domains could become very large.
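The prefix containment and closest-node selection described above can be checked with Python's standard ipaddress module. This is a minimal sketch: the function names are invented here, and the next-hop addresses and metrics in the usage example are hypothetical.

```python
import ipaddress


def in_anycast_prefix(address, prefix):
    """Check whether an anycast address falls inside the containing prefix."""
    return ipaddress.IPv6Address(address) in ipaddress.IPv6Network(prefix)


def closest_anycast_node(host_routes):
    """Pick the next hop with the lowest metric among the anycast host routes.

    host_routes maps a next-hop address to the metric of the host route
    learned through it.
    """
    return min(host_routes, key=host_routes.get)


if __name__ == "__main__":
    print(in_anycast_prefix("FEC0::A:FDFF:FFFF:FFFF:FFFE", "FEC0:0:0:A::/64"))
    print(closest_anycast_node({"fe80::1": 30, "fe80::2": 10}))
```

Because each anycast node appears as its own host route, the dictionary passed to the second function grows with every node in every group, which is the scaling concern noted above.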

Although anycasting is specified in IPv6, its use is currently very restricted. There is little experience using widespread anycasting services, and there are some known complications, such as ensuring all packets for a session reach the same anycast node, or requiring the anycast nodes to share state information[4]. More issues need to be resolved as experience is gained. The only defined anycast address, other than the subnet-router anycast address, is the Mobile IPv6 home-agent address. Until solutions to the problems are agreed upon, use of anycasting is restricted to routers only.

Multicast

IPv6 uses and facilitates the use of multicasting. Multicasting is used rather than broadcasting to minimize the impact of solicitations, advertisements, updates, and so forth on multicast-capable links. IPv6 facilitates the widespread use of multicasting through its support of scoped multicast addresses and its built-in support for a data-link group membership protocol, the Multicast Listener Discovery (MLD) protocol. The Protocol Independent Multicast (PIM) routing protocol enables the IPv6 hosts on a link to join a networkwide multicast group.

Scoped Addresses

Multicast scopes have been added to the IPv6 multicast address space. Applications and uses for multicast technology can be created for global, public use, for use within an organization or site, or for use on single links. Administrative policies have to be set to identify the boundaries of sites and organizations to utilize the scopes effectively. Well-known multicast groups can be contained within the defined scopes, making the containment of these multicast applications easier to control.

Listener Discovery

Derived from IGMPv2, the Multicast Listener Discovery (MLD) protocol enables routers to discover which nodes on a link want to receive multicast packets, and to which multicast groups those nodes belong. This information is then passed on to the multicast routing protocol in use on the network, such as PIM. MLD can be broken down into two groups of functions: the host functions and the router functions.

Host Functions

Host functions are similar to the host functions of IGMPv2, discussed in Chapter 5, "Introduction to IP Multicast Routing." Two types of Report messages are defined:

  • Membership Report

  • Done Report

When a host first begins listening to a particular multicast address on a link, it should immediately transmit a Report to inform the router that there is a listener on the link. It sends the Report to the address of the multicast group and also includes the address in the MLD Multicast Address field within the Report packet. The source address of the report is the host's link-local address. The presence of the link-local source address prevents the packet from traveling beyond the local link.

The router periodically sends queries to determine to which multicast groups hosts on the link belong. When a host hears a general query, which does not refer to any particular multicast address, it sets its delay timer for each of the multicast addresses to which it is listening, except the link-scope all-nodes multicast address and any multicast address with scope 0 (reserved) or 1 (node-local). When the host hears a query for a particular multicast address, it sets its delay timer for that particular address only. It sets the delay timers to a random value between 0 and the Max Response Time value that is sent as part of the query. The timer for each individual address is set to a different random value.

If the host does not hear any Reports from other hosts on the link for an address before that address's timer expires, the host sends its own Report. If it does hear a Report before the timer expiration, the host stops the timer and does not send a Report. The link is therefore not flooded with Reports from every member of the group, but the presence of at least one member is known.
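The host's reaction to a General Query can be sketched as follows. The sketch relies on the standard IPv6 multicast address layout, in which the scope is the low 4 bits of the second octet; the function names and the sample group addresses are illustrative, and the timer values are simply random numbers rather than real timers.

```python
import ipaddress
import random

ALL_NODES = ipaddress.IPv6Address("ff02::1")   # link-scope all-nodes address


def multicast_scope(group):
    """Return the 4-bit scope value of an IPv6 multicast address."""
    return ipaddress.IPv6Address(group).packed[1] & 0x0F


def needs_report_timer(group):
    """All-nodes and scope 0 or 1 addresses never get a delay timer."""
    return (ipaddress.IPv6Address(group) != ALL_NODES
            and multicast_scope(group) not in (0, 1))


def start_timers(groups, max_response_ms, rng=random):
    """Give each eligible address its own random delay up to Max Response Time."""
    return {group: rng.uniform(0, max_response_ms)
            for group in groups if needs_report_timer(group)}


if __name__ == "__main__":
    print(start_timers(["ff02::1", "ff01::5", "ff05::1:3"], 10000))
```

A host would then send a Report for an address only if its timer expires before it hears another node's Report for the same address.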

When the router receives a Report for a particular multicast address, if the address is not already present in the router's list of multicast addresses, the router adds it to the list and informs the network's running multicast routing protocol of the addition. If the address is already in the list, the router resets the address's timer to the Multicast Listener Interval value. If this timer expires without hearing a Report for a particular address, the address is deleted from the router's list.

Figure 8-23 is a flowchart diagramming the host functions of the MLD process.

Figure 8-23. Host Functions of the MLD Process

graphics/08fig23.gif

When a host is finished listening to a multicast group, it should send a Done message. This is analogous to the IGMPv2 Leave message. It is sent to the link-scope all-routers multicast group FF02::2. The Multicast Address field of the message carries the address to which the host is finished listening. A host does not need to send a Done message if its last Report for the address was suppressed by a Report from another node, because there is very likely still another node on the link listening to the same multicast address.

Router Functions

The router functions of MLD also are very similar to IGMPv2, as discussed in Chapter 5. The terms differ a little. The router sends a Multicast Listener Query, of which there are two subtypes:

  • General Query

  • Multicast-Address-Specific Query

The concepts of a querier and a nonquerier router still exist. A router assumes the state of querier or nonquerier for each of its multicast links. As with IGMPv2, an initializing router assumes it is the querier and immediately sends a General Query. If the router hears a query message from another router, it checks the received query's IPv6 source address. If the source address is numerically less than its own, the router relinquishes the role of querier to the other router. If its own address is lower, it remains the querier.
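The querier election comes down to a numeric comparison of two IPv6 addresses, which the ipaddress module supports directly. The function name and the link-local addresses below are hypothetical.

```python
import ipaddress


def remains_querier(own_address, heard_query_source):
    """A router stays querier only if its own address is numerically lower."""
    return (ipaddress.IPv6Address(own_address)
            < ipaddress.IPv6Address(heard_query_source))


if __name__ == "__main__":
    print(remains_querier("fe80::1", "fe80::2"))
    print(remains_querier("fe80::2", "fe80::1"))
```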

The querier router polls each of its attached links upon startup and periodically with the General Query to discover whether any group members are present. The router's link-local address is the source address of the query. The queries are sent to the link-scope all-nodes multicast address of FF02::1.

When a querier router receives a Done message, if the address referred to in the Done message is in its multicast list, it sends a Multicast-Address-Specific Query to the multicast address to determine whether any listeners remain on the link. If no host responds within the Maximum Response Delay, the router removes the address from the list and informs the multicast routing component.

Figure 8-24 shows the process flow of the MLD router function.

Figure 8-24. Router Functions of the MLD Process

graphics/08fig24.gif

PIM Multicast Routing

As with the unicast routing protocols, multicast routing protocols are modified to support IPv6. Functionally, the protocols operate in the same manner. The modifications mainly support the larger address space. PIM is currently the only multicast routing protocol with IPv6 modifications defined. PIM and other multicast routing protocols are discussed fully in Chapter 5.

The IPv6 modifications define addresses that must be used in PIM messages and identify an area of concern involving scoped multicast addresses and the centralized bootstrap mechanism.

With IPv4, each of the different PIM messages uses multicast or unicast addresses in its Destination field and its assigned interface IP address as the source. With the advent of scoped addresses in IPv6, and the multiple addresses assigned to each link, the choice of which address to use is further defined.

Most of the messages use the link-scope IPv6 all-PIM-routers multicast address, FF02::D, as the IPv6 destination, and the sending interface's link-local address as the source. Other messages use the specific global IPv6 unicast address of the service to which they need to communicate as the destination, and their own global unicast address as the source.

Hello messages are sent on multicast interfaces to discover PIM neighbors. The all-PIM-routers multicast address is the destination of these packets. The interface's link-local address is the source. The link-local address is therefore used in building neighbor tables and in electing the designated router.

Assert messages are sent when a multicast packet is received by a router through an interface that the router views as an outgoing interface for that (source, group) or (S, G) pair. Recall from Chapter 5 that the multicast router maintains a multicast forwarding table with upstream and downstream interfaces for each particular source destined for a particular group (the (S, G) pair). If a router receives a multicast packet on an outgoing (downstream) interface for that (S, G) pair, the packet was forwarded by another router connected to that downstream link. Figure 8-25 illustrates this.

Figure 8-25. Multicast Packet Received on Downstream Interface

graphics/08fig25.gif

Router SJ's multicast forwarding table for the particular (S, G) pair indicates that E0 is downstream from the source and is therefore the outgoing interface. SJ receives a multicast packet for this (S, G) pair through its Ethernet interface. Assert messages are used to determine a single PIM forwarder for the multi-access network. The Assert message is sent by SJ in Figure 8-25 on the Ethernet network to determine which of the PIM routers should be the single PIM forwarder. The messages are sent to the all-PIM-routers multicast address and are sourced from the interface's link-local address. The value of the link-local address is used to break ties in the assert process, with the numerically highest link-local address becoming the forwarder. Downstream routers save the forwarder's link-local address to resolve any future RPF requirements.

The Join/Prune, Graft, and Graft-Ack messages, which are used to build and prune the multicast routers' forwarding tables, also use the all-PIM-routers multicast address as the destination and the link-local address as the source. All these messages also contain an address for the upstream neighbor. The upstream neighbor address is set to the link-local address of that neighbor. An RPF lookup is used to obtain the address. If a link-local address for the neighbor cannot be obtained, a known global address for that neighbor is used.

Another message that uses the all-PIM-routers multicast address as the destination and the link-local address as the source is the Bootstrap message. The Bootstrap message is multicast to all PIM routers by the bootstrap router (BSR). The bootstrap router address is contained within the message. Because this address must be accessible by all PIM routers, the address is the domainwide-reachable address of the bootstrap router.

The Register and the Register Stop messages are used in PIM Sparse mode. A source designated router (DR) wanting to send traffic to a multicast group initially encapsulates the multicast packets in a Register message and sends it to the rendezvous point (RP). An RP sends a Register Stop message to the DR, telling the DR to stop encapsulating the multicast packets in Register messages. These events are not necessarily sequential. Chapter 5 describes the full sequence of these events. Register messages are addressed to the domainwide-reachable unicast address of the RP and are sourced from the domainwide-reachable unicast address of the DR; Register Stop messages use the same pair of addresses in the reverse direction. The source DR obtains the RP address from the RP-set information multicast to all PIM routers by the bootstrap router. The RP obtains the global IPv6 address of the DR from the source address of the Register message it received from the DR.

Each candidate RP unicasts a Candidate-RP-Advertisement message to the bootstrap router. The message contains the multicast group address for which the advertising router is a candidate RP. The message also contains the IPv6 address to be used as the RP address for this router. The destination address for the Candidate-RP-Advertisement is the domainwide-reachable unicast address of the BSR. The source address is a domainwide-reachable unicast address of the candidate RP. The BSR forms the RP-set from these advertisements.

Scoped multicast addresses solve the multicast containment problem; however, they bring up an issue involved with PIM and the bootstrap mechanism. The bootstrap process is a centralized process within a PIM-SM domain. Bootstrap messages from the centralized BSR are expected to reach all PIM routers. If the PIM domain is not a subset of the multicast scoped address domain, the bootstrap mechanism will not work. Multicast packets within one scoped address domain will not traverse to a second scoped address domain. The result is that to allow the bootstrap mechanism to work, the PIM domain must be a subset of the scoped address domain, or all multiple-hop messages must use globally reachable IPv6 addresses.

Quality of Service

IPv6 has no built-in quality of service (QoS) functions, such as procedures for queuing and forwarding different traffic classes through routers or for prioritizing multiple traffic flows, but it provides mechanisms that allow such protocols to work with IPv6. The two mechanisms are the Flow Label and Traffic Class fields of the IPv6 header, discussed in the following sections.

Traffic Flow

Nodes initiating traffic may want to request special handling of certain traffic flows. The node can label the flow, requesting that IPv6 routers provide nondefault QoS for that flow. For instance, a call center application requires very fast response time, so the call center representative using the application can give information obtained from a server to the person on the phone as she speaks. A node may label this flow, requesting that it obtain a different QoS from other traffic.

Traffic Class

The traffic class bits in the IPv6 header are provided for source nodes and/or intermediate routers to distinguish between different classes or priorities of IP packets. The bits can be used in the same way that the IPv4 type-of-service and precedence bits are experimentally being used today. Differentiated Services (DiffServ) redefines the Traffic Class field and calls it the DS field. The definition of the DS field is the same for IPv6 as for IPv4. The leftmost 6 bits are used by the DiffServ codepoint. Packets are marked with a codepoint at the edges of a network. The codepoint determines the behavior of each router when queuing and forwarding the packet. This behavior is called the per-hop behavior (PHB).
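The relationship between the Traffic Class field, the DiffServ codepoint, and the flow label can be shown with a little bit arithmetic. The sketch assumes the standard IPv6 header layout (4-bit version, 8-bit Traffic Class, 20-bit flow label in the first 32 bits), which is not restated in this section; the function names and sample values are illustrative.

```python
def dscp_from_traffic_class(traffic_class):
    """Return the DiffServ codepoint: the leftmost 6 bits of the 8-bit field."""
    return (traffic_class >> 2) & 0x3F


def first_header_word(traffic_class, flow_label, version=6):
    """Pack version, Traffic Class, and flow label into the first 32 header bits."""
    return (version << 28) | ((traffic_class & 0xFF) << 20) | (flow_label & 0xFFFFF)


def parse_first_header_word(word):
    """Unpack the first 32 header bits into (version, traffic class, flow label)."""
    return (word >> 28) & 0xF, (word >> 20) & 0xFF, word & 0xFFFFF


if __name__ == "__main__":
    word = first_header_word(0xB8, 0x12345)
    print(hex(word), parse_first_header_word(word))
    print(dscp_from_traffic_class(0xB8))
```

A router applying a PHB would look only at the 6-bit codepoint; the remaining 2 bits of the field are not part of it.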



Routing TCP[s]IP (Vol. 22001)
Year: 2004
Pages: 182
