IP switching is similar in concept to frame switching. The difference is that routers use Layer 3 IP addresses for switching criteria whereas frame switching uses Layer 2 MAC addresses. Switching packets within a router adds delay to applications that require real-time responses. To switch a packet, a router uses the following steps, each of which adds to the end-to-end delay of a packet through a network.
A packet may take one of three switching paths through a router. The paths differ in efficiency and are described here from least to most efficient; the path a packet takes through a router affects the delay your application perceives.
Process Switching Path
With process switching, each packet's destination is deemed reachable by looking directly in the routing table. If an entry is available, the IP address of the next-hop is retrieved from the entry. Additionally, the MAC address of the next-hop is retrieved directly from the ARP table. Figure 3-8 illustrates how packets are switched using the process switching path.
Figure 3-8. The Process Switching Path
Per-packet load sharing is available in the process switching path for subnets with more than one routing table entry to the final destination. Successive packets within an individual flow can take different routes. Unfortunately, per-packet load sharing may introduce packet reordering, because packets take different paths toward the destination and may arrive at the receiving host out of order. As you learned in Chapter 2, TCP-based applications are not affected by out-of-sequence packets, because TCP resequences segments before passing data to the application. However, for UDP-based real-time applications, such as IP telephony and streaming media, out-of-sequence packets pose major quality issues.
Performing lookups in both the routing and ARP tables for each packet is relatively expensive. To avoid routing table and ARP cache lookups for each packet in a flow, use the fast switching or Cisco Express Forwarding (CEF) path.
Fast Switching Path
With fast switching, the router avoids routing table and ARP cache lookups for every packet in a flow by gradually building a flow cache from the traffic it receives. The flow cache is built by sending the initial packet of each flow through the process switching path. The necessary forwarding information, such as the next-hop interface, next-hop MAC address, and IP subnet, is retrieved from the ARP cache and routing table and stored in another table called the FIB. Subsequent packets of the flow are switched using the FIB.
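The relationship between the two paths can be sketched in a few lines of Python. This is a minimal illustration rather than router code: the table contents are example values, the names `ROUTING_TABLE`, `ARP_TABLE`, `flow_cache`, `process_switch`, and `switch` are hypothetical, and the cache is scanned linearly only to keep the sketch short.

```python
import ipaddress

# Illustrative tables; the prefix, next hop, and MAC are example values.
ROUTING_TABLE = {ipaddress.ip_network("209.165.200.0/24"): "10.1.1.1"}
ARP_TABLE = {"10.1.1.1": "0e5d.24a2.ad92"}

flow_cache = {}                      # prefix -> (next-hop IP, next-hop MAC)
stats = {"process": 0, "fast": 0}    # packets handled by each path


def process_switch(dst):
    """Slow path: longest-prefix routing lookup, then an ARP lookup."""
    stats["process"] += 1
    matches = [net for net in ROUTING_TABLE if dst in net]
    if not matches:
        return None                  # no route to the destination
    net = max(matches, key=lambda n: n.prefixlen)
    next_hop = ROUTING_TABLE[net]
    mac = ARP_TABLE.get(next_hop)
    if mac is None:
        return None                  # a real router would ARP for the next hop
    flow_cache[net] = (next_hop, mac)  # populate the cache for later packets
    return next_hop, mac


def switch(dst_text):
    """Try the cache first; fall back to process switching on a miss."""
    dst = ipaddress.ip_address(dst_text)
    for net, entry in flow_cache.items():   # linear scan, for brevity only
        if dst in net:
            stats["fast"] += 1
            return entry
    return process_switch(dst)
```

Calling `switch("209.165.200.10")` twice sends the first packet through `process_switch` and serves the second from `flow_cache`, which is the behavior the `stats` counters are there to show.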
The premise behind fast switching is that, if a specific destination IP address is required once, it will more than likely be required again in the near future. FIBs contain less information than the routing tables and ARP caches combined, making them much more efficient for destination IP lookups. Figure 3-9 illustrates how fast switching operates.
Figure 3-9. The Fast Switching Path
As shown in Figure 3-9, the router receives a packet with destination IP 209.165.200.255. In its routing table it has a single entry [209.165.200.0/24 via 10.1.1.1] for this destination IP. Additionally, its ARP table contains the entry [10.1.1.1, 0e5d.24a2.ad92] for the next-hop of the routing table entry. The resultant FIB entry contains the IP subnet, along with the MAC and IP address of the next-hop router.
Per-destination load sharing is available in the fast switching path for subnets with more than one routing table entry to the final destination. That is, flows with the same destination IP address take the same route to that destination.
Most routes in large ISP networks are learned from BGP, where the next-hop in the routing table entry is not a directly connected neighbor of the router. Instead, the next-hop entry is for a remote BGP router located at the border of the AS. In these instances, the next-hop field in the fast switching FIB entry for the remote BGP router is determined by performing an ARP cache lookup of the IP address of the neighbor router that the BGP route was learned from. This additional lookup occurs while process switching the first packet of the flow and is referred to as a recursive lookup. For example, in Figure 3-10, the next-hop for the route to 209.165.200.0/24 is 10.1.2.2. The first packet of a flow arriving for the server 209.165.200.0 is process switched. The router notices that the next-hop is not directly connected and performs another routing table lookup for 10.1.2.2.
The result is the directly connected next-hop 10.1.1.2, the router that the BGP update was learned from. The router then looks into its ARP cache for 10.1.1.2 and populates the FIB entry with the result.
Figure 3-10. Handling Nonadjacent Next-Hop Routing Table Entries
The router automatically refreshes the fast switching FIB periodically by randomly invalidating 1/20th of the cache entries every minute; otherwise, the FIB could grow extremely large in a short time. Changes to the routing table also invalidate cache entries, resulting in a cache refresh. On routers with high route churn, the fast switching FIB is refreshed frequently. To avoid issues related to cache invalidation, refreshing, and recursive lookups, you should enable the CEF switching path in your network.
Cisco Express Forwarding
With CEF switching, every entry in the routing table has an associated entry in the CEF FIB. Additionally, every entry in the ARP table has an associated entry in a separate adjacency table, thus separating the reachability information from the forwarding information in the CEF path. A benefit of this separation is that the process switching path is not required to build the CEF entries: the router builds the CEF tables in parallel with the routing table. The CEF data structure is called a trie; its leaves point to individual entries in the adjacency table instead of containing the forwarding information themselves. This arrangement allows per-flow load sharing: a load-sharing hash bucket table is inserted between the FIB and the adjacency table to determine which of multiple paths each packet should take, if more than one is available.
Note: The CEF trie is not built on demand, as the fast switching cache is, and therefore does not require refreshing.
Hash tables provide rapid access to data items that are distinguishable by a key. Each data item to be stored is associated with a key, such as an IP address.
A hash function is applied to the key, and the resulting hash value is used as an index to select one of a number of "hash buckets" in a hash table. The buckets contain pointers to the original items. In the case of CEF, the pointers point to the CEF adjacency table entries. Cisco IOS maintains a hash table with 16 buckets. That is, the maximum number of active paths that can be load shared is 16. If 16 is not evenly divisible by the number of active paths, the last few buckets remain unused.
Note: Hash buckets that are used for content distribution in caching environments will be discussed further in Chapter 12, "Exploring Global Server Load Balancing."
Per-flow load sharing is achieved because a source-destination hash selects the same hash bucket each time, so the same interface is chosen for every packet in the flow. To enable per-flow load sharing, use the interface configuration command ip load-sharing per-destination.
Note: Although the ip load-sharing parameter is "per-destination," both the source and destination are hashed together to provide per-flow load sharing. TCP ports are not hashed; therefore, a CEF flow is not a Layer 4 transport connection. Rather, a flow is the traffic generated between a source-destination IP pair at Layer 3.
Table 3-3 provides an example hash bucket table for a router with three serial interfaces available for CEF load sharing.
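The bucket arithmetic can be illustrated with a short Python sketch. It is not a reproduction of Table 3-3 or of the IOS algorithm: the round-robin assignment of paths to buckets is an assumption, `zlib.crc32` is only a stand-in for the hash function the router actually uses, and the interface names and the helpers `build_buckets` and `pick_path` are hypothetical.

```python
import ipaddress
import zlib

NUM_BUCKETS = 16  # bucket count stated in the text


def build_buckets(paths):
    """Assign paths to buckets round-robin; leftover buckets stay unused."""
    if not 1 <= len(paths) <= NUM_BUCKETS:
        raise ValueError("between 1 and 16 active paths are supported")
    usable = NUM_BUCKETS - (NUM_BUCKETS % len(paths))
    return [paths[i % len(paths)] for i in range(usable)]


def pick_path(src, dst, buckets):
    """Hash the source-destination pair and return the path in that bucket."""
    # Only the two IP addresses are hashed; ports are deliberately ignored,
    # so every packet between the same pair of hosts lands in one bucket.
    key = ipaddress.ip_address(src).packed + ipaddress.ip_address(dst).packed
    return buckets[zlib.crc32(key) % len(buckets)]
```

With three paths, `build_buckets(["Serial0", "Serial1", "Serial2"])` fills 15 buckets, five per interface, and leaves the sixteenth unused. Repeated calls to `pick_path` with the same source and destination always return the same interface, which is what keeps a flow on a single path.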
After the FIB lookup, the source and destination IP addresses are hashed to determine which adjacency to use in the adjacency table, that is, which interface to forward the flow to. If most traffic comes from a single source-destination pair, one of the paths may be overloaded. In this case, you should consider enabling CEF per-packet load balancing instead, using the ip load-sharing per-packet interface configuration command. Figure 3-11 illustrates the CEF switching path.
Figure 3-11. The CEF Switching Path
Note: Even with CEF switching, each packet is sent to main packet memory. To bypass main packet memory, use distributed CEF switching (dCEF). dCEF stores copies of the CEF trie on supported Cisco router line cards. Incoming packets destined for next-hop routers reachable from the local line card are switched by the local trie to the next-hop interface and are not sent to main packet memory.
To allow failover to the next less efficient path, most routers support running all three switching paths (each with its own FIB structure) in tandem. New features released by Cisco, such as QoS features, normally start in the process switching path until they are well tested; only then can they be promoted to a more efficient switching path.
NetFlow switching was an intermediary technology on the way to CEF switching, but CEF has superseded it as a switching path. However, NetFlow accounting and flow acceleration remain powerful IP switching features for content networking and work in conjunction with the previously described switching paths. NetFlow accounting exports real-time flow information for billing purposes.
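To make the CEF data structures described in this section more concrete, the following Python sketch builds a trie whose nodes hold only an index into a separate adjacency table. It is a simplified illustration under stated assumptions: a binary (one bit per level) trie is used for brevity, whereas production implementations typically examine several bits per step, and the class names, prefixes, interface names, and MAC addresses are all hypothetical.

```python
import ipaddress

# Hypothetical adjacency table: forwarding details live here, not in the trie.
ADJACENCY = [
    {"interface": "Ethernet0", "mac": "0e5d.24a2.ad92"},
    {"interface": "Ethernet1", "mac": "0e5d.24a2.ad93"},
    {"interface": "Ethernet2", "mac": "0e5d.24a2.ad94"},
]


class Node:
    def __init__(self):
        self.children = [None, None]  # one child per bit value
        self.adj = None               # index into ADJACENCY, if a prefix ends here


class Fib:
    def __init__(self):
        self.root = Node()

    def insert(self, prefix, adj_index):
        """Walk the prefix bits from the most significant bit, adding nodes."""
        net = ipaddress.ip_network(prefix)
        bits = int(net.network_address)
        node = self.root
        for i in range(net.prefixlen):
            bit = (bits >> (31 - i)) & 1
            if node.children[bit] is None:
                node.children[bit] = Node()
            node = node.children[bit]
        node.adj = adj_index

    def lookup(self, addr):
        """Longest-prefix match: remember the deepest adjacency seen."""
        bits = int(ipaddress.ip_address(addr))
        node = self.root
        best = node.adj
        for i in range(32):
            node = node.children[(bits >> (31 - i)) & 1]
            if node is None:
                break
            if node.adj is not None:
                best = node.adj
        return None if best is None else ADJACENCY[best]


fib = Fib()
fib.insert("0.0.0.0/0", 1)            # default route
fib.insert("209.165.200.0/24", 0)
fib.insert("209.165.200.128/25", 2)   # more specific route
```

Because the trie stores only indexes, a change to an adjacency entry (for example, a new MAC address for a next hop) does not require touching the prefixes that point to it, which mirrors the separation of reachability and forwarding information described for CEF.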