Connecting devices with private paths over a shared infrastructure is a well-known problem. SPs have solved this with different iterations of VPN solutions over the years. Not surprisingly, we can use and adapt many of these same protocols in enterprise networks to create virtualized Layer 2 and Layer 3 connections using a common switched infrastructure. The focus in this section is on the more relevant of the rather overwhelming menu of protocols to build a VPN. Some of this section is a review for many readers, especially the material on 802.1q, generic routing encapsulation (GRE), and IPsec, and we do not devote much space to these topics. However, we also include label switching (a.k.a. MPLS) and Layer 2 Tunnel Protocol Version 3 (L2TPv3), which are probably less familiar and which consequently are covered in more detail.

Note: In addition to the references listed at the end of the book, we refer interested readers to Appendix A, "L2TPv3 Expanded Coverage," for more detail about L2TPv3.

Layer 2: 802.1q Trunking

You probably do not think of 802.1q as a data-path virtualization protocol. But, the 802.1q protocol, which inserts a VLAN tag on Ethernet links, has the vital attribute of guaranteeing address space separation on network interfaces. Obviously, this is a Layer 2 solution, and each hop must be configured separately to allow 802.1q connectivity across a network. Because a VLAN is synonymous with a broadcast domain, end-to-end VLANs are generally avoided.

Generic Routing Encapsulation

GRE provides a method of encapsulating arbitrary packets of one protocol type in packets of another type (the RFC uses the expression X over Y, which is an accurate portrayal of the problem being solved). The data from the top layer is referred to as the payload. The bottom layer is called the delivery protocol. GRE allows private network data to be transported across shared, possibly public infrastructure, usually using point-to-point tunnels.
Although GRE is a generic X over Y solution, it is mostly used to transport IP over IP (a lightly modified version was used in the Microsoft Point-to-Point Tunneling Protocol [PPTP] and, recently, we are seeing GRE used to transport MPLS). GRE is also used to transport legacy protocols, such as Internetwork Packet Exchange (IPX) and AppleTalk, as well as Layer 2 frames, over an IP network. GRE, defined in RFC 2784, has a simple header, as you can see in Figure 4-5.

Figure 4-5. GRE Header

The second 2 octets of the header contain the payload protocol type, encoded using Internet Assigned Numbers Authority (IANA) Ethernet numbers (you can find the most recent version on http://www.iana.org/assignments/ethernet-numbers). IP is encoded as 0x0800. The simplest possible expression of a GRE header is a Protocol Type field. All the preceding fields are typically 0, and the subsequent ones can be omitted. You can find freeware implementations that work only with the first 2 octets, but all 4 should be supported.

GRE is purely an encapsulation mechanism. How packets arrive at tunnel endpoints is left entirely up to the user. There is no control protocol, no session state to maintain, no accounting records, and so forth; and this conciseness and simplicity allows GRE to be easily implemented in hardware on high-end systems. The concomitant disadvantage is that GRE endpoints have no knowledge of what is happening at the other end of the tunnel, or even whether it is reachable.

The time-honored mechanism for detecting tunnel reachability problems is to run a dynamic routing protocol across the tunnel. Routing Protocol (RP) keepalives are dropped if the tunnel is down, and the RP itself will declare the neighbor as unreachable and attempt to route around it. You can lose a lot of data waiting for an RP to detect a problem in this way and reconverge. Cisco added a keepalive option to its GRE implementation. This option sends a packet through the tunnel at a configurable period.
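Returning briefly to the header format described above, the following is a minimal Python sketch of the simplest GRE header: 2 octets of flags and version, all 0, followed by the 2-octet Protocol Type. The function names are illustrative assumptions, not part of any GRE implementation, and the sketch deliberately rejects headers that carry optional fields.

```python
import struct
from typing import Tuple

ETHERTYPE_IPV4 = 0x0800  # IANA Ethernet number for IP, as quoted in the text


def gre_encapsulate(payload: bytes, protocol_type: int = ETHERTYPE_IPV4) -> bytes:
    """Prepend the minimal 4-octet GRE header to a payload packet.

    First 2 octets: flags and version, all 0 (no optional fields).
    Second 2 octets: payload protocol type.
    """
    header = struct.pack("!HH", 0, protocol_type)
    return header + payload


def gre_decapsulate(packet: bytes) -> Tuple[int, bytes]:
    """Return (protocol_type, payload) for a minimal GRE packet."""
    if len(packet) < 4:
        raise ValueError("packet shorter than the 4-octet GRE header")
    flags_version, protocol_type = struct.unpack("!HH", packet[:4])
    if flags_version != 0:
        # Assumption of this sketch: optional fields are not parsed.
        raise ValueError("optional GRE fields are not handled here")
    return protocol_type, packet[4:]


if __name__ == "__main__":
    inner = b"inner-ip-packet"  # stand-in bytes, not a real IP packet
    outer = gre_encapsulate(inner)
    print(outer[:4].hex())          # header octets
    print(gre_decapsulate(outer))   # protocol type and recovered payload
```

In a real tunnel, the result of `gre_encapsulate` would itself become the payload of a delivery-protocol packet addressed to the tunnel destination; that step is outside this sketch.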
After a certain number of missed keepalives (the number is configurable), the router declares the tunnel interface as down. A routing protocol would detect the interface down event and react accordingly.

GRE's lack of control protocol also means that there is essentially no cost to keeping a quiescent tunnel active. The peers exchange no state information and must simply encapsulate packets as they arrive. Furthermore, like all the data-path virtualization mechanisms we discuss, the core network is oblivious of the number of tunnels traversing it. All the work is done on the edge.

We do not want to suggest that GRE is the VPN equivalent of a universal solvent. There is a cost to processing GRE (encapsulation/decapsulation, route lookup, and so forth), but it's in the data path.

GRE IOS Configuration

On Cisco devices, GRE endpoints are regular interfaces. This seemingly innocuous statement is replete with meaning, because anything in Cisco IOS that needs to see an interface (routing protocols, access lists, and many more) will work automatically on a GRE tunnel. Example 4-7 shows a GRE endpoint configuration, corresponding to the R103 router of Figure 4-6.

Figure 4-6. GRE Topology

Example 4-7. R103 GRE Configuration
The tunnel source and tunnel destination addresses are part of the transport network address space. They need to match on both endpoints so that a source address on one router is the destination address on the remote device. The router must also have a path in its routing table to the tunnel destination address. The next hop to the tunnel destination must point to a real interface and not the tunnel interface. In this case, the router has a tunnel interface with tunnel destination of 192.168.2.1 on the public network. The 40.0.0.0/24 network used for the tunnel's IP address, however, is part of the private address space used on Sites 1 and 2.

IPsec

IPsec provides a comprehensive suite of security services for IP networks. IPsec was originally conceived to provide secure transport over IP networks. The security services include strong authentication (Authentication Header [AH]) and encryption (Encapsulating Security Payload [ESP]) protocols, ciphers, and key-exchange mechanisms. IPsec provides a way for peers to interoperate by negotiating capabilities, keys, and security algorithms. IPsec peers maintain a database of security associations. A security association (SA) is a contract between peers, which defines the following:
The SA is negotiated when an IPsec session is initiated. Each IPsec header contains a unique reference to the SA for this packet in a Security Parameter Index (SPI) field, which is a 32-bit numeric reference to the SA needed to process the packet. Peers maintain a list of SAs for inbound and outbound processing. The value of the SPI is shared between peers. It is one of the things exchanged during IPsec session negotiation. At the protocol level, there are two IPsec headers:
It is possible to use authentication and encryption services separately or together. If used in combination, the AH header precedes the ESP header. There are two ways to encapsulate IPsec packets. The first, called tunnel mode, encrypts an entire IP packet, including the header, in the IPsec payload. A new IP header is generated for the encrypted packet, as shown in Figure 4-7. Figure 4-7. IPsec Tunnel Mode Stack
Tunnel mode adds a 20-octet overhead with the new IP header. To reduce issues with packet size and fragmentation, a second mode was defined, called transport mode. Transport mode just protects the TCP/UDP layer and is shown in Figure 4-8. Tunnel mode is better than transport mode at traversing Network Address Translation (NAT) devices.

Figure 4-8. IPsec Transport Mode Stack
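The difference between the two encapsulation modes can be sketched as a pair of header stacks. The following Python fragment is only an illustration under stated assumptions: the layer names are descriptive strings, the function names are invented for this sketch, and the only size it models is the 20-octet IP header quoted above (IPsec header and trailer sizes are ignored).

```python
from typing import List, Sequence

IP_HEADER_OCTETS = 20  # IP header size quoted in the text (no options assumed)


def tunnel_mode_stack(ipsec_headers: Sequence[str]) -> List[str]:
    """Tunnel mode: the whole original packet, header included, is protected
    and a new IP header is generated in front of the IPsec headers."""
    return ["new IP header", *ipsec_headers,
            "original IP header", "TCP/UDP", "data"]


def transport_mode_stack(ipsec_headers: Sequence[str]) -> List[str]:
    """Transport mode: only the TCP/UDP layer is protected and the
    original IP header is kept."""
    return ["original IP header", *ipsec_headers, "TCP/UDP", "data"]


def extra_ip_octets(stack: Sequence[str]) -> int:
    """Octets added by IP headers beyond the one the packet already had."""
    ip_headers = sum(1 for layer in stack if layer.endswith("IP header"))
    return IP_HEADER_OCTETS * (ip_headers - 1)


if __name__ == "__main__":
    # When both services are used, the AH header precedes the ESP header.
    headers = ["AH", "ESP"]
    for name, stack in (("tunnel", tunnel_mode_stack(headers)),
                        ("transport", transport_mode_stack(headers))):
        print(name, "|".join(stack), extra_ip_octets(stack))
```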
IPsec requires a lot of negotiation to bring up a session. So much so that there is a separate control channel protocol, called Internet Key Exchange (IKE), used to negotiate the SA between peers and exchange key material. Note that IKE is not mandatory; you can statically configure the SAs. IKE is not only used during tunnel setup. During confidential data exchange, the session keys used to protect unidirectional traffic may need to be changed regularly, and IKE is used to negotiate new keys. IKE traffic itself is encrypted, and, in fact, it has its own SA. Most of the parameters are fixed as follows:
Cisco IOS IPsec Configuration

There is a lot more to IPsec than you will see here, but there are three basic parts to the configuration, which correspond to setting up SAs first for IKE, and then for the session itself, and defining which traffic to encrypt. The steps of the configuration are as follows:
L2TPv3

Note: Appendix A contains an expanded version of this section that discusses the L2TPv3 protocol in more detail.

The L2TPv3 protocol consists of components to bring up, maintain, and tear down sessions, and the capability to multiplex different Layer 2 streams into a tunnel. The L2TP protocol has both a control and a data plane. The control channel is reliable. There are 15 different control message types. The major ones are for the setup and teardown of the control channel itself (see Appendix A for more detail).

L2TPv3 peers can exchange capability information for the session during the setup phase. The most important of these are the session ID and cookie. The session ID is analogous to the control channel identifier and it is a "shortcut" value that the receiver associates with the negotiated context for a particular session (for instance, payload type, cookie size, and so forth). The cookie is an optional, variable-length field of up to 64 bits. The cookie is a cryptographically random number that extends the session identifier space so as to ensure there is little chance that a packet is misdirected because of a corrupt session ID. 2^64 is a large number and, as long as it is random, the cookie makes L2TPv3 impervious to brute-force spoofing attacks, where the attacker tries to inject packets into an active session.

After a session is established through the control session, the L2TP endpoint is ready to send and receive data traffic. Although the data header has a Sequence Number field, the data channel is not reliable. The protocol can detect missing, duplicate, or out-of-order packets, but does not retransmit. That is left to higher-layer protocols. The RFC allows for the data channel to be set up either using the native control protocol, or statically, or using another control mechanism.
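The role of the session ID and cookie on the receive side can be sketched in a few lines of Python. This is a minimal illustration, not an L2TPv3 implementation: the `Session` class, the function names, and the in-memory session table are assumptions made for the example, and no packet parsing or control-channel negotiation is modeled.

```python
import hmac
import secrets
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Session:
    session_id: int    # "shortcut" to the negotiated context
    cookie: bytes      # cryptographically random, up to 64 bits
    payload_type: str  # part of the negotiated context


def new_session(session_id: int, payload_type: str,
                cookie_octets: int = 8) -> Session:
    """Create a session with a random cookie (8 octets = 64 bits)."""
    return Session(session_id, secrets.token_bytes(cookie_octets), payload_type)


def accept(sessions: Dict[int, Session], session_id: int,
           cookie: bytes) -> Optional[Session]:
    """Return the session context for a data packet, or None to drop it.

    A packet is accepted only if the session ID is known AND the cookie
    matches, so a corrupt or guessed session ID alone is not enough.
    """
    session = sessions.get(session_id)
    if session is None:
        return None
    if not hmac.compare_digest(session.cookie, cookie):
        return None
    return session


if __name__ == "__main__":
    s = new_session(session_id=1001, payload_type="Ethernet")
    table = {s.session_id: s}
    print(accept(table, 1001, s.cookie) is s)    # genuine packet
    print(accept(table, 1001, bytes(8)) is None)  # blind guess, almost surely dropped
    # Chance that a single blind guess matches a 64-bit random cookie:
    print(1 / 2**64)
```

The cookie comparison uses a constant-time function as a design choice of this sketch; the text itself only requires that the cookie be random.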
In the design sections after Chapter 5, "Infrastructure Segmentation Architectures: Theory," you will see occasions when, frankly, GRE could solve a problem just as well as L2TPv3. What then are the differences between these two protocols? Following is a list of them:
L2TPv3 IOS Configuration

There are three things to configure for the L2TPv3 IOS configuration:
To configure the first of these parameters, use the l2tp-class command for control channel setup. Here, you can change sequence number settings and so on, but the minimum required is the shared password known to both peers. Example 4-12 demonstrates the use of this command. Example 4-12. l2tp-class Command
As in classic L2TP setup, if you do not give a hostname parameter, the device name is used. The second part of the configuration is for the data channel. Cisco IOS uses the pseudowire command, which is a generic template also used for Layer 2 over MPLS (called AToM) setup. The pseudowire-class specifies the encapsulation and refers to the control channel setup with the protocol l2tpv3 name command (if you omit this, default control channel settings are used). The pseudowire-class also contains the name of the interface used as the source address of the L2TPv3 packets. Example 4-13. L2TP pseudowire-class Command
Figure 4-9. L2TPv3 Topology

The final part of the configuration (see Example 4-14) binds the client-facing attachment circuit to the trunk port using the xconnect command (already introduced in the discussion on VPLS earlier in this section). The xconnect command defines the remote peer IP address and a unique virtual circuit (VC) identifier used on each peer to map the L2TPv3 payload to the correct attachment circuit. The L2TPv3 endpoints negotiate unique session and cookie ID values for each VC ID, as shown in Figure 4-9. You must configure a different VC ID for each VLAN, port, or data-link connection identifier (DLCI) transported across an L2TPv3 tunnel (currently, Cisco L2TPv3 supports Ethernet, 802.1q [VLAN], Frame Relay, High-Level Data Link Control [HDLC], and PPP).

Example 4-14. xconnect Command
It's interesting that although the second and third versions of the protocol differ in relatively small ways, the command-line interface (CLI) configuration differs significantly from the standard L2TP access concentrator / L2TP network server (LAC/LNS) configuration that you might have used for dialup or digital subscriber line (DSL) networks. However, there are obvious, and deliberate, similarities with other pseudowire solutions such as Ethernet over MPLS (EoMPLS).

Label Switched Paths

Label switched paths (LSPs) are an interesting hybrid of all the preceding data-path solutions: a Layer 2 data path with a Layer 3 control plane. Of course, LSPs are found in MPLS networks, which is a topic that has generated entire library shelves of books and other documents. In this chapter, we present a short review of how packets traverse an MPLS network. We do not cover label distribution or any of the major MPLS applications, such as VPN or traffic engineering (MPLS VPNs are discussed in depth in Chapter 5, however). What we are going to cover may be summarized as follows:
In a normal routing scenario, when a router needs to forward a packet, it finds the outgoing interface by looking for a matching IP address prefix in the routing table. The actual interface used for forwarding corresponds to the shortest path to the IP destination, as defined by the routing policy. Other administrative policies, such as QoS and security, may affect the choice of interface. This collection of criteria used for forwarding decisions is more generally referred to as a Forwarding Equivalence Class (FEC). The classification of a packet to an FEC is done on each router along the IP path and happens independently of the other routers in the network.

MPLS decouples packet forwarding from the information in the IP header. An MPLS router forwards packets based on fixed-length labels instead of matching on a variable-length IP address prefix. The label is a sort of shortcut for an FEC classification that has already happened. Where the label comes from is discussed later in this section, but for now, it is enough to say that the labels are calculated based on the topology information in the IP routing table. RFC 3031 puts it like this:
Before looking at this in more detail, we need to introduce some definitions:
Figure 4-10 illustrates MPLS-based forwarding, showing each of the different types of router from the preceding list.

Figure 4-10. MPLS Forwarding

As a packet flows across the network shown in Figure 4-10, it is processed by each hop as follows:
Now the difference with standard IP forwarding should be clearer. FEC classification is done when a packet enters the MPLS network, not at every hop. An LSR needs to look only at the packet's label to know which outgoing interface to use. There can be different labels on an LSR for the same IP destination. Saying the same thing in a different way, there can be multiple LSPs for the same destination. A key point to understand is that the control plane is identical in both the IP and MPLS cases. LSRs use IP routing protocols to build routing tables, just as routers do. An LSR then goes the extra step of assigning labels for each destination in the routing table and advertising the label/FEC mapping to adjacent LSRs. ATM switches can also be LSRs. They run IP routing protocols, just as a router LSR does, but label switch cells rather than packets. What is missing from this description is how label information is propagated around the network. How does LSR A in Figure 4-10 know what label to use? MPLS networks use a variety of signaling protocols to distribute labels:
Label Distribution Protocol (LDP), which runs over TCP port 646, is used in all MPLS networks to distribute labels for all prefixes in the node's routing table. Referring again to Figure 4-10, LSR D and LSR B would bring up an LDP session (LSR B would have another session with LSR C and so forth). LSR D is connected to the customer 192.168.2.0/24 network and advertises this prefix to all its routing peers. LSR D also sends a label to LSR B for the 192.168.2.0 network. When LSR B's routing protocol converges and it sees 192.168.2.0 as reachable, it sends label 22 to LSR C. This process continues until LSR A receives a label from LSR C. The complete end-to-end set of labels from LSR A to LSR D forms an LSP. An LSP is unidirectional. There is another LSP, identified by a different set of labels, for return traffic from LSR D to LSR A. Understand that two operations must complete for the LSP from LSR A to 192.168.2.0 to be functional:
Figure 4-10 does not show a numeric value for the label between LSR B and LSR D. In fact, as already discussed, the packet on this link has no label at all, because of PHP. Nevertheless, LSR D does still advertise a special value in LDP, called an implicit null (which has a reserved value of 3), so that LSR B performs PHP.

Note: In fact, LSR D might use several special label values for the 192.168.2.0 prefix, such as the aggregate or explicit null.

After LSR A has all the information it needs to forward data across the MPLS network, it encapsulates outgoing packets in a shim header, shown in Figure 4-11 and defined in RFC 3032, which is inserted between the Layer 2 and Layer 3 headers. Encapsulation stacks are defined in different RFCs for Ethernet, ATM, PPP, and other media.

Figure 4-11. MPLS Shim Header

The MPLS header is simple, as you can see in Figure 4-11. The label itself defines a flat, 20-bit address space. The EXP bits are defined as Experimental, but are in fact used for QoS. MPLS QoS is explained in more detail in the MPLS QoS section of this chapter. The S bit is set on the lowest label when there is more than one label on a packet, which is called a stack. The Time-To-Live (TTL) is analogous to the IP TTL. Many MPLS applications, such as virtual private networking (VPN) and fast reroute (FRR), involve multiple layers, or stacks, of labels. However, an LSR forwards on the basis of the top, or outer, label values only and never looks at the inner ones.

The FIB Revisited

Label switching adds a forwarding path on a router. The FIB and RIB discussed previously in this chapter contain only IP prefixes. LDP stores labels in a Label Information Base (LIB), and the label values are added to the existing forwarding information in a Label Forwarding Information Base (LFIB). The LIB should have an entry for every non-BGP route in the routing table and all the labels advertised by LDP neighbors. The LFIB is built using a combination of the FIB and LIB.
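One way to picture how the FIB and LIB combine is the following Python sketch. It is a simplified model under stated assumptions, not how Cisco IOS stores these tables: the dictionary layouts and the `build_lfib` function are invented for illustration, and the peer names and the label advertised by the second peer are hypothetical. Only the prefix 101.101.101.101/32, label 19, interface Ethernet0/0, and the implicit null value of 3 come from the text.

```python
from typing import Dict, Optional, Tuple

IMPLICIT_NULL = 3  # reserved label value that asks the upstream LSR to do PHP

# FIB: prefix -> (next-hop peer, outgoing interface)
Fib = Dict[str, Tuple[str, str]]
# LIB: prefix -> {advertising peer: label}
Lib = Dict[str, Dict[str, int]]
# LFIB: incoming (local) label -> (action, outgoing label, outgoing interface)
Lfib = Dict[int, Tuple[str, Optional[int], str]]


def build_lfib(fib: Fib, lib: Lib, local_labels: Dict[str, int]) -> Lfib:
    """Install, for each prefix, the label learned from the FIB's next hop."""
    lfib: Lfib = {}
    for prefix, (next_hop, interface) in fib.items():
        remote_label = lib.get(prefix, {}).get(next_hop)
        local_label = local_labels.get(prefix)
        if remote_label is None or local_label is None:
            continue  # no usable label: the prefix stays IP-forwarded only
        if remote_label == IMPLICIT_NULL:
            lfib[local_label] = ("pop", None, interface)  # PHP
        else:
            lfib[local_label] = ("swap", remote_label, interface)
    return lfib


if __name__ == "__main__":
    prefix = "101.101.101.101/32"
    # A transit LSR: two peers advertise labels, the FIB picks one next hop.
    fib = {prefix: ("peer-A", "Ethernet0/0")}
    lib = {prefix: {"peer-A": 19, "peer-B": 20}}  # peer-B's label is hypothetical
    print(build_lfib(fib, lib, {prefix: 19}))
    # A penultimate hop: the next hop advertises implicit null.
    lib_php = {prefix: {"peer-A": IMPLICIT_NULL}}
    print(build_lfib(fib, lib_php, {prefix: 19}))
```

The label from `peer-B` stays in the LIB but never reaches the LFIB, because the FIB does not select that peer as the next hop.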
For a given prefix, if there is a label in the LIB that is received from the LDP peer address as determined by the FIB, that label is installed in the LFIB and is used for forwarding. It is important to understand that the LFIB does not replace the FIB. MPLS creates an alternative path through the router. However, IP packets continue to be forwarded using the FIB, and certain special label values can make a router do an FIB lookup.

Cisco IOS LSP Example

Figure 4-12 shows a simple MPLS topology. All routers are running MPLS on their interfaces, with LDP advertising labels to adjacent devices. The core routing protocol is Open Shortest Path First (OSPF), used on all interfaces. The configuration of each device is virtually identical, with the only MPLS-specific commands being activation of LDP (instead of TDP, an earlier alternative) and label switching on each interface, using the mpls ip command (which, for historical reasons, shows up in the output as tag-switching ip). Example 4-15 shows the configuration for R103 in case you want to try this at home.

Figure 4-12. MPLS Network Topology

Example 4-15. R103 Configuration
Three show commands enable you to see the mapping of routes from LIB to LFIB. Examples 4-16 through 4-18 give the output of each one in turn and trace labels used to reach R101's loopback address, 101.101.101.101, from R105. To avoid repetitive command output, we focus on R103 and R102.

Example 4-16. R103 show ip route output
There is a one-to-one mapping between the content of the routing table in Example 4-16 and the LIB of Example 4-17. The LFIB, shown in Example 4-18, only contains labels for LSPs that cross the device. If an MPLS packet arrives with an unknown label, it is dropped. Example 4-17. R103 Label Information Base
Example 4-18. R103 Label Forwarding Information Base
Figure 4-12 shows the label values advertised by each LSR for prefix 101.101.101.101. Example 4-17 shows how this label information appears in R103's LIB. There are three entries for 101.101.101.101:
In the LFIB in Example 4-18, there is a single entry for 101.101.101.101/32. It means that R103 will forward a packet received with value 19 onto interface Ethernet0/0. R103 also swaps the label value. It is just a coincidence that the same values are used for the same IP prefix on different routers. Labels have local significance. Figure 4-12 shows that router R102 receives an implicit null label from R101 and so performs PHP with label value 19 and forwards an IP packet on interface Ethernet0/0.

Data-Path Virtualization Summary

We presented several different protocols that can be used for data-path virtualization. Two of them are suitable for Layer 2 traffic only: 802.1q, which is configured on each hop, and L2TPv3, which is configured end to end. IPsec is suitable for IP transport. Finally, GRE and MPLS LSPs can be used for either Layer 2 or Layer 3. GRE is another IP tunnel protocol, configured only on endpoints. MPLS creates a new forwarding path and is configured on all hops in a network.