Almost all the VPN architectures that we have looked at so far have a trait in common: They all use an overlay model. CPE devices peer with each other, oblivious to the fact that all their control- and data-plane traffic is tunneled across a network. The architecture used by the VPNs that we examine in this section differs. It was first proposed in RFC 2547 and is often referred to by this moniker. The major difference is that CE and PE routers form a peer relationship. PE routers use Multiprotocol BGP (MP-BGP) to exchange customer route information. In a Layer 3 scenario, this greatly reduces the routing complexity on the CE routers (which is the value of such a service to a customer). Now the PE is the next hop for every CE. Figure 5-8 shows the peer relationship at the edge of the network. Compared to pure overlay solutions, the PE has more work to do. The details will become clearer as you proceed through this section.

Figure 5-8. RFC 2547 Control-Plane Interaction

Figure 5-8 shows a simple RFC 2547 network with two VPNs connected over a core network. The RED network has two sites, which use 192.168.3.0/24 and 192.168.4.0/24. The VPN must connect these together. PE A and PE B peer with CE1 and CE2, respectively, and use OSPF to exchange routes. CE1 advertises 192.168.3.0/24, and CE2 advertises 192.168.4.0/24 over the dedicated link that connects to the PE router. When the routing protocols over the VPN connection converge, a show ip route command on CE2 shows 192.168.3.0/24 with a next hop of 192.168.2.1, which is an address on PE B. Note that the CE-PE links are part of the customer address space. When PE A receives an OSPF update from CE1, it first stores the route in the RED VRF table. PE A has MP-BGP sessions open with PE B and PE C (only the first of these is shown in Figure 5-8). To provide Layer 3 reachability across the network core, the PEs must advertise customer routes to each other.
Thus, for the RED VPN, PE A announces reachability information for prefix 192.168.3.0/24 to PE B, but in a modified form called a VPNv4 address. PE A also receives route updates from its peers. In this case, it learns that 192.168.4.0 is reachable through 10.10.10.12 (PE B's loopback address). The BGP Network Layer Reachability Information (NLRI) exchange has important additional attributes, too, most notably the following:
Now we will review each of the items announced with MP-BGP. RFC 2547 provides address separation between VPNs by using VRFs on PE routers, dedicated interfaces between CE and PE, and an extended VPNv4 address space across the core network. VPNv4 addresses are created by concatenating a route distinguisher (RD) with the customer route prefixes. Every VPN must have different RDs (note that there can be more than one RD). PEs store customer IPv4 routes in VRF tables and exchange corresponding VPNv4 routes with other PEs. A traffic analyzer would see only VPNv4 data in the internal BGP (iBGP) packets. Each PE is configured to map an RD to each VRF. The iBGP session uses this value when it exchanges the prefixes in the VRF with other PEs. VPNv4 prefixes are used only in the control plane. Actual packets (at least with all current implementations) maintain their IP address format for source and destination. In addition, no change occurs to routing exchanges between CE and PE, which use standard IPv4. PEs rewrite the next-hop address information (which would be a CE address) and replace it with one of their own addresses that is part of the core network address space. In this way, a PE can send traffic destined for another part of a customer VPN over the core network using standard routing lookups. When the time comes to forward traffic to CE2, PE A does a lookup to find the route to 192.168.4.0, which resolves to 10.10.10.12, and PE A must do another forwarding information base (FIB) lookup to find the route to this address. This time the next hop is 10.0.0.2, which is an LSR. The core routing protocol of Figure 5-8 announces the 10.0.0.0 prefixes, which include the addresses used by the PE routers as their BGP identifiers. The next two sections discuss two forwarding-plane alternatives to carry the intra-VPN traffic, first with MPLS and then with L2TPv3. Note the presence of the VPN label in the preceding list.
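The VPNv4 mechanics just described can be sketched in a few lines of Python. This is an illustration only: the RD and loopback values are taken from the figure discussion, and the VPNv4 encoding is simplified to a string rather than the real 12-byte wire format.

```python
# Minimal sketch of VPNv4 route creation on a PE (illustrative, not a real
# BGP implementation). An RD such as 100:1 is prepended to each customer
# IPv4 prefix, keeping otherwise-overlapping customer routes distinct.

def make_vpnv4(rd: str, prefix: str) -> str:
    """Concatenate a route distinguisher with an IPv4 prefix."""
    return f"{rd}:{prefix}"

# Each VRF on the PE is configured with its own RD (hypothetical values).
vrf_rd = {"RED": "100:1", "BLUE": "100:2"}

# Customer IPv4 routes learned from the CE are stored per VRF...
red_vrf = ["192.168.3.0/24"]

# ...and advertised to iBGP peers as VPNv4 routes, with the PE's own
# loopback substituted as the BGP next hop (replacing the CE address).
pe_loopback = "10.10.10.11"
announcements = [
    {"vpnv4": make_vpnv4(vrf_rd["RED"], p), "next_hop": pe_loopback}
    for p in red_vrf
]

print(announcements[0])
# {'vpnv4': '100:1:192.168.3.0/24', 'next_hop': '10.10.10.11'}
```

The key point the sketch captures is that the RD makes the control-plane route unique per VPN, while the next-hop rewrite anchors it in the provider's address space.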
Each PE generates a 20-bit label value for the VPN the address is associated with. The data plane uses this label to identify which VRF should be used to forward a packet received on a core-facing interface. The RFC 2547 model naturally creates a full mesh between PEs, so every CE can reach any other CE in two hops. It is possible to constrain inter-CE reachability, even as far as creating a hub-and-spoke topology, using route targets (RTs). RTs are extended BGP communities that are announced between PEs. A PE can be configured to export prefixes with a certain RT and to import only prefixes that match a specified RT value. RTs allow arbitrary meshes to be built between sites. Both RD and RT are encoded in a numeric format, usually based on the autonomous system number of the site (for example, 100:1). Despite the common format, no link exists between the two. You can use multiple RD values within the same VPN, with the restriction of one RD per VRF, but it is better to have a one-to-one mapping for operational simplicity. Figure 5-9 shows PE1 and PE2 routers that export prefixes with a "Spoke" RT value of 100:1 and use 200:1 for import. PE3 imports the 100:1 routes into VRF_IN, which it then distributes to the CE_Hub1 router. CE_Hub2 announces these routes back to PE3, where they are placed in VRF_OUT and then exported with a 200:1 route-target value. As a result, traffic between CE1 and CE2 is routed through the Hub site.

Figure 5-9. Hub-and-Spoke Using RTs

The details follow:
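The import/export filtering behind this hub-and-spoke design can be sketched as follows. The RT values (spokes export 100:1 and import 200:1; the hub re-exports with 200:1) follow the Figure 5-9 discussion; everything else is a hypothetical simplification.

```python
# Sketch of route-target import/export filtering on a PE.
# A route is installed into a VRF only if it carries a matching RT.

def import_routes(received, import_rt):
    """Install only those VPNv4 routes carrying the VRF's import RT."""
    return [r for r in received if import_rt in r["rts"]]

# Routes as announced across the core (invented prefixes from the example).
announced = [
    {"prefix": "192.168.3.0/24", "rts": {"100:1"}},   # from spoke PE1
    {"prefix": "192.168.4.0/24", "rts": {"100:1"}},   # from spoke PE2
    {"prefix": "192.168.3.0/24", "rts": {"200:1"}},   # re-exported by hub PE3
]

# The hub's VRF_IN imports 100:1, so it sees every spoke route.
hub_view = import_routes(announced, "100:1")

# Spoke PEs import only 200:1, so they never install each other's routes
# directly; all inter-spoke traffic is forced through the hub.
spoke_view = import_routes(announced, "200:1")
print([r["prefix"] for r in spoke_view])  # only the hub-re-exported route
```

Note that nothing ties the RT values to the RDs; the matching is purely on the extended-community value each VRF is configured to import.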
In the default, full-mesh scenario, PE1 and PE2 of Figure 5-9 would install each other's routes so that packets from CE1 to CE2 would follow the most direct path across the core. The RFC 2547 model supports auto-discovery but not auto-provisioning. If you add a new network at a site, reachability information is automatically propagated to all the other sites. With Layer 3 VPNs, you can also use route aggregation to reduce the number of prefixes that must be advertised between sites. Provisioning is not complicated, but neither is it automatic: As with any BGP session, you must configure the routes to announce on a PE (and, potentially, the route prefixes to import on other PEs, although this is not obligatory). If you add a new site connected to a new PE, every other PE must be configured to bring up an iBGP session with that PE. RFC 2547 allows the use of BGP route reflectors (RRs), which remove the N-squared connectivity between PEs, thus helping the architecture scale to large numbers of sites, and which make provisioning a one-time operation on any PE (each PE peers only with the RR). The choice of BGP gives the RFC 2547 architecture well-understood properties of scale and robustness. As the protocol used to manage Internet backbone routes, BGP is known to be suitable for large networks. It also supports flexible policy statements. BGP was extended to work in RFC 2547 architectures; the result is known as MP-BGP and can announce VPNv4, VPNv6, IPv4, and IPv6 routes. Customers are free to use any routing protocol on CE-PE links. The only caveat is that the PE implementation must be able to store the routes in a VRF (and hence must be VRF-aware). Standard route redistribution allows appropriate routes to be announced to and imported from MP-BGP. MPLS-VPN is the most common implementation of RFC 2547, and we examine it first. The RFC describes both the MP-BGP control plane and the MPLS-based forwarding plane.
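The scaling benefit of route reflectors is easy to quantify: a full iBGP mesh among N PEs needs N(N-1)/2 sessions, whereas a single route reflector needs only N. A quick back-of-the-envelope check:

```python
# Session counts for full-mesh iBGP versus a single route reflector.
# (A production design would use redundant RRs, roughly doubling the RR count.)

def full_mesh_sessions(n: int) -> int:
    return n * (n - 1) // 2   # one session per PE pair

def rr_sessions(n: int) -> int:
    return n                  # one session per PE, all to the reflector

for n in (10, 50, 100):
    print(n, full_mesh_sessions(n), rr_sessions(n))
# at 100 PEs: 4950 sessions full mesh versus 100 with a route reflector
```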
However, a role exists for a more generalized model that runs over IP networks, and we look at emerging proposals to run RFC 2547 over L2TPv3 and mGRE tunnels.

Note: The statement at the beginning of this section posited that almost all the previous VPNs were overlays. Which one wasn't? VPLS. The CE peers with the network at Layer 2.

RFC 2547bis the MPLS Way

MPLS VPNs are built with a double label stack. The inner VPN route label identifies the customer VRF, and the outer tunnel label identifies the next hop on the LSP to the egress PE. The easiest way to understand the operation is through an example. Figure 5-10 shows an end-to-end MPLS VPN example with routing information across the network.

Figure 5-10. MPLS VPN Forwarding

Before traffic is forwarded, PE routers must prepend labels so that the packet can reach the right VRF on the right PE. Figure 5-10 shows a packet going from CE green1 to CE green2.
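The per-hop label operations in that example can be sketched as follows. The label values, the single-LSR topology, and the function names are invented for illustration; only the push/swap/pop pattern reflects the architecture.

```python
# Illustrative walk of an MPLS VPN packet across the core.
# Ingress PE pushes [tunnel_label, vpn_label]; each core LSR swaps the
# outer label; the penultimate LSR pops it (PHP); the egress PE uses the
# exposed VPN label to select the customer VRF.

def ingress_pe(packet, vpn_label, tunnel_label):
    return [tunnel_label, vpn_label, packet]   # outer label first

def lsr_swap(stack, out_label):
    return [out_label] + stack[1:]             # swap the outer label only

def lsr_php(stack):
    return stack[1:]                           # pop outer label; VPN label now on top

def egress_pe(stack, lfib):
    vpn_label = stack[0]
    return lfib[vpn_label], stack[1]           # (VRF name, original IP packet)

stack = ingress_pe("ip-packet", vpn_label=30, tunnel_label=17)
stack = lsr_swap(stack, 21)    # a core LSR swaps 17 -> 21
stack = lsr_php(stack)         # penultimate hop pops the tunnel label
vrf, pkt = egress_pe(stack, lfib={30: "GREEN"})
print(vrf, pkt)                # GREEN ip-packet
```

Because PHP removes the tunnel label one hop early, the egress PE needs only a single LFIB lookup on the VPN label to pick the VRF.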
It is important to understand that the LSRs have no view of the VPN traffic. They forward labeled traffic along LSPs established by whatever routing protocol is running in the core network. Of course, the choice of IGP can be completely different from the IGPs running on the CE-PE links; the two do not talk to each other. Labeled packets are found only in the core network. The CE-PE links use IP. The last step of the list also describes a penultimate hop popping (PHP) operation (PHP was introduced in Chapter 4). When the last LSR pops the outer label, it reveals the packet's inner label. PE D can use a single LFIB lookup to find the VRF to use. It does a second lookup to find the outgoing CE interface. An MPLS VPN uses two protocols for signaling. The underlying MPLS network uses LDP between LSRs to announce labels for the prefixes in their routing tables. The PEs use MP-BGP to announce VPN route labels. No correlation exists between the two different label spaces. Here is a succinct summary of MPLS-VPN operation from draft-ietf-l3vpn-greip-2547-03:
Recall that Ethernet over MPLS (EoMPLS) and VPLS also use double label stacks, but they have different signaling protocols (directed LDP). MPLS VPNs have been a successful service, with many hundreds of operational networks worldwide. Some carriers, however, might be simultaneously attracted to the merits of the RFC 2547 model but resistant to the need to deploy MPLS to support it. For them, other proposals allow RFC 2547 to run over an IP core, using alternatives to labels for forwarding but still using labels for VPN identification.

RFC 2547bis Forwarding-Plane Alternatives

The two proposals that we examine in this chapter decorrelate the RFC 2547 control and data planes. Both retain MP-BGP for customer route distribution and continue to use labels for VPN route table identification. However, the core network is no longer based on MPLS, but uses standard IP. The PE-CE reference architecture is maintained.

MPLS over mGRE

This architecture uses dynamic GRE tunnels between PEs to carry customer packets. Route distribution between CE-PE and PE-PE works just as in the MPLS-VPN solution. The important difference is how a PE forwards traffic across the core. Figure 5-11 shows a sample topology that uses the same network addresses as the MPLS VPN example in Figure 5-10. When PE A needs to route a packet from CE1 to CE2, it consults its GREEN VRF table to find the BGP next-hop address (PE D) and the VPN route label announced by PE D. It prepends the label to the packet and looks up the next-hop address, outgoing interface, and encapsulation information in the FIB. PE A then prepends a GRE header to the labeled packet, with source and destination IP addresses corresponding to the public addresses of PE A and PE D, respectively. The GRE type field indicates an MPLS payload.

Figure 5-11. MPLS over mGRE Example

The inter-PE routes are learned using a standard IGP. No extra control-plane information is in the core network.
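The encapsulation step can be sketched as follows. The helper functions are hypothetical; only the basic 4-byte GRE header is built (no checksum, key, or sequence fields), with the real GRE protocol type 0x8847 identifying an MPLS unicast payload. The outer IP header (PE A to PE D) would be added by the IP stack.

```python
import struct

MPLS_UNICAST = 0x8847   # GRE/Ethernet protocol type for MPLS unicast

def mpls_label(label: int, s: int = 1, ttl: int = 64) -> bytes:
    """Pack one 4-byte MPLS label entry: label, EXP=0, bottom-of-stack, TTL."""
    return struct.pack("!I", (label << 12) | (s << 8) | ttl)

def gre_encap(labeled_packet: bytes) -> bytes:
    """Prepend a minimal GRE header announcing an MPLS payload."""
    flags = 0x0000      # no checksum, key, or sequence number present
    return struct.pack("!HH", flags, MPLS_UNICAST) + labeled_packet

# PE A pushes the VPN route label announced by PE D, then wraps it in GRE.
frame = gre_encap(mpls_label(30) + b"customer-ip-packet")
print(frame[:4].hex())  # 00008847
```

Because only the VPN label survives inside the GRE payload, the core routers never see or need MPLS forwarding state; they route the outer IP header as ordinary traffic.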
The "dynamic" moniker refers to the fact that the GRE tunnels are never seen as a possible path by routing protocols running on the PE, nor do routing adjacencies form across them. In fact, given the lack of protocol state, this solution really uses GRE as an encapsulation method to traverse an IP core. One point mentioned in the draft RFC that merits discussion here is the core network's greater susceptibility to spoofing. With MPLS, the provider can simply discard any labeled packets received at the edge of its network, so it is hard for a malicious user to introduce spoofed packets into MPLS networks. With MPLS over GRE, however, no equivalent boundary exists. The PE receives and forwards IP, so it would have to use some other filtering mechanism to enforce an antispoofing policy.

MPLS over L2TPv3

MPLS over L2TPv3 uses exactly the same principle as the GRE solution just discussed, except with an L2TPv3 data plane. Once again, BGP distributes customer routes and VPN route labels. Recall from the discussion in Chapter 4 that L2TPv3 has its own control plane, and information negotiated during session establishment is found in the data-plane header, notably the session ID and cookie ID. Session and cookie IDs are also used in this architecture. However, their roles differ from what we saw previously. The session ID was a session multiplexer: Different sessions have different identifiers within the same tunnel. Here, a label plays this role, so the session ID value is used only to indicate to the receiving L2TPv3 engine that the incoming packet belongs to a Layer 3 VPN service and that additional processing is required by another subsystem. The cookie ID is still used for antispoofing protection. The value can be generated statically or randomly and can be either global per PE or local per session. The session and cookie identifiers are announced using MP-BGP but, just as for MPLS over GRE, different implementations could use another protocol.
The forwarding plane is similar to, though somewhat more complex than, that of the GRE solution. Consider the network in Figure 5-12. When CE1 sends data to CE2, PE A first performs a route table lookup to find the BGP next hop and VPN label identifier. It also finds the L2TPv3 session and cookie identifiers for the remote PE. PE A then does a second lookup to find the IP address of PE D, encapsulates the packet in an L2TPv3 frame, and sends it. The core is a completely standard IP network.

Figure 5-12. MPLS over L2TPv3 Example

On receiving the packet, PE D removes the outer L2TPv3 header and verifies the session and cookie values. If they are valid, the packet's label is used to identify the correct VRF, and output processing continues as usual. Compared to GRE, antispoofing is somewhat enhanced because PE D can drop incoming packets that do not have a correct cookie ID. Is there any great difference between using one of these IP-based encapsulations to create a Layer 3 VPN service compared to using MPLS? The major difference is clearly that the core network is still IP, so no migration is necessary to start offering a VPN service. The disadvantage is that you lose some useful MPLS-based tools, such as fast reconvergence (which is being developed for IP) and traffic engineering.
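The session-and-cookie check that PE D performs can be sketched as follows. The header layout is simplified (4-byte session ID followed by an 8-byte cookie), and the identifier values are invented; in practice they would be configured or announced via MP-BGP as described above.

```python
import os

# Hypothetical per-session values agreed between PE A and PE D.
SESSION_ID = 0x000003E8   # marks the packet as Layer 3 VPN traffic
COOKIE = os.urandom(8)    # randomly generated antispoofing cookie

def l2tpv3_encap(labeled_packet: bytes) -> bytes:
    """Ingress PE: prepend session ID and cookie to the labeled packet."""
    return SESSION_ID.to_bytes(4, "big") + COOKIE + labeled_packet

def l2tpv3_decap(frame: bytes):
    """Egress PE: verify session and cookie before VPN-label processing."""
    sid = int.from_bytes(frame[:4], "big")
    cookie = frame[4:12]
    if sid != SESSION_ID or cookie != COOKIE:
        return None            # spoofed or corrupt packet: drop it
    return frame[12:]          # labeled packet, handed to the VPN subsystem

# A valid round trip succeeds; a frame with a wrong cookie is dropped.
inner = l2tpv3_decap(l2tpv3_encap(b"\x00\x01\xe1\x40" + b"ip-packet"))
print(inner is not None)       # True: cookie matched, packet accepted
```

The cookie check is what gives this data plane its edge over plain GRE: an attacker who can inject IP packets toward PE D still has to guess the 64-bit cookie for them to be accepted.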