The Enterprise: Managing the VPN | Selecting MPLS VPN Services

This section focuses on providing the network manager with information and guidelines to help him or her prepare for the introduction and subsequent management of an MPLS VPN service.

Figure 8-5 shows a life-cycle model to be used for reference within this section.

Figure 8-5. VPN Life-Cycle Model

Planning

An important part of the network manager's responsibility is ensuring availability of network resources and performance to meet end-user needs. To do this, it is essential to establish a profile or baseline.

A profile is also useful because it allows the network manager to plan a monitoring strategy between locations. It is good practice for enterprises to test reachability between key locations. This can be used to validate SLAs and to detect connectivity problems quickly. In a large enterprise with many sites, however, it may be impractical to monitor in a full-mesh style, as shown in Figure 8-6.

Figure 8-6. Full-Mesh VPN Monitoring

As shown, the number of probe operations increases proportionally to the square of the number of sites. This can be problematic from a management perspective, because it takes up resources on the CE routers and can be difficult to maintain as the network grows.

A better approach for large networks is a partial mesh, as shown in Figure 8-7.

Figure 8-7. Partial-Mesh VPN Monitoring

In this system, specific critical paths are monitored, such as branch office to headquarters or remote site to data centers. This can dramatically reduce the number of probes, as well as management and performance overhead.
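The difference in scale can be sketched with a short calculation. The following Python is only an illustration: it assumes one probe per site pair for the full mesh, and it models the partial mesh as every remote site probing a small set of hub sites (headquarters or data centers). The function names and the hub count are hypothetical choices, not part of any monitoring product.

```python
def full_mesh_probes(sites: int) -> int:
    """Probes needed to test every pair of sites once: N(N-1)/2."""
    return sites * (sites - 1) // 2


def partial_mesh_probes(sites: int, hubs: int = 1) -> int:
    """Probes needed when each non-hub site is tested against every hub only."""
    if not 0 < hubs <= sites:
        raise ValueError("hubs must be between 1 and the number of sites")
    return (sites - hubs) * hubs


if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, full_mesh_probes(n), partial_mesh_probes(n, hubs=2))
```

With two hubs, 200 sites need 396 probes instead of 19,900, which is why the partial mesh is easier to maintain as the network grows.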

Ordering

Evolving business requirements will require changes to the VPN service. Such changes might include new sites, changes to routing tables or protocols, circuit upgrades, and the decommissioning of sites.

It is good practice to agree on a moves/adds/changes/deletes (MACD) process with the service provider. A basic version of this might simply involve filling in a change request form and e-mailing it to the appropriate contact within the service provider, such as the account manager. A more sophisticated system might automatically request bandwidth upgrades.

Provisioning

The impact of the provisioning process varies according to whether the service is managed or unmanaged. The main difference is that in an unmanaged service, the enterprise is responsible for providing and configuring the CE routers, whereas in the managed service, this is largely transparent. In both cases, however, the enterprise should perform a number of tests during and after the provisioning process.

CE Provisioning

With unmanaged CEs, the enterprise is responsible for configuring the following:

IP addresses
Host name
Domain name server
Fault management (and time-stamp coordination by means of the Network Time Protocol)
Collecting, archiving, and restoring CE configurations
Access data, such as passwords and SNMP strings on the unmanaged CE

When the enterprise wants to manage the CE routers itself, it must cooperate with the service provider to ensure that the devices are configured correctly. In addition, the service provider may require access to the CE routers to deploy probes, as well as read-only access to the router configuration and SNMP MIBs.

The requirement to access SNMP MIB and CLI data introduces a security concern for the enterprise.

The service provider management systems require connectivity to the enterprise CE routers via the PE-CE connection. Access to the CE routers can be tightly controlled by access control lists (ACLs) or the equivalent with predefined source/destination addresses and protocols. Secured access can be implemented through the following mechanisms:

An extended IP access list on the CE-PE interface of every CE router. The access list permits the service provider management system's subnets to communicate with the CE router loopback management interface using SNMP only. Any other packets sourced from the service provider management system's subnets are blocked.
A standard IP access list to explicitly define service provider SNMP server systems that require read-only access to the CE routers.
An SNMP community string and SNMP view to restrict the service provider read-only SNMP servers to a limited subset of MIBs required for reporting system and interface status, as well as traffic statistics.
A further discrete standard IP access list to define service provider SNMP management server systems, which require SNMP write access to any MIBs required for active monitoring. An example might be the Cisco Round-Trip Time Monitoring (RTTMON) MIBs to configure IP SLA probes.
A second SNMP community string and SNMP view to be configured restricting the service provider SNMP write access to the probe MIBs only.
A second loopback interface to source probes because traffic sourced from loopback0 is directed to the highest class, prohibiting per-class reporting.

CE Management Access

The need to access CEs depends on whether the service being offered is managed or unmanaged. Both these scenarios are discussed.

Unmanaged CE Routers

If the CEs are unmanaged, the service provider can use IP version 4 (IPv4) connectivity for all management traffic.

Figure 8-8 shows a basic topology with unmanaged CEs. The network management subnet has a direct link to the service provider MPLS core network.

Figure 8-8. Network Management of Unmanaged CEs

Managed CE Routers

In managed or hybrid (partially managed) scenarios, connectivity to the CEs from the service provider network is required. This is usually provided in the form of a network management subnet. However, as soon as a CE is in a VPN, it is no longer accessible by means of conventional IPv4 routing.

To enable IP connectivity between the service provider network management systems and enterprise-connected CE routers, you must configure a VRF instance on every PE router port connected to an enterprise location to import service provider management system routes. These routes are distributed within the VPN via a dedicated "management VPN" having a unique route target value. All customer-facing PE ports that connect to a service provider-managed CE router must be configured to import this specific route-target value, as shown in Figure 8-9.

Figure 8-9. Using a Management VPN

Cisco Managed MPLS VPN Solution Guide, http://www.cisco.com/univercd/cc/td/doc/product/vpn/solution/manmpls/overview/mmpls_ov.pdf

A network management VRF table contains the circuit addresses of all CE routers. The service provider management workstation(s) originate from this VRF.

Each customer VRF should contain the address of the service provider management workstation(s) to allow two-way communication between the management workstation and the CE router.

When a management VRF is created:

All CE routers can be managed from a single location.
Only routes that originate from the VRF are exported (by virtue of the transitivity rule). Routing separation is guaranteed between CE routers.
All CE routers are easily identified because they all use a circuit address from the same service provider-managed address space.

Example 8-1 shows a sample configuration of a customer VRF for use with a management VRF. This is an excerpt from the output obtained using the show running-config command.

Example 8-1. Customer Management VRF Configuration

ip vrf ACME
 rd 6000:1
 route-target export 6000:1
 route-target import 6000:1
 !
 ! Export routes to the Management VRF
 !
 route-target export 100:1
 !
 ! Import Management host(s) only
 !
 route-target import 100:10

Example 8-1 shows the use of two additional route targets:

route-target export 100:1 exports all routes from this VRF to the management VRF using the extended community attribute 100:1.
route-target import 100:10 imports any VPN-IPv4 addresses with the extended community attribute 100:10, which identifies any management host(s).
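The effect of these route targets can be illustrated with a toy model. The Python below is a sketch, not a BGP implementation: it assumes the management VRF imports 100:1 and tags its management host with 100:10, mirroring the values in Example 8-1. The second customer (GLOBEX), its route target 6001:1, and all prefixes are hypothetical and appear only to show the separation between customers.

```python
from dataclasses import dataclass


@dataclass
class Vrf:
    name: str
    import_rts: frozenset
    export_rts: frozenset
    local_prefixes: tuple = ()


def visible_prefixes(vrf, all_vrfs):
    """Prefixes a VRF holds: its own plus those imported by route target."""
    table = set(vrf.local_prefixes)
    for other in all_vrfs:
        if other is vrf:
            continue
        # Only locally originated prefixes are exported; imported routes
        # are never re-exported, which keeps customers separated.
        if vrf.import_rts & other.export_rts:
            table.update(other.local_prefixes)
    return table


# Hypothetical VRFs and prefixes for illustration only.
acme = Vrf("ACME", frozenset({"6000:1", "100:10"}),
           frozenset({"6000:1", "100:1"}), ("8.1.1.4/30",))
globex = Vrf("GLOBEX", frozenset({"6001:1", "100:10"}),
             frozenset({"6001:1", "100:1"}), ("9.1.1.4/30",))
mgmt = Vrf("VPN_Management", frozenset({"100:1"}),
           frozenset({"100:10"}), ("190.1.42.3/32",))
vrfs = [acme, globex, mgmt]

if __name__ == "__main__":
    for v in vrfs:
        print(v.name, sorted(visible_prefixes(v, vrfs)))
```

In this model the management VRF sees both customers' circuit addresses, each customer sees the management host, and neither customer sees the other's prefixes.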

The management VRF is configured at the PE that connects to the service provider management subnet. Example 8-2 is a sample configuration (taken from the show running-config output).

Example 8-2. Management VRF Configuration

ip vrf VPN_Management
 rd 1000:1
 import map IN-Management
 export map OUT-Management
 route-target export 1000:1
 route-target import 1000:1
!
! Only allow PE-CE circuit addresses into VRF
!
ip prefix-list PE-CE-Circuits seq 10 permit 8.1.1.0/16 ge 30
ip prefix-list PE-CE-Circuits seq 20 permit 9.1.1.0/16 ge 30
! continued for other circuits...
!
route-map IN-Management permit 10
 match ip address prefix-list PE-CE-Circuits
!
! Set Management Workstation route to 1000:10
!
route-map OUT-Management permit 10
 match ip address 20
 set extcommunity rt 1000:10
!
! Access List to identify management hosts (one line for each host)
!
access-list 20 permit 190.1.42.3 0.0.0.0
...
!
! Set a static route to Management workstation(s)
!
ip route vrf VPN_Management 190.1.42.0 255.255.255.0 next-hop IP address
!
! Enter other routes to Management hosts...
...

In Example 8-2, the VRF uses the service provider-specified route distinguisher (RD) and route target (RT) of 1000:1. In addition to the normal import/export route targets for the VRF, two route maps are specified: IN-Management and OUT-Management.

The IN-Management route map limits imported addresses to those of the PE-CE circuit address space. In other words, a prefix is imported only if it falls within one of the listed /16 ranges and has a mask length of at least 30 bits, as is the case for a /30 circuit subnet. This prevents all other routes in the customer VRFs from being imported.
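The matching rule can be approximated in a few lines of Python using the standard ipaddress module. This is a sketch of the "ge" semantics only, under two assumptions: host bits in the configured prefix are ignored (so 8.1.1.0/16 is treated as 8.1.0.0/16), and the upper length bound defaults to 32 when only "ge" is given. The function name and the sample routes are hypothetical.

```python
import ipaddress


def prefix_list_permits(entry: str, ge: int, candidate: str, le: int = 32) -> bool:
    """Return True if candidate lies inside entry and its mask length is in [ge, le]."""
    covering = ipaddress.ip_network(entry, strict=False)  # ignore host bits
    route = ipaddress.ip_network(candidate)
    return route.subnet_of(covering) and ge <= route.prefixlen <= le


if __name__ == "__main__":
    for route in ("8.1.1.4/30", "8.1.200.0/24", "10.1.1.4/30"):
        print(route, prefix_list_permits("8.1.1.0/16", 30, route))
```

A /30 circuit subnet inside the range is accepted, while a /24 customer LAN in the same range, or a /30 outside it, is rejected.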

Because the management VRF is connected to an interface, which originates many subnets, static routing is used to specify how to reach it. In this example, the management subnet is 190.1.42.0.

To guarantee that only the host addresses of management workstations are exported from the management VRF, static routes are used to identify each management address individually.

The OUT-Management route map then sets all management host addresses (those that match access list 20) to the extended-community attribute of 1000:10. They are then imported into the customer VRF with a corresponding import statement.

Network Management Configuration Considerations

If the management host addresses in the preceding configuration are redistributed to a CE through a routing protocol such as Routing Information Protocol version 2 (RIPv2), the CE can readvertise the route back to the PE in a summarized form. This occurs even though split horizon (which does not send routes out an interface on which they were received) is enabled. For example, if the host route 190.1.42.3 255.255.255.255 is advertised to a CE with auto-summary enabled, that route is summarized to the classful (Class B) network 190.1.0.0 255.255.0.0 and is advertised back to the PE. The split-horizon process lets the route pass because it is not the same as the route that was received.
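The summarization step can be reproduced with a small Python sketch. It only derives the classful network for an address and compares it with the received host route. It does not model RIPv2 itself, and the function name is a hypothetical helper.

```python
import ipaddress


def classful_network(address: str) -> ipaddress.IPv4Network:
    """Return the Class A, B, or C network that contains the given address."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:
        prefixlen = 8       # Class A
    elif first_octet < 192:
        prefixlen = 16      # Class B
    elif first_octet < 224:
        prefixlen = 24      # Class C
    else:
        raise ValueError("Class D and E addresses have no classful network")
    return ipaddress.ip_network(f"{address}/{prefixlen}", strict=False)


if __name__ == "__main__":
    received = ipaddress.ip_network("190.1.42.3/32")
    summary = classful_network("190.1.42.3")
    print(summary)               # 190.1.0.0/16
    print(summary != received)   # True: split horizon sees a different route
```

Because the summary differs from the received host route, a check that compares routes exactly does not suppress it.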

There are two ways to avoid re-advertising routes back to the PE router:

Turn off auto-summary at the CE.
Use route distribution filtering at the PE.

Another useful technique is to configure CE devices with a loopback interface, the IP address of which is used as the management address of the CE router. The CE router must be configured to advertise this address with a 32-bit mask to the PE router. The PE router in turn exports only this loopback interface address with a 32-bit mask to the service provider management VPN using a second unique route target value (different from the import value previously mentioned). In this scheme, there should not be any requirement to export the PE-CE interface network address to the management VPN.

All other IP prefixes received by the PE router from the CE router are exported to the customer VPN only. No customer IP prefixes other than the CE router loopback address are advertised within the management VPN.

Acceptance Testing

It is highly recommended that the enterprise perform some form of formal acceptance testing of new circuits and sites. This typically consists of the following steps:

PE-CE link test: This is a simple ping to the local PE device.
CE-CE test: Again, a simple ping to a remote CE. An additional and useful extension is to ping a loopback address on the CE, because this validates that routes "behind" the CE can be reached.
A more prolonged test to measure the performance characteristics of the path from the new site to others in the VPN. This is often called a "soak" test and is best implemented using IP SLA probes. Such a test might run for 12 or 24 hours.

Tip

Use an extended ping to test a range of packet sizes. This will help you discover failures that occur only near the 1500-byte maximum transmission unit (MTU) limit. Note that this requires the Do Not Fragment (DF) bit to be set in the header, as shown in Example 8-3.

Example 8-3. Using Extended Ping to Test a Connection

cl-12008-1#ping
Protocol [ip]:
Target IP address: 144.254.1.17
Repeat count [5]: 20
Datagram size [100]:
Timeout in seconds [2]:
Extended commands [n]: y
Source address or interface:
Type of service [0]:
Set DF bit in IP header? [no]: yes
Validate reply data? [no]:
Data pattern [0xABCD]:
Loose, Strict, Record, Timestamp, Verbose[none]:
Sweep range of sizes [n]: y
Sweep min size [36]: 60
Sweep max size [18024]: 1510
Sweep interval [1]: 10
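To see what the sweep in Example 8-3 covers, the sizes can be enumerated with a short Python sketch. It assumes that each size in the sweep is sent once per repeat and that the path MTU is 1500 bytes. Both are assumptions made for illustration, and the function name is hypothetical.

```python
def sweep_sizes(minimum: int, maximum: int, interval: int) -> list:
    """Datagram sizes covered by a sweep from minimum to maximum, inclusive."""
    return list(range(minimum, maximum + 1, interval))


MTU = 1500      # assumed path MTU in bytes
REPEAT = 20     # repeat count from Example 8-3

sizes = sweep_sizes(60, 1510, 10)
oversized = [s for s in sizes if s > MTU]

if __name__ == "__main__":
    print(len(sizes))            # 146 distinct sizes
    print(oversized)             # [1510]
    print(len(sizes) * REPEAT)   # 2920 probes under the stated assumption
```

Under these assumptions only the final 1510-byte size exceeds the MTU, so with the DF bit set it is the size expected to fail on a healthy 1500-byte path.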

For QoS, a basic test can be performed using extended ping commands to insert QoS markings and validate that the service provider is handling them correctly by capturing appropriate statistics on the remote devices.

Note

Because extended ping allows the type of service (ToS) byte to be set, you must be careful to use correct values when mapping from Differentiated Services Code Point (DSCP). For example, if you intend to simulate voice, it has a DSCP value of 101110. However, the ToS byte needs to be set to 10111000 (decimal 184), not 5 as is sometimes assumed.
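The mapping is a two-bit left shift, because DSCP occupies the six high-order bits of the ToS byte. A minimal Python sketch follows. The function names are illustrative, and the remaining two bits of the byte are assumed to be zero.

```python
def dscp_to_tos(dscp: int) -> int:
    """Convert a 6-bit DSCP value to the 8-bit ToS byte (low two bits zero)."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    return dscp << 2


def tos_to_precedence(tos: int) -> int:
    """Extract the 3-bit IP precedence from a ToS byte."""
    return (tos >> 5) & 0b111


if __name__ == "__main__":
    voice = 0b101110                  # DSCP 46, as in the note above
    tos = dscp_to_tos(voice)
    print(tos, format(tos, "08b"))    # 184 10111000
    print(tos_to_precedence(tos))     # 5
```

The value 5 is the IP precedence carried in the top three bits, which is the likely source of the confusion. The value to enter at the Type of service prompt is 184.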

Tip

For best results, originate the tests from behind the CE devices to test firewall capability. It is also useful to perform these tests after the VPN goes live.

Monitoring

Even though the service provider may monitor its network for SLA purposes, it is highly recommended that the enterprise perform its own monitoring. There are several reasons for this:

The enterprise should be able to monitor at a higher rate than the service provider.
The enterprise should not assume that levels of QoS are being met.
The enterprise has additional performance characteristics to be concerned with.

There are two relevant forms of monitoring: availability, and performance or service degradation.

For availability, this may be a measurement at the host or site level or both. Because of network glitches and transient conditions, the monitoring solution should contain an element of "damping." This helps keep false alarms from being raised. An example might be a solution that pings hosts every 10 seconds but uses a sample period of several minutes to derive an availability measurement.
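One way to picture damping is a sliding window over recent ping results. The Python below is a sketch under assumed parameters: a 30-sample window (5 minutes at one ping every 10 seconds) and a 90 percent alarm threshold. The class name and both values are hypothetical and would be tuned per deployment.

```python
from collections import deque


class DampedAvailability:
    """Derive availability from a sliding window of ping results."""

    def __init__(self, window: int = 30, threshold: float = 0.9):
        self.samples = deque(maxlen=window)   # oldest samples drop off
        self.threshold = threshold

    def record(self, reachable: bool) -> None:
        self.samples.append(reachable)

    def availability(self) -> float:
        if not self.samples:
            return 1.0
        return sum(self.samples) / len(self.samples)

    def alarm(self) -> bool:
        # Raise an alarm only when the window is full and availability is low,
        # so a single lost ping does not trigger a false alarm.
        full = len(self.samples) == self.samples.maxlen
        return full and self.availability() < self.threshold


if __name__ == "__main__":
    monitor = DampedAvailability()
    for i in range(30):
        monitor.record(i != 15)      # one transient loss
    print(monitor.alarm())           # False
    for _ in range(5):
        monitor.record(False)        # sustained loss
    print(monitor.alarm())           # True
```

A single lost ping leaves availability above the threshold, while a sustained run of losses pushes it below and raises the alarm.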

The second form of monitoring is for performance or service degradation. A relatively simple metric can be obtained using the same scheme as for availability, but taking into account round-trip times.

A more sophisticated scheme might involve dedicated probes. For example, the Cisco IOS IP SLA provides a mechanism to monitor performance for different classes and types of traffic over the same connection. This technology could be used to monitor the response time between a Cisco device and an HTTP server to retrieve a web page.

Optimization

The main objective of optimization is to increase application performance (because this can be directly correlated with productivity). An important part of optimization is calculating required bandwidth. Recent technologies have tried to simplify this whole process by providing bandwidth recommendations based on individual traffic classes. An example is bandwidth estimation within Cisco IOS. This technology produces a Corvil Bandwidth value, which is the minimum amount of bandwidth required to meet a specific QoS target. This technology is most applicable at the WAN interface point between enterprise and service provider networks.

Note

This feature is currently restricted to IOS 12.3(14)T and requires a special license. More information can be found at http://www.cisco.com/en/US/tech/tk543/tk759/tech_brief0900aecd8024d5ff.html and http://www.cisco.com/univercd/cc/td/doc/product/software/ios123/123newft/123t/123t_14/gtcbandw.htm

Of course, increasing bandwidth may not always be an option. Other techniques include caching, compression, and increasing transmission/application speed. All of these require dedicated management system support, but the return on investment (ROI) can make such an investment worthwhile.