QoS Tools

Some of the various tools that implement QoS are described in this section and illustrated in Figure 6-2.

Figure 6-2. QoS Tools Manage Network Traffic

Many devices send data into a network. In the example shown in Figure 6-2, an IP phone produces packets that contain voice traffic, and a PC sends file transfer data. As the data enters the network, it is analyzed and classified according to how it should be dealt with in the network. After it is classified, the data is marked accordingly.

Key Point

Classification and marking form the basis for the rest of the QoS tools; it is here that business policies, priorities, and so forth are first implemented.

The markings can then be used by other tools. For example, packets can be dropped by policing tools so that the maximum rate on an interface is not exceeded. Or packets can be dropped by congestion-avoidance tools to avoid anticipated interface congestion. Remaining packets are then queued, again according to their markings, and scheduled for output on the interface. Other tools, such as compression, can be implemented on the interface to reduce the bandwidth consumed by the traffic.

The following sections explore these QoS tools:

Classification and Marking
Policing and Shaping
Congestion Avoidance
Congestion Management
Link-Specific Tools
AutoQoS

Classification and Marking

Before any traffic can be given priority over or treated differently than other traffic, it must first be identified.

Key Point

Classification is the process of analyzing packets and sorting them into different categories so that they can then be suitably marked; after they are marked, the packets can be treated appropriately.

Marking is the process of putting an indication of the classification of the packet within the packet itself so that it can be used by other tools.

The point within the network where markings are accepted is known as the trust boundary; any markings made by devices outside the trust boundary can be overwritten at the trust boundary. Establishing a trust boundary means that the classification and marking processes can be done once, at the boundary; the rest of the network then does not have to repeat the analysis. Ideally, the trust boundary is as close to end devices as possible, or even within the end devices. For example, a Cisco IP phone could be considered to be a trusted device because it marks voice traffic appropriately. However, a user's PC would not usually be trusted because users could change markings (which they might be tempted to do in an attempt to increase the priority of their traffic).

Classification

Classification can be done based on data at any of the OSI layers. For example, traffic can be differentiated based on the Layer 1 physical interface that it came in on or the Layer 2 source Media Access Control (MAC) address in the Ethernet frame. For Transmission Control Protocol/Internet Protocol (TCP/IP) traffic, differentiators include the source and destination IP addresses (Layer 3), the transport (Layer 4) protocol (TCP or User Datagram Protocol [UDP]), and the application port number (indicating Layer 7).
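
As a rough illustration of classification on these differentiators, the following Python sketch sorts packets into classes by examining the incoming interface, the transport protocol, and the destination port. The Packet fields, the class names, the interface name mgmt0, and the port ranges are all assumptions chosen for the example; they are not taken from any Cisco configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical summary of the header fields a classifier might inspect."""
    in_interface: str   # Layer 1: physical interface the packet arrived on
    src_mac: str        # Layer 2: source MAC address
    src_ip: str         # Layer 3: source IP address
    dst_ip: str         # Layer 3: destination IP address
    protocol: str       # Layer 4: "tcp" or "udp"
    dst_port: int       # application port number


def classify(pkt: Packet) -> str:
    """Return an illustrative traffic class name for the packet."""
    # Assumed rule: UDP ports in this range carry voice bearer traffic.
    if pkt.protocol == "udp" and 16384 <= pkt.dst_port <= 32767:
        return "voice"
    # Assumed rule: FTP control and data ports indicate bulk file transfer.
    if pkt.protocol == "tcp" and pkt.dst_port in (20, 21):
        return "bulk-data"
    # Assumed rule: anything arriving on a dedicated management interface.
    if pkt.in_interface == "mgmt0":
        return "network-management"
    # Everything else stays in the default class.
    return "best-effort"
```

The first matching rule wins, so the order of the tests expresses the relative importance of the rules in this sketch.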

Some applications require more analysis to correctly identify and classify them. For these cases, the Cisco Network-Based Application Recognition (NBAR) classification software feature, running within the IOS on Cisco routers, can be used. NBAR allows classification (and therefore marking) of a variety of applications, including web-based and other difficult-to-classify protocols that use dynamic TCP/UDP port assignments. For example, Hypertext Transfer Protocol (HTTP) traffic can be classified and marked by specifying uniform resource locators (URLs) so that a customer who is accessing an online ordering page could be given priority over someone accessing a general information page. Support for new protocols can be easily and quickly added through downloadable packet description language modules (PDLMs).

Note

You must enable Cisco Express Forwarding before you configure NBAR.^[4] (See Chapter 2, "Switching Design," for information about Cisco Express Forwarding.) NBAR examines only the first packet of a flow; the rest of the packets belonging to the flow are switched by Cisco Express Forwarding.

Marking

Marking can be done either in the Layer 2 frame or in the Layer 3 packet.

For Ethernet frames, Layer 2 marking can be done using the following methods:^[5]

For an Institute of Electrical and Electronics Engineers (IEEE) 802.1q frame, the three 802.1p user priority bits in the Tag field are used as class of service (CoS) bits. (Recall from Chapter 2 that 802.1q is a standard trunking protocol in which the trunking information is encoded within a Tag field that is inserted inside of the frame header itself.)
For an Inter-Switch Link (ISL) frame, three of the bits in the user field in the ISL header are used as CoS bits. (Recall from Chapter 2 that ISL is a Cisco-proprietary trunking protocol that encapsulates the data frame between a 26-byte header and a 4-byte trailer.)
No CoS representation exists for non-802.1q/non-ISL frames.

Because the CoS is represented by 3 bits, it can take on one of eight values, 0 through 7.

Key Point

Layer 2 markings are not useful as end-to-end QoS indicators because the media often changes throughout a network (for example, from Ethernet to a Frame Relay wide-area network [WAN]). Thus, Layer 3 markings are required to support end-to-end QoS.

For IP version 4 (IPv4), Layer 3 marking can be done using the type of service (ToS) field in the packet header. Recall (from Appendix B) that this 8-bit field is the second byte in the IP packet header. (Figure B-11 illustrates all the fields in the IP packet header.) Originally, only the first 3 bits were used; these bits, called the IP Precedence bits, are illustrated in the middle of Figure 6-3. Packets with higher precedence values should get higher priority within the network. Because 3 bits again can only specify eight marking values, IP precedence does not allow a granular classification of traffic.

Figure 6-3. The ToS Field in an IPv4 Header Supports IP Precedence or DSCP

Thus, more bits are now used: The first 6 bits in the ToS field are now known as the DiffServ Code Point (DSCP) bits, and are illustrated in the lower portion of Figure 6-3. (The lower 2 bits in the ToS field are used for explicit congestion notification [ECN], which is described in the "Congestion Avoidance" section, later in this chapter.) With 6 bits, DSCP allows 64 marking values.
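
The relationship between the three interpretations of the ToS byte can be sketched with a few bit operations. The function name decode_tos is an assumption made for this example; the bit positions follow the description above (upper 3 bits for IP precedence, upper 6 bits for DSCP, lower 2 bits for ECN).

```python
def decode_tos(tos: int) -> dict:
    """Split an 8-bit IPv4 ToS value into its IP precedence, DSCP, and ECN parts."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("the ToS field is a single byte (0 through 255)")
    return {
        "ip_precedence": tos >> 5,  # upper 3 bits
        "dscp": tos >> 2,           # upper 6 bits
        "ecn": tos & 0b11,          # lower 2 bits
    }
```

For example, a ToS byte of binary 10111000 (decimal 184) decodes to IP precedence 5, DSCP 46, and ECN 0.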

DSCP values can be expressed numerically (with binary values from 000000 through 111111 or decimal values from 0 through 63) or by using Per-Hop Behavior (PHB) values; PHBs are just keywords that represent some numeric DSCP values. (The name per-hop behavior indicates that each device, or hop, should behave consistently when determining how to treat a packet.)

Four PHB classes exist; they are described as follows:

Default or Best Effort (BE) PHB: This PHB has a DSCP binary value of 000000 and represents the best-effort service.
Class Selector (CS) PHB: This PHB has the lower three DSCP bits set to 000. Because this PHB uses only the upper 3 bits, it is compatible with the IP precedence values and is in fact written as CSx, where x is the decimal IP precedence value. For example, the CS PHB with the value 011000 represents IP precedence binary 011 or decimal 3; it is written as CS3.
Expedited Forwarding (EF) PHB: This PHB represents a DSCP value of binary 101110 (decimal 46) and provides a low-loss, low-latency, low-jitter, and guaranteed bandwidth service. The EF PHB should be reserved for only the most critical applications, such as voice traffic, so that if the network becomes congested, the critical traffic can get the service it requires.
Assured Forwarding (AF) PHBs: Four classes of AF PHBs exist, each with three drop preferences. These classes are represented as AFxy, where x is the class (a value from 1 to 4) and y is the drop preference (a value from 1 to 3). The AF class is determined by the upper 3 bits of the DSCP, while the drop preference is determined by the next 2 bits. (The lowest bit is always set to 0.) A drop preference of 1 is the lowest and 3 is the highest; this field determines which traffic should be dropped in times of congestion. For example, AF21 traffic would be dropped less often than AF22 traffic. Figure 6-4 illustrates the AF PHBs.

Figure 6-4. AF PHB and DSCP Values
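
The bit layouts of the four PHB classes translate directly into arithmetic. The following sketch converts a PHB keyword into its decimal DSCP value; the function name phb_to_dscp is an assumption for this example, while the bit positions are those given in the PHB descriptions above.

```python
import re


def phb_to_dscp(phb: str) -> int:
    """Convert a PHB keyword (BE, CSx, EF, or AFxy) to its decimal DSCP value."""
    name = phb.upper()
    if name in ("BE", "DEFAULT"):
        return 0                      # binary 000000
    if name == "EF":
        return 46                     # binary 101110
    match = re.fullmatch(r"CS([0-7])", name)
    if match:
        # The IP precedence value sits in the upper 3 bits; the lower 3 are 000.
        return int(match.group(1)) << 3
    match = re.fullmatch(r"AF([1-4])([1-3])", name)
    if match:
        af_class = int(match.group(1))
        drop_preference = int(match.group(2))
        # Class in the upper 3 bits, drop preference in the next 2, lowest bit 0.
        return (af_class << 3) | (drop_preference << 1)
    raise ValueError(f"unknown PHB keyword: {phb}")
```

For example, AF21 maps to binary 010010 (decimal 18), AF22 maps to binary 010100 (decimal 20), and CS3 maps to binary 011000 (decimal 24).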

Key Point

We found that it is easy to get lost in the details of QoS markings, especially when the different PHBs, AF classes, and so forth are introduced.

To help avoid this confusion, remember these key points about QoS DSCP markings:

The ToS field within an IPv4 packet header marks, or indicates, the kind of traffic that is in the packet. This marking can then be used by other tools within the network to provide the packet the service that it needs.
The first 6 bits in the ToS field are known as the DSCP bits.

DSCP values can be represented numerically (in binary or decimal) or with keywords, known as PHBs. Each PHB (BE, CSx, EF, and AFxy) represents a specific numeric DSCP value and therefore a specific way that traffic should be handled.

Cisco has created a QoS Baseline that provides recommendations to ensure that both its products, and the designs and deployments that use them, are consistent in terms of QoS. Although the QoS Baseline document itself is internal to Cisco, it includes an 11-class classification scheme that can be used for enterprises; this QoS Baseline suggestion for enterprise traffic classes is provided in Figure 6-5. This figure identifies the 11 types of traffic and the QoS marking that each type should be assigned. As described earlier, the QoS marking is either a Layer 2 CoS (specified within the 802.1q Tag field or ISL header) or a Layer 3 value marked in the IP packet header. The Layer 3 markings can either be done with a 3-bit IP precedence value (shown in the IPP column in Figure 6-5) or with a 6-bit DSCP value; both the numeric DSCP value and the PHB keyword representation of that value are shown in the figure.

Figure 6-5. Cisco QoS Baseline Provides Guidelines for Classification and Marking^[6]

The classes of traffic in the QoS Baseline are defined as follows:

IP Routing class: This class is for IP routing protocol traffic such as Border Gateway Protocol (BGP), Enhanced Interior Gateway Routing Protocol (EIGRP), Open Shortest Path First (OSPF), and so forth.
Voice class: This class is for Voice over IP (VoIP) bearer traffic (the conversation traffic), not for the associated signaling traffic, which would go in the Call Signaling class.
Interactive Video class: This class is for IP videoconferencing traffic.
Streaming Video class: This class is for either unicast or multicast unidirectional video.
Mission-Critical Data class: This class is intended for a subset of the Transactional Data applications that are most significant to the business. The applications in this class are different for every organization.
Call Signaling class: This class is intended for voice and video-signaling traffic.
Transactional Data class: This class is intended for user-interactive applications such as database access, transactions, and interactive messaging.
Network Management class: This class is intended for traffic from network management protocols, such as SNMP.
Bulk Data class: This class is intended for background, noninteractive traffic, such as large file transfers, content distribution, database synchronization, backup operations, and e-mail.
Scavenger class: This class is based on an Internet2 draft that defines a "less-than-Best Effort" service. If a link becomes congested, this class will be dropped the most aggressively. Any nonbusiness-related traffic (for example, downloading music in most organizations) could be put into this class.
Best Effort class: This class is the default class. Unless an application has been assigned to another class, it remains in this default class. Most enterprises have hundreds, if not thousands, of applications on their networks; the majority of these applications remain in the Best Effort class.

Key Point

The QoS Baseline does not mandate that these 11 classes be used; rather this classification scheme is an example of well-designed traffic classes. Enterprises can have fewer classes, depending on their specific requirements, and can evolve to using more classes as they grow. For example, at one point, Cisco was using a 5-class model (the minimum recommended in a network with voice, video, and data) on its internal network.^[7]

Figure 6-6 illustrates an example strategy for expanding the number of classes over time (from a 5-class, to an 8-class, and eventually to the 11-class model) as needs arise.

Figure 6-6. The Number of Classes of Service Can Evolve as Requirements Change^[8]

After traffic has been classified and marked and sent on its way through the network, other devices can then read the markings and act accordingly. The following sections examine the QoS tools that these devices can use.

Policing and Shaping

Policing and shaping tools identify traffic that violates some threshold or service-level agreement (SLA). The two tools differ in the way that they respond to this violation.

Key Point

Policing tools drop the excess traffic or modify its marking.

Shaping tools buffer the extra data until it can be sent, thus delaying but not dropping it.

The difference between these tools is illustrated in Figure 6-7.

Figure 6-7. Policing Drops Excess Traffic While Shaping Delays It

The diagram on the left in Figure 6-7 illustrates traffic that is being presented to an interface; note that some of the traffic exceeds the maximum rate allowed on the interface. If policing tools were configured on the interface, the excess traffic would simply be dropped, as indicated in the upper-right diagram. In contrast, the lower-right diagram shows that shaping tools would send all the data by delaying some of it until bandwidth is available.
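
The contrast between the two behaviors can be sketched numerically. In the following simplified model, traffic arrives in fixed time intervals and the interface can send at most rate units per interval; the function names and the interval-based model are assumptions made for illustration, and real policers and shapers also account for burst sizes, which this sketch ignores.

```python
def police(arrivals, rate):
    """Send at most `rate` units per interval and drop the excess.

    Returns (sent_per_interval, total_dropped).
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    sent = [min(amount, rate) for amount in arrivals]
    dropped = sum(amount - out for amount, out in zip(arrivals, sent))
    return sent, dropped


def shape(arrivals, rate):
    """Send at most `rate` units per interval and buffer the excess.

    Returns sent_per_interval, which can be longer than `arrivals`
    because buffered data is sent in later intervals.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    sent = []
    backlog = 0
    for amount in arrivals:
        backlog += amount
        out = min(backlog, rate)
        sent.append(out)
        backlog -= out
    # Drain whatever is still buffered after the last arrival.
    while backlog > 0:
        out = min(backlog, rate)
        sent.append(out)
        backlog -= out
    return sent
```

With arrivals of 3, 8, 2, 7, and 1 units and a rate of 5, the policer sends 3, 5, 2, 5, and 1 units and drops 5 units in total, whereas the shaper sends 3, 5, 5, 5, and 3 units and drops nothing.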

Policing Tools

The Cisco IOS traffic policing feature allows control of the maximum rate of traffic sent or received on an interface. It is often configured on interfaces at the edge of a network to limit traffic into or out of the network. Traffic that does not exceed the specified rate parameters is sent, while traffic that exceeds the parameters is either dropped or is sent with a lower priority.

Note

Committed access rate (CAR) is an older IOS policing tool that can be configured to rate-limit (drop) certain traffic if it exceeds a specified speed. It can also be configured to set or change the markings within the packet header for traffic, depending on whether it meets or exceeds the acceptable rate.

Shaping Tools

Traffic shaping allows you to control the traffic going out of an interface to match its flow to the speed of the destination interface or to ensure that the traffic conforms to particular policies. The IOS software supports the following QoS traffic-shaping features:

Generic Traffic Shaping (GTS): GTS provides a mechanism to reduce the flow of outbound traffic on an interface to a specific bit rate. You can use access lists to define particular traffic to be shaped. GTS is useful when the receiving device has a lower access rate into the network than the transmitting device.
Class-based shaping: This type of shaping provides the means for configuring traffic shaping on a class of traffic, based on the marking in the packet header, rather than only on an access list basis. Class-based shaping also allows you to specify average rate or peak rate traffic shaping.
Distributed Traffic Shaping (DTS): DTS is similar to class-based shaping; however, DTS is used on devices that have distributed processing (such as the Cisco 7500 Versatile Interface Processor [VIP]) and do not support class-based shaping.
Frame Relay Traffic Shaping (FRTS): Although GTS works for Frame Relay, FRTS offers the following capabilities that are more specific to Frame Relay networks:

- Rate enforcement on a per-virtual circuit (VC) basis: A peak rate can be configured to limit outbound traffic to either the committed information rate (CIR) or to some other defined value.

- Generalized backward explicit congestion notification (BECN) support on a per-VC basis: The router can monitor the BECN field in frames and throttle traffic if necessary.

- Priority and custom queuing support on a per-VC basis: This allows finer granularity in the queuing of traffic on individual VCs.

Note

Priority and custom queuing are described in the "Congestion Management" section, later in this chapter.

Congestion Avoidance

Key Point

Congestion-avoidance techniques monitor network traffic loads so that congestion can be anticipated and then avoided, before it becomes problematic.

If congestion-avoidance techniques are not used and interface queues get full, packets trying to enter the queue will be discarded, regardless of what traffic they hold. This is known as tail drop: the packets arriving after the tail of the queue are dropped.

In contrast, congestion-avoidance techniques allow packets from streams identified as being eligible for early discard (those with lower priority) to be dropped when the queue is getting full.

Congestion avoidance works well with TCP-based traffic; TCP has a built-in flow control mechanism so that when a source detects a dropped packet, the source slows its transmission.

Weighted random early detection (WRED) is the Cisco implementation of the random early detection (RED) mechanism. RED randomly drops packets when the queue gets to a specified level (in other words, when it is nearing full). RED is designed to work with TCP traffic: When TCP packets are dropped, TCP's flow-control mechanism slows the transmission rate and then progressively begins to increase it again. RED therefore results in sources slowing down and hopefully avoiding congestion.

WRED extends RED by using the IP precedence in the IP packet header to determine which traffic should be dropped; the drop-selection process is weighted by the IP precedence. Similarly, DSCP-based WRED uses the DSCP value in the IP packet header in the drop-selection process. WRED selectively discards lower-priority (and higher-drop preference for DSCP) traffic when the interface begins to get congested.
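
The weighting can be sketched as a drop-probability function whose thresholds depend on the IP precedence. The sketch below uses the classic RED shape, in which the drop probability is computed from the average queue depth and rises linearly between a minimum and a maximum threshold. The function names and all of the threshold and probability values in WRED_PROFILES are assumptions chosen for illustration, not Cisco defaults.

```python
def red_drop_probability(avg_queue, min_threshold, max_threshold, max_probability):
    """Classic RED curve: no drops below the minimum threshold, a linear ramp
    up to max_probability between the thresholds, and every packet dropped
    at or above the maximum threshold."""
    if avg_queue < min_threshold:
        return 0.0
    if avg_queue >= max_threshold:
        return 1.0
    ramp = (avg_queue - min_threshold) / (max_threshold - min_threshold)
    return max_probability * ramp


# Hypothetical per-precedence profiles: (min_threshold, max_threshold, max_probability).
# Higher precedence starts dropping later, so it is dropped less often.
WRED_PROFILES = {
    0: (20, 40, 0.1),
    3: (28, 40, 0.1),
    5: (35, 40, 0.1),
}


def wred_drop_probability(avg_queue, ip_precedence):
    """Look up the profile for the packet's IP precedence and apply the RED curve."""
    min_th, max_th, max_p = WRED_PROFILES[ip_precedence]
    return red_drop_probability(avg_queue, min_th, max_th, max_p)
```

With these assumed profiles and an average queue depth of 30 packets, precedence 0 traffic is dropped with a probability of 0.05, while precedence 5 traffic is not yet dropped at all.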

Starting in IOS Release 12.2(8)T, Cisco has implemented an extension to WRED called explicit congestion notification (ECN), which is defined in RFC 3168, The Addition of Explicit Congestion Notification (ECN) to IP, and uses the lower 2 bits in the ToS byte (as shown earlier in Figure 6-3). Devices use these two ECN bits to communicate that they are experiencing congestion. When ECN is in use, it marks packets as experiencing congestion (rather than dropping them) if the senders are ECN-capable and the queue has not yet reached its maximum threshold. If the queue does reach the maximum, packets are dropped as they would be without ECN.
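
The decision described above can be sketched as a small function. The codepoint values are those defined in RFC 3168 (binary 00 for a sender that is not ECN-capable, 01 and 10 for ECN-capable senders, and 11 for congestion experienced); the function name, the returned action strings, and the selected_for_drop parameter (standing in for the outcome of the random WRED selection) are assumptions made for this example.

```python
# ECN codepoints from RFC 3168 (the lower 2 bits of the ToS byte).
NOT_ECT = 0b00  # sender is not ECN-capable
ECT_1 = 0b01    # ECN-capable transport
ECT_0 = 0b10    # ECN-capable transport
CE = 0b11       # congestion experienced


def wred_ecn_action(avg_queue, min_threshold, max_threshold, ecn_bits, selected_for_drop):
    """Decide what happens to a packet when WRED with ECN is in use.

    `selected_for_drop` is True when the random WRED selection picked this packet.
    """
    if avg_queue < min_threshold:
        return "forward"          # no congestion anticipated
    if avg_queue >= max_threshold:
        return "drop"             # queue at maximum: drop as without ECN
    if not selected_for_drop:
        return "forward"
    if ecn_bits == NOT_ECT:
        return "drop"             # sender cannot react to a mark, so drop
    return "mark-ce"              # set the ECN bits to 11 instead of dropping
```

A packet from an ECN-capable sender is therefore only ever dropped by this logic once the queue reaches its maximum threshold.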

Congestion Management

While congestion avoidance manages the tail, or back, of queues, congestion management takes care of the front of queues.

Key Point

As the name implies, congestion management controls congestion after it has occurred. Thus, if no congestion exists, these tools are not triggered, and packets are sent out as soon as they arrive on the interface.

Congestion management can be thought of as two separate processes: queuing, which separates traffic into various queues or buffers, and scheduling, which decides from which queue traffic is to be sent next.

Queuing algorithms sort the traffic destined for an interface. Cisco IOS Software includes many queuing mechanisms. Priority queuing (PQ), custom queuing (CQ), and weighted fair queuing (WFQ) are the three oldest. IP Real-Time Transport Protocol (RTP) priority queuing was developed to provide priority for voice traffic, but it has been replaced by class-based weighted fair queuing (CBWFQ) and low latency queuing (LLQ). These queuing mechanisms are described as follows:

PQ: A series of filters based on packet characteristics (for example, source IP address and destination port) are configured to place traffic in one of four queues: high, medium, normal, and low priority. For example, voice traffic could be put in the high queue and other traffic in the lower three queues. The high-priority queue is serviced first until it is empty. The lower-priority queues are only serviced when no higher-priority traffic exists; these lower-priority queues run the risk of never being serviced.
CQ: Traffic is placed into one of up to 16 queues, and bandwidth can be allocated proportionally for each queue by specifying the maximum number of bytes to be taken from each queue. CQ services queues by cycling through them in a round-robin fashion, sending the specified amount of traffic (if any exists) for each queue before moving on to the next queue. If one queue is empty, the router sends packets from the next queue that has packets ready to send.
WFQ: WFQ classifies traffic into conversations and applies weights, or priorities, to determine the relative amount of bandwidth each conversation is allowed. WFQ recognizes IP precedence values marked in IP packet headers. For example, WFQ schedules voice traffic first and then fairly shares the remaining bandwidth among high-volume flows.
IP RTP priority queuing: This type of queuing provides a strict priority-queuing scheme for delay-sensitive traffic. This traffic can be identified by its RTP port numbers and classified into a priority queue. As a result, delay-sensitive traffic such as voice can be given strict priority over other nonvoice traffic.

Note

RTP is a protocol designed to be used for real-time traffic such as voice. RTP runs on top of UDP (to avoid the additional overhead and delay of TCP). RTP adds another header that includes some sequencing information and time-stamping information to ensure that the received data is processed in the correct order and that the variation in the delay is within acceptable limits.

CBWFQ: CBWFQ provides WFQ based on defined classes but does not have a strict priority queue available for real-time traffic such as voice. All packets are serviced fairly based on weight; no class of packets can be granted strict priority.
LLQ: LLQ is a combination of CBWFQ and PQ, adding strict priority queuing to CBWFQ. This allows delay-sensitive data, such as voice data, to be sent first, giving it preferential treatment over other traffic.
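
The way LLQ combines a strict priority queue with weighted class queues can be sketched as follows. This is a deliberately simplified model: the class name LlqScheduler, its method names, and the use of per-packet weights in a fixed round-robin order are assumptions made for illustration, whereas a real implementation allocates bandwidth rather than packet counts and also limits the rate of the priority queue.

```python
from collections import deque


class LlqScheduler:
    """Toy scheduler: one strict priority queue plus weighted class queues."""

    def __init__(self, weights):
        # weights maps a class name to the number of turns it gets per cycle.
        self.priority = deque()
        self.classes = {name: deque() for name in weights}
        # Expand the weights into a service order, e.g. {"a": 2, "b": 1} -> a, a, b.
        self.order = [name for name, weight in weights.items() for _ in range(weight)]
        self.position = 0

    def enqueue_priority(self, packet):
        """Queue delay-sensitive traffic (such as voice) for strict priority service."""
        self.priority.append(packet)

    def enqueue_class(self, class_name, packet):
        """Queue a packet in one of the weighted class queues."""
        self.classes[class_name].append(packet)

    def dequeue(self):
        """Return the next packet to send, or None if every queue is empty."""
        # The priority queue is always emptied before any class queue is served.
        if self.priority:
            return self.priority.popleft()
        # Otherwise walk the weighted order, skipping classes with nothing to send.
        for _ in range(len(self.order)):
            name = self.order[self.position]
            self.position = (self.position + 1) % len(self.order)
            if self.classes[name]:
                return self.classes[name].popleft()
        return None
```

With weights of 2 for a transactional class and 1 for a bulk class, a voice packet placed in the priority queue is sent first even if it was enqueued last, and the remaining packets then leave in a two-to-one pattern while both class queues have traffic.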

Key Point

LLQ is the recommended mechanism for networks with voice traffic.

Link-Specific Tools

Key Point

Link-specific tools are those that are enabled on both ends of a point-to-point WAN connection to reduce the bandwidth required or delay experienced on that link. The QoS tools available include header compression (to reduce the bandwidth utilization) and link fragmentation and interleaving (LFI) (to reduce the delay encountered).

Voice packets typically have a small payload (the voice data) relative to the packet headers: the RTP, UDP, and IP headers add up to 40 bytes. So, compressing the header of such packets can have a dramatic effect on the bandwidth they require. RTP header compression, called cRTP, compresses this 40-byte header to 2 or 4 bytes.
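
The size of the saving can be estimated with simple arithmetic. The sketch below assumes a voice stream that sends 50 packets per second with a 20-byte payload; these two figures are assumptions for the example and depend on the codec in use, and Layer 2 framing overhead is ignored. The function name is also hypothetical.

```python
def voice_bandwidth_bps(payload_bytes, header_bytes, packets_per_second):
    """Bandwidth of one voice stream in bits per second, ignoring Layer 2 overhead."""
    return (payload_bytes + header_bytes) * 8 * packets_per_second


# Assumed stream: 20-byte payload, 50 packets per second.
uncompressed = voice_bandwidth_bps(20, 40, 50)  # full 40-byte header
compressed = voice_bandwidth_bps(20, 2, 50)     # cRTP header of 2 bytes
```

Under these assumptions, the stream needs 24,000 bps with the full 40-byte header and 8800 bps with a 2-byte compressed header, a reduction of more than 60 percent.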

Note

Voice compression, which reduces the size of the voice payload while still maintaining the quality at an acceptable level, is described in Chapter 7.

Even with queuing and compression in place, a delay-sensitive packet (such as a voice packet) could be ready to go out of a WAN interface just after a large packet (for example, part of a file transfer) has been sent on that interface. After forwarding of a packet out of an interface has begun, queuing has no effect and cannot recall the large packet. Therefore, a voice packet that gets stuck behind a large data packet on a WAN link can experience a relatively long delay and, as a result, the quality of the voice conversation can suffer. To counteract this, LFI can be configured on WAN links to fragment large packets (split them into smaller packets) and interleave those fragments with other packets waiting to go out on the interface. The smaller, delay-sensitive packets can travel with minimal delay. The fragments of the larger packets need to be reassembled at the receiving end, so the larger packets will experience some delay. However, because the applications sending these packets are not delay-sensitive, they should not be adversely affected by this delay. Figure 6-8 illustrates the LFI concept.

Figure 6-8. LFI Ensures That Smaller Packets Do Not Get Stuck Behind Larger Packets
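
The delay that LFI addresses is the serialization delay of the large packet, which can be estimated from the packet size and the link speed. In the sketch below, the function names, the 64-kbps link speed, the 1500-byte packet, and the 10-ms target delay are all assumptions chosen for illustration.

```python
def serialization_delay_ms(packet_bytes, link_bps):
    """Time in milliseconds needed to clock a packet onto the link."""
    return packet_bytes * 8 * 1000 / link_bps


def fragment_size_bytes(link_bps, target_delay_ms):
    """Largest fragment (in bytes) that can be sent within the target delay."""
    return link_bps * target_delay_ms // 8000
```

On a 64-kbps link, a 1500-byte packet takes 187.5 ms to send, so a voice packet arriving just behind it waits that long. Limiting fragments to 80 bytes on the same link keeps the wait behind any single fragment to about 10 ms.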

Note

Recall from Appendix B that the IPv4 packet header includes a 16-bit identification field, followed by 3 bits of flags and a 13-bit fragment offset. The identification field ties the fragments of one packet together, while the flags and fragment offset indicate whether the packet is a fragment and, if so, the offset of the fragment in the original packet. The receiving end can then reassemble the fragments to create the original packet.
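
In the IPv4 header, the 3 flag bits and the 13-bit fragment offset share a single 16-bit word, and the offset is counted in units of 8 bytes. The following sketch decodes that word; the function name and the dictionary keys are assumptions made for this example.

```python
def decode_fragment_word(word):
    """Decode the 16-bit IPv4 header word holding the flags and fragment offset."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("expected a 16-bit value")
    flags = word >> 13                 # upper 3 bits: reserved, DF, MF
    offset_units = word & 0x1FFF       # lower 13 bits, in units of 8 bytes
    return {
        "dont_fragment": bool(flags & 0b010),
        "more_fragments": bool(flags & 0b001),
        "offset_bytes": offset_units * 8,
    }
```

For example, a value of hexadecimal 2000 indicates the first fragment of a packet (more fragments follow, offset 0), while hexadecimal 00B9 indicates a final fragment that starts 1480 bytes into the original packet.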

AutoQoS

The Cisco AutoQoS feature on routers and switches provides a simple, automatic way to enable QoS configurations in conformance with Cisco's best-practice recommendations. Only one command is required. The router or switch then creates configuration commands to perform such things as classifying and marking VoIP traffic and then applying an LLQ queuing strategy on WAN links for that traffic. The configuration created by AutoQoS becomes part of the normal configuration file and can, therefore, be edited if required.

The first phase of AutoQoS, available in various versions of router IOS Release 12.3, only creates configurations related to VoIP traffic.

Note

The Cisco Feature Navigator tool, available at http://www.cisco.com/go/fn, allows you to quickly find the Cisco IOS and switch Catalyst Operating System (CatOS) Software release required for the features that you want to run on your network. For example, you can use this tool to determine the IOS release required to run AutoQoS on the routers in your network.

The second phase of AutoQoS is called AutoQoS Enterprise and includes support for all types of data. It configures the router with commands to classify, mark, and handle packets in up to 10 of the 11 QoS Baseline traffic classes. The Mission-Critical traffic class is the only one not defined, because it is specific to each organization. As with the earlier release, the commands created by AutoQoS Enterprise can be edited if required.

Note

Further information on AutoQoS can be found at http://www.cisco.com/en/US/products/ps6656/products_ios_protocol_opt and at http://www.cisco.com/en/US/tech/tk543/tk759/tk879/tsd_technology_support_protocol_home.html.
