Planning, Analysis, and Assessment

If you are packing your bags for an extended adventure, you try to anticipate everything you might need. You make a list of the items you need to have with you and when you might need them. To make your list, you start with the things you have already and then add the things you need but must acquire.

A VoIP deployment is analogous to such an extended adventure. Even before you pack your bags, you need to decide what you want to accomplish and settle on an accompanying schedule and budget. You need to determine where you are today so that you know how far you have to travel to reach your target.

Planning is the most important phase of a successful VoIP deployment. If you complete the upfront work and set the right expectations, every other step should be a matter of checking to make sure that expectations are being met.

Like most large IT projects, a VoIP deployment may face schedule and time constraints. Because shortcuts during the planning and evaluation process can have negative effects on the final implementation, start by estimating the time required to complete these stages. The Tolly Group, Inc. estimates that 8 to 12 months are needed to complete the planning and evaluation stages of a major VoIP project, as shown in Figure 3-1.

Figure 3-1. The Tolly Group, Inc. estimate for a major VoIP project

From The Tolly Group, Inc. ITclarity ad, 2000.




This may seem like a large amount of time just for planning and evaluation, and it will probably decrease as more deployments occur and processes are refined. Tools can help automate the planning process, thus yielding further productivity improvements. However, the planning phase can still take a significant portion of the project life cycle. The time is well spent: many questions are answered, and a large amount of data is collected during this stage. During planning there are two broad sets of questions to answer:

  • Where are you currently? What do you need to know to initiate your planning—the information related to your current environment?

  • Where are you headed? What are the things you need to decide—the scope of your deployment and the components that comprise it?

When you embark on the planning phase of a VoIP deployment, you start by collecting information about your current data network and its usage. Compile this information with a view toward understanding what must be done to reach your final goal—a successful VoIP deployment. This information is broken down into four categories of questions:

  • Telephony usage— What is the current call volume? What is the profile of these calls, including their typical frequency, duration, location, and call flow?

  • Reliability— What is the current data network/system reliability? What is your target reliability? What will it take to get from the current state of reliability to your target?

  • Call quality— What is the current estimated call quality? What is your target call quality? What will it take to get from the current level of call quality to your target?

  • VoIP readiness— How do you assess VoIP readiness? What is needed to perform a readiness assessment?

For each category of questions, there is an established methodology for finding the most useful answers. These methods are discussed in detail in the following sections.

Understanding Current Telephony Usage

Key characteristics of the telephone calls that travel over your existing phone system are well known; the data has been captured somewhere, ever since your telephones were first used. To understand how many users and how many calls your VoIP system must support, look at how many your current telephone system supports. A team somewhere within your organization, or under contract by your organization, knows this information intimately and has been tracking it for years. Here is what you can find out from them, or from their records:

  • Number of calls

  • Number of users (number of distinct phone numbers)

  • Duration of calls

  • Number of concurrent calls

  • Call volume profiles—peak and average usage statistics

    - When do they occur?

    - How long are they?

  • Location and call flow—What percentage of the calls occurs within each site? What percentage occurs within the organization, from site to site? How many calls go to and from the outside world?

All of this information helps you to plan your VoIP deployment because it enables you to understand the requirements and expectations that your deployment must fulfill.

Call Detail Records

Telephone records and the current private branch exchange (PBX) call volume reports are a good source of data about the likely call volume a network will have to handle. Your current phone supplier or the system itself captures information about telephone calls in call detail records (CDRs). You have seen simple examples of these in your monthly itemized long-distance bill. CDRs include information such as the date and time of each call, the number that was called, the duration of the call, and its cost. In practice, much more information is captured internally, including information about incoming calls, whether an attempted call was completed or not, the account to which a call should be billed, and so on.

Softcopy CDRs can be easily processed. Many PBXs can sort them or export them as comma-separated value (CSV) files, which you can load into spreadsheet programs, such as Microsoft Excel. A useful statistic often calculated from CDRs is the busy hour—the clock hour in a day when the most calls occur—and the busy day. Calls during the busy hour are usually broken into two categories: the busy hour calls attempted (BHCA) and the busy hour calls completed (BHCC). These two numbers, BHCA and BHCC, describe the peak call volume. (For the Public Switched Telephone Network (PSTN) in the U.S., the busiest hour on the busiest day is usually after lunchtime on Mother's Day.)
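As a rough illustration of the busy-hour statistics just described, the following Python sketch counts attempted and completed calls per clock hour from a CSV export. The column names (start_time, duration_seconds, completed), the date format, and the sample rows are assumptions made for this example; a real PBX export will use its own layout.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical CDR export: the column names below are assumptions for this
# sketch; a real PBX export will use its own field names and date format.
SAMPLE_CDR_CSV = """start_time,duration_seconds,completed
2024-05-12 13:05:00,180,yes
2024-05-12 13:20:00,0,no
2024-05-12 13:45:00,240,yes
2024-05-12 14:10:00,60,yes
2024-05-12 09:30:00,300,yes
"""


def busy_hour_stats(csv_text):
    """Return (busy_hour, BHCA, BHCC) for the CDR rows in csv_text.

    busy_hour is a (date, hour) pair: the clock hour with the most attempts.
    Assumes the export contains at least one call record.
    """
    attempted = Counter()
    completed = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        start = datetime.strptime(row["start_time"], "%Y-%m-%d %H:%M:%S")
        key = (start.date().isoformat(), start.hour)
        attempted[key] += 1
        if row["completed"].strip().lower() == "yes":
            completed[key] += 1
    busy_hour, bhca = attempted.most_common(1)[0]
    return busy_hour, bhca, completed[busy_hour]
```

For the sample rows, the busy hour is 13:00 on 2024-05-12, with three calls attempted and two completed.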

Call Volume Statistics

In the telephony industry, the busy hour traffic is often calculated in erlangs. An erlang is a number that represents the busyness of a particular telephone line. An erlang value of 1 means that the telephone line is 100 percent busy. Similarly, telephony statistics may include an Erlang B calculation, which is used to tabulate one of the following factors, given the other two:

  • Busy hour traffic (BHT)— The number of hours of call traffic during the busiest hour of operation

  • Blocking— The percentage of calls that are blocked because not enough lines are available

  • Lines— The number of lines in a trunk group

Simple calculators are available that implement the Erlang B calculation and allow for some quick modeling scenarios of the different statistics. To find them, enter "erlang calculator" in a web search engine.
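If you prefer to experiment yourself, the Erlang B calculation is short enough to sketch in Python. The version below uses the standard iterative form of the formula; the function names are illustrative, and the sketch assumes the busy hour traffic is already expressed in erlangs.

```python
def erlang_b(traffic_erlangs, lines):
    """Blocking probability for offered traffic (in erlangs) on a trunk group."""
    blocking = 1.0  # with zero lines, every call is blocked
    for n in range(1, lines + 1):
        blocking = (traffic_erlangs * blocking) / (n + traffic_erlangs * blocking)
    return blocking


def lines_needed(traffic_erlangs, target_blocking):
    """Smallest number of lines whose blocking does not exceed the target.

    target_blocking is a fraction greater than zero (0.01 means 1 percent).
    """
    lines = 0
    while erlang_b(traffic_erlangs, lines) > target_blocking:
        lines += 1
    return lines
```

For example, erlang_b(2, 2) returns 0.4: two erlangs of traffic offered to two lines means 40 percent of calls are blocked. Given the traffic and a blocking target, lines_needed solves for the third factor.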

Call Flow Analysis

When it comes to determining where to stage a VoIP deployment, the flow of call traffic is an especially useful statistic. If a large percentage of calls occurs within a particular site (intrasite traffic), that location may be ideal for VoIP on the LAN. If a high volume of call traffic passes between two sites (intersite traffic), those sites may be candidates for VoIP because they can take advantage of toll bypass.

In addition to examining call flow within the corporate network, it is a good idea to determine how many calls travel to and from the PSTN. Analyzing the data using the busy-hour calculations can allow for capacity planning when VoIP traffic is added to the data network.
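A call flow breakdown of this kind can be sketched in a few lines of Python. The record layout (pairs of calling and called sites), the "PSTN" marker for outside parties, and the site names are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical call records as (calling_site, called_site) pairs; "PSTN" marks
# a party outside the organization. Site names are illustrative only.
SAMPLE_CALLS = [
    ("Raleigh", "Raleigh"),
    ("Raleigh", "Raleigh"),
    ("Raleigh", "Austin"),
    ("Austin", "PSTN"),
    ("PSTN", "Raleigh"),
]


def classify_call(calling_site, called_site):
    """Label a call as intrasite, intersite, or pstn."""
    if "PSTN" in (calling_site, called_site):
        return "pstn"
    if calling_site == called_site:
        return "intrasite"
    return "intersite"


def call_flow_percentages(calls):
    """Return the percentage of calls in each call-flow category."""
    counts = Counter(classify_call(src, dst) for src, dst in calls)
    total = sum(counts.values())
    return {kind: 100.0 * counts[kind] / total
            for kind in ("intrasite", "intersite", "pstn")}
```

Running the same classification per site, rather than across the whole organization, would show which locations are the strongest candidates for a first stage.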

The current telephony usage information that you gather will serve as valuable input for the later planning stages of your deployment. The next section looks at the next category of questions to ask as you are gathering data about your current network: reliability.

Understanding Reliability

Users have come to expect a high level of reliability from their phone system. Decades of knowledge, experience, and innovation have raised PSTN reliability very high. When you pick up a phone, you get a dial tone almost instantly. Can you even recall the last time a telephone call was dropped by the PSTN? Typical user expectations of unavailability for the phone system are about 5 minutes, cumulatively, per year. The level of availability the PSTN delivers in the U.S. is sometimes referred to as "five nines," which means that a dial tone is available 99.999 percent of the time. Table 3-1 shows the amount of downtime for different availability percentages.

Table 3-1. Nines of Availability and Corresponding Downtime

Availability    Cumulative Downtime per Year
99.000%         3 days, 15 hours, 36 minutes
99.500%         1 day, 19 hours, 48 minutes
99.900%         8 hours, 46 minutes
99.950%         4 hours, 23 minutes
99.990%         53 minutes
99.999%         5 minutes
99.9999%        30 seconds




To determine the reliability of a system, you need to know the availability percentage. Availability is defined as follows:

Availability = Mean time between failures / total time



where:

Mean time between failures = Average time between each outage or failure



Total time = Mean time between failures + Mean time to repair the failures



Another way to look at availability is to compare the total downtime with the total elapsed time:

Availability = 1 – (System outage time)/(System elapsed time)



Sometimes the key measure is unavailability, which is easily derived from availability:

Unavailability = 1 – Availability



So in Table 3-1, the availability of "five nines," 99.999 percent, was calculated as follows:

0.99999 ≈ 1 – 5 / (365 × 24 × 60)

        = 1 – 5 / 525,600
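The availability formulas above translate directly into a few lines of Python. This is a minimal sketch; the function names are illustrative, and a year is taken as 365 days.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def availability_from_mtbf(mtbf_hours, mttr_hours):
    """Availability = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def availability_from_outage(outage_minutes, elapsed_minutes=MINUTES_PER_YEAR):
    """Availability = 1 - outage time / elapsed time."""
    return 1 - outage_minutes / elapsed_minutes


def downtime_minutes_per_year(availability):
    """Cumulative yearly downtime implied by an availability fraction."""
    return (1 - availability) * MINUTES_PER_YEAR
```

For example, downtime_minutes_per_year(0.99999) comes to roughly 5.26 minutes, which Table 3-1 rounds to 5 minutes.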




Contrast the PSTN's level of reliability with what is achieved by most data networks today and you will recognize the challenge that a VoIP deployment faces. Data networks just have not reached the reliability found in the PSTN yet. Instead, they are plagued by periodic outages. Network outages are caused by a variety of events, such as user errors, software failures, and other technology failures, as shown in Figure 3-2.

Figure 3-2. Reasons for System Unavailability

From "Getting Ready for Voice over Data," Cisco presentation, 2000.




As shown in the figure, the causes of systems unavailability include the following:

  • User errors and process— Change management, process consistency

  • Technology— Hardware, network links, environmental issues, natural disasters

  • Software application— Software issues, performance and load, scaling

So, by now you may be thinking, "Wow, my e-mail server goes down a lot. What will happen when my phone and e-mail are on the same network?" On average, computer system reliability is estimated at around 98.5 percent.[1] This number, which works out to 5 days and 11.4 hours of downtime each year, includes not only the data networks and their components, but also all the core business applications, servers, and mainframes.

Although the core business applications, servers, and mainframes are certainly important to your business, their high availability is not required to reach five nines of VoIP reliability. For VoIP, you should instead focus specifically on two areas:

  • The reliability of the network and its components

  • The reliability of the VoIP components (VoIP server, gateways, IP PBXs)

First, consider network reliability. A survey by the Merit Project shows that most network outages stem from performance issues, such as peak load and insufficient bandwidth.[1] Most often, the problem is too much traffic and too little capacity. Security intrusions, particularly denial-of-service (DoS) attacks, only add to the network outage problem.

Second, your key VoIP components really are server boxes, running complex operating systems and complex applications on off-the-shelf computers. They are susceptible to the three categories of problems discussed previously: The software can fail in various ways; users, attackers, and the IT team can cause problems; and every piece of hardware technology has many failure modes. These server boxes need to be made highly reliable, and kept highly reliable, to achieve high VoIP availability.

Therefore, consider two strong recommendations for ensuring high VoIP reliability:

  • Get a strong handle on your network traffic— Understand the network's current capacity and traffic mix, including applications, flows, and priorities. Understand where the traffic should be when the VoIP deployment is complete. Control tightly what is flowing in and out of the network (by using firewalls, for example). Use network policy management to control the priority of each type of traffic. Apply firm user management, to police what each user can do and to control which resources users can access.

  • Get high-quality VoIP server, gateway, and IP PBX boxes, and secure them well— Install their required software, then put change management and access controls on the box. Control tightly what changes are made and who can make them. Lock them down, to avoid physical or network intrusions. Put them on an uninterruptible power supply (UPS) to avoid downtime due to power outages.

After following these guidelines, the individual components that make up the network should be examined as well. Cisco Systems identifies the key availability items as those that follow.[2] All are discussed in detail in the following sections:

  • Hardware reliability

  • Software reliability and features

  • Network link and carrier reliability

  • Environment and electrical power

  • Network design

  • User errors and process management

Hardware Reliability

Inevitably hardware fails, so plan in advance to purchase hardware that is resilient—resistant to failure. In many instances, network equipment vendors have included features to help make their hardware more resilient. For example, devices may have multiple CPUs, power supplies, and cooling fans. If one of these components fails, the device can still operate. The duplication of components to make the system more resilient is referred to as redundancy.
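To get a feel for how much redundancy can help, you can estimate the combined availability of duplicated components. The sketch below uses the textbook formulas for parallel and series arrangements. It assumes failures are independent, which real hardware only approximates, so treat the results as optimistic estimates rather than guarantees.

```python
def parallel_availability(component_availability, copies):
    """Availability of redundant components when any one copy is enough.

    Assumes failures are independent, which real hardware only approximates.
    """
    return 1 - (1 - component_availability) ** copies


def series_availability(*availabilities):
    """Availability of a chain in which every component must be up."""
    result = 1.0
    for availability in availabilities:
        result *= availability
    return result
```

Under these assumptions, two power supplies that are each 99 percent available combine to about 99.99 percent, while two 99.9 percent components that must both be working combine to only about 99.8 percent.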

Load balancing also provides resiliency and scalability. In a load-balanced scenario, multiple devices are configured to share the network or server load. For example, a group of web servers may be configured to alternate when responding to requests for the same website content. Load balancing of web servers is commonplace and provides resiliency in the case where a single server in the group fails.

Clustering, a third technique for achieving resiliency, enables multiple devices to behave as a single entity. Within a cluster of devices, typically one device serves as the primary device or publisher. Other devices act as the backups or subscribers. If the primary device fails, a backup takes over in a seamless manner that is transparent to users. In addition to resiliency, clustering also provides for easier management and scalability. The entire group of devices can be managed as a single unit and, when combined, can support more users than a single device. Clustering is a good way to improve the reliability of VoIP servers. Figure 3-3 shows an example of clustering.

Figure 3-3. Cisco CallManager in a Typical Cluster Configuration[3]




Software Reliability and Features

To improve reliability, don't install lots of other programs or applications on your VoIP servers. Even though server operating systems generally try to minimize the impact that one application can have on another, ill-behaved programs can use large amounts of CPU and memory, creating server-performance problems. Worse yet, programs that provide device drivers, or operate in kernel mode, can potentially cause the server to crash. Limit the applications installed on critical VoIP servers to what is needed to operate and manage the server. Many vendors offer certification programs to ensure that programs installed on their servers won't adversely affect server performance. Likewise, shop around for management tools that are vendor-certified.

Test software patches carefully and apply them in stages. For example, you might apply patches to a limited number of servers and wait to see the effects before applying them to all your servers. Unfortunately, fixes for software often introduce other problems, so it is best to test them out before widespread deployment. Check to see if you can apply the fix "hot," without a server reboot. Some operating systems have features to allow fixes to be applied while the system is running. If the fix requires a server reboot, plan to do the updates during off-hours, or have a backup server available to bring online while the other server is being updated.

Lock VoIP servers down with the tightest intrusion security available. You want to make sure that your critical VoIP servers are not vulnerable to attack. Securing your servers may require installing a firewall to protect them or an intrusion detection system to warn you of any security violations. Chapter 8, "VoIP Security," is devoted entirely to VoIP security.

Link and Carrier Reliability

Network link resiliency is an important consideration when you are shopping around for an Internet service provider (ISP). Investigate the reliability record for the ISP. A service level agreement (SLA) generally requires an ISP to provide to its customer certain levels of link availability and network performance. Talk with other customers; thorough SLA contracts and tight adherence to them are strong indicators of ISP reliability. Be sure to include reliability details in your contracts with the ISP. Finally, consider what happens when a failure does occur. How quickly does the ISP restore service? When a link goes down, what is the process for opening a "trouble ticket"? All of these questions should be answered in a properly structured contract with your provider. Chapter 7, "Establishing VoIP SLAs," covers VoIP SLAs in greater depth.

Environmental and Electrical Power

The environment surrounding your network is a factor that is easy to ignore. However, environmental factors should be included in any reliability assessment. Temperature extremes can lead to system failure. Flood damage can wipe out your system and require extensive repairs. Proper air conditioning may be lacking in server locations. Whenever possible, raised floors or rooms that are protected against environmental hazards should be considered.

With plain old telephone service (POTS), the phones are powered by the phone line, which may be independent of the local electrical system. That is why you can still call the electrical company from your wired phone at home when the power goes out. IP phones have no such independent supply; they depend on local power or on the data network. Ethernet switches are now available that provide power to IP phones over the standard Ethernet wiring. It is a good idea to look at the current capacity and reliability level of the power system for key components in your network. UPS boxes can reduce the risk of a power failure affecting these key systems.

Network Design

Good network design can eliminate problems that stem from a single point of failure, which can arise when all traffic must go through one device. (For example, an office has a single firewall that must filter all incoming network traffic.) Single points of failure can also create performance bottlenecks if a device is overburdened. Look for these points in your network design and seek to eliminate them.

Even if you use redundant components, make sure that if one device fails, its redundant partner(s) can handle the load sufficiently. For example, if you have two DS-3s, where each DS-3 carries 30 Mbps of traffic, the failure of one DS-3 leaves the remaining DS-3, with a capacity of roughly 45 Mbps, facing 60 Mbps of offered traffic, and heavy congestion results.
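The DS-3 example can be turned into a quick capacity check. The Python sketch below assumes identical links and traffic that can be spread evenly across the survivors. The constant is the nominal DS-3 line rate; usable throughput is somewhat lower, so a real check should leave headroom.

```python
DS3_CAPACITY_MBPS = 44.736  # nominal DS-3 line rate


def survives_single_failure(link_loads_mbps, link_capacity_mbps):
    """True if the remaining links can carry all traffic after one link fails.

    Assumes identical links and traffic that spreads evenly across survivors.
    """
    if len(link_loads_mbps) < 2:
        return False  # no redundant partner at all
    total_load = sum(link_loads_mbps)
    surviving_capacity = (len(link_loads_mbps) - 1) * link_capacity_mbps
    return total_load <= surviving_capacity
```

With two links carrying 30 Mbps each, the check fails because 60 Mbps cannot fit on one surviving DS-3. At 20 Mbps each, it passes.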

Figure 3-4 shows an example of a single point of failure in a network.

Figure 3-4. When All Traffic Flows Through a Single Device, It Can Create a Single Point of Failure




When a failure occurs, operator intervention is not always required because good network design can make the network self-healing. Self-healing networks can reroute data over different paths in the case of a link failure along the primary path. Take advantage of dynamic routing protocols to take some pressure off your network operator. Gateway redundancy protocols, such as Cisco's Hot Standby Router Protocol (HSRP), can provide increased resiliency by allowing multiple IP routers to be deployed and act as a single default gateway. Consider the alternative, in which a single router serves as your default gateway: if it fails, you may not be able to access the rest of the network.

A dynamic routing protocol, such as Open Shortest Path First (OSPF), is preferable to the use of static routes, because dynamic protocols can adapt to network changes. Be sure to work through possible failure scenarios to ensure that traffic is not "black-holed"—that is, consumed by a router and not forwarded—while the network converges.

A good network design also considers security. Firewalls and intrusion detection systems should be used to protect the enterprise network from outside intrusions. Network address translation (NAT) devices can keep computer addresses hidden from hackers. However, be careful where you place the NAT devices, because it can be difficult to configure VoIP to work correctly across a NAT device.

As you design your network, don't neglect other crucial network services such as the Domain Name System (DNS) or Dynamic Host Configuration Protocol (DHCP). Most IP phones use DHCP to minimize the configuration necessary to obtain an IP address. If the DHCP service is down, the IP phone is unable to join the network. DNS and DHCP services run on server computers, so primary and backup servers may be necessary.

User Errors and IT Process Management

User error and IT processes are the final contributors to availability issues. To prevent both sources of network disruptions, you need good IT processes in place to support VoIP well. Look to eliminate error-prone processes; in doing so, you will be able to reduce or avoid user errors. Here are some tips to help your IT staff avoid user errors:

  • Thorough training— Give the IT staff the training needed to support and manage a VoIP deployment.

  • Intuitive user interfaces— Look for configuration and management tools that are intuitive and easy to use.

  • Redesign or automate tricky tasks— Automation is a good way to handle error-prone tasks.

Your goals in maintaining a reliable VoIP deployment occur in stages:

  • Prevent— If you prevent and avoid problems altogether, you increase availability.

  • Detect— When prevention fails, you want to spot problems as soon as possible, to shorten the time that elapses before isolation and repair.

  • React— When a problem is encountered, you want a timely and appropriate reaction to shorten the isolation and repair time. Having reacted well, close the loop by making the necessary long-term fixes and responses to prevent and avoid the same problems in the future.

These are the goals in the IT project stages introduced in Chapter 2:

  • Management and monitoring— VoIP management is required to ensure the reliability and availability of the components and high call quality. Management is critical if you are going to be proactive in dealing with problems. When a problem occurs, how long does it take you to detect it? Ideally, you would like to know about the problem before users start calling it in. Management software tools should let you set thresholds for key reliability and call-quality measurements and then receive notifications when the thresholds are crossed.

  • Fault isolation, diagnosis, and repair speed— Once you do detect a problem, the key to maintaining a reliable VoIP network is how quickly you can isolate, diagnose, and fix it. You need to quickly pinpoint the component in the VoIP system—server, phone, router, or network link—that is causing the problem.
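The threshold idea described under management and monitoring can be sketched as follows. The metric names and limits are placeholders chosen for this example, not recommended values, and a real management tool would send notifications rather than simply return a list.

```python
# Hypothetical thresholds; the metric names and limits are placeholders.
THRESHOLDS = {
    "delay_ms": 150.0,
    "jitter_ms": 30.0,
    "packet_loss_pct": 1.0,
}


def crossed_thresholds(measurements, thresholds=THRESHOLDS):
    """Return a sorted list of (metric, value, limit) for each exceeded limit."""
    alerts = []
    for metric, limit in thresholds.items():
        value = measurements.get(metric)
        if value is not None and value > limit:
            alerts.append((metric, value, limit))
    return sorted(alerts)
```

Calling crossed_thresholds with each new set of measurements gives you the list of items to raise with the operations team, ideally before users start calling.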

Once you reach a point where the system is very reliable and you can resolve any availability problems quickly, then it is time to take a look at network performance as it relates to call quality.

Understanding Call Quality

Traditional networked applications and VoIP applications have different network performance requirements. For example, while file-transfer applications consume large amounts of bandwidth by sending data as quickly as possible, enterprise resource planning (ERP) applications send small amounts of data, but use frequent flows between sender and receiver. By contrast, VoIP applications consume relatively little bandwidth, but can't tolerate large delays or large variations in delay.

Even when they are carried on the same network, voice traffic and data traffic can't be handled the same way, for the following reasons:

  • They have different packet sizes.

  • They are sent at different rates.

  • They are buffered and delivered to the destination differently.

  • They must fulfill very different user expectations.

Although an e-mail message or a file transfer can be delayed by half an hour without exciting anyone's notice, delays of a few hundred milliseconds can impair a VoIP telephone call. (A millisecond, abbreviated ms, is one thousandth of a second, so 1000 ms equal 1 second.) And when you start to run VoIP across any given enterprise network, delays caused by other applications, overloaded routers, or faulty switches may be inevitable.

Most data networks are not ready to provide the performance needed for PSTN-level call quality or reliability. You might argue that the quality is great on a campus LAN with underutilized capacity, but how many enterprise networks consist of a single campus LAN? This section looks at the network performance issues before and after a VoIP deployment.

Network Performance Before VoIP

Data networks have customarily been tuned to make network applications, such as web transactions, e-mail, and ERP, run really well. Two characteristics of these types of applications affect their performance requirements on the network:

  • They send data using the Transmission Control Protocol (TCP)— TCP is a connection-oriented protocol, which means that the two sides of the data exchange keep careful track of everything that is sent and received. For example, your browser uses TCP when fetching web pages. You don't want to see holes or out-of-order pieces of data on the screen, so your browser and the web server program work together to make sure everything is received intact. TCP also provides flow control and congestion control: the receiver advertises how much data it is ready to accept, and the sender slows down when it detects signs of congestion, such as lost packets. TCP applications are usually elastic, consuming as much bandwidth as is available to them.

  • They are transaction-oriented— Application transactions consist of requests and responses. A transaction can be as simple as a single request and response: a credit card number is sent and an authorization is received. Or a transaction can contain many short request and response flows, or even, in the case of a file transfer, a single transmission of a large amount of data. In a typical application transaction, a client requests a web page, and the server responds by sending the information. Similarly, an application might request a set of records from a Structured Query Language (SQL) query on the server. The amount of time it takes to perform a transaction gives the user an indicator for how responsive the application feels. The back-and-forth nature of these transactions means the application demands certain performance provisions from the network.

How do you know when one of these transaction-oriented applications is not performing well? The key performance measurements are throughput and response time.

Throughput numbers tell the rate at which traffic can flow through a network. This is the key measurement for applications such as FTP, which need to transfer large amounts of data. Networks with higher throughput can deliver data in a shorter period of time. A measurement reflecting a network's capacity, throughput is usually measured in bytes or bits per second.

Response time is a measurement that indicates how long it takes to send a request and receive a reply over a network. The response-time metric is key for network transactions, because the longer an operation takes, the more impatient a user gets. Usually described in milliseconds or seconds, the response time measurement for a transaction reflects the user's experience with a network.

A network that consistently provides high throughput and low response time lets TCP-based transactional applications perform well.
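Both measurements are simple to compute once you have the raw numbers. The following Python sketch shows one way to do it; the function names are illustrative, and request_fn stands in for whatever transaction you want to time.

```python
import time


def throughput_bps(bytes_transferred, seconds):
    """Throughput in bits per second for a completed transfer."""
    return bytes_transferred * 8 / seconds


def timed_transaction(request_fn):
    """Run request_fn() and return (result, response_time_in_seconds)."""
    start = time.perf_counter()
    result = request_fn()
    return result, time.perf_counter() - start
```

For example, a transfer of 1,000,000 bytes in 4 seconds works out to 2,000,000 bits per second.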

Network Performance with VoIP

Because of additional requirements to provide good call quality, voice traffic places a new set of demands on data networks. Even a network that is tuned to provide the high throughput and low response time needed to make other applications perform well may perform poorly when voice traffic is added. Voice has real-time characteristics, which have very strict requirements for network performance. Voice applications have two characteristics that require real-time network performance:

  • They send data using the Real-time Transport Protocol— RTP is an application-layer protocol that rides on the connectionless User Datagram Protocol. UDP is said to be connectionless because it provides for no acknowledgments or tracking of the data sent and received. Nor does UDP provide for retransmission of data that has been lost by the network. In contrast to TCP applications, RTP does not provide congestion control directly, so a sender could overwhelm a receiver by sending too much data, too quickly. To help prevent this problem, RTP applications usually send data at a fixed data rate.

  • Interactive conversations can't tolerate large delays— A typical telephone conversation usually depends on a certain amount of interaction between the caller and the callee. The higher the level of interaction, the less you can tolerate delays in the conversation. If the delay is too high, the conversation is burdened by a "walkie-talkie" effect—the talkers feel they must complete each sentence with some keyword like over to let the receiver know that they have finished talking. This can become very tedious, and gives both parties in the conversation a perception of poor call quality.
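To see what a fixed data rate means in practice, you can work out the packet rate and bandwidth of a single voice stream. The sketch below assumes IPv4 with no header options and the fixed 12-byte RTP header. The 64-kbps codec rate and 20-ms packet interval used in the example are assumptions, not values from this chapter, and link-layer overhead is not included.

```python
IP_HEADER_BYTES = 20   # IPv4 header without options
UDP_HEADER_BYTES = 8
RTP_HEADER_BYTES = 12  # fixed RTP header without extensions


def rtp_stream_profile(codec_bps, packet_interval_ms):
    """Return (packets_per_second, payload_bytes, ip_layer_bps), one direction."""
    packets_per_second = 1000 / packet_interval_ms
    payload_bytes = codec_bps / 8 * packet_interval_ms / 1000
    packet_bytes = (payload_bytes + IP_HEADER_BYTES
                    + UDP_HEADER_BYTES + RTP_HEADER_BYTES)
    return packets_per_second, payload_bytes, packet_bytes * 8 * packets_per_second
```

With a 64-kbps codec and 20-ms packets, rtp_stream_profile(64000, 20) returns 50 packets per second, 160 bytes of voice per packet, and 80,000 bits per second at the IP layer in each direction. The headers add a quarter again to the codec rate.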

When a converged network is tuned correctly, many types of applications can coexist and perform well. But the converse is also true. How do you know if the voice call quality is poor? The fundamental network performance measurements for voice traffic are delay, jitter, and packet loss. These measurements and their impact on call quality are discussed in detail in the sections that follow.

After this brief introduction to some of the network performance issues that come into play when you deploy VoIP, the next portion of this chapter examines the reality that call quality equals network performance.

Standards for Measuring Call Quality

The quality goal for a VoIP call is the same level of quality that the PSTN consistently delivers, and it is a lofty goal. PSTN-level quality is sometimes referred to as "toll" quality, and it is excellent. Some companies have even advertised PSTN quality so good that "you can hear a pin drop." Getting good call quality day in and day out with a VoIP deployment is possible, but it implies that you know what level of call quality you are getting. That is why it helps to understand some of the different measurement standards for voice quality.

Ever since the telephone was invented, call-quality testing has usually been subjective: picking up a telephone and listening to the quality of the voice. The leading subjective measurement of voice quality is the mean opinion score, or MOS, as described in the International Telecommunication Union (ITU) recommendation P.800.[4]

NOTE

All ITU publications can be accessed from http://www.itu.int/publications/main_publ/itut.html.


To determine a MOS for a telephone call by using human listeners, lots of people listen to a call. A sentence is read aloud. After hearing the sentence, the listeners give their opinion of how good it sounded. (A sentence commonly used in MOS testing is "Nowadays, a chicken leg is a rare dish.")

This certainly works well, but it is pretty expensive to hire a bunch of people to assign a score to your calls each time you make a tuning adjustment or network configuration change. The good news is that the human behavioral patterns have been heavily researched and quantified. The research describes how humans would most likely react—what MOS they would give—as they hear audio with different levels of delay or packet loss. This mapping between audio performance characteristics and a quality score makes the MOS standard valuable for network assessments, benchmarking, tuning, and monitoring.

The MOS described in ITU P.800 is a subjective measurement of call quality as perceived by the receiver. A MOS can range from 5 down to 1, using the rating scale in Table 3-2.

Table 3-2. Mean Opinion Score Scale

| MOS | Quality Rating |
| --- | --- |
| 5 | Excellent |
| 4 | Good |
| 3 | Fair |
| 2 | Poor |
| 1 | Bad |




A MOS of 4 or higher is generally considered toll quality. A MOS below 3.6 results in many users who are not satisfied with the call quality.

Although MOS is a subjective measurement, considerable progress has been made in establishing objective measurements of call quality. Various standards have been developed:

  • PSQM (ITU P.861)/PSQM+— Perceptual Speech Quality Measure

  • PESQ (ITU P.862)— Perceptual Evaluation of Speech Quality

  • PAMS (British Telecom)— Perceptual Analysis Measurement System

  • The E-Model (ITU G.107)

PSQM, PSQM+, and PESQ are part of a succession of algorithm modifications starting in ITU recommendation P.861.[5] PESQ is the latest algorithm. British Telecom developed PAMS, which is similar to PSQM. Each of these measurements—PSQM, PAMS, and PESQ—sends a reference signal through the telephony network and then uses digital signal processor (DSP) algorithms to compare the reference signal with the signal that is received on the other end of the network. Initially, these objective measurements were used in testing with codecs, but now several voice testing and measurement tools have implemented them as ways of testing VoIP systems. However, MOS is the widely accepted criterion for call quality, and the vendors that implement these scoring algorithms all map their scores to MOS.

All of these measurement methods are good in test labs for analyzing the clarity of individual devices. For example, it makes sense to use PSQM or PESQ to describe the quality of a telephone handset. However, these approaches are not very well suited to assessing call quality on a data network, because they don't know about data networking—they are based on older telephony approaches:

  • The underlying models are not based on data network issues, so they can't map back to the network issues of delay, jitter, and packet loss. Their output does not direct the network staff how to tune the data network.

  • They don't factor in the end-to-end delay between the telephone speaker and listener. Excessive delay adversely affects MOS.

  • They show quality in one direction at a time, rather than the two-way flow used in a real telephone conversation.

  • They don't scale to let you see the effect of multiple, simultaneous calls between a pair of locations.

  • They require invasive hardware probes, which you need to purchase and deploy before beginning VoIP measurements.

To address these shortcomings, ITU recommendation G.107[6] introduced the E-model.[7] The E-model is better suited for use in data-network call-quality assessment because it takes into account impairments specific to data networks. As the E-model was developed, many subjective tests were performed—each time with varying degrees of network impairments. The resulting data was used to obtain a model for an objective calculation. The output of an E-model calculation is a single scalar, called an R-value, derived from delays and equipment impairment factors. Once an R-value is obtained, it can be mapped to an estimated MOS.

Figure 3-5 shows the mapping between R-values and estimated MOS. R-values from the E-model are shown on the X-axis, with MOS values on the Y-axis. The S-curve shows the mapping between R-values and an estimated MOS.

Figure 3-5. Mapping Between R-values and Estimated MOS

From the ITU-T G.107 Recommendation




The E-model makes particular sense for use in a VoIP-readiness assessment of a data network. Assessment tools generate RTP streams to simulate VoIP calls running between software agents in a data network. Each time a simulated VoIP call is run, measurements are collected for the delay, packet loss, and the amount of variability in the arrival time of the datagrams (known as jitter). These measurements capture the network performance metrics that underlie voice quality: how the two people on the two telephones perceive the quality of their conversation.

How exactly does the E-model come up with a MOS, given the data-network statistics? The first step is to calculate an R-value.

Calculating an R-Value

The R-value, the output from the E-model, ranges from 100 down to 0, where 100 is excellent and 0 is poor. The calculation of an R-value starts with the unadulterated signal. With no network and no equipment, quality is perfect. In equation form:

R = R0



But the network and the equipment impair the signal, reducing its quality as it travels from end to end:

R = R0 – Is – Id – Ie + A



where:

  • Is— Simultaneous impairments to the signal.

  • Id— Delays introduced from end to end.

  • Ie— Impairment introduced by the equipment, including packet loss.

  • A— The advantage factor. For example, mobile users may tolerate lower quality because of the convenience. Set to 0 in most models and assessments.

(Source: ITU-T G.107 Recommendation)

The three data-network measurements that are key to call quality have already been mentioned: delay, jitter, and packet loss. In the R-value calculation, these measurements become impairment factors, which are influenced by the implicit delay and impairment of the codec. An E-model calculation considers all of the following factors: network delay, percentage of packet loss, packet loss burstiness, delay introduced by the jitter buffer, data lost due to jitter buffer overruns, and the behavior of the codec. Once the R-value is calculated from these factors, an estimate of the MOS can be directly calculated from the R-value.

Figure 3-6 shows the input to the E-model calculation and resulting output. The E-model calculation takes as its input network statistics. Its output is an R-value, which is straightforwardly converted to a MOS estimate.

Figure 3-6. The E-Model Calculation




The inherent degradation that occurs when converting an actual voice conversation to a network signal and back reduces the theoretical maximum R-value (a value with no impairments) to 93.2, so the highest possible MOS is 4.4. The R-value range from 0 to 93.2 maps to a MOS range of 1.0 to 4.4.
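The chapter does not print the conversion curve itself. As a minimal sketch, the function below assumes the cubic R-to-MOS mapping commonly quoted from ITU-T G.107; the function name `r_to_mos` and the clamping at the low end are illustrative choices, not something this text specifies.

```python
def r_to_mos(r: float) -> float:
    """Estimate a MOS from an E-model R-value.

    Assumes the conversion polynomial commonly quoted from ITU-T G.107.
    """
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    mos = 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7.0e-6
    # The raw polynomial dips slightly below 1.0 for very small R-values,
    # so the result is clamped to the bottom of the MOS scale.
    return max(1.0, mos)


# The theoretical maximum R-value quoted in the text.
print(round(r_to_mos(93.2), 2))  # 4.41
```

Under this assumed mapping, an R-value of 93.2 works out to a MOS of about 4.41, which agrees with the ceiling of 4.4 given above.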

Figure 3-7 shows user satisfaction with different MOS values. R-values from the E-model are shown on the left, with MOS values on the right. The likely opinion of human listeners is shown in the middle.

Figure 3-7. Correspondence of User Satisfaction to MOS Values




Now that you have been introduced to the basics of the E-model calculation, you are ready to look in detail at each of the input components: codecs, delay, jitter, and loss.

Codec Selection

Codecs were introduced in Chapter 1, "VoIP Basics," in the section "VoIP Components." In audio processing, a codec (which stands for compressor/decompressor or coder/decoder) is the hardware or software that samples the sound and determines the data rate. There are dozens of available codecs, each with different characteristics.

The names of codecs correspond to the name of the ITU standard that describes their operation. The codecs named G.711u and G.711a convert from analog to digital and back with high quality and no compression. To do this, however, takes a fair amount of bandwidth. The G.711 codec, also called pulse code modulation (PCM), was designed based on several fundamental signaling characteristics:

  • It uses a frequency range of 4 kHz for voice information. Although the human voice covers a broader range of possible frequencies, this range is broad enough to make human conversation quite intelligible.

  • To capture the proper degree of resolution, the voice information is sampled at double the frequency range, or 8000 times per second. Thus, PCM grabs a chunk of data every 0.125 ms (1 second / 8000 = 0.000125 seconds).

  • Each sample occupies 8 bits of data, so the overall bandwidth required is 8000 * 8, or 64,000 bps.

When G.711 was developed, modern DSP technology was not available. But new compression algorithms make it possible to provide intelligible voice communications with reduced bandwidth consumption.[8]

The lower-speed codecs, G.726-32, G.729, and those in the G.723.1 family, consume less network bandwidth. Low-speed codecs impair the quality of the audio signal much more than high-speed codecs, however, because they compress the signal with lossy compression. Fewer bits are sent, so the receiving side does its best to approximate what the original signal sounded like. The fact that they use less bandwidth is good, because you can run more concurrent calls over the same links, but the compression they use reduces the clarity, introduces delay, and makes the voice quality very sensitive to lost data.

The way that the codec impairs the audio can reduce the R-value significantly. Codec impairments are added directly into the Ie portion of the R-value equation. For example, using the G.723.1a codec causes 19 points to be subtracted directly from the 93.2 points available in the theoretical maximum R-value.
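The subtraction can be sketched in a few lines. This is a simplification: it folds R0 and Is into a single starting value of 93.2 (the theoretical maximum quoted in this chapter) and treats the delay and equipment impairments as numbers supplied by the caller. The function and parameter names are hypothetical.

```python
R_MAX = 93.2  # theoretical maximum R-value quoted in this chapter


def r_value(delay_impairment: float = 0.0,
            equipment_impairment: float = 0.0,
            advantage: float = 0.0,
            base: float = R_MAX) -> float:
    """R = R0 - Is - Id - Ie + A, with (R0 - Is) folded into `base`."""
    r = base - delay_impairment - equipment_impairment + advantage
    # Keep the result on the 0-to-100 R-value scale.
    return max(0.0, min(100.0, r))


# The text's example: a codec impairment of 19 points, no other impairments.
print(round(r_value(equipment_impairment=19.0), 1))  # 74.2
```

Delay and packet loss would lower the result further through the `delay_impairment` and `equipment_impairment` arguments.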

Table 3-3 lists some of the most commonly used VoIP codecs and their default values. The Packetization Delay column refers to the delay a codec introduces as it converts a signal from analog to digital. Packetization delay is included in the MOS estimate, as is the jitter buffer delay, the delay introduced by the effects of buffering to reduce interarrival delay variations.

Table 3-3. Default Attributes for Six Common Codecs

| Codec | Data Rate | Typical Datagram Size | Packetization Delay | Bandwidth Required | Typical Jitter Buffer Delay | Theoretical Maximum MOS |
| --- | --- | --- | --- | --- | --- | --- |
| G.711u | 64.0 kbps | 20 ms | 1.0 ms | 87.2 kbps | 2 datagrams (40 ms) | 4.41 |
| G.711a | 64.0 kbps | 20 ms | 1.0 ms | 87.2 kbps | 2 datagrams (40 ms) | 4.41 |
| G.726-32 | 32.0 kbps | 20 ms | 1.0 ms | 55.2 kbps | 2 datagrams (40 ms) | 4.22 |
| G.729 | 8.0 kbps | 20 ms | 25.0 ms | 31.2 kbps | 2 datagrams (40 ms) | 4.07 |
| G.723.1 MPMLQ | 6.3 kbps | 30 ms | 67.5 ms | 21.9 kbps | 2 datagrams (60 ms) | 3.87 |
| G.723.1 ACELP | 5.3 kbps | 30 ms | 67.5 ms | 20.8 kbps | 2 datagrams (60 ms) | 3.69 |




The Bandwidth Required column shows that the real bandwidth consumption by VoIP calls is actually higher than it first appears. The G.729 codec, for example, has a data payload rate of 8 kbps, but its actual bandwidth usage is higher than this; when sent at 20-ms intervals, the payload size is 20 bytes per datagram. To this add the 40 bytes of IP, UDP, and RTP headers (yes, the headers are bigger than the payload) and any additional Layer 2 headers. For example, Ethernet adds 18 more bytes, for a total of 78 bytes every 20 ms, or 31.2 kbps.
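The arithmetic behind the Bandwidth Required column can be reproduced with a short calculation. The sketch below assumes IPv4, an uncompressed 40-byte IP/UDP/RTP header stack, and 18 bytes of Ethernet overhead per datagram; the function name is illustrative.

```python
import math

IP_UDP_RTP_HEADER_BYTES = 40  # 20 (IPv4) + 8 (UDP) + 12 (RTP)
ETHERNET_OVERHEAD_BYTES = 18  # Ethernet header plus frame check sequence


def bandwidth_kbps(codec_rate_bps: int, interval_ms: int,
                   l2_overhead_bytes: int = ETHERNET_OVERHEAD_BYTES) -> float:
    """Per-direction bandwidth of one call, headers included."""
    # Voice payload carried by each datagram, rounded up to whole bytes.
    payload_bytes = math.ceil(codec_rate_bps * interval_ms / 8000)
    frame_bytes = payload_bytes + IP_UDP_RTP_HEADER_BYTES + l2_overhead_bytes
    datagrams_per_second = 1000 / interval_ms
    return frame_bytes * 8 * datagrams_per_second / 1000


for name, rate_bps, interval_ms in [
    ("G.711", 64000, 20),
    ("G.726-32", 32000, 20),
    ("G.729", 8000, 20),
    ("G.723.1 MPMLQ", 6300, 30),
    ("G.723.1 ACELP", 5300, 30),
]:
    print(f"{name}: {bandwidth_kbps(rate_bps, interval_ms):.1f} kbps")
```

With these assumptions the results match Table 3-3 (87.2, 55.2, 31.2, 21.9, and 20.8 kbps). A WAN link with a different Layer 2 encapsulation would need a different `l2_overhead_bytes` value.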

It is worth observing in the table that both G.723.1 codecs result in calls of only Acceptable quality at best. Their theoretical maximum MOS is below the 4.0 value needed to be considered Good.

Packet loss concealment (PLC) is an additional option if you are using the G.711u or G.711a codecs. PLC techniques reduce or mask the effects of data loss during a VoIP conversation. When PLC is enabled, it is assumed that the quality of the conversation will be improved; this improvement is factored into the MOS estimate calculation if any data is lost. PLC makes the codec itself more expensive to manufacture, but does not otherwise add delay or have other bad side effects.

Delay

The time it takes a conversation to travel from the speaker to the listener is the end-to-end delay, or latency. Latency introduces into a conversation blank spaces that are annoying at best. At worst, they can even cause the listener to misunderstand you, because so much of the meaning in speech is carried nonverbally, by such things as inflection and tone and pauses in the conversation.

End-to-end delay is actually made up of four components:

  • Propagation delay— The time to travel across the network from end to end. It is based on the speed of light and the distance the signal must travel. For example, the propagation delay between Singapore and Boston is much longer than the propagation delay between New York and Boston.

  • Transport delay— The time to get through the network devices along the path. Networks with many firewalls, many routers, congestion, or slow WANs introduce more delay than an overprovisioned LAN on one floor of a building.

  • Packetization delay— The time for the codec to digitize the analog signal and build frames—and undo it at the other end. The G.729 codec has a higher packetization delay than the G.711 codecs because it takes longer to compress and decompress the signal.

  • Jitter buffer delay— The delay introduced by the receiver as it holds one or more datagrams to reduce variations in arrival times.

The combined value of propagation delay and transport delay is typically termed "network delay" or "one-way delay." The packetization delay is a fixed value and depends on the codec being used. Dynamic jitter buffers add varying amounts of delay, depending on network conditions and queuing. Likewise, you can readily experience transport delay as a result of network traffic congestion, particularly if you have deep queues.

Many VoIP engineers don't know how much latency is too much. A simple answer is 150 ms. The ITU has conducted studies on the impact that delay has on quality. These studies are published as ITU Recommendation G.114.[9] Delays greater than 150 ms cause a conversation to become uncomfortable. This level of delay is usually the point at which both parties begin to speak at the same time and can't recover gracefully—by the time they realize the other party is also talking, they are too far into their own words.
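A delay budget is simply the sum of the four components compared against the 150 ms guideline. The sketch below uses the G.729 packetization and jitter buffer values from Table 3-3; the propagation and transport figures are made-up inputs for illustration, and the class name is hypothetical.

```python
from dataclasses import dataclass

ONE_WAY_LIMIT_MS = 150.0  # guideline the text cites from ITU G.114


@dataclass
class DelayBudget:
    propagation_ms: float
    transport_ms: float
    packetization_ms: float
    jitter_buffer_ms: float

    @property
    def network_ms(self) -> float:
        # Propagation plus transport: the "network delay" or "one-way delay".
        return self.propagation_ms + self.transport_ms

    @property
    def end_to_end_ms(self) -> float:
        return self.network_ms + self.packetization_ms + self.jitter_buffer_ms

    def within_limit(self, limit_ms: float = ONE_WAY_LIMIT_MS) -> bool:
        return self.end_to_end_ms <= limit_ms


# G.729 defaults from Table 3-3 with two hypothetical network paths.
regional = DelayBudget(30.0, 20.0, 25.0, 40.0)
intercontinental = DelayBudget(80.0, 30.0, 25.0, 40.0)
print(regional.end_to_end_ms, regional.within_limit())                  # 115.0 True
print(intercontinental.end_to_end_ms, intercontinental.within_limit())  # 175.0 False
```

Note that the codec and jitter buffer alone consume 65 ms of the budget here, leaving 85 ms for the network.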

The end-to-end delay affects the MOS for each codec differently. Codecs that use little or no compression, such as G.711, can tolerate larger delays before the MOS begins to degrade.

Figure 3-8 shows the effect of end-to-end delay on MOS. If there is no jitter and no packet loss, the MOS is influenced only by the end-to-end delay and choice of VoIP codec. This graph shows the effect on the MOS of just end-to-end delay for four example codecs.

Figure 3-8. Four Example Codecs




One-way delay is measured in various ways. One simple approach measures response time (round-trip delay) and divides the resulting value by two. This is not always a good approximation of one-way delay. The round-trip response time hides assumptions about the symmetry of the paths between two locations. In fact, the two RTP streams in a VoIP call can take different paths through an IP network.

Figure 3-9 shows one-way delay measurements for a bidirectional call. There's quite a difference between the one-way delay values in the two directions of this conversation. At about 130 ms, the one-way delay slightly affects the MOS.

Figure 3-9. One-Way Delay Values in Two Directions of a Conversation




The most accurate approach to measuring one-way delay is to synchronize the clocks of the sender and receiver. However, synchronizing clocks in a network is a nontrivial undertaking. Recommended methods of clock synchronization, such as the Global Positioning System (GPS) and other high-resolution protocols, have an accuracy of about ±1 ms; contrast this with the Network Time Protocol (NTP),[10] which is accurate to about ±200 ms—not good enough for MOS calculations. After the clocks are synchronized, the one-way delay measurement for each RTP datagram is calculated as follows:

One-way delay = Receiver time stamp – Sender time stamp



Jitter

Jitter, also called delay variation, indicates the differences in arrival times among all datagrams sent during a VoIP call. When a datagram is sent, the sender gives it a time stamp, which is placed in the RTP header. When the datagram is received, the receiver generates another time stamp. These two time stamps are used to calculate the packet's transit time. If the transit times for datagrams within the same call are different, the call contains jitter. In a video application, jitter manifests itself as a flickering image, whereas in a telephone call its effect may be similar to the effect of lost data: Some words may be missing or garbled.

The amount of jitter in a call depends on the degree of difference between the datagrams' transit times. If the transit time for all datagrams is the same (no matter how long it took for the datagrams to arrive), the call contains no jitter. If the transit times differ slightly, the call contains some jitter. As jitter values exceed 40 ms, the MOS declines, indicating poor call quality. Jitter provides a short-term measurement of network congestion and can show the effects of queuing within the network.
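The text does not say how the differences in transit time are reduced to a single jitter figure. One widely used estimator is the running interarrival jitter defined in RFC 3550 (the RTP specification), sketched below as an assumption rather than as the method any particular assessment tool uses. Time stamps are in milliseconds and the sample values are invented.

```python
def transit_times(send_ms, recv_ms):
    """Transit time of each datagram: receiver time stamp minus sender time stamp."""
    return [r - s for s, r in zip(send_ms, recv_ms)]


def interarrival_jitter(transits):
    """Running jitter estimate in the style of RFC 3550.

    For each consecutive pair of datagrams, the change in transit time
    moves the estimate by 1/16 of the difference.
    """
    jitter = 0.0
    for previous, current in zip(transits, transits[1:]):
        delta = abs(current - previous)
        jitter += (delta - jitter) / 16.0
    return jitter


send = [0.0, 20.0, 40.0, 60.0]
steady = [50.0, 70.0, 90.0, 110.0]   # every datagram takes 50 ms
uneven = [50.0, 86.0, 90.0, 126.0]   # transit times of 50, 66, 50, 66 ms

print(interarrival_jitter(transit_times(send, steady)))  # 0.0
print(interarrival_jitter(transit_times(send, uneven)) > 0.0)  # True
```

Because only differences between transit times are used, a constant offset between the sender's and receiver's clocks cancels out of this calculation.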

IP phones send voice datagrams at a constant rate based on the codec's default speech frame size. The speech frame size is the amount of time that the codec takes to build a datagram with voice data for transmission. For example, G.711 typically has a default speech frame size of 20 ms. Every 20 ms, the G.711 codec outputs a datagram for transmission.

The receiving side is expecting to receive datagrams at a constant rate—in the preceding example, every 20 ms. To lessen the impact of jitter, VoIP phones usually have a jitter buffer. The jitter buffer can usually hold one or two datagrams at a time and may adjust itself dynamically based on the perceived jitter. As datagrams arrive, they are placed in the jitter buffer, which holds them long enough to supply them to the codec at a more constant rate. If a datagram arrives too early or too late, it may not fit in the jitter buffer and is discarded. You would like to make the jitter buffer just large enough to handle any variation due to the data network. However, for every millisecond that you increase the jitter buffer, you add a millisecond of delay.

The datagrams that are discarded because they do not fit in the jitter buffer come across as lost data to the listener. As you will see next, lost data has a noticeable impact on call quality.
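The trade-off between buffer size and discarded datagrams can be illustrated with a deliberately simple model. The sketch below assumes a fixed-size buffer that starts playout a set time after the first datagram arrives, and it models only late arrivals (not early arrivals that overflow the buffer). The transit times are invented.

```python
def late_discards(transit_ms, frame_ms=20.0, buffer_ms=40.0):
    """Indexes of datagrams that arrive after their scheduled playout time.

    Datagram i is sent at i * frame_ms. Playout of datagram 0 starts
    buffer_ms after it arrives, and each later datagram is played
    frame_ms after the one before it.
    """
    if not transit_ms:
        return []
    playout_start = transit_ms[0] + buffer_ms
    discarded = []
    for i, transit in enumerate(transit_ms):
        arrival = i * frame_ms + transit
        playout = playout_start + i * frame_ms
        if arrival > playout:
            discarded.append(i)
    return discarded


transits = [50.0, 55.0, 95.0, 60.0, 120.0, 52.0]
print(late_discards(transits, buffer_ms=40.0))  # [2, 4]
print(late_discards(transits, buffer_ms=80.0))  # []
```

Doubling the buffer from 40 ms to 80 ms rescues both late datagrams in this example, at the cost of 40 ms of additional end-to-end delay on every datagram.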

Lost Data

VoIP datagrams are sent using RTP. Although every RTP datagram contains a sequence number to help applications detect data loss and datagrams received out of order, there is not enough time to retransmit lost or out-of-order datagrams.

Any lost datagram impairs the quality of the audio signal, because when a datagram is lost during a VoIP transmission, you can lose an entire syllable or word in a conversation. Obviously, data loss can severely impair call quality. Data loss is thus a key call-quality impairment factor in calculating the MOS.

To measure data loss, each side keeps track of how many bytes of data it sent. The sender tells the receiver how many bytes it sent, and the receiver compares that value to the number of bytes it received to determine lost data.

A few different profiles describe datagram loss. The simplest describes a more-or-less random loss. That occurs when there is general, consistent congestion in the network, so one or two datagrams are lost occasionally. But it is bursts of loss that degrade quality most significantly. A burst is generally considered to be more than one consecutive lost datagram. Human listeners don't readily notice lower quality if loss is randomly distributed, with just a few datagrams at a time dropped. This type of loss pattern has some effect, but the quality decline mostly stems from a combination of loss and delay. Bursts of loss, however, can have a devastating effect, and are weighted heavily in the E-model calculation.
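The distinction between random and bursty loss can be made concrete by grouping missing RTP sequence numbers into runs. The sketch below works from sequence numbers rather than byte counts, ignores sequence-number wraparound, and uses the text's definition of a burst as more than one consecutive lost datagram. The function name and sample data are illustrative.

```python
def loss_profile(received_seq, first_seq, last_seq):
    """Summarize loss for datagrams numbered first_seq..last_seq inclusive."""
    received = set(received_seq)
    lost = [s for s in range(first_seq, last_seq + 1) if s not in received]

    # Group consecutive missing sequence numbers into runs.
    runs = []
    for seq in lost:
        if runs and seq == runs[-1][-1] + 1:
            runs[-1].append(seq)
        else:
            runs.append([seq])

    total_sent = last_seq - first_seq + 1
    return {
        "loss_percent": 100.0 * len(lost) / total_sent,
        "isolated_losses": sum(1 for run in runs if len(run) == 1),
        "burst_lengths": [len(run) for run in runs if len(run) > 1],
    }


# 20 datagrams sent; numbers 4, 9, 10, 11, and 17 never arrive.
received = [1, 2, 3, 5, 6, 7, 8, 12, 13, 14, 15, 16, 18, 19, 20]
print(loss_profile(received, 1, 20))
# {'loss_percent': 25.0, 'isolated_losses': 2, 'burst_lengths': [3]}
```

Two calls with the same loss percentage can produce very different `burst_lengths`, which is why a percentage alone does not predict the quality impact.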

Take, for example, the following comparison charts in Figures 3-10 and 3-11.

Figure 3-10. Effect on MOS of 5 Percent Randomized Packet Loss on Four Codecs, as Delay Increases




Figure 3-11. Effect on MOS of 5 Percent Bursty Packet Loss on Four Codecs, as Delay Increases




At 5 percent random packet loss, the MOS starts at around 4 for the G.711 codec with PLC and declines as the delay increases. Contrast this with 5 percent bursty packet loss in Figure 3-11, and you see that the MOS starts at around 3.5 for the same codec. The effect of bursty packet loss is even greater on the other codecs with high compression. For example, G.729 starts with a MOS of around 3.4 for 5 percent random packet loss. However, with 5 percent bursty packet loss, G.729 drops to a MOS below 2.

Two primary reasons explain why RTP datagrams might be lost in a data network:

  • There is too much traffic, so datagrams are discarded when there's congestion.

  • There is too much delay variation (jitter), so datagrams are discarded because they arrive at the listener's jitter buffer too late or too early.

An assessment of a network's readiness to handle VoIP with high call quality should include statistics on lost datagrams, expressed as a percentage of all data sent in the relevant calls. For example, lost data is generally expressed as a percentage of all data sent between a pair of agents over the course of the entire assessment. Other charts might show data loss as a percentage of data sent at a certain time of day, averaged over the course of all days in the assessment.

Good call quality is essential to the success of a VoIP deployment—especially when you recall that your users are accustomed to toll-quality calls. The concepts, tips, and trade-offs that follow should allow you to establish good call quality in your VoIP network.

VoIP-Readiness Assessment

We started this chapter by asking the question, "Where are you currently?" Answering it tells you how close you are to being ready for a successful VoIP deployment. After analyzing your current telephony usage, reliability, and call quality, the final piece of information needed to answer this question is in the form of a VoIP-readiness assessment.

You are probably uncertain whether your existing data network is ready to carry high-quality voice transmissions. A VoIP-readiness assessment should systematically analyze data-network configuration, monitor utilization of key components, and assess call quality by generating traffic loads that imitate a VoIP system's traffic across the network. Such measurements provide information that can't be gleaned from a pilot implementation that simply uses an IP PBX and a few dozen IP phones.[11] A VoIP-readiness assessment is designed to do the following:

  • Evaluate VoIP call quality over several days, running hundreds or even thousands of simulated calls over the network and taking measurements

  • Determine whether an existing data network is ready to deliver quality VoIP calls in its current configuration

A VoIP-readiness assessment should comprise several approaches to network readiness, which is why this section begins with a discussion of network configuration.

Configuration Assessment

A configuration assessment looks at the current state of your network equipment to see if it is ready for VoIP. An estimate is made of what equipment needs to be upgraded to continue with a VoIP deployment. For example, quality of service (QoS) is a requirement for a VoIP deployment. Do your switches and routers support QoS mechanisms? If not, does the software or hardware need to be upgraded? These recommended upgrades should be aimed at increased functionality, capacity, reliability, and call quality.

Start by taking an inventory of your network equipment. Software tools can discover the network devices using the Simple Network Management Protocol (SNMP). SNMP agents running on network devices provide management information in standardized and proprietary formats called Management Information Base (MIB) objects. Network discovery tools can collect configuration information from the MIB objects on IP routers and switches. Some of the tools, such as Microsoft Visio, can also help you draw a diagram of the network.

Figure 3-12 shows network discovery using SNMP.

Figure 3-12. Network Discovery Tools Gather Router and Switch Information Using SNMP Requests and Responses




Having collected the device information, look at how the configuration matches the specifications recommended by the VoIP vendor. Does the current configuration meet the criteria needed to support VoIP? The following parameters should be included in a configuration assessment:

  • Operating system— What version of operating system is running on the routers, switches, firewalls, and other devices? Is it a version that can support VoIP traffic? Does it have the proper functionality to support VoIP?

  • Memory— How much memory (RAM) is installed in the network devices? Is there enough memory to support VoIP functions well? Is there enough memory to support the number of calls that will be added to the network? Additional Flash memory may be needed if an operating system upgrade is required.

  • QoS— Most vendors recommend some QoS mechanism. Do the network devices support those QoS mechanisms? Is QoS already configured on the IP routers? What QoS mechanism is in use? How is VoIP traffic to be prioritized?

  • VLANs— A virtual LAN (VLAN) is used to group or segregate LAN traffic by users. VLANs enable different data classes to be prioritized by the switches using the 802.1p standard. Do the switches support VLANs and 802.1p? Do the switches have VLANs already configured?

  • Shared LAN hubs— Shared hubs offer no QoS guarantees. Any device attached to the hub, even an IP phone, can end up competing with any other attached devices for bandwidth. Consider upgrading all shared hubs in the network to switches.

  • Interface speed— The interfaces in the routers operate at various speeds. Are the interfaces 56 kbps, 1.544 Mbps, 10 Mbps, 100 Mbps, or 1000 Mbps (gigabit)? For Ethernet interfaces, do they support full-duplex mode of operation? Do the interface speeds support the number of VoIP calls that will be added to the network?

  • Power to the phone— If you are about to upgrade your switches, ask your vendor if the specific platform supports providing power to IP telephone handsets via high-speed Ethernet (Cat 5) cable.

After you have analyzed the configuration of key network components, it is time to look at how they are currently being used.

Utilization Assessment

In addition to the configuration information, you should also collect utilization statistics for the network devices and links. Once you have discovered the hardware devices and links, monitor them for a period of time—a reasonable start is to monitor for 24 hours a day, for 7 days. Collect enough data to see whether there are any problematic time periods—certain days or certain hours within a day when utilization is high. What you want to see is whether the devices have sufficient capacity to support VoIP well. If they are already operating near 100 percent capacity, adding VoIP traffic is not a good idea. Consider monitoring these metrics:

  • CPU utilization— A device's processor utilization is a good indication of its workload. If the CPU utilization is consistently high, a processor upgrade may be in order. Look at the average and the peak CPU utilization. The average CPU utilization may be low, but the peaks during busy times may indicate problems when VoIP traffic is added.

  • Memory utilization— To reduce jitter in a network, there should be plenty of available buffers. If buffers are highly utilized, there may be more delay associated with buffering packets and thus jitter can increase. Look at the average and peak buffer utilization.

  • Backplane utilization— A key utilization metric for switches. Provides a good indication of how much network traffic is flowing through the switch.

  • Dropped packets— When congestion occurs at a bottleneck, packets get dropped. Dropped packets are detrimental to VoIP call quality, so a high number for this statistic indicates frequent or prolonged congestion. This statistic may be correlated to high CPU utilization numbers.

  • Buffer errors— Failures that occur when allocating router memory buffers result in discarded or delayed packets. If you are seeing these types of failures, then it is a good indication that the device needs more physical memory.

  • Interface errors— Errors such as cyclic redundancy check (CRC) errors can indicate a physical media problem, such as a bad Ethernet cable. These types of errors will result in discarded packets.

  • Bandwidth utilization— What percentage of your bandwidth is already being used? A sure way to achieve excellent voice quality is to be generously overprovisioned. The bandwidth utilization should give a good indication of capacity available for VoIP. Pay close attention to bandwidth utilization on WAN links that will carry VoIP traffic. These links typically have less capacity and are usually highly utilized even before VoIP traffic is added.

When analyzing the utilization of the network components, be sure to look at average and peak values.
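As a rough illustration of why the peak matters, the sketch below summarizes a set of utilization samples and estimates how many additional calls would fit on a link at the busiest observed time. The 75 percent planning ceiling, the sample values, and the function names are all assumptions made for the example; the per-call figure is the G.729 Ethernet value from Table 3-3, and a WAN encapsulation would differ.

```python
def utilization_summary(samples_percent):
    """Average and peak of a series of utilization samples (in percent)."""
    return {
        "average": sum(samples_percent) / len(samples_percent),
        "peak": max(samples_percent),
    }


def call_headroom(link_kbps, utilization_percent, per_call_kbps,
                  ceiling_percent=75.0):
    """Additional calls (per direction) that fit below the planning ceiling."""
    spare_kbps = link_kbps * (ceiling_percent - utilization_percent) / 100.0
    return max(0, int(spare_kbps // per_call_kbps))


samples = [20.0, 35.0, 60.0, 25.0]  # hypothetical hourly readings
summary = utilization_summary(samples)
print(summary)  # {'average': 35.0, 'peak': 60.0}

# A 1.544-Mbps link carrying G.729 calls at 31.2 kbps each.
print(call_headroom(1544, summary["average"], 31.2))  # 19
print(call_headroom(1544, summary["peak"], 31.2))     # 7
```

Sizing from the average would suggest room for 19 calls, while the peak leaves room for only 7.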

After you have collected the configuration and utilization information, you have some good indications of problem areas that need to be addressed or areas where VoIP should perform well. Combine these statistics with your current telephony usage statistics that were discussed previously, and plan ahead for potential problem areas before the VoIP traffic is added.

Call-Quality Assessment

The call-quality portion of a VoIP-readiness assessment determines how well VoIP will sound on a network by assessing the quality of simulated VoIP calls. To assess call quality, realistic VoIP traffic is sent across the network and the resulting flows are measured. Measurements for delay, jitter, and packet loss are collected and input to the E-model to obtain a MOS.

There are several characteristics of the simulated VoIP traffic to consider before running a call-quality assessment:

  • The codecs that are used, along with their compression algorithms and data rates.

  • Whether PLC is enabled for G.711 codecs.

  • Voice datagram sizes.

  • The ability to use silence suppression. Silence suppression can be used by some IP phones to reduce the amount of bandwidth consumed. With silence suppression, if no one is talking, the phone sends much smaller packets.

  • Jitter buffers and their sizes.

  • QoS.

You can use preconfigured defaults for system parameters, or you can tune them to see how various technical choices affect call quality and bandwidth consumption. For example, you can examine the effects of a half-dozen codecs representing various compression algorithms; you also can tinker with jitter buffers, datagram size, and silence suppression.

Call-quality testing simulates VoIP traffic between preselected points on a network for a chosen period of time. While the simulated calls are running, measurements are taken and call-quality scores are calculated. Reports quantify what is collected over the course of an assessment to ascertain a network's readiness and capacity for handling real VoIP traffic.

Assessment software measures delay, jitter, and lost data, and produces a report showing call quality by day of week, location, network cause, and so on. You end up being able to tell what technical factors affect call quality. What is wonderful is that you can get all of these answers before you have spent a lot of money, time, and energy on actually deploying VoIP equipment. You can work through all the data-network issues so that by the time you actually start running the real VoIP piece of it, you have a data network that is going to work well. You also can make cost-effective decisions about network infrastructure and application traffic after you know how VoIP is performing.

When running a call-quality assessment, try to model the expected VoIP traffic. For example, we have eight major sites in NetIQ. From our development site in Raleigh, we rarely call our sales offices in Japan and Europe. It makes sense to set up just one simulated call from Raleigh to Japan and one from Raleigh to Europe. An assessment generates several calls an hour, although we probably make fewer than one call a day between these sites. Part of an assessment is to make sure that you can get a connection and make a toll-quality call any time you want, so testing throughout the day is fine. We call infrequently from Raleigh to Portland, so we would probably define two calls between those sites. Finally, we make many calls to our Houston and San Jose sites, so we would define 10 simultaneous calls.

A call-quality assessment is not a stress test; remember, you are running simulated traffic on a production network. Test with an approximation of the average call volume during work hours, as opposed to the peak call volume. There is a nice "weakest component mode," though, that is easily observed. If the data network is already heavily loaded with existing application traffic and you then add VoIP traffic, it is the VoIP traffic that breaks "first"—it shows high delay, jitter, packet loss, or some combination. Points of weakness are readily seen during preliminary test runs. If initial runs show a high MOS, the additional VoIP traffic will probably have no adverse effect on the other application traffic. However, if initial runs quickly show a low MOS, you may or may not be affecting the other traffic—but you know immediately that the network resources are stretched too thin.

Predicting call quality before investing in VoIP equipment is a valuable step in the VoIP-readiness assessment. The call-quality assessment can be difficult without the proper tools. John Walker cowrote a white paper[12] that details how to do a call-quality assessment with one such tool, NetIQ's Vivinet Assessor.[13]

Bandwidth Modeling

In previous phases of the planning process, you have collected current telephony usage statistics and hardware configuration information. Now it is time to use some of that information for modeling purposes. The goal of modeling is to look at existing telephony usage and existing data-network utilization and try to determine if the current network infrastructure can support the future VoIP traffic. This is the time to ask all the "what if" questions concerning call volume and link capacity. The simplest case for modeling uses the projected call volumes, codec selections, and bandwidth requirements. Calculate the bandwidth required by the new VoIP traffic and see whether its additional bandwidth requirements overload the network. Modeling can require a lot of math to calculate different values for different input variables. You may need to redo the calculations over and over, changing a different variable each time.

Modeling is often done for critical network links. As you are looking at initial VoIP deployments, ask the "what if" questions before the voice traffic is placed on the network. Take a look at the different links that need to support VoIP traffic. Then, take the following parameters as input for a model of that traffic:

  • Codec— What codecs are used for the calls? As discussed earlier, different codecs have different bandwidth requirements. For example, G.711 requires 64 kbps (without protocol header overhead), but G.729 requires only 8 kbps. Selecting a codec with a lower bandwidth may allow more calls, but the resulting MOS will be lower.

  • Number of calls— What is the number of simultaneous voice calls that must be supported? This number may be expressed in erlangs, which measure the hours of call traffic that occur during the hour of peak call volume.

  • Current bandwidth utilization— What is the current bandwidth utilization? This is usually expressed as a percentage of the total bandwidth available.

  • Bandwidth capacity— What is the maximum bandwidth capacity for the link? This is usually expressed in kilobits per second or megabits per second.
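
The four inputs above are enough for a first-pass model. The sketch below is one hypothetical way to combine them; the function names and the numbers in the usage lines are illustrative assumptions, not values from this chapter. It also assumes a target utilization ceiling expressed as a fraction of link capacity.

```python
def total_utilization(capacity_kbps: float,
                      current_utilization: float,
                      per_call_kbps: float,
                      calls: int) -> float:
    """Return link utilization (as a fraction) after adding VoIP calls.

    capacity_kbps       -- bandwidth capacity of the link
    current_utilization -- existing load as a fraction of capacity (0.0 to 1.0)
    per_call_kbps       -- one-way bandwidth per call for the chosen codec
    calls               -- number of simultaneous calls to model
    """
    existing_kbps = capacity_kbps * current_utilization
    voip_kbps = calls * per_call_kbps
    return (existing_kbps + voip_kbps) / capacity_kbps


def fits(capacity_kbps: float,
         current_utilization: float,
         per_call_kbps: float,
         calls: int,
         target_utilization: float) -> bool:
    """True if the total load stays at or below the target ceiling."""
    total = total_utilization(capacity_kbps, current_utilization,
                              per_call_kbps, calls)
    return total <= target_utilization


if __name__ == "__main__":
    # Hypothetical link: 2048 kbps, 30% busy, 30 kbps per call, 80% ceiling.
    for calls in (10, 40):
        ok = fits(2048, 0.30, 30.0, calls, 0.80)
        print(f"{calls} calls: {'fits' if ok else 'exceeds target'}")
```

Rerunning the function with one input changed at a time is the programmatic form of asking a "what if" question.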

A Modeling Example

Suppose that there is a T1 link between the NetIQ offices in Raleigh and Houston. Based on hardware analysis and assessment, you know that the link has a speed of 1.544 Mbps, average utilization of 35 percent, and a peak utilization of 75 percent (occurring at various times throughout the week). To handle peak usage scenarios you need to keep the total utilization, including the additional VoIP traffic, to 75 percent of the link capacity. Because the T1 link is full duplex (you can send and receive data at the same time), you will use the single-direction call bandwidth for the calculations.

The call volume data says that you need to support, on average, 20 simultaneous calls between Raleigh and Houston. Your first choice is to use G.711 as the codec because it has the highest theoretical maximum MOS.

What if you add 20 G.711 calls to this link?

20 calls * 87.2 kbps per call = 1744 kbps = 1.744 Mbps



The resulting call volume for 20 G.711 calls alone would exceed the capacity for the link. So try changing the codec.
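
The 87.2 kbps per-call figure can be reproduced from the packet layout. This is a sketch under stated assumptions: 160 bytes of G.711 payload per packet (20 ms of audio at 64 kbps) and 50 packets per second, plus the 40-byte RTP/UDP/IP header and 18-byte Layer 2 header that this section uses for its header-compression arithmetic. The function name `per_call_kbps` is hypothetical.

```python
def per_call_kbps(payload_bytes: int,
                  ip_udp_rtp_header_bytes: int = 40,
                  layer2_header_bytes: int = 18,
                  packets_per_second: int = 50) -> float:
    """One-way bandwidth of a single call, in kbps, including headers."""
    bytes_per_packet = (payload_bytes
                        + ip_udp_rtp_header_bytes
                        + layer2_header_bytes)
    return bytes_per_packet * 8 * packets_per_second / 1000


if __name__ == "__main__":
    t1_kbps = 1544
    g711 = per_call_kbps(160)   # assumed 160-byte payload per packet
    total = 20 * g711
    print(f"G.711 per call: {g711:.1f} kbps")
    print(f"20 calls: {total:.0f} kbps against a {t1_kbps} kbps link")
    print("Exceeds link capacity" if total > t1_kbps else "Within capacity")
```

Passing a different payload size models a different codec with the same header overhead.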

What if you switch to a G.729 codec?

20 calls * 31.2 kbps per call = 624 kbps



The link has, on average, bandwidth usage of 1.544 Mbps * 35% = 540 kbps. This leaves 1544 kbps – 540 kbps = 1004 kbps of bandwidth, on average, available for VoIP calls. So the 20 simultaneous calls using the G.729 codec could be sustained over the link. However, you are right at, or in some cases above, your target utilization goal, which is 75 percent of the link capacity. In addition, during peak utilization periods, only 1.544 Mbps * 25% = 386 kbps would be available. Thus, some of the calls would be dropped or would suffer reduced call quality. So on average, this link might be okay, but when it comes to call quality, you want to be better than average.
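
The average and peak cases can be checked side by side. The sketch below uses only figures from this example (a 1544 kbps link, 20 calls at 31.2 kbps, 35 and 75 percent existing utilization, and a 75 percent target); the function and variable names are hypothetical.

```python
LINK_KBPS = 1544          # T1 capacity
TARGET = 0.75             # target ceiling for total utilization
VOIP_KBPS = 20 * 31.2     # 20 simultaneous G.729 calls


def utilization_with_voip(existing_fraction: float) -> float:
    """Total link utilization (as a fraction) once the VoIP load is added."""
    return (LINK_KBPS * existing_fraction + VOIP_KBPS) / LINK_KBPS


if __name__ == "__main__":
    for label, existing in (("average", 0.35), ("peak", 0.75)):
        total = utilization_with_voip(existing)
        status = "within target" if total <= TARGET else "over target"
        print(f"{label}: {total:.1%} of link capacity ({status})")
```

At average load the total lands just above the 75 percent target, and at peak load it exceeds the capacity of the link outright, which is why some calls would be dropped or degraded.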

Figure 3-13 shows a graphical example of the link utilization using this scenario. The bandwidth utilization after the VoIP calls are added to the link is dangerously close to, or in some cases above, the target utilization.

Figure 3-13. Link Bandwidth Utilization Before and After VoIP Traffic Is Added




What if you enable RTP header compression on this link? IP routers support RTP header compression in an effort to reduce the bandwidth required by VoIP traffic. The IP, UDP, and RTP headers are compressed from 40 bytes to between 2 and 4 bytes. Although RTP header compression can dramatically reduce the bandwidth requirements for some codecs, you need to be careful because there is a trade-off. The compression and decompression that occur on each end of the link can add delay and place a heavy load on the router CPU, which may show up as reduced call quality.

For bandwidth modeling purposes, assume that there is no delay problem on the T1 link, but you know that there is a capacity problem with the additional VoIP call traffic. Assume that you enable RTP header compression on both ends of the link and the header compresses from 40 to 4 bytes. Now the bandwidth requirements for each G.729 call are as follows:

(20 bytes G.729 payload + 4 bytes RTP/UDP/IP header + 18 bytes Layer 2 header) * 8 bits/byte * 50 packets/second = 16.8 kbps per call



20 calls * 16.8 kbps per call = 336 kbps



Now, call traffic is consuming 336 kbps, which fits in the 386 kbps that you have available in the worst case during peak utilization.
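
To see how much room header compression buys, the same arithmetic can be run for each header size and turned into a call count. This is a rough sketch using the packet sizes from this example (20-byte G.729 payload, 18-byte Layer 2 header, 50 packets per second) and the 386 kbps available at peak; the function names are hypothetical, and the delay and CPU costs of compression are not modeled.

```python
import math

AVAILABLE_KBPS = 1544 * 0.25   # bandwidth left at peak utilization (386 kbps)


def g729_call_kbps(header_bytes: int) -> float:
    """One-way G.729 call bandwidth for a given RTP/UDP/IP header size."""
    payload_bytes = 20
    layer2_bytes = 18
    packets_per_second = 50
    packet_bytes = payload_bytes + header_bytes + layer2_bytes
    return packet_bytes * 8 * packets_per_second / 1000


def max_calls(header_bytes: int) -> int:
    """Whole number of calls that fit in the bandwidth available at peak."""
    return math.floor(AVAILABLE_KBPS / g729_call_kbps(header_bytes))


if __name__ == "__main__":
    for label, header in (("no compression", 40),
                          ("compressed to 4 bytes", 4),
                          ("compressed to 2 bytes", 2)):
        print(f"{label}: {g729_call_kbps(header):.1f} kbps per call, "
              f"up to {max_calls(header)} calls at peak")
```

With the 4-byte compressed header, the 20 required calls fit with only a small margin, so any growth in call volume would reopen the capacity question.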

Modeling can get very complicated, very quickly. There are many other questions that you can ask: What if QoS mechanisms are applied? What if silence suppression is enabled? Start simple. Look at the bandwidth considerations related to your VoIP deployment and go from there, performing the calculations again for each variable you consider.

Taking Charge of Your VoIP Project
ISBN: 1587200929
Year: 2004
Pages: 90
