Generally, echo is at its worst when end-to-end latency is high. If end-to-end latency is below 150 ms, echo should be nearly imperceptible. Remove echo by removing latency. Remember that bigger packet sizes, which are often used with low-bandwidth codecs, can increase latency. If capacity is stopping you from removing latency, increase the capacity on the links that are causing the latency. Steer away from frame-relay and VPN if the link is critical; these technologies tend to provide the slowest links.
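As a rough illustration of how packet size feeds into the latency budget, the sketch below adds up the main one-way delay components and compares the total with the 150 ms guideline. The function names and the sample figures (a 40 ms jitter buffer and 60 ms of network delay) are assumptions chosen for illustration, not measurements.

```shell
#!/bin/sh
# one_way_latency_ms PACKET_MS JITTER_BUFFER_MS NETWORK_MS
# Adds the packetization interval, the receive-side jitter buffer, and the
# network transit delay to estimate one-way latency. (Illustrative helper;
# real calls also incur codec and processing delay, which is ignored here.)
one_way_latency_ms() {
    echo $(( $1 + $2 + $3 ))
}

# latency_ok TOTAL_MS: succeeds when the total is under the 150 ms guideline.
latency_ok() {
    [ "$1" -lt 150 ]
}

# Hypothetical figures: compare 20 ms and 60 ms packet intervals.
for packet_ms in 20 60; do
    total=$(one_way_latency_ms "$packet_ms" 40 60)
    if latency_ok "$total"; then
        echo "packet=${packet_ms}ms total=${total}ms: within budget"
    else
        echo "packet=${packet_ms}ms total=${total}ms: over budget"
    fi
done
```

With these assumed figures, moving from a 20 ms to a 60 ms packet interval is enough to push the call past the guideline, even though nothing about the network itself changed.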
This problem can occur when a NAT firewall exists between the caller and the recipient.
When placing a call using a SIP phone, your phone establishes the call after determining the recipient's socket through DNS and your (or your provider's) SIP registrar. The local socket used to communicate with the recipient is the local IP address and port used by your phone's RTP agent: i.e., your phone's IP address and RTP port. Consider Figure 15-1.
In Figure 15-1, phone A is able to send audio data to phone C, but phone C is unable to send audio data to phone A because the firewall (B) doesn't know what to do with the RTP packets coming from phone C. This is a problem created by NAT (network address translation), which cannot keep track of connectionless applications that use more than one socket pair, like a two-way phone conversation.
To solve this problem, add a SIP proxy server between phone A and firewall B. Configure the SIP phone so that it places all calls through the SIP proxy. The SIP proxy knows how to handle the RTP data sent to, and received from, many SIP endpoints simultaneously. In this instance, the SIP proxy would need to reside on a DMZ between A and B, so that it can have a publicly routable IP address. The SIP proxy could also reside on the same host as the firewall, if public IP addresses aren't abundant.
Another way to approach the NAT problem is to use a STUN (Simple Traversal of UDP through NATs) server. This server assists endpoint devices in figuring out what sockets to use in signaling a VoIP call setup, so that UDP NAT traversal can occur without a DMZ. STUN is described in RFC 3489. More solutions to the NAT problem are presented in Chapter 13.
Like the problem shown in Figure 15-1, SIP registration can break because NAT firewalls are prone to breaking two-way packet conversations. When a privately addressed SIP endpoint (or a voice server registering as a SIP client) registers to a public SIP registrar, it writes its private address into the SIP REGISTER method it sends. This is the address used by the SIP registrar to send its 200 or 400 response, depending upon whether or not the registration attempt was successful. Since private addresses aren't routable on the public Internet, the response never gets returned to the registering endpoint, which treats the missing response as a timeout. Registration fails and the SIP endpoint can't make any phone calls.
The solution to this problem is also to use a SIP proxy that runs parallel with the NAT firewall. But the trick is that the SIP proxy must have a public address so that the remote SIP registrar's responses have somewhere to go.
Assuming the phone's IP configuration is correct, this problem normally occurs when the phone is unable to register or log on to the SIP registrar or H.323 gatekeeper server. Without an outlet to place calls, which is what these servers provide, the IP phone is unable to call anybody.
In SIP setups, this condition results in a 403 (and sometimes 401) SIP response, indicating registration failed. If the SIP setup uses a proxy, then response 407 occurs. Either way, the authentication of the IP phone has failed, so it can't place calls. (Chapter 11 demonstrated a failed SIP registration using Ethereal.)
To resolve the issue, be certain the phone's user ID and password match those stored in the registrar or softPBX. If MD5 is used, be sure the secret key matches as well. Since some phones don't support MD5, try removing the MD5 requirement for this phone to see if it can register without it.
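When reading a packet capture of a failed registration, it can help to keep the relevant response codes straight. The following is a minimal sketch that maps the codes discussed above to their standard SIP reason phrases plus a short hint; the function name and the hint wording are illustrative assumptions.

```shell
#!/bin/sh
# explain_sip_status CODE
# Maps the authentication-related SIP responses discussed above to their
# reason phrase and a troubleshooting hint. (Hypothetical helper.)
explain_sip_status() {
    case "$1" in
        200) echo "OK: registration accepted" ;;
        401) echo "Unauthorized: check the user ID and password on the registrar" ;;
        403) echo "Forbidden: the registrar refused this registration" ;;
        407) echo "Proxy Authentication Required: check the credentials used with the proxy" ;;
        *)   echo "Not an authentication response: look elsewhere" ;;
    esac
}

explain_sip_status 407
```

Keep in mind that a single 401 or 407 can also be the normal first step of a digest challenge; it is the response that comes back after the phone has supplied its credentials that indicates a real failure.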
Levels of utilization that result in a performance loss, or performance limits, are a result of improper planning or provisioning. This can mean a number of things:
The RAM and processor power of servers along the call path may be too low to handle a high number of simultaneous calls. There's no hard and fast rule for provisioning these, but the test lab is the best place to establish the spec for your servers.
The bandwidth availability on a data link in the call path, such as an Ethernet segment or frame-relay virtual circuit, may not be great enough. An old-fashioned calculator will suffice for bandwidth projections.
The wrong type of codec is being used, resulting in unnecessary processing load. For example, running G.729A over Ethernet isn't really necessary. (See the later section, "Calls across a wide area call path have dropouts in the audio.")
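The calculator arithmetic behind a bandwidth projection can be sketched in a few lines of shell. This is a simplified estimate under stated assumptions: 40 bytes of IP/UDP/RTP headers per packet, no header compression, and 18 bytes of Ethernet framing (header plus frame check sequence). The function name is hypothetical.

```shell
#!/bin/sh
# call_bps CODEC_BPS PACKET_MS L2_OVERHEAD_BYTES
# Estimates one direction of one call, in bits per second.
# Assumes 40 bytes of IP/UDP/RTP headers per packet (20 + 8 + 12) and a
# packet interval that yields a whole number of payload bytes.
call_bps() {
    payload_bytes=$(( $1 * $2 / 8000 ))
    packet_bytes=$(( payload_bytes + 40 + $3 ))
    echo $(( packet_bytes * 8 * 1000 / $2 ))
}

# Ethernet framing assumed to be 18 bytes (14-byte header plus 4-byte FCS).
echo "G.711,  20 ms packets: $(call_bps 64000 20 18) bps"
echo "G.729A, 20 ms packets: $(call_bps 8000 20 18) bps"
```

Under these assumptions a G.711 call needs roughly 87 kbps in each direction and a G.729A call roughly 31 kbps. Multiply by the number of simultaneous calls you expect on the link to get a first-pass projection.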
Pay attention to the obvious...
While writing and editing this book, my editor and I were communicating by email about a VoIP TSP service. The editor was about to try it and was hooking up the shiny new ATA the TSP had just shipped him. He plugged his analog phone into the ATA, plugged in his Ethernet cable, and powered the ATA up. After a few minutes, he lifted the receiver, hoping to hear the dial-tone, but instead all he heard was silence.
So he looked over the lights on the front of the unit. They indicated nothing abnormal. He felt all the connections to make sure they were firmly in place and restarted the unit, but to no avail. After getting nowhere with the tech support department at the TSP (they said, "It should just work!"), he emailed me to see if I could help.
With the hope of troubleshooting the problem, I asked a bunch of questions about the editor's firewall, if the ATA was using DHCP, whether the ATA could be pinged from a nearby PC, and if the editor could do a packet capture of the ATA's registration process to be emailed to me for examination. So the editor did a packet capture using Ethereal and emailed it.
After receiving the packet capture, I saw that the ATA's SIP registration was being sent successfully and that the service provider's server was sending the appropriate 200 response. The ATA was clearly communicating with the TSP without a problem. So why was there no dial-tone? Resigned to the idea of advising the editor that his brand-new ATA was broken, I instead decided to sleep on it.
The next morning, I awoke to a message from the editor in my inbox. After all that troubleshooting and Q&A, the true culprit was revealed. The ATA had two RJ11 ports, one for the analog phone and one for a POTS line for 911 calls and the like. The analog phone had been connected to the wrong port on the ATA all along! Just plugging it into the other port resolved the problem immediately. The moral of the story: simple problems tend to have simple solutions.
This problem usually occurs if the access router that connects to your Internet service provider is configured for DHCP, and its IP address has changed because the router's DHCP lease expired. The net effect is that the ATA's signaling socket is broken, and the ATA has no way of knowing it (remember, UDP is connectionless). So, the ATA must be rebooted in order to reestablish a valid socket. The best way to eliminate this problem is to get a static IP address from your ISP, or figure out a way to power-cycle your ATA daily during off-hours. Some TSP-provided ATAs have firmware that works around this problem. Contact your TSP to be sure.
The network can transmit dialed digits in two ways: in-band and out-of-band. This problem occurs most often when in-band signaling of digits is used. Because in-band signaling uses audio (dual-tone multifrequency) signals to represent each dialed digit, it is possible that the signals can be distorted during encoding, transport, and decoding. If the distortion is significant, IVR systems may not be able to interpret the audio signals they receive. Sometimes holding down each digit for an extra second or two can improve the IVR system's recognition, but not always.
To avoid this issue, configure your SIP endpoints and ATAs to use an out-of-band DTMF signaling approach such as the SIP INFO method. Generally speaking, the only time in-band DTMF signaling can be reliably used all the time is on legacy, non-packet-based voice links.
Like in-band DTMF digits, the human voice can be distorted when many subsequent digital/analog conversions or encoding/decoding operations occur. When a sound stream is encoded digitally, decoded into an analog signal or transcoded, and then reencoded into a different codec, degradation of the original signal almost always occurs. It's like making copies of an old analog cassette tape and then making copies of the copies. By the third or fourth generation, the recording on the duplicates starts sounding awful.
This phenomenon is easy to demonstrate with a SIP ATA and a TSP like BroadVoice or VoicePulse. Try calling a digital cell phone, which probably uses the GSM codec to carry the sound stream from the phone to its CO, from an Internet-based TSP like BroadVoice or VoicePulse, which probably uses G.729A to carry the sound stream from the ATA to its SIP server. This call will sound somewhat robotic. Now, try using the TSP service to conference the call with a third party using another cell phone. The two cell phone parties should sound significantly robotic by now, due to all the reencoding.
Unfortunately, most of the control over this issue resides with the service providers. While it's possible for a SIP client to negotiate the GSM codec directly with a cell phone network for a call that needs no transcoding or reencoding, most telephone network operators don't facilitate this kind of advanced signaling yet. This is because they use SS7, which is out of the reach of most enterprise VoIP users.
The good news is, many more small voice carriers and large enterprise NOCs (network operations centers) will be able to use SS7 signaling in the near future. Verisign and Transaction Network Services, Inc., both offer SS7 connectivity that allows VoIP switches to directly signal and negotiate capabilities for each call, including codec selection, in order to increase quality.
As a rule, this means that the latency or jitter on the call path's route is a problem. If the wide area link in question is used for voice and data, use routers that support IP precedence to promote VoIP over non-VoIP traffic. If the link is running at capacity, consider increasing its capacity or using a bandwidth-reservation QoS measure across the link. Also, make sure that, for call paths that have limited bandwidth, calls are selecting a thrifty codec (not G.711).
This problem is caused either by latency on the call path or by poor conversational manners. Most often, latency is the culprit. Do what you can to decrease latency on this call path. Are you using a small packet interval? What about packet-loss concealment? Do you need to be using it? What about jitter buffering? How fast is the link? Is the call path a VPN? All of these things contribute to latency.
Normally, this problem is caused by poor silence suppression techniques. Silence suppression conserves bandwidth by not transmitting sound frames during periods of silence. But since a sound must occur in order to resume transmission, some of the initial frames may not actually get sampled. Of course, since sampling frames isn't what eats bandwidth (transmitting them does), any cessation of sampling during silence-suppressed periods is just poor design. The lesson here is this: if you're going to use silence suppression, try before you buy. Listen to phones that support silence suppression with, and without, that feature enabled. You'll be able to tell very easily if the silence suppression support is any good.
Power provisioning is a common challenge with IP telephony. If IP phones are replacing TDM phones throughout the office, then you've probably decided to use 802.3af to provide power to the IP phones. If not, you'll need to place UPSs (uninterruptible power supplies) and AC/DC power packs at each phone. With a dozen phones or more, the cost of doing so more than justifies the use of 802.3af switches or power injection panels. This way, power backup can be centralized, connected to a heavy-duty UPS and/or power generator, so that when the power goes out, the phones don't! The bottom line is: don't forget to properly provision electrical power during your VoIP build-out.
And, while we're on the subject of power...
There are a couple of things you could do. First, if your budget permits, the obvious (though proprietary) answer is to use Cisco PoE switches to power the phones. Some other switch makers, like Foundry Networks, also support Cisco's proprietary PoE standard. If you can't afford to forklift your switches and instead want to power your Cisco phones by way of power injectors, then you should consider Cisco PoE-compatible injectors like those made by PowerdSine (http://www.powerdsine.com). But if you can't do that either, do the next best thing: hack.
By flipping wires 4 and 7 and wires 5 and 8 on a standard UTP Ethernet patch cable, you've basically made a "compatibility cable" that lets you plug Cisco IP phones into an 802.3af source, as in Figure 15-2.
Make sure your switch lets you program, port by port, which ports get power and which ones don't. This is necessary because, in the Cisco PoE solution, Cisco IP phone power requirements are autodetected, so power can turn itself on and off as necessary on each port. There's no such provision when using a hacked cable to supply 802.3af power to a Cisco PoE-using phone. If this is a problem, and 802.3af won't work with the hacked cable, then try using a device that does the two-pair flip but also works with autodetection, such as 3Com's 48-volt Intellijack switch converter, part number 3CNJVOIP-CPOD.
Chances are, you're trying to use a bandwidth-absorbent codec like G.711 over the Internet VoIP trunk that connects to IAXTel. Switch to a more miserly codec like GSM or, if the destination supports it, Speex.
This is a rare occurrence, but every seasoned system administrator knows the danger of a rashly typed command or a misplaced mouse click. One second, everything's humming along fine, and the next, a critical system is, well, gone. Aside from careful consideration of every administrative move you make, your best defense against this scenario is, of course, a good backup of your PBX configuration. This will also protect you against another possibility: a hard drive crash or data corruption on a PBX.
Some commercial PBXs offer built-in backup capabilities and even an undo feature to roll back a botched config. But if you're using a noncommercial solution like Asterisk or Open H.323, you'll need to brew your own backup recipe. Here's a shell script that you could trigger from cron on Linux to back up the Asterisk configuration:
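    #!/bin/sh
    # Back up the Asterisk configuration into a dated, gzipped tar archive.
    STAMP=$(date +%Y%m%d)
    WORK=/var/backup/$STAMP
    mkdir -p "$WORK"
    # Copy the config tree to a temporary directory, logging any errors.
    cp -R /etc/asterisk "$WORK" >> /var/backup/backup.log 2>&1
    # Archive and compress the copy, then delete the temporary directory.
    tar -czf "/var/backup/$STAMP.tar.gz" -C "$WORK" asterisk
    rm -Rf "$WORK"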
This script copies the Asterisk configuration directory, /etc/asterisk, to a temporary directory in /var/backup, compresses it into a gzip file, and drops it in /var/backup before deleting the temporary directory. You would want to run this daily, and use Veritas, cpio, tar, or similar tape backup software to copy it onto tape daily, too. That way, your rollback window, the time between the most recent backup and a critical failure, is never longer than a single day.
As nice as new IP phones are, if your rollout is big enough, somebody is going to complain about something. It could be the angle of the new handsets causing (probably exaggerated) neck strains, or it could be that the buttons on the keypads aren't dimpled, making it harder to dial without looking at the keypad. There may be slight differences in the sound of the phone receivers that aggravate certain users (MOS scoring can help you evaluate demonstration equipment before you make a big purchase).
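A backup is only useful if it can be read back, so it may be worth checking each archive before it goes offsite. The following is a minimal sketch of such a check; the function name is hypothetical, and it assumes the backup is a gzip-compressed tar archive like the one described above.

```shell
#!/bin/sh
# verify_backup ARCHIVE
# Succeeds only if ARCHIVE exists, is non-empty, and can be listed as a
# gzip-compressed tar archive. (Illustrative helper; it confirms the archive
# is readable, not that its contents are a working configuration.)
verify_backup() {
    [ -s "$1" ] && tar -tzf "$1" > /dev/null 2>&1
}

# Example use against a hypothetical archive path passed as an argument.
if [ -n "$1" ]; then
    if verify_backup "$1"; then
        echo "$1: archive is readable"
    else
        echo "$1: archive is missing or corrupt" >&2
    fi
fi
```

Running a check like this from the same cron job, and alerting on failure, helps keep a silently broken backup from going unnoticed until the day you need it.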
The key to minimizing complaints after the implementation is to evaluate end user input before it. This means engaging end users, and allowing their requirements to drive the project. It also means defining your service-level agreement in terms that are as specific and practical as possible. The mean opinion score metric is a good guarantor of SLA attainment (MOS is covered in Chapter 9).
When the legacy equipment is gone, though, attrition will deal with the matters of opinion and nuance that you just can't do anything about. If complainers are the usual suspects, those people who complain no matter what, then you've probably done a good job. Eventually, the dust will settle, and the ROI process will begin. Once the rewards of the VoIP switch are in hand (increased efficiency and decreased cost), the noisy complaints of Ebenezer Evans in the Accounting Department about the size of his keypad buttons will suddenly seem frivolous.
Well, you can. But you may have to do a bit of hacking. The problem is, IP telephony isn't an ideal technology for broadcast applications such as overhead paging. Some proprietary solutions support IP phones for overhead paging using IP multicast, but there really isn't an authoritative standard for dealing with this need yet.
This simple hack will let you do intercom paging on a Cisco SIP 7960. It uses autoanswer to pick up incoming calls on a second line, which we'll call a paging line, so that the paging party can broadcast his voice. A 7960 phone configured in that way can act as a paging speaker because, as soon as a call comes in on the paging line, it's automatically answered. In principle, a similar approach should work on any phone that supports autoanswer. So the hack is this: configure a line appearance on the 7960 to autoanswer, and you've got a basic intercom paging system.
With a little work, you can make a group of 7960s with paging lines act as "overhead" speakers for a paging zone. To do this, your PBX server will have to support some form of ad hoc conferencing. Create, in the dial-plan, an extension that adds all of the 7960s to the conference room, using their paging lines so they autoanswer, and then gives the dialing party a beep to let her know she can begin her announcement. If you can, limit the absolute timeout of the conference to 10 or 15 seconds to ensure all of the paging lines are "hung up" at the conclusion of the page. Now, instead of merely having a single phone acting as an intercom, you've got a group of phones acting as a paging zone.
Keep in mind that this technique is unicast in nature: it doesn't preserve wide area bandwidth the way a multicast paging setup would. And, with 96 kbps per simultaneous paging endpoint on the LAN, you could clobber your local area bandwidth if you have more than a few hundred phones on each zone. Yet, if your building has lots of obstructions (like office walls), you may need to use a lot of paging endpoints so that everybody can hear zone pages. Since there tends to be a phone on each desk, they make great paging speakers in this situation. If the building where you're placing the paging zone is open and has no interior walls to block the sound (like a warehouse or factory), then you'd probably be better off using a few large, analog speakers for zone paging.
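To get a feel for how quickly unicast paging adds up, the sketch below multiplies the 96 kbps per-endpoint figure by the number of phones in a zone and expresses the result as a share of a link. The function names, the 300-phone zone, and the 100 Mbps link are assumptions for illustration.

```shell
#!/bin/sh
# paging_load_kbps ENDPOINTS [KBPS_PER_ENDPOINT]
# Total unicast load of one page; defaults to 96 kbps per endpoint.
paging_load_kbps() {
    echo $(( $1 * ${2:-96} ))
}

# link_percent LOAD_KBPS LINK_KBPS
# Load as a whole-number percentage of link capacity (rounded down).
link_percent() {
    echo $(( $1 * 100 / $2 ))
}

# Hypothetical zone: 300 phones paged over a 100 Mbps link.
load=$(paging_load_kbps 300)
echo "300 endpoints: ${load} kbps, $(link_percent "$load" 100000)% of a 100 Mbps link"
```

In this example a single page to 300 phones generates 28,800 kbps, or a bit over a quarter of a 100 Mbps link, for as long as the announcement lasts. Because every stream is a separate unicast call, that load is likely to concentrate on whichever link serves the conferencing server.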
The old adage "You get what you pay for" applies aptly to VoIP. But if your ROI payoff is going to be real, all of your project's potential costs must be recognized fully and truthfully. This means adding the cost of consulting, training, and migration to the sticker price on that shiny new VoIP hardware. Depending on your particular implementation, the hardware itself may represent only a third of the overall cost. When you create your RFP, ask responders to include a cost per user for training and implementation and then tack that cost onto the project's bottom line.
You spent months doing MOS testing, interviewing end users, looking over the shoulder of the corporate receptionist, taking detailed notes about call flow, and maybe pulling your hair out doing return-on-investment analysis.
Now, the big day for the ambitious change has come. Today, you'll be turning on 500 new IP phones, two new fully survivable softPBX servers, and a brand-new 48-trunk PSTN connect point with two PRI gateways attached. But at the moment of truth, as everybody's breath is held and fingers are crossed, the phone company makes a provisioning mistake and one of those T1 smart jacks has bright red alarm lights. Your VoIP network has no way to reach the outside world, and now that you've flipped the proverbial switch, nobody from the outside world can reach that group of users: users who, yesterday, had a flawlessly functioning legacy phone system. Ouch.
This isn't a situation you want to end up in. If the phone company misses your switchover date, it might only mean that your service will remain on your old phone system. But if the trunks used for the new system have the same phone numbers as those used with the old system, you could be in a mess. This is because they can't be connected to both systems (old and new) at the same time. If they could, switching PSTN connectivity from one trunk group to another wouldn't be such a critical task. Here's how you can avoid a botched trunk move turning into phone downtime:
1. Don't switch over all of your inbound phone numbers to a new trunk circuit. Bring over only the "main" numbers: the numbers the outside world knows. That way, if the switchover doesn't happen, you'll still have usable (secondary) outbound trunks on the old system. You'll have to worry about inbound traffic only to those "main" numbers.
2. In the event of a failed switchover, first determine whether you still have dial-tone on your old trunks. If you do, you can continue to use them with your old system until the switchover can be rescheduled.
3. If the old trunks don't have dial-tone any longer and the new trunks aren't working, have the phone company forward all calls destined for those "main" inbound numbers at the CO to the secondary trunks that you haven't switched over. These trunks are, hopefully, still connected to your old system. (See item 1.)
4. If possible, coordinate the switchover with the phone company so that any failures can be picked up on immediately and the CO reverted to its prior state until the switchover can be attempted again. (Third-party implementers of PBX systems are experienced at managing interaction with the phone company.)
5. Plan to do your switchover during off-hours. (And not an hour before business on a Monday morning.) Give yourself plenty of time for testing of the internal dial-plan, too. Even if your shiny new PRIs are working right, there may be flaws in your call-flow logic that need to be worked out before users can start making and receiving calls.
If IP precedence and Ethernet CoS don't have you totally convinced that you can replace a thousand TDM phones with a single switched Ethernet segment and a pallet full of SIP phones, then don't sweat it. You're not alone. A perception of poor quality, whether deserved or undeserved, was the top reason for the industry's slowness to adopt VoIP during the late 90s and early 00s.
If you have money to burn and you want to do absolutely everything in your power to make sure your VoIP links are tip-top (whether they're Ethernet or WAN links), then you can dedicate them only to VoIP. By eliminating all other traffic from these network segments, you can take advantage of IP telephony without having to share the transport with non-telephony apps. In essence, this means building and running a second, independent IP network alongside your existing one. More than just a VLAN, this network is an operationally independent network that has no bridges to the other network, and therefore no way of commingling traffic with it.
Why would anybody go to such lengths for VoIP, when the IETF and IEEE have invested so much time and effort into QoS? That's a philosophical question that I'll leave to you.