A Cisco IP Communications network is a suite of components that includes Internet Protocol (IP) telephony communications. Cisco CallManager is a core component of a Cisco IP Communications network, the primary function of which is to serve as the call routing and signaling component for IP telephony.
The term IP telephony describes telephone systems that place calls over the same type of data network that makes up the Internet. Although, strictly speaking, IP telephony primarily enables users to have voice conversations, CallManager can also enable users with PCs associated with their phones, users with video-only endpoints, and users with H.323-based video systems to have end-to-end video conversations.
Telephone systems have been around for more than 100 years. Small, medium, and large businesses use them to provide voice communications between employees within the business and to customers outside the business. The public telephone system itself is a very large network of interconnected telephone systems.
What makes IP telephony systems in general, and CallManager in particular, different is that they place calls over a computer network. The phones that CallManager controls plug directly into the same IP network as your PC, rather than into a phone jack connected to a telephones-only network.
Phone calls placed over an IP network differ fundamentally from those placed over a traditional telephone network. To understand how IP calls differ, you must first understand how a traditional telephone network works.
In many ways, traditional telephone networks have advanced enormously since Alexander Graham Bell invented the first telephone in 1876. Fundamentally, however, the traditional telephone network is still about connecting a long, dedicated circuit between two telephones.
Traditional telephone networks fall into the following four categories: key systems, private branch exchanges (PBX), Class 5 switches, and Class 1 to 4 switches.
A key system is a small-scale telephone system designed to handle telephone communications for a small office of 1 to 25 users. Key systems can be either analog, which means they use the same 100-year-old technology of your home phone, or digital, which means they use the 30-year-old technology of a standard office phone.
A PBX is a corporate telephone office system. These systems scale from the small office of 20 people to large campuses (and distributed sites) of 30,000 people. However, because of the nature of the typical circuit-based architecture, no PBX vendor manufactures a single system that scales throughout the entire range. Customers must replace major portions of their infrastructure if they grow past their PBX limits.
A Class 5 switch is a national telephone system operated by a local telephone company (called a local exchange carrier [LEC]). These systems scale from about 2000 to 100,000 users and serve the public at large.
Long distance companies and national carriers (called interexchange carriers [IEC or IXC]) use Class 1 to 4 switches. They process truly mammoth levels of calls and connect calls from one Class 5 switch to another.
Despite the large disparity in the number of users supported by these types of traditional networks, the core technology is circuit-based. Consider an old-time telephone operator. He or she sits in front of a large plugboard with hundreds of metal sockets and plugs. (Figure 1-1 shows a picture of an early PBX.) When a subscriber goes off-hook, a light illuminates on the plugboard. The operator plugs in the headset and requests the number of the party from the caller. After getting the number of the called party and finding the called party's socket, the operator checks to see whether the called party is busy. If the called party is not busy, the operator connects the sockets of the calling and called parties with a call cable, thus completing a circuit between them. The circuit provides a conduit for the conversation between the caller and the called party.
Figure 1-1. An Early PBX
Today's central switching office (specifically, its call processing software) is simply a computerized replacement for the old-time telephone operator. Obeying a complex script of rules, the call processing software directs the collection of the number of the called party, looks for the circuit dedicated to the called party, checks to see whether the line is busy, and then completes the circuit between the calling and called parties.
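The four steps that the call processing software performs (collect the number, find the called party's circuit, check for busy, complete the circuit) can be sketched in a few lines of Python. This is only a toy model for illustration: the class name CircuitSwitch, its methods, and the four-digit directory numbers are hypothetical and do not correspond to any real switch software.

```python
from dataclasses import dataclass, field


@dataclass
class CircuitSwitch:
    """Toy model of call processing in a circuit-switched office (hypothetical)."""
    # Directory numbers served by this switch.
    lines: set[str] = field(default_factory=set)
    # Maps each party on an active call to the party at the other end.
    circuits: dict[str, str] = field(default_factory=dict)

    def is_busy(self, number: str) -> bool:
        return number in self.circuits

    def place_call(self, caller: str, dialed_digits: str) -> str:
        # Step 1: the number of the called party has been collected.
        called = dialed_digits
        # Step 2: look for the circuit dedicated to the called party.
        if called not in self.lines:
            return "unknown number"
        # Step 3: check whether the called party's line is busy.
        if called == caller or self.is_busy(called):
            return "busy"
        # Step 4: complete the circuit between calling and called parties.
        self.circuits[caller] = called
        self.circuits[called] = caller
        return "connected"

    def hang_up(self, party: str) -> None:
        # Releasing either end tears down the whole circuit.
        other = self.circuits.pop(party, None)
        if other is not None:
            self.circuits.pop(other, None)


switch = CircuitSwitch(lines={"1000", "1001", "1002"})
print(switch.place_call("1000", "1001"))  # connected
print(switch.place_call("1002", "1001"))  # busy
```

While the first call is up, the circuit for 1001 stays reserved, so the second caller gets a busy result even if nobody on the first call is speaking.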
In the past, this circuit was an analog circuit from end to end. The voice energy of the speaker was converted into an electrical wave that traveled to the listener, where it was converted back again into a sound wave. Even today, the vast majority of residential telephone users still have an analog circuit that runs from their phone to the phone company's central switching office, whereas digital circuits run between central switching offices.
This reliance on circuits characterizes traditional telephone systems and gives rise to the term circuit switching. A characteristic of circuit switching is that after the telephone system collects the number of the called party, and establishes the circuit from the calling party to the called party, this circuit is dedicated to the conversation between those calling and called parties. The resources allocated to the conversation cannot be reused for other purposes, even if the calling and called parties are silent on the call. Furthermore, if something happens to disrupt the circuit between the calling and called parties, they can no longer communicate.
Like the central switching office, CallManager is a computerized replacement for a human operator. CallManager, however, relies on packet switching to transmit conversations. Packet switching is the mechanism by which data is transmitted through the Internet, which encapsulates packets according to the Internet Protocol (IP). Web pages, e-mail, and instant messaging are all conveyed through the fabric of the Internet by packet switching. The term voice over IP (VoIP) specifically refers to the use of packet switching using IP to establish voice communications between IP-enabled endpoints on LANs and IP WANs, as well as the Internet (although CallManager is generally not deployed in configurations that route voice traffic over the Internet).
In packet switching, information to be conveyed is digitally encoded and broken down into small units called packets. Each packet consists of a header section and the encoded information. Among the pieces of header information is the network address of the recipient of the information. Packets are then placed on a router-connected network. Each router looks at the address information in each packet and decides where to send the packet. The recipient of the information can then reassemble the packets and convert the encoded data back into the original information.
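As a rough illustration of this process, the following Python sketch breaks a message into packets, each carrying a small header and a slice of the encoded data, and then reassembles them at the recipient. The Packet type, the sequence field used for reordering, and the address 10.0.0.7 are assumptions made for the example; real IP packets carry many more header fields.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    destination: str  # header: network address of the recipient
    sequence: int     # header: position of this payload (assumed field)
    payload: bytes    # the encoded information


def packetize(data: bytes, destination: str, size: int = 4) -> list[Packet]:
    """Break the encoded data into packets of at most `size` payload bytes."""
    return [
        Packet(destination, seq, data[offset:offset + size])
        for seq, offset in enumerate(range(0, len(data), size))
    ]


def reassemble(packets: list[Packet]) -> bytes:
    """Restore the original data, whatever order the packets arrived in."""
    ordered = sorted(packets, key=lambda p: p.sequence)
    return b"".join(p.payload for p in ordered)


message = b"hello, packet world"
packets = packetize(message, destination="10.0.0.7")
random.shuffle(packets)  # packets may take different paths and arrive out of order
assert reassemble(packets) == message
```

Because every packet carries the recipient's address, each one can be forwarded independently of the others.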
Packet switching is more resilient to network problems than circuit switching because each packet contains the network address of the recipient. If something happens to the connection between two routers, a router with a redundant connection can forward the information to a secondary router, which in turn looks at the address of the recipient and determines how to reach it. Furthermore, if the sender and recipient are not communicating, the resources of the network are available to other users of the network.
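The rerouting behavior can be sketched with a small, hypothetical topology in which router R1 has a direct link to R2 and a redundant link through R3. The find_route function below uses a plain breadth-first search as a stand-in for the routing protocols that real routers run; the node names and the function itself are illustrative assumptions.

```python
from collections import deque


def find_route(links, source, destination):
    """Breadth-first search for a hop-by-hop path over the working links."""
    neighbors = {}
    for link in links:
        a, b = tuple(link)
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == destination:
            return path
        for nxt in sorted(neighbors.get(path[-1], ())):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None  # no working path remains


links = {
    frozenset(pair)
    for pair in [("sender", "R1"), ("R1", "R2"), ("R2", "recipient"),
                 ("R1", "R3"), ("R3", "R2")]
}
print(find_route(links, "sender", "recipient"))
# ['sender', 'R1', 'R2', 'recipient']

links.discard(frozenset(("R1", "R2")))  # the R1-R2 connection fails
print(find_route(links, "sender", "recipient"))
# ['sender', 'R1', 'R3', 'R2', 'recipient']
```

A dedicated circuit through the failed link would simply drop, whereas the packets here still reach the recipient by the secondary path.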
In circuit-switched voice communications, an entire circuit is consumed when a conversation is established between two people. The system encodes the voice in a variety of manners, but the standard for voice encoding in the circuit-switched world is pulse code modulation (PCM). Because PCM is the de facto standard for voice communications in the circuit-switched world, it comes as no surprise that a single voice circuit has been defined as the amount of bandwidth required to carry a single PCM-encoded voice stream.
Video communications require that significantly more information be sent from one end of a connection to another. In circuit-switched video communications, multiple circuits are usually simultaneously reserved for a single call to allow endpoints to exchange high-quality video.
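To put numbers on these two paragraphs: standard telephony PCM (G.711) samples the voice 8000 times per second at 8 bits per sample, so one voice circuit carries 64 kbps. The short sketch below derives that figure and estimates how many such circuits a video call would reserve. The 384-kbps video rate is only an example value chosen for illustration, not a figure from this chapter.

```python
import math

SAMPLES_PER_SECOND = 8000  # G.711 PCM sampling rate
BITS_PER_SAMPLE = 8        # G.711 PCM sample size


def pcm_bit_rate() -> int:
    """Bandwidth of one PCM-encoded voice stream, in bits per second."""
    return SAMPLES_PER_SECOND * BITS_PER_SAMPLE


def circuits_needed(stream_bps: int) -> int:
    """Number of whole voice circuits required to carry a stream."""
    return math.ceil(stream_bps / pcm_bit_rate())


print(pcm_bit_rate())            # 64000
print(circuits_needed(384_000))  # 6 (example video rate, an assumption)
```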
Packet-switched communications introduce an interesting complication involving voice encoding. Although circuit-switched systems could encode the voice stream according to a more efficient scheme, little incentive exists to do so, because, in most instances, a circuit is fully reserved no matter how little data you place on it. In the packet-switched world, however, a more efficient encoding scheme means that for the same amount of voice traffic, you can place smaller packets on the network, which in turn means that the same network can carry a larger number of conversations. As a result, the packet-switched world has given rise to several different encoding schemes called codecs.
Different types of voice encoding offer different benefits, but generally, the higher the fidelity of the voice, the more bandwidth the resulting media stream requires. As the bandwidth that you are willing to let the voice stream consume decreases, the codec must become more clever and complex to maintain voice quality. The codecs that attempt to minimize the bandwidth required for a voice stream rely on complex mathematical calculations that attempt to predict in advance information about the volume and frequency level of an utterance. Such codecs are highly optimized for the spoken voice. Furthermore, these calculations are often so computationally intensive that software cannot perform them quickly enough; only specialized hardware with digital signal processors (DSPs) can handle the computations efficiently. As a result, codec support often differs substantially from device to device in the VoIP network, because devices that do not incorporate DSPs can generally support only easy-to-encode and easy-to-decode codecs such as G.711.
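The effect of codec efficiency on network capacity can be estimated with some simple arithmetic. The sketch below assumes a 20-ms packetization interval and 40 bytes of IPv4, UDP, and RTP headers per packet, ignores link-layer overhead, and compares G.711 (64 kbps) with G.729 (8 kbps). The 1-Mbps link and the function names are illustrative assumptions, not values from this chapter.

```python
HEADER_BYTES = 40  # IPv4 (20) + UDP (8) + RTP (12); link-layer overhead ignored


def stream_bps(codec_bps: int, packet_ms: int = 20) -> int:
    """Network bandwidth of one voice stream, including packet headers."""
    packets_per_second = 1000 // packet_ms
    payload_bytes = codec_bps * packet_ms // 8000  # bits per interval -> bytes
    return (payload_bytes + HEADER_BYTES) * 8 * packets_per_second


def streams_on_link(link_bps: int, codec_bps: int) -> int:
    """How many simultaneous voice streams fit on the link in one direction."""
    return link_bps // stream_bps(codec_bps)


for name, rate in [("G.711", 64_000), ("G.729", 8_000)]:
    print(name, stream_bps(rate), streams_on_link(1_000_000, rate))
# G.711 80000 12
# G.729 24000 41
```

Under these assumptions, the more efficient codec lets the same link carry more than three times as many conversations, although the fixed header overhead keeps the gain well below the eightfold difference in raw codec rates.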
Because not all network devices understand all codecs, an important part of establishing a packet voice call is the negotiation of a voice codec to be used for the conversation. This codec negotiation is a part of a packet-switched call that does not assume nearly the same importance on a circuit-switched call. Chapter 5, "Media Processing," discusses codecs in more detail.
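One simple way to picture codec negotiation is as a search for the first codec in one endpoint's preference list that the other endpoint also supports. The following sketch is a hypothetical illustration of that idea only; it is not CallManager's actual selection logic, and the function name and capability lists are assumptions.

```python
from typing import Optional


def negotiate_codec(caller_prefs: list[str], called_caps: set[str]) -> Optional[str]:
    """Return the caller's most preferred codec that the called device supports."""
    for codec in caller_prefs:
        if codec in called_caps:
            return codec
    return None  # no common codec: the call needs a transcoder or fails


# A DSP-equipped phone prefers the low-bandwidth codec; a software endpoint
# without DSPs supports only G.711 (hypothetical capability sets).
print(negotiate_codec(["G.729", "G.711"], {"G.711"}))  # G.711
print(negotiate_codec(["G.729"], {"G.711"}))           # None
```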
The information contained in a video call is also encoded using a particular codec. Unlike voice, however, where a handful of codec variants must be interworked, interactive videoconferencing in the IP world has widely adopted H.263 to encode end-to-end video information (although most products are moving toward H.264).
The rest of this chapter discusses the following topics: