To accommodate multimedia applications that require delay-sensitive delivery, use real-time media transfer protocols such as Real-Time Transport Protocol (RTP) and its companion control protocol, RTP Control Protocol (RTCP). To add control over the media transfer, use signaling protocols such as Real-Time Streaming Protocol (RTSP), SIP, and H.323. You can also configure these signaling protocols to use Session Description Protocol (SDP), XML, or SMIL to supply session information to the participants.
Table 9-3 outlines the real-time protocols discussed in this chapter.
Transferring Streaming Media with the Real-Time Transport Protocol
RTP provides the transport for audio and visual media transmission over an IP network. The Layer 4 transport can be UDP or TCP, but UDP is used more often. For real-time applications, UDP introduces less packet delay than TCP. Recall from Chapter 2, "Exploring the Network Layers," that TCP delays stem from retransmissions after packet loss and from the TCP slow start congestion control algorithm. Furthermore, most real-time applications prefer to conceal packet loss rather than retransmit lost packets. As such, RTP provides mechanisms to handle network issues, such as jitter and packet loss, on its own at the application layer of the OSI model.
You can scale RTP by using IP Multicast Layer 3 forwarding, provided that you enable IP Multicast features in your network infrastructure, such as PIM-SM and Bidir-PIM. You also can use unicast-UDP to transport RTP sessions.
RTP is flexible because it specifies the transport mechanism, not the payload formats and algorithms for the underlying real-time media. RTP can transport a number of video formats, such as MPEG-4, H.261, JPEG compressed video, and many more. However, each payload format is normally specified separately in its own RFC or ITU document in order to provide format-specific header values and controls. For example, in the case of H.261, RFC 2032 specifies Negative Acknowledgements to control video flow and handle retransmission of lost packets.
Because RTP typically runs over UDP, which neither retransmits lost packets nor reorders out-of-sequence packets, RTP packets include timestamps and sequence information in their application headers. RTP uses the timing information to synchronize different sources involved in a multimedia presentation, such as audio and video. For example, lip movements require synchronization with the voice of someone presenting over a video conference or corporate communication. RTP also includes sequence numbers to detect lost or out-of-sequence packets. In contrast to the mechanism TCP uses to retransmit and reorder in the event of errors, RTP uses sequence numbers to detect packet loss in order to conceal rather than correct errors. The assumption is that a video frame received out-of-sequence is better discarded than displayed to the viewer out-of-sequence.
Although RTP uses a short playout buffer, the buffer is meant only to absorb packet jitter; it is not suited to retransmitting and reordering packets.
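As a minimal sketch of how a receiver might read these fields, the following Python decodes the 12-byte fixed RTP header and counts gaps in the sequence numbers. The header layout follows RFC 3550. The function names parse_rtp_header and count_missing are illustrative, and the policy of simply ignoring late packets is an assumption that mirrors the discard behavior described above.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the 12-byte fixed RTP header (field layout per RFC 3550)."""
    if len(packet) < 12:
        raise ValueError("RTP packet is shorter than the 12-byte fixed header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,          # top two bits of the first byte
        "marker": b1 >> 7,           # top bit of the second byte
        "payload_type": b1 & 0x7F,   # low seven bits of the second byte
        "sequence": seq,             # 16-bit sequence number
        "timestamp": timestamp,      # 32-bit media timestamp
        "ssrc": ssrc,                # synchronization source identifier
    }


def count_missing(sequence_numbers) -> int:
    """Count sequence-number gaps, allowing for 16-bit wraparound.

    Assumption for this sketch: a packet that arrives behind the expected
    position is treated as late and ignored rather than reordered.
    """
    missing = 0
    expected = None
    for seq in sequence_numbers:
        if expected is None:
            expected = (seq + 1) & 0xFFFF
            continue
        gap = (seq - expected) & 0xFFFF
        if gap >= 0x8000:
            continue  # late or duplicate packet: discard it
        missing += gap
        expected = (seq + 1) & 0xFFFF
    return missing
```

A receiver could call parse_rtp_header on each datagram and feed the resulting sequence values to count_missing to decide when to conceal a loss.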
RTCP is the protocol within RTP for session monitoring and control but does not provide any delivery guarantee. RTCP maintains the state of RTP sessions using a unique identifier, called CNAME, for each group of RTP-UDP connections. RTCP uses this state to group the different feeds into a single multimedia session. Based on the session that the RTP-UDP connections belong to, synchronization can take place using the timing information that is included in the session information embedded within each RTP and RTCP UDP connection. However, the timestamps in the RTP packets originating from different servers may skew from one another, making synchronization difficult. As a result, RTCP provides a reference clock (or wallclock) to reconcile timestamps from different RTP streams for synchronization purposes. You can derive the RTCP wallclock from an external Network Time Protocol (NTP) source. NTP is accurate enough to provide time resolution appropriate for any of today's streaming media applications. The streaming media application can use the RTCP time reference to calculate jitter, data packet rates, and clock skew in the individual RTP connections.
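The wallclock reconciliation can be sketched as follows, assuming (as in RFC 3550) that each sender report pairs an RTP timestamp with a wallclock time for the same instant. The function name rtp_to_wallclock, its parameters, and the sample values are hypothetical and serve only to illustrate the arithmetic.

```python
def rtp_to_wallclock(rtp_ts, sr_rtp_ts, sr_wallclock, clock_rate):
    """Map an RTP timestamp to wallclock seconds using one sender-report pair.

    rtp_ts       -- RTP timestamp of the packet to place on the timeline
    sr_rtp_ts    -- RTP timestamp carried in the most recent sender report
    sr_wallclock -- wallclock time (in seconds) carried in the same report
    clock_rate   -- RTP clock ticks per second for this stream
    """
    # Use a signed 32-bit difference so the mapping survives timestamp wraparound.
    delta = (rtp_ts - sr_rtp_ts) & 0xFFFFFFFF
    if delta >= 0x80000000:
        delta -= 0x100000000
    return sr_wallclock + delta / clock_rate


# Hypothetical streams: 90 kHz video and 8 kHz audio with unrelated RTP timestamps.
video_time = rtp_to_wallclock(270000, 90000, 1000.0, 90000)
audio_time = rtp_to_wallclock(24000, 16000, 1001.0, 8000)
in_sync = video_time == audio_time  # both packets map to the same wallclock instant
```

Packets from different streams that map to the same wallclock value belong at the same point in the presentation, which is what makes lip synchronization possible even when the streams use different clock rates.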
RTCP also supports congestion control through client-side reporting on the quality of the streaming data reception. Receivers periodically (for example, every 5 seconds) send RTCP receiver reports (RR) that indicate the quality of the stream they receive, and senders periodically send RTCP sender reports (SR). Based on these reports, participants can calculate statistics, such as the number of lost packets, round-trip times, and inter-arrival jitter. Optionally, RTCP can also carry participant information, such as participant name, e-mail address, phone number, and location, in source description (SDES) packets.
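The inter-arrival jitter statistic can be illustrated with the running estimator that RFC 3550 describes, in which each new sample moves the estimate by one-sixteenth of the difference. This is a sketch only: the function name and the input format (pairs of RTP timestamp and arrival time, both in the same clock units) are assumptions made for the example.

```python
def interarrival_jitter(packets):
    """Estimate inter-arrival jitter from (rtp_timestamp, arrival_time) pairs.

    Both values must be expressed in the same units (for example, RTP
    clock ticks). Implements J = J + (|D| - J) / 16, where D is the change
    in transit time between consecutive packets.
    """
    jitter = 0.0
    previous_transit = None
    for sent, arrived in packets:
        transit = arrived - sent
        if previous_transit is not None:
            deviation = abs(transit - previous_transit)
            jitter += (deviation - jitter) / 16.0
        previous_transit = transit
    return jitter
```

A stream whose packets all experience the same transit delay yields a jitter of zero, regardless of how large that delay is; only variation in delay raises the estimate.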
RTCP sends packets periodically to all participants in the session, using a different port number than the RTP streams. This way, all participants can evaluate the total number of participants. Packets are sent using the same distribution as the RTP streams, either UDP unicast or multicast. RTCP traffic normally does not exceed 5 percent of the total session bandwidth, with at least 25 percent of that being for source reports.
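As a quick worked example of these proportions, the following sketch computes the RTCP allowance for a session. The function name and the 1000-kbps session are hypothetical; the 5 percent and 25 percent figures are the ones given above.

```python
def rtcp_bandwidth(session_kbps, rtcp_percent=5, report_percent=25):
    """Return (total RTCP kbps, minimum kbps within that share for source reports)."""
    rtcp_kbps = session_kbps * rtcp_percent / 100
    report_kbps = rtcp_kbps * report_percent / 100
    return rtcp_kbps, report_kbps


# Hypothetical 1000-kbps session: 50 kbps for RTCP, of which at least 12.5 kbps
# is set aside for source reports.
total_rtcp, report_share = rtcp_bandwidth(1000)
```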
Table 9-4 lists the available RTCP commands.
RTP organizes data into payloads such that each packet contains an independently decodable unit. If possible, each frame of a video feed is compressed and sent in a single packet, so that the user can decode the packet as it arrives on the network. If a source sends a single frame across multiple packets, the RTP timestamp is the same for each packet.
RTP uses the UDP port range 16384 through 32767. RTP uses the even-numbered ports in this range; RTCP uses the odd-numbered ports.
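The even and odd pairing can be sketched as a small helper, assuming the port range runs from 16384 through 32767 and that each RTCP port is the odd number immediately above its RTP port. The constant and function names are illustrative.

```python
RTP_PORT_MIN = 16384
RTP_PORT_MAX = 32767


def rtcp_port_for(rtp_port):
    """Return the RTCP port paired with an even-numbered RTP port."""
    if not RTP_PORT_MIN <= rtp_port <= RTP_PORT_MAX:
        raise ValueError("port is outside the RTP range")
    if rtp_port % 2 != 0:
        raise ValueError("RTP streams use even-numbered ports")
    return rtp_port + 1  # RTCP takes the next odd-numbered port
```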
Real-Time Data Control with Real-Time Streaming Protocol
RTSP acts as a TV remote control, enabling the recipient to use functions, such as play, pause, record, fast-forward, and rewind, to control the delivery of media from the origin server to clients.
RTSP is similar to HTTP with the following major exceptions:
Table 9-5 lists the common RTSP messages that clients and servers use.
A client must have information about the following components to request and receive a stream.
RTSP can use Session Description Protocol (SDP), SMTP, XML, or SMIL to inform clients of this information. In Figure 9-3, a client uses HTTP to request a description of the streaming content from the server. You identify the streaming content by URL in the same way that HTTP identifies web content. The server responds with a detailed SDP description of the streaming media, including the media types, transport protocols, the multicast or unicast IP addresses of the sources, and codecs. Figure 9-3 shows a typical RTSP flow in which the client retrieves the SDP file using HTTP.
Figure 9-3. Sample RTSP Flow for Controlling a Multimedia Session
You also can send SDP description files to clients using RTSP in response to the DESCRIBE method. Example 9-4 is a description file in SDP format for an on-demand Windows streaming media session within an .asf file containing three streams: an audio, video, and whiteboard stream.
The W3C defines custom XML tags that enable you to implement the SDP fields given in Example 9-4 using SMIL. To use the custom tags, you need to define the SDP namespace <smil xmlns:sdp="http://www.w3.org/AudioVideo/1998/08/draft-hoschka-smilsdp-00"> in your SMIL file.
Example 9-4. Sample Session Description File to Describe a WMT Stream
If you are using RealNetworks or WMT, you can package the SDP file directly into the proprietary container file headers.
When the client receives the SDP file, it sends a SETUP method to the server to initialize the requested session. The SETUP includes the transport and the port numbers that the client requires.
SETUP rtsp://ssdl.com/test.asf RTSP/1.0
CSeq: 101
Transport: RTP/AVP;unicast;client_port=2301-3202
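Because RTSP requests are plain text in the style of HTTP, a client could assemble a request like the preceding one with a few lines of code. The following is a sketch only: build_rtsp_request is a hypothetical helper, not part of any RTSP library, and it assumes lines end with CRLF and the header block ends with a blank line, as in HTTP.

```python
def build_rtsp_request(method, url, cseq, headers=None):
    """Assemble an RTSP/1.0 request as text.

    method  -- RTSP method name, such as SETUP or PLAY
    url     -- the rtsp:// URL that identifies the content
    cseq    -- sequence number for this request
    headers -- optional mapping of additional header names to values
    """
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # A blank line terminates the header block.
    return "\r\n".join(lines) + "\r\n\r\n"


setup_request = build_rtsp_request(
    "SETUP",
    "rtsp://ssdl.com/test.asf",
    101,
    {"Transport": "RTP/AVP;unicast;client_port=2301-3202"},
)
```

The client would write the resulting string to its TCP connection to the server and then wait for the response.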
The server responds with an RTSP 200 OK that includes the sequence number of the current method and a session identifier for the client to use in subsequent RTSP methods as a reference to the session. Within the Transport: RTSP header, the server confirms the transport (RTP/AVP;unicast) and client ports (client_port), and informs the client of the server ports (server_port) to use for the RTP connection.
RTSP/1.0 200 OK
CSeq: 101
Date: 23 Aug 2005 15:35:06 GMT
Session: 47112344
Transport: RTP/AVP;unicast;client_port=4589-4589;server_port=6256-6257
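To show how a client might extract the session identifier and server ports from such a response, here is a minimal parsing sketch. The function names are hypothetical, and the code assumes a well-formed response with CRLF line endings; a production parser would need more validation.

```python
def parse_rtsp_response(text):
    """Split an RTSP response into (status code, reason, headers)."""
    head = text.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    _version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, so values such as dates stay intact.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers


def parse_transport(value):
    """Break a Transport: header value into profile, flags, and parameters."""
    parts = [part.strip() for part in value.split(";") if part.strip()]
    result = {"profile": parts[0], "flags": [], "params": {}}
    for part in parts[1:]:
        if "=" in part:
            key, _, val = part.partition("=")
            result["params"][key] = val
        else:
            result["flags"].append(part)
    return result


def port_range(text):
    """Convert a 'low-high' port string into a pair of integers."""
    low, _, high = text.partition("-")
    return int(low), int(high or low)


response = (
    "RTSP/1.0 200 OK\r\n"
    "CSeq: 101\r\n"
    "Date: 23 Aug 2005 15:35:06 GMT\r\n"
    "Session: 47112344\r\n"
    "Transport: RTP/AVP;unicast; client_port=4589-4589;server_port=6256-6257\r\n"
    "\r\n"
)
status, reason, headers = parse_rtsp_response(response)
transport = parse_transport(headers["transport"])
server_ports = port_range(transport["params"]["server_port"])
```

The client would keep headers["session"] and include it in every later request for this session.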
By keeping track of session state, a participant may send many RTSP messages over short-lived TCP connections, throughout the timeline of the presentation. By providing the session identifier in every RTSP request or response, an RTSP session can span multiple TCP connections. RTSP supports pipelining as well and works with either unicast or multicast.
The client then sends a PLAY method notifying the server that it should start the RTP session. In this example, the server sends the three streams that are indicated in the SDP file over three independent UDP streams. The client and server use a fourth RTCP TCP stream to synchronize the three streams. The client and server establish these four streams transparently to RTSP. The client includes the time range of the session that it wishes to view in the PLAY method.
PLAY rtsp://ssdl.com/test.asf RTSP/1.0
CSeq: 102
Session: 47112344
Range: npt=1-40
RTSP reuses HTTP Basic and Digest authentication to authenticate users.
The server responds with a 200 OK indicating to the client that it should initiate the three RTP UDP streams and the single RTCP TCP connection to the server.
RTSP/1.0 200 OK
CSeq: 102
When the client wants to stop or pause the session, it sends a TEARDOWN or PAUSE method to the server. Figure 9-4 illustrates the finite state machine (FSM) that RTSP uses to transition between idle, ready, and playing/recording.
Figure 9-4. RTSP State Diagram
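The state transitions can be sketched as a lookup table. The figure itself is not reproduced here, so the exact transition set below is an assumption based on the methods named in this section (SETUP, PLAY, RECORD, PAUSE, and TEARDOWN) and on the idle, ready, and playing/recording states; the names TRANSITIONS and next_state are illustrative.

```python
# (current state, method) -> next state. Assumed transitions for illustration.
TRANSITIONS = {
    ("idle", "SETUP"): "ready",
    ("ready", "PLAY"): "playing",
    ("ready", "RECORD"): "recording",
    ("playing", "PAUSE"): "ready",
    ("recording", "PAUSE"): "ready",
    ("ready", "TEARDOWN"): "idle",
    ("playing", "TEARDOWN"): "idle",
    ("recording", "TEARDOWN"): "idle",
}


def next_state(state, method):
    """Return the state that follows a successful RTSP method."""
    try:
        return TRANSITIONS[(state, method)]
    except KeyError:
        raise ValueError(f"{method} is not valid in the {state} state") from None


# Walk a typical session: set up, play, pause, resume, and tear down.
state = "idle"
for method in ("SETUP", "PLAY", "PAUSE", "PLAY", "TEARDOWN"):
    state = next_state(state, method)
```

Keeping the transitions in a table makes it easy to reject a method that arrives in the wrong state, such as a PLAY before any SETUP.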
RTSP supports cache control mechanisms in a similar manner to HTTP.
If your clients are behind a Cisco PIX firewall performing NAT, you can use the PIX application recognition (fixup) feature to rewrite private NAT addresses in the payload with registered public IP addresses. To enable RTSP fixup, use the command fixup protocol rtsp. If instead your Cisco IOS router is performing NAT, you can use NBAR to recognize and rewrite RTSP. To enable NAT RTSP support on your router, use the ip nat service rtsp port global configuration command.
Fast-Forwarding and Rewinding a Stream with RTSP
RTSP clients can use the Scale: header in conjunction with the PLAY method to indicate the speed and direction in which the server should stream the media to the client. For example, if the client sets the header to Scale: 2, it receives the stream at twice the normal rate. With Scale: 0.5, the server sends the stream at half the normal rate. Negative Scale: values instruct the server to deliver the stream in the reverse direction; that is, the client requests to rewind the stream.
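The meaning of a Scale: value can be summarized in a short sketch. The helper names are hypothetical, and rejecting a value of zero is an assumption made for this example rather than a rule stated in this section.

```python
def interpret_scale(scale):
    """Return (direction, rate multiplier) for a Scale: header value."""
    if scale == 0:
        raise ValueError("this sketch treats Scale: 0 as invalid")
    direction = "reverse" if scale < 0 else "forward"
    return direction, abs(scale)


def scale_header(scale):
    """Format a Scale: header line for a PLAY request."""
    return f"Scale: {scale:g}"
```

For instance, interpret_scale(-2) describes a rewind at twice the normal rate, and scale_header(0.5) produces the header line for half-speed playback.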
The client also uses the Range: header in the PLAY method to seek through various parts of the stream. The Range: header takes three different timestamp types as parameters.
Table 9-6 gives samples of each timestamp type.
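As an illustration of how a server might read a Range: value, the following sketch splits the value into its timestamp type, start, and end. The sample strings use the npt form from the PLAY example in this chapter plus the smpte and clock forms defined in RFC 2326; they are not taken from Table 9-6, and the function names are hypothetical.

```python
def parse_range(value):
    """Split a Range: value into (timestamp type, start, end).

    An omitted start or end is returned as None.
    """
    unit, separator, span = value.partition("=")
    if not separator:
        raise ValueError("Range value must look like type=start-end")
    start, _, end = span.partition("-")
    return unit, start or None, end or None


def npt_seconds(text):
    """Convert an npt time (plain seconds or h:mm:ss form) to seconds."""
    if ":" in text:
        hours, minutes, seconds = text.split(":")
        return int(hours) * 3600 + int(minutes) * 60 + float(seconds)
    return float(text)
```

For the PLAY example in this chapter, parse_range("npt=1-40") yields the type npt with a start of 1 second and an end of 40 seconds into the presentation.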
Using Quality of Service and IP Multicast with Streaming Media
Streaming video normally produces constant data rates with frames arriving at constant intervals. However, as you learned in Chapter 6, "Ensuring Content Delivery with Quality of Service," networks can introduce packet loss, packet delay, and jitter between frame inter-arrival times. To reduce the effect of these network-related issues on your applications, use the following QoS features from Chapter 6 to prioritize your streaming media applications over less-critical applications.
Most streaming applications are uni-directional, so you can enable PIM-SM to scale the application on your network. However, if your application requires bidirectional communication, you can enable Bidir-PIM, as you learned in Chapter 5, "IP Multicast Content Delivery."