You can stream media files live, as a scheduled rebroadcast of a live event, or on-demand. Live streaming involves capturing the event with a video camera, an audio recording system, or both. You can then encode the live event into a digital network-ready format as it occurs and send it over the network to active receivers. Scheduled rebroadcast and on-demand streaming involve encoding the media for storage and later viewing.
Streaming introduces the concept of progressive downloading, in which you view the multimedia session after a momentary delay that allows data buffering to take place, reducing packet jitter. The result is almost immediate playback of the streaming content.
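The buffering idea can be sketched in a few lines: packets may arrive out of order because of jitter, and a short playout buffer lets the player restore sequence order before rendering. This is a toy illustration, not any vendor's implementation; the packet representation is invented.

```python
def reorder(packets):
    """Packets arrive out of order because of network jitter; the player
    buffers them briefly, then plays them back in sequence order."""
    return [seq for seq, _payload in sorted(packets)]

# Packets as (sequence_number, payload) pairs, shown in arrival order
arrived = [(2, "frame2"), (1, "frame1"), (3, "frame3")]
print(reorder(arrived))  # [1, 2, 3]
```

The momentary startup delay the text describes is the time spent filling this buffer before playback begins.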
Figure 9-1 illustrates a basic streaming media solution. The audio-video recorder transmits raw media to the conversion device. You can use either analog or digital video transmission standards to transmit the raw content to the streaming data converter. Common analog standards include composite video (such as Phase Alternation Line [PAL] and National Television Standards Committee [NTSC]), S-Video, and component video. Digital audio-video recording devices commonly use FireWire or USB interfaces into conversion devices. The conversion device then encodes the data, either for storage on a video-on-demand/live origin server or for transmission to a network for live viewing. The origin server can receive the live stream from the data converter and deliver it directly to clients live, or store it for later viewing. You also can configure some streaming data converters to deliver the live stream to clients directly.
Figure 9-1. A Sample Streaming Media Network Environment
PAL and NTSC are the most widely used analog television and motion picture standards and are commonly known as composite video. NTSC is used in North America and PAL in Europe. Séquentiel Couleur à Mémoire (SECAM) is another analog standard, used in France. You choose the format depending on where the content will be viewed.
Raw audio-video from analog recording devices normally originates in one of these three formats. You can convert these formats to digital for delivery over a network using codecs. You also can stream directly from digital audio-video recorders, if you have the appropriate hardware to convert the digital formats into streaming formats.
The encoding format you select, whether analog or digital, dictates playback quality. Make sure that you understand each format in terms of its playback quality before selecting one. Additionally, when encoding your presentation from the recording device, using too few scan lines will ultimately degrade the viewer's experience.
The word codec is short for compressor-decompressor, and sometimes coder-decoder. You can use codecs to translate raw media into compressed streaming media for viewing or listening over a network. Your media player uses the same codec to piece the data back together that you used to compress the data. Some codecs are lossy, discarding detail to save bandwidth, whereas others are lossless, reconstructing the original data exactly; the choice determines the bandwidth required to transport the resulting information.
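The lossless case is easy to demonstrate with a general-purpose compressor: the decoder rebuilds the input bit-for-bit, and redundant media compresses to a fraction of its raw size. This sketch uses Python's zlib purely as an illustration of the lossless round trip; it is not a media codec.

```python
import zlib

# Highly redundant "media" data compresses well without any loss
raw = b"silence " * 1000
packed = zlib.compress(raw)          # lossless compression
restored = zlib.decompress(packed)   # decoder rebuilds the data exactly

assert restored == raw               # lossless: bit-for-bit identical
print(len(raw), len(packed))         # compressed form needs far less bandwidth
```

A lossy audio or video codec instead discards detail the viewer is unlikely to notice, so the original bits cannot be recovered, but the bandwidth savings are far greater.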
You can encode, deliver, and play back streaming content using a streaming media solution of your choice. Streaming media vendors offer you the ability to stream live, scheduled rebroadcast, or on-demand video, audio, and data. Media servers encode the inputs using audio-video hardware capture cards. You can install the capture cards in the streaming data converter server shown in Figure 9-1 to provide the interface between the audio-video recording hardware and the network.
Table 9-1 lists the popular streaming media products and standards.
Windows Media Technologies (WMT) uses its own version of RTSP and a proprietary streaming control protocol called Microsoft Media Services (MMS) to control streaming flows. MMS provides functionality similar to RTSP but uses TCP/UDP port 1755 for control and UDP ports 1024 through 5000 for data transmission.
Raw video content contains rich color, movement detail, and sound quality. As a result, raw uncompressed video can consume a great deal of your server and network resources. Consider a typical television show that produces millions of colors on the screen that change every fraction of a second. A computer that could capture and process these pixels and their changes in real-time would require vast amounts of CPU power, disk storage space, and bandwidth. Therefore, you should use codecs to compress media before network transmission.
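A back-of-envelope calculation shows why raw video overwhelms networks. The figures below (standard-definition frame size, 24-bit color, 30 frames per second) are illustrative assumptions, not values from this text:

```python
# Assumed figures: standard-definition frame, 24-bit color, 30 fps
width, height = 640, 480
bits_per_pixel = 24          # 8 bits each for red, green, and blue
frames_per_second = 30

bits_per_second = width * height * bits_per_pixel * frames_per_second
print(bits_per_second / 1_000_000)   # roughly 221 Mbps of raw video
```

At roughly 221 Mbps for a single uncompressed stream, even a small audience would saturate most networks, which is exactly why compression precedes transmission.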
Stills-based codecs, such as JPEG and GIF, are good at removing unnecessary information within an individual image. Like stills-based codecs, motion-based codecs encode individual frames, but they also remove unnecessary information between frames. For example, MPEG records only changes to images across frames; information that has not changed in the current frame remains the same as in the previous frame. For uncompressed voice, 64 kbps per stream is typical; however, voice codecs such as G.729 can deliver business-class voice at less than 10 kbps per voice stream.
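The record-only-the-changes idea can be sketched with a toy frame-delta coder. This is an illustration of the interframe principle only; real MPEG encoding uses motion estimation and transform coding, not this dictionary scheme.

```python
def encode_delta(prev, curr):
    """Record only the pixels that changed since the previous frame."""
    return {i: v for i, (p, v) in enumerate(zip(prev, curr)) if p != v}

def decode_delta(prev, delta):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(prev)
    for i, v in delta.items():
        frame[i] = v
    return frame

frame1 = [0, 0, 0, 0]
frame2 = [0, 9, 0, 0]               # only one pixel changed
delta = encode_delta(frame1, frame2)
print(delta)                         # {1: 9} -- far smaller than a full frame
assert decode_delta(frame1, delta) == frame2
```

When most of the scene is static, the delta is tiny compared with a full frame, which is where motion codecs win their compression.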
Table 9-2 lists the codecs available for streaming audio-video media. Digital Rights Management (DRM) embeds pay-per-view, pay-per-subscription, and user authentication into the streaming media solution. DRM protection provides a way for organizations to generate revenue and prevent piracy.
Streaming vendors normally use codecs containing licensed or open-source software based on the standards in Table 9-2, or proprietary codec algorithms. Because the codec algorithms in Table 9-2 have become so popular, the number of manufacturers providing commercially available codecs based on them has increased dramatically. To differentiate among codecs, you should refer to their Four Character Code (FourCC). For example, DivX, the popular audio-video codec manufacturer, has numerous codecs, each given a unique FourCC identifier (for example, DX50 and DivX).
Visit fourcc.org for a list of common codec algorithms and their respective FourCC identifiers.
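A FourCC is simply four ASCII characters, commonly stored as a 32-bit integer in file headers. The sketch below packs and unpacks a FourCC; the little-endian byte order matches RIFF/AVI-style headers, which is an assumption about where the tag is stored:

```python
import struct

def fourcc(code: str) -> int:
    """Pack a four-character code into its 32-bit little-endian integer
    form, as stored in RIFF/AVI-style file headers."""
    if len(code) != 4:
        raise ValueError("FourCC must be exactly four characters")
    return struct.unpack("<I", code.encode("ascii"))[0]

def fourcc_name(value: int) -> str:
    """Recover the four-character tag from its integer form."""
    return struct.pack("<I", value).decode("ascii")

tag = fourcc("DX50")
print(hex(tag), fourcc_name(tag))   # 0x30355844 DX50
```

Tools that inspect container files read this integer from the stream header and look it up (for example, on fourcc.org) to identify the codec needed for playback.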
Creating Streaming On-Demand Container Files
Streaming server vendors provide the ability to store the multiple streams of data that form a multimedia session in a single file called a streaming container. These container files are suitable for live or on-demand playback, whether you stream the file over a network or play it locally. The underlying transport protocols (for example, RTP, UDP, and RTSP) of Windows Media and RealNetworks are independent of the container file; the container file specification says nothing about transmitting streams over a network. With Apple QT and MPEG4, the underlying protocols are part of the container file specification. Each vendor has its own proprietary container format, with the exception of MPEG4. MPEG4 is a standard streaming format for organizing multiple audio, video, and other multimedia streams, and is based closely on the Apple QT format.
To stream audio-video media, your solution first compresses and encapsulates the audio-video data. Codecs are available for media compression and decompression. The encapsulation includes application-layer headers containing information on how your client media player should play back the file. The headers include index information, bandwidth rate limits, and the types of media contained in the file. The media may include audio, video, or slideshow presentations. When a player receives the streamable file over a network, it first reads the headers, which instruct the player how to play the chunks of media in the file. Figure 9-2 diagrams a typical container file format.
Figure 9-2. A Typical Streaming Container File
The container file in Figure 9-2 includes the following information.
Solutions using the container format in Figure 9-2 send the application headers using the same underlying transport (for example, RTP) as they use to transport the data packets and index information.
A container can encapsulate audio and data streams for different bit rates and languages. For example, you can encapsulate a presentation containing both French and English 10-kbps audio streams and video streams for bandwidths of 56 kbps, 100 kbps, 500 kbps, and 1 Mbps. You can publish this presentation to a website and allow your users to select their preferred bandwidth and language from the site. The server then sends the headers and the appropriate bandwidth and language streams from the file. When the header arrives, the player can determine the appropriate codecs for rendering the presentation and display the data packets as they arrive from their respective streams. A player can start rendering the audio and video content as soon as it reads the header and at least a single data packet from the network. By design, containers are media-independent. For example, you also can encapsulate audio, video, 3D objects, Macromedia Flash objects, sprites, and text in a container file. Session Description Protocol (SDP) information about the session can also be stored in the container for use during transmission.
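Server-side stream selection can be sketched as follows. The stream table and its fields are invented for illustration; real containers describe their streams in the header formats discussed above.

```python
# Hypothetical stream table for one container: media type, audio
# language, and encoded bit rate in kbps (fields invented for this sketch)
streams = [
    {"type": "audio", "lang": "en", "kbps": 10},
    {"type": "audio", "lang": "fr", "kbps": 10},
    {"type": "video", "lang": None, "kbps": 56},
    {"type": "video", "lang": None, "kbps": 100},
    {"type": "video", "lang": None, "kbps": 500},
    {"type": "video", "lang": None, "kbps": 1000},
]

def select_streams(lang, bandwidth_kbps):
    """Pick the requested audio language plus the highest-rate video
    stream that, together with the audio, fits the viewer's bandwidth."""
    audio = next(s for s in streams
                 if s["type"] == "audio" and s["lang"] == lang)
    fitting = [s for s in streams if s["type"] == "video"
               and s["kbps"] + audio["kbps"] <= bandwidth_kbps]
    video = max(fitting, key=lambda s: s["kbps"])
    return audio, video

audio, video = select_streams("fr", 128)
print(audio["lang"], video["kbps"])   # fr 100
```

Only the selected streams leave the server, so a 56-kbps viewer never pays for the 1-Mbps encoding stored in the same container.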
You can convert your streaming content into a container format for delivery to the respective vendor's streaming player. Alternatively, you can embed a player in a web browser using the vendor's Software Development Kit (SDK). The SDKs include programming interfaces for languages such as C++ and Java.
To provide streaming media to your users, you can insert URLs in your HTML pointing to the streaming container files. Your users' web browsers launch the appropriate media player based on the MIME type of the container file. For example, with WMT, the browser recognizes the ".asf" extension and launches Windows Media Player. Example 9-1 shows how you can insert a WMT file into your HTML.
Example 9-1. A Sample HTML Document Linking Users to a Windows Streaming Media Clip/Container
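The original listing is not reproduced in this excerpt. A minimal sketch, using the helloworld.asf clip and cisco.com/wmt server path mentioned elsewhere in this chapter (the page title is invented), might look like this:

```html
<HTML>
  <HEAD><TITLE>Hello World Streaming Demo</TITLE></HEAD>
  <BODY>
    <!-- The .asf extension causes the browser to launch Windows Media Player -->
    <A HREF="http://cisco.com/wmt/helloworld.asf">Hello World!</A>
  </BODY>
</HTML>
```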
As mentioned previously, containers (or clips) may hold multiple inputs, such as audio, video, and whiteboards. For example, the streaming file helloworld.asf in Example 9-1 is an audiovideo clip. To provide enhanced synchronization of individual streaming clips, you should use meta-files.
Describing Streaming On-Demand Content with Meta-Files
You can use meta-files to describe how to display streaming content in a web browser. Meta-files provide the interface between web browsers and proprietary container files. Instead of embedding the URLs of the media files directly in your HTML, you can point your users to the URL that houses the meta-file. Your user's web browser launches the media player and passes the meta-file to the player. The player reads the contents of the meta-file and performs the appropriate actions on the streaming media clips listed in the meta-file. You can write your meta-files using the Synchronized Multimedia Integration Language (SMIL) or Windows Advanced Stream Redirector (ASX) files. SMIL is an XML-based standard meta-file language. Windows streaming uses ASX files, and both RealNetworks and Apple QuickTime use SMIL. Example 9-2 shows how a browser can open two different WMT clips in sequence using ASX.
Example 9-2. A Sample ASX File for Streaming Two WMT ASF Clips in Sequence
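The original listing is not reproduced in this excerpt. A minimal sketch of an ASX file that plays two clips in sequence might look like the following; the clip names and server path are invented for illustration:

```xml
<ASX VERSION="3.0">
  <TITLE>Two Clips in Sequence</TITLE>
  <!-- Entries play in the order listed -->
  <ENTRY>
    <REF HREF="mms://cisco.com/wmt/clip1.asf"/>
  </ENTRY>
  <ENTRY>
    <REF HREF="mms://cisco.com/wmt/clip2.asf"/>
  </ENTRY>
</ASX>
```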
Example 9-3 shows how you can write a SMIL file to deliver two RealNetworks clips in sequence. You also can use SMIL to include images, text, and graphics in your presentation, not just video clips.
Example 9-3. A Sample SMIL File for Streaming Two RealNetworks RM Clips in Sequence
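The original listing is not reproduced in this excerpt. A minimal sketch, with invented clip names and server path, might look like this; the SMIL <seq> element plays its children one after another:

```xml
<smil xmlns="http://www.w3.org/2001/SMIL20/Language">
  <body>
    <!-- <seq> plays its children in sequence -->
    <seq>
      <video src="rtsp://cisco.com/real/clip1.rm"/>
      <video src="rtsp://cisco.com/real/clip2.rm"/>
    </seq>
  </body>
</smil>
```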
RealNetworks supplies extensions to the standard set of XML tags defined in SMIL 2.0. Some extended features include changing the opacity and color of static images such as JPEG, GIF, and PNG files. To access RealNetworks' custom XML tags, you need to declare the RealNetworks namespace xmlns:rn="http://features.real.com/2001/SMIL20/Extensions".
You must prefix all RealNetworks custom XML tags with this namespace. Refer to Chapter 7, "Presenting and Transforming Content," to understand custom namespaces. The namespace defined with http://www.w3.org/2001/SMIL20/Language is the standard SMIL 2.0 namespace. All elements not defined with the "rn:" prefix will assume this namespace. In your HTML, you would link users to the SMIL file with <A HREF="http://cisco.com/wmt/helloworld.smil">Hello World!</A>.
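As an illustration of these namespace declarations, a SMIL file combining the standard namespace with the RealNetworks extension namespace might look like the following. The rn:mediaOpacity attribute and image name are assumptions for illustration, not confirmed by this text:

```xml
<smil xmlns="http://www.w3.org/2001/SMIL20/Language"
      xmlns:rn="http://features.real.com/2001/SMIL20/Extensions">
  <body>
    <!-- rn:mediaOpacity is a RealNetworks extension attribute,
         shown here as a hypothetical example -->
    <img src="logo.gif" rn:mediaOpacity="50%"/>
  </body>
</smil>
```

Elements and attributes without the "rn:" prefix fall under the default SMIL 2.0 namespace; only the prefixed ones are RealNetworks extensions.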
With SMIL and ASX meta-files you can:
Streaming with Microsoft WMT, Real Networks, and Apple QuickTime
The proprietary RealNetworks and Windows file formats are very similar to the generic container shown previously in Figure 9-2. However, the Apple and MPEG4 file formats are quite different. The QuickTime format does not inherently organize the different media streams of a multimedia session into interleaved data packets for streaming transmission, as the Real and WMT formats do.
The MPEG4 container format is based on QuickTime's container file format.
Apple QT and MPEG4 organize the media streams into virtual packets with the use of hint tracks, thus preserving the continuous nature of the media. During transmission, QT and MPEG4 use the hint tracks to divide the data into packet-sized chunks for transport over RTP/UDP. The hint tracks contain pointers to locations within the media streams. The hint tracks also contain the transmission timing information necessary to rebuild individual streams and to synchronize multiple streams with one another. Unlike RealNetworks and WMT, when an Apple QT or MPEG4 streaming server sends packets over the network, no QuickTime- or MPEG4-related header information is embedded in the packets, only the standard application transport header and payload format information.
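The hint-track mechanism can be modeled as a list of pointers into the continuous media stream. The (offset, length, timestamp) structure below is a simplification invented for this sketch; real hint tracks carry richer RTP packetization data.

```python
def packetize(media: bytes, hints):
    """Cut a continuous media stream into packet payloads using
    hint-track-style pointers (offset, length, timestamp)."""
    packets = []
    for offset, length, timestamp in hints:
        payload = media[offset:offset + length]
        packets.append((timestamp, payload))   # ready for RTP framing
    return packets

media = b"AAAABBBBCCCC"                        # stream stays continuous on disk
hints = [(0, 4, 0), (4, 4, 33), (8, 4, 66)]    # ~30 fps timestamps in ms
for ts, payload in packetize(media, hints):
    print(ts, payload)
```

Because the pointers, not the media itself, define packet boundaries, the stored file remains an unbroken stream that plays locally without any depacketization step.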
The container defined in Figure 9-2 (for Real and WMT) does not specify anything related to lower-level transport. Because Apple and MPEG4 packetize the media on-the-fly, they must store lower-layer transport information in the container.
Vendors develop proprietary versions of streaming protocols, such as Windows MMS, Real RTSP G2, and Real PNA. The vendors develop these to provide additional features, such as stream thinning and failover to HTTP streaming. Stream thinning involves signaling between the client and server to automatically detect decreases in network bandwidth. If the bandwidth decreases, WMT and RealNetworks servers can automatically decrease the transmission rate to the clients. Additionally, the MMS and RTSP protocols generate random transport (TCP or UDP) port numbers for the RTP stream. Because some firewalls may not have the RTP or RTSP/MMS UDP ports open, the MMS and Real RTSP streaming protocols offer proprietary HTTP streaming failover mechanisms, because HTTP is readily available through firewall devices.
Each vendor is striving toward adoption of the MPEG family of transport and signaling protocols. Microsoft and RealNetworks continue to discuss phasing out their proprietary protocols. However, the MPEG standards will require extensions such as stream thinning and HTTP failover before the vendors consider a complete phase-out of their proprietary solutions.
Streaming Moving Picture Experts Group (MPEG)
The Moving Picture Experts Group (MPEG) working group began developing these video standards in the late 1980s to address the digitization and storage of video and audio media. The International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) maintain the MPEG standards documents.
You can consider the MPEG family of standards as residing in the Presentation and Session layers of the OSI reference model.
The following MPEG standards are currently available to you for compressing and decompressing video data.