You can stream media files live, as a scheduled rebroadcast of a live event, or on-demand. Live streaming involves capturing the event with a video camera, an audio recording system, or both. You can then encode the live event into a digital network-ready format as it occurs and send it over the network to active receivers. Scheduled rebroadcast and on-demand streaming involve encoding the media for storage and later viewing.
Streaming introduces the concept of progressive downloading, in which you view the multimedia session after a momentary delay that allows data buffering to take place, reducing packet jitter. The result is almost immediate playback of the streaming content.
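The buffering idea can be sketched in a few lines: packets may arrive out of order because of jitter, and a short playout buffer lets the player restore sequence order before rendering. This is a toy illustration, not any vendor's implementation; the packet representation is invented.

```python
def reorder(packets):
    """Packets arrive out of order because of network jitter; the player
    buffers them briefly, then plays them back in sequence order."""
    return [seq for seq, _payload in sorted(packets)]

# Packets as (sequence_number, payload) pairs, shown in arrival order
arrived = [(2, "frame2"), (1, "frame1"), (3, "frame3")]
print(reorder(arrived))  # [1, 2, 3]
```

The momentary startup delay the text describes is the time spent filling this buffer before playback begins.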
Figure 9-1 illustrates a basic streaming media solution. The audio-video recorder transmits raw media to the conversion device. You can use either analog or digital video transmission standards to transmit the raw content to the streaming data converter. Common analog standards include composite video (such as Phase Alternation Line [PAL] and National Television Standards Committee [NTSC]), S-Video, and component video. Digital audio-video recording devices commonly use FireWire or USB interfaces into conversion devices. The conversion device then encodes the data, either for storage on a video-on-demand/live origin server or for transmission to a network for live viewing. The origin server can receive the live stream from the data converter and deliver it directly to clients live, or store it for later viewing. You also can configure some streaming data converters to deliver the live stream to clients directly.
Figure 9-1. A Sample Streaming Media Network Environment
PAL and NTSC are the most widely used analog television and motion picture standards and are commonly known as composite video. NTSC is used in North America and PAL in Europe. Séquentiel Couleur à Mémoire (SECAM) is another analog standard, used in France. You choose the format depending on where the content will be viewed.
Raw audio-video from analog recording devices normally originates in one of these three formats. You can convert these formats to digital for delivery over a network using codecs. You also can stream directly from digital audio-video recorders, if you have the appropriate hardware to convert the digital formats into streaming formats.
The encoding format you select, whether analog or digital, dictates playback quality. Make sure that you understand each format in terms of its playback quality before selecting one. Additionally, when encoding your presentation from the recording device, using too few scan lines will ultimately degrade the viewer's experience.
The word codec is short for compressor-decompressor, and sometimes coder-decoder. You can use codecs to translate raw media into compressed streaming media for viewing or listening over a network. Your media player uses the same codec to piece the data back together that you used to compress the data. Some codecs are lossy, discarding detail to save bandwidth, whereas others are lossless, reconstructing the original data exactly; the choice determines the bandwidth required to transport the resulting information.
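The lossless case is easy to demonstrate with a general-purpose compressor: the decoder rebuilds the input bit-for-bit, and redundant media compresses to a fraction of its raw size. This sketch uses Python's zlib purely as an illustration of the lossless round trip; it is not a media codec.

```python
import zlib

# Highly redundant "media" data compresses well without any loss
raw = b"silence " * 1000
packed = zlib.compress(raw)          # lossless compression
restored = zlib.decompress(packed)   # decoder rebuilds the data exactly

assert restored == raw               # lossless: bit-for-bit identical
print(len(raw), len(packed))         # compressed form needs far less bandwidth
```

A lossy audio or video codec instead discards detail the viewer is unlikely to notice, so the original bits cannot be recovered, but the bandwidth savings are far greater.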
You can encode, deliver, and play back streaming content using a streaming media solution of your choice. Streaming media vendors offer you the ability to stream live, scheduled rebroadcast, or on-demand video, audio, and data. Media servers encode the inputs using audio-video hardware capture cards. You can install the capture cards in the streaming data converter server shown in Figure 9-1 to provide the interface between the audio-video recording hardware and the network.
Table 9-1 lists the popular streaming media products and standards.
Windows Media Technologies (WMT) uses its own version of RTSP and a proprietary streaming control protocol called Microsoft Media Services (MMS) to control streaming flows. MMS provides functionality similar to RTSP but uses TCP/UDP port 1755 for control and UDP ports 1024 through 5000 for data transmission.
Raw video content contains rich color, movement detail, and sound quality. As a result, raw uncompressed video can consume a great deal of your server and network resources. Consider a typical television show that produces millions of colors on the screen that change every fraction of a second. A computer that could capture and process these pixels and their changes in real-time would require vast amounts of CPU power, disk storage space, and bandwidth. Therefore, you should use codecs to compress media before network transmission.
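A back-of-envelope calculation shows why raw video overwhelms networks. The figures below (standard-definition frame size, 24-bit color, 30 frames per second) are illustrative assumptions, not values from this text:

```python
# Assumed figures: standard-definition frame, 24-bit color, 30 fps
width, height = 640, 480
bits_per_pixel = 24          # 8 bits each for red, green, and blue
frames_per_second = 30

bits_per_second = width * height * bits_per_pixel * frames_per_second
print(bits_per_second / 1_000_000)   # roughly 221 Mbps of raw video
```

At roughly 221 Mbps for a single uncompressed stream, even a small audience would saturate most networks, which is exactly why compression precedes transmission.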
Stills-based codecs, such as JPEG and GIF, are good at removing unnecessary information within an individual image. Like stills-based codecs, motion-based codecs encode individual frames, but they also remove unnecessary information between frames. For example, MPEG records only changes to images across frames; information that has not changed in the current frame remains the same as in the previous frame. For uncompressed voice, 64 kbps per stream is typical; however, voice codecs such as G.729 can deliver business-class voice at less than 10 kbps per voice stream.
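The record-only-the-changes idea can be sketched with a toy frame-delta coder. This is an illustration of the interframe principle only; real MPEG encoding uses motion estimation and transform coding, not this dictionary scheme.

```python
def encode_delta(prev, curr):
    """Record only the pixels that changed since the previous frame."""
    return {i: v for i, (p, v) in enumerate(zip(prev, curr)) if p != v}

def decode_delta(prev, delta):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(prev)
    for i, v in delta.items():
        frame[i] = v
    return frame

frame1 = [0, 0, 0, 0]
frame2 = [0, 9, 0, 0]               # only one pixel changed
delta = encode_delta(frame1, frame2)
print(delta)                         # {1: 9} -- far smaller than a full frame
assert decode_delta(frame1, delta) == frame2
```

When most of the scene is static, the delta is tiny compared with a full frame, which is where motion codecs win their compression.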
Table 9-2 lists the codecs available for streaming audio-video media. Digital Rights Management (DRM) embeds pay-per-view, pay-per-subscription, and user authentication into the streaming media solution. DRM protection provides a way for organizations to generate revenue and prevent piracy.
Streaming vendors normally use codecs containing licensed or open-source software based on the standards in Table 9-2, or proprietary codec algorithms. Because the codec algorithms in Table 9-2 have become so popular, the number of manufacturers providing commercially available codecs based on them has increased dramatically. To differentiate among codecs, you should refer to their Four Character Code (FourCC). For example, DivX, the popular audio-video codec manufacturer, has numerous codecs, each given a unique FourCC identifier (for example, DX50 and DivX).
Visit fourcc.org for a list of common codec algorithms and their respective FourCC identifiers.
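A FourCC is simply four ASCII characters, commonly stored as a 32-bit integer in file headers. The sketch below packs and unpacks a FourCC; the little-endian byte order matches RIFF/AVI-style headers, which is an assumption about where the tag is stored:

```python
import struct

def fourcc(code: str) -> int:
    """Pack a four-character code into its 32-bit little-endian integer
    form, as stored in RIFF/AVI-style file headers."""
    if len(code) != 4:
        raise ValueError("FourCC must be exactly four characters")
    return struct.unpack("<I", code.encode("ascii"))[0]

def fourcc_name(value: int) -> str:
    """Recover the four-character tag from its integer form."""
    return struct.pack("<I", value).decode("ascii")

tag = fourcc("DX50")
print(hex(tag), fourcc_name(tag))   # 0x30355844 DX50
```

Tools that inspect container files read this integer from the stream header and look it up (for example, on fourcc.org) to identify the codec needed for playback.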
Creating Streaming On-Demand Container Files
Streaming server vendors provide the ability to store the multiple streams of data that form a multimedia session in a single file called a streaming container. These container files are suitable for live or on-demand playback, whether you stream the file over a network or play it locally. The underlying transport protocols (for example, RTP, UDP, and RTSP) of Windows Media and RealNetworks are independent of the container file; the container file specification says nothing about transmitting streams over a network. With Apple QT and MPEG4, the underlying protocols are part of the container file specification. Each vendor has its own proprietary container format, with the exception of MPEG4. MPEG4 is a standard streaming format for organizing multiple audio, video, and other multimedia streams, and is based closely on the Apple QT format.
To stream audio-video media, your solution first compresses and encapsulates the audio-video data. Codecs are available for media compression and decompression. The encapsulation includes application-layer headers containing information on how your client media player should play back the file. The headers include index information, bandwidth rate limits, and the types of media contained in the file. The media may include audio, video, or slideshow presentations. When a player receives the streamable file over a network, it first reads the headers, which instruct the player how to play the chunks of media in the file. Figure 9-2 diagrams a typical container file format.
Figure 9-2. A Typical Streaming Container File
The container file in Figure 9-2 includes the following information.
Solutions using the container format in Figure 9-2 send the application headers using the same underlying transport (for example, RTP) as they use to transport the data packets and index information.
A container can encapsulate audio and data streams for different bit rates and languages. For example, you can encapsulate a presentation containing both French and English 10-kbps audio streams and video streams for bandwidths of 56 kbps, 100 kbps, 500 kbps, and 1 Mbps. You can publish this presentation to a website and allow your users to select their preferred bandwidth and language from the site. The server then sends the headers and the appropriate bandwidth and language streams from the file. When the header arrives, the player can determine the appropriate codecs for rendering the presentation and display the data packets as they arrive from their respective streams. A player can start rendering the audio and video content as soon as it reads the header and at least a single data packet from the network. By design, containers are media-independent. For example, you also can encapsulate audio, video, 3D objects, Macromedia Flash objects, sprites, and text in a container file. Session Description Protocol (SDP) information about the session can also be stored in the container for use during transmission.
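Server-side stream selection can be sketched as follows. The stream table and its fields are invented for illustration; real containers describe their streams in the header formats discussed above.

```python
# Hypothetical stream table for one container: media type, audio
# language, and encoded bit rate in kbps (fields invented for this sketch)
streams = [
    {"type": "audio", "lang": "en", "kbps": 10},
    {"type": "audio", "lang": "fr", "kbps": 10},
    {"type": "video", "lang": None, "kbps": 56},
    {"type": "video", "lang": None, "kbps": 100},
    {"type": "video", "lang": None, "kbps": 500},
    {"type": "video", "lang": None, "kbps": 1000},
]

def select_streams(lang, bandwidth_kbps):
    """Pick the requested audio language plus the highest-rate video
    stream that, together with the audio, fits the viewer's bandwidth."""
    audio = next(s for s in streams
                 if s["type"] == "audio" and s["lang"] == lang)
    fitting = [s for s in streams if s["type"] == "video"
               and s["kbps"] + audio["kbps"] <= bandwidth_kbps]
    video = max(fitting, key=lambda s: s["kbps"])
    return audio, video

audio, video = select_streams("fr", 128)
print(audio["lang"], video["kbps"])   # fr 100
```

Only the selected streams leave the server, so a 56-kbps viewer never pays for the 1-Mbps encoding stored in the same container.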
You can convert your streaming content into a container format for delivery to the respective vendor's streaming player. Alternatively, you can embed a player in a web browser using the vendor's Software Development Kit (SDK). The SDKs include programming interfaces for languages such as C++ and Java.
To provide streaming media to your users, you can insert URLs in your HTML pointing to the streaming container files. Your users' web browsers launch the appropriate media player based on the MIME type of the container file. For example, with WMT, the browser recognizes the ".asf" extension and launches Windows Media Player. Example 9-1 shows how you can insert a WMT file into your HTML.
Example 9-1. A Sample HTML Document Linking Users to a Windows Streaming Media Clip/Container
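The original listing is not reproduced in this excerpt. A minimal sketch, using the helloworld.asf clip and cisco.com/wmt server path mentioned elsewhere in this chapter (the page title is invented), might look like this:

```html
<HTML>
  <HEAD><TITLE>Hello World Streaming Demo</TITLE></HEAD>
  <BODY>
    <!-- The .asf extension causes the browser to launch Windows Media Player -->
    <A HREF="http://cisco.com/wmt/helloworld.asf">Hello World!</A>
  </BODY>
</HTML>
```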
As mentioned previously, containers (or clips) may hold multiple inputs, such as audio, video, and whiteboards. For example, the streaming file helloworld.asf in Example 9-1 is an audiovideo clip. To provide enhanced synchronization of individual streaming clips, you should use meta-files.
Describing Streaming On-Demand Content with Meta-Files
You can use meta-files to describe how to display streaming content in a web browser. Meta-files provide the interface between web browsers and proprietary container files. Instead of embedding the URLs of the media files directly in your HTML, you can point your users to the URL that houses the meta-file. Your user's web browser launches the media player and passes the meta-file to the player. The player reads the contents of the meta-file and performs the appropriate actions on the streaming media clips listed in the meta-file. You can write your meta-files using the Synchronized Multimedia Integration Language (SMIL) or Windows Advanced Stream Redirector (ASX) files. SMIL is an XML-based standard meta-file language. Windows streaming uses ASX files, and both RealNetworks and Apple QuickTime use SMIL. Example 9-2 shows how a browser can open two different WMT clips in sequence using ASX.
Example 9-2. A Sample ASX File for Streaming Two WMT ASF Clips in Sequence
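The original listing is not reproduced in this excerpt. A minimal sketch of an ASX file that plays two clips in sequence might look like the following; the clip names and server path are invented for illustration:

```xml
<ASX VERSION="3.0">
  <TITLE>Two Clips in Sequence</TITLE>
  <!-- Entries play in the order listed -->
  <ENTRY>
    <REF HREF="mms://cisco.com/wmt/clip1.asf"/>
  </ENTRY>
  <ENTRY>
    <REF HREF="mms://cisco.com/wmt/clip2.asf"/>
  </ENTRY>
</ASX>
```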
Example 9-3 shows how you can write a SMIL file to deliver two RealNetworks clips in sequence. You also can use SMIL to include images, text, and graphics in your presentation, not just video clips.
Example 9-3. A Sample SMIL File for Streaming Two RealNetworks RM Clips in Sequence
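The original listing is not reproduced in this excerpt. A minimal sketch, with invented clip names and server path, might look like this; the SMIL <seq> element plays its children one after another:

```xml
<smil xmlns="http://www.w3.org/2001/SMIL20/Language">
  <body>
    <!-- <seq> plays its children in sequence -->
    <seq>
      <video src="rtsp://cisco.com/real/clip1.rm"/>
      <video src="rtsp://cisco.com/real/clip2.rm"/>
    </seq>
  </body>
</smil>
```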
RealNetworks supplies extensions to the standard set of XML tags defined in SMIL 2.0. Some extended features include changing the opacity and color of static images such as JPEG, GIF, and PNG files. To access RealNetworks' custom XML tags, you need to declare the RealNetworks namespace xmlns:rn="http://features.real.com/2001/SMIL20/Extensions".
You must prefix all RealNetworks custom XML tags with this namespace. Refer to Chapter 7, "Presenting and Transforming Content," to understand custom namespaces. The namespace defined with http://www.w3.org/2001/SMIL20/Language is the standard SMIL 2.0 namespace. All elements not defined with the "rn:" prefix will assume this namespace. In your HTML, you would link users to the SMIL file with <A HREF="http://cisco.com/wmt/helloworld.smil">Hello World!</A>.
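As an illustration of these namespace declarations, a SMIL file combining the standard namespace with the RealNetworks extension namespace might look like the following. The rn:mediaOpacity attribute and image name are assumptions for illustration, not confirmed by this text:

```xml
<smil xmlns="http://www.w3.org/2001/SMIL20/Language"
      xmlns:rn="http://features.real.com/2001/SMIL20/Extensions">
  <body>
    <!-- rn:mediaOpacity is a RealNetworks extension attribute,
         shown here as a hypothetical example -->
    <img src="logo.gif" rn:mediaOpacity="50%"/>
  </body>
</smil>
```

Elements and attributes without the "rn:" prefix fall under the default SMIL 2.0 namespace; only the prefixed ones are RealNetworks extensions.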
With SMIL and ASX meta-files you can:
Streaming with Microsoft WMT, Real Networks, and Apple QuickTime
The proprietary RealNetworks and Windows file formats are very similar to the generic container shown previously in Figure 9-2. However, the Apple and MPEG4 file formats are quite different. The QuickTime format does not inherently organize the different media streams of a multimedia session into interleaved data packets for streaming transmission, as the Real and WMT formats do.
The MPEG4 container format is based on QuickTime's container file format.
Apple QT and MPEG4 organize the media streams into virtual packets with the use of hint tracks, thus preserving the continuous nature of the media. During transmission, QT and MPEG4 use the hint tracks to divide the data into packet-sized chunks for transport over RTP/UDP. The hint tracks contain pointers to locations within the media streams. The hint tracks also contain the transmission timing information necessary to rebuild individual streams and to synchronize multiple streams with one another. Unlike RealNetworks and WMT, when an Apple QT or MPEG4 streaming server sends packets over the network, no QuickTime- or MPEG4-related header information is embedded in the packets, only the standard application transport header and payload format information.
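The hint-track mechanism can be modeled as a list of pointers into the continuous media stream. The (offset, length, timestamp) structure below is a simplification invented for this sketch; real hint tracks carry richer RTP packetization data.

```python
def packetize(media: bytes, hints):
    """Cut a continuous media stream into packet payloads using
    hint-track-style pointers (offset, length, timestamp)."""
    packets = []
    for offset, length, timestamp in hints:
        payload = media[offset:offset + length]
        packets.append((timestamp, payload))   # ready for RTP framing
    return packets

media = b"AAAABBBBCCCC"                        # stream stays continuous on disk
hints = [(0, 4, 0), (4, 4, 33), (8, 4, 66)]    # ~30 fps timestamps in ms
for ts, payload in packetize(media, hints):
    print(ts, payload)
```

Because the pointers, not the media itself, define packet boundaries, the stored file remains an unbroken stream that plays locally without any depacketization step.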
The container defined in Figure 9-2 (for Real and WMT) does not specify anything related to lower-level transport. Because Apple and MPEG4 packetize the media on-the-fly, they must store lower-layer transport information in the container.
Vendors develop proprietary versions of streaming protocols, such as Windows MMS, Real RTSP G2, and Real PNA. The vendors develop these to provide additional features, such as stream thinning and failover to HTTP streaming. Stream thinning involves signaling between the client and server to automatically detect decreases in network bandwidth. If the bandwidth decreases, WMT and RealNetworks servers can automatically decrease the transmission rate to the clients. Additionally, the MMS and RTSP protocols generate random transport (TCP or UDP) port numbers for the RTP stream. Because some firewalls may not have the RTP or RTSP/MMS UDP ports open, the MMS and Real RTSP streaming protocols offer proprietary HTTP streaming failover mechanisms, because HTTP is readily available through firewall devices.
Each vendor is striving toward adoption of the MPEG family of transport and signaling protocols. Microsoft and RealNetworks continue to discuss phasing out their proprietary protocols. However, the MPEG standards will require extensions such as stream thinning and HTTP failover before the vendors consider a complete phase-out of their proprietary solutions.
Streaming Moving Picture Experts Group (MPEG)
The Moving Picture Experts Group (MPEG) working group began developing these video standards in the late 1980s to address the digitization and storage of video and audio media. The International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) maintain the MPEG standards documents.
You can consider the MPEG family of standards as residing in the Presentation and Session layers of the OSI reference model.
The following MPEG standards are currently available to you for compressing and decompressing video data.