Chapter 36: Technologies and Standards for Universal Multimedia Access


Anthony Vetro and Hari Kalva
Mitsubishi Electric Research Labs
Murray Hill, NJ, USA
<avetro@merl.com>, <hk168@columbia.edu>

1. Introduction

Three major trends have emerged over the past five years in the way communication, information and entertainment services are provided to consumers: wireless communication, Internet technologies and digital entertainment (audio/video/games). Mobile telephony and Internet technologies have seen tremendous global growth over this period. The scale of this growth and the needs of the market have led to basic Internet services being offered over mobile telephones. The third major trend, digital entertainment, can be seen in the rapid adoption of digital TV and DVD products. Digital entertainment over the Internet has so far consisted mainly of digital music and, to a lesser extent, streaming video and online video games. New standards and technologies are paving the way for applications and services that combine these three trends. A number of technologies have emerged that make it possible to access digital multimedia content over wired and wireless networks on a range of devices with varying capabilities, such as mobile phones, personal computers and television sets. This universal multimedia access (UMA), enabled by new technologies and standards, poses new challenges and requires new solutions. In this chapter we discuss the technologies, standards, and challenges that define and drive UMA.

The main elements of a system providing UMA services are 1) the digital content, 2) the sending terminal, 3) the communication network and 4) the receiving terminal. These few, seemingly simple elements hide a myriad of choices: multiple content formats, video frame rates, bit rates, networks, protocols and receiving terminal capabilities. The operating environment of the service adds a dynamically varying factor that affects the operation of the end-to-end system. Selecting appropriate content formats, networks and terminals, and adapting these elements if and when necessary to deliver multimedia content, is the primary function of UMA services.

Figure 36.1 shows the components of a general purpose multimedia communications system. Rich multimedia content is communicated over communication networks to the receiving terminals. In practice, the multimedia content comes in various encoding formats, such as MPEG-2, MPEG-4, wavelets, JPEG, AAC and AC-3, each at various frame rates and bit rates. The communication networks also have different characteristics, such as bandwidth, bit error rate, latency and packet loss rate, depending on the network infrastructure and load. Likewise, the receiving terminals have different content playback and network capabilities, as well as different user preferences, that affect the type of content that can be played on the terminals. The communication network characteristics, together with the capabilities and current state of the sending and receiving terminals, constitute the content playback environment. The mismatch between the rich multimedia content and the content playback environment is the primary barrier to fulfilling the UMA promise. The adaptation engine is the entity that bridges this mismatch, either by adapting the content to fit the content playback environment or by adapting the content playback environment to accommodate the content.

Figure 36.1: Capability mismatch and adaptation for universal access.

The content playback environment can consist of a number of elements such as routers, switches, wired-to-wireless gateways, relay servers, video server proxies, protocol converters, and media translators. Multimedia content passes through one or more of these elements on its way to the receiving terminal, and each of them offers an opportunity to adapt the content so that it can be played back with acceptable quality of service. The adaptation performed can be either 1) content adaptation or 2) content playback environment adaptation. Content adaptation means adapting the content to fit the playback environment. The nature of the content determines the operations involved in the actual adaptation. For object-based content such as MPEG-4 content, adaptation at the highest level, the content level, involves adding or dropping objects in the content in response to the varying playback environment. See Section 2 for details on MPEG-4 and object-based content representation. Adaptation at a lower level, e.g., the object level, is done by modifying the bit rate, frame rate, resolution or encoding format of the objects in the presentation. For stored content, object-level adaptation generally means a reduction in quality compared with the source; for live content, adaptation can also result in an improvement in quality. Adapting the content playback environment involves acquiring additional resources to handle the content. These resources can be session bandwidth, computing resources at the sending and receiving terminals, or decoders in the receiving terminals; the environment can also be adapted by improving network latency and packet loss. The playback environment can change dynamically, and content adaptation should track the changing environment to deliver content at the best quality possible.
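Content-level adaptation of object-based content can be illustrated with a small sketch. The code below is not taken from MPEG-4 or from any system cited in this chapter; the `MediaObject` fields, the numeric `priority` and the greedy rule are assumptions made purely for illustration. It keeps the most important objects of a scene whose combined rate fits the currently available bandwidth and drops the rest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaObject:
    name: str
    bitrate_kbps: int  # rate needed to deliver this object
    priority: int      # hypothetical field: lower value = more important


def select_objects(objects, budget_kbps):
    """Greedily keep the most important objects that fit the rate budget."""
    kept = []
    used = 0
    for obj in sorted(objects, key=lambda o: o.priority):
        if used + obj.bitrate_kbps <= budget_kbps:
            kept.append(obj)
            used += obj.bitrate_kbps
    return kept


# An illustrative scene made of four objects.
scene = [
    MediaObject("speaker_video", 300, priority=1),
    MediaObject("speech_audio", 64, priority=0),
    MediaObject("background_video", 500, priority=3),
    MediaObject("caption_text", 2, priority=2),
]

if __name__ == "__main__":
    for budget in (1000, 400, 50):
        print(budget, [o.name for o in select_objects(scene, budget)])
```

A greedy pass of this kind is only a heuristic. A real adaptation engine would also have to respect dependencies between objects and could fall back to object-level adaptation (lowering an object's rate) before dropping it entirely.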

While significant work has been published on adapting Web content, in this chapter we emphasize digital video adaptation, as the primary focus of research is digital video communication.

1.1 UMA Infrastructure

The digital audio-visual (AV) services infrastructure has to be augmented to support UMA. The AV infrastructure consists of video servers, proxies, network nodes, wired-to-wireless gateways and additional nodes depending on the delivery network used. This raises the possibility of applying adaptation operations at one or more elements between the sending terminal and the receiving terminal. With UMA applications, new adaptation possibilities as well as additional adaptation locations have to be considered. In the following we consider some of the important aspects of the UMA infrastructure.

1.1.1 Cooperative Playback

The mismatch between the rich multimedia content and small devices such as mobile phones and PDAs can also be addressed using additional receiver-side resources, without having to adapt the content to a lower quality. The resources available to play back the content can include the resources available in the device's operating environment. The receiving terminal can cooperatively play the rich content together with the devices in its immediate proximity. For example, when a high quality MPEG-2 video is to be played, the receiving PDA can identify a device in its current environment that can play the MPEG-2 video and coordinate the playback of the content on that device. Emerging technologies such as Bluetooth and UPnP make device discovery in the receiving terminal's environment easier. In [2] Pham et al. propose a situated computing framework that employs the capabilities of the devices in the receiver's environment to play back multimedia content on small-screen devices. Cooperative playback raises another important issue: session migration between clients. Depending on the session requirements and the devices in a receiving terminal's environment, sessions may have to be migrated from one client to another. An architecture for application session handoff in receivers is presented in [5,6]. That work primarily addresses applications involving image and data retrieval from databases. Application session handoff involving multimedia and video-intensive applications poses new challenges. The problem becomes more complex when user interaction is involved and when content adaptation is performed at intermediate nodes.

1.1.2 Real-time Transcoding vs Multiple Content Variants

An alternative to real-time transcoding is creating content in multiple formats and at multiple bitrates and making an appropriate selection at delivery time. This solution is not compute-intensive but is highly storage-intensive. Smith et al. propose an InfoPyramid approach to enable UMA [7]. The approach is based on creating content with different fidelities and modalities and selecting the appropriate content based on a receiver's capabilities. A tool that supports this functionality within the context of MPEG-7 is described further in Section 2.3.2. With so many encoding formats, bitrates, and resolutions possible, though, storing content in multiple formats at multiple bitrates becomes impractical for certain applications. However, for cases in which the types of receiving terminals are limited, storing multiple variants of the media is sufficient. For the general case, where no limitations exist on the terminal or access network, real-time media transcoding is necessary to support a flexible content adaptation framework.
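Selecting among stored variants at delivery time can be sketched as a simple filter followed by a ranking. The following is a minimal illustration only, not the InfoPyramid method of [7] or the MPEG-7 tool mentioned above; the `Variant` and `Terminal` fields, the codec labels and the rule "highest bitrate that still fits" are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    codec: str
    width: int
    height: int
    bitrate_kbps: int


@dataclass(frozen=True)
class Terminal:
    codecs: frozenset      # codecs the terminal can decode
    max_width: int
    max_height: int
    bandwidth_kbps: int    # currently available network rate


def pick_variant(variants, terminal) -> Optional[Variant]:
    """Return the highest-bitrate stored variant the terminal can play, if any."""
    playable = [
        v for v in variants
        if v.codec in terminal.codecs
        and v.width <= terminal.max_width
        and v.height <= terminal.max_height
        and v.bitrate_kbps <= terminal.bandwidth_kbps
    ]
    return max(playable, key=lambda v: v.bitrate_kbps, default=None)


# Illustrative stored variants of one piece of content.
variants = [
    Variant("mpeg2", 720, 480, 4000),
    Variant("mpeg4", 352, 288, 384),
    Variant("mpeg4", 176, 144, 64),
]

phone = Terminal(frozenset({"mpeg4"}), 176, 144, 128)
desktop = Terminal(frozenset({"mpeg2", "mpeg4"}), 1024, 768, 5000)

if __name__ == "__main__":
    print(pick_variant(variants, phone))
    print(pick_variant(variants, desktop))
```

When `pick_variant` returns `None`, no stored variant fits the terminal, which is the situation in which real-time transcoding becomes necessary.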

In the case of video streams, the encoding rate has to be reduced to meet certain network bandwidth constraints. The desired bitrate can be achieved in three ways: by reducing the spatial resolution, by reducing the frame rate, or by directly reducing the rate of the coded stream at the original resolution and frame rate (for example, through coarser quantization). The perceived quality of the delivered video depends on the target application as well as on the capabilities of the target device, such as the receiving terminal's display capability. The mode of adaptation chosen is also influenced by the resources available to perform the adaptation. Frame dropping has the lowest complexity of the three. In general, the complexity of rate reduction is lower than that of resolution reduction. Determining the choice that maximizes the quality of the delivered video is difficult and remains an open research issue.
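Frame dropping is the easiest of the three modes to illustrate. The sketch below assumes an MPEG-style group of pictures written as a string of frame types, and it assumes that B-frames are never used as references by other frames (true of MPEG-2, but an assumption for other codecs), so they can be discarded without re-encoding. The function names are illustrative.

```python
def drop_b_frames(frame_types):
    """Return the indices of the frames kept after discarding all B-frames."""
    return [i for i, t in enumerate(frame_types) if t != "B"]


def reduced_frame_rate(source_fps, frame_types):
    """Frame rate that results from dropping every B-frame in the pattern."""
    kept = drop_b_frames(frame_types)
    return source_fps * len(kept) / len(frame_types)


if __name__ == "__main__":
    gop = "IBBPBBPBBPBB"  # one I-frame, three P-frames, eight B-frames
    print(drop_b_frames(gop))           # indices of the I- and P-frames
    print(reduced_frame_rate(30, gop))  # a third of the frames survive
```

Because only whole frames are removed, no decoding or re-encoding is involved, which is why this mode is the least complex. The resulting bitrate is not reduced by the same factor as the frame rate, since the I- and P-frames that remain typically carry more bits per frame than the B-frames that were dropped.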

1.1.3 Adaptation in Wireless Networks

The freedom of untethered connected devices is making wireless and mobile networks popular for commercial as well as consumer applications. The emergence of the 802.11 wireless Ethernet standards is creating new possibilities while at the same time posing new challenges for multimedia access. The mobile network topology offers additional locations in the network at which to perform content adaptation. The wired-to-wireless gateway is one location at which to adapt the content to suit the wireless network and the mobile device. The mobility of the receiver poses new challenges for content adaptation, because mobile users move between different wireless networks. Receiver handoff becomes difficult when a content adaptation session is in progress. The gateways in question have to communicate the state of the adaptation session in order to deliver the content seamlessly. The problem becomes more difficult with content that includes multiple media objects that have to be transcoded. Roy et al. [8] discuss an approach to handing off media transcoding sessions. Applying content adaptation operations inside a mobile network may prove to be costly because of the cost of session migration and the cost of computing resources for adaptation.
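The need for gateways to exchange adaptation state can be made concrete with a small sketch. This is not the approach of Roy et al. [8]; the `TranscodingSession` fields, the example URL and the use of JSON as the exchange format are hypothetical and were chosen only to show the kind of information that would have to travel from the old gateway to the new one.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class TranscodingSession:
    # All fields are hypothetical examples of per-session adaptation state.
    session_id: str
    content_url: str
    target_codec: str
    target_bitrate_kbps: int
    next_frame: int  # first frame the new gateway should deliver


def export_state(session):
    """Serialize the session so the old gateway can send it to the new one."""
    return json.dumps(asdict(session))


def import_state(payload):
    """Rebuild the session at the new gateway from the received payload."""
    return TranscodingSession(**json.loads(payload))


if __name__ == "__main__":
    old = TranscodingSession("s-42", "rtsp://example.invalid/clip", "mpeg4", 128, 915)
    resumed = import_state(export_state(old))
    print(resumed == old)
```

A real handoff would also have to transfer or rebuild decoder and encoder state, such as reference frames and rate-control buffers, which is part of what makes session migration costly.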

1.1.4 P2P Networks

Peer-to-peer (P2P) networking was made popular by Gnutella and Napster, the popular software programs for file sharing on the Internet. The main reason for the popularity of these P2P applications was the availability of desirable content: music and movies. P2P networks for content distribution have the potential to alleviate the bandwidth bottlenecks suffered by content distribution networks. Content can be distributed from the closest peers to localize the traffic and minimize bandwidth bottlenecks in the content distribution network. However, P2P networks are still in their infancy and a number of issues are yet to be addressed in P2P content distribution networks. Since P2P networks are possible primarily through voluntary user participation, one of the issues to address is source migration when a user who is currently the source for a media distribution session withdraws. In such cases, the session has to be migrated to a new source with little or no disruption to the end user. The CoopNet project discussed in [3] proposes the use of multiple description coding in conjunction with a central server to address the issue of source migration. Another important issue in P2P networking is content discovery. The easiest solution is a central server that indexes the content available on the P2P network. With content adaptation and user preferences in place, multiple variants of the same content will be present on the network. Managing the content variations and delivering the appropriate content to the users is a difficult problem. The simplest cases for P2P delivery are simple audio/video streaming applications. With multi-object and interactive applications, content distribution as well as source migration becomes problematic. The server functionality required to support interactive content makes P2P networks unsuitable for delivering interactive content. Similar arguments hold for content adaptation operations in a P2P network.

When multiple variants of the content are available on the P2P network, the problem becomes one of locating an appropriate variant to fit the receiver capabilities. The Digital Item Declaration Language, specified as Part 2 of the MPEG-21 standard, is intended to address this problem. Essentially, a language has been defined that can be used to structure and describe variants of the multimedia content and allow one to customize the delivery and/or presentation of the content to a particular receiver or network.




Handbook of Video Databases. Design and Applications
ISBN: 084937006X
Year: 2003
Pages: 393
