Chapter 1: Introduction




Overview

Internet multimedia communication, such as video- and audio-on-demand, video conferencing, and distance e-learning, lets us experience multimedia at the desk. Video mobile phones will, in the near future, let us create short video clips, send them to a partner's mobile phone or to a personal database at home, and play videos that we receive from a friend. These systems are peer-to-peer: terminal clients become servers and vice versa. To realize the vision of peer-to-peer multimedia systems with heterogeneous terminals, many technical problems need to be solved. These problems concern all layers of a multimedia system; for instance, the multimedia communication system, the distributed control system, and the storage and retrieval systems. Strongly related to this vision is the enhancement of a multimedia system with metadata. Metadata describe the multimedia content and can be semantic descriptions, such as which persons appear in a video clip; information on the color characteristics of a video, such as the dominant color in an image; or information on how a video might be adapted if resources become scarce.

The emerging International Organization for Standardization and International Electrotechnical Commission (ISO-IEC) Moving Picture Experts Group (MPEG)-7 metadata standard, which proposes description schemes for multimedia, helps us to find multimedia by content. Imagine that you are listening to a song on the radio and cannot remember its title (see Exhibit 1.1). Using your mobile phone, you can record 10 seconds of the song and then submit the recording to an audio recognition service (e.g., the Music Scout service from net mobile),[1] from which you will very probably get a prompt and positive content identification via Short Message Service (SMS). How does this work?

Exhibit 1.1: Looking for the title of a song.


The Fraunhofer Institute proposed an audio signature technology (integrated into Music Scout, for instance) that automatically identifies audio material through comparison with previously registered audio content.[2] More than 15,000 so-called audio fingerprints have been recorded and are ready to be searched. The search is tolerant to linear distortion (e.g., filtering and band limiting) and to nonlinear distortion (including that introduced by the codec used: MP3 [MPEG Audio Layer-3] or MPEG-2/4 Advanced Audio Coding [AAC]). It relies on the spectral flatness properties in several frequency bands to identify the unique AudioSignature of a piece of music. The Fraunhofer Institute introduced this AudioSignature into the MPEG-7 standard, which allows not only the identification of the song but also retrieval of information on the melody, the sound timbres, the recording history, and so on.
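To make the notion of spectral flatness concrete, the following minimal sketch computes it for one frame of audio as the ratio of the geometric mean to the arithmetic mean of the power spectrum within a frequency band. It is an illustration only, not the Fraunhofer algorithm or the MPEG-7 AudioSignature definition; the function names, the naive DFT, the band edges, and the test signals are all assumptions made for this example.

```python
import cmath
import math


def power_spectrum(samples):
    """Naive DFT power spectrum for bins 0..N/2 (adequate for short frames)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_flatness(powers, eps=1e-12):
    """Geometric mean divided by arithmetic mean of the power values in one band."""
    vals = [p + eps for p in powers]  # eps avoids log(0) for empty bins
    geometric = math.exp(sum(math.log(v) for v in vals) / len(vals))
    arithmetic = sum(vals) / len(vals)
    return geometric / arithmetic


def band_flatness(samples, band_edges):
    """Flatness per band; band_edges are hypothetical DFT-bin boundaries."""
    spectrum = power_spectrum(samples)
    return [spectral_flatness(spectrum[lo:hi]) for lo, hi in zip(band_edges, band_edges[1:])]


# A pure tone concentrates its energy in one bin, so its flatness is close to 0;
# a single impulse has a flat spectrum, so its flatness is close to 1.
tone = [math.sin(2 * math.pi * 8 * t / 64) for t in range(64)]
impulse = [1.0] + [0.0] * 63
tone_flatness = band_flatness(tone, [4, 12])[0]
impulse_flatness = band_flatness(impulse, [4, 12])[0]
```

Values near 0 indicate tonal content and values near 1 indicate noise-like content. A fingerprint could, under these assumptions, be formed from the sequence of per-band flatness values over successive frames.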

Using multimedia metadata effectively in a distributed multimedia system requires a multimedia database for managing, storing, searching, and delivering metadata. The metadatabase needs to know the storage location of the media data and the metadata, and it should match usage descriptions against media resource requirements. Exhibit 1.2 shows a possible distributed architecture and the main players in a multimedia system and database. Often referred to as N-tier architecture in the distributed database literature,[3] this system includes a Web server as an entry tier for user requests. Normally, the Web server handles authentication issues, which may be delegated to an authentication server. At the same time, the Web server is the front end to the multimedia database. The multimedia database is the "master" of the metadata, and the multimedia storage and streaming server is the "master" of the media data. Both sets of data are strongly related, and a communication protocol has to be set up among these components. Such protocols can be tightly coupled, as is possible, for instance, with the Oracle products. The communication link between the server elements is, in general, bidirectional, as pointed out in Exhibit 1.2. For instance, if a user inserts a video into the media storage server, the database has to be updated with its metainformation. Conversely, if the database recognizes (possibly from user input) that a video is outdated, it has to instruct the media storage server to delete it.

Exhibit 1.2: Components and information flow of a typical distributed multimedia system and database.
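The bidirectional link between the metadata database and the media storage server can be sketched as two objects that notify each other. This is a toy, in-process illustration under stated assumptions: the class and method names are hypothetical, and a real system would use a network protocol between separate servers.

```python
class MediaStorageServer:
    """Hypothetical "master" of the media data."""

    def __init__(self):
        self.videos = {}
        self.listeners = []  # callbacks notified when media is inserted

    def insert(self, video_id, data, metainfo):
        # Store the media, then tell every listener about its metainformation.
        self.videos[video_id] = data
        for notify in self.listeners:
            notify(video_id, metainfo)

    def delete(self, video_id):
        self.videos.pop(video_id, None)


class MetadataDatabase:
    """Hypothetical "master" of the metadata."""

    def __init__(self, storage):
        self.storage = storage
        self.records = {}
        storage.listeners.append(self.on_media_inserted)

    def on_media_inserted(self, video_id, metainfo):
        # Direction 1: storage server -> database (new video, update metadata).
        self.records[video_id] = dict(metainfo)

    def mark_outdated(self, video_id):
        # Direction 2: database -> storage server (outdated video, delete media).
        if self.records.pop(video_id, None) is not None:
            self.storage.delete(video_id)


storage = MediaStorageServer()
database = MetadataDatabase(storage)
storage.insert("v1", b"media bytes", {"title": "Example clip"})
inserted_ok = "v1" in database.records
database.mark_outdated("v1")
deleted_ok = "v1" not in storage.videos
```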


Using metadata in a distributed multimedia system has many advantages. It enables us to search multimedia data by content; for example, "list all video clips from an online video shop where Roger Moore plays 007." However, before multimedia data can be searched, the data have to be indexed. This means that metadata have to be extracted from the video automatically or annotated manually.
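Once such annotations exist, a content query reduces to a lookup over the metadata. The sketch below uses an in-memory SQLite database with a hypothetical two-table schema and made-up clip titles; it is not the MPEG-7 description scheme or SQL/MM, only an illustration of how indexed metadata makes the example query answerable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clips (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE appearances (
    clip_id INTEGER REFERENCES clips(id),
    actor   TEXT,
    role    TEXT
);
""")

# Illustrative annotations, as if produced by manual or automatic indexing.
conn.executemany("INSERT INTO clips VALUES (?, ?)",
                 [(1, "Clip A"), (2, "Clip B"), (3, "Clip C")])
conn.executemany("INSERT INTO appearances VALUES (?, ?, ?)", [
    (1, "Roger Moore", "007"),
    (2, "Roger Moore", "Other Role"),
    (3, "Another Actor", "007"),
])


def clips_with(actor, role):
    """Return the titles of all clips in which `actor` plays `role`."""
    rows = conn.execute(
        "SELECT c.title FROM clips c "
        "JOIN appearances a ON a.clip_id = c.id "
        "WHERE a.actor = ? AND a.role = ? ORDER BY c.title",
        (actor, role),
    )
    return [title for (title,) in rows]


matches = clips_with("Roger Moore", "007")
```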

Another use for metadata is in describing usage environment characteristics (e.g., user preferences, presentation preferences) and network and terminal constraints. This information may be used to personalize the search for content. For instance, a user is looking for all the soccer events of the weekend, but the network to the user's terminal currently has an available bandwidth of only 500 kilobits per second (kbps). Thus, the database not only has to find all videos showing the soccer events of the weekend but also has to check whether the bandwidth requirement of each video exceeds the user's constraints. If this is the case, the database may offer alternatives; for instance, showing the key-images (key-frames) extracted from the video.
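The soccer example can be sketched as a search that checks each hit against the available bandwidth and falls back to key-frames when the video does not fit. The record layout, field names, and catalog entries are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class VideoRecord:
    title: str
    topic: str
    required_kbps: int   # bandwidth the video stream needs
    key_frames: list     # key-images extracted from the video


def search(records, topic, available_kbps):
    """Return (title, offered form) pairs for all records matching the topic."""
    results = []
    for rec in records:
        if rec.topic != topic:
            continue
        if rec.required_kbps <= available_kbps:
            results.append((rec.title, "video"))
        else:
            # The video exceeds the user's constraint: offer key-frames instead.
            results.append((rec.title, "key-frames"))
    return results


catalog = [
    VideoRecord("Match highlights", "soccer", 350, ["h1.jpg"]),
    VideoRecord("Full match", "soccer", 1500, ["f1.jpg", "f2.jpg"]),
    VideoRecord("Tennis final", "tennis", 400, ["t1.jpg"]),
]
offers = search(catalog, "soccer", available_kbps=500)
```

With the 500 kbps limit from the text, the first soccer video is offered as a stream, and the second is offered only as key-frames.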

Metadata are also used for describing intellectual property aspects of the multimedia data. These descriptions may guarantee a fair use of the data in commercial applications. Finally, metadata may be employed in a distributed multimedia system to describe resource adaptation capabilities. Including such metadata prepares the stream for unforeseen situations on the delivery path to the client. This makes sense, as dynamic changes in resource availability may degrade the quality of the video or even make further delivery impossible. In this context, the structure of the streamed media might also be described by metadata for an efficient adaptation in the network. For example, the metadata may include information on how to transcode the media to meet resource constraints.
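As a minimal sketch of such adaptation metadata, assume that a stream is described as a base layer followed by enhancement layers, each with a known bit rate. A node on the delivery path can then decide how many layers to forward without inspecting the media itself. The layer structure and the bit rates below are hypothetical.

```python
def select_layers(layer_kbps, available_kbps):
    """Return how many leading layers fit into the available bandwidth.

    layer_kbps lists the bit rate of each layer in order; index 0 is the
    base layer. A result of 0 means that not even the base layer fits.
    """
    total = 0
    kept = 0
    for kbps in layer_kbps:
        if total + kbps > available_kbps:
            break
        total += kbps
        kept += 1
    return kept


# Hypothetical description: base layer plus two enhancement layers.
stream_layers = [128, 256, 512]
layers_at_500 = select_layers(stream_layers, 500)   # base + first enhancement
layers_at_100 = select_layers(stream_layers, 100)   # nothing fits
```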

To guarantee the widespread use of multimedia data and metadata in a dynamic distributed multimedia system, two kinds of problems have to be addressed. Some are media related; for instance, how to develop scalable media compression technologies that adapt to changing usage environments. Others are metadata oriented; for instance, how to extract semantic features from a video (e.g., to recognize that a car is displayed in an image), how to represent usage environments for adaptation and for intellectual property, and how to relate these metadata to the media data they describe.

Other problems are related to issues of resource distribution in a multimedia system. To realize a multimedia delivery system meeting the necessary real-time constraints, new network technologies and communication protocols must be provided to guarantee sufficient quality of service (QoS). In many cases, however, the network cannot guarantee QoS, and the multimedia resource must be adapted to the new environment. Hence, solutions for content-based adaptation have to be sought.

The use of standard technologies and descriptions in a distributed multimedia system is imperative to guarantee interoperable usage. We review the newest coding standard from the ISO-IEC JTC 1/SC 29/WG 11 MPEG: the MPEG-4 standard. In particular, we describe its scalable coding properties. We introduce the ISO-IEC MPEG-7 multimedia description standard, which proposes tools for describing and delivering multimedia metadata, and the ISO-IEC MPEG-21 standard, which defines an open multimedia framework. MPEG-4 focuses on the representation of media data, while MPEG-7 deals with the standardization of metadata, which describe the content of media data. MPEG-21 encompasses these two standards for a global view of the distributed multimedia system. Looking at Exhibit 1.2 again, one can say that MPEG-21 intends to regulate and direct the traffic in a distributed system in a way that provides universal, interoperable, and fair access to users in this highly heterogeneous environment.

In addition, and equally important, the book describes current state-of-the-art distributed multimedia database technologies. We describe concrete technologies for multimedia indexing and retrieval in multimedia databases and introduce SQL/MM, the multimedia enhancement of SQL (Structured Query Language), which supports multimedia data in a structured query language and query processing system. SQL/MM is related to MPEG-7; common characteristics and differences between the two standards are pointed out.

The remainder of this chapter gives an overview of basic elements in a distributed multimedia system. Section 1.1 deals with multimedia content indexing and retrieval. Section 1.2 presents a common view of multimedia systems and databases. Section 1.3 introduces (multi)media data and multimedia metadata and associated standards. This section discusses media coding standards—in particular, we discuss standards from the MPEG family (MPEG-1, MPEG-2, and MPEG-4). It also introduces multimedia metadata in general and the available standards in particular. Focus is put on comparative analysis, rather than complete descriptions. Some of the standards are described in more detail in later chapters as necessary (e.g., MPEG-7 is further discussed in Chapter 2). Finally, Section 1.4 gives an overview of the remaining content of the book.

[1] http://www.net-m.de/start.uk.htm.

[2] Fraunhofer Innovation AudioID (Erlangen, Germany); see http://www.emt.iis.fraunhofer.de/PM_AudioID_Thomson.htm.

[3] Özsu, M.T. and Valduriez, P., Principles of Distributed Database Systems, 2nd ed., Prentice-Hall, Englewood Cliffs, NJ, 1999.






Distributed Multimedia Database Technologies Supported by MPEG-7 and MPEG-21
ISBN: 0849318548
Year: 2003
Pages: 77
Authors: Harald Kosch
