Chapter 21: Relevance Feedback in Multimedia Databases | Handbook of Video Databases: Design and Applications (Internet and Communications)

Michael Ortega-Binderberger
Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, IL, USA
<miki@acm.org>

Sharad Mehrotra
Information and Computer Science
University of California, Irvine
Irvine, CA, USA
<sharad@ics.uci.edu>

1. Introduction

The popularity of web search engines has familiarized countless users with the similarity search paradigm. In this paradigm, a user provides an example or simple sketch of the desired information to a system and receives a list of items that "best" match the information provided. These results are typically sorted by a system-generated estimate of how closely they match the sketch or requirement provided by the user. Consider a typical web search engine: the user's sketch takes the form of keywords, and the search engine finds the web pages that best match those keywords.

User expectations have grown to demand powerful and flexible search capabilities for multimedia data such as images and video in addition to the traditional unstructured web pages. Consider a user searching for pictures depicting a "sunset by the sea" in an image database. One possibility is to attach a text description to each image and use standard text search engine techniques to find the results. The problem with this approach is that "a picture is worth a thousand words": it is time consuming to describe each image in sufficient detail to be useful for searching. An alternative is to make the content of the image itself searchable by abstracting some of its properties into a form that can be easily searched. Many image retrieval systems have adopted this approach [7][13][18][27] by using image-processing techniques to extract features that attempt to capture the user's perception of images.

The results of a user's first search attempt rarely satisfy her information need [26]. This can be due to many reasons. Any search system must first abstract the content of the searchable documents, images or videos into a form that is both searchable and effectively captures some aspect of the user's perception. There may be a gap between these abstractions, also called features, and the way humans really perceive the content. A further problem is the difficulty a user faces in constructing an appropriate example or sketch to submit for search, due to interface limitations imposed on her, unfamiliarity with the database content, and so on.

To cope with these limitations, users initiate an information discovery cycle [29] whereby they repeatedly modify their original sketch or example in hopes of improving the results. Typically, the sketch changes minimally between search iterations. For example, a user searching for information on image retrieval systems might successively try the terms "image databases," "image retrieval," or "content-based retrieval" in a web search engine. Because successive queries are so closely related, there is ample potential for the system to observe the user's behavior and aid her in enhancing her search criteria.

Relevance Feedback is a technique that offloads the task of discovering a better query formulation from the user to the search engine. When users see results, they instantly recognize how relevant each one is, that is, how well or poorly it matches their information need. Relevance Feedback refers to the ability of users to communicate, or feed back, to the search engine this notion of relevance to their information need. The search engine then uses this relevance information to construct a better sketch and uses it to return improved results to the user. Relevance Feedback is of special interest for multimedia search as compared to textual search. The features derived from multimedia objects are typically more obscure than the sets of keywords used for textual searches. Because of this feature complexity, users of a multimedia retrieval system cannot manually modify their query formulation as easily as they can in a text search engine, which makes relevance feedback much more important.
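To make the feedback loop concrete, the following is a minimal sketch rather than the method of any particular system discussed in this chapter. It assumes that each item is represented by a small numeric feature vector, that results are ranked by Euclidean distance to a query point, and that the query is refined by moving it toward the items judged relevant and away from those judged non-relevant, in the spirit of Rocchio-style query point movement. The names `rank` and `refine_query`, the weights `beta` and `gamma`, and the toy feature values are all hypothetical and chosen only for illustration.

```python
import math


def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank(query, database):
    """Return item ids sorted from the closest to the farthest from the query."""
    return sorted(database, key=lambda item_id: distance(query, database[item_id]))


def centroid(vectors):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def refine_query(query, relevant, non_relevant, beta=0.75, gamma=0.25):
    """Move the query toward relevant items and away from non-relevant ones.

    beta and gamma are illustrative weights (an assumption of this sketch).
    """
    new_query = list(query)
    if relevant:
        c = centroid(relevant)
        new_query = [n + beta * (x - q) for n, x, q in zip(new_query, c, query)]
    if non_relevant:
        c = centroid(non_relevant)
        new_query = [n - gamma * (x - q) for n, x, q in zip(new_query, c, query)]
    return new_query


# Toy database: hypothetical two-dimensional features per image.
database = {
    "sunset1": [0.9, 0.1],
    "sunset2": [0.8, 0.2],
    "forest": [0.2, 0.9],
    "city": [0.4, 0.5],
}

query = [0.5, 0.5]                      # the user's initial, imprecise sketch
first_results = rank(query, database)

# The user marks one result as relevant and one as non-relevant.
new_query = refine_query(
    query,
    relevant=[database["sunset2"]],
    non_relevant=[database["forest"]],
)
second_results = rank(new_query, database)

print(first_results)
print(second_results)
```

In this toy run the initial query ranks "city" first, while after a single round of feedback both sunset images move to the top of the list. The user never edits the feature values directly; she only judges results, and the system derives the new query point.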

In this chapter, we discuss several relevance feedback techniques that have been successfully applied to multimedia search. Section 2 presents some background on multimedia retrieval. Section 3 introduces the basic relevance feedback concepts used in the remainder of the chapter. Section 4 discusses retrieval with only one feature, while Section 5 discusses retrieval with multiple features. Section 6 describes techniques that reduce the number of relevance feedback iterations needed to find the optimal results. Section 7 discusses how to evaluate the performance of relevance feedback. Finally, Section 8 presents some conclusions and current trends in relevance feedback.