4. A Case Study: Indexing and Retrieval of Images by Similarity

4. A Case Study: Indexing and Retrieval of Images by Similarity

When a relational table is composed only by numbers or small character strings, the comparison operations between elements of these domains are always clearly defined: the meaning of =, <, like(), etc. is well known and is executed in the same manner in every DBMS. However, when non-conventional data types are included, the comparison operations between them are not so completely understood, and needs to be declared as part of the description of the data itself. In this section we present an example of how a similarity query can be posed on a database stored in a relational database extended by the CIRCE architecture.

4.1 A Content-Based Image Retrieval Example

To store data in an RDBMS, first of all it is needed to create the tables, using the DDL command "create table." Let us resume the example of the previous section. To store the students table example, having as attributes their names, ages, the course where they are enrolled, their advisor, their FrontView and Profile mugshot, the following command should be issued:

          CREATE TABLE Students    ( Name CHAR(50),                                     Age INTEGER,                                     Course CHAR(15),                                     Advisor CHAR(50),                                     FrontView STILLIMAGE,                                     Profile STILLIMAGE ); 

Notice that the last two lines of this command use the CBIR-extended SQL, as the STILLIMAGE keyword is part of that extension. Therefore, CIRCE replaces these two lines, changing the STILLIMAGE keyword to Numeric(10), where the image identifier will be stored in place of the actual image. We are using the STILLIMAGE keyword to declare image attributes, following the ISO/IEC standard proposal aiming to specify packages of abstract data types in SQL. This proposal, known as the SQL/MM (SQL for MultiMedia applications), includes packages for Still Image, Still Graphic, Animation, Full Motion Video, Audio, Full-Text, Spatial, Seismic, and Music [41, 42]. Although this proposal aims to support images in relational databases, it is intended to be used through the resources for user-defined type provided in SQL-99. Therefore, the image storage, manipulation and retrieval operations addressed by the proposal must be supplied by application-specific procedures. As there is no provision for CBIR in that proposal, the CIRCE approach presented here supplements SQL/MM to allow similarity queries in content-based image retrieval operations.

Attributes other than images can be compared in subsequent DML commands, but the image attributes must have their comparison operations defined in the beginning of the process, because there is no standard way to compare images. In the previous example, this is done through the definition of one or more DF(), using the "create metric" CIRCE-extended DDL command. This command specifies the schema of a feature vector, selecting the extractors and the features used in each metric. Considering, for example, that FrontView images should be compared by their color histogram and by the total area occupied on the photo, the Histogram and TraceObjects extractors can be used, as is presented in the following command:

    CREATE METRIC FrontViewHistoArea on Students (FrontView) AS (        HISTOGRAM (HISTOGRAM FrontHistogram INTEGER[256],                      MIN FrontHistoMin INTEGER,                      MAX FrontHistoMax INTEGER)          TRACEOBJECTS (TOTALAREA FrontViewArea FLOAT)   ); 

Each extractor is an image processing algorithm defined using the resources for the user-defined type provided in SQL-99. The extractors are designed to obtain the general and the meaningful properties from images. For example, the TraceObjects is a closed border detection algorithm that returns the number of objects found in an image, their total area (in pixels), the minimum bounding rectangles coordinates and area of the five larger objects found in the image. From all these features extracted by the TraceObjects extractor, the FrontViewHistoArea DF() in the example uses only the total areal, referred to in the DF() schema as FrontViewArea.

After receiving those two commands, CIRCE generates a table in the IPV database named Students_FrontView, with the following structure corresponding to this DF() schema:

      CREATE TABLE Students_FrontView ( ImID NUMERIC(10),                                        FrontView BLOB,                                        FrontHistogram                                        INTEGER [256],                                        FrontHistoMin INTEGER,                                        FrontHistoMax INTEGER,                                        FrontViewArea FLOAT  ); 

Thereafter, every tuple inserted in the Students table with a non-null value in the FrontView attribute will have a corresponding tuple inserted in the Students_FrontView table, where the image is stored together with the values of the extracted features and its corresponding image identifier. The identifier is the one stored in the FrontView attribute of the Students table. The association of each feature attribute in the DF() schema with the corresponding feature retrieved by each extractor, as is declared in the "Create Metric" command, is stored in tables in the IDD database.

As each DF() is uniquely identified (like Students_FrontView in this example), more than a DF() can be associated to the same image attribute. This allows that more than one kind of comparison can be made with each image attribute.

In this example, at least another "Create Metric" command must be issued for the Profile attribute, in order to create a DF() for it. Thus, the Students_Profile table will also be enlarged with the features used in its corresponding distance functions.

The extracted features are stored in the IPV database tables; therefore its data types must be the traditional ones, like numbers or small character strings. Extractors that return string values are adequate to support some kind of semantic interpretation of the images.

The "Create Metric" command syntax also permits to supply parameters for each extractor used in a DF(), through its "PARAMETERS()" clause. Parameters can be either a constant value or the result of extractors declared beforehand in the "Create Metric" command. For example, the Students_FrontView DF() could be declared to use histograms with a smaller number of bins, as is shown following:

     CREATE METRIC FrontViewHistoArea on Students (FrontView) AS               (    HISTOGRAM PARAMETERS (NBINS=64)                   (HISTOGRAM FrontHistogram INTEGER[64], ...               ); 

With the Students table populated, it is possible to ask similarity queries over its image attributes. To support this kind of queries, a specific search condition is added to SQL, to be used in the "where" clause of the DML commands. The general syntax of a similarity condition enables both k-nearest neighbors and range queries, as shown as follows:

       SELECT <attributes>       FROM <tables >       WHERE <old_conditions> | <similarity_conditions>       <similarity_conditions> ::= <attribute> NEAR <val>                                     [BY <MetricName>]                                     [RANGE <val_range>]                                     [STOP_AFTER <k_nearest>] 



is the name of the image attribute in the table, where the search will be performed;


is the reference image, used as the query center;


is the name of the DF() to be used in the comparison operations, required if more than one DF() was defined for the involved <attribute>;


is the range value for range queries; and


is the k value for k-nearest neighbors queries.

For example, the following command retrieves the ten images stored in the FrontView attribute of the Students table that are the most similar to the photo stored in the file PhotoFile.jpg, with the corresponding student's name. As the DF() is not specified, it is assumed as the one created by the first "Create Metric" command issued over the FrontView attribute.

               SELECT Name, FrontView               FROM Students               WHERE FrontView NEAR PhotoFile.jpg STOP AFTER 10; 

The full power representation of SQL can be exploited. For example, the following command returns both the front view, profile, and the name of the students whose front view image is dissimilar from the front view image of the student called "John Doe" by at most 50 units.

               SELECT Name, FrontView, Profile               FROM Students               WHERE FrontView NEAR (                           SELECT FrontView                           FROM Students                           WHERE Name='John Doe'                     ) RANGE 50; 

4.2 Extending CIRCE to Support Content-Based Video Retrieval

The BLOB data type supported by traditional RDBMS is used to store large amounts of data. BLOBs are not analyzed by the system, so they can be used to store video data as well as images. The CIRCE architecture can be used to enable content-based retrieval using similarity queries regarding not only images, but also audio, video and other types of complex data, provided the proper set of feature extractors are defined. Therefore, each complex data type should be supported by a set of specific feature extractors. Moreover, with the video data type, the "PARAMETERS()" clause of the "Create Metric" command must be improved.

The improvement required to support video is due to the fact that image feature vectors are composed only by numeric or short character string values. Regarding audio and video data, the extractors should work on specific key-frames or audio parts that are worth to identify in the original data. The standard SQL is not an adequate platform to support an image or audio processing language; therefore the extractors that process these data types are elementary building blocks, aiming only the declaration of queries over corresponding datasets. Therefore, the ability to support feature vectors that return complex data itself is essential not only for images but also for all the multimedia data. However, when video data are considered, this ability turns out to be quite useful. For example, consider that the similarity of two video sequences should be measured using some image DF() applied over the key-frames of each sequence. The video DF() could be defined using a video processing algorithm that extracts the key-frame, followed by the image feature extractors that receives that frame as an image to have its parameters extracted. Although a two-level architecture (video image) could be built, a more general architecture dealing with other aspects of the video data would be necessary.

Handbook of Video Databases. Design and Applications
Handbook of Video Databases: Design and Applications (Internet and Communications)
ISBN: 084937006X
EAN: 2147483647
Year: 2003
Pages: 393

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net