4. Development of a Web-Based VRS

In this section, we describe a web-based version of VideoMAP that we have been developing, which we call VideoMAP*. We present a reference architecture for it and discuss the design issues involved in developing such a web-based video retrieval system.

4.1 Reference Architecture

A reference architecture of VideoMAP* is shown in Figure 24.2. It is being built on top of a network, and its components fall into two groups: server components and client components. The client components are grouped inside the gray boxes; the server components are shown below the network backbone. The main server components providing services to the clients are Video Production Database (VPDB) Processing, Profile Processing and Global Query Processing. In principle, there is no limit on the number of users working at the same time. Four kinds of users can work with the system: Video Administrator, Video Producer, Video Editor and Video Query Client. Each type of user is assigned a priority level, which the VPDB Processing component uses to order the processing requests in its queue.

Figure 24.2: Architecture of VideoMAP*, a web-based video retrieval system.

A request from the user with the highest priority level is processed first, and requests from users at the same priority level are processed in FIFO order. The user priority levels of our system, in descending order, are Video Administrator, Video Producer, Video Editor and Video Query Client.
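The scheduling rule above can be illustrated with a minimal sketch. The class name `RequestQueue`, the numeric priority values and the request strings are assumptions made for illustration; the text specifies only the ordering of the four user types and FIFO handling within one level.

```python
import heapq
import itertools

# Priority levels in the order given in the text (lower number = served first).
# The numeric values themselves are an assumption.
PRIORITY = {
    "Video Administrator": 0,
    "Video Producer": 1,
    "Video Editor": 2,
    "Video Query Client": 3,
}


class RequestQueue:
    """Priority queue that is FIFO among requests of equal priority."""

    def __init__(self):
        self._heap = []
        # Monotonically increasing arrival number used as a tie-breaker,
        # so equal-priority requests leave in the order they arrived.
        self._arrival = itertools.count()

    def submit(self, user_type, request):
        entry = (PRIORITY[user_type], next(self._arrival), request)
        heapq.heappush(self._heap, entry)

    def next_request(self):
        _, _, request = heapq.heappop(self._heap)
        return request

    def __len__(self):
        return len(self._heap)


queue = RequestQueue()
queue.submit("Video Query Client", "query: ball")
queue.submit("Video Editor", "edit: program 7")
queue.submit("Video Administrator", "remove: obsolete view")
queue.submit("Video Editor", "edit: program 9")

order = [queue.next_request() for _ in range(len(queue))]
print(order)
```

The arrival counter is what keeps the two Video Editor requests in submission order; a heap keyed on priority alone would not guarantee that.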

4.1.1 Main Process Flows

The main process flows of the four main components of VideoMAP* are described below.

VPDB Processing

The VPDB Processing component is responsible for accepting clients' requests and providing basic database operations on the Global VPDB through CCM/ST (Conceptual Clustering Mechanism supporting Spatio-Temporal semantics) [3]. It cooperates with a Concurrency Control mechanism to ensure data integrity and data consistency between the Global VPDB and the Local VPDBs. The Global VPDB and the Local VPDBs store the raw video data together with the semantic information. The Global VPDB is the permanent storage of the system, and the Local VPDBs are temporary storage for the clients.

Concurrency Control

The Concurrency Control mechanism handles multi-user processing in the system. Because a small change to the ToC (Table of Contents) of a video program may affect the final display order of the program [2], data locks are not suitable for the virtual editing of a video program: the whole sequence of the video program would have to be locked. To ensure data integrity and data consistency, a complementary mechanism is introduced here. When a user wants to do virtual editing of a video program, the user retrieves the video data from the Global VPDB into a Local VPDB. The modified video data is then saved as a new view of the video program. A collection of views is stored with the original video program, so Video Query Clients can choose to display the video program from different perspectives. To limit the size of the database, it is necessary to limit the number of versions of views that each user may store in the database. The Video Administrator can also remove some obsolete videos.
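The view-based alternative to locking can be sketched as follows. This is a toy illustration, not the VideoMAP* implementation: the class `GlobalVPDB`, its method names, the representation of a ToC as a list of segment identifiers, and the eviction of the oldest view when the limit is exceeded are all assumptions.

```python
import copy
from collections import defaultdict


class GlobalVPDB:
    """Toy stand-in for the Global VPDB: programs plus per-user saved views."""

    def __init__(self, max_views_per_user=3):  # the limit value is an assumption
        self.programs = {}               # program_id -> ToC (list of segment ids)
        self.views = defaultdict(list)   # (program_id, user) -> list of saved ToCs
        self.max_views_per_user = max_views_per_user

    def checkout(self, program_id):
        # Copy the ToC to the client's Local VPDB; no lock is taken.
        return copy.deepcopy(self.programs[program_id])

    def save_view(self, program_id, user, toc):
        # The edit is stored as a new view; the original program is untouched.
        user_views = self.views[(program_id, user)]
        user_views.append(list(toc))
        # Assumed policy: discard the oldest views beyond the per-user limit.
        while len(user_views) > self.max_views_per_user:
            user_views.pop(0)


db = GlobalVPDB(max_views_per_user=2)
db.programs["news"] = ["s1", "s2", "s3"]

# Three successive virtual edits by the same (hypothetical) user.
for edit in (["s3", "s2", "s1"], ["s2", "s1", "s3"], ["s1", "s3"]):
    local_toc = db.checkout("news")   # work on a local copy
    local_toc[:] = edit               # virtual editing changes only the copy
    db.save_view("news", "editor_a", local_toc)

print(db.programs["news"])
print(db.views[("news", "editor_a")])
```

Because every edit lands in a separate view, concurrent editors never write to the same ToC, which is why no lock on the whole sequence is required.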

Profile Processing

Profile Processing provides basic database operations on the ProfileDB. It ensures that all clients can retrieve and update their corresponding user profiles, which are stored in the ProfileDB. There are three types of profiles: the common profile, the personal profile and the specific video profile. Each profile stores the common knowledge and the Activity Model [3], whose semantic meanings can differ from user to user. The common profile is the standard profile provided to every user, thereby supporting the desired "subjectivity" for virtual editing and access activities. The personal profile is the common profile customized to the needs of an individual user. The Video Producers create the specific video profiles based on the video programs. The personal profile can be modified only by its owner, and the specific video profile can be modified only by the Video Administrator. The profiles are represented in the form of rules for effective reasoning and can be converted into a universal format (e.g., XML) before being transmitted to the clients, so that the information can be easily interpreted by web browsers.
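As a hedged illustration of the two points above (rule-based profiles serialized to XML, and the modification rights), consider the following sketch. The element names (`profile`, `rule`, `if`, `then`), the representation of a rule as a condition/conclusion pair, and both function names are assumptions; the text does not define the XML schema or the rule syntax.

```python
import xml.etree.ElementTree as ET


def profile_to_xml(profile_type, owner, rules):
    """Serialize a rule-based profile to an XML string (hypothetical schema)."""
    root = ET.Element("profile", {"type": profile_type, "owner": owner})
    for condition, conclusion in rules:
        rule = ET.SubElement(root, "rule")
        ET.SubElement(rule, "if").text = condition
        ET.SubElement(rule, "then").text = conclusion
    return ET.tostring(root, encoding="unicode")


def can_modify(profile_type, owner, user, user_role):
    """Modification rights as stated in the text."""
    if profile_type == "personal":
        return user == owner
    if profile_type == "specific_video":
        return user_role == "Video Administrator"
    # Assumption: the text does not say who may change the common profile,
    # so this sketch denies modification by default.
    return False


xml_text = profile_to_xml(
    "personal", "alice", [("object is ball", "activity is football")]
)
print(xml_text)
print(can_modify("personal", "alice", "bob", "Video Editor"))
```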

Semantic Query Processing (CAROL/ST) with CBR

The query processing mechanism of VideoMAP* is inherited from, and thus exactly the same as, that of the "stand-alone" VRS [5]. We therefore omit its description here for the sake of conciseness.

4.2 Distribution Design

In a web-based environment, the cost of data transfer and communication is an important issue. To reduce this overhead, databases can be fragmented and distributed across different sites. There has been much research on data fragmentation in relational databases. In contrast, there is only limited recent work on data (object) fragmentation in object databases [11,22,23]. As it is time-consuming (and quite often unnecessary) to transmit an entire object database to the client side, especially for a large video object database, it is advantageous to fragment and cluster objects in order to minimize the overhead of locking and transferring objects. After objects are fragmented, the next step is to allocate the fragments to various sites, which is another open research issue for object databases (see, e.g., [14]).

4.2.1 Global VPDB

In VideoMAP*, different types of users such as Video Producers (VPs), Video Editors (VEs) and the Video Administrator (VA) can retrieve data from the Global VPDB into their Local VPDBs for processing, and their work may affect and be propagated to the Global VPDB. In particular, each type of user may work on a set of video segments and create dynamic objects out of the video segments. It therefore makes sense to cluster the video objects in the manner shown in Figure 24.3. Specifically, each raw video is assumed to have a specific domain, and relevant domains are grouped together first. Since multiple video segments (which may belong to different domains) may be included by one application, raw videos linked by the same application are also grouped together. The other objects that have links with these video segments are then grouped into the clusters. Further fragmentation based on user views [4] can be performed if necessary. The clusters can be stored in more than one database, so the extent of object locking can be minimized.
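One possible reading of this clustering procedure is sketched below using a union-find structure. It is an interpretation rather than the authors' algorithm: the function name, the input dictionaries, and the simplification that "relevant domains" means "the same domain" are all assumptions.

```python
from collections import defaultdict


def cluster_videos(video_domain, app_videos, object_links):
    """Cluster raw videos by shared domain or application, then attach objects.

    video_domain: raw video id -> domain name
    app_videos:   application name -> list of raw video ids it includes
    object_links: other object id -> raw video id it is linked to
    """
    parent = {video: video for video in video_domain}

    def find(video):
        # Follow parent pointers to the cluster representative (path halving).
        while parent[video] != video:
            parent[video] = parent[parent[video]]
            video = parent[video]
        return video

    def union(a, b):
        parent[find(a)] = find(b)

    # Step 1: videos of the same domain belong together.
    by_domain = defaultdict(list)
    for video, domain in video_domain.items():
        by_domain[domain].append(video)

    # Step 2: videos linked by the same application belong together.
    for group in list(by_domain.values()) + list(app_videos.values()):
        for other in group[1:]:
            union(group[0], other)

    # Step 3: objects linked to a video join that video's cluster.
    clusters = defaultdict(set)
    for video in video_domain:
        clusters[find(video)].add(video)
    for obj, video in object_links.items():
        clusters[find(video)].add(obj)
    return sorted(sorted(members) for members in clusters.values())


clusters = cluster_videos(
    video_domain={"v1": "sports", "v2": "sports", "v3": "news", "v4": "weather"},
    app_videos={"highlights": ["v2", "v3"]},
    object_links={"scene1": "v1", "scene2": "v4"},
)
print(clusters)
```

In this example the "highlights" application pulls the news video `v3` into the sports cluster, while `v4` and its scene remain a separate cluster that could be stored and locked independently.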

Figure 24.3: Data distribution of Global VPDB.

In order to maintain fragmentation transparency, a processing component is needed to locate objects. A Global Schema is also needed to hold the metadata of objects at several levels (by domain, application, link, etc.). VP, VE and VA users can easily retrieve objects, and insert into or update the databases, through the global schema. Consequently, the global schema itself needs to be carefully shared and maintained.
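Fragmentation transparency means that a user names an object, never a site. A minimal sketch of such a directory follows; the class names, the two metadata levels kept per object (domain and application) and the site labels are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaEntry:
    """Metadata kept per object in this sketch (levels chosen as an assumption)."""
    domain: str
    application: str
    site: str


class GlobalSchema:
    """Directory that hides where each fragment is stored."""

    def __init__(self):
        self._entries = {}

    def register(self, object_id, domain, application, site):
        self._entries[object_id] = SchemaEntry(domain, application, site)

    def locate(self, object_id):
        # Callers ask for an object; the schema resolves the storage site.
        return self._entries[object_id].site

    def find(self, **criteria):
        # Look up object ids by any metadata level, e.g. find(domain="sports").
        return sorted(
            object_id
            for object_id, entry in self._entries.items()
            if all(getattr(entry, key) == value for key, value in criteria.items())
        )


schema = GlobalSchema()
schema.register("scene1", "sports", "highlights", "site-A")
schema.register("scene2", "news", "highlights", "site-B")
schema.register("scene3", "sports", "archive", "site-A")

print(schema.locate("scene2"))
print(schema.find(domain="sports"))
```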

4.2.2 Global Video Query Database (Global VQDB)

In VideoMAP*, the Global VPDB is replicated into a mirror database (i.e., the Global VQDB) for Video Query Client processing. As the Global VQDB is a read-only database and is independent of other processes, it can be fragmented thoroughly according to the interests of the query clients. Since an Activity Model is composed of the activities a user is interested in, and a User Profile is constructed from the user's activity model and query histories, the Global VQDB can be fragmented based on semantics at multiple levels.

As shown in Figure 24.4, semantic objects (i.e., SemanticFeature and VisualObject) can be grouped in two ways: by Semantics and by Activity Model. In the former case, the objects are grouped together by a thesaurus and then linked with other associated objects (such as scene, visual feature, etc.). In the latter case, semantic objects are first clustered from the object level to the activity level, and then linked with other associated objects (e.g., scene, visual feature, etc.).
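The two grouping strategies can be contrasted in a short sketch. The data structures are assumptions: the thesaurus is modeled as a mapping from a term to its preferred term, and the Activity Model as a mapping from an activity to the set of object terms it involves. Linking to scenes and visual features is omitted.

```python
from collections import defaultdict


def group_by_semantics(objects, thesaurus):
    """Group semantic objects under the preferred term given by a thesaurus."""
    groups = defaultdict(list)
    for obj in objects:
        # Terms missing from the thesaurus form their own group.
        groups[thesaurus.get(obj, obj)].append(obj)
    return dict(groups)


def group_by_activity(objects, activity_model):
    """Cluster semantic objects from the object level up to the activity level."""
    groups = defaultdict(list)
    for activity, members in activity_model.items():
        for obj in objects:
            if obj in members:
                groups[activity].append(obj)
    return dict(groups)


objects = ["ball", "football", "goal", "anchor"]
thesaurus = {"football": "ball"}  # hypothetical thesaurus entry
activity_model = {                # hypothetical activity model
    "playing football": {"ball", "football", "goal"},
    "news reading": {"anchor"},
}

by_semantics = group_by_semantics(objects, thesaurus)
by_activity = group_by_activity(objects, activity_model)
print(by_semantics)
print(by_activity)
```

The same four objects yield three fragments under the thesaurus but only two under the activity model, which shows how the choice of strategy changes the granularity of the fragments.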

Figure 24.4: Data distribution of Global VQDB.

In order to maintain fragmentation transparency, a processing component is needed to serve as a directory for locating the objects. As in the case of the Global VPDB, it works with the Global Schema, in which multiple levels of metadata are stored.

4.3 Distributed Query Processing and Optimization

When the query client submits a query, the Local Query Processing component may rewrite and transform it into a number of logical query plans. Since the Global VQDB is fragmented according to the semantics and the Activity Model, the logical query plans can be generated based on User Profiles, which contain relevant semantic information. For example, if a user requests a "ball," the query plans can include "ball," "football," etc., based on his/her profile. Each query plan is then decomposed into sub-queries. After all plans have been decomposed, the sub-queries are sent to the Local Query Optimization component to determine their order of importance. Next, the query plans are sent to the Global Query Processing component and the Global Query Optimization component, which search for results using the information from the Global Schema. As the original query is rewritten into a number of similar queries with different priority levels, the probability of obtaining the right results can be greatly increased.
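The rewriting, decomposition and ordering steps can be sketched as follows. The function names, the profile structure (a mapping from a term to related terms), the rule that the original term gets the highest priority, and the decomposition into one sub-query per semantic object type are all assumptions; the text does not specify how plans are decomposed or ranked.

```python
def expand_query(term, profile):
    """Rewrite a query term into plans: the original term, then related terms."""
    related = profile.get(term, [])
    return [term] + [r for r in related if r != term]


def decompose(plan):
    """Split one plan into sub-queries, one per semantic object type (assumed)."""
    return [("SemanticFeature", plan), ("VisualObject", plan)]


def order_subqueries(plans):
    """Rank sub-queries; earlier plans get a smaller (more important) priority."""
    ordered = []
    for priority, plan in enumerate(plans):
        for target, term in decompose(plan):
            ordered.append((priority, target, term))
    return sorted(ordered)


profile = {"ball": ["football"]}  # hypothetical user profile entry
plans = expand_query("ball", profile)
subqueries = order_subqueries(plans)

print(plans)
for priority, target, term in subqueries:
    print(priority, target, term)
```

Here the user's own term is searched first and the profile-derived term "football" second, so the extra plans broaden the search without displacing the original request.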