Shin'ichi Satoh and Norio Katayama
National Institute of Informatics, Japan
This chapter covers issues in, and approaches to, video indexing technologies for video archives, especially large-scale archives on the order of terabytes to petabytes, which are expected to become widespread within a few years. We discuss the issues from three viewpoints: analyzing, organizing, and searching video information.
Recent years have seen technical innovations that enable a huge number of broadcast video streams, delivered by digital TV broadcasting through broadcast satellites (BS), communication satellites (CS), cable television, and broadband networks. The number of channels accessible to end users is growing and may reach several thousand in the near future. The programs in these broadcast video streams cover a wide variety of audience interests, e.g., news, entertainment, travel, culture, and education. In addition, the considerable cost, labor, and state-of-the-art technology invested in broadcast videos make them extremely high in quality, and they are therefore of archival and cultural importance. Once archived, broadcast videos may contain almost any information that ordinary users might want in any situation, although the archives will become huge. If sufficiently intelligent and flexible access to these huge broadcast video archives becomes available, users may be able to obtain almost all the information they need from the video stream space.
Video indexing, the key technology for realizing this, has been studied intensively by many researchers; topics include content-based annotation using video analysis, and efficient browsing, access, and management of video archives. Against this background, we have been studying two aspects of video indexing. The first is to extract meaningful content information by video analysis. This approach is important because manual annotation is clearly not feasible for huge video archives. In doing this, we take advantage of image understanding, natural language processing, and artificial intelligence technologies in an integrated way. The second is to realize efficient organization of multimedia information. For vast video archives, fast access is indispensable. Since video information can be regarded as consisting of high-dimensional features, such as color histograms and DCT/wavelet coefficients for images and keyword vectors for text, we developed a high-dimensional index structure for efficient access to video archives. We describe these two activities in the following sections.
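To make the notion of a high-dimensional image feature concrete, the following is a minimal sketch of a joint RGB color histogram used as a feature vector. It is only an illustration under our own assumptions: the function names `color_histogram` and `l1_distance`, the choice of 4 bins per channel (64 dimensions), and the L1 distance are hypothetical and are not taken from the systems described in this chapter.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized joint RGB histogram as a flat list of floats.

    pixels: iterable of (r, g, b) tuples with integer values in 0..255.
    The result has bins_per_channel ** 3 dimensions and sums to 1.
    """
    counts = [0] * (bins_per_channel ** 3)
    total = 0
    for r, g, b in pixels:
        # Quantize each 0..255 channel value into bins_per_channel levels.
        qr = r * bins_per_channel // 256
        qg = g * bins_per_channel // 256
        qb = b * bins_per_channel // 256
        # Combine the three quantized channels into a single bin index.
        counts[(qr * bins_per_channel + qg) * bins_per_channel + qb] += 1
        total += 1
    if total == 0:
        raise ValueError("at least one pixel is required")
    return [c / total for c in counts]


def l1_distance(a, b):
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Two tiny hypothetical "frames": one pure red, one pure blue.
red_frame = [(255, 0, 0)] * 4
blue_frame = [(0, 0, 255)] * 4

red_hist = color_histogram(red_frame)
blue_hist = color_histogram(blue_frame)
print(len(red_hist), l1_distance(red_hist, blue_hist))
```

Each frame becomes a point in a 64-dimensional space, and similarity search over an archive then amounts to nearest-neighbor search among such points, which is the task a high-dimensional index structure is meant to accelerate.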
We initially thought that these two approaches were independent. However, our recent research results revealed that the two are tightly coupled. In the first approach, we extract meaningful content from videos by associating image and text information; in other words, by correlating skew in the data distribution in the text space with skew in the image space. For the second approach, experimental results show that the proposed method achieves efficient search when the data is sufficiently unevenly distributed, whereas only poor search performance is achieved on uniformly distributed data. Based on these observations, we assume that local skew in the data distribution is related to the meaning of the data. We therefore studied the search for meaningful information in video archives, developed a method for detecting local skew in the feature space, and applied it to a search technique.
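The contrast between skewed and uniform data can be illustrated with a small experiment. The sketch below is not the index structure or the skew detection method discussed in this chapter; it is a simple proxy of our own choosing that compares the ratio of the nearest to the farthest distance from a query point. All names and parameters (`nn_contrast`, 32 dimensions, 500 points, 10 clusters, `sigma=0.02`) are assumptions made for illustration.

```python
import math
import random


def nn_contrast(points, query):
    """Ratio of nearest to farthest distance from query.

    A value near 1 means all points are almost equally far away, so the
    nearest neighbor is hard to distinguish; a value near 0 means the
    nearest neighbor stands out clearly.
    """
    dists = [math.dist(query, p) for p in points]
    return min(dists) / max(dists)


def uniform_points(n, dim, rng):
    """n points drawn uniformly from the unit hypercube."""
    return [[rng.random() for _ in range(dim)] for _ in range(n)]


def clustered_points(n, dim, n_clusters, sigma, rng):
    """n points scattered tightly (Gaussian, std sigma) around random centers."""
    centers = uniform_points(n_clusters, dim, rng)
    return [
        [c + rng.gauss(0.0, sigma) for c in centers[i % n_clusters]]
        for i in range(n)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
dim, n = 32, 500
uniform = uniform_points(n, dim, rng)
clustered = clustered_points(n, dim, n_clusters=10, sigma=0.02, rng=rng)

# Use the first point of each set as the query against the remaining points.
uniform_ratio = nn_contrast(uniform[1:], uniform[0])
clustered_ratio = nn_contrast(clustered[1:], clustered[0])
print("uniform:", uniform_ratio, "clustered:", clustered_ratio)
```

In the clustered set the nearest neighbor is far closer than the farthest point, so a search can discard most of the data early. In the uniform set the distances are much more alike, which is consistent with the poor search performance reported above for uniformly distributed data.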
We conclude this chapter with a discussion of video indexing research and its future directions from our viewpoint.