Related Work

Over the past decade, researchers have made considerable contributions to standalone image analysis and recognition that allow a video source to be partitioned into segments. They have also developed algorithms for indexing videos in multimedia databases, so that users can query and navigate image or video databases by content. One approach to video segmentation is simple pairwise pixel comparison [1], which detects a quantitative change between a pair of images. By comparing the corresponding pixels in two frames, it is easy to determine how many pixels have changed and what percentage of the frame that represents. If this percentage exceeds a preset threshold, the algorithm decides that a frame change has occurred. The main value of this method is its simplicity, but its disadvantages outweigh its advantages. Comparing all consecutive frames incurs a large processing overhead, and some situations are not accommodated, such as large objects moving within a single shot, which can change enough pixels to trigger a false detection.
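The pairwise pixel comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation from [1]: frames are assumed to be grayscale grids of intensities, and the names changed_fraction and detect_cuts, the per-pixel tolerance pixel_tol and the default thresholds are all hypothetical choices made for the example.

```python
from typing import List, Sequence

# Assumption: a frame is a grayscale image given as rows of intensities (0..255).
Frame = Sequence[Sequence[int]]


def changed_fraction(a: Frame, b: Frame, pixel_tol: int = 10) -> float:
    """Fraction of corresponding pixels whose intensity differs by more than pixel_tol."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("frames must have identical dimensions")
    total = sum(len(row) for row in a)
    if total == 0:
        raise ValueError("frames must not be empty")
    changed = sum(
        1
        for ra, rb in zip(a, b)
        for pa, pb in zip(ra, rb)
        if abs(pa - pb) > pixel_tol
    )
    return changed / total


def detect_cuts(
    frames: Sequence[Frame], frame_threshold: float = 0.5, pixel_tol: int = 10
) -> List[int]:
    """Indices i at which frame i differs from frame i-1 by more than frame_threshold."""
    return [
        i
        for i in range(1, len(frames))
        if changed_fraction(frames[i - 1], frames[i], pixel_tol) > frame_threshold
    ]
```

Every consecutive pair of frames is compared pixel by pixel, which makes the processing overhead visible: the cost grows with both the frame size and the number of frames.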

Methods using spatial and temporal skips and histogram analysis [2,3,4,5] recognize that, spatially or temporally, many video frames are very similar and that it is redundant to analyze them all. These approaches confirmed that color histograms better accommodate the motion of objects and the camera within the same shot. Significant savings in processing time and resources can be achieved during the analysis, either temporally, by sampling every defined number of frames (instead of all frames consecutively), or spatially, by comparing the number of changed pixels to the total frame size. Color histogram distributions of the frames can also be used. Histogram comparison, which can use any of several histogram difference measures, is less sensitive to object motion than pairwise pixel comparison. The difference is then compared with a preset threshold to determine a frame change, and hence to detect a shot cut. Hirzalla [6] has described the detailed design of a key frame detection algorithm using the HVC color space. First, for every two consecutive frames, the system converts each pixel from the original RGB color space into the equivalent HVC representation, because the HVC space mirrors human color perception. Instead of the intensity distribution, the system uses the hue histogram distribution to perform the comparison. This reduces, to some extent, the effect of variations in intensity values due to light changes and flashes. Pass et al. [7] proposed a histogram refinement algorithm using color coherence vectors based on local spatial coherence. Wolf [8] provided an algorithm to detect key frames in MPEG video files using optical flow motion analysis. Sethi et al. [9] used a similarity metric that matches both the hue and saturation components of the HSI color space between two images. Pass et al. [10] defined the notion of joint histograms, which use many local features in the comparison, including color, edge density, texture, gradient magnitude and the rank of the pixels.
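The combination of temporal skipping and hue histogram comparison can be illustrated with the following sketch. It is not the algorithm of [6]: the HSV hue from Python's standard colorsys module is used here as a stand-in for the HVC hue, and the function names, the bin count, the sampling step and the threshold are assumptions made for the example.

```python
import colorsys
from typing import List, Sequence, Tuple

# Assumption: a frame is a flat sequence of (R, G, B) pixels, each channel 0..255.
Pixel = Tuple[int, int, int]


def hue_histogram(frame: Sequence[Pixel], bins: int = 16) -> List[float]:
    """Normalized hue histogram (HSV hue used as a stand-in for HVC hue)."""
    if not frame:
        raise ValueError("frame must not be empty")
    hist = [0.0] * bins
    for r, g, b in frame:
        h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hist[min(int(h * bins), bins - 1)] += 1.0
    n = float(len(frame))
    return [count / n for count in hist]


def histogram_difference(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Half the L1 distance between two normalized histograms, in [0, 1]."""
    return 0.5 * sum(abs(x - y) for x, y in zip(h1, h2))


def detect_cuts_sampled(
    frames: Sequence[Sequence[Pixel]],
    step: int = 2,
    threshold: float = 0.5,
    bins: int = 16,
) -> List[int]:
    """Compare every step-th frame; return sampled indices where a change is flagged."""
    if step < 1:
        raise ValueError("step must be at least 1")
    if not frames:
        return []
    cuts = []
    prev = hue_histogram(frames[0], bins)
    for i in range(step, len(frames), step):
        cur = hue_histogram(frames[i], bins)
        if histogram_difference(prev, cur) > threshold:
            cuts.append(i)
        prev = cur
    return cuts
```

Because the histogram discards pixel positions, two frames that contain the same colors in different places produce a difference of zero, which is why this comparison tolerates object and camera motion better than pairwise pixel comparison. The price of temporal skipping is that a cut is located only to within one sampling step.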

After surveying the current approaches to video indexing and segmentation, we found that most require extensive processing power, memory and storage. In addition, most of them are stand-alone applications, filtering processes or algorithms. These algorithms, and the current efforts in video analysis and summarization, were developed mainly to solve specific video processing problems. They were not designed for high volumes of requests or for the run-time customization procedures that such use requires. They are not necessarily optimized for processing time in addition to accuracy and efficiency. We therefore developed a new algorithm, called the binary penetration algorithm, for video indexing, segmentation and key framing. We designed the algorithm to be evaluated by its processing time as well as by the accuracy of its results. In addition, we believe that our approach could be used as an orthogonal processing mechanism within other video indexing functions, such as the detection of gradual transitions and the classification of camera motions.

Meanwhile, as noted earlier, the proliferation and variety of end-user devices available to mobile users have made it necessary to adapt the presentation of multimedia services accordingly. There are ongoing research and standardization efforts to enable location-based mobile access to multimedia content. One example is the MPEG-7 working document on mobile requirements and applications [11] from the Moving Picture Experts Group (MPEG). MPEG is currently working on enabling mobile access to specific MPEG-7 information using GPS, context-aware and location-dependent technologies. A user would be able to watch the trailers of current movies wirelessly from the nearest cinema while walking or driving. Another related ongoing standardization effort is the web services architecture [12], a framework for building a distributed computing platform over the web that enables service requesters to find and remotely invoke services published by service providers. This framework uses specific standards such as XML, the Web Services Description Language (WSDL) for defining a service and its interfaces, and the Simple Object Access Protocol (SOAP), an XML-based protocol for accessing services, objects and components in a platform-independent manner. The web services architecture also utilizes the Universal Description, Discovery and Integration (UDDI) directory and APIs for registering, publishing and finding web services, similar to the white and yellow pages of a telephone directory. It also uses the Web Services Inspection Language (WSIL) for locating service providers and retrieving their description documents (WSDL).
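To make the role of SOAP concrete, the following sketch builds a SOAP 1.1 request envelope with Python's standard xml.etree.ElementTree module. Only the envelope namespace is taken from the SOAP 1.1 specification. The service namespace urn:example:video-indexing, the operation DetectShots and its parameters videoUrl and threshold are hypothetical names invented for this illustration and do not describe the interface of any actual service.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
VIDEO_NS = "urn:example:video-indexing"  # hypothetical service namespace


def build_request(video_url: str, threshold: float) -> str:
    """Return a SOAP 1.1 envelope invoking a hypothetical DetectShots operation."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # Hypothetical operation and parameter names, for illustration only.
    operation = ET.SubElement(body, f"{{{VIDEO_NS}}}DetectShots")
    ET.SubElement(operation, f"{{{VIDEO_NS}}}videoUrl").text = video_url
    ET.SubElement(operation, f"{{{VIDEO_NS}}}threshold").text = str(threshold)
    return ET.tostring(envelope, encoding="unicode")
```

In a real deployment, the operation and its message types would be declared in the WSDL document of the service, and a requester would typically generate such envelopes from that description rather than build them by hand.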

We will provide a mechanism to utilize our algorithms as a video web service over the World Wide Web for multiple video formats. An XML interface for this video service has been designed and implemented. Thus, users (either mobile or static), or even peer services within the enterprise, can utilize the video indexing service seamlessly, while the details of the video processing algorithms themselves remain hidden.

Handbook of Video Databases. Design and Applications
ISBN: 084937006X
Year: 2003
Pages: 393