This work shows that a temporal video sequence can be visualized as a trajectory of points in a multi-dimensional feature space. By studying different types of transitions in different categories of video, we observe that shot boundary detection is a temporal multi-resolution phenomenon. From this insight, we develop the TMRA algorithm for video segmentation, which applies Canny wavelets to the video stream and analyses it at multiple temporal resolutions. The solution is a general framework for all kinds of video transitions.
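As a rough illustration of the idea summarized above, the following sketch treats a video as a trajectory of per-frame feature vectors, filters it with a first-derivative-of-Gaussian (Canny) kernel at several temporal scales, and reports local maxima of the response magnitude as candidate transitions. It is a minimal sketch under stated assumptions, not the implementation evaluated in this work: the function names, the L1 kernel normalization, the choice of scales, and the single fixed threshold are all hypothetical.

```python
import numpy as np

def canny_kernel(sigma):
    """First derivative of a Gaussian, sampled on [-4*sigma, 4*sigma].

    The L1 normalization is an illustrative choice: it makes the response
    to an ideal unit step equal to 0.5 at every scale.
    """
    radius = int(np.ceil(4 * sigma))
    t = np.arange(-radius, radius + 1, dtype=float)
    k = -t * np.exp(-t**2 / (2.0 * sigma**2))
    return k / np.abs(k).sum()

def transition_strength(features, sigma):
    """Magnitude of the filter response for each frame.

    features: array of shape (num_frames, dim), one feature vector per frame.
    """
    k = canny_kernel(sigma)
    r = k.size // 2
    # Repeat the first and last frames so the output has one value per frame.
    padded = np.pad(features, ((r, r), (0, 0)), mode="edge")
    resp = np.stack(
        [np.convolve(padded[:, d], k, mode="valid") for d in range(features.shape[1])],
        axis=1,
    )
    return np.linalg.norm(resp, axis=1)

def local_maxima(strength, threshold):
    """Frames whose strength exceeds the threshold and is a local maximum."""
    return [
        t for t in range(1, len(strength) - 1)
        if strength[t] > threshold
        and strength[t] >= strength[t - 1]
        and strength[t] > strength[t + 1]
    ]

def detect_transitions(features, sigmas=(1.0, 8.0), threshold=0.4):
    """Candidate transition frames at each temporal scale (hypothetical defaults)."""
    return {s: local_maxima(transition_strength(features, s), threshold) for s in sigmas}
```

In this sketch an abrupt cut produces a strong response at every scale, whereas a gradual transition responds weakly at the finest scale and becomes prominent only at coarser ones, which is the sense in which boundary detection depends on temporal resolution.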
TMRA was implemented and tested on the whole MPEG-7 video data set, which consists of over 13 hours of video. The results demonstrate that the method is very effective and tolerates most common types of noise occurring in videos. The results also show that the method is independent of the feature space and works particularly well with compressed-domain DC features.
This work can be extended in two directions. First, we will investigate the use of other features, especially those at the semantic level, to analyse video data. A well-constructed feature space can help not only to improve the performance of the algorithm but also to extract the structure of video data.
Second, more appropriate mother wavelets can be selected for a suitable feature space when performing TMRA on the video stream. An appropriate choice of wavelet and feature space may yield better techniques for detecting and recognizing gradual transitions.
The authors would like to thank the National Science and Technology Board and the Ministry of Education of Singapore for supporting this research under research grant RP3989903.