3. Analysis for Content Classification


TV programs fall largely into two categories: narrative and non-narrative. Movies, dramatic series, situation comedies, soap operas, etc., fall into the narrative category. These programs focus on telling a single story, or a piece of a story, over an entire episode. Narrative programs are often defined by their characters, themes, environments, and unfolding story arcs. Talk shows, news, financial programs, sporting events, etc., fall into the non-narrative category. They often have no single story line but do have a clearly recognizable structure (format). For example: local news programs are often made up of many stories told one at a time. These stories are grouped into thematic segments such as breaking news, local news, sports news, and weather.

3.1 Non-Narrative Programs

The main focus of our work has been the segmentation of non-narrative programs, which are more easily addressed with current content-based retrieval technology. So far, most segmentation and retrieval advances have been made for news programs, with some advances for talk shows and sports. The main challenges have been accurate story classification and segmentation (boundary detection).

3.1.1 Segment Classification

In general, program metadata (if available) is useful for determining whether a program is, for example, financial news or a talk show. However, metadata listing the current program genre is not always available and is sometimes inaccurate. Therefore, we test each segment in order to classify it properly. In addition, a news program can contain celebrity, financial, sports, weather, political, and other types of segments. Visual, audio, and transcript data are used together to resolve conflicts and make the right inferences. For example: a segment from a financial news program often contains faces and videotext onscreen at the same time. Also, a segment that has a lot of financial keywords might be financial news; however, if there is background noise such as audience clapping and laughter, it may instead be a talk show with jokes about the stock market's rise or fall.
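The way audio evidence can override a transcript-based guess may be easier to see in code. The following is only a minimal sketch of one possible rule-based fusion, not the method actually used in the system described here. The `SegmentCues` fields, the `classify_segment` function, the label strings, and the threshold values are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class SegmentCues:
    """Hypothetical per-segment cues; the field names are illustrative only."""
    financial_keyword_ratio: float  # fraction of transcript words that are financial terms
    audience_noise: bool            # clapping or laughter detected in the audio track
    face_with_videotext: bool       # a face and on-screen text appear at the same time


def classify_segment(cues: SegmentCues, keyword_threshold: float = 0.05) -> str:
    """Combine transcript, audio, and visual cues into one segment label.

    Assumption: audience noise outweighs financial keywords, mirroring the
    talk-show example in the text. The threshold is an arbitrary placeholder.
    """
    # Audio cue: clapping or laughter suggests a talk show, even when the
    # transcript is full of financial vocabulary (e.g. stock market jokes).
    if cues.audience_noise:
        return "talk_show"

    # Visual cue: a face shown together with videotext supports the
    # financial-news hypothesis, so less keyword evidence is required.
    required = keyword_threshold / 2 if cues.face_with_videotext else keyword_threshold
    if cues.financial_keyword_ratio >= required:
        return "financial_news"

    return "unknown"
```

A real classifier would more likely weigh many such cues statistically rather than apply fixed rules; the sketch only shows how conflicting modalities can be resolved in a fixed order.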

3.1.2 Story Segmentation

Story segmentation depends on the program genre. For example, news programs typically consist of one story after another grouped thematically: sports stories, weather stories, local stories, national stories; figure skating broadcasts usually show one competitor at a time; talk shows almost always consist of host segments followed by guest segments. The challenge is to correctly identify the boundaries of individual story segments within the broadcast.

We chose to use transcript cues to provide coarse boundary indicators. In addition, genre-specific visual and audio cues are used to refine these boundaries. For our prototype we analyzed financial news and talk shows using this process. In the case of talk shows, we identify when the guest is introduced by searching for introduction cues such as "my next guest...". We then search for the guest's actual appearance on the screen by looking for entry cues such as "please welcome....". We analyze the transcript data that falls between the introduction cue and the entry cue using a categorizer to find the main topic that will be discussed, such as the release of a new movie or a new album.
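The talk-show cue search described above can be sketched roughly as follows. This is an illustrative approximation only: the transcript representation (a list of timestamped caption lines), the `find_guest_segment` function, and the cue tuples are assumptions made for the example, and the categorizer that would consume the extracted text is not shown.

```python
from typing import List, Optional, Tuple

# Assumed transcript format: (start time in seconds, caption text).
Line = Tuple[float, str]

# Example cue phrases taken from the text; a real system would need more.
INTRO_CUES = ("my next guest",)
ENTRY_CUES = ("please welcome",)


def find_guest_segment(transcript: List[Line]) -> Optional[Tuple[float, float, str]]:
    """Return (intro_time, entry_time, text_between) for the first guest found.

    The returned text runs from the introduction cue through the entry cue
    and is what would be handed to a topic categorizer. Returns None when no
    introduction cue followed by an entry cue is present.
    """
    intro_time: Optional[float] = None
    between: List[str] = []

    for time, text in transcript:
        lowered = text.lower()

        # Step 1: look for the introduction cue ("my next guest...").
        if intro_time is None and any(cue in lowered for cue in INTRO_CUES):
            intro_time = time

        # Step 2: once introduced, collect text until the entry cue appears.
        if intro_time is not None:
            between.append(text)
            if any(cue in lowered for cue in ENTRY_CUES):
                return intro_time, time, " ".join(between)

    return None
```

The two timestamps give only coarse boundaries; as the text notes, visual and audio cues would then be used to refine them.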

3.2 Narrative Programs

Segmentation of narrative programs normally implies understanding the high-level semantics in order to identify the underlying structure, the conflict and the resolution [11]. This is a difficult task given the type of processing power we expect users to have in their living rooms. In addition, our user tests on different segmentation concepts revealed that users did not want narrative content cut into pieces unless the segmentation either produced a preview for the entire program or removed all TV commercials. Since creating an accurate preview seemed to be beyond Scout's semantic abilities, we decided instead to focus on a high-level overview for narrative content, clearly identifying program segments and commercial segments. In addition, Scout can provide users with some additional segmentations that may have value to a small number of users. For example: using face and voice detection, we can find all shots that contain a specific actor; using music detection, we can find musical numbers within a musical; examining the colors on screen, we can find baseball game scenes within a movie about baseball.




Handbook of Video Databases: Design and Applications (Internet and Communications)
ISBN: 084937006X
Year: 2003
Pages: 393
