Crossing Media for Video Search: enabling usability beyond traditional broadcast & TV
Katerina Pastra and Stelios Piperidis
Language Technology Applications,
Institute for Language and Speech Processing, Athens, Greece
The “Pervasive Digital Video” Era
TV sets extended with “intelligent” DVRs, set-top boxes with PC-like functionalities, linked with PCs that display streamed video and allow interaction through gaming consoles
Video viewing transferred beyond the TV set, to mobiles & iPods, allowing on-the-move viewing
Video broadcast carried through broadband using IP
Video content (professional and/or consumer-generated) exchanged through file swapping and headline syndication technology
A New Era in Video Search?
From the digital video libraries context to the new pervasive digital video reality:
The scope of video search (indexing & retrieval) technologies is broadened and their role is reinforced
Does pervasive digital video affect the “search” in video search technologies? (imposes new challenges)
Does video search affect the “pervasiveness” of digital video? (affects usability of available video and corresponding new technologies)
Overview
Video search: the market perspective
- market players and video search developers
- video search in commercial prototypes
Video search research prototypes
- lessons from the digital video library scenarios
New technological challenges
Dealing with new challenges
- Suggestions from the REVEAL THIS project
- Using cross-media decision mechanisms
Video Search & the Market Players
[Diagram: market players, linked by transitive dependencies and dependency trends]
- Web content aggregators, content service providers, content repackaging companies
- TV service providers & file-swapping networks
- Electronics manufacturers
- Content owners
- IPTV software developers
- ISP, computer networking & phone companies
- Video search software developers
Video search mechanisms in the market
Characteristics:
- Use of owner/broadcaster created metadata
- Text-based search on closed captions or ASR of speech stream
- Processing of English files mostly
- Keyword query (restricted semantic expansion)
- Retrieval unit is either the whole video or a short segment where the keyword appears (+ a few seconds before and after)
Are such mechanisms efficient?
- Quest for coherence
- Not always present
- Not robust
- Other languages?
- “Find the right keyword” problem!
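The keyword-based retrieval unit described above (a short segment around each keyword hit, plus a few seconds before and after) can be illustrated with a minimal sketch. It assumes captions are available as timestamped text; the `Caption` type, the `keyword_segments` function and the 5-second padding are hypothetical names and values chosen for illustration, not taken from any commercial system.

```python
from dataclasses import dataclass


@dataclass
class Caption:
    start: float  # caption start time in seconds
    end: float    # caption end time in seconds
    text: str     # closed-caption or ASR text


def keyword_segments(captions, keyword, pad=5.0):
    """Return merged (start, end) windows around captions containing the exact keyword."""
    kw = keyword.lower()
    hits = []
    for c in captions:
        # Exact token match only: no stemming, no semantic expansion.
        if kw in c.text.lower().split():
            hits.append((max(0.0, c.start - pad), c.end + pad))

    # Merge windows that overlap after padding.
    merged = []
    for start, end in sorted(hits):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


captions = [
    Caption(0.0, 4.0, "welcome to the evening news"),
    Caption(4.0, 9.0, "the parliament voted on the budget today"),
    Caption(30.0, 34.0, "in sport the final ended in a draw"),
    Caption(36.0, 40.0, "budget talks resume tomorrow"),
]

print(keyword_segments(captions, "budget"))  # two separate padded windows
print(keyword_segments(captions, "vote"))    # no hit: "voted" is not "vote"
```

In this toy setting the second query returns nothing although the topic is present, which is one face of the “find the right keyword” problem; the fixed-length windows also ignore story boundaries, which relates to the quest for coherence.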
Video Search in Research
The digital video library access scenarios prevailed in research projects up until the ’00s (cf. Informedia, TRECVid etc.)
Video indexing & retrieval prototypes explored a variety of unimodal and multimodal mechanisms that go beyond commercially offered video search
Lessons learned (Hauptmann and Christel 2004):
- Fusion of medium-specific retrieval results boosts video retrieval performance slightly (vs. e.g. text-based only retrieval)
- Fusion based on linear weights that depend on the query type is helpful
- Text query/ASR enhancement, relevance feedback and feature-concept associations are all helpful
MM approaches slightly better. Necessary?
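The idea of fusing medium-specific retrieval results with linear weights chosen per query type can be sketched as follows. This is only an illustration under stated assumptions: the media names, the query types, the weight values and the scores are all hypothetical, and the cited work does not prescribe this code.

```python
def fuse_scores(scores_by_medium, weights):
    """Rank videos by a weighted linear sum of their per-medium retrieval scores."""
    fused = {}
    for medium, scores in scores_by_medium.items():
        weight = weights.get(medium, 0.0)  # media without a weight contribute nothing
        for video_id, score in scores.items():
            fused[video_id] = fused.get(video_id, 0.0) + weight * score
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical weights: each query type trusts the media differently.
QUERY_TYPE_WEIGHTS = {
    "named_person": {"text": 0.5, "face": 0.5, "image": 0.0},
    "scene": {"text": 0.5, "face": 0.0, "image": 0.5},
}

# Hypothetical per-medium scores in [0, 1] for two videos.
scores = {
    "text": {"v1": 0.5, "v2": 0.5},
    "face": {"v1": 1.0, "v2": 0.0},
    "image": {"v1": 0.0, "v2": 1.0},
}

print(fuse_scores(scores, QUERY_TYPE_WEIGHTS["named_person"]))  # v1 ranked first
print(fuse_scores(scores, QUERY_TYPE_WEIGHTS["scene"]))         # v2 ranked first
```

Text alone cannot separate the two videos here; the query-type weights decide which additional medium breaks the tie.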
Video search challenges in the new context
From advanced computer users to laymen
- type & quality of query
- expectations & requirements on retrieval accuracy
- length of retrieval unit
- domain and language of data
From structured data collections to pervasive video
- genre, domain, language, source/structure variation
- consumer-generated / noisy / low quality and professional content
- broadcast metadata, closed captions etc. availability
From static to dynamic search
- VoD & real-time broadcast data
- re-active & pro-active, personalised search (push and pull)
The ideal search mechanism?
Video search prototypes in the ’00s
Multimedia approaches for video search suggested by research projects with application scenarios related to the new pervasive digital video context, e.g. leisure and entertainment in the digital home, and/or for the mobile user (e.g. UP-TV, BUSMAN, AceMedia)
Image features – language concepts association for video search suggested (multimedia integration)
REVEAL THIS goes a step further in suggesting the use of cross-media decision mechanisms
FP6 funded project (Nov. 2004-April 2007)
http://www.reveal-this.org
A system that offers both types of service:
a) Multimedia and Cross-lingual Information Retrieval (pull)
b) Multimedia and Cross-lingual Information Filtering (push)
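The difference between the two service types can be shown with a small sketch: in pull, the user's query is run against stored content; in push, each newly arriving item is matched against stored user profiles. The `Item` and `Profile` types and the category-matching rule are assumptions made for illustration only, and the cross-lingual aspect is left out.

```python
from dataclasses import dataclass


@dataclass
class Item:
    item_id: str
    categories: set  # category labels assigned to the item


@dataclass
class Profile:
    user: str
    interests: set  # category labels the user wants to be notified about


def pull(archive, query_category):
    """Retrieval (pull): the user asks, the system searches the stored items."""
    return [item.item_id for item in archive if query_category in item.categories]


def push(new_item, profiles):
    """Filtering (push): a new item arrives, the system selects users to notify."""
    return [p.user for p in profiles if p.interests & new_item.categories]


archive = [Item("a", {"politics"}), Item("b", {"travel"})]
profiles = [Profile("ann", {"politics"}), Profile("bob", {"sport"})]

print(pull(archive, "travel"))                            # ['b']
print(push(Item("c", {"politics", "news"}), profiles))    # ['ann']
```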
REVEAL THIS Use Scenario
[Diagram: use scenario. Labels: TV, Radio, Web data; Reveal-THIS technology; Media Server (Content Aggregator); Local Archive; Media Archive; Search archive; Delivery; User; User profile; Mobile phone and Web interfaces]
EN-EL, European Parliament plenary sessions & press-conferences, national news, travel documentaries & info
REVEAL THIS System Architecture
[Diagram: system architecture]
Input: Web text, radio, TV (audio, text, video)
Media Manager
Processing components:
- SPC: speech processing
- TPC: text processing
- IAC: keyframe extraction, image analysis & image categorization
- FDIC: face detection & identification
- Story Boundary Detection
- Text categorisation
- Cross-media Indexing, Cross-media Categorization, Multimedia Summarization
- Translation
Automatically extracted metadata:
- TPC: named entities, terms, facts
- Text categories
- SPC: speaker turns, speaker names, text
- IAC: shot cuts, keyframes, image features, image categories
- FDIC: face regions, names
Output: cross-media stories
Media Server:
- Storage
- Personalisation
- Browse
- Query
- Retrieve
- Push Notifications
The notion of Cross-Media Decision Mechanisms
Mechanisms that decide on the relation that holds between medium-specific pieces of information:
- across documents (Boll et al. 1999)
- within documents (Pastra 2006)
The mechanisms decide whether medium-specific pieces of information within the same multimedia document are:
- associated (multimedia integration)
- complementary
- semantically compatible/incompatible
complementarity, independence, equivalence
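To make the notion of deciding on a relation between medium-specific pieces of information more concrete, here is a deliberately simplified toy sketch. It assumes each medium has already been reduced to a set of concept labels, and it uses plain set overlap as the decision rule; both the representation and the rule are illustrative assumptions and are not the REVEAL THIS mechanism, which the slides do not specify.

```python
from enum import Enum


class Relation(Enum):
    EQUIVALENCE = "equivalence"
    COMPLEMENTARITY = "complementarity"
    INDEPENDENCE = "independence"


def decide_relation(text_concepts, image_concepts):
    """Toy rule: compare the concept sets extracted from text and from image."""
    text_set, image_set = set(text_concepts), set(image_concepts)
    shared = text_set & image_set
    if not shared:
        # Nothing in common: treat the two media as carrying separate messages.
        return Relation.INDEPENDENCE
    if text_set == image_set:
        # Same concepts in both media: treat them as saying the same thing.
        return Relation.EQUIVALENCE
    # Partial overlap: each medium adds something to the other.
    return Relation.COMPLEMENTARITY


print(decide_relation({"parliament", "vote"}, {"parliament", "vote"}))
print(decide_relation({"parliament", "vote"}, {"parliament", "speaker"}))
print(decide_relation({"budget"}, {"beach"}))
```

A real mechanism would need semantic knowledge rather than label matching, since the same concept is rarely expressed with identical labels in language and in image analysis output.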
Conclusions
The scope of video search technology is broadened & new technological challenges are imposed
The market players consider video search technology indispensable
Commercial video search limited; research in the digital library access context goes beyond such limitations & points to slight benefits in using multimedia fusion techniques
Research with new application scenarios (iTV etc.) emphasizes the necessity of such mechanisms & introduces the notion of x-media decision mechanisms
Efficient video search is indispensable for usability beyond traditional broadcast and TV; x-media decision mechanisms may hold the key for achieving it