Automatic Video Tagging using Content Redundancy
Stefan Siersdorfer (1), Jose San Pedro (2), Mark Sanderson (2)
(1) L3S Research Center, Germany; (2) University of Sheffield, UK
SIGIR 2009
2009. 11. 06.
Summarized and Presented by Hwang Inbeom, IDS Lab., Seoul National University
Copyright 2009 by CEBT
Large Amount of Data on YouTube
Traffic to/from YouTube accounts for over 20% of total web traffic
It comprises 60% of videos watched online
Growing beyond human perception
Necessity to provide effective knowledge mining and retrieval tools
Knowledge Mining and Retrieval
Making use of human annotation: folksonomy
Provides relevant results at a relatively low cost
Applications
– Topic detection and tracking
– Information filtering
– Document ranking
– Etc.
However, content-based retrieval techniques are not mature enough
Folksonomy-based techniques outperform content-based techniques
Problem: Poorly Annotated YouTube Videos
Hard to annotate videos:
Intellectually expensive process
Time-consuming job
Low-quality tags:
Often very sparse
Lack consistency
Present numerous irregularities
Difficult to provide retrieval and knowledge extraction that rely on textual features
Motivation
Significant amount of near-duplicate videos:
Over 25% of the videos in search results detected as near-duplicates
Has been considered a problem of online videos
The authors see this redundancy as a feature:
Linkage between two different videos
Exploit redundancies to obtain richer video annotations
PageRank-like Graph of Videos
Overlap graph G_O = (V_O, E_O)
Edge in Graph
An edge means that videos i and j share redundant visual information
Three types of links:
Duplicate videos
Part-of relationship
Overlapping
[Figure: overlapping videos i and j]
Related Work: VisualRank (WWW 2008)
Builds a graph of images from the visual similarity between pairs of images
Runs PageRank algorithm to re-rank images
Automatic Tagging
A different approach from that of VisualRank:
Aims to enrich annotations
Not to improve search results
Three methods:
Simple neighbor-based tagging
Overlap redundancy aware tagging
TagRank: Context-based tag propagation in video graphs
Simple Neighbor-based Tagging
Transforms G_O into the directed graph G'_O = (V'_O, E'_O) of overlapping videos
The weight w(v_i, v_j) of edge (i, j) describes to what degree video j is covered by video i
[Figure: videos i and j linked by directed edges with weights w(v_i, v_j) and w(v_j, v_i)]
Simple Neighbor-based Tagging (contd.)
Gets tag t's relevance score for a video from the information of adjacent videos:
Weighted sum of the influences of overlapping videos tagged with t
Counts only adjacent videos' tags
$$I(t, v_j) = \begin{cases} 1 & \text{if } v_j \text{ is tagged with } t \\ 0 & \text{otherwise} \end{cases}$$

$$rel(t, v_i) = \sum_{(v_j, v_i) \in E'_O} I(t, v_j) \cdot w(v_j, v_i)$$
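The weighted sum over adjacent videos can be sketched in a few lines of Python. This is only an illustration: the function name `neighbor_tag_relevance`, the dictionary layout, and the toy graph are assumptions made for this example and do not come from the paper.

```python
from collections import defaultdict


def neighbor_tag_relevance(edges, tags):
    """Sum w(v_j, v_i) over incoming edges (v_j, v_i) whose source v_j carries tag t.

    edges: dict mapping (v_j, v_i) -> weight w(v_j, v_i)   (assumed layout)
    tags:  dict mapping video id -> set of user-provided tags
    Returns a dict: video id -> {tag: relevance score}.
    """
    rel = defaultdict(lambda: defaultdict(float))
    for (vj, vi), weight in edges.items():
        # I(t, v_j) is 1 exactly for the tags present on v_j, so only those contribute.
        for tag in tags.get(vj, set()):
            rel[vi][tag] += weight
    return {video: dict(scores) for video, scores in rel.items()}


# Hypothetical toy graph: videos "a" and "b" both overlap with "c".
edges = {("a", "c"): 0.5, ("b", "c"): 0.25, ("c", "a"): 1.0}
tags = {"a": {"goal", "soccer"}, "b": {"soccer"}, "c": {"final"}}
scores = neighbor_tag_relevance(edges, tags)
```

In this toy graph, video "c" receives "soccer" from both neighbors, so its score for that tag is the sum of the two edge weights.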
An Example
[Figure: a video's relevance score for tag t, computed from adjacent videos tagged with t]
Overlap Redundancy Aware Tagging
A video with multiple redundant overlaps can receive a disproportionately high relevance score
The contribution of repeated occurrences of the same tag is reduced by a relaxation parameter
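[Figure: the contributions $w_1 I(t, v_1)$, $w_2 I(t, v_2)$, $w_3 I(t, v_3)$ become $w_1 I(t, v_1)$, $\alpha w_2 I(t, v_2)$, $\alpha^2 w_3 I(t, v_3)$, where $\alpha$ is the relaxation parameter]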
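A minimal sketch of the relaxation idea follows. It assumes a relaxation parameter, called `alpha` here, with a value between 0 and 1, and it assumes that contributions are damped in decreasing order of edge weight. The slide does not state the ordering, so both the ordering and the function name `redundancy_aware_relevance` are assumptions of this example.

```python
def redundancy_aware_relevance(incoming, tags, tag, alpha=0.5):
    """Damp the k-th contribution of the same tag by alpha**k (k = 0, 1, 2, ...).

    incoming: list of (v_j, w(v_j, v_i)) pairs for one target video v_i
    tags:     dict mapping video id -> set of tags
    alpha:    relaxation parameter; alpha = 1 gives back the plain weighted sum
    """
    # Keep only neighbors tagged with `tag`, largest weight first (assumed ordering).
    weights = sorted(
        (weight for vj, weight in incoming if tag in tags.get(vj, set())),
        reverse=True,
    )
    return sum(weight * alpha ** k for k, weight in enumerate(weights))


# Hypothetical neighbors of one video, all tagged "soccer".
incoming = [("a", 0.5), ("b", 1.0), ("d", 0.25)]
tags = {"a": {"soccer"}, "b": {"soccer"}, "d": {"soccer"}}
relaxed = redundancy_aware_relevance(incoming, tags, "soccer", alpha=0.5)
plain = redundancy_aware_relevance(incoming, tags, "soccer", alpha=1.0)
```

With `alpha` below 1, each additional redundant neighbor adds less than the one before it, which limits the score inflation caused by many overlapping copies.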
TagRank
Tag weight propagates through the overlap graph
Relevance scores are computed in matrix form
TR converges to a fixed value, which is computed with the power iteration method
Start the power iteration from the original tagging information and limit the number of iterations
– To keep the original tag relevance
– To prevent TR(t) from converging to a uniform value
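The truncated iteration can be illustrated with a small sketch. The slides do not give the exact update rule, so this example assumes that each step replaces a video's score for tag t with the weighted sum of its in-neighbors' scores (one multiplication by the weight matrix), without normalization. The name `tag_rank` and its arguments are hypothetical.

```python
def tag_rank(edges, videos, initial, iterations=3):
    """Propagate one tag's scores along weighted edges for a fixed number of steps.

    edges:      dict mapping (v_j, v_i) -> weight w(v_j, v_i)
    videos:     iterable of all video ids
    initial:    dict video id -> starting score (e.g. 1.0 if tagged with t)
    iterations: number of propagation steps; kept small on purpose
    """
    videos = list(videos)
    scores = {v: float(initial.get(v, 0.0)) for v in videos}
    for _ in range(iterations):
        nxt = {v: 0.0 for v in videos}
        for (vj, vi), weight in edges.items():
            nxt[vi] += weight * scores[vj]  # v_j passes weighted score to v_i
        scores = nxt
    return scores


# Hypothetical chain a -> b -> c, with only "a" originally tagged.
videos = ["a", "b", "c"]
edges = {("a", "b"): 0.5, ("b", "c"): 0.5}
one_step = tag_rank(edges, videos, {"a": 1.0}, iterations=1)
two_steps = tag_rank(edges, videos, {"a": 1.0}, iterations=2)
```

After one step the tag reaches only the direct neighbor; after two steps it reaches videos two hops away, which plain neighbor-based tagging never touches.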
Evaluation
Two kinds of evaluation: machine-oriented and human-oriented views
Data organization with automatically generated tags
– Classification
– Clustering
User-based evaluation
Data Collection
38,283 videos: initial set C
Videos returned for the top 500 general queries
Together with the related videos given with the results
Redundancy analysis
Over 35% of videos (V_O) overlap with one or more other videos
Data Organization
Classification with 7 YouTube categories
Each category contains over 900 videos in V_O
Binary classification with SVM
– Feature vectors constructed with original tags/automatically generated tags
Four strategies
– BaseOrig: Only considering user-provided tags
– NTag: Simple Neighbor-based tagging
– RedNTag: Overlap redundancy aware tagging
– TagRankΓ: TagRank with Γ iterations
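As a rough sketch of the feature construction step, the function below turns per-video tag weights into fixed-length vectors that a classifier such as an SVM could consume. The slides do not specify the weighting scheme or the vocabulary handling, so the name `build_feature_vectors`, the weight-as-feature choice, and the toy data are all assumptions.

```python
def build_feature_vectors(video_tags, vocabulary=None):
    """Map each video's {tag: weight} dict to a dense vector over a shared vocabulary.

    video_tags: dict video id -> {tag: weight}; user tags might get weight 1.0 and
                automatically generated tags their relevance score (assumption).
    vocabulary: optional fixed tag list; built from the data when omitted.
    """
    if vocabulary is None:
        vocabulary = sorted({tag for weights in video_tags.values() for tag in weights})
    index = {tag: position for position, tag in enumerate(vocabulary)}
    vectors = {}
    for video, weights in video_tags.items():
        vector = [0.0] * len(vocabulary)
        for tag, weight in weights.items():
            if tag in index:  # tags outside a fixed vocabulary are ignored
                vector[index[tag]] = weight
        vectors[video] = vector
    return vocabulary, vectors


# Hypothetical input: "c" has one user tag plus one propagated tag with score 0.75.
video_tags = {
    "a": {"soccer": 1.0, "goal": 1.0},
    "c": {"final": 1.0, "soccer": 0.75},
}
vocabulary, vectors = build_feature_vectors(video_tags)
```

Under this setup, the BaseOrig strategy would pass only user-provided tags, while the other strategies would add the automatically generated tags before building the vectors.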
Data Organization (contd.)
Clustering:
k-Means clustering
Partition videos into k categories
Neighbor-based tagging and overlap redundancy aware tagging outperform the baseline and TagRank methods in both experiments
User-based Evaluation
Assessors rate new tags through a web interface
The average score increases as only tags with higher autotag relevance scores are considered
Conclusions
Content redundancy in social sharing systems can be used to obtain richer annotations
Additional information obtained by automatic tagging can largely improve automatic organization of content
There is also an information gain for users
Future work:
The authors plan to generalize this work to other domains
– Photos in Flickr
– Text in Delicious
Analysis and generation of deep tags
– Tags linked to a small part of larger media source
Discussion
Good idea and good formalization
It would be better if TagRank performed well
Considering only neighbors is too naïve a method
How can we deal with overhead of visual processing?
Would it be scalable enough to apply it to all videos in YouTube?