IBM/Columbia team at TRECVID 2004
© 2004 IBM Corporation
About The Tradeoff Between Searching Time and Browsing Time in Interactive Search
IBM Research: A. Amir, J. O. Argillander, M. Berg, G. Iyengar, J. Kender, C-Y Lin, M. Naphade, A. Natsev, J. R. Smith, J. Tesic, G. Wu, R. Yan, D. Zhang
Columbia University: S-F. Chang, W. Hsu
Presented by Arnon Amir, arnon@almaden.ibm.com
NIST TRECVID 2004 Workshop, Gaithersburg, MD, 11/16/2004
IBM Search Systems – indexed data
We extend our thanks to:
– CLIPS-IMAG – reference shot list
– CMU – Video OCR and time alignment of CC and ASR
– LDC – Video data, closed captions
– LIMSI – ASR data
– NIST – TRECVID
[Diagram: the semantic ladder]
Multimodal query retrieval:
– Speech: combined ASR, CC, and phonetic
– Video OCR
– Visual concepts
– CBIR: color histograms, color correlograms, texture
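As an illustration only (the CBIR descriptors are named above but not specified), here is a minimal Python sketch of one of them, a quantized color histogram with histogram-intersection matching; the function names are hypothetical, not from the IBM system:

import numpy as np

def color_histogram(image, bins=8):
    # Quantize an RGB image (H x W x 3, uint8) into a normalized
    # bins^3-bucket color histogram -- a common CBIR descriptor.
    q = (image // (256 // bins)).reshape(-1, 3).astype(np.int64)
    idx = q[:, 0] * bins * bins + q[:, 1] * bins + q[:, 2]
    hist = np.bincount(idx, minlength=bins ** 3).astype(float)
    return hist / hist.sum()

def histogram_intersection(h1, h2):
    # Similarity in [0, 1]; 1.0 for identical distributions.
    return float(np.minimum(h1, h2).sum())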
System 1: Marvel (“TJW”)
Shot-based retrieval
Speech, CC
CBIR
Concepts, models, filters
Session
Vector operations, fusion
Shot aggregation
Internet-based GUI
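The slide does not say how Marvel's vector operations and fusion work; as a sketch of one common choice, here is weighted linear score fusion over min-max-normalized per-modality scores (the weights and structure are illustrative assumptions):

def fuse_scores(modality_scores, weights):
    # modality_scores: {modality_name: {shot_id: score}}
    # weights: {modality_name: float}
    # Min-max normalize each modality, then combine linearly.
    fused = {}
    for name, scores in modality_scores.items():
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        for shot, s in scores.items():
            fused[shot] = fused.get(shot, 0.0) + weights[name] * (s - lo) / span
    # Return shot ids as a single ranked list (highest fused score first).
    return sorted(fused, key=fused.get, reverse=True)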
System 2: Multimedia Mining (“ARC”)
Video+time retrieval
Speech, CC, phonetic
CBIR
Concepts, Video OCR
Session
Statistical ranking
Shot aggregation
Relevance Feedback
Internet-based GUI
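Which relevance-feedback method ARC uses is not stated here; as a hedged illustration, a classic Rocchio-style update over CBIR feature vectors would look like this:

import numpy as np

def rocchio_update(query_vec, relevant, nonrelevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    # Move the query vector toward the centroid of shots the user
    # marked relevant and away from the non-relevant centroid.
    q = alpha * np.asarray(query_vec, dtype=float)
    if len(relevant) > 0:
        q = q + beta * np.mean(relevant, axis=0)
    if len(nonrelevant) > 0:
        q = q - gamma * np.mean(nonrelevant, axis=0)
    return q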
System 3: Automatic Search
Shot-based retrieval
Speech, CC
CBIR
Concepts
Vector operations, fusion
Automatic query formulation
– No human in the loop, no GUI ☺
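How the automatic system turns a topic into a query is not detailed on this slide; a toy sketch of the text half of such a topic-to-query conversion follows (the stopword list is an illustrative assumption, not the system's):

# Hypothetical stopword list; the real system's is unknown.
STOPWORDS = {"find", "shots", "of", "the", "a", "an", "with", "or", "more", "one"}

def topic_to_text_query(topic_text):
    # Keep the non-stopword terms of the topic statement as query
    # terms for the ASR/CC index.
    terms = (t.strip(".,").lower() for t in topic_text.split())
    return [t for t in terms if t and t not in STOPWORDS]

# topic_to_text_query("Find shots of Boris Yeltsin") -> ['boris', 'yeltsin']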
A Perspective on TRECVID Search
What is Manual search?
What is Interactive search?
What is the importance of browsing?
What is the importance of shot elevation?
Manual Search
Input: Multimodal topic, textual + image/video samples
Task: Form the best MM query for the given topic
Measures system + human performance:
+ Human ability to form a good query
+ System discrimination
+ System generalization
Final search query is test-set independent, scalable
Interactive Search
Input: Multimodal topic, textual + image/video samples
Task: Get me as many hits as possible, now!
Measures system + human performance:
+ Human ability to form a good query
+ Human browsing skills
+ System search capabilities
+ System browsing capabilities
Final search result is test-set dependent, not scalable
Automatic Search
Samples should include only relevant shots/images (e.g., Dow Jones showing a rise)
Input: Multimodal topic, textual + image/video samples
Task: Acquire this Previously Unseen Semantic Concept
Measures system (!) performance:
– No human in the loop
– System topic->query conversion capability
– System learning capability
– System search capability
Final result is test-set independent, scalable
Interactive Search = a Combination of Search and Browse
E.g., searching for “sport” using multiple queries
A travel in video-shot space:
• Search: travel across the entire space, collect many shots
• Browse: local, in a neighborhood: storyboard, CBIR, etc.
[Diagram: a trajectory through shot space alternating queries and local browsing – Qry1: “basketball”, Qry2: “football”, Qry3: “swim dive”, Qry4: “hockey puck ice”, with a Browse step around each query’s results]
Search and Browse
                      Search           Browse
Collecting hits       Many, ranked     One at a time, unranked*
Hits                  Query matches    Anything the user can find
Starting point        Query            Result list/table
Contribution to MAP   ?                ?
Time                  Short            Long
Processor             Computer         Human
Extent                Global           Local, “show similar ones”
Browsing
Browsing, “find more like this”
Helps in query refinement
Find and pick new matches
Feedback, shot elevation
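Shot elevation, as used here, means moving the shots the user has marked correct to the head of the result list; a minimal sketch of that operation:

def elevate_shots(result_list, marked_correct, depth):
    # Move every shot marked correct within the top `depth` positions
    # to the head of the list; relative order elsewhere is preserved.
    head = [s for s in result_list[:depth] if s in marked_correct]
    head_set = set(head)
    rest = [s for s in result_list if s not in head_set]
    return head + rest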
Multimodal Query: sport~ & basketball & (N.B.A.# | NBA$) & 04.38333
(sport~ – concept; basketball – ASR/CC; N.B.A.# – phonetic; NBA$ – VOCR; 04.38333 – CBIR)
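Reading the suffix operators off this example (~ concept, bare term ASR/CC text, # phonetic, $ video OCR, numeric ID a CBIR example) – an inference from the slide's annotation, not a documented spec – a toy term classifier for the syntax might look like:

import re

def classify_term(term):
    # Suffix conventions inferred from the slide, not a documented spec.
    if term.endswith("~"):
        return ("concept", term[:-1])
    if term.endswith("#"):
        return ("phonetic", term[:-1])
    if term.endswith("$"):
        return ("vocr", term[:-1])
    if re.fullmatch(r"[\d.]+", term):
        return ("cbir_example", term)   # e.g. a keyframe/shot id
    return ("speech", term)             # plain ASR/CC text term

# classify_term("sport~")   -> ('concept', 'sport')
# classify_term("NBA$")     -> ('vocr', 'NBA')
# classify_term("04.38333") -> ('cbir_example', '04.38333')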
The impact of Shot Elevation on MAP
Simulated scenario: for each topic,
– Take the final Interactive query as the baseline list
– Elevate all the correct shots to the top, up to depth n
Plot MAP(n)
A random result list of length K (= 1000) would yield a linear expected AP(n) of

\mathrm{MAP}(n) \;=\; \frac{1}{GT}\left(1.0 \cdot p\,n \;+\; \sum_{i=n+1}^{K}\frac{p^{2}\,n + p^{2}\,(i-n)}{i}\right),
\qquad \text{where } p = \frac{GT}{K}

with GT the number of relevant (ground-truth) shots for the topic.
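Read literally, the summand simplifies, since p²n + p²(i−n) = p²i, so each term of the sum contributes exactly p², and the expected AP is indeed linear in n (a worked check, using GT = pK):

\begin{align*}
\mathrm{AP}(n) &= \frac{1}{GT}\Bigl(p\,n + \sum_{i=n+1}^{K} p^{2}\Bigr)
                = \frac{p\,n + p^{2}(K-n)}{pK} \\
               &= p + (1-p)\,\frac{n}{K},
\end{align*}

a straight line from AP(0) = p up to AP(K) = 1.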
Shot Elevation Impact on MAP – Simulation Result
[Plot: MAP(n) vs. browsing-and-marking depth n, n = 0…1000, MAP from 0 to 1]
• Above the linear baseline means the search results are better than random
• Labelling the top 100 gives more than 100% improvement in MAP
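The curve can be reproduced by simulation under the stated assumptions (random list of length K = 1000, all correct shots within depth n elevated); a sketch, with GT = 100 chosen arbitrarily:

import random

def average_precision(relevance):
    # AP of a binary relevance list, normalized by the number of
    # relevant items in the list.
    total = sum(relevance)
    hits, ap = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            ap += hits / rank
    return ap / total if total else 0.0

def elevated_ap(n, gt=100, k=1000, trials=200):
    # Expected AP after elevating all correct shots found in the
    # top n of a random length-k list containing gt relevant shots.
    total = 0.0
    for _ in range(trials):
        rel = [1] * gt + [0] * (k - gt)
        random.shuffle(rel)
        elevated = sorted(rel[:n], reverse=True) + rel[n:]
        total += average_precision(elevated)
    return total / trials

# elevated_ap(0) ~ 0.1 (= gt/k); elevated_ap(1000) = 1.0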
Shot Elevation Impact on Per-Topic AP
[Plot: AP(n) vs. browsing-and-marking depth n, n = 0…1000, one curve per topic]
Interactive Search Runs
Results can be much higher than for Manual search
Does it scale up for large data sets?
[Bar chart: MAP (0.00–0.40) of IBM.Interactive_1_ARC, IBM.Interactive_2_ARC, and IBM.Interactive_Marvel_TJW among all Interactive, Manual, and Automatic runs]
Manual Search Runs
Scalable: the same query may be run on large data sets
[Bar chart: MAP (0.00–0.40) of IBM.SpeechBaseline_ARC, IBM.Manual_ARC, IBM.Manual_Fusion_1_TJW, IBM.Manual_Marvel_TJW, and IBM.Manual_Fusion_2_TJW among all Interactive, Manual, and Automatic runs]
Automatic Search Runs
Automatic pilot runs:
– Results are in the ballpark of Manual search
[Bar chart: MAP (0.00–0.40) of IBM.Automatic_Fusion_TJW among all Interactive, Manual, and Automatic runs]
AP by Topics
[Two bar charts of average precision per topic, one for Manual runs and one for Interactive runs; series: Best IBM Run, 1st run, Max, Average, Median. Topics: Total MAP, Boris Yeltsin, Hockey, Sam Donaldson, Benjamin Netanyahu, Flood, Golf, Henry Hyde, Saddam Hussein, Tennis, Horses, Banners, Building on fire, Dog walk, Bicycles, Umbrellas, Bill Clinton, Keyboards, Wheelchair, Weapon firing, Stretcher, Peds+cars, Stairs, Capitol dome, Ski slalom]
Comparison – Interactive vs. Manual Runs
[Bar chart: per-topic AP of the Interactive, Manual multimodal, and Manual baseline runs over the same topics]
Topics are sorted in decreasing order of Manual AP
Conclusions
Browsing and shot elevation are essential in Interactive search
Relevance Feedback was found very helpful in query formulation
Automatic search – concept acquisition; the most objective, comparable to Manual, scales up
Manual search – less objective; scales up; speech is still the dominant component
Interactive search – very interactive, subjective; CBIR is very instrumental; may not scale up as well