Finding and Re-Finding Through Personalization
Jaime Teevan
MIT, CSAIL
David Karger (advisor), Mark Ackerman, Sue Dumais, Rob Miller (committee), Eytan Adar, Christine Alvarado, Eric Horvitz, Rosie Jones, and Michael Potts
Thesis Overview
• Supporting Finding
– How people find
– How individuals find
– Personalized finding tool
• Supporting Re-Finding
– How people re-find
– Finding and re-finding conflict
– Personalized finding and re-finding tool
Supporting Re-Finding
• How people re-find
– People repeat searches
– Look for old and new
• Finding and re-finding conflict
– Result changes cause problems
• Personalized finding and re-finding tool
– Identify what is memorable
– Merge in new information
Query log analysis
Memorability study
Re:Search Engine
Related Work
• How people re-find
– Know a lot of meta-information [Dumais]
– Follow known paths [Capra]
• Changes cause problems re-finding
– Dynamic menus [Shneiderman]
– Dynamic search result lists [White]
• Relevance relative to expectation [Joachims]
Query Log Analysis
• Previous log analysis studies
– People re-visit Web pages [Greenberg]
– Query logs: Sessions [Jones]
• Yahoo! log analysis
– 114 people over the course of a year
– 13,060 queries and their clicks
• Can we identify re-finding behavior?
• What happens when results change?
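Identifying re-finding in a log can be sketched as a single pass that tracks which normalized queries and clicks each user has issued before. This is a minimal illustration with a made-up log schema (user, timestamp, query, clicked URL), not the study's actual session-based methodology:

```python
from collections import defaultdict

# Hypothetical log rows: (user, timestamp, query, clicked URL) — the
# real Yahoo! log schema and session handling are more involved.
log = [
    ("u1", 1, "stomach flu", "cdc.gov/flu"),
    ("u1", 2, "weather boston", "weather.com"),
    ("u1", 3, "Stomach Flu", "cdc.gov/flu"),     # repeat query, repeat click
    ("u1", 4, "flu stomach", "mayoclinic.org"),  # repeat query, unique click
]

def normalize(query):
    # Case and word order turn out not to matter for matching repeats
    return tuple(sorted(query.lower().split()))

def repeat_stats(log):
    seen_queries = defaultdict(set)  # user -> normalized queries issued
    seen_clicks = defaultdict(set)   # (user, query) -> URLs clicked
    repeat_queries = repeat_clicks = 0
    for user, _, query, url in sorted(log, key=lambda row: row[1]):
        q = normalize(query)
        if q in seen_queries[user]:
            repeat_queries += 1
            if url in seen_clicks[(user, q)]:
                repeat_clicks += 1
        seen_queries[user].add(q)
        seen_clicks[(user, q)].add(url)
    return repeat_queries, repeat_clicks

print(repeat_stats(log))  # (2, 1)
```

The normalization step anticipates a later finding: case and word order are unimportant when matching queries.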
Re-Finding Common
[Chart: breakdown of repeat queries and repeat vs. unique clicks. Re-finding accounts for a large share of behavior — figures shown: 40%, 33%, and 26% of queries; 86–87% and 38% of repeat queries]
Change Reduces Re-Finding
• Results change rank
• Change reduces probability of repeat click
– No rank change: 88% chance
– Rank change: 53% chance
• Why?
– Gone?
– Not seen?
– New results are better?
Change Slows Re-Finding
• Look at time to click as a proxy for ease
• Rank change makes repeat clicks slower
– Compared with the initial search-to-click time
– No rank change: re-click is faster
– Rank change: re-click is slower
• Changes interfere with re-finding
[Card-trick demo: "Pick a card, any card." Six cards (Case 1–Case 6) are shown, then the set changes and "Your card is GONE!" — in the classic version of this trick every card is swapped, so whichever card was chosen disappears]
People Forget a Lot
Change Blindness
[Demo: paired old and new result lists whose differences are hard to spot]
We still need magic!
Memorability Study
• Participants issued a self-selected query
• After an hour, they were asked to fill out a survey
• 129 people remembered something
Memorability a Function of Rank
[Chart: P(remembered | rank R, clicked C) for ranks 1–10, plotted separately for clicked vs. not-clicked results (probabilities roughly 0–0.8)]
Remembered Results Ranked High
[Scatter plot: remembered rank vs. actual rank, each on a 1–10 scale]
Re:Search Engine Architecture
[Diagram: the query goes from the user's Web browser to an index of past queries (returning candidate past queries with scores) and to the search engine (returning a new result list); a result cache and a user interaction cache supply the corresponding old result lists, and a merge component combines old and new into the result list shown to the user]
Components of Re:Search Engine
• Index of Past Queries
• Result Cache
• User Interaction Cache
• Merge Algorithm
[Component diagrams: the index of past queries maps the incoming query to scored past queries; the result cache and user interaction cache hold the matching past result lists; the merge step combines a past list with the new list into the final result list]
Index of Past Queries
• Studied how queries differ
– Log analysis
– Survey of how people remember queries
• Unimportant: case, stop words, word order
• Likelihood of re-finding decreases with time
• Get the user to tell us if they are re-finding
– Encourage recognition, not recall
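A minimal sketch of such an index, matching queries while ignoring case, stop words, and word order, and decaying match scores over time. The stop-word list, Jaccard scoring, and 30-day half-life are illustrative assumptions, not the thesis's implementation:

```python
# Hypothetical PastQueryIndex: returns scored past result lists for an
# incoming query, so the UI can ask the user to *recognize* a past
# search rather than recall it.
STOP_WORDS = {"the", "a", "an", "of", "for", "in", "on", "to"}  # illustrative

def normalize(query):
    # Case, stop words, and word order are unimportant for matching
    return frozenset(w for w in query.lower().split() if w not in STOP_WORDS)

class PastQueryIndex:
    def __init__(self, half_life_days=30.0):  # assumed decay rate
        self.half_life = half_life_days * 86400  # seconds
        self.entries = []  # (normalized query, timestamp, result list)

    def add(self, query, results, now):
        self.entries.append((normalize(query), now, results))

    def lookup(self, query, now):
        """Scored past result lists, best match first."""
        q = normalize(query)
        scored = []
        for past_q, ts, results in self.entries:
            overlap = len(q & past_q) / max(len(q | past_q), 1)  # Jaccard
            if overlap > 0:
                decay = 0.5 ** ((now - ts) / self.half_life)  # re-finding fades
                scored.append((overlap * decay, results))
        return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

For example, a stored "Stomach Flu" still matches "the stomach flu" a day later with a near-1.0 score.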
Merge Algorithm
• Benefit of New Information score
– How likely a new result is to be useful…
– …in a particular rank
• Memorability score
– How likely an old result is to be remembered…
– …in a particular rank
• Choose the list that maximizes memorability and benefit of new information
Benefit of New Information
• Ideal: Use search engine score
• Approximation: Use rank
• Results that are ranked higher are more likely to be seen
– Greatest benefit given to highly ranked results being ranked highly
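As a sketch, the rank-based approximation might look like the following; the log2 discounts are assumptions standing in for the unavailable search engine score, not the thesis's exact formula:

```python
import math

# Illustrative benefit-of-new-information score: a new result's rank in
# the new list stands in for its engine score, and a second discount
# reflects that highly ranked slots are more likely to be seen.
def benefit(new_rank, slot):
    usefulness = 1.0 / math.log2(new_rank + 1)  # better new results help more
    visibility = 1.0 / math.log2(slot + 1)      # top slots get more attention
    return usefulness * visibility

# The top new result placed in the top slot scores highest:
assert benefit(1, 1) > benefit(1, 5) > benefit(5, 5)
```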
Memorability Score
• How memorable is a result?
• How likely is it to be remembered at a particular rank?
[Charts repeated from the memorability study: P(remembered | rank, clicked) by rank, and remembered rank vs. actual rank]
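The memorability score can be sketched as a function of a result's remembered rank, the slot it is placed in, and whether it was clicked. The constants below are illustrative assumptions, not the study's fitted values:

```python
# Illustrative memorability score: clicked and highly ranked results are
# more likely to be remembered, and moving a result away from its
# remembered rank makes it harder to recognize.
def memorability(remembered_rank, slot, clicked):
    base = 0.7 if clicked else 0.3           # clicked results remembered more
    rank_weight = 1.0 / remembered_rank      # top results remembered more
    displacement = abs(slot - remembered_rank)
    return base * rank_weight / (1 + displacement)  # penalize rank changes

# A clicked top result kept in slot 1 scores highest:
assert memorability(1, 1, True) > memorability(1, 3, True) > memorability(1, 3, False)
```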
Choose Best Possible List
• Consider every combination
• Include at least three old and three new
• Min-cost network flow problem
[Diagram: the min-cost network flow formulation — a source s feeds old results (memorability scores m1…m10) and new results (benefit scores b1…b10), which connect to the ten result slots and then to sink t; the capacities shown (7 and 10) limit each side so that at least three old and three new results are chosen]
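A self-contained sketch of the selection step, with simple illustrative scoring functions. Instead of the min-cost flow formulation, it brute-forces which slots hold old results (feasible at this size) while keeping old and new results in their original relative orders, a simplification of the full assignment:

```python
from itertools import combinations

def memorability(remembered_rank, slot):
    # Old results score highest near their remembered rank (illustrative)
    return (1.0 / remembered_rank) / (1 + abs(slot - remembered_rank))

def benefit(new_rank, slot):
    # Highly ranked new results in highly ranked slots score highest (illustrative)
    return 1.0 / (new_rank * slot)

def merge(n_slots=10, min_old=3, min_new=3):
    """Pick the slot assignment maximizing total memorability + benefit."""
    slots = range(1, n_slots + 1)
    best_score, best_list = float("-inf"), None
    for k in range(min_old, n_slots - min_new + 1):   # k = number of old results
        for old_slots in combinations(slots, k):
            new_slots = [s for s in slots if s not in old_slots]
            # Simplification: old results keep their remembered order and
            # new results keep the engine's order, within the chosen slots.
            score = sum(memorability(i + 1, s) for i, s in enumerate(old_slots))
            score += sum(benefit(i + 1, s) for i, s in enumerate(new_slots))
            if score > best_score:
                layout = {s: ("old", i + 1) for i, s in enumerate(old_slots)}
                layout.update({s: ("new", i + 1) for i, s in enumerate(new_slots)})
                best_score, best_list = score, layout
    return [best_list[s] for s in slots]

merged = merge()  # 10 (kind, original-rank) pairs, at least 3 old and 3 new
```

At ten slots this enumerates only 912 slot subsets, so brute force is cheap; the thesis's flow formulation scales the same idea with capacities enforcing the old/new minimums.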
Evaluation
• Does the merged list look unchanged?
– List recognition study
• Does merging make re-finding easier?
– List interaction study
• Is the search experience improved overall?
– Longitudinal study
List Interaction Study
• 42 participants
• Two sessions a day apart
– 12 tasks each session
• Tasks based on queries ("stomach flu")
• Queries selected based on log analysis
– Session 1: Re-finding ("Symptoms of stomach flu?")
– Session 2: Re-finding ("Symptoms of stomach flu?") and New-finding ("What to expect at the ER?")
List Interaction Study
[Figure: a new result list (New 1–New 6) and a merged list interleaving old and new results: Old 5, New 1, Old 1, Old 7, New 2, New 3, New 4, Old 4, New 5, New 6]
Experimental Conditions
• Six re-finding tasks
– Original result list
– Dumb merging
– Intelligent merging
• Six new-finding tasks
– New result list
– Dumb merging
– Intelligent merging
[Figure: an example merged list combining old results (Old 1, Old 2, Old 4, Old 10) with new results (New 1–New 6)]
Measures
• Performance
– Correct
– Time
• Subjective
– Task difficulty
– Result quality
Faster, fewer clicks, more correct answers, and easier!
Similar to Session 1
Results: Re-Finding

                 Original   Dumb   Intelligent
% correct           99%      88%       96%
Time (seconds)      38.7     70.9      45.6
Task difficulty     1.57     1.79      1.53
Result quality      3.61     3.42      3.70

List same? Similarity: 60%, 76%, 76%
• Intelligent merging better than Dumb
• Almost as good as the Original list
Results: New-Finding

                  New      Dumb   Intelligent
% correct          73%      74%       84%
Time (seconds)    139.3    153.8     120.5
Task difficulty    2.51     2.72      2.61
Result quality     3.38     2.94      3.19
List same?         38%      50%       61%

• Knowledge re-use can help
• No difference between New and Intelligent
Results: Summary
• Re-finding
– Intelligent merging better than Dumb
– Almost as good as the Original list
• New-finding
– Knowledge re-use can help
– No difference between New and Intelligent
• Intelligent merging best of both worlds
Conclusion
• How people re-find
– People repeat searches
– Look for old and new
• Finding and re-finding conflict
– Result changes cause problems
• Personalized finding and re-finding tool
– Identify what is memorable
– Merge in new information
Future Work
• Improve and generalize the model
– More sophisticated measures of memorability
– Other types of lists (inboxes, directory listings)
• Effectively use the model
– Highlight change as well as hide it
• Present change at the right time
– This talk's focus: what and how
– What about when to display new information?
Thesis Overview
• Supporting Finding
– How people find
– How individuals find
– Personalized finding tool
• Supporting Re-Finding
– How people re-find
– Finding and re-finding conflict
– Personalized finding and re-finding tool
David Karger (advisor), Mark Ackerman, Sue Dumais, Rob Miller (committee), Eytan Adar, Christine Alvarado, Eric Horvitz, Rosie Jones, and Michael Potts
Thank You!
Jaime Teevan