CHI2009 MrTaggy: Tag-based Search Browser, Introduction and Evaluation
CHI2009 research talk on MrTaggy, a tag-based search browser. Contains an introduction to the system and an evaluation of the interface.
Information Foraging: Tuesday, 9:00 AM - 10:30 AM
– An Elementary Social Information Foraging Model. Peter Pirolli
– Remembrance of Things Tagged: How Tagging Effort Affects Tag Production and Human Memory. Raluca Budiu, Peter Pirolli, Lichan Hong
– Signpost from the Masses: Learning Effects in an Exploratory Social Tag Search Browser. Yvonne Kammerer, Rowan Nairn, Peter Pirolli, Ed H. Chi
Studying Wikipedia: Wednesday, 11:30 AM - 1:00 PM
– So You Know You’re Getting the Best Possible Information: A Tool that Increases Wikipedia Credibility. Peter Pirolli, Evelin Wollny, Bongwon Suh
– What's in Wikipedia? Mapping Topics and Conflict Using Socially Annotated Category Structure. Aniket Kittur, Ed H. Chi, Bongwon Suh
Social Search and Sensemaking: Wednesday, 4:30 PM - 6:00 PM
– Annotate Once, Appear Anywhere: Collective Foraging for Snippets of Interest Using Paragraph Fingerprinting. Lichan Hong, Ed H. Chi
– With a Little Help from My Friends: Examining the Impact of Social Annotations in Sensemaking Tasks. Les Nelson, Christoph Held, Peter Pirolli, Lichan Hong, Diane Schiano, Ed H. Chi
Signpost from the Masses: Learning Effects in an Exploratory Social Tag Search Browser
Yvonne Kammerer*, Rowan Nairn, Peter Pirolli, Ed H. Chi
Contact: Ed H. Chi, Ph.D. Manager, Augmented Social Cognition Area [email protected]
Palo Alto Research Center
* Intern from Knowledge Media Research Center, Germany
2 CHI2009 MrTaggy TagSearch– © 2008 Palo Alto Research Center Inc. Try it now: http://mrtaggy.com
Social Search Survey
[Brynn Evans, Ed H. Chi, CSCW2008]
Help understand the importance of:
– social cues and information exchanges
– vocabulary problems
– distribution and organization
TagSearch Exploratory Focus
3 kinds of search
– Navigational (28%): You know what you want and where it is.
– Transactional (13%): You know what you want to do.
Existing search engines are OK for navigational and transactional search.
– Informational (59%): You roughly know what you want but don’t know how to find it.
Difficult for existing search engines: Opportunity
Research Motivation
Social search systems:
– Search and exploration services informed by human/group judgments and attention data.
– Social bookmarks and tags are a rich source of this data.
Key Problems:
– Coverage and participation
– Tag keyword ambiguity
– Spam and noise
– Chris Sherman, http://searchenginewatch.com/showPage.html?page=3623153
Using Information Theory to Model Social Tagging [Ed H. Chi, Todd Mytkowicz, Hypertext 2008]
[Diagram labels: Topics, Concepts, Users, Documents, Tags T1…Tn, Encoding, Decoding, Noise]
I(Doc; Tag)
Tags contain less information about documents and vice versa over time
Source: del.icio.us (Chi & Mytkowicz, Hypertext2008)
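As a rough illustration of the I(Doc; Tag) measure above, the sketch below computes mutual information in bits from a list of (document, tag) observations. The function name and the toy data are hypothetical, and this is not the estimation procedure used in the cited Hypertext 2008 work; it only shows what the quantity measures.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Estimate I(Doc; Tag) in bits from observed (doc, tag) pairs."""
    n = len(pairs)
    joint = Counter(pairs)                    # counts of each (doc, tag) pair
    docs = Counter(d for d, _ in pairs)       # marginal counts per document
    tags = Counter(t for _, t in pairs)       # marginal counts per tag
    mi = 0.0
    for (d, t), c in joint.items():
        # p(d,t) / (p(d) * p(t)) simplifies to c * n / (count(d) * count(t))
        mi += (c / n) * math.log2(c * n / (docs[d] * tags[t]))
    return mi

# Hypothetical toy data: distinctive tags vs. one generic tag on everything.
precise = [("a.com", "python"), ("b.com", "cooking")]
noisy = [("a.com", "web"), ("b.com", "web")]

print(mutual_information(precise))  # each tag identifies its document
print(mutual_information(noisy))    # the shared tag says nothing about the document
```

In the toy data, a tag that is applied to every document carries no information about which document it came from, which is the kind of decline the slide describes.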
Social Tagging Creates Noise
People use different tag words to express similar concepts.
• Synonyms
• Misspellings
• Morphologies
Use Semantic Analysis to Reduce Noise
[Figure: Semantic Similarity Graph of related tags: Guide, Web, Howto, Tips, Help, Tools, Tip, Tricks, Tutorial, Tutorials, Reference]
MapReduce Implementation
Spreading Activation in a bigraph
Computation over a very large data set
– 150 Million+ bookmarks
[Diagram: bigraph of Tags and URLs, labeled P(URL|Tag) and P(Tag|URL)]
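A minimal single-machine sketch of spreading activation over a tag/URL bigraph, assuming the two conditional tables P(URL|Tag) and P(Tag|URL) are already available as nested dictionaries. The function name, the step count, and the toy tables are hypothetical; the talk's actual MapReduce implementation and scoring function are not described on the slide, so this only illustrates the general technique.

```python
def spread_activation(seed_tags, p_url_given_tag, p_tag_given_url, steps=1):
    """Alternate activation between tags and URLs for a number of steps."""
    # Start with activation split evenly across the query tags.
    tag_act = {t: 1.0 / len(seed_tags) for t in seed_tags}
    url_act = {}
    for _ in range(steps):
        # Tags -> URLs, weighted by P(URL|Tag).
        url_act = {}
        for tag, a in tag_act.items():
            for url, p in p_url_given_tag.get(tag, {}).items():
                url_act[url] = url_act.get(url, 0.0) + a * p
        # URLs -> tags, weighted by P(Tag|URL).
        new_tag_act = {}
        for url, a in url_act.items():
            for tag, p in p_tag_given_url.get(url, {}).items():
                new_tag_act[tag] = new_tag_act.get(tag, 0.0) + a * p
        tag_act = new_tag_act
    return tag_act, url_act

# Hypothetical toy tables for two tags and two URLs.
p_url_given_tag = {"tutorial": {"u1": 0.5, "u2": 0.5}, "howto": {"u2": 1.0}}
p_tag_given_url = {"u1": {"tutorial": 1.0}, "u2": {"tutorial": 0.5, "howto": 0.5}}

related_tags, ranked_urls = spread_activation(["tutorial"], p_url_given_tag, p_tag_given_url)
print(related_tags)
print(ranked_urls)
```

Starting from the single tag "tutorial", some activation reaches "howto" through the shared URL, which is how related tags can surface for a query.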
TagSearch Architecture
Crawling
• Delicious
• Ma.gnolia
• Other social cues
Database
• Tuples of bookmarks
• [User, URL, Tags, Time]
MapReduce
• P(URL|Tag)
• P(Tag|URL)
• Bayesian Network Inference
Lucene
• Pre-computed patterns in a fast index
Web Server
• Serve up search results
• Well defined APIs
UI Frontend
• Search Results
Notes:
• MapReduce: months of computation to a single day
• Development of novel scoring function
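The step from bookmark tuples [User, URL, Tags, Time] to the tables P(URL|Tag) and P(Tag|URL) can be sketched as below. This is an in-memory illustration on hypothetical data with a hypothetical function name, using simple co-occurrence counts; it is not the MapReduce job or the Bayesian network inference from the talk.

```python
from collections import Counter, defaultdict

def conditional_tables(bookmarks):
    """Build P(URL|Tag) and P(Tag|URL) from (user, url, tags, time) tuples."""
    pair_counts = Counter()
    for _user, url, tags, _time in bookmarks:
        for tag in set(tags):                 # count each tag once per bookmark
            pair_counts[(tag, url)] += 1

    tag_totals, url_totals = Counter(), Counter()
    for (tag, url), c in pair_counts.items():
        tag_totals[tag] += c
        url_totals[url] += c

    p_url_given_tag = defaultdict(dict)
    p_tag_given_url = defaultdict(dict)
    for (tag, url), c in pair_counts.items():
        p_url_given_tag[tag][url] = c / tag_totals[tag]
        p_tag_given_url[url][tag] = c / url_totals[url]
    return p_url_given_tag, p_tag_given_url

# Hypothetical bookmarks: (user, url, tags, time)
bookmarks = [
    ("alice", "u1", ["tutorial"], "2008-01-01"),
    ("bob", "u2", ["tutorial", "howto"], "2008-01-02"),
]
p_url_tag, p_tag_url = conditional_tables(bookmarks)
print(dict(p_url_tag))
print(dict(p_tag_url))
```

At the scale mentioned in the talk, the counting loops would be distributed across machines, but the per-pair arithmetic stays the same.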
Interlude: A Word on Exploratory Search
Users lack sufficient knowledge to define the problem and search space: the problem is ill-structured [Marchionini, 2006]
Novices vs. experts
– A problem may be ill-structured for a novice;
– But it’s well-structured for a seasoned expert.
– Implication: Experts might get less benefit from an exploratory search system.
Baseline Interface
Exploratory Interface
Experiment Design
2 interface x 3 task domain design
– 2 Interfaces (between-subjects)
» Exploratory vs. Baseline
– 3 task domains (within-subjects)
» Future Architecture, Global Warming, Web Mashups
30 Subjects (22 male, 8 female)
– Intermediate or advanced computer and web search skills
– Half assigned Exploratory, half Baseline.
For each domain, single block with 3 task types:
– Easy and Difficult Page Collection Task [6min each]
– Summarization Task [12min]
– Keyword Generation Task [2min]
Page Collection Tasks [6min each]
Summarization Tasks [12min each]
Procedure [2 hours]
Prior Knowledge Test
1st Task Domain
– With easy and difficult page collection tasks, summarization and keyword generation task.
– NASA cognitive load questionnaire
2nd Task Domain
– Same battery of tasks and cognitive load questionnaire
3rd Task Domain
Experimental Survey
Results: Interaction Behaviors
Number of Queries
– Effect of Interface on number of queries (p < .01)
» Exploratory (M=7.81) > Baseline (M=3.77)
Time Taken
– Effect of Interface on time taken (p < .01)
» Exploratory (7.7min) > Baseline (6.6min)
Results: Page Collection Task
– Effects of Task Domain (p<.01) and Task Difficulty (p<.05)
– Interaction effect of Interface by Task Domain (p<.05), with Exploratory interface performing better in the Web Mashup domain
– For relevance scores, similar patterns.
Measure of # of pages collected
Results: Summarization Tasks
– Quality of summarization scored (Cohen’s Kappa=0.7)
– ANCOVA with Prior Knowledge as covariate
– Exploratory Interface scored higher in Future Architecture (p<.05) and Global Warming (p<.05)
– For Web Mashup, Prior Knowledge correlated positively with performance (r=.51)
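For readers unfamiliar with the inter-rater agreement figure quoted above, the sketch below computes unweighted Cohen's Kappa for two raters who assign categorical scores to the same items. The slide does not say how summary scores were categorized or whether a weighted variant was used, so the function name and the ratings here are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's Kappa for two equal-length lists of labels."""
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    # Undefined when expected == 1 (both raters always use one label).
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of four summaries by two raters.
ratings_a = [1, 1, 0, 0]
ratings_b = [1, 1, 0, 1]
print(cohens_kappa(ratings_a, ratings_b))
```

Kappa corrects raw agreement for chance: here the raters agree on 3 of 4 items, but half of that agreement would be expected by chance alone.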
Results: Keyword Generation Tasks
– ANCOVA showed Exploratory > Baseline for Future Architecture (p<.05) and Web Mashups (p<.01), but not for Global Warming.
– For Baseline, a linear model between Prior Knowledge (PK) and # of keywords generated showed a mean slope of 0.32, which was significant (p<.05)
Results: Cognitive Load
– Exploratory > Baseline (p<.05)
Discussion
Exploratory interface users:
– performed more queries,
– took more time,
– wrote better summaries (in 2/3 domains),
– generated more relevant keywords (in 2/3 domains), and
– had a higher cognitive load.
Suggestive of deeper engagement and better learning.
Some evidence of scaffolding for novices in the keyword generation and summarization tasks.
Limitations
Minimal control for domain expertise:
– Lacks depth in the implications for performance.
Pre-defined task domains:
– Lack ecological validity.
Summary
Harnessing user-generated tags to enrich content for social search
A weakness of social tagging systems is Tag Noise and Inconsistency
– Difficult to leverage for search
– Use data mining techniques to normalize and reduce noise
– Apply normalized tag data in new search algorithm
The study suggests deeper user engagement in exploration and better learning with MrTaggy
Thanks!
Try it now! http://mrtaggy.com
http://spartag.us http://wikidashboard.parc.com
Contact: Ed H. Chi, Ph.D. Manager, Augmented Social Cognition Area [email protected]
Our Blog: http://asc-parc.blogspot.com
http://wordle.net
Cognition: the ability to remember, think, and reason; the faculty of knowing.
Social Cognition: the ability of a group to remember, think, and reason; the construction of knowledge structures by a group.
– (not quite the same as in the branch of psychology that studies the cognitive processes involved in social interaction, though included)
Augmented Social Cognition: Supported by systems, the enhancement of the ability of a group to remember, think, and reason; the system-supported construction of knowledge structures by a group.
Citation: Ed H. Chi. The Social Web: Opportunities for Research. IEEE Computer, Sept 2008
29 2008-11-07 Ed H. Chi ASC Overview
Collective Intelligence
Higher Productivity via Collective Intelligence
Intelligence that emerges from the collaboration and competition of many individuals
search
sharing
foraging
TagSearch: Mining social data for automatic data clustering and organization:
• Better organization via user-assigned tags
• Better UI for browsing interesting contents
• Recommendation instead of just search
Social Transparency creates trust and attribution:
• Increase participation via attribution
• Increase credibility and trust with community feedback
• Reduce wiki risks
SparTag.us: sharing of interesting contents:
• A notebook that automatically organizes your reading
• Social sharing of important and interesting tidbits
• Viral sharing of highlighted and tagged paragraphs
Foundation:
• Understanding of human cognition and behavior
• Data mining of social data
Generic benefits:
• Greater trust
• Better decision-making
• Useful sharing of info
• Auto-organization thru social data