CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Upload: ed-chi

Post on 14-Jun-2015

DESCRIPTION

CHI2009 research talk on MrTaggy: a Tag-based Search Browser, contains Introduction of the system and Evaluation of the interface

TRANSCRIPT

Page 1: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Information Foraging: Tuesday, 9:00 AM - 10:30 AM
–  An Elementary Social Information Foraging Model (Peter Pirolli)
–  Remembrance of Things Tagged: How Tagging Effort Affects Tag Production and Human Memory (Raluca Budiu, Peter Pirolli, Lichan Hong)
–  Signpost from the Masses: Learning Effects in an Exploratory Social Tag Search Browser (Yvonne Kammerer, Rowan Nairn, Peter Pirolli, Ed H. Chi)

Studying Wikipedia: Wednesday, 11:30 AM - 1:00 PM
–  So You Know You’re Getting the Best Possible Information: A Tool that Increases Wikipedia Credibility (Peter Pirolli, Evelin Wollny, Bongwon Suh)
–  What's in Wikipedia? Mapping Topics and Conflict Using Socially Annotated Category Structure (Aniket Kittur, Ed H. Chi, Bongwon Suh)

Social Search and Sensemaking: Wednesday, 4:30 PM - 6:00 PM
–  Annotate Once, Appear Anywhere: Collective Foraging for Snippets of Interest Using Paragraph Fingerprinting (Lichan Hong, Ed H. Chi)
–  With a Little Help from My Friends: Examining the Impact of Social Annotations in Sensemaking Tasks (Les Nelson, Christoph Held, Peter Pirolli, Lichan Hong, Diane Schiano, Ed H. Chi)

Signpost from the Masses: Learning Effects in an Exploratory Social Tag Search Browser

Yvonne Kammerer*, Rowan Nairn, Peter Pirolli, Ed H. Chi

Contact: Ed H. Chi, Ph.D. Manager, Augmented Social Cognition Area [email protected]

Palo Alto Research Center

* Intern from Knowledge Media Research Center, Germany

Page 2: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

2 CHI2009 MrTaggy TagSearch– © 2008 Palo Alto Research Center Inc. Try it now: http://mrtaggy.com

Social Search Survey

[Brynn Evans, Ed H. Chi, CSCW2008]

•  Help understand the importance of:
–  social cues and information exchanges
–  vocabulary problems
–  distribution and organization

Page 3: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

TagSearch Exploratory Focus

3 kinds of search

•  Navigational (28%): You know what you want and where it is. Existing search engines are OK.
•  Transactional (13%): You know what you want to do. Existing search engines are OK.
•  Informational (59%): You roughly know what you want but don’t know how to find it. Difficult for existing search engines. Opportunity.

Page 4: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Research Motivation

•  Social search systems:
–  Search and exploration services informed by human/group judgments and attention data.
–  Social bookmarks and tags are a rich source of this data.

•  Key Problems:
–  Coverage and participation
–  Tag keyword ambiguity
–  Spam and noise

–  Chris Sherman, http://searchenginewatch.com/showPage.html?page=3623153

Page 5: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Using Information Theory to Model Social Tagging
[Ed H. Chi, Todd Mytkowicz, Hypertext 2008]

[Figure: social tagging drawn as a communication channel. Labels: Topics, Concepts, Users, Documents, Tags T1…Tn, Encoding, Decoding, Noise]

Page 6: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

I(Doc; Tag)

  Tags contain less information about documents and vice versa over time

Source: del.icio.us (Chi & Mytkowicz, Hypertext2008)
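
Note on I(Doc; Tag): this is the mutual information between documents and tags, i.e. how much knowing a tag tells you about which document it was attached to, and vice versa. The sketch below is illustrative only and is not taken from the talk; the function name and the toy (document, tag) pairs are made up to show how the quantity can be estimated from co-occurrence counts.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Estimate I(Doc; Tag) in bits from observed (doc, tag) pairs.

    I(D; T) = sum over (d, t) of p(d, t) * log2( p(d, t) / (p(d) * p(t)) )
    """
    n = len(pairs)
    joint = Counter(pairs)                 # count of each (doc, tag) pair
    docs = Counter(d for d, _ in pairs)    # marginal counts per document
    tags = Counter(t for _, t in pairs)    # marginal counts per tag
    mi = 0.0
    for (d, t), c in joint.items():
        # p(d,t) / (p(d) p(t)) simplifies to c * n / (count(d) * count(t))
        mi += (c / n) * math.log2(c * n / (docs[d] * tags[t]))
    return mi


# Toy data (made up): each tag pins down exactly one document.
informative = [("doc1", "python"), ("doc2", "cooking")]
# Toy data (made up): every tag is applied to every document.
uninformative = [("doc1", "web"), ("doc1", "cool"),
                 ("doc2", "web"), ("doc2", "cool")]

mi_informative = mutual_information(informative)
mi_uninformative = mutual_information(uninformative)
print(mi_informative, mi_uninformative)
```

In the first toy set a tag fully identifies its document, so the estimate is one bit; in the second, tags are spread evenly over documents and the estimate is zero, which is the direction of the trend the slide reports.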

Page 7: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Social Tagging Creates Noise

People use different tag words to express similar concepts:

•  Synonyms
•  Misspellings
•  Morphologies

Page 8: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Page 9: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Use Semantic Analysis to Reduce Noise

[Figure: Semantic Similarity Graph linking related tags: Guide, Web, Howto, Tips, Help, Tools, Tip, Tricks, Tutorial, Tutorials, Reference]
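
The slides do not say which similarity measure produces the graph. As one possible sketch, assuming cosine similarity over the sets of URLs each tag was applied to (the measure, the function names, and the toy bookmarks are all assumptions for illustration), near-duplicate tags such as "tutorial" and "tutorials" come out as highly similar:

```python
import math
from collections import Counter, defaultdict


def tag_vectors(bookmarks):
    """Map each tag to a Counter of the URLs it was applied to."""
    vectors = defaultdict(Counter)
    for url, tag in bookmarks:
        vectors[tag][url] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy (url, tag) bookmarks, made up for illustration.
bookmarks = [
    ("u1", "tutorial"), ("u1", "tutorials"),
    ("u2", "tutorial"), ("u2", "tutorials"),
    ("u3", "tools"),
]

vectors = tag_vectors(bookmarks)
sim_variants = cosine(vectors["tutorial"], vectors["tutorials"])  # same URLs
sim_unrelated = cosine(vectors["tutorial"], vectors["tools"])     # no shared URLs
print(sim_variants, sim_unrelated)
```

Tags whose similarity exceeds some threshold could then be linked in a graph like the one on this slide; the threshold itself would be another design choice the slides do not specify.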

Page 10: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

MapReduce Implementation

•  Spreading Activation in a bigraph
•  Computation over a very large data set
–  150 Million+ bookmarks

[Figure: bigraph of Tags and URLs, with edges labeled P(URL|Tag) and P(Tag|URL)]
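
The slide names spreading activation over the tag/URL bigraph but gives no algorithmic detail. A minimal sketch, assuming one step from tags to URLs through P(URL|Tag) followed by one step back through P(Tag|URL), with no decay or normalization (the function name `spread` and the toy probability tables are illustrative, not MrTaggy's implementation):

```python
from collections import defaultdict


def spread(activation, transition):
    """One spreading step: push each node's activation along its outgoing edges.

    activation: {node: activation value}
    transition: {source node: {destination node: conditional probability}}
    """
    out = defaultdict(float)
    for src, a in activation.items():
        for dst, p in transition.get(src, {}).items():
            out[dst] += a * p
    return dict(out)


# Toy conditional probability tables (made up for illustration).
p_url_given_tag = {
    "howto": {"u1": 0.5, "u2": 0.5},
    "tips": {"u2": 1.0},
}
p_tag_given_url = {
    "u1": {"howto": 1.0},
    "u2": {"howto": 0.5, "tips": 0.5},
}

# Start with all activation on the query tag, spread to URLs, then back to tags.
url_scores = spread({"howto": 1.0}, p_url_given_tag)
related_tags = spread(url_scores, p_tag_given_url)
print(url_scores)
print(related_tags)
```

Here the URL scores act as a ranking for the query tag, and the activation that flows back to "tips" suggests it as a related tag.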

Page 11: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

TagSearch Architecture

Crawling
•  Delicious
•  Ma.gnolia
•  Other social cues

Database
•  Tuples of bookmarks
•  [User, URL, Tags, Time]

MapReduce
•  P(URL|Tag)
•  P(Tag|URL)
•  Bayesian Network Inference

Lucene
•  Pre-computed patterns in a fast index

Web Server
•  Serve up search results
•  Well-defined APIs

UI Frontend
•  Search Results

•  MapReduce: months of computation to a single day

•  Development of novel scoring function
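
To make the Database and MapReduce stages concrete, here is a rough sketch of turning [User, URL, Tags, Time] tuples into P(URL|Tag) and P(Tag|URL) tables. It is an assumption-laden illustration: plain in-process map and reduce functions stand in for a real MapReduce cluster, the probabilities are simple relative frequencies, and all names and the toy bookmarks are made up.

```python
from collections import Counter, defaultdict


def map_bookmark(bookmark):
    """Map step: emit ((tag, url), 1) for every tag on a bookmark tuple."""
    user, url, tags, time = bookmark
    for tag in tags:
        yield (tag, url), 1


def reduce_counts(mapped):
    """Reduce step: sum the emitted values per (tag, url) key."""
    counts = Counter()
    for key, value in mapped:
        counts[key] += value
    return counts


def conditional_tables(bookmarks):
    """Build P(URL|Tag) and P(Tag|URL) as relative frequencies of the counts."""
    counts = reduce_counts(kv for b in bookmarks for kv in map_bookmark(b))
    tag_totals, url_totals = Counter(), Counter()
    for (tag, url), c in counts.items():
        tag_totals[tag] += c
        url_totals[url] += c
    p_url_given_tag = defaultdict(dict)
    p_tag_given_url = defaultdict(dict)
    for (tag, url), c in counts.items():
        p_url_given_tag[tag][url] = c / tag_totals[tag]
        p_tag_given_url[url][tag] = c / url_totals[url]
    return dict(p_url_given_tag), dict(p_tag_given_url)


# Toy bookmark tuples [User, URL, Tags, Time], made up for illustration.
bookmarks = [
    ("alice", "u1", ["howto"], 1),
    ("bob", "u2", ["howto", "tips"], 2),
]

p_url_given_tag, p_tag_given_url = conditional_tables(bookmarks)
print(p_url_given_tag)
print(p_tag_given_url)
```

Because each (tag, url) count is independent of the others, the map and reduce steps could be distributed across machines, which is the property that would let a months-long computation shrink as the slide describes.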

Page 12: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Interlude: A Word on Exploratory Search

•  Users lack sufficient knowledge to define the problem and search space: the problem is ill-structured [Marchionini, 2006]

•  Novices vs. experts
–  A problem may be ill-structured for a novice;
–  But it’s well-structured for a seasoned expert.
–  Implication: Experts might get less benefit from an exploratory search system.

Page 13: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Baseline Interface

Page 14: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Exploratory Interface

Page 15: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Experiment Design

•  2 interface x 3 task domain design
–  2 interfaces (between-subjects)
»  Exploratory vs. Baseline
–  3 task domains (within-subjects)
»  Future Architecture, Global Warming, Web Mashups

•  30 Subjects (22 male, 8 female)
–  Intermediate or advanced computer and web search skills
–  Half assigned Exploratory, half Baseline.

•  For each domain, single block with 3 task types:
–  Easy and Difficult Page Collection Task [6min each]
–  Summarization Task [12min]
–  Keyword Generation Task [2min]

Page 16: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Page Collection Tasks [6min each]

Page 17: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Summarization Tasks [12min each]

Page 18: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Procedure [2 hours]

•  Prior Knowledge Test

•  1st Task Domain
–  With easy and difficult page collection tasks, summarization and keyword generation tasks.
–  NASA cognitive load questionnaire

•  2nd Task Domain
–  Same battery of tasks and cognitive load questionnaire

•  3rd Task Domain

•  Experimental Survey

Page 19: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Results: Interaction Behaviors

•  Number of Queries
–  Effect of Interface on number of queries (p < .01)
»  Exploratory (M=7.81) > Baseline (M=3.77)

•  Time Taken
–  Effect of Interface on time taken (p < .01)
»  Exploratory (7.7min) > Baseline (6.6min)

Page 20: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Results: Page Collection Task

Measure of # of pages collected

–  Effects of Task Domain (p<.01) and Task Difficulty (p<.05)
–  Interaction effect of Interface by Task Domain (p<.05), with Exploratory interface performing better in the Web Mashup domain
–  For relevance scores, similar patterns.

Page 21: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Results: Summarization Tasks

–  Quality of summarization scored (Cohen’s Kappa=0.7)

–  ANCOVA with Prior Knowledge as covariate

–  Exploratory Interface scored higher in Future Architecture (p<.05) and Global Warming (p<.05)

–  For Web Mashup, Prior Knowledge correlated positively with performance (r=.51)
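
For readers unfamiliar with the agreement statistic quoted above: Cohen's Kappa compares the observed agreement between two raters with the agreement expected by chance. The slide reports only the resulting value, not the rating data, so the sketch below uses made-up ratings purely to show the calculation; the function name is also illustrative.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two raters labeling the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same label independently.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up ratings (1 = good summary, 0 = poor summary) for four items.
ratings_a = [1, 1, 0, 0]
ratings_b = [1, 1, 0, 1]

kappa = cohens_kappa(ratings_a, ratings_b)
print(kappa)
```

With these toy ratings the raters agree on three of four items, chance agreement is one half, and Kappa is 0.5.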

Page 22: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Results: Keyword Generation Tasks

–  ANCOVA showed Exploratory > Baseline for Future Architecture (p<.05) and Web Mashups (p<.01), but not for Global Warming.

–  Linear model between Prior Knowledge (PK) and # of keywords generated for Baseline showed mean slope = 0.32, which was significant (p<.05)

Page 23: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Results: Cognitive Load

–  Exploratory > Baseline (p<.05)

Page 24: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Discussion

•  Exploratory interface users:
–  performed more queries,
–  took more time,
–  wrote better summaries (in 2/3 domains),
–  generated more relevant keywords (in 2/3 domains), and
–  had a higher cognitive load.

•  Suggestive of deeper engagement and better learning.

•  Some evidence of scaffolding for novices in the keyword generation and summarization tasks.

Page 25: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Limitations

•  Minimal control for domain expertise:
–  Lacks depth in the implication for performance.

•  Pre-defined task domains:
–  Lack ecological validity.

Page 26: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Summary

•  Harnessing user-generated tags to enrich content for social search

•  Weaknesses of social tagging systems are Tag Noise and Inconsistency
–  Difficult to leverage for search
–  Use data mining techniques to normalize and reduce noise
–  Apply normalized tag data in new search algorithm

•  Study suggests deeper user engagement in exploration and better learning with MrTaggy

Page 27: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Thanks!

Try it now! http://mrtaggy.com

http://spartag.us http://wikidashboard.parc.com

Contact: Ed H. Chi, Ph.D. Manager, Augmented Social Cognition Area [email protected]

Our Blog: http://asc-parc.blogspot.com

Page 28: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

http://wordle.net

Page 29: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

•  Cognition: the ability to remember, think, and reason; the faculty of knowing.

•  Social Cognition: the ability of a group to remember, think, and reason; the construction of knowledge structures by a group.
–  (not quite the same as in the branch of psychology that studies the cognitive processes involved in social interaction, though included)

•  Augmented Social Cognition: Supported by systems, the enhancement of the ability of a group to remember, think, and reason; the system-supported construction of knowledge structures by a group.

Citation: Ed H. Chi. The Social Web: Opportunities for Research. IEEE Computer, Sept 2008

29 2008-11-07 Ed H. Chi ASC Overview

Page 30: CHI2009 MrTaggy Tag-based Search Browser Intro and Evaluation

Collective Intelligence

Higher Productivity via Collective Intelligence

Intelligence that emerges from the collaboration and competition of many individuals

search

sharing

foraging

TagSearch: Mining social data for automatic data clustering and organization:

•  Better organization via user-assigned tags

•  Better UI for browsing interesting contents

•  Recommendation instead of just search

Social Transparency creates trust and attribution:

•  Increase participation via attribution

•  Increase credibility and trust with community feedback

•  Reduce wiki risks

SparTag.us: sharing of interesting contents:

•  A notebook that automatically organizes your reading

•  Social sharing of important and interesting tidbits

•  Viral sharing of highlighted and tagged paragraphs

Foundation:

•  Understanding of human cognition and behavior

•  Data mining of social data

Generic benefits:

•  Greater trust

•  Better decision-making

•  Useful sharing of info

•  Auto-organization thru social data