2010-03-10 parc augmented social cognition research overview
DESCRIPTION
This is an overview of three years of research by the Augmented Social Cognition research group at PARC. See the blog at: http://asc-parc.blogspot.com
TRANSCRIPT
Image from: http://www.flickr.com/photos/ourcommon/480538715/
Ed H. Chi, Area Manager
Peter Pirolli, Lichan Hong, Bongwon Suh, Gregorio Convertino, Les Nelson, Rowan Nairn
Augmented Social Cognition Area
Palo Alto Research Center
Interns: Sanjay Kairam, Jilin Chen, Michael Bernstein
Alumni: Raluca Budiu, Bryan Pendleton, Niki Kittur, Todd Mytkowicz, Terrell Russell, Brynn Evans, Bryan Chan, KMRC students
1 2009-05-01 Ed H. Chi ASC Overview
14 years of work in foraging and sensemaking
Information Scent
– WUFIS / IUNIS (Basic scent modeling algorithms) [CHI2000,2001]
– Bloodhound (Simulation of web navigation) [CHI2003]
– LumberJack (Log analysis of user needs) [CHI2002]
Information Foraging
– ScentTrails [TOCHI2003]
– ScentIndex [CHI2004]
– ScentHighlight [IUI2005]
– Visual foraging of highlighted text [HCII]
Sensemaking
– Visualization of Web Ecologies [CHI98]
– Visualization Spreadsheets [Infovis97, Infovis99]
“Wikipedia is the best thing ever. Anyone in the world can write anything they want about any subject, so you know you’re getting the best possible information.” – Steve Carell, The Office
Groups use systems to make sense of and share complex topics and materials.
Wikipedia (social status) Slashdot (karma points) WikiHow.com Lostpedia.com
Systems that evolve structures that can be used to organize information.
Del.icio.us Flickr YouTube Friendster
Counting votes
– A way to increase signal-to-noise ratio
– Information faddishness
Examples:
– Digg.com
– Most bookmarked items on del.icio.us
– Estimating the weight of an ox or temperature of a room
– The true value of a stock
– PageRank or Hub / Authority algorithms
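The ox-weight example can be illustrated with a small simulation. This is only a sketch under a strong assumption: every guess is independent and unbiased (Gaussian noise around the true value). The function name, the noise model, and the numbers are hypothetical and not taken from the slides.

```python
import random
import statistics


def crowd_estimate(true_value, n_guessers, noise_sd, seed=0):
    """Simulate independent noisy guesses and return (guesses, crowd mean)."""
    rng = random.Random(seed)
    guesses = [rng.gauss(true_value, noise_sd) for _ in range(n_guessers)]
    return guesses, statistics.mean(guesses)


# Hypothetical numbers: an ox of 1200 units, 500 guessers, noise sd of 100.
true_weight = 1200.0
guesses, crowd = crowd_estimate(true_weight, n_guessers=500, noise_sd=100.0)

crowd_error = abs(crowd - true_weight)
mean_individual_error = statistics.mean(abs(g - true_weight) for g in guesses)
print(f"crowd error: {crowd_error:.1f}")
print(f"mean individual error: {mean_individual_error:.1f}")
```

The crowd's error can never exceed the average individual error (triangle inequality), but the large gains depend on independence. When guessers copy each other, as in information fads, the averaging benefit shrinks.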
[Figure: spectrum of social systems toward heavier collaboration, with categories Voting systems, Col. Information Structures, and Collaborative Co-Creation. Systems shown: Naver, Digg.com, Wikipedia, Slashdot, eHow.com, Del.icio.us, IBM dogear, PageRank, Flickr.]
Understanding of micro-economics
• of foraging [PARC]
• Personal vs. group [Huberman, Adamic]
• Wisdom of Crowds [Surowiecki]
• Information cascades [Anderson and Holt]
Understanding of conflicts and coordination
• Wikipedia coordination costs [PARC]
• Invisible Colleges [Sandstrom]
• Interference effects [Pirolli]
• Co-laboratories [Olson and Olson]
• Community networks / Col. Problem solving [Carroll]
Understanding of info and social networks
• Tag network analysis [PARC, Golder, Yahoo]
• Structural holes (info brokerage) [Burt]
• Network constraints and structure [various]
• Semantics of semiotic structures / words [IR, LSA]
Cognition: the ability to remember, think, and reason; the faculty of knowing.
Social Cognition: the ability of a group to remember, think, and reason; the construction of knowledge structures by a group.
– (not quite the same as in the branch of psychology that studies the cognitive processes involved in social interaction, though included)
Augmented Social Cognition: Supported by systems, the enhancement of the ability of a group to remember, think, and reason; the system-supported construction of knowledge structures by a group.
Citation: Chi, IEEE Computer, Sept 2008
Characterization Models
Prototypes Evaluations
[Figure: percentage of total edits by page type (Article, User, Article Talk, User Talk, Other, Maintenance), 2001 to 2006; y-axis from 60% to 100%.]
Conflict is growing at the global level, and we have some idea about where it is.
But what defines conflict inside Wikipedia?
Build a characterization model of article conflict
– Identify metrics relevant to conflict
– Automatically identify high-conflict articles
“Controversial” tag
Use # revisions tagged controversial
Possible metrics for identifying conflict in articles

Metric type                      Page type
Revisions (#)                    Article, talk, article/talk
Page length                      Article, talk, article/talk
Unique editors                   Article, talk, article/talk
Unique editors / revisions       Article, talk
Links from other articles        Article, talk
Links to other articles          Article, talk
Anonymous edits (#, %)           Article, talk
Administrator edits (#, %)       Article, talk
Minor edits (#, %)               Article, talk
Reverts (#, by unique editors)   Article
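Several of these metrics are simple counts over a page's revision history. The sketch below shows one way they might be computed. The `Revision` record and its fields are hypothetical stand-ins for whatever a real dump or API provides, and link and page-length metrics are omitted.

```python
from dataclasses import dataclass


@dataclass
class Revision:
    # Hypothetical record layout; a real wiki dump would need to be mapped onto it.
    editor: str       # user name, or an IP address for anonymous edits
    anonymous: bool
    minor: bool
    is_revert: bool
    page_type: str    # "article" or "talk"


def conflict_metrics(revisions, page_type):
    """Count-based metrics for one page type of a single article."""
    revs = [r for r in revisions if r.page_type == page_type]
    n = len(revs)
    editors = {r.editor for r in revs}
    anonymous = sum(r.anonymous for r in revs)
    minor = sum(r.minor for r in revs)
    return {
        "revisions": n,
        "unique_editors": len(editors),
        "unique_editors_per_revision": len(editors) / n if n else 0.0,
        "anonymous_edits": anonymous,
        "anonymous_pct": 100.0 * anonymous / n if n else 0.0,
        "minor_edits": minor,
        "minor_pct": 100.0 * minor / n if n else 0.0,
        "reverts": sum(r.is_revert for r in revs),
    }


# Toy history, invented for illustration only.
history = [
    Revision("alice", False, False, False, "article"),
    Revision("10.0.0.1", True, False, False, "article"),
    Revision("alice", False, True, True, "article"),
    Revision("bob", False, False, False, "talk"),
]
print(conflict_metrics(history, "article"))
```

Calling the function once with "article" and once with "talk" gives the per-page-type variants listed in the table.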
5x cross-validation, R2 = 0.897
[Figure: actual vs. predicted controversial revisions; both axes run from 0 to 10000.]
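The slide reports a cross-validated R2 but does not name the learner. As a rough illustration of the evaluation procedure only, the sketch below runs 5-fold cross-validation with a single-feature least-squares line as a stand-in model and pools the held-out predictions into one R2. The data at the bottom is synthetic, and every function name is an assumption.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def cross_validated_r2(xs, ys, k=5, seed=0):
    """Fit on k-1 folds, predict the held-out fold, pool all predictions."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    actual, predicted = [], []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            actual.append(ys[i])
            predicted.append(a + b * xs[i])
    return r_squared(actual, predicted)


# Synthetic example: a noisy linear relation between one metric and the target.
rng = random.Random(42)
talk_revisions = [rng.uniform(0, 1000) for _ in range(200)]
controversial = [5.0 * x + rng.gauss(0, 300) for x in talk_revisions]
print(round(cross_validated_r2(talk_revisions, controversial), 3))
```

Because each prediction comes from a model that never saw that article, the pooled R2 estimates how well the metrics generalize rather than how well they fit.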
Highly weighted features of conflict model:
– Revisions (talk)
– Minor edits (talk)
– Unique editors (talk)
– Revisions (article)
– Unique editors (article)
– Anonymous edits (talk)
– Anonymous edits (article)
Revert: Undoing one or more edits
– The page is restored to a version that existed sometime previously.
– Often used to fight vandalism
Revert ratio as resistance metric
– # of reverted edits / # of total edits
– This analysis excludes vandalism to model “resistance”
Research Goal
– How can we identify points of view between users?
– Group people who share a common point of view
Using revert as proxy for disagreement between users
– Revert edits: 3,711,638 (6.3% of total edits)
– Due to vandalism: 577,643 (0.99% of total edits; 15.6% of reverts)
Force-directed layout
– Node: user, Edge: revert relationship
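One common way to detect reverts is to hash each revision's text and flag a revision whose hash matches an earlier, non-adjacent one; everything in between counts as reverted. The sketch below uses that identity-revert heuristic to compute a revert ratio and the user-to-user revert edges that a graph layout would consume. It is an assumption that this matches the method behind the slides, and it does not filter out vandalism as the original analysis did.

```python
import hashlib
from collections import Counter


def analyze_reverts(revisions):
    """revisions: chronological list of (editor, text) for one page.

    Returns (revert_ratio, edges), where edges counts how often
    (reverting_editor, reverted_editor) pairs occur.
    """
    last_seen = {}        # content hash -> index of latest revision with that text
    reverted = set()      # indices of revisions that were undone
    edges = Counter()
    for i, (editor, text) in enumerate(revisions):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        j = last_seen.get(digest)
        if j is not None and j < i - 1:
            # Revision i restores revision j, undoing everything in between.
            for k in range(j + 1, i):
                reverted.add(k)
                undone_editor = revisions[k][0]
                if undone_editor != editor:
                    edges[(editor, undone_editor)] += 1
        last_seen[digest] = i
    ratio = len(reverted) / len(revisions) if revisions else 0.0
    return ratio, edges


# Toy history, invented for illustration: alice undoes bob's edit.
history = [("alice", "v1"), ("bob", "v2"), ("alice", "v1"), ("carol", "v3")]
ratio, edges = analyze_reverts(history)
print(ratio, dict(edges))
```

The edge counts can be passed to any graph library with a force-directed layout, with users as nodes and revert relationships as edges.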
[Figure: user groups labeled Group A, Group B, Group C, Group D.]

Number of users in user group        A    B    C    Total
Users with Korean point of view      10   6    0    16
Users with Japanese point of view    1    8    7    16
Neutral or Unidentified              7    3    6    17
Mediators
Sympathetic to parents
Sympathetic to husband
Anonymous (vandals/spammers)
Monthly Ratio of Reverted Edits
Characterization Models
Prototypes Evaluations
Encoding Retrieval
http://edge.org
“science research cognition”
http://www.ted.com/index.php/speakers
“video people talks technology”
Topics Concepts
Users Documents
Tags
T1…Tn Encoding Decoding
Noise
Source: Hypertext 2008 study on del.icio.us (Chi & Mytkowicz)
Bongwon Suh, Gregorio Convertino, Ed H. Chi, Peter Pirolli
Bongwon Suh, Gregorio Convertino, Ed H. Chi, Peter Pirolli. The Singularity is Not Near: Slowing Growth of Wikipedia. In Proc. of WikiSym 2009. Oct, 2009. Florida, USA
Monthly Edits
Monthly Active Editors
Edits beget edits: the more previous edits, the more new edits
$$N(t) = N_0 \cdot e^{rt}$$
$$\frac{dN}{dt} = r \cdot N$$
Here dN/dt is the growth rate of the population and N is the current population.
Growth rate depends on current population size N and r = growth rate of the population
Ecological population growth model
– r, growth rate of the population
– K, carrying capacity (due to resource limitation)
$$\frac{dN}{dt} = r \cdot N \cdot \left(1 - \frac{N}{K}\right)$$
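Both growth models have well-known closed-form solutions, which makes it easy to compare unlimited growth with growth under a carrying capacity. The sketch below evaluates the two; the parameter values are arbitrary illustrations, not fitted Wikipedia values.

```python
import math


def exponential(t, n0, r):
    """Solution of dN/dt = r * N with N(0) = n0."""
    return n0 * math.exp(r * t)


def logistic(t, n0, r, k):
    """Solution of dN/dt = r * N * (1 - N / K) with N(0) = n0."""
    return k / (1.0 + ((k - n0) / n0) * math.exp(-r * t))


# Arbitrary illustrative parameters (assumptions, not values from the study).
n0, r, k = 10.0, 0.5, 1000.0
for t in (0, 5, 10, 20, 40):
    print(t, round(exponential(t, n0, r), 1), round(logistic(t, n0, r, k), 1))
```

Early on the two curves are nearly identical because N is small relative to K; the logistic curve then flattens as N approaches K, while the exponential keeps growing.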
[Figure: population (0 to 4,000,000) vs. year (2000 to 2010), with carrying capacity K marked.]
http://en.wikipedia.org/wiki/Wikipedia:Modelling_Wikipedia's_growth
Follows a logistic growth curve
New Article
Carrying Capacity as a function of time.
[Figure: population vs. year (2000 to 2010), with time-varying carrying capacity K(t).]
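When the carrying capacity varies with time, the logistic equation generally has no simple closed form, but it can be integrated numerically. The sketch below uses forward Euler steps with K passed in as a function. The slides do not give a functional form for K(t), so the linearly declining capacity here is purely hypothetical.

```python
def simulate_growth(n0, r, capacity, t_end, dt=0.01):
    """Forward-Euler integration of dN/dt = r * N * (1 - N / K(t)).

    capacity is a function of time returning K(t); it must stay positive.
    """
    n, t = n0, 0.0
    for _ in range(int(round(t_end / dt))):
        n += dt * r * n * (1.0 - n / capacity(t))
        t += dt
    return n


# Constant capacity reproduces ordinary logistic saturation.
steady = simulate_growth(10.0, 0.5, lambda t: 1000.0, t_end=40.0)

# Hypothetical declining capacity: K(t) = 1000 - 10 * t.
shrinking = simulate_growth(10.0, 0.5, lambda t: 1000.0 - 10.0 * t, t_end=40.0)

print(round(steady, 1), round(shrinking, 1))
```

A smaller `dt` improves accuracy at the cost of more steps; for stiff parameter choices a proper ODE solver would be the safer tool.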
Characterization Models
Prototypes Evaluations
Create a Living Laboratory as a platform to develop, test, and market innovations
[Chi, HCIC workshop 2009, HCII 2009, IEEE Computer Sep/2008]
Joint work with Bongwon Suh, Aniket Kittur, Bryan Pendleton
Bongwon Suh, Ed H. Chi, Aniket Kittur, Bryan A. Pendleton. Lifting the Veil: Improving Accountability and Social Transparency in Wikipedia with WikiDashboard. In Proceedings of the ACM Conference on Human-factors in Computing Systems (CHI2008). ACM Press, 2008. Florence, Italy.
Social translucence for effective communication and collaboration [Erickson and Kellogg 2002]
– Make socially significant information visible and salient
– Support awareness of the rules and constraints
– Accountability for actions
Wikis can be a prime candidate
– Every edit is logged and retrievable
– WikiScanner.com: analyze anonymous IP edits
– WikiRage.com: top edits
Surfacing hidden social context to users
For readers
– Any incidents in the past, e.g., a sudden burst of edits?
– Who are the top editors?
– What are their motivations / points of view / expertise / topics of interest?
– Help them judge the quality/trustworthiness/usefulness of an article.
For writers
– Measure expertise / contribution / reputation
– Motivate them to be more active / responsible (?)
3 x 2 x 2 design

               Controversial                                     Uncontroversial
High quality   Abortion, George Bush                             Volcano, Shark
Low quality    Pro-life feminism, Scientology and celebrities    Disk defragmenter, Beeswax

Visualization
• High stability
• Low stability
• Baseline (none)
Users recruited via Amazon’s Mechanical Turk
– 253 participants
– 673 ratings
– 7 cents per rating
– Kittur, Chi, & Suh, CHI 2008: Crowdsourcing user studies
To ensure salience and valid answers, participants answered:
– In what time period was this article the least stable?
– How stable has this article been for the last month?
– Who was the last editor?
– How trustworthy do you consider the above editor?
1. Significant effect of visualization
– High > low, p < .001
2. Both positive and negative effects
– High > baseline, p < .001
– Low > baseline, p < .01
3. No effect of article uncertainty
– No interaction of visualization with either quality or controversy
– Robust across conditions
Joint work with Rowan Nairn, Lawrence Lee
Kammerer, Y., Nairn, R., Pirolli, P., and Chi, E. H. 2009. Signpost from the masses: learning effects in an exploratory social tag search browser. In Proceedings of the 27th international Conference on Human Factors in Computing Systems (Boston, MA, USA, April 04-09, 2009). CHI '09. ACM, New York, NY, 625-634.
Help understand the importance of:
– social cues and information exchanges
– vocabulary problems
– distribution and organization
3 kinds of search
– navigational (28%): You know what you want and where it is
– transactional (13%): You know what you want to do
  Existing search engines are OK
– informational (59%): You roughly know what you want but don’t know how to find it
  Difficult for existing search engines
  Opportunity
Social Tagging Creates Noise
People use different tag words to express similar concepts.
• Synonyms
• Misspellings
• Morphologies
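Of the three noise sources, morphological variants are the easiest to reduce with simple string normalization. The sketch below lowercases tags, strips punctuation, and removes a trailing plural "s", then groups the variants. This is a deliberately naive heuristic and an assumption on my part; it does nothing for synonyms or misspellings, which need similarity computed from usage data.

```python
import re
from collections import defaultdict


def normalize_tag(tag):
    """Lowercase, drop non-alphanumerics, and strip a simple plural 's'."""
    t = re.sub(r"[^a-z0-9]", "", tag.lower())
    if len(t) > 3 and t.endswith("s") and not t.endswith("ss"):
        t = t[:-1]
    return t


def group_tags(tags):
    """Map each normalized form to the set of raw tags that produced it."""
    groups = defaultdict(set)
    for tag in tags:
        groups[normalize_tag(tag)].add(tag)
    return dict(groups)


print(group_tags(["Tip", "Tips", "Tutorial", "Tutorials", "How-to", "howto"]))
```

The length and "ss" guards keep short tags such as "css" and words such as "class" intact, at the cost of missing some plurals.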
Semantic Similarity Graph
[Figure: graph of related tags: Guide, Web, Howto, Tips, Help, Tools, Tip, Tricks, Tutorial, Tutorials, Reference.]
Spreading Activation in a bi-graph
Computation over a very large data set
– 150 Million+ bookmarks
Tags URLs
P(URL|Tag)
P(Tag|URL)
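A minimal sketch of spreading activation over the tag/URL bi-graph: estimate P(URL|Tag) and P(Tag|URL) from bookmark counts, then push activation from a query tag to its URLs and back to tags. The slides do not specify the scoring function, so the single-round propagation, the function names, and the toy bookmarks are all assumptions.

```python
from collections import Counter, defaultdict


def conditional_tables(bookmarks):
    """bookmarks: iterable of (user, url, tags).

    Returns (p_url_given_tag, p_tag_given_url) as nested dicts.
    """
    pair_counts = Counter()
    for _user, url, tags in bookmarks:
        for tag in set(tags):
            pair_counts[(tag, url)] += 1
    tag_totals, url_totals = Counter(), Counter()
    for (tag, url), count in pair_counts.items():
        tag_totals[tag] += count
        url_totals[url] += count
    p_url_given_tag = defaultdict(dict)
    p_tag_given_url = defaultdict(dict)
    for (tag, url), count in pair_counts.items():
        p_url_given_tag[tag][url] = count / tag_totals[tag]
        p_tag_given_url[url][tag] = count / url_totals[url]
    return p_url_given_tag, p_tag_given_url


def related_tags(query_tag, p_url_given_tag, p_tag_given_url):
    """One round of spreading activation: tag -> URLs -> tags."""
    activation = defaultdict(float)
    for url, a in p_url_given_tag.get(query_tag, {}).items():
        for tag, p in p_tag_given_url[url].items():
            activation[tag] += a * p
    return dict(activation)


# Toy bookmarks, invented for illustration.
bookmarks = [
    ("u1", "a.example", ["tips", "howto"]),
    ("u2", "a.example", ["tips"]),
    ("u3", "b.example", ["howto"]),
]
p_ut, p_tu = conditional_tables(bookmarks)
print(related_tags("tips", p_ut, p_tu))
```

Repeating the tag-to-URL-to-tag step spreads activation further through the graph. At the scale of 150 million+ bookmarks, the counting step is the part that would be distributed.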
Web Server
Search Results
UI Frontend
Crawling
• Delicious
• Ma.gnolia
• Other social cues
Database
• Tuples of bookmarks
• [User, URL, Tags, Time]
MapReduce
• P(URL|Tag)
• P(Tag|URL)
• Bayesian Network Inference
Lucene
• Pre-computed patterns in a fast index
Web Server
• Serve up search results
• Well defined APIs

• MapReduce: months of computation to a single day
• Development of novel scoring function
Exploratory interface users:
– performed more queries,
– took more time,
– wrote better summaries (in 2/3 domains),
– generated more relevant keywords (in 2/3 domains), and
– had a higher cognitive load.
Suggestive of deeper engagement and better learning.
Some evidence of scaffolding for novices in the keyword generation and summarization tasks.
Joint work with Lichan Hong, Raluca Budiu, Les Nelson, Peter Pirolli
Lichan Hong, Ed H. Chi, Raluca Budiu, Peter Pirolli, and Les Nelson. SparTag.us: A Low Cost Tagging System for Foraging of Web Content. In Proceedings of the Advanced Visual Interface (AVI2008), (to appear). ACM Press, 2008.
Interaction costs determine number of people who participate
Surplus of attention & motivation at small transaction costs
Therefore… Important to keep interaction costs low
[Figure: number of people willing to produce for “free” vs. cost of participation.]
In situ tagging while reading
– No new window
– Clicking vs typing
Tagging + highlighting
Intuition: sub-doc nuggets useful
– Entities, facts, concepts, paragraphs
Annotations attached to paragraphs
Portable across pages and other contents (e.g. Word documents)
– Dynamic pages
– Duplicate content
Conditions:
– Without SparTag.us (WS)
– SparTag.us Only (SO)
– SparTag.us With A Friend (SF)
N=18
SparTag.us + Friend superior to both individual conditions
No difference between the two controls
SF group, M=0.46, SD=0.22
SO group, M=0.13, SD=0.32
WS group, M=0.27, SD=0.23
[Nelson et al., CHI2009]
Collective Intelligence
Higher Productivity via Collective Intelligence
Intelligence that emerges from the collaboration and competition of many individuals
search
sharing
foraging
TagSearch: Mining social data for automatic data clustering and organization:
• Better organization via user-assigned tags
• Better UI for browsing interesting contents
• Recommendation instead of just search
Social Transparency creates trust and attribution:
• Increase participation via attribution
• Increase credibility and trust with community feedback
• Reduce wiki risks
SparTag.us: sharing of interesting contents:
• A notebook that automatically organizes your reading
• Social sharing of important and interesting tidbits
• Viral sharing of highlighted and tagged paragraphs
Foundation:
• Understanding of human cognition and behavior
• Data mining of social data
• Modeling of consensus-driven decision-making
Generic benefits:
• Greater trust
• Better decision-making
• Useful sharing of info
• Auto-organization thru social data
Extracts data in the form of tuples from applications, e.g. (user, tag, URL) (user, activity, object)
Hadoop MapReduce, Pig, MySQL, Django, Java
Social Data Mining Platform
Pattern Operators, e.g., Tag Normalization, LDA Clustering, Summarization, Voting Techniques…
Recommendations
Dashboard
Expertise Identification
Topic Identification
ASC is creating a plug-and-play platform to enable a number of applications in support of the Open Web Applications
Combine with other applications to create full products
App Connectors
App Connectors
App Connectors
App Connectors
…
Core Advantage
Crowdsourcing [collaborative co-creation]
– Is there a wisdom of the crowd in Wikipedia?
– How does conflict drive content creation?
Collective Intelligence [folksonomy]
– Are collectively gathered social tags useful for organizing a large document collection?
Collective Averaging [social attention]
– Do voting systems identify the best-quality and most interesting information for that community?
Participation Architecture [interaction]
– Does lowering the interaction cost barrier increase participation productively?
Expertise finding [social networking]
– Does finding experts through social networks get you to better-quality information sooner?
Image from: http://www.flickr.com/photos/ourcommon/480538715/
Research Vision: Understand how social computing systems can enhance the ability of a group of people to remember, think, and reason.
Living Laboratory: Create applications that harness collective intelligence to improve knowledge capture, transfer, and discovery.
http://asc-parc.blogspot.com http://www.edchi.net [email protected]