If You Build It? Benefits and Costs of Creating Your Own Online Community
Loren Terveen, Computer Science & Engineering, The University of Minnesota, August 2011


Page 1: Loren Terveen Computer Science & Engineering The University of Minnesota August 2011 1

If You Build It? Benefits and Costs of Creating Your Own Online Community

Loren Terveen
Computer Science & Engineering
The University of Minnesota
August 2011

Page 2:

Background: ways of knowing

• Theory
• Simulation
• Lab studies
• Surveys
• Qualitative studies
• Build and learn (e.g., Google, Facebook, Wikipedia)
• Build To Learn

Page 3:

Build to learn

GroupLens Research
• Create new interaction / social computing techniques
• Do empirical, quantitative research
• Learn from what we and others build

Page 4:

To answer the kinds of research questions we like to ask, we need:

• Data
• Experimental Control

Page 5:

The rest of the talk

1. Learning from others’ data
2. Learning from our own data
3. Exercising experimental control

Page 6:

1. Learning from others’ data

• Q&A systems
• Wikipedia

Page 7:

GroupLens Wikipedia Research

WP:Clubhouse? An Exploration of Wikipedia’s Gender Imbalance. Lam, S.K., Uduwage, A., Dong, Z., Sen, S., Musicant, D.R., Terveen, L., Riedl, J. WikiSym 2011.

NICE: Social translucence through UI intervention. A. Halfaker, B. Song, D. A. Stuart, A. Kittur and J. Riedl. WikiSym 2011.

Don't bite the Newbies: How Reverts Affect the Quantity and Quality of Wikipedia Work. A. Halfaker, A. Kittur and J. Riedl. WikiSym 2011.

Mentoring in Wikipedia: A Clash of Cultures. D. Musicant, Y. Ren, J. Johnson and J. Riedl. WikiSym 2011.

The Effects of Group Composition on Decision Quality in a Social Production Community, Lam, S.K., Karim, J., Riedl, J. Group 2010.

The Effects of Diversity on Group Productivity and Member Withdrawal in Online Volunteer Groups, Chen, J., Ren, Y., Riedl, J. CHI 2010.

rv you're dumb: Identifying Discarded Work in Wiki Article History, Ekstrand, M.D., Riedl, J.T. WikiSym 2009.

A Jury of Your Peers: Quality, Experience and Ownership in Wikipedia, Halfaker, A., Kittur, N., Kraut, R., Riedl, J. WikiSym 2009.

Is Wikipedia Growing a Longer Tail?, Lam, S.K., Riedl, J. Group 2009.

Wikipedians are born, not made: a study of power editors on Wikipedia, Panciera, K., Halfaker, A., Terveen, L. Group 2009.

SuggestBot: Using Intelligent Task Routing to Help People Find Work in Wikipedia, Cosley, D., Frankowski, D., Terveen, L., Riedl, J. IUI 2007.

Creating, Destroying, and Restoring Value in Wikipedia, Priedhorsky, R., Chen, J., Lam, S.K., Panciera, K., Terveen, L., Riedl, J. Group 2007.

Page 8:

WP:Clubhouse? An Exploration of Wikipedia’s Gender Imbalance. Lam, S.K., Uduwage, A., Dong, Z., Sen, S., Musicant, D.R., Terveen, L., Riedl, J.

www.grouplens.org/node/466

Page 9:

Trigger…

http://www.nytimes.com/2011/01/31/business/media/31link.html?_r=1&src=busln

A topic generally restricted to teenage girls, like friendship bracelets, can seem short at four paragraphs when compared with lengthy articles on something boys might favor, like, toy soldiers or baseball cards, whose voluminous entry includes a detailed chronological history of the subject.

(BTW, it’s not about the friendship bracelets)

Page 10:

Findings

• Only 16% of new editors joining Wikipedia during 2009 identified themselves as women
• Women made only 9% of the edits by this cohort
• New women editors are more likely to stop editing and leave Wikipedia when their edits are reverted
• Topics of particular interest to women appear to get less (and poorer) coverage in Wikipedia

(Hmm… maybe Wikipedia has a low collective IQ!)

Come to WikiSym to get the details!

Page 11:

2. Learning from our own data

Page 12:

GroupLens online communities

• MovieLens
• Cyclopath

Page 13:
Page 14:

200 Union St SE

Lagoon Theatre

Page 15:
Page 16:

Research Question

How do contributors to open content systems become contributors?

Inspired by…

Page 17:

Becoming Wikipedian
Bryant, Forte, & Bruckman 2005

• Wikipedians fill different niches than non-Wikipedians
• Wikipedians branch out to new areas and topics as they mature
• Wikipedians take on more “community work” as they mature

Qualitative study with nine participants self-reporting

Page 18:

Are Wikipedians Born or Made?

Evidence for “becoming”?

Our goal: test these findings quantitatively

• Quantity of work
• Quality of work
• Nature of work

Page 19:

Wikipedian

A registered editor with 250+ edits over his/her lifetime

If editors reach 250 edits within our data set, they are labeled Wikipedian from the beginning
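
The labeling rule on this slide can be sketched in a few lines of Python. This is only an illustration, not the study's code: the `edit_log` format (one `(editor, timestamp)` pair per edit, already restricted to registered, non-bot editors) and the function name `label_editors` are assumptions made for the example.

```python
from collections import Counter

# From the slide: a Wikipedian is a registered editor with 250+ lifetime edits.
WIKIPEDIAN_THRESHOLD = 250


def label_editors(edit_log, threshold=WIKIPEDIAN_THRESHOLD):
    """Label each editor by total edit count within the data set.

    edit_log is a hypothetical iterable of (editor, timestamp) pairs.
    The label applies to the editor's whole history, including the
    edits made before the threshold was reached.
    """
    counts = Counter(editor for editor, _timestamp in edit_log)
    return {
        editor: "Wikipedian" if n >= threshold else "Non-Wikipedian"
        for editor, n in counts.items()
    }
```

Because the label is retroactive, an editor's earliest edits are analyzed as "Wikipedian" edits, which is what lets the later slides compare the two groups from day one.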

Page 20:

Data
• English Wikipedia dump (January 13, 2008)
• Edits from bots and other non-human means removed

We counted:
• Only registered editors
• Wikipedians (users with 250+ edits): 38K
• Non-Wikipedians: random sample of 38K

Edits per day per editor (“User days”)

(“Day 1”)
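
One way to build the "edits per day per editor" table is sketched below. It assumes that an editor's Day 1 is the calendar day of that editor's first edit in the data, which the slide does not state explicitly; the input format and the name `edits_per_user_day` are also hypothetical.

```python
from collections import defaultdict
from datetime import date


def edits_per_user_day(edits):
    """Count edits per editor per "user day".

    edits is a hypothetical list of (editor, datetime.date) pairs.
    User day 1 is assumed to be the day of the editor's first edit.
    """
    first_day = {}
    for editor, day in edits:
        if editor not in first_day or day < first_day[editor]:
            first_day[editor] = day

    table = defaultdict(lambda: defaultdict(int))
    for editor, day in edits:
        user_day = (day - first_day[editor]).days + 1
        table[editor][user_day] += 1
    return {editor: dict(days) for editor, days in table.items()}
```

Aligning editors on user days rather than calendar dates makes it possible to compare early activity across editors who joined at different times.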

Page 21:

Quantity

Page 22:
Page 23:
Page 24:
Page 25:

Quantity

Is a user’s fate sealed?

Wikipedians are: Born / Made

Page 26:

Quality

Measure: Persistent Word Revisions (PWRs)
Proportion of words added that persist five revisions
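
A rough sketch of the measure as the slide states it (the proportion of words added in a revision that are still present five revisions later) might look as follows. It is a simplification, not the published PWR computation: it treats each revision as a bag of whitespace-separated words and ignores word position, and the function names and the `horizon` parameter are assumptions made for the example.

```python
from collections import Counter


def added_words(before, after):
    """Multiset of words present in `after` beyond those in `before`."""
    return Counter(after.split()) - Counter(before.split())


def persistence_ratio(revisions, index, horizon=5):
    """Share of words added at revisions[index] that survive `horizon` revisions.

    revisions is a list of article texts, oldest first, and index must be
    at least 1. Returns None when nothing was added or when the article
    history is too short to look `horizon` revisions ahead.
    """
    before = Counter(revisions[index - 1].split())
    added = added_words(revisions[index - 1], revisions[index])
    total = sum(added.values())
    if total == 0 or index + horizon >= len(revisions):
        return None
    later = Counter(revisions[index + horizon].split())
    # Words in the later text beyond the pre-edit baseline, capped at
    # the number of copies this edit actually added.
    surviving = (later - before) & added
    return sum(surviving.values()) / total
```

For example, if an edit adds two words and only one of them is still in the article five revisions later, the ratio is 0.5.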

Page 27:
Page 28:
Page 29:
Page 30:

Quality

Other quality metrics?

Wikipedians are: Born / Made

Page 31:

Nature of Work

Conjecture: Wikipedians take on community maintenance work over time

Several ways to formalize
• Editing in “talk” (and other) namespaces (Nope: still “born”)
• Referring to “community norms” (Wikipedia policies) to explain edits

Page 32:
Page 33:
Page 34:
Page 35:

Community

Learning norms vs. learning to appeal to the norms?

Training: effective editing

Wikipedians are: Born / Made

Page 36:

Summary of findings

• Common pattern: Initial burst of activity, decline, steady state
• Wikipedians look different from day one
• Little evidence for “Becoming Wikipedian”: Wikipedians are born, not made
• Can we reconcile?
• This is depressing!

Possible responses:
• Early interventions
• Change the culture
• Systemic initiatives, e.g., APS Wikipedia Initiative: http://www.psychologicalscience.org/index.php/members/aps-wikipedia-initiative
• Accept the reality of the long tail

Page 37:

But: methodological worries

• We can’t ask Wikipedia users about our interpretations
• What if the learning happened before users registered?

Page 38:

Cyclopath: viewing and pre-registration activities are visible

As of September 2009, we identified:
• 1172 “unambiguous” users
• 268 of these users made some edits
• 440 “ambiguous” users

For unambiguous users
• Day 1 = First time a user came to the site (not the day they registered)

Page 39:

Same pattern as for Wikipedia

Page 40:

And few users edited before registration

[Figure: bar chart of # of users (axis 0 to 300), Do Not Edit vs. Do Edit]

Page 41:

Some viewing before registration

[Figure: bar chart of # of users (axis 0 to 800) by amount of viewing before registration; bins: 0, 1-50, 51-100, 101-250, 251-500, 501-1000, 1001+; annotations: A minute or two, <= 5 min., <= 15, <= 30, <= 60]

Page 42:

But amount of viewing before registration (or before editing) does not predict subsequent behavior

“Born, Not Made” still seems true

Page 43:

Followups

• Cyclopath user surveys – WikiSym 2011 paper
• Why these patterns?
• What ‘triggers’ initial contribution?
• And how might we nurture ongoing participation?
• Cyclopath contextual interviews planned

Page 44:

3. Exercising Experimental Control

Page 45:

Research Question

Motivating participation: How can we get more work done in open content systems?

Idea: match users with tasks they’re likely to be interested in and capable of doing

Requirements:
• Introduce task-matching algorithms/interfaces
• Assign users to different conditions
• Gather data necessary for evaluation
• Survey users

Page 46:

Intelligent Task Routing

Goals
• Get work done
• Nurture new users
• Serve community

Tools
• Recommender algorithms
• Interaction design

Theory
• Collective Effort Model
• Social Influence

Page 47:

MovieLens

Task: Edit movie content

Page 48:

Four strategies to suggest movies to a user

• High Pred (individual value of outcomes): Pick movies the system thinks the user will really like
• Rare Rated (lower effort for a given performance): Pick movies the user has rated that few others have
• Needs Work (contribution matters to group): Pick movies that are missing the most information
• Random (baseline): Pick random movies

Theory-based: High Pred, Rare Rated, Needs Work
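
The four strategies can be illustrated with a minimal Python sketch. The data layout is entirely hypothetical (each movie is a dict with `id`, `num_ratings`, and `missing_fields`; each user has a `predicted` rating map and a `rated` set), and the real MovieLens implementation is not shown in the talk.

```python
import random


def suggest(strategy, user, movies, k=10, rng=None):
    """Return up to k movie ids to suggest for editing under one strategy."""
    rng = rng or random.Random()
    if strategy == "HighPred":
        # Movies the system predicts the user will like most.
        ranked = sorted(
            movies,
            key=lambda m: user["predicted"].get(m["id"], 0.0),
            reverse=True,
        )
    elif strategy == "RareRated":
        # Movies the user has rated, fewest total ratings first.
        rated = [m for m in movies if m["id"] in user["rated"]]
        ranked = sorted(rated, key=lambda m: m["num_ratings"])
    elif strategy == "NeedsWork":
        # Movies missing the most information.
        ranked = sorted(movies, key=lambda m: m["missing_fields"], reverse=True)
    elif strategy == "Random":
        # Baseline: shuffle a copy of the movie list.
        ranked = list(movies)
        rng.shuffle(ranked)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [m["id"] for m in ranked[:k]]
```

Keeping all four strategies behind one function with the same inputs and outputs makes it straightforward to swap the strategy per experimental condition.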

Page 49:

The experiment

• Assign MovieLens users to four groups, one per algorithm
• About 2,000 subjects, 200 contributors
• Count # editors, contributions, fields
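
A between-subjects setup like this one could be sketched as below: shuffle the users, deal them round-robin into the four conditions, then tally the three metrics per condition. The seeded shuffle, the `(user_id, fields_filled)` log format, and the function names are assumptions for illustration; the talk does not describe how assignment was actually done.

```python
import random
from collections import defaultdict

STRATEGIES = ["HighPred", "RareRated", "NeedsWork", "Random"]


def assign_conditions(user_ids, seed=0):
    """Randomly assign users to strategies in near-equal-sized groups."""
    ids = sorted(user_ids)
    random.Random(seed).shuffle(ids)
    return {uid: STRATEGIES[i % len(STRATEGIES)] for i, uid in enumerate(ids)}


def summarize(assignment, edit_log):
    """Tally editors, edits, and fields filled in per condition.

    edit_log is a hypothetical iterable of (user_id, fields_filled)
    pairs, one per edit.
    """
    editors = defaultdict(set)
    stats = {s: {"editors": 0, "edits": 0, "fields": 0} for s in STRATEGIES}
    for user_id, fields_filled in edit_log:
        group = assignment[user_id]
        editors[group].add(user_id)
        stats[group]["edits"] += 1
        stats[group]["fields"] += fields_filled
    for group, members in editors.items():
        stats[group]["editors"] = len(members)
    return stats
```

Fixing the seed makes the assignment reproducible, which helps when the analysis is rerun later against the same user list.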

Page 50:

Editing behavior by strategy

[Figure: bar chart of Count (axis 0 to 250) by Metric (Number of editors, Number of edits, Fields filled in) for HighPred, RareRated, NeedsWork, Random]

• Rare rated: dominant
• Needs work: bang for buck
• Random: not bad here
• High prediction: lousy

Page 51:

Summary of findings

• Task matching worked
• Familiarity of user with task was most helpful
• Reduces effort
• Increases value
• Note: we’ve tried this approach in Wikipedia and Cyclopath, too
• Different issues
• Generality

Page 52:

That’s great, but is there a catch?

MovieLens
• 14 years of continuous development
• Several complete software architecture / UI redos (and another needed!)
• 1 full-time software engineer
• Much graduate student time over the years

Page 53:

Cyclopath

• ~140K lines of code, in multiple languages
• 1 full-time software engineer
• Grad students: expectation they will spend 25-30% of their time on ‘development’ tasks
• Looming tasks: UI redesign / reimplementation; expanding geographic coverage

Page 54:

Adding it up

• Significant resources devoted to development
• But: typically enables new experiments and/or builds the user community
• And: funding for these resources often came only due to the success of the system/community

• Fewer papers
• But: papers of a type that would be impossible otherwise
• We can investigate questions in different settings, applying different methods: cumulative science

Page 55:

Towards TMSP

Cycloplan (in collab. with Metropolitan Council)
• Planners can develop ideas informed by usage data (“What if I add a trail here?”)
• Planners can share plans with public
• Public can explore plans, give feedback (“How much would my route be improved with this trail?”)
• Public can share concerns directly to relevant officials

Participatory Crowdsourcing (in collab. with IBM)
• Citizens as sensors
• Continua of participation; incentives

Models for participation in open content systems
• Roles, privileges, processes: Nupedia vs. Wikipedia
• Models for volunteer participation
• Initial vs. ongoing

Page 56:

All references listed at
http://www.grouplens.org/biblio

Page 57:

Thanks to…

The GroupLens Research Group, particularly:
• John Riedl
• Joe Konstan
• Reid Priedhorsky
• Dan Cosley
• Katie Panciera

And:
• Tom Erickson, IBM

Me: [email protected]
@lorenterveen