parc floss-wikipedia

From libre software to Wikipedia: A tour of open collaboration. Felipe Ortega, Libresoft, Universidad Rey Juan Carlos. e-mail: [email protected]. Twitter | Identi.ca: @jfelipe. Xerox PARC, June 14, 2011. By Diego Grez, CC-BY-SA 3.0, Wikimedia Commons.

Upload: felipe-ortega

Post on 29-Nov-2014


DESCRIPTION

Research on open collaboration: Free/Libre/Open Source Software and Wikipedia.

TRANSCRIPT

Page 1: Parc floss-wikipedia

From libre software to Wikipedia: A tour of open collaboration

Felipe Ortega, Libresoft, Universidad Rey Juan Carlos
e-mail: [email protected]
Twitter | Identi.ca: @jfelipe

Xerox PARC, June 14, 2011

By Diego Grez, CC-BY-SA 3.0, Wikimedia Commons

Page 2: Parc floss-wikipedia

© 2011 Felipe Ortega.

Some rights reserved.

This document is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

(Logos on first slide are (TM) of their respective organizations)

Page 3: Parc floss-wikipedia

Open collaboration

Page 4: Parc floss-wikipedia

“Think of how Wikipedia works, how Amazon harnesses user annotation on its site, the way photo-sharing sites like Flickr are bleeding out into other applications... We're entering an era in which software learns from its users and all of the users are connected”.

Tim O'Reilly. TIME Magazine, 24 October 2005.

By Felipe Ortega, CC-BY-SA 3.0

Page 5: Parc floss-wikipedia

In the beginning...

● ...it all started with “real programmers” and FLOSS.

● FSF, GNU, free licenses.

● Open source goes into industry.

● Libre software becomes ubiquitous.

● However:

● Crowdsourced != Open source.

● Much better if the results encourage reuse and distribution of derivative works.

Page 6: Parc floss-wikipedia

The “paradox” of open collaboration

“Wikipedia is the best thing ever. Anyone in the world can write anything they want about any subject, so you know you are getting the best possible information.”

Michael Scott (played by Steve Carell), The Office, "The Negotiation" [3.18], 5 April 2007

Page 7: Parc floss-wikipedia

3 lessons from libre software

● Onion model.

● Generational relay.

● Lasting participation.

By El_T, Public Domain, from Wikimedia Commons

Page 8: Parc floss-wikipedia

Onion model

The Social Structure of Free and Open Source Software Development. Crowston & Howison, 2005.

Page 9: Parc floss-wikipedia

Generational relay

Robles, González-Barahona. Contributor Turnover in Libre Software Projects. OSS 2006.

Page 10: Parc floss-wikipedia

Lasting participation

● Robles, González-Barahona and Michlmayr. Evolution of Volunteer Participation in Libre Software Projects: Evidence from Debian. OSS 2005.

Half-life ratio = 7.5 years!

More than 50% of the maintainers in Debian 2.0 were still present in Debian 3.1.
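
As a rough consistency check on the two figures above, the following sketch assumes that maintainers leave at a constant rate (exponential decay). That model, the function name `fraction_remaining`, and the gap of roughly 7 years between the Debian 2.0 (1998) and Debian 3.1 (2005) releases are assumptions made for illustration; they are not taken from the cited paper.

```python
def fraction_remaining(years_elapsed: float, half_life_years: float) -> float:
    """Share of an initial cohort still active after `years_elapsed`.

    Assumes exponential decay, which is an illustrative assumption and
    not necessarily the model used in the cited study.
    """
    if half_life_years <= 0:
        raise ValueError("half_life_years must be positive")
    return 0.5 ** (years_elapsed / half_life_years)


if __name__ == "__main__":
    # Hypothetical inputs: ~7 years between releases, 7.5-year half-life.
    share = fraction_remaining(years_elapsed=7.0, half_life_years=7.5)
    print(f"Share of the original cohort still present: {share:.1%}")
```

Under these assumptions, a release gap shorter than the half-life leaves slightly more than half of the original maintainers, which matches the direction of the figure reported on the slide.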

Page 11: Parc floss-wikipedia

Thesis. Wikipedia: A quantitative analysis.

● Apply lessons from libre software to understand the open collaborative process in Wikipedia.

● Content production.

● Effort distribution.

● Implications for quality.

● Participation and sustainability.

Page 12: Parc floss-wikipedia

Tool: WikiXRay

[Diagram: WikiXRay workflow, with boxes labeled Compressed DB dumps, Wikimedia Download Center, Download dumps, Preparation for data mining, Local MySQL server, Analysis (scripts + GNU R), and Results evaluation.]

Automated analysis of Wikipedia dumps. http://git.libresoft.es/WikiXRay
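
To give a flavor of what automated dump analysis involves, here is a minimal sketch that streams through an XML export and counts revisions per contributor. It is not WikiXRay's code: the slide describes a MySQL and GNU R pipeline, whereas this stand-in uses only the Python standard library. The sample data and the names `SAMPLE_DUMP`, `local_name`, and `revisions_per_editor` are hypothetical.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny made-up sample shaped like a MediaWiki XML export (illustrative only).
SAMPLE_DUMP = """<mediawiki>
  <page>
    <title>Example</title>
    <revision><contributor><username>Alice</username></contributor></revision>
    <revision><contributor><username>Bob</username></contributor></revision>
    <revision><contributor><ip>192.0.2.1</ip></contributor></revision>
  </page>
  <page>
    <title>Other</title>
    <revision><contributor><username>Alice</username></contributor></revision>
  </page>
</mediawiki>"""


def local_name(tag: str) -> str:
    """Drop an XML namespace prefix such as '{uri}revision'."""
    return tag.rsplit("}", 1)[-1]


def revisions_per_editor(stream) -> Counter:
    """Stream through a dump and count revisions per contributor."""
    counts = Counter()
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) != "revision":
            continue
        # Registered editors carry a <username>, anonymous ones an <ip>.
        for node in elem.iter():
            if local_name(node.tag) in ("username", "ip") and node.text:
                counts[node.text] += 1
        elem.clear()  # release the revision subtree; real dumps are very large
    return counts


if __name__ == "__main__":
    stream = io.BytesIO(SAMPLE_DUMP.encode("utf-8"))
    for editor, n in revisions_per_editor(stream).most_common():
        print(editor, n)
```

Streaming with `iterparse` and clearing each processed element keeps memory use roughly flat, which matters because full-history dumps do not fit in memory.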

Page 13: Parc floss-wikipedia

New articles created in Wikipedia

Entered a steady state in 2006, before the graph of monthly edits became stable (2007).

Page 14: Parc floss-wikipedia

Interaction: talk pages

[Bar chart: talk vs. no-talk, on a 0% to 100% scale, for the EN, DE, FR, PL, JA, NL, IT, PT, ES and SV language editions. Annotation: 0.0086% (old talk pages deleted).]

Page 15: Parc floss-wikipedia

Contributions per editor

● Upper truncated Pareto distribution.

● Limit in max. number of revisions by human editors.

● Better to have more editors rather than increasing contributions per editor.
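
For readers unfamiliar with the upper-truncated Pareto distribution, the sketch below shows its cumulative distribution function and a sampler based on inverting that function. The parameter names `alpha`, `lower`, and `upper` and the example values are hypothetical; they are not the values fitted in the talk.

```python
import random


def truncated_pareto_cdf(x: float, alpha: float, lower: float, upper: float) -> float:
    """CDF of a Pareto distribution truncated to the interval [lower, upper]."""
    if x <= lower:
        return 0.0
    if x >= upper:
        return 1.0
    return (1 - (lower / x) ** alpha) / (1 - (lower / upper) ** alpha)


def sample_truncated_pareto(alpha: float, lower: float, upper: float,
                            rng: random.Random) -> float:
    """Draw one value by inverting the truncated CDF (inverse transform sampling)."""
    u = rng.random()
    return lower * (1 - u * (1 - (lower / upper) ** alpha)) ** (-1 / alpha)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical parameters: 1 to 100,000 revisions per editor, shape 1.2.
    draws = [sample_truncated_pareto(1.2, 1.0, 1e5, rng) for _ in range(5)]
    print([round(d, 1) for d in draws])
```

The upper bound is what separates this from a plain Pareto: no draw can exceed `upper`, mirroring the limit on the maximum number of revisions a human editor can make.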

Page 16: Parc floss-wikipedia

Effort distribution: Gini coefficient
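
As a reminder of how the Gini coefficient summarizes inequality, here is a minimal sketch that computes it from a list of per-editor contribution counts. It uses the standard rank-based formula for sorted values; whether the talk used this exact estimator is an assumption, and the function name `gini` and the sample counts are hypothetical.

```python
def gini(contributions):
    """Gini coefficient of non-negative values (0 = perfectly equal).

    Uses G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    with x sorted in ascending order and ranks i starting at 1.
    """
    values = sorted(contributions)
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive contribution")
    if values[0] < 0:
        raise ValueError("contributions must be non-negative")
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    # Hypothetical edit counts for five editors in one month.
    print(round(gini([1, 2, 3, 50, 400]), 3))
```

A value near 0 means editors contribute similar amounts; a value near 1 means a small group accounts for almost all edits.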

Page 17: Parc floss-wikipedia

Monthly effort distribution Wikipedia

Constant over the whole history!

Ortega, F., González-Barahona, J., Robles, G. On the inequality of contributions to Wikipedia. HICSS 2008.

Page 18: Parc floss-wikipedia

Profile of editors in Featured Articles

● Most Featured Articles are at least 1,000 days old.

● 10 times more editors in FAs than in non-FAs, almost 200 times in EN (!!).

● FAs are reviewed by significantly more veteran authors (more than 3 years actively contributing to Wikipedia).

[Figure: FAs vs. non-FAs]

Page 19: Parc floss-wikipedia

The Digital Potlatch

● Book with J. Rodríguez (in Spanish).

● Ed. Cátedra, expected September 2011.

● Interdisciplinary.

● Anthropology + Engineering.

● Meritocracy in Wikipedia.

● Effort recognition.

● Motivations.

● Implications for quality.

Public Domain, from Wikimedia Commons

Page 20: Parc floss-wikipedia

Future lines of work

● Study causes of change in evolution patterns and reverts.

● “The singularity is not near”, ASC @PARC, WikiSym 2009.

● Edit diffs to study contribution patterns.

● Different types of content.

● Cross-relation with traffic patterns.

By Bios, CC-BY-SA 3.0, from Wikimedia Commons