Collaborative work beneath the surface
![Page 1: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/1.jpg)
Collaborative work beneath the surface
• Visitors only look at article pages
• But much of Wikipedia is comprised of other pages
– Conflict resolution, coordination, policies and procedures
![Page 2: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/2.jpg)
Types of work
• Direct work: immediately consumable (article pages)
• Indirect work: coordination, conflict (talk, user, procedure pages)
• Maintenance work: reverts, vandalism
![Page 3: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/3.jpg)
Less direct work
• Decrease in proportion of edits to article page
[Figure: proportion of edits to article pages by year, 2001 to 2006 (y-axis "Edit proportion", 0.5 to 1); annotated value: 70%]
![Page 4: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/4.jpg)
More indirect work
• Increase in proportion of edits to user talk
[Figure: edit proportion by year, 2001 to 2006 (y-axis "Edit proportion", 0 to 0.2); annotated value: 8%]
![Page 5: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/5.jpg)
More indirect work
• Increase in proportion of edits to user talk
• Increase in proportion of edits to procedure
[Figure: edit proportion by year, 2001 to 2006 (y-axis "Edit proportion", 0 to 0.2); annotated value: 11%]
![Page 6: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/6.jpg)
More maintenance work
• Increase in proportion of edits that are reverts
[Figure: edit proportion by year, 2001 to 2006 (y-axis "Edit proportion", 0 to 0.2); annotated value: 7%]
![Page 7: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/7.jpg)
More wasted work
• Increase in proportion of edits that are reverts
• Increase in proportion of edits reverting vandalism
[Figure: edit proportion by year, 2001 to 2005 (y-axis "Edit proportion", 0 to 0.03); annotated value: 1-2%]
![Page 8: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/8.jpg)
Global level
• Coordination costs are growing
– Less direct work (articles)
+ More indirect work (article talk, user, procedure)
+ More maintenance work (reverts, vandalism)
Kittur, Suh, Pendleton, & Chi, 2007
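The edit proportions plotted on the preceding slides can be sketched as a per-year share of edits by page type. This is a minimal illustration, not the authors' pipeline: the `(year, page_type)` input format, the function name `edit_proportions`, and the toy data are all assumptions.

```python
from collections import Counter, defaultdict


def edit_proportions(edits):
    """Return {year: {page_type: share of that year's edits}}.

    `edits` is an iterable of (year, page_type) pairs, one pair per edit.
    The input format is an assumption made for this sketch.
    """
    counts = defaultdict(Counter)
    for year, page_type in edits:
        counts[year][page_type] += 1
    return {
        year: {ptype: n / sum(by_type.values()) for ptype, n in by_type.items()}
        for year, by_type in counts.items()
    }


# Hypothetical toy data, not the data behind the slides.
sample_edits = (
    [(2001, "article")] * 9 + [(2001, "talk")] * 1
    + [(2006, "article")] * 7 + [(2006, "user talk")] * 1
    + [(2006, "procedure")] * 1 + [(2006, "talk")] * 1
)
shares = edit_proportions(sample_edits)
print(shares[2001]["article"], shares[2006]["article"])
```

A falling article share with rising talk, user, and procedure shares is the pattern the slides describe as growing coordination cost.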
![Page 9: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/9.jpg)
Article lifespan
• How do articles change over time?
• High discussion and coordination
– Kittur et al., 2007; Viegas et al., 2007
• When does this happen?
– Hyp 1: Early when articles are growing
– Hyp 2: Late when articles are more stable
![Page 10: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/10.jpg)
Article lifespan
![Page 11: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/11.jpg)
User lifespan
• How do users change over time?
![Page 12: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/12.jpg)
Centralization in Wikipedia
• How much centralization?
• “Gang of 500” (Jimmy Wales, 2004)
– Small group of ~500 does half the work
• Masses do the work (Aaron Swartz, 2006)
– New users add most of the words
![Page 13: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/13.jpg)
Hypotheses
• Masses dominate
• Elite privileged group
• Shift from elites to masses
– Technology adoption (Rogers, 1962)
[Figure: three panels labeled Masses, Elites, Shift]
![Page 14: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/14.jpg)
Elites
• Admins
• Editing status (fixed-size)
• Editing status (scaling)
![Page 15: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/15.jpg)
Admins
• Waxing and waning of admin influence
[Figure: proportion of all edits made by admins, by year, 2001 to 2006 (y-axis 0 to 1)]
Nature News, 2/2007; Kittur, Chi, Pendleton, Suh, Mytkowicz, 2007
![Page 16: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/16.jpg)
Admins
• Similar for changed words
[Figure: proportion of changed words attributable to admins, by year, 2001 to 2006 (y-axis 0 to 1)]
![Page 17: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/17.jpg)
Elites
• Admins
• Editing status (fixed-size)
• Editing status (scaling)
![Page 18: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/18.jpg)
Editing status (fixed size)
![Page 19: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/19.jpg)
Elites
• Admins
• Editing status (fixed-size)
• Editing status (scaling)
![Page 20: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/20.jpg)
Editing status (scaling)
• Proportional influence of elites still high
– Though absolute number of elites growing
[Figure: proportion of edits by year, 2001 to 2006 (y-axis 0 to 1); series: Top 5%, Top 3%, Top 1%]
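The scaling definition of elite status (the top N% of users rather than a fixed-size group) can be sketched as below. The function name `top_share`, the rounding rule for group size, and the toy edit counts are assumptions for illustration only; the deck does not say how ties or rounding were handled.

```python
def top_share(edit_counts, percent):
    """Share of all edits made by the top `percent` percent of users.

    `edit_counts` holds one total per user. The elite group size scales
    with the number of users and is rounded up, with a minimum of one user.
    """
    ranked = sorted(edit_counts, reverse=True)
    k = max(1, (percent * len(ranked) + 99) // 100)  # integer ceiling
    return sum(ranked[:k]) / sum(ranked)


# Hypothetical population of 100 users, not Wikipedia data.
counts = [500] + [50] * 9 + [1] * 90
for pct in (1, 3, 5):
    print(pct, round(top_share(counts, pct), 3))
```

Because the group size grows with the user base, the absolute number of elite users can rise while their proportional share of edits stays high.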
![Page 21: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/21.jpg)
Summary: Centralization
• Centralized elite influence is waning
– Decline in admin influence
– Decline in data-driven “Gang of 500”
• Decentralized proportional influence remains high
– Top 1/3/5% of users account for ~50/70/80% of edits
– The “Bourgeoisie”
![Page 22: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/22.jpg)
Challenges for Wikipedia
• Coordination costs
• Organization structure
• Conflict
![Page 23: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/23.jpg)
Characterizing conflict
![Page 24: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/24.jpg)
Conflict at the article level
• What leads to conflict in articles?
• Build a characterization model of article conflict
– Identify page features and metrics associated with conflict
– Automatically identify high-conflict articles
![Page 25: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/25.jpg)
Page metrics
• Chose metrics for identifying conflict in articles
– Easily computable, scalable

| Metric type | Page type |
| --- | --- |
| Revisions (#) | Article, talk, article/talk |
| Page length | Article, talk, article/talk |
| Unique editors | Article, talk, article/talk |
| Unique editors / revisions | Article, talk |
| Links from other articles | Article, talk |
| Links to other articles | Article, talk |
| Anonymous edits (#, %) | Article, talk |
| Administrator edits (#, %) | Article, talk |
| Minor edits (#, %) | Article, talk |
| Reverts (#, by unique editors) | Article |
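Several of the metrics listed above can be computed in one pass over a page's revision history. The sketch below assumes each revision is a dict with hypothetical keys `editor`, `anonymous`, `admin`, and `minor`; these field names are illustrative and do not come from the deck or from any particular Wikipedia dump format.

```python
def page_metrics(revisions):
    """Compute a few of the listed page metrics from a revision list."""
    n = len(revisions)
    editors = {r["editor"] for r in revisions}

    def count(flag):
        # Number of revisions for which the boolean field is set.
        return sum(1 for r in revisions if r[flag])

    def pct(k):
        return k / n if n else 0.0

    anon, admin, minor = count("anonymous"), count("admin"), count("minor")
    return {
        "revisions": n,
        "unique_editors": len(editors),
        "unique_editors_per_revision": pct(len(editors)),
        "anonymous_edits": anon, "anonymous_pct": pct(anon),
        "admin_edits": admin, "admin_pct": pct(admin),
        "minor_edits": minor, "minor_pct": pct(minor),
    }


# Hypothetical revision history for a single page.
history = [
    {"editor": "Alice", "anonymous": False, "admin": True, "minor": False},
    {"editor": "10.0.0.1", "anonymous": True, "admin": False, "minor": False},
    {"editor": "Alice", "anonymous": False, "admin": True, "minor": True},
    {"editor": "Bob", "anonymous": False, "admin": False, "minor": True},
]
metrics = page_metrics(history)
print(metrics["unique_editors"], metrics["anonymous_pct"])
```

Running the same function separately over article and talk-page histories would give the per-page-type variants in the table.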
![Page 26: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/26.jpg)
Defining conflict
• Operational definition for conflict
• Revisions tagged controversial
• Conflict revision count
![Page 27: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/27.jpg)
Machine learning
• Predict conflict from page metrics
– Training set of “controversial” pages
– Support vector machine regression predicting # controversial revisions (SMOreg; Smola & Schölkopf, 1998)
• Not just conflict/no conflict, but how much conflict
![Page 28: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/28.jpg)
Performance: Cross-validation
• 5x cross-validation, R² = 0.897
[Figure: actual controversial revisions (y-axis, 0 to 10000) vs. predicted controversial revisions (x-axis, 0 to 10000)]
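How a cross-validated R² of this kind can be obtained is sketched below. To stay dependency-free, the sketch swaps in a one-feature least-squares line for the SVM regression (SMOreg) named on the earlier slide, so it illustrates only the evaluation procedure, not the authors' model. The fold assignment, function names, and toy data are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (stand-in model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def cross_val_r2(xs, ys, k=5):
    """k-fold cross-validation: every point is predicted by a model
    that was fitted without it, then R^2 is computed over all points."""
    preds = [0.0] * len(xs)
    for fold in range(k):
        test_idx = list(range(fold, len(xs), k))  # every k-th item
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        slope, intercept = fit_line(
            [xs[i] for i in train_idx], [ys[i] for i in train_idx]
        )
        for i in test_idx:
            preds[i] = slope * xs[i] + intercept
    return r_squared(ys, preds)


# Hypothetical data: one page metric vs. a controversial-revision count.
xs = list(range(1, 21))
ys = [3 * x + (1 if x % 2 else -1) for x in xs]
r2 = cross_val_r2(xs, ys)
print(round(r2, 3))
```

Scoring only held-out predictions is what makes the reported R² a measure of generalization rather than of fit to the training pages.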
![Page 30: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/30.jpg)
Determinants of conflict
Highly weighted metrics of conflict model:
1. ↑ Revisions (talk)
2. ↑ Minor edits (talk)
3. ↓ Unique editors (talk)
4. ↑ Revisions (article)
5. ↓ Unique editors (article)
6. ↑ Anonymous edits (talk)
7. ↓ Anonymous edits (article)
![Page 31: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/31.jpg)
Identifying untagged articles
• Detect conflicts for unlabeled articles
– Majority of articles have never been conflict tagged
• Testing model generalization
– Applied model to untagged articles
– Sample of 28 articles rated by 13 expert Wikipedians
• Significant positive correlation with predicted scores
– By rank correlation, p < 0.013 (Spearman’s rho)
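The rank correlation used for this check can be sketched as Spearman's rho: the Pearson correlation of the two rank vectors, with tied values given their average rank. The function names and toy scores are assumptions, and the sketch computes only rho, not the p-value reported on the slide.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(xs, ys):
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical model scores and expert ratings for five articles.
predicted = [120, 40, 300, 15, 80]
expert = [4, 2, 5, 1, 2]
print(round(spearman_rho(predicted, expert), 3))
```

Rank correlation fits here because expert ratings are ordinal: only the ordering of articles by conflict needs to agree with the model, not the raw magnitudes.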
![Page 32: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/32.jpg)
Characterizing conflict
![Page 33: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/33.jpg)
Conflict at the user level
• How can we identify conflict between users?
• Reverts between users as a proxy for user conflict
• Force directed layout to cluster users
– Group similar viewpoints
– Find conflicts between groups
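Using reverts as a proxy for user conflict starts from a graph of who reverted whom. The sketch below only builds that weighted graph; the force-directed layout step is not shown. The `(reverter, reverted)` input format and the user names are hypothetical.

```python
from collections import Counter


def revert_graph(reverts):
    """Count reverts between each unordered pair of users.

    `reverts` is an iterable of (reverter, reverted) user-name pairs.
    Edge weights could then feed a force-directed layout.
    """
    edges = Counter()
    for reverter, reverted in reverts:
        if reverter != reverted:  # ignore self-reverts
            edges[frozenset((reverter, reverted))] += 1
    return edges


# Hypothetical revert log for one article.
log = [("A", "B"), ("B", "A"), ("A", "C"), ("D", "D")]
edges = revert_graph(log)
for pair, weight in sorted(edges.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(pair), weight)
```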
![Page 34: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/34.jpg)
Dokdo/Takeshima opinion groups
[Figure: user clusters labeled Group A, Group B, Group C, Group D]
![Page 35: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/35.jpg)
Terri Schiavo
[Figure: user clusters labeled Mediators; Sympathetic to parents; Sympathetic to husband; Anonymous (vandals/spammers)]
![Page 36: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/36.jpg)
Cognitive atlas
![Page 37: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/37.jpg)
Visualizing hypotheses
![Page 38: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/38.jpg)
Distributed collaboration
• Lots of people
• Each doing a little bit of work
• Leads to high quality outcome (i.e., “wisdom of crowds”)
[Images: Francis Galton; ox; scale]
![Page 39: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/39.jpg)
Distributed collaboration
• Applications of distributed collaboration:
– Judging: weight of an ox, temperature of a room
– Search: Google PageRank
– Predicting: Iowa Electronic Market, Las Vegas, HP
– Filtering: Digg, Reddit
– Organizing: del.icio.us
• Common characteristics:
– Independent judgments
– Independent aggregation
![Page 40: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/40.jpg)
Wikipedia and the wisdom of crowds
• But these are not characteristic of Wikipedia:
– Instead of independent judgments: high coordination costs (Kittur et al., 2007)
– Instead of independent aggregation: competitive aggregation (everyone is editing the same information)
• To the extent that judgments and aggregation of individual tasks are not independent and instead require coordination and engender conflict, having more editors may not be beneficial and may even be harmful
![Page 41: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/41.jpg)
Travesty of the commoners?
• Increasing size of group generally has negative consequences:
– Increased coordination costs
– Increased anonymity and social loafing
– Decreased attribution and individual reward
– More negative social relations
– Greater conflict and misbehavior
– Loss of control
– Cognitive overload
see Bettenhausen, 1991; Levine & Moreland, 1990
![Page 42: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/42.jpg)
Wilkinson & Huberman, 2007
• Examined featured articles vs. non-featured articles
– Controlling for PageRank (i.e., popularity)
• Featured articles = more edits, more editors
• “More work, better outcome”: WP similar to other distributed collaboration systems
Nature News (2/27/07)
![Page 43: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/43.jpg)
Problem: Distribution of work
• However, articles can have different distributions of work, even with same edits/editors
• If an article has 1000 edits and 100 editors, it could have:
– 1 editor making 901 edits, 99 making 1 edit
– 100 editors making 10 edits each
![Page 44: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/44.jpg)
Capturing skew
• Gini coefficient: measures inequality of distribution
• Measure Gini coefficient for each article
– Count how many edits each editor makes, calculate ratio
• If an article is driven by few, gini -> 1
• If an article is driven by many, gini -> 0
http://en.wikipedia.org/wiki/Gini_coefficient
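A per-article Gini coefficient over editors' edit counts can be sketched as follows, using the standard sorted-rank formula. The function name `gini` is an assumption; the two inputs reproduce the 1000-edit, 100-editor example from the earlier slide. The deck does not say which estimator the authors used.

```python
def gini(edit_counts):
    """Gini coefficient of per-editor edit counts (0 = equal contributions).

    With counts sorted ascending as x_1..x_n:
        G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    """
    xs = sorted(edit_counts)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


skewed = [901] + [1] * 99  # 1 editor making 901 edits, 99 making 1 edit
even = [10] * 100          # 100 editors making 10 edits each
print(round(gini(skewed), 3), round(gini(even), 3))
```

Both articles have 1000 edits and 100 editors, yet the skewed case scores about 0.89 while the even case scores 0, which is the difference that edit and editor counts alone cannot capture.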
![Page 45: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/45.jpg)
Old results
[Figure: Gini coefficient (y-axis, 0 to 1) vs. edits (x-axis, log scale, 1 to 100000); series: Top15k page hits, Featured]
* Significant difference between featured (M=.46) and Top5k (M=.39) Gini coefficients (p < .0001), and between Top5k (M=.39) and 5-15k (M=.34, p < .0001)
![Page 46: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/46.jpg)
P(Featured | Gini quintile)
[Figure: Gini coefficient (y-axis, 0 to 1) vs. edits (x-axis, log scale, 1 to 100000); series: Top15k page hits, Featured]
![Page 47: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/47.jpg)
Probability of Being Featured
[Figure: P(Featured) (y-axis, 0 to 1) by Gini quintile (x-axis, 1 to 5); labels: Experts, Crowds]
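P(Featured | Gini quintile) can be estimated by sorting articles by Gini coefficient, cutting them into five equal-sized groups, and taking the featured fraction in each. The function name, the `(gini, is_featured)` input format, and the toy values are assumptions; the numbers below are invented and do not reproduce the plotted results.

```python
def featured_rate_by_quintile(articles):
    """Fraction of featured articles in each Gini quintile.

    `articles` is a list of (gini, is_featured) pairs. Returns five rates,
    lowest-Gini quintile first.
    """
    ranked = sorted(articles, key=lambda a: a[0])
    n = len(ranked)
    rates = []
    for q in range(5):
        group = ranked[q * n // 5:(q + 1) * n // 5]
        featured = sum(1 for _, is_featured in group if is_featured)
        rates.append(featured / len(group) if group else 0.0)
    return rates


# Hypothetical articles: (Gini coefficient, featured?).
toy_articles = [
    (0.05, False), (0.10, False), (0.20, False), (0.25, False), (0.30, True),
    (0.35, False), (0.45, False), (0.50, True), (0.60, True), (0.70, True),
]
rates = featured_rate_by_quintile(toy_articles)
print(rates)
```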
![Page 48: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/48.jpg)
Unique editors x Edits
[Figure: unique editors (y-axis, log scale, 1 to 10000) vs. edits (x-axis, log scale, 1 to 100000); series: Top15k page hits, Featured]
Unique editors: Feat vs. Top15k (M=381), p < .001***
![Page 49: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/49.jpg)
New results
• Sampled articles at a variety of quality levels
– Defined and rated by expert Wikipedians
– Hundreds of thousands of articles rated
![Page 50: Collaborative work beneath the surface](https://reader035.vdocument.in/reader035/viewer/2022062719/568131ca550346895d9831f6/html5/thumbnails/50.jpg)
Cross-sectional analysis
• 900 articles sampled from Start through Featured
– Higher quality associated with higher gini, more editors
[Figure: average article Gini (artGini, y-axis 0 to 0.6) by quality class: Start-Class, B-Class, GA-Class, A-Class, FA-Class]
[Figure: average number of article editors (artEditors, y-axis 0 to 400) by quality class: Start-Class, B-Class, GA-Class, A-Class, FA-Class]
[Figure: five panels (FA-Class, A-Class, GA-Class, B-Class, Start-Class), each plotting article Gini (y-axis, 0 to 1; 0 = equal contributions) against number of article editors (x-axis, log scale, 1 to 10000)]