TeLLNet
Lehrstuhl Informatik 5 (Information Systems)
Prof. Dr. M. Jarke I5-FL-MMYY-1 This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
Community Dynamics in Open Source Software Projects: Aging and Social Reshaping
Anna Hannemann and Ralf Klamma
RWTH Aachen University
Advanced Community Information Systems (ACIS)
Motivation for Study Settings
Address interdisciplinary projects (Bioinformatics)
– Biology meets Computer Science
– High disparities in level of development experience
– Better approximation for end-user integration in community information systems (Lead User [1], Open Innovation [2], etc.)
Analysis of long-tail: based on mailing lists
Dynamic analysis: community evolution
– Demographic perspective
– Social structure perspective
1 von Hippel, E. “Lead users: a source of novel product concepts”, 1986
2 Chesbrough, H. “Open Innovation: The new imperative for creating and profiting from technology”, 2003
Open Bio*
BioJava, Biopython, BioPerl
Similar problems, infrastructure, organizational issues
Open Bioinformatics Foundation
Long-term: over 13 years

Project*    #Messages   #Users in ML   #Commits   #Developers   LOC
BioJava     11951       2208           8267       94            290608
Biopython   16108       1138           16868      143           249566
BioPerl     31755       2824           12848      139           383351
* [Data on May 20, 2013]
Newcomers vs. Survived Users BioJava
Newcomers vs. Survived Users Biopython
Newcomers vs. Survived Users BioPerl
External Factors
High attention to Bioinformatics due to sequencing of human genome
Cross-project influence: rich get richer
Personal aspects:
– doing PhD for 3 years
– being in a project with room for OSS
Population Ecology
Year of birth t_{0,i}: date of the first message from user i to the project mailing list
Age group (x; x+1): all currently active project participants who have been participating in the project for at least x and at most x+1 years
Currently active: at least one posting to the mailing lists in the current year
Survival rate (x; x+1) → (x+1; x+2): percentage of the users active in age group (x; x+1) in the previous year who are still active in the current year
Population Ecology, Example 2010
Age groups
– (0,1): people started in 2010
– (1,2): people started in 2009, still active in 2010
– (2,3): people started in 2008, still active in 2010
– …
Survival rates
– |(1,2)|_2010 / |(0,1)|_2009
– |(2,3)|_2010 / |(1,2)|_2009
– |(3,4)|_2010 / |(2,3)|_2009
– ...
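The age-group and survival-rate definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input format (a list of `(user, year)` posting records), the function names `age_groups` and `survival_rates`, and the sample data are all hypothetical and not taken from the study.

```python
from collections import defaultdict

def age_groups(posts, year):
    """Return {x: set of users} for age groups (x; x+1) in `year`.

    posts: iterable of (user, year_of_posting) pairs (assumed input format).
    A user is "currently active" if they posted at least once in `year`;
    their age is `year` minus the year of their first posting.
    """
    first = {}
    for user, y in posts:
        first[user] = min(y, first.get(user, y))
    active = {user for user, y in posts if y == year}
    groups = defaultdict(set)
    for user in active:
        groups[year - first[user]].add(user)
    return dict(groups)

def survival_rates(posts, year):
    """Return {x: |(x+1; x+2)|_year / |(x; x+1)|_(year-1)} as on the slide."""
    prev = age_groups(posts, year - 1)
    curr = age_groups(posts, year)
    return {x: len(curr.get(x + 1, set())) / len(members)
            for x, members in sorted(prev.items())}

# Hypothetical posting records, for illustration only.
posts = [
    ("ann", 2008), ("ann", 2009), ("ann", 2010),
    ("eve", 2008), ("eve", 2009),
    ("bob", 2009), ("bob", 2010),
    ("cat", 2009),
    ("dan", 2010),
]

groups_2010 = age_groups(posts, 2010)
rates_2010 = survival_rates(posts, 2010)
```

Because the rate is computed as a ratio of group sizes, exactly as in the example, a user who skipped the previous year but posts again in the current one is counted in the numerator; this sketch makes no attempt to treat such returning users separately.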
Demographic Forecast
$P_0 = [(0;1) \to (1;2)] \approx 20\%$
$P_1 = [(1;2) \to (2;3)] \approx 40\%$
$P_m = [(x;x+1) \to (x+1;x+2)] \approx 90\%, \ \forall x > 1$
Power-law distribution of survival rates
Rebirth phenomena
Conclusions and Discussion (1)
Survival pattern:
– Prediction of the minimal number of newcomers required to support the same level of participation
– Only 7.2% survive longer than three years
– Whoever survives more than three years stays “forever”
No maximal participation duration
– Number of “oldies” increases continuously
– Possible seclusion against newcomers
$\mathit{newcomer}_{t+1} \geq (0;1)_t \cdot 0.2 + (1;2)_t \cdot 0.4 + \dots + (x;x+1)_t \cdot 0.9, \ \forall x > 1$
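As a rough illustration of how the approximate survival rates from the demographic forecast (20%, 40%, 90%) can be applied, the sketch below projects the current age groups one year ahead. It also shows that the 7.2% figure is consistent with multiplying the three rates: 0.2 × 0.4 × 0.9 = 0.072. The function names and the sample group sizes are hypothetical, and treating the rates as fixed constants is a simplifying assumption.

```python
import math

# Approximate survival rates from the forecast, treated as constants (assumption).
P0, P1, PM = 0.2, 0.4, 0.9

def rate(age):
    """Survival rate for the transition (age; age+1) -> (age+1; age+2)."""
    if age == 0:
        return P0
    if age == 1:
        return P1
    return PM

def expected_survivors(groups):
    """groups: {x: number of active users in age group (x; x+1) this year}.

    Returns {x+1: expected number of those users still active next year}.
    """
    return {x + 1: n * rate(x) for x, n in groups.items()}

def survival_beyond_three_years():
    """Share of newcomers expected to reach age group (3; 4)."""
    return rate(0) * rate(1) * rate(2)

# Hypothetical group sizes, for illustration only.
groups_now = {0: 100, 1: 20, 2: 10, 3: 5}
projected = expected_survivors(groups_now)
total_survivors = sum(projected.values())
share_after_three = survival_beyond_three_years()
```

With these made-up sizes, roughly 41.5 of the 135 currently active users would be expected to remain next year, most of them from the small groups older than two years.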
Social Network Analysis
Social Network (1 for each year)
– Nodes: Email Participants
– Relations: Same thread
Shortest path
Diameter
Node betweenness
Largest connected component
Density
Transitivity
Edge betweenness clustering
Biopython Network for 2012
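A minimal sketch of how such a yearly network could be built and measured, using only the Python standard library. Several points are assumptions rather than details from the slides: every pair of participants in the same thread is linked, the diameter is taken over the largest connected component, and the names `build_network`, `density`, `largest_component`, and `diameter`, as well as the sample threads, are hypothetical. Betweenness, transitivity, and clustering are left out to keep the example short.

```python
from collections import deque
from itertools import combinations

def build_network(threads):
    """threads: iterable of participant lists, one per mail thread.

    Returns an undirected graph as {participant: set of neighbours},
    linking every pair of participants that share a thread (assumption).
    """
    adj = {}
    for participants in threads:
        people = set(participants)
        for p in people:
            adj.setdefault(p, set())
        for a, b in combinations(sorted(people), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj

def density(adj):
    """Existing edges divided by the number of possible edges."""
    n = len(adj)
    if n < 2:
        return 0.0
    edges = sum(len(nb) for nb in adj.values()) / 2
    return edges / (n * (n - 1) / 2)

def bfs_distances(adj, source):
    """Shortest-path lengths (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def largest_component(adj):
    """Node set of the largest connected component."""
    seen, best = set(), set()
    for node in adj:
        if node not in seen:
            comp = set(bfs_distances(adj, node))
            seen |= comp
            if len(comp) > len(best):
                best = comp
    return best

def diameter(adj):
    """Longest shortest path inside the largest connected component."""
    comp = largest_component(adj)
    return max((max(bfs_distances(adj, n).values()) for n in comp), default=0)

# Hypothetical threads of one year, for illustration only.
threads_one_year = [["ann", "bob"], ["bob", "cat"], ["cat", "dan"], ["eve", "fay"]]
net = build_network(threads_one_year)
net_density = density(net)
net_core = largest_component(net)
net_diameter = diameter(net)
```

Building one such network per year and recording the measures yields the kind of time series that the diameter and betweenness slides plot for each project.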
Social Measures
Dynamic of Diameter
[Figure: network diameter (y-axis, 5 to 12) per year (x-axis, 2001 to 2010) for BioJava, Biopython, and BioPerl]
Dynamic of Max Betweenness
[Figure: normalized maximal betweenness (y-axis, 0 to 0.7) per year (x-axis, 2001 to 2010) for BioJava, Biopython, and BioPerl]
BioJava, Social Network, 2006
Biopython, Social Network, 2005
BioPerl, Social Network, 2004
BioJava, Core Evolution
Biopython, Core Evolution
BioPerl, Core Evolution
Development Evolution, Biopython
Conclusion and Discussion (2)
Core evolution
– Evolves strongly
– Core generations (ca. 5-year periods)
– Dangerous for the whole project
– Defines organizational principles
– Can be predicted by a combination of diameter and max betweenness
Threats to Validity
– Evolution step size (year to year, release to release, etc.)
– Scientist-driven OSS
– Construct validity: quality of data; network construction
– Internal validity: observation – explanation