Implications of Dual Participation of FLOSS Developer

Sulayman K. Sowe, I. Samoladas, I. Stamelos, L. Angelis. Dept. of Informatics, Aristotle University, Greece. [email protected] 3rd International Workshop on Public Data about Software Development (WoPDaSD), 10th September 2008, Milan, Italy. Are FLOSS Developers Committing to CVS/SVN as much as they are Talking in Mailing Lists? Challenges for Integrating Data from Multiple Repositories. This research is partially sponsored by the FLOSSMetrics Project (Ref. No. FP6-IST5-033547), http://flossmetrics.org/ and SQO-OSS project (Ref. No. FP6-IST-5-033331), http://www.sqo-oss.eu/

Upload: sksowe

Post on 14-Jun-2015


DESCRIPTION

Are FLOSS Developers Committing to CVS/SVN as much as they are Talking in Mailing Lists? Challenges for Integrating Data from Multiple Repositories

TRANSCRIPT

Page 1: Implications Of Dual Participation Of Floss Developer

WoPDaSD ~.WoPDaSD ~.11

Sulayman K. Sowe, I. Samoladas, I. Stamelos, L. Angelis
Dept. of Informatics, Aristotle University, Greece.

[email protected]

3rd International Workshop on Public Data about Software Development (WoPDaSD), 10th September 2008, Milan, Italy.

Are FLOSS Developers Committing to CVS/SVN as much as they are Talking in Mailing Lists?

Challenges for Integrating Data from Multiple Repositories

This research is partially sponsored by the FLOSSMetrics Project (Ref. No. FP6-IST5-033547), http://flossmetrics.org/ and SQO-OSS project (Ref. No. FP6-IST-5-033331), http://www.sqo-oss.eu/

Page 2: Implications Of Dual Participation Of Floss Developer

In this presentation...

➲ Nomadic life of FLOSS developers
  Motivation for this research: research hypothesis

➲ Methodology in brief
  Data & source
  Identification of developers from SVN & lists

➲ Results & discussion

➲ Summary & conclusion
  Ongoing research

Page 3: Implications Of Dual Participation Of Floss Developer

Nomadic life of FLOSS developers

➲ Like the Fulani nomads of the West African plains, FLOSS developers are not bound to a single territory and are free to:

  participate in other projects or communities,
  use and reuse software or bits of code from other projects,
  suggest, argue for or against requirements, specs., etc. in projects where they have the least commit rights,
  use different identities (usernames, email addresses), etc.

Page 4: Implications Of Dual Participation Of Floss Developer

Motivation for this research

➲ Why research FLOSS developers or nomads?
  To understand the collaborative nature of developing FLOSS in terms of developer participation (code commits and email postings) in multiple repositories: SVN and mailing lists.

➲ Research hypothesis: if mailing lists are the main communication veins in most projects, then CVS/SVN is a collection of arteries. Thus, FLOSS developers both code and participate in list discussions:

  H0: "FLOSS developers contribute equally to the code repository and mailing lists"; alternative
  H1: "FLOSS developers contribute more to the code repository than to mailing lists".

Page 5: Implications Of Dual Participation Of Floss Developer

Methodology…Data & Source

➲ Retrieve data for 14 projects from the FLOSSMetrics retrieval system:

  Mailing list data dumps (.sql file format)
  SVN data dumps (.sql file format)

Page 6: Implications Of Dual Participation Of Floss Developer

Initial (Raw) Data

➲ How many SVN committers and mailing list posters are there in each project?

[Figure: per-project charts, panels "SVN Commits" and "ML Posts"]

Page 7: Implications Of Dual Participation Of Floss Developer

Methodology…Identification of developers

➲ The main problem in studying developers' activities in multiple repositories is identification:

➲ Is committer A in the SVN of project X the same person as poster A in the mailing lists of project X?
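The identification question can be illustrated with a small matching heuristic. This is a minimal sketch, not the procedure used in the study: the normalization rule (lowercase, keep the local part of an email address, drop punctuation) and all names in the example are assumptions made for illustration. A heuristic like this produces both false matches and misses, so its output would still need manual review.

```python
import re

def normalize_identity(raw: str) -> str:
    """Reduce a username or email address to a comparable key (heuristic)."""
    key = raw.strip().lower()
    key = key.split("@", 1)[0]            # keep only the local part of an email
    key = re.sub(r"[^a-z0-9]", "", key)   # drop dots, dashes, underscores
    return key

def match_identities(committers, posters):
    """Return {committer: [posters sharing the same normalized key]}."""
    by_key = {}
    for poster in posters:
        by_key.setdefault(normalize_identity(poster), []).append(poster)
    return {
        c: by_key[normalize_identity(c)]
        for c in committers
        if normalize_identity(c) in by_key
    }

# Hypothetical identities, for illustration only
svn_committers = ["jdoe", "a.smith", "builder"]
ml_posters = ["J.Doe@example.org", "asmith@example.com", "carol@example.net"]
matches = match_identities(svn_committers, ml_posters)
```

Here `jdoe` and `a.smith` each find one candidate poster, while `builder` stays unmatched and would have to be resolved by hand or left out.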

Page 8: Implications Of Dual Participation Of Floss Developer

Results & Discussion…1

➲ The query result for each project gave us the developers' co-occurrence in both SVN and the mailing lists.

➲ N = 486 for all 14 projects. Percentage of developers in both repositories:

  In 8 projects = 57.14%
  In 4 projects = 90.11%
  In 2 projects = 80.21%

➲ What is going on in ibatis and turbine?
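A co-occurrence query of this kind might look like the sketch below, written against an in-memory SQLite database. The table names (`commits`, `posts`), the column names, and the sample rows are hypothetical; the actual FLOSSMetrics dumps have their own schema. The sketch also assumes that identities have already been unified, so the same string denotes the same person in both tables, and it computes the share relative to committers, which is only one possible choice of denominator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE commits (developer TEXT, revision INTEGER);
    CREATE TABLE posts   (developer TEXT, message_id TEXT);
    INSERT INTO commits VALUES ('alice', 1), ('alice', 2), ('bob', 3), ('dave', 4);
    INSERT INTO posts   VALUES ('alice', 'm1'), ('bob', 'm2'), ('bob', 'm3'), ('carol', 'm4');
""")

# Developers present in both repositories, with their activity counts.
QUERY = """
    SELECT c.developer, c.n_commits, p.n_posts
    FROM (SELECT developer, COUNT(*) AS n_commits
          FROM commits GROUP BY developer) AS c
    JOIN (SELECT developer, COUNT(*) AS n_posts
          FROM posts GROUP BY developer) AS p
      ON c.developer = p.developer
    ORDER BY c.developer
"""
both = conn.execute(QUERY).fetchall()

# Assumed denominator: distinct committers in the project.
n_committers = conn.execute(
    "SELECT COUNT(DISTINCT developer) FROM commits"
).fetchone()[0]
share_in_both = 100.0 * len(both) / n_committers
```

Each row of `both` pairs one developer's commit count with their post count, which is the shape of data the later comparisons of commits and posts need.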

Page 9: Implications Of Dual Participation Of Floss Developer

Results & Discussion...2

➲ Distribution of commits & posts

  Domination of commits over posts
  Mean commits per developer > mean posts per developer
  Developers are committing more to SVN than they are posting to mailing lists, EXCEPT in ibatis and turbine.

Page 10: Implications Of Dual Participation Of Floss Developer

Results & Discussion...3

➲ Relationship between commits and posts

➲ Overall correlation between commits and posts shows statistical significance (with * and for p < 0.05).
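The slides do not say which correlation coefficient was used. As an illustration only, the sketch below computes Spearman's rank correlation, a common choice for skewed count data, using the standard library. The per-developer counts are invented. The sketch returns the coefficient only; a significance test would need a t-distribution, which a statistics package such as SciPy (`scipy.stats.spearmanr`) provides.

```python
from math import sqrt

def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1   # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical per-developer counts for one project
commits = [120, 45, 300, 8, 60]
posts = [80, 30, 150, 12, 95]
rho = spearman(commits, posts)
```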

Page 11: Implications Of Dual Participation Of Floss Developer

Results & Discussion...4

➲ Developers' contribution in terms of commits and posts

  The Wilcoxon signed-rank test applied to mean values shows an almost 50-50 split between projects where commits = posts (green) and commits > posts (yellow), with only the turbine project showing otherwise.
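For readers unfamiliar with the test, here is a minimal sketch of a two-sided Wilcoxon signed-rank test using only the standard library. It is not the authors' analysis script: the pairing (one commit count and one post count per developer), the sample data, and the use of a normal approximation without tie or continuity correction are all assumptions. For small samples an exact test, such as the one offered by `scipy.stats.wilcoxon`, would be more appropriate.

```python
from math import erf, sqrt

def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test, normal approximation.

    Returns (w_plus, w_minus, p_value). Zero differences are dropped and
    no tie or continuity correction is applied, so p is approximate.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd   # always <= 0
    p_value = 1 + erf(z / sqrt(2))           # equals 2 * Phi(z) for z <= 0
    return w_plus, w_minus, p_value

# Hypothetical paired counts per developer in one project
commits = [12, 30, 7, 45, 9, 22, 15, 3]
posts = [10, 14, 9, 20, 9, 25, 5, 1]
w_plus, w_minus, p_value = wilcoxon_signed_rank(commits, posts)
```

A large `w_plus` relative to `w_minus` points toward commits exceeding posts; a p-value above the chosen threshold means the data do not rule out equal contribution.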

Page 12: Implications Of Dual Participation Of Floss Developer

Summary & conclusion

➲ FLOSS developers are coding as much as they are talking. They contribute equally to code repositories and mailing lists; H0 supported.

➲ However, in almost all the projects, developers made more commits than posts; H1 supported.

➲ Why are turbine and ibatis outliers?

  Maybe a highly prolific developer is making more posts than commits, in a ratio of 4:1.
  Something peculiar about the composition of Apache-related projects.

➲ Ongoing aspects of this research

  Automate the data collection and identification process.
  Analyze a total of 60 or more projects from the FM retrieval system.
  Add a quality dimension to the committers variable:
    Categorize commits: modifications, deletions, additions, code related, documentation (reports, readme, etc.)
    Time scale/sliding frames: the evolution of commits and posts over a given period.

Page 13: Implications Of Dual Participation Of Floss Developer

Thank you for your attention

Questions?
Comments
Suggestions for improvement