A Study of the Characteristics of Developers' Activities in GitHub

2013/12/2

Saya Onoue, Hideaki Hata, Ken-ichi Matsumoto

Software Engineering Laboratory

Nara Institute of Science and Technology in Japan




What kind of developers are there?

http://techiferous.com/2011/08/are-you-a-good-programmer/

http://www.techrepublic.com/blog/10-things/10-types-of-programmers-youll-encounter-in-the-field/

• Cowboy: fast worker

• Sorcerer: derives a program

• Ninja: hidden MVP


This Study

Goal

– Understand the different types of developers and the ways that they make their contributions

Method

– Collect data from GitHub

– Analyze that data to investigate the characteristics of the activities of real developers


What is GitHub?

• Web-based hosting service for software development projects that use the Git revision control system

– Social networking service for developers

• GitHub API

– We can collect various events about developers' activities


GitHub Developers' Activities

• Code-related

– Create, Delete, Push, Fork, PullRequest

• Comment-related

– CommitComment, PullRequestReviewComment, IssueComment

• Issue-related*

– Issues

[Figure: diagram linking a developer (Dev) to Code and Issue through the events listed above]

*: report problems or ask questions


The data collection procedure

I. Select Projects

– Select active projects on GitHub for this study

II. Identify Developers

– Identify the developers in the selected projects

III. Extract Activity Events

– Using the GitHub API, get specific events for those developers


I. Select Projects

• We selected two projects for this study

– They were active projects on GitHub

Project   Language     Commits   Forks   Contributors
node      JavaScript   8,974     4,572   447
jQuery    JavaScript   5,270     4,587   168


II. Identify Developers

• We limited the contributors to developers who have made more than 100 commits.

– node: 9 developers

– jQuery: 10 developers
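
The 100-commit cutoff above can be sketched as a simple filter. This is only an illustration: it assumes contributor records shaped like the objects returned by the GitHub contributors endpoint (a `login` and a `contributions` count), and the sample data is made up rather than taken from the study.

```python
def select_core_developers(contributors, min_commits=100):
    """Return logins of contributors with MORE than `min_commits` commits."""
    return [c["login"] for c in contributors if c["contributions"] > min_commits]


# Hypothetical sample records (not data from the study).
sample = [
    {"login": "alice", "contributions": 1520},
    {"login": "bob", "contributions": 100},    # exactly 100: excluded
    {"login": "carol", "contributions": 101},
]

core = select_core_developers(sample)
print(core)
```

The comparison is strict because the slide says "more than 100 commits"; a contributor with exactly 100 commits is left out.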


III. Extract Activity Events

• GitHub has 18 different types of activity events

• We selected 8 events since the others occur rarely

• We collected the last 300 events of each developer

– These events cover all projects in which the developer participates

Event                      Outline
Create                     Created a repository, branch, or tag
Delete                     Deleted a branch or tag
PullRequest                Requested a change in the repository
Push                       Uploaded a change history
CommitComment              Commented on a commit
IssueComment               Commented on an issue
PullRequestReviewComment   Commented on a pull request
Issues                     Contributed problems or questions

Groups: coding (Create, Delete, PullRequest, Push); commenting (CommitComment, IssueComment, PullRequestReviewComment)
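
A minimal sketch of this selection step, assuming each event is a dict with a `type` and an ISO 8601 `created_at` field in UTC, as in GitHub API event payloads. Whether the study filtered to the 8 event types before or after taking the last 300 events is not stated on the slide; filtering first is an assumption here, and the `issues` group name is also introduced only for illustration.

```python
# Selected event types mapped to the groups used in this study.
EVENT_GROUPS = {
    "CreateEvent": "coding",
    "DeleteEvent": "coding",
    "PullRequestEvent": "coding",
    "PushEvent": "coding",
    "CommitCommentEvent": "commenting",
    "IssueCommentEvent": "commenting",
    "PullRequestReviewCommentEvent": "commenting",
    "IssuesEvent": "issues",  # group name assumed for illustration
}


def select_events(events, limit=300):
    """Keep the newest `limit` events whose type is one of the eight selected.

    Assumption: filter by type first, then truncate to `limit`.
    Timestamps in the same UTC ISO 8601 format sort correctly as strings.
    """
    selected = [e for e in events if e["type"] in EVENT_GROUPS]
    selected.sort(key=lambda e: e["created_at"], reverse=True)
    return selected[:limit]


# Hypothetical sample events (not data from the study).
events = [
    {"type": "PushEvent", "created_at": "2013-11-01T10:00:00Z"},
    {"type": "WatchEvent", "created_at": "2013-11-02T10:00:00Z"},
    {"type": "IssuesEvent", "created_at": "2013-11-03T10:00:00Z"},
]

newest = select_events(events, limit=2)
print([EVENT_GROUPS[e["type"]] for e in newest])
```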


The Research Questions

1. What events do developers cause?

2. When do developers work?

3. How much do developers work?


1. What events do developers cause? (1/2)

[Figure: Coding, Commenting, and Issue Handling. Example developers with many commenting events, many coding events, and balanced events]


1. What events do developers cause? (2/2)

• Different developers contribute differently

• Each project has contributors of several types

node                    N     jQuery                  N
Code                    2     Code                    3
Comment                 1     Comment                 2
Code, Comment           4     Code, Comment           2
Code, Comment, Issues   2     Code, Comment, Issues   3
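
One way to assign a developer to a row of the table above is to keep every activity group whose share of the developer's events reaches a threshold. The slides do not state the rule that was actually used, so the 25% threshold and the function name below are assumptions made only for this sketch.

```python
from collections import Counter

# Row labels follow the table above.
LABELS = ("Code", "Comment", "Issues")


def majority_profile(groups, min_share=0.25):
    """Join every group whose share of the events is at least `min_share`.

    `min_share` is a hypothetical threshold, not a value from the study.
    """
    counts = Counter(groups)
    total = sum(counts[g] for g in LABELS)
    if total == 0:
        return ""
    return ",".join(g for g in LABELS if counts[g] / total >= min_share)


# Hypothetical developers (not data from the study).
print(majority_profile(["Code"] * 8 + ["Comment"] * 2))
print(majority_profile(["Code"] * 5 + ["Comment"] * 4 + ["Issues"] * 1))
print(majority_profile(["Code"] * 4 + ["Comment"] * 3 + ["Issues"] * 3))
```

A different threshold moves developers between rows, so the counts in the table should not be expected to follow from this rule.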


2. When do developers work? (1/2)

[Figure: working days of Dev 1 to Dev 4; labels: Employed developers, Workdays]


2. When do developers work? (2/2)

• Some developers work every day, and some work on a specific day

• Some work only on weekdays; they are actually hired for this work
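
The weekday pattern can be checked by counting events per day of the week. The sketch below assumes event timestamps in the UTC ISO 8601 form used by GitHub API events (for example `2013-12-02T09:00:00Z`). It ignores the developer's local time zone, which can shift an event to the neighbouring day. The function names are hypothetical.

```python
from collections import Counter
from datetime import datetime


def weekday_counts(timestamps):
    """Count events per weekday; index 0 is Monday and index 6 is Sunday."""
    days = [
        datetime.strptime(t, "%Y-%m-%dT%H:%M:%SZ").weekday() for t in timestamps
    ]
    counts = Counter(days)
    return [counts[d] for d in range(7)]


def works_weekdays_only(timestamps):
    """True if there is at least one event and none falls on Sat or Sun."""
    counts = weekday_counts(timestamps)
    return sum(counts[:5]) > 0 and sum(counts[5:]) == 0


# Hypothetical timestamps: Monday, Tuesday, and a Saturday.
stamps = ["2013-12-02T09:00:00Z", "2013-12-03T09:00:00Z", "2013-11-30T09:00:00Z"]
print(weekday_counts(stamps))
print(works_weekdays_only(stamps[:2]))
```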


3. How much do developers work? (1/2)

[Figure: Frequencies of Activities (300 events). Examples: a few weeks, a few months, over a year]


3. How much do developers work? (2/2)

• There are large differences in frequency of activities between developers

• We can find developers who are working actively, and have contributed to the project for a long time

node     N     jQuery   N
Days     1     Days     1
Weeks    3     Weeks    2
Months   3     Months   6
Years    2     Years    1
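
The rows above appear to describe how long it took each developer to produce the collected events: a shorter span for the same number of events means more frequent activity. A minimal sketch of such a bucketing follows. The cutoffs of 7, 30, and 365 days are assumptions, since the slides do not give the boundaries, and the timestamps are assumed to be UTC ISO 8601 strings.

```python
from datetime import datetime


def activity_span_bucket(timestamps):
    """Bucket the time between the oldest and newest event.

    The 7 / 30 / 365 day cutoffs are hypothetical, not from the study.
    """
    times = [datetime.strptime(t, "%Y-%m-%dT%H:%M:%SZ") for t in timestamps]
    span_days = (max(times) - min(times)).days
    if span_days < 7:
        return "Days"
    if span_days < 30:
        return "Weeks"
    if span_days < 365:
        return "Months"
    return "Years"


# Hypothetical first and last event times (not data from the study).
print(activity_span_bucket(["2013-11-01T00:00:00Z", "2013-11-20T00:00:00Z"]))
```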


A summary of developer characteristics

Project  Developer  Majority events          Active in            Frequency
node     Dev 1      Code, Comments, Issues   node project         Years
         Dev 2      Code, Comments           node and others      Weeks
         ・・・
         Dev 3      Code                     node project         Weeks
jQuery   Dev 4      Code, Comments           Only other projects  Months
         ・・・
         Dev 5      Comments                 jQuery project       Months
         Dev 6      Code, Comments, Issues   Only other projects  Years

• Some of the characteristics of developers' activities


Conclusion and Future work

• We analyzed developers' activities in GitHub and found some characteristics of developers.

– Different developers contribute differently

– Some work every day, and some work on a specific day

– There is a large difference in activity frequencies between developers

• Future work

– Detailed analysis with more data

– Project management based on developers' characteristics
