Mining the Modern Code Review Repositories:
A Dataset of People, Process and Product
Xin Yang, Raula G. Kula, Norihiro Yoshida, Hajimu Iida
May 14–15, 2016. Austin, Texas
MSR 2016 data showcase
Osaka University, Japan
Nagoya University, Japan
NAIST, Japan
NAIST, Japan
An Overview of the Code Review Dataset
● Code Review
● Source Code
● Human / Social
Why did we make this dataset?
*Hamasaki et al., “Who does what during a code review? datasets of OSS peer review repositories”. MSR '13
Our JSON-based Dataset (Hamasaki et al., MSR '13)*
Our previous work (Hamasaki et al., MSR '13)*
Some feedback: “Hard to query...” “Hard to convert...” “Unable to access the source code...”
Script
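The feedback above hints at why a conversion script helps: nested JSON is awkward to query directly. As a minimal sketch only, assuming made-up field names (`id`, `owner`, `status`, `comments`) and hypothetical helper functions that are not the dataset's actual schema or tooling, such a script might flatten review records into relational tables so they can be queried with plain SQL:

```python
import json
import sqlite3

# Hypothetical review records; the field names are illustrative only and
# are NOT the schema of the published dataset.
RAW = """
[
  {"id": 1, "owner": "alice", "status": "MERGED",
   "comments": [{"author": "bob", "text": "LGTM"},
                {"author": "carol", "text": "Please add a test"}]},
  {"id": 2, "owner": "bob", "status": "ABANDONED",
   "comments": [{"author": "alice", "text": "Needs rebase"}]}
]
"""


def load_reviews(conn, raw_json):
    """Flatten nested review JSON into two relational tables."""
    conn.execute(
        "CREATE TABLE review (id INTEGER PRIMARY KEY, owner TEXT, status TEXT)"
    )
    conn.execute(
        "CREATE TABLE comment (review_id INTEGER, author TEXT, text TEXT)"
    )
    for rec in json.loads(raw_json):
        conn.execute(
            "INSERT INTO review VALUES (?, ?, ?)",
            (rec["id"], rec["owner"], rec["status"]),
        )
        conn.executemany(
            "INSERT INTO comment VALUES (?, ?, ?)",
            [(rec["id"], c["author"], c["text"]) for c in rec["comments"]],
        )
    conn.commit()


def comments_per_review(conn):
    """Return (review_id, comment_count) pairs, ordered by review id."""
    return conn.execute(
        "SELECT review_id, COUNT(*) FROM comment "
        "GROUP BY review_id ORDER BY review_id"
    ).fetchall()


conn = sqlite3.connect(":memory:")
load_reviews(conn, RAW)
print(comments_per_review(conn))
```

Once the records sit in tables, questions such as "how many comments did each review receive?" become one-line queries instead of custom JSON traversal code.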
Typical Modern Code Review Process
Process
Product
People
You can mine from three different aspects
4 years    3 years    7 years    4 years    3 years
611        20         567        111        189
173,749    13,597     63,610     110,172    9,168
5,091      437        3,334      1,437      759
Dataset Statistics (updated to May 2015)
goo.gl/Wi4UoJ
Download the Dataset
Get Your Copy Now!!!