Mining Software Repositories: Using Humans to Better Software

Marat Akhin

15/06/2015

Upload: marat-akhin

Post on 30-Jul-2015

What is MSR?

Mining software repositories

Understand empirical aspects of software development

Use the past to guide the future

MSR data

Historical data

Version control systems: CVS, SVN, Git, Mercurial

Bug trackers: Bugzilla, JIRA, YouTrack

Communication: e-mails, chat logs, wiki pages

Execution data

Execution traces

Deployment logs

Crash dumps

Source code data

Source code itself

MSR methods

Classification

a form of supervised learning

Clustering

a form of unsupervised learning

Statistical hypothesis testing
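A minimal sketch of what hypothesis testing on repository data can look like: a two-sample permutation test on the difference of means. The function name `permutation_test` and the numbers (fix-inducing changes per 100 commits, Fridays versus other days) are made up for illustration and do not come from any of the studies cited in this talk.

```python
import random


def permutation_test(sample_a, sample_b, rounds=10_000, seed=0):
    """Two-sided permutation test on the difference of means.

    Returns the share of random relabelings whose mean difference is at
    least as extreme as the observed one (an estimated p-value).
    """
    rng = random.Random(seed)
    n_a = len(sample_a)
    observed = abs(sum(sample_a) / n_a - sum(sample_b) / len(sample_b))
    pooled = list(sample_a) + list(sample_b)
    extreme = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        left, right = pooled[:n_a], pooled[n_a:]
        diff = abs(sum(left) / len(left) - sum(right) / len(right))
        if diff >= observed:
            extreme += 1
    return extreme / rounds


# Hypothetical measurements, one value per observed week.
friday = [9, 11, 10, 12, 13, 11]
other_days = [6, 7, 5, 8, 7, 6]

p_value = permutation_test(friday, other_days)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value only says the difference is unlikely under random relabeling; it does not explain why the groups differ.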

MSR insights

Quality assurance

Architecture analysis

Bug prediction

Developer feedback

You-name-it!

Can we predict bugs?

Don’t code on Fridays [1]

Eclipse / Mozilla repositories and bug trackers

Link bug fixes to source code changes

Find interesting correlations
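The steps above can be sketched roughly as follows. This is a simplified illustration, not the procedure from [1]: the regular expression `BUG_REF`, the helpers `link_fixes` and `weekday_rates`, the commit data, and the `inducing` set are all hypothetical. In a real study the fix-inducing changes would be traced back from the fixes through version-control history rather than written by hand.

```python
import re
from collections import Counter
from datetime import date

# Matches references such as "bug 4711", "Bug #4711" or "Fixes #4711".
BUG_REF = re.compile(r"(?:bug|fix(?:es|ed)?)\s*#?\s*(\d+)", re.IGNORECASE)
WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def link_fixes(commits, fixed_bug_ids):
    """Map commit id -> set of fixed bug ids mentioned in its message."""
    links = {}
    for commit_id, _day, message in commits:
        ids = {int(m) for m in BUG_REF.findall(message)} & fixed_bug_ids
        if ids:
            links[commit_id] = ids
    return links


def weekday_rates(commits, inducing_ids):
    """Share of changes per weekday that later needed a fix."""
    total, bad = Counter(), Counter()
    for commit_id, day, _message in commits:
        name = WEEKDAYS[day.weekday()]
        total[name] += 1
        if commit_id in inducing_ids:
            bad[name] += 1
    return {name: bad[name] / total[name] for name in total}


# Hypothetical commit log: (id, commit date, message).
commits = [
    ("c1", date(2015, 6, 5), "Add parser cache"),
    ("c2", date(2015, 6, 8), "Fix bug 4711: NPE in parser cache"),
    ("c3", date(2015, 6, 9), "Update docs"),
    ("c4", date(2015, 6, 12), "Rework login flow"),
    ("c5", date(2015, 6, 16), "Fixes #4720, login loop"),
]
fixed_bug_ids = {4711, 4720}  # ids marked as fixed in the bug tracker

links = link_fixes(commits, fixed_bug_ids)
inducing = {"c1", "c4"}  # assumed result of tracing the fixes back
rates = weekday_rates(commits, inducing)
print(links)
print(rates)
```

Filtering matches against the tracker's list of fixed bugs keeps stray numbers in commit messages from being counted as links.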

[1] Jacek Sliwerski, Thomas Zimmermann, and Andreas Zeller. When do changes induce fixes? (MSR’05)

Reopened bugs stay [2]

Eclipse / Apache / OpenOffice

Build decision trees by different criteria

Analyze the results
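To make "build decision trees" a bit more concrete, here is a minimal sketch of the core step: picking the single split that best separates re-opened bugs from the rest by Gini impurity. A full tree applies this step recursively. The feature names (`comments`, `fix_days`) and the data are hypothetical and are not taken from [2].

```python
def gini(labels):
    """Gini impurity of a list of boolean labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows, labels):
    """Return (feature, threshold, weighted_gini) of the best single split.

    rows:   list of dicts mapping feature name -> number
    labels: list of bools (True = the bug was re-opened)
    """
    best = None
    for feature in rows[0]:
        for threshold in sorted({row[feature] for row in rows}):
            left = [lab for row, lab in zip(rows, labels)
                    if row[feature] <= threshold]
            right = [lab for row, lab in zip(rows, labels)
                     if row[feature] > threshold]
            if not left or not right:
                continue  # a split must put rows on both sides
            score = (len(left) * gini(left)
                     + len(right) * gini(right)) / len(labels)
            if best is None or score < best[2]:
                best = (feature, threshold, score)
    return best


# Hypothetical bug reports.
rows = [
    {"comments": 2, "fix_days": 1},
    {"comments": 3, "fix_days": 30},
    {"comments": 12, "fix_days": 2},
    {"comments": 15, "fix_days": 25},
    {"comments": 4, "fix_days": 3},
    {"comments": 11, "fix_days": 40},
]
reopened = [False, False, True, True, False, True]

split = best_split(rows, reopened)
print(split)
```

In this toy data the number of comments separates the two groups perfectly, so the chosen split has zero impurity; real data is rarely that clean.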

[2] Emad Shihab et al. Studying re-opened bugs in open source software (ESE’12)

Code reviews: yay or nay?

More reviews == fewer bugs [3]

Qt / ITK / VTK

Collect review metrics

Build regression models for bug prediction
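As a rough illustration of the regression step, the sketch below fits a one-predictor least-squares line relating review coverage to post-release defects. It is a deliberately reduced stand-in: a real model would include more review metrics and control variables, and the function `fit_line` and all numbers here are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical components: share of changes that were reviewed, and
# post-release defects later found in each component.
review_coverage = [0.2, 0.4, 0.6, 0.8, 1.0]
post_release_defects = [9, 8, 5, 4, 2]

slope, intercept = fit_line(review_coverage, post_release_defects)
print(f"defects = {intercept:.2f} + ({slope:.2f}) * coverage")
```

A negative slope in such a fit would be consistent with the slide's headline, but it shows association in the sample, not causation.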

[3] Shane McIntosh et al. The impact of code review coverage and code review participation on software quality: a case study of the Qt, VTK, and ITK projects (MSR’14)

Code clones: what is that smell?

Clones are better than other code [4]

Apache / Evolution / GIMP / Nautilus

Detect clones and link them to bugs

Analyze clone-to-bug ratio
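One simple way to read "clone-to-bug ratio" is to compare the share of buggy lines inside clones with the share outside them. The sketch below does that under this assumption; the exact metrics in [4] may differ, and the `FileStats` record, the function `defect_densities`, and the counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    lines: int               # total lines in the file
    cloned_lines: int        # lines inside detected clones
    buggy_lines: int         # lines touched by later bug fixes
    buggy_cloned_lines: int  # buggy lines that are also inside clones


def defect_densities(files):
    """Return (cloned, non_cloned) share of buggy lines across all files.

    Assumes the input contains at least one cloned and one non-cloned line.
    """
    cloned = sum(f.cloned_lines for f in files)
    other = sum(f.lines - f.cloned_lines for f in files)
    buggy_cloned = sum(f.buggy_cloned_lines for f in files)
    buggy_other = sum(f.buggy_lines - f.buggy_cloned_lines for f in files)
    return buggy_cloned / cloned, buggy_other / other


# Hypothetical per-file counts from a clone detector and a bug linker.
files = [
    FileStats(lines=1000, cloned_lines=200, buggy_lines=50, buggy_cloned_lines=5),
    FileStats(lines=500, cloned_lines=100, buggy_lines=30, buggy_cloned_lines=4),
]

cloned_density, other_density = defect_densities(files)
print(f"cloned: {cloned_density:.3f}, non-cloned: {other_density:.3f}")
```

With these made-up counts the cloned code has the lower buggy-line share, which is the direction the slide's headline describes.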

[4] Foyzur Rahman et al. Clones: what is that smell? (ESE’12)

What next?

More data to explore

OSS source code doubles every year

Active use of *aaS platforms

MSR has access to vast amounts of development data

More insights coming next week!
