![Page 1: Mining Software Data - uni-saarland.de · What is Mining Software Repositories (MSR)? • Gather and exploit data produced by developers (and other sw stakeholders) in the software](https://reader030.vdocument.in/reader030/viewer/2022020318/5c658b3a09d3f2ad6e8cca60/html5/thumbnails/1.jpg)
Mining Software Data
Software Engineering Course — Summer Semester 2017
María Gómez
How software is built is changing…
From:
• Code centric
• In-lab testing
• Centralized development
• Long product cycle
• …
To:
• Data pervasive
• Debugging in the large
• Distributed development
• Continuous release
• …
Slide adapted from: https://de.slideshare.net/taoxiease/software-mining-and-software-datasets
Software Data
• A large number of artefacts are generated during the software development process.
• Large open source projects have increased the amount of data available in software archives.
Software Decision Making
Software developers rely on their prior experience to plan software projects, fix bugs, prioritise testing, etc.
Mining Software Repositories (MSR)
Let’s mine software data!
Why?
What?
How?
What is Mining Software Repositories (MSR)?
“The MSR field analyzes rich data available in software repositories to extract useful and actionable information about software projects and systems.” (Source: msrconf.org)
Software Data → Data Mining → Actionable Information
What is Mining Software Repositories (MSR)?
Main goals:
• Gather and exploit data produced by developers (and other software stakeholders) in the software development process.
• Use data available in repositories to support development activities (e.g., defect assignment, software validation, evolution and planning).
• Discover hidden patterns and trends.
• Transform static record-keeping repositories into active repositories that guide decision processes.
• Apply data extraction and analysis to make decisions and predictions.
1 The Road Ahead for Mining Software Repositories. Ahmed E. Hassan.
2 Effective Mining of Software Repositories. Marco D’Ambros, Romain Robbes.
MSR
• What types of software data are available to mine?
• Which data mining techniques can be used in MSR?
• Which software engineering tasks can be assisted with MSR?
What to mine?
Software repositories refer to artefacts produced and archived during software development processes by developers and other stakeholders.
What to mine?
Different types of repositories¹:
• Historical Repositories
• Runtime Repositories
• Code Repositories
¹ The Road Ahead for Mining Software Repositories. Ahmed E. Hassan.
What to mine?
Historical Repositories
Record information about the evolution and progress of a project.
Examples:
• Version control systems (CVS, SVN, Git, Mercurial)
• Bug repositories (Bugzilla, JIRA)
• Mailing lists (e-mails, wiki pages)
• Development collaboration sites (StackOverflow)
What to mine?
Code Repositories
Contain source code of various applications developed by several developers.
Examples:
• Code bases (SourceForge, GoogleCode)
• Project ecosystems (GitHub)
What to mine?
Runtime Repositories
Contain information about the execution and usage of an application.
Examples:
• Crash reports
• Field logs
• Execution traces
What to mine?
Other Repositories
Contain mobile apps and user feedback (reviews, ratings).
Examples:
• App stores (Google Play Store, Apple App Store)
What to mine?
• Historical Repositories
• Runtime Repositories
• Code Repositories
• Other Repositories
Cross-link of repositories!
Why MSR?
• Better manage software projects
• Produce higher-quality software systems that are delivered on time and within budget
• Support maintenance of software systems
• Improve software design/reuse
• Learn from the past to guide future development
1 MSR Conference: http://2017.msrconf.org/#/home
2 Mining Software Engineering Data. Ahmed E. Hassan & Tao Xie.
Target Audience
• Software practitioners
• Project managers
• Developers
• Designers
• Testers
• Usability engineers
• Engineers
MSR
• What types of software data are available to mine?
• Which software engineering tasks can be assisted with MSR?
• Which data mining techniques can be used in MSR?
Applications of MSR
• Estimate developer effort
• Change impact and propagation
• Risk management (trends)
• Fault analysis and prediction
• Test reduction, minimisation and selection
• Continuous quality assurance
• Post-release maintenance
Applications of MSR
• New bug report
  • Estimate fix effort
  • Mark duplicate
  • Suggest experts and fix
• New change
  • Suggest APIs
  • Warn about risky code or bugs
  • Suggest locations to co-change
MSR Process
Repositories → EXTRACT → ANALYZE → SYNTHESIZE → Actionable Information
Data Extraction
• Extract data from different repositories
• Selection of input data
• Processing (e.g., filtering)
• Constraints to help with scalability
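As a rough illustration of the extraction step, the sketch below parses version-control log text into records and filters them by date. It assumes the log was exported with `git log --pretty=format:'%H|%an|%ad|%s' --date=short`; the sample data, the `Commit` record, and the `extract_commits` helper are hypothetical and not part of the lecture.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Commit:
    sha: str
    author: str
    date: str      # ISO date string, e.g. "2017-05-02"
    subject: str


# Hypothetical sample in the assumed "hash|author|date|subject" layout.
SAMPLE_LOG = """\
a1b2c3|alice|2017-05-02|Fix crash in parser (bug 101)
d4e5f6|bob|2017-05-03|Add export feature
0a9b8c|alice|2016-11-20|Update README
"""


def extract_commits(log_text: str, since: str = "") -> List[Commit]:
    """Parse log lines and keep only commits dated on or after `since`."""
    commits = []
    for line in log_text.splitlines():
        if not line.strip():
            continue
        # Split at most 3 times so a '|' inside the subject is preserved.
        sha, author, date, subject = line.split("|", 3)
        if date >= since:  # ISO dates compare correctly as plain strings
            commits.append(Commit(sha, author, date, subject))
    return commits


recent = extract_commits(SAMPLE_LOG, since="2017-01-01")
print([c.sha for c in recent])
```

Restricting the time window early is one simple constraint that keeps later analysis steps manageable on large histories.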
Data Analysis
• Process the data
• Link data between repositories
• Empirical analysis of the data
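Linking data between repositories often starts from textual references. The sketch below links commits to bug reports by searching commit messages for bug identifiers. The message convention ("bug 101", "Fixes #102"), the sample data, and the `link_commits_to_bugs` helper are assumptions made for illustration; real projects follow their own conventions.

```python
import re
from collections import defaultdict

# Assumed convention: messages mention bugs as "bug 123", "Bug #123" or "fixes #123".
BUG_REF = re.compile(r"(?:bug|fix(?:es|ed)?)\s*#?\s*(\d+)", re.IGNORECASE)

# Hypothetical data from a version archive and a bug tracker.
commits = {
    "a1b2c3": "Fix crash in parser (bug 101)",
    "d4e5f6": "Add export feature",
    "0a9b8c": "Fixes #102: wrong encoding in export",
}
bug_reports = {
    101: "Parser crashes on empty file",
    102: "Export uses wrong encoding",
}


def link_commits_to_bugs(commits, bug_reports):
    """Map each known bug id to the commits whose message references it."""
    links = defaultdict(list)
    for sha, message in commits.items():
        for match in BUG_REF.finditer(message):
            bug_id = int(match.group(1))
            if bug_id in bug_reports:  # ignore numbers that are not real bug ids
                links[bug_id].append(sha)
    return dict(links)


links = link_commits_to_bugs(commits, bug_reports)
print(links)
```

Checking each candidate number against the bug tracker reduces false links, for example when a message mentions an unrelated number.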
Types of Empirical Analysis
Different types of empirical analysis can be performed on repositories:
• Quantitative vs qualitative
• Regression models
• Grounded theory
• Machine learning/data mining
Types of Empirical Analysis
Quantitative vs qualitative
Quantitative
• Data is numerical
• Data can be measured
Qualitative
• Data is non-numerical
• Data can be observed
Types of Empirical Analysis
Quantitative vs qualitative
Example quantitative study:
• Do performance bugs take more time to fix?
• Are performance bugs fixed by more experienced developers?
Example qualitative study:
• What are the advantages/disadvantages of shared code ownership from the developers’ perspective?
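A first look at the quantitative question about fix time could compare the median fix time per bug category, as in the sketch below. The records and the `median_fix_time` helper are hypothetical; an actual study would also apply a statistical test before drawing conclusions.

```python
from statistics import median

# Hypothetical records: (bug category, days from report to fix).
fix_times = [
    ("performance", 12), ("performance", 30), ("performance", 21),
    ("other", 3), ("other", 8), ("other", 15), ("other", 5),
]


def median_fix_time(records, category):
    """Median number of days to fix bugs of the given category."""
    days = [d for cat, d in records if cat == category]
    return median(days)


perf = median_fix_time(fix_times, "performance")
other = median_fix_time(fix_times, "other")
print(f"performance: {perf} days, other: {other} days")
```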
Types of Empirical Analysis
Regression models
• Estimate relationships among variables
• Widely used for prediction and forecasting
Example:
Which factors contribute most to delays in bug-fixing time?
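A minimal sketch of a regression model is ordinary least squares with a single predictor. The data below (number of comments on a bug report versus days to fix) and the `fit_line` helper are invented for illustration; real studies typically use several predictors and a statistics package.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: comments on a bug report vs. days until it was fixed.
comments = [1, 2, 3, 4, 5]
days_to_fix = [3, 5, 7, 9, 11]

intercept, slope = fit_line(comments, days_to_fix)
predicted = intercept + slope * 6  # forecast for a report with 6 comments
print(intercept, slope, predicted)
```

The slope estimates how strongly the predictor relates to fix time, and the fitted line can be used to forecast unseen cases.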
Types of Empirical Analysis
Grounded theory
• Building theory from data
• Discovery of emerging patterns in data
Types of Empirical Analysis
Grounded theory
Figure source: https://www.researchgate.net/figure/222301824_fig1_Fig-1-Basic-process-of-the-Grounded-Theory-approach
Types of Empirical Analysis
Machine learning/data mining techniques
• Association Rules and Frequent Patterns
• Classification
• Clustering
Data mining techniques
Association Rules and Frequent Patterns
• Find frequent patterns in a database
• Itemset: set of items
• Support of an itemset: fraction of transactions that contain it
• Confidence of a rule X ⇒ Y: support(X ∪ Y) / support(X)
Image source: https://image.slidesharecdn.com/3-150328084211-conversion-gate01/95/31-mining-frequent-patterns-with-association-rulesmca4-4-638.jpg?cb=1427532681
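Support and confidence can be computed directly from their definitions. In the sketch below a transaction is the set of files changed together in one commit; the file names and helper functions are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent ∪ consequent) / support(antecedent)."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)


# Hypothetical transactions: files changed together in one commit.
transactions = [
    {"Parser.java", "Lexer.java"},
    {"Parser.java", "Lexer.java", "Ast.java"},
    {"Parser.java"},
    {"Ui.java"},
]

s = support({"Parser.java", "Lexer.java"}, transactions)
c = confidence({"Parser.java"}, {"Lexer.java"}, transactions)
print(s, c)
```

Here the pair appears in 2 of 4 transactions, and 2 of the 3 transactions containing `Parser.java` also contain `Lexer.java`.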
Data mining techniques
Classification
• Supervised learning
1. Construct model with labelled objects (training set).
2. Apply model to unlabelled objects.
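The two steps can be sketched with a nearest-centroid classifier, chosen here only because it is short; it is one of many possible classifiers. The features (lines changed, files touched), the labels, and the data are hypothetical.

```python
import math
from collections import defaultdict


def train_centroids(training_set):
    """Step 1: build a model (one mean vector per label) from labelled objects."""
    sums = {}
    counts = defaultdict(int)
    for features, label in training_set:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def classify(model, features):
    """Step 2: give an unlabelled object the label of the closest centroid."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Hypothetical training set: (lines changed, files touched) -> label.
training_set = [
    ((5, 1), "clean"), ((8, 2), "clean"),
    ((120, 9), "buggy"), ((200, 14), "buggy"),
]
model = train_centroids(training_set)
print(classify(model, (10, 2)), classify(model, (150, 10)))
```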
Data mining techniques
Clustering
• Unsupervised learning (no predefined classes)
• Group similar data
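As an illustration of grouping similar data without predefined classes, the sketch below runs a tiny k-means on one-dimensional values. The crash counts, the initial centres, and the `kmeans_1d` helper are assumptions; k-means is only one of several clustering algorithms.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Tiny k-means on numbers; `centroids` are the initial cluster centres."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            # Assign each value to its nearest centre.
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centre to the mean of its cluster (keep it if the cluster is empty).
        centroids = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical data: crash counts per app version.
crash_counts = [1, 2, 3, 40, 42, 45]
centres, groups = kmeans_1d(crash_counts, centroids=[0, 50])
print(centres, groups)
```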
Analysis Tools
Data mining and analysis tools:
• R. http://www.r-project.org/ Free software for statistical computing and graphics.
• Weka. http://www.cs.waikato.ac.nz/ml/weka/ Open-source tool containing a collection of machine learning and data mining algorithms.
Data Synthesis
• Report / visualisation of outcome
• Understand the needs of practitioners
• Help practitioners to make decisions
• Don’t replace them!
Actionable Outputs
• Developer feedback
• Bug prediction
• Quality assurance
• Architecture analysis
• ………
What can we learn from software data?
MSR Application Examples
Can we predict bugs?
• Link bug fixes to source code changes
• Eclipse/Mozilla repos and bug-trackers
• Correlations found!
When do changes induce fixes? Jacek Sliwerski, Thomas Zimmermann and Andreas Zeller. (MSR ’05)
Can we predict bugs? (2)
Example source: https://de.slideshare.net/taoxiease/software-mining-and-software-datasets
How Long will it Take to Fix this Bug?
• Predicting effort to fix a bug
• Mine bug databases
• Text similarity to identify closely related reports
How Long will it Take to Fix This Bug? C. Weiß, R. Premraj, T. Zimmermann, A. Zeller. (MSR ’07)
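The idea of predicting effort from closely related reports can be sketched as a nearest-neighbour estimate. The version below uses simple word-overlap (Jaccard) similarity as a stand-in and does not reproduce the cited paper's method; the history data and helper names are hypothetical.

```python
def tokens(text):
    """Lower-cased set of words in a report title."""
    return set(text.lower().split())


def jaccard(a, b):
    """Word-overlap similarity in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def predict_effort(new_report, history, k=2):
    """Average the effort of the k most similar past reports."""
    ranked = sorted(history, key=lambda item: jaccard(new_report, item[0]), reverse=True)
    nearest = ranked[:k]
    return sum(hours for _, hours in nearest) / len(nearest)


# Hypothetical history: (report title, hours spent on the fix).
history = [
    ("crash when opening large file", 8.0),
    ("crash when saving large file", 6.0),
    ("typo in settings dialog", 0.5),
]
estimate = predict_effort("crash when opening file", history)
print(estimate)
```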
Can we identify duplicate bug reports?
• Mine bug repositories (e.g., Bugzilla, Jira)
• Use information retrieval to find similar reports and rank them.
Search-Based Duplicate Defect Detection: An Industrial Experience. Amoui, M., Kaushik, N., Al-Dabbagh, A., Tahvildari, L., Li, S., & Liu, W. (MSR’13)
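Finding and ranking similar reports with information retrieval can be sketched with TF-IDF vectors and cosine similarity. This is a generic illustration, not the approach of the cited paper; the reports, the smoothed IDF formula, and the helper names are assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    df = Counter(term for words in tokenized for term in set(words))
    # Smoothed IDF so terms present in every document keep a small weight.
    idf = {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(words).items()} for words in tokenized]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def rank_duplicates(new_report, existing):
    """Return indices of existing reports, most similar first."""
    vectors = tfidf_vectors(existing + [new_report])
    query = vectors[-1]
    scores = [(cosine(query, vec), i) for i, vec in enumerate(vectors[:-1])]
    return [i for _, i in sorted(scores, key=lambda s: s[0], reverse=True)]


# Hypothetical bug report titles.
existing = [
    "app crashes on login screen",
    "dark mode colors are wrong",
    "login button does nothing",
]
ranking = rank_duplicates("crashes on login", existing)
print([existing[i] for i in ranking])
```

A developer or triager would then inspect only the top-ranked candidates instead of the whole repository.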
Change Propagation
How does a change in one source code entity propagate to other entities?
• Predict change propagation
• Mine association rules from change history
Predicting Change Propagation in Software Systems. Ahmed E. Hassan and Richard C. Holt (ICSM ’04)
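Mining association rules from change history can be sketched by counting which files were committed together and keeping rules "a changed ⇒ b changed too" above a confidence threshold. The change sets, the threshold, and the helper names are hypothetical, and the cited paper's heuristics are not reproduced here.

```python
from collections import Counter
from itertools import permutations


def mine_cochange_rules(change_sets, min_confidence=0.5):
    """Return {(a, b): confidence} for rules 'a changed => b changed too'."""
    file_count = Counter()
    pair_count = Counter()
    for files in change_sets:
        unique = set(files)
        file_count.update(unique)
        # Ordered pairs, so a => b and b => a are counted separately.
        pair_count.update(permutations(sorted(unique), 2))
    rules = {}
    for (a, b), together in pair_count.items():
        conf = together / file_count[a]
        if conf >= min_confidence:
            rules[(a, b)] = conf
    return rules


def suggest_cochanges(changed_file, rules):
    """Files to inspect when `changed_file` is modified, best rule first."""
    hits = [(conf, b) for (a, b), conf in rules.items() if a == changed_file]
    return [b for conf, b in sorted(hits, reverse=True)]


# Hypothetical change history: files modified together in each commit.
change_sets = [
    ["Parser.java", "Lexer.java"],
    ["Parser.java", "Lexer.java", "Ast.java"],
    ["Parser.java"],
    ["Lexer.java", "Ast.java"],
]
rules = mine_cochange_rules(change_sets)
print(suggest_cochanges("Ast.java", rules))
```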
Classify Changes as Buggy or Clean
• Can we warn developers that there is a bug in a change?
• Identifying bug-introducing changes from bug-fix data
Automatic Identification of Bug-Introducing Changes. Kim, S., Zimmermann, T., Pan, K., & Whitehead Jr., E. J. (ASE ’06)
Classification of security bug reports
Example source: https://de.slideshare.net/taoxiease/software-mining-and-software-datasets
Mining questions about software energy consumption
• Mine communities (StackOverflow)
• Use thematic analysis (e.g. LDA, Classifier) to find common themes in questions & answers
• Interpret themes
Mining questions about software energy consumption. Pinto, G., Castor, F., & Liu, Y. D. (MSR’ 14)
API change and fault proneness impact success
• Relationship between success of Android apps and Android API instability
• Measure success through user ratings in app store
• Measure fault-proneness through number of bugs fixed in the used APIs
API change and fault proneness: a threat to the success of Android apps. M. Linares-Vásquez et al. (FSE ’13)
Recommending and Localizing Change Requests for Mobile Apps based on User Reviews
• Automatic classification of user reviews from the Google Play store
• Link to the source code entities to be changed
• Recommend changes to software artefacts to developers
Recommending and Localizing Change Requests for Mobile Apps based on User Reviews. F. Palomba et al. (ICSE ’17)
MSR in Practice
Slide extracted from: https://de.slideshare.net/taoxiease/software-mining-and-software-datasets
Tools for Mining Software Repositories
• Available mining tools
• Libresoft Tools. http://tools.libresoft.es/
• CVSAnalY. CVS/SVN/Git repository log parser
• MLStats. Mailman and Mboxes parser
• Bicho. Bugzilla and SF.net tracker parser
MSR Repositories
Data Repositories available online:
• FLOSSmole repository of open source snapshots. flossmole.org/
• GitHub. http://www.ghtorrent.org
• iBUGS. www.st.cs.uni-saarland.de/ibugs/
• MetricsGrimoire toolset. https://metricsgrimoire.github.io
• PROMISE repository. http://openscience.us/repo/
• Software-artifact Infrastructure Repository. http://sir.unl.edu/portal/index.php
• Ultimate Debian Database. https://wiki.debian.org/UltimateDebianDatabase
• Apache SVN commits. https://github.com/monperrus/apache-svn-commits
• Socorro: Mozilla Crash Stats. https://wiki.mozilla.org/Socorro
References
• The International Conference on Mining Software Repositories. 2017.msrconf.org
• Mining Software Engineering Data. Ahmed E. Hassan & Tao Xie.
• The Road Ahead for Mining Software Repositories. Ahmed E. Hassan.
• Software Intelligence: The Future of Mining Software Engineering Data. Ahmed E. Hassan & Tao Xie.
• Effective Mining of Software Repositories. Marco D’Ambros & Romain Robbes.