Survey on Software Defect Prediction
- PhD Qualifying Examination -
July 3, 2014
Jaechang Nam
Department of Computer Science and Engineering, HKUST
Outline

• Background
• Software Defect Prediction Approaches
  – Simple metric and defect estimation models
  – Complexity metrics and fitting models
  – Prediction models
  – Just-In-Time prediction models
  – Practical prediction models and applications
  – History metrics from software repositories
  – Cross-project defect prediction and feasibility
• Summary and Challenging Issues
Motivation

• General question of software defect prediction
  – Can we identify defect-prone entities (source code file, binary, module, change, ...) in advance?
    • # of defects
    • buggy or clean
• Why?
  – Quality assurance for large software (Akiyama@IFIP`71)
  – Effective resource allocation
    • Testing (Menzies@TSE`07)
    • Code review (Rahman@FSE`11)
Ground Assumption
• The more complex, the more defect-prone
Two Focuses on Defect Prediction

• How complex are the software and its development process?
  – Metrics
• How can we predict whether software has defects?
  – Models based on the metrics
Prediction Performance Goal

• Recall vs. Precision
• Strong predictor criteria
  – 70% recall and 25% false positive rate (Menzies@TSE`07)
  – Precision, recall, accuracy ≥ 75% (Zimmermann@FSE`09)
Outline

• Background
• Software Defect Prediction Approaches
  – Simple metric and defect estimation models
  – Complexity metrics and fitting models
  – Prediction models
  – Just-In-Time prediction models
  – Practical prediction models and applications
  – History metrics from software repositories
  – Cross-project defect prediction and feasibility
• Summary and Challenging Issues
Defect Prediction Approaches

[Timeline figure (1970s–2010s) organizing approaches into Metrics, Models, and Others. Shown at this stage: the LOC metric and the Simple Model.]
Identifying Defect-prone Entities

• Akiyama's equation (Akiyama@IFIP`71)
  – # of defects = 4.86 + 0.018 * LOC (LOC = Lines Of Code)
    • ≈ 23 defects in 1 KLOC
    • Derived from actual systems
• Limitation
  – LOC alone is not enough to capture software complexity
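As a quick check of the arithmetic, a minimal sketch of Akiyama's estimate (function name is illustrative):

```python
def akiyama_defects(loc: int) -> float:
    """Akiyama's linear estimate: # of defects = 4.86 + 0.018 * LOC."""
    return 4.86 + 0.018 * loc

print(akiyama_defects(1000))  # 22.86, i.e., ~23 defects per KLOC
```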
Defect Prediction Approaches

[Timeline figure (1970s–2010s), Metrics / Models / Others. Added at this stage: the Cyclomatic and Halstead metrics and the Fitting Model.]
Complexity Metrics and Fitting Models

• Cyclomatic complexity metric (McCabe`76)
  – "Logical complexity" of a program represented in its control flow graph
  – V(G) = #edges – #nodes + 2
• Halstead complexity metrics (Halstead`77)
  – Metrics based on # of operators and operands
  – Volume = N * log2(n), where N = total and n = distinct # of operators and operands
  – # of defects = Volume / 3000
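A minimal sketch of both metric families, assuming the counts are supplied by the caller (the divisor 3000 is the constant quoted above):

```python
import math

def cyclomatic_complexity(num_edges: int, num_nodes: int) -> int:
    """McCabe`76: V(G) = #edges - #nodes + 2 for one connected control flow graph."""
    return num_edges - num_nodes + 2

def halstead_volume(n_total: int, n_distinct: int) -> float:
    """Halstead`77: Volume = N * log2(n), N = total and n = distinct operators/operands."""
    return n_total * math.log2(n_distinct)

def halstead_defects(volume: float) -> float:
    """Halstead's fitted estimate: # of defects = Volume / 3000."""
    return volume / 3000
```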
Complexity Metrics and Fitting Models

• Limitation
  – Do not capture the complexity (amount) of change.
  – Most studies in the 1970s and early 1980s built fitting models rather than prediction models
    • Correlation analysis between metrics and # of defects, by linear regression models
    • Models were not validated on new entities (modules).
Defect Prediction Approaches

[Timeline figure (1970s–2010s), Metrics / Models / Others. Added at this stage: Process Metrics and Prediction Models (Regression and Classification).]
Regression Model

• Shen et al.'s empirical study (Shen@TSE`85)
  – Linear regression model
  – Validated on actual new modules
  – Metrics
    • Halstead metrics, # of conditional statements
    • Process metrics
      – Delta of complexity metrics between two successive system versions
  – Measure: error between actual and predicted # of defects on new modules
    • MRE (Mean magnitude of relative error)
      – Average of |D – D'| / D over all modules
        • D: actual # of defects
        • D': predicted # of defects
      – MRE = 0.48
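A minimal sketch of that error measure, assuming "magnitude" means the absolute relative error:

```python
def mean_relative_error(actual: list[float], predicted: list[float]) -> float:
    """Average of |D - D'| / D over all modules (D > 0 assumed)."""
    errors = [abs(d - dp) / d for d, dp in zip(actual, predicted)]
    return sum(errors) / len(errors)

print(mean_relative_error([10, 4, 8], [6, 5, 7]))  # ~0.26
```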
Classification Model

• Discriminant analysis by Munson et al. (Munson@TSE`92)
• Logistic regression
• High-risk vs. low-risk modules
• Metrics
  – Halstead and Cyclomatic complexity metrics
• Measure
  – Type I error: false positive rate
  – Type II error: false negative rate
• Result
  – Accuracy: 92% (6 misclassifications out of 78 modules)
  – Precision: 85%
  – Recall: 73%
  – F-measure: 88%
Defect Prediction Process (Based on Machine Learning)

[Process figure: software archives → generate instances with metrics (features) and labels (e.g., buggy/clean or # of defects) → preprocessing → training instances → build a classification/regression model → predict labels of new instances.]
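A minimal sketch of this pipeline, assuming scikit-learn and toy module-level metrics (all feature values and names are illustrative):

```python
from sklearn.ensemble import RandomForestClassifier

# Instances mined from software archives: one row of metrics (features)
# per module -- e.g., [LOC, # of changes, # of past fixes] -- with
# buggy(1)/clean(0) labels.
X_train = [[120, 4, 2], [300, 11, 7], [45, 1, 0],
           [800, 25, 15], [60, 2, 1], [500, 17, 9]]
y_train = [0, 1, 0, 1, 0, 1]

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)  # build a model

X_new = [[70, 3, 1], [650, 20, 12]]  # new instances, labels unknown
print(model.predict(X_new))          # predicted labels, e.g., [0 1]
```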
Defect Prediction (Based on Machine Learning)

• Limitations
  – Limited resources for process metrics
    • Error fixes in the unit testing phase were conducted informally by individual developers, so no error information was available in this phase. (Shen@TSE`85)
  – Existing metrics were not enough to capture the complexity of object-oriented (OO) programs.
  – Helpful for quality assurance teams but not for individual developers
Defect Prediction Approaches

[Timeline figure (1970s–2010s), Metrics / Models / Others. Added at this stage: CK Metrics, History Metrics, the Just-In-Time Prediction Model, and Practical Models and Applications.]
Defect Prediction Approaches

[Same timeline figure, now highlighting the Just-In-Time Prediction Model.]
Risk Prediction of Software Changes (Mockus@BLTJ`00)

• Logistic regression
• Change metrics
  – LOC added/deleted/modified
  – Diffusion of change
  – Developer experience
• Result
  – Both false positive and false negative rates: 20% in the best case
Risk Prediction of Software Changes (Mockus@BLTJ`00)

• Advantage
  – Showed a feasible model in practice
• Limitation
  – Predictions were made only 3 times per week
    • Not fully Just-In-Time
  – Validated on a single commercial system (5ESS switching system software)
BugCache (Kim@ICSE`07)

• Maintain defect-prone entities in a cache
• Approach
  – [Figure: entities are fetched into the cache as defects occur and evicted as the cache fills]
• Result
  – Top 10% of files account for 73–95% of defects on 7 systems
BugCache (Kim@ICSE`07)

• Advantages
  – The cache can be updated quickly and at low cost (cf. static models based on machine learning)
  – Just-In-Time: always available whenever QA teams want the list of defect-prone entities
• Limitations
  – The cache is not reusable for other software projects.
  – Designed for QA teams
    • Applicable only at certain points in time, after a batch of changes (e.g., end of a sprint)
    • Still limited for individual developers during the development phase
Change Classification (Kim@TSE`08)

• Classification model based on SVM
• About 11,500 features
  – Change metadata such as changed LOC and change count
  – Complexity metrics
  – Text features from change log messages, source code, and file names
• Results
  – 78% accuracy and 60% recall on average across 12 open-source projects
Change Classification (Kim@TSE`08)

• Limitations
  – Heavy model (11,500 features)
  – Not validated on commercial software products
Follow-up Studies

• Studies addressing these limitations
  – "Reducing Features to Improve Code Change-Based Bug Prediction" (Shivaji@TSE`13)
    • With less than 10% of all features, buggy F-measure improves by 21%.
  – "Software Change Classification using Hunk Metrics" (Ferzund@ICSM`09)
    • 27 hunk-level metrics for change classification
    • 81% accuracy, 77% buggy hunk precision, and 67% buggy hunk recall
  – "A large-scale empirical study of just-in-time quality assurance" (Kamei@TSE`13)
    • 14 process metrics (mostly from Mockus`00)
    • 68% accuracy, 64% recall on 11 open-source and commercial projects
  – "An Empirical Study of Just-In-Time Defect Prediction Using Cross-Project Models" (Fukushima@MSR`14)
    • Median AUC: 0.72
Challenges of the JIT Model

• Practical validation is difficult
  – Only 10-fold cross validation in the current literature
  – No validation in real scenarios
    • e.g., online machine learning
• Still difficult to review a huge change
  – Fine-grained prediction within a change is needed
    • e.g., line-level prediction
Next Steps of Defect Prediction

[Timeline figure (1980s–2020s), Metrics / Models / Others. Proposed next steps for JIT models: Online Learning JIT Model and Fine-grained Prediction.]
Defect Prediction Approaches

[Same timeline figure, now highlighting Practical Models and Applications.]
Defect Prediction in Industry

• "Predicting the location and number of faults in large software systems" (Ostrand@TSE`05)
  – Two industrial systems
  – Recall: 86%
  – The 20% most fault-prone modules account for 62% of faults
Case Study for a Practical Model

• "Does Bug Prediction Support Human Developers? Findings From a Google Case Study" (Lewis@ICSE`13)
  – No identifiable change in developer behavior after deploying a defect prediction model
• Required characteristics, but very challenging
  – Actionable messages / obvious reasoning
Next Steps of Defect Prediction

[Timeline figure (1980s–2020s). Proposed next step for practical models: Actionable Defect Prediction.]
Evaluation Measure for a Practical Model

• Measure prediction performance based on code review effort
• AUCEC (Area Under the Cost Effectiveness Curve)

[Figure: cost-effectiveness curves of two models M1 and M2; x-axis: percent of LOC inspected (0–100%), y-axis: percent of bugs found (0–100%).]

(Rahman@FSE`11, "BugCache for inspections: Hit or miss?")
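A minimal sketch of an AUCEC-style computation, assuming modules are already ranked by predicted risk and inspection cost is measured in LOC (a normalized trapezoidal area; names are illustrative):

```python
def aucec(loc: list[int], bugs: list[int]) -> float:
    """Area under the cost-effectiveness curve for modules sorted by
    predicted risk (most defect-prone first). x: cumulative fraction of
    LOC inspected, y: cumulative fraction of bugs found."""
    total_loc, total_bugs = sum(loc), sum(bugs)
    x, y, area = 0.0, 0.0, 0.0
    for l, b in zip(loc, bugs):
        nx, ny = x + l / total_loc, y + b / total_bugs
        area += (nx - x) * (y + ny) / 2  # trapezoid slice
        x, y = nx, ny
    return area

# A ranking where small files hold most bugs is cost-effective:
print(aucec(loc=[100, 200, 700], bugs=[30, 15, 5]))  # ~0.85
```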
Practical Applications

• What else can we do with defect prediction models?
  – Test case selection in regression testing (Engstrom@ICST`10)
  – Prioritizing warnings from FindBugs (Rahman@ICSE`14)
Defect Prediction Approaches

[Same timeline figure, now highlighting CK Metrics.]
Representative OO Metrics

• CK metrics (Chidamber&Kemerer@TSE`94)

| Metric | Description |
|--------|-------------|
| WMC | Weighted Methods per Class (# of methods) |
| DIT | Depth of Inheritance Tree (# of ancestor classes) |
| NOC | Number of Children |
| CBO | Coupling Between Objects (# of coupled classes) |
| RFC | Response For a Class (WMC + # of methods called by the class) |
| LCOM | Lack of Cohesion in Methods (# of "connected components") |

• Prediction performance of CK vs. code metrics (Basili@TSE`96)
  – F-measure: 70% vs. 60%
Defect Prediction Approaches

[Same timeline figure, now highlighting History Metrics.]
Representative History Metrics

| Name | # of metrics | Metric source | Citation |
|------|--------------|---------------|----------|
| Relative code change churn | 8 | SW Repo.* | Nagappan@ICSE`05 |
| Change | 17 | SW Repo. | Moser@ICSE`08 |
| Change Entropy | 1 | SW Repo. | Hassan@ICSE`09 |
| Code metric churn / Code Entropy | 2 | SW Repo. | D'Ambros@MSR`10 |
| Popularity | 5 | Email archive | Bacchelli@FASE`10 |
| Ownership | 4 | SW Repo. | Bird@FSE`11 |
| Micro Interaction Metrics (MIM) | 56 | Mylyn | Lee@FSE`11 |

* SW Repo. = version control system + issue tracking system
Representative History Metrics

• Advantage
  – Better prediction performance than code metrics

[Bar chart: performance improvement (roughly 0–60%) of history metrics over code complexity metrics, per study: Moser`08 (F-measure), Hassan`09 (F-measure), D'Ambros`10 (absolute prediction error), Bacchelli`10 (Spearman correlation), Bird`11 (Spearman correlation), Lee`11 (Spearman correlation*).]

(*Bird`11's results are from two metrics vs. code metrics; no comparison data in Nagappan`05.)
History Metrics

• Limitations
  – History metrics do not capture particular program characteristics such as developer social networks, component networks, and anti-patterns.
  – Noisy data
    • Bias in bug-fix datasets (Bird@FSE`09)
  – Not applicable to new projects or projects lacking historical data
Defect Prediction Approaches

[Timeline figure (1970s–2010s), Metrics / Models / Others. Added at this stage: Other Metrics, Noise Reduction, Semi-supervised/active learning, Cross-Project Prediction, the Universal Model, and Cross-Project Feasibility.]
Defect Prediction Approaches

[Same timeline figure, now highlighting Other Metrics.]
Other Metrics

| Name | # of metrics | Metric source | Citation |
|------|--------------|---------------|----------|
| Component network | 28 | Binaries (Windows Server 2003) | Zimmermann@ICSE`08 |
| Developer-Module network | 9 | SW Repo. + Binaries | Pinzger@FSE`08 |
| Developer social network | 4 | SW Repo. | Meneely@FSE`08 |
| Anti-pattern | 4 | SW Repo. + Design patterns | Taba@ICSM`13 |

* SW Repo. = version control system + issue tracking system
Defect Prediction Approaches

[Same timeline figure, now highlighting Noise Reduction.]
Noise Reduction

• Noise detection and elimination algorithm (Kim@ICSE`11)
  – Closest List Noise Identification (CLNI)
    • Based on Euclidean distance between instances
  – Average F-measure improvement
    • 0.504 → 0.621
• ReLink (Wu@FSE`11)
  – Recovers missing links between bugs and changes
  – 60% → 78% recall for missing links
  – F-measure improvement
    • e.g., 0.698 (traditional) → 0.731 (ReLink)
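A minimal sketch of the CLNI idea as described above (the neighbor count and threshold here are illustrative assumptions, not the paper's exact parameters):

```python
import numpy as np

def clni_noise_indices(X: np.ndarray, y: np.ndarray, k: int = 5,
                       threshold: float = 0.8) -> list[int]:
    """Flag instances whose k Euclidean-nearest neighbors mostly carry
    the opposite label, following the Closest List Noise Identification idea."""
    noisy = []
    for i in range(len(X)):
        dist = np.linalg.norm(X - X[i], axis=1)
        neighbors = np.argsort(dist)[1:k + 1]      # skip the instance itself
        disagree = np.mean(y[neighbors] != y[i])   # fraction with a different label
        if disagree >= threshold:
            noisy.append(i)
    return noisy
```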
Defect Prediction Approaches

[Same timeline figure, now highlighting approaches for projects without historical data: the Universal Model, Semi-supervised/active learning, and Cross-Project Prediction.]
Defect Prediction for New Software Projects

• Universal defect prediction model
• Semi-supervised / active learning
• Cross-project defect prediction
Universal Defect Prediction Model (Zhang@MSR`14)

• Context-aware rank transformation
  – Transforms metric values into ranks from 1 to 10 across all projects.
• Model built from 1,398 projects collected from SourceForge and Google Code
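A minimal sketch of a decile-based 1–10 rank transformation in the spirit of Zhang@MSR`14 (the paper derives bins per context group; this shows only the ranking step, and the data is illustrative):

```python
import numpy as np

def rank_transform(values: np.ndarray) -> np.ndarray:
    """Map raw metric values to ranks 1..10 using decile boundaries."""
    deciles = np.percentile(values, np.arange(10, 100, 10))  # 9 cut points
    return np.digitize(values, deciles) + 1                  # ranks 1..10

loc = np.array([10, 50, 120, 300, 800, 40, 65, 900, 15, 2000])
print(rank_transform(loc))
```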
Defect Prediction Approaches

[Same timeline figure, now highlighting Semi-supervised/active learning.]
Other Approaches for CPDP

• Semi-supervised learning with dimension reduction for defect prediction (Lu@ASE`12)
  – Trains a model from a small set of labeled instances together with many unlabeled instances
  – AUC improvement
    • 0.83 → 0.88 with 2% labeled instances
• Sample-based semi-supervised/active learning for defect prediction (Li@AESEJ`12)
  – Average F-measure
    • 0.628 → 0.685 with 10% sampled instances
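A minimal sketch of the semi-supervised setup, using scikit-learn's SelfTrainingClassifier as a stand-in for the papers' specific algorithms (data is synthetic):

```python
import numpy as np
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

y_train = y.copy()
y_train[20:] = -1  # only 10% labeled; -1 marks unlabeled instances

model = SelfTrainingClassifier(LogisticRegression()).fit(X, y_train)
print(model.score(X, y))  # evaluated against the full ground truth
```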
Defect Prediction Approaches

[Same timeline figure, now highlighting Cross-Project Prediction.]
Cross-Project Defect Prediction (CPDP)

• For a new project or a project lacking historical data

[Figure: a model is trained on Project A's labeled instances and tested on Project B's unlabeled instances.]

• Only 2% of 622 cross-project prediction combinations worked. (Zimmermann@FSE`09)
Transfer Learning (TL)

[Figure: traditional machine learning trains a separate learning system per domain; transfer learning transfers knowledge from a source learning system to a target learning system.]

(Pan et al.@TNN`10, "Domain Adaptation via Transfer Component Analysis")
CPDP

• Adopting transfer learning

| Transfer learning | Metric Compensation | NN Filter | TNB | TCA+ |
|---|---|---|---|---|
| Preprocessing | N/A | Feature selection, Log-filter | Log-filter | Normalization |
| Machine learner | C4.5 | Naive Bayes | TNB | Logistic Regression |
| # of subjects | 2 | 10 | 10 | 8 |
| # of predictions | 2 | 10 | 10 | 26 |
| Avg. F-measure | 0.67 (W: 0.79, C: 0.58) | 0.35 (W: 0.37, C: 0.26) | 0.39 (NN: 0.35, C: 0.33) | 0.46 (W: 0.46, C: 0.36) |
| Citation | Watanabe@PROMISE`08 | Turhan@ESEJ`09 | Ma@IST`12 | Nam@ICSE`13 |

* NN = Nearest neighbor, W = Within, C = Cross
Metric Compensation (Watanabe@PROMISE`08)

• Key idea: rescale each target metric toward the source distribution
  – new target metric value = target metric value * (average source metric value / average target metric value)

[Figure: Source, Target, and New Target distributions; caption: "Let me transform like source!"]
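A minimal sketch of that rescaling, applied column-wise over a metrics matrix (names are illustrative):

```python
import numpy as np

def compensate(target: np.ndarray, source: np.ndarray) -> np.ndarray:
    """Rescale each target metric so its mean matches the source mean:
    new_target = target * mean(source) / mean(target), per metric (column)."""
    return target * source.mean(axis=0) / target.mean(axis=0)
```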
Metric Compensation (cont.) (Watanabe@PROMISE`08)

[Comparison table as above; the Metric Compensation column: no preprocessing, C4.5 learner, 2 subjects, 2 predictions, avg. F-measure 0.67 (W: 0.79, C: 0.58).]
NN Filter (Turhan@ESEJ`09)

• Key idea: train only on source instances that resemble the target
• Nearest neighbor filter
  – Select the 10 nearest source instances of each target instance

[Figure: Source, New Source, and Target; caption: "Hey, you look like me! Could you be my model?"]
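A minimal sketch of the filter (Euclidean distance, 10 neighbors per target instance, duplicates removed; assumes NumPy):

```python
import numpy as np

def nn_filter(source: np.ndarray, target: np.ndarray, k: int = 10) -> np.ndarray:
    """Keep the union of each target instance's k nearest source instances."""
    selected = set()
    for t in target:
        dist = np.linalg.norm(source - t, axis=1)
        selected.update(np.argsort(dist)[:k].tolist())
    return source[sorted(selected)]
```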
NN Filter (cont.) (Turhan@ESEJ`09)

[Comparison table as above; the NN Filter column: feature selection + log-filter, Naive Bayes learner, 10 subjects, 10 predictions, avg. F-measure 0.35 (W: 0.37, C: 0.26).]
Transfer Naive Bayes (Ma@IST`12)

• Key idea: give more weight to source instances that are similar to the target when building a Naive Bayes model

[Figure: Source, Target, Build a model; captions: "Hey, you look like me! You will get more chances to be my best model!" / "Please consider me more important than the other instances."]
Transfer Naive Bayes (cont.) (Ma@IST`12)
• Transfer Naive Bayes
– New prior probability
– New conditional probability
60
Transfer Naive Bayes (cont.) (Ma@IST`12)

• How to find source instances similar to the target
  – A similarity score s_i: the number of features of source instance i whose values fall within the target's [min, max] range
  – A weight value derived from s_i

| | F1 | F2 | F3 | F4 | Score (s_i) |
|---|---|---|---|---|---|
| Max of target | 7 | 3 | 2 | 5 | - |
| src. inst. 1 | 5 | 4 | 2 | 2 | 3 |
| src. inst. 2 | 0 | 2 | 5 | 9 | 1 |
| Min of target | 1 | 2 | 0 | 1 | - |

(k = # of features, s_i = score of instance i)
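A minimal sketch of the similarity score, with the weight formula from the TNB paper as I recall it (w_i = s_i / (k − s_i + 1)²; treat that exact form as an assumption):

```python
import numpy as np

def tnb_weights(source: np.ndarray, target: np.ndarray) -> np.ndarray:
    """s_i = # of features of source instance i within the target's
    [min, max] range; weight w_i = s_i / (k - s_i + 1)**2."""
    lo, hi = target.min(axis=0), target.max(axis=0)
    k = source.shape[1]
    s = ((source >= lo) & (source <= hi)).sum(axis=1)
    return s / (k - s + 1) ** 2

src = np.array([[5, 4, 2, 2], [0, 2, 5, 9]])
tgt = np.array([[7, 3, 2, 5], [1, 2, 0, 1]])  # yields the max/min rows above
print(tnb_weights(src, tgt))  # s = [3, 1] -> weights [0.75, 0.0625]
```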
Transfer Naive Bayes (cont.) (Ma@IST`12)

[Comparison table as above; the TNB column: log-filter preprocessing, TNB learner, 10 subjects, 10 predictions, avg. F-measure 0.39 (NN: 0.35, C: 0.33).]
TCA+ (Nam@ICSE`13)

• Key idea
  – TCA (Transfer Component Analysis)

[Figure: Source and Target are mapped into a shared latent space as New Source and New Target; caption: "Oops, we are different! Let's meet in another world!"]
Transfer Component Analysis (cont.)
• Feature extraction approach – Dimensionality reduction – Projection • Map original data
in a lower-dimensional feature space
64
TCA (cont.)

[Scatter plot: source domain data and target domain data before TCA, distributed differently.]

(Pan et al.@TNN`10, "Domain Adaptation via Transfer Component Analysis")
TCA (cont.)

[Scatter plot: after TCA, source and target domain data are aligned in the shared space.]

(Pan et al.@TNN`10, "Domain Adaptation via Transfer Component Analysis")
TCA+ (Nam@ICSE`13)

[Figure, TCA vs. TCA+: with plain TCA the mapped New Source and New Target still differ ("But we are still a bit different!"); TCA+ normalizes both datasets together before applying TCA ("Normalize us together!").]
Normalization Options

• NoN: no normalization applied
• N1: min-max normalization (max = 1, min = 0)
• N2: z-score normalization (mean = 0, std = 1)
• N3: z-score normalization using only the source mean and standard deviation
• N4: z-score normalization using only the target mean and standard deviation
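A minimal sketch of the five options applied to a source/target pair (NumPy; note that N3 and N4 standardize both datasets with one side's statistics):

```python
import numpy as np

def normalize(source: np.ndarray, target: np.ndarray, option: str):
    """Apply one of the TCA+ normalization options to both datasets."""
    if option == "NoN":
        return source, target
    if option == "N1":  # min-max per dataset
        mm = lambda d: (d - d.min(0)) / (d.max(0) - d.min(0))
        return mm(source), mm(target)
    if option == "N2":  # z-score per dataset
        z = lambda d: (d - d.mean(0)) / d.std(0)
        return z(source), z(target)
    if option == "N3":    # z-score with source statistics
        mu, sd = source.mean(0), source.std(0)
    elif option == "N4":  # z-score with target statistics
        mu, sd = target.mean(0), target.std(0)
    return (source - mu) / sd, (target - mu) / sd
```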
Preliminary Results using TCA

[Bar chart: F-measure (0–0.8) of cross-project prediction for Project A → Project B and Project B → Project A, under Baseline, NoN, N1, N2, N3, and N4.]

• Prediction performance of TCA varies according to the normalization option!

(*Baseline: cross-project defect prediction without TCA and normalization)
TCA+: Decision Rules

• Find a suitable normalization option for TCA
• Steps
  – #1: Characterize a dataset
  – #2: Measure the similarity between source and target datasets
  – #3: Apply decision rules
TCA+: #1. Characterize a Dataset

[Figure: for each dataset (A and B), compute the pairwise Euclidean distances d_ij between all instances.]

DIST = { d_ij : 1 ≤ i < j ≤ n }
TCA+: #2. Measure Similarity between Source and Target

• Minimum (min) and maximum (max) values of DIST
• Mean and standard deviation (std) of DIST
• The number of instances
TCA+: #3. Decision Rules

• Rule #1: mean and std are the same → NoN
• Rule #2: max and min are different → N1 (max = 1, min = 0)
• Rules #3, #4: std and # of instances are different → N3 or N4 (src/tgt mean = 0, std = 1)
• Rule #5: default → N2 (mean = 0, std = 1)
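A minimal sketch of the rule cascade over the DIST statistics. The "same/different" tests are simplified to a tolerance comparison and the N3/N4 tie-break is assumed, so treat the conditions as illustrative rather than the paper's exact rules:

```python
import numpy as np
from scipy.spatial.distance import pdist

def dist_stats(X: np.ndarray) -> dict:
    d = pdist(X)  # pairwise Euclidean distances, i.e., DIST
    return {"min": d.min(), "max": d.max(), "mean": d.mean(),
            "std": d.std(), "n": len(X)}

def choose_normalization(src: dict, tgt: dict, tol: float = 0.05) -> str:
    same = lambda a, b: abs(a - b) <= tol * max(abs(a), abs(b), 1e-9)
    if same(src["mean"], tgt["mean"]) and same(src["std"], tgt["std"]):
        return "NoN"                                   # Rule #1
    if not same(src["max"], tgt["max"]) and not same(src["min"], tgt["min"]):
        return "N1"                                    # Rule #2
    if not same(src["std"], tgt["std"]) and src["n"] != tgt["n"]:
        return "N3" if src["n"] > tgt["n"] else "N4"   # Rules #3, #4 (tie-break assumed)
    return "N2"                                        # Rule #5 (default)
```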
TCA+ (cont.) (Nam@ICSE`13)

[Comparison table as above; the TCA+ column: normalization preprocessing, Logistic Regression learner, 8 subjects, 26 predictions, avg. F-measure 0.46 (W: 0.46, C: 0.36).]
Current CPDP using TL

• Advantages
  – Prediction performance comparable to within-project models
  – Benefits from state-of-the-art TL approaches
• Limitation
  – Performance of some cross-prediction pairs is still poor. (negative transfer)
Defect Prediction Approaches

[Same timeline figure, now highlighting Cross-Project Feasibility.]
Feasibility Evaluation for CPDP

• Solution for negative transfer
  – Decision tree using project characteristic metrics (Zimmermann@FSE`09)
    • e.g., programming language, # of developers, etc.
Follow-up Studies

• "An investigation on the feasibility of cross-project defect prediction" (He@ASEJ`12)
  – Decision tree using distributional characteristics of a dataset
    • e.g., mean, skewness, peakedness, etc.
Feasibility for CPDP

• Challenges in current studies
  – Decision trees were not evaluated properly.
    • Just fitting models
  – Low target prediction coverage
    • Only 5 out of 34 target projects were feasible for cross-predictions (He@ASEJ`12)
Next Steps of Defect Prediction

[Timeline figure (1980s–2020s). Proposed next step for feasibility studies: a Cross-Prediction Feasibility Model.]
Defect Prediction Approaches

[Same timeline figure, with the Personalized Model added under Others.]
Cross-prediction Model

• Common challenge
  – Current cross-prediction models are limited to datasets with the same number of metrics.
  – Not applicable to projects with different feature spaces (different domains)
    • NASA dataset: Halstead, LOC
    • Apache dataset: LOC, Cyclomatic, CK metrics
Next Steps of Defect Prediction

[Timeline figure (1980s–2020s). Proposed next step for cross-prediction: Cross-Domain Prediction.]
Other Topics
Defect Prediction Approaches

[Same timeline figure, with Data Privacy added under Others.]
Other Topics

• Privacy issues with defect datasets
  – MORPH (Peters@ICSE`12)
    • Mutates defect datasets while keeping prediction accuracy
    • Can accelerate cross-project defect prediction with industrial datasets
• Personalized defect prediction model (Jiang@ASE`13)
  – "Different developers have different coding styles, commit frequencies, and experience levels, all of which cause different defect patterns."
  – Results
    • Average F-measure: 0.62 (personalized models) vs. 0.59 (non-personalized models)
Outline

• Background
• Software Defect Prediction Approaches
  – Simple metric and defect estimation models
  – Complexity metrics and fitting models
  – Prediction models
  – Just-In-Time prediction models
  – Practical prediction models and applications
  – History metrics from software repositories
  – Cross-project defect prediction and feasibility
• Summary and Challenging Issues
Defect Prediction Approaches

[Summary timeline figure (1970s–2010s). Metrics: LOC, Cyclomatic, Halstead, Process, CK, History, Other. Models: Simple, Fitting, Prediction (Regression, Classification), Just-In-Time, Practical Models and Applications, Cross-Project Prediction, Universal. Others: Cross-Project Feasibility, Noise Reduction, Semi-supervised/active, Personalized Model, Data Privacy.]
Next Steps of Defect Prediction

[Summary timeline figure (1980s–2020s) with all proposed next steps: Online Learning JIT Model, Fine-grained Prediction, Actionable Defect Prediction, Cross-Prediction Feasibility Model, and Cross-Domain Prediction.]
Thank you!