patterns for cleaning up bug data
DESCRIPTION
Paper at https://github.com/rodrigorgs/dapse13-bugpatterns/blob/master/preprint/icsews13dapse-id2-p-16145-preprint.pdf?raw=true
TRANSCRIPT
Patterns for Cleaning Up Bug Data
Rodrigo Souza1,*
Christina Chavez1
Roberto Bittencourt2
1 Federal University of Bahia, Brazil
2 State University of Feira de Santana, Brazil
DAPSE’13: International Workshop on Data Analysis Patterns in Software Engineering
* speaker
May 21, 2013 San Francisco, USA
Bug reports
provide insight about…
- the quality of the software
- the quality of the process
Bug reports
often contain data that is…
- incomplete
- inaccurate
- biased
Bug reports
may lead you to wrong conclusions
Bug reports
are like vegetables…
You have to clean them up before using them
In This Talk
Two patterns to help you clean up your data
1. Look Out For Mass Updates
2. Old Wine Tastes Better
they’re like recipes for data scientists
Look Out for Mass Updates
Determine which changes to bug reports were the result of a mass update.
1. Context 2. Problem 3. Solution 4. Discussion
Look Out for Mass Updates: Context
Joe’s worklog
Tuesday
Worked on bug #5
Worked on bug #12
Updated bug report #5
Updated bug report #12
Today, Joe worked on two bugs and updated the corresponding bug reports
Joe’s worklog
Tuesday
Worked on bug #5
Worked on bug #12
Updated bug report #5
Updated bug report #12
Data scientists just see the updates:
Joe updated two reports ⇒ Joe worked on two bugs
Joe’s worklog
Wednesday
Updated bug report #3
Updated bug report #18
Updated bug report #9
Updated bug report #15
Updated bug report #21
Updated bug report #52
Updated bug report #40
Updated bug report #41
Updated bug report #68
Updated bug report #73
Updated bug report #78
…
Joe updated 2600 reports ⇒ Joe worked on 2600 bugs?
Mass updates do not represent actual work. Often, they are just cleanup.
Mass updates should be discarded from your analyses
Look Out for Mass Updates: Problem
Determine which changes to bug reports were the result of a mass update
Look Out for Mass Updates: Solution
Ingredients
You’ll need:
- Changes in bug reports (i.e., updates)
  - What changed
  - Date
  - User
  - Comment
Ingredients
Bug #   What changed        Date   User   Comment
1       status ⇒ VERIFIED   ...    ...    ...
2       status ⇒ VERIFIED   ...    ...    ...
3       status ⇒ CLOSED     ...    ...    ...
4       status ⇒ VERIFIED   ...    ...    ...
Directions (solution #1)
Select one type of change (“what changed”), e.g., status ⇒ VERIFIED
1 Plot accum. number of changes over time
2 Seek unusually high cliffs
3 Changes in the cliff are considered mass updates
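The plotting recipe can be approximated without a plot. The sketch below is only an illustration under assumptions that are not in the slides: changes arrive as (bug id, day) pairs for one already-selected type of change, and a "cliff" is taken to be any day with at least `min_jump` changes. The function name and the cutoff are hypothetical.

```python
from collections import Counter
from datetime import date
from itertools import accumulate


def find_cliffs(changes, min_jump):
    """Return (accumulated series, cliff days) for one type of change.

    changes: iterable of (bug_id, day) pairs, e.g. all "status => VERIFIED" changes.
    min_jump: hypothetical cutoff; a day with at least this many changes
              is treated as a cliff in the accumulated curve.
    """
    per_day = Counter(day for _, day in changes)
    days = sorted(per_day)
    # Accumulated number of changes over time: the curve one would plot.
    totals = list(accumulate(per_day[d] for d in days))
    accumulated = list(zip(days, totals))
    # A cliff is a day where the curve jumps by min_jump or more.
    cliff_days = {d for d in days if per_day[d] >= min_jump}
    return accumulated, cliff_days


# Toy data (made up): two ordinary days, then one day with 50 changes.
changes = [(1, date(2013, 1, 7)), (2, date(2013, 1, 8))]
changes += [(100 + i, date(2013, 1, 9)) for i in range(50)]

accumulated, cliff_days = find_cliffs(changes, min_jump=20)
# Changes that fall on a cliff day are the candidates for mass updates.
mass_updates = [(bug, day) for bug, day in changes if day in cliff_days]
```

Looking at the plotted curve is still worthwhile, since a fixed `min_jump` is a crude stand-in for what the eye recognizes as an unusually high cliff.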
Directions (solution #2)
1 Group changes by ⟨date, user, comment⟩
2 Count the groups
3 Groups with higher counts are mass updates
Date   User   Comment   Count ▼
D1     U1     C1        1703
D2     U2     C2        972
D3     U3     C3        447
D4     U4     C4        1
D5     U5     C5        1
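The grouping recipe maps directly onto a counter keyed by ⟨date, user, comment⟩. This is a minimal sketch, assuming each change is a dict with the hypothetical keys `bug`, `date`, `user` and `comment`; the `threshold` argument is also an assumption, since the slides do not fix a value.

```python
from collections import Counter


def count_groups(changes):
    """Count changes sharing the same (date, user, comment).

    changes: iterable of dicts with hypothetical keys
             "bug", "date", "user" and "comment".
    Returns (key, count) pairs sorted by count, largest first.
    """
    counts = Counter((c["date"], c["user"], c["comment"]) for c in changes)
    return counts.most_common()


def mass_update_changes(changes, threshold):
    """Keep the changes whose group has at least `threshold` members."""
    big = {key for key, n in count_groups(changes) if n >= threshold}
    return [c for c in changes if (c["date"], c["user"], c["comment"]) in big]


# Toy data (made up): one user touches 30 reports with the same comment.
changes = [
    {"bug": i, "date": "2013-01-09", "user": "joe", "comment": "cleanup"}
    for i in range(30)
]
changes.append({"bug": 99, "date": "2013-01-09", "user": "ann", "comment": "fixed"})

groups = count_groups(changes)
flagged = mass_update_changes(changes, threshold=10)
```

Sorting the groups by count, as `count_groups` does, mirrors the "Count ▼" column: the largest groups surface first and can be inspected by hand before anything is discarded.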
Look Out for Mass Updates: Discussion
The main challenge is to find a suitable threshold (i.e., how many updates characterize mass updates)
Old Wine Tastes Better
Determine bug reports that are too recent to be classified.
Old Wine Tastes Better: Context
Prediction models predict which bug reports will undergo some change, e.g.,
- predict which bugs get reopened
- predict which bugs get closed as invalid
- predict which bugs get assigned to John
e.g., predict which bugs get reopened
training set
#   Who reported?   Severity   Age   Reopened?
1   ...             ...        ...   YES
2   ...             ...        ...   YES
3   ...             ...        ...   NO
4   ...             ...        ...   NO
5   ...             ...        ...   NO
training set
#   Who reported?   Severity   Age     Reopened?
1   ...             ...        ...     YES
2   ...             ...        ...     YES
3   ...             ...        ...     NO
4   ...             ...        ...     NO
5   ...             ...        1 day   not yet
can’t use too recent bugs for training
Old Wine Tastes Better: Problem
Determine bug reports that are too recent to be classified
Old Wine Tastes Better: Solution
Ingredients
You’ll need:
- Date of last change in your data set
- Bug reports
  - Creation date
  - Whether it has been reopened*
* or, in general, whether it has undergone a particular change
Directions
1 Measure each bug’s age, from its creation date to the date of the last change in your data set
#     ...   Age        Reopened?
1     ...   180 days   YES
2     ...   90 days    NO
3     ...   16 days    YES
4     ...   12 days    NO
...   ...   ...        ...
Directions
2 Guess a threshold so that bugs younger than the threshold are considered too recent to be classified
threshold = 42 days
#     ...   Age        Reopened?
1     ...   180 days   YES
2     ...   90 days    NO
3     ...   16 days    YES         ← too recent
4     ...   12 days    NO          ← too recent
...   ...   ...        ...
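Steps 1 and 2 can be sketched as follows. The record layout (dicts with the hypothetical keys `id`, `created` and `reopened`), the function name and the example dates are assumptions chosen so that the ages match the table on the slide; only the 42-day threshold comes from the slides.

```python
from datetime import date


def split_by_age(bugs, last_change, threshold_days):
    """Split bug reports into (old enough, too recent).

    bugs: iterable of dicts with hypothetical keys "id", "created", "reopened".
    last_change: date of the last change in the data set.
    threshold_days: bugs younger than this are set aside as too recent.
    """
    old_enough, too_recent = [], []
    for bug in bugs:
        # Age runs from the creation date to the last change in the data set.
        age = (last_change - bug["created"]).days
        target = too_recent if age < threshold_days else old_enough
        target.append({**bug, "age_days": age})
    return old_enough, too_recent


# Made-up dates that reproduce the ages shown in the table.
last_change = date(2013, 5, 1)
bugs = [
    {"id": 1, "created": date(2012, 11, 2), "reopened": True},   # 180 days old
    {"id": 2, "created": date(2013, 1, 31), "reopened": False},  # 90 days old
    {"id": 3, "created": date(2013, 4, 15), "reopened": True},   # 16 days old
    {"id": 4, "created": date(2013, 4, 19), "reopened": False},  # 12 days old
]
old_enough, too_recent = split_by_age(bugs, last_change, threshold_days=42)
```

Measuring age against the last change in the data set, rather than against today's date, keeps the result stable no matter when the analysis is rerun.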
Directions
3 Estimate the confidence (α) that the remaining non-reopened bugs will never be reopened
#     ...   Age        Reopened?
1     ...   180 days   YES
2     ...   90 days    NO
3     ...   16 days    YES
4     ...   12 days    NO
...   ...   ...        ...
confidence (α)?
Directions (formula in the paper)
#     ...   Age        Reopened?
1     ...   180 days   YES
2     ...   90 days    NO
3     ...   16 days    YES
4     ...   12 days    NO
...   ...   ...        ...
α is given by a formula in the paper, which involves:
- num. bugs that have been reopened
- num. bugs older than the threshold
Directions
4 If α is not high enough (e.g., α < 0.95), choose another threshold (i.e., repeat from step 2)
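The repeat-until-confident loop can be sketched like this. The formula for α is in the paper and is not reproduced here, so the estimator is passed in by the caller; `choose_threshold`, the candidate list and the confidence values in the example are all hypothetical.

```python
def choose_threshold(candidates, estimate_confidence, target=0.95):
    """Return the first candidate threshold whose confidence reaches target.

    candidates: thresholds (in days) to try, in order.
    estimate_confidence: caller-supplied function threshold -> alpha;
        the actual formula is in the paper and is NOT reproduced here.
    Returns (threshold, alpha), or None if no candidate is good enough.
    """
    for threshold in candidates:
        alpha = estimate_confidence(threshold)
        if alpha >= target:
            return threshold, alpha
    return None


# Made-up confidence values, for illustration only (not results from the paper).
fake_alpha = {0: 0.80, 14: 0.90, 28: 0.93, 42: 0.96}
result = choose_threshold([0, 14, 28, 42], fake_alpha.get)
```

Trying candidates from smallest to largest returns the lowest threshold that reaches the target, which discards as few bug reports as possible.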
Old Wine Tastes Better: Discussion
There’s a trade-off:
- larger α ⇒ more confidence, less data
- smaller α ⇒ less confidence, more data
For the project NetBeans/Platform:
removing bugs younger than 6 weeks (0.7%) raises the confidence from 88% to 95%
Do ye have any source code to show?
Arrrr! It’s in the paper!
Thank you!
And clean up your bug reports before using them!