![Page 1: REPOSITORY MINING AND PROGRAM ANALYSIS & TESTING: …ffffffff-896a-a3c8-ffff... · 2016-06-23 · •Mini-history of software archives • < 1996 – Mostly small examples, limited](https://reader036.vdocument.in/reader036/viewer/2022062914/5e5eb555cb2b4b05710dbed0/html5/thumbnails/1.jpg)
Partially supported by: NSF, DHS, and US Air Force
Alessandro (Alex) Orso
School of Computer Science – College of Computing
Georgia Institute of Technology
http://www.cc.gatech.edu/~orso/
REPOSITORY MINING AND PROGRAM ANALYSIS & TESTING: BETTER TOGETHER?
MSR PAPERS AND PROGRAM ANALYSIS
[Bar chart: number of MSR papers that leverage static and/or dynamic analyses (y-axis, 0 to 4), by year (x-axis, 2004 to 2010)]
Note: this is only for MSR!
PROGRAM ANALYSIS AND SOFTWARE ARCHIVES
• Mini-history of software archives
• < 1996 – Mostly small examples, limited evaluation
• 1996 – Siemens suite (<500 LOC)
• 2005 – Software-artifact Infrastructure Repository
• 2006 – Eclipse Bug Data
• 2007 – iBUGS
• In 2010, much (most?) research still uses the Siemens suite
ISSUE #1
Communication
[Diagram: ISSTA PCs (76) and MSR PCs (72), with 4 members in common]
ISSUE #2
Mismatch in assumptions (or schisms)
• (Most) program analyses
• Complete programs
• Single language
• Restricted set of features
• Soundness
• False positives problematic
• Mining techniques
• Incomplete programs
• Multiple languages
• Complete languages
• Noisy data
• False positives acceptable
ISSUE #3
Infrastructure
• Program analysis tools
• Unavailable
• Unusable
• Limited
• Mining infrastructure
• No standard format
• Complicated setup
• Unusable
ISSUE #4
Narrow focus of some MSA research
LOOKING FOR GOLD...
LOOKING FOR KEYS...
Software archives
MAYBE IF WE TURN ON THE LIGHT
MINING MORE THAN ARCHIVES
Software
• Archives
• Program runs
• Program traces
• ...
• Static/dynamic metrics
GAMMA PROJECT
[Diagram: developers (in house), deployed software (in the field), and field data]
Maintenance tasks:
• Impact analysis
• Regression testing
• Debugging
• Behavior classification
• ...
"Gamma System: Continuous Evolution of Software after Deployment."
Orso et al., ISSTA 2002.
IMPACT ANALYSIS
• Assess effects of changes on a software system
• Predictive: help decide which changes to perform and how to implement changes
• Our approach
• Program-sensitive impact analysis
• User-sensitive impact analysis
IMPACT ANALYSIS USING FIELD DATA
[Figure: Program P with methods m1 to m6; User A and User B run it in the field, and each run yields execution data. The data is summarized in a table with one row per execution (A1, A2, B1, B2, C1) and one column per method (m1 to m6), where an X marks a method covered by that execution.]
"Leveraging Field Data for Impact Analysis and Regression Testing."
Orso et al., ESEC-FSE 2003.
PROGRAM-SENSITIVE IMPACT ANALYSIS
Input:
1. Field execution data
2. Change C = {m2, m5}
Step 1
• Identify user executions through methods in C
• Identify methods covered by such executions
covered methods = {m1, m2, m3, m5, m6}
Step 2
• Dynamic forward slice from C
dynamic fwd slice = {m2, m5, m6}
Output:
Impact set = {m2, m5, m6}
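The two steps can be sketched in a few lines of Python. This is only an illustration: the per-execution coverage rows below are hypothetical (chosen so that they reproduce the sets shown on the slide), the forward slice is taken as a given input rather than computed, and the sketch assumes the impact set is the intersection of the covered methods and the slice.

```python
# Hypothetical execution data: execution id -> methods it covered.
# These rows are illustrative values, not the talk's actual table.
EXECUTIONS = {
    "A1": {"m1", "m2"},
    "A2": {"m4", "m6"},
    "B1": {"m1", "m2", "m3", "m6"},
    "B2": {"m1", "m2", "m3"},
    "C1": {"m3", "m5"},
}


def covered_methods(executions, change):
    """Step 1: union of the methods covered by executions that traverse the change."""
    covered = set()
    for methods in executions.values():
        if methods & change:  # execution goes through at least one changed method
            covered |= methods
    return covered


def program_sensitive_impact(executions, change, forward_slice):
    """Step 2 (assumed): keep the covered methods that are also in the forward slice."""
    return covered_methods(executions, change) & forward_slice


change = {"m2", "m5"}
# Assumption: the slice is supplied by a separate dependence analysis.
forward_slice = {"m2", "m5", "m6"}

covered = covered_methods(EXECUTIONS, change)
impact = program_sensitive_impact(EXECUTIONS, change, forward_slice)
print(sorted(covered))  # ['m1', 'm2', 'm3', 'm5', 'm6']
print(sorted(impact))   # ['m2', 'm5', 'm6']
```

Execution A2 never reaches m2 or m5, so m4 stays out of the covered set.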
USER-SENSITIVE IMPACT ANALYSIS
Input:
1. Field execution data
2. Change C = {m5, m6}
Collective impact
• Percentage of executions through at least one changed method
Affected users
• Percentage of users that executed a changed method at least once
Output:
1. Collective impact = 3/5 = 60%
2. Affected users = 3/3 = 100%
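Both metrics are simple ratios over the field data. The sketch below uses hypothetical coverage rows (illustrative values that reproduce the 60% and 100% figures) and assumes that execution C1 belongs to a third user, C.

```python
# Hypothetical field data: (user, execution id) -> methods covered.
EXECUTIONS = {
    ("A", "A1"): {"m1", "m2"},
    ("A", "A2"): {"m4", "m6"},
    ("B", "B1"): {"m1", "m2", "m3", "m6"},
    ("B", "B2"): {"m1", "m2", "m3"},
    ("C", "C1"): {"m3", "m5"},
}


def collective_impact(executions, change):
    """Fraction of executions that go through at least one changed method."""
    hits = sum(1 for methods in executions.values() if methods & change)
    return hits / len(executions)


def affected_users(executions, change):
    """Fraction of users with at least one execution through a changed method."""
    all_users = {user for user, _ in executions}
    hit_users = {user for (user, _), methods in executions.items() if methods & change}
    return len(hit_users) / len(all_users)


change = {"m5", "m6"}
print(f"{collective_impact(EXECUTIONS, change):.0%}")  # 60%
print(f"{affected_users(EXECUTIONS, change):.0%}")     # 100%
```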
EMPIRICAL STUDY
• Subject:
• JABA: Java Architecture for Bytecode Analysis (60 KLOC, 500 classes, 3K Methods)
• Data
• Field data: 1,100 executions (14 users, 12 weeks)
• In-house data: 195 test cases, 63% method coverage
• Changes: 20 real changes extracted from JABA’s CVS repository
• Research question: Does field data yield different results than in-house data?
• Experimental setup
• Computed impact sets for the 20 changes using field data and using in-house data
• Compared impact sets for the two datasets
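Comparing the two impact sets for one change can be as simple as taking set differences in both directions. The method names below are placeholders, not data from the study.

```python
def compare_impact_sets(field, in_house):
    """Split two impact sets into field-only, in-house-only, and shared methods."""
    return {
        "field_only": field - in_house,
        "in_house_only": in_house - field,
        "both": field & in_house,
    }


# Placeholder impact sets for a single change.
field_impact = {"m2", "m5", "m6"}
in_house_impact = {"m2", "m3"}

comparison = compare_impact_sets(field_impact, in_house_impact)
for name, methods in comparison.items():
    print(name, sorted(methods))
```

A large field-only or in-house-only set for a change is a sign that the two data sources exercise the program differently.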
RESULTS
[Bar chart: number of methods (y-axis, 0 to 900) per change (x-axis, C1 to C20); series: Field, InHouse, Field - InHouse, InHouse - Field]
InHouse 100 96636 Field
"Gammatella: Visualizing Program-Execution Data for Deployed Software." Jones et al., Information Visualization, 2004.
DEMO
DEBUGGING FIELD FAILURES
FIELD FAILURES
Field failures: Anomalous behaviors (or crashes) of deployed software that occur on user machines
• Difficult to debug
• Relevant to users
CURRENT PRACTICE
Ask the user
I opened my web browser.
Specifically, I clicked on the dock icon. It bounced twice before crashing.
Please help.
CURRENT PRACTICE
Gather static information
Difficult to reproduce the problem
Only locations directly correlated with the failure
OUR SOLUTION
Record failing executions in the field
+
Replay failing executions in house
Debug field failures effectively
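The record/replay idea can be illustrated with a toy program whose only interaction with its environment is reading tokens. Everything here (the Recorder and Replayer classes, the toy program, the JSON log format) is a hypothetical sketch of the general idea, not the technique's actual implementation.

```python
import json


class Recorder:
    """Wraps an input source and logs every value the program reads."""

    def __init__(self, source):
        self._source = source
        self.log = []

    def read(self):
        value = self._source()
        self.log.append(value)
        return value


class Replayer:
    """Serves previously logged values in their original order."""

    def __init__(self, log):
        self._values = iter(log)

    def read(self):
        return next(self._values)


def program(read):
    """Toy application: sums integers until it reads 'end'."""
    total = 0
    while True:
        token = read()
        if token == "end":
            return total
        total += int(token)  # raises ValueError on non-numeric input


def run_in_field(inputs):
    """Runs the program on user inputs; returns the recorded log if it fails."""
    feed = iter(inputs)
    recorder = Recorder(lambda: next(feed))
    try:
        program(recorder.read)
        return None
    except ValueError:
        return json.dumps(recorder.log)  # the captured failure


def replay_in_house(captured):
    """Re-runs the program on a captured log; True if the failure reproduces."""
    replayer = Replayer(json.loads(captured))
    try:
        program(replayer.read)
        return False
    except ValueError:
        return True


captured = run_in_field(["1", "2", "oops", "4", "end"])
reproduced = replay_in_house(captured)
print(captured, reproduced)
```

Only the inputs consumed up to the failure are logged, so the captured log is all a developer needs to reproduce it in house.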
USAGE SCENARIO
In house: Develop
In the field: Record, Captured failure
In house: Replay / Debug
CHALLENGES
• Large in size: minimize
• Contain sensitive information: anonymize
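The slides do not say how recorded executions are minimized. One well-known option, swapped in here purely for illustration, is delta debugging (ddmin): repeatedly try smaller pieces of the recorded input and keep any piece that still triggers the failure. The `fails` predicate and the token list are hypothetical.

```python
def ddmin(items, fails):
    """Return a smaller list for which `fails` still returns True (simplified ddmin)."""
    if not fails(items):
        raise ValueError("the full input must trigger the failure")
    n = 2  # current number of chunks
    while len(items) >= 2:
        chunk = max(1, len(items) // n)
        subsets = [items[i:i + chunk] for i in range(0, len(items), chunk)]
        reduced = False
        for i, subset in enumerate(subsets):
            complement = [x for j, s in enumerate(subsets) if j != i for x in s]
            if fails(subset):  # a single chunk is enough to fail
                items, n, reduced = subset, 2, True
                break
            if fails(complement):  # the failure survives without this chunk
                items, n, reduced = complement, max(n - 1, 2), True
                break
        if not reduced:
            if n >= len(items):
                break  # already at single-item granularity
            n = min(len(items), n * 2)
    return items


def fails(tokens):
    """Hypothetical oracle: the toy program crashes on a non-numeric token."""
    try:
        for token in tokens:
            if token == "end":
                break
            int(token)
    except ValueError:
        return True
    return False


minimized = ddmin(["1", "2", "oops", "4", "end"], fails)
print(minimized)  # ['oops']
```

In a real setting the oracle would replay the reduced execution and check that the original failure still occurs.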
USAGE SCENARIO
In house: Develop
In the field: Record, Captured failure, Minimize, Anonymize
In house: Replay / Debug
EVALUATION (PRACTICALITY)
Research question 1:
• Does the technique impose an acceptable overhead?
Subjects:
• Several CPU-intensive applications (e.g., bzip, gcc)
Results:
• Negligible overheads (i.e., less than 10%)
• Data size is acceptable (application dependent)
"A Technique for Enabling and Supporting Debugging of Field Failures" Clause and Orso, ICSE 2007.
EVALUATION (FEASIBILITY)
Research question 2:
• Can the technique produce minimized executions that can be used to debug the original failure?
Subject: Pine email and news client
• Two real field failures
• 20 failing executions, 10 per failure
Results:
• Executions reduced to less than 10% of their original size
• All failures reproducible
"A Technique for Enabling and Supporting Debugging of Field Failures" Clause and Orso, ICSE 2007.
EVALUATION (EFFECTIVENESS)
Research question 3:
• How much information about the original inputs is revealed?
Subjects: NanoXML, htmlparser, Printtokens, Columba
• 20 faults overall
• Inputs from 100 bytes to 5MB in size
• All inputs considered sensitive
Results:
• Anonymized inputs revealed between 2% and 60% of the information in the original inputs
"Camouflage: Automated Anonymization of Field Data." Clause and Orso, GT Tech Report, March 2010.
RQ3: EFFECTIVENESS
NANOXML
<!DOCTYPE Foo [ <!ELEMENT Foo (ns:Bar)> <!ATTLIST Foo xmlns CDATA #FIXED 'http://nanoxml.n3.net/bar' a CDATA #REQUIRED>
<!ELEMENT ns:Bar (Blah)> <!ATTLIST ns:Bar xmlns:ns CDATA #FIXED 'http://nanoxml.n3.net/bar'>
<!ELEMENT Blah EMPTY> <!ATTLIST Blah x CDATA #REQUIRED ns:x CDATA #REQUIRED>]><!-- comment --><Foo a='very' b='secret' c='stuff'>vaz <ns:Bar> <Blah x="1" ns:x="2"/> </ns:Bar></Foo>
RQ3: EFFECTIVENESS
NANOXML
<!DOCTYPE [ <! > <!ATTLIST #FIXED ' ' >
<!E > <!ATTLIST #FIXED ' '>
<!E > <!ATTLIST # : # >]><!-- -->< =' ' =' ' =' '> < : > < =" " : =" "/> </ :
Wayne,Bartley,Bartley,Wayne,[email protected],,Ronald,Kahle,Kahle,Ron,[email protected],,Wilma,Lavelle,Lavelle,Wilma,,[email protected],Jesse,Hammonds,Hammonds,Jesse,,[email protected],Amy,Uhl,Uhl,Amy,uhla@corp1,com,[email protected],Hazel,Miracle,Miracle,Hazel,[email protected],,Roxanne,Nealy,Nealy,Roxie,,[email protected],Heather,Kane,Kane,Heather,[email protected],,Rosa,Stovall,Stovall,Rosa,,[email protected],Peter,Hyden,Hyden,Pete,,[email protected],Jeffrey,Wesson,Wesson,Jeff,[email protected],,Virginia,Mendoza,Mendoza,Ginny,[email protected],,Richard,Robledo,Robledo,Ralph,[email protected],,Edward,Blanding,Blanding,Ed,,[email protected],Sean,Pulliam,Pulliam,Sean,[email protected],,Steven,Kocher,Kocher,Steve,[email protected],,Tony,Whitlock,Whitlock,Tony,,[email protected],Frank,Earl,Earl,Frankie,,,Shelly,Riojas,Riojas,Shelly,[email protected],,
RQ3: EFFECTIVENESS
COLUMBA
RQ3: EFFECTIVENESS
COLUMBA
, , , ,, , , , , ,,Wilma,Lavelle,Lavelle,Wilma,,[email protected],Jesse,Hammonds,Hammonds,Jesse,,[email protected],Amy,Uhl,Uhl,Amy,uhla@corp1,com,[email protected],Hazel,Miracle,Miracle,Hazel,[email protected],,Roxanne,Nealy,Nealy,Roxie,,[email protected],Heather,Kane,Kane,Heather,[email protected],,Rosa,Stovall,Stovall,Rosa,,[email protected],Peter,Hyden,Hyden,Pete,,[email protected],Jeffrey,Wesson,Wesson,Jeff,[email protected],,Virginia,Mendoza,Mendoza,Ginny,[email protected],,Richard,Robledo,Robledo,Ralph,[email protected],,Edward,Blanding,Blanding,Ed,,[email protected],Sean,Pulliam,Pulliam,Sean,[email protected],,Steven,Kocher,Kocher,Steve,[email protected],,Tony,Whitlock,Whitlock,Tony,,[email protected],Frank,Earl,Earl,Frankie,,,Shelly,Riojas,Riojas,Shelly,[email protected],,
RQ3: EFFECTIVENESS
HTMLPARSER
<?xml version="1.0" encoding="UTF-8" ?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"><head><title>james clause @ gatech | home</title>
<style type="text/css" media="screen" title=""><!--/*--><![CDATA[<!--*/
body { margin: 0px;...
/*]]>*/--></style></head><body> ...</body>
The portions of the inputs that remain after anonymization tend to be structural in nature and
therefore are safe to send to developers
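One rough way to quantify how much of an input survives anonymization is to count the character positions that are identical in the original and the anonymized version. This position-wise measure is an assumption made for illustration; the slides do not state how the 2% to 60% figures were computed, and the sample strings are made up.

```python
def revealed_fraction(original, anonymized):
    """Fraction of character positions left unchanged by anonymization."""
    if len(original) != len(anonymized):
        raise ValueError("this simple measure assumes equal-length inputs")
    if not original:
        return 0.0
    same = sum(1 for a, b in zip(original, anonymized) if a == b)
    return same / len(original)


# Made-up example: structural characters survive, the values do not.
original = "a='very'"
anonymized = "x='xxxx'"
print(revealed_fraction(original, anonymized))  # 0.375
```

Here only the `=` and the two quotes survive, which matches the observation that what remains tends to be structural.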
CONCLUDING REMARKS
ADDRESSING THE ISSUES
• Issue #1: Communication
  • Reaching out
  • More common events
  • Challenge
• Issue #2: Mismatch in assumptions
  • Many similarities and potential synergies
  • Opportunity for defining new (or specialized) analyses
  • Opportunity for performing more thorough evaluations
• Issue #3: Infrastructure
  • Related to communication
  • Reciprocal help
• Issue #4: Narrow focus of some MSA research
  • Go beyond the analysis of “easy” information in the repositories
  • Consider all aspects of software, both static and dynamic
  • Consider both in-vitro and in-vivo data
IN CONCLUSION, BETTER TOGETHER?
Techniques for analyzing/mining a program in all of its aspects, static and dynamic, and throughout its lifetime
ACKNOWLEDGEMENTS
• Taweesup Apiwattanapong
• James Clause
• Mary Jean Harrold
• James Jones
• Donglin Liang
• Dick Lipton