Isolating Failure Causes through Test Case Generation. Jeremias Rößler, Gordon Fraser, Andreas Zeller, Alessandro Orso. Presented by John-Paul Ore.


Post on 29-Jan-2016




Motivation: Debugging & Maintenance is Super Expensive

• Cost to develop software worldwide: $1,500,000,000,000 (USD)

• Debugging and maintenance cost: $350,000,000,000 (USD) (assumes 23% of developer time spent debugging)

Source: Judge Business School of the University of Cambridge, UK (2013), Evans Data Corporation (2012), Payscale (2012), RTI (2002), CVP Surveys (2012)


What is Debugging?

Finding the fault responsible for the failure, and applying a change to program P such that P is correct with regard to the specification S concerning the failure.

Debugging includes a search problem. We can automate search.


Talk Outline

• Problems BugEx seeks to address

• Background concepts

• Inner workings of the BugEx algorithm

• Empirical evaluation

• Relation of this work to the 990 class project


Automated Debugging: still a hard problem

Parnin, Chris, and Alessandro Orso. "Are automated debugging techniques actually helping programmers?." Proceedings of the 2011 International Symposium on Software Testing and Analysis. ACM, 2011.


BugEx: Overview of Problems Addressed

Problem 1: Automated debugging techniques reveal too many possible code locations

Solution 1: Increase precision through guided test-generation

Problem 2: Even if the location is known, developer might not have perfect bug understanding

Solution 2: Present ‘facts’ rather than code locations

Problem 3: Other experimental techniques are unsound (Delta Debugging, Predicate Switching)

Solution 3: Generate real program executions


BugEx: Underlying Concepts

1. Expands on statistical debugging. Correlate program facts with failures


1. BugEx extends Statistical Debugging (Benjamin Liblit et al.)

Liblit, B., Aiken, A., Zheng, A. X., & Jordan, M. I. (2003). Bug isolation via remote program sampling. ACM SIGPLAN Notices, 38(5), 141-154. (And more, identified in the paper.)

“Statistical debugging works off of the contrast between good and bad runs, so you need to feed it both.” – B. Liblit.

Passing test case

Failing test case
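The contrast between good and bad runs can be illustrated with a small Python sketch. It is not Liblit's or BugEx's actual scoring: the function `rank_facts`, the fact names, and the score (share of failing runs in which a fact holds minus the share of passing runs in which it holds) are assumptions chosen for illustration.

```python
from collections import Counter

def rank_facts(runs):
    """Rank facts by how much more often they hold in failing than in passing runs.

    `runs` is a list of (failed, facts) pairs; `facts` is the set of fact
    names observed to be true in that run. (Hypothetical data layout.)
    """
    failing = [facts for failed, facts in runs if failed]
    passing = [facts for failed, facts in runs if not failed]
    if not failing or not passing:
        # The contrast needs both kinds of runs.
        raise ValueError("need at least one failing and one passing run")

    true_in_failing = Counter(f for facts in failing for f in facts)
    true_in_passing = Counter(f for facts in passing for f in facts)

    scores = {}
    for fact in set(true_in_failing) | set(true_in_passing):
        fail_rate = true_in_failing[fact] / len(failing)
        pass_rate = true_in_passing[fact] / len(passing)
        scores[fact] = fail_rate - pass_rate
    # Highest contrast first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

runs = [
    (True,  {"len(s) == 0", "branch 7 taken"}),
    (True,  {"len(s) == 0"}),
    (False, {"branch 7 taken"}),
    (False, set()),
]
ranking = rank_facts(runs)
print(ranking)
```

Here the made-up fact `len(s) == 0` holds in every failing run and in no passing run, so it ranks first, while `branch 7 taken` is equally common in both groups and scores zero.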


BugEx: Underlying Concepts

1. Expands on statistical debugging. Correlate program facts with failures

2. Use automatic test generation (genetic algorithms) to create a statistically significant number of tests


2. Test Case Generation: Genetic Algorithms

• Individual is a TEST encoded in Java bytecode

• Mutation might change branching or variable values

[Figure: individuals TEST_a and TEST_b, each shown twice]

• Fitness: branch distance or predicate distance (closer is better)

Image http://www.ewh.ieee.org/soc/es/May2001/14/Begin.htm
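A minimal Python sketch of the idea, under heavy simplification: the individual here is a single integer input rather than a test encoded in Java bytecode, and the constant `TARGET`, the population size, and the crossover and mutation operators are all assumptions made for illustration. Fitness is the branch distance to making a hypothetical condition `x == TARGET` true.

```python
import random

TARGET = 4242  # hypothetical constant guarding the branch we want to reach

def branch_distance(x):
    """Distance to making `x == TARGET` true; 0 means the branch is taken."""
    return abs(x - TARGET)

def evolve(seed=0, population_size=20, generations=200):
    """Return (best individual at the start, best individual at the end)."""
    rng = random.Random(seed)
    population = [rng.randint(-10_000, 10_000) for _ in range(population_size)]
    initial_best = min(population, key=branch_distance)
    for _ in range(generations):
        population.sort(key=branch_distance)
        if branch_distance(population[0]) == 0:
            break  # the target branch is covered
        # Elitist selection: the closer half survives unchanged.
        parents = population[: population_size // 2]
        children = []
        while len(parents) + len(children) < population_size:
            a, b = rng.sample(parents, 2)
            child = (a + b) // 2                                  # crossover (blend)
            child += rng.choice([-100, -10, -1, 0, 1, 10, 100])   # mutation
            children.append(child)
        population = parents + children
    best = min(population, key=branch_distance)
    return initial_best, best

initial_best, best = evolve()
print(branch_distance(initial_best), "->", branch_distance(best))
```

Because the better half of each generation is carried over unchanged, the best distance can never get worse from one generation to the next, but nothing guarantees that it reaches zero.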


Test Case Generation: Genetic Algorithms

• Shape of the fitness function (the gradient) directs the search

• Global optimality not guaranteed

Image © Mathworks, 2010


Overview of BugEx (hint: it’s a Search)

Generate tests to explore the search space (genetic algorithm)

Find facts that correlate with failure to guide test generation (statistical debugging)

Show results
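The three steps can be sketched as a toy feedback loop in Python. This is not the algorithm from the paper: the program under test, the fact names, the neighbour-based generator (standing in for the genetic algorithm), and the filter "true in every failing run but not in every passing run" (standing in for the statistical correlation) are all assumptions for illustration.

```python
def run(test):
    """Toy program under test: returns (failed, facts true in this run)."""
    x, y = test
    observed = [
        ("x > 10", x > 10),
        ("y == 0", y == 0),
        ("x is even", x % 2 == 0),
    ]
    facts = {name for name, holds in observed if holds}
    failed = x > 10 and y == 0  # the hidden defect
    return failed, facts

def correlating_facts(tests):
    """Facts true in every failing run but not in every passing run."""
    results = [run(t) for t in tests]
    failing = [facts for failed, facts in results if failed]
    passing = [facts for failed, facts in results if not failed]
    in_all_failing = set.intersection(*failing)
    in_all_passing = set.intersection(*passing) if passing else set()
    return in_all_failing - in_all_passing

def neighbours(test):
    """Stand-in for test generation: vary one input by one step."""
    x, y = test
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

tests = [(12, 0), (3, 5)]  # one failing and one passing test to start from
before = correlating_facts(tests)

for _ in range(2):  # two rounds of generate-and-correlate
    for failing_test in [t for t in tests if run(t)[0]]:
        for candidate in neighbours(failing_test):
            if candidate not in tests:
                tests.append(candidate)

after = correlating_facts(tests)
print(sorted(before), "->", sorted(after))
```

With only the two initial tests, `x is even` looks just as suspicious as the two facts that actually cause the failure. Once generated tests such as `(13, 0)` fail with an odd `x`, that coincidental fact drops out.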


BugEx Algorithm: Initialization (Figure 4, p. 312)


BugEx Algorithm: Main Loop (Figure 4, p. 312)

[Figure: main-loop listing with annotations: "LOOP"; "(branches or state predicates)"; "(Statistical Debugging)"; "(Genetic Algorithm)"; "(of the best!)"]


14. F := getFacts(Tfail) ∪ getFacts(Tpass) ∪ F

1. Fact must be Boolean: either true or false at runtime

2. Fact must be observable.

Branches

• Reached or not reached

• T or F branch taken?

State Predicates

• All available variables, objects, constants at beginning of method

• Form: (attribute | parameters | inspector) (< | > | <= | >= | = | !=) (attribute | parameters | inspector | constant)

How big is this space (in Big O)?
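One way to get a feel for the size of the state-predicate space is to enumerate it for a handful of values. The Python sketch below assumes a simplified reading of the slide: every unordered pair of observable values, and every value paired with a constant, is combined with each of the six comparison operators. The function name and the example value names are hypothetical.

```python
from itertools import combinations

OPERATORS = ["<", ">", "<=", ">=", "=", "!="]  # the six comparisons on the slide

def state_predicates(values, constants):
    """Enumerate `left op right` predicates over named values and constants.

    `values` stands for attributes, parameters and inspector results;
    only the right-hand side may be a constant.
    """
    predicates = []
    for left, right in combinations(values, 2):   # value compared with value
        predicates += [f"{left} {op} {right}" for op in OPERATORS]
    for left in values:                           # value compared with constant
        for const in constants:
            predicates += [f"{left} {op} {const}" for op in OPERATORS]
    return predicates

preds = state_predicates(["this.size", "index", "this.capacity()"], ["0"])
print(len(preds))
```

Under this simplified enumeration, n values and c constants yield 6 · (n(n−1)/2 + n·c) predicates, so the count grows quadratically in the number of observable values at each method entry.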


16. Fcorrelating := correlateToFailure(F, Tfail, Tpass)

Bayes’ Theorem

Bayesian Inference
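The slide only names Bayes' theorem, so how `correlateToFailure` applies it is not shown here. As a hedged illustration of the theorem itself, the Python sketch below computes P(fail | fact) = P(fact | fail) · P(fail) / P(fact) from run counts. The function name and the counts are invented for the example.

```python
from fractions import Fraction

def posterior_failure(fact_true_failing, n_failing, fact_true_passing, n_passing):
    """P(fail | fact) via Bayes' theorem, estimated from run counts."""
    p_fail = Fraction(n_failing, n_failing + n_passing)   # prior P(fail)
    p_pass = 1 - p_fail
    p_fact_given_fail = Fraction(fact_true_failing, n_failing)
    p_fact_given_pass = Fraction(fact_true_passing, n_passing)
    # Total probability of observing the fact, P(fact).
    evidence = p_fact_given_fail * p_fail + p_fact_given_pass * p_pass
    if evidence == 0:
        raise ValueError("fact was never observed to be true")
    return p_fact_given_fail * p_fail / evidence

# 10 failing and 90 passing runs; the fact holds in 9 failing and 9 passing runs.
print(posterior_failure(9, 10, 9, 90))
```

In this made-up example the prior probability of failure is 1/10, and observing the fact raises it to 1/2.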


Slides courtesy of Jeremias Rößler (2012)



Empirical Evaluation


Empirical Research Questions

• RQ1. Is the number of relevant facts identified by BUGEX small enough for a developer to examine?



# of Branches vs. Time to Converge

[Plot; axis labels: Branches, Seconds]


RQ1: BugEx compared to Statistical Debugging



Empirical Research Questions

• RQ2. Do the facts identified by BUGEX help the developer understand the failure?

• The authors answered ‘yes’ by comparing their own fix with the ‘official fix’. This was challenging because the original developers sometimes refactored the code at a larger scale.


Subsequent User Studies: nope

“This study showed how much effort the design and preparation of a user study requires, and how easy error prone it is. This is probably the reason, why there are still so few user studies in the field of automated debugging.”

“So there was little time to prepare BUGEX and the underlying infrastructure.”

Rößler, Jeremias. "From software failure to explanation." (2013).


Summary

• BugEx combines Statistical Debugging and Automated Test Generation (GA) to improve debugging precision.

• BugEx treats debugging as a search problem, and tries to find information that is useful to developers.

• Usefulness is difficult to evaluate because the prototype tool is very specific.


Relation of BugEx to Project

• Guided automatic test generation.

• Focus on message-passing programs, observed at the component level (ROS, the Robot Operating System)

• Use program traces to generate test suites for regression testing, based on component properties.