The credit for creating these slides belongs to Fall 2014 CS 521/621 students. Student names have been removed per FERPA regulations.
Electrical and Computer Engineering
SemFix: Program Repair via Semantic Analysis
Hoang Duong Thien Nguyen, Dawei Qi, Abhik Roychoudhury, Satish Chandra
Background
▪ Bug fixing is a mostly manual, time-consuming, and expensive activity
▪ Current automatic bug-fixing techniques:
  ▪ Specification-based repair
    ➢ a formal specification is needed
  ▪ Genetic-programming-based repair
    ➢ the correct expression must already be present in the program
  ▪ Enumeration-based repair
    ➢ all possible expressions must be considered
Research Question
▪ Can we fix bugs in a program without a formal specification?
▪ Is there a better way to fix bugs than genetic programming or enumeration?
▪ Is program synthesis better than enumeration?
Contribution
▪ Provide an automatic bug-fixing tool that needs no formal specification
▪ Formulate a repair constraint: the requirement that the repaired code pass all given tests
▪ Higher success rate and faster bug repair
▪ Provide a new, efficient, and scalable technique that incorporates component-based synthesis
Key Idea
▪ Get a Ranked Bug report • using statistical fault localization
▪ Find a repair constraint according to given tests • using symbolic execution
▪ Compute a repair for the program • using program synthesis
Key Idea
▪ Ranked bug report
  ▪ generated by the Tarantula toolkit (other metrics can be used)
  ▪ a list of all potentially faulty statements with their locations
  ▪ statements are ranked by suspiciousness score, from the most suspicious to the least
    o The suspiciousness of a line is based on how often the line is executed in passing and failing executions. The more failing executions cover the line relative to passing ones, the higher the score.
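The ranking step can be sketched in a few lines of Python. This is a minimal illustration using the commonly cited Tarantula suspiciousness ratio, not the toolkit's actual code; the coverage numbers and line numbers below are made up for the example.

```python
def tarantula(passed_s, failed_s, total_passed, total_failed):
    """Suspiciousness of one statement from its pass/fail coverage counts."""
    fail_ratio = failed_s / total_failed if total_failed else 0.0
    pass_ratio = passed_s / total_passed if total_passed else 0.0
    if fail_ratio + pass_ratio == 0:
        return 0.0  # statement never executed by any test
    return fail_ratio / (fail_ratio + pass_ratio)


# Hypothetical coverage: line -> (passing tests covering it, failing tests covering it)
COVERAGE = {10: (3, 2), 11: (1, 2), 12: (2, 0)}
TOTAL_PASSED, TOTAL_FAILED = 3, 2


def ranked_report(coverage, total_passed, total_failed):
    """Return (line, score) pairs, most suspicious first."""
    scores = {
        line: tarantula(p, f, total_passed, total_failed)
        for line, (p, f) in coverage.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


for line, score in ranked_report(COVERAGE, TOTAL_PASSED, TOTAL_FAILED):
    print(f"line {line}: {score:.2f}")
```

In this made-up data, line 11 is covered by both failing tests but only one passing test, so it ranks above line 10, which every test executes.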
Key Idea
▪ To get the repair constraint C (symbolic execution)
  ▪ x = fbuggy(…) -> x = f(…)
  ▪ The input-output pair of each test i generates one constraint Ci
    1. treat the result as symbolic: T = f(input)
    2. Ci is the requirement on T to get the expected output (e.g., T > 10)
▪ The repair constraint C is the conjunction of all Ci
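A rough sketch of how per-test constraints combine is given below. It does not perform real symbolic execution: a brute-force scan over a bounded integer range stands in for the symbolic value T, and the program `run_with_hole`, the tests, and the candidate functions are all hypothetical.

```python
T_RANGE = range(-200, 201)  # bounded stand-in for the symbolic value T


def run_with_hole(t, down_sep):
    """Hypothetical program tail: the output depends on the repaired value t."""
    return 1 if t > down_sep else 0


# Hypothetical tests: (inputs, expected output)
TESTS = [
    ({"inhibit": 1, "up_sep": 0, "down_sep": 100}, 0),
    ({"inhibit": 1, "up_sep": 11, "down_sep": 110}, 1),
]


def constraint_for(inputs, expected):
    """Ci: the set of values of T that make this test produce its expected output."""
    return {t for t in T_RANGE if run_with_hole(t, inputs["down_sep"]) == expected}


def satisfies_c(f):
    """C is the conjunction of all Ci: f(input_i) must lie in Ci for every test i."""
    return all(f(**inputs) in constraint_for(inputs, expected)
               for inputs, expected in TESTS)


def buggy(inhibit, up_sep, down_sep):
    return down_sep


def candidate(inhibit, up_sep, down_sep):
    return up_sep + 100


print(satisfies_c(buggy))      # False: the second test needs T > 110, but T = 110
print(satisfies_c(candidate))  # True: T = 100 and T = 111 meet both constraints
```

Here the first test yields the constraint T <= 100 and the second yields T > 110; each constraint applies to the value f takes on that test's own inputs.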
Key Idea
▪ To solve the repair constraint C (program synthesis)
  ▪ Decide which components can appear in the fix
    ➢ select primitive components based on complexity
    ➢ define location variables for each component
  ▪ Generate a repair statement by solving the repair constraint with an SMT solver
Example
1. Get bug report
2. Get repair constraint for the first bug
   bias = down_sep -> bias = f(inhibit, up_sep, down_sep)
   T = f(inhibit, up_sep, down_sep)
   C = {(C1: T <= 100) ^ (C2: T > 110) ^ … ^ (C5: T < 10)}
Example
▪ Repair constraint: C = {(C1: T <= 100) ^ (C2: T > 110) ^ … ^ (C5: T < 10)}
▪ Provide components for the function f(inhibit, up_sep, down_sep)
  -- start with level 1: f(inhibit, up_sep, down_sep) = constant
  -- if level 1 cannot satisfy C, combine level 1 and level 2
  -- the process continues until a repair is generated: f(inhibit, up_sep, down_sep) = up_sep + 100
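The level-by-level search can be illustrated with a small Python sketch. It is a simplification under stated assumptions: plain enumeration over a bounded constant range stands in for the SMT-based synthesis SemFix uses, the three tests are hypothetical inputs chosen to match the shape of the constraints above, and the two "levels" (a constant, then variable + constant) are an assumed stand-in for the tool's component levels.

```python
from itertools import product

VARS = ("inhibit", "up_sep", "down_sep")
CONSTS = range(-200, 201)  # assumed bound on constants

# Hypothetical tests: (inhibit, up_sep, down_sep) plus the condition T must meet.
TESTS = [
    ((1, 0, 100), lambda t: t <= 100),
    ((1, 11, 110), lambda t: t > 110),
    ((1, -20, 60), lambda t: t > 60),
]


def level1():
    """Level 1 candidates: f(...) = constant."""
    for c in CONSTS:
        yield str(c), (lambda env, c=c: c)


def level2():
    """Level 2 candidates: f(...) = variable + constant."""
    for name, c in product(VARS, CONSTS):
        yield f"{name} + {c}", (lambda env, name=name, c=c: env[name] + c)


def satisfies(f):
    """True when f meets the constraint of every test."""
    return all(ok(f(dict(zip(VARS, inputs)))) for inputs, ok in TESTS)


def repair():
    """Try the simplest level first and stop at the first satisfying candidate."""
    for level in (level1, level2):
        for text, f in level():
            if satisfies(f):
                return text
    return None


print(repair())  # up_sep + 100
```

No constant can be both at most 100 and greater than 110, so level 1 fails and the search moves on to level 2, where `up_sep + 100` is the only candidate that satisfies all three hypothetical tests.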
Repair Algorithm
Summary of Evaluation
▪ Subject programs used
Summary of Evaluation
▪ SemFix versus GenProg (based on genetic programming)
▪ Repair success rate: SemFix > GenProg (SIR)
  Overall, for 90 buggy programs with 50 given tests: SemFix repaired 48/90; GenProg repaired 16/90
Summary of Evaluation
▪ Bug types: SemFix fixed more types of bugs than GenProg
Summary of Evaluation
▪ Running time: GenProg's running time is more than 3 times that of SemFix (SIR)
Summary of Evaluation
Bugs that were not fixed:
● Multiple-line fixes
● Same wrong branch condition: if (c) { ... } ... if (c) { ... }
● Updates to multiple variables: x = e1; ... ; y = e2;
● Floating-point bugs: n = (int) (count*ratio + 1.1)
Conclusion
➔ The SemFix tool can automatically fix bugs without a formal specification
➔ SemFix has a higher success rate than GenProg and runs faster
➔ SemFix can fix various types of bugs
Discussion
1. If the termination condition of a loop is an expression over our introduced symbolic variable, the symbolic execution may never terminate. What should we do? For example: while(i<x){x=buggy-expression}
Discussion
2. If we set the bound too small, what might happen?
Discussion
3. Why does SemFix run faster than GenProg?
Discussion
4. Can you think of any other bugs that SemFix can’t fix?
Discussion
5. Can a test case outside the test suite be used to generate a repair?
Discussion
6. Is it easier to fix multiple simple repairs than just one complex repair?
Discussion
7. SemFix uses program synthesis, which combines components at different levels of complexity to generate a repair. Why is this faster than enumeration?