A Survey on Automatic Software Evolution Techniques

Jindae Kim

PhD Qualifying Examination, Aug 31, 2015, HKUST



Overview

• Automatic Software Evolution

• Approaches

• Challenges

• Proposed Idea


Automatic Software Evolution

• An activity or a technique to evolve software automatically.

• Supports the software development process and increases the productivity of human developers.


Areas and Techniques

• Refactoring: Henkel et al. (2005), Murphy-Hill et al. (2007), Higo et al. (2008), Tsantalis et al. (2009), Tsantalis et al. (2010), Tsantalis et al. (2011), Dijkman et al. (2011)

• Automatic Patch Generation: Arcuri (2008), Arcuri et al. (2008), Dallmeier et al. (2009), Weimer et al. (2009), Wei et al. (2010), Orlov and Sipper (2011), Le Goues et al. (2012), Kim et al. (2013), Nguyen et al. (2013), Long et al. (2015)

• Automatic Runtime Recovery: Rinard et al. (2004), Elkarablieh and Khurshid (2008), Dobolyi et al. (2008), Nagarajan et al. (2009), Perkins et al. (2009), Carbin et al. (2011), Kling et al. (2012), Carzaniga et al. (2013), Long et al. (2014)

• Performance Improvement: White et al. (2008), Langdon et al. (2010), Orlov et al. (2011), White et al. (2011), Harman et al. (2012), Langdon et al. (2013), Petke et al. (2014)


Approaches


Generate and Validate

• The most recent and popular approach to automatic software evolution.

• Generates candidate variants of a program and validates each one before accepting it.
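The loop behind these bullets can be sketched in a few lines. This is a minimal illustration and not any particular tool: the program representation (a list of statement strings), the helpers `run`, `generate_variants`, and `validate`, and the toy test suite are all assumptions introduced here.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Program = List[str]  # assumed representation: one Python statement per entry


def run(program: Program, x: int) -> int:
    """Execute the toy program on input x and return the value of `result`."""
    env = {"x": x}
    exec("\n".join(program), {}, env)
    return env["result"]


def generate_variants(seed: Program, replacements: Iterable[str]) -> Iterator[Program]:
    """Generate: replace each statement in turn with each candidate statement."""
    replacements = list(replacements)
    for i in range(len(seed)):
        for stmt in replacements:
            if stmt != seed[i]:
                yield seed[:i] + [stmt] + seed[i + 1:]


def validate(program: Program, tests: List[Tuple[int, int]]) -> bool:
    """Validate: a variant passes only if every test case passes."""
    try:
        return all(run(program, x) == expected for x, expected in tests)
    except Exception:
        return False  # crashing variants are rejected


def generate_and_validate(seed, replacements, tests) -> Optional[Program]:
    for variant in generate_variants(seed, replacements):
        if validate(variant, tests):
            return variant  # first variant that passes validation
    return None


seed = ["result = x - 1"]  # seeded fault: the intended behavior is x + 1
candidates = ["result = x * 2", "result = x + 1", "result = x"]
tests = [(1, 2), (5, 6)]
patched = generate_and_validate(seed, candidates, tests)
print(patched)
```

With only the single test `(1, 2)`, the variant `result = x * 2` would also pass validation, which is how a weak test suite lets a plausible but wrong variant through.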

[Figure: Overview of Generate and Validate System. Elements: Seed Program, Generate, Program Variants, Validate (Pass / Fail), Target Program.]

[Figure: Search Space of Possible Program Variants, drawn over two Program Feature axes. Successive builds mark the Seed Program, a Plausible Variant, and the Target Variant, and label plausible variants "Good!", "Not Bad", or "Wrong!".]

Categorization of Generate and Validate Approaches

Approaches are categorized by Variant Generation (Simple Mutations, Mutations with Existing Source Code, Pre-defined Templates) and by Search Method.

Search Method: Genetic Programming (GP)

• Simple Mutations: Arcuri et al. 2008, FINCH (Orlov et al. 2011).

• Mutations with Existing Source Code: GenProg (Le Goues et al. 2012), Petke et al. 2014.

• Pre-defined Templates: PAR (Kim et al. 2013).

Search Method: Random/Heuristic Search

• Simple Mutations: Debroy and Wong 2010, SemFix (Nguyen et al. 2013).

• Mutations with Existing Source Code: AE (Weimer et al. 2013), TrpAutoRepair (Qi et al. 2013), RSRepair (Qi et al. 2014).

• Pre-defined Templates: SPR (Long et al. 2015), Prophet (Long et al. 2015).


Simple Mutations + Genetic Programming


Genetic Programming

[Figure: Overview of GenProg (Le Goues, 2013)]

Co-Evolutionary Method

• Co-evolution of source code and a test suite.

• Uses 34 basic primitives.

• Evaluated on eight seeded faults, of which five were fixed.

[Figure: Partial GP implementation of Bubble Sort by Arcuri et al.]

Simple Mutations + Random/Heuristic Search


SemFix

• Program Repair via Semantic Analysis.

• Selects a target statement based on fault localization.

• Generates a repair constraint based on a test suite.

• Synthesizes a new statement satisfying the constraint.
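The last two steps can be illustrated with a toy example. The sketch below swaps in plain brute-force enumeration over a tiny expression grammar for the semantic analysis and statement synthesis named above, so it shows the shape of the problem rather than SemFix's actual procedure. The buggy statement, the input/output pairs, the component set, and all function names are assumptions made for illustration.

```python
import itertools
import operator

# Repair constraint, written here as pairs the repaired expression must
# satisfy: (values of the visible variables, required value).
# Hypothetical scenario: the buggy statement is `bias = a - b`.
constraint = [
    ({"a": 3, "b": 1}, 4),
    ({"a": 0, "b": 5}, 5),
    ({"a": 2, "b": 2}, 4),
]

# Basic components available to the synthesizer (an assumed, tiny set).
VARIABLES = ["a", "b"]
CONSTANTS = [0, 1]
OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def candidates():
    """Enumerate expressions from the simplest to more complex ones."""
    atoms = VARIABLES + CONSTANTS
    for atom in atoms:  # size 1: a single variable or constant
        yield (atom,)
    for left, op, right in itertools.product(atoms, OPERATORS, atoms):
        yield (left, op, right)  # size 3: atom operator atom


def evaluate(expr, env):
    """Evaluate an enumerated expression under the given variable values."""
    def value(atom):
        return env[atom] if isinstance(atom, str) else atom
    if len(expr) == 1:
        return value(expr[0])
    left, op, right = expr
    return OPERATORS[op](value(left), value(right))


def synthesize(constraint):
    """Return the first expression that satisfies every input/output pair."""
    for expr in candidates():
        if all(evaluate(expr, env) == out for env, out in constraint):
            return " ".join(str(part) for part in expr)
    return None


repair = synthesize(constraint)  # new right-hand side for the target statement
print(repair)
```

Enumerating small expressions first means the simplest expression consistent with the tests is returned.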

[Figure: Basic components used in SemFix by Nguyen et al.]

• SemFix repaired 48 out of 90 bugs from SIR and GNU Coreutils.

• SemFix also generated a repair faster than a GP-based technique.

Techniques: Description and Limitation

Co-evolutionary Method

• Description: Evolving source code and a test suite together. Using simple primitives.

• Limitation: Evaluated on seeded faults for a simple program. Primitives are too small.

SemFix

• Description: Derive repair constraints from source code and test cases. Synthesize a statement satisfying the constraints.

• Limitation: Components used in statement synthesis are simple. Only evaluated on small programs.

Existing Source Code + Genetic Programming

GenProg

[Figure: Overview of GenProg (Le Goues, 2013)]

• Automatic program repair technique.

• Mutates programs by statement insertion, deletion, and replacement.

• Reuses source code from the same revision.

• Fixed 55 out of 105 bugs.

• Assumes the code needed for a patch already exists somewhere in the existing source code.
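The three mutation operators, restricted to code that already exists, can be sketched as below. This is only an illustration under simplifying assumptions: programs are flat lists of statement strings, edit locations and donor statements are picked uniformly at random, and the function name `mutate` is hypothetical. It is not GenProg's implementation.

```python
import random
from typing import List

Program = List[str]


def mutate(program: Program, rng: random.Random) -> Program:
    """Apply one statement-level mutation, reusing only existing statements."""
    variant = list(program)
    kind = rng.choice(["insert", "delete", "replace"])
    target = rng.randrange(len(variant))
    donor = rng.choice(program)  # donor code comes from the same program
    if kind == "insert":
        variant.insert(target, donor)
    elif kind == "delete":
        del variant[target]
    else:
        variant[target] = donor
    return variant


program = ["a = read()", "b = a * 2", "check(b)", "write(b)"]
rng = random.Random(0)  # fixed seed so repeated runs give the same variants
variants = [mutate(program, rng) for _ in range(20)]

# Every statement in every variant already existed in the original program.
only_existing_code = all(stmt in program for v in variants for stmt in v)
print(only_existing_code)
```

The check at the end makes the last bullet concrete: if the statement a fix needs does not appear anywhere in the program, no sequence of these mutations can produce it.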


Genetic Improvement

• Petke et al. evolve the MiniSAT solver for Combinatorial Interaction Testing (CIT).

• Uses multiple human-written variations of the MiniSAT solver as the code base.

• The evolved MiniSAT is even faster on CIT than the human-written versions.


Existing Source Code + Random/Heuristic Search


AE

• Generates all possible variants one by one.

• Uses the same mutation operators as GenProg.

• Selects a fix location based on fault localization.

• Detects equivalent variants and skips their validation.

• Generates first-order mutants only.
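A deterministic enumeration of first-order mutants with duplicate skipping might look like the sketch below. Two simplifications are assumed and should be read as such: variants count as "equivalent" only when they are syntactically identical, which is far weaker than a real equivalence check, and the suspiciousness scores are invented stand-ins for fault-localization output.

```python
from typing import Dict, Iterator, List, Tuple

Program = Tuple[str, ...]


def first_order_mutants(program: Program,
                        suspiciousness: Dict[int, float]) -> Iterator[Program]:
    """Yield each single-edit variant once, most suspicious location first."""
    seen = {program}  # the unmodified program is never re-validated
    locations = sorted(suspiciousness, key=suspiciousness.get, reverse=True)
    for loc in locations:
        edits: List[Program] = [program[:loc] + program[loc + 1:]]      # delete
        for donor in program:
            edits.append(program[:loc] + (donor,) + program[loc + 1:])  # replace
            edits.append(program[:loc] + (donor,) + program[loc:])      # insert
        for variant in edits:
            if variant not in seen:  # skip variants that were already generated
                seen.add(variant)
                yield variant


program = ("x = 0", "x = 0", "y = x + 1")
scores = {2: 0.9, 0: 0.1}  # hypothetical fault-localization scores per line index
mutants = list(first_order_mutants(program, scores))
print(len(mutants))
```

Because the program contains a duplicated statement, several edits collapse into the same variant, and each of those is validated only once.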


RSRepair

• RSRepair has the same search space as GenProg.

• Generates variants one by one, which requires fewer patch trials.

• Randomly selects a change location.

• Outperforms GenProg in 23 out of 24 cases.

Techniques: Description and Limitation

GenProg and Genetic Improvement

• GenProg: Statement-level mutations. Inserts or replaces statements using existing code.

• Genetic Improvement: Statement-level mutations. Uses multiple variations of the same program as the code base.

• Limitation (both): Lack of ability to create new statements. Fitness-guided search is not effective.

AE and RSRepair

• AE: Deterministic search. Equivalent variant detection.

• RSRepair: Random search.

• Limitation (both): Lack of ability to create new statements. Search space is limited to variants with only one mutation.

Pre-defined Templates + Genetic Programming

PAR

[Figure: Overview of PAR framework by Kim et al.]

• PAR uses 10 pre-defined fix templates.

• Fix templates are drawn from manual inspection of human patches.

• Fix templates include null checker, parameter changes and expression changes.

• Patches requiring new code can be generated.
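As an illustration of what a fix template does, here is a sketch of a null-checker template applied to a single Java statement held as a string. The regex-based receiver detection and the function name `null_checker_template` are assumptions made for this example. A real template would work on the program's syntax tree.

```python
import re
from typing import List


def null_checker_template(statement: str, indent: str = "    ") -> str:
    """Wrap a Java statement in a null check for each receiver object it uses.

    Receivers are found with a simple regex (a lowercase identifier followed
    by a dot), an assumption that only suits small examples like the one below.
    """
    receivers: List[str] = []
    for name in re.findall(r"\b([a-z_]\w*)\s*\.", statement):
        if name not in receivers and name != "this":
            receivers.append(name)
    if not receivers:
        return statement  # the template does not apply to this statement
    condition = " && ".join(f"{name} != null" for name in receivers)
    return f"if ({condition}) {{\n{indent}{statement}\n}}"


buggy = "result = obj.getValue();"
patched = null_checker_template(buggy)
print(patched)
```

The `if (obj != null)` guard is new code that does not exist elsewhere in the program, which is the point of the last bullet.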

Patch Acceptability

[Figure: Average ranks evaluated by 68 developers, by Kim et al.]

Pre-defined Templates + Heuristic Search


SPR

• Uses seven transformation schemas.

• Uses condition synthesis to instantiate transformation schemas.

• Applies schemas in a pre-defined order.

• Prioritizes transformations on branches and memory initializations.
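Condition synthesis can be pictured as a search for a concrete condition that reproduces the branch outcomes the tests require. The sketch below assumes those desired outcomes have already been recorded together with the variable values at the branch, and it searches only conditions of the form `var OP const`. The observation data and the function name are hypothetical, and this is a simplification rather than SPR's algorithm.

```python
from typing import Dict, List, Optional, Tuple

# (variable values at the branch, branch outcome the tests need)
Observation = Tuple[Dict[str, int], bool]


def synthesize_condition(observations: List[Observation]) -> Optional[str]:
    """Find a condition `var OP const` that reproduces every desired outcome."""
    checks = {
        "==": lambda v, c: v == c,
        "!=": lambda v, c: v != c,
        "<": lambda v, c: v < c,
    }
    variables = sorted(observations[0][0])
    for var in variables:
        constants = sorted({env[var] for env, _ in observations})
        for const in constants:
            for op, check in checks.items():
                if all(check(env[var], const) == want for env, want in observations):
                    return f"{var} {op} {const}"
    return None


# Hypothetical record: the guarded statement must be skipped when len is 0.
observations = [
    ({"len": 0, "pos": 0}, False),
    ({"len": 4, "pos": 0}, True),
    ({"len": 9, "pos": 3}, True),
]
condition = synthesize_condition(observations)
print(condition)
```

The synthesized condition would then fill the hole in a schema such as "guard this statement with `if (...)`".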

Plausible vs. Correct Patches

Repair Generation Results by Long et al. (69 defects in total)

• SPR: 37 plausible, 11 correct

• GenProg: 16 plausible, 1 correct

• AE: 25 plausible, 2 correct

[Figure: Search Space comparison by Long et al.]

Prophet

• Uses the same transformation schemas as SPR.

• Learns a model from successful patches.

• Ranks candidate patches based on the trained model.

• Prophet generated correct patches for 15 defects, while SPR generated 11.
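The ranking step can be sketched as scoring each candidate patch with a linear model over patch features and validating candidates in descending order of score. The feature names, the weights, and the candidate patches below are all invented for illustration. In a real system the weights would be learned from successful patches rather than written by hand.

```python
import math
from typing import Dict, List

# Assumed feature weights (hand-written here, learned in practice).
weights = {"adds_null_check": 1.2, "deletes_statement": -2.0, "edits_condition": 0.7}

# Hypothetical candidate patches described by their feature values.
candidates: Dict[str, Dict[str, float]] = {
    "delete call": {"deletes_statement": 1.0},
    "guard with null check": {"adds_null_check": 1.0, "edits_condition": 1.0},
    "tighten condition": {"edits_condition": 1.0},
}


def score(features: Dict[str, float]) -> float:
    """Linear score: weighted sum of the patch's feature values."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def rank(candidates: Dict[str, Dict[str, float]]) -> List[str]:
    """Order patches by model probability (softmax over the linear scores)."""
    scores = {name: score(features) for name, features in candidates.items()}
    total = sum(math.exp(s) for s in scores.values())
    probability = {name: math.exp(s) / total for name, s in scores.items()}
    return sorted(probability, key=probability.get, reverse=True)


validation_order = rank(candidates)
print(validation_order)
```

The search space is unchanged by ranking. Only the order in which candidates reach validation differs.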

[Figure: Repair Searches of SPR and Prophet, drawn over two Program Feature axes, showing the Seed Program, a Plausible Variant, the Target Variant, and the search performed by each tool.]

Techniques: Description and Limitation

PAR

• Description: 10 fix templates from manual inspection.

• Limitation: Only some of the fix templates are useful. Template instantiation using existing code.

SPR

• Description: Seven transformation schemas. Condition synthesis for schema instantiation.

• Limitation: Hard-coded heuristic search. Search space is limited by schemas.

Prophet

• Description: Same transformation schemas as SPR. Ranks variants based on a probabilistic model.

• Limitation: Search space is limited by schemas.

Challenges


Search Space Explosion

• Pre-defined templates limit the search space.

• SPR's search space contains a correct variant for only 19 of the 69 defects.

• 35 of the 69 defects can be fixed with an extended search space (Long et al. 2015).

• What about the additional cost of searching the extended space?
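A rough count shows why extending the search space is costly. Assume, purely for illustration, that a program admits 1000 distinct single edits and that a variant is an unordered combination of k of them. Neither figure comes from the surveyed papers.

```python
import math

single_edits = 1000  # assumed number of distinct single edits

# Number of variants that combine exactly k of the single edits.
variants_by_order = {k: math.comb(single_edits, k) for k in (1, 2, 3)}

for k, count in variants_by_order.items():
    print(f"variants with {k} edit(s): {count}")
```

Under these assumptions, allowing two edits instead of one multiplies the number of variants by roughly 500, and each variant still has to be validated against the test suite.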


Search Method

• Random application of mutations may generate plausible but incorrect variants (Qi et al. 2015).

• Prophet finds four more correct patches than SPR.

• The only difference between them is the search method.

• Extending the search space makes the search even harder.

• An effective and efficient search method is necessary.


How to address the issues?

• Avoid error-prone changes by learning from existing changes.

• PAR and SPR show that the template approach works.

• Identify frequent changes from software repositories, then use them as templates.

• Mine usage patterns of such changes to assist the search.
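Identifying frequent changes could start from something as simple as counting abstracted change patterns. The sketch below assumes each change in a repository's history has already been abstracted into an (edit kind, code element) pair. The sample data, the `min_support` threshold, and the function name are all hypothetical.

```python
from collections import Counter
from typing import List, Tuple

ChangePattern = Tuple[str, str]  # (kind of edit, kind of code element)

# Hypothetical mined data: one entry per change found in the history.
changes: List[ChangePattern] = [
    ("insert", "null check"), ("replace", "method argument"),
    ("insert", "null check"), ("replace", "condition operator"),
    ("insert", "null check"), ("replace", "method argument"),
    ("delete", "statement"),
]


def frequent_changes(changes: List[ChangePattern],
                     min_support: int) -> List[Tuple[ChangePattern, int]]:
    """Return patterns seen at least `min_support` times, most frequent first."""
    counts = Counter(changes)
    return [(pattern, n) for pattern, n in counts.most_common() if n >= min_support]


templates = frequent_changes(changes, min_support=2)
print(templates)
```

Patterns that clear the threshold become candidate templates, and their relative frequencies could serve as a prior for ordering the search.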


Summary

• Automatic software evolution has been used in many areas.

• Generate and Validate systems have advanced in two major directions: program variant generation and search method.

• The current challenges are search space explosion and the need for an effective search method.

Approaches and Limitations

Variant Generation

• Simple Mutations: Applied modifications are very simple. Scalability issue: only works for small programs.

• Mutations with Existing Source Code: Existing code restricts possible program variants. Low possibility that necessary code fragments exist.

• Pre-defined Templates: Pre-defined templates restrict search space. Only a small number of templates are used.

Search Method

• Genetic Programming: Additional costs for fitness evaluation. Fitness-guided search is not effective.

• Random/Heuristic Search: Search space is limited based on the number of mutations. Mostly consider only one mutation.