Post on 17-Jan-2018

Improving Structural Testing of Object-Oriented Programs via Integrating Evolutionary Testing and Symbolic Execution

Kobi Inkumsah Tao Xie

Dept. of Computer Science, North Carolina State University, Raleigh, NC

Motivation

Generating OO unit tests involves two tasks:
- Task 1: generating relevant method sequences
- Task 2: generating relevant method arguments

Symbolic execution is good at Task 2 but not Task 1: method sequences are explored bounded-exhaustively, but only up to a low bound.

Evolutionary testing is good at Task 1 but not Task 2: concrete values can be evolved, but they are largely random.

Contribution: integrating the two techniques to address their respective weaknesses

Evolutionary Testing – Chromosome

Encode a method sequence with argument values as a chromosome, e.g., eToc [Tonella 04].

[Figure: example chromosome; labels: "Method sequence", "Method arguments"]

Evolutionary Testing - Algorithm

1. Start with a random population of chromosomes
2. Evolve the population towards an optimal solution by recombination and mutation

    P ← generateRandomPopulation()
    while P is not optimal
        CS ← evaluateFitness(chromosomes[P])
        P´ ← performCrossOverOnPairsOf(CS)
        P ← mutate(P´)
    end while

E.g., eToc [Tonella 04]
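The loop above can be sketched in Java as follows. This is a minimal illustration, not eToc's implementation: the bit-string chromosome, the toy fitness (number of 1 genes, standing in for covered branches), the population size, the elitism step, and the mutation rate are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

final class EvolutionLoopSketch {
    static final Random RNG = new Random(42); // fixed seed so runs are repeatable
    static final int LENGTH = 8;              // hypothetical chromosome length
    static final int POPULATION = 20;         // hypothetical population size

    // Orders chromosomes from fittest to least fit.
    static final Comparator<int[]> BY_FITNESS_DESC =
            Comparator.comparingInt((int[] c) -> fitness(c)).reversed();

    // Toy fitness: count of genes set to 1 (stand-in for covered branches).
    static int fitness(int[] chromosome) {
        int f = 0;
        for (int gene : chromosome) f += gene;
        return f;
    }

    static int[] randomChromosome() {
        int[] c = new int[LENGTH];
        for (int i = 0; i < LENGTH; i++) c[i] = RNG.nextInt(2);
        return c;
    }

    // One-point crossover: prefix from a, suffix from b.
    static int[] crossOver(int[] a, int[] b) {
        int cut = 1 + RNG.nextInt(LENGTH - 1);
        int[] child = new int[LENGTH];
        for (int i = 0; i < LENGTH; i++) child[i] = i < cut ? a[i] : b[i];
        return child;
    }

    // With probability 0.3, flip one randomly chosen gene.
    static int[] mutate(int[] c) {
        int[] m = c.clone();
        if (RNG.nextDouble() < 0.3) {
            int i = RNG.nextInt(LENGTH);
            m[i] = 1 - m[i];
        }
        return m;
    }

    static int[] evolve(int maxGenerations) {
        List<int[]> p = new ArrayList<>();
        for (int i = 0; i < POPULATION; i++) p.add(randomChromosome());
        for (int gen = 0; gen < maxGenerations; gen++) {
            p.sort(BY_FITNESS_DESC);                      // evaluateFitness
            if (fitness(p.get(0)) == LENGTH) break;       // "P is optimal"
            List<int[]> next = new ArrayList<>();
            next.add(p.get(0));                           // keep the current best (assumed elitism)
            while (next.size() < POPULATION) {            // recombine pairs from the fitter half
                int[] a = p.get(RNG.nextInt(POPULATION / 2));
                int[] b = p.get(RNG.nextInt(POPULATION / 2));
                next.add(mutate(crossOver(a, b)));
            }
            p = next;
        }
        p.sort(BY_FITNESS_DESC);
        return p.get(0);
    }

    public static void main(String[] args) {
        System.out.println("best fitness: " + fitness(evolve(50)));
    }
}
```

In a real evolutionary tester the chromosome would encode a method sequence with its arguments, and the fitness would come from executing the test, as the following slides describe.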

Fitness Calculation

    public void withdraw(double amount) {
        if (amount > balance) {
            printError();
            return;
        }
        if (numberOfWithdrawals >= 10)
            if (amount > 500.00) {
                printError();
                return;
            }
        dispence(amount);
        balance -= amount;
        numberOfWithdrawals++;
    }

Target

eToc [Tonella 04]
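To make the fitness idea concrete, here is a hedged sketch built around the `withdraw` method above. It assumes, purely for illustration, that the target is the second `printError()` call, and it scores a test by the fraction of branch outcomes leading to that target that the test exercised. The `deposit` helper, the branch labels (`B1-false` etc.), and the exact fitness formula are assumptions of this example, not details taken from the slide; `printError()` and `dispence()` are omitted because they do not affect control flow.

```java
import java.util.HashSet;
import java.util.Set;

final class FitnessSketch {
    double balance;
    int numberOfWithdrawals;
    final Set<String> covered = new HashSet<>(); // branch outcomes exercised so far

    // Hypothetical helper so a test can fund the account.
    void deposit(double amount) { balance += amount; }

    // Same control flow as withdraw() on the slide, instrumented with branch labels.
    void withdraw(double amount) {
        if (amount > balance) { covered.add("B1-true"); return; }
        covered.add("B1-false");
        if (numberOfWithdrawals >= 10) {
            covered.add("B2-true");
            if (amount > 500.00) { covered.add("B3-true"); return; } // assumed target
            covered.add("B3-false");
        } else {
            covered.add("B2-false");
        }
        balance -= amount;
        numberOfWithdrawals++;
    }

    // Fraction of the branch outcomes on the way to the assumed target that were hit.
    double fitness() {
        String[] needed = {"B1-false", "B2-true", "B3-true"};
        int hit = 0;
        for (String b : needed) if (covered.contains(b)) hit++;
        return (double) hit / needed.length;
    }

    // Runs the sequence: deposit, `withdrawals` small withdrawals, one final withdrawal.
    static double fitnessOfSequence(int withdrawals, double lastAmount) {
        FitnessSketch a = new FitnessSketch();
        a.deposit(10_000.0);
        for (int i = 0; i < withdrawals; i++) a.withdraw(1.0);
        a.withdraw(lastAmount);
        return a.fitness();
    }
}
```

Under these assumptions, a sequence with no prior withdrawals scores 1/3 for a final `withdraw(600.0)`, while a sequence with ten prior withdrawals reaches the target and scores 1.0. This is why both the method sequence and the argument values matter.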

Re-combination

[Figures: re-combination example shown over three slides, eToc [Tonella 04]]

Mutation

[Figure: an argument value is randomly mutated; labels: "80.00", "randomly mutated to"]

Evolutionary Testing - Summary

+ Method sequences are evolved guided by fitness
- Method arguments are largely randomly picked

Dynamic Symbolic Execution

Also called concolic testing. Ex.: DART, jCUTE, Pex

[Figure sequence shown over five slides; recoverable labels:]
- Concrete input: amount: 20.0
- Constraint: amount > 1.0
- Negates amount > 1.0; new constraint: amount <= 1.0; new value: amount: 1.0
- Code fragment with the new argument: (1.00);
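The negation step can be sketched as a toy in Java. This is not how DART, jCUTE, or Pex are implemented: the `Constraint` class, the single hard-coded branch `amount > 1.0`, and the trivial "solver" that picks the boundary value are assumptions chosen to reproduce the values on the slide.

```java
final class ConcolicSketch {
    // A path condition over one input: either "amount > bound" or "amount <= bound".
    static final class Constraint {
        final double bound;
        final boolean greaterThan; // true: amount > bound; false: amount <= bound

        Constraint(double bound, boolean greaterThan) {
            this.bound = bound;
            this.greaterThan = greaterThan;
        }

        Constraint negate() { return new Constraint(bound, !greaterThan); }

        boolean holds(double amount) {
            return greaterThan ? amount > bound : amount <= bound;
        }

        // Trivial stand-in for a constraint solver: return some satisfying value.
        double solve() { return greaterThan ? bound + 1.0 : bound; }
    }

    // Hypothetical program under test with one branch; a concrete run records
    // which side of "amount > 1.0" was taken.
    static Constraint run(double amount) {
        return new Constraint(1.0, amount > 1.0);
    }

    // One concolic step: execute concretely, negate the observed constraint,
    // and solve it to obtain an input that drives execution down the other side.
    static double nextInput(double currentInput) {
        Constraint observed = run(currentInput);
        return observed.negate().solve();
    }

    public static void main(String[] args) {
        System.out.println(nextInput(20.0));
    }
}
```

Starting from the concrete input 20.0, the run observes `amount > 1.0`, the negation is `amount <= 1.0`, and the toy solver returns 1.0, matching the new value shown on the slide.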

Summary

Evolutionary testing:
+ Method sequences are evolved guided by fitness
- Method arguments are largely randomly picked

Dynamic symbolic execution:
+ Method arguments are solved
- Method sequences are fixed (or bounded)

Evacon

[Figure: Evacon overview; labels: "Not that good at argument generation", "Not that good at sequence generation"]

Argument Transformation

Transform primitive arguments of method sequences (produced by evolutionary testing) into symbolic arguments.

Benefits:
- Allow a symbolic execution tester (e.g., jCUTE) to do concrete and symbolic execution
- Transform any JUnit method sequence into a symbolic test driver
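A rough sketch of the transformation idea, under stated assumptions: the `Account` class and both test methods are invented for illustration, and a plain `DoubleSupplier` stands in for whatever input-marking API the symbolic execution tool provides (jCUTE's actual API is not shown here).

```java
import java.util.function.DoubleSupplier;

final class ArgumentTransformationSketch {
    // Hypothetical class under test.
    static final class Account {
        double balance;
        void deposit(double amount) { balance += amount; }
        void withdraw(double amount) { if (amount <= balance) balance -= amount; }
    }

    // A concrete method sequence, as evolutionary testing might produce it:
    // the primitive arguments are fixed literals.
    static double concreteTest() {
        Account a = new Account();
        a.deposit(100.0);
        a.withdraw(20.0);
        return a.balance;
    }

    // The transformed driver: the same method sequence, but each primitive
    // literal is replaced by a value requested from the tool, so the tool is
    // free to choose (and later solve for) the argument values.
    static double symbolicDriver(DoubleSupplier input) {
        Account a = new Account();
        a.deposit(input.getAsDouble());
        a.withdraw(input.getAsDouble());
        return a.balance;
    }
}
```

The method sequence stays exactly as evolved; only the source of the argument values changes, which is what lets a symbolic tester explore the arguments for a sequence it did not have to discover itself.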

Argument Transformation - Example

[Figure: example transformation; label: "Sym exe test generation"]

Chromosome Construction

Construct chromosomes out of method sequences generated using symbolic execution; these chromosomes feed evolutionary test generation.
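As an illustration of chromosome construction, the sketch below encodes a method sequence with concrete argument values into a single string with a method-sequence part and an argument part. The `Call` class and the textual layout (calls separated by `:`, values after `@`) are assumptions for this example and are not claimed to match eToc's actual chromosome format.

```java
import java.util.List;

final class ChromosomeConstructionSketch {
    // One call in a method sequence: a method name plus its concrete double arguments.
    static final class Call {
        final String method;
        final double[] args;

        Call(String method, double... args) {
            this.method = method;
            this.args = args;
        }
    }

    // Builds "<method sequence>@<argument values>": the first part lists the
    // calls with their parameter types, the second part lists the values in order.
    static String encode(List<Call> sequence) {
        StringBuilder methods = new StringBuilder();
        StringBuilder values = new StringBuilder();
        for (Call c : sequence) {
            if (methods.length() > 0) methods.append(':');
            methods.append(c.method).append('(');
            for (int i = 0; i < c.args.length; i++) {
                if (i > 0) methods.append(',');
                methods.append("double");
                if (values.length() > 0) values.append(',');
                values.append(c.args[i]);
            }
            methods.append(')');
        }
        return methods + "@" + values;
    }
}
```

Keeping the sequence and the values in separate parts lets recombination cut the method sequence while mutation changes individual values.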

Evacon - Summary

[Figure: Evacon overview, as above]

Evaluation

We compare Evacon’s achieved branch coverage with four publicly available test generation tools:
- eToc [Tonella 04] alone
- jCUTE [Sen&Agha 06] alone
- JUnit Factory [Agitar Labs 07]
- Randoop [Pacheco&Ernst 07]

Experimental Subjects

[Table: experimental subjects]

Branch Coverage

• Evacon-A achieves the highest branch coverage for 10 out of 13 subjects
• The best branch coverage can be achieved by tool combination for 8 out of 13 subjects

Required Length of Method Sequences

The length of the longest method sequence generated by Evacon-A or Randoop that achieves new branch coverage

- The required length reaches up to 23
- Bounded exhaustive testing may not be feasible

Branch Ranking

What does a comparison of two tools' branch coverage, such as 85% > 75%, tell us? The tool with 75% may be better at covering the difficult-to-cover branches when used in tool combination.

We need to take into account the difficulty of the branches being covered (esp. when using tools in combination).

Proposed metric: #branches categorized into:
- Branch-1: covered by exactly 1 tool under comparison
- …
- Branch-n: covered by exactly n tools under comparison

Covering more branches in Branch-1 means uniquely covering more branches that are not covered by the other tools under comparison.
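The categorization can be computed directly from per-tool coverage sets. The sketch below is one possible reading of the metric; the class name, the string branch identifiers, and the tool names used in the usage note are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class BranchRankingSketch {
    // Returns sizes where sizes[k] is the number of branches covered by
    // exactly k of the tools (the size of category Branch-k); sizes[0] is unused.
    static int[] categorize(Map<String, Set<String>> coveredByTool) {
        Map<String, Integer> toolsPerBranch = new HashMap<>();
        for (Set<String> branches : coveredByTool.values()) {
            for (String b : branches) toolsPerBranch.merge(b, 1, Integer::sum);
        }
        int[] sizes = new int[coveredByTool.size() + 1];
        for (int n : toolsPerBranch.values()) sizes[n]++;
        return sizes;
    }

    // Number of Branch-1 branches covered by the given tool, i.e. branches
    // that no other tool under comparison covers.
    static int uniquelyCovered(String tool, Map<String, Set<String>> coveredByTool) {
        int count = 0;
        for (String b : coveredByTool.get(tool)) {
            boolean unique = true;
            for (Map.Entry<String, Set<String>> e : coveredByTool.entrySet()) {
                if (!e.getKey().equals(tool) && e.getValue().contains(b)) unique = false;
            }
            if (unique) count++;
        }
        return count;
    }
}
```

For example, with made-up data where toolX covers {b1, b2, b3}, toolY covers {b2, b3}, and toolZ covers {b3, b4}, the categories are Branch-1 = {b1, b4}, Branch-2 = {b2}, Branch-3 = {b3}; toolX and toolZ each uniquely cover one branch and toolY covers none.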

Branch Ranking

Evacon-A is best in terms of uniquely covering branches in Branch-1

Using Evacon-A + JUnit Factory is the best choice if only two tools are to be used (not necessarily Evacon-A + Randoop!)

#Covered Branches/#Branches in category Branch-n

Conclusion

- A new integration of evolutionary testing and symbolic execution to achieve two tasks of OO test generation
- An empirical comparison of our integration with state-of-the-art representative testing tools
- A detailed comparison of the strengths and weaknesses of tools w.r.t. achieving high structural coverage (e.g., branch ranking)

THANK YOU!

Questions?

Experiments

- We applied two types of Evacon integrations: Evacon-A and Evacon-B
- We measure the branch coverage achieved by the tests generated by all six tools within the same period of runtime, except for JUnit Factory
- For the remaining tools we use Evacon-A’s runtime as the common runtime

Coverage Subsumption

Branch coverage of Evacon-A subsumed:
- Evacon-B (in 12 of 13 PsUT)
- eToc (in 7 of 13 PsUT)
- jCUTE (in 3 of 13 PsUT)
- JUnit Factory (in 1 of 13 PsUT)
- Randoop (in 4 of 13 PsUT)

Branch coverage of Randoop subsumed:
- Evacon-A (in 1 of 13 PsUT)

Overall, branch coverage of Evacon-A + branch coverage of JUnit Factory subsumed branch coverage achieved by all tools in 5 of 13 PsUT
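Subsumption here can be read as a superset check on sets of covered branches, and a tool combination as the union of the tools' sets. The sketch below assumes that reading; the class name and the string branch identifiers are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

final class SubsumptionSketch {
    // Coverage a subsumes coverage b when every branch covered in b is also covered in a.
    static boolean subsumes(Set<String> a, Set<String> b) {
        return a.containsAll(b);
    }

    // Coverage of two tools used in combination: the union of their covered branches.
    static Set<String> combine(Set<String> a, Set<String> b) {
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return union;
    }
}
```

Note that two tools can fail to subsume each other individually while their combination subsumes both, which is the situation the last line of the slide counts.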