an empirical study of advanced - mcmaster university

1

Advanced Topics

Combining S. M. T

2

An Empirical Study of Predicate Dependence Levels and Trends

David Binkley

Mark Harman

ICSE 2003

Or … how to use slicing to measure testability

3

overview

evolutionary testing

variable dependence

empirical study programs

results

implications

4

DaimlerChrysler approach to test data generation

[Figure: test process diagram with the labels Test Planning, Test Case Design by Means of EA, Test Execution, Monitoring, Test Evaluation, Test Organization, Test Documentation, Specification and Program; evolutionary algorithm cycle with the labels Selection, Recombination, Mutation, Evaluation and Reinsertion]

the starting point for the study was evolutionary test data generation

5

[Figure: the target]

6

[Figure: target with Level 4 to Level 1; labels: control dependence analysis, approximation level]

7

fitness = approximation level + local distance

if A = B: local distance = | A - B |

[Figure: target with Level 4 to Level 1; labels: control dependence analysis, approximation level, local distance calculation]
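A minimal sketch of the fitness formula on this slide, assuming fitness is minimised so that zero means the target condition is satisfied. The function names and the meaning given to the approximation level argument are assumptions made for illustration; the slide gives only the formula.

```c
#include <stdlib.h>

/* Local distance for a condition of the form A == B, as on the slide: |A - B|.
   Operands are assumed small enough that A - B does not overflow. */
long local_distance_eq(long a, long b)
{
    return labs(a - b);
}

/* fitness = approximation level + local distance.
   approximation_level is assumed to be the level (from control dependence
   analysis) at which execution diverged away from the target. */
long fitness(int approximation_level, long a, long b)
{
    return (long)approximation_level + local_distance_eq(a, b);
}
```

For example, an input that diverges at approximation level 2 on a condition A = B with A = 7 and B = 4 scores fitness(2, 7, 4) = 5. Practical tools often normalise the local distance so that it cannot outweigh a whole approximation level; the slide does not show that step, so it is left out here.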

8

these are the inputs

but suppose we target this predicate … which variables matter?

but wait...

so we must keep y

variable d is not mentioned

so c is killed

of the original 7 variables only 3 matter

search space reduction is exponential

x, y, and z seem to be killed

search space reduction
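The claim that the reduction is exponential can be checked with a little arithmetic. The sketch below assumes, purely for illustration, that every input variable is a fixed-width integer; the helper names are hypothetical.

```c
/* Number of bits needed to encode one candidate input when every variable
   is a fixed-width integer (an assumption made for illustration). */
unsigned search_space_bits(unsigned n_vars, unsigned bits_per_var)
{
    return n_vars * bits_per_var;
}

/* The search space shrinks by a factor of 2^k; this returns k. */
unsigned reduction_exponent(unsigned vars_before, unsigned vars_after,
                            unsigned bits_per_var)
{
    return search_space_bits(vars_before, bits_per_var)
         - search_space_bits(vars_after, bits_per_var);
}
```

With 32-bit variables, going from 7 relevant variables to 3 takes the space from 2^224 candidate inputs to 2^96, a reduction by a factor of 2^128.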

9

int g1,…,gk;

foo(int x1,…,int xn)

int a1,…,am;

…

if p(a1,x1,a3,g4,x4)

…

if q(a1,x5)

…

for each predicate, e.g. q(a1,x5): how many matter?

as n increases?

as k increases?

10

an empirical study

the question: for a typical predicate, how much of the input space is relevant?

why care?

reduce search space

implications for slicing … slice size for the predicate is closely related

comprehension: how easy is it to understand the decision?

impact analysis: how much do inputs affect?

measurement: how cohesive?

11

inputs

formal parameters

globals in scope

du-globals: globals transitively defined or used in the predicate’s procedure

12

Name        KLoC   Predicates   Description
replace      0.6       84       Regular expression string replacement
copia        1.2       45       ESA signal processing code
compress     1.9       88       Data compression utility
which        5.4      136       Unix utility
barcode      5.9      365       Barcode generator
space       11.5      593       ESA ADL interpreter
ed          13.6     2627       Unix editor
prepro      14.8      619       ESA array pre-processing code
bc          16.8      432       Calculator
findutils   18.6     1819       File finding utilities
diffutils   19.8     2212       File comparing routines
flex2-4-7   21.2      911       BSD scanner (version 2.4.7)
flex2-5-4   21.5     1306       BSD scanner (version 2.5.4)
espresso    22.1     2497       Logic simplification for CAD
ijpeg       28.2     2038       JPEG compressor
ntpd        45.6     3061       Daemon for the network time protocol
a2ps        62.9     5984       PostScript formatter
gnugo       81.7     5244       GNU game player
sendmail    85.7     4991       Linux mail daemon
spice      167.0    15527       Digital circuit simulator

the programs studied

13

data collection

modified HRB slicing algorithm

implemented using CodeSurfer

formals and du-globals become parameters

thanks to GrammaTech for CodeSurfer

14

terminology

max parameters: the maximum number of parameters a predicate could depend upon

parameters used: the number of parameters a predicate actually depends upon

15

result data

three results to summarise in one data point:

max parameters

parameters used

number of predicates summarised

we adopted two diagrammatic techniques

16

dependence skyline diagram

listed in ascending order of max formals

8 predicates have max formals of 3 and use 2.2 on average

skylines give a feeling for size reduction… but precise reading is lost

evident savings for prepro

notice the trend for dependent proportion to drop

[Figure: dependence skylines for replace and prepro]

17

dependence bubble chart

max parameters = parameters used

trend line

[Figure: dependence bubble charts for prepro and replace, formal parameters]

angle shows the size reduction

x-axis: max formals

y-axis: formals used

bubble at (x,y) of size s … there are s predicates which have x max formals available to them; these s predicates depend on average on y formals
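How one bubble could be computed from per-predicate data can be sketched as follows. The struct and function names are hypothetical, and the input array stands in for whatever the slicing tool reports; this is not the authors' implementation.

```c
#include <stddef.h>

/* One analysed predicate: how many formals it could depend on,
   and how many it actually depends on. */
struct predicate {
    unsigned max_formals;
    unsigned formals_used;
};

/* One bubble of the chart. */
struct bubble {
    unsigned x;      /* max formals */
    double   y;      /* mean formals used */
    size_t   size;   /* number of predicates summarised */
};

/* Summarise all predicates that have exactly max_formals formals available. */
struct bubble make_bubble(const struct predicate *preds, size_t n,
                          unsigned max_formals)
{
    struct bubble b = { max_formals, 0.0, 0 };
    unsigned long total_used = 0;

    for (size_t i = 0; i < n; i++) {
        if (preds[i].max_formals == max_formals) {
            total_used += preds[i].formals_used;
            b.size++;
        }
    }
    if (b.size > 0)
        b.y = (double)total_used / (double)b.size;
    return b;
}
```

A bubble on the line y = x means every available formal is depended upon; the further a bubble falls below that line, the larger the reduction.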

18

over all predicates in all programs

19

good news

as max formals increases

the dependent proportion falls

good news for evolutionary testing

and also for all the other applications

perhaps it is not good for cohesion

well almost

20

declining dependent proportion

[Figure panels: all predicates where max formals < 11; all predicates where max formals > 10]

The number of formals depended upon, as a function of max formals available, is piecewise linear

21

declining dependent proportion

max formals < 11 max formals > 10

22

bad news

no such correlation for globals

globals could entail untestability using search

also bad for other applications

is this more evidence that globals are bad?

23

dependence trends

Max formals

Max du globals

24

implications

formals:

as the problem gets worse

the solution gets better

25

implications

globals:

large global variable lists

present problems

hard to generate test data

hard to understand

high levels of dependence

26

implications

cohesion:

functions with

large numbers of formals

may not be so cohesive

27

algorithm performance

the algorithm used has complexity O(P*S), where P is the number of predicates and S is the size of the procedure

PII 450 with 386Mb memory

analysis time per procedure = 0.017 * P * S ms

average analysis time 7.6ms per predicate
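The timing model on this slide is simple enough to evaluate directly. The sketch below only restates the slide's formula; the function name is hypothetical, and the unit in which S is measured is not given on the slide.

```c
/* Constant from the slide: milliseconds per (predicate x unit of procedure size),
   measured on the machine named on the slide. */
#define MS_PER_PREDICATE_UNIT 0.017

/* Estimated analysis time for one procedure, in milliseconds:
   0.017 * P * S. */
double analysis_time_ms(unsigned predicates, unsigned procedure_size)
{
    return MS_PER_PREDICATE_UNIT * (double)predicates * (double)procedure_size;
}
```

Under this model a procedure with 10 predicates and size 100 takes about 17 ms. Dividing the reported average of 7.6 ms per predicate by 0.017 suggests an average procedure size of roughly 447 in the same (unstated) unit.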

28

some possible interpretations

what do you read into these diagrams?

29

flex evolution of du globals

version 2.4.7 version 2.5.4

30

profiles

sendmail du globals: big bubbles top right

findutils du globals: big bubbles bottom left

31

conclusions

falling formal dependent proportion

invariant global dependent proportion

analysis is worth performing

diagrams may be an aid to understanding

evaluation

monitoring evolution

32

Testability Transformation

or … how to use slicing and transformation to improve testing

Overview

Test Data Generation

Evolutionary Test Data Generation

The Flag Problem

Flag Removal Algorithm

Initial Results

Other ‘non meaning-preserving’ transformations

If time permits

33

Automatic Test Generation

We know that generating good quality test data is hard

and knowing what good quality means is hard

I do not propose to answer that question today

Starting point: structural test adequacy criterion

Specifically: that some branch is to be covered

and that we are going to use evolutionary testing

34

Distance Based Fitness

Fitness = Approximation_Level + Local_Distance

1. Approximation level

Relevant branching statements can lead to a miss of the desired target

2. Local distance calculation in the branching statements with undesired branching

Evaluation of predicate in a branching condition in the same manner as described for safety testing, e.g.

if A = B: Local_Distance = max - (| A - B |)

[Figure: target with Level 4 to Level 1]

35

The Flag Problem

if (A==0) ...

Suppose we want to make this true

fitness = Max - (| A-0 |) = 10 - (| A-0 |)

[Plot: fitness (0 to 10) against value of A (-10 to 10)]

flag = A==0;

if (flag) ...

[Plot: fitness (0 to 10) against value of A (-10 to 10)]
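The two landscapes can be compared with a small sketch. It assumes fitness is maximised here (Max - distance, as on the slide) and that, once the comparison is hidden behind a flag, the search can only see whether the flag is true; the constant 0 awarded for a false flag is an assumption chosen to match the plateau described on the next slide.

```c
#include <stdlib.h>

#define MAX_FITNESS 10

/* Fitness for making "if (A == 0)" true: Max - |A - 0|, as on the slide.
   Intended for A in [-10, 10], the range plotted on the slide. */
int fitness_direct(int a)
{
    return MAX_FITNESS - abs(a - 0);
}

/* Fitness when the test is hidden behind "flag = A == 0; if (flag)".
   Assumption: the search only sees the boolean, so it awards the maximum
   when the flag is true and a constant low value otherwise. */
int fitness_flag(int a)
{
    int flag = (a == 0);
    return flag ? MAX_FITNESS : 0;
}
```

With the direct predicate, moving A from 3 to 2 raises the fitness from 7 to 8, so the search is guided towards 0. With the flag, both inputs score the same, and the search has no better option than random sampling.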

36

Flag Landscape

Large plateau of low fitness

Tiny plateau of high fitness

Transform program to transform fitness function to transform landscape

Better

37

Informally

A transformation is a partial function on programs which preserves meaning, of some kind or other

We need to pair the program and test adequacy criterion (call this the test pair)

A testability transformation is a partial function on test pairs such that...

38


Test data which is adequate for the transformed test pair is adequate for the original test pair
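One way to write this requirement down, as a sketch. The symbols are notation assumed here for illustration and do not appear on the slides: p and p' are programs, c and c' are adequacy criteria, T ranges over sets of test data, τ is the testability transformation, and adequate(T, p, c) holds when T satisfies c on p.

```latex
\tau(p, c) = (p', c')
\quad\Longrightarrow\quad
\forall T.\;\; \mathrm{adequate}(T, p', c') \;\Rightarrow\; \mathrm{adequate}(T, p, c)
```

Note that the implication runs in one direction only: nothing is required of test data that is adequate for the original pair.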

39

Testability Transformation Paradox

We are testing to cover structure

… but the structure is the problem

So we transform the program

… and this alters the structure

So we need to be careful:

Are we still testing according to the same criterion?

Our transformations will preserve coverage of statements, branches and MC/DC

Future work: define a semantics to verify this

40

This is not abstract interpretation

To preserve branch coverage:

if (e) skip; else skip;

cannot be transformed to

skip;

But the program

if (e) x=1; else x=2;

can be transformed to

if (e) skip; else skip;
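The point can be made concrete with branch-hit records. This is a minimal sketch with hypothetical names; the coverage struct stands in for whatever instrumentation a real tool would use.

```c
#include <stdbool.h>

/* Branch-hit records: index 0 = then-branch, index 1 = else-branch. */
struct coverage { bool hit[2]; };

/* Original program: if (e) x=1; else x=2; */
int original(bool e, struct coverage *cov)
{
    int x;
    if (e) { cov->hit[0] = true; x = 1; }
    else   { cov->hit[1] = true; x = 2; }
    return x;
}

/* Transformed program: if (e) skip; else skip;
   The assignments are gone, but both branches are still there. */
void transformed(bool e, struct coverage *cov)
{
    if (e) { cov->hit[0] = true; }
    else   { cov->hit[1] = true; }
}

/* True when both branches have been executed. */
bool full_branch_coverage(const struct coverage *cov)
{
    return cov->hit[0] && cov->hit[1];
}
```

Any set of values for e that covers both branches of the transformed program also covers both branches of the original, even though the two programs compute different things. Collapsing the conditional to a single skip would remove the branches and with them the coverage obligation.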

41

Flag TT

Transform the program to remove flags

Not always possible

but worth doing where possible

Our approach uses

Simple transformations (fairly well known)

Simple amorphous slicing (perhaps less well known)

Substitute flag use with definition

Brief overview of amorphous slicing...

42

Flag Removal Transformation

flag = n < 4;
…
if (n%2==0) flag = 0;
…
if (a[i] != '0' && flag)
...

Suppose we want to make flag true here

Suppose n is an unsigned integer

What initial values of n will achieve this?

The two assignments combine into one conditional assignment:

flag = (n%2==0)?0:(n<4);

With a temporary variable n′ holding the value of n:

n′ = n;
flag = (n′%2==0)?0:(n′<4);

The definition is then substituted for the use of flag:

if (a[i] != '0' && ((n′%2==0)?0:(n′<4)))

Once we have this we can keep the original flag assignment code

Claim: Adding the new flag assignment leaves adequacy criteria invariant
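A runnable sketch of this example, assuming n is unsigned as the slide suggests. The function names and n_saved (standing for the temporary copy of n that the slide introduces) are hypothetical.

```c
#include <stdbool.h>

/* Original flag computation: two separate assignments. */
bool flag_original(unsigned n)
{
    bool flag = n < 4;
    if (n % 2 == 0)
        flag = false;
    return flag;
}

/* Transformed version: one conditional expression over a saved copy of n. */
bool flag_transformed(unsigned n)
{
    unsigned n_saved = n;   /* the temporary introduced by the transformation */
    return (n_saved % 2 == 0) ? false : (n_saved < 4);
}

/* Counts values in [0, limit) where the two versions disagree. */
unsigned count_disagreements(unsigned limit)
{
    unsigned disagreements = 0;
    for (unsigned n = 0; n < limit; n++)
        if (flag_original(n) != flag_transformed(n))
            disagreements++;
    return disagreements;
}
```

The flag is true only when n is odd and less than 4, that is for n = 1 and n = 3. Once the expression appears in the predicate itself, a search can compute distances on n%2 and n<4 instead of seeing a bare boolean.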

43

Simple Flag Removal Algorithm

For loop free flag definition code

Bush

Blossom

Slice leaf sequences

Convert to conditional assignment

Add temporary variables

Substitute definition for use

44

[Figure: example tree with predicate nodes p, q, r and action nodes a, b, c, d]

Bushing produces a binary tree

45

[Figure: example tree with predicate nodes p, q, r and action nodes a, b, c, d after blossoming]

Blossoming moves all actions to the leaves

All internal nodes will be predicates

Original predicates may be altered

46

[Figure: example tree after repeated blossoming, with predicate nodes p, q, r and action sequences over a, b, c, d at the leaves]

Blossoming is repeated for all internal action nodes

Some leaf assignments are not to the flag variable

… these can be removed using slicing

Now all internal nodes are predicates

47

[Figure: example tree with predicate nodes p, q, r and leaf sequences over a, b, c, d after slicing]

Amorphous slicing gives single assignment at leaves

Now all leaves are single assignments

… and they all assign to the flag variable

Some predicates change several times during blossoming

Sometimes syntax-preserving slicing does this too

But sometimes syntax-preserving slicing leaves several assignments

We assumed freedom from side effects

Fortunately we have a side effect removal transformation

48

Initial Empirical Analysis

Daimler ran their Evolutionary Test Data Generator on both versions

Collected coverage information for 6 runs

49

/* date correction for september 1752 */
if (special_days)
    result = "Day did not exist.";
else if (leapflag && is_september && day>13)
    result = dayName((addMonths(month,year)+(--day)+firstJanuary(year)+10)%7);
else ...

/* date correction for september 1752 */
if (year==1752 && month==9 && day>=3 && day<=13)
    result = "Day did not exist.";
else if (year==1752 && month==9 && day>13)
    result = dayName((addMonths(month,year)+(--day)+firstJanuary(year)+10)%7);
else ...

Special Values

Remove flags

With flags we never even got up here

With flags it took longer to get anywhere

50

Nothing special flag

returnflag = (a==0 || b==0 || c==0) || a>10000 || b>10000 || c>10000 || (c>=a+b) || (a>=b+c) || (b>=a+c);
...
if (returnflag) return;
...

Remove flags

if ((a==0 || b==0 || c==0) || a>10000 || b>10000 || c>10000 || (c>=a+b) || (a>=b+c) || (b>=a+c))
    return;

There is very little difference

Is it hard to find values which make this flag true?

Is it hard to find values which make this flag false?

To have a problem we need a flag for which few inputs make it true/false

Let’s look at a bad flag problem ...

51

Special Value Flag

returnflag = (a==99999 || b==99999 || c==99999);
...
if (returnflag) return;
...

if (a==99999 || b==99999 || c==99999) return;

There are relatively few values which make the flag true

So we seldom randomly hit this statement

This predicate has a smoother landscape … than this one

So we get better coverage

… at less cost
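Why the flag-free predicate has the smoother landscape can be sketched with a distance function. Two assumptions are made here that the slides do not state: distance is minimised (zero means the branch is taken), and a disjunction is treated as being as close as its closest operand. The names are hypothetical.

```c
#include <stdlib.h>

#define SPECIAL 99999L

static long min_long(long x, long y) { return x < y ? x : y; }

/* Distance to making (a==99999 || b==99999 || c==99999) true.
   Assumption: a disjunction is as close as its closest operand. */
long distance_direct(long a, long b, long c)
{
    long da = labs(a - SPECIAL);
    long db = labs(b - SPECIAL);
    long dc = labs(c - SPECIAL);
    return min_long(da, min_long(db, dc));
}

/* Distance when only the boolean returnflag is visible at the branch:
   0 if the flag is true, otherwise a constant with no gradient. */
long distance_flag(long a, long b, long c)
{
    int returnflag = (a == SPECIAL || b == SPECIAL || c == SPECIAL);
    return returnflag ? 0 : 1;
}
```

For the input (99990, 5, 7) the direct predicate reports a distance of 9, so a search knows that increasing a helps. The flag version reports the same value for that input as for (0, 0, 0).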

52

Disposable Transformations

We generate test data using the transformed program

because it is easier

… then throw away the transformed program

Transformation as a means to an end not an end in itself

Do the transformations even need to preserve meaning?

This is a radically new form of transformation

53

Conclusion

Test data generation is hard

… anything which helps is good

Test data generation can be impeded by structure

… so transform the structure

We have to be sure to preserve branch coverage

… but not traditional meaning

This suggests a new kind of transformation


54

References

• David Binkley and Mark Harman. Analysis and Visualization of Predicate Dependence on Formal Parameters and Global Variables. IEEE Transactions on Software Engineering. (journal version of ICSE 2003 paper, to appear)

• Mark Harman, Lin Hu, Rob Hierons, Joachim Wegener, Harmen Sthamer, Andre Baresel and Marc Roper. Testability Transformation. IEEE Transactions on Software Engineering, 30(1): 3-16, 2004.

• David Binkley and Mark Harman. An Empirical Study of Predicate Dependence Levels and Trends. 25th IEEE/ACM International Conference on Software Engineering (ICSE 2003), 3-10 May 2003, Portland, Oregon, USA, pages 330-339.

Electronic copies of these papers are available on my website.
