new results in program slicing

Post on 30-Dec-2015



1

New Results in Program Slicing

Aharon Abadi, Ran Ettinger, and Yishai Feldman

IBM Haifa Research Lab

2

Context

• The Programmer’s Apprentice
  – The Plan Calculus

• Bogart

• Midas

• Sliding

• Painless
  – Paz
  – Aderet

3

Improving Slice Accuracy by Compression of Data and Control Flow Paths

Presented at ESEC/FSE 2009

4

Program Slicing

[Figure: a program and its slice, each running from Start to the statement x := exp]

At x := exp, the slice computes the same sequence of values as the program.

5

Control-Flow Path Compression

Work in two stages:
- Compute the ‘traditional’ slice
  - Control dependences
  - Data dependences
- Compute the necessary branches to prevent infeasible control paths

[Figure: an assembly program next to its slice. Fragments as extracted: test X; if-zero-go-to A; . . .; L: test Y; if-zero-; . . .; go-to L; B: . . .; go-to B; A: Z0]

6

Control-Flow Path Compression

Limitations of previous approaches:
- insert all the loop;
- add branches not from the program; or
- do not preserve behavior

This algorithm:
- preserves behavior
- yields a sub-program
  - one version may turn conditional branches into unconditional ones (“rhetorization”)

[Figure: the same assembly program and its slice. Fragments as extracted: test X; if-zero-go-to A; . . .; L: test Y; if-zero-; . . .; go-to L; B: . . .; go-to B; A: Z0]

7

Data-Flow Path Compression

The value of R7 does not depend on the loop

Start: R2:=0
       R7:=exp1
Loop:  R2:=R2 + 1
       compare R2, R9
       if-not-less-go-to Out
       use R7
       Temp:=R7          ; spill R7 to memory
       …                 ; code that uses all registers
       R7:=Temp          ; restore R7
       go-to Loop
Out:   R0:=R7 + 1

Slice:
       R7:=exp1
Out:   R0:=R7 + 1

Previous syntax-preserving algorithms insert the loop and the assignments inside it:

Start: R2:=0
       R7:=exp1
Loop:  R2:=R2 + 1
       compare R2, R9
       if-not-less-go-to Out
       Temp:=R7
       R7:=Temp
       go-to Loop
Out:   R0:=R7 + 1

The result is too large
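The observation that R7 and Temp only carry the value of exp1 around the loop can be illustrated with a small copy-chain resolver. This is a minimal sketch under stated assumptions, not the plan-based algorithm presented in the talk: the class `CopyChainSketch`, its methods, and the encoding of definitions as strings are all hypothetical, and the reaching definitions are written down by hand rather than computed.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration: follow copy assignments back to the real computations.
final class CopyChainSketch {

    /** For each definition: the definitions it copies from (empty list = a real computation). */
    static Map<String, List<String>> exampleCopies() {
        Map<String, List<String>> copiesFrom = new HashMap<>();
        copiesFrom.put("R7:=exp1", Collections.<String>emptyList());
        // Temp:=R7 sees R7:=exp1 on the first iteration and R7:=Temp afterwards.
        copiesFrom.put("Temp:=R7", Arrays.asList("R7:=exp1", "R7:=Temp"));
        copiesFrom.put("R7:=Temp", Arrays.asList("Temp:=R7"));
        return copiesFrom;
    }

    /** Returns the non-copy definitions that can supply the value of the given reaching definitions. */
    static Set<String> origins(Map<String, List<String>> copiesFrom, List<String> reachingDefs) {
        Set<String> result = new TreeSet<>();
        Set<String> visited = new HashSet<>();
        Deque<String> work = new ArrayDeque<>(reachingDefs);
        while (!work.isEmpty()) {
            String def = work.pop();
            if (!visited.add(def)) {
                continue; // the loop makes the copy chain cyclic
            }
            List<String> sources = copiesFrom.get(def);
            if (sources.isEmpty()) {
                result.add(def);
            } else {
                for (String source : sources) {
                    work.push(source);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Definitions of R7 that reach "Out: R0:=R7 + 1".
        List<String> reachingOut = Arrays.asList("R7:=exp1", "R7:=Temp");
        System.out.println(origins(exampleCopies(), reachingOut)); // prints [R7:=exp1]
    }
}
```

Because every chain ends at R7:=exp1, a slice on R0 at Out needs only that assignment, not the loop that shuffles the value between R7 and Temp.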

8

Control-Flow Path Compression

[Figure: control-flow graph of the example program, with nodes x<11, x:=x+1, y<T, y:=y-1, x<9, x:=x-1, x:=x+2, print(x), T/F branch edges, and goto edges to A1, A2, A3, A4]

if (x<11)

x := x+1

goto A2

A1: if (y<T)

y := y–1

goto A1

goto A2

goto A4

x := x-1

A4: if (x<9) goto A3

A3: x := x+2

A2: print(x)

9

Compute the ‘Traditional’ Slice

[Figure: control-flow graph of the example program, with nodes x<11, x:=x+1, y<T, y:=y-1, x<9, x:=x-1, x:=x+2, print(x), T/F branch edges, and goto edges to A1, A2, A3, A4]

if (x<11)

x := x+1

goto A2

A1: if (y<T)

y := y–1

goto A1

goto A2

goto A4

x := x-1

A4: if (x<9) goto A3

A3: x := x+2

print(x)

A2: print(x)

[Figure: dependence graph for the slice on print(x), with nodes x:=x+1, x:=x+2, x:=x-1, x<11, x<9, y<T]
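The first stage, computing the ‘traditional’ slice, is commonly described as a backward closure over data and control dependences starting from the slicing criterion. The following is a minimal sketch of that closure only. The class `TraditionalSliceSketch` and the six-statement program in `exampleDependences` are hypothetical and are not the example program from the slides; the dependence edges are supplied by hand.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration: a slice as backward reachability over dependence edges.
final class TraditionalSliceSketch {

    /** Maps each statement to the statements it is data- or control-dependent on. */
    static Map<String, List<String>> exampleDependences() {
        Map<String, List<String>> dependsOn = new HashMap<>();
        dependsOn.put("print(x)", Arrays.asList("x:=x+1", "x:=0"));  // data
        dependsOn.put("x:=x+1", Arrays.asList("if x<11", "x:=0"));   // control, data
        dependsOn.put("if x<11", Arrays.asList("x:=0"));             // data
        dependsOn.put("y:=y-1", Arrays.asList("y:=5"));              // data, irrelevant to x
        return dependsOn;
    }

    /** Collects every statement the criterion transitively depends on, including itself. */
    static Set<String> backwardSlice(Map<String, List<String>> dependsOn, String criterion) {
        Set<String> slice = new TreeSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(criterion);
        while (!work.isEmpty()) {
            String statement = work.pop();
            if (slice.add(statement)) {
                for (String dep : dependsOn.getOrDefault(statement, Collections.<String>emptyList())) {
                    work.push(dep);
                }
            }
        }
        return slice;
    }

    public static void main(String[] args) {
        System.out.println(backwardSlice(exampleDependences(), "print(x)"));
    }
}
```

In this toy program the statements on y never enter the slice on print(x). What the closure does not decide is which jumps are needed to connect the selected statements, which is the job of the second stage.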

10

Completing Control Flow Paths: Main Lemma

• precisely identifies the possible sets of branches that may be added to the slice

• any path in the original program can be chosen

• optimizations can be performed

All paths from the same point in the slice enter the slice at a single point
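The lemma's claim can be checked on a toy control-flow graph: starting from a point just outside the slice, follow every path through non-slice nodes and record the first slice node each path reaches. The sketch below is only an illustration under assumptions. The class `SliceEntrySketch`, the node names, and the graph are hypothetical, and the code tests the property on one graph rather than implementing the path-completion algorithm.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration: where do paths that leave the slice re-enter it?
final class SliceEntrySketch {

    /** Successor lists of a toy control-flow graph; "a" and "b" form a loop outside the slice. */
    static Map<String, List<String>> exampleCfg() {
        Map<String, List<String>> succ = new HashMap<>();
        succ.put("test", Arrays.asList("a", "c"));
        succ.put("a", Arrays.asList("b"));
        succ.put("b", Arrays.asList("a", "use"));
        succ.put("c", Arrays.asList("use"));
        return succ;
    }

    /** Assumed slice for the toy graph. */
    static Set<String> exampleSlice() {
        return new HashSet<>(Arrays.asList("test", "use"));
    }

    /** First slice nodes reachable from the given node along paths through non-slice nodes. */
    static Set<String> firstSliceNodes(Map<String, List<String>> succ, Set<String> slice, String from) {
        Set<String> found = new TreeSet<>();
        Set<String> visited = new HashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(from);
        while (!work.isEmpty()) {
            String node = work.pop();
            if (!visited.add(node)) {
                continue;
            }
            if (slice.contains(node)) {
                found.add(node); // stop here: the path has re-entered the slice
                continue;
            }
            for (String next : succ.getOrDefault(node, Collections.<String>emptyList())) {
                work.push(next);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(firstSliceNodes(exampleCfg(), exampleSlice(), "a")); // prints [use]
        System.out.println(firstSliceNodes(exampleCfg(), exampleSlice(), "c")); // prints [use]
    }
}
```

Since both branches of test re-enter the slice at the single node use, any path between them may be kept, for example the short one through c instead of the loop through a and b.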

11

Compute the Necessary Branches

[Figure: control-flow graph of the example program, with nodes x<11, x:=x+1, y<T, y:=y-1, x<9, x:=x-1, x:=x+2, print(x), T/F branch edges, and goto edges to A1, A2, A3, A4]

if(x<11)

x:=x+1

goto A2

A1: if(y<T)

y:=y–1

goto A1

goto A2

goto A4

x:=x-1

A4: if(x<9) goto A3

A3: x:=x+2

A2: print(x)

12

Data-Flow Path Compression

Start: R2:=0
       R7:=exp1
Loop:  R2:=R2 + 1
       compare R2, R9
       if-not-less-go-to Out
       use R7
       Temp:=R7          ; spill R7 to memory
       …                 ; code that uses all registers
       R7:=Temp          ; restore R7
       go-to Loop
Out:   R0:=R7 + 1

Slice:
       R7:=exp1
Out:   R0:=R7 + 1

[Figure: flow graph of the program, with nodes R2:=0, R7:=exp1, R2:=R2+1, compare R2,R9, if-not-less, use R7, Temp:=R7, R7:=Temp, goto Loop, go-to Out, R0:=R7+1, exit]

13

Data-Flow Path Compression

• R7, Temp carry the value of exp1
• Use data edges instead of variables
  – out data port holds the last value
  – in data port holds the next value
• The Plan Calculus: The Programmer’s Apprentice, Rich and Waters, 1990

Start: R2:=0
       R7:=exp1
Loop:  R2:=R2 + 1
       compare R2, R9
       if-not-less-go-to Out
       use R7
       Temp:=R7          ; spill R7 to memory
       …                 ; code that uses all registers
       R7:=Temp          ; restore R7
       go-to Loop
Out:   R0:=R7+1

[Figure: the flow graph of the program next to its plan, in which exp1, 0, and ++ nodes are connected by data edges d1 and d2]

14

Start: R2:=0
       R7:=exp1
Loop:  R2:=R2 + 1
       compare R2, R9
       if-not-less-go-to Out
       use R7
       Temp:=R7          ; spill R7 to memory
       …                 ; code that uses all registers
       R7:=Temp          ; restore R7
       go-to Loop
Out:   R0:=R7 + 1

Slice:
       R7:=exp1
Out:   R0:=R7 + 1

[Figure: plan of the program, with nodes entry, exp1, 0, ++, compare R2,R9 (T/F), use R7, exit, and data edges labeled R7, R0, R9, R2]

15

Decompression

Start: R2:=0
Loop:  R2:=R2 + 1
       compare R2, R9
       if-not-less-
       use R7
                         ; spill R7 to memory
       …                 ; code that uses all registers
                         ; restore R7
       go-to Loop
Out:   R0:=R7 + 1

[Figure: plan of the program, with nodes entry, exp1, 0, ++, compare R2,R9 (T/F), use R7, exit, and data edges labeled R7, R0, R9, R2, mapped back to the statements R7:=exp1, Temp:=R7, R7:=Temp, go-to Out, and Out: R0:=R7 + 1]

16

Properties of the Slices

• Syntax preserving, possibly rhetorizing
• Behavior preserving
• Executable
• For structured programs
  – At least as accurate as previous algorithms
  – Strictly smaller in interesting cases
• For unstructured programs
  – Empirically shown to be superior
  – Modification of the algorithm guaranteed at least as accurate

17

Implementation

• A family of slicing algorithms
  – rhetorizing (*RB, *RM)
  – strictly syntax-preserving (*PB, *PM)
  – amorphous (*AB, *AM)
    • adds new branches (not from the program)

[Figure: examples of added branches, as extracted: "A1: if(y<T) goto A2", and an assembly program: test X; if-zero-go-to A; . . .; L: test Y; if-zero-go-to B; . . .; go-to L; B: go-to C; C: go-to exit; A: Z 0; . . . goto exit]

18

Empirical Study

• Corpus of 15 manually-written assembly-language modules from a large mainframe product

• 8578 non-comment source lines

• Computed slices from all lines

• 5801 non-empty slices

19

Empirical Results

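[Table flattened in extraction. Columns: Effect of; % slices better; % average decrease; % slices worse; % average decrease. Row values are shown as extracted, with column boundaries lost.]

Rhetorization: 177.5

Control path compression
  Lenient BH: 3017
  Strict BH: 9465

Data path compression
  implemented: 124815
  modified: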

20

Related Work

Behavior
  Preserve behavior: BH, CF1, Ag, HLB, *P, *R, *A
  May add infinite loops: HLB, HD
  Not executable: KH

Subset of the original program (for flat languages)
  Syntax-preserving: BH, CF1, Ag, HD, HLB, *P
  Rhetorizing: *R
  Amorphous: HLB, CF, *A

Comparison to traditional algorithm on structured programs
  Smaller than traditional: *P, *R, *A
  Equal to traditional / Larger than traditional (column boundary lost): BH, CF1, Ag, HD, KH, HLB, CF2

BH: Ball & Horwitz 1993
CF: Choi & Ferrante 1994
Ag: Agrawal 1994
KH: Kumar & Horwitz 2002
HD: Harman & Danicic 1998
HLB: Harman, Lakhotia & Binkley 2006

21

Conclusions

• Two techniques for reducing slice size
  – Control-Flow Path Compression
    • Precise identification of all correct solutions
    • Shortest paths significantly improve slice accuracy
      – 17-22% improvement for 30-37% of the cases
  – Data-Flow Path Compression
    • Eliminates copy assignments
    • Yields significant improvement in a few cases
      – 24% improvement for 1% of the slices computed
    • Strictly smaller even for structured programs

22

Fine Slicing for Program Transformation

23

Refactoring’s Rubicon: Extract Method

• Automating Extract Method is Refactoring’s Rubicon (Fowler*)
  – The one that demonstrates “serious tool support”
  – Precondition for many other transformations

• This Rubicon has not yet been crossed
  – Getting it right requires more analysis capability than is available in current IDEs

*http://www.martinfowler.com/articles/refactoringRubicon.html

24

Fowler’s Example (website)

void printOwing() {
  printBanner();

  //print details
  System.out.println("name: " + _name);
  System.out.println("amount " + getOutstanding());
}

void printOwing() {
  printBanner();
  printDetails(getOutstanding());
}

void printDetails(double outstanding) {
  System.out.println("name: " + _name);
  System.out.println("amount " + outstanding);
}

25

A Case Study in Enterprise Refactoring

• Converted a Java Servlet to use the MVC pattern*

• Used as much automated support as available
  – The whole conversion could be described as a series of cataloged (“small”) refactorings
  – Most steps were inadequately supported by the IDE
  – Some were not supported at all

* Based on Alex Chaffee’s “Refactoring to Model-View-Controller” article (http://www.purpletech.com/articles/mvc/refactoring-to-mvc.html)

26

Case-Study: Automation (1)

Fully Supported Refactorings                    Uses
Extract Method                                  3
Extract Temp                                    3
(Self) Encapsulate Field                        2
Replace Magic Number with Symbolic Constant     1
Inline Temp                                     1
Extract Superclass                              1
Delete Methods                                  1
Move Method                                     1
Total                                           13

27

Case-Study: Automation (2)

Partial (*) or No (**) Support                  Uses
Extract Method *                                10
Substitute Expression **                        5
Replace Temp with Query *                       3
Replace Method with Method Object **            2
Substitute Statement **                         1
Extract Class *                                 1
Move Statement (or Swap Statements) **          1
Total                                           23

28

Currently Unsupported Cases of Extract Method

(a) Extract multiple fragments

(b) Extract a partial fragment
  – select sub-expressions as parameters

(c) Extract loop with partial body
  – loop duplication with data flow

(d) Extract code with conditional exits

Program slicing pulls related code together!

29

slice (v.): to cut with or as if with a knife

Merriam-Webster

slice (n.): a thin flat piece cut from something

30

A (backward) slice of a given program with respect to selected “interesting” variables is a subprogram that computes the same values as the original program for the selected variables

A (backward) fine slice of a given program with respect to selected “interesting” variables and other “oracle” variables is a subprogram that computes the same values as the original program for the selected variables, given values for the oracle variables
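The two definitions can be contrasted on a small Java method. The sketch below is hand-made and hypothetical: the class `FineSliceSketch`, the method names, and the choice of `total` as the interesting variable with `start` and `end` as oracle variables are assumptions made for illustration, and the slices were written by hand rather than computed by a slicing tool.

```java
// Hypothetical illustration of a slice and a fine slice of the same method.
final class FineSliceSketch {

    /** Original program: returns {total, max} for one page of 20 entries. */
    static int[] original(int[] data, int page) {
        int start = page * 20;
        int end = Math.min(start + 20, data.length);
        int total = 0;
        int max = Integer.MIN_VALUE;
        for (int i = start; i < end; i++) {
            total += data[i];
            max = Math.max(max, data[i]);
        }
        return new int[] {total, max};
    }

    /** Backward slice on total: the statements that only serve max are gone. */
    static int sliceOnTotal(int[] data, int page) {
        int start = page * 20;
        int end = Math.min(start + 20, data.length);
        int total = 0;
        for (int i = start; i < end; i++) {
            total += data[i];
        }
        return total;
    }

    /** Fine slice on total that ignores the dependences on start and end: both arrive as oracle values. */
    static int fineSliceOnTotal(int[] data, int start, int end) {
        int total = 0;
        for (int i = start; i < end; i++) {
            total += data[i];
        }
        return total;
    }

    /** Sample input: data[i] == i for i in 0..29. */
    static int[] sampleData() {
        int[] data = new int[30];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        return data;
    }

    public static void main(String[] args) {
        int[] data = sampleData();
        System.out.println(original(data, 1)[0]);          // prints 245
        System.out.println(sliceOnTotal(data, 1));         // prints 245
        System.out.println(fineSliceOnTotal(data, 20, 30)); // prints 245
    }
}
```

The fine slice agrees with the original only when the caller supplies the values that start and end would have had, which is exactly the role of the oracle in the definition.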

31

Fine Slicing

• A generalization of traditional program slicing
• Fine slices can be precisely bounded
  – Slicing criteria include set of data and control dependences to ignore
• Fine slices are executable and extractable
• Complement slices (co-slices) are also fine slices
• Oracle-based semantics for fine slices
• Algorithm for computing data-structure representing the oracle
• Forward fine slices are executable, may be slightly larger than traditional forward slices
• Confines generalize blocks for unstructured programs

32

Extract Computation

• A new refactoring

• Extracts a fine slice into contiguous code

• Computes the co-slice

• Computation can then be extracted into a separate method using Extract Method

• Passes necessary “oracle” variables between slice and co-slice

• Generates new containers if series of values need to be passed

33

(a) Extract multiple fragments

User user = getCurrentUser(request);

if (user == null) {

response.sendRedirect(LOGIN_PAGE_URL);

return;

}

response.setContentType("text/html");

disableCache(response);

String albumName = request.getParameter("album");

PrintWriter out = response.getWriter();

34

(b) Extract a partial fragment

out.println(DOCTYPE_HTML);

out.println("<html>");

out.println("<head>");

out.println("<title>Error</title>");

out.println("</head>");

out.print("<body><p class='error'>");

out.print("Could not load album '" +

albumName + "'");

out.println("</p></body>");

out.println("</html>");

35

(c) Extract loop with partial body

 1    out.println("<table border=0>");
 2    int start = page * 20;
 3    int end = start + 20;
 4    end = Math.min(end,
 5                   album.getPictures().size());
 6    for (int i = start; i < end; i++) {
 7      Picture picture = album.getPicture(i);
 8      printPicture(out, picture);
 9    }
10    out.println("</table>");

36

 2    int start = page * 20;
 3    int end = start + 20;
 4    end = Math.min(end,
 5                   album.getPictures().size());
***   Queue<Picture> pictures =
***       new LinkedList<Picture>();
 6    for (int i = start; i < end; i++) {
 7      Picture picture = album.getPicture(i);
***     pictures.add(picture);
 9    }
 1    out.println("<table border=0>");
 6    for (int i = start; i < end; i++)
 8      printPicture(out, pictures.remove());
10    out.println("</table>");
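The transformation on this slide can be tried out in a self-contained form. The sketch below is an approximation under assumptions: pictures are plain strings, the album is a `List<String>`, `out` is a `StringBuilder` instead of the servlet writer, and `printPicture`, `sampleAlbum`, and the class `ExtractLoopSketch` are hypothetical stand-ins. It checks that splitting the loop and passing the pictures through a queue leaves the output unchanged.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

// Hypothetical illustration of extracting a loop with a partial body via a container.
final class ExtractLoopSketch {

    /** Stand-in for the servlet's printPicture. */
    static void printPicture(StringBuilder out, String picture) {
        out.append("<tr><td>").append(picture).append("</td></tr>\n");
    }

    /** Original code: selection and display are tangled in one loop. */
    static String original(List<String> album, int page) {
        StringBuilder out = new StringBuilder();
        out.append("<table border=0>\n");
        int start = page * 20;
        int end = start + 20;
        end = Math.min(end, album.size());
        for (int i = start; i < end; i++) {
            String picture = album.get(i);
            printPicture(out, picture);
        }
        out.append("</table>\n");
        return out.toString();
    }

    /** Co-slice: selects the pictures, stores them in a queue, then calls the extracted method. */
    static String transformed(List<String> album, int page) {
        StringBuilder out = new StringBuilder();
        int start = page * 20;
        int end = start + 20;
        end = Math.min(end, album.size());
        Queue<String> pictures = new LinkedList<>();
        for (int i = start; i < end; i++) {
            String picture = album.get(i);
            pictures.add(picture);
        }
        display(out, start, end, pictures);
        return out.toString();
    }

    /** Extracted fine slice: the display loop consumes the queued pictures in order. */
    static void display(StringBuilder out, int start, int end, Queue<String> pictures) {
        out.append("<table border=0>\n");
        for (int i = start; i < end; i++) {
            printPicture(out, pictures.remove());
        }
        out.append("</table>\n");
    }

    /** Sample album with entries pic0 .. pic(n-1). */
    static List<String> sampleAlbum(int n) {
        List<String> album = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            album.add("pic" + i);
        }
        return album;
    }

    public static void main(String[] args) {
        List<String> album = sampleAlbum(25);
        System.out.println(original(album, 1).equals(transformed(album, 1))); // prints true
    }
}
```

A FIFO queue is the natural container here because the display loop must see the pictures in the order in which the selection loop produced them.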

37

(d) Extract code with conditional exits

if (album == null) {

new ErrorPage("Could not load album '"

+ album.getName() + "'").printMessage(out);

return;

}

//...

38

if (invalidAlbum(album, out))

return;

//...

boolean invalidAlbum(Album album,

PrintWriter out) {

boolean invalid = album == null;

if (invalid) {

new ErrorPage("Could not load album '"

+ album.getName() + "'").printMessage(out);

}

return invalid;

}
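The pattern for case (d), turning a conditional exit into a boolean result that the caller tests, can be exercised in a self-contained sketch. Everything here is a hypothetical stand-in: the class `ConditionalExitSketch`, the use of `String` for the album, and a `StringBuilder` in place of the writer and `ErrorPage`. As a further assumption that differs from the slide's code, the album name is passed in separately, so that the error message does not call a method on the null album.

```java
// Hypothetical illustration of extracting code that contains a conditional exit.
final class ConditionalExitSketch {

    /** Before extraction: the error branch returns from the middle of the method. */
    static String before(String album, String albumName) {
        StringBuilder out = new StringBuilder();
        if (album == null) {
            out.append("Could not load album '").append(albumName).append("'");
            return out.toString();
        }
        out.append("Showing ").append(album);
        return out.toString();
    }

    /** Extracted method: reports through its result whether the caller must exit. */
    static boolean invalidAlbum(String album, String albumName, StringBuilder out) {
        boolean invalid = album == null;
        if (invalid) {
            out.append("Could not load album '").append(albumName).append("'");
        }
        return invalid;
    }

    /** After extraction: the exit stays in the caller, guarded by the extracted test. */
    static String after(String album, String albumName) {
        StringBuilder out = new StringBuilder();
        if (invalidAlbum(album, albumName, out)) {
            return out.toString();
        }
        out.append("Showing ").append(album);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(after(null, "holiday"));       // prints Could not load album 'holiday'
        System.out.println(after("Holiday", "holiday"));  // prints Showing Holiday
    }
}
```

The return statement itself cannot move into the extracted method, so the extracted code hands back the branch condition and the caller keeps the exit.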

39

Token Semantics

out.println("<table border=0>");
int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
for (int i = start; i < end; i++) {
  Picture picture = album.getPicture(i);
  printPicture(out, picture);
}
out.println("</table>");

[Figure: plan of the code above, with nodes entry, println, *, +, min, getPictures, size, >, ++, getPicture, printPicture, exit, data edges labeled out, album, page, start, end, i, and tokens p1 and p2]

40

Fine Slicing

out.println("<table border=0>");
int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
for (int i = start; i < end; i++) {
  Picture picture = album.getPicture(i);
  printPicture(out, picture);
}
out.println("</table>");

[Figure: plan of the code above, with nodes entry, println, *, +, min, getPictures, size, >, ++, getPicture, printPicture, exit, and data edges labeled out, album, page, start, end, i; printPicture is the slicing criterion]

41

The Fine Slice

out.println("<table border=0>");
for (int i = start; i < end; i++) {
  printPicture(out, picture);
}
out.println("</table>");

[Figure: plan of the fine slice, with nodes entry, println, >, ++, printPicture, exit, and data edges labeled out, i, start, end, picture]

42

Co-Slicing

out.println("<table border=0>");
int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
for (int i = start; i < end; i++) {
  Picture picture = album.getPicture(i);
  printPicture(out, picture);
}
out.println("</table>");

[Figure: plan of the code above, with nodes entry, println, *, +, min, getPictures, size, >, ++, getPicture, printPicture, exit, and data edges labeled out, album, page, start, end, i]

43

The Co-Slice

int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
for (int i = start; i < end; i++) {
  Picture picture = album.getPicture(i);
}

[Figure: plan of the co-slice, with nodes entry, *, +, min, getPictures, size, >, ++, getPicture, exit, and data edges labeled album, page, start, end, i, out, picture]

44

Fine slice and co-slice

[Figure: the plan of the fine slice (nodes entry, println, >, ++, printPicture, exit) next to the plan of the co-slice (nodes entry, *, +, min, getPictures, size, >, ++, getPicture, exit), connected by the data edges out, start, end, and picture]

45

Adding a Container

out.println("<table border=0>");
int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
Queue<Picture> pictures = new LinkedList<Picture>();
for (int i = start; i < end; i++) {
  Picture picture = album.getPicture(i);
  pictures.add(picture);
  printPicture(out, pictures.remove());
}
out.println("</table>");

[Figure: plan of the code above, with nodes entry, println, *, +, min, getPictures, size, >, ++, getPicture, new, add, remove, printPicture, exit, and data edges labeled out, album, page, start, end, i, picture, pictures]

46

The Fine Slice

void display(PrintWriter out, int start, int end, Queue<Picture> pictures) {
  out.println("<table border=0>");
  for (int i = start; i < end; i++) {
    printPicture(out, pictures.remove());
  }
  out.println("</table>");
}

[Figure: plan of the fine slice, with nodes entry, println, <, ++, remove, printPicture, exit, and data edges labeled out, start, end, i, picture, pictures]

47

Program with Container

out.println("<table border=0>");
int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
Queue<Picture> pictures = new LinkedList<Picture>();
for (int i = start; i < end; i++) {
  Picture picture = album.getPicture(i);
  pictures.add(picture);
  printPicture(out, pictures.remove());
}
out.println("</table>");

[Figure: plan of the code above, with nodes entry, println, *, +, min, getPictures, size, >, ++, getPicture, new, add, remove, printPicture, exit, and data edges labeled out, album, page, start, end, i, picture, pictures]

48

The Co-Slice

int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
Queue<Picture> pictures = new LinkedList<Picture>();
for (int i = start; i < end; i++) {
  Picture picture = album.getPicture(i);
  pictures.add(picture);
}
display(out, start, end, pictures);

[Figure: plan of the co-slice, with nodes entry, *, +, min, getPictures, size, >, ++, getPicture, new, add, display, exit, and data edges labeled out, album, page, start, end, i, picture, pictures]

49

Conclusions

• Fine slicing algorithm yields executable slices whose boundaries can be precisely controlled

• Can be used to make any subset of a program executable by adding some control structures but not the data on which they depend
  – including forward slices, thin slices, barrier slices, chops, and barrier chops
  – Conjecture: the size of these executable programs will not be substantially larger

50

Conclusions

• New Extract Computation refactoring is an important step towards the automation of Extract Method in difficult cases
  – Enables the automation of big refactorings from smaller building blocks

• Uses new fine-slicing algorithm
• Automatically computes complement slices (co-slices)
• Automatically generates containers to pass series of values if necessary

51

Related Work (I): Non-Executable Slices

• Traditional backward slicing (e.g., Weiser [ICSE81] or Ottenstein & Ottenstein [PSDE84]), when applied to unstructured code
  – Solved by path-completion stage in plan-based slicing (Abadi, Ettinger & Feldman [FSE09])

• Forward slicing (Horwitz, Reps & Binkley [TOPLAS90])
• Barrier slicing (Krinke [SCAM03])
• Chopping (Jackson & Rollins [FSE94]) and Barrier Chopping (Krinke [SCAM03])
• Thin slicing (Sridharan, Fink & Bodik [PLDI07])
• All the above can be made executable with an appropriate oracle, by adding the required control structure

52

Related Work (II): Executable Slices with Reduced Scope or Size

• Block-based slicing (Maruyama [SSR01]): structured code only, no correctness proof

• Co-slicing (Ettinger's thesis, Oxford 2006): limited to slicing from the end and oracle of final values only; proof on toy language

• Parametric slicing (Field, Ramalingam & Tip [POPL95]): an executable generalization of static and dynamic slices; like oracle semantics, they formalize programs with holes; however, their holes stand for expressions whose values are irrelevant, while our holes stand for significant (oracle) values

• Some forms of dynamic and forward slicing are executable (Binkley et al. [SCAM04]): forward slices made excessively large through the addition of backward slices

53

Related Work (III): Behavior-Preserving Procedure Extraction

• Contiguous code
  – Bill Opdyke's thesis (UIUC 1992): for C++
  – Griswold and Notkin [ToSE93]: for Scheme

• Arbitrary selections
  – Tucking (Lakhotia & Deprez [IST98]): the complement is a slice too; no dataflow from the extracted slice to its complement yields over-duplication; strong preconditions (e.g., no global variables involved, and no live-on-exit variable defined in both the slice and complement)

– Semantics-Preserving Procedure Extraction (Komondoor & Horwitz [POPL00]): considers all permutations of selected and surrounding statements; no duplication allowed; not practical (exponential time complexity); very strong preconditions

– Effective Automatic Procedure Extraction (Komondoor & Horwitz [IWPC03]): improves on their previous algorithm by improving complexity (cubic time and space), allowing some duplication (of conditionals and jumps); might miss some correct permutations; no duplication of assignments or loops; allows dataflow from complement to extracted code and from extracted code to (the second portion of the) complement; supports extraction of returns

– Extraction of block-based slices (Maruyama [SSR01]): extracts a slice of one variable only; restricted to structured code; no proof given

– Ettinger's thesis (Oxford 2006): sliding transformation sequentially composes a slice and its complement, allowing dataflow from the former to the latter; supports loop untangling and duplication of assignments; restricted to slicing from the end, and only final values from the extracted slice can be reused in the complement; proof for toy language
