New Results in Program Slicing

Aharon Abadi, Ran Ettinger, and Yishai Feldman, IBM Haifa Research Lab


Page 1: New Results in Program Slicing

New Results in Program Slicing

Aharon Abadi, Ran Ettinger, and Yishai Feldman

IBM Haifa Research Lab

Page 2: New Results in Program Slicing

Context

• The Programmer’s Apprentice
   – The Plan Calculus
• Bogart
• Midas
• Sliding
• Painless
   – Paz
   – Aderet

Page 3: New Results in Program Slicing


Improving Slice Accuracy by Compression of Data and Control Flow Paths

Presented at ESEC/FSE 2009

Page 4: New Results in Program Slicing

Program Slicing

[Figure: a program and its slice, each running from Start to the statement x := exp; at that statement the slice computes the same sequence of values as the program.]

Page 5: New Results in Program Slicing

Control-Flow Path Compression

Work in two stages:
- Compute the ‘traditional’ slice
   - Control dependences
   - Data dependences
- Compute the necessary branches to prevent infeasible control paths

[Figure: an unstructured example program (test X; if-zero-go-to A; a loop at label L that tests Y and ends with go-to L; labels A and B) shown beside its slice.]

Page 6: New Results in Program Slicing

Control-Flow Path Compression

Limitations of previous approaches:
- include the entire loop;
- add branches that are not from the program; or
- do not preserve behavior

This algorithm:
- preserves behavior
- yields a sub-program
- one version may turn conditional branches into unconditional ones (“rhetorization”)

[Figure: the example program from the previous slide (test X, test Y, labels A, B, and L) shown with candidate slices of it.]

Page 7: New Results in Program Slicing

Data-Flow Path Compression

Start: R2:=0
       R7:=exp1
Loop:  R2:=R2 + 1
       compare R2, R9
       if-not-less-go-to Out
       use R7
       Temp:=R7    ; spill R7 to memory
       …           ; code that uses all registers
       R7:=Temp    ; restore R7
       go-to Loop
Out:   R0:=R7 + 1

The value of R7 does not depend on the loop, so this slice suffices:

       R7:=exp1
Out:   R0:=R7 + 1

Previous syntax-preserving algorithms insert the loop and the assignments inside it:

Start: R2:=0
       R7:=exp1
Loop:  R2:=R2 + 1
       compare R2, R9
       if-not-less-go-to Out
       Temp:=R7
       R7:=Temp
       go-to Loop
Out:   R0:=R7 + 1

The result is too large
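The idea that a spill/restore pair merely passes a value along can be illustrated with a small sketch. It is not the plan-based algorithm from the talk: it handles only straight-line code, it leaves out the loop, and the names CopyChainSketch, Assign, valueOrigins, and spillExample are hypothetical. For each variable it records which non-copy expression the variable's current value came from.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CopyChainSketch {
    /** One straight-line assignment "target := source" (hypothetical mini-IR). */
    static final class Assign {
        final String target;
        final String source;
        final boolean isCopy; // true when source is just another variable name

        Assign(String target, String source, boolean isCopy) {
            this.target = target;
            this.source = source;
            this.isCopy = isCopy;
        }
    }

    /** Maps each variable to the non-copy expression whose value it currently holds. */
    static Map<String, String> valueOrigins(List<Assign> code) {
        Map<String, String> origin = new HashMap<>();
        for (Assign a : code) {
            if (a.isCopy) {
                // A copy passes on whatever its source currently holds.
                origin.put(a.target, origin.getOrDefault(a.source, a.source));
            } else {
                origin.put(a.target, a.source);
            }
        }
        return origin;
    }

    /** The definition of R7 followed by its spill and restore, with the loop omitted (an assumption). */
    static List<Assign> spillExample() {
        List<Assign> code = new ArrayList<>();
        code.add(new Assign("R7", "exp1", false));
        code.add(new Assign("Temp", "R7", true));
        code.add(new Assign("R7", "Temp", true));
        return code;
    }

    public static void main(String[] args) {
        System.out.println(valueOrigins(spillExample()));
    }
}
```

After the three assignments both R7 and Temp map to exp1, which is why a slice on the final use of R7 needs only the definition R7:=exp1 and not the copies.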

Page 8: New Results in Program Slicing

Control-Flow Path Compression

[Figure: control-flow graph of the example program, with predicate nodes x<11, y<T, and x<9 (each with T/F branches) and statement nodes x:=x+1, y:=y-1, x:=x-1, x:=x+2, and print(x).]

if (x<11)
x := x+1
goto A2
A1: if (y<T)
y := y-1
goto A1
goto A2
goto A4
x := x-1
A4: if (x<9) goto A3
A3: x := x+2
A2: print(x)

Page 9: New Results in Program Slicing

Compute the ‘Traditional’ Slice

[Figure: the same control-flow graph, shown with a second graph containing only the nodes print(x), x:=x+1, x:=x+2, x:=x-1, x<11, x<9, and y<T.]

if (x<11)
x := x+1
goto A2
A1: if (y<T)
y := y-1
goto A1
goto A2
goto A4
x := x-1
A4: if (x<9) goto A3
A3: x := x+2
A2: print(x)
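The first stage, computing the ‘traditional’ slice, amounts to a backward closure over data and control dependences. The sketch below shows that closure on a made-up six-statement program. The class DependenceSlicer, the statement names s1 to s6, and the dependence edges are all hypothetical and are not derived from the example on the slide.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DependenceSlicer {
    /**
     * Collects every statement reachable backward from the criterion.
     * deps maps a statement to the statements it is data- or control-dependent on.
     */
    static Set<String> backwardSlice(Map<String, List<String>> deps, String criterion) {
        Set<String> slice = new LinkedHashSet<>();
        Deque<String> worklist = new ArrayDeque<>();
        worklist.push(criterion);
        while (!worklist.isEmpty()) {
            String statement = worklist.pop();
            if (slice.add(statement)) { // first visit only
                for (String dependence : deps.getOrDefault(statement, List.of())) {
                    worklist.push(dependence);
                }
            }
        }
        return slice;
    }

    /**
     * Hypothetical program:
     *   s1: x = 1;  s2: y = 2;  s3: if (x > 0)  s4: z = y;  s5: print(z);  s6: w = 5;
     * s5 uses z from s4; s4 uses y from s2 and is controlled by s3; s3 uses x from s1.
     */
    static Map<String, List<String>> toyDependences() {
        return Map.of(
                "s5", List.of("s4"),
                "s4", List.of("s2", "s3"),
                "s3", List.of("s1"));
    }

    public static void main(String[] args) {
        System.out.println(backwardSlice(toyDependences(), "s5"));
    }
}
```

Slicing from s5 collects s1 through s5 and leaves out s6, which nothing in the slice depends on. The second stage described in the talk, adding the branches needed to avoid infeasible control paths, is not modeled here.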

Page 10: New Results in Program Slicing

Completing Control Flow Paths: Main Lemma

• precisely identifies the possible sets of branches that may be added to the slice

• any path in the original program can be chosen

• optimizations can be performed

All paths from the same point in the slice enter the slice at a single point

Page 11: New Results in Program Slicing

Compute the Necessary Branches

[Figure: control-flow graph of the example program, with predicate nodes x<11, y<T, and x<9 (each with T/F branches) and statement nodes x:=x+1, y:=y-1, x:=x-1, x:=x+2, and print(x).]

if (x<11)
x := x+1
goto A2
A1: if (y<T)
y := y-1
goto A1
goto A2
goto A4
x := x-1
A4: if (x<9) goto A3
A3: x := x+2
A2: print(x)

Page 12: New Results in Program Slicing

Data-Flow Path Compression

[Figure: the register-spill example (Start: R2:=0 through Out: R0:=R7 + 1) drawn as a graph with nodes R2:=0, R7:=exp1, R2:=R2+1, compare R2,R9, if-not-less, use R7, Temp:=R7, R7:=Temp, goto Loop, go-to Out, R0:=R7+1, and exit, beside the desired slice R7:=exp1; Out: R0:=R7 + 1.]

Page 13: New Results in Program Slicing

Data-Flow Path Compression

• R7, Temp carry the value of exp1
• Use data edges instead of variables
• The Plan Calculus: The Programmer’s Apprentice, Rich and Waters, 1990

Figure annotations: the out data port holds the last value; the in data port holds the next value.

[Figure: plan of the register-spill example, with operation nodes (exp1, 0, ++) connected by data edges labeled d1 and d2.]

Page 14: New Results in Program Slicing

[Figure: plan of the register-spill example from entry to exit, with nodes exp1, 0, ++, and compare R2,R9 (T/F) and data edges labeled R0, R2, R7, and R9; the statements R7:=exp1 and Out: R0:=R7 + 1 are shown beside the full listing.]

Page 15: New Results in Program Slicing

Decompression

[Figure: the compressed plan mapped back to program text, with the statements R7:=exp1, go-to Out, and Out: R0:=R7 + 1 marked.]

Page 16: New Results in Program Slicing

Properties of the Slices

• Syntax preserving, possibly rhetorizing
• Behavior preserving
• Executable
• For structured programs
   – At least as accurate as previous algorithms
   – Strictly smaller in interesting cases
• For unstructured programs
   – Empirically shown to be superior
   – A modification of the algorithm is guaranteed to be at least as accurate

Page 17: New Results in Program Slicing

Implementation

• A family of slicing algorithms
   – rhetorizing (*RB, *RM)
   – strictly syntax-preserving (*PB, *PM)
   – amorphous (*AB, *AM)
      • adds new branches (not from the program)

[Figure: two examples, the branch A1: if(y<T) goto A2 and the test X / test Y program with labels A, B, C, and L and go-to exit statements.]

Page 18: New Results in Program Slicing


Empirical Study

• Corpus of 15 manually-written assembly-language modules from a large mainframe product

• 8578 non-comment source lines

• Computed slices from all lines

• 5801 non-empty slices

Page 19: New Results in Program Slicing

Empirical Results

Effect of                   | % slices better | % average decrease | % slices worse | % average decrease
Rhetorization               | 17              | 7.5                |                |
Control path compression    |                 |                    |                |
   Lenient BH               | 30              | 17                 |                |
   Strict BH                | 94              | 65                 |                |
Data path compression       |                 |                    |                |
   implemented              | 1               | 24                 | 8              | 15
   modified                 |                 |                    |                |

Page 20: New Results in Program Slicing

Related Work

Behavior: Preserve behavior | May add infinite loops | Not executable

BH, CF1, Ag, HLB, *P, *R, *A

HLB, HD, KH

Subset of the original program (for flat languages): Syntax-preserving | Rhetorizing | Amorphous

BH, CF1, Ag, HD, HLB, *P

*R

HLB, CF, *A

Comparison to traditional algorithm on structured programs: Smaller than traditional | Equal to traditional | Larger than traditional

*P, *R, *A

BH, CF1, Ag, HD, KH, HLB, CF2

BH: Ball & Horwitz 1993
CF: Choi & Ferrante 1994
Ag: Agrawal 1994
KH: Kumar & Horwitz 2002
HD: Harman & Danicic 1998
HLB: Harman, Lakhotia & Binkley 2006

Page 21: New Results in Program Slicing

Conclusions

• Two techniques for reducing slice size
   – Control-Flow Path Compression
      • Precise identification of all correct solutions
      • Shortest paths significantly improve slice accuracy
         – 17-22% improvement for 30-37% of the cases
   – Data-Flow Path Compression
      • Eliminates copy assignments
      • Yields significant improvement in a few cases
         – 24% improvement for 1% of the slices computed
• Strictly smaller even for structured programs

Page 22: New Results in Program Slicing

Fine Slicing for Program Transformation

Page 23: New Results in Program Slicing

Refactoring’s Rubicon: Extract Method

• Automating Extract Method is Refactoring’s Rubicon (Fowler*)
   – The one that demonstrates “serious tool support”
   – Precondition for many other transformations
• This Rubicon has not yet been crossed
   – Getting it right requires more analysis capability than is available in current IDEs

*http://www.martinfowler.com/articles/refactoringRubicon.html

Page 24: New Results in Program Slicing

Fowler’s Example (website)

Before:

void printOwing() {
    printBanner();

    //print details
    System.out.println("name: " + _name);
    System.out.println("amount " + getOutstanding());
}

After:

void printOwing() {
    printBanner();
    printDetails(getOutstanding());
}

void printDetails(double outstanding) {
    System.out.println("name: " + _name);
    System.out.println("amount " + outstanding);
}

Page 25: New Results in Program Slicing

A Case Study in Enterprise Refactoring

• Converted a Java Servlet to use the MVC pattern*
• Used as much automated support as available
   – The whole conversion could be described as a series of cataloged (“small”) refactorings
   – Most steps were inadequately supported by the IDE
   – Some were not supported at all

* Based on Alex Chaffee’s “Refactoring to Model-View-Controller” article (http://www.purpletech.com/articles/mvc/refactoring-to-mvc.html)

Page 26: New Results in Program Slicing

Case-Study: Automation (1)

Fully Supported Refactorings                  | Uses
Extract Method                                | 3
Extract Temp                                  | 3
(Self) Encapsulate Field                      | 2
Replace Magic Number with Symbolic Constant   | 1
Inline Temp                                   | 1
Extract Superclass                            | 1
Delete Methods                                | 1
Move Method                                   | 1
Total                                         | 13

Page 27: New Results in Program Slicing

Case-Study: Automation (2)

Partial (*) or No (**) Support                | Uses
Extract Method *                              | 10
Substitute Expression **                      | 5
Replace Temp with Query *                     | 3
Replace Method with Method Object **          | 2
Substitute Statement **                       | 1
Extract Class *                               | 1
Move Statement (or Swap Statements) **        | 1
Total                                         | 23

Page 28: New Results in Program Slicing

Currently Unsupported Cases of Extract Method

(a) Extract multiple fragments
(b) Extract a partial fragment
   – select sub-expressions as parameters
(c) Extract loop with partial body
   – loop duplication with data flow
(d) Extract code with conditional exits

Program slicing pulls related code together!

Page 29: New Results in Program Slicing


slice (v.): to cut with or as if with a knife

Merriam-Webster

slice (n.): a thin flat piece cut from something

Page 30: New Results in Program Slicing


A (backward) slice of a given program with respect to selected “interesting” variables is a subprogram that computes the same values as the original program for the selected variables

A (backward) fine slice of a given program with respect to selected “interesting” variables and other “oracle” variables is a subprogram that computes the same values as the original program for the selected variables, given values for the oracle variables
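The two definitions can be made concrete with a small sketch. Everything in it is hypothetical (the class SliceDemo, the method names, and the program itself) and it is written by hand, not produced by the slicing algorithms in the talk. The original method computes three values. The slice on sum drops the statements that cannot affect sum. The fine slice on scaled ignores the data dependence on sum and receives that value from an oracle, modeled here as a parameter.

```java
final class SliceDemo {
    /** Original program: returns {sum, product, scaled} for the numbers 1..n. */
    static int[] original(int n) {
        int sum = 0;
        int product = 1;
        for (int i = 1; i <= n; i++) {
            sum += i;
            product *= i;
        }
        int scaled = sum * 2;
        return new int[] {sum, product, scaled};
    }

    /** Backward slice with respect to sum: product and scaled are gone. */
    static int sliceOnSum(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    /**
     * Fine slice with respect to scaled, ignoring the dependence on sum.
     * The loop is not needed; the oracle supplies the value of sum.
     */
    static int fineSliceOnScaled(int sumOracle) {
        return sumOracle * 2;
    }

    public static void main(String[] args) {
        int[] all = original(4);
        System.out.println(sliceOnSum(4) == all[0]);
        System.out.println(fineSliceOnScaled(all[0]) == all[2]);
    }
}
```

Both checks in main hold: the slice agrees with the original on sum, and the fine slice agrees on scaled once it is given the oracle value for sum.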

Page 31: New Results in Program Slicing

Fine Slicing

• A generalization of traditional program slicing
• Fine slices can be precisely bounded
   – Slicing criteria include set of data and control dependences to ignore
• Fine slices are executable and extractable
• Complement slices (co-slices) are also fine slices
• Oracle-based semantics for fine slices
• Algorithm for computing data-structure representing the oracle
• Forward fine slices are executable, may be slightly larger than traditional forward slices
• Confines generalize blocks for unstructured programs

Page 32: New Results in Program Slicing


Extract Computation

• A new refactoring

• Extracts a fine slice into contiguous code

• Computes the co-slice

• Computation can then be extracted into a separate method using Extract Method

• Passes necessary “oracle” variables between slice and co-slice

• Generates new containers if series of values need to be passed

Page 33: New Results in Program Slicing

(a) Extract multiple fragments

User user = getCurrentUser(request);
if (user == null) {
    response.sendRedirect(LOGIN_PAGE_URL);
    return;
}
response.setContentType("text/html");
disableCache(response);
String albumName = request.getParameter("album");
PrintWriter out = response.getWriter();

Page 34: New Results in Program Slicing

(b) Extract a partial fragment

out.println(DOCTYPE_HTML);
out.println("<html>");
out.println("<head>");
out.println("<title>Error</title>");
out.println("</head>");
out.print("<body><p class='error'>");
out.print("Could not load album '" +
          albumName + "'");
out.println("</p></body>");
out.println("</html>");

Page 35: New Results in Program Slicing

(c) Extract loop with partial body

 1  out.println("<table border=0>");
 2  int start = page * 20;
 3  int end = start + 20;
 4  end = Math.min(end,
 5                 album.getPictures().size());
 6  for (int i = start; i < end; i++) {
 7    Picture picture = album.getPicture(i);
 8    printPicture(out, picture);
 9  }
10  out.println("</table>");

Page 36: New Results in Program Slicing

  2  int start = page * 20;
  3  int end = start + 20;
  4  end = Math.min(end,
  5                 album.getPictures().size());
***  Queue<Picture> pictures =
***      new LinkedList<Picture>();
  6  for (int i = start; i < end; i++) {
  7    Picture picture = album.getPicture(i);
***    pictures.add(picture);
  9  }
  1  out.println("<table border=0>");
  6  for (int i = start; i < end; i++)
  8    printPicture(out, pictures.remove());
 10  out.println("</table>");
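A self-contained way to check that splitting the loop and passing the values through a queue preserves the output is sketched below. Strings stand in for Picture and a StringBuilder stands in for the PrintWriter; the class ContainerDemo and its method names are hypothetical and not part of the case study.

```java
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

final class ContainerDemo {
    /** Original shape: fetching and printing are interleaved in one loop. */
    static String interleaved(List<String> album, int start, int end) {
        StringBuilder out = new StringBuilder("<table>");
        for (int i = start; i < end; i++) {
            String picture = album.get(i);
            out.append('[').append(picture).append(']');
        }
        return out.append("</table>").toString();
    }

    /** Co-slice: fetches the pictures and stores the series of values in a queue. */
    static Queue<String> collect(List<String> album, int start, int end) {
        Queue<String> pictures = new LinkedList<>();
        for (int i = start; i < end; i++) {
            pictures.add(album.get(i));
        }
        return pictures;
    }

    /** Extracted computation: prints by draining the queue in FIFO order. */
    static String display(int start, int end, Queue<String> pictures) {
        StringBuilder out = new StringBuilder("<table>");
        for (int i = start; i < end; i++) {
            out.append('[').append(pictures.remove()).append(']');
        }
        return out.append("</table>").toString();
    }

    public static void main(String[] args) {
        List<String> album = List.of("a", "b", "c");
        System.out.println(interleaved(album, 0, 3));
        System.out.println(display(0, 3, collect(album, 0, 3)));
    }
}
```

Because the queue is first-in, first-out and both loops run over the same index range, the second loop sees the pictures in the order the first loop produced them, so the two versions build the same string.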

Page 37: New Results in Program Slicing

(d) Extract code with conditional exits

if (album == null) {
    new ErrorPage("Could not load album '"
        + album.getName() + "'").printMessage(out);
    return;
}
//...

Page 38: New Results in Program Slicing

if (invalidAlbum(album, out))
    return;
//...

boolean invalidAlbum(Album album,
                     PrintWriter out) {
    boolean invalid = album == null;
    if (invalid) {
        new ErrorPage("Could not load album '"
            + album.getName() + "'").printMessage(out);
    }
    return invalid;
}

Page 39: New Results in Program Slicing

Token Semantics

out.println("<table border=0>");
int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
for (int i = start; i < end; i++) {
    Picture picture = album.getPicture(i);
    printPicture(out, picture);
}
out.println("</table>");

[Figure: plan of this code from entry to exit, with operation nodes println, *, +, min, getPictures, size, >, getPicture, printPicture, and ++, data edges labeled out, page, album, start, end, and i, and tokens p1 and p2.]

Page 40: New Results in Program Slicing

Fine Slicing

[Figure: the table-printing code and its plan again, with the printPicture node singled out.]

Page 41: New Results in Program Slicing

The Fine Slice

out.println("<table border=0>");
for (int i = start; i < end; i++) {
    printPicture(out, picture);
}
out.println("</table>");

[Figure: plan of the fine slice from entry to exit, with nodes println, >, ++, and printPicture and data edges labeled out, i, end, start, and picture.]

Page 42: New Results in Program Slicing

Co-Slicing

[Figure: the table-printing code and its plan again, with the printPicture node singled out.]

Page 43: New Results in Program Slicing

The Co-Slice

int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
for (int i = start; i < end; i++) {
    Picture picture = album.getPicture(i);
}

[Figure: plan of the co-slice from entry to exit, with nodes *, +, min, getPictures, size, >, getPicture, and ++ and data edges labeled album, page, start, end, i, out, and picture.]

Page 44: New Results in Program Slicing

Fine slice | Co-slice

[Figure: the plans of the fine slice and the co-slice side by side; the edges labeled start and picture appear in both.]

Page 45: New Results in Program Slicing

Adding a Container

out.println("<table border=0>");
int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
Queue<Picture> pictures = new LinkedList<Picture>();
for (int i = start; i < end; i++) {
    Picture picture = album.getPicture(i);
    pictures.add(picture);
    printPicture(out, pictures.remove());
}
out.println("</table>");

[Figure: plan of this code, with the added nodes new, add, and remove and data edges labeled pictures and picture.]

Page 46: New Results in Program Slicing

The Fine Slice

void display(PrintWriter out, int start, int end, Queue<Picture> pictures) {
    out.println("<table border=0>");
    for (int i = start; i < end; i++) {
        printPicture(out, pictures.remove());
    }
    out.println("</table>");
}

[Figure: plan of the fine slice from entry to exit, with nodes println, >, ++, remove, and printPicture and data edges labeled out, start, end, i, pictures, and picture.]

Page 47: New Results in Program Slicing

Program with Container

[Figure: the program with the pictures container (the same code as on the Adding a Container slide) and its plan, including the nodes new, add, and remove.]

Page 48: New Results in Program Slicing

The Co-Slice

int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
Queue<Picture> pictures = new LinkedList<Picture>();
for (int i = start; i < end; i++) {
    Picture picture = album.getPicture(i);
    pictures.add(picture);
}
display(out, start, end, pictures);

[Figure: plan of the co-slice from entry to exit, with nodes *, +, min, getPictures, size, >, getPicture, ++, new, add, and display and data edges labeled album, page, start, end, i, out, pictures, and picture.]

Page 49: New Results in Program Slicing

Conclusions

• Fine slicing algorithm yields executable slices whose boundaries can be precisely controlled
• Can be used to make any subset of a program executable by adding some control structures but not the data on which they depend
   – including forward slices, thin slices, barrier slices, chops, and barrier chops
   – Conjecture: the size of these executable programs will not be substantially larger

Page 50: New Results in Program Slicing

Conclusions

• New Extract Computation refactoring is an important step towards the automation of Extract Method in difficult cases
   – Enables the automation of big refactorings from smaller building blocks
• Uses new fine-slicing algorithm
• Automatically computes complement slices (co-slices)
• Automatically generates containers to pass series of values if necessary

Page 51: New Results in Program Slicing

Related Work (I): Non-Executable Slices

• Traditional backward slicing (e.g., Weiser [ICSE81] or Ottenstein & Ottenstein [PSDE84]), when applied to unstructured code
   – Solved by path-completion stage in plan-based slicing (Abadi, Ettinger & Feldman [FSE09])
• Forward slicing (Horwitz, Reps & Binkley [TOPLAS90])
• Barrier slicing (Krinke [SCAM03])
• Chopping (Jackson & Rollins [FSE94]) and Barrier Chopping (Krinke [SCAM03])
• Thin slicing (Sridharan, Fink & Bodik [PLDI07])
• All the above can be made executable with an appropriate oracle, by adding the required control structure

Page 52: New Results in Program Slicing


Related Work (II): Executable Slices with Reduced Scope or Size

• Block-based slicing (Maruyama [SSR01]): structured code only, no correctness proof

• Co-slicing (Ettinger's thesis, Oxford 2006): limited to slicing from the end and oracle of final values only; proof on toy language

• Parametric slicing (Field, Ramalingam & Tip [POPL95]): an executable generalization of static and dynamic slices; like oracle semantics, they formalize programs with holes; however, their holes stand for expressions whose values are irrelevant, while our holes stand for significant (oracle) values

• Some forms of dynamic and forward slicing are executable (Binkley et al. [SCAM04]): forward slices made excessively large through the addition of backward slices

Page 53: New Results in Program Slicing

Related Work (III): Behavior-Preserving Procedure Extraction

• Contiguous code
   – Bill Opdyke's thesis (UIUC 1992): for C++
   – Griswold and Notkin [ToSE93]: for Scheme
• Arbitrary selections
   – Tucking (Lakhotia & Deprez [IST98]): the complement is a slice too; no dataflow from the extracted slice to its complement yields over-duplication; strong preconditions (e.g., no global variables involved, and no live-on-exit variable defined in both the slice and complement)
   – Semantics-Preserving Procedure Extraction (Komondoor & Horwitz [POPL00]): considers all permutations of selected and surrounding statements; no duplication allowed; not practical (exponential time complexity); very strong preconditions
   – Effective Automatic Procedure Extraction (Komondoor & Horwitz [IWPC03]): improves on their previous algorithm by improving complexity (cubic time and space), allowing some duplication (of conditionals and jumps); might miss some correct permutations; no duplication of assignments or loops; allows dataflow from complement to extracted code and from extracted code to (the second portion of the) complement; supports extraction of returns
   – Extraction of block-based slices (Maruyama [SSR01]): extracts a slice of one variable only; restricted to structured code; no proof given
   – Ettinger's thesis (Oxford 2006): sliding transformation sequentially composes a slice and its complement, allowing dataflow from the former to the latter; supports loop untangling and duplication of assignments; restricted to slicing from the end, and only final values from the extracted slice can be reused in the complement; proof for toy language