
Post on 21-Jan-2016


1

Control Flow Analysis

• Topic today
  • Representation and Analysis Paper (Sections 1, 2)

• For next class:
  • Read Representation and Analysis Paper (Section 3)
  • Do problems 1 and 2 from the Representation and Analysis Paper

2

Intermediate Representations

• Program analyses grew out of the compiler world, where they were used to help optimize code.

• Many optimizations depend on analyses of intermediate representations of software, such as:
  o Parse trees
  o Abstract syntax trees
  o Three-address code

3

Code Compilation and Analysis

[Figure: source program → parsing, lexical analysis → intermediate representation → code generation, optimization → target code → code execution]

• Analyze the intermediate representation; perform additional analysis on the results
• Use this information for code optimization techniques

4

Tree Representations

• Representations
  • Parse trees represent concrete syntax
  • Abstract syntax trees represent abstract syntax

• Concrete versus abstract syntax
  • Concrete syntax shows structure and is language-specific
  • Abstract syntax shows structure only, without the language-specific detail

5

Example: Grammar

Example
1. a := b + c
2. a = b + c;

• Grammar for 1
  • stmtlist → stmt | stmt stmtlist
  • stmt → assign | if-then | …
  • assign → ident ":=" ident binop ident
  • binop → "+" | "-" | …

• Grammar for 2
  • stmtlist → stmt ";" | stmt ";" stmtlist
  • stmt → assign | if-then | …
  • assign → ident "=" ident binop ident
  • binop → "+" | "-" | …

7

Example: Parse Tree

Example
1. a := b + c
2. a = b + c;

Parse tree for 1

stmtlist
  stmt
    assign
      ident (a)
      ":="
      ident (b)
      binop ("+")
      ident (c)

Parse tree for 2

stmtlist
  stmt
    assign
      ident (a)
      "="
      ident (b)
      binop ("+")
      ident (c)
  ";"

8

Example: Abstract Syntax Tree

Example
1. a := b + c
2. a = b + c;

Abstract syntax tree for 1 and 2

assign
  a
  add
    b
    c
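As a rough analogy in a real toolchain, Python's standard `ast` module builds an abstract syntax tree for the Python statement `a = b + c`. This is only a sketch: it uses Python syntax rather than either of the two toy grammars, and the helper name `summarize` is made up for illustration.

```python
import ast


def summarize(source):
    """Return (target, operator, left, right) for a simple `x = y op z` statement."""
    tree = ast.parse(source)      # Module node wrapping a list of statements
    stmt = tree.body[0]           # the single assignment statement
    if not isinstance(stmt, ast.Assign) or not isinstance(stmt.value, ast.BinOp):
        raise ValueError("expected a statement of the form x = y op z")
    expr = stmt.value             # right-hand side of the assignment
    return (
        stmt.targets[0].id,       # assigned variable
        type(expr.op).__name__,   # operator node class, e.g. "Add"
        expr.left.id,
        expr.right.id,
    )


print(summarize("a = b + c"))     # ('a', 'Add', 'b', 'c')
print(summarize("a=b+c"))         # same tree: spacing is concrete syntax only
```

Both calls yield the same tuple, which mirrors the point of the slide: the tree keeps the assignment and the addition but drops surface details of how the statement was written.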

9

Three Address Code

• General form: x := y op z
• May include temporary variables (intermediate values)
• May reference arrays: a[t1]
• Specific forms (examples)
  • Assignment
    • Binary: x := y op z
    • Unary: x := op y
  • Copy: x := y
  • Jumps
    • Unconditional: goto (L)
    • Conditional: if x relational-op y goto (L)
  • …
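To make the role of temporaries concrete, here is a minimal sketch that lowers one assignment with a nested expression into statements of the form x := y op z. It assumes Python's `ast` module as a convenient parser and supports only `+`, `-`, and `*`; the function name `to_three_address` and the `t1`, `t2`, … naming scheme are illustrative choices, not part of the slides.

```python
import ast

# Operators handled by this sketch (an arbitrary small subset).
OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*"}


def to_three_address(source):
    """Lower one assignment such as `x = a * b + c` into three-address statements."""
    stmt = ast.parse(source).body[0]
    code = []
    counter = 0

    def lower(node):
        """Return the name that holds the value of `node`, emitting code as needed."""
        nonlocal counter
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            left = lower(node.left)
            right = lower(node.right)
            counter += 1
            temp = f"t{counter}"                     # fresh temporary
            code.append(f"{temp} := {left} {OPS[type(node.op)]} {right}")
            return temp
        raise ValueError("unsupported expression")

    result = lower(stmt.value)
    code.append(f"{stmt.targets[0].id} := {result}")  # final copy statement
    return code


for line in to_three_address("x = a * b + c"):
    print(line)
# t1 := a * b
# t2 := t1 + c
# x := t2
```

Each emitted statement has at most one operator on its right-hand side, and the last line is an instance of the copy form.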

10

Example: Three Address Code

Source code

if a > 10 then
  x = y + z
else
  x = y - z

Three address code

1. if a > 10 goto (4)
2. x = y - z
3. goto (5)
4. x = y + z
5. …

11

Larger Example: 3 Address Code

(1) i := m-1
(2) j := n
(3) t1 := 4*n
(4) v := a[t1]
(5) i := i+1
(6) t2 := 4*i
(7) t3 := a[t2]
(8) if t3 < v goto (5)
(9) j := j-1
(10) t4 := 4*j
(11) t5 := a[t4]
(12) if t5 > v goto (9)
(13) if i >= j goto (23)
(14) t6 := 4*i
(15) x := a[t6]
(16) t7 := 4*i
(17) t8 := 4*j
(18) t9 := a[t8]
(19) a[t7] := t9
(20) t10 := 4*j
(21) a[t10] := x
(22) goto (5)
(23) t11 := 4*i
(24) x := a[t11]
(25) t12 := 4*i
(26) t13 := 4*n
(27) t14 := a[t13]
(28) a[t12] := t14
(29) t15 := 4*n
(30) a[t15] := x

12

Control Flow Graph

• One of the most basic program representations
  • Nodes represent statements or basic blocks
  • Edges represent flow of control between nodes

• To build a CFG:
  • Construct basic blocks
  • Join blocks together with labeled edges

13

Basic Blocks

• A basic block is a sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end without halt or possibility of branch except at the end

• A basic block may or may not be maximal
  • For compiler optimizations, maximal basic blocks are desirable
  • For software engineering tasks, basic blocks that represent one source code statement are often used

14

Computing Basic Blocks (algorithm)

Input: a sequence of procedure statements

Output: a list of basic blocks with each statement in exactly one block

Method:
• Determine the set of leaders (the first statements of basic blocks) using the following rules:
  o The first statement in the procedure is a leader.
  o Any statement that is the target of a conditional or unconditional goto statement is a leader.
  o Any statement that immediately follows a conditional or unconditional goto statement is a leader.
• Construct the basic blocks using the leaders. For each leader, its basic block consists of the leader and all statements up to but not including the next leader or the end of the procedure.
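The method above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: statements are identified only by their 1-based numbers, and the gotos are given as a dictionary from the number of each goto statement to the number of its target. The names `find_leaders`, `basic_blocks`, and `EXAMPLE_JUMPS` are hypothetical; the jump table is read off the 30-statement three-address code example.

```python
def find_leaders(n_statements, jumps):
    """Return the sorted leader statement numbers.

    jumps maps each (conditional or unconditional) goto statement number
    to the statement number it targets.
    """
    leaders = {1}                          # rule 1: first statement
    for source, target in jumps.items():
        leaders.add(target)                # rule 2: target of a goto
        if source + 1 <= n_statements:
            leaders.add(source + 1)        # rule 3: statement after a goto
    return sorted(leaders)


def basic_blocks(n_statements, jumps):
    """Return (first, last) statement numbers for each basic block."""
    leaders = find_leaders(n_statements, jumps)
    # Each block ends just before the next leader, or at the last statement.
    ends = [nxt - 1 for nxt in leaders[1:]] + [n_statements]
    return list(zip(leaders, ends))


# Gotos in the 30-statement example: (8)->(5), (12)->(9), (13)->(23), (22)->(5)
EXAMPLE_JUMPS = {8: 5, 12: 9, 13: 23, 22: 5}

print(find_leaders(30, EXAMPLE_JUMPS))   # [1, 5, 9, 13, 14, 23]
print(basic_blocks(30, EXAMPLE_JUMPS))
# [(1, 4), (5, 8), (9, 12), (13, 13), (14, 22), (23, 30)]
```

Because every statement lies between some leader and the next one, each statement ends up in exactly one block, as the output specification requires.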

15

Example: Compute Basic Blocks on this 3 Address Code

(Same 30 statements as the larger three address code example on slide 11.)

16

Example: Compute Basic Blocks on this 3 Address Code: leaders

Applying the three leader rules to the 30 statements from slide 11 gives these leaders:
• (1) i := m-1 (first statement)
• (5) i := i+1 (target of the gotos in (8) and (22))
• (9) j := j-1 (target of the goto in (12); follows the conditional goto in (8))
• (13) if i >= j goto (23) (follows the conditional goto in (12))
• (14) t6 := 4*i (follows the conditional goto in (13))
• (23) t11 := 4*i (target of the goto in (13); follows the goto in (22))

17

Example: Compute Basic Blocks on this 3 Address Code: blocks

Each block runs from a leader up to, but not including, the next leader (statement numbers refer to the code on slide 11):
• (1)-(4)
• (5)-(8)
• (9)-(12)
• (13)
• (14)-(22)
• (23)-(30)

18

Computing Control Flow Graph from Basic Blocks (algorithm)

Input: a list of basic blocks

Output: a list of control-flow graph (CFG) nodes and edges

Method:
• Create entry and exit nodes; create edge (entry, B1); create edge (Bk, exit) for each Bk that represents an exit from the program
• Add a CFG edge from Bi to Bj if Bj can immediately follow Bi in some execution, i.e.,
  o there is a conditional or unconditional goto from the last statement of Bi to the first statement of Bj, or
  o Bj immediately follows Bi in the order of the program and Bi does not end in an unconditional goto statement
• Label edges that represent conditional transfers of control as “T” (true) or “F” (false); other edges are unlabeled
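A minimal sketch of this method, applied to the basic blocks of the 30-statement example, might look as follows. The input encoding is an assumption made for illustration: blocks are (first, last) statement numbers in program order, `jumps` maps goto statements to targets, and `conditional` lists which of those gotos are conditional. The sketch also assumes that the only exit is falling off the end of the last block, which holds for this example but not for code with explicit returns.

```python
def build_cfg(blocks, jumps, conditional):
    """Return CFG edges as (source, target, label) triples.

    label is "T" or "F" for conditional transfers and None otherwise.
    """
    names = [f"B{i + 1}" for i in range(len(blocks))]
    block_of_leader = {first: name for (first, _), name in zip(blocks, names)}
    edges = [("entry", names[0], None)]
    for i, (_, last) in enumerate(blocks):
        name = names[i]
        is_jump = last in jumps
        is_cond = last in conditional
        if is_jump:
            # Edge to the block whose leader is the goto target.
            target = block_of_leader[jumps[last]]
            edges.append((name, target, "T" if is_cond else None))
        if is_cond or not is_jump:
            # Fall through to the next block, or to exit after the last block.
            follower = names[i + 1] if i + 1 < len(blocks) else "exit"
            edges.append((name, follower, "F" if is_cond else None))
    return edges


EXAMPLE_BLOCKS = [(1, 4), (5, 8), (9, 12), (13, 13), (14, 22), (23, 30)]
EXAMPLE_JUMPS = {8: 5, 12: 9, 13: 23, 22: 5}
EXAMPLE_CONDITIONAL = {8, 12, 13}

for edge in build_cfg(EXAMPLE_BLOCKS, EXAMPLE_JUMPS, EXAMPLE_CONDITIONAL):
    print(edge)
```

For the example this produces ten edges, including the two self-loops B2 → B2 and B3 → B3 and the unlabeled back edge B5 → B2.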

19

Example: Compute Control Flow Graph from Basic Blocks

Basic blocks (statement numbers refer to the code on slide 11):
B1: (1)-(4)
B2: (5)-(8)
B3: (9)-(12)
B4: (13)
B5: (14)-(22)
B6: (23)-(30)

20

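Example: Compute Control Flow Graph from Basic Blocks

[Figure: CFG over Entry, B1 to B6, and Exit, with these edges]
Entry → B1
B1 → B2
B2 → B2 (T)
B2 → B3 (F)
B3 → B3 (T)
B3 → B4 (F)
B4 → B6 (T)
B4 → B5 (F)
B5 → B2
B6 → Exit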

21

Computing Control Flow from Source Code (example)

Procedure AVG
S1    count = 0
S2    fread(fptr, n)
S3    while (not EOF) do
S4      if (n < 0)
S5        return (error)
        else
S6        nums[count] = n
S7        count ++
        endif
S8      fread(fptr, n)
      endwhile
S9    avg = mean(nums,count)
S10   return(avg)

22

Computing Control Flow from Source Code (example)

[Figure: CFG for Procedure AVG from the previous slide, one node per statement, with these edges]
entry → S1
S1 → S2
S2 → S3
S3 → S4 (T)
S3 → S9 (F)
S4 → S5 (T)
S4 → S6 (F)
S5 → exit
S6 → S7
S7 → S8
S8 → S3
S9 → S10
S10 → exit

23

Computing Control Flow from Source Code (maximal basic blocks)

[Figure: CFG for Procedure AVG with statements grouped into maximal basic blocks]

Grouping the statements of Procedure AVG into maximal basic blocks gives:
• {S1, S2}
• {S3}
• {S4}
• {S5}
• {S6, S7, S8}
• {S9, S10}

The edges and their T/F labels are the same as in the statement-level CFG, drawn between blocks instead of statements.
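One way to obtain maximal basic blocks from a statement-level CFG is to merge each node into the previous one whenever control can only pass straight from one to the other. The sketch below assumes the AVG edge list implied by the procedure's control structure and assumes that statements are visited in source order; `AVG_EDGES` and `maximal_blocks` are hypothetical names.

```python
AVG_EDGES = [
    ("entry", "S1"), ("S1", "S2"), ("S2", "S3"),
    ("S3", "S4"), ("S3", "S9"),          # while: true / false
    ("S4", "S5"), ("S4", "S6"),          # if: true / false
    ("S5", "exit"), ("S6", "S7"), ("S7", "S8"), ("S8", "S3"),
    ("S9", "S10"), ("S10", "exit"),
]


def maximal_blocks(edges, order):
    """Group statements (listed in source order) into maximal basic blocks."""
    succs, preds = {}, {}
    for a, b in edges:
        succs.setdefault(a, []).append(b)
        preds.setdefault(b, []).append(a)
    blocks = []
    for node in order:
        prev = blocks[-1][-1] if blocks else None
        # Merge only if prev's sole successor is node and node's sole
        # predecessor is prev, so no branch enters or leaves in between.
        if prev is not None and succs.get(prev) == [node] and preds.get(node) == [prev]:
            blocks[-1].append(node)
        else:
            blocks.append([node])
    return blocks


statements = [f"S{i}" for i in range(1, 11)]
print(maximal_blocks(AVG_EDGES, statements))
# [['S1', 'S2'], ['S3'], ['S4'], ['S5'], ['S6', 'S7', 'S8'], ['S9', 'S10']]
```

S3 stays in its own block because it has two predecessors (S2 and the back edge from S8), even though S2 has only one successor.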

24

Computing Control Flow from Source Code (another example)

Procedure Trivial
S1    read (n)
S2    switch (n)
        case 1:
S3        write (“one”)
          break
        case 2:
S4        write (“two”)
        case 3:
S5        write (“three”)
          break
        default
S6        write (“Other”)
      endswitch
end Trivial

25

Computing Control Flow from Source Code (another example)

[Figure: CFG for Procedure Trivial from the previous slide, one node per statement, with these edges]
entry → S1
S1 → S2
S2 → S3
S2 → S4
S2 → S5
S2 → S6
S3 → exit
S4 → S5 (case 2 has no break, so control falls through to case 3)
S5 → exit
S6 → exit

26

Computing Control Flow from Source Code (maximal basic blocks)

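[Figure: CFG for Procedure Trivial with statements grouped into maximal basic blocks]

Grouping the statements of Procedure Trivial into maximal basic blocks gives:
• {S1, S2}
• {S3}
• {S4}
• {S5}
• {S6}

S4 and S5 stay separate because S5 can be reached both from the switch in S2 and by falling through from S4.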

27

Applications of Control Flow

• Program understanding:
  • In CFGs, program structure and flow are explicit

• Complexity:
  • Cyclomatic (McCabe’s)
  • Computed in several ways:
    • Edges - nodes + 2
    • Number of regions in CFG
    • Number of decision statements + 1 (if program is structured)
  • Indicates number of test cases needed; indicates difficulty of maintaining

[Figure: example CFG with nodes 1 to 8]
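As a small worked example, the sketch below computes cyclomatic complexity two of the ways listed above for the eight-node example graph. The edge list is an assumption: the figure itself did not survive, so the edges are reconstructed from the branches and example paths given on the testing slide. The function names are illustrative.

```python
# Assumed edges of the example CFG (nodes 1 to 8).
EDGES = [
    (1, 2), (1, 3), (2, 4), (3, 4),
    (4, 5), (4, 8), (5, 6), (5, 7),
    (6, 7), (7, 4),
]


def cyclomatic_complexity(edges):
    """Edges - nodes + 2, for a single connected CFG."""
    nodes = {n for edge in edges for n in edge}
    return len(edges) - len(nodes) + 2


def decisions_plus_one(edges):
    """Number of decision nodes + 1, assuming every decision is two-way."""
    out_degree = {}
    for src, _ in edges:
        out_degree[src] = out_degree.get(src, 0) + 1
    return sum(1 for degree in out_degree.values() if degree > 1) + 1


print(cyclomatic_complexity(EDGES))   # 10 - 8 + 2 = 4
print(decisions_plus_one(EDGES))      # decisions at nodes 1, 4, 5, so 3 + 1 = 4
```

The two counts agree here because every decision node in the assumed graph has exactly two outgoing edges.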

28

Applications of Control Flow

• Testing: branch, path, basis path
  • Branch: must test 1→2, 1→3, 4→5, 4→8, 5→6, 5→7
  • Path: infinite number because of loop
  • Basis path: set of paths such that each path executes at least one new edge (cyclomatic complexity gives max necessary); example: 1,2,4,8; 1,3,4,5,6,7,4,8

[Figure: example CFG with nodes 1 to 8]
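Branch coverage of a set of test paths can be checked mechanically. The sketch below treats every edge leaving a node with more than one successor as a branch and reports the branches that a given set of paths does not exercise. The edge list is again an assumption reconstructed from the branches and paths listed above, and `uncovered_branches` is a hypothetical helper.

```python
# Assumed edges of the example CFG (nodes 1 to 8).
EDGES = [
    (1, 2), (1, 3), (2, 4), (3, 4),
    (4, 5), (4, 8), (5, 6), (5, 7),
    (6, 7), (7, 4),
]


def branch_edges(edges):
    """Edges that leave a node with more than one successor."""
    out_degree = {}
    for src, _ in edges:
        out_degree[src] = out_degree.get(src, 0) + 1
    return [(src, dst) for src, dst in edges if out_degree[src] > 1]


def uncovered_branches(edges, paths):
    """Branches not executed by any of the given paths (lists of nodes)."""
    covered = {pair for path in paths for pair in zip(path, path[1:])}
    return sorted(set(branch_edges(edges)) - covered)


example_paths = [[1, 2, 4, 8], [1, 3, 4, 5, 6, 7, 4, 8]]
print(branch_edges(EDGES))
# [(1, 2), (1, 3), (4, 5), (4, 8), (5, 6), (5, 7)]
print(uncovered_branches(EDGES, example_paths))
# [(5, 7)]
print(uncovered_branches(EDGES, example_paths + [[1, 2, 4, 5, 7, 4, 8]]))
# []
```

Under these assumptions the two example paths alone leave branch 5→7 untested; adding a path through 5→7 completes branch coverage.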

29

Applications of Control Flow

• Support for Other Analyses:
  • Dominator
  • Postdominator
  • Data dependence
  • Control dependence
  • Points-to
  • Regression test selection

[Figure: example CFG with nodes 1 to 8]

30

Terminology: Levels of Analysis

Local: within a single basic block or statement
Intraprocedural: within a single procedure, function, or method (sometimes intramethod)
Interprocedural: across procedure boundaries, procedure call, shared globals, etc.
Intraclass: within a single class
Interclass: across class boundaries
. . .

Exercise
• How would we represent interprocedural control flow; that is, control flow involving an entire program containing several procedures?

31

Step 1: Draw individual CFGs

Procedure Main
begin
  S1
  if (P1) then
    call A()
  else
    call B()
  endif
  call A()
  S2
end

Procedure A
begin
  S3
  call B()
  S4
end

Procedure B
begin
  S5
end

32

Step 2: Connect CFGs (how?)

33
