Seminar on Optimizations for Modern Architectures: “Optimizing Compilers for Modern Architectures”, Allen and Kennedy, Chapter 11 - Section 11.2.5 to end: Interprocedural Analysis and Optimization

Post on 19-Dec-2015


Seminar on Optimizations for Modern Architectures

“Optimizing Compilers for Modern Architectures”,

Allen and Kennedy, Chapter 11 - Section 11.2.5 to end

Presented by Li-Tal Mashiach

Interprocedural Analysis and Optimization


©Li-Tal Mashiach, Technion, 2006

Review

• Examples of interprocedural problems
• Classification of interprocedural problems
• We analyzed two interprocedural problems
  - MOD Analysis
  - Alias Analysis

Today’s topics

• Additional Interprocedural problems
  - Kill Analysis
  - Constant propagation
  - Symbolic Analysis
  - Array section Analysis
  - Call graph construction
• Interprocedural Optimization
• Managing Whole-Program Compilation


Kill Analysis

Problems like MOD and ALIAS ask questions about what might happen on some path

It is often useful to ask about what must happen on every path

An assignment is said to “Kill” a previous value of the variable

The problem of discovering whether a variable is assigned on every path through a called procedure is known as KILL

Kill analysis - Motivation

        DO I = 1, N
    S0:   CALL INIT(T,I)
          T = T + B(I)
          A(I) = A(I) + T
        ENDDO

To parallelize the loop:
• It must be possible to recognize that variable T can be made private to each iteration
• It means that T is assigned before being used on every path through INIT

(Taken from Gabi Kliot's lecture)

Kill analysis – Definitions (1)

• KILL(p): the set of all variables that must be modified on every path through procedure p
• NKILL(p) = ¬KILL(p)
  - It is OK to overestimate NKILL
• THRU(b,c): the set of all variables that are not killed on some path through basic block b to c

$NKILL(b) = \bigcup_{c \in succ(b)} \left( THRU(b,c) \cap NKILL(c) \right)$

Kill Analysis – Definitions (2)

Reduced control-flow graph:
• Vertices consist of procedure entry, procedure exit, and call sites.
• Every edge (x,y) in the graph is annotated with the set THRU(x,y) of variables not killed on that edge.


Solution Concept

First we will construct the reduced control-flow graph for each procedure

Next we will compute NKILL(p) for every procedure p in the program

Finally we will extend the algorithm to handle reference formal parameters by using the binding graph

Reduced Control Flow Graph

• Go over the edges of the control-flow graph (without back edges), starting from the procedure entry node
• For each edge (b,s):
  - if the node s is a call site, go over its edges to the successors
  - if the node s is a normal node, merge s into b: for each successor t of s,
      THRU[b,t] ← THRU[b,t] ∪ (THRU[b,s] ∩ THRU[s,t])

Figure: the path b → s → t, with edges labeled THRU[b,s] and THRU[s,t], is replaced by a single edge b → t labeled THRU[b,s] ∩ THRU[s,t].

Reduced Control Flow Graph

    ComputeReducedCFG(G)
      remove back edges from G
      for each successor s of the entry node, add (entry, s) to worklist
      while worklist isn’t empty
        remove a ready element (b,s) from the worklist
        if s is a call site
          add (s,t) to worklist for each successor t of s
        otherwise if s isn’t the exit node
          for each successor t of s
            if THRU[b,t] undefined then THRU[b,t] ← {}
            THRU[b,t] ← THRU[b,t] ∪ (THRU[b,s] ∩ THRU[s,t])
      end
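The reduction can be sketched in Python as follows. This is a minimal illustration, not the slide's worklist formulation: it walks the acyclic control-flow graph in topological order from each anchor (entry or call site) and stops at the next anchor, which should yield the same THRU sets. The function name `reduced_cfg`, the dictionary encoding of the graph, and the per-block `kills` sets are assumptions made for this example; the graph and kill sets model the FUNC example (statement 4 assigns B, statement 6 assigns A).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def reduced_cfg(succ, kills, entry, exit_node, call_sites, universe):
    """Compute THRU[(x, y)] between entry, call sites, and exit.

    succ  : node -> list of successors (back edges already removed)
    kills : node -> set of variables assigned in that ordinary block
    """
    anchors = {entry, exit_node} | set(call_sites)
    # graphlib expects node -> predecessors, so invert the successor map.
    preds = {n: set() for n in succ}
    for n, targets in succ.items():
        for t in targets:
            preds.setdefault(t, set()).add(n)
    order = list(TopologicalSorter(preds).static_order())

    thru = {}
    for a in anchors - {exit_node}:
        # live[n]: variables not killed on some path from a to the start of n
        live = {s: set(universe) for s in succ[a]}
        for n in order:
            if n not in live:
                continue
            if n in anchors:
                thru[(a, n)] = live[n]  # stop at the next anchor
                continue
            out = live[n] - kills.get(n, set())
            for t in succ[n]:
                live.setdefault(t, set()).update(out)
    return thru


# Hypothetical encoding of SUBROUTINE FUNC: one node per numbered statement.
OMEGA = {"A", "B", "N"}
succ = {1: [2], 2: [3, 5], 3: [4], 4: [7], 5: [6], 6: [7], 7: [8], 8: []}
kills = {4: {"B"}, 6: {"A"}}
thru = reduced_cfg(succ, kills, entry=1, exit_node=8, call_sites=[3], universe=OMEGA)
```

Under these assumptions the result has three edges: (1,3) labeled {A,B,N}, (3,8) labeled {A,N}, and (1,8) labeled {B,N}.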

Example

    1  SUBROUTINE FUNC(A,B,N)
    2    IF (A .GT. 0) THEN
    3      CALL INIT(A,B,N)
    4      B = B+N
    5    ELSE
    6      A = A+B
    7    ENDIF
    8  END

Control-flow graph: nodes 1 through 8, one per numbered statement, with Ω = {A,B,N}.

Reduced control-flow graph: nodes 1 (entry), 3 (call site), and 8 (exit), with edges
  THRU[1,3] = Ω = {A,B,N}
  THRU[3,8] = {A,N}
  THRU[1,8] = {B,N}

Computing NKILL

• Assuming that all variables in the program are global
• A simple iterative data-flow analysis algorithm can be used to compute NKILL
• Go over the G_THRU graph in reverse topological order
• For each successor s of b:
    NKILL[b] ← NKILL[b] ∪ (NKILL[s] ∩ THRU[b,s])
• If b is a call site (p,q), then:
    NKILL[b] ← NKILL[b] ∩ NKILL[q]

Back to the example

Reduced control-flow graph of procedure FUNC (Ω = {A,B,N}), with edges THRU[1,3] = Ω, THRU[3,8] = {A,N}, THRU[1,8] = {B,N}:
  NKILL[8] = {A,B,N}
  NKILL[3] = {A,N} ∩ NKILL(INIT)
  NKILL[1] = {B,N} ∪ {A,N} = {A,B,N}

NKILL(FUNC) = {A,B,N}

Computing NKILL

    ComputeNKILL(p)
      for each b in the reduced graph, in reverse topological order
        if b is the exit node then
          NKILL[b] ← {all variables}
        else
          NKILL[b] ← {}
          for each successor s of b
            NKILL[b] ← NKILL[b] ∪ (NKILL[s] ∩ THRU[b,s])
          if b is a call site (p,q) then
            NKILL[b] ← NKILL[b] ∩ NKILL[q]
      NKILL[p] ← NKILL[entry node]
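The NKILL computation can be tried out on the FUNC example with a short Python sketch. The names `compute_nkill` and `callee_nkill`, and the convention that the callee's NKILL set is already expressed in the caller's variable names, are assumptions of this example rather than part of the slides.

```python
def compute_nkill(thru, order, exit_node, call_sites, callee_nkill, universe):
    """Propagate NKILL over a reduced control-flow graph.

    thru         : (b, s) -> THRU set on the reduced edge b -> s
    order        : reduced-graph nodes in reverse topological order (exit first)
    call_sites   : node -> name of the called procedure
    callee_nkill : procedure name -> NKILL set, in the caller's variable names
    """
    nkill = {}
    for b in order:
        if b == exit_node:
            nkill[b] = set(universe)
            continue
        nkill[b] = set()
        for (x, s), not_killed in thru.items():
            if x == b:
                nkill[b] |= nkill[s] & not_killed
        if b in call_sites:
            nkill[b] &= callee_nkill[call_sites[b]]
    return nkill


OMEGA = {"A", "B", "N"}
thru = {(1, 3): set(OMEGA), (3, 8): {"A", "N"}, (1, 8): {"B", "N"}}
order = [8, 3, 1]
calls = {3: "INIT"}

# First approximation: assume the callee kills nothing (NKILL(INIT) = Ω).
first = compute_nkill(thru, order, 8, calls, {"INIT": set(OMEGA)}, OMEGA)

# Refined: NKILL(INIT) = {Y,Z}, which corresponds to {B,N} at CALL INIT(A,B,N).
refined = compute_nkill(thru, order, 8, calls, {"INIT": {"B", "N"}}, OMEGA)
```

With these inputs, `first[1]` is {A,B,N} and `refined[1]` is {B,N}, matching the two values of NKILL(FUNC) derived in the example slides.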

…It's not over till it's over

• The algorithm described so far can compute NKILL(p) only if there are no reference formal parameters
• A more complicated algorithm takes reference formal parameters into account using a binding graph

Binding graph - reminder

Binding graph G_B = (N_B, E_B):
• One vertex for each formal parameter of each procedure
• A directed edge from formal parameter f1 of p to formal parameter f2 of q if there exists a call site s = (p,q) in p such that f1 is bound to f2

(Taken from Gabi Kliot's lecture)

Construct the binding graph and compute NKILL

• For each p, let NKILL0(p) be the result of applying NKILL with NKILL(q) = Ω for each successor q of p
• For each p, go over its formal parameters f:
  - If f is in NKILL0(p): for each formal parameter g, add an edge (f,g) in case parameter f is passed to g
  - If f is not in NKILL0(p): mark f as killed and add it to the worklist
• Go over the worklist: for each f, go over every g such that there is an edge (g,f) and g is not marked as killed
  - Call NKILL(q) (q is the procedure in which g is a formal)
  - If g is not in NKILL(q), mark g as killed and add it to the worklist

Back to the example

    1  SUBROUTINE FUNC(A,B,N)
         IF (A .GT. 0) THEN
    3      CALL INIT(A,B,N)
           B = B+N
         ELSE
           A = A+B
         ENDIF
    8  END

       SUBROUTINE INIT(X,Y,Z)
         X = Z*Y+1
       END

Binding graph: edges A → X, B → Y, N → Z
Worklist: X

NKILL0(FUNC) = {A,B,N}
NKILL0(INIT) = {Y,Z}

Reduced control-flow graph of FUNC (Ω = {A,B,N}):
  NKILL[8] = {A,B,N}
  NKILL[3] = {A,N} ∩ NKILL(INIT)
  NKILL[1] = {B,N} ∪ {A,N} = {A,B,N}

Back to the example

Binding graph: edges A → X, B → Y, N → Z
Worklist: (empty)

NKILL(FUNC) = {B,N}
NKILL(INIT) = {Y,Z}

Reduced control-flow graph of FUNC (Ω = {A,B,N}):
  NKILL[8] = {A,B,N}
  NKILL[3] = {A,N} ∩ {B,N} = {N}
  NKILL[1] = {B,N} ∪ {N} = {B,N}

Kill analysis - Complexity

• Constructing the reduced control-flow graph: O((Nc+Ec)V), where Nc and Ec are the numbers of vertices and edges in the control-flow graph and V is the number of variables
• Computing NKILL: O((Nr+Er)dV), where Nr and Er are the numbers of vertices and edges in the reduced control-flow graph, d is the maximum number of paths to the same node, and V is the number of variables
• Total number of NKILL updates: O(NB+EB), where NB and EB are the numbers of vertices and edges in the binding graph (call graph)
• The kill sets computed by this process do not take aliasing into account


Constant Propagation

• Propagating constants between procedures can cause significant improvements
• Dependence testing can be made more precise

    SUBROUTINE FOO(N)
      INTEGER N,M
      CALL INIT(M,N)
      DO I = 1,P
        B(M*I + 1) = 2*B(1)
      ENDDO
    END

    SUBROUTINE INIT(M,N)
      M = N
    END

If N = 0 on entry to FOO, there is a dependence. Otherwise, we can vectorize the loop.


Constant Propagation

A single-procedure constant propagation algorithm was presented already in section 4

We will extend this algorithm to solve interprocedural constant propagation

The single-procedure algorithm uses an iterative process on the definition-use graph

Definition-use graph: a graph that contains an edge from each definition point in the program to every possible use of the variable at run time

Constant Propagation

Replace all variables that have constant values (at a certain point) at runtime with those constant values.

(Taken from Harel Paz's lecture)

Constant propagation algorithm for a single procedure

• Starts with the set of all assignments that set a variable to a constant value
• Definition-use edges are used to find all inputs that the definition can reach
• The definition-use edges are tracked backwards to find all definitions that can reach a specific input
• If all definitions have the same constant value, the input is replaced with that value
• Otherwise it is non-constant

Example: given X=5 followed by Y=X+Z, the use of X is replaced by the constant, yielding Y=5+Z.

Interprocedural constant propagation algorithm

• We will extend this algorithm to solve interprocedural constant propagation
• Instead of a definition-use graph, we construct an interprocedural value graph
• Interprocedural value propagation graph: the vertices represent “jump functions” that compute values out of a given procedure from known values into a procedure

Constant Propagation - definitions

Let s = (p,q) be a call site in procedure p of procedure q, and let x be a parameter of q. Then:
• Jump function $J_s^x$ for x at s: gives the value of x in terms of the parameters of p
• Support of $J_s^x$: the set of inputs actually used in the evaluation of $J_s^x$

Example

    PROGRAM MAIN
      INTEGER A,B
      A = 1
      B = 2
      CALL S(A,B)
    END

    SUBROUTINE S(X,Y)
      INTEGER X,Y,Z,W
      Z = X + Y
      W = X - Y
      CALL T(Z,W)
    END

    SUBROUTINE T(U,V)
      PRINT U,V
    END

Jump functions:
  $J^X = \{1\}$
  $J^Y = \{2\}$
  $J^U = \{X + Y\}$
  $J^V = \{X - Y\}$

Interprocedural value graph

The construction of the interprocedural value graph:
• Add a node to the graph for each jump function $J_s^x$
• If x belongs to the support of $J_t^y$, where t lies in the procedure q, then add an edge between $J_s^x$ and $J_t^y$ for every call site s = (p,q) for some p

We can now apply the constant propagation algorithm to the interprocedural value graph

Example

Interprocedural value graph for the MAIN/S/T program above: nodes $J^X$, $J^Y$, $J^U$, $J^V$, with edges from $J^X$ and $J^Y$ to each of $J^U$ and $J^V$.

The constant-propagation algorithm will eventually converge to the following values:
  $J^X = 1$, $J^Y = 2$, $J^U = 3$, $J^V = -1$
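A small Python sketch of propagation over the jump functions of the MAIN/S/T example is given below. It is an illustration under simplifying assumptions: each procedure has a single call site, so no meet across call sites is needed, and the `TOP`/`BOTTOM` sentinels and the `(function, support)` encoding are choices made for this example only.

```python
TOP = object()     # value not yet known
BOTTOM = object()  # value known to be non-constant

# Each formal maps to (jump function, support). The support lists the
# caller's formals whose values the jump function actually reads.
jump = {
    "X": (lambda env: 1, []),                          # J^X = 1
    "Y": (lambda env: 2, []),                          # J^Y = 2
    "U": (lambda env: env["X"] + env["Y"], ["X", "Y"]),  # J^U = X + Y
    "V": (lambda env: env["X"] - env["Y"], ["X", "Y"]),  # J^V = X - Y
}


def propagate(jump):
    """Iterate the jump functions until no value changes."""
    val = {x: TOP for x in jump}
    changed = True
    while changed:
        changed = False
        for x, (fn, support) in jump.items():
            args = [val[s] for s in support]
            if any(a is BOTTOM for a in args):
                new = BOTTOM
            elif any(a is TOP for a in args):
                new = TOP
            else:
                new = fn(dict(zip(support, args)))
            if new != val[x]:
                val[x] = new
                changed = True
    return val


values = propagate(jump)
```

Here `values` ends up as X = 1, Y = 2, U = 3, V = -1, so both arguments of the PRINT statement in T could be replaced by constants.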

Return Jump Functions

    SUBROUTINE PROCESS(N,B)
      INTEGER N,B,I
      CALL INIT(I,N)
      CALL SOLVE(B,I)
    END

• To build a jump function at the second call site (CALL SOLVE), we need to know what action will be taken by subroutine INIT, invoked at the first call site
• Return jump function $R_p^x$: determines the value of x on return from an invocation of p in terms of the input parameters to p
• The support of $R_p^x$ is the same as the support of a forward jump function

Return Jump Functions

    PROGRAM MAIN
      INTEGER A
      CALL PROCESS(15,A)
      PRINT A
    END

    SUBROUTINE PROCESS(N,B)
      INTEGER N,B,I
      CALL INIT(I,N)
      CALL SOLVE(B,I)
    END

    SUBROUTINE INIT(X,Y)
      INTEGER X,Y
      X = 2*Y
    END

    SUBROUTINE SOLVE(C,T)
      INTEGER C,T
      C = T*10
    END

Can a constant be substituted for the variable A in the print statement?

Return Jump Functions

For the program above:
  $R_{INIT}^X = \{2 * Y\}$
  $R_{SOLVE}^C = \{T * 10\}$
  $J^N = 15$, $J^Y = N$
  $J^T = R_{INIT}^X(N)$ if $I \in MOD(s)$, where s is the call site of INIT; undefined otherwise
  $R_{PROCESS}^B = R_{SOLVE}^C(J^T(N))$ if $C \in MOD(s)$, where s is the call site of SOLVE; undefined otherwise

Therefore $R_{PROCESS}^B = (2*(15))*10 = 300$.
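The composition of return jump functions can be checked with a few lines of Python. The function names below are made up for this illustration; each one simply mirrors a jump function of the example, and the MOD side conditions are assumed to hold.

```python
def r_init_x(y):
    """R_INIT^X: value of X on return from INIT, in terms of Y."""
    return 2 * y


def r_solve_c(t):
    """R_SOLVE^C: value of C on return from SOLVE, in terms of T."""
    return t * 10


def r_process_b(n):
    """R_PROCESS^B: value of B on return from PROCESS, in terms of N."""
    i = r_init_x(n)      # CALL INIT(I,N): J^Y = N, so I = R_INIT^X(N)
    return r_solve_c(i)  # CALL SOLVE(B,I): J^T is the value of I


a = r_process_b(15)      # CALL PROCESS(15,A): J^N = 15
```

Under these assumptions `a` evaluates to 300, so the constant can be substituted for A in the PRINT statement.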


Symbolic Analysis

Prove facts about variables other than constancy:
• Find a symbolic expression for a variable in terms of other variables
• Establish a relationship between pairs of variables at some point in the program
• Establish a range of values for a variable at a given point


Symbolic Analysis

Example for symbolic expression or relationship between pairs analysis:

SUBROUTINE S(A,N,M)

REAL A(N+M)

INTEGER N, M

DO I = 1,N

A(I+M) = A(I) + B

ENDDO

END

If we could prove that N=M on entry to S, we could show that the loop within the subroutine carries no dependence


Symbolic Analysis

Example for range analysis:

SUBROUTINE S(A,N,K)

REAL A(0:N)

INTEGER N, K

DO I = 2,N

A(I) = A(I) + A(K)

ENDDO

END

If we can prove that K in [0:1] on entry to the subroutine, we can establish that the loop carries no dependence.

Symbolic Analysis

• Range analysis and symbolic evaluation can be solved using a lattice framework
• Jump functions and return jump functions return ranges
• The meet operation is now more complicated
• If we can bound the number of times the upper bound increases and the lower bound decreases, the finite-descending-chain property is satisfied

Range lattice (each row lies below the previous one):
  [-∞:60]   [50:∞]   [1:100]
  [-∞:100]   [1:∞]
  [-∞:∞]
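A minimal Python sketch of a meet operation on ranges follows, assuming the meet of two ranges is the smallest range containing both (which is consistent with the lattice fragment on the slide). The tuple encoding `(low, high)` and the name `meet` are assumptions of this example.

```python
import math


def meet(r1, r2):
    """Meet of two ranges: the smallest range that contains both."""
    return (min(r1[0], r2[0]), max(r1[1], r2[1]))


a = meet((-math.inf, 60), (1, 100))  # [-inf:60] meet [1:100]
b = meet((50, math.inf), (1, 100))   # [50:inf] meet [1:100]
c = meet(a, b)                       # bottom of this lattice fragment
```

Each meet can only lower the lower bound or raise the upper bound, which is why bounding the number of such changes gives the finite-descending-chain property.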


Array Section Analysis - example

    DIMENSION A(100,100)
    DO I = 1,N
      CALL SOURCE(A,I)
      CALL SINK(A,I)
    ENDDO

• We want to know if this loop carries a dependence.
• MOD and USE are not good enough

Let $M_A(I)$ be the set of locations in array A modified on iteration I and $U_A(I)$ the set of locations used on iteration I. Then the loop carries a true dependence iff:

  $M_A(I_1) \cap U_A(I_2) \neq \emptyset$ for some $1 \le I_1 < I_2 \le N$

Array Section Analysis

• We would like to use array versions of MOD and USE
• For that, we need to extend the standard data-flow algorithm, which works on vectors of bits, to vectors of more general lattice elements

Array Section Analysis

One possible lattice representation is sections of the form: A(I,L), A(I,*), A(*,L), A(*,*)

A(I,L) A(I,J) A(K,J)

A(I,*) A(*,J)

A(*,*)

T
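The meet on this simple section lattice can be sketched in Python as below. The encoding of a section as a tuple of subscript names, with `"*"` standing for a whole dimension, is an assumption made for illustration.

```python
STAR = "*"  # the whole extent of a dimension


def meet(s1, s2):
    """Meet of two sections: keep a subscript only where both agree."""
    return tuple(a if a == b else STAR for a, b in zip(s1, s2))


row = meet(("I", "L"), ("I", "J"))    # A(I,L) meet A(I,J)
col = meet(("I", "J"), ("K", "J"))    # A(I,J) meet A(K,J)
whole = meet(row, col)                # A(I,*) meet A(*,J)
```

Each meet either leaves a subscript unchanged or replaces it by `*`, so a chain of meets can change a section at most once per subscript position.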

Array Section Analysis

The depth of the lattice is now on the order of the number of array subscripts, and the meet operation is efficient

A better representation is one in which upper and lower bounds for each subscript are allowed

Interprocedural algorithms like MOD can be adapted to deal with vectors of lattice elements, when the depth of the lattice is limited


Call Graph Construction

    SUBROUTINE SUB1(X,Y,P)
      INTEGER X,Y
      CALL P(X,Y)
    END

What procedure names can be passed into P?

To solve this problem we must be able to determine, for each procedure parameter P, the names of procedures that may be passed to P

Call Graph Construction

A precise call graph construction algorithm must keep track of which pairs of procedure parameters may be simultaneously passed to the procedure formal parameters

    SUBROUTINE SUB2(X,P,Q)
      INTEGER X
      CALL P(X,Q)
    END

    CALL SUB2(X,P1,Q1)
    CALL SUB2(X,P2,Q2)

    SUBROUTINE P1(X,Q)
      INTEGER X
      CALL Q(X)
    END

Call Graph Construction

• PROCPARMS(p): a set of tuples of procedure names that may simultaneously be passed to p
• For each call site of p, we will add to PROCPARMS(p) the tuple of procedures that was passed to it
• For each call site of q inside p, we will add to PROCPARMS(q) the procedure names that were passed to it from the tuple

    SUBROUTINE P(pf1, pf2, pf3)
      ….
      CALL Q(pf2,pf3)
    END

Procedure ComputeProcParms

    for each procedure p: ProcParms[p] ← {}
    for each call site s = (p,q) passing in procedure names
      Let t = <N1,N2,…,Nk> be the procedure names passed in
      worklist ← worklist ∪ {<t,q>}
    while worklist isn’t empty
      remove <t = <N1,N2,…,Nk>, p> from worklist
      ProcParms[p] ← ProcParms[p] ∪ {t}
      Let <P1,P2,…,Pk> be the parameters bound to <N1,N2,…,Nk>
      for each call site (p,q) passing in some Pi
        Let u = <M1,M2,…,Mk> be the set of procedure names and instances of Pi passed into q
        if u is not in ProcParms[q] then
          worklist ← worklist ∪ {<u,q>}
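The worklist algorithm can be sketched in Python as follows. The program encoding (`procs` maps a name to its formals and call sites) and the use of `None` for argument positions that do not carry a procedure name are assumptions of this sketch. The bodies of P2, Q1, and Q2 are not given in the slides, so they are invented here: P2 is assumed to look like P1, and Q1 and Q2 are assumed to be leaf procedures.

```python
from collections import deque


def compute_proc_parms(procs):
    """procs: name -> (formals, calls); calls is a list of (callee, actuals)."""
    proc_parms = {p: set() for p in procs}
    work = deque()
    # Seed the worklist with call sites that pass literal procedure names.
    for formals, calls in procs.values():
        for callee, actuals in calls:
            if callee in procs:
                t = tuple(a if a in procs else None for a in actuals)
                if any(t):
                    work.append((t, callee))
    while work:
        t, p = work.popleft()
        if t in proc_parms[p]:
            continue
        proc_parms[p].add(t)
        formals, calls = procs[p]
        # Procedure-valued formals of p and the names bound to them by t.
        bound = {f: n for f, n in zip(formals, t) if n is not None}
        for callee, actuals in calls:
            target = bound.get(callee, callee)  # call through a parameter
            if target not in procs:
                continue
            u = tuple(bound.get(a, a if a in procs else None) for a in actuals)
            if any(u) and u not in proc_parms[target]:
                work.append((u, target))
    return proc_parms


procs = {
    "MAIN": ([], [("SUB2", ["X", "P1", "Q1"]), ("SUB2", ["X", "P2", "Q2"])]),
    "SUB2": (["X", "P", "Q"], [("P", ["X", "Q"])]),
    "P1": (["X", "Q"], [("Q", ["X"])]),
    "P2": (["X", "Q"], [("Q", ["X"])]),  # assumed body, not in the slides
    "Q1": (["X"], []),                    # assumed leaf procedure
    "Q2": (["X"], []),                    # assumed leaf procedure
}
proc_parms = compute_proc_parms(procs)
```

Because whole tuples are tracked rather than individual parameters, P1 is only ever paired with Q1 and P2 with Q2 in this example; the imprecise combinations (P1 with Q2, P2 with Q1) never appear.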

Today’s topics

• Additional Interprocedural problems
  - Kill Analysis
  - Constant propagation
  - Symbolic Analysis
  - Array section Analysis
  - Call graph construction
• Interprocedural Optimization
  - Inlining
  - Procedure Cloning
  - Hybrid optimizations
• Managing Whole-Program Compilation

Inline Substitution

Before inlining:

    SUBROUTINE MAIN
      REAL A(100)
      DO I = 1,N
        CALL PROCESS(A,I)
      ENDDO
    END

    SUBROUTINE PROCESS(X,K)
      REAL X(*)
      X(K) = X(K) + K
      RETURN
    END

After inlining:

    SUBROUTINE MAIN
      REAL A(100)
      DO I = 1,N
        A(I) = A(I) + I
      ENDDO
    END

Inline Substitution

• Advantages of inlining procedure calls:
  - Eliminates procedure call overhead
  - Allows more optimizations to take place; in the example, the loop can be vectorized
• However, overuse of inlining can cause a number of problems:
  - Slows down compilation
  - Changing a function forces global recompilation


Inline Substitution

Instead of systematic inlining, selective, goal-directed inlining is recommended: it uses global program analysis to determine when inlining would be profitable

Procedure Cloning

    PROCEDURE UPDATE(A,N,IS)
      REAL A(N)
      DO I = 1,N
        A(I*IS+1) = A(I*IS+1) + PI
      ENDDO
    END

• Often specific values of function parameters result in better optimizations
• If we know that IS != 0 at specific call sites, clone a vectorized version of the procedure and use it at those sites

Hybrid optimizations

• Combinations of these optimizations can be beneficial
• One example is loop embedding:

Before:

    DO I = 1,N
      CALL FOO()
    ENDDO

    PROCEDURE FOO()
      …
    END

After:

    CALL FOO()

    PROCEDURE FOO()
      DO I = 1,N
        …
      ENDDO
    END



Managing Whole-Program Compilation

In a conventional compilation system, the object code for any single procedure is a function only of the source code for that procedure

In an interprocedural compilation system, the object code for a procedure may depend on the source code for the entire program

Problem: users will be unhappy if a large program needs to be completely recompiled after every small change

Interprocedural Compilation Process

• Local analysis: once for each procedure
• Interprocedural analysis: once for the program
• Optimization: once for each procedure

If intermediate representations are saved, the local analysis phase will not need to be re-invoked for any unchanged procedures

Recompilation analysis

• Basic idea: every time the optimizer passes over a component, calculate information telling what other procedures must be rescanned
• Include a feedback loop that optimizes until no procedures are out of date

Recompilation analysis

Figure: components of the compilation system (Module Importer, Composition Editor, Program Compiler, Module Compiler) and the information passed between them (interprocedural sets for each procedure, list of procedures).

Summary

• Solution of flow-sensitive problems:
  - Kill analysis
  - Constant propagation
• Solutions to related problems such as symbolic analysis and array section analysis
• Other optimizations and ways to integrate these into whole-program compilation

ANY QUESTIONS?