Post on 19-Dec-2015
Seminar on Optimizations for Modern Architectures
“Optimizing Compilers for Modern Architectures”,
Allen and Kennedy, Chapter 11 - Section 11.2.5 to end
Presented by Li-Tal Mashiach
Interprocedural Analysis and Optimization
©Li-Tal Mashiach, Technion, 2006
Review
• Examples of interprocedural problems
• Classification of interprocedural problems
• We analyzed two interprocedural problems:
    • MOD Analysis
    • Alias Analysis
Today’s topics
• Additional interprocedural problems
    • Kill Analysis
    • Constant propagation
    • Symbolic Analysis
    • Array section Analysis
    • Call graph construction
• Interprocedural Optimization
• Managing Whole-Program Compilation
Kill Analysis
• Problems like MOD and ALIAS ask what might happen on some path
• It is often useful to ask what must happen on every path
• An assignment is said to “kill” the previous value of the variable
• The problem of discovering whether a variable is assigned on every path through a called procedure is known as KILL
Kill analysis - Motivation
    DO I = 1, N
    S0: CALL INIT(T,I)
        T = T + B(I)
        A(I) = A(I) + T
    ENDDO
• To parallelize the loop:
    • It must be possible to recognize that variable T can be made private to each iteration
    • This means that T is assigned before being used on every path through INIT
Taken from Gabi Kliot lecture
Kill analysis – Definitions (1)
• KILL(p) – the set of all variables that must be modified on every path through procedure p
• NKILL(p) = ¬KILL(p)
    • It is OK to overestimate NKILL
• THRU(b,c) – the set of all variables that are not killed on some path from basic block b to c
$$\mathrm{NKILL}(b) = \bigcup_{c \in \mathrm{succ}(b)} \big(\mathrm{THRU}(b,c) \cap \mathrm{NKILL}(c)\big)$$
Kill Analysis – Definitions (2)
• Reduced control-flow graph
    • Vertices consist of procedure entry, procedure exit, and call sites
    • Every edge (x,y) in the graph is annotated with the set THRU(x,y) of variables not killed on that edge
Solution Concept
First we will construct the reduced control-flow graph for each procedure
Next we will compute NKILL(p) for every procedure p in the program
Finally we will extend the algorithm to handle reference formal parameters by using the binding graph
Reduced Control Flow Graph
• Traverse the edges of the control-flow graph (without back edges), starting from the procedure entry node
• For each edge (b,s):
    • If s is a call site, go over its edges to its successors
    • If s is a normal node, merge s into b: for each successor t of s,
      THRU[b,t] ← THRU[b,t] ∪ (THRU[b,s] ∩ THRU[s,t])
[Figure: the path b → s → t with edge sets THRU[b,s] and THRU[s,t] is replaced by a single edge b → t annotated THRU[b,s] ∩ THRU[s,t]]
Reduced Control Flow Graph
    ComputeReducedCFG(G)
      remove back edges from G
      for each successor s of the entry node, add (entry, s) to worklist
      while worklist isn’t empty
        remove a ready element (b,s) from the worklist
        if s is a call site
          add (s,t) to worklist for each successor t of s
        otherwise if s isn’t the exit node
          for each successor t of s
            if THRU[b,t] undefined then THRU[b,t] ← {}
            THRU[b,t] ← THRU[b,t] ∪ (THRU[b,s] ∩ THRU[s,t])
    end
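The reduction can be sketched in Python. This is not the worklist formulation above but a simpler variant that, for each entry or call-site node, propagates surviving variables forward through normal nodes in topological order. The function name, the `kills` map (variables assigned in a node), and the CFG encoding of FUNC are assumptions made for illustration; `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def reduced_cfg(succs, kills, entry, exit_, call_sites, universe):
    """Return THRU sets between entry, call sites, and exit of an acyclic CFG.

    succs: node -> list of successors (back edges already removed).
    kills: node -> variables assigned in that node (assumed encoding).
    """
    preds = {}
    for x, ys in succs.items():
        preds.setdefault(x, set())
        for y in ys:
            preds.setdefault(y, set()).add(x)
    topo = list(TopologicalSorter(preds).static_order())
    stops = set(call_sites) | {exit_}
    thru = {}
    for anchor in [entry, *call_sites]:
        # reach[n]: variables not killed on some path from anchor to n
        reach = {anchor: set(universe)}
        for n in topo:
            if n not in reach:
                continue
            if n != anchor and n in stops:
                thru[(anchor, n)] = reach[n]   # edge of the reduced graph
                continue
            out = reach[n] - kills.get(n, set())
            for t in succs.get(n, ()):
                # union over paths, intersection along a path
                reach[t] = reach.get(t, set()) | out
    return thru


# Assumed CFG for FUNC: statement numbers are nodes, 3 is the call site.
succs = {1: [2], 2: [3, 5], 3: [4], 4: [7], 5: [6], 6: [7], 7: [8]}
kills = {4: {"B"}, 6: {"A"}}          # B=B+N kills B, A=A+B kills A
thru = reduced_cfg(succs, kills, 1, 8, [3], {"A", "B", "N"})
```

For this encoding the result has three edges: (1,3) with {A,B,N}, (3,8) with {A,N}, and (1,8) with {B,N}.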
Example
1 SUBROUTINE FUNC(A,B,N)
2 IF (A .GT. 0) THEN
3 CALL INIT(A,B,N)
4 B=B+N
5 ELSE
6 A = A+B
7 ENDIF
8 END
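Control-flow graph:
[Figure: control-flow graph whose nodes are the statement numbers 1 to 8; Ω = {A,B,N}]
Reduced control-flow graph:
[Figure: reduced graph with nodes 1 (entry), 3 (call site), and 8 (exit); edge 3→8 is annotated {A,N} and edge 1→8 is annotated {B,N}]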
Computing NKILL
• Assume that all variables in the program are global
• A simple iterative data-flow analysis algorithm can be used to compute NKILL
• Traverse the G_THRU graph in reverse topological order
    • For each successor s of b: NKILL[b] ← NKILL[b] ∪ (NKILL[s] ∩ THRU[b,s])
    • If b is a call site (p,q): NKILL[b] ← NKILL[b] ∩ NKILL[q]
Back to the example
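Reduced control-flow graph of procedure FUNC, with Ω = {A,B,N}:
[Figure: nodes 1 (entry), 3 (call site), 8 (exit); edge 1→3 annotated {A,B,N}, edge 3→8 annotated {A,N}, edge 1→8 annotated {B,N}]
• Node 3: {A,N} ∩ NKILL(INIT)
• Node 1: {B,N} ∪ {A,N} = {A,B,N}
• NKILL(FUNC) = {A,B,N}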
Computing NKILL
    Compute NKILL(p)
      for each b in reduced graph in reverse topological order
        if b is exit node then
          NKILL[b] ← {all variables}
        else
          NKILL[b] ← {}
          for each successor s of b
            NKILL[b] ← NKILL[b] ∪ (NKILL[s] ∩ THRU[b,s])
          if b is a call site (p,q) then
            NKILL[b] ← NKILL[b] ∩ NKILL[q]
      NKILL[p] ← NKILL[entry node]
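A minimal Python sketch of this procedure follows, applied to the reduced graph of FUNC. The function and argument names are hypothetical, the THRU annotations are the ones read off the example figure, and NKILL of the callee is assumed to be already expressed in the caller's variable names.

```python
def compute_nkill(order, succs, thru, entry, exit_, callee_of, nkill_of, universe):
    """order: reduced-graph nodes in reverse topological order.
    callee_of: call-site node -> callee name; nkill_of: callee -> NKILL set."""
    nkill = {}
    for b in order:
        if b == exit_:
            nkill[b] = set(universe)
        else:
            nkill[b] = set()
            for s in succs.get(b, ()):
                nkill[b] |= nkill[s] & thru[(b, s)]
            if b in callee_of:                 # b is a call site (p, q)
                nkill[b] &= nkill_of[callee_of[b]]
    return nkill[entry]


omega = {"A", "B", "N"}
succs = {1: [3, 8], 3: [8]}
thru = {(1, 3): {"A", "B", "N"}, (3, 8): {"A", "N"}, (1, 8): {"B", "N"}}

# Nothing known about INIT yet: NKILL(INIT) is taken as Ω.
nkill_unknown = compute_nkill([8, 3, 1], succs, thru, 1, 8,
                              {3: "INIT"}, {"INIT": omega}, omega)

# INIT known to kill its first argument: NKILL(INIT) maps to {B, N} in FUNC.
nkill_refined = compute_nkill([8, 3, 1], succs, thru, 1, 8,
                              {3: "INIT"}, {"INIT": {"B", "N"}}, omega)
```

The first call reproduces NKILL(FUNC) = {A,B,N}; the second gives {B,N}, the result once INIT is known to kill its first parameter.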
…It's not over till it's over
• The algorithm described so far can compute NKILL(p) only if there are no reference formal parameters
• A more complicated algorithm takes reference formal parameters into account using a binding graph
Binding graph - reminder
• Binding graph G_B = (N_B, E_B)
    • One vertex for each formal parameter of each procedure
    • A directed edge from formal parameter f1 of p to formal parameter f2 of q if there exists a call site s=(p,q) in p such that f1 is bound to f2
Taken from Gabi Kliot lecture
Construct the binding graph and compute NKILL
• For each p, let NKILL0(p) be the result of applying NKILL with NKILL(q) = Ω for each successor q of p
• For each p, go over its formal parameters f:
    • If f is in NKILL0(p): for each formal parameter g, add an edge (f,g) in case parameter f is passed to g
    • If f is not in NKILL0(p): mark f as killed and add it to the worklist
• Go over the worklist. For each f, go over every g such that there is an edge (g,f) and g is not marked as killed:
    • Call NKILL(q), where q is the procedure in which g is a formal parameter
    • If g is not in NKILL(q), mark g as killed and add it to the worklist
Back to the example
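    1 SUBROUTINE FUNC(A,B,N)
        IF (A .GT. 0) THEN
    3     CALL INIT(A,B,N)
          B = B+N
        ELSE
          A = A+B
        ENDIF
    8 END

      SUBROUTINE INIT(X,Y,Z)
        X = Z*Y+1
      END
[Figure: binding graph with edges A→X, B→Y, N→Z; worklist: X]
• NKILL0(FUNC) = {A,B,N}
• NKILL0(INIT) = {Y,Z}
[Figure: reduced control-flow graph of FUNC (Ω = {A,B,N}) with nodes 1, 3, 8 and edge sets {A,B,N}, {A,N}, {B,N}; node 3 is labeled {A,N} ∩ NKILL(INIT) and node 1 is labeled {B,N} ∪ {A,N} = {A,B,N}]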
Back to the example
[Figure: binding graph with edges A→X, B→Y, N→Z, and the worklist]
[Figure: reduced control-flow graph of FUNC (Ω = {A,B,N}) with nodes 1, 3, 8 and edge sets {A,B,N}, {A,N}, {B,N}]
• NKILL(INIT) = {Y,Z}
• Node 3: {A,N} ∩ {B,N} = {N}
• Node 1: {B,N} ∪ {N} = {B,N}
• NKILL(FUNC) = {B,N}
Kill analysis - Complexity
• Constructing the reduced control-flow graph: O((N_c + E_c)V), where N_c and E_c are the numbers of vertices and edges in the control-flow graph and V is the number of variables
• Computing NKILL: O((N_r + E_r)dV), where N_r and E_r are the numbers of vertices and edges in the reduced control-flow graph, d is the maximum number of paths to the same node, and V is the number of variables
• Total number of NKILL updates: O(N_B + E_B), where N_B and E_B are the numbers of vertices and edges in the binding graph (call graph)
• The kill sets computed by this process do not take aliasing into account
Constant Propagation
• Propagating constants between procedures can cause significant improvements
• Dependence testing can be made more precise
    SUBROUTINE FOO(N)
      INTEGER N,M
      CALL INIT(M,N)
      DO I = 1,P
        B(M*I + 1) = 2*B(1)
      ENDDO
    END

    SUBROUTINE INIT(M,N)
      M = N
    END
• If N = 0 on entry to FOO, there is a dependence. Otherwise, we can vectorize the loop.
Constant Propagation
A single-procedure constant propagation algorithm was presented already in section 4
We will extend this algorithm to solve interprocedural constant propagation
The single-procedure algorithm uses an iterative process on the definition-use graph
Definition-use graph - a graph that contains an edge from each definition point in the program to every possible use of the variable at run time
Constant Propagation
• Replace all variables that have constant values (at a certain point) at runtime with those constant values.
Taken from Harel Paz lecture
Constant propagation algorithm for single-procedure
Starts with a set of all assignments that set a variable to be a constant value
Definition-use edges are used to find all inputs that the definition can reach
The Definition-use edges are tracked backwards to find all definitions that can reach a specific input
If all definitions have the same constant value, the input is replaced with that value
Otherwise it is non-constant
[Figure: the definition X=5 reaches the use of X in Y=X+Z, which is rewritten to Y=5+Z]
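The replacement test described above (substitute only when every reaching definition carries the same constant) can be sketched in Python as a single pass. The data layout and all names are hypothetical, and the sketch does not iterate to propagate constants through newly folded expressions.

```python
def propagate_constants(def_value, reaching_defs):
    """def_value: definition id -> constant, or None if not a constant assignment.
    reaching_defs: use id -> ids of the definitions that can reach that use."""
    replacement = {}
    for use, defs in reaching_defs.items():
        values = {def_value[d] for d in defs}
        # Replace only when all reaching definitions agree on one constant.
        if len(values) == 1 and None not in values:
            replacement[use] = next(iter(values))
    return replacement


# Hypothetical program: X=5 is the only definition reaching X in Y=X+Z,
# while Z is reached by two different constant definitions.
def_value = {"X=5": 5, "Z=1": 1, "Z=2": 2}
reaching_defs = {"X in Y=X+Z": ["X=5"], "Z in Y=X+Z": ["Z=1", "Z=2"]}
replacement = propagate_constants(def_value, reaching_defs)
```

Here only the use of X is replaced (by 5); Z stays symbolic because its reaching definitions disagree.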
Constant propagation algorithm for interprocedure
We will extend this algorithm to solve interprocedural constant propagation
Instead of a Definition-Use graph, we construct an interprocedural value graph
Interprocedural value propagation graph: the vertices represent “jump functions” that compute values out of a given procedure from known values into a procedure
Constant Propagation - definitions
Let s = (p,q) be a call site in procedure p of procedure q, and let x be a parameter of q. Then:
• The jump function $J_s^x$ for x at s gives the value of x in terms of the parameters of p
• The support of $J_s^x$ is the set of inputs actually used in the evaluation of $J_s^x$
Example
    PROGRAM MAIN
      INTEGER A,B
      A = 1
      B = 2
      CALL S(A,B)
    END
    SUBROUTINE S(X,Y)
      INTEGER X,Y,Z,W
      Z = X + Y
      W = X - Y
      CALL T(Z,W)
    END
    SUBROUTINE T(U,V)
      PRINT U,V
    END
Jump functions:
$$J^X = \{1\}, \quad J^Y = \{2\}, \quad J^U = \{X + Y\}, \quad J^V = \{X - Y\}$$
Interprocedural value graph
• The construction of the interprocedural value graph:
    • Add a node to the graph for each jump function $J_s^x$
    • If x belongs to the support of $J_t^y$, where t lies in the procedure q, then add an edge between $J_s^x$ and $J_t^y$ for every call site s = (p,q) for some p
• We can now apply the constant propagation algorithm to the interprocedural value graph
Example (the program MAIN, S, T from the previous example)
[Figure: interprocedural value graph with nodes J^X, J^Y, J^U, J^V, annotated with the propagated values]
• $J^X = \{1\}$ evaluates to 1
• $J^Y = \{2\}$ evaluates to 2
• $J^U = \{X + Y\}$ evaluates to 3
• $J^V = \{X - Y\}$ evaluates to -1
• The constant-propagation algorithm will eventually converge to the above values
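As an illustration, the propagation over jump functions for this example can be sketched in Python with a three-level lattice (TOP for "no information yet", a constant, BOTTOM for "not constant"). The encoding of a jump function as a (support, function) pair and all names are assumptions; a production solver would use a worklist over the value-graph edges rather than repeated full sweeps.

```python
TOP, BOTTOM = object(), object()


def meet(a, b):
    if a is TOP:
        return b
    if b is TOP:
        return a
    if a is BOTTOM or b is BOTTOM or a != b:
        return BOTTOM
    return a


def solve(jump_functions):
    """jump_functions: formal -> list of (support, fn), one entry per call site."""
    value = {formal: TOP for formal in jump_functions}
    changed = True
    while changed:
        changed = False
        for formal, sites in jump_functions.items():
            new = TOP
            for support, fn in sites:
                args = [value[x] for x in support]
                if any(a is BOTTOM for a in args):
                    out = BOTTOM
                elif any(a is TOP for a in args):
                    out = TOP
                else:
                    out = fn(*args)
                new = meet(new, out)
            if new != value[formal]:
                value[formal] = new
                changed = True
    return value


# Jump functions of the MAIN / S / T example.
jump_functions = {
    "X": [((), lambda: 1)],
    "Y": [((), lambda: 2)],
    "U": [(("X", "Y"), lambda x, y: x + y)],
    "V": [(("X", "Y"), lambda x, y: x - y)],
}
values = solve(jump_functions)
```

The fixed point assigns X = 1, Y = 2, U = 3, and V = -1.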
Return Jump Functions
    SUBROUTINE PROCESS(N,B)
      INTEGER N,B,I
      CALL INIT(I,N)
      CALL SOLVE(B,I)
    END
• To build a jump function at the call site of SOLVE, we need to know what action will be taken by the subroutine INIT, invoked at the preceding call site
• Return jump function $R_p^x$: determines the value of x on return from an invocation of p in terms of the input parameters to p
• The support of $R_p^x$ is defined in the same way as the support of a forward jump function
Return Jump Functions
    PROGRAM MAIN
      INTEGER A
      CALL PROCESS(15,A)
      PRINT A
    END
    SUBROUTINE PROCESS(N,B)
      INTEGER N,B,I
      CALL INIT(I,N)
      CALL SOLVE(B,I)
    END
    SUBROUTINE INIT(X,Y)
      INTEGER X,Y
      X = 2*Y
    END
    SUBROUTINE SOLVE(C,T)
      INTEGER C,T
      C = T*10
    END
• Can a constant be substituted for the variable A in the PRINT statement?
Return Jump Functions (same program as on the previous slide)
$$R_{INIT}^{X} = \{2 * Y\}, \qquad R_{SOLVE}^{C} = \{T * 10\}$$
$$J^{N} = 15, \qquad J^{Y} = N$$
$$J^{T} = \begin{cases} R_{INIT}^{X}(N) & \text{if } I \in \mathrm{MOD}(\ ) \\ \text{undefined} & \text{otherwise} \end{cases}$$
$$R_{PROCESS}^{B} = \begin{cases} R_{SOLVE}^{C}(J^{T}(N)) & \text{if } C \in \mathrm{MOD}(\ ) \\ \text{undefined} & \text{otherwise} \end{cases}$$
$$R_{PROCESS}^{B} = (2 * (15)) * 10 = 300$$
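The composition of return jump functions in this example can be checked with a small Python sketch. Each function below is a hand-written stand-in for the corresponding jump function, and the MOD guards are omitted on the assumption that INIT modifies its first parameter and SOLVE modifies its first parameter, as the source code shows.

```python
def r_init_x(y):
    """R_INIT^X = {2*Y}: value of X on return from INIT."""
    return 2 * y


def r_solve_c(t):
    """R_SOLVE^C = {T*10}: value of C on return from SOLVE."""
    return t * 10


def r_process_b(n):
    """R_PROCESS^B: value of B on return from PROCESS, in terms of N."""
    i = r_init_x(n)        # J^T at CALL SOLVE(B,I) is the value INIT returns in I
    return r_solve_c(i)


a = r_process_b(15)        # J^N = 15 at CALL PROCESS(15,A)
```

Evaluating the chain gives A = 300, so the constant can be substituted in the PRINT statement.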
Symbolic Analysis
• Prove facts about variables other than constancy:
    • Find a symbolic expression for a variable in terms of other variables
    • Establish a relationship between pairs of variables at some point in the program
    • Establish a range of values for a variable at a given point
Symbolic Analysis
Example of symbolic-expression analysis or pairwise-relationship analysis:
SUBROUTINE S(A,N,M)
REAL A(N+M)
INTEGER N, M
DO I = 1,N
A(I+M) = A(I) + B
ENDDO
END
If we could prove that N=M on entry to S, we could show that the loop within the subroutine carries no dependence
Symbolic Analysis
Example of range analysis:
SUBROUTINE S(A,N,K)
REAL A(0:N)
INTEGER N, K
DO I = 2,N
A(I) = A(I) + A(K)
ENDDO
END
If we can prove that K in [0:1] on entry to the subroutine, we can establish that the loop carries no dependence.
Symbolic Analysis
• Range analysis and symbolic evaluation can be solved using a lattice framework
[Figure: range lattice with [-∞:60], [50:∞], and [1:100] on the top level; [-∞:100] and [1:∞] below them; [-∞:∞] at the bottom]
• Jump functions and return jump functions return ranges
• Meet operation is now more complicated
• If we can bound the number of times the upper bound increases and the lower bound decreases, the finite-descending-chain property is satisfied
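A range meet consistent with the lattice in the figure can be sketched in Python as the smallest range covering both operands. The second function is one hypothetical way to enforce the bound on how often each end may move; the budget mechanism and all names are assumptions, not something the slides specify.

```python
import math

INF = math.inf


def meet(a, b):
    """Meet of two ranges (lo, hi): the smallest range that covers both."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def meet_with_budget(old, new, budget):
    """Meet that lets each bound move only a limited number of times.

    budget: mutable dict {"lo": int, "hi": int} of remaining moves; once a
    budget is exhausted the corresponding bound jumps to infinity.
    """
    lo, hi = meet(old, new)
    if lo < old[0]:
        budget["lo"] -= 1
        if budget["lo"] < 0:
            lo = -INF
    if hi > old[1]:
        budget["hi"] -= 1
        if budget["hi"] < 0:
            hi = INF
    return (lo, hi)


left = meet((-INF, 60), (1, 100))
right = meet((50, INF), (1, 100))
bottom = meet(left, right)

budget = {"lo": 1, "hi": 1}
r = meet_with_budget((1, 10), (1, 20), budget)   # first upper-bound move is allowed
r = meet_with_budget(r, (1, 30), budget)         # second move exceeds the budget
```

The three plain meets reproduce the figure: [-∞:100], [1:∞], and [-∞:∞]. With a budget of one move per bound, the second increase of the upper bound sends it to ∞, which keeps every descending chain finite.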
Array Section Analysis - example
    DIMENSION A(100,100)
    DO I = 1,N
      CALL SOURCE(A,I)
      CALL SINK(A,I)
    ENDDO
• We want to know if this loop carries a dependence
• MOD and USE are not good enough
• Let $M_A(I)$ be the set of locations in array A modified on iteration I and $U_A(I)$ the set of locations used on iteration I. Then the loop carries a true dependence iff:
$$M_A(I_1) \cap U_A(I_2) \neq \emptyset, \quad 1 \le I_1 < I_2 \le N$$
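The condition can be made concrete with a brute-force Python check over explicit location sets. A compiler would compare symbolic sections instead of enumerating locations; the access patterns assumed for SOURCE and SINK below (whole columns of A) are hypothetical, since the slides do not show their bodies.

```python
def carries_true_dependence(mod, use, n):
    """mod(i), use(i): sets of locations modified / used on iteration i.
    True iff M_A(I1) and U_A(I2) intersect for some 1 <= I1 < I2 <= n."""
    return any(
        mod(i1) & use(i2)
        for i1 in range(1, n + 1)
        for i2 in range(i1 + 1, n + 1)
    )


ROWS = range(1, 101)


def column(j):
    """All locations A(1:100, j) as (row, column) pairs."""
    return {(r, j) for r in ROWS}


# SOURCE writes column I and SINK reads column I: no carried dependence.
same_column = carries_true_dependence(column, column, 4)

# SOURCE writes column I and SINK reads column I-1: carried true dependence.
previous_column = carries_true_dependence(column, lambda i: column(i - 1), 4)
```

Plain MOD and USE would only report that both calls touch A, which is why a per-iteration section such as "column I" is needed to separate the two cases.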
Array Section Analysis
• We would like to use array versions of MOD and USE
• For that, we need to extend the standard data-flow algorithm, which works on vectors of bits, to vectors of more general lattice elements
Array Section Analysis
• One possible lattice representation is sections of the form: A(I,L), A(I,*), A(*,L), A(*,*)
[Figure: lattice with ⊤ at the top; A(I,L), A(I,J), A(K,J) on the next level; A(I,*) and A(*,J) below them; A(*,*) at the bottom]
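The meet on this lattice can be sketched in Python by comparing sections subscript by subscript: equal subscripts are kept and differing ones widen to `*`. Representing a section as a tuple of subscript names and ⊤ as `None` is an assumption made for the sketch.

```python
TOP = None   # no section recorded yet


def meet(a, b):
    """Meet of two sections given as tuples of subscripts ('*' = whole dimension)."""
    if a is TOP:
        return b
    if b is TOP:
        return a
    return tuple(x if x == y else "*" for x, y in zip(a, b))


row_i = meet(("I", "L"), ("I", "J"))      # A(I,L) and A(I,J)
col_j = meet(("I", "J"), ("K", "J"))      # A(I,J) and A(K,J)
whole = meet(row_i, col_j)                # A(I,*) and A(*,J)
```

These three meets give A(I,*), A(*,J), and A(*,*), matching the levels of the lattice. Each subscript can widen to `*` at most once, so a chain is no longer than the number of subscripts plus the step down from ⊤.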
Array Section Analysis
• The depth of the lattice is now on the order of the number of array subscripts, and the meet operation is efficient
• A better representation is one in which upper and lower bounds for each subscript are allowed
• Interprocedural algorithms like MOD can be adapted to deal with vectors of lattice elements, when the depth of the lattice is limited
Call Graph Construction
    SUBROUTINE SUB1(X,Y,P)
      INTEGER X,Y
      CALL P(X,Y)
    END
• What procedure names can be passed into P?
• To solve this problem we must be able to determine, for each procedure parameter P, the names of procedures that may be passed to P
Call Graph Construction
A precise call graph construction algorithm must keep track of which pairs of procedure parameters may be simultaneously passed to the procedure formal parameters
    SUBROUTINE SUB2(X,P,Q)
      INTEGER X
      CALL P(X,Q)
    END

    CALL SUB2(X,P1,Q1)
    CALL SUB2(X,P2,Q2)

    SUBROUTINE P1(X,Q)
      INTEGER X
      CALL Q(X)
    END
Call Graph Construction
• PROCPARMS(p) – a set of tuples of procedure names that may simultaneously be passed to p
• For each call site of p, we will add to PROCPARMS(p) the tuple of procedures that was passed to it
• For each call site of q inside p, we will add to PROCPARMS(q) the procedure names that were passed to it from the tuple
    SUBROUTINE P(pf1, pf2, pf3)
      ….
      CALL Q(pf2,pf3)
    END
Procedure ComputeProcParms
    ProcParms(p) ← {} for each procedure p
    for each call site s = (p,q) passing in procedure names
      let t = <N1,N2,…,Nk> be the procedure names passed in
      worklist ← worklist ∪ {<t,q>}
    while worklist isn’t empty
      remove <t = <N1,N2,…,Nk>, p> from worklist
      ProcParms[p] ← ProcParms[p] ∪ {t}
      let <P1,P2,…,Pk> be the parameters bound to <N1,N2,…,Nk>
      for each call site (p,q) passing in some Pi
        let u = <M1,M2,…,Mk> be the set of procedure names and instances of Pi passed into q
        if u is not in ProcParms[q] then
          worklist ← worklist ∪ {<u,q>}
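A hedged Python sketch of this worklist, run on the SUB2 example from the earlier slide, is shown below. The program encoding is an assumption: `proc_formals` lists only the procedure-valued formals of each procedure, call sites list only procedure-valued arguments, and P2 is assumed to have the same shape as P1 because its body is not shown. The sketch also resolves calls made through a formal (such as CALL P(X,Q)) using the current binding.

```python
from collections import deque


def compute_proc_parms(proc_formals, call_sites):
    """proc_formals: procedure -> tuple of its procedure-valued formals.
    call_sites: caller -> list of (callee, args); callee and args are either
    concrete procedure names or formals of the caller."""
    proc_parms = {p: set() for p in proc_formals}
    worklist = deque()
    # Seed with call sites that pass only concrete procedure names.
    for caller, sites in call_sites.items():
        formals = proc_formals.get(caller, ())
        for callee, args in sites:
            if args and callee not in formals and not any(a in formals for a in args):
                worklist.append((args, callee))
    while worklist:
        t, p = worklist.popleft()
        if t in proc_parms.setdefault(p, set()):
            continue
        proc_parms[p].add(t)
        binding = dict(zip(proc_formals.get(p, ()), t))   # Pi -> Ni
        for callee, args in call_sites.get(p, ()):
            if callee not in binding and not any(a in binding for a in args):
                continue                                  # site passes no Pi
            q = binding.get(callee, callee)               # call through a formal
            u = tuple(binding.get(a, a) for a in args)
            if u and u not in proc_parms.get(q, set()):
                worklist.append((u, q))
    return proc_parms


proc_formals = {"MAIN": (), "SUB2": ("P", "Q"), "P1": ("Q",), "P2": ("Q",),
                "Q1": (), "Q2": ()}
call_sites = {
    "MAIN": [("SUB2", ("P1", "Q1")), ("SUB2", ("P2", "Q2"))],
    "SUB2": [("P", ("Q",))],      # CALL P(X,Q)
    "P1": [("Q", ())],            # CALL Q(X)
    "P2": [("Q", ())],            # assumed, body not shown in the slides
}
proc_parms = compute_proc_parms(proc_formals, call_sites)
```

Because whole tuples are tracked, P1 only ever receives Q1 and P2 only receives Q2; tracking each parameter separately would also report the infeasible pairs (P1,Q2) and (P2,Q1).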
Today’s topics
• Additional interprocedural problems
    • Kill Analysis
    • Constant propagation
    • Symbolic Analysis
    • Array section Analysis
    • Call graph construction
• Interprocedural Optimization
    • Inlining
    • Procedure Cloning
    • Hybrid optimizations
• Managing Whole-Program Compilation
Inline Substitution
    SUBROUTINE MAIN
      REAL A(100)
      DO I = 1,N
        CALL PROCESS(A,I)
      ENDDO
    END

    SUBROUTINE PROCESS(X,K)
      REAL X(*)
      X(K) = X(K) + K
      RETURN
    END
After inline substitution:
    SUBROUTINE MAIN
      REAL A(100)
      DO I = 1,N
        A(I) = A(I) + I
      ENDDO
    END
Inline Substitution
• Advantages of inlining procedure calls:
    • Eliminates procedure call overhead
    • Allows more optimizations to take place; in the example, the loop can be vectorized
• However, overuse of inlining can cause a number of problems:
    • Slows down compilation
    • Changing a function forces global recompilation
Inline Substitution
Instead of systematic inlining, selective, goal-directed inlining is recommended: it uses global program analysis to determine when inlining would be profitable
Procedure Cloning
    PROCEDURE UPDATE(A,N,IS)
      REAL A(N)
      DO I = 1,N
        A(I*IS+1) = A(I*IS+1) + PI
      ENDDO
    END
• Often specific values of function parameters result in better optimizations
    • If we know that IS != 0 at specific call sites, clone a vectorized version of the procedure and use it at those sites
Hybrid optimizations
• Combinations of optimizations can be beneficial
• One example is loop embedding:
Before:
    DO I = 1,N
      CALL FOO()
    ENDDO

    PROCEDURE FOO()
      …
    END
After:
    CALL FOO()

    PROCEDURE FOO()
      DO I = 1,N
        …
      ENDDO
    END
Managing Whole-Program Compilation
In a conventional compilation system, the object code for any single procedure is a function only of the source code for that procedure
In an interprocedural compilation system, the object code for a procedure may depend on the source code for the entire program
Problem: users will be unhappy if a large program needs to be completely recompiled after every small change
Interprocedural Compilation Process
[Figure: the three phases of the compilation process]
• Local Analysis: once for each procedure
• Interprocedural Analysis: once for the program
• Optimization: once for each procedure
• If intermediate representations are saved, the local analysis phase will not need to be re-invoked for any unchanged procedures
Recompilation analysis
• Basic idea: every time the optimizer passes over a component, calculate information telling what other procedures must be rescanned
• Include a feedback loop that optimizes until no procedures are out of date
Recompilation analysis
[Figure: recompilation system with the components Module Importer, Composition Editor, Program Compiler, and Module Compiler; data labels: interprocedural sets for each procedure, list of procedures]
Summary
• Solution of flow-sensitive problems:
    • Kill analysis
    • Constant propagation
• Solutions to related problems such as symbolic analysis and array section analysis
• Other optimizations and ways to integrate these into whole-program compilation
ANY QUESTIONS?