CS412/413
Introduction to Compilers and Translators
April 12, 1999
Lecture 28: Loops and check removal
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
Administration
• Prelim 2 in 4 days
• PA 4 due April 28
• Reading: Appel 17, Muchnick 7.1-7.4, 14.1, 17.4.3
Loops
• Most execution time in most programs is spent in loops: 90/10 is typical
• Most important targets of optimization: loops
• Loop optimizations:
  – loop-invariant code motion
  – loop unrolling
  – loop peeling
  – strength reduction of expressions containing induction variables
  – removal of bounds checks
• When to apply loop optimizations?
High-level optimization?
• Loops may be hard to recognize in IR or quadruple form. Should we apply loop optimizations to source code or high-level IR?
  – Many kinds of loops: while, do/while, continue
  – Loop optimizations benefit from other IR-level optimizations and vice versa; we want to be able to interleave them
• Problem: identifying loops in the control-flow graph
Definition of a loop
• A loop is a set of nodes in the control-flow graph, with one distinguished node called the header
• Every node is reachable from the header, and the header is reachable from every node: a strongly connected component
• No entering edges from outside except to the header
• Nodes with outgoing edges that leave the loop: loop exit nodes
[Figure: a loop in a control-flow graph, with the header and a loop exit labeled]
Nested loops
• Control-flow graph may contain many loops, and loops may contain each other
• Control-flow analysis: identify the loops and nesting structure
[Figure: control-flow graph with an inner loop labeled]
Dominators
• Notion of dominator will help us do CFA (control-flow analysis)
• Node A dominates node B if the only way to reach B from the start node is through A
• An edge in the control-flow graph is a back edge if its destination dominates its source
• A loop contains at least one back edge
[Figure: control-flow graph with nodes 1-5 and a back edge labeled]
Dominator tree
• Domination is transitive: if A dominates B and B dominates C, then A dominates C
• A immediately dominates B if A dominates B and the domination is not implied transitively
• Every control-flow graph has a dominator tree
[Figure: control-flow graph with nodes 1-10, shown next to its dominator tree]
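As a rough illustration of "immediately dominates", the sketch below derives each node's immediate dominator from already-computed dominator sets. The function name `immediate_dominators` and the sample sets are hypothetical, not from the lecture. It relies on the fact that the strict dominators of a node form a chain, so the closest one is the one with the largest dominator set of its own.

```python
def immediate_dominators(dom):
    """Map each node to its immediate dominator.

    dom: {node: set of nodes that dominate it, including the node itself}.
    The start node has no strict dominator and is left out of the result.
    """
    idom = {}
    for n, ds in dom.items():
        strict = ds - {n}
        if strict:
            # Strict dominators form a chain; the deepest one (largest own
            # dominator set) is the one not implied transitively.
            idom[n] = max(strict, key=lambda d: len(dom[d]))
    return idom


# Hypothetical dominator sets for a small six-node graph.
example_dom = {
    1: {1},
    2: {1, 2},
    3: {1, 2, 3},
    4: {1, 2, 4},
    5: {1, 2, 5},
    6: {1, 2, 5, 6},
}
tree_edges = immediate_dominators(example_dom)
```

Each `(child, parent)` pair in `tree_edges` is one edge of the dominator tree.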
Finding dominators
• Goal: for every node in the control-flow graph, find its set of dominators
• Properties of dominators:
  1. Every node dominates itself
  2. A node B is dominated by another node A if A dominates all of the predecessors of B
Dominator data-flow analysis
• Forward analysis
• A node B is dominated by another node A if A dominates all of the predecessors of B:
  $in[n] = \bigcap_{n' \in pred[n]} out[n']$
• Every node dominates itself:
  $out[n] = in[n] \cup \{n\}$
• Formally: $L$ = sets of nodes ordered by $\subseteq$, flow functions $F_n(x) = x \cup \{n\}$, $\top = \{\text{all } n\}$
• Standard iterative analysis works fine
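A minimal sketch of the iterative analysis, assuming the control-flow graph is given as a dictionary from each node to its list of successors (every node must appear as a key). The names `dominators`, `back_edges`, and the sample graph `cfg` are illustrative only. Every node except the start node begins at top (all nodes), and the two equations are applied until nothing changes.

```python
def dominators(succ, start):
    """Return {node: set of dominators} for a CFG {node: [successors]}."""
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)

    # Initialise to top (all nodes), except the start node.
    dom = {n: set(nodes) for n in nodes}
    dom[start] = {start}

    changed = True
    while changed:
        changed = False
        for n in nodes - {start}:
            # in[n] = intersection of out[p] over all predecessors p
            incoming = [dom[p] for p in pred[n]]
            new = set.intersection(*incoming) if incoming else set()
            # out[n] = in[n] union {n}
            new = new | {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def back_edges(succ, dom):
    """Edges n -> h whose destination h dominates their source n."""
    return [(n, h) for n in succ for h in succ[n] if h in dom[n]]


# Hypothetical CFG: a diamond inside a loop with header 2.
cfg = {1: [2], 2: [3, 4], 3: [5], 4: [5], 5: [2, 6], 6: []}
dom = dominators(cfg, 1)
edges = back_edges(cfg, dom)
```

Here `edges` contains the single back edge from node 5 to node 2, because 2 is in the dominator set of 5.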
Completing control-flow analysis
• Each back edge n → h has an associated natural loop with h as its header: all nodes reachable from h that reach n without going through h
• For each back edge, find its natural loop
• Nest loops based on the subset relationship between natural loops
• Exception: natural loops may share the same header; merge them into a larger loop
[Figure: control-flow graph with nodes 1-10]
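One common way to find the natural loop of a back edge n → h is to walk predecessors backwards from n and stop at h. The sketch below does that; the function name `natural_loop` and the sample graph are assumptions made for illustration, and the back edges are taken as given rather than computed.

```python
def natural_loop(succ, n, h):
    """Nodes of the natural loop of back edge n -> h in CFG {node: [successors]}."""
    pred = {}
    for src, targets in succ.items():
        for t in targets:
            pred.setdefault(t, set()).add(src)

    loop = {h}            # the header is always in the loop and blocks the walk
    worklist = [n]
    while worklist:
        x = worklist.pop()
        if x not in loop:
            loop.add(x)
            worklist.extend(pred.get(x, ()))
    return loop


# Hypothetical CFG with an inner loop (back edge 4 -> 3)
# nested inside an outer loop (back edge 5 -> 2).
nested_cfg = {1: [2], 2: [3], 3: [4], 4: [3, 5], 5: [2, 6], 6: []}
inner = natural_loop(nested_cfg, 4, 3)
outer = natural_loop(nested_cfg, 5, 2)
is_nested = inner < outer   # proper subset: the inner loop nests in the outer one
```

The subset test on the last line is the nesting criterion from the slide.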
Loop-invariant hoisting
• Idea: move computations that always give the same result out of the loop: only compute once!
• Hoisting quadruple q: t = a + b. Use reaching definitions analysis to see whether a and b are constant throughout the loop (a conservative test)
• Must also ensure q is guaranteed to be executed by the loop, q is the only definition of t, and t is not live-in at h
[Figure: loop with header h containing q, before and after q is hoisted out of the loop]
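The effect of hoisting can be seen on a small source-level example. This is only a sketch: the functions `before_hoisting` and `after_hoisting` are made up for illustration, and a real compiler would perform the move on quadruples after checking the conditions listed above.

```python
def before_hoisting(a, b, n):
    total = 0
    for i in range(n):
        t = a + b          # q: recomputed on every iteration, same value each time
        total += t * i
    return total


def after_hoisting(a, b, n):
    total = 0
    t = a + b              # q moved in front of the loop header: computed once
    for i in range(n):
        total += t * i
    return total
```

Note that the hoisted version evaluates `a + b` even when the loop runs zero times. That is harmless for a side-effect-free addition, but it is the reason for the "guaranteed to be executed" condition in general.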
Induction variables
• Induction variables are variables with value a*i + b on the i-th iteration of a natural loop
• Various optimizations can exploit information about induction variables:
  – strength reduction
  – array bounds check elimination
  – loop unrolling
Identifying induction variables
• Basic induction variables: only one definition, of the form i = i + K
• Derived induction variables: one definition, of the form j = i * M + N
Example:
  j = 3;
  for (i = 0; i < n; i++) {
    j = j + 1;
    k = i*4 + 8;
    m = k*12 + 1;
    …
  }
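Working the example through may help. Assume i counts iterations from 0 and each variable is read just after its assignment in the loop body. Then i and j are basic induction variables, and k and m are derived ones with these closed forms:

```latex
\begin{align*}
j &= 3 + (i + 1) = 1 \cdot i + 4 \\
k &= 4i + 8 \\
m &= 12k + 1 = 12(4i + 8) + 1 = 48i + 97
\end{align*}
```

So m, although defined in terms of k, is still of the form a*i + b with a = 48 and b = 97.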
Strength reduction
• Every derived induction variable k can be written as a*i + b, where a and b are constants and i is some basic induction variable
• For all distinct (a, b) pairs:
  – insert before the loop header: k’ = b (assuming i starts at 0)
  – insert in the loop, right after the increment of i: k’ = k’ + a (assuming i steps by 1)
  – replace the definition of any k whose formula is a*i + b with k = k’
• Result: multiplication(s) replaced by a single addition
• Additional optimizations facilitated: copy/constant propagation, dead/useless variable elimination, dead code elimination
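A small before/after sketch of the transformation for k = i*4 + 8, that is, a = 4 and b = 8. The function names are hypothetical, and the sketch assumes i starts at 0 and steps by 1, so the running value `k_prime` is bumped by a alongside each increment of i.

```python
def with_multiplication(n):
    out = []
    for i in range(n):
        k = i * 4 + 8          # one multiply and one add per iteration
        out.append(k)
    return out


def strength_reduced(n):
    out = []
    k_prime = 8                # k' = b, set before the loop header
    for i in range(n):
        k = k_prime            # definition of k replaced by a copy
        out.append(k)
        k_prime += 4           # k' = k' + a, kept in step with i
    return out
```

After this rewrite, copy propagation can replace uses of `k` with `k_prime`, which is one of the follow-on optimizations the slide mentions.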
Loop unrolling
• Loop unrolling: creates K copies of the loop in sequence
[Figure: useless unrolling (K=2): the loop with header h becomes two copies of h in sequence]
Using induction variables
• When the loop test expression depends on an induction variable (e.g. i < n), can use one loop test to ensure that the entire unrolled loop will succeed (i+K-1 < n): remove all interior loop tests
• An additional loop is needed to “finish up” the remaining 0..K-1 iterations
[Figure: useful unrolling of the loop with header h]
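A sketch of useful unrolling with K = 4, written at source level for readability. The function `sum_unrolled` is a made-up example. The single test i + K - 1 < n guards all four copies of the body, and a second loop handles the leftover iterations.

```python
def sum_unrolled(xs):
    K = 4
    n = len(xs)
    total = 0
    i = 0
    # Main loop: one test covers K copies of the body, no interior tests.
    while i + K - 1 < n:
        total += xs[i]
        total += xs[i + 1]
        total += xs[i + 2]
        total += xs[i + 3]
        i += K
    # Finish-up loop: the remaining 0..K-1 iterations.
    while i < n:
        total += xs[i]
        i += 1
    return total
```

The main loop performs one comparison per four elements instead of one per element, which is where the saving comes from.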
Array bounds checks
• Iota+: on every expression a[i], must ensure i < length a and i ≥ 0 (both tests combine into one unsigned comparison, i <u length a)
• Checking array bounds is expensive: adds a conditional jump to every array access expression
• Array indices are often induction variables: can use induction variable information to avoid the bounds check entirely
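The unsigned-comparison trick can be simulated as follows. The sketch assumes 32-bit two's-complement indices and a non-negative length below 2^31; the function names are illustrative. Masking with 0xFFFFFFFF reinterprets a negative index as a very large unsigned number, so a single comparison rejects it.

```python
def in_bounds_two_tests(i, length):
    return 0 <= i < length


def in_bounds_unsigned(i, length):
    # Treat the 32-bit pattern of i as unsigned: negative values become
    # numbers of at least 2**31 and therefore fail the comparison.
    return (i & 0xFFFFFFFF) < length
```

Both functions agree for every 32-bit index under the stated assumptions, but the second needs only one conditional jump.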
Bounds checks
• Can eliminate the bounds check if we can prove at compile time that it will always succeed
  i = 0;
  while (i < length a) {
    a[i] = b[i];
    i++;
  }
Rules
• Given reference a[k] where k is an induction variable with value a*i + b, must find a conditional test on some induction variable j such that provably k <u length a, where:
  – test terminates the loop
  – test dominates the reference to a[k]
  – test is against some loop invariant
• When to perform optimization?
  – AST? Need domination analysis, and other optimizations have not been done yet.
  – Quadruples? Hard to recognize array element and length expressions reliably.
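To make the rules concrete, the sketch below simulates the explicit checks a naive compiler would emit for an array copy loop, then removes the one the rules justify. The function names and the check counter are illustrative devices, not part of the lecture. The loop test i < len(a) terminates the loop, dominates the store to a[i], and compares against the loop-invariant len(a), so the check on a[i] can go. Nothing relates i to len(b), so that check is kept.

```python
def copy_checked(a, b):
    """Every array access guarded; returns how many checks ran."""
    checks = 0
    i = 0
    while i < len(a):
        checks += 1
        if not (0 <= i < len(b)):      # check for b[i]
            raise IndexError(i)
        value = b[i]
        checks += 1
        if not (0 <= i < len(a)):      # check for a[i]
            raise IndexError(i)
        a[i] = value
        i += 1
    return checks


def copy_optimized(a, b):
    """Check on a[i] removed; the check on b[i] is not implied, so it stays."""
    checks = 0
    i = 0                              # i starts at 0 and only increases
    while i < len(a):                  # this test already proves i < len(a)
        checks += 1
        if not (0 <= i < len(b)):
            raise IndexError(i)
        a[i] = b[i]
        i += 1
    return checks
```

For a four-element copy, the first version runs eight checks and the second runs four, with identical results.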
Null checks
• Another costly operation: checking for null pointers
• Java, Iota+: needed on every
  – field access or assignment (except on this)
  – method invocation (except on this)
  – array element access
  – string operation
• Idea: once we’ve checked for null, shouldn’t need to check again
Example
u = p.x + p.y

Before                       After
t1 = p != 0                  t1 = p != 0
if t1 goto L1 else L2        if t1 goto L1 else L2
L2: abort                    L2: abort
L1: ax = p + 4               L1: ax = p + 4
tx = M[ax]                   tx = M[ax]
t2 = p != 0                  t2 = t1                  (CSE: t2 = t1)
if t2 goto L4 else L3        goto L4                  (CP: if t1 goto ...; BP: t1 = true)
L3: abort                    L3: abort
L4: ay = p + 8               L4: ay = p + 8
ty = M[ay]                   ty = M[ay]
u = tx + ty                  u = tx + ty
Boolean propagation
• Propagated information is a set of (variable, value) pairs
• (v, ?), (v, T), (v, F)
• Doesn’t fit into the standard dataflow analysis model
  – different information leaves on different out-edges of if statements
  – need to explicitly represent information on each edge
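A much-simplified sketch of the idea, limited to one conditional jump and one straight-line block, with no merging at join points. The tuple encoding of quadruples and the names `edge_facts` and `simplify_block` are assumptions for illustration. A variable missing from the facts dictionary stands for (v, ?). Each out-edge of a conditional jump gets its own facts, and a later jump on a variable with a known value becomes unconditional.

```python
def edge_facts(facts, cond_var):
    """Facts on the (true, false) out-edges of a conditional jump on cond_var."""
    on_true = dict(facts)
    on_true[cond_var] = True
    on_false = dict(facts)
    on_false[cond_var] = False
    return on_true, on_false


def simplify_block(block, facts):
    """Rewrite a straight-line block using known boolean values.

    Quadruple encoding (hypothetical):
      ("copy", dst, src)
      ("cjump", var, label_if_true, label_if_false)
      (opname, dst, ...)   any other operation defining dst
    """
    facts = dict(facts)
    out = []
    for op in block:
        if op[0] == "cjump":
            _, var, if_true, if_false = op
            if var in facts:
                out.append(("jump", if_true if facts[var] else if_false))
            else:
                out.append(op)
        elif op[0] == "copy":
            _, dst, src = op
            if src in facts:
                facts[dst] = facts[src]    # known value flows through the copy
            else:
                facts.pop(dst, None)
            out.append(op)
        else:
            facts.pop(op[1], None)         # dst is redefined: value now unknown
            out.append(op)
    return out


# Block L1 of the null-check example, entered on the true edge of "if t1 ...".
facts_at_L1, facts_at_L2 = edge_facts({}, "t1")
block_L1 = [
    ("add", "ax", "p", 4),
    ("load", "tx", "ax"),
    ("copy", "t2", "t1"),
    ("cjump", "t2", "L4", "L3"),
]
simplified = simplify_block(block_L1, facts_at_L1)
```

In `simplified`, the final conditional jump has become an unconditional jump to L4, which leaves the abort at L3 unreachable.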
Finishing optimization

t1 = p != 0              t1 = p != 0              CJUMP p != 0, L1
if t1 goto L1 else L2    if t1 goto L1 else L2    ABORT
L2: abort                L2: abort                L1: MOVE(u, M[p+4] + M[p+8])
L1: ax = p+4             L1: ax = p+4
tx = M[ax]               tx = M[ax]
t2 = t1
goto L4
L3: abort
L4: ay = p + 8           ay = p + 8
ty = M[ay]               ty = M[ay]
u = tx + ty              u = tx + ty
Summary
• Optimizing loop code is critical to good performance
• Loops can be identified automatically in the control-flow graph using dominator data-flow analysis; this allows loop optimizations to be interleaved with other IR-level optimizations
• Induction variables enable many loop optimizations: loop unrolling, strength reduction, array bounds check elimination
• Avoiding array bounds checks and null checks is important for performance but not done often enough; one reason why Java is slow and C is tough to program in
Summary of Optimization
• IR optimizations:
  – Quadruple representation/control-flow graphs make optimizations easier to express and check for safety
  – Data-flow analysis
  – Control-flow analyses (loops & traces)
• Low-level optimizations:
  – Graph-coloring register allocation
  – List scheduling of instructions