cs412/413

CS412/413

Introduction toCompilers and Translators

April 12, 1999

Lecture 28: Loops and check removal

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

2

Administration• Prelim 2 in 4 days• PA 4 due April 28• Reading: Appel 17, Muchnick

7.1-7.4, 14.1, 17.4.3


3

Loops• Most execution time in most

programs is spent in loops: 90/10 is typical

• Most important targets of optimization: loops

• Loop optimizations:– loop-invariant code motion– loop unrolling– loop peeling– strength reduction of expressions

containing induction variables– removal of bounds checks

• When to apply loop optimizations?


4

High-level optimization?• Loops may be hard to recognize in

IR or quadruple form -- should we apply loop optimizations to source code or high-level IR?– Many kinds of loops: while, do/while,

continue– loop optimizations benefit from other

IR-level optimizations and vice-versa -- want to be able to interleave

• Problem: identifying loops in call-flow graph


5

Definition of a loop• A loop is a set of nodes in the control

flow graph, with one distinguished node called the header.

• Every node is reachablefrom header, headerreachable from everynode: strongly-connectedcomponent

• No entering edges fromoutside except to header

• nodes with outgoingedges: loop exit nodes

header

loop exit


6

Nested loops• Control-flow graph may contain many

loops, and loops may contain each other

• Control-flow analysis : identify the loops and nesting structure

inner loop


7

Dominators• Notion of dominator will help us

do CFA• Node A dominates node B if the

only way to reach B from start node is through A

• Edge in call flow graph is aback edge if destinationdominates source

• A loop contains at least one back edge

1

2

543

back edge


8

Dominator tree• Domination is transitive; if A

dominates B and B dominates C, then A dominates C. A immediately dominates B if domination not implied transitively

• Every call-flow graph has dominator tree1

2

3 4

5 6

78

9 10

1

2

3 4

5 6

78

9 10


9

Finding dominators

• Goal: for every node in call-flow graph, find its set of dominators

• Properties of dominators:1. Every node dominates itself2. A node B is dominated by

another node A if A dominates all of the predecessors of B


10

Dominator data-flow analysis• Forward analysis

• A node B is dominated by another node A if A dominates all of the predecessors of B

in[n] = n’pred[n] out[n’]• Every node dominates itself:

out[n] = in[n] {n}• Formally: L = sets of nodes ordered by

, flow functions Fn(x) = x {n}, T = {all n} Standard iterative analysis works fine


11

Completing control-flow analysis

• Each back edge nh has an associated natural loop with h as its header: all nodes reachable from h that reach n without going through h

• For each back edge, find its natural loop

• Nest loops based on subsetrelationship between natural loops

• Exception: natural loops may sharesame header; merge them intolarger loop.

1

2

3 4

5 6

78

9 10


12

Loop-invariant hoisting• Idea: move computations that always give the

same result out of the loop: only compute once!• Hoisting quadruple q: t = a + b. Use reaching

definitions analysis to see if a, b are constants (conservatively)

• Must also ensure q is guaranteed to be executed by loop, q is only defn of t, t not live-in at h

h

q

h

q


13

Induction variables• Induction variables are

variables with value ai + b on the ith iteration of a natural loop

• Various optimizations can exploit information about induction variables:– strength reduction– array bounds check elimination– loop unrolling


14

Identifying induction variables• Basic induction variables: only

one definition of the form i = i + K• Derived induction variables:

one definition of the form j = i * M + N

j = 3;for (i = 0; i < n; i++) {

j = j +1;k = i*4 + 8;m = k*12 + 1;…

}


15

Strength reduction• Every derived induction variable k can

be written as a*i + b, a and b constants, i some basic induction variable

• For all distinct (a,b) pairs:– insert before loop header k’ = b– insert after loop header k’ = k’ + a– Replace definition of any k whose formula is

a*i + b with k = k’• Result: multiplication(s) replaced by single

addition• Additional optimizations facilitated:

copy/constant propagation, dead/useless variable elimination, dead code elimination


16

Loop unrolling• Loop unrolling: creates K

copies of loop in sequence

h

Useless unrolling:(K=2)

h

h


17

Using induction variables• When loop test expression

depends on induction variable (e.g. i < n), can use one loop test to ensure that entire unrolled loop will succeed (i+K-1 < n): remove all interior loop tests

• Additional loop is needed to “finish up” 0..K-1 iterations

h

h

Useful unrolling


18

Array bounds checks• Iota+: On every expression a[i] ,

must ensure i < length a, i 0 (i <u length a)

• Checking array bounds is expensive -- adds conditional jump to every array access expression

• Array indices are often induction variables -- can use induction variable information to avoid the bounds check entirely


19

Bounds checks• Can eliminate the bounds check if

we can prove at compile time that it will always succeed

i = 0;while (i < length a) {

a[i] = b[i];i++;

}


20

Rules• Given reference a[k] where k is an

induction variable with value a*i + b, must find a conditional test on some induction variable j– test terminates the loop– test dominates the reference to a[k]– test is against some loop invariant

such that provably k <u length a• When to perform optimization? AST? Need

domination analysis, other optimizations not done. Quadruples? Hard to recognize array element and length expressions reliably.


21

Null checks• Another costly operation: checking

for null pointers• Java, Iota+ : needed on every

– field access or assignment (except on this)

– method invocation (except on this)– array element access– string operation

• Idea: Once we’ve checked for null, shouldn’t need to check again


22

Exampleu = p.x + p.y t1 = p != 0 t1 = p != 0

if t1 goto L1 else L2 if t1 goto L1 else L2L2: abort L2: abortL1: ax = p + 4 L1: ax = p+4tx = M[ax] tx = M[ax]t2 = p != 0 CSE: t2 = t1 t2 = t1if t2 goto L3 else L4 goto L4L3: abort L3: abortL4: ay = p + 8 L4: ay = p + 8ty = M[ay] ty = M[ay]u = tx + ty u = tx + ty

BP: t1 = trueCP: if t1 goto ...


23

Boolean propagation• Propagated information is set of

(variable, value) pairs• (v, ?), (v, T), (v, F)• Doesn’t fit into standard

dataflow analysis model– different information leaves on

different out-edges of if statements– need to explicitly represent

information on each edge


24

Finishing optimization

t1 = p != 0 t1 = p != 0 CJUMP p != 0, L1if t1 goto L1 else L2 if t1 goto L1 else L2 ABORTL2: abort L2: abort L1: MOVE(u, M[p+4]L1: ax = p+4 L1: ax = p+4 +

M[p+8])tx = M[ax] tx = M[ax]t2 = t1goto L4L3: abortL4: ay = p + 8 ay = p + 8ty = M[ay] ty = M[ay]u = tx + ty u = tx + ty


25

Summary• Optimizing loop code is critical to good

performance• Loops can be identified automatically in

control-flow graph using dominator data-flow analysis; allows interleaving of loop optimizations

• Induction variables enable many loop optimizations: loop unrolling, strength reduction, array bounds checks.

• Avoiding array bounds checks, null checks is important for performance but not done often enough; one reason why Java is slow and C is tough to program in.


26

Summary of Optimization• IR optimizations:

– Quadruple representation/control-flow graphs makes optimizations easier to express and check for safety

– Data-flow analysis– Control-flow analyses (loops &

traces)• Low-level optimizations:

– Graph-coloring register allocation– List scheduling of instructions

cs412/413

Documents

distinguished node

control flow graph

flow graphcs

andrew myersdefinition

npredn outnevery node

kinds of loops

othercontrolflow analysis

loopa loop