cs412/413

26
CS412/413 Introduction to Compilers and Translators April 12, 1999 Lecture 28: Loops and check removal

Upload: sebastian-mckenzie

Post on 31-Dec-2015

15 views

Category:

Documents


0 download

DESCRIPTION

CS412/413. Introduction to Compilers and Translators April 12, 1999 Lecture 28: Loops and check removal. Administration. Prelim 2 in 4 days PA 4 due April 28 Reading: Appel 17, Muchnick 7.1-7.4, 14.1, 17.4.3. Loops. Most execution time in most programs is spent in loops: 90/10 is typical - PowerPoint PPT Presentation

TRANSCRIPT

CS412/413

Introduction toCompilers and Translators

April 12, 1999

Lecture 28: Loops and check removal

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

2

Administration• Prelim 2 in 4 days• PA 4 due April 28• Reading: Appel 17, Muchnick

7.1-7.4, 14.1, 17.4.3

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

3

Loops• Most execution time in most

programs is spent in loops: 90/10 is typical

• Most important targets of optimization: loops

• Loop optimizations:– loop-invariant code motion– loop unrolling– loop peeling– strength reduction of expressions

containing induction variables– removal of bounds checks

• When to apply loop optimizations?

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

4

High-level optimization?• Loops may be hard to recognize in

IR or quadruple form -- should we apply loop optimizations to source code or high-level IR?– Many kinds of loops: while, do/while,

continue– loop optimizations benefit from other

IR-level optimizations and vice-versa -- want to be able to interleave

• Problem: identifying loops in call-flow graph

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

5

Definition of a loop• A loop is a set of nodes in the control

flow graph, with one distinguished node called the header.

• Every node is reachablefrom header, headerreachable from everynode: strongly-connectedcomponent

• No entering edges fromoutside except to header

• nodes with outgoingedges: loop exit nodes

header

loop exit

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

6

Nested loops• Control-flow graph may contain many

loops, and loops may contain each other

• Control-flow analysis : identify the loops and nesting structure

inner loop

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

7

Dominators• Notion of dominator will help us

do CFA• Node A dominates node B if the

only way to reach B from start node is through A

• Edge in call flow graph is aback edge if destinationdominates source

• A loop contains at least one back edge

1

2

543

back edge

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

8

Dominator tree• Domination is transitive; if A

dominates B and B dominates C, then A dominates C. A immediately dominates B if domination not implied transitively

• Every call-flow graph has dominator tree1

2

3 4

5 6

78

9 10

1

2

3 4

5 6

78

9 10

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

9

Finding dominators

• Goal: for every node in call-flow graph, find its set of dominators

• Properties of dominators:1. Every node dominates itself2. A node B is dominated by

another node A if A dominates all of the predecessors of B

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

10

Dominator data-flow analysis• Forward analysis

• A node B is dominated by another node A if A dominates all of the predecessors of B

in[n] = n’pred[n] out[n’]• Every node dominates itself:

out[n] = in[n] {n}• Formally: L = sets of nodes ordered by

, flow functions Fn(x) = x {n}, T = {all n} Standard iterative analysis works fine

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

11

Completing control-flow analysis

• Each back edge nh has an associated natural loop with h as its header: all nodes reachable from h that reach n without going through h

• For each back edge, find its natural loop

• Nest loops based on subsetrelationship between natural loops

• Exception: natural loops may sharesame header; merge them intolarger loop.

1

2

3 4

5 6

78

9 10

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

12

Loop-invariant hoisting• Idea: move computations that always give the

same result out of the loop: only compute once!• Hoisting quadruple q: t = a + b. Use reaching

definitions analysis to see if a, b are constants (conservatively)

• Must also ensure q is guaranteed to be executed by loop, q is only defn of t, t not live-in at h

h

q

h

q

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

13

Induction variables• Induction variables are

variables with value ai + b on the ith iteration of a natural loop

• Various optimizations can exploit information about induction variables:– strength reduction– array bounds check elimination– loop unrolling

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

14

Identifying induction variables• Basic induction variables: only

one definition of the form i = i + K• Derived induction variables:

one definition of the form j = i * M + N

j = 3;for (i = 0; i < n; i++) {

j = j +1;k = i*4 + 8;m = k*12 + 1;…

}

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

15

Strength reduction• Every derived induction variable k can

be written as a*i + b, a and b constants, i some basic induction variable

• For all distinct (a,b) pairs:– insert before loop header k’ = b– insert after loop header k’ = k’ + a– Replace definition of any k whose formula is

a*i + b with k = k’• Result: multiplication(s) replaced by single

addition• Additional optimizations facilitated:

copy/constant propagation, dead/useless variable elimination, dead code elimination

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

16

Loop unrolling• Loop unrolling: creates K

copies of loop in sequence

h

Useless unrolling:(K=2)

h

h

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

17

Using induction variables• When loop test expression

depends on induction variable (e.g. i < n), can use one loop test to ensure that entire unrolled loop will succeed (i+K-1 < n): remove all interior loop tests

• Additional loop is needed to “finish up” 0..K-1 iterations

h

h

Useful unrolling

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

18

Array bounds checks• Iota+: On every expression a[i] ,

must ensure i < length a, i 0 (i <u length a)

• Checking array bounds is expensive -- adds conditional jump to every array access expression

• Array indices are often induction variables -- can use induction variable information to avoid the bounds check entirely

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

19

Bounds checks• Can eliminate the bounds check if

we can prove at compile time that it will always succeed

i = 0;while (i < length a) {

a[i] = b[i];i++;

}

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

20

Rules• Given reference a[k] where k is an

induction variable with value a*i + b, must find a conditional test on some induction variable j– test terminates the loop– test dominates the reference to a[k]– test is against some loop invariant

such that provably k <u length a• When to perform optimization? AST? Need

domination analysis, other optimizations not done. Quadruples? Hard to recognize array element and length expressions reliably.

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

21

Null checks• Another costly operation: checking

for null pointers• Java, Iota+ : needed on every

– field access or assignment (except on this)

– method invocation (except on this)– array element access– string operation

• Idea: Once we’ve checked for null, shouldn’t need to check again

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

22

Exampleu = p.x + p.y t1 = p != 0 t1 = p != 0

if t1 goto L1 else L2 if t1 goto L1 else L2L2: abort L2: abortL1: ax = p + 4 L1: ax = p+4tx = M[ax] tx = M[ax]t2 = p != 0 CSE: t2 = t1 t2 = t1if t2 goto L3 else L4 goto L4L3: abort L3: abortL4: ay = p + 8 L4: ay = p + 8ty = M[ay] ty = M[ay]u = tx + ty u = tx + ty

BP: t1 = trueCP: if t1 goto ...

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

23

Boolean propagation• Propagated information is set of

(variable, value) pairs• (v, ?), (v, T), (v, F)• Doesn’t fit into standard

dataflow analysis model– different information leaves on

different out-edges of if statements– need to explicitly represent

information on each edge

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

24

Finishing optimization

t1 = p != 0 t1 = p != 0 CJUMP p != 0, L1if t1 goto L1 else L2 if t1 goto L1 else L2 ABORTL2: abort L2: abort L1: MOVE(u, M[p+4]L1: ax = p+4 L1: ax = p+4 +

M[p+8])tx = M[ax] tx = M[ax]t2 = t1goto L4L3: abortL4: ay = p + 8 ay = p + 8ty = M[ay] ty = M[ay]u = tx + ty u = tx + ty

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

25

Summary• Optimizing loop code is critical to good

performance• Loops can be identified automatically in

control-flow graph using dominator data-flow analysis; allows interleaving of loop optimizations

• Induction variables enable many loop optimizations: loop unrolling, strength reduction, array bounds checks.

• Avoiding array bounds checks, null checks is important for performance but not done often enough; one reason why Java is slow and C is tough to program in.

CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

26

Summary of Optimization• IR optimizations:

– Quadruple representation/control-flow graphs makes optimizations easier to express and check for safety

– Data-flow analysis– Control-flow analyses (loops &

traces)• Low-level optimizations:

– Graph-coloring register allocation– List scheduling of instructions