Chuen-Liang Chen, NTUCS&IE / OPTIMIZATION Chuen-Liang Chen Department of Computer Science and Information Engineering National Taiwan University Taipei, TAIWAN


Post on 14-Dec-2015


Chuen-Liang Chen, NTUCS&IE / 1

OPTIMIZATION

Chuen-Liang Chen

Department of Computer Science

and Information Engineering

National Taiwan University

Taipei, TAIWAN

Introduction

local optimization
  within a basic block
  may be performed together with code generation
  e.g., peephole optimization

global optimization
  spans more than one basic block
  e.g., loop optimization
  data flow analysis (a technique)

Peephole optimization (1/2)

modify particular patterns within a small window (the peephole; 2-3 instructions)
may be applied to intermediate or target code

constant folding (evaluate constant expressions in advance)
  ( +, Lit1, Lit2, Result )  =>  ( :=, Lit1+Lit2, Result )
  ( :=, Lit1, Result1 ), ( +, Lit2, Result1, Result2 )  =>  ( :=, Lit1, Result1 ), ( :=, Lit1+Lit2, Result2 )

strength reduction (replace slow operations with faster equivalents)
  ( *, Operand, 2, Result )  =>  ( ShiftLeft, Operand, 1, Result )
  ( *, Operand, 4, Result )  =>  ( ShiftLeft, Operand, 2, Result )

null sequences (delete useless operations)
  ( +, Operand, 0, Result )  =>  ( :=, Operand, Result )
  ( *, Operand, 1, Result )  =>  ( :=, Operand, Result )
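The three rewrite rules on this slide can be sketched as a single-quadruple rewriter. This is only an illustration: the tuple layout `(op, a, b, result)`, the function name `peephole`, and the use of Python integers for literals are assumptions, not part of the slides.

```python
def peephole(quad):
    """Rewrite one quadruple (op, a, b, result) with three peephole rules."""
    op, a, b, result = quad
    is_lit = lambda x: isinstance(x, int)   # assumption: literals are ints

    # constant folding: both operands are literals
    if op == "+" and is_lit(a) and is_lit(b):
        return (":=", a + b, result)
    # null sequences: x + 0 and x * 1 are plain copies
    if (op == "+" and b == 0) or (op == "*" and b == 1):
        return (":=", a, result)
    # strength reduction: multiplying by a power of two becomes a shift
    if op == "*" and is_lit(b) and b > 1 and (b & (b - 1)) == 0:
        return ("ShiftLeft", a, b.bit_length() - 1, result)
    return quad


code = [
    ("+", 2, 3, "t1"),
    ("*", "x", 4, "t2"),
    ("+", "y", 0, "t3"),
    ("*", "z", 1, "t4"),
]
optimized = [peephole(q) for q in code]
```

A real peephole optimizer would also look at two or three adjacent quadruples at once, as the second constant-folding pattern requires; the sketch handles only single-instruction windows.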

Peephole optimization (2/2)

combine operations (replace several operations with one equivalent)
  Load A, Rj; Load A+1, Rj+1  =>  DoubleLoad A, Rj
  BranchZero L1, R1; Branch L2; L1:  =>  BranchNotZero L2, R1
  Subtract #1, R1; BranchZero L1, R1  =>  SubtractOneBranch L1, R1

algebraic laws (use algebraic laws to simplify or reorder instructions)
  ( +, Lit, Operand, Result )  =>  ( +, Operand, Lit, Result )
  ( -, 0, Operand, Result )  =>  ( Negate, Operand, Result )

special case instructions (use instructions designed for special operand cases)
  Subtract #1, R1  =>  Decrement R1
  Add #1, R1  =>  Increment R1
  Load #0, R1; Store A, R1  =>  Clear A

address mode operations (use address modes to simplify code)
  Load A, R1; Add 0(R1), R2  =>  Add @A, R2
  Subtract #2, R1; Clear 0(R1)  =>  Clear -(R1)
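A two-instruction window over target code might look like the following sketch, which covers one "combine operations" rule and the three "special case instructions" rules. The tuple encoding `(opcode, operand, register)` and the name `peephole_target` are assumptions made for illustration. The `Clear` rewrite is only valid if the register is not needed afterwards, which the sketch does not check.

```python
def peephole_target(instrs):
    """One left-to-right pass with a window of two target instructions."""
    out, i = [], 0
    while i < len(instrs):
        cur = instrs[i]
        nxt = instrs[i + 1] if i + 1 < len(instrs) else None
        # combine: Subtract #1, R; BranchZero L, R => SubtractOneBranch L, R
        if (cur[:2] == ("Subtract", "#1") and nxt is not None
                and nxt[0] == "BranchZero" and nxt[2] == cur[2]):
            out.append(("SubtractOneBranch", nxt[1], cur[2]))
            i += 2
        # special case: Load #0, R; Store A, R => Clear A (assumes R is dead)
        elif (cur[:2] == ("Load", "#0") and nxt is not None
                and nxt[0] == "Store" and nxt[2] == cur[2]):
            out.append(("Clear", nxt[1]))
            i += 2
        # special case: Subtract #1, R => Decrement R
        elif cur[:2] == ("Subtract", "#1"):
            out.append(("Decrement", cur[2]))
            i += 1
        # special case: Add #1, R => Increment R
        elif cur[:2] == ("Add", "#1"):
            out.append(("Increment", cur[2]))
            i += 1
        else:
            out.append(cur)
            i += 1
    return out


program = [
    ("Load", "#0", "R1"), ("Store", "A", "R1"),
    ("Add", "#1", "R2"),
    ("Subtract", "#1", "R3"), ("BranchZero", "L1", "R3"),
    ("Subtract", "#1", "R4"),
]
rewritten = peephole_target(program)
```

The combining rule is tried before the single-instruction rule so that `Subtract #1` followed by a branch is not turned into `Decrement` first.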

Loop optimization (1/6)

motivated by the 90/10 rule (about 90% of execution time is spent in about 10% of the code)

example --

    for I in 1..100 loop
        for J in 1..100 loop
            for K in 1..100 loop
                A(I)(J)(K) := ( I * J ) * K;
            end loop;
        end loop;
    end loop;

after loop-invariant expression factorization (innermost loop) --

    for I in 1..100 loop
        for J in 1..100 loop
            T1 := Adr( A(I)(J) );
            T2 := I * J;
            for K in 1..100 loop
                T1(K) := T2 * K;
            end loop;
        end loop;
    end loop;

after loop-invariant expression factorization (middle loop) --

    for I in 1..100 loop
        T3 := Adr( A(I) );
        for J in 1..100 loop
            T1 := Adr( T3(J) );
            T2 := I * J;
            for K in 1..100 loop
                T1(K) := T2 * K;
            end loop;
        end loop;
    end loop;
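As a quick check that factoring out the loop-invariant expressions preserves the result, the original and the factored loop nests can be compared in Python. This is a sketch under assumptions: nested lists stand in for the array `A`, a row reference stands in for `Adr(...)`, and the bound `n` replaces the fixed 100.

```python
def original(n):
    a = [[[0] * n for _ in range(n)] for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            for k in range(1, n + 1):
                a[i - 1][j - 1][k - 1] = (i * j) * k
    return a


def factored(n):
    a = [[[0] * n for _ in range(n)] for _ in range(n)]
    for i in range(1, n + 1):
        t3 = a[i - 1]              # stands in for Adr( A(I) ), invariant in J and K
        for j in range(1, n + 1):
            t1 = t3[j - 1]         # stands in for Adr( T3(J) ), invariant in K
            t2 = i * j             # invariant in K
            for k in range(1, n + 1):
                t1[k - 1] = t2 * k
    return a
```

With `n = 100` the factored version performs one multiplication per innermost iteration instead of two, and the address computations for the first two subscripts move out of the inner loops.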

Loop optimization (2/6)

before --

    for I in 1..100 loop
        T3 := Adr( A(I) );
        for J in 1..100 loop
            T1 := Adr( T3(J) );
            T2 := I * J;
            for K in 1..100 loop
                T1(K) := T2 * K;
            end loop;
        end loop;
    end loop;

after induction variable elimination --

    for I in 1..100 loop
        T3 := Adr( A(I) );
        T4 := I;                    -- Initial value of I*J
        for J in 1..100 loop
            T1 := Adr( T3(J) );
            T2 := T4;               -- T4 holds I*J
            T5 := T2;               -- Initial value of T2*K
            for K in 1..100 loop
                T1(K) := T5;        -- T5 holds T2*K = I*J*K
                T5 := T5 + T2;
            end loop;
            T4 := T4 + I;
        end loop;
    end loop;
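The transformed loop replaces each multiplication by a running sum. A minimal Python sketch (nested lists in place of the array and a variable bound `n` are assumptions) confirms that the running sums reproduce `I*J*K`:

```python
def induction(n):
    a = [[[0] * n for _ in range(n)] for _ in range(n)]
    for i in range(1, n + 1):
        t3 = a[i - 1]
        t4 = i                      # initial value of I*J (J = 1)
        for j in range(1, n + 1):
            t1 = t3[j - 1]
            t2 = t4                 # T4 holds I*J
            t5 = t2                 # initial value of T2*K (K = 1)
            for k in range(1, n + 1):
                t1[k - 1] = t5      # T5 holds T2*K = I*J*K
                t5 += t2            # step to T2*(K+1)
            t4 += i                 # step to I*(J+1)
    return a


def direct(n):
    return [[[i * j * k for k in range(1, n + 1)]
             for j in range(1, n + 1)]
            for i in range(1, n + 1)]
```

No multiplication remains inside any of the three loops; each product is maintained by one addition per iteration.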

Loop optimization (3/6)

before --

    for I in 1..100 loop
        T3 := Adr( A(I) );
        T4 := I;
        for J in 1..100 loop
            T1 := Adr( T3(J) );
            T2 := T4;
            T5 := T2;
            for K in 1..100 loop
                T1(K) := T5;
                T5 := T5 + T2;
            end loop;
            T4 := T4 + I;
        end loop;
    end loop;

after copy propagation --

    for I in 1..100 loop
        T3 := Adr( A(I) );
        T4 := I;                    -- Initial value of I*J
        for J in 1..100 loop
            T1 := Adr( T3(J) );
            T5 := T4;               -- Initial value of T4*K
            for K in 1..100 loop
                T1(K) := T5;        -- T5 holds T4*K = I*J*K
                T5 := T5 + T4;
            end loop;
            T4 := T4 + I;
        end loop;
    end loop;

Loop optimization (4/6)

before --

    for I in 1..100 loop
        T3 := Adr( A(I) );
        T4 := I;
        for J in 1..100 loop
            T1 := Adr( T3(J) );
            T5 := T4;
            for K in 1..100 loop
                T1(K) := T5;
                T5 := T5 + T4;
            end loop;
            T4 := T4 + I;
        end loop;
    end loop;

after subscripting code expansion --

    for I in 1..100 loop
        T3 := A0 + ( 10000 * I ) - 10000;
        T4 := I;                            -- Initial value of I*J
        for J in 1..100 loop
            T1 := T3 + ( 100 * J ) - 100;
            T5 := T4;                       -- Initial value of T4*K
            for K in 1..100 loop
                (T1+K-1) := T5;             -- T5 holds T4*K = I*J*K
                T5 := T5 + T4;
            end loop;
            T4 := T4 + I;
        end loop;
    end loop;
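The expanded subscripting code computes a row-major address with lower bounds of 1. The small function below mirrors those three steps; it is a sketch that assumes one storage unit per element, and the name `address` and the parameter `n` (100 on the slide) are illustrative.

```python
def address(a0, i, j, k, n=100):
    """Address of A(I)(J)(K) for an n x n x n array based at a0."""
    t3 = a0 + (n * n * i) - n * n   # Adr( A(I) ):    A0 + 10000*I - 10000
    t1 = t3 + (n * j) - n           # Adr( A(I)(J) ): T3 + 100*J - 100
    return t1 + k - 1               # Adr( A(I)(J)(K) ): T1 + K - 1
```

The first element maps to `a0` itself and the last of the 1,000,000 elements to `a0 + 999999`.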

Loop optimization (5/6)

before --

    for I in 1..100 loop
        T3 := A0 + ( 10000 * I ) - 10000;
        T4 := I;
        for J in 1..100 loop
            T1 := T3 + ( 100 * J ) - 100;
            T5 := T4;
            for K in 1..100 loop
                (T1+K-1) := T5;
                T5 := T5 + T4;
            end loop;
            T4 := T4 + I;
        end loop;
    end loop;

after induction variable elimination --

    T6 := A0;                       -- Initial value of Adr(A(I))
    for I in 1..100 loop
        T3 := T6;
        T4 := I;                    -- Initial value of I*J
        T7 := T3;                   -- Initial value of Adr(A(I)(J))
        for J in 1..100 loop
            T1 := T7;
            T5 := T4;               -- Initial value of T4*K
            T8 := T1;               -- Initial value of Adr(A(I)(J)(K))
            for K in 1..100 loop
                (T8) := T5;         -- T5 holds T4*K = I*J*K
                T5 := T5 + T4;
                T8 := T8 + 1;
            end loop;
            T4 := T4 + I;
            T7 := T7 + 100;
        end loop;
        T6 := T6 + 10000;
    end loop;

Loop optimization (6/6)

before --

    T6 := A0;
    for I in 1..100 loop
        T3 := T6;
        T4 := I;
        T7 := T3;
        for J in 1..100 loop
            T1 := T7;
            T5 := T4;
            T8 := T1;
            for K in 1..100 loop
                (T8) := T5;
                T5 := T5 + T4;
                T8 := T8 + 1;
            end loop;
            T4 := T4 + I;
            T7 := T7 + 100;
        end loop;
        T6 := T6 + 10000;
    end loop;

after copy propagation --

    T6 := A0;                       -- Initial value of Adr(A(I))
    for I in 1..100 loop
        T4 := I;                    -- Initial value of I*J
        T7 := T6;                   -- Initial value of Adr(A(I)(J))
        for J in 1..100 loop
            T5 := T4;               -- Initial value of T4*K
            T8 := T7;               -- Initial value of Adr(A(I)(J)(K))
            for K in 1..100 loop
                (T8) := T5;         -- T5 holds T4*K = I*J*K
                T5 := T5 + T4;
                T8 := T8 + 1;
            end loop;
            T4 := T4 + I;
            T7 := T7 + 100;
        end loop;
        T6 := T6 + 10000;
    end loop;
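The final loop nest uses only additions and a store through a stepping address. The following Python sketch models memory as a flat list and checks the result against the direct formula. The flat list `mem`, the function name `stepped_fill`, and the variable bound `n` (so the strides are `n` and `n * n` rather than 100 and 10000) are assumptions for illustration.

```python
def stepped_fill(n, a0=0):
    """Fill an n x n x n array stored row-major at address a0 of a flat memory."""
    mem = [0] * (a0 + n ** 3)
    t6 = a0                         # address of A(I), starting at A(1)
    for i in range(1, n + 1):
        t4 = i                      # holds I*J, starting at J = 1
        t7 = t6                     # address of A(I)(J)
        for _j in range(n):
            t5 = t4                 # holds I*J*K, starting at K = 1
            t8 = t7                 # address of A(I)(J)(K)
            for _k in range(n):
                mem[t8] = t5        # the store written (T8) := T5
                t5 += t4
                t8 += 1
            t4 += i
            t7 += n                 # 100 on the slide
        t6 += n * n                 # 10000 on the slide
    return mem
```

Only the outermost loop still needs the value of its index variable; `J` and `K` serve purely as counters.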

Global data flow analysis (1/2)

collects information about the global structure of a program, not only about a single basic block

data flow graph
  node -- basic block

example --

    Read ( Limit ) ;
    for I in 1 .. Limit loop
        Read ( J ) ;
        if I = 1 then
            Sum := J ;
        else
            Sum := Sum + J ;
        end if ;
    end loop ;
    Write ( Sum ) ;

[Figure: data flow graph of the example, with these basic blocks]
  b0: Read(Limit); I := 1
  b1: I > Limit
  b2: Read(J); I = 1
  b3: Sum := J
  b4: Sum := Sum + J
  b5: I := I + 1
  b6: Write(Sum)

Global data flow analysis (2/2)

classification of data flow analyses
  any-path vs. all-path
  forward-flow vs. backward-flow
  the choice depends on the type of information required

data flow equations
  each basic block has 4 sets, IN, OUT, KILLED, and GEN, whose relationships are specified by data flow equations
  the equations for all basic blocks must be satisfied simultaneously
  the solution may not be unique

solution methods
  iterative method
  structure method
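The iterative method can be sketched as a round-robin loop that re-evaluates the equations until nothing changes. The function below is one possible formulation for forward analyses, not necessarily the one intended by the slides; the name `iterate_forward` and its parameters are assumptions. The two-block example also shows why the solution may not be unique: for the same equations, different starting values reach different fixed points.

```python
def iterate_forward(blocks, preds, gen, killed, meet, entry_in, start):
    """Solve IN(b) = meet of OUT(p) over predecessors p,
             OUT(b) = GEN(b) | (IN(b) - KILLED(b)).

    meet is set.union (any-path) or set.intersection (all-path);
    start is the initial guess for every OUT set.
    """
    out = {b: set(start) for b in blocks}
    inn = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if preds[b]:
                inn[b] = meet(*(out[p] for p in preds[b]))
            else:
                inn[b] = set(entry_in)
            new_out = gen[b] | (inn[b] - killed[b])
            if new_out != out[b]:
                out[b], changed = new_out, True
    return inn, out


# Hypothetical graph: block "a" followed by block "b", which loops to itself.
blocks = ["a", "b"]
preds = {"a": [], "b": ["a", "b"]}
gen = {"a": {"x"}, "b": set()}
killed = {"a": set(), "b": set()}

in_max, _ = iterate_forward(blocks, preds, gen, killed,
                            set.intersection, set(), {"x", "y"})
in_min, _ = iterate_forward(blocks, preds, gen, killed,
                            set.intersection, set(), set())
```

Both `in_max["b"] == {"x"}` and `in_min["b"] == set()` satisfy the all-path equations for block `b`; all-path analyses normally start from the universal set to obtain the larger solution.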

Any-path forward-flow analysis

example -- uninitialized variable (used but undefined)
  IN -- uninitialized just before this basic block
  OUT -- uninitialized just after this basic block (this block included)
  KILLED -- defined
  GEN -- out of scope

data flow equations, where P(b) is the set of predecessors of b --
  $\mathrm{IN}(b) = \bigcup_{i \in P(b)} \mathrm{OUT}(i)$
  $\mathrm{OUT}(b) = \mathrm{GEN}(b) \cup ( \mathrm{IN}(b) - \mathrm{KILLED}(b) )$
  $\mathrm{IN}(\mathit{first}) = \text{universal set}$

the initial condition, i.e., IN(first), depends on the analysis

[Figure: basic block b with its predecessors p and successors s]
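Applied to the Read/Sum example from the global data flow analysis slides, these equations can be solved with a short iteration. This is a sketch: the GEN and KILLED sets follow the table on the structure-method slides, while the edge list in `PREDS` is an assumption read off the example program (b1 is the loop test, b5 branches back to it, b6 follows the loop).

```python
BLOCKS = ["b0", "b1", "b2", "b3", "b4", "b5", "b6"]
PREDS = {  # assumed edges, derived from the example program
    "b0": [], "b1": ["b0", "b5"], "b2": ["b1"], "b3": ["b2"],
    "b4": ["b2"], "b5": ["b3", "b4"], "b6": ["b1"],
}
KILLED = {"b0": {"Limit", "I"}, "b1": set(), "b2": {"J"}, "b3": {"Sum"},
          "b4": {"Sum"}, "b5": {"I"}, "b6": set()}
GEN = {b: set() for b in BLOCKS}
GEN["b6"] = {"I"}                   # the loop index goes out of scope
UNIVERSE = {"Limit", "I", "J", "Sum"}


def uninitialized(blocks, preds, gen, killed, universe):
    inn = {b: set() for b in blocks}
    out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if preds[b]:
                new_in = set().union(*(out[p] for p in preds[b]))  # any path
            else:
                new_in = set(universe)      # IN(first) = universal set
            new_out = gen[b] | (new_in - killed[b])
            if (new_in, new_out) != (inn[b], out[b]):
                inn[b], out[b], changed = new_in, new_out, True
    return inn, out


IN, OUT = uninitialized(BLOCKS, PREDS, GEN, KILLED, UNIVERSE)
```

In this sketch `IN["b4"]` and `IN["b6"]` both contain `Sum`, so the analysis reports that `Sum := Sum + J` and `Write(Sum)` may read `Sum` before it is defined. The first report is conservative, since the analysis does not know that `I = 1` holds on the first iteration.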

Any-path backward-flow analysis

example -- live variable
  OUT -- will be used just after this basic block
  IN -- will be used in or after this basic block
  KILLED -- defined
  GEN -- used

data flow equations, where S(b) is the set of successors of b --
  $\mathrm{OUT}(b) = \bigcup_{i \in S(b)} \mathrm{IN}(i)$
  $\mathrm{IN}(b) = \mathrm{GEN}(b) \cup ( \mathrm{OUT}(b) - \mathrm{KILLED}(b) )$
  $\mathrm{OUT}(\mathit{last}) = \emptyset$

[Figure: basic block b with its predecessors p and successors s]
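For the Read/Sum example, live variables can be computed by iterating the backward equations. The sketch below makes several assumptions: the successor lists are read off the example program, and the GEN sets (variables used before any definition in the block) and KILLED sets (variables defined) are derived by hand from the statements of each block.

```python
BLOCKS = ["b0", "b1", "b2", "b3", "b4", "b5", "b6"]
SUCCS = {  # assumed edges, derived from the example program
    "b0": ["b1"], "b1": ["b2", "b6"], "b2": ["b3", "b4"],
    "b3": ["b5"], "b4": ["b5"], "b5": ["b1"], "b6": [],
}
GEN = {"b0": set(), "b1": {"I", "Limit"}, "b2": {"I"}, "b3": {"J"},
       "b4": {"Sum", "J"}, "b5": {"I"}, "b6": {"Sum"}}
KILLED = {"b0": {"Limit", "I"}, "b1": set(), "b2": {"J"}, "b3": {"Sum"},
          "b4": {"Sum"}, "b5": {"I"}, "b6": set()}


def live_variables(blocks, succs, gen, killed):
    inn = {b: set() for b in blocks}
    out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in reversed(blocks):          # backward flow
            new_out = set().union(*(inn[s] for s in succs[b]))  # any path
            new_in = gen[b] | (new_out - killed[b])
            if (new_in, new_out) != (inn[b], out[b]):
                inn[b], out[b], changed = new_in, new_out, True
    return inn, out


IN, OUT = live_variables(BLOCKS, SUCCS, GEN, KILLED)
```

Here `IN["b0"]` comes out as `{"Sum"}`: `Sum` is live on entry to the program, which is another way of seeing that it may be read before it is defined.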

All-path forward-flow analysis

example -- available expression (to check redundant computation)
  IN -- already computed just before this basic block
  OUT -- already computed just after this basic block (this block included)
  KILLED -- one of the operands is re-defined
  GEN -- computed subexpression

data flow equations, where P(b) is the set of predecessors of b --
  $\mathrm{IN}(b) = \bigcap_{i \in P(b)} \mathrm{OUT}(i)$
  $\mathrm{OUT}(b) = \mathrm{GEN}(b) \cup ( \mathrm{IN}(b) - \mathrm{KILLED}(b) )$
  $\mathrm{IN}(\mathit{first}) = \emptyset$

[Figure: basic block b with its predecessors p and successors s]
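A small sketch of available-expression analysis on a hypothetical diamond-shaped graph: the block names, the expressions `a+b` and `c*d`, and the GEN/KILLED sets are all made up for illustration. OUT sets start at the universal set so that the iteration reaches the largest solution.

```python
def available(blocks, preds, gen, killed, universe):
    out = {b: set(universe) for b in blocks}    # optimistic start
    inn = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if preds[b]:
                new_in = set.intersection(*(out[p] for p in preds[b]))  # all paths
            else:
                new_in = set()                  # IN(first) is empty
            new_out = gen[b] | (new_in - killed[b])
            if (new_in, new_out) != (inn[b], out[b]):
                inn[b], out[b], changed = new_in, new_out, True
    return inn, out


blocks = ["entry", "left", "right", "join"]
preds = {"entry": [], "left": ["entry"], "right": ["entry"],
         "join": ["left", "right"]}
gen = {"entry": {"a+b"}, "left": {"c*d"}, "right": {"c*d"}, "join": set()}
# "left" redefines a, so it kills a+b
killed = {"entry": set(), "left": {"a+b"}, "right": set(), "join": set()}

IN, OUT = available(blocks, preds, gen, killed, {"a+b", "c*d"})
```

At `join`, only `c*d` is available: it was computed on both incoming paths, whereas `a+b` was invalidated on the path through `left`.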

All-path backward-flow analysis

example -- very busy expression (worth keeping in a register)
  OUT -- will be used in all cases just after this basic block
  IN -- will be used in all cases in or after this basic block
  KILLED -- defined
  GEN -- used

data flow equations, where S(b) is the set of successors of b --
  $\mathrm{OUT}(b) = \bigcap_{i \in S(b)} \mathrm{IN}(i)$
  $\mathrm{IN}(b) = \mathrm{GEN}(b) \cup ( \mathrm{OUT}(b) - \mathrm{KILLED}(b) )$
  $\mathrm{OUT}(\mathit{last}) = \emptyset$

[Figure: basic block b with its predecessors p and successors s]
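The same iteration, run backward with intersection as the meet, gives very busy expressions. The graph below is hypothetical: a test block `t` with two branches `l` and `r` that rejoin at an exit block `x`; the expressions and sets are invented for illustration.

```python
def very_busy(blocks, succs, gen, killed, universe):
    inn = {b: set(universe) for b in blocks}    # optimistic start
    out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in reversed(blocks):              # backward flow
            if succs[b]:
                new_out = set.intersection(*(inn[s] for s in succs[b]))  # all paths
            else:
                new_out = set()                 # OUT(last) is empty
            new_in = gen[b] | (new_out - killed[b])
            if (new_in, new_out) != (inn[b], out[b]):
                inn[b], out[b], changed = new_in, new_out, True
    return inn, out


blocks = ["t", "l", "r", "x"]
succs = {"t": ["l", "r"], "l": ["x"], "r": ["x"], "x": []}
gen = {"t": set(), "l": {"a+b"}, "r": {"a+b", "c*d"}, "x": set()}
killed = {b: set() for b in blocks}

IN, OUT = very_busy(blocks, succs, gen, killed, {"a+b", "c*d"})
```

`OUT["t"]` is `{"a+b"}`: the expression is evaluated on every path leaving `t`, so computing it once at the end of `t` and keeping the value is safe. `c*d` is used on one branch only and is therefore not very busy.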

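Structure method of data flow solution (1/4)

rules are stated for forward analysis; for backward analysis, exchange I and O

sequential structure (S1 followed by S2)
  $I = I_1$
  $O = ( I_2 - K_2 ) \cup G_2$
  $\phantom{O} = ( ((I_1 - K_1) \cup G_1) - K_2 ) \cup G_2$
  $\phantom{O} = ( I - (K_1 \cup K_2) ) \cup (G_1 - K_2) \cup G_2$
  $K = K_1 \cup K_2$
  $G = ( G_1 - K_2 ) \cup G_2$

conditional structure (S1 or S2), any path
  $I = I_1 = I_2$
  $O = O_1 \cup O_2$
  $\phantom{O} = ((I_1 - K_1) \cup G_1) \cup ((I_2 - K_2) \cup G_2)$
  $\phantom{O} = ( I - (K_1 \cap K_2) ) \cup (G_1 \cup G_2)$
  $K = K_1 \cap K_2$
  $G = G_1 \cup G_2$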

Structure method of data flow solution (2/4)

conditional structure (S1 or S2), all path
  $I = I_1 = I_2$
  $O = O_1 \cap O_2$
  $\phantom{O} = ((I_1 - K_1) \cup G_1) \cap ((I_2 - K_2) \cup G_2)$
  $\phantom{O} = ( I - (K_1 \cup K_2) ) \cup (G_1 \cap G_2)$
  $K = K_1 \cup K_2$
  $G = G_1 \cap G_2$

iterative structure (possible paths: S1; S1 S2 S1; ...)
  any path
    $K = K_1$
    $G = ( G_2 - K_1 ) \cup G_1$
  all path
    $K = K_1 \cup K_2$
    $G = G_1$

Structure method of data flow solution (3/4)

example -- uninitialized variable

    block   Gen   Killed
    b0            Limit, I
    b1
    b2            J
    b3            Sum
    b4            Sum
    b5            I
    b6      I

[Figure: data flow graph of the example with basic blocks b0 to b6, as in Global data flow analysis (1/2)]

Structure method of data flow solution (4/4)

    step  structure               rule         Killed       Gen
    1     b3,b4                   conditional  Sum
    2     b2,b3,b4                sequential   J, Sum
    3     b2,b3,b4,b5             sequential   J, Sum, I
    4     b1,b2,b3,b4,b5          iterative
    5     b0,b1,b2,b3,b4,b5       sequential   Limit, I
    6     b0,b1,b2,b3,b4,b5,b6    sequential   Limit, I     I
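The six steps of this table can be reproduced by coding the any-path composition rules directly. In this sketch each structure is summarized as a pair `(K, G)` of Python sets; the function names and the pair encoding are assumptions, while the per-block sets are taken from the Gen/Killed table of the example.

```python
def sequential(s1, s2):
    (k1, g1), (k2, g2) = s1, s2
    return k1 | k2, (g1 - k2) | g2          # K = K1 u K2, G = (G1 - K2) u G2


def conditional_any(s1, s2):
    (k1, g1), (k2, g2) = s1, s2
    return k1 & k2, g1 | g2                 # K = K1 n K2, G = G1 u G2


def iterative_any(s1, s2):
    (k1, g1), (_k2, g2) = s1, s2
    return set(k1), (g2 - k1) | g1          # K = K1, G = (G2 - K1) u G1


# (KILLED, GEN) per basic block for the uninitialized-variable example
b0 = ({"Limit", "I"}, set())
b1 = (set(), set())
b2 = ({"J"}, set())
b3 = ({"Sum"}, set())
b4 = ({"Sum"}, set())
b5 = ({"I"}, set())
b6 = (set(), {"I"})

step1 = conditional_any(b3, b4)
step2 = sequential(b2, step1)
step3 = sequential(step2, b5)
step4 = iterative_any(b1, step3)
step5 = sequential(b0, step4)
step6 = sequential(step5, b6)

# O = (I - K) u G for the whole program, with I = universal set
killed, gen = step6
program_out = ({"Limit", "I", "J", "Sum"} - killed) | gen
```

With the universal set as input, `program_out` is `{"I", "J", "Sum"}`: `J` and `Sum` may still be uninitialized at the end (the loop body might never run), and `I` has gone out of scope.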

Applications of data flow analyses