© Chuen-Liang Chen, NTUCS&IE
OPTIMIZATION
Chuen-Liang Chen
Department of Computer Science
and Information Engineering
National Taiwan University
Taipei, TAIWAN
Introduction
local optimization
  within a basic block
  may accompany code generation
  e.g., peephole optimization
global optimization
  spans more than one basic block
  e.g., loop optimization
  data flow analysis (a technique)
Peephole optimization (1/2)
modify a particular pattern in a small window (peephole; 2-3 instructions)
may be applied to intermediate or target code
constant folding (evaluate constant expressions in advance)
  ( +, Lit1, Lit2, Result )  ⇒  ( :=, Lit1+Lit2, Result )
  ( :=, Lit1, Result1 ), ( +, Lit2, Result1, Result2 )  ⇒  ( :=, Lit1, Result1 ), ( :=, Lit1+Lit2, Result2 )
strength reduction (replace slow operations with faster equivalents)
  ( *, Operand, 2, Result )  ⇒  ( ShiftLeft, Operand, 1, Result )
  ( *, Operand, 4, Result )  ⇒  ( ShiftLeft, Operand, 2, Result )
null sequences (delete useless operations)
  ( +, Operand, 0, Result )  ⇒  ( :=, Operand, Result )
  ( *, Operand, 1, Result )  ⇒  ( :=, Operand, Result )
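The single-quadruple rules above can be sketched in Python as follows. This is a minimal illustration, not a complete peephole optimizer: the tuple encoding of quadruples, the function names `rewrite` and `peephole`, and folding of `*` in addition to `+` are assumptions made for the example. The two-quadruple constant folding pattern is left out.

```python
def rewrite(quad):
    """Rewrite one quadruple (op, a, b, result); return it unchanged if no rule fits."""
    if len(quad) == 4 and quad[0] in ("+", "*"):
        op, a, b, result = quad
        # constant folding: both operands are literals (folding "*" is an assumption)
        if isinstance(a, int) and isinstance(b, int):
            return (":=", a + b if op == "+" else a * b, result)
        # null sequences: x + 0 and x * 1 are plain copies
        if (op == "+" and b == 0) or (op == "*" and b == 1):
            return (":=", a, result)
        # strength reduction: multiply by a power of two becomes a left shift
        if op == "*" and isinstance(b, int) and b > 1 and b & (b - 1) == 0:
            return ("ShiftLeft", a, b.bit_length() - 1, result)
    return quad


def peephole(quads):
    """Apply the single-quadruple rules to every quadruple in the list."""
    return [rewrite(q) for q in quads]
```

For example, `peephole([("*", "x", 4, "t")])` yields `[("ShiftLeft", "x", 2, "t")]`, matching the second strength reduction rule.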
Peephole optimization (2/2)
combine operations (replace several operations with one equivalent)
  Load A, Rj; Load A+1, Rj+1  ⇒  DoubleLoad A, Rj
  BranchZero L1, R1; Branch L2; L1:  ⇒  BranchNotZero L2, R1
  Subtract #1, R1; BranchZero L1, R1  ⇒  SubtractOneBranch L1, R1
algebraic laws (use algebraic laws to simplify or reorder instructions)
  ( +, Lit, Operand, Result )  ⇒  ( +, Operand, Lit, Result )
  ( -, 0, Operand, Result )  ⇒  ( Negate, Operand, Result )
special case instructions (use instructions designed for special operand cases)
  Subtract #1, R1  ⇒  Decrement R1
  Add #1, R1  ⇒  Increment R1
  Load #0, R1; Store A, R1  ⇒  Clear A
address mode operations (use address modes to simplify code)
  Load A, R1; Add 0(R1), R2  ⇒  Add @A, R2
  Subtract #2, R1; Clear 0(R1)  ⇒  Clear -(R1)
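A sliding two-instruction window over target code might look like the sketch below. The tuple encoding of instructions and the name `peephole_target` are assumptions for illustration; only three of the patterns above are implemented, and the `Clear` rewrite assumes the register is not needed afterwards.

```python
def peephole_target(code):
    """Scan target instructions with a window of at most two instructions."""
    out = []
    i = 0
    while i < len(code):
        cur = code[i]
        nxt = code[i + 1] if i + 1 < len(code) else None
        if nxt is not None:
            # Subtract #1, R; BranchZero L, R  =>  SubtractOneBranch L, R
            if (cur[0] == "Subtract" and cur[1] == "#1"
                    and nxt[0] == "BranchZero" and nxt[2] == cur[2]):
                out.append(("SubtractOneBranch", nxt[1], cur[2]))
                i += 2
                continue
            # Load #0, R; Store A, R  =>  Clear A  (assumes R is dead afterwards)
            if (cur[0] == "Load" and cur[1] == "#0"
                    and nxt[0] == "Store" and nxt[2] == cur[2]):
                out.append(("Clear", nxt[1]))
                i += 2
                continue
        # single-instruction special cases
        if cur[0] == "Subtract" and cur[1] == "#1":
            out.append(("Decrement", cur[2]))
        elif cur[0] == "Add" and cur[1] == "#1":
            out.append(("Increment", cur[2]))
        else:
            out.append(cur)
        i += 1
    return out
```

The two-instruction patterns are tried first so that `Subtract #1` followed by `BranchZero` is combined rather than turned into `Decrement`.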
Loop optimization (1/6)
motivated by the 90/10 rule (most execution time is spent in a small part of the code, typically loops)
example --
for I in 1..100 loop
  for J in 1..100 loop
    for K in 1..100 loop
      A(I)(J)(K) := ( I * J ) * K;
    end loop;
  end loop;
end loop;
⇒ loop invariant expression factorization
for I in 1..100 loop
  for J in 1..100 loop
    T1 := Adr( A(I)(J) );
    T2 := I * J;
    for K in 1..100 loop
      T1(K) := T2 * K;
    end loop;
  end loop;
end loop;
⇒ loop invariant expression factorization
for I in 1..100 loop
  T3 := Adr( A(I) );
  for J in 1..100 loop
    T1 := Adr( T3(J) );
    T2 := I * J;
    for K in 1..100 loop
      T1(K) := T2 * K;
    end loop;
  end loop;
end loop;
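The effect of factoring out loop-invariant expressions can be checked with a small Python sketch. Nested lists stand in for the array `A`, and taking a reference to a sub-list plays the role of `Adr(...)`; the function names and the size parameter `n` are assumptions for the example.

```python
def original(n=100):
    """Straightforward triple loop: every subscript and I*J is recomputed."""
    a = [[[0] * n for _ in range(n)] for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            for k in range(1, n + 1):
                a[i - 1][j - 1][k - 1] = (i * j) * k
    return a


def factored(n=100):
    """Same computation with loop-invariant work moved out of inner loops."""
    a = [[[0] * n for _ in range(n)] for _ in range(n)]
    for i in range(1, n + 1):
        t3 = a[i - 1]            # plays the role of Adr(A(I)), invariant in J and K
        for j in range(1, n + 1):
            t1 = t3[j - 1]       # plays the role of Adr(T3(J)), invariant in K
            t2 = i * j           # invariant in K
            for k in range(1, n + 1):
                t1[k - 1] = t2 * k
    return a
```

Both functions fill the array with the same values; the second performs the row lookups and the product `i * j` once per `J` iteration instead of once per `K` iteration.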
Loop optimization (2/6)
for I in 1..100 loop
  T3 := Adr( A(I) );
  for J in 1..100 loop
    T1 := Adr( T3(J) );
    T2 := I * J;
    for K in 1..100 loop
      T1(K) := T2 * K;
    end loop;
  end loop;
end loop;
⇒ induction variable elimination
for I in 1..100 loop
  T3 := Adr( A(I) );
  T4 := I; -- Initial value of I*J
  for J in 1..100 loop
    T1 := Adr( T3(J) );
    T2 := T4; -- T4 holds I*J
    T5 := T2; -- Initial value of T2*K
    for K in 1..100 loop
      T1(K) := T5; -- T5 holds T2*K = I*J*K
      T5 := T5 + T2;
    end loop;
    T4 := T4 + I;
  end loop;
end loop;
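As a hedged illustration of this step, the sketch below computes the stored values once with multiplications and once with the running sums `T4` and `T5`. The function names and the parameter `n` are assumptions; the array itself is left out so that only the arithmetic is compared.

```python
def with_multiplications(n=100):
    """Values stored by the loop before induction variable elimination."""
    values = []
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            t2 = i * j
            for k in range(1, n + 1):
                values.append(t2 * k)
    return values


def with_additions(n=100):
    """Same values, with I*J and T2*K maintained by repeated addition."""
    values = []
    for i in range(1, n + 1):
        t4 = i                   # initial value of I*J (J = 1)
        for _ in range(n):       # J loop; J itself is no longer needed
            t2 = t4              # T4 holds I*J
            t5 = t2              # initial value of T2*K (K = 1)
            for _ in range(n):   # K loop; K itself is no longer needed
                values.append(t5)
                t5 = t5 + t2
            t4 = t4 + i
    return values
```

With `n = 2` both functions return `[1, 2, 2, 4, 2, 4, 4, 8]`.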
Loop optimization (3/6)
for I in 1..100 loop
  T3 := Adr( A(I) );
  T4 := I;
  for J in 1..100 loop
    T1 := Adr( T3(J) );
    T2 := T4;
    T5 := T2;
    for K in 1..100 loop
      T1(K) := T5;
      T5 := T5 + T2;
    end loop;
    T4 := T4 + I;
  end loop;
end loop;
⇒ copy propagation
for I in 1..100 loop
  T3 := Adr( A(I) );
  T4 := I; -- Initial value of I*J
  for J in 1..100 loop
    T1 := Adr( T3(J) );
    T5 := T4; -- Initial value of T4*K
    for K in 1..100 loop
      T1(K) := T5; -- T5 holds T4*K = I*J*K
      T5 := T5 + T4;
    end loop;
    T4 := T4 + I;
  end loop;
end loop;
Loop optimization (4/6)
for I in 1..100 loop
  T3 := Adr( A(I) );
  T4 := I;
  for J in 1..100 loop
    T1 := Adr( T3(J) );
    T5 := T4;
    for K in 1..100 loop
      T1(K) := T5;
      T5 := T5 + T4;
    end loop;
    T4 := T4 + I;
  end loop;
end loop;
⇒ subscripting code expansion
for I in 1..100 loop
  T3 := A0 + ( 10000 * I ) - 10000;
  T4 := I; -- Initial value of I*J
  for J in 1..100 loop
    T1 := T3 + ( 100 * J ) - 100;
    T5 := T4; -- Initial value of T4*K
    for K in 1..100 loop
      (T1+K-1)↑ := T5; -- T5 holds T4*K = I*J*K; store through the computed address
      T5 := T5 + T4;
    end loop;
    T4 := T4 + I;
  end loop;
end loop;
Loop optimization (5/6)
for I in 1..100 loop
  T3 := A0 + ( 10000 * I ) - 10000;
  T4 := I;
  for J in 1..100 loop
    T1 := T3 + ( 100 * J ) - 100;
    T5 := T4;
    for K in 1..100 loop
      (T1+K-1)↑ := T5;
      T5 := T5 + T4;
    end loop;
    T4 := T4 + I;
  end loop;
end loop;
⇒ induction variable elimination
T6 := A0; -- Initial value of Adr(A(I))
for I in 1..100 loop
  T3 := T6;
  T4 := I; -- Initial value of I*J
  T7 := T3; -- Initial value of Adr(A(I)(J))
  for J in 1..100 loop
    T1 := T7;
    T5 := T4; -- Initial value of T4*K
    T8 := T1; -- Initial value of Adr(A(I)(J)(K))
    for K in 1..100 loop
      T8↑ := T5; -- T5 holds T4*K = I*J*K
      T5 := T5 + T4;
      T8 := T8 + 1;
    end loop;
    T4 := T4 + I;
    T7 := T7 + 100;
  end loop;
  T6 := T6 + 10000;
end loop;
Loop optimization (6/6)
T6 := A0;
for I in 1..100 loop
  T3 := T6;
  T4 := I;
  T7 := T3;
  for J in 1..100 loop
    T1 := T7;
    T5 := T4;
    T8 := T1;
    for K in 1..100 loop
      T8↑ := T5;
      T5 := T5 + T4;
      T8 := T8 + 1;
    end loop;
    T4 := T4 + I;
    T7 := T7 + 100;
  end loop;
  T6 := T6 + 10000;
end loop;
⇒ copy propagation
T6 := A0; -- Initial value of Adr(A(I))
for I in 1..100 loop
  T4 := I; -- Initial value of I*J
  T7 := T6; -- Initial value of Adr(A(I)(J))
  for J in 1..100 loop
    T5 := T4; -- Initial value of T4*K
    T8 := T7; -- Initial value of Adr(A(I)(J)(K))
    for K in 1..100 loop
      T8↑ := T5; -- T5 holds T4*K = I*J*K
      T5 := T5 + T4;
      T8 := T8 + 1;
    end loop;
    T4 := T4 + I;
    T7 := T7 + 100;
  end loop;
  T6 := T6 + 10000;
end loop;
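To see that the final form still writes the same values to the same locations, the address arithmetic can be simulated with a flat Python list. This is a sketch under stated assumptions: memory is a list indexed by address, each element occupies one cell, the array bound `n` replaces the constant 100 (so the strides are `n` and `n * n`), and the function names are made up for the example.

```python
def expanded_subscripts(n, a0=0):
    """Addresses computed with multiplications, as after subscripting code expansion."""
    mem = [None] * (a0 + n ** 3)
    for i in range(1, n + 1):
        t3 = a0 + (n * n * i) - n * n
        t4 = i
        for j in range(1, n + 1):
            t1 = t3 + (n * j) - n
            t5 = t4
            for k in range(1, n + 1):
                mem[t1 + k - 1] = t5     # store through the computed address
                t5 = t5 + t4
            t4 = t4 + i
    return mem


def pointer_increments(n, a0=0):
    """Final form: addresses advanced by additions only (T6, T7, T8)."""
    mem = [None] * (a0 + n ** 3)
    t6 = a0
    for i in range(1, n + 1):
        t4 = i
        t7 = t6
        for _ in range(n):
            t5 = t4
            t8 = t7
            for _ in range(n):
                mem[t8] = t5             # store through address T8
                t5 = t5 + t4
                t8 = t8 + 1
            t4 = t4 + i
            t7 = t7 + n
        t6 = t6 + n * n
    return mem
```

Calling both functions with the same `n` and `a0` returns identical memory images, and the cell for `A(I)(J)(K)` holds `I*J*K`.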
Global data flow analysis (1/2)
gathers information about the global structure, not only about a single basic block
data flow graph
  node -- basic block
example --
Read ( Limit ) ;
for I in 1 .. Limit loop
  Read ( J ) ;
  if I = 1 then
    Sum := J ;
  else
    Sum := Sum + J ;
  end if ;
end loop ;
Write ( Sum ) ;
data flow graph of the example (figure; node -- contents -- successors)
  0 -- Read(Limit); I := 1 -- 1
  1 -- I > Limit -- 2, 6
  2 -- Read(J); I = 1 -- 3, 4
  3 -- Sum := J -- 5
  4 -- Sum := Sum + J -- 5
  5 -- I := I + 1 -- 1
  6 -- Write(Sum) -- (none)
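One simple way to hold such a data flow graph in a program is a successor table, from which predecessors can be derived. The sketch below encodes the example graph; the edge list is read off the control structure of the example program, and the names `SUCC` and `predecessors` are assumptions for illustration.

```python
# Successors of each basic block in the example (block numbers as in the figure).
SUCC = {
    0: [1],      # Read(Limit); I := 1
    1: [2, 6],   # I > Limit: enter the loop body or leave the loop
    2: [3, 4],   # Read(J); I = 1: then-branch or else-branch
    3: [5],      # Sum := J
    4: [5],      # Sum := Sum + J
    5: [1],      # I := I + 1, back to the loop test
    6: [],       # Write(Sum)
}


def predecessors(succ):
    """Invert a successor table into a predecessor table."""
    pred = {block: [] for block in succ}
    for block, targets in succ.items():
        for target in targets:
            pred[target].append(block)
    return pred
```

Forward analyses combine information over `predecessors(SUCC)[b]`, backward analyses over `SUCC[b]`.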
Global data flow analysis (2/2)
classification of data flow analyses
  any-path vs. all-path
  forward-flow vs. backward-flow
  the choice depends on the type of information needed
data flow equations
  each basic block has 4 sets, IN, OUT, KILLED, and GEN, whose relationships are specified by data flow equations
  the equations for all basic blocks must be satisfied simultaneously
  the solution may not be unique
solution
  iterative method
  structure method
Any-path forward-flow analysis
example -- uninitialized variable (used but undefined)
  IN -- uninitialized just before this basic block
  OUT -- uninitialized at the end of this basic block (this block included)
  KILLED -- defined
  GEN -- out of scope
data flow equations --
  $\mathrm{IN}(b) = \bigcup_{i \in P(b)} \mathrm{OUT}(i)$
  $\mathrm{OUT}(b) = \mathrm{GEN}(b) \cup ( \mathrm{IN}(b) - \mathrm{KILLED}(b) )$
  $\mathrm{IN}(\mathit{first}) = \text{universal set}$
the initial condition, i.e., IN(first), is decided case by case
(figure: block b with its predecessors p, the set P(b), and its successors s, the set S(b))
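A minimal iterative solver for these equations, applied to the running example, might look like this. The successor table is derived from the example program, GEN and KILLED follow the uninitialized-variable table given for the structure method, and all Python names are assumptions for the sketch.

```python
SUCC = {0: [1], 1: [2, 6], 2: [3, 4], 3: [5], 4: [5], 5: [1], 6: []}
KILLED = {0: {"Limit", "I"}, 1: set(), 2: {"J"}, 3: {"Sum"},
          4: {"Sum"}, 5: {"I"}, 6: set()}
GEN = {0: set(), 1: set(), 2: set(), 3: set(), 4: set(), 5: set(), 6: {"I"}}
UNIVERSE = {"Limit", "I", "J", "Sum"}


def solve_any_path_forward(succ, gen, killed, in_first, first=0):
    """Iterate IN(b) = union of OUT(p), OUT(b) = GEN(b) | (IN(b) - KILLED(b))."""
    pred = {b: [] for b in succ}
    for b, targets in succ.items():
        for t in targets:
            pred[t].append(b)
    IN = {b: set() for b in succ}     # start from empty sets (smallest solution)
    OUT = {b: set() for b in succ}
    changed = True
    while changed:
        changed = False
        for b in succ:
            if b == first:
                new_in = set(in_first)
            else:
                new_in = set().union(*(OUT[p] for p in pred[b]))
            new_out = gen[b] | (new_in - killed[b])
            if new_in != IN[b] or new_out != OUT[b]:
                IN[b], OUT[b] = new_in, new_out
                changed = True
    return IN, OUT


IN, OUT = solve_any_path_forward(SUCC, GEN, KILLED, UNIVERSE)
```

In this sketch `IN[4]` contains `Sum`, so the use of `Sum` in block 4 is flagged: the analysis does not know that the else-branch is never taken on the first iteration.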
Any-path backward-flow analysis
example -- live variable
  OUT -- will be used just after this basic block
  IN -- will be used in or after this basic block
  KILLED -- defined
  GEN -- used
data flow equations --
  $\mathrm{OUT}(b) = \bigcup_{i \in S(b)} \mathrm{IN}(i)$
  $\mathrm{IN}(b) = \mathrm{GEN}(b) \cup ( \mathrm{OUT}(b) - \mathrm{KILLED}(b) )$
  $\mathrm{OUT}(\mathit{last}) = \emptyset$
(figure: block b with its predecessors p, the set P(b), and its successors s, the set S(b))
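The backward equations can be iterated in the same style. In the sketch below the GEN (used before being defined in the block) and KILLED (defined) sets for the running example are worked out by hand from the example program and are therefore assumptions, as are the Python names.

```python
SUCC = {0: [1], 1: [2, 6], 2: [3, 4], 3: [5], 4: [5], 5: [1], 6: []}
GEN = {0: set(), 1: {"I", "Limit"}, 2: {"I"}, 3: {"J"},
       4: {"Sum", "J"}, 5: {"I"}, 6: {"Sum"}}
KILLED = {0: {"Limit", "I"}, 1: set(), 2: {"J"}, 3: {"Sum"},
          4: {"Sum"}, 5: {"I"}, 6: set()}


def solve_any_path_backward(succ, gen, killed):
    """Iterate OUT(b) = union of IN(s), IN(b) = GEN(b) | (OUT(b) - KILLED(b))."""
    IN = {b: set() for b in succ}     # start from empty sets (smallest solution)
    OUT = {b: set() for b in succ}
    changed = True
    while changed:
        changed = False
        for b in succ:
            # a block without successors gets the empty set: OUT(last) is empty
            new_out = set().union(*(IN[s] for s in succ[b]))
            new_in = gen[b] | (new_out - killed[b])
            if new_in != IN[b] or new_out != OUT[b]:
                IN[b], OUT[b] = new_in, new_out
                changed = True
    return IN, OUT


IN, OUT = solve_any_path_backward(SUCC, GEN, KILLED)
```

Here `IN[0]` comes out as `{"Sum"}`: `Sum` is live on entry to the program, which is another sign that it may be used before it is defined.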
All-path forward-flow analysis
example -- available expression (to detect redundant computation)
  IN -- already computed just before this basic block
  OUT -- already computed at the end of this basic block (this block included)
  KILLED -- one of the operands is redefined
  GEN -- computed subexpression
data flow equations --
  $\mathrm{IN}(b) = \bigcap_{i \in P(b)} \mathrm{OUT}(i)$
  $\mathrm{OUT}(b) = \mathrm{GEN}(b) \cup ( \mathrm{IN}(b) - \mathrm{KILLED}(b) )$
  $\mathrm{IN}(\mathit{first}) = \emptyset$
(figure: block b with its predecessors p, the set P(b), and its successors s, the set S(b))
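For an all-path analysis the iteration starts from the full set and shrinks. The sketch below applies this to the running example; the expression names and the GEN and KILLED sets are worked out by hand from the example program (an expression that is computed and then has an operand redefined in the same block is treated as killed, not generated), so they are assumptions, as are the Python names.

```python
SUCC = {0: [1], 1: [2, 6], 2: [3, 4], 3: [5], 4: [5], 5: [1], 6: []}
EXPRS = {"I>Limit", "I=1", "Sum+J", "I+1"}
GEN = {0: set(), 1: {"I>Limit"}, 2: {"I=1"}, 3: set(),
       4: set(), 5: set(), 6: set()}
KILLED = {0: {"I>Limit", "I=1", "I+1"}, 1: set(), 2: {"Sum+J"}, 3: {"Sum+J"},
          4: {"Sum+J"}, 5: {"I>Limit", "I=1", "I+1"}, 6: set()}


def solve_all_path_forward(succ, gen, killed, universe, first=0):
    """Iterate IN(b) = intersection of OUT(p), OUT(b) = GEN(b) | (IN(b) - KILLED(b))."""
    pred = {b: [] for b in succ}
    for b, targets in succ.items():
        for t in targets:
            pred[t].append(b)
    IN = {b: set(universe) for b in succ}    # start from the full set (largest solution)
    OUT = {b: set(universe) for b in succ}
    changed = True
    while changed:
        changed = False
        for b in succ:
            if b == first:
                new_in = set()               # IN(first) is empty
            else:
                new_in = set(universe)
                for p in pred[b]:
                    new_in &= OUT[p]
            new_out = gen[b] | (new_in - killed[b])
            if new_in != IN[b] or new_out != OUT[b]:
                IN[b], OUT[b] = new_in, new_out
                changed = True
    return IN, OUT


IN, OUT = solve_all_path_forward(SUCC, GEN, KILLED, EXPRS)
```

In this sketch nothing is available at the loop test (block 1), because the path through block 5 redefines `I`.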
All-path backward-flow analysis
example -- very busy expression (worth keeping in a register)
  OUT -- will be used on every path just after this basic block
  IN -- will be used on every path in or after this basic block
  KILLED -- defined
  GEN -- used
data flow equations --
  $\mathrm{OUT}(b) = \bigcap_{i \in S(b)} \mathrm{IN}(i)$
  $\mathrm{IN}(b) = \mathrm{GEN}(b) \cup ( \mathrm{OUT}(b) - \mathrm{KILLED}(b) )$
  $\mathrm{OUT}(\mathit{last}) = \emptyset$
(figure: block b with its predecessors p, the set P(b), and its successors s, the set S(b))
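The backward all-path case combines the two previous ideas: intersect over successors and start from the full set. In this sketch the expression names and the GEN (evaluated before any operand is redefined in the block) and KILLED (an operand is defined in the block) sets are derived by hand from the example program and are assumptions, as are the Python names.

```python
SUCC = {0: [1], 1: [2, 6], 2: [3, 4], 3: [5], 4: [5], 5: [1], 6: []}
EXPRS = {"I>Limit", "I=1", "Sum+J", "I+1"}
GEN = {0: set(), 1: {"I>Limit"}, 2: {"I=1"}, 3: set(),
       4: {"Sum+J"}, 5: {"I+1"}, 6: set()}
KILLED = {0: {"I>Limit", "I=1", "I+1"}, 1: set(), 2: {"Sum+J"}, 3: {"Sum+J"},
          4: {"Sum+J"}, 5: {"I>Limit", "I=1", "I+1"}, 6: set()}


def solve_all_path_backward(succ, gen, killed, universe):
    """Iterate OUT(b) = intersection of IN(s), IN(b) = GEN(b) | (OUT(b) - KILLED(b))."""
    IN = {b: set(universe) for b in succ}    # start from the full set (largest solution)
    OUT = {b: set(universe) for b in succ}
    changed = True
    while changed:
        changed = False
        for b in succ:
            if not succ[b]:
                new_out = set()              # OUT(last) is empty
            else:
                new_out = set(universe)
                for s in succ[b]:
                    new_out &= IN[s]
            new_in = gen[b] | (new_out - killed[b])
            if new_in != IN[b] or new_out != OUT[b]:
                IN[b], OUT[b] = new_in, new_out
                changed = True
    return IN, OUT


IN, OUT = solve_all_path_backward(SUCC, GEN, KILLED, EXPRS)
```

In this sketch `OUT[2]` is `{"I+1"}`: whichever branch of the `if` is taken, `I + 1` is evaluated in block 5 before `I` changes.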
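Structure method of data flow solution (1/4)
rules are stated for forward analysis; for backward analysis, exchange I and O
sequential structure (S1 followed by S2)
  $I = I_1$
  $O = ( I_2 - K_2 ) \cup G_2$
  $\;\; = ( ((I_1 - K_1) \cup G_1) - K_2 ) \cup G_2$
  $\;\; = ( I - (K_1 \cup K_2) ) \cup (G_1 - K_2) \cup G_2$
  $K = K_1 \cup K_2$
  $G = ( G_1 - K_2 ) \cup G_2$
conditional structure (S1 or S2), any path
  $I = I_1 = I_2$
  $O = O_1 \cup O_2$
  $\;\; = ((I_1 - K_1) \cup G_1) \cup ((I_2 - K_2) \cup G_2)$
  $\;\; = ( I - (K_1 \cap K_2) ) \cup (G_1 \cup G_2)$
  $K = K_1 \cap K_2$
  $G = G_1 \cup G_2$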
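The two summaries above are set identities, so they can be checked by brute force over a small universe. The sketch below does that; the two-element universe and the function names are assumptions chosen to keep the enumeration short.

```python
import itertools


def all_subsets(items):
    """Yield every subset of items as a set."""
    for size in range(len(items) + 1):
        for combo in itertools.combinations(items, size):
            yield set(combo)


def check_sequential_and_conditional(universe=("x", "y")):
    """Compare the step-by-step O with the summarized O for every choice of sets."""
    sets = list(all_subsets(universe))
    for i, k1, g1, k2, g2 in itertools.product(sets, repeat=5):
        # sequential: feed the output of S1 into S2
        step_by_step = (((i - k1) | g1) - k2) | g2
        summarized = (i - (k1 | k2)) | ((g1 - k2) | g2)
        if step_by_step != summarized:
            return False
        # conditional, any path: union of the two branch outputs
        either_branch = ((i - k1) | g1) | ((i - k2) | g2)
        summarized = (i - (k1 & k2)) | (g1 | g2)
        if either_branch != summarized:
            return False
    return True
```

`check_sequential_and_conditional()` returns `True`, which supports treating a sequence or a conditional as a single block with the combined K and G.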
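Structure method of data flow solution (2/4)
conditional structure (S1 or S2), all path
  $I = I_1 = I_2$
  $O = O_1 \cap O_2$
  $\;\; = ((I_1 - K_1) \cup G_1) \cap ((I_2 - K_2) \cup G_2)$
  $\;\; = ( I - (K_1 \cup K_2) ) \cup (G_1 \cap G_2)$
  $K = K_1 \cup K_2$
  $G = G_1 \cap G_2$
iterative structure (S1, then S2 S1 repeated zero or more times)
  any path
    $K = K_1$
    $G = ( G_2 - K_1 ) \cup G_1$
  all path
    $K = K_1 \cup K_2$
    $G = G_1$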
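One way to see where the iterative rules come from is to unroll the loop into the paths S1, S1 S2 S1, S1 S2 S1 S2 S1 and merge their summaries like branches of a conditional. The sketch below does this by brute force; the (K, G) pair encoding, the reading of the loop as S1 followed by repeated S2 S1, the two extra unrollings, and all function names are assumptions.

```python
import itertools


def sequential(a, b):
    """Summary (K, G) of a followed by b."""
    (k1, g1), (k2, g2) = a, b
    return (k1 | k2, (g1 - k2) | g2)


def loop_paths(s1, s2, extra_iterations=2):
    """Summaries of S1, S1 S2 S1, S1 S2 S1 S2 S1, ..."""
    path = s1
    paths = [path]
    for _ in range(extra_iterations):
        path = sequential(sequential(path, s2), s1)
        paths.append(path)
    return paths


def merge_any(paths):
    """Any path: killed on every path, generated on some path."""
    return (set.intersection(*(p[0] for p in paths)),
            set.union(*(p[1] for p in paths)))


def merge_all(paths):
    """All path: killed on some path, generated on every path."""
    return (set.union(*(p[0] for p in paths)),
            set.intersection(*(p[1] for p in paths)))


def iterative_rules_hold(universe=("x", "y")):
    sets = [set(c) for r in range(len(universe) + 1)
            for c in itertools.combinations(universe, r)]
    for k1, g1, k2, g2 in itertools.product(sets, repeat=4):
        paths = loop_paths((k1, g1), (k2, g2))
        if merge_any(paths) != (k1, (g2 - k1) | g1):
            return False
        if merge_all(paths) != (k1 | k2, g1):
            return False
    return True
```

`iterative_rules_hold()` returns `True` for these unrollings; further unrolling adds no new information because the summaries stop changing after one trip around the loop.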
Structure method of data flow solution (3/4)
example -- uninitialized variable
block  contents              Gen  Killed
b0     Read(Limit); I := 1   -    Limit, I
b1     I > Limit             -    -
b2     Read(J); I = 1        -    J
b3     Sum := J              -    Sum
b4     Sum := Sum + J        -    Sum
b5     I := I + 1            -    I
b6     Write(Sum)            I    -
(figure: data flow graph of the example, blocks 0 to 6)
Structure method of data flow solution (4/4)
step  structure             rule         Killed      Gen
1     b3,b4                 conditional  Sum         -
2     b2,b3,b4              sequential   J, Sum      -
3     b2,b3,b4,b5           sequential   J, Sum, I   -
4     b1,b2,b3,b4,b5        iterative    -           -
5     b0,b1,b2,b3,b4,b5     sequential   Limit, I    -
6     b0,b1,b2,b3,b4,b5,b6  sequential   Limit, I    I
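The six steps can be replayed mechanically with the any-path rules K = K1 ∪ K2, G = (G1 - K2) ∪ G2 for a sequence, K = K1 ∩ K2, G = G1 ∪ G2 for a conditional, and K = K1, G = (G2 - K1) ∪ G1 for a loop. The sketch below is one possible encoding: each structure is a (Killed, Gen) pair, and the function and variable names are assumptions.

```python
def sequential(a, b):
    (k1, g1), (k2, g2) = a, b
    return (k1 | k2, (g1 - k2) | g2)


def conditional_any(a, b):
    (k1, g1), (k2, g2) = a, b
    return (k1 & k2, g1 | g2)


def iterative_any(a, b):
    (k1, g1), (k2, g2) = a, b
    return (k1, (g2 - k1) | g1)


# (Killed, Gen) per basic block for the uninitialized-variable example
b0 = ({"Limit", "I"}, set())
b1 = (set(), set())
b2 = ({"J"}, set())
b3 = ({"Sum"}, set())
b4 = ({"Sum"}, set())
b5 = ({"I"}, set())
b6 = (set(), {"I"})

step1 = conditional_any(b3, b4)    # b3,b4
step2 = sequential(b2, step1)      # b2,b3,b4
step3 = sequential(step2, b5)      # b2..b5
step4 = iterative_any(b1, step3)   # loop: test b1, body b2..b5
step5 = sequential(b0, step4)      # b0..b5
step6 = sequential(step5, b6)      # whole program

# With every variable uninitialized on entry, O = (I - K) | G for the whole program.
universe = {"Limit", "I", "J", "Sum"}
killed, gen = step6
uninitialized_at_exit = (universe - killed) | gen
```

Under these assumptions `uninitialized_at_exit` is `{"I", "J", "Sum"}`: `J` and `Sum` are never assigned when the loop body does not run, and `I` is out of scope after the loop.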