introduction to optimizations

Post on 05-Jan-2016

48 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

DESCRIPTION

Introduction to Optimizations. Guo, Yao. Outline. Optimization Rules Basic Blocks Control Flow Graph (CFG) Loops Local Optimizations Peephole optimization. Levels of Optimizations. Local inside a basic block Global (intraprocedural) Across basic blocks Whole procedure analysis - PowerPoint PPT Presentation

TRANSCRIPT

School of EECS, Peking University

“Advanced Compiler Techniques” (Fall 2011)

Introduction to Introduction to OptimizationsOptimizations

Guo, YaoGuo, Yao

2Fall 2011“Advanced Compiler

Techniques”

OutlineOutline Optimization RulesOptimization Rules Basic BlocksBasic Blocks Control Flow Graph (CFG)Control Flow Graph (CFG)

LoopsLoops Local OptimizationsLocal Optimizations

Peephole optimizationPeephole optimization

3Fall 2011“Advanced Compiler

Techniques”

Levels of OptimizationsLevels of Optimizations LocalLocal

inside a basic blockinside a basic block Global (intraprocedural)Global (intraprocedural)

Across basic blocksAcross basic blocks Whole procedure analysisWhole procedure analysis

InterproceduralInterprocedural Across proceduresAcross procedures Whole program analysisWhole program analysis

4Fall 2011“Advanced Compiler

Techniques”

The Golden Rules of OptimizationThe Golden Rules of Optimization

Premature Optimization is Premature Optimization is EvilEvil

Donald Knuth, Donald Knuth, premature optimization premature optimization is the root of all evilis the root of all evil Optimization can introduce new, subtle bugsOptimization can introduce new, subtle bugs Optimization usually makes code harder to Optimization usually makes code harder to

understand and maintainunderstand and maintain Get your code right first, then, Get your code right first, then, if really if really

neededneeded, optimize it, optimize it Document optimizations carefullyDocument optimizations carefully Keep the non-optimized version handy, or even Keep the non-optimized version handy, or even

as a comment in your codeas a comment in your code

5Fall 2011“Advanced Compiler

Techniques”

The Golden Rules of OptimizationThe Golden Rules of Optimization

The 80/20 RuleThe 80/20 Rule In general, In general, 80% percent of a program80% percent of a program’’s s

execution time is spent executing execution time is spent executing 20% of the code20% of the code

90%/10% for performance-hungry 90%/10% for performance-hungry programsprograms

Spend your time optimizing the important Spend your time optimizing the important 10/20% of your program10/20% of your program

Optimize the common case even at the Optimize the common case even at the cost of making the uncommon case slowercost of making the uncommon case slower

6Fall 2011“Advanced Compiler

Techniques”

The Golden Rules of OptimizationThe Golden Rules of Optimization

Good Algorithms RuleGood Algorithms Rule The best and most important way of The best and most important way of

optimizing a program is using optimizing a program is using good good algorithmsalgorithms E.g. O(n*log) rather than O(nE.g. O(n*log) rather than O(n22))

However, we still need lower level optimization However, we still need lower level optimization to get more of our programsto get more of our programs

In addition, In addition, asymptotic complexityasymptotic complexity is not is not always an appropriate metric of efficiencyalways an appropriate metric of efficiency Hidden constant may be misleadingHidden constant may be misleading E.g.E.g. a linear time algorithm than runs in 100 a linear time algorithm than runs in 100*n+*n+100 100

time is slower than a cubic time algorithm than runs time is slower than a cubic time algorithm than runs in in nn33++10 time 10 time if the problem size is smallif the problem size is small

7Fall 2011“Advanced Compiler

Techniques”

Asymptotic ComplexityAsymptotic ComplexityHidden ConstantsHidden Constants

Hidden Contants

0

500

1000

1500

2000

2500

3000

0 5 10 15

Problem Size

Exe

cuti

on

Tim

e

100*n+100

n*n*n+10

8Fall 2011“Advanced Compiler

Techniques”

General Optimization General Optimization TechniquesTechniques

Strength reductionStrength reduction Use the fastest version of an operationUse the fastest version of an operation E.g. E.g.

x >> 2x >> 2 instead ofinstead of x / 4x / 4

x << 1x << 1 instead ofinstead of x * 2x * 2

Common sub expression eliminationCommon sub expression elimination Eliminate redundant calculationsEliminate redundant calculations E.g.E.g.

double x = d * (lim / max) * sx;double x = d * (lim / max) * sx;

double y = d * (lim / max) * sy;double y = d * (lim / max) * sy;

double depth = d * (lim / max);double depth = d * (lim / max);

double x = depth * sx;double x = depth * sx;

double y = depth * sy;double y = depth * sy;

9Fall 2011“Advanced Compiler

Techniques”

General Optimization General Optimization TechniquesTechniques

Code motionCode motion InvariantInvariant expressions should be executed only expressions should be executed only

onceonce E.g.E.g.

for (int i = 0; i < x.length; i++)for (int i = 0; i < x.length; i++)

x[i] *= Math.PI * Math.cos(y);x[i] *= Math.PI * Math.cos(y);

double picosy = Math.PI * Math.cos(y);double picosy = Math.PI * Math.cos(y);

for (int i = 0; i < x.length; i++)for (int i = 0; i < x.length; i++)

x[i] *= picosy;x[i] *= picosy;

10Fall 2011“Advanced Compiler

Techniques”

General Optimization General Optimization TechniquesTechniques

Loop unrollingLoop unrolling The overhead of the loop control code can be The overhead of the loop control code can be

reduced by executing more than one iteration in reduced by executing more than one iteration in the body of the loop. the body of the loop. E.g.E.g.double picosy = Math.PI * Math.cos(y);double picosy = Math.PI * Math.cos(y);

for (int i = 0; i < x.length; i++)for (int i = 0; i < x.length; i++)

x[i] *= picosy;x[i] *= picosy;

double picosy = Math.PI * Math.cos(y);double picosy = Math.PI * Math.cos(y);

for (int i = 0; i < x.length; i += 2) {for (int i = 0; i < x.length; i += 2) {

x[i] *= picosy;x[i] *= picosy;

x[i+1] *= picosy;x[i+1] *= picosy;

}}

A efficient “+1” in arrayA efficient “+1” in arrayindexing is requiredindexing is required

11Fall 2011“Advanced Compiler

Techniques”

Compiler OptimizationsCompiler Optimizations

Compilers try to generate Compilers try to generate goodgood code code i.e.i.e. Fast Fast

Code improvement is challengingCode improvement is challenging Many problems are NP-hardMany problems are NP-hard

Code improvement may slow down Code improvement may slow down the compilation processthe compilation process In some domains, such as just-in-time In some domains, such as just-in-time

compilation, compilation speed is criticalcompilation, compilation speed is critical

12Fall 2011“Advanced Compiler

Techniques”

Phases of CompilationPhases of Compilation The first The first

three phases three phases are are language-language-dependentdependent

The last two The last two are are machine-machine-dependentdependent

The middle The middle two two dependent on dependent on neither the neither the language nor language nor the machinethe machine

13Fall 2011“Advanced Compiler

Techniques”

PhasesPhases

14Fall 2011“Advanced Compiler

Techniques”

OutlineOutline Optimization RulesOptimization Rules Basic BlocksBasic Blocks Control Flow Graph (CFG)Control Flow Graph (CFG)

LoopsLoops Local OptimizationsLocal Optimizations

Peephole optmizationPeephole optmization

15Fall 2011“Advanced Compiler

Techniques”

Basic BlocksBasic Blocks A A basic blockbasic block is a maximal sequence of is a maximal sequence of

consecutive three-address instructions consecutive three-address instructions with the following properties:with the following properties: The flow of control can only enter the basic The flow of control can only enter the basic

block thru the 1st instr.block thru the 1st instr. Control will leave the block without halting Control will leave the block without halting

or branching, except possibly at the last or branching, except possibly at the last instr.instr.

Basic blocks become the nodes of a Basic blocks become the nodes of a flow flow graphgraph, with edges indicating the order., with edges indicating the order.

16Fall 2011“Advanced Compiler

Techniques”

ExamplesExamples1)1) i = 1i = 12)2) j = 1j = 13)3) t1 = 10 * it1 = 10 * i4)4) t2 = t1 + jt2 = t1 + j5)5) t3 = 8 * t2t3 = 8 * t26)6) t4 = t3 - 88t4 = t3 - 887)7) a[t4] = 0.0a[t4] = 0.08)8) j = j + 1j = j + 19)9) if j <= 10 goto (3)if j <= 10 goto (3)10)10) i = i + 1i = i + 111)11) if i <= 10 goto (2)if i <= 10 goto (2)12)12) i = 1i = 113)13) t5 = i - 1t5 = i - 114)14) t6 = 88 * t5t6 = 88 * t515)15) a[t6] = 1.0a[t6] = 1.016)16) i = i + 1i = i + 117)17) if i <= 10 goto (13)if i <= 10 goto (13)

forfor i from 1 to 10 i from 1 to 10 dodo forfor j from 1 to 10 j from 1 to 10 dodo a[i,j]=0.0a[i,j]=0.0

forfor i from 1 to 10 i from 1 to 10 dodo a[i,i]=0.0a[i,i]=0.0

17Fall 2011“Advanced Compiler

Techniques”

Identifying Basic BlocksIdentifying Basic Blocks Input: sequence of instructions Input: sequence of instructions

instr(i)instr(i) Output: A list of basic blocksOutput: A list of basic blocks Method:Method:

Identify Identify leadersleaders:: the first instruction of a basic block the first instruction of a basic block

Iterate: add subsequent instructions to Iterate: add subsequent instructions to basic block until we reach another basic block until we reach another leaderleader

18Fall 2011“Advanced Compiler

Techniques”

Identifying LeadersIdentifying Leaders Rules for finding leaders in codeRules for finding leaders in code

First instrFirst instr in the code is a leader in the code is a leader Any instr that is the Any instr that is the targettarget of a of a

(conditional or unconditional) jump is a (conditional or unconditional) jump is a leaderleader

Any instr that Any instr that immediately followimmediately follow a a (conditional or unconditional) jump is a (conditional or unconditional) jump is a leaderleader

19Fall 2011“Advanced Compiler

Techniques”

Basic Block Partition Basic Block Partition AlgorithmAlgorithm

leaders = {1}leaders = {1} // start of program// start of programforfor i = 1 to |n| i = 1 to |n| // all instructions// all instructions

ifif instr(i) is a branch instr(i) is a branchleaders = leaders leaders = leaders UU targets of instr(i) targets of instr(i) UU

instr(i+1)instr(i+1)worklist = leadersworklist = leadersWhileWhile worklist worklist notnot emptyempty

x = first instruction in worklistx = first instruction in worklistworklist = worklist worklist = worklist –– {x} {x}block(x) = {x}block(x) = {x}forfor i = x + 1; i <= |n| && i i = x + 1; i <= |n| && i notnot inin leaders; i++ leaders; i++

block(x) = block(x) block(x) = block(x) UU {i} {i}

20Fall 2011“Advanced Compiler

Techniques”

EE

AA

BB

CC

DD

FF

Basic Block ExampleBasic Block Example

Leaders

1.1. i = 1i = 12.2. j = 1j = 13.3. t1 = 10 * it1 = 10 * i4.4. t2 = t1 + jt2 = t1 + j5.5. t3 = 8 * t2t3 = 8 * t26.6. t4 = t3 - 88t4 = t3 - 887.7. a[t4] = 0.0a[t4] = 0.08.8. j = j + 1j = j + 19.9. if j <= 10 goto (3)if j <= 10 goto (3)10.10. i = i + 1i = i + 111.11. if i <= 10 goto (2)if i <= 10 goto (2)12.12. i = 1i = 113.13. t5 = i - 1t5 = i - 114.14. t6 = 88 * t5t6 = 88 * t515.15. a[t6] = 1.0a[t6] = 1.016.16. i = i + 1i = i + 117.17. if i <= 10 goto if i <= 10 goto

(13)(13)

Basic Blocks

21Fall 2011“Advanced Compiler

Techniques”

OutlineOutline Optimization RulesOptimization Rules Basic BlocksBasic Blocks Control Flow Graph (CFG)Control Flow Graph (CFG)

LoopsLoops Local OptimizationsLocal Optimizations

Peephole optmizationPeephole optmization

22Fall 2011“Advanced Compiler

Techniques”

Control-Flow GraphsControl-Flow Graphs Control-flow graphControl-flow graph::

Node: an instruction or sequence of Node: an instruction or sequence of instructions (a instructions (a basic blockbasic block)) Two instructions i, j in same basic blockTwo instructions i, j in same basic block

iffiff execution of i execution of i guaranteesguarantees execution of j execution of j Directed edge: Directed edge: potentialpotential flow of controlflow of control Distinguished start node Distinguished start node Entry & ExitEntry & Exit

First & last instruction in programFirst & last instruction in program

23Fall 2011“Advanced Compiler

Techniques”

Control-Flow EdgesControl-Flow Edges Basic blocks = nodesBasic blocks = nodes Edges:Edges:

Add directed edge between B1 and B2 if:Add directed edge between B1 and B2 if: Branch from last statement of B1 to first Branch from last statement of B1 to first

statement of B2 (B2 is a leader), orstatement of B2 (B2 is a leader), or B2 immediately follows B1 in program order B2 immediately follows B1 in program order

and B1 does not end with unconditional branch and B1 does not end with unconditional branch (goto)(goto)

Definition of Definition of predecessorpredecessor and and successorsuccessor B1 is a predecessor of B2B1 is a predecessor of B2 B2 is a successor of B1B2 is a successor of B1

24Fall 2011“Advanced Compiler

Techniques”

Control-Flow Edge Control-Flow Edge AlgorithmAlgorithm

InputInput: block(i), sequence of basic blocks: block(i), sequence of basic blocks

OutputOutput: CFG where nodes are basic blocks: CFG where nodes are basic blocks

for i = 1 to the number of blocksfor i = 1 to the number of blocks

x = last instruction of block(i)x = last instruction of block(i)

if instr(x) is a branchif instr(x) is a branch

for each target y of instr(x),for each target y of instr(x),

create edge (i create edge (i ->-> y) y)

if instr(x) is if instr(x) is notnot unconditional branch, unconditional branch,

create edge (i -> i+1)create edge (i -> i+1)

25Fall 2011“Advanced Compiler

Techniques”

CFG ExampleCFG Example

26Fall 2011“Advanced Compiler

Techniques”

LoopsLoops Loops comes fromLoops comes from

while, do-while, for, goto……while, do-while, for, goto…… Loop definition: A set of nodes L in a Loop definition: A set of nodes L in a

CFG is a CFG is a looploop if if1.1. There is a node called the There is a node called the loop entryloop entry: :

no other node in L has a predecessor no other node in L has a predecessor outside L.outside L.

2.2. Every node in L has a nonempty path Every node in L has a nonempty path (within L) to the entry of L.(within L) to the entry of L.

27Fall 2011“Advanced Compiler

Techniques”

Loop ExamplesLoop Examples

{B3}{B3} {B6}{B6} {B2, B3, B4}{B2, B3, B4}

28Fall 2011“Advanced Compiler

Techniques”

Identifying LoopsIdentifying Loops MotivationMotivation

majority of runtimemajority of runtime focus optimization on loop bodies!focus optimization on loop bodies!

remove redundant code, replace expensive remove redundant code, replace expensive operations operations )) speed up program speed up program

Finding loops:Finding loops: easy… easy…

for i = 1 to 1000for i = 1 to 1000

for j = 1 to 1000for j = 1 to 1000

for k = 1 to 1000for k = 1 to 1000

do somethingdo something

11 i = 1; j = 1; k = 1;i = 1; j = 1; k = 1;

22 A1: if i > 1000 goto L1;A1: if i > 1000 goto L1;

33 A2: if j > 1000 goto L2;A2: if j > 1000 goto L2;

44 A3: if k > 1000 goto L3;A3: if k > 1000 goto L3;

55 do somethingdo something

66 k = k + 1; goto A3;k = k + 1; goto A3;

77 L3: j = j + 1; goto A2;L3: j = j + 1; goto A2;

88 L2: i = i + 1; goto A1;L2: i = i + 1; goto A1;

99 L1: haltL1: halt

or harderor harder(GOTOs)(GOTOs)

29Fall 2011“Advanced Compiler

Techniques”

OutlineOutline Optimization RulesOptimization Rules Basic BlocksBasic Blocks Control Flow Graph (CFG)Control Flow Graph (CFG)

LoopsLoops Local OptimizationsLocal Optimizations

Peephole optmizationPeephole optmization

30Fall 2011“Advanced Compiler

Techniques”

Local OptimizationLocal Optimization

Optimization of basic blocksOptimization of basic blocks

§8.5§8.5

31Fall 2011“Advanced Compiler

Techniques”

Transformations on basic Transformations on basic blocksblocks

Common subexpression eliminationCommon subexpression elimination: recognize : recognize redundant computations, replace with single redundant computations, replace with single temporarytemporary

Dead-code eliminationDead-code elimination: recognize computations : recognize computations not used subsequently, remove quadruplesnot used subsequently, remove quadruples

Interchange statements, for better schedulingInterchange statements, for better scheduling Renaming of temporaries, for better register Renaming of temporaries, for better register

usageusage

All of the above require All of the above require symbolic executionsymbolic execution of of the basic block, to obtain def/use informationthe basic block, to obtain def/use information

32Fall 2011“Advanced Compiler

Techniques”

Simple symbolic Simple symbolic interpretation:interpretation:

next-use information next-use information If x is computed in statementIf x is computed in statement ii,, and is an and is an

operand of statementoperand of statement j, j > ij, j > i,, its value must its value must be preserved (register or memory) until be preserved (register or memory) until jj..

If x is computed at If x is computed at k, k > ik, k > i,, the value the value computed at i has no further use, and be computed at i has no further use, and be discarded (i.e. register reused)discarded (i.e. register reused)

Next-use information is annotated over Next-use information is annotated over statementsstatements and symbol table.and symbol table.

Computed on one backwards pass over Computed on one backwards pass over statement.statement.

33Fall 2011“Advanced Compiler

Techniques”

Next-Use InformationNext-Use Information DefinitionsDefinitions

1.1. Statement Statement i i assigns a value to assigns a value to x;x;

2.2. Statement Statement j j has has xx as an operand; as an operand;

3.3. Control can flow from i to j along a Control can flow from i to j along a path with no intervening assignments path with no intervening assignments to x;to x;

Statement Statement j usesj uses the value of the value of xx computed at statement computed at statement ii..

i.e.,i.e., x x isis live live at statementat statement i. i.

34Fall 2011“Advanced Compiler

Techniques”

Computing next-useComputing next-use

Use symbol table to annotate Use symbol table to annotate status of variablesstatus of variables

Each operand in a statementEach operand in a statement carries additional information:carries additional information: Operand liveness (boolean)Operand liveness (boolean) Operand next use (later statementOperand next use (later statement))

On exit from block, all temporaries On exit from block, all temporaries are dead (no next-use)are dead (no next-use)

35Fall 2011“Advanced Compiler

Techniques”

AlgorithmAlgorithm INPUT: a basic block BINPUT: a basic block B OUTPUT: at each statement OUTPUT: at each statement i: x=y op zi: x=y op z in B, in B,

create liveness and next-use for x, y, zcreate liveness and next-use for x, y, z METHOD: for each statement in B METHOD: for each statement in B

(backward)(backward) Retrieve liveness & next-use info from a tableRetrieve liveness & next-use info from a table Set Set xx to “ to “not livenot live” and “” and “no next-useno next-use”” Set Set y, zy, z to “ to “livelive” and the next uses of y,z to “” and the next uses of y,z to “ii””

Note: step 2 & 3 cannot be interchanged.Note: step 2 & 3 cannot be interchanged. E.g., x = x + yE.g., x = x + y

36Fall 2011“Advanced Compiler

Techniques”

ExampleExample

1.1. x = 1x = 1

2.2. y = 1y = 1

3.3. x = x + yx = x + y

4.4. z = yz = y

5.5. x = y + zx = y + z

Exit:x: live, 6 y: not livez: not live

37Fall 2011“Advanced Compiler

Techniques”

Computing dependencies in a Computing dependencies in a basic block: the DAGbasic block: the DAG

Use Use directed acyclic graphdirected acyclic graph (DAG) to recognize (DAG) to recognize common subexpressions and remove redundant common subexpressions and remove redundant quadruples.quadruples.

Intermediate code optimization:Intermediate code optimization: basic block => DAG => improved block => basic block => DAG => improved block =>

assemblyassembly

Leaves are labeled with identifiers and constants.Leaves are labeled with identifiers and constants. Internal nodes are labeled with operators and Internal nodes are labeled with operators and

identifiersidentifiers

38Fall 2011“Advanced Compiler

Techniques”

DAG constructionDAG construction Forward pass over basic blockForward pass over basic block For For x = y x = y opop z; z;

Find node labeled y, or create one Find node labeled y, or create one Find node labeled z, or create oneFind node labeled z, or create one Create new node for Create new node for opop, or find an existing one , or find an existing one

with descendants y, z (with descendants y, z (need hash schemeneed hash scheme)) Add x to list of labels for new nodeAdd x to list of labels for new node Remove label x from node on which it appearedRemove label x from node on which it appeared

For For x = y;x = y; Add x to list of labels of node which currently Add x to list of labels of node which currently

holds yholds y

39Fall 2011“Advanced Compiler

Techniques”

DAG ExampleDAG Example Transform a basic block into a Transform a basic block into a

DAG.DAG.

a = b + ca = b + c

b = a – db = a – d

c = b + cc = b + c

d = a - dd = a - d

+

+

-

b0 c0

d0a

b, d

c

40Fall 2011“Advanced Compiler

Techniques”

Local Common Subexpr. Local Common Subexpr. (LCS)(LCS)

Suppose b is not live on exit.Suppose b is not live on exit.

a = b + ca = b + c

b = a – db = a – d

c = b + cc = b + c

d = a - dd = a - d

+

+

-

b0 c0

d0a

b, d

c

a = b + ca = b + c

d = a – dd = a – d

c = d + cc = d + c

41Fall 2011“Advanced Compiler

Techniques”

LCS: another exampleLCS: another example

a = b + ca = b + c

b = b – db = b – d

c = c + dc = c + d

e = b + ce = b + c

++ -

b0 c0 d0

a b

e+

c

42Fall 2011“Advanced Compiler

Techniques”

Common subexpCommon subexp

Programmers donProgrammers don’’t produce t produce common subexpressions, code common subexpressions, code generators do!generators do!

43Fall 2011“Advanced Compiler

Techniques”

Dead Code EliminationDead Code Elimination

Delete any root that has no live Delete any root that has no live variables attachedvariables attached

a = b + ca = b + c

b = b – db = b – d

c = c + dc = c + d

e = b + ce = b + c++ -

b0 c0 d0

a b

e+

c On exit:

a, b live

c, e not live

a = b + ca = b + c

b = b – db = b – d

44Fall 2011“Advanced Compiler

Techniques”

OutlineOutline Optimization RulesOptimization Rules Basic BlocksBasic Blocks Control Flow Graph (CFG)Control Flow Graph (CFG)

LoopsLoops Local OptimizationsLocal Optimizations

Peephole optmizationPeephole optmization

45Fall 2011“Advanced Compiler

Techniques”

Peephole OptimizationPeephole Optimization Dragon§8.7Dragon§8.7

Introduction to peepholeIntroduction to peephole Common techniquesCommon techniques Algebraic identitiesAlgebraic identities An exampleAn example

46Fall 2011“Advanced Compiler

Techniques”

Peephole OptimizationPeephole Optimization Simple compiler do not perform machine-Simple compiler do not perform machine-

independent code improvementindependent code improvement They generates They generates naivenaive code code

It is possible to take the target hole and It is possible to take the target hole and optimize itoptimize it Sub-optimal sequences of instructions that Sub-optimal sequences of instructions that

match an optimization pattern are transformed match an optimization pattern are transformed into optimal sequences of instructionsinto optimal sequences of instructions

This technique is known as This technique is known as peephole optimization

Peephole optimization usually works by sliding a Peephole optimization usually works by sliding a window of several instructions (a window of several instructions (a peepholepeephole))

47Fall 2011“Advanced Compiler

Techniques”

Peephole OptimizationPeephole Optimization

Goals:- improve performance- reduce memory footprint- reduce code size

Method: 1. Exam short sequences of target instructions 2. Replacing the sequence by a more efficient one.

• redundant-instruction elimination • algebraic simplifications• flow-of-control optimizations • use of machine idioms

48Fall 2011“Advanced Compiler

Techniques”

Peephole OptimizationPeephole OptimizationCommon TechniquesCommon Techniques

49Fall 2011“Advanced Compiler

Techniques”

Peephole OptimizationPeephole OptimizationCommon TechniquesCommon Techniques

50Fall 2011“Advanced Compiler

Techniques”

Peephole OptimizationPeephole OptimizationCommon TechniquesCommon Techniques

51Fall 2011“Advanced Compiler

Techniques”

Peephole OptimizationPeephole OptimizationCommon TechniquesCommon Techniques

52Fall 2011“Advanced Compiler

Techniques”

Algebraic identitiesAlgebraic identities Worth recognizing single instructions with Worth recognizing single instructions with

a constant operanda constant operand Eliminate computationsEliminate computations

A * 1 = AA * 1 = A A * 0 = 0A * 0 = 0 A / 1 = AA / 1 = A

Reduce strenthReduce strenth A * 2 = A + AA * 2 = A + A A/2 = A * 0.5A/2 = A * 0.5

Constant foldingConstant folding 2 * 3.14 = 6.282 * 3.14 = 6.28

More delicate with floating-pointMore delicate with floating-point

53Fall 2011“Advanced Compiler

Techniques”

Is this ever helpful?Is this ever helpful? Why would anyone write X * 1?Why would anyone write X * 1? Why bother to correct such obvious junk Why bother to correct such obvious junk

code?code? In fact one might writeIn fact one might write

#define MAX_TASKS 1#define MAX_TASKS 1......a = b * MAX_TASKS;a = b * MAX_TASKS;

Also, seemingly redundant code can be Also, seemingly redundant code can be produced by other optimizations. produced by other optimizations. This is an important effect.This is an important effect.

54Fall 2011“Advanced Compiler

Techniques”

Replace Multiply by ShiftReplace Multiply by Shift A := A * 4;A := A * 4;

Can be replaced by 2-bit left shift Can be replaced by 2-bit left shift (signed/unsigned)(signed/unsigned)

But must worry about overflow if language But must worry about overflow if language doesdoes

A := A / 4;A := A / 4; If If unsignedunsigned, can replace with shift right, can replace with shift right But shift right arithmetic is a well-known But shift right arithmetic is a well-known

problemproblem Language may allow it anyway (traditional C)Language may allow it anyway (traditional C)

55Fall 2011“Advanced Compiler

Techniques”

The Right Shift problemThe Right Shift problem Arithmetic Right shift: Arithmetic Right shift:

shift right and use sign bit to fill most shift right and use sign bit to fill most significant bitssignificant bits

-5 -5 111111...1111111011111111...1111111011 SAR SAR 111111...1111111101111111...1111111101 which is -3, not -2which is -3, not -2 in most languages -5/2 = -2in most languages -5/2 = -2

56Fall 2011“Advanced Compiler

Techniques”

Addition chains for Addition chains for multiplicationmultiplication

If multiply is very slow (or on a machine If multiply is very slow (or on a machine with no multiply instruction like the with no multiply instruction like the original SPARC), decomposing a constant original SPARC), decomposing a constant operand into sum of powers of two can be operand into sum of powers of two can be effective:effective: X * 125 = x * 128 - x*4 + xX * 125 = x * 128 - x*4 + x two shifts, one subtract and one add, which two shifts, one subtract and one add, which

may be faster than one multiplymay be faster than one multiply Note similarity with efficient exponentiation Note similarity with efficient exponentiation

methodmethod

57Fall 2011“Advanced Compiler

Techniques”

Flow-of-control Flow-of-control optimizationsoptimizations

goto L1 . . .L1: goto L2

goto L2 . . .L1: goto L2

if a < b goto L1 . . .L1: goto L2

if a < b goto L2 . . .L1: goto L2

goto L1 . . .L1: if a < b goto L2L3:

if a < b goto L2 goto L3 . . .L3:

58Fall 2011“Advanced Compiler

Techniques”

Peephole Opt: an Peephole Opt: an ExampleExample

debug = 0. . .if(debug) { print debugging information }

debug = 0 . . . if debug = 1 goto L1 goto L2L1: print debugging informationL2:

Source Code:

IntermediateCode:

59Fall 2011“Advanced Compiler

Techniques”

Eliminate Jump after Eliminate Jump after JumpJump

Before:

After:

debug = 0 . . . if debug = 1 goto L1 goto L2L1: print debugging informationL2:

debug = 0 . . . if debug 1 goto L2 print debugging informationL2:

60Fall 2011“Advanced Compiler

Techniques”

Constant PropagationConstant Propagation

Before:

After:

debug = 0 . . . if debug 1 goto L2 print debugging informationL2:

debug = 0 . . . if 0 1 goto L2 print debugging informationL2:

61Fall 2011“Advanced Compiler

Techniques”

Unreachable CodeUnreachable Code(dead code elimination)(dead code elimination)

Before:

After:

debug = 0 . . .

debug = 0 . . . if 0 1 goto L2 print debugging informationL2:

62Fall 2011“Advanced Compiler

Techniques”

Peephole Optimization Peephole Optimization SummarySummary

Peephole optimization is very fastPeephole optimization is very fast Small overhead per instruction since Small overhead per instruction since

they use a small, fixed-size windowthey use a small, fixed-size window

It is often easier to generate naIt is often easier to generate naïïve ve code and run peephole optimization code and run peephole optimization than generating good code!than generating good code!

63Fall 2011“Advanced Compiler

Techniques”

SummarySummary Introduction to optimizationIntroduction to optimization

Basic knowledgeBasic knowledge Basic blocksBasic blocks Control-flow graphsControl-flow graphs

Local OptimizationsLocal Optimizations Peephole optimizationsPeephole optimizations

64Fall 2011“Advanced Compiler

Techniques”

Next TimeNext Time Dataflow analysisDataflow analysis

Dragon§9.2 Dragon§9.2

top related