
Page 1: Code Optimization

Course Overview

PART I: overview material
1 Introduction
2 Language processors (tombstone diagrams, bootstrapping)
3 Architecture of a compiler

PART II: inside a compiler
4 Syntax analysis
5 Contextual analysis
6 Runtime organization
7 Code generation

PART III: conclusion
8 Interpretation
9 Review

Supplementary material: Code optimization

Page 2: Code Optimization

What This Topic is About

The code generated by our compiler is not efficient:
• It computes some values at runtime that could be known at compile time
• It computes some values more times than necessary

We can do better!
• Constant folding
• Common sub-expression elimination
• Code motion
• Dead code elimination

Page 3: Code Optimization

Constant folding

• Consider:

static double pi = 3.14159;
double volume = 4 * pi / 3 * r * r * r;

• The compiler could compute 4 * pi / 3 as 4.18879 before the program runs. How many instructions would this save at run time?
• Why shouldn't the programmer just write 4.18879 * r * r * r?
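As a sketch of the intended effect (hypothetical function; it assumes the compiler can see that pi is never reassigned), the folded code behaves as if the source had been:

double sphere_volume(double r)
{
    /* 4 * pi / 3 has been folded to 4.18879 at compile time, so no
       multiplication or division involving pi happens at run time. */
    return 4.18879 * r * r * r;
}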

Page 4: Code Optimization

Constant folding II

• Consider:

struct { int y, m, d; } holidays[6];
holidays[2].m = 12;
holidays[2].d = 25;

• If the address of holidays is x, what is the address of holidays[2].m?
• Could the programmer evaluate this at compile time? Should the programmer do this?
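A minimal sketch of the arithmetic, assuming 4-byte ints and no struct padding (the struct tag date and the use of sizeof/offsetof are only for illustration):

#include <stddef.h>
#include <stdio.h>

struct date { int y, m, d; };   /* same layout as the anonymous struct above */

int main(void)
{
    /* holidays[2].m lies at x + 2 * sizeof(struct date) + offsetof(struct date, m),
       i.e. x + 2 * 12 + 4 = x + 28 under the stated assumptions.  All of these
       numbers are known at compile time, so the compiler can fold the offset. */
    printf("%zu\n", 2 * sizeof(struct date) + offsetof(struct date, m));
    return 0;
}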

Page 5: Code Optimization

Constant folding III

• An expression whose value the compiler can work out at compile time is called “manifest”.

• How can the compiler know if the value of an expression is manifest?
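One common answer, sketched in hypothetical C (the Expr type and fold function are not from the course compiler): a sub-expression is manifest when it is a literal, or when all of its operands are manifest, so a bottom-up walk over the expression tree can both detect and fold it.

/* Hypothetical expression-tree node: op == 0 marks a constant leaf.
   A real AST would also have variable-reference nodes, which are never manifest. */
typedef struct Expr {
    char op;                     /* '+', '-', '*', '/' or 0 for a constant */
    double value;                /* valid when op == 0 */
    struct Expr *left, *right;
} Expr;

/* Fold manifest sub-expressions bottom-up; returns 1 if e is now a constant. */
int fold(Expr *e)
{
    if (e->op == 0) return 1;                 /* literal: already manifest */
    int l = fold(e->left);
    int r = fold(e->right);
    if (!(l && r)) return 0;                  /* some operand is not manifest */
    double a = e->left->value, b = e->right->value;
    switch (e->op) {
        case '+': e->value = a + b; break;
        case '-': e->value = a - b; break;
        case '*': e->value = a * b; break;
        case '/': e->value = a / b; break;
    }
    e->op = 0;                                /* this node is now a constant leaf */
    return 1;
}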

Page 6: Code Optimization

Common sub-expression elimination

• Consider:

t = (x - y) * (x - y + z);

• Computing x - y takes three instructions; could we save some of them?
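At the source level, the effect is roughly what a programmer would get by introducing a temporary by hand (the names f and d are only illustrative):

int f(int x, int y, int z)
{
    int d = x - y;          /* the common sub-expression, computed once */
    return d * (d + z);
}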

Page 7: Code Optimization

Common sub-expression elimination II

t = (x - y) * (x - y + z);

Naïve code:

load x
load y
sub
load x
load y
sub
load z
add
mult
store t

Better code:

load x
load y
sub
dup
load z
add
mult
store t

Page 8: Code Optimization

Common sub-expression elimination III

• Consider:

struct { int y, m, d; } holidays[6];
holidays[i].m = 12;
holidays[i].d = 25;

• The address of holidays[i] is a common sub-expression.
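A source-level sketch of the effect (the struct tag date, the function name and the pointer p are illustrative; a real compiler does this on the intermediate code, not on the source):

struct date { int y, m, d; };
struct date holidays[6];

void mark_christmas(int i)
{
    struct date *p = &holidays[i];   /* address of holidays[i] computed once */
    p->m = 12;
    p->d = 25;
}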

Page 9: Code Optimization

Common sub-expression elimination IV

• But, be careful!

t = (x - y++) * (x - y++ + z);

• Is x - y++ still a common sub-expression?
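A sketch of why not: the increment changes y between the two occurrences, so they denote different values and must not be merged. In a language with left-to-right evaluation, such as Java, the expression expands roughly as below (in C the two unsequenced modifications of y actually make the original expression undefined); the function g and the temporaries a and b are illustrative only.

int g(int x, int y, int z)
{
    int a = x - y;          /* first occurrence, then its side effect */
    y = y + 1;
    int b = x - y + z;      /* second occurrence sees the updated y */
    y = y + 1;
    return a * b;
}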

Page 10: Code Optimization

Code motion

• Consider:

char name[3][10];
for (int i = 0; i < 3; i++) {
  for (int j = 0; j < 10; j++) {
    name[i][j] = 'a';
  }
}

• The address of name[i][j] is address[name] + (i * 10) + j
• Most of that computation, namely address[name] + (i * 10), is constant throughout the inner loop

Page 11: Code Optimization

Code motion II

• You can think of this as rewriting the original code:

char name[3][10];
for (int i = 0; i < 3; i++) {
  for (int j = 0; j < 10; j++) {
    name[i][j] = 'a';
  }
}

as:

char name[3][10];
for (int i = 0; i < 3; i++) {
  char *x = &(name[i][0]);
  for (int j = 0; j < 10; j++) {
    x[j] = 'a';
  }
}

Page 12: Code Optimization

Code motion III

• However, this might be a bad idea in some cases. Why? Consider very small values of variable k:

char name[3][10];
for (int i = 0; i < 3; i++) {
  for (int j = 0; j < k; j++) {
    name[i][j] = 'a';
  }
}

char name[3][10];
for (int i = 0; i < 3; i++) {
  char *x = &(name[i][0]);
  for (int j = 0; j < k; j++) {
    x[j] = 'a';
  }
}

Page 13: Code Optimization

Dead code elimination

• Consider:

int f(int x, int y, int z)
{
  int t = (x - y) * (x - y + z);
  return 6;
}

• Computing t takes many instructions, but the value of t is never used.
• We call the value of t “dead” (or the variable t dead) because it can never affect the final value of the computation. Computing dead values and assigning to dead variables is wasteful.
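A sketch of the result after dead code elimination (safe here because the eliminated expression has no side effects):

int f(int x, int y, int z)
{
    /* the computation of t was dead and has been removed */
    return 6;
}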

Page 14: Code Optimization

Dead code elimination II

• But consider:

int f(int x, int y, int z)
{
  int t = x * y;
  int r = t * z;
  t = (x - y) * (x - y + z);
  return r;
}

• Now t is only dead for part of its existence. So it requires a careful algorithm to identify which code is dead, and therefore which code can be safely removed.

Page 15: Code Optimization

Optimization implementation

• What do we need to know in order to apply an optimization?
  – Constant folding
  – Common sub-expression elimination
  – Code motion
  – Dead code elimination
  – Many other kinds of optimizations

• Is the optimization correct or safe?
• Is the optimization really an improvement?
• What sort of analyses do we need to perform to get the required information?

Page 16: Code Optimization

Basic blocks

• A basic block is a sequence of instructions that is entered only at the beginning and exited only at the end.

• A flow graph is a collection of basic blocks connected by edges indicating the flow of control.
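A minimal sketch of how these two notions might be represented inside a compiler (all type and field names here are illustrative, not taken from the course compiler):

#include <stddef.h>

typedef struct BasicBlock {
    const char **instrs;          /* the straight-line instruction sequence */
    size_t n_instrs;
    struct BasicBlock **succs;    /* control-flow edges to successor blocks */
    size_t n_succs;
} BasicBlock;

typedef struct FlowGraph {
    BasicBlock **blocks;          /* all basic blocks; blocks[0] is the entry */
    size_t n_blocks;
} FlowGraph;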

Page 17: Code Optimization

Finding basic blocks (Example: JVM code)

iconst_1
istore 2
iconst_2
istore 3
Label_1:
iload 3
iload 1
if_icmplt Label_4
iconst_0
goto Label_5
Label_4:
iconst_1
Label_5:
ifeq Label_2
iload 2
iload 3
imul
dup
istore 2
pop
Label_3:
iload 3
dup
iconst_1
iadd
istore 3
pop
goto Label_1
Label_2:
iload 2
ireturn

Mark the first instruction, labelled instructions, and the instructions immediately following jumps: each marked instruction starts a new basic block.
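A sketch of the marking step (the Instr representation is hypothetical: each instruction just records whether it is labelled, i.e. a branch target, and whether it is a jump):

typedef struct {
    const char *text;    /* mnemonic, kept only for printing */
    int is_labelled;     /* 1 if some branch targets this instruction */
    int is_jump;         /* 1 for goto, if_icmplt, ifeq, ireturn, ... */
} Instr;

/* Mark the leaders: the first instruction, labelled instructions,
   and instructions immediately following a jump.  Each basic block
   then runs from one leader up to (but not including) the next. */
void mark_leaders(const Instr *code, int n, int *leader)
{
    for (int i = 0; i < n; i++)
        leader[i] = (i == 0)
                 || code[i].is_labelled
                 || code[i - 1].is_jump;
}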

Page 18: Code Optimization

Finding basic blocks II

The same code, partitioned into its basic blocks:

iconst_1
istore 2
iconst_2
istore 3

Label_1:
iload 3
iload 1
if_icmplt Label_4

iconst_0
goto Label_5

Label_4:
iconst_1

Label_5:
ifeq Label_2

iload 2
iload 3
imul
dup
istore 2
pop

Label_3:
iload 3
dup
iconst_1
iadd
istore 3
pop
goto Label_1

Label_2:
iload 2
ireturn

Page 19: Code Optimization

Flow graphs

The basic blocks, now numbered 0 to 7, with the control-flow edges between them:

0: iconst_1
   istore 2
   iconst_2
   istore 3
   (falls through to block 1)

1: iload 3
   iload 1
   if_icmplt 3
   (branches to block 3 if the comparison succeeds, otherwise falls through to block 2)

2: iconst_0
   goto 4

3: iconst_1
   (falls through to block 4)

4: ifeq 7
   (branches to block 7 if the tested value is zero, otherwise falls through to block 5)

5: iload 2
   iload 3
   imul
   dup
   istore 2
   pop
   (falls through to block 6)

6: iload 3
   dup
   iconst_1
   iadd
   istore 3
   pop
   goto 1

7: iload 2
   ireturn
   (exit)

Page 20: Code Optimization

Local optimizations (within a basic block)

• Everything you need to know is easy to determine
• For example: live variable analysis
  – Start at the end of the block and work backwards
  – Assume everything is live at the end of the basic block
  – At each instruction, start from the live/dead info of the instruction after it
  – If you see an assignment to x, then mark x “dead”
  – If you see a reference to y, then mark y “live”

5:
            live: 1, 2, 3
   iload 2
            live: 1, 3
   iload 3
            live: 1, 3
   imul
            live: 1, 3
   dup
            live: 1, 3
   istore 2
            live: 1, 2, 3
   pop
            live: 1, 2, 3
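A sketch of this backward scan in C, for a hypothetical instruction form that names at most one assigned variable and at most two referenced variables; bit i of the mask stands for variable i.

typedef struct {
    int def;        /* variable assigned here, or -1 */
    int use[2];     /* variables read here, -1 for unused slots */
} TacInstr;

/* Backward live-variable scan over one basic block.
   live_out is the set of variables live at the end of the block;
   the return value is the set live at the start of the block. */
unsigned local_liveness(const TacInstr *block, int n, unsigned live_out)
{
    unsigned live = live_out;
    for (int i = n - 1; i >= 0; i--) {
        if (block[i].def >= 0)
            live &= ~(1u << block[i].def);       /* assignment: mark it dead */
        for (int k = 0; k < 2; k++)
            if (block[i].use[k] >= 0)
                live |= 1u << block[i].use[k];   /* reference: mark it live */
    }
    return live;
}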

Page 21: Code Optimization

Global optimizations

• Global means “across all basic blocks”
• We must know what happens across block boundaries
• For example: live variable analysis
  – The liveness of a value depends on its later uses, perhaps in other blocks
  – What values does this block define and use?

5: iload 2
   iload 3
   imul
   dup
   istore 2
   pop

Define: 2
Use: 2, 3

Page 22: Code Optimization

Global live variable analysis

• We define four sets for each basic block B
  – def[B] = variables defined in B before they are used in B
  – use[B] = variables used in B before they are defined in B
  – in[B] = variables live at the beginning of B
  – out[B] = variables live at the end of B

• These sets are related by the following equations:
  – in[B] = use[B] ∪ (out[B] − def[B])
  – out[B] = ∪ in[S], taken over all successors S of B
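For instance, for block 5 of the flow graph above, use[5] = {2, 3} and def[5] = {2} (as noted on the previous slide; 2 stays live on entry because it is read before it is written), so in[5] = {2, 3} ∪ (out[5] − {2}) = {2, 3} ∪ out[5]; and since block 6 is block 5's only successor, out[5] = in[6].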

Page 23: Code Optimization

Solving data flow equations

• We want a fixed-point solution for this system of equations (there are two equations per basic block).
• Start with conservative initial values for each in[B] and out[B], and apply the formulas to update the values of each in[B] and out[B]. Repeat until no further changes occur.
  – The best conservative initial value is {}, because no variables are live at the end of the program.
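A sketch of the iteration in C, representing each set as a bitmask and each block by its def and use masks plus a list of successor indices (all names here are illustrative):

#include <string.h>

typedef struct {
    unsigned def, use;    /* def[B] and use[B] as bitmasks */
    int succs[4];         /* successor block indices, terminated by -1 */
} Block;

/* Iterate in[B] = use[B] | (out[B] & ~def[B]) and
   out[B] = union of in[S] over all successors S of B,
   starting from empty sets, until nothing changes. */
void live_variables(const Block *b, int n, unsigned *in, unsigned *out)
{
    memset(in, 0, n * sizeof *in);     /* conservative start: {} everywhere */
    memset(out, 0, n * sizeof *out);
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int i = 0; i < n; i++) {
            unsigned o = 0;
            for (int k = 0; k < 4 && b[i].succs[k] >= 0; k++)
                o |= in[b[i].succs[k]];
            unsigned ni = b[i].use | (o & ~b[i].def);
            if (o != out[i] || ni != in[i]) changed = 1;
            out[i] = o;
            in[i] = ni;
        }
    }
}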

Page 24: Code Optimization

Dead code elimination

• Suppose we have now computed all global live variable information

• We can redo the local live variable analysis using correct liveness information at the end of each block: out[B]

• Whenever we see an assignment to a variable that is marked dead, we can safely eliminate it
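A sketch of that pass, reusing the backward scan from the local analysis with the instruction form extended by a has_side_effect flag (hypothetical representation; an assignment whose target is dead and which has no other side effects can be dropped):

typedef struct {
    int def;              /* variable assigned here, or -1 */
    int use[2];           /* variables read here, -1 for unused slots */
    int has_side_effect;  /* e.g. a call or a store to memory: never removable */
} TacInstr;

/* Walk one block backwards, starting from out[B], and mark for removal
   every instruction whose result is dead and which has no side effects. */
void eliminate_dead(const TacInstr *block, int n, unsigned live_out, int *removed)
{
    unsigned live = live_out;
    for (int i = n - 1; i >= 0; i--) {
        int d = block[i].def;
        removed[i] = (d >= 0 && !(live & (1u << d)) && !block[i].has_side_effect);
        if (removed[i]) continue;                /* a removed instruction changes nothing */
        if (d >= 0) live &= ~(1u << d);          /* assignment: target dead above here */
        for (int k = 0; k < 2; k++)
            if (block[i].use[k] >= 0)
                live |= 1u << block[i].use[k];   /* reference: mark live */
    }
}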

Page 25: Code Optimization

Dead code examples

This is the code for f from Dead code elimination II, annotated with the set of variables live just before each instruction (x, y, z, t, r are locals 1 to 5; nothing is live after the ireturn):

iload 1     live: 1, 2, 3
iload 2     live: 1, 2, 3
imul        live: 1, 2, 3
istore 4    live: 1, 2, 3
iload 4     live: 1, 2, 3, 4
iload 3     live: 1, 2, 3
imul        live: 1, 2, 3
istore 5    live: 1, 2, 3
iload 1     live: 1, 2, 3, 5
iload 2     live: 1, 2, 3, 5
isub        live: 1, 2, 3, 5
iload 1     live: 1, 2, 3, 5
iload 2     live: 2, 3, 5
isub        live: 3, 5
iload 3     live: 3, 5
iadd        live: 5
imul        live: 5
dup         live: 5
istore 4    live: 5
pop         live: 5
iload 5     live: 5
ireturn     live: (none)

Local 4 (t) is not live after the second istore 4, so that store, and the whole computation feeding it, is dead code and can be removed.

Page 26: Code Optimization

Code optimization?

• Code optimization should really be called “code improvement”
• It is not practical to generate absolutely optimal code (doing so is too expensive at compile time; the general problem is NP-hard)
• There is a trade-off between compiler speed and execution speed
• Many compilers have options that permit the programmer to choose between generating either optimized or non-optimized code
• Non-optimized => debugging; optimized => release
• Some compilers even allow the programmer to select which kinds of optimizations to perform