interm codegen

Intermediate Code Generation Sarfaraz Masood Asstt Prof, Department of Computer Engineering Jamia Millia University New Delhi

Upload: anshul-sharma

Post on 27-Jan-2015


Page 1: Interm codegen

Intermediate Code Generation

Sarfaraz Masood, Asstt Prof, Department of Computer Engineering, Jamia Millia University, New Delhi

Page 2: Interm codegen

CS 540 GMU Spring 2007 2

Compiler Architecture

[Figure: compiler architecture. Source language → Scanner (lexical analysis) → tokens → Parser (syntax analysis) → syntactic structure → Semantic Analysis (IC generator) → intermediate code → Code Optimizer → intermediate code → Code Generator → target language. A Symbol Table sits alongside the phases.]

Page 3: Interm codegen

Joey Paquet, 2000, 2002 3

Introduction to Code Generation

• Front end:
– Lexical Analysis
– Syntactic Analysis
– Intermediate Code Generation

• Back end:
– Intermediate Code Optimization
– Object Code Generation

• The front end is machine-independent, i.e. it can be reused to build compilers for different architectures

• The back end is machine-dependent, i.e. these steps are related to the nature of the assembly or machine language of the target architecture

Page 4: Interm codegen

04/10/23 4

Introduction to Code Generation

[Figure: the Language-1 and Language-2 front ends each translate their source program into non-optimized intermediate code; a shared intermediate-code optimizer turns it into optimized intermediate code; the Target-1 and Target-2 code generators translate that into Target-1 and Target-2 machine code.]

Page 5: Interm codegen

Joey Paquet, 2000, 2002 5

Introduction to Code Generation

• After syntactic analysis, we have a number of options to choose from:
– generate object code directly from the parse
– generate intermediate code, and then generate object code from it
– generate an intermediate abstract representation, and then generate code directly from it
– generate an intermediate abstract representation, generate intermediate code, and then the object code

• All these options have one thing in common: they are all based on syntactic information gathered in the semantic analysis

Page 6: Interm codegen

Joey Paquet, 2000, 2002 6

Introduction to Code Generation

[Figure: the four options as pipelines, each starting with Lexical Analyzer → Syntactic Analyzer and split into a Front End and a Back End:
(1) → Object Code
(2) → Intermediate Representation → Object Code
(3) → Intermediate Representation → Intermediate Code → Object Code
(4) → Intermediate Code → Object Code]

Page 7: Interm codegen

04/10/23 7

Intermediate Representation (IR)

A kind of abstract machine language that can express the target machine's operations without committing to too many machine details.

•Why IR ?

Page 8: Interm codegen

04/10/23 8

Without IR

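[Figure: without an IR, each of the four source languages (C, Pascal, FORTRAN, C++) is connected directly to each of the four targets (SPARC, HP PA, x86, IBM PPC), so 4 × 4 = 16 translators are needed.]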

Page 9: Interm codegen

04/10/23 9

With IR

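[Figure: with an IR, the four source languages (C, Pascal, FORTRAN, C++) are all translated to one IR, and the IR is translated to each of the four targets (SPARC, HP PA, x86, IBM PPC), so 4 + 4 = 8 translators are needed.]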

Page 10: Interm codegen

04/10/23 10

With IR

[Figure: C, Pascal, FORTRAN, and C++ are translated to one IR that feeds a single common backend, marked with a question mark.]

Page 11: Interm codegen

04/10/23 11

Advantages of Using an Intermediate Language

1. Retargeting - build a compiler for a new machine by attaching a new code generator to an existing front end.

2. Optimization - reuse intermediate code optimizers in compilers for different languages and different machines.

Note: the terms "intermediate code", "intermediate language", and "intermediate representation" are all used interchangeably.

Page 12: Interm codegen

04/10/23 12

Issues in Designing an IR

• Whether to use an existing IR
– if the target machine architecture is similar
– if the new language is similar

• Whether the IR is appropriate for the kind of optimizations to be performed
– e.g. speculation and predication
– some transformations may take much longer than they would on a different IR

Page 13: Interm codegen

04/10/23 13

Issues in Designing an IR

• Designing a new IR needs to consider
– Level (how machine dependent it is)
– Structure
– Expressiveness
– Appropriateness for general and special optimizations
– Appropriateness for code generation
– Whether multiple IRs should be used

Page 14: Interm codegen

What are the IRs in actual compilers?

• gcc is a widely used compiler on many platforms
– it uses two IRs: AST (Abstract Syntax Tree) and RTL (Register Transfer Language), and some development paths are using Tree-SSA
[SSA: Static Single Assignment: each name is assigned once. We will talk about this later!]

• A VM can be seen as a new type of IR
– Java Bytecode
– .NET IL

• Some programming languages have well-defined intermediate languages.
– Java: Java Virtual Machine
– Prolog: Warren Abstract Machine
– In fact, there are byte-code emulators to execute instructions in these intermediate languages.

Page 15: Interm codegen

Intermediate Code Generation

• Direct Translation
– Using SDT scheme
– Parse tree to Three-Address Instructions
– Can be done while parsing in a single pass
– Needs to be able to deal with Syntactic Errors and Recovery

• Indirect Translation
– First validate parsing, constructing an AST
– Uses SDT scheme to build AST
– Traverse the AST and generate Three-Address Instructions

[Figure: Intermediate Code Generation takes an IR (parse tree or AST) and produces, in O(n) time, an IR of Three-Address Instructions that assumes an unlimited number (∞) of registers.]

Page 16: Interm codegen

Syntax-directed definition to produce AST for assignment statements

production        semantic rules
S → id := E       S.nptr := mknode('assign', mkleaf(id, id.entry), E.nptr)
E → E1 + E2       E.nptr := mknode('+', E1.nptr, E2.nptr)
E → E1 * E2       E.nptr := mknode('*', E1.nptr, E2.nptr)
E → - E1          E.nptr := mkunode('uminus', E1.nptr)
E → (E1)          E.nptr := E1.nptr
E → id            E.nptr := mkleaf(id, id.entry)

1. Syntax Tree vs DAG

Page 17: Interm codegen

[Figure: syntax tree for a := (-b + c*d) + c*d, with the nodes assign, a, two + nodes, uminus, b, and two separate c*d subtrees.]

Syntax Tree vs DAG

Page 18: Interm codegen

• if mknode returns a pointer to an existing node whenever possible, a DAG can be produced

[Figure: (a) syntax tree and (b) DAG for a := (-b + c*d) + c*d. The syntax tree contains two separate c*d subtrees; in the DAG both + nodes share a single c*d node.]

Syntax Tree vs DAG
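The idea that mknode can return an existing node can be sketched as follows. This is a minimal illustration, not the slides' implementation: the class name DagBuilder, the lookup table, and the reading of the example as a := (-b + c*d) + c*d are assumptions made here.

```python
class DagBuilder:
    """Creates expression nodes, reusing an existing node when one matches."""

    def __init__(self):
        self.nodes = []   # node id -> (op, left, right)
        self.index = {}   # (op, left, right) -> node id

    def _intern(self, key):
        # Return the existing node for this key, or create a new one.
        if key not in self.index:
            self.index[key] = len(self.nodes)
            self.nodes.append(key)
        return self.index[key]

    def mkleaf(self, name):
        return self._intern(("id", name, None))

    def mkunode(self, op, child):
        return self._intern((op, child, None))

    def mknode(self, op, left, right):
        return self._intern((op, left, right))


dag = DagBuilder()
neg_b = dag.mkunode("uminus", dag.mkleaf("b"))
first = dag.mknode("*", dag.mkleaf("c"), dag.mkleaf("d"))
second = dag.mknode("*", dag.mkleaf("c"), dag.mkleaf("d"))  # reuses `first`
total = dag.mknode("+", dag.mknode("+", neg_b, first), second)
root = dag.mknode("assign", dag.mkleaf("a"), total)
```

Both requests for c*d return the same node id, so the structure holds 9 nodes where a plain syntax tree would hold 12.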

Page 19: Interm codegen

04/10/23 19

Form Rules:

1. If E is a variable/constant, the PN of E is E itself

2. If E is an expression of the form E1 op E2, the PN of E is E1’E2’op (E1’ and E2’ are the PN of E1 and E2, respectively.)

3. If E is a parenthesized expression of form (E1), the PN of E is the same as the PN of E1.

The PN of expression 9 * (5 + 2) is 952+*

How about (a+b)/(c-d)? ab+cd-/

A mathematical notation wherein every operator follows all of its operands.

2. Postfix Notation
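The three form rules translate almost directly into a recursive function. The sketch below assumes the expression has already been parsed into nested tuples (op, left, right), so rule 3 is handled implicitly: parentheses only shape the tree. The tuple encoding and the name to_postfix are illustrative choices.

```python
def to_postfix(expr):
    """Return the postfix notation (PN) of an expression tree.

    expr is either a string (variable or constant) or a tuple
    (op, left, right) for a binary expression.
    """
    if isinstance(expr, str):
        return expr                                      # rule 1
    op, left, right = expr
    return to_postfix(left) + to_postfix(right) + op     # rule 2


nine_times = to_postfix(("*", "9", ("+", "5", "2")))
quotient = to_postfix(("/", ("+", "a", "b"), ("-", "c", "d")))
print(nine_times)  # 952+*
print(quotient)    # ab+cd-/
```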

Page 20: Interm codegen

Intermediate-Code Generation 20

3. Static Single-Assignment Form

• Static single assignment form (SSA) is an intermediate representation that facilitates certain code optimization.

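• Two distinct aspects distinguish SSA from three-address code.
– All assignments in SSA are to variables with distinct names; hence the term static single-assignment.
– SSA uses a φ-function to combine the definitions of a variable that reach the same point along different control-flow paths.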

Page 21: Interm codegen

Intermediate-Code Generation 21

3. Static Single-Assignment Form

if (flag) x = -1; else x = 1;
y = x * a;

if (flag) x1 = -1; else x2 = 1;
x3 = φ(x1, x2);

Page 22: Interm codegen

4. Three Address Instructions IR

• Constructs mapped to Three-Address Instructions
– Register-based IR for expression evaluation
– Infinite number of virtual registers
– Still independent of target architecture

• Generic Statement Format:
Label: x = y op z or if exp goto L
– Statements can have symbolic labels
– Compiler inserts temporary variables
– Types and conversions are dealt with in other phases of the code generation

Page 23: Interm codegen

Types of Three-address Statements

• Assignment
– Binary: x := y op z
– Unary: x := op y
– "op" can be any reasonable arithmetic or logic operator.

• Copy
– Simple: x := y
– Indexed: x := y[i] or x[i] := y
– Address and pointer manipulation:
  • x := &y
  • x := *y
  • *x := y

Page 24: Interm codegen

Types of Three-address Statements

• Jump
– Unconditional: goto L
– Conditional: if x relop y goto L1 [else goto L2], where relop is <, =, >, ≥, ≤, or ≠.

• Procedure call
– Call procedure P(X1, X2, ..., Xn)

PARAM X1
PARAM X2
...
PARAM Xn
CALL P, n

Page 25: Interm codegen

implementations of three-address statements

• Common implementations:
– Quadruples
– Triples
– Indirect triples

Consider the code: a := b * -c + b * -c

Page 26: Interm codegen

Quadruples

• A quadruple is a record structure with four fields: op, arg1, arg2, and result
– The op field contains an internal code for an operator
– Statements with unary operators do not use arg2
– Operators like param use neither arg2 nor result
– The target label for conditional and unconditional jumps is in result

• The contents of fields arg1, arg2, and result are typically pointers to symbol table entries
– If so, temporaries must be entered into the symbol table as they are created
– Constants need to be handled differently

Page 27: Interm codegen

Quadruples Example

       op       arg1   arg2   result
(0)    uminus   c             t1
(1)    *        b      t1     t2
(2)    uminus   c             t3
(3)    *        b      t3     t4
(4)    +        t2     t4     t5
(5)    :=       t5            a

a := b * -c + b * -c
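A table like this can be produced by a post-order walk over the expression tree. The following is a small sketch under assumed names (Quad, QuadGen, a tuple-based tree encoding); it is one way to fill the four fields, not the only one.

```python
from collections import namedtuple

Quad = namedtuple("Quad", "op arg1 arg2 result")


class QuadGen:
    def __init__(self):
        self.quads = []
        self.ntemps = 0

    def newtemp(self):
        self.ntemps += 1
        return f"t{self.ntemps}"

    def emit(self, op, arg1, arg2, result):
        self.quads.append(Quad(op, arg1, arg2, result))
        return result

    def gen(self, node):
        """Post-order walk; returns the name that holds the node's value."""
        if isinstance(node, str):            # identifier
            return node
        if node[0] == "uminus":              # unary operator: arg2 unused
            arg = self.gen(node[1])
            return self.emit("uminus", arg, None, self.newtemp())
        op, left, right = node               # binary operator
        l = self.gen(left)
        r = self.gen(right)
        return self.emit(op, l, r, self.newtemp())


term = ("*", "b", ("uminus", "c"))
qg = QuadGen()
value = qg.gen(("+", term, term))
qg.emit(":=", value, None, "a")              # a := b * -c + b * -c
for i, q in enumerate(qg.quads):
    print(f"({i})", q.op, q.arg1, q.arg2, q.result)
```

The temporaries t1 to t5 come out in the same order as in the table because operands are visited left to right.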

Page 28: Interm codegen

Triples

• Triples refer to a temporary value by the position of the statement that computes it
– Statements can be represented by a record with only three fields: op, arg1, and arg2
– Avoids the need to enter temporary names into the symbol table

• Contents of arg1 and arg2:
– Pointer into symbol table (for programmer-defined names)
– Pointer into triple structure (for temporaries)
– Constants still need to be handled differently

Page 29: Interm codegen

Triples Example

       op       arg1   arg2
(0)    uminus   c
(1)    *        b      (0)
(2)    uminus   c
(3)    *        b      (2)
(4)    +        (1)    (3)
(5)    assign   a      (4)

Result is implicit in triples

a := b * -c + b * -c
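The same walk yields triples if each operand is either a name or the position of the triple that computes it. In this sketch (the function name and encoding are assumptions), a string operand stands for a symbol-table name and an integer operand stands for a reference such as (0) or (2).

```python
def gen_triples(node, triples):
    """Append triples for node; return its operand (name or triple index)."""
    if isinstance(node, str):
        return node                          # programmer-defined name
    if node[0] == "uminus":
        arg = gen_triples(node[1], triples)
        triples.append(("uminus", arg, None))
    else:
        op, left, right = node
        l = gen_triples(left, triples)
        r = gen_triples(right, triples)
        triples.append((op, l, r))
    return len(triples) - 1                  # result is implicit: the position


term = ("*", "b", ("uminus", "c"))
triples = []
top = gen_triples(("+", term, term), triples)
triples.append(("assign", "a", top))         # a := b * -c + b * -c
for i, t in enumerate(triples):
    print(f"({i})", t)
```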

Page 30: Interm codegen

An indexed assignment requires two triples: x[i] := y

       op     arg1   arg2
(0)    []=    x      i
(1)    :=     (0)    y

Page 31: Interm codegen

Indirect triples

• Indirect triples add a list of pointers to triples, so that triples can be shared and moved easily

       statement
(0)    (14)
(1)    (15)
(2)    (16)
(3)    (17)
(4)    (18)
(5)    (19)

       op       arg1   arg2
(14)   uminus   c
(15)   *        b      (14)
(16)   uminus   c
(17)   *        b      (16)
(18)   +        (15)   (17)
(19)   assign   a      (18)

a := b * -c + b * -c

Page 32: Interm codegen

syntax-directed translation into three-address code

production        semantic rules
S → id := E       S.code := E.code ‖ gen(id.place ':=' E.place)
E → E1 + E2       E.place := newtemp;
                  E.code := E1.code ‖ E2.code ‖ gen(E.place ':=' E1.place '+' E2.place)
E → E1 * E2       ...
E → - E1          E.place := newtemp;
                  E.code := E1.code ‖ gen(E.place ':=' 'uminus' E1.place)
E → (E1)          E.place := E1.place; E.code := E1.code
E → id            E.place := id.place; E.code := ''

Page 33: Interm codegen

syntax-directed translation into three-address code

production             semantic rules
S → while E do S1      S.begin := newlabel;
                       S.after := newlabel;
                       S.code := gen(S.begin ':') ‖
                                 E.code ‖
                                 gen('if' E.place '=' '0' 'goto' S.after) ‖
                                 S1.code ‖
                                 gen('goto' S.begin) ‖
                                 gen(S.after ':')

Page 34: Interm codegen

Declarations

• enter symbols in a symbol table
• allocate space and record it in the symbol table
• emit appropriate code

Page 35: Interm codegen

Declarations in a procedure

• computing types and relative address of names

P → {offset := 0} D
D → D ; D
D → id : T                  {enter(id.name, T.type, offset);
                             offset := offset + T.width}
T → integer                 {T.type := integer; T.width := 4}
T → real                    {T.type := real; T.width := 8}
T → array [ num ] of T1     {T.type := array(num.val, T1.type);
                             T.width := num.val × T1.width}
T → ↑ T1                    {T.type := pointer(T1.type);
                             T.width := 4}
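The offset bookkeeping in these rules can be sketched in a few lines. The widths (4, 8, 4) come from the rules above; the tuple encoding of types and the function names type_width and layout are assumptions made for the illustration.

```python
WIDTH = {"integer": 4, "real": 8}


def type_width(t):
    """t is 'integer', 'real', ('array', n, elem) or ('pointer', elem)."""
    if isinstance(t, str):
        return WIDTH[t]
    if t[0] == "array":
        return t[1] * type_width(t[2])       # num.val * T1.width
    if t[0] == "pointer":
        return 4
    raise ValueError(f"unknown type {t!r}")


def layout(decls):
    """Assign each declared name its type and relative address."""
    symtab, offset = {}, 0                   # P -> {offset := 0} D
    for name, t in decls:                    # D -> id : T
        symtab[name] = (t, offset)           # enter(id.name, T.type, offset)
        offset += type_width(t)              # offset := offset + T.width
    return symtab, offset


symtab, size = layout([
    ("i", "integer"),
    ("x", "real"),
    ("a", ("array", 10, "integer")),
    ("p", ("pointer", "real")),
])
print(symtab["a"][1], symtab["p"][1], size)  # 12 52 56
```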

Page 36: Interm codegen

Syntax-Directed Translation to Three Address Code

• Attributes for the Non-Terminals, say E and S
– Location of the value of an expression: E.place
– The code that evaluates the expression or statement: E.code
– Markers for the beginning and end of sections of the code: S.begin, S.end

• Semantic Actions in Productions of the Grammar
– Functions to create temporaries (newtemp) and labels (newlabel)
– Auxiliary functions to enter symbols and consult types corresponding to declarations in a side data structure that can be built as the code is being parsed: a symbol table.
– To generate the code we use the emit function gen, which creates a list of instructions to be emitted later and can generate symbolic labels corresponding to the next instruction of a list.
– Use of an append function on lists of instructions.
– Synthesized and Inherited Attributes

Page 37: Interm codegen

Assignment Statements: Grammar and Actions

S → id = E { p = lookup(id.name);
             if (p != NULL)
               S.code = append(E.code, gen(p '=' E.place));
             else {
               error;
               S.code = nulllist;
             }
           }

E → E1 + E2 { E.place = newtemp();
              E.code = append(E1.code, E2.code, gen(E.place '=' E1.place '+' E2.place)); }

E → E1 * E2 { E.place = newtemp();
              E.code = append(E1.code, E2.code, gen(E.place '=' E1.place '*' E2.place)); }

Page 38: Interm codegen

Assignment Statements: Grammar and Actions

E → - E1 { E.place = newtemp();
           E.code = append(E1.code, gen(E.place '=' '-' E1.place)); }

E → (E1) { E.place = E1.place; E.code = E1.code; }

E → id { p = lookup(id.name);
         if (p != NULL)
           E.place = p;
         else
           error;
         E.code = nulllist;
       }
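These actions can be tried out on an expression tree instead of inside a parser. The sketch below is an assumption-laden illustration: the class Translator, the tuple encoding, and the symbol table as a plain set are all invented here. Each call returns the pair (place, code), mirroring E.place and E.code.

```python
import itertools


class Translator:
    def __init__(self, symtab):
        self.symtab = symtab
        self.counter = itertools.count(1)

    def newtemp(self):
        return f"t{next(self.counter)}"

    def lookup(self, name):
        return name if name in self.symtab else None

    def expr(self, node):
        """Return (place, code) for an expression node."""
        if isinstance(node, str):                      # E -> id
            p = self.lookup(node)
            if p is None:
                raise NameError(f"undeclared identifier {node}")
            return p, []
        if node[0] == "uminus":                        # E -> - E1
            p1, c1 = self.expr(node[1])
            place = self.newtemp()
            return place, c1 + [f"{place} = - {p1}"]
        op, e1, e2 = node                              # E -> E1 op E2
        p1, c1 = self.expr(e1)
        p2, c2 = self.expr(e2)
        place = self.newtemp()
        return place, c1 + c2 + [f"{place} = {p1} {op} {p2}"]

    def assign(self, name, node):                      # S -> id = E
        p = self.lookup(name)
        if p is None:
            raise NameError(f"undeclared identifier {name}")
        place, code = self.expr(node)
        return code + [f"{p} = {place}"]


tr = Translator({"x", "a", "b", "c", "d", "e", "f"})
tree = ("-", ("+", ("*", "a", "b"), ("*", "c", "d")), ("*", "e", "f"))
code = tr.assign("x", tree)                            # x = a * b + c * d - e * f
print("\n".join(code))
```

Because this sketch visits operands left to right and groups + and - to the left, its temporaries are numbered starting from a * b; a different traversal order or grouping yields the same kind of code with different temporary numbers.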

Page 39: Interm codegen

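Assignment: Example

x = a * b + c * d - e * f;

[Figure: parse tree for the statement. S → id = E with id = x; the expression E combines the three products a * b, c * d, and e * f with + and -, and every operand is an E → id leaf.]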

Page 40: Interm codegen

Assignment: Example

x = a * b + c * d - e * f;

Production:
E → id { p = lookup(id.name);
         if (p != NULL)
           E.place = p;
         else
           error;
         E.code = nulllist;
       }

[Figure: parse tree for the statement; the E node for e is annotated with place = loc(e), code = null.]

Page 41: Interm codegen

Assignment: Example

x = a * b + c * d - e * f;

Production:
E → id { p = lookup(id.name);
         if (p != NULL)
           E.place = p;
         else
           error;
         E.code = nulllist;
       }

[Figure: parse tree for the statement, annotated so far:
E (e): place = loc(e), code = null
E (f): place = loc(f), code = null]

Page 42: Interm codegen

Assignment: Example

x = a * b + c * d - e * f;

Production:
E → E1 * E2 { E.place = newtemp();
              E.code = gen(E.place '=' E1.place '*' E2.place); }

[Figure: parse tree for the statement, annotated so far:
E (e): place = loc(e), code = null
E (f): place = loc(f), code = null
E (e * f): place = loc(t1), code = {t1 = e * f;}]

Page 43: Interm codegen

Assignment: Example

x = a * b + c * d - e * f;

Production:
E → E1 + E2 { E.place = newtemp();
              E.code = gen(E.place '=' E1.place '+' E2.place); }

[Figure: parse tree for the statement, annotated so far:
E (e): place = loc(e), code = null
E (f): place = loc(f), code = null
E (e * f): place = loc(t1), code = {t1 = e * f;}
E (c): place = loc(c), code = null
E (d): place = loc(d), code = null
E (c * d): place = loc(t2), code = {t2 = c * d;}]

Page 44: Interm codegen

Assignment: Example

x = a * b + c * d - e * f;

Production:
S → id = E { p = lookup(id.name);
             if (p != NULL)
               S.code = append(E.code, gen(p '=' E.place));
             else
               error;
           }

[Figure: parse tree for the statement with its final annotations:
id (x): place = loc(x), code = null
E (a), E (b), E (c), E (d), E (e), E (f): place = loc of the identifier, code = null
E (e * f): place = loc(t1), code = {t1 = e * f;}
E (c * d): place = loc(t2), code = {t2 = c * d;}
E (c * d - e * f): place = loc(t3), code = {t1 = e * f; t2 = c * d; t3 = t2 - t1;}
E (a * b): place = loc(t4), code = {t4 = a * b;}
E (a * b + c * d - e * f): place = loc(t5), code = {t1 = e * f; t2 = c * d; t3 = t2 - t1; t4 = a * b; t5 = t4 + t3;}
S: code = {t1 = e * f; t2 = c * d; t3 = t2 - t1; t4 = a * b; t5 = t4 + t3; x = t5;}]

Page 45: Interm codegen

Assignment: Example

x = a * b + c * d - e * f;

[Figure: parse tree for the statement. The code collected at the root S is:]

t1 = e * f;

t2 = c * d;

t3 = t2 - t1;

t4 = a * b;

t5 = t4 + t3;

x = t5;

Page 46: Interm codegen

Reusing Temporary Variables

• Temporary Variables
– Short lived
– Used for Evaluation of Expressions
– Clutter the Symbol Table

• Change the newtemp Function
– Keep track of when the value created in a temporary is used
– Use a counter to keep track of the number of active temps
– When a temporary is used in an expression, decrement counter
– When a temporary is generated by newtemp, increment counter
– Initialize counter to zero
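The counter scheme can be sketched as below. The names Temp, TempPool, and gen are hypothetical, and the sketch relies on temporaries being used in last-created, first-used order, which holds for expression evaluation by a post-order walk.

```python
class Temp(str):
    """A compiler temporary, kept distinct from program variables."""


class TempPool:
    def __init__(self):
        self.active = 0          # counter of live temporaries, starts at zero
        self.high = 0            # largest number of temporaries live at once

    def newtemp(self):
        self.active += 1         # increment when a temporary is generated
        self.high = max(self.high, self.active)
        return Temp(f"t{self.active}")

    def use(self, place):
        if isinstance(place, Temp):
            self.active -= 1     # decrement when a temporary is used


def gen(node, pool, code):
    if isinstance(node, str):
        return node
    op, left, right = node
    l = gen(left, pool, code)
    r = gen(right, pool, code)
    pool.use(l)
    pool.use(r)
    place = pool.newtemp()       # may reuse the name of an operand
    code.append(f"{place} = {l} {op} {r}")
    return place


pool, code = TempPool(), []
tree = ("-", ("+", ("*", "a", "b"), ("*", "c", "d")), ("*", "e", "f"))
result = gen(tree, pool, code)
code.append(f"x = {result}")
pool.use(result)
print("\n".join(code))
```

For x = a * b + c * d - e * f this sketch never needs more than two temporary names, although the instruction order depends on the traversal chosen here.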

Page 47: Interm codegen

Assignment: Example

• Only 2 Registers Needed

x = a * b + c * d - e * f;

[Figure: parse tree for the statement. The counter c of active temporaries is shown after each instruction:]

// c = 0

t1 = e * f; // c = 1

t2 = c * d; // c = 2

t1 = t2 - t1; // c = 1

t2 = a * b; // c = 2

t1 = t2 + t1; // c = 1

x = t1; // c = 0

Page 48: Interm codegen

Boolean & Relational Values

How should the compiler represent them?

• Answer depends on the target machine

Two classic approaches
• Numerical representation
• Positional (implicit) representation

Correct choice depends on both context and ISA

Page 49: Interm codegen

• Issue: Control Flow Introduces Complications
– In Both Representations
– Need to Know Address to Jump To in Some Cases

• Solution: Two Additional Attributes
– nextstat (Inherited): indicates the next location to be generated
– laststat (Synthesized): indicates the last location filled
– As code is generated the attributes are filled with the correct value

SDT Scheme for Boolean Expressions

Page 50: Interm codegen

Boolean Expression: Grammar and Actions

E → false { E.place = newtemp();
            E.code = gen(E.place = 0);
            E.laststat = E.nextstat + 1;
          }

E → true { E.place = newtemp();
           E.code = gen(E.place = 1);
           E.laststat = E.nextstat + 1;
         }

Page 51: Interm codegen

Boolean Expression: Grammar and Actions

E → (E1) { E.place = E1.place;
           E.code = E1.code;
           E1.nextstat = E.nextstat;
           E.laststat = E1.laststat;
         }

E → not E1 { E.place = newtemp();
             E.code = append(E1.code, gen(E.place = not E1.place));
             E1.nextstat = E.nextstat;
             E.laststat = E1.laststat + 1;
           }

Page 52: Interm codegen

Boolean Expression: Grammar and Actions

E → E1 or E2
{ E.place = newtemp();
  E.code = append(E1.code, E2.code, gen(E.place = E1.place or E2.place));
  E1.nextstat = E.nextstat;
  E2.nextstat = E1.laststat;
  E.laststat = E2.laststat + 1;
}

Page 53: Interm codegen

Boolean Expression: Grammar and Actions

E → E1 and E2
{ E.place = newtemp();
  E.code = append(E1.code, E2.code, gen(E.place = E1.place and E2.place));
  E1.nextstat = E.nextstat;
  E2.nextstat = E1.laststat;
  E.laststat = E2.laststat + 1;
}

Page 54: Interm codegen

Boolean Expression: Grammar and Actions

E → id1 relop id2
{ E.place = newtemp();
  E.code = gen(if id1.place relop id2.place goto E.nextstat+3);
  E.code = append(E.code, gen(E.place = 0));
  E.code = append(E.code, gen(goto E.nextstat+4));
  E.code = append(E.code, gen(E.place = 1));
  E.laststat = E.nextstat + 4;
}

Page 55: Interm codegen

Boolean Expressions: Example

a < b or c < d and e < f

00: if a < b goto 03

01: t1 = 0

02: goto 04

03: t1 = 1

04: if c < d goto 07

05: t2 = 0

06: goto 08

07: t2 = 1

08: if e < f goto 11

09: t3 = 0

10: goto 12

11: t3 = 1

12: t4 = t2 and t3

13: t5 = t1 or t4

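[Figure: parse tree for a < b or c < d and e < f. The root E is the or of E (a < b) and an E that is the and of E (c < d) and E (e < f); each comparison is derived by E → id relop id.]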
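The numerical representation can be reproduced with a generator that emits into an instruction array, so that the index of the next instruction plays the role of nextstat. The class name NumericBool and the tuple encoding of the expression are assumptions of this sketch.

```python
class NumericBool:
    def __init__(self):
        self.code = []
        self.ntemps = 0

    def newtemp(self):
        self.ntemps += 1
        return f"t{self.ntemps}"

    def gen(self, node):
        """Emit code that leaves 0 or 1 in a temporary; return its name."""
        kind = node[0]
        if kind == "not":
            p1 = self.gen(node[1])
            t = self.newtemp()
            self.code.append(f"{t} = not {p1}")
            return t
        if kind in ("or", "and"):
            p1 = self.gen(node[1])
            p2 = self.gen(node[2])
            t = self.newtemp()
            self.code.append(f"{t} = {p1} {kind} {p2}")
            return t
        op, a, b = node                      # id1 relop id2: four instructions
        t = self.newtemp()
        n = len(self.code)                   # plays the role of nextstat
        self.code.append(f"if {a} {op} {b} goto {n + 3:02d}")
        self.code.append(f"{t} = 0")
        self.code.append(f"goto {n + 4:02d}")
        self.code.append(f"{t} = 1")
        return t


nb = NumericBool()
place = nb.gen(("or", ("<", "a", "b"),
                ("and", ("<", "c", "d"), ("<", "e", "f"))))
for addr, instr in enumerate(nb.code):
    print(f"{addr:02d}: {instr}")
```

Running it prints the fourteen instructions 00 to 13 of the example, ending with t4 = t2 and t3 followed by t5 = t1 or t4.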

Page 56: Interm codegen

Control Flow Statements: Code Layout

S → if E then S1

         E.code        (jumps to E.true or to E.false)
E.true:  S1.code
E.false: ...

S → if E then S1 else S2

         E.code        (jumps to E.true or to E.false)
E.true:  S1.code
         goto S.next
E.false: S2.code
S.next:  ...

• Attributes:
– E.true: the label to which control flows if E is true
– E.false: the label to which control flows if E is false
– S.next: an inherited attribute with the symbolic label of the code following S

Page 57: Interm codegen

Control Flow Statements: Code Layout

S → while E do S1

S.begin: E.code        (jumps to E.true or to E.false)
E.true:  S1.code
         goto S.begin
E.false: ...

• Difficulty: Need to know where to jump to
– Introduce symbolic labels using the newlabel function
– Use inherited attributes
– Backpatch it later with the actual value (later…)

Page 58: Interm codegen

Control Flow Statements: Grammar and Actions

S → if E then S1
{ E.true = newlabel;
  E.false = S.next;
  S1.next = S.next;
  S.code = append(E.code, gen(E.true:), S1.code) }

Page 59: Interm codegen

Control Flow Statements: Grammar and Actions

S → if E then S1 else S2
{ E.true = newlabel;
  E.false = newlabel;
  S1.next = S.next;
  S2.next = S.next;
  S.code = append(E.code, gen(E.true:), S1.code, gen(goto S.next), gen(E.false:), S2.code) }

Page 60: Interm codegen

Control Flow Statements: Grammar and Actions

S → while E do S1
{ S.begin = newlabel;
  E.true = newlabel;
  E.false = S.next;
  S1.next = S.begin;
  S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin)) }

Page 61: Interm codegen

Control Flow Translation of Boolean Expressions

• Short-Circuit Evaluation
– No need to evaluate portions of the expression if the outcome is already determined
– Examples:
  • E1 or E2 need not evaluate E2 if E1 is known to be true.
  • E1 and E2 need not evaluate E2 if E1 is known to be false.

• Use Control Flow
– Jump over code that evaluates boolean terms of the expression
– Use inherited E.false and E.true attributes and link evaluation of E

Page 62: Interm codegen

Control Flow Translation of Boolean Expressions

E → E1 or E2
{ E1.true = E.true;
  E1.false = newlabel;
  E2.true = E.true;
  E2.false = E.false;
  E.code = append(E1.code, gen(E1.false:), E2.code) }

E → E1 and E2
{ E1.false = E.false;
  E1.true = newlabel;
  E2.true = E.true;
  E2.false = E.false;
  E.code = append(E1.code, gen(E1.true:), E2.code) }

Page 63: Interm codegen

Control Flow Translation of Boolean Expressions

E → id1 relop id2
{ E.code = append(gen(if id1.place relop id2.place goto E.true), gen(goto E.false)) }

E → true { E.code = gen(goto E.true) }

E → false { E.code = gen(goto E.false) }

E → not E1 { E1.true = E.false; E1.false = E.true; E.code = E1.code }

E → ( E1 ) { E1.true = E.true; E1.false = E.false; E.code = E1.code }
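Because E.true and E.false are inherited, they map naturally onto function parameters. The sketch below (class name JumpGen and the tuple encoding are assumptions) passes the two target labels down the tree and returns the generated code as a list.

```python
import itertools


class JumpGen:
    def __init__(self):
        self.labels = itertools.count(1)

    def newlabel(self):
        return f"L{next(self.labels)}"

    def gen(self, node, true, false):
        """Return jumping code for node; control leaves to `true` or `false`."""
        kind = node[0]
        if kind == "or":                     # E1.false is a fresh label
            l1 = self.newlabel()
            return (self.gen(node[1], true, l1) + [f"{l1}:"]
                    + self.gen(node[2], true, false))
        if kind == "and":                    # E1.true is a fresh label
            l1 = self.newlabel()
            return (self.gen(node[1], l1, false) + [f"{l1}:"]
                    + self.gen(node[2], true, false))
        if kind == "not":                    # swap the two targets
            return self.gen(node[1], false, true)
        if kind == "true":
            return [f"goto {true}"]
        if kind == "false":
            return [f"goto {false}"]
        op, a, b = node                      # id1 relop id2
        return [f"if {a} {op} {b} goto {true}", f"goto {false}"]


jg = JumpGen()
expr = ("or", ("<", "a", "b"), ("and", ("<", "c", "d"), ("<", "e", "f")))
jumps = jg.gen(expr, "Ltrue", "Lfalse")
print("\n".join(jumps))
```

No temporaries are created: the value of the expression is represented only by which label control reaches.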

Page 64: Interm codegen

Boolean Expression: Short Circuit Evaluation

a < b or c < d and e < f

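[Figure: parse tree for a < b or c < d and e < f. The root E is the or of E (a < b) and an E that is the and of E (c < d) and E (e < f); each comparison is derived by E → id relop id.]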

Page 65: Interm codegen

Boolean Expression: Short Circuit Evaluation

a < b or c < d and e < f

[Figure: parse tree for the expression with the inherited labels assigned so far:
root E (or): E.true = Ltrue, E.false = Lfalse
its E1 (a < b): E1.true = Ltrue, E1.false = L1
its E2 (c < d and e < f): E2.true = Ltrue, E2.false = Lfalse]

Productions applied:

E → id1 relop id2
{ E.code = append(gen(if id1.place relop id2.place goto E.true), gen(goto E.false)) }

E → E1 or E2
{ E1.true = E.true;
  E1.false = newlabel;
  E2.true = E.true;
  E2.false = E.false;
  E.code = append(E1.code, gen(E1.false:), E2.code) }

Code generated so far:

    if a < b goto Ltrue
    goto L1
L1:

Page 66: Interm codegen

Boolean Expression: Short Circuit Evaluation

a < b or c < d and e < f

[Figure: parse tree for the expression with the inherited labels assigned so far:
root E (or): E.true = Ltrue, E.false = Lfalse
its E1 (a < b): E1.true = Ltrue, E1.false = L1
its E2 (c < d and e < f): E2.true = Ltrue, E2.false = Lfalse
within the and, E1 (c < d): E1.true = L2, E1.false = Lfalse
within the and, E2 (e < f): E2.true = Ltrue, E2.false = Lfalse]

Productions applied:

E → id1 relop id2
{ E.code = append(gen(if id1.place relop id2.place goto E.true), gen(goto E.false)) }

E → E1 and E2
{ E1.false = E.false;
  E1.true = newlabel;
  E2.true = E.true;
  E2.false = E.false;
  E.code = append(E1.code, gen(E1.true:), E2.code) }

Code generated so far:

    if a < b goto Ltrue
    goto L1
L1: if c < d goto L2
    goto Lfalse
L2:

Page 67: Interm codegen

Boolean Expression: Short Circuit Evaluation

a < b or c < d and e < f

[Figure: parse tree for the expression with all inherited labels assigned:
root E (or): E.true = Ltrue, E.false = Lfalse
its E1 (a < b): E1.true = Ltrue, E1.false = L1
its E2 (c < d and e < f): E2.true = Ltrue, E2.false = Lfalse
within the and, E1 (c < d): E1.true = L2, E1.false = Lfalse
within the and, E2 (e < f): E2.true = Ltrue, E2.false = Lfalse]

Productions applied:

E → id1 relop id2
{ E.code = append(gen(if id1.place relop id2.place goto E.true), gen(goto E.false)) }

E → E1 and E2
{ E1.false = E.false;
  E1.true = newlabel;
  E2.true = E.true;
  E2.false = E.false;
  E.code = append(E1.code, gen(E1.true:), E2.code) }

Code generated so far:

    if a < b goto Ltrue
    goto L1
L1: if c < d goto L2
    goto Lfalse
L2: if e < f goto Ltrue
    goto Lfalse

Page 68: Interm codegen

Boolean Expression: Short Circuit Evaluation

a < b or c < d and e < f

[Figure: parse tree for the expression, as on the preceding slides of this example.]

Final code:

    if a < b goto Ltrue
    goto L1
L1: if c < d goto L2
    goto Lfalse
L2: if e < f goto Ltrue
    goto Lfalse

Page 69: Interm codegen

Combining Boolean and Control Flow Statements

while a < b do

if c < d then

x = y + z

else

x = y - z

S → while E do S1
{ S.begin = newlabel;
  E.true = newlabel;
  E.false = S.next;
  S1.next = S.begin;
  S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin)) }

[Figure: parse tree for the loop. The root S (while E do S) has the condition E (a < b) and a body S (if E then S else S) whose condition is E (c < d).]

Page 70: Interm codegen

Combining Boolean and Control Flow Statements

while a < b do

if c < d then

x = y + z

else

x = y - z

S → while E do S1
{ S.begin = newlabel;
  E.true = newlabel;
  E.false = S.next;
  S1.next = S.begin;
  S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin)) }

[Figure: parse tree for the loop, annotated at the while statement:
S.next = Lnext
S.begin = L1
E.true = L2
E.false = Lnext
S1.next = L1]

Page 71: Interm codegen

Combining Boolean and Control Flow Statements

while a < b do

if c < d then

x = y + z

else

x = y - z

S → while E do S1
{ S.begin = newlabel;
  E.true = newlabel;
  E.false = S.next;
  S1.next = S.begin;
  S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin)) }

[Figure: parse tree for the loop, annotated at the while statement:
S.next = Lnext
S.begin = L1
E.true = L2
E.false = Lnext
S1.next = L1]

Code generated so far:

L1: if a < b goto L2
    goto Lnext
L2:

Page 72: Interm codegen

Combining Boolean and Control Flow Statements

while a < b do

if c < d then

x = y + z

else

x = y - z

S → while E do S1
{ S.begin = newlabel;
  E.true = newlabel;
  E.false = S.next;
  S1.next = S.begin;
  S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin)) }

[Figure: parse tree for the loop, annotated at the while statement:
S.next = Lnext
S.begin = L1
E.true = L2
E.false = Lnext
S1.next = L1
and at the if statement:
E.true = L3
E.false = L4
S1.next = L1
S2.next = L1]

Code generated so far:

L1: if a < b goto L2
    goto Lnext
L2: if c < d goto L3
    goto L4
L3:

Page 73: Interm codegen

Combining Boolean and Control Flow Statements

while a < b do

if c < d then

x = y + z

else

x = y - z

S → while E do S1
{ S.begin = newlabel;
  E.true = newlabel;
  E.false = S.next;
  S1.next = S.begin;
  S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin)) }

[Figure: parse tree for the loop, annotated at the while statement:
S.next = Lnext
S.begin = L1
E.true = L2
E.false = Lnext
S1.next = L1
and at the if statement:
E.true = L3
E.false = L4
S1.next = L1
S2.next = L1]

Code generated so far:

L1: if a < b goto L2
    goto Lnext
L2: if c < d goto L3
    goto L4
L3: t1 = y + z
    x = t1
    goto L1
L4:

Page 74: Interm codegen

Combining Boolean and Control Flow Statements

while a < b do

if c < d then

x = y + z

else

x = y - z

S → while E do S1
{ S.begin = newlabel;
  E.true = newlabel;
  E.false = S.next;
  S1.next = S.begin;
  S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin)) }

[Figure: parse tree for the loop, annotated at the while statement:
S.next = Lnext
S.begin = L1
E.true = L2
E.false = Lnext
S1.next = L1
and at the if statement:
E.true = L3
E.false = L4
S1.next = L1
S2.next = L1]

Code generated so far:

L1: if a < b goto L2
    goto Lnext
L2: if c < d goto L3
    goto L4
L3: t1 = y + z
    x = t1
    goto L1
L4: t2 = y - z
    x = t2
    goto L1
Lnext:

Page 75: Interm codegen

Combining Boolean and Control Flow Statements

while a < b do

if c < d then

x = y + z

else

x = y - z

S → while E do S1
{ S.begin = newlabel;
  E.true = newlabel;
  E.false = S.next;
  S1.next = S.begin;
  S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin)) }

[Figure: parse tree for the loop. The root S (while E do S) has the condition E (a < b) and a body S (if E then S else S) whose condition is E (c < d).]

Final code:

L1: if a < b goto L2
    goto Lnext
L2: if c < d goto L3
    goto L4
L3: t1 = y + z
    x = t1
    goto L1
L4: t2 = y - z
    x = t2
    goto L1
Lnext:
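The statement rules compose in the same way as the expression rules: S.next is passed down as a parameter and fresh labels come from newlabel. The sketch below uses assumed names and a deliberately small statement encoding (conditions are single comparisons, assignments have one operator) to regenerate the listing for this example.

```python
import itertools

labels = itertools.count(1)
temps = itertools.count(1)


def newlabel():
    return f"L{next(labels)}"


def cond(c, true, false):
    """Jumping code for a single comparison (a, relop, b)."""
    a, op, b = c
    return [f"if {a} {op} {b} goto {true}", f"goto {false}"]


def stmt(s, nxt):
    """Code for statement s; nxt is the inherited S.next label."""
    kind = s[0]
    if kind == "assign":                       # ("assign", x, y, op, z)
        _, x, y, op, z = s
        t = f"t{next(temps)}"
        return [f"{t} = {y} {op} {z}", f"{x} = {t}"]
    if kind == "if":                           # ("if", cond, s1, s2)
        _, c, s1, s2 = s
        lt, lf = newlabel(), newlabel()        # E.true, E.false
        return (cond(c, lt, lf) + [f"{lt}:"] + stmt(s1, nxt)
                + [f"goto {nxt}", f"{lf}:"] + stmt(s2, nxt))
    if kind == "while":                        # ("while", cond, s1)
        _, c, s1 = s
        begin, lt = newlabel(), newlabel()     # S.begin, E.true
        return ([f"{begin}:"] + cond(c, lt, nxt) + [f"{lt}:"]
                + stmt(s1, begin) + [f"goto {begin}"])
    raise ValueError(f"unknown statement {kind!r}")


loop = ("while", ("a", "<", "b"),
        ("if", ("c", "<", "d"),
         ("assign", "x", "y", "+", "z"),
         ("assign", "x", "y", "-", "z")))
listing = stmt(loop, "Lnext") + ["Lnext:"]
print("\n".join(listing))
```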

Page 76: Interm codegen

Loop Constructs

Loops
• Evaluate condition before loop (if needed)
• Evaluate condition after loop
• Branch back to the top (if needed)

Why this structure?
• Merges test with last block of loop body
• Pre-test block to hold loop-invariant code
• Post-test for increment instructions and test

while, for, do, & until all fit this basic model

[Figure: loop skeleton made of the blocks Pre-test, Loop head, B1, B2, Post-test, and Next block.]

Page 77: Interm codegen

Break & Skip Statements

Many modern programming languages include a break
• Exits from the innermost control-flow statement
– Out of the innermost loop
– Out of a case statement

Translates into a jump
• Targets statement outside control-flow construct
• Creates multiple-exit construct
• skip in loop goes to next iteration

Only makes sense if the loop has > 1 block

[Figure: the loop skeleton (Pre-test, Loop head, B1, B2, Post-test, Next block) with a break in B1 and a skip in B2.]

Page 78: Interm codegen

Break and Skip Statements

• Need to keep track of enclosing control-flow constructs

• Harder to have a clean SDT scheme…
– Keep a stack of control-flow constructs
– Use the S.next stored in the stack as the target for the break statement
– For skip statements, need to keep track of the label of the post-test block's code to advance to the next iteration. This is harder since that code has not been generated yet.

• Backpatching helps
– Use a breaklist and a skiplist to be patched later.

Page 79: Interm codegen

Backpatching

• Single Pass Solution to Code Generation?
– No more symbolic labels - symbolic addresses instead
– Emit code directly into an array of instructions
– Actions associated with Productions
– Executed when Bottom-Up Parser "Reduces" a production

• Problem
– Need to know the labels for target branches before actually generating the code for them.

• Solution
– Leave branches undefined and patch them later
– Requires: carrying around a list of the places that need to be patched until the value to be patched with is known.

Page 80: Interm codegen

Boolean Expressions Revisited

• Use an additional ε-production
– Just a marker M
– Label value M.addr

• Attributes:
– E.truelist: code places that need to be filled in, corresponding to the evaluation of E as "true".
– E.falselist: same for "false"

(1) E → E1 or M E2
(2)   | E1 and M E2
(3)   | not E1
(4)   | ( E1 )
(5)   | id1 relop id2
(6)   | true
(7)   | false
(8) M → ε

Page 81: Interm codegen

Boolean Expressions: Code Outline

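[Figure: code outlines. For E1 and E2: E1.code is followed by E2.code, which is entered when E1 is true; the false exits of E1 and E2 and the true exit of E2 have targets that are not yet known (?). For E1 or E2: E1.code is followed by E2.code, which is entered when E1 is false; the true exits of E1 and E2 and the false exit of E2 have targets that are not yet known (?).]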

Page 82: Interm codegen

Action

(8) M → ε { M.addr := nextAddr; }

(1) E → E1 or M E2 { backpatch(E1.falselist, M.addr);
                     E.truelist := merge(E1.truelist, E2.truelist);
                     E.falselist := E2.falselist; }

(2) E → E1 and M E2 { backpatch(E1.truelist, M.addr);
                      E.truelist := E2.truelist;
                      E.falselist := merge(E1.falselist, E2.falselist); }

Page 83: Interm codegen

(3) E → not E1 { E.truelist := E1.falselist;
                 E.falselist := E1.truelist; }

(4) E → ( E1 ) { E.truelist := E1.truelist;
                 E.falselist := E1.falselist; }

(6) E → true { E.truelist := makelist(nextquad);
               emit('goto _'); }

(7) E → false { E.falselist := makelist(nextquad);
                emit('goto _'); }

More Actions

Page 84: Interm codegen

Backpatching Example

Expression: a < b or c < d and e < f

Executing Action and Generated Code

E → id1 relop id2, reduced for a < b, then c < d, then e < f:
{ E.truelist := makelist(nextquad());
  E.falselist := makelist(nextquad() + 1);
  emit("if id1.place relop.op id2.place goto _");
  emit("goto _"); }

100: if a < b goto _
101: goto _
102: if c < d goto _
103: goto _
104: if e < f goto _
105: goto _

M → ε, reduced before c < d and before e < f:
{ M.quad = nextquad(); }

E → E1 and M E2:
{ backpatch(E1.truelist, M.quad);
  E.truelist := E2.truelist;
  E.falselist := merge(E1.falselist, E2.falselist); }

102: if c < d goto 104
103: goto _

E → E1 or M E2:
{ backpatch(E1.falselist, M.quad);
  E.truelist := merge(E1.truelist, E2.truelist);
  E.falselist := E2.falselist; }

100: if a < b goto _
101: goto 102

[Figure: parse tree annotated with the attribute values:
E (a < b): E.truelist = {100}, E.falselist = {101}
M after or: M.addr = 102
E (c < d): E.truelist = {102}, E.falselist = {103}
M after and: M.addr = 104
E (e < f): E.truelist = {104}, E.falselist = {105}
E (c < d and e < f): E.truelist = {104}, E.falselist = {103, 105}
E (whole expression): E.truelist = {100, 104}, E.falselist = {103, 105}]
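The example can be replayed with a small backpatching generator. This is a sketch under assumptions: the class name Backpatcher, the tuple encoding of the expression, and the start address 100 are chosen to match the example, and the marker M is modelled by reading the next instruction address between the two operands.

```python
class Backpatcher:
    def __init__(self, start=100):
        self.start = start
        self.code = []

    def nextquad(self):
        return self.start + len(self.code)

    def emit(self, instr):
        self.code.append(instr)

    def backpatch(self, places, target):
        """Fill the open jump target '_' of every instruction in places."""
        for q in places:
            i = q - self.start
            assert self.code[i].endswith("_")
            self.code[i] = self.code[i][:-1] + str(target)

    def gen(self, node):
        """Return (truelist, falselist) for a boolean expression."""
        kind = node[0]
        if kind in ("or", "and"):
            t1, f1 = self.gen(node[1])
            m = self.nextquad()              # marker M: address of E2's code
            t2, f2 = self.gen(node[2])
            if kind == "or":
                self.backpatch(f1, m)
                return t1 + t2, f2           # merge(E1.truelist, E2.truelist)
            self.backpatch(t1, m)
            return t2, f1 + f2               # merge(E1.falselist, E2.falselist)
        if kind == "not":
            t1, f1 = self.gen(node[1])
            return f1, t1
        op, a, b = node                      # id1 relop id2
        q = self.nextquad()
        self.emit(f"if {a} {op} {b} goto _")
        self.emit("goto _")
        return [q], [q + 1]


bp = Backpatcher()
expr = ("or", ("<", "a", "b"), ("and", ("<", "c", "d"), ("<", "e", "f")))
truelist, falselist = bp.gen(expr)
for addr, instr in enumerate(bp.code, start=bp.start):
    print(f"{addr}: {instr}")
print(truelist, falselist)
```

The jumps still ending in "_" are exactly those on the returned truelist and falselist; an enclosing statement would patch them once its own labels are known.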

Page 85: Interm codegen

Control Flow Code Structures

if E then S1

         E.code
E.true:  S1.code
E.false: ...

if E then S1 else S2

         E.code
E.true:  S1.code
         goto S.next
E.false: S2.code
S.next:  ...

while E do S1

S.begin: E.code
E.true:  S1.code
         goto S.begin
E.false: ...