

Intermediate code generation

Simplifying initial assumptions
• a 3-address code will be used for instructions
  - choice of instruction mnemonics and layouts is lecturer’s own
• symbolic locations and jump targets, like in many assembly languages, will be used
  - these would need to be resolved to actual machine addresses before the code could be run
• all operands in arithmetic expressions and comparisons are integers
• we deal only with arithmetic expressions, boolean expressions, assignment statements, while-loop statements
  - ignoring e.g. type declarations, if-then-else statements, most array references, function and procedure calls, record and class definitions, etc.
• code is generated as a string, by concatenating substrings
• code is stored as the value of an attribute of a parse tree node
• run-time speed of generated code is not a concern

http://csiweb.ucd.ie/staff/acater/comp30330.html Compiler Construction 1


Arithmetic expressions: simple code generation

The general idea is that semantic actions are associated with every production in the grammar dealing with expressions and subexpressions.

These actions systematically set two particular attributes
• ‘code’: a string containing arbitrarily many 3-address-code instructions
• ‘loc’: a symbol naming the location where the value for the (sub)expression will be placed when those instructions are executed

New locations can be invented as required.
Code construction may involve adding something to the code already generated for sub-sub-expressions.


E ::= E + T {E.loc=newloc(); E.code=E1.code || T.code || gen(add, E.loc, E1.loc,T.loc)}
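The action above can be tried out with a small Python sketch. This is only an illustration under assumptions: the helper names `newloc` and `gen` follow the slide, but the `t1, t2, ...` naming scheme, the textual instruction layout, and the use of dicts to hold the `loc` and `code` attributes are choices made here, not taken from the lecture.

```python
import itertools

_counter = itertools.count(1)

def newloc():
    """Invent a fresh symbolic location (assumed naming scheme: t1, t2, ...)."""
    return f"t{next(_counter)}"

def gen(op, *operands):
    """Render one 3-address instruction as a line of text (assumed layout)."""
    return f"{op} {', '.join(str(o) for o in operands)}\n"

def action_add(e1, t):
    """Semantic action for E ::= E1 + T.

    e1 and t are dicts carrying the 'loc' and 'code' attributes of the
    subexpressions; the result carries the attributes of the new E.
    """
    loc = newloc()
    code = e1["code"] + t["code"] + gen("add", loc, e1["loc"], t["loc"])
    return {"loc": loc, "code": code}

# Identifiers already live in named locations, so they need no code.
a = {"loc": "a", "code": ""}
b = {"loc": "b", "code": ""}
e = action_add(a, b)
print(e["loc"])
print(e["code"], end="")
```

The `code` attribute of the result is simply the concatenation of the operands' code followed by one new instruction, which is the pattern every other arithmetic production repeats.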


Add actions to a basic grammar


E ::= E + T
    | E - T
    | T
T ::= T * F
    | T / F
    | F
F ::= id
    | num


E ::= E + T   { E.loc = newloc(); E.code = E1.code || T.code || gen(add, E.loc, E1.loc, T.loc) }
    | E - T   { E.loc = newloc(); E.code = E1.code || T.code || gen(sub, E.loc, E1.loc, T.loc) }
    | T       { E.loc = T.loc; E.code = T.code }
T ::= T * F   { T.loc = newloc(); T.code = T1.code || F.code || gen(mul, T.loc, T1.loc, F.loc) }
    | T / F   { T.loc = newloc(); T.code = T1.code || F.code || gen(div, T.loc, T1.loc, F.loc) }
    | F       { T.loc = F.loc; T.code = F.code }
F ::= id      { F.loc = id.lexval; F.code = "" }
    | num     { F.loc = newloc(); F.code = gen(loadimmediate, F.loc, num.lexval) }

In the actions, E1 and T1 denote the E and T on the right-hand side of the production; || is string concatenation.
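As a rough end-to-end illustration of these actions, the sketch below translates a small expression into 3-address code. Several things here are assumptions rather than part of the lecture: the `Translator` class, the regular-expression tokenizer, the `t1, t2, ...` location names, and above all the parsing method. The left-recursive productions are handled with loops in a hand-written parser, which keeps the operators left-associative and applies the same action at each step; there is no error handling for malformed input.

```python
import re

class Translator:
    """Hypothetical translator computing (loc, code) for the expression grammar."""

    def __init__(self, text):
        # Tokens: integer literals, identifiers, and the four operators.
        self.tokens = re.findall(r"\d+|[A-Za-z_]\w*|[-+*/]", text)
        self.pos = 0
        self.count = 0

    def newloc(self):
        self.count += 1
        return f"t{self.count}"

    @staticmethod
    def gen(op, *operands):
        return f"{op} {', '.join(operands)}\n"

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def factor(self):
        tok = self.advance()
        if tok.isdigit():                       # F ::= num
            loc = self.newloc()
            return loc, self.gen("loadimmediate", loc, tok)
        return tok, ""                          # F ::= id

    def term(self):
        loc, code = self.factor()               # T ::= F
        while self.peek() in ("*", "/"):        # T ::= T1 * F | T1 / F
            op = "mul" if self.advance() == "*" else "div"
            f_loc, f_code = self.factor()
            new = self.newloc()
            code = code + f_code + self.gen(op, new, loc, f_loc)
            loc = new
        return loc, code

    def expr(self):
        loc, code = self.term()                 # E ::= T
        while self.peek() in ("+", "-"):        # E ::= E1 + T | E1 - T
            op = "add" if self.advance() == "+" else "sub"
            t_loc, t_code = self.term()
            new = self.newloc()
            code = code + t_code + self.gen(op, new, loc, t_loc)
            loc = new
        return loc, code

loc, code = Translator("a + 2 * b").expr()
print(code, end="")
print("result in", loc)
```

For `a + 2 * b` the multiplication's code comes first, because the code of T is concatenated before the `add` instruction is generated.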


Is ‘code’ a synthesised attribute?

That grammar had left-recursive productions, unsuited to top-down parsing, and all attributes were synthesised.

If left-recursion is eliminated from a grammar, we have seen previously that for an existing synthesised attribute it may be necessary to introduce two new attributes, one inherited and one synthesised. There is a pattern for doing this:

Where there are two original productions A ::= A b | c,
• a new nonterminal is needed, say A’
• three non-left-recursive productions are needed
• for any existing synthesised attribute q used in the 2 original productions
  - the original q attribute is used only in the new production for A ::= c A’
  - an inherited attribute, say qi, passes progressively refined values for q down the right hand side of a parse tree, from A to A’ and from A’ to lower A’
  - a synthesised attribute, say qs, passes the ultimate value for q back up the tree



Attribute transformations in left-recursion removal


Original (left-recursive):
A ::= A b    { A.q = f(A1.q, b.q) }
A ::= c      { A.q = g(c) }

Transformed:
A  ::= c { A’.qi = g(c) } A’ { A.q = A’.qs }
A’ ::= b { A’1.qi = f(A’.qi, b.q) } A’ { A’.qs = A’1.qs }
A’ ::=   { A’.qs = A’.qi }

These grammars both generate one c followed by any number of b’s

Original (left-recursive):
E ::= E + T    { …, E.code = f1(E1.code, T.code) }
E ::= E - T    { …, E.code = f2(E1.code, T.code) }
E ::= T        { …, E.code = g(T) }

Transformed:
E  ::= T { ? } E’ { E.code = E’.codes }
E’ ::= + T { ? } E’ { E’.codes = E’1.codes }
E’ ::= - T { ? } E’ { E’.codes = E’1.codes }
E’ ::=   { E’.codes = E’.codei }

These grammars both generate one T followed by any number of either “+ T” or “- T” sequences

Exercise: What replaces the ?s (in terms of f1, f2 and g)
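The generic A / A’ pattern can be checked with a small Python sketch. It is only an illustration: the particular `f` and `g` (which build a bracketed string so the grouping is visible) and the representation of the input as a `c` plus a list of `b` values are assumptions made here. The inherited attribute qi becomes a function parameter and the synthesised attribute qs becomes the return value.

```python
def f(q, b):
    """Example combining function: shows how q and b were grouped."""
    return "(" + q + " " + b + ")"

def g(c):
    """Example base function: the attribute of a lone c is c itself."""
    return c

def eval_left_recursive(c, bs):
    """A ::= A b | c with a purely synthesised q; the tree nests to the left."""
    if not bs:
        return g(c)                                    # A ::= c    { A.q = g(c) }
    return f(eval_left_recursive(c, bs[:-1]), bs[-1])  # A ::= A1 b { A.q = f(A1.q, b.q) }

def a_prime(qi, bs):
    """A' ::= b A' | empty. qi is inherited; the return value plays the role of qs."""
    if not bs:
        return qi                                      # A' ::=     { A'.qs = A'.qi }
    return a_prime(f(qi, bs[0]), bs[1:])               # A'1.qi = f(A'.qi, b.q); A'.qs = A'1.qs

def eval_right_recursive(c, bs):
    """A ::= c A' with A'.qi = g(c) and A.q = A'.qs."""
    return a_prime(g(c), bs)

left = eval_left_recursive("c", ["b1", "b2", "b3"])
right = eval_right_recursive("c", ["b1", "b2", "b3"])
print(left)
print(right)
```

Both evaluations yield the same left-nested grouping, even though the second parse tree leans to the right: the value is refined on the way down and handed back up unchanged.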


Assignment statements

In a programming language assignment statement, the value of an expression on the right-hand side is calculated and stored in a location named on the left-hand side.
• In general the location may be something simple like an identifier, an array element, a record slot. Calculation of a storage address is perhaps required.
• The expression on the right-hand side may be arbitrarily complicated. Calculation of a quantity is required.
• These two kinds of calculation are for so-called lvalue and rvalue respectively.

Assn ::= id := Expr
    { Assn.code = Expr.code || gen(load, id.lexval, Expr.loc) }

Assn ::= id [ Expr ] := Expr
    { Assn.addr = newloc();
      Assn.code = Expr2.code || Expr1.code ||
                  gen(add, Assn.addr, symtabaddr(id.lexval), Expr1.loc) ||
                  gen(loadindirect, Assn.addr, Expr2.loc) }

symtabaddr(id.lexval) gets the identifier’s address from the compiler’s symbol table
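A minimal Python sketch of the two assignment actions follows. The function names `assign_simple` and `assign_indexed`, the dict-based attribute records, the `t1, t2, ...` location names and the contents of the toy `SYMTAB` table are all hypothetical; only `newloc`, `gen`, `symtabaddr` and the mnemonics `load`, `add` and `loadindirect` come from the slide.

```python
import itertools

_counter = itertools.count(1)

def newloc():
    """Invent a fresh symbolic location (assumed naming scheme)."""
    return f"t{next(_counter)}"

def gen(op, *operands):
    """Render one 3-address instruction as text (assumed layout)."""
    return f"{op} {', '.join(operands)}\n"

# Hypothetical symbol table: maps an array name to a symbolic base address.
SYMTAB = {"B": "addr_B"}

def symtabaddr(name):
    """Look up the identifier's address in the compiler's symbol table."""
    return SYMTAB[name]

def assign_simple(ident, expr):
    """Assn ::= id := Expr"""
    return expr["code"] + gen("load", ident, expr["loc"])

def assign_indexed(ident, index, value):
    """Assn ::= id [ Expr1 ] := Expr2, with index = Expr1 and value = Expr2."""
    addr = newloc()
    return (value["code"] + index["code"]
            + gen("add", addr, symtabaddr(ident), index["loc"])
            + gen("loadindirect", addr, value["loc"]))

y = {"loc": "y", "code": ""}
i = {"loc": "i", "code": ""}
simple = assign_simple("x", y)          # x := y
indexed = assign_indexed("B", i, y)     # B[i] := y
print(simple, end="")
print(indexed, end="")
```

The indexed case shows the lvalue calculation explicitly: an address is computed into a new location before the value is stored through it.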


Boolean Expressions

• Boolean expression syntax usually includes elements for
  - logical constants true and false
  - comparisons of arithmetic quantities: equal, not equal, greater, etc.
  - boolean variables, array elements, functions etc.
  - logical operators “and”, “or” and “not”

• Boolean expressions have two uses
  - to calculate boolean quantities to be stored in variables, arrays etc.
  - to determine flow of control: in loops, in if-then-else statements

• Compound boolean expressions can be evaluated in two distinct ways
  - akin to arithmetic expressions: evaluate subexpressions then combine values
  - in a “short-circuit” mode: later subexpressions may not always get evaluated


e.g. (k >= 0) and (k <= 99) and (B[k] /= 0)
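The difference between the two evaluation modes shows up when a later subexpression is only safe if the earlier ones hold. The Python sketch below mirrors the example above; the function names and the 100-element list are assumptions made for illustration, and Python writes the slide's `/=` as `!=`. Python's own `and` is short-circuit, so full evaluation is imitated by computing every subexpression first.

```python
def check_short_circuit(B, k):
    """Short-circuit: B[k] is only touched when both range tests succeeded."""
    return (k >= 0) and (k <= 99) and (B[k] != 0)

def check_full(B, k):
    """Full evaluation: every subexpression is computed, then the values are combined."""
    parts = [k >= 0, k <= 99, B[k] != 0]   # B[k] is evaluated regardless of k
    return all(parts)

B = [1] * 100
print(check_short_circuit(B, 200))         # the array is never indexed
try:
    check_full(B, 200)
except IndexError as exc:
    print("full evaluation failed:", exc)
```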


Jumping code

• Code generated for boolean expressions almost always includes jumps & labels
  - to affect control flow
  - to use short-circuit evaluation
  - an exception is possible only when the expression is both evaluated in full and used to calculate a value


Syntax: WHILE boolexpr DO statement

Translation sketch:
label one:    <code for boolexpr that jumps
                 -- to label two if true
                 -- to label three if false>
label two:    <code for statement>
              jump to label one
label three:

Syntax: bool1 AND bool2

Translation sketch (short-circuit):
              <code for bool1 that jumps
                 -- to label four if true
                 -- to falselabel if false>
label four:   <code for bool2 that jumps
                 -- to truelabel if true
                 -- to falselabel if false>

truelabel and falselabel: inherited attributes in SDD/SDT


Semantic actions for sample of boolean productions

Bool ::= Expr < Expr
    { Bool.code = Expr1.code || Expr2.code ||
                  gen(jumpifless, Expr1.loc, Expr2.loc, Bool.truelabeli) ||
                  gen(jumpalways, Bool.falselabeli) }

Bool ::= Bool AND Bool
    { Bool1.falselabeli = Bool.falselabeli
      Bool2.falselabeli = Bool.falselabeli
      Bool1.truelabeli = newlabel()
      Bool2.truelabeli = Bool.truelabeli
      Bool.code = Bool1.code ||
                  gen(label, Bool1.truelabeli) ||
                  Bool2.code }

Here jumpifless jumps to its label operand when the value in its first location is less than the value in its second; otherwise execution falls through to the following jumpalways.

Labels are not actual instructions, though they are generated as if they were. They are names attached to the next actual instruction in the sequence.
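A small Python sketch can make the flow of the inherited labels concrete. It rests on stated assumptions: `jumpifless` is taken to jump to its label when the first operand is less than the second, so the conditional jump targets the true label; the label names `L1, L2, ...`, `Ltrue` and `Lfalse`, and the idea of modelling each Bool as a function that receives its two inherited labels and returns its code, are choices made here for illustration.

```python
import itertools

_labels = itertools.count(1)

def newlabel():
    """Invent a fresh jump target (assumed naming scheme: L1, L2, ...)."""
    return f"L{next(_labels)}"

def gen(op, *operands):
    """Render one 3-address instruction or label as text (assumed layout)."""
    return f"{op} {', '.join(operands)}\n"

def bool_less(e1, e2):
    """Bool ::= Expr1 < Expr2; e1 and e2 carry 'loc' and 'code' attributes."""
    def make(truelabel, falselabel):
        return (e1["code"] + e2["code"]
                + gen("jumpifless", e1["loc"], e2["loc"], truelabel)
                + gen("jumpalways", falselabel))
    return make

def bool_and(b1, b2):
    """Bool ::= Bool1 AND Bool2 with short-circuit evaluation."""
    def make(truelabel, falselabel):
        middle = newlabel()                     # Bool1.truelabeli = newlabel()
        return (b1(middle, falselabel)          # Bool1 jumps out early if false
                + gen("label", middle)
                + b2(truelabel, falselabel))    # Bool2 inherits both outer labels
    return make

a, b, c = ({"loc": name, "code": ""} for name in "abc")
code = bool_and(bool_less(a, b), bool_less(b, c))("Ltrue", "Lfalse")
print(code, end="")
```

Passing the labels as arguments after the subtrees are built reflects why they are inherited attributes: a Bool cannot produce its code until its context says where true and false lead.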


Where Boolean expressions are found

• Grammar productions which have a Bool anywhere on the right-hand-side must provide values for its falselabeli and truelabeli attributes

• In translating While loops, two labels are needed anyway to handle a return-to-start and escape-to-end situation. A third label is needed if the boolean expression is to be given short-circuit treatment – language designer’s choice

• In translating If-Then-Else statements, two labels are needed anyway, to handle the jump to the start of the else-code, and to handle the completion of the then-code. Again a third label is needed, labelling the start of the then-code, if the boolean expression is to be given short-circuit treatment.

• In translating assignments to boolean variables, two labels are needed.



Translating a While Loop


Stmt0 ::= while   { Stmt0.label_one = newlabel();
                    Stmt0.label_two = newlabel();
                    Stmt0.label_three = newlabel();
                    Bool.truelabeli = Stmt0.label_two;
                    Bool.falselabeli = Stmt0.label_three }
          Bool    { }
          do      { }
          Stmt1   { Stmt0.code = gen(label, Stmt0.label_one) ||
                                 Bool.code ||
                                 gen(label, Stmt0.label_two) ||
                                 Stmt1.code ||
                                 gen(jumpalways, Stmt0.label_one) ||
                                 gen(label, Stmt0.label_three) }
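The while-loop action can be sketched in Python as follows. As before this is an illustration under assumptions: the Bool is modelled as a function taking its inherited true and false labels, `jumpifless` is assumed to jump when its first operand is less than its second, the body code is supplied as a ready-made string, and the names `while_stmt`, `less` and the `L1, L2, ...` labels are hypothetical.

```python
import itertools

_labels = itertools.count(1)

def newlabel():
    """Invent a fresh jump target (assumed naming scheme: L1, L2, ...)."""
    return f"L{next(_labels)}"

def gen(op, *operands):
    """Render one 3-address instruction or label as text (assumed layout)."""
    return f"{op} {', '.join(operands)}\n"

def less(loc1, loc2):
    """Jumping code for loc1 < loc2, parameterised by the inherited labels."""
    def make(truelabel, falselabel):
        return (gen("jumpifless", loc1, loc2, truelabel)
                + gen("jumpalways", falselabel))
    return make

def while_stmt(make_bool, body_code):
    """Stmt0 ::= while Bool do Stmt1"""
    one, two, three = newlabel(), newlabel(), newlabel()
    bool_code = make_bool(two, three)   # Bool.truelabeli = two, Bool.falselabeli = three
    return (gen("label", one)
            + bool_code
            + gen("label", two)
            + body_code
            + gen("jumpalways", one)    # return to the start of the loop
            + gen("label", three))      # escape to the end

# Body for "i := i + 1", written out by hand in the slide's mnemonics.
body = "loadimmediate t1, 1\nadd t2, i, t1\nload i, t2\n"
code = while_stmt(less("i", "n"), body)
print(code, end="")
```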