Intermediate code generation
Simplifying initial assumptions
• a 3-address code will be used for instructions
  - choice of instruction mnemonics and layouts is lecturer's own
• symbolic locations and jump targets, like in many assembly languages, will be used
  - these would need to be resolved to actual machine addresses before the code could be run
• all operands in arithmetic expressions and comparisons are integers
• we deal only with arithmetic expressions, boolean expressions, assignment statements, while-loop statements
  - ignoring e.g. type declarations, if-then-else statements, most array references, function and procedure calls, record and class definitions, etc.
• code is generated as a string, by concatenating substrings
• code is stored as the value of an attribute of a parse tree node
• run-time speed of generated code is not a concern
http://csiweb.ucd.ie/staff/acater/comp30330.html Compiler Construction 1
Arithmetic expressions: simple code generation
The general idea is that semantic actions are associated with every production in the grammar dealing with expressions and subexpressions.
These actions systematically set two particular attributes:
• 'code': a string containing arbitrarily many 3-address-code instructions
• 'loc': a symbol naming the location where the value for the (sub)expression will be placed when those instructions are executed
New locations can be invented as required.
Code construction may involve adding something to the code already generated for sub-sub-expressions.
E ::= E + T   { E.loc = newloc(); E.code = E1.code || T.code || gen(add, E.loc, E1.loc, T.loc) }
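The action for E ::= E + T can be sketched in Python as follows. This is only an illustration: the naming scheme t1, t2, ... for new locations, the textual layout of an instruction, and the use of dicts to hold the 'loc' and 'code' attributes are all assumptions, not something the slides prescribe.

```python
import itertools

_counter = itertools.count(1)

def newloc():
    """Invent a fresh symbolic location (the t1, t2, ... naming is an assumption)."""
    return f"t{next(_counter)}"

def gen(op, *operands):
    """Render one 3-address instruction as a line of text (layout is an assumption)."""
    return f"{op} " + ", ".join(str(o) for o in operands) + "\n"

def add_action(e1, t):
    """Action for E ::= E + T; e1 and t are dicts holding 'loc' and 'code'."""
    loc = newloc()
    code = e1["code"] + t["code"] + gen("add", loc, e1["loc"], t["loc"])
    return {"loc": loc, "code": code}

# Two identifiers need no code of their own, so only the add is emitted.
a = {"loc": "a", "code": ""}
b = {"loc": "b", "code": ""}
e = add_action(a, b)
print(e["code"], end="")   # add t1, a, b
```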
Add actions to a basic grammar
E ::= E + T   { E.loc = newloc(); E.code = E1.code || T.code || gen(add, E.loc, E1.loc, T.loc) }
    | E - T   { E.loc = newloc(); E.code = E1.code || T.code || gen(sub, E.loc, E1.loc, T.loc) }
    | T       { E.loc = T.loc; E.code = T.code }
T ::= T * F   { T.loc = newloc(); T.code = T1.code || F.code || gen(mul, T.loc, T1.loc, F.loc) }
    | T / F   { T.loc = newloc(); T.code = T1.code || F.code || gen(div, T.loc, T1.loc, F.loc) }
    | F       { T.loc = F.loc; T.code = F.code }
F ::= id      { F.loc = id.lexval; F.code = "" }
    | num     { F.loc = newloc(); F.code = gen(loadimmediate, F.loc, num.lexval) }
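The complete set of actions can be tried out with a small Python sketch. It assumes the parse has already been done and works on a simplified tree: an identifier is a str, a number is an int, and a binary operation is a tuple (op, left, right). The unit productions E ::= T and T ::= F only copy attributes, so they do not appear in such a tree. The tree encoding, the helper names and the instruction layout are assumptions made for illustration.

```python
import itertools

def make_translator():
    counter = itertools.count(1)

    def newloc():
        # Fresh symbolic location; the t1, t2, ... naming is an assumption.
        return f"t{next(counter)}"

    def gen(op, *operands):
        return f"{op} " + ", ".join(str(o) for o in operands) + "\n"

    binops = {"+": "add", "-": "sub", "*": "mul", "/": "div"}

    def translate(node):
        """Return the pair (loc, code) for an expression tree."""
        if isinstance(node, str):            # F ::= id
            return node, ""
        if isinstance(node, int):            # F ::= num
            loc = newloc()
            return loc, gen("loadimmediate", loc, node)
        op, left, right = node               # E ::= E op T  or  T ::= T op F
        lloc, lcode = translate(left)
        rloc, rcode = translate(right)
        loc = newloc()
        return loc, lcode + rcode + gen(binops[op], loc, lloc, rloc)

    return translate

translate = make_translator()
loc, code = translate(("+", "a", ("*", "b", 2)))   # a + b * 2
print(code, end="")
# loadimmediate t1, 2
# mul t2, b, t1
# add t3, a, t2
```

Each call to make_translator() starts its own location counter, which keeps separate translations from interfering with one another.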
Is ‘code’ a synthesised attribute?
That grammar had left-recursive productions, unsuited to top-down parsing, and all attributes were synthesised.
If left-recursion is eliminated from a grammar, we have seen previously that for an existing synthesised attribute it may be necessary to introduce two new attributes, one inherited and one synthesised. There is a pattern for doing this:
Where there are two original productions A ::= A b | c,
• a new nonterminal is needed, say A’
• three non-left-recursive productions are needed
• for any existing synthesised attribute q used in the 2 original productions
  - the original q attribute is used only in the new production for A ::= c A’
  - an inherited attribute, say qi, passes progressively refined values for q down the right hand side of a parse tree, from A to A’ and from A’ to lower A’
  - a synthesised attribute, say qs, passes the ultimate value for q back up the tree
Attribute transformations in left-recursion removal
A ::= A b   { A.q = f(A1.q, b.q) }
A ::= c     { A.q = g(c) }

A  ::= c { A’.qi = g(c) } A’ { A.q = A’.qs }
A’ ::= b { A’1.qi = f(A’.qi, b.q) } A’ { A’.qs = A’1.qs }
A’ ::=   { A’.qs = A’.qi }
These grammars both generate one c followed by any number of b’s
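A recursive-descent sketch in Python may help show how qi and qs travel through the tree for the generic A / A’ pattern. Here the inherited attribute qi becomes a function parameter and the synthesised attribute qs becomes the return value. The token representation (a plain list whose first item is c and whose remaining items are b's) and the function names are assumptions for illustration.

```python
def parse_A(tokens, f, g):
    """A ::= c A'   with   A'.qi = g(c)   and   A.q = A'.qs."""
    c, rest = tokens[0], tokens[1:]
    return parse_A_prime(rest, g(c), f)

def parse_A_prime(tokens, qi, f):
    """A' ::= b A' | (empty); qi is inherited, the return value is qs."""
    if not tokens:                    # A' ::= (empty):  A'.qs = A'.qi
        return qi
    b, rest = tokens[0], tokens[1:]
    # A'1.qi = f(A'.qi, b.q), then A'.qs = A'1.qs
    return parse_A_prime(rest, f(qi, b), f)

# With f as subtraction the result is (10 - 3) - 2, i.e. the same
# left-to-right grouping the left-recursive grammar would have given.
q = parse_A([10, 3, 2], f=lambda acc, b: acc - b, g=lambda c: c)
print(q)   # 5
```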
E ::= E + T   { …, E.code = f1(E1.code, T.code) }
E ::= E - T   { …, E.code = f2(E1.code, T.code) }
E ::= T       { …, E.code = g(T) }

E  ::= T { ? } E’ { E.code = E’.codes }
E’ ::= + T { ? } E’ { E’.codes = E’1.codes }
E’ ::= - T { ? } E’ { E’.codes = E’1.codes }
E’ ::=   { E’.codes = E’.codei }

These grammars both generate one T followed by any number of either "+ T" or "- T" sequences.
Exercise: what replaces the ?s (in terms of f1, f2 and g)?
Assignment statements
In a programming language assignment statement, the value of an expression on the right-hand side is calculated and stored in a location named on the left-hand side.
• In general the location may be something simple like an identifier, an array element, a record slot. Calculation of a storage address is perhaps required.
• The expression on the right-hand side may be arbitrarily complicated. Calculation of a quantity is required.
• These two kinds of calculation are for so-called lvalue and rvalue respectively.
Assn ::= id := Expr
    { Assn.code = Expr.code || gen(load, id.lexval, Expr.loc) }

Assn ::= id [ Expr ] := Expr
    { Assn.addr = newloc();
      Assn.code = Expr2.code ||
                  Expr1.code ||
                  gen(add, Assn.addr, symtabaddr(id.lexval), Expr1.loc) ||
                  gen(loadindirect, Assn.addr, Expr2.loc) }

symtabaddr(id.lexval) gets the identifier's address from the compiler's symbol table.
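The two assignment actions can be sketched in Python as below. The symbol table here is a hypothetical dict mapping a name to a made-up base address, and the helper names, location naming and instruction layout are assumptions; the mnemonics load, add and loadindirect are taken from the actions above.

```python
import itertools

_ids = itertools.count(1)
SYMTAB = {"B": 1000}          # hypothetical symbol table: name -> base address

def newloc():
    return f"t{next(_ids)}"

def gen(op, *operands):
    return f"{op} " + ", ".join(str(o) for o in operands) + "\n"

def symtabaddr(name):
    """Look up the identifier's address in the (hypothetical) symbol table."""
    return SYMTAB[name]

def assign_simple(ident, expr):
    """Assn ::= id := Expr"""
    return expr["code"] + gen("load", ident, expr["loc"])

def assign_indexed(ident, index, value):
    """Assn ::= id [ Expr1 ] := Expr2  (index is Expr1, value is Expr2)."""
    addr = newloc()
    return (value["code"] + index["code"]
            + gen("add", addr, symtabaddr(ident), index["loc"])
            + gen("loadindirect", addr, value["loc"]))

x = {"loc": "x", "code": ""}
i = {"loc": "i", "code": ""}
simple_code = assign_simple("y", x)          # y := x
indexed_code = assign_indexed("B", i, x)     # B[i] := x
print(simple_code + indexed_code, end="")
# load y, x
# add t1, 1000, i
# loadindirect t1, x
```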
Boolean Expressions
• Boolean expression syntax usually includes elements for
  - logical constants true and false
  - comparisons of arithmetic quantities: equal, not equal, greater, etc.
  - boolean variables, array elements, functions etc.
  - logical operators and, or and not
• Boolean expressions have two uses
  - to calculate boolean quantities to be stored in variables, arrays etc.
  - to determine flow of control: in loops, in if-then-else statements
• Compound boolean expressions can be evaluated in two distinct ways
  - akin to arithmetic expressions: evaluate subexpressions then combine values
  - in a "short-circuit" mode: later subexpressions may not always get evaluated
    e.g. (k >= 0) and (k <= 99) and (B[k] /= 0)
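The practical difference between the two modes shows up when a later subexpression is only safe if an earlier one holds. The following Python sketch mirrors the example above; Python's own `and` happens to be short-circuit, and the full-evaluation variant is simulated by computing every subexpression first. The array size and the value of k are made up for illustration.

```python
B = [1] * 100     # valid indexes are 0..99
k = 150           # deliberately out of range

# Short-circuit: B[k] is never evaluated because (k <= 99) is already false.
short_circuit = (k >= 0) and (k <= 99) and (B[k] != 0)

def full_evaluation(k):
    """Evaluate every subexpression first, then combine the values."""
    parts = [k >= 0, k <= 99, B[k] != 0]   # B[k] raises IndexError here
    return all(parts)

try:
    full_evaluation(k)
    full_failed = False
except IndexError:
    full_failed = True

print(short_circuit, full_failed)   # False True
```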
Jumping code
• Code generated for boolean expressions almost always includes jumps & labels
  - to affect control flow
  - to use short-circuit evaluation
  - jumps can be avoided only when the expression is both fully evaluated and calculated to produce a value

Syntax: WHILE boolexpr DO statement
Translation sketch:
  label one:    <code for boolexpr that jumps
                 -- to label two if true
                 -- to label three if false>
  label two:    <code for statement>
                jump to label one
  label three:

Syntax: bool1 AND bool2
Translation sketch (short-circuit):
                <code for bool1 that jumps
                 -- to label four if true
                 -- to falselabel if false>
  label four:   <code for bool2 that jumps
                 -- to truelabel if true
                 -- to falselabel if false>

truelabel and falselabel: inherited attributes in SDD/SDT
Semantic actions for sample of boolean productions

Bool ::= Expr < Expr
    { Bool.code = Expr1.code || Expr2.code ||
                  gen(jumpifless, Expr1.loc, Expr2.loc, Bool.truelabeli) ||
                  gen(jumpalways, Bool.falselabeli) }

Bool ::= Bool AND Bool
    { Bool1.falselabeli = Bool.falselabeli
      Bool2.falselabeli = Bool.falselabeli
      Bool1.truelabeli = newlabel()
      Bool2.truelabeli = Bool.truelabeli
      Bool.code = Bool1.code ||
                  gen(label, Bool1.truelabeli) ||
                  Bool2.code }

Labels are not actual instructions, though they are generated as if they were. They are names attached to the next actual instruction in the sequence.
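A Python sketch of jumping code for these two kinds of boolean production follows. The inherited attributes become function parameters. It assumes that `jumpifless a, b, L` transfers control to L when a < b, that operands are plain identifiers (so their own code is empty), and that labels are named L1, L2, ...; the tuple encoding of boolean expressions is likewise an assumption.

```python
import itertools

_labels = itertools.count(1)

def newlabel():
    return f"L{next(_labels)}"

def gen(op, *operands):
    return f"{op} " + ", ".join(str(o) for o in operands) + "\n"

def bool_code(node, truelabel, falselabel):
    """Return jumping code for node, given its inherited jump targets.

    node is ("<", loc1, loc2) or ("and", bool1, bool2).
    """
    if node[0] == "<":
        _, loc1, loc2 = node
        return (gen("jumpifless", loc1, loc2, truelabel)
                + gen("jumpalways", falselabel))
    if node[0] == "and":
        _, b1, b2 = node
        mid = newlabel()                       # plays the role of Bool1.truelabeli
        return (bool_code(b1, mid, falselabel)
                + gen("label", mid)
                + bool_code(b2, truelabel, falselabel))
    raise ValueError(f"unknown boolean node {node[0]!r}")

# (a < b) and (c < d), with caller-supplied targets Ltrue and Lfalse
code = bool_code(("and", ("<", "a", "b"), ("<", "c", "d")), "Ltrue", "Lfalse")
print(code, end="")
# jumpifless a, b, L1
# jumpalways Lfalse
# label L1
# jumpifless c, d, Ltrue
# jumpalways Lfalse
```

If a < b is false, control leaves for Lfalse without the second comparison ever being executed, which is the short-circuit behaviour.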
Where Boolean expressions are found
• Grammar productions which have a Bool anywhere on the right-hand-side must provide values for its falselabeli and truelabeli attributes
• In translating While loops, two labels are needed anyway to handle a return-to-start and escape-to-end situation. A third label is needed if the boolean expression is to be given short-circuit treatment – language designer’s choice
• In translating If-Then-Else statements, two labels are needed anyway, to handle the jump to the start of the else-code, and to handle the completion of the then-code. Again a third label is needed, labelling the start of the then-code, if the boolean expression is to be given short-circuit treatment.
• In translating assignments to boolean variables, two labels are needed.
Translating a While Loop
Stmt0 ::= while   { Stmt0.label_one = newlabel();
                    Stmt0.label_two = newlabel();
                    Stmt0.label_three = newlabel();
                    Bool.truelabeli = Stmt0.label_two;
                    Bool.falselabeli = Stmt0.label_three }
          Bool    { }
          do      { }
          Stmt1   { Stmt0.code = gen(label, Stmt0.label_one) ||
                                 Bool.code ||
                                 gen(label, Stmt0.label_two) ||
                                 Stmt1.code ||
                                 gen(jumpalways, Stmt0.label_one) ||
                                 gen(label, Stmt0.label_three) }
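The while-loop translation can be sketched in Python as follows. The condition is passed in as a function that receives its two jump targets, standing in for the inherited attributes of Bool. The helper names, label naming and the sample loop are assumptions, and `jumpifless a, b, L` is assumed to jump to L when a < b.

```python
import itertools

_labels = itertools.count(1)

def newlabel():
    return f"L{next(_labels)}"

def gen(op, *operands):
    return f"{op} " + ", ".join(str(o) for o in operands) + "\n"

def while_code(cond_code, body_code):
    """cond_code(truelabel, falselabel) returns jumping code for the condition."""
    one, two, three = newlabel(), newlabel(), newlabel()
    return (gen("label", one)
            + cond_code(two, three)
            + gen("label", two)
            + body_code
            + gen("jumpalways", one)
            + gen("label", three))

def cond(truelabel, falselabel):          # i < n
    return gen("jumpifless", "i", "n", truelabel) + gen("jumpalways", falselabel)

# Body for: i := i + step
body = gen("add", "t1", "i", "step") + gen("load", "i", "t1")

code = while_code(cond, body)             # while i < n do i := i + step
print(code, end="")
# label L1
# jumpifless i, n, L2
# jumpalways L3
# label L2
# add t1, i, step
# load i, t1
# jumpalways L1
# label L3
```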