interm codegen
Intermediate Code Generation
Sarfaraz Masood
Asstt. Prof., Department of Computer Engineering
Jamia Millia University, New Delhi
CS 540 GMU Spring 2007 2
Compiler Architecture
[Figure: compiler pipeline. Source language → Scanner (lexical analysis) → tokens → Parser (syntax analysis) → syntactic structure → Semantic Analysis (IC generator) → intermediate code → Code Optimizer → intermediate code → Code Generator → target language. The Symbol Table is shared by the phases.]
Joey Paquet, 2000, 2002 3
Introduction to Code Generation
• Front end:
  – Lexical Analysis
  – Syntactic Analysis
  – Intermediate Code Generation
• Back end:
  – Intermediate Code Optimization
  – Object Code Generation
• The front end is machine-independent, i.e. it can be reused to build compilers for different architectures
• The back end is machine-dependent, i.e. these steps are related to the nature of the assembly or machine language of the target architecture
04/10/23 4
Introduction to Code Generation
[Figure: the Language-1 and Language-2 front ends translate their source programs into non-optimized intermediate code. A shared intermediate-code optimizer produces optimized intermediate code, which the Target-1 and Target-2 code generators turn into Target-1 and Target-2 machine code.]
Introduction to Code Generation
• After syntactic analysis, we have a number of options to choose from:
  – generate object code directly from the parse
  – generate intermediate code, and then generate object code from it
  – generate an intermediate abstract representation, and then generate code directly from it
  – generate an intermediate abstract representation, generate intermediate code, and then the object code
• All these options have one thing in common: they are all based on syntactic information gathered in the semantic analysis
Introduction to Code Generation
[Figure: four compiler organizations, split into Front End and Back End:
 (1) Lexical Analyzer → Syntactic Analyzer → Object Code
 (2) Lexical Analyzer → Syntactic Analyzer → Intermediate Representation → Object Code
 (3) Lexical Analyzer → Syntactic Analyzer → Intermediate Representation → Intermediate Code → Object Code
 (4) Lexical Analyzer → Syntactic Analyzer → Intermediate Code → Object Code]
Intermediate Representation (IR)
A kind of abstract machine language that can express the target machine operations without committing to too many machine details.
• Why IR?
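Without IR
[Figure: each source language (C, Pascal, FORTRAN, C++) is connected directly to each target (SPARC, HP PA, x86, IBM PPC), so every language/target pair needs its own compiler.]
With IR
[Figure: C, Pascal, FORTRAN, and C++ all translate to a single IR, and the IR is translated to SPARC, HP PA, x86, and IBM PPC.]
With IR
[Figure: C, Pascal, FORTRAN, and C++ → IR → Common Backend?]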
Advantages of Using an Intermediate Language
1. Retargeting - Build a compiler for a new machine by
attaching a new code generator to an existing front-end.
2. Optimization - reuse intermediate code optimizers in
compilers for different languages and different
machines.
Note: the terms “intermediate code”, “intermediate
language”, and “intermediate representation” are all
used interchangeably.
Issues in Designing an IR
• Whether to use an existing IR
  – if the target machine architecture is similar
  – if the new language is similar
• Whether the IR is appropriate for the kind of optimizations to be performed
  – e.g. speculation and predication
  – some transformations may take much longer than they would on a different IR
Issues in Designing an IR
Designing a new IR needs to consider
• Level (how machine dependent it is)
• Structure
• Expressiveness
• Appropriateness for general and special optimizations
• Appropriateness for code generation
• Whether multiple IRs should be used
What are the IRs in actual compilers?
• gcc is a widely used compiler on many platforms. It uses two IRs: AST (Abstract Syntax Tree) and RTL (Register Transfer Language), and some development paths are using Tree-SSA
[SSA: Static Single Assignment: each name is assigned once. We will talk about this later!]
• A VM can be seen as a new type of IR: Java Bytecode, .NET IL
Some programming languages have well-defined intermediate languages:
  – Java: Java Virtual Machine
  – Prolog: Warren Abstract Machine
In fact, there are byte-code emulators to execute instructions in these intermediate languages.
Intermediate Code Generation
• Direct Translation
  – Using an SDT scheme
  – Parse tree to Three-Address Instructions
  – Can be done while parsing, in a single pass
  – Needs to be able to deal with syntactic errors and recovery
• Indirect Translation
  – First validate the parse while constructing the AST
  – Uses an SDT scheme to build the AST
  – Traverse the AST and generate Three-Address Instructions
IntermediateCode Generation
[Figure: Parse tree → AST → Three-Address Instructions. Figure labels: IR, IR, O(n), ∞ regs.]
Syntax-directed definition to produce AST for assignment statements
production          semantic rules
S → id := E         S.nptr := mknode('assign', mkleaf(id, id.entry), E.nptr)
E → E1 + E2         E.nptr := mknode('+', E1.nptr, E2.nptr)
E → E1 * E2         E.nptr := mknode('*', E1.nptr, E2.nptr)
E → - E1            E.nptr := mkunode('uminus', E1.nptr)
E → ( E1 )          E.nptr := E1.nptr
E → id              E.nptr := mkleaf(id, id.entry)
1. Syntax Tree vs DAG
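[Figure: syntax tree for a := (-b + c*d) + c*d. The root assign has children a and +. That + has children + and c*d. The inner + has children uminus(b) and a second, separate c*d subtree.]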
• if mknode returns a pointer to an existing node whenever possible, a DAG can be produced
[Figure: (a) syntax tree and (b) DAG for a := (-b + c*d) + c*d. In the DAG, the two occurrences of c*d share a single node.]
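The idea that mknode returns an existing node whenever possible can be sketched with a lookup table keyed by a node's operator and children. This is a minimal illustration, not the slides' implementation: the `DagBuilder` class and its `_intern` helper are hypothetical names, and the sample expression assumes an assignment with a repeated c*d subexpression, as in the figure.

```python
class DagBuilder:
    """Builds expression nodes, reusing an existing node when one matches."""

    def __init__(self):
        self.nodes = []   # node id -> signature, e.g. ("+", 4, 2) or ("id", "c")
        self.index = {}   # signature -> node id

    def _intern(self, signature):
        # Return the existing node for this signature, or create a new one.
        if signature not in self.index:
            self.index[signature] = len(self.nodes)
            self.nodes.append(signature)
        return self.index[signature]

    def mkleaf(self, name):
        return self._intern(("id", name))

    def mknode(self, op, left, right):
        return self._intern((op, left, right))

    def mkunode(self, op, child):
        return self._intern((op, child))


# a := (-b + c*d) + c*d
g = DagBuilder()
first = g.mknode("*", g.mkleaf("c"), g.mkleaf("d"))
inner = g.mknode("+", g.mkunode("uminus", g.mkleaf("b")), first)
second = g.mknode("*", g.mkleaf("c"), g.mkleaf("d"))
root = g.mknode("assign", g.mkleaf("a"), g.mknode("+", inner, second))
shared = first == second      # True: both c*d requests return the same node
node_count = len(g.nodes)     # 9 DAG nodes; the plain syntax tree has 12
```

Dropping the lookup in `_intern` (always appending) turns the same builder into an ordinary syntax-tree builder, which is why the two structures can share one set of semantic rules.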
2. Postfix Notation
A mathematical notation wherein every operator follows all of its operands.
Form Rules:
1. If E is a variable/constant, the PN of E is E itself.
2. If E is an expression of the form E1 op E2, the PN of E is E1' E2' op (E1' and E2' are the PN of E1 and E2, respectively).
3. If E is a parenthesized expression of the form (E1), the PN of E is the same as the PN of E1.
The PN of the expression 9 * (5 + 2) is 9 5 2 + *
How about (a+b)/(c-d)? ab+cd-/
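The three form rules map directly onto a recursive walk over an expression tree. The sketch below assumes a hypothetical tree encoding (a string for a variable or constant, a tuple `(op, left, right)` for a binary expression). It is only meant to illustrate the rules.

```python
def to_postfix(expr):
    """Return the postfix form of an expression tree.

    A tree is either a string (variable or constant) or a tuple
    (op, left, right) standing for the expression `left op right`.
    """
    if isinstance(expr, str):      # rule 1: a leaf is its own postfix form
        return expr
    op, left, right = expr         # rule 2: E1' E2' op
    return to_postfix(left) + to_postfix(right) + op


# Rule 3: parentheses need no node of their own, they only shape the tree.
nine_times = to_postfix(("*", "9", ("+", "5", "2")))
quotient = to_postfix(("/", ("+", "a", "b"), ("-", "c", "d")))
print(nine_times)   # 952+*
print(quotient)     # ab+cd-/
```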
Intermediate-Code Generation 20
3. Static Single-Assignment Form
• Static single assignment form (SSA) is an intermediate representation that facilitates certain code optimization.
• Two distinct aspects distinguish SSA from three-address code.
  – All assignments in SSA are to variables with distinct names; hence the term static single-assignment.
  – SSA uses a φ-function to combine definitions of the same variable that arrive along different control-flow paths.
Three-address code: if (flag) x = -1; else x = 1; y = x * a;
SSA form: if (flag) x1 = -1; else x2 = 1; x3 = φ(x1, x2); y = x3 * a;
4. Three Address Instructions IR
• Constructs mapped to Three-Address Instructions
  – Register-based IR for expression evaluation
  – Infinite number of virtual registers
  – Still independent of target architecture
• Generic Statement Format: Label: x = y op z or if exp goto L
  – Statements can have symbolic labels
  – Compiler inserts temporary variables
  – Types and conversions are dealt with in other phases of the code generation
Types of Three-address Statements
• Assignment
  – Binary: x := y op z
  – Unary: x := op y
  – “op” can be any reasonable arithmetic or logic operator.
• Copy
  – Simple: x := y
  – Indexed: x := y[i] or x[i] := y
  – Address and pointer manipulation:
    • x := &y
    • x := *y
    • *x := y
• Jump
  – Unconditional: goto L
  – Conditional: if x relop y goto L1 [else goto L2], where relop is <, =, >, ≥, ≤, or ≠.
• Procedure call
  – Call procedure P(X1, X2, . . . , Xn)
    PARAM X1
    PARAM X2
    ...
    PARAM Xn
    CALL P, n
implementations of three-address statements
• Common implementations:
  – Quadruples
  – Triples
  – Indirect triples
Consider the code: a := b * -c + b * -c
Quadruples
• A quadruple is a record structure with four fields: op, arg1, arg2, and result
  – The op field contains an internal code for an operator
  – Statements with unary operators do not use arg2
  – Operators like param use neither arg2 nor result
  – The target label for conditional and unconditional jumps is in result
• The contents of fields arg1, arg2, and result are typically pointers to symbol table entries
  – If so, temporaries must be entered into the symbol table as they are created
  – Obviously, constants need to be handled differently
Quadruples Example
op arg1 arg2 result
(0) uminus c t1
(1) * b t1 t2
(2) uminus c t3
(3) * b t3 t4
(4) + t2 t4 t5
(5) := t5 a
a := b * -c + b * -c
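As a rough illustration of how such a table could be produced, the sketch below walks an expression tree and appends one quadruple per operator, then rewrites the result as triples by replacing each temporary with the position of the statement that computes it. The tuple encoding of the tree and the function names `quadruples` and `triples` are assumptions made for this example. Empty fields are represented by `None`.

```python
def quadruples(target, expr):
    """Translate `target := expr` into a list of (op, arg1, arg2, result)."""
    quads = []

    def walk(node):
        if isinstance(node, str):          # a name is its own place
            return node
        op, *operands = node
        args = [walk(child) for child in operands]
        arg2 = args[1] if len(args) == 2 else None   # unary ops leave arg2 empty
        result = f"t{len(quads) + 1}"      # fresh temporary for this instruction
        quads.append((op, args[0], arg2, result))
        return result

    quads.append((":=", walk(expr), None, target))
    return quads


def triples(quads):
    """Rewrite quadruples as triples: temporaries become statement positions."""
    position = {}                          # temporary -> index of defining triple
    out = []
    for i, (op, arg1, arg2, result) in enumerate(quads):
        a1 = position.get(arg1, arg1)
        a2 = position.get(arg2, arg2)
        if op == ":=":
            out.append(("assign", result, a1))
        else:
            out.append((op, a1, a2))
            position[result] = i
    return out


term = ("*", "b", ("uminus", "c"))
quads = quadruples("a", ("+", term, term))   # a := b * -c + b * -c
trips = triples(quads)
```

In `trips`, an integer argument such as `1` stands for the reference (1) to an earlier triple, while a string is a programmer-defined name. This is one simple way to keep the two kinds of argument apart.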
Triples
• Triples refer to a temporary value by the position of the statement that computes it
  – Statements can be represented by a record with only three fields: op, arg1, and arg2
  – Avoids the need to enter temporary names into the symbol table
• Contents of arg1 and arg2:
  – Pointer into symbol table (for programmer-defined names)
  – Pointer into triple structure (for temporaries)
  – Of course, still need to handle constants differently
Triples Example
op arg1 arg2
(0) uminus c
(1) * b (0)
(2) uminus c
(3) * b (2)
(4) + (1) (3)
(5) assign a (4)
Result is implicit in triples
a := b * -c + b * -c
An indexed assignment requires two triples: x[i] := y
op arg1 arg2
(0) []= x i
(1) := (0) y
Indirect triples
• Indirect triples add a list of pointers to triples, so that triples can be shared and moved easily
op arg1 arg2
(14) uminus c
(15) * b (14)
(16) uminus c
(17) * b (16)
(18) + (15) (17)
(19) assign a (18)
statement
(0) (14)
(1) (15)
(2) (16)
(3) (17)
(4) (18)
(5) (19)
a := b * -c + b * -c
syntax-directed translation into three-address code
production          semantic rules
S → id := E         S.code := E.code ‖ gen(id.place ':=' E.place)
E → E1 + E2         E.place := newtemp;
                    E.code := E1.code ‖ E2.code ‖ gen(E.place ':=' E1.place '+' E2.place)
E → E1 * E2         ...
E → - E1            E.place := newtemp;
                    E.code := E1.code ‖ gen(E.place ':=' 'uminus' E1.place)
E → ( E1 )          E.place := E1.place; E.code := E1.code
E → id              E.place := id.place; E.code := ''
syntax-directed translation into three-address code
production          semantic rules
S → while E do S1   S.begin := newlabel;
                    S.after := newlabel;
                    S.code := gen(S.begin ':') ‖
                              E.code ‖
                              gen('if' E.place '=' '0' 'goto' S.after) ‖
                              S1.code ‖
                              gen('goto' S.begin) ‖
                              gen(S.after ':')
Declarations
• enter symbols in a symbol table
• allocate space and record it in the symbol table
• emit appropriate code
Declarations in a procedure
• computing types and relative addresses of names
P → {offset := 0} D
D → D ; D
D → id : T               {enter(id.name, T.type, offset);
                          offset := offset + T.width}
T → integer              {T.type := integer; T.width := 4}
T → real                 {T.type := real; T.width := 8}
T → array [ num ] of T1  {T.type := array(num.val, T1.type);
                          T.width := num.val * T1.width}
T → ↑ T1                 {T.type := pointer(T1.type); T.width := 4}
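The offset bookkeeping in these rules can be sketched as a loop that records each name at the current offset and then advances the offset by the width of its type. The type encoding (strings for basic types, tuples for array and pointer types) and the function names `width` and `layout` are assumptions made for illustration. The widths 4 and 8 are the ones used in the rules above.

```python
def width(t):
    """Width in bytes of a type, using the sizes from the rules above."""
    if t == "integer":
        return 4
    if t == "real":
        return 8
    kind = t[0]
    if kind == "array":            # ("array", count, element_type)
        return t[1] * width(t[2])
    if kind == "pointer":          # ("pointer", target_type)
        return 4
    raise ValueError(f"unknown type: {t!r}")


def layout(declarations):
    """Assign each declared name a relative address, in declaration order."""
    table = {}
    offset = 0                     # P -> {offset := 0} D
    for name, t in declarations:
        table[name] = (t, offset)  # enter(id.name, T.type, offset)
        offset += width(t)         # offset := offset + T.width
    return table, offset


table, total = layout([
    ("i", "integer"),
    ("x", "real"),
    ("a", ("array", 10, "integer")),
    ("p", ("pointer", "real")),
])
offsets = {name: entry[1] for name, entry in table.items()}
```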
Syntax-Directed Translation to Three Address Code
• Attributes for the Non-Terminals, say E and S
  – Location of the value of an expression: E.place
  – The code that evaluates the expression or statement: E.code
  – Markers for the beginning and end of sections of the code: S.begin, S.end
• Semantic Actions in Productions of the Grammar
  – Functions to create temporaries (newtemp) and labels (newlabel)
  – Auxiliary functions to enter symbols and consult types corresponding to declarations in a side data structure that can be built as the code is being parsed: a symbol table
  – To generate the code we use the emit function gen, which creates a list of instructions to be emitted later and can generate symbolic labels corresponding to the next instruction of a list
  – Use of an append function on lists of instructions
  – Synthesized and Inherited Attributes
Assignment Statements: Grammar and Actions
S → id = E    { p = lookup(id.name);
                if (p != NULL)
                  S.code = append(E.code, gen(p '=' E.place));
                else {
                  error;
                  S.code = nulllist;
                }
              }
E → E1 + E2   { E.place = newtemp();
                E.code = append(E1.code, E2.code, gen(E.place '=' E1.place '+' E2.place)); }
E → E1 * E2   { E.place = newtemp();
                E.code = append(E1.code, E2.code, gen(E.place '=' E1.place '*' E2.place)); }
E → - E1      { E.place = newtemp();
                E.code = append(E1.code, gen(E.place '=' '-' E1.place)); }
E → ( E1 )    { E.place = E1.place; E.code = E1.code; }
E → id        { p = lookup(id.name);
                if (p != NULL)
                  E.place = p;
                else
                  error;
                E.code = nulllist;
              }
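Assignment: Example
x = a * b + c * d - e * f;
[Figure: parse tree. S → id = E with id = x. The root E is E + E: its left operand is a * b and its right operand is E - E, with operands c * d and e * f. Every operand of * is an E that derives an id.]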
Assignment: Example
x = a * b + c * d - e * f;
Production:
E → id        { p = lookup(id.name);
                if (p != NULL)
                  E.place = p;
                else
                  error;
                E.code = nulllist;
              }
[Figure: the same parse tree, annotated as the production is applied to the leaves e and f.]
e: place = loc(e), code = null
f: place = loc(f), code = null
Assignment: Example
x = a * b + c * d - e * f;
Production:
E → E1 * E2   { E.place = newtemp();
                E.code = append(E1.code, E2.code, gen(E.place '=' E1.place '*' E2.place)); }
[Figure: the same parse tree, annotated as the production is applied to e * f and then to c * d.]
e: place = loc(e), code = null
f: place = loc(f), code = null
e * f: place = loc(t1), code = {t1 = e * f;}
c: place = loc(c), code = null
d: place = loc(d), code = null
c * d: place = loc(t2), code = {t2 = c * d;}
Assignment: Example
x = a * b + c * d - e * f;
Production:
S → id = E    { p = lookup(id.name);
                if (p != NULL)
                  S.code = append(E.code, gen(p '=' E.place));
                else
                  error;
              }
[Figure: the fully annotated parse tree.]
e * f: place = loc(t1), code = {t1 = e * f;}
c * d: place = loc(t2), code = {t2 = c * d;}
c * d - e * f: place = loc(t3), code = {t1 = e * f; t2 = c * d; t3 = t2 - t1;}
a * b: place = loc(t4), code = {t4 = a * b;}
a * b + c * d - e * f: place = loc(t5), code = {t1 = e * f; t2 = c * d; t3 = t2 - t1; t4 = a * b; t5 = t4 + t3;}
x: place = loc(x), code = null
S: code = {t1 = e * f; t2 = c * d; t3 = t2 - t1; t4 = a * b; t5 = t4 + t3; x = t5;}
Assignment: Example
x = a * b + c * d - e * f;
[Figure: parse tree for the assignment. The code generated for S is:]
t1 = e * f;
t2 = c * d;
t3 = t2 - t1;
t4 = a * b;
t5 = t4 + t3;
x = t5;
Reusing Temporary Variables
• Temporary Variables
  – Short lived
  – Used for evaluation of expressions
  – Clutter the symbol table
• Change the newtemp Function
  – Keep track of when the value created in a temporary is used
  – Use a counter to keep track of the number of active temps
  – When a temporary is used in an expression, decrement the counter
  – When a temporary is generated by newtemp, increment the counter
  – Initialize the counter to zero
Assignment: Example
• Only 2 Registers Needed
x = a * b + c * d - e * f;
[Figure: parse tree for the assignment. With the counter c shown in the comments, the generated code is:]
// c = 0
t1 = e * f; // c = 1
t2 = c * d; // c = 2
t1 = t2 - t1; // c = 1
t2 = a * b; // c = 2
t1 = t2 + t1; // c = 1
x = t1; // c = 0
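The counter scheme can be sketched as follows. This is an illustrative reading of the slide, not a definitive implementation: the tuple encoding of the expression and the function name `translate` are hypothetical, and the right operand is translated before the left one purely as an assumption that reproduces the instruction order shown above.

```python
def translate(target, expr):
    """Emit three-address code for `target = expr`, recycling temporaries."""
    code = []
    active = 0                          # the counter c from the slide

    def walk(node):
        """Return (place, is_temp) for a subexpression."""
        nonlocal active
        if isinstance(node, str):
            return node, False          # a program variable, not a temporary
        op, left, right = node
        # Assumption: translate the right operand first (matches the slide).
        r_place, r_temp = walk(right)
        l_place, l_temp = walk(left)
        active -= int(l_temp) + int(r_temp)   # temp operands are used here
        active += 1                           # newtemp
        place = f"t{active}"
        code.append(f"{place} = {l_place} {op} {r_place}")
        return place, True

    place, was_temp = walk(expr)
    active -= int(was_temp)             # the final copy uses the last temp
    code.append(f"{target} = {place}")
    return code


expr = ("+", ("*", "a", "b"), ("-", ("*", "c", "d"), ("*", "e", "f")))
code = translate("x", expr)
print("\n".join(code))
```

Because temporaries are created and consumed in stack order, naming a new temporary after the current counter value never overwrites a value that is still needed.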
Boolean & Relational Values
How should the compiler represent them?
• Answer depends on the target machine
Two classic approaches
• Numerical representation
• Positional (implicit) representation
Correct choice depends on both context and ISA
• Issue: Control Flow Introduces Complications
  – In both representations
  – Need to know the address to jump to in some cases
• Solution: Two Additional Attributes
  – nextstat (Inherited): indicates the next location to be generated
  – laststat (Synthesized): indicates the last location filled
  – As code is generated, the attributes are filled with the correct value
SDT Scheme for Boolean Expressions
Boolean Expression: Grammar and Actions
E → false       { E.place = newtemp()
                  E.code = gen(E.place = 0)
                  E.laststat = E.nextstat + 1 }
E → true        { E.place = newtemp()
                  E.code = gen(E.place = 1)
                  E.laststat = E.nextstat + 1 }
E → ( E1 )      { E.place = E1.place
                  E.code = E1.code
                  E1.nextstat = E.nextstat
                  E.laststat = E1.laststat }
E → not E1      { E.place = newtemp()
                  E.code = append(E1.code, gen(E.place = not E1.place))
                  E1.nextstat = E.nextstat
                  E.laststat = E1.laststat + 1 }
E → E1 or E2    { E.place = newtemp()
                  E.code = append(E1.code, E2.code, gen(E.place = E1.place or E2.place))
                  E1.nextstat = E.nextstat
                  E2.nextstat = E1.laststat
                  E.laststat = E2.laststat + 1 }
E → E1 and E2   { E.place = newtemp()
                  E.code = append(E1.code, E2.code, gen(E.place = E1.place and E2.place))
                  E1.nextstat = E.nextstat
                  E2.nextstat = E1.laststat
                  E.laststat = E2.laststat + 1 }
E → id1 relop id2
                { E.place = newtemp()
                  E.code = gen(if id1.place relop id2.place goto E.nextstat+3)
                  E.code = append(E.code, gen(E.place = 0))
                  E.code = append(E.code, gen(goto E.nextstat+4))
                  E.code = append(E.code, gen(E.place = 1))
                  E.laststat = E.nextstat + 4 }
Boolean Expressions: Example
a < b or c < d and e < f
00: if a < b goto 03
01: t1 = 0
02: goto 04
03: t1 = 1
04: if c < d goto 07
05: t2 = 0
06: goto 08
07: t2 = 1
08: if e < f goto 11
09: t3 = 0
10: goto 12
11: t3 = 1
12: t4 = t2 and t3
13: t5 = t1 or t4
[Figure: parse tree. The root E is E or E: its left operand is a < b and its right operand is E and E, with operands c < d and e < f. Each comparison is derived by E → id relop id.]
Control Flow Statements: Code Layout
S → if E then S1
          E.code        (jumps to E.true or to E.false)
E.true:   S1.code
E.false:  . . .
S → if E then S1 else S2
          E.code        (jumps to E.true or to E.false)
E.true:   S1.code
          goto S.next
E.false:  S2.code
S.next:   . . .
• Attributes:
  – E.true: the label to which control flows if E is true
  – E.false: the label to which control flows if E is false
  – S.next: an inherited attribute with the symbolic label of the code following S
Control Flow Statements: Code Layout
S → while E do S1
S.begin:  E.code        (jumps to E.true or to E.false)
E.true:   S1.code
          goto S.begin
E.false:  . . .
• Difficulty: Need to know where to jump to
  – Introduce symbolic labels using the newlabel function
  – Use inherited attributes
  – Backpatch them later with the actual value (later…)
Control Flow Statements: Grammar and Actions
S → if E then S1
  { E.true = newlabel
    E.false = S.next
    S1.next = S.next
    S.code = append(E.code, gen(E.true:), S1.code) }
S → if E then S1 else S2
  { E.true = newlabel
    E.false = newlabel
    S1.next = S.next
    S2.next = S.next
    S.code = append(E.code, gen(E.true:), S1.code, gen(goto S.next), gen(E.false:), S2.code) }
S → while E do S1
  { S.begin = newlabel
    E.true = newlabel
    E.false = S.next
    S1.next = S.begin
    S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin)) }
Control Flow Translation of Boolean Expressions
• Short-Circuit Evaluation
  – No need to evaluate portions of the expression if the outcome is already determined
  – Examples:
    • E1 or E2: need not evaluate E2 if E1 is known to be true.
    • E1 and E2: need not evaluate E2 if E1 is known to be false.
• Use Control Flow
  – Jump over code that evaluates boolean terms of the expression
  – Use the inherited E.false and E.true attributes and link the evaluation of E
Control Flow Translation of Boolean Expressions
E → E1 or E2
  { E1.true = E.true
    E1.false = newlabel
    E2.true = E.true
    E2.false = E.false
    E.code = append(E1.code, gen(E1.false:), E2.code) }
E → E1 and E2
  { E1.false = E.false
    E1.true = newlabel
    E2.true = E.true
    E2.false = E.false
    E.code = append(E1.code, gen(E1.true:), E2.code) }
E → id1 relop id2
  { E.code = append(gen(if id1.place relop id2.place goto E.true), gen(goto E.false)) }
E → true    { E.code = gen(goto E.true) }
E → false   { E.code = gen(goto E.false) }
E → not E1  { E1.true = E.false; E1.false = E.true; E.code = E1.code }
E → ( E1 )  { E1.true = E.true; E1.false = E.false; E.code = E1.code }
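These rules pass E.true and E.false downward, so they can be sketched as a recursive function that takes the two target labels as parameters. The tuple encoding of expressions and the names `short_circuit`, `on_true`, and `on_false` are assumptions made for this example. Labels are generated as L1, L2, ... in the order newlabel is called.

```python
import itertools


def short_circuit(expr, true_label, false_label):
    """Emit jumping code for a boolean expression tree.

    expr is ("or", e1, e2), ("and", e1, e2), ("not", e1),
    or a comparison (relop, id1, id2) such as ("<", "a", "b").
    """
    fresh = (f"L{n}" for n in itertools.count(1))   # newlabel

    def gen(e, on_true, on_false):
        op = e[0]
        if op == "or":
            mid = next(fresh)                       # E1.false
            return gen(e[1], on_true, mid) + [f"{mid}:"] + gen(e[2], on_true, on_false)
        if op == "and":
            mid = next(fresh)                       # E1.true
            return gen(e[1], mid, on_false) + [f"{mid}:"] + gen(e[2], on_true, on_false)
        if op == "not":
            return gen(e[1], on_false, on_true)     # swap the two targets
        return [f"if {e[1]} {op} {e[2]} goto {on_true}", f"goto {on_false}"]

    return gen(expr, true_label, false_label)


# a < b or c < d and e < f
expr = ("or", ("<", "a", "b"), ("and", ("<", "c", "d"), ("<", "e", "f")))
jumps = short_circuit(expr, "Ltrue", "Lfalse")
print("\n".join(jumps))
```

Note that `not` emits no instructions of its own: it only exchanges the labels handed to its operand.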
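Boolean Expression: Short Circuit Evaluation
a < b or c < d and e < f
[Figure: parse tree. The root E is E or E: its left operand is a < b and its right operand is E and E, with operands c < d and e < f. Each comparison is derived by E → id relop id.]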
Boolean Expression: Short Circuit Evaluation
a < b or c < d and e < f
E → id1 relop id2 ‖ E.code = append(gen(if id1.place relop id2.place goto E.true), gen(goto E.false))
E → E1 or E2 ‖ E1.true = E.true; E1.false = newlabel; E2.true = E.true; E2.false = E.false; E.code = append(E1.code, gen(E1.false:), E2.code)
[Figure: the parse tree annotated with inherited labels.]
a < b or c < d and e < f: E.true = Ltrue, E.false = Lfalse
a < b: E1.true = Ltrue, E1.false = L1
c < d and e < f: E2.true = Ltrue, E2.false = Lfalse
Code so far:
    if a < b goto Ltrue
    goto L1
L1:
Boolean Expression: Short Circuit Evaluation
a < b or c < d and e < f
E → id1 relop id2 ‖ E.code = append(gen(if id1.place relop id2.place goto E.true), gen(goto E.false))
E → E1 and E2 ‖ E1.false = E.false; E1.true = newlabel; E2.true = E.true; E2.false = E.false; E.code = append(E1.code, gen(E1.true:), E2.code)
[Figure: the parse tree annotated with inherited labels.]
a < b or c < d and e < f: E.true = Ltrue, E.false = Lfalse
a < b: E1.true = Ltrue, E1.false = L1
c < d and e < f: E2.true = Ltrue, E2.false = Lfalse
c < d: E1.true = L2, E1.false = Lfalse
e < f: E2.true = Ltrue, E2.false = Lfalse
Code so far:
    if a < b goto Ltrue
    goto L1
L1: if c < d goto L2
    goto Lfalse
L2: if e < f goto Ltrue
    goto Lfalse
Boolean Expression: Short Circuit Evaluation
a < b or c < d and e < f
Final code:
    if a < b goto Ltrue
    goto L1
L1: if c < d goto L2
    goto Lfalse
L2: if e < f goto Ltrue
    goto Lfalse
Combining Boolean and Control Flow Statements
while a < b do
  if c < d then
    x = y + z
  else
    x = y - z
S → while E do S1 ‖ S.begin = newlabel; E.true = newlabel; E.false = S.next; S1.next = S.begin; S.code = append(gen(S.begin:), E.code, gen(E.true:), S1.code, gen(goto S.begin))
[Figure: parse tree. The outer S is while E do S with E = a < b. Its body S is if E then S else S with E = c < d.]
Combining Boolean and Control Flow Statements
Applying S → while E do S1 to the outer statement:
S.next = Lnext
S.begin = L1
E.true = L2
E.false = Lnext
S1.next = L1
Code so far:
L1: if a < b goto L2
    goto Lnext
L2:
Combining Boolean and Control Flow Statements
Applying S → if E then S1 else S2 to the loop body (its S.next is L1):
E.true = L3
E.false = L4
S1.next = L1
S2.next = L1
Code so far:
L1: if a < b goto L2
    goto Lnext
L2: if c < d goto L3
    goto L4
L3: t1 = y + z
    x = t1
    goto L1
L4:
Combining Boolean and Control Flow Statements
After the else branch and the closing goto S.begin of the while loop:
L1: if a < b goto L2
    goto Lnext
L2: if c < d goto L3
    goto L4
L3: t1 = y + z
    x = t1
    goto L1
L4: t2 = y - z
    x = t2
    goto L1
Lnext:
Combining Boolean and Control Flow Statements
while a < b do
  if c < d then
    x = y + z
  else
    x = y - z
Final code:
L1: if a < b goto L2
    goto Lnext
L2: if c < d goto L3
    goto L4
L3: t1 = y + z
    x = t1
    goto L1
L4: t2 = y - z
    x = t2
    goto L1
Lnext:
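The walkthrough can be reproduced with a small recursive translator in which S.next is passed down as a parameter. The sketch below is illustrative only: the tuple encoding of statements, the function name `translate`, and the restriction of conditions to a single comparison are all assumptions made to keep the example short.

```python
import itertools


def translate(stmt, next_label="Lnext"):
    """Emit labelled three-address code for a statement tree.

    Statements: ("while", cond, body), ("if", cond, then_s, else_s),
    ("assign", target, (op, left, right)). Conditions: (relop, id1, id2).
    """
    labels = (f"L{n}" for n in itertools.count(1))   # newlabel
    temps = (f"t{n}" for n in itertools.count(1))    # newtemp

    def cond(e, on_true, on_false):
        relop, left, right = e
        return [f"if {left} {relop} {right} goto {on_true}", f"goto {on_false}"]

    def gen(s, nxt):
        kind = s[0]
        if kind == "while":
            begin, body = next(labels), next(labels)   # S.begin, E.true
            return ([f"{begin}:"] + cond(s[1], body, nxt) + [f"{body}:"]
                    + gen(s[2], begin) + [f"goto {begin}"])
        if kind == "if":
            then, other = next(labels), next(labels)   # E.true, E.false
            return (cond(s[1], then, other) + [f"{then}:"] + gen(s[2], nxt)
                    + [f"goto {nxt}", f"{other}:"] + gen(s[3], nxt))
        target, (op, left, right) = s[1], s[2]         # assignment
        t = next(temps)
        return [f"{t} = {left} {op} {right}", f"{target} = {t}"]

    return gen(stmt, next_label) + [f"{next_label}:"]


program = ("while", ("<", "a", "b"),
           ("if", ("<", "c", "d"),
            ("assign", "x", ("+", "y", "z")),
            ("assign", "x", ("-", "y", "z"))))
listing = translate(program)
print("\n".join(listing))
```

Each label is emitted on its own line here. Apart from that formatting choice, the listing has the same instructions and labels as the final code above.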
Loop Constructs
Loops
• Evaluate condition before loop (if needed)
• Evaluate condition after loop
• Branch back to the top (if needed)
Why this structure?
• Merges test with last block of loop body
• Pre-test block to hold loop-invariant code
• Post-test for increment instructions and test
while, for, do, & until all fit this basic model
[Figure: control-flow graph with the blocks Pre-test, Loop head, B1, B2, Post-test, and Next block.]
Break & Skip Statements
Many modern programming languages include a break
• Exits from the innermost control-flow statement
  – Out of the innermost loop
  – Out of a case statement
Translates into a jump
• Targets statement outside control-flow construct
• Creates multiple-exit construct
• skip in loop goes to next iteration
Only makes sense if loop has > 1 block
[Figure: the same control-flow graph (Pre-test, Loop head, B1, B2, Post-test, Next block), with a break in B1 and a skip in B2.]
Break and Skip Statements
• Need to keep track of enclosing control-flow constructs
• Harder to have a clean SDT scheme…
  – Keep a stack of control-flow constructs
  – Use the S.next on the stack as the target for the break statement
  – For skip statements, need to keep track of the label of the code of the post-test block to advance to the next iteration. This is harder since the code has not been generated yet.
• Backpatching helps
  – Use a breaklist and a skiplist to be patched later.
Backpatching
• Single Pass Solution to Code Generation?
  – No more symbolic labels: symbolic addresses instead
  – Emit code directly into an array of instructions
  – Actions associated with productions
  – Executed when the bottom-up parser “reduces” a production
• Problem
  – Need to know the labels for target branches before actually generating the code for them.
• Solution
  – Leave branches undefined and patch them later
  – Requires carrying around a list of the places that need to be patched until the value to be patched with is known.
Boolean Expressions Revisited
• Use an additional ε-production
  – Just a marker M
  – Label value M.addr
• Attributes:
  – E.truelist: code places that need to be filled in, corresponding to the evaluation of E as “true”
  – E.falselist: same for “false”
(1) E → E1 or M E2
(2)   | E1 and M E2
(3)   | not E1
(4)   | ( E1 )
(5)   | id1 relop id2
(6)   | true
(7)   | false
(8) M → ε
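Boolean Expressions: Code Outline
[Figure: for E1 and E2, E1.code is followed by E2.code. The true exit of E1 leads into E2.code, while the false exit of E1 and both exits of E2 are still unknown (?). For E1 or E2, the false exit of E1 leads into E2.code, while the true exit of E1 and both exits of E2 are still unknown (?).]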
Action
(8) M → ε            { M.addr := nextAddr; }
(1) E → E1 or M E2   { backpatch(E1.falselist, M.addr);
                       E.truelist := merge(E1.truelist, E2.truelist);
                       E.falselist := E2.falselist; }
(2) E → E1 and M E2  { backpatch(E1.truelist, M.addr);
                       E.truelist := E2.truelist;
                       E.falselist := merge(E1.falselist, E2.falselist); }
More Actions
(3) E → not E1       { E.truelist := E1.falselist;
                       E.falselist := E1.truelist; }
(4) E → ( E1 )       { E.truelist := E1.truelist;
                       E.falselist := E1.falselist; }
(6) E → true         { E.truelist := makelist(nextquad); emit('goto _'); }
(7) E → false        { E.falselist := makelist(nextquad); emit('goto _'); }
Backpatching Example
a < b or c < d and e < f
Executing Action / Generated Code
E → id1 relop id2
  { E.truelist := makelist(nextquad());
    E.falselist := makelist(nextquad() + 1);
    emit(“if id1.place relop.op id2.place goto _”);
    emit(“goto _”); }
M → ε
  { M.quad = nextquad(); }
1. Reduce a < b: E.truelist = {100}, E.falselist = {101}
   100: if a < b goto _
   101: goto _
2. Reduce M (after or): M.quad = 102
3. Reduce c < d: E.truelist = {102}, E.falselist = {103}
   102: if c < d goto _
   103: goto _
4. Reduce M (after and): M.quad = 104
5. Reduce e < f: E.truelist = {104}, E.falselist = {105}
   104: if e < f goto _
   105: goto _
6. Reduce E → E1 and M E2
   { backpatch(E1.truelist, M.quad);
     E.truelist := E2.truelist;
     E.falselist := merge(E1.falselist, E2.falselist); }
   102: if c < d goto 104
   103: goto _
   E.truelist = {104}, E.falselist = {103, 105}
7. Reduce E → E1 or M E2
   { backpatch(E1.falselist, M.quad);
     E.truelist := merge(E1.truelist, E2.truelist);
     E.falselist := E2.falselist; }
   100: if a < b goto _
   101: goto 102
   E.truelist = {100, 104}, E.falselist = {103, 105}
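The example can be replayed with a small instruction array whose jump targets are filled in after the fact. This is a sketch under stated assumptions: the `Emitter` class and its `relop` helper are hypothetical names, merge is modeled as list concatenation, and the reductions are written out by hand in the order a bottom-up parser would perform them.

```python
class Emitter:
    """Instruction array whose jump targets can be filled in later."""

    def __init__(self, start=100):
        self.start = start
        self.code = []

    def nextquad(self):
        return self.start + len(self.code)

    def emit(self, text):
        self.code.append(text)

    def backpatch(self, places, target):
        # Fill the `_` placeholder of every listed jump with `target`.
        for addr in places:
            i = addr - self.start
            self.code[i] = self.code[i].replace("goto _", f"goto {target}")

    def relop(self, left, op, right):
        """E -> id1 relop id2: returns (truelist, falselist)."""
        truelist, falselist = [self.nextquad()], [self.nextquad() + 1]
        self.emit(f"if {left} {op} {right} goto _")
        self.emit("goto _")
        return truelist, falselist


# a < b or c < d and e < f
e = Emitter()
t1, f1 = e.relop("a", "<", "b")     # quads 100, 101
m_or = e.nextquad()                 # marker M after `or`
t2, f2 = e.relop("c", "<", "d")     # quads 102, 103
m_and = e.nextquad()                # marker M after `and`
t3, f3 = e.relop("e", "<", "f")     # quads 104, 105

# E -> E1 and M E2
e.backpatch(t2, m_and)
and_true, and_false = t3, f2 + f3

# E -> E1 or M E2
e.backpatch(f1, m_or)
truelist, falselist = t1 + and_true, and_false
```

The jumps still marked `_` at the end are exactly those on `truelist` and `falselist`. They stay open until an enclosing construct knows where true and false should lead.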
Control Flow Code Structures
if E then S1
          . . .
          E.code
E.true:   S1.code
E.false:  . . .
if E then S1 else S2
          . . .
          E.code
E.true:   S1.code
          goto S.next
E.false:  S2.code
S.next:   . . .
while E do S1
          . . .
S.begin:  E.code
E.true:   S1.code
          goto S.begin
E.false:  . . .