Professor Yihjia Tsai, Tamkang University: Abstract Syntax Tree
Post on 21-Dec-2015
2
Abstract Syntax Trees
• So far a parser traces the derivation of a sequence of tokens
• The rest of the compiler needs a structural representation of the program
• Abstract syntax trees
  – Like parse trees but ignore some details
  – Abbreviated as AST
3
Abstract Syntax Tree. (Cont.)
• Consider the grammar E -> int | ( E ) | E + E
• And the string 5 + (2 + 3)
• After lexical analysis (a list of tokens): int5 '+' '(' int2 '+' int3 ')'
• During parsing we build a parse tree …
4
Example of Abstract Syntax Tree
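• Also captures the nesting structure
• But abstracts from the concrete syntax
  => more compact and easier to use
• An important data structure in a compiler
[Figure: AST for 5 + (2 + 3): a PLUS node whose children are the leaf 5 and a second PLUS node with leaves 2 and 3]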
5
Example of Parse Tree
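[Figure: parse tree for 5 + (2 + 3). The root E has children E, '+', E. The left E derives int5. The right E derives ( E ), and that inner E has children E, '+', E deriving int2 and int3.]
• Traces the operation of the parser
• Does capture the nesting structure
• But too much info
  – Parentheses
  – Single-successor nodes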
6
Semantic Actions
• This is what we’ll use to construct ASTs
• Each grammar symbol may have attributes
  – For terminal symbols (lexical tokens) attributes can be calculated by the lexer
• Each production may have an action
  – Written as: X -> Y1 … Yn { action }
  – That can refer to or compute symbol attributes
7
Semantic Actions: An Example
• Consider the grammar E -> int | E + E | ( E )
• For each symbol X define an attribute X.val
  – For terminals, val is the associated lexeme
  – For non-terminals, val is the expression's value (and is computed from values of subexpressions)
• We annotate the grammar with actions:
    E -> int       { E.val = int.val }
       | E1 + E2   { E.val = E1.val + E2.val }
       | ( E1 )    { E.val = E1.val }
8
Semantic Actions: An Example (Cont.)
Productions          Equations
E  -> E1 + E2        E.val  = E1.val + E2.val
E1 -> int5           E1.val = int5.val = 5
E2 -> ( E3 )         E2.val = E3.val
E3 -> E4 + E5        E3.val = E4.val + E5.val
E4 -> int2           E4.val = int2.val = 2
E5 -> int3           E5.val = int3.val = 3
• String: 5 + (2 + 3)
• Tokens: int5 '+' '(' int2 '+' int3 ')'
9
Semantic Actions: Notes
• Semantic actions specify a system of equations
  – Order of resolution is not specified
• Example: E3.val = E4.val + E5.val
  – Must compute E4.val and E5.val before E3.val
  – We say that E3.val depends on E4.val and E5.val
• The parser must find the order of evaluation
10
Dependency Graph
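[Figure: parse tree for 5 + (2 + 3) annotated with val attributes. E4.val = 2 and E5.val = 3 feed E3.val, E3.val feeds E2.val, and E1.val = 5 together with E2.val feeds E.val.]
• Each node labeled E has one slot for the val attribute
• Note the dependencies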
11
Evaluating Attributes
• An attribute must be computed after all its successors in the dependency graph have been computed
  – In the previous example attributes can be computed bottom-up
• Such an order exists when there are no cycles
  – Cyclically defined attributes are not legal
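The ordering requirement above amounts to a topological sort of the dependency graph. The following is a minimal sketch, not the method any particular parser generator uses: the class name AttributeOrder and the string names for attribute instances are assumptions made for illustration. It orders the attributes so that each one comes after everything it depends on, and rejects cyclic definitions.

```java
import java.util.*;

// Hypothetical helper: finds an evaluation order for attribute instances.
class AttributeOrder {
    // deps maps an attribute to the list of attributes it depends on.
    static List<String> evaluationOrder(Map<String, List<String>> deps) {
        Map<String, Integer> pending = new LinkedHashMap<>(); // unmet dependency counts
        Map<String, List<String>> users = new HashMap<>();    // reverse edges
        for (Map.Entry<String, List<String>> e : deps.entrySet()) {
            pending.putIfAbsent(e.getKey(), 0);
            for (String d : e.getValue()) {
                pending.putIfAbsent(d, 0);
                pending.merge(e.getKey(), 1, Integer::sum);
                users.computeIfAbsent(d, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        // Attributes with no unmet dependencies can be evaluated right away.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : pending.entrySet())
            if (e.getValue() == 0) ready.add(e.getKey());
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String a = ready.remove();
            order.add(a);
            for (String u : users.getOrDefault(a, Collections.emptyList()))
                if (pending.merge(u, -1, Integer::sum) == 0) ready.add(u);
        }
        // Anything left over sits on a cycle, which is not a legal definition.
        if (order.size() != pending.size())
            throw new IllegalStateException("cyclic attribute definition");
        return order;
    }
}
```

For the equations of the running example, E4.val and E5.val always appear before E3.val in the returned list, and E3.val before E2.val.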
12
Semantic Actions: Notes (Cont.)
• Synthesized attributes
  – Calculated from attributes of descendants in the parse tree
  – E.val is a synthesized attribute
  – Can always be calculated in a bottom-up order
• Grammars with only synthesized attributes are called S-attributed grammars
  – The most common kind of attribute grammar
13
Semantic Actions: Top-down Approach
• Recursive-descent interpreter
• Consider this grammar
    S  -> E $
    E  -> T E'
    E' -> + T E'
    E' -> - T E'
    E' -> ε
    T  -> F T'
    T' -> * F T'
    T' -> / F T'
    T' -> ε
    F  -> id
    F  -> num
    F  -> ( E )
• Needs “type” of non-terminals and tokens
14
Recursive-descent interpreter
int T() {
  switch (tok.kind) {
    case ID: case NUM: case LPAREN:
      return Tprime(F());
    default:
      print("expected ID, NUM, or left-paren");
      skipto(T_follow);
      return 0;
  }
}

int Tprime(int a) {
  switch (tok.kind) {
    case TIMES:  eat(TIMES);  return Tprime(a * F());
    case DIVIDE: eat(DIVIDE); return Tprime(a / F());
    case PLUS: case MINUS: case RPAREN: case EOF:
      return a;
    default: /* error handling */ ……
  }
}
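The fragment above leaves out the tokenizer, the remaining non-terminals, and error recovery. A self-contained Java sketch of the whole interpreter might look as follows. The class name RdInterpreter, the regex tokenizer, the variable table env, and the choice to throw an exception instead of calling skipto are all assumptions for illustration, not part of the slides.

```java
import java.util.*;
import java.util.regex.*;

// Sketch of a recursive-descent interpreter for the expression grammar.
class RdInterpreter {
    private final List<String> toks = new ArrayList<>();
    private final Map<String, Integer> env;   // assumed table of identifier values
    private int pos = 0;

    RdInterpreter(String src, Map<String, Integer> env) {
        this.env = env;
        // Numbers, identifiers, and one-character operators; characters
        // that match nothing are simply skipped in this sketch.
        Matcher m = Pattern.compile("\\d+|[A-Za-z_]\\w*|[-+*/()]").matcher(src);
        while (m.find()) toks.add(m.group());
        toks.add("$");                        // end-of-input marker
    }

    private String peek() { return toks.get(pos); }

    private void eat(String t) {
        if (!peek().equals(t))
            throw new IllegalStateException("expected " + t + " but saw " + peek());
        pos++;
    }

    int S() { int v = E(); eat("$"); return v; }   // S -> E $
    int E() { return Eprime(T()); }                 // E -> T E'
    int Eprime(int a) {                             // a is the value to the left
        switch (peek()) {
            case "+": eat("+"); return Eprime(a + T());
            case "-": eat("-"); return Eprime(a - T());
            default:  return a;                     // E' -> ε
        }
    }
    int T() { return Tprime(F()); }                 // T -> F T'
    int Tprime(int a) {
        switch (peek()) {
            case "*": eat("*"); return Tprime(a * F());
            case "/": eat("/"); return Tprime(a / F());
            default:  return a;                     // T' -> ε
        }
    }
    int F() {                                       // F -> id | num | ( E )
        String t = peek();
        if (t.equals("(")) { eat("("); int v = E(); eat(")"); return v; }
        if (Character.isDigit(t.charAt(0))) { pos++; return Integer.parseInt(t); }
        if (Character.isLetter(t.charAt(0)) || t.charAt(0) == '_') {
            pos++;
            Integer v = env.get(t);
            if (v == null) throw new IllegalStateException("undefined " + t);
            return v;
        }
        throw new IllegalStateException("expected id, num, or left-paren but saw " + t);
    }
}
```

Passing the left operand down as the parameter a is what keeps subtraction and division left-associative even though the grammar itself is right-recursive.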
15
JavaCC version
• Grammar
    S -> E $
    E -> T ( + T | - T )*
    T -> F ( * F | / F )*
    F -> id | num | ( E )
Note: the repetition in E replaces the pair  E -> T E'   E' -> + T E' | - T E' | ε
16
JavaCC version (Cont.)
void Start() :
{ int i; }
{ i=Exp() <EOF> { System.out.println(i); }
}

int Exp() :
{ int a, i; }
{ a=Term()
  ( "+" i=Term() { a=a+i; }
  | "-" i=Term() { a=a-i; }
  )*
  { return a; }
}

// Term() is not shown on the slide.

int Factor() :
{ Token t; int i; }
{ t=<IDENTIFIER>      { return lookup(t.image); }
| t=<INTEGER_LITERAL> { return Integer.parseInt(t.image); }
| "(" i=Exp() ")"     { return i; }
}
17
Semantic Actions – Reduce and Shift
• We can now illustrate how semantic actions are implemented for LR parsing
• Keep attributes on the stack
• On shift a, push attribute for a on stack
• On reduce X -> α
  – pop attributes for α
  – compute attribute for X
  – and push it on the stack
18
Performing Semantic Actions. Example
• Recall the example from previous lecture
    E -> T + E1     { E.val = T.val + E1.val }
       | T          { E.val = T.val }
    T -> int * T1   { T.val = int.val * T1.val }
       | int        { T.val = int.val }
• Consider the parsing of the string 3 * 5 + 8
19
Performing Semantic Actions. Example
Stack        | Remaining input     Action
             | int * int + int     shift
int3         | * int + int         shift
int3 *       | int + int           shift
int3 * int5  | + int               reduce T -> int
int3 * T5    | + int               reduce T -> int * T
T15          | + int               shift
T15 +        | int                 shift
T15 + int8   |                     reduce T -> int
T15 + T8     |                     reduce E -> T
T15 + E8     |                     reduce E -> T + E
E23          |                     accept
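The trace can be replayed on an explicit attribute stack. The sketch below does not implement an LR parser: the shift and reduce decisions are hard-coded in replay() in the order shown in the trace, and the class and method names are hypothetical. It only illustrates the push and pop discipline for attributes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical attribute stack driven by shift and reduce steps.
class LrAttributeStack {
    private final Deque<Integer> attrs = new ArrayDeque<>();

    void shiftInt(int lexval) { attrs.push(lexval); }  // int.val comes from the lexer
    void shiftOp() { attrs.push(0); }                  // '*' and '+' carry no val; 0 is a placeholder

    // T -> int { T.val = int.val }
    void reduceTInt() { int i = attrs.pop(); attrs.push(i); }

    // T -> int * T1 { T.val = int.val * T1.val }
    void reduceTIntTimesT() {
        int t1 = attrs.pop(); attrs.pop(); int i = attrs.pop();
        attrs.push(i * t1);
    }

    // E -> T { E.val = T.val }
    void reduceET() { int t = attrs.pop(); attrs.push(t); }

    // E -> T + E1 { E.val = T.val + E1.val }
    void reduceETPlusE() {
        int e1 = attrs.pop(); attrs.pop(); int t = attrs.pop();
        attrs.push(t + e1);
    }

    int top() { return attrs.peek(); }

    // Replays the trace for 3 * 5 + 8 step by step.
    static int replay() {
        LrAttributeStack s = new LrAttributeStack();
        s.shiftInt(3); s.shiftOp(); s.shiftInt(5);
        s.reduceTInt();            // int3 * T5
        s.reduceTIntTimesT();      // T15
        s.shiftOp(); s.shiftInt(8);
        s.reduceTInt();            // T15 + T8
        s.reduceET();              // T15 + E8
        s.reduceETPlusE();         // E23
        return s.top();
    }
}
```

Each reduce pops exactly as many entries as the right-hand side has symbols, which is why operators need a stack slot even though they have no val.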
20
Inherited Attributes
• Another kind of attribute
• Calculated from attributes of parent and/or siblings in the parse tree
• Example: a line calculator
21
A Line Calculator
• Each line contains an expression
    E -> int | E + E
• Each line is terminated with the = sign
    L -> E = | + E =
• In the second form the value of the previous line is used as starting value
• A program is a sequence of lines
    P -> ε | P L
22
Attributes for the Line Calculator
• Each E has a synthesized attribute val
  – Calculated as before
• Each L has a synthesized attribute val
    L -> E =      { L.val = E.val }
       | + E =    { L.val = E.val + L.prev }
• We need the value of the previous line
• We use an inherited attribute L.prev
23
Attributes for the Line Calculator (Cont.)
• Each P has a synthesized attribute val
  – The value of its last line
    P -> ε       { P.val = 0 }
       | P1 L    { P.val = L.val; L.prev = P1.val }
  – Each L has an inherited attribute prev
  – L.prev is inherited from sibling P1.val
• Example …
24
Example of Inherited Attributes
• val synthesized
• prev inherited
• All can be computed in depth-first order
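[Figure: parse tree for the line + 2 + 3 = with P -> P L, L -> + E3 =, E3 -> E4 + E5, E4 -> int2, E5 -> int3. L.prev = 0 is inherited from the left P; the val attributes are synthesized bottom-up.]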
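In code, an inherited attribute often shows up as a parameter and a synthesized attribute as a return value. The sketch below applies that idea to the line calculator. It works on strings instead of a parse tree, and the class name LineCalculator and its string splitting are assumptions made to keep the example short.

```java
import java.util.List;

// Sketch: prev (inherited) is a parameter, val (synthesized) is the result.
class LineCalculator {
    // E -> int | E + E : E.val is the sum of the integer operands.
    static int evalE(String e) {
        int sum = 0;
        for (String part : e.split("\\+")) sum += Integer.parseInt(part.trim());
        return sum;
    }

    // L -> E = | + E = : prev is L.prev, the returned value is L.val.
    static int evalL(String line, int prev) {
        String body = line.trim();
        if (!body.endsWith("="))
            throw new IllegalArgumentException("line must end with '='");
        body = body.substring(0, body.length() - 1).trim();
        if (body.startsWith("+")) return evalE(body.substring(1)) + prev;
        return evalE(body);
    }

    // P -> ε | P1 L : P.val = L.val, with L.prev = P1.val.
    static int evalP(List<String> lines) {
        int val = 0;                        // P -> ε { P.val = 0 }
        for (String line : lines) val = evalL(line, val);
        return val;
    }
}
```

For the two lines "2 + 3 =" and "+ 4 =", evalP first computes 5 and then passes it down as prev to the second line.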
25
Semantic Actions: Notes (Cont.)
• Semantic actions can be used to build ASTs
• And many other things as well
  – Also used for type checking, code generation, …
• Process is called syntax-directed translation
  – Substantial generalization over CFGs
26
Constructing An AST
• We first define the AST data type
  – Supplied by us for the project
• Consider an abstract tree type with two constructors:
    mkleaf(n)       = a leaf node labeled n
    mkplus(T1, T2)  = a PLUS node with subtrees T1 and T2
27
Constructing a Parse Tree
• We define a synthesized attribute ast
  – Values of the ast attribute are ASTs
  – We assume that int.lexval is the value of the integer lexeme
  – Computed using semantic actions
    E -> int       E.ast = mkleaf(int.lexval)
       | E1 + E2   E.ast = mkplus(E1.ast, E2.ast)
       | ( E1 )    E.ast = E1.ast
28
Parse Tree Example
• Consider the string int5 '+' '(' int2 '+' int3 ')'
• A bottom-up evaluation of the ast attribute:
    E.ast = mkplus(mkleaf(5), mkplus(mkleaf(2), mkleaf(3)))
[Figure: the resulting AST, a PLUS node whose children are the leaf 5 and a PLUS node over the leaves 2 and 3]
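A minimal Java sketch of the two constructors follows. Only the names mkleaf and mkplus come from the slides; the classes Ast, Leaf and PlusNode and the toString format are assumptions for illustration.

```java
// Hypothetical AST type with the two constructors mkleaf and mkplus.
abstract class Ast {
    static Ast mkleaf(int n) { return new Leaf(n); }
    static Ast mkplus(Ast t1, Ast t2) { return new PlusNode(t1, t2); }
}

class Leaf extends Ast {
    final int n;
    Leaf(int n) { this.n = n; }
    @Override public String toString() { return Integer.toString(n); }
}

class PlusNode extends Ast {
    final Ast left, right;
    PlusNode(Ast left, Ast right) { this.left = left; this.right = right; }
    @Override public String toString() { return "PLUS(" + left + ", " + right + ")"; }
}
```

Evaluating Ast.mkplus(Ast.mkleaf(5), Ast.mkplus(Ast.mkleaf(2), Ast.mkleaf(3))) builds the tree bottom-up. Note that the parentheses of the source string leave no node behind; they only determine how the PLUS nodes nest.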
29
Review
• We can specify language syntax using a CFG
• A parser will answer whether s ∈ L(G)
• … and will build a parse tree
• … which we convert to an AST
• … and pass on to the rest of the compiler
30
Abstract Syntax
E -> E + E
E -> E - E
E -> E * E
E -> E / E
E -> id
E -> num
Abstract Parse Trees: Expression Grammar
31
AST : Node types
public abstract class Exp {
  public abstract int eval();
}

public class PlusExp extends Exp {
  private Exp e1, e2;
  public PlusExp(Exp a1, Exp a2) { e1=a1; e2=a2; }
  public int eval() { return e1.eval()+e2.eval(); }
}

public class Identifier extends Exp {
  private String f0;
  public Identifier(String n0) { f0 = n0; }
  public int eval() { return lookup(f0); }
}

public class IntegerLiteral extends Exp {
  private String f0;
  public IntegerLiteral(String n0) { f0 = n0; }
  public int eval() { return Integer.parseInt(f0); }
}
32
JavaCC Example for AST construction
Exp Start() :
{ Exp e; }
{ e=Exp() { return e; }
}

Exp Exp() :
{ Exp e1, e2; }
{ e1=Term()
  ( "+" e2=Term() { e1=new PlusExp(e1,e2); }
  | "-" e2=Term() { e1=new MinusExp(e1,e2); }
  )*
  { return e1; }
}

Exp Factor() :
{ Token t; Exp e; }
{ t=<IDENTIFIER>      { return new Identifier(t.image); }
| t=<INTEGER_LITERAL> { return new IntegerLiteral(t.image); }
| "(" e=Exp() ")"     { return e; }
}
33
Positions
• Must remember the position in the source file
  – Lexical analysis, parsing and semantic analysis are not done simultaneously.
  – Necessary for error reporting
• AST nodes must keep pos fields, which indicate the position within the original source file.
• Lexer must pass the information to the parser.
• AST node constructors must be augmented to initialize the pos fields.
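One way to carry positions, sketched under assumptions: a common base class stores line and column, and every node constructor takes them as extra arguments. The names PosDemo, Node and error are hypothetical and do not come from the slides.

```java
// Hypothetical sketch of AST nodes that remember their source position.
class PosDemo {
    static abstract class Node {
        final int line, column;
        Node(int line, int column) { this.line = line; this.column = column; }

        // Formats a message that points back into the source file.
        String error(String msg) { return line + ":" + column + ": " + msg; }
    }

    static class Identifier extends Node {
        final String name;
        Identifier(String name, int line, int column) {
            super(line, column);   // the augmented constructor sets the pos fields
            this.name = name;
        }
    }
}
```

A later phase can then report, for example, new PosDemo.Identifier("x", 3, 7).error("undeclared identifier x") long after the parser has finished. In a JavaCC action the two numbers would typically be taken from the beginLine and beginColumn fields of the Token.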
34
JavaCC : Class Token
• Each Token object has the following fields:
  – int kind;
  – int beginLine, beginColumn, endLine, endColumn;
  – String image;
  – Token next;
  – Token specialToken;
  – static final Token newToken(int ofKind);
• Unfortunately, ….
35
Visitors
• "Syntax separate from interpretation" style of programming
  – Vs. object-oriented style of programming
• "Visitor pattern"
  – Visitor implements an interpretation.
  – Visitor object contains a visit method for each syntax-tree class.
  – Syntax-tree classes contain "accept" methods.
  – Visitor calls "accept" (what is your class?). Then "accept" calls the "visit" of the visitor.
36
Example: Expression Classes
public abstract class Exp {
  public abstract int accept(Visitor v);
}

public class PlusExp extends Exp {
  public Exp e1, e2;   // public so that a visitor can read them
  public PlusExp(Exp a1, Exp a2) { e1=a1; e2=a2; }
  public int accept(Visitor v) { return v.visit(this); }
}

public class Identifier extends Exp {
  public String f0;
  public Identifier(String n0) { f0 = n0; }
  public int accept(Visitor v) { return v.visit(this); }
}

public class IntegerLiteral extends Exp {
  public String f0;
  public IntegerLiteral(String n0) { f0 = n0; }
  public int accept(Visitor v) { return v.visit(this); }
}
37
An interpreter visitor
public interface Visitor {
public int visit(PlusExp n);
public int visit(Identifier n);
public int visit(IntegerLiteral n);
}
public class Interpreter implements Visitor {
public int visit(PlusExp n) {
return n.e1.accept(this) + n.e2.accept(this);
}
public int visit(Identifier n) {
return lookup(n.f0);
}
public int visit(IntegerLiteral n) {
return Integer.parseInt(n.f0);
}
}
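The benefit of the pattern shows once a second interpretation is added. The sketch below is a variation on the classes above, not the slides' code: the visitor is made generic in its result type (an assumption, so that one visitor can return int and another String), Identifier is left out because lookup is not defined here, and everything is nested in a hypothetical VisitorDemo class to keep the example in one file.

```java
// Sketch: two interpretations over the same node classes.
class VisitorDemo {
    interface Visitor<R> {
        R visit(PlusExp n);
        R visit(IntegerLiteral n);
    }

    static abstract class Exp {
        abstract <R> R accept(Visitor<R> v);
    }

    static class PlusExp extends Exp {
        final Exp e1, e2;
        PlusExp(Exp a1, Exp a2) { e1 = a1; e2 = a2; }
        <R> R accept(Visitor<R> v) { return v.visit(this); }
    }

    static class IntegerLiteral extends Exp {
        final String f0;
        IntegerLiteral(String n0) { f0 = n0; }
        <R> R accept(Visitor<R> v) { return v.visit(this); }
    }

    // Interpretation 1: evaluate the expression.
    static class Interpreter implements Visitor<Integer> {
        public Integer visit(PlusExp n) { return n.e1.accept(this) + n.e2.accept(this); }
        public Integer visit(IntegerLiteral n) { return Integer.parseInt(n.f0); }
    }

    // Interpretation 2: print the expression fully parenthesized.
    static class Printer implements Visitor<String> {
        public String visit(PlusExp n) {
            return "(" + n.e1.accept(this) + " + " + n.e2.accept(this) + ")";
        }
        public String visit(IntegerLiteral n) { return n.f0; }
    }
}
```

Adding Printer required no change to PlusExp or IntegerLiteral. The price is that adding a new node class means adding a visit method to every visitor.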
38
Abstract Syntax for MiniJava (I)
package syntaxtree;
Program(MainClass m, ClassDeclList cl)
MainClass(Identifier i1, Identifier i2, Statement s)
----------------------------
abstract class ClassDecl
ClassDeclSimple(Identifier i, VarDeclList vl, MethodDeclList ml)
ClassDeclExtends(Identifier i, Identifier j, VarDeclList vl, MethodDeclList ml)
-----------------------------
VarDecl(Type t, Identifier i)
MethodDecl(Type t, Identifier i, FormalList fl, VarDeclList vl, StatementList sl, Exp e)
Formal(Type t, Identifier i)
39
Abstract Syntax for MiniJava (II)
abstract class Type
IntArrayType()
BooleanType()
IntegerType()
IdentifierType(String s)
---------------------------
abstract class Statement
Block(StatementList sl)
If(Exp e, Statement s1, Statement s2)
While(Exp e, Statement s)
Print(Exp e)
Assign(Identifier i, Exp e)
ArrayAssign(Identifier i, Exp e1, Exp e2)
-------------------------------------------
40
Abstract Syntax for MiniJava (III)
abstract class Exp
And(Exp e1, Exp e2)          LessThan(Exp e1, Exp e2)
Plus(Exp e1, Exp e2)         Minus(Exp e1, Exp e2)
Times(Exp e1, Exp e2)        Not(Exp e)
ArrayLookup(Exp e1, Exp e2)  ArrayLength(Exp e)
Call(Exp e, Identifier i, ExpList el)
IntegerLiteral(int i)
True()  False()
IdentifierExp(String s)
This()
NewArray(Exp e)  NewObject(Identifier i)
-------------------------------------------------
Identifier(String s)
--list classes-------------------------
ClassDeclList()  ExpList()  FormalList()  MethodDeclList()
StatementList()  VarDeclList()
41
Syntax Tree Nodes - Details
package syntaxtree;
import visitor.Visitor;
import visitor.TypeVisitor;

public class Program {
  public MainClass m;
  public ClassDeclList cl;

  public Program(MainClass am, ClassDeclList acl) { m=am; cl=acl; }

  public void accept(Visitor v) { v.visit(this); }

  public Type accept(TypeVisitor v) { return v.visit(this); }
}
42
ClassDecl.java
package syntaxtree;
import visitor.Visitor;
import visitor.TypeVisitor;
public abstract class ClassDecl {
public abstract void accept(Visitor v);
public abstract Type accept(TypeVisitor v);
}
43
ClassDeclExtends.java
package syntaxtree;
import visitor.Visitor;
import visitor.TypeVisitor;

public class ClassDeclExtends extends ClassDecl {
  public Identifier i;
  public Identifier j;
  public VarDeclList vl;
  public MethodDeclList ml;

  public ClassDeclExtends(Identifier ai, Identifier aj,
                          VarDeclList avl, MethodDeclList aml) {
    i=ai; j=aj; vl=avl; ml=aml;
  }

  public void accept(Visitor v) { v.visit(this); }

  public Type accept(TypeVisitor v) { return v.visit(this); }
}
44
StatementList.java
package syntaxtree;
import java.util.Vector;

public class StatementList {
  private Vector list;

  public StatementList() { list = new Vector(); }

  public void addElement(Statement n) { list.addElement(n); }

  public Statement elementAt(int i) { return (Statement)list.elementAt(i); }

  public int size() { return list.size(); }
}
45
Package visitor: Visitor.java
package visitor;
import syntaxtree.*;

public interface Visitor {
  public void visit(Program n);
  public void visit(MainClass n);
  public void visit(ClassDeclSimple n);
  public void visit(ClassDeclExtends n);
  public void visit(VarDecl n);
  public void visit(MethodDecl n);
  public void visit(Formal n);
  public void visit(IntArrayType n);
  public void visit(BooleanType n);
  public void visit(IntegerType n);
  public void visit(IdentifierType n);
  public void visit(Block n);
  public void visit(If n);
  public void visit(While n);
  public void visit(Print n);
  public void visit(Assign n);
  public void visit(ArrayAssign n);
  public void visit(And n);
  public void visit(LessThan n);
  public void visit(Plus n);
  public void visit(Minus n);
  public void visit(Times n);
  public void visit(ArrayLookup n);
  public void visit(ArrayLength n);
  public void visit(Call n);
  public void visit(IntegerLiteral n);
  public void visit(True n);
  public void visit(False n);
  public void visit(IdentifierExp n);
  public void visit(This n);
  public void visit(NewArray n);
  public void visit(NewObject n);
  public void visit(Not n);
  public void visit(Identifier n);
}
46
x = y.m(1, 4+5)
Statement -> AssignmentStatement
AssignmentStatement -> Identifier1 "=" Expression
Identifier1 -> <IDENTIFIER>
Expression -> Expression1 "." Identifier2 "(" ( ExpList )? ")"
Expression1 -> IdentifierExp
IdentifierExp -> <IDENTIFIER>
Identifier2 -> <IDENTIFIER>
ExpList -> Expression2 ( "," Expression3 )*
Expression2 -> <INTEGER_LITERAL>
Expression3 -> PlusExp -> Expression "+" Expression -> <INTEGER_LITERAL> "+" <INTEGER_LITERAL>
47
AST
Statement s -> Assign(Identifier, Exp)
  Identifier("x")
  Call(Exp, Identifier, ExpList)
    IdentifierExp("y")
    Identifier("m")
    ExpList el   (init, then add, add)
      IntegerLiteral(1)
      Plus(Exp, Exp)
        IntegerLiteral(4)
        IntegerLiteral(5)
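The tree for x = y.m(1, 4+5) can be built directly with constructors, following the init, add, add sequence for the ExpList. The sketch below uses cut-down versions of the syntax-tree classes (no accept methods, a java.util.List instead of Vector, added toString methods) nested in a hypothetical MiniAstDemo class, so it is an illustration and not the project's actual classes.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: simplified MiniJava node classes and the AST for x = y.m(1, 4+5).
class MiniAstDemo {
    static abstract class Exp {}
    static abstract class Statement {}

    static class Identifier {
        final String s;
        Identifier(String s) { this.s = s; }
        @Override public String toString() { return "Identifier(\"" + s + "\")"; }
    }
    static class IdentifierExp extends Exp {
        final String s;
        IdentifierExp(String s) { this.s = s; }
        @Override public String toString() { return "IdentifierExp(\"" + s + "\")"; }
    }
    static class IntegerLiteral extends Exp {
        final int i;
        IntegerLiteral(int i) { this.i = i; }
        @Override public String toString() { return "IntegerLiteral(" + i + ")"; }
    }
    static class Plus extends Exp {
        final Exp e1, e2;
        Plus(Exp e1, Exp e2) { this.e1 = e1; this.e2 = e2; }
        @Override public String toString() { return "Plus(" + e1 + ", " + e2 + ")"; }
    }
    static class ExpList {
        private final List<Exp> list = new ArrayList<>();
        void addElement(Exp e) { list.add(e); }
        Exp elementAt(int i) { return list.get(i); }
        int size() { return list.size(); }
        @Override public String toString() { return list.toString(); }
    }
    static class Call extends Exp {
        final Exp e; final Identifier i; final ExpList el;
        Call(Exp e, Identifier i, ExpList el) { this.e = e; this.i = i; this.el = el; }
        @Override public String toString() { return "Call(" + e + ", " + i + ", " + el + ")"; }
    }
    static class Assign extends Statement {
        final Identifier i; final Exp e;
        Assign(Identifier i, Exp e) { this.i = i; this.e = e; }
        @Override public String toString() { return "Assign(" + i + ", " + e + ")"; }
    }

    static Statement build() {
        ExpList el = new ExpList();                                             // init
        el.addElement(new IntegerLiteral(1));                                   // add
        el.addElement(new Plus(new IntegerLiteral(4), new IntegerLiteral(5)));  // add
        return new Assign(new Identifier("x"),
                          new Call(new IdentifierExp("y"), new Identifier("m"), el));
    }
}
```

The receiver y is an IdentifierExp because it is used as an expression, while x and m are plain Identifier nodes because they only name an assignment target and a method.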
48
MiniJava : Grammar(I)
Program -> MainClass ClassDecl *
Program(MainClass, ClassDeclList)
Program Goal() :
{ MainClass m;
  ClassDeclList cl = new ClassDeclList();
  ClassDecl c;
}
{ m = MainClass() ( c = ClassDecl() { cl.addElement(c); } )*
  <EOF> { return new Program(m, cl); }
}
49
MiniJava : Grammar(II)
MainClass -> class id { public static void main ( String [] id ) { Statement } }
    MainClass(Identifier, Identifier, Statement)
ClassDecl -> class id { VarDecl * MethodDecl * }
          -> class id extends id { VarDecl * MethodDecl * }
    ClassDeclSimple(…), ClassDeclExtends(…)
VarDecl -> Type id ;
    VarDecl(Type, Identifier)
MethodDecl -> public Type id ( FormalList ) { VarDecl * Statement * return Exp ; }
    MethodDecl(Type, Identifier, FormalList, VarDeclList, StatementList, Exp)
50
MiniJava : Grammar(III)
FormalList -> Type id FormalRest *
           -> ε
FormalRest -> , Type id
Type -> int []
     -> boolean
     -> int
     -> id
51
MiniJava : Grammar(IV)
Statement -> { Statement * }
          -> if ( Exp ) Statement else Statement
          -> while ( Exp ) Statement
          -> System.out.println ( Exp ) ;
          -> id = Exp ;
          -> id [ Exp ] = Exp ;
ExpList -> Exp ExpRest *
        -> ε
ExpRest -> , Exp
52
MiniJava : Grammar(V)
Exp -> Exp op Exp
    -> Exp [ Exp ]
    -> Exp . length
    -> Exp . id ( ExpList )
    -> INTEGER_LITERAL
    -> true
    -> false
    -> id
    -> this
    -> new int [ Exp ]
    -> new id ( )
    -> ! Exp
    -> ( Exp )