Professor Yihjia Tsai Tamkang University Abstract Syntax Tree

Post on 21-Dec-2015


Abstract Syntax Trees

• So far a parser traces the derivation of a sequence of tokens

• The rest of the compiler needs a structural representation of the program

• Abstract syntax trees
  – Like parse trees but ignore some details
  – Abbreviated as AST

Abstract Syntax Tree. (Cont.)

• Consider the grammar E -> int | ( E ) | E + E

• And the string 5 + (2 + 3)

• After lexical analysis (a list of tokens): int5 '+' '(' int2 '+' int3 ')'

• During parsing we build a parse tree …
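The token list above can be produced by a very small lexer. The following is a minimal sketch, not the lecture's code: the class name TinyLexer and the choice to represent each token as a string (integers as int5, punctuation as itself) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the lecture's code.
class TinyLexer {
    // Splits an input such as "5 + (2 + 3)" into the tokens int5 + ( int2 + int3 )
    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;                                  // whitespace separates tokens
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                tokens.add("int" + src.substring(start, i));   // the lexeme is the attribute
            } else if (c == '+' || c == '(' || c == ')') {
                tokens.add(String.valueOf(c));
                i++;
            } else {
                throw new IllegalArgumentException("unexpected character: " + c);
            }
        }
        return tokens;
    }
}
```

Calling TinyLexer.tokenize("5 + (2 + 3)") yields the seven tokens listed on the slide.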


Example of Abstract Syntax Tree

• Also captures the nesting structure

• But abstracts from the concrete syntax => more compact and easier to use

• An important data structure in a compiler

AST for 5 + (2 + 3):

      PLUS
     /    \
    5     PLUS
         /    \
        2      3

Example of Parse Tree

Parse tree for 5 + (2 + 3):

          E
        / | \
       E  +  E
       |   / | \
     int5 (  E  )
           / | \
          E  +  E
          |     |
        int2  int3

• Traces the operation of the parser

• Does capture the nesting structure

• But too much info
  – Parentheses
  – Single-successor nodes

Semantic Actions

• This is what we’ll use to construct ASTs

• Each grammar symbol may have attributes
  – For terminal symbols (lexical tokens) attributes can be calculated by the lexer

• Each production may have an action
  – Written as: X -> Y1 … Yn { action }
  – That can refer to or compute symbol attributes

Semantic Actions: An Example

• Consider the grammar E -> int | E + E | ( E )

• For each symbol X define an attribute X.val
  – For terminals, val is the associated lexeme
  – For non-terminals, val is the expression’s value (and is computed from values of subexpressions)

• We annotate the grammar with actions:
    E -> int      { E.val = int.val }
       | E1 + E2  { E.val = E1.val + E2.val }
       | ( E1 )   { E.val = E1.val }

Semantic Actions: An Example (Cont.)

Productions        Equations
E  -> E1 + E2      E.val  = E1.val + E2.val
E1 -> int5         E1.val = int5.val = 5
E2 -> ( E3 )       E2.val = E3.val
E3 -> E4 + E5      E3.val = E4.val + E5.val
E4 -> int2         E4.val = int2.val = 2
E5 -> int3         E5.val = int3.val = 3

• String: 5 + (2 + 3)

• Tokens: int5 '+' '(' int2 '+' int3 ')'

Semantic Actions: Notes

• Semantic actions specify a system of equations
  – Order of resolution is not specified

• Example: E3.val = E4.val + E5.val
  – Must compute E4.val and E5.val before E3.val
  – We say that E3.val depends on E4.val and E5.val

• The parser must find the order of evaluation

Dependency Graph

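[Figure: parse tree for 5 + (2 + 3) in which each E node carries a val slot; arrows connect each E.val to the attributes it depends on (int5, int2, int3 and the val slots of its child E nodes).]

• Each node labeled E has one slot for the val attribute

• Note the dependencies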

Evaluating Attributes

• An attribute must be computed after all its successors in the dependency graph have been computed
  – In previous example attributes can be computed bottom-up

• Such an order exists when there are no cycles
  – Cyclically defined attributes are not legal
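One way to picture "compute an attribute after everything it depends on" is demand-driven evaluation with memoization. The sketch below is an illustration under assumptions, not how a real parser generator does it: the class AttrEval, its methods, and the string keys such as "E3.val" are all hypothetical. It evaluates the six equations from the 5 + (2 + 3) example and rejects cyclic definitions.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical sketch: each attribute is defined by an equation that may read other attributes.
class AttrEval {
    private final Map<String, Supplier<Integer>> equations = new HashMap<>();
    private final Map<String, Integer> values = new HashMap<>();
    private final Set<String> inProgress = new HashSet<>();

    void define(String attr, Supplier<Integer> equation) { equations.put(attr, equation); }

    // Computes attr on demand; the attributes it depends on are computed first.
    int get(String attr) {
        Integer done = values.get(attr);
        if (done != null) return done;                 // already computed
        if (!inProgress.add(attr)) {                   // we came back to an unfinished attribute
            throw new IllegalStateException("cyclic attribute: " + attr);
        }
        int v = equations.get(attr).get();
        inProgress.remove(attr);
        values.put(attr, v);
        return v;
    }

    // The equations for 5 + (2 + 3), deliberately entered in an arbitrary order.
    static int example() {
        AttrEval g = new AttrEval();
        g.define("E.val",  () -> g.get("E1.val") + g.get("E2.val"));
        g.define("E2.val", () -> g.get("E3.val"));
        g.define("E3.val", () -> g.get("E4.val") + g.get("E5.val"));
        g.define("E1.val", () -> 5);
        g.define("E4.val", () -> 2);
        g.define("E5.val", () -> 3);
        return g.get("E.val");
    }
}
```

The order in which the equations are entered does not matter; the dependencies alone determine the order of evaluation.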


Semantic Actions: Notes (Cont.)

• Synthesized attributes
  – Calculated from attributes of descendants in the parse tree
  – E.val is a synthesized attribute
  – Can always be calculated in a bottom-up order

• Grammars with only synthesized attributes are called S-attributed grammars
  – Most frequent kinds of grammars

Semantic Actions :Top-down Approach

• Recursive-descent interpreter

• Consider this grammar
    S  -> E $
    E  -> T E’
    E’ -> + T E’
    E’ -> - T E’
    E’ -> ε
    T  -> F T’
    T’ -> * F T’
    T’ -> / F T’
    T’ -> ε
    F  -> id
    F  -> num
    F  -> ( E )

• Needs “type” of non-terminals and tokens

Recursive-descent interpreter

int T() {
    switch (tok.kind) {
        case ID: case NUM: case LPAREN:
            return Tprime(F());
        default:
            print("expected ID, NUM, or left-paren");
            skipto(T_follow);
            return 0;
    }
}

int Tprime(int a) {
    switch (tok.kind) {
        case TIMES:  eat(TIMES);  return Tprime(a * F());
        case DIVIDE: eat(DIVIDE); return Tprime(a / F());
        case PLUS: case MINUS: case RPAREN: case EOF:
            return a;                 // T' -> (empty): pass the accumulated value back
        default: /* error handling */ ……
    }
}
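To see the whole scheme run, here is a self-contained sketch of the same idea covering every non-terminal of the grammar. It makes simplifying assumptions that are not in the lecture: the class name RDInterp is hypothetical, the input is scanned character by character instead of coming from a lexer, identifiers are left out (so F -> id is not handled), and errors are reported by throwing rather than by skipping to a follow set.

```java
// Hypothetical, simplified variant of the recursive-descent interpreter above.
class RDInterp {
    private final String src;
    private int pos = 0;

    RDInterp(String src) { this.src = src.replaceAll("\\s+", ""); }

    private char peek() { return pos < src.length() ? src.charAt(pos) : '$'; }

    private void eat(char c) {
        if (peek() != c) throw new IllegalStateException("expected '" + c + "' at " + pos);
        pos++;
    }

    // S -> E $
    static int eval(String s) {
        RDInterp p = new RDInterp(s);
        int v = p.E();
        if (p.pos != p.src.length()) throw new IllegalStateException("trailing input at " + p.pos);
        return v;
    }

    int E() { return Eprime(T()); }                 // E  -> T E'

    int Eprime(int a) {                             // E' -> + T E' | - T E' | (empty)
        switch (peek()) {
            case '+': eat('+'); return Eprime(a + T());
            case '-': eat('-'); return Eprime(a - T());
            default:  return a;
        }
    }

    int T() { return Tprime(F()); }                 // T  -> F T'

    int Tprime(int a) {                             // T' -> * F T' | / F T' | (empty)
        switch (peek()) {
            case '*': eat('*'); return Tprime(a * F());
            case '/': eat('/'); return Tprime(a / F());
            default:  return a;
        }
    }

    int F() {                                       // F  -> num | ( E )
        if (peek() == '(') { eat('('); int v = E(); eat(')'); return v; }
        int start = pos;
        while (Character.isDigit(peek())) pos++;
        if (start == pos) throw new IllegalStateException("expected number or '(' at " + pos);
        return Integer.parseInt(src.substring(start, pos));
    }
}
```

Passing the value computed so far as the parameter a is what keeps - and / left-associative even though the grammar itself is right-recursive.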


JavaCC version

• Grammar
    S -> E $
    E -> T ( + T | - T )*
    T -> F ( * F | / F )*
    F -> id | num | ( E )

Note: E -> T E’   E’ -> + T E’ | - T E’ | ε

JavaCC version –

void Start() :
{ int i; }
{ i=Exp() <EOF> { System.out.println(i); }
}

int Exp() :
{ int a, i; }
{ a=Term()
  ( "+" i=Term() { a = a + i; }
  | "-" i=Term() { a = a - i; }
  )*
  { return a; }
}

// Term() is not shown on the slide.

int Factor() :
{ Token t; int i; }
{ t=<IDENTIFIER>      { return lookup(t.image); }
| t=<INTEGER_LITERAL> { return Integer.parseInt(t.image); }
| "(" i=Exp() ")"     { return i; }
}

Semantic Actions – Reduce and Shift

• We can now illustrate how semantic actions are implemented for LR parsing

• Keep attributes on the stack

• On shift a, push attribute for a on stack

• On reduce X -> α
  – pop attributes for α
  – compute attribute for X
  – and push it on the stack

Performing Semantic Actions. Example

• Recall the example from previous lecture

    E -> T + E1    { E.val = T.val + E1.val }
       | T         { E.val = T.val }
    T -> int * T1  { T.val = int.val * T1.val }
       | int       { T.val = int.val }

• Consider the parsing of the string 3 * 5 + 8

Performing Semantic Actions. Example

Stack          | Remaining input    Action
               | int * int + int    shift
int3           | * int + int        shift
int3 *         | int + int          shift
int3 * int5    | + int              reduce T -> int
int3 * T5      | + int              reduce T -> int * T
T15            | + int              shift
T15 +          | int                shift
T15 + int8     |                    reduce T -> int
T15 + T8       |                    reduce E -> T
T15 + E8       |                    reduce E -> T + E
E23            |                    accept
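The trace can be replayed with an explicit attribute stack. This is a sketch under a strong assumption: the sequence of shift and reduce actions is written out by hand, whereas a real LR parser would choose each action from its parsing table. The class ValueStackDemo and its method names are hypothetical; only the push/pop discipline is the point.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: replays the shift/reduce sequence for 3 * 5 + 8 by hand.
class ValueStackDemo {
    private final Deque<Integer> vals = new ArrayDeque<>();

    // shift: push the token's attribute (0 is a placeholder for '*' and '+')
    void shift(int attr) { vals.push(attr); }

    // reduce T -> int  and  reduce E -> T: pop one attribute and push it back as the result
    void reduceCopy() { vals.push(vals.pop()); }

    // reduce T -> int * T1: pop T1, '*', int; push int.val * T1.val
    void reduceTimes() {
        int t1 = vals.pop(); vals.pop(); int i = vals.pop();
        vals.push(i * t1);
    }

    // reduce E -> T + E1: pop E1, '+', T; push T.val + E1.val
    void reducePlus() {
        int e1 = vals.pop(); vals.pop(); int t = vals.pop();
        vals.push(t + e1);
    }

    static int run() {
        ValueStackDemo p = new ValueStackDemo();
        p.shift(3); p.shift(0); p.shift(5);   // int3 * int5
        p.reduceCopy();                       // int3 * T5
        p.reduceTimes();                      // T15
        p.shift(0); p.shift(8);               // T15 + int8
        p.reduceCopy();                       // T15 + T8
        p.reduceCopy();                       // T15 + E8
        p.reducePlus();                       // E23
        return p.vals.pop();
    }
}
```

Each reduce pops exactly as many attributes as the production has right-hand-side symbols, which is why the attribute stack stays aligned with the parser's symbol stack.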


Inherited Attributes

• Another kind of attribute

• Calculated from attributes of parent and/or siblings in the parse tree

• Example: a line calculator

A Line Calculator

• Each line contains an expression
    E -> int | E + E

• Each line is terminated with the = sign
    L -> E = | + E =

• In second form the value of previous line is used as starting value

• A program is a sequence of lines
    P -> ε | P L

Attributes for the Line Calculator

• Each E has a synthesized attribute val
  – Calculated as before

• Each L has a synthesized attribute val
    L -> E =    { L.val = E.val }
       | + E =  { L.val = E.val + L.prev }

• We need the value of the previous line

• We use an inherited attribute L.prev

Attributes for the Line Calculator (Cont.)

• Each P has a synthesized attribute val
  – The value of its last line
    P -> ε     { P.val = 0 }
       | P1 L  { P.val = L.val; L.prev = P1.val }
  – Each L has an inherited attribute prev
  – L.prev is inherited from sibling P1.val

• Example …

Example of Inherited Attributes

• val synthesized

• prev inherited

• All can be computed in depth-first order

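[Figure: parse tree for the line + 2 + 3 = (P -> P1 L, L -> + E3 =, E3 -> E4 + E5, E4 -> int2, E5 -> int3) annotated with val and prev slots; L.prev receives 0 from the empty program P1.]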
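The flow of prev from one line to the next can be sketched in a few lines of Java. Everything here is illustrative and assumed rather than taken from the lecture: the class LineCalc, the nested class Line, and the simplification that each line's E.val has already been computed bottom-up.

```java
import java.util.List;

// Hypothetical sketch of the line calculator's attributes.
class LineCalc {
    // One line L: either "E =" (continues == false) or "+ E =" (continues == true).
    static final class Line {
        final boolean continues;
        final int eVal;                    // E.val, assumed already computed bottom-up
        Line(boolean continues, int eVal) { this.continues = continues; this.eVal = eVal; }

        // L.val, given the inherited attribute L.prev
        int val(int prev) { return continues ? eVal + prev : eVal; }
    }

    // P.val: 0 for the empty program, otherwise the value of the last line.
    static int programVal(List<Line> lines) {
        int pVal = 0;                      // P -> (empty)   { P.val = 0 }
        for (Line l : lines) {
            pVal = l.val(pVal);            // P -> P1 L      { L.prev = P1.val; P.val = L.val }
        }
        return pVal;
    }
}
```

The loop variable pVal plays the role of P1.val: it is handed down to the next line as prev, and the line's val is handed back up as the new P.val.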


Semantic Actions: Notes (Cont.)

• Semantic actions can be used to build ASTs

• And many other things as well
  – Also used for type checking, code generation, …

• Process is called syntax-directed translation
  – Substantial generalization over CFGs

Constructing An AST

• We first define the AST data type
  – Supplied by us for the project

• Consider an abstract tree type with two constructors:
    mkleaf(n)      = a leaf node labeled n
    mkplus(T1, T2) = a PLUS node whose children are the trees T1 and T2

Constructing a Parse Tree

• We define a synthesized attribute ast
  – Values of ast are ASTs
  – We assume that int.lexval is the value of the integer lexeme
  – Computed using semantic actions

    E -> int      E.ast = mkleaf(int.lexval)
       | E1 + E2  E.ast = mkplus(E1.ast, E2.ast)
       | ( E1 )   E.ast = E1.ast

Parse Tree Example

• Consider the string int5 '+' '(' int2 '+' int3 ')'

• A bottom-up evaluation of the ast attribute:
    E.ast = mkplus(mkleaf(5), mkplus(mkleaf(2), mkleaf(3)))

      PLUS
     /    \
    5     PLUS
         /    \
        2      3
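The two constructors translate directly into Java. This is one possible rendering, offered as a sketch: the class names Ast, Leaf, PlusNode and AstDemo are assumptions, and the toString methods exist only to make the resulting tree visible.

```java
// Hypothetical Java rendering of mkleaf and mkplus; all class names are illustrative.
abstract class Ast {
    static Ast mkleaf(int n) { return new Leaf(n); }
    static Ast mkplus(Ast t1, Ast t2) { return new PlusNode(t1, t2); }
}

final class Leaf extends Ast {
    final int n;
    Leaf(int n) { this.n = n; }
    @Override public String toString() { return Integer.toString(n); }
}

final class PlusNode extends Ast {
    final Ast left, right;
    PlusNode(Ast left, Ast right) { this.left = left; this.right = right; }
    @Override public String toString() { return "PLUS(" + left + ", " + right + ")"; }
}

class AstDemo {
    // The semantic actions for 5 + (2 + 3), applied bottom-up.
    static Ast build() {
        Ast e4 = Ast.mkleaf(2);            // E4 -> int2
        Ast e5 = Ast.mkleaf(3);            // E5 -> int3
        Ast e3 = Ast.mkplus(e4, e5);       // E3 -> E4 + E5
        Ast e2 = e3;                       // E2 -> ( E3 ): the parentheses leave no node
        Ast e1 = Ast.mkleaf(5);            // E1 -> int5
        return Ast.mkplus(e1, e2);         // E  -> E1 + E2
    }
}
```

AstDemo.build().toString() gives PLUS(5, PLUS(2, 3)): the parenthesized production contributes no node of its own, which is exactly the compaction that distinguishes the AST from the parse tree.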


Review

• We can specify language syntax using CFG

• A parser will answer whether s ∈ L(G)
• … and will build a parse tree
• … which we convert to an AST
• … and pass on to the rest of the compiler

Abstract Syntax

E -> E + E
E -> E - E
E -> E * E
E -> E / E
E -> id
E -> num

Abstract Parse Trees: Expression Grammar

AST : Node types

public abstract class Exp {
    public abstract int eval();
}

public class PlusExp extends Exp {
    private Exp e1, e2;
    public PlusExp(Exp a1, Exp a2) { e1 = a1; e2 = a2; }
    public int eval() { return e1.eval() + e2.eval(); }
}

public class Identifier extends Exp {
    private String f0;
    public Identifier(String n0) { f0 = n0; }
    public int eval() { return lookup(f0); }
}

public class IntegerLiteral extends Exp {
    private String f0;
    public IntegerLiteral(String n0) { f0 = n0; }
    public int eval() { return Integer.parseInt(f0); }
}

JavaCC Example for AST construction

Exp Start() :
{ Exp e; }
{ e=Exp() { return e; }
}

Exp Exp() :
{ Exp e1, e2; }
{ e1=Term()
  ( "+" e2=Term() { e1 = new PlusExp(e1, e2); }
  | "-" e2=Term() { e1 = new MinusExp(e1, e2); }
  )*
  { return e1; }
}

Exp Factor() :
{ Token t; Exp e; }
{ t=<IDENTIFIER>      { return new Identifier(t.image); }
| t=<INTEGER_LITERAL> { return new IntegerLiteral(t.image); }
| "(" e=Exp() ")"     { return e; }
}

Positions

• Must remember the position in the source file
  – Lexical analysis, parsing and semantic analysis are not done simultaneously.
  – Necessary for error reporting

• AST must keep the pos fields, which indicate the position within the original source file.

• Lexer must pass the information to the parser.

• AST node constructors must be augmented to init the pos fields.
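A node class with position fields might look like the following sketch. The names PosExp, PosIdentifier and PosDemo, and the choice of a line/column pair, are assumptions for illustration and not the project's actual classes. In a JavaCC action the two numbers could plausibly be taken from the token's beginLine and beginColumn fields.

```java
// Hypothetical sketch: class and field names are illustrative, not the project's API.
abstract class PosExp {
    final int line, column;     // position of the node's first token in the source file
    PosExp(int line, int column) { this.line = line; this.column = column; }
    String where() { return "line " + line + ", column " + column; }
}

final class PosIdentifier extends PosExp {
    final String name;
    // The constructor is augmented with the position, as the slide requires.
    PosIdentifier(String name, int line, int column) {
        super(line, column);
        this.name = name;
    }
}

class PosDemo {
    // A later phase can still report where the problem is, long after parsing has finished.
    static String undeclared(PosIdentifier id) {
        return "undeclared identifier " + id.name + " at " + id.where();
    }
}
```

Because the position travels with the node, semantic analysis does not need access to the token stream to produce a useful message.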


JavaCC : Class Token

• Each Token object has the following fields:
  – int kind;
  – int beginLine, beginColumn, endLine, endColumn;
  – String image;
  – Token next;
  – Token specialToken;
  – static final Token newToken(int ofKind);

• Unfortunately, ….

Visitors

• “Syntax separate from interpretation” style of programming
  – Vs. object-oriented style of programming

• “Visitor pattern”
  – Visitor implements an interpretation.
  – Visitor object contains a visit method for each syntax-tree class.
  – Syntax-tree classes contain “accept” methods.
  – Visitor calls “accept” (what is your class?). Then “accept” calls the “visit” of the visitor.

Example :Expression Classes

public abstract class Exp {
    public abstract int accept(Visitor v);
}

public class PlusExp extends Exp {
    public Exp e1, e2;          // public so that a visitor can read the children
    public PlusExp(Exp a1, Exp a2) { e1 = a1; e2 = a2; }
    public int accept(Visitor v) { return v.visit(this); }
}

public class Identifier extends Exp {
    public String f0;
    public Identifier(String n0) { f0 = n0; }
    public int accept(Visitor v) { return v.visit(this); }
}

public class IntegerLiteral extends Exp {
    public String f0;
    public IntegerLiteral(String n0) { f0 = n0; }
    public int accept(Visitor v) { return v.visit(this); }
}

An interpreter visitor

public interface Visitor {
    public int visit(PlusExp n);
    public int visit(Identifier n);
    public int visit(IntegerLiteral n);
}

// Assumes the fields e1, e2 and f0 of the node classes are accessible here.
public class Interpreter implements Visitor {
    public int visit(PlusExp n) {
        return n.e1.accept(this) + n.e2.accept(this);
    }
    public int visit(Identifier n) {
        return lookup(n.f0);
    }
    public int visit(IntegerLiteral n) {
        return Integer.parseInt(n.f0);
    }
}
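The payoff of the pattern shows up when a second interpretation is added. The sketch below is a self-contained variant and departs from the slides in labeled ways: the visitor is generic in its result type (the lecture's Visitor returns int), identifiers are omitted, and the names ExpVisitor, Node, PlusE, IntLit, EvalVisitor and PrintVisitor are all hypothetical.

```java
// Hypothetical variant: a generic result type R lets one set of node classes
// serve several interpretations.
interface ExpVisitor<R> {
    R visit(PlusE n);
    R visit(IntLit n);
}

abstract class Node {
    abstract <R> R accept(ExpVisitor<R> v);
}

final class PlusE extends Node {
    final Node e1, e2;
    PlusE(Node e1, Node e2) { this.e1 = e1; this.e2 = e2; }
    <R> R accept(ExpVisitor<R> v) { return v.visit(this); }   // "I am a PlusE"
}

final class IntLit extends Node {
    final String f0;
    IntLit(String f0) { this.f0 = f0; }
    <R> R accept(ExpVisitor<R> v) { return v.visit(this); }   // "I am an IntLit"
}

// Interpretation 1: evaluate the expression.
class EvalVisitor implements ExpVisitor<Integer> {
    public Integer visit(PlusE n) { return n.e1.accept(this) + n.e2.accept(this); }
    public Integer visit(IntLit n) { return Integer.parseInt(n.f0); }
}

// Interpretation 2: print the expression, added without touching any node class.
class PrintVisitor implements ExpVisitor<String> {
    public String visit(PlusE n) {
        return "(" + n.e1.accept(this) + " + " + n.e2.accept(this) + ")";
    }
    public String visit(IntLit n) { return n.f0; }
}
```

For the tree of 5 + (2 + 3), accept(new EvalVisitor()) returns 10 and accept(new PrintVisitor()) returns "(5 + (2 + 3))". Adding a new node class, by contrast, would require a new visit method in every visitor, which is the trade-off against the object-oriented style.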


Abstract Syntax for MiniJava (I)

package syntaxtree;

Program(MainClass m, ClassDeclList cl)
MainClass(Identifier i1, Identifier i2, Statement s)
----------------------------
abstract class ClassDecl
ClassDeclSimple(Identifier i, VarDeclList vl, MethodDeclList ml)
ClassDeclExtends(Identifier i, Identifier j, VarDeclList vl, MethodDeclList ml)
-----------------------------
VarDecl(Type t, Identifier i)
MethodDecl(Type t, Identifier i, FormalList fl, VarDeclList vl, StatementList sl, Exp e)
Formal(Type t, Identifier i)

Abstract Syntax for MiniJava (II)

abstract class Type
IntArrayType()
BooleanType()
IntegerType()
IdentifierType(String s)
---------------------------
abstract class Statement
Block(StatementList sl)
If(Exp e, Statement s1, Statement s2)
While(Exp e, Statement s)
Print(Exp e)
Assign(Identifier i, Exp e)
ArrayAssign(Identifier i, Exp e1, Exp e2)
-------------------------------------------

Abstract Syntax for MiniJava (III)

abstract class Exp
And(Exp e1, Exp e2)              LessThan(Exp e1, Exp e2)
Plus(Exp e1, Exp e2)             Minus(Exp e1, Exp e2)
Times(Exp e1, Exp e2)            Not(Exp e)
ArrayLookup(Exp e1, Exp e2)      ArrayLength(Exp e)
Call(Exp e, Identifier i, ExpList el)
IntegerLiteral(int i)
True()                           False()
IdentifierExp(String s)
This()
NewArray(Exp e)                  NewObject(Identifier i)
-------------------------------------------------
Identifier(String s)
--list classes-------------------------
ClassDeclList() ExpList() FormalList() MethodDeclList() StatementList() VarDeclList()

Syntax Tree Nodes - Details

package syntaxtree;
import visitor.Visitor;
import visitor.TypeVisitor;

public class Program {
    public MainClass m;
    public ClassDeclList cl;

    public Program(MainClass am, ClassDeclList acl) { m = am; cl = acl; }

    public void accept(Visitor v) { v.visit(this); }

    public Type accept(TypeVisitor v) { return v.visit(this); }
}

ClassDecl.java

package syntaxtree;

import visitor.Visitor;

import visitor.TypeVisitor;

public abstract class ClassDecl {

public abstract void accept(Visitor v);

public abstract Type accept(TypeVisitor v);

}


ClassDeclExtends.java

package syntaxtree;
import visitor.Visitor;
import visitor.TypeVisitor;

public class ClassDeclExtends extends ClassDecl {
    public Identifier i;
    public Identifier j;
    public VarDeclList vl;
    public MethodDeclList ml;

    public ClassDeclExtends(Identifier ai, Identifier aj,
                            VarDeclList avl, MethodDeclList aml) {
        i = ai; j = aj; vl = avl; ml = aml;
    }

    public void accept(Visitor v) { v.visit(this); }

    public Type accept(TypeVisitor v) { return v.visit(this); }
}

StatementList.java

package syntaxtree;
import java.util.Vector;

public class StatementList {
    private Vector list;

    public StatementList() { list = new Vector(); }

    public void addElement(Statement n) { list.addElement(n); }

    public Statement elementAt(int i) { return (Statement) list.elementAt(i); }

    public int size() { return list.size(); }
}

Package Visitor/visitor.java

package visitor;
import syntaxtree.*;

public interface Visitor {
    public void visit(Program n);
    public void visit(MainClass n);
    public void visit(ClassDeclSimple n);
    public void visit(ClassDeclExtends n);
    public void visit(VarDecl n);
    public void visit(MethodDecl n);
    public void visit(Formal n);
    public void visit(IntArrayType n);
    public void visit(BooleanType n);
    public void visit(IntegerType n);
    public void visit(IdentifierType n);
    public void visit(Block n);
    public void visit(If n);
    public void visit(While n);
    public void visit(Print n);
    public void visit(Assign n);
    public void visit(ArrayAssign n);
    public void visit(And n);
    public void visit(LessThan n);
    public void visit(Plus n);
    public void visit(Minus n);
    public void visit(Times n);
    public void visit(ArrayLookup n);
    public void visit(ArrayLength n);
    public void visit(Call n);
    public void visit(IntegerLiteral n);
    public void visit(True n);
    public void visit(False n);
    public void visit(IdentifierExp n);
    public void visit(This n);
    public void visit(NewArray n);
    public void visit(NewObject n);
    public void visit(Not n);
    public void visit(Identifier n);
}

x = y.m(1,4+5)

Statement -> AssignmentStatement
AssignmentStatement -> Identifier1 "=" Expression
Identifier1 -> <IDENTIFIER>
Expression -> Expression1 "." Identifier2 "(" ( ExpList )? ")"
Expression1 -> IdentifierExp
IdentifierExp -> <IDENTIFIER>
Identifier2 -> <IDENTIFIER>
ExpList -> Expression2 ( "," Expression3 )*
Expression2 -> <INTEGER_LITERAL>
Expression3 -> PlusExp -> Expression "+" Expression -> <INTEGER_LITERAL> , <INTEGER_LITERAL>

AST

Statement s -> Assign(Identifier, Exp)
  Identifier("x")
  Call(Exp, Identifier, ExpList)
    IdentifierExp("y")
    Identifier("m")
    ExpList el (init, then add, add)
      IntegerLiteral(1)
      Plus(Exp, Exp)
        IntegerLiteral(4)
        IntegerLiteral(5)
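The tree above corresponds to a handful of constructor calls. In this sketch the classes are trimmed, hypothetical stand-ins for the syntaxtree package (no accept methods, an ArrayList instead of Vector), and BuildDemo is an illustrative name; only the shape of the construction follows the slide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, trimmed stand-ins for the syntaxtree classes (no accept methods).
abstract class Exp {}
abstract class Statement {}
class Identifier { final String s; Identifier(String s) { this.s = s; } }
class IdentifierExp extends Exp { final String s; IdentifierExp(String s) { this.s = s; } }
class IntegerLiteral extends Exp { final int i; IntegerLiteral(int i) { this.i = i; } }
class Plus extends Exp {
    final Exp e1, e2;
    Plus(Exp e1, Exp e2) { this.e1 = e1; this.e2 = e2; }
}
class ExpList {
    private final List<Exp> list = new ArrayList<>();
    void addElement(Exp e) { list.add(e); }
    Exp elementAt(int i) { return list.get(i); }
    int size() { return list.size(); }
}
class Call extends Exp {
    final Exp e; final Identifier i; final ExpList el;
    Call(Exp e, Identifier i, ExpList el) { this.e = e; this.i = i; this.el = el; }
}
class Assign extends Statement {
    final Identifier i; final Exp e;
    Assign(Identifier i, Exp e) { this.i = i; this.e = e; }
}

class BuildDemo {
    // Builds the tree for: x = y.m(1, 4+5)
    static Assign build() {
        ExpList el = new ExpList();                                             // init
        el.addElement(new IntegerLiteral(1));                                   // add
        el.addElement(new Plus(new IntegerLiteral(4), new IntegerLiteral(5)));  // add
        Exp call = new Call(new IdentifierExp("y"), new Identifier("m"), el);
        return new Assign(new Identifier("x"), call);
    }
}
```

Note the two different classes for names: y is used as an expression and becomes an IdentifierExp, while x and m are plain Identifier nodes.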


MiniJava : Grammar(I)

Program -> MainClass ClassDecl *

Program(MainClass, ClassDeclList)

Program Goal() :
{ MainClass m;
  ClassDeclList cl = new ClassDeclList();
  ClassDecl c;
}
{ m = MainClass() ( c = ClassDecl() { cl.addElement(c); } )*
  <EOF> { return new Program(m, cl); }
}

MiniJava : Grammar(II)

MainClass -> class id { public static void main ( String [] id ) { Statement } }
    MainClass(Identifier, Identifier, Statement)
ClassDecl -> class id { VarDecl * MethodDecl * }
          -> class id extends id { VarDecl * MethodDecl * }
    ClassDeclSimple(…), ClassDeclExtends(…)
VarDecl -> Type id ;
    VarDecl(Type, Identifier)
MethodDecl -> public Type id ( FormalList ) { VarDecl * Statement * return Exp ; }
    MethodDecl(Type, Identifier, FormalList, VarDeclList, StatementList, Exp)

MiniJava : Grammar(III)

FormalList -> Type id FormalRest *
           -> ε
FormalRest -> , Type id
Type -> int []
     -> boolean
     -> int
     -> id

MiniJava : Grammar(IV)

Statement -> { Statement * }
          -> if ( Exp ) Statement else Statement
          -> while ( Exp ) Statement
          -> System.out.println ( Exp ) ;
          -> id = Exp ;
          -> id [ Exp ] = Exp ;
ExpList -> Exp ExpRest *
        -> ε
ExpRest -> , Exp

MiniJava : Grammar(V)

Exp -> Exp op Exp
    -> Exp [ Exp ]
    -> Exp . length
    -> Exp . id ( ExpList )
    -> INTEGER_LITERAL
    -> true
    -> false
    -> id
    -> this
    -> new int [ Exp ]
    -> new id ( )
    -> ! Exp
    -> ( Exp )

References

• Andrew W. Appel, Modern Compiler Implementation in Java (2nd Edition), Cambridge University Press, 2002