abstract syntax mooly sagiv schrierber 317 03-640-7606 wed 10:00-12:00...

19
Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://www.math.tau.ac.il/~msagiv/ courses/wcc.html

Upload: eileen-wheeler

Post on 04-Jan-2016

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://msagiv/courses/wcc.html

Abstract Syntax

Mooly Sagiv

Schrierber 31703-640-7606

Wed 10:00-12:00

html://www.math.tau.ac.il/~msagiv/courses/wcc.html

Page 2: Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://msagiv/courses/wcc.html

Outline• The general idea

• Bison

• Motivating example Interpreter for arithmetic expressions

• The need for abstract syntax

• Abstract syntax for Straight-line code

• Abstract syntax for Tiger (Targil)

Page 3: Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://msagiv/courses/wcc.html

Semantic Analysis during Recursive Descent Parsing

• Scanner returns “semantic values” for some tokens

• The function of every non-terminal returns the “corresponding subtree value”

• When A ::= B C D is appliedthe function for A can use the values returned by B, C, and D– The function can also pass parameters, e.g., to

D(), reflecting left contexts

Page 4: Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://msagiv/courses/wcc.html

E ::= num E’

E’ ::=empty-string

E’ := + num E’

int E() {

swith(tok) {

case num : temp=tok.val; eat(num);

return EP(temp);

default: error(…); }}

int EP(int left) {

swith(tok) {

case $: return left;

case + : eat(+);

temp=tok.val; eat(num);

return EP(left + temp);

default: error(…) ;}}

Page 5: Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://msagiv/courses/wcc.html

Semantic Analysis during Bottom-Up Parsing

• Scanner returns “semantic values” for some tokens

• Use parser stack to store the “corresponding subtree values”

• When A ::= B C D is reducedthe function for A can use the values returned by B, C, and D

• No action in the middle of the rule

Page 6: Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://msagiv/courses/wcc.html

E ::= E + num

E::= num

Example

7

E

+

5

num

12

E

Page 7: Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://msagiv/courses/wcc.html

Bison Specification

Declarations

%%

Productions

%%

C -Routines

Page 8: Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://msagiv/courses/wcc.html

Interpreter (in Bison)%{ declarations of yylex()

and yyeror() %}%union {

int num; string id;}%token <id> ID%token <num> NUM%type <num> e f t%start e%%

e : e ‘+’ t {$$ = $1 + $3 ; } | e ‘-’ t { $$ = $1 - $3 ; } | t { $$ = $1; } ; t : t ‘*’ f { $$ = $1 * $3; } | t ‘/’ f { $$= $1 / $3; } | f { $$ = $1; } ; f : NUM { $$ = $1; } | ID { $$ = lookup($1); } | ‘-’ e { $$ = - $2; } | ‘(‘ e ‘)’ { $$ = $2; } ;

Page 9: Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://msagiv/courses/wcc.html

Interpreter (compact spec.)%{ declarations of yylex()

and yyeror()%}%union {

int num; string id;}%token <id> ID%token <num> NUM%type <num> e%start e%left PLUS MINUS%left MUL DIV%right UMINUS%%

e : e PLUS e {$$ = $1 + $3 ; } | e MINUS e { $$ = $1 - $3 ; } | e MUL e { $$ = $1 * $3; } | e DIV e { $$= $1 / $3; } | NUM | ID { $$ = lookup($1); } | MINUS e % prec UMINUS { $$ = - $2; } | ‘(‘ e ‘)’ { $$ = $2; } ;

Page 10: Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://msagiv/courses/wcc.html

stack input action

$ 7+11+17$ shift

num 7

$

+11+17$ reduce e ::= num

e 7

$

+11+17$ shift

+

e 7

$

11+17$ shift

num 11

+

e 7

$

+17$ reduce e::= num

Page 11: Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://msagiv/courses/wcc.html

stack input action

e 11

+

e 7

$

+17$ reduce e :=: e+e

e 18

$

+17$ shift

+

e 18

$

17$ shift

num 17

+

e 18

$

$ reduce e::= num

Page 12: Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://msagiv/courses/wcc.html

stack input action

e 17

+

e 18

$

$ reduce e::= e+e

e 35

$

$ accept

Page 13: Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://msagiv/courses/wcc.html

So why can’t we write

all the compiler code

in Bison?

Page 14: Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://msagiv/courses/wcc.html

%{typdef struct table *Table_ ;typedef Table_ struct {string id, int value, Table _tail}Table_ Table(string id, int value, struct table *tail);Table_ table=NULL;int lookup(Table_ table, string id) { assert(table!=NULL) if (id==table.id) return table.value; else return lookup(table.tail, id)}void update(Table_ *tabptr, string id, int value) { *tabptr = Table(id, value, *tabptr);}%}%union {int num; string id;}%token <num> INT%token <id> ID%token ASSIGN PRINT LPAREN RPAREN%type <num> exp%left SEMICOLUMN COMMA%left PLUS MINUS%left TIMES DIV%start prog%%

prog : stm ;stm: stm SEMICOLUMN stm | ID ASSIGN exp

{update(&table, $1, $3); } | PRINT LPAREN exps RPAREN

{printf(“\n”); } ;exps : exp

{printf(“%d”, $1) ;} | exps COMMA exp

{printf(“%d”, $3) ; exp : INT {$$=$1;} | ID {$$=lookup(table, $1);} | exp PLUS exp { $$ = $1 + $3; } | exp MINUS exp { $$= $1 - $3; } | exp TIMES exp { $$ = $1 * $3;} | exp DIV exp { $$ = $1 / $3; } | stm COMMA exp { $$ =$3;} | ‘(‘ exp ‘)’ { $$ = $2; }

Page 15: Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://msagiv/courses/wcc.html

Historical Perspective• Originally parsers were written w/o tools

• yacc, bison, ... make tools acceptable

• But it is still difficult to write compilers in parser actions (top-down and bottom-up)– Natural grammars are ambiguous– No modularity principle– Many useful programming language features

prevent code generation while parsing• Use before declaration

• gotos

Page 16: Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://msagiv/courses/wcc.html

Abstract Syntax• Intermediate program representation

• Defines a tree - Preserves program hierarchy

• Generated by the parser

• Declared using an (ambiguous) context free grammar (relatively flat) Not meant for parsing

• Keywords and punctuation symbols are not stored (Not relevant once the tree exists)

• Big programs can be also handled (possibly via virtual memory)

Page 17: Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://msagiv/courses/wcc.html

Abstract Syntax for Straight-line ProgramStm ::= Stm Stm (CompoundStm)

Stm ::= id Exp (AssignStm)

Stm ::= ExpList (PrintStm)

Exp ::= id (IdExp)

Exp ::=num (NumExp)

Exp ::=Exp Binop Exp (OpExp)

Exp ::=Stm Exp (EseqExp)

ExpList ::=Exp ExpList (PairExpList)

ExpList ::=Exp (LastExpList)

Binop ::=+ (Plus)Binop ::=- (Minus)

Binop ::=*

Binop ::=/

(Times)

(Div)

Page 18: Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://msagiv/courses/wcc.html

%{#include “absyn.h”%}%union {int num; string id; A_stm stm ; A_exp exp ; A_expList expList ;}%token <num> INT%token <id> ID%token ASSIGN PRINT LPAREN RPAREN%type <num> exp%left SEMICOLUMN COMMA%left PLUS MINUS%left TIMES DIV%start prog%%

prog : stm { $$ = $1 ;} ;stm: stm SEMICOLUMN stm

{ $$ = A_CompoundStm($1, $3) ; } | ID ASSIGN exp

{$$ = A_AssignStm($1, $3); } | PRINT LPAREN exps RPAREN

{$$ = A_PrintStm($3); } ;exps : exp

{$$ = A_ExpList($1, NULL) ;} | exps COMMA exp

{$$ = A_ExpList($1, $3) ; exp : INT {$$=A_NumExp($1);} | ID {$$=A_IdExp( $1);} | exp PLUS exp { $$ = A_OpExp($1, A_Plus, $3); } | exp MINUS exp { $$= A_OpExp($1, A_Minus, $3);} | exp TIMES exp { $$ = A_OpExp($1, A_Time, $3);} | exp DIV exp { $$ = A_OpExp($1, A_Div, $3);} | exp COMMA exp { $$ =A_EseqExp($1, $3);} | ‘(‘ exp ‘)’ { $$ = $2; }

Page 19: Abstract Syntax Mooly Sagiv Schrierber 317 03-640-7606 Wed 10:00-12:00 html://msagiv/courses/wcc.html

Summary

• Flex and Bison simplify the task of writing compiler/interpreter front-ends

• Abstract syntax provides a clear interface with other compiler phases– Supports general programming languages

• But the design of an abstract syntax for a given PL may take some time