compiler construction parsing ii ran shaham and ohad shacham school of computer science tel-aviv...
Post on 22-Dec-2015
Compiler Construction
Parsing II
Ran Shaham and Ohad Shacham
School of Computer Science
Tel-Aviv University
Administration
Forum https://forums.cs.tau.ac.il/viewforum.php?f=64
Submit only source files
Add +1 to yyline
Please read IC Spec carefully
  No --
  No ++
  Class identifier starts with an upper-case letter
  Other identifiers start with a lower-case letter
PA1 submission
Sources only
According to the given hierarchy
A brief, clear, and concise description of your code structure and testing strategy
Put it in: ~/IC_COMPILER/PA1/
Don’t develop in this directory and try to compile using ant
IC compiler
[Figure: compiler pipeline. IC program (ic) → Lexical Analysis → Syntax Analysis (Parsing) → AST, Symbol Table, etc. → Inter. Rep. (IR) → Code Generation → x86 executable (exe)]
Parsing
Input: Sequence of Tokens
Output: Abstract Syntax Tree
Decide whether program satisfies syntactic structure
From text to abstract syntax
program text: 5 + (7 * x)
Lexical Analyzer
token stream: num+(num*id)
Parser
Grammar:
  E → id
  E → num
  E → E + E
  E → E * E
  E → ( E )
Parser result: valid, or syntax error
[Figure: parse tree for num(5) + ( num(7) * id(x) ) next to the abstract syntax tree +(Num(5), *(Num(7), id(x)))]
Usage
Syntax analysis
  Checks the input syntax validity
Semantic analysis
  Checks the input meaning
Expression calculator
expr → expr + expr
     | expr - expr
     | expr * expr
     | expr / expr
     | - expr
     | ( expr )
     | number
Goals of expression calculator parser:
• Is 2+3+4+5 a valid expression?
• What is the meaning (value) of this expression?
High-level structure
[Figure: text → Lexical analyzer → tokens → Parser → AST]
Lexer spec (IC.lex) → JFlex → .java (IC/Parser/Lexer.java) → javac → Lexical analyzer
Parser spec (IC.cup, Library.cup) → JavaCup → .java (IC/Parser/sym.java, Parser.java, LibraryParser.java) → javac → Parser
(Token.java)
Cup
Constructor of Useful Parsers
Automatic LALR(1) parser generator
Input: cup spec file
Output: Syntax analyzer in Java
[Figure: Parser spec → JavaCup → .java → javac → Parser; tokens → Parser → AST]
Expression calculator
terminal Integer NUMBER;
terminal PLUS, MINUS, MULT, DIV;
terminal LPAREN, RPAREN;
non terminal Integer expr;
expr ::= expr PLUS expr
       | expr MINUS expr
       | expr MULT expr
       | expr DIV expr
       | MINUS expr
       | LPAREN expr RPAREN
       | NUMBER
       ;
Symbol type (Integer) explained later
Ambiguities
a * b + c
  two parse trees: (a * b) + c and a * (b + c)
a + b + c
  two parse trees: (a + b) + c and a + (b + c)
Expression calculator
terminal Integer NUMBER;
terminal PLUS, MINUS, MULT, DIV;
terminal LPAREN, RPAREN;
terminal UMINUS;
non terminal Integer expr;
precedence left PLUS, MINUS;
precedence left DIV, MULT;
precedence left UMINUS;
expr ::= expr PLUS expr
       | expr MINUS expr
       | expr MULT expr
       | expr DIV expr
       | MINUS expr %prec UMINUS
       | LPAREN expr RPAREN
       | NUMBER
       ;
Precedence declarations: increasing precedence from first to last
%prec UMINUS: contextual precedence
Disambiguation
Each terminal is assigned a precedence
  By default all terminals have lowest precedence
  User can assign his own precedence
CUP assigns each production a precedence
  Precedence of last terminal in production
    expr MINUS expr
  User-specified contextual precedence
    MINUS expr %prec UMINUS
Disambiguation
On a shift/reduce conflict, CUP resolves the ambiguity by comparing the precedence of the terminal to be shifted with the precedence of the production to be reduced
  Terminal has higher precedence: shift
  Production has higher precedence: reduce
In case of equal precedences, left/right help resolve conflicts
  left means reduce
  right means shift
More information on precedence declarations in CUP’s manual
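The comparison described above can be sketched in plain Java. This is only an illustration of the decision, not CUP's own code: the class `PrecedenceDemo`, its enums, the `resolve` helper, and the numeric precedence levels are hypothetical names chosen for the example.

```java
public class PrecedenceDemo {
    public enum Assoc { LEFT, RIGHT, NONASSOC }
    public enum Action { SHIFT, REDUCE, ERROR }

    // Hypothetical helper: decide one shift/reduce conflict.
    // prodPrec  - precedence of the production that could be reduced
    // termPrec  - precedence of the terminal that could be shifted
    // termAssoc - associativity declared for that terminal's level
    public static Action resolve(int prodPrec, int termPrec, Assoc termAssoc) {
        if (prodPrec > termPrec) return Action.REDUCE;
        if (termPrec > prodPrec) return Action.SHIFT;
        // Equal precedence: associativity decides.
        switch (termAssoc) {
            case LEFT:  return Action.REDUCE;
            case RIGHT: return Action.SHIFT;
            default:    return Action.ERROR; // nonassoc
        }
    }

    public static void main(String[] args) {
        // Assumed levels, mirroring the order of the precedence declarations.
        final int plus = 1, mult = 2, uminus = 3;
        // a + b . + c : reduce first, giving (a + b) + c
        System.out.println(resolve(plus, plus, Assoc.LEFT));
        // a + b . * c : shift, giving a + (b * c)
        System.out.println(resolve(plus, mult, Assoc.LEFT));
        // a * b . + c : reduce, giving (a * b) + c
        System.out.println(resolve(mult, plus, Assoc.LEFT));
        // - a . * b with %prec UMINUS : reduce, giving (-a) * b
        System.out.println(resolve(uminus, mult, Assoc.LEFT));
    }
}
```

The dot in each comment marks the parser position at the conflict; the first argument is the precedence of the production already on the stack, the second that of the next terminal.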
Resolving ambiguity
a + b + c
precedence left PLUS
  selects (a + b) + c over a + (b + c)
Resolving ambiguity
a * b + c
precedence left PLUS
precedence left MULT
  selects (a * b) + c over a * (b + c)
Resolving ambiguity
- a * b
MINUS expr %prec UMINUS
  selects (-a) * b over -(a * b)
Resolving ambiguity
terminal Integer NUMBER;
terminal PLUS, MINUS, MULT, DIV;
terminal LPAREN, RPAREN;
terminal UMINUS;
precedence left PLUS, MINUS;
precedence left DIV, MULT;
precedence left UMINUS;
expr ::= expr PLUS expr
       | expr MINUS expr
       | expr MULT expr
       | expr DIV expr
       | MINUS expr %prec UMINUS
       | LPAREN expr RPAREN
       | NUMBER
       ;
MINUS expr %prec UMINUS: rule has precedence of UMINUS
UMINUS never returned by scanner (used only to define precedence)
More CUP directives
precedence nonassoc NEQ
  Non-associative operators: < > == != etc.
  1<2<3 identified as an error (semantic error?)
  6 == 7 == 8 == 9
start non-terminal
  Specifies start non-terminal other than first non-terminal
  Can change to test parts of grammar
Getting internal representation
  Command line options: -dump_grammar -dump_states -dump_tables -dump
CUP API
Link on the course web page to API
Parser extends java_cup.runtime.lr_parser
Various methods to report syntax errors, e.g., override syntax_error(Symbol cur_token)
Scanner integration
import java_cup.runtime.*;
%%
%cup
%eofval{
  return new Symbol(sym.EOF);
%eofval}
NUMBER=[0-9]+
%%
<YYINITIAL>"+" { return new Symbol(sym.PLUS); }
<YYINITIAL>"-" { return new Symbol(sym.MINUS); }
<YYINITIAL>"*" { return new Symbol(sym.MULT); }
<YYINITIAL>"/" { return new Symbol(sym.DIV); }
<YYINITIAL>"(" { return new Symbol(sym.LPAREN); }
<YYINITIAL>")" { return new Symbol(sym.RPAREN); }
<YYINITIAL>{NUMBER} { return new Symbol(sym.NUMBER, new Integer(yytext())); }
<YYINITIAL>\n { }
<YYINITIAL>. { }
Parser gets terminals from the scanner
sym is generated from the token declarations in the .cup file
Recap
Package and import specifications and user code components
Symbol (terminal and non-terminal) lists
  Define building-blocks of the grammar
Precedence declarations
  May help resolve conflicts
The grammar
  May introduce conflicts that have to be resolved
Assigning meaning
So far, only validation
Add Java code implementing semantic actions
expr ::= expr PLUS expr
       | expr MINUS expr
       | expr MULT expr
       | expr DIV expr
       | MINUS expr %prec UMINUS
       | LPAREN expr RPAREN
       | NUMBER
       ;
Assigning meaning
Symbol labels used to name variables
RESULT names the left-hand side symbol
expr ::= expr:e1 PLUS expr:e2
         {: RESULT = new Integer(e1.intValue() + e2.intValue()); :}
       | expr:e1 MINUS expr:e2
         {: RESULT = new Integer(e1.intValue() - e2.intValue()); :}
       | expr:e1 MULT expr:e2
         {: RESULT = new Integer(e1.intValue() * e2.intValue()); :}
       | expr:e1 DIV expr:e2
         {: RESULT = new Integer(e1.intValue() / e2.intValue()); :}
       | MINUS expr:e1
         {: RESULT = new Integer(0 - e1.intValue()); :} %prec UMINUS
       | LPAREN expr:e1 RPAREN
         {: RESULT = e1; :}
       | NUMBER:n
         {: RESULT = n; :}
       ;
Building an AST
More useful representation of syntax tree
  Less clutter
  Actual level of detail depends on your design
Basis for semantic analysis
  Later annotated with various information
    Type information
    Computed values
Parse tree vs. AST
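[Figure: parse tree for 1 + (2) + (3), with an expr node at every level and the parentheses as leaves, next to the AST, which keeps only the two + nodes and the leaves 1, 2, 3]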
AST construction
AST nodes constructed during parsing
  Stored in push-down stack
Bottom-up parser
  Grammar rules annotated with actions for AST construction
  When node is constructed all children available (already constructed)
  Node (RESULT) pushed on stack
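A rough sketch of the stack discipline described above, hand-simulating the reductions for 1 + 2 + 3. It is not what CUP generates (the real parse stack holds grammar symbols, each carrying a value); the class `StackDemo`, the node types `IntConst` and `Plus`, and the `reduce...` helpers are hypothetical names for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StackDemo {
    // Hypothetical AST node types for the sketch.
    public interface Node { }

    public static final class IntConst implements Node {
        public final int val;
        public IntConst(int val) { this.val = val; }
        @Override public String toString() { return Integer.toString(val); }
    }

    public static final class Plus implements Node {
        public final Node e1, e2;
        public Plus(Node e1, Node e2) { this.e1 = e1; this.e2 = e2; }
        @Override public String toString() { return "plus(" + e1 + ", " + e2 + ")"; }
    }

    private final Deque<Node> stack = new ArrayDeque<>();

    // Reduce by expr ::= INT_CONST: no child nodes to pop, push the new leaf.
    public void reduceIntConst(int value) {
        stack.push(new IntConst(value));
    }

    // Reduce by expr ::= expr PLUS expr: both children are already on the stack.
    public void reducePlus() {
        Node e2 = stack.pop();        // right operand was pushed last
        Node e1 = stack.pop();
        stack.push(new Plus(e1, e2)); // the new node plays the role of RESULT
    }

    public Node result() { return stack.peek(); }

    // Reductions in the order a left-associative bottom-up parse performs them.
    public static Node build() {
        StackDemo p = new StackDemo();
        p.reduceIntConst(1);
        p.reduceIntConst(2);
        p.reducePlus();
        p.reduceIntConst(3);
        p.reducePlus();
        return p.result();
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints plus(plus(1, 2), 3)
    }
}
```

Each parent node is created only after its children, which is why an action can use them directly but cannot reach its own parent.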
AST construction
expr ::= expr:e1 PLUS expr:e2
         {: RESULT = new plus(e1,e2); :}
       | LPAREN expr:e RPAREN
         {: RESULT = e; :}
       | INT_CONST:i
         {: RESULT = new int_const(…, i); :}
Sentential forms during the bottom-up parse:
  1 + (2) + (3)
  expr + (2) + (3)
  expr + (expr) + (3)
  expr + (3)
  expr + (expr)
  expr
[Figure: parse tree next to the AST built by the actions: plus(e1 = plus(e1 = int_const(val = 1), e2 = int_const(val = 2)), e2 = int_const(val = 3))]
Designing an AST
terminal Integer NUMBER;
terminal PLUS, MINUS, MULT, DIV, LPAREN, RPAREN, SEMI;
terminal UMINUS;
non terminal Integer expr;
non terminal expr_list, expr_part;
precedence left PLUS, MINUS;
precedence left DIV, MULT;
precedence left UMINUS;
expr_list ::= expr_list expr_part
            | expr_part
            ;
expr_part ::= expr:e {: System.out.println("= " + e); :} SEMI
            ;
expr ::= expr PLUS expr
       | expr MINUS expr
       | expr MULT expr
       | expr DIV expr
       | MINUS expr %prec UMINUS
       | LPAREN expr RPAREN
       | NUMBER
       ;
Designing an AST
Rules of thumb
  Interfaces or abstract classes for non-terminals with alternatives
  Class for each non-terminal or group of related non-terminals with similar functionality
Remember - bottom-up
  When constructing a node, children nodes are already constructed but the parent is not constructed yet
Designing an AST
expr_list ::= expr_list expr_part
            | expr_part
            ;
expr_part ::= expr SEMI
            ;
expr ::= expr PLUS expr
       | expr MINUS expr
       | expr MULT expr
       | expr DIV expr
       | MINUS expr %prec UMINUS
       | LPAREN expr RPAREN
       | NUMBER
       ;
Alternative 1: op type field of Expr
  ExprProgram
  Expr
Alternative 2: class for each op
  PlusExpr
  MinusExpr
  MultExpr
  DivExpr
  UnaryMinusExpr
  ValueExpr
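Alternative 2 could look roughly like the sketch below: an abstract `Expr` with one subclass per operator. Only the class names come from the slide; the wrapper class `AstAlternatives`, the `eval()` method, the constructors, and `sample()` are assumptions added for illustration. `MinusExpr` and `DivExpr` would follow the same pattern as `PlusExpr` and are left out to keep the sketch short.

```java
public class AstAlternatives {
    // Abstract base for the non-terminal expr, which has several alternatives.
    public static abstract class Expr {
        public abstract int eval(); // hypothetical operation, not from the slides
    }

    public static final class ValueExpr extends Expr {
        private final int value;
        public ValueExpr(int value) { this.value = value; }
        @Override public int eval() { return value; }
    }

    public static final class PlusExpr extends Expr {
        private final Expr left, right;
        public PlusExpr(Expr left, Expr right) { this.left = left; this.right = right; }
        @Override public int eval() { return left.eval() + right.eval(); }
    }

    public static final class MultExpr extends Expr {
        private final Expr left, right;
        public MultExpr(Expr left, Expr right) { this.left = left; this.right = right; }
        @Override public int eval() { return left.eval() * right.eval(); }
    }

    public static final class UnaryMinusExpr extends Expr {
        private final Expr operand;
        public UnaryMinusExpr(Expr operand) { this.operand = operand; }
        @Override public int eval() { return -operand.eval(); }
    }

    // AST for -2 * 3 + 4, shaped as the precedence declarations dictate:
    // ((-2) * 3) + 4
    public static Expr sample() {
        return new PlusExpr(
            new MultExpr(new UnaryMinusExpr(new ValueExpr(2)), new ValueExpr(3)),
            new ValueExpr(4));
    }

    public static void main(String[] args) {
        System.out.println(sample().eval()); // prints -2
    }
}
```

Compared with a single `Expr` class carrying an operator field, this trades more classes for per-operator behavior without a switch on the operator.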
Designing an AST
terminal Integer NUMBER;
non terminal Expr expr, expr_part;
non terminal ExprProgram expr_list;
expr_list ::= expr_list:el expr_part:ep
              {: RESULT = el.addExpressionPart(ep); :}
            | expr_part:ep
              {: RESULT = new ExprProgram(ep); :}
            ;
expr_part ::= expr:e SEMI
              {: RESULT = e; :}
            ;
expr ::= expr:e1 PLUS expr:e2
         {: RESULT = new Expr(e1,e2,"PLUS"); :}
       | expr:e1 MINUS expr:e2
         {: RESULT = new Expr(e1,e2,"MINUS"); :}
       | expr:e1 MULT expr:e2
         {: RESULT = new Expr(e1,e2,"MULT"); :}
       | expr:e1 DIV expr:e2
         {: RESULT = new Expr(e1,e2,"DIV"); :}
       | MINUS expr:e1
         {: RESULT = new Expr(e1,"UMINUS"); :} %prec UMINUS
       | LPAREN expr:e1 RPAREN
         {: RESULT = e1; :}
       | NUMBER:n
         {: RESULT = new Expr(n); :}
       ;
Designing an AST
public abstract class ASTNode {
    // common AST nodes functionality
}
public class Expr extends ASTNode {
    private int value;
    private Expr left;
    private Expr right;
    private String operator;
    // leaf node holding a number
    public Expr(Integer val) {
        value = val.intValue();
    }
    // unary operator node
    public Expr(Expr operand, String op) {
        this.left = operand;
        this.operator = op;
    }
    // binary operator node
    public Expr(Expr left, Expr right, String op) {
        this.left = left;
        this.right = right;
        this.operator = op;
    }
}
Computing meaning
Evaluate expression by AST traversal
Traversal for debug printing
Later: annotate AST
More on AST next recitation
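Evaluation by traversal might be sketched as below for an `Expr` node that stores its operator as a string. The fields and constructors follow the shape of the `Expr` class in this recitation; the wrapper class `EvalDemo`, the `eval()` method, and `sample()` are assumptions made for the example.

```java
public class EvalDemo {
    public static class Expr {
        private int value;
        private Expr left;
        private Expr right;
        private String operator; // null for a leaf holding a number

        public Expr(Integer val) { value = val.intValue(); }
        public Expr(Expr operand, String op) { this.left = operand; this.operator = op; }
        public Expr(Expr left, Expr right, String op) {
            this.left = left; this.right = right; this.operator = op;
        }

        // Post-order traversal: evaluate the children first, then combine.
        public int eval() {
            if (operator == null) return value;
            if (right == null) {
                if (operator.equals("UMINUS")) return -left.eval();
                throw new IllegalStateException("unknown unary operator " + operator);
            }
            int l = left.eval();
            int r = right.eval();
            switch (operator) {
                case "PLUS":  return l + r;
                case "MINUS": return l - r;
                case "MULT":  return l * r;
                case "DIV":   return l / r; // integer division, as in the calculator actions
                default: throw new IllegalStateException("unknown operator " + operator);
            }
        }
    }

    // AST for (1 + 2) * -4
    public static int sample() {
        Expr sum = new Expr(new Expr(1), new Expr(2), "PLUS");
        Expr neg = new Expr(new Expr(4), "UMINUS");
        return new Expr(sum, neg, "MULT").eval();
    }

    public static void main(String[] args) {
        System.out.println(sample()); // prints -12
    }
}
```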
PA2
Write parser for IC
Write parser for libic.sig
Check syntax
  Emit either “Parsed [file] successfully!” or “Syntax error in [file]: [details]”
-print-ast option
  Prints one AST node per line
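One way to print one AST node per line is a pre-order traversal that indents each node by its depth. The sketch below is only an illustration: the generic `Node` class, the labels, and the two-space indentation are assumptions, and the exact output format required for PA2 is not given here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PrintAstDemo {
    // Hypothetical generic node: a label plus an ordered list of children.
    public static final class Node {
        final String label;
        final List<Node> children;
        public Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }
    }

    // Pre-order traversal producing one line per node, indented by depth.
    public static List<String> lines(Node root) {
        List<String> out = new ArrayList<>();
        collect(root, 0, out);
        return out;
    }

    private static void collect(Node n, int depth, List<String> out) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) sb.append("  ");
        out.add(sb.append(n.label).toString());
        for (Node c : n.children) collect(c, depth + 1, out);
    }

    public static void main(String[] args) {
        Node ast = new Node("plus", new Node("int_const 1"), new Node("int_const 2"));
        for (String line : lines(ast)) System.out.println(line);
    }
}
```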
PA2 – step 1
Understand IC grammar in the manual
  Don’t touch the keyboard before understanding spec
Write a debug JavaCup spec for IC grammar
  A spec with “debug actions”: print-out debug messages to understand what’s going on
Try “debug grammar” on a number of test cases
Keep a copy of “debug grammar” spec around
Optional: perform error recovery
  Use JavaCup error token
PA2 – step 2
Flesh out AST class hierarchy
  Don’t touch the keyboard before you understand the hierarchy
  Keep in mind that this is the basis for later stages
  Web-site contains an AST adapted with permission from Tovi Almozlino
Change CUP actions to construct AST nodes
Partial example of main
import java.io.*;
import IC.Lexer.Lexer;
import IC.Parser.*;
import IC.AST.*;
public class Compiler {
    public static void main(String[] args) {
        try {
            FileReader txtFile = new FileReader(args[0]);
            Lexer scanner = new Lexer(txtFile);
            Parser parser = new Parser(scanner);
            // parser.parse() returns Symbol, we use its value
            ProgAST root = (ProgAST) parser.parse().value;
            System.out.println("Parsed " + args[0] + " successfully!");
        } catch (SyntaxError e) {
            System.out.print("Syntax error in " + args[0] + ": " + e);
        }
        if (libraryFileSpecified) {
            ...
            try {
                FileReader libicFile = new FileReader(libPath);
                Lexer scanner = new Lexer(libicFile);
                LibraryParser parser = new LibraryParser(scanner);
                ClassAST root = (ClassAST) parser.parse().value;
                System.out.println("Parsed " + libPath + " successfully!");
            } catch (SyntaxError e) {
                System.out.print("Syntax error in " + libPath + ": " + e);
            }
        }
        ...