syntax i - university of iowahomepage.cs.uiowa.edu/~tinelli/classes/111/fall08/notes/ch02-i.pdf ·...
TRANSCRIPT
![Page 1: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/1.jpg)
22c:111 Programming Language Concepts - Fall 2008
22c:111 Programming Language Concepts
Fall 2008
Copyright 2007-08, The McGraw-Hill Company and Cesare Tinelli.These notes were originally developed by Allen Tucker, Robert Noonan and modified by Cesare Tinelli. They arecopyrighted materials and may not be used in other course settings outside of the University of Iowa in their current formor modified form without the express written permission of one of the copyright holders. During this course, students areprohibited from selling notes to or being paid for taking notes by any person or commercial firm without the expresswritten permission of one of the copyright holders.
Syntax I
![Page 2: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/2.jpg)
22c:111 Programming Language Concepts - Fall 2008
Contents
2.1 Grammars2.1.1 Backus-Naur Form2.1.2 Derivations2.1.3 Parse Trees2.1.4 Associativity and Precedence2.1.5 Ambiguous Grammars
2.2 Extended BNF2.3 Syntax of a Small Language: Clite
2.3.1 Lexical Syntax2.3.2 Concrete Syntax
2.4 Compilers and Interpreters2.5 Linking Syntax and Semantics
2.5.1 Abstract Syntax2.5.2 Abstract Syntax Trees2.5.3 Abstract Syntax of Clite
![Page 3: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/3.jpg)
22c:111 Programming Language Concepts - Fall 2008
Thinking about Syntax
The syntax of a programming language is a precisedescription of all its grammatically correctprograms.
Precise syntax was first used with Algol 60, and hasbeen used ever since.
Three levels:– Lexical syntax– Concrete syntax– Abstract syntax
![Page 4: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/4.jpg)
22c:111 Programming Language Concepts - Fall 2008
Levels of Syntax
Lexical syntax = all the basic symbols of thelanguage (names, values, operators, etc.)
Concrete syntax = rules for writing expressions,statements and programs.
Abstract syntax = internal representation of theprogram, favoring content over form. E.g.,– C: if ( expr ) ... discard ( )– Ada: if ( expr ) then discard then
![Page 5: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/5.jpg)
22c:111 Programming Language Concepts - Fall 2008
2.1 Grammars
A metalanguage is a language used to define otherlanguages.
A grammar is a metalanguage used to define thesyntax of a language.
Our interest: using grammars to define the syntax ofa programming language.
![Page 6: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/6.jpg)
22c:111 Programming Language Concepts - Fall 2008
2.1.1 Backus-Naur Form (BNF)
• Stylized version of a context-free grammar (cf.Chomsky hierarchy)
• Sometimes called Backus Normal Form• First used to define syntax of Algol 60• Now used to define syntax of most major languages
![Page 7: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/7.jpg)
22c:111 Programming Language Concepts - Fall 2008
BNF Grammar
Set of productions: Pterminal symbols: Tnonterminal symbols: Nstart symbol:
A production has the form
where and!
S " N
!
A " N
!
" # (N$T) *
!
A"#
![Page 8: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/8.jpg)
22c:111 Programming Language Concepts - Fall 2008
Example: Binary Digits
Consider the grammar:binaryDigit → 0binaryDigit → 1
or equivalently:binaryDigit → 0 | 1
Here, | is a metacharacter that separates alternatives.
![Page 9: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/9.jpg)
22c:111 Programming Language Concepts - Fall 2008
2.1.2 Derivations
Consider the grammar:Integer → Digit | Integer DigitDigit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
We can derive any unsigned integer, like 352, fromthe start symbol Integer in this grammar.
![Page 10: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/10.jpg)
22c:111 Programming Language Concepts - Fall 2008
Derivation of 352 as an Integer
A 6-step process, starting with:
Integer
![Page 11: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/11.jpg)
22c:111 Programming Language Concepts - Fall 2008
Derivation of 352 (step 1)
Use a grammar rule to enable each step:
Integer ⇒ Integer Digit
![Page 12: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/12.jpg)
22c:111 Programming Language Concepts - Fall 2008
Derivation of 352 (steps 1-2)
Replace a nonterminal by a right-hand side of one ofits rules:
Integer ⇒ Integer Digit ⇒ Integer 2
![Page 13: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/13.jpg)
22c:111 Programming Language Concepts - Fall 2008
Derivation of 352 (steps 1-3)
Each step follows from the one before it.
Integer ⇒ Integer Digit ⇒ Integer 2 ⇒ Integer Digit 2
![Page 14: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/14.jpg)
22c:111 Programming Language Concepts - Fall 2008
Derivation of 352 (steps 1-4)
Integer ⇒ Integer Digit ⇒ Integer 2 ⇒ Integer Digit 2 ⇒ Integer 5 2
![Page 15: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/15.jpg)
22c:111 Programming Language Concepts - Fall 2008
Derivation of 352 (steps 1-5)
Integer ⇒ Integer Digit ⇒ Integer 2 ⇒ Integer Digit 2 ⇒ Integer 5 2 ⇒ Digit 5 2
![Page 16: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/16.jpg)
22c:111 Programming Language Concepts - Fall 2008
Derivation of 352 (steps 1-6)
You know you’re finished when there are onlyterminal symbols remaining.
Integer ⇒ Integer Digit ⇒ Integer 2 ⇒ Integer Digit 2 ⇒ Integer 5 2 ⇒ Digit 5 2 ⇒ 3 5 2
![Page 17: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/17.jpg)
22c:111 Programming Language Concepts - Fall 2008
A Different Derivation of 352
Integer ⇒ Integer Digit ⇒ Integer Digit Digit ⇒ Digit Digit Digit ⇒ 3 Digit Digit ⇒ 3 5 Digit ⇒ 3 5 2
This is called a leftmost derivation, since at each stepthe leftmost nonterminal is replaced.(The first one was a rightmost derivation.)
![Page 18: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/18.jpg)
22c:111 Programming Language Concepts - Fall 2008
Notation for Derivations
Integer ⇒* 352Means that 352 can be derived in a finite number of stepsusing the grammar for Integer.
352 ∈ L(G)Means that 352 is a member of the language defined bygrammar G.
L(G) = { ω ∈ T* | Integer ⇒* ω }Means that the language defined by grammar G is the setof all symbol strings ω that can be derived as an Integer.
![Page 19: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/19.jpg)
22c:111 Programming Language Concepts - Fall 2008
2.1.3 Parse Trees
A parse tree is a graphical representation of aderivation.Each internal node of the tree corresponds to a step in the
derivation.Each child of a node represents a right-hand side of a
production.Each leaf node represents a symbol of the derived string,
reading from left to right.
![Page 20: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/20.jpg)
22c:111 Programming Language Concepts - Fall 2008
E.g., The step Integer ⇒ Integer Digitappears in the parse tree as:
Integer
Integer Digit
![Page 21: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/21.jpg)
22c:111 Programming Language Concepts - Fall 2008
Parse Tree for 352 as an IntegerFigure 2.1
![Page 22: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/22.jpg)
22c:111 Programming Language Concepts - Fall 2008
Arithmetic Expression Grammar
The following grammar defines the language ofarithmetic expressions with 1-digit integers, addition,and subtraction.
Expr → Expr + Term | Expr – Term | TermTerm → 0 | ... | 9 | ( Expr )
![Page 23: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/23.jpg)
22c:111 Programming Language Concepts - Fall 2008
Parse of the String 5-4+3Figure 2.2
![Page 24: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/24.jpg)
22c:111 Programming Language Concepts - Fall 2008
2.1.4 Associativity and PrecedenceA grammar can be used to define associativity and
precedence among the operators in an expression.E.g., + and - are left-associative operators in mathematics;
* and / have higher precedence than + and - .
Consider the more interesting grammar G1:Expr → Expr + Term | Expr – Term | TermTerm → Term * Factor | Term / Factor |
Term % Factor | FactorFactor → Primary ** Factor | PrimaryPrimary → 0 | ... | 9 | ( Expr )
![Page 25: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/25.jpg)
22c:111 Programming Language Concepts - Fall 2008
Parse of 4**2**3+5*6+7 for Grammar G1Figure 2.3
![Page 26: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/26.jpg)
22c:111 Programming Language Concepts - Fall 2008
Precedence Associativity Operators3 right **2 left * / %1 left + -
Note: These relationships are shown by the structureof the parse tree: highest precedence at the bottom,and left-associativity on the left at each level.
Associativity and Precedence for Grammar G1Table 2.1
![Page 27: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/27.jpg)
22c:111 Programming Language Concepts - Fall 2008
2.1.5 Ambiguous GrammarsA grammar is ambiguous if one of its strings has two or
more different parse trees (equivalently, if it has twoor more different left-most derivations).E.g., Grammar G1 above is not ambiguous.
C, C++, and Java have a large number of– operators and– precedence levels
Instead of using a large grammar, we can:– Write a smaller ambiguous grammar, and– Give separate precedence and associativity (e.g., Table 2.1)
![Page 28: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/28.jpg)
22c:111 Programming Language Concepts - Fall 2008
An Ambiguous Expression Grammar G2
Expr → Expr Op Expr | ( Expr ) | IntegerOp → + | - | * | / | % | **
Notes:– G2 is equivalent to G1. I.e., its language is the same.– G2 has fewer productions and nonterminals than G1.– However, G2 is ambiguous.
![Page 29: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/29.jpg)
22c:111 Programming Language Concepts - Fall 2008
Ambiguous Parse of 5-4+3 Using Grammar G2Figure 2.4
![Page 30: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/30.jpg)
22c:111 Programming Language Concepts - Fall 2008
The Dangling Else
Block → { Statements }Statements → Statements Statement | StatementStatement → Assignment | IfStatement | BlockIfStatement → if ( Expression ) Statement |
if ( Expression ) Statement else Statement
![Page 31: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/31.jpg)
22c:111 Programming Language Concepts - Fall 2008
Example
With which ‘if’ does the following ‘else’ associate
if (x < 0)if (y < 0) y = y - 1;else y = 0;
Answer: either one!
![Page 32: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/32.jpg)
22c:111 Programming Language Concepts - Fall 2008
The Dangling Else AmbiguityFigure 2.5
![Page 33: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/33.jpg)
22c:111 Programming Language Concepts - Fall 2008
Solving the dangling else ambiguity1. Algol 60, C, C++: associate each else with
closest if; use {} or begin…end to override.2. Algol 68, Modula, Ada: use explicit delimiter to
end every conditional (e.g., if…fi)3. Java: rewrite the grammar to limit what can
appear in a conditional:IfThenStatement → if ( Expression ) StatementIfThenElseStatement → if ( Expression ) StatementNoShortIf
else Statement
The category StatementNoShortIf includes allexcept IfThenStatement.
![Page 34: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/34.jpg)
22c:111 Programming Language Concepts - Fall 2008
2.2 Extended BNF (EBNF)
BNF:– recursion for iteration– nonterminals for grouping
EBNF: additional metacharacters– { } for a series of zero or more
– ( ) for a list, must pick one
– [ ] for an optional list; pick none or one
![Page 35: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/35.jpg)
22c:111 Programming Language Concepts - Fall 2008
EBNF Examples
Expression is a list of one or more Terms separated byoperators + and -Expression → Term { ( + | - ) Term }IfStatement → if ( Expression ) Statement [ else Statement ]
C-style EBNF lists alternatives vertically and uses opt tosignify optional parts. E.g.,IfStatement:
if ( Expression ) Statement ElsePartoptElsePart:
else Statement
![Page 36: Syntax I - University of Iowahomepage.cs.uiowa.edu/~tinelli/classes/111/Fall08/Notes/ch02-I.pdf · 22c:111 Programming Language Concepts - Fall 2008 Contents ... and Java have a large](https://reader035.vdocument.in/reader035/viewer/2022062907/5a9e47007f8b9a0d7f8c89ac/html5/thumbnails/36.jpg)
22c:111 Programming Language Concepts - Fall 2008
EBNF to BNF
We can always rewrite an EBNF grammar as a BNFgrammar. E.g.,
A → x { y } zcan be rewritten:
A → x A' zA' → ε | y A'
(Rewriting EBNF rules with ( ), [ ] is left as an exercise.)
While EBNF is no more powerful than BNF, its rules areoften simpler and clearer.