Note

• As usual, these notes are based on the Sebesta text.

• The tree diagrams in these slides are from the lecture slides provided in the instructor resources for the text, and were made by David Garrett.

Context-free (CF) grammar

• A CF grammar is formally presented as a 4-tuple G = (T, NT, P, S), where:
– T is a set of terminal symbols (the alphabet)
– NT is a set of non-terminal symbols
– P is a set of productions (or rules), where P ⊆ NT × (T ∪ NT)*
– S ∈ NT is the start symbol

Example 1

L1 = { 0, 00, 1, 11 }

G1 = ( {0,1}, {S}, { S → 0, S → 00, S → 1, S → 11 }, S )
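The 4-tuple can be written down directly as data. The following is a minimal Python sketch of Example 1, assuming a dictionary encoding of the productions; the names `T`, `NT`, `P`, `START`, and `language` are illustrative and do not come from the text. It checks that every rule has the form required of P and then enumerates the sentences by breadth-first expansion.

```python
from collections import deque

# G1 = (T, NT, P, S) as plain Python data (this encoding is an assumption).
T = {"0", "1"}
NT = {"S"}
P = {"S": [["0"], ["0", "0"], ["1"], ["1", "1"]]}
START = "S"

# Every LHS must be a nonterminal and every RHS symbol must be in T or NT.
assert all(lhs in NT for lhs in P)
assert all(sym in T | NT for alts in P.values() for rhs in alts for sym in rhs)


def language(productions, start, nonterminals, max_len=10):
    """Enumerate all sentences of at most max_len symbols.

    Forms longer than max_len are pruned, which is safe only when no rule
    shrinks a sentential form (true for the grammars in these notes).
    """
    sentences = set()
    seen = {(start,)}
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        # Locate the leftmost nonterminal, if there is one.
        idx = next((i for i, sym in enumerate(form) if sym in nonterminals), None)
        if idx is None:
            sentences.add("".join(form))  # only terminals left: a sentence
            continue
        for rhs in productions[form[idx]]:
            new_form = form[:idx] + tuple(rhs) + form[idx + 1:]
            if len(new_form) <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return sentences


print(sorted(language(P, START, NT)))  # ['0', '00', '1', '11']
```

Because G1 has no recursive rules, the enumeration terminates on its own and yields exactly L1.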

Example 2

L2 = { the dog chased the dog, the dog chased a dog, a dog chased the dog, a dog chased a dog, the dog chased the cat, … }

G2 = ( { a, the, dog, cat, chased }, { S, NP, VP, Det, N, V }, { S → NP VP, NP → Det N, Det → a | the, N → dog | cat, VP → V | VP NP, V → chased }, S )

Notes: S = Sentence, NP = Noun Phrase, N = Noun, VP = Verb Phrase, V = Verb, Det = Determiner

Examples of lexemes and tokens

Lexemes    Tokens
foo        identifier
i          identifier
sum        identifier
-3         integer_literal
10         integer_literal
1          integer_literal
;          statement_separator
=          assignment_operator
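The table pairs each lexeme (the actual character string) with its token (the category it belongs to). A small Python sketch of a lexer that produces such pairs follows. The regular expressions and the names `TOKEN_SPEC` and `tokenize` are assumptions made for illustration, and treating a leading minus sign as part of an integer literal is a simplification that matches the table rather than any particular language.

```python
import re

# (token name, pattern) pairs; earlier entries win when several could match.
TOKEN_SPEC = [
    ("integer_literal", r"-?\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("assignment_operator", r"="),
    ("statement_separator", r";"),
    ("skip", r"\s+"),  # whitespace is matched but not reported
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (lexeme, token) pairs for the input text."""
    pairs = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "skip":
            pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs


for lexeme, token in tokenize("sum = -3; i = 10; foo = 1"):
    print(f"{lexeme:<5} {token}")
```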

BNF Fundamentals

• Sample rules [p. 128]

<assign> → <var> = <expression>
<if_stmt> → if <logic_expr> then <stmt>
<if_stmt> → if <logic_expr> then <stmt> else <stmt>

• non-terminals/tokens are surrounded by < and >
• lexemes are not surrounded by < and >
• keywords in the language are in bold
• → separates LHS from RHS
• | expresses alternative expansions for the LHS

<if_stmt> → if <logic_expr> then <stmt> | if <logic_expr> then <stmt> else <stmt>

• = is a lexeme in this example

BNF Rules

• A rule has a left-hand side (LHS) and a right-hand side (RHS), and consists of terminal and nonterminal symbols

• A grammar is often given simply as a set of rules (the terminal and non-terminal sets are implicit in the rules, as is the start symbol)

Describing Lists

• There are many situations in which a programming language allows a list of items (e.g. parameter list, argument list).

• The shortest such list is typically either empty or a single item.

• Such lists are typically not bounded.

• How is their structure described?

Describing lists

• They are described using recursive rules.

• Here is a pair of rules describing a list of identifiers, whose minimum length is one:

<ident_list> → ident | ident, <ident_list>

• Notice that ‘,’ is part of the object language
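The recursion in the rule maps directly onto a recursive function. Below is a minimal Python sketch of a recognizer for `<ident_list>`, assuming the input has already been split into a list of token strings; the function names and the use of `str.isidentifier` as the test for `ident` are illustrative choices, not part of the text.

```python
def parse_ident_list(tokens, pos=0):
    """Recognize <ident_list> → ident | ident , <ident_list> starting at pos.

    Returns the index just past the list, or raises SyntaxError.
    """
    if pos >= len(tokens) or not tokens[pos].isidentifier():
        raise SyntaxError(f"expected an identifier at position {pos}")
    pos += 1
    if pos < len(tokens) and tokens[pos] == ",":
        # Second alternative: ident , <ident_list>
        return parse_ident_list(tokens, pos + 1)
    # First alternative: a single ident
    return pos


def is_ident_list(tokens):
    """True when the whole token list is one well-formed identifier list."""
    try:
        return parse_ident_list(tokens) == len(tokens)
    except SyntaxError:
        return False


print(is_ident_list(["a", ",", "b", ",", "c"]))  # True
print(is_ident_list(["a", ","]))                 # False
print(is_ident_list([]))                         # False (minimum length is one)
```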

Example 3

L3 = { 0, 1, 00, 11, 000, 111, 0000, 1111, … }

G3 = ( {0,1}, {S, ZeroList, OneList},
       { S → ZeroList | OneList,
         ZeroList → 0 | 0 ZeroList,
         OneList → 1 | 1 OneList }, S )

Derivation of sentences from a grammar

• A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols).

Example: derivation from G2

• Example: derivation of the dog chased a cat

S => NP VP
  => Det N VP
  => the N VP
  => the dog VP
  => the dog V NP
  => the dog chased NP
  => the dog chased Det N
  => the dog chased a N
  => the dog chased a cat

Example: derivations from G3

• Example: derivation of 0 0 0 0

S => ZeroList => 0 ZeroList => 0 0 ZeroList => 0 0 0 ZeroList => 0 0 0 0

• Example: derivation of 1 1 1

S => OneList => 1 OneList => 1 1 OneList => 1 1 1
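These derivations can be reproduced mechanically. The sketch below, in Python, replays a derivation from G3 by always expanding the leftmost nonterminal; which alternative to use at each step is supplied as a list of indices. The dictionary encoding and the name `leftmost_derivation` are assumptions made for illustration.

```python
# G3's productions: each nonterminal maps to its list of alternatives.
G3 = {
    "S": [["ZeroList"], ["OneList"]],
    "ZeroList": [["0"], ["0", "ZeroList"]],
    "OneList": [["1"], ["1", "OneList"]],
}


def leftmost_derivation(grammar, start, choices):
    """Return the sentential forms produced by applying `choices` in order.

    choices[i] is the index of the alternative used at step i for the
    leftmost nonterminal. Raises StopIteration if a choice is supplied
    after the form already consists only of terminals.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        idx = next(i for i, sym in enumerate(form) if sym in grammar)
        form = form[:idx] + grammar[form[idx]][choice] + form[idx + 1:]
        steps.append(" ".join(form))
    return steps


# S => ZeroList => 0 ZeroList => 0 0 ZeroList => 0 0 0 ZeroList => 0 0 0 0
print(" => ".join(leftmost_derivation(G3, "S", [0, 1, 1, 1, 0])))
# S => OneList => 1 OneList => 1 1 OneList => 1 1 1
print(" => ".join(leftmost_derivation(G3, "S", [1, 1, 1, 0])))
```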

Observations about derivations

• Every string of symbols in the derivation is a sentential form.

• A sentence is a sentential form that has only terminal symbols.

• A leftmost derivation is one in which the leftmost nonterminal in each sentential form is the one that is expanded.

• A derivation can be leftmost, rightmost, or neither.

An example grammar

<program> → <stmts>
<stmts> → <stmt> | <stmt> ; <stmts>
<stmt> → <var> = <expr>
<var> → a | b | c | d
<expr> → <term> + <term> | <term> - <term>
<term> → <var> | const

A leftmost derivation

<program> => <stmts>

=> <stmt>

=> <var> = <expr>

=> a = <expr>

=> a = <term> + <term>

=> a = <var> + <term>

=> a = b + <term>

=> a = b + const

Parse tree

• A hierarchical representation of a derivation

<program>
  <stmts>
    <stmt>
      <var>
        a
      =
      <expr>
        <term>
          <var>
            b
        +
        <term>
          const

Parse trees and compilation

• A compiler builds a parse tree for a program (or for different parts of a program).

• If the compiler cannot build a well-formed parse tree from a given input, it reports a compilation error.

• The parse tree serves as the basis for semantic interpretation/translation of the program.
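The points above can be sketched with a small recursive-descent parser for the example grammar (the one deriving a = b + const). This is an illustrative Python sketch, not how any particular compiler is built: it assumes the input is already a list of token strings, represents each tree node as a tuple whose first element is the nonterminal's name, and raises `SyntaxError` where a compiler would report a compilation error. All function names are hypothetical.

```python
def parse_program(tokens):
    """<program> → <stmts>; the whole input must be consumed."""
    tree, pos = parse_stmts(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return ("<program>", tree)


def parse_stmts(tokens, pos):
    """<stmts> → <stmt> | <stmt> ; <stmts>"""
    stmt, pos = parse_stmt(tokens, pos)
    if pos < len(tokens) and tokens[pos] == ";":
        rest, pos = parse_stmts(tokens, pos + 1)
        return ("<stmts>", stmt, ";", rest), pos
    return ("<stmts>", stmt), pos


def parse_stmt(tokens, pos):
    """<stmt> → <var> = <expr>"""
    var, pos = parse_var(tokens, pos)
    if pos >= len(tokens) or tokens[pos] != "=":
        raise SyntaxError("expected '='")
    expr, pos = parse_expr(tokens, pos + 1)
    return ("<stmt>", var, "=", expr), pos


def parse_var(tokens, pos):
    """<var> → a | b | c | d"""
    if pos < len(tokens) and tokens[pos] in ("a", "b", "c", "d"):
        return ("<var>", tokens[pos]), pos + 1
    raise SyntaxError("expected a variable")


def parse_expr(tokens, pos):
    """<expr> → <term> + <term> | <term> - <term>"""
    left, pos = parse_term(tokens, pos)
    if pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = parse_term(tokens, pos + 1)
        return ("<expr>", left, op, right), pos
    raise SyntaxError("expected '+' or '-'")


def parse_term(tokens, pos):
    """<term> → <var> | const"""
    if pos < len(tokens) and tokens[pos] == "const":
        return ("<term>", "const"), pos + 1
    var, pos = parse_var(tokens, pos)
    return ("<term>", var), pos


def leaves(tree):
    """Read the sentence back off the tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree[1:] for leaf in leaves(child)]


tree = parse_program("a = b + const".split())
print(tree)
print(" ".join(leaves(tree)))  # a = b + const
```

Reading the leaves from left to right gives back the original sentence, which is one way to check that the tree is well formed.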

Extended BNF

• Optional parts are placed in brackets [ ]
<proc_call> → ident [(<expr_list>)]

• Alternative parts of RHSs are placed inside parentheses and separated by vertical bars
<term> → <term> (+|-) const

• Repetitions (0 or more) are placed inside braces { }
<ident> → letter {letter|digit}

Comparison of BNF and EBNF

• sample grammar fragment expressed in BNF

<expr> → <expr> + <term> | <expr> - <term> | <term>
<term> → <term> * <factor> | <term> / <factor> | <factor>

• same grammar fragment expressed in EBNF

<expr> → <term> {(+ | -) <term>}
<term> → <factor> {(* | /) <factor>}
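One practical appeal of the EBNF form is that each `{ ... }` repetition corresponds to a loop in a hand-written parser, whereas the left-recursive BNF rules cannot be coded directly as recursive calls. The Python sketch below illustrates this by evaluating an expression while parsing it. The slide does not define `<factor>`, so the sketch assumes a factor is an unsigned integer literal; the function names are hypothetical.

```python
def evaluate(tokens):
    """Parse and evaluate a token list using the EBNF rules for <expr> and <term>."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def factor():
        # Assumption: <factor> is an unsigned integer literal.
        nonlocal pos
        tok = peek()
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"expected a number, got {tok!r}")
        pos += 1
        return int(tok)

    def term():
        # <term> → <factor> {(* | /) <factor>}
        nonlocal pos
        value = factor()
        while peek() in ("*", "/"):
            op = tokens[pos]
            pos += 1
            rhs = factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def expr():
        # <expr> → <term> {(+ | -) <term>}
        nonlocal pos
        value = term()
        while peek() in ("+", "-"):
            op = tokens[pos]
            pos += 1
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result


print(evaluate("2 + 3 * 4".split()))   # 14
print(evaluate("10 - 4 - 3".split()))  # 3
```

Accumulating into `value` inside each loop keeps the operators left-associative, the same grouping that the left-recursive BNF rules express.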

Ambiguity in grammars

• A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees

An ambiguous grammar for arithmetic expressions

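<expr> → <expr> <op> <expr> | const
<op> → / | -

[Figure: two distinct parse trees for the sentence const - const / const. In one tree the root <expr> has - as its operator and the subexpression const / const as its right operand; in the other the root has / as its operator and the subexpression const - const as its left operand.]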
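The ambiguity can be made concrete by counting parse trees. The following Python sketch counts the distinct parse trees that the grammar `<expr> → <expr> <op> <expr> | const`, `<op> → / | -` assigns to a token list, by trying every operator position as the top-level split. The function name and the memoized formulation are illustrative assumptions.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count the parse trees of `tokens` under the ambiguous <expr> grammar."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count_expr(i, j):
        # Number of ways tokens[i:j] can be derived from <expr>.
        total = 1 if j - i == 1 and tokens[i] == "const" else 0
        # <expr> → <expr> <op> <expr>: try each operator as the root.
        for k in range(i + 1, j - 1):
            if tokens[k] in ("/", "-"):
                total += count_expr(i, k) * count_expr(k + 1, j)
        return total

    return count_expr(0, len(tokens))


print(count_parse_trees("const - const / const".split()))  # 2
print(count_parse_trees("const - const".split()))          # 1
```

A count greater than one for any sentence is enough to show that the grammar is ambiguous.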

Disambiguating the grammar

• If we use the parse tree to indicate precedence levels of the operators, we can remove the ambiguity.

• The following rules give / a higher precedence than -

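<expr> → <expr> - <term> | <term>
<term> → <term> / const | const

[Figure: the unique parse tree for const - const / const. The root <expr> expands to <expr> - <term>; the left <expr> derives const through <term>, and the right <term> expands to <term> / const.]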
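To see the grouping these rules impose, the Python sketch below parses a token list with the disambiguated grammar and returns a fully parenthesized string. The left-recursive rules are implemented as loops, which is a standard rewriting for hand-written parsers rather than something the slides prescribe; the function name `group` is hypothetical.

```python
def group(tokens):
    """Return the token list fully parenthesized according to the grammar."""
    pos = 0

    def const():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "const":
            raise SyntaxError(f"expected const at position {pos}")
        pos += 1
        return "const"

    def term():
        # <term> → <term> / const | const, written as a loop
        nonlocal pos
        node = const()
        while pos < len(tokens) and tokens[pos] == "/":
            pos += 1
            node = f"({node} / {const()})"
        return node

    def expr():
        # <expr> → <expr> - <term> | <term>, written as a loop
        nonlocal pos
        node = term()
        while pos < len(tokens) and tokens[pos] == "-":
            pos += 1
            node = f"({node} - {term()})"
        return node

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result


print(group("const - const / const".split()))  # (const - (const / const))
print(group("const - const - const".split()))  # ((const - const) - const)
```

Because / can only appear inside a `<term>`, it always ends up lower in the tree than -, which is what gives it the higher precedence.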

Links to BNF-style grammars for actual programming languages

Below are some links to grammars for real programming languages. Look at how the grammars are expressed.

– http://www.schemers.org/Documents/Standards/R5RS/
– http://www.sics.se/isl/sicstuswww/site/documentation.html

In the ones listed below, find the parts of the grammar that deal with operator precedence.

– http://java.sun.com/docs/books/jls/index.html
– http://www.lykkenborg.no/java/grammar/JLS3.html
– http://www.enseignement.polytechnique.fr/profs/informatique/Jean-Jacques.Levy/poly/mainB/node23.html
– http://www.lrz-muenchen.de/~bernhard/Pascal-EBNF.html