Re-enter Chomsky
More about grammars
2
Parse trees
S → A B
A → aA | a
B → bB | b
Consider L = { a^m b^n | m, n > 0 } (one/more a’s followed by one/more b’s)
Consider the string “aaabb” which is a valid string in L and can be derived from the grammar.
Left-most derivation
S ⇒ A B
  ⇒ a A B
  ⇒ aa A B
  ⇒ aaa B
  ⇒ aaa bB
  ⇒ aaa bb
Right-most derivation
S ⇒ A B
  ⇒ A bB
  ⇒ A bb
  ⇒ aA bb
  ⇒ aaA bb
  ⇒ aaa bb
A Parse Tree (for the same string)
S
  A
    a
    A
      a
      A
        a
  B
    b
    B
      b
The leaves of the tree form the input string.
This is a “common” representation where the order of derivation is not explicit—there is no such thing as “left parse tree” or “right parse tree”!
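The two derivation orders above can be reproduced mechanically. The following is a minimal Python sketch, assuming the grammar S → A B, A → aA | a, B → bB | b; the function name derive and its parameters are illustrative and do not come from the slides.

```python
def derive(m, n, leftmost=True):
    """Return the sentential forms that derive a^m b^n (m, n > 0)."""
    if m < 1 or n < 1:
        raise ValueError("m and n must both be at least 1")
    # Productions applied to A and to B, in the order they are used.
    a_steps = ["aA"] * (m - 1) + ["a"]
    b_steps = ["bB"] * (n - 1) + ["b"]
    form = "AB"                      # result of applying S -> A B
    forms = ["S", form]
    # Left-most: finish A before touching B. Right-most: the reverse.
    order = [("A", a_steps), ("B", b_steps)]
    if not leftmost:
        order.reverse()
    for symbol, steps in order:
        for rhs in steps:
            form = form.replace(symbol, rhs, 1)
            forms.append(form)
    return forms


print(" => ".join(derive(3, 2)))
print(" => ".join(derive(3, 2, leftmost=False)))
```

Both calls end in the same string, aaabb, and both correspond to the same parse tree; only the order in which the non-terminals are expanded differs.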
3
Parse trees
Consider L = { w ∈ {a,b}* | w contains a ‘b’ }
(any combination of a’s & b’s that contains at least one b somewhere)
S → L b L
L → aL | bL | ε
Consider the string “abba” which is a valid string in L and can be derived/generated from/by the grammar.
Parse Tree 1 (grouping: a | b | ba)
S
  L
    a
    L
      ε
  b
  L
    b
    L
      a
      L
        ε
Parse Tree 2 (grouping: ab | b | a)
S
  L
    a
    L
      b
      L
        ε
  b
  L
    a
    L
      ε
The grammar can generate the input string in two different ways. In other words, there are two different parse trees for the string. Since it’s unclear as to how exactly the grammar should generate the string, the grammar is said to be ambiguous *. Note that the grammar on the previous slide is not ambiguous.
* This example is based on an observation by Mr. Hui Zhang, a COMPSCI 220 student.
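The number of parse trees can be counted directly. A minimal Python sketch, assuming the grammar S → L b L, L → aL | bL | ε: since L derives every string over {a, b} in exactly one way, each choice of which ‘b’ plays the middle role gives exactly one parse tree. The function name groupings is hypothetical.

```python
def groupings(w):
    """List the (left L, 'b', right L) splits of w, one per parse tree."""
    if any(ch not in "ab" for ch in w):
        return []                    # not a string over {a, b}
    # Every 'b' in w can serve as the middle 'b' of S -> L b L.
    return [(w[:i], "b", w[i + 1:]) for i, ch in enumerate(w) if ch == "b"]


for left, b, right in groupings("abba"):
    print(repr(left), b, repr(right))
```

A string with more than one ‘b’ therefore has more than one parse tree under this grammar, which is what makes the grammar ambiguous.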
4
Parse trees
An unambiguous grammar for L = { w ∈ {a,b}* | w contains a ‘b’ }
S → X b Y
X → a X | ε
Y → a Y | b Y | ε
X: has only zero/more a’s
b: first occurrence of b
Y: zero/more a’s and b’s
Consider the string “abba” which is a valid string in L and can be derived/generated from/by the grammar.
S
  X
    a
    X
      ε
  b
  Y
    b
    Y
      a
      Y
        ε
Grouping: X = a, then b, then Y = ba
There is only one way in which you can “group” the input string this time!
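Why there is only one grouping can be seen in a few lines of Python. This is a sketch under the assumption that the grammar is S → X b Y, X → aX | ε, Y → aY | bY | ε; because X may contain only a’s, the ‘b’ in the rule for S is forced to be the first ‘b’ of the string. The function name split_unambiguous is illustrative.

```python
def split_unambiguous(w):
    """Return the unique (X, 'b', Y) split of w, or None if w is not in L."""
    if any(ch not in "ab" for ch in w):
        return None                  # not a string over {a, b}
    i = w.find("b")                  # X holds only a's, so this 'b' is forced
    if i == -1:
        return None                  # no 'b' at all: w is not in L
    return (w[:i], "b", w[i + 1:])


print(split_unambiguous("abba"))
```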
5
Ambiguous grammars
A grammar is said to be ambiguous if it generates some string w ∈ Σ* in more than one way, i.e. if the string has more than one parse tree.
6
What is wrong with ambiguity?
Ambiguous grammars can be undesirable, for instance, in Compiler Design*, where the code generated by the compiler might depend on the particular way in which the input string (a statement in a programming language) is generated. This will be demonstrated in the examples that follow.
* Grammars are used to describe the syntax of statements in a programming language.
7
Grammar for IF-ELSE statement
(i) IF (condition) Statement 1;
    Statement 2;
(ii) IF (condition) Statement 1;
     ELSE Statement 2;
     Statement 3;
[Flowcharts for (i) and (ii): a decision box “Is condition true?” with yes/no branches leading to the statements]
Statement → IF_statement | …
IF_statement → if (Cond) Statement
IF_statement → if (Cond) Statement else Statement
8
Grammar for IF-ELSE statement
Statement → IF_statement | …
IF_statement → if (Cond) Statement
IF_statement → if (Cond) Statement else Statement
Consider the statement:
IF ( C1 ) S1;
ELSE
IF ( C2 ) S2;
ELSE
S3;
Parse tree for the statement
Statement
  IF_Statement: if ( Cond ) Statement else Statement
    Cond: C1
    Statement: S1
    Statement
      IF_Statement: if ( Cond ) Statement else Statement
        Cond: C2
        Statement: S2
        Statement: S3
9
Grammar for IF-ELSE statement
Consider the statement:
IF ( C1 )
IF ( C2 ) S2;
ELSE
S3;
Parse tree 1
Statement
  IF_Statement: if ( Cond ) Statement
    Cond: C1
    Statement
      IF_Statement: if ( Cond ) Statement else Statement
        Cond: C2
        Statement: S2
        Statement: S3
Parse tree 2
Statement
  IF_Statement: if ( Cond ) Statement else Statement
    Cond: C1
    Statement
      IF_Statement: if ( Cond ) Statement
        Cond: C2
        Statement: S2
    Statement: S3
AMBIGUITY! (the same expression can be generated in a different way)
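The two parse trees are not just two drawings: they describe two different programs. A minimal Python sketch of the behaviour each tree implies, where the function names are hypothetical and the returned list simply records which statement ran:

```python
def run_tree_1(c1, c2):
    """ELSE attached to the inner IF: if (C1) { if (C2) S2 else S3 }"""
    executed = []
    if c1:
        if c2:
            executed.append("S2")
        else:
            executed.append("S3")
    return executed


def run_tree_2(c1, c2):
    """ELSE attached to the outer IF: if (C1) { if (C2) S2 } else S3"""
    executed = []
    if c1:
        if c2:
            executed.append("S2")
    else:
        executed.append("S3")
    return executed


for c1 in (True, False):
    for c2 in (True, False):
        print(c1, c2, run_tree_1(c1, c2), run_tree_2(c1, c2))
```

The two readings agree only when both C1 and C2 are true; in every other case a compiler would have to generate different code depending on which tree it picked.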
10
Grammar for arithmetic expressions
VERSION I
Consider arithmetic expressions with only one or two variables (that use +, -, * only).
e.g. a, a + b, a – b, a * b
E → Var
E → Var + Var
E → Var – Var
E → Var * Var
11
Grammar for arithmetic expressions
VERSION II
Consider arithmetic expressions with any number of variables (more realistic!).
e.g. a + b – c * d
E → Var
E → Var + E
E → Var – E
E → Var * E
Try generating the expression: a * b + c
E
  Var
    a
  *
  E
    Var
      b
    +
    E
      Var
        c
Grouping of a * b + c: (1) b + c, then (2) a * (b + c)
Wrong grouping! (wrong order of precedence)
12
Grammar for arithmetic expressions
VERSION III
Try to generate arithmetic expressions, preserving the order of precedence.
E → Var
E → E + E
E → E – E
E → E * E
Try generating the same expression again: a * b + c
Parse tree 1
E
  E
    E
      Var
        a
    *
    E
      Var
        b
  +
  E
    Var
      c
looks OK!
Parse tree 2
E
  E
    Var
      a
  *
  E
    E
      Var
        b
    +
    E
      Var
        c
AMBIGUITY! (the same expression can be generated in a different way)
Note: Each parse tree conveys a different “meaning”; each of them corresponds to different code (and therefore possibly different results) generated by the compiler.
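The difference in “meaning” can be made concrete by evaluating both parse trees of a * b + c. In this Python sketch a tree is either a variable name or a tuple (left, operator, right); that representation and the sample values a = 2, b = 3, c = 4 are assumptions chosen for illustration.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(tree, env):
    """Evaluate a parse tree: a variable name or (left, op, right)."""
    if isinstance(tree, str):
        return env[tree]
    left, op, right = tree
    return OPS[op](evaluate(left, env), evaluate(right, env))


tree_1 = (("a", "*", "b"), "+", "c")   # E -> E + E at the root
tree_2 = ("a", "*", ("b", "+", "c"))   # E -> E * E at the root
env = {"a": 2, "b": 3, "c": 4}         # illustrative values only

print(evaluate(tree_1, env))  # 10
print(evaluate(tree_2, env))  # 14
```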
13
Grammar for arithmetic expressions
VERSION IV
Try to generate arithmetic expressions, preserving the order of precedence and also avoiding ambiguity.
E → E + Term
E → E – Term
E → Term
Term → Term * Var
Term → Var
Try generating the same expression again: a * b + c
E
  E
    Term
      Term
        Var
          a
      *
      Var
        b
  +
  Term
    Var
      c
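A small parser shows that Version IV yields exactly one tree per expression. This is a sketch, not the slides’ own method: the left-recursive rules E → E + Term and Term → Term * Var are handled with loops (a standard rewrite for hand-written parsers), tokens are assumed to be separated by spaces, and the ASCII "-" stands in for the minus sign. All function names are illustrative.

```python
def tokenize(text):
    return text.split()              # assumes space-separated tokens


def parse_expression(tokens):
    """E -> E + Term | E - Term | Term (left recursion done as a loop)."""
    tree, pos = parse_term(tokens, 0)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = parse_term(tokens, pos + 1)
        tree = (tree, op, right)     # group to the left
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


def parse_term(tokens, pos):
    """Term -> Term * Var | Var (left recursion done as a loop)."""
    tree, pos = parse_var(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "*":
        right, pos = parse_var(tokens, pos + 1)
        tree = (tree, "*", right)
    return tree, pos


def parse_var(tokens, pos):
    """Var: a single alphabetic token."""
    if pos >= len(tokens) or not tokens[pos].isalpha():
        raise SyntaxError("expected a variable")
    return tokens[pos], pos + 1


print(parse_expression(tokenize("a * b + c")))
print(parse_expression(tokenize("a + b * c")))
```

In both cases the multiplication ends up deeper in the tree than the addition, so it is grouped first regardless of where it appears in the input.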
14
What Context Free Grammars (CFGs) can’t express
Examples of languages that can’t be generated by CFGs:
{ a^n b^n c^n | n > 0 }
{ a^n b^m c^n d^m | n, m > 0 }
{ w c w | w ∈ Σ* }
15
Four classes of grammar
Type 3 (Regular grammar)
A → a
A → a B
Left side:
a single non-terminal symbol
Right side:
(i) a single terminal symbol OR
(ii) a single terminal followed by a single non-terminal
Type 2 (Context-free grammar)
A → α
Left side:
a single non-terminal symbol
Right side:
no restriction (any string of terminals and non-terminals).
Type 1 (Context-sensitive grammar)
α → β
Left, right sides:
no restriction except that length( α ) <= length( β )
Type 0 (Phrase-structure grammar)
α → β
Left, right sides:
no restriction at all!
A language is said to be type i (i = 0, 1, 2, 3) if it can be specified by a type i grammar and cannot be specified by a type (i + 1) grammar.
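The restrictions for the four types can be checked production by production. A minimal Python sketch, assuming single-letter symbols with upper case for non-terminals and lower case for terminals, and the empty string standing for ε; the function names are hypothetical, and the tests follow the conditions exactly as stated on this slide.

```python
def production_type(lhs, rhs):
    """Return the most restrictive type (3, 2, 1 or 0) that lhs -> rhs fits."""
    if not lhs:
        raise ValueError("the left side must not be empty")
    if len(lhs) == 1 and lhs.isupper():          # a single non-terminal
        one_terminal = len(rhs) == 1 and rhs.islower()
        terminal_then_nonterminal = (
            len(rhs) == 2 and rhs[0].islower() and rhs[1].isupper()
        )
        if one_terminal or terminal_then_nonterminal:
            return 3                             # A -> a  or  A -> a B
        return 2                                 # A -> any string
    if len(lhs) <= len(rhs):
        return 1                                 # length(α) <= length(β)
    return 0                                     # no restriction at all


def grammar_type(productions):
    """A grammar is only as restricted as its least restricted production."""
    return min(production_type(lhs, rhs) for lhs, rhs in productions)


print(grammar_type([("S", "AB"), ("A", "aA"), ("A", "a"), ("B", "bB"), ("B", "b")]))
```

The grammar from the first slide is type 2 rather than type 3 only because of the rule S → A B, whose right side has two non-terminals.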
16
Converting FSA into equivalent grammar
[FSA diagram: state i with a b-loop and an a-transition to state j; final state j with an a-loop and a b-loop]
L = {strings of a’s and b’s with at least one ‘a’}
Rules:
(i) For an a-transition from state i to state j, generate the production rule: A_i → a A_j
(ii) For the final state f, generate the production rule: A_f → ε
Grammar rules that generate L
A_i → a A_j
A_j → ε
A_i → b A_i
A_j → a A_j
A_j → b A_j
Any given FSA can be mechanically converted into grammar rules that generate exactly the same language recognized by the FSA.
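The conversion is mechanical enough to write down in a few lines. The following Python sketch applies the two rules to the FSA of this slide; the representation of transitions as (state, symbol, next state) triples and the function name fsa_to_grammar are assumptions made for illustration.

```python
def fsa_to_grammar(transitions, final_states):
    """Turn FSA transitions into productions, one non-terminal A_<state> per state."""
    # Rule (i): a transition from src to dst on sym gives A_src -> sym A_dst.
    rules = [f"A_{src} -> {sym} A_{dst}" for src, sym, dst in transitions]
    # Rule (ii): each final state f gives A_f -> ε.
    rules += [f"A_{f} -> ε" for f in sorted(final_states)]
    return rules


transitions = [
    ("i", "b", "i"),   # b-loop on state i
    ("i", "a", "j"),   # the first 'a' moves to state j
    ("j", "a", "j"),   # a-loop on state j
    ("j", "b", "j"),   # b-loop on state j
]
rules = fsa_to_grammar(transitions, {"j"})
print("\n".join(rules))
```

Every production has the form “terminal followed by a non-terminal” or ε, so the result is a right-linear grammar with one non-terminal per state.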