C SC 473 Automata, Grammars & Languages
Automata, Grammars and Languages
Discourse 04
Context-Free Grammars and
Pushdown Automata
C SC 473 Automata, Grammars & Languages 2
Backus-Naur Form Grammars (CFGs)
• Algol 60, Algol 68: first “block-structured” languages
• Ex:
<program> ::= <block>
<statement> ::= s | <block>
<block> ::= begin <list> end
<list> ::= <statement> ; <list> | <statement>
Sample program: begin s ; begin s;s;s end ;s end
• CF Grammar G, with terminals $\Sigma = \{s, b, e, ;\}$:
$P \to B$
$S \to s \mid B$
$B \to b\,L\,e$
$L \to S\,;L \mid S$
Nonterminals = variables V; rules = productions R; terminals Σ; start variable “S” (here P).
C SC 473 Automata, Grammars & Languages 3
Grammars are “Generators”
⇒ is read “yields” or “derives in one step”: apply one production to one variable in the string. The choice is nondeterministic.
[Figure: graph of possible derivation steps starting P ⇒ B ⇒ bLe. One branch continues bSe ⇒ bBe ⇒ bbLee ⇒ … ⇒ bbsee (or bSe ⇒ bse); another continues bS;Le and branches to bs;Le, bB;Le ⇒ bbLe;Le, bS;Se and bS;S;Le.]
C SC 473 Automata, Grammars & Languages 4
• One possible derivation. Variable being rewritten at each stage is underscored
• Two choices at each derivation step: which variable (nonterminal) is rewritten, and which rule with that variable as LHS is applied
• All possible terminal strings obtainable in this way make up L(G)
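A Particular Derivation
$$
\begin{aligned}
\underline{P} &\Rightarrow \underline{B} \Rightarrow b\underline{L}e \Rightarrow bS;\underline{L}e \Rightarrow bS;S;\underline{L}e \Rightarrow bS;\underline{S};Se \\
&\Rightarrow bS;\underline{B};Se \Rightarrow b\underline{S};bLe;Se \Rightarrow bs;bLe;\underline{S}e \Rightarrow bs;b\underline{L}e;se \\
&\Rightarrow bs;b\underline{S};Le;se \Rightarrow bs;bs;\underline{L}e;se \Rightarrow bs;bs;\underline{S};Le;se \\
&\Rightarrow bs;bs;s;\underline{L}e;se \Rightarrow bs;bs;s;\underline{S}e;se \Rightarrow bs;bs;s;se;se \\
\therefore\ & bs;bs;s;se;se \in L(G)
\end{aligned}
$$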
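The set L(G) can be explored mechanically. The following is a minimal sketch, assuming the block grammar P → B, S → s | B, B → bLe, L → S;L | S with single-character symbols; the names `RULES` and `generate` are illustrative and not from the slides. It searches breadth-first over leftmost derivations and collects every terminal string up to a length bound.

```python
from collections import deque

# Block grammar (assumed encoding): each variable maps to its right-hand sides.
RULES = {
    "P": ["B"],
    "S": ["s", "B"],
    "B": ["bLe"],
    "L": ["S;L", "S"],
}

def generate(start="P", max_len=9):
    """Return all terminal strings of length <= max_len derivable from start."""
    found, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        # Index of the leftmost variable, or None if the form is terminal.
        idx = next((i for i, ch in enumerate(form) if ch in RULES), None)
        if idx is None:
            found.add(form)
            continue
        for rhs in RULES[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            # No rule shrinks a sentential form, so over-long forms can be pruned.
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found

print(sorted(generate(max_len=5)))
```

Restricting the search to leftmost derivations loses nothing, since every terminal string in L(G) has a leftmost derivation.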
C SC 473 Automata, Grammars & Languages 5
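Why CFGs?
• Most natural or artificial (e.g. programming) languages are not regular
• Ex: $L(G) \cap b^{*}se^{*} = \{ b^{n}se^{n} \mid n \ge 1 \}$
• We know that the latter language is not regular, so L(G) is not regular either (regular languages are closed under intersection)
• Ex: C programs main(){}, main(){{}}, main(){{{}}}, …:
$C \cap \texttt{main()}\{^{*}\}^{*} = \{ \texttt{main()}\{^{n}\}^{n} \mid n \ge 1 \}$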
C SC 473 Automata, Grammars & Languages 6
Derivation (Parse) Tree
[Figure: parse tree rooted at P, with P → B, B → b L e, and the rules for L and S applied below]
yield/frontier/terminal string = bs;bs;s;se;se
C SC 473 Automata, Grammars & Languages 7
Derivation (Parse) Tree
[Figure: the same parse tree rooted at P, redrawn]
C SC 473 Automata, Grammars & Languages 8
Derivation (Parse) Tree (cont’d)
[Figure: the same parse tree rooted at P, with its nodes numbered 1 to 15]
C SC 473 Automata, Grammars & Languages 9
Context-Free Grammar
• Defn 2.2: A context-free grammar G is a 4-tuple $G = (V, \Sigma, R, S)$, where
• $V$ is a finite set, the variables (nonterminals)
• $\Sigma$ is a finite set disjoint from V, the terminals
• $R$ is a finite set of rules, of the form $A \to w$, $A \in V$, $w \in (V \cup \Sigma)^{*}$ (technically, an ordered pair (A, w))
• $S \in V$ is the start variable
• Ex: strings with balanced parentheses. Formally: $G_0 = (V, \Sigma, R, S)$ with $\Sigma = \{(, )\}$, $V = \{B\}$, $S = B$, $R = \{B \to \varepsilon,\ B \to BB,\ B \to (B)\}$
• Ex: informally (Variables = upper case, Terminals = lower case): $G_0: B \to \varepsilon \mid BB \mid (B)$
C SC 473 Automata, Grammars & Languages 10
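Yields & Derives Relations
• Defn. The relation yields (derives in 1 step), $\Rightarrow\ \subseteq (V \cup \Sigma)^{*} \times (V \cup \Sigma)^{*}$, is defined as follows: if $A \to w$ is a rule in R, then $\forall u, v \in (V \cup \Sigma)^{*}$: $uAv \Rightarrow uwv$
• Defn: derives in k steps: $\Rightarrow^{k}$
• Defn: derives: $\Rightarrow^{*}$
• In other words: $u \Rightarrow^{*} v$ iff $u = v$ or $(\exists u_1)(\exists u_2) \cdots (\exists u_{k-1})\ u \Rightarrow u_1 \Rightarrow u_2 \Rightarrow \cdots \Rightarrow u_{k-1} \Rightarrow v$
• Defn: A derivation (of k steps) from $u_0$ is any sequence of strings $u_0, u_1, \ldots, u_k$ satisfying $u_0 \Rightarrow u_1 \Rightarrow u_2 \Rightarrow \cdots \Rightarrow u_k$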
C SC 473 Automata, Grammars & Languages 11
Language Generated
• Defn. The language generated by G is the set of all terminal strings derived from S:
$$L(G) = \{ w \mid w \in \Sigma^{*} \wedge S \Rightarrow^{*} w \}$$
A partial derivation is one that starts with S and ends in a non-terminal string, i.e. one still containing variables in V
• Ex: $S \to aAS \mid a$, $A \to SbA \mid SS \mid ba$
Partial: $S \Rightarrow aAS \Rightarrow aSbAS$
Terminal or terminated: $S \Rightarrow aAS \Rightarrow aSbAS \Rightarrow aabAS \Rightarrow aabbaS \Rightarrow aabbaa$
C SC 473 Automata, Grammars & Languages 12
Derivations and Parse Trees
• Ex: $G_0: B \to \varepsilon \mid BB \mid (B)$
leftmost: $B \Rightarrow BB \Rightarrow (B)B \Rightarrow ((B))B \Rightarrow (())B \Rightarrow (())(B) \Rightarrow (())()$
rightmost: $B \Rightarrow BB \Rightarrow B(B) \Rightarrow B() \Rightarrow (B)() \Rightarrow ((B))() \Rightarrow (())()$
[Figure: the parse tree for (())() shown growing step by step under each of the two derivations]
Notice: the completed (terminated) parse tree is the same for both derivations, though the sequence “grows” differently
C SC 473 Automata, Grammars & Languages 13
Derivation ↔ Parse Tree
• Proposition 1: For every (terminated or partial) derivation $D: S \Rightarrow u_1 \Rightarrow u_2 \Rightarrow \cdots \Rightarrow u_n$ there is a unique parse tree T with frontier $u_n$ constructible from D.
• Proposition 2: For every parse tree T in G and any traversal order that is top-down (visits parents before children), there is a unique derivation for the frontier of T from S, and it is constructible from T.
• Corollary 3: For every parse tree T in G there is a unique leftmost derivation constructible from T.
Pf: Pre-order traverse T, expanding variables as their nodes are visited.
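The pre-order construction in Corollary 3 can be sketched in a few lines. This is an illustrative sketch, not the slides' own code: a parse tree is assumed to be encoded as `(label, children)` for a variable node and as a plain string for a terminal leaf, and the names `label` and `leftmost_derivation` are hypothetical.

```python
def label(node):
    """Symbol shown in a sentential form for a tree node."""
    return node[0] if isinstance(node, tuple) else node

def leftmost_derivation(tree):
    """Expand the leftmost unexpanded variable node until none remain."""
    form = [tree]
    steps = ["".join(label(n) for n in form)]
    while True:
        idx = next((i for i, n in enumerate(form) if isinstance(n, tuple)), None)
        if idx is None:
            return steps
        # Replace the variable node by its children (an empty list models epsilon).
        form[idx:idx + 1] = form[idx][1]
        steps.append("".join(label(n) for n in form))

# Parse tree of (())() in G0: B -> epsilon | BB | (B)
inner = ("B", ["(", ("B", []), ")"])
tree = ("B", [("B", ["(", inner, ")"]), ("B", ["(", ("B", []), ")"])])
print(" => ".join(leftmost_derivation(tree)))
```

Always expanding the leftmost variable node visits the variable nodes in pre-order, which is why the tree determines the leftmost derivation uniquely.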
C SC 473 Automata, Grammars & Languages 14
Ex: Leftmost Derivation
$E \to E + T \mid T$
$T \to T * F \mid F$
$F \to (E) \mid x$
[Figure: parse tree, rooted at E, for an expression in this grammar]
C SC 473 Automata, Grammars & Languages 15
Ex: Leftmost Derivation
[Figure: the same parse tree with its nodes numbered in the order of a preorder traversal]
C SC 473 Automata, Grammars & Languages 18
Syntactic Ambiguity
• 2 distinct parse trees for same terminal string
• 2 distinct leftmost derivations for same terminal string
• Leftmost derivation ↔ parse tree is 1-to-1
• A CFG is unambiguous ⇔ ∀w ∈ L(G), w has a unique parse tree (unique leftmost derivation)
$E \to E + E \mid E * E \mid a$
terminal string = $a + a * a$
[Figure: two parse trees for a + a ∗ a, one with + at the root and one with ∗ at the root]
$E \Rightarrow E + E \Rightarrow a + E \Rightarrow a + E * E \Rightarrow a + a * E \Rightarrow a + a * a$
$E \Rightarrow E * E \Rightarrow E + E * E \Rightarrow a + E * E \Rightarrow a + a * E \Rightarrow a + a * a$
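Ambiguity of this small grammar can be checked by counting parse trees. The sketch below assumes the grammar E → E+E | E∗E | a with the ASCII character `*` standing for ∗; the function name `count_parse_trees` is illustrative. It counts, for every substring, the ways it can be split at an operator into two E-derivable parts.

```python
from functools import lru_cache

def count_parse_trees(s):
    """Number of parse trees for s under E -> E+E | E*E | a."""
    @lru_cache(maxsize=None)
    def count(i, j):
        # Trees deriving the substring s[i:j] from E.
        total = 1 if s[i:j] == "a" else 0
        for k in range(i + 1, j - 1):
            if s[k] in "+*":
                total += count(i, k) * count(k + 1, j)
        return total
    return count(0, len(s))

print(count_parse_trees("a+a*a"))
```

A count greater than 1 for some string witnesses ambiguity; since parse trees and leftmost derivations correspond 1-to-1, the same number counts leftmost derivations.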
C SC 473 Automata, Grammars & Languages 19
Ex: Ambiguous Grammar: English
<Sent> → <NP><VP>
<NP> → <N> | <Adj><N>
<VP> → <V><Obj> | <V><AdvP>
<AdvP> → <Adv> | <AdvP>
<AdvP> → <Prep><Obj>
<Obj> → <Adj><N>
<N> → fruit | flies | …
…
C SC 473 Automata, Grammars & Languages 20
“Fruit flies like a banana”
Parse 1: [<Sent> [<NP> [<Adj> fruit] [<N> flies]] [<VP> [<V> like] [<Obj> [<Adj> a] [<N> banana]]]]
Parse 2: [<Sent> [<NP> [<N> fruit]] [<VP> [<V> flies] [<AdvP> [<Prep> like] [<Obj> [<Adj> a] [<N> banana]]]]]
C SC 473 Automata, Grammars & Languages 21
Right Linear Grammars & Regular Languages
• Defn: A CFG is right-linear iff each rule is of one of the forms $A \to wB$ or $A \to w$, where A, B are variables and $w \in \Sigma^{*}$
Chomsky (1958) called these “Type 3”
• Thm: L is a regular language iff L = L(G) for some right-linear grammar G. There are algorithms for converting from finite automata to right-linear grammars, and conversely.
[Figure: conversion diagram linking DFA M, NFA N, Reg. Expr E and Right-linear Grammar G; each arrow = conversion algorithm]
C SC 473 Automata, Grammars & Languages 22
Right-Linear & Regular (cont’d)
• Pf: (⇒) Assume L = L(M) where $M = (Q, \Sigma, \delta, s, F)$ is a DFA. Construct $G = (Q, \Sigma, R, s)$ with R having rule $p \to aq$ if $p \xrightarrow{a} q$ in M, and rule $p \to \varepsilon$ if p is a final state.
Claim: $(p, a_1 \cdots a_n) \vdash_M^{n} (q, \varepsilon)$ iff $p \Rightarrow_G^{n} a_1 \cdots a_n q$. Pf: easy induction on n.
The (⇒) direction follows since $(s, w) \vdash_M^{*} (q, \varepsilon)$ iff $s \Rightarrow_G^{*} wq \Rightarrow w$. ∎
• Pf: (⇐) Assume L = L(G) where $G = (V, \Sigma, R, S)$ is right-linear. Construct NFA $N = (V \cup \{f\}, \Sigma, \Delta, S, \{f\})$ where $f \notin V$ is a new symbol. N has the transition $A \xrightarrow{w} B$ if $A \to wB$ is in R, and the transition $A \xrightarrow{w} f$ if $A \to w$ is in R.
C SC 473 Automata, Grammars & Languages 23
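Right-Linear & Regular (cont’d)
• Claim: $A_0 \Rightarrow_G w_1 A_1 \Rightarrow_G w_1 w_2 A_2 \Rightarrow_G \cdots \Rightarrow_G w_1 \cdots w_n A_n$ iff $(A_0, w_1 \cdots w_n) \vdash_N (A_1, w_2 \cdots w_n) \vdash_N \cdots \vdash_N (A_n, \varepsilon)$
Pf: easy induction on n.
The (⇐) direction follows since $S \Rightarrow_G^{*} wA \Rightarrow wx$ iff $(S, wx) \vdash_N^{*} (A, x) \vdash_N (f, \varepsilon)$. ∎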
C SC 473 Automata, Grammars & Languages 24
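Ex: Right-Linear ↔ FA
• Ex: $G:\ S \to aA \mid b,\quad A \to aB \mid dS,\quad B \to dA$
[Figure: the NFA N(G), with states S, A, B, f and edges labelled a, b, d]
• Ex: [Figure: DFA M with states $q_0, q_1, q_2$ over the alphabet {a, b}]
$G(M):\ q_0 \to aq_0 \mid bq_1 \mid \varepsilon,\quad q_1 \to aq_0 \mid bq_2 \mid \varepsilon,\quad q_2 \to aq_2 \mid bq_2$
“useless” rules (here those for $q_2$, which derives no terminal string) can be eliminated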
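The DFA-to-grammar direction is mechanical enough to sketch. The code below is an illustration under assumptions: the DFA is taken to have states q0, q1, q2 over {a, b} with q0 and q1 accepting (a reading of the garbled figure, not confirmed by the slides), a right-hand side is encoded as `(symbol, next_state)` or `()` for ε, and the names `dfa_to_right_linear` and `generates` are hypothetical.

```python
def dfa_to_right_linear(delta, finals):
    """Rules p -> a q for each transition, and p -> epsilon for each final state."""
    rules = {}
    for (p, a), q in sorted(delta.items()):
        rules.setdefault(p, []).append((a, q))
    for p in sorted(finals):
        rules.setdefault(p, []).append(())
    return rules

def generates(rules, start, w):
    """Check start =>* w by tracking the single variable at the right end."""
    current = {start}
    for ch in w:
        current = {rhs[1] for p in current for rhs in rules.get(p, [])
                   if rhs and rhs[0] == ch}
    # The derivation terminates only through a rule p -> epsilon.
    return any(() in rules.get(p, []) for p in current)

# Assumed DFA: accepts strings over {a, b} that do not contain "bb".
DELTA = {("q0", "a"): "q0", ("q0", "b"): "q1",
         ("q1", "a"): "q0", ("q1", "b"): "q2",
         ("q2", "a"): "q2", ("q2", "b"): "q2"}
RULES = dfa_to_right_linear(DELTA, {"q0", "q1"})
print(RULES["q0"])
```

Because every sentential form of a right-linear grammar has at most one variable, at its right end, tracking that variable is the same as running the automaton.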
C SC 473 Automata, Grammars & Languages 25
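Pushdown Automaton
• Defn 2.12: A pushdown automaton M is a 6-tuple $M = (Q, \Sigma, \Gamma, \delta, s, F)$, where
• $Q$ is a finite set, the states
• $\Sigma$ is a finite set, the input alphabet
• $\Gamma$ is a finite set, the stack alphabet
• $\delta : Q \times \Sigma_\varepsilon \times \Gamma_\varepsilon \to \mathcal{P}(Q \times \Gamma_\varepsilon)$ is the transition function
• $s \in Q$ is the start state
• $F \subseteq Q$ is the set of accept (final) states
Here $\Sigma_\varepsilon = \Sigma \cup \{\varepsilon\}$ and $\Gamma_\varepsilon = \Gamma \cup \{\varepsilon\}$.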
C SC 473 Automata, Grammars & Languages 26
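PushDown Automaton
[Figure: finite control in state p; input tape in $\Sigma^{*}$ with $a_1 a_2 \cdots a_{i-1}$ already seen, $a_i$ the current input symbol and $a_{i+1} \cdots a_n$ to come; stack in $\Gamma^{*}$ holding $A_1 A_2 A_3 \cdots$ from top to bottom (no end-marker supplied)]
configuration: (state, rest of input, stack) $= (p,\ a_1 a_2 a_3 \cdots,\ A_1 A_2 A_3 \cdots)$, with $a_i \in \Sigma$, $A_i \in \Gamma$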
C SC 473 Automata, Grammars & Languages 27
PDA (cont’d)
Initially: start state s, input w, empty stack
configuration: $(s, w, \varepsilon)$
C SC 473 Automata, Grammars & Languages 28
PDA (cont’d)
Transition: $(q, Y) \in \delta(p, a, X)$
configurations: $(p, ax, X\alpha) \vdash (q, x, Y\alpha)$
[Figure: finite control in state p with input ax and stack Xα, then in state q with input x and stack Yα]
C SC 473 Automata, Grammars & Languages 29
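PDA (cont’d)
• Can have ε-move: consume no input. $(q, Y) \in \delta(p, \varepsilon, X)$: $(p, ax, X\alpha) \vdash (q, ax, Y\alpha)$
• Pop-move: erase top stack symbol. $(q, \varepsilon) \in \delta(p, a, X)$: $(p, ax, X\alpha) \vdash (q, x, \alpha)$
• Push-only move: ignore stack. $(q, Y) \in \delta(p, a, \varepsilon)$: $(p, ax, X\alpha) \vdash (q, x, YX\alpha)$
• Any combination is possible, e.g. $a = \varepsilon, X = \varepsilon, Y = \varepsilon$. $(q, \varepsilon) \in \delta(p, \varepsilon, \varepsilon)$: $(p, ax, X\alpha) \vdash (q, ax, X\alpha)$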
C SC 473 Automata, Grammars & Languages 30
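PDA (cont’d)
Finally: accept state $f \in F$, input consumed, any stack α
configuration: $(f, \varepsilon, \alpha)$
• Defn: M recognizes (accepts) w iff $(s, w, \varepsilon) \vdash_M^{*} (f, \varepsilon, \alpha)$ for some $f \in F$ and some $\alpha \in \Gamma^{*}$
• Defn: $L(M) = \{ w : M \text{ accepts } w \}$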
C SC 473 Automata, Grammars & Languages 31
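Example: PDA
• Recognizer for $L = \{ wcw^{R} \mid w \in \{a, b\}^{*} \}$
$M = (Q, \Sigma, \Gamma, \delta, s, F)$, $Q = \{s, p, q, f\}$, $\Sigma = \{a, b, c\}$, $\Gamma = \{A, B, \$\}$, $F = \{f\}$
Transitions (input, pop → push):
s → p: $\varepsilon, \varepsilon \to \$$
p → p: $a, \varepsilon \to A$; $b, \varepsilon \to B$
p → q: $c, \varepsilon \to \varepsilon$
q → q: $a, A \to \varepsilon$; $b, B \to \varepsilon$
q → f: $\varepsilon, \$ \to \varepsilon$
$(s, abbcbba, \varepsilon) \vdash (p, abbcbba, \$) \vdash (p, bbcbba, A\$) \vdash \cdots \vdash (p, cbba, BBA\$) \vdash (q, bba, BBA\$) \vdash (q, ba, BA\$) \vdash \cdots \vdash (q, a, A\$) \vdash (q, \varepsilon, \$) \vdash (f, \varepsilon, \varepsilon)$: accepts
$(s, acb, \varepsilon) \vdash (p, acb, \$) \vdash (p, cb, A\$) \vdash (q, b, A\$)$: does not accept (blocked)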
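The two computations above can be replayed with a small configuration search. This is a sketch under assumptions, not the lecture's code: transitions are encoded as a dictionary `(state, input, pop) -> {(state, push)}` with `""` for ε, the stack is a string with its top at index 0, and the names `DELTA` and `accepts` are illustrative. The search terminates here because this PDA has no ε-move that can push forever; that is not true of every PDA.

```python
from collections import deque

# Transitions of the recognizer for w c w^R (assumed encoding, "" = epsilon).
DELTA = {
    ("s", "", ""): {("p", "$")},
    ("p", "a", ""): {("p", "A")},
    ("p", "b", ""): {("p", "B")},
    ("p", "c", ""): {("q", "")},
    ("q", "a", "A"): {("q", "")},
    ("q", "b", "B"): {("q", "")},
    ("q", "", "$"): {("f", "")},
}

def accepts(delta, start, finals, w):
    """Breadth-first search over configurations (state, rest of input, stack)."""
    first = (start, w, "")
    seen, queue = {first}, deque([first])
    while queue:
        state, rest, stack = queue.popleft()
        if state in finals and rest == "":
            return True
        for (p, a, x), moves in delta.items():
            if p != state or not rest.startswith(a) or not stack.startswith(x):
                continue
            for q, push in moves:
                cfg = (q, rest[len(a):], push + stack[len(x):])
                if cfg not in seen:
                    seen.add(cfg)
                    queue.append(cfg)
    return False

print(accepts(DELTA, "s", {"f"}, "abbcbba"), accepts(DELTA, "s", {"f"}, "acb"))
```

A blocked computation simply produces no successor configuration, so the queue runs dry and the input is rejected.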
C SC 473 Automata, Grammars & Languages 32
Example: PDA w/ nondeterminism
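• Last example (palindromes with center-mark) was a deterministic PDA (DPDA)
• NPDA for $L = \{ ww^{R} \mid w \in \{a, b\}^{*} \} = \{ x \mid x \in \{a, b\}^{*} \text{ is an even-length palindrome} \}$
Transitions (input, pop → push):
s → p: $\varepsilon, \varepsilon \to \$$
p → p: $a, \varepsilon \to A$; $b, \varepsilon \to B$
p → q: $\varepsilon, \varepsilon \to \varepsilon$ (nondeterministic “guess” of the midpoint)
q → q: $a, A \to \varepsilon$; $b, B \to \varepsilon$
q → f: $\varepsilon, \$ \to \varepsilon$
$(s, aa, \varepsilon) \vdash (p, aa, \$) \vdash (p, a, A\$) \vdash (p, \varepsilon, AA\$) \vdash (q, \varepsilon, AA\$)$: does not accept (blocked)
$(s, aa, \varepsilon) \vdash (p, aa, \$) \vdash (p, a, A\$) \vdash (q, a, A\$) \vdash (q, \varepsilon, \$) \vdash (f, \varepsilon, \varepsilon)$: accepts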
C SC 473 Automata, Grammars & Languages 33
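Example: PDA
• Recall well-nested parentheses, e.g. (()) (()()), generated by $G_0: B \to \varepsilon \mid BB \mid (B)$
Transitions (input, pop → push):
s → p: $\varepsilon, \varepsilon \to \$$
p → p: $(, \varepsilon \to A$ // if ( then +1
p → p: $), A \to \varepsilon$ // if ) then -1
p → f: $\varepsilon, \$ \to \varepsilon$
$(p, w, \$) \vdash^{*} (p, \varepsilon, \$) \Leftrightarrow \forall$ prefixes x of w, $|x|_{(} \ge |x|_{)}$ ∧ $|w|_{(} = |w|_{)}$
DPDA!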
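Since the stack of this machine only ever holds A's above the bottom marker, its height acts as a counter. A minimal sketch of that counter view follows; the function name `well_nested` is illustrative, and treating any character other than a parenthesis as a rejection is an assumption.

```python
def well_nested(w):
    """Mirror the PDA: +1 on '(', -1 on ')', never below 0, and 0 at the end."""
    depth = 0
    for ch in w:
        if ch == "(":
            depth += 1          # push an A
        elif ch == ")":
            if depth == 0:      # no A to pop: the PDA blocks
                return False
            depth -= 1          # pop an A
        else:
            return False        # symbol outside the input alphabet
    return depth == 0           # only the bottom marker remains

print(well_nested("(())(()())"), well_nested("())("))
```

The two conditions in the loop and the final test are exactly the prefix condition and the equal-count condition stated for the PDA.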
C SC 473 Automata, Grammars & Languages 34
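Example: PDA
$L = \{ a^{n} b^{n} \mid n \ge 0 \} \cup \{ a^{m} b^{2m} \mid m \ge 0 \}$
[Figure: NPDA with start state s and two branches, one per pattern. Transition labels: $\varepsilon, \varepsilon \to \$$ (once per branch); $a, \varepsilon \to A$ (once per branch); $\varepsilon, \varepsilon \to A$ // push a 2nd A; $b, A \to \varepsilon$; $\varepsilon, \$ \to \varepsilon$]
• “guesses” which pattern
• “checks” whether guess is correct
• accepts iff ∃ correct guess that checks
$(s, \varepsilon, \varepsilon) \vdash^{0} (s, \varepsilon, \varepsilon) \wedge s \in F \Rightarrow \varepsilon$ is recognized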
C SC 473 Automata, Grammars & Languages 35
CFG ⇔ PDA
• Thm 2.20: A language is CF ⇔ a PDA recognizes it.
There are algorithms for converting a grammar to an equivalent automaton, and conversely.
• Lemma 2.21: There is an algorithm for constructing, from any CFG G, a PDA M such that L(G) = L(M).
Pf: In constructing a PDA, we can permit, without losing generality, “multi-push” moves such as $a, A \to X_1 X_2 \cdots X_t$ where $A, X_i \in \Gamma$. We may break a multi-push into a sequence of single-push moves by introducing new states $n_t, n_{t-1}, \ldots, n_2$:
$a, A \to X_t$, then $\varepsilon, \varepsilon \to X_{t-1}$, …, then $\varepsilon, \varepsilon \to X_1$
Henceforth we will allow multi-push moves in our PDAs.
C SC 473 Automata, Grammars & Languages 36
CFG ⇒ PDA
• Idea: use nondeterminism. Given G, construct PDA P to load S on the stack and simulate a leftmost derivation on the stack:
• When a variable symbol A comes to the stack top, “guess” a grammar rule $A \to \alpha$, pop A and push α
• When a terminal character comes to the stack top, compare it to the next input symbol.
• If they match, pop the top and advance the input (“check off”)
• If they fail to match, jam (not an accepting computation)
If the input holds a word in L(G) and P guesses the correct leftmost derivation (rules to apply), then all the input characters will be checked off against those at the top of the stack, and the stack will empty as the last input is checked off. Otherwise, at some point the PDA will jam.
C SC 473 Automata, Grammars & Languages 37
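CFG ⇒ PDA (cont’d)
• Given $G = (V, \Sigma, R, S)$ construct $P = (Q, \Sigma, \Gamma, \delta, q_{start}, F)$:
States: $Q = \{ q_{start}, q_{loop}, q_{accept} \}$
Input alphabet: Σ
Stack alphabet: $\Gamma = V \cup \Sigma \cup \{\$\}$
Start state: $q_{start}$
Accept states: $F = \{ q_{accept} \}$
Transition function:
Initialize stack: $\delta(q_{start}, \varepsilon, \varepsilon) = \{ (q_{loop}, S\$) \}$
Simulate rules: $(\forall A \in V)\ \delta(q_{loop}, \varepsilon, A) = \{ (q_{loop}, \alpha) \mid A \to \alpha \in R \}$
Check off terminals: $(\forall a \in \Sigma)\ \delta(q_{loop}, a, a) = \{ (q_{loop}, \varepsilon) \}$
Detect null stack & accept: $\delta(q_{loop}, \varepsilon, \$) = \{ (q_{accept}, \varepsilon) \}$
[Figure: $q_{start} \to q_{loop}$ labelled $\varepsilon, \varepsilon \to S\$$; loops on $q_{loop}$ labelled $\varepsilon, A \to \alpha$ $(\forall A \to \alpha \in R)$ and $a, a \to \varepsilon$ $(\forall a \in \Sigma)$; $q_{loop} \to q_{accept}$ labelled $\varepsilon, \$ \to \varepsilon$]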
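The three-state construction can be sketched directly. The code below is an illustration, assuming single-character grammar symbols, `""` for ε, and the stack as a string with its top at index 0; the names `cfg_to_pda`, `accepts`, `G0` and `PDA` are hypothetical. Because ε-moves such as S → SS can grow the stack without bound, the search uses a stack-size cutoff `max_stack`, which is a heuristic of this sketch and not part of the construction: a `True` answer is always sound, while a `False` answer is only reliable if the cutoff is generous enough.

```python
from collections import deque

def cfg_to_pda(rules, start):
    """delta maps (state, input, pop) -> set of (state, push); multi-push allowed."""
    delta = {("q_start", "", ""): {("q_loop", start + "$")}}
    terminals = set()
    for var, alternatives in rules.items():
        delta[("q_loop", "", var)] = {("q_loop", alpha) for alpha in alternatives}
        for alpha in alternatives:
            terminals |= {ch for ch in alpha if ch not in rules}
    for a in terminals:
        delta[("q_loop", a, a)] = {("q_loop", "")}       # check off a terminal
    delta[("q_loop", "", "$")] = {("q_accept", "")}      # empty stack detected
    return delta

def accepts(delta, w, max_stack):
    """Search configurations, ignoring stacks longer than max_stack (heuristic)."""
    first = ("q_start", w, "")
    seen, queue = {first}, deque([first])
    while queue:
        state, rest, stack = queue.popleft()
        if state == "q_accept" and rest == "":
            return True
        for (p, a, x), moves in delta.items():
            if p != state or not rest.startswith(a) or not stack.startswith(x):
                continue
            for q, push in moves:
                cfg = (q, rest[len(a):], push + stack[len(x):])
                if len(cfg[2]) <= max_stack and cfg not in seen:
                    seen.add(cfg)
                    queue.append(cfg)
    return False

G0 = {"S": ["", "SS", "sSd"]}          # S -> epsilon | SS | sSd
PDA = cfg_to_pda(G0, "S")
print(accepts(PDA, "ssddsd", max_stack=15))
```

The stack contents along an accepting run, read together with the consumed input, spell out the sentential forms of a leftmost derivation.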
C SC 473 Automata, Grammars & Languages 38
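CFG ⇒ PDA (cont’d)
• Ex: $G_0: S \to \varepsilon \mid SS \mid sSd$
[Figure: $q_{start} \to q_{loop}$ labelled $\varepsilon, \varepsilon \to S\$$; loops on $q_{loop}$ labelled $\varepsilon, S \to \varepsilon$; $\varepsilon, S \to SS$; $\varepsilon, S \to sSd$; $s, s \to \varepsilon$; $d, d \to \varepsilon$; $q_{loop} \to q_{accept}$ labelled $\varepsilon, \$ \to \varepsilon$]
$S \Rightarrow_L^{*} xA\alpha \Leftrightarrow (q_{loop}, xu, S\$) \vdash^{*} (q_{loop}, u, A\alpha\$)$
$\therefore S \Rightarrow_L^{*} w \Leftrightarrow (q_{loop}, w, S\$) \vdash^{*} (q_{loop}, \varepsilon, \$) \vdash (q_{accept}, \varepsilon, \varepsilon)$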
C SC 473 Automata, Grammars & Languages 39
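CFG ⇒ PDA (cont’d)
Left column: G (leftmost derivation). Right column: P (computation).
$$
\begin{array}{ll}
 & (q_{start}, ssddsd, \varepsilon) \vdash \\
S \Rightarrow_L & (q_{loop}, ssddsd, S\$) \vdash \\
SS \Rightarrow_L & (q_{loop}, ssddsd, SS\$) \vdash \\
sSdS \Rightarrow_L & (q_{loop}, ssddsd, sSdS\$) \vdash \\
 & (q_{loop}, sddsd, SdS\$) \vdash \\
ssSddS \Rightarrow_L & (q_{loop}, sddsd, sSddS\$) \vdash \\
 & (q_{loop}, ddsd, SddS\$) \vdash \\
ssddS \Rightarrow_L & (q_{loop}, ddsd, ddS\$) \vdash \\
 & (q_{loop}, dsd, dS\$) \vdash \\
 & (q_{loop}, sd, S\$) \vdash \\
ssddsSd \Rightarrow_L & (q_{loop}, sd, sSd\$) \vdash
\end{array}
$$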
C SC 473 Automata, Grammars & Languages 40
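CFG ⇒ PDA (cont’d)
Left column: G (leftmost derivation). Right column: P (computation).
$$
\begin{array}{ll}
 & (q_{loop}, d, Sd\$) \vdash \\
ssddsd & (q_{loop}, d, d\$) \vdash \\
 & (q_{loop}, \varepsilon, \$) \vdash \\
 & (q_{accept}, \varepsilon, \varepsilon) \\
\therefore\ ssddsd \in L(G) & \therefore\ ssddsd \in L(P) \\
S \Rightarrow_L^{*} ssddsd & (q_{start}, ssddsd, \varepsilon) \vdash^{*} (q_{accept}, \varepsilon, \varepsilon)
\end{array}
$$
CFG leftmost derivation ↔ PDA computation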
C SC 473 Automata, Grammars & Languages 41
PDA ⇒ CFG
Lemma 2.27: There is an algorithm for constructing, from any PDA P, a CFG G such that L(G) = L(P).
Pf: Given a PDA $P = (Q, \Sigma, \Gamma, \delta, q_0, F)$ we can convert it into a PDA with the following simplified structure:
• it has only one accept state: $F = \{ q_{accept} \}$
- add ε-transitions from multiple accept states: $f \to q_{accept}$ labelled $\varepsilon, \varepsilon \to \varepsilon$
• it empties its stack just before entering the accept state:
- loop on a state that just pops: $q_{pop} \to q_{pop}$ labelled $\varepsilon, X \to \varepsilon$, $\forall X \in \Gamma$
• each PDA transition is either a “pure push” ($a, \varepsilon \to X$) or a “pure pop” ($a, X \to \varepsilon$)
- introduce new intermediate states
C SC 473 Automata, Grammars & Languages 42
PDA ⇒ CFG (cont’d)
$a, X \to Y$ becomes $a, X \to \varepsilon$ followed by $\varepsilon, \varepsilon \to Y$
$a, \varepsilon \to \varepsilon$ becomes $a, \varepsilon \to X$ followed by $\varepsilon, X \to \varepsilon$
• Idea of proof: construct G with variables $A_{pq}$ for each p and q in the set of states Q. Arrange that if $A_{pq}$ generates terminal string x, then PDA P started in state p with an empty stack on input string x has a computation that reaches state q with an empty stack. And conversely, if P started in state p with an empty stack has a computation on input string x that reaches state q with an empty stack, then $A_{pq} \Rightarrow_G^{*} x$.
How does P, when started on an empty stack in state p, operate on an input string x, ending with an empty stack in state q?
First move must be a push: $a, \varepsilon \to X$
Last move must be a pop: $b, X \to \varepsilon$
C SC 473 Automata, Grammars & Languages 43
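PDA ⇒ CFG (cont’d)
Trace computation of P on x starting in state p with empty stack, and ending in state q with empty stack: (1) stack never empties
$A_{pq} \Rightarrow_G aA_{rs}b$
[Fig. 1: stack height plotted against input. The first move (p to r, reading a) is "push X"; the last move (s to q, reading b) is "pop X". The stretch between r and s is spanned by $A_{rs} \Rightarrow_G^{*} y$, and the whole computation from p to q by $A_{pq} \Rightarrow_G^{*} x$.]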
C SC 473 Automata, Grammars & Languages 44
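PDA ⇒ CFG (cont’d)
Trace computation of P on x starting in state p with empty stack, and ending in state q with empty stack: (2) stack empties somewhere
$A_{pq} \Rightarrow_G A_{pr}A_{rq}$
[Fig. 2: stack height plotted against input. The stack height returns to zero at an intermediate state r. The stretch from p to r is spanned by $A_{pr} \Rightarrow_G^{*} y$, and the stretch from r to q by $A_{rq} \Rightarrow_G^{*} z$.]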
C SC 473 Automata, Grammars & Languages 45
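PDA ⇒ CFG (cont’d)
Construction. Given PDA $P = (Q, \Sigma, \Gamma, \delta, q_0, \{ q_{accept} \})$ construct $G = (V, \Sigma, R, A_{q_0, q_{accept}})$ with the following rules in R:
• $(\forall p, q, r, s \in Q)(\forall X \in \Gamma)(\forall a, b \in \Sigma_\varepsilon)$: if $(r, X) \in \delta(p, a, \varepsilon)$ & $(q, \varepsilon) \in \delta(s, b, X)$, i.e. P has a push $p \to r$ labelled $a, \varepsilon \to X$ and a pop $s \to q$ labelled $b, X \to \varepsilon$, then $A_{pq} \to aA_{rs}b$
• $(\forall p, q \in Q)(\forall r \in Q)$: $A_{pq} \to A_{pr}A_{rq}$
• $(\forall p \in Q)$: $A_{pp} \to \varepsilon$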
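The three kinds of rules can be generated mechanically. This is a sketch under assumptions: the PDA is already in the simplified form (pure pushes and pure pops), a push is encoded as `(p, a, r, X)` and a pop as `(s, b, q, X)` with `""` for ε, a variable $A_{pq}$ is encoded as the pair `(p, q)`, and the names `pda_to_cfg`, `PUSHES`, `POPS` are hypothetical. The sample machine is the three-state PDA for #w# with w well-balanced, used later in the slides.

```python
from itertools import product

def pda_to_cfg(states, pushes, pops):
    """Return the set of rules (lhs, rhs); rhs is a tuple of symbols/variables."""
    rules = set()
    # 1st kind: a push of X matched with a pop of the same X.
    for (p, a, r, x), (s, b, q, y) in product(pushes, pops):
        if x == y:
            rules.add(((p, q), (a, (r, s), b)))
    # 2nd kind: split a computation at an intermediate state r.
    for p, q, r in product(states, repeat=3):
        rules.add(((p, q), ((p, r), (r, q))))
    # 3rd kind: the empty computation.
    for p in states:
        rules.add(((p, p), ()))
    return rules

STATES = ["s", "q", "f"]
PUSHES = {("s", "#", "q", "$"), ("q", "(", "q", "A")}
POPS = {("q", ")", "q", "A"), ("q", "#", "f", "$")}
RULES = pda_to_cfg(STATES, PUSHES, POPS)
print(len(RULES))
```

The construction is deliberately generous: it emits all $|Q|^3$ rules of the second kind, many of which are useless and can be pruned afterwards.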
C SC 473 Automata, Grammars & Languages 46
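PDA ⇒ CFG (cont’d)
Claim 2.30: If $A_{pq} \Rightarrow_G^{*} x$ then $(p, x, \varepsilon) \vdash_P^{*} (q, \varepsilon, \varepsilon)$.
Pf: by induction on the length k of a derivation in G.
Base: k = 1. The only derivations of length 1 are $A_{pp} \Rightarrow_G^{1} \varepsilon$, and we have $(p, \varepsilon, \varepsilon) \vdash_P^{*} (p, \varepsilon, \varepsilon)$.
Step: Assume (IH) the claim is true for derivations of at most k steps. Want: the claim is true for derivations of k+1 steps. Suppose that $A_{pq} \Rightarrow_G^{k+1} x$. The first derivation step is either of the form $A_{pq} \Rightarrow_G aA_{rs}b$ or $A_{pq} \Rightarrow_G A_{pr}A_{rq}$.
Case $A_{pq} \Rightarrow_G aA_{rs}b$. Then $x = ayb$ with $A_{rs} \Rightarrow_G^{k} y$. So by IH $(r, y, \varepsilon) \vdash_P^{*} (s, \varepsilon, \varepsilon)$. By construction, since $A_{pq} \to aA_{rs}b$ is a rule of G,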
C SC 473 Automata, Grammars & Languages 47
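PDA ⇒ CFG (cont’d)
$(\exists X \in \Gamma)\ (r, X) \in \delta(p, a, \varepsilon)$ & $(q, \varepsilon) \in \delta(s, b, X)$.
$\therefore (p, x, \varepsilon) \vdash_P (r, yb, X) \vdash_P^{*} (s, b, X) \vdash_P (q, \varepsilon, \varepsilon)$
Case $A_{pq} \Rightarrow_G A_{pr}A_{rq}$. Then $x = yz$ with $A_{pr} \Rightarrow_G^{\le k} y$ & $A_{rq} \Rightarrow_G^{\le k} z$.
So by IH $(p, y, \varepsilon) \vdash_P^{*} (r, \varepsilon, \varepsilon)$ & $(r, z, \varepsilon) \vdash_P^{*} (q, \varepsilon, \varepsilon)$.
Putting these together: $(p, yz, \varepsilon) \vdash_P^{*} (r, z, \varepsilon) \vdash_P^{*} (q, \varepsilon, \varepsilon)$ ∎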
C SC 473 Automata, Grammars & Languages 48
PDA ⇒ CFG (cont’d)
Claim 2.31: If $(p, x, \varepsilon) \vdash_P^{*} (q, \varepsilon, \varepsilon)$ then $A_{pq} \Rightarrow_G^{*} x$.
Pf: by induction on the length k of a computation in P.
Base: k = 0. The only computations of length 0 are $(p, x, \varepsilon) \vdash_P^{0} (p, \varepsilon, \varepsilon)$ where x = ε. By construction $A_{pp} \Rightarrow_G^{1} \varepsilon$.
Step: Assume (IH) the claim is true for computations $(p, x, \varepsilon) \vdash_P^{k} (q, \varepsilon, \varepsilon)$ of at most k steps. Want: the claim is true for computations of k+1 steps. Suppose that $(p, x, \varepsilon) \vdash_P^{k+1} (q, \varepsilon, \varepsilon)$. Two cases: either the stack does not empty in the midst of this computation (Fig. 1) or it becomes empty during the computation (Fig. 2). Call these Case 1 and Case 2.
C SC 473 Automata, Grammars & Languages 49
PDA ⇒ CFG (cont’d)
Case 1: See Fig. 1. The symbol X pushed in the 1st move is the same as that popped in the last move. Let the 1st and last moves be governed by the push/pop transitions:
$(r, X) \in \delta(p, a, \varepsilon)$ & $(q, \varepsilon) \in \delta(s, b, X)$.
By construction, there is a rule in G: $A_{pq} \to aA_{rs}b$.
Let x = ayb. Since $(r, y, X) \vdash_P^{k-1} (s, \varepsilon, X)$, then we must have $(r, y, \varepsilon) \vdash_P^{k-1} (s, \varepsilon, \varepsilon)$. By IH $A_{rs} \Rightarrow_G^{*} y$.
Then, using $A_{pq} \Rightarrow_G^{1} aA_{rs}b$, we conclude $A_{pq} \Rightarrow_G^{*} ayb = x$.
C SC 473 Automata, Grammars & Languages 50
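PDA ⇒ CFG (cont’d)
Case 2: See Fig. 2. Let r be the intermediate state where the stack becomes empty. Then
$(\exists y, z)\ x = yz$: $(p, y, \varepsilon) \vdash_P^{\le k} (r, \varepsilon, \varepsilon)$ & $(r, z, \varepsilon) \vdash_P^{\le k} (q, \varepsilon, \varepsilon)$
By the IH, $A_{pr} \Rightarrow_G^{*} y$ and $A_{rq} \Rightarrow_G^{*} z$.
Since by construction there is a rule in G of the form $A_{pq} \to A_{pr}A_{rq}$, then
$A_{pq} \Rightarrow_G A_{pr}A_{rq} \Rightarrow_G^{*} yz = x$. ∎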
C SC 473 Automata, Grammars & Languages 51
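PDA ⇒ CFG (cont’d)
Ex: $L(P) = \{ \#w\# \mid w \in \{(, )\}^{*} \text{ is a well-balanced string of parentheses} \}$
$P$: $Q = \{s, q, f\}$, $\Gamma = \{A, \$\}$, $\Sigma = \{\#, (, )\}$
Transitions (input, pop → push):
s → q: $\#, \varepsilon \to \$$
q → q: $(, \varepsilon \to A$
q → q: $), A \to \varepsilon$
q → f: $\#, \$ \to \varepsilon$
Rules of G: (1) push-pop pairs (1st kind):
$A_{sf} \to \#A_{qq}\#$
$A_{qq} \to (A_{qq})$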
C SC 473 Automata, Grammars & Languages 52
PDA ⇒ CFG (cont’d)
Note: If $(p, -, -) \nvdash_P^{*} (p', -, -)$ (p′ unreachable) then $\{ x \mid A_{pp'} \Rightarrow_G^{*} x \} = \emptyset$ (abbreviated $A_{pp'} = \emptyset$).
Such variables are useless; all rules involving them on left or right sides can be eliminated as useless productions. For this grammar $A_{fq} = \emptyset$, $A_{qs} = \emptyset$, $A_{fs} = \emptyset$.
(2) Rules of the 2nd Kind (with useless rules removed; only 10/27 survive) in the order s, q, f:
$A_{ss} \to A_{ss}A_{ss}$
$A_{sq} \to A_{ss}A_{sq}$
$A_{sq} \to A_{sq}A_{qq}$
$A_{sf} \to A_{ss}A_{sf}$
$A_{sf} \to A_{sq}A_{qf}$
$A_{sf} \to A_{sf}A_{ff}$
$A_{qq} \to A_{qq}A_{qq}$
$A_{qf} \to A_{qq}A_{qf}$
$A_{qf} \to A_{qf}A_{ff}$
$A_{ff} \to A_{ff}A_{ff}$
C SC 473 Automata, Grammars & Languages 53
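PDA ⇒ CFG (cont’d)
(3) Rules of the 3rd Kind:
$A_{ss} \to \varepsilon$
$A_{qq} \to \varepsilon$
$A_{ff} \to \varepsilon$
Combining all rules with same LHS:
$A_{ss} \to A_{ss}A_{ss} \mid \varepsilon$
$A_{sq} \to A_{ss}A_{sq} \mid A_{sq}A_{qq}$
$A_{sf} \to \#A_{qq}\# \mid A_{ss}A_{sf} \mid A_{sq}A_{qf} \mid A_{sf}A_{ff}$
$A_{qq} \to (A_{qq}) \mid A_{qq}A_{qq} \mid \varepsilon$
$A_{qf} \to A_{qq}A_{qf} \mid A_{qf}A_{ff}$
$A_{ff} \to A_{ff}A_{ff} \mid \varepsilon$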
C SC 473 Automata, Grammars & Languages 54
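PDA ⇒ CFG (cont’d)
Simplify: easy to see that $A_{ss} = \varepsilon$ and $A_{ff} = \varepsilon$ (each generates only ε).
Substituting this into rules:
$A_{sq} \to A_{sq} \mid A_{sq}A_{qq}$
$A_{sf} \to \#A_{qq}\# \mid A_{sf} \mid A_{sq}A_{qf} \mid A_{sf}$
$A_{qq} \to (A_{qq}) \mid A_{qq}A_{qq} \mid \varepsilon$
$A_{qf} \to A_{qq}A_{qf} \mid A_{qf}$
Eliminate useless rules like $X \to X$:
$A_{sq} \to A_{sq}A_{qq}$
$A_{sf} \to \#A_{qq}\# \mid A_{sq}A_{qf}$
$A_{qq} \to (A_{qq}) \mid A_{qq}A_{qq} \mid \varepsilon$
$A_{qf} \to A_{qq}A_{qf}$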
C SC 473 Automata, Grammars & Languages 55
PDA ⇒ CFG (cont’d)
Another kind of useless rule: $A_{sq}, A_{qf}$ generate no terminal strings. Eliminate these variables and any rules mentioning them. Final simplified grammar is:
$A_{sf} \to \#A_{qq}\#$
$A_{qq} \to (A_{qq}) \mid A_{qq}A_{qq} \mid \varepsilon$
Note: chose to use endmarkers # for clarity, but these could have been ε (input symbols can be anything in $\Sigma_\varepsilon$), leading to the familiar grammar
$A_{sf} \to A_{qq}$
$A_{qq} \to (A_{qq}) \mid A_{qq}A_{qq} \mid \varepsilon$
C SC 473 Automata, Grammars & Languages 56
Closure Properties
• Regular Ops. The CFLs are closed under ∪, ·, *. Pf: Homework
• Intersection. The CFLs are not closed under intersection.
Example: Consider the two CFLs
$L_1 = \{ a^{n}b^{n} \mid n \ge 0 \} \cdot c^{*}, \quad L_2 = a^{*} \cdot \{ b^{n}c^{n} \mid n \ge 0 \}$
Then $L_1 \cap L_2 = \{ a^{n}b^{n}c^{n} \mid n \ge 0 \}$. We will later see (CF Pumping Lemma) that this last is not a CFL. ∎
However, if $L_1$ is regular and $L_2$ is CF, then $L_1 \cap L_2$ is CF.
C SC 473 Automata, Grammars & Languages 57
Closure Properties (cont’d)
• Thm: The class of CFLs is closed under intersection with regular languages.
Pf: Assume $L = L(P_1)$, $P_1 = (Q_1, \Sigma_1, \Gamma_1, \delta_1, s_1, F_1)$ and $R = L(M_2)$, $M_2 = (Q_2, \Sigma_2, \delta_2, s_2, F_2)$.
Construction. Construct a “cross-product pda” M as follows:
$M = (Q_1 \times Q_2,\ \Sigma_1 \cup \Sigma_2,\ \Gamma_1,\ \delta,\ [s_1, s_2],\ F_1 \times F_2)$
where the transition function δ is defined by:
$([p_1, p_2], Y) \in \delta([q_1, q_2], a, X)$ provided $(p_1, Y) \in \delta_1(q_1, a, X)$ and $p_2 = \delta_2(q_2, a)$
(for an ε-move of $P_1$, i.e. a = ε, the second component stays put: $p_2 = q_2$).
Machine M simulates the two given machines “in parallel”, keeping each machine state in one component of the compound state [ , ].
C SC 473 Automata, Grammars & Languages 58
What is Not Context-Free?
• PDAs have limited computing ability. They cannot, for example, recognize repeated strings like w#w or strings that “count” in more than 2 places, such as $a^{n}b^{n}c^{n}$.
• We will show that some languages are not CF using a CF Pumping Lemma, which gives a property that all CFLs must have. Then, to show that a language L is not CF, we somehow argue that it lacks this pumping property.
• Closure properties of CFLs can sometimes be used to simplify non-CFLs and make a pumping argument easier.
C SC 473 Automata, Grammars & Languages 59
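CF Pumping Lemma
• Thm [Pumping Lemma for CFLs]. Suppose that L is an infinite CF language. Then
$$(\exists p)(\forall w)\big[(w \in L \wedge |w| \ge p) \Rightarrow (\exists u, v, x, y, z)(w = uvxyz \wedge vy \ne \varepsilon \wedge |vxy| \le p \wedge (\forall i \ge 0)\ uv^{i}xy^{i}z \in L)\big]$$
• For comparison, here is the Regular P.L.:
$$(\exists p)(\forall w)\big[(w \in L \wedge |w| \ge p) \Rightarrow (\exists x, y, z)(w = xyz \wedge y \ne \varepsilon \wedge |xy| \le p \wedge (\forall i \ge 0)\ xy^{i}z \in L)\big]$$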
C SC 473 Automata, Grammars & Languages 60
CF Pumping Lemma (cont’d)
• Pf: Let $L - \{\varepsilon\} = L(G)$ where G is a CFG in Chomsky Normal Form (Text, Theorem 2.9), i.e. a CFG in which all rules are of the (schematic) forms $A \to BC$ or $A \to a$ ($a \ne \varepsilon$). If $w \in L(G)$ is “sufficiently long”, then any derivation tree T for w must contain a “long” path. More precisely:
• Claim 1: If the derivation tree T for w has no path longer than h, then $|w| \le 2^{h-1}$.
Pf: Induction on h. Base: h = 1. The only possible tree is S with the single leaf a, and $|a| = 1 = 2^{0}$.
Step: Assume $h > 1$. Let T have all paths of length $\le h$ and be of the form (in CNF)
C SC 473 Automata, Grammars & Languages 61
CF Pumping Lemma (cont’d)
[Figure: T has root S with two subtrees $T_1$ and $T_2$, whose yields are s and t, so w = st]
• Then $T_1, T_2$ have all paths of length $\le h - 1$. By IH, $|s| \le 2^{h-2}$, $|t| \le 2^{h-2}$, which implies $|w| \le 2^{h-1}$.
Contrapositively, if a generated string is at least $2^{h}$ long, then its parse tree must be at least $h + 1$ high.
G has $|V|$ variables. Choose $p = 2^{|V|+1}$. If $w \in L$ and $|w| \ge p$, then by Claim 1 any parse tree T for w has a path of length at least $|V| + 2$. Such a path has at least $|V| + 3$ nodes. ∴ some variable appears twice on the path (note the leaf node is a terminal).
C SC 473 Automata, Grammars & Languages 62
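CF Pumping Lemma (cont’d)
• Picture:
$|w| \ge p = 2^{|V|+1}$ ⇒ height $\ge |V| + 2$ ⇒ nodes $\ge |V| + 3$ ⇒ variables $\ge |V| + 2$ ⇒ variables repeat
[Figure: parse tree T with yield w = uvxyz and a variable R repeated on a long path. The subtree $T_1$ at the upper R has yield vxy; the subtree $T_2$ at the lower R has yield x.]
Choose the repeated R among the bottom $|V| + 1$ variables of the path, so that $T_1$ has height $\le |V| + 2$ and yield of length $\le 2^{|V|+1} = p$.
$\therefore |vxy| \le p$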
C SC 473 Automata, Grammars & Languages 63
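CF Pumping Lemma (cont’d)
(1) Center portion is not too long: $|vxy| \le p$
(2) Pumped portion not empty: v, y cannot both = ε.
[Figure: the upper R expands as R → BC, and the subtree $T_2$ at the lower R lies either under B or under C; in either case the other child contributes a nonempty part of v or y]
In CNF, no variable generates ε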
C SC 473 Automata, Grammars & Languages 64
CF Pumping Lemma (cont’d)
(3) Pumped strings in L: the following are all parse trees
[Figure: parse trees obtained by repeating or removing the R-to-R segment, with yields u v x y z, u vv x yy z, u vvv x yyy z, and u x z]
$\therefore (\forall i \ge 0)\ uv^{i}xy^{i}z \in L$ ∎
C SC 473 Automata, Grammars & Languages 65
CF Pumping: Applications
• Ex: $L = \{ a^{n}b^{n}c^{n} \mid n \ge 0 \}$ is not a CFL.
Pf: Suppose it is CF. Then by the Pumping Lemma $\exists p\ \forall w \in L$, $|w| \ge p \Rightarrow \exists\, uvxyz = w$ & $vy \ne \varepsilon$ & $\forall i \ge 0\ uv^{i}xy^{i}z \in L$. Pick p as the constant guaranteed and choose $n \ge p/3$ and $w = a^{n}b^{n}c^{n} = uvxyz$, $vy \ne \varepsilon$. Where is vxy?
Cases: Assume first that $v \ne \varepsilon$.
[Figure: six cases for where v falls relative to the blocks a…a, b…b, c…c of w = uvxyz]
C SC 473 Automata, Grammars & Languages 66
CF Pumping: Applications
• In cases 1-3 $uv^{2}xy^{2}z$ has an imbalance. In case 4 it has a b before a. In case 5 it has a c before b. In case 6 it has an a after a c. In any case, there is a contradiction to the pumped word being in L. The case where $y \ne \varepsilon$ is symmetric. Contradiction.
Cor. The CFLs are not closed under complementation.
Pf: $L = \{ a^{p}b^{q}c^{r} \mid p \ne q \vee q \ne r \vee p \ne r \}$ is a CFL. But $\overline{L} \cap a^{*}b^{*}c^{*} = \{ a^{i}b^{i}c^{i} \mid i \ge 0 \}$ is not a CFL. Therefore $\overline{L}$ cannot be CF.
Ex: $L = \{ a^{i} \mid i \text{ prime} \}$ is not CF. Proof similar to regular case.
Ex: $L = \{ a^{i^{2}} \mid i \ge 0 \}$ is not CF.
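The case analysis for $a^{n}b^{n}c^{n}$ can be sanity-checked by brute force on a small instance. The sketch below is illustrative only: the function names are hypothetical, and it tests just the exponents i = 0, 2, 3, which is enough to refute a split (one failing i suffices) but means a returned split is merely a candidate, not a proof of pumpability.

```python
def is_anbncn(w):
    n = len(w) // 3
    return w == "a" * n + "b" * n + "c" * n

def is_anbn(w):
    n = len(w) // 2
    return w == "a" * n + "b" * n

def pumpable_split(w, p, member):
    """Return some (u, v, x, y, z) with |vxy| <= p and vy nonempty whose pumped
    words for i in (0, 2, 3) all stay in the language, or None if none exists."""
    n = len(w)
    for start in range(n + 1):
        for end in range(start, min(start + p, n) + 1):   # vxy = w[start:end]
            for i in range(start, end + 1):
                for j in range(i, end + 1):
                    u, v, x, y, z = w[:start], w[start:i], w[i:j], w[j:end], w[end:]
                    if v + y and all(member(u + v * k + x + y * k + z)
                                     for k in (0, 2, 3)):
                        return (u, v, x, y, z)
    return None

print(pumpable_split("aaaabbbbcccc", 4, is_anbncn))
print(pumpable_split("aaabbb", 2, is_anbn))
```

For the first word every admissible split is refuted, matching the proof; for the context-free language $a^{n}b^{n}$ a split with v = a and y = b around the midpoint survives.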
C SC 473 Automata, Grammars & Languages 67
Pumping: Applications (cont’d)
Ex: $L = \{ w \in \{a, b, c\}^{*} \mid |w|_a = |w|_b = |w|_c \}$ is not CF.
Pf: Intersection with $a^{*}b^{*}c^{*}$ is not a CFL. Therefore L cannot be CF.
Ex: $L = \{ a^{i}b^{j}c^{i}d^{j} \mid i \ge 1, j \ge 1 \}$ is not CF.
Pf: By pumping on the word $w = a^{n}b^{n}c^{n}d^{n}$. Similar to Text, Example 2.38.
Ex: $L = \{ ww \mid w \in \{0, 1\}^{*} \}$ is not CF.
Pf: $L \cap 10^{+}10^{+}110^{+}10^{+}1 = \{ 10^{n}10^{m}110^{n}10^{m}1 \mid m, n \ge 1 \}$. Pump on the latter language in a way similar to the previous example to show it is not CF.