simplifications of context-free grammars
1
Simplifications of
Context-Free Grammars
2
A Substitution Rule
S -> aB
A -> aaA
A -> abBc
B -> aA
B -> b
Substitute B -> b
Equivalent grammar:
S -> aB | ab
A -> aaA
A -> abBc | abbc
B -> aA
3
A Substitution Rule
S -> aB | ab
A -> aaA
A -> abBc | abbc
B -> aA
Substitute B -> aA
Equivalent grammar:
S -> aB | ab | aaA
A -> aaA
A -> abBc | abbc | abaAc
4
In general:
A -> xBz
B -> y_1
Substitute B -> y_1
Equivalent grammar:
A -> xBz | x y_1 z
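The substitution rule can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: a grammar is a dict from a variable to a set of right-hand sides, every symbol is one character, uppercase letters are variables, and the substituted right-hand side does not contain the variable itself. The name `substitute` is hypothetical.

```python
from itertools import product

def substitute(grammar, var, body):
    """Apply the substitution rule for the production var -> body."""
    result = {}
    for head, bodies in grammar.items():
        new_bodies = set()
        for b in bodies:
            if head == var and b == body:
                continue  # the substituted production itself is dropped
            # Each occurrence of var is either kept or replaced by body.
            options = [(s, body) if s == var else (s,) for s in b]
            new_bodies.update("".join(choice) for choice in product(*options))
        result[head] = new_bodies
    return result

g = {"S": {"aB"}, "A": {"aaA", "abBc"}, "B": {"aA", "b"}}
g2 = substitute(g, "B", "b")    # S -> aB | ab, A -> aaA | abBc | abbc, B -> aA
g3 = substitute(g2, "B", "aA")  # S -> aB | ab | aaA, A -> aaA | abBc | abbc | abaAc
```

Keeping the original right-hand side next to each substituted copy mirrors the `A -> xBz | x y_1 z` form of the rule.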
5
Nullable Variables
λ-production: A -> λ
Nullable Variable: A => ... => λ
6
Removing Nullable Variables
Example Grammar:
S -> aMb
M -> aMb
M -> λ
Nullable variable: M
7
S -> aMb
M -> aMb
M -> λ
Substitute M -> λ
Final Grammar:
S -> aMb
S -> ab
M -> aMb
M -> ab
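A minimal Python sketch of this step, assuming the same illustrative representation (dict from variable to a set of right-hand-side strings, one character per symbol, the empty string standing for λ). The function names are hypothetical, and the code follows the usual textbook formulation rather than anything stated on the slides.

```python
from itertools import product

def nullable_variables(grammar):
    """Variables A with A => ... => λ, found by repeated rounds."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            # A body made only of nullable variables (or the empty body) is nullable.
            if head not in nullable and any(all(s in nullable for s in b) for b in bodies):
                nullable.add(head)
                changed = True
    return nullable

def remove_lambda_productions(grammar):
    """Substitute λ for nullable variables in every possible way, then drop λ bodies."""
    nullable = nullable_variables(grammar)
    result = {}
    for head, bodies in grammar.items():
        new_bodies = set()
        for b in bodies:
            options = [(s, "") if s in nullable else (s,) for s in b]
            new_bodies.update("".join(choice) for choice in product(*options))
        new_bodies.discard("")  # no λ-productions remain
        result[head] = new_bodies
    return result

g = {"S": {"aMb"}, "M": {"aMb", ""}}
final = remove_lambda_productions(g)  # S -> aMb | ab, M -> aMb | ab
```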
8
Unit-Productions
Unit Production: A -> B
(a single variable on both sides)
9
Removing Unit Productions
Observation:
A -> A
is removed immediately
10
Example Grammar:
S -> aA
A -> a
A -> B
B -> A
B -> bb
11
S -> aA
A -> a
A -> B
B -> A
B -> bb
Substitute A -> B
S -> aA | aB
A -> a
B -> A | B
B -> bb
12
S -> aA | aB
A -> a
B -> A | B
B -> bb
Remove B -> B
S -> aA | aB
A -> a
B -> A
B -> bb
13
S -> aA | aB
A -> a
B -> A
B -> bb
Substitute B -> A
S -> aA | aB | aA
A -> a
B -> bb
14
Remove repeated productions
S -> aA | aB | aA
A -> a
B -> bb
Final grammar:
S -> aA | aB
A -> a
B -> bb
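For comparison, here is a sketch of unit-production removal in Python. Note that it uses the common "unit closure" formulation, not the step-by-step substitution shown on the slides, so the resulting grammar can look different while generating the same language from S. The representation (dict of sets of strings, uppercase single-letter variables) and the function names are assumptions made for illustration.

```python
def is_unit(body):
    """A unit body is a single variable (one uppercase letter)."""
    return len(body) == 1 and body.isupper()

def remove_unit_productions(grammar):
    result = {}
    for head in grammar:
        # Collect every variable reachable from head through unit productions only.
        closure, stack = {head}, [head]
        while stack:
            v = stack.pop()
            for b in grammar.get(v, ()):
                if is_unit(b) and b not in closure:
                    closure.add(b)
                    stack.append(b)
        # head inherits all non-unit bodies of the variables in its closure.
        result[head] = {b for v in closure for b in grammar.get(v, ()) if not is_unit(b)}
    return result

g = {"S": {"aA"}, "A": {"a", "B"}, "B": {"A", "bb"}}
no_units = remove_unit_productions(g)  # S -> aA, A -> a | bb, B -> a | bb
```

With this formulation S still derives exactly aa and abb, as it does in the final grammar above; B simply becomes unreachable instead of being referenced from S.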
15
Useless Productions
S -> aSb
S -> λ
S -> A
A -> aA   (useless production)
Some derivations never terminate...
S => A => aA => aaA => ... => aa...aA => ...
16
Another grammar:
S -> A
A -> aA
A -> λ
B -> bA   (useless production: not reachable from S)
17
In general:
if S => ... => xAy => ... => w, with w ∈ L(G)
(w contains only terminals)
then variable A is useful
otherwise, variable A is useless
18
A production A -> x is useless if any of its variables is useless
S -> aSb
S -> λ
S -> A    (production useless)
A -> aA   (production useless; variable A useless)
B -> C    (production useless; variable B useless)
C -> D    (production useless; variable C useless)
19
Removing Useless Productions
Example Grammar:
S -> aS | A | C
A -> a
B -> aa
C -> aCb
20
First: find all variables that can produce strings with only terminals
S -> aS | A | C
A -> a
B -> aa
C -> aCb
Round 1: {A, B}
Round 2: {A, B, S}   (because S -> A)
21
Keep only the variables that produce terminal symbols: {A, B, S}
(the remaining variables are useless)
S -> aS | A | C
A -> a
B -> aa
C -> aCb
Remove useless productions:
S -> aS | A
A -> a
B -> aa
22
Second: Find all variables reachable from S
S -> aS | A
A -> a
B -> aa
Use a Dependency Graph:
S -> A      B (not reachable)
23
Keep only the variables reachable from S
(the remaining variables are useless)
S -> aS | A
A -> a
B -> aa
Remove useless productions
Final Grammar:
S -> aS | A
A -> a
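The two passes can be sketched in Python as follows. The representation (dict from variable to a set of right-hand-side strings, uppercase letters as variables) and the name `remove_useless` are assumptions made for this illustration.

```python
def remove_useless(grammar, start="S"):
    def terminal_producing(body, good):
        # Every symbol is either a terminal or a variable already known to be good.
        return all(not s.isupper() or s in good for s in body)

    # First: variables that can produce strings with only terminals.
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head not in generating and any(terminal_producing(b, generating) for b in bodies):
                generating.add(head)
                changed = True
    kept = {h: {b for b in bodies if terminal_producing(b, generating)}
            for h, bodies in grammar.items() if h in generating}

    # Second: variables reachable from the start variable (dependency graph walk).
    reachable, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for b in kept.get(v, ()):
            for s in b:
                if s.isupper() and s not in reachable:
                    reachable.add(s)
                    stack.append(s)
    return {h: bodies for h, bodies in kept.items() if h in reachable}

g = {"S": {"aS", "A", "C"}, "A": {"a"}, "B": {"aa"}, "C": {"aCb"}}
final = remove_useless(g)  # S -> aS | A, A -> a
```

The order matters in this sketch: running the reachability pass first would keep variables that only become unreachable once non-generating productions are deleted.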
24
Removing All
Step 1: Remove Nullable Variables
Step 2: Remove Unit-Productions
Step 3: Remove Useless Variables
25
Normal Forms for
Context-Free Grammars
26
Chomsky Normal Form
Each production has the form:
A -> BC   (variable variable)
or
A -> a    (terminal)
27
Examples:
Chomsky Normal Form:
S -> AS
S -> a
A -> SA
A -> b
Not Chomsky Normal Form:
S -> AS
S -> AAS
A -> SA
A -> aa
28
Conversion to Chomsky Normal Form
Example:
S -> ABa
A -> aab
B -> Ac
Not Chomsky Normal Form
29
S -> ABa
A -> aab
B -> Ac
Introduce variables for terminals: T_a, T_b, T_c
S -> A B T_a
A -> T_a T_a T_b
B -> A T_c
T_a -> a
T_b -> b
T_c -> c
30
Introduce intermediate variable: V_1
S -> A B T_a
A -> T_a T_a T_b
B -> A T_c
T_a -> a
T_b -> b
T_c -> c
becomes
S -> A V_1
V_1 -> B T_a
A -> T_a T_a T_b
B -> A T_c
T_a -> a
T_b -> b
T_c -> c
31
Introduce intermediate variable: V_2
S -> A V_1
V_1 -> B T_a
A -> T_a T_a T_b
B -> A T_c
T_a -> a
T_b -> b
T_c -> c
becomes
S -> A V_1
V_1 -> B T_a
A -> T_a V_2
V_2 -> T_a T_b
B -> A T_c
T_a -> a
T_b -> b
T_c -> c
32
Initial grammar:
S -> ABa
A -> aab
B -> Ac
Final grammar in Chomsky Normal Form:
S -> A V_1
V_1 -> B T_a
A -> T_a V_2
V_2 -> T_a T_b
B -> A T_c
T_a -> a
T_b -> b
T_c -> c
33
In general:
From any context-free grammar (which doesn't produce λ) not in Chomsky Normal Form
we can obtain: An equivalent grammar in Chomsky Normal Form
34
The Procedure
First remove:
Nullable variables
Unit productions
35
Then, for every terminal symbol a:
New variable: T_a
Add production T_a -> a
In productions with at least two symbols on the right-hand side: replace a with T_a
36
Replace any production A -> C_1 C_2 ... C_n
with
A -> C_1 V_1
V_1 -> C_2 V_2
...
V_(n-2) -> C_(n-1) C_n
New intermediate variables: V_1, V_2, ..., V_(n-2)
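The last two steps of the procedure can be sketched in Python. This is an illustration only: it assumes nullable variables and unit productions have already been removed, represents each right-hand side as a tuple of symbols, treats lowercase symbols as terminals, and numbers the intermediate variables globally (V_1, V_2, ...) instead of per production. The name `to_cnf` is hypothetical.

```python
def to_cnf(grammar):
    """grammar: dict variable -> list of right-hand sides (tuples of symbols)."""
    result = {}
    counter = 0

    def add(head, body):
        bodies = result.setdefault(head, [])
        if body not in bodies:
            bodies.append(body)

    for head, bodies in grammar.items():
        for body in bodies:
            if len(body) == 1:
                add(head, body)  # A -> a is already in Chomsky Normal Form
                continue
            # Replace each terminal a with T_a and add T_a -> a.
            syms = []
            for s in body:
                if s.islower():
                    add("T_" + s, (s,))
                    syms.append("T_" + s)
                else:
                    syms.append(s)
            # Break long right-hand sides with intermediate variables.
            cur = head
            while len(syms) > 2:
                counter += 1
                v = f"V_{counter}"
                add(cur, (syms[0], v))
                cur = v
                syms = syms[1:]
            add(cur, tuple(syms))
    return result

g = {"S": [("A", "B", "a")], "A": [("a", "a", "b")], "B": [("A", "c")]}
cnf = to_cnf(g)
```

On the example grammar this yields S -> A V_1, V_1 -> B T_a, A -> T_a V_2, V_2 -> T_a T_b, B -> A T_c plus the three terminal productions.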
37
Theorem: For any context-free grammar (which doesn't produce λ) there is an equivalent grammar in Chomsky Normal Form
38
Observations
• Chomsky normal forms are good for parsing and proving theorems
• It is very easy to find the Chomsky normal form for any context-free grammar
39
Greibach Normal Form
All productions have the form:
A -> a V_1 V_2 ... V_k,   k >= 0
(a: symbol; V_1 ... V_k: variables)
40
Examples:
Greibach Normal Form:
S -> cAB
A -> aA | bB | b
B -> b
Not Greibach Normal Form:
S -> abSb
S -> aa
41
Conversion to Greibach Normal Form:
S -> abSb
S -> aa
becomes
S -> a T_b S T_b
S -> a T_a
T_a -> a
T_b -> b
Greibach Normal Form
42
Theorem: For any context-free grammar (which doesn't produce λ) there is an equivalent grammar in Greibach Normal Form
43
Observations
• Greibach normal forms are very good for parsing
• It is hard to find the Greibach normal form of any context-free grammar
44
Compilers
45
Compiler
Program:
v = 5;
if (v>5)
  x = 12 + v;
while (x != 3) {
  x = x - 3;
  v = 10;
}
......
Machine Code:
Add v,v,0
cmp v,5
jmplt ELSE
THEN:
add x, 12,v
ELSE:
WHILE:
cmp x,3
...
46
Compiler: lexical analyzer, then parser
input: program
output: machine code
47
A parser knows the grammar of the programming language
48
Parser
PROGRAM -> STMT_LIST
STMT_LIST -> STMT; STMT_LIST | STMT;
STMT -> EXPR | IF_STMT | WHILE_STMT | { STMT_LIST }
EXPR -> EXPR + EXPR | EXPR - EXPR | ID
IF_STMT -> if (EXPR) then STMT | if (EXPR) then STMT else STMT
WHILE_STMT -> while (EXPR) do STMT
49
The parser finds the derivation of a particular input
Parser grammar:
E -> E + E | E * E | INT
input:
10 + 2 * 5
derivation:
E => E + E => E + E * E => 10 + E*E => 10 + 2 * E => 10 + 2 * 5
50
derivation:
E => E + E => E + E * E => 10 + E*E => 10 + 2 * E => 10 + 2 * 5
derivation tree:
          E
        / | \
       E  +  E
       |   / | \
      10  E  *  E
          |     |
          2     5
51
derivation tree:
          E
        / | \
       E  +  E
       |   / | \
      10  E  *  E
          |     |
          2     5
machine code:
mult a, 2, 5
add b, 10, a
52
Parsing
53
Parser (knows the grammar)
input: string
output: derivation
54
Example:
Parser grammar:
S -> SS
S -> aSb
S -> bSa
S -> λ
input: aabb
derivation: ?
55
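Exhaustive Search
S -> SS | aSb | bSa | λ
Find derivation of aabb
Phase 1: all possible derivations of length 1
S => SS
S => aSb
S => bSa
S => λ
56
S => SS
S => aSb
S => bSa   (discarded: cannot lead to aabb)
S => λ     (discarded: cannot lead to aabb)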
57
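S -> SS | aSb | bSa | λ
Find derivation of aabb
Phase 1:
S => SS
S => aSb
Phase 2:
S => SS => SSS
S => SS => aSbS
S => SS => bSaS
S => SS => S
S => aSb => aSSb
S => aSb => aaSbb
S => aSb => abSab
S => aSb => ab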
58
S -> SS | aSb | bSa | λ
Find derivation of aabb
Phase 2:
S => SS => SSS
S => SS => aSbS
S => SS => S
S => aSb => aSSb
S => aSb => aaSbb
Phase 3:
S => aSb => aaSbb => aabb
59
Final result of exhaustive search
(top-down parsing)
Parser grammar:
S -> SS
S -> aSb
S -> bSa
S -> λ
input: aabb
derivation: S => aSb => aaSbb => aabb
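The phase-by-phase search can be sketched in Python as a breadth-first search over leftmost derivations. Several details here are assumptions made for the illustration, not part of the slides: the pruning tests (terminal prefix must match the input, no more terminals than the input has), the `max_phases` cut-off that guarantees termination, and the names `exhaustive_parse` and `first_var`. Symbols are single characters, uppercase letters are variables, and the empty string stands for λ.

```python
def first_var(form):
    """Index of the leftmost variable, or len(form) if there is none."""
    return next((j for j, s in enumerate(form) if s.isupper()), len(form))

def exhaustive_parse(grammar, start, w, max_phases=10):
    """Return the sentential forms of a derivation of w, or None."""
    frontier = [(start, [start])]
    seen = {start}
    for _ in range(max_phases):
        next_frontier = []
        for form, path in frontier:
            i = first_var(form)
            if i == len(form):
                continue  # only terminals left and not equal to w
            for body in grammar[form[i]]:
                new = form[:i] + body + form[i + 1:]
                if new == w:
                    return path + [new]
                terminals = sum(1 for s in new if not s.isupper())
                prefix = new[:first_var(new)]
                # Discard forms that cannot lead to w, and forms already explored.
                if terminals > len(w) or not w.startswith(prefix) or new in seen:
                    continue
                seen.add(new)
                next_frontier.append((new, path + [new]))
        frontier = next_frontier
    return None

g = {"S": ["SS", "aSb", "bSa", ""]}
derivation = exhaustive_parse(g, "S", "aabb")
```

For the input aabb the search stops in phase 3 with the forms S, aSb, aaSbb, aabb.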
60
Time complexity of exhaustive search
Suppose there are no productions of the form
A -> λ
A -> B
Number of phases for string w: at most 2|w|
61
For a grammar with k rules
Time for phase 1: k
(k possible derivations)
62
Time for phase 2: k^2
(k^2 possible derivations)
63
Time for phase 2|w|: k^(2|w|)
(k^(2|w|) possible derivations)
64
Total time needed for string w:
k + k^2 + ... + k^(2|w|)
(phase 1, phase 2, ..., phase 2|w|)
Extremely bad!!!
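To get a feel for how fast this sum grows, it can simply be evaluated. The short Python sketch below does that; the values of k and |w| are hypothetical and chosen only for illustration, as is the function name.

```python
def exhaustive_search_bound(k, n):
    """k + k^2 + ... + k^(2n) for a grammar with k rules and an input of length n."""
    return sum(k ** i for i in range(1, 2 * n + 1))

small = exhaustive_search_bound(2, 4)    # 2 + 4 + ... + 2^8
larger = exhaustive_search_bound(2, 20)  # the same grammar size, input of length 20
```

For k >= 2 the sum equals k(k^(2|w|) - 1)/(k - 1), so it grows exponentially in the length of the input.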
65
There exist faster algorithms for specialized grammars
S-grammar: A -> ax
(a: symbol; x: string of variables)
Pair (A, a) appears once
66
S-grammar example:
S -> aS
S -> bSS
S -> c
S => aS => abSS => abcS => abcc
Each string has a unique derivation
67
For S-grammars:
In exhaustive search parsing there is only one choice in each phase
Time for a phase: 1
Total time for parsing string w: |w|
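Because each pair (A, a) appears once, the single choice in each phase can be found with a table lookup. The Python sketch below illustrates this with a stack of pending variables; the table layout (a dict keyed by (variable, terminal)) and the name `s_grammar_parse` are assumptions made for the example.

```python
def s_grammar_parse(rules, start, w):
    """rules maps (variable, terminal) to the string of variables x in A -> ax.
    Returns the sentential forms of the derivation of w, or None."""
    stack = [start]          # pending variables, leftmost on top
    consumed = ""
    derivation = [start]
    for ch in w:
        if not stack:
            return None      # input left over but nothing to expand
        var = stack.pop()
        if (var, ch) not in rules:
            return None      # no production var -> ch x
        stack.extend(reversed(rules[(var, ch)]))
        consumed += ch
        derivation.append(consumed + "".join(reversed(stack)))
    return derivation if not stack else None

rules = {("S", "a"): "S", ("S", "b"): "SS", ("S", "c"): ""}
steps = s_grammar_parse(rules, "S", "abcc")
```

Each input symbol triggers exactly one lookup, which is where the |w| bound comes from; recording the sentential forms, as done here for readability, adds extra string copying on top of that.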
68
For general context-free grammars:
There exists a parsing algorithm that parses a string w in time |w|^3
(we will show it in the next class)
69
The CYK Parser
70
The CYK Membership Algorithm
Input:
• Grammar G in Chomsky Normal Form
• String w
Output:
find if w ∈ L(G)
71
The Algorithm
Input example:
• Grammar G:
S -> AB
A -> BB
A -> a
B -> AB
B -> b
• String w: aabbb
72
Substrings of w = aabbb, by length:
length 1: a, a, b, b, b
length 2: aa, ab, bb, bb
length 3: aab, abb, bbb
length 4: aabb, abbb
length 5: aabbb
73
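S -> AB
A -> BB
A -> a
B -> AB
B -> b
length 1: a: {A}, a: {A}, b: {B}, b: {B}, b: {B}
length 2: aa, ab, bb, bb
length 3: aab, abb, bbb
length 4: aabb, abbb
length 5: aabbb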
74
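S -> AB
A -> BB
A -> a
B -> AB
B -> b
length 1: a: {A}, a: {A}, b: {B}, b: {B}, b: {B}
length 2: aa: (none), ab: {S, B}, bb: {A}, bb: {A}
length 3: aab, abb, bbb
length 4: aabb, abbb
length 5: aabbb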
75
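S -> AB
A -> BB
A -> a
B -> AB
B -> b
length 1: a: {A}, a: {A}, b: {B}, b: {B}, b: {B}
length 2: aa: (none), ab: {S, B}, bb: {A}, bb: {A}
length 3: aab: {S, B}, abb: {A}, bbb: {S, B}
length 4: aabb: {A}, abbb: {S, B}
length 5: aabbb: {S, B}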
76
Therefore: aabbb ∈ L(G)
(S appears in the set computed for the whole string aabbb)
Time Complexity: |w|^3
Observation:
The CYK algorithm can be easily converted to a parser (bottom-up parser)
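The table construction can be sketched in Python as follows. The representation is an assumption made for the illustration: the grammar is a dict from a variable to a list of right-hand sides, each either a single terminal or a two-character string of variables, and the table is keyed by (start position, substring length). The function name `cyk` is hypothetical.

```python
def cyk(grammar, w):
    """Return table[(i, length)] = set of variables deriving w[i:i+length]."""
    n = len(w)
    table = {}
    # Length 1: variables with a production A -> a.
    for i, ch in enumerate(w):
        table[(i, 1)] = {A for A, bodies in grammar.items() if ch in bodies}
    # Longer substrings: try every split into two shorter pieces.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            cell = set()
            for split in range(1, length):
                left = table[(i, split)]
                right = table[(i + split, length - split)]
                for A, bodies in grammar.items():
                    for body in bodies:
                        if len(body) == 2 and body[0] in left and body[1] in right:
                            cell.add(A)
            table[(i, length)] = cell
    return table

g = {"S": ["AB"], "A": ["BB", "a"], "B": ["AB", "b"]}
w = "aabbb"
table = cyk(g, w)
member = "S" in table[(0, len(w))]  # w is in L(G) if S derives the whole string
```

The three nested loops over length, start position and split point are what give the |w|^3 bound for a fixed grammar.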