syntax analysis - wordpress.com · 1. to check syntax (= string recognizer) – and to report...
TRANSCRIPT
Syntax AnalysisSyntax Analysis
1
Syntax Analysis
LexicalAnalyzerLexical
Analyzer
Parserand rest offront-end
Parserand rest offront-end
SourceProgram
Token,tokenval
Get nexttoken
Intermediaterepresentation
Position of a Parserin the Compiler Model
2
LexicalAnalyzer
Parserand rest offront-end
Parserand rest offront-end
Symbol TableSymbol Table
Get nexttoken
Lexical error Syntax errorSemantic error
The Role Of Parser
• A parser implements a C-F grammar• The role of the parser is twofold:1. To check syntax (= string recognizer)
– And to report syntax errors accurately
2. To invoke semantic actions– For static semantics checking, e.g. type checking of
expressions, functions, etc.– For syntax-directed translation of the source code to an
intermediate representation
• A parser implements a C-F grammar• The role of the parser is twofold:1. To check syntax (= string recognizer)
– And to report syntax errors accurately
2. To invoke semantic actions– For static semantics checking, e.g. type checking of
expressions, functions, etc.– For syntax-directed translation of the source code to an
intermediate representation
3
• A parser implements a C-F grammar• The role of the parser is twofold:1. To check syntax (= string recognizer)
– And to report syntax errors accurately
2. To invoke semantic actions– For static semantics checking, e.g. type checking of
expressions, functions, etc.– For syntax-directed translation of the source code to an
intermediate representation
• A parser implements a C-F grammar• The role of the parser is twofold:1. To check syntax (= string recognizer)
– And to report syntax errors accurately
2. To invoke semantic actions– For static semantics checking, e.g. type checking of
expressions, functions, etc.– For syntax-directed translation of the source code to an
intermediate representation
Syntax-Directed Translation
• One of the major roles of the parser is to produce anintermediate representation (IR) of the sourceprogram using syntax-directed translation methods
• Possible IR output:– Abstract syntax trees (ASTs)– Control-flow graphs (CFGs) with triples, three-address
code, or register transfer list notation– WHIRL (SGI Pro64 compiler) has 5 IR levels!
• One of the major roles of the parser is to produce anintermediate representation (IR) of the sourceprogram using syntax-directed translation methods
• Possible IR output:– Abstract syntax trees (ASTs)– Control-flow graphs (CFGs) with triples, three-address
code, or register transfer list notation– WHIRL (SGI Pro64 compiler) has 5 IR levels!
4
• One of the major roles of the parser is to produce anintermediate representation (IR) of the sourceprogram using syntax-directed translation methods
• Possible IR output:– Abstract syntax trees (ASTs)– Control-flow graphs (CFGs) with triples, three-address
code, or register transfer list notation– WHIRL (SGI Pro64 compiler) has 5 IR levels!
• One of the major roles of the parser is to produce anintermediate representation (IR) of the sourceprogram using syntax-directed translation methods
• Possible IR output:– Abstract syntax trees (ASTs)– Control-flow graphs (CFGs) with triples, three-address
code, or register transfer list notation– WHIRL (SGI Pro64 compiler) has 5 IR levels!
Error Handling
• A good compiler should assist in identifying andlocating errors– Lexical errors: important, compiler can easily recover and
continue– Syntax errors: most important for compiler, can almost
always recover– Static semantic errors: important, can sometimes recover– Dynamic semantic errors: hard or impossible to detect at
compile time, runtime checks are required– Logical errors: hard or impossible to detect
• A good compiler should assist in identifying andlocating errors– Lexical errors: important, compiler can easily recover and
continue– Syntax errors: most important for compiler, can almost
always recover– Static semantic errors: important, can sometimes recover– Dynamic semantic errors: hard or impossible to detect at
compile time, runtime checks are required– Logical errors: hard or impossible to detect
5
• A good compiler should assist in identifying andlocating errors– Lexical errors: important, compiler can easily recover and
continue– Syntax errors: most important for compiler, can almost
always recover– Static semantic errors: important, can sometimes recover– Dynamic semantic errors: hard or impossible to detect at
compile time, runtime checks are required– Logical errors: hard or impossible to detect
• A good compiler should assist in identifying andlocating errors– Lexical errors: important, compiler can easily recover and
continue– Syntax errors: most important for compiler, can almost
always recover– Static semantic errors: important, can sometimes recover– Dynamic semantic errors: hard or impossible to detect at
compile time, runtime checks are required– Logical errors: hard or impossible to detect
Viable-Prefix Property
• The viable-prefix property of LL/LR parsersallows early detection of syntax errors– Goal: detection of an error as soon as possible
without further consuming unnecessary input– How: detect an error as soon as the prefix of the
input does not match a prefix of any string in thelanguage
• The viable-prefix property of LL/LR parsersallows early detection of syntax errors– Goal: detection of an error as soon as possible
without further consuming unnecessary input– How: detect an error as soon as the prefix of the
input does not match a prefix of any string in thelanguage
6
• The viable-prefix property of LL/LR parsersallows early detection of syntax errors– Goal: detection of an error as soon as possible
without further consuming unnecessary input– How: detect an error as soon as the prefix of the
input does not match a prefix of any string in thelanguage
• The viable-prefix property of LL/LR parsersallows early detection of syntax errors– Goal: detection of an error as soon as possible
without further consuming unnecessary input– How: detect an error as soon as the prefix of the
input does not match a prefix of any string in thelanguage
…for (;)…
…DO 10 I = 1;0…
Error isdetected here
Error isdetected here
Prefix Prefix
Error Recovery Strategies
• Panic mode– Discard input until a token in a set of designated
synchronizing tokens is found• Phrase-level recovery
– Perform local correction on the input to repair the error• Error productions
– Augment grammar with productions for erroneousconstructs
• Global correction– Choose a minimal sequence of changes to obtain a global
least-cost correction
• Panic mode– Discard input until a token in a set of designated
synchronizing tokens is found• Phrase-level recovery
– Perform local correction on the input to repair the error• Error productions
– Augment grammar with productions for erroneousconstructs
• Global correction– Choose a minimal sequence of changes to obtain a global
least-cost correction
7
• Panic mode– Discard input until a token in a set of designated
synchronizing tokens is found• Phrase-level recovery
– Perform local correction on the input to repair the error• Error productions
– Augment grammar with productions for erroneousconstructs
• Global correction– Choose a minimal sequence of changes to obtain a global
least-cost correction
• Panic mode– Discard input until a token in a set of designated
synchronizing tokens is found• Phrase-level recovery
– Perform local correction on the input to repair the error• Error productions
– Augment grammar with productions for erroneousconstructs
• Global correction– Choose a minimal sequence of changes to obtain a global
least-cost correction
Grammars (Recap)
• Context-free grammar is a 4-tupleG = (N, T, P, S) where– T is a finite set of tokens (terminal symbols)– N is a finite set of nonterminals– P is a finite set of productions of the form
where (NT)* N (NT)* and (NT)*– S N is a designated start symbol
• Context-free grammar is a 4-tupleG = (N, T, P, S) where– T is a finite set of tokens (terminal symbols)– N is a finite set of nonterminals– P is a finite set of productions of the form
where (NT)* N (NT)* and (NT)*– S N is a designated start symbol
8
• Context-free grammar is a 4-tupleG = (N, T, P, S) where– T is a finite set of tokens (terminal symbols)– N is a finite set of nonterminals– P is a finite set of productions of the form
where (NT)* N (NT)* and (NT)*– S N is a designated start symbol
• Context-free grammar is a 4-tupleG = (N, T, P, S) where– T is a finite set of tokens (terminal symbols)– N is a finite set of nonterminals– P is a finite set of productions of the form
where (NT)* N (NT)* and (NT)*– S N is a designated start symbol
Notational Conventions Used
• Terminalsa,b,c,… Tspecific terminals: 0, 1, id, +
• NonterminalsA,B,C,… Nspecific nonterminals: expr, term, stmt
• Grammar symbolsX,Y,Z (NT)
• Strings of terminalsu,v,w,x,y,z T*
• Strings of grammar symbols,, (NT)*
• Terminalsa,b,c,… Tspecific terminals: 0, 1, id, +
• NonterminalsA,B,C,… Nspecific nonterminals: expr, term, stmt
• Grammar symbolsX,Y,Z (NT)
• Strings of terminalsu,v,w,x,y,z T*
• Strings of grammar symbols,, (NT)*
9
• Terminalsa,b,c,… Tspecific terminals: 0, 1, id, +
• NonterminalsA,B,C,… Nspecific nonterminals: expr, term, stmt
• Grammar symbolsX,Y,Z (NT)
• Strings of terminalsu,v,w,x,y,z T*
• Strings of grammar symbols,, (NT)*
• Terminalsa,b,c,… Tspecific terminals: 0, 1, id, +
• NonterminalsA,B,C,… Nspecific nonterminals: expr, term, stmt
• Grammar symbolsX,Y,Z (NT)
• Strings of terminalsu,v,w,x,y,z T*
• Strings of grammar symbols,, (NT)*
Derivations (Recap)
• The one-step derivation is defined by A
where A is a production in the grammar• In addition, we define
– is leftmostlm if does not contain a nonterminal– is rightmostrm if does not contain a nonterminal– Transitive closure* (zero or more steps)– Positive closure+ (one or more steps)
• The language generated by G is defined byL(G) = {w T* | S+ w}
• The one-step derivation is defined by A
where A is a production in the grammar• In addition, we define
– is leftmostlm if does not contain a nonterminal– is rightmostrm if does not contain a nonterminal– Transitive closure* (zero or more steps)– Positive closure+ (one or more steps)
• The language generated by G is defined byL(G) = {w T* | S+ w}
10
• The one-step derivation is defined by A
where A is a production in the grammar• In addition, we define
– is leftmostlm if does not contain a nonterminal– is rightmostrm if does not contain a nonterminal– Transitive closure* (zero or more steps)– Positive closure+ (one or more steps)
• The language generated by G is defined byL(G) = {w T* | S+ w}
• The one-step derivation is defined by A
where A is a production in the grammar• In addition, we define
– is leftmostlm if does not contain a nonterminal– is rightmostrm if does not contain a nonterminal– Transitive closure* (zero or more steps)– Positive closure+ (one or more steps)
• The language generated by G is defined byL(G) = {w T* | S+ w}
Derivation (Example)
Grammar G = ({E}, {+,*,(,),-,id}, P, E) withproductions P = E E + E
E E * EE ( E )E - EE id
Grammar G = ({E}, {+,*,(,),-,id}, P, E) withproductions P = E E + E
E E * EE ( E )E - EE id
11
Grammar G = ({E}, {+,*,(,),-,id}, P, E) withproductions P = E E + E
E E * EE ( E )E - EE id
Grammar G = ({E}, {+,*,(,),-,id}, P, E) withproductions P = E E + E
E E * EE ( E )E - EE id
E - E - idE - E - id
E* EE* E
E+ id * id + idE+ id * id + id
Erm E + Erm E + idrm id + idErm E + Erm E + idrm id + id
Example derivations:Example derivations:
E* id + idE* id + id
Chomsky Hierarchy: LanguageClassification
• A grammar G is said to be– Regular if it is right linear where each production is of the
formA w B or A w
or left linear where each production is of the formA B w or A w
– Context free if each production is of the formA
where A N and (NT)*– Context sensitive if each production is of the form A
where A N, ,, (NT)*, || > 0– Unrestricted
• A grammar G is said to be– Regular if it is right linear where each production is of the
formA w B or A w
or left linear where each production is of the formA B w or A w
– Context free if each production is of the formA
where A N and (NT)*– Context sensitive if each production is of the form A
where A N, ,, (NT)*, || > 0– Unrestricted
12
• A grammar G is said to be– Regular if it is right linear where each production is of the
formA w B or A w
or left linear where each production is of the formA B w or A w
– Context free if each production is of the formA
where A N and (NT)*– Context sensitive if each production is of the form A
where A N, ,, (NT)*, || > 0– Unrestricted
• A grammar G is said to be– Regular if it is right linear where each production is of the
formA w B or A w
or left linear where each production is of the formA B w or A w
– Context free if each production is of the formA
where A N and (NT)*– Context sensitive if each production is of the form A
where A N, ,, (NT)*, || > 0– Unrestricted
Chomsky Hierarchy
L(regular) L(context free) L(context sensitive) L(unrestricted)L(regular) L(context free) L(context sensitive) L(unrestricted)
Where L(T) = { L(G) | G is of type T }That is: the set of all languages
generated by grammars G of type T
Where L(T) = { L(G) | G is of type T }That is: the set of all languages
generated by grammars G of type T
13
Where L(T) = { L(G) | G is of type T }That is: the set of all languages
generated by grammars G of type T
Where L(T) = { L(G) | G is of type T }That is: the set of all languages
generated by grammars G of type T
L1 = { anbn | n 1 } is context freeL1 = { anbn | n 1 } is context free
L2 = { anbncn | n 1 } is context sensitiveL2 = { anbncn | n 1 } is context sensitive
Every finite language is regular! (construct a FSA for strings in L(G))Every finite language is regular! (construct a FSA for strings in L(G))
Examples:Examples:
Parsing
• Universal (any C-F grammar)– Cocke-Younger-Kasimi– Earley
• Top-down (C-F grammar with restrictions)– Recursive descent (predictive parsing)– LL (Left-to-right, Leftmost derivation) methods
• Bottom-up (C-F grammar with restrictions)– Operator precedence parsing– LR (Left-to-right, Rightmost derivation) methods
• SLR, canonical LR, LALR
• Universal (any C-F grammar)– Cocke-Younger-Kasimi– Earley
• Top-down (C-F grammar with restrictions)– Recursive descent (predictive parsing)– LL (Left-to-right, Leftmost derivation) methods
• Bottom-up (C-F grammar with restrictions)– Operator precedence parsing– LR (Left-to-right, Rightmost derivation) methods
• SLR, canonical LR, LALR
14
• Universal (any C-F grammar)– Cocke-Younger-Kasimi– Earley
• Top-down (C-F grammar with restrictions)– Recursive descent (predictive parsing)– LL (Left-to-right, Leftmost derivation) methods
• Bottom-up (C-F grammar with restrictions)– Operator precedence parsing– LR (Left-to-right, Rightmost derivation) methods
• SLR, canonical LR, LALR
• Universal (any C-F grammar)– Cocke-Younger-Kasimi– Earley
• Top-down (C-F grammar with restrictions)– Recursive descent (predictive parsing)– LL (Left-to-right, Leftmost derivation) methods
• Bottom-up (C-F grammar with restrictions)– Operator precedence parsing– LR (Left-to-right, Rightmost derivation) methods
• SLR, canonical LR, LALR
Top-Down Parsing
• LL methods (Left-to-right, Leftmost derivation)and recursive-descent parsing
• LL methods (Left-to-right, Leftmost derivation)and recursive-descent parsing
Grammar:E T + TT ( E )T - ET id
Leftmost derivation:Elm T + Tlm id + Tlm id + id
15
• LL methods (Left-to-right, Leftmost derivation)and recursive-descent parsing
Grammar:E T + TT ( E )T - ET id
Leftmost derivation:Elm T + Tlm id + Tlm id + id
E E
T
+
T
idid
E
TT
+
E
T
+
T
id
• Productions of the formA A
| |
are left recursive• When one of the productions in a grammar is
left recursive then a predictive parser loopsforever on certain inputs
• Productions of the formA A
| |
are left recursive• When one of the productions in a grammar is
left recursive then a predictive parser loopsforever on certain inputs
Left Recursion (Recap)
16
• Productions of the formA A
| |
are left recursive• When one of the productions in a grammar is
left recursive then a predictive parser loopsforever on certain inputs
• Productions of the formA A
| |
are left recursive• When one of the productions in a grammar is
left recursive then a predictive parser loopsforever on certain inputs
General Left Recursion EliminationMethod
Arrange the nonterminals in some order A1, A2, …, Anfor i = 1, …, n do
for j = 1, …, i-1 doreplace each
Ai Aj with
Ai 1 | 2 | … | k where
Aj 1 | 2 | … | kenddoeliminate the immediate left recursion in Ai
enddo
Arrange the nonterminals in some order A1, A2, …, Anfor i = 1, …, n do
for j = 1, …, i-1 doreplace each
Ai Aj with
Ai 1 | 2 | … | k where
Aj 1 | 2 | … | kenddoeliminate the immediate left recursion in Ai
enddo
17
Arrange the nonterminals in some order A1, A2, …, Anfor i = 1, …, n do
for j = 1, …, i-1 doreplace each
Ai Aj with
Ai 1 | 2 | … | k where
Aj 1 | 2 | … | kenddoeliminate the immediate left recursion in Ai
enddo
Arrange the nonterminals in some order A1, A2, …, Anfor i = 1, …, n do
for j = 1, …, i-1 doreplace each
Ai Aj with
Ai 1 | 2 | … | k where
Aj 1 | 2 | … | kenddoeliminate the immediate left recursion in Ai
enddo
Immediate Left-Recursion EliminationMethod
Rewrite every left-recursive productionA A
| | | A
into a right-recursive production:A AR
| ARAR AR
| AR|
Rewrite every left-recursive productionA A
| | | A
into a right-recursive production:A AR
| ARAR AR
| AR|
18
Rewrite every left-recursive productionA A
| | | A
into a right-recursive production:A AR
| ARAR AR
| AR|
Rewrite every left-recursive productionA A
| | | A
into a right-recursive production:A AR
| ARAR AR
| AR|
Example Left Recursion Elim.A B C | aB C A | A bC A B | C C | a
A B C | aB C A | A bC A B | C C | a
Choose arrangement: A, B, C
i = 1: nothing to doi = 2, j = 1: B C A | A b
B C A | B C b | a b(imm) B C A BR | a b BR
BR C b BR | i = 3, j = 1: C A B | C C | a
C B C B | a B | C C | ai = 3, j = 2: C B C B | a B | C C | a
C C A BR C B | a b BR C B | a B | C C | a(imm) C a b BR C B CR | a B CR | a CR
CR A BR C B CR | C CR |
i = 1: nothing to doi = 2, j = 1: B C A | A b
B C A | B C b | a b(imm) B C A BR | a b BR
BR C b BR | i = 3, j = 1: C A B | C C | a
C B C B | a B | C C | ai = 3, j = 2: C B C B | a B | C C | a
C C A BR C B | a b BR C B | a B | C C | a(imm) C a b BR C B CR | a B CR | a CR
CR A BR C B CR | C CR |
19
i = 1: nothing to doi = 2, j = 1: B C A | A b
B C A | B C b | a b(imm) B C A BR | a b BR
BR C b BR | i = 3, j = 1: C A B | C C | a
C B C B | a B | C C | ai = 3, j = 2: C B C B | a B | C C | a
C C A BR C B | a b BR C B | a B | C C | a(imm) C a b BR C B CR | a B CR | a CR
CR A BR C B CR | C CR |
i = 1: nothing to doi = 2, j = 1: B C A | A b
B C A | B C b | a b(imm) B C A BR | a b BR
BR C b BR | i = 3, j = 1: C A B | C C | a
C B C B | a B | C C | ai = 3, j = 2: C B C B | a B | C C | a
C C A BR C B | a b BR C B | a B | C C | a(imm) C a b BR C B CR | a B CR | a CR
CR A BR C B CR | C CR |
Left Factoring
• When a nonterminal has two or more productionswhose right-hand sides start with the same grammarsymbols, the grammar is not LL(1) and cannot beused for predictive parsing
• Replace productionsA 1 | 2 | … | n |
withA AR | AR1 | 2 | … | n
• When a nonterminal has two or more productionswhose right-hand sides start with the same grammarsymbols, the grammar is not LL(1) and cannot beused for predictive parsing
• Replace productionsA 1 | 2 | … | n |
withA AR | AR1 | 2 | … | n
20
• When a nonterminal has two or more productionswhose right-hand sides start with the same grammarsymbols, the grammar is not LL(1) and cannot beused for predictive parsing
• Replace productionsA 1 | 2 | … | n |
withA AR | AR1 | 2 | … | n
• When a nonterminal has two or more productionswhose right-hand sides start with the same grammarsymbols, the grammar is not LL(1) and cannot beused for predictive parsing
• Replace productionsA 1 | 2 | … | n |
withA AR | AR1 | 2 | … | n
Predictive Parsing
• Eliminate left recursion from grammar• Left factor the grammar• Compute FIRST and FOLLOW• Two variants:
– Recursive (recursive calls)– Non-recursive (table-driven)
• Eliminate left recursion from grammar• Left factor the grammar• Compute FIRST and FOLLOW• Two variants:
– Recursive (recursive calls)– Non-recursive (table-driven)
21
• Eliminate left recursion from grammar• Left factor the grammar• Compute FIRST and FOLLOW• Two variants:
– Recursive (recursive calls)– Non-recursive (table-driven)
• Eliminate left recursion from grammar• Left factor the grammar• Compute FIRST and FOLLOW• Two variants:
– Recursive (recursive calls)– Non-recursive (table-driven)
FIRST (Revisited)
• FIRST() = { the set of terminals that begin allstrings derived from }
FIRST(a) = {a} if a TFIRST() = {}FIRST(A) = A FIRST() for A PFIRST(X1X2…Xk) =
if for all j = 1, …, i-1 : FIRST(Xj) thenadd non- in FIRST(Xi) to FIRST(X1X2…Xk)
if for all j = 1, …, k : FIRST(Xj) thenadd to FIRST(X1X2…Xk)
• FIRST() = { the set of terminals that begin allstrings derived from }
FIRST(a) = {a} if a TFIRST() = {}FIRST(A) = A FIRST() for A PFIRST(X1X2…Xk) =
if for all j = 1, …, i-1 : FIRST(Xj) thenadd non- in FIRST(Xi) to FIRST(X1X2…Xk)
if for all j = 1, …, k : FIRST(Xj) thenadd to FIRST(X1X2…Xk)
22
• FIRST() = { the set of terminals that begin allstrings derived from }
FIRST(a) = {a} if a TFIRST() = {}FIRST(A) = A FIRST() for A PFIRST(X1X2…Xk) =
if for all j = 1, …, i-1 : FIRST(Xj) thenadd non- in FIRST(Xi) to FIRST(X1X2…Xk)
if for all j = 1, …, k : FIRST(Xj) thenadd to FIRST(X1X2…Xk)
• FIRST() = { the set of terminals that begin allstrings derived from }
FIRST(a) = {a} if a TFIRST() = {}FIRST(A) = A FIRST() for A PFIRST(X1X2…Xk) =
if for all j = 1, …, i-1 : FIRST(Xj) thenadd non- in FIRST(Xi) to FIRST(X1X2…Xk)
if for all j = 1, …, k : FIRST(Xj) thenadd to FIRST(X1X2…Xk)
FOLLOW• FOLLOW(A) = { the set of terminals that can
immediately follow nonterminal A }
FOLLOW(A) =for all (B A ) P do
add FIRST()\{} to FOLLOW(A)for all (B A ) P and FIRST() do
add FOLLOW(B) to FOLLOW(A)for all (B A) P do
add FOLLOW(B) to FOLLOW(A)if A is the start symbol S then
add $ to FOLLOW(A)
• FOLLOW(A) = { the set of terminals that canimmediately follow nonterminal A }
FOLLOW(A) =for all (B A ) P do
add FIRST()\{} to FOLLOW(A)for all (B A ) P and FIRST() do
add FOLLOW(B) to FOLLOW(A)for all (B A) P do
add FOLLOW(B) to FOLLOW(A)if A is the start symbol S then
add $ to FOLLOW(A)
23
• FOLLOW(A) = { the set of terminals that canimmediately follow nonterminal A }
FOLLOW(A) =for all (B A ) P do
add FIRST()\{} to FOLLOW(A)for all (B A ) P and FIRST() do
add FOLLOW(B) to FOLLOW(A)for all (B A) P do
add FOLLOW(B) to FOLLOW(A)if A is the start symbol S then
add $ to FOLLOW(A)
• FOLLOW(A) = { the set of terminals that canimmediately follow nonterminal A }
FOLLOW(A) =for all (B A ) P do
add FIRST()\{} to FOLLOW(A)for all (B A ) P and FIRST() do
add FOLLOW(B) to FOLLOW(A)for all (B A) P do
add FOLLOW(B) to FOLLOW(A)if A is the start symbol S then
add $ to FOLLOW(A)
First Set (2)
S aSeS BB bBeB CC cCeC d
Red : A Blue :
S aSeS BB bBeB CC cCeC d
G0
S aSeS BB bBeB CC cCeC d
G0
First Set (2)
• First (SaSe) = First(a) ={a}• First (SB) = First(B)• First (B bBe) = First(b)={b}• First (B C) = First(C)• First (C cCe) = First(c) ={c}• First (C d) = First(d)={d}
Red : A Blue :
Step 1:• First (SaSe) = First(a) ={a}• First (SB) = First(B)• First (B bBe) = First(b)={b}• First (B C) = First(C)• First (C cCe) = First(c) ={c}• First (C d) = First(d)={d}
S aSeS BB bBeB CC cCeC d
G0
First Set (2)
• First (SaSe) = {a}• First (SB) = First(B)• First (B bBe) = {b}• First (B C) = First(C)• First (C cCe) = {c}• First (C d) = {d}
Red : A Blue :
Step 1:• First (SaSe) = {a}• First (SB) = First(B)• First (B bBe) = {b}• First (B C) = First(C)• First (C cCe) = {c}• First (C d) = {d}
Step First Set
S B C a b c d
Step 1 {a}∪First(B) {b}∪First(C) {c, d}
S aSeS BB bBeB CC cCeC d
G0
First Set (2)
• First (SaSe) = {a}• First (SB) = First(B) = {b}∪First(C)• First (B bBe) = {b}• First (B C) = First(C)• First (C cCe) = {c}• First (C d) = {d}
Red : A Blue :
Step 2:• First (SaSe) = {a}• First (SB) = First(B) = {b}∪First(C)• First (B bBe) = {b}• First (B C) = First(C)• First (C cCe) = {c}• First (C d) = {d}
Step First Set
S B C a b c d
Step 1 {a}∪First(B) {b}∪First(C) {c, d}
S aSeS BB bBeB CC cCeC d
G0
First Set (2)
• First (SaSe) = {a}• First (SB) = {b}∪First(C)• First (B bBe) = {b}• First (B C) = First(C)• First (C cCe) = {c}• First (C d) = {d}
Red : A Blue :
Step 2:• First (SaSe) = {a}• First (SB) = {b}∪First(C)• First (B bBe) = {b}• First (B C) = First(C)• First (C cCe) = {c}• First (C d) = {d}
Step First Set
S B C a b c d
Step 1 {a}∪First(B) {b}∪First(C) {c, d}
Step 2 {a}∪ {b}∪First(C)
S aSeS BB bBeB CC cCeC d
G0
First Set (2)
• First (SaSe) = {a}• First (SB) = {b}∪First(C)• First (B bBe) = {b}• First (B C) = First(C) = {c, d}• First (C cCe) = {c}• First (C d) = {d}
Red : A Blue :
Step 2:• First (SaSe) = {a}• First (SB) = {b}∪First(C)• First (B bBe) = {b}• First (B C) = First(C) = {c, d}• First (C cCe) = {c}• First (C d) = {d}
Step First Set
S B C a b c d
Step 1 {a}∪First(B) {b}∪First(C) {c, d}
Step 2 {a}∪ {b}∪First(C)
S aSeS BB bBeB CC cCeC d
G0
First Set (2)
• First (SaSe) = {a}• First (SB) = {b}∪First(C)• First (B bBe) = {b}• First (B C) = {c, d}• First (C cCe) = {c}• First (C d) = {d}
Red : A Blue :
Step 2:• First (SaSe) = {a}• First (SB) = {b}∪First(C)• First (B bBe) = {b}• First (B C) = {c, d}• First (C cCe) = {c}• First (C d) = {d}
Step First Set
S B C a b c d
Step 1 {a}∪First(B) {b}∪First(C) {c, d}
Step 2 {a}∪ {b}∪First(C) {b}∪{c, d} = {b,c,d} {c, d}
S aSeS BB bBeB CC cCeC d
G0
First Set (2)
• First (SaSe) = {a}• First (SB) = {b}∪First(C) = {b}∪ {c, d}• First (B bBe) = {b}• First (B C) = {c, d}• First (C cCe) = {c}• First (C d) = {d}
Red : A Blue :
Step 3:• First (SaSe) = {a}• First (SB) = {b}∪First(C) = {b}∪ {c, d}• First (B bBe) = {b}• First (B C) = {c, d}• First (C cCe) = {c}• First (C d) = {d}
Step First Set
S B C a b c d
Step 1 {a}∪First(B) {b}∪First(C) {c, d}
Step 2 {a}∪ {b}∪First(C) {b}∪{c, d} = {b,c,d} {c, d}
S aSeS BB bBeB CC cCeC d
G0
First Set (2)
• First (SaSe) = {a}• First (SB) = {b, c, d}• First (B bBe) = {b}• First (B C) = {c, d}• First (C cCe) = {c}• First (C d) = {d}
Red : A Blue :
Step 3:• First (SaSe) = {a}• First (SB) = {b, c, d}• First (B bBe) = {b}• First (B C) = {c, d}• First (C cCe) = {c}• First (C d) = {d}
Step First Set
S B C a b c d
Step1
{a}∪First(B) {b}∪First(C) {c, d}
Step2
{a}∪ {b}∪First(C) {b}∪{c, d} = {b,c,d} {c, d}
Step3
{a}∪ {b}∪{c, d} = {a,b,c,d} {b}∪{c, d} = {b,c,d} {c, d}
S aSeS BB bBeB CC cCeC d
G0
First Set (2)
• First (SaSe) = {a}• First (SB) = {b, c, d}• First (B bBe) = {b}• First (B C) = {c, d}• First (C cCe) = {c}• First (C d) = {d}
Red : A Blue :
Step 3:
If no more change…The first set of a terminalsymbol is itself
• First (SaSe) = {a}• First (SB) = {b, c, d}• First (B bBe) = {b}• First (B C) = {c, d}• First (C cCe) = {c}• First (C d) = {d}
Step First Set
S B C a b c d
Step 1 {a}∪First(B) {b}∪First(C) {c, d}
Step 2 {a}∪ {b}∪First(C) {b}∪{c, d} = {b,c,d} {c, d}
Step 3 {a}∪ {b}∪{c, d} = {a,b,c,d} {b}∪{c, d} = {b,c,d} {c, d} {a} {b} {c} {d}
If no more change…The first set of a terminalsymbol is itself
Another Example….
First Set (2)
S ABcA aA B bB
G0
Red : A Blue :
S ABcA aA B bB
G0
First Set (2)S ABcA aA B bB
G0
Red : A Blue :
• First (SABc) = First(ABc)• First (Aa) = First(a)• First (A ) = First()∪First()• First (B b) = First(b)• First (B ) = First()∪First()
Step 1:• First (SABc) = First(ABc)• First (Aa) = First(a)• First (A ) = First()∪First()• First (B b) = First(b)• First (B ) = First()∪First()
First Set (2)S ABcA aA B bB
G0
Red : A Blue :
• First (SABc) = First(ABc)• First (Aa) = {a}• First (A ) = {}• First (B b) = {b}• First (B ) = {}
Step 1:• First (SABc) = First(ABc)• First (Aa) = {a}• First (A ) = {}• First (B b) = {b}• First (B ) = {}
Step First Set
S A B a b c
Step 1 First(ABc) {a, } {b, }
First Set (2)S ABcA aA B bB
G0
Red : A Blue :
• First (SABc) = First(ABc) = {a, }= {a, } - {} ∪ First(Bc)= {a} ∪ First(Bc)
• First (Aa) = {a}• First (A ) = {}• First (B b) = {b}• First (B ) = {}
Step 2:• First (SABc) = First(ABc) = {a, }
= {a, } - {} ∪ First(Bc)= {a} ∪ First(Bc)
• First (Aa) = {a}• First (A ) = {}• First (B b) = {b}• First (B ) = {} Step First Set
S A B a b c
Step 1 First(ABc) {a, } {b, }
Step 2 {a} ∪ First(Bc) {a, } {b, }
First Set (2)S ABcA aA B bB
G0
Red : A Blue :
• First (SABc) = {a} ∪ First(Bc)= {a} ∪{b, }= {a} ∪{b, } - {} ∪First(c)= {a} ∪{b,c}
• First (Aa) = {a}• First (A ) = {}• First (B b) = {b}• First (B ) = {}
Step 3:• First (SABc) = {a} ∪ First(Bc)
= {a} ∪{b, }= {a} ∪{b, } - {} ∪First(c)= {a} ∪{b,c}
• First (Aa) = {a}• First (A ) = {}• First (B b) = {b}• First (B ) = {}
Step First Set
S A B a b c
Step 1 First(ABc) {a, } {b, }
Step 2 {a} ∪ First(Bc) {a, } {b, }
Step 3 {a} ∪ {b, c}= {a,b,c} {a, } {b, }
First Set (2)S ABcA aA B bB
G0
Red : A Blue :
• First (SABc) = {a,b,c}• First (Aa) = {a}• First (A ) = {}• First (B b) = {b}• First (B ) = {}
Step 3:• First (SABc) = {a,b,c}• First (Aa) = {a}• First (A ) = {}• First (B b) = {b}• First (B ) = {}
Step First Set
S A B a b c
Step1
First(ABc) {a, } {b, }
Step2
{a} ∪ First(Bc) {a, } {b, }
Step3
{a} ∪ {b, c}= {a,b,c} {a, } {b, } {a} {b} {c}
If no more change…The first set of a terminalsymbol is itself
LL(1) Grammar
• A grammar G is LL(1) if it is not left recursive and foreach collection of productions
A1 | 2 | … | nfor nonterminal A the following holds:
1. FIRST(i) FIRST(j) = for all i j2. if i* then
2.a. j* for all i j2.b. FIRST(j) FOLLOW(A) =
for all i j
• A grammar G is LL(1) if it is not left recursive and foreach collection of productions
A1 | 2 | … | nfor nonterminal A the following holds:
1. FIRST(i) FIRST(j) = for all i j2. if i* then
2.a. j* for all i j2.b. FIRST(j) FOLLOW(A) =
for all i j
41
• A grammar G is LL(1) if it is not left recursive and foreach collection of productions
A1 | 2 | … | nfor nonterminal A the following holds:
1. FIRST(i) FIRST(j) = for all i j2. if i* then
2.a. j* for all i j2.b. FIRST(j) FOLLOW(A) =
for all i j
• A grammar G is LL(1) if it is not left recursive and foreach collection of productions
A1 | 2 | … | nfor nonterminal A the following holds:
1. FIRST(i) FIRST(j) = for all i j2. if i* then
2.a. j* for all i j2.b. FIRST(j) FOLLOW(A) =
for all i j
Non-LL(1) Examples
Grammar Not LL(1) because:S S a | a Left recursiveS a S | a FIRST(a S) FIRST(a)
42
S a S | a FIRST(a S) FIRST(a) S a R | R S | For R: S* and * S a R aR S |
For R:FIRST(S) FOLLOW(R)
Recursive Descent Parsing (Recap)
• Grammar must be LL(1)• Every nonterminal has one (recursive) procedure
responsible for parsing the nonterminal’s syntacticcategory of input tokens
• When a nonterminal has multiple productions, eachproduction is implemented in a branch of a selectionstatement based on input look-ahead information
• Grammar must be LL(1)• Every nonterminal has one (recursive) procedure
responsible for parsing the nonterminal’s syntacticcategory of input tokens
• When a nonterminal has multiple productions, eachproduction is implemented in a branch of a selectionstatement based on input look-ahead information
43
• Grammar must be LL(1)• Every nonterminal has one (recursive) procedure
responsible for parsing the nonterminal’s syntacticcategory of input tokens
• When a nonterminal has multiple productions, eachproduction is implemented in a branch of a selectionstatement based on input look-ahead information
• Grammar must be LL(1)• Every nonterminal has one (recursive) procedure
responsible for parsing the nonterminal’s syntacticcategory of input tokens
• When a nonterminal has multiple productions, eachproduction is implemented in a branch of a selectionstatement based on input look-ahead information
Using FIRST and FOLLOW to Write aRecursive Descent Parser
expr term restrest + term rest
| - term rest|
term id
expr term restrest + term rest
| - term rest|
term id
procedure rest();begin
if lookahead in FIRST(+ term rest) thenmatch(‘+’); term(); rest()
else if lookahead in FIRST(- term rest) thenmatch(‘-’); term(); rest()
else if lookahead in FOLLOW(rest) thenreturn
else error()end;
procedure rest();begin
if lookahead in FIRST(+ term rest) thenmatch(‘+’); term(); rest()
else if lookahead in FIRST(- term rest) thenmatch(‘-’); term(); rest()
else if lookahead in FOLLOW(rest) thenreturn
else error()end;
44
expr term restrest + term rest
| - term rest|
term id
expr term restrest + term rest
| - term rest|
term id
procedure rest();begin
if lookahead in FIRST(+ term rest) thenmatch(‘+’); term(); rest()
else if lookahead in FIRST(- term rest) thenmatch(‘-’); term(); rest()
else if lookahead in FOLLOW(rest) thenreturn
else error()end;
procedure rest();begin
if lookahead in FIRST(+ term rest) thenmatch(‘+’); term(); rest()
else if lookahead in FIRST(- term rest) thenmatch(‘-’); term(); rest()
else if lookahead in FOLLOW(rest) thenreturn
else error()end;
FIRST(+ term rest) = { + }FIRST(- term rest) = { - }FOLLOW(rest) = { $ }
FIRST(+ term rest) = { + }FIRST(- term rest) = { - }FOLLOW(rest) = { $ }
Non-Recursive Predictive Parsing:Table-Driven Parsing
• Given an LL(1) grammar G = (N, T, P, S)construct a table M[A,a] for A N, a T anduse a driver program with a stack
• Given an LL(1) grammar G = (N, T, P, S)construct a table M[A,a] for A N, a T anduse a driver program with a stack
45
• Given an LL(1) grammar G = (N, T, P, S)construct a table M[A,a] for A N, a T anduse a driver program with a stack
Predictive parsingprogram (driver)
Predictive parsingprogram (driver)
Parsing tableM
Parsing tableM
a + b $
XYZ$
stack
input
output
Constructing an LL(1) PredictiveParsing Table
for each production A do
for each a FIRST() doadd A to M[A,a]
enddo
if FIRST() then
for each b FOLLOW(A) doadd A to M[A,b]
enddoendif
enddoMark each undefined entry in M error
for each production A do
for each a FIRST() doadd A to M[A,a]
enddo
if FIRST() then
for each b FOLLOW(A) doadd A to M[A,b]
enddoendif
enddoMark each undefined entry in M error
46
for each production A do
for each a FIRST() doadd A to M[A,a]
enddo
if FIRST() then
for each b FOLLOW(A) doadd A to M[A,b]
enddoendif
enddoMark each undefined entry in M error
for each production A do
for each a FIRST() doadd A to M[A,a]
enddo
if FIRST() then
for each b FOLLOW(A) doadd A to M[A,b]
enddoendif
enddoMark each undefined entry in M error
Example Table
E T ERER + T ER | T F TRTR * F TR | F ( E ) | id
E T ERER + T ER | T F TRTR * F TR | F ( E ) | id
A FIRST() FOLLOW(A)E T ER ( id $ )
ER + T ER + $ )ER $ )
T F TR ( id + $ )TR * F TR * + $ )
TR + $ )F ( E ) ( * + $ )
47
id + * ( ) $E E T ER E T ER
ER ER + T ER ER ER
T T F TR T F TR
TR TR TR * F TR TR TR
F F id F ( E )
F ( E ) ( * + $ )F id id * + $ )
LL(1) Grammars are Unambiguous
Ambiguous grammarS i E t S SR | aSR e S | E b
Ambiguous grammarS i E t S SR | aSR e S | E b
A FIRST() FOLLOW(A)S i E t S SR i e $
S a a e $SR e S e e $
48
Ambiguous grammarS i E t S SR | aSR e S | E b
a b e i t $S S a S i E t S SR
SRSR
SR e S SR
E E b
SR e S e e $SR e $E b b t
Error: duplicate table entry
Predictive Parsing Program (Driver)push($)push(S)a := lookaheadrepeat
X := pop()if X is a terminal or X = $ then
match(X) // moves to next token and a := lookaheadelse if M[X,a] = X Y1Y2…Yk then
push(Yk, Yk-1, …, Y2, Y1) // such that Y1 is on top… invoke actions and/or produce IR output …
else error()endif
until X = $
push($)push(S)a := lookaheadrepeat
X := pop()if X is a terminal or X = $ then
match(X) // moves to next token and a := lookaheadelse if M[X,a] = X Y1Y2…Yk then
push(Yk, Yk-1, …, Y2, Y1) // such that Y1 is on top… invoke actions and/or produce IR output …
else error()endif
until X = $
49
push($)push(S)a := lookaheadrepeat
X := pop()if X is a terminal or X = $ then
match(X) // moves to next token and a := lookaheadelse if M[X,a] = X Y1Y2…Yk then
push(Yk, Yk-1, …, Y2, Y1) // such that Y1 is on top… invoke actions and/or produce IR output …
else error()endif
until X = $
push($)push(S)a := lookaheadrepeat
X := pop()if X is a terminal or X = $ then
match(X) // moves to next token and a := lookaheadelse if M[X,a] = X Y1Y2…Yk then
push(Yk, Yk-1, …, Y2, Y1) // such that Y1 is on top… invoke actions and/or produce IR output …
else error()endif
until X = $
Example Table-Driven ParsingStack$E$ERT$ERTRF$ERTRid$ERTR$ER$ERT+$ERT$ERTRF$ERTRid$ERTR$ERTRF*$ERTRF$ERTRid$ERTR$ER$
Stack$E$ERT$ERTRF$ERTRid$ERTR$ER$ERT+$ERT$ERTRF$ERTRid$ERTR$ERTRF*$ERTRF$ERTRid$ERTR$ER$
Inputid+id*id$id+id*id$id+id*id$id+id*id$
+id*id$+id*id$+id*id$
id*id$id*id$id*id$
*id$*id$
id$id$
$$$
Inputid+id*id$id+id*id$id+id*id$id+id*id$
+id*id$+id*id$+id*id$
id*id$id*id$id*id$
*id$*id$
id$id$
$$$
Production appliedE T ERT F TRF id
TR ER + T ER
T F TRF id
TR * F TR
F id
TR ER
Production appliedE T ERT F TRF id
TR ER + T ER
T F TRF id
TR * F TR
F id
TR ER
50
Stack$E$ERT$ERTRF$ERTRid$ERTR$ER$ERT+$ERT$ERTRF$ERTRid$ERTR$ERTRF*$ERTRF$ERTRid$ERTR$ER$
Stack$E$ERT$ERTRF$ERTRid$ERTR$ER$ERT+$ERT$ERTRF$ERTRid$ERTR$ERTRF*$ERTRF$ERTRid$ERTR$ER$
Inputid+id*id$id+id*id$id+id*id$id+id*id$
+id*id$+id*id$+id*id$
id*id$id*id$id*id$
*id$*id$
id$id$
$$$
Inputid+id*id$id+id*id$id+id*id$id+id*id$
+id*id$+id*id$+id*id$
id*id$id*id$id*id$
*id$*id$
id$id$
$$$
Production appliedE T ERT F TRF id
TR ER + T ER
T F TRF id
TR * F TR
F id
TR ER
Production appliedE T ERT F TRF id
TR ER + T ER
T F TRF id
TR * F TR
F id
TR ER
Panic Mode Recovery
FOLLOW(E) = { ) $ }FOLLOW(ER) = { ) $ }FOLLOW(T) = { + ) $ }FOLLOW(TR) = { + ) $ }FOLLOW(F) = { + * ) $ }
FOLLOW(E) = { ) $ }FOLLOW(ER) = { ) $ }FOLLOW(T) = { + ) $ }FOLLOW(TR) = { + ) $ }FOLLOW(F) = { + * ) $ }
Add synchronizing actions toundefined entries based on FOLLOWAdd synchronizing actions toundefined entries based on FOLLOW
Pro: Can be automatedCons: Error messages are neededPro: Can be automatedCons: Error messages are needed
51
id + * ( ) $E E T ER E T ER synch synchER ER + T ER ER ER
T T F TR synch T F TR synch synchTR TR TR * F TR TR TR
F F id synch synch F ( E ) synch synch
FOLLOW(E) = { ) $ }FOLLOW(ER) = { ) $ }FOLLOW(T) = { + ) $ }FOLLOW(TR) = { + ) $ }FOLLOW(F) = { + * ) $ }
synch: the driver pops current nonterminal A and skips input tillsynch token or skips input until one of FIRST(A) is found
Pro: Can be automatedCons: Error messages are needed
Phrase-Level Recovery
Change input stream by inserting missing tokensFor example: id id is changed into id * idChange input stream by inserting missing tokensFor example: id id is changed into id * id
Can then continue here
Pro: Can be automatedCons: Recovery not always intuitivePro: Can be automatedCons: Recovery not always intuitive
52
id + * ( ) $E E T ER E T ER synch synchER ER + T ER ER ER
T T F TR synch T F TR synch synchTR insert * TR TR * F TR TR TR
F F id synch synch F ( E ) synch synch
insert *: driver inserts missing * and retries the production
Can then continue here
Error Productions
E T ERER + T ER | T F TRTR * F TR | F ( E ) | id
E T ERER + T ER | T F TRTR * F TR | F ( E ) | id
Add “error production”:TR F TR
to ignore missing *, e.g.: id id
Add “error production”:TR F TR
to ignore missing *, e.g.: id id
Pro: Powerful recovery methodCons: Cannot be automatedPro: Powerful recovery methodCons: Cannot be automated
53
id + * ( ) $E E T ER E T ER synch synchER ER + T ER ER ER
T T F TR synch T F TR synch synchTR TR F TR TR TR * F TR TR TR
F F id synch synch F ( E ) synch synch
E T ERER + T ER | T F TRTR * F TR | F ( E ) | id Pro: Powerful recovery method
Cons: Cannot be automatedPro: Powerful recovery methodCons: Cannot be automated
Bottom-Up Parsing
• LR methods (Left-to-right, Rightmostderivation)– SLR, Canonical LR, LALR
• Other special cases:– Shift-reduce parsing– Operator-precedence parsing
• LR methods (Left-to-right, Rightmostderivation)– SLR, Canonical LR, LALR
• Other special cases:– Shift-reduce parsing– Operator-precedence parsing
54
• LR methods (Left-to-right, Rightmostderivation)– SLR, Canonical LR, LALR
• Other special cases:– Shift-reduce parsing– Operator-precedence parsing
• LR methods (Left-to-right, Rightmostderivation)– SLR, Canonical LR, LALR
• Other special cases:– Shift-reduce parsing– Operator-precedence parsing
Operator-Precedence Parsing
• Special case of shift-reduce parsing• We will not further discuss (you can skip
textbook section 4.6)
• Special case of shift-reduce parsing• We will not further discuss (you can skip
textbook section 4.6)
55
• Special case of shift-reduce parsing• We will not further discuss (you can skip
textbook section 4.6)
Shift-Reduce Parsing
Grammar:S a A B eA A b c | bB d
Grammar:S a A B eA A b c | bB d
Shift-reduce correspondsto a rightmost derivation:Srm a A B erm a A d erm a A b c d erm a b b c d e
Shift-reduce correspondsto a rightmost derivation:Srm a A B erm a A d erm a A b c d erm a b b c d e
Reducing a sentence:a b b c d ea A b c d ea A d ea A B eS
Reducing a sentence:a b b c d ea A b c d ea A d ea A B eS
56
Grammar:S a A B eA A b c | bB d
Shift-reduce correspondsto a rightmost derivation:Srm a A B erm a A d erm a A b c d erm a b b c d e
Shift-reduce correspondsto a rightmost derivation:Srm a A B erm a A d erm a A b c d erm a b b c d e
Reducing a sentence:a b b c d ea A b c d ea A d ea A B eS
Reducing a sentence:a b b c d ea A b c d ea A d ea A B eS
S
a b b c d e
A
A
B
a b b c d e
A
A
B
a b b c d e
A
A
a b b c d e
A
These matchproduction’s
right-hand sides
These matchproduction’s
right-hand sides
Handles
Grammar:S a A B eA A b c | bB d
Grammar:S a A B eA A b c | bB d
A handle is a substring of grammar symbols in aright-sentential form that matches a right-hand side
of a production
A handle is a substring of grammar symbols in aright-sentential form that matches a right-hand side
of a productiona b b c d ea A b c d ea A d ea A B eS
a b b c d ea A b c d ea A d ea A B eS
57
Handle
Grammar:S a A B eA A b c | bB d
Grammar:S a A B eA A b c | bB d
NOT a handle, becausefurther reductions will fail
(result is not a sentential form)
a b b c d ea A b c d ea A A e… ?
a b b c d ea A b c d ea A A e… ?
a b b c d ea A b c d ea A d ea A B eS
a b b c d ea A b c d ea A d ea A B eS
Stack Implementation ofShift-Reduce Parsing
Stack$$id$E$E+$E+id$E+E$E+E*$E+E*id$E+E*E$E+E$E
Stack$$id$E$E+$E+id$E+E$E+E*$E+E*id$E+E*E$E+E$E
Inputid+id*id$
+id*id$+id*id$
id*id$*id$*id$
id$$$$$
Inputid+id*id$
+id*id$+id*id$
id*id$*id$*id$
id$$$$$
Actionshiftreduce E idshiftshiftreduce E idshift (or reduce?)shiftreduce E idreduce E E * Ereduce E E + Eaccept
Actionshiftreduce E idshiftshiftreduce E idshift (or reduce?)shiftreduce E idreduce E E * Ereduce E E + Eaccept
Grammar:E E + EE E * EE ( E )E id
Grammar:E E + EE E * EE ( E )E id
How toresolve
conflicts?
58
Stack$$id$E$E+$E+id$E+E$E+E*$E+E*id$E+E*E$E+E$E
Stack$$id$E$E+$E+id$E+E$E+E*$E+E*id$E+E*E$E+E$E
Inputid+id*id$
+id*id$+id*id$
id*id$*id$*id$
id$$$$$
Inputid+id*id$
+id*id$+id*id$
id*id$*id$*id$
id$$$$$
Actionshiftreduce E idshiftshiftreduce E idshift (or reduce?)shiftreduce E idreduce E E * Ereduce E E + Eaccept
Actionshiftreduce E idshiftshiftreduce E idshift (or reduce?)shiftreduce E idreduce E E * Ereduce E E + Eaccept
Grammar:E E + EE E * EE ( E )E id
Grammar:E E + EE E * EE ( E )E id
Find handlesto reduce
How toresolve
conflicts?
Conflicts
• Shift-reduce and reduce-reduce conflicts arecaused by– The limitations of the LR parsing method (even
when the grammar is unambiguous)– Ambiguity of the grammar
• Shift-reduce and reduce-reduce conflicts arecaused by– The limitations of the LR parsing method (even
when the grammar is unambiguous)– Ambiguity of the grammar
59
• Shift-reduce and reduce-reduce conflicts arecaused by– The limitations of the LR parsing method (even
when the grammar is unambiguous)– Ambiguity of the grammar
• Shift-reduce and reduce-reduce conflicts arecaused by– The limitations of the LR parsing method (even
when the grammar is unambiguous)– Ambiguity of the grammar
Shift-Reduce Parsing:Shift-Reduce Conflicts
Stack$…$…if E then S
Stack$…$…if E then S
Input…$
else…$
Input…$
else…$
Action…shift or reduce?
Action…shift or reduce?
Ambiguous grammar:S if E then S
| if E then S else S| other
Ambiguous grammar:S if E then S
| if E then S else S| other
60
Stack$…$…if E then S
Input…$
else…$
Action…shift or reduce?
Ambiguous grammar:S if E then S
| if E then S else S| other
Ambiguous grammar:S if E then S
| if E then S else S| other
Resolve in favorof shift, so else
matches closest if
Resolve in favorof shift, so else
matches closest if
Shift-Reduce Parsing:Reduce-Reduce Conflicts
Stack$$a
Stack$$a
Inputaa$
a$
Inputaa$
a$
Actionshiftreduce A a or B a ?
Actionshiftreduce A a or B a ?
Grammar:C A BA aB a
Grammar:C A BA aB a
61
Stack$$a
Inputaa$
a$
Actionshiftreduce A a or B a ?
Grammar:C A BA aB a
Grammar:C A BA aB a
Resolve in favorof reduce A a,
otherwise we’re stuck!
Resolve in favorof reduce A a,
otherwise we’re stuck!
LR(k) Parsers: Use a DFA forShift/Reduce Decisions
1
2
4
5
0start
a
AC B
aState I1:S C•State I1:S C• State I4:
C A B•State I4:C A B•goto(I0,C)
62
5
3a
a
Grammar:S CC A BA aB a
Grammar:S CC A BA aB a
State I0:S •CC •A BA •a
State I0:S •CC •A BA •a
State I1:S C•
State I2:C A•BB •a
State I2:C A•BB •a
State I3:A a•State I3:A a•
State I4:C A B•State I4:C A B•
State I5:B a•State I5:B a•
goto(I0,C)
goto(I0,a)
goto(I0,A)
goto(I2,a)
goto(I2,B)
Can onlyreduce A a(not B a)
DFA for Shift/Reduce Decisions
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Inputaa$aa$
a$a$
$$$
Inputaa$aa$
a$a$
$$$
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Grammar:S CC A BA aB a
Grammar:S CC A BA aB a
The states of the DFA are used to determineif a handle is on top of the stack
The states of the DFA are used to determineif a handle is on top of the stack
63
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Inputaa$aa$
a$a$
$$$
Inputaa$aa$
a$a$
$$$
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Grammar:S CC A BA aB a
Grammar:S CC A BA aB a
State I0:S •CC •A BA •a
State I0:S •CC •A BA •a
State I3:A a•State I3:A a•
goto(I0,a)goto(I0,a)
DFA for Shift/Reduce Decisions
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Inputaa$aa$
a$a$
$$$
Inputaa$aa$
a$a$
$$$
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Grammar:S CC A BA aB a
Grammar:S CC A BA aB a
The states of the DFA are used to determineif a handle is on top of the stack
The states of the DFA are used to determineif a handle is on top of the stack
64
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Inputaa$aa$
a$a$
$$$
Inputaa$aa$
a$a$
$$$
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Grammar:S CC A BA aB a
Grammar:S CC A BA aB a
State I0:S •CC •A BA •a
State I0:S •CC •A BA •a
State I2:C A•BB •a
State I2:C A•BB •a
goto(I0,A)goto(I0,A)
DFA for Shift/Reduce Decisions
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Inputaa$aa$
a$a$
$$$
Inputaa$aa$
a$a$
$$$
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Grammar:S CC A BA aB a
The states of the DFA are used to determineif a handle is on top of the stack
The states of the DFA are used to determineif a handle is on top of the stack
65
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Inputaa$aa$
a$a$
$$$
Inputaa$aa$
a$a$
$$$
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Grammar:S CC A BA aB a
State I2:C A•BB •a
State I2:C A•BB •a
State I5:B a•State I5:B a•
goto(I2,a)goto(I2,a)
DFA for Shift/Reduce Decisions
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Inputaa$aa$
a$a$
$$$
Inputaa$aa$
a$a$
$$$
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Grammar:S CC A BA aB a
Grammar:S CC A BA aB a
The states of the DFA are used to determineif a handle is on top of the stack
The states of the DFA are used to determineif a handle is on top of the stack
66
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Inputaa$aa$
a$a$
$$$
Inputaa$aa$
a$a$
$$$
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Grammar:S CC A BA aB a
Grammar:S CC A BA aB a
State I2:C A•BB •a
State I2:C A•BB •a
State I4:C A B•State I4:C A B•
goto(I2,B)goto(I2,B)
DFA for Shift/Reduce Decisions
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Inputaa$aa$
a$a$
$$$
Inputaa$aa$
a$a$
$$$
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Grammar:S CC A BA aB a
Grammar:S CC A BA aB a
The states of the DFA are used to determineif a handle is on top of the stack
The states of the DFA are used to determineif a handle is on top of the stack
67
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Inputaa$aa$
a$a$
$$$
Inputaa$aa$
a$a$
$$$
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Grammar:S CC A BA aB a
Grammar:S CC A BA aB a
State I0:S •CC •A BA •a
State I0:S •CC •A BA •a
State I1:S C•State I1:S C•
goto(I0,C)goto(I0,C)
DFA for Shift/Reduce Decisions
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Inputaa$aa$
a$a$
$$$
Inputaa$aa$
a$a$
$$$
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Grammar:S CC A BA aB a
Grammar:S CC A BA aB a
The states of the DFA are used to determineif a handle is on top of the stack
The states of the DFA are used to determineif a handle is on top of the stack
68
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Stack$ 0$ 0$ 0 a 3$ 0 A 2$ 0 A 2 a 5$ 0 A 2 B 4$ 0 C 1
Inputaa$aa$
a$a$
$$$
Inputaa$aa$
a$a$
$$$
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Actionstart in state 0shift (and goto state 3)reduce A a (goto 2)shift (goto 5)reduce B a (goto 4)reduce C AB (goto 1)accept (S C)
Grammar:S CC A BA aB a
Grammar:S CC A BA aB a
State I0:S •CC •A BA •a
State I0:S •CC •A BA •a
State I1:S C•State I1:S C•
goto(I0,C)
Model of an LR ParserModel of an LR Parser
Sm
Xm
Sm-1
Xm-1
a1 ... ai ... an $
LR Parsing Algorithm
stack
input
output
69
Xm-1
.
.S1
X1
S0
Action Tableterminals and $
st four differenta actionstes
Goto Tablenon-terminal
st each item isa a state numbertes
A Configuration of LR ParsingAlgorithm
• A configuration of a LR parsing is:
( So X1 S1 ... Xm Sm, ai ai+1 ... an $ )
Stack Rest of Input
• Sm and ai decides the parser action by consulting the parsing action table.(Initial Stack contains just So )
• A configuration of a LR parsing represents the right sentential form:
X1 ... Xm ai ai+1 ... an $
• A configuration of a LR parsing is:
( So X1 S1 ... Xm Sm, ai ai+1 ... an $ )
Stack Rest of Input
• Sm and ai decides the parser action by consulting the parsing action table.(Initial Stack contains just So )
• A configuration of a LR parsing represents the right sentential form:
X1 ... Xm ai ai+1 ... an $
Actions of A LR-Parser1. shift s -- shifts the next input symbol and the state s onto the stack
( So X1 S1 ... Xm Sm, ai ai+1 ... an $ ) ( So X1 S1 ... Xm Sm ai s, ai+1 ... an $ )
2. reduce A (or rn where n is a production number)– pop 2|| (=r) items from the stack;– then push A and s where s=goto[sm-r,A]
( So X1 S1 ... Xm Sm, ai ai+1 ... an $ ) ( So X1 S1 ... Xm-r Sm-r A s, ai ... an $ )
– Output is the reducing production reduce A
3. Accept – Parsing successfully completed
4. Error -- Parser detected an error (an empty entry in the action table)
1. shift s -- shifts the next input symbol and the state s onto the stack( So X1 S1 ... Xm Sm, ai ai+1 ... an $ ) ( So X1 S1 ... Xm Sm ai s, ai+1 ... an $ )
2. reduce A (or rn where n is a production number)– pop 2|| (=r) items from the stack;– then push A and s where s=goto[sm-r,A]
( So X1 S1 ... Xm Sm, ai ai+1 ... an $ ) ( So X1 S1 ... Xm-r Sm-r A s, ai ... an $ )
– Output is the reducing production reduce A
3. Accept – Parsing successfully completed
4. Error -- Parser detected an error (an empty entry in the action table)
Reduce ActionReduce Action
• pop 2|| (=r) items from the stack; let usassume that = Y1Y2...Yr
• then push A and s where s=goto[sm-r,A]
( So X1 S1 ... Xm-r Sm-r Y1 Sm-r+1 ...Yr Sm, ai ai+1 ... an $ ) ( So X1 S1 ... Xm-r Sm-r A s, ai ... an $ )
• In fact, Y1Y2...Yr is a handle.
X1 ... Xm-r A ai ... an $ X1 ... Xm Y1...Yr ai ai+1 ... an $
• pop 2|| (=r) items from the stack; let usassume that = Y1Y2...Yr
• then push A and s where s=goto[sm-r,A]
( So X1 S1 ... Xm-r Sm-r Y1 Sm-r+1 ...Yr Sm, ai ai+1 ... an $ ) ( So X1 S1 ... Xm-r Sm-r A s, ai ... an $ )
• In fact, Y1Y2...Yr is a handle.
X1 ... Xm-r A ai ... an $ X1 ... Xm Y1...Yr ai ai+1 ... an $
(SLR) Parsing Tables for ExpressionGrammar
(SLR) Parsing Tables for ExpressionGrammar
state id + * ( ) $ E T F
0 s5 s4 1 2 3
1 s6 acc
2 r2 s7 r2 r2
3 r4 r4 r4 r4
4 s5 s4 8 2 3
Action Table Goto Table
1) E E+T2) E T3) T T*F4) T F5) F (E)6) F id
1) E E+T2) E T3) T T*F4) T F5) F (E)6) F id
4 s5 s4 8 2 3
5 r6 r6 r6 r6
6 s5 s4 9 3
7 s5 s4 10
8 s6 s11
9 r1 s7 r1 r1
10 r3 r3 r3 r3
11 r5 r5 r5 r5
1) E E+T2) E T3) T T*F4) T F5) F (E)6) F id
Actions of A (S)LR-Parser -- Examplestack input action output0 id*id+id$ shift 50id5 *id+id$ reduce by Fid Fid0F3 *id+id$ reduce by TF TF0T2 *id+id$ shift 70T2*7 id+id$ shift 50T2*7id5 +id$ reduce by Fid Fid0T2*7F10 +id$ reduce by TT*FTT*F0T2 +id$ reduce by ET ET0E1 +id$ shift 60E1+6 id$ shift 50E1+6id5 $ reduce by Fid Fid0E1+6F3 $ reduce by TF TF0E1+6T9$ reduce by EE+T EE+T0E1 $ accept
stack input action output0 id*id+id$ shift 50id5 *id+id$ reduce by Fid Fid0F3 *id+id$ reduce by TF TF0T2 *id+id$ shift 70T2*7 id+id$ shift 50T2*7id5 +id$ reduce by Fid Fid0T2*7F10 +id$ reduce by TT*FTT*F0T2 +id$ reduce by ET ET0E1 +id$ shift 60E1+6 id$ shift 50E1+6id5 $ reduce by Fid Fid0E1+6F3 $ reduce by TF TF0E1+6T9$ reduce by EE+T EE+T0E1 $ accept
SLR Grammars
• SLR (Simple LR): a simple extension of LR(0)shift-reduce parsing
• SLR eliminates some conflicts by populatingthe parsing table with reductions A onsymbols in FOLLOW(A)
75
• SLR (Simple LR): a simple extension of LR(0)shift-reduce parsing
• SLR eliminates some conflicts by populatingthe parsing table with reductions A onsymbols in FOLLOW(A)
S EE id + EE id
State I0:S •EE •id + EE •id
State I2:E id•+ EE id•
goto(I0,id) goto(I3,+)
FOLLOW(E)={$}thus reduce on $
Shift on +
SLR Parsing Table
id + $ E
• Reductions do not fill entire rows• Otherwise the same as LR(0)
76
s2acc
s3 r3s2
r2
id + $01234
E1
4
1. S E2. E id + E3. E id
FOLLOW(E)={$}thus reduce on $
Shift on +
SLR Parsing
• An LR(0) state is a set of LR(0) items• An LR(0) item is a production with a • (dot) in the
right-hand side• Build the LR(0) DFA by
– Closure operation to construct LR(0) items– Goto operation to determine transitions
• Construct the SLR parsing table from the DFA• LR parser program uses the SLR parsing table to
determine shift/reduce operations
77
• An LR(0) state is a set of LR(0) items• An LR(0) item is a production with a • (dot) in the
right-hand side• Build the LR(0) DFA by
– Closure operation to construct LR(0) items– Goto operation to determine transitions
• Construct the SLR parsing table from the DFA• LR parser program uses the SLR parsing table to
determine shift/reduce operations
LR(0) Items of a Grammar
• An LR(0) item of a grammar G is a production of Gwith a • at some position of the right-hand side
• Thus, a productionA X Y Z
has four items:[A • X Y Z][A X • Y Z][A X Y • Z][A X Y Z •]
• Note that production A has one item [A •]
78
• An LR(0) item of a grammar G is a production of Gwith a • at some position of the right-hand side
• Thus, a productionA X Y Z
has four items:[A • X Y Z][A X • Y Z][A X Y • Z][A X Y Z •]
• Note that production A has one item [A •]
Constructing the set of LR(0) Items of aGrammar
1. The grammar is augmented with a new startsymbol S’ and production S’S
2. Initially, set C = closure({[S’•S]})(this is the start state of the DFA)
3. For each set of items I C and each grammarsymbol X (NT) such that goto(I,X) C andgoto(I,X) , add the set of items goto(I,X) to C
4. Repeat 3 until no more sets can be added to C
79
1. The grammar is augmented with a new startsymbol S’ and production S’S
2. Initially, set C = closure({[S’•S]})(this is the start state of the DFA)
3. For each set of items I C and each grammarsymbol X (NT) such that goto(I,X) C andgoto(I,X) , add the set of items goto(I,X) to C
4. Repeat 3 until no more sets can be added to C
The Closure Operation for LR(0) Items
1. Initially, every LR(0) item in I is added toclosure(I)
2. If [A•B] closure(I) then for eachproduction B in the grammar, add theitem [B•] to I if not already in I
3. Repeat 2 until no new items can be added
80
1. Initially, every LR(0) item in I is added toclosure(I)
2. If [A•B] closure(I) then for eachproduction B in the grammar, add theitem [B•] to I if not already in I
3. Repeat 2 until no new items can be added
The Closure Operation (Example)
{ [E’ • E] }
closure({[E’ •E]}) =
{ [E’ • E][E • E + T][E • T] }
{ [E’ • E][E • E + T][E • T][T • T * F][T • F] }
{ [E’ • E][E • E + T][E • T][T • T * F][T • F][F • ( E )][F • id] }
81
Grammar:E E + T | TT T * F | FF ( E )F id
{ [E’ • E][E • E + T][E • T][T • T * F][T • F] }
{ [E’ • E][E • E + T][E • T][T • T * F][T • F][F • ( E )][F • id] }
Add [E•]Add [T•]
Add [F•]
The Goto Operation for LR(0) Items
1. For each item [A•X] I, add the set ofitems closure({[AX•]}) to goto(I,X) if notalready there
2. Repeat step 1 until no more items can beadded to goto(I,X)
3. Intuitively, goto(I,X) is the set of items thatare valid for the viable prefix X when I is theset of items that are valid for
82
1. For each item [A•X] I, add the set ofitems closure({[AX•]}) to goto(I,X) if notalready there
2. Repeat step 1 until no more items can beadded to goto(I,X)
3. Intuitively, goto(I,X) is the set of items thatare valid for the viable prefix X when I is theset of items that are valid for
The Goto Operation (Example 1)
Suppose I = Then goto(I,E)= closure({[E’ E •, E E • + T]})=
{ [E’ E •][E E • + T] }
{ [E’ • E][E • E + T][E • T][T • T * F][T • F][F • ( E )][F • id] }
83
{ [E’ E •][E E • + T] }
Grammar:E E + T | TT T * F | FF ( E )F id
{ [E’ • E][E • E + T][E • T][T • T * F][T • F][F • ( E )][F • id] }
The Goto Operation (Example 2)
Suppose I = { [E’ E •], [E E • + T] }
Then goto(I,+) = closure({[E E + • T]}) = { [E E + • T][T • T * F][T • F][F • ( E )][F • id] }
84
{ [E E + • T][T • T * F][T • F][F • ( E )][F • id] }
Grammar:E E + T | TT T * F | FF ( E )F id
Constructing SLR Parsing Tables1. Augment the grammar with S’S2. Construct the set C={I0,I1,…,In} of LR(0) items3. If [A•a] Ii and goto(Ii,a)=Ij then set
action[i,a]=shift j4. If [A•] Ii then set action[i,a]=reduce A for
all a FOLLOW(A) (apply only if AS’)5. If [S’S•] is in Ii then set action[i,$]=accept6. If goto(Ii,A)=Ij then set goto[i,A]=j7. Repeat 3-6 until no more entries added8. The initial state i is the Ii holding item [S’•S]
85
1. Augment the grammar with S’S2. Construct the set C={I0,I1,…,In} of LR(0) items3. If [A•a] Ii and goto(Ii,a)=Ij then set
action[i,a]=shift j4. If [A•] Ii then set action[i,a]=reduce A for
all a FOLLOW(A) (apply only if AS’)5. If [S’S•] is in Ii then set action[i,$]=accept6. If goto(Ii,A)=Ij then set goto[i,A]=j7. Repeat 3-6 until no more entries added8. The initial state i is the Ii holding item [S’•S]
The Canonical LR(0) Collection --Example
I0: E’ .E I1: E’ E. I6: E E+.T I9: E E+T.E .E+T E E.+T T .T*F T T.*FE .T T .FT .T*F I2: E T. F .(E) I10: T T*F.T .F T T.*F F .idF .(E)F .id I3: T F. I7: T T*.F I11: F (E).
F .(E)I4: F (.E) F .id
E .E+TE .T I8: F (E.)T .T*F E E.+TT .FF .(E)F .id
I5: F id.
I0: E’ .E I1: E’ E. I6: E E+.T I9: E E+T.E .E+T E E.+T T .T*F T T.*FE .T T .FT .T*F I2: E T. F .(E) I10: T T*F.T .F T T.*F F .idF .(E)F .id I3: T F. I7: T T*.F I11: F (E).
F .(E)I4: F (.E) F .id
E .E+TE .T I8: F (E.)T .T*F E E.+TT .FF .(E)F .id
I5: F id.
Transition Diagram (DFA) of GotoFunction
I0 I1
I2
I3
I4
I5
I6
I7
I8to I2to I3to I4
I9to I3to I4to I5
I10to I4to I5
I11to I6
to I7
id
(F
*
E T
T
FF
*+I1
I2
I3
I4
I5
I6
I7
I8to I2to I3to I4
I9to I3to I4to I5
I10to I4to I5
I11to I6E
+T
)
F
F
(
idid
(
(id
Example SLR Grammar and LR(0) Items
Augmentedgrammar:1. C’ C2. C A B3. A a4. B a
State I1:C’ C• State I4:
C A B•goto(I0,C)
I0 = closure({[C’ •C]})I1 = goto(I0,C) = closure({[C’ C•]})…
final
88
Augmentedgrammar:1. C’ C2. C A B3. A a4. B a
State I0:C’ •CC •A BA •a
State I1:C’ C•
State I2:C A•BB •a
State I3:A a•
State I4:C A B•
State I5:B a•
goto(I0,C)
goto(I0,a)
goto(I0,A)
goto(I2,a)
goto(I2,B)
start
final
Example SLR Parsing Table
State I0:C’ •CC •A BA •a
State I1:C’ C•
State I2:C A•BB •a
State I3:A a•
State I4:C A B•
State I5:B a•
89
s3acc
s5r3
r2r4
a $012345
C A B1 2
4
1
2
4
5
3
0start
a
AC B
a
Grammar:1. C’ C2. C A B3. A a4. B a
SLR and Ambiguity
• Every SLR grammar is unambiguous, but not everyunambiguous grammar is SLR
• Consider for example the unambiguous grammarS L = R | RL * R | idR L
90
• Every SLR grammar is unambiguous, but not everyunambiguous grammar is SLR
• Consider for example the unambiguous grammarS L = R | RL * R | idR L
I0:S’ •SS •L=RS •RL •*RL •idR •L
I1:S’ S•
I2:S L•=RR L•
I3:S R•
I4:L *•RR •LL •*RL •id
I5:L id•
I6:S L=•RR •LL •*RL •id
I7:L *R•
I8:R L•
I9:S L=R•
action[2,=]=s6action[2,=]=r5 no
Has no SLRparsing table
LR(1) Grammars
• SLR too simple• LR(1) parsing uses lookahead to avoid
unnecessary conflicts in parsing table• LR(1) item = LR(0) item + lookahead
91
• SLR too simple• LR(1) parsing uses lookahead to avoid
unnecessary conflicts in parsing table• LR(1) item = LR(0) item + lookahead
LR(0) item:[A•]
LR(1) item:[A•, a]
I2:S L•=RR L•
split
SLR Versus LR(1)
• Split the SLR states byadding LR(1) lookahead
• Unambiguous grammar1. S L = R2. | R3. L * R4. | id5. R L
92
I2:S L•=RR L•
action[2,=]=s6
Should not reduce on =, because noright-sentential form begins with R=
R L•S L•=R
• Split the SLR states byadding LR(1) lookahead
• Unambiguous grammar1. S L = R2. | R3. L * R4. | id5. R L
lookahead=$
action[2,$]=r5
LR(1) Items
• An LR(1) item[A•, a]
contains a lookahead terminal a, meaning alreadyon top of the stack, expect to see a
• For items of the form[A•, a]
the lookahead a is used to reduce A only if thenext input is a
• For items of the form[A•, a]
with the lookahead has no effect
93
• An LR(1) item[A•, a]
contains a lookahead terminal a, meaning alreadyon top of the stack, expect to see a
• For items of the form[A•, a]
the lookahead a is used to reduce A only if thenext input is a
• For items of the form[A•, a]
with the lookahead has no effect
The Closure Operation for LR(1) Items
1. Start with closure(I) = I2. If [A•B, a] closure(I) then for each
production B in the grammar and eachterminal b FIRST(a), add the item [B•,b] to I if not already in I
3. Repeat 2 until no new items can be added
94
1. Start with closure(I) = I2. If [A•B, a] closure(I) then for each
production B in the grammar and eachterminal b FIRST(a), add the item [B•,b] to I if not already in I
3. Repeat 2 until no new items can be added
The Goto Operation for LR(1) Items
1. For each item [A•X, a] I, add the setof items closure({[AX•, a]}) to goto(I,X)if not already there
2. Repeat step 1 until no more items can beadded to goto(I,X)
95
1. For each item [A•X, a] I, add the setof items closure({[AX•, a]}) to goto(I,X)if not already there
2. Repeat step 1 until no more items can beadded to goto(I,X)
Constructing the set of LR(1) Items of aGrammar
1. Augment the grammar with a new start symbol S’and production S’S
2. Initially, set C = closure({[S’•S, $]})(this is the start state of the DFA)
3. For each set of items I C and each grammarsymbol X (NT) such that goto(I,X) C andgoto(I,X) , add the set of items goto(I,X) to C
4. Repeat 3 until no more sets can be added to C
96
1. Augment the grammar with a new start symbol S’and production S’S
2. Initially, set C = closure({[S’•S, $]})(this is the start state of the DFA)
3. For each set of items I C and each grammarsymbol X (NT) such that goto(I,X) C andgoto(I,X) , add the set of items goto(I,X) to C
4. Repeat 3 until no more sets can be added to C
Example Grammar and LR(1) Items
• Unambiguous LR(1) grammar:S L = R
| RL * R
| idR L
• Augment with S’ S• LR(1) items (next slide)
97
• Unambiguous LR(1) grammar:S L = R
| RL * R
| idR L
• Augment with S’ S• LR(1) items (next slide)
[S’ •S, $] goto(I0,S)=I1[S •L=R, $] goto(I0,L)=I2[S •R, $] goto(I0,R)=I3[L •*R, =/$] goto(I0,*)=I4[L •id, =/$] goto(I0,id)=I5[R •L, $] goto(I0,L)=I2
[S’ S•, $]
[S L•=R, $] goto(I0,=)=I6[R L•, $]
[S R•, $]
[L *•R, =/$] goto(I4,R)=I7[R •L, =/$] goto(I4,L)=I8[L •*R, =/$] goto(I4,*)=I4[L •id, =/$] goto(I4,id)=I5
[L id•, =/$]
[S L=•R, $] goto(I6,R)=I9[R •L, $] goto(I6,L)=I10[L •*R, $] goto(I6,*)=I11[L •id, $] goto(I6,id)=I12
[L *R•, =/$]
[R L•, =/$]
[S L=R•, $]
[R L•, $]
[L *•R, $] goto(I11,R)=I13[R •L, $] goto(I11,L)=I10[L •*R, $] goto(I11,*)=I11[L •id, $] goto(I11,id)=I12
[L id•, $]
[L *R•, $]
I0:
I1:
I2:
I6:
I7:
I8:
I9:
98
[S’ •S, $] goto(I0,S)=I1[S •L=R, $] goto(I0,L)=I2[S •R, $] goto(I0,R)=I3[L •*R, =/$] goto(I0,*)=I4[L •id, =/$] goto(I0,id)=I5[R •L, $] goto(I0,L)=I2
[S’ S•, $]
[S L•=R, $] goto(I0,=)=I6[R L•, $]
[S R•, $]
[L *•R, =/$] goto(I4,R)=I7[R •L, =/$] goto(I4,L)=I8[L •*R, =/$] goto(I4,*)=I4[L •id, =/$] goto(I4,id)=I5
[L id•, =/$]
[S L=•R, $] goto(I6,R)=I9[R •L, $] goto(I6,L)=I10[L •*R, $] goto(I6,*)=I11[L •id, $] goto(I6,id)=I12
[L *R•, =/$]
[R L•, =/$]
[S L=R•, $]
[R L•, $]
[L *•R, $] goto(I11,R)=I13[R •L, $] goto(I11,L)=I10[L •*R, $] goto(I11,*)=I11[L •id, $] goto(I11,id)=I12
[L id•, $]
[L *R•, $]
I2:
I3:
I4:
I5:
I9:
I10:
I12:
I11:
I13:
Constructing Canonical LR(1) ParsingTables
1. Augment the grammar with S’S2. Construct the set C={I0,I1,…,In} of LR(1) items3. If [A•a, b] Ii and goto(Ii,a)=Ij then set
action[i,a]=shift j4. If [A•, a] Ii then set action[i,a]=reduce A
(apply only if AS’)5. If [S’S•, $] is in Ii then set action[i,$]=accept6. If goto(Ii,A)=Ij then set goto[i,A]=j7. Repeat 3-6 until no more entries added8. The initial state i is the Ii holding item [S’•S,$]
99
1. Augment the grammar with S’S2. Construct the set C={I0,I1,…,In} of LR(1) items3. If [A•a, b] Ii and goto(Ii,a)=Ij then set
action[i,a]=shift j4. If [A•, a] Ii then set action[i,a]=reduce A
(apply only if AS’)5. If [S’S•, $] is in Ii then set action[i,$]=accept6. If goto(Ii,A)=Ij then set goto[i,A]=j7. Repeat 3-6 until no more entries added8. The initial state i is the Ii holding item [S’•S,$]
Example LR(1) Parsing Tables5 s4
acc
s6 r6r3
s5 s4
r4
id * = $012345
S L R1 2 3
8 7Grammar:1. S’ S2. S L = R3. S R4. L * R5. L id6. R L
100
s5 s4r5 r5
s12
s11
r4 r4r6 r6
r2r6
s12
s11
r4
5678910
1112
10 4
10 13
Grammar:1. S’ S2. S L = R3. S R4. L * R5. L id6. R L
LALR(1) Grammars
• LR(1) parsing tables have many states• LALR(1) parsing (Look-Ahead LR) combines LR(1)
states to reduce table size• Less powerful than LR(1)
– Will not introduce shift-reduce conflicts, because shifts donot use lookaheads
– May introduce reduce-reduce conflicts, but seldom do sofor grammars of programming languages
101
• LR(1) parsing tables have many states• LALR(1) parsing (Look-Ahead LR) combines LR(1)
states to reduce table size• Less powerful than LR(1)
– Will not introduce shift-reduce conflicts, because shifts donot use lookaheads
– May introduce reduce-reduce conflicts, but seldom do sofor grammars of programming languages
Constructing LALR(1) Parsing Tables
1. Construct sets of LR(1) items2. Combine LR(1) sets with sets of items that
share the same first part
102
[L *•R, =][R •L, =][L •*R, =][L •id, =]
[L *•R, $][R •L, $][L •*R, $][L •id, $]
I4:
I11:
[L *•R, =/$][R •L, =/$][L •*R, =/$][L •id, =/$]
Shorthandfor two items
in the same set
Example LALR(1) Grammar
• Unambiguous LR(1) grammar:S L = R
| RL * R
| idR L
• Augment with S’ S• LALR(1) items (next slide)
103
• Unambiguous LR(1) grammar:S L = R
| RL * R
| idR L
• Augment with S’ S• LALR(1) items (next slide)
[S’ •S,$] goto(I0,S)=I1[S •L=R,$] goto(I0,L)=I2[S •R,$] goto(I0,R)=I3[L •*R,=/$] goto(I0,*)=I4[L •id,=/$] goto(I0,id)=I5[R •L,$] goto(I0,L)=I2
goto(I0,=)= I6[S’ S•,$]
[S L•=R,$][R L•,$]
[S R•,$]
[L *•R,=/$] goto(I4,R)=I7[R •L,=/$] goto(I4,L)=I9[L •*R,=/$] goto(I4,*)=I4[L •id,=/$] goto(I4,id)=I5
[L id•,=/$]
[S L=•R, $] goto(I6,R)=I8[R •L, $] goto(I6,L)=I9[L •*R, $] goto(I6,*)=I4[L •id, $] goto(I6,id)=I5
[L *R•,=/$]
[S L=R•,$]
[R L•,=/$]
I0:
I1:
I2:
I6:
I7:
I8:
I9:
104
[S’ •S,$] goto(I0,S)=I1[S •L=R,$] goto(I0,L)=I2[S •R,$] goto(I0,R)=I3[L •*R,=/$] goto(I0,*)=I4[L •id,=/$] goto(I0,id)=I5[R •L,$] goto(I0,L)=I2
goto(I0,=)= I6[S’ S•,$]
[S L•=R,$][R L•,$]
[S R•,$]
[L *•R,=/$] goto(I4,R)=I7[R •L,=/$] goto(I4,L)=I9[L •*R,=/$] goto(I4,*)=I4[L •id,=/$] goto(I4,id)=I5
[L id•,=/$]
[S L=•R, $] goto(I6,R)=I8[R •L, $] goto(I6,L)=I9[L •*R, $] goto(I6,*)=I4[L •id, $] goto(I6,id)=I5
[L *R•,=/$]
[S L=R•,$]
[R L•,=/$]I2:
I3:
I4:
I5:
I9:
Shorthandfor two items
[R L•,=][R L•,$]
Example LALR(1) Parsing Table
s5 s4acc
s6 r6
id * = $0123
S L R1 2 3
Grammar:1. S’ S2. S L = R3. S R4. L * R5. L id6. R L
105
s6 r6r3
s5 s4r5 r5
s5 s4r4 r4
r2r6 r6
3456789
9 7
9 8
Grammar:1. S’ S2. S L = R3. S R4. L * R5. L id6. R L
LL, SLR, LR, LALR Summary• LL parse tables computed using FIRST/FOLLOW
– Nonterminals terminals productions– Computed using FIRST/FOLLOW
• LR parsing tables computed using closure/goto– LR states terminals shift/reduce actions– LR states nonterminals goto state transitions
• A grammar is– LL(1) if its LL(1) parse table has no conflicts– SLR if its SLR parse table has no conflicts– LALR(1) if its LALR(1) parse table has no conflicts– LR(1) if its LR(1) parse table has no conflicts
106
• LL parse tables computed using FIRST/FOLLOW– Nonterminals terminals productions– Computed using FIRST/FOLLOW
• LR parsing tables computed using closure/goto– LR states terminals shift/reduce actions– LR states nonterminals goto state transitions
• A grammar is– LL(1) if its LL(1) parse table has no conflicts– SLR if its SLR parse table has no conflicts– LALR(1) if its LALR(1) parse table has no conflicts– LR(1) if its LR(1) parse table has no conflicts
LL, SLR, LR, LALR Grammars
LR(1)
LALR(1)
107
LL(1)
LR(0)
SLR
LALR(1)
Dealing with Ambiguous Grammars
1. S’ E2. E E + E3. E id s2
s3 acc
id + $012
E1
$ 0 E 1 + 3 E 4
id+id+id$
+id$
$ 0
… …
stack input
108
s3 acc
r3 r3s2
s3/r2 r2
234
4
Shift/reduce conflict:action[4,+] = shift 4action[4,+] = reduce E E + E
When reducing on +:yields left associativity(id+id)+id
When shifting on +:yields right associativityid+(id+id)
Using Associativity and Precedence toResolve Conflicts
• Left-associative operators: reduce• Right-associative operators: shift• Operator of higher precedence on stack: reduce• Operator of lower precedence on stack: shift
109
• Left-associative operators: reduce• Right-associative operators: shift• Operator of higher precedence on stack: reduce• Operator of lower precedence on stack: shift
S’ EE E + EE E * EE id $ 0 E 1 * 3 E 5
id*id+id$
+id$
$ 0
… …
stack input
reduce E E * E
Error Detection in LR Parsing
• Canonical LR parser uses full LR(1) parse tablesand will never make a single reduction beforerecognizing the error when a syntax erroroccurs on the input
• SLR and LALR may still reduce when a syntaxerror occurs on the input, but will never shiftthe erroneous input symbol
110
• Canonical LR parser uses full LR(1) parse tablesand will never make a single reduction beforerecognizing the error when a syntax erroroccurs on the input
• SLR and LALR may still reduce when a syntaxerror occurs on the input, but will never shiftthe erroneous input symbol
Error Recovery in LR Parsing• Panic mode
– Pop until state with a goto on a nonterminal A is found,(where A represents a major programming construct),push A
– Discard input symbols until one is found in the FOLLOW setof A
• Phrase-level recovery– Implement error routines for every error entry in table
• Error productions– Pop until state has error production, then shift on stack– Discard input until symbol is encountered that allows
parsing to continue
111
• Panic mode– Pop until state with a goto on a nonterminal A is found,
(where A represents a major programming construct),push A
– Discard input symbols until one is found in the FOLLOW setof A
• Phrase-level recovery– Implement error routines for every error entry in table
• Error productions– Pop until state has error production, then shift on stack– Discard input until symbol is encountered that allows
parsing to continue
ANTLR, Yacc, and Bison
• ANTLR tool– Generates LL(k) parsers
• Yacc (Yet Another Compiler Compiler)– Generates LALR(1) parsers
• Bison– Improved version of Yacc
112
• ANTLR tool– Generates LL(k) parsers
• Yacc (Yet Another Compiler Compiler)– Generates LALR(1) parsers
• Bison– Improved version of Yacc
Creating an LALR(1) Parser withYacc/Bison
Yacc or Bisoncompiler
yaccspecificationyacc.y
y.tab.c
113
y.tab.c
inputstream
Ccompiler
a.out outputstream
a.out
Yacc Specification• A yacc specification consists of three parts:
yacc declarations, and C declarations within %{ %}%%translation rules%%user-defined auxiliary procedures
• The translation rules are productions with actions:production1 { semantic action1 }production2 { semantic action2 }…productionn { semantic actionn }
114
• A yacc specification consists of three parts:yacc declarations, and C declarations within %{ %}%%translation rules%%user-defined auxiliary procedures
• The translation rules are productions with actions:production1 { semantic action1 }production2 { semantic action2 }…productionn { semantic actionn }
Writing a Grammar in Yacc
• Productions in Yacc are of the formNonterminal: tokens/nonterminals { action }
| tokens/nonterminals { action }…;
• Tokens that are single characters can be used directlywithin productions, e.g. ‘+’
• Named tokens must be declared first in thedeclaration part using
%token TokenName
115
• Productions in Yacc are of the formNonterminal: tokens/nonterminals { action }
| tokens/nonterminals { action }…;
• Tokens that are single characters can be used directlywithin productions, e.g. ‘+’
• Named tokens must be declared first in thedeclaration part using
%token TokenName
Synthesized Attributes
• Semantic actions may refer to values of thesynthesized attributes of terminals and nonterminalsin a production:
X : Y1 Y2 Y3 … Yn { action }– $$ refers to the value of the attribute of X– $i refers to the value of the attribute of Yi
• For examplefactor : ‘(’ expr ‘)’ { $$=$2; }
116
• Semantic actions may refer to values of thesynthesized attributes of terminals and nonterminalsin a production:
X : Y1 Y2 Y3 … Yn { action }– $$ refers to the value of the attribute of X– $i refers to the value of the attribute of Yi
• For examplefactor : ‘(’ expr ‘)’ { $$=$2; }
factor.val=x
expr.val=x )($$=$2
Example 1%{ #include <ctype.h> %}%token DIGIT%%line : expr ‘\n’ { printf(“%d\n”, $1); }
;expr : expr ‘+’ term { $$ = $1 + $3; }
| term { $$ = $1; };
term : term ‘*’ factor { $$ = $1 * $3; }| factor { $$ = $1; };
factor : ‘(’ expr ‘)’ { $$ = $2; }| DIGIT { $$ = $1; };
%%int yylex(){ int c = getchar();if (isdigit(c)){ yylval = c-’0’;
return DIGIT;}return c;
}
Also results in definition of#define DIGIT xxx
117
%{ #include <ctype.h> %}%token DIGIT%%line : expr ‘\n’ { printf(“%d\n”, $1); }
;expr : expr ‘+’ term { $$ = $1 + $3; }
| term { $$ = $1; };
term : term ‘*’ factor { $$ = $1 * $3; }| factor { $$ = $1; };
factor : ‘(’ expr ‘)’ { $$ = $2; }| DIGIT { $$ = $1; };
%%int yylex(){ int c = getchar();if (isdigit(c)){ yylval = c-’0’;
return DIGIT;}return c;
}
Attribute of token(stored in yylval)
Attribute ofterm (parent)
Attribute of factor (child)
Example of a very crude lexicalanalyzer invoked by the parser
Dealing With Ambiguous Grammars
• By defining operator precedence levels and left/rightassociativity of the operators, we can specifyambiguous grammars in Yacc, such asE E+E | E-E | E*E | E/E | (E) | -E | num
• To define precedence levels and associativity in Yacc’sdeclaration part:
%left ‘+’ ‘-’%left ‘*’ ‘/’%right UMINUS
118
• By defining operator precedence levels and left/rightassociativity of the operators, we can specifyambiguous grammars in Yacc, such asE E+E | E-E | E*E | E/E | (E) | -E | num
• To define precedence levels and associativity in Yacc’sdeclaration part:
%left ‘+’ ‘-’%left ‘*’ ‘/’%right UMINUS
Example 2%{#include <ctype.h>#include <stdio.h>#define YYSTYPE double%}%token NUMBER%left ‘+’ ‘-’%left ‘*’ ‘/’%right UMINUS%%lines : lines expr ‘\n’ { printf(“%g\n”, $2); }
| lines ‘\n’| /* empty */;
expr : expr ‘+’ expr { $$ = $1 + $3; }| expr ‘-’ expr { $$ = $1 - $3; }| expr ‘*’ expr { $$ = $1 * $3; }| expr ‘/’ expr { $$ = $1 / $3; }| ‘(’ expr ‘)’ { $$ = $2; }| ‘-’ expr %prec UMINUS { $$ = -$2; }| NUMBER;
%%
Double type for attributesand yylval
119
%{#include <ctype.h>#include <stdio.h>#define YYSTYPE double%}%token NUMBER%left ‘+’ ‘-’%left ‘*’ ‘/’%right UMINUS%%lines : lines expr ‘\n’ { printf(“%g\n”, $2); }
| lines ‘\n’| /* empty */;
expr : expr ‘+’ expr { $$ = $1 + $3; }| expr ‘-’ expr { $$ = $1 - $3; }| expr ‘*’ expr { $$ = $1 * $3; }| expr ‘/’ expr { $$ = $1 / $3; }| ‘(’ expr ‘)’ { $$ = $2; }| ‘-’ expr %prec UMINUS { $$ = -$2; }| NUMBER;
%%
Example 2 (cont’d)%%int yylex(){ int c;while ((c = getchar()) == ‘ ‘)
;if ((c == ‘.’) || isdigit(c)){ ungetc(c, stdin);
scanf(“%lf”, &yylval);return NUMBER;
}return c;
}int main(){ if (yyparse() != 0)
fprintf(stderr, “Abnormal exit\n”);return 0;
}int yyerror(char *s){ fprintf(stderr, “Error: %s\n”, s);}
Crude lexical analyzer forfp doubles and arithmeticoperators
120
%%int yylex(){ int c;while ((c = getchar()) == ‘ ‘)
;if ((c == ‘.’) || isdigit(c)){ ungetc(c, stdin);
scanf(“%lf”, &yylval);return NUMBER;
}return c;
}int main(){ if (yyparse() != 0)
fprintf(stderr, “Abnormal exit\n”);return 0;
}int yyerror(char *s){ fprintf(stderr, “Error: %s\n”, s);}
Run the parser
Crude lexical analyzer forfp doubles and arithmeticoperators
Invoked by parserto report parse errors
Combining Lex/Flex with Yacc/Bison
Yacc or Bisoncompiler
yaccspecificationyacc.y
y.tab.cy.tab.h
Lex or Flexcompiler
Lex specificationlex.l
and token definitionsy.tab.h
121
lex.yy.cy.tab.c
inputstream
Ccompiler
a.out outputstream
a.out
Lex or Flexcompiler
Lex specificationlex.l
and token definitionsy.tab.h
lex.yy.c
Lex Specification for Example 2%option noyywrap%{#include “y.tab.h”extern double yylval;%}number [0-9]+\.?|[0-9]*\.[0-9]+%%[ ] { /* skip blanks */ }{number} { sscanf(yytext, “%lf”, &yylval);
return NUMBER;}
\n|. { return yytext[0]; }
Generated by Yacc, contains#define NUMBER xxx
Defined in y.tab.c
122
%option noyywrap%{#include “y.tab.h”extern double yylval;%}number [0-9]+\.?|[0-9]*\.[0-9]+%%[ ] { /* skip blanks */ }{number} { sscanf(yytext, “%lf”, &yylval);
return NUMBER;}
\n|. { return yytext[0]; }
yacc -d example2.ylex example2.lgcc y.tab.c lex.yy.c./a.out
bison -d -y example2.yflex example2.lgcc y.tab.c lex.yy.c./a.out
Error Recovery in Yacc
%{…%}…%%lines : lines expr ‘\n’ { printf(“%g\n”, $2; }
| lines ‘\n’| /* empty */| error ‘\n’ { yyerror(“reenter last line: ”);
yyerrok;}
;…
123
%{…%}…%%lines : lines expr ‘\n’ { printf(“%g\n”, $2; }
| lines ‘\n’| /* empty */| error ‘\n’ { yyerror(“reenter last line: ”);
yyerrok;}
;…
Reset parser to normal modeError production:set error mode and
skip input until newline