l16lr
7/31/2019 L16LR
CS780(Prasad) L16LR 1
LR Parsing
Lecture Notes by
Profs Aiken and Necula (UCB)
CS780(Prasad) L16LR 2
Outline
Review of SLR parsing
Limits of SLR parsing
LR parsing
LALR parsing
Implementation of semantic actions
Using parser generators
CS780(Prasad) L16LR 3
Review of SLR(1) Parsing
LR parser maintains a stack ⟨sym1, state1⟩ . . . ⟨symn, staten⟩
  staten is the final state of the DFA on sym1 … symn
Goto table: the transition function of the DFA
  Goto[i, A] = j if statei →A statej
Action table: for each state and terminal:
  Action[i, a] = Shift j | Reduce X → α | Accept | Error
CS780(Prasad) L16LR 4
LR Parsing Algorithm
Let I = w$ be initial input
Let j = 0
Let DFA state 1 have item S' → . S
Let stack = ⟨dummy, 1⟩
repeat
  case action[top_state(stack), I[j]] of
    shift k: push ⟨I[j++], k⟩
    reduce X → α:
      pop |α| pairs,
      push ⟨X, Goto[top_state(stack), X]⟩
    accept: halt normally
    error: halt and report error
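The driver loop above can be sketched in Python as follows. This is a minimal illustration, not the lecture's implementation: the toy grammar S → ( S ) | x, its hand-built SLR(1) tables, and the names `PRODUCTIONS`, `ACTION`, `GOTO`, and `lr_parse` are all assumptions made for the example, and states are numbered from 0 rather than 1.

```python
# Hypothetical toy grammar (not from the lecture):
#   1: S -> ( S )      2: S -> x
PRODUCTIONS = {1: ("S", ("(", "S", ")")), 2: ("S", ("x",))}

# Hand-built SLR(1) tables for that grammar; states are numbered from 0 here.
ACTION = {
    (0, "("): ("shift", 2), (0, "x"): ("shift", 3),
    (1, "$"): ("accept",),
    (2, "("): ("shift", 2), (2, "x"): ("shift", 3),
    (3, ")"): ("reduce", 2), (3, "$"): ("reduce", 2),
    (4, ")"): ("shift", 5),
    (5, ")"): ("reduce", 1), (5, "$"): ("reduce", 1),
}
GOTO = {(0, "S"): 1, (2, "S"): 4}


def lr_parse(tokens):
    """Run the LR driver loop; return the production numbers reduced, in order."""
    remaining = list(tokens) + ["$"]      # I = w$
    j = 0
    stack = [("dummy", 0)]                # pairs of (symbol, state)
    reductions = []
    while True:
        state = stack[-1][1]
        act = ACTION.get((state, remaining[j]))
        if act is None:                   # missing table entry means error
            raise SyntaxError(f"unexpected {remaining[j]!r} in state {state}")
        if act[0] == "shift":
            stack.append((remaining[j], act[1]))
            j += 1
        elif act[0] == "reduce":
            lhs, rhs = PRODUCTIONS[act[1]]
            del stack[len(stack) - len(rhs):]          # pop |rhs| pairs
            stack.append((lhs, GOTO[(stack[-1][1], lhs)]))
            reductions.append(act[1])
        else:                             # accept
            return reductions
```

For example, `lr_parse("((x))")` reduces by S → x once and then by S → ( S ) twice, while `lr_parse("(x")` raises `SyntaxError` because the table has no entry for `$` after `( S`.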
CS780(Prasad) L16LR 5
Review. Items
An item [X → α.β] says that
  the parser is looking for an X
  it has an α on top of the stack
  Expects to find a string derived from β next in the input
Notes:
  [X → α.aβ] means that a should follow. Then we can shift it and still have a viable prefix.
  [X → α.] means that we could reduce X
    But this is not always a good idea!
CS780(Prasad) L16LR 6
SLR(1) Action Table
For each state si and terminal a
  If si has item X → α.aβ and Goto[i,a] = j then Action[i,a] = shift j
  If si has item S' → S. then Action[i,$] = accept
  If si has item X → α. and a ∈ Follow(X) and X ≠ S' then Action[i,a] = reduce X → α
  Otherwise, Action[i,a] = error
CS780(Prasad) L16LR 7
Limits of SLR Parsing
SLR(1) is the simplest LR parsing method
SLR(1) is almost powerful enough, but some common programming language constructs are not SLR(1).
Consider the grammar
  S → L = E | E
  L → * E | id
  E → L
CS780(Prasad) L16LR 8
Limits of SLR Parsing (cont.)
Consider two states of the DFA for recognizing viable prefixes
  First state:
    S' → . S
    S → . L = E
    S → . E
    L → . * E
    L → . id
    E → . L
  Second state (reached from the first on L):
    S → L . = E
    E → L .
SLR(1) parser on input =
  shift (item S → L . = E)
  reduce by E → L (since = ∈ Follow(E))
CS780(Prasad) L16LR 9
What's The Problem?
The grammar is not SLR(1), but why?
Focus on the reduce move in the second state
  We are in the context of S ⇒ E ⇒ L
  No = can follow E in this context
  Even though = ∈ Follow(E) (in S ⇒ L = E ⇒ * E = E)
  The reduce move should not happen if an = follows in this context.
CS780(Prasad) L16LR 10
What's The Problem? (Cont.)
Problem: the SLR table has too many reduce actions. Using Follow is too coarse.
In any given context, only some elements of Follow can actually follow a non-terminal.
For example: Follow(E) = {=, $}, but
  In context S ⇒ E only $ can follow E
  In context S ⇒ L = E ⇒ * E1 = E only = can follow E1
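The claim Follow(E) = {=, $} can be checked with a short fixed-point computation. The sketch below is an illustration under a stated assumption: it only handles grammars without ε-productions (true of this grammar), and the names `GRAMMAR`, `first_sets`, `follow_sets`, and `slr_actions_on_eq` are made up for the example.

```python
# The grammar from the slides, augmented with S' -> S.
GRAMMAR = [
    ("S'", ["S"]),
    ("S", ["L", "=", "E"]),
    ("S", ["E"]),
    ("L", ["*", "E"]),
    ("L", ["id"]),
    ("E", ["L"]),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_of(symbol, first):
    """First set of one symbol; a terminal's First set is itself."""
    return first[symbol] if symbol in NONTERMINALS else {symbol}


def first_sets():
    """Fixed point for First; assumes no epsilon-productions."""
    first = {nt: set() for nt in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            new = first_of(rhs[0], first) - first[lhs]
            if new:
                first[lhs] |= new
                changed = True
    return first


def follow_sets(start="S'"):
    """Fixed point for Follow; assumes no epsilon-productions."""
    first = first_sets()
    follow = {nt: set() for nt in NONTERMINALS}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                if i + 1 < len(rhs):          # something follows sym in this rule
                    source = first_of(rhs[i + 1], first)
                else:                          # sym ends the rule: inherit Follow(lhs)
                    source = follow[lhs]
                new = source - follow[sym]
                if new:
                    follow[sym] |= new
                    changed = True
    return follow


FOLLOW = follow_sets()
# SLR(1) actions on '=' in the state {S -> L . = E, E -> L .}:
# the shift is always present, the reduce is added whenever '=' is in Follow(E).
slr_actions_on_eq = ["shift"] + (["reduce E -> L"] if "=" in FOLLOW["E"] else [])
```

Because `=` is in `FOLLOW["E"]`, the SLR(1) table entry for that state on `=` holds both a shift and a reduce, which is exactly the conflict discussed above.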
CS780(Prasad) L16LR 11
One Way to Fix The Problem: LR(1) Items
Idea: refine Follow based on context.
The context is described through items.
An LR(1) item is a pair [X → α.β, a] where
  X → αβ is a production and
  a is the lookahead token or $
LR(k) is similar but with k tokens of lookahead
  In practice, k = 1
CS780(Prasad) L16LR 12
LR(1) Items. Intuition
[X → α.β, a] describes a state of the parser:
  We are trying to find an X, and
  We have α already on top of the stack, and
  We expect to see a prefix derived from βa
Back to reduce actions: have an [X → α., a]
  Perform the reduce only if next token is a!
  Will have fewer reduce actions
  Not for all b ∈ Follow(X)
CS780(Prasad) L16LR 13
Constructing Sets of LR(1) Items (1)
Similar to construction for LR(0).
The states of the NFA are the LR(1) items of G.
The start state is [S' → . S, $]
CS780(Prasad) L16LR 14
Constructing Sets of LR(1) Items (2)
1. For each LR(1) item [Y → α.Xβ, a]
   Add an X-transition
   [Y → α.Xβ, a] →X [Y → αX.β, a]
2. For each LR(1) item [Y → α.Xβ, a]
   For each production X → γ
   For each terminal b ∈ First(βa)
   Add an ε-transition [Y → α.Xβ, a] →ε [X → .γ, b]
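The two rules above can be sketched in Python using the usual closure/goto formulation, which builds DFA states directly instead of going through the NFA. This is an assumption-laden illustration: items are encoded as `(lhs, rhs, dot, lookahead)` tuples, ε-productions are not handled (so First(βa) is First of the first symbol of β, or {a} when β is empty), and all names are made up for the example.

```python
# The lecture's grammar, augmented with S' -> S.
GRAMMAR = {
    "S'": [("S",)],
    "S": [("L", "=", "E"), ("E",)],
    "L": [("*", "E"), ("id",)],
    "E": [("L",)],
}


def first_of(symbol, seen=frozenset()):
    """First set of a single symbol; assumes no epsilon-productions."""
    if symbol not in GRAMMAR:            # terminal
        return {symbol}
    if symbol in seen:                   # already being expanded on this path
        return set()
    result = set()
    for rhs in GRAMMAR[symbol]:
        result |= first_of(rhs[0], seen | {symbol})
    return result


def closure(items):
    """Rule 2: follow epsilon-transitions until no new item appears."""
    result = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot, lookahead = work.pop()
        if dot == len(rhs) or rhs[dot] not in GRAMMAR:
            continue                     # dot at the end, or before a terminal
        x, beta = rhs[dot], rhs[dot + 1:]
        lookaheads = first_of(beta[0]) if beta else {lookahead}   # First(beta a)
        for gamma in GRAMMAR[x]:
            for b in lookaheads:
                item = (x, gamma, 0, b)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)


def goto(items, symbol):
    """Rule 1: move the dot over `symbol`, then close the result."""
    moved = {(l, r, d + 1, a) for (l, r, d, a) in items
             if d < len(r) and r[d] == symbol}
    return closure(moved)


START = closure({("S'", ("S",), 0, "$")})
AFTER_L = goto(START, "L")
```

With this encoding, `AFTER_L` contains only [S → L . = E, $] and [E → L ., $], so the reduce by E → L is attached to the lookahead `$` alone.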
CS780(Prasad) L16LR 15
NFA for Viable Prefixes in Detail (1)
[Figure: NFA fragment with the items below; edge label shown: S]
  [S' → . S, $]
  [S → . E, $]
  [S → . L = E, $]
  [S' → S ., $]
CS780(Prasad) L16LR 16
NFA for Viable Prefixes in Detail (2)
[Figure: NFA fragment with the items below; edge labels shown: S, L]
  [S' → . S, $]
  [S' → S ., $]
  [S → . L = E, $]
  [S → . E, $]
  [L → . * E, =]
  [L → . id, =]
  [S → L . = E, $]
CS780(Prasad) L16LR 17
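NFA for Viable Prefixes in Detail (3)
[Figure: NFA fragment with the items below; edge labels shown: S, L, E]
  [S' → . S, $]
  [S' → S ., $]
  [S → . L = E, $]
  [S → . E, $]
  [L → . id, =]
  [L → . * E, =]
  [S → L . = E, $]
  [E → . L, $]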
CS780(Prasad) L16LR 18
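NFA for Viable Prefixes in Detail (4)
[Figure: NFA fragment with the items below; edge labels shown: S, L, E]
  [S' → . S, $]
  [S' → S ., $]
  [S → . L = E, $]
  [S → . E, $]
  [E → . L, $]
  [L → . id, =]
  [L → . * E, =]
  [S → L . = E, $]
  [E → L ., $]
  [L → . id, $]
  [L → . * E, $]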
CS780(Prasad) L16LR 19
An Example Revisited
Consider the state from last slide
  [S → L . = E, $]
  [E → L ., $]
LR(1) parser on input = only shifts (item S → L . = E); the reduce by E → L applies only on $
CS780(Prasad) L16LR 20
Constructing LR(1) Parsing Tables
1. Add a dummy S' → S production
2. Construct the NFA of LR(1) items as before
3. Convert the NFA into a DFA
4. Goto is defined exactly as before:
   Goto[i, A] = j if statei →A statej
   (the transition function of the DFA)
CS780(Prasad) L16LR 21
Constructing LR(1) Parsing Tables (Cont.)
5. For each state si of the DFA and terminal a
   If si has item [X → α.aβ, c] and Goto[i, a] = j then action[i,a] = shift j
   If si has item [X → α., a] and X ≠ S' then action[i,a] = reduce X → α
   If si has item [S' → S., $] then action[i,$] = accept
   Otherwise, action[i,a] = error
LR(1) grammar ⇔ action[i,a] uniquely defined
CS780(Prasad) L16LR 22
LALR Parsing
Two bottom-up parsing methods: SLR and LR
Which one do we use? Neither.
  SLR is not powerful enough.
  LR parsing tables are too big (1000s of states vs. 100s of states for SLR).
In practice, use LALR(1)
  Stands for Look-Ahead LR
  A compromise between SLR(1) and LR(1)
CS780(Prasad) L16LR 23
LALR Parsing (Cont.)
Rough intuition: A LALR(1) parser for G has
  The number of states of an SLR parser.
  Some of the lookahead discrimination of LR(1).
Idea: construct the DFA for the LR(1). Then merge the DFA states whose items differ only in the lookahead tokens
  We say that such states have the same core.
CS780(Prasad) L16LR 24
The Core of a Set of LR Items
Definition: The core of a set of LR items is the set of first components.
Example: the core of
  { [X → α.β, b], [Y → γ.δ, d] }
is
  { X → α.β, Y → γ.δ }
The core of an LR item is an LR(0) item.
CS780(Prasad) L16LR 25
A LALR(1) DFA
Repeat until all states have distinct core.
  Choose two distinct states with same core.
  Merge the states by creating a new one with the union of all the items.
  Point edges from predecessors to new state.
  New state points to all the previous successors.
[Figure: a DFA with states A, B, C, D, E, F before merging, and the same DFA after states B and E are merged into a single state BE]
CS780(Prasad) L16LR 26
The LALR Parser Can Have Conflicts
Consider for example the LR(1) states
  {[X → α., a], [Y → β., b]}
  {[X → α., b], [Y → β., a]}
And the merged LALR(1) state
  {[X → α., a/b], [Y → β., a/b]}
Has a new reduce-reduce conflict.
In practice such cases are rare.
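Merging by core, and the reduce-reduce conflict it can introduce, can be sketched in Python as follows. This is an illustration only: items are encoded as `(lhs, rhs, dot, lookahead)` tuples, the right-hand sides `("alpha",)` and `("beta",)` stand in for α and β, the function names are invented for the example, and the sketch merges item sets without redirecting DFA edges.

```python
from collections import defaultdict


def core(state):
    """Drop the lookaheads: the core is a set of LR(0) items."""
    return frozenset((lhs, rhs, dot) for (lhs, rhs, dot, _lookahead) in state)


def merge_by_core(states):
    """Union all states that share a core (edges are not handled here)."""
    groups = defaultdict(set)
    for state in states:
        groups[core(state)] |= state
    return [frozenset(items) for items in groups.values()]


def reduce_reduce_conflicts(state):
    """Map each lookahead to the productions competing to be reduced on it."""
    by_lookahead = defaultdict(set)
    for lhs, rhs, dot, lookahead in state:
        if dot == len(rhs):                       # complete item: a reduce
            by_lookahead[lookahead].add((lhs, rhs))
    return {la: prods for la, prods in by_lookahead.items() if len(prods) > 1}


# The two LR(1) states from the slide, with placeholder right-hand sides.
STATE_1 = frozenset({("X", ("alpha",), 1, "a"), ("Y", ("beta",), 1, "b")})
STATE_2 = frozenset({("X", ("alpha",), 1, "b"), ("Y", ("beta",), 1, "a")})
MERGED = merge_by_core([STATE_1, STATE_2])
```

Neither `STATE_1` nor `STATE_2` has a conflict on its own, but the single merged state reports one on both `a` and `b`.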
CS780(Prasad) L16LR 27
LALR vs. LR Parsing
LALR languages are not natural. They are an efficiency hack on LR languages
Any reasonable programming language has an LALR(1) grammar.
LALR(1) has become a standard for programming languages and for parser generators.
CS780(Prasad) L16LR 28
A Hierarchy of Grammar Classes
CS780(Prasad) L16LR 29
Semantic Actions
We can now illustrate how semantic actions are implemented for LR parsing.
Keep attributes on the stack.
On shift a, push attribute for a on stack.
On reduce X → α
  pop attributes for α
  compute attribute for X
  and push it on the stack
CS780(Prasad) L16LR 30
Performing Semantic Actions. Example
Recall the example from earlier lecture
  E → T + E1     { E.val = T.val + E1.val }
    | T          { E.val = T.val }
  T → int * T1   { T.val = int.val * T1.val }
    | int        { T.val = int.val }
Consider the parsing of the string 3 * 5 + 8
CS780(Prasad) L16LR 31
Performing Semantic Actions. Example
Stack | remaining input      Action
| int * int + int            shift
int₃ | * int + int           shift
int₃ * | int + int           shift
int₃ * int₅ | + int          reduce T → int
int₃ * T₅ | + int            reduce T → int * T
T₁₅ | + int                  shift
T₁₅ + | int                  shift
T₁₅ + int₈ |                 reduce T → int
T₁₅ + T₈ |                   reduce E → T
T₁₅ + E₈ |                   reduce E → T + E
E₂₃ |                        accept
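The trace above can be replayed in Python with a stack of attribute values. This is a sketch under stated assumptions: the parser's shift/reduce decisions are supplied as a fixed list of moves rather than looked up in a table, operator tokens carry the placeholder attribute `None`, and the names `RULES` and `run` are invented for the example.

```python
# Each rule maps to (length of right-hand side, semantic action over the popped values).
RULES = {
    "T -> int":     (1, lambda v: v[0]),
    "T -> int * T": (3, lambda v: v[0] * v[2]),
    "E -> T":       (1, lambda v: v[0]),
    "E -> T + E":   (3, lambda v: v[0] + v[2]),
}


def run(tokens, moves):
    """Replay shift/reduce moves, keeping one attribute per stack symbol."""
    values = []
    position = 0
    for move in moves:
        if move == "shift":
            _kind, value = tokens[position]
            position += 1
            values.append(value)               # push the token's attribute
        else:
            length, action = RULES[move]
            popped = values[len(values) - length:]
            del values[len(values) - length:]  # pop attributes of the right-hand side
            values.append(action(popped))      # push the attribute of the left-hand side
    return values


# Tokens of "3 * 5 + 8"; operators have no attribute.
TOKENS = [("int", 3), ("*", None), ("int", 5), ("+", None), ("int", 8)]
MOVES = [
    "shift", "shift", "shift",
    "T -> int", "T -> int * T",
    "shift", "shift",
    "T -> int", "E -> T", "E -> T + E",
]
RESULT = run(TOKENS, MOVES)
```

After the last reduce the value stack holds a single attribute, the value of E, which is 23.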
CS780(Prasad) L16LR 32
Notes
The previous discussion shows how synthesized attributes are computed by LR parsers.
It is also possible to compute inherited attributes in an LR parser.
CS780(Prasad) L16LR 33
Using Parser Generators
Most common parser generators are LALR(1).
A parser generator constructs a LALR(1) table.
And reports an error when a table entry is multiply defined:
  A shift and a reduce. Called shift/reduce conflict
  Multiple reduces. Called reduce/reduce conflict
An ambiguous grammar will generate conflicts.
  What do we do in that case?
CS780(Prasad) L16LR 34
Shift/Reduce Conflicts
Typically due to ambiguities in the grammar.
Classic example: the dangling else
  S → if E then S | if E then S else S | OTHER
Will have DFA state containing
  [S → if E then S., else]
  [S → if E then S. else S, x]
If else follows, then we can shift or reduce
Default (bison, CUP, etc.) is to shift
  Default behavior is as needed in this case.
CS780(Prasad) L16LR 35
More Shift/Reduce Conflicts
Consider the ambiguous grammar
  E → E + E | E * E | int
We will have the states containing
  [E → E * . E, +]   --E-->   [E → E * E ., +]
  [E → . E + E, +]            [E → E . + E, +]
Again we have a shift/reduce on input +
  We need to reduce (* binds more tightly than +)
  Recall solution: declare the precedence of * and +
CS780(Prasad) L16LR 36
More Shift/Reduce Conflicts
In bison, declare precedence and associativity:
  %left +
  %left *
Precedence of a rule = that of its last terminal
  See bison manual for ways to override this default.
Resolve shift/reduce conflict with a shift if:
  no precedence declared for either rule or terminal
  input terminal has higher precedence than the rule
  the precedences are the same and right associative
CS780(Prasad) L16LR 37
Using Precedence to Solve S/R Conflicts
Back to our example:
  [E → E * . E, +]   --E-->   [E → E * E ., +]
  [E → . E + E, +]            [E → E . + E, +]
Will choose reduce because precedence of rule E → E * E is higher than of terminal +
CS780(Prasad) L16LR 38
Using Precedence to Solve S/R Conflicts
Same grammar as before
  E → E + E | E * E | int
We will also have the states
  [E → E + . E, +]   --E-->   [E → E + E ., +]
  [E → . E + E, +]            [E → E . + E, +]
Now we also have an S/R conflict on input +
  We choose reduce because E → E + E and + have the same precedence and + is left-associative.
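The shift-versus-reduce rule stated earlier, applied to the two E examples, can be sketched in Python. This models only the rule as given in these slides, with `%left` and `%right` associativity; it is not bison's implementation, and the names `PRECEDENCE`, `TERMINALS`, `rule_precedence`, and `resolve` are assumptions made for the example.

```python
# Later declarations bind tighter, mirroring "%left +" followed by "%left *".
PRECEDENCE = {"+": (1, "left"), "*": (2, "left")}
TERMINALS = {"+", "*", "int"}


def rule_precedence(rhs):
    """Precedence of a rule = that of its last terminal (None if undeclared)."""
    terminals = [symbol for symbol in rhs if symbol in TERMINALS]
    return PRECEDENCE.get(terminals[-1]) if terminals else None


def resolve(rule_rhs, lookahead):
    """Pick 'shift' or 'reduce' for a shift/reduce conflict on `lookahead`."""
    rule = rule_precedence(rule_rhs)
    token = PRECEDENCE.get(lookahead)
    if rule is None or token is None:
        return "shift"                   # no precedence declared for rule or terminal
    if token[0] > rule[0]:
        return "shift"                   # input terminal binds tighter than the rule
    if token[0] == rule[0] and token[1] == "right":
        return "shift"                   # same level, right associative
    return "reduce"
```

With these declarations, `resolve(("E", "*", "E"), "+")` and `resolve(("E", "+", "E"), "+")` both give `"reduce"`, matching the two cases above, while `resolve(("E", "+", "E"), "*")` gives `"shift"`.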
CS780(Prasad) L16LR 39
Using Precedence to Solve S/R Conflicts
Back to our dangling else example
  [S → if E then S., else]
  [S → if E then S. else S, x]
Can eliminate conflict by declaring else with higher precedence than then.
  But this starts to look like hacking the tables.
Best to avoid overuse of precedence declarations, or you'll end up with unexpected parse trees.
CS780(Prasad) L16LR 40
Reduce/Reduce Conflicts
Usually due to gross ambiguity in the grammar
Example: a sequence of identifiers
  S → ε | id | id S
There are two parse trees for the string id
  S ⇒ id
  S ⇒ id S ⇒ id
How does this confuse the parser?
CS780(Prasad) L16LR 41
More on Reduce/Reduce Conflicts
Consider the states
  First state:
    [S' → . S, $]
    [S → ., $]
    [S → . id, $]
    [S → . id S, $]
  Second state (reached from the first on id):
    [S → id ., $]
    [S → id . S, $]
    [S → ., $]
    [S → . id, $]
    [S → . id S, $]
Reduce/reduce conflict on input $
  S' ⇒ S ⇒ id
  S' ⇒ S ⇒ id S ⇒ id
Better rewrite the grammar: S → ε | id S
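The ambiguity can be checked by counting parse trees for a string of n identifiers. The function below is a small illustration (its name and the `with_id_rule` flag are invented for the example): it counts trees for S → ε | id S, and optionally also allows the extra rule S → id.

```python
def count_trees(n, with_id_rule):
    """Number of parse trees for a string of n 'id' tokens.

    Grammar: S -> epsilon | id S, plus S -> id when with_id_rule is True.
    """
    if n == 0:
        return 1                                # only S -> epsilon
    total = count_trees(n - 1, with_id_rule)    # S -> id S
    if with_id_rule and n == 1:
        total += 1                              # S -> id
    return total
```

With the extra rule, `count_trees(1, True)` is 2, the two trees listed above; for the rewritten grammar, `count_trees(n, False)` is 1 for every n.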
CS780(Prasad) L16LR 42
Strange Reduce/Reduce Conflicts
Consider the grammar
  S → P R ,
  NL → N | N , NL
  P → T | NL : T
  R → T | N : T
  N → id
  T → id
P - parameters specification
R - result specification
N - a parameter or result name
T - a type name
NL - a list of names
CS780(Prasad) L16LR 43
Strange Reduce/Reduce Conflicts
In P an id is a
  N when followed by , or :
  T when followed by id
In R an id is a
  N when followed by :
  T when followed by ,
This is an LR(1) grammar.
But it is not LALR(1). Why?
  For obscure reasons
CS780(Prasad) L16LR 44
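A Few LR(1) States
State 1:
  [P → . T, id]
  [P → . NL : T, id]
  [NL → . N, :]
  [NL → . N , NL, :]
  [N → . id, :]
  [N → . id, ,]
  [T → . id, id]
State 2:
  [R → . T, ,]
  [R → . N : T, ,]
  [T → . id, ,]
  [N → . id, :]
State 3 (reached from state 1 on id):
  [T → id ., id]
  [N → id ., :]
  [N → id ., ,]
State 4 (reached from state 2 on id):
  [T → id ., ,]
  [N → id ., :]
LALR merge of states 3 and 4:
  [T → id ., id/,]
  [N → id ., :/,]
LALR reduce/reduce conflict on ,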
CS780(Prasad) L16LR 45
What Happened?
Two distinct states were confused because they have the same core.
Fix: add dummy productions to distinguish the two confused states.
E.g., add R → id bogus
  bogus is a terminal not used by the lexer.
  This production will never be used during parsing.
  But it distinguishes R from P.
CS780(Prasad) L16LR 46
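A Few LR(1) States After Fix
State 1:
  [P → . T, id]
  [P → . NL : T, id]
  [NL → . N, :]
  [NL → . N , NL, :]
  [N → . id, :]
  [N → . id, ,]
  [T → . id, id]
State 2:
  [R → . T, ,]
  [R → . N : T, ,]
  [R → . id bogus, ,]
  [T → . id, ,]
  [N → . id, :]
State 3 (reached from state 1 on id):
  [T → id ., id]
  [N → id ., :]
  [N → id ., ,]
State 4 (reached from state 2 on id):
  [T → id ., ,]
  [N → id ., :]
  [R → id . bogus, ,]
Different cores ⇒ no LALR merging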
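The effect of the fix on states 3 and 4 can be checked directly by comparing cores. In this sketch the item sets are copied by hand from the slides, items are encoded as `(lhs, rhs, dot, lookahead)` tuples, and the helper names are assumptions made for the example.

```python
def core(state):
    """Drop the lookaheads, leaving LR(0) items."""
    return frozenset((lhs, rhs, dot) for lhs, rhs, dot, _lookahead in state)


def reduce_lookaheads(state):
    """Map each lookahead to the left-hand sides that could be reduced on it."""
    table = {}
    for lhs, rhs, dot, lookahead in state:
        if dot == len(rhs):
            table.setdefault(lookahead, set()).add(lhs)
    return table


# States 3 and 4 before the fix, copied from the slides.
STATE_3 = {("T", ("id",), 1, "id"), ("N", ("id",), 1, ":"), ("N", ("id",), 1, ",")}
STATE_4_BEFORE = {("T", ("id",), 1, ","), ("N", ("id",), 1, ":")}
# After adding R -> id bogus, state 4 gains one more item.
STATE_4_AFTER = STATE_4_BEFORE | {("R", ("id", "bogus"), 1, ",")}

# Lookaheads on which the LALR-merged state would have two competing reduces.
MERGED = STATE_3 | STATE_4_BEFORE
CONFLICTS = {la for la, lhss in reduce_lookaheads(MERGED).items() if len(lhss) > 1}
```

Before the fix the two cores are equal and the merged state conflicts on `,`; after the fix `core(STATE_3)` and `core(STATE_4_AFTER)` differ, so an LALR construction would keep the states apart.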
CS780(Prasad) L16LR 47
Notes on Parsing
Parsing
  A solid foundation: context-free grammars
  A simple parser: LL(1)
  A more powerful parser: LR(1)
  An efficiency hack: LALR(1)
  LALR(1) parser generators
Next time we move on to semantic analysis.