l16lr

7/31/2019 L16LR

CS780(Prasad) L16LR 1

LR Parsing

Lecture Notes by

Profs Aiken and Necula (UCB)


Outline

Review of SLR parsing

Limits of SLR parsing

LR parsing

LALR parsing

Implementation of semantic actions

Using parser generators


Review of SLR(1) Parsing

The LR parser maintains a stack ⟨sym1, state1⟩ . . . ⟨symn, staten⟩

staten is the final state of the DFA on sym1 . . . symn

Goto table: the transition function of the DFA

  Goto[i, A] = j if statei --A--> statej

Action table: for each state and terminal:

  Action[i, a] = Shift j | Reduce X → α | Accept | Error


LR Parsing Algorithm

Let I = w$ be the initial input

Let j = 0

Let DFA state 1 have item S' → . S

Let stack = ⟨dummy, 1⟩

repeat
  case action[top_state(stack), I[j]] of
    shift k: push ⟨I[j++], k⟩
    reduce X → α:
      pop |α| pairs,
      push ⟨X, Goto[top_state(stack), X]⟩
    accept: halt normally
    error: halt and report error
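
The loop above can be sketched in Python as follows. This is only an illustration: the toy grammar S' → S, S → ( S ) | x, the state numbering, and the hand-built ACTION and GOTO tables are assumptions made for this example and do not come from the lecture.

```python
# Hand-built SLR(1) tables for the toy grammar  S' -> S ;  S -> ( S ) | x
# (grammar and state numbers are illustrative assumptions).
# A reduce entry stores the left-hand side and the length of the right-hand side.
ACTION = {
    (1, "("): ("shift", 3), (1, "x"): ("shift", 4),
    (2, "$"): ("accept",),
    (3, "("): ("shift", 3), (3, "x"): ("shift", 4),
    (4, ")"): ("reduce", "S", 1), (4, "$"): ("reduce", "S", 1),
    (5, ")"): ("shift", 6),
    (6, ")"): ("reduce", "S", 3), (6, "$"): ("reduce", "S", 3),
}
GOTO = {(1, "S"): 2, (3, "S"): 5}


def lr_parse(tokens):
    """Run the LR loop; return the reductions performed or raise SyntaxError."""
    inp = list(tokens) + ["$"]
    j = 0
    stack = [("dummy", 1)]                      # pairs <symbol, state>
    reductions = []
    while True:
        state = stack[-1][1]
        act = ACTION.get((state, inp[j]), ("error",))
        if act[0] == "shift":
            stack.append((inp[j], act[1]))
            j += 1
        elif act[0] == "reduce":
            _, lhs, rhs_len = act
            del stack[len(stack) - rhs_len:]    # pop |rhs| pairs
            stack.append((lhs, GOTO[(stack[-1][1], lhs)]))
            reductions.append((lhs, rhs_len))
        elif act[0] == "accept":
            return reductions
        else:
            raise SyntaxError(f"unexpected {inp[j]!r} at position {j}")
```

Calling lr_parse("((x))") performs one reduction by S → x followed by two reductions by S → ( S ); an input such as "(x" raises SyntaxError because the table has no entry for state 5 on $.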


Review: Items

An item [X → α . β] says that
  the parser is looking for an X
  it has an α on top of the stack
  it expects to find a string derived from β next in the input

Notes:

[X → α . aβ] means that a should follow. Then we can shift it and still have a viable prefix.

[X → α .] means that we could reduce X
  But this is not always a good idea!


SLR(1) Action Table

For each state si and terminal a

  If si has item X → α . aβ and Goto[i, a] = j then Action[i, a] = shift j

  If si has item S' → S . then Action[i, $] = accept

  If si has item X → α . and a ∈ Follow(X) and X ≠ S' then Action[i, a] = reduce X → α

  Otherwise, Action[i, a] = error


Limits of SLR Parsing

SLR(1) is the simplest LR parsing method.

SLR(1) is almost powerful enough, but some common programming language constructs are not SLR(1).

Consider the grammar

  S → L = E | E
  L → * E | id
  E → L


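Limits of SLR Parsing (Cont.)

Consider two states of the DFA for recognizing viable prefixes; the second is reached from the first on L:

  State 1            State 2
  S' → . S           S → L . = E
  S → . L = E        E → L .
  S → . E
  L → . * E
  L → . id
  E → . L

In the second state, an SLR(1) parser on input = has two possible moves:

  shift (item S → L . = E)
  reduce by E → L (since = ∈ Follow(E))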


What's The Problem?

The grammar is not SLR(1), but why?

Focus on the reduce move in the second state
  We are in the context of S ⇒ E ⇒ L
  No = can follow E in this context
  Even though = ∈ Follow(E) (in S ⇒ L = E ⇒ * E = E)

The reduce move should not happen if an = follows in this context.


What's The Problem? (Cont.)

Problem: the SLR table has too many reduce actions. Using Follow is too coarse.

In any given context, only some elements of Follow can actually follow a non-terminal.

For example: Follow(E) = {=, $}, but
  In context S ⇒ E only $ can follow E
  In context S ⇒ L = E ⇒ * E1 = E only = can follow E1
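
As a check on Follow(E) = {=, $}, here is a minimal Python sketch of the usual fixed-point computation of First and Follow for this grammar. The function names and the dictionary encoding of the grammar are choices made for this example, and the code assumes a grammar without ε-productions (true here), so it is not a general implementation.

```python
# Grammar from the slides:  S -> L = E | E ;  L -> * E | id ;  E -> L
GRAMMAR = {
    "S": [["L", "=", "E"], ["E"]],
    "L": [["*", "E"], ["id"]],
    "E": [["L"]],
}


def first_sets(g):
    """First set of each non-terminal; assumes no epsilon productions."""
    first = {nt: set() for nt in g}
    changed = True
    while changed:
        changed = False
        for nt, prods in g.items():
            for rhs in prods:
                head = rhs[0]
                new = first[head] if head in g else {head}
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def follow_sets(g, start):
    """Follow set of each non-terminal; assumes no epsilon productions."""
    first = first_sets(g)
    follow = {nt: set() for nt in g}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for nt, prods in g.items():
            for rhs in prods:
                for i, sym in enumerate(rhs):
                    if sym not in g:
                        continue                  # terminals have no Follow set
                    if i + 1 < len(rhs):
                        nxt = rhs[i + 1]
                        new = first[nxt] if nxt in g else {nxt}
                    else:
                        new = follow[nt]          # sym ends the production
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FOLLOW = follow_sets(GRAMMAR, "S")
# In the state {S -> L . = E, E -> L .} SLR shifts on "=" and reduces on Follow(E):
slr_conflict = {"="} & FOLLOW["E"]
```

Because "=" lands in Follow(E), slr_conflict is non-empty, which is exactly the shift/reduce conflict discussed above.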


One Way to Fix The Problem: LR(1) Items

Idea: refine Follow based on context.

The context is described through items.

An LR(1) item is a pair

  [X → α . β, a]

where X → αβ is a production and a is the lookahead token or $

LR(k) is similar but with k tokens of lookahead
  In practice, k = 1


LR(1) Items: Intuition

[X → α . β, a] describes a state of the parser:
  We are trying to find an X, and
  We have α already on top of the stack, and
  We expect to see a prefix derived from βa

Back to reduce actions: have an [X → α ., a]
  Perform the reduce only if the next token is a!
  Will have fewer reduce actions
  Not for all b ∈ Follow(X)


Constructing Sets of LR(1) Items (1)

Similar to the construction for LR(0).

The states of the NFA are the LR(1) items of G.

The start state is [S' → . S, $]


Constructing Sets of LR(1) Items (2)

1. For each LR(1) item [Y → α . Xβ, a]
     Add an X-transition
       [Y → α . Xβ, a] --X--> [Y → αX . β, a]

2. For each LR(1) item [Y → α . Xβ, a]
     For each production X → γ
       For each terminal b ∈ First(βa)
         Add an ε-transition [Y → α . Xβ, a] --ε--> [X → . γ, b]
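
The two rules can be tried out with a short Python sketch. It builds DFA states directly by taking ε-closures (rule 2) and moving the dot (rule 1), which is the subset construction applied to the NFA described above. The item encoding (lhs, rhs, dot, lookahead), the function names, and the simplified first_of (valid only because this grammar has no ε-productions) are assumptions of this example.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("L", "=", "E"), ("E",)],
    "L": [("*", "E"), ("id",)],
    "E": [("L",)],
}


def first_of(symbols):
    """First of a symbol string. With no epsilon productions only the
    first symbol of the string matters."""
    seen, result, todo = set(), set(), [symbols[0]]
    while todo:
        sym = todo.pop()
        if sym not in GRAMMAR:
            result.add(sym)                      # terminal or $
        elif sym not in seen:
            seen.add(sym)
            todo.extend(rhs[0] for rhs in GRAMMAR[sym])
    return result


def closure(items):
    """Rule 2: follow epsilon-transitions until nothing new is added."""
    result, todo = set(items), list(items)
    while todo:
        lhs, rhs, dot, la = todo.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for b in first_of(rhs[dot + 1:] + (la,)):
                for gamma in GRAMMAR[rhs[dot]]:
                    item = (rhs[dot], gamma, 0, b)
                    if item not in result:
                        result.add(item)
                        todo.append(item)
    return frozenset(result)


def goto(state, symbol):
    """Rule 1: move the dot over `symbol`, then close the result."""
    moved = {(l, r, d + 1, a) for (l, r, d, a) in state
             if d < len(r) and r[d] == symbol}
    return closure(moved)


start = closure({("S'", ("S",), 0, "$")})
after_L = goto(start, "L")
# Lookaheads on which the state reached on L may reduce:
reduce_lookaheads = {la for (lhs, rhs, dot, la) in after_L if dot == len(rhs)}
```

For this grammar the state reached on L contains only [S → L . = E, $] and [E → L ., $], so the reduce by E → L is allowed on $ but not on =.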


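NFA for Viable Prefixes in Detail (1)

[Figure: NFA fragment. Items shown, written as [item, lookahead]:]

  [S' → . S, $]
  [S → . E, $]
  [S → . L = E, $]
  [S' → S ., $]

Edge label shown: S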


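NFA for Viable Prefixes in Detail (2)

[Figure: NFA fragment. Items shown, written as [item, lookahead]:]

  [S' → . S, $]
  [S' → S ., $]
  [S → . L = E, $]
  [S → . E, $]
  [L → . * E, =]
  [L → . id, =]
  [S → L . = E, $]

Edge labels shown: S, L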


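NFA for Viable Prefixes in Detail (3)

[Figure: NFA fragment. Items shown, written as [item, lookahead]:]

  [S' → . S, $]
  [S' → S ., $]
  [S → . L = E, $]
  [S → . E, $]
  [L → . id, =]
  [L → . * E, =]
  [S → L . = E, $]
  [E → . L, $]

Edge labels shown: S, L, E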


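NFA for Viable Prefixes in Detail (4)

[Figure: NFA fragment. Items shown, written as [item, lookahead]:]

  [S' → . S, $]
  [S' → S ., $]
  [S → . L = E, $]
  [S → . E, $]
  [E → . L, $]
  [L → . id, =]
  [L → . * E, =]
  [S → L . = E, $]
  [E → L ., $]
  [L → . id, $]
  [L → . * E, $]

Edge labels shown: S, L, L, E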


An Example Revisited

Consider the state from the last slide

  [S → L . = E, $]
  [E → L ., $]

An LR(1) parser on input = can only shift (item S → L . = E), because the reduce item E → L . now has $ as its only lookahead.


Constructing LR(1) Parsing Tables

1. Add a dummy S' → S production

2. Construct the NFA of LR(1) items as before

3. Convert the NFA into a DFA

4. Goto is defined exactly as before:

   Goto[i, A] = j if statei --A--> statej

   (the transition function of the DFA)


Constructing LR(1) Parsing Tables (Cont.)

5. For each state si of the DFA and terminal a

   If si has item [X → α . aβ, c] and Goto[i, a] = j then
     action[i, a] = shift j

   If si has item [X → α ., a] and X ≠ S' then
     action[i, a] = reduce X → α

   If si has item [S' → S ., $] then
     action[i, $] = accept

   Otherwise,
     action[i, a] = error

A grammar is LR(1) exactly when action[i, a] is uniquely defined.


LALR Parsing

Two bottom-up parsing methods: SLR and LR

Which one do we use? Neither
  SLR is not powerful enough.
  LR parsing tables are too big (1000s of states vs. 100s of states for SLR).

In practice, use LALR(1)
  Stands for Look-Ahead LR
  A compromise between SLR(1) and LR(1)


LALR Parsing (Cont.)

Rough intuition: an LALR(1) parser for G has
  The number of states of an SLR parser.
  Some of the lookahead discrimination of LR(1).

Idea: construct the DFA for the LR(1) items. Then merge the DFA states whose items differ only in the lookahead tokens
  We say that such states have the same core.


The Core of a Set of LR Items

Definition: The core of a set of LR items is the set of first components.

Example: the core of

  { [X → α . β, b], [Y → γ . δ, d] }

is

  { X → α . β, Y → γ . δ }

The core of an LR item is an LR(0) item.


A LALR(1) DFA

Repeat until all states have distinct cores
  Choose two distinct states with the same core.
  Merge the states by creating a new one with the union of all the items.
  Point edges from predecessors to the new state.
  The new state points to all the previous successors.

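[Figure: a DFA with states A, B, C, D, E, F before merging, and the same DFA after states B and E have been merged into a single state BE.]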


The LALR Parser Can Have Conflicts

Consider for example the LR(1) states

  { [X → α ., a], [Y → β ., b] }
  { [X → α ., b], [Y → β ., a] }

And the merged LALR(1) state

  { [X → α ., a/b], [Y → β ., a/b] }

It has a new reduce-reduce conflict.

In practice such cases are rare.
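
The merge and the resulting conflict can be reproduced with a small Python sketch. The item encoding (lhs, rhs, dot, lookahead), the placeholder right-hand sides "alpha" and "beta", and the function names are assumptions of this example; a real generator would also redirect the DFA edges, which is omitted here.

```python
from collections import defaultdict


def core(state):
    """Drop the lookahead from every item (lhs, rhs, dot, lookahead)."""
    return frozenset((lhs, rhs, dot) for lhs, rhs, dot, _ in state)


def merge_by_core(states):
    """Union the items of all states that share a core."""
    merged = defaultdict(set)
    for state in states:
        merged[core(state)] |= set(state)
    return [frozenset(items) for items in merged.values()]


def reduce_reduce_conflicts(state):
    """Lookaheads on which two different complete items both call for a reduce."""
    by_lookahead = defaultdict(set)
    for lhs, rhs, dot, la in state:
        if dot == len(rhs):                      # dot at the end: a reduce item
            by_lookahead[la].add((lhs, rhs))
    return {la for la, prods in by_lookahead.items() if len(prods) > 1}


# The two LR(1) states from the slide, with placeholder right-hand sides.
s1 = frozenset({("X", ("alpha",), 1, "a"), ("Y", ("beta",), 1, "b")})
s2 = frozenset({("X", ("alpha",), 1, "b"), ("Y", ("beta",), 1, "a")})
merged_states = merge_by_core([s1, s2])
```

Neither s1 nor s2 has a conflict on its own, but the single merged state reports reduce/reduce conflicts on both a and b.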


LALR vs. LR Parsing

LALR languages are not natural. They are an efficiency hack on LR languages.

Any reasonable programming language has an LALR(1) grammar.

LALR(1) has become a standard for programming languages and for parser generators.


A Hierarchy of Grammar Classes


Semantic Actions

We can now illustrate how semantic actions are implemented for LR parsing.

Keep attributes on the stack.

On shift a, push the attribute for a on the stack.

On reduce X → α
  pop the attributes for α
  compute the attribute for X
  and push it on the stack


Performing Semantic Actions: Example

Recall the example from an earlier lecture

  E → T + E1      { E.val = T.val + E1.val }
    | T           { E.val = T.val }
  T → int * T1    { T.val = int.val * T1.val }
    | int         { T.val = int.val }

Consider the parsing of the string 3 * 5 + 8


Performing Semantic Actions: Example

Stack (subscript = attribute value) | remaining input, then the action taken:

  | int * int + int            shift
  int₃ | * int + int           shift
  int₃ * | int + int           shift
  int₃ * int₅ | + int          reduce T → int
  int₃ * T₅ | + int            reduce T → int * T
  T₁₅ | + int                  shift
  T₁₅ + | int                  shift
  T₁₅ + int₈ |                 reduce T → int
  T₁₅ + T₈ |                   reduce E → T
  T₁₅ + E₈ |                   reduce E → T + E
  E₂₃ |                        accept
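
The trace can be replayed in Python to see the attribute stack at work. In this sketch the sequence of parser moves is copied from the trace rather than derived from a parsing table, and the names PRODUCTIONS, TOKENS, ACTIONS and run are assumptions of the example.

```python
# Each production maps to (left-hand side, number of RHS symbols, semantic action).
PRODUCTIONS = {
    "E -> T + E":   ("E", 3, lambda t, _plus, e: t + e),
    "E -> T":       ("E", 1, lambda t: t),
    "T -> int * T": ("T", 3, lambda i, _star, t: i * t),
    "T -> int":     ("T", 1, lambda i: i),
}

# Tokens of 3 * 5 + 8 with their attributes (operators carry none).
TOKENS = [("int", 3), ("*", None), ("int", 5), ("+", None), ("int", 8)]

# Parser moves, copied from the trace above.
ACTIONS = [
    "shift", "shift", "shift", "T -> int", "T -> int * T",
    "shift", "shift", "T -> int", "E -> T", "E -> T + E", "accept",
]


def run(tokens, actions):
    """Replay shift/reduce moves while keeping attributes on the stack."""
    stack = []                                   # pairs (symbol, attribute)
    pos = 0
    for act in actions:
        if act == "shift":
            stack.append(tokens[pos])            # push the token's attribute
            pos += 1
        elif act == "accept":
            return stack[-1][1]
        else:
            lhs, n, compute = PRODUCTIONS[act]
            args = [val for _, val in stack[-n:]]
            del stack[-n:]                       # pop attributes of the RHS
            stack.append((lhs, compute(*args)))  # push attribute of the LHS


result = run(TOKENS, ACTIONS)
```

Here result is 23, the value attached to E in the last line of the trace.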


Notes

The previous discussion shows how synthesized attributes are computed by LR parsers.

It is also possible to compute inherited attributes in an LR parser.


Using Parser Generators

Most common parser generators are LALR(1).

A parser generator constructs an LALR(1) table.

And reports an error when a table entry is multiply defined:
  A shift and a reduce. Called a shift/reduce conflict
  Multiple reduces. Called a reduce/reduce conflict

An ambiguous grammar will generate conflicts.

What do we do in that case?


Shift/Reduce Conflicts

Typically due to ambiguities in the grammar.

Classic example: the dangling else

  S → if E then S | if E then S else S | OTHER

Will have a DFA state containing

  [S → if E then S ., else]
  [S → if E then S . else S, x]

If else follows, then we can shift or reduce.

Default (bison, CUP, etc.) is to shift
  The default behavior is as needed in this case.


More Shift/Reduce Conflicts

Consider the ambiguous grammar

  E → E + E | E * E | int

We will have the states containing

  [E → E * . E, +]                [E → E * E ., +]
  [E → . E + E, +]    --E-->      [E → E . + E, +]

Again we have a shift/reduce conflict on input +
  We need to reduce (* binds more tightly than +)

Recall the solution: declare the precedence of * and +


More Shift/Reduce Conflicts

In bison, declare precedence and associativity:

  %left '+'
  %left '*'

Precedence of a rule = that of its last terminal
  See the bison manual for ways to override this default.

Resolve a shift/reduce conflict with a shift if:
  no precedence is declared for either the rule or the terminal
  the input terminal has higher precedence than the rule
  the precedences are the same and right associative
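
The three rules can be written as a small decision function. This Python sketch follows only the rules listed on the slide; the function name, the integer encoding of precedence levels, and the use of None for "undeclared" are assumptions of the example, and cases the slide does not cover (such as non-associative operators) are left out.

```python
def resolve(rule_prec, token_prec, token_assoc):
    """Pick 'shift' or 'reduce' for a shift/reduce conflict.

    rule_prec, token_prec: int precedence level (higher binds tighter),
                           or None when no precedence was declared.
    token_assoc:           'left' or 'right' for the input terminal.
    """
    if rule_prec is None or token_prec is None:
        return "shift"                      # nothing declared: default to shift
    if token_prec > rule_prec:
        return "shift"                      # terminal binds tighter than the rule
    if token_prec == rule_prec and token_assoc == "right":
        return "shift"                      # same level, right associative
    return "reduce"


# Levels for the declarations  %left '+'  then  %left '*'  (later = higher).
PLUS, TIMES = 1, 2

after_times_sees_plus = resolve(TIMES, PLUS, "left")   # rule E -> E * E, input +
after_plus_sees_plus = resolve(PLUS, PLUS, "left")     # rule E -> E + E, input +
after_plus_sees_times = resolve(PLUS, TIMES, "left")   # rule E -> E + E, input *
```

With these declarations the first two conflicts are resolved by reducing and the third by shifting, so * binds more tightly than + and + associates to the left.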


Using Precedence to Solve S/R Conflicts

Back to our example:

  [E → E * . E, +]                [E → E * E ., +]
  [E → . E + E, +]    --E-->      [E → E . + E, +]

Will choose reduce because the precedence of rule E → E * E is higher than that of terminal +


Using Precedence to Solve S/R Conflicts

Same grammar as before

  E → E + E | E * E | int

We will also have the states

  [E → E + . E, +]                [E → E + E ., +]
  [E → . E + E, +]    --E-->      [E → E . + E, +]

Now we also have an S/R conflict on input +
  We choose reduce because E → E + E and + have the same precedence and + is left-associative.


Using Precedence to Solve S/R Conflicts

Back to our dangling else example

  [S → if E then S ., else]
  [S → if E then S . else S, x]

Can eliminate the conflict by declaring else with higher precedence than then.

But this starts to look like hacking the tables.

Best to avoid overuse of precedence declarations, or you'll end up with unexpected parse trees.


Reduce/Reduce Conflicts

Usually due to gross ambiguity in the grammar

Example: a sequence of identifiers

  S → ε | id | id S

There are two parse trees for the string id

  S ⇒ id
  S ⇒ id S ⇒ id

How does this confuse the parser?


More on Reduce/Reduce Conflicts

Consider the states

  State 1                State 2 (from state 1 on id)
  [S' → . S, $]          [S → id ., $]
  [S → ., $]             [S → id . S, $]
  [S → . id, $]          [S → ., $]
  [S → . id S, $]        [S → . id, $]
                         [S → . id S, $]

Reduce/reduce conflict on input $

  S' ⇒ S ⇒ id
  S' ⇒ S ⇒ id S ⇒ id

Better to rewrite the grammar: S → ε | id S


Strange Reduce/Reduce Conflicts

Consider the grammar

  S → P R ,
  NL → N | N , NL
  P → T | NL : T
  R → T | N : T
  N → id
  T → id

P - parameters specification
R - result specification
N - a parameter or result name
T - a type name
NL - a list of names


Strange Reduce/Reduce Conflicts

In P an id is a
  N when followed by , or :
  T when followed by id

In R an id is a
  N when followed by :
  T when followed by ,

This is an LR(1) grammar.

But it is not LALR(1). Why?
  For obscure reasons


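A Few LR(1) States

State 1:
  [P → . T, id]
  [P → . NL : T, id]
  [NL → . N, :]
  [NL → . N , NL, :]
  [N → . id, :]
  [N → . id, ,]
  [T → . id, id]

State 2:
  [R → . T, ,]
  [R → . N : T, ,]
  [T → . id, ,]
  [N → . id, :]

State 3 (from state 1 on id):
  [T → id ., id]
  [N → id ., :]
  [N → id ., ,]

State 4 (from state 2 on id):
  [T → id ., ,]
  [N → id ., :]

LALR merge of states 3 and 4:
  [T → id ., id/,]
  [N → id ., :/,]

LALR reduce/reduce conflict on ,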


What Happened?

Two distinct states were confused because they have the same core.

Fix: add dummy productions to distinguish the two confused states.

E.g., add

  R → id bogus

bogus is a terminal not used by the lexer.

This production will never be used during parsing.

But it distinguishes R from P.


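A Few LR(1) States After Fix

State 1:
  [P → . T, id]
  [P → . NL : T, id]
  [NL → . N, :]
  [NL → . N , NL, :]
  [N → . id, :]
  [N → . id, ,]
  [T → . id, id]

State 2:
  [R → . T, ,]
  [R → . N : T, ,]
  [R → . id bogus, ,]
  [T → . id, ,]
  [N → . id, :]

State 3 (from state 1 on id):
  [T → id ., id]
  [N → id ., :]
  [N → id ., ,]

State 4 (from state 2 on id):
  [T → id ., ,]
  [N → id ., :]
  [R → id . bogus, ,]

Different cores, so no LALR merging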


Notes on Parsing

Parsing
  A solid foundation: context-free grammars
  A simple parser: LL(1)
  A more powerful parser: LR(1)
  An efficiency hack: LALR(1)
  LALR(1) parser generators

Next time we move on to semantic analysis.
