csci 3130: automata theory and formal languages

CSCI 3130: Automata theory and formal languages. Andrej Bogdanov, http://www.cse.cuhk.edu.hk/~andrejb/csc3130. The Chinese University of Hong Kong. LR(1) grammars. Fall 2010.


Page 1: CSCI 3130: Automata theory and formal languages

CSCI 3130: Automata theory and formal languages

Andrej Bogdanov

http://www.cse.cuhk.edu.hk/~andrejb/csc3130

The Chinese University of Hong Kong

LR(1) grammars

Fall 2010

Page 2: CSCI 3130: Automata theory and formal languages

LR(0) parsing review

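A → aAb | ab

CFG G --> parser generator --> "PDA" for parsing G, or error if G is not LR(0)

States of the parser for this grammar:

1: A → •aAb, A → •ab
2: A → a•Ab, A → a•b, A → •aAb, A → •ab
3: A → aA•b
4: A → aAb•
5: A → ab•

Transitions: 1 --a--> 2, 2 --a--> 2, 2 --A--> 3, 2 --b--> 5, 3 --b--> 4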

Motivation: Fast parsing for programming languages

Page 3: CSCI 3130: Automata theory and formal languages

Parsing computer programs

if (n == 0) { return x; } else { return x + 1; }

Parse tree (figure):

Statement → if ParExpression Statement else Statement
ParExpression → ( Expression )
Statement → Block → ...

Most programming language CFGs are not LR(0)!

Page 4: CSCI 3130: Automata theory and formal languages

LR(0) parsing review

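A → aAb | ab

input: a a b b

States: 1: A → •aAb, A → •ab; 2: A → a•Ab, A → a•b, A → •aAb, A → •ab; 3: A → aA•b; 4: A → aAb•; 5: A → ab•

Transitions: 1 --a--> 2, 2 --a--> 2, 2 --A--> 3, 2 --b--> 5, 3 --b--> 4

stack   state   action
        1       S
1       2       S
12      2       S
122     5       R
12      3       S
123     4       R

Resulting parse tree: A → aAb, with the inner A → ab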
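The stack/state/action trace above can be replayed with a small table-driven loop. The following is a minimal sketch, not the course's implementation: the dictionaries `GOTO` and `REDUCE` and the function `parse` are names chosen here for illustration, and the state numbers 1 to 5 are read off the DFA for A → aAb | ab.

```python
# Sketch of an LR(0) shift-reduce parser for A -> aAb | ab.
# State numbers follow the five-state DFA; the table layout is an assumption.

# DFA transitions on terminals and on the variable A.
GOTO = {
    (1, "a"): 2,
    (2, "a"): 2,
    (2, "b"): 5,
    (2, "A"): 3,
    (3, "b"): 4,
}

# States holding a complete item, mapped to (head, length of body).
REDUCE = {
    4: ("A", 3),  # A -> aAb.
    5: ("A", 2),  # A -> ab.
}


def parse(word):
    """Return the (stack, state, action) steps, or None if word is rejected."""
    path = [1]   # all states visited so far; the last one is the current state
    pos = 0      # number of input symbols read
    trace = []
    while True:
        state = path[-1]
        stack = "".join(str(s) for s in path[:-1])
        if state in REDUCE:
            head, length = REDUCE[state]
            trace.append((stack, state, "R"))
            del path[-length:]                 # pop one state per body symbol
            nxt = GOTO.get((path[-1], head))
            if nxt is None:
                # No move on A from state 1: accept only if all input is read.
                return trace if path == [1] and pos == len(word) else None
            path.append(nxt)
        else:
            if pos == len(word):
                return None                    # input ended before a reduce
            nxt = GOTO.get((state, word[pos]))
            if nxt is None:
                return None                    # no transition: parse error
            trace.append((stack, state, "S"))
            path.append(nxt)
            pos += 1


for row in parse("aabb"):
    print(row)
```

Running `parse("aabb")` yields the six rows of the table above; strings such as `"aab"` or `"abb"` return `None`.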

Page 5: CSCI 3130: Automata theory and formal languages

Meaning of LR(0) items

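A → α•Xβ
(figure: parse tree rooted at A; the • marks the focus, and the part of the tree after the focus is still undiscovered)

NFA transitions to:

X → •γ      shift focus to subtree rooted at X (if X is nonterminal)
A → αX•β    move past subtree rooted at X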

Page 6: CSCI 3130: Automata theory and formal languages

Outline of LR(0) parsing algorithm

• LR(0) parser has two kinds of actions:

  shift (S): no complete item is valid
  reduce (R): there is one valid item, and it is complete

• What if:

  some valid items complete, some not: S / R conflict
  more than one valid complete item: R / R conflict
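The four cases above amount to counting complete and incomplete items in the current set of valid items. A minimal sketch follows; the tuple encoding `(head, body, dot)` and the function name `classify` are assumptions made for this illustration.

```python
# An LR(0) item is encoded as (head, body, dot): the dot sits before body[dot].
# An item is complete when the dot is at the end of the body.

def classify(items):
    """Return the labels from the outline that apply to a set of valid items."""
    complete = [it for it in items if it[2] == len(it[1])]
    incomplete = [it for it in items if it[2] < len(it[1])]
    if not complete:
        return ["shift"]
    if len(complete) == 1 and not incomplete:
        return ["reduce"]
    labels = []
    if incomplete:
        labels.append("S/R conflict")   # some valid items complete, some not
    if len(complete) > 1:
        labels.append("R/R conflict")   # more than one valid complete item
    return labels


# Items A -> a.Ab, A -> a.b, A -> .aAb, A -> .ab: nothing is complete.
print(classify({("A", "aAb", 1), ("A", "ab", 1), ("A", "aAb", 0), ("A", "ab", 0)}))
# Single complete item A -> ab.
print(classify({("A", "ab", 2)}))
# Two complete items plus an incomplete one.
print(classify({("A", "a", 1), ("B", "a", 1), ("B", "ab", 1)}))
```

A grammar is LR(0) exactly when no reachable item set gets a conflict label from a check of this kind.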

Page 7: CSCI 3130: Automata theory and formal languages

Hierarchy of context-free grammars

(nested classes, from largest to smallest)

context-free grammars: CYK algorithm (slow)
LR(1) grammars: allow some conflicts; conflicts can be resolved by lookahead
LR(0) grammars: LR(0) parsing algorithm

Page 8: CSCI 3130: Automata theory and formal languages

A CFG that is not LR(0)

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

input: a (nothing read yet)

valid LR(0) items (updated as the input is read):
S → •A, S → •Bc, A → •aA, A → •a, B → •a, B → •ab

Page 9: CSCI 3130: Automata theory and formal languages

A CFG that is not LR(0)

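S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

input: a (first a read)

valid LR(0) items:
A → a•A, A → a•, B → a•, B → a•b, A → •aA, A → •a

possible actions: R(4), R(5), S(6)

S/R, R/R conflicts!

possible parse trees (figure): the a just read may come from A → a, A → aA, B → a, or B → ab

peek inside! (look at the next input symbol)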

Page 10: CSCI 3130: Automata theory and formal languages

Lookahead

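S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

input: a a (first a read; peek inside at the second a)

valid LR(0) items:
A → a•A, A → a•, B → a•, B → a•b, A → •aA, A → •a

parse tree must look like this (figure): S → A, A → aA, with the second a starting the inner A

action: shift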

Page 11: CSCI 3130: Automata theory and formal languages

Lookahead

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

input: a a a (two a's read; peek inside at the third a)

valid LR(0) items:
A → a•A, A → a•, A → •aA, A → •a

parse tree must look like this (figure): S → A, A → aA, inner A → aA

action: shift

Page 12: CSCI 3130: Automata theory and formal languages

Lookahead

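S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

input: a a a (all three a's read; nothing left to peek at)

valid LR(0) items:
A → a•A, A → a•, A → •aA, A → •a

parse tree must look like this (figure): S → A, A → aA, inner A → aA, innermost A → a

action: reduce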

Page 13: CSCI 3130: Automata theory and formal languages

LR(0) items vs. LR(1) items

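A → aAb | ab

LR(0) item: A → a•Ab

LR(1) item: [A → a•Ab, b]

(figure: the same parse tree drawn once for each kind of item, with the focus • marked; the LR(1) item also records the symbol b)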

Page 14: CSCI 3130: Automata theory and formal languages

LR(1) items

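[A → α•β, x]        [A → α•β, ε]

(figure: two parse trees with a subtree rooted at A and the focus • inside it; in the first, x is the input symbol right after that subtree; in the second, nothing follows it)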

Page 15: CSCI 3130: Automata theory and formal languages

Generating an LR(1) parser

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

CFG --> NFA (states are LR(1) items) --> DFA + stack (may have S/R, R/R conflicts)

A CFG is LR(1) if conflicts can always be resolved with one symbol lookahead

Page 16: CSCI 3130: Automata theory and formal languages

NFA for LR(0) parsing

notation: a, b: terminals; A, B, C: variables; α, β, γ: mixed strings; X: terminal or variable

q0 --ε--> S → •α
For every LR(0) item S → •α

A → α•Xβ --X--> A → αX•β
For every LR(0) item A → α•Xβ

A → α•Cβ --ε--> C → •γ
For every pair of LR(0) items A → α•Cβ, C → •γ
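The three transition rules can be simulated directly to compute the valid LR(0) items after reading a prefix. The sketch below does this for the grammar S → A | Bc, A → aA | a, B → a | ab; the item encoding `(head, body, dot)`, the helper names, and the use of `.` for the dot in printed items are assumptions of this illustration.

```python
# Grammar S -> A | Bc, A -> aA | a, B -> a | ab; variables are the dict keys.
GRAMMAR = {
    "S": ["A", "Bc"],
    "A": ["aA", "a"],
    "B": ["a", "ab"],
}
START = "S"


def eps_moves(item):
    """Rule 3: from A -> alpha.C beta go to C -> .gamma for every C -> gamma."""
    head, body, dot = item
    if dot < len(body) and body[dot] in GRAMMAR:
        return [(body[dot], gamma, 0) for gamma in GRAMMAR[body[dot]]]
    return []


def eps_closure(items):
    """All items reachable from `items` by epsilon-transitions."""
    closure, todo = set(items), list(items)
    while todo:
        for nxt in eps_moves(todo.pop()):
            if nxt not in closure:
                closure.add(nxt)
                todo.append(nxt)
    return closure


def step(items, symbol):
    """Rule 2: move the dot past `symbol`, then follow epsilon-transitions."""
    moved = {(h, b, d + 1) for (h, b, d) in items if d < len(b) and b[d] == symbol}
    return eps_closure(moved)


def valid_items(prefix):
    """Rule 1 starts from q0; then read the prefix one symbol at a time."""
    current = eps_closure({(START, alpha, 0) for alpha in GRAMMAR[START]})
    for symbol in prefix:
        current = step(current, symbol)
    return current


def show(item):
    head, body, dot = item
    return f"{head} -> {body[:dot]}.{body[dot:]}"


print(sorted(show(it) for it in valid_items("a")))
```

For the prefix `"a"` this produces the six items A → a•A, A → a•, B → a•, B → a•b, A → •aA, A → •a, two of which are complete, which is the source of the conflicts.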

Page 17: CSCI 3130: Automata theory and formal languages

NFA for LR(1) parsing

notation: a, b: terminals; A, B, C: variables; α, β, γ: mixed strings; X: terminal or variable

q0 --ε--> [S → •α, ε]
For every item S → •α

[A → α•Xβ, x] --X--> [A → αX•β, x]
For every LR(1) item [A → α•Xβ, x]

[A → α•Cβ, x] --ε--> [C → •γ, y]
For every LR(1) item [A → α•Cβ, x] and production C → γ and every y in FIRST(βx)
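As a sketch of how the LR(1) rules differ from the LR(0) ones, the code below computes the moves out of a single LR(1) item for the grammar S → A | Bc, A → aA | a, B → a | ab. Several things are assumptions of this illustration: the item encoding `(head, body, dot, lookahead)`, the marker `"$"` standing in for the ε lookahead, and a simplified `first` that relies on this grammar having no ε-productions.

```python
GRAMMAR = {
    "S": ["A", "Bc"],
    "A": ["aA", "a"],
    "B": ["a", "ab"],
}
END = "$"  # stands in for the epsilon lookahead (an assumption of this sketch)


def first(symbols):
    """FIRST of a nonempty string; only its first symbol matters here
    because no variable of this grammar derives the empty string."""
    result, seen, todo = set(), set(), [symbols[0]]
    while todo:
        x = todo.pop()
        if x not in GRAMMAR:
            result.add(x)                      # terminal or the end marker
        elif x not in seen:
            seen.add(x)
            todo.extend(body[0] for body in GRAMMAR[x])
    return result


def x_move(item):
    """[A -> alpha.X beta, x] --X--> [A -> alpha X.beta, x]; None if complete."""
    head, body, dot, look = item
    if dot == len(body):
        return None
    return body[dot], (head, body, dot + 1, look)


def eps_moves(item):
    """[A -> alpha.C beta, x] --eps--> [C -> .gamma, y] for y in FIRST(beta x)."""
    head, body, dot, look = item
    if dot == len(body) or body[dot] not in GRAMMAR:
        return set()
    c, beta = body[dot], body[dot + 1:]
    return {(c, gamma, 0, y) for gamma in GRAMMAR[c] for y in first(beta + look)}


print(eps_moves(("S", "Bc", 0, END)))   # lookahead becomes c
print(eps_moves(("S", "A", 0, END)))    # lookahead stays the end marker
print(x_move(("A", "aA", 0, END)))
```

The lookahead is copied unchanged on X-moves and recomputed from FIRST(βx) on ε-moves, which is why the B-items reached from [S → •Bc, ε] carry the lookahead c.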

Page 18: CSCI 3130: Automata theory and formal languages

Explaining the transitions

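[A → α•Xβ, x] --X--> [A → αX•β, x]
(figure: parse tree rooted at A and followed by x; the focus • moves past the subtree rooted at X)

[A → α•Cβ, x] --ε--> [C → •γ, y], y ∈ FIRST(βx)
(figure: parse tree rooted at A and followed by x; the focus • moves into the subtree rooted at C, and y is the symbol that follows that subtree)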

Page 19: CSCI 3130: Automata theory and formal languages

FIRST sets

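FIRST(α) are all leftmost terminals in derivations α ⇒ ...

S → A (1) | cB (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

[A → α•Cβ, x] --ε--> [C → •γ, y]
For every y in FIRST(βx)

Table of examples (column layout lost in extraction):
α: a, A, S, cABA (split between entries unclear), and the empty string
FIRST(α): {a}, {a}, {a, c}, {c}, {a}, ∅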
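FIRST sets for the variables can be computed by a fixed-point iteration. The following sketch uses this slide's grammar S → A | cB, A → aA | a, B → a | ab and assumes, as holds for this grammar, that no production has an empty right-hand side; the function names are chosen here for illustration.

```python
# Grammar from this slide (note S -> A | cB); variables are the dict keys.
GRAMMAR = {
    "S": ["A", "cB"],
    "A": ["aA", "a"],
    "B": ["a", "ab"],
}


def first_of_variables(grammar):
    """Iterate until no FIRST set grows. Assumes no empty right-hand sides,
    so only the first symbol of each body contributes."""
    first = {v: set() for v in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                x = body[0]
                new = first[x] if x in grammar else {x}
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


FIRST_VAR = first_of_variables(GRAMMAR)


def first(alpha):
    """FIRST of a string of terminals and variables; empty for the empty string."""
    if alpha == "":
        return set()
    x = alpha[0]
    return set(FIRST_VAR[x]) if x in GRAMMAR else {x}


for alpha in ["a", "A", "S", "cAB", ""]:
    print(repr(alpha), sorted(first(alpha)))
```

With ε-productions the computation would also have to look past nullable symbols; that case does not arise for the grammars in these slides.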

Page 20: CSCI 3130: Automata theory and formal languages

Example: Constructing the NFA

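S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

q0 --ε--> [S → •A, ε]
q0 --ε--> [S → •Bc, ε]

[S → •A, ε] --A--> [S → A•, ε]
[S → •A, ε] --ε--> [A → •aA, ε]
[S → •A, ε] --ε--> [A → •a, ε]

[S → •Bc, ε] --B--> [S → B•c, ε]
[S → •Bc, ε] --ε--> [B → •a, c]
[S → •Bc, ε] --ε--> [B → •ab, c]

. . .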

Page 21: CSCI 3130: Automata theory and formal languages

Example: Constructing the NFA

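S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

q0 --ε--> [S → •A, ε]
q0 --ε--> [S → •Bc, ε]

[S → •A, ε] --A--> [S → A•, ε]
[S → •A, ε] --ε--> [A → •aA, ε]
[S → •A, ε] --ε--> [A → •a, ε]

[S → •Bc, ε] --B--> [S → B•c, ε] --c--> [S → Bc•, ε]
[S → •Bc, ε] --ε--> [B → •a, c]
[S → •Bc, ε] --ε--> [B → •ab, c]

[A → •aA, ε] --a--> [A → a•A, ε] --A--> [A → aA•, ε]
[A → a•A, ε] --ε--> [A → •aA, ε]
[A → a•A, ε] --ε--> [A → •a, ε]
[A → •a, ε] --a--> [A → a•, ε]

[B → •a, c] --a--> [B → a•, c]
[B → •ab, c] --a--> [B → a•b, c] --b--> [B → ab•, c]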

Page 22: CSCI 3130: Automata theory and formal languages

Example: Convert NFA to DFA

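S → A | Bc
A → aA | a
B → a | ab

DFA states (sets of LR(1) items):

1: [S → •A, ε], [S → •Bc, ε], [A → •aA, ε], [A → •a, ε], [B → •a, c], [B → •ab, c]
2: [A → a•A, ε], [A → a•, ε], [B → a•, c], [B → a•b, c], [A → •aA, ε], [A → •a, ε]
3: [A → a•A, ε], [A → a•, ε], [A → •aA, ε], [A → •a, ε]
6: [S → B•c, ε]
7: [S → Bc•, ε]
8: [B → ab•, c]
The remaining two states, numbered 4 and 5 in the figure, are {[S → A•, ε]} and {[A → aA•, ε]}.

Transitions:

1 --a--> 2, 1 --A--> {[S → A•, ε]}, 1 --B--> 6
2 --a--> 3, 2 --b--> 8, 2 --A--> {[A → aA•, ε]}
3 --a--> 3, 3 --A--> {[A → aA•, ε]}
6 --c--> 7

LEGEND (styles used in the figure): shift variable, shift terminal, reduce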
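The conversion from the LR(1) NFA to a DFA can be sketched as a subset construction in which each DFA state is an ε-closed set of LR(1) items. In the code below, `"$"` stands in for the ε lookahead, `first` relies on the grammar having no ε-productions, and states are numbered from 0 in order of discovery, so the numbering will not match the figure; all of these are assumptions of this illustration.

```python
GRAMMAR = {
    "S": ["A", "Bc"],
    "A": ["aA", "a"],
    "B": ["a", "ab"],
}
START, END = "S", "$"   # "$" plays the role of the epsilon lookahead


def first(symbols):
    """FIRST of a nonempty string (no variable derives the empty string)."""
    result, seen, todo = set(), set(), [symbols[0]]
    while todo:
        x = todo.pop()
        if x not in GRAMMAR:
            result.add(x)
        elif x not in seen:
            seen.add(x)
            todo.extend(body[0] for body in GRAMMAR[x])
    return result


def closure(items):
    """Add [C -> .gamma, y] for every [A -> alpha.C beta, x], y in FIRST(beta x)."""
    result, todo = set(items), list(items)
    while todo:
        head, body, dot, look = todo.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for gamma in GRAMMAR[body[dot]]:
                for y in first(body[dot + 1:] + look):
                    new = (body[dot], gamma, 0, y)
                    if new not in result:
                        result.add(new)
                        todo.append(new)
    return frozenset(result)


def goto(state, symbol):
    """Move the dot past `symbol` in every item that allows it, then close."""
    moved = {(h, b, d + 1, l) for (h, b, d, l) in state
             if d < len(b) and b[d] == symbol}
    return closure(moved)


def build_dfa():
    start = closure({(START, body, 0, END) for body in GRAMMAR[START]})
    states, edges, todo = [start], {}, [start]
    while todo:
        state = todo.pop(0)
        symbols = sorted({b[d] for (h, b, d, l) in state if d < len(b)})
        for symbol in symbols:
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
                todo.append(target)
            edges[(states.index(state), symbol)] = states.index(target)
    return states, edges


states, edges = build_dfa()
print(len(states), "states,", len(edges), "transitions")
```

For this grammar the construction finds eight states and nine transitions, in agreement with the DFA above.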

Page 23: CSCI 3130: Automata theory and formal languages

Example: Resolving conflicts by lookahead

S → A (1) | Bc (2)
A → aA (3) | a (4)
B → a (5) | ab (6)

State 2: [A → a•A, ε], [A → a•, ε], [B → a•, c], [B → a•b, c], [A → •aA, ε], [A → •a, ε]

next    action
a       shift
b       shift
c       reduce B
ε       reduce A

State 3: [A → a•A, ε], [A → a•, ε], [A → •aA, ε], [A → •a, ε]

next    action
a       shift
b       error
c       error
ε       reduce A

LEGEND (styles used in the figure): shift variable, shift terminal, reduce
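The two lookahead tables can be derived mechanically from the item sets: shift when some item has the next symbol right after its dot, and reduce when a complete item carries that symbol as its lookahead. A minimal sketch, assuming the item encoding `(head, body, dot, lookahead)` and `"$"` as a stand-in for ε (both choices made here, not taken from the slides):

```python
END = "$"   # stands in for the epsilon lookahead (assumption of this sketch)

STATE_2 = {
    ("A", "aA", 1, END), ("A", "a", 1, END),
    ("B", "a", 1, "c"), ("B", "ab", 1, "c"),
    ("A", "aA", 0, END), ("A", "a", 0, END),
}
STATE_3 = {
    ("A", "aA", 1, END), ("A", "a", 1, END),
    ("A", "aA", 0, END), ("A", "a", 0, END),
}


def actions(state, nxt):
    """Set of actions allowed in `state` when the next input symbol is `nxt`."""
    found = set()
    for head, body, dot, look in state:
        if dot < len(body) and body[dot] == nxt:
            found.add("shift")
        elif dot == len(body) and look == nxt:
            found.add(f"reduce {head} -> {body}")
    return found or {"error"}


for name, state in [("2", STATE_2), ("3", STATE_3)]:
    for nxt in ["a", "b", "c", END]:
        print(name, nxt, sorted(actions(state, nxt)))
```

Every (state, next symbol) pair yields a single action here, so one symbol of lookahead resolves the conflicts in these two states; a set with more than one element would signal a remaining conflict.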

Page 24: CSCI 3130: Automata theory and formal languages

Example: Reconstruct the parse tree

input: a b c

(figure: the LR(1) DFA with states 1 to 8 from the example "Convert NFA to DFA")

stack   state   action
        1       S
1       2       S
12      8       R
1       6       S
16      7       R

Resulting parse tree: S → Bc, with B → ab
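The trace for the input a b c can be reproduced with a table-driven loop that consults one symbol of lookahead before each step. This is a sketch under stated assumptions: `"$"` stands in for the ε lookahead, the table names are invented here, and the numbers 4 and 5 for the states {[A → aA•, ε]} and {[S → A•, ε]} are a guess, since only states 1, 2, 3, 6, 7, 8 can be identified from the trace and the lookahead tables.

```python
END = "$"            # stands in for the epsilon lookahead (assumption)
TERMINALS = "abc"

# DFA transitions; states 4 = {[A -> aA., $]} and 5 = {[S -> A., $]} are guesses.
GOTO = {
    (1, "a"): 2, (1, "A"): 5, (1, "B"): 6,
    (2, "a"): 3, (2, "b"): 8, (2, "A"): 4,
    (3, "a"): 3, (3, "A"): 4,
    (6, "c"): 7,
}

# (state, lookahead) -> production (head, body) to reduce by.
REDUCE = {
    (2, END): ("A", "a"), (2, "c"): ("B", "a"),
    (3, END): ("A", "a"),
    (4, END): ("A", "aA"),
    (5, END): ("S", "A"),
    (7, END): ("S", "Bc"),
    (8, "c"): ("B", "ab"),
}


def parse(word):
    """Return the (stack, state, action) steps, or None if word is rejected."""
    if any(ch not in TERMINALS for ch in word):
        return None
    path, pos, trace = [1], 0, []
    while True:
        state = path[-1]
        stack = "".join(str(s) for s in path[:-1])
        look = word[pos] if pos < len(word) else END
        if (state, look) in REDUCE:
            head, body = REDUCE[(state, look)]
            trace.append((stack, state, "R"))
            del path[-len(body):]              # pop one state per body symbol
            if head == "S":
                return trace                   # S is reduced only at end of input
            path.append(GOTO[(path[-1], head)])
        elif look != END and (state, look) in GOTO:
            trace.append((stack, state, "S"))
            path.append(GOTO[(state, look)])
            pos += 1
        else:
            return None


for row in parse("abc"):
    print(row)
```

`parse("abc")` returns the five rows of the table above; `parse("aaa")` and `parse("ac")` also succeed, while `parse("ab")` returns `None`.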