Compiler Design and Construction Bottom-Up Parsing Slides modified from Louden Book, Y Chung (NTHU), and Fischer, Leblanc

Upload: doandat

Post on 28-May-2018

Compiler Design and Construction

Bottom-Up Parsing

Slides modified from Louden Book, Y Chung (NTHU), and Fischer, Leblanc


Outline 6.0 Introduction

6.1 Shift-Reduce Parsers

6.2 LR Parsers

6.3 LR(1) Parsing

6.4 SLR(1)Parsing

6.5 LALR(1)

Fall 2012 Bottom Up Parsing

Parsing

A top-down parser “discovers” the parse tree by

starting at the root (start symbol) and expanding

(predict) downward in a depth-first manner

They predict the derivation before the matching is done

A bottom-up parser starts at the leaves (terminals)

and determines which production generates them.

Then it determines the rules to generate their parents and so-on, until reaching root (S)


Bottom-up Parsing Example


Scan the input looking for any substrings that appear on the RHS of a rule!

We call that RHS a handle

We can do this left-to-right or right-to-left

Let's use left-to-right

Replace that RHS with the LHS

Repeat until left with Start symbol or error

Effectively we are going to figure out which rules (in a right-most derivation) will generate our input (but in reverse order)

Can think of this as handle pruning

Top-down Parsing Example

Consider the following input and CFG

Input: begin SimpleStmt; SimpleStmt; end $

How would we generate this string in a rightmost

fashion?

<program> → begin <stmts> end $

<stmts> → SimpleStmt ; <stmts>

<stmts> → begin <stmts> end ; <stmts>

<stmts> → λ


<program> => begin <stmts> end $

=> begin SimpleStmt; <stmts> end $

=> begin SimpleStmt; SimpleStmt; <stmts> end $

=> begin SimpleStmt; SimpleStmt; end $


Bottom-up Parsing Example

Input: begin SimpleStmt; SimpleStmt; end $

1. Replace λ with <stmts> (rule <stmts> → λ):

begin SimpleStmt; SimpleStmt; <stmts> end $

2. Replace SimpleStmt; <stmts> with <stmts>:

begin SimpleStmt; <stmts> end $

3. Replace SimpleStmt; <stmts> with <stmts>:

begin <stmts> end $

4. Replace begin <stmts> end $ with the start symbol:

<program>

Bottom Up Parsing


Consider this grammar:

S --> a T U e

T --> T b c | b

U --> d

and the rightmost derivation of the

sentence:

a b b c d e:

S ==> a T U e

==> a T d e

==> a T b c d e

==> a b b c d e


A bottom-up parser is an LR parser so it reads the input

from left-to-right and performs a rightmost derivation in

reverse order.

There are four steps in the rightmost derivation of a b b

c d e so a bottom-up parser performs the steps in

reverse order: S ==> a T U e

==> a T d e

==> a T b c d e

==> a b b c d e


The parser examines the sentence ( a b b c d e ) for substrings that match the right-sides of productions in the grammar.

There are three cases:

the first (b) in the sentence;

the second (b) in the sentence;

or the (d).

The parser chooses the first b and reduces it to the left-side of the T --> b production to produce the sentential form: a T b c d e .


The parser examines the sentential form ( a T b c d e )

for substrings that match the right-sides of productions in

the grammar.

There are three cases:

( T b c ), (b), and (d).

The parser chooses ( T b c ) and reduces it to the left-

side of the production: T --> T b c to produce the

sentential form: a T d e.


The parser examines the sentential form ( a T d e ) for

substrings that match the right-sides of productions in the

grammar and finds only one case:

(d).

The parser reduces it to the left-side of the production:

U --> d to produce the sentential form: a T U e.


The parser examines the sentential form ( a T U e ) for substrings that match the right-sides of productions in the grammar and finds that the only case is the whole string: ( a T U e ).

The parser reduces it to the left-side of the production: S --> a T U e to produce a sentential form containing only the start

symbol, S.

Note that each step applies a production in reverse, replacing the right-side with the left-side, so we use the word reduce instead of produce.

Handles


The substring of the sentential form that the parser

chooses to reduce in each step of the parse is called the

handle for that step.

In the previous example the handles are:

1. the first (b) in ( a b b c d e ).

2. the ( T b c ) substring in ( a T b c d e ).

3. the (d) in ( a T d e ).

4. the whole string, ( a T U e ).


In step 1 and in step 2 of the example the parser has three possible handles to choose from:

if the parser chooses the wrong handle it won't be able to complete the reverse-ordered rightmost derivation.

The main task of a bottom-up parser is to choose the correct handle at each step of the parse.

There could be many choices on any step;

e.g., the empty string can be inserted into a string of n symbols in any of n + 1 different locations, so just a single ε-production in a grammar gives many possible handles to choose from.
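The effect of choosing a right or wrong handle can be tried out with a few lines of C. This is only a sketch: the helper name `reduce_at` and the encoding of each grammar symbol as a single character (with no spaces in the sentential form) are assumptions made for illustration, not part of the slides. It applies one reduction at a caller-chosen position; choosing the position is exactly the handle decision discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Replace the handle `rhs` found at offset `pos` of the sentential form
 * `form` by the single nonterminal `lhs` (one reduction step).
 * Returns 1 on success, 0 if `rhs` is empty or does not occur at `pos`.
 * Hypothetical helper: every grammar symbol is one character. */
int reduce_at(char *form, size_t pos, const char *rhs, char lhs)
{
    assert(form != NULL && rhs != NULL);
    size_t n = strlen(rhs);
    if (n == 0 || pos + n > strlen(form) || strncmp(form + pos, rhs, n) != 0)
        return 0;
    form[pos] = lhs;
    /* Close the gap left by the remaining n - 1 symbols of the handle,
     * moving the tail together with its terminating '\0'. */
    memmove(form + pos + 1, form + pos + n, strlen(form + pos + n) + 1);
    return 1;
}
```

Starting from "abbcde", the calls `reduce_at(f, 1, "b", 'T')`, `reduce_at(f, 1, "Tbc", 'T')`, `reduce_at(f, 2, "d", 'U')` and `reduce_at(f, 0, "aTUe", 'S')` follow the four handles listed earlier and leave "S". Reducing the second b first gives "abTcde", after which T b c can no longer be matched at that position.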

Shift Reduce Parsing


Most bottom-up parsers are implemented as shift-reduce

parsers.

Such a parser uses a stack to hold grammar symbols (it is

convenient to think of a horizontal stack with its bottom on

the left and its top on the right) and has four possible actions:

Shift: Move the next input symbol on to the top (right) of the stack.

Reduce: Reduce a handle on the right-most part of the stack by

popping it off the stack and pushing the left-side of the appropriate

production on to the right-end of the stack.

Accept: Announce successful completion of parsing.

Error: Signal discovery of a syntax error.


We use $ to mark the left-end (bottom) of the stack and also the end of the input string.

Initially the stack is empty.

Parsing ends successfully when the input is empty and the stack contains only the start symbol.

As an example we use the following grammar:

E --> E + E

E --> E * E

E --> ( E )

E --> id

Example (Louden)

Grammar:

E → E + n | n

Input: 2 + 3, or n + n

Parse: ($ is EOF in input, also bottom of stack)

     Parsing stack   Input      Action
1    $               n + n $    shift
2    $ n             + n $      reduce E → n
3    $ E             + n $      shift
4    $ E +           n $        shift
5    $ E + n         $          reduce E → E + n
6    $ E             $          accept
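The trace above can be reproduced by a small hand-written recognizer. The following is a minimal sketch under stated assumptions: the function name `parse_expr`, the character encoding of tokens ('n' and '+', with the string terminator playing the role of $), and the fixed stack size are all illustrative choices, not taken from Louden or the slides. The handle decisions are hard-coded for this one grammar.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_STACK 256

/* Shift-reduce recognizer for the grammar  E -> E + n | n.
 * `input` holds the tokens 'n' and '+'; its terminating '\0' acts as $. */
bool parse_expr(const char *input)
{
    assert(input != NULL);
    char stack[MAX_STACK];
    size_t top = 0;               /* number of symbols on the stack */
    size_t pos = 0;               /* index of the next input token  */

    for (;;) {
        if (top >= 3 && stack[top - 3] == 'E' &&
            stack[top - 2] == '+' && stack[top - 1] == 'n') {
            top -= 3;             /* reduce E -> E + n */
            stack[top++] = 'E';
        } else if (top == 1 && stack[0] == 'n') {
            stack[0] = 'E';       /* reduce E -> n (only at the stack bottom) */
        } else if (input[pos] == '\0') {
            /* accept when the input is empty and only E is on the stack */
            return top == 1 && stack[0] == 'E';
        } else if (top < MAX_STACK && (input[pos] == 'n' || input[pos] == '+')) {
            stack[top++] = input[pos++];   /* shift */
        } else {
            return false;         /* unknown token or stack overflow */
        }
    }
}
```

For the input "n+n" the function goes through the same shift and reduce steps as the table and accepts; inputs such as "n+" or "+n" are rejected. Note that the choice between the two reductions depends on what lies below the n on the stack, which is the stack-state issue the following notes point out.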

Notes:

Left recursion is not a problem in bottom-up

parsing. Indeed, as we shall see, lookahead is

not as serious an issue.

Keeping track of what is on the stack, however,

is an issue (note the difference in the grammar

rule reductions at lines 2 and 5 of the previous

example). See later discussion on stack state.

Right recursion is actually a bit of a problem,

because it makes the stack grow large (see next example).


Example

Grammar:

E → n + E | n

Input: 2 + 3, or n + n

Parse:

     Parsing stack   Input      Action
1    $               n + n $    shift
2    $ n             + n $      shift
3    $ n +           n $        shift
4    $ n + n         $          reduce E → n
5    $ n + E         $          reduce E → n + E
6    $ E             $          accept

Shift Reduce Parsing

The following figure shows the actions of a shift-reduce parser to parse the input string id1 * (id2 + id3) according to the grammar.

STACK                 INPUT                      ACTION
$                     id1 * ( id2 + id3 ) $      shift
$ id1                 * ( id2 + id3 ) $          reduce E --> id
$ E                   * ( id2 + id3 ) $          shift
$ E *                 ( id2 + id3 ) $            shift
$ E * (               id2 + id3 ) $              shift
$ E * ( id2           + id3 ) $                  reduce E --> id
$ E * ( E             + id3 ) $                  shift
$ E * ( E +           id3 ) $                    shift
$ E * ( E + id3       ) $                        reduce E --> id
$ E * ( E + E         ) $                        reduce E --> E + E
$ E * ( E             ) $                        shift
$ E * ( E )           $                          reduce E --> ( E )
$ E * E               $                          reduce E --> E * E
$ E                   $                          accept


Shift-reduce parsers can be constructed for a large class

of grammars - the LR grammars - but the construction is

usually so complicated that they are only constructed by

parser-construction programs (YACC)

However, the next section will show that there is a small

but important class of grammars where shift-reduce

parsers can be easily constructed by hand.

Introduction(2)

In Chapter 6: bottom-up parsers

A bottom-up parser, or a shift-reduce parser, begins at the leaves and works up to the top of the tree.

The reduction steps trace a rightmost derivation in reverse.

The following slides explain this with an example.

Grammar:

S → aABe

A → Abc | b

B → d

The input string: abbcde

Introduction(3)-(12)

Bottom-Up Parser Example (animation: each frame shows the INPUT, the bottom-up parsing program, and the OUTPUT parse tree built so far)

Productions: S → aABe, A → Abc, A → b, B → d

Input: a b b c d e $

1. Shift a

2. Shift b; reduce from b to A

3. Shift A

4. Shift b

5. Shift c; reduce from Abc to A

6. Shift A

7. Shift d; reduce from d to B

8. Shift B

9. Shift e; reduce from aABe to S

10. Shift S; hit the target $

This parser is known as an LR parser because it scans the input from Left to right, and it constructs a Rightmost derivation in reverse order.

Introduction(13)

Conclusion

Scanning the productions to match handles in the input string, together with backtracking, makes the method used in the previous example very inefficient.

Can we do better? We discuss this later.

[Figure: previous architecture vs. renewed architecture]


Parse Trees

Phrase – sequence of tokens descended from a

nonterminal

Simple phrase – phrase that contains no smaller

phrase at the leaves

Handle – the leftmost simple phrase


Shift-Reduce Parsers(1)

A shift-reduce (bottom-up) parser is known as an LR parser

It scans the input from Left to right

It produces a Rightmost derivation in reverse order

Kinds of LR

LR(k): most powerful deterministic bottom-up parsing using k lookaheads

SLR(k)

LALR(k)

Mechanism to perform bottom-up parsing: a finite state machine that manipulates the "handle"

Components: parse stack, shift-reduce driver, action table, goto table

Shift-Reduce Parsers(2)

Parse stack

Initially empty, contains symbols already parsed

Elements in the stack are terminal or non-terminal symbols

The parse stack catenated with the remaining input always

represents a right sentential form


Shift-Reduce Parsers(3)

Shift-Reduce driver

Shift -- when top of stack doesn't contain a handle of the

sentential form

push input token (with contextual information) onto stack

Reduce -- when top of stack contains a handle

pop the handle

push reduced non-terminal (with contextual information)

Success when no input left and goal symbol on the stack


Shift-Reduce Parsers(4)

Two questions

– Have we reached the end of handles and how long is the

handle?

– Which non-terminal does the handle reduce to?

We use tables to answer the questions

ACTION table

GOTO table


Shift-Reduce Parsers(5)

LR parsers are driven by two tables:

Action table, which specifies the action to take

Shift, reduce, accept (terminate with success) or error

Goto table, which specifies state transition

Defines successor states after a token or LHS is matched and shifted.

Parse stack – contains parse states (not symbols)

Encode the shifted symbol and the handles that are being matched, a possible sub-tree of the parse tree


Shift-Reduce Parsers(6)

grammar G0

1. <program> → begin <stmts> end $

2. <stmts> → SimpleStmt ; <stmts>

3. <stmts> → begin <stmts> end ; <stmts>

4. <stmts> → λ

[Figure: action table and goto table for G0; a blank entry means ERROR]

Shift Reduce Parser

S – top parse stack state

T – current input token

push(S0) // start state

Loop forever

case Action(S,T)

error => ReportSyntaxError()

accept => CleanUpAndFinish()

shift => Push(GoTo(S,T))

Scanner(T) // yylex()

reduce => Assume X -> Y1...Ym

Pop(m) // S' is new stack top

Push(GoTo(S',X))


Shift-Reduce Parsers(7)

void shift_reduce_driver(void)
{
    /* Push the Start State, S0,
     * onto an empty parse stack. */
    push(S0);
    while (TRUE) { /* forever */
        /* Let S be the top parse stack state;
         * let T be the current input token. */
        switch (action[S][T]) {
        case ERROR:
            announce_syntax_error();
            break;
        case ACCEPT:
            /* The input has been correctly parsed. */
            clean_up_and_finish();
            return;
        case SHIFT:
            push(go_to[S][T]);
            scanner(&T); /* Get next token. */
            break;
        case REDUCEi:
            /* Assume i-th production is X → Y1 ... Ym.
             * Remove states corresponding to
             * the RHS of the production. */
            pop(m);
            /* S' is the new stack top. */
            push(go_to[S'][X]);
            break;
        }
    }
}

grammar G0

1. <program> → begin <stmts> end $

2. <stmts> → SimpleStmt ; <stmts>

3. <stmts> → begin <stmts> end ; <stmts>

4. <stmts> → λ

tracing steps

Step Parse Stack Remaining Input Action (1) 0 begin SimpleStmt ; SimpleStmt ; end $ Shift 1

Shift-Reduce

Parsers(8)

Symbol State 0 1 2 3 4 5 6 7 8 9 10 11

begin S S S S S end R4 S R4 R4 S R4 R2 R3

; S S SimpleStmt S S S S

$ A

action

table


tracing steps

Step   Parse Stack       Remaining Input                          Action
(1)    0                 begin SimpleStmt ; SimpleStmt ; end $    Shift 1
(2)    0,1               SimpleStmt ; SimpleStmt ; end $          Shift 5
(3)    0,1,5             ; SimpleStmt ; end $                     Shift 6
(4)    0,1,5,6           SimpleStmt ; end $                       Shift 5
(5)    0,1,5,6,5         ; end $                                  Shift 6
(6)    0,1,5,6,5,6       end $                                    Reduce 4   /* goto(6,<stmts>) = 10 */
(7)    0,1,5,6,5,6,10    end $                                    Reduce 2   /* goto(6,<stmts>) = 10 */
(8)    0,1,5,6,10        end $                                    Reduce 2   /* goto(1,<stmts>) = 2 */
(9)    0,1,2             end $                                    Shift 3
(10)   0,1,2,3           $                                        Accept

Shift-Reduce Parsers(18)

Parse tree for the trace (the number in parentheses is the step at which a token is shifted or a reduction is made):

<program>
    begin(1)
    <stmts>                  R2(8)
        SimpleStmt(2)
        ;(3)
        <stmts>              R2(7)
            SimpleStmt(4)
            ;(5)
            <stmts>          R4(6)
                λ(6)
    end(9)
    $(10)

Outline 6.0 Introduction

6.1 Shift-Reduce Parsers

6.2 LR Parsers

6.3 LR(1) Parsing

6.4 SLR(1)Parsing

6.5 LALR(1)

6.6 Calling Semantic Routines in Shift-Reduce Parsers

6.7 Using a Parser Generator (TA course)

6.8 Optimizing Parse Tables

6.9 Practical LR(1) Parsers

6.10 Properties of LR Parsing

6.11 LL(1) or LALR(1), That is the question

6.12 Other Shift-Reduce Technique


LR Parsers LR(n) n=0~k

Read from Left, Right-most derivation, n look-ahead

LR parsers are deterministic

No backup or retry parsing actions

LR(0):

Without prediction read from Left, Right-most derivation, 0 look-ahead

LR(1):

1-token look-ahead

General

LR(k) parsers

Decide the next action by examining the tokens already shifted and at most k look-ahead tokens

The most powerful deterministic bottom-up parsers

Difficult to implement


LR(0) Table Construction(1)

A production has the form

A → X1 X2 … Xj

By adding a dot, we get a configuration (or an item)

A → • X1 X2 … Xj

A → X1 X2 … Xi • Xi+1 … Xj

A → X1 X2 … Xj •

The • indicates how much of a RHS has been shifted onto the stack. An item (configuration) tells you where you are in a parse!

These are LR(0) configurations since no lookahead info is used.

An item with the • at the end of the RHS, such as A → X1 X2 … Xj •, indicates that the RHS should be reduced to the LHS; the parser has recognized that production.

An item with the • at the beginning of the RHS, such as A → • X1 X2 … Xj, predicts that production; that is, the RHS will be shifted onto the stack.

LR(0) Table Construction(2)

An LR(0) state is a set of configurations

The actual state of an LR(0) parser is denoted by one of the items (configurations).

The closure0 operation:

if there is a configuration B → δ • A ρ in the set where A is a nonterminal, then add all configurations of the form A → • γ to the set.

The initial configuration set

s0 = closure0({S → • α $}), where S → α $ is the augmenting production

A configuration set is all possible configurations at a given point during a parse.

configuration_set closure0(configuration_set s)
{
    configuration_set s' = s;
    do {
        if (B → δ • A ρ ∈ s' for A ∈ Vn) {
            /* Predict productions with A as LHS */
            Add all configurations of the form A → • γ to s'
        }
    } while (more new configurations can be added);
    return s';
}

EX: for grammar G1:

1. S' → S $

2. S → ID | λ

closure0({ S' → • S $ }) =

{ S' → • S $,

S → • ID,

S → λ • }

special case: for the λ-production the only configuration is S → λ •

LR(0) Table Construction(3)

Q1: Why does the grammar use S' → S $ ?

Ans: To check for the end of the parse.

EX: If S' does not exist:

S → ID $

S → λ $

When we parse bottom-up and reduce to the start symbol S, there are two paths that achieve it.

Multiple paths are a problem in complex grammars such as that of C: on many paths we would need to check the ending symbol $.

Given a configuration set s, we can compute its successor, s', under a symbol X

Denoted go_to0(s, X) = s'

configuration_set go_to0(configuration_set s, symbol X)
{
    sb = Ø;
    for (each configuration c ∈ s)
        if (c = A → β • X γ)
            Add A → β X • γ to sb;
    /*
     * That is, we advance the • past the symbol X,
     * if possible. Configurations not having a
     * dot preceding an X are not included in sb.
     */
    /* Add new predictions to sb via closure0. */
    return closure0(sb);
}

LR(0) Table Construction(4)

void build_CFSM(void)

{

S = SET_OF(S0);

while (S is nonempty) {

Remove a configuration set s from S;

/* Consider both terminals and non-terminals */

for ( X in Symbols) {

if(go_to0(s,X) does not label a CFSM state) {

Create a new CFSM state & label with go_to0(s , X)

Add go_to0(s,X) to S;

}

Create a transition under X from the state s

labels to the state go_to0(s , X)

}

}

}

The grammar is finite, so the number of configurations and configuration sets is also finite.

Characteristic finite state machine (CFSM)

Built by identifying configuration sets and successor operations with CFSM states and transitions

It is a finite automaton
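The closure0, go_to0 and build_CFSM pseudocode above can be made concrete for one small grammar. The sketch below is an illustration under several assumptions that are not in the slides: the grammar is hard-coded (the five-production expression grammar S → E $, E → E + T | T, T → id | ( E )), a configuration set is packed into a 32-bit mask, and all identifiers (`item_set`, `ITEM`, `build_cfsm`, the `SYM_` names) are hypothetical. The grammar has no λ-productions, so that special case is not handled.

```c
#include <assert.h>
#include <stdint.h>

/* Grammar symbols; nonterminals come first so they can be told apart. */
enum { SYM_S, SYM_E, SYM_T, SYM_PLUS, SYM_ID, SYM_LP, SYM_RP, SYM_END, N_SYMBOLS };
#define IS_NONTERMINAL(x) ((x) < SYM_PLUS)

struct production { int lhs, len, rhs[3]; };

/* Index 0 is unused so that indices equal the production numbers 1..5. */
static const struct production prods[] = {
    { 0, 0, { 0 } },
    { SYM_S, 2, { SYM_E, SYM_END } },           /* 1. S -> E $   */
    { SYM_E, 3, { SYM_E, SYM_PLUS, SYM_T } },   /* 2. E -> E + T */
    { SYM_E, 1, { SYM_T } },                    /* 3. E -> T     */
    { SYM_T, 1, { SYM_ID } },                   /* 4. T -> id    */
    { SYM_T, 3, { SYM_LP, SYM_E, SYM_RP } },    /* 5. T -> ( E ) */
};
enum { N_PRODS = 6, MAX_DOT = 4, MAX_STATES = 32 };

/* A configuration set as a bit set: bit p*MAX_DOT+d is production p with
 * the dot in front of rhs[d] (d == len means the dot is at the end). */
typedef uint32_t item_set;
#define ITEM(p, d) ((item_set)1 << ((p) * MAX_DOT + (d)))

/* Add A -> . gamma for every nonterminal A that follows a dot. */
item_set closure0(item_set s)
{
    int changed;
    do {
        changed = 0;
        for (int p = 1; p < N_PRODS; p++)
            for (int d = 0; d < prods[p].len; d++) {
                int next = prods[p].rhs[d];
                if (!(s & ITEM(p, d)) || !IS_NONTERMINAL(next))
                    continue;
                for (int q = 1; q < N_PRODS; q++)
                    if (prods[q].lhs == next && !(s & ITEM(q, 0))) {
                        s |= ITEM(q, 0);
                        changed = 1;
                    }
            }
    } while (changed);
    return s;
}

/* Advance the dot past x wherever possible, then close the result. */
item_set go_to0(item_set s, int x)
{
    item_set sb = 0;
    for (int p = 1; p < N_PRODS; p++)
        for (int d = 0; d < prods[p].len; d++)
            if ((s & ITEM(p, d)) && prods[p].rhs[d] == x)
                sb |= ITEM(p, d + 1);
    return closure0(sb);
}

/* Returns the number of CFSM states (or -1 on overflow). trans[s][x] is
 * the successor of state s under x, or -1 where there is no transition. */
int build_cfsm(item_set states[MAX_STATES], int trans[MAX_STATES][N_SYMBOLS])
{
    assert(states != 0 && trans != 0);
    int n = 1;
    states[0] = closure0(ITEM(1, 0));     /* s0 = closure0({S -> . E $}) */
    for (int s = 0; s < n; s++)           /* states[s..n-1] is the work list */
        for (int x = 0; x < N_SYMBOLS; x++) {
            item_set t = go_to0(states[s], x);
            int j = -1;
            if (t != 0) {
                for (j = 0; j < n && states[j] != t; j++)
                    ;                     /* look for an existing state */
                if (j == n) {
                    if (n == MAX_STATES)
                        return -1;
                    states[n++] = t;      /* create a new CFSM state */
                }
            }
            trans[s][x] = j;
        }
    return n;
}
```

For this grammar `build_cfsm` finds ten configuration sets. States are numbered in order of discovery, so apart from state 0 the numbers need not agree with any hand-drawn CFSM; an explicit error state is replaced here by the -1 entries.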

LR(0) Table Construction(5)

EX: for grammar G1:

1. S' → S $

2. S → ID | λ

state 0: S' → • S $, S → • ID, S → λ •

state 1 (from state 0 on ID): S → ID •

state 2 (from state 0 on S): S' → S • $

state 3 (from state 2 on $): S' → S $ •

state 4: error

int **build_go_to_table(finite_automaton CFSM) {

const int N = num_states (CFSM);

int **tab;

Dynamically allocate a table of dimension

N × num_symbols (CFSM) to represent

the go_to table and assign it to tab;

Number the states of CFSM from 0 to N-1,

with the Start State labeled 0;

for( S = 0 ; S<=N-1 ; S++) {

/* Consider both terminals and non-terminals. */

for ( X in Symbols) {

if ( State S has a transition under X to some state T)

tab [S][X] = T ;

else

tab [S][X] = EMPTY;

}

}

return tab;

}

LR(0) Table Construction(6)

The CFSM is the goto table of LR(0) parsers.

goto table

State   ID   $   S
0       1    4   2
1       4    4   4
2       4    3   4
3       4    4   4
4

State 4 is the error state.

Because LR(0) uses no look-ahead, we must extract the action function directly from the configuration sets of the CFSM

Let Q = {Shift, Reduce1, Reduce2, …, Reducen}

There are n productions in the CFG

Let S0 be the set of CFSM states

The projection function P maps each CFSM state to the appropriate subset of Q

P: S0 → 2^Q, where 2^Q is the power set of Q.

P(s) = {Reducei | B → ρ • ∈ s and production i is B → ρ} ∪ (if A → α • a β ∈ s for a ∈ Vt then {Shift} else Ø)

LR(0) Table Construction(7)

G is LR(0) if and only if ∀ s ∈ S0, |P(s)| = 1

If G is LR(0), the action table is trivially extracted from P

P(s) = {Shift} ⇒ action[s] = Shift

P(s) = {Reducej}, where production j is the augmenting production ⇒ action[s] = Accept

P(s) = {Reducei}, i ≠ j ⇒ action[s] = Reducei

P(s) = Ø ⇒ action[s] = Error
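The extraction rules above translate almost line by line into code. The following sketch assumes a deliberately simplified item record (`struct item`) and a hypothetical function `lr0_action`; neither name comes from the slides. A state is passed as an array of distinct items, and the result encodes Reduce i as the positive number i.

```c
#include <assert.h>
#include <stddef.h>

/* One LR(0) configuration, reduced to what the projection P needs. */
struct item {
    int prod;              /* production number i (1-based)              */
    int len;               /* length of the right-hand side              */
    int dot;               /* 0..len; dot == len means dot at the end    */
    int next_is_terminal;  /* nonzero if the symbol after the dot is in Vt */
};

/* Non-positive results; a positive result i means Reduce i. */
enum { ACT_SHIFT = 0, ACT_ACCEPT = -1, ACT_ERROR = -2, ACT_INADEQUATE = -3 };

/* Compute P(s) for the state given by items[0..n-1] (assumed distinct)
 * and map it to an action; augmenting_prod is the production j. */
int lr0_action(const struct item *items, size_t n, int augmenting_prod)
{
    assert(items != NULL || n == 0);
    int shift = 0;         /* 1 if Shift is in P(s)            */
    int n_reduce = 0;      /* number of Reduce_i in P(s)       */
    int reduce = 0;        /* the production of the last one   */

    for (size_t k = 0; k < n; k++) {
        if (items[k].dot == items[k].len) {        /* B -> rho .      */
            reduce = items[k].prod;
            n_reduce++;
        } else if (items[k].next_is_terminal) {    /* A -> alpha . a beta */
            shift = 1;
        }
    }
    if (shift + n_reduce == 0)
        return ACT_ERROR;                          /* P(s) is empty   */
    if (shift + n_reduce > 1)
        return ACT_INADEQUATE;                     /* |P(s)| > 1      */
    if (shift)
        return ACT_SHIFT;
    return reduce == augmenting_prod ? ACT_ACCEPT : reduce;
}
```

A state holding a single completed item for production 2 yields 2 (Reduce 2), a state whose items all have a terminal after the dot yields `ACT_SHIFT`, and a state mixing a completed item with such a shift item is reported as `ACT_INADEQUATE`, which corresponds to |P(s)| > 1.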

LR(0) Table Construction(8)-(12)

EX: for grammar G1:

1. S' → S $

2. S → ID | λ

state    0    1    2    3
action   S    R2   S    Accept

P(s) = {Reducei | B → ρ • ∈ s and production i is B → ρ} ∪ (if A → α • a β ∈ s for a ∈ Vt then {Shift} else Ø)

Any state s ∈ S0 for which |P(s)| > 1 is said to be inadequate

Two kinds of parser conflicts create inadequacies in configuration sets

Shift-reduce conflicts

Reduce-reduce conflicts

It should be possible to resolve an inadequacy by using a lookahead

It is easy to introduce inadequacies in CFSM states

Hence, few real grammars are LR(0). For example,

Consider λ-productions

The only possible configuration involving a λ-production is of the form A → λ •

However, if A can generate any terminal string other than λ, then a shift action must also be possible (First(A))

An LR(0) parser will have problems in handling operator precedence properly

LR(0) Table Construction(13)

Before tracing, we need to understand how the CFSM works

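LR(0) Tracing Example(0)

for grammar G2:

1. S → E $

2. E → E + T

3. E → T

4. T → id

5. T → ( E )

closure0({ T → ( • E ) })

= { T → ( • E ),

E → • E + T,

E → • T,

T → • id,

T → • ( E ) }

When ( is shifted, the closure predicts the possible subtrees:

[Figure: candidate parse trees below T → ( E ), with E expanding to E + T or T, and T expanding to id or ( E )]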

LR(0) Tracing Example(1)-(11)

The CFSM for G2 is built one state at a time:

state 0: closure0({ S → • E $ }) = { S → • E $, E → • E + T, E → • T, T → • id, T → • ( E ) }

state 1 (from 0 on E): closure0({ S → E • $, E → E • + T }) = itself

state 2 (from 1 on $): closure0({ S → E $ • }) = itself

state 3 (from 1 or 7 on +): closure0({ E → E + • T }) = { E → E + • T, T → • id, T → • ( E ) }

state 4 (from 3 on T): closure0({ E → E + T • }) = itself

state 5 (from 0, 3 or 6 on id): closure0({ T → id • }) = itself

state 6 (from 0, 3 or 6 on ( ): closure0({ T → ( • E ) }) = { T → ( • E ), E → • E + T, E → • T, T → • id, T → • ( E ) }

state 7 (from 6 on E): closure0({ T → ( E • ), E → E • + T }) = itself

state 8 (from 7 on ) ): closure0({ T → ( E ) • }) = itself

state 9 (from 0 or 6 on T): closure0({ E → T • }) = itself

action table (the same action applies for any input symbol)

State    0   1   2   3   4    5    6   7   8    9    10
Action   S   S   A   S   R2   R4   S   S   R5   R3

State 10 is the error state, reached on any symbol that has no goto entry.

LR(0) Tracing Example(12)

goto table (a blank entry, shown as -, means error)

State   S   E   T   +   id   (   )   $
0       -   1   9   -   5    6   -   -
1       -   -   -   3   -    -   -   2
2       -   -   -   -   -    -   -   -
3       -   -   4   -   5    6   -   -
4       -   -   -   -   -    -   -   -
5       -   -   -   -   -    -   -   -
6       -   7   9   -   5    6   -   -
7       -   -   -   3   -    -   8   -
8       -   -   -   -   -    -   -   -
9       -   -   -   -   -    -   -   -
10      -   -   -   -   -    -   -   -

Program Example (1)

Initial :(id)$

step1:0 (id)$ shift (

1

Tree:

(

Stat

e

Symbol

S E T + id ( ) $

0 1 9 5 6

1 3 2

2

3 4 5 6

4

5

6 7 9 5 6

7 3 8

8

9

10

Symbol State

0 1 2 3 4 5 6 7 8 9 10

anything S S A S R2 R4 S S R5 R3

Program Example (2)

step2:06 id)$ shift id

2

Tree:

(

Initial :(id)$

id

Stat

e

Symbol

S E T + id ( ) $

0 1 9 5 6

1 3 2

2

3 4 5 6

4

5

6 7 9 5 6

7 3 8

8

9

10

Symbol State

0 1 2 3 4 5 6 7 8 9 10

anything S S A S R2 R4 S S R5 R3

Program Example (3)

step3:065 )$ reduce 4

3

Tree:

Initial :(id)$

(

id

T

Stat

e

Symbol

S E T + id ( ) $

0 1 9 5 6

1 3 2

2

3 4 5 6

4

5

6 7 9 5 6

7 3 8

8

9

10

Symbol State

0 1 2 3 4 5 6 7 8 9 10

anything S S A S R2 R4 S S R5 R3

Program Example (4)

step4:069 )$ reduce 3

4

Tree:

Initial :(id)$

(

id

T

E

Stat

e

Symbol

S E T + id ( ) $

0 1 9 5 6

1 3 2

2

3 4 5 6

4

5

6 7 9 5 6

7 3 8

8

9

10

Symbol State

0 1 2 3 4 5 6 7 8 9 10

anything S S A S R2 R4 S S R5 R3

Program Example (5)

step5:067 )$ shift )

5

Tree: (

id

T

Initial :(id)$

E )

Stat

e

Symbol

S E T + id ( ) $

0 1 9 5 6

1 3 2

2

3 4 5 6

4

5

6 7 9 5 6

7 3 8

8

9

10

Symbol State

0 1 2 3 4 5 6 7 8 9 10

anything S S A S R2 R4 S S R5 R3

Program Example (6)

step6:0678 $ reduce 5

6

Tree:

Initial :(id)$

(

id

T

E )

T

Stat

e

Symbol

S E T + id ( ) $

0 1 9 5 6

1 3 2

2

3 4 5 6

4

5

6 7 9 5 6

7 3 8

8

9

10

Symbol State

0 1 2 3 4 5 6 7 8 9 10

anything S S A S R2 R4 S S R5 R3

Program Example (7)

step7:09 $ reduce 3

7

Tree:

Initial :(id)$

(

id

T

E )

T

E

Stat

e

Symbol

S E T + id ( ) $

0 1 9 5 6

1 3 2

2

3 4 5 6

4

5

6 7 9 5 6

7 3 8

8

9

10

Symbol State

0 1 2 3 4 5 6 7 8 9 10

anything S S A S R2 R4 S S R5 R3

Program Example (8)

step8:01 $ shift $

8

Tree:

Initial :(id)$

(

id

T

E )

T

E $

Stat

e

Symbol

S E T + id ( ) $

0 1 9 5 6

1 3 2

2

3 4 5 6

4

5

6 7 9 5 6

7 3 8

8

9

10

Symbol State

0 1 2 3 4 5 6 7 8 9 10

anything S S A S R2 R4 S S R5 R3

Program Example (9)

step9:012 Accept

9 Accept

Tree:

Initial :(id)$

(

id

T

E )

T

E $

S
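The steps of this example can be replayed with a small table-driven driver. The sketch below is illustrative only: the names `PRODUCTIONS`, `ACTION`, `GOTO` and `parse` are assumptions of this sketch, while the table entries are copied from the G2 action and goto tables of the slides.

```python
# Productions of G2, numbered as in the slides: (left-hand side, length of right-hand side).
PRODUCTIONS = {
    1: ("S", 2),  # S -> E $
    2: ("E", 3),  # E -> E + T
    3: ("E", 1),  # E -> T
    4: ("T", 1),  # T -> id
    5: ("T", 3),  # T -> ( E )
}

# LR(0) action table: one action per state, independent of the look-ahead.
ACTION = {0: "S", 1: "S", 2: "A", 3: "S", 4: "R2", 5: "R4",
          6: "S", 7: "S", 8: "R5", 9: "R3"}

# goto table: GOTO[state][symbol] -> next state; a missing entry means error.
GOTO = {
    0: {"E": 1, "T": 9, "id": 5, "(": 6},
    1: {"+": 3, "$": 2},
    3: {"T": 4, "id": 5, "(": 6},
    6: {"E": 7, "T": 9, "id": 5, "(": 6},
    7: {"+": 3, ")": 8},
}


def parse(tokens):
    """Return the production numbers reduced, in order, or raise SyntaxError."""
    stack = [0]          # parse stack of states
    pos = 0              # index of the next input token
    reductions = []
    while True:
        state = stack[-1]
        act = ACTION[state]
        if act == "A":
            return reductions
        if act == "S":
            if pos >= len(tokens):
                raise SyntaxError("unexpected end of input")
            nxt = GOTO.get(state, {}).get(tokens[pos])
            if nxt is None:
                raise SyntaxError(f"unexpected {tokens[pos]!r} in state {state}")
            stack.append(nxt)
            pos += 1
        else:
            # "Rn": pop one state per right-hand-side symbol, then take the goto on the LHS.
            number = int(act[1:])
            lhs, length = PRODUCTIONS[number]
            del stack[-length:]
            stack.append(GOTO[stack[-1]][lhs])
            reductions.append(number)


print(parse(["(", "id", ")", "$"]))
```

For the input ( id ) $ the driver performs reductions 4, 3, 5, 3 and then accepts, matching steps 3, 4, 6 and 7 of the trace.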


LR(1) Parsing (1)

An LR(1) configuration, or item, is of the form

A → X1 X2 … Xi • Xi+1 … Xj, l    where l ∈ Vt ∪ {λ}

The look-ahead component l represents a possible look-ahead after the entire right-hand side has been matched.

λ appears as a look-ahead only for the augmenting production, because there is no look-ahead after the end-marker.

We use the following notation to represent the set of LR(1) configurations that share the same dotted production:

A → X1 X2 … Xi • Xi+1 … Xj, {l1 … lm}
  = {A → X1 X2 … Xi • Xi+1 … Xj, l1}
  ∪ {A → X1 X2 … Xi • Xi+1 … Xj, l2}
  ∪ …
  ∪ {A → X1 X2 … Xi • Xi+1 … Xj, lm}

LR(1) Parsing (2)

There are many more distinct LR(1) configurations than LR(0) configurations.

In fact, the major difficulty with LR(1) parsers is not their power but rather finding ways to represent them in storage-efficient ways.

Parsing begins with the configuration set closure1({S → • α $, {λ}}).

configuration_set closure1(configuration_set s) {
    configuration_set s' = s;
    do {
        if (B → δ • A ρ, l ∈ s' for A ∈ Vn) {
            /*
             * Predict productions with A as the left-hand side.
             * Possible look-aheads are First(ρ l).
             */
            Add all configurations of the form A → • γ, u where u ∈ First(ρ l) to s';
        }
    } while (more new configurations can be added);
    return s';
}

For grammar G2:

1. S → E $
2. E → E + T
3. E → T
4. T → id
5. T → ( E )

closure1({S → • E $, {λ}}) = {
    S → • E $, {λ}
    E → • E + T, {$ +}
    E → • T, {$ +}
    T → • id, {$ +}
    T → • ( E ), {$ +}
}
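The closure computation for G2 can be sketched in Python as follows. This is a minimal sketch, assuming a grammar without λ-productions (true for G2), so that First(ρ l) is the First set of the first symbol of ρ, or {l} when ρ is empty. The names `GRAMMAR`, `first_sets`, `closure1` and `lookaheads_of` are illustrative, not from the slides.

```python
GRAMMAR = {
    "S": [("E", "$")],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("id",), ("(", "E", ")")],
}
LAMBDA = "λ"  # stands for "no look-ahead" on the augmenting production


def first_sets(grammar):
    """First sets of the non-terminals; valid only without λ-productions."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                head = rhs[0]
                new = first[head] if head in grammar else {head}
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


FIRST = first_sets(GRAMMAR)


def closure1(items):
    """items is a set of (lhs, rhs, dot position, look-ahead) tuples."""
    result = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot, la = work.pop()
        if dot == len(rhs) or rhs[dot] not in GRAMMAR:
            continue  # dot at the end, or in front of a terminal: nothing to predict
        rest = rhs[dot + 1:]
        if rest:
            lookaheads = FIRST[rest[0]] if rest[0] in GRAMMAR else {rest[0]}
        else:
            lookaheads = {la}
        for production in GRAMMAR[rhs[dot]]:
            for u in lookaheads:
                item = (rhs[dot], production, 0, u)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return result


def lookaheads_of(items, lhs, rhs):
    """Collect the look-aheads of all items sharing one dotted production."""
    return {la for (l, r, _, la) in items if (l, r) == (lhs, rhs)}


start = closure1({("S", ("E", "$"), 0, LAMBDA)})
for lhs, alternatives in GRAMMAR.items():
    for rhs in alternatives:
        print(lhs, "->", " ".join(rhs), sorted(lookaheads_of(start, lhs, rhs)))
```

The closure holds nine single-look-ahead configurations, which group into the five dotted productions listed on the slide.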

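LR(1) Parsing (3)

Tracing example for grammar G2:

1. S → E $
2. E → E + T
3. E → T
4. T → id
5. T → ( E )

Computing closure1({S → • E $, {λ}}) step by step:

S → • E $, {λ}                                  (initial configuration)
E → • E + T, {$}      E → • T, {$}              (predict E; look-ahead First($ λ) = {$})
T → • id, {$}         T → • ( E ), {$}          (predict T from E → • T, {$})
E → • E + T, {+}      E → • T, {+}              (predict E from E → • E + T; look-ahead +)
T → • id, {+}         T → • ( E ), {+}          (predict T from E → • T, {+})

closure1({S → • E $, {λ}}) = {
    S → • E $, {λ}
    E → • E + T, {$ +}
    E → • T, {$ +}
    T → • id, {$ +}
    T → • ( E ), {$ +}
}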

LR(1) Parsing (4)

Given an LR(1) configuration set s, we compute its successor, s', under a symbol X: go_to1(s, X).

configuration_set go_to1(configuration_set s, symbol X) {
    sb = ∅;
    for (each configuration c ∈ s)
        if (c is of the form A → β • X γ, l)   // go_to0 makes the same test without the look-ahead l
            Add A → β X • γ, l to sb;
    /*
     * That is, we advance the • past the symbol X, if possible.
     * Configurations not having a dot preceding an X are not included in sb.
     */
    /* Add new predictions to sb via closure1. */
    return closure1(sb);
}

LR(1) Parsing (5)

We can build a finite automaton that is the analogue of the LR(0) CFSM: the LR(1) FSM, or LR(1) machine.

The relationship between the CFSM and the LR(1) machine: by merging the LR(1) machine's configuration sets that differ only in their look-aheads, we obtain the CFSM.

void build_LR1(void)
{
    Create the start state of the FSM; label it with s0;
    Put s0 into an initially empty set, S;
    while (S is nonempty) {
        Remove a configuration set s from S;
        /* Consider both terminals and non-terminals */
        for (X in Symbols) {
            if (go_to1(s, X) does not label an FSM state) {
                Create a new FSM state and label it with go_to1(s, X);
                Put go_to1(s, X) into S;
            }
            Create a transition under X from the state s labels
            to the state go_to1(s, X) labels;
        }
    }
}

Tracing example for grammar G3:

1. S → E $
2. E → E + T
3. E → T
4. T → T * P
5. T → P
6. P → id
7. P → ( E )
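The worklist construction can be sketched end to end for G3. This is a hedged sketch rather than the slides' implementation: the names `first_of`, `closure1`, `go_to1` and `build_lr1` are chosen here for illustration, the grammar is assumed to have no λ-productions, and empty successor sets are skipped instead of being turned into an explicit error state. States are numbered in breadth-first order, so the numbers need not match those of the slides.

```python
from collections import deque

GRAMMAR = {
    "S": [("E", "$")],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "P"), ("P",)],
    "P": [("id",), ("(", "E", ")")],
}
SYMBOLS = ["E", "T", "P", "+", "*", "id", "(", ")", "$"]


def first_of(symbol, seen=frozenset()):
    """First set of a single symbol (no λ-productions assumed)."""
    if symbol not in GRAMMAR:
        return {symbol}
    if symbol in seen:
        return set()  # already being expanded: contributes nothing new
    out = set()
    for rhs in GRAMMAR[symbol]:
        out |= first_of(rhs[0], seen | {symbol})
    return out


def closure1(items):
    """Close a set of (lhs, rhs, dot, look-ahead) items under prediction."""
    result, work = set(items), list(items)
    while work:
        lhs, rhs, dot, la = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            rest = rhs[dot + 1:]
            lookaheads = first_of(rest[0]) if rest else {la}
            for production in GRAMMAR[rhs[dot]]:
                for u in lookaheads:
                    item = (rhs[dot], production, 0, u)
                    if item not in result:
                        result.add(item)
                        work.append(item)
    return frozenset(result)


def go_to1(state, x):
    """Advance the dot past x where possible, then close; None if nothing moves."""
    moved = {(l, r, d + 1, la) for (l, r, d, la) in state if d < len(r) and r[d] == x}
    return closure1(moved) if moved else None


def build_lr1():
    start = closure1({("S", ("E", "$"), 0, "λ")})
    states = {start: 0}          # configuration set -> state number
    transitions = {}             # (state number, symbol) -> state number
    pending = deque([start])
    while pending:
        s = pending.popleft()
        for x in SYMBOLS:
            t = go_to1(s, x)
            if t is None:
                continue
            if t not in states:
                states[t] = len(states)
                pending.append(t)
            transitions[(states[s], x)] = states[t]
    return states, transitions


states, transitions = build_lr1()
print(len(states), "states,", len(transitions), "transitions")
```

The count of 23 states agrees with the states 0 to 22 of the tracing example.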

LR(1) machine for G3, built state by state with go_to1 (transitions in parentheses):

state 0: S → • E $, {λ}; E → • E + T, {$ +}; E → • T, {$ +}; T → • T * P, {$ + *}; T → • P, {$ + *}; P → • id, {$ + *}; P → • ( E ), {$ + *}   (E: 1, T: 7, P: 4, id: 5, (: 6)
state 1: S → E • $, {λ}; E → E • + T, {$ +}   ($: 2, +: 3)
state 2 (Accept): S → E $ •, {λ}
state 3: E → E + • T, {$ +}; T → • T * P, {$ + *}; T → • P, {$ + *}; P → • id, {$ + *}; P → • ( E ), {$ + *}   (T: 11, P: 4, id: 5, (: 6)
state 4: T → P •, {$ + *}
state 5: P → id •, {$ + *}
state 6: P → ( • E ), {$ + *}; E → • E + T, {) +}; E → • T, {) +}; T → • T * P, {) + *}; T → • P, {) + *}; P → • id, {) + *}; P → • ( E ), {) + *}   (E: 12, T: 19, P: 14, id: 10, (: 18)
state 7: E → T •, {$ +}; T → T • * P, {$ + *}   (*: 8)
state 8: T → T * • P, {$ + *}; P → • id, {$ + *}; P → • ( E ), {$ + *}   (P: 9, id: 5, (: 6)
state 9: T → T * P •, {$ + *}
state 10: P → id •, {) + *}
state 11: E → E + T •, {$ +}; T → T • * P, {$ + *}   (*: 8)
state 12: P → ( E • ), {$ + *}; E → E • + T, {) +}   (): 13, +: 17)
state 13: P → ( E ) •, {$ + *}
state 14: T → P •, {) + *}
state 15: P → ( E ) •, {) + *}
state 16: P → ( E • ), {) + *}; E → E • + T, {) +}   (): 15, +: 17)
state 17: E → E + • T, {) +}; T → • T * P, {) + *}; T → • P, {) + *}; P → • id, {) + *}; P → • ( E ), {) + *}   (T: 20, P: 14, id: 10, (: 18)
state 18: P → ( • E ), {) + *}; E → • E + T, {) +}; E → • T, {) +}; T → • T * P, {) + *}; T → • P, {) + *}; P → • id, {) + *}; P → • ( E ), {) + *}   (E: 16, T: 19, P: 14, id: 10, (: 18)
state 19: E → T •, {) +}; T → T • * P, {) + *}   (*: 21)
state 20: E → E + T •, {) +}; T → T • * P, {) + *}   (*: 21)
state 21: T → T * • P, {) + *}; P → • id, {) + *}; P → • ( E ), {) + *}   (P: 22, id: 10, (: 18)
state 22: T → T * P •, {) + *}

Be careful with the look-aheads! Inside parentheses the look-aheads are {) +} and {) + *}, so state 6 goes to state 19 (not state 7) under T, and state 12 goes to state 17 (not state 3) under +.

LR(1) Parsing (23)

The go_to table used to drive an LR(1) parser is extracted directly from the LR(1) machine.

The algorithm to generate the go_to table is the same one we discussed for LR(0).

LR(1) Parsing (24)

The action table is extracted directly from the configuration sets of the LR(1) machine.

A projection function, P:

P : S1 × Vt → 2^Q

where S1 is the set of LR(1) machine states.

P(s, a) = {Reduce_i | B → ρ •, a ∈ s and production i is B → ρ} ∪ (if A → α • a β, b ∈ s then {Shift} else ∅)

LR(1) Parsing (25)

G is LR(1) if and only if

∀ s ∈ S1, ∀ a ∈ Vt: |P(s, a)| ≤ 1

If G is LR(1), the action table is trivially extracted from P:

P(s, $) = {Shift}           ⇒ action[s][$] = Accept
P(s, a) = {Shift}, a ≠ $    ⇒ action[s][a] = Shift
P(s, a) = {Reduce_i}        ⇒ action[s][a] = Reduce_i
P(s, a) = ∅                 ⇒ action[s][a] = Error

LR(1) Parsing (26)

Example: in state 7,

Reduce when the look-ahead is $ or +
Shift when the look-ahead is *

P(s, a) = {Reduce_i | B → ρ •, a ∈ s and production i is B → ρ} ∪ (if A → α • a β, b ∈ s then {Shift} else ∅)
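The projection function can be tried on state 7 directly. The sketch below encodes the two configurations of LR(1) state 7 by hand; `STATE_7`, `PRODUCTION_NUMBER` and `project` are illustrative names, and only the productions that occur in this state are numbered.

```python
# LR(1) state 7 of G3: (lhs, rhs, dot position, look-ahead set).
STATE_7 = [
    ("E", ("T",), 1, {"$", "+"}),                 # E -> T •
    ("T", ("T", "*", "P"), 1, {"$", "+", "*"}),   # T -> T • * P
]
PRODUCTION_NUMBER = {("E", ("T",)): 3, ("T", ("T", "*", "P")): 4}
TERMINALS = {"+", "*", "id", "(", ")", "$"}


def project(state, a):
    """Return the set of actions P(s, a) for terminal a."""
    actions = set()
    for lhs, rhs, dot, lookaheads in state:
        if dot == len(rhs):
            # Completed configuration: reduce only on its own look-aheads.
            if a in lookaheads:
                actions.add(f"R{PRODUCTION_NUMBER[(lhs, rhs)]}")
        elif rhs[dot] == a and a in TERMINALS:
            # Dot in front of terminal a: shift.
            actions.add("S")
    return actions


for terminal in ["$", "+", "*", ")"]:
    print(terminal, project(STATE_7, terminal))
```

Every set has at most one element, so state 7 is conflict-free in LR(1); an empty set corresponds to an Error entry.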

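Complete Table

The action table and the go_to table are merged ("-" marks an error entry; Sn under a non-terminal is the go_to entry):

State   +     *     id    (     )     $     E     T     P
0       -     -     S5    S6    -     -     S1    S7    S4
1       S3    -     -     -     -     A     -     -     -
2       -     -     -     -     -     -     -     -     -
3       -     -     S5    S6    -     -     -     S11   S4
4       R5    R5    -     -     -     R5    -     -     -
5       R6    R6    -     -     -     R6    -     -     -
6       -     -     S10   S18   -     -     S12   S19   S14
7       R3    S8    -     -     -     R3    -     -     -
8       -     -     S5    S6    -     -     -     -     S9
9       R4    R4    -     -     -     R4    -     -     -
10      R6    R6    -     -     R6    -     -     -     -
11      R2    S8    -     -     -     R2    -     -     -
12      S17   -     -     -     S13   -     -     -     -
13      R7    R7    -     -     -     R7    -     -     -
14      R5    R5    -     -     R5    -     -     -     -
15      R7    R7    -     -     R7    -     -     -     -
16      S17   -     -     -     S15   -     -     -     -
17      -     -     S10   S18   -     -     -     S20   S14
18      -     -     S10   S18   -     -     S16   S19   S14
19      R3    S21   -     -     R3    -     -     -     -
20      R2    S21   -     -     R2    -     -     -     -
21      -     -     S10   S18   -     -     -     -     S22
22      R4    R4    -     -     R4    -     -     -     -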

Compare G3 actions in LR(0) and LR(1)

Grammar G3:

1. S → E $
2. E → E + T
3. E → T
4. T → T * P
5. T → P
6. P → id
7. P → ( E )

LR(0) action table (the same action for any symbol):

State    0   1   2   3   4    5    6   7      8   9    10   11     12
Action   S   S   A   S   R5   R6   S   S/R3   S   R4   R7   S/R2   S

States 7 and 11 each contain both a shift and a reduce action, so the LR(0) table is ambiguous there. The LR(1) actions are those of the complete table for states 0 to 22, where the look-ahead resolves the choice.

State 7 in LR(0):   E → T •; T → T • * P                      (shift or reduce 3: ambiguous)
State 7 in LR(1):   E → T •, {$ +}; T → T • * P, {$ + *}      (reduce 3 on $ or +, shift on *)

Parsing the input (id+id)$ with the LR(1) table for G3. Each slide of the original sequence adds one step and one node to the parse tree, which grows bottom-up.

Step   Parse stack      Remaining input   Action
1      0                (id+id)$          shift (
2      0 6              id+id)$           shift id
3      0 6 10           +id)$             Reduce 6
4      0 6 14           +id)$             Reduce 5
5      0 6 19           +id)$             Reduce 3
6      0 6 12           +id)$             shift +
7      0 6 12 17        id)$              shift id
8      0 6 12 17 10     )$                Reduce 6
9      0 6 12 17 14     )$                Reduce 5
10     0 6 12 17 20     )$                Reduce 2
11     0 6 12           )$                shift ) (to state 13)
12     0 6 12 13        $                 Reduce 7
13     0 4              $                 Reduce 5
14     0 7              $                 Reduce 3
15     0 1              $                 Accept

Outline 6.0 Introduction

6.1 Shift-Reduce Parsers

6.2 LR Parsers

6.3 LR(1) Parsing

6.4 SLR(1)Parsing

6.5 LALR(1)

6.6 Calling Semantic Routines in Shift-Reduce Parsers

SLR(1) Parsing (1)

LR(1) parsers are the most powerful class of shift-reduce parsers that use a single look-ahead.

LR(1) grammars exist for virtually all programming languages.

The problem with LR(1) is that the LR(1) machine contains so many states that the go_to and action tables become prohibitively large.

In reaction to the space inefficiency of LR(1) tables, computer scientists have devised parsing techniques that are almost as powerful as LR(1) but require far smaller tables.

One approach is to start with the CFSM and then add look-aheads after the CFSM is built:
– SLR(1)

The other approach to reducing LR(1)'s space inefficiency is to merge inessential LR(1) states:
– LALR(1)

SLR(1) Parsing (2)

SLR(1) stands for Simple LR(1): one-symbol look-ahead.

Look-aheads are not built directly into configurations but rather are added after the LR(0) configuration sets are built.

An SLR(1) parser will perform a reduce action for configuration B → ρ • if the look-ahead symbol is in the set Follow(B).

The SLR(1) projection function, from CFSM states:

P : S0 × Vt → 2^Q

P(s, a) = {Reduce_i | B → ρ • ∈ s, a ∈ Follow(B) and production i is B → ρ} ∪ (if A → α • a β ∈ s for a ∈ Vt then {Shift} else ∅)

SLR(1) Parsing (3)

G is SLR(1) if and only if

∀ s ∈ S0, ∀ a ∈ Vt: |P(s, a)| ≤ 1

If G is SLR(1), the action table is trivially extracted from P:

P(s, $) = {Shift}           ⇒ action[s][$] = Accept
P(s, a) = {Shift}, a ≠ $    ⇒ action[s][a] = Shift
P(s, a) = {Reduce_i}        ⇒ action[s][a] = Reduce_i
P(s, a) = ∅                 ⇒ action[s][a] = Error

Clearly SLR(1) is a proper superset of LR(0).

SLR(1) Parsing (4)

Consider G3. It is LR(1) but not LR(0). What are the Follow sets in G3?

G3:

1. S → E $
2. E → E + T
3. E → T
4. T → T * P
5. T → P
6. P → id
7. P → ( E )

Follow(S) = {λ}
Follow(E) = {+ ) $}
Follow(T) = {+ * ) $}
Follow(P) = {+ * ) $}
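These Follow sets can be checked with a standard fixed-point computation. The sketch below assumes a grammar without λ-productions (true for G3) and treats $ as an ordinary terminal of production 1; the function names `first_sets` and `follow_sets` are illustrative.

```python
GRAMMAR = [
    ("S", ["E", "$"]),
    ("E", ["E", "+", "T"]),
    ("E", ["T"]),
    ("T", ["T", "*", "P"]),
    ("T", ["P"]),
    ("P", ["id"]),
    ("P", ["(", "E", ")"]),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_sets():
    """First sets of the non-terminals; valid only without λ-productions."""
    first = {nt: set() for nt in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = first[head] if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def follow_sets():
    first = first_sets()
    follow = {nt: set() for nt in NONTERMINALS}
    follow["S"].add("λ")  # nothing follows the augmenting production
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, symbol in enumerate(rhs):
                if symbol not in NONTERMINALS:
                    continue
                if i + 1 < len(rhs):
                    # Whatever can start the next symbol can follow this one.
                    nxt = rhs[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:
                    # Last symbol: inherits the Follow set of the left-hand side.
                    new = follow[lhs]
                if not new <= follow[symbol]:
                    follow[symbol] |= new
                    changed = True
    return follow


FOLLOW = follow_sets()
for nt in ["S", "E", "T", "P"]:
    print(f"Follow({nt}) =", sorted(FOLLOW[nt]))
```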

CFSM for G3, built state by state; the Follow sets are added afterwards for the reduce actions (transitions in parentheses):

state 0: S → • E $; E → • E + T; E → • T; T → • T * P; T → • P; P → • id; P → • ( E )   (E: 1, T: 7, P: 4, id: 5, (: 6)
state 1: S → E • $; E → E • + T   ($: 2, +: 3)
state 2 (Accept): S → E $ •
state 3: E → E + • T; T → • T * P; T → • P; P → • id; P → • ( E )   (T: 11, P: 4, id: 5, (: 6)
state 4: T → P •
state 5: P → id •
state 6: P → ( • E ); E → • E + T; E → • T; T → • T * P; T → • P; P → • id; P → • ( E )   (E: 12, T: 7, P: 4, id: 5, (: 6)
state 7: E → T •; T → T • * P   (*: 8)
state 8: T → T * • P; P → • id; P → • ( E )   (P: 9, id: 5, (: 6)
state 9: T → T * P •
state 10: P → ( E ) •
state 11: E → E + T •; T → T • * P   (*: 8)
state 12: P → ( E • ); E → E • + T   (): 10, +: 3)

SLR(1) Parsing (5)

SLR(1) action table

SLR(1) Parsing (6)

Limitations of the SLR(1) technique:

The use of Follow sets to estimate the look-aheads that predict reduce actions is less precise than using the exact look-aheads incorporated into LR(1) configurations.

An example follows.

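Compare LR(1) and SLR(1): the performance of detecting errors

Consider G3:

1. S → E $
2. E → E + T
3. E → T
4. T → T * P
5. T → P
6. P → id
7. P → ( E )

Consider the input: id )

LR(1):

Step   Parse stack   Remaining input   Action
1      0             id )              shift 5
2      0 5           )                 Error

SLR(1):

Step   Parse stack   Remaining input   Action
1      0             id )              shift 5
2      0 5           )                 Reduce 6
3      0 4           )                 Reduce 5
4      0 7           )                 Reduce 3
5      0 1           )                 Error

LR(1) detects the error immediately; SLR(1) performs three reductions before detecting it.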
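The difference at step 2 comes down to which set decides the reduce action in state 5. A minimal sketch, with the two sets copied from the slides and an illustrative helper `action_on`:

```python
# In the LR(1) machine, state 5 is  P -> id •, {$ + *}.
LR1_LOOKAHEADS_STATE_5 = {"$", "+", "*"}
# SLR(1) reduces by P -> id whenever the look-ahead is in Follow(P).
FOLLOW_P = {"+", "*", ")", "$"}


def action_on(lookahead, reduce_set):
    """Action in state 5: reduce by production 6 if the look-ahead is allowed."""
    return "Reduce 6" if lookahead in reduce_set else "Error"


print("LR(1): ", action_on(")", LR1_LOOKAHEADS_STATE_5))
print("SLR(1):", action_on(")", FOLLOW_P))
```

Follow(P) contains ) because P can be followed by ) somewhere in the grammar, even though that cannot happen in the left context that leads to state 5 from state 0.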


LALR(1) (1)

LALR(1) parsers can be built by first constructing an LR(1) parser and then merging states.

An LALR(1) parser is an LR(1) parser in which all states that differ only in the look-ahead components of the configurations are merged.

LALR is an acronym for Look-Ahead LR.

Example: LR(1) states 3 and 17.

state 3:  E → E + • T, {$ +}; T → • T * P, {$ + *}; T → • P, {$ + *}; P → • id, {$ + *}; P → • ( E ), {$ + *}
state 17: E → E + • T, {) +}; T → • T * P, {) + *}; T → • P, {) + *}; P → • id, {) + *}; P → • ( E ), {) + *}

The core of the above two configuration sets is the same.

Core s':

E → E + • T
T → • T * P
T → • P
P → • id
P → • ( E )

Cognate(s') = {c | c ∈ s, core(s) = s'}

Merged state 3: E → E + • T, {) $ +}; T → • T * P, {) $ + *}; T → • P, {) $ + *}; P → • id, {) $ + *}; P → • ( E ), {) $ + *}
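Merging by core amounts to grouping LR(1) states on their dotted productions and taking the union of the look-ahead sets. The sketch below does this for states 3 and 17; `core` and `merge_by_core` are illustrative names, and look-aheads are written as strings of one-character terminals purely for brevity.

```python
from collections import defaultdict


def core(state):
    """The core of a state: its dotted productions without look-aheads."""
    return frozenset((lhs, rhs, dot) for lhs, rhs, dot, _ in state)


def merge_by_core(states):
    """Map each core to its merged look-ahead sets (the Cognate of the core)."""
    merged = defaultdict(lambda: defaultdict(set))
    for state in states:
        for lhs, rhs, dot, lookaheads in state:
            merged[core(state)][(lhs, rhs, dot)] |= set(lookaheads)
    return merged


def make_state(e_lookaheads, tp_lookaheads):
    # Both states share these dotted productions and differ only in look-aheads.
    return [
        ("E", ("E", "+", "T"), 2, e_lookaheads),
        ("T", ("T", "*", "P"), 0, tp_lookaheads),
        ("T", ("P",), 0, tp_lookaheads),
        ("P", ("id",), 0, tp_lookaheads),
        ("P", ("(", "E", ")"), 0, tp_lookaheads),
    ]


STATE_3 = make_state("$+", "$+*")
STATE_17 = make_state(")+", ")+*")

merged = merge_by_core([STATE_3, STATE_17])
for (lhs, rhs, dot), lookaheads in merged[core(STATE_3)].items():
    print(lhs, "->", " ".join(rhs), "dot at", dot, sorted(lookaheads))
```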

[Figures: LR(1) G3 diagram; LALR(1) G3 diagram; SLR(1) G3 diagram (CFSM)]

Compare SLR(1) and LALR(1)

For G3, the action and goto behavior is the same whether SLR(1) or LALR(1) is used.

Follow(S) = {λ}
Follow(E) = {+ ) $}
Follow(T) = {+ * ) $}
Follow(P) = {+ * ) $}

Example: compare state 7 and state 10 in SLR(1) and LALR(1). Are they all the same? When do they differ?

LALR(1) (4)

The CFSM state is transformed into its LALR(1) cognate.

P : S0 × Vt → 2^Q

P(s, a) = {Reduce_i | B → ρ •, a ∈ Cognate(s) and production i is B → ρ} ∪ (if A → α • a β ∈ s then {Shift} else ∅)

G is LALR(1) if and only if

∀ s ∈ S0, ∀ a ∈ Vt: |P(s, a)| ≤ 1

If G is LALR(1), the action table is trivially extracted from P:

P(s, $) = {Shift}           ⇒ action[s][$] = Accept
P(s, a) = {Shift}, a ≠ $    ⇒ action[s][a] = Shift
P(s, a) = {Reduce_i}        ⇒ action[s][a] = Reduce_i
P(s, a) = ∅                 ⇒ action[s][a] = Error

LALR(1) (5)

Grammar G5:

…
<prog> → <stmt> ; {<stmt> ;}
<stmt> → ID
<stmt> → <var> := <expr>
<var> → ID
<var> → ID [ <expr> ]
<expr> → <var>

Assume statements are separated by ;'s. The grammar is not SLR(1) because

; ∈ Follow(<stmt>) and
; ∈ Follow(<var>), since <expr> → <var>

CFSM states:

state 0: …; <prog> → • <stmt> ; {<stmt> ;}; <stmt> → • ID; <stmt> → • <var> := <expr>; <var> → • ID; <var> → • ID [ <expr> ]; <expr> → • <var>   (ID: state 1)
state 1: <stmt> → ID •; <var> → ID •; <var> → ID • [ <expr> ]

State 1 has a reduce-reduce conflict on ;.

LALR(1) (6)

However, in LALR(1), if we use <var> → ID the next symbol must be :=, so

action[1, :=] = reduce(<var> → ID)
action[1, ;] = reduce(<stmt> → ID)
action[1, [] = shift

There is no conflict.

state 0: …; <prog> → • <stmt> ; {<stmt> ;}, {$ ;}; <stmt> → • ID, {$ ;}; <stmt> → • <var> := <expr>, {$ ;}; <var> → • ID, {:=}; <var> → • ID [ <expr> ], {:=}; <expr> → • <var>   (ID: state 1)
state 1: <stmt> → ID •, {$ ;}; <var> → ID •, {:=}; <var> → ID • [ <expr> ], {:=}

LALR(1) (7)

A common technique to put an LALR(1) grammar into SLR(1) form is to introduce a new non-terminal whose global (i.e. SLR) look-aheads more nearly correspond to LALR's exact look-aheads.

Original grammar G5:

…
<prog> → <stmt> ; {<stmt> ;}
<stmt> → ID
<stmt> → <var> := <expr>
<var> → ID
<var> → ID [ <expr> ]
<expr> → <var>

Modified grammar G5:

…
<prog> → <stmt> ; {<stmt> ;}
<stmt> → ID
<stmt> → <lhs> := <expr>
<lhs> → ID
<lhs> → ID [ <expr> ]
<var> → ID
<var> → ID [ <expr> ]
<expr> → <var>

Follow(<lhs>) = {:=}

LALR(1) (8)

SLR(1) and LALR(1) are both built on the CFSM.

Does the case ever occur in which the action table cannot work? At times, it is the CFSM itself that is at fault.

In the grammar below, a different expression non-terminal is used to allow error or warning diagnostics.

Grammar G6:

S → ( Exp1 )
S → [ Exp1 ]
S → ( Exp2 ]
S → [ Exp2 )
Exp1 → ID
Exp2 → ID

In state 4, after the reduce, we do not know which state should be the next state.

In LR(1), state 4 is split into two states, which resolves the problem.

Building LALR(1) Parsers (1)

In the definition of LALR(1), an LR(1) machine is first built, and then its states are merged to form an automaton identical in structure to the CFSM. This may be quite inefficient.

An alternative is to build the CFSM first. Then LALR(1) look-aheads are "propagated" from configuration to configuration.

Propagate links:

Case 1: one configuration is created from another in a previous state via a shift operation.

    A → α • X γ, L1   ⟶   A → α X • γ, L2

Case 2: one configuration is created as the result of a closure or prediction operation on another configuration.

    B → β • A γ, L1   ⟶   A → • ρ, L2

    L2 = {x | x ∈ First(γ t) and t ∈ L1}

Building LALR(1) Parsers (2)

Step 1: After the CFSM is built, we create all the necessary propagate links to transmit look-aheads from one configuration to another (case 1).

Step 2: Spontaneous look-aheads are determined (case 2) by including in L2, for configuration A → • ρ, L2, all spontaneous look-aheads induced by configurations of the form B → β • A γ, L1. These are simply the non-λ values of First(γ).

Step 3: Then, propagate look-aheads via the propagate links:

while (stack is not empty)
{
    pop the top item, assign its components to (s, c, L)
    if (configuration c in state s has any propagate links)
    {
        Try, in turn, to add L to the look-ahead set of each configuration so linked.
        for (each configuration c' in state s' to which L is added)
            Push (s', c', L) onto the stack
    }
}

Building LALR(1) Parsers(3) state 1 S Opts$ Opts Opt Opt Opt ID

grammar G6 : S Opts $

Opts Opt Opt

Opt ID

Opt state 2 Opts Opt Opt Opt ID

state 3 Opt ID

ID ID

Build CFSM

state 1 S Opts$ , {} Opts Opt Opt ,{$} Opt ID,{ID}

Opt state 2 Opts Opt Opt Opt ID

state 3 Opt ID

ID ID

Build initial Lookahead

Stack:

(s1,c2,$)

(s1,c3,ID)

179

Building LALR(1) Parsers(3)

Opt state 2 Opts Opt Opt,{$} Opt ID

state 3 Opt ID

ID ID Step1:

Pop(s1,c2,$)

Add $ to c1 in s2

Push(s2,c1,$)

state 1 S Opts$ , {} Opts Opt Opt ,{$} Opt ID,{ID}

Opt state 2 Opts Opt Opt.{$} Opt ID,{$}

state 3 Opt ID

ID ID

Stack:

(s2,c1,$)

(s1,c3,ID)

Stack:

(s1,c2,$)

(s1,c3,ID)

Step2:

Pop(s2,c1,$)

Add $ to c2 in s2

Push(s2,c2,$)

state 1 S Opts$ , {} Opts Opt Opt ,{$} Opt ID,{ID}

grammar G6 : S Opts $

Opts Opt Opt

Opt ID

180

Building LALR(1) Parsers(4) state 1 S Opts$ , {} Opts Opt Opt ,{$} Opt ID,{ID}

Opt state 2 Opts Opt Opt.{$} Opt ID,{$}

state 3 Opt ID ,{$}

ID ID

Stack:

(s2,c2,$)

(s1,c3,ID)

Step3:

Pop(s2,c2,$)

Add $ to c1 in s3

Push(s3,c1,$)

state 1 S Opts$ , {} Opts Opt Opt ,{$} Opt ID,{ID}

Opt

ID ID

Stack:

(s3,c1,$)

(s1,c3,ID)

Step4:

Pop(s3,c1,$)

Nothing to added

(no links)

state 2 Opts Opt Opt.{$} Opt ID,{$}

state 3 Opt ID ,{$}

grammar G6 : S Opts $

Opts Opt Opt

Opt ID

181

Building LALR(1) Parsers (4)

Stack:
(s1,c3,ID)

Step 5:
Pop (s1,c3,ID)
Add ID to c1 in s3
Push (s3,c1,ID)

state 3: Opt → ID •, {$ ID}

Stack:
(s3,c1,ID)

Step 6:
Pop (s3,c1,ID)
Nothing to be added (no links)

Building LALR(1) Parsers (5)

Stack: (empty)

Step 7:
Terminate algorithm

Final look-ahead sets:

state 1: S → • Opts $, {}
         Opts → • Opt Opt, {$}
         Opt → • ID, {ID}
state 2: Opts → Opt • Opt, {$}
         Opt → • ID, {$}
state 3: Opt → ID •, {$ ID}

Building LALR(1) Parsers (6)

A number of LALR(1) parser generators use look-ahead propagation to compute the parser action table

LALR-Gen uses the propagation algorithm

YACC examines each state repeatedly

Outline

6.0 Introduction

6.1 Shift-Reduce Parsers

6.2 LR Parsers

6.3 LR(1) Parsing

6.4 SLR(1) Parsing

6.5 LALR(1)

6.6 Calling Semantic Routines in Shift-Reduce Parsers

Calling Semantic Routines in Shift-Reduce Parsers (1)

Shift-reduce parsers

can normally handle larger classes of grammars than LL(1) parsers, which is a major reason for their popularity

are not predictive

so we cannot always be sure what production is being recognized until its entire right-hand side has been matched

The semantic routines can be invoked only after a production is recognized and reduced

Action symbols can therefore appear only at the extreme right end of a right-hand side

Calling Semantic Routines in Shift-Reduce Parsers (2)

Two common tricks are known that allow more flexible placement of semantic routine calls

For example,

<stmt> → if <expr> then <stmts> else <stmts> end if

We need to call semantic routines after the conditional expression, else, and end if are matched

Solution: create new non-terminals that generate λ

<stmt> → if <expr> <test cond> then <stmts> <process then part> else <stmts> end if

<test cond> → λ

<process then part> → λ

Calling Semantic Routines in Shift-Reduce Parsers (3)

If the right-hand sides differ in the semantic routines that are to be called, the parser will be unable to correctly determine which routines to invoke

Ambiguity will manifest. For example,

<stmt> → if <expr> <test cond1> then <stmts> <process then part> else <stmts> end if;

<stmt> → if <expr> <test cond2> then <stmts> <process then part> else <stmts> end if;

<test cond1> → λ

<test cond2> → λ

<process then part> → λ

After if <expr> has been matched, the parser cannot tell whether to reduce λ to <test cond1> or to <test cond2>

Calling Semantic Routines in Shift-Reduce Parsers (4)

An alternative to the use of λ-generating non-terminals is to break a production into a number of pieces, with the breaks placed where semantic routines are required

<stmt> → <if head> <then part> <else part>

<if head> → if <expr>

<then part> → then <stmts>

<else part> → else <stmts> end if;

This approach can make productions harder to read but has the advantage that no λ-generating non-terminals are needed

Outline

6.0 Introduction

6.1 Shift-Reduce Parsers

6.2 LR Parsers

6.3 LR(1) Parsing

6.4 SLR(1) Parsing

6.5 LALR(1)

6.6 Calling Semantic Routines in Shift-Reduce Parsers

6.7 Using a Parser Generator (TA course)

6.8 Optimizing Parse Tables

6.9 Practical LR(1) Parsers

6.10 Properties of LR Parsing

6.11 LL(1) or LALR(1), That is the question

6.12 Other Shift-Reduce Techniques

Optimizing Parse Tables (1)

Step 1: Merge the Action table and the Go-to table

Action table

Lookahead   State
            0    1    2    3    4    5    6    7    8    9    10   11   12
+           -    S    -    -    R5   R6   -    R3   -    R4   R7   R2   S
*           -    -    -    -    R5   R6   -    S    -    R4   R7   S    -
id          S    -    -    S    -    -    S    -    S    -    -    -    -
(           S    -    -    S    -    -    S    -    S    -    -    -    -
)           -    -    -    -    R5   R6   -    R3   -    R4   R7   R2   S
$           -    A    -    -    R5   R6   -    R3   -    R4   R7   R2   -

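Optimizing Parse Tables (1)

Step 1: Merge the Action table and the Go-to table

Goto table

Lookahead   State
            0    1    2    3    4    5    6    7    8    9    10   11   12
+           -    3    -    -    -    -    -    -    -    -    -    -    3
*           -    -    -    -    -    -    -    8    -    -    -    8    -
id          5    -    -    5    -    -    5    -    5    -    -    -    -
(           6    -    -    6    -    -    6    -    6    -    -    -    -
)           -    -    -    -    -    -    -    -    -    -    -    -    10
$           -    -    -    -    -    -    -    -    -    -    -    -    -
S           -    -    -    -    -    -    -    -    -    -    -    -    -
E           1    -    -    -    -    -    12   -    -    -    -    -    -
T           7    -    -    11   -    -    7    -    -    -    -    -    -
P           4    -    -    4    -    -    4    -    9    -    -    -    -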

Optimizing Parse Tables (3)

Action table + Goto table = Complete table

Each S entry of the Action table takes its target state from the Goto table; the non-terminal rows are written with the same S-prefix.

Lookahead   State
            0    1    2    3    4    5    6    7    8    9    10   11   12
+           -    S3   -    -    R5   R6   -    R3   -    R4   R7   R2   S3
*           -    -    -    -    R5   R6   -    S8   -    R4   R7   S8   -
id          S5   -    -    S5   -    -    S5   -    S5   -    -    -    -
(           S6   -    -    S6   -    -    S6   -    S6   -    -    -    -
)           -    -    -    -    R5   R6   -    R3   -    R4   R7   R2   S10
$           -    A    -    -    R5   R6   -    R3   -    R4   R7   R2   -
S           -    -    -    -    -    -    -    -    -    -    -    -    -
E           S1   -    -    -    -    -    S12  -    -    -    -    -    -
T           S7   -    -    S11  -    -    S7   -    -    -    -    -    -
P           S4   -    -    S4   -    -    S4   -    S9   -    -    -    -

Optimizing Parse Tables (2)

Single Reduce State

A single-reduce state always simply reduces by one production (in the complete table: states 4, 5, 9, and 10)

Because such a state always reduces, can we simplify the table by representing it differently?

Optimizing Parse Tables (2)

Step 2:

Eliminate all single-reduce states.

Replace each shift into such a state with a special marker, the L-prefix

Example

A shift to state 4 is replaced by the entry L5 (shift, then immediately reduce by production 5)

Because only one reduction is possible in such a state, we need not ever go to that state

Cancel the column for state 4 and replace every S4 with L5
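Step 2 can be sketched in C as a pass over the complete table. This is a rough illustration, not the slides' implementation: the `Entry` encoding, the symbol and state numbering, and the `g3_table` data (transcribed from the complete table for G3) are assumptions made here.

```c
enum Sym { PLUS, STAR, ID, LP, RP, END, nS, nE, nT, nP, NSYM };
enum Kind { ERR, SH, RED, LRED, ACC };     /* ERR is 0, so blank entries are errors */
typedef struct { enum Kind kind; int arg; } Entry;
enum { NSTATES = 13 };

#define S_(n) {SH, (n)}
#define R_(n) {RED, (n)}
#define RROW(n) { [PLUS] = R_(n), [STAR] = R_(n), [RP] = R_(n), [END] = R_(n) }

/* Complete (merged) table for G3, as read from the slide. */
static Entry g3_table[NSTATES][NSYM] = {
    [0]  = { [ID] = S_(5), [LP] = S_(6), [nE] = S_(1), [nT] = S_(7), [nP] = S_(4) },
    [1]  = { [PLUS] = S_(3), [END] = {ACC, 0} },
    [3]  = { [ID] = S_(5), [LP] = S_(6), [nT] = S_(11), [nP] = S_(4) },
    [4]  = RROW(5),
    [5]  = RROW(6),
    [6]  = { [ID] = S_(5), [LP] = S_(6), [nE] = S_(12), [nT] = S_(7), [nP] = S_(4) },
    [7]  = { [PLUS] = R_(3), [STAR] = S_(8), [RP] = R_(3), [END] = R_(3) },
    [8]  = { [ID] = S_(5), [LP] = S_(6), [nP] = S_(9) },
    [9]  = RROW(4),
    [10] = RROW(7),
    [11] = { [PLUS] = R_(2), [STAR] = S_(8), [RP] = R_(2), [END] = R_(2) },
    [12] = { [PLUS] = S_(3), [RP] = S_(10) },
};

/* Returns production i if every non-blank entry of `state` is Ri, else 0. */
static int single_reduce_production(Entry t[NSTATES][NSYM], int state)
{
    int prod = 0;
    for (int x = 0; x < NSYM; x++) {
        Entry e = t[state][x];
        if (e.kind == ERR)
            continue;
        if (e.kind != RED || (prod != 0 && e.arg != prod))
            return 0;
        prod = e.arg;
    }
    return prod;
}

/* Replaces every shift into a single-reduce state by an L entry,
 * then blanks the columns of the states that can no longer be reached. */
void eliminate_single_reduce_states(Entry t[NSTATES][NSYM])
{
    int prod_of[NSTATES];
    for (int s = 0; s < NSTATES; s++)
        prod_of[s] = single_reduce_production(t, s);

    for (int s = 0; s < NSTATES; s++)
        for (int x = 0; x < NSYM; x++)
            if (t[s][x].kind == SH && prod_of[t[s][x].arg] != 0)
                t[s][x] = (Entry){LRED, prod_of[t[s][x].arg]};

    for (int s = 0; s < NSTATES; s++)
        if (prod_of[s] != 0)
            for (int x = 0; x < NSYM; x++)
                t[s][x] = (Entry){ERR, 0};
}
```

Calling `eliminate_single_reduce_states(g3_table)` turns S4 into L5, S5 into L6, S9 into L4, and S10 into L7, and empties the columns for states 4, 5, 9, and 10.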

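Optimizing Parse Tables (3)

Lookahead   State
            0    1    2    3    6    7    8    11   12
+           -    S3   -    -    -    R3   -    R2   S3
*           -    -    -    -    -    S8   -    S8   -
id          L6   -    -    L6   L6   -    L6   -    -
(           S6   -    -    S6   S6   -    S6   -    -
)           -    -    -    -    -    R3   -    R2   L7
$           -    A    -    -    -    R3   -    R2   -
S           -    -    -    -    -    -    -    -    -
E           S1   -    -    -    S12  -    -    -    -
T           S7   -    -    S11  S7   -    -    -    -
P           L5   -    -    L5   L5   -    L4   -    -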

Shift-Reduce Parsers

void shift_reduce_driver(void)
{
    /* Push the Start State, S0, onto an empty parse stack. */
    push(S0);
    while (TRUE) { /* forever */
        /* Let S be the top parse stack state;
         * let T be the current input token. */
        switch (action[S][T]) {
        case ERROR:
            announce_syntax_error();
            break;
        case ACCEPT:
            /* The input has been correctly parsed. */
            clean_up_and_finish();
            return;
        case SHIFT:
            push(go_to[S][T]);
            scanner(&T); /* Get next token. */
            break;
        case REDUCEi:
            /* Assume i-th production is X → Y1 ... Ym.
             * Remove states corresponding to the RHS of the production. */
            pop(m);
            /* S' is the new stack top. */
            push(go_to[S'][X]);
            break;
        case Li:
            /* Assume i-th production is X → Y1 ... Ym.
             * Remove states corresponding to the RHS of the production;
             * the state for Ym was never pushed, so only m-1 are popped. */
            pop(m-1);
            /* S' is the new stack top. */
            push(go_to[S'][X]);
            break;
        }
    }
}
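As a runnable counterpart to the driver above, here is a minimal sketch that parses token arrays for G3 with the optimized table. Several details are assumptions made here rather than taken from the slides: the `Entry` encoding, the merged `table` layout, the `parse` signature, and the handling of an L entry selected by a terminal, where the sketch advances the input explicitly (the slide's driver leaves this implicit). A goto entry that is itself an L entry is followed in the same loop.

```c
#include <stddef.h>

enum Sym { PLUS, STAR, ID, LP, RP, END, nS, nE, nT, nP, NSYM };
enum Kind { ERR, SH, RED, LRED, ACC };     /* ERR is 0, so blank entries are errors */
typedef struct { enum Kind kind; int arg; } Entry;
enum { NSTATES = 13, MAX_STACK = 64, MAX_RED = 64 };

#define S_(n) {SH, (n)}
#define R_(n) {RED, (n)}
#define L_(n) {LRED, (n)}

/* Optimized table for G3 (states 4, 5, 9, 10 eliminated). */
static const Entry table[NSTATES][NSYM] = {
    [0]  = { [ID] = L_(6), [LP] = S_(6), [nE] = S_(1), [nT] = S_(7), [nP] = L_(5) },
    [1]  = { [PLUS] = S_(3), [END] = {ACC, 0} },
    [3]  = { [ID] = L_(6), [LP] = S_(6), [nT] = S_(11), [nP] = L_(5) },
    [6]  = { [ID] = L_(6), [LP] = S_(6), [nE] = S_(12), [nT] = S_(7), [nP] = L_(5) },
    [7]  = { [PLUS] = R_(3), [STAR] = S_(8), [RP] = R_(3), [END] = R_(3) },
    [8]  = { [ID] = L_(6), [LP] = S_(6), [nP] = L_(4) },
    [11] = { [PLUS] = R_(2), [STAR] = S_(8), [RP] = R_(2), [END] = R_(2) },
    [12] = { [PLUS] = S_(3), [RP] = L_(7) },
};

/* Productions 1..7 of G3; index 0 is unused. */
static const enum Sym lhs[8] = {nS, nS, nE, nE, nT, nT, nP, nP};
static const int rhs_len[8]  = {0, 2, 3, 1, 3, 1, 1, 3};

/* Parses `input` (which must end with END). Returns 1 on accept, 0 on a
 * syntax error. The production numbers applied are stored in `reductions`
 * (room for MAX_RED entries assumed) and counted in *nred. */
int parse(const enum Sym *input, int *reductions, int *nred)
{
    int stack[MAX_STACK], top = 0;
    size_t pos = 0;
    enum Sym sym = input[0];    /* symbol that selects the next table entry */

    stack[0] = 0;
    *nred = 0;
    for (;;) {
        Entry e = table[stack[top]][sym];
        switch (e.kind) {
        case ERR:
            return 0;
        case ACC:
            return 1;
        case SH:
            if (top + 1 >= MAX_STACK)
                return 0;
            stack[++top] = e.arg;
            if (sym <= END)
                pos++;                  /* a terminal was consumed */
            sym = input[pos];
            break;
        case RED:
        case LRED:
            if (e.kind == LRED && sym <= END)
                pos++;                  /* the shifted token is consumed */
            /* An L entry never pushed the state for the last RHS symbol. */
            top -= rhs_len[e.arg] - (e.kind == LRED ? 1 : 0);
            if (*nred < MAX_RED)
                reductions[(*nred)++] = e.arg;
            sym = lhs[e.arg];           /* next lookup is the goto on the LHS */
            break;
        }
    }
}
```

For the input ( id + id ) $ the sketch visits the same stack configurations as the worked example and accepts in state 1 on $, so production 1 is never reduced explicitly.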

Example (1)-(16)

for grammar G3:

1. S → E $
2. E → E + T
3. E → T
4. T → T * P
5. T → P
6. P → id
7. P → ( E )

Input: (id+id)$

The optimized table from Optimizing Parse Tables (3) is used at every step.

Step    Parse stack     Remaining input   Action
1       0               (id+id)$          shift (
2       0 6             id+id)$           L6
3       0 6             id+id)$           L5
4       0 6             id+id)$           shift id
5       0 6 7           +id)$             Reduce 3
6       0 6 12          +id)$             shift +
7       0 6 12 3        id)$              L6
8       0 6 12 3        id)$              L5
9       0 6 12 3        id)$              shift id
10      0 6 12 3 11     )$                Reduce 2
11      0 6 12          )$                L7
12      0               )$                L5
13      0               )$                shift )
14      0 7             $                 Reduce 3
15      0 1             $                 Accept

Tree: (figure) the parse tree is built bottom-up, one node per step: P and T above the first id, E above T, P and T above the second id, E above E + T, P above ( E ), then T and E at the root.

LR(1) Parsers

Very powerful; most languages can be recognized by them

But the LR(1) machine contains so many states that the GoTo and Action tables are prohibitively large.

Alternatives to LR(1) Parsers

LR(0) Parsers

Very compact tables

But with no lookahead, not very powerful

SLR(1) – Simple LR(1) parsers

Add lookahead to LR(0) tables

Almost as powerful as LR(1) but much smaller

LALR(1) – look-ahead LR(1) parsers

Start with LR(1) states and merge states differing only in the look-ahead

Smaller and slightly weaker than LR(1)

LL(1) or LALR(1), That is the question (1)

--Modified by http://www.csie.ntu.edu.tw/~compiler/

Grammar classes (nested): LR(0) grammar ⊂ SLR(1) grammar ⊂ LALR(1) grammar ⊂ LR(1) grammar

                  LR(0)      SLR(1)     LALR(1)    LR(1)
state number      n          n          n          N
action table †    n × 1      n × |VT|   n × |VT|   N × |VT|
goto table †      n × |V|    n × |V|    n × |V|    N × |V|
power             increases from left to right

† before compression

LALR(1) is the most commonly used bottom-up parsing method

LL(1) or LALR(1), That is the question (2)

--Modified by http://www.csie.ntu.edu.tw/~compiler/

Two most popular parsing methods

simplicity: LL(1) is simpler

generality: almost all LL(1) grammars are LALR(1); a grammar in LALR(1) form is more readable

placement of action symbols: LL(1): anywhere in rhs; LALR(1): extreme right end of rhs, essentially

error repair: LL(1): simpler, because parse stack has predicted information; LALR(1): parse stack just has matched information

table sizes: LL(1): |VN| × |VT|; LALR(1): |states| × |V|, and |states| may be exponential

parsing speed: comparable

semantic stack: easier manipulation

Shift-reduce parsers differ in their use of Follow information:

LR(0) parsers never consult the lookahead at all.

SLR(1) parsers use the Follow sets as previously constructed.

LR(1) parsers use context to split the Follow sets into subsets for different parsing paths (huge, inefficient parsers).

LALR(1) parsers: like LR(1) but coarser subsets are used (achieves most of the benefit, but much smaller and faster).

LL(1) vs LALR(1)

LL(1) and LALR(1) are dominant types

Although variants are used (recursive descent and SLR(1))

LL(1) is simpler

LALR(1) is more general

Most languages can be represented by an LL(1) or LALR(1) grammar, but it is easier to write the LALR(1) grammar

LL(1) can be easier to specify actions

Error repair is easier to do in LL(1)

LL(1) tables will be ~½ size of LALR(1)

A Comparison of Predictive Parsers with Shift-Reduce Parsers

Both parsers read the input from left-to-right and maintain a stack of grammar symbols but their parsing operations are decidedly different as shown in the following table:

Predictive Parser | Shift-Reduce Parser
Top-down (LL) Parser | Bottom-up (LR) Parser
Stack predicts what is to come | Stack shows what has been seen so far
The stack initially contains the start symbol of the grammar. | The stack is initially empty.
The stack is empty when the accept state is reached. | The stack contains the start symbol of the grammar when the accept state is reached.
Input tokens are popped off the stack. | Input tokens are pushed on the stack.
Left sides of productions are popped off the stack. | Right sides of productions are popped off the stack.
Right sides of productions are pushed on the stack. | Left sides of productions are pushed on the stack.

Properties of LR(1) Parsers

A correct rightmost parse is guaranteed

Since LR-style parsers accept only viable prefixes, syntax errors are detected as soon as the parser attempts to shift a token that isn't part of a viable prefix

Prompt error reporting

They are linear in operation

All LR(1) grammars are unambiguous

Will yacc generate a parser for an ambiguous grammar?