Lecture #10, Feb. 14, 2007


Upload: cooper-burke

Post on 03-Jan-2016

Page 1: Lecture #10,  Feb. 14, 2007

Cse321, Programming Languages and Compilers

Lecture #10, Feb. 14, 2007
• Modified sets of items construction
• Rules for building LR parse tables
• The ACTION rules
• The GOTO rules
• Conflicts and ambiguity
• Shift-reduce and reduce-reduce conflicts
• Parser generators and ambiguity
• Ambiguous expression grammar
• Ambiguous if-then-else grammar
• ml-yacc

Page 2: Lecture #10,  Feb. 14, 2007


Assignments

• Homework
  – Assignment 7 will be accepted until the end of the week
    » good review for exam!
  – Assignment 8 (paper and pencil) is posted & due Mon. Feb 19
    » good review for exam!
  – Assignment 9 (programming) is posted & due Wed. Feb 21
    » in case you're interested
• Project 1 is due today.
  – Email me the code
  – Name files with your last name as discussed in the Project 1 description.
• Midterm Exam will be Monday, Feb 19, 2007
  – Exam will be closed book
  – Exam will take 60 minutes
  – We will have a short lecture after the exam.
• Project 2 will be assigned next Monday, Feb 19, 2007

Page 3: Lecture #10,  Feb. 14, 2007


To facilitate table building

• To facilitate table building, we modify the sets-of-items construction slightly.
• Each item now has three components:
  – A production
  – A location for the dot
  – A lookahead terminal: a terminal that can validly follow the production in this context. This is similar to, but not quite the same as, the terminals in the Follow set of the production's left-hand side.
• Examples (for the grammar below):
  – [Start → . E, EOF]
  – [T → T . * F, +]

Start → E
E → E + T
  | T
T → T * F
  | F
F → ( E )
  | id

Page 4: Lecture #10,  Feb. 14, 2007


Modified Closure

• Let I be a set of modified items.
• Then Closure(I) is built by starting with I and repeating:
  – For each item i in the set so far, where i = [A → α . B β, x]
  – For each p ∈ Productions, where p = B → γ
  – For each t ∈ Terminals, where t ∈ First(βx)
  – Add [B → . γ, t] to Closure(I) if it's not already there
• Until no new items can be added.

Page 5: Lecture #10,  Feb. 14, 2007


Modified GOTO

• GOTO(I, X) =
  – Take each item in I of the form [A → α . X β, a]
    » i.e. the dot comes just before the X
  – Let J be the set of items [A → α X . β, a]
    » i.e. move the dot after the X
  – Return Closure(J)

Page 6: Lecture #10,  Feb. 14, 2007


Modified Sets of items Construction

• Start with a grammar with a Start symbol with only one production, Start → E.
  – If the grammar isn't of that form, create a new grammar that is of that form, with a new start symbol, that accepts the same set of strings.
• C := { Closure({ [Start → . E, EOF] }) }
• Repeat
  – For each set of items I ∈ C
  – For each X ∈ NonTerminal ∪ Terminal
  – Compute new := GOTO(I, X)
  – If new is not empty, and new is not already in C, add it to C
• Until no new sets of items can be added to C
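The modified closure, GOTO, and sets-of-items constructions can be sketched together in a few lines of Python. This is only an illustrative sketch, not the course's implementation: the item representation `(lhs, rhs, dot, lookahead)` and all function names are assumptions made here, and FIRST is simplified under the assumption that the grammar has no empty productions (true for the expression grammar used in this lecture).

```python
# Sketch of the modified (LR(1)) sets-of-items construction.
# An item is (lhs, rhs, dot, lookahead); rhs is a tuple of symbols.
# Assumption: no empty productions, so FIRST of a string is decided
# by its first symbol alone.

GRAMMAR = {
    "Start": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}
NONTERMINALS = set(GRAMMAR)


def first_sets():
    """FIRST for every nonterminal, iterated to a fixed point."""
    first = {nt: set() for nt in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in GRAMMAR.items():
            for rhs in alternatives:
                head = rhs[0]
                new = first[head] if head in NONTERMINALS else {head}
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first


FIRST = first_sets()


def first_of(symbols):
    """FIRST of a non-empty symbol string such as beta followed by x."""
    head = symbols[0]
    return FIRST[head] if head in NONTERMINALS else {head}


def closure(items):
    result = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot, look = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            b = rhs[dot]
            beta_x = rhs[dot + 1:] + (look,)
            for gamma in GRAMMAR[b]:
                for t in first_of(beta_x):
                    item = (b, gamma, 0, t)
                    if item not in result:
                        result.add(item)
                        work.append(item)
    return frozenset(result)


def goto(items, x):
    """Move the dot over x in every item that allows it, then close."""
    moved = {(lhs, rhs, dot + 1, look)
             for lhs, rhs, dot, look in items
             if dot < len(rhs) and rhs[dot] == x}
    return closure(moved) if moved else frozenset()


def sets_of_items():
    collection = [closure({("Start", ("E",), 0, "EOF")})]
    symbols = NONTERMINALS | {s for alts in GRAMMAR.values()
                              for rhs in alts for s in rhs}
    for state in collection:          # the list grows while we iterate
        for x in sorted(symbols):
            new = goto(state, x)
            if new and new not in collection:
                collection.append(new)
    return collection
```

Calling `sets_of_items()` returns the list of states, with the start state at index 0. Using frozensets for states makes the "not already in C" test a plain equality check.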

Page 7: Lecture #10,  Feb. 14, 2007


Building Tables for LR parsers

• Once the sets of items have been constructed, the tables can be constructed by using
  – The sets of items
  – The GOTO construction
  – The grammar

• Each set of items corresponds to a state.

• States and Terminals index the ACTION table

• States and Non-Terminals index the GOTO table

Page 8: Lecture #10,  Feb. 14, 2007


Construction of ACTION table

• Let C be the sets of items constructed for a grammar. There is one state i for each set c_i ∈ C.

1. If [A → α . a β, b] ∈ c_i, and GOTO(c_i, a) = c_j
   Then set ACTION[i, a] to shift j
   (note that "a" is a terminal symbol)
2. If [A → α . , a] ∈ c_i, and A is not Start
   Then set ACTION[i, a] to reduce(A → α)
3. If [Start → E . , EOF] ∈ c_i
   Then set ACTION[i, EOF] to accept

Any conflict in these rules means the grammar is not LR(1). Every ambiguous grammar produces conflicts, but so do some unambiguous grammars.

Page 9: Lecture #10,  Feb. 14, 2007


Construction of GOTO table

• If GOTO(c_i, A) = c_j
  Then set GOTO[i, A] to j
  (note that "A" is a non-terminal symbol)
• All other entries are error entries
• The start state of the parser is the state derived from Closure({ [Start → . E, EOF] })
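The ACTION and GOTO rules can be sketched as a single pass over the states. This is a minimal sketch under stated assumptions: the function name `build_tables`, the tuple encoding of table entries, and the `transitions` dictionary (standing in for the GOTO construction on sets of items) are all illustrative choices, and the three states are written out by hand for the tiny grammar Start → E, E → id rather than generated.

```python
# Sketch of the ACTION/GOTO rules. An item is (lhs, rhs, dot, lookahead).
# `transitions` maps (state number, symbol) -> state number and stands in
# for GOTO on sets of items. Missing table entries mean "error".

def build_tables(states, transitions, terminals, start="Start"):
    action, goto_table, conflicts = {}, {}, []

    def set_action(i, a, entry):
        old = action.get((i, a))
        if old is not None and old != entry:
            conflicts.append((i, a, old, entry))   # keep the first entry
        else:
            action[(i, a)] = entry

    for i, items in enumerate(states):
        for lhs, rhs, dot, look in items:
            if dot < len(rhs):
                a = rhs[dot]
                if a in terminals and (i, a) in transitions:
                    set_action(i, a, ("shift", transitions[(i, a)]))  # rule 1
            elif lhs == start:
                set_action(i, look, ("accept",))                       # rule 3
            else:
                set_action(i, look, ("reduce", lhs, rhs))              # rule 2

    for (i, x), j in transitions.items():
        if x not in terminals:                     # non-terminal columns
            goto_table[(i, x)] = j
    return action, goto_table, conflicts


# Hand-built states for the tiny grammar  Start -> E,  E -> id.
states = [
    {("Start", ("E",), 0, "EOF"), ("E", ("id",), 0, "EOF")},   # state 0
    {("Start", ("E",), 1, "EOF")},                              # state 1
    {("E", ("id",), 1, "EOF")},                                 # state 2
]
transitions = {(0, "E"): 1, (0, "id"): 2}
action, goto_table, conflicts = build_tables(states, transitions, {"id", "EOF"})
```

Recording conflicts in a list, instead of silently overwriting a cell, makes it easy to report every shift-reduce or reduce-reduce conflict after the tables are built.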

Page 10: Lecture #10,  Feb. 14, 2007


Parser generators

• Programs that analyze grammars to produce efficient winning strategies
• ml-yacc uses an LALR(1) table-driven parser
  – Look-Ahead 1 symbol
  – Left to right processing of input
  – Right-most derivation (constructed in reverse)
• ml-yacc reads a grammar, produces a table
• ml-yacc attaches semantic actions to reduce moves

Page 11: Lecture #10,  Feb. 14, 2007


ml-yacc makes a virtue out of Ambiguity

• Why not:

expr -> expr + expr
      | expr * expr
      | Number

• Ambiguity ! ! !

[Figure: a parse tree for 17 + 3 * 2 built with this grammar.]
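To see the ambiguity concretely, the following sketch enumerates every parse of 17 + 3 * 2 under the grammar expr -> expr + expr | expr * expr | Number and returns the value each parse tree denotes. The function name `values` and the token-list encoding are assumptions made for this illustration; it is a brute-force enumeration, not how an LR parser works.

```python
# Enumerate every parse of a token list under the ambiguous grammar
#   expr -> expr + expr | expr * expr | Number
# and return the value each parse tree denotes.

def values(tokens):
    if len(tokens) == 1:
        return [tokens[0]]                 # expr -> Number
    results = []
    for i in range(1, len(tokens), 2):     # operators sit at odd positions
        op = tokens[i]                     # choose this operator as the root
        for left in values(tokens[:i]):
            for right in values(tokens[i + 1:]):
                results.append(left + right if op == "+" else left * right)
    return results


print(values([17, "+", 3, "*", 2]))   # prints [23, 40]
```

The two results correspond to the two parse trees: 17 + (3 * 2) = 23 and (17 + 3) * 2 = 40.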

Page 12: Lecture #10,  Feb. 14, 2007


Factoring - A hard solution

• Fix the grammar

E : E + T
  | T
  ;
T : T * F
  | F
  ;
F : ( E )
  | Number
  ;

• Problems
  – Grammar is harder to understand
  – Grammar is bigger

[Figure: the parse tree for 17 + 3 * 2 under the factored grammar; the root is E + T, with 17 under E and 3 * 2 grouped under T.]

Page 13: Lecture #10,  Feb. 14, 2007


Using Ambiguity

• Ambiguity means the parser can't decide between
  – Shifting a terminal, or reducing a handle to a non-terminal (a shift-reduce conflict)
  – Reducing a handle to one of two or more non-terminals (a reduce-reduce conflict)
    » T → rhs and S → rhs are both in the grammar and rhs is the handle.
• This choice means we can't construct a unique parse tree for every string.
• But what if we could direct the parser to always prefer one choice over the other?
• Then
  – The parse tree would always be unique
  – The grammar might even be smaller

Page 14: Lecture #10,  Feb. 14, 2007


Ambiguous Expression Grammar

• Contrast the two grammars.

• Convince yourself they both accept the same set of strings.

• Which one is ambiguous?

• Which one is simpler?

• Which one is smaller?

Grammar 1:

Start → E
E → E + E
  | E * E
  | ( E )
  | id

Grammar 2:

Start → E
E → E + T
  | T
T → T * F
  | F
F → ( E )
  | id

Page 15: Lecture #10,  Feb. 14, 2007


An LR parser for the ambiguous EXP grammar

• The sets-of-items construction has 11 states
• It has 4 shift-reduce conflicts

Productions, numbered as in the "reduce by n" actions that follow:

0. Start → E
1. E → E + E
2. E → E * E
3. E → ( E )
4. E → id

Page 16: Lecture #10,  Feb. 14, 2007


4 shift-reduce conflicts

State 8
{ [E → E . + E, _]
, [E → E . * E, _]
, [E → E * E ., _]
}
Action(8,+) shift 5
Action(8,+) reduce by 2
Action(8,*) shift 4
Action(8,*) reduce by 2

State 9
{ [E → E . + E, _]
, [E → E + E ., _]
, [E → E . * E, _]
}
Action(9,*) shift 4
Action(9,*) reduce by 1
Action(9,+) shift 5
Action(9,+) reduce by 1

Page 17: Lecture #10,  Feb. 14, 2007


State of the stack 1

Stack                  Input
. . . E * E            + 3 EOF

State 8: { [E → E . + E, _], [E → E . * E, _], [E → E * E ., _] }

• Choices
  Action(8,+) shift 5
  Action(8,+) reduce by 2
• Reducing by 2 means (E * E) has higher precedence than (E + E)
• Generally this is what we want.

Page 18: Lecture #10,  Feb. 14, 2007


State of the stack 2

Stack                  Input
. . . E * E            * 3 EOF

State 8: { [E → E . + E, _], [E → E . * E, _], [E → E * E ., _] }

• Choices
  Action(8,*) shift 4
  Action(8,*) reduce by 2
• Reducing by 2 means that (E * E) is left associative
• Shifting * means (E * E) is right associative
• The other 2 shift-reduce conflicts are similar, but concern the precedence and associativity of (E + E)

Page 19: Lecture #10,  Feb. 14, 2007


ml-yacc, a Better Solution

• ml-yacc allows ambiguous grammars to be disambiguated via declarations of precedence and associativity
• For example:
  – %left '+'
  – %left '*'
• Declares that * has higher precedence than + (the later declaration binds tighter) and that both are left associative
• If ambiguity remains, the following rules are used
  – always shift on a shift/reduce conflict
  – do the first reduction listed in the grammar on a reduce/reduce conflict
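How precedence declarations settle a shift/reduce conflict can be sketched as follows. This follows the usual yacc convention (a rule takes the precedence of its last terminal, which is compared against the lookahead token); the names `DECLS`, `PREC`, and `resolve` are made up for this illustration, and the sketch is not ml-yacc's actual code.

```python
# Sketch of yacc-style conflict resolution by precedence declarations.
# DECLS lists declarations from lowest to highest precedence, mirroring
#   %left '+'
#   %left '*'

DECLS = [("left", ["+"]), ("left", ["*"])]
PREC = {tok: (level, assoc)
        for level, (assoc, toks) in enumerate(DECLS)
        for tok in toks}
NONTERMINALS = {"E"}


def resolve(rule_rhs, lookahead):
    """Choose 'shift', 'reduce', or 'error' for a shift/reduce conflict."""
    terminals = [s for s in rule_rhs if s not in NONTERMINALS]
    rule = PREC.get(terminals[-1]) if terminals else None
    look = PREC.get(lookahead)
    if rule is None or look is None:
        return "shift"                      # default: no precedence applies
    if rule[0] != look[0]:
        return "reduce" if rule[0] > look[0] else "shift"
    # Equal precedence: associativity of the token decides.
    return {"left": "reduce", "right": "shift", "nonassoc": "error"}[look[1]]


print(resolve(("E", "*", "E"), "+"))   # prints reduce
print(resolve(("E", "+", "E"), "*"))   # prints shift
```

With these declarations, state 8 reduces on both + and *, and state 9 reduces on + but shifts on *, which gives * higher precedence and makes both operators left associative.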

Page 20: Lecture #10,  Feb. 14, 2007


Partial Ml-yacc file

%left PLUS
%left TIMES
%%
Start: E EOF       ( E )
E : E PLUS E       ( Add(E1,E2) )
  | E TIMES E      ( Mult(E1,E2) )
  | LP E RP        ( E )
  | id             ( Id id )

Much more about ml-yacc next time!

Page 21: Lecture #10,  Feb. 14, 2007


If-then-else example

0. Goal → Stmt
1. Stmt → IF Exp THEN Stmt
2. Stmt → IF Exp THEN Stmt ELSE Stmt
3. Stmt → ID := Exp

State 9
{ [Stmt → IF Exp THEN Stmt . , _]
, [Stmt → IF Exp THEN Stmt . ELSE Stmt, _]
}
Action(9,ELSE) = Shift 10
Action(9,ELSE) = Reduce by 1

Page 22: Lecture #10,  Feb. 14, 2007


State of machine

Stack                           Input
. . . IF Exp THEN Stmt          ELSE x := 3 … EOF

State 9
{ [Stmt → IF Exp THEN Stmt . , _]
, [Stmt → IF Exp THEN Stmt . ELSE Stmt, _]
}

Action(9,ELSE) = Shift 10        ELSE associated with closest IF on stack
Action(9,ELSE) = Reduce by 1     ELSE associated with further-away IF
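The effect of preferring the shift can be imitated with a small hand-written recursive parser that always consumes an ELSE as soon as it sees one. This is only a sketch to show which IF the ELSE attaches to: it is not table-driven, the function name `parse_stmt` and the tree constructors are invented for the example, and conditions and assignments are single tokens to keep it short.

```python
# Hand-written recursive parser for
#   Stmt -> IF Exp THEN Stmt | IF Exp THEN Stmt ELSE Stmt | ID := Exp
# Exp and assignment statements are single tokens here.

def parse_stmt(tokens, pos=0):
    """Return (tree, next position). ELSE is always consumed greedily."""
    if tokens[pos] != "IF":
        return tokens[pos], pos + 1                  # an assignment token
    cond = tokens[pos + 1]
    assert tokens[pos + 2] == "THEN"
    then_branch, pos = parse_stmt(tokens, pos + 3)
    if pos < len(tokens) and tokens[pos] == "ELSE":  # "shift" the ELSE
        else_branch, pos = parse_stmt(tokens, pos + 1)
        return ("IfThenElse", cond, then_branch, else_branch), pos
    return ("IfThen", cond, then_branch), pos


tokens = ["IF", "e1", "THEN", "IF", "e2", "THEN", "s1", "ELSE", "s2"]
tree, _ = parse_stmt(tokens)
print(tree)   # prints ('IfThen', 'e1', ('IfThenElse', 'e2', 's1', 's2'))
```

The ELSE ends up inside the inner IF, which is the "closest IF" reading that the shift action produces.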

Page 23: Lecture #10,  Feb. 14, 2007


ml-yacc file

%nonassoc THEN
%nonassoc ELSE
%%
Goal: Stmt EOF                        ( Stmt )
Stmt: IF EXP THEN Stmt                ( IfThen(EXP,Stmt) )
    | IF EXP THEN Stmt ELSE Stmt      ( IfThenElse(EXP,Stmt1,Stmt2) )
    | ID ASSIGNOP EXP                 ( Assign(ID,EXP) )

Page 24: Lecture #10,  Feb. 14, 2007


Some sample ambiguous grammars

• These examples can be found in the directory

• http://www.cs.pdx.edu/~sheard/course/Cs321/LexYacc/AmbiguousExamples/