Parsing with Context-Free Grammars
CSC 9010 Natural Language Processing

Paula Matuszek and Mary-Angela Papalaskari

This slide set was adapted from:
• Jim Martin (after Dan Jurafsky), University of Colorado
• Rada Mihalcea, University of North Texas, http://www.cs.unt.edu/~rada/CSCE5290/
• Robert Berwick, MIT
• Bonnie Dorr, University of Maryland



Parsing

Parsing is a mapping from strings to structured representations.

• Parsing with CFGs refers to the task of assigning correct trees to input strings.
• "Correct" here means a tree that covers all and only the elements of the input and has an S at the top.
• It doesn't mean that the system can select the correct tree from among all the possible trees.
• As with everything of interest, parsing involves search, which involves making choices.
• We'll start with some basic methods before moving on to more complex ones.

Programming languages

max = min = grade;
// Read and process the rest of the grades
while (grade >= 0) {
    count++;
    sum += grade;
    if (grade > max)
        max = grade;
    else if (grade < min)
        min = grade;
    System.out.print("Enter the next grade (-1 to quit): ");
    grade = Keyboard.readInt();
}

• Easy to parse
• Designed that way

Natural Languages

max = min = grade; Read and process the rest of the grades while (grade >= 0) { count++; sum += grade; if (grade > max) max = grade; else if (grade < min) min = grade; System.out.print("Enter the next grade (-1 to quit): "); grade = Keyboard.readInt(); }

(The same program, run together without its formatting.)

• No ( ), { }, or [ ] to indicate scope and precedence
• Lots of overloading (arity varies)
• The grammar isn't known in advance
• A context-free grammar is not the best formalism

Some assumptions

• You have all the words already, in some buffer
• The input isn't POS-tagged
• We won't worry about morphological analysis
• All the words are known

Top-Down Parsing

• Since we're trying to find trees rooted with an S (Sentence), start with the rules that give us an S.
• Then work your way down from there to the words.
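To make the top-down idea concrete, here is a minimal recursive-descent recognizer sketch in Python; the toy grammar, lexicon, and all names are illustrative assumptions of these notes, not slide material:

    # A minimal top-down (recursive-descent) recognizer.
    GRAMMAR = {                       # illustrative toy grammar
        "S":  [["NP", "VP"]],
        "NP": [["Det", "N"]],
        "VP": [["V", "NP"]],
    }
    LEXICON = {"Det": {"the", "a"}, "N": {"flight", "pilot"}, "V": {"book", "booked"}}

    def spans(symbol, words, i):
        """Yield every j such that `symbol` can derive words[i:j]."""
        if symbol in LEXICON:                      # pre-terminal: match one word
            if i < len(words) and words[i] in LEXICON[symbol]:
                yield i + 1
            return
        for rhs in GRAMMAR[symbol]:                # try each expansion, top-down
            ends = [i]
            for child in rhs:                      # cover the RHS left to right
                ends = [j for e in ends for j in spans(child, words, e)]
            yield from ends

    def recognize(words):
        return len(words) in spans("S", words, 0)  # S must cover the whole input

    print(recognize("the pilot booked a flight".split()))   # True

Note that a left-recursive rule such as NP -> NP PP would make spans call itself at the same input position forever – exactly the problem discussed below.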

Top Down Space

[Figure not reproduced.]

Bottom-Up Parsing

• Of course, we also want trees that cover the input words. So start with trees that link up with the words in the right way.
• Then work your way up from there.

Bottom-Up Space

[Figure not reproduced.]

Top-Down vs. Bottom-Up

• Top-down
  – Only searches for trees that can be answers
  – But suggests trees that are not consistent with the words
  – Guarantees that the tree starts with S as root
  – Does not guarantee that the tree will match the input words
• Bottom-up
  – Only forms trees consistent with the words
  – But suggests trees that make no sense globally
  – Guarantees that the tree matches the input words
  – Does not guarantee that the parse will lead to an S at the root
• Combine the advantages of the two by doing a search constrained from both sides (top and bottom)

Top-Down Depth-First Left-to-Right Search

[Figure not reproduced.]

Example (cont'd)

[Figures (three slides) stepping the top-down, depth-first, left-to-right search toward the word "flight"; trees not reproduced.]

Bottom-Up Filtering

[Figure not reproduced.]

Possible Problem: Left-Recursion

What happens in the following situation?

S -> NP VP
S -> Aux NP VP
NP -> NP PP
NP -> Det Nominal
…

with a sentence starting with:

Did the flight…

(A depth-first, top-down parser tries S -> NP VP first, then expands NP -> NP PP into NP PP PP… forever, never reaching the words.)

Solution: Rule Ordering

S -> Aux NP VP
S -> NP VP
NP -> Det Nominal
NP -> NP PP

The key for the NP rules is that you want the recursive option after any base case.

Avoiding Repeated Work

Parsing is hard and slow. It's wasteful to redo stuff over and over and over.

Consider an attempt to top-down parse the following as an NP:

A flight from Indianapolis to Houston on TWA

[Figures (four slides) showing the same "flight" NP being re-parsed repeatedly as the top-down parser backtracks; trees not reproduced.]

Dynamic Programming

We need a method that fills a table with partial results and that:
– Does not do (avoidable) repeated work
– Does not fall prey to left-recursion
– Solves an exponential problem in (approximately) polynomial time

Earley Parsing

• Fills a table in a single sweep over the input words
• The table is of length N+1, where N is the number of words
• Table entries represent:
  – Completed constituents and their locations
  – In-progress constituents
  – Predicted constituents

States

The table entries are called states, and are represented with dotted rules:

S -> · VP            (a VP is predicted)
NP -> Det · Nominal  (an NP is in progress)
VP -> V NP ·         (a VP has been found)

States/Locations

It would be nice to know where these things are in the input, so:

S -> · VP [0,0]            (a VP is predicted at the start of the sentence)
NP -> Det · Nominal [1,2]  (an NP is in progress; the Det goes from 1 to 2)
VP -> V NP · [0,3]         (a VP has been found, starting at 0 and ending at 3)
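As a concrete data structure, such a state is just a dotted rule plus a span. A minimal Python sketch (the class and field names are my own, not from the slides):

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class State:
        lhs: str      # left-hand side, e.g. "VP"
        rhs: tuple    # right-hand side symbols, e.g. ("V", "NP")
        dot: int      # how many RHS symbols have been recognized so far
        start: int    # input position where this constituent begins
        end: int      # input position the dotted (recognized) part reaches

        def complete(self):
            return self.dot == len(self.rhs)

        def next_symbol(self):
            return None if self.complete() else self.rhs[self.dot]

    # VP -> V . NP [0,1]: a V found from 0 to 1, an NP expected next
    s = State("VP", ("V", "NP"), dot=1, start=0, end=1)
    print(s.next_symbol(), s.complete())    # NP False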

Graphically

[Figure: the three example states drawn over the input; not reproduced.]

Earley

• As with most dynamic-programming approaches, the answer is found by looking in the table in the right place.
• In this case, there should be an S state in the final column that spans from 0 to N and is complete.
• If that's the case, you're done: S -> α · [0,N]
• So sweep through the table from 0 to N:
  – New predicted states are created by states in the current chart
  – New incomplete states are created by advancing existing states as new constituents are discovered
  – New complete states are created in the same way

Earley

More specifically:
1. Predict all the states you can up front.
2. Read a word:
   – Extend states based on matches
   – Add new predictions
   – Go to 2
3. Look at the final chart entry to see if you have a winner.

Earley and Left Recursion

• So Earley solves the left-recursion problem without having to alter the grammar or artificially limit the search:
  – Never place a state into the chart that's already there
  – Copy states before advancing them

S -> NP VP
NP -> NP PP

• The first rule predicts S -> · NP VP [0,0], which adds NP -> · NP PP [0,0] and stops there, since adding any subsequent prediction would be redundant.
• When a state gets advanced, make a copy and leave the original alone:
  – Say we have NP -> · NP PP [0,0]
  – We find an NP from 0 to 2, so we create NP -> NP · PP [0,2]
  – But we leave the original state as is

Predictor

Given a state
– with a non-terminal to the right of the dot
– that is not a part-of-speech category:
create a new state for each expansion of the non-terminal, and place these new states into the same chart entry as the generating state, beginning and ending where the generating state ends.

So the predictor looking at

S -> · VP [0,0]

results in

VP -> · Verb [0,0]
VP -> · Verb NP [0,0]
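In code, on the State sketch from above (chart is a list of per-position state lists and grammar maps a non-terminal to its expansions; both names are assumptions of these notes):

    def predictor(state, chart, grammar):
        """Add a zero-width prediction for each expansion of the symbol
        after the dot, anchored where the generating state ends."""
        nt = state.next_symbol()
        for rhs in grammar[nt]:
            new = State(nt, tuple(rhs), dot=0, start=state.end, end=state.end)
            if new not in chart[state.end]:   # never add a duplicate state
                chart[state.end].append(new)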

Scanner

Given a state
– with a non-terminal to the right of the dot
– that is a part-of-speech category:
if the next word in the input matches this part of speech,
– create a new state with the dot moved over the non-terminal
– and insert it in the next chart entry.

So the scanner looking at

VP -> · Verb NP [0,0]

checks whether the next word, "book", can be a verb, and if so adds the new state

VP -> Verb · NP [0,1]

to the chart entry following the current one.

Note: the Earley algorithm uses its top-down predictions to disambiguate POS – only a POS predicted by some state can get added to the chart.
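A matching sketch of the scanner, following this slide's convention of moving the dot directly over the POS category (lexicon, mapping each POS to its word set, is an assumption of these notes):

    def scanner(state, chart, words, lexicon):
        """If the next input word can be the predicted POS, move the dot
        over it and place the new state in the next chart entry."""
        pos = state.next_symbol()
        i = state.end
        if i < len(words) and words[i] in lexicon[pos]:
            new = State(state.lhs, state.rhs, state.dot + 1, state.start, i + 1)
            if new not in chart[i + 1]:
                chart[i + 1].append(new)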

Completer

Applied to a state when its dot has reached the right end of the rule: the parser has discovered a category over some span of the input. Find and advance all previous states that were looking for this category:
• copy the state
• move the dot
• insert it in the current chart entry

Given

NP -> Det Nominal · [1,3]
VP -> Verb · NP [0,1]

add

VP -> Verb NP · [0,3]
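And the completer, in the same sketch:

    def completer(state, chart):
        """`state` is complete: a `state.lhs` spans [state.start, state.end].
        Copy and advance every earlier state that was waiting for it."""
        for old in list(chart[state.start]):
            if old.next_symbol() == state.lhs:
                new = State(old.lhs, old.rhs, old.dot + 1, old.start, state.end)
                if new not in chart[state.end]:
                    chart[state.end].append(new)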

Earley: how do we know we are done?

Find an S state in the final column that spans from 0 to N and is complete:

S -> α · [0,N]

Earley

So, sweep through the table from 0 to N:
• New predicted states are created by starting top-down from S
• New incomplete states are created by advancing existing states as new constituents are discovered
• New complete states are created in the same way

Earley

More specifically:
1. Predict all the states you can up front.
2. Read a word:
   – Extend states based on matches
   – Add new predictions
   – Go to 2
3. Look at the final chart entry to see if you have a winner.
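The three operators tie together in a single left-to-right sweep. A sketch of the driver loop, under the same assumptions as the pieces above (the dummy GAMMA start symbol mirrors the γ state in the worked example below):

    def earley_recognize(words, grammar, lexicon, start="S"):
        chart = [[] for _ in range(len(words) + 1)]
        chart[0].append(State("GAMMA", (start,), dot=0, start=0, end=0))
        for i in range(len(words) + 1):
            for state in chart[i]:            # chart[i] may grow as we iterate
                sym = state.next_symbol()
                if sym is None:
                    completer(state, chart)
                elif sym in lexicon:
                    scanner(state, chart, words, lexicon)
                else:
                    predictor(state, chart, grammar)
        # done iff a complete dummy state spans the whole input
        return any(s.lhs == "GAMMA" and s.complete() for s in chart[len(words)])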

Example

Book that flight.
We should find… an S from 0 to 3 that is a completed state.

Example (cont'd)

[Figures (three slides) stepping through the chart for "Book that flight"; tables not reproduced.]

A simple example

Grammar: S -> NP VP, NP -> N, VP -> V NP
Lexicon: N -> I | saw | Mary; V -> saw
Input: I saw Mary

Chart[0]
γ -> · S [0,0]        (dummy start state)
S -> · NP VP [0,0]    (predictor)
NP -> · N [0,0]       (predictor)

Chart[1]
N -> I · [0,1]        (scanner)
NP -> N · [0,1]       (completer)
S -> NP · VP [0,1]    (completer)
VP -> · V NP [1,1]    (predictor)

Chart[2]
V -> saw · [1,2]      (scanner)
VP -> V · NP [1,2]    (completer)
NP -> · N [2,2]       (predictor)

Chart[3]
N -> Mary · [2,3]     (scanner)
NP -> N · [2,3]       (completer)
VP -> V NP · [1,3]    (completer)
S -> NP VP · [0,3]    (completer)

Sentence accepted!
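Running the driver sketch from above on this example (the same toy grammar and lexicon as the slide) reproduces the accept result:

    GRAMMAR = {"S": [["NP", "VP"]], "NP": [["N"]], "VP": [["V", "NP"]]}
    LEXICON = {"N": {"I", "saw", "Mary"}, "V": {"saw"}}
    print(earley_recognize("I saw Mary".split(), GRAMMAR, LEXICON))   # True

(The sketch follows the dot-over-POS scanner convention from the Scanner slide, so its chart contents differ slightly from the table above, but the accept/reject answer is the same.)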

What is it?

What kind of parser did we just describe? (trick question)
An Earley parser… yes. But not a parser – a recognizer!
• The presence of an S state with the right attributes in the right place indicates a successful recognition.
• But no parse tree… no parser.
• That's how we "solve" (not) an exponential problem in polynomial time.

Converting Earley from Recognizer to Parser

With the addition of a few pointers, we have a parser.
Augment the "Completer" to point to where we came from.
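A minimal sketch of that augmentation, replacing the earlier State and completer: give each state a tuple of back-pointers, excluded from equality so duplicate detection still works on the rule and span alone, and have the completer record the child it consumed (all names are illustrative):

    from dataclasses import dataclass, field

    @dataclass(frozen=True)
    class State:
        lhs: str
        rhs: tuple
        dot: int
        start: int
        end: int
        # Back-pointers to the completed child states, one per advanced
        # non-terminal; compare=False keeps equality based on rule and span.
        pointers: tuple = field(default=(), compare=False)

        def complete(self):
            return self.dot == len(self.rhs)

        def next_symbol(self):
            return None if self.complete() else self.rhs[self.dot]

    def completer(state, chart):
        """As before, but each advanced copy remembers where it came from."""
        for old in list(chart[state.start]):
            if old.next_symbol() == state.lhs:
                new = State(old.lhs, old.rhs, old.dot + 1, old.start, state.end,
                            old.pointers + (state,))
                if new not in chart[state.end]:
                    chart[state.end].append(new)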

Augmenting the chart with structural information

[Figure: chart states S8–S13 with back-pointers added; diagram not reproduced.]

Retrieving Parse Trees from Chart

• All the possible parses for an input are in the table; we just need to read off the back-pointers from every complete S in the last column of the table:
  – Find all the S -> α · [0,N]
  – Follow the structural traces from the Completer
• Of course, this won't be polynomial time, since there could be an exponential number of trees.
• Still, we can at least represent the ambiguity efficiently.
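A hedged sketch of that readout, assuming the back-pointer version of State above and the LEXICON mapping from earlier; with the dot-over-POS scanner, scanned POS symbols have no pointer of their own and so surface as bare leaves:

    def tree(state):
        """Rebuild a tree from a complete state's back-pointers."""
        children = iter(state.pointers)
        return (state.lhs,
                tuple(sym if sym in LEXICON else tree(next(children))
                      for sym in state.rhs))

    # parses = [tree(s) for s in chart[len(words)]
    #           if s.lhs == "S" and s.complete() and s.start == 0]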

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limit the search:
• Never place a state into the chart that's already there
• Copy states before advancing them

Earley and Left Recursion 1

S -> NP VP
NP -> NP PP

The predictor, given the first rule

S -> · NP VP [0,0]

predicts

NP -> · NP PP [0,0]

and stops there, since predicting the same state again would be redundant.

Earley and Left Recursion 2

When a state gets advanced, make a copy and leave the original alone…

Say we have NP -> · NP PP [0,0].
We find an NP from 0 to 2, so we create NP -> NP · PP [0,2].
But we leave the original state as is.

Dynamic Programming Approaches

• Earley
  – Top-down, no filtering, no restriction on grammar form
• CYK
  – Bottom-up, no filtering, grammars restricted to Chomsky Normal Form (CNF)
• The details matter less than the dimensions along which such approaches vary:
  – Bottom-up vs. top-down
  – With or without filters
  – With restrictions on grammar form or not


Slide 1

StatesLocations

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12] An NP is in progress the

Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending

at 3

Slide 1

Graphically

Slide 1

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states

as new constituents are discoveredndash New complete states are created in the same way

Slide 1

Earley

bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word

ndash Extend states based on matchesndash Add new predictionsndash Go to 2

ndash Look at N+1 to see if you have a winner

Slide 1

Earley and Left Recursion

bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search

ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them

S -gt NP VPNP -gt NP PP

bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless

bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is

Slide 1

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state

beginning and ending where generating state ends

So predictor looking at

S -gt VP [00]

results in

VP -gt Verb [00]VP -gt Verb NP [00]

Slide 1

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry

So scanner looking at

VP -gt Verb NP [00]

If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01]

Add this state to chart entry following current one

Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart

Slide 1

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this

category

bull copy state bull move dotbull insert in current chart entry

GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 7: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Top Down Space

Slide 1

Bottom-Up Parsing

bull Of course we also want trees that cover the input words So start with trees that link up with the words in the right way

bull Then work your way up from there

Slide 1

Bottom-Up Space

Slide 1

Top-Down VS Bottom-Up

bull Top-downndash Only searches for trees that can be answersndash But suggests trees that are not consistent with the wordsndash Guarantees that tree starts with S as rootndash Does not guarantee that tree will match input words

bull Bottom-upndash Only forms trees consistent with the wordsndash Suggest trees that make no sense globallyndash Guarantees that tree matches input wordsndash Does not guarantee that parse tree will lead to S as a root

bull Combine the advantages of the two by doing a search constrained from both sides (top and bottom)

Slide 1

Top-Down Depth-First Left-to-Right Search

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

flight flight

Slide 1

Example (contrsquod)

flightflight

Slide 1

Bottom-Up Filtering

Slide 1

Possible Problem Left-Recursion

What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with

Did the flighthellip

Slide 1

Solution Rule Ordering

S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP

The key for the NP is that you want the recursive option after any base case

Slide 1

Avoiding Repeated Work

Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

Consider an attempt to top-down parse the following as an NP

A flight from Indianapolis to Houston on TWA

Slide 1

flight

Slide 1

flight

flight

Slide 1

Slide 1

Slide 1

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial

time

Slide 1

Earley Parsing

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

Slide 1

States

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found

Slide 1

StatesLocations

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12] An NP is in progress the

Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending

at 3

Slide 1

Graphically

Slide 1

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states

as new constituents are discoveredndash New complete states are created in the same way

Slide 1

Earley

bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word

ndash Extend states based on matchesndash Add new predictionsndash Go to 2

ndash Look at N+1 to see if you have a winner

Slide 1

Earley and Left Recursion

bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search

ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them

S -gt NP VPNP -gt NP PP

bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless

bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is

Slide 1

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state

beginning and ending where generating state ends

So predictor looking at

S -gt VP [00]

results in

VP -gt Verb [00]VP -gt Verb NP [00]

Slide 1

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry

So scanner looking at

VP -gt Verb NP [00]

If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01]

Add this state to chart entry following current one

Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart

Slide 1

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this

category

bull copy state bull move dotbull insert in current chart entry

GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 8: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Bottom-Up Parsing

bull Of course we also want trees that cover the input words So start with trees that link up with the words in the right way

bull Then work your way up from there

Slide 1

Bottom-Up Space

Slide 1

Top-Down VS Bottom-Up

bull Top-downndash Only searches for trees that can be answersndash But suggests trees that are not consistent with the wordsndash Guarantees that tree starts with S as rootndash Does not guarantee that tree will match input words

bull Bottom-upndash Only forms trees consistent with the wordsndash Suggest trees that make no sense globallyndash Guarantees that tree matches input wordsndash Does not guarantee that parse tree will lead to S as a root

bull Combine the advantages of the two by doing a search constrained from both sides (top and bottom)

Slide 1

Top-Down Depth-First Left-to-Right Search

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

flight flight

Slide 1

Example (contrsquod)

flightflight

Slide 1

Bottom-Up Filtering

Slide 1

Possible Problem Left-Recursion

What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with

Did the flighthellip

Slide 1

Solution Rule Ordering

S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP

The key for the NP is that you want the recursive option after any base case

Slide 1

Avoiding Repeated Work

Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

Consider an attempt to top-down parse the following as an NP

A flight from Indianapolis to Houston on TWA

Slide 1

flight

Slide 1

flight

flight

Slide 1

Slide 1

Slide 1

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial

time

Slide 1

Earley Parsing

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

Slide 1

States

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found

Slide 1

StatesLocations

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12] An NP is in progress the

Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending

at 3

Slide 1

Graphically

Slide 1

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states

as new constituents are discoveredndash New complete states are created in the same way

Slide 1

Earley

bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word

ndash Extend states based on matchesndash Add new predictionsndash Go to 2

ndash Look at N+1 to see if you have a winner

Slide 1

Earley and Left Recursion

bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search

ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them

S -gt NP VPNP -gt NP PP

bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless

bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is

Slide 1

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state

beginning and ending where generating state ends

So predictor looking at

S -gt VP [00]

results in

VP -gt Verb [00]VP -gt Verb NP [00]

Slide 1

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry

So scanner looking at

VP -gt Verb NP [00]

If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01]

Add this state to chart entry following current one

Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart

Slide 1

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this

category

bull copy state bull move dotbull insert in current chart entry

GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 9: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Bottom-Up Space

Slide 1

Top-Down VS Bottom-Up

bull Top-downndash Only searches for trees that can be answersndash But suggests trees that are not consistent with the wordsndash Guarantees that tree starts with S as rootndash Does not guarantee that tree will match input words

bull Bottom-upndash Only forms trees consistent with the wordsndash Suggest trees that make no sense globallyndash Guarantees that tree matches input wordsndash Does not guarantee that parse tree will lead to S as a root

bull Combine the advantages of the two by doing a search constrained from both sides (top and bottom)

Slide 1

Top-Down Depth-First Left-to-Right Search

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

flight flight

Slide 1

Example (contrsquod)

flightflight

Slide 1

Bottom-Up Filtering

Slide 1

Possible Problem Left-Recursion

What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with

Did the flighthellip

Slide 1

Solution Rule Ordering

S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP

The key for the NP is that you want the recursive option after any base case

Slide 1

Avoiding Repeated Work

Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

Consider an attempt to top-down parse the following as an NP

A flight from Indianapolis to Houston on TWA

Slide 1

flight

Slide 1

flight

flight

Slide 1

Slide 1

Slide 1

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial

time

Slide 1

Earley Parsing

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

Slide 1

States

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found

Slide 1

StatesLocations

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12] An NP is in progress the

Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending

at 3

Slide 1

Graphically

Slide 1

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states

as new constituents are discoveredndash New complete states are created in the same way

Slide 1

Earley

bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word

ndash Extend states based on matchesndash Add new predictionsndash Go to 2

ndash Look at N+1 to see if you have a winner

Slide 1

Earley and Left Recursion

bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search

ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them

S -gt NP VPNP -gt NP PP

bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless

bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is

Slide 1

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state

beginning and ending where generating state ends

So predictor looking at

S -gt VP [00]

results in

VP -gt Verb [00]VP -gt Verb NP [00]

Slide 1

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry

So scanner looking at

VP -gt Verb NP [00]

If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01]

Add this state to chart entry following current one

Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart

Slide 1

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this

category

bull copy state bull move dotbull insert in current chart entry

GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 10: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Top-Down VS Bottom-Up

bull Top-downndash Only searches for trees that can be answersndash But suggests trees that are not consistent with the wordsndash Guarantees that tree starts with S as rootndash Does not guarantee that tree will match input words

bull Bottom-upndash Only forms trees consistent with the wordsndash Suggest trees that make no sense globallyndash Guarantees that tree matches input wordsndash Does not guarantee that parse tree will lead to S as a root

bull Combine the advantages of the two by doing a search constrained from both sides (top and bottom)

Slide 1

Top-Down Depth-First Left-to-Right Search

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

flight flight

Slide 1

Example (contrsquod)

flightflight

Slide 1

Bottom-Up Filtering

Slide 1

Possible Problem Left-Recursion

What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with

Did the flighthellip

Slide 1

Solution Rule Ordering

S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP

The key for the NP is that you want the recursive option after any base case

Slide 1

Avoiding Repeated Work

Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

Consider an attempt to top-down parse the following as an NP

A flight from Indianapolis to Houston on TWA

Slide 1

flight

Slide 1

flight

flight

Slide 1

Slide 1

Slide 1

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial

time

Slide 1

Earley Parsing

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

Slide 1

States

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found

Slide 1

StatesLocations

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12] An NP is in progress the

Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending

at 3

Slide 1

Graphically

Slide 1

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states

as new constituents are discoveredndash New complete states are created in the same way

Slide 1

Earley

bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word

ndash Extend states based on matchesndash Add new predictionsndash Go to 2

ndash Look at N+1 to see if you have a winner

Slide 1

Earley and Left Recursion

bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search

ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them

S -gt NP VPNP -gt NP PP

bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless

bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is

Slide 1

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state

beginning and ending where generating state ends

So predictor looking at

S -gt VP [00]

results in

VP -gt Verb [00]VP -gt Verb NP [00]

Slide 1

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry

So scanner looking at

VP -gt Verb NP [00]

If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01]

Add this state to chart entry following current one

Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart

Slide 1

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this

category

bull copy state bull move dotbull insert in current chart entry

GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 11: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Top-Down Depth-First Left-to-Right Search

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

flight flight

Slide 1

Example (contrsquod)

flightflight

Slide 1

Bottom-Up Filtering

Slide 1

Possible Problem Left-Recursion

What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with

Did the flighthellip

Slide 1

Solution Rule Ordering

S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP

The key for the NP is that you want the recursive option after any base case

Slide 1

Avoiding Repeated Work

Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

Consider an attempt to top-down parse the following as an NP

A flight from Indianapolis to Houston on TWA

Slide 1

flight

Slide 1

flight

flight

Slide 1

Slide 1

Slide 1

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial

time

Slide 1

Earley Parsing

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

Slide 1

States

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found

Slide 1

StatesLocations

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12] An NP is in progress the

Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending

at 3

Slide 1

Graphically

Slide 1

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states

as new constituents are discoveredndash New complete states are created in the same way

Slide 1

Earley

bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word

ndash Extend states based on matchesndash Add new predictionsndash Go to 2

ndash Look at N+1 to see if you have a winner

Slide 1

Earley and Left Recursion

bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search

ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them

S -gt NP VPNP -gt NP PP

bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless

bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is

Slide 1

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state

beginning and ending where generating state ends

So predictor looking at

S -gt VP [00]

results in

VP -gt Verb [00]VP -gt Verb NP [00]

Slide 1

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry

So scanner looking at

VP -gt Verb NP [00]

If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01]

Add this state to chart entry following current one

Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart

Slide 1

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this

category

bull copy state bull move dotbull insert in current chart entry

GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specifically:
1. Predict all the states you can up front
2. Read a word
3. Extend states based on matches
4. Add new predictions
5. Go to step 2
6. At the end, look at the last column (chart[N]) to see if you have a winner

Example

"Book that flight"
We should find… an S from 0 to 3 that is a completed state…

Example (cont'd)

[Three chart snapshots tracing "Book that flight" through the algorithm; the figures did not survive this transcript.]

A simple example

Grammar: S → NP VP; NP → N; VP → V NP
Lexicon: N → I | saw | Mary; V → saw
Input: I saw Mary

Chart[0]
γ → · S [0,0]       (dummy start state)
S → · NP VP [0,0]   (predictor)
NP → · N [0,0]      (predictor)

Chart[1]
N → I · [0,1]       (scanner)
NP → N · [0,1]      (completer)
S → NP · VP [0,1]   (completer)
VP → · V NP [1,1]   (predictor)

Chart[2]
V → saw · [1,2]     (scanner)
VP → V · NP [1,2]   (completer)
NP → · N [2,2]      (predictor)

Chart[3]
N → Mary · [2,3]    (scanner)
NP → N · [2,3]      (completer)
VP → V NP · [1,3]   (completer)
S → NP VP · [0,3]   (completer)

Sentence accepted.
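Putting the pieces together: a driver loop that, with the State, enqueue, predictor, scanner, and completer sketches above in one file, should reproduce this trace. The dummy start symbol "GAMMA" and the dict-based grammar and lexicon are my own conventions.

def earley_recognize(words, grammar, lexicon, start="S"):
    chart = [[] for _ in range(len(words) + 1)]
    enqueue(State("GAMMA", (start,), 0, 0, 0), chart, 0)  # dummy start state
    for i in range(len(words) + 1):
        for state in chart[i]:          # the column may grow as we iterate
            if state.is_complete():
                completer(state, chart)
            elif state.next_symbol() in grammar:
                predictor(state, chart, grammar)
            else:
                scanner(state, chart, words, lexicon)
    # success = a complete S state spanning [0,N] in the last column
    return any(s.lhs == start and s.start == 0 and s.is_complete()
               for s in chart[len(words)])

grammar = {"S": [["NP", "VP"]], "NP": [["N"]], "VP": [["V", "NP"]]}
lexicon = {"I": {"N"}, "saw": {"N", "V"}, "Mary": {"N"}}
print(earley_recognize("I saw Mary".split(), grammar, lexicon))  # True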

What is it?

What kind of parser did we just describe? (trick question)
An Earley parser… yes, but not a parser – a recognizer.
The presence of an S state with the right attributes in the right place indicates a successful recognition.
But no parse tree… so, no parser.
That's how we solve (not) an exponential problem in polynomial time.

Converting Earley from Recognizer to Parser

With the addition of a few pointers we have a parser: augment the "Completer" to point to where we came from.
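One way to realize those pointers in the running sketch: a side table that records, for each advanced state, the waiting state it came from and the completed constituent that licensed the move. The bookkeeping below is my own variant, not the slides' exact scheme.

from collections import defaultdict

backpointers = defaultdict(list)   # advanced state -> [(waiting, completed)]

def completer_with_pointers(state, chart):
    for waiting in list(chart[state.start]):
        if waiting.next_symbol() == state.lhs:
            advanced = State(waiting.lhs, waiting.rhs, waiting.dot + 1,
                             waiting.start, state.end)
            # remember how we got here, then file the state as usual
            backpointers[advanced].append((waiting, state))
            enqueue(advanced, chart, state.end)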

Augmenting the chart with structural information

[Figure: chart states (S8–S13) with backpointers drawn from each advanced state to the completed states that built it; the image did not survive this transcript.]

Retrieving Parse Trees from Chart

All the possible parses for an input are in the table; we just need to read off all the backpointers from every complete S in the last column:
– find all the S -> α · [0,N]
– follow the structural traces from the Completer
Of course, this won't be polynomial time, since there could be an exponential number of trees.
Still, we can at least represent ambiguity efficiently.
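A sketch of that walk, assuming the driver calls completer_with_pointers in place of completer. It returns one parse as nested (category, children) tuples; with ambiguity, each backpointer list can hold several derivations, and this simply takes the first.

def build_tree(state):
    # scanned POS states have no backpointers; they bottom out at the word
    if not backpointers[state]:
        return (state.lhs, list(state.rhs))
    return (state.lhs, collect_children(state))

def collect_children(state):
    # unwind the dotted rule back to its start, gathering one subtree
    # per symbol the dot has already passed over
    if state.dot == 0:
        return []
    waiting, completed = backpointers[state][0]
    return collect_children(waiting) + [build_tree(completed)]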

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search:
– never place a state into the chart that's already there
– copy states before advancing them

Earley and Left Recursion 1

S -> NP VP
NP -> NP PP

The predictor, given the first rule, predicts

S -> · NP VP [0,0]

which in turn adds

NP -> · NP PP [0,0]

and stops there, since predicting the same state again would be redundant.

Earley and Left Recursion 2

When a state gets advanced, make a copy and leave the original alone…

Say we have NP -> · NP PP [0,0]. We find an NP from 0 to 2, so we create

NP -> NP · PP [0,2]

but we leave the original state as is.
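This is directly checkable with the recognizer sketched earlier: the duplicate-state test in enqueue refuses the repeated prediction, so the left-recursive NP rule terminates. The toy grammar and lexicon here are my own additions for illustration.

grammar = {"S": [["NP", "VP"]],
           "NP": [["NP", "PP"], ["N"]],
           "PP": [["P", "NP"]],
           "VP": [["V", "NP"], ["V"]]}
lexicon = {"dogs": {"N"}, "parks": {"N"}, "in": {"P"}, "bark": {"V"}}
print(earley_recognize("dogs in parks bark".split(), grammar, lexicon))  # True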

Dynamic Programming Approaches

Earley:
– top-down, no filtering, no restriction on grammar form
CYK:
– bottom-up, no filtering, grammars restricted to Chomsky Normal Form (CNF)
The details are not important:
– bottom-up vs. top-down
– with or without filters
– with restrictions on grammar form or not
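For contrast with Earley, a minimal CYK recognizer sketch. It assumes the grammar is already in CNF, split into my own two-dict representation: unary maps a word to its categories, binary maps a pair of categories to the categories that can rewrite as that pair.

from itertools import product

def cyk_recognize(words, unary, binary, start="S"):
    n = len(words)
    # table[i][j] holds every category that spans words i..j (j exclusive)
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i + 1] = set(unary.get(word, ()))
    for span in range(2, n + 1):            # bottom-up, widest span last
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):       # try every split point
                for b, c in product(table[i][k], table[k][j]):
                    table[i][j] |= binary.get((b, c), set())
    return start in table[0][n]

unary = {"I": {"NP"}, "saw": {"V"}, "Mary": {"NP"}}
binary = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}
print(cyk_recognize("I saw Mary".split(), unary, binary))  # True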

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 12: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

flight flight

Slide 1

Example (contrsquod)

flightflight

Slide 1

Bottom-Up Filtering

Slide 1

Possible Problem Left-Recursion

What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with

Did the flighthellip

Slide 1

Solution Rule Ordering

S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP

The key for the NP is that you want the recursive option after any base case

Slide 1

Avoiding Repeated Work

Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

Consider an attempt to top-down parse the following as an NP

A flight from Indianapolis to Houston on TWA

Slide 1

flight

Slide 1

flight

flight

Slide 1

Slide 1

Slide 1

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial

time

Slide 1

Earley Parsing

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

Slide 1

States

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found

Slide 1

StatesLocations

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12] An NP is in progress the

Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending

at 3

Slide 1

Graphically

Slide 1

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states

as new constituents are discoveredndash New complete states are created in the same way

Slide 1

Earley

bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word

ndash Extend states based on matchesndash Add new predictionsndash Go to 2

ndash Look at N+1 to see if you have a winner

Slide 1

Earley and Left Recursion

bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search

ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them

S -gt NP VPNP -gt NP PP

bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless

bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is

Slide 1

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state

beginning and ending where generating state ends

So predictor looking at

S -gt VP [00]

results in

VP -gt Verb [00]VP -gt Verb NP [00]

Slide 1

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry

So scanner looking at

VP -gt Verb NP [00]

If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01]

Add this state to chart entry following current one

Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart

Slide 1

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this

category

bull copy state bull move dotbull insert in current chart entry

GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 13: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Example (contrsquod)

flight flight

Slide 1

Example (contrsquod)

flightflight

Slide 1

Bottom-Up Filtering

Slide 1

Possible Problem Left-Recursion

What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with

Did the flighthellip

Slide 1

Solution Rule Ordering

S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP

The key for the NP is that you want the recursive option after any base case

Slide 1

Avoiding Repeated Work

Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

Consider an attempt to top-down parse the following as an NP

A flight from Indianapolis to Houston on TWA

Slide 1

flight

Slide 1

flight

flight

Slide 1

Slide 1

Slide 1

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial

time

Slide 1

Earley Parsing

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

Slide 1

States

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found

Slide 1

StatesLocations

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12] An NP is in progress the

Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending

at 3

Slide 1

Graphically

Slide 1

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states

as new constituents are discoveredndash New complete states are created in the same way

Slide 1

Earley

bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word

ndash Extend states based on matchesndash Add new predictionsndash Go to 2

ndash Look at N+1 to see if you have a winner

Slide 1

Earley and Left Recursion

bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search

ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them

S -gt NP VPNP -gt NP PP

bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless

bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is

Slide 1

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state

beginning and ending where generating state ends

So predictor looking at

S -gt VP [00]

results in

VP -gt Verb [00]VP -gt Verb NP [00]

Slide 1

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry

So scanner looking at

VP -gt Verb NP [00]

If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01]

Add this state to chart entry following current one

Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart

Slide 1

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this

category

bull copy state bull move dotbull insert in current chart entry

GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 14: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Example (contrsquod)

flightflight

Slide 1

Bottom-Up Filtering

Slide 1

Possible Problem Left-Recursion

What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with

Did the flighthellip

Slide 1

Solution Rule Ordering

S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP

The key for the NP is that you want the recursive option after any base case

Slide 1

Avoiding Repeated Work

Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

Consider an attempt to top-down parse the following as an NP

A flight from Indianapolis to Houston on TWA

Slide 1

flight

Slide 1

flight

flight

Slide 1

Slide 1

Slide 1

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial

time

Slide 1

Earley Parsing

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

Slide 1

States

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found

Slide 1

StatesLocations

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12] An NP is in progress the

Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending

at 3

Slide 1

Graphically

Slide 1

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states

as new constituents are discoveredndash New complete states are created in the same way

Slide 1

Earley

bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word

ndash Extend states based on matchesndash Add new predictionsndash Go to 2

ndash Look at N+1 to see if you have a winner

Slide 1

Earley and Left Recursion

bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search

ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them

S -gt NP VPNP -gt NP PP

bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless

bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is

Slide 1

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state

beginning and ending where generating state ends

So predictor looking at

S -gt VP [00]

results in

VP -gt Verb [00]VP -gt Verb NP [00]

Slide 1

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry

So scanner looking at

VP -gt Verb NP [00]

If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01]

Add this state to chart entry following current one

Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart

Slide 1

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this

category

bull copy state bull move dotbull insert in current chart entry

GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 15: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Bottom-Up Filtering

Slide 1

Possible Problem Left-Recursion

What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with

Did the flighthellip

Slide 1

Solution Rule Ordering

S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP

The key for the NP is that you want the recursive option after any base case

Slide 1

Avoiding Repeated Work

Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

Consider an attempt to top-down parse the following as an NP

A flight from Indianapolis to Houston on TWA

Slide 1

flight

Slide 1

flight

flight

Slide 1

Slide 1

Slide 1

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial

time

Slide 1

Earley Parsing

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

Slide 1

States

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found

Slide 1

StatesLocations

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12] An NP is in progress the

Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending

at 3

Slide 1

Graphically

Slide 1

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states

as new constituents are discoveredndash New complete states are created in the same way

Slide 1

Earley

bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word

ndash Extend states based on matchesndash Add new predictionsndash Go to 2

ndash Look at N+1 to see if you have a winner

Slide 1

Earley and Left Recursion

bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search

ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them

S -gt NP VPNP -gt NP PP

bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless

bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is

Slide 1

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state

beginning and ending where generating state ends

So predictor looking at

S -gt VP [00]

results in

VP -gt Verb [00]VP -gt Verb NP [00]

Slide 1

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry

So scanner looking at

VP -gt Verb NP [00]

If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01]

Add this state to chart entry following current one

Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart

Slide 1

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this

category

bull copy state bull move dotbull insert in current chart entry

GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 16: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Possible Problem Left-Recursion

What happens in the following situationS -gt NP VPS -gt Aux NP VPNP -gt NP PPNP -gt Det NominalhellipWith the sentence starting with

Did the flighthellip

Slide 1

Solution Rule Ordering

S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP

The key for the NP is that you want the recursive option after any base case

Slide 1

Avoiding Repeated Work

Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

Consider an attempt to top-down parse the following as an NP

A flight from Indianapolis to Houston on TWA

Slide 1

flight

Slide 1

flight

flight

Slide 1

Slide 1

Slide 1

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial

time

Slide 1

Earley Parsing

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

Slide 1

States

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found

Slide 1

StatesLocations

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12] An NP is in progress the

Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending

at 3

Slide 1

Graphically

Slide 1

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states

as new constituents are discoveredndash New complete states are created in the same way

Slide 1

Earley

bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word

ndash Extend states based on matchesndash Add new predictionsndash Go to 2

ndash Look at N+1 to see if you have a winner

Slide 1

Earley and Left Recursion

bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search

ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them

S -gt NP VPNP -gt NP PP

bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless

bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is

Slide 1

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state

beginning and ending where generating state ends

So predictor looking at

S -gt VP [00]

results in

VP -gt Verb [00]VP -gt Verb NP [00]

Slide 1

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry

So scanner looking at

VP -gt Verb NP [00]

If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01]

Add this state to chart entry following current one

Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart

Slide 1

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this

category

bull copy state bull move dotbull insert in current chart entry

GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 17: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Solution Rule Ordering

S -gt Aux NP VPS -gt NP VPNP -gt Det NominalNP -gt NP PP

The key for the NP is that you want the recursive option after any base case

Slide 1

Avoiding Repeated Work

Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

Consider an attempt to top-down parse the following as an NP

A flight from Indianapolis to Houston on TWA

Slide 1

flight

Slide 1

flight

flight

Slide 1

Slide 1

Slide 1

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial

time

Slide 1

Earley Parsing

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

Slide 1

States

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found

Slide 1

StatesLocations

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12] An NP is in progress the

Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending

at 3

Slide 1

Graphically

Slide 1

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states

as new constituents are discoveredndash New complete states are created in the same way

Slide 1

Earley

bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word

ndash Extend states based on matchesndash Add new predictionsndash Go to 2

ndash Look at N+1 to see if you have a winner

Slide 1

Earley and Left Recursion

bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search

ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them

S -gt NP VPNP -gt NP PP

bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless

bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is

Slide 1

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state

beginning and ending where generating state ends

So predictor looking at

S -gt VP [00]

results in

VP -gt Verb [00]VP -gt Verb NP [00]

Slide 1

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry

So scanner looking at

VP -gt Verb NP [00]

If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01]

Add this state to chart entry following current one

Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart

Slide 1

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this

category

bull copy state bull move dotbull insert in current chart entry

GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the table; we just need to read off all the backpointers from every complete S in the last column of the table:

– Find all the S -> α • [0,N]
– Follow the structural traces from the Completer

Of course, this won't be polynomial time, since there could be an exponential number of trees. Still, we can at least represent ambiguity efficiently.
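
A sketch of the backpointer idea (field and function names are our own; this extends the State sketch used above):

from dataclasses import dataclass

@dataclass(frozen=True)
class PState:
    lhs: str
    rhs: tuple
    dot: int
    start: int
    end: int
    children: tuple = ()          # backpointers, filled in by the completer

def attach(parent, child):
    """Completer step: advance parent over child.lhs, remembering the pointer."""
    return PState(parent.lhs, parent.rhs, parent.dot + 1,
                  parent.start, child.end, parent.children + (child,))

def tree(state):
    """Read one parse tree off a complete state's backpointers."""
    if not state.children:                    # a scanned word: rhs == (word,)
        return (state.lhs, state.rhs[0])
    return (state.lhs,) + tuple(tree(c) for c in state.children)

# Following every backpointer combination from every complete S in the last
# column enumerates all parses; the chart itself stays polynomial-sized.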

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limit the search:

– Never place a state into the chart that's already there
– Copy states before advancing them

Earley and Left Recursion 1

S -> NP VP
NP -> NP PP

The predictor, given the first rule's state

S -> • NP VP [0,0]

predicts

NP -> • NP PP [0,0]

and stops there, since predicting the same state again would be redundant.

Earley and Left Recursion 2

When a state gets advanced, make a copy and leave the original alone…

Say we have NP -> • NP PP [0,0]. We find an NP from 0 to 2, so we create
NP -> NP • PP [0,2], but we leave the original state as is.

Dynamic Programming Approaches

Earley: top-down, no filtering, no restriction on grammar form.

CYK: bottom-up, no filtering, grammars restricted to Chomsky Normal Form (CNF). (Details are not important.)

The design space: bottom-up vs. top-down; with or without filters; with restrictions on grammar form or not.

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 18: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Avoiding Repeated Work

Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

Consider an attempt to top-down parse the following as an NP

A flight from Indianapolis to Houston on TWA

Slide 1

flight

Slide 1

flight

flight

Slide 1

Slide 1

Slide 1

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial

time

Slide 1

Earley Parsing

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

Slide 1

States

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found

Slide 1

StatesLocations

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12] An NP is in progress the

Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending

at 3

Slide 1

Graphically

Slide 1

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states

as new constituents are discoveredndash New complete states are created in the same way

Slide 1

Earley

bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word

ndash Extend states based on matchesndash Add new predictionsndash Go to 2

ndash Look at N+1 to see if you have a winner

Slide 1

Earley and Left Recursion

bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search

ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them

S -gt NP VPNP -gt NP PP

bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless

bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is

Slide 1

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state

beginning and ending where generating state ends

So predictor looking at

S -gt VP [00]

results in

VP -gt Verb [00]VP -gt Verb NP [00]

Slide 1

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry

So scanner looking at

VP -gt Verb NP [00]

If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01]

Add this state to chart entry following current one

Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart

Slide 1

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this

category

bull copy state bull move dotbull insert in current chart entry

GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 19: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

flight

Slide 1

flight

flight

Slide 1

Slide 1

Slide 1

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial

time

Slide 1

Earley Parsing

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

Slide 1

States

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found

Slide 1

StatesLocations

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12] An NP is in progress the

Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending

at 3

Slide 1

Graphically

Slide 1

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states

as new constituents are discoveredndash New complete states are created in the same way

Slide 1

Earley

bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word

ndash Extend states based on matchesndash Add new predictionsndash Go to 2

ndash Look at N+1 to see if you have a winner

Slide 1

Earley and Left Recursion

bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search

ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them

S -gt NP VPNP -gt NP PP

bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless

bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is

Slide 1

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state

beginning and ending where generating state ends

So predictor looking at

S -gt VP [00]

results in

VP -gt Verb [00]VP -gt Verb NP [00]

Slide 1

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry

So scanner looking at

VP -gt Verb NP [00]

If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01]

Add this state to chart entry following current one

Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart

Slide 1

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this

category

bull copy state bull move dotbull insert in current chart entry

GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 20: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

flight

flight

Slide 1

Slide 1

Slide 1

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial

time

Slide 1

Earley Parsing

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

Slide 1

States

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found

Slide 1

StatesLocations

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12] An NP is in progress the

Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending

at 3

Slide 1

Graphically

Slide 1

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states

as new constituents are discoveredndash New complete states are created in the same way

Slide 1

Earley

bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word

ndash Extend states based on matchesndash Add new predictionsndash Go to 2

ndash Look at N+1 to see if you have a winner

Slide 1

Earley and Left Recursion

bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search

ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them

S -gt NP VPNP -gt NP PP

bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless

bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is

Slide 1

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state

beginning and ending where generating state ends

So predictor looking at

S -gt VP [00]

results in

VP -gt Verb [00]VP -gt Verb NP [00]

Slide 1

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry

So scanner looking at

VP -gt Verb NP [00]

If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01]

Add this state to chart entry following current one

Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart

Slide 1

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this

category

bull copy state bull move dotbull insert in current chart entry

GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 21: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Slide 1

Slide 1

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial

time

Slide 1

Earley Parsing

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

Slide 1

States

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found

Slide 1

StatesLocations

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12] An NP is in progress the

Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending

at 3

Slide 1

Graphically

Slide 1

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states

as new constituents are discoveredndash New complete states are created in the same way

Slide 1

Earley

bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word

ndash Extend states based on matchesndash Add new predictionsndash Go to 2

ndash Look at N+1 to see if you have a winner

Slide 1

Earley and Left Recursion

bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search

ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them

S -gt NP VPNP -gt NP PP

bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless

bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is

Slide 1

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state

beginning and ending where generating state ends

So predictor looking at

S -gt VP [00]

results in

VP -gt Verb [00]VP -gt Verb NP [00]

Slide 1

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry

So scanner looking at

VP -gt Verb NP [00]

If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01]

Add this state to chart entry following current one

Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart

Slide 1

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this

category

bull copy state bull move dotbull insert in current chart entry

GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 22: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Slide 1

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial

time

Slide 1

Earley Parsing

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

Slide 1

States

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found

Slide 1

StatesLocations

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12] An NP is in progress the

Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending

at 3

Slide 1

Graphically

Slide 1

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states

as new constituents are discoveredndash New complete states are created in the same way

Slide 1

Earley

bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word

ndash Extend states based on matchesndash Add new predictionsndash Go to 2

ndash Look at N+1 to see if you have a winner

Slide 1

Earley and Left Recursion

bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search

ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them

S -gt NP VPNP -gt NP PP

bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless

bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is

Slide 1

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state

beginning and ending where generating state ends

So predictor looking at

S -gt VP [00]

results in

VP -gt Verb [00]VP -gt Verb NP [00]

Slide 1

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry

So scanner looking at

VP -gt Verb NP [00]

If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01]

Add this state to chart entry following current one

Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart

Slide 1

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this

category

bull copy state bull move dotbull insert in current chart entry

GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 23: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in (approximately) polynomial

time

Slide 1

Earley Parsing

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

Slide 1

States

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found

Slide 1

StatesLocations

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12] An NP is in progress the

Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending

at 3

Slide 1

Graphically

Slide 1

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states

as new constituents are discoveredndash New complete states are created in the same way

Slide 1

Earley

bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word

ndash Extend states based on matchesndash Add new predictionsndash Go to 2

ndash Look at N+1 to see if you have a winner

Slide 1

Earley and Left Recursion

bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search

ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them

S -gt NP VPNP -gt NP PP

bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless

bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is

Slide 1

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state

beginning and ending where generating state ends

So predictor looking at

S -gt VP [00]

results in

VP -gt Verb [00]VP -gt Verb NP [00]

Slide 1

Scanner

Given a stateWith a non-terminal to right of dotThat is a part-of-speech categoryIf the next word in the input matches this part-of-speechndash Create a new state with dot moved over the non-terminalndash insert in next chart entry

So scanner looking at

VP -gt Verb NP [00]

If the next word ldquobookrdquo can be a verb add new state

VP -gt Verb NP [01]

Add this state to chart entry following current one

Note Earley algorithm uses top-down input to disambiguate POS Only POS predicted by some state can get added to chart

Slide 1

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this

category

bull copy state bull move dotbull insert in current chart entry

GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 24: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Earley Parsing

Fills a table in a single sweep over the input wordsTable is length N+1 N is number of wordsTable entries represent

Completed constituents and their locationsIn-progress constituentsPredicted constituents

Slide 1

States

The table-entries are called states and are represented with dotted-rulesS -gt VP A VP is predictedNP -gt Det Nominal An NP is in progressVP -gt V NP A VP has been found

Slide 1

StatesLocations

It would be nice to know where these things are in the input sohellipS -gt VP [00] A VP is predicted at the

start of the sentenceNP -gt Det Nominal [12] An NP is in progress the

Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending

at 3

Slide 1

Graphically

Slide 1

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by states in current chartndash New incomplete states are created by advancing existing states

as new constituents are discoveredndash New complete states are created in the same way

Slide 1

Earley

bull More specificallyhellipndash Predict all the states you can upfrontndash Read a word

ndash Extend states based on matchesndash Add new predictionsndash Go to 2

ndash Look at N+1 to see if you have a winner

Slide 1

Earley and Left Recursion

bull So Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search

ndash Never place a state into the chart thatrsquos already therendash Copy states before advancing them

S -gt NP VPNP -gt NP PP

bull The first rule predictsS -gt NP VP [00] that addsNP -gt NP PP [00]stops there since adding any subsequent prediction would be fruitless

bull When a state gets advanced make a copy and leave the original alonendash Say we have NP -gt NP PP [00]ndash We find an NP from 0 to 2 so we create NP -gt NP PP [02]ndash But we leave the original state as is

Slide 1

Predictor

Given a stateWith a non-terminal to right of dotThat is not a part-of-speech categoryCreate a new state for each expansion of the non-terminalPlace these new states into same chart entry as generated state

beginning and ending where generating state ends

So predictor looking at

S -gt VP [00]

results in

VP -gt Verb [00]VP -gt Verb NP [00]

Scanner

• Given a state
– with a non-terminal to the right of the dot
– that is a part-of-speech category:
• If the next word in the input matches this part of speech:
– create a new state with the dot moved over the non-terminal
– insert it in the next chart entry

So the scanner, looking at

VP → · Verb NP [0,0]

if the next word, "book", can be a verb, adds the new state

VP → Verb · NP [0,1]

to the chart entry following the current one.

Note: the Earley algorithm uses top-down input to disambiguate POS. Only a POS predicted by some state can get added to the chart.
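A matching scanner sketch, under the same illustrative representation (the LEXICON here is an assumption, not from the slides):

    # A state is (lhs, rhs, dot, start, end).
    LEXICON = {"book": {"Verb", "Noun"}, "that": {"Det"}, "flight": {"Noun"}}
    POS = {"Verb", "Det", "Noun"}

    def scanner(state, words, chart):
        lhs, rhs, dot, start, end = state
        category = rhs[dot]                        # POS category right of the dot
        assert category in POS
        if end < len(words) and category in LEXICON.get(words[end], set()):
            # Move the dot over the matched POS; the span grows by one word.
            new = (lhs, rhs, dot + 1, start, end + 1)
            chart[end + 1].append(new)             # goes in the *next* chart entry

    chart = [[] for _ in range(4)]
    scanner(("VP", ("Verb", "NP"), 0, 0, 0), "book that flight".split(), chart)
    print(chart[1])  # VP → Verb · NP [0,1]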

Completer

• Applied to a state when its dot has reached the right end of the rule
• The parser has discovered a category over some span of input
• Find and advance all previous states that were looking for this category:
– copy the state
– move the dot
– insert in the current chart entry

Given
NP → Det Nominal · [1,3]
VP → Verb · NP [0,1]

add
VP → Verb NP · [0,3]
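And a matching completer sketch (same illustrative state tuples as above):

    # A state is (lhs, rhs, dot, start, end).
    def completer(state, chart):
        lhs, rhs, dot, start, end = state
        assert dot == len(rhs)                  # dot is at the right end of the rule
        for (l2, r2, d2, s2, e2) in chart[start]:
            # Advance every earlier state that was waiting for this category
            # at exactly the position where the completed span begins.
            if d2 < len(r2) and r2[d2] == lhs:
                chart[end].append((l2, r2, d2 + 1, s2, end))

    chart = [[] for _ in range(4)]
    chart[1].append(("VP", ("Verb", "NP"), 1, 0, 1))   # VP → Verb · NP [0,1]
    completer(("NP", ("Det", "Nominal"), 2, 1, 3), chart)
    print(chart[3])  # VP → Verb NP · [0,3]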

Earley: how do we know we are done?

Find an S state in the final column that spans from 0 to N and is complete:

S → α · [0,N]

Earley

So sweep through the table from 0 to N…

• New predicted states are created by starting top-down from S
• New incomplete states are created by advancing existing states as new constituents are discovered
• New complete states are created in the same way

Earley

More specifically…

1. Predict all the states you can up front
2. Read a word
3. Extend states based on matches
4. Add new predictions
5. Go to 2
6. When the words run out, look at the final chart entry (N) to see if you have a winner

Example

Book that flight.
We should find… an S from 0 to 3 that is a completed state…

Example (cont'd)

[three slides stepping through the chart for "Book that flight"; figures not preserved]

A simple example

Grammar: S → NP VP, NP → N, VP → V NP
Lexicon: N → I | saw | Mary; V → saw
Input: I saw Mary

Chart[0]
γ → · S [0,0] (dummy start state)
S → · NP VP [0,0] (predictor)
NP → · N [0,0] (predictor)

Chart[1]
N → I · [0,1] (scanner)
NP → N · [0,1] (completer)
S → NP · VP [0,1] (completer)
VP → · V NP [1,1] (predictor)

Chart[2]
V → saw · [1,2] (scanner)
VP → V · NP [1,2] (completer)
NP → · N [2,2] (predictor)

Chart[3]
N → Mary · [2,3] (scanner)
NP → N · [2,3] (completer)
VP → V NP · [1,3] (completer)
S → NP VP · [0,3] (completer)

Sentence accepted.
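The whole trace above can be reproduced by a compact recognizer. Here is a minimal, self-contained sketch (illustrative code, not from the course materials; the function and variable names are assumptions) that implements predictor, scanner, and completer over (lhs, rhs, dot, start, end) tuples and accepts exactly when a complete S spanning [0,N] reaches the final chart entry:

    # Minimal Earley recognizer sketch, using the slide's toy grammar.
    GRAMMAR = {"S": [("NP", "VP")], "NP": [("N",)], "VP": [("V", "NP")]}
    LEXICON = {"I": {"N"}, "saw": {"N", "V"}, "Mary": {"N"}}
    POS = {"N", "V"}

    def recognize(words):
        n = len(words)
        chart = [[] for _ in range(n + 1)]

        def add(i, state):
            if state not in chart[i]:          # never add a duplicate state
                chart[i].append(state)

        add(0, ("GAMMA", ("S",), 0, 0, 0))     # dummy start state
        for i in range(n + 1):
            for state in chart[i]:             # chart[i] may grow as we iterate
                lhs, rhs, dot, start, end = state
                if dot < len(rhs):
                    sym = rhs[dot]
                    if sym not in POS:         # predictor
                        for expansion in GRAMMAR[sym]:
                            add(i, (sym, expansion, 0, i, i))
                    elif i < n and sym in LEXICON.get(words[i], set()):
                        add(i + 1, (sym, (words[i],), 1, i, i + 1))  # scanner
                else:                          # completer
                    for (l2, r2, d2, s2, e2) in chart[start]:
                        if d2 < len(r2) and r2[d2] == lhs:
                            add(i, (l2, r2, d2 + 1, s2, i))
        # Done: a complete S spanning the whole input sits in the final column.
        return any(l == "S" and d == len(r) and s == 0
                   for (l, r, d, s, e) in chart[n])

    print(recognize("I saw Mary".split()))     # True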

What is it?

• What kind of parser did we just describe? (trick question)
– An Earley parser… yes
– But not a parser – a recognizer!
• The presence of an S state with the right attributes in the right place indicates a successful recognition.
• But no parse tree… no parser.
• That's how we solve (not) an exponential problem in polynomial time.

Converting Earley from Recognizer to Parser

• With the addition of a few pointers, we have a parser.
• Augment the "Completer" to point to where we came from.

Augmenting the chart with structural information

[figure: chart states S8–S13 annotated with backpointers; diagram not preserved]

Retrieving Parse Trees from Chart

• All the possible parses for an input are in the table; we just need to read off all the backpointers from every complete S in the last column of the table:
– Find all the S → α · [0,N]
– Follow the structural traces from the Completer
• Of course, this won't be polynomial time, since there could be an exponential number of trees.
• But we can at least represent ambiguity efficiently.
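A sketch of what the Completer's backpointers buy us (illustrative; it assumes the recognizer sketch above is extended so each state carries a kids tuple, one completed child state per constituent the dot has passed, recorded by the completer as it advances a state):

    # Read a parse tree off the backpointers, bottoming out at scanned words.
    def tree(state):
        lhs, rhs, dot, start, end, kids = state
        if not kids:                  # a scanned POS state covers a single word
            return (lhs, rhs[0])
        return (lhs,) + tuple(tree(k) for k in kids)

    # Hand-built backpointered states for "I saw Mary", i.e. what the
    # augmented completer would have left in the chart:
    n_i    = ("N",  ("I",),       1, 0, 1, ())
    np_i   = ("NP", ("N",),       1, 0, 1, (n_i,))
    v_saw  = ("V",  ("saw",),     1, 1, 2, ())
    n_mary = ("N",  ("Mary",),    1, 2, 3, ())
    np_m   = ("NP", ("N",),       1, 2, 3, (n_mary,))
    vp     = ("VP", ("V", "NP"),  2, 1, 3, (v_saw, np_m))
    s      = ("S",  ("NP", "VP"), 2, 0, 3, (np_i, vp))
    print(tree(s))
    # ('S', ('NP', ('N', 'I')), ('VP', ('V', 'saw'), ('NP', ('N', 'Mary'))))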

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the search:
• Never place a state into the chart that's already there
• Copy states before advancing them

Earley and Left Recursion 1

S → NP VP
NP → NP PP

The predictor, given the first rule, predicts

S → · NP VP [0,0]

which in turn predicts

NP → · NP PP [0,0]

and stops there, since predicting the same state again would be redundant.

Earley and Left Recursion 2

When a state gets advanced, make a copy and leave the original alone…

• Say we have NP → · NP PP [0,0]
• We find an NP from 0 to 2, so we create NP → NP · PP [0,2]
• But we leave the original state as is

Dynamic Programming Approaches

• Earley
– Top-down, no filtering, no restriction on grammar form
• CYK
– Bottom-up, no filtering, grammars restricted to Chomsky Normal Form (CNF)
• Details are not important…
– Bottom-up vs. top-down
– With or without filters
– With restrictions on grammar form or not
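For contrast, a minimal CYK recognizer sketch (illustrative code, not from the slides; it assumes the toy grammar has already been converted to CNF, here by folding the unit rule NP → N into terminal rules):

    # CNF grammar: binary rules A → B C plus terminal rules A → word.
    BINARY = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}
    TERMINAL = {"I": {"NP"}, "saw": {"NP", "V"}, "Mary": {"NP"}}

    def cyk(words):
        n = len(words)
        # table[i][j] = set of categories spanning words[i:j]
        table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
        for i, w in enumerate(words):
            table[i][i + 1] = set(TERMINAL.get(w, set()))
        for span in range(2, n + 1):          # bottom-up, smallest spans first
            for i in range(n - span + 1):
                j = i + span
                for k in range(i + 1, j):     # every split point
                    for b in table[i][k]:
                        for c in table[k][j]:
                            table[i][j] |= BINARY.get((b, c), set())
        return "S" in table[0][n]

    print(cyk("I saw Mary".split()))          # True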

  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 33: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Completer

Applied to a state when its dot has reached right end of roleParser has discovered a category over some span of inputFind and advance all previous states that were looking for this

category

bull copy state bull move dotbull insert in current chart entry

GivenNP -gt Det Nominal [13]VP -gt Verb NP [01]

AddVP -gt Verb NP [03]

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 34: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Earley how do we know we are done

Find an S state in the final column that spans from 0 to n+1 and is complete

S ndashgt α [0n+1]

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 35: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Earley

So sweep through the table from 0 to n+1hellip

New predicted states are created by starting top-down from S

New incomplete states are created by advancing existing states as new constituents are discovered

New complete states are created in the same way

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 36: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Earley

More specificallyhellipPredict all the states you can upfront

Read a wordExtend states based on matchesAdd new predictionsGo to 2

Look at N+1 to see if you have a winner

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 37: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Example

Book that flightWe should findhellip an S from 0 to 3 that is a completed statehellip

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 38: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 39: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Example (contrsquod)

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 40: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Example (contrsquod)

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 41: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

A simple example

Chart[0]γ rarr S [00] (dummy start state)S rarr NP VP [00 ] (predictor)NP rarr N [00 ] (predictor)

Chart[1]N rarr I [01 ] (scan)NP rarr N [01 ] (completer)S rarr NP VP [01 ] (completer)VP rarr V NP [11 ] (predictor)

Chart[2]V rarr saw [12 ] (scan) VP rarr V NP [12 ] (complete) NP rarr N [22 ] (predict)

Chart[3]NP rarr N [23 ] (scan)NP rarr N [23 ] (completer)VP rarr V NP [13 ] (completer)S rarr NP VP [03 ] (completer)

Grammar S rarr NP VP NP rarr N VP rarr V NP

Lexicon Nrarr I | saw | Mary Vrarr saw

Input I saw Mary

Sentence accepted

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 42: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

What is it

What kind of parser did we just describe (trick question)Earley parserhellip yesNot a parser ndash a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition

But no parse treehellip no parserThatrsquos how we solve (not) an exponential problem in polynomial time

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 43: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Converting Earley from Recognizer to ParserWith the addition of a few pointers we have a parserAugment the ldquoCompleterrdquo to point to where we came from

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 44: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Augmenting the chart with structural information

S8S9

S10

S11

S13S12

S8

S9

S8

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 45: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Retrieving Parse Trees from Chart

All the possible parses for an input are in the tableWe just need to read off all the backpointers from every complete S

in the last column of the tableFind all the S -gt X [0N+1]Follow the structural traces from the CompleterOf course this wonrsquot be polynomial time since there could be an

exponential number of treesSo we can at least represent ambiguity efficiently

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Slide 1

Earley and Left Recursion 1

S -gt NP VPNP -gt NP PP

Predictor given first ruleS -gt NP VP [00]

PredictsNP -gt NP PP [00]stops there since predicting same again would be redundant

Slide 1

Earley and Left Recursion 2

When a state gets advanced make a copy and leave the original alonehellip

Say we have NP -gt NP PP [00]We find an NP from 0 to 2 so we create

NP -gt NP PP [02]But we leave the original state as is

Slide 1

Dynamic Programming Approaches

EarleyTop-down no filtering no restriction on grammar form

CYKBottom-up no filtering grammars restricted to Chomsky-Normal Form

(CNF)Details are not important

Bottom-up vs top-downWith or without filtersWith restrictions on grammar form or not

  • Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from Jim Martin (after Dan Jurafsky) from U Colorado Rada Mihalcea University of North Texas httpwwwcsuntedu~radaCSCE5290 Robert Berwick MIT Bonnie Dorr University of Maryland
  • Parsing
  • Programming languages
  • Natural Languages
  • Some assumptions
  • Top-Down Parsing
  • Top Down Space
  • Bottom-Up Parsing
  • Bottom-Up Space
  • Top-Down VS Bottom-Up
  • Top-Down Depth-First Left-to-Right Search
  • Example (contrsquod)
  • Slide 13
  • Slide 14
  • Bottom-Up Filtering
  • Possible Problem Left-Recursion
  • Solution Rule Ordering
  • Avoiding Repeated Work
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Dynamic Programming
  • Earley Parsing
  • States
  • StatesLocations
  • Graphically
  • Earley
  • Slide 29
  • Earley and Left Recursion
  • Predictor
  • Scanner
  • Completer
  • Earley how do we know we are done
  • Slide 35
  • Slide 36
  • Example
  • Slide 38
  • Slide 39
  • Slide 40
  • A simple example
  • What is it
  • Converting Earley from Recognizer to Parser
  • Augmenting the chart with structural information
  • Retrieving Parse Trees from Chart
  • Slide 46
  • Earley and Left Recursion 1
  • Earley and Left Recursion 2
  • Dynamic Programming Approaches
Page 46: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Slide 1

Earley and Left Recursion

Earley solves the left-recursion problem without having to alter the grammar or artificially limiting the searchNever place a state into the chart thatrsquos already thereCopy states before advancing them

Page 47: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Earley and Left Recursion 1

S -> NP VP
NP -> NP PP

• Predictor, given the first rule S -> . NP VP [0,0], predicts NP -> . NP PP [0,0].
• It stops there, since predicting the same state again would be redundant.
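A sketch of the Predictor under the same illustrative assumptions (the GRAMMAR dict is made up for the example; State and enqueue come from the earlier sketches). Re-predicting NP at position 0 finds the state already in column 0 and does nothing:

GRAMMAR = {"S": [("NP", "VP")],
           "NP": [("NP", "PP"), ("Det", "Noun")]}  # illustrative rules

def predictor(state, chart):
    # Dot sits before a nonterminal B: add B -> . gamma [j,j]
    # for every grammar rule B -> gamma.
    B = state.rhs[state.dot]        # nonterminal right after the dot
    j = state.end
    for gamma in GRAMMAR[B]:
        enqueue(State(B, gamma, dot=0, start=j, end=j), chart, j)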

Page 48: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Earley and Left Recursion 2

• When a state gets advanced, make a copy and leave the original alone…
• Say we have NP -> . NP PP [0,0]. We find an NP from 0 to 2, so we create NP -> NP . PP [0,2].
• But we leave the original state as is.
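In code, "advance by copy" means the Completer builds a new state instead of mutating the waiting one. A sketch under the same illustrative assumptions as the earlier ones:

def advance(waiting, child):
    # Return a NEW state with the dot moved past the completed child.
    # The waiting state stays in the chart untouched, so
    # NP -> . NP PP [0,0] is still available to combine with later NPs.
    return State(waiting.lhs, waiting.rhs, dot=waiting.dot + 1,
                 start=waiting.start, end=child.end,
                 backpointers=waiting.backpointers + [child])

def completer(state, chart):
    # For a complete state, advance (by copy) every state that was
    # waiting for this lhs at this state's start position.
    for waiting in list(chart[state.start]):
        if (not waiting.is_complete()
                and waiting.rhs[waiting.dot] == state.lhs):
            enqueue(advance(waiting, state), chart, state.end)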

Page 49: Parsing with  Context Free Grammars CSC 9010 Natural Language Processing

Dynamic Programming Approaches

• Earley: top-down, no filtering, no restriction on grammar form.
• CYK: bottom-up, no filtering, grammars restricted to Chomsky Normal Form (CNF).
• The details are not important. What varies across dynamic programming parsers is:
  – Bottom-up vs. top-down
  – With or without filters
  – With restrictions on grammar form or not
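For contrast with Earley, here is a minimal CYK recognizer sketch. The grammar encoding (dicts for lexical A -> word rules and binary A -> B C rules) is an illustrative assumption; the algorithm works only because CNF guarantees every rule has one of those two shapes.

from collections import defaultdict

def cyk_recognize(words, lexical, binary):
    # lexical: {word: {A, ...}}   for rules A -> word
    # binary:  {(B, C): {A, ...}} for rules A -> B C
    # Bottom-up: fill the length-1 spans, then combine shorter spans.
    n = len(words)
    table = defaultdict(set)
    for i, w in enumerate(words):
        table[i, i + 1] = set(lexical.get(w, ()))
    for length in range(2, n + 1):          # span length
        for i in range(n - length + 1):     # span start
            k = i + length                  # span end
            for j in range(i + 1, k):       # split point
                for B in table[i, j]:
                    for C in table[j, k]:
                        table[i, k] |= binary.get((B, C), set())
    return "S" in table[0, n]

cyk_recognize returns True exactly when S derives the whole input, so it answers the same yes/no question as an Earley recognizer, but bottom-up and only for CNF grammars.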
