School of Computing
FACULTY OF ENGINEERING
Parsing: computing the grammatical structure of English sentences
COMP3310 Natural Language Processing
Eric Atwell, Language Research Group
(with thanks to Katja Markert, Marti Hearst, and other contributors)
Reminder: Outline for Grammar/Parsing
Context-Free Grammars and Constituency
Some common CFG phenomena for English
• Sentence-level constructions
• NP, PP, VP
• Coordination
• Subcategorization
Top-down and Bottom-up Parsing
Chart Parsing
CFG example
S -> NP VP
NP -> Det NOMINAL
NOMINAL -> Noun
VP -> Verb
Det -> a
Noun -> flight
Verb -> left
Alternatively…
S -> NP VP
NP -> A N
VP -> V
A -> a
N -> flight
V -> left
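A grammar like the first one above can be written down directly as data. The sketch below is only an illustration: the dictionary layout and the `generate` helper are assumptions made here, not part of the course material. It enumerates every word sequence the grammar derives from S.

```python
from itertools import product

# The first grammar from the slide: each non-terminal maps to a list of right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "NOMINAL"]],
    "NOMINAL": [["Noun"]],
    "VP": [["Verb"]],
    "Det": [["a"]],
    "Noun": [["flight"]],
    "Verb": [["left"]],
}

def generate(symbol):
    """Yield every word sequence derivable from `symbol`.

    This grammar has no recursion, so the enumeration terminates.
    """
    if symbol not in GRAMMAR:          # a terminal, i.e. a word
        yield [symbol]
        return
    for rhs in GRAMMAR[symbol]:
        # Expand each right-hand-side symbol, then combine the alternatives.
        for parts in product(*(list(generate(s)) for s in rhs)):
            yield [word for part in parts for word in part]

sentences = [" ".join(words) for words in generate("S")]
print(sentences)  # ['a flight left']
```

The alternative grammar (with A, N, V) would generate the same single sentence; only the category names differ.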
Derivations
A derivation is a sequence of rules applied to a string that accounts for that string
• Covers all the elements in the string
• Covers only the elements in the string
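As a small illustration, a derivation of "a flight left" with the example grammar can be replayed step by step. The rule order and the `derive` helper below are assumptions chosen for this sketch; each step rewrites the leftmost occurrence of a rule's left-hand side.

```python
# Each rule is (left-hand side, right-hand side); lower-case symbols are words.
RULES = [
    ("S", ["NP", "VP"]),
    ("NP", ["Det", "NOMINAL"]),
    ("Det", ["a"]),
    ("NOMINAL", ["Noun"]),
    ("Noun", ["flight"]),
    ("VP", ["Verb"]),
    ("Verb", ["left"]),
]

def derive(rules, start="S"):
    """Apply the rules in order, rewriting the leftmost occurrence of each left-hand side."""
    form = [start]
    steps = [form]
    for lhs, rhs in rules:
        i = form.index(lhs)            # raises ValueError if the rule does not apply
        form = form[:i] + rhs + form[i + 1:]
        steps.append(form)
    return steps

steps = derive(RULES)
for form in steps:
    print(" ".join(form))

# The derivation accounts for the string if the last line is exactly the input words.
print(steps[-1] == "a flight left".split())  # True
```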
Bracketed Notation
[S [NP [PRO I]] [VP [V prefer] [NP [Det a] [Nom [N morning] [N flight]]]]]
(Figure: the same structure drawn as a tree, with S at the root, NP and VP below it, and the words "I prefer a morning flight" at the leaves.)
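Bracketed notation is easy to read back into a tree structure. The following is a minimal sketch, assuming the well-formed bracketing of "I prefer a morning flight"; the function names `parse_brackets` and `leaves` are made up for this example.

```python
import re

def parse_brackets(text):
    """Turn '[S [NP ...] ...]' into nested (label, children) tuples; words stay plain strings."""
    tokens = re.findall(r"\[|\]|[^\s\[\]]+", text)
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "["
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != "]":
            if tokens[pos] == "[":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return node()

def leaves(tree):
    """Collect the words at the bottom of the tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1] for word in leaves(child)]

tree = parse_brackets(
    "[S [NP [PRO I]] [VP [V prefer] [NP [Det a] [Nom [N morning] [N flight]]]]]"
)
print(tree[0])                 # S
print(" ".join(leaves(tree)))  # I prefer a morning flight
```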
CFGs: a summary
CFGs appear to be just about what we need to account for a lot of basic syntactic structure in English.
But there are problems
• They can be dealt with adequately, although not elegantly, by staying within the CFG framework.
• There are simpler, more elegant solutions that take us out of the CFG framework (beyond its formal power).
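Syntactic theories: Transformational Grammar (TG), Head-driven Phrase Structure Grammar (HPSG), Lexical Functional Grammar (LFG), Generalized Phrase Structure Grammar (GPSG), etc.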
Other syntactic stuff
Grammatical relations or functions
• Subject
• I booked a flight to New York
• The flight was booked by my agent
• Object
• I booked a flight to New York
• Complement
• I said that I wanted to leave
Dependency parsing
Word to word links instead of constituency
Based on the European rather than American traditions
But dates back to ancient Greek and Arab scholars
E.g. see the Quranic Arabic Corpus
The original notions of Subject, Object and the progenitor of subcategorization (called ‘valence’) came out of Dependency theory.
Dependency parsing is quite popular as a computational model, since relationships between words are quite useful
Dependency parsing
Bills on ports and immigration were submitted by Senator Brownback
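(Figure: the phrase-structure parse tree of this sentence, with Penn Treebank labels S, NP, PP, VP, NNS, IN, CC, NN, VBD, VBN, NNP, shown next to its typed dependency graph: nsubjpass(submitted, Bills), auxpass(submitted, were), agent(submitted, Brownback), nn(Brownback, Senator), prep_on(Bills, ports), conj_and(ports, immigration).)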
Parse tree:
Nesting of multi-word constituents
Typed dependency parse:
Grammatical relations between individual words
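A typed dependency parse is just a set of labelled word-to-word links, so it fits naturally into a list of triples. The sketch below encodes the relations for the Brownback sentence as read off the slide's diagram; the triple layout and the helper names are assumptions made for illustration.

```python
from collections import defaultdict

# Typed dependencies as (relation, head, dependent) triples.
DEPENDENCIES = [
    ("nsubjpass", "submitted", "Bills"),
    ("auxpass", "submitted", "were"),
    ("agent", "submitted", "Brownback"),
    ("nn", "Brownback", "Senator"),
    ("prep_on", "Bills", "ports"),
    ("conj_and", "ports", "immigration"),
]

def children(deps):
    """Group (relation, dependent) pairs under each head word."""
    table = defaultdict(list)
    for relation, head, dependent in deps:
        table[head].append((relation, dependent))
    return dict(table)

def root(deps):
    """The root is the one head that is never a dependent of another word."""
    heads = {head for _, head, _ in deps}
    dependents = {dependent for _, _, dependent in deps}
    (only,) = heads - dependents
    return only

tree = children(DEPENDENCIES)
print(root(DEPENDENCIES))  # submitted
print(tree["submitted"])   # [('nsubjpass', 'Bills'), ('auxpass', 'were'), ('agent', 'Brownback')]
```

Note that there are no phrase-level nodes here at all: every link connects two individual words.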
Why are dependency parses useful?
Example: multi-document summarization
Need to identify sentences from different documents that each say roughly the same thing:
• Phrase structure trees of paraphrases that differ in word order can be significantly different
• But their dependency representations will be very similar
Parsing
Parsing: assigning correct trees to input strings
Correct tree: a tree that covers all and only the elements of the input and has an S at the top
For now: enumerate all possible trees
• A further task, disambiguation, means choosing the correct tree from among all the possible trees.
Treebanks
Parsed corpora in the form of trees
The Penn Treebank
• The Brown corpus
• The WSJ corpus
Tgrep
http://www.ldc.upenn.edu/ldc/online/treebank/
Tregex
http://www-nlp.stanford.edu/nlp/javadoc/javanlp/
Parsing involves search
As with everything of interest, parsing involves a search, and searching means making choices
We’ll start with some basic (meaning bad) methods before moving on to the one or two that you need to know
For Now
Assume…
• You have all the words already in some buffer
• The input isn’t POS-tagged
• We won’t worry about morphological analysis
• All the words are known
Top-Down Parsing
Since we’re trying to find trees rooted in an S (Sentence), start with the rules that give us an S.
Then work your way down from there to the words.
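The idea can be sketched as a small recursive-descent parser: start from S, expand rules, and only look at the words when a terminal is reached. The grammar fragment below is a hypothetical one in the style of these slides (it deliberately has no left-recursive rules), and the function names are assumptions for this sketch.

```python
# Hypothetical toy grammar; lexical rules rewrite a category as a lower-case word.
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "Nominal"]],
    "Nominal": [["Noun"]],
    "VP": [["Verb"], ["Verb", "NP"]],
    "Det": [["that"], ["the"]],
    "Noun": [["flight"]],
    "Verb": [["book"]],
    "Aux": [["does"]],
}

def expand(symbol, words, pos):
    """Yield (tree, next_pos) for each way `symbol` can cover words[pos:next_pos]."""
    if symbol not in GRAMMAR:                      # terminal: must match the next word
        if pos < len(words) and words[pos] == symbol:
            yield symbol, pos + 1
        return
    for rhs in GRAMMAR[symbol]:                    # try every rule for this symbol
        yield from expand_sequence(symbol, rhs, words, pos, [])

def expand_sequence(label, rhs, words, pos, children):
    """Expand the symbols of one right-hand side from left to right."""
    if not rhs:
        yield (label, children), pos
        return
    for child, nxt in expand(rhs[0], words, pos):
        yield from expand_sequence(label, rhs[1:], words, nxt, children + [child])

def parse(sentence):
    """Return every S tree that covers all and only the words of the sentence."""
    words = sentence.lower().split()
    return [tree for tree, end in expand("S", words, 0) if end == len(words)]

trees = parse("Book that flight")
print(len(trees))  # 1
print(trees[0])
```

The parser first tries S -> NP VP and S -> Aux NP VP, which cannot match a sentence beginning with "book"; those expansions are wasted work that the words alone could have ruled out.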
Top Down Space
(Figure: the top-down search space. S is expanded into NP VP, Aux NP VP, and VP; each of these partial trees is then expanded further, level by level.)
Bottom-Up Parsing
Of course, we also want trees that cover the input words. So start with trees that link up with the words in the right way.
Then work your way up from there.
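A naive version of this can be sketched as an exhaustive search over reductions: wherever the right-hand side of a rule appears, replace it by the left-hand side, and keep going until a single S covers everything. The rules below are a hypothetical fragment (note that "book" can be a Verb or a Noun), and the helper names are assumptions.

```python
# Hypothetical rules as (left-hand side, right-hand side) pairs.
RULES = [
    ("S", ("VP",)),
    ("S", ("NP", "VP")),
    ("VP", ("Verb", "NP")),
    ("VP", ("Verb",)),
    ("NP", ("Det", "Nominal")),
    ("Nominal", ("Noun",)),
    ("Verb", ("book",)),
    ("Noun", ("book",)),
    ("Det", ("that",)),
    ("Noun", ("flight",)),
]

def label(node):
    """A node is either a word (string) or a (category, children) tuple."""
    return node if isinstance(node, str) else node[0]

def bottom_up_parse(words):
    """Exhaustively reduce right-hand sides to left-hand sides, starting from the words."""
    start = tuple(words)
    agenda, seen, parses = [start], {start}, []
    while agenda:
        state = agenda.pop()
        if len(state) == 1 and label(state[0]) == "S":
            parses.append(state[0])
            continue
        labels = tuple(label(node) for node in state)
        for lhs, rhs in RULES:
            for i in range(len(state) - len(rhs) + 1):
                if labels[i:i + len(rhs)] == rhs:
                    new = state[:i] + ((lhs, state[i:i + len(rhs)]),) + state[i + len(rhs):]
                    if new not in seen:
                        seen.add(new)
                        agenda.append(new)
    return parses, len(seen)

parses, explored = bottom_up_parse("book that flight".split())
print(len(parses))  # 1
print(explored)     # number of distinct partial analyses visited
```

Many of the states visited treat "book" as a Noun, or turn "book" alone into an S; they are consistent with the words but can never be part of a complete sentence.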
Bottom-Up Space
(Figure: the bottom-up search space for the input "Book that flight", showing successive layers of alternative partial trees built above the three words.)
Control
Of course, in both cases we left out how to keep track of the search space and how to make choices
• Which node to try to expand next
• Which grammar rule to use to expand a node
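One common way to make these choices explicit is an agenda of search states. In the sketch below (a hypothetical toy grammar and made-up names), the node choice is fixed to "always the leftmost pending symbol", and the agenda discipline decides the order in which rule choices are explored: a stack gives depth-first search, a queue gives breadth-first search.

```python
from collections import deque

# Hypothetical toy grammar with no left-recursive rules.
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "Nominal"]],
    "Nominal": [["Noun"]],
    "VP": [["Verb"], ["Verb", "NP"]],
    "Det": [["that"], ["the"]],
    "Noun": [["flight"]],
    "Verb": [["book"]],
    "Aux": [["does"]],
}

def recognize(words, depth_first=True):
    """Top-down recognition with an explicit agenda of (pending symbols, position) states."""
    agenda = deque([(("S",), 0)])
    expanded = 0
    while agenda:
        symbols, pos = agenda.pop() if depth_first else agenda.popleft()
        expanded += 1
        if not symbols:
            if pos == len(words):
                return True, expanded
            continue
        first, rest = symbols[0], symbols[1:]
        if first in GRAMMAR:
            # Pushed in reverse so that, with a stack, the first-listed rule is popped first.
            for rhs in reversed(GRAMMAR[first]):
                agenda.append((tuple(rhs) + rest, pos))
        elif pos < len(words) and words[pos] == first:
            agenda.append((rest, pos + 1))
    return False, expanded

words = "book that flight".split()
found_dfs, steps_dfs = recognize(words, depth_first=True)
found_bfs, steps_bfs = recognize(words, depth_first=False)
print(found_dfs, steps_dfs)
print(found_bfs, steps_bfs)
```

Both strategies reach the same answer; they differ only in how many states are expanded along the way.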
Top-Down, Depth-First, Left-to-Right Search
Example
(Figures: three successive snapshots of the example parse; of the original diagrams only the word "flight" survives.)
Top-Down and Bottom-Up
Top-down
• Only searches for trees that can be answers (i.e. S’s)
• But also suggests trees that are not consistent with the words
Bottom-up
• Only forms trees consistent with the words
• But suggests trees that make no sense globally
So Combine Them
There are a million ways to combine top-down expectations with bottom-up data to get more efficient searches
Most use one kind as the control and the other as a filter
• As in top-down parsing with bottom-up filtering
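One standard way to realise bottom-up filtering is a left-corner table: for each category, precompute which categories can appear as the first element of one of its derivations, and skip any rule whose first symbol cannot start with the category of the next word. This is a sketch under assumptions: the grammar fragment is hypothetical, and the slides do not say which filtering scheme they have in mind.

```python
# Hypothetical phrase-structure rules; Det, Noun, Verb, Aux, ProperNoun are word categories.
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "Nominal"], ["ProperNoun"]],
    "Nominal": [["Noun"], ["Noun", "Nominal"]],
    "VP": [["Verb"], ["Verb", "NP"]],
}

def left_corners(grammar):
    """Map each non-terminal to every category that can begin one of its derivations."""
    table = {lhs: {rhs[0] for rhs in rules} for lhs, rules in grammar.items()}
    changed = True
    while changed:                      # iterate until no new left corners are found
        changed = False
        for corners in table.values():
            for corner in list(corners):
                extra = table.get(corner, set()) - corners
                if extra:
                    corners |= extra
                    changed = True
    return table

def viable(grammar, table, lhs, categories):
    """Keep only the expansions of `lhs` whose first symbol can start with one of `categories`."""
    return [rhs for rhs in grammar[lhs]
            if rhs[0] in categories or table.get(rhs[0], set()) & categories]

table = left_corners(GRAMMAR)
print(sorted(table["S"]))                      # ['Aux', 'Det', 'NP', 'ProperNoun', 'Verb', 'VP']
print(viable(GRAMMAR, table, "S", {"Aux"}))    # [['Aux', 'NP', 'VP']]
print(viable(GRAMMAR, table, "S", {"Verb"}))   # [['VP']]
```

With the filter in place, a top-down parser facing a sentence that starts with an auxiliary never tries S -> NP VP or S -> VP at all.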
Adding Bottom-Up Filtering
3 problems with the Top-Down, Depth-First, Left-to-Right (TDDFLtR) Parser
Left-Recursion
Ambiguity
Inefficient reparsing of subtrees
Left-Recursion
What happens in the following situation?
• S -> NP VP
• S -> Aux NP VP
• NP -> NP PP
• NP -> Det Nominal
• …
With a sentence starting with:
• Did the flight…
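To see the problem concretely, follow what a depth-first, left-to-right parser does if it always tries the first rule and expands the leftmost symbol. The sketch below uses only the four rules listed above (the elided rules are left out) and two made-up helpers: one traces the leftmost expansions, the other detects which categories are left-recursive.

```python
# Only the four rules shown on the slide; the elided rules are not guessed at.
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"]],
    "NP": [["NP", "PP"], ["Det", "Nominal"]],
}

def leftmost_trace(grammar, start, limit=6):
    """Follow the first rule and the leftmost symbol, stopping after `limit` symbols."""
    trace = [start]
    while trace[-1] in grammar and len(trace) < limit:
        trace.append(grammar[trace[-1]][0][0])
    return trace

def left_recursive(grammar):
    """Return the non-terminals that can reappear as their own leftmost descendant."""
    result = set()
    for start in grammar:
        seen, frontier = set(), [start]
        while frontier:
            symbol = frontier.pop()
            for rhs in grammar.get(symbol, []):
                first = rhs[0]
                if first == start:
                    result.add(start)
                if first not in seen:
                    seen.add(first)
                    frontier.append(first)
    return result

print(leftmost_trace(GRAMMAR, "S"))  # ['S', 'NP', 'NP', 'NP', 'NP', 'NP']
print(left_recursive(GRAMMAR))       # {'NP'}
```

Without the artificial `limit`, the trace would never end: the parser keeps rewriting NP as NP PP without ever consuming a word of the input.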
Ambiguity
“One morning I shot an elephant in my pajamas. How he got into my pajamas I don’t know.”
Groucho Marx
Lots of ambiguity
VP -> VP PP
NP -> NP PP
Show me the meal on flight 286 from SF to Denver
14 parses!
Lots of ambiguity
Church and Patil (1982)
• The number of parses for such sentences grows at the same rate as the number of parenthesizations of arithmetic expressions
• Which grows as the Catalan numbers
$C(n) = \frac{1}{n+1}\binom{2n}{n}$
PPs   Parses
1     2
2     5
3     14
4     42
5     132
6     429
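The closed form for the Catalan numbers, C(n) = (1/(n+1)) * binom(2n, n), can be evaluated directly. This is a minimal sketch; `math.comb` needs Python 3.8 or later.

```python
from math import comb

def catalan(n):
    """n-th Catalan number from the closed form C(n) = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

values = [catalan(n) for n in range(1, 9)]
print(values)  # [1, 2, 5, 14, 42, 132, 429, 1430]
```

The sequence already passes a thousand at n = 8, which is why enumerating every parse quickly becomes impractical.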
Chart Parser: Avoiding Repeated Work
Parsing is hard, and slow. It’s wasteful to redo stuff over and over and over.
A CHART PARSER maintains a CHART – a table of partial parses found so far, to “re-use” if required.
Consider an attempt to top-down parse the following as an NP:
A flight from Indianapolis to Houston on TWA
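The chart idea can be sketched with a CKY-style table indexed by word spans. Note the assumptions: this is a bottom-up chart swapped in for simplicity (the slides do not say which chart algorithm is meant), and the lexicon and binary rules are a hypothetical fragment in which proper nouns count directly as NPs. Each cell is computed once and then re-used by every larger span that needs it.

```python
from collections import defaultdict

# Hypothetical lexicon and binary rules, just enough for the example NP.
LEXICON = {
    "a": "Det", "flight": "Noun",
    "from": "Prep", "to": "Prep", "on": "Prep",
    "indianapolis": "NP", "houston": "NP", "twa": "NP",
}
BINARY_RULES = [
    ("NP", "Det", "Noun"),
    ("NP", "NP", "PP"),
    ("PP", "Prep", "NP"),
]

def build_chart(words):
    """chart[(i, j)][X] = number of distinct X subtrees covering words[i:j]."""
    n = len(words)
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        chart[(i, i + 1)][LEXICON[word]] += 1
    for width in range(2, n + 1):              # fill in shorter spans before longer ones
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):          # every split point of the span
                for parent, left, right in BINARY_RULES:
                    chart[(i, j)][parent] += chart[(i, k)][left] * chart[(k, j)][right]
    return chart

words = "A flight from Indianapolis to Houston on TWA".lower().split()
chart = build_chart(words)
print(chart[(0, 2)]["NP"])           # 1: "a flight" is analysed once and then re-used
print(chart[(0, len(words))]["NP"])  # 5: NP analyses of the whole phrase
```

The entry for "a flight" is built a single time, however many larger analyses contain it, and this left-recursive grammar causes no trouble because the chart is filled from the words upward.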
(Figures: successive stages of the top-down parse of this NP; of the original diagrams only the word "flight" survives.)
Grammars and Parsing
Context-Free Grammars and Constituency
Some common CFG phenomena for English
Basic parsers: Top-down and Bottom-up Parsing
Chart Parser – keep a CHART of partial parses