![Page 1: CS114%Lecture%10% Parsing%and%Par4al%Parsing … › ~cs114 › CS114_slides › CS... · CS114%Lecture%10% Parsing%and%Par4al%Parsing% March%3,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar4n%&%Prof.%Pustejovksy%for%slides%](https://reader036.vdocument.in/reader036/viewer/2022070804/5f036dfe7e708231d4092a0c/html5/thumbnails/1.jpg)
CS114 Lecture 10
Parsing and Partial Parsing
March 3, 2014
Professor Meteer
Thanks to Jurafsky & Martin and Prof. Pustejovsky for slides
3/3/14 Speech and Language Processing - Jurafsky and Martin
Parsing
• Parsing with CFGs refers to the task of assigning proper trees to input strings
• Proper here means a tree that covers all and only the elements of the input and has an S at the top
– It doesn’t actually mean that the system can select the correct tree from among all the possible trees
• Partial parsing returns a set of constituents or small trees
– Does not require a single S at the top
Top-Down and Bottom-Up
• Top-down
– Only searches for trees that can form sentences (rooted at S)
– But also proposes trees that are inconsistent with the words
• Bottom-up
– Only builds trees consistent with the words
– But also builds subtrees that can never combine into a sentence
Top-Down Search
• Sentences: trees rooted with S
• Start with S, expand nodes working downwards
• Try to reach the right set of words
Top Down Space
Bottom-Up Parsing
• Input must be matched exactly – very hard
• Start by combining words into small trees
• Work your way up, combining small trees into larger trees
• Try to get a single tree rooted at S
Bottom-Up Search
Book that flight
[Figure: bottom-up search space for "Book that flight", building Verb, Det, Noun, then Nom, NP, VP, and finally S]
Grammar:
S → NP VP
S → Aux NP VP
S → VP
NP → Det Nom
Nom → Noun
Nom → Noun Nom
Nom → Nom PP
NP → Proper-Noun
VP → Verb
VP → Verb NP
PP → Prep NP
Control
• How to explore the search space?
– Which node to try to expand next
– Which grammar rule to use to expand a node
– A wrong choice leads to a dead end
• One approach is called backtracking.
– Make a choice, if it works out then fine
– If not then back up and make a different choice
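The backtracking idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rules are a subset of the "Book that flight" grammar (the left-recursive rule Nom → Nom PP is left out, because naive top-down expansion would loop on it), and the LEXICON entries and the function names expand and recognize are hypothetical.

```python
# Subset of the slide grammar; Nom -> Nom PP is omitted on purpose,
# since naive top-down expansion never terminates on left recursion.
GRAMMAR = {
    "S": [["NP", "VP"], ["VP"]],
    "NP": [["Det", "Nom"]],
    "Nom": [["Noun", "Nom"], ["Noun"]],
    "VP": [["Verb", "NP"], ["Verb"]],
}
# Hypothetical lexicon: word -> possible parts of speech.
LEXICON = {"book": {"Verb", "Noun"}, "that": {"Det"}, "flight": {"Noun"}}


def expand(symbols, words, pos):
    """Yield every input position reachable after deriving `symbols` from words[pos:]."""
    if not symbols:
        yield pos
        return
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:
        # Choice point: try each rule in turn. When one alternative yields
        # nothing, the loop simply moves on to the next one (backtracking).
        for rhs in GRAMMAR[first]:
            yield from expand(rhs + rest, words, pos)
    elif pos < len(words) and first in LEXICON.get(words[pos], ()):
        # A part of speech matches the next word: consume it and continue.
        yield from expand(rest, words, pos + 1)


def recognize(words):
    """True if some sequence of choices derives exactly the whole input from S."""
    return any(end == len(words) for end in expand(["S"], words, 0))


print(recognize(["book", "that", "flight"]))
print(recognize(["book", "that"]))
```

The first rule tried, S → NP VP, fails on the very first word (Det does not match "book"), so the search backs up and tries S → VP instead.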
Problems
• Backtracking is not good enough
– Ambiguity
– Shared subproblems
Ambiguity
October 2006 csa3180: Parsing Algorithms 1
I shot an elephant in my garden
Reading 1: The elephant is in the garden
[Figure: parse tree in which the PP "in my garden" is part of the object NP]
I shot an elephant in my garden
Reading 2: I fired from the garden
[Figure: parse tree in which the PP "in my garden" attaches to the VP]
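To make the two readings concrete, here is a small sketch that enumerates every parse of the sentence as a bracketed string. The toy grammar and lexicon are assumptions chosen only to expose the PP-attachment ambiguity (they are not the grammar from the earlier slide), and the function names are hypothetical.

```python
from functools import lru_cache

# Hypothetical toy grammar: a PP can attach to an NP or to a VP.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Pron",), ("Det", "Noun"), ("NP", "PP")],
    "VP": [("Verb", "NP"), ("VP", "PP")],
    "PP": [("Prep", "NP")],
}
LEXICON = {
    "I": {"Pron"}, "shot": {"Verb"}, "an": {"Det"}, "elephant": {"Noun"},
    "in": {"Prep"}, "my": {"Det"}, "garden": {"Noun"},
}


def parses(words, start="S"):
    """Return all bracketed trees rooted at `start` that cover exactly `words`."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def trees(symbol, i, j):
        found = []
        if j - i == 1 and symbol in LEXICON.get(words[i], ()):
            found.append(f"({symbol} {words[i]})")
        for rhs in GRAMMAR.get(symbol, ()):
            for children in sequences(rhs, i, j):
                found.append(f"({symbol} {' '.join(children)})")
        return tuple(found)

    def sequences(rhs, i, j):
        """Yield tuples of subtrees, one per symbol in rhs, jointly covering words[i:j]."""
        if len(rhs) == 1:
            for tree in trees(rhs[0], i, j):
                yield (tree,)
            return
        # Every symbol must cover at least one word, so stop early enough
        # to leave one word for each remaining symbol.
        for k in range(i + 1, j - len(rhs) + 2):
            for left in trees(rhs[0], i, k):
                for rest in sequences(rhs[1:], k, j):
                    yield (left,) + rest

    return trees(start, 0, len(words))


for tree in parses("I shot an elephant in my garden".split()):
    print(tree)
```

With this grammar the sentence gets two trees: one containing (NP (NP ...) (PP ...)), the "elephant is in the garden" reading, and one containing (VP (VP ...) (PP ...)), the "I fired from the garden" reading.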
Shared Sub-Problems
• No matter what kind of search we choose (top-down, bottom-up, or mixed):
– Redoing work is wasted effort
– Naïve backtracking means duplicated work
– Dynamic programming avoids this by storing and reusing sub-results
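The effect of sharing sub-problems can be seen by counting how many (category, start, end) sub-problems a parser computes with and without a table. The sketch below is an illustration, not one of the algorithms covered later in the course: it counts parses over a hypothetical toy grammar with only unary and binary rules, and count_parses is a made-up name.

```python
# Hypothetical toy grammar (unary and binary rules only) and lexicon.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Pron",), ("Det", "Noun"), ("NP", "PP")],
    "VP": [("Verb", "NP"), ("VP", "PP")],
    "PP": [("Prep", "NP")],
}
LEXICON = {
    "I": {"Pron"}, "shot": {"Verb"}, "an": {"Det"}, "elephant": {"Noun"},
    "in": {"Prep"}, "my": {"Det"}, "garden": {"Noun"},
}


def count_parses(words, memoize):
    """Return (number of parses of `words` as S, number of sub-problems computed)."""
    table = {}
    computed = 0

    def count(symbol, i, j):
        nonlocal computed
        key = (symbol, i, j)
        if memoize and key in table:
            return table[key]          # reuse a stored sub-result
        computed += 1
        total = 1 if j - i == 1 and symbol in LEXICON.get(words[i], ()) else 0
        for rhs in GRAMMAR.get(symbol, ()):
            if len(rhs) == 1:
                total += count(rhs[0], i, j)
            else:
                left, right = rhs
                for k in range(i + 1, j):  # every split point of words[i:j]
                    total += count(left, i, k) * count(right, k, j)
        table[key] = total
        return total

    return count("S", 0, len(words)), computed


words = "I shot an elephant in my garden".split()
print("naive:   ", count_parses(words, memoize=False))
print("memoized:", count_parses(words, memoize=True))
```

Both runs return the same number of parses, but the memoized run computes each sub-problem once. For example, the PP "in my garden" is needed by both attachments and is recomputed in the naive run.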
Parsing so far
Goals
– Define a language
– A grammar defines a language L over a token set (terminals)
– A parser determines whether a particular string of tokens is in the language
– Determine the constituent structure of a string of tokens (assuming it’s in the language)
– Determine the meaning (we’re not there yet)
Parsing so far
Information
– Categories (nonterminals) of the tokens
– Which adjacent categories combine to form constituents (rules)
– What additional conditions must be met in order for a constituent to be formed (e.g. features, words)
Parsing Types
• Next:
– Chunking
– Partial parsing
• We’ll come back to Parsing as Search
– CFGs
• Top-down, bottom-up
• Earley’s algorithm
– Probabilistic CFGs
– Unification Grammars
Motivation: Parsing is hard
• Example (wsj_0001):
– Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29.
– Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group.
– Pierre/NNP Vinken/NNP ,/, 61/CD years/NNS old/JJ ,/, will/MD join/VB the/DT board/NN as/IN a/DT nonexecutive/JJ director/NN Nov./NNP 29/CD ./.
– Mr./NNP Vinken/NNP is/VBZ chairman/NN of/IN Elsevier/NNP N.V./NNP ,/, the/DT Dutch/NNP publishing/VBG group/NN ./.
Parsing Problem
• Given grammar G and sentence A, discover all valid parse trees for G that exactly cover A
[Figure: parse tree for "book that flight": (S (VP (V book) (NP (Det that) (Nom (N flight)))))]
Slide modified from Steven Bird's
Shallow (Chunk) Parsing
Goal: divide a sentence into a sequence of chunks.
• Chunks are non-overlapping regions of a text
[I] saw [a tall man] in [the park].
• Chunks are non-recursive
– Cannot contain other chunks
• Chunks are non-exhaustive
– Not all words are included in chunks
Chunk Parsing Examples
• Noun-phrase chunking:
[I] saw [a tall man] in [the park].
• Verb-phrase chunking:
The man who [was in the park] [saw me].
• Question answering:
– What [Spanish explorer] discovered [the Mississippi River]?
Shallow Parsing: Motivation
• Locating information
– e.g. index a document collection on its noun phrases
• Ignoring information
– Generalize in order to study higher-level patterns
• e.g. phrases involving “gave” in the Penn Treebank:
– gave NP; gave up NP in NP; gave NP up; gave NP help; gave NP to NP
– Sometimes a full parse has too much structure
Representation
• BIO (or IOB)
• Trees
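As a rough illustration of the BIO (IOB) representation: each token gets B-X if it begins a chunk of type X, I-X if it continues one, and O if it is outside any chunk. The sketch below converts between a bracketed chunk structure and BIO rows. The data layout (a chunk is a (label, token list) pair, a token is a (word, tag) pair), the function names, and the POS tags on the example sentence are assumptions made for this example.

```python
def chunks_to_iob(chunked):
    """Flatten a chunked sentence into (word, tag, iob) rows."""
    rows = []
    for item in chunked:
        if isinstance(item[1], list):            # a chunk: (label, [(word, tag), ...])
            label, tokens = item
            for n, (word, tag) in enumerate(tokens):
                rows.append((word, tag, ("B-" if n == 0 else "I-") + label))
        else:                                    # a token outside any chunk
            word, tag = item
            rows.append((word, tag, "O"))
    return rows


def iob_to_chunks(rows):
    """Rebuild the chunked sentence from (word, tag, iob) rows."""
    chunked, open_label = [], None
    for word, tag, iob in rows:
        if iob == "O":
            chunked.append((word, tag))
            open_label = None
            continue
        prefix, label = iob.split("-", 1)
        if prefix == "B" or label != open_label:
            chunked.append((label, [(word, tag)]))   # start a new chunk
        else:
            chunked[-1][1].append((word, tag))       # continue the open chunk
        open_label = label
    return chunked


# [I] saw [a tall man] in [the park]  (POS tags assumed for illustration)
sentence = [
    ("NP", [("I", "PRP")]), ("saw", "VBD"),
    ("NP", [("a", "DT"), ("tall", "JJ"), ("man", "NN")]), ("in", "IN"),
    ("NP", [("the", "DT"), ("park", "NN")]),
]
for row in chunks_to_iob(sentence):
    print(*row)
```

Because chunks are non-overlapping and non-recursive, one tag per token is enough to encode them, which is what makes this flat representation possible.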
Comparison with Full Syntactic Parsing
• Parsing is often an intermediate stage
– Later stages draw on the parse for their own purposes
• Full parsing is sufficient but often not necessary
– It often gives more information than we need
• Shallow parsing is an easier problem
– Less structure, no recursion
Chunks and Constituency
Constituents: [[a tall man] [in [the park]]].
Chunks: [a tall man] in [the park].
• A constituent is part of some higher unit in the hierarchical syntactic parse
• Chunks are not constituents
– Constituents are recursive
– Chunks do not cross major constituent boundaries (why?)
Chunk Parsing in NLTK
• Chunk parsers usually ignore lexical content
– Only need to look at part-of-speech tags
• Possible steps in chunk parsing
– Chunking, unchunking
– Chinking
– Merging, splitting
Chunking
• Define a regular expression that matches the sequences of tags in a chunk
A simple noun phrase chunk regexp:
<DT>? <JJ>* <NN.*>
(Note that <NN.*> matches any tag starting with NN)
• Chunk all matching subsequences:
the/DT little/JJ cat/NN sat/VBD on/IN the/DT mat/NN
[the/DT little/JJ cat/NN] sat/VBD on/IN [the/DT mat/NN]
• If matching subsequences overlap, the first one gets priority
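The mechanics of tag-pattern chunking can be sketched with the standard re module alone. This is not NLTK's implementation, only an approximation of the idea: the tag sequence is written out as one string such as <DT><JJ><NN>, the tag pattern is turned into an ordinary regular expression over that string, and each match becomes a chunk. The helper names are hypothetical, and tags containing regex metacharacters (for example PRP$) are not handled.

```python
import re


def compile_tag_pattern(pattern):
    """Turn a tag pattern such as '<DT>?<JJ>*<NN.*>' into a regex over '<TAG><TAG>...'."""
    def one_tag(match):
        # Inside <...>, '.' must not run across a tag boundary.
        return "(?:<(?:" + match.group(1).replace(".", "[^<>]") + ")>)"
    return re.compile(re.sub(r"<([^<>]+)>", one_tag, pattern.replace(" ", "")))


def chunk(tagged, pattern, label="NP"):
    """Group every non-overlapping match of `pattern` into a (label, tokens) chunk."""
    regex = compile_tag_pattern(pattern)
    tag_string = "".join(f"<{tag}>" for _, tag in tagged)
    # Map the character offset where each token's "<TAG>" starts to its index.
    starts, offset = {}, 0
    for index, (_, tag) in enumerate(tagged):
        starts[offset] = index
        offset += len(tag) + 2
    starts[offset] = len(tagged)

    result, pos = [], 0
    # finditer scans left to right, so when candidate matches overlap
    # the leftmost one wins.
    for match in regex.finditer(tag_string):
        if match.end() == match.start():
            continue                      # ignore empty matches
        begin, end = starts[match.start()], starts[match.end()]
        result.extend(tagged[pos:begin])
        result.append((label, tagged[begin:end]))
        pos = end
    result.extend(tagged[pos:])
    return result


sentence = [("the", "DT"), ("little", "JJ"), ("cat", "NN"), ("sat", "VBD"),
            ("on", "IN"), ("the", "DT"), ("mat", "NN")]
print(chunk(sentence, "<DT>? <JJ>* <NN.*>"))
```

On the example sentence this groups "the little cat" and "the mat" into NP chunks and leaves "sat" and "on" outside.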
Unchunking
• Remove any chunk with a given pattern
– e.g., UnChunkRule('<NN|DT>+', 'Unchunk NNDT')
– Combine with the chunk rule <NN|DT|JJ>+
• Chunk all matching subsequences:
– Input:
the/DT little/JJ cat/NN sat/VBD on/IN the/DT mat/NN
– Apply chunk rule
[the/DT little/JJ cat/NN] sat/VBD on/IN [the/DT mat/NN]
– Apply unchunk rule
[the/DT little/JJ cat/NN] sat/VBD on/IN the/DT mat/NN
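The unchunk step can be sketched as follows, again in plain Python rather than with NLTK itself. A chunk is dissolved only when its entire tag sequence matches the pattern, which is why [the little cat] survives (it contains a JJ) while [the mat] does not. The data layout and the function names tags_match and unchunk are assumptions for this example.

```python
import re


def tags_match(tokens, pattern):
    """True if the whole tag sequence of `tokens` matches the tag pattern."""
    regex = re.sub(
        r"<([^<>]+)>",
        lambda m: "(?:<(?:" + m.group(1).replace(".", "[^<>]") + ")>)",
        pattern.replace(" ", ""),
    )
    tag_string = "".join(f"<{tag}>" for _, tag in tokens)
    return re.fullmatch(regex, tag_string) is not None


def unchunk(chunked, pattern):
    """Dissolve every chunk whose tags match `pattern`; keep everything else."""
    result = []
    for item in chunked:
        if isinstance(item[1], list) and tags_match(item[1], pattern):
            result.extend(item[1])       # put the tokens back unchunked
        else:
            result.append(item)
    return result


# Output of the chunk rule <NN|DT|JJ>+ on the example sentence.
chunked = [
    ("NP", [("the", "DT"), ("little", "JJ"), ("cat", "NN")]),
    ("sat", "VBD"), ("on", "IN"),
    ("NP", [("the", "DT"), ("mat", "NN")]),
]
print(unchunk(chunked, "<NN|DT>+"))
```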
Chinking
• A chink is a subsequence of the text that is not a chunk.
• Define a regular expression that matches the sequences of tags in a chink
A simple chink regexp for finding NP chunks:
(<VB.?>|<IN>)+
• First apply chunk rule to chunk everything
– Input:
the/DT little/JJ cat/NN sat/VBD on/IN the/DT mat/NN
– ChunkRule('<.*>+', 'Chunk everything')
[the/DT little/JJ cat/NN sat/VBD on/IN the/DT mat/NN]
– Apply Chink rule above:
[the/DT little/JJ cat/NN] sat/VBD on/IN [the/DT mat/NN]
(chunk: the little cat; chink: sat on; chunk: the mat)
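Chinking can be sketched the same way, as a plain-Python approximation rather than NLTK's own code: conceptually the whole sentence starts as one chunk, every token covered by the chink pattern is removed from it, and the remaining runs of tokens are the chunks. The function name chink and the data layout are assumptions for this example.

```python
import re


def chink(tagged, chink_pattern, label="NP"):
    """Chunk everything, then cut out the tokens matched by `chink_pattern`."""
    regex = re.compile(re.sub(
        r"<([^<>]+)>",
        lambda m: "(?:<(?:" + m.group(1).replace(".", "[^<>]") + ")>)",
        chink_pattern.replace(" ", ""),
    ))
    tag_string = "".join(f"<{tag}>" for _, tag in tagged)
    # Map the character offset where each token's "<TAG>" starts to its index.
    starts, offset = {}, 0
    for index, (_, tag) in enumerate(tagged):
        starts[offset] = index
        offset += len(tag) + 2
    starts[offset] = len(tagged)

    # Indices of all tokens that fall inside a chink.
    chinked = set()
    for match in regex.finditer(tag_string):
        chinked.update(range(starts[match.start()], starts[match.end()]))

    result = []
    for index, token in enumerate(tagged):
        if index in chinked:
            result.append(token)                     # stays outside any chunk
        elif result and isinstance(result[-1][1], list):
            result[-1][1].append(token)              # extend the current chunk
        else:
            result.append((label, [token]))          # start a new chunk
    return result


sentence = [("the", "DT"), ("little", "JJ"), ("cat", "NN"), ("sat", "VBD"),
            ("on", "IN"), ("the", "DT"), ("mat", "NN")]
print(chink(sentence, "(<VB.?>|<IN>)+"))
```

Chinking is convenient when it is easier to describe what does not belong in a chunk (here verbs and prepositions) than what does.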
Merging
• Combine adjacent chunks into a single chunk
– Define a regular expression that matches the sequences of tags on both sides of the point to be merged
• Example:
– Merge a chunk ending in JJ with a chunk starting with NN
MergeRule('<JJ>', '<NN>', 'Merge adjs and nouns')
[the/DT little/JJ] [cat/NN] sat/VBD on/IN the/DT mat/NN
[the/DT little/JJ cat/NN] sat/VBD on/IN the/DT mat/NN
• Splitting is the opposite of merging
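A minimal sketch of the merge step, in plain Python: two adjacent chunks are joined when the left one ends with a given tag and the right one starts with another. As a simplification, this version compares single tags for equality instead of using full tag patterns, and the function name merge is hypothetical.

```python
def merge(chunked, left_tag, right_tag):
    """Join adjacent chunks where the left ends in `left_tag` and the right starts with `right_tag`."""
    result = []
    for item in chunked:
        previous = result[-1] if result else None
        if (previous is not None
                and isinstance(previous[1], list)       # previous item is a chunk
                and isinstance(item[1], list)           # current item is a chunk
                and previous[1][-1][1] == left_tag      # tag of the last token on the left
                and item[1][0][1] == right_tag):        # tag of the first token on the right
            label, tokens = previous
            result[-1] = (label, tokens + item[1])
        else:
            result.append(item)
    return result


chunked = [
    ("NP", [("the", "DT"), ("little", "JJ")]),
    ("NP", [("cat", "NN")]),
    ("sat", "VBD"), ("on", "IN"), ("the", "DT"), ("mat", "NN"),
]
print(merge(chunked, "JJ", "NN"))
```

A split step would do the reverse: look for a position inside a chunk where the tags on either side match two given patterns, and break the chunk there.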
Cascaded Chunking
• Goal: create chunks that include other chunks
• Examples:
– PP consists of preposition + NP
– VP consists of verb followed by PPs or NPs
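A cascade can be sketched by running one chunking pass per level, where each later pattern may refer to chunk labels produced by earlier passes. The code below is a plain-Python approximation with hypothetical names; the three patterns in CASCADE are assumptions chosen to mirror the PP and VP examples above.

```python
import re


def label_of(node):
    """POS tag of a (word, tag) token, or label of a (label, children) chunk."""
    return node[0] if isinstance(node[1], list) else node[1]


def chunk_level(nodes, pattern, label):
    """One chunking pass over a sequence of tokens and already-built chunks."""
    regex = re.compile(re.sub(
        r"<([^<>]+)>",
        lambda m: "(?:<(?:" + m.group(1).replace(".", "[^<>]") + ")>)",
        pattern.replace(" ", ""),
    ))
    labels = [label_of(node) for node in nodes]
    string = "".join(f"<{name}>" for name in labels)
    # Map the character offset where each "<LABEL>" starts to its node index.
    starts, offset = {}, 0
    for index, name in enumerate(labels):
        starts[offset] = index
        offset += len(name) + 2
    starts[offset] = len(nodes)

    result, pos = [], 0
    for match in regex.finditer(string):
        if match.end() == match.start():
            continue
        begin, end = starts[match.start()], starts[match.end()]
        result.extend(nodes[pos:begin])
        result.append((label, nodes[begin:end]))
        pos = end
    result.extend(nodes[pos:])
    return result


# Applied in order: NPs first, then PPs built from NPs, then VPs built from both.
CASCADE = [
    ("NP", "<DT>?<JJ>*<NN.*>"),
    ("PP", "<IN><NP>"),
    ("VP", "<VB.*>(<PP>|<NP>)+"),
]


def cascade(tagged):
    nodes = list(tagged)
    for label, pattern in CASCADE:
        nodes = chunk_level(nodes, pattern, label)
    return nodes


sentence = [("the", "DT"), ("little", "JJ"), ("cat", "NN"), ("sat", "VBD"),
            ("on", "IN"), ("the", "DT"), ("mat", "NN")]
print(cascade(sentence))
```

Each pass is still a flat, non-recursive match; nesting arises only because later passes treat earlier chunks as single units.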
Cascaded Chunking
Next Assignment
– Use the NLTK chunker: write chunking rules and apply them to the Treebank data.
– Evaluate your performance on the "tagged" treebank data (which is already chunked).
– There are 200 sentences. Develop the grammar using the first 150 sentences and test on the last 50 sentences.
Example
• Write rules using POS to produce the chunks
• Section 7.3, NLTK book
• Evaluate performance
• Definitely > 70% F-measure, shoot for > 90%
[ A/DT form/NN ]
of/IN
[ asbestos/NN ]
once/RB used/VBN to/TO make/VB
[ Kent/NNP ]
[ cigarette/NN filters/NNS ]
has/VBZ caused/VBN
[ a/DT high/JJ percentage/NN ]
of/IN
[ cancer/NN deaths/NNS ]
…..
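One way to compute the F-measure for chunking is to compare the sets of chunk spans in the gold data and in the chunker output: precision is the fraction of predicted chunks that are correct, recall is the fraction of gold chunks that were found, and F is their harmonic mean. The sketch below is a plain-Python illustration with hypothetical function names; the "predicted" sentence is made-up chunker output, not a real result.

```python
def chunk_spans(chunked):
    """Set of (start, end, label) token spans, one per chunk in the sentence."""
    spans, position = set(), 0
    for item in chunked:
        if isinstance(item[1], list):            # a chunk: (label, [(word, tag), ...])
            spans.add((position, position + len(item[1]), item[0]))
            position += len(item[1])
        else:                                    # a single unchunked token
            position += 1
    return spans


def evaluate(gold_sentences, predicted_sentences):
    """Return (precision, recall, F-measure) over exact chunk matches."""
    correct = guessed = actual = 0
    for gold, predicted in zip(gold_sentences, predicted_sentences):
        gold_spans, predicted_spans = chunk_spans(gold), chunk_spans(predicted)
        correct += len(gold_spans & predicted_spans)
        guessed += len(predicted_spans)
        actual += len(gold_spans)
    precision = correct / guessed if guessed else 0.0
    recall = correct / actual if actual else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Gold chunks for the start of the example sentence above.
gold = [
    ("NP", [("A", "DT"), ("form", "NN")]), ("of", "IN"),
    ("NP", [("asbestos", "NN")]),
    ("once", "RB"), ("used", "VBN"), ("to", "TO"), ("make", "VB"),
    ("NP", [("Kent", "NNP")]),
    ("NP", [("cigarette", "NN"), ("filters", "NNS")]),
]
# Made-up chunker output that wrongly joins the last two chunks.
predicted = gold[:7] + [
    ("NP", [("Kent", "NNP"), ("cigarette", "NN"), ("filters", "NNS")]),
]
print(evaluate([gold], [predicted]))
```

A chunk counts as correct only if both of its boundaries and its label match, so the joined chunk in the example costs one predicted chunk and two gold chunks. For the assignment, the rules would be developed on the first 150 sentences and this kind of score reported on the last 50.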