Unsupervised partial parsing: thesis defense
DESCRIPTION
Thesis defense slides covering my computational linguistics research in unsupervised parsing
TRANSCRIPT
![Page 1: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/1.jpg)
Unsupervised Partial Parsing
Elias Ponvert
Department of Linguistics
The University of Texas at Austin
Dissertation Defense
July 27, 2011
Elias Ponvert (UT Austin) Unsupervised Partial Parsing Dissertation Defense 1 / 62
![Page 2: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/2.jpg)
1 Goals and contributions
2 Unsupervised partial parsing
  Main results
  Discussion
3 Cascaded parsing
  Main results
  Discussion
4 Concluding remarks
![Page 3: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/3.jpg)
Research goals
Generally: Develop computational models to learn human language
Hello!
![Page 4: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/4.jpg)
Research goals
Specifically: Learn to predict constituent structure from raw text
the cat saw the red dog run
⇓
![Page 5: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/5.jpg)
Why unsupervised parsing?
1 Less reliance on annotated training
Hello!
2 Apply to new languages and domains
Særær manannær man
mæþæn
![Page 6: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/6.jpg)
Assumptions made in parser learning
[Figure: labeled parse tree]
(S (PP (P on) (NP (N Sunday))) (, ,) (NP (Det the) (A brown) (N bear)) (VP (V sleeps)))
Getting these labels right AS WELL AS the structure of the tree is hard
![Page 7: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/7.jpg)
Assumptions made in parser learning
[Figure: tree over POS-tagged words, without phrase labels]
on/P Sunday/N ,/, the/Det brown/A bear/N sleeps/V
So the task is to identify the structure alone
![Page 8: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/8.jpg)
Assumptions made in parser learning
Learning operates from gold-standard parts-of-speech (POS) rather than raw text
on Sunday , the brown bear sleeps
P N , Det A N V
Klein & Manning 2003 CCM
Bod 2006a, 2006b
Klein & Manning 2005 DMV
Successors to DMV: Smith 2006, Smith & Cohen 2009, Headden et al 2009, Spitkovsky et al 2010ab, &c
J. Gao et al 2003, 2004
Seginer 2007
this work
![Page 9: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/9.jpg)
Unsupervised parsing: desiderata
Raw text
Standard NLP / extensible
Scalable and fast
![Page 10: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/10.jpg)
Contributions
• Unsupervised parsing satisfying these desiderata is possible
• Unsupervised partial parsing: predicting local constituents with high accuracy
• Cascaded models: building constituent structure bottom up
![Page 11: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/11.jpg)
Outline
1 Goals and contributions
2 Unsupervised partial parsing
  Main results
  Discussion
3 Cascaded parsing
  Main results
  Discussion
4 Concluding remarks
![Page 12: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/12.jpg)
A new approach: start from the bottom
Unsupervised Partial Parsing = segmentation of (non-overlapping) multiword constituents
![Page 13: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/13.jpg)
Unsupervised segmentation of constituents leaves some room for interpretation
Possible segmentations
• ( the cat ) in ( the hat ) knows ( a lot ) about that
• ( the cat ) ( in the hat ) knows ( a lot ) ( about that )
• ( the cat in the hat ) knows ( a lot about that )
• ( the cat in the hat ) ( knows a lot about that )
• ( the cat in the hat ) ( knows a lot ) ( about that )
![Page 14: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/14.jpg)
Defining UPP by evaluation
1. Constituent chunks: non-hierarchical multiword constituents
(S (NP (D The) (N Cat) (PP (P in) (NP (D the) (N hat)))) (VP (V knows) (NP (D a) (N lot)) (PP (P about) (NP (N that)))))
![Page 15: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/15.jpg)
Defining UPP by evaluation
2. Base NPs: non-recursive noun phrases
(S (NP (D The) (N Cat) (PP (P in) (NP (D the) (N hat)))) (VP (V knows) (NP (D a) (N lot)) (PP (P about) (NP (N that)))))
![Page 16: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/16.jpg)
Multilingual data for direct evaluation
Corpus  Treebank               Language  Sentences  Types  Tokens
WSJ     Penn Treebank          English   49K        44K    1M
Negra   Negra German Corpus    German    21K        49K    300K
CTB     Penn Chinese Treebank  Chinese   19K        37K    430K
![Page 17: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/17.jpg)
Constituent chunks and NPs in the data
              WSJ   Negra  CTB
Chunks        203K  59K    92K
NPs           172K  33K    56K
Chunks ∩ NPs  161K  23K    43K
![Page 18: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/18.jpg)
The benchmark: CCL parser
[Figure: the sentence "the cat saw the red dog run" shown in the Common Cover Links representation and as a constituency tree]
Seginer (2007 ACL; 2007 PhD UvA)
![Page 19: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/19.jpg)
Hypothesis
Segmentation can be learned by generalizing on phrasal boundaries
![Page 20: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/20.jpg)
UPP as a tagging problem
the/B cat/I in/O the/B hat/I
B Beginning of a constituent
I Inside a constituent
O Not inside a constituent
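The B/I/O encoding on this slide can be turned back into chunk spans mechanically. The following is a minimal sketch, not the thesis code: the function name `bio_to_chunks` is hypothetical, and it assumes that only runs of length two or more count as chunks, following the "multiword constituents" definition given earlier in the talk.

```python
from typing import List, Tuple

def bio_to_chunks(tags: List[str]) -> List[Tuple[int, int]]:
    """Return (start, end) spans, end exclusive, for each B I* run of length >= 2."""
    chunks = []
    start = None  # index of the currently open B, if any
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing chunk
        if tag == "I" and start is not None:
            continue  # still inside the open chunk
        if start is not None and i - start >= 2:
            chunks.append((start, i))  # keep multiword chunks only
        start = i if tag == "B" else None
    return chunks

words = "the cat in the hat".split()
tags = ["B", "I", "O", "B", "I"]
spans = bio_to_chunks(tags)
chunks = [" ".join(words[s:e]) for s, e in spans]
print(spans, chunks)
```

On the slide's example this recovers the two chunks "the cat" and "the hat"; a lone B with no following I is silently dropped under the multiword assumption.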
![Page 21: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/21.jpg)
Learning from boundaries
#/STOP the/B cat/I in/O the/B hat/I #/STOP
![Page 22: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/22.jpg)
Unsupervised learning tag model for UPP
[Figure: lattice of candidate B, I and O tags over "# the cat in the hat #", with each # tagged STOP]
![Page 28: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/28.jpg)
Decoding the tag model for UPP
[Figure: best tag path through the lattice: #/STOP the/B cat/I in/O the/B hat/I #/STOP]
![Page 30: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/30.jpg)
Learning from punctuation
#/STOP on/B sunday/I ,/STOP the/B brown/I bear/I sleeps/O #/STOP
![Page 31: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/31.jpg)
UPP: Models
Hidden Markov Model: P(the, I | B) ≈ P(the | B) · P(I | B)
Probabilistic right linear grammar: P(B → the I) = P(the | B) · P(I | B, the)
[Figure: tag sequence B I O B I over "the cat in the hat" for each model]
Learning: expectation maximization (EM) via forward-backward (run to convergence)
![Page 32: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/32.jpg)
UPP: Models
Decoding: Viterbi
Smoothing: additive smoothing on emissions
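Additive smoothing on emissions can be illustrated as follows. This is a sketch under stated assumptions: the smoothing constant `lam`, the toy vocabulary and the counts are hypothetical, and the slides do not say which constant was used. In EM the counts would be expected (fractional) counts rather than integers.

```python
from collections import Counter

def smoothed_emissions(counts: Counter, vocab: set, lam: float = 0.1) -> dict:
    """P(w | tag) = (count(w, tag) + lam) / (count(tag) + lam * |V|)."""
    total = sum(counts[w] for w in vocab)
    denom = total + lam * len(vocab)
    return {w: (counts[w] + lam) / denom for w in vocab}

vocab = {"the", "cat", "in", "hat"}
b_counts = Counter({"the": 2})  # toy counts for words emitted by tag B
p = smoothed_emissions(b_counts, vocab, lam=0.1)
print(sorted(p.items()))
```

Every word in the vocabulary keeps a small nonzero probability under each tag, so a word never seen with a tag does not zero out a whole path during decoding.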
![Page 33: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/33.jpg)
UPP: Constraints on sequences
#/STOP the/B cat/I in/O the/B hat/I #/STOP
[Figure: diagram of the permitted transitions among the tags STOP, B, I and O; one arc is labeled 1]
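Viterbi decoding under sequence constraints might look like the sketch below. The `ALLOWED` table is an assumption read off the "multiword constituents" definition (B must be followed by I, and I may not follow O or STOP), not a transcription of the slide's diagram, and all probabilities are hypothetical toy values.

```python
import math

STATES = ["B", "I", "O"]
# Assumed constraints: B must be followed by I; I may not follow O or STOP.
ALLOWED = {
    "STOP": {"B", "O", "STOP"},
    "B": {"I"},
    "I": {"B", "I", "O", "STOP"},
    "O": {"B", "O", "STOP"},
}

def viterbi(words, emit, trans):
    """Best B/I/O path for words delimited by STOP on both sides."""
    def t(p, q):  # log transition score, -inf if the transition is disallowed
        ok = q in ALLOWED[p] and trans[p].get(q, 0) > 0
        return math.log(trans[p][q]) if ok else -math.inf
    def e(q, w):  # log emission score with a small floor for unseen words
        return math.log(emit[q].get(w, 1e-6))
    # best[i][q] = (log score of the best path ending in q at word i, backpointer)
    best = [{q: (t("STOP", q) + e(q, words[0]), None) for q in STATES}]
    for w in words[1:]:
        col = {}
        for q in STATES:
            score, prev = max((best[-1][p][0] + t(p, q), p) for p in STATES)
            col[q] = (score + e(q, w), prev)
        best.append(col)
    # close the sentence with a transition into STOP
    _, last = max((best[-1][q][0] + t(q, "STOP"), q) for q in STATES)
    path = [last]
    for col in reversed(best[1:]):
        path.append(col[path[-1]][1])
    return list(reversed(path))

trans = {
    "STOP": {"B": 0.5, "O": 0.5},
    "B": {"I": 1.0},
    "I": {"B": 0.25, "I": 0.25, "O": 0.25, "STOP": 0.25},
    "O": {"B": 1 / 3, "O": 1 / 3, "STOP": 1 / 3},
}
emit = {
    "B": {"the": 0.8, "in": 0.1, "cat": 0.05, "hat": 0.05},
    "I": {"cat": 0.45, "hat": 0.45, "the": 0.05, "in": 0.05},
    "O": {"in": 0.7, "the": 0.1, "cat": 0.1, "hat": 0.1},
}
path = viterbi("the cat in the hat".split(), emit, trans)
print(path)
```

Disallowed transitions are scored as negative infinity, so a path ending in B or starting in I can never win regardless of the emission scores.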
![Page 34: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/34.jpg)
UPP evaluation: Setup
• Evaluation by comparison to treebank data
• Standard train / development / test splits
• Precision and recall on matched constituents
• Benchmark: CCL
• Both get tokenization, punctuation, sentence boundaries
![Page 35: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/35.jpg)
UPP evaluation: Chunking (F-score)
[Bar chart: chunking F-score (axis 0 to 80) on CTB, Negra and WSJ for CCL∗, the HMM chunker and the PRLG chunker]
∗ CCL non-hierarchical constituents: first-level parsing output
![Page 36: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/36.jpg)
UPP evaluation: Base NPs (F-score)
[Bar chart: base NP F-score (axis 0 to 80) on CTB, Negra and WSJ for CCL∗, the HMM chunker and the PRLG chunker]
∗ CCL non-hierarchical constituents: first-level parsing output
![Page 37: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/37.jpg)
PRLG example output
(the seeds) already are in (the script)
(little chance) that (shane longman) is going to recoup today
it would have (severe implications) for (farmers ’ policy) holders
(thames ’s u.s. marketing agent) (donald taffner) is preparing to do just that
and all (the while) (the bonds) are in (the baby ’s diaper)
(mr. rustin) is (senior correspondent) in (the journal ’s london bureau)
![Page 38: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/38.jpg)
UPP: Review
• Sequence models can generalize on indicators for phrasal boundaries
• Leads to improved unsupervised segmentation
• Learn to predict NPs with high accuracy
• (English and German especially)
![Page 39: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/39.jpg)
Outline
1 Goals and contributions
2 Unsupervised partial parsing
  Main results
  Discussion
3 Cascaded parsing
  Main results
  Discussion
4 Concluding remarks
![Page 40: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/40.jpg)
Question
How do UPP models capture noun phrase structure?
![Page 41: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/41.jpg)
What UPP models learn
B, 100 · P(w|B): the 21.0; a 8.7; to 6.5; ’s 2.8; in 1.9; mr. 1.8; its 1.6; of 1.4; an 1.4; and 1.4
I, 100 · P(w|I): % 1.8; million 1.6; be 1.3; company 0.9; year 0.8; market 0.7; billion 0.6; share 0.5; new 0.5; than 0.5
O, 100 · P(w|O): of 5.8; and 4.0; in 3.7; that 2.2; to 2.1; for 2.0; is 2.0; it 1.7; said 1.7; on 1.5
HMM Emissions: WSJ
![Page 42: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/42.jpg)
What UPP models learn
B, 100 · P(w|B): der (the) 13.0; die (the) 12.2; den (the) 4.4; und (and) 3.3; im (in) 3.2; das (the) 2.9; des (the) 2.7; dem (the) 2.4; eine (a) 2.1; ein (a) 2.0
I, 100 · P(w|I): uhr (o’clock) 0.8; juni (June) 0.6; jahren (years) 0.4; prozent (percent) 0.4; mark (currency) 0.3; stadt (city) 0.3; 000 0.3; millionen (millions) 0.3; jahre (year) 0.3; frankfurter (Frankfurt) 0.3
O, 100 · P(w|O): in (in) 3.4; und (and) 2.7; mit (with) 1.7; fur (for) 1.6; auf (on) 1.5; zu (to) 1.4; von (of) 1.3; sich (oneself) 1.3; ist (is) 1.3; nicht (not) 1.2
HMM Emissions: Negra
![Page 43: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/43.jpg)
What UPP models learn
B, 100 · P(w|B): 的 (de, of) 14.3; 一 (one) 3.1; 和 (and) 1.1; 两 (two) 0.9; 这 (this) 0.8; 有 (have) 0.8; 经济 (economy) 0.7; 各 (each) 0.7; 全 (all) 0.7; 不 (no) 0.6
I, 100 · P(w|I): 的 (de) 3.9; 了 (perf. asp.) 2.2; 个 (ge, measure) 1.5; 年 (year) 1.3; 说 (say) 1.0; 中 (middle) 0.9; 上 (on, above) 0.9; 人 (person) 0.7; 大 (big) 0.7; 国 (country) 0.6
O, 100 · P(w|O): 在 (at, in) 3.4; 是 (is) 2.4; 中国 (China) 1.4; 也 (also) 1.2; 不 (no) 1.2; 对 (pair) 1.1; 和 (and) 1.0; 的 (de) 1.0; 将 (fut. tns.) 1.0; 有 (have) 1.0
HMM Emissions: CTB
![Page 44: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/44.jpg)
Question
What about the PRLG, why does it do so much better than the HMM?
![Page 45: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/45.jpg)
Question
Hidden Markov Model: P(the, I | B) ≈ P(the | B) · P(I | B)
Probabilistic right linear grammar: P(B → the I) = P(the | B) · P(I | B, the)
![Page 46: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/46.jpg)
What’s wrong with this picture?
B, 100 · P(w|B): the 21.0; a 8.7; to 6.5; ’s 2.8; in 1.9; mr. 1.8; its 1.6; of 1.4; an 1.4; and 1.4
I, 100 · P(w|I): % 1.8; million 1.6; be 1.3; company 0.9; year 0.8; market 0.7; billion 0.6; share 0.5; new 0.5; than 0.5
O, 100 · P(w|O): of 5.8; and 4.0; in 3.7; that 2.2; to 2.1; for 2.0; is 2.0; it 1.7; said 1.7; on 1.5
• ’s occurs (immediately) before several terms that appear after B
![Page 48: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/48.jpg)
PRLG rule probabilities
B, 100 · P(B → w q): B → the I 28.2; B → a I 11.7; B → mr. I 2.4; B → its I 2.2; B → an I 1.9; B → his I 1.0; B → this I 1.0; B → their I 1.0; B → some I 0.7; B → new I 0.6
I, 100 · P(I → w q): I → ’s I 2.6; I → and I 1.3; I → % O 1.1; I → million O 0.6; I → new I 0.5; I → million STOP 0.5; I → company O 0.5; I → year O 0.4; I → & I 0.4; I → million I 0.4
O, 100 · P(O → w q): O → of B 3.8; O → to O 3.6; O → in B 2.5; O → and O 1.7; O → to B 1.7; O → of O 1.6; O → in O 1.5; O → and B 1.4; O → for B 1.3; O → it O 1.3
![Page 53: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/53.jpg)
Learning curves: Base NPs
[Figure: base NP F-score plotted against training sentences (10K to 40K) and EM iterations (0 to 100)]
PRLG chunking model: WSJ
![Page 54: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/54.jpg)
Learning curves: Base NPs
[Figure: base NP F-score plotted against training sentences (5K to 15K) and EM iterations (0 to 150)]
PRLG chunking model: Negra
![Page 55: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/55.jpg)
Learning curves: Base NPs
[Figure: base NP F-score plotted against training sentences (5K to 15K) and EM iterations (0 to 100)]
PRLG chunking model: CTB
![Page 56: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/56.jpg)
Question
How much can these models learn?
![Page 57: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/57.jpg)
Against a supervised benchmark
[Plot: base NP F-score against number of WSJ sentences (∼4500 to 40K) for Supervised PRLG and Unsupervised PRLG]
![Page 58: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/58.jpg)
Against a supervised benchmark
[Plot: base NP F-score against number of Negra sentences (∼2200 to 15K) for Supervised PRLG and Unsupervised PRLG]
![Page 59: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/59.jpg)
Against a supervised benchmark
[Plot: base NP F-score against number of CTB sentences (5K to 15K) for Supervised PRLG and Unsupervised PRLG]
![Page 60: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/60.jpg)
Negra/CTB training much smaller than WSJ
[Plot: base NP F-score against number of training sentences (10K to 40K) for WSJ PRLG, Negra PRLG and CTB PRLG]
![Page 61: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/61.jpg)
Treebank precision
(S (NP (D The) (N Cat) (PP (P in) (NP (D the) (N hat)))) (VP (V knows) (NP (D a) (N lot)) (PP (P about) (NP (N that)))))
(the cat in the hat) knows (a lot) (about that)
• Constituent chunks: Prec = 2/3, Rec = 2/3, F = 2/3
• Base NPs: Prec = 1/3, Rec = 1/2
• Treebank precision: 3/3
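The three figures on this slide can be reproduced by comparing sets of spans. The sketch below is illustrative only: the helper `prf` is a hypothetical name, spans are (start, end) token offsets with the end exclusive, and the `treebank` set is read off the example tree.

```python
from fractions import Fraction

def prf(predicted: set, gold: set):
    """Precision, recall and F-score over exactly matching spans."""
    matched = len(predicted & gold)
    p = Fraction(matched, len(predicted))
    r = Fraction(matched, len(gold))
    f = 2 * p * r / (p + r) if matched else Fraction(0)
    return p, r, f

# token offsets: the0 cat1 in2 the3 hat4 knows5 a6 lot7 about8 that9
predicted = {(0, 5), (6, 8), (8, 10)}      # (the cat in the hat) (a lot) (about that)
gold_chunks = {(3, 5), (6, 8), (8, 10)}    # the hat, a lot, about that
gold_base_nps = {(3, 5), (6, 8)}           # the hat, a lot
# multiword constituents of the example tree
treebank = {(0, 10), (0, 5), (2, 5), (3, 5), (5, 10), (6, 8), (8, 10)}

chunk_scores = prf(predicted, gold_chunks)
np_scores = prf(predicted, gold_base_nps)
treebank_precision = Fraction(len(predicted & treebank), len(predicted))
print(chunk_scores, np_scores, treebank_precision)
```

Treebank precision differs from the other two measures in that a prediction counts as correct if it matches any constituent in the tree, which is why "the cat in the hat" is rewarded there but not as a chunk or base NP.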
![Page 62: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/62.jpg)
On chunking the CTB
[Plot: treebank precision, base NP F-score and constituent chunk F-score against EM iterations (3 to 80)]
![Page 63: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/63.jpg)
Question
Do these models scale?
![Page 64: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/64.jpg)
Chunking with training from Gigaword NYT
[Plot: treebank precision, base NP F-score and constituent chunk F-score (axis 50 to 90) as NYT sentences are added to the WSJ training data: WSJ, +160K, +320K, +480K, +640K]
![Page 66: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/66.jpg)
Outline
1 Goals and contributions
2 Unsupervised partial parsing
  Main results
  Discussion
3 Cascaded parsing
  Main results
  Discussion
4 Concluding remarks
![Page 67: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/67.jpg)
Question
Are we limited to segmentation?
![Page 68: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/68.jpg)
Hypothesis
Identification of higher level constituents can also be learned by generalizing on phrasal boundaries
![Page 69: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/69.jpg)
Cascaded UPP: 1 Segment raw text
there is no asbestos in our products now
there (is no asbestos) in (our products) now
![Page 70: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/70.jpg)
Cascaded UPP: 2 Choose stand-ins for phrases
there (is no asbestos) in (our products) now
is no asbestos → is
our products → our
there is in our now
![Page 71: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/71.jpg)
Cascaded UPP: 3 Segment text + phrasal stand-ins
there is in our now
(there is) (in our) now
![Page 72: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/72.jpg)
Cascaded UPP: 4 Choose stand-ins and repeat steps 3–4
(there is) (in our) now
there is → is
in our → in
is in now
![Page 73: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/73.jpg)
Cascaded UPP: 5 Unwind to output tree
is in now
(there (is no asbestos)) (in (our products)) now
![Page 74: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/74.jpg)
Cascaded UPP: Review
• Separate models learned at each cascade level
• Models share hyper-parameters (smoothing, etc.)
• Choice of pseudowords as phrasal stand-ins
• Pseudoword identification: corpus frequency
• Cascade run to convergence
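The cascade steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the thesis implementation: the chunker is passed in as a plain function (the hypothetical `toy_chunker` lookup table stands in for a trained HMM or PRLG model, and the same toy function is reused at every level, whereas the slides describe a separate model per level), the counts in `freq` are invented, and picking the most frequent word in a chunk as its stand-in is one reading of the "corpus frequency" bullet.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple, Union

Tree = Union[str, list]            # a word, or a list of subtrees
Span = Tuple[int, int]             # half-open [start, end) chunk boundaries
Chunker = Callable[[Sequence[str]], List[Span]]


def cascade_parse(words: Sequence[str], chunkers: Sequence[Chunker],
                  freq: Counter) -> List[Tree]:
    """Run one chunker per cascade level and return the top-level subtrees."""
    # Each node pairs a stand-in token with the subtree it represents.
    nodes: List[Tuple[str, Tree]] = [(w, w) for w in words]
    for chunk in chunkers:
        tokens = [stand_in for stand_in, _ in nodes]
        spans = sorted(chunk(tokens))
        if not spans:              # nothing new was chunked: converged
            break
        merged: List[Tuple[str, Tree]] = []
        pos = 0
        for start, end in spans:
            merged.extend(nodes[pos:start])
            group = nodes[start:end]
            # Assumed rule: the most frequent token in the chunk is its stand-in.
            stand_in = max((s for s, _ in group), key=lambda w: freq[w])
            merged.append((stand_in, [tree for _, tree in group]))
            pos = end
        merged.extend(nodes[pos:])
        nodes = merged
    # Unwinding is implicit: every stand-in already carries its subtree.
    return [tree for _, tree in nodes]


# Hypothetical chunker: a lookup table that reproduces the slide example.
TOY_RULES = {
    ("there", "is", "no", "asbestos", "in", "our", "products", "now"):
        [(1, 4), (5, 7)],
    ("there", "is", "in", "our", "now"): [(0, 2), (2, 4)],
}


def toy_chunker(tokens: Sequence[str]) -> List[Span]:
    return TOY_RULES.get(tuple(tokens), [])


# Invented counts, chosen so that "is", "our" and "in" win as stand-ins.
freq = Counter({"in": 60, "is": 50, "there": 20, "our": 10,
                "now": 9, "no": 8, "products": 3, "asbestos": 1})

sentence = "there is no asbestos in our products now".split()
parse = cascade_parse(sentence, [toy_chunker] * 3, freq)
```

Storing each stand-in together with the subtree it replaces means the "unwind" step needs no separate pass; the nested lists in `parse` already encode the bracketing.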
![Page 75: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/75.jpg)
Right-branching baseline
the quick brown fox jumped over the lazy dog
[the [quick [brown [fox [jumped [over [the [lazy dog]]]]]]]]
![Page 76: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/76.jpg)
Right-branching baseline
a Lorillard spokeswoman said , this is an old story
[a [Lorillard [spokeswoman said]]] , [this [is [an [old story]]]]
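A right-branching baseline is simple enough to write out. The sketch below is an illustration, not the code used in the thesis: `right_branching` nests every word with the material to its right, and `right_branching_by_segment` is an assumed variant that restarts the nesting at each punctuation mark, which is how the second example slide appears to be drawn. The `PUNCTUATION` set and all function names are hypothetical.

```python
from typing import List, Sequence, Union

Tree = Union[str, list]
PUNCTUATION = {",", ".", ";", ":", "!", "?"}   # assumed punctuation inventory


def right_branching(words: Sequence[str]) -> Tree:
    """Pair each word with a constituent holding everything to its right."""
    if not words:
        raise ValueError("cannot build a tree over zero words")
    if len(words) == 1:
        return words[0]
    return [words[0], right_branching(words[1:])]


def right_branching_by_segment(words: Sequence[str]) -> List[Tree]:
    """Right-branch each punctuation-delimited segment separately."""
    output: List[Tree] = []
    segment: List[str] = []
    for word in words:
        if word in PUNCTUATION:
            if segment:
                output.append(right_branching(segment))
                segment = []
            output.append(word)    # punctuation stays outside the brackets
        else:
            segment.append(word)
    if segment:
        output.append(right_branching(segment))
    return output


def to_brackets(tree: Tree) -> str:
    """Render a nested-list tree as a bracketed string."""
    if isinstance(tree, str):
        return tree
    return "[" + " ".join(to_brackets(child) for child in tree) + "]"


fox = to_brackets(right_branching(
    "the quick brown fox jumped over the lazy dog".split()))
story = " ".join(to_brackets(t) for t in right_branching_by_segment(
    "a Lorillard spokeswoman said , this is an old story".split()))
```

Because the baseline needs no training, it gives a floor against which any learned parser can be compared.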
![Page 77: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/77.jpg)
Cascaded UPP: Evaluation
Chart: constituent F-score (axis 0 to 50) on WSJ, Negra and CTB for Baseline, CCL, Cascaded HMM and Cascaded PRLG
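For readers unfamiliar with the metric, a constituent F-score can be computed by comparing the bracket spans of a predicted tree with those of the gold tree. The sketch below is one common formulation (unlabeled, micro-averaged over sentences) and is an assumption on my part: this section does not say whether the root span or single-word spans are counted, so here single words are skipped and the root is kept. The two trees are invented for illustration.

```python
from typing import Iterable, Set, Tuple, Union

Tree = Union[str, list]
Span = Tuple[int, int]


def spans(tree: Tree, start: int = 0) -> Tuple[Set[Span], int]:
    """Return the (start, end) spans of multi-word constituents and the end offset."""
    if isinstance(tree, str):
        return set(), start + 1
    found: Set[Span] = set()
    pos = start
    for child in tree:
        child_spans, pos = spans(child, pos)
        found |= child_spans
    if pos - start > 1:            # skip single-word brackets
        found.add((start, pos))
    return found, pos


def bracket_prf(pairs: Iterable[Tuple[Set[Span], Set[Span]]]
                ) -> Tuple[float, float, float]:
    """Micro-averaged unlabeled precision, recall and F1 over (gold, predicted) pairs."""
    matched = gold_total = pred_total = 0
    for gold, pred in pairs:
        matched += len(gold & pred)
        gold_total += len(gold)
        pred_total += len(pred)
    precision = matched / pred_total if pred_total else 0.0
    recall = matched / gold_total if gold_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Invented example trees (not taken from the thesis data).
gold_tree = [["the", "cat"], ["saw", ["the", "red", "dog"]]]
pred_tree = [["the", "cat"], [["saw", "the"], ["red", "dog"]]]

gold_spans, _ = spans(gold_tree)
pred_spans, _ = spans(pred_tree)
precision, recall, f1 = bracket_prf([(gold_spans, pred_spans)])
```

In this toy case three of the five predicted brackets match the four gold brackets, so precision is 0.6 and recall is 0.75.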
![Page 78: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/78.jpg)
Another benchmark: CCM
Constituent-context model (Klein & Manning, 2002)
• Generative probabilistic model
• Gold-standard POS
• Short sentences
![Page 79: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/79.jpg)
Evaluation on ≤10 word sentences
Chart: constituent F-score (axis 0 to 70) on WSJ, Negra and CTB for Baseline, CCM, CCL, Cascaded HMM and Cascaded PRLG
![Page 80: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/80.jpg)
Example parses
two share a house almost devoid of furniture
Tree diagrams: gold standard and Cascaded PRLG (WSJ) output, with predicted constituents marked correct or incorrect
![Page 81: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/81.jpg)
Example parses
what is one to think of all this
Tree diagrams: gold standard and Cascaded PRLG (WSJ) output, with predicted constituents marked correct or incorrect
![Page 82: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/82.jpg)
Example parses
die csu tut das in bayern doch auch sehr erfolgreich
the CSU does this in Bavaria nevertheless also very successfully
Nevertheless, the CSU does this in Bavaria very successfully as well
Tree diagrams: gold standard and Cascaded PRLG (Negra) output, with predicted constituents marked correct or incorrect
![Page 83: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/83.jpg)
Example parses
bei den windsors bleibt alles in der familie
with the Windsors stays everything in the family
With the Windsors everything stays in the family.
Tree diagrams: gold standard and Cascaded PRLG (Negra) output, with predicted constituents marked correct or incorrect
![Page 84: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/84.jpg)
Example parses
immer mehr anlagenteile überaltern
ever more machine parts over-age
(with) more and more machine parts over-age
Tree diagram: Cascaded PRLG (Negra) output, with predicted constituents marked correct or incorrect
![Page 85: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/85.jpg)
Outline
1 Goals and contributions
2 Unsupervised partial parsing
  Main results
  Discussion
3 Cascaded parsing
  Main results
  Discussion
4 Concluding remarks
![Page 86: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/86.jpg)
Question
How do these cascaded chunkers work?
![Page 87: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/87.jpg)
Recall of NPs and PPs
             NPs             PPs
             Lev 1   Lev 2   Lev 1   Lev 2
WSJ PRLG     77.5    78.3     9.1    77.6
Negra HMM    54.7    62.3    24.8    48.1
CTB PRLG     30.9    33.6    31.6    47.1
![Page 88: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/88.jpg)
Prec / Rec trade-offs in the cascade
Plot (WSJ PRLG): precision, recall and F-score against cascade levels 1 to 7
![Page 89: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/89.jpg)
Prec / Rec trade-offs in the cascade
Plot (Negra PRLG): precision, recall and F-score against cascade levels 1 to 7
![Page 90: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/90.jpg)
Prec / Rec trade-offs in the cascade
Plot (CTB PRLG): precision, recall and F-score against cascade levels 1 to 7
![Page 91: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/91.jpg)
Learning curves
Plot (WSJ): F-score against number of WSJ sentences (10K to 40K) for PRLG, CCL and HMM
![Page 92: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/92.jpg)
Learning curves
Plot (Negra): F-score against number of Negra sentences (5K to 15K) for PRLG, HMM and CCL
![Page 93: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/93.jpg)
Learning curves
Plot (CTB): F-score against number of CTB sentences (5K to 15K) for PRLG, HMM and CCL
![Page 94: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/94.jpg)
Outline
1 Goals and contributions
2 Unsupervised partial parsing
  Main results
  Discussion
3 Cascaded parsing
  Main results
  Discussion
4 Concluding remarks
![Page 95: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/95.jpg)
What we’ve learned
• Unsupervised identification of base NPs and local constituents is possible
• A cascade of chunking models for raw text parsing has state-of-the-art results
![Page 96: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/96.jpg)
Future directions
• Improvements to the sequence models
• Better phrasal stand-in (pseudoword) construction
• Learning joint models rather than a cascade
![Page 97: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/97.jpg)
Historical note
First known computational natural language parser
Transformations and Discourse Analysis Project
Zellig Harris & colleagues, UPenn, 1950s to 1960s
![Page 98: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/98.jpg)
Historical note
To the best of our knowledge, this is the first application of FSTs to parsing. The program consisted of the following phases:
1. Dictionary look-up.
2. Replacement of some ‘grammatical idioms’ by a single part of speech.
3. Rule based part of speech disambiguation.
4. A right to left FST composed with a left to right FST for computing ‘simple noun phrases’.
Joshi & Hopely 1997
![Page 99: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/99.jpg)
Historical note
To the best of our knowledge, this is the first application of FSTs to parsing. The program consisted of the following phases:
5. A left to right FST for computing ‘simple adjuncts’ such as prepositional phrases and adverbial phrases.
6. A left to right FST for computing simple verb clusters.
7. A left to right ‘FST’ for computing clauses.
Joshi & Hopely 1997
![Page 100: Unsupervised Partial Parsing: Thesis defense](https://reader035.vdocument.in/reader035/viewer/2022081507/554bf716b4c9055a368b566e/html5/thumbnails/100.jpg)
Thanks!