probabilistic parsing
![Page 1: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/1.jpg)
Probabilistic Parsing
Ling 571
Fei Xia
Week 4: 10/18-10/20/05
![Page 2: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/2.jpg)
Outline
• Misc: Hw3 and Hw4: lexicalized rules
• CYK recap
  – Converting CFG into CNF
  – N-best
• Quiz #2
• Common prob equations
• Independence assumption
• Lexicalized models
![Page 3: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/3.jpg)
CYK Recap
![Page 4: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/4.jpg)
Converting CFG into CNF
• CNF
• Extended CNF
• CFG in general vs. CFG for natural languages
• Converting CFG into CNF
• Converting PCFG into CNF
• Recovering parse trees
![Page 5: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/5.jpg)
Definition of CNF
• A, B, C are non-terminals, a is a terminal, and S is the start symbol.
• Definition 1:
  – A → B C
  – A → a
  – S → ε
  where B and C are not start symbols.
• Definition 2: ε-free grammar
  – A → B C
  – A → a
![Page 6: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/6.jpg)
Extended CNF
• Definition 3:
  – A → B C
  – A → a or A → B
• We use Def 3:
  – Unit rules such as NP → N are allowed.
  – No need to remove unit rules during conversion.
  – The CYK algorithm needs to be modified.
![Page 7: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/7.jpg)
CYK algorithm with Def 2
• For every rule A → w_i:
      π[i][i][A] = Pr(A → w_i)
• For span = 2 to N
      for begin = 1 to N − span + 1
          end = begin + span − 1
          for m = begin to end − 1
              for all non-terminals A, B, C:
                  val = π[begin][m][B] * π[m+1][end][C] * Pr(A → B C)
                  if val > π[begin][end][A] then
                      π[begin][end][A] = val
                      B[begin][end][A] = (m, B, C)
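The loop structure of the Def 2 algorithm can be sketched in Python as follows. This is a minimal illustration, not the course's reference implementation: the function name `cyk_viterbi`, the dictionary-based grammar encoding, and the use of tuple keys `(begin, end, A)` for the two tables are all assumptions made for this sketch. Spans are 1-based and inclusive, as on the slide.

```python
from collections import defaultdict

def cyk_viterbi(words, lexical, binary):
    """Probabilistic CYK for a grammar with rules A -> B C and A -> w.

    lexical: {(A, word): Pr(A -> word)}   (hypothetical encoding)
    binary:  {(A, B, C): Pr(A -> B C)}    (hypothetical encoding)
    Returns (pi, back), keyed by (begin, end, A).
    """
    n = len(words)
    pi = defaultdict(float)  # best probability per cell; missing entries are 0.0
    back = {}                # back pointer (m, B, C) per cell
    # Base case: cells of span 1.
    for i, w in enumerate(words, start=1):
        for (A, word), p in lexical.items():
            if word == w:
                pi[(i, i, A)] = p
    # Recursive case: longer spans are built from two shorter ones.
    for span in range(2, n + 1):
        for begin in range(1, n - span + 2):
            end = begin + span - 1
            for m in range(begin, end):
                for (A, B, C), p in binary.items():
                    val = pi[(begin, m, B)] * pi[(m + 1, end, C)] * p
                    if val > pi[(begin, end, A)]:
                        pi[(begin, end, A)] = val
                        back[(begin, end, A)] = (m, B, C)
    return pi, back
```

Iterating over the rule dictionary instead of over all triples of non-terminals visits only the (A, B, C) combinations that actually have a rule.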
![Page 8: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/8.jpg)
CYK algorithm with Def 3
• For every position i:
      for all A, if A → w_i:
          π[i][i][A] = Pr(A → w_i)
      for all A and B, if A → B:
          update π[i][i][A]
• For span = 2 to N
      for begin = 1 to N − span + 1
          end = begin + span − 1
          for m = begin to end − 1
              for all non-terminals A, B, C:
                  ….
          for all non-terminals A and B, if A → B:
              update π[begin][end][A]
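The slide leaves the "update" step for unit rules unspecified. One plausible reading, sketched below under that assumption, is to raise π[A] to Pr(A → B) * π[B] whenever that is larger, and to repeat until nothing changes so that chains such as X → NP → N are covered. The function name `apply_unit_rules` and the per-cell dictionary are hypothetical.

```python
def apply_unit_rules(cell, unary):
    """Apply unit rules A -> B inside one chart cell.

    cell:  {non-terminal: best probability for this span}  (hypothetical encoding)
    unary: {(A, B): Pr(A -> B)}                            (hypothetical encoding)
    The cell is updated in place and returned.
    """
    changed = True
    while changed:  # repeat so that chains of unit rules propagate
        changed = False
        for (A, B), p in unary.items():
            val = p * cell.get(B, 0.0)
            if val > cell.get(A, 0.0):
                cell[A] = val
                changed = True
    return cell
```

Because rule probabilities are at most 1, going around a cycle of unit rules never increases a value, so the loop terminates.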
![Page 9: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/9.jpg)
CFG
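• CFG in general:
  – G = (N, T, P, S)
  – Rules: $A \to \alpha$, $\alpha \in (N \cup T)^*$
• CFG for natural languages:
  – G = (N, T, P, S)
  – Pre-terminals: $N_1 \subset N$
  – Rules:
    • Syntactic rules: $A \to \alpha$, $A \in N - N_1$, $\alpha \in N^+$
    • Lexicon: $A \to a$, $A \in N_1$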
![Page 10: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/10.jpg)
Conversion from CFG to CNF
• CFG (in general) to CNF (Def 1)
  – Add S0 → S
  – Remove ε-rules
  – Remove unit rules
  – Replace n-ary rules with binary rules
• CFG (for NL) to CNF (Def 3)
  – CFG (for NL) has no ε-rules
  – Unit rules are allowed in CNF (Def 3)
  – Only the last step is necessary
![Page 11: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/11.jpg)
An example
• VP → V NP PP PP
• To recover the parse tree with respect to the original CFG, just remove the added non-terminals.
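Removing the added non-terminals amounts to splicing their children into the parent node. The sketch below assumes trees are nested tuples `(label, child, ...)` with plain strings as words, and that added non-terminals are named `X1`, `X2`, ... as in the PCFG conversion example; both conventions, and the name `unbinarize`, are assumptions for illustration.

```python
def unbinarize(tree, is_added=lambda label: label.startswith("X") and label[1:].isdigit()):
    """Return a copy of `tree` with every added non-terminal removed.

    tree: (label, child, ...) where a child is a subtree or a word (str).
    is_added: predicate that recognizes added non-terminals (assumed naming: X1, X2, ...).
    """
    if isinstance(tree, str):  # a word: nothing to do
        return tree
    label, *children = tree
    flat = []
    for child in children:
        child = unbinarize(child, is_added)
        if not isinstance(child, str) and is_added(child[0]):
            flat.extend(child[1:])  # splice the grandchildren into this node
        else:
            flat.append(child)
    return (label, *flat)
```

For example, a binarized `VP → V X1`, `X1 → NP X2`, `X2 → PP PP` subtree comes back as a single flat `VP → V NP PP PP` node.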
![Page 12: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/12.jpg)
Converting PCFG into CNF
• VP → V NP PP PP 0.1
=>
VP → V X1 0.1
X1 → NP X2 1.0
X2 → PP PP 1.0
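The conversion above keeps the original probability on the first new rule and gives every added rule probability 1.0, so the product over the new rules equals the original rule probability. A minimal sketch of that right-branching binarization, with a hypothetical function name `binarize` and a tuple encoding of rules chosen for this example:

```python
from itertools import count

def binarize(lhs, rhs, prob, fresh=None):
    """Split an n-ary rule into binary rules, right-branching.

    Returns a list of (lhs, rhs_tuple, prob). New non-terminals are named
    X1, X2, ... from the `fresh` counter (an assumed naming scheme).
    """
    if fresh is None:
        fresh = count(1)
    rules = []
    rhs = list(rhs)
    while len(rhs) > 2:
        new = f"X{next(fresh)}"
        rules.append((lhs, (rhs[0], new), prob))
        # The remainder is rewritten under the new symbol with probability 1.0.
        lhs, rhs, prob = new, rhs[1:], 1.0
    rules.append((lhs, tuple(rhs), prob))
    return rules
```

When converting a whole grammar, one shared `fresh` counter should be passed to every call so that the added names do not collide.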
![Page 13: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/13.jpg)
CYK with N-best output
![Page 14: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/14.jpg)
N-best parse trees
• Best parse tree:
  – π[begin][end][A] = prob
  – B[begin][end][A] = (m, B, C)
• N-best parse trees:
  – π[begin][end][A] = [prob_1, …, prob_N]
  – B[begin][end][A] = [(m_1, B_1, C_1, i_1, j_1), …, (m_N, B_N, C_N, i_N, j_N)]
![Page 15: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/15.jpg)
CYK algorithm for N-best
• For every rule A → w_i:
      π[i][i][A] = [Pr(A → w_i), 0, …, 0]
• Initially, for every longer span:
      π[begin][end][A] = [0, …, 0]
      B[begin][end][A] = [(−1, _, _, _, _), …, (−1, _, _, _, _)]
• For span = 2 to N
      for begin = 1 to N − span + 1
          end = begin + span − 1
          for m = begin to end − 1
              for all non-terminals A, B, C:
                  for each p_i ∈ π[begin][m][B], p_j ∈ π[m+1][end][C]:
                      val = p_i * p_j * Pr(A → B C)
                      if val > one of the probs in π[begin][end][A] then
                          remove the last element in π[begin][end][A] and insert val into the array
                          remove the last element in B[begin][end][A] and insert (m, B, C, i, j) into B[begin][end][A]
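The innermost update can be sketched as a function that merges new candidates into one cell's N-best list. This sketch departs from the slide in one labeled way: instead of two fixed-size, zero-padded arrays it keeps a single variable-length list of `(prob, back_pointer)` pairs, sorted by descending probability. The name `update_cell` and its parameters are hypothetical.

```python
def update_cell(cell, left, right, rule_prob, m, B, C, n):
    """Merge candidates for one rule A -> B C and one split point m into `cell`.

    cell:  list of (prob, (m, B, C, i, j)), descending by prob, at most n entries
    left:  N-best probabilities of B over [begin, m], descending
    right: N-best probabilities of C over [m+1, end], descending
    i and j are 1-based ranks into `left` and `right`, as in the back pointers.
    """
    for i, p_i in enumerate(left, start=1):
        for j, p_j in enumerate(right, start=1):
            val = p_i * p_j * rule_prob
            if val > 0.0:
                cell.append((val, (m, B, C, i, j)))
    cell.sort(key=lambda item: item[0], reverse=True)
    del cell[n:]  # keep only the n best entries
    return cell
```

Sorting after each merge is simple rather than fast; it is enough to show how the ranks i and j end up in the back pointers.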
![Page 16: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/16.jpg)
Mary bought books with cash
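Chart cells [begin][end]; each binary rule is followed by its (m, i, j) back pointer.

| begin \ end | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | N → Mary, NP → N | - | S → NP VP (1,1,1) | - | S → NP VP (1,1,1), S → NP VP (1,1,2) |
| 2 | | V → bought | VP → V NP (2,1,1) | - | VP → V NP (2,1,1), VP → VP PP (3,1,1) |
| 3 | | | N → books, NP → N | - | NP → NP PP (3,1,1) |
| 4 | | | | P → with | PP → P NP (4,1,1) |
| 5 | | | | | N → cash, NP → N |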
![Page 17: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/17.jpg)
Common probability equations
![Page 18: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/18.jpg)
Three types of probability
• Joint prob: P(x,y)= prob of x and y happening together
• Conditional prob: P(x|y) = prob of x given a specific value of y
• Marginal prob: P(x) = prob of x for all possible values of y
![Page 19: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/19.jpg)
Common equations
$$
\begin{aligned}
P(A) &= \sum_{B} P(A, B) \\
P(A, B) &= P(A) \cdot P(B \mid A) = P(B) \cdot P(A \mid B) \\
P(B \mid A) &= \frac{P(A, B)}{P(A)}
\end{aligned}
$$
![Page 20: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/20.jpg)
An example
• #(words) = 100, #(nouns) = 40, #(verbs) = 20
• “books” appears 10 times: 3 as a verb, 7 as a noun
• P(w=books) = 0.1
• P(w=books, t=noun) = 0.07
• P(t=noun | w=books) = 0.7
• P(t=noun) = 0.4
• P(w=books | t=noun) = 7/40
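The numbers in this example can be recomputed directly from the counts. The sketch below uses exact fractions; the variable and function names are made up for this illustration.

```python
from fractions import Fraction

# Counts taken from the example above.
total_words = 100
tag_counts = {"noun": 40, "verb": 20}
word_tag_counts = {("books", "noun"): 7, ("books", "verb"): 3}

def p_word(word):
    """Marginal P(w): sum the joint counts over all tags."""
    return Fraction(sum(c for (w, _), c in word_tag_counts.items() if w == word), total_words)

def p_joint(word, tag):
    """Joint P(w, t)."""
    return Fraction(word_tag_counts.get((word, tag), 0), total_words)

def p_tag(tag):
    """Marginal P(t)."""
    return Fraction(tag_counts[tag], total_words)

def p_tag_given_word(tag, word):
    """Conditional P(t | w) = P(w, t) / P(w)."""
    return p_joint(word, tag) / p_word(word)

def p_word_given_tag(word, tag):
    """Conditional P(w | t) = P(w, t) / P(t)."""
    return p_joint(word, tag) / p_tag(tag)
```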
![Page 21: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/21.jpg)
More general cases
$$
\begin{aligned}
P(A_1) &= \sum_{A_2, \ldots, A_n} P(A_1, \ldots, A_n) \\
P(A_1, \ldots, A_n) &= \prod_{i=1}^{n} P(A_i \mid A_1, \ldots, A_{i-1})
\end{aligned}
$$
![Page 22: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/22.jpg)
Independence assumption
![Page 23: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/23.jpg)
Independence assumption
• Two variables A and B are independent if
  – P(A,B) = P(A) * P(B)
  – P(A) = P(A|B)
  – P(B) = P(B|A)
• Two variables A and B are conditionally independent given C if
  – P(A,B|C) = P(A|C) * P(B|C)
  – P(A|B,C) = P(A|C)
  – P(B|A,C) = P(B|C)
• The independence assumption is used to remove some conditioning factors, which reduces the number of parameters in a model.
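As a rough illustration of the parameter savings (a standard counting argument, not taken from the slides), consider $n$ binary variables:

$$
\begin{aligned}
\text{full joint } P(A_1, \ldots, A_n) &: \; 2^n - 1 \text{ free parameters} \\
\text{fully independent } \prod_{i=1}^{n} P(A_i) &: \; n \text{ free parameters}
\end{aligned}
$$

For $n = 10$ that is 1023 parameters versus 10.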
![Page 24: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/24.jpg)
PCFG parsers
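$$
\begin{aligned}
P(T, S) &= P(r_1, \ldots, r_n) \\
&= \prod_{i=1}^{n} P(r_i \mid r_1, \ldots, r_{i-1}) \\
&\approx \prod_{i=1}^{n} P(r_i \mid lhs(r_i))
\end{aligned}
$$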
It assumes each rule is independent of other rules
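Under this assumption the probability of a parse tree is just the product of the probabilities of the rules used in it. A minimal sketch, assuming trees are nested tuples `(label, child, ...)` with strings as words and rule probabilities stored in a dictionary keyed by `(lhs, rhs_tuple)`; the name `tree_prob` and both encodings are hypothetical.

```python
def tree_prob(tree, rule_probs):
    """Probability of a parse tree under a PCFG.

    tree: (label, child, ...) where a child is a subtree or a word (str)
    rule_probs: {(lhs, rhs_tuple): Pr(lhs -> rhs)}
    """
    if isinstance(tree, str):  # a word contributes no rule of its own
        return 1.0
    label, *children = tree
    # The right-hand side is the sequence of child labels (or the word itself).
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = rule_probs[(label, rhs)]
    for child in children:
        p *= tree_prob(child, rule_probs)
    return p
```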
![Page 25: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/25.jpg)
Problems of independence assumptions
• Lexical independence:
  – P(VP → V, V → bought) = P(VP → V) * P(V → bought)

See Table 12.2 in M&S, p. 418.

| Rule | come | take | think | want |
|---|---|---|---|---|
| VP → V | 9.5% | 2.6% | 4.6% | 5.7% |
| VP → V NP | 1.1% | 32.1% | 0.2% | 13.9% |
| VP → V PP | 34.5% | 3.1% | 7.1% | 0.3% |
| VP → V SBAR | 6.6% | 0.3% | 73.0% | 0.2% |
![Page 26: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/26.jpg)
Problems of independence assumptions (cont)
• Structural independence:
  – P(S → NP VP, NP → Pron) = P(S → NP VP) * P(NP → Pron)

See Table 12.3 in M&S, p. 420.

| Rule | % as subj | % as obj |
|---|---|---|
| NP → Pron | 13.7% | 2.1% |
| NP → Det NN | 5.6% | 4.6% |
| NP → NP SBAR | 0.5% | 2.6% |
| NP → NP PP | 5.6% | 14.1% |
![Page 27: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/27.jpg)
Dealing with the problems
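• Lexical rules:
  – P(VP → V | V=come)
  – P(VP → V | V=think)
• Adding context info: instead of

$$P(r_i \mid r_1, \ldots, r_{i-1}) \approx P(r_i)$$

use

$$P(r_i \mid r_1, \ldots, r_{i-1}) \approx P(r_i \mid f(r_1, \ldots, r_{i-1}))$$

where $f$ is a function that groups $r_1, \ldots, r_{i-1}$ into equivalence classes.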
![Page 28: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/28.jpg)
PCFG
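$$
\begin{aligned}
P(T, S) &= P(r_1, \ldots, r_n) \\
&= \prod_{i=1}^{n} P(r_i \mid r_1, \ldots, r_{i-1}) \\
&\approx \prod_{i=1}^{n} P(r_i \mid lhs(r_i))
\end{aligned}
$$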
It assumes each rule is independent of other rules
![Page 29: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/29.jpg)
A lexicalized model
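$$
\begin{aligned}
P(T, S) &= P(lr_1, \ldots, lr_n) \\
&= \prod_{i=1}^{n} P(lr_i \mid lr_1, \ldots, lr_{i-1}) \\
&= \prod_{i=1}^{n} P(r_i, h(r_i) \mid lr_1, \ldots, lr_{i-1}) \\
&= \prod_{i=1}^{n} P(h(r_i) \mid lr_1, \ldots, lr_{i-1}) \cdot P(r_i \mid h(r_i), lr_1, \ldots, lr_{i-1}) \\
&\approx \prod_{i=1}^{n} P(h(r_i) \mid lhs(r_i), h(m(r_i))) \cdot P(r_i \mid lhs(r_i), h(r_i))
\end{aligned}
$$

Here $lr_i = (r_i, h(r_i))$ is the lexicalized rule, $h(r_i)$ is the head word of $r_i$, and $m(r_i)$ is the mother of $r_i$.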
![Page 30: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/30.jpg)
An example
• he likes her
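$$
\begin{aligned}
P(T, S) = {} & P(\text{likes} \mid \text{Top}, \_) \cdot P(\text{Top} \to \text{S} \mid \text{Top}, \text{likes}) \\
& \cdot P(\text{likes} \mid \text{S}, \text{likes}) \cdot P(\text{S} \to \text{NP VP} \mid \text{S}, \text{likes}) \\
& \cdot P(\text{he} \mid \text{NP}, \text{likes}) \cdot P(\text{NP} \to \text{Pron} \mid \text{NP}, \text{he}) \\
& \cdot P(\text{likes} \mid \text{VP}, \text{likes}) \cdot P(\text{VP} \to \text{V NP} \mid \text{VP}, \text{likes}) \\
& \cdot P(\text{her} \mid \text{NP}, \text{likes}) \cdot P(\text{NP} \to \text{Pron} \mid \text{NP}, \text{her}) \\
& \cdot P(\text{he} \mid \text{Pron}, \text{he}) \cdot P(\text{Pron} \to \text{he} \mid \text{Pron}, \text{he}) \\
& \cdot P(\text{likes} \mid \text{V}, \text{likes}) \cdot P(\text{V} \to \text{likes} \mid \text{V}, \text{likes}) \\
& \cdot P(\text{her} \mid \text{Pron}, \text{her}) \cdot P(\text{Pron} \to \text{her} \mid \text{Pron}, \text{her})
\end{aligned}
$$

(The slot marked _ is empty: the root has no mother head.)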
![Page 31: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/31.jpg)
Head-head probability
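$$
\begin{aligned}
P(w_2 \mid A, w_1) &= \frac{P(w_2, A, w_1)}{P(A, w_1)} = \frac{C(w_2, A, w_1)}{C(A, w_1)} \\
&= \frac{C(X(w_1) \to \ldots A(w_2) \ldots)}{\sum_{w} C(X(w_1) \to \ldots A(w) \ldots)}
\end{aligned}
$$

$$
P(\text{he} \mid \text{NP}, \text{likes}) = \frac{C(X(\text{likes}) \to \ldots \text{NP}(\text{he}) \ldots)}{\sum_{w} C(X(\text{likes}) \to \ldots \text{NP}(w) \ldots)}
$$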
![Page 32: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/32.jpg)
Head-rule probability
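$$
\begin{aligned}
P(A \to \alpha \mid A, w) &= \frac{P(A \to \alpha, A, w)}{P(A, w)} = \frac{P(A(w) \to \alpha)}{P(A(w))} \\
&= \frac{C(A(w) \to \alpha)}{C(A(w))} = \frac{C(A(w) \to \alpha)}{\sum_{\beta} C(A(w) \to \beta)}
\end{aligned}
$$

$$
P(\text{NP} \to \text{Pron} \mid \text{NP}, \text{he}) = \frac{C(\text{NP}(\text{he}) \to \text{Pron})}{C(\text{NP}(\text{he}))}
$$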
![Page 33: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/33.jpg)
Collecting the counts
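$$
\begin{aligned}
P(w_2 \mid A, w_1) &= \frac{C(X(w_1) \to \ldots A(w_2) \ldots)}{\sum_{w} C(X(w_1) \to \ldots A(w) \ldots)} \\
P(A \to \alpha \mid A, w) &= \frac{C(A(w) \to \alpha)}{C(A(w))}
\end{aligned}
$$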
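Both estimates are relative frequencies over events read off a treebank. The sketch below assumes the events have already been extracted into simple tuples; the tuple layouts and the function names `head_head_prob` and `head_rule_prob` are assumptions for illustration, not a fixed format.

```python
from collections import Counter

def head_head_prob(events, w2, A, w1):
    """Estimate P(w2 | A, w1).

    events: iterable of (A, child_head, parent_head), one per occurrence of a
    child A(child_head) under a parent headed by parent_head.
    """
    counts = Counter(events)
    # Denominator: all children of category A under a parent headed by w1.
    denom = sum(c for (a, _, parent), c in counts.items() if a == A and parent == w1)
    return counts[(A, w2, w1)] / denom if denom else 0.0

def head_rule_prob(events, rhs, A, w):
    """Estimate P(A -> rhs | A, w).

    events: iterable of (A, head, rhs_tuple), one per occurrence of A(head) -> rhs.
    """
    counts = Counter(events)
    # Denominator: all expansions of A(w), i.e. C(A(w)).
    denom = sum(c for (a, head, _), c in counts.items() if a == A and head == w)
    return counts[(A, w, rhs)] / denom if denom else 0.0
```

Returning 0.0 for unseen contexts is a simplification; a real parser would need smoothing, which these slides do not cover at this point.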
![Page 34: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/34.jpg)
Remaining problems
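• he likes her

$$
\begin{aligned}
P(T, S) = {} & P(\text{likes} \mid \text{S}, \_) \cdot P(\text{S} \to \text{NP VP} \mid \text{S}, \text{likes}) \\
& \cdot P(\text{he} \mid \text{NP}, \text{likes}) \cdot P(\text{NP} \to \text{Pron} \mid \text{NP}, \text{he}) \\
& \cdot P(\text{likes} \mid \text{VP}, \text{likes}) \cdot P(\text{VP} \to \text{V NP} \mid \text{VP}, \text{likes}) \\
& \cdot P(\text{her} \mid \text{NP}, \text{likes}) \cdot P(\text{NP} \to \text{Pron} \mid \text{NP}, \text{her})
\end{aligned}
$$

(The slot marked _ is empty: the root has no mother head.)

• The Prob(T,S) is the same if the sentence is changed to “her likes he”.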
![Page 35: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/35.jpg)
Previous model
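$$
\begin{aligned}
P(T, S) &= P(lr_1, \ldots, lr_n) \\
&= \prod_{i=1}^{n} P(lr_i \mid lr_1, \ldots, lr_{i-1}) \\
&= \prod_{i=1}^{n} P(r_i, h(r_i) \mid lr_1, \ldots, lr_{i-1}) \\
&= \prod_{i=1}^{n} P(h(r_i) \mid lr_1, \ldots, lr_{i-1}) \cdot P(r_i \mid h(r_i), lr_1, \ldots, lr_{i-1}) \\
&\approx \prod_{i=1}^{n} P(h(r_i) \mid lhs(r_i), h(m(r_i))) \cdot P(r_i \mid lhs(r_i), h(r_i))
\end{aligned}
$$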
![Page 36: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/36.jpg)
A new model
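$$
\begin{aligned}
P(T, S) &= P(lr_1, \ldots, lr_n) \\
&= \prod_{i=1}^{n} P(lr_i \mid lr_1, \ldots, lr_{i-1}) \\
&= \prod_{i=1}^{n} P(r_i, h(r_i) \mid lr_1, \ldots, lr_{i-1}) \\
&= \prod_{i=1}^{n} P(h(r_i) \mid lr_1, \ldots, lr_{i-1}) \cdot P(r_i \mid h(r_i), lr_1, \ldots, lr_{i-1}) \\
&\approx \prod_{i=1}^{n} P(h(r_i) \mid lhs(r_i), h(m(r_i))) \cdot P(r_i \mid lhs(r_i), h(r_i), lhs(m(r_i)))
\end{aligned}
$$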
![Page 37: Probabilistic Parsing](https://reader035.vdocument.in/reader035/viewer/2022081515/56814868550346895db576d3/html5/thumbnails/37.jpg)
New formula
• he likes her
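$$
\begin{aligned}
P(T, S) = {} & P(\text{likes} \mid \text{S}, \_) \cdot P(\text{S} \to \text{NP VP} \mid \text{S}, \text{likes}, \text{Top}) \\
& \cdot P(\text{he} \mid \text{NP}, \text{likes}) \cdot P(\text{NP} \to \text{Pron} \mid \text{NP}, \text{he}, \text{S}) \\
& \cdot P(\text{likes} \mid \text{VP}, \text{likes}) \cdot P(\text{VP} \to \text{V NP} \mid \text{VP}, \text{likes}, \text{S}) \\
& \cdot P(\text{her} \mid \text{NP}, \text{likes}) \cdot P(\text{NP} \to \text{Pron} \mid \text{NP}, \text{her}, \text{VP})
\end{aligned}
$$

(The slot marked _ is empty: the root has no mother head.)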
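The difference between the previous model and the new one can be checked mechanically by listing the factors each model uses for the tree S → NP VP, VP → V NP. In the sketch below the function name `factors`, the tuple encoding of a factor, and the use of `None` for the empty mother-head slot at the root are all assumptions made for illustration.

```python
from collections import Counter

def factors(subj, obj, verb="likes", with_mother=False):
    """Multiset of model factors for S -> NP(subj) VP, VP -> V NP(obj).

    with_mother=False: rule factors are conditioned on (lhs, head) only.
    with_mother=True:  rule factors are also conditioned on the mother's category.
    """
    def rule(r, lhs, head, mother):
        extra = (mother,) if with_mother else ()
        return ("rule", r, lhs, head) + extra

    return Counter([
        ("head", verb, "S", None), rule("S -> NP VP", "S", verb, "Top"),
        ("head", subj, "NP", verb), rule("NP -> Pron", "NP", subj, "S"),
        ("head", verb, "VP", verb), rule("VP -> V NP", "VP", verb, "S"),
        ("head", obj, "NP", verb), rule("NP -> Pron", "NP", obj, "VP"),
    ])

# Previous model: swapping subject and object yields the same factors,
# hence the same probability.
same_before = factors("he", "her") == factors("her", "he")
# New model: the mother category (S vs. VP) separates the two sentences.
same_after = factors("he", "her", with_mother=True) == factors("her", "he", with_mother=True)
```

Identical factor multisets imply identical products whatever the estimated probabilities are, which is why the previous model cannot prefer “he likes her” over “her likes he”.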