An Attempt at Unsupervised Learning of Hierarchical Dependency Parsing
via the Dependency Model with Valence (DMV)
Motivation
• Dependency Parsing:
  • Search Query Refinement
  • Statistical Machine Translation
• Unsupervised Learning:
  • Availability of Large Quantities of Data
DMV
• Pick a direction (left or right);
• Generate the first child, or stop;
• Generate more children, until stop;
• Repeat in the other direction;
• Recurse…
• P_order
• P_stop
• P_attach
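The generative steps above can be sketched as a small sampler. This is a minimal illustration under stated assumptions, not the implementation behind these slides: the function names, the toy tag set, the probability tables, the conditioning of the stop decision on whether a child has already been generated in that direction, and the `max_depth` safety cut-off are all hypothetical.

```python
import random

def sample_subtree(head, p_order, p_stop, p_attach, rng, depth=0, max_depth=6):
    """Sample the dependents of `head`, following the steps listed above.

    Returns (head, left_children, right_children); each child is such a triple.
    """
    children = {"left": [], "right": []}
    # Pick which direction to fill first; p_order[head] = P(left first).
    first = "left" if rng.random() < p_order[head] else "right"
    second = "right" if first == "left" else "left"
    for direction in (first, second):
        adjacent = True  # no child generated yet in this direction
        while True:
            # The forced stop at max_depth is a safety assumption, not part of the model.
            if depth >= max_depth or rng.random() < p_stop[(head, direction, adjacent)]:
                break
            # Choose the child's class given the head and the direction.
            options = p_attach[(head, direction)]
            child = rng.choices(list(options), weights=list(options.values()))[0]
            children[direction].append(
                sample_subtree(child, p_order, p_stop, p_attach, rng, depth + 1, max_depth))
            adjacent = False
    return (head, children["left"], children["right"])

def linearize(tree):
    """Read the word-class sequence off a sampled tree."""
    head, left, right = tree
    words = []
    for child in reversed(left):  # children are generated from the head outward
        words.extend(linearize(child))
    words.append(head)
    for child in right:
        words.extend(linearize(child))
    return words

# Toy parameters (hypothetical values, for illustration only).
TAGS = ["V", "N", "D"]
DIRECTIONS = ("left", "right")
P_ORDER = {t: 0.5 for t in TAGS}
P_STOP = {(t, d, adj): (0.6 if adj else 0.9)
          for t in TAGS for d in DIRECTIONS for adj in (True, False)}
P_ATTACH = {(t, d): {"V": 0.2, "N": 0.5, "D": 0.3} for t in TAGS for d in DIRECTIONS}

tree = sample_subtree("V", P_ORDER, P_STOP, P_ATTACH, random.Random(0))
print(linearize(tree))
```

Conditioning the stop decision on `adjacent` is what lets a head have different chances of taking its first child versus additional children in the same direction.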
EM
• Inside-Outside Algorithm:
  • Inside: P_i(i, X, j) = P(X derives i…j)
  • Outside: P_o(i, X, j) = P(S derives 0…i X j…l)
• Re-Estimation:
  • Frequency of sub-tree (i, X, j) = P_i(i, X, j) * P_o(i, X, j), normalized by the sentence probability P(S derives 0…l)
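The inside and outside quantities, and the expected sub-tree frequency built from them, can be sketched for a generic PCFG in Chomsky normal form. This is an illustrative sketch rather than the DMV-specific code behind these slides: the function names, the dictionary-based grammar encoding, and the use of half-open spans (i, X, j) covering words[i:j] are assumptions. The expected count divides the inside-outside product by the sentence probability so that it is a proper posterior.

```python
from collections import defaultdict

def inside_outside(words, lexical, binary, start="S"):
    """lexical: {(X, word): prob}; binary: {(X, Y, Z): prob} for rules X -> Y Z.

    Returns (inside, outside), both keyed by half-open spans (i, X, j).
    """
    n = len(words)
    inside = defaultdict(float)
    for i, w in enumerate(words):
        for (X, word), p in lexical.items():
            if word == w:
                inside[(i, X, i + 1)] += p
    # Inside pass: build wider spans from narrower ones.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (X, Y, Z), p in binary.items():
                    inside[(i, X, j)] += p * inside[(i, Y, k)] * inside[(k, Z, j)]
    # Outside pass: push probability from wider parents down to their children.
    outside = defaultdict(float)
    outside[(0, start, n)] = 1.0
    for width in range(n, 1, -1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (X, Y, Z), p in binary.items():
                    parent = outside[(i, X, j)]
                    outside[(i, Y, k)] += parent * p * inside[(k, Z, j)]
                    outside[(k, Z, j)] += parent * p * inside[(i, Y, k)]
    return inside, outside

def expected_count(inside, outside, span, n, start="S"):
    """Posterior probability that `span` is a constituent of the parse."""
    return inside[span] * outside[span] / inside[(0, start, n)]

# Toy ambiguous grammar (hypothetical): S -> S S (0.4), S -> 'a' (0.6).
sentence = ["a", "a", "a"]
inside, outside = inside_outside(sentence, {("S", "a"): 0.6}, {("S", "S", "S"): 0.4})
print(expected_count(inside, outside, (0, "S", 2), len(sentence)))
```

In this toy grammar the sentence has two equally likely parses, and the span (0, S, 2) appears in exactly one of them, so its expected count is one half. Re-estimation then replaces each rule probability with its normalized expected count.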
Evaluation
• Head-percolation of Penn Treebank parses;
• % edges correct (directed or undirected) in the best (P)CFG parse…
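The directed and undirected edge accuracies can be computed as sketched below. The head-array encoding (one head index per word, -1 for the root), the exclusion of the root from the edge count, and the function names are assumptions made for illustration; the slides do not specify these details. The sketch also assumes sentences of at least two words, so that there is at least one gold edge.

```python
def edges(heads):
    """Turn a head array into a set of (head, child) edges, skipping the root."""
    return {(h, c) for c, h in enumerate(heads) if h >= 0}

def attachment_scores(gold_heads, pred_heads):
    """Return (directed %, undirected %) of gold edges recovered by the prediction."""
    gold, pred = edges(gold_heads), edges(pred_heads)
    gold_undirected = {frozenset(e) for e in gold}
    pred_undirected = {frozenset(e) for e in pred}
    directed = len(gold & pred) / len(gold)
    undirected = len(gold_undirected & pred_undirected) / len(gold_undirected)
    return 100.0 * directed, 100.0 * undirected

# "the dog barks": gold has the <- dog <- barks, with "barks" as root.
gold = [1, 2, -1]
# Hypothetical prediction: every word attaches to the word on its left.
pred = [-1, 0, 1]
print(attachment_scores(gold, pred))  # (0.0, 100.0)
```

The example shows why the undirected figure is never lower than the directed one: a prediction that links the right pairs of words but reverses every arrow scores zero directed and full marks undirected.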
• Zero Knowledge: 14.4 (29.9)
• Adjacent Word Heuristic: 33.6
• Klein & Manning: 43.2 (63.7)
• Oracle: 75.5 (77.5)
• - P_attach: 60.0 (63.3)
• - P_stop: 53.9 (57.7)
• - P_stopA: 50.0 (54.8)
• - P_stopN: 12.5 (30.8)
EM
• Didn’t work out… always made things worse, even when initialized with very good solutions.
• If started using Zero Knowledge, then after 1 iteration already gets 18.4 (38.4), then worsens.
• If started using an ad-hoc harmonic initialization for P_attach, then 21.5 (47.1) after 1 iteration, then worse; similarly even when started from the Oracle solution…
• Summary:
  • DMV: useful, simple, extensible model;
  • EM: more thorough debugging needed.