An Attempt at Unsupervised Learning of Hierarchical Dependency Parsing via the Dependency Model with Valence (DMV)

Upload: ernst

Post on 07-Jan-2016


Page 1: An Attempt at Unsupervised Learning of Hierarchical Dependency Parsing

An Attempt at Unsupervised Learning of Hierarchical Dependency Parsing

via the Dependency Model with Valence (DMV)

Page 2: An Attempt at Unsupervised Learning of Hierarchical Dependency Parsing

Motivation

• Dependency Parsing:
  • Search Query Refinement
  • Statistical Machine Translation

• Unsupervised Learning:
  • Availability of Large Quantities of Data

Page 3: An Attempt at Unsupervised Learning of Hierarchical Dependency Parsing

DMV

• Pick a direction (left or right).
• Generate the first child, or stop.
• Generate more children, until stop.
• Repeat in the other direction.
• Recurse…

Model parameters:
• Porder
• Pstop
• Pattach
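The generative steps above can be sketched as a small sampler. This is a minimal illustration, not the implementation behind these slides: the tag set, the uniform toy probabilities, the layout of the `params` dictionary, the depth cap, and the choice to condition Pstop on whether a child has already been generated in the current direction are all assumptions made for the example.

```python
import random

TAGS = ["V", "N", "D"]  # hypothetical part-of-speech tags
DIRECTIONS = ("left", "right")

# Toy parameters (assumed values, uniform over tags).
params = {
    # Porder: probability that a head expands to the left first.
    "order": {t: 0.5 for t in TAGS},
    # Pstop: probability of stopping, given head, direction, and whether
    # no child has been generated yet in that direction (adjacent).
    "stop": {(t, d, adj): 0.7
             for t in TAGS for d in DIRECTIONS for adj in (True, False)},
    # Pattach: distribution over child tags, given head and direction.
    "attach": {(t, d): {c: 1.0 / len(TAGS) for c in TAGS}
               for t in TAGS for d in DIRECTIONS},
}


def generate(head, params, rng, depth=0, max_depth=4):
    """Sample a tree as (head, left_children, right_children)."""
    sides = {"left": [], "right": []}
    # Pick a direction to expand first, then repeat in the other one.
    first = "left" if rng.random() < params["order"][head] else "right"
    second = "right" if first == "left" else "left"
    for direction in (first, second):
        adjacent = True  # True until the first child exists on this side
        while depth < max_depth:  # depth cap keeps the toy sample small
            if rng.random() < params["stop"][(head, direction, adjacent)]:
                break
            dist = params["attach"][(head, direction)]
            child = rng.choices(list(dist), weights=list(dist.values()))[0]
            # Recurse: the child generates its own dependents.
            sides[direction].append(
                generate(child, params, rng, depth + 1, max_depth))
            adjacent = False
    return (head, sides["left"], sides["right"])


def linearize(tree):
    """Read the tag sequence off a sampled tree, left to right."""
    head, left, right = tree
    words = []
    for child in reversed(left):  # left children are stored nearest-first
        words.extend(linearize(child))
    words.append(head)
    for child in right:
        words.extend(linearize(child))
    return words


if __name__ == "__main__":
    tree = generate("V", params, random.Random(0))
    print(linearize(tree))
```

Children are stored in the order they are generated, outward from the head, which is why `linearize` reverses the left side before reading it off.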

Page 4: An Attempt at Unsupervised Learning of Hierarchical Dependency Parsing

EM

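• Inside-Outside Algorithm:
  • Inside: Pi(i,X,j) = P(X derives i…j)
  • Outside: Po(i,X,j) = P(S derives 0…i X j…l), where l is the sentence length

• Re-Estimation:
  • Expected frequency of sub-tree (i,X,j) = Pi(i,X,j) * Po(i,X,j) / Pi(0,S,l), i.e. the inside-outside product normalized by the probability of the whole sentence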
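The inside and outside quantities above can be sketched for a generic PCFG in Chomsky normal form. This is a minimal illustration under stated assumptions, not the DMV-specific code behind these slides: the function names, the rule dictionaries, the half-open span convention `words[i:j]`, and the one-rule toy grammar are all hypothetical.

```python
from collections import defaultdict


def inside_outside(words, binary, lexical, root="S"):
    """Inside/outside tables for a CNF PCFG.

    binary:  {(X, Y, Z): prob} for rules X -> Y Z
    lexical: {(X, word): prob} for rules X -> word
    inside[i, X, j]  = P(X derives words[i:j])
    outside[i, X, j] = P(root derives words[:i] X words[j:])
    """
    n = len(words)
    inside = defaultdict(float)
    outside = defaultdict(float)

    # Inside pass: width-1 spans first, then wider spans bottom-up.
    for i, w in enumerate(words):
        for (X, word), p in lexical.items():
            if word == w:
                inside[i, X, i + 1] += p
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for (X, Y, Z), p in binary.items():
                for k in range(i + 1, j):
                    inside[i, X, j] += p * inside[i, Y, k] * inside[k, Z, j]

    # Outside pass: widest parents first, pushing mass down to children.
    outside[0, root, n] = 1.0
    for width in range(n, 1, -1):
        for i in range(n - width + 1):
            j = i + width
            for (X, Y, Z), p in binary.items():
                out = outside[i, X, j]
                if out == 0.0:
                    continue
                for k in range(i + 1, j):
                    outside[i, Y, k] += p * out * inside[k, Z, j]
                    outside[k, Z, j] += p * out * inside[i, Y, k]
    return inside, outside


def expected_count(inside, outside, i, X, j, n, root="S"):
    """Posterior expected frequency of sub-tree (i, X, j) in one sentence."""
    sentence_prob = inside[0, root, n]
    return inside[i, X, j] * outside[i, X, j] / sentence_prob


if __name__ == "__main__":
    # Toy grammar: S -> S S (0.3), S -> 'a' (0.7).
    words = ["a", "a", "a"]
    binary = {("S", "S", "S"): 0.3}
    lexical = {("S", "a"): 0.7}
    inside, outside = inside_outside(words, binary, lexical)
    print(expected_count(inside, outside, 0, "S", 2, len(words)))
```

In the toy grammar the sentence "a a a" has two equally likely bracketings, so the span covering the first two words appears in exactly one of them.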

Page 5: An Attempt at Unsupervised Learning of Hierarchical Dependency Parsing

Evaluation

• Gold dependencies: head-percolation of Penn Treebank parses.
• Metric: % of edges correct in the best (P)CFG parse, directed (undirected in parentheses).

• Zero Knowledge: 14.4 (29.9)
• Adjacent Word Heuristic: 33.6
• Klein & Manning: 43.2 (63.7)
• Oracle: 75.5 (77.5)
  - Pattach: 60.0 (63.3)
  - Pstop: 53.9 (57.7)
  - PstopA: 50.0 (54.8)
  - PstopN: 12.5 (30.8)
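The directed and undirected edge scores can be computed along the following lines. This is a minimal sketch, not the scoring script used for the numbers above: the head-index encoding (one head index per word, -1 for the root), the exclusion of the root from the edge count, and the function name are assumptions.

```python
def attachment_scores(gold_heads, pred_heads):
    """Return (directed %, undirected %) of gold edges recovered.

    heads[d] is the index of word d's head, or -1 if word d is the root.
    Root attachments are not counted as edges (an assumption).
    """
    gold = {(h, d) for d, h in enumerate(gold_heads) if h >= 0}
    pred = {(h, d) for d, h in enumerate(pred_heads) if h >= 0}
    if not gold:
        return 0.0, 0.0
    # Directed: head and dependent must both match.
    directed = len(gold & pred)
    # Undirected: the two words must be linked, in either direction.
    gold_links = {frozenset(edge) for edge in gold}
    pred_links = {frozenset(edge) for edge in pred}
    undirected = len(gold_links & pred_links)
    return 100.0 * directed / len(gold), 100.0 * undirected / len(gold)


if __name__ == "__main__":
    gold = [1, -1, 1]   # word 1 heads words 0 and 2
    pred = [-1, 0, 1]   # word 0 heads word 1, word 1 heads word 2
    print(attachment_scores(gold, pred))
```

In the usage example the predicted parse links the right pairs of words but reverses one edge, so it scores lower on the directed measure than on the undirected one.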

Page 6: An Attempt at Unsupervised Learning of Hierarchical Dependency Parsing

EM

• EM didn’t work out: it always made things worse, even when initialized with very good solutions.

• Started from Zero Knowledge, it already reaches 18.4 (38.4) after 1 iteration, then worsens.

• Started from an Ad-Hoc Harmonic initializer for Pattach, it reaches 21.5 (47.1) after 1 iteration, then worsens; the same happens even when starting from the Oracle solution…

• Summary:
  - DMV: useful, simple, extensible model;
  - EM: more thorough debugging needed.