TRANSCRIPT
PRE-ORDERING DEPENDENCY SUBTREES
FOR PHRASE-BASED SMT
Intern: Arianna Bisazza. Mentors: Alex Ceausu, John Tinsley
Dependency subtree pre-ordering
What if… we can’t (or don’t want to) change the decoding process, but we have dependency parses available?
One way to go: pre-order the input parse trees, then translate normally.
Main research problems: how to pre-order (the ordering model) and what to pre-order (rule selection).
Dependency subtree pre-ordering
“Die Budapester Staat anwaltschaft hat ihre Ermittlungen zum Vorfall eingeleitet.” (word-by-word: the Budapest Prosecutor’s Office has its investigation into the incident initiated)
[Figure: dependency parse of the sentence. The tree is rooted at hat|VAFIN; its children are anwaltschaft|NN (SB), eingeleitet|VVPP (OC) and .|$. (PUNC). anwaltschaft governs die|ART, Budapester|NN and Staat|NN (all NK); eingeleitet governs Ermittlungen|NN (OA), which carries the modifiers ihre|PPOSAT (NK) and zum|APPRART Vorfall|NN (MNR/NK).]
Permute subtrees (a node + its children)
Each subtree processed independently
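The subtree-by-subtree permutation described above could be sketched as follows. This is a minimal illustration, not the talk's implementation: a tree is a nested list whose unique string element is the head label (written with a leading "_" as on the slides), and `toy_model` is an invented ordering rule that fronts a past-participle head, mirroring German-to-English verb placement.

```python
def head_label(subtree):
    """The head of a subtree is its unique string element."""
    return next(x for x in subtree if isinstance(x, str))

def preorder(subtree, order_model):
    """Pre-order a dependency tree bottom-up: each subtree (a head plus its
    children) is permuted independently. `order_model` maps the local label
    sequence to a permutation of its indices."""
    items = [preorder(x, order_model) if isinstance(x, list) else x
             for x in subtree]
    labels = [x if isinstance(x, str) else head_label(x).lstrip("_")
              for x in items]
    perm = order_model(labels)
    return [items[i] for i in perm]

def flatten(subtree):
    """Read off the node sequence of a (pre-ordered) tree."""
    out = []
    for x in subtree:
        if isinstance(x, list):
            out.extend(flatten(x))
        else:
            out.append(x)
    return out

def toy_model(labels):
    """Illustrative rule: move a past-participle head (VVPP) in front of its
    dependents; leave every other subtree unchanged."""
    order = list(range(len(labels)))
    for i, lab in enumerate(labels):
        if lab.startswith("_") and lab.endswith("VVPP"):
            order = [i] + [j for j in order if j != i]
    return order

# the example sentence's tree, using rel|POS labels as on the slide
sent = [["_SB|NN"], "_ROOT|VAFIN", [["_OA|NN"], "_OC|VVPP"], ["_PUNC|$."]]
reordered = flatten(preorder(sent, toy_model))
# the OA|NN / _OC|VVPP subtree is swapped; all other subtrees keep their order
```

Because each subtree is processed independently, the model only ever decides over short local sequences, never over the whole sentence.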
Pre-ordering model (1) – MLE
Baseline model: maximum-likelihood estimation (MLE, relative-frequency based). Subtree representation: relation type and POS tag.
OA|NN _OC|VVPP → _OC|VVPP OA|NN   Prob = 0.75
OA|NN _OC|VVPP → OA|NN _OC|VVPP   Prob = 0.25
Limitations:
- ambiguity due to coarse word classification (only a few relation/POS tags)
- coverage: many unseen or low-count subtrees
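The relative-frequency model could be sketched like this. The training counts below are invented to reproduce the 0.75 / 0.25 example from the slide; unseen subtrees fall through unchanged, which is exactly the coverage limitation noted above.

```python
from collections import Counter, defaultdict

# Invented training observations: (original node sequence, reordered sequence).
training = [
    (("OA|NN", "_OC|VVPP"), ("_OC|VVPP", "OA|NN")),
    (("OA|NN", "_OC|VVPP"), ("_OC|VVPP", "OA|NN")),
    (("OA|NN", "_OC|VVPP"), ("_OC|VVPP", "OA|NN")),
    (("OA|NN", "_OC|VVPP"), ("OA|NN", "_OC|VVPP")),
]
counts = defaultdict(Counter)
for src, reordered in training:
    counts[src][reordered] += 1

def mle_reorder(seq):
    """Most likely reordering of `seq` by relative frequency; subtrees never
    seen in training are passed through unchanged (the coverage problem)."""
    seq = tuple(seq)
    if seq not in counts:
        return seq
    return counts[seq].most_common(1)[0][0]

# relative frequency of the swapped order, as in the slide's example
prob = counts[("OA|NN", "_OC|VVPP")][("_OC|VVPP", "OA|NN")] / 4
```
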
Pre-ordering model (2) – SMT
Idea: learn to reorder by SMT! Train a phrase-based system on pairs of original/pre-ordered source-language node sequences (subtrees).
Advantages:
- generalization: all node sequences can be processed
- model flexibility: different features can be represented as “factors”
- different model weights can be tuned by MERT
ORIGINAL:
SB|NN _ROOT|VAFIN OC|VVPP PUNC|$.
NK|ART NK|NN NK|NN _SB|NN
OA|NN _OC|VVPP
...

PRE-ORDERED:
SB|NN _ROOT|VAFIN OC|VVPP PUNC|$.
NK|ART NK|NN NK|NN _SB|NN
_OC|VVPP OA|NN
...
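Concretely, the reordering system is trained like any phrase-based SMT system, on a line-aligned “bitext”: one side holds the original node sequence of each subtree, the other its pre-ordered version. A sketch (sequences taken from the slide; the file names mentioned in the comment are illustrative):

```python
# Each pair is (original node sequence, pre-ordered node sequence).
# Identity pairs are kept too, so the model also learns when NOT to reorder.
pairs = [
    ("SB|NN _ROOT|VAFIN OC|VVPP PUNC|$.", "SB|NN _ROOT|VAFIN OC|VVPP PUNC|$."),
    ("NK|ART NK|NN NK|NN _SB|NN",         "NK|ART NK|NN NK|NN _SB|NN"),
    ("OA|NN _OC|VVPP",                    "_OC|VVPP OA|NN"),
]
orig_side = "\n".join(src for src, _ in pairs)
preord_side = "\n".join(tgt for _, tgt in pairs)
# these two line-aligned texts would be written to e.g. train.orig and
# train.preord and fed to a phrase-based trainer such as Moses
```
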
Pre-ordering model (2) – SMT
Possible models:
- original-to-preordered phrase table
- “target” (pre-ordered) n-gram language models
- lexicalized reordering models at the level of relation type, POS tags or words, etc.
All models are log-linearly combined; weights are tuned by MERT, optimizing the reordering score (KRS).
Each feature type is represented as a factor, for example:

ORIGINAL:
SB|NN|anwaltschaft _ROOT|VAFIN|hat OC|VVPP|eingeleitet PUNC|$.|.
NK|ART|die NK|NN|Budapester NK|NN|Staat _SB|NN|anwaltschaft
OA|NN|Ermittlungen _OC|VVPP|eingeleitet
...
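The two ingredients could be sketched as follows: splitting a node into its factors, and combining model probabilities log-linearly (the weights would be what MERT tunes; the probabilities and weights below are invented).

```python
import math

# A node string carries its factors: relation | POS | surface word.
node = "SB|NN|anwaltschaft"
rel, pos, word = node.split("|")

def loglinear_score(model_probs, weights):
    """score(candidate) = sum_i w_i * log p_i(candidate) — the standard
    log-linear combination used by phrase-based decoders."""
    return sum(w * math.log(p) for p, w in zip(model_probs, weights))

# e.g. phrase-table prob 0.5 and reordering-model prob 0.25, weights 1.0 / 0.5
score = loglinear_score([0.5, 0.25], [1.0, 0.5])
```
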
Evaluation
Training/dev/test: 495K/2.5K/2.5K sentences from the WMT-12 De-En training data; 1.6M/8K/9K training subtrees (rooted at verb nodes).
Method              Add. models        BLEU   KRS    ACC    UNK
MLE-rel             --                 57.77  71.01  46.35  9.53
MLE-relPOS          --                 55.00  71.03  45.75  24.33
SMT-relPOS (Moses)  LM(rel)+LM(POS)    58.84  72.57  48.45  --
                    +lexreo(relPOS)    60.24  73.05  49.37  --
                    +lexreo(words)     59.87  72.92  49.05  --
                    +lexreo(nodeSpan)  59.72  72.94  49.03  --
4.8M/23K/24K training subtrees (all with >1 node):

Method              Add. models      BLEU   KRS    ACC    UNK
MLE-relPOS          --               63.38  77.27  63.08  11.91
SMT-relPOS (Moses)  +lexreo(relPOS)  66.74  78.24  64.69  --
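The KRS columns above measure how closely the predicted word order matches the reference order. A simplified Kendall-tau-style stand-in (not the exact KRS formulation, which works on word alignments) could look like:

```python
from itertools import combinations

def kendall_reordering_score(pred, ref):
    """Fraction of token pairs placed in the same relative order in `pred`
    as in `ref` (both are sequences of unique token ids); 1.0 = identical
    order, 0.0 = fully reversed."""
    pos = {tok: i for i, tok in enumerate(pred)}
    pairs = list(combinations(range(len(ref)), 2))
    agree = sum(pos[ref[a]] < pos[ref[b]] for a, b in pairs)
    return agree / len(pairs)
```
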
Selective pre-ordering
Not all subtrees need to be pre-ordered (especially in language pairs like German-English). How to select them?
Approach: compute the average distortion gain on the training data, then pre-order only subtrees with a high distortion gain.
Pre-ordering performance with two different thresholds:
Selection            %subtrees  Method  Add. models      BLEU   KRS    ACC    UNK
None (all subtrees)  100%       MLE     --               63.38  77.27  63.08  11.91
                                SMT     +lexreo(relPOS)  66.74  78.24  64.69  --
HRPd15f3             13%        MLE     --               55.59  72.64  40.09  30.43
                                SMT     +lexreo(relPOS)  60.68  75.02  44.98  --
SRPd20f3             3%         MLE     --               45.17  60.96  24.34  8.70
                                SMT     +lexreo(relPOS)  51.70  65.21  28.76  --
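The distortion-gain selection could be sketched like this. Both the displacement measure and the training cases are illustrative, not the exact quantities behind the thresholds above.

```python
from collections import defaultdict

def distortion(perm):
    """Total displacement of each node from its target-order position —
    a simple proxy for the distortion a phrase-based decoder would pay."""
    return sum(abs(i, p) if False else abs(i - p) for i, p in enumerate(perm))

# invented training cases: (subtree type, target-order permutation observed)
cases = [
    ("OC|VVPP", (1, 0)),     # object/participle swap: distortion 2
    ("OC|VVPP", (1, 0)),
    ("NK|NN",   (0, 1, 2)),  # noun phrase already in target order: gain 0
]
gains = defaultdict(list)
for subtree_type, perm in cases:
    gains[subtree_type].append(distortion(perm))
avg_gain = {t: sum(g) / len(g) for t, g in gains.items()}

# pre-order only subtree types whose average gain clears a threshold
selected = {t for t, g in avg_gain.items() if g >= 2.0}
```

Raising the threshold shrinks the selected set, which is how the different %subtrees settings in the table arise.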
MT experiments
Using WMT-12 De-En training and test data
Input         DL  newstest2009    newstest2010
                  BLEU   KRS      BLEU   KRS
Original      4   19.43  62.70    20.96  66.00
Reo.all       4   20.25  62.20    21.88  65.48
Reo.verbRoot  4   20.27  62.27    21.97  65.48
Reo.HRPd15f3  4   19.70  62.51    21.26  65.91
Reo.SRPd20f3  4   19.55  62.67    21.08  65.95
Original      8   20.18  63.14    21.68  66.15
Reo.all       8   20.34  61.85    21.97  65.00
Reo.verbRoot  8   20.38  62.00    22.09  65.06
Reo.HRPd15f3  8   20.35  62.67    21.82  65.89
Reo.SRPd20f3  8   20.25  62.99    21.73  66.03
(DL = distortion limit)
MT output examples (1)
ORI: nach dem steilen Abfall am Morgen konnte die Prager Börse die Verluste korrigieren .
REO: nach dem steilen Abfall am Morgen die Prager Börse konnte die Verluste korrigieren .
REF: after a sharp drop in the morning , the Prague Stock Market corrected its losses .
BASE: after the sharp falls on the morning , the Prague Stock Exchange to correct the losses .
NEW: after the sharp falls on the morning the Prague Stock Exchange was able to correct the losses .
MT output examples (2)
ORI: … über einen Plan , der funktionieren wird und der auf dem Markt auch wirksam sein muss .
REO: … über einen Plan , der wird funktionieren und der muss sein auch wirksam auf dem Markt .
REF: … on a plan which will function and which also must be effective on the market .
BASE: … on a plan that will work and on the market also needs to be effective .
NEW: … on a plan that will work and must also be effective on the market .
MT output examples (3)
ORI: die Kongress Abgeordneten müssen nämlich noch einige Details der Vereinbarung aushandeln , ehe sie die Endfassung des Gesetzes veröffentlichen und darüber abstimmen dürfen .
REO: die Kongress Abgeordneten müssen nämlich aushandeln , ehe sie veröffentlichen die Endfassung des Gesetzes und dürfen darüber abstimmen noch einige Details der Vereinbarung .
REF: that is , the members of congress have to complete some details of the agreement before they can make the final version of the law public and vote on it .
BASE: members of Congress : some details must still negotiate the agreement before they publish the final version of the law and able to vote on it .
NEW: members of Congress must negotiate before they publish the final version of the law and must still vote on some details of the agreement .
Conclusions & TODO’s
Pre-ordering with an SMT-like system always outperforms the MLE baseline, but the gains are small
Evaluation issue: reference reorderings are very noisy!
When the input is pre-ordered, BLEU improves but KRS decreases… more error analysis is needed!
Possible reason: the SMT system must be re-trained (or at least tuned) on pre-ordered data
More thresholds for rule selection should be tested… other suggestions?
Thanks for your attention!