PRE-ORDERING DEPENDENCY SUBTREES FOR PHRASE-BASED SMT Intern: Arianna Bisazza . Mentors: Alex Ceausu, John Tinsley


Post on 06-Apr-2015



Page 2

Dependency subtree pre-ordering

What if…
- we can’t or don’t want to change the decoding process
- we have dependency parses available

…one way to go: pre-order the input parse trees, then translate normally

Main research problems:
- how to pre-order? (ordering model)
- what to pre-order? (rule selection)

Page 3

Dependency subtree pre-ordering

“Die Budapester Staat anwaltschaft hat ihre Ermittlungen zum Vorfall eingeleitet.”
Gloss: the Budapest Prosecutor’s Office has its investigation on the accident initiated

[Figure: dependency tree of the example sentence. Nodes (word|POS): die|ART, Budapester|NN, Staat|NN, anwaltschaft|NN, hat|VAFIN, ihre|PPOSAT, Ermittlungen|NN, zum|APPRART, Vorfall|NN, eingeleitet|VVPP, .|$. Edge labels: NK, SB, OC, OA, MNR, PUNC. Insets show subtrees as POS sequences, e.g. NN VAFIN VVPP $. and NN VVPP.]

Permute subtrees (a node + its children)

Each subtree processed independently
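A minimal sketch of the two points above, under stated assumptions: the `Node` class, the `order_fn` hook, and the toy rule `verb_before_object` are hypothetical names that do not appear in the slides, and the tree is assumed to be projective. Each subtree (a head plus its direct children) is permuted on its own, and the sentence is re-linearized recursively.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical dependency node: a word plus its direct children."""
    word: str
    pos: str
    rel: str
    index: int  # position in the original sentence
    children: List["Node"] = field(default_factory=list)


def original_order(head: Node) -> List[Node]:
    """Head and direct children, sorted by original sentence position."""
    return sorted([head] + head.children, key=lambda n: n.index)


def linearize(head: Node,
              order_fn: Callable[[Node, List[Node]], List[Node]]) -> List[str]:
    """Permute one subtree, then recurse into each child independently."""
    words: List[str] = []
    for node in order_fn(head, original_order(head)):
        if node is head:
            words.append(head.word)
        else:
            words.extend(linearize(node, order_fn))
    return words


def verb_before_object(head: Node, nodes: List[Node]) -> List[Node]:
    """Toy rule (an assumption): a participle head moves to the front."""
    if head.pos == "VVPP":
        return [head] + [n for n in nodes if n is not head]
    return nodes


# Fragment of the example sentence: "hat ihre Ermittlungen eingeleitet"
ihre = Node("ihre", "PPOSAT", "NK", 1)
ermittlungen = Node("Ermittlungen", "NN", "OA", 2, [ihre])
eingeleitet = Node("eingeleitet", "VVPP", "OC", 3, [ermittlungen])
hat = Node("hat", "VAFIN", "ROOT", 0, [eingeleitet])

print(" ".join(linearize(hat, verb_before_object)))
# hat eingeleitet ihre Ermittlungen
```

Because `order_fn` only ever sees one head and its direct children, any ordering model (MLE or SMT-based) can be plugged in without touching the traversal.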


Page 5

Pre-ordering model (1) – MLE

Baseline model: maximum likelihood estimation (MLE), i.e. relative frequency-based

Subtree representation: relation type and POS tag

[Figure: example subtree OA|NN _OC|VVPP and its candidate orderings, with Prob=0.75 and Prob=0.25.]

Limitations:
- ambiguity due to coarse word classification (only a few relation/POS tags)
- coverage: many unseen or low-count subtrees
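The relative-frequency baseline can be sketched roughly as follows. This is an illustration only: the class name `MLEReorderer`, the fallback for unseen subtrees, and the toy counts are assumptions, not taken from the slides. The counts are chosen only to reproduce the two probabilities in the figure; which ordering received which probability is not recoverable from the transcript.

```python
from collections import Counter
from typing import Dict, List, Tuple

Subtree = Tuple[str, ...]  # e.g. ("OA|NN", "_OC|VVPP"); "_" marks the head


class MLEReorderer:
    """Relative-frequency model over relation|POS node sequences."""

    def __init__(self) -> None:
        self.counts: Dict[Subtree, Counter] = {}

    def train(self, pairs: List[Tuple[Subtree, Subtree]]) -> None:
        """Count how often each original subtree maps to each ordering."""
        for original, reordered in pairs:
            self.counts.setdefault(original, Counter())[reordered] += 1

    def prob(self, original: Subtree, reordered: Subtree) -> float:
        """P(reordered | original) as a relative frequency."""
        seen = self.counts.get(original)
        if not seen:
            return 0.0
        return seen[reordered] / sum(seen.values())

    def predict(self, original: Subtree) -> Subtree:
        """Most frequent ordering; unseen subtrees are left unchanged
        (a fallback assumed here)."""
        seen = self.counts.get(original)
        if not seen:
            return original
        return seen.most_common(1)[0][0]


# Toy training data (hypothetical counts).
orig = ("OA|NN", "_OC|VVPP")
swapped = ("_OC|VVPP", "OA|NN")
model = MLEReorderer()
model.train([(orig, swapped)] * 3 + [(orig, orig)])

print(model.prob(orig, swapped), model.prob(orig, orig))  # 0.75 0.25
print(model.predict(orig))  # ('_OC|VVPP', 'OA|NN')
```

Both limitations show up directly in this sketch: two subtrees with the same relation/POS sequence are indistinguishable, and `predict` has nothing to say about a sequence it never counted.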

Page 6

Pre-ordering model (2) – SMT

Idea: learn to reorder by SMT! Train a phrase-based system on pairs of original/pre-ordered source-language node sequences (subtrees).

Advantages:
- generalization: all node sequences can be processed
- model flexibility: different features are represented as “factors”, and different model weights are tuned by MERT

ORIGINAL
SB|NN _ROOT|VAFIN OC|VVPP PUNC|$.
NK|ART NK|NN NK|NN _SB|NN
OA|NN _OC|VVPP
...

PRE-ORDERED
SB|NN _ROOT|VAFIN OC|VVPP PUNC|$.
NK|ART NK|NN NK|NN _SB|NN
_OC|VVPP OA|NN
...

Page 7

Pre-ordering model (2) – SMT

Possible models:
- original-to-preordered phrase table
- “target” (pre-ordered) n-gram language models
- lexicalized reordering models at the level of relation type, POS tags or words
- etc.

All models are log-linearly combined; weights are tuned by MERT, optimizing the reordering score (KRS).

Each feature type is represented as a factor, for example:

ORIGINAL
SB|NN|anwaltschaft _ROOT|VAFIN|hat OC|VVPP|eingeleitet PUNC|$.|.
NK|ART|die NK|NN|Budapester NK|NN|Staat _SB|NN|anwaltschaft
OA|NN|Ermittlungen _OC|VVPP|eingeleitet
...
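Factored node strings of this kind could be produced along the following lines. The helper names (`factored_token`, `subtree_line`, `training_pair`) and the tuple layout are hypothetical; the only things taken from the slides are the `relation|POS|word` factor order and the `_` prefix on the subtree head. The `|` separator matches the factor delimiter used by Moses.

```python
from typing import List, Sequence, Tuple

NodeInfo = Tuple[str, str, str, bool]  # (relation, POS, word, is_head)


def factored_token(rel: str, pos: str, word: str, is_head: bool) -> str:
    """Join the factors with '|'; the head's relation gets a '_' prefix."""
    prefix = "_" if is_head else ""
    return f"{prefix}{rel}|{pos}|{word}"


def subtree_line(nodes: Sequence[NodeInfo]) -> str:
    """One subtree becomes one space-separated training line."""
    return " ".join(factored_token(*node) for node in nodes)


def training_pair(nodes: Sequence[NodeInfo],
                  permutation: List[int]) -> Tuple[str, str]:
    """Return (original, pre-ordered) lines.

    permutation[i] is the original index of the i-th node in the
    pre-ordered sequence.
    """
    reordered = [nodes[i] for i in permutation]
    return subtree_line(nodes), subtree_line(reordered)


nodes = [
    ("OA", "NN", "Ermittlungen", False),
    ("OC", "VVPP", "eingeleitet", True),
]
source, target = training_pair(nodes, [1, 0])
print(source)  # OA|NN|Ermittlungen _OC|VVPP|eingeleitet
print(target)  # _OC|VVPP|eingeleitet OA|NN|Ermittlungen
```

Writing all `source` lines to one file and all `target` lines to another would give the kind of parallel data a phrase-based system is trained on.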

Page 8

Evaluation

Training/dev/test: 495K/2.5K/2.5K sentences from WMT-12 De-En training data

1.6M/8K/9K training subtrees (rooted at verb nodes)

Method              Add. models         BLEU   KRS    ACC    UNK
MLE-rel             --                  57.77  71.01  46.35  9.53
MLE-relPOS          --                  55.00  71.03  45.75  24.33
SMT-relPOS (moses)  LM(rel) +LM(POS)    58.84  72.57  48.45  --
                    +lexreo(relPOS)     60.24  73.05  49.37  --
                    +lexreo(words)      59.87  72.92  49.05  --
                    +lexreo(nodeSpan)   59.72  72.94  49.03  --

4.8M/23K/24K training subtrees (all with >1 node)

Method              Add. models         BLEU   KRS    ACC    UNK
MLE-relPOS          --                  63.38  77.27  63.08  11.91
SMT-relPOS (moses)  +lexreo(relPOS)     66.74  78.24  64.69  --
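The slides do not define KRS or ACC. As a rough illustration only, the sketch below assumes that KRS is a Kendall's-tau-based reordering score in the square-root form (1 minus the square root of the fraction of discordant pairs) and that ACC is exact-match accuracy over predicted orderings; both function names are hypothetical, and the actual evaluation may differ (for instance by adding a length penalty).

```python
from itertools import combinations
from math import sqrt
from typing import List, Sequence


def kendall_score(predicted: Sequence[int], reference: Sequence[int]) -> float:
    """1 - sqrt(share of item pairs whose relative order differs)."""
    n = len(reference)
    if n < 2:
        return 1.0
    ref_rank = {item: rank for rank, item in enumerate(reference)}
    ranks = [ref_rank[item] for item in predicted]
    discordant = sum(
        1 for i, j in combinations(range(n), 2) if ranks[i] > ranks[j]
    )
    return 1.0 - sqrt(discordant / (n * (n - 1) / 2))


def exact_match_accuracy(predicted: List[Sequence[int]],
                         reference: List[Sequence[int]]) -> float:
    """Share of subtrees whose predicted order equals the reference."""
    hits = sum(list(p) == list(r) for p, r in zip(predicted, reference))
    return hits / len(reference)


print(kendall_score([0, 1, 2, 3], [0, 1, 2, 3]))  # 1.0
print(kendall_score([3, 2, 1, 0], [0, 1, 2, 3]))  # 0.0
print(exact_match_accuracy([[1, 0], [0, 1]], [[1, 0], [1, 0]]))  # 0.5
```

Under these assumptions a single swapped pair costs relatively more than under a linear Kendall distance, which is one reason a pair-based score and BLEU can move in different directions.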

Page 9

Selective pre-ordering

Not all subtrees need to be pre-ordered (especially in language pairs like German-English). How to select them?

Approach: compute the average distortion gain on the training data, then pre-order only subtrees with high distortion gain.

Pre-ordering performance with two different thresholds:

Selection            %subtrees  Method  Add. models       BLEU   KRS    ACC    UNK
None (all subtrees)  100%       MLE     --                63.38  77.27  63.08  11.91
                                SMT     +lexreo(relPOS)   66.74  78.24  64.69  --
HRPd15f3             13%        MLE     --                55.59  72.64  40.09  30.43
                                SMT     +lexreo(relPOS)   60.68  75.02  44.98  --
SRPd20f3             3%         MLE     --                45.17  60.96  24.34  8.70
                                SMT     +lexreo(relPOS)   51.70  65.21  28.76  --
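The selection step could look roughly like the sketch below. The slides do not say how distortion gain is computed, so everything here is an assumption: distortion is taken as the total jump size when nodes are visited in reference order, the gain of a subtree is taken to equal that distortion (pre-ordering into reference order would make it monotone), and `min_gain` and `min_freq` are hypothetical parameters.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def distortion(order: List[int]) -> int:
    """Total jump size when positions are visited in `order`
    (0 for a monotone sequence)."""
    return sum(abs(b - a - 1) for a, b in zip(order, order[1:]))


def select_rules(instances: Iterable[Tuple[str, List[int]]],
                 min_gain: float, min_freq: int = 1) -> Set[str]:
    """Keep subtree types whose average distortion gain reaches min_gain.

    Each instance is (subtree type, reference order of its nodes).
    """
    gains: Dict[str, List[int]] = defaultdict(list)
    for rule, ref_order in instances:
        gains[rule].append(distortion(ref_order))
    return {
        rule for rule, g in gains.items()
        if len(g) >= min_freq and sum(g) / len(g) >= min_gain
    }


# Toy training instances (hypothetical).
instances = [
    ("OA|NN _OC|VVPP", [1, 0]),
    ("OA|NN _OC|VVPP", [1, 0]),
    ("OA|NN _OC|VVPP", [0, 1]),
    ("NK|ART _SB|NN", [0, 1]),
]
print(select_rules(instances, min_gain=1.0))  # {'OA|NN _OC|VVPP'}
```

At test time only subtrees whose type is in the selected set would be passed to the ordering model; all others keep their original order.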

Page 10

MT experiments

Using WMT-12 De-En training and test data

Input          DL  newstest2009    newstest2010
                   BLEU   KRS      BLEU   KRS
Original       4   19.43  62.70    20.96  66.00
Reo.all        4   20.25  62.20    21.88  65.48
Reo.verbRoot   4   20.27  62.27    21.97  65.48
Reo.HRPd15f3   4   19.70  62.51    21.26  65.91
Reo.SRPd20f3   4   19.55  62.67    21.08  65.95
Original       8   20.18  63.14    21.68  66.15
Reo.all        8   20.34  61.85    21.97  65.00
Reo.verbRoot   8   20.38  62.00    22.09  65.06
Reo.HRPd15f3   8   20.35  62.67    21.82  65.89
Reo.SRPd20f3   8   20.25  62.99    21.73  66.03

Page 11

MT output examples (1)

ORI: nach dem steilen Abfall am Morgen konnte die Prager Börse die Verluste korrigieren .

REO: nach dem steilen Abfall am Morgen die Prager Börse konnte die Verluste korrigieren .

REF: after a sharp drop in the morning , the Prague Stock Market corrected its losses .

BASE: after the sharp falls on the morning , the Prague Stock Exchange to correct the losses .

NEW: after the sharp falls on the morning the Prague Stock Exchange was able to correct the losses .

Page 12

MT output examples (2)

ORI: … über einen Plan , der funktionieren wird und der auf dem Markt auch wirksam sein muss .

REO: … über einen Plan , der wird funktionieren und der muss sein auch wirksam auf dem Markt .

REF: … on a plan which will function and which also must be effective on the market .

BASE: … on a plan that will work and on the market also needs to be effective .

NEW: … on a plan that will work and must also be effective on the market .

Page 13

MT output examples (3)

ORI: die Kongress Abgeordneten müssen nämlich noch einige Details der Vereinbarung aushandeln , ehe sie die Endfassung des Gesetzes veröffentlichen und darüber abstimmen dürfen .

REO: die Kongress Abgeordneten müssen nämlich aushandeln , ehe sie veröffentlichen die Endfassung des Gesetzes und dürfen darüber abstimmen noch einige Details der Vereinbarung .

REF: that is , the members of congress have to complete some details of the agreement before they can make the final version of the law public and vote on it .

BASE: members of Congress : some details must still negotiate the agreement before they publish the final version of the law and able to vote on it .

NEW: members of Congress must negotiate before they publish the final version of the law and must still vote on some details of the agreement .

Page 14

Conclusions & TODO’s

Pre-ordering with an SMT-like system always outperforms the MLE baseline, but the gains are small

Evaluation issue: reference reorderings are very noisy!

When the input is pre-ordered, BLEU improves but KRS decreases... more error analysis needed!

Possible reason: the SMT system must be re-trained (or at least re-tuned) on pre-ordered data

More thresholds for rule selection should be tested

… other suggestions?

Page 15

Thanks for your attention!