statistical phrase alignment model using dependency relation probability
Post on 24-Feb-2016
Statistical Phrase Alignment Model Using Dependency Relation Probability
Toshiaki Nakazawa and Sadao Kurohashi, Kyoto University
Outline
Background
Tree-based Statistical Phrase Alignment Model
Model Training
Experiments
Conclusions
Conventional Word Sequence Alignment
受 (accept) 光 (light) 素子 (device) に (ni) は (ha) フォト (photo) ゲート (gate) を (wo) 用いた (used)
A photogate is used for the photodetector
[Figure: word alignment matrix produced by grow-diag-final-and for "... exhibited a strong inhibitory effect on tumor growth in the castrated mice as in the non-castrated mice ..." and "... 非 去勢 マウス と 同様に 去勢 マウス の 腫よう の 成長 に 対し 強い 抑制 効果 を 示した ..."]
Proposed Model
1. Dependency trees
2. Phrase alignment
3. Bi-directional agreement
[Figure: the example pair 受光素子にはフォトゲートを用いた / "A photogate is used for the photodetector" drawn as dependency trees with phrase-level links]
[Figure: alignment matrices for the "... exhibited a strong inhibitory effect on tumor growth in the castrated mice as in the non-castrated mice ..." sentence pair, comparing grow-diag-final-and on word sequences with the proposed model on dependency trees]
Related Work
Using tree structures: [Cherry and Lin, 2003], [Quirk et al., 2005], [Galley et al., 2006], ITG, …
Considering phrase alignment: [Zhang and Vogel, 2005], [Ion et al., 2006], …
Using two directed models simultaneously: [Liang et al., 2006], [Graca et al., 2008], …
Tree-based Statistical Phrase Alignment Model
Dependency Analysis of Sentences
Source (Japanese): 受光素子にはフォトゲートを用いた
Target (English): A photogate is used for the photodetector
[Figure: dependency trees of both sentences, annotated with word order and head node. Glosses: 受 (accept), 光 (light), 素子 (device), に (ni), は (ha), フォト (photo), ゲート (gate), を (wo), 用いた (used)]
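Overview of the Proposed Model (in comparison to the IBM models)
IBM models find the best alignment by
\[ \hat{a} = \arg\max_{a} p(f, a \mid e) = \arg\max_{a} p(f \mid e, a) \cdot p(a \mid e) \]
p(f | e, a): lexical prob. (word translation)
p(a | e): alignment prob. (word reordering)
f: source sentence, e: target sentence, a: alignment
Proposed model
The same decomposition, with p(f | e, a) as the phrase translation prob. and p(a | e) as the dependency relation prob.:
\[ \hat{a} = \arg\max_{a} p(f, a \mid e) = \arg\max_{a} p(f \mid e, a) \cdot p(a \mid e) \]
Both directions are used simultaneously, each with its own phrase translation and dependency relation factor:
\[ \hat{a} = \arg\max_{a} p(f, a \mid e) \cdot p(e, a \mid f) = \arg\max_{a} p(f \mid e, a) \cdot p(a \mid e) \cdot p(e \mid f, a) \cdot p(a \mid f) \]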
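The effect of bi-directional agreement in the objective above can be illustrated with a small sketch. The two candidate alignments and their four factor values are invented toy numbers, and the function names are hypothetical; the point is only that the alignment preferred by one direction alone need not be the one preferred when both directions are multiplied.

```python
import math

def log_score(factors):
    """Sum of logs of the factors, i.e. the log of their product."""
    return sum(math.log(p) for p in factors)

def best_alignment(candidates, bidirectional=True):
    """Pick the candidate with the highest score.

    candidates maps an alignment label to the tuple
    (p(f|e,a), p(a|e), p(e|f,a), p(a|f)).
    With bidirectional=False only the first two factors are used.
    """
    def score(label):
        factors = candidates[label]
        return log_score(factors if bidirectional else factors[:2])
    return max(candidates, key=score)

# Toy values (assumptions, not taken from the slides).
candidates = {
    "a1": (0.6, 0.5, 0.1, 0.2),   # strong in one direction only
    "a2": (0.5, 0.4, 0.5, 0.4),   # reasonable in both directions
}
print(best_alignment(candidates, bidirectional=False))
print(best_alignment(candidates, bidirectional=True))
```

Working in log space avoids underflow when many small probabilities are multiplied, which is a common design choice rather than something stated on the slides.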
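Phrase Translation Probability
Note that the sentences are not previously segmented into phrases
Proposed model:
\[ p(f \mid e, A) = \prod_{j=1}^{J} p(F_{s(j)} \mid E_{A_{s(j)}}) \]
IBM Model:
\[ p(f \mid e, a) = \prod_{j=1}^{J} p(f_j \mid e_{a_j}) \]
Example
source: words f1 ... f5 grouped into phrases F1, F2, F3
s(j) (phrase containing word fj): s(1) = 1, s(2) = 2, s(3) = 2, s(4) = 3, s(5) = 1
target: words e1 ... e4 grouped into phrases E1, E2, E3
A: A1 = 2, A2 = 3, A3 = 0 (NULL)
\[ p(f \mid e, A) = p(F_1 \mid E_2) \cdot p(F_2 \mid E_3) \cdot p(F_2 \mid E_3) \cdot p(F_3 \mid \mathrm{NULL}) \cdot p(F_1 \mid E_2) \]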
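The worked example on this slide can be traced in a few lines of Python. This is a minimal sketch: the mapping s and the alignment A are the ones given on the slide, while the probability table and the function name are hypothetical toy values chosen only to make the product computable.

```python
from math import prod

def phrase_translation_prob(s, A, table):
    """Product over source words j of p(F_s(j) | E_A_s(j)).

    s     : list where s[j-1] is the phrase index of source word j
    A     : dict from source phrase index to target phrase index (0 = NULL)
    table : dict from (source phrase, target phrase) to a probability
    """
    return prod(table[(phrase, A[phrase])] for phrase in s)

s = [1, 2, 2, 3, 1]          # s(1)..s(5) from the slide
A = {1: 2, 2: 3, 3: 0}       # A1 = 2, A2 = 3, A3 = 0 (NULL)

# Toy probabilities (assumptions, not from the slides).
table = {(1, 2): 0.5, (2, 3): 0.4, (3, 0): 0.1}

print(phrase_translation_prob(s, A, table))
```

Because the product runs over words rather than phrases, a phrase covering two words contributes its probability twice, exactly as in the five-factor expansion on the slide.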
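Dependency Relation Probability
Dependency Relations
For a source parent-child pair (fp, fc), rel(fc, fp) describes how the aligned target phrases E_{A_s(c)} and E_{A_s(p)} are related in the target tree:
Parent-child: rel(fc, fp) = c
Grandparent-child: rel(fc, fp) = c;c
Inverted parent-child: rel(fc, fp) = p
NULL-aligned: rel(fc, fp) = NULL_p
[Figure: source and target tree fragments illustrating each case]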
Dependency Relation Probability
Ds-pc is a set of parent-child word pairs in the source sentence
\[ p(A \mid e) = \prod_{(p,c) \in D_{s\text{-}pc}} p_t(\mathrm{rel}(f_c, f_p)) \]
Source-side dependency relation probability is defined in the same manner
\[ p(A \mid f) = \prod_{(p,c) \in D_{t\text{-}pc}} p_s(\mathrm{rel}(e_c, e_p)) \]
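A rough sketch of how rel and the product over parent-child pairs could be computed is given below. It rests on several assumptions that the slides do not spell out: "c" is read as one step up the target tree and "p" as one step down, tree nodes stand for phrases, the child of every pair is aligned, and child and parent are aligned to different target phrases. The function names are hypothetical, and the probability values reuse the illustrative numbers p(c) = 0.4, p(c;c) = 0.3, p(p) = 0.2 from the training slide.

```python
def relation(start, goal, parent):
    """Path from start to goal in a tree given as a child -> parent dict.

    Each step towards the root is written 'c', each step away from it 'p'
    (an assumed reading of the slide's labels). goal=None means the
    parent word is NULL-aligned.
    """
    if start is None:
        raise ValueError("this sketch assumes the child word is aligned")
    if goal is None:
        return "NULL_p"
    chain = [start]                       # start followed by its ancestors
    while parent.get(chain[-1]) is not None:
        chain.append(parent[chain[-1]])
    node, steps_down = goal, 0
    while node not in chain:              # climb from goal to the common ancestor
        node = parent[node]
        steps_down += 1
    steps_up = chain.index(node)
    return ";".join(["c"] * steps_up + ["p"] * steps_down)

def dependency_relation_prob(pairs, A, target_parent, p_t):
    """Product of p_t(rel) over source parent-child pairs (p, c)."""
    prob = 1.0
    for p, c in pairs:
        prob *= p_t[relation(A[c], A[p], target_parent)]
    return prob

target_parent = {1: None, 2: 1, 3: 2}     # toy target tree: 1 <- 2 <- 3
p_t = {"c": 0.4, "c;c": 0.3, "p": 0.2}    # illustrative values
pairs = [(1, 2), (2, 3)]                  # source (parent, child) pairs
A = {1: 1, 2: 3, 3: 2}                    # source node -> target node
print(dependency_relation_prob(pairs, A, target_parent, p_t))
```

In this toy case the first pair maps to a grandparent-child relation (c;c) and the second to an inverted parent-child relation (p), so the product is 0.3 times 0.2.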
Model Training
Step 1 (word base): Estimate word translation prob. (IBM Model 1)
  Initialize dependency relation prob.
Step 2 (tree base): Estimate phrase translation prob. and dependency relation prob.
  E-step
    1. Create initial alignment
    2. Modify the alignment by hill-climbing
    Generate possible phrases
  M-step: Parameter estimation
Example word translation prob.: p(コロラド | Colorado) = 0.7, p(大学 | university) = 0.6, …
Example dependency relation prob.: p(c) = 0.4, p(c;c) = 0.3, p(p) = 0.2, …
Example phrase translation prob.: p(コロラド | Colorado) = 0.7, p(大学 | university) = 0.6, p(コロラド 大学 | university of Colorado) = 0.9, …
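Step 1 relies on IBM Model 1, whose EM training is standard. The sketch below is a textbook version under simple assumptions (uniform initialization, a NULL word added to every target sentence, a fixed number of iterations); it is not the authors' code, and the two-sentence corpus is a toy example built from the words on the slide.

```python
from collections import defaultdict

def train_ibm1(bitext, iterations=5):
    """Estimate t[(f, e)] = p(f | e) with IBM Model 1 EM.

    bitext is a list of (source_words, target_words) pairs.
    """
    src_vocab = {f for fs, _ in bitext for f in fs}
    t = defaultdict(lambda: 1.0 / len(src_vocab))     # uniform start
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for fs, es in bitext:
            es = ["NULL"] + list(es)
            for f in fs:
                z = sum(t[(f, e)] for e in es)        # normalizer for this word
                for e in es:
                    c = t[(f, e)] / z                 # expected count (E-step)
                    count[(f, e)] += c
                    total[e] += c
        # M-step: renormalize the expected counts per target word
        t = defaultdict(float, {(f, e): count[(f, e)] / total[e]
                                for (f, e) in count})
    return t

bitext = [
    (["コロラド", "大学"], ["Colorado", "university"]),
    (["大学"], ["university"]),
]
t = train_ibm1(bitext)
print(t[("大学", "university")], t[("コロラド", "Colorado")])
```

The second sentence pair ties 大学 to university, which in turn pushes コロラド towards Colorado in the first pair; this is the usual way Model 1 disambiguates from co-occurrence alone.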
Step 2 (E-step)
Initial alignment is greedily created
Modify the initial alignment with the operations:
  Swap
  Reject
  Add
  Extend
[Figure: example of hill-climbing on 受光素子にはフォトゲートを用いた / "A photogate is used for the photodetector", showing the initial alignment followed by a Swap and a Reject step]
Objective:
\[ \hat{a} = \arg\max_{a} p(f, a \mid e) \cdot p(e, a \mid f) = \arg\max_{a} p(f \mid e, a) \cdot p(a \mid e) \cdot p(e \mid f, a) \cdot p(a \mid f) \]
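The hill-climbing search can be sketched generically as follows. This is a simplified illustration, not the authors' procedure: alignments are plain sets of (source index, target index) links, only the Add, Reject and Swap operations are modeled (Extend depends on the tree and is left out), and the score is a made-up table of per-link gains standing in for the log of the model objective.

```python
from itertools import combinations

def neighbors(alignment, n_src, n_tgt):
    """Yield every alignment reachable by one Add, Reject or Swap."""
    links = set(alignment)
    for i in range(n_src):                       # Add a missing link
        for j in range(n_tgt):
            if (i, j) not in links:
                yield frozenset(links | {(i, j)})
    for link in links:                           # Reject an existing link
        yield frozenset(links - {link})
    for (i, j), (k, l) in combinations(sorted(links), 2):
        if i != k and j != l:                    # Swap the targets of two links
            yield frozenset((links - {(i, j), (k, l)}) | {(i, l), (k, j)})

def hill_climb(initial, score, n_src, n_tgt):
    """Move to the best-scoring neighbor until no neighbor improves."""
    current, current_score = frozenset(initial), score(initial)
    while True:
        best, best_score = None, current_score
        for cand in neighbors(current, n_src, n_tgt):
            s = score(cand)
            if s > best_score:
                best, best_score = cand, s
        if best is None:
            return current
        current, current_score = best, best_score

# Hypothetical per-link gains for a 2 x 2 toy problem.
GAIN = {(0, 0): 2.0, (1, 1): 1.5, (0, 1): -1.0, (1, 0): -1.0}
score = lambda a: sum(GAIN[link] for link in a)

result = hill_climb({(0, 1), (1, 0)}, score, 2, 2)
print(sorted(result))
```

Starting from the crossed alignment, a single Swap reaches the diagonal alignment, after which no operation improves the score and the search stops. As with any hill-climbing, the result is only a local optimum of the chosen score.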
Generate Possible Phrases
Generate new possible phrases by merging the NULL-aligned nodes into their parent or child non-NULL-aligned nodes
The new possible phrases are taken into consideration from the next iteration
[Figure: dependency trees of 受光素子にはフォトゲートを用いた / "A photogate is used for the photodetector" with NULL-aligned nodes merged into neighboring aligned nodes]
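The merging rule for generating possible phrases can be sketched as below. The data structures and the function name are hypothetical, and the small tree is a hand-made toy fragment rather than parser output; each NULL-aligned node proposes one candidate phrase with its aligned parent and one with each aligned child.

```python
def candidate_phrases(parent, aligned):
    """Propose new phrases by merging NULL-aligned nodes with aligned neighbors.

    parent  : dict child -> parent (the root maps to None)
    aligned : set of nodes that currently have a non-NULL alignment
    Returns a set of frozensets, each a pair of nodes to treat as one phrase.
    """
    children = {}
    for node, par in parent.items():
        if par is not None:
            children.setdefault(par, []).append(node)

    phrases = set()
    for node, par in parent.items():
        if node in aligned:
            continue                              # only NULL-aligned nodes merge
        if par is not None and par in aligned:
            phrases.add(frozenset({node, par}))   # merge into the parent
        for child in children.get(node, []):
            if child in aligned:
                phrases.add(frozenset({node, child}))  # merge into a child
    return phrases

# Toy tree fragment: "used" is the head; "is" and "A" are NULL-aligned.
parent = {"used": None, "is": "used", "photogate": "used", "A": "photogate"}
aligned = {"used", "photogate"}
print(candidate_phrases(parent, aligned))
```

Here the candidates are "is used" and "A photogate", which would then compete with the single-word phrases in the next iteration.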
Experiments
Alignment Experiments
Training: JST Ja-En paper abstract corpus (1M sentences, Ja: 36.4M words, En: 83.6M words)
Test: 475 sentences with the gold-standard alignments annotated by hand
Parsers: KNP for Japanese, MSTParser for English
Evaluation criteria: Precision, Recall, F1
For the proposed model, we ran 5 iterations in each step
Experimental Results
                      Pre.    Rec.    F
Proposed              87.75   50.27   63.92
intersection          90.34   34.28   49.71
grow-final-and        81.32   48.85   61.04
grow-diag-final-and   79.39   51.15   62.22
Proposed vs. grow-diag-final-and: +1.7 F
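The F column is the usual harmonic mean of precision and recall. The short sketch below recomputes it from the rounded precision and recall values in the table; since the inputs are already rounded, the last digit can differ slightly from the reported F for some rows.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall) pairs copied from the results table.
results = {
    "Proposed": (87.75, 50.27),
    "intersection": (90.34, 34.28),
    "grow-final-and": (81.32, 48.85),
    "grow-diag-final-and": (79.39, 51.15),
}
for name, (p, r) in results.items():
    print(f"{name:22s} {f_measure(p, r):.2f}")
```

The high precision but low recall of intersection illustrates why the harmonic mean is used: a method cannot reach a high F by excelling on only one of the two measures.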
Effectiveness of Phrase and Tree
                             Pre.    Rec.    F
Trees + Phrases (Proposed)   85.54   51.00   63.90
Trees                        89.77   39.47   54.83
Phrases                      84.41   47.33   60.65
None                         85.07   38.06   52.59
Settings without trees use positional relations (-1, +1) instead of dependency relations (c, p)
Discussions
Parsing errors
  Parsing accuracy is generally good, but the parsers still sometimes produce incorrect parses
  Possible fix: incorporate parsing probability into the model
Search errors
  Hill-climbing sometimes gets stuck in local optima
  Possible fix: random restart
Function words
  Behave quite differently in different languages (e.g. case markers in Japanese, articles in English)
  Possible fix: post-processing
Post-processing for Function Words
Reject correspondences between Japanese particles and English "be" or "have"
Reject correspondences of English articles
Japanese "する" and "れる" or English "be" and "have" are merged into their parent verb or adjective if they are NULL-aligned
                               Pre.    Rec.    F
Proposed                       87.75   50.27   63.92
Proposed + modify              87.83   58.40   70.16   (+6.2)
grow-diag-final-and            79.39   51.15   62.22
grow-diag-final-and + modify   80.46   51.15   62.54   (+0.3)
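The two rejection rules of the post-processing step can be sketched as a simple filter over alignment links. Everything about the representation is an assumption made for illustration: links are (Japanese index, English index) pairs, particles are flagged by a boolean list, English tokens are given as lemmas, and the example links are invented. The third rule, merging NULL-aligned "する", "れる", "be" and "have" into their parent, needs the dependency tree and is not covered here.

```python
ARTICLES = {"a", "an", "the"}
BE_HAVE = {"be", "have"}

def filter_links(links, ja_is_particle, en_lemmas):
    """Drop links that the two rejection rules rule out.

    links          : list of (japanese_index, english_index)
    ja_is_particle : list of bools, True where the Japanese token is a particle
    en_lemmas      : list of English lemmas, one per token
    """
    kept = []
    for j, e in links:
        lemma = en_lemmas[e].lower()
        if lemma in ARTICLES:
            continue                      # reject links to English articles
        if ja_is_particle[j] and lemma in BE_HAVE:
            continue                      # reject particle <-> be/have links
        kept.append((j, e))
    return kept

# Toy input: フォト ゲート を 用いた / a photogate be use (lemmatized).
ja_is_particle = [False, False, True, False]
en_lemmas = ["a", "photogate", "be", "use"]
links = [(0, 0), (1, 1), (2, 2), (3, 3)]   # invented links for illustration
print(filter_links(links, ja_is_particle, en_lemmas))
```

Only the content-word links survive in this toy case; the link to the article and the link between the particle を and "be" are dropped.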
Conclusion and Future Work
Linguistically motivated phrase alignment
1. Dependency trees
2. Phrase alignment
3. Bi-directional agreement
Significantly better results compared to conventional word alignment models
Future work:
Apply the proposed model to other language pairs (Japanese-Chinese and so on)
Incorporate parsing probability into our model
Investigate the contribution of our alignment results to the translation quality