Effective Use of Linguistic and Contextual Information for Statistical Machine Translation
Libin Shen and Jinxi Xu and Bing Zhang and Spyros Matsoukas and Ralph Weischedel
BBN Technologies
EMNLP 2009
Presented by Cai
Question
Lexical features are useful in MT
But the number of parameters is large
How to use these features effectively?
Previous Work
Discriminative training of the parameters: requires a scalable development set and careful selection
Estimating a single score or likelihood of a translation with rich features (using maximum entropy, ME): the feature space is too large to be practical
Main Contribution
Design effective and efficient statistical models (simple probabilistic models) to capture useful linguistic and context information for MT decoding
Features: robust and ideal
Features introduced
Non-terminal labels (+performance)
Length distribution of non-terminals (+performance)
Source-side context information (+performance)
Source-side structural information (dependency information): no performance gain, surprisingly
What’s special
Assume the length of a non-terminal follows a Gaussian distribution (sampling, estimation, smoothing)
Soft dependency constraints by introducing labels of non-terminals
Context language model
String-to-dependency rule -> dependency-to-dependency rule
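The Gaussian length assumption above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `(rule_id, length)` sample format, and the variance floor used as a stand-in for smoothing are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def fit_length_gaussians(samples, min_var=1.0):
    """Estimate a (mean, variance) pair per non-terminal from observed lengths.

    samples: iterable of (rule_id, length) pairs (hypothetical format).
    min_var: variance floor, an assumed simple form of smoothing that keeps
             rarely seen non-terminals from getting a degenerate distribution.
    """
    by_rule = defaultdict(list)
    for rule_id, length in samples:
        by_rule[rule_id].append(length)

    params = {}
    for rule_id, lengths in by_rule.items():
        n = len(lengths)
        mean = sum(lengths) / n
        var = sum((x - mean) ** 2 for x in lengths) / n  # maximum-likelihood variance
        params[rule_id] = (mean, max(var, min_var))
    return params


def length_log_prob(length, mean, var):
    """Log density of a Gaussian with the given mean and variance at `length`."""
    return -0.5 * math.log(2 * math.pi * var) - (length - mean) ** 2 / (2 * var)


# Toy observations: non-terminal "X1" was seen covering spans of length 2 and 4.
params = fit_length_gaussians([("X1", 2), ("X1", 4), ("X2", 3)])
mean, var = params["X1"]
typical = length_log_prob(3, mean, var)
unusual = length_log_prob(6, mean, var)
```

A decoder could add such a log-probability as one more feature score, so that a non-terminal expanded to an unusual length is penalized relative to a typical one.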
Experiments
Baseline: string-to-dependency system presented in (Shen et al., 2008)
Test each feature and their combinations
Arabic-to-English (A-E) and Chinese-to-English (C-E)
Measures: BLEU and TER
Results: 2 points of BLEU in A-E and 1 point of BLEU in C-E (nist06); 1.7 points of BLEU in A-E and 0.8 points of BLEU in C-E (nist06)
Main Related Work
Z. He, Q. Liu, and S. Lin. 2008. Improving Statistical Machine Translation using Lexicalized Rule Selection. COLING 2008
A. Ittycheriah and S. Roukos. 2007. Direct Translation Model 2. NAACL 2007
L. Shen, J. Xu, and R. Weischedel. 2008. A New String-to-Dependency Machine Translation Algorithm with a Target Dependency Language Model. ACL 2008