![Page 1: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/1.jpg)
Source Syntax-based Statistical Machine Translation
Models and Approaches
Qun Liu
Institute of Computing Technology, CAS
Sino-Japanese Machine Translation Technology Cooperation Symposium
2011-09-26, Beijing, China
![Page 2: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/2.jpg)
Outline
Background
Tree-to-String Model
Conclusion
![Page 3: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/3.jpg)
Statistical Translation Model
$P(E \mid F)$, with $\sum_{E} P(E \mid F) = 1$
![Page 4: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/4.jpg)
Translation Models
Syntax-based
Phrase-based
Semantic-based
Interlingua
Word-based Source Language Target Language
![Page 5: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/5.jpg)
Translation Models
Syntax-based
Phrase-based
Semantic-based
Interlingua
Word-based Source Language Target Language
![Page 6: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/6.jpg)
An Example
布什 与 沙龙 举行 了 会谈
bushi yu shalong juxing le huitan
Bush held a talk with Sharon
![Page 7: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/7.jpg)
Word-based Model
IBM Model 1-5
● Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. 1993. The Mathematics of Statistical Machine Translation: Parameter Estimation. Computational Linguistics, 19(2):263-311.
![Page 8: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/8.jpg)
Word-based Model
bushi yu shalong juxing le huitan
Bush held a talk with Sharon
![Page 9: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/9.jpg)
Word-based Model

| Source | Target | Probability |
| --- | --- | --- |
| Bushi(布什) | Bush | 0.7 |
| | President | 0.2 |
| | US | 0.1 |
| yu(与) | and | 0.6 |
| | with | 0.4 |
| juxing(举行) | hold | 0.7 |
| | had | 0.3 |
| le(了) | hold | 0.01 |
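
As a minimal sketch of how such a word-level table could be used, the snippet below stores the entries shown on the slide in a dictionary and picks the most probable target word for each source word. The function name `best_word_translations` is a hypothetical helper, and the sketch ignores alignment, fertility, and reordering, which the IBM models do handle.

```python
# Lexical translation table copied from the slide:
# source word -> {target word: probability}. Only the listed entries are included.
LEXICON = {
    "bushi": {"Bush": 0.7, "President": 0.2, "US": 0.1},
    "yu": {"and": 0.6, "with": 0.4},
    "juxing": {"hold": 0.7, "had": 0.3},
    "le": {"hold": 0.01},
}


def best_word_translations(source_words, lexicon):
    """Pick the most probable target word for each source word.

    Hypothetical helper: words missing from the lexicon map to None,
    and no alignment or reordering is modelled.
    """
    result = []
    for word in source_words:
        options = lexicon.get(word)
        if options:
            result.append(max(options, key=options.get))
        else:
            result.append(None)
    return result


choices = best_word_translations("bushi yu shalong juxing le huitan".split(), LEXICON)
print(choices)
```

Translating word by word in source order shows why a word-based table alone is not enough: the choices are made independently and nothing moves "with Sharon" to the end of the sentence.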
![Page 10: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/10.jpg)
Syntax-based
Phrase-based
Semantic-based
Interlingua
Word-based Source Language Target Language
Translation Models
![Page 11: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/11.jpg)
Phrase-based Model
● Franz J. Och and Hermann Ney. 2004. The Alignment Template Approach to Statistical Machine Translation. Computational Linguistics, 30(4):417-449.
● Philipp Koehn, Franz J. Och, and Daniel Marcu. 2003. Statistical Phrase-Based Translation. In Proceedings of the Human Language Technology and North American Association for Computational Linguistics Conference, pages 127-133, Edmonton, Canada, May.
![Page 12: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/12.jpg)
Phrase-based Model
bushi yu shalong juxing le huitan
Bush held a talk with Sharon
![Page 13: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/13.jpg)
Phrase-based Model

| Source | Target | Probability |
| --- | --- | --- |
| Bushi(布什) | Bush | 0.5 |
| | president Bush | 0.3 |
| | the US president | 0.2 |
| Bushi yu(布什与) | Bush and | 0.8 |
| | the president and | 0.2 |
| yu Shalong(与沙龙) | and Shalong | 0.6 |
| | with Shalong | 0.4 |
| juxing le huitan(举行了会谈) | hold a meeting | 0.7 |
| | had a meeting | 0.3 |
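
A phrase-based derivation can be scored as the product of the probabilities of the phrase pairs it uses. The sketch below assumes exactly that simplified scoring (no language model, no distortion penalty), with the source side written as `huitan`; `score_derivation` and the hand-picked segmentation are illustrative assumptions, not part of the original slides.

```python
from math import prod

# Phrase table from the slide: source phrase (tuple of words) -> {target phrase: probability}.
PHRASE_TABLE = {
    ("bushi",): {"Bush": 0.5, "president Bush": 0.3, "the US president": 0.2},
    ("bushi", "yu"): {"Bush and": 0.8, "the president and": 0.2},
    ("yu", "shalong"): {"and Shalong": 0.6, "with Shalong": 0.4},
    ("juxing", "le", "huitan"): {"hold a meeting": 0.7, "had a meeting": 0.3},
}


def score_derivation(segments, table):
    """Score a derivation given as (source phrase, target phrase) pairs.

    The pairs are listed in target order, so any reordering is decided by the
    caller. The score is the plain product of phrase probabilities (an assumption).
    """
    probs = [table[src][tgt] for src, tgt in segments]
    target = " ".join(tgt for _, tgt in segments)
    return target, prod(probs)


# Hand-picked segmentation that moves the verb phrase before "with Shalong".
derivation = [
    (("bushi",), "Bush"),
    (("juxing", "le", "huitan"), "had a meeting"),
    (("yu", "shalong"), "with Shalong"),
]
target, score = score_derivation(derivation, PHRASE_TABLE)
print(target, score)
```

A real decoder would search over segmentations and orderings instead of receiving one by hand.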
![Page 14: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/14.jpg)
Syntax-based
Phrase-based
Semantic-based
Interlingua
Word-based Source Language Target Language
Translation Models
![Page 15: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/15.jpg)
Syntax-based Translation Models #1
syntax level
phrase level
![Page 16: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/16.jpg)
Syntax-based Translation Models #2
linguistic syntax level
formal syntax level
phrase level
![Page 17: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/17.jpg)
Syntax-based Translation Models #3
phrase-based model
formally syntax-based model
linguistically syntax-based model
string-to-tree model
tree-to-string model
tree-to-tree model
![Page 18: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/18.jpg)
Syntax-based Translation Models #3
phrase-based model
formally syntax-based model
linguistically syntax-based model
string-to-tree model
tree-to-string model
tree-to-tree model
![Page 19: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/19.jpg)
Formally Syntax-based Model
Hierarchical Phrase-based Model
● David Chiang. 2005. A hierarchical phrase-based model for statistical machine translation. In Proceedings of ACL 2005.
Maximum Entropy Bracketing Transduction Grammar Model
● Deyi Xiong, Qun Liu, and Shouxun Lin. Maximum Entropy Based Phrase Reordering Model for Statistical Machine Translation. COLING-ACL 2006, Sydney, Australia, July 17-21.
![Page 20: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/20.jpg)
Hierarchical Phrase-based Model
bushi yu shalong juxing le huitan
Bush held a talk with Sharon
[Figure: hierarchical alignment of the sentence pair; nonterminal labels X1, X2, X3]
![Page 21: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/21.jpg)
Hierarchical Phrase-based Model

| Source | Target | Probability |
| --- | --- | --- |
| juxing le huitan(举行了会谈) | hold a meeting | 0.6 |
| | had a meeting | 0.3 |
| X huitan(X 会谈) | X a meeting | 0.8 |
| | X a talk | 0.2 |
| juxing le X(举行了 X) | hold a X | 0.5 |
| | had a X | 0.5 |
| Bushi yu Shalong(布什与沙龙) | Bush and Sharon | 0.8 |
| Bushi X(布什 X) | Bush X | 0.7 |
| X yu Y(X 与 Y) | X and Y | 0.9 |
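
Hierarchical rules rewrite the source and target sides synchronously: whatever fills a nonterminal on the source side is translated in the matching slot on the target side. The sketch below applies the rule `X yu Y` → `X and Y` from the table. The fillers for `X` and `Y` with probability 1.0 are hypothetical lexical entries added for illustration, and `apply_rule` is an assumed helper name.

```python
def substitute(pattern, fillers):
    """Replace every nonterminal token in a space-separated pattern with its filler."""
    return " ".join(fillers.get(token, token) for token in pattern.split())


def apply_rule(rule, bindings):
    """Apply one synchronous rule.

    rule:     (source pattern, target pattern, probability)
    bindings: nonterminal -> (source string, target string, probability)
    Returns the derived source string, target string, and product of probabilities.
    """
    source_pattern, target_pattern, prob = rule
    source = substitute(source_pattern, {nt: b[0] for nt, b in bindings.items()})
    target = substitute(target_pattern, {nt: b[1] for nt, b in bindings.items()})
    for _, _, sub_prob in bindings.values():
        prob *= sub_prob
    return source, target, prob


rule = ("X yu Y", "X and Y", 0.9)  # from the table
bindings = {
    "X": ("bushi", "Bush", 1.0),      # hypothetical lexical entry
    "Y": ("shalong", "Sharon", 1.0),  # hypothetical lexical entry
}
derived = apply_rule(rule, bindings)
print(derived)
```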
![Page 22: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/22.jpg)
Syntax-based Translation Models #3
phrase-based model
formally syntax-based model
linguistically syntax-based model
string-to-tree model
tree-to-string model
tree-to-tree model
![Page 23: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/23.jpg)
String-to-Tree Model
● Kenji Yamada and Kevin Knight. 2001. A syntax-based statistical machine translation model. In Proceedings of ACL 2001.
● Daniel Marcu, Wei Wang, Abdessamad Echihabi, and Kevin Knight. 2006. SPMT: Statistical machine translation with syntactified target language phrases. In Proceedings of EMNLP 2006.
● Michel Galley, Jonathan Graehl, Kevin Knight, Daniel Marcu, Steve DeNeefe, Wei Wang, and Ignacio Thayer. 2006. Scalable inference and training of context-rich syntactic translation models. In Proceedings of COLING-ACL 2006.
![Page 24: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/24.jpg)
String-to-Tree Model
bushi yu shalong juxing le huitan
Bush held a talk with Sharon
NNP VBD DT NN IN NNP
[Figure: target-side parse tree over the English sentence; internal node labels NP, NP, NP, PP, VP, VP, S]
![Page 25: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/25.jpg)
String-to-Tree Model

| Source | Target | Probability |
| --- | --- | --- |
| juxing le huitan (举行了会谈) | VP(VBD(hold) NP(DT(a) NN(meeting))) | 0.6 |
| | VP(VBD(had) NP(DT(a) NN(meeting))) | 0.3 |
| | VP(VBD(had) NP(DT(a) NN(talk))) | 0.1 |
| x1 huitan (x1 会谈) | VP(x1:VBD NP(DT(a) NN(meeting))) | 0.8 |
| | VP(x1:VBD NP(DT(a) NN(talk))) | 0.2 |
| juxing le x1 (举行了 x1) | VP(VBD(hold) NP(DT(a) x1:NN)) | 0.5 |
| | VP(VBD(had) NP(DT(a) x1:NN)) | 0.5 |
| x1 yu x2 (x1 与 x2) | NP(x1:NNP CC(and) x2:NNP) | 0.9 |
![Page 26: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/26.jpg)
Syntax-based Translation Models #3
phrase-based model
formally syntax-based model
linguistically syntax-based model
string-to-tree model
tree-to-string model
tree-to-tree model
![Page 27: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/27.jpg)
Syntax-based Translation Models #3
phrase-based model
formally syntax-based model
linguistically syntax-based model
string-to-tree model
tree-to-string model
tree-to-tree model
Our Work
![Page 28: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/28.jpg)
Outline
Background
Tree-to-String Model
Conclusion
![Page 29: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/29.jpg)
Our Work: Tree-to-String Model
Constituent-to-String Model
![Page 30: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/30.jpg)
Our Work: Tree-to-String Model
Constituent-to-String Model
Tree-based Translation
Forest-based Translation
Joint Parsing and Translation
![Page 31: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/31.jpg)
Our Work: Tree-to-String Model
Constituent-to-String Model
Tree-based Translation
Forest-based Translation
Joint Parsing and Translation
Dependency-to-String Model
![Page 32: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/32.jpg)
Our Work: Tree-to-String Model
Constituent-to-String Model
Tree-based Translation
Forest-based Translation
Joint Parsing and Translation
Dependency-to-String Model
![Page 33: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/33.jpg)
Constituent-to-String Model
● Yang Liu, Qun Liu, and Shouxun Lin. 2006. Tree-to-String Alignment Template for Statistical Machine Translation. In Proceedings of COLING/ACL 2006, pages 609-616, Sydney, Australia, July.
Meritorious Asian NLP Paper Award
![Page 34: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/34.jpg)
Constituent-to-String Model
IP(NPB(bushi) VP(PP(P(yu) NPB(shalong)) VPB(VS(juxing) AS(le) NPB(huitan))))
bushi yu shalong juxing le huitan
Bush held a talk with Sharon
![Page 35: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/35.jpg)
Constituent-to-String Model

| Source | Target | Probability |
| --- | --- | --- |
| VPB(VS(juxing) AS(le) NPB(huitan)) (举行了会谈) | hold a meeting | 0.6 |
| | have a meeting | 0.3 |
| | have a talk | 0.1 |
| VPB(VS(juxing) AS(le) x1) (举行了 x1) | hold a x1 | 0.5 |
| | have a x1 | 0.5 |
| VP(PP(P(yu) x1:NPB) x2:VPB) (与 x1 x2) | x2 with x1 | 0.9 |
| IP(x1:NPB VP(x2:PP x3:VPB)) | x1 x3 x2 | 0.7 |
![Page 36: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/36.jpg)
Constituent-to-String Rule
● A constituent-to-string model is a statistical translation model built on constituent-to-string translation rules
● A constituent-to-string translation rule consists of:
  ● A syntax subtree on the source side, whose leaf nodes may be nonterminals or terminals (words)
  ● A string of words and variables on the target side
  ● A one-to-one mapping between the nonterminal leaf nodes in the source subtree and the variables in the target string
![Page 37: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/37.jpg)
Constituent-to-String Rule
VPB(VS( 举行 ) AS( 了 ) X1:NPB) → held a X1
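
The three parts of a constituent-to-string rule map naturally onto a small data structure. The sketch below is one possible representation, assuming variables are written in lowercase as `x1:LABEL` on the source side and `x1` on the target side; the class and method names are hypothetical. It checks the one-to-one mapping between source nonterminal leaves and target variables.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ConstituentToStringRule:
    source: str    # bracketed source subtree; variable leaves written as x1:LABEL
    target: tuple  # target tokens; variables written as x1, x2, ...

    def source_variables(self):
        """Variables attached to nonterminal leaves of the source subtree."""
        return re.findall(r"\b(x\d+):", self.source)

    def target_variables(self):
        """Variables that appear in the target string."""
        return [tok for tok in self.target if re.fullmatch(r"x\d+", tok)]

    def is_well_formed(self):
        """True if source and target variables are in one-to-one correspondence."""
        src, tgt = self.source_variables(), self.target_variables()
        # Same multiset of variables on both sides, and none repeated.
        return sorted(src) == sorted(tgt) and len(set(src)) == len(src)


rule = ConstituentToStringRule(
    source="VPB(VS(举行) AS(了) x1:NPB)",
    target=("held", "a", "x1"),
)
broken = ConstituentToStringRule(
    source="VPB(VS(举行) AS(了) x1:NPB)",
    target=("held", "a"),
)
print(rule.is_well_formed(), broken.is_well_formed())
```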
![Page 38: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/38.jpg)
Our Work: Tree-to-String Model
Constituent-to-String Model
Tree-based Translation
Forest-based Translation
Joint Parsing and Translation
Dependency-to-String Model
![Page 39: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/39.jpg)
Tree-based Bottom-up Decoding
● Beam Search
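
The decoding idea can be sketched as follows: visit the source tree in post-order, match rules at each node, combine the translations already computed for the matched variables, and keep only the best few hypotheses per node. This is a simplified illustration, not the authors' decoder. It scores by the product of rule probabilities only, and the rules marked hypothetical (lexical NPB rules and the top-level IP rule) are assumptions added so the example tree can be covered.

```python
# Source tree as (label, children); words are plain strings.
TREE = ("IP", [
    ("NPB", ["bushi"]),
    ("VP", [
        ("PP", [("P", ["yu"]), ("NPB", ["shalong"])]),
        ("VPB", [("VS", ["juxing"]), ("AS", ["le"]), ("NPB", ["huitan"])]),
    ]),
])

VPB_FULL = ("VPB", [("VS", ["juxing"]), ("AS", ["le"]), ("NPB", ["huitan"])])
RULES = [  # (source pattern, target string, probability)
    (("NPB", ["bushi"]), "Bush", 1.0),        # hypothetical
    (("NPB", ["shalong"]), "Sharon", 1.0),    # hypothetical
    (VPB_FULL, "hold a meeting", 0.6),
    (VPB_FULL, "have a meeting", 0.3),
    (VPB_FULL, "have a talk", 0.1),
    (("VP", [("PP", [("P", ["yu"]), "x1:NPB"]), "x2:VPB"]), "x2 with x1", 0.9),
    (("IP", ["x1:NPB", "x2:VP"]), "x1 x2", 1.0),  # hypothetical
]


def match(pattern, node, bindings):
    """Match a rule pattern against a tree node, recording variable bindings."""
    if isinstance(pattern, str):
        if ":" in pattern:  # variable leaf such as "x1:NPB"
            var, label = pattern.split(":")
            if isinstance(node, tuple) and node[0] == label:
                bindings[var] = node
                return True
            return False
        return pattern == node  # terminal word
    if not isinstance(node, tuple) or pattern[0] != node[0]:
        return False
    if len(pattern[1]) != len(node[1]):
        return False
    return all(match(p, n, bindings) for p, n in zip(pattern[1], node[1]))


def postorder(node):
    """Yield tree nodes (not words) with children before parents."""
    if isinstance(node, tuple):
        for child in node[1]:
            yield from postorder(child)
        yield node


def decode(tree, rules, beam_size):
    """Return up to beam_size (probability, translation) pairs for the root."""
    chart = {}  # id(node) -> pruned hypothesis list for that node
    for node in postorder(tree):
        hyps = []
        for pattern, target, prob in rules:
            bindings = {}
            if not match(pattern, node, bindings):
                continue
            partial = [(prob, [])]
            for token in target.split():
                if token in bindings:  # plug in sub-translations from the chart
                    sub = chart[id(bindings[token])]
                    partial = [(p * q, words + [s])
                               for p, words in partial for q, s in sub]
                else:
                    partial = [(p, words + [token]) for p, words in partial]
            hyps.extend((p, " ".join(words)) for p, words in partial)
        chart[id(node)] = sorted(hyps, reverse=True)[:beam_size]  # beam pruning
    return chart[id(tree)]


best = decode(TREE, RULES, beam_size=2)
print(best)
```

Pruning happens once per node here; a practical decoder would typically also integrate a language model score before pruning.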
![Page 47: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/47.jpg)
Our Work: Tree-to-String Model
Constituent-to-String Model
Tree-based Translation
Forest-based Translation
Joint Parsing and Translation
Dependency-to-String Model
![Page 48: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/48.jpg)
Forest-based Translation
● Haitao Mi, Liang Huang, and Qun Liu. Forest-Based Translation. In Proceedings of ACL 2008, Columbus, OH.
● Haitao Mi and Liang Huang. Forest-based Translation Rule Extraction. In Proceedings of EMNLP 2008, Honolulu, Hawaii. Nominated for the best-paper award.
![Page 49: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/49.jpg)
Parsing Mistake Propagation
![Page 50: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/50.jpg)
Syntactic Ambiguity
![Page 51: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/51.jpg)
1-best ➜ n-best trees?
![Page 52: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/52.jpg)
Packed Forest
![Page 53: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/53.jpg)
Pattern Matching on Forest
![Page 54: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/54.jpg)
Translation Forest
![Page 59: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/59.jpg)
N-best Trees vs. Forest
![Page 60: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/60.jpg)
Forest as Virtual ∞-best List
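
A packed forest can be viewed as a hypergraph in which each node stores several alternative ways (hyperedges) of being built from child nodes, so shared subtrees are stored once. The sketch below counts how many distinct trees such a structure encodes. The tiny forest is a hypothetical example for the running sentence, in which `yu` is analysed either as a preposition or as a conjunction; the node names with spans are an assumed notation.

```python
from functools import lru_cache

# node -> alternative hyperedges; each hyperedge lists the child nodes it combines.
# Nodes without an entry are treated as leaves (pre-terminals over single words).
FOREST = {
    "IP[0,6]": [("NPB[0,1]", "VP[1,6]"), ("NP[0,3]", "VPB[3,6]")],
    "VP[1,6]": [("PP[1,3]", "VPB[3,6]")],
    "NP[0,3]": [("NPB[0,1]", "CC[1,2]", "NPB[2,3]")],
    "PP[1,3]": [("P[1,2]", "NPB[2,3]")],
    "VPB[3,6]": [("VS[3,4]", "AS[4,5]", "NPB[5,6]")],
}


def count_trees(forest, root):
    """Count the trees packed under root by dynamic programming over nodes."""

    @lru_cache(maxsize=None)
    def count(node):
        edges = forest.get(node)
        if not edges:
            return 1  # a leaf contributes exactly one (trivial) tree
        total = 0
        for children in edges:
            ways = 1
            for child in children:
                ways *= count(child)  # independent choices multiply
            total += ways             # alternative hyperedges add up
        return total

    return count(root)


n_trees = count_trees(FOREST, "IP[0,6]")
print(n_trees)
```

Because counts multiply across children, the number of encoded trees can grow exponentially with sentence length while the forest itself stays polynomial in size, which is the intuition behind treating it as a virtual ∞-best list.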
![Page 61: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/61.jpg)
Our Work: Tree-to-String Model
Constituent-to-String Model
Tree-based Translation
Forest-based Translation
Joint Parsing and Translation
Dependency-to-String Model
![Page 62: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/62.jpg)
Joint Parsing and Translation
● Yang Liu and Qun Liu. 2010. Joint Parsing and Translation. In Proceedings of COLING 2010, pages 707-715, Beijing, China, August.
![Page 63: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/63.jpg)
Separate Parsing and Translation
![Page 64: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/64.jpg)
Joint Parsing and Translation
![Page 77: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/77.jpg)
Evaluation
String-based Translation = Joint Parsing and Translation
![Page 78: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/78.jpg)
Search Space Comparison
Probabilistic Distribution of Parsing Space
Probabilistic Distribution of Translation Space
![Page 79: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/79.jpg)
Search Space Comparison
Probabilistic Distribution of Parsing Space
Probabilistic Distribution of Translation Space
Tree-based Translation
![Page 80: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/80.jpg)
Search Space Comparison
Probabilistic Distribution of Parsing Space
Probabilistic Distribution of Translation Space
Forest-based Translation
![Page 81: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/81.jpg)
Search Space Comparison
Probabilistic Distribution of Parsing Space
Probabilistic Distribution of Translation Space
Joint Parsing and Translation
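The three regimes compared on these slides differ in how much of the parsing space the decoder is allowed to search. A rough sketch in formulas follows. The notation is an assumption introduced here for illustration and is not taken from the slides: F is the source sentence, E a translation, T a source parse tree, \hat{T} the 1-best parse, \mathcal{F}(F) a pruned packed forest, and \mathcal{T}(F) the full set of parses licensed by the translation rules.

```latex
\begin{align*}
\text{Tree-based:} \quad
  & \hat{E} = \arg\max_{E} P(E \mid \hat{T}, F),
  \qquad \hat{T} = \arg\max_{T} P(T \mid F) \\
\text{Forest-based:} \quad
  & \hat{E} = \arg\max_{E} \max_{T \in \mathcal{F}(F)} P(E, T \mid F) \\
\text{Joint parsing and translation:} \quad
  & \hat{E} = \arg\max_{E} \max_{T \in \mathcal{T}(F)} P(E, T \mid F)
\end{align*}
```

Read this way, each step widens the set of trees the translation model may choose from, which is the sense in which the search space grows from tree-based to forest-based to joint decoding.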
![Page 82: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/82.jpg)
Our Work: Tree-to-String Model
Constituent-to-String Model
Tree-based Translation
Forest-based Translation
Joint Parsing and Translation
Dependency-to-String Model
![Page 83: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/83.jpg)
Dependency-to-String Model
● Jun Xie, Haitao Mi and Qun Liu, A novel dependency-to-string model for statistical machine translation, in Proceedings of EMNLP 2011, July 27–31, 2011, Edinburgh, Scotland, UK.
![Page 84: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/84.jpg)
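Constituent vs Dependency
[Figure: constituent tree (IP, VP, NP, QP, CLP over the POS tags 我/PN 是/VC 一/CD 个/M 学生/NN 。/PU) and dependency tree (rooted at 是) for the sentence 我 是 一 个 学生 。 ("I am a student.")]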
![Page 85: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/85.jpg)
Constituent vs Dependency

| | Constituent | Dependency |
|---|---|---|
| Node | Category | Word |
| Head Word | No | Yes |
| Number of Nodes | 2*N | N |
| Parsing | Slow | Fast |
| Nonterminals | Yes | No |
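To make the node-count row of this comparison concrete, here is a minimal Python sketch that encodes the example sentence 我 是 一 个 学生 。 in both representations and counts nodes. The bracketing follows the labels that survive from the figure. The dependency head assignments are an assumption for illustration, since the arcs are not recoverable from the transcript.

```python
# Constituent tree as nested tuples: (label, child, ...); a leaf is a word string.
CONSTITUENT = (
    "IP",
    ("NP", ("PN", "我")),
    ("VP",
        ("VC", "是"),
        ("NP",
            ("QP", ("CD", "一"), ("CLP", ("M", "个"))),
            ("NP", ("NN", "学生")))),
    ("PU", "。"),
)

# Dependency tree: one node per word, stored as (word, index of head word).
# A head index of -1 marks the root. These heads are assumed, not from the slide.
DEPENDENCY = [
    ("我", 1), ("是", -1), ("一", 3), ("个", 4), ("学生", 1), ("。", 1),
]


def count_labeled_nodes(tree):
    """Count category-labeled nodes (POS tags and phrases), excluding words."""
    if isinstance(tree, str):
        return 0
    return 1 + sum(count_labeled_nodes(child) for child in tree[1:])


def words(tree):
    """Return the words at the leaves of a constituent tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in words(child)]


n_words = len(DEPENDENCY)
print("words:", n_words)
print("dependency nodes:", len(DEPENDENCY))
print("constituent labeled nodes:", count_labeled_nodes(CONSTITUENT))
```

For this six-word sentence the dependency tree has exactly one node per word, while the constituent tree carries roughly twice as many category-labeled nodes, in line with the N versus 2*N estimate in the table.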
![Page 86: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/86.jpg)
Dependency-based Models
● Dependency is regarded as a promising model because of its simplicity and its direct description of the relations between words
● Dependency is also regarded as a bridge between syntactic structure and semantic structure
● Dependency has been successfully used to solve many different NLP problems
● However, attempts to build a dependency-based statistical translation model have not been successful
![Page 87: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/87.jpg)
Dependency-based Translation Models
Chris Quirk, Arul Menezes, and Colin Cherry. 2005. Dependency treelet translation: Syntactically informed phrasal SMT. In Proceedings of ACL 2005, pages 271–279.
This model is too complicated and cannot be reproduced by others.
Deyi Xiong, Qun Liu, and Shouxun Lin. 2007. A dependency treelet string correspondence model for statistical machine translation. In Proceedings of the Second Workshop on Statistical Machine Translation, pages 40–47, Prague, Czech Republic, June.
This model is over-flexible and performs poorly.
![Page 88: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/88.jpg)
Dependency-to-Tree Rule
(* 世界杯 *) ( 在 *) ( 成功 ) → 举行
(*FIFA*) was (successfully) held (*)
This is a really good rule, but it is too specific. Rules of this kind may seldom be matched.
![Page 89: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/89.jpg)
Dependency-Treelet-to-String Rule
* 在 * 南非 * 举行 * → was held in South Africa
This rule is too flexible. The target-side word order is difficult to model.
![Page 90: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/90.jpg)
A novel Dependency-to-String Model
Our New Approach :
● Extract a rule from a whole subtree, rather than a treelet
![Page 91: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/91.jpg)
A novel Dependency-to-String Model
Our New Approach :
● Extract a rule from a whole subtree, rather than a treelet
● Generalize the rule using POS tags
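The POS-based generalization step can be sketched in a few lines of Python. This is only one plausible reading of the approach, not the authors' implementation. It assumes the nodes on a rule's source side fall into three groups (the head, internal dependents that have their own subtrees, and leaf dependents) and that each group can independently be replaced by its POS tags. That yields 2^3 = 8 variants, which happens to match the count of one original rule plus seven generalized rules on the following slides, although the slides' exact scheme is not recoverable from the transcript. The POS tags and role labels below are illustrative assumptions.

```python
from itertools import combinations

# Source side of a hypothetical rule: (word, POS tag, role).
# Tags and roles are assumptions made for this sketch.
RULE_NODES = [
    ("世界杯", "NN", "leaf"),
    ("在", "P", "internal"),
    ("成功", "AD", "leaf"),
    ("举行", "VV", "head"),
]
ROLES = ("head", "internal", "leaf")


def generalize(nodes, roles_to_generalize):
    """Replace the word by its POS tag for every node whose role is selected."""
    return tuple(
        pos if role in roles_to_generalize else word
        for word, pos, role in nodes
    )


def all_variants(nodes):
    """Enumerate every subset of roles: the original rule plus generalizations."""
    variants = []
    for k in range(len(ROLES) + 1):
        for subset in combinations(ROLES, k):
            variants.append((subset, generalize(nodes, set(subset))))
    return variants


for roles, source_side in all_variants(RULE_NODES):
    label = "+".join(roles) if roles else "original"
    print(f"{label:>20}: {' '.join(source_side)}")
```

The first variant keeps every word lexicalized, and the last one replaces every word by its POS tag, trading specificity for a higher chance of matching unseen input.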
![Page 92: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/92.jpg)
Original Rule
![Page 93: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/93.jpg)
Generalized Rule #1
![Page 94: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/94.jpg)
Generalized Rule #2
![Page 95: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/95.jpg)
Generalized Rule #3
![Page 96: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/96.jpg)
Generalized Rule #4
![Page 97: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/97.jpg)
Generalized Rule #5
![Page 98: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/98.jpg)
Generalized Rule #6
![Page 99: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/99.jpg)
Generalized Rule #7
![Page 100: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/100.jpg)
Experiment Results
![Page 101: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/101.jpg)
Outline
Background
Tree-to-String Model
Conclusion
![Page 102: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/102.jpg)
Conclusion
● We proposed two kinds of tree-to-string translation models based on source-side syntactic structure:
● Constituent-to-String Model
● Dependency-to-String Model
![Page 103: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/103.jpg)
Conclusion
● For the constituent-to-string model, we proposed three translation approaches:
● Tree-based Translation
● Forest-based Translation
● Joint Parsing and Translation (String-based Translation)
Speed Precision
![Page 104: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/104.jpg)
Future Work
● Semantic-based Translation Model
● Structural Language Model
![Page 105: Source Syntax-based Statistical Machine Translation …liuqun/research/publications/CJMT-LiuQun.pdf · Outline Background Tree-to-String Model ... Scalable inference and training](https://reader031.vdocument.in/reader031/viewer/2022030517/5ac25c387f8b9a213f8e37ca/html5/thumbnails/105.jpg)
Thanks
http://nlp.ict.ac.cn