1
MT in the NCLT
Andy Way
NCLT, School of Computing, Dublin City University,
Dublin 9, Ireland
www.nclt.dcu.ie/mt/
2
MT in the NCLT: Recent History
– Marker-Based EBMT
  • [Nano Gough, PhD 2005]
    – Computational Linguistics 2003
    – NLE 2005; Machine Translation 2005
    – AMTA 02, MT Summit 03; TMI 04, EAMT 04 …
– Data-Oriented Translation
  • [Mary Hearne, PhD 2005]
    – MT Summit 03, COLING 04, IJCNLP 04, EAMT 05, EAMT 06 …
– Hybrid Approaches (EBMT & SMT)
  • [Declan Groves, PhD 2007]
    – Machine Translation 2006
    – ACL 05, EAMT 06, …
3
MT in the NCLT: Recent History
– Improving Online MT Systems (TransBooster)
  • [Bart Mellebeek, PhD 2007]
  • [Karolina Owczarzak]
    – MT Summit 05, AMTA 06, EAMT 05, 06 …
– Automatic Translation of DVD subtitles
  • [Steve Armstrong, MSc 2007]
  • [Other students’ ongoing PhD work in SALIS]
    – Perspectives 06
    – ASLIB 06 …
4
Current Research
• Hybrid MT (MaTrEx)
  – Nicolas Stroppa et al.
    • AMTA 06, OpenLab 06, IWSLT 06, NIST 06, MT Summit 07
• Dependency-Based Automatic Evaluation Metrics
  – Karolina Owczarzak, Josef Van Genabith
    • MT Summit 07, Workshops at NAACL 07, ACL 07
• Integrating Syntax into SMT (Using Supertags)
  – Hany Hassan [& Khalil Sima’an]
    • IEEE SLT 06, ACL 07 …
• Sign Language MT
  – Sara Morrissey [& RWTH Aachen]
    • MT Summit 05, LREC 06, MT Summit 07 …
5
Current Research
• Word and Phrase Alignment in SMT
  – Yanjun Ma, Nicolas Stroppa
    • ACL 07 …
• Sub-Tree Alignment
  – John Tinsley, Ventzi Zhechev, Mary Hearne
    • MT Summit 07 …
• Parameter Estimation in MT
  – John Tinsley, Ventzi Zhechev, Mary Hearne [& Khalil Sima’an]
• Constraint-Based MT
  – Yvette Graham, Josef Van Genabith
6
Language Pairs
• French-English (EBMT)
• English-German (EBMT)
• Spanish-English (SMT, Hybrid)
• Spanish-Basque (Hybrid)
• Chinese-English (SMT, EBMT)
• Arabic-English (SMT, Hybrid)
• Italian-English (Hybrid)
• Japanese-English (EBMT, Hybrid)
• Dutch-English (Hybrid, SMT)
• Sign Language-English (Hybrid)
• …
7
Collaboration
• Tilburg (Memory-based Decoding)
• Donostia (Basque MT)
• Aachen (Sign-Language MT)
• Amsterdam (Integrating Syntax & SMT)
• Edinburgh (SMT)
• CMU (Hybrid SMT-EBMT)
• Toshiba Beijing (Chinese MT)
• …
8
Future Work
• MT via SMS
• Automatic Interpreting
• Enhanced hybrid models
• Scalability
• Tuning MT to text type & genre
• MT using Pivot languages
• Better quality phrases (cf. CoNLL monolingual chunking shared task)
• …
9
Current and Future Funding
• Irish Government Sources
  – Science Foundation Ireland
  – Enterprise Ireland
  – IRCSET
• Companies
  – IBM
  – Microsoft
• Under Review
  – EU STREP (MT for Minority Languages)
    • UPC, FBK-IRST, Edinburgh …
  – SFI CSET in Next Generation Localisation
    • TCD, UCD, UL, IBM, Microsoft, Symantec …