MT in the NCLT

Post on 02-Jan-2016


1

MT in the NCLT

Andy Way

NCLT, School of Computing, Dublin City University, Dublin 9, Ireland

away@computing.dcu.ie

www.nclt.dcu.ie/mt/

2

MT in the NCLT: Recent History

– Marker-Based EBMT
  • [Nano Gough, PhD 2005]
    – Computational Linguistics 2003
    – NLE 2005; Machine Translation 2005
    – AMTA 02, MT Summit 03; TMI 04, EAMT 04 …

– Data-Oriented Translation
  • [Mary Hearne, PhD 2005]
    – MT Summit 03, COLING 04, IJCNLP 04, EAMT 05, EAMT 06 …

– Hybrid Approaches (EBMT & SMT)
  • [Declan Groves, PhD 2007]
    – Machine Translation 2006
    – ACL 05, EAMT 06, …

3

MT in the NCLT: Recent History

– Improving Online MT Systems (TransBooster)
  • [Bart Mellebeek, PhD 2007]
  • [Karolina Owczarzak]
    – MT Summit 05, AMTA 06, EAMT 05, 06 …

– Automatic Translation of DVD subtitles
  • [Steve Armstrong, MSc 2007]
  • [Other students’ ongoing PhD work in SALIS]
    – Perspectives 06
    – ASLIB 06 …

4

Current Research

• Hybrid MT (MaTrEx)
  – Nicolas Stroppa et al.
    • AMTA 06, OpenLab 06, IWSLT 06, NIST 06, MT Summit 07

• Dependency-Based Automatic Evaluation Metrics
  – Karolina Owczarzak, Josef Van Genabith
    • MT Summit 07, Workshops at NAACL 07, ACL 07

• Integrating Syntax into SMT (Using Supertags)
  – Hany Hassan [& Khalil Sima’an]
    • IEEE SLT 06, ACL 07 …

• Sign Language MT
  – Sara Morrissey [& RWTH Aachen]
    • MT Summit 05, LREC 06, MT Summit 07 …

5

Current Research

• Word and Phrase Alignment in SMT
  – Yanjun Ma, Nicolas Stroppa
    • ACL 07 …

• Sub-Tree Alignment
  – John Tinsley, Ventzi Zhechev, Mary Hearne
    • MT Summit 07 …

• Parameter Estimation in MT
  – John Tinsley, Ventzi Zhechev, Mary Hearne [& Khalil Sima’an]

• Constraint-Based MT
  – Yvette Graham, Josef Van Genabith

6

Language Pairs

• French-English (EBMT)
• English-German (EBMT)
• Spanish-English (SMT, Hybrid)
• Spanish-Basque (Hybrid)
• Chinese-English (SMT, EBMT)
• Arabic-English (SMT, Hybrid)
• Italian-English (Hybrid)
• Japanese-English (EBMT, Hybrid)
• Dutch-English (Hybrid, SMT)
• Sign Language-English (Hybrid)
• …

7

Collaboration

• Tilburg (Memory-based Decoding)
• Donostia (Basque MT)
• Aachen (Sign-Language MT)
• Amsterdam (Integrating Syntax & SMT)
• Edinburgh (SMT)
• CMU (Hybrid SMT-EBMT)
• Toshiba Beijing (Chinese MT)
• …

8

Future Work

• MT via SMS
• Automatic Interpreting
• Enhanced hybrid models
• Scalability
• Tuning MT to text type & genre
• MT using Pivot languages
• Better quality phrases (cf. CoNLL monolingual chunking shared task)
• …

9

Current and Future Funding

• Irish Government Sources
  – Science Foundation Ireland
  – Enterprise Ireland
  – IRCSET

• Companies
  – IBM
  – Microsoft

• Under Review
  – EU STREP (MT for Minority Languages)
    • UPC, FBK-IRST, Edinburgh …
  – SFI CSET in Next Generation Localisation
    • TCD, UCD, UL, IBM, Microsoft, Symantec …
