Page 1: Introducing Machine Translation to the European Parliament

Introducing Machine Translation to the European Parliament

Alexandros Poulis [email protected]

DGTRAD.ITS, TOB 02A011

Page 2: Introducing Machine Translation to the European Parliament

Outline

Do we need MT@EP?

What do we need MT for?

One general MT system for all institutions: Is this possible?

How to make MT work for the EP: Current state of the project, future actions

Page 3: Introducing Machine Translation to the European Parliament

Does the EP need MT technology?

MT poll 2009: out of 137 respondents, 21% used MT regularly and 40% sometimes (in DGTRAD)

Today MT is more popular than it used to be

Domain usage statistics

– In early 2010, almost 40,000 hits per month on MT web-sites from DGTRAD alone

– In December 2010, almost 500,000 hits were registered EP-wide

– An additional 1,000 requests to ECMT

Page 4: Introducing Machine Translation to the European Parliament

What do we need MT for?

Working group on Machine Translation

– Needs Study

– Explore and identify Parliament’s needs in our given business and IT environment

Page 5: Introducing Machine Translation to the European Parliament

What do we need MT for?

Members of the EP MT working group

– 5 translators with different backgrounds

– PreTrad

– ITS

– Planning

Under the guidance of Miguel Frank

Page 6: Introducing Machine Translation to the European Parliament

What do we need MT for?

Possible use-cases (1):

– MT as a CAT Tool: Help translators focus on non-trivial, more demanding and creative translation tasks,

– improve quality and

– increase productivity (?)

– Administrative mails are often written only in French or French-English

“Dear colleagues, As not all EU staff speak French, it would be really useful if you could send your emails in English/German as well. It would ensure more support. Je ne peux pas toujours traduire tout pour mes collegues. Merci.” [French: “I cannot always translate everything for my colleagues. Thank you.”]

Page 7: Introducing Machine Translation to the European Parliament

What do we need Machine Translation for?

– Provide EP staff with access to knowledge and information they could not access before because of the language barrier

– More than 1,000 pages a year need to be produced in less than one hour (short deadlines, need for synchronous MT)

Page 8: Introducing Machine Translation to the European Parliament

What do we need MT for?

Possible use-cases (2):

– Provide a risk-free alternative to online MT tools where necessary and where possible

– Help EP members and staff communicate faster and better in languages they do not feel 100% comfortable with

– Try to reduce the cost of outsourced translation

– MT for gisting purposes when there is no need for high-quality human translation. If necessary, light or heavy post-editing can be offered.

Page 9: Introducing Machine Translation to the European Parliament

Can one general purpose MT system serve all institutions?

The more domain-specific an SMT system is, the better the output it gives (data – languages – domain specificity)

Are we re-inventing the wheel by working on MT in the EP while the Commission has almost developed a solution?

Page 10: Introducing Machine Translation to the European Parliament
Page 11: Introducing Machine Translation to the European Parliament
Page 12: Introducing Machine Translation to the European Parliament
Page 13: Introducing Machine Translation to the European Parliament

How can we make this work?

We must know which needs have to be addressed

We need appropriate hardware resources

Interinstitutional cooperation and sharing of information, data and know-how

User feedback

Page 14: Introducing Machine Translation to the European Parliament

Phase 1: building a lab-scale product

Open source Statistical MT tools

Combining EP and EC data from Euramis for various language pairs

We expected this system to be more appropriate for procedural documents

And indeed…

Page 15: Introducing Machine Translation to the European Parliament

BLEU score by document type (EN-PT)
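The chart for this slide is not reproduced in the transcript. As a rough illustration of how a per-document-type BLEU comparison could be computed, the sketch below groups (document type, MT output, reference) triples and scores each group with a simplified corpus-level BLEU (single reference, whitespace tokenisation, n-grams up to 4, brevity penalty). The function names and the sample sentences are made up for illustration; they are not the EP system's data or results, and a real evaluation would more likely rely on an established scoring script.

```python
import math
from collections import Counter, defaultdict


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(pairs, max_n=4):
    """Simplified corpus-level BLEU for (hypothesis, reference) string pairs."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in pairs:
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clipped matches: an n-gram counts at most as often as in the reference.
            matches[n - 1] += sum((h_counts & r_counts).values())
            totals[n - 1] += sum(h_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # no smoothing in this sketch
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


def bleu_by_doc_type(samples):
    """samples: iterable of (doc_type, hypothesis, reference) triples."""
    grouped = defaultdict(list)
    for doc_type, hyp, ref in samples:
        grouped[doc_type].append((hyp, ref))
    return {doc_type: corpus_bleu(pairs) for doc_type, pairs in grouped.items()}


# Hypothetical toy data, only to show the call shape.
samples = [
    ("TC", "the committee adopted the report", "the committee adopted the report"),
    ("QE", "what is commission doing", "which steps will the council take"),
]
scores = bleu_by_doc_type(samples)
```

Because the sketch applies no smoothing, any group without a single matching 4-gram scores 0, so it is only meaningful on reasonably large test sets per document type.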

Page 16: Introducing Machine Translation to the European Parliament

Phase 1: Building a lab-scale product

A general-purpose SMT system based on Euramis data may provide decent translations for certain document types (e.g. TC)

What about QEs (written questions), CREs (verbatim reports of debates) and other document types which account for a large share of our translation production?

In 2010 we produced 488,622 pages of AM documents and… 113,111 pages of QEs! Almost 1 QE page for every 4 AM pages!
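The "1 to 4" comparison can be checked directly from the two page counts on the slide. The short calculation below uses only those two figures; the variable names are illustrative.

```python
# Page counts for 2010 as given on the slide.
am_pages = 488_622
qe_pages = 113_111

# AM pages produced per QE page.
am_per_qe = am_pages / qe_pages

# Share of QE pages within these two document types combined.
qe_share = qe_pages / (am_pages + qe_pages)

print(f"AM pages per QE page: {am_per_qe:.2f}")
print(f"QE share of AM+QE pages: {qe_share:.1%}")
```

The ratio comes out at roughly 4.3 AM pages per QE page, so QEs make up a little under a fifth of the combined AM and QE volume.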

Page 17: Introducing Machine Translation to the European Parliament

Phase 1: Building a lab-scale product

Page 18: Introducing Machine Translation to the European Parliament

Phase 1: building a lab-scale product

Our next steps

– Provide feedback to the EP MT working group

– Customise and optimise for different use-cases

– Integrate into the translation production environment (CAT4TRAD, CAT-Tool)

– Improve efficiency (faster updates of the models when we have new versions of the training corpora)

– Enhance our corpora to create custom engines

– Combine technologies: MT+TM (e.g. enhanced fuzzy matches - Philipp Koehn et al. 2010)
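One simple way to picture the MT+TM combination listed among the next steps is a router that offers the closest translation-memory segment when it is similar enough to the source and falls back to MT otherwise. The sketch below is deliberately simpler than the enhanced-fuzzy-match approach cited on the slide, which repairs the mismatched part of a fuzzy match with MT. The function names, the 0.7 threshold, the `mt_engine` callable and the sample TM entry are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_score(a, b):
    """Word-level similarity in [0, 1] between two segments."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def best_tm_match(source, translation_memory):
    """Return (score, tm_source, tm_target) for the closest TM entry, or None."""
    best = None
    for tm_source, tm_target in translation_memory:
        score = fuzzy_score(source, tm_source)
        if best is None or score > best[0]:
            best = (score, tm_source, tm_target)
    return best


def translate(source, translation_memory, mt_engine, threshold=0.7):
    """Offer the TM target for close matches, otherwise call the MT engine."""
    match = best_tm_match(source, translation_memory)
    if match is not None and match[0] >= threshold:
        return {"origin": "TM", "score": match[0], "target": match[2]}
    return {"origin": "MT", "score": None, "target": mt_engine(source)}


# Hypothetical one-entry TM and a stand-in MT engine.
tm = [("The committee adopted the report", "A comissão aprovou o relatório")]
fake_mt = lambda text: "[MT] " + text

close = translate("The committee adopted the draft report", tm, fake_mt)
distant = translate("Which steps will the Council take", tm, fake_mt)
```

In this toy run the first segment is served from the TM as a fuzzy match for post-editing, while the second has too little overlap and goes to the MT stand-in. The threshold would need tuning against real translator feedback.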

Page 19: Introducing Machine Translation to the European Parliament

Phase 2: Provide a test environment to MT users

Evaluate usability of MT

– as a CAT tool (dissemination)

– for assimilation and communication purposes

Page 20: Introducing Machine Translation to the European Parliament
Page 21: Introducing Machine Translation to the European Parliament

БЛАГОДАРЯ ВИ (BG)
GRÀCIES (CA)
DĚKUJI (CS)
TAK (DA)
DANKE (DE)
ΕΥΧΑΡΙΣΤΟΥΜΕ (EL)
THANK YOU (EN)
GRACIAS (ES)
TÄNAME (ET)
KIITOS (FI)
MERCI (FR)
GO RAIBH MAITH AGAT (GA)
KÖSZÖNJÜK (HU)
GRAZIE (IT)
AČIŪ (LT)
PALDIES (LV)
GRAZZI (MT)
BEDANKT (NL)
DZIĘKUJĘ (PL)
OBRIGADO (PT)
VĂ MULȚUMIM (RO)
ĎAKUJEM (SK)
HVALA (SL)
TACK (SV)