Beyond automated quality scores (Kim Harris, text&form)


Upload: taus-enabling-better-translation

Post on 24-Jan-2017


TRANSCRIPT

Page 1: Beyond automated quality scores (Kim Harris, text&form)

Kim Harris with Aljoscha Burchardt (DFKI), Hans Uszkoreit (DFKI), Arle Lommel (CSA)

BEYOND AUTOMATED QUALITY SCORES

From BLEU to professional error annotation in MT quality estimation and improvement

Page 2: Beyond automated quality scores (Kim Harris, text&form)

Kim Harris • TAUS Roundtable Vienna 2016

BLEU: Status Quo

• The closer a machine translation is to a professional human translation, the better it is
• Relatively high correlation with human judgements
• One of the most popular automated and inexpensive metrics
• Automated quality scores based on comparisons with sets of HT references
• Can be useful for certain estimation tasks but not for improvement
• No ability to assess why scores improve or worsen
• Focus on the score and not the actual quality
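To make the "comparison with sets of HT references" concrete, here is a minimal sketch of a BLEU-style score: clipped n-gram precision against one or more references, combined with a brevity penalty. It is a simplified illustration and not the reference BLEU implementation. The function names (`ngrams`, `modified_precision`, `simple_bleu`) and the default `max_n=2` are assumptions chosen for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against a set of references."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    # An n-gram is credited at most as often as it occurs in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def simple_bleu(candidate, references, max_n=2):
    """Geometric mean of clipped precisions, scaled by a brevity penalty."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    # Use the reference length closest to the candidate length (ties: shorter).
    ref_len = min((len(r) for r in references),
                  key=lambda length: (abs(length - len(candidate)), length))
    if len(candidate) > ref_len:
        brevity = 1.0
    else:
        brevity = math.exp(1 - ref_len / len(candidate))
    return brevity * math.exp(sum(math.log(p) for p in precisions) / max_n)


# Illustrative sentences, invented for this sketch.
references = ["the cat sat on the mat".split()]
score = simple_bleu("the cat sat on a mat".split(), references)
```

The result is a single number. Nothing in it says which words were wrong or why, which is the limitation the slide points at.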

Page 3: Beyond automated quality scores (Kim Harris, text&form)


Error Annotation for MT Improvement

• MQM/DQF error annotation for HT and MT
• Analysis of quality based on real issues
• Ranking/estimation properties
• Use results to improve output

Page 4: Beyond automated quality scores (Kim Harris, text&form)


Annotation: Humans in the HQMT loop

Page 5: Beyond automated quality scores (Kim Harris, text&form)


Error profiles based on MQM annotation

Panels: by languages; by system types
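An error profile of this kind can be thought of as the distribution of annotated issue types within a group, such as a language pair or a system type. The sketch below shows one way to compute it. The record layout, the field names (`language`, `system`, `issue`), the system labels and the annotation records are all invented for illustration and are not data from the talk.

```python
from collections import Counter, defaultdict


def error_profiles(annotations, group_key):
    """Return {group: {issue_type: share of that group's annotated errors}}."""
    counts = defaultdict(Counter)
    for ann in annotations:
        counts[ann[group_key]][ann["issue"]] += 1
    profiles = {}
    for group, issue_counts in counts.items():
        total = sum(issue_counts.values())
        profiles[group] = {issue: n / total for issue, n in issue_counts.items()}
    return profiles


# Invented records for illustration only; each one stands for a single
# annotated error with an MQM-style issue type.
annotations = [
    {"language": "en-de", "system": "system_a", "issue": "Mistranslation"},
    {"language": "en-de", "system": "system_a", "issue": "Omission"},
    {"language": "en-de", "system": "system_b", "issue": "Grammar"},
    {"language": "de-en", "system": "system_b", "issue": "Grammar"},
]

by_language = error_profiles(annotations, "language")
by_system = error_profiles(annotations, "system")
```

Grouping the same annotations by different keys gives the two views named on the slide without re-annotating anything.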

Page 6: Beyond automated quality scores (Kim Harris, text&form)


Error profiles

Page 7: Beyond automated quality scores (Kim Harris, text&form)


New paradigm in HQMT

Error and source barrier analysis
• Moving away from completely automatic
• Analyse MQM errors, linguistic phenomena in target MT
• Compare to source phenomena
• Test suite analysis

• Basis for improved quality translation in MT thanks to categorization and markup of translation barriers in source language
• Mapping (almost) all linguistic phenomena for one language
• Determine possible relationships between phenomena in the source and errors in the target
• Can be used to test different MT systems and domains
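One way to picture the link between source phenomena and target errors is a small test suite. Each item carries the phenomena marked up in its source sentence and the errors annotated in the MT output, and an error rate is computed per phenomenon. The following is a sketch under stated assumptions: the item layout, the phenomenon labels and the records are hypothetical, and the talk does not specify how its test suite is stored or scored.

```python
from collections import defaultdict


def error_rate_by_phenomenon(items):
    """Share of test items per source phenomenon whose MT output has an annotated error."""
    totals = defaultdict(int)
    with_error = defaultdict(int)
    for item in items:
        for phenomenon in item["phenomena"]:
            totals[phenomenon] += 1
            if item["errors"]:
                with_error[phenomenon] += 1
    return {p: with_error[p] / totals[p] for p in totals}


# Hypothetical test-suite items: source phenomena plus errors found in the target.
items = [
    {"phenomena": ["negation"], "errors": ["Mistranslation"]},
    {"phenomena": ["negation"], "errors": []},
    {"phenomena": ["multi-word expression"], "errors": ["Mistranslation"]},
    {"phenomena": ["negation", "multi-word expression"], "errors": ["Omission"]},
]

rates = error_rate_by_phenomenon(items)
```

Running the same items through several MT systems or domains and comparing the resulting rates would correspond to the last bullet above.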

Page 8: Beyond automated quality scores (Kim Harris, text&form)


Enter: The Test Suite

Page 9: Beyond automated quality scores (Kim Harris, text&form)


Structure of Barrier Categories

Page 10: Beyond automated quality scores (Kim Harris, text&form)


Beyond BLEU

Page 11: Beyond automated quality scores (Kim Harris, text&form)


The Bigger Vision

Page 12: Beyond automated quality scores (Kim Harris, text&form)

Quality Translation 21 (QT21) has received funding from the EU’s Horizon 2020 research and innovation programme under grant no. 645452. META-QT has received funding from the EU’s Horizon 2020 research and innovation programme through the contract CRACKER (grant agreement no.: 645357). Formerly co-funded by FP7 and ICT PSP through the contracts T4ME (grant agreement no.: 249119), CESAR (grant agreement no.: 271022), METANET4U (grant agreement no.: 270893) and META-NORD (grant agreement no.: 270899).

Thank you!