Silke Gutermuth & Silvia Hansen-Schirra

University of Mainz

Germany

Post-editing machine translation: a usability test for professional translation settings

EyeTrackBehavior 2012 | October 9-10 | Leuven

Post-editing?

• “term used for the correction of machine translation output by human linguists/editors” (Veale & Way 1997)

• “taking raw machine translated output and then editing it to produce a 'translation' which is suitable for the needs of the client” (one student explaining post-editing to another)

• “is the process of improving a machine-generated translation with a minimum of manual labour” (TAUS Report 2010)


Degrees of Post-editing

• light or fast post-editing
  - essential corrections only
  - time factor: quick

• full post-editing
  - more corrections => higher quality
  - time factor: slow

(O'Brien et al. 2009)


Background

Motivation: evaluation of machine translation (MT), post-editing of MT, eye-enhanced CAT workbenches (e.g. O'Brien 2011, Doherty et al. 2010, Carl & Jakobsen 2010, Hyrskykari 2006)

Project: in cooperation with Copenhagen Business School (http://www.cbs.dk/Forskning/Institutter-centre/Institutter/CRITT/Menu/Forskningsprojekter)

Experiment:
- English-German
- translation vs. post-editing vs. editing
- 6 source texts (ST) with different complexity levels (Hvelplund 2011)
- 12 professional translators, 12 semi-professional translators
- eye-tracking (Tobii TX 300), key-logging (Translog), retrospective interviews, questionnaires


Translators' self-estimation

[Figure: bar charts of questionnaire responses, professionals vs. students. Panels: "How satisfied with E task?" and "How satisfied with PE task?" (scale: highly satisfied, somewhat satisfied, neutral, somewhat dissatisfied, highly dissatisfied, no answer) and "From scratch rather than PE?" (yes, no, n.a.). Y-axis: number of responses.]



Translators' evaluation of MT quality

[Figure: bar charts of questionnaire responses. Panels: "Rate MT output style", "Rate MT output grammatically", "Rate MT output quality" and "Rate MT output accuracy" (scale: well below average, below average, average, above average, well above average, n.a.). Y-axis: number of responses.]




Professional translators:

The conscious, subjective rating of the machine-translated output is extremely negative.

Can eye-tracking tell a different story, based on objective and measurable facts?


Processing time

[Figure: bar chart of mean processing time in milliseconds for the three tasks P (post-editing), E (editing) and T (translation). Data labels: 1889479.09, 1410542.09, 1270464.13.]

Edited texts quite often suffer from a distortion of meaning => the source text is needed for a good-quality translation => post-editing
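As a reading aid for the mean values on this slide, the following minimal Python sketch converts them from milliseconds to minutes and computes the relative difference between the slowest and the fastest condition. The names `mean_ms` and `ms_to_minutes` are illustrative assumptions. The values are deliberately kept as an unlabelled list, because the slide text does not state which value belongs to which task.

```python
# Mean processing times in milliseconds, as printed on the slide.
# Assumption: the slide text does not say which value belongs to which
# task (P, E, T), so the values are kept as an unlabelled list.
mean_ms = [1889479.09, 1410542.09, 1270464.13]


def ms_to_minutes(ms: float) -> float:
    """Convert a duration in milliseconds to minutes."""
    return ms / 60_000


# Each mean rounded to one decimal place, in minutes.
minutes = [round(ms_to_minutes(value), 1) for value in mean_ms]
print(minutes)

# Relative time saving of the fastest condition compared with the slowest.
saving = 1 - min(mean_ms) / max(mean_ms)
print(f"{saving:.1%}")
```

Read this way, the three means correspond to roughly 31.5, 23.5 and 21.2 minutes, and the fastest condition takes about a third less time than the slowest.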


Processing of ST vs. TT: Translation vs. Post-editing

[Figure: four charts. Panels: fixation duration T_ST vs. T_TT, fixation count T_ST vs. T_TT, fixation duration P_ST vs. P_TT, fixation count P_ST vs. P_TT. X-axis labels recovered: Text 1, Text 2, Text 3.]


Processing metrics: Translation vs. Post-editing

Translation: correlation between increasing ST complexity and TT processing metrics

Post-editing: no significant influence of ST complexity on TT processing metrics


Processing of ST: Translation vs. Post-editing

[Figure: two charts. Panels: fixation duration T_ST vs. P_ST, fixation count T_ST vs. P_ST.]

=> post-editing more efficient. WHY?


Fixation duration of clauses

[Figure: bar chart of average fixation duration (in milliseconds) per clause for T3 and P3, finite (FIN) vs. non-finite (NON-FIN) clauses.]

Good quality of MT for non-finite clauses

ST: to end the suffering
TT-P: um das Leiden zu beenden

ST: Although emphasizing that
TT-P: Obwohl betont wird, dass

ST: to protest against
TT-P: um gegen … zu protestieren

ST: in the wake of fighting flaring up again in Darfur
TT-P: im Zuge des Kampfes gegen ein erneutes Aufflammen in Darfur
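A metric such as "average fixation duration per clause" can be derived from fixation records that have been mapped to clauses. The sketch below shows one possible computation in Python and is not the procedure used in the study. It assumes that each fixation is tagged with a clause identifier and a clause type, that durations are first averaged within each clause, and that clause means are then averaged per type. The records and the function name `mean_duration_by_clause_type` are hypothetical, and the numbers are invented for illustration only.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical fixation records: (clause_id, clause_type, duration_ms).
# These numbers are invented for illustration; they are NOT the study's data.
fixations = [
    ("c1", "FIN", 220), ("c1", "FIN", 180),
    ("c2", "NON-FIN", 150), ("c2", "NON-FIN", 250),
    ("c3", "FIN", 300),
]


def mean_duration_by_clause_type(records):
    """Average the per-clause mean fixation duration within each clause type."""
    per_clause = defaultdict(list)   # clause_id -> list of fixation durations
    clause_type = {}                 # clause_id -> "FIN" or "NON-FIN"
    for clause_id, ctype, duration in records:
        per_clause[clause_id].append(duration)
        clause_type[clause_id] = ctype

    by_type = defaultdict(list)      # clause type -> list of per-clause means
    for clause_id, durations in per_clause.items():
        by_type[clause_type[clause_id]].append(mean(durations))

    return {ctype: mean(values) for ctype, values in by_type.items()}


result = mean_duration_by_clause_type(fixations)
print(result)
```

Averaging within each clause first keeps a clause with many fixations from dominating its clause type. Pooling all fixations per type would be an equally defensible choice and can give different numbers.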


Preliminary Conclusions

Efficient post-editing is possible under the following conditions:

• good machine translation quality

• post-editors who are language experts, i.e. they need

  • knowledge of the conventions of the source and target language

  • knowledge of the text type and register


What's next?

• Analysis of other contrastive differences and gaps

• Analysis of ambiguities and processing problems

• Comparison of complexity levels

• Analysis of monitoring processes during TT production (with Translog)

• Comparison of professionals vs. semi-professionals

• Correlations between process data and quality of participants' outputs

• Comparison with other translation pairs


Bibliography

• Carl, Michael and Jakobsen, Arnt Lykke (2010): Relating Production Units and Alignment Units in Translation Activity Data. In Proceedings of International Workshop on Natural Language Processing and Cognitive Science (NLPCS), Madeira, Portugal.

• Doherty, Stephen, O'Brien, Sharon and Carl, Michael (2010): Eye tracking as an MT evaluation technique. Machine Translation, 24, 1, pp. 1-13.

• Hvelplund, Kristian Tangsgaard (2011): Allocation of cognitive resources in translation: an eye-tracking and key-logging study. PhD thesis, Department of International Language Studies and Computational Linguistics, Copenhagen Business School.

• Hyrskykari, Aulikki (2006): Eyes in Attentive Interfaces: Experiences from Creating iDict, a Gaze-Aware Reading Aid. Dissertation, Tampere University Press.

• O'Brien, Sharon, Roturier, Johann and De Almeida, Giselle (2009): Post-Editing MT Output: Views from the researcher, trainer, publisher and practitioner. http://www.mt-archive.info/MTS-2009-O’Brien-ppt.pdf

• O'Brien, Sharon (2011): Towards Predicting Post-Editing Productivity. Machine Translation, 25, 3, pp. 197-215.

• Postediting in Practice. A TAUS Report, March 2010, p. 6.

• Veale, T. and Way, A. (1997): Gaijin: A Bootstrapping Approach to Example-Based Machine Translation. Recent Advances in Natural Language International Conference, 239-244.

http://www.webcitation.org/5kwbaCYwg



Contact

Silke Gutermuth & Silvia Hansen-Schirra

[email protected] & [email protected]

http://www.staff.uni-mainz.de/hansenss/
