Silke Gutermuth & Silvia Hansen-Schirra
University of Mainz
Germany
Post-editing machine translation
– a usability test for
professional translation settings
EyeTrackBehavior 2012 | October 9-10 | Leuven
Post-editing?
• “term used for the correction of machine translation output by human linguists/editors” (Veale & Way 1997)
• “taking raw machine translated output and then editing it to produce a 'translation' which is suitable for the needs of the client” (one student explaining post-editing to another)
• “is the process of improving a machine-generated translation with a minimum of manual labour”
(TAUS Report 2010)
Degrees of Post-editing
• light or fast post-editing
- essential corrections only
- time factor: quick
• full post-editing
- more corrections => higher quality
- time factor: slow
(O'Brien et al. 2009)
Background
Motivation: evaluation of machine translation (MT), post-editing of MT, eye-enhanced CAT workbenches (e.g. O'Brien 2011, Doherty et al. 2010, Carl & Jakobsen 2010, Hyrskykari 2006)
Project: in cooperation with Copenhagen Business School (http://www.cbs.dk/Forskning/Institutter-centre/Institutter/CRITT/Menu/Forskningsprojekter)
Experiment:
- English-German
- translation vs. post-editing vs. editing
- 6 source texts (ST) with different complexity levels (Hvelplund 2011)
- 12 professional translators, 12 semi-professional translators
- eye-tracking (Tobii TX 300), key-logging (Translog), retrospective interviews, questionnaires
Translators' self-estimation
[Chart: How satisfied with E task? Ratings: highly satisfied, somewhat satisfied, neutral, somewhat dissatisfied, highly dissatisfied, no answer. Series: professionals, students]
[Chart: How satisfied with PE task? Ratings: highly satisfied, somewhat satisfied, neutral, somewhat dissatisfied, highly dissatisfied, no answer. Series: professionals, students]
[Chart: From scratch rather than PE? Answers: yes, no, n.a. Series: professionals, students]
Translators' evaluation of MT quality
[Chart: Rate MT output style. Ratings: well below average, below average, average, above average, well above average, n.a. Series: professionals, students]
[Chart: Rate MT output grammatically. Ratings: well below average, below average, average, above average, well above average, n.a. Series: professionals, students]
[Chart: Rate MT output quality. Ratings: well below average, below average, average, above average, well above average, n.a. Series: professionals, students]
[Chart: Rate MT output accuracy. Ratings: well below average, below average, average, above average, well above average, n.a. Series: professionals, students]
Translators' evaluation of MT quality
Professional translators:
their conscious, subjective rating of machine-translated output is extremely negative.
Can eye-tracking tell a different story, based on objective and measurable facts?
Processing time
[Chart: mean processing time in milliseconds for the tasks P, E and T (post-editing, editing, translation). Data labels: 1889479.09, 1410542.09, 1270464.13]
Edited texts quite often suffer from a distortion of meaning => the source text is needed for a good-quality translation => post-editing
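For readability, the mean processing times on the chart can be converted from milliseconds to minutes. The following is only a minimal sketch: the three values are the chart's data labels in the order they appear in the transcript, and since the transcript does not state which task (P, E or T) each value belongs to, no task is assigned to them here. The names `MEAN_MS` and `ms_to_minutes` are illustrative.

```python
# Data labels from the processing-time chart, in milliseconds, in transcript order.
# Assumption: the task (P, E or T) behind each value is not stated, so none is assigned.
MEAN_MS = [1889479.09, 1410542.09, 1270464.13]


def ms_to_minutes(ms: float) -> float:
    """Convert a duration in milliseconds to minutes."""
    return ms / 60_000


# Round to one decimal place for display.
minutes = [round(ms_to_minutes(value), 1) for value in MEAN_MS]
print(minutes)  # [31.5, 23.5, 21.2]
```

In other words, the three mean processing times are roughly 31.5, 23.5 and 21.2 minutes.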
Processing of ST vs. TT: Translation
[Chart: fixation duration on ST vs. TT during translation (T_ST, T_TT), Text 1 to Text 3]
[Chart: fixation count on ST vs. TT during translation (T_ST, T_TT), Text 1 to Text 3]
Processing of ST vs. TT: Translation vs. Post-editing
[Chart: fixation duration on ST vs. TT during post-editing (P_ST, P_TT), Text 1 to Text 3]
[Chart: fixation count on ST vs. TT during post-editing (P_ST, P_TT), Text 1 to Text 3]
Processing metrics: Translation vs. Post-editing
Translation: correlation between increasing ST complexity and TT processing metrics
Post-editing: no significant influence of ST complexity on TT processing metrics
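The processing metrics compared on these slides are fixation count and fixation duration on the ST and TT. As a rough illustration of how such per-window metrics could be derived from a list of fixations, here is a minimal sketch. The record layout (`Fixation`, `window`, `duration_ms`), the function `summarize` and the sample values are hypothetical; they are not the study's Tobii or Translog data format, and the mean is only one possible way to aggregate duration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Fixation:
    window: str         # hypothetical area-of-interest label: "ST" or "TT"
    duration_ms: float  # fixation duration in milliseconds


def summarize(fixations):
    """Return {window: (fixation_count, mean_duration_ms)}."""
    totals = defaultdict(lambda: [0, 0.0])  # window -> [count, summed duration]
    for fixation in fixations:
        totals[fixation.window][0] += 1
        totals[fixation.window][1] += fixation.duration_ms
    return {window: (count, total / count) for window, (count, total) in totals.items()}


# Made-up sample values, for illustration only.
sample = [
    Fixation("ST", 200.0),
    Fixation("TT", 300.0),
    Fixation("TT", 500.0),
    Fixation("ST", 240.0),
]
metrics = summarize(sample)
print(metrics)  # {'ST': (2, 220.0), 'TT': (2, 400.0)}
```

Running the same summary per text and per task (translation vs. post-editing) would give the kind of ST/TT values that the charts compare.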
Processing of ST: Translation vs. Post-editing
[Chart: fixation duration on ST, translation vs. post-editing (T_ST, P_ST), Text 1 to Text 3]
[Chart: fixation count on ST, translation vs. post-editing (T_ST, P_ST), Text 1 to Text 3]
=> post-editing more efficient
WHY?
Fixation Duration of clauses
[Chart: average fixation duration (in milliseconds) per clause, finite (FIN) vs. non-finite (NON-FIN) clauses, for T3 and P3]
Good quality of MT for non-finite clauses
ST: to end the suffering | TT-P: um das Leiden zu beenden
ST: Although emphasizing that | TT-P: Obwohl betont wird, dass
ST: to protest against | TT-P: um gegen … zu protestieren
ST: in the wake of fighting flaring up again in Darfur | TT-P: im Zuge des Kampfes gegen ein erneutes Aufflammen in Darfur
Preliminary Conclusions
Efficient post-editing is possible under the following conditions:
• good machine translation quality
• post-editors who are language experts, i.e. they need
- knowledge of the conventions of the source and target language
- knowledge of the text type and register
What's next?
• Analysis of other contrastive differences and gaps
• Analysis of ambiguities and processing problems
• Comparison of complexity levels
• Analysis of monitoring processes during TT production (with Translog)
• Comparison of professionals vs. semi-professionals
• Correlations between process data and quality of participants' outputs
• Comparison with other translation pairs
Bibliography
• Carl, Michael and Jakobsen, Arnt Lykke (2010): Relating Production Units and Alignment Units in Translation Activity Data. In Proceedings of International Workshop on Natural Language Processing and Cognitive Science (NLPCS), Madeira, Portugal.
• Doherty, Stephen, O'Brien, Sharon and Carl, Michael (2010): Eye tracking as an MT evaluation technique. Machine Translation, 24, 1, pp. 1-13.
• Hvelplund, Kristian Tangsgaard (2011): Allocation of cognitive resources in translation: an eye-tracking and key-logging study. PhD thesis, Department of International Language Studies and Computational Linguistics, Copenhagen Business School.
• Hyrskykari, Aulikki (2006): Eyes in Attentive Interfaces: Experiences from Creating iDict, a Gaze-Aware Reading Aid. Dissertation, Tampere University Press.
• O'Brien, Sharon, Roturier, Johann and De Almeida, Giselle (2009): Post-Editing MT Output: Views from the researcher, trainer, publisher and practitioner. http://www.mt-archive.info/MTS-2009-O’Brien-ppt.pdf
• O'Brien, Sharon (2011): Towards Predicting Post-Editing Productivity. Machine Translation, 25, 3, pp. 197-215.
• TAUS (2010): Postediting in Practice. A TAUS Report, March 2010, p. 6.
• Veale, T. and Way, A. (1997): Gaijin: A Bootstrapping Approach to Example-Based Machine Translation. Recent Advances in Natural Language International Conference, pp. 239-244.
Contact
Silke Gutermuth & Silvia Hansen-Schirra
http://www.staff.uni-mainz.de/hansenss/