Assessing the Usability of Machine Translated Content: A User-Centred Study using Eye Tracking
Dr. Stephen Doherty & Dr. Sharon O’Brien
Centre for Next Generation Localisation
School of Applied Language & Intercultural Studies
Dublin City University
Outline
Introduction
Research Aims
Methods
Results
Conclusions
Introduction
Increased need for translation
Diversity of content and users
Rise in prevalence of machine translation [MT] both off- and online
Mixed reports of quality – attitudes and expectations
Divergence in R&D – translation studies/computer science
Evaluation metrics – human and automatic
Our focus here is on usability
Research Aims
To investigate if there are differences in usability between the English [source language] and the unedited machine translated target languages [FR, DE, SP, JP].
Or in other words: how usable is machine translated content?
Adoption of the ISO/TR 16982 definition of usability
Importance of ecological validity: real materials and users
Methods
User-centred approach [n = 30]; task driven – ‘new user’ scenario
Eye tracking [Tobii 1750]:
Fixation count and average duration
Attentional shifts; percentage time in each window
Textual regressions
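The eye-tracking measures listed above can be sketched in code. This is a minimal illustration, not the study's analysis pipeline: the `Fixation` record, the window labels, and the word-index definition of a regression (a backward jump within the same window) are all assumptions made for the example, and the sample values are invented.

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    window: str         # hypothetical window label, e.g. "instructions" or "task"
    word_index: int     # assumed position of the fixated word in reading order
    duration_ms: float  # fixation duration in milliseconds

def fixation_count(fixations):
    return len(fixations)

def average_duration(fixations):
    # Mean fixation duration; 0.0 for an empty recording.
    if not fixations:
        return 0.0
    return sum(f.duration_ms for f in fixations) / len(fixations)

def attentional_shifts(fixations):
    # A shift is counted whenever consecutive fixations land in different windows.
    return sum(1 for a, b in zip(fixations, fixations[1:]) if a.window != b.window)

def percentage_time_per_window(fixations):
    # Share of total fixation time spent in each window, in percent.
    total = sum(f.duration_ms for f in fixations)
    shares = {}
    for f in fixations:
        shares[f.window] = shares.get(f.window, 0.0) + f.duration_ms
    return {w: 100.0 * t / total for w, t in shares.items()} if total else {}

def regression_distances(fixations):
    # Assumed definition: a backward jump between consecutive fixations
    # in the same window; the distance is measured in words.
    return [
        a.word_index - b.word_index
        for a, b in zip(fixations, fixations[1:])
        if a.window == b.window and b.word_index < a.word_index
    ]

# Invented sample recording, for illustration only.
sample = [
    Fixation("instructions", 0, 200),
    Fixation("instructions", 1, 250),
    Fixation("instructions", 5, 300),
    Fixation("instructions", 2, 150),  # backward jump of 3 words
    Fixation("task", 0, 400),
    Fixation("instructions", 3, 200),
]

if __name__ == "__main__":
    print(fixation_count(sample), average_duration(sample))
    print(attentional_shifts(sample), percentage_time_per_window(sample))
    print(regression_distances(sample))
```

The raw number of regressions is the length of the returned list, and a 'long' regression could be flagged by applying a distance threshold to it; any such threshold would be a further assumption.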
Methods
Post-task questionnaire; five-point Likert
Comprehension
Task completion
Potential improvement
Future reuse
Recommendation
Recall
Methods
Usability
Satisfaction
Efficiency [task success/task time]
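The efficiency measure is given as task success divided by task time. A minimal sketch of that ratio follows; the units are assumptions made for the example (success as a proportion between 0 and 1, time in minutes), and the numbers in the usage lines are invented, not study data.

```python
def efficiency(task_success: float, task_time: float) -> float:
    """Efficiency as task success divided by task time.

    Assumed units: task_success is a proportion in [0, 1] and
    task_time is in minutes, so the result is success per minute.
    """
    if task_time <= 0:
        raise ValueError("task_time must be positive")
    return task_success / task_time

if __name__ == "__main__":
    # Invented values: equal success, different completion times.
    print(efficiency(0.8, 4.0))
    print(efficiency(0.8, 8.0))
```

With success held constant, doubling the task time halves the efficiency score, which is why a slower group can score lower on efficiency even when it completes the same tasks.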
Eye Tracking
Task time
Lowest for EN [sig. JP]
Fixation count and average duration
Lowest for EN [sig. JP] for both
Attentional shifts; percentage time in each window
EN and FR spent most time in task window
EN fewest shifts of attention [sig. JP]
Textual regressions
Raw number and distance: EN and SP [sig. JP]
‘Long’ regressions: JP [sig. all others]
Questionnaire Results
Comprehension
EN rated highest [sig. for FR and JP]
Task completion
EN rated highest [sig. for JP]
Potential improvement
SP & EN rated as needing least improvement, but could still be improved upon
Future reuse
FR & EN rated highest
Recommendation
EN rated highest [sig. for JP and DE]
Recall
EN scored highest [sig. for JP and DE]
Usability Results
Satisfaction
EN rated highest [sig. for FR, DE, and JP]
Task completion
EN and SP more successful [sig. JP]
Efficiency
EN most efficient [sig. JP and DE]
Conclusions
So, just how usable is raw MT?
Similar results for EN, SP, and FR
DE and JP more problematic [MT system]
Functionally usable [more than just ‘gisting’]
UX best for EN users
MT viable for certain pairs
Human intervention necessary to ensure best UX
Questions?
This research is supported by the Science Foundation Ireland (Grant 07/CE/I1142) as part of the Centre for Next Generation Localisation (www.cngl.ie) at Dublin City University.
Predictors of Positive UX
Satisfied users: comprehension & task time
Satisfied users: recommend to others
Task completion: textual regressions
Cognitive effort: instructions aiding task completion