Building a Persistent Workforce on Mechanical Turk for Multilingual Data Collection
David L. Chen, The University of Texas at Austin
William B. Dolan, Microsoft Research
The 3rd Human Computation Workshop (HCOMP), August 8, 2011
Introduction
• Collect translation and paraphrase data
• Statistical machine translation systems require large amounts of parallel corpora
• Professional translators are expensive
– E.g., $0.36/word to create a Tamil-English corpus [Germann, ACL 2001 Workshop]
• No similar resource for paraphrases
Variety of Natural Language Processing tasks done on Mechanical Turk
• Collecting paraphrase data
– Buzek et al., NAACL 2010 AMT workshop
– Denkowski et al., NAACL 2010 AMT workshop
• Collecting translation data
– Ambati and Vogel, NAACL 2010 AMT workshop
– Zaidan and Callison-Burch, ACL 2011
• Evaluating machine translation quality
– Callison-Burch, EMNLP 2009
– Denkowski and Lavie, NAACL 2010 AMT workshop
Issues of using Mechanical Turk
• Catching people using machine translations
– Collect and check against online translations
– Use an image instead of text
• Improving poor translations
– Ask other workers to edit the translations
– Ask other workers to rank the translations
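The "check against online translations" idea can be sketched as a near-duplicate test. This is only an illustration, not the method used by any of the cited work: it assumes the requester has already collected machine-translation outputs for the source sentence by some other means, and the function names and the 0.9 threshold are hypothetical.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Casefold, drop Unicode punctuation, and collapse whitespace."""
    kept = (ch for ch in text.casefold()
            if not unicodedata.category(ch).startswith("P"))
    return " ".join("".join(kept).split())


def looks_machine_translated(submission: str,
                             mt_outputs: list[str],
                             threshold: float = 0.9) -> bool:
    """Return True if the submission nearly matches any known MT output.

    `threshold` is a hypothetical cut-off on difflib's similarity ratio.
    """
    cleaned = normalize(submission)
    return any(
        SequenceMatcher(None, cleaned, normalize(mt)).ratio() >= threshold
        for mt in mt_outputs
    )


if __name__ == "__main__":
    # Hypothetical MT outputs gathered beforehand for one source sentence.
    known_mt = ["A woman is breading some meat."]
    print(looks_machine_translated("a woman is breading some meat", known_mt))
```

A character-level ratio is a crude signal; a flagged submission would still need a human look before any rejection.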
Issues of using Mechanical Turk
• Attracting workers
– Higher pay doesn’t always mean higher quality
– Workers are more likely to cheat if pay is high
• Making tasks simple and quick enough
– Difficult to collect data when typed input is required rather than button selections
– More difficult to translate whole sentences than short phrases
Use video to collect language data
• Collect descriptions of videos (Chen and Dolan, ACL 2011)
• Parallel descriptions in different languages → Translation data
• Parallel descriptions in the same language → Paraphrase data
• Also useful as video annotation data
Video description task
• Show a short YouTube video to the user
• Ask them to write a single-sentence description in any language
– Removes incentive to cheat
• Requires only monolingual speakers
– Similar to work by Hu et al. [CHI 2011]
Annotation Task
Describe video in a single sentence
Example Descriptions
• Someone is coating a pork chop in a glass bowl of flour.
• A person breads a pork chop.
• Someone is breading a piece of meat with a white powdery substance.
• A chef seasons a slice of meat.
• Someone is putting flour on a piece of meat.
• A woman is adding flour to meat.
• A woman is coating a piece of pork with breadcrumbs.
• A man dredges meat in bread crumbs.
• A person breads a piece of meat.
• A woman is breading some meat.
• A woman coats a meat cutlet in a dish.
Example Descriptions
• Ženska panira zrezek. (Slovene)
• правење шницла (Macedonian)
• Jemand wendet ein Stück Fleisch in Paniermehl (German)
• Une femme trempe une cotelette de porc dans de la chapelure. (French)
• cineva da o felie de carne prin pesmet (Romanian)
• गोश में कुछ ओत्ता मिमला रहा हे (Hindi)
• ქალი ხორცს ავლებს რაღაცაში (Georgian)
• Dame haalt lap vlees door bloem. (Dutch)
• Šnicla mesa se valja u smesu začina (Serbian)
• Unas manos embadurnan un trozo de carne en un recipiente (Spanish)
Quality Control
Tier 1: $0.01 per description
Tier 2: $0.05 per description
Initially, everyone has access only to Tier-1 tasks
Good workers are promoted to Tier-2 based on the number of descriptions submitted, English fluency, and the quality of their descriptions
The two tiers have identical tasks but different pay rates
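The tiered scheme can be sketched in a few lines. The pay rates and the three promotion criteria come from the slides. Everything else is an assumption for illustration: the slides do not say how the criteria were combined or whether promotion was automated, so the `Worker` fields, the 0-to-1 quality score, and both thresholds are hypothetical.

```python
from dataclasses import dataclass

# Pay per description in US cents, as stated on the slides.
PAY_CENTS = {1: 1, 2: 5}


@dataclass
class Worker:
    worker_id: str
    num_descriptions: int     # descriptions submitted so far
    fluent_in_english: bool   # assumed to be judged by a reviewer
    quality_score: float      # hypothetical rating between 0.0 and 1.0
    tier: int = 1             # everyone starts in Tier 1


def eligible_for_tier2(worker: Worker,
                       min_descriptions: int = 50,
                       min_quality: float = 0.8) -> bool:
    """Apply the three criteria; both thresholds are made-up defaults."""
    return (worker.num_descriptions >= min_descriptions
            and worker.fluent_in_english
            and worker.quality_score >= min_quality)


def promote_eligible(workers: list[Worker]) -> list[str]:
    """Move eligible Tier-1 workers to Tier 2 and return their IDs."""
    promoted = []
    for worker in workers:
        if worker.tier == 1 and eligible_for_tier2(worker):
            worker.tier = 2
            promoted.append(worker.worker_id)
    return promoted


def payout_cents(worker: Worker, num_new_descriptions: int) -> int:
    """Same task in both tiers; only the per-description rate differs."""
    return PAY_CENTS[worker.tier] * num_new_descriptions
```

Keeping amounts in integer cents avoids floating-point rounding when payouts are summed.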
Video collection task
• Ask workers to submit video clips from YouTube
• Single, unambiguous action/event
• Short (4-10 seconds)
• Generally accessible
• No dialogue
• No words (subtitles, overlaid text, titles)
• Also uses multi-tiered payment system
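The clip criteria above read like a checklist, which could be encoded as follows. This is a sketch under assumptions: only the 4-10 second range is taken from the slides, while the `ClipSubmission` fields are hypothetical and would in practice be filled in by a human reviewer, since most of the criteria cannot be checked automatically.

```python
from dataclasses import dataclass

MIN_SECONDS, MAX_SECONDS = 4, 10  # length range given on the slide


@dataclass
class ClipSubmission:
    url: str
    duration_seconds: float
    single_action: bool        # one unambiguous action or event
    has_dialogue: bool
    has_on_screen_text: bool   # subtitles, overlaid text, titles
    generally_accessible: bool


def violations(clip: ClipSubmission) -> list[str]:
    """Return the criteria the clip fails; an empty list means it passes."""
    problems = []
    if not clip.single_action:
        problems.append("not a single, unambiguous action/event")
    if not MIN_SECONDS <= clip.duration_seconds <= MAX_SECONDS:
        problems.append("length outside 4-10 seconds")
    if not clip.generally_accessible:
        problems.append("not generally accessible")
    if clip.has_dialogue:
        problems.append("contains dialogue")
    if clip.has_on_screen_text:
        problems.append("contains on-screen words")
    return problems
```

Returning the list of failed criteria, rather than a bare pass/fail, gives the requester something concrete to put in a rejection message.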
Daily number of descriptions collected
[Chart: number of descriptions collected per day; x-axis: date, 7/21 to 9/7; y-axis: 0 to 14,000 descriptions]
Distribution of languages collected
| Language | Descriptions | Language | Descriptions |
|---|---|---|---|
| English | 85550 (33855) | Spanish | 1883 |
| Hindi | 6245 | Gujarati | 1437 |
| Romanian | 3998 | Russian | 1243 |
| Slovene | 3584 | French | 1226 |
| Serbian | 3420 | Italian | 953 |
| Tamil | 2789 | Georgian | 907 |
| Dutch | 2735 | Polish | 544 |
| German | 2326 | Chinese | 494 |
| Macedonian | 1915 | Malayalam | 394 |

• Other: Tagalog, Portuguese, Norwegian, Filipino, Estonian, Turkish, Arabic, Urdu, Hungarian, Indonesian, Malay, Bulgarian, Danish, Bosnian, Marathi, Swedish, Albanian
Statistics of data collected
• Total money spent: $5000
• Total number of workers: 835
[Pie chart: number of workers; legend: Tier-1, Tier-2, Non-English; values shown: 633, 50, 152]
[Pie chart: money spent; legend: Tier-1, Tier-2, Non-English, Misc; values shown: $510, $1691, $1260, $1539]
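A quick arithmetic check confirms that the chart values add up to the stated totals. The sketch below deliberately treats the values as unlabeled lists, because the transcript does not preserve which pie slice belongs to which legend entry. The variable and function names are illustrative.

```python
# Values as they appear on the slide; slice-to-legend pairing is not assumed.
worker_counts = [633, 50, 152]
dollars_spent = [510, 1691, 1260, 1539]

TOTAL_WORKERS = 835   # stated on the slide
TOTAL_SPENT = 5000    # stated on the slide, in US dollars


def percent_shares(values: list[int]) -> list[float]:
    """Each value as a percentage of the list total, to one decimal place."""
    total = sum(values)
    return [round(100 * value / total, 1) for value in values]


def average_spend_per_worker() -> float:
    """Overall dollars spent divided by overall worker count."""
    return TOTAL_SPENT / TOTAL_WORKERS


if __name__ == "__main__":
    print(sum(worker_counts) == TOTAL_WORKERS)
    print(sum(dollars_spent) == TOTAL_SPENT)
    print(percent_shares(worker_counts))
    print(percent_shares(dollars_spent))
    print(round(average_spend_per_worker(), 2))
```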
Number of descriptions submitted
[Chart: sorted histogram of the number of descriptions submitted by each of the top 255 workers; y-axis: 0 to 4,000 descriptions]
Persistent workforce
• Returning workers who continually work on our task
• Several workers annotated all available videos, in both the Tier-1 and Tier-2 tasks
• Easier to model and evaluate workers when they submit many annotations
• Once identified, good workers require little or no supervision
Sample worker responses
• “The speed of approval gave me confidence that you would pay me for future work.”
• “Posters on Turker Nation recommended it as both high-paying and interesting. Both ended up being true.”
• “I consider this task pretty enjoyable. some videos are funny, others interesting.”
• “Fast, easy, really fun to do it.”
Lessons learned
• Design the tasks well
– Short, simple, easy, accessible, and fun
• Learn from worker responses
• Compensate the workers fairly
• Communicate
– Quick task approval
– Explain rejections
– Respond to comments and emails
Conclusion
• Introduced a novel way to collect translation and paraphrase data
• Conducted a successful pilot data collection on Mechanical Turk
– Tiered payment system
– High-quality persistent workforce
• All data (including most of the videos) available for download at http://www.cs.utexas.edu/users/ml/clamp/videoDescription/