Crowdsourcing is for the tail
Talk given at the Dagstuhl Seminar 14282 "Crowdsourcing and the Semantic Web"
Gianluca Demartini
eXascale Infolab
University of Fribourg, Switzerland
gianlucademartini.net
exascale.info
Crowdsourced Data Curation
• Enforce quality and coverage in KBs
• Curate the structured representation of tail entities
• Leveraging the diversity of the crowd
• Targeted Crowdsourcing
The long tail of entity popularity
Tail Entities
• Local restaurants
• Niche sports domains (chess, cricket)
• Emerging music bands
• Rare diseases
Improving Crowdsourcing Platforms
Push Crowdsourcing
• Pick-A-Crowd: A system architecture that uses Task-to-Worker matching:
  – The worker’s social profile
  – The task context
• Workers can provide higher quality answers on tasks they relate to
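The matching idea above can be illustrated with a minimal sketch. It is not the Pick-A-Crowd algorithm from the paper cited below: it assumes, purely for illustration, that a worker's social profile and a task's context are both reduced to keyword sets and compared with Jaccard overlap. The function names, worker ids, and keywords are all hypothetical.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two keyword sets, in [0, 1]."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_workers(task_context: set[str],
                 profiles: dict[str, set[str]],
                 top_k: int = 2) -> list[str]:
    """Return up to top_k workers whose profile overlaps the task context.

    Assumption: a profile is a set of topics the worker likes, and the
    task context is a set of topic keywords describing the task.
    """
    scored = [(jaccard(task_context, likes), worker)
              for worker, likes in profiles.items()]
    # Highest score first; ties are broken alphabetically by worker id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Workers with no overlap at all are never pushed the task.
    return [worker for score, worker in scored[:top_k] if score > 0]


# Hypothetical profiles and task, for illustration only.
profiles = {
    "alice": {"chess", "cricket", "jazz"},
    "bob": {"football", "pop music"},
    "carol": {"cricket", "cooking"},
}
task = {"cricket", "chess"}
ranked = rank_workers(task, profiles)
print(ranked)
```

Here the task would be pushed to "alice" first and "carol" second, while "bob" is skipped because nothing in his profile relates to the task.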
Djellel Eddine Difallah, Gianluca Demartini, and Philippe Cudré-Mauroux. Pick-A-Crowd: Tell Me What You Like, and I'll Tell You What to Do. In: 22nd International Conference on World Wide Web (WWW 2013), Rio de Janeiro, Brazil, May 2013.
Pick-A-Crowd
Discussion
• Task-to-Worker recommendation / Matchmaking
• Experimental comparison with AMT shows a consistent quality improvement
“Workers Know what they Like”
OpenTurk
• Yet another platform? Build on top of MTurk!
• Chrome Extension for push / notification
• 400+ users
• http://bit.ly/openturk-extension
• Open source: https://github.com/openturk/extension
Transactive Search
• Transactive Memories
• Transactive Search:
  – Memory reconstructed by a group of people
  – Need to target the right people
  – A form of Targeted Crowdsourcing
• “Who attended the ISWC 2013 conference?”
Transactive Search
• Machines: Harvest the Web + Data Mining
• Crowd: Search Twitter, look at event pictures
• Transactive Memories: Remember who I met
Michele Catasta, Alberto Tonon, Djellel Eddine Difallah, Gianluca Demartini, Karl Aberer, and Philippe Cudré-Mauroux. Hippocampus: Answering Memory Queries using Transactive Search. In: 23rd International Conference on World Wide Web (WWW 2014), Web Science Track. Seoul, South Korea, April 2014.
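One way to picture how the three sources above could be combined is the sketch below. It is an assumption-laden illustration, not the Hippocampus method: each source is assumed to return candidate answers with a confidence in [0, 1], and the per-source weights are arbitrary values chosen for the example. All names (`merge_candidates`, `person_a`, and so on) are hypothetical.

```python
from collections import defaultdict


def merge_candidates(sources: dict[str, dict[str, float]],
                     weights: dict[str, float]) -> list[tuple[str, float]]:
    """Combine candidate answers from several sources into one ranking.

    Each candidate's score is the weighted sum of the confidences
    reported by the sources that proposed it.
    """
    totals: dict[str, float] = defaultdict(float)
    for source, candidates in sources.items():
        for name, confidence in candidates.items():
            totals[name] += weights[source] * confidence
    # Highest combined score first; ties are broken alphabetically.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical candidates for a memory query such as "Who attended the event?"
sources = {
    "machine": {"person_a": 0.5, "person_b": 0.5},  # web harvesting + mining
    "crowd": {"person_b": 1.0},                     # workers checking photos
    "memory": {"person_c": 1.0},                    # an attendee's recollection
}
# Assumed weights: first-hand memories are trusted most in this sketch.
weights = {"machine": 1.0, "crowd": 2.0, "memory": 4.0}

merged = merge_candidates(sources, weights)
for name, score in merged:
    print(name, score)
```

The weighting is the design choice that matters: trusting first-hand memories more than automatic harvesting reflects the need to target the right people, but the actual values would have to be tuned on real data.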
Who attended ISWC 2013?
Conclusions
• Crowdsourcing For Tail Entities
• Focusing on the difficult part of the KB
  – The tail is long!
• Challenges
  – Which tail entities are valuable?
  – Who is the right worker?
  – Focus on passion rather than monetary incentives