Amazon Mechanical Turk: Artificial Artificial Intelligence. Presenter: Chien-Ju Ho, 2009.4.21


TRANSCRIPT

  • Slide 1
  • Presenter: Chien-Ju Ho 2009.4.21
  • Slide 2
  • Introduction to Amazon Mechanical Turk; applications; demographics and statistics; the value of using MTurk: repeated labeling, a machine-learning perspective
  • Slide 3
  • The Automaton Chess Player (the original "Mechanical Turk"), built in the 18th century.
  • Slide 4
  • Human Intelligence Task (HIT): a task that is hard for computers. Developer: prepay the money, publish HITs, get the results. Worker: complete the HITs, get paid. (A minimal API sketch of this workflow follows below.)
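
A minimal sketch of the developer side of this workflow, using boto3's MTurk client against the requester sandbox. The question file, title, reward, and other parameters are illustrative placeholders, not values from the talk:

    import boto3  # assumes AWS credentials with MTurk requester access are configured

    # The sandbox endpoint lets you test HITs without paying real workers.
    mturk = boto3.client(
        "mturk",
        region_name="us-east-1",
        endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
    )

    # Developer: publish a HIT (the reward is drawn from the prepaid account balance).
    question_xml = open("image_tagging_question.xml").read()  # hypothetical QuestionForm XML
    hit = mturk.create_hit(
        Title="Tag the objects in this image",
        Description="Type a few keywords describing the image.",
        Keywords="image, tagging",
        Reward="0.05",
        MaxAssignments=3,                 # ask several workers, e.g. for repeated labeling
        LifetimeInSeconds=24 * 3600,
        AssignmentDurationInSeconds=300,
        Question=question_xml,
    )
    hit_id = hit["HIT"]["HITId"]

    # Developer: collect results once workers have submitted, and approve so workers get paid.
    submitted = mturk.list_assignments_for_hit(HITId=hit_id, AssignmentStatuses=["Submitted"])
    for assignment in submitted["Assignments"]:
        print(assignment["WorkerId"], assignment["Answer"])  # answer is QuestionFormAnswers XML
        mturk.approve_assignment(AssignmentId=assignment["AssignmentId"])
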
  • Slide 5
  • User Survey
  • Slide 6
  • Image Tagging
  • Slide 7
  • Data Collection
  • Slide 8
  • Lots of applications. Audio transcription: split the audio into 30-second pieces (a splitting sketch follows below). Image filtering: filter out porn or otherwise inappropriate images.
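
A small sketch of the 30-second splitting step, assuming pydub (with ffmpeg) is available; the file names are hypothetical:

    from pydub import AudioSegment

    CHUNK_MS = 30 * 1000  # 30-second pieces, one per HIT

    def split_audio(path, out_prefix):
        # Split a long recording into 30-second chunks for transcription HITs.
        audio = AudioSegment.from_file(path)
        for i, start in enumerate(range(0, len(audio), CHUNK_MS)):
            audio[start:start + CHUNK_MS].export(f"{out_prefix}_{i:03d}.mp3", format="mp3")

    split_audio("interview.mp3", "interview_chunk")
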
  • Slide 9
  • It depends on the task. Some information on payment per HIT: >= $0.01: 586; >= $0.05: 357; >= $0.10: 264; >= $0.50: 74; >= $1.00: 48; >= $5.00: 5
  • Slide 10
  • Slide 11
  • Survey of 1,000 Turkers, conducted twice (Oct. 2008 and Dec. 2008) with consistent statistics. Blog post: A Computer Scientist in a Business School. Where are Turkers from? United States 76.25%, India 8.03%, United Kingdom 3.34%, Canada 2.34%
  • Slide 12
  • Degree, age, gender, income per year
  • Slide 13
  • Using the data from ComScore for comparison. In summary, Turkers are: younger (21-35 year-olds: 51% vs. 22% of internet users); mainly female (70% female vs. 50%); lower income (65% of Turkers earn less than $60k/year vs. 45% of internet users); and have smaller families (55% of Turkers have no children vs. 40% of internet users)
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Get Another Label? Improving Data Quality and Data Mining Using Multiple, Noisy Labelers. Victor S. Sheng, Foster Provost, and Panagiotis G. Ipeirotis, New York University. KDD 2008
  • Slide 18
  • Imperfect labeling: Amazon Mechanical Turk, Games with a Purpose (GWAP). Repeated labeling: improve the supervised induction, increase the single-label accuracy, decrease the cost of acquiring training data
  • Slide 19
  • Increase single-label accuracy. Decrease the cost of training data: labeling is cheap (using MTurk or GWAP), while obtaining a new data sample might be expensive (taking new pictures, feature extraction)
  • Slide 20
  • How repeated labeling influences the quality of the labels, the accuracy of the model, and the cost of acquiring the data and the labels; also, how to select data points to label repeatedly
  • Slide 21
  • Uniform labeler quality: all labelers exhibit the same quality p, where p is the probability that a labeler labels correctly. For 2N+1 labelers, the label quality q is given by the formula reconstructed below. The slide plots the label quality for different settings of p.
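
A reconstruction of the majority-vote label quality, assuming 2N+1 independent labelers, each correct with probability p: q is the probability that more than half of them are correct,

    q = \sum_{i=N+1}^{2N+1} \binom{2N+1}{i} \, p^{i} (1-p)^{2N+1-i}

For p > 0.5 this quality increases toward 1 as N grows, and for p < 0.5 it decreases, which matches the behavior shown in the slide's plot.
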
  • Slide 22
  • Different labeler quality: repeated labeling is helpful in some cases. An example: three labelers with qualities p, p+d, and p-d. Repeated labeling is preferable to relying only on the single labeler with quality p+d when the setting falls in the blue region of the figure (a worked form of this condition is sketched below). No detailed analysis in the paper.
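
A worked reconstruction of that condition (my derivation, not taken from the slides): with three independent labelers of accuracies p+d, p, and p-d, the probability that the majority is correct is

    q_3 = (p+d)p + (p+d)(p-d) + p(p-d) - 2(p+d)p(p-d) = 3p^2 - d^2 - 2p^3 + 2pd^2

so majority voting over the three labelers beats the single best labeler exactly when q_3 > p + d, which traces out a region in the (p, d) plane like the blue region on the slide.
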
  • Slide 23
  • Majority voting (MV): simple and intuitive, but has the drawback that information about label uncertainty is lost. Uncertainty-preserving labeling: the Multiplied Examples (ME) procedure, which uses the label frequencies as the weights of the labels (a small sketch of both follows below).
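
A minimal sketch of the two label-integration strategies, assuming each example comes with a multiset of labels collected from workers; this is an illustration, not the authors' implementation:

    from collections import Counter

    def majority_vote(labels):
        # MV: collapse the repeated labels into the single most frequent one.
        # Drawback: the disagreement among labelers (uncertainty) is discarded.
        return Counter(labels).most_common(1)[0][0]

    def multiplied_examples(features, labels):
        # ME: keep every observed label, replicating the example with a weight
        # equal to that label's relative frequency, so uncertainty is preserved.
        counts = Counter(labels)
        return [(features, label, count / len(labels)) for label, count in counts.items()]

    labels = ["pos", "pos", "pos", "neg", "neg"]      # five workers disagree on one example
    print(majority_vote(labels))                      # -> 'pos'
    print(multiplied_examples({"x1": 0.7}, labels))   # -> weighted copies: ('pos', 0.6) and ('neg', 0.4)
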
  • Slide 24
  • Round-robin strategy: label the example that currently has the fewest labels, i.e., repeatedly label the examples in a fixed order
  • Slide 25
  • Definition of the costs. C_U: the cost of acquiring the unlabeled portion of an example; C_L: the cost of labeling. Single labeling (SL): acquiring a new training example costs C_U + C_L. Repeated labeling with majority vote (MV): getting another label for an existing example costs C_L. (A worked comparison follows below.)
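
A worked comparison under an illustrative cost ratio (the ratio is my assumption, not from the slides). Suppose C_U = 10 and C_L = 1:

    Cost of one new example under SL:        C_U + C_L = 10 + 1 = 11
    Cost of one additional label under MV:   C_L = 1

So for the price of one new single-labeled example, repeated labeling buys C_U/C_L + 1 = 11 extra labels for existing examples; the larger C_U is relative to C_L, the more attractive repeated labeling becomes.
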
  • Slide 26
  • Round-robin strategy, C_U