Task and Workflow Design I KSE 801 Uichin Lee


Page 1: Task and Workflow Design I KSE 801 Uichin Lee. TurKit: Human Computation Algorithms on Mechanical Turk Greg Little, Lydia B. Chilton, Rob Miller, and

Task and Workflow Design I

KSE 801, Uichin Lee


TurKit: Human Computation Algorithms on Mechanical Turk

Greg Little, Lydia B. Chilton, Rob Miller, and Max Goldman

(MIT CSAIL), UIST 2010


Workflow in M-Turk

[Figure: the requester posts HIT groups to Mechanical Turk; the data collected from the HITs goes into a CSV file and is exported for use]


Workflow: Pros & Cons

• Easy to run simple, parallelized tasks.

• Not so easy to run tasks in which turkers improve on or validate each other's work.

• TurKit to the rescue!


The TurKit Toolkit

• Arrows indicate the flow of information.

• Programmer writes 2 sets of source code:
  – HTML files for web servers
  – JavaScript executed by TurKit

• Output is retrieved via a JavaScript database.

[Figure: TurKit architecture. Components: Programmer, *.html files on the Web Server, *.js files run by TurKit, JavaScript Database, Mechanical Turk, Turkers]


Crash-and-rerun programming model

• Observation: local computation is cheap, but external calls (e.g., posting HITs to Mechanical Turk) cost money.

• Managing state over a long-running program is challenging.
  – Examples: What if the computer restarts? What if an error occurs?

• Solution: store state in the database (just in case).

• If an error happens, just crash the program and re-run it by following the history in the DB.
  – Throw a "crash" exception; the script is automatically re-run.

• New keyword "once":
  – Removes non-determinism
  – No need to re-execute an expensive operation on re-run

• But why should we re-run?
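The crash-and-rerun idea can be illustrated with a minimal sketch in plain JavaScript. This is not TurKit's implementation: `db`, `CrashSignal`, `makeOnce`, and `runUntilDone` are hypothetical names, and the "database" is just an in-memory array. The sketch only shows how recording each `once` result lets a re-run skip work that has already been paid for.

```javascript
// Hypothetical stand-in for TurKit's persistent database (in memory here).
const db = { history: [] };

// Thrown to abandon the current run; the driver below simply starts over.
class CrashSignal extends Error {}

// Build a once() bound to one run: replay recorded results, record new ones.
function makeOnce(db) {
  let step = 0; // index of the next once() call in this run
  return function once(fn) {
    if (step >= db.history.length) {
      db.history.push(fn()); // first execution: run and record (a crash records nothing)
    }
    return db.history[step++];
  };
}

// Re-run the whole script until it finishes without crashing.
function runUntilDone(script, db, maxRuns = 10) {
  for (let run = 1; run <= maxRuns; run++) {
    try {
      return { result: script(makeOnce(db)), runs: run };
    } catch (e) {
      if (!(e instanceof CrashSignal)) throw e; // real errors still surface
    }
  }
  throw new Error("script did not finish within maxRuns");
}

let expensiveCalls = 0;  // counts how often the costly step really executes
let hitFinished = false; // pretend a worker finishes between run 1 and run 2

function script(once) {
  const coin = once(() => {
    expensiveCalls++;
    return Math.random() < 0.5 ? "heads" : "tails"; // non-deterministic step
  });
  const answer = once(() => {
    if (!hitFinished) {
      hitFinished = true;
      throw new CrashSignal("result not available yet");
    }
    return "worker answer for " + coin;
  });
  return { coin, answer };
}

const outcome = runUntilDone(script, db);
console.log(outcome.runs, expensiveCalls, outcome.result);
```

In this sketch the script completes on its second run, the costly first step executes only once, and the random choice is identical across both runs because it is replayed from the recorded history.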


Example: quicksort
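A minimal sketch of how a human-powered quicksort might use `once`, in plain JavaScript. Everything here is an assumption for illustration: `makeOnce` is a hypothetical replay helper, and `askHumanIsLess` is a stand-in for a paid comparison HIT (it pretends workers order words by length). The point is that both the random pivot choice and each paid comparison are recorded, so a re-run repeats neither.

```javascript
// Hypothetical replay log: results of recorded steps, in execution order.
function makeOnce(history) {
  let step = 0;
  return function once(fn) {
    if (step >= history.length) history.push(fn()); // run and record on first execution
    return history[step++];                         // otherwise replay the recorded value
  };
}

// Stand-in for a paid human comparison (a real script would ask turkers).
let paidComparisons = 0;
function askHumanIsLess(a, b) {
  paidComparisons++;
  return a.length < b.length; // pretend workers order words by length
}

function quicksort(items, once) {
  if (items.length <= 1) return items.slice();
  const rest = items.slice();
  // The random pivot is non-deterministic, so its index is recorded with once().
  const pivotIndex = once(() => Math.floor(Math.random() * rest.length));
  const pivot = rest.splice(pivotIndex, 1)[0];
  const left = [];
  const right = [];
  for (const x of rest) {
    // Each comparison costs money, so its outcome is recorded too.
    (once(() => askHumanIsLess(x, pivot)) ? left : right).push(x);
  }
  return [...quicksort(left, once), pivot, ...quicksort(right, once)];
}

const history = [];
const words = ["kiwi", "fig", "banana", "apple", "watermelon"];
const firstRun = quicksort(words, makeOnce(history));
const paidAfterFirstRun = paidComparisons;
const secondRun = quicksort(words, makeOnce(history)); // replays the log, pays nothing
console.log(firstRun, secondRun, paidComparisons === paidAfterFirstRun);
```

Because the pivot indices are replayed, the second run follows exactly the same recursion as the first and issues no new comparisons.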


Parallelism

• The first time the script runs, HITs A and C are created.

• Within a forked branch, if a task fails (e.g., HIT A), TurKit crashes only that forked branch (and re-runs it later).

• Synchronization with join()
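The fork/join behavior can be sketched as follows. This is a simplified simulation, not TurKit's code: `memo`, `created`, `finished`, `createHITAndWait`, and `runOnce` are hypothetical stand-ins, and HIT B (which depends on A's result) is an assumed example added to show why branch A and branch C can make progress independently.

```javascript
class Crash extends Error {}

// Hypothetical stand-ins: a persistent memo table and a fake marketplace.
const memo = new Set();     // HITs already posted; survives across re-runs
const created = [];         // order in which HITs were posted
const finished = new Set(); // HITs whose results are available

function createHITAndWait(name, input = null) {
  if (!memo.has(name)) {    // post each HIT only once, even across re-runs
    memo.add(name);
    created.push(name);
  }
  if (!finished.has(name)) throw new Crash(name + " not finished");
  return "result of " + name;
}

// One execution of the script; returns null if it had to crash.
function runOnce(script) {
  let pending = false;
  const fork = (branch) => {
    try {
      branch();
    } catch (e) {
      if (!(e instanceof Crash)) throw e;
      pending = true; // only this branch stops; later forks still run
    }
  };
  const join = () => {
    if (pending) throw new Crash("waiting on a forked branch");
  };
  try {
    return script(fork, join);
  } catch (e) {
    if (e instanceof Crash) return null;
    throw e;
  }
}

function script(fork, join) {
  let b, c;
  fork(() => {
    const a = createHITAndWait("A");
    b = createHITAndWait("B", a); // B depends on A's result
  });
  fork(() => {
    c = createHITAndWait("C");
  });
  join(); // proceed only when every branch has finished
  return [b, c];
}

const run1 = runOnce(script);          // posts A and C, then crashes at join()
const createdAfterRun1 = created.slice();
finished.add("A");
finished.add("C");                     // pretend workers completed A and C
const run2 = runOnce(script);          // posts B, still waiting on it
finished.add("B");
const run3 = runOnce(script);          // all branches complete
console.log(createdAfterRun1, created, run3);
```

The first run posts A and C even though branch A crashes early, which is the benefit of forking: one waiting branch does not block the others from posting their HITs.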


MTurk Functions

• Prompt(message, # of people)
  – mturk.prompt("What is your favorite color?", 100)

• Voting(message, options)

• Sort(message, items)

[Figure: example VOTE() and SORT() tasks]
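To make the three functions concrete, here is a mock in plain JavaScript in which simulated workers replace real turkers. It is only a sketch: `makeMockMturk`, `workerChoice`, and the `voters` parameter are assumptions, and the method names merely mirror the Prompt/Voting/Sort functions listed on the slide rather than documenting the real library's signatures.

```javascript
// Mock "mturk" object: simulated workers replace real turkers (illustration only).
function makeMockMturk(workerChoice) {
  return {
    // prompt(message, n): collect n free-text answers.
    prompt(message, n) {
      return Array.from({ length: n }, (_, i) => workerChoice(message, i));
    },
    // vote(message, options): each simulated worker picks one option; the majority wins.
    vote(message, options, voters = 3) {
      const tally = new Map(options.map((o) => [o, 0]));
      for (let i = 0; i < voters; i++) {
        const pick = workerChoice(message, i, options);
        tally.set(pick, tally.get(pick) + 1);
      }
      return [...tally.entries()].sort((x, y) => y[1] - x[1])[0][0];
    },
    // sort(message, items): order items using pairwise votes as the comparator.
    sort(message, items) {
      return items
        .slice()
        .sort((a, b) => (this.vote(message, [a, b]) === a ? -1 : 1));
    },
  };
}

// Simulated worker: answers prompts with a color, and votes for the shortest option.
const colors = ["red", "green", "blue"];
const workerChoice = (message, workerIndex, options) =>
  options
    ? options.reduce((best, o) => (o.length < best.length ? o : best))
    : colors[workerIndex % colors.length];

const mturk = makeMockMturk(workerChoice);
const answers = mturk.prompt("What is your favorite color?", 4);
const winner = mturk.vote("Which word is shorter?", ["banana", "fig"]);
const ordered = mturk.sort("Which word is shorter?", ["banana", "fig", "kiwi"]);
console.log(answers, winner, ordered);
```

The design point worth noting is that sort is built on top of vote: a ranking task decomposes into many small comparison tasks, each of which a single worker can answer quickly.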


TurKit: Implementation

• TurKit: written in Java, using Rhino to interpret JavaScript code and E4X to handle XML results from MTurk

• IDE: hosted on Google App Engine (GAE)

[Figure: Online IDE]


Exploring Iterative and Parallel Human Computation Processes

Greg Little, Lydia B. Chilton, Max Goldman, Robert C. Miller

HCOMP 2010


HC Task Model

• Dimensions:
  – Dependent (iterative) or independent (parallel) tasks
  – Creation and decision tasks

• Task model examples
  – Creation tasks (creating new content): e.g., writing ideas, imagery solutions, etc.
  – Decision tasks (voting/rating): e.g., rating the quality of a description of an image


HC Task Model

• Combining tasks: iterative and parallel tasks

Iterative pattern: a sequence of creation tasks where the result of each task feeds into the next one, followed by a comparison task

Parallel pattern: a set of creation tasks executed in parallel, followed by a task of choosing the best
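The two patterns can be written as small higher-order functions. This is a sketch under stated assumptions: `iterative`, `parallel`, `addDetail`, and `hasMoreDetail` are hypothetical names, the simulated worker simply appends one detail to whatever it is shown, and "better" is crudely defined as "has more details". It illustrates the control flow only, not the quality of real worker output.

```javascript
// Iterative: each creation task sees the current best; a comparison task keeps the better one.
function iterative(seed, n, improve, isBetter) {
  let best = seed;
  for (let i = 0; i < n; i++) {
    const candidate = improve(best, i);
    if (isBetter(candidate, best)) best = candidate;
  }
  return best;
}

// Parallel: n independent creation tasks all start from the same seed, then the best is chosen.
function parallel(seed, n, create, isBetter) {
  const candidates = Array.from({ length: n }, (_, i) => create(seed, i));
  return candidates.reduce((best, c) => (isBetter(c, best) ? c : best));
}

// Simulated worker (assumption): adds one detail to whatever text it is shown.
const addDetail = (details, workerIndex) => [...details, "detail" + workerIndex];
// Simulated decision task (assumption): more details counts as better.
const hasMoreDetail = (a, b) => a.length > b.length;

const iterativeResult = iterative([], 3, addDetail, hasMoreDetail);
const parallelResult = parallel([], 3, addDetail, hasMoreDetail);
console.log(iterativeResult, parallelResult);
```

In the iterative call each simulated worker builds on the previous result, while in the parallel call every worker starts from the empty seed, so only the final choice step connects their contributions.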


Experiment: Writing Image Description

• Iterative vs. parallel: each condition uses 6 creation tasks ($0.02 each), followed by rating tasks (1-10 scale, $0.01 each)


Experiment: Writing Image Description

• Turkers in the iterative condition gave better descriptions; the parallel condition always shows turkers an empty text area.


Experiment: Writing Image Description

• Average rating after n iterations
  – After six iterations: 7.9 (iterative) vs. 7.4 (parallel); t-test, t(29) = 2.1, p = 0.04

[Figure: average rating vs. number of iterations; series: iterative, parallel]
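For readers who want to see how such a statistic is computed, here is a sketch of a paired t statistic in JavaScript. The slide does not say which variant of the t-test was used; a paired test is assumed here because 29 degrees of freedom would match 30 paired items. The function name `pairedT` and the four sample ratings are hypothetical and are not data from the study.

```javascript
// Paired t statistic: t = mean(d) / (sd(d) / sqrt(n)), with n - 1 degrees of freedom,
// where d holds the per-item differences between the two conditions.
function pairedT(xs, ys) {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error("need two equally long samples with at least 2 pairs");
  }
  const d = xs.map((x, i) => x - ys[i]);
  const n = d.length;
  const mean = d.reduce((s, v) => s + v, 0) / n;
  const variance = d.reduce((s, v) => s + (v - mean) ** 2, 0) / (n - 1); // sample variance
  return { t: mean / Math.sqrt(variance / n), df: n - 1 };
}

// Hypothetical ratings for four images (not data from the study).
const iterativeRatings = [8, 7, 9, 8];
const parallelRatings = [7, 7, 8, 6];
const stat = pairedT(iterativeRatings, parallelRatings);
console.log(stat.df, stat.t.toFixed(2));
```

Turning the statistic into a p-value additionally requires the t distribution with `df` degrees of freedom, which is omitted here.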


Experiment: Writing Image Description

• Length vs. rating: positive correlation

• The two outliers (circled) represent instances of text copied from the Internet (with superficial description)

[Figure: scatter plot of rating vs. length (characters)]


Experiment: Writing Image Description

• Work quality:
  – 31% mainly append content at the end, and make only minor modifications (if any) to existing content;
  – 27% modify/expand existing content, but it is evident that they use the provided description as a basis;
  – 17% seem to ignore the provided description entirely and start over;
  – 13% mostly trim or remove content;
  – 11% make very small changes (adding a word, fixing a misspelling, etc.);
  – 1% copy-paste superficially related content found on the internet.

• Creating vs. improving: both take about the same time (avg. 211 seconds)


Experiment: Brainstorming


Experiment: Brainstorming

• Iterative work: higher average rating
  – Biased thinking: e.g., tech -> xxtech -> yytech

• Parallel work: diversity, higher deviation (rating)
  – No iteration for brainstorming

[Figure: rating and average rating vs. iteration; series: iterative, parallel]


Example: Blurry Text Recognition


Example: Blurry Text Recognition

• Iterative performs better than parallel

[Figure: accuracy vs. iteration]


Summary

• TurKit: a flexible programming tool for Mechanical Turk

• Various workflows can be designed, e.g., iterative, parallel, and hybrid

• Iterative performs better than parallel in several cases (e.g., image description, brainstorming, text recognition)