Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities


Page 1: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

TRUSTWORTHY MICRO-TASK CROWDSOURCING: CHALLENGES AND OPPORTUNITIES

ALESSANDRO BOZZON - DELFT UNIVERSITY OF TECHNOLOGY

Page 2: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

ABOUT ME

Page 3: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

http://blog.electricbricks.com/?attachment_id=16621

About Me

Assistant Professor @TU Delft Web Information Systems, Social Data Science

Faculty Fellow @ IBM Benelux Inclusive Enterprise

Research Fellow @ AMS Social Sensing, Smart Citizens

Web Engineering

Web Science

Information Retrieval

User Modelling

Crowdsourcing

Human Computation

Page 4: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

Create, Interpret
Engage, Retain

Machines

Data

People

How can humans and machines better collaborate in solving (computational) problems?

Process

Describe Reality through People

Interact

Fundamental and Experimental Research

A socio-technical system

How can human-generated Web data be transformed into a source that informs Web system design?

How to enhance Web-based systems with automated, large-scale human computation?

Page 5: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

Application Domains

3 use-cases, spanning different contexts and societal/industrial needs

Workforce Well-being

enterprise crowdsourcing

Knowledge Creation and Acceleration

online content creation

Intelligent Cities

urban sensing / crowd sensing

Page 6: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

OUTLINE

▸ What is trust?

▸ Challenges

▸ Opportunities

Page 7: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

WHAT IS TRUST?

Page 8: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

WHAT IS TRUST TO YOU?

Page 9: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

THE ECONOMICS OF TRUST

↑ Trust = ↑ Speed, ↓ Cost

↓ Trust = ↓ Speed, ↑ Cost

Page 10: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

TRUST: I KNOW IT WHEN I SEE IT

▸ Trust is a complex social phenomenon

▸ There is no universally accepted scholarly definition

▸ What is meant by “trust” differs from discipline to discipline

▸ Has varying importance at different stages of relationship development

▸ “Confidence that [one] will find what is desired [from another] rather than what is feared”

Morton Deutsch (psychologist), considered the founder of modern theory and research on trust

Page 11: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

TRUST: ACCEPTED DEFINITIONS

▸ “Expectations of benign behaviour from someone in a socially uncertain situation due to beliefs about the person’s dispositions (including his feelings towards you)”

▸ “Psychological state comprising the intention to accept vulnerability based upon positive expectations of the intentions or behaviour of another”

Rousseau, D. M., Sitkin, S. B., Burt, R. S. and Camerer, C. (1998) ‘Not So Different After All: A Cross-discipline View of Trust’, Academy of Management Review 23(3): 393–404.

Yamagishi, T. Trust: The Evolutionary Game of Mind and Society. New York, NY: Springer.

Page 12: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

TRUSTWORTHINESS

▸ “The character trait of a trustee, that is, his or her disposition to act in an altruistic or ethical manner even when the action is not backed up by self-interest”

▸ Trustworthiness is a trait of a trustee, trust is a trait of a truster

Yamagishi, T. Trust: The Evolutionary Game of Mind and Society. New York, NY: Springer.

Page 13: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

▸ A speaker's ethos (being trustworthy) is based on:

▸ Professional competence, spirited personal integrity (aretê)

▸ Intelligent good sense, practical wisdom (phronêsis)

▸ Good will and respect (eúnoiâ)

MODELS OF TRUST /1 - ARISTOTLE’S RHETORIC

Page 14: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

MODELS OF TRUST /2

▸ Three characteristics determine the perceived trustworthiness:

▸ Ability: skills, competencies, and characteristics that enable a party to have influence within some specific domain

▸ Benevolence: the extent to which a trustee is believed to want to do good to the trustor, aside from an egocentric profit motive

▸ Integrity: involves the trustor's perception that the trustee adheres to a set of principles that the trustor finds acceptable

▸ Each of the three factors can vary along a continuum

Mayer, R. C., Davis, J. H. and Schoorman, F. D. (1995) ‘An Integrative Model of Organizational Trust’, The Academy of Management Review 20(3): 709–734.

14K Citations

FIGURE 1: Proposed Model of Trust. The factors of perceived trustworthiness (Ability, Benevolence, Integrity), together with the trustor’s Propensity, feed into Trust; Trust and Perceived Risk jointly shape risk taking in the relationship and its Outcomes.

measure focuses on a generalized trust of others, something akin to a personality trait that a person would presumably carry from one situation to another. For example, typical items in his scale are "In dealing with strangers one is better off to be cautious until they have provided evidence that they are trustworthy" and "Parents usually can be relied upon to keep their promises."

Several other authors have discussed trust in similar ways. For example, Dasgupta's treatment of trust includes generalized expectations of others; for example, "Can I trust people to come to my rescue if I am about to drown?" (1988: 53; emphasis added). Similarly, Farris, Senner, and Butterfield (1973: 145) defined trust as "a personality trait of people interacting with peripheral environment of an organization." In this approach trust is viewed as a trait that leads to a generalized expectation about the trustworthiness of others. In the proposed model this trait is referred to as the propensity to trust.

Propensity to trust is proposed to be a stable within-party factor that will affect the likelihood the party will trust. People differ in their inherent propensity to trust. Propensity might be thought of as the general willingness to trust others. Propensity will influence how much trust one has for a trustee prior to data on that particular party being available. People with different developmental experiences, personality types, and cultural backgrounds vary in their propensity to trust (e.g., Hofstede, 1980). An example of an extreme case of this is what is commonly called blind trust. Some individuals can be observed to repeatedly trust in situations that most people would agree do not warrant trust. Conversely, others are unwilling to trust in most situations, regardless of circumstances that would support doing so.
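The three-factor model above lends itself to a toy formalization. The sketch below is purely illustrative (the function name, the [0, 1] scale, and the weights are invented here, not part of Mayer et al.'s model): perceived trustworthiness as a weighted combination of ability, benevolence, and integrity, each varying along a continuum.

```python
def perceived_trustworthiness(ability: float, benevolence: float,
                              integrity: float,
                              weights=(0.4, 0.3, 0.3)) -> float:
    """Toy ABI score: each factor lies on a [0, 1] continuum and is
    combined with illustrative (hypothetical) weights."""
    for factor in (ability, benevolence, integrity):
        if not 0.0 <= factor <= 1.0:
            raise ValueError("each factor must lie on the [0, 1] continuum")
    w_a, w_b, w_i = weights
    return w_a * ability + w_b * benevolence + w_i * integrity
```

On this toy reading, a trustee who is highly able but believed to lack benevolence and integrity still scores low overall, which matches the model's claim that the three factors vary independently.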

Page 15: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

MODELS OF TRUST /2

▸ Risk is inherent in the behavioural manifestation of the willingness to be vulnerable

▸ To have trust, there is no need to risk anything; but trusting action implies taking a risk

▸ Trust is not involved in all risk-taking behaviour

▸ Context matters!

▸ The stakes involved

▸ The balance of power in the relationship

▸ The alternatives available to the trustor

Mayer, R. C., Davis, J. H. and Schoorman, F. D. (1995) ‘An Integrative Model of Organizational Trust’, The Academy of Management Review 20(3): 709–734.


Page 16: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

MODELS OF TRUST /3

▸ Trust can be split into three different components:

▸ Calculative: based on a rational choice (a risk-return calculation), as in Mayer et al.

▸ Institutional: confidence in regulatory factors (e.g. a legal system that protects individuals’ rights and property), which promote the creation of trust

▸ Relational: derived over time from repeated interactions between trustor and trustee; covers factors such as familiarity and experience with each other

Rousseau, D. M., Sitkin, S. B., Burt, R. S. and Camerer, C. (1998) ‘Not So Different After All: A Cross-discipline View of Trust’, Academy of Management Review 23(3): 393–404.

research in Hungary and the United States, Pearce and Brzenksky (in press) see a minimum level of institutional trust as a sine qua non for the emergence of interpersonal trust.

The possibility of a step function characterizing the role institutional trust plays in shaping interpersonal trust remains a subject for future research. Nonetheless, a variety of institutional factors, including legal forms, social networks, and societal norms regarding conflict management and cooperation, are likely to interact in creating a context for interpersonal and interorganizational trust.

The variations we observe in the bandwidth of trust across relationships suggest there may be a tension between acting out of self-interest (agency) and acting out of the interests of a broader collective (community). Sabel (1993) suggests that we need to move away from thinking of trust as rational self-interest toward a shared sense of community with a common fate. Beliefs can arise in communities that lead to avoidance of exploitation, where trusting others is a condition of membership. Such shared understandings between individuals or between firms can arise out of interactions and from shared or common knowledge.

Indeed, some societies have penalties for putting oneself ahead of community interests (Hearn, 1904, as cited by Hagen & Choe, this issue). High power distance societies, such as Japan, may build obligations into societal roles, creating codes of conduct that reinforce collective behavior through relational sanctions. In low power distance societies, such as North America, mechanisms that support repeated interactions, including stable employment, network ties, and laws protecting property rights of individuals and firms, may also enable trust (Nooteboom, Berger, & Noorderhaven, 1997). Our discussion of bandwidth suggests that institutional mechanisms can play a critical role in shaping the mix of trust and distrust that exists.

In Figure 1 we model the three basic forms of trust (calculative, relational, and institutional) with respect to the issue of bandwidth. (Note that we conclude that deterrence is not trust and exclude it from the model.)

The various forms trust can take, and the possibility that trust in a particular situation can mix several forms together, account for some of the apparent confusion among scholars. Conceptualizing trust in only one form in a given relationship risks missing the rich diversity of trust in organizational settings. Recognizing that, in a given relationship, trust has a bandwidth (which may exist to different degrees between the same parties, depending on the task or setting) introduces the idea that experiences over the life of a relationship may lead to pendulum swings. The interests of each party separately and their mutual concerns might be met to a limited degree at any single point in time, but to a large degree over the life of the relationship.

Is Trust in Transition? The form and context of organizations are in transition. We observe in society a move toward small-scale relations (Miles & Creed, 1995; Miles & Snow, 1992). In this era of more flexible forms

FIGURE 1: A Model of Trust (horizontal axis: developmental time, from early through middle to later)

Page 17: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

TRUST-WARRANTING PROPERTIES

▸ “Trust is produced through complex information processing rather than by simplification of information”

▸ Only systems that support the exchange of reliable trust cues - and thus allow for correct trust attribution - will be viable in the long run

▸ The goal is to encourage trustworthy action

▸ And – subsequently – trust

▸ Technology can help

▸ Transmit signals of trust prior to trusting action (e.g. a reputation score)

▸ Be the channel for trusting action (e.g. performed work)

▸ Be used for fulfillments (e.g. compensate work)

Yamagishi, T. Trust: The Evolutionary Game of Mind and Society. New York, NY: Springer.
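A concrete instance of the first kind of trust cue, a reputation score transmitted prior to trusting action, can be sketched in a few lines. The use of the Wilson score lower bound here is an assumption of this example (the slides do not prescribe any particular formula): it discounts approval rates backed by little evidence, so that 9 approvals out of 10 signal less than 900 out of 1000.

```python
from math import sqrt

def reputation_score(approved: int, total: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for a worker's approval rate.

    A confidence-adjusted trust signal: converges to the raw approval
    rate as evidence accumulates, stays conservative for small samples.
    """
    if total == 0:
        return 0.0
    p = approved / total
    denom = 1 + z * z / total
    centre = p + z * z / (2 * total)
    spread = z * sqrt((p * (1 - p) + z * z / (4 * total)) / total)
    return (centre - spread) / denom
```

A score like this is a "reliable trust cue" in Yamagishi's sense only to the extent that approvals themselves are honest, which is exactly the problem the rest of this deck discusses.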

Page 18: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

LACK OF TRUST IN MICRO-TASK CROWDSOURCING

Page 19: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

Vs.

THE COMMON NARRATIVE

Page 20: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

THIEVES VS. DICTATORS

THE COMMON NARRATIVE

▸ Workers are malicious

▸ Bad work = malicious workers

▸ Workers are people who fill in time to make some extra money

▸ No need to pay too much, quality != reward

▸ Spammer requesters

▸ Requesters are unfair

▸ Requesters are forgetful

▸ AMT does nothing, so we are invisible, and without power

Page 21: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

FROM THE REQUESTER’S POINT OF VIEW

2. MECHANICS OF TRUST

In this section we lay the foundation for a framework of trust in technology-mediated interactions. We start by introducing structural conditions that define risky exchanges. Our key concern is to identify (1) the factors that allow for the formation of trust, and (2) the incentives they provide for trustworthy behavior. This knowledge can be used to identify how new technologies, which enable new types of interactions or transform existing ones, can support trust.

2.1 The Basic Model

Trust is only required in situations that are characterized by risk and uncertainty. Only if something is at stake, and only if the outcome of a situation is uncertain, do we need to trust. Uncertainty, and thus the need for trust, stems from our lack of detailed knowledge about the other actor’s abilities and motivation (Deutsch, 1958). If we had accurate insight into their reasoning, trust would not be an issue (Giddens, 1990). We develop our framework from the sequential interaction between two actors (not always people): trustor (the trusting actor) and trustee (the trusted actor). Figure 1 shows a model of a prototypical trust-requiring situation.

Figure 1: The basic interaction between trustor and trustee.

We have two actors about to engage in an exchange. Both can realize some gain by conducting the exchange. The exchange might involve money, but it also applies to information, time, or other goods that have value to the actors. Our exchange may depict a one-off encounter, or a ‘snapshot’ of an established relationship consisting of many subsequent exchanges. Prior to the exchange, trustor and trustee perceive information about each other (1). If trustor and trustee are separated in space (i.e. if their interaction is mediated by technology), less information might be available; a factor that can increase uncertainty (Giddens, 1990). The trustor has to make the first move; she can only achieve a benefit by first engaging in some form of trusting action (2a). Goods that are risked by trusting action (2a) are not only financial, but also anything with utility to the trustor: time, personal information, or psychological gratification. Even the act of trusting itself can be seen as an investment, because misplaced trust can not only lead to the loss of the invested good, but also the psychological cost of having acted naively (Lahno, 2002a).

[Diagram: the trustor perceives signals from the trustee (1), under uncertainty added by their separation in time and space. The trustor chooses between trusting action (2a), putting something at risk, and withdrawal (2b) to an outside option; the trustee then chooses between fulfilment (3a) and defection (3b).]

Picture adapted from: The Mechanics of Trust: A Framework for Research and Design (2005)

THE TRUST CREATION PROCESS (REQUESTER AS TRUSTOR, WORKER AS TRUSTEE)

1. Reputation / qualification (calculative trust)

2a. Offer precious data to work with
2b. Nope, not here

3a. Execute task
3b. Abandon
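The calculative reading of this exchange can be compressed into a one-line decision rule. This is a deliberate simplification assumed for illustration (the framework above is richer): the trustor should prefer trusting action (2a) over withdrawal (2b) only when the expected gain from fulfilment outweighs the expected loss from defection.

```python
def prefer_trusting_action(p_fulfilment: float, gain: float, loss: float) -> bool:
    """Expected-value comparison of move 2a (trust) vs 2b (withdraw).

    p_fulfilment: the trustor's estimate of the probability that the
    trustee fulfils (3a) rather than defects (3b); gain and loss are
    the utilities at stake for the trustor.
    """
    expected_value_of_trusting = p_fulfilment * gain - (1 - p_fulfilment) * loss
    return expected_value_of_trusting > 0.0  # withdrawal normalized to utility 0
```

On this reading, the signals exchanged in step (1), such as a reputation score, matter precisely because they move the trustor's estimate of p_fulfilment.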

Page 22: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

AMT MARKETPLACE IS A HIGH-NOISE ENVIRONMENT WHERE LOW-QUALITY WORKERS LIKE SPAMMERS ARE PREVALENT

Wang, Ipeirotis, Provost. Managing Crowdsourcing Workers. 2011.
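One common family of defenses against such noise, sketched here for illustration rather than taken from Wang et al., seeds the task stream with gold questions whose answers are known, estimates each worker's accuracy on them, and excludes low-accuracy workers from the final majority vote. All names and the threshold below are hypothetical.

```python
from collections import Counter, defaultdict

def worker_accuracies(answers, gold):
    """Fraction of each worker's gold-question answers matching the known labels.

    answers: iterable of (worker, task, label) triples; gold: task -> label.
    """
    hits, seen = defaultdict(int), defaultdict(int)
    for worker, task, label in answers:
        if task in gold:
            seen[worker] += 1
            hits[worker] += int(label == gold[task])
    return {w: hits[w] / seen[w] for w in seen}

def filtered_majority(answers, gold, min_accuracy=0.7):
    """Majority vote per task, ignoring workers below the accuracy threshold."""
    accuracy = worker_accuracies(answers, gold)
    votes = defaultdict(Counter)
    for worker, task, label in answers:
        if accuracy.get(worker, 0.0) >= min_accuracy:
            votes[task][label] += 1
    return {task: counts.most_common(1)[0][0] for task, counts in votes.items()}
```

Note the tension with the rest of this deck: from the worker's side, hidden gold questions are exactly the kind of unaccountable rejection mechanism that erodes trust in the other direction.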

Page 23: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

http://turkrequesters.blogspot.nl/2013/01/the-reasons-why-amazon-mechanical-turk.html

Page 24: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

FROM THE WORKER’S POINT OF VIEW


THE TRUST CREATION PROCESS (WORKER AS TRUSTOR, REQUESTER AS TRUSTEE)

1. ????????

2a. Perform task
2b. Ignore

3a. Pay / bonus
3b. Don’t pay

INFORMATION ASYMMETRY / LACK OF TRANSPARENCY / POWER DIFFERENTIAL

Page 25: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

https://www.reddit.com/r/mturk/comments/1gkb1l/how_it_feels_sometimes_as_a_turker/?ref=share&ref_source=link

https://www.reddit.com/r/mturk/comments/1bni10/yesterday_when_i_was_trying_to_explain_turking_to/?ref=share&ref_source=link

Page 26: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

WE CAN BE REJECTED YET THE REQUESTERS STILL HAVE OUR ARTICLES AND SENTENCES. NOT FAIR

A worker, quoted in: Martin, D., Hanrahan, B. V., O’Neill, J., and Gupta, N. Being a Turker. In Proc. CSCW 2014.

Page 27: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

I WOULD LIKE TO SEE THE ABILITY TO RETURN A HIT AS DEFECTIVE SO IT DINGS THE REQUESTER’S REPUTATION AND NOT MINE. LET’S FACE IT, IF I’M SUPPOSED TO FIND AN ITEM FOR SALE ON AMAZON BUT THEY SHOW ME A CHILD’S CRAYON DRAWING...THERE REALLY NEEDS TO BE A WAY TO HANDLE THAT WITHOUT IT ALTERING MY NUMBERS

A worker, quoted in: M. Six Silberman, Lilly Irani, and Joel Ross. 2010. Ethics and tactics of professional crowdwork. XRDS 17, 2 (December 2010).

Page 28: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

WHAT ABOUT INSTITUTIONAL TRUST?

Page 29: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

“I DON’T CARE ABOUT THE PENNY I DIDN’T EARN FOR KNOWING THE DIFFERENCE BETWEEN AN APPLE AND A GIRAFFE. I’M ANGRY THAT AMT WILL TAKE REQUESTERS’ MONEY BUT NOT MANAGE, OVERSEE, OR MEDIATE THE PROBLEMS AND INJUSTICES ON THEIR SITE.”

A worker, quoted in: M. Six Silberman, Lilly Irani, and Joel Ross. 2010. Ethics and tactics of professional crowdwork. XRDS 17, 2 (December 2010).

Page 30: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

IS THIS A RELEVANT DISCUSSION? YEAH! - TO SOME, IT’S A JOB; TO OTHERS, A 2ND SOURCE OF INCOME

Page 31: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

IS THIS A RELEVANT DISCUSSION? YEAH! - YOU PAY TAXES

Page 32: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

IS THIS A RELEVANT DISCUSSION? YEAH! - GOOD REQUESTERS NEED GOOD WORKERS

Page 33: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

PLANSOURCING: GENERATING BEHAVIOR CHANGE PLANS WITH FRIENDS AND CROWDS

CSCW2016 - E. AGAPIE, L. COLUSSO, S. A. MUNSON, G. HSIEH


▸ IDEA: Seek help from strangers in online task markets (Upwork and Amazon Mechanical Turk) to suggest a personalised behaviour change plan

▸ There is a social cost in asking for help (e.g. worrying about being judged)

▸ Participants primarily expressed concerns with regard to friends

▸ More comfortable sharing information with strangers

▸ More diverse recommendations from strangers than friends

Each worker created one plan for one participant. Planners were given the participant’s description, goal, and activity log and asked to create a one-week plan to help the person exercise more, eat healthier, or save more money. The planners had three days to create the plan. Workers on Mechanical Turk and oDesk were limited to working at most two hours on the plan. The planners were provided with a similar structure as the activity logs (Figure 2) in which they could create their plan. Other than that, we provided no other constraints, so planners could flexibly structure and present their plans.

A sample of an activity log and instructions provided to the planner are available at the following github link: https://github.com/eagapie/PlanSourcing-Generating-Behavior-Change-Plans-with-Friends-and-Crowds.

Recruitment and Participants

Through Craigslist and a university mailing list, we recruited participants interested in increasing their physical activity, eating healthier, or saving money. Participants were screened using a survey to include only participants who were (1) not actively working towards their chosen behavior and (2) were considering or planning to change their behaviors (in the contemplative or planning stage of the transtheoretical model for behavior change [20]). Further, participants had to be willing to contact up to three friends to help them create a one-week long behavior change plan.

79 people completed the screener survey. Of these, we enrolled 63 participants in the study. 41 participants did not complete the study, either by not filling in their activity logs or not contacting a friend. 22 participants completed all the steps of the study. Out of the 22 participants, 8 had exercise goals, 8 had diet goals, and 6 had financial goals. Their ages ranged from 19 to 45 (mean=28 years), 17 were female and 5 male (Appendix 1).

Participants prepared a week-long activity log, recruited a friend, received plans prepared by three people (the recruited friend and two crowdworkers, discussed below), and completed a post-study survey and interview. They were compensated with Amazon gift cards: $20 for logging activity for a week and contacting a friend and $40 for the final interview.

Planners

For these 22 participants, we recruited 66 planners, including friends and workers on Mechanical Turk and oDesk. 60 of these planners completed a follow-up survey (discussed in the next sections). Out of those, 19 were male, 40 female, and one identified as other. There were more females than males among oDesk planners: two males and 19 females. oDesk planners were recruited from the Personal Assistants role on oDesk. We chose this category because it included some oDesk workers with expertise in exercise, finance, or nutrition. However, these categories were not clearly delimited, so we recruited Personal Assistants more broadly.

The friend planners were recruited by the participants, and had known the participants for an average of 13 years, with a range of 1 to 35 years. Most friend planners were close to the participants: family members, spouses, siblings, other close friends. Only one friend planner was recruited from the broader network of Facebook friends of the participant. Each planner reported talking with the participants at least a couple of times per week. Friends were compensated by entering a lottery for one $25 Amazon gift card per every 10 participants. Planners on Mechanical Turk were compensated with $5 per task. Workers from oDesk bid for the task, varying between $3 and $33. The workers who received the lowest hourly wages had no reputation and said they wanted to perform work at low-cost while building their reputation. All crowdworkers were selected from the US.

Planner and participant surveys and interviews

We explored the participants’ assessment of plan quality through a survey, which included quantitative measures of

Figure 1. Study Structure

Figure 2. Example Template that participants used for creating a food plan


Page 34: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

ISSUES ARE WELL-KNOWN

1. Uncertainty about payment

2. Unaccountable and seemingly arbitrary rejections

3. The apparent prevalence of fraudulent requesters

4. Prohibitive time limits

5. Long pay delays

6. Uncommunicative requesters and administrators

7. Technologically defective HITs

8. HITs with unclear or inadequate instructions

9. Low pay, arbitrariness of bonuses

10. Data privacy

M. Six Silberman, Joel Ross, Lilly Irani, and Bill Tomlinson. 2010. Sellers' problems in human computation markets. ACM SIGKDD Workshop on Human Computation (HCOMP '10)

2010!

Page 35: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

ISSUES ARE WELL-KNOWN / 2

▸ “MTurk is not a game or a social network, it is an unregulated labor marketplace: a system which deliberately does not pay fair wages, does not pay due taxes, and provides no protections for workers.“

▸ Many of the challenges posed by this kind of work can be attributed to the anonymity of all parties, the unchecked authority of the requester to decide payment terms, and the general imbalance of information

Karën Fort, Gilles Adda, and K. Bretonnel Cohen. 2011. Amazon mechanical turk: Gold mine or coal mine?. Comput. Linguist. 37, 2 (June 2011), 413-420.

Benjamin B. Bederson and Alexander J. Quinn. 2011. Web workers unite! addressing challenges of online laborers. In CHI '11 Extended Abstracts on Human Factors in Computing Systems (CHI EA '11)

2011!

Page 36: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

▸ Workers are not happy with the wage (and treatment) they receive

▸ Information, opportunity, and choice are all rather limited

▸ On AMT choice and opportunity are largely determined by:

▸ experience

▸ ratings

▸ skills and qualifications

▸ information

Being a Turker

David Martin, Benjamin V. Hanrahan, Jacki O’Neill (Xerox Research Centre Europe, 6 chemin de Maupertuis, Grenoble, France; {david.martin, ben.hanrahan, jacki.oneill}@xrce.xerox.com)

Neha Gupta (University of Nottingham, University Park, NG7 2TD Nottingham; [email protected])

ABSTRACT

We conducted an ethnomethodological analysis of publicly available content on Turker Nation, a general forum for Amazon Mechanical Turk (AMT) users. Using forum data we provide novel depth and detail on how the Turker Nation members operate as economic actors, working out which Requesters and jobs are worthwhile to them. We show some of the key ways Turker Nation functions as a community and also look further into Turker-Requester relationships from the Turker perspective – considering practical, emotional and moral aspects. Finally, following Star and Strauss [25] we analyse Turking as a form of invisible work. We do this to illustrate practical and ethical issues relating to working with Turkers and AMT, and to promote design directions to support Turkers and their relationships with Requesters.

Author Keywords: Ethnomethodology; content analysis; crowdsourcing; microtasking; Amazon Mechanical Turk; Turker Nation.

ACM Classification Keywords: H.5.3 Group and Organizational Interfaces – Computer-Supported Cooperative Work

General Terms: Human Factors

INTRODUCTION

The concept of crowdsourcing was originally defined by Jeff Howe of Wired Magazine as “the act of a company or institution taking a function once performed by employees and outsourcing it to an undefined (and generally large) network of people in the form of an open call.” [8] This ‘undefined network of people’ is the key topic of this article. We present the findings of an ethnomethodological analysis of posts and threads on a crowdsourcing forum called Turker Nation¹. We have sought to understand members of the crowd – their reasoning practices, concerns, and relationships with requesters and each other – as they are shown in their posts on the forum. We seek to present them as faithfully as possible, in their own words, in order to provide more definition to this network of people. We

1 http://turkernation.com/forum.php

believe that this will be beneficial for researchers and businesses working within the crowdsourcing space.

Crowdsourcing encompasses multiple types of activity: invention, project work, creative activities, and microtasking. This latter is our focus here. The most well-known microtask platform is Amazon Mechanical Turk (AMT)2, and the Turker Nation forum that we studied is dedicated to users of this platform. The basic philosophy of microtasking and AMT is to delegate tasks that are difficult for computers to do to a human workforce. This has been termed ‘artificial artificial intelligence’. Tasks like image tagging, duplicate recognition, translation, transcription, object classification, and content generation are common. ‘Requesters’ (the AMT term for people who have work to be completed) post multiple, similar jobs as Human Intelligence Tasks (HITs), which can then be taken up by registered ‘Turkers’. Turkers (termed ‘Providers’ by AMT) are the users completing the HITs, which typically take seconds or minutes paid at a few cents at a time.

For Amazon, the innovative idea was to have an efficient and cost effective way to curate and manage the quality of content on their vast databases (weeding out duplicates, vulgar content, etc.). While Amazon is still a big Requester, AMT has been deployed as a platform and connects a wide variety of Requesters with up to 500,000 Providers. However, Fort et al. [6] have performed an analysis on the available data and suggest that the real number of active Turkers is between 15,059 and 42,912, and that 80% of the tasks are carried out by the 20% most active (3,011–8,582) Turkers. While these numbers are useful, the research community still has little deep qualitative knowledge about this workforce. Questions remain unanswered such as: how and what do they look for in jobs; what are their concerns; and how do they relate to Requesters?

LITERATURE REVIEW To date much of the research on AMT takes the employers’ perspective, e.g. [14, 15, 17, 18], and this has in turn been highlighted [6, 16]. Silberman et al. [23] note that this mainstream research looks at how: “[to] motivate better, cheaper and faster worker performance […] to get good data from workers, quickly and without paying much.” When it comes to the Turkers themselves, research is more limited,

2 http://www.mturk.com

CSCW '14, February 15–19, 2014, Baltimore, Maryland, USA. Copyright © 2014 ACM. http://dx.doi.org/10.1145/2531602.2531663

Page 37: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

WORKERS, UNITE!

▸ Workers organised in communities, outside AMT

▸ TurkerNation, mTurkForum, etc

▸ Workers maintain and repair AMT, to help it work as intended

▸ They help resolve breakdowns

▸ They advise employers about flawed task designs or bugs

▸ They teach each other how to use tools

▸ They also help each other

▸ Suggesting good HITs

▸ Instructing newcomers

▸ Discussing requesters

B.B. Bederson and A.J. Quinn. Web Workers Unite! Addressing Challenges of Online Laborers. CHI '11 Extended Abstracts on Human Factors in Computing Systems, 97-106.

Page 38: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

CHI 2015: N. SALEHI, L.C. IRANI, M.S. BERNSTEIN, A. ALKHATIB, E. OGBE, K. MILLAND

WE ARE DYNAMO: OVERCOMING STALLING AND FRICTION IN COLLECTIVE ACTION FOR CROWD WORKERS

▸ A platform to support the Mechanical Turk community in forming publics around issues and then mobilizing

▸ GOALS: reduce stalling and friction

Page 39: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

THE PROBLEM WITH ONLINE COMMUNITIES

▸ Efforts that require sustained commitment and critical mass are less likely to succeed

▸ Divided loyalties

▸ Time pressures to earn money

▸ Risks that agitation poses to their reputations

▸ Disputes over members suspected of operating multiple accounts

▸ Statements taken as insults

▸ The forums archive these interactions and reconciliation can be difficult online

▸ Members of online-only communities may struggle to achieve trust

Dahlberg, L. Computer Mediated Communication and the Public Sphere: A Critical Analysis. Journal of Computer-Mediated Communication, 2001.

Niloufar Salehi, Lilly C. Irani, Michael S. Bernstein, Ali Alkhatib, Eva Ogbe, Kristy Milland, and Clickhappier. 2015. We Are Dynamo: Overcoming Stalling and Friction in Collective Action for Crowd Workers. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI '15)

Page 40: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

A SUCCESS STORY: TURKOPTICON

▸ Measuring benevolence and integrity of requesters

▸ communicativity: How responsive has this requester been to communications or concerns you have raised?

▸ generosity: How well has this requester paid for the amount of time their HITs take?

▸ fairness: How fair has this requester been in approving or rejecting your work?

▸ promptness: How promptly has this requester approved your work and paid?

Lilly Irani and M. Six Silberman. 2013. Turkopticon: Interrupting Worker Invisibility on Amazon Mechanical Turk. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2013), 611-620.
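Reputation signals of this kind are straightforward to aggregate. A minimal sketch (not Turkopticon's actual implementation; the review data and variable names are illustrative) that averages worker reviews per requester and per attribute:

```python
from collections import defaultdict

# Hypothetical worker reviews: (requester_id, attribute, rating on a 1-5 scale).
reviews = [
    ("req_A", "fairness", 5), ("req_A", "fairness", 4),
    ("req_A", "generosity", 2), ("req_A", "promptness", 4),
    ("req_B", "fairness", 1), ("req_B", "communicativity", 2),
]

# Group ratings per requester and attribute.
grouped = defaultdict(lambda: defaultdict(list))
for requester, attribute, rating in reviews:
    grouped[requester][attribute].append(rating)

# Average each attribute, keeping the four dimensions separate so workers
# can see, e.g., a requester who pays well but rejects unfairly.
scores = {
    requester: {attr: sum(r) / len(r) for attr, r in attrs.items()}
    for requester, attrs in grouped.items()
}
print(scores["req_A"]["fairness"])
```

Keeping the attributes separate, rather than collapsing them into one score, is what lets workers reason about benevolence and integrity independently.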

Page 41: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

IMPROVING THE EXCHANGE OF TRUST CUES

Page 42: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

FROM VISION PAPER

[Diagram: the requester publishes a task to the platform; the worker executes it; the worker community participates and then assesses and discusses. Key elements: task modeling, worker modeling, the interaction cycle, and security & privacy.]

Page 43: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

THE ROLE OF SOFTWARE

▸ Workers rely on a number of Turking tools

▸ A list of all requesters

▸ Script to record worker history

▸ Client-side scripts to hide HITs posted by particular requesters

▸ Scripts to monitor the market status

▸ Shouldn’t these tools be made “official”?

▸ To cope with the lack of institutional trust?

Page 44: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

MINIMISE REQUESTERS’ ERRORS

▸ (Semi-)automated task analysis, to check and assess tasks before submission

▸ To check for clarity of instructions

▸ To assess complexity
▸ Reward accordingly
▸ Establish a fair completion time

▸ To improve verifiability and transparency — Kittur et al. (2008)

▸ Making submissions easy to verify facilitates quality management
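Deriving a fair reward from an assessed completion time can be a back-of-the-envelope calculation, as in the sketch below (the pilot times and target wage are assumed, illustrative values):

```python
# Hypothetical pilot run: observed completion times (seconds) for one task.
pilot_times_sec = [95, 110, 80, 130, 105]

# Target hourly wage the requester commits to (USD/hour) - an assumption.
target_hourly_wage = 9.0

# Use the median completion time as the "fair completion time" estimate,
# so a few very slow or very fast pilot workers do not skew the reward.
median_time = sorted(pilot_times_sec)[len(pilot_times_sec) // 2]

# Per-task reward that pays the target wage for the median completion time.
reward_usd = target_hourly_wage * median_time / 3600
print(f"fair completion time: {median_time}s, suggested reward: ${reward_usd:.2f}")
```

A platform-side check of this kind could flag tasks whose offered reward falls well below the wage implied by pilot completion times.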

Page 45: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

MEASURING CROWDSOURCING EFFORT WITH ERROR-TIME CURVES

▸ Crowdsourcing systems lack effective measures of the effort required to complete each task

▸ Objective measures could help task selection and reward estimation

▸ The error-time area (ETA) models the effort required for a worker to accurately complete a task:
1. Collect data by recruiting workers to complete the task under different time limits
2. Fit an error-time curve
3. Calculate the ETA (the area under the curve)

CHI 2015: JUSTIN CHENG, JAIME TEEVAN, AND MICHAEL S. BERNSTEIN
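The three steps can be sketched as follows (a simplified illustration, not Cheng et al.'s exact procedure; the pilot data and the exponential curve shape are assumptions):

```python
import numpy as np

# Step 1 (hypothetical pilot data): error rate observed when workers must
# complete the task within a given time limit (seconds).
time_limits = np.array([2.0, 5.0, 10.0, 20.0, 40.0])
error_rates = np.array([0.80, 0.55, 0.30, 0.12, 0.05])

# Step 2: fit an error-time curve; here error(t) = exp(b + s*t), fitted by
# linear regression on log(error), so s should come out negative (errors
# decay as workers get more time).
s, b = np.polyfit(time_limits, np.log(error_rates), 1)

# Step 3: the ETA is the area under the fitted curve; integrate the
# exponential analytically over a fixed horizon (60 seconds).
horizon = 60.0
eta = (np.exp(b + s * horizon) - np.exp(b)) / s
print(f"fitted decay rate: {-s:.3f}/s, ETA: {eta:.1f}")
```

A larger ETA means the error stays high for longer, i.e. the task demands more effort, which makes ETA usable as an objective input to reward estimation.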


Page 46: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

UIST ’15: STANFORD CROWD RESEARCH COLLECTIVE

DAEMO: A SELF-GOVERNED CROWDSOURCING MARKETPLACE

▸ open-governance model to achieve equitable representation

▸ Prototype tasks to improve work quality

▸ Milestones are used to build common ground and adjust the task description

▸ They also facilitate discussing the cost and time to do a job

Stanford Crowd Research Collective. Daemo: A Self-Governed Crowdsourcing Marketplace. UIST ’15 Adjunct, November 8–11, 2015, Charlotte, NC, USA. http://dx.doi.org/10.1145/2815585.2815739

ABSTRACT: Crowdsourcing marketplaces provide opportunities for autonomous and collaborative professional work as well as social engagement. However, in these marketplaces, workers feel disrespected due to unreasonable rejections and low payments, whereas requesters do not trust the results they receive. The lack of trust and uneven distribution of power among workers and requesters have raised serious concerns about the sustainability of these marketplaces. To address the challenges of trust and power, this paper introduces Daemo, a self-governed crowdsourcing marketplace. We propose a prototype task to improve the work quality and an open-governance model to achieve equitable representation. We envisage Daemo will enable workers to build sustainable careers and provide requesters with timely, quality labor for their businesses.

[Figure 1: Task creation workflow for a requester: prototype task creation, initial submissions review, and hiring high-quality workers for future milestones. https://daemo.stanford.edu]

Page 47: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

ASIST 2011: JÖRN KLINGER, MATTHEW LEASE

ENABLING TRUST IN CROWD LABOR RELATIONS THROUGH IDENTITY SHARING

▸ Link Crowdsourcing IDs to social network profiles

▸ No disclosing of sensitive data

▸ Reduced the anonymity in the Crowdmarket

▸ Accountability for malicious work

[Embedded excerpt, summarized: the prototype walkthrough shows a user authorizing a Facebook application and linking their profile to an AMT worker or requester ID. The application reads gender, birthday, current location, hometown and languages spoken from the profile; the link is confirmed via an activation code sent to the email address associated with the worker ID. Requesters can then group workers with specific skills or backgrounds into “panels” and advertise jobs to them, while workers see a list of jobs offered to them based on their skills and background, including a link to the requester’s (corporate) Facebook profile so they can investigate the requester before taking a job. The authors hypothesize that identity sharing will significantly increase trust, decrease spam and fraud, and improve cost-efficiency through better matching of workers to tasks; a potential concern is that reduced anonymity could expose workers’ private online personas, to be examined in a large-scale controlled user study.]

Page 48: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

▸ IDEA: Increasing social transparency to enhance accountability

▸ Sharing of demographic information

▸ Peer-dependent reward schemes (team and competition - pairs )

▸ Social transparency led to better results when workers’ information was shared with their colleagues

▸ No difference between peer-dependent reward schemes

▸ Increasing social transparency raises privacy concerns

CHI 2013: SHIH-WEN HUANG AND WAI-TAT FU

DON’T HIDE IN THE CROWD! INCREASING SOCIAL TRANSPARENCY BETWEEN PEER WORKERS IMPROVES CROWDSOURCING OUTCOMES

Shih-Wen Huang and Wai-Tat Fu (University of Illinois at Urbana-Champaign). Don’t Hide in the Crowd! Increasing Social Transparency Between Peer Workers Improves Crowdsourcing Outcomes. CHI 2013, April 27–May 2, 2013, Paris, France.

ABSTRACT: This paper studied how social transparency and different peer-dependent reward schemes (i.e., individual, teamwork, and competition) affect the outcomes of crowdsourcing. The results showed that when social transparency was increased by asking otherwise anonymous workers to share their demographic information (e.g., name, nationality) with the paired worker, they performed significantly better. A more detailed analysis showed that in a teamwork reward scheme, in which the reward of the paired workers depended only on the collective outcomes, increasing social transparency could offset effects of social loafing by making workers more accountable to their teammates. In a competition reward scheme, in which workers competed against each other and the reward depended on how much they outperformed their opponent, increasing social transparency could augment effects of social facilitation by providing more incentives for them to outperform their opponent. The results suggested that a careful combination of methods that increase social transparency and different reward schemes can significantly improve crowdsourcing outcomes.

[Figure 1: Sharing demographic information between paired workers allows the crowdsourcing system to collect outcomes with higher quality.]

Page 49: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

CSCW 2013: P. KINNAIRD, L. DABBISH, S. KIESLER, H. FASTE

CO-WORKER TRANSPARENCY IN A MICROTASK MARKETPLACE

▸ IDEA: show an image displaying whether or not there are coworkers

▸ Co-worker information can significantly influence worker motivation and work quality.

▸ When workers learn they are one of many, they report reduced feelings of task significance and work importance

[Embedded excerpt, summarized: the effect of co-worker information on worker motivation must be carefully tested. Learning of co-workers may make workers feel part of a social entity, but it could backfire if workers feel their work is redundant and a small part of a large, impersonal process; research on group size suggests that large numbers of co-workers encourage diffusion of responsibility and reduce effort. The authors conducted two between-subjects experiments on Mechanical Turk, hypothesizing that (H1) workers informed they have a small number of co-workers perform better quality work than those not informed about the existence of co-workers, and (H2) better than those informed they have more co-workers.]

Page 50: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

PICK-A-CROWD: TELL ME WHAT YOU LIKE, AND I’LL TELL YOU WHAT TO DO

▸ Worker profiles (abilities?) are built from social networking profiles

▸ Workers are assigned tasks based on their interests

▸ Assumption: workers will perform these tasks given their existing interests

▸ On average, a 29% relative improvement over the best accuracy obtained by the AMT model

[Embedded excerpt, summarized: the Pick-A-Crowd architecture takes task descriptions, input data, and a monetary budget as input; it generates HITs, estimates their difficulty, and suggests a fair reward based on the skills of the crowd. HITs are then pushed to selected workers, and the results are collected, aggregated, and returned to the requester. Task difficulty can be assessed by comparing the textual description of the task with workers’ skill descriptions, or by extracting entities from the task description, disambiguating them to LOD entities, linking the Facebook pages workers like to the same entity space, and comparing the two sets: a task is classified as difficult when its entities differ heavily from the entities the crowd likes.]

WWW ’13: D.E. DIFALLAH, G. DEMARTINI, P. CUDRÉ-MAUROUX.
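The entity-based matching described above can be sketched as a simple set-overlap ranking (an illustration, not the Pick-A-Crowd implementation; the workers and entities are made up):

```python
# Entities extracted from a task description (e.g., after LOD disambiguation).
task_entities = {"Apple", "Steve Jobs", "iPhone"}

# Hypothetical worker interest profiles, built by mapping the Facebook pages
# each worker likes onto the same entity space.
worker_interests = {
    "w1": {"Android", "Google", "Linux"},
    "w2": {"Apple", "iPhone", "Mac", "Microsoft"},
    "w3": {"Apple", "Steve Jobs", "iPhone", "Pixar"},
}

def jaccard(a: set, b: set) -> float:
    """Overlap between two entity sets (1.0 = identical, 0.0 = disjoint)."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Rank workers by how well their interests match the task's entities;
# the task is routed to the top-ranked workers.
ranking = sorted(worker_interests,
                 key=lambda w: jaccard(task_entities, worker_interests[w]),
                 reverse=True)
print(ranking)
```

The same overlap score, averaged over the whole crowd, can also stand in for task difficulty: a task whose entities few workers like is likely hard for that crowd.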

Page 51: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

BEYOND SIMPLE ABILITY MEASURES

▸ Trust cues are currently built on a “result-centred” interpretation of reliability

▸ Modelling abilities (expertise, skills) can help, but is currently not fully exploited

▸ What about benevolence, integrity, and intent of workers?

Page 52: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

BEYOND MONETARY REWARDS

▸ Non-cash rewards are known to trigger emotional responses

https://www.reddit.com/r/mturk/comments/4dogxk/been_turking_for_2_weeks_recently_hit_1000
https://www.reddit.com/r/mturk/comments/2sy6vc/woke_up_checked_my_email_and_holy_f***

Carter, Travis J. and Gilovich, Thomas. The Relative Relativity of Material and Experiential Purchases. Journal of Personality and Social Psychology, 98(1), Jan 2010, 146-159.

WHY COULDN’T AMT DO THE SAME?

Page 53: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

THE ROLE OF COMMUNITIES

▸ On-line worker communities are now “isolated” from crowdsourcing platforms

▸ Integrate knowledge from communities inside the platform

▸ Beyond requester assessment

▸ To express relational trust in more detail

Page 54: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

THE BEST TIME TO PLANT A SEED IS 20 YEARS AGO. THE SECOND BEST TIME IS TODAY

CHINESE PROVERB

Page 55: Trustworthy Micro-task Crowdsourcing: Challenges and Opportunities

Deadline approaching (June 5, 2016)
Keynotes: Maarten de Rijke (Naturally Intelligent Search), Philippe Cudre-Mauroux (Entity-Centric Data Management)
Special issue!