


Data Quality Guided Incentive Mechanism Design for Crowdsensing

Dan Peng, Fan Wu, Member, IEEE, and Guihai Chen, Senior Member, IEEE

Abstract—In crowdsensing, appropriate rewards are expected to compensate participants for their consumption of physical resources and investment of manual effort. Although sustained low quality sensing data can harm the availability and preciseness of crowdsensing based services, few existing incentive mechanisms have addressed the issue of data quality. The design of a quality based incentive mechanism is motivated by its potential to avoid inefficient sensing and unnecessary rewards. In this paper, we incorporate the consideration of data quality into the design of an incentive mechanism for crowdsensing, and propose to pay the participants according to how well they perform, to motivate rational participants to perform crowdsensing tasks efficiently. This mechanism estimates the quality of sensing data, and offers each participant a reward based on her effective contribution. We also implement the mechanism and evaluate its improvement in terms of quality of service and profit of the service provider. The evaluation results show that our mechanism achieves superior performance compared to the general data collection model and the uniform pricing scheme.

Index Terms—Crowdsensing, incentive mechanism, quality estimation, maximum likelihood estimation, information theory


1 INTRODUCTION

CROWDSENSING is a new paradigm of applications that enables ubiquitous mobile devices with enhanced sensing capabilities to collect and share local information toward a common goal [1], [2]. In recent years, a wide variety of applications have been developed to realize the potential of crowdsensing throughout everyday life, such as environmental quality monitoring [3], [4], noise pollution assessment [5], [6], road and traffic condition monitoring [7], [8], bus arrival time prediction [9], [10], road-side parking statistics [11], [12], and indoor localization [13], [14]. However, the success of crowdsensing based services critically depends on sufficient and reliable data contributions from individual participants.

Sensing, processing, and transmitting data in crowdsensing applications require manual effort and physical resources. Therefore, appropriate rewards are expected to compensate the owners of task-taking mobile devices. These owners, referred to as participants in the crowdsensing literature, are commonly assumed to be rational, and will not take sensing tasks and make contributions unless there are sufficient incentives. Although researchers have proposed a number of incentive mechanisms for participation in crowdsensing [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], they have not fully exploited the connection between the quality of sensing data and the rewards for contributions.

Sensing data of high quality, based on which the crowdsensing service provider aggregates and extracts information for accurate decision making and attentive service provision, is fundamentally important. In crowdsensing, data quality can be affected by the difficulty of sensing tasks, the characteristics of mobile sensors, the clarity of task instructions, as well as the expertise and willingness of individual participants [30], [31]. Particularly, participants with different spatial-temporal contexts and personal effort levels are likely to submit sensing data of diverse quality. Furthermore, rational participants tend to strategically minimize their effort while performing sensing tasks, which may degrade the quality of sensing data.

For example, careless or indifferent submissions are common in crowdsensing based noise monitoring applications. When asked to record the ambient sound of a neighborhood, a participant may perform the sensing tasks with the mobile device left inside her pocket, rather than carefully taking it out to sense accurately. Such a low quality submission would invalidate the estimation of noise pollution. Examples can also be found in crowdsourcing based services: a recent case study on crowdsourcing spam attacks [32] shows that crowd services can be maliciously manipulated.

Sustained low quality sensing data undoubtedly harms the availability and preciseness of crowdsensing based services, so quality control should be an important concern in crowd services. However, to the best of our knowledge, few existing works have taken data quality into consideration when designing incentive mechanisms for crowdsensing to guarantee the quality of crowdsensing based services. It is very challenging to

The authors are with the Department of Computer Science and Engineering, Shanghai Key Laboratory of Scalable Computing and Systems, Shanghai Jiao Tong University, Minhang District, Shanghai 200240, China. E-mail: [email protected], {fwu, gchen}@cs.sjtu.edu.cn.

Manuscript received 21 Nov. 2015; revised 3 May 2017; accepted 28 May 2017. Date of publication 12 June 2017; date of current version 5 Jan. 2018. (Corresponding author: Fan Wu.) For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference the Digital Object Identifier below. Digital Object Identifier no. 10.1109/TMC.2017.2714668




design quality based incentive mechanisms for crowdsensing. We list three major challenges:

• Quality Estimation: It is technically difficult to estimate the quality of sensing data without any prior knowledge of the sensing behavior of individual participants or the ground truth of targeted contexts. Subsequent quality verification would require significant investment in deploying dedicated infrastructure for on-site sensing and ground-truth collection, like the Model 831-NMS permanent noise monitoring system [6]. Lacking flexibility and scalability, the deployment of traditional static sensing infrastructure, in turn, negates the benefits of crowdsensing.

• Incentive Design: It is challenging to design incentive mechanisms that achieve both individual rationality and profit maximization. Here, individual rationality means that a participant should be rewarded no less than her sensing cost, and the profit of the service provider is the difference between the value of crowdsensing based services and the total rewards to participants. Carefully designed incentive mechanisms are required to motivate effective data contributions from rational participants, and to maintain a robust, profitable market for the crowdsensing service provider.

• Effective Feedback: It is nontrivial to bridge the gap between the quality of sensing data and the rewards for contributions. Participants of crowdsensing, who perform the sensing tasks with heterogeneous physical resources and manual efforts, and therefore submit sensing data of diverse quality, may require appropriate rewards according to their contributions. While the traditional uniform pricing scheme is unfair, the pay-as-bid pricing method used in most auction based incentive mechanisms is troublesome for participants and indulgent of careless behavior. Feedback on data quality is necessary to encourage long-term, effective contributions.

In this paper, we incorporate the consideration of data quality into the design of an incentive mechanism, and propose to pay rational participants according to how well they perform, to motivate efficient crowdsensing.

Our main contributions are listed as follows.

• First, we propose to design a quality based incentive mechanism that directly motivates individual participants to submit high quality sensing data for long-term, effective crowdsensing.

• Second, we extend the well-known Expectation Maximization algorithm, which combines maximum likelihood estimation and Bayesian inference, to estimate the quality of sensing data, and further apply classical information theory to measure the effective data contribution. Based on the estimated quality and contribution, we determine fair and proper rewards for the participants. The incentive mechanism achieves individual rationality and (approximate) profit maximization.

• Finally, we implement and extensively evaluate the incentive mechanism. Our evaluation results show that it achieves superior performance in terms of quality assurance and profit management, compared to the general data collection model and the uniform pricing scheme.

The rest of the paper is organized as follows. We present the crowdsensing model, design objectives, and technical preliminaries in Section 2, and then describe the detailed design of our quality based incentive mechanism in Section 3. In Section 4, we extend our design to support crowdsensing systems where the service value depends on an overall quality of sensing data from all the participants. In Section 5, we evaluate our incentive mechanism and show the results. In Section 6, we briefly review related work. Finally, we discuss the limitations of our incentive mechanism in Section 7, and conclude the paper with future work in Section 8.

2 PRELIMINARIES

In this section, we present the crowdsensing model and key techniques for quality estimation. Table 1 lists the notations and their descriptions used in the paper.

2.1 Crowdsensing Model

As illustrated by Fig. 1, there are three major components in the crowdsensing system, i.e., service subscribers who request services, a service provider who conducts the crowdsensing campaign and provides services, and a crowd of participants who submit sensing data to support the services.

TABLE 1
Key Notations

$T$: Set of sensing tasks
$A$: Set of participants
$D$: Set of discrete noise intervals
$A_t$: Set of participants who complete task $t \in T$
$T_k$: Set of tasks that $a_k \in A$ performs
$S$: Set of observed sensing data
$P$: Set of missing true noise interval indicators
$E$: Set of unknown effort matrices
$L(E; P, S)$: Likelihood function of $E$
$e^k$: Effort matrix of $a_k$
$e^k_{ij}$: Probability that $a_k$ submits data in interval $d_j$ while the true interval is $d_i$
$\Pi$: Noise interval distribution
$p^t$: True noise interval indicator for task $t$
$p^t_i$: Probability that task $t$ has true noise interval $d_i$
$d^k_t$: Noise interval that $a_k$'s sensing data for task $t$ falls into
$I(d^k_t = d_j)$: Indicator function for the event $d^k_t = d_j$
$q_k$: Quality of $a_k$'s sensing data
$x_m(q_k)$: Contribution of sensing data of quality $q_k$
$v_m(q_k)$: Marginal contribution of sensing data of quality $q_k$ to the cooperatively achieved service value
$c_k$: Reserve price/sensing cost of $a_k$
$r_k$: Reward to $a_k$ for her contribution
$r^m_k$: Marginal reward to $a_k$ for her marginal contribution
$V$: Value gained from qualified sensing data
$v(W)$: Overall value gained from sensing data of participants in set $W$
$r^*$: Optimal quality based reward
$r^*_m$: Optimal quality based marginal reward
$r^*_u$: Optimal uniform reward



The crowdsensing process (the right part of Fig. 1) can be described as follows. First, the service provider releases a set $T$ of sensing tasks (e.g., noise sensing on campus at 10:00 am) with an incentive announcement and a quality requirement (e.g., an error threshold). In the area of interest, there is a set $A = \{a_1, a_2, \ldots, a_n\}$ of participants with sensors embedded in their mobile devices. Each participant $a_k \in A$ bears a private reserve price/sensing cost $c_k$ (i.e., a monetary value for her consumption of physical resources and investment of manual effort), and thus expects a reward for her contribution. Without sufficient rewards, the participants may not undertake the sensing tasks. The service provider estimates the quality $q_k$ of sensing data from each participant $a_k$. Taking the profile of the participants' data quality and sensing costs into consideration, she selects a subset $W \subseteq A$ of participants to perform each sensing task, and offers each participant $a_k \in W$ a certain amount of reward $r_k$ according to her effective contribution. After collecting the sensing data for some tasks, the service provider updates the quality estimation $q_k$ for each participant $a_k \in W$ to guide the next round of recruitment (the right part), and extracts information to provide services (the left part).

We consider a general class of crowdsensing applications in which the availability and preciseness of services significantly depend on the quality of sensing data, e.g., urban noise pollution monitoring, which measures ambient noise pollution based on sensing data collected from mobile devices. For each piece of sensing data with an error below the specified threshold, the service provider gains a value $V$ (e.g., the subscription fee from service subscribers). For simplicity, we assume that $V$ is fixed in our basic incentive mechanism, and later relax this assumption. The objective of the service provider is to maximize her own profit by recruiting participants with proper rewards and providing services with guaranteed quality. The profit is defined as the difference between the total value gained from the sensing data and the rewards paid to participants, i.e.,

$$\mathrm{Profit} \triangleq \sum_{a_k \in W} (V - r_k).$$

In this paper, we focus on the data quality that is specifically affected by participants' effort levels for sensing, and aim at designing incentive mechanisms for the service provider to stimulate high quality sensing and long-term, effective contributions.

2.2 Quality Estimation via EM

For crowdsensing, e.g., urban noise sensing, it is reasonable to calibrate the sensing data to moderate the inherent uncertainty of mobile devices. Here, we divide the reading of sensing data into discrete intervals, and suggest that the service provider deliver a certain interval to the service subscribers, rather than an accurate reading, to mitigate the impact of device variance and device error. The discrete intervals are denoted as a set $D = \{d_1, d_2, \ldots, d_m\}$, where each interval spans a range of decibels, and the granularity of the interval division can be determined by the tradeoff between accuracy and complexity.

Regarding the quality of sensing data as a result of the effort levels, we estimate an "effort matrix" $e^k$ for each participant $a_k$, and map this effort matrix into a scalar quality value through the function $q_k = g(e^k)$. Here, the effort matrix $e^k$ is an $m \times m$ matrix, with element $e^k_{ij} \in [0, 1]$, $i = 1, \ldots, m$, $j = 1, \ldots, m$, indicating the probability that participant $a_k$ submits a piece of sensing data in interval $d_j$ while the true reading is in interval $d_i$. Particularly, $\{e^k_{ii} \mid i = 1, \ldots, m\}$ contains the probabilities that participant $a_k$ obediently performs outside-pocket sensing in each of the $m$ possible cases. Furthermore, the conditional probabilities satisfy $\sum_j e^k_{ij} = 1$.

We note that the effort matrix could be measured directly if we had ground truth for all spatial-temporal contexts. However, for crowdsensing, the true reading, or even the true interval, cannot be ascertained in most cases, making the direct verification of data quality and the discernment of the effort matrix challenging. In this paper, we resort to the well-known expectation maximization (EM) algorithm [33] to estimate each participant's effort matrix.

The EM algorithm is an iterative method for finding the maximum likelihood estimation (MLE) of parameters (e.g., the effort matrix of each participant, and the true noise interval of each task) when there is missing data (e.g., the indicators that tell right from wrong for sensing data) that precludes straightforward estimation of the parameters. Here, MLE calculates the estimation of the parameters that maximizes the (log-)likelihood of the observations (e.g., the submitted sensing data), and converges in probability to the true value of the parameters when the number of observations is sufficiently large.

Given a set $S$ of observed sensing data, a set $P$ of missing true interval indicators, a set $E$ of unknown effort matrices, and the density function $f$, the likelihood of the unknown $E$ is

$$L(E; P, S) = f(P, S \mid E).$$

Fig. 1. A general crowdsensing model. According to participants' data quality and sensing costs, the service provider selects a subset of participants to perform sensing tasks for crowdsensing based services.



To find the MLE of $E$, the EM algorithm iteratively runs the following two steps until convergence (supposing that $\hat{E}_t$ is the current estimate of $E$ after $t$ iterations):

The E-step calculates the expected value of the likelihood function, with respect to the conditional distribution of $P$ given the observation $S$ under the current estimate of $E$,

$$Q(E \mid \hat{E}_t) = \mathbb{E}_{P \mid S, \hat{E}_t}\left[ L(E; P, S) \right].$$

The M-step seeks the estimate $\hat{E}$ that maximizes the expectation function,

$$\hat{E}_{t+1} = \arg\max_{E} Q(E \mid \hat{E}_t).$$

To estimate the interval indicators and the participants' effort matrices, we extend the EM algorithm and iterate the following two steps until convergence: 1) estimate the effort matrices and the noise interval distribution via maximum likelihood estimation, based on the estimated interval indicators; and 2) calculate new estimates of the interval indicators, according to the estimated effort matrices and noise interval distribution.

The converged estimate of each participant's effort matrix indicates the quality of her sensing data, while the noise interval distribution is suggestive of the noise pollution level.

3 QUALITY BASED INCENTIVE

In this section, we detail the design of our quality based incentive mechanism for crowdsensing. To pay each individual participant $a_k$ according to how well she does in sensing, we estimate her effort matrix $e^k$, calculate the quality $q_k$ of her sensing data, quantify her effective contribution $x_m(q_k)$, and offer her a proper reward $r_k$.

The feedback loop of quality based incentive is illustrated by Fig. 2. Rewards are determined for each participant according to the quality of her historical sensing data, and in turn, the participants adjust their personal effort levels for completing the succeeding sensing tasks. We assume that participants do not dramatically change their effort levels over a short time, and estimate the quality of their sensing data periodically. Taking the quality of sensing data into consideration, our incentive mechanism can encourage long-term, effective contributions for crowdsensing based services.

3.1 A Simple Case

We first regard all of the submitted sensing data as qualified, and present a simple pricing scheme. We assume that the participants' sensing costs follow a probability distribution with probability density function $f(c_k)$ and cumulative distribution function $F(c_k)$.

A rational participant $a_k$ will not take a given sensing task unless she gets a reward $r \ge c_k$. Therefore, the service provider's profit from providing services and recruiting participant $a_k$, defined as the difference between the value $V$ gained from the sensing data and the reward $r$ to participant $a_k$, where $V \ge r$, is formulated as

$$\mathrm{Profit}(c_k, r) = \begin{cases} 0, & r < c_k, \\ V - r, & r \ge c_k. \end{cases}$$

Since the distribution of $c_k$ is independent of the value $V$ and the reward $r$, the expected profit is calculated as

$$\mathrm{Profit}(r) = \int_0^{\infty} \mathrm{Profit}(c_k, r) f(c_k)\, dc_k = \int_0^{r} (V - r) f(c_k)\, dc_k = F(r)(V - r).$$

Therefore, the service provider can maximize her profit by taking the first derivative of the function $\mathrm{Profit}(r)$, setting it to zero, and solving for the optimal reward, i.e.,

$$r^* = V - \frac{F(r^*)}{f(r^*)}.$$
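As a concrete illustration, the fixed-point condition above can be checked numerically by a direct grid search over the expected profit $F(r)(V - r)$. The following is a minimal Python sketch under the assumption, made purely for illustration, that sensing costs are uniform on $[0, c_{\max}]$ (the paper does not fix a particular cost distribution):

import numpy as np

def optimal_reward(V, F, r_grid):
    # Grid search for the r maximizing expected profit F(r) * (V - r);
    # the maximizer satisfies the first-order condition r* = V - F(r*)/f(r*).
    profits = np.array([F(r) * (V - r) for r in r_grid])
    return r_grid[np.argmax(profits)]

# Hypothetical setting: costs uniform on [0, c_max], so F(r) = r / c_max.
c_max, V = 4.0, 3.0
F = lambda r: min(max(r / c_max, 0.0), 1.0)
r_star = optimal_reward(V, F, np.linspace(0.0, V, 10001))
print(r_star)  # ~1.5; for the uniform case the condition reduces to r* = V/2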

3.2 Quality Estimation

In practice, due to their various effort levels, different participants may submit sensing data of diverse quality. In this section, we extend the Expectation Maximization algorithm to estimate the effort matrix $e^k$ of each participant $a_k$, and then estimate the quality of her sensing data as $q_k = g(e^k)$.

Specifically, we denote the set of participants that submit sensing data to task $t$ as $A_t \subseteq A$, and the set of tasks that participant $a_k$ performs as $T_k \subseteq T$. For task $t \in T_k$, the true noise interval is denoted as $d^0_t$, while the interval into which participant $a_k$'s sensing data falls is denoted as $d^k_t$. An indicator function $I(d^k_t = d_j)$ (i.e., $I(d^k_t = d_j) = 1$ when the event $d^k_t = d_j$ is true; otherwise, $I(d^k_t = d_j) = 0$) is applied to describe the submission of sensing data.

We assume that the effort levels of participants are independent and do not change for a period of time, so that we can periodically learn the effort matrix $e^k$ of each participant $a_k$ and put this knowledge into practice. Since the true interval indicator $p^t = \{p^t_i \mid i = 1, \ldots, m\}$ for each task $t$ ($p^t_i = 1$ if $d^0_t = d_i$ for sure) is unavailable, we resort to the EM algorithm, which combines maximum likelihood estimation and Bayesian inference, to iteratively estimate the unknown effort matrices $e^k$ and the noise interval distribution $\Pi = \{p_i \mid i = 1, \ldots, m\}$. The pseudo-code is shown in Algorithm 1, which runs as follows.

(1) Initialization: For each task $t$, the probability distribution of the true noise interval indicator $p^t$ is initialized as

$$p^t_i = p(d^0_t = d_i) = \frac{\sum_{a_k \in A_t} I(d^k_t = d_i)}{|A_t|}.$$

Fig. 2. Feedback loop of quality based incentive. The service provider estimates the data quality of participants, and offers them proper rewards based on their effective data contributions, to encourage high quality sensing data.

(2) Estimation of effort matrix and noise interval distribution: Given the likelihood function

$$L(E; P, S) = f(P, S \mid E),$$

and

$$L(E; S) = f(S \mid E) = \sum_{P} f(P, S \mid E),$$

where $E = \{e^k \mid a_k \in A\}$, $P = \{p^t \mid t \in T\}$, and $S = \{d^k_t \mid t \in T, a_k \in A_t\}$, the maximum likelihood estimate of $E$ makes the observation $S$ most likely to happen.

We note that the effort matrix $e^k$ of each participant $a_k$ follows the multinomial distribution. When participant $a_k$ performs $n^k_i$ independent tasks with true interval $d_i$, her sensing data for these tasks falls into interval $d_j$ with probability $e^k_{ij}$, where $e^k_{ij} \ge 0$ and $\sum_j e^k_{ij} = 1$, $j = 1, \ldots, m$. Let $n^k_{i1}, \ldots, n^k_{im}$ be the numbers of submissions corresponding to intervals $d_1, \ldots, d_m$, respectively. Then we have $\sum_j n^k_{ij} = n^k_i$, and the likelihood function of $e^k_i$ is

$$f(n^k_{i1}, \ldots, n^k_{im} \mid e^k_{i1}, \ldots, e^k_{im}) = \frac{n^k_i!}{\prod_j n^k_{ij}!} \prod_j (e^k_{ij})^{n^k_{ij}}.$$

Taking the log-likelihood, Lagrange multipliers, and derivatives, we get the most natural estimates [34],

$$e^k_{ij} = \frac{n^k_{ij}}{n^k_i} = \frac{\sum_{t \in T_k} p^t_i\, I(d^k_t = d_j)}{\sum_{t \in T_k} p^t_i}, \quad j = 1, \ldots, m.$$

The noise interval distribution is estimated as

$$p_i = \frac{\sum_{t \in T} p^t_i}{|T|}, \quad i = 1, \ldots, m.$$

(3) Estimation of true noise interval indicator: Given the observed sensing data $S$, the effort matrices $E$, and the noise interval distribution $\Pi$, we apply Bayesian inference to estimate the true noise interval indicators $P$. Considering the $n$ independent sets $\{S_1, \ldots, S_n\}$ of observations from individual participants, where $S_k = \{d^k_t \mid t \in T\}$, $k = 1, \ldots, n$, we have

$$p(P \mid S) = \frac{p(P)\, p(S \mid P)}{p(S)} = \frac{p(P)\, p(S_1 \mid P) \cdots p(S_n \mid P)}{p(S)}.$$

When all terms not involving the true noise interval indicator are absorbed into the proportionality sign, we calculate the distribution of the true noise interval indicator for each task as

$$p^t_i = \frac{p_i \prod_{a_k \in A_t} \prod_j (e^k_{ij})^{I(d^k_t = d_j)}}{\sum_q p_q \prod_{a_k \in A_t} \prod_j (e^k_{qj})^{I(d^k_t = d_j)}}, \quad i = 1, \ldots, m.$$

(4) Convergence: We iterate steps 2-3 until the two estimates converge, i.e., $|\hat{E}_{t+1} - \hat{E}_t| \le \varepsilon$ and $|\hat{\Pi}_{t+1} - \hat{\Pi}_t| \le \eta$. Each iteration (the while loop) takes polynomial time $O(|A| \cdot |T| \cdot |D|) = O(n)$, since the number of tasks and the number of intervals are constants decided by the service provider before the algorithm runs.

Algorithm 1. Effort Matrix Estimation

Input: A set $S = \{d^k_t \mid t \in T, a_k \in A_t\}$ of observations.
Output: Estimation of the effort matrices $E$, the marginal distribution of noise intervals $\Pi$, and the posterior estimation of the true noise interval indicators $P$.

1: // Initialization of Noise Interval Indicator
2: foreach t ∈ T do
3:   cnt[m] ← 0;
4:   foreach a_k ∈ A_t do
5:     i ← d^k_t; cnt_i ← cnt_i + 1;
6:   num ← cnt_1 + ... + cnt_m;
7:   foreach i ∈ D do
8:     p^t_i ← cnt_i / num;
9: while not converged do
10:   // Estimation of Effort Matrix
11:   foreach a_k ∈ A do
12:     cnt[m] ← 0; e^k[m × m] ← 0;
13:     foreach t ∈ T_k do
14:       j ← d^k_t;
15:       foreach i ∈ D do
16:         e^k_ij ← e^k_ij + p^t_i; cnt_i ← cnt_i + p^t_i;
17:     foreach i, j ∈ D do
18:       e^k_ij ← e^k_ij / cnt_i;
19:   // Estimation of Noise Interval Distribution
20:   foreach i ∈ D do
21:     p_i ← 0;
22:     foreach t ∈ T do
23:       p_i ← p_i + p^t_i;
24:     p_i ← p_i / |T|;
25:   // Estimation of Noise Interval Indicator
26:   foreach t ∈ T do
27:     p^t[m] ← 1;
28:     foreach i ∈ D do
29:       foreach a_k ∈ A_t do
30:         j ← d^k_t; p^t_i ← p^t_i · e^k_ij;
31:     smp ← p_1 · p^t_1 + ... + p_m · p^t_m;
32:     foreach i ∈ D do
33:       p^t_i ← p_i · p^t_i / smp;
34: Return E = {e^k | a_k ∈ A}, Π = {p_i | i ∈ D}, P = {p^t | t ∈ T};
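To make the iteration concrete, here is a minimal Python re-implementation of Algorithm 1 with dense NumPy arrays. It is a sketch rather than the authors' code: observations are assumed to arrive as a dict mapping (participant, task) pairs to observed interval indices, every task is assumed to have at least one observation, and the small probability floor that guards against division by zero is our own addition.

import numpy as np

def em_effort_estimation(S, n, m, T, max_iter=100, tol=1e-6):
    # S: dict mapping (k, t) -> observed interval index j, for participant k
    # (0..n-1) on task t (0..T-1). Returns effort matrices E (n x m x m),
    # interval distribution Pi (length m), and posterior indicators P (T x m).

    # (1) Initialization: empirical vote frequencies per task.
    P = np.zeros((T, m))
    for (k, t), j in S.items():
        P[t, j] += 1.0
    P /= P.sum(axis=1, keepdims=True)

    for _ in range(max_iter):
        # (2) MLE of effort matrices and noise interval distribution.
        # A tiny floor keeps rows normalizable and avoids zero posteriors.
        E = np.full((n, m, m), 1e-12)
        for (k, t), j in S.items():
            E[k, :, j] += P[t, :]
        E /= E.sum(axis=2, keepdims=True)
        Pi = P.mean(axis=0)

        # (3) Bayesian update of the true-interval posteriors.
        P_new = np.tile(Pi, (T, 1))
        for (k, t), j in S.items():
            P_new[t, :] *= E[k, :, j]
        P_new /= P_new.sum(axis=1, keepdims=True)

        # (4) Convergence check (on the indicators only, for brevity).
        done = np.abs(P_new - P).max() < tol
        P = P_new
        if done:
            break
    return E, Pi, P

With the returned matrices, the quality of participant a_k follows as E[k].trace() / m, matching the mapping $q_k = g(e^k)$ introduced below.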

We claim that the EM algorithm increases the likelihood function in each iteration, and finally converges to a stable estimation [35]. To circumvent the problem of getting trapped in a local optimum, we try several executions of the algorithm with different initializations on subsets of submissions. Although it is hard to provide a theoretical guarantee for its performance, the EM algorithm has been widely used, and a provably optimal convergence rate up to a logarithmic factor has been shown in [36].

With the estimate of the effort matrix $e^k$, we can obtain the quality of $a_k$'s sensing data through the mapping function. For simplicity, we focus on pure obedience, and set $q_k = g(e^k) = \sum_i e^k_{ii} / m$. With the estimated distribution of the true noise interval indicator $p^t = \{p^t_1, p^t_2, \ldots, p^t_m\}$ for task $t$, the interval $d_{i^*}$ to be delivered is the one with maximum probability, i.e., $i^* = \arg\max_i p^t_i$.



3.3 Contribution Quantification

Various analyses and experiments have confirmed that expert work can be accomplished by the local crowd, even if they lack expert knowledge. However, the contribution of each individual participant remains unknown. Here, inspired by ideas from information theory and Shannon's channel coding theorem [37], [38], we quantify the participants' contributions through information uncertainty reduction.

We regard the right part of the crowdsensing system (Fig. 1) as a signal transmission system (Fig. 3). The input signal $X$ is the environmental data that the crowdsensing system targets to collect, and the output signal $Y$ is the sensing data that the service provider actually receives from the participants. Transmitted through the channel, an input signal may be distorted in a random way depending on the channel condition (i.e., the noise variable $Z$ on the transmission channel, which is independent of $X$), and thus the output signal may differ from the input signal. Similarly, in the crowdsensing system, the quality of sensing data is affected by the participant's sensing quality $q_k \in [0, 1]$. We use $p(z = 0) = q_k$ to indicate that the output signal equals the input signal with probability $q_k$, and $p(z = 1) = 1 - q_k$ to indicate that an error occurs with probability $1 - q_k$.

Similar to the capacity of a noisy channel [37], the contribution of the sensing data can be expressed as the mutual information

$$I(X; Y) = H(X) - H(X \mid Y) = H(X) - \sum_y p(y) H(X \mid Y = y) = H(X) - \sum_y p(y)\, h_b(q_k) = H(X) - h_b(q_k),$$

where $H(X)$ is the entropy of $X$, $H(X \mid Y)$ is the conditional entropy of $X$ given $Y$, and $h_b(q_k)$ is the binary entropy of the binary random noise $Z$ with distribution $\{q_k, 1 - q_k\}$, i.e.,

$$h_b(q_k) = -q_k \log(q_k) - (1 - q_k) \log(1 - q_k).$$

This mutual information $I(X; Y)$ measures how much information the presence of $Y$ contributes to making the correct inference for $X$.

In our crowdsensing system, when no sensing data is submitted, the service provider gets little information about the environment, and thus all the $m$ optional intervals of environmental data are equally likely to be observed with probability $1/m$, making the uncertainty maximal at

$$H(X) = -\sum_{i=1}^{m} \frac{1}{m} \log\left(\frac{1}{m}\right) = \log(m).$$

Generally, if $Z$ is not a binary random variable, but is distributed with probability $q_k$ on the correct interval and equal probability $(1 - q_k)/(m - 1)$ on each of the $m - 1$ remaining intervals, then the information uncertainty is

$$h_m(q_k) = -q_k \log(q_k) - \sum_{i=1}^{m-1} \frac{1 - q_k}{m - 1} \log\left(\frac{1 - q_k}{m - 1}\right) = -q_k \log(q_k) - (1 - q_k) \log\left(\frac{1 - q_k}{m - 1}\right).$$

Therefore, the effective contribution of sensing data of quality $q_k$ can be formulated as

$$x_m(q_k) = \log(m) + q_k \log(q_k) + (1 - q_k) \log\left(\frac{1 - q_k}{m - 1}\right).$$

With the convention $0 \log 0 = 0$, sensing data of quality $q_k = 1$ results in minimal uncertainty, $h_m(1) = 0$, and maximal contribution, $x_m(1) = \log(m)$. Although a binary channel that never makes errors and one that always makes errors are equally good for communication, we only consider and reward sensing data of quality within the range $[0.5, 1]$. As Fig. 4 shows, the effective contribution of sensing data increases with sensing quality.
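The contribution function is straightforward to compute. A small Python helper, assuming base-2 logarithms (consistent with the binary special case $x_2(e) = 1 + e \log e + (1 - e)\log(1 - e)$ used in Section 5.2) and the convention $0 \log 0 = 0$:

import numpy as np

def x_m(q, m):
    # Effective contribution x_m(q) = log(m) + q log q + (1-q) log((1-q)/(m-1)),
    # with the convention 0 log 0 = 0.
    def plogp(p, denom=1.0):
        return 0.0 if p == 0.0 else p * np.log2(p / denom)
    return np.log2(m) + plogp(q) + plogp(1.0 - q, m - 1)

print(x_m(1.0, 8))  # 3.0 = log2(8): a perfect participant contributes maximally
print(x_m(0.5, 2))  # 0.0: an uninformative binary submission contributes nothing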

Practically, for the same volume, sensing data of high quality carries a larger amount of constructive information than that of low quality. Specifically, high quality data contains intrinsic efficiency, while low quality data needs extra information, functioning like an error-correcting code (ECC), to detect and/or correct errors without resubmission. In crowdsensing, such error correction is more often conducted in the form of verification, by recruiting another group of participants or sensing another kind of data (e.g., the light signal to determine whether the device is out of the pocket). Here, we elide the specific ECC, focus on its cost (i.e., accounting for a part of the data volume), and quantify the effective contribution of sensing data as the information uncertainty reduction.

3.4 Reward Distribution

In this section, we take a step further and reward each selected participant proportionally to her qualified contribution, i.e., $r_k = r \cdot x_m(q_k) = r \cdot x_m(g(e^k))$, where $r$ is a benchmark reward.

Fig. 3. A discrete channel $(\alpha, Z)$, where $Y = \alpha(X, Z)$.

Fig. 4. Data contribution changes with sensing quality.



We adjust the parameters of the simple case. From participant $a_k$ with effort matrix $e^k$, the profit that the service provider gains from her sensing data is

$$\mathrm{Profit}(c_k, e^k, r) = \begin{cases} 0, & r\, x_m(g(e^k)) < c_k, \\ V - r\, x_m(g(e^k)), & r\, x_m(g(e^k)) \ge c_k. \end{cases}$$

Given the joint distribution $f(c_k, e^k)$ of sensing cost and effort matrix,^1 we can calculate the expected profit of the service provider by integrating the function over all possible sensing costs and effort matrices, i.e.,

$$\mathrm{Profit}(r) = \int_{e^k} \int_0^{\infty} \mathrm{Profit}(c_k, e^k, r)\, f(c_k, e^k)\, dc_k\, de^k.$$

^1 Although this joint distribution can be learned from historical data or a general survey, it is not our main contribution. Here, we assume that the distribution is common knowledge, and consider two kinds of joint distributions, with either no correlation or a strong positive correlation between sensing cost and effort matrix.

Then, the optimal benchmark reward is determined by the solution to the maximization problem

$$r^* = \arg\max_r\, \mathrm{Profit}(r).$$

Each selected participant $a_k$ will get her quality based reward $r_k = r^*\, x_m(g(e^k))$.

We note that, for a simple joint distribution $f(c_k, e^k)$, the optimal benchmark reward $r^*$ can be calculated by solving the integral equation and taking the derivative with respect to $r$. For complex cases, however, greedy algorithms can find a proper reward with approximate profit more efficiently.

4 QUALITY AWARE SERVICE VALUE

In the previous section, we assumed that the service provider gains a fixed value $V$ for sensing data with an error below the threshold, and rewarded each participant independently and solely upon her quality of sensing data. In practical scenarios, however, the value of a crowdsensing based service is achieved as the cooperative work of participants, or, as we say, each participant contributes marginal value to the service. In this section, we relax the assumption of fixed value, and reward participants according to their marginal contributions to the service value.

4.1 Marginal Contribution Quantification

The service value gained from the overall selected participants is denoted by $v(W)$, where $W$ is the finite set of selected participants, and $v : 2^W \to \mathbb{R}$ associates with each subset $S \subseteq W$ a real-valued value $v(S)$ to which each selected participant makes a contribution. In general, $v(S)$ depends on the overall value $V$ (e.g., the subscription fee from service subscribers) and the sensing quality and effective contribution of this subset of participants. We have $v(\emptyset) = 0$ by default.

To motivate participants in a fair manner, we calculate the marginal contribution $v_m(q_k)$ for each participant. Here, fairness means that if participants $a_i$ and $a_j$ always behave the same in each cooperation with the other participants, i.e., for all $S$ that contain neither $a_i$ nor $a_j$, $v(S \cup \{a_i\}) = v(S \cup \{a_j\})$, then they should make the same contributions. Their behavior is characterized by the estimated quality of sensing data. Also, to encourage high quality sensing, the marginal contribution, as well as the marginal reward, should monotonically increase with the quality of sensing data.

With an elegant axiomatic characterization, the Shapley value [39] is well studied in the field of cooperative game theory. The Shapley value meets a collection of desirable properties, i.e., efficiency, symmetry, dummy, and additivity, which provides excellent fairness for our marginal contribution analysis.

In our design, the Shapley value of participant $a_k$ is calculated by

$$v_m(q_k) = \sum_{S \subseteq W \setminus \{a_k\}} \frac{|S|!\,(|W| - |S| - 1)!}{|W|!} \left( v(S \cup \{a_k\}) - v(S) \right),$$

where $|S|$ and $|W|$ are the cardinalities of sets $S$ and $W$, respectively. It captures the average marginal contribution of participant $a_k$, averaged over all possible selection orderings according to which the cooperation could be built up from the original empty set.

Computing the above Shapley value exactly would enumerate all $|W|!$ permutations of $W$. To mitigate the complexity, we randomly select a constant number $K$ of permutations out of the $|W|!$ ones with equal probability, calculate the marginal contribution of each participant in each permutation, and take the average value over these permutations, i.e.,

$$v_m(q_k) = \sum_{i = 1, \ldots, K} \frac{v(S_i \cup \{a_k\}) - v(S_i)}{K},$$

where $S_i \subseteq W \setminus \{a_k\}$ is the set of participants preceding $a_k$ in the $i$th sampled permutation.

The approximate Shapley value may not strictly satisfy properties like symmetry and additivity, but we can guarantee efficiency by normalization and dummy by nature. Compared to a one-step (e.g., greedy) marginal contribution calculation, this approximate value can mitigate the differences in marginal contributions that result from orderings, and thus improves the fairness among the participants.
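The permutation-sampling estimator above is easy to implement. A minimal Python sketch (our own illustration; the value function v is passed in as a parameter, and efficiency can afterwards be restored by rescaling the estimates so that they sum to v(W)):

import random

def approx_shapley(W, v, K=200, seed=0):
    # Monte Carlo Shapley values: average the marginal contribution of each
    # participant over K uniformly random orderings of W.
    rng = random.Random(seed)
    vm = {a: 0.0 for a in W}
    for _ in range(K):
        order = list(W)
        rng.shuffle(order)
        prefix, prev = set(), 0.0          # v(empty set) = 0 by default
        for a in order:
            cur = v(prefix | {a})
            vm[a] += (cur - prev) / K      # marginal contribution in this ordering
            prefix.add(a)
            prev = cur
    return vm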

4.2 Marginal Reward Distribution

Based on the approximate Shapley value, we determine the marginal reward for each participant. Similar to the independent reward scenario, the marginal rewards for participants are proportional to their marginal contributions, i.e., $r^m_k = r_m\, v_m(q_k)$, where $r_m$ is a reward ratio in $(0, 1]$.

Thus, the profit of the service provider is $(1 - r_m)\, v(W)$, and the optimal ratio maximizes the profit subject to individual rationality.

5 EVALUATION RESULTS

In this section, we conduct simulations to evaluate the performance of our quality based incentive mechanism. We first analyze the improvement in quality assurance. Then, we compare our quality based reward mechanism to the uniform pricing scheme, and illustrate its superior performance in profit management.

5.1 Quality Assurance

We install the NoiseTube mobile app [40] on Google Nexus 7 devices, and use the embedded acoustic sensor to measure noise in a meeting room. We recruit 10 participants to take part in the experiment, each of whom carries a Nexus and randomly puts it in his/her pocket or leaves it on the table. The participants are clearly told that accurate monitoring occurs when they take the devices out and keep them undisturbed.

The basic experiment tests whether the participants' effort levels affect the noise readings. As Fig. 5a shows, the noise reading from a muffled microphone inside a pocket is at least 5 dB lower than that of outside-pocket sensing. The sensing data submitted by participants, as shown in Figs. 5b, c, d, e, f, also presents such reading differences, based on which we can roughly tell the effort levels of participants: node 10 senses with the highest effort level and submits almost perfect readings; nodes 7, 8, 9, and 11 are 85 percent accurate, with high effort levels most of the time; nodes 2 and 4 are helpful, with 70 percent accuracy; node 6 is careless, with high accuracy at first that gradually slacks off; node 5 is indifferent, with half of its readings accurate and the other half deviating intermittently; and node 3 senses with the lowest effort level, with all readings lower than the ground truth.

Given the reading differences, we compare the quality assurance, i.e., the overall monitoring accuracy as a collective work from the crowd, of our quality measured model (QM), the traditional majority voting model (MV), and the all and average model (AA). The difference is: QM excludes sensing data of low quality (i.e., with accuracy less than 50 percent) and assigns quality-estimated data different weights; MV first selects the most frequent noise interval, and then averages the noise readings within it; and AA takes in all submissions and reports the average reading.

The results, as shown in Fig. 6, indicate that QM outperforms the other two models: it monitors the noise pollution more accurately (i.e., its readings keep close to the ground truth), and is more robust to the effort fluctuations of participants, especially when careless and indifferent participants make up more than half of the whole population. Furthermore, the MV model may drive the monitoring into fierce fluctuation when the noise intervals are fine-grained, which is 5 dB per interval in our setting. Despite a similar trend to QM, the AA model is more vulnerable to large amounts of low quality submissions.

5.2 Profit Management

To test the performance of our quality based incentive mechanism in terms of profit management, we first generate the sensing costs and effort matrices for participants, and then compare the profit of our mechanism to that of the uniform pricing scheme.

We draw $v_c$ and $v_e$ from a bivariate normal distribution, $(c, e) \sim \mathcal{N}(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$, where $\mu_1 = 2.0$, $\mu_2 = 0.75$, $\sigma_1 = 1.0$, and $\sigma_2 = 0.125$; $\rho = 0.0$ is set to indicate that there is no correlation between sensing cost and effort matrix (Fig. 7a), or $\rho = 0.8$ for a strong positive correlation (Fig. 7b). According to the 68-95-99.7 rule (the $3\sigma$ rule) [41], the 95.45 percent

Fig. 5. Accuracy of noise pollution monitoring with different effort levels of participants. (a) General noise reading differences between outside-pocket sensing and inside-pocket sensing. (b)-(f) Noise readings of the ground truth (Node 1) and from the 10 participants (Nodes 2-11).

Fig. 6. Comparison of monitoring accuracy of different models. Our quality based incentive mechanism outperforms the majority voting model and the all and average model in monitoring accuracy and robustness to quality fluctuation.



confidence interval is $\mu \pm 2\sigma$, which empirically states that about 95.45 percent of the data drawn from the normal distribution lies within $[0.0, 4.0] \times [0.5, 1.0]$ in our setting. Then, we transform $v_c$ and $v_e$ to $c_k$ and $e^k$ correspondingly by setting $c_k = \max(-0.5, \min(v_c, 4.5))$ and $e^k_{ii} = \max(0.45, \min(v_e, 1.05))$, $i = 1, \ldots, m$. Therefore, the extreme data is excluded and the remaining majority approximately follows the same normal distribution. Notably, other forms of distribution are also experimentally possible, and the exact joint distribution needs to be carefully estimated in practical crowdsensing markets [42].
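A minimal NumPy sketch of this generation step (our own illustration of the sampling and clipping just described):

import numpy as np

def sample_cost_effort(n, rho, seed=0, mu=(2.0, 0.75), sigma=(1.0, 0.125)):
    # Draw (v_c, v_e) from the bivariate normal and clip as in the text.
    rng = np.random.default_rng(seed)
    cov = [[sigma[0]**2, rho * sigma[0] * sigma[1]],
           [rho * sigma[0] * sigma[1], sigma[1]**2]]
    v = rng.multivariate_normal(mu, cov, size=n)
    c = np.clip(v[:, 0], -0.5, 4.5)    # c_k = max(-0.5, min(v_c, 4.5))
    e = np.clip(v[:, 1], 0.45, 1.05)   # e_ii = max(0.45, min(v_e, 1.05))
    return c, e

c, e = sample_cost_effort(1000, rho=0.8)  # the strongly correlated case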

After obtaining the joint distribution, we compare the profit of our quality based incentive mechanism with that of the uniform pricing scheme. We note that the quality based incentive mechanism gains a full value $V$ from sensing data by providing an error-bounded service, and offers each participant a proper reward based on her effective contribution. In the uniform pricing scheme, the sensing data is regarded equally, with the same quality, and the participants are offered the same reward,

$$r^*_u = \max_{a_k \in A_t} c_k.$$

However, the gained value is restricted by the actual quality $q_k$ of the sensing data, i.e., $v_k = y(V, q_k)$, which monotonically increases with quality $q_k$.

For simplicity, we consider $|D| = 2$ noise intervals, and omit the subscripts of $e$. Then, the effective contribution is calculated as

$$x_m(g(e)) = x_2(e) = 1 + e \log e + (1 - e) \log(1 - e).$$

The value function is set to $v_k = V \sin(x_m(g(e)) \cdot \pi / 2)$, which is concave on the feasible range $x_m \in [0, 1]$.

We select participants from a sufficient crowd in increasing order of cost/contribution ratio, and calculate the optimal reward for the top proportion of them, ranging from 10 to 100 percent. The optimal reward in our quality based incentive mechanism is determined by

$$r^* = \arg\min_r \left\{ r \;\middle|\; r\, x_m(g(e^k)) - c_k \ge 0,\ \forall a_k \in A_t \right\}.$$
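Since the constraint must hold for every selected participant, the smallest feasible benchmark reward is simply the largest cost/contribution ratio among them, i.e., $r^* = \max_{a_k} c_k / x_m(g(e^k))$. A small Python sketch of the selection and pricing step (our own illustration):

import numpy as np

def select_and_price(costs, contributions, fraction):
    # Sort by cost/contribution ratio, keep the top `fraction`, and set the
    # benchmark reward to the largest ratio among those selected, so that
    # r * x_m(g(e_k)) >= c_k holds for every selected participant.
    ratio = np.asarray(costs) / np.asarray(contributions)
    chosen = np.argsort(ratio)[: int(len(ratio) * fraction)]
    r_star = ratio[chosen].max()
    rewards = r_star * np.asarray(contributions)[chosen]
    return chosen, r_star, rewards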

Each participant $a_k$ will get a proper reward,

$$r_k = r^*\, x_m(g(e^k)).$$

The results, as shown in Fig. 8, indicate that our quality based incentive mechanism decisively outperforms the uniform pricing scheme under both distributions. Our mechanism follows the cost/contribution ratio to set the optimal reward at every stage, and thus can fully leverage the power of participants to complete the sensing tasks at a low cost, compared to the uniform pricing scheme. Moreover, with the guaranteed value of service, the quality based incentive mechanism, with its higher accuracy and smaller fluctuation in noise monitoring, is more appealing to the service provider.

Fig. 7. Joint distribution of sensing cost and effort matrix.

Fig. 8. Comparison of profit under different pricing schemes. Our quality based incentive mechanism decisively outperforms the uniform pricing scheme under both distributions.

The results also suggest the proper fraction of participants that the service provider should try to recruit: 80 percent for both schemes when sensing cost and effort matrix have no correlation, and 80 and 70 percent for our quality based incentive mechanism and the uniform pricing scheme, respectively, when the two factors are positively correlated. It is reasonable to see such a turning point in profit, from a smooth rise to a fall, since there are always some participants whose sensing costs exceed what they deserve to be rewarded according to their effective contributions, i.e., participants with unreasonably high sensing costs for subpar contributions. The x-axis ends at 970 (Fig. 8a) and 940 (Fig. 8b), respectively, since we have excluded extreme data from the distribution.

These results also demonstrate the importance of quality estimation and contribution quantification in guaranteeing both the quality and the profit of crowdsensing based services. Through reward control, the participants who submit low quality sensing data will be paid less or even nothing, and will try to increase their sensing effort to earn proper rewards again. On the other hand, being well paid, the participants who submit high quality sensing data will keep doing sensing tasks. Through this dynamic interaction, our mechanism helps the service provider distinguish the valuable participants from the rest, and maintain long-term, efficient crowdsensing.

5.3 Marginal Profit Management

We also compare the marginal profit of our extended quality based incentive mechanism with that of the uniform pricing scheme. Different from the experiments in the previous section, here the value of the service is achieved by the cooperative work of participants.

We use a submodular function of the mean quality to describe the marginal contribution, i.e.,

$$v(W) = V \cdot h(|W|) \cdot \frac{\sum_{a_k \in W} g(e^k)}{|W|},$$

where $h(n) = n / (n + 1000)$ models the relationship between the number of participants and the credibility of the data. The function $v(W)$ takes into consideration both the number and the sensing quality of the selected participants.

The optimal marginal reward is determined by

$$r^*_m = \arg\min_r \left\{ r \;\middle|\; r\, v_m(g(e^k)) - c_k \ge 0,\ \forall a_k \in W \right\}.$$

Each participant $a_k$ will get a proper reward,

$$r^m_k = r^*_m\, v_m(g(e^k)).$$

The results, as shown in Figs. 9a and 9b, show that our quality based extension achieves better performance than the uniform pricing scheme under both distributions for cooperative crowdsensing. We select participants according to their ratio of cost over marginal contribution, in non-decreasing order, which leads to a high quality of service at a moderate cost. The results also suggest a similar recruitment proportion. While there is an obvious oscillation in the profit of the uniform pricing scheme, our quality based incentive mechanism guarantees an increasing profit before over-recruitment, i.e., before a high benchmark reward is set to take in participants who require a much higher reward but contribute little to the service value. The service provider can detect and avoid this side effect by considering both the marginal contributions and the sensing costs of participants, and recruiting a proper number of participants.

6 RELATED WORK

In this section, we briefly review related work on incentive mechanism design and data quality estimation in crowdsourcing and crowdsensing.

6.1 Incentive Mechanisms for Crowd

There is extensive research on incentive mechanism design for crowdsourcing and crowdsensing. We classify these designs as quantity-oriented or quality-oriented.

6.1.1 Quantity-Oriented Incentive Mechanisms

These incentive mechanisms are designed to increase the quantity of data, or, in other words, to motivate participation of the crowd. Lee and Hoh [15] proposed a reverse auction based dynamic pricing scheme to motivate participants to sell their sensing data at claimed bids. Yang et al. [16] considered a platform-centric incentive model, where the reward is proportionally shared by participants in a Stackelberg game, and a user-centric incentive model, where participants in the auction bid for tasks and get paid no less than their submitted bids. Zhao et al. [18] and Gao et al. [19] suggested online incentive mechanisms to flexibly recruit participants who appear opportunistically in the phenomena of interest. Chandr et al. [20] considered participant reliability, i.e., the ratio of responses submitted timely to tasks accepted by the participant, in their online, time sensitive crowdsourcing platform. Wei et al. [21] considered the dynamic nature of both participants and service providers. Luo et al. [22] studied an all-pay auction with realistic constraints such as information asymmetry, population uncertainty, and risk aversion. More related work and open research challenges are discussed in the survey [29].

Fig. 9. Comparison of profit under different marginal pricing schemes. Our quality based incentive extension achieves better performance than the uniform pricing scheme before over-recruitment, under both distributions.



In general, these reverse auction based incentive mechanisms have not considered the quality of sensing data, and thus face limitations in guaranteeing and improving the quality of service. To free participants from extra sensing-unrelated effort, we do not require the participants to rigorously calculate their reserve prices, nor to reveal this private information and bid for sensing tasks, which differs from those auction based incentive mechanisms.

6.1.2 Quality-Oriented Incentive Mechanisms

There are also systems considering the reputation of participants and the impact of incentives on personal effort. Reddy et al. [23] built a performance based reputation system to identify and select well-suited participants for crowdsensing. They used expert review, peer comparison, and progress review for quality estimation. Cheung et al. [24] considered heterogeneous initial locations, movement costs, movement speeds, and participant reputation levels, to help participants determine task selections and mobility plans. The reputation level, as prior knowledge, is described as the eligibility for sensing tasks. Pan et al. [43] recorded performance for reputation and studied how to assemble teams of participants through intelligent task allocation for efficient quality-time-cost trade-offs. On the other hand, although various empirical experiments [30], [44] demonstrate that financial and social incentives do have an impact on the performance of participants, such as engagement, compliance, and quality, they fail to generalize an incentive model that adaptively guides the participants' behavior.

Koutsopoulos [25] considered the participation level (sample frequency) in the incentive mechanism design. However, this work addresses the quality issue and the incentive distribution separately, and does not show how to improve the sensing quality of participants. While Kawajiri et al. [26] steered participants to cover sufficient locations to improve the quality of service, they based their quality of service on the number of data submissions from different locations in their wireless indoor localization system. In contrast, we systematically consider and measure the effort levels of participants, and bridge the gap between data quality and reward distribution.

Multi-attributive auction [45] allows us to use a powerful additive utility function to combine sensing cost, participant credibility, sensing quality, and privacy concern, enabling dynamic negotiation between participants and the service provider. While it shows how to bridge the gap between quality, privacy, and incentive through a utility function computed by the service provider for each single participant, our mechanism considers both independent and marginal data contributions from participants.
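For intuition only, a minimal sketch of such an additive scoring rule; the weights and attribute names below are our illustrative assumptions, not values from [45]:

```python
def additive_utility(quality, credibility, cost, privacy_risk,
                     w_q=0.4, w_c=0.3, w_p=0.2, w_r=0.1):
    """Hypothetical additive scoring of one participant's offer, in the
    spirit of multi-attributive auctions: positive attributes add utility,
    while cost and privacy risk subtract. All weights are illustrative."""
    return w_q * quality + w_c * credibility - w_p * cost - w_r * privacy_risk
```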

6.2 Quality Estimation of Sensing Data

While majority voting and all-and-average can be used to aggregate sensing data, these methods treat each participant as equal in expertise and willingness, and take their sensing data as equal in quality and contribution, which may discourage diligent participants and indulge careless sensing behavior.
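To make the contrast concrete, a minimal sketch with hypothetical numeric readings; the weighting rule here is ad hoc, not our mechanism:

```python
import numpy as np

# Hypothetical readings: rows are participants, columns are sensing tasks.
readings = np.array([[62.0, 71.5, 55.0],
                     [61.0, 70.0, 54.5],
                     [80.0, 90.0, 30.0]])   # a careless participant

# All-and-average: every participant weighs equally, so the careless row
# distorts the aggregate as much as the diligent rows inform it.
naive = readings.mean(axis=0)

# A quality-weighted alternative down-weights participants whose readings
# deviate most from the per-task median (an ad hoc choice of weights).
dev = np.abs(readings - np.median(readings, axis=0)).mean(axis=1)
weights = 1.0 / (1.0 + dev)
weighted = (weights[:, None] * readings).sum(axis=0) / weights.sum()
```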

The expectation maximization (EM) algorithm we adopt was originally introduced by Dawid and Skene [33], where it is used to obtain maximum likelihood estimates of observers' error rates and to infer the true responses of patients.

The convergence properties of EM are discussed in [35]. Wang and Ipeirotis [46] applied the EM algorithm to estimate the quality of crowdsourced labeling workers. Zhang et al. [36] proposed to combine spectral methods and the EM algorithm to address the problem of crowdsourced multi-class labeling with an optimal convergence rate up to a logarithmic factor.
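For concreteness, a minimal sketch of the Dawid-Skene style EM loop, simplified to binary labels under the assumption that every participant labels every task; the function and variable names are ours:

```python
import numpy as np

def dawid_skene_em(labels, n_iters=50):
    """Binary Dawid-Skene EM sketch.

    labels: (n_workers, n_tasks) array in {0, 1}, assuming no missing
    entries for brevity. Returns the posterior probability that each
    task's true label is 1, and each worker's confusion matrix."""
    n_workers, n_tasks = labels.shape
    q = labels.mean(axis=0)                 # soft init: majority vote
    conf = np.full((n_workers, 2, 2), 0.5)  # confusion matrices

    for _ in range(n_iters):
        # M-step: conf[w, t, l] = P(worker w reports l | true label t).
        for w in range(n_workers):
            for t in range(2):
                weight = q if t == 1 else (1.0 - q)
                denom = weight.sum() + 1e-9
                conf[w, t, 1] = (weight * labels[w]).sum() / denom
                conf[w, t, 0] = 1.0 - conf[w, t, 1]
        # E-step: posterior over true labels (uniform prior).
        log_p1 = np.zeros(n_tasks)
        log_p0 = np.zeros(n_tasks)
        for w in range(n_workers):
            log_p1 += np.log(conf[w, 1, labels[w]] + 1e-9)
            log_p0 += np.log(conf[w, 0, labels[w]] + 1e-9)
        q = 1.0 / (1.0 + np.exp(log_p0 - log_p1))
    return q, conf
```

The diagonal entries of each worker's confusion matrix summarize how reliable that worker is on each true class, which is exactly the quantity a quality based reward can be tied to.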

Other quality estimation methods include [47], which learns a prediction model in the absence of gold labels with uncertainty-aware learning and soft-label updating to analyze participant performance; [48], which applies reinforcement learning to dynamically decide when and how many gold questions to insert to estimate participant reliability; and [49], which evaluates different strategies, e.g., beta reputation, Gompertz function, robust averaging, maximum likelihood estimation, and a Bayesian model, to predict participant trustworthiness.
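Of these strategies, beta reputation is the simplest to state; a minimal sketch, assuming each submitted report can be scored as acceptable or not:

```python
class BetaReputation:
    """Beta reputation: trustworthiness as the posterior mean of a
    Beta(alpha, beta) distribution over a participant's success rate."""

    def __init__(self):
        self.alpha = 1.0  # prior pseudo-count of good reports
        self.beta = 1.0   # prior pseudo-count of bad reports

    def update(self, report_ok: bool) -> None:
        if report_ok:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def trust(self) -> float:
        return self.alpha / (self.alpha + self.beta)
```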

7 DISCUSSION AND LIMITATIONS

In this section, we discuss the limitations and possible improvements of our quality based incentive mechanism for crowdsensing.

Generally, an issue that lacks discussion and verification in the literature of crowdsensing is the detailed definition and determination of quality of data in various crowdsensing based applications.

We have considered analyzing the data readings to estimate the quality of data. In such environmental monitoring scenarios, the spatial and temporal contexts, e.g., location accuracy and sampling frequency, also matter a lot to the quality of data. For other applications where there is no clearly calculable metric like dB level for quality analysis, we should take other metrics (e.g., accessibility, value-added, and ease of understanding) into consideration. Quantifying data quality is not uniquely difficult in crowdsensing; machine learning techniques and manual judgment may be required to address it.
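As an illustration of how several such metrics might be folded into one score, a hedged sketch; all attribute names and thresholds below are hypothetical, not part of our mechanism:

```python
def quality_score(reading_dev_db, loc_error_m, sample_rate_hz,
                  max_dev_db=10.0, max_loc_error_m=50.0, target_rate_hz=1.0):
    """Hypothetical composite quality score in [0, 1]: reading accuracy,
    location accuracy, and sampling frequency each contribute one factor.
    All thresholds are illustrative assumptions."""
    reading_q = max(0.0, 1.0 - reading_dev_db / max_dev_db)
    location_q = max(0.0, 1.0 - loc_error_m / max_loc_error_m)
    temporal_q = min(1.0, sample_rate_hz / target_rate_hz)
    return reading_q * location_q * temporal_q
```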

As to the incentive part, we have evaluated our quality based incentive mechanism and compared it to the uniform pricing scheme. Another promising branch of pricing mechanisms is based on reverse auction, where participants bid to sell their sensing data to the service provider. While desirable properties like truthfulness can be achieved there, such mechanisms rest on the assumption of perfect rationality and raise concerns about the privacy revelation of participants, both of which need further study.

We should further search for and go beyond the current reputation systems and quality aware incentive mechanisms to address these issues, and conduct more practical experiments to improve and validate our mechanism for real-world applications.

8 CONCLUSION AND FUTURE WORK

In this paper, we have incorporated the consideration of data quality into the design of incentive mechanism for crowdsensing. By applying the expectation maximization algorithm and information theory, we have bridged the gap between quality of sensing data and proper reward for contribution, and proposed the quality based incentive mechanism, which achieves both individual rationality and (approximate) profit maximization. Our incentive mechanism estimates the effort matrix for each participant, calculates the quality of sensing data, and offers a reward in accordance with each effective contribution, aiming to motivate individual participants with different sensing costs to place sufficient manual efforts and submit high quality sensing data in crowdsensing. We have also implemented part of the mechanism with extensive experiments and simulations. Compared to the existing data collection model and uniform pricing scheme, our mechanism achieves superior performance in quality assurance and profit management.

As for future work, we are interested in designing incentive mechanisms that can analyze the quality of sensing data in a more comprehensive way and steer the participants to improve their sensing quality over time in large-scale deployments.

ACKNOWLEDGMENTS

This work was supported in part by the State Key Development Program for Basic Research of China (973 project 2014CB340303), in part by China National Science Foundation grants 61672348, 61672353, 61422208, and 61472252, in part by the Shanghai Science and Technology fund 15220721300, and in part by the Scientific Research Foundation for the Returned Overseas Chinese Scholars. The opinions, findings, conclusions, and recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the funding agencies or the government.

REFERENCES

[1] J. A. Burke, et al., "Participatory sensing," Center Embedded Netw. Sens., 2006.

[2] R. K. Ganti, F. Ye, and H. Lei, "Mobile crowdsensing: Current state and future challenges," IEEE Commun. Mag., vol. 49, no. 11, pp. 32–39, Nov. 2011.

[3] "Opensense," 2010. [Online]. Available: http://www.opensense.ethz.ch/trac/

[4] "Creek watch," 2010. [Online]. Available: http://creekwatch.researchlabs.ibm.com/

[5] N. Maisonneuve, M. Stevens, M. Niessen, and L. Steels, "NoiseTube: Measuring and mapping noise pollution with mobile phones," in Proc. Inf. Technol. Environ. Eng., 2009, pp. 215–228.

[6] "Noise monitoring," 2015. [Online]. Available: http://www.larsondavis.com/

[7] P. Mohan, V. N. Padmanabhan, and R. Ramjee, "Nericell: Rich monitoring of road and traffic conditions using mobile smartphones," in Proc. 6th ACM Conf. Embedded Netw. Sens. Syst., 2008, pp. 323–336.

[8] Y. Wang, X. Liu, H. Wei, G. Forman, C. Chen, and Y. Zhu, "CrowdAtlas: Self-updating maps for cloud and personal use," in Proc. 11th Annu. Int. Conf. Mobile Syst. Appl. Serv., 2013, pp. 469–470.

[9] J. Biagioni, T. Gerlich, T. Merrifield, and J. Eriksson, "EasyTracker: Automatic transit tracking, mapping, and arrival time prediction using smartphones," in Proc. 9th ACM Conf. Embedded Netw. Sens. Syst., 2011, pp. 68–81.

[10] P. Zhou, Y. Zheng, and M. Li, "How long to wait?: Predicting bus arrival time with mobile phone based participatory sensing," in Proc. 10th ACM Conf. Embedded Netw. Sens. Syst., 2012, pp. 459–460.

[11] S. Mathur, et al., "ParkNet: Drive-by sensing of road-side parking statistics," in Proc. 8th Annu. Int. Conf. Mobile Syst. Appl. Serv., 2010, pp. 123–136.

[12] S. Nawaz, C. Efstratiou, and C. Mascolo, "ParkSense: A smartphone based sensing system for on-street parking," in Proc. 19th Annu. Int. Conf. Mobile Comput. Netw., 2013, pp. 75–86.

[13] M. Azizyan, I. Constandache, and R. Roy Choudhury, "SurroundSense: Mobile phone localization via ambience fingerprinting," in Proc. 15th Annu. Int. Conf. Mobile Comput. Netw., 2009, pp. 261–272.

[14] A. Rai, K. K. Chintalapudi, V. N. Padmanabhan, and R. Sen, "Zee: Zero-effort crowdsourcing for indoor localization," in Proc. 18th Annu. Int. Conf. Mobile Comput. Netw., 2012, pp. 293–304.

[15] J.-S. Lee and B. Hoh, "Sell your experiences: A market mechanism based incentive for participatory sensing," in Proc. IEEE Int. Conf. Pervasive Comput. Commun., 2010, pp. 60–68.

[16] D. Yang, G. Xue, X. Fang, and J. Tang, "Crowdsourcing to smartphones: Incentive mechanism design for mobile phone sensing," in Proc. 18th Annu. Int. Conf. Mobile Comput. Netw., 2012, pp. 173–184.

[17] Y. Zhang and M. Van Der Schaar, "Reputation-based incentive protocols in crowdsourcing applications," in Proc. 31st Annu. IEEE Int. Conf. Comput. Commun., 2012, pp. 2140–2148.

[18] D. Zhao, X.-Y. Li, and H. Ma, "How to crowdsource tasks truthfully without sacrificing utility: Online incentive mechanisms with budget constraint," in Proc. 33rd Annu. IEEE Int. Conf. Comput. Commun., 2014, pp. 1213–1221.

[19] L. Gao, F. Hou, and J. Huang, "Providing long-term participation incentive in participatory sensing," in Proc. 34th Annu. IEEE Int. Conf. Comput. Commun., 2015, pp. 2803–2811.

[20] P. Chandra, Y. Narahari, D. Mandal, and P. Dey, "Novel mechanisms for online crowdsourcing with unreliable, strategic agents," in Proc. 29th AAAI Conf. Artif. Intell., 2015, pp. 1256–1262.

[21] Y. Wei, Y. Zhu, H. Zhu, Q. Zhang, and G. Xue, "Truthful online double auctions for dynamic mobile crowdsourcing," in Proc. 34th IEEE Int. Conf. Comput. Commun., 2015, pp. 2074–2082.

[22] T. Luo, H.-P. Tan, and L. Xia, "Profit-maximizing incentive for participatory sensing," in Proc. 33rd Annu. IEEE Int. Conf. Comput. Commun., 2014, pp. 127–135.

[23] S. Reddy, D. Estrin, and M. Srivastava, "Recruitment framework for participatory sensing data collections," in Proc. IEEE Int. Conf. Pervasive Comput. Commun., 2010, pp. 138–155.

[24] M. H. Cheung, R. Southwell, F. Hou, and J. Huang, "Distributed time-sensitive task selection in mobile crowdsensing," in Proc. 16th ACM Int. Symp. Mobile Ad Hoc Netw. Comput., 2015, pp. 157–166.

[25] I. Koutsopoulos, "Optimal incentive-driven design of participatory sensing systems," in Proc. 32nd Annu. IEEE Int. Conf. Comput. Commun., 2013, pp. 1402–1410.

[26] R. Kawajiri, M. Shimosaka, and H. Kashima, "Steered crowdsensing: Incentive design towards quality-oriented place-centric crowdsensing," in Proc. 16th ACM Int. Conf. Ubiquitous Comput., 2014, pp. 691–701.

[27] Z. Feng, Y. Zhu, Q. Zhang, L. M. Ni, and A. V. Vasilakos, "TRAC: Truthful auction for location-aware collaborative sensing in mobile crowdsourcing," in Proc. 33rd Annu. IEEE Int. Conf. Comput. Commun., 2014, pp. 1231–1239.

[28] Q. Zhang, Y. Wen, X. Tian, X. Gan, and X. Wang, "Incentivize crowd labeling under budget constraint," in Proc. 34th IEEE Int. Conf. Comput. Commun., 2015, pp. 2812–2820.

[29] F. Restuccia, S. K. Das, and J. Payton, "Incentive mechanisms for participatory sensing: Survey and research challenges," ACM Trans. Sens. Netw., vol. 12, no. 2, pp. 13:1–13:40, 2016.

[30] S. Reddy, D. Estrin, M. Hansen, and M. Srivastava, "Examining micro-payments for participatory sensing data collections," in Proc. 12th ACM Int. Conf. Ubiquitous Comput., 2010, pp. 33–36.

[31] A. Papoutsaki, et al., "Crowdsourcing from scratch: A pragmatic experiment in data collection by novice requesters," in Proc. 3rd AAAI Conf. Human Comput. Crowdsourcing, 2015, pp. 140–149.

[32] A. Xu, X. Feng, and Y. Tian, "Revealing, characterizing, and detecting crowdsourcing spammers: A case study in community Q&A," in Proc. 34th IEEE Int. Conf. Comput. Commun., 2015, pp. 2533–2541.

[33] A. P. Dawid and A. M. Skene, "Maximum likelihood estimation of observer error-rates using the EM algorithm," Appl. Statist., vol. 28, pp. 20–28, 1979.

[34] J. Rice, Mathematical Statistics and Data Analysis. Boston, MA, USA: Cengage Learning, 2006.

[35] C. F. J. Wu, "On the convergence properties of the EM algorithm," Annals Statist., vol. 11, no. 1, pp. 95–103, 1983.

[36] Y. Zhang, X. Chen, D. Zhou, and M. I. Jordan, "Spectral methods meet EM: A provably optimal algorithm for crowdsourcing," in Proc. 28th Annu. Conf. Neural Inf. Process. Syst., 2014, pp. 1260–1268.

[37] R. W. Yeung, Information Theory and Network Coding. Berlin, Germany: Springer, 2008.

[38] C. E. Shannon, "A mathematical theory of communication," Bell Syst. Tech. J., vol. 27, pp. 379–423, 623–656, 1948.


[39] L. S. Shapley, "A value for n-person games," DTIC Document, Tech. Rep., 1952.

[40] "Noisetube," 2008. [Online]. Available: http://www.noisetube.net/

[41] F. Pukelsheim, "The three sigma rule," Amer. Statist., vol. 48, pp. 88–91, 1994.

[42] J. J. Horton and L. B. Chilton, "The labor economics of paid crowdsourcing," in Proc. 11th ACM Conf. Electron. Commerce, 2010, pp. 209–218.

[43] Z. Pan, H. Yu, C. Miao, and C. Leung, "Efficient collaborative crowdsourcing," in Proc. 30th AAAI Conf. Artif. Intell., 2016, pp. 4248–4249.

[44] L. Yu, P. André, A. Kittur, and R. Kraut, "A comparison of social, learning, and financial strategies on crowd engagement and output quality," in Proc. 17th ACM Conf. Comput. Supported Cooperative Work Social Comput., 2014, pp. 967–978.

[45] I. Krontiris and A. Albers, "Monetary incentives in participatory sensing using multi-attributive auctions," Int. J. Parallel Emerg. Distrib. Syst., vol. 27, no. 4, pp. 317–336, 2012.

[46] J. Wang and P. G. Ipeirotis, "Quality-based pricing for crowdsourced workers," New York Univ., Tech. Rep. 2451/31833, 2013.

[47] H. J. Jung and M. Lease, "Modeling temporal crowd work quality with limited supervision," in Proc. 3rd AAAI Conf. Human Comput. Crowdsourcing, 2015, pp. 83–91.

[48] J. Bragg, Mausam, and D. S. Weld, "Optimal testing for crowd workers," in Proc. 2016 Int. Conf. Autonomous Agents Multiagent Syst., 2016, pp. 966–974.

[49] E. Davami and G. Sukthankar, "Improving the performance of mobile phone crowdsourcing applications," in Proc. 2015 Int. Conf. Autonomous Agents Multiagent Syst., 2015, pp. 145–153.

Dan Peng received the PhD degree in the Department of Computer Science and Engineering at Shanghai Jiao Tong University, P. R. China, and the BS degree in computer science and technology from the Huazhong University of Science & Technology in 2012. Her research interests include algorithmic game theory, wireless networking, and mobile computing.

Fan Wu received the BS degree in computer science from Nanjing University, in 2004, and the PhD degree in computer science and engineering from the State University of New York at Buffalo, in 2009. He is an associate professor in the Department of Computer Science and Engineering, Shanghai Jiao Tong University, P. R. China. He has visited the University of Illinois at Urbana-Champaign (UIUC) as a post doc research associate. His current research interests include wireless networking, algorithmic mechanism design, and privacy preservation. He received the Excellent Young Scholar Award from Shanghai Jiao Tong University in 2011, and the Pujiang Scholar Award in 2012. He is a member of the ACM, CCF, and IEEE.

Guihai Chen received the BS degree from Nanjing University in 1984, the ME degree from Southeast University in 1987, and the PhD degree from the University of Hong Kong in 1997. He is a distinguished professor at Shanghai Jiao Tong University, China. He was invited as a visiting professor by many universities, including the Kyushu Institute of Technology, Japan, in 1998, the University of Queensland, Australia, in 2000, and Wayne State University, USA, during September 2001 to August 2003. He has a wide range of research interests with a focus on sensor networks, peer-to-peer computing, high-performance computer architecture, and combinatorics. He has published more than 200 peer-reviewed papers, and more than 120 of them are in well-archived international journals such as the IEEE Transactions on Parallel and Distributed Systems, Journal of Parallel and Distributed Computing, Wireless Networks, The Computer Journal, International Journal of Foundations of Computer Science, and Performance Evaluation, and also in well-known conference proceedings such as HPCA, MOBIHOC, INFOCOM, ICNP, ICPP, IPDPS, and ICDCS. He is a senior member of the IEEE.

" For more information on this or any other computing topic,please visit our Digital Library at www.computer.org/publications/dlib.

PENG ET AL.: DATA QUALITY GUIDED INCENTIVE MECHANISM DESIGN FOR CROWDSENSING 319