Socially-Optimal Design of Resource Exchange Systems with Reputation Update Errors

Yuanzhang Xiao, Yu Zhang and Mihaela van der Schaar

Department of Electrical Engineering, UCLA

Email: [email protected]

October 17, 2012

Resource Exchange Systems

A general resource exchange system:

A set of users $\mathcal{N} \triangleq \{1, \ldots, N\}$

Users have resources valuable to the others

Users stay in the system for a long period of time, $t = 0, 1, 2, \ldots$

At each period $t$:

Each user (as a client) requests resources

Each user (as a server) is matched to a client

Each server chooses its effort level in providing resources

Applications:

Yahoo! Answers: knowledge

Crowdsourcing Platforms: labor

Peer-to-peer Systems: data

The System Model

Assumptions:

Two effort levels: low and high ($\{0, 1\}$)

Clients incur no cost in requesting

Homogeneous users:

A server's cost of exerting high (or low) effort is the same across users.

A client's benefit from high (or low) effort is the same across users.

No monetary exchange

(Stage-game) Model:

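A gift-giving game between a client and a server (payoffs listed as client, server):

             high effort    low effort
request      $(b, -c)$      $(0, 0)$

A matching: $m : \mathcal{N} \to \mathcal{N}$, $i \mapsto m(i)$

Set of matchings: $M = \{m : m(i) \neq i, \ \forall i \in \mathcal{N}\}$

A matching rule: $\mu : M \to \Delta(M)$

Focus on uniformly random matching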
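The stage game and the uniformly random matching can be sketched in Python. This is a minimal sketch under stated assumptions: matchings are taken to be one-to-one (every user serves exactly one client and is served by exactly one server), which the slide does not state explicitly, and the function names `random_matching` and `stage_payoffs` are illustrative, not taken from the talk.

```python
import random


def random_matching(n, rng):
    """Sample a matching m with m[i] != i, uniformly among one-to-one matchings.

    m[i] is the client served by server i. Rejection sampling: shuffle a
    permutation and retry until no user is matched to itself.
    ASSUMPTION: matchings are one-to-one (not stated on the slide).
    """
    if n < 2:
        raise ValueError("need at least two users")
    while True:
        m = list(range(n))
        rng.shuffle(m)
        if all(m[i] != i for i in range(n)):
            return m


def stage_payoffs(m, efforts, b, c):
    """Gift-giving payoffs: high effort gives the client b and costs the server c.

    efforts[i] in {0, 1} is the effort of server i toward client m[i].
    """
    payoff = [0.0] * len(m)
    for server, client in enumerate(m):
        payoff[client] += b * efforts[server]
        payoff[server] -= c * efforts[server]
    return payoff
```

If every server exerts high effort, each user receives $b$ as a client and pays $c$ as a server, so each stage payoff is $b - c$.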

Reputation Mechanisms

Reputation summarizes past behavior:

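Assign each user $i$ a reputation $\theta_i \in \Theta \triangleq \{0, 1\}$

Reputation profile: $\boldsymbol{\theta} \in \Theta^N$ (unknown to users)

Reputation distribution: $s(\boldsymbol{\theta}) = (s_0(\boldsymbol{\theta}), s_1(\boldsymbol{\theta}))$ (known)

Model with Reputation Mechanisms (in each period $t$):

The platform displays $s(\boldsymbol{\theta})$, $\boldsymbol{\theta} \in \Theta^N$, and announces the recommended action $\alpha_0 : \Theta \times \Theta \to \{0, 1\}$

Each user $i$ requests resources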

Each user $i$ is matched as a server to user $m(i)$ with probability $\mu(m)$

Each user $i$ is informed by the platform of its client's reputation $\theta_{m(i)}$

Based on its action $\alpha_i$, each user $i$ chooses its effort level $\alpha_i(\theta_{m(i)}, \theta_i)$

Each client reports its (erroneous) assessment of the server's effort level to the platform

The platform updates the reputation profile based on the reputation update rule $\tau : \Theta \times \Theta \times \{0, 1\} \to \Delta(\Theta)$.

An Illustrating Example

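Altruistic action: $\alpha^a(\theta_r, \theta_w) = 1, \ \forall \theta_r, \theta_w \in \{0, 1\}$

Fair action: $\alpha^f(\theta_r, \theta_w) = \begin{cases} 0, & \theta_w > \theta_r \\ 1, & \theta_w \leq \theta_r \end{cases}$

Selfish action: $\alpha^s(\theta_r, \theta_w) = 0, \ \forall \theta_r, \theta_w \in \{0, 1\}$

$A^{afs} = \{\alpha^a, \alpha^f, \alpha^s\}$, $A^{as} = \{\alpha^a, \alpha^s\}$

Erroneous report (with error probability $\epsilon$):

$$R(\tilde{z} \mid z) = \begin{cases} 1 - \epsilon, & \tilde{z} = z \\ \epsilon, & \tilde{z} \neq z \end{cases}$$

Reputation update rule:

$$\tau(\theta_s' \mid \theta_c, \theta_s, z) = \begin{cases} \beta_{\theta_s}^+, & \theta_s' = 1, \ z \geq \alpha_0(\theta_c, \theta_s) \\ 1 - \beta_{\theta_s}^+, & \theta_s' = 0, \ z \geq \alpha_0(\theta_c, \theta_s) \\ 1 - \beta_{\theta_s}^-, & \theta_s' = 1, \ z < \alpha_0(\theta_c, \theta_s) \\ \beta_{\theta_s}^-, & \theta_s' = 0, \ z < \alpha_0(\theta_c, \theta_s) \end{cases} \quad \text{for } \theta_s = 0, 1.$$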
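The three actions, the erroneous report, and the reputation update rule of this example can be sketched in Python. This is a sketch under assumptions: the first argument of each action is read as the client's reputation and the second as the server's, the update probabilities are called `beta_plus` and `beta_minus` (indexed by the server's current reputation), and all function names are illustrative.

```python
import random


def altruistic(theta_client, theta_server):
    """Always recommend high effort."""
    return 1


def fair(theta_client, theta_server):
    """Low effort only when the server's reputation exceeds the client's."""
    return 0 if theta_server > theta_client else 1


def selfish(theta_client, theta_server):
    """Always recommend low effort."""
    return 0


def noisy_report(z, eps, rng):
    """Report the true effort z, flipped with error probability eps."""
    return 1 - z if rng.random() < eps else z


def prob_next_reputation_high(theta_server, z, recommended_effort,
                              beta_plus, beta_minus):
    """Probability that the server's next reputation is 1.

    If the reported effort z meets the recommendation, the server moves to
    reputation 1 with probability beta_plus[theta_server]; otherwise it does
    so with probability 1 - beta_minus[theta_server].
    """
    if z >= recommended_effort:
        return beta_plus[theta_server]
    return 1.0 - beta_minus[theta_server]


def update_reputation(theta_server, z, recommended_effort,
                      beta_plus, beta_minus, rng):
    """Draw the server's next reputation from the update rule."""
    p = prob_next_reputation_high(theta_server, z, recommended_effort,
                                  beta_plus, beta_minus)
    return 1 if rng.random() < p else 0
```

Because the platform only sees the noisy report, a server that follows the recommendation can still be treated as a deviator with probability `eps`, which is the reputation update error the design has to tolerate.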

Existing Works

The idea of social norms and reputation:

Kandori, 1992

Does not work under reputation update errors

Experimental works:

Feldman, Papadimitriou, Chuang, and Stoica, 2006, etc.

Theoretical design framework:

Dellarocas, 2005

No differential punishment

Zhang and van der Schaar, 2011-2012

Stationary Markov strategies

Stochastic Game Formulation

Stochastic game:

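Players: the users and the platform, $\mathcal{N} \cup \{0\}$

State: reputation profile $\boldsymbol{\theta} \in \Theta^N$

Action set: $A \triangleq \{\alpha \mid \alpha : \Theta \times \Theta \to \{0, 1\}\}$

Stage-game payoff: $u_i(\boldsymbol{\theta}^t, \pi_0(h^t), \boldsymbol{\pi}(h^t))$

History at period $t$: $h^t = (\boldsymbol{\theta}^0, \ldots, \boldsymbol{\theta}^t) \in \mathcal{H}^t$

Strategy: $\pi_i : \bigcup_{t=0}^{\infty} \mathcal{H}^t \to A$, $i = 0, 1, \ldots, N$

Strategy profile: $\boldsymbol{\pi} = (\pi_1, \ldots, \pi_N)$

Recommended action $\alpha_0$ and recommended strategy $\pi_0$

Overall payoff:

$$U_i(\boldsymbol{\theta}^0, \pi_0, \boldsymbol{\pi}) = E_{h^\infty} \left\{ (1 - \delta) \sum_{t=0}^{\infty} \delta^t u_i(\boldsymbol{\theta}^t, \pi_0(h^t), \boldsymbol{\pi}(h^t)) \right\}.$$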
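The overall payoff is a normalized discounted sum of stage payoffs. A minimal Python sketch for one realized, finite payoff stream follows; the name `discounted_average` and the truncation to a finite horizon are assumptions made for illustration, and the expectation over histories is not computed here.

```python
def discounted_average(stage_payoffs, delta):
    """Compute (1 - delta) * sum_t delta**t * u_t over a finite payoff stream."""
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must lie in [0, 1)")
    total = 0.0
    weight = 1.0  # equals delta**t at the start of iteration t
    for u in stage_payoffs:
        total += weight * u
        weight *= delta
    return (1.0 - delta) * total
```

The factor $(1 - \delta)$ puts the overall payoff on the same scale as a stage payoff: a constant stream $u$ over $T$ periods evaluates to $u(1 - \delta^T)$, which tends to $u$ as $T$ grows.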

Restrictions on Strategies

Symmetric strategy profile: $\pi \cdot \mathbf{1}_N$

Feasible strategy:

Definition (Feasible Strategy)

A strategy $\pi$ is feasible if, for all $t \geq 0$ and for all $h^t, \tilde{h}^t \in \mathcal{H}^t$, we have

$\pi(h^t) = \pi(\tilde{h}^t)$, if $s(\boldsymbol{\theta}^k) = s(\tilde{\boldsymbol{\theta}}^k)$, $k = 0, 1, \ldots, t$.

We write the set of all feasible strategies as $\Pi_f$.

Set of symmetric feasible strategies restricted to the subset of actions $B \subseteq A$: $\Pi_f(B)$

We focus on symmetric feasible strategy profiles restricted to a subset of actions.
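Feasibility means a strategy may depend on the history only through the reputation distributions, not through who holds which reputation. A small Python sketch of that property follows; the decision rule inside `example_feasible_strategy` is hypothetical and chosen only to illustrate the definition.

```python
def reputation_distribution(profile):
    """Return (s0, s1): the number of users with reputation 0 and with reputation 1."""
    s1 = sum(profile)
    return (len(profile) - s1, s1)


def example_feasible_strategy(history):
    """Pick an action label from a history of reputation profiles.

    The history is first reduced to reputation distributions, so any two
    histories with the same distributions in every period get the same action.
    HYPOTHETICAL rule: altruistic ("a") if at least as many users have
    reputation 1 as reputation 0 in the latest period, fair ("f") otherwise.
    """
    distributions = [reputation_distribution(p) for p in history]
    s0, s1 = distributions[-1]
    return "a" if s1 >= s0 else "f"
```

For instance, the histories `[(0, 1, 1), (1, 0, 0)]` and `[(1, 1, 0), (0, 0, 1)]` differ in who holds which reputation but share the same distributions in both periods, so the strategy returns the same action for both.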

Equilibrium Definition

Continuation strategy: $\pi_i|_{h^k}(h^t) = \pi_i(h^k h^t)$

Definition

A pair of a feasible recommended strategy and a symmetric feasible strategy profile $(\pi_0, \pi \cdot \mathbf{1}_N) \in \Pi_f \times \Pi_f^N$ is a SF-PPE if, for all $t \geq 0$, for all $h^t \in \mathcal{H}^t$, and for all $i \in \mathcal{N}$, we have $\pi_i|_{h^t}$ ... $\Pi_f$ ...

The Platform Designer's Problem

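Maximize the social welfare at the equilibrium in the worst case (with respect to different initial reputation distributions):

$$\max_{\tau, \ (\pi_0, \pi \cdot \mathbf{1}_N) \in \Pi_f \times \Pi_f^N} \ \min_{\boldsymbol{\theta}^0 \in \Theta^N} \ \frac{1}{N} \sum_{i \in \mathcal{N}} U_i(\boldsymbol{\theta}^0, \pi_0, \pi \cdot \mathbf{1}_N)$$

s.t. $(\pi_0, \pi \cdot \mathbf{1}_N)$ is a SF-PPE.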

Asymptotically Social Optimal Design

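Theorem (Asymptotically Achieve Social Optimum)

Choose any small real numbers $\epsilon_1 > 0$ and $\epsilon_0 \in (\epsilon_1, \ldots \epsilon_1)$. If the following three sets of conditions are satisfied

Condition 1: $\beta_1^+ > 1 - \beta_1^-$ and $x_1^+ \triangleq (1 - \epsilon)\beta_1^+ + \epsilon(1 - \beta_1^-) > \dfrac{1}{1 + \frac{c}{(N-1)b}}$;

Condition 2: $\beta_0^+ > 1 - \beta_0^-$ and $x_0^+ \triangleq (1 - \epsilon)\beta_0^+ + \epsilon(1 - \beta_0^-) \in \left( \dfrac{1 - x_1^+}{\frac{c}{(N-1)b}}, \ \dfrac{1 + \left[1 + \frac{c}{(N-1)b}\right](1 - x_1^+)}{\left[1 + \frac{c}{(N-1)b}\right]^2} \right)$;

Condition 3: $\delta \in [\underline{\delta}, 1)$, where

$$\underline{\delta} = \max \left\{ \frac{\epsilon_0 - \epsilon_1 - \left(b + \frac{c}{N-1}\right)}{(\epsilon_0 - \epsilon_1)\left(\frac{1}{N-1}\beta_1^+ + \frac{N-2}{N-1}x_1^+\right) - \left(b + \frac{c}{N-1}\right)}, \ \max_{\theta \in \{0, 1\}} \frac{c}{c + (1 - 2\epsilon)\left(\beta_\theta^+ - (1 - \beta_\theta^-)\right)(\epsilon_0 - \epsilon_1)} \right\};$$

then there exists a SF-PPE $(\pi_0, \pi_0 \cdot \mathbf{1}_N) \in \Pi_f(A^{afs}) \times \Pi_f^N(A^{afs})$ that achieves $U_i(\boldsymbol{\theta}^0, \pi_0, \pi_0 \cdot \mathbf{1}_N) = b - c - \epsilon_{\theta_i}$ for all $i \in \mathcal{N}$, starting from any initial reputation profile $\boldsymbol{\theta}^0$.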
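Conditions 1 and 2 restrict the update probabilities of the reputation update rule and can be checked numerically. The Python sketch below assumes the conditions read as follows, with $\kappa = c / ((N-1)b)$ and $x_\theta^+ = (1 - \epsilon)\beta_\theta^+ + \epsilon(1 - \beta_\theta^-)$: Condition 1 requires $\beta_1^+ > 1 - \beta_1^-$ and $x_1^+ > 1/(1 + \kappa)$; Condition 2 requires $\beta_0^+ > 1 - \beta_0^-$ and $(1 - x_1^+)/\kappa < x_0^+ < (1 + (1 + \kappa)(1 - x_1^+))/(1 + \kappa)^2$. The symbol names and the function `check_conditions` are assumptions, and Condition 3 on the discount factor is not checked.

```python
def x_plus(beta_plus, beta_minus, eps):
    """Probability of reaching reputation 1 after high effort, given report errors."""
    return (1 - eps) * beta_plus + eps * (1 - beta_minus)


def check_conditions(b, c, n, eps, beta_plus, beta_minus):
    """Return (condition_1_holds, condition_2_holds) for the given parameters.

    beta_plus and beta_minus map a reputation in {0, 1} to an update probability.
    ASSUMPTION: the inequalities are as stated in the lead-in above.
    """
    kappa = c / ((n - 1) * b)
    x1 = x_plus(beta_plus[1], beta_minus[1], eps)
    x0 = x_plus(beta_plus[0], beta_minus[0], eps)

    cond1 = beta_plus[1] > 1 - beta_minus[1] and x1 > 1 / (1 + kappa)

    lower = (1 - x1) / kappa
    upper = (1 + (1 + kappa) * (1 - x1)) / (1 + kappa) ** 2
    cond2 = beta_plus[0] > 1 - beta_minus[0] and lower < x0 < upper
    return cond1, cond2
```

Under this reading, the interval for $x_0^+$ in Condition 2 is non-empty exactly when $x_1^+ > 1/(1 + \kappa)$, that is, when the second part of Condition 1 holds. For example, with $b = 3$, $c = 1$, $N = 10$, $\epsilon = 0.01$, $\beta_1^+ = \beta_1^- = 1$ and $\beta_0^+ = \beta_0^- = 0.6$, both conditions hold, whereas setting $\beta_0^+ = \beta_0^- = 1$ pushes $x_0^+$ above the upper bound and violates Condition 2.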

The Recommended Strategy

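Require: $\epsilon_0$, $\epsilon_1$, ..., $s$
Initialization: $t = 0$, $v_\theta = b - c - \epsilon_\theta$.
repeat
    if $s_1(\boldsymbol{\theta}) = 0$ then
        if $v_0$ large then
            $\alpha_0^t = \alpha^t = \alpha^a$, update $v_0$ and $v_1$
        else
            $\alpha_0^t = \alpha^t = \alpha^s$, update $v_0$ and $v_1$
        end
    else if $s_1(\boldsymbol{\theta}) = N$ then
        if $v_1$ large then
            $\alpha_0^t = \alpha^t = \alpha^a$, update $v_0$ and $v_1$
        else
            $\alpha_0^t = \alpha^t = \alpha^s$, update $v_0$ and $v_1$
        end
    else
        if $v_1$ close to $v_0$ then
            $\alpha_0^t = \alpha^t = \alpha^a$, update $v_0$ and $v_1$
        else
            $\alpha_0^t = \alpha^t = \alpha^f$, update $v_0$ and $v_1$
        end
    end
    $t \leftarrow t + 1$
until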
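The branching in the recommended strategy can be sketched as a single Python function. This is only a sketch: the slide does not say what "large" or "close" mean, so `large_threshold` and `close_tolerance` are hypothetical parameters, and the update of the continuation payoffs `v0` and `v1` is left out because the slide does not specify it.

```python
def recommended_action(s1, n, v0, v1, large_threshold, close_tolerance):
    """Choose the recommended action label for the current period.

    s1 is the number of users with reputation 1, n the number of users,
    and v0, v1 the continuation payoffs promised to reputation 0 and 1.
    HYPOTHETICAL: "large" is modelled as v >= large_threshold and
    "close" as |v1 - v0| <= close_tolerance.
    """
    if s1 == 0:  # every user has reputation 0
        return "a" if v0 >= large_threshold else "s"
    if s1 == n:  # every user has reputation 1
        return "a" if v1 >= large_threshold else "s"
    # Mixed population: use the fair action to punish differentially.
    return "a" if abs(v1 - v0) <= close_tolerance else "f"
```

The fair action only appears when both reputations are present, since it is the only one of the three that treats users differently by reputation. Because the returned action depends on the current continuation payoffs and not only on the current reputation distribution, the resulting strategy is nonstationary.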

Conclusions

Differential punishment

Nonstationary Markov strategies
