Intrinsic Robustness of the Price of Anarchy: “Smoothness”
Post on 24-Feb-2016
Ofir Chen for 2014 POA Seminar by Prof. Michal Feldman
Paper by Tim Roughgarden ‘13
Intrinsic Robustness of the Price of Anarchy
“Smoothness”
Introduction
The presentation is divided into 3 parts: the pure-NE case, extensions to other equilibrium concepts, and a proof that the result is tight for congestion games.
We’d like to bound the cost of a unilateral deviation from any strategy to any other strategy.
Comparing any 2 outcomes in this way bounds the POA.
Because the bound covers all outcomes, it has implications beyond games with a pure NE (PNE), hence the “intrinsic” property.
A smooth game automatically has a robust POA bound.
Finally, we’ll see that the achieved bound is tight for congestion games.
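Smoothness in Pure-NE
Definition (smoothness): a cost-minimization game is $(\lambda, \mu)$-smooth if for any 2 outcomes $s$ and $s^*$:
$$\sum_i C_i(s^*_i, s_{-i}) \le \lambda \cdot C(s^*) + \mu \cdot C(s)$$
Here $C(s) = \sum_i C_i(s)$ is the total cost, with $\lambda > 0$ and $\mu < 1$.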
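Smoothness in Pure-NE
Pure-NE key equation: $C_i(s) \le C_i(s^*_i, s_{-i})$ for every player $i$ and every outcome $s^*$.
Immediate result for $(\lambda, \mu)$-smooth games, for a pure NE $s$ and an optimal outcome $s^*$:
$$C(s) = \sum_i C_i(s) \le \sum_i C_i(s^*_i, s_{-i}) \le \lambda \cdot C(s^*) + \mu \cdot C(s)$$
The first inequality is the NE condition and the second is smoothness. Rearranging gives $C(s)/C(s^*) \le \lambda/(1-\mu)$.
Definition (robust POA): $\inf \{ \lambda/(1-\mu) : \text{the game is } (\lambda, \mu)\text{-smooth} \}$.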
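Ex-1: Congestion Games
Definition (congestion game): a cost-minimization game that has a ground set $E$ of resources, strategy sets $S_i \subseteq 2^E$, and a non-decreasing cost function $c_e$ for every resource $e \in E$.
Example: atomic selfish routing. $E$ is the set of edges and the strategies are paths on the graph.
$s_i$: the set of edges used by player $i$.
$x_e$: the load on edge $e$, i.e. the number of players whose strategy contains $e$.
Player $i$'s cost under strategy $s_i$: $C_i(s) = \sum_{e \in s_i} c_e(x_e)$.
Claim: with affine cost functions $c_e(x) = a_e x + b_e$ ($a_e, b_e \ge 0$), the game is $(5/3, 1/3)$-smooth (POA $\le 5/2$).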
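Ex-1: Congestion Games (Cont.)
Proof: mathematical fact (without proof) for non-negative integers $y, z$: $y(z+1) \le \tfrac{5}{3} y^2 + \tfrac{1}{3} z^2$.
For outcomes $s, s^*$ with loads $x_e, x^*_e$ and affine costs $c_e(x) = a_e x + b_e$:
$$\sum_i C_i(s^*_i, s_{-i}) \le \sum_{e \in E} (a_e (x_e + 1) + b_e) \cdot x^*_e \quad (*)$$
$$\le \sum_{e \in E} \tfrac{5}{3} (a_e x^*_e + b_e) \cdot x^*_e + \sum_{e \in E} \tfrac{1}{3} (a_e x_e + b_e) \cdot x_e = \tfrac{5}{3} C(s^*) + \tfrac{1}{3} C(s)$$
(*) Under a unilateral deviation, the load on $e$ in $(s^*_i, s_{-i})$ is at most 1 more than in $s$, and this cost is suffered by exactly $x^*_e$ players.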
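The claim can be sanity-checked by brute force. The sketch below is only an illustration: the 3-player instance, its resource names and its coefficients are hypothetical and do not come from the slides. It checks the mathematical fact on a finite range and then tests the smoothness inequality over every pair of outcomes.

```python
from fractions import Fraction
from itertools import product

LAM, MU = Fraction(5, 3), Fraction(1, 3)

def fact_holds(limit=50):
    """Check y(z+1) <= 5/3*y^2 + 1/3*z^2 for all integers 0 <= y, z < limit."""
    return all(y * (z + 1) <= LAM * y * y + MU * z * z
               for y in range(limit) for z in range(limit))

# Hypothetical toy instance: resource -> (a_e, b_e) for c_e(x) = a_e*x + b_e.
COSTS = {"e1": (1, 0), "e2": (2, 1), "e3": (1, 2)}
# Every player may pick any of these resource sets (an assumption of this sketch).
STRATEGIES = [frozenset({"e1"}), frozenset({"e2"}),
              frozenset({"e1", "e3"}), frozenset({"e2", "e3"})]
N_PLAYERS = 3

def player_cost(i, outcome):
    """C_i(s): sum of c_e(x_e) over the resources player i uses."""
    load = {e: sum(e in s for s in outcome) for e in COSTS}
    return sum(COSTS[e][0] * load[e] + COSTS[e][1] for e in outcome[i])

def total_cost(outcome):
    return sum(player_cost(i, outcome) for i in range(len(outcome)))

def is_smooth(lam, mu):
    """Brute-force the smoothness inequality over all pairs of outcomes."""
    outcomes = list(product(STRATEGIES, repeat=N_PLAYERS))
    for s, s_star in product(outcomes, repeat=2):
        deviations = sum(player_cost(i, s[:i] + (s_star[i],) + s[i + 1:])
                         for i in range(N_PLAYERS))
        if deviations > lam * total_cost(s_star) + mu * total_cost(s):
            return False
    return True
```

On this instance `is_smooth(LAM, MU)` holds, while the tighter pair `(1, 0)` already fails for `s = (e1, e1, e1)` and `s* = (e1, e1, e2)`.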
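Ex-2: Valid Utility Games
Definition (valid utility game): a payoff-maximization game with payoff functions $\Pi_i$, a ground set $E$, strategies $s_i \subseteq E$, and a non-negative submodular function $V$ on the subsets of $E$.
Definition (submodular): $V(X \cap Y) + V(X \cup Y) \le V(X) + V(Y)$ for all $X, Y \subseteq E$.
Target function: $\Pi(s) = V(U(s))$, where $U(s) = \bigcup_i s_i$.
2 conditions must hold:
(1) $\Pi_i(s) \ge \Pi(s) - \Pi(\emptyset, s_{-i})$
(2) $\sum_i \Pi_i(s) \le \Pi(s)$ (a relaxation of sum objective functions)
Example: players are trying to win ‘red’ tokens on a board by placing their own tokens. Each red token goes to the player with the closest token. A player's payoff is the number of red tokens it wins.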
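Ex-2: Valid Utility Games (Cont.)
Claim: if $V$ is non-decreasing, VUGs are $(1,1)$-smooth, with POA at least 1/2.
Note: smoothness is redefined for payoff-maximization games: $\sum_i \Pi_i(s^*_i, s_{-i}) \ge \lambda \cdot \Pi(s^*) - \mu \cdot \Pi(s)$, which gives POA $\ge \lambda/(1+\mu)$.
Proof: let $Z_i$ be the union of all players' strategies in $s$, together with the strategies of players $1 \dots i$ in $s^*$:
$$\sum_i \Pi_i(s^*_i, s_{-i}) \ge \sum_i [V(U(s^*_i, s_{-i})) - V(U(\emptyset, s_{-i}))] \quad \text{(Cond. 1: } \Pi_i(s) \ge \Pi(s) - \Pi(\emptyset, s_{-i}) \text{)}$$
$$\ge \sum_i [V(Z_i) - V(Z_{i-1})] \quad \text{(submodularity)}$$
$$\ge \Pi(s^*) - \Pi(s) \quad \text{(telescoping sum)}$$
In the last step $Z_0 = U(s)$, the last $Z_i$ contains $U(s^*)$, and $V$ is non-decreasing.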
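A small brute-force check of the $(1,1)$-smoothness claim, as a sketch under stated assumptions: the capped-coverage value `V`, the cap `CAP`, the strategy lists and the choice of marginal-contribution payoffs are all hypothetical, picked so that $V$ is non-decreasing and submodular and condition (1) holds with equality.

```python
from itertools import product

CAP = 3  # hypothetical saturation level

def V(items):
    """Non-decreasing submodular value: number of covered items, capped at CAP."""
    return min(len(items), CAP)

def welfare(outcome):
    """Pi(s) = V(U(s)), with U(s) the union of the players' strategies."""
    return V(frozenset().union(*outcome))

def payoff(i, outcome):
    """Player i's marginal contribution, so condition (1) holds with equality."""
    without_i = outcome[:i] + (frozenset(),) + outcome[i + 1:]
    return welfare(outcome) - welfare(without_i)

# Hypothetical strategy lists, one per player; the empty set is always allowed.
STRATEGIES = [
    [frozenset(), frozenset("ab"), frozenset("c")],
    [frozenset(), frozenset("bc"), frozenset("d")],
    [frozenset(), frozenset("a"), frozenset("cde")],
]
N_PLAYERS = len(STRATEGIES)
OUTCOMES = list(product(*STRATEGIES))

def condition_2_holds():
    """Condition (2): the payoffs sum to at most the welfare."""
    return all(sum(payoff(i, s) for i in range(N_PLAYERS)) <= welfare(s)
               for s in OUTCOMES)

def is_1_1_smooth():
    """Check sum_i Pi_i(s*_i, s_-i) >= Pi(s*) - Pi(s) for all pairs of outcomes."""
    for s, s_star in product(OUTCOMES, repeat=2):
        lhs = sum(payoff(i, s[:i] + (s_star[i],) + s[i + 1:])
                  for i in range(N_PLAYERS))
        if lhs < welfare(s_star) - welfare(s):
            return False
    return True
```

Both `condition_2_holds()` and `is_1_1_smooth()` should return `True` for any non-decreasing submodular `V` used with marginal-contribution payoffs.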
Other Equilibria
The equilibrium concepts are nested: Pure NE (PNE) $\subseteq$ Mixed NE (MNE) $\subseteq$ Correlated Equilibria (CE) $\subseteq$ Coarse Correlated Equilibria (CCE).
PNE: don’t always exist.
MNE: always exist, but are hard to compute.
CE: easy to compute.
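Mixed Strategies Games
Model: players can play their strategies with some probability.
Definition (mixed NE): a product $\sigma$ of independent distributions over the players’ strategies under which no player can decrease its expected cost: $E_{s \sim \sigma}[C_i(s)] \le E_{s \sim \sigma}[C_i(s'_i, s_{-i})]$ for every player $i$ and strategy $s'_i$.
Lemma (smoothness of a distribution): for a $(\lambda, \mu)$-smooth game, when $\sigma$ is a distribution of independent outcomes, for any $s^*$:
$$\sum_i E_{s \sim \sigma}[C_i(s^*_i, s_{-i})] \le \lambda \cdot C(s^*) + \mu \cdot E_{s \sim \sigma}[C(s)]$$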
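Mixed Strategies Smooth Games
Proof, for a distribution $\sigma$ over the outcomes of a $(\lambda, \mu)$-smooth game:
$$\sum_i E_{s \sim \sigma}[C_i(s^*_i, s_{-i})] = E_{s \sim \sigma}\Big[\sum_i C_i(s^*_i, s_{-i})\Big] \le E_{s \sim \sigma}[\lambda \cdot C(s^*) + \mu \cdot C(s)] = \lambda \cdot C(s^*) + \mu \cdot E_{s \sim \sigma}[C(s)]$$
The equalities use linearity of expectation and the inequality uses smoothness.
POA: for every MNE $\sigma$ and optimal outcome $s^*$:
$$E_{s \sim \sigma}[C(s)] = \sum_i E_{s \sim \sigma}[C_i(s)] \le \sum_i E_{s \sim \sigma}[C_i(s^*_i, s_{-i})] \le \lambda \cdot C(s^*) + \mu \cdot E_{s \sim \sigma}[C(s)]$$
The first inequality is the MNE condition (*) and the second is the lemma, so $E_{s \sim \sigma}[C(s)] \le \frac{\lambda}{1-\mu} \cdot C(s^*)$.
(*) $s_i$'s distribution doesn't matter for $C_i(s^*_i, s_{-i})$ since $s^*_i$ is fixed.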
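Correlated Strategies
Model: players’ strategies are correlated, and each player gets a ‘hint’.
Definition (correlated equilibrium, CE): a joint distribution $\sigma$ over outcomes under which no player can decrease its expected cost given its posterior ‘hint’ $s_i$: $E_{s \sim \sigma}[C_i(s) \mid s_i] \le E_{s \sim \sigma}[C_i(s'_i, s_{-i}) \mid s_i]$.
Lemma: if a game is $(\lambda, \mu)$-smooth and $\sigma$ is a correlated distribution over outcomes, then for any $s^*$: $\sum_i E_{s \sim \sigma}[C_i(s^*_i, s_{-i})] \le \lambda \cdot C(s^*) + \mu \cdot E_{s \sim \sigma}[C(s)]$.
Proof and POA: the MNE proof holds for any joint distribution!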
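Coarse-Correlated Strategies
Model: players’ strategies are correlated (no ‘hint’).
Definition (coarse correlated equilibrium, CCE): a joint distribution $\sigma$ over outcomes under which no player can decrease its expected cost by switching to a fixed strategy: $E_{s \sim \sigma}[C_i(s)] \le E_{s \sim \sigma}[C_i(s'_i, s_{-i})]$.
Lemma: if a game is $(\lambda, \mu)$-smooth and $\sigma$ is a correlated distribution over outcomes, then for any $s^*$: $\sum_i E_{s \sim \sigma}[C_i(s^*_i, s_{-i})] \le \lambda \cdot C(s^*) + \mu \cdot E_{s \sim \sigma}[C(s)]$.
Proof and POA follow similarly.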
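Ex-3: Inclusion of Equilibria
Congestion game: 4 players and 6 singleton strategies (each player picks 1 resource), with cost function $c(x) = x$.
Pure NE: each of the $6 \cdot 5 \cdot 4 \cdot 3 = 360$ outcomes in which the players pick distinct resources.
Mixed NE: every player picks a resource uniformly at random. The expected cost is 3/2.
Correlated eq.: pick uniformly among the outcomes (‘hints’) with 1 resource used by 2 players and 2 resources used by 1 player each. The expected cost is 3/2.
Coarse correlated eq.: as above, restricted to the outcomes whose set of chosen resources is {0,2,4} or {1,3,5}, each w.p. ½. The expected cost is 3/2.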
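The expected costs in this example can be reproduced by enumeration. The following is a minimal sketch that assumes 4 players, resources labeled 0 to 5, and the cost function c(x) = x; the helper names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

N_PLAYERS, N_RESOURCES = 4, 6

def cost(i, outcome):
    """c(x) = x: a player pays the number of players on its resource."""
    return sum(1 for r in outcome if r == outcome[i])

def expected_cost(outcomes, i=0):
    """Expected cost of player i under the uniform distribution on `outcomes`."""
    return Fraction(sum(cost(i, s) for s in outcomes), len(outcomes))

def deviation_cost(outcomes, r, i=0):
    """Expected cost of player i if it always plays resource r instead."""
    return Fraction(sum(cost(i, s[:i] + (r,) + s[i + 1:]) for s in outcomes),
                    len(outcomes))

# Support of the uniform mixed NE: every outcome.
all_outcomes = list(product(range(N_RESOURCES), repeat=N_PLAYERS))
# Pure NE: all players on distinct resources.
pure_ne = [s for s in all_outcomes if len(set(s)) == N_PLAYERS]
# Correlated eq.: one resource with 2 players, two resources with 1 player each.
ce_support = [s for s in all_outcomes
              if sorted(Counter(s).values()) == [1, 1, 2]]
# Coarse correlated eq.: the chosen resources are exactly {0,2,4} or {1,3,5}.
cce_support = [s for s in ce_support if set(s) in ({0, 2, 4}, {1, 3, 5})]

for name, support in [("mixed", all_outcomes), ("correlated", ce_support),
                      ("coarse correlated", cce_support)]:
    print(name, expected_cost(support))
```

`deviation_cost` evaluates the coarse-correlated condition: under the CCE distribution, switching to any fixed resource also costs 3/2 in expectation, so no fixed deviation helps.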
Sequential Games
Model: a sequence is a game divided into $T$ time intervals, each with its own distribution over outcomes: $s^t \sim \sigma^t$ for $t = 1, \dots, T$.
Example:
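No Regret Sequence
Model: a sequential game without ‘too much’ regret in hindsight.
Definition: regret is a player's cost over all intervals minus the cost of its best fixed strategy in hindsight.
Definition: a no-regret sequence is a sequence of distributions $\sigma^1, \dots, \sigma^T$ over outcomes in which the total expected cost of each player exceeds that of its best fixed strategy in hindsight by at most $o(T)$:
$$\sum_{t=1}^{T} E_{s \sim \sigma^t}[C_i(s)] \le \sum_{t=1}^{T} E_{s \sim \sigma^t}[C_i(s'_i, s_{-i})] + o(T) \quad \text{for every player } i \text{ and strategy } s'_i$$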
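The regret definition can be made concrete on a realized sequence of outcomes. This is a minimal sketch: the 2-player, 2-resource sequences and the load-based cost function are hypothetical examples, not taken from the slides.

```python
def regret(i, sequence, strategies, cost):
    """Player i's total cost minus the cost of its best fixed strategy in hindsight."""
    actual = sum(cost(i, s) for s in sequence)
    best_fixed = min(
        sum(cost(i, s[:i] + (alt,) + s[i + 1:]) for s in sequence)
        for alt in strategies)
    return actual - best_fixed

def load_cost(i, outcome):
    """c(x) = x: a player pays the number of players on its resource."""
    return sum(1 for r in outcome if r == outcome[i])

# Hypothetical play: both players always collide on the same resource.
colliding = [(0, 0), (1, 1), (0, 0), (1, 1)]
# Hypothetical play: the players always avoid each other.
avoiding = [(0, 1), (1, 0), (0, 1), (1, 0)]

print(regret(0, colliding, [0, 1], load_cost))  # 2
print(regret(0, avoiding, [0, 1], load_cost))   # -2
```

Regret can be negative: in the second sequence player 0 does better than any fixed strategy would have done against the same opponent moves.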
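Price of Total Anarchy
Definition: the price of total anarchy (PoTA) is the worst-case ratio between the expected cost of a no-regret sequence and the cost of an optimal outcome.
PoTA under smoothness: if $s^*$ is optimal, define $\delta_i(s^t) = C_i(s^t) - C_i(s^*_i, s^t_{-i})$. When the game is $(\lambda, \mu)$-smooth:
$$C(s^t) = \sum_i [C_i(s^*_i, s^t_{-i}) + \delta_i(s^t)] \le \lambda \cdot C(s^*) + \mu \cdot C(s^t) + \sum_i \delta_i(s^t)$$
so $C(s^t) \le \frac{\lambda}{1-\mu} \cdot C(s^*) + \frac{1}{1-\mu} \sum_i \delta_i(s^t)$.
Averaging over $T$ gives the PoTA:
$$\frac{1}{T} \sum_{t=1}^{T} C(s^t) \le \frac{\lambda}{1-\mu} \cdot C(s^*) + \frac{1}{1-\mu} \sum_i \Big( \frac{1}{T} \sum_{t=1}^{T} \delta_i(s^t) \Big)$$
In a no-regret sequence each average $\frac{1}{T} \sum_t \delta_i(s^t)$ is at most $o(1)$ in expectation, so as $T \to \infty$ the bound approaches $\frac{\lambda}{1-\mu} \cdot C(s^*)$.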
Congestion Games are Tight for the smoothness-bound
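Congestion Games are Tight
Definition (tight games): let $A(\mathcal{G})$ denote all legal values $(\lambda, \mu)$ for which every game in $\mathcal{G}$ is $(\lambda, \mu)$-smooth. Let $\hat{\mathcal{G}} \subseteq \mathcal{G}$ be the games with at least 1 pure NE. For a game $G$, let $\rho_{pure}(G)$ be the POA of its pure NE. A class of games $\mathcal{G}$ is tight if
$$\sup_{G \in \hat{\mathcal{G}}} \rho_{pure}(G) = \inf_{(\lambda, \mu) \in A(\mathcal{G})} \frac{\lambda}{1-\mu}$$
The left side is the worst-case POA of pure NE. The right side is the best upper bound provable through smoothness.
Idea: characterize games by restricting their cost functions.
Theorem: for a set $\mathcal{C}$ of non-decreasing positive cost functions, the congestion games with cost functions in $\mathcal{C}$ are tight.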
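Notations
$\mathcal{G}(\mathcal{C})$: the set of congestion games with cost functions in $\mathcal{C}$.
$A(\mathcal{C})$: all legal values $(\lambda, \mu)$ such that $c(x+1) \cdot x^* \le \lambda \cdot c(x^*) \cdot x^* + \mu \cdot c(x) \cdot x$ for every $c \in \mathcal{C}$ and for every pair of integers $x \ge 0$, $x^* \ge 1$.
$\gamma(\mathcal{C}) = \inf \{ \lambda/(1-\mu) : (\lambda, \mu) \in A(\mathcal{C}) \}$. Note: $\gamma(\mathcal{C})$ is the best POA bound achievable through smoothness.
$c_e$: the cost function of resource $e$.
$x_e$ / $x^*_e$: the number of players using resource $e$ in outcomes $s$ / $s^*$, respectively.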
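To get a feel for $A(\mathcal{C})$ and $\gamma(\mathcal{C})$, the sketch below searches a grid of $\mu$ values for the single cost function $c(x) = x$. The load bound `n`, the grid resolution `steps` and the function names are assumptions of this illustration, so the search is only a finite approximation of the definition.

```python
from fractions import Fraction

def min_lambda(mu, cost_fns, n):
    """Smallest lambda such that (lambda, mu) meets every constraint with loads up to n."""
    return max(
        (c(x + 1) * x_star - mu * c(x) * x) / Fraction(c(x_star) * x_star)
        for c in cost_fns
        for x in range(0, n + 1)
        for x_star in range(1, n + 1))

def gamma_on_grid(cost_fns, n=20, steps=60):
    """Minimise lambda/(1-mu) over mu in {0, 1/steps, ..., (steps-1)/steps}."""
    best = None
    for k in range(steps):
        mu = Fraction(k, steps)
        lam = min_lambda(mu, cost_fns, n)
        ratio = lam / (1 - mu)
        if best is None or ratio < best[0]:
            best = (ratio, lam, mu)
    return best

# c(x) = x, the simplest affine cost function.
ratio, lam, mu = gamma_on_grid([lambda x: x])
print(ratio, lam, mu)  # 5/2 5/3 1/3
```

For this cost function the minimum sits where the constraints for $(x, x^*) = (1, 1)$ and $(2, 1)$ are both tight, which matches the $(5/3, 1/3)$ pair used in Ex-1.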
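Upper Bound
Claim: for every set $\mathcal{C}$, the robust POA of $\mathcal{G}(\mathcal{C})$ is at most $\gamma(\mathcal{C})$.
Proof: consider a game $G \in \mathcal{G}(\mathcal{C})$, outcomes $s, s^*$ with loads $x_e, x^*_e$, and a pair $(\lambda, \mu) \in A(\mathcal{C})$. We'll show that $G$ is $(\lambda, \mu)$-smooth.
$$\sum_i C_i(s^*_i, s_{-i}) \le \sum_{e \in E} c_e(x_e + 1) \cdot x^*_e \le \sum_{e \in E} [\lambda \cdot c_e(x^*_e) \cdot x^*_e + \mu \cdot c_e(x_e) \cdot x_e] = \lambda \cdot C(s^*) + \mu \cdot C(s)$$
First inequality: for each player at most one deviation is made. Second inequality: definition of $A(\mathcal{C})$.
Hence the POA upper bound is achieved.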
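Lower Bound
Definitions: for a finite $\mathcal{C}$ and a bound $n$, denote by $A(\mathcal{C}, n)$ the pairs $(\lambda, \mu)$ with $c(x+1) \cdot x^* \le \lambda \cdot c(x^*) \cdot x^* + \mu \cdot c(x) \cdot x$ for every $c \in \mathcal{C}$ and every pair of loads $(x, x^*)$ possible with at most $n$ players. Denote $\gamma(\mathcal{C}, n) = \inf \{ \lambda/(1-\mu) : (\lambda, \mu) \in A(\mathcal{C}, n) \}$.
Note: there are finitely many $(c, x, x^*)$ triples, hence finitely many linear 2-variable inequalities, each representing a half-plane.
Lemma-1: for a finite $\mathcal{C}$ and $n$, if there is $(\hat\lambda, \hat\mu) \in A(\mathcal{C}, n)$ such that $\hat\lambda/(1-\hat\mu) = \gamma(\mathcal{C}, n)$, then $(\hat\lambda, \hat\mu)$ is the intersection point of the boundaries of 2 half-planes.
Mathematically: there exist $c_1, c_2 \in \mathcal{C}$ and loads $x_1, x_2, x^*_1, x^*_2$ such that for $j = 1, 2$:
$$c_j(x_j + 1) \cdot x^*_j = \hat\lambda \cdot c_j(x^*_j) \cdot x^*_j + \hat\mu \cdot c_j(x_j) \cdot x_j$$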
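Lower Bound Geometric Proof
Proof: by inspecting a minimizer of $f(\lambda, \mu) = \lambda/(1-\mu)$.
Note: the problem resembles an LP with a non-linear target function.
The half-planes intersect in a convex feasible region.
$f$ increases in $\lambda$, so the minimum is on the boundary of the convex region!
On the boundary, every $\mu$ has its minimal feasible $\lambda$ value.
Whether $f$ increases in $\mu$ along a boundary segment depends on the half-plane that defines the segment.
[Figure: the feasible region in the $(\lambda, \mu)$ plane, with $f$ on the third axis.]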
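Lower Bound Geometry (Cont.)
Take a down-walk on the boundary of the convex region:
While $f$ keeps decreasing, go down. Otherwise, stop. That will assure $f$ is minimized.
The minimizer $(\hat\lambda, \hat\mu)$ is on the boundary, at the intersection of 2 half-planes $\mathcal{H}_1$ and $\mathcal{H}_2$.
Lemma-2: for a finite $\mathcal{C}$ and $n$, if there exist $(\hat\lambda, \hat\mu) \in A(\mathcal{C}, n)$ such that $\hat\lambda/(1-\hat\mu) = \gamma(\mathcal{C}, n)$, then there exist $c_1, c_2 \in \mathcal{C}$, loads $x_1, x_2, x^*_1, x^*_2$ and $\eta \in [0,1]$ such that
$$\eta \cdot c_1(x_1+1) \cdot x^*_1 + (1-\eta) \cdot c_2(x_2+1) \cdot x^*_2 = \eta \cdot c_1(x_1) \cdot x_1 + (1-\eta) \cdot c_2(x_2) \cdot x_2$$
Proof: for the above $\mathcal{H}_1, \mathcal{H}_2$, combine the boundary equation of $\mathcal{H}_1$ with weight $\eta$ and the boundary equation of $\mathcal{H}_2$ with weight $(1-\eta)$, for a suitable $\eta$. Q.E.D.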
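Building the Game
Theorem: for every set of cost functions $\mathcal{C}$, there are congestion games with cost functions in $\mathcal{C}$ with pure POA arbitrarily close to $\gamma(\mathcal{C})$.
Proof: define the game by following the lemmas:
$E = E_1 \cup E_2$, each $E_j$ a labeled ‘cycle’ with $k$ elements.
$E_1$'s cost function: $\eta \cdot c_1$, and $E_2$'s cost function: $(1-\eta) \cdot c_2$.
$k$ players, each with 2 strategies $P_i$ and $Q_i$.
$P_i$ = use $x_1$ consecutive elements of $E_1$ and $x_2$ consecutive elements of $E_2$, starting with the $i$th element of each cycle.
$Q_i$ = use $x^*_1$ consecutive elements of $E_1$ and $x^*_2$ consecutive elements of $E_2$, ending with the $(i-1)$th element of each cycle.
$k$'s size, at least $\max\{x_1 + x^*_1, x_2 + x^*_2\}$, assures $P_i$ and $Q_i$ are disjoint.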
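Building the Game (Cont.)
Denote $s$ the outcome in which every player $i$ plays $P_i$, and $s^*$ the outcome in which every player $i$ plays $Q_i$.
By symmetry: in $s$ every element of $E_j$ is used by $x_j$ players, and in $s^*$ every element of $E_j$ is used by $x^*_j$ players.
$$C_i(s) = \eta \cdot c_1(x_1) \cdot x_1 + (1-\eta) \cdot c_2(x_2) \cdot x_2 \quad \text{(symmetry)}$$
$$= \eta \cdot c_1(x_1+1) \cdot x^*_1 + (1-\eta) \cdot c_2(x_2+1) \cdot x^*_2 \quad \text{(Lemma 2)}$$
$$= C_i(Q_i, s_{-i}) \quad (Q_i \text{ and } P_i \text{ are disjoint})$$
Conclusion: $s$ is a pure NE.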
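The POA of the Game
Adding the 2 boundary equations of Lemma 1 with weights $\eta$ and $(1-\eta)$ gives $C_i(Q_i, s_{-i}) = \hat\lambda \cdot C_i(s^*) + \hat\mu \cdot C_i(s)$ for every player $i$. Since $C_i(s) = C_i(Q_i, s_{-i})$, summing over the players gives $C(s) = \hat\lambda \cdot C(s^*) + \hat\mu \cdot C(s)$, i.e. $C(s)/C(s^*) = \hat\lambda/(1-\hat\mu)$.
Repeating the proof steps:
The lemmas told us where to find the candidate pair $(\hat\lambda, \hat\mu)$.
We built a game according to the lemmas and found that it has a pure NE.
In this game the pure POA is at least $\hat\lambda/(1-\hat\mu) = \gamma(\mathcal{C}, n)$, so the supremum over all games is at least as high, and we proved the lower bound.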
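The construction can be instantiated for a concrete case. The sketch below assumes $c_1(x) = c_2(x) = x$ and the parameters $x_1 = x^*_1 = 1$, $x_2 = 2$, $x^*_2 = 1$, $\eta = 1/2$, $k = 3$; these values are chosen here because they satisfy the Lemma-2 equation for $(\hat\lambda, \hat\mu) = (5/3, 1/3)$, and they are not stated in the slides.

```python
from fractions import Fraction

def build_cycle_game(x1, x1s, x2, x2s, k):
    """Return (P, Q): each strategy is a set of (cycle, index) resources."""
    def arc(cycle, start, length):
        return {(cycle, (start + d) % k) for d in range(length)}
    # P_i starts at element i of each cycle.
    P = [arc(1, i, x1) | arc(2, i, x2) for i in range(k)]
    # Q_i ends at element i-1, so it starts at element i - length.
    Q = [arc(1, i - x1s, x1s) | arc(2, i - x2s, x2s) for i in range(k)]
    return P, Q

ETA = Fraction(1, 2)
WEIGHT = {1: ETA, 2: 1 - ETA}                  # cycle -> scaling of its cost function
COST = {1: lambda x: x, 2: lambda x: x}        # c_1 and c_2 (assumed linear)
K = 3

def player_cost(i, outcome):
    """Sum of the scaled costs of the resources player i uses."""
    total = Fraction(0)
    for cycle, idx in outcome[i]:
        load = sum((cycle, idx) in strategy for strategy in outcome)
        total += WEIGHT[cycle] * COST[cycle](load)
    return total

def total_cost(outcome):
    return sum(player_cost(i, outcome) for i in range(len(outcome)))

P, Q = build_cycle_game(x1=1, x1s=1, x2=2, x2s=1, k=K)
s, s_star = tuple(P), tuple(Q)

disjoint = all(not (P[i] & Q[i]) for i in range(K))
# Each player has only 2 strategies, so checking the switch to Q_i suffices.
is_pure_ne = all(
    player_cost(i, s) <= player_cost(i, s[:i] + (Q[i],) + s[i + 1:])
    for i in range(K))
poa_lower_bound = total_cost(s) / total_cost(s_star)
print(is_pure_ne, poa_lower_bound)  # True 5/2
```

Here every player pays 5/2 in $s$ and would also pay 5/2 after switching to $Q_i$, so $s$ is a (weak) pure NE whose cost is 5/2 times that of $s^*$.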
Questions?
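Best-Response Dynamics (BRD)
Model: myopic response. If the game is not in equilibrium, choose a player who is not best-responding and improve its cost.
The game will converge if it has a potential function $\Phi$ satisfying: $\Phi(s) - \Phi(s'_i, s_{-i}) = C_i(s) - C_i(s'_i, s_{-i})$.
Claim: let $s^0, \dots, s^T$ be a best-response sequence of a smooth game with robust POA $\rho$, where $s^*$ is optimal and $i(t)$ denotes the deviating player at time $t$. If the deviating players satisfy a suitable improvement condition, then w.h.p. all but a bounded number of the outcomes $s^t$ have cost close to $\rho \cdot C(s^*)$.
We won’t show the proof here.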
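A minimal sketch of best-response dynamics on a singleton congestion game with c(x) = x. The starting outcome, the player order (lowest-indexed improving player first) and the tie-breaking are assumptions of this illustration. The potential used is Rosenthal's potential for congestion games, which drops by exactly the deviating player's cost improvement.

```python
from collections import Counter

def best_response_dynamics(start, n_resources, max_steps=100):
    """Let the first player with an improving move switch to a best response, repeatedly."""
    outcome = list(start)
    history = [tuple(outcome)]
    for _ in range(max_steps):
        improved = False
        for i in range(len(outcome)):
            def cost_if(r):
                # Cost of player i on resource r given the other players' choices.
                return 1 + sum(1 for j, rj in enumerate(outcome)
                               if j != i and rj == r)
            best = min(range(n_resources), key=cost_if)
            if cost_if(best) < cost_if(outcome[i]):
                outcome[i] = best
                history.append(tuple(outcome))
                improved = True
                break
        if not improved:
            break  # no player can improve: pure NE reached
    return history

def potential(outcome):
    """Rosenthal potential for c(x) = x: sum over resources of 1 + 2 + ... + load."""
    return sum(load * (load + 1) // 2 for load in Counter(outcome).values())

history = best_response_dynamics((0, 0, 0, 0), n_resources=6)
print(history[-1], [potential(s) for s in history])
```

Starting with all 4 players on resource 0, the run ends in the outcome `(1, 2, 3, 0)`, where every player is alone on its resource, and the potential values along the way are 10, 7, 5, 4.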
Bicriteria Bounds
Model: a game where the players are duplicated a fixed number of times.
Example: routing over an existing infrastructure with a varying number of identical players.
Denote a superposition of 2 outcomes. We assume
Lemma: if G is smooth with duplicated players, then
Proof: uses the NE condition, the superposition assumption, and smoothness.