THE PRICE OF STOCHASTIC ANARCHY
Christine Chung
University of Pittsburgh
Katrina Ligett
Carnegie Mellon University
Kirk Pruhs
University of Pittsburgh
Aaron Roth
Carnegie Mellon University
Load Balancing on Unrelated Machines

[Figure: a "Time Needed" table giving the running time of Jobs 1–3 on Machines 1 and 2, next to a picture of the two machines.]

n players, each with a job to run; each chooses one of m machines to run it on.
Each player's goal is to minimize her job's finish time.
NOTE: the finish time of a job is equal to the load on the machine where the job is run.
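The setup above can be sketched in a few lines of code. The cost-matrix values here are hypothetical, chosen only to illustrate the model:

```python
def makespan(costs, assignment, m):
    """Makespan = maximum load over the m machines, where a machine's
    load is the total running time of the jobs assigned to it."""
    loads = [0.0] * m
    for job, machine in enumerate(assignment):
        loads[machine] += costs[job][machine]
    return max(loads)

# Hypothetical 3-job, 2-machine "Time Needed" table (values illustrative).
costs = [[2.0, 4.0],   # Job 1: 2 on Machine 1, 4 on Machine 2
         [3.0, 1.0],   # Job 2
         [2.0, 2.0]]   # Job 3
# Jobs 1 and 3 on Machine 1, Job 2 on Machine 2:
print(makespan(costs, [0, 1, 0], m=2))  # -> 4.0
```

Each player's finish time is exactly the load of the machine she picked, so the makespan is also the worst finish time any player experiences.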
Unbounded Price of Anarchy in the Load Balancing Game on Unrelated Machines

Price of Anarchy (POA) measures the cost of having no central authority.
Let an optimal assignment under centralized authority be one in which makespan is minimized.

POA = (makespan at worst Nash) / (makespan at OPT)

Bad POA instance: 2 players and 2 machines (L and R). Job 1 takes time δ on L and 1 on R; job 2 takes time 1 on L and δ on R.
OPT here costs δ (each job on its fast machine). Worst Nash costs 1 (each job alone on its slow machine).

POA = (cost of worst Nash) / (cost at OPT) = 1/δ
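For an instance this small, the POA can be computed exhaustively. A brute-force sketch (feasible only for tiny n and m):

```python
from itertools import product

def finish_time(costs, a, job):
    """A job's finish time = total load on the machine it chose."""
    mach = a[job]
    return sum(costs[j][mach] for j in range(len(a)) if a[j] == mach)

def price_of_anarchy(costs, m):
    """Brute-force POA: enumerate all assignments, keep the pure Nash
    equilibria, and compare the worst Nash makespan to OPT's."""
    n = len(costs)
    states = list(product(range(m), repeat=n))
    def makespan(a):
        return max(sum(costs[j][k] for j in range(n) if a[j] == k)
                   for k in range(m))
    def is_nash(a):
        # No player can lower her finish time by switching machines.
        for job in range(n):
            for k in range(m):
                b = a[:job] + (k,) + a[job + 1:]
                if finish_time(costs, b, job) < finish_time(costs, a, job):
                    return False
        return True
    opt = min(makespan(a) for a in states)
    worst_nash = max(makespan(a) for a in states if is_nash(a))
    return worst_nash / opt

delta = 0.01
costs = [[delta, 1.0],   # job 1: delta on L, 1 on R
         [1.0, delta]]   # job 2: 1 on L, delta on R
print(price_of_anarchy(costs, m=2))  # -> 1/delta
```

The worst Nash is the "swapped" assignment (each job alone on its slow machine): neither player gains by moving, since joining the other job's machine costs at least 1 + δ.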
Drawbacks of Price of Anarchy

A solution characterization with no road map: if there is more than one Nash, we don't know which one will be reached.
Strong assumptions must be made about the players: e.g., fully informed and fully convinced of one another's "rationality."
Nash equilibria are sometimes very brittle, making POA results feel overly pessimistic.
Evolutionary Game Theory

Young (1993) specified a model of adaptive play that allows us to predict which solutions will be chosen in the long run by self-interested decision-making agents with limited information and resources.

"I dispense with the notion that people fully understand the structure of the games they play, that they have a coherent model of others' behavior, that they can make rational calculations of infinite complexity, and that all of this is common knowledge. Instead I postulate a world in which people base their decisions on limited data, use simple predictive models, and sometimes do unexplained or even foolish things."
– P. Young, Individual Strategy and Social Structure, 1998

In each round of play, each player uses some simple, reasonable dynamics to decide which strategy to play. E.g.:

Imitation dynamics:
Sample s of the last mem strategies I played.
Play the strategy whose average payoff was highest (breaking ties uniformly at random).

Best response dynamics:
Sample the other player's realized strategy in s of the last mem rounds.
Assume this sample represents the probability distribution of what the other player will play the next round, and play a strategy that is a best response (minimizes my expected cost).
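The imitation rule above can be sketched directly; `history` and the payoff convention here are illustrative:

```python
import random

def imitation_choice(history, s):
    """Imitation dynamics (a sketch): `history` holds my last `mem` plays
    as (strategy, payoff) pairs.  Sample s of them and replay the strategy
    whose average payoff in the sample is highest, ties broken uniformly."""
    sample = random.sample(history, s)
    payoffs = {}
    for strategy, payoff in sample:
        payoffs.setdefault(strategy, []).append(payoff)
    avg = {k: sum(v) / len(v) for k, v in payoffs.items()}
    best = max(avg.values())
    return random.choice([k for k, a in avg.items() if a == best])

# If every sampled play of L paid 1 and there is nothing else to sample,
# imitation must return L:
print(imitation_choice([('L', 1.0)] * 4, s=3))  # -> 'L'
```

In the load balancing game payoffs are negated costs (lower finish time is better), so "highest average payoff" becomes "lowest average load".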
Adaptive Play Example

Consider the same 2-job, 2-machine instance (job 1 costs δ on L and 1 on R; job 2 costs 1 on L and δ on R), with both players using imitation dynamics.

Let mem = 4.
If s = 3, each player randomly samples three past plays from the memory, and picks the strategy among them that worked best (yielded the highest payoff).

Example state: LLRRLLLL (player 1's last four plays were LLRR, player 2's were LLLL).
Adaptive Play Example: a Markov Process

The players' joint histories form the states of a Markov process; with mem = 4 for each of the two players, there are 2^8 = 256 total states in the state space.

[Figure: a fragment of the state diagram, showing transitions among states such as LLLLLLLL, LLLRLLLL, LLLLLLLR, LLRLLLLL, RRRRLRRR, and RRRRRRRR, with transition probabilities (e.g., 3/4, 1/4, 1) determined by which sample of past plays each player happens to draw.]
Absorbing Sets of the Markov Process

An absorbing set is a set of states that are all reachable from one another, but cannot reach any states outside of the set.

In our example, we have 4 absorbing sets, each a single state with self-loop probability 1: RRRRRRRR, RRRRLLLL (the bad NASH), LLLLRRRR (OPT), and LLLLLLLL.

But which state we end up in depends on our initial state. Hence we perturb our Markov process as follows: during each round, each player, with probability ε, does not use imitation dynamics, but instead chooses a machine at random.
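A minimal simulation of the perturbed process on this instance; the values of δ, mem, s, ε, the horizon, and the random seed are illustrative choices, not from the paper:

```python
import random

# Perturbed imitation dynamics on the 2-job, 2-machine instance
# (job 1: delta on L, 1 on R; job 2: 1 on L, delta on R).
delta, MEM, S, EPS, T = 0.1, 4, 3, 0.05, 20000
COST = {1: {'L': delta, 'R': 1.0}, 2: {'L': 1.0, 'R': delta}}

def finish(player, mine, other):
    """Finish time = load on my machine."""
    load = COST[player][mine]
    if other == mine:
        load += COST[3 - player][mine]
    return load

def imitate(history):
    """Sample S of my last MEM (strategy, cost) pairs and replay the
    strategy with lowest average cost (ties uniformly at random)."""
    sample = random.sample(history, S)
    avg = {}
    for strat, cost in sample:
        avg.setdefault(strat, []).append(cost)
    best = min(sum(v) / len(v) for v in avg.values())
    return random.choice([k for k, v in avg.items()
                          if sum(v) / len(v) == best])

random.seed(1)
hist = {1: [], 2: []}
play = {1: 'L', 2: 'L'}
opt_rounds = 0
for _ in range(T):
    for p in (1, 2):
        if len(hist[p]) < MEM or random.random() < EPS:
            play[p] = random.choice('LR')   # mistake (or initial filling)
        else:
            play[p] = imitate(hist[p])
    for p in (1, 2):
        c = finish(p, play[p], play[3 - p])
        hist[p] = (hist[p] + [(play[p], c)])[-MEM:]
    if (play[1], play[2]) == ('L', 'R'):    # the OPT assignment
        opt_rounds += 1
print(opt_rounds / T)
```

Because the perturbation lets the process escape every absorbing state, the long-run behavior no longer depends on where we start; the stability analysis on the next slides predicts which states retain probability as ε → 0.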
Stochastic Stability

The perturbed process has only one big absorbing set (any state is reachable from any other state).
Hence we have a unique stationary distribution με (where μεP = με). The probability distribution με is the time-average asymptotic frequency distribution of Pε.

A state z is stochastically stable if lim ε→0 με(z) > 0.
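The limit in this definition can be seen numerically on a toy chain. The 2-state chain below is purely illustrative (not the 256-state process): leaving state 0 takes one "mistake" (probability ε), leaving state 1 takes two (probability ε²), so state 1 absorbs the stationary mass as ε → 0.

```python
import numpy as np

def stationary(P):
    """Solve mu P = mu with sum(mu) = 1 via least squares."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    return np.linalg.lstsq(A, b, rcond=None)[0]

for eps in (0.1, 0.01, 0.001):
    P = np.array([[1 - eps, eps],
                  [eps**2, 1 - eps**2]])
    mu = stationary(P)
    # mu[1] -> 1 as eps -> 0: state 1 is the stochastically stable one.
    print(eps, mu)
```

Here με works out to (ε/(1+ε), 1/(1+ε)), so only state 1 keeps positive probability in the limit, matching the definition of stochastic stability.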
Finding Stochastically Stable States

Theorem (Young, 1993): The stochastically stable states are those states contained in the absorbing sets of the unperturbed process that have minimum stochastic potential.

The stochastic potential of a state is the cost of the minimum spanning tree rooted there, in the graph whose nodes are the absorbing states and whose edge costs are the resistances (numbers of mistakes needed) of moving between them.

[Figure sequence: the spanning-tree costs are computed for each of the four absorbing states RRRRRRRR, RRRRLLLL, LLLLRRRR, and LLLLLLLL; the OPT state LLLLRRRR attains the minimum stochastic potential and is therefore stochastically stable.]
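The spanning-tree computation from the figure can be brute-forced for a handful of absorbing states. The resistance values below are hypothetical, chosen only to illustrate the mechanics of Young's theorem:

```python
from itertools import product

def stochastic_potential(resistance, root):
    """Stochastic potential of `root` = cost of the cheapest spanning tree
    directed into `root`, brute-forced by enumerating each non-root node's
    parent.  resistance[i][j] is the resistance of transition i -> j."""
    n = len(resistance)
    others = [i for i in range(n) if i != root]
    best = float('inf')
    for parents in product(range(n), repeat=len(others)):
        par = dict(zip(others, parents))
        if any(par[i] == i for i in others):
            continue
        ok = True
        for i in others:              # every node must have a path to root
            seen, j = set(), i
            while j != root:
                if j in seen:
                    ok = False
                    break
                seen.add(j)
                j = par[j]
            if not ok:
                break
        if ok:
            best = min(best, sum(resistance[i][par[i]] for i in others))
    return best

# Hypothetical 4-state resistance matrix (values illustrative only):
r = [[0, 5, 5, 5],
     [1, 0, 5, 5],
     [5, 1, 0, 5],
     [5, 5, 1, 0]]
# State 0 has the cheapest tree into it (3 -> 2 -> 1 -> 0, cost 3), so
# under Young's theorem it would be the stochastically stable state.
print([stochastic_potential(r, root) for root in range(4)])  # -> [3, 7, 7, 7]
```

Exhaustive enumeration is fine here (n^(n-1) parent assignments for n absorbing states); larger instances would call for a minimum-arborescence algorithm instead.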
Recap: Adaptive Play Model

Assume the game is played repeatedly by players with limited information and resources.
Use a decision rule (aka "learning behavior" or "selection dynamics") to model how each player picks her strategy for each round.
This yields a Markov process where the states represent fixed-size histories of game play.
Add noise: players make "mistakes" with some small positive probability and don't always behave according to the prescribed dynamics.
Stochastic Stability and the Price of Stochastic Anarchy

The states with positive probability in the long run of the perturbed Markov process are the stochastically stable states (SSS).

In our paper, we define the Price of Stochastic Anarchy (PSA) to be

PSA = (max cost over SSS) / (cost at OPT)

Recall the bad instance: POA = 1/δ (unbounded). But the bad Nash in this case is not a SSS. In fact, OPT is the only SSS here. So PSA = 1 in this instance.

Our main result: For the game of load balancing on unrelated machines, while POA is unbounded, PSA is bounded.
Specifically, we show PSA ≤ m∙Fib(n)(mn+1), which is m times the (mn+1)th n-step Fibonacci number.
We also exhibit instances of the game where PSA > m.
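As a ratio, PSA is trivial to evaluate once the set of stochastically stable states is known; the SSS cost list here is supplied by the analysis, not computed by the code:

```python
def price_of_stochastic_anarchy(sss_costs, opt_cost):
    """PSA = (max cost over stochastically stable states) / (cost at OPT)."""
    return max(sss_costs) / opt_cost

# In the bad POA instance, OPT (cost delta) is the only SSS, so PSA = 1:
delta = 0.01
print(price_of_stochastic_anarchy([delta], delta))  # -> 1.0
```

Contrast this with POA, which maximizes over all Nash equilibria, stable or not: there the same instance gives 1/δ.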
PSA for Load Balancing

(m is the number of machines, n is the number of jobs/players)

Ω(m) ≤ PSA ≤ m∙Fib(n)(mn+1)
Closing Thoughts

In the game of load balancing on unrelated machines, we found that while POA is unbounded, PSA is bounded.
Indeed, in the bad POA instances for many games, the worst Nash are not stochastically stable.
Finding PSA in these games is an interesting open question that may yield very illuminating results.
PSA allows us to determine the relative stability of equilibria, distinguishing those that are brittle from those that are more robust, giving us a more informative measure of the cost of having no central authority.
Conjecture

You might notice in this game that if players could coordinate or form a team, they would play OPT.
Instead of being unbounded, [AFM2007] have shown that the strong price of anarchy is O(m).
We conjecture that PSA is also O(m), i.e., that a linear price of anarchy can be achieved without player coordination.