n a s h e q u ilib ria o f b im a trix g a m e s · 2010. 5. 17. · n a s h e q u ilib ria o f b...
TRANSCRIPT
Nash equilibria of bimatrix games
0 6 2 1A = 2 5 B = 1 3
3 3 4 3
Nash equilibrium =
pair of strategies x , y with
x best response to y andy best response to x.
Mixed equilibria
0 6 2 1A = 2 5 B = 1 3
3 3 4 3
2/3x = 1/3 xTB = 5/3 5/3
0
4Ay = 4 yT = 1/3 2/3
3
only pure best responses can have probability > 0
Best response condition
Let x and y be mixed strategies of player I and II, respectively.Then x is a best response to y⇐⇒ for all pure strategies i of player I:
xi > 0 =⇒ (Ay)i = u = max{(Ay)k | 1≤ k ≤ m}.Here, (Ay)i is the i th component of Ay , which is the expectedpayoff to player I when playing row i .
Proof.
xAy =m
∑i=1
xi (Ay)i =m
∑i=1
xi (u− (u− (Ay)i)
=m
∑i=1
xi u−m
∑i=1
xi (u− (Ay)i) = u−m
∑i=1
xi (u− (Ay)i)≤ u,
because xi ≥ 0 and u− (Ay)i ≥ 0 for all i . Furthermore,xAy = u ⇐⇒ xi > 0 implies (Ay)i = u , as claimed.
Best responses to mixed strategy of player 2
2
4 5
1
3
player Ipayoffs to
0,11,0
= A523 3
60
Best responses to mixed strategy of player 2
2
4 5
1
3
player Ipayoffs to
0,11,0
= A523 3
60
Best responses to mixed strategy of player 2
2
4 5
1
3
player Ipayoffs to
0,11,0
1
0
6
= A523 3
60
Best responses to mixed strategy of player 2
2
4 5
1
3
player Ipayoffs to
0,11,0
2
2
5
= A523 3
60
Best responses to mixed strategy of player 2
2
4 5
1
3
player Ipayoffs to
0,11,0
33 3
= A523 3
60
Best responses to mixed strategy of player 2
2
4 5
1
3
player Ipayoffs to
0,11,0
32
1
3 3
2
5
0
6
= A523 3
60
best response polyhedron
Best responses to mixed strategy of player 2
2
4 5
1
3
player Ipayoffs to
0,11,0
32
1
= A523 3
60
with facet labelsbest response polyhedron
Best responses to mixed strategy of player 2
2
4 5
1
3
player Ipayoffs to
5 4
0,11,0
32
1
= A523 3
60
Best responses to mixed strategy of player 2
2
4 5
1
3
player Ipayoffs to
5 4
5 43 2 1
32
1
= A523 3
60
Best responses to mixed strategy of player 2
2
4 5
1
3
player Ipayoffs to
5 43 2 1
= A523 3
60
Best responses to mixed strategy of player 1
100
010
001
2
4 5
1
3
2 11 34 3
payoffs toplayer II
= B
Best responses to mixed strategy of player 1
100
010
001
2
4 5
1
3
2 11 34 3
payoffs toplayer II
= B
Best responses to mixed strategy of player 1
100
010
001
2
4 5
1
3
2 11 34 3
payoffs toplayer II
= B
2
1
4
Best responses to mixed strategy of player 1
100
010
001
2
4 5
1
3
2 11 34 3
payoffs toplayer II
= B
2
1
4
Best responses to mixed strategy of player 1
100
010
001
2
4 5
1
3
2 11 34 3
payoffs toplayer II
= B
3
3
1
Best responses to mixed strategy of player 1
100
010
001
2
4 5
1
3
2 11 34 3
payoffs toplayer II
= B
3
3
1
Best responses to mixed strategy of player 1
100
010
001
2
4 5
1
3
2 11 34 3
payoffs toplayer II
= B
Best responses to mixed strategy of player 1
100
010
001
2
4 5
1
3
2 11 34 3
payoffs toplayer II
= B
best response
with facet labelspolyhedron
Best responses to mixed strategy of player 1
100
010
001
2
4 5
1
3
2 11 34 3
payoffs toplayer II
= B
5
4
12
3 (front)
Alternative view
Chop off Toblerone prism
Chop off Toblerone prism
Chop off Toblerone prism
Chop off Toblerone prism
Chop off Toblerone prism
Best responses to mixed strategy of player 1
100
010
001
2
4 5
1
3
2 11 34 3
payoffs toplayer II
= B
Best responses to mixed strategy of player 1
2
3
145
2
4 5
1
3
2 11 34 3
payoffs toplayer II
= B
Equilibrium = completely labeled strategy pair
45
3
3 2 145
2 1
Equilibrium = completely labeled strategy pair
45
3
3 2 145
2 1
Equilibrium = completely labeled strategy pair
45
3
3 2 145
2 1
Constructing games using geometry
low dimension: 2, 3, (4) pure strategies:
subdivide mixed strategy simplex intoresponse regions, label suitably
high dimension:
use polytopes with known combinatorial structuree.g. for constructing games with many equilibria,or long Lemke-Howson computations[Savani & von Stengel, FOCS 2004,Econometrica 2006]
Construct isolated non−quasi−strict equilibrium
12
5
4
6
3
12
45
6 3
1
2
3
4 5 6
2 0 0A = 111
0 2 01 0 2B =
011
1 0 0
Construct isolated non−quasi−strict equilibrium
12
5
4
6
3
12
45
6 3
The Lemke−Howson algorithm
45
3
3 2 1
12
5 4
The Lemke−Howson algorithm
45
3
3 2 1
12
5 4
The Lemke−Howson algorithm
artificial equilibrium000 0,0
45
3
3 2 1
12
5 4
The Lemke−Howson algorithm
missing label 2
artificial equilibrium000 0,0
45
3
3 2 1
12
5 4
The Lemke−Howson algorithm
2missing label
000 0,0
45
3
3 1
12
5 4
2
The Lemke−Howson algorithm
2missing label
000 0,0
45
3
3 1
12
5 4
2
The Lemke−Howson algorithm
2missing label
000 0,0
45
3
3 1
12
5 4
2
The Lemke−Howson algorithm
2missing label
000 0,0
45
3
3 1
12
5 4
2
The Lemke−Howson algorithm
2found label
000 0,0
45
3
3 2 1
12
5 4
Why Lemke-Howson works
LH finds at least one Nash equilibrium because
• finitely many "vertices"
for nondegenerate (generic) games:
• unique starting edge given missing label
• unique continuation
precludes "coming back" like here:
start at Nash equilibrium
The Lemke−Howson algorithm
2missing label45
3
3 2 1
12
5 4
start at Nash equilibrium
The Lemke−Howson algorithm
2missing label45
3
3 1
12
5 4
2
start at Nash equilibrium
The Lemke−Howson algorithm
2missing label45
3
3 1
12
5 4
2
start at Nash equilibrium
Odd number of Nash equilibria!
2found label45
3
3 2 1
12
5 4
Nondegenerate bimatrix games
Given: m × n bimatrix game (A,B)
X = { x ∈ Rm | x ≥ 0, x1 + . . . + xm = 1 }Y = { y ∈ Rn | y ≥ 0, y1 + . . . + yn = 1 }
supp(x) = { i | xi > 0 } supp(y) = { j | yj > 0 }
(A,B) nondegenerate ⇔ ∀ x ∈X, y ∈Y:
| { j | j best response to x } | ≤ | supp(x) |,| { i | i best response to y } | ≤ | supp(y) |.
Nondegeneracy via labels
m × n bimatrix game (A,B) nondegenerate
⇔ no x X has more than m labels,no y Y has more than n labels.
E.g. x with > m labels,s labels from { 1 , . . . , m } ,
⇒ > m−s labels from { m+1 , . . . , m+n }⇔ > |supp(x)| best responses to x.⇒ degenerate.
Example of a degenerate game
3
1
4 5
2
4 5
1
3
2
2 11 34
payoffs toplayer II
= B4
Handling degenerate games
Lemke–Howson implemented by pivoting, i.e., changing fromone basic feasible solution of a linear system to another by choos-ing an entering and a leaving variable.
Choice of entering variable via complementarity (only differenceto simplex algorithm for linear programming).
Leaving variable is unique in nondegenerate games.
In degenerate games: perturb system by adding (ε, . . . ,εn)> ,creates nondegenerate system.Implemented symbolically by lexicographic rule.