n a s h e q u ilib ria o f b im a trix g a m e s · 2010. 5. 17. · n a s h e q u ilib ria o f b...

Nash equilibria of bimatrix games

0 6 2 1A = 2 5 B = 1 3

3 3 4 3

Nash equilibrium =

pair of strategies x , y with

x best response to y andy best response to x.

Mixed equilibria

0 6 2 1A = 2 5 B = 1 3

3 3 4 3

2/3x = 1/3 xTB = 5/3 5/3

0

4Ay = 4 yT = 1/3 2/3

3

only pure best responses can have probability > 0

Best response condition

Let x and y be mixed strategies of player I and II, respectively.Then x is a best response to y⇐⇒ for all pure strategies i of player I:

xi > 0 =⇒ (Ay)i = u = max{(Ay)k | 1≤ k ≤ m}.Here, (Ay)i is the i th component of Ay , which is the expectedpayoff to player I when playing row i .

Proof.

xAy =m

∑i=1

xi (Ay)i =m

∑i=1

xi (u− (u− (Ay)i)

=m

∑i=1

xi u−m

∑i=1

xi (u− (Ay)i) = u−m

∑i=1

xi (u− (Ay)i)≤ u,

because xi ≥ 0 and u− (Ay)i ≥ 0 for all i . Furthermore,xAy = u ⇐⇒ xi > 0 implies (Ay)i = u , as claimed.

Best responses to mixed strategy of player 2

2

4 5

1

3

player Ipayoffs to

0,11,0

= A523 3

60


2

4 5

1

3

player Ipayoffs to

0,11,0

1

0

6

= A523 3

60


2

4 5

1

3

player Ipayoffs to

0,11,0

2

2

5

= A523 3

60


2

4 5

1

3

player Ipayoffs to

0,11,0

33 3

= A523 3

60


2

4 5

1

3

player Ipayoffs to

0,11,0

32

1

3 3

2

5

0

6

= A523 3

60

best response polyhedron


2

4 5

1

3

player Ipayoffs to

0,11,0

32

1

= A523 3

60

with facet labelsbest response polyhedron


2

4 5

1

3

player Ipayoffs to

5 4

0,11,0

32

1

= A523 3

60


2

4 5

1

3

player Ipayoffs to

5 4

5 43 2 1

32

1

= A523 3

60


2

4 5

1

3

player Ipayoffs to

5 43 2 1

= A523 3

60


100

010

001

2

4 5

1

3

2 11 34 3

payoffs toplayer II

= B


100

010

001

2

4 5

1

3

2 11 34 3

payoffs toplayer II

= B

2

1

4


100

010

001

2

4 5

1

3

2 11 34 3

payoffs toplayer II

= B

3

3

1


100

010

001

2

4 5

1

3

2 11 34 3

payoffs toplayer II

= B

best response

with facet labelspolyhedron


100

010

001

2

4 5

1

3

2 11 34 3

payoffs toplayer II

= B

5

4

12

3 (front)

Alternative view

Chop off Toblerone prism


100

010

001

2

4 5

1

3

2 11 34 3

payoffs toplayer II

= B


2

3

145

2

4 5

1

3

2 11 34 3

payoffs toplayer II

= B

Equilibrium = completely labeled strategy pair

45

3

3 2 145

2 1

Constructing games using geometry

low dimension: 2, 3, (4) pure strategies:

subdivide mixed strategy simplex intoresponse regions, label suitably

high dimension:

use polytopes with known combinatorial structuree.g. for constructing games with many equilibria,or long Lemke-Howson computations[Savani & von Stengel, FOCS 2004,Econometrica 2006]

Construct isolated non−quasi−strict equilibrium

12

5

4

6

3

12

45

6 3

1

2

3

4 5 6

2 0 0A = 111

0 2 01 0 2B =

011

1 0 0

Construct isolated non−quasi−strict equilibrium

12

5

4

6

3

12

45

6 3

The Lemke−Howson algorithm

45

3

3 2 1

12

5 4


artificial equilibrium000 0,0

45

3

3 2 1

12

5 4


missing label 2

artificial equilibrium000 0,0

45

3

3 2 1

12

5 4


2missing label

000 0,0

45

3

3 1

12

5 4

2


2found label

000 0,0

45

3

3 2 1

12

5 4

Why Lemke-Howson works

LH finds at least one Nash equilibrium because

• finitely many "vertices"

for nondegenerate (generic) games:

• unique starting edge given missing label

• unique continuation

precludes "coming back" like here:

start at Nash equilibrium


2missing label45

3

3 2 1

12

5 4



2missing label45

3

3 1

12

5 4

2


Odd number of Nash equilibria!

2found label45

3

3 2 1

12

5 4

Nondegenerate bimatrix games

Given: m × n bimatrix game (A,B)

X = { x ∈ Rm | x ≥ 0, x1 + . . . + xm = 1 }Y = { y ∈ Rn | y ≥ 0, y1 + . . . + yn = 1 }

supp(x) = { i | xi > 0 } supp(y) = { j | yj > 0 }

(A,B) nondegenerate ⇔ ∀ x ∈X, y ∈Y:

| { j | j best response to x } | ≤ | supp(x) |,| { i | i best response to y } | ≤ | supp(y) |.

Nondegeneracy via labels

m × n bimatrix game (A,B) nondegenerate

⇔ no x X has more than m labels,no y Y has more than n labels.

E.g. x with > m labels,s labels from { 1 , . . . , m } ,

⇒ > m−s labels from { m+1 , . . . , m+n }⇔ > |supp(x)| best responses to x.⇒ degenerate.

Example of a degenerate game

3

1

4 5

2

4 5

1

3

2

2 11 34

payoffs toplayer II

= B4

Handling degenerate games

Lemke–Howson implemented by pivoting, i.e., changing fromone basic feasible solution of a linear system to another by choos-ing an entering and a leaving variable.

Choice of entering variable via complementarity (only differenceto simplex algorithm for linear programming).

Leaving variable is unique in nondegenerate games.

In degenerate games: perturb system by adding (ε, . . . ,εn)> ,creates nondegenerate system.Implemented symbolically by lexicographic rule.

n a s h e q u ilib ria o f b im a trix g a m e s · 2010. 5. 17. · n a s h e q u ilib ria o f b...

Documents