Nash Equilibria in Graphical Games on Trees
Edith Elkind, Leslie Ann Goldberg, Paul Goldberg
Post on 15-Dec-2015
Games and Strategies
• Games: strategic interactions between rational entities
• Solution concepts: what’s going to happen?
– dominant strategies
– Nash equilibrium
– ….
• Can it be computed?
– if your computer cannot find it, the market probably cannot either
Matrix (normal form) Games
Row player:        Column player:
     0   1              0   1
0    2   0         0    1   0
1    0   1         1    0   3
(rows: Row player's action; columns: Column player's action)
• finite set of players {1, …, n}
• each player has k actions (pure strategies): 1, …, k
• payoffs of the i-th player: P_i: {1, …, k}^n → R
Nash Equilibrium
Row player:        Column player:
     0   1              0   1
0    2   0         0    1   0
1    0   1         1    0   3
(rows: Row player's action; columns: Column player's action)
• Nash equilibrium: a strategy profile such that no one wants to deviate given the other players’ strategies, i.e., each player’s strategy is a best response to the others’ strategies:
– (0, 0) and (1, 1) are both NE
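The best-response condition above can be checked mechanically for pure strategies. The following is a minimal sketch, assuming the payoff matrices from this slide; the function name `pure_nash_equilibria` is illustrative and not from the talk.

```python
from itertools import product

# Payoffs indexed as [row action][column action], copied from the slide.
ROW = [[2, 0], [0, 1]]
COL = [[1, 0], [0, 3]]

def pure_nash_equilibria(row, col):
    """Return all pure profiles (r, c) where neither player gains by deviating."""
    n_row, n_col = len(row), len(row[0])
    equilibria = []
    for r, c in product(range(n_row), range(n_col)):
        # Row's action r must be a best response to c, and vice versa.
        row_ok = all(row[r][c] >= row[r2][c] for r2 in range(n_row))
        col_ok = all(col[r][c] >= col[r][c2] for c2 in range(n_col))
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria

print(pure_nash_equilibria(ROW, COL))  # [(0, 0), (1, 1)]
```

This brute-force check enumerates all action profiles, so it is only practical for tiny games.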
Pure vs. Mixed Strategies
Row player:        Column player:
     H   T              H   T
H    1  -1         H   -1   1
T   -1   1         T    1  -1
(rows: Row player's action; columns: Column player's action)
• NE in pure strategies may not exist!
– “matching pennies”
• Mixed strategy: a probability distribution over actions
– 50% tails, 50% heads
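Both claims on this slide can be verified directly. The sketch below assumes the matching-pennies matrices above with H = 0 and T = 1; the helper names are hypothetical.

```python
from fractions import Fraction as F

# Matching pennies, indexed [row action][column action] with H = 0, T = 1.
ROW = [[1, -1], [-1, 1]]
COL = [[-1, 1], [1, -1]]

def has_pure_ne(row, col):
    """True if some pure profile is a mutual best response (2x2 games only)."""
    return any(
        row[r][c] >= row[1 - r][c] and col[r][c] >= col[r][1 - c]
        for r in (0, 1) for c in (0, 1)
    )

def row_payoffs_vs(q):
    """Row's expected payoff for H and for T when Column plays T w.p. q."""
    return [(1 - q) * ROW[a][0] + q * ROW[a][1] for a in (0, 1)]

def col_payoffs_vs(p):
    """Column's expected payoff for H and for T when Row plays T w.p. p."""
    return [(1 - p) * COL[0][b] + p * COL[1][b] for b in (0, 1)]

print(has_pure_ne(ROW, COL))      # False
print(row_payoffs_vs(F(1, 2)))    # both entries equal: Row is indifferent
print(col_payoffs_vs(F(1, 2)))    # both entries equal: Column is indifferent
```

When each player is indifferent between H and T, any mix is a best response, so the 50/50 profile is a Nash equilibrium.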
Existence of NE
• Theorem (Nash 1951): any n-player k-action game in normal form has an equilibrium in mixed strategies
can we find one in poly-time?
2 players, n actions
• Representation: two n x n matrices
• Computation:
– all known methods take exponential time
– can it be NP-hard? no: NE always exists
– PPAD-hardness: notion of hardness for total search problems
– DGP’06: finding NE in 4-player games is PPAD-hard
– CD’06: finding NE in 2-player games is PPAD-hard
– DGP reduction uses graphical games (the topic of this talk!)
n-player 2-action games
• representation: payoffs to each player for every action profile (vector in {0, 1}^n): n·2^n numbers
• graphical games:
– players are vertices of a graph
– V’s payoff depends on actions of W in N(V) ∪ {V}
– n players, max degree d => n·2^(d+1) numbers
[Figure: graphical game on vertices T, U, V, W; W’s payoff depends on the actions of T, U, V and W]
W’s payoffs (16 cases):
t=0, u=0, v=0, w=0: 12
t=1, u=0, v=0, w=0: 31
….
t=1, u=1, v=1, w=1: -6
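To make the size comparison concrete, here is a small sketch that evaluates the two counts from this slide. The function names and the sample values n = 20, d = 3 are illustrative choices, not from the talk.

```python
def normal_form_size(n):
    """Numbers needed for an n-player 2-action game in normal form: n * 2^n."""
    return n * 2 ** n

def graphical_size(n, d):
    """Upper bound for a graphical game with max degree d: n * 2^(d+1)."""
    return n * 2 ** (d + 1)

print(normal_form_size(20))   # 20971520
print(graphical_size(20, 3))  # 320
print(2 ** (3 + 1))           # 16 cases for W, which has three neighbours
```

The graphical representation grows linearly in n for fixed d, while the normal form grows exponentially.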
Complexity: what is known
• Bounded-degree trees:
– Exp-time algorithm/poly-time approximation algorithm to find all NE (Kearns, Littman, Singh, UAI 2001)
– ??? poly-time algorithm to find a single NE (Kearns, Littman, Singh, NIPS 2001)
• Heuristics for graphs with cycles
• General graphs:
– PPAD-complete (DGP’06) even if max deg=3
Our Results (1)
• Algorithm in NIPS’01 paper is incorrect (does not always output a NE)
• We fix the NIPS’01 algorithm, but…
– our algorithm runs in poly-time on paths
– with a trick, also on cycles
– can be used to find
    • (a representation for) all NE in n^3 time, or
    • a single NE in n^2 time
Our Results (2)
• There is a graph of pathwidth 2 on which our algorithm runs in exp time
– true for all algorithms that use the basic approach of the UAI’01 paper
• The problem is PPAD-complete for bounded pathwidth graphs
• Open question: what if pathwidth = 1?
– generalizes a cool geometry problem (talk to me if you like those, or see the paper)
Warm-up: 2-player 2-action games
Row player:        Column player:
     0   1              0   1
0    2   0         0    1   0
1    0   1         1    0   3
(rows: Row player's action; columns: Column player's action)
Suppose R plays 1 w.p. r
EP(C) from playing 0: (1-r)*1
EP(C) from playing 1: r*3
1-r > 3r iff r < ¼
Suppose C plays 1 w.p. c
EP(R) from playing 0: (1-c)*2
EP(R) from playing 1: c*1
(1-c)*2 > c iff c < 2/3
[Figure: best-response curves BR(R) and BR(C) in the unit square with axes r and c; breakpoints at r = 1/4 and c = 2/3]
mixed NE: r=1/4, c=2/3
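The indifference argument on this slide can be written out as a short computation. This is a sketch assuming a 2x2 game that has a fully mixed equilibrium (so the denominators below are nonzero); `mixed_ne_2x2` is a hypothetical helper name.

```python
from fractions import Fraction as F

# Payoffs indexed as [row action][column action], copied from the slide.
ROW = [[2, 0], [0, 1]]
COL = [[1, 0], [0, 3]]

def mixed_ne_2x2(row, col):
    """Fully mixed NE (r, c), where r = Pr[Row plays 1], c = Pr[Column plays 1].

    Assumes such an equilibrium exists, i.e. both denominators are nonzero
    and the results lie strictly between 0 and 1.
    """
    # r makes Column indifferent between its two actions.
    r = F(col[0][0] - col[0][1],
          col[0][0] - col[0][1] - col[1][0] + col[1][1])
    # c makes Row indifferent between its two actions.
    c = F(row[0][0] - row[1][0],
          row[0][0] - row[1][0] - row[0][1] + row[1][1])
    return r, c

print(mixed_ne_2x2(ROW, COL))  # (Fraction(1, 4), Fraction(2, 3))
```

Exact rational arithmetic avoids rounding issues when comparing against the values on the slide.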
Algorithm for Trees (KLS’01)
• Potential best response: v is a PBR to w iff, when W plays w, there is a NE for T_V (the subtree rooted at V) in which V plays v.
• upstream pass: construct PBR_V(w) from PBR_U1(v), PBR_U2(v) and PBR_U3(v)
• downstream pass: root selects its strategy based on the children’s PBRs; propagates to leaves
[Figure: tree in which W is the parent of V and V has children U1, U2, U3; T_V is the subtree rooted at V; V plays v, W plays w]
KLS algorithm: running time
• For bounded-degree trees, constructs all PBRs (and then finds a NE) in exp time
• FPTAS for an ε-NE:
– superimpose PBR with a δ-grid
– there exists a grid point δ-close to PBR
– ε-NE (ε = poly(δ)): no one can gain more than ε by deviating
Computing PBR: Example
• Payoffs to V (P_uvw is V’s payoff when U, V, W play u, v, w):
– P_000 = 1, P_001 = -9, P_100 = 9, P_101 = -1, P_u1w = 0 for u, w = 0, 1
• E0 = EP(V) from playing 0: (1-u)(1-w)*1 + (1-u)w*(-9) + u(1-w)*9 + uw*(-1) = 1 + 8u - 10w
• E1 = EP(V) from playing 1: 0
• E0 = E1 iff w = (8u+1)/10 = f(u)
[Figure: path U - V - W; plot of PBR_U(v) (axes v and u, ticks at .5 and 1) and plot of PBR_V(w) (axes w and v, ticks at w = .1 and w = .9)]
(v, u) → (f(u), v)
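The algebra in this example can be checked numerically. The sketch below assumes the payoff table from this slide; the dictionary `P` and the helper names are illustrative.

```python
from fractions import Fraction as F

# V's payoffs P[(u, v, w)] for v = 0; all payoffs with v = 1 are 0 (slide).
P = {(0, 0, 0): 1, (0, 0, 1): -9, (1, 0, 0): 9, (1, 0, 1): -1}

def e0(u, w):
    """Expected payoff to V from playing 0 when U and W play 1 w.p. u and w."""
    return sum(
        (u if a else 1 - u) * (w if b else 1 - w) * P[(a, 0, b)]
        for a in (0, 1) for b in (0, 1)
    )

def f(u):
    """Value of w at which V is indifferent between 0 and 1."""
    return (8 * u + 1) / F(10)

def best_responses(u, w):
    """Set of V's pure best responses (E1 is always 0 in this example)."""
    gain = e0(u, w)
    if gain > 0:
        return {0}
    if gain < 0:
        return {1}
    return {0, 1}

u = F(1, 2)
print(e0(u, F(1, 4)) == 1 + 8 * u - 10 * F(1, 4))  # True
print(f(0), f(1))                                  # 1/10 9/10
print(best_responses(u, f(u)))                     # {0, 1}
```

The values f(0) = 1/10 and f(1) = 9/10 match the ticks .1 and .9 on the w axis of the figure.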
Trees: too many segments
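[Figure: tree in which W is the parent of V and V has children T and U; plots of PBR_T(v), PBR_U(v) and PBR_V(w) with marked values t1, t2, u1, u2, v1, v2]
(v, t), (v, u) → (f(u, t), v)
KLS (NIPS’01): can “trim” PBR
Incorrect!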
Solutions?
• Solution 1 (for paths): algorithm of UAI’01 paper, careful analysis
– the number of segments/rectangles in each PBR is O(n^2)
– running time O(n^3)
• Solution 2 (for paths): can pick a subset of each PBR consisting of O(n) segments
– O(n^2) running time
O(n^3) algorithm
• f(u) = (au+b)/(cu+d); u*: cu*+d = 0
– [v1, v2] x {u} => {f(u)} x [v1, v2]
– {v} x [u1, u2] => [f(u1), f(u2)] x {v} if u* not in [u1, u2]
– {v} x [u1, u2] => ([0, f(u2)] ∪ [f(u1), 1]) x {v} if u1 ≤ u* ≤ u2
• PBR_V(w) vs PBR_U(v): new segments at v=1 and v=0, some segments break into two --- double in size?
• no: count the event points
[Figure: plot of PBR_U(v) (axes v and u, with u* marked) and plot of PBR_V(w) (axes w and v); mapping (v, u) → (f(u), v)]
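The rule for mapping a segment {v} x [u1, u2] through f can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes integer a, b, c, d with ad - bc ≠ 0, that the pole u* is not an endpoint, and it uses min/max so that both increasing and decreasing f are covered (the slide writes the increasing case). The function names are hypothetical.

```python
from fractions import Fraction as F

def mobius(a, b, c, d, u):
    """f(u) = (a*u + b) / (c*u + d), evaluated exactly."""
    u = F(u)
    return (a * u + b) / (c * u + d)

def clip01(lo, hi):
    """Intersection of [lo, hi] with [0, 1], as a list of 0 or 1 intervals."""
    lo, hi = max(lo, F(0)), min(hi, F(1))
    return [(lo, hi)] if lo <= hi else []

def image_of_segment(a, b, c, d, u1, u2):
    """w-intervals in [0, 1] covered by f([u1, u2]); one or two pieces."""
    u1, u2 = F(u1), F(u2)
    if a * d - b * c == 0:
        raise ValueError("constant f is not handled in this sketch")
    if c * u1 + d == 0 or c * u2 + d == 0:
        raise ValueError("pole at an endpoint is not handled in this sketch")
    f1, f2 = mobius(a, b, c, d, u1), mobius(a, b, c, d, u2)
    lo, hi = min(f1, f2), max(f1, f2)
    pole_inside = c != 0 and u1 < F(-d, c) < u2
    if not pole_inside:
        return clip01(lo, hi)
    # With the pole inside, the image is (-inf, lo] ∪ [hi, +inf): two pieces.
    return clip01(F(0), lo) + clip01(hi, F(1))

# f(u) = (8u+1)/10 has no pole: one segment.
print(image_of_segment(8, 1, 0, 10, 0, 1))
# f(u) = (4u-3)/(8u-4) has its pole at u* = 1/2: the segment breaks into two.
print(image_of_segment(4, -3, 8, -4, 0, 1))
```

The second call shows how a single segment can split in two when u* falls inside [u1, u2], which is the source of the "double in size?" concern on this slide.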
Hardness results
• pathwidth 2: our algorithm is not poly-time
– and neither is any two-pass algorithm that stores subsets of PBR
• pathwidth > k: (probably) no algorithm is poly-time
– finding NE in this case is PPAD-hard
– idea: modify the construction in DGP’06
Good Nash Equilibria
Row player:        Column player:
     0   1              0   1
0    2   0         0    1   0
1    0   1         1    0   3
(rows: Row player's action; columns: Column player's action)
Nash equilibria:
• (0, 0): total payoff is 3
• (1, 1): total payoff is 4
• (1/4, 2/3): total payoff is 17/12
not all NE are created equal…
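The three totals above can be reproduced with a few lines of exact arithmetic. This sketch assumes the payoff matrices of the running 2x2 example, with r and c the probabilities that Row and Column play 1; `total_payoff` is an illustrative name.

```python
from fractions import Fraction as F

# Payoffs indexed as [row action][column action], copied from the slide.
ROW = [[2, 0], [0, 1]]
COL = [[1, 0], [0, 3]]

def total_payoff(r, c):
    """Sum of both players' expected payoffs under the mixed profile (r, c)."""
    total = F(0)
    for a in (0, 1):
        for b in (0, 1):
            prob = (r if a else 1 - r) * (c if b else 1 - c)
            total += prob * (ROW[a][b] + COL[a][b])
    return total

for r, c in [(F(0), F(0)), (F(1), F(1)), (F(1, 4), F(2, 3))]:
    print((r, c), total_payoff(r, c))
```

In this game the mixed equilibrium is the worst of the three in terms of total payoff.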
What is a good NE?
• maximize the sum of the players’ payoffs
• guarantee to each player i a payoff of at least t_i
• (almost) equal payoffs
• any combination of those….
Can we use the PBR data structure to compute those?
Can we represent it?
• any GG with integer payoffs on a tree has a rational NE
• Any PBR consists of segments and rectangles with rational coordinates
• Yet, total payoff-maximizing NE may be irrational
Our result (EGG’07): for any algebraic α, deg(α) = n, there is a GG with integer payoffs on a path of length O(n) in which, in the best NE, player 1 plays α
Approximation
• Can we use the FPTAS of KLS’01?
– superimpose PBR with a δ-grid
• Observation: there is a grid point δ-close to best NE
– look for best point on the grid close to PBR
• dynamic programming
– ε-NE (ε = poly(δ)): no one can gain more than ε by deviating
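To illustrate what an ε-NE on a grid looks like, the sketch below snaps the mixed equilibrium of the running 2x2 example to a δ-grid and measures the largest gain from deviating. It only illustrates the definition, not the dynamic program; the payoff matrices come from the earlier slides, while δ = 1/20 and the helper names are arbitrary choices.

```python
from fractions import Fraction as F

# Payoffs indexed as [row action][column action], from the 2x2 example.
ROW = [[2, 0], [0, 1]]
COL = [[1, 0], [0, 3]]

def snap(x, delta):
    """Nearest point to x on the grid {0, delta, 2*delta, ...}."""
    return round(x / delta) * delta

def max_gain(r, c):
    """Largest amount any player can gain by deviating from (r, c)."""
    row_pure = [(1 - c) * ROW[a][0] + c * ROW[a][1] for a in (0, 1)]
    col_pure = [(1 - r) * COL[0][b] + r * COL[1][b] for b in (0, 1)]
    row_now = (1 - r) * row_pure[0] + r * row_pure[1]
    col_now = (1 - c) * col_pure[0] + c * col_pure[1]
    return max(max(row_pure) - row_now, max(col_pure) - col_now)

delta = F(1, 20)
r, c = snap(F(1, 4), delta), snap(F(2, 3), delta)
print(r, c)            # grid point nearest the exact mixed NE
print(max_gain(r, c))  # the epsilon for which (r, c) is an epsilon-NE
```

A profile is an exact NE precisely when `max_gain` returns 0, which is the case at (1/4, 2/3).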
True Nash
• ε-NE is not always appropriate
– what if players are not willing to lose ε?
• Can we find a (true) NE that is ε-close to the best (true) NE?
• Idea:
– add borders of rectangles in PBR to the grid
– only consider grid points in PBR
Bounded Payoff Nash
• Similar algorithm works --- FPTAS
– Also for other kinds of “good” NE
• If all payoff bounds are rational, there is a BP NE that is “almost” rational (deg ≤ 2)
• Open question: can we compactly represent all bounded payoff NE?
– perhaps by incorporating payoff bounds into PBR?