TRANSCRIPT
Institute for Applied Information Processing and Communications (IAIK)
1
TU Graz/Computer Science/IAIK Graz, 2009 AK Design and Verification
Presentation for the Lecture:
AK Design and Verification
by
Robert Könighofer
A Strategy Improvement Algorithm for Mean Payoff Games
http://www.iaik.tugraz.at
Contents
Main Source: H. Björklund, S. Sandberg, S. Vorobyov: A combinatorial strongly subexponential strategy improvement algorithm for mean payoff games. [1]
Recap: Mean Payoff Games
0-Mean Partition Problem
Longest Shortest Path Problem
Algorithm
Improvements
Ergodic Partition Problem
Complexity
Appendix: proof sketches of main theorems
Recap: Mean Payoff Games (MPG)
Given: Finite, directed, edge weighted, leafless graph:
G = (V, E, w), V = VMAX ∪ VMIN, w: E → {-W, …, 0, …, W}
Example:
[Figure: example game graph with integer edge weights; the legend distinguishes VMAX and VMIN vertices]
Recap: Mean Payoff Games (MPG)
Notation:
2 players: MIN and MAX
Play ρ = e0 e1 e2 e3 …
payoff(ρ) = val(ρ) = average(w(ei)):
$$\mathrm{val}(\rho) = \lim_{k \to \infty} \frac{1}{k} \sum_{i=0}^{k} w(e_i)$$
Positional strategy for MAX: σMAX: VMAX → V such that (v, σ(v)) ∈ E
Goals:
MAX: maximize val(ρ)
MIN: minimize val(ρ)
MPG: Properties
Optimal strategy is positional
val(v) = val(ρ) when ρ starts in v and both players play optimally
Optimal σMAX: ensures payoff(ρ) ≥ val(v)
Optimal σMIN: ensures payoff(ρ) ≤ val(v)
Play ρ = finite stem + loop ⇒ val(ρ) = average(loop)
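The stem-plus-loop property can be illustrated with a minimal Python sketch. It assumes both players have already fixed positional strategies, so every vertex has exactly one successor; the names `play_value`, `succ` and `weight` and the three-vertex graph are hypothetical and not taken from the slides.

```python
from fractions import Fraction

def play_value(start, succ, weight):
    """Mean payoff of the play from `start` when every vertex has one fixed successor."""
    first_visit = {}   # vertex -> index of its first visit in `path`
    path = []          # vertices in visiting order
    v = start
    while v not in first_visit:
        first_visit[v] = len(path)
        path.append(v)
        v = succ[v]
    # Everything before the first repeated vertex is the finite stem; the rest is the loop.
    loop = path[first_visit[v]:]
    total = sum(weight[(u, succ[u])] for u in loop)
    return Fraction(total, len(loop))

# Hypothetical example: stem a -> b, loop b -> c -> b with weights 3 and -1.
succ = {"a": "b", "b": "c", "c": "b"}
weight = {("a", "b"): 5, ("b", "c"): 3, ("c", "b"): -1}
print(play_value("a", succ, weight))  # 1
```

The stem weight 5 does not influence the result, because a finite prefix vanishes in the limit of the average.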
Computational Problems
Decision Problem: Can MAX guarantee payoff > p from v0 ?
p-Mean Partition: Divide V into V≤p and V>p
MAX can guarantee payoff >p from all v∈V>p
MIN can guarantee payoff ≤p from all v∈V≤p
Ergodic Partition: Compute val(v) for all v∈V
0-Mean Partition: Approach
MPG → LSP (Longest Shortest Path Problem)
Solve LSP by Strategy Improvement:
σ = σ0
while(σ changes):
σ = Improve(σ)
Longest Shortest Path Problem
Given: Finite, directed, edge weighted graph:
G = (V, t, E, w) V = VMAX ∪ VMIN
t = unique sink, t ∉ VMAX
here: σ0, avoiding negative cycles
Find: positional σ such that the shortest path from every v to t is as long as possible in Gσ = G ∩ σ
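Transformation MPG → LSP
Insert ‘retreat vertex’ t
For all vi ∈ VMAX: add edge ei = (vi, t), w(ei) = 0
Add edge (t, t) with w(t, t) = 0
Example:
[Figure: the example game graph extended by the retreat vertex t and 0-weight edges into t; the legend distinguishes VMAX and VMIN vertices]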
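A minimal Python sketch of this transformation, assuming the game is given as two vertex sets and a dictionary mapping edges (u, v) to integer weights. The function name `mpg_to_lsp` and the data layout are assumptions made for illustration only.

```python
def mpg_to_lsp(v_max, v_min, weight, sink="t"):
    """Return (vertices, weight) of the LSP instance built from an MPG."""
    if sink in v_max or sink in v_min:
        raise ValueError("the retreat vertex must be a fresh vertex")
    lsp_weight = dict(weight)          # keep all original edges
    for v in v_max:
        lsp_weight[(v, sink)] = 0      # MAX may retreat to t at no cost
    lsp_weight[(sink, sink)] = 0       # 0-weight loop at t
    vertices = set(v_max) | set(v_min) | {sink}
    return vertices, lsp_weight

# Hypothetical two-vertex game: MAX owns a, MIN owns b.
vertices, lsp_weight = mpg_to_lsp({"a"}, {"b"}, {("a", "b"): 2, ("b", "a"): -3})
print(sorted(vertices))  # ['a', 'b', 't']
```

Only MAX vertices receive a retreat edge, so MIN can reach t only through a vertex where MAX chooses to retreat.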
Relation MPG ↔ LSP
MPG                        LSP
v ∈ V>0                    dist(v,t) = ∞
MAX: enforce pos. loop     MAX: enforce pos. loop
MIN: enforce neg. loop     MAX: retreat, dist(v,t) < ∞
[Figure: two example MPGs, each shown next to its LSP transformation with retreat vertex t]
Relation MPG ↔ LSP
Admissible strategy: enforces positive loops OR retreat
We iterate over admissible strategies only
σ0: go to t from every v ∈ VMAX
Remember our approach:
MPG → LSP (Longest Shortest Path Problem)
Solve LSP by Strategy Improvement:
σ = σ0
while(σ changes):
σ = Improve(σ)
Quality of a Strategy
Only admissible strategies
MIN: take shortest path to t
(any other loop is positive)
valσ(v): shortest distance from v to t in Gσ
σ is better than σ* (σ > σ*) iff: ∀v∈V: valσ(v) ≥ valσ*(v) AND
∃v∈V: valσ(v) > valσ*(v)
Computing valσ(v): Shortest Path Problem
Given: Finite, directed, edge weighted graph:
G = (V, t, E, w) t = unique sink
Find: shortest path from every v to t
Algorithms:
Dijkstra's algorithm: only non-negative weights
Bellman-Ford algorithm: also negative weights
Bellman Ford Algorithm [3]
foreach v in V:
    distance[v] = ∞
    succ[v] = None
distance[t] = 0
succ[t] = t
do |V|-1 times:
    foreach (u,v) in E:
        if(distance[v] + w(u,v) < distance[u]):
            distance[u] = distance[v] + w(u,v)
            succ[u] = v
[Figure: relaxing edge (u,v) with w(u,v) = 2 and distance[v] = 2 lowers distance[u] from 5 to 4]
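The pseudocode above can be written as a short runnable Python sketch. It computes distances towards the sink (not from a source) and assumes the graph has no negative loops. The function name, the edge-dictionary layout and the early exit when nothing changes are additions for illustration and not part of the slides.

```python
import math

def bellman_ford_to_sink(vertices, weight, sink):
    """Shortest distance from every vertex to `sink`; `weight` maps (u, v) -> w(u, v)."""
    distance = {v: math.inf for v in vertices}
    succ = {v: None for v in vertices}
    distance[sink] = 0
    succ[sink] = sink
    for _ in range(len(vertices) - 1):
        changed = False
        for (u, v), w in weight.items():
            # Relax edge (u, v): going via v may shorten the way from u to the sink.
            if distance[v] + w < distance[u]:
                distance[u] = distance[v] + w
                succ[u] = v
                changed = True
        if not changed:   # assumption-free shortcut: a full pass without updates is final
            break
    return distance, succ

# Hypothetical example graph with one negative edge and no negative loop.
weight = {("a", "t"): 4, ("a", "b"): 1, ("b", "t"): -2, ("t", "t"): 0}
distance, succ = bellman_ford_to_sink({"a", "b", "t"}, weight, "t")
print(distance["a"], succ["a"])  # -1 b
```

Vertices that cannot reach the sink keep the distance ∞, which is exactly the case the LSP reduction uses to identify vertices in V>0.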
Bellman Ford Algorithm
Example:
[Figure: the example LSP graph with vertices t, v0, v1, v2, v3 (legend: VMAX and VMIN vertices) and the restricted graph Gσ with edges e1 to e5]
distances:
     0  1  2  3
t    0  0  0  0  0  0  0
v0   ∞  ∞  0  0  0  0  0
v1   ∞  0  0  0  0  0  0
v2   ∞  ∞  ∞  ∞  6  -8 -8
v3   ∞  ∞  ∞  -1 -1 -1 -1
Bellman Ford Algorithm
Another Example:
[Figure: graph with vertices t, v0, v1 and edges e1, e2, e3 with weights 12, -1 and -2, plus a 0-weight loop at t]
distances:
     0  1  2
t    0  0  0  0
v0   ∞  12 12 9
v1   ∞  ∞  10 10
Bellman Ford does not work with negative loops
Switching the strategy
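Picking another successor in a VMAX vertex:
Notation: σ[x → y]
σ[x → y](x) = y
σ[x → y](a) = σ(a) for a ≠ x
Switch σ[v → u] is:
attractive iff: w(v,u) + valσ(u) > valσ(v)
profitable iff: σ[v → u] > σ (expensive to check)
[Figure: edge (v,u) with w(v,u) = 3, valσ(u) = 2 and valσ(v) = 4; after the switch the value of v is 5]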
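The attractiveness test is a purely local comparison, which the following Python sketch makes explicit. The function name `attractive_switches` and the data layout (strategy as a dictionary, `val` holding the current values valσ) are assumptions; the numbers reuse the values 3, 2 and 4 from the illustration above in an otherwise hypothetical graph.

```python
def attractive_switches(strategy, weight, val):
    """List all switches (v, u) of MAX vertices with w(v,u) + val[u] > val[v]."""
    result = []
    for (v, u), w in weight.items():
        if v not in strategy:      # v is not a MAX vertex, MAX cannot switch here
            continue
        if strategy[v] == u:       # edge already chosen by the strategy
            continue
        if w + val[u] > val[v]:
            result.append((v, u))
    return result

# MAX vertex v currently goes to t with value 4; the edge to u promises 3 + 2 = 5.
strategy = {"v": "t"}
weight = {("v", "t"): 4, ("v", "u"): 3, ("u", "t"): 2}
val = {"v": 4, "u": 2, "t": 0}
print(attractive_switches(strategy, weight, val))  # [('v', 'u')]
```

Checking profitability directly would require recomputing all values for the switched strategy, which is why the cheap local test is used instead.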
Main Theorems
Theorem 5.1: Switch is attractive ⇒ switch is profitable
Also holds for combinations of switches
Theorem 5.2: No more attractive switches ⇒ strategy is at least as good as any other admissible strategy.
Putting the pieces together
solve_0-mean_partition(G’):
    G = MPGtoLSP(G’)
    σ0 = computeInitialAdmissibleStrategy(G)
    σ = σ0
    while(σ changes):
        (σ, distance) = Improve(σ, G)
    VMAX = emptySet
    VMIN = emptySet
    foreach v in (G.V\t):
        distance[v] == ∞ ? VMAX.add(v) : VMIN.add(v)
    return (VMAX, VMIN, σ)
Improve(σ, G):
    Gσ = restrictGraph(G, σ)
    distance = BellmanFord(Gσ)
    (v, u, failed) = findAttractiveSwitch(distance)
    if(failed):
        return (σ, distance)
    return (σ[v->u], None)
findAttractiveSwitch(distance):
    foreach (v,u) in (G.E \ Gσ.E):
        if(w(v,u) + distance[u] > distance[v]):
            return (v,u,0)
    return (None, None, 1)
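The whole loop can be sketched in runnable Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the game is given as vertex sets plus an edge-weight dictionary, the initial strategy retreats everywhere, one attractive switch is applied per iteration, and the graph contains no non-positive loop that consists of MIN vertices only. All function and variable names are hypothetical.

```python
import math

def distances_to_sink(vertices, edges, sink):
    """Bellman-Ford towards the sink; `edges` maps (u, v) -> weight."""
    dist = {v: math.inf for v in vertices}
    dist[sink] = 0
    for _ in range(len(vertices) - 1):
        for (u, v), w in edges.items():
            if dist[v] + w < dist[u]:
                dist[u] = dist[v] + w
    return dist

def solve_zero_mean_partition(v_max, v_min, weight, sink="t"):
    v_max, v_min = set(v_max), set(v_min)
    # MPG -> LSP: add retreat edges for MAX and a 0-weight loop at the sink.
    edges = dict(weight)
    for v in v_max:
        edges[(v, sink)] = 0
    edges[(sink, sink)] = 0
    vertices = v_max | v_min | {sink}
    sigma = {v: sink for v in v_max}     # sigma0: retreat from every MAX vertex
    while True:
        # G_sigma keeps every MIN edge but only the chosen edge of each MAX vertex.
        restricted = {(u, v): w for (u, v), w in edges.items()
                      if u not in v_max or sigma[u] == v}
        dist = distances_to_sink(vertices, restricted, sink)
        switch = next(((v, u) for (v, u), w in edges.items()
                       if v in v_max and sigma[v] != u and w + dist[u] > dist[v]),
                      None)
        if switch is None:               # no attractive switch left: sigma is optimal
            break
        sigma[switch[0]] = switch[1]
    winning_max = {v for v in vertices - {sink} if dist[v] == math.inf}
    winning_min = vertices - {sink} - winning_max
    return winning_max, winning_min, sigma

# Hypothetical game: the loop a -> b -> a has weight 2 + 1 = 3 > 0, so MAX wins everywhere.
win_max, win_min, sigma = solve_zero_mean_partition(
    {"a"}, {"b"}, {("a", "b"): 2, ("b", "a"): 1})
print(sorted(win_max), sorted(win_min), sigma)  # ['a', 'b'] [] {'a': 'b'}
```

With the weight of (a, b) changed to -2 the loop becomes negative, no switch is attractive, and MAX keeps retreating, so both vertices end up in the MIN part.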
Putting the pieces together
Example:
[Figure: the example MPG, its LSP transformation (step "MPG to LSP"), the initial strategy (step "σ = σ0") and three successive steps "σ = Improve(σ)", each graph annotated with the current distances to t]
Improvements: Switches
Any combination of attractive switches improves the strategy
Multiple switches per iteration
Try heuristics for selecting single or multiple attractive switches:
Random, all attractive switches, ...
Initial Multiple Switching
Proceeding in Stages
Improvements: Randomization
Order of switches is crucial for complexity
Facet F[u → v] = set of strategies where succ[u] = v, u ∈ VMAX
Randomization scheme [4]:
find_best_strategy(σ, G):
    if(G == Gσ):
        return σ
    while(true):
        randomly pick some F[u->v] not containing σ
        σ* = find_best_strategy(σ, G\(u,v))
        if(σ* is optimal in G):
            return σ*
        G = F[u->v]
        σ = σ[u->v]
Improvements: Randomization
Example:
[Figure: four panels on the example LSP graph with vertices t, v0, v1, v2, v3]
(G, σ): pick F[v1 → v0]
(G\(v1,v0), σ): recursive call σ* = find_best_strategy(σ, G\(v1,v0))
(G, σ*): σ* optimal in G? NO! There is an attractive switch!
(G, σ): σ = σ*, G = F[v1 → v0]
Improvements: Randomization
Example continued:
[Figure: further panels on the example LSP graph with vertices t, v0, v1, v2, v3]
(G, σ): pick F[v0 → t]
(G\(v0,t), σ): recursive call σ* = find_best_strategy(σ, G\(v0,t))
(G, σ*): σ* optimal in G? YES! No more attractive switches!
Improvements: Recomputing the measure
Switch from σ to σ* = σ[v → u]
valσ(x) = valσ*(x) for some vertices x
Compute the nodes that change their value
Run the Bellman-Ford algorithm only for these nodes
Which values change?
σ* = σ[u1 → v1][u2 → v2] ...
U = {u1, u2, ...}
Mark all vertices in U
while(U not empty):
    u = U.pop()
    foreach unmarked predecessor p of u in Gσ*:
        if w(p,x) + d[x] > d[p] for all unmarked succ x of p in Gσ*:
            mark p
            U.push(p)
[Figure: vertex p with the marked successor u and the unmarked successors x1 and x2]
Which values change?
Example:
[Figure: (G, σ) with the current distances, (G, σ*) after the switch σ* = σ[v0 → v3], and (Gσ*, σ*) with the distances d]
marked    U      u    {pi}   {xj}   condition TRUE for all xj?
v0        {v0}   v0   v2     v3     w(v2,v3) + d[v3] > d[v2]: 7-1 > -8, YES
v0, v2    {v2}   v2   -      -      -
Complexity: p-Mean Partition
Without Improvement:
n·W finite values ⇒ n²·W switches
Bellman Ford: O(n·m) ⇒ O(n³·m·W)
With Improvement:
Bellman Ford: O(ni·m), ni = |changing nodes|
Σ(ni) = n²·W
⇒ O(n²·m·W)
Complexity: p-Mean Partition
With Randomization [4]: $2^{O(\sqrt{n \log n})}$
All together: $\min\left(O(n^2 \cdot m \cdot W),\ 2^{O(\sqrt{n \log n})}\right)$
Ergodic Partition
w-, w+: smallest and biggest edge weights
val(ρ) = average(w(ei)) in [w-, w+], denominator ≤ n
Repeated p-mean partitioning:
Break interval [w-, w+] in parts of length 1/n²
Decide which v are in each interval
Min. difference between 2 values:
$$\frac{1}{n-1} - \frac{1}{n} = \frac{1}{n(n-1)} > \frac{1}{n^2}$$
⇒ Unique value in each interval
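The minimum-difference argument can be checked by brute force for small n. The Python sketch below enumerates all fractions with denominator ≤ n in [0, 1] and compares the smallest gap with 1/(n(n-1)); restricting to [0, 1] is an assumption that loses nothing, because the set of such fractions repeats with period 1. The function name `min_gap` is hypothetical.

```python
from fractions import Fraction

def min_gap(n):
    """Smallest difference between two distinct fractions p/q in [0, 1] with q <= n."""
    values = sorted({Fraction(p, q) for q in range(1, n + 1) for p in range(q + 1)})
    return min(b - a for a, b in zip(values, values[1:]))

for n in range(2, 8):
    gap = min_gap(n)
    assert gap == Fraction(1, n * (n - 1))   # attained by 1/(n-1) - 1/n
    assert gap > Fraction(1, n * n)          # so an interval of length 1/n² holds at most one value
print(min_gap(5))  # 1/20
```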
Complexity: Comparison
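                    This Approach                                                                                        Zwick / Paterson
p-Mean Partition    $\min\left(O(n^2 \cdot m \cdot W),\ 2^{O(\sqrt{n \log n})}\right)$                                   $O(n^2 \cdot m \cdot W)$
Ergodic Partition   $\min\left(O(n^3 \cdot m \cdot W \cdot (\log n + \log W)),\ 2^{O(\sqrt{n \log n})} \cdot \log W\right)$   $O(n^3 \cdot m \cdot W)$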
Summary
The first strongly subexponential algorithm
Algorithm for p-Mean Partition problem:
Longest Shortest Path Problem
Strategy improvement
Improvements
Extended to Ergodic Partition problem
References
[1] H. Björklund, S. Sandberg, S. Vorobyov, A combinatorial strongly subexponential strategy improvement algorithm for mean payoff games, in: Proc. 29th International Symposium on Mathematical Foundations of Computer Science (MFCS), Vol. 3153 of Lecture Notes in Computer Science, Springer-Verlag, 2004, pp. 673–685.
[2] U. Zwick and M. Paterson. The complexity of mean payoff games on graphs. Theor. Comput. Sci., 158:343–359, 1996.
[3] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. MIT Press and McGraw-Hill Book Company, Cambridge, MA, 2nd edition, 2001.
[4] J. Matousek, M. Sharir, and M. Welzl. A subexponential bound for linear programming. In 8th ACM Symp. on Computational Geometry, pages 1–8, 1992.
Questions / Discussion
... thank you for your attention
Appendix
Proof sketches for Theorem 5.1 and 5.2
Proof sketch: Theorem 5.1 (attractive ⇒ profitable)
Value increases in at least one vertex:
Attractive switch σ* = σ[v → u]: w(v,u) + valσ(u) > valσ(v) ⇒ valσ*(v) > valσ(v)
Values do not decrease:
New loops are positive
New paths to the sink are longer
Proof sketch: Theorem 5.1 (attractive ⇒ profitable)
New loops are positive:
Switch σ* = σ[v → u]: New loop must contain switching vertex v
[Figure: new loop of weight x through v and u; y = valσ(v) is the old distance from v to t]
valσ(u) ≤ x - w(v,u) + y
Switch is attractive: y = valσ(v) < w(v,u) + valσ(u) ≤ x + y
⇒ x > 0
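Proof sketch: Theorem 5.1 (attractive ⇒ profitable)
New paths to the sink are longer:
Switch σ* = σ[v → u]: New path from any vertex n to t must contain switching vertex v
[Figure: vertex n reaches v over a path of length a; y = valσ(v) is the old distance from v to t; x is the length of the new path from v via u to t]
valσ(u) ≤ x - w(v,u)
Switch is attractive: y = valσ(v) < w(v,u) + valσ(u) ≤ x
⇒ y < x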
Proof sketch: Theorem 5.2 (stable ⇒ optimal)
Proof for one-player games: MIN has no choices
Finite values cannot become infinite: no more attractive switches ⇒ no more new positive loops
Finite values do not improve finitely: no more attractive switches ⇒ no more new longer paths to t
Extension to two-player games: MIN does not need choices
Proof sketch: Theorem 5.2 (stable ⇒ optimal)
No more new positive loops:
Assumption: there is a new positive loop
[Figure: new loop of weight x through v and u; y is the distance from v to t]
switch attractive: y < x + y
no attractive switches: y ≥ x + y ⇒ x ≤ 0
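Proof sketch: Theorem 5.2 (stable ⇒ optimal)
No more new longer paths to t:
Assumption: there is a new longer path to t
[Figure: vertex n reaches v over a path of length a; old path of length y from v to t; new path of length x from v via u to t]
switch attractive: y < x
no attractive switches: y ≥ x ⇒ cannot have better finite values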