csr2011 june14 16_30_ibsen-jensen

The complexity of solving reachability games using value and strategy iteration

Kristoffer Arnsfelt Hansen, Rasmus Ibsen-Jensen, Peter Bro Miltersen

Aarhus University, Denmark

CSR 2011, 14th June

Overview

What are concurrent reachability games?

Two standard algorithms for solving concurrent reachability games: the value iteration algorithm and the strategy iteration algorithm

Exemplify important facts for the proof of the time lower bound for both algorithms

Matrix games (von Neumann 1928)

 0 -1  1
 1  0 -1
-1  1  0
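A minimal sketch of what "solving" this matrix game means, assuming (the slide does not say) that the entries are payoffs to the row player, who maximizes, while the column player minimizes. The helper names `row_guarantee` and `column_guarantee` are illustrative only. The sketch checks that the uniform mixed strategy guarantees each player an expected payoff of 0, so under these assumptions the value of the game is 0.

```python
from fractions import Fraction

# The 3x3 matrix from the slide; entries are assumed to be payoffs to the row player.
A = [[0, -1, 1],
     [1, 0, -1],
     [-1, 1, 0]]

def row_guarantee(matrix, x):
    """Smallest expected payoff the row player gets with mixed strategy x."""
    rows = range(len(matrix))
    cols = range(len(matrix[0]))
    return min(sum(x[i] * matrix[i][j] for i in rows) for j in cols)

def column_guarantee(matrix, y):
    """Largest expected payoff the column player concedes with mixed strategy y."""
    rows = range(len(matrix))
    cols = range(len(matrix[0]))
    return max(sum(matrix[i][j] * y[j] for j in cols) for i in rows)

uniform = [Fraction(1, 3)] * 3
lower = row_guarantee(A, uniform)     # row player can secure at least this
upper = column_guarantee(A, uniform)  # column player can hold the row player to this
print(lower, upper)
```

Since the lower and upper guarantees coincide, no other strategy can do better for either side.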


Concurrent reachability games (Everett 1957; de Alfaro, Henzinger, Kupferman 1998)

Dante* vs. Lucifer*

Each entry can be either 0, 1 or a pointer

* Naming convention from Hansen, Koucky and Miltersen, 2009

Histories and strategies

History: Sequence of positions and choices for each player in each position.

Strategy: Map from histories to probability distributions over choices in the position we arrive at after the history

S1: Set of strategies for Dante

S2: Set of strategies for Lucifer

H1/H2: Sets of stationary strategies for Dante/Lucifer (strategies that depend only on the position we arrive at after the history)

Payoffs

v(i,σ,π): The probability of eventually reaching a 1 from position i, if Dante plays by strategy σ and Lucifer by π.

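Value of i (Everett 1957)

$$v_i := \sup_{\sigma \in S_1} \inf_{\pi \in S_2} v(i,\sigma,\pi) = \inf_{\pi \in S_2} \sup_{\sigma \in S_1} v(i,\sigma,\pi)$$

The supremum may be restricted to Dante's stationary strategies:

$$v_i := \sup_{\sigma \in H_1} \inf_{\pi \in S_2} v(i,\sigma,\pi) = \inf_{\pi \in S_2} \sup_{\sigma \in S_1} v(i,\sigma,\pi)$$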

Algorithmic problems

Quantitatively solving a game: Given the game, compute the value of all positions.

Strategically solving a game: Given the game and ε > 0, compute σ such that for all π and i: v(i,σ,π) > v_i − ε.

Value iteration (Shapley 1953)

G_t: A modified version of G, where Dante loses after t moves.

Value iteration computes the value of each position in G_t in iteration t, on the basis of the value of each position in G_{t-1}.
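The iteration can be sketched on a small instance. The following is a hedged illustration, not the authors' implementation: it runs value iteration on the Generalized Purgatory game P(N, 2) described later in these slides, and it assumes that an undershooting guess sends Dante back to the first position (the slides only state what happens on a correct guess and on an overshoot). With m = 2 every position is a 2x2 matrix game, so the standard closed form for 2x2 games replaces a general linear-programming solver. The names `value_2x2` and `value_iteration_purgatory` are made up for this sketch.

```python
from fractions import Fraction

def value_2x2(a, b, c, d):
    """Value of the matrix game [[a, b], [c, d]] where the row player maximizes."""
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:          # saddle point in pure strategies
        return maximin
    return (a * d - b * c) / (a + d - b - c)  # fully mixed case

def value_iteration_purgatory(N, t):
    """Values of the N positions of G_t for P(N, 2); position k means k correct guesses so far."""
    v = [Fraction(0)] * N           # in G_0 Dante has already lost
    for _ in range(t):
        new_v = []
        for k in range(N):
            # Payoff of a correct guess: heaven (1) from the last position, else the next position.
            correct = Fraction(1) if k == N - 1 else v[k + 1]
            # Rows: Dante guesses 1 or 2. Columns: Lucifer hides 1 or 2.
            # Guess 1 / hidden 2 is an undershoot: back to position 0 (assumption).
            # Guess 2 / hidden 1 is an overshoot: hell (0).
            new_v.append(value_2x2(correct, v[0], Fraction(0), correct))
        v = new_v
    return v

for t in range(1, 4):
    print(t, value_iteration_purgatory(1, t))
```

For the one-matrix game P(1, 2) this yields 1/2, 2/3, 3/4 for t = 1, 2, 3, that is t/(t+1), which approaches the value 1 only slowly.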

Our results: Lower bound for value iteration

There exists a concurrent reachability game G, with N matrices and m rows and columns in each matrix, so that val(G) = 1 and $\mathrm{val}(G_t) \le 3m^{-N/2}$ for $t = 2^{m^{N/2}}$.
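To get a feel for the magnitudes, the two expressions in this bound can be evaluated for concrete parameters. The sketch below simply plugs numbers into the formulas read as t = 2^(m^(N/2)) and 3·m^(-N/2); the function name is hypothetical, N is assumed even so that the exponent is an integer, and nothing here checks for which N and m the theorem actually applies.

```python
from fractions import Fraction

def value_iteration_lower_bound(N, m):
    """Return (t, bound) with t = 2^(m^(N/2)) and bound = 3 * m^(-N/2).

    N is assumed even so that m^(N/2) is an integer.
    """
    if N % 2 != 0:
        raise ValueError("this sketch assumes even N")
    root = m ** (N // 2)
    return 2 ** root, Fraction(3, root)

t, bound = value_iteration_lower_bound(8, 2)
print(t, bound)  # 65536 3/16
```

With N = 8 and m = 2 the formulas give t = 65536 iterations and a value bound of 3/16, while the true value of the game is 1.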

Our results: Upper bound for value iteration

For any concurrent reachability game G, $\mathrm{val}(G) - \mathrm{val}(G_t) < \varepsilon$ for $t = (1/\varepsilon)^{m^{O(N)}}$.

Value iteration example

[Figure sequence: value iteration on an example game, starting from G_0. Each frame shows the current value of every position; the per-frame numbers are omitted here.]

Strategy iteration (Chatterjee, de Alfaro, Henzinger '06)

Was conjectured to be fast

Our results: Upper bound for strategy iteration

An ε-optimal strategy is computed after $t = (1/\varepsilon)^{m^{O(N)}}$ iterations of strategy iteration.

This follows from the corresponding result for value iteration.

Our results: Lower bound for strategy iteration

There exists a concurrent reachability game G, with N matrices, for large N, and m rows and columns in each matrix, so that val(G) = 1 and the strategy obtained by strategy iteration guarantees winning probability at most $4m^{-N/2}$ for $t = 2^{m^{N/4}}$.

Strategy iteration, m=2

N   Number of iterations needed to get over 1/2
7   18446744073709551617
8   340282366920938463463374607431768211457
9   115792089237316195423570985008687907853269984665640564039457584007913129639937

Strategy iteration: Before iteration 1

1. Start strategy for Dante := Uniform

Strategy iteration: Iteration 1

1. Best response for Lucifer
2. Calculate values from those strategies
3. Update strategy for Dante
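The three steps can be traced on the smallest instance. The sketch below is an illustration under stated assumptions, not the algorithm in full generality: it hard-codes the one-matrix Generalized Purgatory game P(1, 2), assumes that an undershooting guess restarts the game, and uses closed forms that hold only for this instance. In particular, Lucifer's best response to any strategy that guesses 2 with positive probability is to always hide 1, because hiding 2 would let Dante win with probability 1. The function name is hypothetical.

```python
from fractions import Fraction

def strategy_iteration_p12(iterations):
    """Trace strategy iteration on P(1, 2); returns (value, updated Pr[guess 1]) per iteration."""
    x1 = Fraction(1, 2)  # start strategy for Dante: uniform over guesses 1 and 2
    trace = []
    for _ in range(iterations):
        # 1. Best response for Lucifer: always hide 1 (special to this game).
        # 2. Value under these strategies: Dante wins exactly when he guesses 1.
        v = x1
        # 3. Update Dante: play optimally in the matrix game [[1, v], [0, 1]],
        #    whose optimal row strategy puts probability 1 / (2 - v) on guess 1.
        x1 = 1 / (2 - v)
        trace.append((v, x1))
    return trace

for step, (v, x1) in enumerate(strategy_iteration_p12(3), start=1):
    print(step, v, x1, 1 - x1)
```

The values 1/2, 2/3, 3/4 and the updated strategies (2/3, 1/3), (3/4, 1/4), (4/5, 1/5) produced by this sketch creep toward 1 one small step at a time.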

[Figure sequence: strategy iteration on an example game, one frame per step, showing the strategies and position values after each update. The numbers on the edges are the probability that the edge is used. Edges without a number have probability 0.33333 to be used.]

Generalized Purgatory P(N,m)

Lucifer repeatedly hides a number between 1 and m.
Dante must try to guess the number.
If he guesses correctly N times in a row, he goes to heaven.
If he ever guesses incorrectly, overshooting Lucifer's number, he goes to hell.
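The rules translate directly into matrices whose entries are 0, 1 or a pointer. The sketch below builds them under two assumptions that are not spelled out on the slide: 1 stands for heaven and 0 for hell, and an undershooting guess resets the streak, sending Dante back to the first matrix. The function name and the `("goto", k)` encoding of pointers are made up for illustration.

```python
def purgatory_matrices(N, m):
    """Matrices of P(N, m). Entry [i][j] is the outcome when Dante guesses i + 1
    and Lucifer hides j + 1: 0 (hell), 1 (heaven) or ("goto", k), a pointer to matrix k."""
    matrices = []
    for k in range(N):  # k = number of correct guesses in a row so far
        correct = 1 if k == N - 1 else ("goto", k + 1)
        matrix = []
        for guess in range(1, m + 1):
            row = []
            for hidden in range(1, m + 1):
                if guess == hidden:
                    row.append(correct)        # correct guess: advance or win
                elif guess > hidden:
                    row.append(0)              # overshoot: hell
                else:
                    row.append(("goto", 0))    # undershoot: restart (assumption)
            matrix.append(row)
        matrices.append(matrix)
    return matrices

for row in purgatory_matrices(1, 2)[0]:
    print(row)
```

For P(1, 2) this gives the single matrix with rows `[1, ("goto", 0)]` and `[0, 1]`.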

Interesting fact

The probability that Dante goes to heaven from purgatory is nearly 1, if he plays well enough.

Exemplifying important facts

Value iteration on 1 matrix

Strategy iteration on 1 matrix

Strategy iteration on 3 matrices

[Figure sequence: frames labelled t := 1, 2, 3 with displayed values 0.5, 0.66667 and 0.75, together with the strategy probabilities at each step; the remaining per-frame numbers are omitted here.]

The end

Open problems: Find a fast algorithm for the problem

There exists a PSPACE algorithm for the problem, but it is not fast.

Thanks for listening
