![Page 1: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/1.jpg)
Randomness for Free
Laurent Doyen
LSV, ENS Cachan & CNRS
joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger
![Page 2: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/2.jpg)
Stochastic games on graphs
• Games for synthesis
- Reactive system synthesis = finding a winning strategy in a two-player game
• Game played on a graph
- Infinite number of rounds
- Player’s moves determine successor state
- Outcome = infinite path in the graph
When is randomness more powerful ?
When is randomness for free ?
![Page 3: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/3.jpg)
• Games for synthesis
- Reactive system synthesis = finding a winning strategy in a game
• Game played on a graph
- Infinite number of rounds
- Player’s moves determine successor state
- Outcome = infinite path in the graph
When is randomness more powerful ?
When is randomness for free ?
… in game structures ?
… in strategies ?
Stochastic games on graphs
![Page 4: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/4.jpg)
Interaction: how players' moves are combined.
Information: what is visible to the players.
• Classification according to Information & Interaction
Round 1
Stochastic games on graphs
![Page 5: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/5.jpg)
Interaction: how players' moves are combined.
Information: what is visible to the players.
• Classification according to Information & Interaction
Round 2
Stochastic games on graphs
![Page 6: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/6.jpg)
Interaction: how players' moves are combined.
Information: what is visible to the players.
• Classification according to Information & Interaction
Round 3
Stochastic games on graphs
![Page 7: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/7.jpg)
Mode of interaction
• Classification according to Information & Interaction Interaction
General case: concurrent & stochastic
Player 1’s move
Player 2’s move
Probability distribution on successor state
Players choose their moves simultaneously and independently
-player games
![Page 8: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/8.jpg)
Mode of interaction
• Classification according to Information & Interaction Interaction
General case: concurrent & stochastic
Player 1’s move
Player 2’s move
Probability distribution on successor state
Players choose their moves simultaneously and independently
-player games (MDP)if A1 or A2 is a singleton.
![Page 9: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/9.jpg)
Mode of interaction
Interaction
General case: concurrent & stochastic
Separation of concurrency & probabilities
![Page 10: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/10.jpg)
Mode of interaction
Interaction
Special case: turn-based
Player 1 state
Player 2 state
In each state, one player’s move determines successor
![Page 11: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/11.jpg)
Knowledge
• Classification according to Information & Interaction
In state , player sees such that
Information
General case: partial observation
Two partitions and
Player 1’s view
Player 2’s view
![Page 12: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/12.jpg)
Knowledge
Information
Special case 1: one-sided complete observation
or
Player 1’s view
Player 2’s view
![Page 13: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/13.jpg)
Knowledge
Information
Special case 2: complete observation
and
Player 1’s view
Player 2’s view
![Page 14: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/14.jpg)
Classification
• Classification according to Information & Interaction
complete observation
one-sided complete obs.
partial observation
turn-based
concurrent
Information Interaction
-player games
![Page 15: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/15.jpg)
Classification
• Classification according to Information & Interaction
complete observation MDP
one-sided complete obs. POMDP
turn-based
concurrent
=
InformationInteraction
-player games
Markov Decision Process
partial observation
=
![Page 16: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/16.jpg)
Strategies
A strategy for Player is a function that maps histories to probability distribution over actions.
Strategies are observation-based:
![Page 17: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/17.jpg)
Strategies
A strategy for Player is a function that maps histories to probability distribution over actions.
Strategies are observation-based:
Special case: pure strategies
![Page 18: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/18.jpg)
Objectives
An objective (for player 1) is a measurable set of infinite sequences of states:
![Page 19: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/19.jpg)
Objectives
An objective (for player 1) is a measurable set of infinite sequences of states:
Examples:
- Reachability, safety - Büchi, coBüchi - Parity - Borel
B C
Büchi coBüchi
Reachability Safety
![Page 20: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/20.jpg)
Value
Probability of finite prefix of a play:
induces a unique probability measure on measurable sets of plays:
and a value function for Player 1:
![Page 21: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/21.jpg)
Value
Probability of finite prefix of a play:
induces a unique probability measure on measurable sets of plays:
and a value function for Player 1:
Our reductions preserve values and existence of optimal strategies.
![Page 22: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/22.jpg)
Outline
• Randomness in game structures
- for free with complete-observation, concurrent
- for free with one-sided, turn-based
• Randomness in strategies
- for free in (PO)MDP
• Corollary: undecidability results
![Page 23: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/23.jpg)
Preliminary
Rational probabilities
Probabilities only
1. Make all probabilistic states have two successors
12 3 4
1
2
3 4
![Page 24: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/24.jpg)
Preliminary
Rational probabilities
Probabilities only
2. Use binary encoding of p and q-p to simulate [Zwick,Paterson 96] :
![Page 25: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/25.jpg)
Preliminary
Rational probabilities
Probabilities only
2. Use binary encoding of p and q-p to simulate [Zwick,Paterson 96] :
![Page 26: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/26.jpg)
Preliminary
Rational probabilities
Probabilities only
2. Use binary encoding of p and q-p to simulate [Zwick,Paterson 96] :
• Polynomial-time reduction, for parity objectives
![Page 27: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/27.jpg)
Outline
• Randomness in game structure
- for free with complete-observation, concurrent
- for free with one-sided, turn-based
• Randomness in strategies
- for free in (PO)MDP
• Corollary: undecidability results
![Page 28: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/28.jpg)
Classification
• Classification according to Information & Interaction
complete observation
one-sided complete obs.
partial observation
turn-based
concurrent
Information Interaction
![Page 29: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/29.jpg)
Reduction
Simulate probabilistic state with concurrent state
![Page 30: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/30.jpg)
Probability to go from s to s’0:
Reduction
Simulate probabilistic state with concurrent state
![Page 31: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/31.jpg)
Reduction
Simulate probabilistic state with concurrent state
Probability to go from s to s’0:
If , then
![Page 32: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/32.jpg)
Probability to go from s to s’0:
If , then
If , then
Reduction
Simulate probabilistic state with concurrent state
![Page 33: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/33.jpg)
Probability to go from s to s’0:
If , then
If , then
Simulate probabilistic state with concurrent state
Each player can unilaterally decide to simulate the original game.
Reduction
![Page 34: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/34.jpg)
Simulate probabilistic state with concurrent state
Player 1 can obtain at least the value v(s) by playing all actions uniformly at random: v’(s) ≥ v(s)
Player 2 can force the value to be at most v(s) by playing all actions uniformly at random: v’(s) ≤ v(s)
Reduction
![Page 35: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/35.jpg)
For {complete,one-sided,partial} observation, given a game with rational probabilities
we can construct a concurrent game with deterministic transition function
such that:
and existence of optimal observation-based strategies is preserved.
The reduction is in polynomial time for complete-observation parity games.
Reduction
![Page 36: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/36.jpg)
Information leak from the back edges…
Partial information
![Page 37: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/37.jpg)
Information leak from the back edges…
Partial information
No information leakage if all probabilities have same denominator q
use 1/q=gcd of all probabilities in G
![Page 38: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/38.jpg)
Information leak from the back edges…
Partial information
No information leakage if all probabilities have same denominator q
use 1/q=gcd of all probabilities in GThe reduction is then exponential.
![Page 39: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/39.jpg)
Example
![Page 40: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/40.jpg)
Outline
• Randomness in game structure
- for free with complete-observation, concurrent
- for free with one-sided, turn-based
• Randomness in strategies
- for free in (PO)MDP
• Corollary: undecidability results
![Page 41: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/41.jpg)
Overview
• Classification according to Information & Interaction
complete observation
one-sided complete obs.
partial observation
turn-based
concurrent
Information Interaction
![Page 42: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/42.jpg)
Reduction
Simulate probabilistic state with imperfect information turn-based states
Player 1 states
Player 2 state Player 1
observation
![Page 43: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/43.jpg)
Reduction
Each player can unilaterally decide to simulate the probabilistic stateby playing uniformly at random:
Player 2 chooses states (s,0), (s,1) unifiormly at random Player 1 chooses actions 0,1 unifiormly at random
Simulate probabilistic state with imperfect information turn-based states
![Page 44: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/44.jpg)
Games with rational probabilities can be reduced to turn-based-games with deterministic transitions and (at least) one-sided complete observation.
Values and existence of optimal strategies are preserved.
Reduction
![Page 45: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/45.jpg)
Randomness for free
In transition function (this talk):
![Page 46: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/46.jpg)
When randomness is not for free
• Complete-information turn-based ( ) games
- in deterministic games, value is either 0 or 1 [Martin98]- MDPs with reachability objective can have values in [0,1] Randomness is not for free.
• -player games (MDP & POMDP)
- in deterministic partial info -player games, value is either 0 or 1 [see later] - MDPs have value in [0,1]
Randomness is not for free.
![Page 47: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/47.jpg)
Randomness for free
In transition function (this talk):
![Page 48: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/48.jpg)
Randomness for free
In transition function (this talk):
In strategies (Everett’57, Martin’98, CDHR’07):
![Page 49: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/49.jpg)
Randomness in strategies
Example- concurrent, complete observation
- reachability
Reminder: randomized strategy for Player :
pure strategy for Player :
Randomized strategies are more powerful
![Page 50: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/50.jpg)
Randomness in strategies
Example- turn-based, one-sided complete
observation - reachability
Randomized strategies are more powerful
![Page 51: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/51.jpg)
Outline
• Randomness in game structure
- for free with complete-observation, concurrent
- for free with one-sided, turn-based
• Randomness in strategies
- for free in (PO)MDP
• Corollary: undecidability results
![Page 52: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/52.jpg)
(p)
Randomness in strategies
Randomized
strategy
probability p
a
b(1-p)
RandStrategy(p) { x = rand(0…1) if x≤p then out(a) else out(b) }
![Page 53: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/53.jpg)
(p)
Randomness in strategies
Randomized
strategy
probability p
a
b(1-p)
Random generator
Uniform probability
x [0,1]
Pure strategya
Pure strategyb
x ≤ p
x > p
![Page 54: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/54.jpg)
Randomness for free in strategies
For every randomized observation-based strategy , there exists a pure observation-based strategy such that:
1½-player game (POMDP), s0 initial state.
Proof. (assume alphabet of size 2, and fan-out = 2)
Given σ, we show that the value of σ can be obtained as the average of the value of pure strategies σx:
![Page 55: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/55.jpg)
Randomness for free in strategies
Strategies σx are obtained by «de-randomization» of σ
Assume σ(s1)(a) = p and σ(s1)(b) = 1-p
σ is equivalent to playing σ0 with frequency p and σ1 with frequency 1-p where:
![Page 56: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/56.jpg)
Randomness for free in strategies
Strategies σx are obtained by «de-randomization» of σ
Assume σ(s1)(a) = p and σ(s1)(b) = 1-p
σ is equivalent to playing σ0 with frequency p and σ1 with frequency 1-p where:
![Page 57: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/57.jpg)
Randomness for free in strategies
Equivalently, toss a coin x [0,1],play σ0 if x ≤ p, and play σ1 if x > p.Playing σ in G can be viewed as a sequence of coin tosses:
s0
a
b
x ≤ p
x > p
toss x [0,1]
[ p = σ(s1)(a) ]
![Page 58: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/58.jpg)
s1
Randomness for free in strategies
Equivalently, toss a coin x [0,1],play σ0 if x ≤ p, and play σ1 if x > p.Playing σ in G can be viewed as a sequence of coin tosses:
s0
a
b
x ≤ p
x > p
y ≤ qa
y > qa
y ≤ qb
y > qb
s’
s’’
s’’’
s’’’’
s1
s1
s1
toss x [0,1]
toss y [0,1]
[ p = σ(s1)(a) ]
[ qa = δ(s0,a)(s1) ]s’
[ qb = δ(s0,b)(s1 ) ]s’’’
![Page 59: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/59.jpg)
Randomness for free in strategies
Equivalently, toss a coin x [0,1],play σ0 if x ≤ p, and play σ1 if x > p.Playing σ in G can be viewed as a sequence of coin tosses:
s0
a
b
x ≤ p
x > p
y ≤ qa
y > qa
y ≤ qb
y > qb
…
…
s’
s’’
s’’’
s’’’’
s1
s1
s1
s1
toss x [0,1]
toss y [0,1]
toss x [0,1]
[ p = σ(s1)(a) ]
[ qa = δ(s1,a)(s1) ]s’
[ qb = δ(s1,b)(s1 ) ]s’’’
![Page 60: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/60.jpg)
Randomness for free in strategies
Given an infinite sequence x=(xn)n≥0 [0,1]ω, define for all s0, s1, …, sn:
σx is a pure and observation-based strategy !
σx plays like σ, assuming that the result of the coin tosses is the sequence x.
The value of σ is the « average of the outcome » of the strategies σx.
![Page 61: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/61.jpg)
Randomness for free in strategies
Assume x=(xn)n≥0 and y=(yn)n≥0 are fixed.
Then σ determines a unique outcome…which is the outcome of some pure strategy !
s0
a
b
x0 ≤ p0
x0 > p0
y0 ≤ q0a
y0 > q0a
y0 ≤ q0b
y0 > q0b
s’
s’’
s’’’
s’’’’
s1
s1
s1
s1
x1 > p1
y1 > q1b
![Page 62: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/62.jpg)
Randomness for free in strategies
s0
a
b
x0 ≤ p
x0 > p
y0 ≤ qa
y0 > qa
y0 ≤ qb
y0 > qb
…
…
s’
s’’
s’’’
s’’’’
s1
s1
s1
s1
Assume x=(xn)n≥0 and y=(yn)n≥0 are fixed.
Let where
![Page 63: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/63.jpg)
Randomness for free in strategies
s0
a
b
x0 ≤ p
x0 > p
y0 ≤ qa
y0 > qa
y0 ≤ qb
y0 > qb
…
…
s’
s’’
s’’’
s’’’’
s1
s1
s1
s1
Assume x=(xn)n≥0 and y=(yn)n≥0 are fixed.
Let where
Theorem:
![Page 64: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/64.jpg)
Randomness for free in strategies
s0
a
b
x0 ≤ p
x0 > p
y0 ≤ qa
y0 > qa
y0 ≤ qb
y0 > qb
…
…
s’
s’’
s’’’
s’’’’
s1
s1
s1
s1
Assume x=(xn)n≥0 and y=(yn)n≥0 are fixed.
Let where
Playing σ in G is equivalent to choosing (xn) and (yn) uniformly at random, and then producing
![Page 65: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/65.jpg)
Randomness for free in strategies
s0
a
b
x0 ≤ p
x0 > p
y0 ≤ qa
y0 > qa
y0 ≤ qb
y0 > qb
…
…
s’
s’’
s’’’
s’’’’
s1
s1
s1
s1
Assume x=(xn)n≥0 and y=(yn)n≥0 are fixed.
Let where
Playing σ in G is equivalent to choosing (xn) and (yn) uniformly at random, and then producing
• The random sequences (xn) and (yn) can be generated separately, and independently !
• σx is a pure strategy !
pure strategy
![Page 66: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/66.jpg)
Proof
The value of σ is the « average of the outcome » of the strategies σx.
![Page 67: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/67.jpg)
Proof
The value of σ is the « average of the outcome » of the strategies σx.
By Fubini’s theorem…
![Page 68: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/68.jpg)
Proof
The value of σ is the « average of the outcome » of the strategies σx.
Pure and randomized strategies are equally powerful in POMDPs !
![Page 69: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/69.jpg)
Randomness for free
In transition function:
In strategies:
![Page 70: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/70.jpg)
Outline
• Randomness in game structure
- for free with complete-observation, concurrent
- for free with one-sided, turn-based
• Randomness in strategies
- for free in (PO)MDP
• Corollary: undecidability results
![Page 71: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/71.jpg)
Corollary
Almost-sure coBüchi (and positive Büchi) with randomized (or pure) strategies is undecidable for POMDPs.
Using [BaierBertrandGrößer’08]:
Using randomness for free in transition functions :
Almost-sure coBüchi (and positive Büchi) is undecidable for deterministic turn-based one-sided complete observation games.
![Page 72: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger](https://reader038.vdocument.in/reader038/viewer/2022110206/56649cdc5503460f949a78d6/html5/thumbnails/72.jpg)
Thank you !
Questions ?