lecture 7 synthesis of reactive control protocolsmurray/courses/afrl-sp12/l7...lecture 7 synthesis...

34
Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California Institute of Technology AFRL, 25 April 2012 Outline Review: networked control systems and cooperative control systems Asynchronous execution / group messaging systems (virtual synchrony) Verification of async control protocols for multi-agent, cooperative control Applications of model checking to Alice’s actuation interface

Upload: others

Post on 30-May-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Lecture 7Synthesis of Reactive Control

Protocols

Richard M. MurrayNok Wongpiromsarn Ufuk TopcuCalifornia Institute of Technology

AFRL, 25 April 2012Outline

• Review: networked control systems and cooperative control systems• Asynchronous execution / group messaging systems (virtual synchrony)• Verification of async control protocols for multi-agent, cooperative control• Applications of model checking to Alice’s actuation interface

Page 2: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

DR,NP,S STO,NP,S

DR,P,S STO,P,S

no coll.-free path

coll.-free path found

stopped & obs detected

no coll.-free path

coll.-free path found

passing finished orobs disappeared

DR,PR,S

no coll.-free path & #lanes = 1

STO,PR,S

no coll.-free path &#DR,PR,S < M

coll.-free path found

passing finished or obs disappeared

STO,A

BACKUP

no coll.-free path & #lanes > 1

no coll.-free path &#DR,PR,S >= M &#lanes > 1

backup finished& #BACKUP >= N

no coll.-free path& #DR,PR,S >= M

& #lanes = 1

backup finished& #BACKUP < N

DR,Ano coll.-free path

coll.-free path found

STO,BDR,Bno coll.-free path

coll.-free path found

no coll.-free path

FAILED

no coll.-free path & #lanes>1

PAUSEDOFF-ROAD

no coll.-free path& #lanes=1

ROAD REGION

no coll.-free path

coll.-free path forDR,A found

coll.-free path forDR,PR,S found

Alice’s Logic Planner

2

Given a specification , whether the planner is correct with respect to depends on the environment’s actions (e.g., how obstacles move)

• a “correct” planner needs to ensure that is satisfied for all the possible valid behaviors of the environment

� �

How to design such a correct planner?

Page 3: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Open System SynthesisP

C

y

An open system is a system whose behaviors can be affected by external influencey

E x

Open (synchronous) synthesis:

Given

• a system that describes all the possible actions- plant actions y are controllable- environment actions x are uncontrollable

• a specification

find a strategy for the controllable actions which will maintain the specification against all possible adversary moves, i.e.,

�(x, y)

8x · �(x, f(x))

f(x)

E CPx0

x1

x2

x3

time

y0 = f(x0)

y1 = f(x0x1)

y2 = f(x0x1x2)

y3 = f(x0x1x2x3)

y

x

E

CP

3

Page 4: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Consider the synthesis of a reactive system with input x and output y, specified by the linear temporal formula .

• The system contains 2 components S1 (i.e., “environment”) and S2 (i.e., “reactive module”)

- Only S1 can modify x- Only S2 can modify y

• Want to show that S2 has a winning strategy for y against all possible x scenarios the environment may present to it.

- Two-person game: treat environment as adversary‣ S2 does its best, by manipulating y, to maintain

‣ S1 does its best, by manipulating x, to falsify

• If a winning strategy for S2 exists, we say that is realizable

Reactive System Synthesis

Reactive systems are open systems that maintain an ongoing interaction with their environment rather than producing an output on termination.

�(x, y)

�(x, y)�(x, y)

�(x, y)

4

yx

S1

S2

Page 5: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Satisfiability ≠ Realizability

• Realizability should guarantee the specification against all possible (including adversarial) environment (Rosner 98)

‣ To solve the problem one must find a satisfying tree where the branching represents all possible inputs

• Satisfiability of only ensures that there exists at least one behavior, listing the running values of x and y that satisfies

‣ There is a way for the plant and the environment to cooperate to achieve

• Existence of a winning strategy for S2 can be expressed by the AE-formula

�(x, y)�(x, y)

�(x, y)

8x9y · �(x, y)

x0,y0

x0,y1 x1,y2

x0,y3 x1,y4 x0,y5 x1,y6

x 2 {x0, x1}x0 x1

x0 x0x1 x1

5

Page 6: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

The Runner Blocker System

R B Goal

Runner R tries to reach Goal. Blocker B tries to intercept and stop R.

6

Page 7: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

win

lose lose

The Runner Blocker System

7

Page 8: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Solving Reactive System Synthesis• Solution is typically given as the winning set

• The winning set is the set of states starting from which there exists a strategy for S2 to satisfy the specification for all the possible behaviors of S1

• A winning strategy can then be constructed by saving intermediate values in the winning set computation

• Worst case complexity is double exponential

• Construct a nondeterministic Buchi automaton from ⇒ first exponent

• Determinize Buchi automaton into a deterministic Rabin automaton ⇒ second exponent

• Follow a similar procedure as in closed system synthesis and construct the product of the system and the deterministic Rabin automaton

• Find the set of states starting from which all the possible runs in the product automaton are accepting ⇒ This set can be obtained by computing the recurrent and the attractor sets

• Special Cases of Lower Complexity

• For a specification of the form or , the controller can be synthesized in O(N2) time where N is the size of the state space

• Avoid translation of the formula to an automaton and determinization of the automaton

�(x, y)

⇤p,⌃p,⇤⌃p ⌃⇤p

8

Page 9: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

• Transition system

• Specification

• Define the set

• Define the predecessor operator by

• The set of all the states starting from which WIN is satisfiable (if the plant and the environment to cooperate) can be computed efficiently by the iteration sequence

From Tarski-Knaster Theorem:

- There exists a natural number n such that Rn = Rn-1- Such an Rn is the minimal solution of the fix-point equation

- The minimal solution of the above fix-point equation is denoted by

Special Case: Satisfiability

TS = (S, Act,!, I,AP, L)

� = ⌃p

WIN , {s 2 S : s |= p}Pre9 : 2S ! 2S

R = WIN [ Pre9(R)

R0 = WINRi = Ri�1 [ Pre9(Ri�1),8i > 0

Pre9(R) = {s 2 S : 9r 2 R s.t. s ! r}

µR.(WIN [ Pre9(R))

9

Page 10: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

The Runner Blocker System

R0

R1

R2

R3

win

lose lose

R4

Page 11: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

• Transition system

• Specification

• Define the set

• Define the operator and by

• The set of all the states starting from which the controller can force the system into WIN can be computed efficiently by the iteration sequence

- There exists a natural number n such that Rn = Rn-1- Such Rn is the minimal solution of the fix-point equation

- The minimal solution of the above fix-point equation is denoted by

Reachability in Adversarial SettingTS = (S, Act,!, I,AP, L)

� = ⌃p

WIN , {s 2 S : s |= p}Pre8 : 2S ! 2S Pre98 : 2S ! 2S

Pre8(R) = {s 2 S : 8r 2 S if s ! r, then r 2 R}= the set of states whose all successors are in R

Pre89(R) = Pre8(Pre9(R))

= the set of states whose all successors

have at least one successor in R

R0 = WINRi = Ri�1 [ Pre89(Ri�1),8i > 0

R = WIN [ Pre89(R)

µR.(WIN [ Pre89(R))11

Page 12: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

The Runner Blocker System

R0

R1

R3

win

lose lose

Page 13: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

DR,NP,S STO,NP,S

DR,P,S STO,P,S

no coll.-free path

coll.-free path found

stopped & obs detected

no coll.-free path

coll.-free path found

passing finished orobs disappeared

DR,PR,S

no coll.-free path & #lanes = 1

STO,PR,S

no coll.-free path &#DR,PR,S < M

coll.-free path found

passing finished or obs disappeared

STO,A

BACKUP

no coll.-free path & #lanes > 1

no coll.-free path &#DR,PR,S >= M &#lanes > 1

backup finished& #BACKUP >= N

no coll.-free path& #DR,PR,S >= M

& #lanes = 1

backup finished& #BACKUP < N

DR,Ano coll.-free path

coll.-free path found

STO,BDR,Bno coll.-free path

coll.-free path found

no coll.-free path

FAILED

no coll.-free path & #lanes>1

PAUSEDOFF-ROAD

no coll.-free path& #lanes=1

ROAD REGION

no coll.-free path

coll.-free path forDR,A found

coll.-free path forDR,PR,S found

More Complicated Case

Game Automata Approach

• Consider the specification as the winning condition in an infinite two-person game between input player (S1) and output player (S2).

• Decide whether player S2 has a winning strategy, and if this is the case construct a finite state winning strategy.

13

Page 14: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Game Structures

V = {x, y},X = {x},⌃X = {x0, x1},Y = {y},⌃Y = {y0, y1, y2},x0 |= ✓e, x1 6|= ✓e,

(x0, y0) |= ✓s,

(xi, yj) 6|= ✓s,8i, j 6= 0,

((x0, yi), xj) |= ⇢e,8i, j,((x1, y0), x0) |= ⇢e,

((x1, y0), x1) 6|= ⇢e,

((x1, yi), x0) 6|= ⇢e,8i 2 {1, 2},((x1, yi), x1) |= ⇢e,8i 2 {1, 2},((x0, y0), x0, yi) |= ⇢s,8i,((x0, y0), x1, y0) |= ⇢s,

((x0, y0), x1, yi) 6|= ⇢s,8i 6= 0,

. . .

x0,y0

x0,y1 x0,y2 x1,y0

x1,y1 x1,y2

x0

x0 x0 x1

x0x0

x1

x1

x1

x1

A game structure is a tuple G = (V,X ,Y, ✓e, ✓s, ⇢e, ⇢s,AP , L,')

• V = {v1, . . . , vn} is a finite set of state variables. ⌃Vis the set of all the possible assignments to variables

in V

• X ✓ V is a set of input variables

• Y = V \ X is a set of output variables

• ✓e(X ) is a proposition characterizing the initial states

of the environment

• ✓s(V) is a proposition characterizing the initial states

of the system

• ⇢e(V,X 0) is a proposition characterizing the transition

relation of the environment

• ⇢s(V,X 0,Y 0) is a proposition characterizing the tran-

sition relation of the system

• AP is a set of atomic propositions

• L : ⌃V ! 2

APis a labeling function

• ' is an LTL formula characterizing the winning con-

dition

primed copy of representsthe set of next input variables

X

14

Page 15: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Autonomous Car Example

Game Structure G = (V,X ,Y, ✓e, ✓s, ⇢e, ⇢s,AP , L,')

• X (environment): obstacles, other cars, pedestrians

• Y (plant): vehicle state (drive VS stop, passing?,

reversing?, etc)

• ✓e describes the valid initial states of the environ-

ment, e.g., where obstacles can be

• ✓s describes the valid initial states of the vehicle,

e.g., the stop state

• ⇢e describes how obstacles may move

• ⇢s describes the valid transitions of the vehicle state

• ' describes the winning condition, e.g., vehicle does

not get stuck

15

Page 16: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Playsinfinite or the last state in the sequence has no valid successor

Game structure G = (V,X ,Y, ✓e, ✓s, ⇢e, ⇢s,AP , L,')

• A play of G is a maximal sequence of states � = s0s1 . . . satisfying s0 |= ✓e ^ ✓s and(sj , sj+1) |= ⇢e ^ ⇢s,8j � 0.

– Initially, the environment chooses an assignment sX 2 ⌃X such that sX |= ✓e andthe system chooses an assignment sY 2 ⌃Y such that (sX , sY) |= ✓e ^ ✓s.

– From a state sj , the environment chooses an input sX 2 ⌃X such that (sj , sX ) |=⇢e and the system chooses an output sY 2 ⌃Y such that (s, sX , sY) |= ⇢s.

• A play � is winning for the system if either

– � = s0s1 . . . sn is finite and (sn, sX ) 6|= ⇢e,8sX 2 ⌃X , or

– � is infinite and � |= '.

Otherwise � is winning for the environment .

x0,y0

x0,y1 x1,y0

x1,y1 x1,y2

x0

x0 x0 x1

x0x0

x1

x1

x1

x1

x0,y2

16

' = ⇤⌃(x = x0 ^ y = y2)

• � =

�(x0, y0), (x0, y2), (x0, y1)

�!is winning for the system

• � =

�(x0, y0)

�!is winning for the environment

Page 17: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

StrategiesGame structure G = (V,X ,Y, ✓e, ✓s, ⇢e, ⇢s,AP , L,')

• A strategy for the system is a function f : M ⇥ ⌃V ⇥ ⌃X ! M ⇥ ⌃Y such that

for all s 2 ⌃V , sX 2 ⌃X , m 2 M , if f(m, s, sX ) = (m0, sY) and (s, sX ) |= ⇢e, then

(s, sX , sY) |= ⇢s.

• A play � = s0s1 . . . is compliant with strategy f if f(mi, si, si+1|X ) = (mi+1, si+1|Y),8i.

• A strategy f is winning for the system from state s 2 ⌃V if all plays that start from sand are compliant with f are winning for the system. If such a winning strategy exists,

we call s a winning state for the system.

memory domain

f(m, (x0, y0), x0) = (m, y2)f(m, (x0, y0), x1) = (m, y0)f(m, (x0, y1), x0) = (m, y0)f(m, (x0, y1), x1) = (m, y1)

f(m, (x0, y2), x0) = (m, y1)f(m, (x0, y2), x1) = (m, y0)f(m, (x1, y0), x0) = (m, y2)f(m, (x1, y0), x1) = (m, y2)

f(m, (x1, y1), x0) = (m, y2)f(m, (x1, y1), x1) = (m, y2)f(m, (x1, y2), x0) = (m, y2)f(m, (x1, y2), x1) = (m, y0)

x0,y0

x0,y1 x1,y0

x1,y1 x1,y2

x0

x0 x0 x1

x0x0

x1

x1

x1

x1

x0,y2

Gx0,y0

x0,y1 x1,y0

x1,y1 x1,y2

x0 x0 x1

x0x0

x1

x1

x1

x1

x0,y2

f

Is f winning for the system?

17

Page 18: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Winning Games

x0,y0

x0,y1 x1,y0

x1,y1 x1,y2

x0

x0 x0 x1

x0x0

x1

x1

x1

x1

x0,y2

Gx0,y0

x0,y1 x1,y0

x1,y1 x1,y2

x0 x0 x1

x0x0

x1

x1

x1

x1

x0,y2

f

A game structure G = (V,X ,Y, ✓e, ✓s, ⇢e, ⇢s,AP , L,') is winning for the system if for each

sX 2 ⌃X such that sX |= ✓e, there exists sY 2 ⌃Y such that (sX , sY) |= ✓s and (sX , sY) is a

winning state for the system

x0 |= ✓e but x1 6|= ✓e

(x0, y0) is a winning state for the system

(x0, y0) |= ✓s

G is winning for the system

18

Page 19: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Runner Blocker Example

19

s0

B

s1

s2

s3

s4

R

Game Structure G = (V,X ,Y, ✓e, ✓s, ⇢e, ⇢s,AP , L,')

• X := {x}, ⌃X = {s0, s1, s2, s3, s4}

• Y := {y}, ⌃Y = {s0, s1, s3, s4}

• ✓e := (x = s2)

• ✓s := (y = s0)

• ⇢e :=

�(x = s2) =) (x

0 6= s2)�^

�(x 6= s2) =)

(x

0= s2)

• ⇢s :=

�(y = s0 _ y = s4) =) (y

0= s1 _ y

0=

s3)�^

�(y = s1 _ y = s3) =) (y

0= s0 _ y

0=

s4)�^ (y

0 6= x

0)

• ' describes the winning condition, e.g., ⇧(y = s4)

Page 20: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Runner Blocker Example

Play: An infinite sequence of system (blocker + runner) states such that s0 is a valid initial state and (sj, sj+1) satisfies the transition relation of the blocker and the runner

Strategy: A function that gives the next runner state, given a finite number of previous system states of the current play, the current system state and the next blocker state

Winning state: A state starting from which there exists a strategy for the runner to satisfy the winning condition for all the possible behaviors of the blocker

� = s0s1 . . .

Winning game: For any valid initial blocker state sx, there exists a valid initial runner state sy such that (sx, sy) is a winning state

Solving game: Identify the set of winning states

20

q0

B

q1

q2

q3

q4

R

Page 21: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Richard M. Murray, Caltech CDSEECI, Mar 2011

Solving Game StructuresGeneral solutions are hard• Worst case complexity is double exponential (roughly in number of states)

Special cases are easier• For a specification of the form or , the controller can be synthesized

in O(N2) time where N is the size of the state space

Another special case: GR(1) formulas

Thm (Piterman, Sa’ar, Pneuli, 2007) A game structure G with a GR(1) winning condition can be solved by a symbolic algorithm in time proportional to

More useful form:

• Can show (tomorrow) that this can be “converted” to GR(1) form

21

⇤p,⌃p,⇤⌃p ⌃⇤p

' = (⇤⌃p1 ^ . . . ^⇤⌃pm)| {z }'e

=) (⇤⌃q1 ^ . . . ^⇤⌃qn)| {z }'s

nm�⌃V �3

Page 22: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Solving Reachability Games• Game structure G = (V,X ,Y , ✓e, ✓s,⇢e,⇢s,AP,L,')• For a proposition p, let [[p]] = {s ⇥ ⌃V s � p}• For a set R, let

[[⌥⇥R]] = ⇧s ⇥ ⌃V ⇤s�X ⇥ ⌃X , (s, s�X ) � ⇢e ⇥ ⌅s�Y ⇥ ⌃Y s.t. (s, s�X , s�Y) � ⇢s and (s�X , s�Y) ⇥ R⌥• Reachability game: ' = ⇥p• The set of winning states can be computed e�ciently by the iteration sequence

R0 = �Ri+1 = [[p]] ⌃ [[⌥⇥Ri]],⇤i ⇤ 0

– Ri+1 is the set of states starting from which the system can force the play

to reach a state satisfying p within i steps

– There exists a natural number n such that Rn = Rn⇥1– Such Rn is the minimal solution of the fix-point equation R = [[p]]⌃[[⌥⇥R]]– In µ-calculus, the minimal solution of the above fix-point equation is de-

noted by µR(p ⇧⌥⇥R)

similar to the Pre∀∃ operator we saw earlier

least fixpoint

22

Page 23: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Runner Blocker Example: R1

[[p]]

23

Page 24: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Runner Blocker Example: R2

[[p]]

24

Page 25: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Runner Blocker Example: R3 = R4 = ...

[[p]]

25

Page 26: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Solving Safety Games• Game structure G = (V,X ,Y , ✓e, ✓s,⇢e,⇢s,AP,L,')• For a proposition p, let [[p]] = {s ⇥ ⌃V s � p}• For a set R, let

[[⌥⇥R]] = ⇧s ⇥ ⌃V ⇤s�X ⇥ ⌃X , (s, s�X ) � ⇢e ⇥ ⌅s�Y ⇥ ⌃Y s.t. (s, s�X , s�Y) � ⇢s and (s�X , s�Y) ⇥ R⌥• Safety game: ' = �p• The set of winning states can be computed e�ciently by the iteration sequence

R0 = ⌃VRi+1 = [[p]] ⌃ [[⌥⇥Ri]],⇤i ⇤ 0

– Ri+1 is the set of states starting from which the system can force the play

to stay in states satisfying p for i steps

– There exists a natural number n such that Rn = Rn⇥1– Such Rn is the maximal solution of the fix-point equation R = [[p]]⌃[[⌥⇥R]]– In µ-calculus, the minimal solution of the above fix-point equation is de-

noted by ⌫R(p ⇧ ⌥⇥R)greatest fixpoint

26

Page 27: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Runner Blocker Example: R1 = R2 = ...

27

Page 28: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Solving GamesGame structure G = (V,X ,Y, ✓e, ✓s, ⇢e, ⇢s,AP , L,')

' The set of winning states for the system

⇥p µX(p ⌅⇧⇥X)�p ⌫X(p ⇤⇧⇥X)�⇥ p ⌫XµY ⇥(p ⇤⇧⇥X) ⌅⇧⇥ Y ⌅

28

• ⌫X(p ⇥⌅�X) is the largest set S of states such that

– all the states in S satisfy p, and

– starting from a state in S, the system can force the play to transition to a state

in S

• ⌫XµY ⇥(p ⇥⌅�X) ⇤⌅� Y ⌅ is the set of state starting from which the system can force

the play to satisfy p infinitely often

– The disjunction and µY operators ensure that the system is in a state where it

can force the play to reach a state satisfying p

– The conjunction and the ⌫X operators ensure that the above statement is true at

all time

Page 29: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Games and Realizability

x0,y0

x0,y1 x1,y0

x1,y1 x1,y2

x0

x0 x0 x1

x0x0

x1

x1

x1

x1

x0,y2

G

Xi , (x = xi), Yi , (y = yi), X

0i , (x0 = xi), Y

0i , (y0 = yi)

✓e , X0, ✓s , Y0

⇢e ,�(X1 ^ Y0) =) X

00

�^

�(X1 ^ Y1) =) X

01

�^

�(X1 ^ Y2) =) X

01

⇢s ,�(X0 ^ Y0 ^X

00) =) (Y 0

1 _ Y

02)

�^

�(X0 ^ Y0 ^X

01) =) (Y 0

0)�^

�(X0 ^ Y1 ^X

00) =) (Y 0

0 _ Y

02)

�^

�(X0 ^ Y1 ^X

01) =) (Y 0

1)�^

�(X0 ^ Y2 ^X

00) =) Y

01

�^

�(X0 ^ Y2 ^X

01) =) Y

00

�^

�(X1 ^ Y0 ^X

00) =) Y

02

�^

�(X1 ^ Y1 ^X

01) =) Y

02

�^

�(X1 ^ Y2 ^X

01) =) (Y 0

0 _ Y

01)

' , ⇤⌃(X0 ^ Y2)

Game structure G = (V,X ,Y, ✓e, ✓s, ⇢e, ⇢s,AP , L,')

The system wins in G i↵ the specification

= (✓e =) ✓s) ^�✓e =) ⇤((�⇢e) =) ⇢s)

�^

�(✓e ^⇤⇢e) =) '

is realizable.

Given an LTL specification , we construct G as follows

• ✓e and ✓s include the non-temporal specification parts of

• ⇢e and ⇢s include the local limitations on the next values of variables in X and Y

• ' includes all the remaining properties in that are not included in ✓e, ✓s, ⇢e and ⇢s

29

Page 30: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Games and RealizabilityMore intuitive specification

• Fulfillment of the system safety depends on the liveness of the environment

- The system may violate its safety if it ensures that the environment cannot fulfill its liveness

• implies

- If is realizable, a controller for is also a controller for (but not vice versa)

- If the system wins in , then is realizable (but not vice versa)

• By adding extra output variables that represent the memory of whether the system or the environment violate their initial requirements or their safety requirements, we can construct a game such that is won by the system iff is realizable

0 =�✓e ^⇤⇢e ^ 'e

�=)

�✓s ^⇤⇢s ^ 's

0

0

G = (V,X ,Y, ✓e, ✓s, ⇢e, ⇢s,AP , L,'e =) 's) 0

• 0is realizable

– The system always picks y = y0

• is not realizable

• The system does not win in G

x0,y0

x0,y1 x1,y1

x1,y0

x1

x1

x1

x1

X = {x},⌃X = {x0, x1},Y = {y},⌃Y = {y0, y1},✓e, ✓s , true

⇢e , (x0 = x1)⇢s , ((x0 = x1) =) (y0 = y1))'e , ⇤⌃((x = x1) ^ (y = y1))'s , ⇤⌃(y = y0)

G

0G0 G0

30

Page 31: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

General Reactivity(1) GamesGR(1) game is a game with the winning condition

The winning states in a GR(1) game can be computed using the fixpoint expression

• characterizes the set of states from which the system can force the play to stay indefinitely in states

• The two outer fixpoints make sure that the system wins from the set

- The disjunction and operators ensure that the system is in a state where it can force the play to reach a state in a finite number of steps

- The conjunction and operators ensure that after visiting , we can loop and visit

G = (V,X ,Y, ✓e, ✓s, ⇢e, ⇢s,AP , L,')

' = (⇤⌃p1 ^ . . . ^⇤⌃pm)| {z }'e

=) (⇤⌃q1 ^ . . . ^⇤⌃qn)| {z }'s

¬pi

qj ∧��Zj⊕1 ∨�� Y

⌥↵↵↵↵↵↵↵↵↵↵↵↵

Z1

Z2

⇧Zn

�������������⌦

⌥↵↵↵↵↵↵↵↵↵↵↵↵↵↵

µY ⇤ m�i=1⌫X⇥(q1 ⇤⌃�Z2) ⌅⌃� Y ⌅ (¬pi ⇤⌃�X)⇧⌃

µY ⇤ m�i=1⌫X⇥(q2 ⇤⌃�Z3) ⌅⌃� Y ⌅ (¬pi ⇤⌃�X)⇧⌃

⇧µY ⇤ m�

i=1⌫X⇥(qn ⇤⌃�Z1) ⌅⌃� Y ⌅ (¬pi ⇤⌃�X)⇧⌃

���������������⌦µY ⌫X⇥⇧� Y ⌅ (¬pi ⇤⇧�X)⌅

µYqj ∧��Zj⊕1

⌫Zj qj

qj⊕1

31

Page 32: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Richard M. Murray, Caltech CDSAFRL, Apr 2012

Lecture Schedule

32

Tue Wed Thu

8:30L1: Intro to Protocol-

Based Control Systems

Computer Lab 1

Spin

L8: Receding Horizon Temporal Logic

Planning

10:30 L2: Automata Theory L5: Verification of Control Protocols

Computer Lab 2

TuLiP

12:00 Lunch Lunch Lunch

13:30 L3: Linear Temporal Logic

L6: Hybrid Systems Verification

L9: Extensions, Applications and Open

Problems

15:30 L4: Model Checking and Logic Synthesis

L7: Synthesis of Reactive Control

Protocols

http://www.cds.caltech.edu/~murray/wiki/afrlcourse2012

Page 33: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Extracting GR(1) Strategies

The intermediate values in the computation of the fixpoint can be used to compute a strategy, represented by a finite transition system, for a GR(1) game.

This strategy does one of the followings

• Iterates over strategies f1, ..., fn where fj ensures that the play reaches a qj state

• Eventually uses a fixed strategy ensuring that the play does not satisfy one of the liveness assumptions pj

Complexity: A game structure G with a GR(1) winning condition can be solved by a symbolic algorithm in time proportional to nm�⌃V �3

33

Page 34: Lecture 7 Synthesis of Reactive Control Protocolsmurray/courses/afrl-sp12/L7...Lecture 7 Synthesis of Reactive Control Protocols Richard M. Murray Nok Wongpiromsarn Ufuk Topcu California

Extensions

34

The algorithm for solving GR(1) game can be applied to any game with the winning condition of the form

where pi, qj are past formulas.

• Add to the game additional variables and a transition relation which encodes the deterministic Buchi automaton

• Examples:

- Introduce a Boolean variable x

- Initial condition: x = 1

- Transition relation for the environment:

- Winning condition:

' = (⇤⌃p1 ^ . . . ^⇤⌃pm)| {z }'e

=) (⇤⌃q1 ^ . . . ^⇤⌃qn)| {z }'s

�(p �⇥ ⇥q)

⇢e ⇥ ⇥x′ = (q ⇤ x ⇥ ¬p)⌅�� x