Controller Synthesis with UPPAAL-TIGA
Alexandre David, Kim G. Larsen, Didier Lime,
Franck Cassez, Jean-François Raskin…
18-11-2010 AFSEC 2
Overview
Timed Games.
  Algorithm (CONCUR’05).
  Strategies.
  Code generation.
  Architecture of UPPAAL-TIGA.
Timed Games with Partial Observability.
  Algorithm (ATVA’07).
Controller Synthesis/TGA
Given
  system moves S,
  controller moves C,
  and a property φ,
find
  a strategy Sc s.t. Sc || S ⊨ φ,
  or prove there is no such strategy.
Timed Game Automata
Introduced by Maler, Pnueli, Sifakis [Maler & al. ’95].
The controller continuously observes the system (all delays & moves are observable).
The controller can
  wait (delay action),
  take a controllable move, or
  prevent delay by taking a controllable move.
Timed Game Automata
Timed automata with controllable and uncontrollable transitions.
Reachability & safety games.
  control: A<> TGA.goal
  control: A[] not TGA.L4
Memoryless strategy: state → action.
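TGA – Let’s Play!
control: A<> TGA.goal
[Figure: example timed game automaton TGA, annotated with a winning strategy; λ is the delay action, c a controllable move.]
  x<1 : λ, x==1 : c
  x<2 : λ, x≥2 : c
  Strategy x≤1 : c
  x<1 : λ, x==1 : c
Note: This is one strategy. There are other solutions.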
Results
[Maler & al. ’95, De Alfaro & al. ’01] There is a symbolic iterative algorithm to compute the set W* of winning states for timed games.
[Henzinger & Kopke ’99] Safety and reachability control are EXPTIME-complete.
[Cassez & al. ’05] Efficient on-the-fly algorithm for safety & reachability games.
Algorithm
On-the-fly forward algorithm with a backward fix-point computation of the winning/losing sets.
  Use all the features of UPPAAL in forward.
  Possible to mix forward & backward exploration.
Solved by Liu & Smolka 1998 for untimed games.
Extended symbolic version at CONCUR’05.
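The untimed idea behind this slide can be sketched as follows. This is a minimal illustration in the spirit of the Liu & Smolka worklist scheme, not the symbolic algorithm implemented in UPPAAL-TIGA: it assumes a finite turn-based game, and the encoding (`succ`, `controllable`, `goal`) and the function name are hypothetical. Edges are explored forward; as soon as a goal is found, the affected predecessors are re-evaluated backward.

```python
from collections import defaultdict

def solve_reachability(init, succ, controllable, goal):
    """Return True if the controller can force a visit to `goal` from `init`.

    succ: dict state -> list of successor states (hypothetical encoding).
    controllable: states where the controller picks the successor;
    in all other states the opponent picks.
    """
    win = {init} if init in goal else set()
    passed = {init}
    depend = defaultdict(set)   # t -> predecessors to re-evaluate when t wins
    waiting = [(init, t) for t in succ.get(init, [])]
    while waiting and init not in win:
        s, t = waiting.pop()
        depend[t].add(s)
        if t not in passed:
            # Forward step: first visit of t.
            passed.add(t)
            if t in goal:
                win.add(t)
                waiting.append((s, t))   # schedule a re-evaluation of s
            else:
                waiting.extend((t, u) for u in succ.get(t, []))
        elif s not in win:
            # Backward step: re-evaluate s against the current winning set.
            targets = succ.get(s, [])
            if s in controllable:
                wins = any(u in win for u in targets)
            else:
                wins = bool(targets) and all(u in win for u in targets)
            if wins:
                win.add(s)
                waiting.extend((p, s) for p in depend[s])
    return init in win

# Toy games: the controller moves in state 0, the opponent in states 1 and 2.
GAME_WIN = {0: [1, 2], 1: [3, 0], 2: [3]}
GAME_LOSE = {0: [1, 2], 1: [3, 0], 2: [3, 4]}
print(solve_reachability(0, GAME_WIN, {0}, {3}))
print(solve_reachability(0, GAME_LOSE, {0}, {3}))
```

In the first game the controller wins by moving to state 2; in the second the opponent can escape to the dead end 4 or loop through 1, so there is no winning strategy. The loop stops early once the initial state is known to be winning, which is the point of working on-the-fly.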
Backward Propagation
[Figure: symbolic states of locations L0 to L3 plotted over clock values 0 to 2.]
Back-propagate when goal is reached.
Note: This is not a strategy, it’s only the set of winning states.
Backward Propagation
[Figure: a goal zone G and two bad zones B.]
Predecessors of G avoiding B? pred_t
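pred_t – From Federation to Zone
\[
\mathrm{pred}_t\Big(\bigcup_i G_i,\ \bigcup_j B_j\Big) = \bigcup_i \bigcap_j \mathrm{pred}_t(G_i, B_j)
\]
\[
\mathrm{pred}_t(G, B) = \big(G^{\downarrow} \setminus B^{\downarrow}\big) \cup \big((G \cap B^{\downarrow}) \setminus B\big)^{\downarrow}
\]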
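The decomposition from federations to zones can be checked on a toy model. The sketch below assumes a single clock with integer values below a hypothetical `horizon`, so a zone is just a set of consecutive integers and the past operator is a plain downward closure. Real zones are DBMs over several clocks; none of the function names come from UPPAAL-TIGA.

```python
def down(zone, horizon):
    """Past of a zone: clock values that can delay into it."""
    return {x for x in range(horizon) if any(x <= z for z in zone)}

def pred_t_zone(G, B, horizon):
    """Zone formula: (G_down \\ B_down) | ((G & B_down) \\ B)_down, G and B convex."""
    g_down, b_down = down(G, horizon), down(B, horizon)
    return (g_down - b_down) | down((G & b_down) - B, horizon)

def pred_t_federation(Gs, Bs, horizon):
    """Union over G_i of the intersection over B_j of pred_t(G_i, B_j)."""
    result = set()
    for G in Gs:
        part = down(G, horizon)   # pred_t(G, B) is always inside the past of G
        for B in Bs:
            part &= pred_t_zone(G, B, horizon)
        result |= part
    return result

def pred_t_reference(G, B, horizon):
    """Direct definition: delay into G without touching B on the way."""
    result = set()
    for x in range(horizon):
        for y in range(x, horizon):
            if y in B:
                break
            if y in G:
                result.add(x)
                break
    return result

# Two goal zones and two bad zones on the integer line 0..9.
GS = [{2, 3}, {8}]
BS = [{5}, {0}]
print(sorted(pred_t_federation(GS, BS, 10)))
print(sorted(pred_t_reference(set().union(*GS), set().union(*BS), 10)))
```

Both calls print the same set. The zone-level formula is only valid for convex G and B, which is why the federation case has to be split into pairs first.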
Query Language (1)
Reachability properties:
  control: A[ p U q ]
  control: A<> q ⇔ control: A[ true U q ]
Safety properties:
  control: A[ p W q ]
  control: A[] p ⇔ control: A[ p W false ]
Tuning:
  change search ordering,
  add back-propagation of winning+losing states.
Back-propagate winning states & BFS+DFS.
Back-propagate losing states & BFS-BFS.
Query Language (2)
Time-optimality
  control_t*(u,g): A[ p U q ]
  u is an upper bound used to prune the search. It acts like an invariant, but on the path, and is given as an expression on the current state.
  g is the time to the goal from the current state (in fact a lower bound), also used to prune the search. States with t+g > u are pruned.
Cooperative strategies.
  E<> control: φ
  The property is satisfied iff φ is reachable, but the obtained strategy is maximal.
Cooperative Strategies
State-space is partitioned between states from which there is a strategy and those from which there is no strategy.
Cooperative strategy suggests moves from the opponent that would “help” the controller.
Being used in testing.
[Figure label: Maximal partition with winning strategies.]
Strategies?
The algorithm computes sets of winning and losing states, not strategies.
Strategies are computed on top:
  Take actions that lead to winning states (reachability).
  Take actions to avoid losing states (safety).
  Partition states with actions to guarantee progress.
This is done on-the-fly and the obtained strategy depends on the exploration order.
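Why progress needs care can be seen on an untimed toy game. The sketch below is an assumption-laden illustration, not TIGA's symbolic procedure: each state receives its action at the moment it is first found winning, and the action always points into states that were already winning, so following the strategy cannot loop. The location and action names are hypothetical.

```python
def extract_strategy(succ, controllable, goal):
    """Compute the winning states and a loop-free memoryless strategy.

    succ: dict state -> {action: target} (hypothetical encoding).
    A state gets its action when it first becomes winning and the choice
    is never overwritten, which is what guarantees progress to the goal.
    """
    win = set(goal)
    strategy = {}
    changed = True
    while changed:
        changed = False
        for s, moves in succ.items():
            if s in win:
                continue
            if s in controllable:
                for action, target in moves.items():
                    if target in win:
                        strategy[s] = action
                        win.add(s)
                        changed = True
                        break
            elif moves and all(t in win for t in moves.values()):
                win.add(s)   # opponent state: every move stays winning
                changed = True
    return win, strategy

# All four locations are controllable; L2 and L3 can loop forever.
SUCC = {
    "L0": {"to_L1": "L1", "to_L2": "L2"},
    "L1": {"to_goal": "goal"},
    "L2": {"to_L3": "L3"},
    "L3": {"to_L2": "L2", "to_L1": "L1"},
}
WIN, STRATEGY = extract_strategy(SUCC, {"L0", "L1", "L2", "L3"}, {"goal"})
print(STRATEGY)
```

Once all states are known to be winning, picking any action that leads to a winning state would allow L3 to choose `to_L2` and bounce between L2 and L3 forever. Fixing the action at discovery time rules that out. A different iteration order may yield a different but equally valid strategy over the same winning set.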
Winning States → Strategy
[Figure: winning states of locations L0 to L3 over clock values 0 to 2, and a strategy built on them. Figure labels: L1→goal, L3→L1, L2→L3, L0→L1, “past”, “wait”, and an alternative “wait L0→L1” marked “Also possible”.]
Strategies as Partitions
Built on-the-fly.
Guarantee progress in the strategy.
  No loop.
Deterministic strategy.
Different problem than computing the set of winning states.
Different search orderings can give different strategies … with possibly the same set of winning states.
Code Generation
Mapping state → action.
  # entries = # states.
Decision graph state → action.
  # tests = # variables.
  More compact.
  Based on a hybrid BDD/CDD with multi-terminals.
Decision Graph
BDD: boolean variables.
CDD: constraints on clocks.
Multi-terminals: actions.
It works because we have a partition.
Graph Reduction
Testing consecutive bits:
  Replace by one test with a mask.
  Can span several variables.
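The reduction can be illustrated on a bit-packed state. This is a sketch under the assumption that the discrete state is stored as one integer; the helper names are hypothetical and not taken from the generated code. A chain of single-bit tests collapses into one masked comparison.

```python
def merge_bit_tests(tests):
    """Merge single-bit tests into one (mask, pattern) pair.

    tests: iterable of (bit_index, expected_bit) pairs.
    """
    mask = pattern = 0
    for bit, expected in tests:
        mask |= 1 << bit
        if expected:
            pattern |= 1 << bit
    return mask, pattern

def matches_bit_by_bit(state, tests):
    """Unreduced graph: one decision node per tested bit."""
    return all(((state >> bit) & 1) == expected for bit, expected in tests)

def matches_masked(state, mask, pattern):
    """Reduced graph: a single node testing all the bits at once."""
    return (state & mask) == pattern

# Bits 2, 3 and 4 of the packed state must be 1, 0 and 1 respectively.
TESTS = [(2, 1), (3, 0), (4, 1)]
MASK, PATTERN = merge_bit_tests(TESTS)
print(bin(MASK), bin(PATTERN))
```

Because the mask is applied to the packed integer, the tested bits may belong to several adjacent variables, which is how one test can span more than one variable.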
Decision Graph
Pipeline Architecture
Pipeline Components
[Figure labels: Source, Sink, Filter, State, Successor, Data, Buffer.]
Pipeline Architecture
[Figure: pipeline built around the state-graph and the waiting queue. Forward part (states s,F and s’,F): Source, Transition, Successor, Delay, Extrapolation+, inclusion check + add. Backward part (states s,B, s’,B and s*,B): Predecessor pred_t, “win/lose?” and “update?” tests, Destination.]
Query Language (4)
Partially observable systems:
  { obs1, obs2 … } control: reachability | safety
  Generate a controller based on partial observations of the system (obs1, obs2…).
Games with Büchi accepting states.
  control: A[] ( p && A<> q )
  control: A[] A<> q
Franck Cassez, JF Raskin, Didier Lime
Query Language (5)
Simulation checking of TA & TGA.
  { A, B … } <= { C, D … }
  Built-in new algorithm that bypasses a manual encoding we had [LSV tech. report KGL, AD, TC].
  Strategies & simulation in the GUI available.
FORMATS’09
Thomas Chatain, Peter Bulychev
TGA with Büchi Accepting States
Use TIGA’s algorithm as an intermediate step in a fixpoint, but modified:
  winning states are states that can reach goals
  goal states defined by queries, not necessarily winning
Fixpoint:
  Win = SOTF(goal)
  while (Win ∩ goal) ≠ goal
    goal = Win ∩ goal
    Win = SOTF(goal)
  done
  return S0 ∈ Win
Didier Lime
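The fixpoint above can be tried on a finite untimed game. In this sketch `force_reach` is a hypothetical stand-in for the modified SOTF step: it returns the states from which the controller can force a visit to a goal state after at least one move, so a goal state is not winning merely because it is a goal. The game encoding is an assumption made for the example.

```python
def force_reach(succ, controllable, goal):
    """States that can force a visit to `goal` in one or more moves."""
    win = set()
    changed = True
    while changed:
        changed = False
        for s, targets in succ.items():
            if s in win:
                continue
            ok = [t in goal or t in win for t in targets]
            if s in controllable:
                good = any(ok)
            else:
                good = bool(ok) and all(ok)
            if good:
                win.add(s)
                changed = True
    return win

def buchi_winning(succ, controllable, goal, init):
    """Shrink the goal set until every remaining goal can be revisited."""
    goal = set(goal)
    win = force_reach(succ, controllable, goal)
    while (win & goal) != goal:
        goal = win & goal
        win = force_reach(succ, controllable, goal)
    return init in win

# State 3 is a goal but leads to the sink 2, so it is pruned from the goals.
SUCC = {0: [1], 1: [0, 3], 2: [2], 3: [2]}
print(buchi_winning(SUCC, {0, 1, 2, 3}, {0, 3}, 0))
print(buchi_winning(SUCC, {0, 2, 3}, {0, 3}, 0))
```

With state 1 controllable the controller can alternate between 0 and 1 forever, so the objective holds from state 0. When the opponent owns state 1 it can divert the play to 3 and then into the sink, the goal set shrinks to the empty set, and the answer is negative.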
Application: Non-zeno Strategies
Add a monitor with the rest of the system.
Ask
  control: A[] (p && A<> Monitor.Check)
[Figure: Monitor automaton. Labels: u, x==1, x=0, Check.]
Timed Games with Partial Observability
Previous: Perfect information.
  Not always suitable for controllers.
Partial observation.
  States or events, here states.
  Distinguish states w.r.t. observations.
  Strategy keeps track of states w.r.t. observations.
  Observations = predicates over states.
Results
Discrete event systems
  [Kupferman & Vardi ’99, Reif ’84, Arnold & al. ’03]. Game given as modal logic formula: Full-observation as hard as partial observation.
  [Chatterjee & al. ’06, De Wulf & al. ’06]. Game given as explicit graph: Full-observation PTIME, partial observation EXPTIME.
Timed systems, game given as a TA
  [Cassez & al. ’07] Efficient on-the-fly algorithm, EXPTIME.
State Based Full Observation
2-player reachability game, controllable + uncontrollable actions.
Full observation: in l2 do c1, in l3 do c2.
Franck Cassez ATVA’07
State Based Partial Observation
Partition the state-space: l2 = l3 (the two locations give the same observation).
Can’t win here.
Observation For Timed Systems
Stuttering-Free Invariant Observations
TGA with PO: Rules
If player 1 wants to take an action c, then
  player 2 can choose to play any of his actions or c as long as the observation stays the same, or
  player 2 can delay as long as c is not enabled and the observation stays the same.
  ⇒ c is urgent.
If player 1 wants to delay, then
  player 2 can delay or take any of his actions as long as the observation stays the same.
The turn is back to player 1 as soon as the observation changes.
On-the-Fly Algorithm
Algorithm
Partition the state-space w.r.t. observations.
Observations O1 O2 O3.
Winning/losing is observable.
[Figure: the state-space split into the eight classes O1∧O2∧O3, ¬O1∧¬O2∧¬O3, O1∧O2∧¬O3, ¬O1∧¬O2∧O3, O1∧¬O2∧O3, O1∧¬O2∧¬O3, ¬O1∧O2∧¬O3, ¬O1∧O2∧O3, with an arrow labelled “change of observation”.]
Algorithm with Reachability Objective
[Figure labels: 1, 2, 3, W’.]
Initialization
Passed list { sets of W }.
Waiting list { tuples (W,α,W’) }.
Win[W] maps to 1 (winning) or 0 (unknown).
Similar Losing[W].
Depend[W] records the graph.
Sink_α: sink states by doing α.
Sets of Symbolic States & Successors
Set W of symbolic states within one observation.
Next_α(W): sets of symbolic successors for some action α.
Next_α: do {α asap and wait for α} until change of observation.
[Figure: a set W in observation O1 with its α-successors Next_α(W)∩O2 and Next_α(W)∩O3.]
Forward Phase
Update Win, Losing, graph.
Partition successors.
Detect deadlocks.
Back-propagate if necessary.
Continue forward.
Backward Phase
If there is a c whose successors are all winning
  then W is winning and back-propagate.
If every c has a losing successor
  then W is losing and back-propagate.
Update graph in case of new paths to passed states.
Algorithm
Initial state in some partition.
Compute successors { set of states } w.r.t. a controllable action.
Successors distinguished by observations.
Algorithm
Construct the graph of sets of symbolic states.
Back-propagate winning/losing states.
Notes
Stuttering-free invariant observations
  sinks possible in observations (deadlock/livelock/loop)
Actions are urgent
  delay until actions are enabled (or observation changes)
Algorithm
Forward exploration
  constrained by action + observations
  delay special
Back-propagation.
  If all a-successors are winning, declare current state winning, strategy: take action a.
  If one a-successor is losing, avoid action a.
  If no action is winning the current state is losing.
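A heavily simplified, untimed version of this scheme can be sketched as follows. The assumptions are loud ones: there are no clocks, an action is a single step rather than "α asap until the observation changes", a set with no successor for an action is treated as a sink for that action, and the forward and backward phases run one after the other instead of being interleaved. All names are hypothetical. The toy model mirrors the l2/l3 example from the state-based observation slides.

```python
from collections import defaultdict

def next_sets(W, action, trans, obs):
    """Successors of the set W by `action`, split by observation.

    Returns None when some state of W cannot take the action (a sink).
    """
    groups = defaultdict(set)
    for s in W:
        targets = trans.get((s, action), ())
        if not targets:
            return None
        for t in targets:
            groups[obs[t]].add(t)
    return [frozenset(g) for g in groups.values()]

def solve_po(init, actions, trans, obs, goal_obs):
    """Return (is_winning, strategy) with the strategy defined on sets of states."""
    start = frozenset([init])
    graph, stack = {}, [start]
    while stack:                          # forward: build the graph of sets
        W = stack.pop()
        if W in graph:
            continue
        graph[W] = {}
        if obs[next(iter(W))] in goal_obs:
            continue                      # winning is observable, stop here
        for a in actions:
            graph[W][a] = next_sets(W, a, trans, obs)
            stack.extend(graph[W][a] or [])
    win = {W for W in graph if obs[next(iter(W))] in goal_obs}
    strategy, changed = {}, True
    while changed:                        # backward: propagate winning sets
        changed = False
        for W, moves in graph.items():
            if W in win:
                continue
            for a, nxt in moves.items():
                if nxt is not None and all(V in win for V in nxt):
                    win.add(W)
                    strategy[W] = a
                    changed = True
                    break
    return start in win, strategy

TRANS = {("l0", "wait"): {"l2", "l3"},
         ("l2", "c1"): {"goal"}, ("l2", "c2"): {"bad"},
         ("l3", "c1"): {"bad"}, ("l3", "c2"): {"goal"}}
ACTIONS = ["wait", "c1", "c2"]
FULL = {s: s for s in ["l0", "l2", "l3", "goal", "bad"]}
PARTIAL = dict(FULL, l2="mid", l3="mid")   # l2 and l3 look the same
print(solve_po("l0", ACTIONS, TRANS, FULL, {"goal"})[0])
print(solve_po("l0", ACTIONS, TRANS, PARTIAL, {"goal"})[0])
```

Under full observation the controller plays c1 in l2 and c2 in l3. Once l2 and l3 share an observation, every action from the set {l2, l3} has a losing successor, so that set is losing and so is the initial state.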
Example
Observations: L, H, E, B, y in [0,1[
Example – Memoryful Strategy
[Figure: graph of the generated strategy with numbered nodes. Legend: partition labels H, L, Ey, Ly, Hy; actions delay, y=0, eject!.]
Controllable with y∈[0,½[, not with y∈[0,1[.
Case-Studies
[FORMATS’07] Guided Controller Synthesis Using UPPAAL-TIGA. Jan Jakob Jessen, Jacob Illum Rasmussen, Kim G. Larsen, Alexandre David.
[HSCC’09] Automatic Synthesis of Robust and Optimal Controllers – An Industrial Case Study. Franck Cassez, J.J. Jessen, Kim G. Larsen, Jean-François Raskin, Pierre-Alain Reynier. (Hydac)
Generalizing this Work
The case-study
  ad-hoc translation
  ad-hoc script
  ad-hoc interface
What we did:
  Defined the interface (inputs & outputs in Simulink).
  Rewrote the Ruby translator
    generic inputs/outputs
    generic strategies
    discretizes time.
  Integrated the workflow inside Simulink.
Workflow Between TIGA & Simulink
Results
From a TIGA model and a Simulink model we can
  generate the S-function that acts as the discrete controller inside Simulink,
  simulate the model and validate the controller.