GSMDPs for Multi-Robot Sequential Decision Making
By: Messias, Spaan, Lima
Presented by: Mike Plasker
DMES – Ocean Engineering
Introduction
• Robotic planning under uncertainty
• MDP solutions
• Limited real-world application
Assumptions for Multi-Robot Teams
• Communication (inexpensive, free, or costly)
• Synchronous and steady-state transitions
• Discretization of environment
A Different Approach
• States and actions discrete (like MDP)
• Continuous measure of time
• State transitions regarded as random ‘events’
Advantages
• Non-Markovian effects of discretization minimized
• Fully reactive to changes
• Communication only required for ‘events’
GSMDPs
• Generic temporal probability distributions over events
• Can model concurrent (persistently enabled) events
• Solvable by discrete-time MDP algorithms by obtaining an equivalent (semi-)Markovian model
• Avoids negative effects of synchronous alternatives
Why GSMDPs for Robotics
Cooperative robotics requires:
• Operation in inherently continuous environments
• Uncertainty in actions (and observations)
• Joint decision making for optimization
• Reactive
Definitions
Multiagent GSMDP: tuple <d, S, X, A, T, F, R, C, h>
• d = number of agents
• S = state space (contains state factors)
• X = state factors
• A = set of joint actions
• T = transition function
• F = time model
• R = instantaneous reward function
• C = cumulative reward rate
• h = planning horizon over continuous time
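The tuple above can be written down as a small data structure. The following is only a sketch under assumptions: the class name `MultiagentGSMDP`, the callable signatures, and the choice to build S as the Cartesian product of the state factors are illustrative and do not come from the slides.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Tuple

State = Tuple[str, ...]        # one value per state factor
JointAction = Tuple[str, ...]  # one individual action per agent


@dataclass
class MultiagentGSMDP:
    """Hypothetical container for the tuple <d, S, X, A, T, F, R, C, h>."""
    d: int                                            # number of agents
    X: List[List[str]]                                # state factors (values of each factor)
    A: List[JointAction]                              # set of joint actions
    T: Callable[[State, JointAction, State], float]   # transition function (a probability)
    F: Callable[[State, JointAction, State], float]   # time model (assumed: samples a duration)
    R: Callable[[State, JointAction], float]          # instantaneous reward function
    C: Callable[[State, JointAction], float]          # cumulative reward rate
    h: float                                          # planning horizon over continuous time

    @property
    def S(self) -> List[State]:
        # Assumption: the state space is the product of the state factors.
        return list(product(*self.X))


model = MultiagentGSMDP(
    d=2,
    X=[["near", "far"], ["low", "ok"]],
    A=[("move", "wait")],
    T=lambda s, a, s2: 0.25,
    F=lambda s, a, s2: 1.0,
    R=lambda s, a: 0.0,
    C=lambda s, a: 0.0,
    h=float("inf"),
)
print(len(model.S))
```

Two factors with two values each give four joint states, which is why factored descriptions stay compact even when S grows quickly.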
Definitions
• Event in a GSMDP: an abstraction over state transitions that share the same properties
• Persistently enabled events: events that are enabled from step ‘t’ to step ‘t+1’, but not triggered at step ‘t’
Common Approach
• Synchronous action
• Pre-defined time step
  • Performance
  • Reaction time
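The reaction-time cost of a pre-defined time step can be illustrated numerically. This is a sketch under assumptions: decisions are taken only at multiples of the step, events arrive at uniformly random times, and the function name `reaction_delay` is hypothetical. The 4-second step matches the MDP time step quoted in the results.

```python
import math
import random


def reaction_delay(event_time: float, step: float) -> float:
    """Time from an event until the next synchronous decision epoch."""
    next_epoch = math.ceil(event_time / step) * step
    return next_epoch - event_time


rng = random.Random(0)
step = 4.0  # seconds between decision epochs
delays = [reaction_delay(rng.uniform(0.0, 1000.0), step) for _ in range(10000)]
mean_delay = sum(delays) / len(delays)
print(mean_delay)
```

Under these assumptions the average delay is about half the step, so shortening the step improves reaction time at the cost of more frequent decisions and communication.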
GSMDPs
• Persistently enabled events modeled by allowing their temporal distributions to depend on the time they were enabled
• Explicit modeling of non-Markovian effects from discretization
• Communication efficiency
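One way to picture persistently enabled events is as a race between clocks: the event with the smallest remaining time triggers, and events that stay enabled keep their clocks running instead of being resampled. The sketch below assumes this reading; the function `race`, the event names, and the fixed-duration samplers are hypothetical.

```python
import random


def race(enabled, clocks, samplers, rng):
    """Return (winner, elapsed, remaining clocks) for one step of the race.

    clocks holds the remaining time of events enabled in an earlier step;
    events without an entry are newly enabled and get a fresh sample.
    """
    clocks = dict(clocks)
    for event in enabled:
        if event not in clocks:
            clocks[event] = samplers[event](rng)
    winner = min(enabled, key=lambda e: clocks[e])
    elapsed = clocks[winner]
    # Persistently enabled events keep the time already spent waiting.
    remaining = {e: clocks[e] - elapsed for e in enabled if e != winner}
    return winner, elapsed, remaining


rng = random.Random(0)
samplers = {"pass_done": lambda r: 2.0, "battery_low": lambda r: 5.0}
enabled = ["pass_done", "battery_low"]

first = race(enabled, {}, samplers, rng)
second = race(enabled, first[2], samplers, rng)
third = race(enabled, second[2], samplers, rng)
print(first, second, third)
```

Because the slower event's remaining time shrinks across steps, its time to trigger depends on when it was enabled, which a memoryless model cannot express directly.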
Modeling Events
• Group state transitions as events to minimize the number of temporal distributions and transitions (e.g., battery low)
• Transition function found by estimating relative frequency of each transition in the event
• Time model found by timing the transition data
  • Approximated as a phase-type distribution
  • Replaces events with acyclic Markov chains
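The two estimation steps above can be sketched as follows. The relative-frequency estimate follows the slide directly. For the time model, this sketch swaps in a moment-matched Erlang distribution, which is one simple phase-type family (a chain of k exponential phases), not necessarily the fitting procedure used by the authors. The function names and sample data are hypothetical.

```python
from collections import Counter
from statistics import mean, pvariance


def estimate_transition_probs(next_states):
    """Relative frequency of each observed outcome of an event."""
    counts = Counter(next_states)
    total = sum(counts.values())
    return {state: n / total for state, n in counts.items()}


def fit_erlang(durations):
    """Moment-match an Erlang(k, rate): mean = k / rate, variance = k / rate**2."""
    m, v = mean(durations), pvariance(durations)
    if m <= 0 or v <= 0:
        raise ValueError("need positive mean and variance")
    k = max(1, round(m * m / v))  # number of exponential phases in the chain
    return k, k / m               # phases and rate of each phase


probs = estimate_transition_probs(["goal", "goal", "miss", "goal"])
k, rate = fit_erlang([1.0, 2.0, 3.0])
print(probs, k, rate)
```

Each event is then replaced by a chain of k exponentially timed phases, so the resulting model has only memoryless transitions.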
Events (cont.)
• Not always possible
• Decompose events with minimum duration into deterministically timed transitions
• Can then better approximate using phase-type distribution
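A rough illustration of why the decomposition helps, assuming a moment-matched Erlang fit as the phase-type approximation and the smallest observed duration as the deterministic part. Both choices, the function names, and the sample durations are assumptions of this sketch.

```python
from statistics import mean, pvariance


def erlang_phases(durations):
    """Moment-matched Erlang fit: returns (number of phases, rate per phase)."""
    m, v = mean(durations), pvariance(durations)
    if m <= 0 or v <= 0:
        raise ValueError("need positive mean and variance")
    k = max(1, round(m * m / v))
    return k, k / m


def fit_with_delay(durations):
    """Split durations into a fixed delay plus a random remainder, then fit the remainder."""
    delay = min(durations)  # assumed estimate of the minimum duration
    rest = [t - delay for t in durations]
    k, rate = erlang_phases(rest)
    return delay, k, rate


samples = [10.0, 11.0, 12.0, 13.0]  # hypothetical timing data, in seconds
direct_k, _ = erlang_phases(samples)
delay, k, rate = fit_with_delay(samples)
print(direct_k, delay, k, rate)
```

In this sketch the direct fit needs far more phases than the fit to the remainder, because durations concentrated far from zero have low relative variance.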
Solving a GSMDP
• Can be viewed as an equivalent discrete-time MDP
• Almost all solution algorithms for MDPs work
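Once an equivalent discrete-time MDP is available, a standard solver applies. The following is a minimal value-iteration sketch on a hypothetical two-state MDP. The states, actions, rewards, and discount factor are illustrative and are not the soccer model from the experiment.

```python
def q_value(s, a, V, P, R, gamma):
    """Expected return of taking action a in state s, then following V."""
    return R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])


def value_iteration(states, actions, P, R, gamma=0.5, tol=1e-8):
    """P[s][a] is a list of (next_state, probability); R[s][a] is the reward."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(q_value(s, a, V, P, R, gamma) for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {s: max(actions, key=lambda a: q_value(s, a, V, P, R, gamma))
              for s in states}
    return V, policy


states = ["s0", "s1"]
actions = ["stay", "go"]
P = {
    "s0": {"stay": [("s0", 1.0)], "go": [("s1", 1.0)]},
    "s1": {"stay": [("s1", 1.0)], "go": [("s0", 1.0)]},
}
R = {
    "s0": {"stay": 0.0, "go": 0.0},
    "s1": {"stay": 1.0, "go": 0.0},
}
V, policy = value_iteration(states, actions, P, R)
print(V, policy)
```

With a discount of 0.5, staying in s1 is worth 1 / (1 - 0.5) = 2, and moving there from s0 is worth half of that.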
Experiment
• Robotic soccer
• Score a goal (reward 150)
• Passing around obstacle (reward 60)
Results
• MDP: T = 4 s
• GSMDP
Results
• No idle time
• Reduced communication
• Improved scoring efficiency
• System failures (zero goals) independent of model
Future Work
• Extend to partially observable domains
• Apply bilateral phase distributions to increase the class of non-Markovian events that can be modeled
Questions?
MESSIAS, J.; SPAAN, M.; LIMA, P. GSMDPs for Multi-Robot Sequential Decision-Making. AAAI Conference on Artificial Intelligence, North America, Jun. 2013. Available at: <http://www.aaai.org/ocs/index.php/AAAI/AAAI13/paper/view/6432/6843>. Date accessed: 06 Apr. 2014.