learning to plan with logical automata · s1 s2 s3 g t. s0 o Ø s0 s1 s2 s3 g t ... learning to...
TRANSCRIPT
![Page 1: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark](https://reader036.vdocument.in/reader036/viewer/2022062414/5f0230357e708231d40302cf/html5/thumbnails/1.jpg)
Learning to Plan with Logical Automata
Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3,Mark Donahue3, Cristian-Ioan Vasile1, Daniela Rus1
1Massachusetts Institute of Technology2Columbia University
3MIT Lincoln Laboratory
*Equal contributors
1
![Page 2: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark](https://reader036.vdocument.in/reader036/viewer/2022062414/5f0230357e708231d40302cf/html5/thumbnails/2.jpg)
2
![Page 3: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark](https://reader036.vdocument.in/reader036/viewer/2022062414/5f0230357e708231d40302cf/html5/thumbnails/3.jpg)
Goals
Learn to plan in an environment with rules1. Learn the rules in a way that they can be easily interpreted by humans
2. Incorporate the rules into planning so that modifying the rules results in predictable changes in behavior
3
![Page 4: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark](https://reader036.vdocument.in/reader036/viewer/2022062414/5f0230357e708231d40302cf/html5/thumbnails/4.jpg)
Packing a Lunchbox
Pack a burger or a sandwich; then pack a banana
4
![Page 5: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark](https://reader036.vdocument.in/reader036/viewer/2022062414/5f0230357e708231d40302cf/html5/thumbnails/5.jpg)
Goal 1 – Interpretability
Pack a burger or a sandwich;
then pack a banana
5
InitialState
Picked upor
Packedor
Picked up Packed
GOAL!
Rules
Finite State Automaton
![Page 6: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark](https://reader036.vdocument.in/reader036/viewer/2022062414/5f0230357e708231d40302cf/html5/thumbnails/6.jpg)
6
Low-level MDP
High-level MDPPack sandwich or burger;
Then pack bananaAvoid obstacles
Factoring the Environment
![Page 7: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark](https://reader036.vdocument.in/reader036/viewer/2022062414/5f0230357e708231d40302cf/html5/thumbnails/7.jpg)
77
Discrete 2D gridworld
Finite state automaton
Representing the Environment
Pack sandwich or burger;Then pack banana
Avoid obstacles
InitialState
Picked upor
Packedor
Picked up Packed
![Page 8: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark](https://reader036.vdocument.in/reader036/viewer/2022062414/5f0230357e708231d40302cf/html5/thumbnails/8.jpg)
Goal 2 – Manipulability
Incorporate FSA into planning
8
InitialState
Picked upor
Packedor
Picked up Packed
S0 S1 S2 S3 G
S0 o ØS0
S1
S2
S3
G
T
![Page 9: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark](https://reader036.vdocument.in/reader036/viewer/2022062414/5f0230357e708231d40302cf/html5/thumbnails/9.jpg)
Learn reward
Learn transitions
Learn transitions of FSA
9
One VIN for each FSA state
Differentiable Recursive Planning
Based on Tamar, Aviv, et al. "Value iteration networks." Advances in Neural Information Processing Systems. 2016.
![Page 10: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark](https://reader036.vdocument.in/reader036/viewer/2022062414/5f0230357e708231d40302cf/html5/thumbnails/10.jpg)
Experiments - Interpretability
10
Propositions
FSA
Sta
tes
S0 o ØS0
S1
S2
S3
G
T
![Page 11: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark](https://reader036.vdocument.in/reader036/viewer/2022062414/5f0230357e708231d40302cf/html5/thumbnails/11.jpg)
S0 o ØS0
S1
S2
S3
G
T
Experiments - Interpretability
11
Picking up the sandwich or the hamburger causes a transition to the next state
![Page 12: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark](https://reader036.vdocument.in/reader036/viewer/2022062414/5f0230357e708231d40302cf/html5/thumbnails/12.jpg)
Experiments – Manipulability
We can modify the FSA so that it will only pick up the burger and not the sandwich.
12
InitialState
Picked upor
Packedor
Picked up Packed
![Page 13: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark](https://reader036.vdocument.in/reader036/viewer/2022062414/5f0230357e708231d40302cf/html5/thumbnails/13.jpg)
Experiments – Manipulability
We can modify the FSA so that it will only pick up the burger and not the sandwich.
13
InitialState
Picked upor
Packedor
Picked up Packed
![Page 14: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark](https://reader036.vdocument.in/reader036/viewer/2022062414/5f0230357e708231d40302cf/html5/thumbnails/14.jpg)
Experiments – Manipulability
We can modify the FSA so that it will only pick up the burger and not the sandwich.
14
S0 o ØS0
S1
S2
S3
G
T
![Page 15: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark](https://reader036.vdocument.in/reader036/viewer/2022062414/5f0230357e708231d40302cf/html5/thumbnails/15.jpg)
S0 o ØS0
S1
S2
S3
G
T
Experiments – Manipulability
We can modify the FSA so that it will only pick up the burger and not the sandwich.
15
![Page 16: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark](https://reader036.vdocument.in/reader036/viewer/2022062414/5f0230357e708231d40302cf/html5/thumbnails/16.jpg)
S0 o ØS0
S1
S2
S3
G
T
Experiments – Manipulability
We can modify the FSA so that it will only pick up the burger and not the sandwich.
16
![Page 17: Learning to Plan with Logical Automata · S1 S2 S3 G T. S0 o Ø S0 S1 S2 S3 G T ... Learning to Plan with Logical Automata Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3, Mark](https://reader036.vdocument.in/reader036/viewer/2022062414/5f0230357e708231d40302cf/html5/thumbnails/17.jpg)
Learning to Plan with Logical Automata
Brandon Araki1*, Kiran Vodrahalli2*, Thomas Leech1,3,Mark Donahue3, Cristian-Ioan Vasile1, Daniela Rus1
1Massachusetts Institute of Technology2Columbia University
3MIT Lincoln Laboratory
*Equal contributors
17