Optimal real-time landing using deep networks

Carlos Sánchez and Dario Izzo, Advanced Concepts Team (ESA)

Daniel Hennes, Robotics Innovation Center (DFKI)

6th International Conference on Astrodynamics Tools and Techniques (ICATT)

Artist's impression of the Phoenix spacecraft as it lands on Mars, NASA/JPL/Corby Waste (2007)

We use deep neural networks to find an on-board real-time optimal control system

Artist's impression of the Curiosity rover and descent stage as it lands on Mars, NASA/JPL-Caltech (2011)

Goal: Solve the deterministic continuous-time optimal control problem, that is:

Current methods (direct or indirect) are not suitable for real-time on-board implementation. An alternative is to correct the deviations from a precomputed profile.

Our approach:

1. Pre-compute many optimal trajectories

2. Train an artificial neural network to approximate the optimal behaviour (supervised learning)

3. Use the network to drive the spacecraft
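Step 3 amounts to evaluating the trained network at every control step and integrating the dynamics under its output. The sketch below shows that loop under loud assumptions: the 1-D vertical lander, the constant `G`, the step size, and the clipped PD law in `policy` (a stand-in for the trained network) are all hypothetical and not taken from the slides.

```python
import numpy as np

G = 1.62  # lunar surface gravity in m/s^2 (toy constant chosen for this sketch)

def lander_dynamics(state, u):
    """Toy 1-D lander (assumption): state = [z, vz], u = thrust acceleration."""
    z, vz = state
    return np.array([vz, u - G])

def policy(state):
    """Stand-in for the trained network: a clipped PD law (hypothetical)."""
    z, vz = state
    return float(np.clip(G - 0.5 * z - 1.5 * vz, 0.0, 4.0))

def rk4_step(f, state, u, dt):
    """One classical Runge-Kutta step with the control held constant."""
    k1 = f(state, u)
    k2 = f(state + 0.5 * dt * k1, u)
    k3 = f(state + 0.5 * dt * k2, u)
    k4 = f(state + dt * k3, u)
    return state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def simulate(policy, f, state0, dt=0.05, t_max=60.0):
    """Integrate the closed loop over a fixed horizon; return states and controls."""
    states, controls = [np.asarray(state0, dtype=float)], []
    for _ in range(int(round(t_max / dt))):
        u = policy(states[-1])          # query the controller at the current state
        controls.append(u)
        states.append(rk4_step(f, states[-1], u, dt))
    return np.array(states), np.array(controls)

states, controls = simulate(policy, lander_dynamics, [10.0, 0.0])
```

Replacing `policy` with a forward pass of a trained network and `lander_dynamics` with one of the landing models would give the network-driven trajectory; the loop itself would not need to change.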

Optimal control problem

Hamilton-Jacobi-Bellman equation
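In a standard textbook form, which may differ in notation from the authors' slide, the optimal control problem and the associated Hamilton-Jacobi-Bellman equation can be written as follows. The symbols are assumptions introduced here: $\mathbf{x}$ is the state, $\mathbf{u}$ the control, $\mathcal{U}$ the admissible control set, $\mathcal{L}$ the running cost, $h$ the terminal cost, $\mathbf{f}$ the dynamics, and $v$ the value function. The HJB equation is given in its time-independent form, which assumes autonomous dynamics and a free final time.

```latex
\begin{aligned}
&\text{find } \mathbf{u}(t) \in \mathcal{U},\ t \in [0, t_f], \text{ minimizing} \quad
J = \int_0^{t_f} \mathcal{L}\big(\mathbf{x}(t), \mathbf{u}(t)\big)\,dt + h\big(\mathbf{x}(t_f)\big) \\
&\text{subject to} \quad \dot{\mathbf{x}}(t) = \mathbf{f}\big(\mathbf{x}(t), \mathbf{u}(t)\big), \qquad \mathbf{x}(0) = \mathbf{x}_0 \\[6pt]
&\min_{\mathbf{u} \in \mathcal{U}} \Big\{ \mathcal{L}(\mathbf{x}, \mathbf{u}) + \nabla_{\mathbf{x}} v(\mathbf{x}) \cdot \mathbf{f}(\mathbf{x}, \mathbf{u}) \Big\} = 0, \qquad
\mathbf{u}^*(\mathbf{x}) = \arg\min_{\mathbf{u} \in \mathcal{U}} \Big\{ \mathcal{L}(\mathbf{x}, \mathbf{u}) + \nabla_{\mathbf{x}} v(\mathbf{x}) \cdot \mathbf{f}(\mathbf{x}, \mathbf{u}) \Big\}
\end{aligned}
```

The state feedback $\mathbf{u}^*(\mathbf{x})$ is the mapping that a network trained on optimal state-control pairs would be approximating.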

Landing models

Multicopter (Earth pinpoint landing)

in [x, vx, z, vz, θ], out [u1, u2]

● Fixed mass
● Thrust + torque
● Optimize power & time

Spacecraft I (Moon free landing)

Spacecraft II (Moon free landing)

in [m, x, vx, z, vz, θ, vθ], out [u1, u2, u3]

● Variable mass
● Main thrust + two lateral engines
● Optimize mass

in [m, x, vx, z, vz, θ], out [u1, u2]

● Variable mass
● Thrust + momentum exchange wheel
● Optimize mass

Recent success of Deep Neural Networks

Generate captions for images (convolutional and recurrent neural networks)

Beat the world’s best Go player (convolutional neural networks and Monte Carlo tree search)

Vinyals, O., Toshev, A., Bengio, S. & Erhan, D. Show and tell: a neural image caption generator. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (2015)

Silver, D., et al. Mastering the game of Go with deep neural networks and tree search. Nature (2016)

Methodology

- Networks with 1 to 4 hidden layers

- Stochastic Gradient Descent (and momentum)

- Minimize the squared loss error (C)

- We integrate the dynamics over time to get the full DNN-driven trajectory
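The training recipe on this slide (a small fully connected network, stochastic gradient descent with momentum, squared loss) can be sketched in plain NumPy. Everything specific here is an assumption made for illustration: the tanh hidden units, the layer sizes, the learning rate, and the synthetic data, which merely stand in for the state-control pairs taken from the pre-computed optimal trajectories.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Return [(W, b), ...] for a fully connected net with the given layer sizes."""
    return [(rng.normal(0.0, np.sqrt(1.0 / m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Activations of every layer; tanh hidden units (assumption), linear output."""
    acts = [x]
    for i, (W, b) in enumerate(params):
        z = acts[-1] @ W + b
        acts.append(z if i == len(params) - 1 else np.tanh(z))
    return acts

def loss_and_grads(params, x, y):
    """Mean squared error over the batch and its gradients by backpropagation."""
    acts = forward(params, x)
    err = acts[-1] - y
    loss = np.mean(np.sum(err ** 2, axis=1))
    delta = 2.0 * err / len(x)                   # dLoss / d(output pre-activation)
    grads = []
    for i in reversed(range(len(params))):
        W, _ = params[i]
        grads.append((acts[i].T @ delta, delta.sum(axis=0)))
        if i > 0:
            delta = (delta @ W.T) * (1.0 - acts[i] ** 2)  # tanh derivative
    return loss, grads[::-1]

def train(params, x, y, epochs=100, batch=32, lr=0.01, momentum=0.9):
    """Mini-batch SGD with classical momentum."""
    vel = [(np.zeros_like(W), np.zeros_like(b)) for W, b in params]
    for _ in range(epochs):
        order = rng.permutation(len(x))
        for start in range(0, len(x), batch):
            idx = order[start:start + batch]
            _, grads = loss_and_grads(params, x[idx], y[idx])
            for k, ((W, b), (gW, gb)) in enumerate(zip(params, grads)):
                vW = momentum * vel[k][0] - lr * gW
                vb = momentum * vel[k][1] - lr * gb
                vel[k] = (vW, vb)
                params[k] = (W + vW, b + vb)
    return params

# Synthetic stand-in for (state, optimal control) pairs (hypothetical data).
x = rng.uniform(-1.0, 1.0, (1000, 2))
y = np.tanh(2.0 * x[:, :1] - x[:, 1:])
params = init_mlp([2, 16, 16, 1])                # two hidden layers
loss_before, _ = loss_and_grads(params, x, y)
params = train(params, x, y)
loss_after, _ = loss_and_grads(params, x, y)
```

Comparing `loss_before` with `loss_after` gives a quick check that the squared loss is actually being reduced on the training set.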

Training data generation

- The training data is generated using the Hermite-Simpson transcription and a non-linear programming (NLP) solver

- Chattering effects in the training data have a huge negative impact on the results. Regularization techniques are used to remove them, although the use of a direct method would alleviate this problem resulting in cleaner data.

- 150,000 trajectories are generated for each of the problems (9,000,000 to 15,000,000 data points)
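The first bullet names Hermite-Simpson transcription without giving a formulation. As a sketch, the standard compressed form turns the dynamics into one defect per grid segment, which an NLP solver would then constrain to zero. The function names, the uniform grid, the linear interpolation of the control at the midpoint, and the double-integrator test case are assumptions of this example, not details from the slides.

```python
import numpy as np

def hermite_simpson_defects(f, x, u, h):
    """Defects of the compressed Hermite-Simpson scheme on a uniform grid.

    f(x, u) returns dx/dt; x has shape (N, n), u has shape (N, m), h is the step.
    An NLP solver would impose every returned defect as an equality constraint.
    """
    fk = np.array([f(xi, ui) for xi, ui in zip(x, u)])
    x0, x1, f0, f1 = x[:-1], x[1:], fk[:-1], fk[1:]
    # Midpoint state from Hermite interpolation; midpoint control from linear
    # interpolation (the latter is an assumption of this sketch).
    xc = 0.5 * (x0 + x1) + h / 8.0 * (f0 - f1)
    uc = 0.5 * (u[:-1] + u[1:])
    fc = np.array([f(xi, ui) for xi, ui in zip(xc, uc)])
    # Simpson quadrature of the dynamics across each segment.
    return x1 - x0 - h / 6.0 * (f0 + 4.0 * fc + f1)

def double_integrator(x, u):
    """Toy dynamics: position and velocity driven by an acceleration command."""
    return np.array([x[1], u[0]])

# Exact trajectory for u(t) = t starting from rest: v = t^2 / 2, p = t^3 / 6.
t = np.linspace(0.0, 2.0, 11)
x = np.column_stack([t ** 3 / 6.0, t ** 2 / 2.0])
u = t[:, None]
defects = hermite_simpson_defects(double_integrator, x, u, t[1] - t[0])
```

Because the scheme is exact for cubic state histories, the defects of this exact trajectory vanish up to rounding, which makes it a convenient sanity check before handing the constraints to a solver.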

Results

Error with respect to optimal value and success rate for 100 randomly initialized trajectories

● High success rates (landing without crashing)
- 100% from inside the training area (except Spacecraft II)
- High values from outside it

● The value function obtained is always close to the optimal one

Results

Panels: Quadrotor (power), Quadrotor (time), Spacecraft I. Annotation: hovering (9.8 N)

Results

Spacecraft II

Noisy training data due to the indirect method. We expect a direct method to considerably improve the results.

Results

Spacecraft II

Shallow vs Deep Networks

● Deep networks are always better than shallow networks with the same number of parameters.
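To make "the same number of parameters" concrete, the sketch below counts weights and biases of a fully connected network and finds the single-hidden-layer width that matches a deeper one. The layer sizes are hypothetical (5 inputs and 2 outputs, as in the multicopter model) and are not the architectures used by the authors.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected network."""
    return sum(m * n + n for m, n in zip(layer_sizes[:-1], layer_sizes[1:]))

deep = [5, 32, 32, 32, 2]          # three hidden layers (hypothetical sizes)
target = count_parameters(deep)

# Widest single-hidden-layer network that does not exceed the deep one.
width = 1
while count_parameters([5, width + 1, 2]) <= target:
    width += 1
shallow = [5, width, 2]
```

With these sizes the deep network has 2370 parameters, and a shallow network needs 296 hidden units to reach the same count.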

Challenging landing example

Generalization

● Successful landings from states outside of the training initial conditions

● This suggests that an approximation to the solution of the HJB equation is learned

Panels: Quadrotor (power), Quadrotor (power), Spacecraft I

NLP vs DNN

Even compared with an NLP solver using a low number of nodes (and therefore sub-optimal), the DNNs are up to 500 times faster.

Conclusions

- Deep networks trained with large datasets successfully approximate the optimal control, even for regions outside the training data

- DNNs can be used as an on-board reactive control system, while current methods fail to compute optimal trajectories in an efficient way
