019 20160907 decoupled neural interfaces using synthetic gradients
Decoupled Neural Interfaces using Synthetic Gradients
Tran Quoc Hoan
@k09ht haduonght.wordpress.com/
Paper Alert 2016-09-09, Hasegawa lab., Tokyo
The University of Tokyo
Max Jaderberg, Wojciech Marian Czarnecki, Simon Osindero, Oriol Vinyals, Alex Graves, Koray Kavukcuoglu
https://arxiv.org/abs/1608.05343
Findings
• Modelling error gradients: by using a modelled synthetic gradient in place of the true back-propagated error gradient, subgraphs are decoupled and can be updated independently and asynchronously
• Speeds up the training process and saves memory for RNNs
Neural network and the problem of locking
• Gradients are back-propagated sequentially, layer by layer
• Layer 1 must wait for the forward and backward computation of layers 2 and 3 before it can update
• Layer 1 is locked, i.e. coupled to the rest of the network
This is time-consuming for complex networks or for large distributed networks spread over multiple machines
Image source: https://deepmind.com/blog#decoupled-neural-interfaces-using-synthetic-gradients
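The locking problem on this slide can be made concrete with a tiny sketch. It assumes three scalar "layers" and a squared-error loss; the function name and the numbers are illustrative only and are not taken from the paper.

```python
def layer1_gradient(x, target, w1, w2, w3):
    """Gradient of 0.5 * (h3 - target)**2 with respect to w1 (illustrative)."""
    # Forward pass: layer 1 has to hand its output to layers 2 and 3 ...
    h1 = w1 * x
    h2 = w2 * h1
    h3 = w3 * h2
    # ... and then wait for the backward pass through layers 3 and 2.
    d_h3 = h3 - target   # dL/dh3
    d_h2 = d_h3 * w3     # dL/dh2
    d_h1 = d_h2 * w2     # dL/dh1, the signal layer 1 is waiting for
    return d_h1 * x      # dL/dw1


grad_w1 = layer1_gradient(x=1.0, target=1.0, w1=0.5, w2=2.0, w3=3.0)
```

Nothing in `layer1_gradient` can finish until every later layer has run forward and backward. That dependency is what "Layer 1 is locked" refers to.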
Idea: Synthetic Gradient
Predict this (δ̂_i, from h_i) instead of using back-propagation
Update
Train estimator
Image source: https://deepmind.com/blog#decoupled-neural-interfaces-using-synthetic-gradients
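A minimal sketch of the "Update" and "Train estimator" steps, under strong simplifying assumptions: two scalar layers, a squared-error loss, a toy task (target = 2x), and a linear estimator δ̂ = a·h + b. All names, learning rates and the task itself are hypothetical choices for illustration, not the authors' setup.

```python
import random


def train_with_synthetic_gradients(steps=2000, lr=0.05, lr_m=0.05, seed=0):
    rng = random.Random(seed)
    w1, w2 = 0.5, 0.5   # layer 1 and layer 2 weights (scalars)
    a, b = 0.0, 0.0     # estimator M: delta_hat = a * h + b, starts at zero
    losses = []
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        target = 2.0 * x

        # Layer 1: forward, then compute its update from the predicted gradient
        # without waiting for layer 2.
        h = w1 * x
        delta_hat = a * h + b
        w1_next = w1 - lr * delta_hat * x   # dL/dw1 is approximated by delta_hat * x

        # Layer 2: ordinary forward/backward (this could run later or elsewhere).
        y = w2 * h
        err = y - target
        losses.append(0.5 * err * err)
        delta = w2 * err                    # true dL/dh
        w2 -= lr * err * h

        # Train estimator: regress delta_hat towards the true gradient.
        diff = delta_hat - delta
        a -= lr_m * diff * h
        b -= lr_m * diff

        w1 = w1_next
    return losses
```

Layer 1 never reads `delta` directly. It only sees `delta_hat`, while the true gradient is used solely to train the estimator. Comparing `sum(losses[:100])` with `sum(losses[-100:])` gives a quick check on whether the toy problem is being learned.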
Idea: Synthetic Gradient
M_i: a small, simple neural network that outputs the synthetic gradient
Image source: https://deepmind.com/blog#decoupled-neural-interfaces-using-synthetic-gradients
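As a rough illustration of how small such a model can be, here is a sketch of M_i as a single linear layer trained by squared-error regression onto the true gradient. The class name, the zero initialisation and the learning rate are assumptions made for this example.

```python
class LinearGradientModel:
    """Hypothetical M_i: maps an activation h to a predicted gradient."""

    def __init__(self, dim):
        # Zero weights and biases, so the first predictions are all zero.
        self.W = [[0.0] * dim for _ in range(dim)]
        self.b = [0.0] * dim

    def predict(self, h):
        return [sum(w * x for w, x in zip(row, h)) + bias
                for row, bias in zip(self.W, self.b)]

    def fit_step(self, h, true_grad, lr=0.1):
        """One SGD step on 0.5 * ||predict(h) - true_grad||^2.

        Returns the squared error measured before the step.
        """
        diff = [p - g for p, g in zip(self.predict(h), true_grad)]
        for i, d in enumerate(diff):
            for j, x in enumerate(h):
                self.W[i][j] -= lr * d * x
            self.b[i] -= lr * d
        return sum(d * d for d in diff)


model = LinearGradientModel(dim=2)
initial_prediction = model.predict([1.0, -0.5])
errors = [model.fit_step([1.0, -0.5], [0.2, 0.4]) for _ in range(50)]
```

Here `errors` records the squared error before each step, so it shows how quickly the model fits one fixed (activation, gradient) pair. In real training the target gradients keep changing as the network learns.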
Synthetic Gradient for RNN
Image source: https://deepmind.com/blog#decoupled-neural-interfaces-using-synthetic-gradients
Synthetic gradients are used only at the truncation points of the RNN (the boundaries of truncated BPTT)
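The role of the boundary gradient can be sketched with a scalar linear RNN, h_t = w·h_{t-1} + x_t, and a per-step squared-error loss. This toy model and the helper name `chunk_backward` are assumptions for illustration. In the DNI setting the `grad_from_future` argument would come from a synthetic gradient model, whereas here the exact value is passed in to show what that model is meant to approximate.

```python
def chunk_backward(w, h_prev, xs, ys, grad_from_future=0.0):
    """Backprop through one chunk of a scalar RNN h_t = w * h_{t-1} + x_t.

    grad_from_future is d(loss of later chunks)/d(last hidden state of this chunk).
    Returns (dL/dw from this chunk, dL/dh_prev, last hidden state).
    """
    hs = [h_prev]
    for x in xs:
        hs.append(w * hs[-1] + x)
    g = grad_from_future
    dw = 0.0
    for t in range(len(xs), 0, -1):
        g += hs[t] - ys[t - 1]   # add this step's own loss gradient
        dw += g * hs[t - 1]      # contribution to dL/dw
        g *= w                   # propagate to the previous hidden state
    return dw, g, hs[-1]


w, h0 = 0.9, 0.1
xs, ys = [1.0, -0.5, 0.3, 0.8], [0.5, 0.0, 0.2, 1.0]

# Full BPTT over all four steps.
dw_full, _, _ = chunk_backward(w, h0, xs, ys)

# Two chunks of two steps with nothing passed across the boundary.
dw_a_trunc, _, h_mid = chunk_backward(w, h0, xs[:2], ys[:2])
dw_b, boundary_grad, _ = chunk_backward(w, h_mid, xs[2:], ys[2:])

# First chunk again, now with the boundary gradient supplied.
dw_a, _, _ = chunk_backward(w, h0, xs[:2], ys[:2], grad_from_future=boundary_grad)
```

With the exact boundary gradient, `dw_a + dw_b` matches `dw_full`, while `dw_a_trunc + dw_b` is the plain truncated-BPTT estimate that ignores the future.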
Experiments
Q: What hardware setup is needed to see the improvement (especially at DeepMind)? Does it work on my GPU clusters?
Experiments