tutorial on deep learning
TRANSCRIPT
![Page 1: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/1.jpg)
Tejas Kulkarni!1!
![Page 2: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/2.jpg)
Introduction of Deep Learning!
Zaikun XuUSI, Master of Informatics
HPC Advisory Council Switzerland Conference 2016
2!
![Page 3: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/3.jpg)
Overview!Introduction to deep learning!
Big data + Big computational power (GPUs)!
Latest updates of deep learning!
Examples!
Feature extraction!
3!
![Page 4: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/4.jpg)
Machine Learning!
source : http://www.cs.utexas.edu/~eladlieb/RLRG.html!
Reinforcement learning
Introduction to deep learning!
4!
![Page 5: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/5.jpg)
Inspiration from Human brain!
Introduction to deep learning!
• Over 100 billion neurons !• Around 1000 connections !• Electrical signal transmitted through axon, can be inhibitory
or excitatory!• Activation of a neuron depends on inputs!
5!
![Page 6: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/6.jpg)
Artificial Neuron!
• Circles —————————- Neuron!
source : http://cs231n.stanford.edu/!
• Arrow —————————— Synapse!• Weight —————————- Electrical signal!
Introduction to deep learning!
6!
![Page 7: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/7.jpg)
Nonlinearity!Introduction to deep learning!
• Neural network learns complicated structured data with non-linear activation functions!
7!
![Page 8: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/8.jpg)
Perceptron!
• Perceptron with linear activation!
Introduction to deep learning!
• In1969, cannot learn XOR function and takes very long to train!
Winter of AI!
8!
![Page 9: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/9.jpg)
Back-Propogation!
• 1970, got BS on psychology and keep studying NN at University of Edinburg !
Introduction to deep learning!
• 1986, Geoffrey Hinton and David Rumelhart, Nature paper "Learning Representations by Back-propagating errors”!
• Hidden layers added!
• Computation linearly depends on # of neurons !
• Of course, computers at that time is much faster!
9!
![Page 10: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/10.jpg)
Convolutionary NN!
• Yann Lecun, born at Paris, post-doc of Hinton at U Toronto.!
Introduction to deep learning!
• LeNet on hand-digital recognition at Bell Labs, 5% error rate!
• Attack skill : convolution, pooling!
10!
![Page 11: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/11.jpg)
Local connection!
11!
Introduction to deep learning!
![Page 12: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/12.jpg)
Weight sharing!
12!
Introduction to deep learning!
![Page 13: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/13.jpg)
Convolution!
• Feature map too big!
13!
Introduction to deep learning!
![Page 14: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/14.jpg)
Pooling!
14!
Introduction to deep learning!
http://vaaaaaanquish.hatenablog.com/entry/2015/01/26/060622!
![Page 15: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/15.jpg)
Support Vector Machine!
Introduction to deep learning!
• Vladmir Vapnik, born at Soviet Union, colleague of Yann Lecun , invented SVM at 1963!
• Attack skill : Max-margin + kernel mapping!
15!
![Page 16: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/16.jpg)
Another winter looming!Introduction to deep learning!
• SVM : good at “capacity control”, easy to use, repeat!
• NN : good at modelling complicated structure!
• SVM achieved 0.8% error rate as 1998 and 0.56% in 2002 on the same hand-digit recognition task. !
• NN stops at local optima, hard and takes long to train, overfitting!
16!
![Page 17: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/17.jpg)
Recurrent Neural Nets!Introduction to deep learning!
http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-2-implementing-a-language-model-rnn-with-python-numpy-and-theano/!
17!
Model sequential data!
![Page 18: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/18.jpg)
Vanish gradient!Introduction to deep learning!
18!
![Page 19: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/19.jpg)
LSTM!
• http://www.cs.toronto.edu/~graves/handwriting.html!19!
Introduction to deep learning!
![Page 20: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/20.jpg)
10 years funding!
• Hinton was happy and rebranded its neural network to deep learning. The story goes on …………..!
Introduction to deep learning!
20!
![Page 21: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/21.jpg)
Other Pioneers!Introduction to deep learning!
21!
![Page 22: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/22.jpg)
Data preparation!• Input : Can be image, video, text, sounds….!
• Output : Can be a translation of another language, location of an object, a caption describing a image ….!
Introduction to deep learning!
Training data! Validation data ! Test data !
Model!
22!
![Page 23: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/23.jpg)
Training !• Forward pass, propagating information from input to
ouput!
Introduction to deep learning!
23!
![Page 24: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/24.jpg)
Training!• Backward pass, propagating errors back and
update weights accordingly!
Introduction to deep learning!
24!
![Page 25: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/25.jpg)
Parameter update!• Gradient descent (sgd, mini-batch)!
Introduction to deep learning!
animation by Alec Radford!
• x = x - lr * dx!
source : http://cs231n.stanford.edu/!
25!
![Page 26: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/26.jpg)
Parameter update!• Gradient descent!
Introduction to deep learning!
• x = x - lr * dx!
animation by Alec Radford)!
• Second order method ? Quasi-Newton’s method!26!
![Page 27: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/27.jpg)
Problems!• While neural networks can approximate any
function of input X to output y, it is very hard to train with multiple layers!
• Initialization!
Introduction to deep learning!
• Pre-training!• More data/Dropout to avoid overfitting!• Accelerate with GPU!• Support training in parallel!
27!
![Page 28: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/28.jpg)
ImageNet!
• source :https://devblogs.nvidia.com/parallelforall/nvidia-ibm-cloud-support-imagenet-large-scale-visual-recognition-challenge/!
Big data + Big computational power!
• >1M well labeled images of 1000 classes!
28!
![Page 29: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/29.jpg)
Big data + Big computational power!
29!
![Page 30: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/30.jpg)
GPUs!
• source :http://techgage.com/article/gtc-2015-in-depth-recap-deep-learning-quadro-m6000-autonomous-driving-more/!
Big data + Big computational power!
30!
![Page 31: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/31.jpg)
Parallelization!• Data Parallelization: same weight but different data,
gradient needs to be synchronized!
Big data + Big computational power!
• Not scale well with size of cluster!
• Works well on smaller clusters and easy to implement !
source : Nvidia!31!
![Page 32: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/32.jpg)
• Model Parallelization: same data, but different parts of weight!
Parallelization!Big data + Big computational power!
• Parameters in a big Neural net can not fit into memory of a single GPU!
Figure from Yi Wang! 32!
![Page 33: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/33.jpg)
Model+Data Parallelization!Big data + Big computational power!
Krizhevsky et al. 2014!
computation!heavy!
weight heavy!
33!
![Page 34: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/34.jpg)
!
• Trained on 30M moves + Computation + Clear Reward scheme!
Big data + Big computational power!
34!
![Page 35: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/35.jpg)
• Task : A classifier to recognize hand written digits!
Feature extraction!
source :https://indico.io/blog/visualizing-with-t-sne/!
35!
![Page 36: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/36.jpg)
Hierarchal feature extraction!
Feature extraction!
• source : http://www.rsipvision.com/exploring-deep-learning/!36!
![Page 37: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/37.jpg)
Transfer learning!• Use features learned on ImageNet to tune your only classifier on
other tasks.!
Feature extraction!
• 9 categories + 200, 000 images!
• extract features from last 2 layers + linear SVM!37!
![Page 38: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/38.jpg)
Word embedding!• Embed words into vector space such that similar words are more
close to each other in vector space!
Feature extraction!
source : http://anthonygarvan.github.io/wordgalaxy/!
• Calculate cosine similarity !Pairs - France+ Italy = Rome!
38!
![Page 39: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/39.jpg)
Examples!
39!
![Page 40: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/40.jpg)
Go!
• 3361 different legal positions, roughly 10170!
Examples!
Brute Force does not work !!!!
• Monto Carlo Tree Search !Heuristic Searching method!Still slow since the search tree is so big!
40!
![Page 41: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/41.jpg)
AlphaGo!Examples!
• Offline learning : from human go players, learn a Deep learning based policy network and a rollout policy
• Use self-playing result to train a value network, which evaluate the current situation!
41!
![Page 42: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/42.jpg)
Online Playing!Examples!
• When playing with Lee Sedol, policy network give possible moves (about 10 to 20), the value network calculates the weight of position.!
• MCTS update the weight of each possible moves and select one move.!
• The engineering effort on systems with CPU and GPU with data , model parallelism matters
42!
![Page 43: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/43.jpg)
Image recognition!Examples!
Train in a supervised fashion !43!
![Page 44: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/44.jpg)
Caption generation!Examples!
44!
![Page 45: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/45.jpg)
Neural MT!
• The main idea behind RNNs is to compress a sequence of input symbols into a fixed-dimensional vector by using recursion.!
Examples!
source : https://devblogs.nvidia.com/parallelforall/introduction-neural-machine-translation-with-gpus/!45!
![Page 46: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/46.jpg)
Neural arts!Examples!
46!
![Page 47: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/47.jpg)
Go deeper !Updates of deep learning!
• Deep Residual Learning for Image Recognition, MSRA, 2015!47!
![Page 48: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/48.jpg)
Go multi-modal!Updates of deep learning!
48!
![Page 49: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/49.jpg)
Go End-End!Updates of deep learning!
source : https://devblogs.nvidia.com/parallelforall/deep-speech-accurate-speech-recognition-gpu-accelerated-deep-learning/#.VO56E8ZtEkM.google_plusone_share!
49!
![Page 50: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/50.jpg)
Go with attention!Updates of deep learning!
50!
![Page 51: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/51.jpg)
Go with Specalized Chips!Updates of deep learning!
• Prototypes!
51!
![Page 52: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/52.jpg)
Go where!• Video understanding!
Updates of deep learning!
• Deep reinforcement learning!
• Natural language understanding!
• Unsupervised learning !!!!
52!
![Page 53: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/53.jpg)
Action classification!
53!
![Page 54: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/54.jpg)
54!
![Page 55: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/55.jpg)
55!
![Page 56: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/56.jpg)
Summary!• “The takeaway is that deep learning excels in tasks
where the basic unit, a single pixel, a single frequency, or a single word/character has little meaning in and of itself, but a combination of such units has a useful meaning.” — Dallin Akagi!
• Structured data + well defined cost function/rewards !
• learning in a End-End fashion is nice trend!
56!
![Page 57: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/57.jpg)
Thoughts!• Deep leaning will change our lives in a positive way!
• We learn fast by either competing with friend like AlphaGo, or they directly teach us in a effective way
• What might happen in the future!
• Still long way to go towards real AI!
57!
![Page 58: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/58.jpg)
list of materials!
• https://github.com/ChristosChristofidis/awesome-deep-learning!
58!
![Page 59: Tutorial on Deep Learning](https://reader033.vdocument.in/reader033/viewer/2022050614/5a647b347f8b9a88568b4787/html5/thumbnails/59.jpg)
Acknowledgement!
• Tim Dettmers!
• Lyu xiyu!
59!