deep learning presentation
About Me
Education:
● B.S. Math - University of Utah
● M.S. Statistics - Utah State University
Career:
● Data Scientist - PurePredictive
Hobbies:
● Traveling, Tennis, Movies
What is Deep Learning?
(Short) Definition: A set of algorithms intended to model high-level abstractions in data by layering non-linear transformations.
Deep Learning has several names:
● Hierarchical Learning
● (Artificial) Neural Networks
● Multilayer Perceptrons
● Deep Structured Learning
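To make "layering non-linear transformations" concrete, here is a minimal sketch using only the Python standard library. The layer sizes, the tanh non-linearity, the random weights, and the helper names (`dense`, `tanh_layer`, `forward`, `random_layer`) are all illustrative assumptions, not something taken from the slides. No training happens here; it only shows how the output of one layer becomes the input of the next.

```python
import math
import random

def dense(inputs, weights, biases):
    """Affine map: one output value per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def tanh_layer(inputs, weights, biases):
    """One non-linear transformation: affine map followed by tanh."""
    return [math.tanh(z) for z in dense(inputs, weights, biases)]

def forward(inputs, layers):
    """Feed the input through each (weights, biases) layer in order."""
    activations = inputs
    for weights, biases in layers:
        activations = tanh_layer(activations, weights, biases)
    return activations

def random_layer(n_in, n_out, rng):
    """Hypothetical helper: untrained random weights, zero biases."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

rng = random.Random(0)
# Three stacked layers: 4 inputs -> 8 hidden -> 8 hidden -> 2 outputs.
layers = [random_layer(4, 8, rng), random_layer(8, 8, rng), random_layer(8, 2, rng)]
output = forward([0.5, -1.2, 3.0, 0.1], layers)
print(output)
```

Adding depth is just a matter of appending more entries to `layers`; a real network would also learn the weights, typically by backpropagation.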
One Learning Algorithm Hypothesis
There is a lot of evidence that most perception (input processing) in the brain is due to one learning algorithm.
[Metin & Frost, 1989][Roe et al., 1992]
Differences from the Brain
● The way artificial neurons fire is fundamentally different from the way biological neurons fire.
● A human brain has 100 billion neurons and 100 trillion connections (synapses) and operates on 20 watts (enough to run a dim light bulb). In comparison, the 2012 Google Brain project had 10 million neurons and 1 billion connections on 16,000 CPUs (about 3 million watts).
● The brain is limited to 5 types of input data from the 5 senses.
● Children do not learn what a cow is by reviewing 100,000 pictures labeled “cow” and “not cow”, but this is how deep learning works.
Records set in the Past Six Months
New records set, surpassing human performance, on the following datasets:
● CIFAR-10 (7% human error)
○ Oct. 20 - 4.5% error using Spatially-Sparse Convolutional Nets (Dr. Ben Graham, University of Warwick)
● ImageNet - 3.2 million images with 1,000 classification categories (5.1% human error)
○ Jan. 16 - 5.98% error using data augmentation (Baidu)
○ Feb. 10 - 4.94% error using Parametric Rectified Linear Units (Microsoft)
○ Feb. 11 - 4.8% error using Batch Normalization (Google)
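Two of the techniques named above are simple enough to sketch in a few lines. The following is a rough illustration, not the code used for the ImageNet results: `prelu` applies a Parametric Rectified Linear Unit to a single value, and `batch_norm` normalizes one feature across a mini-batch as done at training time. The function names, the slope `a=0.25`, and the sample activations are assumptions chosen for illustration; in a real network `a`, `gamma`, and `beta` are learned.

```python
import math

def prelu(x, a):
    """Parametric ReLU: identity for positive inputs, slope `a` for the rest."""
    return x if x > 0 else a * x

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch, then scale and shift.

    The batch is shifted to zero mean and scaled to (roughly) unit variance;
    `eps` guards against division by zero.
    """
    mean = sum(batch) / len(batch)
    variance = sum((v - mean) ** 2 for v in batch) / len(batch)
    return [gamma * (v - mean) / math.sqrt(variance + eps) + beta for v in batch]

# Illustrative activations for one feature across a mini-batch of five examples.
activations = [-2.0, -0.5, 0.0, 1.5, 3.0]
rectified = [prelu(v, a=0.25) for v in activations]
normalized = batch_norm(activations)
print(rectified)
print(normalized)
```

With `a=0` the first function reduces to an ordinary ReLU, which is one way to see PReLU as a generalization of it.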
Image Caption Generation
Some of the captions are unbelievably good...
“Two pizzas sitting on top of a stove top oven”
“A group of young people playing a game of frisbee”
Image Caption Generation
Some are not so great...
“A refrigerator filled with lots of food and drinks”
“A yellow school bus parked in a parking lot”
Weaknesses and Criticisms
Some deep learning models have been somewhat of a letdown...
● Unsupervised Learning
● Energy Based Models
[Google, 2012]
Weaknesses and Criticisms
Each of these pictures was classified by a trained deep neural net with >= 99.6% certainty.
[Nguyen, 2014]
Weaknesses and Criticisms
Pictures on the left are classified correctly by a trained deep neural network, while pictures on the right (although indistinguishable to humans) are misclassified. The pictures in the middle represent the pixel differences between the left and right pictures.
[Google, 2014]
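A toy sketch can show why tiny per-pixel changes may add up to a different classification. This is not the procedure from the cited work and not a deep network: it uses a hypothetical linear classifier with made-up weights, and nudges every pixel by at most `epsilon` against the sign of its weight. All names and numbers below are assumptions for illustration.

```python
def score(weights, bias, pixels):
    """Linear classifier: a positive score means class A, a negative one class B."""
    return sum(w * p for w, p in zip(weights, pixels)) + bias

def sign(value):
    """Return -1, 0, or 1 according to the sign of `value`."""
    return (value > 0) - (value < 0)

def perturb(weights, pixels, epsilon):
    """Shift each pixel by at most `epsilon` in the direction that lowers the score."""
    return [p - epsilon * sign(w) for p, w in zip(pixels, weights)]

# Hypothetical 1000-pixel "image" and classifier weights.
n_pixels = 1000
weights = [0.05 if i % 2 == 0 else -0.05 for i in range(n_pixels)]
bias = 0.2
image = [0.5] * n_pixels

adversarial = perturb(weights, image, epsilon=0.01)

print(score(weights, bias, image) > 0)        # is the original classified as class A?
print(score(weights, bias, adversarial) > 0)  # is the perturbed image still class A?
print(max(abs(a - p) for a, p in zip(adversarial, image)))  # largest single-pixel change
```

Each pixel moves by only 0.01 on a 0-to-1 scale, but the effect on the score grows with the number of pixels, so in this toy setup the many small shifts together are enough to push the score across the decision boundary.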
When to use Deep Learning
When to use Deep Learning:
● You have a large amount of data and need a model that scales well [Andrew Ng, 2012]
● You are modeling image/audio/language/time-series data
● Your data contains high-level abstractions
● You need a model that is less reliant on handmade features, and instead can learn important features from the data
When not to use Deep Learning:
● You don’t have time to clean your data
● You have a small data set
● Your primary goal is to extract insights from the data
● You read a tech article about it being magic
Tip 1: Understand the Methodology
First go through this tutorial: http://deeplearning.stanford.edu/tutorial/
Then start reading papers by these guys:
● Alex Krizhevsky (Google)
● Yann LeCun (Facebook)
● Yoshua Bengio (U. of Montreal)
● Geoff Hinton (Google)
● Andrew Ng (Baidu)
Topics: backpropagation, Boltzmann machines, convolution, stacked auto-encoders, GPU utilization, dropout
Tip 2: Choose a Framework
Language      Library
C++ / CUDA    Cuda-Convnet2, Caffe
Julia         Mocha
Lua           Torch7
Python        Caffe, Theano
R             H2O
Tip 3: Use a (Nvidia) GPU
It is a lot faster than a CPU, and much more economical than using a CPU cluster.
Tip 4: Start with a Well-Studied Data Set
Why I recommend this:
● Starter code is abundantly available, no matter which programming language you choose
● The benchmarks are well established
● These datasets can run out-of-the-box
MNIST - Live Demo