Deep Generative Models
TRANSCRIPT
Discriminative vs. Generative Learning
Discriminative Learning: learn p(y|x) directly.
Generative Learning: model p(y) and p(x|y) first, then derive the posterior via Bayes' rule:
p(y|x) = p(x|y) p(y) / p(x)
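As a toy numeric example of deriving the posterior from a prior and a likelihood (all probabilities here are hypothetical, purely for illustration):

```python
# Hypothetical spam-filter numbers: y is the class, x is "message contains 'free'".
p_y = {"spam": 0.3, "ham": 0.7}            # prior p(y)
p_x_given_y = {"spam": 0.8, "ham": 0.1}    # likelihood p(x|y)

# Evidence p(x) = sum over y of p(x|y) p(y)
p_x = sum(p_x_given_y[y] * p_y[y] for y in p_y)

# Posterior p(y|x) = p(x|y) p(y) / p(x)
posterior = {y: p_x_given_y[y] * p_y[y] / p_x for y in p_y}
print(posterior)  # spam: 0.24 / 0.31, roughly 0.774
```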
Undirected Graph vs. Directed Graph
Undirected:
• Boltzmann Machines
• Restricted Boltzmann Machines
• Deep Boltzmann Machines

Directed:
• Sigmoid Belief Networks
• Variational Autoencoders (VAE)
• Generative Adversarial Networks (GAN)
[Figure: example three-node graphs over nodes a, b, c — one undirected, one directed]
Hybrid of both: Deep Belief Networks
Boltzmann Machines
• Stochastic recurrent neural network and Markov random field, invented by Hinton and Sejnowski in 1985
• P(x) = exp(−E(x)) / Z
  > E(x): energy function
  > Z: partition function, ensuring Σ_x P(x) = 1
• Energy-based model: P(x) is always positive, since exp(−E(x)) > 0
• Single visible layer and single hidden layer
• Fully connected: not practical to train at scale
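For a toy network the partition function Z can be computed by brute-force enumeration, which makes the definition concrete. The weights below are random and purely illustrative:

```python
import itertools
import math

import numpy as np

rng = np.random.default_rng(0)

# Tiny fully connected Boltzmann machine over 4 binary units:
# symmetric weight matrix with zero diagonal, plus unit biases.
n = 4
W = rng.normal(scale=0.5, size=(n, n))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)
b = rng.normal(scale=0.1, size=n)

def energy(x):
    x = np.asarray(x, dtype=float)
    return -0.5 * x @ W @ x - b @ x

# Partition function Z: sum of exp(-E(x)) over all 2^n binary states.
states = list(itertools.product([0, 1], repeat=n))
Z = sum(math.exp(-energy(s)) for s in states)

def prob(x):
    return math.exp(-energy(x)) / Z

total = sum(prob(s) for s in states)
print(total)  # sums to 1.0 (up to float error), as the slide requires
```

Brute force only works for tiny n — with 2^n states the sum is exactly the intractability that makes fully connected BMs impractical.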
Restricted Boltzmann Machines
• Used for dimensionality reduction, classification, regression, collaborative filtering, feature learning, and topic modeling
• P(V = v, H = h) = exp(−E(v, h)) / Z
• Two layers like BMs, but with no connections within a layer
• Building blocks of deep probabilistic models
• Trained with Gibbs sampling and Contrastive Divergence (CD) or Persistent CD
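A single CD-1 update step can be sketched in numpy; the layer sizes, data batch, and learning rate below are illustrative, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Tiny RBM: 6 visible, 3 hidden units (illustrative sizes).
n_v, n_h = 6, 3
W = rng.normal(scale=0.1, size=(n_v, n_h))
b = np.zeros(n_v)  # visible bias
c = np.zeros(n_h)  # hidden bias

# A random binary mini-batch standing in for real data.
v0 = rng.integers(0, 2, size=(10, n_v)).astype(float)

# CD-1: positive phase, one Gibbs step, negative phase.
ph0 = sigmoid(v0 @ W + c)                  # p(h=1 | v0)
h0 = (rng.random(ph0.shape) < ph0) * 1.0   # sample hidden states
pv1 = sigmoid(h0 @ W.T + b)                # p(v=1 | h0): reconstruction
ph1 = sigmoid(pv1 @ W + c)                 # p(h=1 | v1)

# Gradient approximation: positive minus negative statistics.
lr = 0.1
W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)
b += lr * (v0 - pv1).mean(axis=0)
c += lr * (ph0 - ph1).mean(axis=0)
```

Persistent CD differs only in that the negative-phase chain is kept alive across updates instead of being restarted from the data each step.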
Comparison between BMs and RBMs
Boltzmann Machines vs. Restricted Boltzmann Machines:
[Figure: BM with full connectivity among v1–v3 and h1–h4, vs. RBM with connections only between the visible layer (v1–v3) and the hidden layer (h1–h4)]
E(v, h) = −v^T R v − h^T S h − v^T W h − b^T v − c^T h (BM)
E(v, h) = −b^T v − c^T h − v^T W h (RBM)
Deep Belief Networks
• Unsupervised
• Works with small datasets
• Stacked RBMs
• Pre-train each RBM greedily, layer by layer
• Undirected (top two layers) + Directed (lower layers)
[Figure: DBN with visible layer v and hidden layers h1–h3; the top pair (h2, h3) forms an RBM, the lower layers form a sigmoid belief net]
• p(v, h1, h2, h3) = p(v | h1) p(h1 | h2) p(h2, h3)
• p(v | h1) = Π_i p(v_i | h1)
• p(h1 | h2) = Π_i p(h1_i | h2)
• p(h2, h3) = exp(h2^T W3 h3) / Z(W3)
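Sampling from this factorization runs Gibbs sampling in the top (undirected) RBM, then ancestral sampling down the directed layers. A minimal numpy sketch with illustrative, untrained weights (biases omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def bernoulli(p):
    return (rng.random(p.shape) < p) * 1.0

# Illustrative layer sizes for v, h1, h2, h3 and random weights.
sizes = [8, 6, 4, 3]
W1 = rng.normal(scale=0.1, size=(sizes[0], sizes[1]))
W2 = rng.normal(scale=0.1, size=(sizes[1], sizes[2]))
W3 = rng.normal(scale=0.1, size=(sizes[2], sizes[3]))

# 1) Gibbs sampling in the top RBM over (h2, h3): p(h2, h3) is undirected.
h2 = bernoulli(np.full(sizes[2], 0.5))
for _ in range(100):
    h3 = bernoulli(sigmoid(h2 @ W3))
    h2 = bernoulli(sigmoid(h3 @ W3.T))

# 2) Directed (ancestral) passes down: h1 ~ p(h1 | h2), then v ~ p(v | h1).
h1 = bernoulli(sigmoid(h2 @ W2.T))
v = bernoulli(sigmoid(h1 @ W1.T))
print(v)  # a binary sample from the (untrained) DBN
```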
[Figure: DBN layers v, h1, h2, h3 with weights W1, W2, W3 — sigmoid belief net below, RBM on top]
Limitations of DBN (by Ruslan Salakhutdinov)
• Explaining away
• Greedy layer-wise pre-training
  > no joint optimization over all layers
• Approximate inference is feed-forward only
  > no combined bottom-up and top-down pass
http://www.slideshare.net/zukun/p05-deep-boltzmann-machines-cvpr2012-deep-learning-methods-for-vision
Deep Boltzmann Machines
• Unsupervised
• Works with small datasets
• Stacked RBMs
• Pre-train each RBM
• Fully undirected
• P_θ(v) = (1/Z(θ)) Σ_{h1,h2,h3} exp(v^T W1 h1 + h1^T W2 h2 + h2^T W3 h3)
• θ = {W1, W2, W3}
• Inference combines bottom-up and top-down input:
• p(h2_j = 1 | h1, h3) = σ(Σ_i W2_ij h1_i + Σ_m W3_jm h3_m)
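The bottom-up plus top-down conditional for the middle layer can be sketched in numpy; the sizes and weights below are illustrative, not trained:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Illustrative layer sizes for h1, h2, h3 and random DBM weights.
n1, n2, n3 = 6, 4, 3
W2 = rng.normal(scale=0.1, size=(n1, n2))  # connects h1 and h2
W3 = rng.normal(scale=0.1, size=(n2, n3))  # connects h2 and h3

h1 = rng.integers(0, 2, size=n1).astype(float)
h3 = rng.integers(0, 2, size=n3).astype(float)

# p(h2_j = 1 | h1, h3): bottom-up input through W2 plus
# top-down input through W3, passed through a sigmoid.
p_h2 = sigmoid(h1 @ W2 + W3 @ h3)
print(p_h2)
```

This is exactly what a DBN lacks: its feed-forward inference uses only the bottom-up term, whereas the DBM conditional also receives the top-down W3 h3 contribution.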
[Figure: DBM with layers v, h1, h2, h3 and undirected weights W1, W2, W3]
Variational Autoencoders (VAE)
[Figure: VAE pipeline — encoder q(z|x) maps x to μ and σ, z is sampled, decoder p(x|z) reconstructs x′]
L(θ, φ; x) = −D_KL(q_φ(z|x) ‖ p_θ(z)) + E_{z~q_φ(z|x)}[log p_θ(x|z)]
z ~ N(μ, σ²)
http://www.slideshare.net/KazukiNitta/variational-autoencoder-68705109
Stochastic Gradient Variational Bayes (SGVB) Estimator
[Figure: VAE pipeline again — feed-forward pass through encoder, z, decoder; back-propagation flows through μ and σ via the reparameterization]
ε ~ N(0, I)
z = μ + σ ⊙ ε
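The reparameterization trick and the analytic Gaussian KL term of the VAE loss can be sketched in numpy. The encoder outputs below are random stand-ins, since no network is trained here:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative encoder outputs for a batch of 5 samples, latent dim 2.
mu = rng.normal(size=(5, 2))
log_var = rng.normal(scale=0.1, size=(5, 2))
sigma = np.exp(0.5 * log_var)

# Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I).
# The randomness is isolated in eps, so gradients can flow through
# mu and sigma during back-propagation.
eps = rng.standard_normal(mu.shape)
z = mu + sigma * eps

# Analytic KL(N(mu, sigma^2) || N(0, I)), summed over latent dimensions:
# 0.5 * sum(mu^2 + sigma^2 - log sigma^2 - 1)
kl = 0.5 * np.sum(mu**2 + sigma**2 - log_var - 1.0, axis=1)
print(kl)  # one non-negative value per sample
```

Without the reparameterization, z would be sampled directly from N(μ, σ²) and the sampling node would block the gradient — which is why the figure routes back-propagation through μ and σ.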
Generative Adversarial Networks (GAN)
[Figure: GAN pipeline — the generator maps noise to a sample; the discriminator receives real data samples and generated samples and answers yes (real) or no (fake)]
https://ishmaelbelghazi.github.io/ALI
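The discriminator's yes/no decision is trained through the GAN value function V(D, G) = E_x[log D(x)] + E_z[log(1 − D(G(z)))], which D maximizes and G minimizes. It can be evaluated on a toy 1-D example; the data distribution, "generator", and fixed "discriminator" below are all hypothetical stand-ins, with no training involved:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Toy 1-D setup: real data ~ N(4, 1); the "generator" just shifts noise.
real = rng.normal(4.0, 1.0, size=1000)
noise = rng.normal(0.0, 1.0, size=1000)
fake = noise + 1.0  # generator output (illustrative, untrained)

# Hypothetical fixed discriminator: logistic score favouring samples near 4.
def D(x):
    return sigmoid(2.0 * (x - 2.0))

# V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))]
V = np.mean(np.log(D(real))) + np.mean(np.log(1.0 - D(fake)))
print(V)
```

Here V is close to 0 from below because the untrained generator's samples are easy to reject; as G improves, D(G(z)) rises and V decreases toward the equilibrium value.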
Deep Convolutional Generative Adversarial Networks (DCGAN)
https://openai.com/blog/generative-models/