Lecture 11: Deep Sequential Models
TRANSCRIPT
![Page 1: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/1.jpg)
UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES DEEP SEQUENTIAL MODELS - 1
Lecture 11: Deep Sequential Models
Efstratios Gavves
![Page 2: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/2.jpg)
o Autoregressive Models
o PixelCNN, PixelCNN++, PixelRNN
o WaveNet
o Time-Aligned DenseNets
Lecture overview
![Page 3: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/3.jpg)
o Let’s assume we have a signal modelled by an input random variable $x$
◦ Can be an image, video, text, music, temperature measurements
o Is there an order in all these signals?
Autoregressive Models
![Page 4: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/4.jpg)
![Page 5: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/5.jpg)
o Other signals and orders?
![Page 6: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/6.jpg)
![Page 7: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/7.jpg)
o If $x$ is sequential, there is an order: $x = [x_1, \dots, x_D]$
◦ E.g., the order of words in a sentence
o If $x$ is not sequential, we can create an artificial order $x = [x_{r(1)}, \dots, x_{r(D)}]$
◦ E.g., the order in which pixels make up (generate) an image
o Then, the marginal likelihood is a product of conditionals
$$p(x) = \prod_{k=1}^{D} p(x_k \mid x_{j<k})$$
o Different from Recurrent Neural Networks
◦ (a) no parameter sharing
◦ (b) chains are not infinite in length
Autoregressive Models
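The product of conditionals can be checked numerically. The sketch below is only an illustration: `conditional` is a hypothetical toy model for binary sequences (it is not from the lecture), and the point is that summing log-conditionals gives a properly normalized $p(x)$.

```python
import itertools
import math

def conditional(prefix):
    """Hypothetical toy model: p(x_k = 1 | x_<k), from the count of ones seen so far."""
    return (1 + sum(prefix)) / (2 + len(prefix))

def log_likelihood(x):
    """log p(x) = sum_k log p(x_k | x_<k) for a binary sequence x."""
    total = 0.0
    for k, x_k in enumerate(x):
        p_one = conditional(x[:k])
        total += math.log(p_one if x_k == 1 else 1.0 - p_one)
    return total

# Every conditional is normalized, so the joint sums to 1 over all 2^D sequences.
D = 4
total_prob = sum(math.exp(log_likelihood(list(bits)))
                 for bits in itertools.product([0, 1], repeat=D))
```

Because each factor is an ordinary normalized distribution, the likelihood is exact and needs no partition function, which is the tractability argument made on the next slide.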
![Page 8: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/8.jpg)
o Pros: because of the product decomposition, $p(x)$ is tractable
o Cons: because $p(x)$ is sequential, training is slower
◦ To generate every new word/frame/pixel, the previous words/frames/pixels in the order must be generated first, so there is no parallelism
Autoregressive Models
![Page 9: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/9.jpg)
o Sequential data is a natural fit
◦ Language modelling, time series, etc.
◦ For non-sequential data they are ok, although arguably artificial
o Question: How to model the conditionals $p(x_k \mid x_{j<k})$?
◦ Logistic regression (Frey et al., 1996)
◦ Neural networks (Bengio and Bengio, 2000)
o Modern deep autoregressors
◦ NADE, MADE, PixelCNN, PixelCNN++, PixelRNN, WaveNet
The where and now of Deep Autoregressive Models
![Page 10: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/10.jpg)
NADE
Neural Autoregressive Distribution Estimation, Larochelle and Murray, AISTATS 2011
![Page 11: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/11.jpg)
o Minimizing negative log-likelihood as usual
$$\mathcal{L} = -\log p(x) = -\sum_{k=1}^{D} \log p(x_k \mid x_{<k})$$
o Then, we model the conditional as
$$p(x_d \mid x_{<d}) = \sigma(V_{d,:} \cdot h_d + b_d)$$
where the latent variable $h_d$ is defined as
$$h_d = \sigma(W_{:,<d} \cdot x_{<d} + c)$$
NADE
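A minimal sketch of the NADE likelihood for binary inputs, assuming the formulation of Larochelle and Murray in which the hidden state $h_d$ is computed from the preceding inputs $x_{<d}$ only. The function name, the random weights and the sizes are illustrative assumptions. The running pre-activation `a` is updated one input at a time, so the whole pass costs O(DH).

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def nade_log_likelihood(x, W, V, b, c):
    """log p(x) for binary x. W: H x D, V: D x H, b: D output biases, c: H hidden biases."""
    H, D = len(c), len(x)
    a = list(c)  # pre-activation of h_d; equals c for d = 1 because x_<1 is empty
    log_p = 0.0
    for d in range(D):
        h = [sigmoid(a_j) for a_j in a]  # h_d depends on x_<d only
        p_one = sigmoid(b[d] + sum(V[d][j] * h[j] for j in range(H)))  # p(x_d = 1 | x_<d)
        log_p += math.log(p_one if x[d] == 1 else 1.0 - p_one)
        for j in range(H):  # fold x_d in, so that a is ready for h_{d+1}
            a[j] += W[j][d] * x[d]
    return log_p

rng = random.Random(0)
D, H = 5, 3
W = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(H)]
V = [[rng.gauss(0, 1) for _ in range(H)] for _ in range(D)]
b = [0.0] * D
c = [0.0] * H
nll = -nade_log_likelihood([1, 0, 1, 1, 0], W, V, b, c)
```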
![Page 12: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/12.jpg)
o “Teacher forcing” training
NADE: Training & Testing
Training: Use ground truth values (e.g. of pixels)
Testing: Use the values predicted earlier in the order
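The difference between the two modes can be sketched as follows. `conditional` is a hypothetical stand-in for a trained model, not NADE itself. With teacher forcing every prefix is known up front, so all conditionals could be evaluated independently, whereas sampling has to feed each generated value back in.

```python
import random

def conditional(prefix):
    """Hypothetical stand-in for a trained model: p(x_k = 1 | x_<k)."""
    return (1 + sum(prefix)) / (2 + len(prefix))

def teacher_forced_probs(x_true):
    """Training: every conditional is fed the ground-truth prefix."""
    return [conditional(x_true[:k]) for k in range(len(x_true))]

def sample(D, rng):
    """Testing: every conditional is fed the values the model generated itself."""
    x = []
    for _ in range(D):
        x.append(1 if rng.random() < conditional(x) else 0)
    return x

probs = teacher_forced_probs([1, 1, 0, 1])
generated = sample(4, random.Random(0))
```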
![Page 13: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/13.jpg)
NADE Visualizations
![Page 14: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/14.jpg)
o Question: How could we construct an autoregressive autoencoder?
o To rephrase: How to modify an autoencoder such that each output $x_k$ depends only on the previous outputs $x_{<k}$ (autoregressive property)?
◦ Namely, the present $k$-th output $x_k$ must not depend on a computational path from future inputs $x_{k+1}, \dots, x_D$
◦ Autoregressive: $p(x \mid \theta) = \prod_{k=1}^{D} p(x_k \mid x_{j<k}, \theta)$
◦ Autoencoder: $p(\hat{x} \mid x, \theta) = \prod_{k=1}^{D} p(\hat{x}_k \mid x_k, \theta)$
MADE
Masked Autoencoder for Distribution Estimation, Germain, Mathieu et al., ICML 2015
![Page 15: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/15.jpg)
o Answer: Masked weights! Each weight matrix is multiplied elementwise by a binary mask
$$h(x) = g(b + (W \odot M^W) \cdot x)$$
$$\hat{x} = \sigma(c + (V \odot M^V) \cdot h(x))$$
MADE
Masked Autoencoder for Distribution Estimation, Germain, Mathieu et al., ICML 2015
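One way to build such masks is the degree rule from the MADE paper, sketched here for a single hidden layer. Function names and sizes are illustrative assumptions. Each hidden unit gets a degree $m_j \in \{1, \dots, D-1\}$, it may see inputs $d \le m_j$, and output $d$ may see hidden units with $m_j < d$.

```python
import random

def made_masks(D, H, rng):
    """Binary masks for a one-hidden-layer MADE with D inputs and H hidden units."""
    m = [rng.randint(1, D - 1) for _ in range(H)]  # degree of each hidden unit
    # Hidden unit j may see input d (1-indexed) only if m[j] >= d.
    M_W = [[1 if m[j] >= d else 0 for d in range(1, D + 1)] for j in range(H)]
    # Output d may see hidden unit j only if d > m[j].
    M_V = [[1 if d > m[j] else 0 for j in range(H)] for d in range(1, D + 1)]
    return M_W, M_V

def connectivity(M_W, M_V):
    """paths[d][e] = number of input-e -> hidden -> output-d paths the masks leave open."""
    D, H = len(M_V), len(M_W)
    return [[sum(M_V[d][j] * M_W[j][e] for j in range(H)) for e in range(D)]
            for d in range(D)]

M_W, M_V = made_masks(D=5, H=8, rng=random.Random(0))
paths = connectivity(M_W, M_V)
# Autoregressive property: output d has no path from inputs e >= d.
is_autoregressive = all(paths[d][e] == 0 for d in range(5) for e in range(d, 5))
```

Any path from input $e$ to output $d$ needs a hidden unit with $e \le m_j < d$, so only strictly earlier inputs can reach each output.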
![Page 16: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/16.jpg)
MADE
Masked Autoencoder for Distribution Estimation, Germain, Mathieu et al., ICML 2015
![Page 17: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/17.jpg)
o Unsupervised learning: learn how to model $p(x)$
o Decompose the marginal
$$p(x) = \prod_{i=1}^{n^2} p(x_i \mid x_1, \dots, x_{i-1})$$
o Assume row-wise pixel by pixel generation and sequential colors R, G, B
◦ Each color conditioned on all colors from previous pixels and specific colors in the same pixel
$$p(x_{i,R} \mid x_{<i}) \cdot p(x_{i,G} \mid x_{<i}, x_{i,R}) \cdot p(x_{i,B} \mid x_{<i}, x_{i,R}, x_{i,G})$$
o Final output is 256-way softmax
PixelRNN
Pixel Recurrent Neural Networks, van den Oord, Kalchbrenner and Kavukcuoglu, arXiv 2016
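The ordering and the softmax output can be made concrete with a small sketch. The helper names are assumptions, and the all-zero logits stand in for a real network's output. Each sub-pixel is one term in the product, and its log-probability is read off a 256-way softmax.

```python
import math

def generation_order(n):
    """Raster-scan order over an n x n RGB image: rows, then columns, then R, G, B."""
    return [(r, c, ch) for r in range(n) for c in range(n) for ch in "RGB"]

def log_softmax_prob(logits, value):
    """Log-probability of one intensity under a softmax over len(logits) classes."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return logits[value] - log_z

order = generation_order(2)
# With uniform (all-zero) logits each intensity has probability 1/256,
# so an n x n RGB image costs 3 * n^2 * log(256) nats.
uniform_nll = -sum(log_softmax_prob([0.0] * 256, 0) for _ in order)
```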
![Page 18: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/18.jpg)
o How to model the conditionals?
$$p(x_{i,R} \mid x_{<i}), \quad p(x_{i,G} \mid x_{<i}, x_{i,R}), \quad p(x_{i,B} \mid x_{<i}, x_{i,R}, x_{i,G})$$
o LSTM variants
◦ 12 layers
o Row LSTM
o Diagonal Bi-LSTM
PixelRNN
Row LSTM Diagonal Bi-LSTM
![Page 19: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/19.jpg)
o Hidden state (i, j) = Hidden state (i-1, j-1) + Hidden state (i-1, j) + Hidden state (i-1, j+1) + p(i, j)
o By recursion the hidden state captures a fairly triangular region
PixelRNN - RowLSTM
Row LSTM
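The triangular region follows directly from unrolling the recursion. The sketch below is a simplification: it tracks only the state-to-state dependencies on the three neighbours in the row above and ignores the input convolution within the current row.

```python
def row_lstm_context(i, j, width):
    """Positions in rows above (i, j) reached by the (i-1, j-1), (i-1, j), (i-1, j+1) recursion."""
    frontier, context = {(i, j)}, set()
    while frontier:
        nxt = set()
        for (r, c) in frontier:
            if r == 0:
                continue  # top row: nothing above
            for dc in (-1, 0, 1):
                if 0 <= c + dc < width:
                    nxt.add((r - 1, c + dc))
        context |= nxt
        frontier = nxt
    return context

ctx = row_lstm_context(3, 3, width=7)
# One row up covers 3 columns, two rows up 5, three rows up 7: a triangle.
```

Pixels such as (2, 0) precede (3, 3) in raster order but fall outside the triangle, which is the gap the Diagonal Bi-LSTM is meant to close.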
![Page 20: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/20.jpg)
o How to capture the whole previous context?
o Pixel (i, j) = Pixel (i, j-1) + Pixel (i-1, j)
o Processing goes on diagonally
o Receptive field encompasses the entire region
PixelRNN – Diagonal BiLSTM
Diagonal Bi-LSTM
![Page 21: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/21.jpg)
o Propagate signal faster
o Speed up convergence
PixelRNN – Residual connections
![Page 22: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/22.jpg)
o Pros: good modelling of $p(x)$, hence nice image generation
o Half pro: Residual connections speed up convergence
o Cons: still slow training, slow generation
PixelRNN – Pros & Cons
Row LSTM Diagonal Bi-LSTM
![Page 23: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/23.jpg)
PixelRNN - Generations
![Page 24: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/24.jpg)
o Unfortunately, PixelRNN is too slow
o Solution: replace recurrent connections with convolutions
◦ Multiple convolutional layers to preserve spatial resolution
o Training is much faster because all true pixels are known in advance, so we can parallelize
◦ Generation is still sequential (pixels must be generated one by one), so it is still slow
PixelCNN
Stack of masked convolutions
Pixel Recurrent Neural Networks, van den Oord, Kalchbrenner and Kavukcuoglu, arXiv 2016
![Page 25: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/25.jpg)
o Use masked convolutions again to enforce autoregressive relationships
PixelCNN
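A minimal sketch of the spatial mask for a single-channel kernel, using the type A / type B naming from the PixelRNN paper. The function name is an assumption, and the additional masking between R, G, B channels is left out. Weights below the centre row, and to the right of the centre within it, are zeroed out.

```python
def conv_mask(kernel_size, mask_type):
    """Binary mask for a k x k kernel: 1 where the weight may be used.

    Type "A" (first layer) hides the centre pixel; type "B" (later layers) keeps it.
    """
    k, centre = kernel_size, kernel_size // 2
    mask = [[0] * k for _ in range(k)]
    for r in range(k):
        for c in range(k):
            above = r < centre
            left_in_row = r == centre and c < centre
            is_centre = r == centre and c == centre
            if above or left_in_row or (is_centre and mask_type == "B"):
                mask[r][c] = 1
    return mask

mask_a = conv_mask(3, "A")  # [[1, 1, 1], [1, 0, 0], [0, 0, 0]]
mask_b = conv_mask(3, "B")  # [[1, 1, 1], [1, 1, 0], [0, 0, 0]]
```

The mask is multiplied elementwise into the kernel before every convolution, so all positions can be computed in parallel during training.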
![Page 26: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/26.jpg)
o Cons: Performance is worse than PixelRNN
o Why?
PixelCNN – Pros and Cons
![Page 27: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/27.jpg)
o Not all past context is taken into account
o New problem: the cascaded convolutions create a “blind spot”
PixelCNN – Pros and Cons
![Page 28: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/28.jpg)
o Because of (a) the limited receptive field of convolutions and (b) computing all features at once (not sequentially), cascading convolutions makes the current pixel not depend on all previous pixels: the blind spot
Blind spot
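The blind spot can be reproduced by tracing which pixels a stack of 3x3 masked convolutions can reach. This is a sketch under simplifying assumptions: single channel, zero padding, a type-A first layer followed by type-B layers, and illustrative names.

```python
def pixelcnn_receptive_field(i, j, height, width, num_layers):
    """Pixels that can influence (i, j) after stacking 3x3 masked convolutions."""
    # Unmasked kernel taps; (0, 0) is the centre kept by type-B layers.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 0)]
    reached = {(i, j)}
    for _ in range(num_layers):
        reached |= {(r + dr, c + dc) for (r, c) in reached for (dr, dc) in offsets
                    if 0 <= r + dr < height and 0 <= c + dc < width}
    return reached - {(i, j)}  # the type-A first layer never sees the pixel itself

H = W = 6
i, j = 3, 2
field = pixelcnn_receptive_field(i, j, H, W, num_layers=10)
past = {(r, c) for r in range(H) for c in range(W) if (r, c) < (i, j)}  # raster-order context
blind_spot = sorted(past - field)
```

Even with ten layers, the pixels up and to the right of (3, 2), here (1, 5), (2, 4) and (2, 5), are never reached, because each step up can move at most one column to the right.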
![Page 29: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/29.jpg)
o Use two stacks of convolutions
◦ Horizontal stack: conditions on the current row and takes as input the previous layer output and the vertical stack
◦ Vertical stack: conditions on all rows above the current pixel
o Also replace ReLU with a gated activation $\tanh(W * x) \odot \sigma(U * x)$
Fixing the blind spot: Gated PixelCNN
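The gated activation itself is a one-liner. In this sketch the two convolution outputs $W * x$ and $U * x$ are assumed to be given as plain lists, and the function name is illustrative.

```python
import math

def gated_activation(filter_pre, gate_pre):
    """Elementwise tanh(filter) * sigmoid(gate), given the conv outputs W*x and U*x."""
    return [math.tanh(f) / (1.0 + math.exp(-g)) for f, g in zip(filter_pre, gate_pre)]

# A strongly negative gate closes the unit; a strongly positive gate passes tanh(f) through.
closed = gated_activation([2.0], [-20.0])[0]
opened = gated_activation([2.0], [20.0])[0]
```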
![Page 30: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/30.jpg)
o Coral reef
PixelCNN - Generations
![Page 31: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/31.jpg)
o Sorrel horse
PixelCNN - Generation
![Page 32: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/32.jpg)
o Sandbar
PixelCNN - Generation
![Page 33: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/33.jpg)
o Lhasa Apso
PixelCNN - Generation
![Page 34: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/34.jpg)
o Improving the PixelCNN model
o Replace the softmax output with a discretized logistic mixture likelihood
◦ Softmax is too memory consuming and gives sparse gradients
◦ Instead, assume a logistic distribution of intensity and round off to 8 bits
o Condition on whole pixels, not pixel colors
o Downsample with stride-2 convs to compute long-range dependencies
o Use shortcut connections
o Dropout
◦ PixelCNN is too powerful a framework and can overfit easily
PixelCNN++
PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications, Salimans, Karpathy, Chen, Kingma, ICLR 2017
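A sketch of the discretized logistic mixture for one sub-pixel, working directly in integer intensity units 0..255 rather than the rescaled range used in the paper. The component values are made up for illustration. The probability of an intensity is the logistic CDF mass falling in its bin, with the two edge bins absorbing the tails.

```python
import math

def sigmoid(a):
    # Numerically safe logistic function.
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def discretized_logistic_mixture(v, components):
    """P(v) for an integer intensity v in 0..255 under a mixture of logistics.

    components: list of (weight, mean, scale), with the weights summing to 1.
    """
    prob = 0.0
    for weight, mean, scale in components:
        upper = 1.0 if v == 255 else sigmoid((v + 0.5 - mean) / scale)  # edge bins take the tails
        lower = 0.0 if v == 0 else sigmoid((v - 0.5 - mean) / scale)
        prob += weight * (upper - lower)
    return prob

components = [(0.7, 60.0, 5.0), (0.3, 200.0, 12.0)]  # illustrative values
probs = [discretized_logistic_mixture(v, components) for v in range(256)]
```

The network then only has to output a few mixture parameters per sub-pixel instead of 256 logits, and neighbouring intensities share probability mass smoothly.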
![Page 35: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/35.jpg)
PixelCNN++
![Page 36: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/36.jpg)
PixelCNN++ - Generations
![Page 37: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/37.jpg)
o A standard VAE with a PixelCNN generator/decoder
o Be careful. Often the generator is so powerful that the encoder/inference network is ignored
◦ Whatever the latent code $z$, a nice image will be generated
PixelVAE
PixelVAE: A Latent Variable Model for Natural Images, Gulrajani et al., ICLR 2017
![Page 38: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/38.jpg)
PixelVAE
![Page 39: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/39.jpg)
PixelVAE - Generations
64x64 LSUN Bedrooms 64x64 ImageNet
![Page 40: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/40.jpg)
PixelVAE - Generations
Varying pixel-level noise
Varying bottom latents
Varying top latents
![Page 41: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/41.jpg)
WaveNet
WaveNet: A Generative Model for Raw Audio, van den Oord et al., arXiv 2016
![Page 42: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/42.jpg)
o Inspired by PixelRNN and PixelCNN
o Fully convolutional neural network
o Use dilated convolutions
o Samples
WaveNet
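A sketch of why dilation helps, assuming kernel size 2 and dilations that double per layer (the names and the toy signal are illustrative). Stacking causal dilated convolutions makes the receptive field grow exponentially with depth while each output still depends only on past samples.

```python
def causal_dilated_conv(x, weights, dilation):
    """y[t] = sum_k weights[k] * x[t - k * dilation], with samples before t = 0 taken as zero."""
    return [sum(w * x[t - k * dilation] for k, w in enumerate(weights) if t - k * dilation >= 0)
            for t in range(len(x))]

def receptive_field(kernel_size, dilations):
    """Number of input samples that can influence one output of the stacked layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

dilations = [1, 2, 4, 8]
signal = [0.0] * 16
signal[0] = 1.0  # unit impulse at t = 0
out = signal
for d in dilations:
    out = causal_dilated_conv(out, [1.0, 1.0], d)
# The impulse reaches exactly the next 16 outputs: four layers cover 16 samples.
```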
![Page 43: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/43.jpg)
WaveNet architecture
![Page 44: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/44.jpg)
o Video Time: Properties, Encoders and Evaluation
◦ A. Ghodrati, E. Gavves, C. Snoek, BMVC 2018
o How to model image sequences optimally?
VideoTime: Time-Aligned DenseNets
Video Time: Properties, Encoders and Evaluation, Ghodrati, Gavves, Snoek, BMVC 2018
![Page 45: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/45.jpg)
Properties of video
Forward Backward
![Page 46: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/46.jpg)
o Temporal asymmetry
o Temporal continuity
o Temporal causality
Video properties
[Figure labels: Natural order (+), Reverse order (-), “Pretending to put something into something”]
![Page 47: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/47.jpg)
Current solutions
[Figure labels: LSTM, CNN, 3D CNN]
3D convolutions learn spatiotemporal patterns within a video
LSTMs learn transitions between subsequent states
![Page 48: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/48.jpg)
o Autoregressive-like: each new frame directly depends on all previous frames
◦ They are not generative
o No parameter sharing like ConvNets, unlike RNNs
o Sequential like RNNs, unlike ConvNets
Time-Aligned DenseNets
[Figure labels: Layer 1, Layer 2, Layer 3, Layer 4, CNN (×4), FC-ReLU-Dropout, FC]
![Page 49: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/49.jpg)
Better latent space
![Page 50: Lecture 11: Deep Sequential Models - GitHub Pages › lectures › apr2019 › lecture11-deep...Lecture 11: Deep Sequential Models Efstratios Gavves UVA DEEP LEARNING COURSE –EFSTRATIOS](https://reader036.vdocument.in/reader036/viewer/2022081402/5f27c5e319fcd85e0e2255f5/html5/thumbnails/50.jpg)
Summary
o Autoregressive Models
o PixelCNN, PixelCNN++, PixelRNN
o WaveNet
o Time-Aligned DenseNets