unsupervised computer vision: the current state of the art
![Page 1: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/1.jpg)
Unsupervised Computer Vision: The Current State of the Art
Stitch Fix, Styling Algorithms Research Talk
TJ Torres, Data Scientist, Stitch Fix
![Page 2: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/2.jpg)
WHY DEEP LEARNING?
Before DL, much of computer vision was focused on feature descriptors and image stats.
SURF, MSER, Corner
Image Credit: http://www.mathworks.com/products/computer-vision/features.html
![Page 3: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/3.jpg)
WHY DEEP LEARNING?
Turns out NNs are great feature extractors.
http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
![Page 4: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/4.jpg)
WHY DEEP LEARNING?
Turns out NNs are great feature extractors.
Leaderboard

| Team name | Entry description | Classification error | Localization error |
| --- | --- | --- | --- |
| GoogLeNet | No localization. Top5 val score is 6.66% error. | 0.06656 | 0.606257 |
| VGG | a combination of multiple ConvNets, including a net trained on images of different size (fusion weights learnt on the validation set); detected boxes were not updated | 0.07325 | 0.256167 |
| VGG | a combination of multiple ConvNets, including a net trained on images of different size (fusion done by averaging); detected boxes were not updated | 0.07337 | 0.255431 |
| VGG | a combination of multiple ConvNets (by averaging) | 0.07405 | 0.253231 |
| VGG | a combination of multiple ConvNets (fusion weights learnt on the validation set) | 0.07407 | 0.253501 |
![Page 5: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/5.jpg)
WHY DEEP LEARNING?
Convolution gives a local, translation-invariant feature hierarchy.
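A minimal sketch of the translation property, in plain Python with a hypothetical toy image and kernel (neither is from the talk). Strictly, a convolution layer is translation equivariant: shifting the input shifts the feature map by the same amount, and pooling over that map is what then gives invariance. The function below computes the cross-correlation form used in most deep learning libraries.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists), no padding, stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def shift_right(image, n):
    """Shift every row right by n pixels, padding with zeros on the left."""
    return [[0] * n + row[:len(row) - n] for row in image]


# Hypothetical toy input: a 5x6 image with one vertical bar in column 1.
image = [[1 if c == 1 else 0 for c in range(6)] for _ in range(5)]
# Horizontal-difference kernel: responds to vertical edges.
kernel = [[-1, 1]]

response = conv2d_valid(image, kernel)
shifted_response = conv2d_valid(shift_right(image, 2), kernel)
```

Shifting the bar two pixels to the right moves the edge response two pixels to the right and leaves its values unchanged, so the same local feature is detected wherever it appears.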
![Page 6: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/6.jpg)
WHY DEEP LEARNING?
Image Credit: http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution
![Page 7: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/7.jpg)
WHY DEEP LEARNING?
Edges Curves Top of 3 shapes
Softmax Output: Classification
Image Credit: http://parse.ele.tue.nl/education/cluster2
![Page 8: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/8.jpg)
WHY DEEP LEARNING?
Image Credit: http://blog.keras.io/how-convolutional-neural-networks-see-the-world.html
![Page 9: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/9.jpg)
WHY DEEP LEARNING?
Image Credit: http://blog.keras.io/how-convolutional-neural-networks-see-the-world.html
![Page 10: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/10.jpg)
WHY DEEP LEARNING?
Image Credit: http://blog.keras.io/how-convolutional-neural-networks-see-the-world.html
![Page 11: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/11.jpg)
WHY DEEP LEARNING?
Image Credit: http://blog.keras.io/how-convolutional-neural-networks-see-the-world.html
![Page 12: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/12.jpg)
WHY DEEP LEARNING?
Image Credit: http://blog.keras.io/how-convolutional-neural-networks-see-the-world.html
![Page 13: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/13.jpg)
LEARN MORE
http://cs231n.github.io/convolutional-networks/
![Page 14: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/14.jpg)
WHY UNSUPERVISED?
Unfortunately very few image sets come with labels.
![Page 15: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/15.jpg)
What are the best labels for fashion/style?
![Page 16: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/16.jpg)
THE UNSUPERVISED M.O.
Try to learn an embedding space of the image data (this generally includes a generative process).
![Page 17: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/17.jpg)
1) Train encoder and decoder to encode then reconstruct image.
![Page 18: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/18.jpg)
2) Generate image from random embedding and reinforce “good” looking images.
![Page 19: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/19.jpg)
DOWNSIDES
Higher-dimensional embeddings are not interpretable.
Latent distributions may contain gaps. No sensible continuum.
![Page 20: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/20.jpg)
OUTLINE
1. Variational Auto-encoders (VAE)
2. Generative Adversarial Networks (GAN)
3. The combination of the two (VAE/GAN)
4. Generative Moment Matching Networks (GMMN)
5. Adversarial Auto-encoders (AAE?)
Briefly
![Page 21: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/21.jpg)
stitchfix/fauxtograph
![Page 22: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/22.jpg)
VARIATIONAL AUTO-ENCODERS
![Page 23: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/23.jpg)
ENCODING
input
Convolution
![Page 24: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/24.jpg)
![Page 25: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/25.jpg)
latent
![Page 26: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/26.jpg)
VARIATIONAL STEP
sample from distribution
$\mu$, $\sigma$
$$q_\phi(z) = \mathcal{N}\left(z;\ \mu^{(i)},\ \sigma^{2(i)} I\right)$$
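The slide does not say how the sample is drawn. One common choice, assumed here, is the reparameterization trick: draw standard normal noise and scale and shift it by the encoder outputs. The sketch below is illustrative only; the function name and the log-variance parameterization are assumptions, not taken from the talk.

```python
import math
import random


def sample_latent(mu, log_var, rng=random):
    """Draw z ~ N(mu, diag(sigma^2)) as z = mu + sigma * eps, eps ~ N(0, I).

    `mu` and `log_var` are per-dimension lists; log_var = log(sigma^2)
    is an assumed parameterization that keeps sigma positive.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]
```

Writing the sample this way keeps the randomness in `eps`, so gradients can flow through `mu` and `log_var` when an automatic differentiation framework is used.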
![Page 27: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/27.jpg)
VARIATIONAL STEP
sampled
Deconvolution
![Page 28: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/28.jpg)
DECODING
output
Deconvolution
![Page 29: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/29.jpg)
reconstruction
![Page 30: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/30.jpg)
CALCULATE LOSS
$$L(x) = D_{KL}\left(q_\phi(z)\,\|\,\mathcal{N}(0, I)\right) + \mathrm{MSE}(x, y_{\text{out}})$$
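A sketch of how this loss could be evaluated for one example, assuming a diagonal Gaussian encoder parameterized by a mean and a log-variance per latent dimension (an assumption; the slide gives only the general form). For that case the KL term has the well-known closed form $\frac{1}{2}\sum_k (\mu_k^2 + \sigma_k^2 - \log\sigma_k^2 - 1)$. All function names are illustrative.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, diag(sigma^2)) || N(0, I)), log_var = log(sigma^2)."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


def mse(x, y):
    """Mean squared error between two equal-length flat pixel lists."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def vae_loss(x, y_out, mu, log_var):
    """KL regularizer plus pixel-wise reconstruction error, as on the slide."""
    return kl_to_standard_normal(mu, log_var) + mse(x, y_out)
```

The KL term is zero only when the encoder outputs exactly the prior (zero mean, unit variance), so it pulls codes toward $\mathcal{N}(0, I)$ while the MSE term pulls the reconstruction toward the input.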
![Page 31: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/31.jpg)
UPDATE WEIGHTS
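$$W^{(l)*}_{ij} = W^{(l)}_{ij}\left(1 - \alpha\,\frac{\partial L}{\partial W_{ij}}\right)$$
$$\frac{\partial L}{\partial W^{(l)}_{ij}} = \left(\frac{\partial L}{\partial x_{\text{out}}}\right)\left(\frac{\partial x_{\text{out}}}{\partial f^{(n-1)}}\right)\cdots\left(\frac{\partial f^{(l)}}{\partial W^{(l)}_{ij}}\right)$$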
![Page 32: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/32.jpg)
OUTPUT
Because of the pixel-wise MSE loss, non-centered features are disproportionately penalized.
source: @genekogan
![Page 33: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/33.jpg)
Note the blurring of the hair.
![Page 34: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/34.jpg)
GENERATIVE ADVERSARIAL NETWORKS
![Page 35: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/35.jpg)
GAN STRUCTURE
Latent Random Vector
Generator Discriminator
![Page 36: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/36.jpg)
Filtered
![Page 37: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/37.jpg)
Image
![Page 38: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/38.jpg)
Gen/Train Image
![Page 39: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/39.jpg)
Filtered
![Page 40: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/40.jpg)
Yes/No
![Page 41: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/41.jpg)
Discriminator
TRAINING
Generator
Generator and Discriminator play a minimax game.
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]$$
![Page 42: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/42.jpg)
Generator: lower loss for fooling the Discriminator.
Discriminator: lower loss for correctly identifying training vs. generated data.
![Page 43: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/43.jpg)
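$$L_D = \frac{1}{m}\sum_{i=1}^{m}\left[\log D\left(x^{(i)}\right) + \log\left(1 - D\left(G\left(z^{(i)}\right)\right)\right)\right]$$
$$L_G = \frac{1}{m}\sum_{i=1}^{m}\log\left(1 - D\left(G\left(z^{(i)}\right)\right)\right)$$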
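The two minibatch quantities can be sketched directly from discriminator outputs. This is an illustrative sketch, not the talk's implementation: the inputs are assumed to be probabilities in (0, 1) that the discriminator assigns to "real", and the function names are hypothetical. Following the minimax game, the discriminator pushes the first quantity up while the generator pushes the second one down.

```python
import math


def discriminator_objective(d_real, d_fake):
    """Mean over the batch of log D(x) + log(1 - D(G(z))).

    d_real[i] = D(x_i) on training images, d_fake[i] = D(G(z_i)).
    """
    m = len(d_real)
    return sum(math.log(r) + math.log(1.0 - f)
               for r, f in zip(d_real, d_fake)) / m


def generator_objective(d_fake):
    """Mean over the batch of log(1 - D(G(z))); lower when D is fooled."""
    return sum(math.log(1.0 - f) for f in d_fake) / len(d_fake)
```

For example, when the discriminator scores generated images at 0.9 the generator term is log(0.1), well below the log(0.9) it gets when generated images are scored at 0.1.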
![Page 44: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/44.jpg)
http://arxiv.org/pdf/1406.2661v1.pdf
![Page 48: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/48.jpg)
OUTPUT
http://arxiv.org/pdf/1511.06434v2.pdf
Unfortunately, GANs are only generative: there is no encoder to map an image back to the latent space.
![Page 49: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/49.jpg)
VAE+GAN
![Page 50: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/50.jpg)
VAE+GAN STRUCTURE
Encoder, Generator, Discriminator
O
![Page 51: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/51.jpg)
O S
![Page 52: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/52.jpg)
O S O
G(S)
G(E(O))
![Page 53: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/53.jpg)
![Page 54: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/54.jpg)
Yes/ No
MSE
![Page 55: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/55.jpg)
Discriminator
TRAINING
Generator
Train Encoder, Generator and Discriminator with separate optimizers.
Encoder
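$$L_E = D_{KL}\left(q_\phi(z)\,\|\,\mathcal{N}(0, I)\right) + \mathrm{MSE}\left(D_l(x), D_l(G(E(x)))\right)$$
$$L_G = \gamma \times \mathrm{MSE}\left(D_l(x), D_l(G(E(x)))\right) - L_{GAN}$$
$$L_D = L_{GAN} = \left\|\log(D(x)) + \log(1 - D(G(E(x)))) + \log(1 - D(G(z)))\right\|_1$$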
![Page 56: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/56.jpg)
Terms of $L_E$: VAE prior, learned similarity.
Terms of $L_G$: learned similarity, GAN.
$L_D$: GAN discriminator loss.
![Page 60: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/60.jpg)
TAKEAWAY
http://arxiv.org/pdf/1512.09300v1.pdf
We are trying to get away from pixels to begin with, so why use pixel distance as the metric?
![Page 61: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/61.jpg)
Learned similarity metric provides feature-level distance rather than pixel-level.
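A toy illustration of feature-level versus pixel-level distance, not the VAE/GAN method itself. In VAE/GAN the features come from a learned discriminator layer $D_l$; here `toy_features` is a hypothetical hand-written stand-in (max-pooling over a 1-D signal), chosen only so the example runs without a trained network.

```python
def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def toy_features(signal, width=2):
    """Hypothetical stand-in for D_l: max-pool in non-overlapping windows."""
    return [max(signal[i:i + width]) for i in range(0, len(signal), width)]


# The same bump, moved by one pixel.
x = [0, 0, 1, 0, 0, 0]
x_shifted = [0, 0, 0, 1, 0, 0]

pixel_distance = mse(x, x_shifted)
feature_distance = mse(toy_features(x), toy_features(x_shifted))
```

The pixel distance is nonzero because the bump no longer lines up pixel for pixel, while the feature distance is zero because both signals contain the same bump in the same pooled window. This is the kind of small misalignment that a pixel-wise MSE penalizes and a feature-level metric can tolerate.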
![Page 62: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/62.jpg)
Latent space of a GAN with the encoder of a VAE
![Page 63: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/63.jpg)
…BUT NOT THAT EASY TO TRAIN
![Page 64: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/64.jpg)
GENERATIVE MOMENT MATCHING NETWORKS
![Page 65: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/65.jpg)
DESCRIPTION
Use Maximum Mean Discrepancy between generated data and test data for loss.
Train generative network to output distribution with moments matching dataset.
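$$L_{MMD^2} = \left\|\frac{1}{N}\sum_{i=1}^{N}\phi(x_i) - \frac{1}{M}\sum_{j=1}^{M}\phi(y_j)\right\|^2$$
$$L_{MMD^2} = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{i'=1}^{N}k(x_i, x_{i'}) - \frac{2}{MN}\sum_{i=1}^{N}\sum_{j=1}^{M}k(x_i, y_j) + \frac{1}{M^2}\sum_{j=1}^{M}\sum_{j'=1}^{M}k(y_j, y_{j'})$$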
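The kernel form of the loss can be computed without ever building the feature map $\phi$. The sketch below assumes a Gaussian kernel with a hand-picked bandwidth; the kernel choice, the bandwidth, and the function names are assumptions for illustration and are not specified on the slide.

```python
import math


def gaussian_kernel(a, b, bandwidth=1.0):
    """k(a, b) = exp(-||a - b||^2 / (2 * bandwidth^2)) for equal-length tuples."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2(xs, ys, kernel=gaussian_kernel):
    """Squared MMD estimate between samples xs (size N) and ys (size M)."""
    n, m = len(xs), len(ys)
    k_xx = sum(kernel(a, b) for a in xs for b in xs) / (n * n)
    k_xy = sum(kernel(a, b) for a in xs for b in ys) / (n * m)
    k_yy = sum(kernel(a, b) for a in ys for b in ys) / (m * m)
    return k_xx - 2.0 * k_xy + k_yy
```

The estimate is zero when the two sample sets are identical and grows as they move apart, which is what lets a generator be trained by minimizing it against a batch of real data.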
![Page 66: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/66.jpg)
DESCRIPTION
![Page 68: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/68.jpg)
ADVERSARIAL AUTO-ENCODERS
![Page 69: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/69.jpg)
DESCRIPTION
Want to create an auto-encoder whose “code space” has a distribution matching an arbitrary specified prior.
Like a VAE, but instead of using the Gaussian KL divergence, use an adversarial procedure to match the coding distribution to the prior.
![Page 70: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/70.jpg)
Train encoder/decoder with reconstruction metrics.
Additionally: sample from the encoding space and train the encoder to produce samples indistinguishable from the specified prior.
![Page 71: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/71.jpg)
DESCRIPTION
![Page 72: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/72.jpg)
DESCRIPTION
GAN/Regularization
![Page 73: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/73.jpg)
AE/Reconstruction
![Page 74: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/74.jpg)
SEMI-SUPERVISED
Regularize encoding space
Disentangle encoding space
![Page 75: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/75.jpg)
SEMI-SUPERVISED
10 2D Gaussians
Swiss roll
http://arxiv.org/pdf/1511.05644v1.pdf
![Page 77: Unsupervised Computer Vision: The Current State of the Art](https://reader037.vdocument.in/reader037/viewer/2022110109/5878d1d31a28ab917a8b5357/html5/thumbnails/77.jpg)