
Learning Representations
CS331B: Representation Learning in Computer Vision

Amir R. Zamir, Silvio Savarese

(Class logistics)
● Paper assignments out
● Next week's papers

○ Discriminative Learning of Deep Convolutional Feature Point Descriptors, Simo-Serra, E., Trulls, E., Ferraz, L., Kokkinos, I., Fua, P., & Moreno-Noguer, F., ICCV 2015.

○ Data-Driven 3D Voxel Patterns for Object Category Recognition, Yu Xiang, Wongun Choi, Yuanqing Lin, and Silvio Savarese, CVPR 2015.

○ Convolutional-Recursive Deep Learning for 3D Object Classification, Socher, R., Huval, B., Bath, B., Manning, C. D., & Ng, A. Y., NIPS 2012.


What we talked about so far...


Things... Our Knowledge...


“Transcript”

Cat

Macbeth was guilty.

[ 81 20 84 64 58 39 17 54 72 15]

Representation → Mathematical Model (e.g., classifier)


Some basic concepts related to representations
● Ill-posedness
● Readout non-linearity
● Dimensionality
● Computational complexity
● Encoding power (i.e., performance)
● Narrowness of application domain (vertical vs. horizontal representations)

Ill-posedness

C. F. Bohren, D. R. Huffman, 1983.

Linearity

With respect to: {modeling parameters (decision), independent variables (representation)}

Decision boundary:                line      circle
Independent var. (x, y):          linear    non-linear
Modeling param. (a, b, c, r):     linear    linear

Not discussing kernels, reparametrization, etc.
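As a toy illustration of the table above (mine, not from the slides): a circle written as x² + y² + a·x + b·y + c = 0 is non-linear in the independent variables (x, y) but linear in the parameters (a, b, c), so it can be fit with ordinary least squares. All data and values here are illustrative.

```python
import numpy as np

# A circular decision boundary x^2 + y^2 + a*x + b*y + c = 0 is non-linear in
# the independent variables (x, y) but linear in the parameters (a, b, c),
# so plain least squares can fit it. Illustrative sketch, not from the slides.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, 100)
x = 2.0 + 3.0 * np.cos(theta)   # points on a circle centered at (2, 1)...
y = 1.0 + 3.0 * np.sin(theta)   # ...with radius 3

# Rearranged as a linear system in (a, b, c): a*x + b*y + c = -(x^2 + y^2)
A = np.column_stack([x, y, np.ones_like(x)])
rhs = -(x**2 + y**2)
a, b, c = np.linalg.lstsq(A, rhs, rcond=None)[0]

center = (-a / 2.0, -b / 2.0)
radius = np.sqrt(center[0]**2 + center[1]**2 - c)
print(center, radius)   # approximately (2.0, 1.0) and 3.0
```

Written instead as (x − a)² + (y − b)² = r², the same boundary is non-linear in its parameters; which form you choose decides whether the fitting problem is linear.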

Handcrafting...


Represent these cats for a cat detector!

Represent these cats for a cat detector! (II)

Represent these cats for a cat detector! (III)

Represent these cats for a cat detector! (IV)

● Color Histograms
● Deformable Part-based Models (DPM)
● Histogram of Oriented Gradients (HOG)
● Shape-based models

Felzenszwalb et al., 2010. Dalal and Triggs, 2005. Beis and Lowe, 1997.
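The simplest of these handcrafted representations, a color histogram, can be sketched in a few lines. This is a generic sketch: the image, bin count, and normalization are illustrative choices, not from the slides.

```python
import numpy as np

# Minimal color-histogram representation: bin each RGB channel separately and
# concatenate, giving a fixed-length descriptor regardless of image size.
def color_histogram(image, bins=8):
    """image: H x W x 3 uint8 array -> length 3*bins descriptor, L1-normalized."""
    hist = np.concatenate([
        np.histogram(image[..., ch], bins=bins, range=(0, 256))[0]
        for ch in range(3)
    ]).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)  # toy image
desc = color_histogram(img)
print(desc.shape)   # (24,)
```

The fixed length is what makes such descriptors easy to feed to a classifier; the cost, as the cat examples suggest, is that very different images can share similar color statistics.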

Handcrafting...
● Was the only way for a long time.
● (Almost) worked for many great applications:
  ○ Image retrieval, structure-from-motion, face detection, identification, etc.
● Why alternatives?

Handcrafting...
● Was the only way for a long time.
● (Almost) worked for many great applications:
  ○ Image retrieval, structure-from-motion, face detection, identification, etc.
● Why alternatives?
  ○ Can't quite find the discriminative signature for a problem.

Why a sad image?

Handcrafting...
● Was the only way for a long time.
● (Almost) worked for many great applications:
  ○ Image retrieval, structure-from-motion, face detection, identification, etc.
● Why alternatives?
  ○ Can't quite find the discriminative signature for a problem.
  ○ Discriminative signature can be found, but hard to approach programmatically.

Track the fish.

Handcrafting...
● Was the only way for a long time.
● (Almost) worked for many great applications:
  ○ Image retrieval, structure-from-motion, face detection, identification, etc.
● Why alternatives?
  ○ Can't quite find the discriminative signature for a problem.
  ○ Discriminative signature can be found, but hard to approach programmatically.
  ○ Too many contributing factors to the problem.
    ■ Fusion is non-trivial; rule-based fusion is ruled out.
    ■ Fusion of contributing factors is itself a comparably complex representation problem.

Why a sad image?

Learning Representations


Two approaches to representation learning
● Unsupervised
  ○ Representation constrained on reconstruction.
● Supervised
  ○ Representation constrained on task(s).

Two approaches to representation learning
● Supervised
  ○ Representation constrained on task(s).
● Unsupervised
  ○ Representation constrained on reconstruction.

LeCun et al. 1998. Hinton et al. 2006.

Pros and Cons
● Supervised
  ○ Representation constrained on task(s).
● Unsupervised
  ○ Representation constrained on reconstruction.

LeCun et al. 1998. Hinton et al. 2006.

Unsupervised Representation Learning


Unsupervised representation learning
● Unsupervised
  ○ Representation constrained on reconstruction.
● Clarification:
  ○ Unsupervised does not necessarily mean generic, and vice versa.
  ○ Many of the generic representations are actually trained discriminatively.
  ○ Lecture 12

Hinton et al. 2006.

Unsupervised representation learning
● Unsupervised
  ○ Representation constrained on reconstruction.
● Usually a reconstruction loss:
  ○ L2 pixel loss

Hinton et al. 2006.

Unsupervised representation learning
● Unsupervised
  ○ Representation constrained on reconstruction.
● Usually a reconstruction loss:
  ○ L2 pixel loss
  ○ Perceptual loss

(Figure: super-resolution results, Ground Truth vs. Bicubic vs. Perceptual.)

Johnson et al. 2016. Hinton et al. 2006.
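A minimal sketch of these two losses, assuming images are float arrays in [0, 1]. The random linear map standing in for a pretrained feature network is purely illustrative, not the actual network used by Johnson et al. for the perceptual loss.

```python
import numpy as np

# L2 pixel loss compares reconstructions in pixel space; a perceptual loss
# compares them in the feature space of a (here: fake, random) extractor.
def l2_pixel_loss(x, x_hat):
    return np.mean((x - x_hat) ** 2)

def perceptual_loss(x, x_hat, feats):
    return np.mean((feats(x) - feats(x_hat)) ** 2)

rng = np.random.default_rng(0)
x = rng.random((32, 32))                            # "ground truth" image
x_hat = x + 0.05 * rng.standard_normal((32, 32))    # noisy reconstruction

W = rng.standard_normal((64, 32 * 32)) / 32.0       # stand-in feature extractor
feats = lambda im: W @ im.ravel()

print(l2_pixel_loss(x, x_hat), perceptual_loss(x, x_hat, feats))
```

The structure is identical; only the space in which reconstructions are compared changes, which is what makes perceptual losses less sensitive to per-pixel misalignment.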

Unsupervised representation learning methods
● Sparse Coding
● Basic Autoencoders
● Restricted Boltzmann Machine
● Generative Adversarial Network based feature learning
● ...


Sparse Coding

Lee et al. 2006. Salakhutdinov, 2016.

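A minimal sparse-coding sketch (mine, not from Lee et al.): given a fixed dictionary D, infer a sparse code z minimizing 0.5·||x − Dz||² + λ·||z||₁ with ISTA (iterative soft-thresholding). Full sparse coding alternates this inference with dictionary updates; the sizes, λ, and iteration count here are illustrative.

```python
import numpy as np

# ISTA: gradient step on the reconstruction term, then soft-threshold to
# enforce the L1 sparsity penalty. D is a fixed random dictionary here.
def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(x, D, lam=0.1, n_iter=200):
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ z - x)
        z = soft_threshold(z - grad / L, lam / L)
    return z

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)             # unit-norm atoms
z_true = np.zeros(50)
z_true[[3, 17, 41]] = [1.5, -2.0, 1.0]     # a 3-sparse ground-truth code
x = D @ z_true

z = ista(x, D)
print(np.sort(np.argsort(np.abs(z))[-3:]))  # largest coefficients: true atoms
```

The soft-threshold step is what produces exact zeros, which is the defining property of the sparse code.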

Autoencoder

Hinton et al. 2006.


Autoencoder: reconstruction results

(Figure: Input vs. PCA vs. AE reconstructions.)

Hinton et al. 2006.
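A toy linear autoencoder in the same spirit (a stand-in sketch, far smaller than Hinton et al.'s deep models): encode to a low-dimensional code, decode, and descend the gradient of the L2 reconstruction loss. The data, sizes, and learning rate are illustrative.

```python
import numpy as np

# Linear autoencoder: 10-D data near a 2-D subspace, compressed to a 2-D code.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 2)) @ rng.standard_normal((2, 10))  # rank-2 data

W_enc = 0.1 * rng.standard_normal((10, 2))   # encoder: 10 -> 2 code
W_dec = 0.1 * rng.standard_normal((2, 10))   # decoder: 2 -> 10
lr = 0.01

def loss():
    return np.mean((X - (X @ W_enc) @ W_dec) ** 2)

loss_start = loss()
for _ in range(500):
    H = X @ W_enc                    # encode
    err = H @ W_dec - X              # decode and compare to input
    W_dec -= lr * H.T @ err / len(X)             # gradient of the L2 loss
    W_enc -= lr * X.T @ (err @ W_dec.T) / len(X)

print(loss_start, loss())            # reconstruction error decreases
```

With a linear encoder/decoder the optimum spans the same subspace PCA finds, which is why the slide compares AE against PCA reconstructions; non-linearities are what let deep autoencoders beat PCA.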

Supervised Models

Neural Networks

(Figures from Stanford CS231n: a neuron, a neural net, and a convolutional network.)

Low-level matching

Zagoruyko & Komodakis, 2015.

● Learned counterpart of SIFT/HOG-style handcrafted local features

The Brown, Hua, and Winder local feature learning dataset


Low-level matching

Zagoruyko & Komodakis, 2015.

● Learned metric vs. L2 metric
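The L2 baseline in that comparison can be sketched directly: flatten two grayscale patches and take the Euclidean distance; the learned metric replaces this with a network-predicted similarity. The patches here are synthetic stand-ins, not dataset patches.

```python
import numpy as np

# L2 baseline for patch matching: Euclidean distance between flattened patches.
def l2_distance(p, q):
    return np.linalg.norm(p.astype(float).ravel() - q.astype(float).ravel())

rng = np.random.default_rng(0)
patch = rng.random((32, 32))
match = patch + 0.01 * rng.standard_normal((32, 32))   # same point, slight noise
non_match = rng.random((32, 32))                       # unrelated patch

d_match = l2_distance(patch, match)
d_non_match = l2_distance(patch, non_match)
print(d_match < d_non_match)   # True: L2 separates this easy pair
```

L2 handles such easy pairs; the learned metric's advantage shows up under the viewpoint and lighting changes present in real correspondence data.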

Low-level matching: Some qualitative results

Zagoruyko & Komodakis, 2015.

Low-level matching: quantitative evaluation

Zagoruyko & Komodakis, 2015.

● 1st-layer filters
● Notice similarity to Gabor and sparse coding filters

Dimensionality

Han et al. 2015.

Low-level matching: handcrafted vs. learned features

Zamir et al. 2016.