UNSUPERVISED LEARNING IN COMPUTER VISION Jordan Campbell


Page 1: UNSUPERVISED LEARNING IN COMPUTER VISION · Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations. Proceedings of the 26th International

UNSUPERVISED LEARNING IN COMPUTER VISION

Jordan Campbell

Page 2:

Overview

- Convolutional deep belief networks (CDBN)
- Sparse coding

Page 3:

Natural Images

- 2-D matrix of intensity values.
- Exhibit certain statistical properties.
- Typically composed of edges, object parts and objects.
- Can we train networks to learn representations of natural images?

Page 4:

Network Representation

- Sparse coding basis functions learnt from 16x16 image patches of natural scenes.
- CDBN: first-layer bases (edges, top box) and second-layer bases (object parts, second box).

Page 5:

Sparse coding

- Mammalian V1 simple cells are localised, oriented and bandpass.
- A sparse code produces basis functions with these properties and gives a highly efficient representation of the original image (it removes redundancy).

Page 6:

Sparse coding - Goal

- The aim was to find a set of basis functions that are:
  - sparse
  - highly representative
- Natural images are highly non-Gaussian and not well described by orthogonal components, so PCA is not appropriate.

Page 7:

Sparse coding - algorithm

- The image is represented as a linear superposition of basis functions (Olshausen & Field, 1996):

  I(x, y) = Σ_i a_i φ_i(x, y)

- The goal of learning is to minimise a cost function that trades reconstruction accuracy against sparseness:

  E = Σ_{x,y} [ I(x, y) − Σ_i a_i φ_i(x, y) ]² + λ Σ_i S(a_i)

Page 8:

Sparse coding – Equations

- Accuracy: the squared reconstruction error Σ_{x,y} [ I(x, y) − Σ_i a_i φ_i(x, y) ]².
- Sparseness: a penalty λ Σ_i S(a_i), where S favours activations near zero (e.g. S(x) = log(1 + x²)).
- Learning: gradient descent on E, alternating between the activations a_i and the basis functions φ_i.
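The objective above can be sketched in a few lines of numpy. This is an illustrative toy, not the paper's code: the dimensions, the step size, and the choice S(x) = log(1 + x²) (one of the sparseness functions suggested by Olshausen & Field) are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: 16x16 patches (flattened to 256), 192 bases.
patch_dim, n_bases, n_patches = 256, 192, 64
I = rng.standard_normal((patch_dim, n_patches))      # image patches, one per column
Phi = rng.standard_normal((patch_dim, n_bases))      # basis functions
a = 0.1 * rng.standard_normal((n_bases, n_patches))  # activations
lam = 0.1                                            # sparseness weight

def cost(I, Phi, a, lam):
    """Accuracy (squared reconstruction error) plus sparseness penalty."""
    residual = I - Phi @ a
    accuracy = np.sum(residual ** 2)
    sparseness = lam * np.sum(np.log1p(a ** 2))      # S(x) = log(1 + x^2)
    return accuracy + sparseness

# One gradient-descent step on the activations, with the bases held fixed.
residual = I - Phi @ a
grad_a = -2.0 * Phi.T @ residual + lam * 2.0 * a / (1.0 + a ** 2)
a_new = a - 1e-4 * grad_a
```

In the full algorithm this inner descent on the activations runs to convergence for each batch of patches, and the basis functions are then updated by a gradient step of their own.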

Page 9:

Sparse coding - results

- 192 basis functions were trained on 16x16 image patches of natural scenes.
- The resulting functions were localised, oriented and bandpass.
- Note that these are only the main spatial properties of simple-cell receptive fields; there are others. Similarly, there are further cell types (e.g. complex cells) elsewhere in the visual hierarchy.

Page 10:

CDBN - preliminaries

- RBM:
  - Two-layer undirected bipartite graph.
  - Assume binary activations.
  - Gibbs sampling for learning and inference.
- Convolutional RBM:
  - K groups of hidden and pooling units.
  - Weight sharing (translation invariance).
- Probabilistic max-pooling:
  - Shrinks the representation of the detection layer.
- Sparsity regularisation.
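Probabilistic max-pooling can be sketched as follows. This is a minimal illustration of the idea in Lee et al. (2009), assuming non-overlapping blocks whose size divides the layer size; the function name and the random energy inputs are made up for the example. Within each block, at most one detection unit fires (chosen by a softmax over the block's energies plus an "all off" option), and the pooling unit is on iff some detection unit in its block is on.

```python
import numpy as np

rng = np.random.default_rng(1)

def prob_max_pool(energies, block=2):
    """Probabilistic max-pooling over non-overlapping block x block regions.

    Assumes the height and width of `energies` are divisible by `block`.
    Returns sampled detection-layer and pooling-layer activations.
    """
    H, W = energies.shape
    h = np.zeros((H, W))                      # detection-layer activations
    p = np.zeros((H // block, W // block))    # pooling-layer activations
    for i in range(0, H, block):
        for j in range(0, W, block):
            e = energies[i:i + block, j:j + block].ravel()
            logits = np.append(e, 0.0)        # last entry = every unit off
            probs = np.exp(logits - logits.max())
            probs /= probs.sum()
            choice = rng.choice(len(logits), p=probs)
            if choice < len(e):               # one detection unit turns on
                h[i + choice // block, j + choice % block] = 1.0
                p[i // block, j // block] = 1.0
    return h, p

h, p = prob_max_pool(rng.standard_normal((4, 4)))
```

The "at most one unit per block" constraint is what lets the pooling layer shrink the detection layer while keeping the model a well-defined probabilistic one.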

Page 11:

CDBN - Architecture

Page 12:

CDBN - Training

Page 13:

CDBN – Training

- Contrastive divergence (Hinton, 2002):
  - Calculate the hidden activations given the input.
  - Calculate the visible state given the hidden.
  - Calculate the hidden activations again. This gives us <v h>(1).
  - It would be better to have <v h>(infinity) (i.e. to maximise the log probability of the training data given the model parameters), but this is hard because the partition function of such a complex model cannot be computed (in a product-of-experts model the integration becomes intractable).
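The three steps above amount to the classic CD-1 update for a binary RBM, sketched here in numpy. This is an illustrative toy (biases omitted, shapes and learning rate made up), not the convolutional variant used in the CDBN.

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, v0, lr=0.01):
    """One CD-1 weight update for a binary RBM (biases omitted for brevity).

    <v h>(1) from a single Gibbs step stands in for the intractable
    <v h>(infinity) term of the log-likelihood gradient.
    """
    # Positive phase: hidden activations given the data.
    ph0 = sigmoid(v0 @ W)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: reconstruct the visible state, then the hidden again.
    pv1 = sigmoid(h0 @ W.T)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W)
    # Update: <v h>(0) - <v h>(1), averaged over the batch.
    batch = v0.shape[0]
    return W + lr * (v0.T @ ph0 - v1.T @ ph1) / batch

W = 0.01 * rng.standard_normal((6, 4))         # 6 visible, 4 hidden units
v0 = (rng.random((8, 6)) < 0.5).astype(float)  # batch of 8 binary vectors
W = cd1_update(W, v0)
```

Running more Gibbs steps before taking the statistics (CD-k) moves the approximation closer to <v h>(infinity) at proportionally higher cost.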

Page 14:

CDBN – Hierarchical Inference

- Once the parameters (weights) have been learned, we can ask the network what it thinks about a given image.
- We present the image to the visible layer and compute the network's representation by sampling from the joint distribution over all the hidden layers, using block Gibbs sampling.

Page 15:

CDBN – Results

- Tested a two-layer network on the Caltech-101 object classification task.
- Achieved 65% test accuracy, which was comparable to state-of-the-art results.
- Achieved 0.8% test error on MNIST digit classification.
- Hierarchical inference could be performed on occluded objects when the hidden layers could share information.

Page 16:

Speed!

- Both algorithms require either extensive sampling (to arrive at distributions, in the CDBN) or complex optimisations.
- The CDBN can take weeks to train.
- Sparse coding networks have been shown to exhibit more realistic behaviour (e.g. end-stopping) when the networks are much larger.

Page 17:

NVIDIA GPU

Page 18:

GPU - Architecture

Page 19:

Issues with parallelisation

- CDBNs and sparse coding algorithms rely on iterative, stochastic parameter updates that depend on previous updates (i.e. weight updates require a sample from the whole representation).
- Memory transfers between RAM and the GPU's global memory are slow.

Page 20:

Algorithms

- For sparse networks, parallelisation is achieved by noting that the objective is not convex in both variables jointly, but is convex in either variable while the other is held fixed.
  - The two variables are the basis functions and the activations.
  - The algorithm optimises each activation value in parallel.
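This alternating scheme can be sketched as below. It is a toy stand-in for the paper's solver (shapes, the L1 penalty, and the soft-threshold step are assumptions): with the bases fixed, a proximal-gradient step updates every activation entry in parallel; with the activations fixed, the bases are fit by (lightly ridge-regularised, for numerical safety) least squares. Each subproblem is convex even though the joint problem is not.

```python
import numpy as np

rng = np.random.default_rng(3)

X = rng.standard_normal((64, 32))   # data, one example per column
B = rng.standard_normal((64, 16))   # basis functions
A = np.zeros((16, 32))              # activations
lam = 0.1                           # L1 sparsity weight

def objective(X, B, A, lam):
    return 0.5 * np.sum((X - B @ A) ** 2) + lam * np.sum(np.abs(A))

obj0 = objective(X, B, A, lam)
for _ in range(20):
    # Bases fixed: one soft-threshold (proximal-gradient) step on the
    # activations; every entry of A is updated in parallel.
    step = 1.0 / np.linalg.norm(B.T @ B, 2)        # 1 / Lipschitz constant
    G = A - step * (B.T @ (B @ A - X))
    A = np.sign(G) * np.maximum(np.abs(G) - step * lam, 0.0)
    # Activations fixed: ridge-regularised least squares on the bases.
    B = X @ A.T @ np.linalg.inv(A @ A.T + 1e-6 * np.eye(A.shape[0]))
```

On a GPU, the activation step is attractive precisely because it is a handful of dense matrix products followed by an elementwise threshold.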

Page 21:

Results

- RBM with 1 million free parameters and 4096x11008 visible x hidden units: 130x speedup.
- Sparse coding with 5000 examples and 1024 activations: 15x speedup.

Page 22:

References

- Hinton, G.E. (2002). Training Products of Experts by Minimizing Contrastive Divergence. Neural Computation, 14:1771-1800.

- Hinton, G.E., Osindero, S. & Teh, Y. (2006). A Fast Learning Algorithm for Deep Belief Nets. Neural Computation, 18:1527-1554.

- LeCun, Y., Kavukcuoglu, K. & Farabet, C. (2010). Convolutional Networks and Applications in Vision. Proceedings of the 2010 IEEE International Symposium on Circuits and Systems (ISCAS).

- Lee, H., Grosse, R., Ranganath, R. & Ng, A.Y. (2009). Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations. Proceedings of the 26th International Conference on Machine Learning, Montreal, Canada.

- Olshausen, B.A. & Field, D.J. (1996). Emergence of Simple-Cell Receptive Field Properties by Learning a Sparse Code for Natural Images. Nature, 381:607-609.

- Raina, R., Madhavan, A. & Ng, A.Y. (2009). Large-Scale Deep Unsupervised Learning Using Graphics Processors. Proceedings of the 26th International Conference on Machine Learning, Montreal, Canada.

- Salakhutdinov, R. & Murray, I. (2008). On the Quantitative Analysis of Deep Belief Nets. Proceedings of the 25th International Conference on Machine Learning, Helsinki, Finland.