Convolutional Neural Networks: An Introduction
TRANSCRIPT
Outline
● Crash Course on Neural Networks
○ How they’re constructed
○ How they’re trained
○ Backpropagation Algorithm
● Convolutional Neural Networks
○ How they differ from ordinary NNs
○ Why use them?
○ How they're composed
○ Libraries to implement them yourself
Neuron: The Basis of NNs
● Model defined by a weight vector w and a bias b
● Takes a set of binary inputs and real-valued weights
● Produces a binary output
● Weights represent the importance of the inputs
Neuron Example
● Music festival: should you go?
● Depends on:
○ Is the weather nice that weekend?
○ Are the tickets cheap?
○ Are any friends going?
● W = [0.25, 0.25, 0.60]
● X = [1, 1, 0]
● B = -0.3
● W⋅X+B > 0 ⇨ Go to festival
[Diagram: a single neuron with inputs Sunny = 1, Cheap = 1, Friends = 0, weights 0.25, 0.25, 0.60, bias -0.3, and output Attend = 1]
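In Python, the whole decision fits in a few lines (a sketch; the function name should_attend is just illustrative, not from the slides):

    # Perceptron decision rule: attend if W.X + B > 0.
    def should_attend(x, w, b):
        weighted_sum = sum(wi * xi for wi, xi in zip(w, x))
        return 1 if weighted_sum + b > 0 else 0

    w = [0.25, 0.25, 0.60]   # weights for: sunny, cheap, friends
    x = [1, 1, 0]            # sunny = 1, cheap = 1, friends = 0
    b = -0.3
    print(should_attend(x, w, b))   # 0.25 + 0.25 + 0.0 - 0.3 = 0.2 > 0 -> 1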
Expanding: Neural Networks
● Neurons
○ Make decisions based on weighted evidence
● How do we represent more complex decision making?
○ Construct a network of neurons, aka a neural network!
● Model of sophisticated decision making
Training Neural Networks
● Use a training set to adjust the weights and biases
○ Problem: small changes to a weight/bias can lead to large output changes
● Solution: sigmoid neurons
○ Take real values as inputs and output a real value between 0 and 1 (the activation level)
○ A small weight/bias change means a small output change
● Minimize a cost function
○ Using gradient descent, enabled by the backpropagation algorithm
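A minimal sketch of a sigmoid neuron (hypothetical, not code from the talk), reusing the festival example to show that a small weight change gives a small output change:

    import math

    # Sigmoid activation: squashes any real input into (0, 1).
    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def neuron_output(x, w, b):
        return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

    w, x, b = [0.25, 0.25, 0.60], [1, 1, 0], -0.3
    print(neuron_output(x, w, b))   # ~0.550
    w[0] += 0.01                    # nudge one weight slightly...
    print(neuron_output(x, w, b))   # ~0.552: the output moves only slightly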
Backpropagation Algorithm
● Introduced in the 1970s and popularized by Hinton et al. in 1986
● Workhorse of learning in neural networks
● Calculates the partial derivatives of the cost function with respect to our weights and biases
○ Allowing the use of gradient descent
● Pretty complicated
○ Can be treated as a black box
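To make the idea concrete, here is a toy sketch (hypothetical, assuming a single sigmoid neuron and a quadratic cost) where the partial derivatives are estimated numerically; backpropagation computes the same dC/dw and dC/db analytically and far more efficiently:

    import math

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def cost(w, b, x, y):
        # Quadratic cost for a single training example.
        a = sigmoid(w * x + b)
        return 0.5 * (a - y) ** 2

    x, y = 1.0, 0.0             # one training example: input 1.0, target 0.0
    w, b, lr, eps = 0.6, 0.9, 0.5, 1e-6
    print(cost(w, b, x, y))     # starting cost

    for step in range(100):
        # Finite-difference estimates of dC/dw and dC/db.
        dw = (cost(w + eps, b, x, y) - cost(w - eps, b, x, y)) / (2 * eps)
        db = (cost(w, b + eps, x, y) - cost(w, b - eps, x, y)) / (2 * eps)
        w -= lr * dw            # gradient descent: step downhill
        b -= lr * db

    print(cost(w, b, x, y))     # much smaller after 100 steps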
This is getting convoluted...
● Essentially, neural networks are connected neurons
○ Neurons are defined by a weight vector and bias
● Neural Networks consist of a bunch of weights and biases forming
multiple layers
● These weights and biases are learned through gradient descent enabled
by the backpropagation algorithm
Why Convolutional Neural Networks?
● A plain NN is not always the best representation, e.g. for images
○ Adjacent pixels are connected no differently than pixels far apart
○ Features cannot be detected across different areas of the image
● How do we represent this spatial structure?
○ Convolutional Neural Networks, aka CNNs!
○ Use subsets of the inputs mapped to hidden neurons
CNN Composition
● CNNs are composed of three ideas
○ Local Receptive Fields
■ Small region which is connected to a hidden neuron
○ Shared Weights
■ Same weights & bias for each of the hidden neurons
○ Pooling
■ Condensing hidden layers
Local Receptive Fields
● Capture the idea of spatiality
● Map a region of input neurons to a hidden neuron
● Move the LRF and map to the next hidden neuron
○ The offset moved is known as the stride length
● E.g. a 5x5 LRF on a 28x28 input
○ Sliding the LRF produces a 24x24 hidden layer
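As a quick sanity check (assuming stride 1 and no padding), the hidden-layer size follows directly:

    # Hidden layer size per dimension: (input - LRF) // stride + 1.
    input_size, lrf_size, stride = 28, 5, 1
    hidden_size = (input_size - lrf_size) // stride + 1
    print(hidden_size)   # 24, i.e. a 24x24 hidden layer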
Shared Weights
● The weight vector W is the size of the LRF, and there is a single shared bias per layer
○ All neurons in hidden layer detect exactly the same feature
○ Learns simple input patterns which cause neurons to activate
● Detects features regardless of position
● E.g. a 5x5 LRF learning 3 features (26 parameters per feature: 25 weights + 1 bias)
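A naive sketch of one feature map with shared weights, assuming NumPy (the explicit loop shows the weight sharing; real libraries vectorize this):

    import numpy as np

    image = np.random.rand(28, 28)   # stand-in for a 28x28 input image
    w = np.random.rand(5, 5)         # one shared 5x5 weight matrix (25 params)
    b = np.random.rand()             # one shared bias (the 26th parameter)

    feature_map = np.zeros((24, 24))
    for i in range(24):
        for j in range(24):
            patch = image[i:i + 5, j:j + 5]             # local receptive field
            feature_map[i, j] = np.sum(patch * w) + b   # same w, b everywhere

    print(feature_map.shape)   # (24, 24)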
Pooling
● A separate layer which follows a convolutional layer
○ Simplifies the output from the hidden layer
● Example: a 24x24 convolutional layer
○ Take the maximum of each 2x2 region
○ Produces a 12x12 filter
■ A condensed version of the previous filter
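A short max-pooling sketch, again assuming NumPy:

    import numpy as np

    feature_map = np.random.rand(24, 24)
    # Keep the maximum of each 2x2 block: 24x24 condenses to 12x12.
    pooled = feature_map.reshape(12, 2, 12, 2).max(axis=(1, 3))
    print(pooled.shape)   # (12, 12)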
All Together Now
● Example: 28x28 pixel MNIST image set
○ Mapped to 20 24x24-neuron filters
○ Filters are condensed to 20 12x12-neuron filters
○ Eventually produces an output of 10 neurons predicting the digit in the image
CNNs in Practice
● TensorFlow
○ Google’s open source library for machine learning
○ Python & C++ APIs
● Theano
○ Open source library for machine learning research
○ Python interface
● Torch7
○ Used by large tech companies such as Google DeepMind, Facebook, and IBM
○ Less widely adopted Lua interface
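For example, the MNIST architecture from the All Together Now slide can be sketched in a few lines of TensorFlow's Keras API (a hypothetical sketch assuming TensorFlow 2.x; layer sizes follow the slide):

    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(20, (5, 5), activation="sigmoid",
                               input_shape=(28, 28, 1)),   # 20 feature maps, 24x24 each
        tf.keras.layers.MaxPooling2D((2, 2)),              # condense each map to 12x12
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10, activation="softmax"),   # one output per digit
    ])
    model.compile(optimizer="sgd", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()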