Machine Learning and Computer Vision Group
Deep Learning with TensorFlow (http://cvml.ist.ac.at/courses/DLWT_W18)
Lecture 3: Artificial Neural Networks (Multilayer Perceptrons)


Page 1:

Machine Learning and Computer Vision Group

Deep Learning with TensorFlow (http://cvml.ist.ac.at/courses/DLWT_W18)

Lecture 3: Artificial Neural Networks (Multilayer Perceptrons)

Page 2:

Artificial Neural Networks: Multi-layer Perceptrons

Lars Bollmann

Page 3:

Contents

01 Introduction: brief history, biological neuron and artificial neural network

02 Perceptron: features & drawbacks

03 Multi-layer Perceptron: description & training

04 Implementation in Tensorflow: High-Level API and plain Tensorflow

Page 4:

01 Introduction 02 Perceptron 03 Multi-layer Perceptron 04 Tensorflow

Brief history

[1]

Page 5:

Biological neuron

[2]

[4] [3]

Page 6:

Artificial neural network (ANN)

Logical AND

Logical OR

McCulloch & Pitts (1943), "A Logical Calculus of the Ideas Immanent in Nervous Activity" [5]

[Figure 1 of [5], reproduced on the slide: diagrams (a)-(i) of simple nets together with their temporal propositional expressions, covering e.g. delayed firing, disjunction (logical OR), conjunction (logical AND) and conjoined negation; one net models the heat illusion, and a net with circular (recurrent) connections illustrates "learning".]

All images from [5]

Page 7:

The Perceptron, Rosenblatt (1957)

[Diagram: inputs $x_1$, $x_2$ and a constant input 1 feed a summation unit $\Sigma$ through weights $w_{1,1}$, $w_{2,1}$ and bias $b$, producing the output $\hat{y}$]

•  Linear threshold unit:

   $\hat{y} = \begin{cases} 1 & \text{if } \sum_{i=1}^{m} w_{i,1}\, x_i + b > 0 \\ 0 & \text{otherwise} \end{cases}$

•  Update rule for weights:

   $w_{i,j}^{(\text{next step})} = w_{i,j} + \eta\,(y_j - \hat{y}_j)\, x_i$

Features:
•  Converges if the training instances are linearly separable

Drawbacks:
•  Still a linear classifier
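The linear threshold unit and update rule above can be sketched in plain Python (a minimal illustration, not the course's demo code; `eta` and the epoch count are arbitrary choices):

```python
# Minimal perceptron sketch: a linear threshold unit trained with
# w_i <- w_i + eta * (y - y_hat) * x_i, here on the (linearly separable)
# logical AND function from the introduction.

def predict(w, b, x):
    # Linear threshold unit: 1 if sum_i w_i * x_i + b > 0, else 0
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def train_perceptron(data, eta=0.1, epochs=20):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            y_hat = predict(w, b, x)
            for i in range(len(w)):
                w[i] += eta * (y - y_hat) * x[i]
            b += eta * (y - y_hat)      # bias updated like a weight on input 1
    return w, b

# Logical AND is linearly separable, so the rule converges
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_data)
print([predict(w, b, x) for x, _ in and_data])  # [0, 0, 0, 1]
```

Training the same loop on XOR, by contrast, never converges, which is exactly the "still a linear classifier" drawback noted above.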

Page 8:

Multi-layer Perceptron (MLP)

[Diagram: input layer ($x_1$, $x_2$ plus a bias unit 1), a hidden layer of $\Sigma$ units (plus a bias unit 1), and an output layer producing $y_1$, $y_2$, $y_3$]

•  Activation of one neuron:

   $a_i^{(L)} = f_{\text{activation}}\Big(\sum_j w_{i,j}\, a_j^{(L-1)} + b_i\Big)$

•  Alternative activation functions: [6]
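A minimal plain-Python sketch of this per-neuron activation (the sigmoid and the example weights are illustrative assumptions, not taken from the slides):

```python
import math

def sigmoid(z):
    # One possible choice of f_activation
    return 1.0 / (1.0 + math.exp(-z))

def neuron_activation(weights, prev_activations, bias, f_activation=sigmoid):
    # a_i^(L) = f_activation( sum_j w_ij * a_j^(L-1) + b_i )
    z = sum(w * a for w, a in zip(weights, prev_activations)) + bias
    return f_activation(z)

def layer_forward(weight_rows, prev_activations, biases, f_activation=sigmoid):
    # One row of weights (and one bias) per neuron in layer L
    return [neuron_activation(w, prev_activations, b, f_activation)
            for w, b in zip(weight_rows, biases)]

# Tiny example: 2 inputs -> 3 hidden neurons (weights chosen arbitrarily)
hidden = layer_forward([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]],
                       [1.0, 2.0], [0.0, 0.1, -0.1])
print([round(a, 3) for a in hidden])
```

Swapping `f_activation` for one of the alternatives surveyed in [6] (e.g. tanh or ReLU) only changes the last step.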

Page 9:

Training

[Diagram: the same network; each output neuron contributes a squared-error term $(\hat{y}_1 - y_1)^2$, $(\hat{y}_2 - y_2)^2$, $(\hat{y}_3 - y_3)^2$]

•  Activation of one neuron:

   $a_i^{(L)} = f_{\text{activation}}\Big(\sum_j w_{i,j}\, a_j^{(L-1)} + b_i\Big)$

•  Find the optimal weights & bias terms to reduce a cost function (e.g. MSE, cross-entropy)

•  Weights: influence of the neurons in the previous layer

•  Bias terms: nudge the neuron towards being active/inactive

Page 10:

Training

[Diagram: the same network with squared-error terms $(\hat{y}_k - y_k)^2$ at the outputs]

•  Initialize weights & bias terms with random values

•  Average the cost over all $m$ instances:

   $C = \frac{1}{m}\sum_{i=1}^{m} C_i$

•  Use mini-batches to compute the gradient with respect to the weights and biases

•  Gradient descent step: minimize the cost function using GD
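The loop described above (random initialization, mini-batches, averaged gradient, descent step) can be sketched on a deliberately tiny one-weight model so the structure stays visible; all names and constants here are illustrative:

```python
import random

# Gradient-descent sketch with a single weight w and squared-error cost
# C_i = (w * x_i - y_i)^2, so dC_i/dw = 2 * (w * x_i - y_i) * x_i.

random.seed(0)
data = [(x, 2.0 * x) for x in range(1, 9)]   # ground truth: y = 2x
w = random.uniform(-1, 1)                     # random initialization
eta, batch_size = 0.01, 4

for step in range(200):
    batch = random.sample(data, batch_size)   # draw a mini-batch
    # Gradient averaged over the mini-batch
    grad = sum(2 * (w * x - y) * x for x, y in batch) / batch_size
    w -= eta * grad                           # gradient descent step

print(round(w, 3))  # approaches the true weight 2.0
```

In a real MLP, `w` is the full set of weight matrices and bias vectors and the per-example gradient comes from backpropagation (next slide); the surrounding loop is the same.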

Page 11:

Training: Backpropagation

•  Forward pass: compute the output of all neurons for a training instance

•  Backward pass: compute the partial derivatives for each weight and bias

•  Combine the partial derivatives → gradient vector

[Diagram: a chain of single neurons with activations $a^{(L-1)}$ and $a^{(L)}$, connected by weight $w^{(L)}$ and bias $b$]

•  Cost: $C_0 = \big(a^{(L)} - y\big)^2$

•  Activation: $a^{(L)} = f_{\text{act}}\big(z^{(L)}\big)$ with $z^{(L)} = w^{(L)} a^{(L-1)} + b$

•  Derivative of $C_0$ with respect to $w^{(L)}$ (chain rule):

   $\frac{\partial C_0}{\partial w^{(L)}} = \frac{\partial z^{(L)}}{\partial w^{(L)}} \cdot \frac{\partial a^{(L)}}{\partial z^{(L)}} \cdot \frac{\partial C_0}{\partial a^{(L)}}$

•  Average over the mini-batch:

   $\frac{\partial C_{mb}}{\partial w^{(L)}} = \frac{1}{m_{mb}} \sum_{i=1}^{m_{mb}} \frac{\partial C_i}{\partial w^{(L)}}$
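The chain rule above can be checked numerically for a single weight; this pure-Python sketch uses a sigmoid as $f_{\text{act}}$ and a finite-difference comparison (both are illustrative assumptions, not from the slides):

```python
import math

# Single-chain sketch of the derivative above:
# z = w * a_prev + b, a = f_act(z), C0 = (a - y)^2
# dC0/dw = dz/dw * da/dz * dC0/da = a_prev * f_act'(z) * 2 * (a - y)

def f_act(z):                 # sigmoid as the activation
    return 1.0 / (1.0 + math.exp(-z))

def f_act_prime(z):           # sigmoid derivative: s * (1 - s)
    s = f_act(z)
    return s * (1.0 - s)

def grad_w(w, b, a_prev, y):
    z = w * a_prev + b
    a = f_act(z)
    return a_prev * f_act_prime(z) * 2.0 * (a - y)

# Finite-difference check of the analytic (chain-rule) gradient
w, b, a_prev, y = 0.7, -0.2, 0.9, 1.0

def cost(w_):
    return (f_act(w_ * a_prev + b) - y) ** 2

eps = 1e-6
numeric = (cost(w + eps) - cost(w - eps)) / (2 * eps)
print(abs(grad_w(w, b, a_prev, y) - numeric) < 1e-6)  # True
```

The same factorization, applied layer by layer from the output backwards, is what the backward pass computes for every weight and bias at once.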

Page 12:

Tensorflow

High-Level API:
•  TF.Learn
•  DNNClassifier
•  Parameters: # layers, # neurons per layer, batch size, # iterations, activation function

Plain Tensorflow:
1. Construction phase
2. Training phase
3. Using the trained network

→ Demo in Jupyter

Page 13:

Hyperparameters

# hidden layers:
•  Start with few layers and increase the number
•  More layers: exponentially fewer neurons needed for complex functions, faster convergence, better generalization, reuse of layers

# neurons / layer:
•  Funnel or constant size; a "black art"
•  Too many neurons cause overfitting

Activation:
•  ReLU is a good choice in general
•  Softmax for the output layer when the classes are mutually exclusive
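The two activation choices recommended above can be written out directly (plain-Python sketch; the example logits are arbitrary):

```python
import math

def relu(z):
    # ReLU: a good general-purpose hidden-layer activation
    return max(0.0, z)

def softmax(logits):
    # Softmax output layer for mutually exclusive classes:
    # exponentiate (shifted by the max for numerical stability)
    # and normalize so the outputs sum to 1
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])  # class probabilities summing to 1
```

Because softmax outputs form a probability distribution over the classes, it pairs naturally with the cross-entropy cost mentioned earlier.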

Page 14:

Summary

Initial concepts in the 1940s, a dark age, and the recent boom

From Artificial Neurons to Deep Neural Networks

Layers, neurons, bias, weights, activation function

Multi-layer Perceptron

Gradient descent, backpropagation & mini-batches

Training of ANN

High-Level API, Plain TensorFlow, hyperparameter tuning

Implementation in Tensorflow

Page 15:

References

[1] https://www.slideshare.net/deview/251-deep-learning-using-cu-dnn/4
[2] https://en.wikipedia.org/wiki/Neuron#/media/File:Blausen_0657_MultipolarNeuron.png
[3] https://en.wikipedia.org/wiki/Neural_oscillation
[4] http://www.lesicalab.com/research/
[5] McCulloch and Pitts (1943), "A logical calculus of the ideas immanent in nervous activity"
[6] https://medium.com/@shrutijadon10104776/survey-on-activation-functions-for-deep-learning-9689331ba09