(artificial) neural networks: from perceptron to mlp · 2020-06-29 · artificial neural networks...

47
(Artificial) Neural Networks: From Perceptron to MLP Industrial AI Lab. Prof. Seungchul Lee

Upload: others

Post on 02-Aug-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

(Artificial) Neural Networks:From Perceptron to MLP

Industrial AI Lab.

Prof. Seungchul Lee

Page 2: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Perceptron: Example

2

Page 3: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Perceptron: Example

3

Page 4: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Perceptron: Example

4

Page 5: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Perceptron: Forward Propagation

5

Page 6: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

From Perceptron to MLP

6

Page 7: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

XOR Problem

• Minsky-Papert Controversy on XOR– Not linearly separable

– Limitation of perceptron

• Single neuron = one linear classification boundary

7

Page 8: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Artificial Neural Networks: MLP

• Multi-layer Perceptron (MLP) = Artificial Neural Networks (ANN)– Multi neurons = multiple linear classification boundaries

8

Page 9: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Artificial Neural Networks: Activation Function

• Differentiable nonlinear activation function

9

Page 10: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Artificial Neural Networks

• In a compact representation

10

Page 11: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Artificial Neural Networks

• A single layer is not enough to be able to represent complex relationship between input and output

⟹ perceptron with many layers and units

• Multi-layer perceptron– Features of features

– Mapping of mappings

11

Page 12: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Another Perspective:ANN as Kernel Learning

12

Page 13: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Neuron

• We can represent this “neuron” as follows:

13

Page 14: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

XOR Problem

• The main weakness of linear predictors is their lack of capacity.

• For classification, the populations have to be linearly separable.

14

Page 15: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Nonlinear Mapping

• The XOR example can be solved by pre-processing the data to make the two populations linearly separable.

Source: Dr. Francois Fleuret at EPFL 15

Page 16: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Nonlinear Mapping

• The XOR example can be solved by pre-processing the data to make the two populations linearly separable.

Source: Dr. Francois Fleuret at EPFL 16

Page 17: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Nonlinear Mapping

• The XOR example can be solved by pre-processing the data to make the two populations linearly separable.

Source: Dr. Francois Fleuret at EPFL 17

Page 18: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Kernel

• Often we want to capture nonlinear patterns in the data

– nonlinear regression: input and output relationship may not be linear

– nonlinear classification: classes may note be separable by a linear boundary

• Linear models (e.g. linear regression, linear SVM) are not just rich enough

– by mapping data to higher dimensions where it exhibits linear patterns

– apply the linear model in the new input feature space

– mapping = changing the feature representation

• Kernels: make linear model work in nonlinear settings

18

Page 19: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Kernel + Neuron

• Nonlinear mapping + neuron

𝜙: Kernel Perceptron

19

Page 20: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Neuron + Neuron

• Nonlinear mapping can be represented by another neurons

• Nonlinear Kernel – Nonlinear activation functions

𝜙: Kernel

20

Page 21: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Multi Layer Perceptron

• Nonlinear mapping can be represented by another neurons

• We can generalize an MLP

𝜙: Kernel

21

Page 22: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Summary

• Universal function approximator

• Universal function classifier

• Parameterized

22

Page 23: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Deep Artificial Neural Networks

• Complex/Nonlinear universal function approximator

– Linearly connected networks

– Simple nonlinear neurons

Feature learning Classification

Class 1

Class 2

nonlinearlinear

OutputInput

23

Page 24: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Deep Artificial Neural Networks

• Complex/Nonlinear universal function approximator

– Linearly connected networks

– Simple nonlinear neurons

Class 1

Class 2

…… … ………

nonlinearlinear

Feature learning Classification

OutputInput

24

Page 25: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Looking at Parameters

25

Page 26: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Logistic Regression in a Form of Neural Network

26

Page 27: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Logistic Regression in a Form of Neural Network

• Neural network convention

Do not indicate bias units

27

Page 28: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Nonlinearly Distributed Data

Do not include bias units

• Example to understand network’s behavior– Include a hidden layer

28

Page 29: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Multi Layers

Do not include bias units

• 𝑧 space

29

Page 30: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Multi Layers

Do not include bias units

• 𝑥 space

30

Page 31: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

(Artificial) Neural Networks:Training

Industrial AI

Prof. Seungchul Lee

Page 32: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Training Neural Networks: Loss Function

• Measures error between target values and predictions

• Example– Squared loss (for regression):

– Cross entropy (for classification):

32

Page 33: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Gradients in ANN

• Learning weights and biases from data using gradient descent

•𝜕ℓ

𝜕𝜔: too many computations are required for all 𝜔

• Structural constraint of NN: – Composition of functions

– Chain rule

– Dynamic programming

33

Page 34: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Training Neural Networks with TensorFlow

• Optimization procedure

• It is not easy to numerically compute gradients in network in general.– The good news: people have already done all the "hard work" of developing numerical solvers (or

libraries)

– There are a wide range of tools → We will use the TensorFlow

34

Page 35: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

ANN in TensorFlow:MNIST

35

Page 36: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

MNIST database

• Mixed National Institute of Standards and Technology database

• Handwritten digit database

• 28 × 28 gray scaled image

• Flattened matrix into a vector of 28 × 28 = 784

36

Page 37: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Our Network Model

Input layer(784)

hidden layer(100)

output layer(10)

Input image(28 X 28)

flatteneddigit prediction

in one-hot-encoding

37

Page 38: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Machine Learning and Deep Learning

Prof. Seungchul Lee

Industrial AI Lab.

38

Page 39: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Supervised Learning

Training Data

Chair

Dest

Generalization

39

Page 40: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Machine Learning

• Image of digit 0

• Image of digit 1

Image

40

Page 41: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

이사?

간다

온다

“좋은 특성인자 선정이 중요하다.”

전문가 지식

Page 42: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Machine Learning Deep Learning

Data

Features

Decision

Data

Decision

• Less depends on domain knowledge• AI-discovered features

Signal processing + Domain knowledge“ 딥러닝은기계학습보다는

Domain Knowledge의존도가 낮다. ”

42

Page 43: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Deep Learning

• Convolutional Neural Networks (CNN)

• Image pattern recognition problems

Image

0

1

43

Page 44: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Deep Learning

• Convolutional Neural Networks (CNN)

• Image pattern recognition problems

Image

0

1

44

Page 45: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

Handwritten Digit Recognition

Ind

ust

rial

AI L

ab.

45

Page 46: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

AI Core vs. AI + X

46

Page 47: (Artificial) Neural Networks: From Perceptron to MLP · 2020-06-29 · Artificial Neural Networks •A single layer is not enough to be able to represent complex relationship between

인공지능과공학

• Generally correct in ideal cases

• Difficult to reflect uncertainties in reality

47