introduction to machine learning10701/slides/9_multilayer_perceptron.pdf · 2017-09-28 ·...

Introduction to Machine Learning Multilayer Perceptron Barnabás Póczos

Introduction to Machine Learning

Multilayer Perceptron

Barnabás Póczos

The Multilayer Perceptron

Multilayer Perceptron

Dean A. Pomerleau, Carnegie Mellon University, 1989

ALVINN: AN AUTONOMOUS LAND VEHICLE IN A NEURAL NETWORK

Training: using simulated road generator

Gradient Descent

We want to solve:

Starting Point

Starting Point

Fixed step size can be too big

Fixed step size can be too small
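
The effect of the step size can be illustrated with a minimal sketch. The objective f(w) = w^2, the starting point, and the three step sizes below are assumptions chosen for illustration; they are not taken from the slides.

```python
def gradient_descent(grad, w0, step_size, num_steps):
    """Run fixed-step gradient descent and return every iterate."""
    w = w0
    path = [w]
    for _ in range(num_steps):
        w = w - step_size * grad(w)
        path.append(w)
    return path


# Illustrative objective (an assumption): f(w) = w**2, so grad f(w) = 2*w.
def grad_f(w):
    return 2.0 * w


too_big = gradient_descent(grad_f, w0=1.0, step_size=1.1, num_steps=20)
too_small = gradient_descent(grad_f, w0=1.0, step_size=0.001, num_steps=20)
reasonable = gradient_descent(grad_f, w0=1.0, step_size=0.3, num_steps=20)

print(abs(too_big[-1]))     # grows: each step multiplies w by (1 - 2.2) = -1.2
print(abs(too_small[-1]))   # barely moves: each step multiplies w by 0.998
print(abs(reasonable[-1]))  # close to the minimizer w = 0
```

With the large step the iterates overshoot the minimizer and diverge; with the tiny step they are still near the starting point after 20 iterations.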

Character Recognition with MLP

Matlab: appcr1

The network

Noise-free input:

26 different letters of size 7x5

Noisy inputs

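Matlab MLP Training

% Create an MLP with two hidden layers (10 and 25 units)
hiddenlayers = [10, 25];
net1 = feedforwardnet(hiddenlayers);

% Size the input and output layers to match the data X and targets T
net1 = configure(net1, X, T);

% View the network diagram
view(net1);

% Train on inputs X with targets T
net1 = train(net1, X, T);

% Test: evaluate the trained network on Xtest
Y1 = net1(Xtest);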

Prediction errors

▪ Network 1 was trained on clean images

▪ Network 2 was trained on noisy images: 30 noisy copies of each letter were created
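
One way the noisy training set could be built is sketched below. The Gaussian noise model, the noise level `noise_std`, and the random placeholder bitmaps are all assumptions for illustration; the slides do not state how the noise was generated, and the real letter bitmaps are not reproduced here.

```python
import random


def make_noisy_copies(patterns, copies_per_pattern=30, noise_std=0.2, seed=0):
    """Return (inputs, labels): noisy copies of each pattern with its class index."""
    rng = random.Random(seed)
    inputs, labels = [], []
    for label, pattern in enumerate(patterns):
        for _ in range(copies_per_pattern):
            # Add independent Gaussian noise to every pixel (assumed noise model).
            noisy = [pixel + rng.gauss(0.0, noise_std) for pixel in pattern]
            inputs.append(noisy)
            labels.append(label)
    return inputs, labels


# Placeholder bitmaps: 26 random binary 7x5 images stand in for the real letters.
placeholder_rng = random.Random(1)
letters = [
    [float(placeholder_rng.randint(0, 1)) for _ in range(7 * 5)]
    for _ in range(26)
]

X_noisy, y_noisy = make_noisy_copies(letters)
print(len(X_noisy), len(X_noisy[0]))  # 780 35
```

Each of the 26 letters contributes 30 perturbed 35-pixel vectors, so the noisy training set has 780 examples.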

The Backpropagation Algorithm

Multilayer Perceptron

The gradient of the error

Notation

Some observations

The backpropagated error

The backpropagated error

Lemma

The backpropagated error

Therefore,

The backpropagation algorithm
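
The formulas on the backpropagation slides did not survive extraction, so the following is only a minimal sketch under stated assumptions: one sigmoid hidden layer, a single linear output unit, and squared error 0.5 * (output - target)^2. All function and variable names are hypothetical. The sketch computes the gradient by propagating the output error backwards and then checks one entry against a finite-difference estimate.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(W1, b1, W2, b2, x):
    """Forward pass: sigmoid hidden layer, single linear output unit."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    output = sum(w * h for w, h in zip(W2, hidden)) + b2
    return hidden, output


def loss(W1, b1, W2, b2, x, target):
    _, output = forward(W1, b1, W2, b2, x)
    return 0.5 * (output - target) ** 2


def backprop(W1, b1, W2, b2, x, target):
    """Return gradients of the squared error w.r.t. all parameters."""
    hidden, output = forward(W1, b1, W2, b2, x)
    delta_out = output - target  # error signal at the output unit
    grad_W2 = [delta_out * h for h in hidden]
    grad_b2 = delta_out
    # Propagate the error back through the output weights and the
    # sigmoid derivative h * (1 - h).
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(W2, hidden)]
    grad_W1 = [[d * xi for xi in x] for d in delta_hidden]
    grad_b1 = delta_hidden
    return grad_W1, grad_b1, grad_W2, grad_b2


rng = random.Random(0)
n_in, n_hidden = 3, 4
W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
W2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
b2 = rng.uniform(-1, 1)
x, target = [0.5, -1.0, 2.0], 0.7

grad_W1, grad_b1, grad_W2, grad_b2 = backprop(W1, b1, W2, b2, x, target)

# Central finite-difference check on one hidden-layer weight.
eps = 1e-6
W1_plus = [row[:] for row in W1]
W1_minus = [row[:] for row in W1]
W1_plus[1][2] += eps
W1_minus[1][2] -= eps
numeric = (
    loss(W1_plus, b1, W2, b2, x, target) - loss(W1_minus, b1, W2, b2, x, target)
) / (2 * eps)
gap = abs(numeric - grad_W1[1][2])
print(gap)  # should be tiny if the backward pass matches the loss
```

A gradient check of this kind is a common way to validate a hand-written backward pass before using it inside gradient descent.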

What functions can multilayer perceptrons represent?

A single perceptron cannot represent the XOR function:

f(0,0)=0, f(1,1)=0, f(0,1)=1, f(1,0)=1
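
The XOR limitation can be checked with a small sketch, using the usual convention that XOR(a, b) = 1 exactly when the two inputs differ. The weight grid and the hand-picked two-layer weights are illustrative assumptions, not values from the slides. A grid search finds no single threshold unit that reproduces XOR, while a network with one hidden layer of two threshold units does.

```python
from itertools import product


def step(z):
    """Threshold activation: 1 if z > 0, else 0."""
    return 1 if z > 0 else 0


def xor(a, b):
    return a ^ b


def perceptron(w1, w2, bias, a, b):
    return step(w1 * a + w2 * b + bias)


def two_layer_xor(a, b):
    """Hand-picked weights (an illustration, not from the slides)."""
    h_or = step(a + b - 0.5)   # fires if at least one input is 1
    h_and = step(a + b - 1.5)  # fires only if both inputs are 1
    return step(h_or - h_and - 0.5)  # OR and not AND


inputs = list(product([0, 1], repeat=2))

# Search a coarse grid of weights: no single threshold unit reproduces XOR.
grid = [i / 2 for i in range(-8, 9)]
single_unit_solutions = [
    (w1, w2, bias)
    for w1, w2, bias in product(grid, repeat=3)
    if all(perceptron(w1, w2, bias, a, b) == xor(a, b) for a, b in inputs)
]

print(len(single_unit_solutions))  # 0
print(all(two_layer_xor(a, b) == xor(a, b) for a, b in inputs))  # True
```

The grid search is only a demonstration; the general impossibility follows because the two classes of XOR are not linearly separable.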

Hilbert’s 13th Problem

1902: 23 “most important” problems in mathematics

The 13th Problem: “Solve the 7th-degree equation using continuous functions of two parameters.”

Conjecture: It can’t be solved…

Related conjecture:

Let f be a function of 3 arguments such that f(a,b,c) is a solution x of x^7 + a x^3 + b x^2 + c x + 1 = 0.

Prove that f cannot be rewritten as a composition of finitely many functions of two arguments.

Another rewritten form:

Prove that there is a nonlinear continuous system of three variables that cannot be decomposed with finitely many functions of two variables.

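Function decompositions

f(x,y,z) = Φ1(ψ1(x), ψ2(y)) + Φ2(c1ψ3(y) + c2ψ4(z), x)

[Figure: network diagram of this decomposition, with inputs x, y, z, inner functions ψ1 to ψ4, weights c1 and c2, two-argument functions Φ1 and Φ2, and summation nodes Σ producing f(x,y,z).]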

Function decompositions

1957: Arnold disproves Hilbert’s conjecture.
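
The theorem statement on this slide did not survive extraction. For reference, the result usually cited in this context is the Kolmogorov-Arnold representation theorem. It is commonly stated as follows (the index names q and p are assumptions chosen for illustration): every continuous function f on [0,1]^n can be written using only addition and continuous functions of a single variable,

```latex
f(x_1,\dots,x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \psi_{q,p}(x_p) \right),
\qquad \Phi_q,\ \psi_{q,p} : \mathbb{R} \to \mathbb{R} \ \text{continuous}.
```

Since addition is itself a function of two arguments, this implies that a continuous function of three variables can be built from finitely many continuous functions of two variables.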

Function decompositions

Corollary:

Issues: This statement is not constructive.

Kurt Hornik, Maxwell Stinchcombe and Halbert White: “Multilayer feedforward networks are universal approximators”, Neural Networks, 2(5), 359-366, 1989

Universal Approximators

Definition: ΣN(g) neural network with 1 hidden layer:

Theorem:

Definition:
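
The formulas on this slide did not survive extraction. As a hedged reconstruction, the class of networks with one hidden layer and activation g is commonly written as below. The symbols q, β_j, w_j and b_j are assumptions chosen for illustration and may differ from the slide's own notation.

```latex
\Sigma^N(g) \;=\; \left\{\, f:\mathbb{R}^N \to \mathbb{R} \;\middle|\;
f(x) = \sum_{j=1}^{q} \beta_j \, g\!\left(w_j^\top x + b_j\right),\
q \in \mathbb{N},\ \beta_j, b_j \in \mathbb{R},\ w_j \in \mathbb{R}^N \right\}
```

Roughly, the theorem of Hornik, Stinchcombe and White says that for a suitable squashing activation g this class is dense in the continuous functions on compact sets, so any such function can be approximated to arbitrary accuracy given enough hidden units.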

Theorem: (Blum & Li, 1991)

Universal Approximators

Definition:

Formal statement:

Proof

GOAL:

Integral approximation in 1-dim:

Integral approximation in 2-dim:

[Figures: the domain partitioned into cells labeled x_i, in 1-dim and in 2-dim.]

Proof

GOAL:

The indicator function of the polygon Xi can be learned by this neural network:

1 if x is in Xi

-1 otherwise

The weighted linear combination of these indicator functions will be a good approximation of the original function f.

Proof

This linear equation can also be solved.
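
The idea behind the proof can be sketched in one dimension. The sketch below makes several assumptions that are not from the slides: the target function sin(2πx), the interval [0, 1), equal-width cells, 0/1 indicators (the slides use 1/-1), and weights set to the function value at each cell midpoint. Each cell indicator is built as the difference of two threshold units, and the weighted sum of indicators approximates the function more closely as the partition gets finer.

```python
import math


def step(z):
    """Threshold unit: 1 if z >= 0, else 0."""
    return 1.0 if z >= 0 else 0.0


def cell_indicator(x, left, right):
    """Indicator of [left, right) as a difference of two threshold units."""
    return step(x - left) - step(x - right)


def piecewise_constant(f, num_cells):
    """Approximate f on [0, 1) by a weighted sum of cell indicators."""
    edges = [i / num_cells for i in range(num_cells + 1)]
    # Weight of each indicator: the value of f at the cell midpoint.
    weights = [f((edges[i] + edges[i + 1]) / 2) for i in range(num_cells)]

    def approx(x):
        return sum(
            w * cell_indicator(x, edges[i], edges[i + 1])
            for i, w in enumerate(weights)
        )

    return approx


def max_error(f, approx, num_samples=1000):
    xs = [k / num_samples for k in range(num_samples)]
    return max(abs(f(x) - approx(x)) for x in xs)


def target(x):
    return math.sin(2 * math.pi * x)


errors = {n: max_error(target, piecewise_constant(target, n)) for n in (4, 16, 64)}
print(errors)  # the maximum error shrinks as the partition gets finer
```

In higher dimensions the cells become polygons, and each indicator needs more threshold units, but the weighted-sum construction is the same.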

Thanks for your attention!