Deep Learning Basics Lecture 1: Feedforward Princeton University COS 495 Instructor: Yingyu Liang


Page 1

Deep Learning Basics Lecture 1: Feedforward

Princeton University COS 495

Instructor: Yingyu Liang

Page 2

Motivation I: representation learning

Page 3

Machine learning 1-2-3

• Collect data and extract features

• Build model: choose hypothesis class 𝓗 and loss function 𝑙

• Optimization: minimize the empirical loss
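The three steps above can be sketched end to end. Everything here is an illustrative assumption, not from the slides: synthetic data, a linear hypothesis class, and squared loss minimized in closed form.

```python
import numpy as np

# 1. Collect data and extract features (synthetic data stands in for real data)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))            # 100 examples, 3 features
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)

# 2. Build model: hypothesis class = linear predictors X @ w,
#    loss function = squared error
def empirical_loss(w):
    return np.mean((X @ w - y) ** 2)

# 3. Optimization: minimize the empirical loss
#    (closed form here via least squares)
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With squared loss and a linear class the minimizer has a closed form; for the models later in the lecture the same recipe applies but the minimization is done by gradient methods.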

Page 4

Features

[Figure: color histogram of an image, with Red, Green, and Blue channels]

[Diagram: 𝑥 → extract features → 𝜙(𝑥) → build hypothesis → 𝑦 = 𝑤𝑇𝜙(𝑥)]
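A minimal sketch of the color-histogram feature map 𝜙(𝑥); the image array, bin count, and normalization choice are illustrative assumptions, not from the slides.

```python
import numpy as np

def color_histogram(image, bins=8):
    """image: H x W x 3 uint8 array; returns concatenated R/G/B histograms."""
    feats = []
    for channel in range(3):  # Red, Green, Blue
        hist, _ = np.histogram(image[:, :, channel], bins=bins, range=(0, 256))
        feats.append(hist / hist.sum())   # normalize each channel's histogram
    return np.concatenate(feats)          # phi(x): a vector of length 3 * bins

# A random "image" stands in for real data
image = np.random.default_rng(0).integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
phi_x = color_histogram(image)
```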

Page 5

Features: part of the model

Build hypothesis: 𝑦 = 𝑤𝑇𝜙(𝑥)

Linear model in 𝑤; nonlinear model in 𝑥 (through 𝜙)

Page 6

Example: Polynomial kernel SVM

[Figure: data points in the (𝑥1, 𝑥2) plane]

𝑦 = sign(𝑤𝑇𝜙(𝑥) + 𝑏), with fixed feature map 𝜙(𝑥)
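To illustrate the "fixed 𝜙(𝑥)" idea, the sketch below expands (𝑥1, 𝑥2) into degree-2 monomials by hand and fits a linear classifier on top. A real polynomial kernel SVM does this expansion implicitly; here a simple least-squares fit stands in for the SVM solver, and the toy data is an illustrative assumption.

```python
import numpy as np

def phi(X):
    """Fixed degree-2 polynomial feature map of (x1, x2)."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.stack([x1, x2, x1 * x1, x1 * x2, x2 * x2], axis=1)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0, 1.0, -1.0)  # circular boundary

F = np.column_stack([phi(X), np.ones(len(X))])   # append the bias term b
w = np.linalg.lstsq(F, y, rcond=None)[0]
pred = np.sign(F @ w)                             # y = sign(w^T phi(x) + b)
accuracy = np.mean(pred == y)
```

A purely linear classifier on (𝑥1, 𝑥2) cannot separate a circular boundary; the fixed quadratic features make it linearly separable in 𝜙-space.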

Page 7

Motivation: representation learning

• Why don’t we also learn 𝜙(𝑥)?

[Diagram: 𝑥 → learn 𝜙(𝑥) → learn 𝑤 → 𝑦 = 𝑤𝑇𝜙(𝑥)]

Page 8

Feedforward networks

• View each dimension of 𝜙(𝑥) as something to be learned

[Diagram: input 𝑥 fully connected to learned features 𝜙(𝑥), with output 𝑦 = 𝑤𝑇𝜙(𝑥)]

Page 9

Feedforward networks

• Linear functions 𝜙𝑖(𝑥) = 𝜃𝑖𝑇𝑥 don’t work: the overall model 𝑤𝑇𝜙(𝑥) would still be linear in 𝑥, so we need some nonlinearity


Page 10

Feedforward networks

• Typically, set 𝜙𝑖(𝑥) = 𝑟(𝜃𝑖𝑇𝑥), where 𝑟(⋅) is some nonlinear function

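One such layer of learned features can be computed in a few lines; the sizes and the random weights below are illustrative assumptions, with tanh standing in for the generic nonlinearity 𝑟.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 4, 8                       # input dimension, number of learned features
x = rng.normal(size=d)
Theta = rng.normal(size=(d, k))   # column i is theta_i
w = rng.normal(size=k)

phi_x = np.tanh(Theta.T @ x)      # phi_i(x) = r(theta_i^T x), here r = tanh
y = w @ phi_x                     # y = w^T phi(x)
```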

Page 11

Feedforward deep networks

• What if we go deeper?

[Diagram: deep network 𝑥 → ℎ1 → ℎ2 → ⋯ → ℎ𝐿 → 𝑦]
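Going deeper just repeats the same layer computation 𝐿 times. A sketch under illustrative assumptions (layer widths, random untrained weights, tanh as the nonlinearity):

```python
import numpy as np

rng = np.random.default_rng(0)
widths = [4, 8, 8, 8]             # input dim, then widths of h1, h2, h3 (L = 3)

h = rng.normal(size=widths[0])    # h starts as the input x
for d_in, d_out in zip(widths[:-1], widths[1:]):
    W = rng.normal(size=(d_in, d_out))
    b = np.zeros(d_out)
    h = np.tanh(W.T @ h + b)      # next layer: h_l = r(W^T h_{l-1} + b)

w_out = rng.normal(size=widths[-1])
y = w_out @ h                     # linear output layer
```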

Page 12

Figure from Deep Learning, by Goodfellow, Bengio, Courville. Dark boxes are things to be learned.

Page 13

Motivation II: neurons

Page 14

Motivation: neurons

Figure from Wikipedia

Page 15

Motivation: abstract neuron model

• Neuron activated when the correlation between the input and a pattern 𝜃 exceeds some threshold 𝑏

• 𝑦 = threshold(𝜃𝑇𝑥 − 𝑏) or 𝑦 = 𝑟(𝜃𝑇𝑥 − 𝑏)

• 𝑟(⋅) is called the activation function

[Diagram: inputs 𝑥1, 𝑥2, …, 𝑥𝑑 feeding a single neuron with output 𝑦]

Page 16

Motivation: artificial neural networks

Page 17

Motivation: artificial neural networks

• Put into layers: feedforward deep networks

[Diagram: neurons arranged in layers: 𝑥 → ℎ1 → ℎ2 → ⋯ → ℎ𝐿 → 𝑦]

Page 18

Components in Feedforward networks

Page 19

Components

• Representations:
  • Input
  • Hidden variables

• Layers/weights:
  • Hidden layers
  • Output layer

Page 20

Components

[Diagram: input 𝑥 → first layer → hidden variables ℎ1, ℎ2, …, ℎ𝐿 → output layer → 𝑦]

Page 21

Input

• Represented as a vector

• Sometimes requires preprocessing, e.g.:
  • Subtract the mean
  • Normalize to [−1, 1]
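The two preprocessing steps above, sketched on a synthetic data matrix (rows are examples, columns are features); the data itself is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(50, 200, size=(100, 3))   # raw inputs, far from zero-centered

X_centered = X - X.mean(axis=0)           # subtract the per-feature mean
X_scaled = X_centered / np.abs(X_centered).max(axis=0)  # normalize to [-1, 1]
```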

Page 22

Output layers

• Regression: 𝑦 = 𝑤𝑇ℎ + 𝑏

• Linear units: no nonlinearity


Page 23

Output layers

• Multi-dimensional regression: 𝑦 = 𝑊𝑇ℎ + 𝑏

• Linear units: no nonlinearity


Page 24

Output layers

• Binary classification: 𝑦 = 𝜎(𝑤𝑇ℎ + 𝑏)

• Corresponds to using logistic regression on ℎ


Page 25

Output layers

• Multi-class classification:

• 𝑦 = softmax(𝑧) where 𝑧 = 𝑊𝑇ℎ + 𝑏

• Corresponds to using multi-class logistic regression on ℎ

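The output layers from the last few slides can be written as functions of the last hidden layer ℎ; the sizes and random weights here are illustrative.

```python
import numpy as np

def linear_output(h, w, b):                 # regression
    return w @ h + b

def sigmoid_output(h, w, b):                # binary classification
    return 1.0 / (1.0 + np.exp(-(w @ h + b)))

def softmax_output(h, W, b):                # multi-class classification
    z = W.T @ h + b
    z = z - z.max()                         # shift by max for numerical stability
    e = np.exp(z)
    return e / e.sum()                      # probabilities summing to 1

rng = np.random.default_rng(0)
h = rng.normal(size=5)
probs = softmax_output(h, rng.normal(size=(5, 3)), np.zeros(3))
```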

Page 26

Hidden layers

• Each neuron takes a weighted linear combination of the previous layer’s outputs

• So each neuron can be thought of as outputting one value to the next layer

[Diagram: neurons in layer ℎ𝑖 feeding a neuron in layer ℎ𝑖+1]

Page 27

Hidden layers

• 𝑦 = 𝑟(𝑤𝑇𝑥 + 𝑏)

• Typical activation functions 𝑟:
  • Threshold: 𝑡(𝑧) = 𝕀[𝑧 ≥ 0]
  • Sigmoid: 𝜎(𝑧) = 1/(1 + exp(−𝑧))
  • Tanh: tanh(𝑧) = 2𝜎(2𝑧) − 1

[Diagram: 𝑥 → 𝑟(⋅) → 𝑦]
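The three activation functions above are one-liners, and the stated identity tanh(𝑧) = 2𝜎(2𝑧) − 1 can be checked numerically on a grid of points:

```python
import numpy as np

def threshold(z):
    return (np.asarray(z) >= 0).astype(float)   # indicator I[z >= 0]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-4, 4, 9)
tanh_via_sigmoid = 2 * sigmoid(2 * z) - 1       # should match np.tanh(z)
```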

Page 28

Hidden layers

• Problem: saturation: for inputs of large magnitude the activation flattens out, so the gradient is too small

[Figure borrowed from Pattern Recognition and Machine Learning, Bishop: sigmoid-like activation curves flattening at both ends, where the gradient is too small]
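Saturation is easy to see numerically: the sigmoid’s gradient is 𝜎′(𝑧) = 𝜎(𝑧)(1 − 𝜎(𝑧)), which peaks at 𝑧 = 0 and nearly vanishes for large |𝑧|.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)          # sigma'(z) = sigma(z) * (1 - sigma(z))

grad_at_0 = sigmoid_grad(0.0)     # maximum value, 0.25
grad_at_10 = sigmoid_grad(10.0)   # tiny: the unit has saturated
```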

Page 29

Hidden layers

• Activation function ReLU (rectified linear unit): ReLU(𝑧) = max{𝑧, 0}

Figure from Deep learning, by Goodfellow, Bengio, Courville.

Page 30

Hidden layers

• Activation function ReLU (rectified linear unit): ReLU(𝑧) = max{𝑧, 0}

• Gradient: 0 for 𝑧 < 0, 1 for 𝑧 > 0

Page 31

Hidden layers

• Generalizations of ReLU: gReLU(𝑧) = max{𝑧, 0} + 𝛼 min{𝑧, 0}

• Leaky-ReLU(𝑧) = max{𝑧, 0} + 0.01 min{𝑧, 0}

• Parametric-ReLU(𝑧): 𝛼 learnable

[Figure: gReLU(𝑧) as a function of 𝑧]
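The whole family fits in one function of 𝛼: 𝛼 = 0 recovers plain ReLU, 𝛼 = 0.01 gives Leaky-ReLU, and treating 𝛼 as a trainable parameter gives Parametric-ReLU. The sample points below are illustrative.

```python
import numpy as np

def g_relu(z, alpha):
    """gReLU(z) = max{z, 0} + alpha * min{z, 0}."""
    return np.maximum(z, 0.0) + alpha * np.minimum(z, 0.0)

z = np.array([-2.0, -0.5, 0.0, 1.5])
relu_vals = g_relu(z, alpha=0.0)       # plain ReLU: negatives clipped to 0
leaky_vals = g_relu(z, alpha=0.01)     # Leaky-ReLU: small slope for z < 0
```

Unlike sigmoid and tanh, these units do not saturate for large positive inputs, and the leaky/parametric variants also keep a nonzero gradient for 𝑧 < 0.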