
Deep Learning Basics Lecture 1: Feedforward

Princeton University COS 495

Instructor: Yingyu Liang

Motivation I: representation learning

Machine learning 1-2-3

• Collect data and extract features

• Build model: choose hypothesis class 𝓗 and loss function 𝑙

• Optimization: minimize the empirical loss
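As an illustration of this 1-2-3 recipe (not from the slides), here is a minimal NumPy sketch under simple assumptions: synthetic data, a linear hypothesis class, squared loss, and plain gradient descent on the empirical loss.

    import numpy as np

    # 1. Collect data and extract features (synthetic data, purely illustrative)
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))                      # 100 examples, 3 features
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

    # 2. Build model: hypothesis class = linear functions, loss = squared error
    w = np.zeros(3)
    def empirical_loss(w):
        return np.mean((X @ w - y) ** 2)

    # 3. Optimization: minimize the empirical loss with gradient descent
    learning_rate = 0.1
    for _ in range(200):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)        # gradient of the empirical loss
        w -= learning_rate * grad

    print(w, empirical_loss(w))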

Features

[Figure: an image 𝑥 → extract features → color histogram 𝜙(𝑥) over red, green, blue → build hypothesis 𝑦 = 𝑤ᵀ𝜙(𝑥)]

Features: part of the model

[Figure: build hypothesis 𝑦 = 𝑤ᵀ𝜙(𝑥); with a nonlinear 𝜙, the linear model in feature space corresponds to a nonlinear model in the input space]

Example: Polynomial kernel SVM

• 𝑦 = sign(𝑤ᵀ𝜙(𝑥) + 𝑏)

• Fixed feature map 𝜙(𝑥)

[Figure: nonlinear decision boundary in the input space (𝑥1, 𝑥2)]
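For a runnable version of this example (not from the lecture), scikit-learn's SVC with a polynomial kernel implements exactly 𝑦 = sign(𝑤ᵀ𝜙(𝑥) + 𝑏) with a fixed, implicit feature map; the synthetic data below is only for illustration.

    import numpy as np
    from sklearn.svm import SVC

    # Synthetic 2D data: +1 inside a disk, -1 outside (not linearly separable in x)
    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(200, 2))
    y = np.where(X[:, 0] ** 2 + X[:, 1] ** 2 < 0.5, 1, -1)

    # Polynomial kernel = fixed feature map phi(x); the SVM learns w and b
    clf = SVC(kernel="poly", degree=2, coef0=1.0)
    clf.fit(X, y)
    print(clf.score(X, y))                             # training accuracy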

Motivation: representation learning

• Why don’t we also learn 𝜙(𝑥)?

[Figure: input 𝑥 → learn 𝜙(𝑥) → learn 𝑤 → output 𝑦 = 𝑤ᵀ𝜙(𝑥)]

Feedforward networks

• View each dimension of 𝜙(𝑥) as something to be learned

[Figure: network 𝑥 → 𝜙(𝑥) → 𝑦 = 𝑤ᵀ𝜙(𝑥)]

Feedforward networks

• Linear functions 𝜙𝑖(𝑥) = 𝜃𝑖ᵀ𝑥 don’t work: the output 𝑤ᵀ𝜙(𝑥) would still be linear in 𝑥, so we need some nonlinearity

[Figure: network 𝑥 → 𝜙(𝑥) → 𝑦 = 𝑤ᵀ𝜙(𝑥)]

Feedforward networks

• Typically, set 𝜙𝑖(𝑥) = 𝑟(𝜃𝑖ᵀ𝑥) where 𝑟(⋅) is some nonlinear function

[Figure: network 𝑥 → 𝜙(𝑥) → 𝑦 = 𝑤ᵀ𝜙(𝑥)]
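A minimal NumPy sketch of one such layer of learned features (sizes are arbitrary and a sigmoid is assumed for 𝑟; in practice 𝜃𝑖 and 𝑤 are learned rather than random):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    d, k = 5, 10                        # input dimension, number of learned features
    x = rng.normal(size=d)

    Theta = rng.normal(size=(k, d))     # row i is the pattern theta_i (to be learned)
    w = rng.normal(size=k)              # output weights (to be learned)

    phi = sigmoid(Theta @ x)            # phi_i(x) = r(theta_i^T x)
    y = w @ phi                         # y = w^T phi(x)
    print(y)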

Feedforward deep networks

• What if we go deeper?

[Figure: deep network 𝑥 → ℎ1 → ℎ2 → ⋯ → ℎ𝐿 → 𝑦. Figure from Deep Learning, by Goodfellow, Bengio, Courville; dark boxes are things to be learned.]
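Going deeper simply composes such layers. A hedged sketch of the forward pass through 𝐿 hidden layers (layer sizes, random weights, and the ReLU nonlinearity are assumptions for illustration):

    import numpy as np

    def relu(z):
        return np.maximum(z, 0.0)

    rng = np.random.default_rng(0)
    sizes = [5, 8, 8, 8, 1]             # input, L = 3 hidden layers, scalar output
    params = [(0.1 * rng.normal(size=(m, n)), np.zeros(m))
              for n, m in zip(sizes[:-1], sizes[1:])]

    def forward(x):
        h = x
        for i, (W, b) in enumerate(params):
            z = W @ h + b
            # hidden layers apply the nonlinearity; the output layer stays linear
            h = relu(z) if i < len(params) - 1 else z
        return h

    print(forward(rng.normal(size=5)))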

Motivation II: neurons

Motivation: neurons

Figure from Wikipedia

Motivation: abstract neuron model

• A neuron is activated when the correlation between the input and a pattern 𝜃 exceeds some threshold 𝑏

• 𝑦 = threshold(𝜃ᵀ𝑥 − 𝑏), or 𝑦 = 𝑟(𝜃ᵀ𝑥 − 𝑏)

• 𝑟(⋅) is called the activation function

[Figure: inputs 𝑥1, 𝑥2, …, 𝑥𝑑 feed into a single neuron with output 𝑦]
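A tiny sketch of this abstract neuron with a threshold activation; the pattern 𝜃 and the threshold 𝑏 below are made-up values chosen so that the neuron fires only when both binary inputs are 1 (a logical AND).

    import numpy as np

    def threshold(z):
        return float(z >= 0)            # fires (1) iff the correlation exceeds the threshold

    theta = np.array([1.0, 1.0])        # pattern the neuron responds to (illustrative)
    b = 1.5                             # activation threshold (illustrative)

    for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
        y = threshold(theta @ np.array(x, dtype=float) - b)
        print(x, "->", y)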

Motivation: artificial neural networks

• Put into layers: feedforward deep networks

[Figure: deep network 𝑥 → ℎ1 → ℎ2 → ⋯ → ℎ𝐿 → 𝑦]

Components in Feedforward networks

Components

• Representations:
  • Input
  • Hidden variables

• Layers/weights:
  • Hidden layers
  • Output layer

Components

[Figure: input 𝑥, hidden variables ℎ1, ℎ2, …, ℎ𝐿, output 𝑦; the first layer maps 𝑥 to ℎ1 and the output layer maps ℎ𝐿 to 𝑦]

Input

• Represented as a vector

• Sometimes requires some preprocessing, e.g.,
  • Subtract the mean
  • Normalize to [−1, 1]

[Figure: an image is expanded into an input vector]
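A minimal sketch of these input steps (the image size and value range are assumptions): flatten the image into a vector, subtract the mean, and scale into [−1, 1].

    import numpy as np

    # Assume an 8-bit grayscale image; "expand" it into a vector
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(28, 28))
    x = image.reshape(-1).astype(float)

    x = x - x.mean()                    # subtract the mean
    x = x / np.abs(x).max()             # normalize into [-1, 1]
    print(x.shape, x.min(), x.max())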

Output layers

• Regression: 𝑦 = 𝑤ᵀℎ + 𝑏

• Linear units: no nonlinearity

[Figure: output layer producing 𝑦]

Output layers

• Multi-dimensional regression: 𝑦 = 𝑊ᵀℎ + 𝑏

• Linear units: no nonlinearity

[Figure: output layer producing the vector 𝑦]

Output layers

• Binary classification: 𝑦 = 𝜎(𝑤ᵀℎ + 𝑏)

• Corresponds to using logistic regression on ℎ

[Figure: output layer producing 𝑦]
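A short sketch of this output layer on top of a hidden representation ℎ (sizes and weights are placeholders): the sigmoid turns the score 𝑤ᵀℎ + 𝑏 into a probability, exactly as in logistic regression.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    h = rng.normal(size=10)             # last hidden representation
    w = rng.normal(size=10)             # output weights (to be learned)
    b = 0.0

    p = sigmoid(w @ h + b)              # probability of the positive class
    print(p, int(p >= 0.5))             # probability and the predicted label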

Output layers

• Multi-class classification: 𝑦 = softmax(𝑧) where 𝑧 = 𝑊ᵀℎ + 𝑏

• Corresponds to using multi-class logistic regression on ℎ

[Figure: output layer producing logits 𝑧 and probabilities 𝑦]
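A short sketch of the multi-class output layer (sizes are placeholders); the softmax below subtracts the maximum logit for numerical stability, which leaves the result unchanged.

    import numpy as np

    def softmax(z):
        z = z - z.max()                 # softmax is shift-invariant; avoids overflow
        e = np.exp(z)
        return e / e.sum()

    rng = np.random.default_rng(0)
    h = rng.normal(size=10)             # last hidden representation
    W = rng.normal(size=(10, 4))        # weights for 4 classes (to be learned)
    b = np.zeros(4)

    z = W.T @ h + b                     # logits
    y = softmax(z)                      # class probabilities
    print(y, y.sum())                   # probabilities sum to 1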

Hidden layers

• Each neuron takes a weighted linear combination of the previous layer

• So it can be thought of as outputting one value for the next layer

[Figure: neurons connecting layer ℎ𝑖 to layer ℎ𝑖+1]

Hidden layers

• 𝑦 = 𝑟(𝑤ᵀ𝑥 + 𝑏)

• Typical activation functions 𝑟:
  • Threshold: t(𝑧) = 𝕀[𝑧 ≥ 0]
  • Sigmoid: 𝜎(𝑧) = 1/(1 + exp(−𝑧))
  • Tanh: tanh(𝑧) = 2𝜎(2𝑧) − 1

[Figure: a unit computing 𝑦 = 𝑟(𝑤ᵀ𝑥 + 𝑏) from input 𝑥]
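The three activations above written out directly (a small sketch; the last line checks the identity tanh(𝑧) = 2𝜎(2𝑧) − 1 numerically):

    import numpy as np

    def threshold(z):
        return (z >= 0).astype(float)               # t(z) = 1[z >= 0]

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))             # sigma(z) = 1 / (1 + exp(-z))

    z = np.linspace(-3, 3, 7)
    print(threshold(z))
    print(sigmoid(z))
    print(np.allclose(np.tanh(z), 2 * sigmoid(2 * z) - 1))   # tanh(z) = 2*sigma(2z) - 1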

Hidden layers

• Problem: saturation, where the activation flattens out and the gradient becomes too small

[Figure: sigmoid-shaped activation 𝑟(⋅) saturating for large |𝑧|. Figure borrowed from Pattern Recognition and Machine Learning, Bishop]
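The problem can be seen directly from the sigmoid's derivative, 𝜎′(𝑧) = 𝜎(𝑧)(1 − 𝜎(𝑧)), which shrinks rapidly as |𝑧| grows; a short numerical sketch:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def sigmoid_grad(z):
        s = sigmoid(z)
        return s * (1.0 - s)            # derivative of the sigmoid

    for z in [0.0, 2.0, 5.0, 10.0]:
        print(z, sigmoid_grad(z))       # gradient is tiny once the unit saturates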

Hidden layers

• Activation function ReLU (rectified linear unit): ReLU(𝑧) = max{𝑧, 0}

[Figure: ReLU activation. Figure from Deep Learning, by Goodfellow, Bengio, Courville]

Hidden layers

• Activation function ReLU (rectified linear unit): ReLU(𝑧) = max{𝑧, 0}

[Figure: ReLU has gradient 0 for 𝑧 < 0 and gradient 1 for 𝑧 > 0]
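ReLU and its gradient in code (a sketch; the gradient at exactly 𝑧 = 0 is set to 0 here by convention, since ReLU is not differentiable at 0):

    import numpy as np

    def relu(z):
        return np.maximum(z, 0.0)       # ReLU(z) = max{z, 0}

    def relu_grad(z):
        return (z > 0).astype(float)    # 0 for z < 0, 1 for z > 0

    z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
    print(relu(z))
    print(relu_grad(z))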

Hidden layers

• Generalizations of ReLU: gReLU(𝑧) = max{𝑧, 0} + 𝛼 min{𝑧, 0}
  • Leaky-ReLU(𝑧) = max{𝑧, 0} + 0.01 min{𝑧, 0}
  • Parametric-ReLU(𝑧): 𝛼 learnable

[Figure: graph of gReLU(𝑧) against 𝑧]
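The generalization can be written as a single function of 𝛼 (a small sketch): 𝛼 = 0 recovers ReLU, 𝛼 = 0.01 gives Leaky-ReLU, and treating 𝛼 as a learnable parameter gives Parametric-ReLU.

    import numpy as np

    def g_relu(z, alpha):
        # gReLU(z) = max{z, 0} + alpha * min{z, 0}
        return np.maximum(z, 0.0) + alpha * np.minimum(z, 0.0)

    z = np.array([-2.0, -0.5, 0.5, 2.0])
    print(g_relu(z, alpha=0.0))         # ReLU
    print(g_relu(z, alpha=0.01))        # Leaky-ReLU
    print(g_relu(z, alpha=0.25))        # Parametric-ReLU with one particular alpha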
