computer visioncvml.ajou.ac.kr/wiki/images/3/39/강의2_cv_and_nn.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4...

117
COMPUTER VISION

Upload: others

Post on 13-Jun-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

COMPUTER VISION

Page 2: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Tasks

ADAS

Self Driving

Localizati

onPerception

Planning/

Control

Driver

state

Vehicle

Diagnosis

Smart

factory

Meth

od

s

Trad

ition

al

Non-machine LearningGPS,

SLAM

Optimal

control

Mach

ine-L

earnin

gb

ased m

etho

d

Su

perv

ised

SVM

MLP

Pedestrian

detection(HOG+SVM)

Deep

-Learn

ing

based

CNN

Detection/

Segmentat

ion/Classif

ication

End-to-

end

Learning

RNN

(LSTM)

Dry/wet

road

classificati

on

End-to-

end

Learning

Behavior

Prediction/

Driver

identificati

on

*

DNN * *

Reinforcement *

Unsupervised *

Page 3: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Vision tasks

Object

recognition

Object

detection

Semantic

segmentation

Object

tracking

Visual

SLAM

Page 4: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Semantic segmentation

• Building/road/sky/object/grass/water/tree

Clement Farabet, Camille Couprie, Laurent Najman and Yann LeCun: Learning Hierarchical Features for Scene Labeling, IEEE Transactions

on Pattern Analysis and Machine Intelligence, August, 2013

Page 5: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Object tracking

Yi Wu, Jongwoo Lim, and Ming-Hsuan Yang, "Object Tracking Benchmark",

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015

Page 6: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Visual SLAM

Page 7: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Tasks

ADAS

Self Driving

Localizati

onPerception

Planning/

Control

Driver

state

Vehicle

Diagnosis

Smart

factory

Meth

od

s

Trad

ition

al

Non-machine LearningGPS,

SLAM

Optimal

control

Mach

ine-L

earnin

gb

ased m

etho

d

Su

perv

ised

SVM

MLP

Pedestrian

detection(HOG+SVM)

Deep

-Learn

ing

based

CNN

Detection/

Segmentat

ion/Classif

ication

End-to-

end

Learning

RNN

(LSTM)

Dry/wet

road

classificati

on

End-to-

end

Learning

Behavior

Prediction/

Driver

identificati

on

*

DNN * *

Reinforcement *

Unsupervised *

Page 8: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

ORB-SLAM in the KITTI dataset

• ORB-SLAM2 is a real-time SLAM library

for Monocular, Stereo and RGB-D cameras that computes the

camera trajectory and a sparse 3D reconstruction

Page 9: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

COMPUTER VISION

IMAGE UNDERSTANDING …

Page 10: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Why understanding images is hard

Image

Very many

sources of

variability

From J. Winn, MSR

Page 11: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Sources of image variability

Scene type

Scene geometry

Street scene

From J. Winn, MSR

Page 12: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Sources of image variability

Scene type

Scene geometry

Object classes

Street scene

Sky

Building×3

Road

Sidewalk

Tree×3

Person×4

Bicycle

Car×5

Bench

Bollard

From J. Winn, MSR

Page 13: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Sources of image variability

Street scene

Sky

Building×3

Road

Sidewalk

Tree×3

Person×4

Bicycle

Car×5

Bench

Bollard

Scene type

Scene geometry

Object classes

Object position

Object orientation

From J. Winn, MSR

Page 14: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Sources of image variability

Scene type

Scene geometry

Object classes

Object position

Object orientation

Scene type

Scene geometry

Object classes

Object position

Object orientation

Object shape

Street scene

From J. Winn, MSR

Page 15: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Scene type

Scene geometry

Object classes

Object position

Object orientation

Object shape

Sources of image variability

Scene type

Scene geometry

Object classes

Object position

Object orientation

Object shape

Depth/occlusions

From J. Winn, MSR

Page 16: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Sources of image variability

Scene type

Scene geometry

Object classes

Object position

Object orientation

Object shape

Depth/occlusions

Scene type

Scene geometry

Object classes

Object position

Object orientation

Object shape

Depth/occlusions

Object appearance

From J. Winn, MSR

Page 17: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Sources of image variability

Scene type

Scene geometry

Object classes

Object position

Object orientation

Object shape

Depth/occlusions

Object appearance

Scene type

Scene geometry

Object classes

Object position

Object orientation

Object shape

Depth/occlusions

Object appearance

Illumination

Shadows

From J. Winn, MSR

Page 18: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Sources of image variability

Scene type

Scene geometry

Object classes

Object position

Object orientation

Object shape

Depth/occlusions

Object appearance

Illumination

Shadows

From J. Winn, MSR

Page 19: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Sources of image variability

Scene type

Scene geometry

Object classes

Object position

Object orientation

Object shape

Depth/occlusions

Object appearance

Illumination

Shadows

Motion blur

Camera effects

From J. Winn, MSR

Page 20: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Computer vision problems

Scene type

Scene geometry

Object classes

Object position

Object orientation

Object shape

Depth/occlusions

Object appearance

Illumination

Shadows

Motion blur

Camera effects

Page 21: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Now you see me

Page 22: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function
Page 23: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function
Page 24: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function
Page 25: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function
Page 26: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function
Page 27: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function
Page 28: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function
Page 29: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function
Page 30: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function
Page 31: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Moravec’s Paradox

• The main lesson of 35 years of AI research is that the

hard problems are easy and the easy problems are

hard. The mental abilities of a four-year-old that we take

for granted – recognizing a face, lifting a pencil, walking

across a room, answering a question – in fact solve some

of the hardest engineering problems ever conceived... As

the new generation of intelligent devices appears, it will be

the stock analysts and petrochemical engineers and parole

board members who are in danger of being replaced by

machines. The gardeners, receptionists, and cooks are

secure in their jobs for decades to come.

• Pinker, Steven (September 4, 2007) [1994], The Language Instinct,

Perennial Modern Classics, Harper, ISBN 0-06-133646-7

Page 32: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

MACHINE LEARNINGNeural network을중심으로

Page 33: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Tasks

ADAS

Self Driving

Localizati

onPerception

Planning/

Control

Driver

state

Vehicle

Diagnosis

Smart

factory

Meth

od

s

Trad

ition

al

Non-machine LearningGPS,

SLAM

Optimal

control

Mach

ine-L

earnin

gb

ased m

etho

d

Su

perv

ised

MLPPedestrian

detection(HOG+SVM)

Deep

-Learn

ing

based

CNN

Detection/

Segmentat

ion/Classif

ication

End-to-

end

Learning

RNN

(LSTM)

Dry/wet

road

classificati

on

End-to-

end

Learning

DNN

Reinforcement

Unsupervised

Page 34: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

LINEAR PERCEPTRON

Page 35: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

뉴런: 신경망의기본단위

Page 36: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Basic model

• The first learning machine: the Perceptron (built in 1960)

• The perceptron was a linear classifier

𝑦 = sign(wTx + b)

𝑦 = +1 if 𝑤1𝑥1 + 𝑤2𝑥2 +⋯+𝑤𝑛𝑥𝑛 + 𝑏 > 0−1 otherwise

Page 37: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Linear Perceptron

• The goal: Find the best line (or hyper-plane) to separate the training data.

• In two dimensions, the equation of the line is given by a line:

• 𝑎𝑥 + 𝑏𝑦 + 𝑐 = 0

• A better notation for n dimensions: treat each data point and the coefficients as vectors. Then the equation is given by:

• w⊤x + b = 0

Class (+1)Class (-1)

Page 38: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

예시: 연어와농어의구별

Page 39: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

예시: 연어와농어의구별

밝기 (𝑙)

폭 (𝑤) 7.3𝑙 + 3.4𝑤 = 100

𝑙

𝑤

7.3𝑙 + 3.4𝑤 ≥ 100

7.3𝑙 + 3.4𝑤 < 100

농어

연어

Page 40: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

4 2 2 4

0.2

0.4

0.6

0.8

1.0

예시: 연어와농어의구별

𝑙

𝑤

7.3 × 𝑙

3.4 × 𝑤

Σ

100

연어/농어

Page 41: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Artificial Neuron

𝑤𝑇𝑥 𝑔(𝑤𝑇𝑥)

Activation function (non-linear)

Page 42: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function
Page 43: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Artificial Neuron

• However, it cannot solve non-linearly-separable problems

Page 44: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

MULTI-LAYER PERCEPTRON

Page 45: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function
Page 46: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Multi-layer Neural Network

• 1st Layer

• h1 = 𝑔(𝑊1x + 𝑏1)

• 2nd Layer

• h2 = 𝑔(𝑊2h1 + 𝑏2)

• …..

• Output layer

• o = softmax(Wnhn−1 + bn)

=

hk

hk−1

𝑊𝑘

Page 47: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

4 2 2 4

1.0

0.5

0.5

1.0

4 2 2 4

0.2

0.4

0.6

0.8

1.0

Activation function 𝑔(⋅)

• Sigmoid activation function

• Squashes the neuron’s pre-activation between 0 and 1

• Always positive/Bounded/Strictly increasing

• Hyperbolic tangent (‘‘tanh’’) activation function

• Squashes the neuron’s pre-activation between -1 and 1

• Bounded/Strictly increasing

𝑔 𝑥 =1

1 + exp(−𝑥)

𝑔 𝑥 = tanh 𝑥 =exp 𝑥 − exp(−𝑥)

exp(𝑥) + exp(−𝑥)

Page 48: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Activation function 𝑔(⋅)

• Rectified linear activation function (ReLU)

• Bounded below by 0

• Not upper bounded

• Strictly increasing

𝑔 𝑎 = rectlin 𝑎 = max 0, 𝑎

Page 49: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Soft-max activation function at the output

• For multi-class classification

• We need multiple outputs (1 output per class)

• We use the softmax activation function at the output

• strictly positive

• sums to one

𝑂 𝐚 = softmax 𝐚 =

exp 𝑎1 𝑐 exp 𝑎𝑐exp 𝑎2

𝑐 exp 𝑎𝑐⋮⋮

exp 𝑎𝑐 𝑐 exp 𝑎𝑐

예시: 𝑎, 𝑏, 𝑐 →𝑒𝑎

𝑒𝑎+𝑒𝑏+𝑒𝑐,

𝑒𝑏

𝑒𝑎+𝑒𝑏+𝑒𝑐,

𝑒𝑐

𝑒𝑎+𝑒𝑏+𝑒𝑐

Page 50: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Example (character recognition example)

𝑥 ∈ 0,1 10×14

𝑝(𝑐 = "0"|𝑥)

𝑝(𝑐 = "1"|𝑥)

𝑝(𝑐 = "9"|𝑥)

140 inputs

Layer 1

with 12 perceptrons

Layer 2

with 10 perceptrons

Each having 12 inputs𝑊1 ∈ ℜ140×12

𝑊2 ∈ ℜ12×10

softm

ax

𝑏1 ∈ ℜ12

𝑏2 ∈ ℜ10

Page 51: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

TRAINING OF MULTI-LAYER

PERCEPTRON

Page 52: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Training: Loss function

• Cross entropy (classification)

• 𝑦, 𝑦 ∈ 0,1 𝑁, 𝑖=1𝑦𝑖 = 1, 𝑖=1 𝑦𝑖 = 1

• 𝐿 = − 𝑦𝑖log 𝑦𝑖

• Square Euclidean distance (regression)

• 𝑦, 𝑦 ∈ ℜ𝑁

• 𝐿 =1

2 𝑦𝑖 − 𝑦𝑖

2

Error

Page 53: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Cross Entropy (예시)

• Label:

• [𝑦1 𝑦2 𝑦3] = [1,0,0] : class 1

• [𝑦1 𝑦2 𝑦3] = [0,1,0] : class 2

• [𝑦1 𝑦2 𝑦3] = [0,0,1] : class 3

• 예시• Network output: 𝑦 = [ 𝑦1 𝑦2 𝑦3 ] = [0.3, 0.6, 0.1]

• Loss

• If ground truth is class 1 (i.e., y = [1,0,0]) → −log 0.3 = 1.204

• If ground truth is class 2 (i.e., y = [0,1,0]) → −log 0.6 = 0.511

• If ground truth is class 3 (i.e., y = [0,0,1]) → −log 0.1 = 2.303

• Network output: 𝑦 = [ 𝑦1 𝑦2 𝑦3 ]= [0.01, 0.98, 0.01]

• Loss

• If ground truth is class 1 (i.e., y = [1,0,0]) → −log 0.01 = 4.605

• If ground truth is class 2 (i.e., y = [0,1,0]) → −log 0.98 = 0.020

• If ground truth is class 3 (i.e., y = [0,0,1]) → −log 0.01 = 4.605

𝐿 = − 𝑦𝑖log 𝑦𝑖

Page 54: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Forward/Backward propagation

• Chain rule

𝑊𝑛𝑒𝑤 = 𝑊𝑜𝑙𝑑 − 𝜂𝑑𝐿

𝑑𝑊

Page 55: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function
Page 56: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function
Page 57: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Forward/Backward propagation

Forward propagation

Backward propagation

𝑊

y

𝜕𝐿

𝜕𝑦

𝜕𝐿

𝜕x

𝜕𝐿

𝜕𝑊=𝜕𝐿

𝜕y

𝜕y

𝜕𝑊

y = 𝑔(𝑊x + 𝑏)

𝜕y

𝜕x

𝜕y

𝜕𝑊

𝜕𝐿

𝜕x=

𝜕𝐿

𝜕y⋅𝜕y

𝜕x

x

Page 58: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

FEED-FORWARD NEURAL

NETWORK (예시)

Page 59: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Forward propagation

The 1st hidden layer The 2nd hidden layer

Soft

Max

Lay

er 1

Lay

er 2

Output

Page 60: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Forward propagation matrix repr.

The 1st hidden layer The 2nd hidden layer

Soft

Max

Lay

er 1

Lay

er 2

Output

Page 61: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

BACK-PROPAGATION

ALGORITHM (예시)

Page 62: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Forward propagation

(block-based representation)

Layer 1

Layer 1

Layer 2

Layer 2

Soft

Max

Output

Page 63: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Backward propagation; 2nd layer

Ground

Truth

VS

Output

Soft

Max

Layer 1 Layer 2

• Error propagation

Page 64: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Backward propagation; 2nd layer

• Weight update• Error propagation

Ground

Truth

VS

Output

Soft

Max

Page 65: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Backward propagation; 1st layer

• Error propagation

Layer 1

Ground

Truth

VS

Output

Soft

Max

Layer 2

Page 66: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Backward propagation; 1st layer

• Error propagationLayer 1 Layer 2

• Weight update

Ground

Truth

VS

Output

Soft

Max

Page 67: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

TENSORFLOW실습

Page 68: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

TENSORFLOW INTRODUCTION

Page 69: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

What is TensorFlow?

• TensorFlow is a deep learning library open-sourced by Google.

• TensorFlow provides primitives for defining functions on

tensors and automatically computing their derivatives.

• Tensor is a multidimensional array of numbers

Page 70: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Design Choice

• Network structures

• The mathematical relationship between inputs and outputs

• Loss function

• Optimization

• Optimization methods

• Hyper-parameters (Batch size, Learning rate, … )

Page 71: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Classification vs RegressionClassification Regression

?Expected

Price

1600 sq ft

Color

Weight

Apple

Orange

?

The variable we are trying to predict is

DISCRETE

The variable we are trying to predict is

CONTINUOUS

Housing

Prices

Page 72: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

MNIST dataset (classification example)

• handwritten digits

• a training set of 60,000 examples

• 28x28 images

Page 73: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function
Page 74: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function
Page 75: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Classification Example Code

• Classification Example

𝑊 ∈ ℜ10×784=

Page 76: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Classification Example Code

=

0

0

1

2

3

4

5

6

7

8

9

1

784 = 282

28

28

Page 77: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Regression Example Code

• Regression Example

Page 78: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

VALIDATION

Page 79: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Tasks

ADAS

Self Driving

Localizati

onPerception

Planning/

Control

Driver

state

Vehicle

Diagnosis

Smart

factory

Meth

od

s

Trad

ition

al

Non-machine LearningGPS,

SLAM

Optimal

control

Mach

ine-L

earnin

gb

ased m

etho

d

Su

perv

ised

MLPPedestrian

detection(HOG+SVM)

Deep

-Learn

ing

based

CNN

Detection/

Segmentat

ion/Classif

ication

End-to-

end

Learning

RNN

(LSTM)

Dry/wet

road

classificati

on

End-to-

end

Learning

DNN

Reinforcement

Unsupervised

Page 80: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Validation set approach

• Divide the data in three parts:

• training, validation (development), and test. We use the train and

validation data to select the best model and the test data to assess the

chosen model.

WorldWorld

Samples trainingvalid

ation test

Page 81: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Validation set approach

• Training set

• To fit the models

• Validation set

• To estimate prediction error for model selection

• Test set

• To assess of the generalization error of the final chose model

Train Validation Test

Page 82: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

k-fold cross validation

• We partition the data into 𝐾 parts. For the 𝑘 −th part, we fit the

model to the other 𝐾 − 1 parts of the data, and calculate the

prediction error of the fitted model when predicting the 𝑘th part

of the data. We do this for 𝑘 = 1,2,⋯ , 𝐾 and combine the 𝐾estimates

𝐾 = 7

Page 83: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

k-Fold cross validation

WorldWorld

Samples

Page 84: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Leave-one-out cross validation

Page 85: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Development Cycle

Page 86: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

전통적인접근법

Page 87: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Tasks

ADAS

Self Driving

Localizati

onPerception

Planning/

Control

Driver

state

Vehicle

Diagnosis

Smart

factory

Meth

od

s

Trad

ition

al

Non-machine LearningGPS,

SLAM

Optimal

control

Mach

ine-L

earnin

gb

ased m

etho

d

Su

perv

ised

MLPPedestrian

detection(HOG+SVM)

Deep

-Learn

ing

based

CNN

Detection/

Segmentat

ion/Classif

ication

End-to-

end

Learning

RNN

(LSTM)

Dry/wet

road

classificati

on

End-to-

end

Learning

DNN

Reinforcement

Unsupervised

Page 88: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Conventional approach

• Image classification

“Motocycle”

Slides from “Andrew Ng”

Page 89: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Why is this hard?

Slides from “Andrew Ng”

Page 90: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Feature representation

Slides from “Andrew Ng”

Classification

Algorithm

Page 91: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Feature representation

Classification

Algorithm

Slides from “Andrew Ng”

Page 92: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Example of Feature Representation

• But, … we don’t have a handlebars detector. So, researchers try to hand-

design features to capture various statistical properties of the image

Slides from “Andrew Ng”

Page 93: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Feature representation

Slides from “Andrew Ng”

Classification

Algorithm

Page 94: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Computer vision features

Slides from “Andrew Ng”

Page 95: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Audio features

Page 96: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Traditional pattern recognition

• Fixed/engineered feature + trainable classifier

Page 97: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

CASE STUDY:

PEDESTRIAN DETECTOR

Page 98: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Tasks

ADAS

Self Driving

Localizati

onPerception

Planning/

Control

Driver

state

Vehicle

Diagnosis

Smart

factory

Meth

od

s

Trad

ition

al

Non-machine LearningGPS,

SLAM

Optimal

control

Mach

ine-L

earnin

gb

ased m

etho

d

Su

perv

ised

MLPPedestrian

detection(HOG+SVM)

Deep

-Learn

ing

based

CNN

Detection/

Segmentat

ion/Classif

ication

End-to-

end

Learning

RNN

(LSTM)

Dry/wet

road

classificati

on

End-to-

end

Learning

DNN

Reinforcement

Unsupervised

Page 99: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Detection problem (binary) classification problem

• Sliding window scheme

Page 100: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Each window is separately classified

Page 101: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

• 64x128 images of humans cropped from a varied set of personal

photos• Positive data – 1239 positive window examples (reflections->2478)

• Negative data – 1218 person-free training photos (12180 patches)

Training data

Page 102: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

• A preliminary detector

• Trained with (2478) vs (12180) samples

• Retraining

• With augmented data set

• initial 12180 + hard examples

• Hard examples

• 1218 negative training photos are searched exhaustively for false positive

Training

Page 103: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Feature: histogram of oriented gradients (HOG)

Page 104: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function
Page 105: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Averaged examples

Page 106: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Algorithm

Page 107: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Dalal, Navneet, and Bill Triggs. "Histograms of oriented gradients for human detection." 2005 IEEE Computer Society Conference on

Computer Vision and Pattern Recognition (CVPR'05). Vol. 1. IEEE, 2005.

Page 108: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Learned model

Page 109: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

BACKUPS

Page 110: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

http://playground.tensorflow.org/

Page 111: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Visualization MNIST with t-SNE

Page 112: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Why are Deep Architectures hard to train?

• Vanishing gradient problem in back-

propagation.

• Local Optimum (saddle points?) Issue in

Neural Nets

• For Deep Architectures, back-propagation is

apparently getting a local optimum (saddle

points?) that does not generalize well

Page 113: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Empirical Results: Poor performance of Backpropagation on

Deep Neural Nets [Erhan et al., 2009]

• MNIST digit classification task; 400 trials (random seed)

• Each layer: initialize weights with random numbers

• Although 𝐿 + 1 layers is more expressive, worse error than 𝐿layers.

Page 114: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function
Page 115: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

115

An example

Page 116: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Feature extraction

Width feature Lightness feature

116

Page 117: COMPUTER VISIONcvml.ajou.ac.kr/wiki/images/3/39/강의2_CV_and_NN.pdf4 2 2 4 1.0 0.5 0.5 1.0 4 2 2 4 0.2 0.4 0.6 0.8 1.0 Activation function 𝑔(⋅) •Sigmoid activation function

Classification

Simple model Complex model

117