![Page 1: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/1.jpg)
Short Trip In The Valley of Deep Learning
Pantelis Vlachas, Guido Novati
Computational Science and Engineering Lab ETH Zürich
![Page 2: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/2.jpg)
Motivation - What is machine learning?
Classical Machine Learning: data → feature extraction → regression/classification/etc. → result
Deep Learning: data → feature extraction + regression/classification/etc. → result
![Page 3: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/3.jpg)
Deep Learning
Sophisticated Architectures
• LeNet of Yann LeCun et al., 1998
• LSTM, 1997
• GRU, 2014
Algorithms
• Backpropagation
• Backpropagation through time (BPTT)
• Variational Inference (Bayesian)
• GEMM (General Matrix to Matrix Multiplication)
Hardware
• Graphics Processing Units (GPUs)
![Page 4: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/4.jpg)
Convolutional Neural Networks
• Heavily based on GEMM (General Matrix to Matrix Multiplication)
• Parametric models suited for image processing (classification, object detection, etc.)
• Applications in self-driving cars, robotics, healthcare, physics, image and video recognition, recommender systems, image classification, medical image analysis, natural language processing, financial time series, etc.
LeNet of Yann LeCun et al., 1998
![Page 5: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/5.jpg)
Biological Intuition
• Very roughly, biological brains have neurons that activate when they recognize a triggering pattern in their input
• Each unit does “simple” pattern recognition
• Complexity emerges from sheer numbers
![Page 6: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/6.jpg)
Convolutional Model of a part of a Fruit-Fly’s brain
Jonathan Schneider et al., 2018
![Page 7: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/7.jpg)
Convolutional Neural Network
![Page 8: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/8.jpg)
What is a Convolution ?
• Input is a matrix: $d_{IY} \times d_{IX} \times d_{IC}$
• Parameters are a tensor: $d_{KY} \times d_{KX} \times d_{IC} \times d_{KC}$ (we have $d_{KC}$ filters: $KC = 1, 2, 3, 4, …$)
• Output is a matrix: $d_{OY} \times d_{OX} \times d_{KC}$
• Mapping an image to another image
• Feature sizes $d_{IC}$, $d_{KC}$ can be any numbers
• Parameters are called “filters” or “kernels”
![Page 9: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/9.jpg)
Convolution Operation (filtering, sliding)
• Kernels $d_{KY} \times d_{KX} \times d_{IC} \times d_{KC}$
• Sliding a kernel along the spatial dimensions $IX$ and $IY$ (iterating along $IX$ and $IY$)
• At each position and for each filter in the $KC$ dimension, we compute the scalar product between the filter (of size $d_{KY} \times d_{KX} \times d_{IC}$) and a “patch” of the image of size $d_{KY} \times d_{KX} \times d_{IC}$
• The output of the scalar product is a number which is written in a single color pixel (channel) of the output image
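The sliding procedure on this slide can be sketched in a few lines of NumPy. This is a minimal, unoptimized illustration (no padding, stride 1), not the GEMM-based implementation that libraries use; the function name `conv2d_naive` and the random test shapes are assumptions made for the example.

```python
import numpy as np

def conv2d_naive(image, kernels):
    """Slide every filter over the image and take scalar products.

    image:   shape (dIY, dIX, dIC)
    kernels: shape (dKY, dKX, dIC, dKC)
    returns: shape (dIY - dKY + 1, dIX - dKX + 1, dKC)
    """
    d_iy, d_ix, d_ic = image.shape
    d_ky, d_kx, d_kic, d_kc = kernels.shape
    assert d_ic == d_kic, "kernel depth must match the input channels"
    d_oy, d_ox = d_iy - d_ky + 1, d_ix - d_kx + 1
    out = np.zeros((d_oy, d_ox, d_kc))
    for oy in range(d_oy):                 # iterate along IY
        for ox in range(d_ox):             # iterate along IX
            patch = image[oy:oy + d_ky, ox:ox + d_kx, :]
            for kc in range(d_kc):         # one output channel per filter
                out[oy, ox, kc] = np.sum(patch * kernels[:, :, :, kc])
    return out

# Hypothetical sizes, chosen only for illustration.
rng = np.random.default_rng(0)
image = rng.normal(size=(6, 5, 3))
kernels = rng.normal(size=(3, 3, 3, 4))
out = conv2d_naive(image, kernels)
print(out.shape)  # (4, 3, 4)
```

Each output pixel is one scalar product between a filter and a patch, so the three nested loops map directly onto the three bullets above.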
![Page 10: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/10.jpg)
1D Convolution, 1D Filter
Input: 1 -1 2 0 1
Filter: 1 0 -1
Position 1: 1·1 + (-1)·0 + 2·(-1) = -1
Output so far: -1
![Page 11: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/11.jpg)
1D Convolution, 1D Filter
Input: 1 -1 2 0 1
Filter: 1 0 -1
Position 2: (-1)·1 + 2·0 + 0·(-1) = -1
Output so far: -1 -1
![Page 12: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/12.jpg)
1D Convolution, 1D Filter
Input: 1 -1 2 0 1
Filter: 1 0 -1
Position 3: 2·1 + 0·0 + 1·(-1) = 1
Output: -1 -1 1
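The 1D example from these slides (input 1, -1, 2, 0, 1 and filter 1, 0, -1) can be reproduced with a short NumPy sketch. The helper name `conv1d` is an assumption; note that, as on the slides, the filter is not flipped, so this matches `np.correlate` rather than `np.convolve`.

```python
import numpy as np

def conv1d(x, k):
    """Slide filter k over x (no padding, stride 1) and take scalar products."""
    x, k = np.asarray(x), np.asarray(k)
    n_out = len(x) - len(k) + 1
    return np.array([np.dot(x[i:i + len(k)], k) for i in range(n_out)])

result = conv1d([1, -1, 2, 0, 1], [1, 0, -1])
print(result.tolist())  # [-1, -1, 1]
```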
![Page 13: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/13.jpg)
Padding
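• What if we want to keep the output equal to the input in the spatial dimensions ($d_{IX} = d_{OX}$, $d_{IY} = d_{OY}$)?
• Size of the image is extended in both directions by $d_{PY}$ and $d_{PX}$
• Usually zero padding
Without padding: input 1 -1 2 0 1, filter 1 0 -1, output -1 -1 1
With zero padding: input 0 1 -1 2 0 1 0, filter 1 0 -1, first output 0·1 + 1·0 + (-1)·(-1) = 1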
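A minimal sketch of zero padding on the same 1D example (input 1, -1, 2, 0, 1 and filter 1, 0, -1), assuming one zero is added on each side. The helper name `conv1d_padded` is hypothetical; `np.pad` with its default constant mode supplies the zeros.

```python
import numpy as np

def conv1d_padded(x, k, pad):
    """Zero-pad x by `pad` entries on both sides, then slide k with stride 1."""
    x_padded = np.pad(np.asarray(x), pad)   # default mode pads with zeros
    k = np.asarray(k)
    n_out = len(x_padded) - len(k) + 1
    return np.array([np.dot(x_padded[i:i + len(k)], k) for i in range(n_out)])

x = [1, -1, 2, 0, 1]
result = conv1d_padded(x, [1, 0, -1], pad=1)
print(result.tolist())  # [1, -1, -1, 1, 0]
```

With a filter of length 3 and padding 1, the output has the same length as the input, which is the goal stated on the slide.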
![Page 14: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/14.jpg)
Stride
• Convolution does not have to be computed by increments of 1 pixel
• Stride (skip, stepping) $d_{SY}$ and $d_{SX}$
• Here padding 1, stride $d_S = 2$
Padded input: 0 1 -1 2 0 1 0
Filter: 1 0 -1
Output (filter moved by 2 entries per step): 1 -1 0
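The stride example (padding 1, stride 2, input 1, -1, 2, 0, 1, filter 1, 0, -1) can be checked with the sketch below. The function name `conv1d_strided` and its keyword arguments are assumptions made for this illustration.

```python
import numpy as np

def conv1d_strided(x, k, pad=0, stride=1):
    """Zero-pad x, then evaluate the filter only every `stride` positions."""
    x_padded = np.pad(np.asarray(x), pad)
    k = np.asarray(k)
    starts = range(0, len(x_padded) - len(k) + 1, stride)
    return np.array([np.dot(x_padded[s:s + len(k)], k) for s in starts])

result = conv1d_strided([1, -1, 2, 0, 1], [1, 0, -1], pad=1, stride=2)
print(result.tolist())  # [1, -1, 0]
```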
![Page 15: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/15.jpg)
2-D Convolutions
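• If the input is $d_{IY} \times d_{IX} \times d_{IC}$
• The filters (kernels) are $d_{KY} \times d_{KX} \times d_{IC} \times d_{KC}$
• With strides $d_{SY}$, $d_{SX}$
• Padding $d_{PY}$, $d_{PX}$
• The output image has size:
$$d_{OY} = \frac{d_{IY} - d_{KY} + 2 d_{PY}}{d_{SY}} + 1, \qquad d_{OX} = \frac{d_{IX} - d_{KX} + 2 d_{PX}}{d_{SX}} + 1, \qquad d_{OC} = d_{KC}$$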
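The output-size formula can be wrapped in a small helper. This is a sketch under one assumption that the slide does not state: when the division by the stride is not exact, the result is rounded down (integer division), which is the usual convention. The function names and the example shapes are hypothetical.

```python
def conv_output_size(d_i, d_k, d_p=0, d_s=1):
    """Output length along one spatial dimension: (d_i - d_k + 2*d_p) / d_s + 1."""
    return (d_i - d_k + 2 * d_p) // d_s + 1   # floor division: an assumption

def conv2d_output_shape(input_shape, kernel_shape, stride=(1, 1), padding=(0, 0)):
    """Shapes follow the slide: input (dIY, dIX, dIC), kernels (dKY, dKX, dIC, dKC)."""
    d_iy, d_ix, d_ic = input_shape
    d_ky, d_kx, d_kic, d_kc = kernel_shape
    assert d_ic == d_kic, "kernel depth must match the input channels"
    d_oy = conv_output_size(d_iy, d_ky, padding[0], stride[0])
    d_ox = conv_output_size(d_ix, d_kx, padding[1], stride[1])
    return (d_oy, d_ox, d_kc)

print(conv_output_size(5, 3, d_p=1, d_s=2))             # 3
print(conv2d_output_shape((32, 32, 1), (5, 5, 1, 6)))   # (28, 28, 6)
```

The first call reproduces the 1D stride example (5 inputs, filter of length 3, padding 1, stride 2 gives 3 outputs).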
![Page 16: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/16.jpg)
Convolutional Neural Network
![Page 17: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/17.jpg)
Pooling Operations (Subsampling)
![Page 18: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/18.jpg)
Convolutional Neural Network
![Page 19: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/19.jpg)
Short detour in classification…
![Page 20: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/20.jpg)
Classification
INPUT (object, image, etc.) → ALGORITHM → OUTPUT (0/1, binary)
![Page 21: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/21.jpg)
Logistic Regression
INPUT $x \in \mathbb{R}^{d_x}$ → ALGORITHM ($W$) → OUTPUT $y \in \{0, 1\}$
• Training examples $\{(x_1, y_1), …, (x_{N_{train}}, y_{N_{train}})\}$
• Testing examples $\{(x_1, y_1), …, (x_{N_{test}}, y_{N_{test}})\}$
![Page 22: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/22.jpg)
Logistic Regression
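INPUT $x \in \mathbb{R}^{d_x}$ → ALGORITHM ($W$) → $\hat{y} = f_W(x) \in \mathbb{R}$
[Figure: data points in the $(x_1, x_2)$ plane; sigmoid $\sigma(x) = \frac{1}{1 + e^{-x}}$]
• Output is a real number (one class): $\hat{y} = f_W(x)$
• Ideally we want $\hat{y} = P(y = 1 \mid x)$
• Training data $\{(x_1, y_1), …, (x_{N_{train}}, y_{N_{train}})\}$, $y_n \in \{0, 1\}$
• Model 1: linear regression: $\hat{y} = f_w(x) = w^T x + b$
• Model 2: Sigmoid output layer: $\hat{y} = f_w(x) = \sigma(w^T x + b)$
• $W^\star = \operatorname{argmin}_W L(y, \hat{y})$, $L(y, \hat{y}) = \frac{1}{2}(y - \hat{y})^2$ ?
• Cross entropy loss: $L(y, \hat{y}) = -\left(y \log(\hat{y}) + (1 - y) \log(1 - \hat{y})\right)$
• if $y = 1$: $L(y, \hat{y}) = -y \log(\hat{y}) = -\log P(y = 1 \mid x)$ (maximum log likelihood!)
• if $y = 0$: $L(y, \hat{y}) = -\log(1 - \hat{y}) = -\log P(y = 0 \mid x)$ (maximum log likelihood!)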
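As a rough illustration of Model 2 with the cross entropy loss, the sketch below fits a sigmoid output layer by plain gradient descent. The toy data, learning rate, number of steps and all function names are assumptions made for the example; the slide itself only specifies the model and the loss.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x, w, b):
    """Model 2: sigmoid output layer, y_hat = sigma(w^T x + b)."""
    return sigmoid(x @ w + b)

def cross_entropy(y, y_hat, eps=1e-12):
    """L(y, y_hat) = -(y log(y_hat) + (1 - y) log(1 - y_hat)), per example."""
    y_hat = np.clip(y_hat, eps, 1.0 - eps)   # avoid log(0)
    return -(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))

# Hypothetical toy training set with labels in {0, 1}.
x = np.array([[0.5, 1.0], [-1.0, 2.0], [1.5, 0.2], [-0.5, -1.0]])
y = np.array([1.0, 0.0, 1.0, 0.0])

w, b, lr = np.zeros(2), 0.0, 0.5
losses = []
for _ in range(200):
    y_hat = predict(x, w, b)
    losses.append(cross_entropy(y, y_hat).mean())
    grad_z = (y_hat - y) / len(y)   # gradient of the mean loss w.r.t. w^T x + b
    w = w - lr * (x.T @ grad_z)
    b = b - lr * grad_z.sum()

print(losses[0], losses[-1])  # the mean loss decreases during training
```

For $y = 1$ only the $-\log(\hat{y})$ term is active and for $y = 0$ only the $-\log(1 - \hat{y})$ term, which is the case split shown on the slide.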
![Page 23: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/23.jpg)
Logistic Regression
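INPUT $x \in \mathbb{R}^{d_x}$ → ALGORITHM ($W$) → $\hat{y} = f_W(x) \in \mathbb{R}$
[Figure: data points in the $(x_1, x_2)$ plane; sigmoid $\sigma(x) = \frac{1}{1 + e^{-x}}$]
• Output is a real number (one class): $\hat{y} = f_W(x)$
• Ideally we want $\hat{y} = P(y = 1 \mid x)$
• Training data $\{(x_1, y_1), …, (x_{N_{train}}, y_{N_{train}})\}$, $y_n \in \{0, 1\}$
• Model 2: Sigmoid output layer: $\hat{y} = f_w(x) = \sigma(w^T x + b)$
• $W^\star = \operatorname{argmin}_W L(y, \hat{y})$
• Cross entropy loss: $L(y, \hat{y}) = -\left(y \log(\hat{y}) + (1 - y) \log(1 - \hat{y})\right)$
• How can we classify an object into more than 2 classes?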
![Page 24: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/24.jpg)
Classification
INPUT (object, image, etc.) → ALGORITHM → OUTPUT (multi-class)
Dog: 0
Cat: 1
Mouse: 0
Elephant: 0
![Page 25: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/25.jpg)
Back in CNNs…
![Page 26: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/26.jpg)
Classification on Images
Input to the network is an image
CNN
{ 0.02 0.03 0.01 0.01 0.70 0.02 0.02 0.01 0.06 0.12 }
Probability that image is 0 1 2 3 4 5 6 7 8 9
Output of the network is the probability of the input image being one of the digits (belonging to one of the target classes)
![Page 27: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/27.jpg)
Classification Layer
• SoftMax output layer: $f(x_i) = \frac{\exp(x_i)}{\sum_{j=1}^{10} \exp(x_j)}$, with $o = \mathrm{softmax}(x)$ and $x = W h_l + b$
• Sum of outputs is equal to 1
• Represent probabilities for the target classes
• Loss function? Cross entropy loss: $L(\hat{f}, f) = -\sum_{i=1}^{10} \hat{f}_i \log f(x_i)$
• $\hat{f} = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]$ ⟺ one-hot target (here the digit 4)
• Measure of dissimilarity between distributions
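A minimal sketch of the SoftMax output layer and the cross entropy loss for 10 classes. The logits below are made-up numbers, and subtracting the maximum before exponentiating is a standard numerical-stability step that the slide does not mention; it does not change the result.

```python
import numpy as np

def softmax(x):
    """f(x_i) = exp(x_i) / sum_j exp(x_j)."""
    shifted = x - np.max(x)   # stability trick (assumption, result unchanged)
    e = np.exp(shifted)
    return e / e.sum()

def cross_entropy(target, probs, eps=1e-12):
    """L = -sum_i target_i * log(probs_i) for a one-hot target."""
    return -np.sum(target * np.log(probs + eps))

# Hypothetical pre-activations x = W h_l + b for the ten digit classes.
logits = np.array([0.1, 0.5, -0.6, -0.6, 3.6, 0.1, 0.1, -0.6, 1.2, 1.9])
probs = softmax(logits)

target = np.zeros(10)
target[4] = 1.0               # one-hot target for the digit 4
loss = cross_entropy(target, probs)

print(probs.sum(), probs.argmax(), loss)
```

Because the target is one-hot, the sum collapses to a single term, the negative log-probability assigned to the correct class.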
![Page 28: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/28.jpg)
Automatic Feature Detection
Low level features (edges, circles, mesh, text, etc.)
Second layer features
High level features
![Page 29: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/29.jpg)
Architectures
• LeNet by Yann LeCun et al., 1998
• Alex-Net by Alex Krizhevsky et al., 2012
• VGG Net by Oxford’s Visual Geometry Group, 2014
• GoogLeNet by Christian Szegedy et al., 2014
• ResNet (Residual Network) by Kaiming He et al., 2015
• DenseNet by Gao Huang et al., 2016
LeNet of Yann LeCun et al., 1998
Alex-Net of Alex Krizhevsky et al., 2012
![Page 30: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/30.jpg)
Heuristics for Deep Learning
Data Preprocessing
• Scaling (e.g. zero mean, unit variance)
• Random cropping
• Flipping data
• PCA whitening
• Noise
Initialization of Weights
• Scale the weights of each layer by the inverse of the square root of the number of input neurons, $\frac{1}{\sqrt{N_l}}$
• Xavier initialization
Activation Functions
• tanh
• sigmoid
• ReLU
• ELU
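The weight-scaling heuristic (scale by the inverse square root of the number of input neurons) can be illustrated as follows. This sketch assumes Gaussian weights and unit-variance inputs, and it shows only the plain $1/\sqrt{N_l}$ rule, not Xavier initialization; the function name and layer sizes are hypothetical.

```python
import numpy as np

def init_layer(n_in, n_out, rng):
    """Draw weights from N(0, 1) and scale them by 1 / sqrt(n_in)."""
    return rng.normal(size=(n_in, n_out)) / np.sqrt(n_in)

rng = np.random.default_rng(0)
n_in, n_out = 1000, 500
W = init_layer(n_in, n_out, rng)

x = rng.normal(size=(200, n_in))   # unit-variance inputs (assumption)
pre_activation = x @ W

# The weights have standard deviation close to 1/sqrt(n_in), and the
# pre-activations keep roughly the same scale as the inputs.
print(W.std(), pre_activation.std())
```

Without the scaling, the pre-activations would grow like $\sqrt{N_l}$ and saturate tanh or sigmoid units.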
![Page 31: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/31.jpg)
Regularization
[Figure: fully-connected network compared with the same network under Dropout]
![Page 32: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/32.jpg)
Operating on Sequences
• In many applications, the data have temporal order (language, time series, etc.)
• Fully-connected networks and CNNs do not take this feature into account and have fixed input and output sizes
• Recurrent Neural Networks: networks with feedback loops
![Page 33: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/33.jpg)
Operating on Sequences
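[Figure: an RNN unrolled in time. Each RNN cell receives the input $x_t$ and the previous hidden state $h_{t-1}$, applies a tanh, and outputs the hidden state $h_t$; the neighboring cells do the same for steps $t-1$ and $t+1$.]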
![Page 34: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/34.jpg)
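Weight Sharing in Time
[Figure: unrolled RNN. Starting from $h_0$, the same RNN cell with parameters $W, b$ maps the input $x_t$ and the state $h_{t-1}$ to the state $h_t$ and the output $y_t$, for $t = 1, 2, 3, …, T$.]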
![Page 35: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/35.jpg)
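Weight Sharing in Time
[Figure: unrolled RNN. Starting from $h_0$, the same RNN cell with parameters $W, b$ maps the input $x_t$ and the state $h_{t-1}$ to the state $h_t$ and the output $y_t$, for $t = 1, 2, 3, …, T$. Each output $y_1, y_2, y_3, …, y_T$ contributes a loss $L_1, L_2, L_3, …, L_T$.]
$L_1 = | y_1 - y_2 |^2$
$$L = \frac{1}{T} \sum_{t=1}^{T} L_t$$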
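The unrolled RNN with shared weights and the time-averaged loss can be sketched as below. Several details are assumptions, since the slides only show the diagram: the state update is taken to be $h_t = \tanh(W [h_{t-1}; x_t] + b)$, the output is a linear readout $y_t = V h_t$ with a hypothetical matrix `V`, and each $L_t$ is a squared error against a separate `targets` array.

```python
import numpy as np

def rnn_forward(xs, h0, W, b, V):
    """Run the same cell (W, b, V) over every time step of the sequence."""
    h = h0
    hs, ys = [], []
    for x in xs:                                   # weights are shared in time
        h = np.tanh(W @ np.concatenate([h, x]) + b)
        hs.append(h)
        ys.append(V @ h)                           # linear readout (assumption)
    return np.array(hs), np.array(ys)

def sequence_loss(ys, targets):
    """L = (1/T) * sum_t L_t, with L_t a squared error (assumption)."""
    per_step = np.sum((ys - targets) ** 2, axis=1)
    return per_step.mean()

# Hypothetical sizes and random data, for illustration only.
rng = np.random.default_rng(0)
T, d_x, d_h, d_y = 5, 3, 4, 2
xs = rng.normal(size=(T, d_x))
targets = rng.normal(size=(T, d_y))
W = 0.5 * rng.normal(size=(d_h, d_h + d_x))
b = np.zeros(d_h)
V = rng.normal(size=(d_y, d_h))

hs, ys = rnn_forward(xs, np.zeros(d_h), W, b, V)
loss = sequence_loss(ys, targets)
print(hs.shape, ys.shape, loss)
```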
![Page 36: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/36.jpg)
Backpropagation Through Time (BPTT)
FORWARD PASS - entire sequence, compute loss $L$
BACKWARD PASS - entire sequence, compute gradients
![Page 37: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/37.jpg)
Truncated BPTT
“Carry” hidden state forever
BACKWARD PASS - on some smaller number of steps
![Page 38: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/38.jpg)
Vanishing Gradients Problem
[Figure: chain of RNN steps $h_0 \to h_1 \to h_2 \to h_3 \to h_4$, each applying $\tanh(W \cdot)$ to the previous state and the input $x_1, x_2, x_3, x_4$.]
• Computing the gradient of the loss w.r.t. $h_0$ involves many factors of $W$ and repeated $\tanh$
• In case of a linear activation and no bias, you would have factors like $W(W(…(W h_0)))$
• The gradient vanishes (explodes) if the largest singular value is $< 1$ ($> 1$)
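The effect of the repeated factor $W(W(…(W h_0)))$ can be seen numerically. The sketch below uses a deliberately simple construction that is an assumption of this example: $W$ is a scaled orthogonal matrix, so all of its singular values equal the chosen scale and the norm changes by exactly that factor at every step.

```python
import numpy as np

def repeated_product_norms(W, h0, steps):
    """Norm of W(W(...(W h0))) after 1, 2, ..., `steps` applications of W."""
    h = h0
    norms = []
    for _ in range(steps):
        h = W @ h
        norms.append(np.linalg.norm(h))
    return norms

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))   # orthogonal: singular values are 1
h0 = np.ones(4)                                # norm 2

shrinking = repeated_product_norms(0.9 * Q, h0, 50)   # singular values 0.9
growing = repeated_product_norms(1.1 * Q, h0, 50)     # singular values 1.1

print(shrinking[-1], growing[-1])   # roughly 0.01 versus roughly 235
```

In the linear case the backward pass multiplies by the transpose of the same matrix at every step, so the gradient with respect to $h_0$ shrinks or grows at the same geometric rate.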
![Page 39: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/39.jpg)
Gating Architectures
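[Figure: Long Short-Term Memory Cell, with input, forget and output gates $g^i_t$, $g^f_t$, $g^o_t$, cell state $c_{t-1} \to c_t$, hidden state $h_{t-1} \to h_t$ and output $o_t$.]
[Figure: Gated Recurrent Unit, with gates $r_t$ and $z_t$, hidden state $h_{t-1} \to h_t$ and output $o_t$.]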
![Page 40: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/40.jpg)
Gating Architectures
[Figure: four LSTM cells unrolled in time, with inputs $x_1, x_2, x_3, x_4$, hidden states $h_0, h_1, h_2, h_3, h_4$ and cell states $C_0, C_1, C_2, C_3, C_4$.]
Uninterrupted gradient flow!
![Page 41: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/41.jpg)
RNN structure
• One-To-One, e.g. classification
• Many-To-One, e.g. sentiment analysis
• One-To-Many, e.g. image captioning, video generation
• Many-To-Many, e.g. machine translation, time-series prediction
![Page 42: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/42.jpg)
Prediction of Chaotic Dynamics
• Forecasting the state of the Kuramoto-Sivashinsky equation
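$$\frac{\partial u}{\partial t} = -\nu \frac{\partial^4 u}{\partial x^4} - \frac{\partial^2 u}{\partial x^2} - u \frac{\partial u}{\partial x}$$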
• RNNs can be chaotic !
• They are dynamical systems
![Page 43: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/43.jpg)
Word embeddings
• Words can be represented by numbers (vectors) that encode semantic meaning
• E.g. Word2Vec
• Input: LARGE CORPUS OF TEXT
• Learns a vector space where each word is assigned a vector
• How? Predict a word (target) from its neighboring words (context) or vice versa
• Encodes context information
[Figure: 2-D embedding space with axes $x_1$, $x_2$ showing the words France, Paris, Greece and Athens (closest word).]
![Page 44: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/44.jpg)
Word embeddings
![Page 45: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/45.jpg)
Applications
Source: The History Began from AlexNet: A Comprehensive Survey on Deep Learning Approaches, 2018
• Object detection
• Object localisation
• Image/Video segmentation
• Autonomous driving
• Brain cancer detection
• Skin cancer recognition
• Speech recognition
• Machine translation
• Image/Video captioning
• Medicine/Biology
![Page 46: Short Trip In The Valley of Deep Learning · Convolution Operation (filtering, sliding) • Kernels • Sliding a kernel along the spatial dimensions , and (Iterating along , and](https://reader034.vdocument.in/reader034/viewer/2022050406/5f8326ea018a8f33a5253924/html5/thumbnails/46.jpg)
Outlook
Why Deep Learning?
• Universal approach for learning problems
• Robust approach, does not require “much” expert knowledge
• Generalization, Scalability
Challenges?
• Big data and scalability
• Generalization, transfer learning, multi-task learning
• Generate new “artificial” datasets, for applications where data is scarce (Generative models)
• Understanding/Explainable models, incorporating physics
• Causality and not plain pattern recognition/correlations
• Energy efficient implementations on mobiles/FPGAs, etc.
[Figure: performance versus amount of data, for Classical ML and Deep Learning.]