Lecture 12: Computational Graph, Backpropagation

Aykut Erdem, November 2018, Hacettepe University

Page 1:

Lecture 12:
− Computational Graph
− Backpropagation

Aykut Erdem, November 2018, Hacettepe University

Page 2:

Last time… Multilayer Perceptron

• Layer Representation

• (typically) iterate between a linear mapping Wx and a nonlinear function

• Loss function to measure the quality of the estimate so far

y_i = W_i x_i

x_{i+1} = \sigma(y_i)

[Figure: a chain of layers x_1 → x_2 → x_3 → x_4 with weight matrices W_1 … W_4, producing output y and loss l(y, y_i)]

slide by Alex Smola
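To make the recursion concrete, here is a minimal NumPy sketch of the iteration above (the layer sizes and the choice of sigma = sigmoid are illustrative assumptions, not from the slide):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
# Illustrative (assumed) layer sizes: 3 -> 4 -> 4 -> 1
Ws = [rng.standard_normal((4, 3)),
      rng.standard_normal((4, 4)),
      rng.standard_normal((1, 4))]

x = rng.standard_normal(3)       # x_1: the input
for W in Ws:                     # y_i = W_i x_i ;  x_{i+1} = sigma(y_i)
    y = W @ x
    x = sigmoid(y)
# x now holds the final activation, to be scored by the loss l(y, y_i)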

Page 3:

Last time… Forward Pass

• Output of the network can be written as:

h_j(x) = f\left( v_{j0} + \sum_{i=1}^{D} x_i v_{ji} \right)

o_k(x) = g\left( w_{k0} + \sum_{j=1}^{J} h_j(x) w_{kj} \right)

(j indexing hidden units, k indexing the output units, D number of inputs)

• Activation functions f, g: sigmoid/logistic, tanh, or rectified linear (ReLU)

\sigma(z) = \frac{1}{1 + \exp(-z)}, \quad \tanh(z) = \frac{\exp(z) - \exp(-z)}{\exp(z) + \exp(-z)}, \quad \mathrm{ReLU}(z) = \max(0, z)

slide by Raquel Urtasun, Richard Zemel, Sanja Fidler
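A direct NumPy transcription of these two equations (the sizes D, J, K and the particular choices of f and g are illustrative assumptions):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

D, J, K = 3, 4, 2                  # assumed sizes: inputs, hidden units, outputs
rng = np.random.default_rng(0)
V, v0 = rng.standard_normal((J, D)), rng.standard_normal(J)   # v_{ji} and biases v_{j0}
W, w0 = rng.standard_normal((K, J)), rng.standard_normal(K)   # w_{kj} and biases w_{k0}

x = rng.standard_normal(D)
h = sigmoid(v0 + V @ x)            # h_j(x) = f(v_{j0} + sum_i x_i v_{ji}), with f = sigmoid
o = np.tanh(w0 + W @ h)            # o_k(x) = g(w_{k0} + sum_j h_j(x) w_{kj}), with g = tanh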

Page 4:

Last time… Forward Pass in Python

• Example code for a forward pass for a 3-layer network in Python:

• Can be implemented efficiently using matrix operations

• Example above: W1 is a matrix of size 4 × 3, W2 is 4 × 4. What about biases and W3?

slide by Raquel Urtasun, Richard Zemel, Sanja Fidler [http://cs231n.github.io/neural-networks-1/]
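The code shown on the slide was not extracted; the following sketch is in the spirit of the cs231n example it links to: a 3-layer network with sigmoid activations, matching the W1: 4 × 3 and W2: 4 × 4 sizes above. The answer to the slide's question, and an assumption here, is that W3 is 1 × 4 and each layer also carries a bias b1, b2, b3:

import numpy as np

f = lambda z: 1.0 / (1.0 + np.exp(-z))     # sigmoid activation function

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((4, 3)), rng.standard_normal((4, 1))
W2, b2 = rng.standard_normal((4, 4)), rng.standard_normal((4, 1))
W3, b3 = rng.standard_normal((1, 4)), rng.standard_normal((1, 1))

x = rng.standard_normal((3, 1))            # random input vector (3x1)
h1 = f(W1 @ x + b1)                        # first hidden layer activations (4x1)
h2 = f(W2 @ h1 + b2)                       # second hidden layer activations (4x1)
out = W3 @ h2 + b3                         # output neuron (1x1)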

Page 5:

Backpropagation


Page 6:

Recap: Loss function/Optimization


[Figure: class scores for a few example training images under the current W]

slide by Fei-Fei Li & Andrej Karpathy & Justin Johnson

We defined a (linear) score function.

TODO:

1. Define a loss function that quantifies our unhappiness with the scores across the training data.

2. Come up with a way of efficiently finding the parameters that minimize the loss function. (optimization)

Pages 7–16:

Softmax Classifier (Multinomial Logistic Regression)

slide by Fei-Fei Li & Andrej Karpathy & Justin Johnson
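The formulas on these slides were only in the images, but the loss they build up is the standard softmax/cross-entropy loss, L_i = -log(e^{s_{y_i}} / Σ_j e^{s_j}). A small NumPy sketch (the scores and label are made-up values; shifting by the max is the usual numerical-stability trick):

import numpy as np

def softmax_loss(scores, y):
    # L_i = -log( e^{s_y} / sum_j e^{s_j} ), computed stably
    shifted = scores - np.max(scores)
    probs = np.exp(shifted) / np.sum(np.exp(shifted))
    return -np.log(probs[y])

scores = np.array([3.2, 5.1, -1.7])   # made-up class scores
print(softmax_loss(scores, y=0))      # ~2.04: high loss, the correct class has low probability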

Page 17:

Optimization


Page 18:

Gradient Descent


slide by Fei-Fei Li & Andrej Karpathy & Justin Johnson

Pages 19–20:

Mini-batch Gradient Descent

• only use a small portion of the training set to compute the gradient

• there are also fancier update formulas (momentum, Adagrad, RMSProp, Adam, …)

slide by Fei-Fei Li & Andrej Karpathy & Justin Johnson
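The update-loop code on these slides was not extracted; a vanilla mini-batch loop in the spirit of the cs231n pseudocode looks roughly like this (sample_training_data and evaluate_gradient are hypothetical helpers standing in for the slide's code):

# Vanilla mini-batch gradient descent (sketch; helper names are hypothetical)
while True:
    data_batch = sample_training_data(data, 256)    # sample 256 training examples
    weights_grad = evaluate_gradient(loss_fun, data_batch, weights)
    weights += -step_size * weights_grad            # parameter update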

Page 21:

The effects of different update formulas

(image credits to Alec Radford)

slide by Fei-Fei Li & Andrej Karpathy & Justin Johnson

Pages 22–23: [figures]

slide by Fei-Fei Li & Andrej Karpathy & Justin Johnson

Page 24:

Computational Graph

[Graph: x and W feed a multiply node (*) producing s (scores); the scores feed a hinge loss node, W also feeds a regularization node R, and a (+) node sums them into the total loss L]

slide by Fei-Fei Li & Andrej Karpathy & Justin Johnson
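Written out as a forward pass in NumPy (the multiclass-SVM form of the hinge loss, the sizes, and the regularization strength are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))          # assumed: 3 classes, 4 input features
x = rng.standard_normal(4)
y = 1                                    # assumed index of the correct class

s = W @ x                                # (*) node: scores
margins = np.maximum(0, s - s[y] + 1.0)  # hinge loss node (margin 1)
margins[y] = 0.0
data_loss = margins.sum()
R = np.sum(W * W)                        # R node: L2 regularization of W
L = data_loss + 0.1 * R                  # (+) node: total loss (0.1 = assumed reg. strength)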

Page 25:

Convolutional Network (AlexNet)

[Figure: the AlexNet computational graph, from the input image and weights to the loss]

slide by Fei-Fei Li & Andrej Karpathy & Justin Johnson

Page 26:

Neural Turing Machine

[Figure: the Neural Turing Machine computational graph, from the input tape to the loss]

slide by Fei-Fei Li & Andrej Karpathy & Justin Johnson

Pages 27–39:

e.g. x = -2, y = 5, z = -4

Want: the gradients of the output with respect to x, y and z, computed one node at a time.

Chain rule: the gradient on each input is its local gradient multiplied by the gradient flowing in from above.

slide by Fei-Fei Li & Andrej Karpathy & Justin Johnson
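The graph itself lives only in the slide images, but the values x = -2, y = 5, z = -4 match the standard cs231n running example f(x, y, z) = (x + y)·z; assuming that is the function here, the full forward and backward pass works out as:

# Forward pass, assuming the cs231n example f(x, y, z) = (x + y) * z
x, y, z = -2.0, 5.0, -4.0
q = x + y                  # q = 3    (add gate)
f = q * z                  # f = -12  (multiply gate)

# Backward pass: apply the chain rule node by node, right to left
df_dz = q                  # mul gate: df/dz = q       -> 3
df_dq = z                  # mul gate: df/dq = z       -> -4
df_dx = 1.0 * df_dq        # add gate: dq/dx = 1       -> -4
df_dy = 1.0 * df_dq        # add gate: dq/dy = 1       -> -4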

Pages 40–45:

[Figure: a single gate f, with its input activations flowing forward and gradients flowing backward; each gate multiplies the gradient arriving from above by its "local gradient"]

slide by Fei-Fei Li & Andrej Karpathy & Justin Johnson

Pages 46–59:

Another example, worked gate by gate:

(-1) × (-0.20) = 0.20

[local gradient] × [its gradient]: [1] × [0.2] = 0.2, and [1] × [0.2] = 0.2 (both inputs!)

[local gradient] × [its gradient]: x0: [2] × [0.2] = 0.4; w0: [-1] × [0.2] = -0.2

slide by Fei-Fei Li & Andrej Karpathy & Justin Johnson

Pages 60–61:

sigmoid function → sigmoid gate: the chain of gates above collapses into a single sigmoid gate, whose local gradient is (1 − σ(x))·σ(x); here (0.73) * (1 − 0.73) = 0.2

slide by Fei-Fei Li & Andrej Karpathy & Justin Johnson
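The numbers quoted on these slides match the usual cs231n sigmoid-neuron example, f(w, x) = 1/(1 + e^{-(w0·x0 + w1·x1 + w2)}) with w = [2, -3, -3] and x = [-1, -2]; that is an assumption here, but it reproduces the 0.73 output and the 0.4 / -0.2 input gradients above. A quick numeric check:

import numpy as np

w = np.array([2.0, -3.0, -3.0])          # assumed weights [w0, w1, w2] (w2 acts as a bias)
x = np.array([-1.0, -2.0])               # assumed inputs  [x0, x1]

dot = w[0]*x[0] + w[1]*x[1] + w[2]       # = 1.0
out = 1.0 / (1.0 + np.exp(-dot))         # sigmoid -> 0.73

ddot = (1.0 - out) * out                 # sigmoid gate's local gradient -> ~0.2
dx = np.array([w[0], w[1]]) * ddot       # gradients on x0, x1  -> ~[0.4, -0.6]
dw = np.array([x[0], x[1], 1.0]) * ddot  # gradients on w0, w1, w2 -> ~[-0.2, -0.4, 0.2]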

Page 62:

Patterns in backward flow

• add gate: gradient distributor

• max gate: gradient router

• mul gate: gradient… “switcher”?

slide by Fei-Fei Li & Andrej Karpathy & Justin Johnson
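A tiny sketch of the three patterns on made-up values: the add gate copies the upstream gradient to both inputs, the max gate routes it to whichever input won, and the mul gate hands each input the other input times the upstream gradient:

upstream = 2.0                  # made-up gradient flowing in from above
x, y = 3.0, -1.0

# add gate: z = x + y -> both inputs receive the upstream gradient unchanged
dx_add, dy_add = upstream, upstream            # (2.0, 2.0)

# max gate: z = max(x, y) -> the gradient is routed to the winning input only
dx_max = upstream if x >= y else 0.0           # 2.0
dy_max = upstream if y > x else 0.0            # 0.0

# mul gate: z = x * y -> the inputs are "switched" as local gradients
dx_mul, dy_mul = y * upstream, x * upstream    # (-2.0, 6.0)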

Page 63:


slide by Fei-Fei Li & Andrej Karpathy & Justin Johnson

Gradients add at branches: when a value is used in several places in the graph, the gradients flowing back along the branches are summed at that value.
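For instance, for the made-up function f(x) = x·x + x, the value x feeds two branches and the branch gradients sum:

x = 3.0
# f(x) = x*x + x : x feeds both the multiply gate and the add gate
df_dx_mul = 2.0 * x                 # from the x*x branch -> 6.0
df_dx_add = 1.0                     # from the +x branch  -> 1.0
df_dx = df_dx_mul + df_dx_add       # gradients add at the branch -> 7.0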

Page 64:

Implementation: forward/backward API

Graph (or Net) object. (Rough pseudo code)

slide by Fei-Fei Li & Andrej Karpathy & Justin Johnson
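The pseudocode on the slide was not extracted; here is a rough sketch of such a Graph (or Net) object, simplified to a chain of one-input gates (the class shape and method names are assumptions; a real Net would process a general graph in topological order):

class ComputationalGraph:
    def __init__(self, gates):
        self.gates = gates                    # gates in topological (left-to-right) order

    def forward(self, x):
        for gate in self.gates:               # run each gate's forward(), left to right
            x = gate.forward(x)
        return x                              # the final value, e.g. the loss

    def backward(self, dloss=1.0):
        grad = dloss
        for gate in reversed(self.gates):     # chain rule: backward(), right to left
            grad = gate.backward(grad)
        return grad                           # gradient with respect to the input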

Pages 65–66:

Implementation: forward/backward API

(x, y, z are scalars)

[Graph: x and y feed a multiply gate (*) producing z]

slide by Fei-Fei Li & Andrej Karpathy & Justin Johnson
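The gate code on these slides was also only in the images; a minimal sketch of the multiply gate's forward/backward API for scalars (the exact slide code may differ, but it follows this contract):

class MultiplyGate:
    def forward(self, x, y):
        self.x, self.y = x, y          # cache the inputs: backward will need them
        z = x * y
        return z

    def backward(self, dz):            # dz: upstream gradient dL/dz
        dx = self.y * dz               # dL/dx = y * dL/dz
        dy = self.x * dz               # dL/dy = x * dL/dz
        return dx, dy

gate = MultiplyGate()
z = gate.forward(-2.0, 5.0)            # z = -10.0
dx, dy = gate.backward(1.0)            # dx = 5.0, dy = -2.0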

Page 67:

Summary

• neural nets will be very large: no hope of writing down the gradient formula by hand for all parameters

• backpropagation = recursive application of the chain rule along a computational graph to compute the gradients of all inputs/parameters/intermediates

• implementations maintain a graph structure, where the nodes implement the forward() / backward() API

• forward: compute the result of an operation and save any intermediates needed for gradient computation in memory

• backward: apply the chain rule to compute the gradient of the loss function with respect to the inputs

Page 68:

Where are we now…


Mini-batch SGD

Loop:
1. Sample a batch of data
2. Forward prop it through the graph, get loss
3. Backprop to calculate the gradients
4. Update the parameters using the gradient
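As code, one pass through this loop might look like the following sketch (graph, sample_batch, params and learning_rate are hypothetical stand-ins; params is assumed to hold NumPy arrays so the in-place update works):

# One iteration of mini-batch SGD (sketch; names are hypothetical)
x_batch, y_batch = sample_batch(data, batch_size=256)   # 1. sample a batch of data
loss = graph.forward(x_batch, y_batch, params)          # 2. forward prop, get the loss
grads = graph.backward()                                # 3. backprop the gradients
for p, g in zip(params, grads):                         # 4. update the parameters
    p -= learning_rate * g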

Page 69:

Next Lecture:

Convolutional Neural Networks
