Lecture 20: Forward Pass and Back Propagation



Page 1: Lecture 20 Forward Pass and Back Propagation

Coming up soon

Happy Wednesday!
▪ Assignment 4 due tonight Nov 11th, 11:59 pm (midnight)
▪ Exceptional late policy: no penalty until Mon, Nov 23rd, 11:59 pm
▪ Quiz 11, Friday, Oct 30th 6am until Nov 1st 11:59pm (midnight)
  ▪ Neural networks
▪ Touch-point 3: deliverables due Nov 22nd, live event Mon, Nov 23rd
  ▪ Single-slide presentation outlining progress highlights and current challenges
  ▪ Three-minute pre-recorded presentation with your progress and current challenges
▪ Project final report due Dec 7th, 11:59pm (midnight)
  ▪ GitHub page with all of the results you have achieved utilizing both unsupervised learning and supervised learning
  ▪ Final seven-minute pre-recorded presentation

CS4641B Machine Learning | Fall 2020

Page 2: Lecture 20 Forward Pass and Back Propagation

CS4641B Machine Learning

Lecture 22: Neural networks: Backpropagation algorithm
Rodrigo Borela ‣ [email protected]

Page 3: Lecture 20 Forward Pass and Back Propagation

Recap

[Figure: a two-layer feed-forward network. Input units $x_1, \dots, x_D$ and a bias unit feed a layer of hidden units, each computing a weighted sum $\Sigma$ of its inputs; the hidden units in turn feed output units $y_1, \dots, y_k$.]

▪ This is a two-layer feed-forward neural network, with an input layer, a hidden layer, and an output layer.

Page 4: Lecture 20 Forward Pass and Back Propagation

Recap: Forward pass

[Figure: a concrete two-layer network with bias input $x_0$ and inputs $x_1, x_2$; bias unit $z_0$ and hidden units $z_1, z_2, z_3$; and output $y_1$. First-layer weights are labeled $w_{ij}^{(1)}$ (e.g. $w_{01}^{(1)}, w_{13}^{(1)}, w_{23}^{(1)}$) and second-layer weights $w_{j1}^{(2)}$ (e.g. $w_{01}^{(2)}, w_{31}^{(2)}$).]

Page 5: Lecture 20 Forward Pass and Back Propagation

Recap: Forward pass

▪ Activations

$a_1 = \sum_{i=1}^{D} w_{i1}^{(1)} x_i + w_{01}^{(1)} = w_{11}^{(1)} x_1 + w_{21}^{(1)} x_2 + w_{01}^{(1)}$

$a_2 = \sum_{i=1}^{D} w_{i2}^{(1)} x_i + w_{02}^{(1)} = w_{12}^{(1)} x_1 + w_{22}^{(1)} x_2 + w_{02}^{(1)}$

$a_3 = \sum_{i=1}^{D} w_{i3}^{(1)} x_i + w_{03}^{(1)} = w_{13}^{(1)} x_1 + w_{23}^{(1)} x_2 + w_{03}^{(1)}$

▪ Hidden units

$z_1 = h(a_1) = \frac{1}{1 + \exp(-a_1)}, \quad z_2 = h(a_2) = \frac{1}{1 + \exp(-a_2)}, \quad z_3 = h(a_3) = \frac{1}{1 + \exp(-a_3)}$

Page 6: Lecture 20 Forward Pass and Back Propagation

Recap: Forward pass

▪ Output

$a_1^{(2)} = \sum_{i=1}^{M} w_{i1}^{(2)} z_i + w_{01}^{(2)} = w_{11}^{(2)} z_1 + w_{21}^{(2)} z_2 + w_{31}^{(2)} z_3 + w_{01}^{(2)}$

$y_1 = \sigma\!\left(a_1^{(2)}\right) = a_1^{(2)}$
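The forward pass above can be sketched in a few lines of NumPy. This is an illustrative sketch, not from the slides: the function names and array layouts (biases folded in as $x_0 = z_0 = 1$) are my own choices.

```python
import numpy as np

def sigmoid(a):
    """Logistic activation h(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + np.exp(-a))

def forward(x, W1, W2):
    """Two-layer forward pass with a linear output unit.

    W1: (M, D+1) first-layer weights; column 0 holds the biases w_0j.
    W2: (K, M+1) second-layer weights; column 0 holds the biases w_0k.
    """
    x_ext = np.concatenate(([1.0], x))   # prepend bias input x_0 = 1
    a1 = W1 @ x_ext                      # first-layer activations a_j
    z = sigmoid(a1)                      # hidden units z_j = h(a_j)
    z_ext = np.concatenate(([1.0], z))   # prepend bias unit z_0 = 1
    a2 = W2 @ z_ext                      # output activations a_k^(2)
    y = a2                               # linear output: sigma(a) = a
    return z, y
```

Representing each layer as a weight matrix with the bias in column 0 keeps the sums $\sum_i w_{ij} x_i + w_{0j}$ as single matrix-vector products.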

Page 7: Lecture 20 Forward Pass and Back Propagation

Training the network


Page 8: Lecture 20 Forward Pass and Back Propagation

Training the network

▪ Stochastic gradient descent

$\mathbf{w}^{(\tau+1)} = \mathbf{w}^{(\tau)} - \eta \nabla E_n\!\left(\mathbf{w}^{(\tau)}\right)$

▪ Considering a linear model $y_k(\mathbf{x}, \mathbf{w}_k) = \mathbf{w}_k^T \mathbf{x} = \sum_{i=0}^{D} w_{ik} x_i$

▪ Error function for a datapoint $\mathbf{x}_n$ with a target vector of size $K$:

$E_n = \frac{1}{2} \sum_{k=1}^{K} \left(y_{nk} - t_{nk}\right)^2$

▪ Calculating the gradient of this error function with respect to $w_{ik}$:

$\frac{\partial E_n}{\partial w_{ik}} = \left(y_{nk} - t_{nk}\right) x_{ni}$
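For the linear model, one stochastic gradient descent step can be written out directly from these two formulas. A minimal sketch (the function name, shapes, and example numbers are my own, chosen only to illustrate the update):

```python
import numpy as np

def sgd_step(W, x_n, t_n, eta):
    """One SGD step for the linear model y_k = w_k^T x.

    W: (D+1, K) weights; x_n: (D+1,) input with bias entry x_0 = 1.
    """
    y_n = W.T @ x_n                   # predictions y_nk, shape (K,)
    grad = np.outer(x_n, y_n - t_n)   # dE_n/dw_ik = (y_nk - t_nk) x_ni
    return W - eta * grad             # w <- w - eta * grad E_n

W = np.ones((3, 2))                   # D = 2 inputs plus bias, K = 2 outputs
x_n = np.array([1.0, 0.5, 0.5])       # x_0 = 1 is the bias input
t_n = np.array([0.5, 0.5])
W_new = sgd_step(W, x_n, t_n, eta=0.1)
```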

Page 9: Lecture 20 Forward Pass and Back Propagation

Training the network

▪ Activation:

$a_j = \sum_i w_{ij} z_i$  ($z_i$ is input from another layer)

▪ Hidden unit

$z_j = h(a_j)$

▪ The gradient of the error $E_n$ depends on the weight $w_{ij}$ only via the summed input $a_j$ to unit $j$:

$\frac{\partial E_n}{\partial w_{ij}} = \frac{\partial E_n}{\partial a_j} \frac{\partial a_j}{\partial w_{ij}}$

Page 10: Lecture 20 Forward Pass and Back Propagation

Training the network

▪ Output layer

$\delta_k = y_k - t_k$

▪ Hidden layers

$\delta_j = \frac{\partial E_n}{\partial a_j} = \sum_k \frac{\partial E_n}{\partial a_k} \frac{\partial a_k}{\partial a_j}$

$\delta_j = \frac{\partial E_n}{\partial a_j} = h'(a_j) \sum_k w_{jk} \delta_k$

Page 11: Lecture 20 Forward Pass and Back Propagation

Error backpropagation algorithm

▪ Initialize the weights
▪ Apply input vector $\mathbf{x}_n$
▪ Evaluate the $\delta_k$ for all the output units
▪ Backpropagate the $\delta$ to obtain $\delta_j$ for each hidden unit
▪ Evaluate the required derivatives
▪ Update the weights

[Figure: units $z_i$ and $z_j$ connected by weight $w_{ij}$; unit $z_j$ connected to the output units by weights $w_{jk}$; the output errors $\delta_1, \dots, \delta_k$ propagate back to $\delta_j$.]
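The six steps above can be combined into a single training-step sketch for the two-layer network of the recap (sigmoid hidden units, linear output). This is my own illustrative implementation, not code from the lecture; names and array layouts are assumptions.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def backprop_step(W1, W2, x_n, t_n, eta):
    """One SGD step of error backpropagation.

    W1: (M, D+1), W2: (K, M+1); column 0 of each holds the bias weights.
    """
    # Apply input vector x_n (forward pass)
    x_ext = np.concatenate(([1.0], x_n))
    a1 = W1 @ x_ext
    z = sigmoid(a1)
    z_ext = np.concatenate(([1.0], z))
    y = W2 @ z_ext                                    # linear output units

    # Evaluate delta_k for all the output units
    delta_k = y - t_n
    # Backpropagate to obtain delta_j for each hidden unit,
    # using h'(a) = h(a)(1 - h(a)) for the sigmoid
    delta_j = z * (1 - z) * (W2[:, 1:].T @ delta_k)

    # Evaluate the required derivatives dE_n/dw = delta * input
    grad_W2 = np.outer(delta_k, z_ext)
    grad_W1 = np.outer(delta_j, x_ext)

    # Update the weights
    return W1 - eta * grad_W1, W2 - eta * grad_W2
```

Running this on the worked example that follows (all weights 1, $\mathbf{x}_n = (0.5\;\,0.5)^T$, $t_n = 0.5$, $\eta = 1/3$) reproduces the slide numbers up to rounding.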


Page 13: Lecture 20 Forward Pass and Back Propagation

Backprop: example

[Figure: the two-layer network from the forward-pass recap, now annotated with errors $\delta_0^{(1)}, \delta_1^{(1)}, \delta_2^{(1)}, \delta_3^{(1)}$ at the hidden units and $\delta_1^{(2)}$ at the output.]

Our goal is to compute:

$\frac{\partial E_n}{\partial w_{ij}}, \quad \frac{\partial E_n}{\partial w_{jk}}$

Page 14: Lecture 20 Forward Pass and Back Propagation

Backprop: example

[Figure: the same network, with every weight initialized to one.]

$w_{ij} = 1.0, \quad w_{jk} = 1.0$


Page 16: Lecture 20 Forward Pass and Back Propagation

Backprop: example

▪ Forward pass with input vector $\mathbf{x}_n = (0.5\;\; 0.5)^T$, $t_n = 0.5$

$a_1^{(1)} = \sum_{i=1}^{D} w_{i1}^{(1)} x_i + w_{01}^{(1)} = w_{11}^{(1)} x_1 + w_{21}^{(1)} x_2 + w_{01}^{(1)} = 1 \times 0.5 + 1 \times 0.5 + 1 = 2.0$

$a_2^{(1)} = \sum_{i=1}^{D} w_{i2}^{(1)} x_i + w_{02}^{(1)} = w_{12}^{(1)} x_1 + w_{22}^{(1)} x_2 + w_{02}^{(1)} = 1 \times 0.5 + 1 \times 0.5 + 1 = 2.0$

$a_3^{(1)} = \sum_{i=1}^{D} w_{i3}^{(1)} x_i + w_{03}^{(1)} = w_{13}^{(1)} x_1 + w_{23}^{(1)} x_2 + w_{03}^{(1)} = 1 \times 0.5 + 1 \times 0.5 + 1 = 2.0$

▪ Hidden units:

$z_1 = h\!\left(a_1^{(1)}\right) = \frac{1}{1 + \exp\left(-a_1^{(1)}\right)} = \frac{1}{1 + \exp(-2)} = 0.88$

$z_2 = h\!\left(a_2^{(1)}\right) = \frac{1}{1 + \exp\left(-a_2^{(1)}\right)} = \frac{1}{1 + \exp(-2)} = 0.88$

$z_3 = h\!\left(a_3^{(1)}\right) = \frac{1}{1 + \exp\left(-a_3^{(1)}\right)} = \frac{1}{1 + \exp(-2)} = 0.88$

Page 17: Lecture 20 Forward Pass and Back Propagation

Backprop: example

▪ Output

$a_1^{(2)} = \sum_{i=1}^{M} w_{i1}^{(2)} z_i + w_{01}^{(2)} = w_{11}^{(2)} z_1 + w_{21}^{(2)} z_2 + w_{31}^{(2)} z_3 + w_{01}^{(2)} = 3.64$

$y_1 = \sigma\!\left(a_1^{(2)}\right) = a_1^{(2)} = 3.64$
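These numbers are easy to verify numerically. A quick sketch (layouts are my own; the bias is the first entry of each vector and the first column of each weight matrix):

```python
import numpy as np

x = np.array([0.5, 0.5])
W1 = np.ones((3, 3))        # rows: hidden units; column 0: bias weights, all 1
W2 = np.ones((1, 4))        # linear output; column 0: bias weight

a1 = W1 @ np.concatenate(([1.0], x))    # -> [2.0, 2.0, 2.0]
z = 1.0 / (1.0 + np.exp(-a1))           # -> about 0.8808 each (0.88 rounded)
y = W2 @ np.concatenate(([1.0], z))     # -> about 3.6424 (3.64 rounded)
```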

Page 18: Lecture 20 Forward Pass and Back Propagation

Backprop: example

Input variable $\mathbf{x}_n$:
▪ $x_1 = 0.5$
▪ $x_2 = 0.5$

Hidden units:
▪ $z_1 = 0.88$
▪ $z_2 = 0.88$
▪ $z_3 = 0.88$

Output:
▪ $y_1 = 3.64$

Target:
▪ $t_n = 0.5$

[Figure: the network annotated with these values (inputs 0.5, bias units 1, hidden activations 0.88, output 3.64) and with the errors $\delta_0^{(1)}, \delta_1^{(1)}, \delta_2^{(1)}, \delta_3^{(1)}, \delta_1^{(2)}$ still to be computed.]


Page 20: Lecture 20 Forward Pass and Back Propagation

Backprop: example

▪ Only one output with a linear activation function, therefore:

$\delta_1^{(2)} = y_1 - t_{n1} = 3.64 - 0.5 = 3.14$

▪ Backpropagate

$\delta_j^{(1)} = h'(a_j)\, w_{j1}^{(2)} \delta_1^{(2)}$

▪ For the sigmoid function:

$h(a) = \frac{1}{1 + \exp(-a)} \;\rightarrow\; h'(a) = h(a)\left(1 - h(a)\right)$

▪ Since $z_j = h(a_j)$, the expression then becomes:

$\delta_j^{(1)} = \left(z_j - z_j^2\right) w_{j1}^{(2)} \delta_1^{(2)}$

▪ $\delta_0^{(1)} = 0, \quad \delta_1^{(1)} = 0.33, \quad \delta_2^{(1)} = 0.33, \quad \delta_3^{(1)} = 0.33$

[Figure: the network diagram with the hidden-unit errors $\delta_0^{(1)}, \delta_1^{(1)}, \delta_2^{(1)}, \delta_3^{(1)}$ and the output error $\delta_1^{(2)}$ labeled.]
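The delta computations can be checked with the slides' rounded values (a sketch; the variable names are my own):

```python
# Output delta for the single linear output unit
y1, t = 3.64, 0.5
delta_out = y1 - t                            # 3.64 - 0.5 = 3.14

# Hidden delta, using h'(a) = h(a)(1 - h(a)) = z - z^2 and w_j1 = 1
z = 0.88
delta_hidden = (z - z**2) * 1.0 * delta_out   # about 0.33
```

Each hidden unit gets the same delta because every weight and activation is identical; the bias unit's $\delta_0^{(1)}$ is zero since no first-layer weight feeds into it.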


Page 22: Lecture 20 Forward Pass and Back Propagation

Backprop: example

▪ Evaluate the derivatives

$\frac{\partial E_n}{\partial w_{jk}^{(2)}} = \delta_k z_j \;\rightarrow$

$\frac{\partial E_n}{\partial w_{01}^{(2)}} = \delta_1^{(2)} z_0 = 3.14 \times 1 = 3.14$

$\frac{\partial E_n}{\partial w_{11}^{(2)}} = \delta_1^{(2)} z_1 = 3.14 \times 0.88 = 2.76$

$\frac{\partial E_n}{\partial w_{21}^{(2)}} = \delta_1^{(2)} z_2 = 3.14 \times 0.88 = 2.76$

$\frac{\partial E_n}{\partial w_{31}^{(2)}} = \delta_1^{(2)} z_3 = 3.14 \times 0.88 = 2.76$

Page 23: Lecture 20 Forward Pass and Back Propagation

Backprop: example

▪ Evaluate the derivatives

$\frac{\partial E_n}{\partial w_{ij}^{(1)}} = \delta_j^{(1)} x_i \;\rightarrow$

$\frac{\partial E_n}{\partial w_{01}^{(1)}} = \delta_1^{(1)} x_0, \quad \frac{\partial E_n}{\partial w_{11}^{(1)}} = \delta_1^{(1)} x_1, \quad \frac{\partial E_n}{\partial w_{21}^{(1)}} = \delta_1^{(1)} x_2$

$\frac{\partial E_n}{\partial w_{02}^{(1)}} = \delta_2^{(1)} x_0, \quad \frac{\partial E_n}{\partial w_{12}^{(1)}} = \delta_2^{(1)} x_1, \quad \frac{\partial E_n}{\partial w_{22}^{(1)}} = \delta_2^{(1)} x_2$

$\frac{\partial E_n}{\partial w_{03}^{(1)}} = \delta_3^{(1)} x_0, \quad \frac{\partial E_n}{\partial w_{13}^{(1)}} = \delta_3^{(1)} x_1, \quad \frac{\partial E_n}{\partial w_{23}^{(1)}} = \delta_3^{(1)} x_2$

$\frac{\partial E_n}{\partial w_{01}^{(1)}} = 0.33, \quad \frac{\partial E_n}{\partial w_{11}^{(1)}} = 0.165, \quad \frac{\partial E_n}{\partial w_{21}^{(1)}} = 0.165$

$\frac{\partial E_n}{\partial w_{02}^{(1)}} = 0.33, \quad \frac{\partial E_n}{\partial w_{12}^{(1)}} = 0.165, \quad \frac{\partial E_n}{\partial w_{22}^{(1)}} = 0.165$

$\frac{\partial E_n}{\partial w_{03}^{(1)}} = 0.33, \quad \frac{\partial E_n}{\partial w_{13}^{(1)}} = 0.165, \quad \frac{\partial E_n}{\partial w_{23}^{(1)}} = 0.165$
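Because every derivative is a delta times an input, both gradient tables collapse to outer products. A short sketch using the slides' rounded values (variable names are my own):

```python
import numpy as np

# Second layer: dE_n/dw_jk = delta_k * z_j, with z_0 = 1 (bias)
delta_out = 3.14
z = np.array([1.0, 0.88, 0.88, 0.88])
grad_W2 = delta_out * z               # [3.14, 2.76, 2.76, 2.76] after rounding

# First layer: dE_n/dw_ij = delta_j * x_i, with x_0 = 1 (bias)
delta_hidden = np.array([0.33, 0.33, 0.33])
x = np.array([1.0, 0.5, 0.5])
grad_W1 = np.outer(delta_hidden, x)   # each row: [0.33, 0.165, 0.165]
```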


Page 25: Lecture 20 Forward Pass and Back Propagation

Backprop: example

▪ Use the update rule and the calculated error derivatives

$w^{new} = w^{old} - \eta \nabla E_n\!\left(w^{old}\right)$

▪ Assuming $\eta = \frac{1}{3}$, the new weights are calculated for the network, e.g. weight $w_{01}^{(1)}$ (the weight from $x_0$ to $z_1$ in the first layer):

$w_{01}^{(1),new} = w_{01}^{(1),old} - \eta \frac{\partial E_n}{\partial w_{01}^{(1),old}} = 1 - \frac{1}{3}(0.33) = 0.89$
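Applying the same update to every weight gives the full set of new weights. A sketch using the rounded gradients from the previous slides (array layouts are my own):

```python
import numpy as np

eta = 1.0 / 3.0

# First-layer weights (all initialized to 1) and their gradients
w1_old = np.ones((3, 3))                                 # column 0: bias weights
grad_w1 = np.outer([0.33, 0.33, 0.33], [1.0, 0.5, 0.5])
w1_new = w1_old - eta * grad_w1       # bias column -> 0.89, input weights -> 0.945

# Second-layer weights and their gradients
w2_old = np.ones(4)                                      # entry 0: bias weight
grad_w2 = np.array([3.14, 2.76, 2.76, 2.76])
w2_new = w2_old - eta * grad_w2       # bias -> about -0.047, hidden weights -> 0.08
```

Exact arithmetic gives $1 - 3.14/3 \approx -0.0467$ for the output bias, which the slides display as $-0.046$.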

Page 26: Lecture 20 Forward Pass and Back Propagation

Backprop: example

[Figure: the network redrawn with the updated weights: first-layer bias weights 0.89, first-layer input weights 0.945, second-layer bias weight $-0.046$, and hidden-to-output weights 0.08.]

Page 27: Lecture 20 Forward Pass and Back Propagation

Test

[Figure: the updated network (first-layer bias weights 0.89, input weights 0.945, second-layer bias weight $-0.046$, hidden-to-output weights 0.08) applied to a new test point.]

$\mathbf{x} = \begin{pmatrix} 0.7 \\ 0.7 \end{pmatrix}, \quad t = -0.02$
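The test is just another forward pass through the updated network. A sketch with the updated weights from the slides (the resulting prediction is my own computed value, to be compared against the target $t = -0.02$):

```python
import numpy as np

# Updated weights (column 0 of W1 holds the first-layer bias weights)
W1 = np.array([[0.89, 0.945, 0.945],
               [0.89, 0.945, 0.945],
               [0.89, 0.945, 0.945]])
W2 = np.array([-0.046, 0.08, 0.08, 0.08])   # bias weight, then z_1..z_3

x = np.array([1.0, 0.7, 0.7])               # x_0 = 1, then the test input
a1 = W1 @ x                                  # about 2.213 for each hidden unit
z = 1.0 / (1.0 + np.exp(-a1))                # about 0.90
y = W2 @ np.concatenate(([1.0], z))          # linear output; compare to t = -0.02
```

One training step has pulled the output from 3.64 much closer to the target range, though a single update is of course not enough to fit the target exactly.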

Page 28: Lecture 20 Forward Pass and Back Propagation

Test

▪ Activations

$a_1 = \sum_{i=1}^{D} w_{i1}^{(1)} x_i + w_{01}^{(1)} = w_{11}^{(1)} x_1 + w_{21}^{(1)} x_2 + w_{01}^{(1)}$

$a_2 = \sum_{i=1}^{D} w_{i2}^{(1)} x_i + w_{02}^{(1)} = w_{12}^{(1)} x_1 + w_{22}^{(1)} x_2 + w_{02}^{(1)}$

$a_3 = \sum_{i=1}^{D} w_{i3}^{(1)} x_i + w_{03}^{(1)} = w_{13}^{(1)} x_1 + w_{23}^{(1)} x_2 + w_{03}^{(1)}$

▪ Hidden units

$z_1 = h(a_1) = \frac{1}{1 + \exp(-a_1)}, \quad z_2 = h(a_2) = \frac{1}{1 + \exp(-a_2)}, \quad z_3 = h(a_3) = \frac{1}{1 + \exp(-a_3)}$

Page 29: Lecture 20 Forward Pass and Back Propagation

Test

▪ Output

$a_1^{(2)} = \sum_{i=1}^{M} w_{i1}^{(2)} z_i + w_{01}^{(2)} = w_{11}^{(2)} z_1 + w_{21}^{(2)} z_2 + w_{31}^{(2)} z_3 + w_{01}^{(2)}$

$y_1 = \sigma\!\left(a_1^{(2)}\right) = a_1^{(2)}$