Introduction to ANN: Lesson 3
Backpropagation Learning Algorithm
• An algorithm for learning the weights in the network, given a training set of input-output pairs $\{x^\mu, o^\mu\}$.
• The algorithm is based on the gradient-descent method.
• Architecture: $w_{ij}^{(l)}$ is the weight on the connection between unit $j$ in layer $(l-1)$ and unit $i$ in layer $l$.
Supervised Learning
Supervised learning algorithms require the presence of a "teacher" who provides the right answers to the input questions.

Technically, this means that we need a training set of the form

$$L = \left\{ \left(x^1, y^1\right), \ldots, \left(x^p, y^p\right) \right\}$$

where:
• $x^h$ $(h = 1, \ldots, p)$ is the network input vector
• $y^h$ $(h = 1, \ldots, p)$ is the desired network output vector
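As a concrete illustration (not part of the original slides), such a training set can be written down directly as a list of (input, output) pairs; the XOR patterns below are purely a hypothetical example.

```python
import numpy as np

# Hypothetical training set L = {(x^1, y^1), ..., (x^p, y^p)}:
# the XOR problem, with 2-dimensional inputs and 1-dimensional targets.
L = [
    (np.array([0.0, 0.0]), np.array([0.0])),
    (np.array([0.0, 1.0]), np.array([1.0])),
    (np.array([1.0, 0.0]), np.array([1.0])),
    (np.array([1.0, 1.0]), np.array([0.0])),
]
p = len(L)  # number of training examples
```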
Supervised Learning
The learning (or training) phase consists of determining a configuration of weights such that the network output is as close as possible to the desired output, for all examples in the training set.

Formally, this amounts to minimizing the following error function:
$$E = \frac{1}{2}\sum_{h=1}^{p}\left\| y^h - out^h \right\|^2 = \frac{1}{2}\sum_{h}\sum_{k}\left( y_k^h - out_k^h \right)^2$$

where $out^h$ is the output provided by the network when given $x^h$ as input.
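A minimal sketch of this error function in Python/NumPy (not part of the slides), assuming a hypothetical function network(x) that returns the network output out^h for the input x^h:

```python
import numpy as np

def error(L, network):
    """E = 1/2 * sum_h ||y^h - out^h||^2 over the training set L.

    `network` is an assumed callable mapping an input vector x to the
    corresponding network output vector out.
    """
    E = 0.0
    for x, y in L:
        out = network(x)                   # out^h
        E += 0.5 * np.sum((y - out) ** 2)  # squared error on example h
    return E
```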
Back-Propagation
To minimize the error function E we can use the classic gradient-descent algorithm.

To compute the partial derivatives $\partial E / \partial w_{ij}$, we use the error back-propagation algorithm.

It consists of two stages:
• Forward pass: the input to the network is propagated layer after layer in the forward direction.
• Backward pass: the "error" made by the network is propagated backward, and the weights are updated properly.
Given pattern $\mu$, hidden unit $j$ receives a net input given by

$$h_j^\mu = \sum_k w_{jk}\, x_k^\mu$$

and produces as output:

$$V_j^\mu = g\left(h_j^\mu\right) = g\!\left(\sum_k w_{jk}\, x_k^\mu\right)$$
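The following sketch computes these two quantities for one pattern; the logistic sigmoid for g and the matrix layout w[j, k] = w_jk are assumptions, since the slides keep the activation function generic.

```python
import numpy as np

def g(h):
    # Assumed activation: logistic sigmoid (the slides leave g generic).
    return 1.0 / (1.0 + np.exp(-h))

def hidden_forward(w, x):
    """Net input and output of the hidden layer for one pattern x^mu,
    with w[j, k] = w_jk."""
    h = w @ x   # h_j^mu = sum_k w_jk x_k^mu
    V = g(h)    # V_j^mu = g(h_j^mu)
    return h, V
```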
Back-Prop: Updating Hidden-to-Output Weights
Each output unit $i$ receives the net input $h_i^\mu = \sum_j W_{ij} V_j^\mu$ and produces the output $O_i^\mu = g(h_i^\mu)$. The gradient-descent update for the hidden-to-output weights $W_{ij}$ is therefore:

$$\begin{aligned}
\Delta W_{ij} &= -\eta\,\frac{\partial E}{\partial W_{ij}} \\
&= -\eta\,\frac{\partial}{\partial W_{ij}}\,\frac{1}{2}\sum_{\mu}\sum_{k}\left( y_k^\mu - O_k^\mu \right)^2 \\
&= \eta\sum_{\mu}\sum_{k}\left( y_k^\mu - O_k^\mu \right)\frac{\partial O_k^\mu}{\partial W_{ij}} \\
&= \eta\sum_{\mu}\left( y_i^\mu - O_i^\mu \right)\frac{\partial O_i^\mu}{\partial W_{ij}} \\
&= \eta\sum_{\mu}\left( y_i^\mu - O_i^\mu \right) g'\!\left(h_i^\mu\right) V_j^\mu \\
&= \eta\sum_{\mu}\delta_i^\mu\, V_j^\mu
\end{aligned}$$

where:

$$\delta_i^\mu = \left( y_i^\mu - O_i^\mu \right) g'\!\left(h_i^\mu\right)$$
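A sketch of this update rule in NumPy, accumulated over all patterns (i.e. batch gradient descent); the logistic choice of g and the matrix layout w[j, k] = w_jk, W[i, j] = W_ij are assumptions, not fixed by the slides.

```python
import numpy as np

def g(h):
    return 1.0 / (1.0 + np.exp(-h))       # assumed logistic activation

def g_prime(h):
    s = g(h)
    return s * (1.0 - s)                  # its derivative g'(h)

def delta_W(L, w, W, eta):
    """Batch update Delta W_ij = eta * sum_mu delta_i^mu V_j^mu."""
    dW = np.zeros_like(W)
    for x, y in L:                        # loop over patterns mu
        V = g(w @ x)                      # hidden outputs V_j^mu
        h_out = W @ V                     # net inputs of the output units
        O = g(h_out)                      # network outputs O_i^mu
        delta = (y - O) * g_prime(h_out)  # delta_i^mu
        dW += eta * np.outer(delta, V)    # eta * delta_i^mu * V_j^mu
    return dW
```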
Back-Prop: Updating Input-to-Hidden Weights (I)
$$\begin{aligned}
\Delta w_{jk} &= -\eta\,\frac{\partial E}{\partial w_{jk}} \\
&= \eta\sum_{\mu}\sum_{i}\left( y_i^\mu - O_i^\mu \right)\frac{\partial O_i^\mu}{\partial w_{jk}} \\
&= \eta\sum_{\mu}\sum_{i}\left( y_i^\mu - O_i^\mu \right) g'\!\left(h_i^\mu\right)\frac{\partial h_i^\mu}{\partial w_{jk}}
\end{aligned}$$

Moreover, by the chain rule:

$$\frac{\partial h_i^\mu}{\partial w_{jk}} = \frac{\partial}{\partial w_{jk}}\sum_{l} W_{il}\, V_l^\mu = W_{ij}\,\frac{\partial V_j^\mu}{\partial w_{jk}} = W_{ij}\,\frac{\partial g\left(h_j^\mu\right)}{\partial w_{jk}} = W_{ij}\, g'\!\left(h_j^\mu\right)\frac{\partial h_j^\mu}{\partial w_{jk}}$$
Back-Prop: Updating Input-to-Hidden Weights (II)
Finally,

$$\frac{\partial h_j^\mu}{\partial w_{jk}} = \frac{\partial}{\partial w_{jk}}\sum_{m} w_{jm}\, x_m^\mu = x_k^\mu$$

Hence, we get:

$$\begin{aligned}
\Delta w_{jk} &= \eta\sum_{\mu,\,i}\left( y_i^\mu - O_i^\mu \right) g'\!\left(h_i^\mu\right) W_{ij}\, g'\!\left(h_j^\mu\right) x_k^\mu \\
&= \eta\sum_{\mu,\,i}\delta_i^\mu\, W_{ij}\, g'\!\left(h_j^\mu\right) x_k^\mu \\
&= \eta\sum_{\mu}\hat{\delta}_j^\mu\, x_k^\mu
\end{aligned}$$

where:

$$\hat{\delta}_j^\mu = g'\!\left(h_j^\mu\right)\sum_{i}\delta_i^\mu\, W_{ij}$$
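Putting the two update rules together, here is a minimal sketch of one full batch gradient-descent step, under the same assumptions as the earlier sketches (logistic g, matrices w[j, k] = w_jk and W[i, j] = W_ij); it is an illustrative reading of the derivation, not the lecturer's code.

```python
import numpy as np

def g(h):
    return 1.0 / (1.0 + np.exp(-h))       # assumed logistic activation

def g_prime(h):
    s = g(h)
    return s * (1.0 - s)

def backprop_step(L, w, W, eta):
    """One batch update of both weight layers.

    w[j, k] = w_jk (input-to-hidden), W[i, j] = W_ij (hidden-to-output).
    Returns updated copies of (w, W).
    """
    dw = np.zeros_like(w)
    dW = np.zeros_like(W)
    for x, y in L:                                  # loop over patterns mu
        # Forward pass
        h_hid = w @ x                               # h_j^mu
        V = g(h_hid)                                # V_j^mu
        h_out = W @ V                               # net input of output units
        O = g(h_out)                                # O_i^mu
        # Backward pass
        delta = (y - O) * g_prime(h_out)            # delta_i^mu
        delta_hat = g_prime(h_hid) * (W.T @ delta)  # delta_hat_j^mu
        dW += eta * np.outer(delta, V)              # Delta W_ij contribution
        dw += eta * np.outer(delta_hat, x)          # Delta w_jk contribution
    return w + dw, W + dW
```

Iterating this step until the error E stops decreasing is the gradient-descent training loop described at the beginning of the lesson.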