Lecture 3: Perceptrons
TRANSCRIPT
Slide 1
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 1/46
The Linear Classifier,
also known as the “Perceptron”
COMP24111 lecture 3
Slide 2
LAST WEEK: our first “machine learning” algorithm
Testing point x
For each training datapoint x’
measure distance(x,x’)
End
Sort distances
Select K nearest
Assign most common class
The K-Nearest Neighbour Classifier
Make your own notes on its advantages / disadvantages.
I’ll ask for volunteers next time we meet…
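The steps above can be sketched as a short runnable function. The dataset format and the names `knn_classify` and `train` are illustrative, not from the slides; distance is taken to be Euclidean:

```python
from collections import Counter
import math

def knn_classify(x, train, k):
    """Classify testing point x by majority vote among its k nearest neighbours.

    train is a list of (point, label) pairs.
    """
    # For each training datapoint x', measure distance(x, x')
    dists = [(math.dist(x, xp), label) for xp, label in train]
    # Sort distances and select the K nearest
    dists.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in dists[:k]]
    # Assign the most common class
    return Counter(nearest_labels).most_common(1)[0][0]
```

Note that nothing is "learned" here: the whole training set is carried around and scanned at prediction time, which is the disadvantage the next slide highlights.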
Slide 3
Model (memorize the training data)
Testing Data (no labels)
Training data
Predicted Labels
Learning algorithm
(do nothing)
Supervised Learning Pipeline for Nearest Neighbour
Slide 4

The most important concept in Machine Learning
Slide 5
Looks good so far…
The most important concept in Machine Learning
Slide 6
Looks good so far…
Oh no! Mistakes!
What happened?
The most important concept in Machine Learning
Slide 7
Looks good so far…
Oh no! Mistakes!
What happened?
We didn’t have all the data.
We can never assume that we do.
This is called “OVER-FITTING” to the small dataset.
The most important concept in Machine Learning
Slide 8
The Linear Classifier
COMP24111 lecture 3
Slide 9
Model (equation)
Testing Data (no labels)
Training data
Predicted Labels
Learning algorithm (search for good parameters)
Supervised Learning Pipeline for Linear Classifiers
Slide 10
A simpler, more compact model?

(plot: weight vs. height)

if (weight > t) then “player” else “dancer”
Slide 11
What’s an algorithm to find a good threshold?
(plot: weight vs. height)

if (weight > t) then “player” else “dancer”
t = 40
numMistakes = testRule(t)
while ( numMistakes != 0 ) {
    t = t + 1
    numMistakes = testRule(t)
}
Slide 12
We have our second Machine Learning procedure.
if (weight > t) then “player” else “dancer”

t = 40
numMistakes = testRule(t)
while ( numMistakes != 0 ) {
    t = t + 1
    numMistakes = testRule(t)
}

The threshold classifier (also known as a “Decision Stump”)
Slide 13
Three “ingredients” of a Machine Learning procedure

“Model”: the final product, the thing you have to package up and send to a customer. A piece of code with some parameters that need to be set.

“Error function”: the performance criterion, the function you use to judge how well the parameters of the model are set.

“Learning algorithm”: the algorithm that optimises the model parameters, using the error function to judge how well it is doing.
Slide 14
Three “ingredients” of a Threshold Classifier
Model:
if (x > t) then “player” else “dancer”

Learning algorithm:
t = 40
while ( numMistakes != 0 ) {
    t = t + 1
    numMistakes = testRule(t)
}

Error function:
the number of mistakes (computed by testRule)

General case – we’re not just talking about the weight of a rugby player – a threshold can be put on any feature ‘x’.
Slide 15

Slide 16
Model (memorize the training data)
Testing Data (no labels)
Training data
Predicted Labels
Learning algorithm
(do nothing)
Supervised Learning Pipeline for Nearest Neighbour
Slide 17
What’s the “model” for the Nearest Neighbour classifier?
(plot: weight vs. height)

For the k-NN, the model is the training data itself!
- very good accuracy
- very computationally intensive!
Testing point x
For each training datapoint x’
measure distance(x,x’)
End
Sort distances
Select K nearest
Assign most common class
Slide 18
New data: what’s an algorithm to find a good threshold?

(plot: weight vs. height, with threshold t marked)

Our model does not match the problem!

1 mistake…

if (weight > t) then “player” else “dancer”
Slide 19
New data: what’s an algorithm to find a good threshold?

(plot: weight vs. height)

But our current model cannot represent this…

if (weight > t) then “player” else “dancer”
Slide 20
We need a more sophisticated model…
if (x > t) then “player” else “dancer”
Slide 21

Slide 22
Input signals are sent from other neurons.

If enough signals accumulate, the neuron fires a signal.

Connection strengths determine how the signals are accumulated.
Slide 23
(diagram: incoming signals x1, x2, x3 arrive with connection strengths w1, w2, w3, are added into an activation level a, which produces the output signal)

a = Σ_{i=1}^{M} w_i x_i

output = 1 if (a > t), else output = 0

- input signals ‘x’ and coefficients ‘w’ are multiplied
- weights correspond to connection strengths
- signals are added up – if they are enough, FIRE!
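The neuron’s computation is only a couple of lines of code; a minimal sketch (the function names are mine, not from the slides):

```python
def activation(x, w):
    # Multiply each incoming signal by its connection strength, then add up:
    # a = sum over i of w_i * x_i
    return sum(xi * wi for xi, wi in zip(x, w))

def output_signal(x, w, t):
    # The neuron fires (outputs 1) only if the activation exceeds the threshold t
    return 1 if activation(x, w) > t else 0
```

For example, `output_signal([1.0, 2.0], [0.5, 0.5], t=1.0)` gives 1, since the activation 1.5 exceeds the threshold.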
Slide 24
Sum notation (just like a loop from 1 to M):

a = Σ_{i=1}^{M} w_i x_i

double[] x = …
double[] w = …

Multiply corresponding elements and add them up.

if (activation > threshold) FIRE!
Slide 25
The Perceptron Decision Rule

if ( Σ_{i=1}^{M} w_i x_i > t ) then output = 1, else output = 0
Slide 26
(plot: the two regions are output = 1 and output = 0)

if ( Σ_{i=1}^{M} w_i x_i > t ) then output = 1, else output = 0

Rugby player = 1, Ballet dancer = 0
Slide 27
Is this a good decision boundary?

if ( Σ_{i=1}^{M} w_i x_i > t ) then output = 1, else output = 0
Slide 28
w1 = 1.0
w2 = 0.2
t = 0.05

if ( Σ_{i=1}^{M} w_i x_i > t ) then output = 1, else output = 0
Slide 29
w1 = 2.1
w2 = 0.2
t = 0.05

if ( Σ_{i=1}^{M} w_i x_i > t ) then output = 1, else output = 0
Slide 30
w1 = 1.4
w2 = 0.02
t = 0.05

if ( Σ_{i=1}^{M} w_i x_i > t ) then output = 1, else output = 0
Slide 31
Changing the weights/threshold makes the decision boundary move.

Pointless/impossible to do it by hand – only OK for the simple 2-D case.

We need an algorithm…
w1 = -0.8
w2 = 0.03
t = 0.05
Slide 32
w = (0.2, 0.2, 0.5)
x = (1.0, 0.5, 2.0)
t = 1.0

(diagram: inputs x1, x2, x3 with weights w1, w2, w3)

a = Σ_{i=1}^{M} x_i w_i

Q1: What is the activation, a, of the neuron?
Q2: Does the neuron fire?
Q3: What if we set the threshold at 0.5 and weight w3 to zero?

Take a 20 minute break and think about this.
Slide 33
20 minute break
Slide 34
w = (0.2, 0.2, 0.5)
x = (1.0, 0.5, 2.0)
t = 1.0

a = Σ_{i=1}^{M} x_i w_i = (1.0 × 0.2) + (0.5 × 0.2) + (2.0 × 0.5) = 1.3

Q1: What is the activation, a, of the neuron?
Q2: Does the neuron fire?

if (activation > threshold) output = 1, else output = 0

… So yes, it fires.
Slide 35
w = (0.2, 0.2, 0.5)
x = (1.0, 0.5, 2.0)
t = 1.0

Q3: What if we set the threshold at 0.5 and weight w3 to zero?

a = Σ_{i=1}^{M} x_i w_i = (1.0 × 0.2) + (0.5 × 0.2) + (2.0 × 0.0) = 0.3

if (activation > threshold) output = 1, else output = 0

… So no, it does not fire.
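The exercise can be checked in code. The values below follow one reconstruction of the badly OCR-damaged slides above (w = (0.2, 0.2, 0.5), x = (1.0, 0.5, 2.0), t = 1.0), so treat the specific numbers as an assumption:

```python
def activation(x, w):
    # a = sum over i of x_i * w_i
    return sum(xi * wi for xi, wi in zip(x, w))

x = (1.0, 0.5, 2.0)
w = (0.2, 0.2, 0.5)

# Q1: a = (1.0 * 0.2) + (0.5 * 0.2) + (2.0 * 0.5) = 1.3
a = activation(x, w)

# Q2: with t = 1.0 we have a > t, so the neuron fires
fires = a > 1.0

# Q3: threshold 0.5 and w3 = 0 gives a = 0.3, so it does not fire
a3 = activation(x, (0.2, 0.2, 0.0))
fires3 = a3 > 0.5
```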
Slide 36
We need a more sophisticated model…
(plot: weight vs. height)

if (weight > t) then “player” else “dancer”
if (f(x) > t) then “player” else “dancer”

x1 = height (cm)
x2 = weight (kg)

f(x) = Σ_{i=1}^{d} w_i x_i = (w1 × x1) + (w2 × x2)

The Perceptron
Slide 37
The Perceptron
f(x) = Σ_{i=1}^{d} w_i x_i = (w1 × x1) + (w2 × x2)

(plots: weight vs. height, before and after moving the decision boundary)

if (f(x) > t) then “player” else “dancer”

w1, w2 and t change the position of the DECISION BOUNDARY.
Slide 38
The Perceptron

Model:
if Σ_{i=1}^{d} w_i x_i > t then ŷ = 1 (“player”), else ŷ = 0 (“dancer”)

Error function:
number of mistakes (a.k.a. classification error)

Learning algorithm:
…we need to optimise the w and t values.

(plot: weight vs. height)
Slide 39
Perceptron Learning Rule

new weight = old weight + 0.1 × (trueLabel − output) × input

What weight updates do these cases produce?

if ( target = 1, output = 1 ) then update = ?
if ( target = 1, output = 0 ) then update = ?
if ( target = 0, output = 1 ) then update = ?
if ( target = 0, output = 0 ) then update = ?
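Working the four cases through in code, with the learning rate 0.1 from the rule above (the input value 1.0 is an arbitrary choice for illustration): only the two mistake cases produce a non-zero update, pushing the weight up when the target was 1 and down when it was 0.

```python
def weight_update(target, output, x, rate=0.1):
    # new weight = old weight + rate * (target - output) * input,
    # so the change applied to the weight is:
    return rate * (target - output) * x

# All four target/output combinations, for an input of 1.0
updates = {(t, o): weight_update(t, o, 1.0) for t in (1, 0) for o in (1, 0)}
# (1, 1) ->  0.0   correct, no change
# (1, 0) -> +0.1   missed a positive: increase the weight
# (0, 1) -> -0.1   false alarm: decrease the weight
# (0, 0) ->  0.0   correct, no change
```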
Slide 40
initialise weights to random numbers in range -1 to +1
for n = 1 to NUM_ITERATIONS
    for each training example (x)
        calculate activation
        for each weight
            update weight by learning rule
        end
    end
end
Perceptron convergence theorem:
If the data is linearly separable, then application of the
Perceptron learning rule will find a separating decision boundary,
within a finite number of iterations.
Learning algorithm for the Perceptron
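A runnable sketch of the whole procedure. One assumption not on the slide: the threshold t is folded in as an extra weight on a constant input of −1 (a standard trick), so the learning rule can adjust the threshold along with the weights:

```python
import random

def train_perceptron(data, rate=0.1, num_iterations=1000, seed=0):
    """data: list of (inputs, target) pairs, with targets 0 or 1."""
    rng = random.Random(seed)
    n = len(data[0][0]) + 1                      # extra slot for the threshold
    # initialise weights to random numbers in range -1 to +1
    w = [rng.uniform(-1, 1) for _ in range(n)]
    for _ in range(num_iterations):
        mistakes = 0
        for x, target in data:
            xe = list(x) + [-1.0]                # constant input carries the threshold
            a = sum(xi * wi for xi, wi in zip(xe, w))   # calculate activation
            output = 1 if a > 0 else 0
            if output != target:
                mistakes += 1
                # update each weight by the perceptron learning rule
                w = [wi + rate * (target - output) * xi
                     for xi, wi in zip(xe, w)]
        if mistakes == 0:                        # a separating boundary was found
            break
    return w

def predict(w, x):
    xe = list(x) + [-1.0]
    return 1 if sum(xi * wi for xi, wi in zip(xe, w)) > 0 else 0
```

On linearly separable data (for example, the AND function on two inputs) the convergence theorem guarantees the loop reaches zero mistakes after finitely many updates.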
Slide 41
Model (if… then…)
Testing Data (no labels)
Training data
Predicted Labels
Learning algorithm (search for good parameters)
Supervised Learning Pipeline for Perceptron
Slide 42
New data… non-linearly separable

(plot: weight vs. height)

Our model does not match the problem! (AGAIN!)

Many mistakes!

if Σ_{i=1}^{d} w_i x_i > t then “player” else “dancer”
Slide 43
Multilayer Perceptron
(diagram: inputs x1–x5 feeding into a layer of perceptrons)
Slide 44
MLP decision boundary – nonlinear problems, solved!
Slide 45
MLP decision boundary – nonlinear problems, solved!
(plot: weight vs. height, with a nonlinear decision boundary)
Slide 46
Neural Networks - summary
Perceptrons are a (simple) emulation of a neuron.
Layering perceptrons gives you… a multilayer perceptron. An MLP is one type of neural network – there are others.
An MLP with sigmoid activation functions can solve highly nonlinear problems.
Downside – we cannot use the simple perceptron learning algorithm.
Instead we have the backpropagation algorithm.
This is outside the scope of this introductory course.