Neural Group of the Department of Computers, FEL CVUT, Prague
Artificial Neural Networks: The Introduction
Computational Intelligence Group
Department of Computer Science and Engineering
Faculty of Electrical Engineering
Czech Technical University in Prague
A4M33BIA 2011
Jan Drchal, [email protected], http://cig.felk.cvut.cz
Outline
● What are Artificial Neural Networks (ANNs)?
● Inspiration in Biology.
● Neuron models and learning algorithms:
– MCP neuron,
– Rosenblatt's perceptron,
– MP-Perceptron,
– ADALINE/MADALINE.
● Linear separability.
Information Resources
● https://cw.felk.cvut.cz/doku.php/courses/a4m33bia/start
● M. Šnorek: Neuronové sítě a neuropočítače. Vydavatelství ČVUT, Praha, 2002.
● Courseware: http://neuron.felk.cvut.cz/courseware/
● R. Rojas: Neural Networks - A Systematic Introduction, Springer-Verlag, Berlin, New-York, 1996, http://www.inf.fu-berlin.de/inst/ag-ki/rojas_home/pmwiki/pmwiki.php?n=Books.NeuralNetworksBook
● J. Šíma, R. Neruda: Theoretical Issues of Neural Networks, MATFYZPRESS, Prague, 1996, http://www2.cs.cas.cz/~sima/kniha.html
● This presentation is based on materials for 36NAN and 336VD.
What Are Artificial Neural Networks (ANNs)?
● Artificial information systems which imitate functions of neural systems of living organisms.
● Non-algorithmic approach to computation → learning, generalization.
● Massively-parallel processing of data using large number of simple computational units (neurons).
● Note: we are still very far from the complexity found in nature! But there have been some advances lately – we will see ;)
ANN: the Black Box
● The function of an ANN can be understood as a transformation T of an input vector X to an output vector Y:
Y = T(X)
● What transformations T can be realized by neural networks? This has been a central scientific question since the beginning of the discipline.
[Figure: black box – input vector (x1, x2, x3, …, xn) → Artificial Neural Network → output vector (y1, y2, …, ym)]
What Can Be Solved by ANNs?
● Function approximation – regression.
● Classification (pattern/sequence recognition).
● Time series prediction.
● Fitness prediction (for optimization).
● Control (e.g. robotics).
● Association.
● Filtering, clustering, compression.
ANN Learning & Recall
● ANNs most frequently work in two phases:
– Learning phase – adaptation of the ANN's internal parameters.
– Evaluation phase (recall) – use of what was learned.
Learning Paradigms
● Supervised – teaching by examples: given a set of example pairs P_i = (x_i, y_i), find a transformation T which approximates y_i = T(x_i) for all i.
● Unsupervised – self-organization, no teacher.
● Reinforcement – teaching examples are not available → they are generated by interaction with the environment (mostly control tasks).
When to Stop Learning Process?
● When a given criterion is met for all patterns of the training set:
– the ANN's error is less than...,
– stagnation for a given number of epochs.
● Training set – a subset of the example pattern set dedicated to the learning phase.
● Epoch – one application of all training set patterns.
Inspiration: Biological Neurons
Biological Neurons II
● The neural system of a 1-year-old child contains approx. 10^11–10^12 neurons (losing about 200 000 a day).
● The diameter of the soma nucleus ranges from 3 to 18 µm.
● Dendrite length is 2–3 mm; there are 10 to 100 000 dendrites per neuron.
● Axon length – can be longer than 1 m.
Synapses
● Synapses transport signals between neurons. Synapses are of two types:
– excitatory,
– inhibitory.
● Another point of view:
– chemical,
– electrical,
– others.
Neural Activity - a Reaction to Stimuli
● After a neuron fires through its axon, it is unable to fire again for about 10 ms.
● The speed of signal propagation ranges from 5 to 125 m/s.
[Figure: neuron's potential in time – when the threshold is reached the neuron fires (activity in time); when the threshold is not reached there is no activity]
Hebbian Learning Postulate
● Donald Hebb, 1949.
● “When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased.”
→ “cells that fire together, wire together”
Neuron Model: History
● Warren McCulloch (1898–1969) and Walter Pitts (1923–1969)
McCulloch, W. S. and Pitts, W. H. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5:115–133.
McCulloch-Pitts (MCP) Neuron
y = S( Σ_{i=1}^{N} w_i·x_i + Θ )

where
y – neuron's output (activity),
x_i – neuron's i-th input out of total N,
w_i – i-th synaptic weight,
S – (nonlinear) transfer (activation) function, here the Heaviside (step) function,
Θ – threshold, bias (performs a shift).

The expression in brackets is the inner potential.
They worked with binary inputs only.
No learning algorithm.

[Figure: neuron diagram – inputs x1…xn weighted by w1…wn are summed together with the threshold Θ and passed through the nonlinear transfer function S (Heaviside step) to produce the output y]

OR

y = S( Σ_{i=0}^{N} w_i·x_i ), where w_0 = Θ and x_0 = 1.
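The MCP forward pass can be sketched directly in Python. This is an illustrative implementation under the slide's binary-input convention, with hand-picked AND weights as a usage example; all names are my own:

```python
# Minimal MCP neuron sketch (illustrative; function names are assumptions).
# y = S(sum_i w_i * x_i + theta), with S the Heaviside step function.

def heaviside(s):
    """Step transfer function: 1 if the inner potential is non-negative."""
    return 1 if s >= 0 else 0

def mcp_neuron(x, w, theta):
    """Binary output of an MCP neuron with weights w and threshold/bias theta."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + theta  # inner potential
    return heaviside(s)

# Example: logical AND of two binary inputs (w = [1, 1], theta = -1.5).
print(mcp_neuron([1, 1], [1, 1], -1.5))  # 1
print(mcp_neuron([1, 0], [1, 1], -1.5))  # 0
```

With these hand-picked weights the neuron fires only when both inputs are active, i.e. it computes AND.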
Other Transfer Functions
[Figure: plots of transfer functions S(x) – a) linear, b) two-step function (parameters Θ, δ), c) sigmoid (values in (0, 1)), d) Heaviside (step) function (threshold Θ), e) semilinear (saturating at ±γ), f) hyperbolic tangent (values in (−1, +1))]
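Most of these transfer functions can be written down in a few lines. A hedged Python sketch follows; the two-step function b) is omitted because its exact parameterization is not recoverable from the figure, and the parameter names beta and gamma are assumptions:

```python
import math

# Sketch of the transfer functions from the figure (parameter names assumed).

def linear(x, beta=1.0):          # a) linear with slope beta
    return beta * x

def heaviside(x, theta=0.0):      # d) Heaviside (step) with threshold theta
    return 1.0 if x >= theta else 0.0

def sigmoid(x):                   # c) sigmoid, output in (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def semilinear(x, gamma=1.0):     # e) semilinear: linear, clipped to [-gamma, +gamma]
    return max(-gamma, min(gamma, x))

def tanh_fn(x):                   # f) hyperbolic tangent, output in (-1, +1)
    return math.tanh(x)
```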
MCP Neuron Significance
● The basic type of neuron.
● Most ANNs are based on perceptron-type neurons.
● Note: there are other approaches (we will see later...)
Rosenblatt's Perceptron (1957)
● Frank Rosenblatt – creator of the first neurocomputer: MARK I (1960).
● Learning algorithm for MCP neurons.
● Classification of letters.
Rosenblatt's Perceptron Network
[Figure: input pattern → distribution layer → feature-detecting daemons (not all connected; initially randomized, constant weights) → perceptrons (fully connected, learning) → outputs for pattern A and pattern B]
Rosenblatt = McCulloch-Pitts + learning algorithm
Rosenblatt - Learning Algorithm
● Randomize weights.
● If the output is correct, no weight is changed.
● Desired output was 1, but it is 0 → increment the weights of the active inputs.
● Desired output was 0, but it is 1 → decrement the weights of the active inputs.
● Three options for the amplitude of the weight change:
– increments/decrements of fixed value only,
– increments/decrements based on the error size: it is advantageous to make them larger for larger errors, but this may lead to instability,
– a combination of fixed and error-size-based increments/decrements.
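One step of this fixed-increment rule can be sketched in Python. This is a minimal illustration, not Rosenblatt's original implementation; treating the threshold as a bias updated alongside the weights is my assumption:

```python
# Sketch of a fixed-increment perceptron update for one training pattern
# (binary inputs/outputs; names and the bias-update convention are my own).

def rosenblatt_step(w, theta, x, desired, delta=0.1):
    """Return updated (w, theta) after presenting one pattern."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + theta
    y = 1 if s >= 0 else 0
    if y == desired:
        return w, theta                                    # correct -> no change
    if desired == 1:                                       # output 0, should be 1
        w = [wi + delta * xi for wi, xi in zip(w, x)]      # increment active inputs
        theta += delta
    else:                                                  # output 1, should be 0
        w = [wi - delta * xi for wi, xi in zip(w, x)]      # decrement active inputs
        theta -= delta
    return w, theta
```

Because the inputs are binary, multiplying the fixed step `delta` by x_i changes only the weights of the active inputs, exactly as the bullet list above prescribes.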
Geometric Interpretation of Perceptron
● Let's have a look at the inner potential (in 2D):

w1·x1 + w2·x2 + Θ = 0
x2 = −(w1/w2)·x1 − Θ/w2

A line with slope −w1/w2 and shift −Θ/w2.
● Linear dividing (hyper)plane: the decision boundary.
● What will the transfer function change?
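The slope/intercept reading of the boundary can be checked numerically. A small sketch (the function name is my own; it assumes w2 ≠ 0):

```python
# The decision boundary of a 2-input perceptron, w1*x1 + w2*x2 + theta = 0,
# rewritten as x2 = -(w1/w2)*x1 - theta/w2 (requires w2 != 0).

def decision_line(w1, w2, theta):
    """Return (slope, intercept) of the separating line in the (x1, x2) plane."""
    slope = -w1 / w2
    intercept = -theta / w2
    return slope, intercept

# Example: w1 = 1, w2 = 1, theta = -1.5 gives x2 = -x1 + 1.5.
print(decision_line(1.0, 1.0, -1.5))  # (-1.0, 1.5)
```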
MP-Perceptron
● MP = Marvin Minsky and Seymour Papert, MIT Research Laboratory of Electronics. In 1969 they published the book Perceptrons.
MP-Perceptron
[Figure: MP-perceptron – input vector x1…xn with weights w1…wn (plus bias weight w0), a summation unit produces the inner potential s, a step function produces the binary output y ∈ {+1, −1}; the error (target output value d minus y) drives the perceptron learning algorithm]
MP-Perceptron: Learning
For each example pattern, modify the perceptron weights:

w_i(t+1) = w_i(t) + η·e(t)·x_i(t)

where
e(t) = d(t) − y(t) – the error,
w_i – weight of the i-th input,
x_i – value of the i-th input,
y – neuron output,
d – desired output,
η – learning rate, η ∈ ⟨0; 1).
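This update translates directly into code. A sketch of a single learning step; the bipolar step output and the bias-as-w0 convention are my assumptions:

```python
# One step of the MP-perceptron learning rule (illustrative; names my own):
# w_i(t+1) = w_i(t) + eta * e(t) * x_i(t), with e(t) = d(t) - y(t).

def perceptron_update(w, x, d, eta=0.5):
    """Update weights for one bipolar pattern x (x[0] = 1 plays the bias role)."""
    s = sum(wi * xi for wi, xi in zip(w, x))   # inner potential
    y = 1 if s >= 0 else -1                    # bipolar step output
    e = d - y                                  # error d - y
    return [wi + eta * e * xi for wi, xi in zip(w, x)]
```

When the output is already correct, e = 0 and the weights stay unchanged, matching the verbal rule on the previous slide.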
Demonstration
● Learning AND...
[Figure: AND on a 2×2 grid over x1, x2 ∈ {0(−1), 1(+1)} – white square = 0, black = 1]
0 encoded as -1, 1 as +1
-1 AND -1 = false
-1 AND +1 = false
+1 AND -1 = false
+1 AND +1 = true
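Putting the learning rule and the bipolar encoding together gives a self-contained training sketch; the learning rate, initialization and stopping choices below are mine:

```python
# Training a single perceptron on AND with the bipolar encoding from the slide
# (-1 for false, +1 for true). An illustrative sketch.

def train_and():
    data = [([-1, -1], -1), ([-1, 1], -1), ([1, -1], -1), ([1, 1], 1)]
    w = [0.0, 0.0, 0.0]                       # w[0] is the bias weight (x0 = 1)
    eta = 0.2
    for epoch in range(100):                  # an epoch = one pass over all patterns
        errors = 0
        for x, d in data:
            xs = [1] + x                      # prepend constant bias input
            s = sum(wi * xi for wi, xi in zip(w, xs))
            y = 1 if s >= 0 else -1
            if y != d:
                errors += 1
                w = [wi + eta * (d - y) * xi for wi, xi in zip(w, xs)]
        if errors == 0:                       # stop when all patterns are correct
            return w
    return w

w = train_and()
# The learned perceptron now classifies all four AND patterns correctly.
```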
Minsky-Papert’s Blunder 1/2
● They both rigorously proved that this ANN (read: the perceptron) is not suitable even for implementing a logic function as simple as XOR.
-1 XOR -1 = false
-1 XOR +1 = true
+1 XOR -1 = true
+1 XOR +1 = false
Impossible to separate the classes by a single line!
Minsky-Papert’s Blunder 2/2
● They incorrectly generalized this (correct) conclusion to all ANNs. History showed that this was responsible for more than two decades of delay in ANN research.
● Linear non-separability!
XOR Solution - Add a Layer
[Figure: two-layer solution – hidden neurons 1 and 2 both feed output neuron 3]
see http://home.agh.edu.pl/~vlsi/AI/xor_t/en/main.htm
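One concrete two-layer solution can be written out by hand. The specific weights below are hand-picked for illustration, not taken from the linked applet:

```python
# XOR with one hidden layer of two MCP-style units and one output unit.
# Weights are hand-picked: hidden unit 1 computes OR, hidden unit 2 computes
# NAND, and the output unit ANDs them together, which yields XOR.

def step(s):
    return 1 if s >= 0 else 0

def xor_net(x1, x2):
    h1 = step(x1 + x2 - 0.5)          # hidden unit 1: OR
    h2 = step(-x1 - x2 + 1.5)         # hidden unit 2: NAND
    return step(h1 + h2 - 1.5)        # output unit: AND of the two
```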
ADALINE
● Bernard Widrow, Stanford University, 1960.
● Single perceptron.
● Bipolar output: +1, −1.
● Binary, bipolar or real-valued input.
● New learning algorithm:
– LMS/delta rule.
● Application: echo cancellation in long distance communication circuits (still used).
ADALINE - AdAptive Linear Neuron
[Figure: ADALINE – input vector x1…xn with weights w1…wn and bias w0, a summation unit produces the inner potential s, a nonlinear function produces the binary output y ∈ {+1, −1}; the linear error (target output value d minus s) drives the LMS algorithm]
http://www.learnartificialneuralnetworks.com/perceptronadaline.html
ADALINE Learning Algorithm:LMS/Delta Rule
w_i(t+1) = w_i(t) + η·e(t)·x_i(t)

where
e(t) = d(t) − s(t),
w_i – weight of the i-th input,
x_i – value of the i-th input,
s – neuron's inner potential,
d – desired output,
η – learning rate, η ∈ ⟨0; 1).

Least Mean Square, also called the Widrow-Hoff delta rule.
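The same rule in Python, emphasizing that the error uses the inner potential s rather than the thresholded output. A sketch with my own names:

```python
# One LMS/delta-rule step for an ADALINE (illustrative):
# w_i(t+1) = w_i(t) + eta * e(t) * x_i(t), with e(t) = d(t) - s(t)
# -- the error uses the *inner potential* s, not the binary output.

def adaline_update(w, x, d, eta=0.1):
    s = sum(wi * xi for wi, xi in zip(w, x))   # inner potential (linear output)
    e = d - s                                  # linear error
    return [wi + eta * e * xi for wi, xi in zip(w, x)]

# Repeated updates shrink the error on a pattern:
w = [0.0, 0.0]
for _ in range(50):
    w = adaline_update(w, [1.0, 2.0], 1.0)
# s = w·x is now very close to the target d = 1.0.
```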
MP-perceptron vs. ADALINE
● Perceptron: Δw_i = η·(d − y)·x_i
● Delta rule: Δw_i = η·(d − s)·x_i
● y vs. s: output vs. inner potential.
● The delta rule can asymptotically approach the minimum error even for linearly non-separable problems; the perceptron rule cannot.
● The perceptron rule always converges to an error-free classifier for a linearly separable problem; the delta rule might not succeed!
LMS Algorithm … ?
● Goal: minimize the error E,
● where E = Σ_j e_j(t)²,
● j indexes the j-th input pattern.
● E is a function of the weights only.
● Gradient descent method.
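The gradient computation behind the delta rule can be sketched as follows; the constant factor 2 is absorbed into the learning rate η, and in incremental mode the sum over j is dropped:

```latex
E = \sum_j e_j^2, \qquad e_j = d_j - s_j = d_j - \sum_i w_i x_{ij}

\frac{\partial E}{\partial w_i}
  = \sum_j 2\, e_j \, \frac{\partial e_j}{\partial w_i}
  = -2 \sum_j e_j \, x_{ij}

\Delta w_i = -\eta' \, \frac{\partial E}{\partial w_i}
  = \eta \sum_j e_j \, x_{ij}
```

Descending this gradient one pattern at a time recovers exactly the update w_i(t+1) = w_i(t) + η·e(t)·x_i(t) from the LMS slide.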
Graphically: Gradient Descent
[Figure: error surface over the weight space – one gradient descent step moves from (w1, w2) to (w1+Δw1, w2+Δw2)]
There Are Two Ways
● Incremental mode:
– gradient descent over individual training examples.
● Batch mode:
– gradient descent over the entire training data set.
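The two modes differ only in when the update is applied; a comparative sketch in Python (names are my own):

```python
# Incremental vs. batch gradient descent for the delta rule (sketch).

def inner(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def incremental_epoch(w, data, eta):
    """Update after every single training example."""
    for x, d in data:
        e = d - inner(w, x)
        w = [wi + eta * e * xi for wi, xi in zip(w, x)]
    return w

def batch_epoch(w, data, eta):
    """Accumulate the gradient over the whole training set, then update once."""
    grad = [0.0] * len(w)
    for x, d in data:
        e = d - inner(w, x)
        grad = [gi + e * xi for gi, xi in zip(grad, x)]
    return [wi + eta * gi for wi, gi in zip(w, grad)]
```

Incremental mode already uses the partially updated weights within one epoch, so after the same pass the two variants generally end up at slightly different weight vectors.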
Multi-Class Classification?1-of-N Encoding
● Perceptron can only handle two-class outputs. What can be done?
● Answer: for N-class problems, learn N perceptrons:
– Perceptron 1 learns “Output=1” vs “Output ≠ 1”
– Perceptron 2 learns “Output=2” vs “Output ≠ 2”
– :
– Perceptron N learns “Output=N” vs “Output ≠ N”
● To classify an output for a new input, just predict with each perceptron and find out which one puts the prediction the furthest into the positive region.
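The “furthest into the positive region” decision amounts to an argmax over the per-class inner potentials; a sketch where the weight values are hand-picked purely for illustration:

```python
# 1-of-N classification sketch: one linear unit per class; predict the class
# whose unit pushes the prediction furthest into the positive region.

def inner(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def classify(weight_rows, x):
    """weight_rows[k] holds the weights of the perceptron for class k.
    Return the index of the class with the largest inner potential."""
    potentials = [inner(w, x) for w in weight_rows]
    return max(range(len(potentials)), key=lambda k: potentials[k])

# Example with three hand-picked weight vectors (illustrative only):
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
print(classify(W, [2.0, 0.5]))  # class 0 (potential 2.0 beats 0.5 and -2.5)
```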
MADALINE (Many ADALINE)
● Processes binary signals.
● Bipolar encoding.
● One hidden, one output layer.
● Supervised learning.
● Again B. Widrow.
MADALINE: Architecture
Note: We learn only weights at ADALINE inputs.
MADALINE: Linearly Non-Separable Problems
● MADALINE can learn linearly non-separable problems:
[Figure: a checkerboard-like pattern of 0s and 1s – classes that no single line can separate]
MADALINE Limitations
● MADALINE is unusable for complex tasks.
● ADALINEs learn independently; they are unable to partition the input space to form a complex decision boundary.
Next Lecture
● Multilayered Perceptron (MLP).● Backpropagation learning algorithm.