Neural Networks
Unit I & II
(of a Total of VIII units)
K Raghu Nathan
Retd Dy Controller (R&D)
Topics covered in this Unit
Biological Neural Networks
Computers & Biological Neural Networks
Models of Neuron [Artificial Neurons]
ANN Terminology
Artificial Neural Networks
Historical Development of NN Principles
Topics covered [contd]
ANN Topologies
ANN Functional Usage
Pattern Recognition Tasks
Learning in ANNs
Basic Learning Laws
Biological Neural Networks
Nervous System
Complex system of interconnected nerves
Made up of Nerve Cells called Neurons
Neurons receive & transmit information between various parts/organs of the body
Types of neurons: Sensory (Receptor) Neuron, Motor Neuron, Inter-Neuron
Transmission of signal is a complex electro-chemical process
The Biological Neuron
Biological Neuron
Cell Body [Soma]: has a Nucleus
Dendrites: fiber-like; large in number; branched structure; receive signals from other neurons
Axon: one per neuron; longer & thicker; branched at its end; transmits signals to other neurons; contains Vesicles, which hold chemical substances called neurotransmitters
Biological Neuron [contd]
Synapse [Synaptic Cleft / Synaptic Gap]: junction of axon & dendrites
Pre-synaptic neuron: the transmitting neuron
Post-synaptic neuron: the receiving neuron
The Synapse
Neuron Signals
Complex electro-chemical process
Incoming signals raise or lower the electrical potential inside the neuron
If the potential crosses a threshold, a short electrical pulse is produced
We say the neuron fires [is triggered or activated]
The pulse is sent down the axon
Electrical activity occurs inside the neurons
Chemical activity occurs at the synapses
Vesicles in the axon release chemical substances, called neurotransmitters
These are collected by the dendrites of the receiving neuron
This raises/lowers the electric potential in the receiving neuron
Neuron Signals
Each neuron receives a large number of input signals through its dendrites, from many other neurons
Sends an output signal through its axon to many other neurons
Output depends on all inputs
Cell body acts like a summing & processing device
Processing depends on the type of neuron
Characteristics of Biological NN
Robustness & Fault Tolerance
Decay of nerve cells does not seem to affect performance significantly
Flexibility
Automatically adjusts to new environment
Ability to deal with a wide variety of situations
Uncertain, Vague, Inconsistent, Noisy
Collective Computation
Massively Parallel
Distributed
Aspect            | Computer                                                                        | Biological NN
Speed             | Numeric computation: faster; Pattern tasks: slower                              | Numeric: slower; Patterns: faster
Processing        | Sequential                                                                      | Massively parallel
Size & Complexity | Less complex                                                                    | Very complex
Storage           | In memory locations; addressable; fixed capacity; new info overwrites old info | In the strengths of the interconnections; size is adaptable, to add new info
Fault Tolerance   | No                                                                              | Yes
Control Mechanism | Centralized                                                                     | Distributed
Artificial Neuron - Neuron Models
Mathematical Models of the Neuron:
McCulloch & Pitts (M&P) model
Perceptron
Adaline
Madaline
Neocognitron
McCulloch & Pitts Model
Inputs a1, a2, ..., ai, ..., an are applied through weights w1, w2, ..., wi, ..., wn
Summing part computes the activation value x = Σ ai wi + b   [b = bias]
Output part applies the output function f, giving the output signal s = f(x)
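A minimal sketch of the M&P summing-and-threshold computation described above; the function name and the particular threshold values are illustrative, not from the slides:

```python
# A minimal McCulloch & Pitts style neuron: weighted sum plus bias,
# passed through a binary threshold output function.
def mcculloch_pitts(inputs, weights, bias=0.0, threshold=0.0):
    # Activation value x = sum(a_i * w_i) + b
    x = sum(a * w for a, w in zip(inputs, weights)) + bias
    # Binary output function: s = 1 if x >= threshold, else 0
    return 1 if x >= threshold else 0

# Example: two inputs with unit weights behave like a simple AND
# when the threshold is set to 2.
print(mcculloch_pitts([1, 1], [1, 1], threshold=2))  # 1
print(mcculloch_pitts([1, 0], [1, 1], threshold=2))  # 0
```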
Output Function
Binary : s = 1 if x >= t, else s = 0   (t = threshold)
Linear : s = k x   (output proportional to the activation value)
Ramp : linear in x over a limited range, saturating to constant values outside it [shown graphically in the slides]
Sigmoid : s = 1 / (1 + e^(-x))
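A sketch of the output functions listed above; the exact ramp breakpoints (0 and 1) are an assumption, since the original slide showed the ramp only as a figure:

```python
import math

def binary(x, t=0.0):
    return 1.0 if x >= t else 0.0          # threshold output

def linear(x, k=1.0):
    return k * x                            # s = k x

def ramp(x):
    return min(max(x, 0.0), 1.0)            # clipped linear, assumed range [0, 1]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))       # s = 1 / (1 + e^-x)

for x in (-2.0, 0.5, 3.0):
    print(x, binary(x), linear(x), ramp(x), round(sigmoid(x), 3))
```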
NOR gate, using the M&P model
Inputs a1 & a2, each with weight -1; the activation values x in the table imply a bias of +1 and a binary output function with threshold 1

a1  a2   x   s
 0   0   1   1
 0   1   0   0
 1   0   0   0
 1   1  -1   0
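A small sketch of this NOR unit; the weights (-1, -1) are from the slide, while the bias of +1 and threshold of 1 are the values implied by the x and s columns of the truth table:

```python
# NOR via a single M&P-style threshold unit.
def mp_nor(a1, a2):
    x = (-1) * a1 + (-1) * a2 + 1      # activation value (assumed bias = +1)
    return 1 if x >= 1 else 0          # binary output function (assumed threshold = 1)

for a1 in (0, 1):
    for a2 in (0, 1):
        print(a1, a2, mp_nor(a1, a2))
# Expected: 0 0 -> 1, all other combinations -> 0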
Perceptron
Inputs are first processed by Association Units
Weights are adjustable, to enable Learning
Actual output is compared with desired output; the difference is the Error
The Error is used to adjust the weights, to obtain the desired output
Perceptron (contd)
Inputs a1, a2, a3 pass from the Sensory units to the Association units
The Summing unit computes x = Σ ai wi + b using the adjustable weights w1, w2, w3
The Output function gives the output s = f(x)
Perceptron (contd)
Expected (desired) output = s′
Actual output = s
Error δ = s′ − s
Weight change Δwi = η δ ai
η is the Learning Rate parameter
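A sketch of one pass of this learning rule (error = desired − actual, Δwi = η · error · ai). The bias update and the tiny training set are illustrative additions, not taken from the slides:

```python
def perceptron_step(weights, bias, inputs, desired, eta=0.1):
    x = sum(a * w for a, w in zip(inputs, weights)) + bias   # activation value
    actual = 1 if x >= 0 else 0                              # binary output
    error = desired - actual
    weights = [w + eta * error * a for w, a in zip(weights, inputs)]
    bias = bias + eta * error                                # bias treated like a weight (convention)
    return weights, bias

training_pairs = [([1, 0], 0), ([0, 1], 1)]    # hypothetical input-output pairs
w, b = [0.0, 0.0], 0.0
for epoch in range(5):
    for inputs, desired in training_pairs:
        w, b = perceptron_step(w, b, inputs, desired)
print(w, b)   # weights that separate the two training pairs
```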
Perceptron Learning
Perceptron Learning Rule
Procedure for adjusting the weights
If the weight adjustments lead to zero error, we say the procedure converges
Whether the error reduces to 0 depends on the nature of the desired input-output pairs of data
Perceptron Convergence Theorem
To determine whether the desired input-output pairs are representable [achievable]
Adaline
Adaline = Adaptive Linear Element
Similar to Perceptron; difference is :
Employs Linear Output Function (s=x)
Weight update rule minimises the mean
squared error, averaged over all inputs
Hence known as LMS (Least Mean Squared)
Error Learning Rule
Also known as : Gradient Descent Algorithm
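A sketch of the Adaline / LMS (Widrow-Hoff) update described above: the output function is linear (s = x) and the weight change is proportional to the error b − w·a, which performs gradient descent on the mean squared error. The training data and learning rate below are made-up illustrations:

```python
def adaline_epoch(weights, samples, eta=0.05):
    for inputs, target in samples:
        s = sum(a * w for a, w in zip(inputs, weights))        # linear output s = x
        error = target - s
        weights = [w + eta * error * a for w, a in zip(weights, inputs)]
    return weights

# Hypothetical training pairs; the trailing 1.0 in each input acts as a bias term.
samples = [([0.0, 0.0, 1.0], 0.0), ([0.0, 1.0, 1.0], 1.0),
           ([1.0, 0.0, 1.0], 1.0), ([1.0, 1.0, 1.0], 2.0)]
w = [0.0, 0.0, 0.0]
for _ in range(200):
    w = adaline_epoch(w, samples)
print([round(v, 3) for v in w])   # approaches [1, 1, 0] for this data
```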
Terminology
Processing Unit
Summing part, output part
Inputs, weights, bias, activation value
Output function, output signal
Interconnections: various Topologies
Operations
Activation Dynamics, Learning Laws
Update: Synchronous, Asynchronous
Artificial Neural Networks
It is possible to create models of the
biological neurons as processing units
and link them to form closely interconnected
networks
Models may be electronic / software
Such networks are called Artificial Neural
Networks [ANN]
ANN
ANNs exhibit abilities surprisingly similar
to Biological NNs
They can Learn, Recognize, Remember,
Match & Retrieve Patterns of Information
Hardware implementations of ANN are
also available nowadays
Costly but faster than software
implementation
Historical Development of ANN
1943 - McCulloch & Pitts Model of Neuron
1949 - Hebbian Learning Law
1958 - Rosenblatt's Model - Perceptron
1960 - Widrow & Hoff - Adaptive Linear Element [Adaline] & Least Mean Squared [LMS] Error Learning Law
1969 - Minsky & Papert - Multilayer Perceptron
1971 - Kohonen - Associative Memory
1971 - Willshaw - Self-Organization
1974 - Werbos - Error Backpropagation
Historical Development of ANN [contd]
1976 - Grossberg - Adaptive Resonance Theory [ART]
1980 - Fukushima - Neocognitron
1982 - Hopfield - Energy Analysis
1985 - Sejnowski - Boltzmann Machine
1987 - Hecht-Nielsen - Counterpropagation [CPN]
1988 - Kosko - Bidirectional Associative Memory [BAM]
1988 - Broomhead - Radial Basis Function [RBF]
Topology
Topology is the physical organisation of the ANN
Arrangement of the processing units, interconnections & the pattern of input & output
An ANN is made up of Layers of Neurons
All Neurons within one layer have the same activation dynamics & output function
In addition to interlayer connections, intralayer connections may also be made
Connections across the layers may be in a feed-forward or feed-back manner
Topology (contd)
One Input layer, one Output layer
Zero or more intermediate layers (usually referred to as hidden layers)
No limit on the number of layers
There can be any number of neurons in any layer; all layers need not have the same number of neurons
If there is no hidden layer, the ANN is called a single-layer network
If one or more hidden layers are present, it is called a multi-layer network
Topology (contd)
Feedforward Networks
The units are connected such that data flows only in the forward direction, i.e. from the input layer to the output layer, via successive hidden layers if any
Feedback Networks
Data flows in the forward direction, as above
In addition, the connections allow data flow from the output layer towards the input layer also
The reverse flow (feedback) is used for error correction, i.e. for adjusting the weights suitably to get the desired output, which is an essential feature of the mechanism for NN Learning
Single Layer FF Network
[Figure: input layer units connected directly to output layer units]
Multilayer Feedforward Network
[Figure: input layer, one or more hidden layers, output layer]
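A minimal sketch of how data flows forward through such a multilayer network; the layer sizes, sigmoid output function and weight values are illustrative assumptions, not taken from the slides:

```python
import math

# Forward pass: 3 inputs -> one hidden layer of 2 units -> 1 output unit.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    # Each row of `weights` holds the incoming weights of one unit in the layer.
    return [sigmoid(sum(w * a for w, a in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs):
    hidden = layer(inputs, weights=[[0.2, -0.4, 0.1], [0.5, 0.3, -0.2]],
                   biases=[0.0, 0.1])
    output = layer(hidden, weights=[[0.7, -0.6]], biases=[0.05])
    return output

print(forward([1.0, 0.0, 1.0]))
```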
Feedback Network
Neuronal Dynamics
Operation of NN governed by Neuronal
Dynamics
Dynamics of activation state
Dynamics of synaptic weights
Short term Memory (STM) modelled by
activation state of the NN
Long Term Memory (LTM) corresponds to
encoded pattern of info in synaptic weights
Applications of Artificial Neural Networks
Intelligent Control
Technical Diagnostics
Intelligent Data Analysis and Signal Processing
Advanced Robotics
Machine Vision
Image & Pattern Recognition
Intelligent Security Systems
Intelligent Medical Devices
Intelligent Expert Systems
Major Areas of Usage
Pattern Recognition Tasks
These tasks necessarily involve Learning
Memory
Information Retrieval
Patterns
Computers deal with Data
Humans deal with Patterns
Objects/Images, voices/sounds, even actions [walking etc.] have patterns
Different images, sounds & actions have different patterns
Patterns enable us to recognise, classify & identify objects & to take decisions based on such identification
Pattern Recognition Tasks
Pattern Association
Pattern Classification
Pattern Mapping
Pattern Clustering (aka Pattern Grouping)
Feature Mapping
Pattern Association
Every input pattern is associated with an output pattern, to form a pair of input-output patterns
There will be many such pairs of input-output patterns
A well-designed ANN can be trained to learn (remember) many such pairs of patterns
Whenever a pattern is input, the ANN should retrieve (output) the corresponding output pattern
Supervised Learning has to be employed [being taught]
This is purely a memory function & is called an auto-association task
Pattern Association (contd)
Desirable: even if the input pattern is incomplete or noisy [i.e. contains some errors], we should get the correct output pattern
Among the various input patterns in its memory, the ANN should select the one pattern which is closest to the test input & the corresponding output pattern should be output by the ANN
This needs content addressable memory & the process is called accretive behaviour
Example of Pattern Association task: OCR of printed characters
Pattern Classification
Objects belonging to the same class have many common features/patterns
This fact enables us to classify objects into classes & to identify new classes
Supervised Learning: the patterns for each class have to be taught to the system
Pattern classification tasks must exhibit accretive behaviour, i.e. an incomplete or noisy input should produce the output corresponding to its closest known input pattern
Examples of Pattern Classification tasks: Voice Recognition, Handwriting Recognition
Pattern Mapping
Capturing the relation between the input pattern & its corresponding output pattern
This is a generalisation task, not mere memorising
This is called interpolative behaviour
Example of Pattern Mapping task: Speech Recognition
Pattern Clustering
Identifying subsets of patterns having
similar distinctive features & grouping
them together
Sounds similar to Pattern Classification,
but is not the same
Has to employ Unsupervised Learning
Classification
Patterns for each class are input separately
That is, the system is trained to learn the patterns of one class first
Then it is taught the patterns of another class

Clustering
Patterns belonging to several groups are mixed in the set of inputs
The system has to resolve them into different groups
Feature Mapping
In several patterns, the features may not be unambiguous
They may vary over a time period
Therefore, difficult to cluster
In this case, the system learns a feature map, rather than clustering or classifying
Has to employ unsupervised learning
Example: you see a new object for the first time (never seen it before), & it has some distinct features, as well as some features common to many known classes or groups
Pattern Recognition Problem
In any pattern recognition task, we have a set of input patterns & a set of desired output patterns
Depending on the nature of the desired output patterns & the nature of the task
environment, the problem would be one of
the following three types:
Pattern Association Problem
Pattern Classification Problem
Pattern Mapping Problem
Pattern Association Problem
Problem: to design an ANN
Input-output pairs are (a1,b1), (a2,b2), (a3,b3), ..., (aL,bL)
al = (al1, al2, ..., alM) & bl = (bl1, bl2, ..., blN) are vectors of dimensions M & N respectively
The ANN should associate the input
patterns with the corresponding output
patterns
Pattern Association Problem (contd)
If al & bl are distinct, the problem is hetero-associative
If al = bl, it is auto-associative; al = bl means M = N, and the input & output patterns both refer to the same point in an N-dimensional space
Storing the association of the pairs of input & output patterns = deciding the weights in the network, by applying the operations of the network on the input pattern
Pattern Association Problem (contd)
If a given input pattern = same as what was used for training the network, the output pattern = same as what was used during training
If the input pattern is slightly different (incomplete or noisy), the output may also be different
If the actual input a = al + ε [ε = noise vector] and the output is bl [as desired], the NW is showing accretive behaviour
If the output is b = bl + δ, and δ → 0 as ε → 0, the NW is showing interpolative behaviour
Basic Functional Units
Basic functional unit = the simplest form of each of the 3 types of NN, viz. FF, FB & Combination NWs
Simplest FF NN is a single-layer NW
Simplest FB NN has N units, each
connected to all others & to itself
Simplest Combination of FF & FB NW [aka Competitive Learning NW] is a single-layer NW in which the units in the output layer have feedback connections among themselves
Types of ANN & their suitable tasks
FF NN: Pattern Association, Classification & Mapping
FB NN: Auto-Association, Pattern Storage (LTM), Pattern Environment Storage (LTM)
FF & FB (CL) NN: Pattern Storage (STM), Clustering & Feature Mapping
FF NN Pattern Association
[Figure: input patterns a1 ... a6 and output patterns b1 ... b4]
For input pattern ai, the corresponding output pattern is bi.
a5 & a6 are noisy versions of a3.
In a5 the noise is less, so it is nearest to a3; the NW outputs b3 [as desired], i.e. accretive behaviour.
In a6 the noise is more, so it is nearer to a4 than to a3; the NW may output b4.
Real-Life Example
Characters are associated with their binary codes: A → 1000001, B → 1000010, ..., Z → 1011010
Inputs are 8x8 grids of pixels of binary values, so the input pattern space is a binary 64-dimensional space.
Outputs are 7-bit binary numbers (the ASCII character codes), so the output pattern space is a binary 7-dimensional space.
Noisy versions of input patterns can occur when the values of some pixels get changed, due to noise in the transmission channel or dust/stain spots on the document being scanned.
FF NN Pattern Classification
Some of the output patterns may be identical
So, a set of input patterns may correspond to the
same output pattern
Each distinct output pattern acts as a class label
Input patterns corresponding to each class = samples of that class
In such cases, the NN has to classify the input patterns
That is: for each input pattern, the NN should identify the class [output pattern] to which it belongs
Real-Life Example
[Figure: several differently-written samples of A are all mapped to the output 1000001, and samples of B to 1000010]
CL NN Pattern Classification
Accretive behaviour
FF NN Pattern Mapping
The NN is trained with some pairs of input-output patterns, not all possible pairs
When a new input pattern is given, the NN is made to find the corresponding output pattern [though the NN was not trained with this pair]
Suppose the NN has been trained with the i/o pairs an & bn
If the new input pattern am is close to some known input pattern an, the NN tries to find an output pattern bm which is close to bn
Interpolative behaviour
Pattern Mapping Action
[Figure: input patterns a1 ... a6 and output patterns b1 ... b6]
The NN is trained with (a1,b1) to (a5,b5) only; it is not trained with the (a6,b6) pair.
a6 is closer to a3; so the NN maps it on to b6, which is closer to b3.
FB NN Pattern Association
If input patterns are identical to output
patterns, input & output spaces are
identical
The problem reduces to auto-association
Trivial; the NW merely stores the input patterns
If a noisy pattern arrives at the input, the NW outputs the same noisy pattern
Absence of accretive behaviour
FB NN Pattern Association (contd)
[Figure: each input pattern a1 ... a5 is associated with itself at the output]
FB NN Pattern Storage (LTM)
Auto-association with accretive behaviour
Input patterns are stored; stored patterns can be retrieved by a noisy/approximate input pattern also
Very useful in practice
Two possibilities:
Stored patterns = same as the input patterns; the input pattern space is continuous; the output pattern space is the fixed, finite set of patterns that are stored
Stored patterns = some transformed versions of the input patterns; the output space has the same dimensions as the input space
FB NN Pattern Storage (contd)
FB NN Pattern Environment Storage
Pattern Environment = a set of patterns +
the probabilities of their occurrence
NW is designed to recall the patterns with
lowest probability of error
More about this in Unit-VII
CL NN Pattern Storage (STM)
STM = short term memory = temporary storage
Given input [as it is or a transformed version] is
stored
As long as the same pattern is input, the stored pattern is recalled
When a new pattern is input, the stored pattern is lost & the new pattern is stored
Such a NW is of academic interest only; it is not of practical use
CL NN Pattern Clustering
Patterns are grouped, based on similarities
Input is an individual pattern; output is the pattern of the group to which the input belongs
That is: a group of approximately similar patterns is identified with one & the same cluster label & will produce the same output pattern
Two types possible:
A new input pattern, not belonging to any group, is forced into one of the existing groups (Accretive behaviour)
Or it is shown as belonging to a new group: if the input is close to some known input pattern x, the new group is close to x's group (Interpolative behaviour)
CL NN Pattern Clustering (contd)
Interpolative behaviour
CL NN Feature Mapping
Similar to clustering; difference is:
Similar inputs produce similar output [not
the same output]
The similarity of the inputs is retained at the output
No accretive behaviour; only interpolative
Output patterns are much larger [than for
clustering]
Types of Learning (contd)
Reinforcement Learning
Bridges the gap between supervised & unsupervised methods
Output is not known
System receives feedback from environment
Reward for correctness
Punishment for error
System adapts its parameters based on this
feedback
Learning Equation
Implementation of Synaptic Dynamics
Expression for updating the weights
Express the weight vector of the ith processing unit at time instant t+1, in terms of that weight vector at time instant t:
wi(t+1) = wi(t) + Δwi(t), where Δwi(t) is the change in the weight vector
Different researchers have proposed different expressions for calculating Δwi(t); these are called Learning Laws
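A sketch of this generic update, w(t+1) = w(t) + Δw(t), with one possible learning law plugged in (plain Hebbian, Δw = η s a) purely to make the loop concrete; the input vector, learning rate and bipolar output function are illustrative assumptions:

```python
def output(x):
    return 1.0 if x >= 0 else -1.0   # bipolar output function (assumed)

w = [0.0, 0.0, 0.0]
a = [1.0, 0.5, -1.0]
for t in range(3):
    s = output(sum(wi * ai for wi, ai in zip(w, a)))   # current output signal
    dw = [0.1 * s * ai for ai in a]                    # learning law: delta_w = eta * s * a
    w = [wi + di for wi, di in zip(w, dw)]             # w(t+1) = w(t) + delta_w(t)
    print(t, [round(v, 2) for v in w])
```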
Learning Laws
Hebb's Law [Hebbian Learning Law]
Perceptron Learning Law
Delta Learning Law
LMS Learning Law
Correlation Learning Law
Instar [winner-take-all] Learning Law
Outstar Learning Law
Boltzmann Learning
Stochastic Learning Algorithm
A Network designed to apply Boltzmann
Learning Rule is called Boltzmann
Machine
The neurons constitute a recurrent
structure & give binary output [+1 or -1]
corresponding to whether the neuron is on or off
Memory-based Learning
Past experiences = patterns which the NN has been trained to recognise/classify
Each experience is a pair of input & output patterns
All or most of the past experiences are stored in a large memory
Any new input pattern can be compared with
patterns stored in memory & the corresponding
output pattern can be output
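A minimal sketch of this idea; the nearest-neighbour rule and squared-Euclidean distance used here are one common instance of memory-based learning, not something the slides commit to:

```python
# Store all (input, output) experiences; answer a query with the output
# of the stored input closest to it.
def nearest_neighbour(memory, query):
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    best_input, best_output = min(memory, key=lambda pair: dist(pair[0], query))
    return best_output

memory = [((0.0, 0.0), "class-A"), ((1.0, 1.0), "class-B")]   # hypothetical experiences
print(nearest_neighbour(memory, (0.9, 0.8)))                  # -> class-B
```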
Memory-based Learning (contd)
Memory-based learning algorithms involve 2 essential ingredients:
Criteria applied to define the local neighbourhood [patterns which are similar]
Learning rule applied for training the NN
Algorithms will differ based on how these 2 ingredients are defined
Summary of Learning Laws
See Table 1.2 on page 35 of Yegnanarayana's book
Learning Law             | Weight Update Δwij Formula (for j = 1, 2, ..., M) | Initial Weights        | Type of Learning | Remarks
Hebbian                  | η si aj                                            | Near zero              | Unsupervised     |
Perceptron               | η (bi - si) aj                                     | Random                 | Supervised       | Bipolar output functions
Delta                    | η (bi - si) f'(xi) aj                              | Random                 | Supervised       |
Widrow-Hoff (LMS)        | η [bi - wiT a] aj                                  | Random                 | Supervised       |
Correlation              | η bi aj                                            | Near zero              | Supervised       |
Winner-Take-All (Instar) | η (aj - wkj)                                       | Random, but normalised | Unsupervised     | Competitive Learning; k is the winning unit
Outstar                  | η (bj - wjk)                                       | Zero                   | Supervised       | Grossberg Learning
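Sketches of a few of these weight-update formulas, written as functions that return Δw for one unit i. Symbols follow the table (a = input vector, si = actual output, bi = desired output, η = learning rate); how si is computed is left to the caller:

```python
def hebbian(a, s_i, eta=0.1):
    return [eta * s_i * a_j for a_j in a]                     # eta * si * aj

def perceptron(a, s_i, b_i, eta=0.1):
    return [eta * (b_i - s_i) * a_j for a_j in a]             # eta * (bi - si) * aj

def widrow_hoff(a, w_i, b_i, eta=0.1):
    x_i = sum(w * a_j for w, a_j in zip(w_i, a))              # wi^T a
    return [eta * (b_i - x_i) * a_j for a_j in a]             # eta * (bi - wi^T a) * aj

def instar(a, w_k, eta=0.1):
    # Winner-take-all: move the winning unit k's weights towards the input.
    return [eta * (a_j - w_kj) for a_j, w_kj in zip(a, w_k)]

print(perceptron(a=[1, 0, 1], s_i=0, b_i=1))   # -> [0.1, 0.0, 0.1]
```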
End of Units I & II