TRANSCRIPT
Artificial Neural Networks
and applications
Dr. L. Iliadis, Assist. Professor, Democritus University of Thrace, Greece
Overview
1. Definition of a Neural Network
2. The Human Brain
3. Neuron Models
4. Artificial neural networks ANN
5. Historical Notes
6. ANN Architecture
7. Learning processes – Training and Testing ANN
7.1. Backpropagation Learning
8. Well Known Applications of ANN
What is a Neural Network?
A Neural Network is a collection of units connected in some pattern to allow communication between them, and it acts as a massively distributed processor.
These units are also referred to as neurons or nodes. The Neural Network has a natural propensity for storing experiential knowledge and making it available for use.
It has two main characteristics:
Knowledge is acquired by the network from its environment through a learning process.
Interneuron connection strengths, known as synaptic weights, are used to store the acquired knowledge.
Animals are able to react adaptively to changes in their external and internal environment, and they use their nervous system to produce these behaviours. This adaptability is called plasticity.
The Role of the Synapses

[Figure: two connected neurons, with the axon, dendrites, and synapses labelled]

The synapses are responsible for the information transmission between two connected neurons.
Functional Areas of the Brain
- Primary motor: voluntary movement
- Primary somatosensory: tactile, pain, pressure, position, temperature, movement
- Motor association: coordination of complex movements
- Sensory association: processing of multisensorial information
- Prefrontal: planning, emotion, judgement
- Speech center (Broca's area): speech production and articulation
- Wernicke's area: comprehension of speech
- Auditory: hearing
- Auditory association: complex auditory processing
- Visual: low-level vision
- Visual association: higher-level vision
HUMAN BRAIN VERSUS SILICON
The human cortex has approximately 10 billion neurons and 60 trillion synapses; the net result is that the brain is an enormously efficient structure. The energetic efficiency of the brain is approximately 10^-16 joules per operation per second, whereas the corresponding value for the best computers in use today is about 10^-6 joules per operation per second. Human brain neurons (where events happen in the millisecond range, 10^-3 s) are 5 or 6 orders of magnitude slower than silicon logic gates (where events happen in the nanosecond range, 10^-9 s).
A signal x_i at the input of synapse i, connected to neuron k, is multiplied by the synaptic weight w_i.
An adder sums the input signals, weighted by the respective synapses.
An activation function limits the amplitude of the neuron's output, squashing the permissible amplitude range of the output signal to a finite value in the interval [0,1] or, alternatively, [-1,1].
ARTIFICIAL NEURON MODEL

[Figure: inputs x1, x2, ..., xn enter through synaptic weights w1, w2, ..., wn, are combined by a summing function, and pass through an activation function with bias bk to produce the output y]

Summing function:    z = Σ_{i=1}^{n} w_i x_i
Activation function: y = H(z + b_k)
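As a sketch, the neuron model above can be written in a few lines of Python (the function names and example values here are illustrative, not from the original slides):

```python
def artificial_neuron(x, w, b, H):
    """One artificial neuron: weighted sum of inputs plus bias, squashed by H."""
    z = sum(wi * xi for wi, xi in zip(w, x))  # adder: z = sum_i w_i * x_i
    return H(z + b)                           # activation limits the amplitude

def step(v):
    """Hard limiter mapping the summed signal to {-1, +1}."""
    return 1.0 if v >= 0 else -1.0

# Example: three inputs, three synaptic weights, a small bias
y = artificial_neuron([1.0, 0.5, -1.0], [0.2, 0.4, 0.1], 0.05, step)
```

A sigmoid in place of `step` would squash the output into [0,1] instead of {-1,+1}.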
Artificial Neural Networks
Artificial Neural Networks consist of interconnected elements inspired by studies of biological nervous systems. They are an attempt to create machines that work in a similar way to the human brain, using components that behave like biological neurons.
• Synaptic strengths are translated as synaptic weights;
• Excitation means positive product between the incoming spike rate and the corresponding synaptic weight;
• Inhibition means negative product between the incoming spike rate and the corresponding synaptic weight;
Neuron's Output
Nonlinear generalization of the neuron:

y = H(x, w)

where y is the neuron's output, x is the vector of inputs, and w is the vector of synaptic weights. Sigmoidal or Gaussian functions may be used:

Sigmoidal neuron: y = 1 / (1 + e^-(x^T w + a))
Gaussian neuron:  y = e^-(||x - w||^2 / (2 a^2))
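A minimal Python sketch of the two activation choices (hypothetical helper names; following the formulas above, the parameter a plays the role of a bias in the sigmoidal case and of a width in the Gaussian case):

```python
import math

def sigmoidal_neuron(x, w, a):
    """y = 1 / (1 + exp(-(x^T w + a)))"""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-(z + a)))

def gaussian_neuron(x, w, a):
    """y = exp(-||x - w||^2 / (2 a^2))"""
    sq_dist = sum((xi - wi) ** 2 for xi, wi in zip(x, w))
    return math.exp(-sq_dist / (2.0 * a ** 2))
```

The sigmoidal neuron responds to which side of a hyperplane the input lies on; the Gaussian neuron responds to how close the input is to the weight vector.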
Software or Hardware?
Although ANN can be implemented as fast hardware devices, much research has been performed using conventional computers running software simulations. Software simulations provide a comparatively cheap and flexible environment in which to research ideas, and for many real-world applications their performance is adequate. For example, an ANN software package might be used to develop a system for credit scoring of an individual who applies for a bank loan.
Historical Notes
Ramón y Cajal in 1911 introduced the idea of neurons as structural constituents of the brain. The origins of ANN go back to the 1940s, when McCulloch and Pitts published the first mathematical model of a biological neuron. Research on ANN then stalled for more than 20 years. In the mid-1980s a huge interest in ANN emerged, due to the publication of the book "Parallel Distributed Processing" by Rumelhart and McClelland. ANN made a great comeback in the 1990s, and they are now widely accepted as a tool in the development of intelligent systems.
Architecture of Artificial Neural Networks

[Figure: architectural graph of a multilayer feedforward ANN with an Input Layer, a single Hidden Layer, and an Output Layer]

The way that the artificial neurons are linked together to compose an ANN may vary according to its architecture. The architectural graph illustrates the layout of a multilayer feedforward ANN (data flows in one direction only) in the case of a single hidden layer. The hidden layer is where the processing takes place.
Supervised Learning
Learning = learning by adaptation. For example: animals learn that green fruits are sour and yellowish/reddish ones are sweet; the learning happens by adapting the fruit-picking behaviour.

Learning can be perceived as an optimisation process. When an ANN is in its SUPERVISED training (learning) phase, there are three factors to be considered:
1. The inputs applied are chosen from a training set, where the desired response of the system to these inputs is known.
2. The actual output produced when an input pattern is applied is compared to the desired output, and an error is estimated.
3. Learning occurs by changing the synaptic strengths (changing the weights), eliminating some synapses, and building new ones.
PERCEPTRON
The Perceptron is one of the earliest ANN; it is built around a nonlinear neuron, namely the McCulloch-Pitts neuron model. It produces an output equal to +1 if the hard limiter input is positive, and -1 if it is negative.
Learning with a Perceptron

Perceptron: y_out = w^T x

Input data: (x^1, y^1), (x^2, y^2), ..., (x^N, y^N)

Error: E(t) = (y(t) - y_out(t))^2 = (y^t - w(t)^T x^t)^2

Learning (weight adjustment):

w_i(t+1) = w_i(t) - c * ∂E(t)/∂w_i
∂E(t)/∂w_i = -2 (y^t - w(t)^T x^t) x_i^t
⇒ w_i(t+1) = w_i(t) + c (y^t - w(t)^T x^t) x_i^t   (absorbing the factor 2 into c)

where w(t)^T x^t = Σ_{j=1}^{m} w_j(t) x_j^t and c is the learning rate.
The synapse strength modification rules for artificial neural networks
can be derived by applying mathematical optimisation methods
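The weight-adjustment rule above can be sketched in Python. The toy data set, learning rate, and epoch count below are invented for illustration:

```python
def train_perceptron(data, w, c=0.1, epochs=10):
    """Delta-rule weight adjustment: w_i <- w_i + c * (y - w^T x) * x_i."""
    for _ in range(epochs):
        for x, y in data:
            y_out = sum(wi * xi for wi, xi in zip(w, x))  # y_out = w^T x
            w = [wi + c * (y - y_out) * xi for wi, xi in zip(w, x)]
    return w

# Toy samples consistent with y = 2*x1 - x2 (hypothetical data)
data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
w = train_perceptron(data, [0.0, 0.0], c=0.2, epochs=50)
```

With a small enough learning rate and consistent data, the weights converge to the generating coefficients, here close to [2, -1].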
Learning with MLP ANN

MLP (Multi-Layer Perceptron) ANN with p layers:

Data: (x^1, y^1), (x^2, y^2), ..., (x^N, y^N)

Error: E(t) = (y(t) - y_out(t))^2 = (y^t - F(x^t; W))^2

The output y_out = F(x; W) is built layer by layer (x → 1 → 2 → ... → p-1 → p → y_out):

Layer 1: y^1_k = 1 / (1 + e^-(w^{1,T}_k x + a^1_k)),   k = 1, ..., M^1;   y^1 = (y^1_1, ..., y^1_{M^1})
Layer 2: y^2_k = 1 / (1 + e^-(w^{2,T}_k y^1 + a^2_k)),  k = 1, ..., M^2;   y^2 = (y^2_1, ..., y^2_{M^2})
...
Output layer: y_out = F(x; W) = w^{p,T} y^{p-1}

Direct calculation of the weight changes is too complicated.
Backpropagation Learning
It was developed by Werbos, but Rumelhart et al. in 1986 gave a new lease of life to ANN. The weight adaptation rule is known as Backpropagation.
• It defines 2 sweeps of the ANN. First it performs a forward sweep from the input layer to the output layer, computing the activations of all neurons.
• Then it performs a backward sweep from the output layer to the input layer. It calculates the weight changes first for the synaptic weights of the output neurons, then continues backward from layer p-1, propagating the local error terms backward.
• The backward sweep is similar to the forward one, except that error values are propagated back through the ANN to determine how the weights are to be changed during training.
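The two sweeps can be sketched for a single-hidden-layer network with a sigmoidal hidden layer and a linear output. All names and numbers are illustrative, and a squared-error measure ½(y - d)^2 is assumed:

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def forward(x, W1, b1, w2, b2):
    """Forward sweep: compute hidden activations and the scalar output."""
    h = [sigmoid(sum(W1[j][i] * x[i] for i in range(len(x))) + b1[j])
         for j in range(len(W1))]
    y = sum(w2[j] * h[j] for j in range(len(h))) + b2
    return h, y

def backward(x, d, h, y, W1, w2):
    """Backward sweep: propagate local error terms from the output back."""
    delta_out = y - d                           # output-layer error term
    gw2 = [delta_out * hj for hj in h]          # output weight gradients
    gb2 = delta_out
    gW1, gb1 = [], []
    for j in range(len(W1)):
        # hidden error term = back-propagated output error * sigmoid derivative
        delta_h = delta_out * w2[j] * h[j] * (1.0 - h[j])
        gW1.append([delta_h * xi for xi in x])
        gb1.append(delta_h)
    return gW1, gb1, gw2, gb2

def train_step(x, d, W1, b1, w2, b2, lr=0.5):
    """One forward + backward sweep followed by a gradient-descent update."""
    h, y = forward(x, W1, b1, w2, b2)
    gW1, gb1, gw2, gb2 = backward(x, d, h, y, W1, w2)
    W1 = [[wij - lr * gij for wij, gij in zip(wj, gj)]
          for wj, gj in zip(W1, gW1)]
    b1 = [bj - lr * gj for bj, gj in zip(b1, gb1)]
    w2 = [wj - lr * gj for wj, gj in zip(w2, gw2)]
    b2 = b2 - lr * gb2
    return W1, b1, w2, b2
```

Each call to `train_step` moves the weights downhill on the error surface, so the squared error for the presented pattern shrinks.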
EXTDBD Learning Rule
The Extended Delta Bar Delta (EXTDBD) is a heuristic technique that has been used successfully in a wide range of applications; its main characteristic is that it uses a term called momentum. More specifically, a term is added to the standard weight change which is proportional to the previous weight change. In this way good general trends are reinforced and oscillations are damped.
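The momentum idea can be sketched as below. Note that this shows only the momentum term described above, not the full EXTDBD rule, which additionally adapts a separate learning rate per weight; the names and constants are illustrative:

```python
def momentum_update(w, grad, prev_change, c=0.1, mu=0.9):
    """Weight change = standard gradient step plus a term proportional
    to the previous weight change (momentum coefficient mu)."""
    change = [-c * g + mu * p for g, p in zip(grad, prev_change)]
    w_new = [wi + ch for wi, ch in zip(w, change)]
    return w_new, change
```

When successive gradients point the same way, the momentum term accumulates and speeds learning; when they alternate sign, it averages out and damps the oscillation.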
EVALUATION INSTRUMENTS
The RMS Error
The RMS error adds up the squares of the errors for each PE (processing element) in the output layer, divides by the number of PEs in the output layer to obtain an average, and then takes the square root of that average, hence the name "root mean square".
Another instrument is the Common Mean Correlation (CMC) coefficient of the desired (d) and the actual (predicted) output (y) across the epoch. The CMC is calculated by

CMC = Σ_i (d_i - d̄)(y_i - ȳ) / sqrt( Σ_i (d_i - d̄)^2 * Σ_i (y_i - ȳ)^2 )

where d̄ = (1/E) Σ_i d_i and ȳ = (1/E) Σ_i y_i.

It should be clarified that d stands for the desired values, y for the predicted values, and i ranges from 1 to n (the number of cases in the training data set). E is the epoch size, i.e. the number of sets of training data presented to the ANN between weight updates.
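Both instruments can be sketched directly from their formulas (the function names are hypothetical; the CMC here is computed in the usual Pearson-correlation form):

```python
import math

def rms_error(desired, actual):
    """Root-mean-square error: sqrt of the mean of the squared errors."""
    n = len(desired)
    return math.sqrt(sum((d - y) ** 2 for d, y in zip(desired, actual)) / n)

def cmc(desired, actual):
    """Correlation coefficient of desired (d) and predicted (y) values."""
    n = len(desired)
    d_bar = sum(desired) / n
    y_bar = sum(actual) / n
    num = sum((d - d_bar) * (y - y_bar) for d, y in zip(desired, actual))
    den = math.sqrt(sum((d - d_bar) ** 2 for d in desired) *
                    sum((y - y_bar) ** 2 for y in actual))
    return num / den
```

A CMC near +1 indicates the predictions track the desired values closely; values near 0 indicate no linear relationship.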
TESTING AND OVERTRAINING
Over-training is a very serious problem!!!
Testing is the process that actually determines the strength of the ANN and its ability to generalize. The performance of an ANN is critically dependent on the training data, which must be representative of the task to learn (Callan, 1999). For this purpose, in the testing phase we randomly choose a large set of actual cases (records) that were not applied in the training phase.
New methods for learning with neural networks
Bayesian learning:
the distribution of the neural network parameters is learnt
Support vector learning:
the minimal representative subset of the available data is used to calculate the synaptic weights of the neurons
Tasks Performed by Artificial neural networks
The following tasks are usually performed by ANN
• Controlling the movements of a robot based on self-perception and other information (e.g., visual information)
• Decision making
• Pattern Recognition (e.g. recognizing a visual object, a familiar face)
ANN tasks
• Control
• Classification
• Prediction
• Approximation
These can be reformulated in general as
FUNCTION APPROXIMATION
tasks.
With the term Approximation we mean: given a set of values of a function g(x) build a neural network that approximates the g(x) values for any input x.
Learning to approximate
Error measure:

E = (1/N) Σ_{t=1}^{N} (F(x^t; W) - y^t)^2

Rule for changing the synaptic weights:

Δw_ji = -c * ∂E/∂w_ji
w_ji^new = w_ji + Δw_ji
Where c is the learning parameter (usually a constant)
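As a sketch, the error measure and weight-change rule can be applied to a toy linear model F(x; w) = w·x approximating the hypothetical target g(x) = 3x:

```python
def approx_error(w, data):
    """E = (1/N) * sum_t (F(x^t; w) - y^t)^2 for the linear model F(x; w) = w*x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def gradient_step(w, data, c=0.1):
    """Delta w = -c * dE/dw ; w_new = w + Delta w."""
    grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
    return w + (-c * grad)

# Samples of the hypothetical target g(x) = 3x
data = [(x, 3.0 * x) for x in (-1.0, 0.5, 2.0)]
w = 0.0
for _ in range(100):
    w = gradient_step(w, data, c=0.1)
```

Repeated application of the rule drives E toward its minimum; here w approaches the target coefficient 3.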
Summary
• Artificial neural networks are inspired by the learning processes that take place in biological systems.
• Artificial neurons and neural networks try to imitate the working mechanisms of their biological counterparts.
• Learning can be perceived as an optimisation process.
• Biological neural learning happens by the modification of the synaptic strength. Artificial neural networks learn in the same way.
• The synapse strength modification rules for artificial neural networks can be derived by applying mathematical optimisation methods.
Summary
• Learning tasks of artificial neural networks can be reformulated as function approximation tasks.
• Neural networks can be considered as nonlinear function approximating tools (i.e., linear combinations of nonlinear basis functions), where the parameters of the networks should be found by applying optimisation methods.
• The optimisation is done with respect to the approximation error measure.
• In general it is enough to have a single hidden layer neural network (MLP, RBF or other) to learn the approximation of a nonlinear function. In such cases general optimisation can be applied to find the change rules for the synaptic weights.