
Neuro-Fuzzy Comp. — Ch. 1 May 25, 2005

1 Basic concepts of Neural Networks and Fuzzy Logic Systems

Inspirations based on course material by Professors Heikki Koivo http://www.control.hut.fi/Kurssit/AS-74.115/Material/ and David S. Touretzky http://www-2.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15782-s04/slides/ are duly acknowledged.

Neural networks and Fuzzy Logic Systems are often considered as part of the Soft Computing area:


Figure 1–1: Soft computing as a composition of fuzzy logic, neural networks and probabilistic reasoning.

Intersections include

• neuro-fuzzy systems and techniques,

• probabilistic approaches to neural networks (especially classification networks) and fuzzy logic systems,

• and Bayesian reasoning.


Neuro-Fuzzy systems

• We may say that neural networks and fuzzy systems try to emulate the operation of the human brain.

• Neural networks concentrate on the structure of the human brain, i.e., on the “hardware” emulating the basic functions, whereas fuzzy logic systems concentrate on the “software”, emulating fuzzy and symbolic reasoning.

Neuro-Fuzzy approach has a number of different connotations:

• The term “Neuro-Fuzzy” can be associated with hybrid systems which act on two distinct subproblems: a neural network is utilized in the first subproblem (e.g., in signal processing) and a fuzzy logic system is utilized in the second subproblem (e.g., in a reasoning task).

• We will also consider systems where both methods are closely coupled, as in the Adaptive Neuro-Fuzzy Inference System (ANFIS).

Neural Network and Fuzzy System research is divided into two basic schools:

• Modelling various aspects of human brain (structure, reasoning, learning, perception, etc.)

• Modelling artificial systems and related data: pattern clustering and recognition, function approximation, system parameter estimation, etc.


Neural network research in particular can be divided into

• Computational Neuroscience

– Understanding and modelling operations of single neurons or small neuronal circuits, e.g. minicolumns.

– Modelling information processing in actual brain systems, e.g. auditory tract.

– Modelling human perception and cognition.

• Artificial Neural Networks used in

– Pattern recognition

– Adaptive control

– Time series prediction, etc.

Areas contributing to Artificial neural networks:

• Statistical Pattern recognition

• Computational Learning Theory

• Computational Neuroscience

• Dynamical systems theory

• Nonlinear optimisation


We can say that in general

• Neural networks and fuzzy logic systems are parameterised computational nonlinear algorithms for numerical processing of data (signals, images, stimuli).

• These algorithms can be either implemented on a general-purpose computer or built into dedicated hardware.

• Knowledge is acquired by the network/system through a learning process.

• The acquired knowledge is stored in internal parameters (weights).

[Diagram: an Unknown System, described by parameters, produces data and receives control signals; a Neural Network/Fuzzy System processes the data to extract its essential aspects, or the parameters of the unknown system model]

Figure 1–2: System as a data generator


1.1 Biological Fundamentals of Neural Networks

A typical neuron, or nerve cell, has the following structure:

from Kandel, Schwartz and Jessel, Principles of Neural Science

• Most neurons in the vertebrate nervous system have several main features in common.

• The cell body contains the nucleus, the storehouse of genetic information, and gives rise to two types of cell processes, axons and dendrites.

• Axons, the transmitting element of neurons, can vary greatly in length; some can extend more than 3 m within the body. Most axons in the central nervous system are very thin (0.2…20 µm in diameter) compared with the diameter of the cell body (50 µm or more).

• Many axons are insulated by a fatty sheath of myelin that is interrupted at regular intervals by the nodes of Ranvier.

• The action potential, the cell’s conducting signal, is initiated either at the axon hillock, the initial segment of the axon, or in some cases slightly farther down the axon at the first node of Ranvier.

• Branches of the axon of one neuron (the presynaptic neuron) transmit signals to another neuron (the postsynaptic cell) at a site called the synapse.

• The branches of a single axon may form synapses with as many as 1000 other neurons.

• Whereas the axon is the output element of the neuron, the dendrites (apical and basal) are input elements of the neuron. Together with the cell body, they receive synaptic contacts from other neurons.


Simplified functions of these “building blocks” of a neuron, which are very complex in their nature, are as follows:

• The synapses are elementary signal processing devices.

– A synapse is a biochemical device which converts a pre-synaptic electrical signal into a chemical signal and then back into a post-synaptic electrical signal.

– The input pulse train has its amplitude modified by parameters stored in the synapse. The nature of this modification depends on the type of the synapse, which can be either inhibitory or excitatory.

• The postsynaptic signals are aggregated and transferred along the dendrites to the nerve cell body.

• The cell body generates the output neuronal signal, the activation potential, which is transferred along the axon to the synaptic terminals of other neurons.

• The frequency of firing of a neuron is proportional to the total synaptic activities and is controlled by the synaptic parameters (weights).

• The pyramidal cell can receive $10^4$ synaptic inputs and it can fan out the output signal to thousands of target cells — a connectivity difficult to achieve in artificial neural networks.


Another microscopic view of a typical neuron of the mammalian cortex (a pyramidal cell):

• Note the cell body or soma, dendrites, synapses and the axon.

• Neuro-transmitters (information-carrying chemicals) are released pre-synaptically, float across the synaptic cleft, and activate receptors postsynaptically.

• According to Cajal’s “neuron doctrine”, information-carrying signals come into the dendrites through synapses, travel to the cell body, and activate the axon. Axonal signals are then supplied to synapses of other neurons.

[Plot: injected current and action potential — action potential [mV] versus time [msec]]


1.2 A simplistic model of a biological neuron

[Diagram: input signals $x_i$ arrive at synapses along a dendrite; post-synaptic signals are aggregated and travel to the cell body (soma); the axon hillock generates the activation potential, and the output signal $y_j$ travels along the axon]

Figure 1–3: Conceptual structure of a biological neuron

Basic characteristics of a biological neuron:

• data is coded in the form of the instantaneous frequency of pulses

• synapses are either excitatory or inhibitory

• signals are aggregated (“summed”) as they travel along dendritic trees

• the cell body (neuron output) generates the output pulse train of an average frequency proportional to the total (aggregated) post-synaptic activity (activation potential).


Levels of organization of the nervous system

from T.P.Trappenberg, Fundamentals of Computational Neuroscience, Oxford University Press, 2002


A few good words about the brain
Adapted from S. Haykin, Neural Networks – a Comprehensive Foundation, Prentice Hall, 2nd ed., 1999

The brain is a highly complex, non-linear, parallel information processing system. It performs tasks like pattern recognition, perception and motor control many times faster than the fastest digital computers.

• Biological neurons, the basic building blocks of the brain, are slower than silicon logic gates. The neurons operate in milliseconds, which is about six to seven orders of magnitude slower than the silicon gates operating in the sub-nanosecond range.

• The brain makes up for the slow rate of operation with two factors:

– a huge number of nerve cells (neurons) and interconnections between them. The number of neurons is estimated to be in the range of $10^{10}$, with $60 \cdot 10^{12}$ synapses (interconnections).

– A function of a biological neuron seems to be much more complex than that of a logic gate.

• The brain is very energy efficient. It consumes only about $10^{-16}$ joules per operation per second, compared with $10^{-6}$ J/(oper·sec) for a digital computer.

Brain plasticity:

• At the early stage of the human brain development (the first two years from birth) about 1 million synapses (hard-wired connections) are formed per second.

• Synapses are then modified through the learning process (plasticity of a neuron).

• In an adult brain, plasticity may be accounted for by the following two mechanisms: creation of new synaptic connections between neurons, and modification of existing synapses.


What can you do with Artificial Neural Networks (ANN)
Adapted from Neural Nets FAQ: ftp://ftp.sas.com/pub/neural/FAQ.html

• In principle, NNs can compute any computable function, i.e., they can do everything a normal digital computer can do.

• In practice, NNs are especially useful for classification and function approximation/mapping problems which are tolerant of some imprecision, which have lots of training data available, but to which hard and fast rules (such as those that might be used in an expert system) cannot easily be applied.

• Almost any mapping between vector spaces can be approximated to arbitrary precision by feedforward NNs (which are the type most often used in practical applications) if you have enough data and enough computing resources.

• To be somewhat more precise, feedforward networks with a single hidden layer, under certain practically-satisfiable assumptions, are statistically consistent estimators of, among others, arbitrary measurable, square-integrable regression functions, and binary classification devices.

• NNs are, at least today, difficult to apply successfully to problems that concern manipulation of symbols and memory. And there are no methods for training NNs that can magically create information that is not contained in the training data.


Who is concerned with NNs?

• Computer scientists want to find out about the properties of non-symbolic information processing with neural nets and about learning systems in general.

• Statisticians use neural nets as flexible, nonlinear regression and classification models.

• Engineers of many kinds exploit the capabilities of neural networks in many areas, such as signal processing and automatic control.

• Cognitive scientists view neural networks as a possible apparatus to describe models of thinking and consciousness (high-level brain function).

• Neuro-physiologists use neural networks to describe and explore medium-level brain function (e.g. memory, sensory system, motorics).

• Physicists use neural networks to model phenomena in statistical mechanics and for a lot of other tasks.

• Biologists use Neural Networks to interpret nucleotide sequences.

• Philosophers and some other people may also be interested in Neural Networks for various reasons.


1.3 Taxonomy of neural networks

• From the point of view of their active or decoding phase, artificial neural networks can be classified into feedforward (static) and feedback (dynamic, recurrent) systems.

• From the point of view of their learning or encoding phase, artificial neural networks can be classified into supervised and unsupervised systems.

• Practical terminology often mixes up the above two aspects of neural nets.

Feedforward supervised networks

These networks are typically used for function approximation tasks. Specific examples include:

• Linear recursive least-mean-square (LMS) networks

• Multi-Layer Perceptron (MLP), also known as Backpropagation networks

• Radial Basis networks


Feedforward unsupervised networks

These networks are used to extract important properties of the input data and to map input data into a “representation” domain. Two basic groups of methods belong to this category:

• Hebbian networks performing the Principal Component Analysis of the input data, also known as the Karhunen-Loève Transform.

• Competitive networks used to perform Learning Vector Quantization, or tessellation/clustering of the input data set.

• Self-Organizing Kohonen Feature Maps also belong to this group.

Recurrent networks

These networks are used to learn or process the temporal features of the input data, and their internal state evolves with time. Specific examples include:

• Hopfield networks

• Associative Memories

• Adaptive Resonance networks


1.4 Models of artificial neurons

Three basic graphical representations of a single p-input (p-synapse) neuron:

[Figure: three graphical representations of a single neuron — a. dendritic representation (p synapses along a dendrite aggregating into the activation potential v), b. block-diagram representation, c. signal-flow representation (an input layer of nodes connected through weighted branches to a summing node in the output layer)]

$$v = w_1 x_1 + \cdots + w_p x_p = \mathbf{w} \cdot \mathbf{x} \;, \qquad y = \sigma(v)$$

Artificial neural networks are nonlinear information (signal) processing devices which are built from interconnected elementary processing devices called neurons.

An artificial neuron is a p-input single-output signal processing element which can be thought of as a simple model of a non-branching biological neuron.

From a dendritic representation of a single neuron we can identify p synapses arranged along a linear dendrite which aggregates the synaptic activities, and a neuron body or axon-hillock generating an output signal.

The pre-synaptic activities are represented by a p-element column vector of input (afferent) signals

$$\mathbf{x} = [x_1 \; \ldots \; x_p]^T$$

In other words the space of input patterns is p-dimensional.

Synapses are characterised by adjustable parameters called weights or synaptic strength parameters. The weights are arranged in a p-element row vector:

$$\mathbf{w} = [w_1 \; \ldots \; w_p]$$


• In a signal-flow representation of a neuron, p synapses are arranged in a layer of input nodes. A dendrite is replaced by a single summing node. Weights are now attributed to branches (connections) between input nodes and the summing node.

• Passing through synapses and a dendrite (or a summing node), input signals are aggregated (combined) into the activation potential, which describes the total post-synaptic activity.

• The activation potential is formed as a linear combination of input signals and synaptic strength parameters, that is, as an inner product of the weight and input vectors:

$$v = \sum_{i=1}^{p} w_i x_i = \mathbf{w} \cdot \mathbf{x} = \begin{bmatrix} w_1 & w_2 & \cdots & w_p \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_p \end{bmatrix} \qquad (1.1)$$

• Subsequently, the activation potential (the total post-synaptic activity) is passed through an activation function, σ(·), which generates the output (efferent) signal:

$$y = \sigma(v) \qquad (1.2)$$

• The activation function is typically a saturating function which normalises the total post-synaptic activity to the standard values of the output (axonal) signal.

• The block-diagram representation encapsulates the basic operations of an artificial neuron, namely, aggregation of pre-synaptic activities, eqn (1.1), and generation of the output signal, eqn (1.2).
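In code, eqns (1.1) and (1.2) amount to an inner product followed by a scalar nonlinearity. A minimal sketch in Python/NumPy — the unipolar sigmoid chosen as σ and all weight and input values are illustrative assumptions, not part of the notes:

```python
import numpy as np

def neuron(w, x, sigma):
    """Single p-input neuron: v = w . x (eqn 1.1), y = sigma(v) (eqn 1.2)."""
    v = np.dot(w, x)            # activation potential: inner product of w and x
    return sigma(v)             # output signal through the activation function

# Illustrative values: p = 3 inputs, unipolar sigmoid as sigma
sigma = lambda v: 1.0 / (1.0 + np.exp(-v))
w = np.array([0.5, -1.0, 0.25])     # row vector of synaptic weights
x = np.array([1.0, 0.5, -2.0])      # column vector of input (afferent) signals
print(neuron(w, x, sigma))          # a single output signal y
```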


A single synapse in a dendritic representation of a neuron can be represented by the following block diagram:

[Block diagram: the pre-synaptic signal $x_i$ is multiplied by the stored weight $w_i$ and added to the incoming dendritic signal: $v_i = v_{i-1} + w_i \cdot x_i$ (signal aggregation)]

In the synapse model we can identify:

• a storage for the synaptic weight,

• augmentation (multiplication) of the pre-synaptic signal with the weight parameter, and

• the dendritic aggregation of the post-synaptic activities.

A synapse is classified as

• excitatory, if a corresponding weight is positive, wi > 0, and as

• inhibitory, if a corresponding weight is negative, wi < 0.


It is sometimes convenient to add an additional parameter called the threshold, θ, or bias, b = −θ. This can be done by fixing one input signal to be constant. Then we have

$$x_p = +1 \;, \quad \text{and} \quad w_p = b = -\theta$$

With this addition, the activation potential is calculated as:

$$\hat{v} = \sum_{i=1}^{p} w_i x_i = v - \theta \;, \qquad v = \sum_{i=1}^{p-1} w_i x_i$$

where $\hat{v}$ is the augmented activation potential.

Figure 1–4: A single neuron with a biasing input
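The equivalence between a threshold and a fixed biasing input is easy to check numerically. A small sketch — the weight and threshold values are illustrative assumptions:

```python
import numpy as np

theta = 0.7                          # threshold
w = np.array([0.5, -1.0])            # weights of the p-1 "real" inputs
x = np.array([1.0, 0.3])             # the corresponding input signals

# Augmented vectors: fix the extra input to +1 and set its weight to b = -theta
w_aug = np.append(w, -theta)         # w_p = b = -theta
x_aug = np.append(x, 1.0)            # x_p = +1

v = np.dot(w, x)                     # activation potential without the bias
v_hat = np.dot(w_aug, x_aug)         # augmented activation potential
assert np.isclose(v_hat, v - theta)  # v_hat = v - theta, as in the text
```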


1.5 Types of activation functions

Typically, the activation function generates either unipolar or bipolar signals.

A linear function: y = v.

Such linear processing elements, sometimes called ADALINEs, are studied in the theory of linear systems, for example, in “traditional” signal processing and statistical regression analysis.

A step function

unipolar:

$$y = \sigma(v) = \begin{cases} 1 & \text{if } v \geq 0 \\ 0 & \text{if } v < 0 \end{cases}$$

Such a processing element is traditionally called a perceptron, and it works as a threshold element with a binary output.

bipolar:

$$y = \sigma(v) = \begin{cases} +1 & \text{if } v \geq 0 \\ -1 & \text{if } v < 0 \end{cases}$$

A step function with bias

The bias (threshold) can be added to both the unipolar and bipolar step functions. We then say that a neuron is “fired” when the synaptic activity exceeds the threshold level, θ. For a unipolar case, we have:

$$y = \sigma(v) = \begin{cases} 1 & \text{if } \mathbf{w} \cdot \mathbf{x} \geq \theta \\ 0 & \text{if } \mathbf{w} \cdot \mathbf{x} < \theta \end{cases}$$

(The McCulloch-Pitts perceptron — 1943)


A piecewise-linear function

$$y = \sigma(v) = \begin{cases} 0 & \text{if } v \leq -\frac{1}{2\alpha} \\ \alpha v + \frac{1}{2} & \text{if } |v| < \frac{1}{2\alpha} \\ 1 & \text{if } v \geq \frac{1}{2\alpha} \end{cases}$$

• For small activation potential, v, the neuron works as a linear combiner (an ADALINE) with the gain (slope) α.

• For large activation potential, v, the neuron saturates and generates the output signal either 0 or 1.

• For large gains, α → ∞, the piecewise-linear function reduces to a step function.

Sigmoidal functions

unipolar:

$$\sigma(v) = \frac{1}{1 + e^{-v}} = \frac{1}{2}\left( \tanh(v/2) + 1 \right)$$

bipolar:

$$\sigma(v) = \tanh(\beta v) = \frac{2}{1 + e^{-2\beta v}} - 1$$

The parameter β controls the slope of the function.

The hyperbolic tangent (bipolar sigmoidal) function is perhaps the most popular choice of the activation function, specifically in problems related to function mapping and approximation.
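The activation functions of this section are one-liners in NumPy. A hedged sketch, vectorised so that v may be an array; the parameter defaults are illustrative:

```python
import numpy as np

def step_unipolar(v):
    return np.where(v >= 0, 1.0, 0.0)         # 1 if v >= 0, else 0

def step_bipolar(v):
    return np.where(v >= 0, 1.0, -1.0)        # +1 if v >= 0, else -1

def piecewise_linear(v, alpha=1.0):
    # linear with gain alpha around the origin, saturating at 0 and 1
    return np.clip(alpha * v + 0.5, 0.0, 1.0)

def sigmoid_unipolar(v):
    return 1.0 / (1.0 + np.exp(-v))

def sigmoid_bipolar(v, beta=1.0):
    return np.tanh(beta * v)

v = np.linspace(-3, 3, 7)
# the identity sigma(v) = (tanh(v/2) + 1)/2 holds numerically:
assert np.allclose(sigmoid_unipolar(v), 0.5 * (np.tanh(v / 2) + 1))
```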


Radial-Basis Functions

• Radial-basis functions arise as optimal solutions to problems of interpolation, approximation and regularisation of functions. The optimal solutions to the above problems are specified by some integro-differential equations which are satisfied by a wide range of nonlinear differentiable functions (S. Haykin, Neural Networks – a Comprehensive Foundation, Ch. 5).

• Typically, Radial-Basis Functions $\varphi(\mathbf{x}; \mathbf{t}_i)$ form a family of functions of a p-dimensional vector, $\mathbf{x}$, each function being centred at point $\mathbf{t}_i$.

• A popular simple example of a Radial-Basis Function is a symmetrical multivariate Gaussian function which depends only on the distance between the current point, $\mathbf{x}$, and the centre point, $\mathbf{t}_i$, and the variance parameter $\sigma_i$:

$$\varphi(\mathbf{x}; \mathbf{t}_i) = G(\|\mathbf{x} - \mathbf{t}_i\|) = \exp\left( -\frac{\|\mathbf{x} - \mathbf{t}_i\|^2}{2\sigma_i^2} \right)$$

where $\|\mathbf{x} - \mathbf{t}_i\|$ is the norm of the distance vector between the current vector $\mathbf{x}$ and the centre, $\mathbf{t}_i$, of the symmetrical multidimensional Gaussian surface.

• The spread of the Gaussian surface is controlled by the variance parameter $\sigma_i$.
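A direct transcription of the Gaussian radial-basis function above; the centre, spread and test points are illustrative assumptions:

```python
import numpy as np

def gaussian_rbf(x, t_i, sigma_i):
    """Symmetrical multivariate Gaussian centred at t_i with spread sigma_i."""
    r = np.linalg.norm(x - t_i)                 # distance ||x - t_i||
    return np.exp(-r**2 / (2.0 * sigma_i**2))

# The response depends only on the distance from the centre:
t_i = np.zeros(3)
print(gaussian_rbf(np.array([0.1, 0.0, 0.0]), t_i, sigma_i=1.0))  # close to 1
print(gaussian_rbf(np.array([3.0, 0.0, 0.0]), t_i, sigma_i=1.0))  # close to 0
```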


Two concluding remarks:

• In general, the smooth activation functions, like sigmoidal or Gaussian, for which a continuous derivative exists, are typically used in networks performing a function approximation task, whereas the step functions are used as parts of pattern classification networks.

• Many learning algorithms require calculation of the derivative of the activation function (see the assignments/practical 1).


1.6 A layer of neurons

• Neurons as in sec. 1.4 can be arranged into a layer of neurons.

• A single layer neural network consists of m neurons, each with the same p input signals.

• Similarly to a single neuron, the neural network can be represented in all three basic forms: dendritic, signal-flow, and block-diagram form.

• From the dendritic representation of the neural network it is readily seen that a layer of neurons is described by an m × p matrix W of synaptic weights.

• Each row of the weight matrix is associated with one neuron.

Operations performed by the network can be described as follows:

$$\begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_m \end{bmatrix} = \begin{bmatrix} w_{11} & \cdots & w_{1p} \\ w_{21} & \cdots & w_{2p} \\ \vdots & & \vdots \\ w_{m1} & \cdots & w_{mp} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_p \end{bmatrix} ; \qquad \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix} = \begin{bmatrix} \sigma(v_1) \\ \sigma(v_2) \\ \vdots \\ \sigma(v_m) \end{bmatrix}$$

[Figure: a layer of m neurons with p inputs in three representations — a. dendritic graph (one row of the weight matrix W per neuron), b. signal-flow graph (an input layer of p nodes fully connected to an output layer of m summing nodes), c. block-diagram (x → W → σ → y)]

$$\mathbf{v} = W \cdot \mathbf{x} \;; \qquad \mathbf{y} = \sigma(W \cdot \mathbf{x}) = \sigma(\mathbf{v})$$

where $\mathbf{v}$ is a vector of activation potentials.


From the signal-flow graph it is visible that each weight parameter $w_{ij}$ (synaptic strength) is now related to a connection between nodes of the input layer and the output layer. Therefore, the name connection strengths for the weights is also justifiable.

The block-diagram representation of the single layer neural network is the most compact one, hence most convenient to use.
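As a sketch, the whole layer is one matrix-vector product followed by an element-wise activation; the weights, inputs and the choice of tanh as σ are illustrative assumptions:

```python
import numpy as np

def layer(W, x, sigma=np.tanh):
    """Layer of m neurons: v = W x, y = sigma(v), with W an m-by-p weight matrix."""
    v = W @ x               # vector of activation potentials
    return sigma(v)         # element-wise activation

# m = 2 neurons, p = 3 inputs; each row of W belongs to one neuron
W = np.array([[0.5, -1.0, 0.25],
              [1.0,  0.0, -0.5]])
x = np.array([1.0, 0.5, -2.0])
print(layer(W, x))          # m output signals
```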


1.7 Multi-layer feedforward neural networks

Connecting in a serial way layers of neurons presented in sec. 1.6, we can build multi-layer feedforward neural networks, also known as Multilayer Perceptrons (MLP). The most popular neural network seems to be the one consisting of two layers of neurons, as in the following block diagram:

[Block diagram: $\mathbf{x}$ (p input signals) → $W^h$ → $\tilde{\mathbf{h}}$ → φ → $\mathbf{h}$ (L hidden neurons) → $W^y$ → $\tilde{\mathbf{y}}$ → σ → $\mathbf{y}$ (m output neurons); the first stage forms the hidden layer, the second the output layer]

In order to avoid a problem of counting an input layer, the two-layer architecture is referred to as a single hidden layer neural network.

Input signals, $\mathbf{x}$, are passed through synapses of the hidden layer with connection strengths described by the hidden weight matrix, $W^h$, and the L hidden activation signals, $\tilde{\mathbf{h}}$, are generated.

The hidden activation signals are then normalised by the functions φ into the L hidden signals, $\mathbf{h}$.

Similarly, the hidden signals, $\mathbf{h}$, are first converted into m output activation signals, $\tilde{\mathbf{y}}$, by means of the output weight matrix, $W^y$, and subsequently into m output signals, $\mathbf{y}$, by means of the functions σ. Hence

$$\mathbf{h} = \varphi(W^h \cdot \mathbf{x}) \;, \qquad \mathbf{y} = \sigma(W^y \cdot \mathbf{h})$$

If needed, one of the input signals and one of the hidden signals can be constant, to form biasing signals.

Functions ϕ and σ can be identical.
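A forward pass of this two-layer network is two applications of the single-layer computation. A minimal sketch — the random weights are illustrative, and tanh is one possible choice for both ϕ and σ, which, as noted above, can be identical:

```python
import numpy as np

def mlp(Wh, Wy, x, phi=np.tanh, sigma=np.tanh):
    """Single hidden layer feedforward network: h = phi(Wh x), y = sigma(Wy h)."""
    h = phi(Wh @ x)          # L hidden signals
    return sigma(Wy @ h)     # m output signals

# p = 3 inputs, L = 4 hidden neurons, m = 2 outputs (illustrative random weights)
rng = np.random.default_rng(0)
Wh = rng.standard_normal((4, 3))    # hidden weight matrix, L x p
Wy = rng.standard_normal((2, 4))    # output weight matrix, m x L
print(mlp(Wh, Wy, np.array([1.0, -0.5, 0.25])))
```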


1.8 Static and Dynamic Systems — General Concepts

Static systems

Neural networks considered in previous sections belong to the class of static systems, which can be fully described by a set of m functions of p variables, as in Figure 1–5.

Figure 1–5: A static system: $\mathbf{y} = f(\mathbf{x})$

The defining feature of the static systems is that they are time-independent — current outputs depend only on the current inputs in the way specified by the mapping function, f. Such a function can be very complex.

Dynamic systems — Recurrent Neural Networks

In the dynamic systems, the current output signals depend, in general, on current and past input signals.

There are two equivalent classes of dynamic systems: continuous-time and discrete-time systems.

The dynamic neural networks are referred to as recurrent neural networks.


1.9 Continuous-time dynamic systems

• Continuous-time dynamic systems operate with signals which are functions of a continuous variable, t, interpreted typically as time. A spatial variable can also be used.

• Continuous-time dynamic systems are described by means of differential equations. The most convenient yet general description uses only first-order differential equations in the following form:

$$\dot{\mathbf{y}}(t) = f(\mathbf{x}(t), \mathbf{y}(t)) \qquad (1.3)$$

where $\dot{\mathbf{y}}(t) \stackrel{\mathrm{df}}{=} \dfrac{d\mathbf{y}(t)}{dt}$ is a vector of time derivatives of output signals.

• In order to model a dynamic system, or to obtain the output signals, the integration operation is required. The dynamic system of eqn (1.3) is illustrated in Figure 1–6.

Figure 1–6: A continuous-time dynamic system: $\dot{\mathbf{y}}(t) = f(\mathbf{x}(t), \mathbf{y}(t))$ — the output of $f$ is integrated, and the result $\mathbf{y}$ is fed back to $f$

• It is evident that feedback is inherent to dynamic systems.


1.9.1 Discrete-time dynamic systems

• Discrete-time dynamic systems operate with signals which are functions of a discrete variable, n, interpreted typically as time, but a discrete spatial variable can also be used.

• Typically, the discrete variable can be thought of as a sampled version of a continuous variable:

$$t = n \cdot t_s \;; \qquad t \in \mathbb{R} \,,\; n \in \mathbb{N}$$

where $t_s$ is the sampling time.

• Analogously, discrete-time dynamic systems are described by means of difference equations.

• The most convenient yet general description uses only first-order difference equations in the following form:

$$\mathbf{y}(n+1) = f(\mathbf{x}(n), \mathbf{y}(n)) \qquad (1.4)$$

where $\mathbf{y}(n+1)$ and $\mathbf{y}(n)$ are the predicted (future) value and the current value of the vector $\mathbf{y}$, respectively.


• In order to model a discrete-time dynamic system, or to obtain the output signals, we use the unit delay operator, $D = z^{-1}$, which originates from the z-transform used to obtain analytical solutions to the difference equations.

• Using the delay operator, we can re-write the first-order difference equation into the following operator form:

$$z^{-1}\mathbf{y}(n+1) = \mathbf{y}(n)$$

which leads to the structure as in Figure 1–7.

Figure 1–7: A discrete-time dynamic system: $\mathbf{y}(n+1) = f(\mathbf{x}(n), \mathbf{y}(n))$ — the unit delay D closes the feedback loop from $\mathbf{y}(n+1)$ back to $f$

• Notice that feedback is also present in the discrete dynamic systems.
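In a simulation, the unit delay is simply the state variable carried from one iteration to the next. A hedged sketch — the leaky-integrator f is an illustrative choice, not from the notes:

```python
import numpy as np

def simulate(f, x_seq, y0):
    """Iterate y(n+1) = f(x(n), y(n)); the stored state y acts as the unit delay D."""
    y = float(y0)
    trajectory = [y]
    for x in x_seq:
        y = f(x, y)            # next state from current input and current state
        trajectory.append(y)
    return np.array(trajectory)

# Illustrative example: a leaky integrator y(n+1) = 0.9 y(n) + x(n)
f = lambda x, y: 0.9 * y + x
print(simulate(f, x_seq=[1.0, 0.0, 0.0, 0.0], y0=0.0))  # impulse response decays
```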


1.9.2 Example: A continuous-time generator of a sinusoid

• As a simple example of a continuous-time dynamic system, let us consider a linear system which generates a sinusoidal signal.

• Eqn (1.3) takes on the following form:

$$\dot{\mathbf{y}}(t) = A \cdot \mathbf{y}(t) + B \cdot \mathbf{x}(t) \qquad (1.5)$$

where

$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} ; \quad A = \begin{bmatrix} 0 & \omega \\ -\omega & 0 \end{bmatrix} ; \quad B = \begin{bmatrix} 0 \\ b \end{bmatrix} ; \quad x = \delta(t)$$

δ(t) is the unit impulse which is non-zero only for t = 0 and is used to describe the initial condition.

• In order to show that eqn (1.5) really describes the sinusoidal generator, we re-write this equation for individual components. This yields:

$$\dot{y}_1 = \omega\, y_2 \;, \qquad \dot{y}_2 = -\omega\, y_1 + b\, \delta(t) \qquad (1.6)$$

• Differentiation of the first equation and substitution of the second one gives the second-order linear differential equation for the output signal $y_1$:

$$\ddot{y}_1 + \omega^2 y_1 = \omega\, b\, \delta(t)$$

• Taking the Laplace transform and remembering that $\mathcal{L}\{\delta(t)\} = 1$, we have:

$$y_1(s) = \frac{b\,\omega}{s^2 + \omega^2}$$


• Taking the inverse Laplace transform we finally have

$$y_1(t) = b \sin(\omega t)$$

• The internal structure of the generator can be obtained from eqns (1.6) and illustrated using the dendritic representation as in Figure 1–8.

Figure 1–8: A continuous-time sinusoidal generator — two integrators in a loop, with weights ω and −ω cross-coupling $y_1$ and $y_2$, and the impulse δ(t) entering through the weight b

• The generator can be thought of as a simple example of a linear recurrent neural network with the fixed weight matrix of the form:

$$W = [B \;\; A] = \begin{bmatrix} 0 & 0 & \omega \\ b & -\omega & 0 \end{bmatrix}$$

• The weights were designed appropriately rather than “worked out” during the learning procedure.
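The claim $y_1(t) = b \sin(\omega t)$ can be checked numerically. A sketch using forward-Euler integration, with the impulse $b\,\delta(t)$ replaced by the equivalent initial condition $y_2(0) = b$; the step size and parameter values are illustrative assumptions:

```python
import numpy as np

omega, b = 2.0 * np.pi, 1.0          # illustrative parameters
dt, steps = 1e-4, 2500               # Euler step; 2500 steps -> t = 0.25 s

A = np.array([[0.0, omega],
              [-omega, 0.0]])
y = np.array([0.0, b])               # impulse b*delta(t) as initial condition y2(0) = b

for _ in range(steps):
    y = y + dt * (A @ y)             # forward-Euler step of ydot = A y

t = steps * dt
print(y[0], b * np.sin(omega * t))   # y1 closely tracks b sin(omega t)
```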


1.9.3 Example: A discrete-time generator of a sinusoid

• It is possible to build a discrete-time version of the sinusoidal generator using difference equations of the general form as in eqn (1.4):

$$\mathbf{y}(n+1) = A \cdot \mathbf{y}(n) + B \cdot \mathbf{x}(n) \qquad (1.7)$$

where

$$A = \begin{bmatrix} \cos\Omega & \sin\Omega \\ -\sin\Omega & \cos\Omega \end{bmatrix} ; \quad B = \begin{bmatrix} 0 \\ b \end{bmatrix} ; \quad x(n) = \delta(n)$$

• This time we take the z-transform directly of eqn (1.7), which gives:

$$(zI - A)\,\mathbf{y}(z) = B \;, \quad \text{where } I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \text{ and } \mathcal{Z}\{\delta(n)\} = 1$$

Hence

$$\mathbf{y}(z) = (zI - A)^{-1} B$$

and subsequently

$$\mathbf{y}(z) = \begin{bmatrix} z - \cos\Omega & \sin\Omega \\ -\sin\Omega & z - \cos\Omega \end{bmatrix} \begin{bmatrix} 0 \\ b \end{bmatrix} \Big/ \left( z^2 - 2z\cos\Omega + 1 \right)$$

Extracting the first component, we have

$$y_1(z) = \frac{b \sin\Omega}{z^2 - 2z\cos\Omega + 1}$$


Taking the inverse z-transform finally yields

$$y_1(n+1) = b \sin(\Omega n)$$

which means that the discrete-time dynamic system described by eqn (1.7) generates a sinusoidal signal.

• The structure of the generator is similar to the previous one and is presented in Figure 1–9.

Figure 1–9: A discrete-time sinusoidal generator — the same structure as in Figure 1–8, with the integrators replaced by unit delays D

• This time the weight matrix is:

$$W = [B \;\; A] = \begin{bmatrix} 0 & c_\Omega & s_\Omega \\ b & -s_\Omega & c_\Omega \end{bmatrix}$$

where $s_\Omega = \sin\Omega$ and $c_\Omega = \cos\Omega$.
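Iterating eqn (1.7) directly reproduces the sampled sinusoid. A small sketch; Ω, b and the number of steps are illustrative assumptions:

```python
import numpy as np

Omega, b = 0.1, 1.0                       # illustrative parameters
A = np.array([[np.cos(Omega), np.sin(Omega)],
              [-np.sin(Omega), np.cos(Omega)]])
B = np.array([0.0, b])

y, out = np.zeros(2), []
for n in range(100):
    out.append(y[0])                      # record y1(n)
    x = 1.0 if n == 0 else 0.0            # unit impulse delta(n)
    y = A @ y + B * x                     # eqn (1.7)

# y1(n+1) = b sin(Omega n), i.e. y1(n) = b sin(Omega (n-1)) for n >= 1
n = np.arange(1, 100)
print(np.allclose(out[1:], b * np.sin(Omega * (n - 1))))   # True
```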


1.10 Introduction to learning

• In the previous sections we concentrated on the decoding part of a neural network, assuming that the weight matrix, W, is given.

• If the weight matrix is satisfactory, during the decoding process the network performs some useful task it has been designed to do.

• In simple or specialised cases the weight matrix can be pre-computed, but more commonly it is obtained through the learning process.

• Learning is a dynamic process which modifies the weights of the network in some desirable way. As any dynamic process, learning can be described either in the continuous-time or in the discrete-time framework.

• A neural network with learning has the following structure:

[Block diagram: the input x feeds a DECODING block, parameterised by W, which produces the output y; a LEARNING (ENCODING) block receives x, y and the teaching signal d, and updates the weights W of the decoding block]


• The learning can be described either by differential equations (continuous-time)

$$\dot{W}(t) = L(\, W(t), \mathbf{x}(t), \mathbf{y}(t), \mathbf{d}(t) \,) \qquad (1.8)$$

or by the difference equations (discrete-time)

$$W(n+1) = L(\, W(n), \mathbf{x}(n), \mathbf{y}(n), \mathbf{d}(n) \,) \qquad (1.9)$$

where d is an external teaching/supervising signal used in supervised learning.

• This signal is not present in networks employing unsupervised learning.

• The discrete-time learning law is often used in a form of a weight update equation:

$$W(n+1) = W(n) + \Delta W(n) \qquad (1.10)$$
$$\Delta W(n) = L(\, W(n), \mathbf{x}(n), \mathbf{y}(n), \mathbf{d}(n) \,)$$
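As a concrete instance of the weight update of eqn (1.10), the sketch below trains a single linear neuron with the LMS (delta) rule standing in for the learning law L. The rule itself belongs to later chapters, so this is an illustrative preview under that assumption, not the method of these notes:

```python
import numpy as np

def train(X, d, eta=0.1, epochs=50):
    """Discrete-time supervised learning, eqn (1.10): W(n+1) = W(n) + dW(n).
    Here dW(n) = eta * (d - y) * x, the LMS (delta) rule, one choice of L."""
    W = np.zeros(X.shape[1])             # a single linear neuron: y = W . x
    for _ in range(epochs):
        for x, target in zip(X, d):
            y = W @ x                    # decoding with the current weights
            dW = eta * (target - y) * x  # L(W, x, y, d)
            W = W + dW                   # eqn (1.10)
    return W

# Illustrative data: teaching signal d = 2*x1 - x2
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]])
d = 2 * X[:, 0] - X[:, 1]
print(train(X, d))                       # converges towards [2, -1]
```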
