
Seminar: Introduction to the Theory of Neural Computation
Artificial Neurons: Hopfield Networks

- Introduction
- Neurophysiological Background
- Modeling Simplified Neurophysiological Information
- The Hopfield Model
  - The Associative Memory Problem
  - The Model
  - Updating rules
  - One Pattern
  - Many Patterns
  - Stability of a particular pattern
  - Storage Capacity
  - The Energy Function
- Discussion on Philosophy and Methodology


Introduction

Inspiration for today's research in neural computation comes from neuroscience and is largely motivated by the possibility of modeling artificial computing networks.

The models are therefore extremely simplified when seen from a neurophysiological point of view, but they should still give insight into the behaviour of "biological" networks.

First, the neurophysiological background is described, then the information needed for modeling simplified neurophysiological processes, and finally the description and behaviour of neural networks: Hopfield networks.


Neurophysiological Background

The basic elements of a neural network are neurons and their connections.

The nervous system can be divided systematically into three parts:

- input
- central processing unit
- output

In the field of ANNs, networks will be constructed from neurons which have the canonical division into an input part (dendritic arbor), a processing part (soma) and a signal transmission part (axon).


Modeling Simplified Neurophysiological Information (1)

The logical structure of the neuron as a perceptron includes a processing unit and the efficacies of the synapses, denoted $J_{ij}$.

The input channels are activated by the signals they receive; the logical boxes $n_j$ represent the values 0 or 1.

A decision function of $h_i$ calculates whether the neuron will (will not) fire, in which case $n_i$ takes the value 1 (0).


Modeling Simplified Neurophysiological Information (2)

At any given moment, some of the logical inputs are activated.

The soma (processing part) receives an input, the so-called PSP (post-synaptic potential), which is the linear sum of the efficacies $J_{ij}$ of those channels that were activated.

The sum of the PSPs is compared to the threshold value of the neuron $i$, and the output channel is activated if it exceeds the threshold; otherwise it is not.


Modeling Simplified Neurophysiological Information (3)

This operation and its components lead to the basic formula

$$h_i = \sum_{j=1}^{N} J_{ij}\, n_j$$

The operation can be expressed by the logical truth function

$$n_i' = T[\,h_i > \theta_i\,]$$

where

- $n_j$ defines variables which are themselves zeros and ones (and which can also be considered as truth functions of some statement),
- $T[\,\cdot\,]$ is a function which is 1 if the statement in the square brackets is true and 0 otherwise,
- $n_i'$ indicates whether a spike (a 1) will appear in the output axon.
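As a small illustration (my own sketch, not part of the original slides), one update of this threshold neuron can be written in a few lines of Python; the arrays `J`, `n`, and `theta` stand in for $J_{ij}$, $n_j$, and $\theta_i$, with arbitrary toy values:

```python
import numpy as np

def neuron_update(J, n, theta):
    """One threshold-neuron step: h_i = sum_j J_ij * n_j, output T[h_i > theta_i]."""
    h = J @ n                       # post-synaptic potentials, one per neuron
    return (h > theta).astype(int)  # truth function: 1 where the PSP exceeds the threshold

J = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])     # toy synaptic efficacies
n = np.array([1, 0, 1])             # current 0/1 activity
print(neuron_update(J, n, theta=0.6))  # -> [0 1 0]
```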


Modeling Simplified Neurophysiological Information (4)

A significant leap is accomplished when the multi-neuron (multi-perceptron) system is closed onto itself, so that the neurons form a feedback mechanism.

The ANN is then no longer a linear but a dynamical system: when output axons (signal transmission parts) become input channels, there is a time shift.

If at a time $t$ one has a set of $N$ zeros and ones, denoted $n_i(t)$, then the set of $N$ bits composing the $n_i'$ becomes the set of inputs one neural cycle time (1-2 milliseconds) later: $n_i(t+1)$.


Overview

- Introduction
- Neurophysiological Background
- Modeling Simplified Neurophysiological Information
- The Hopfield Model
  - The Associative Memory Problem
  - The Model
  - Updating rules
  - One Pattern
  - Many Patterns
  - Stability of a particular pattern
  - Storage Capacity
  - The Energy Function
- Discussion on Philosophy and Methodology


The Hopfield Model - The Associative Memory Problem

Hopfield networks consist of the previously described elements and are fully dynamical, including the time shift and the updating rules discussed below.

The basic problem is to store a set of $p$ patterns $\xi_i^\mu$ in such a way that, when presented with a new pattern $\zeta_i$, the network responds by producing whichever one of the stored patterns most closely resembles $\zeta_i$.

The space of all possible states of the network is called the configuration space.

Basins of attraction: the division of the configuration space by the stored patterns $\xi_i^\mu$.


The Model

The dynamics of the network can be represented by

$$S_i := \mathrm{sgn}\Big(\sum_j w_{ij}\, S_j - \theta_i\Big)$$

where $S_i$ stands for $n_i$ after the conversion from $n_i = 0$ or $1$ via $S_i = 2n_i - 1$, and $\mathrm{sgn}(x)$ is defined by

$$\mathrm{sgn}(x) = \begin{cases} +1 & \text{if } x \geq 0 \\ -1 & \text{if } x < 0 \end{cases}$$

The threshold terms $\theta_i$ can be dropped when random patterns are considered.


Updating rules - Two simplified versions

Synchronous or Parallel: all neurons update their activity states simultaneously at discrete time steps n, where n = 1, 2, …, as if governed by a clock. The inputs of every neuron in the network are determined by the same activity state of the network in the time interval (n-1) < t < n. This choice requires a central clock or pacemaker and is sensitive to timing errors.

Asynchronous or Sequential (more natural for both brains and artificial networks): all neurons are updated one by one, where one can proceed in either of two ways:

- at each time step, select at random a unit $i$ to be updated and apply the rule below
- let each unit independently choose to update itself, with some constant probability per unit time, according to the same rule

$$S_i := \mathrm{sgn}\Big(\sum_j w_{ij}\, S_j\Big)$$

In this mode, every neuron coming up for a decision has full information about all the decisions of the individual neurons that have been updated before it.
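A minimal sketch of the two updating modes (my own illustration, assuming $\pm 1$ units $S_i$ and the convention $\mathrm{sgn}(0) = +1$ from the model slide):

```python
import numpy as np

def update_synchronous(W, S):
    """All units update at once from the same previous state, as if clocked."""
    return np.where(W @ S >= 0, 1, -1)

def update_asynchronous(W, S, rng):
    """Units update one by one in random order; each sees all earlier decisions."""
    S = S.copy()
    for i in rng.permutation(len(S)):
        S[i] = 1 if W[i] @ S >= 0 else -1
    return S
```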


One Pattern

The condition for one pattern $\xi_i$ to be memorized is

$$\mathrm{sgn}\Big(\sum_j w_{ij}\,\xi_j\Big) = \xi_i$$

With a constant of proportionality of $1/N$, one can take

$$w_{ij} = \frac{1}{N}\,\xi_i\,\xi_j$$

If fewer than half of the bits of the starting pattern are wrong, they will be overwhelmed in the sum for the net input to $S_i$. The network will correct errors, and so the pattern is an attractor.

All starting configurations with more than half the bits different from the original pattern will end up in the reversed state $-\xi_i$, which leads to a configuration space symmetrically divided into two basins of attraction.
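A quick numerical check of this error-correcting behaviour (my own sketch; the network size and the number of flipped bits are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100
xi = rng.choice([-1, 1], size=N)          # the single stored pattern
W = np.outer(xi, xi) / N                  # w_ij = (1/N) * xi_i * xi_j

S = xi.copy()
flip = rng.choice(N, size=20, replace=False)
S[flip] *= -1                             # corrupt 20 of 100 bits (fewer than half)

S = np.where(W @ S >= 0, 1, -1)           # one synchronous update
print(np.array_equal(S, xi))              # True: the pattern is restored
```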


Many Patterns

Hypothesis made by Hebb (1949): a synapse changes in proportion to the correlation between the firing of the pre- and post-synaptic neurons.

This is achieved by applying the set of patterns $\xi_i^\mu$ to the network during a training phase and adjusting the strengths $w_{ij}$ according to such pre/post correlations:

$$w_{ij} = \frac{1}{N}\sum_{\mu=1}^{p} \xi_i^\mu\,\xi_j^\mu$$
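In code, the Hebb prescription is just a sum of outer products (a sketch, assuming `patterns` is a `(p, N)` array with one ±1 pattern per row):

```python
import numpy as np

def hebb_weights(patterns):
    """Hebb rule: w_ij = (1/N) * sum over mu of xi_i^mu * xi_j^mu."""
    p, N = patterns.shape
    return patterns.T @ patterns / N   # sum of the p outer products, scaled by 1/N
```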


Overview

- Introduction
- Neurophysiological Background
- Modeling Simplified Neurophysiological Information
- The Hopfield Model
  - The Associative Memory Problem
  - The Model
  - Updating rules
  - One Pattern
  - Many Patterns
  - Stability of a particular pattern
  - Storage Capacity
  - The Energy Function
- Discussion on Philosophy and Methodology


Stability of a particular pattern (1)

Going back to the condition for a single stored pattern to be stable,

$$\mathrm{sgn}\Big(\sum_j w_{ij}\,\xi_j\Big) = \xi_i$$

and the definition of the net input,

$$h_i = \sum_j w_{ij}\, S_j$$

the stability condition generalizes to

$$\mathrm{sgn}(h_i^\nu) = \xi_i^\nu$$

Taking

$$w_{ij} = \frac{1}{N}\sum_{\mu=1}^{p} \xi_i^\mu\,\xi_j^\mu$$

the net input to unit $i$ in pattern $\nu$ is

$$h_i^\nu = \sum_j w_{ij}\,\xi_j^\nu = \frac{1}{N}\sum_j \sum_\mu \xi_i^\mu\,\xi_j^\mu\,\xi_j^\nu$$

Separating the sum on $\mu$ into the special term $\mu = \nu$ and the rest gives

$$h_i^\nu = \xi_i^\nu + \frac{1}{N}\sum_j \sum_{\mu \neq \nu} \xi_i^\mu\,\xi_j^\mu\,\xi_j^\nu$$


Stability of a particular pattern (2)

Meaning of

$$h_i^\nu = \xi_i^\nu + \underbrace{\frac{1}{N}\sum_j \sum_{\mu \neq \nu} \xi_i^\mu\,\xi_j^\mu\,\xi_j^\nu}_{\text{crosstalk term (less than 1 in magnitude, in most cases)}}$$

If the second term were zero, one could conclude that pattern number $\nu$ is stable according to

$$\mathrm{sgn}(h_i^\nu) = \xi_i^\nu$$

This is still true if the second term is small enough: if its magnitude is smaller than 1, it cannot change the sign of $h_i^\nu$.
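The size of the crosstalk term is easy to probe numerically (my own sketch; the parameter choices p = 10 and N = 200 are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
N, p = 200, 10
xi = rng.choice([-1, 1], size=(p, N))     # p random patterns
W = xi.T @ xi / N                         # Hebb weights

nu = 0
h = W @ xi[nu]                            # net input h_i^nu with pattern nu applied
crosstalk = h - xi[nu]                    # h_i^nu = xi_i^nu + crosstalk term
print(np.abs(crosstalk).max())            # typically well below 1 for p << N
print(np.array_equal(np.where(h >= 0, 1, -1), xi[nu]))  # True when no sign flipped
```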


Storage Capacity

Consider the quantity

$$C_i^\nu \equiv -\,\xi_i^\nu\,\frac{1}{N}\sum_j \sum_{\mu \neq \nu} \xi_i^\mu\,\xi_j^\mu\,\xi_j^\nu$$

If $C_i^\nu$ is negative, the crosstalk term has the same sign as the desired $\xi_i^\nu$ and does no harm; if it is positive and larger than 1, it changes the sign of $h_i^\nu$ and makes bit $i$ of pattern $\nu$ unstable. The $C_i^\nu$ depend only on the patterns $\xi_j^\mu$ that one attempts to store.

For $p$ random patterns and $N$ units, the distribution of values of the crosstalk term $C_i^\nu$ is a Gaussian with variance

$$\sigma^2 = \frac{p}{N}$$

The area of the Gaussian tail beyond 1 (the shaded area in the original figure) is $P_{\text{error}}$, the probability of error per bit.
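Under this Gaussian approximation the per-bit error probability has a closed form, $P_{\text{error}} = \tfrac{1}{2}\,\mathrm{erfc}\big(\sqrt{N/2p}\big)$; a short numerical check (my own sketch, with arbitrary p and N values):

```python
import math

def p_error(p, N):
    """Gaussian tail beyond 1 for variance sigma^2 = p/N:
    P(C > 1) = 0.5 * erfc(1 / (sigma * sqrt(2))) = 0.5 * erfc(sqrt(N / (2*p)))."""
    return 0.5 * math.erfc(math.sqrt(N / (2 * p)))

N = 1000
for p in (10, 50, 100, 200):
    print(p, p_error(p, N))   # the error probability grows quickly with p/N
```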


The Energy Function (1)

… was adopted from a physical analogy to magnetic systems into neural network theory and is one of the most important contributions of the Hopfield paper.

One can imagine an energy landscape "above" the configuration space: a multi-dimensional surface with hills and valleys.

The energy function is

$$H = -\frac{1}{2}\sum_{ij} w_{ij}\, S_i\, S_j$$
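Evaluating $H$ for a given weight matrix and ±1 state is a one-liner (a sketch, reusing the conventions above):

```python
import numpy as np

def energy(W, S):
    """H = -(1/2) * sum_ij w_ij * S_i * S_j for a +/-1 state vector S."""
    return -0.5 * S @ W @ S
```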


The Energy Function (2)

Central property: it is a function that always decreases (or remains constant) as the system evolves according to its dynamical rule.

The attractors are at local minima (the valleys) of the energy surface. The dynamics can then be thought of as similar to the motion of a particle on the energy surface under the influence of gravity (pulling it down) and friction (so that it does not overshoot).


The Energy Function (3)

An alternative derivation of the Hebb prescription (as we know it from the many-pattern case)

$$w_{ij} = \frac{1}{N}\sum_{\mu=1}^{p} \xi_i^\mu\,\xi_j^\mu$$

starts from the one-pattern case. Using

$$H = -\frac{1}{2}\sum_{ij} w_{ij}\, S_i\, S_j$$

with $w_{ij} = \xi_i\,\xi_j / N$ gives

$$H = -\frac{1}{2N}\Big(\sum_i S_i\,\xi_i\Big)^2$$

which is minimized when the overlap between the network configuration and the stored pattern $\xi_i$ is largest.

Analogously, in the many-pattern case the patterns should be made into local minima of $H$:

$$H = -\frac{1}{2N}\sum_{\mu=1}^{p}\Big(\sum_i S_i\,\xi_i^\mu\Big)^2$$


The Energy Function (4)

Multiplying out

$$H = -\frac{1}{2N}\sum_{\mu=1}^{p}\Big(\sum_i S_i\,\xi_i^\mu\Big)^2 = -\frac{1}{2N}\sum_{\mu=1}^{p}\Big(\sum_i \xi_i^\mu S_i\Big)\Big(\sum_j \xi_j^\mu S_j\Big) = -\frac{1}{2}\sum_{ij}\Big(\frac{1}{N}\sum_{\mu=1}^{p}\xi_i^\mu\,\xi_j^\mu\Big) S_i\, S_j$$

leads back to the original energy function with the Hebb weights.

This suggests a good general approach for finding appropriate connection strengths $w_{ij}$: find an energy function whose minimum satisfies a problem of interest, and multiply it out.


The Energy Function (5)

"Simple and nice" proof of the central property of the energy function: it always decreases (or remains constant) as the system evolves according to its dynamical rule.

The energy function for the state at time $t$ is

$$E(t) = -\frac{1}{2}\sum_{ij} w_{ij}\, n_i(t)\, n_j(t)$$

and for the state at time $t+1$

$$E(t+1) = -\frac{1}{2}\sum_{ij} w_{ij}\, n_i(t+1)\, n_j(t+1)$$

One then shows that

$$E(t+1) - E(t) \leq 0$$
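The monotone decrease is easy to observe empirically under asynchronous updates (my own sketch; with symmetric Hebb weights and a non-negative diagonal, each single-unit update can only lower the energy or leave it unchanged):

```python
import numpy as np

rng = np.random.default_rng(3)
N, p = 100, 5
xi = rng.choice([-1, 1], size=(p, N))
W = xi.T @ xi / N                          # symmetric weights, w_ii = p/N >= 0

S = rng.choice([-1, 1], size=N)            # random starting configuration
energies = [-0.5 * S @ W @ S]
for _ in range(5):                         # a few asynchronous sweeps
    for i in rng.permutation(N):
        S[i] = 1 if W[i] @ S >= 0 else -1
    energies.append(-0.5 * S @ W @ S)
print(energies)                            # a non-increasing sequence
```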


Discussion on Philosophy and Methodology (1)

Research in these particular areas involves many different fields of science:

- biology
- chemistry
- physics
- (...)

Natural phenomena are described by mathematical models; this is sometimes interpreted as meaning that all natural phenomena are reducible to physical laws.

Alternatively - as I would say too - reduction can be given a very intuitive sense in which it not only exists but is extremely useful and productive.

Hopfield once stated that "the brain is a physical system", which may indeed sound like a call for a reduction of the thought process; nevertheless, concepts originating in physics can be used as analogues, including energy, field, relaxation, etc.


Discussion on Philosophy and Methodology (2)

The theory of attractor neural networks (ANN) has engaged in providing a minimal set of propositions which can be confronted with experiment.

This matter plays a role in discussing the attitude to verification and/or falsification, and the fact that a theoretical framework must be defended by an explanation.

In many instances, systems have been constructed (hardware implementations / computer simulations) that serve as experimental setups for the described models, providing a truly impressive agreement with the predictions from the analysis of the models.

But this will not please the experimenter who records, using ingenious techniques, the electrical activities in the cortex of cats or monkeys, for example.

For the future: the theory of neural networks is to produce models about cognitive processes which are robust to the type of disorder, fluctuations, and disruptions one can imagine the brain to be operating under, including parallel processing and the potential for abstraction.


Discussion on Philosophy and Methodology (3)

So what happens if an experiment does not show the type of behaviour identified as the emergent dynamics?

Interpretation: either a refutation of the theoretical construction, or the argument that the experiment has missed the theory.

Thank you very much! Feel free to ask questions!
