
Introduction to Artificial Neural Networks

Kristof Van Laerhoven
kristof@mis.tu-darmstadt.de
http://www.mis.informatik.tu-darmstadt.de/kristof

Course Overview

• We will:
  Learn to interpret common ANN formulas and diagrams
  Get an extensive overview of the major ANNs
  Understand the mechanisms behind ANNs
  Introduce them with their historical background

• Left as exercises:
  Implementations of algorithms and details
  Evaluations and proofs

The Start of Artificial Neural Nets

• Standard ANN introduction courses just mention:
  McCulloch, W. and Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5:115-133.
• But let's give a few more details…
  Warren McCulloch (1898-1972)
  Walter Pitts (1921-1969)

‘real neuron’… ‘simple model’… ‘some authors’…

before 1943: Neurobiology

• Trepanation: Brain = ‘cranial stuffing’
• ~400 BC: Hippocrates: Brain = “Seat of intelligence”
• ~350 BC: Aristotle: Brain cools blood
• 160: Galen: Brain damage ~ mental functioning
• 1543: Vesalius: anatomy
• 1783: Galvani: electrical excitability of muscles, neurons

before 1943: Neurobiology

• 1898: Camillo Golgi: Golgi stains
• 1906: Golgi + Santiago Ramón y Cajal: Nobel prize
• ~1900: du Bois-Reymond, Müller, and von Helmholtz:
  neurons are electrically excitable
  their activity affects the electrical state of adjacent neurons

before 1943: Neurobiology

• In 1943 there already existed a lively community of biophysicists doing mathematical work on neural nets
• Dissections, Golgi stains, and microscopes, but not: Church-Turing for the brain

“One would assume, I think, that the presence of a theory, however strange, in a field in which no theory had previously existed, would have been a spur to the imagination of neurobiologists. But this did not occur at all! The whole field of neurology and neurobiology ignored the structure, the message, and the form of McCulloch’s and Pitts’s theory. Instead, those who were inspired by it were those who were destined to become the aficionados of a new venture, now called Artificial Intelligence, which proposed to realize in a programmatic way the ideas generated by the theory” -- Lettvin, 1989

‘real’ neurons…

Back to McCulloch-Pitts

• Warren McCulloch:
  Neurophysiologist (philosophy, psychology, MD)
  The ‘McCulloch group’
• Walter Pitts:
  Self-taught logic and maths, read Principia Mathematica at age 12
  Homeless, never enrolled as a student, eccentric
  Manuscript on 3D networks
• Their seminal 1943 paper:
  A mathematical (logic) model for the neuron
  The nervous system can be considered a kind of universal computing device as described by Leibniz

» Recommended reading: Dark Hero Of The Information Age: In Search of Norbert Wiener, the Father of Cybernetics; and search the web for Jerome Lettvin

“What the frog's eye tells the frog's brain” (1959)

(Hebbian) Learning

• Previously:
  Pavlov, 1927: Classical Conditioning
  Thorndike, 1898: Operant Conditioning
• Donald Hebb (psychology, teacher): The Organization of Behavior
  What fires together, wires together: “When an axon of cell A is near enough to excite cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased” -- D. Hebb
• Others: Stephen Grossberg, Gail Carpenter: Adaptive Resonance Theory

The Perceptron

° 1957: Frank Rosenblatt, “The perceptron: A perceiving and recognizing automaton (Project PARA)”

[Diagram: inputs x1, x2, x3, … with weights w1, w2, w3, … and a bias b, feeding a single neuron with output y]

$y = \sum_{i=1,2,3,\dots} w_i x_i + b$

x: inputs    w: weights    b: bias
$0 < \eta < 1$: learning rate
sign(y): output (usually -1 or 1)
$y_T$: target output (usually -1 or 1)

The Perceptron

[Diagram: the same neuron, with its output y compared against the target output $y_T$]

$y = \sum_{i=1,2,3,\dots} w_i x_i + b$

x: inputs   w: weights   b: bias   $0 < \eta < 1$: learning rate   sign(y): output   $y_T$: target output

Learning / Training Phase (Widrow-Hoff, or Delta Rule):
1. Start with random $w_1, w_2, w_3, \dots, b$ (usually near 0)
2. Calculate y for the current input $x_1, x_2, x_3, \dots$
3. Update weights and bias: $w_i' = w_i + \eta\,(y_T - y)\,x_i$
4. Go to step 2
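As a concrete illustration of steps 1-4, here is a minimal Python sketch of this training loop (the fixed epoch count and the small random initialisation range are illustrative assumptions):

    import random

    def sign(v):
        return 1 if v >= 0 else -1

    def train_perceptron(samples, eta=0.3, epochs=5):
        """samples: list of ((x1, x2, ...), y_T) pairs with y_T in {-1, +1}."""
        n = len(samples[0][0])
        w = [random.uniform(-0.05, 0.05) for _ in range(n)]   # step 1: weights near 0
        b = random.uniform(-0.05, 0.05)
        for _ in range(epochs):                               # step 4: cycle over the data
            for x, y_t in samples:
                y = sum(wi * xi for wi, xi in zip(w, x)) + b  # step 2: y = sum_i w_i x_i + b
                w = [wi + eta * (y_t - y) * xi                # step 3: Widrow-Hoff update
                     for wi, xi in zip(w, x)]
                b += eta * (y_t - y)                          # bias = weight on a constant input 1
        return w, b

    # classify a new input with the learned weights:
    # label = sign(sum(wi * xi for wi, xi in zip(w, x)) + b)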


The Perceptron: Example

• Teach the neuron to spot male authors of emails on a mailing list, given two inputs:
  x1: counts of {‘around’, ‘what’, ‘more’} minus counts of {‘with’, ‘if’, ‘not’}
  x2: counts of {‘at’, ‘it’, ‘many’} minus counts of {‘myself’, ‘hers’, ‘was’}
• Learning by examples (the male senders are the positive samples, the female senders the negative samples):

  Sender    | Subject               | (x1, x2)
  Gordon    | Camera Equipme…       | (5, -2)
  Christine | Meeting room info…    | (-2, -1)
  Hans      | Visitors on Mon/Tue   | (5, 4)
  Cath      | Printer on B floor i… | (-2, 1)
  Alan      | Re: UN petition fo…   | (3, 2)
  Kristof   | Coffee machine h…     | (6, 1)
  …         | …                     | …

• (loosely based on http://www.bookblog.net/gender/genie.php and “Gender, Genre, and Writing Style in Formal Written Texts”, Shlomo Argamon, Moshe Koppel, Jonathan Fine, Anat Rachel Shimoni)

The Perceptron: Example

• The same task, with the samples plotted in the input plane:

[Scatter plot: the six emails as points in the (x1, x2) plane, labelled alan, hans, kristof, gordon, christine, cath]

Instead of using the whole mails, we use only keyword counts. These types of ‘pre-processed’ values are called features in the machine learning literature.

The Perceptron: Example

Initial values:
$w_1 = 1/2$, $w_2 = -1/2$, $b = 0$

$y = \sum_{i=1,2,3,\dots} w_i x_i + b$
$y = w_1 x_1 + w_2 x_2 + b = \frac{1}{2} x_1 - \frac{1}{2} x_2 + 0$

[Plot: the decision boundary $\frac{1}{2} x_1 - \frac{1}{2} x_2 + 0 = 0$ in the (x1, x2) plane, with sign(y) = -1 on one side and sign(y) = +1 on the other]

The Perceptron: Example

Values after training once on all positive and negative samples (with $\eta = 0.3$):
$w_1 = 2/5$, $w_2 = -1/3$, $b = -2/7$

$y = \sum_{i=1,2,3,\dots} w_i x_i + b$
$y = w_1 x_1 + w_2 x_2 + b = \frac{2}{5} x_1 - \frac{1}{3} x_2 - \frac{2}{7}$

[Plot: the decision boundary $\frac{2}{5} x_1 - \frac{1}{3} x_2 - \frac{2}{7} = 0$, separating the $y_T = -1$ samples from the $y_T = +1$ samples]

The Perceptron: Example

Values after training 5 times on all positive and negative samples (with $\eta = 0.3$):
$w_1 = 9/12$, $w_2 = -2/23$, $b = -2$

$y = \sum_{i=1,2,3,\dots} w_i x_i + b$
$y = w_1 x_1 + w_2 x_2 + b = \frac{9}{12} x_1 - \frac{2}{23} x_2 - 2$

[Plot: the decision boundary $\frac{9}{12} x_1 - \frac{2}{23} x_2 - 2 = 0$, separating the two classes]

The Perceptron’s Delta Rule

• Delta Rule or Gradient Descent:
• Error: $E = \frac{(y_T - y)^2}{2}$
• Calculate the partial derivative of the error w.r.t. each weight:
  $\frac{\partial E}{\partial w_1} = -(y_T - y)\,x_1$
• Stepping against the gradient gives the update:
  $\Delta w_1 = w_1' - w_1 = \eta\,(y_T - y)\,x_1$
• Finds local minima only!
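Writing the derivative step out in full (a standard chain-rule computation, using $y = \sum_i w_i x_i + b$):

$\frac{\partial E}{\partial w_1} = \frac{\partial}{\partial w_1}\,\frac{(y_T - y)^2}{2} = -(y_T - y)\,\frac{\partial y}{\partial w_1} = -(y_T - y)\,x_1$

so a step of size $\eta$ against the gradient reproduces the update rule above:

$\Delta w_1 = -\eta\,\frac{\partial E}{\partial w_1} = \eta\,(y_T - y)\,x_1$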

The Perceptron

• The perceptron has 2 phases:
  Training/Learning Phase: adaptive weights
  Testing Phase: for checking performance and generalisation
• Neurons and their outputs are grouped in 1 layer:

[Diagram: a single layer of m neurons; every input x1 … xn feeds every neuron j through a weight wij, and each neuron j has its own bias bj and output yj]

Output Functions

• An output function g(y) per neuron limits and scales the neuron's output:

  Simple threshold
  Sigmoid: $g(x) = \frac{1}{1 + e^{-x}}$ (or $g(x) = \tanh(x)$)
  Gaussian: $g(x) = e^{-x^2/2}$
  Piecewise linear: $g(x) = \begin{cases} 1 & 1 < x \\ (x+1)/2 & -1 < x < 1 \\ 0 & x < -1 \end{cases}$
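These output functions are short to state in code; a plain-Python sketch (the function names are mine):

    import math

    def threshold(x):            # simple threshold
        return 1 if x >= 0 else 0

    def sigmoid(x):              # g(x) = 1 / (1 + e^-x)
        return 1.0 / (1.0 + math.exp(-x))

    def gaussian(x):             # g(x) = e^(-x^2 / 2)
        return math.exp(-x * x / 2.0)

    def piecewise_linear(x):     # 0 below -1, 1 above +1, linear in between
        if x < -1:
            return 0.0
        if x > 1:
            return 1.0
        return (x + 1.0) / 2.0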


Parameters

• The bias can be seen as a weight on a constant input 1, or as an input with constant weight 1
• Output function parameters (e.g., steepness of the sigmoid)
• A momentum term to avoid local minima (sketched below)
• Initialisation: use a ‘committee of networks’ to avoid random initialisation effects (many networks training on the same data, with differently initialised weights)
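The momentum term is usually written as an extra fraction of the previous weight change (a standard formulation; the momentum parameter $\alpha$ is not named on the slide):

$\Delta w_i(t) = \eta\,(y_T - y)\,x_i + \alpha\,\Delta w_i(t-1), \qquad 0 \le \alpha < 1$

The accumulated velocity lets the weights roll through small local minima instead of settling in them.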


The Perceptron’s Issues

° 1969: Marvin Minsky and Seymour Papert, “Perceptrons”, Cambridge, MA: MIT Press

=> Non-linearly separable functions are not solvable!

[Diagram: a single neuron with inputs x1, x2, weights w1, w2, bias b, and output $y = \sum_{i=1,2,3,\dots} w_i x_i + b$]

XOR?
  x1=0, x2=0 -> $y_T$ = 0
  x1=1, x2=0 -> $y_T$ = 1
  x1=0, x2=1 -> $y_T$ = 1
  x1=1, x2=1 -> $y_T$ = 0

No single line through the (x1, x2) plane can put (0,0) and (1,1) on one side and (1,0) and (0,1) on the other.
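A brute-force check makes the point concrete (the grid of candidate weights is an arbitrary choice of mine; no triple on it, or anywhere else, classifies XOR):

    from itertools import product

    def separates(w1, w2, b, samples):
        """True if the single neuron classifies every sample correctly."""
        return all((w1 * x1 + w2 * x2 + b > 0) == (t == 1)
                   for (x1, x2), t in samples)

    xor = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    grid = [i / 4 for i in range(-8, 9)]
    print(any(separates(w1, w2, b, xor)
              for w1, w2, b in product(grid, repeat=3)))   # -> False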

Backpropagation

° P. Werbos in 1974
° Compiled in 1986: D.E. Rumelhart, J.L. McClelland, “Parallel Distributed Processing: Explorations in the Microstructure of Cognition”

• Network: the multi-layer perceptron:
  Same neurons, but with ‘hidden’ layers
• Training propagates per layer, from the output back to the input weights

[Diagram: (input layer) → 1 hidden layer → output layer]

Backpropagation Test Phase

• Inputs are entered into the network
• Weighted sums calculate…
• the outputs of the hidden layer neurons,
• which are weighted again to calculate…
• the outputs of the neurons in the output layer… finally!

[Diagram: (input layer) → 1 hidden layer → output layer, with activity flowing forward step by step]

Backpropagation Training

• In the training phase, we also know the target outputs for the input…
• which can be used to re-adjust the weights going to the output layer,
• which lead to target outputs for the middle layer neurons,
• which re-adjust the remaining weights (just like in the perceptron)

[Diagram: (input layer) → 1 hidden layer → output layer, with the error flowing backwards layer by layer]
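A minimal sketch of this two-pass procedure for a 2-2-1 network, assuming sigmoid output functions and the squared error of the delta-rule slide (both are assumptions; the slides do not fix them):

    import math, random

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def train_backprop(samples, eta=0.5, epochs=10000):
        """samples: list of ((x1, x2), target) with target in {0, 1}."""
        w = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]  # input -> hidden
        bh = [random.uniform(-1, 1) for _ in range(2)]                     # hidden biases
        v = [random.uniform(-1, 1) for _ in range(2)]                      # hidden -> output
        bo = random.uniform(-1, 1)                                         # output bias
        for _ in range(epochs):
            for x, target in samples:
                # forward pass: weighted sums, hidden outputs, then the final output
                h = [sigmoid(w[0][j] * x[0] + w[1][j] * x[1] + bh[j]) for j in range(2)]
                y = sigmoid(v[0] * h[0] + v[1] * h[1] + bo)
                # backward pass: the output error...
                d_out = (target - y) * y * (1 - y)
                # ...gives every hidden neuron its own error signal
                d_hid = [d_out * v[j] * h[j] * (1 - h[j]) for j in range(2)]
                # re-adjust hidden->output weights, then the remaining weights
                for j in range(2):
                    v[j] += eta * d_out * h[j]
                    bh[j] += eta * d_hid[j]
                    for i in range(2):
                        w[i][j] += eta * d_hid[j] * x[i]
                bo += eta * d_out
        return w, bh, v, bo

    # e.g. train_backprop([((0,0),0), ((0,1),1), ((1,0),1), ((1,1),0)]) learns XOR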

Backpropagation

• XOR works!

[Diagram: a 2-2-1 multilayer perceptron computing XOR, shown for all four input combinations; the trained weights include 14.6, -12, -10.4, -6.5, -12, 14.6, -11.9, 4.3 and -8.3, and the inputs (0,0), (0,1), (1,0), (1,1) give outputs 0, 1, 1, 0]

• Note: this network could be re-trained for OR, AND, NOR, or NAND

Backpropagation

• In formulas:
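In the usual notation (output function g, net input $net_j = \sum_i w_{ij}\,y_i$, learning rate $\eta$), the standard textbook updates are:

$w_{ij}' = w_{ij} + \eta\,\delta_j\,y_i$

$\delta_j = (y_{T,j} - y_j)\,g'(net_j)$  for an output neuron j

$\delta_j = g'(net_j)\,\sum_k w_{jk}\,\delta_k$  for a hidden neuron j, summing over the neurons k of the next layer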

Multilayer Perceptron Apps

• 1988: Gorman and Sejnowski, “Analysis of Hidden Units in a Layered Network Trained to Classify Sonar Targets”

Multilayer Perceptron Apps

• ALVINN (Pomerleau, 1993)
• Image compression using backprop (Paul Watta, Brijesh Desaie, Norman Dannug, Mohamad Hassoun, 1996)

Underfitting and Overfitting

• Underfitting: an ANN that is not sufficiently complex to correctly detect the pattern in a noisy data set
• Overfitting: an ANN that is too complex, so it reacts to noise in the data
• Techniques to avoid them (early stopping is sketched below):
  Model selection
  Jittering
  Weight decay
  Early stopping
  Bayesian estimation
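Early stopping is the easiest of these to state concretely; a sketch, where train_one_epoch and validation_error are hypothetical stand-ins for whatever training loop and held-out data are used:

    def train_with_early_stopping(net, train_one_epoch, validation_error,
                                  max_epochs=1000, patience=10):
        """Stop when the validation error has not improved for `patience` epochs."""
        best_err, best_net, waited = float("inf"), net, 0
        for _ in range(max_epochs):
            net = train_one_epoch(net)      # one pass over the training set
            err = validation_error(net)     # error on held-out data, not training data
            if err < best_err:
                best_err, best_net, waited = err, net, 0
            else:
                waited += 1
                if waited >= patience:      # overfitting has likely begun
                    break
        return best_net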

Overview so far…

• These are the most popular neural network architectures and their training algorithms:
  Perceptron - gradient descent / delta rule
    • Only linearly separable functions
    • One layer
  Multilayer perceptron - backpropagation
    • Hidden layer(s)
    • More powerful
    • Training: errors are propagated back towards the input weights
    • Many applications
• They are SUPERVISED: we train them with inputs plus known target outputs

The Kohonen Map

• Also: Self-Organising (Feature) Map
• Teuvo Kohonen (1982), “Self-organized formation of topologically correct feature maps”
• ‘Competition’ among neurons for the input
• Topology among the neurons
• Unsupervised

$w_i' = w_i + h\,\eta\,(x_i - w_i)$   (η: learning rate, h: neighbourhood function)

[Diagram: inputs x1 … x5 feeding a grid of neurons with weight vectors w1 … w5]
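A compact sketch of one Kohonen update on a 2D grid (the dict-based grid, squared-distance winner search, and Gaussian neighbourhood are illustrative choices):

    import math

    def som_step(grid, x, eta, sigma):
        """grid: dict mapping (row, col) -> weight vector (list of floats)."""
        # competition: the winner is the unit whose weight vector is closest to x
        dist = lambda w: sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        winner = min(grid, key=lambda pos: dist(grid[pos]))
        for pos, w in grid.items():
            # the neighbourhood h shrinks with grid distance from the winner
            d2 = (pos[0] - winner[0]) ** 2 + (pos[1] - winner[1]) ** 2
            h = math.exp(-d2 / (2 * sigma ** 2))
            grid[pos] = [wi + h * eta * (xi - wi) for wi, xi in zip(w, x)]

    # e.g. for a 4x4 grid of 2D weight vectors and input (82, 41):
    # som_step(grid, (82, 41), eta=0.7, sigma=1.0)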

The Kohonen Map: Example

• Randomly initialised grid of (weight-)vectors:

  (72,19) (22,93) (24,91) (31,26)
  (35,102) (95,67) (30,11) (64,19)
  (12,19) (83,36) (16,13) (24,74)
  (93,10) (72,56) (75,55) (61,42)

$w_i' = w_i + h\,\eta\,(x_i - w_i)$

The Kohonen Map: Example

• Randomly initialised grid of (weight-)vectors:

  (72,19) (22,93) (24,91) (31,26)
  (35,102) (95,67) (30,11) (64,19)
  (12,19) (83,36) (16,13) (24,74)
  (93,10) (72,56) (75,55) (61,42)

• Input: (82,41)

$w_i' = w_i + h\,\eta\,(x_i - w_i)$

The Kohonen Map: Example

• Randomly initialised grid of (weight-)vectors:

  (72,19) (22,93) (24,91) (31,26)
  (35,102) (95,67) (30,11) (64,19)
  (12,19) (83,36) (16,13) (24,74)
  (93,10) (72,56) (75,55) (61,42)

• Input: (82,41); the winner is the vector at minimum distance, here (83,36)

$w_i' = w_i + h\,\eta\,(x_i - w_i)$

The Kohonen Map: Example

• ‘Winner Takes All’: h = 1 for the closest neuron only

  (72,19) (22,93) (24,91) (31,26)
  (35,102) (95,67) (30,11) (64,19)
  (12,19) (82,39) (16,13) (24,74)
  (93,10) (72,56) (75,55) (61,42)

• Step 1: input (82,41); the winner (83,36) moves to (82,39)

$w_i' = w_i + h\,\eta\,(x_i - w_i)$

The Kohonen Map: Example

• Neighbours: h is smaller for the surrounding neurons

  (72,19) (22,93) (24,91) (31,26)
  (41,77) (91,52) (53,20) (64,19)
  (80,28) (82,39) (34,21) (24,74)
  (88,29) (78,46) (77,37) (61,42)

• Step 1: the winner sits at (82,39); its neighbours have also moved towards the input

$w_i' = w_i + h\,\eta\,(x_i - w_i)$

The Kohonen Map: Example

• Neighbours: h gets smaller and smaller…

  (74,21) (26,89) (27,91) (38,27)
  (41,77) (91,52) (53,20) (68,22)
  (80,28) (82,39) (34,21) (32,69)
  (88,29) (78,46) (77,37) (63,41)

• Step 1: winner at (82,39)

$w_i' = w_i + h\,\eta\,(x_i - w_i)$

The Kohonen Map: Example

• Different inputs activate different regions

  (74,21) (26,89) (27,91) (38,27)
  (41,77) (91,52) (53,20) (68,22)
  (80,28) (82,39) (34,21) (32,69)
  (88,29) (78,46) (77,37) (63,41)

• Step 2: new input: (70,22)

$w_i' = w_i + h\,\eta\,(x_i - w_i)$

The Kohonen Map: Example

• Different inputs activate different regions

  (74,21) (31,79) (33,72) (46,23)
  (44,71) (90,47) (57,21) (68,22)
  (80,28) (79,37) (39,21) (38,47)
  (88,29) (76,41) (77,37) (58,37)

• Step 2: new input: (70,20)

$w_i' = w_i + h\,\eta\,(x_i - w_i)$

The Kohonen Map: Example

• The map organises along the input space

  (74,21) (43,61) (37,69) (46,23)
  (52,65) (89,45) (61,34) (68,22)
  (79,27) (78,36) (40,21) (38,47)
  (88,29) (76,41) (77,37) (58,37)

• Step 3: new input: (76,20)

$w_i' = w_i + h\,\eta\,(x_i - w_i)$

The Kohonen Map

• The neighbourhood and learning rate are controlled in 2 phases:
  Self-Organising / Ordering phase: topological ordering of the weight vectors (h, η large)
  Convergence phase: fine-tuning of the map (h, η small)

[Graph: h and η start large and shrink as more inputs are presented]
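A common way to implement the two phases is an exponential decay of both parameters over time (this particular schedule is a standard choice, not taken from the slides):

$\eta(t) = \eta_0\, e^{-t/\tau_\eta}, \qquad \sigma(t) = \sigma_0\, e^{-t/\tau_\sigma}$

where $\sigma(t)$ is the width of the neighbourhood function h, so both the step size and the neighbourhood shrink smoothly from the ordering phase into the convergence phase.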

Kohonen Map: Applications

• Detecting structure in input data (NNRC, 1997)
  Input: 39 indicators describing various quality-of-life factors, such as state of health, nutrition, educational services

Kohonen Map: Applications

• Traveling Salesman Problem (Fritzke & Wilke, 1990)
• 1D ring structure

Kohonen Map: Applications

• Visualising sensor data from different contexts (Van Laerhoven, 1999)

The Hopfield Network

• 1982: John Hopfield, “Neural networks and physical systems with emergent collective computational abilities”
• Associative memory: ‘pony’
• Simple threshold units
• Outputs are -1 or +1 (sometimes 0 or 1):

$y = \begin{cases} +1 & \text{if } \sum_{i=1,2,3,\dots} w_i x_i > \theta \\ -1 & \text{otherwise} \end{cases}$

• Connections are typically symmetric!

The Hopfield Network

• States:

[Diagram: three mutually connected threshold neurons 1, 2, 3]

  state | s1 s2 s3 | 1 fires | 2 fires | 3 fires
    0   |  0 0 0   |    4    |    2    |    1
    1   |  0 0 1   |    1    |    3    |    1
    2   |  0 1 0   |    6    |    2    |    3
    3   |  0 1 1   |    3    |    3    |    3
    4   |  1 0 0   |    4    |    6    |    4
    5   |  1 0 1   |    1    |    7    |    4
    6   |  1 1 0   |    6    |    6    |    6
    7   |  1 1 1   |    3    |    7    |    6

(each state is the binary number s1s2s3; the right-hand columns give the new state if that neuron fires)

Energy Functions

• The energy function of the network is a scalar value used for finding local minima, which are regarded as solutions
• A popular method for neural network design

$E = -\frac{1}{2} \sum_{i<j} w_{ij}\, s_i s_j + \sum_i \theta_i\, s_i$

• For the Hopfield net, the ‘solutions’ are the memorised states
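A sketch of the update rule and this energy function for a small Hopfield net with 0/1 states (the parameter names are mine):

    def fires(s, W, theta, j):
        """Neuron j updates: its new state is 1 iff its weighted input exceeds theta[j]."""
        total = sum(W[i][j] * s[i] for i in range(len(s)) if i != j)
        s = list(s)
        s[j] = 1 if total > theta[j] else 0
        return s

    def energy(s, W, theta):
        """E = -(1/2) sum_{i<j} w_ij s_i s_j + sum_i theta_i s_i"""
        n = len(s)
        pair = sum(W[i][j] * s[i] * s[j] for i in range(n) for j in range(i + 1, n))
        return -0.5 * pair + sum(theta[i] * s[i] for i in range(n))

With symmetric weights, each such update never increases E, so repeated updates settle into a local minimum: one of the memorised states.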

The Hopfield Network

• States for the same network, drawn as a graph:

[State-transition diagram: the table above drawn as a graph; states 0, 1, 2, 4, 5 and 7 all flow into the two stable states 3 and 6]

Hopfield Network Apps

• Pattern recognition:


Elman Network

• Jeff Elman, 1990: “Finding structure in time”
• Sequences in the input: an implicit representation of time
• Hidden neuron(s) copied as an input
• Backpropagation as usual

[Diagram: input neurons + context neuron → hidden neurons → output neuron]

Elman Network

• Input is given
• Hidden neurons’ output functions are calculated
• Output is calculated, and the hidden neurons’ output is copied into the context neuron
• The next input…

[Diagram: input neurons + context neuron → hidden neurons → output neuron, stepping through time]
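One time step of an Elman network in code (the layer sizes and sigmoid are illustrative; the essential part is returning the hidden outputs as the next step's context):

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def elman_step(x, context, W_in, W_ctx, W_out):
        """One time step; returns (output, new_context)."""
        n_hidden = len(W_out)
        # hidden neurons see the current input AND the previous hidden state (the context)
        hidden = [sigmoid(sum(W_in[i][j] * x[i] for i in range(len(x))) +
                          sum(W_ctx[c][j] * context[c] for c in range(len(context))))
                  for j in range(n_hidden)]
        y = sigmoid(sum(W_out[j] * hidden[j] for j in range(n_hidden)))
        # the hidden outputs are copied and become the next step's context
        return y, hidden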

Elman Network Apps

• Learning formal grammars: which strings belong to a language?
• Speech recognition: Recognition of Auditory Sound Sources and Events (Stephen McAdams, 1990)
• Music composition: Neural network music composition by prediction: Exploring the benefits of psychophysical constraints and multiscale processing (Michael Mozer, 1994)
• Activity recognition:

Terminology

Architectures:

• McCulloch-Pitts

• Adaptive Resonance

Theory

• Perceptron

• Multilayer Perceptron

• Kohonen Map

• Hopfield Network

• Elman Network

Learning:

• Hebbian

• Delta rule

• Backpropagation

• (Un)Supervised

• Competitive Learning

• Associative Memory

• Energy

Architectures

[Diagrams of the five architectures: perceptron, multilayer perceptron, Hopfield net, Kohonen map, Elman network]

Training

  perceptron: delta rule / gradient descent (supervised)
  multilayer perceptron: backpropagation (supervised)
  Hopfield net: associative memory (unsupervised)
  Kohonen map: competitive learning (unsupervised)
  Elman network: backpropagation (supervised)

Number of Inputs

  perceptron: 4   multilayer perceptron: 4   Hopfield net: 3
  Kohonen map: 5   Elman network: 2

Number of Weights

  perceptron: 12   multilayer perceptron: 24   Hopfield net: 3
  Kohonen map: 60   Elman network: 9

Number of Outputs

  perceptron: 3   multilayer perceptron: 4   Hopfield net: 3
  Kohonen map: 12   Elman network: 1

Further Research

• Spiking / Pulsed Neural Networks
  ‘Spike trains’: the timing of pulses is crucial
  Closer to the biological neuron

[Diagram: action potentials over time, with spikes at times t1,1, t1,2, t2,1]

Further Research

• Principal Component Analysis
  Dimension reduction
• Independent Component Analysis
  The cocktail party problem
  ‘Blind source separation’

Further Research

• Support Vector Machines

[Diagram: two classes of points in the (x1, x2) plane, separated by a margin]