aims of the course (an engineering approach) the pattern recognition problem


Upload: hetal

Post on 09-Jan-2016


Page 1: Aims of the course (An Engineering Approach) The pattern recognition problem

PATTERN RECOGNITION:

A COMPREHENSIVE APPROACH USING ARTIFICIAL NEURAL NETWORK OR/AND FUZZY LOGIC

Sergio C. BROFFERIO, email [email protected]

• Aims of the course (An Engineering Approach)

• The pattern recognition problem

• Deterministic and statistical methods: models

• Neural and Behavioural models

• How to pass the exam? Paper review or Project

Page 2: Aims of the course (An Engineering Approach) The pattern recognition problem

REFERENCES FOR ARTIFICIAL NEURAL NETWORKS (ANN)

a) Basic textbooks

C. M. Bishop: “Neural Networks for Pattern Recognition”, Clarendon Press, Oxford (1995). Basic for engineers.

S. Haykin: “Neural Networks”, Prentice Hall (1999). Complete text for static and dynamic ANN.

S. Theodoridis, K. Koutroumbas: “Pattern Recognition”, 4. ed., Elsevier Academic Press, 2003. ISBN: 0126858756

Y.-H. Pao: “Adaptive Pattern Recognition and Neural Networks”, Addison-Wesley Publishing Company, Inc. (1989). Very clear and good text.

R. Hecht-Nielsen: “Neurocomputing”, Addison-Wesley Publishing Co. (1990).

G. A. Carpenter, S. Grossberg: “ART 2: Self-organization of stable category recognition codes for analog input patterns”, Applied Optics, Vol. 26, 1987

Page 3: Aims of the course (An Engineering Approach) The pattern recognition problem

b) Applications

F.-L. Luo, R. Unbehauen: “Applied Neural Networks for Signal Processing” Cambridge University Press (1997).

R. Hecht-Nielsen: “Nearest Matched Filter Classification of Spatiotemporal Patterns”, Applied Optics, Vol. 26, n. 10 (1987), pp. 1892-1898

Y. Bengio, M. Gori: “Learning the dynamic nature of speech with back-propagation for sequences”, Pattern Recognition Letters, n. 13, pp. 375-85, North Holland (1992)

A. Waibel et al.: “Phoneme Recognition Using Time Delay Neural Networks”, IEEE Trans. on Acoustics, Speech and Signal Processing, Vol. 37, n. 3, 1989

P. J. Werbos: “Backpropagation through time: what it does and how to do it”, Proceedings of the IEEE, vol. 78, 1990

Page 4: Aims of the course (An Engineering Approach) The pattern recognition problem

REFERENCES FOR FUZZY LOGIC

Y.H. Pao: “Adaptive Pattern Recognition and Neural Networks”, Addison-Wesley Publishing Company. Inc. (1989)

B. Kosko: “Neural Networks and Fuzzy Logic”, Prentice Hall (1992)

G. J. Klir, U. H. St. Clair, B. Yuan: “Fuzzy Set Theory: Foundations and Applications”, Prentice Hall PTR (1997)

J.-S. Roger Jang: “ANFIS: Adaptive-Network-Based Fuzzy Inference System”, IEEE Trans. on Systems, Man, and Cybernetics, Vol. 23, No. 3, 1993

Page 5: Aims of the course (An Engineering Approach) The pattern recognition problem

Evolution of the automation of recognition methods (Historical evolution of Pattern Recognition)

[Figure: three block diagrams in which data/observations are mapped to a class by an expert alone, by a computer together with an expert, and by a computer alone.]

Page 6: Aims of the course (An Engineering Approach) The pattern recognition problem

Hierarchical organization of the processing for automatic pattern recognition

[Figure: three processing levels. Physical transformation: signals from the sensor, signals to the actuator. Recognition: samples or patterns (features) are mapped to symbols. Semantic processing: symbols are turned into semantic information.]

Page 7: Aims of the course (An Engineering Approach) The pattern recognition problem

Recognition as a mapping from the sample space to the space of classes (or symbols): Sample to Class Mapping

[Figure: a sample (pattern) x in the sample space, which may be continuous, is mapped to one of the classes C1, C2, C3 of the discrete class space.]

Page 8: Aims of the course (An Engineering Approach) The pattern recognition problem

Recognition as a partition of the sample space (Space Partitioning for Pattern Recognition)

[Figure: sample space with feature axes x1 and x2, split into regions for the classes C1, C2 and C3 (regions marked D1(x) > 0 and D3(x) > 0); a sample (pattern) x is assigned to a class (symbol); the discriminant d31(x) = 0 is the boundary between C3 and C1.]

Decision function: $D_i(x)$, with $i = 1, \dots, K$

Discriminant: $d_{ij}(x) = D_i(x) - D_j(x)$, with $i, j = 1, \dots, K$
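The decision functions and discriminants above can be sketched in a few lines of Python. This is only an illustration: the three linear functions `D` are hypothetical, and the rule "assign x to the class with the largest D_i(x)" is an assumption that the slide does not state explicitly.

```python
def classify(x, decision_functions):
    """Return the 0-based index i of the largest decision function D_i(x)."""
    scores = [D_i(x) for D_i in decision_functions]
    return max(range(len(scores)), key=scores.__getitem__)


def discriminant(x, D_i, D_j):
    """d_ij(x) = D_i(x) - D_j(x); it is zero on the boundary between classes i and j."""
    return D_i(x) - D_j(x)


# Hypothetical linear decision functions for K = 3 classes in a 2-D feature space.
D = [
    lambda x: x[0],                  # D_1
    lambda x: x[1],                  # D_2
    lambda x: 1.0 - x[0] - x[1],     # D_3
]

sample = (2.0, 0.5)
winning_class = classify(sample, D)          # index of the class chosen for the sample
margin = discriminant(sample, D[0], D[1])    # how strongly class 1 beats class 2
```

With this rule the set where `discriminant` is zero is exactly the border between two neighbouring class regions.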

Page 9: Aims of the course (An Engineering Approach) The pattern recognition problem

Pattern classification types

[Figure: two examples. (1) An area computation algorithm followed by the classification of the area value S or of its quantization Sq. (2) A speech recognizer that maps the frequencies F1 and F2 [Hz] to a vowel (E, A, O, U).]

Page 10: Aims of the course (An Engineering Approach) The pattern recognition problem

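Example of pattern recognition (Vowel Recognition) using Fuzzy Logic

[Figure: F1-F2 plane (both axes in Hz) with the regions of the vowels E, A, O, U, I; a speech recognizer block outputs the vowel. A fuzzy rule table combines the F1 labels (MP, P, M, G) with the F2 labels (B, A); the cell entries survive only as the pairs U O, U E, A A, E I.]

V = {I, U, O, A, E}; F1 = {MP, P, M, G}; F2 = {B, A}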

Page 11: Aims of the course (An Engineering Approach) The pattern recognition problem

The neuron

- Cell body
- Dendrites
- Axon
- Synaptic connections

Page 12: Aims of the course (An Engineering Approach) The pattern recognition problem

Our Brain and its neurons

- Main characteristics
  Neurons: ~10^11
  Connections: ~10^15, ~10^4 connections/neuron
  Switching time: ~1 ms (10 ps in computers)
  Switching energy: ~10^-6 joule/cycle
- Learning and adaptation paradigm: from neurology and psychology
- Historical and functional approaches

Page 13: Aims of the course (An Engineering Approach) The pattern recognition problem

ANN characteristics

- nonlinearity
- learning, with or without a teacher (supervised or unsupervised learning)
- adaptability: plasticity and stability
- evidential response (probative recognition)
- contextual information
- fault tolerance
- neurobiological analogies
- VLSI implementations
- uniformity of analysis and design

Page 14: Aims of the course (An Engineering Approach) The pattern recognition problem

Fig. 34: Classification error for the training samples and for the validation samples

[Figure: error (err %) against the number of training sessions, one curve for the training set and one for the validation set; n_ott marks the optimal number of training sessions.]

Stability is the capability of recognition in the presence of noise.
Overfitting produces a loss of plasticity when the number of training sessions exceeds n_ott.

Page 15: Aims of the course (An Engineering Approach) The pattern recognition problem

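Components of the Artificial Neural Network (ANN)

[Figure: neuron j with output activity y_j, reached from node i through a connection with synaptic weight w_ji. Labelled elements: receptive field, local induced field, neuron activity, neuron, synaptic weight, connection.]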

Page 16: Aims of the course (An Engineering Approach) The pattern recognition problem

Layered structure of an ANN

[Figure: input vector X with components x_i, hidden layer, output layer producing the components y_h of the output vector Y; nodes i and j are linked by the weight w_ji, and some connections include a delay.]

$y(t) = f(x(t), W, t)$

Page 17: Aims of the course (An Engineering Approach) The pattern recognition problem

Types of ANN (static and dynamic) and types of samples (static and dynamic): static and dynamic ANNs for pattern recognition on either static or dynamic samples

Sample  | Static ANN                                              | Dynamic ANN
static  | Multilayer perceptron (MLP); self-organizing map (SOM)  | Dynamic autoassociative memories
dynamic | Time-delay network (TDNN); nonlinear FIR                | Spatio-temporal; nonlinear IIR

Page 18: Aims of the course (An Engineering Approach) The pattern recognition problem

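Interaction between the ANN and its environment (stimuli and, where available, the desired response): learning through interactions of an ANN with its environment

[Figure: the environment supplies the stimulus (sample) x and the desired response y*; the ANN with weights W produces the response y; an “adapter” block acts on the weights W.]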

Page 19: Aims of the course (An Engineering Approach) The pattern recognition problem

Hebb's law

[Figure: neurons i and j with activities x_i and x_j, joined by a connection of weight w_ji.]

If two neurons are both active, the weight of their connection is increased; otherwise it is decreased:

$\Delta w_{ji} = \eta\, x_i x_j$
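A minimal sketch of the Hebbian update follows. Two things are assumptions rather than content of the slide: the learning rate, here called `eta`, and the use of bipolar activities (+1 active, -1 inactive), chosen so that the product rule can also decrease a weight.

```python
def hebb_update(w_ji, x_i, x_j, eta=0.1):
    """Return the weight after one Hebbian step: w_ji + eta * x_i * x_j."""
    return w_ji + eta * x_i * x_j


# Both neurons active: the weight grows.
w_both_active = hebb_update(0.0, +1, +1)
# Only one neuron active (bipolar coding): the weight shrinks.
w_one_active = hebb_update(0.0, +1, -1)
```

With 0/1 activities the same product could only ever increase the weight, which is why the bipolar coding is used here.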

Page 20: Aims of the course (An Engineering Approach) The pattern recognition problem

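Structure of the artificial neuron (ANN ON-OFF or “sigmoidal” node structure)

[Figure: the inputs x_1 ... x_i ... x_N and the constant input 1 are weighted by w_j1 ... w_ji ... w_jN and w_j(N+1), summed into s and passed through f(s) to give y_j.]

inputs: $x = (x_i,\ i = 1, \dots, N;\ x_{N+1} = 1)$

weights: $w_j = (w_{ji},\ i = 1, \dots, N+1)$

local induced field: $s = \sum_{i=1}^{N+1} w_{ji} x_i$

activation functions:

$y = f(s) = u(s)$ (unit step)

$y = f(s) = \sigma(s) = 1/(1 + \exp(-s))$

$y = f(s) = \tanh(s)$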
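The node structure above can be sketched as follows. The function names are illustrative, and the convention u(0) = 1 for the unit step is an assumption, since the slide does not define the step at s = 0.

```python
import math


def induced_field(x, w):
    """s = sum_i w_i * x_i over the N inputs plus the constant input x_(N+1) = 1."""
    extended = list(x) + [1.0]
    return sum(w_i * x_i for w_i, x_i in zip(w, extended))


def step(s):
    """Unit step u(s); u(0) = 1 is an assumed convention."""
    return 1.0 if s >= 0 else 0.0


def sigmoid(s):
    """Logistic function 1 / (1 + exp(-s))."""
    return 1.0 / (1.0 + math.exp(-s))


def neuron(x, w, f=sigmoid):
    """Output y = f(s) of a node with N inputs and N + 1 weights (last one is the bias)."""
    return f(induced_field(x, w))


y_sigmoid = neuron([0.0, 0.0], [1.0, 1.0, 0.0])           # s = 0
y_step = neuron([1.0, 1.0], [1.0, 1.0, -3.0], f=step)     # s = -1
y_tanh = neuron([1.0, 1.0], [1.0, 1.0, -2.0], f=math.tanh)  # s = 0
```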

Page 21: Aims of the course (An Engineering Approach) The pattern recognition problem

Activation function of a sigmoidal neuron

[Figure: plot of f(s) against s, with the levels 0.5 and 1 marked on the vertical axis.]

Page 22: Aims of the course (An Engineering Approach) The pattern recognition problem

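Linear discrimination

[Figure: in the (x1, x2) plane the line s(x) = 0, where f(s) = f(0), separates the half-plane s > 0 from the half-plane s < 0; n is the normal to the line and d is the distance of the sample x from it. The node sums w1 x1, w2 x2 and w3 (constant input 1) into s and outputs y = f(s).]

$s = w_1 x_1 + w_2 x_2 + w_3$

$d = (w_1 x_1 + w_2 x_2 + w_3)\,(w_1^2 + w_2^2)^{-1/2}$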
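As a small worked example of the distance formula, the sketch below evaluates s and the signed distance d for a hypothetical line x1 + x2 - 1 = 0; the weights are illustrative and do not come from the slides.

```python
import math


def induced_field(x, w):
    """s = w1*x1 + w2*x2 + w3 for a two-input node with bias weight w3."""
    return w[0] * x[0] + w[1] * x[1] + w[2]


def signed_distance(x, w):
    """d = s / sqrt(w1^2 + w2^2): positive on the s > 0 side of the line s(x) = 0."""
    return induced_field(x, w) / math.hypot(w[0], w[1])


w = (1.0, 1.0, -1.0)                        # hypothetical line x1 + x2 - 1 = 0
d_outside = signed_distance((1.0, 1.0), w)  # point on the s > 0 side
d_on_line = signed_distance((0.5, 0.5), w)  # point lying on the line
```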

Page 23: Aims of the course (An Engineering Approach) The pattern recognition problem

Resonant (Selective, Radial Basis) Artificial Neuron

[Figure: the inputs x_1 ... x_i ... x_N are compared with the weights w_j1 ... w_ji ... w_jN to form the squared distance d^2, which is passed through exp(-d^2/d_0^2) to give y_j.]

inputs: $x = (x_i,\ i = 1, \dots, N)$

weights: $w_j = (w_{ji},\ i = 1, \dots, N)$

distance: $d^2 = [d(x, w_j)]^2 = \sum_i (x_i - w_{ji})^2$

or weighted distance: $d^2 = [d(x, w_j)]^2 = \sum_i c_i (x_i - w_{ji})^2$

activation function: $y = f(d) = \exp(-d^2/d_0^2)$
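The resonant neuron can be sketched directly from these definitions. The function names and the default width `d0 = 1.0` are illustrative choices, not values from the course.

```python
import math


def squared_distance(x, w, c=None):
    """d^2 = sum_i c_i * (x_i - w_i)^2; with c omitted every c_i is 1 (plain Euclidean)."""
    if c is None:
        c = [1.0] * len(x)
    return sum(c_i * (x_i - w_i) ** 2 for c_i, x_i, w_i in zip(c, x, w))


def rbf_neuron(x, w, d0=1.0, c=None):
    """y = exp(-d^2 / d0^2): equal to 1 when x coincides with the prototype w."""
    return math.exp(-squared_distance(x, w, c) / d0 ** 2)


y_at_centre = rbf_neuron([1.0, 2.0], [1.0, 2.0])          # maximum response
y_at_width = rbf_neuron([2.0, 0.0], [0.0, 0.0], d0=2.0)   # distance d0 from the centre
```

At distance d0 from the prototype the response drops to 1/e, which is the level marked on the activation plot.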

Page 24: Aims of the course (An Engineering Approach) The pattern recognition problem

Fig. 5b) Radial activation function $y = f(d) = \exp[-(d/d_0)^2]$ (Radial Basis Function, RBF)

[Figure: bell-shaped plot of the activation against d, equal to 1 at d = 0 and to 1/e ≈ 0.37 at d = ±d_0.]

Page 25: Aims of the course (An Engineering Approach) The pattern recognition problem

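Activity of a resonant (radial) function of two variables: two-component radial basis function

[Figure: in the (x1, x2) plane the weight vector w_j is the centre of the function and d is the distance of the sample x from it.]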

Page 26: Aims of the course (An Engineering Approach) The pattern recognition problem

ANN learning methods

Supervised learning (Multi Layer Perceptron)
Sample-class pairs (X, Y*) are applied.
a) The ANN structure is defined
b) Only the rule for belonging to the same class is defined (Adaptive ANN)

Unsupervised learning (Self Organising Maps, SOM)
Only the sample X is applied.
a) The number of classes K is defined
b) Only the rule for belonging to the same class is defined (Adaptive ANN)

Page 27: Aims of the course (An Engineering Approach) The pattern recognition problem

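The Perceptron

Inputs: $x_i$; local induced field: $s = \sum_i w_i x_i$; output: $y = \sigma(s)$

Training data: sample-class pair $(x, y^*)$; error: $e = y^* - y$

Weight update: $\Delta w_i = \eta\, e\, \sigma'(s)\, x_i$, with $\sigma'(s) = y(1 - y)$ if $y = \sigma(s) = 1/(1 + \exp(-s))$

[Figure: the inputs x_i (i = 1, ..., N, plus the constant input 1 at position N+1) feed the node through the weights w_i; the output y is subtracted from y* to form the error e used to update the weights.]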
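The update rule can be tried on a small problem. The sketch below trains a single sigmoidal node on the logical OR function; the data set, the learning rate `eta`, the number of epochs and the random initialisation are all illustrative assumptions.

```python
import math
import random


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def predict(w, x):
    """y = sigma(s), with s computed over the inputs plus the constant input 1."""
    extended = list(x) + [1.0]
    return sigmoid(sum(w_i * x_i for w_i, x_i in zip(w, extended)))


def train_perceptron(samples, eta=0.5, epochs=2000, seed=0):
    """Sequential training with delta w_i = eta * e * sigma'(s) * x_i."""
    rng = random.Random(seed)
    n_inputs = len(samples[0][0])
    w = [rng.uniform(-0.1, 0.1) for _ in range(n_inputs + 1)]
    for _ in range(epochs):
        for x, target in samples:
            y = predict(w, x)
            e = target - y                    # e = y* - y
            slope = y * (1.0 - y)             # sigma'(s) for the logistic function
            extended = list(x) + [1.0]
            w = [w_i + eta * e * slope * x_i for w_i, x_i in zip(w, extended)]
    return w


# Hypothetical training set: the OR function, which is linearly separable.
OR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights = train_perceptron(OR_SAMPLES)
decisions = [1 if predict(weights, x) > 0.5 else 0 for x, _ in OR_SAMPLES]
```

A single node of this kind can only learn linearly separable problems, which is what motivates the multilayer perceptron for cases such as EXOR.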

Page 28: Aims of the course (An Engineering Approach) The pattern recognition problem

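Linear discrimination

[Figure: in the (x1, x2) plane the line s(x) = 0, where f(s) = f(0), separates the half-plane s > 0 from the half-plane s < 0; n is the normal to the line and d is the distance of the sample x from it. The node sums w1 x1, w2 x2 and w3 (constant input 1) into s and outputs y = f(s).]

$s = w_1 x_1 + w_2 x_2 + w_3$

$d = (w_1 x_1 + w_2 x_2 + w_3)\,(w_1^2 + w_2^2)^{-1/2}$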

Page 29: Aims of the course (An Engineering Approach) The pattern recognition problem

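Hebb's law

[Figure: neurons i and j with activities x_i and x_j, joined by a connection of weight w_ji.]

$\Delta w_{ji} = \eta\, x_i x_j$

Perceptron learning (gradient learning)

$y = \sigma(s)$; $s = w^T x$; $E(w) = \frac{1}{2}(d - y)^2 = \frac{1}{2} e^2$; training pair $(x, d)$

To first order, a step against the gradient never increases the error: $\Delta E = (dE/dw)\,\Delta w = (dE/dw)\,(-\eta\, dE/dw) = -\eta\,(dE/dw)^2$

$\Delta w = -\eta\, dE/dw = -\eta\,(\partial E/\partial s)(\partial s/\partial w) = -\eta\,\delta(s)\,x$

$\partial E/\partial s = \delta(s)$ is called the local gradient with respect to node 1 or s:

$\delta(s) = \partial E/\partial s = -e\,\sigma'(s)$

$\Delta w_i = -\eta\, dE/dw_i = -\eta\,(\partial E/\partial s)(\partial s/\partial w_i) = -\eta\,\delta(s)\,x_i$

[Figure: node with input x_i, weight w_i and local gradient δ(s).]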

Page 30: Aims of the course (An Engineering Approach) The pattern recognition problem

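[Figure: network with inputs x1, x2 and constant input 1, nodes a, b, c and output y, shown with the corresponding region A in the (x1, x2) plane.]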

Page 31: Aims of the course (An Engineering Approach) The pattern recognition problem

The partitioning of the sample space by the multilayer perceptron (MLP)

[Figure: network with inputs x1, x2, nodes a, b, c and output y, and the regions A and B it forms in the (x1, x2) plane; the training pairs are labelled (x, c/c*).]

Page 32: Aims of the course (An Engineering Approach) The pattern recognition problem

The Multilayer Perceptron (MLP)

[Figure: the input vector X = (x_1 ... x_k ... x_M) enters the input layer; hidden layer H1 (nodes i, weights w_ik, outputs y_i); hidden layer H2 (nodes j, weights w_ji, outputs y_j); output layer (weights v_hj) producing the output vector Y = (y_1 ... y_h ... y_K).]

$E(W) = \frac{1}{2}\sum_{h=1}^{K} (d_h - y_h)^2$

Page 33: Aims of the course (An Engineering Approach) The pattern recognition problem

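Sequential learning: Multi Layer Perceptron

$y = \sigma(s_2)$; $s_2 = v^T y_1$; $y_1 = \sigma(s_1)$; $s_1 = w^T x$; $E = \frac{1}{2}(d - y)^2 = \frac{1}{2} e^2$

Training pair $(x, d)$

$\Delta w = -\eta\, dE/dw = -\eta\,(\partial E/\partial s_1)(\partial s_1/\partial w) = -\eta\,\delta(s_1)\,x$

$\partial E/\partial s_1 = \delta(s_1)$ is the local gradient with respect to node 1 or $s_1$:

$\delta(s_1) = \frac{\partial E}{\partial s_2}\,\frac{ds_2}{dy_1}\,\frac{dy_1}{ds_1} = \delta(s_2)\,v_1\,\sigma'(s_1) = e_1\,\sigma'(s_1)$

$e_1 = \delta(s_2)\,v_1$ is the backpropagated error, with $\delta(s_2) = \partial E/\partial s_2 = -e\,\sigma'(s_2)$

Detailed notation: $\Delta w = -\eta\, e_1\,\sigma'(s_1)\,x = \eta\, e\,\sigma'(s_2)\,v_1\,\sigma'(s_1)\,x$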
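The chain of local gradients can be checked numerically. The sketch below assumes the smallest network the slide describes (one hidden node y1 with weight vector w, one output weight v standing for v1, logistic activations and E = (d - y)^2 / 2) and compares the backpropagated gradients with finite differences; the numbers used are arbitrary.

```python
import math


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def forward(x, w, v):
    """Forward pass: s1 = w.x, y1 = sigma(s1), s2 = v*y1, y = sigma(s2)."""
    s1 = sum(w_i * x_i for w_i, x_i in zip(w, x))
    y1 = sigmoid(s1)
    s2 = v * y1
    return y1, sigmoid(s2)


def error(x, d, w, v):
    """E = (d - y)^2 / 2 for one training pair (x, d)."""
    _, y = forward(x, w, v)
    return 0.5 * (d - y) ** 2


def backprop_gradients(x, d, w, v):
    """Return (dE/dw, dE/dv) using the local gradients delta = dE/ds."""
    y1, y = forward(x, w, v)
    e = d - y
    delta2 = -e * y * (1.0 - y)        # delta(s2) = -e * sigma'(s2)
    e1 = delta2 * v                    # backpropagated error
    delta1 = e1 * y1 * (1.0 - y1)      # delta(s1) = e1 * sigma'(s1)
    return [delta1 * x_i for x_i in x], delta2 * y1


def numerical_gradient_w(x, d, w, v, eps=1e-6):
    """Central finite differences on each component of w."""
    grads = []
    for i in range(len(w)):
        up = list(w); up[i] += eps
        down = list(w); down[i] -= eps
        grads.append((error(x, d, up, v) - error(x, d, down, v)) / (2 * eps))
    return grads


x, d, w, v = [0.5, -1.0], 1.0, [0.3, -0.2], 0.7
grad_w, grad_v = backprop_gradients(x, d, w, v)
eta = 0.1
w_new = [w_i - eta * g for w_i, g in zip(w, grad_w)]   # delta w = -eta * dE/dw
```

One sequential learning step then lowers the error on the presented pair, as long as `eta` is small enough.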

Page 34: Aims of the course (An Engineering Approach) The pattern recognition problem

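Forward step and backpropagation step

[Figure: node j of a hidden layer. Forward step: the outputs y_i of the previous layer reach node j through the weights w_ji, and y_j feeds the nodes h = 1, ..., M of the next layer through the weights v_1j ... v_hj ... v_Mj. Backpropagation step: the local gradients δ(s_1) ... δ(s_h) ... δ(s_M) return through the same weights, are summed into e_j and multiplied by σ'(s_j).]

Forward step: $s_j = \sum_i w_{ji} y_i$; $y_j = \sigma(s_j)$

Backpropagation step: $e_j = \sum_h \delta_h v_{hj}$; $\delta_j = e_j\,\sigma'(s_j)$; $\Delta w_{ji} = -\eta\,\delta_j y_i$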

Page 35: Aims of the course (An Engineering Approach) The pattern recognition problem

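Linear ANN for error back propagation

[Figure: the error network mirrors the MLP. Layers: input I (nodes k = 1, ..., N, inputs x_1 ... x_k ... x_N), hidden H1 (nodes i = 1, ..., M_H1), hidden H2 (nodes j = 1, ..., M_H2), output O (nodes h = 1, ..., M, outputs y_h). The output errors e_1 ... e_h ... e_M enter at the top and flow back through w_hj, w_ji and w_ik, scaled at each node by σ'(s).]

Output layer O: $e_h = y^*_h - y_h$; $\delta_h = -e_h\,\sigma'(s_h)$; $\Delta w_{hj} = -\eta\,\delta_h y_j$

Hidden layer H2: $e_j = \sum_h \delta_h w_{hj}$; $\delta_j = e_j\,\sigma'(s_j)$; $\Delta w_{ji} = -\eta\,\delta_j y_i$

Hidden layer H1: $e_i = \sum_j \delta_j w_{ji}$; $\delta_i = e_i\,\sigma'(s_i)$; $\Delta w_{ik} = -\eta\,\delta_i x_k$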

Page 36: Aims of the course (An Engineering Approach) The pattern recognition problem

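Sequential weights learning

Training set: $(x^k, y^{*k})$, $k = 1, \dots, Q$
Desired output vector: $y^{*k} = (y^{*k}_m,\ m = 1, \dots, M)$
Output vector produced by $x^k = (x^k_i,\ i = 1, \dots, N)$: $y^k = (y^k_m,\ m = 1, \dots, M)$

Error function: $E(W) = \frac{1}{2}\sum_m (y^{*k}_m - y^k_m)^2 = \frac{1}{2}\sum_m (e^k_m)^2$

Update formula: $\Delta w_{jk} = -\eta\,\partial E/\partial w_{jk} = -\eta\,\delta_j y_k = -\eta\,\sigma'(s_j)\,e_j\,y_k$, where $e_j = \sum_m w_{mj}\,\delta_m$ and $\delta_m = -\sigma'(s_m)\,e_m$

Learning expressions (for each pair $x^k$, $y^{*k}$; the superscript k has been dropped, and k below indexes the nodes of H1):

output layer O: $y_m = \sigma(s_m)$; $e_m = y^*_m - y_m$; $\delta_m = -e_m\,\sigma'(s_m)$; $\Delta w_{mj} = -\eta\,\delta_m y_j$

hidden layer H2: $e_j = \sum_m \delta_m w_{mj}$; $\delta_j = e_j\,\sigma'(s_j)$; $\Delta w_{jk} = -\eta\,\delta_j y_k$

hidden layer H1: $e_k = \sum_j \delta_j w_{jk}$; $\delta_k = e_k\,\sigma'(s_k)$; $\Delta w_{ki} = -\eta\,\delta_k x_i$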

Page 37: Aims of the course (An Engineering Approach) The pattern recognition problem

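Global synaptic weights learning

Training set: $(x^k, y^{*k})$, $k = 1, \dots, Q$
Desired output vector: $y^{*k} = (y^{*k}_m,\ m = 1, \dots, M)$
Output vector produced by $x^k = (x^k_i,\ i = 1, \dots, N)$: $y^k = (y^k_m,\ m = 1, \dots, M)$

Global error function: $E_g(W) = \frac{1}{2}\sum_k \sum_m (y^{*k}_m - y^k_m)^2 = \frac{1}{2}\sum_k \sum_m (e^k_m)^2$

Error backpropagation (for each pair $x^k$, $y^{*k}$; the superscript k has been dropped):

output layer O: $y_m = \sigma(s_m)$; $e_m = y^*_m - y_m$; $\delta_m = -e_m\,\sigma'(s_m)$

hidden layer H2: $e_j = \sum_m \delta_m w_{mj}$; $\delta_j = e_j\,\sigma'(s_j)$

hidden layer H1: $e_i = \sum_j \delta_j w_{ji}$; $\delta_i = e_i\,\sigma'(s_i)$

Expressions for global learning:

$\Delta w_{ji} = -\eta\,\partial E_g/\partial w_{ji} = -\eta \sum_k \delta^k_j\, y^k_i$, where for a hidden node $\delta^k_j = \sigma'(s^k_j)\,e^k_j$ with $e^k_j = \sum_h w_{hj}\,\delta^k_h$, and for an output node $\delta^k_j = -\sigma'(s^k_j)\,e^k_j$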

Page 38: Aims of the course (An Engineering Approach) The pattern recognition problem

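MLP for EXOR

[Figure: network with inputs x1, x2 and output y; in the (x1, x2) plane the corners (0,0) and (1,1) lie in the regions y = 0 and the corners (0,1) and (1,0) in the regions y = 1.]

x1  x2  y
0   0   0
0   1   1
1   0   1
1   1   0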
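One way to realise the EXOR table with a two-layer network of ON-OFF nodes is sketched below. The weights are a hand-picked, hypothetical solution (an OR node, an AND node and an output node computing "OR and not AND"); they are not the weights drawn on the slide.

```python
def step(s):
    """ON-OFF activation: 1 if s >= 0, else 0 (assumed convention at s = 0)."""
    return 1 if s >= 0 else 0


def xor_mlp(x1, x2):
    """Two hidden step nodes followed by one output step node."""
    h_or = step(x1 + x2 - 0.5)       # fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)      # fires only if both inputs are 1
    return step(h_or - h_and - 0.5)  # fires for OR but not AND


truth_table = [(x1, x2, xor_mlp(x1, x2)) for x1 in (0, 1) for x2 in (0, 1)]
```

Each hidden node draws one line in the (x1, x2) plane, and the output node selects the strip between the two lines, where y = 1.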

Page 39: Aims of the course (An Engineering Approach) The pattern recognition problem

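[Figure: network with inputs x1, x2 and constant input 1 and two outputs yA and yA*; in the (x1, x2) plane the boundaries yA = fA(s) = 0.5 and yA* = fA*(s) = 0.5 delimit the regions A and A* around the sample X.]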

Page 40: Aims of the course (An Engineering Approach) The pattern recognition problem

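Dead zone to improve the classification reliability

[Figure: in the (x1, x2) plane the lines labelled z = f(s) = 0.5, z = f(s) = T and z = f(s) = -T separate region A, region I between them, and region A*; X is a sample. The network has inputs x1, x2 and constant input 1.]

The node output z drives two outputs: $y_A = u(z - T)$ and $y_{A^*} = u(-z - T)$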

Page 41: Aims of the course (An Engineering Approach) The pattern recognition problem

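MLP for the recognition of two classes with Gaussian p.d.f. (Haykin, Ch. 4.8)

[Figure: in the (x1, x2) plane the optimal Bayesian decision region for class A has radius r_A, with class B outside; the MLP discriminant is drawn for comparison, together with samples X and X_A. The network has inputs x1, x2 and constant input 1 and outputs yA and yB.]

Probability of error: MLP: Pe = 0.196; Bayesian: Pe = 0.185

Training parameters: η = 0.1, α = 0.5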

Page 42: Aims of the course (An Engineering Approach) The pattern recognition problem

Notes

a) Momentum method: $\Delta w_{ij}(n) = \alpha\,\Delta w_{ij}(n-1) - \eta\,\delta_i(n)\,x_j(n)$, with $\alpha < 1$ and $\delta_i = \partial E/\partial s_i$

b) Suggested partitioning of the data into training and validation sets

[Figure: in sessions 1 to 4 the data set is split into a training part (add.) and a validation part (val.).]

c) Normalization: to the mean value and to the eigenvalues

d) Initialization: small random weights (operation in the linear zone), η = 0.1, α ≈ 0.9

Page 43: Aims of the course (An Engineering Approach) The pattern recognition problem

SELF ORGANIZING MAPS (SOM)

a) The number of classes (clusters) is predefined
b) Predefined classification paradigm: likelihood in the statistical distribution
- model: arrangement of the neurons on the cerebral cortex
- learning model: excitatory/inhibitory neuron interactions
- geometrical representation: Voronoi tessellation

Page 44: Aims of the course (An Engineering Approach) The pattern recognition problem

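[Figure: two self-organizing architectures receiving the input x. Von der Malsburg model: nodes i = 1, ..., N and j = 1, ..., N. Kohonen model: output nodes j = 1, ..., M with weight vectors w_1 ... w_j ... w_M and outputs y_1 ... y_j ... y_M. The figure also marks bidirectional interactions.]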

Page 45: Aims of the course (An Engineering Approach) The pattern recognition problem

Dimensionality reduction (neurons on a grid)

[Figure: the pattern space (high-dimensional and continuous, drawn with axes x1 and x2) contains the sample x and the weight vectors w_i and w_j; these correspond to the nodes i and j of the output space (two-dimensional and discrete).]

Page 46: Aims of the course (An Engineering Approach) The pattern recognition problem

SOM structure

[Figure: input layer with N nodes receiving the input vector x = (x_1, x_2, ..., x_i, ..., x_N); two-dimensional output layer with M nodes (indexed h, k) producing the output vector y.]

Page 47: Aims of the course (An Engineering Approach) The pattern recognition problem

Learning paradigm

[Figure: input nodes i = 1, ..., N with inputs x_i, connected through the weights w_ji to the output nodes j = 1, ..., M with outputs y_j.]

$j = \arg\min_h [d(x, w_h);\ h = 1, \dots, M]$

$y_j = 1$; $y_h = 0$ for $h \ne j$

- competition (for the selection and activation of the output neuron corresponding to maximum activity)
- cooperation (for weights modification)
- synaptic adaptation: excitatory/inhibitory

Page 48: Aims of the course (An Engineering Approach) The pattern recognition problem

Turing, 1952
A global structure can be obtained through local interactions.
The structure is implemented by local neural interconnections.

Principle 1: Interconnections are mainly excitatory

Principle 2: The limitation of resources favours specific activities

Principle 3: Weight modifications tend to be cooperative

Principle 4: A self-organizing system has to be redundant

Page 49: Aims of the course (An Engineering Approach) The pattern recognition problem

Competition
Winning neuron: $j = \arg\min_h [\|x - w_h\|;\ h = 1, \dots, M]$, or $j = \arg\max_h [x^T w_h;\ h = 1, \dots, M]$

Cooperation
Lattice (Manhattan) distance $d(i, j)$ of nodes i and j
Neighbourhood functions:
Excitatory only: $h_i(j) = \exp[-d(i,j)^2 / 2\sigma^2]$, or
Mexican hat: $h_i(j) = a\,\exp[-d(i,j)^2 / 2\sigma_e^2] - b\,\exp[-d(i,j)^2 / 2\sigma_i^2]$

Synaptic updating: $\Delta w_i = \eta\,h_i(j)\,(x - w_i)$

η and σ decrease during learning.
Self-organization: η = 0.1 to 0.01
Statistical convergence: η = 0.01, $1 \ge d(i,j) \ge 0$

[Figure: grid with two nodes i and j at lattice distance d(i, j) = 5.]
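The three phases (competition, cooperation, synaptic updating) can be sketched as a single learning step. The sketch assumes the Euclidean winner rule, the excitatory-only Gaussian neighbourhood and a Manhattan lattice distance; the grid, the prototypes and the values of `eta` and `sigma` are illustrative.

```python
import math


def winner(x, W):
    """Competition: index of the prototype closest to x in Euclidean distance."""
    return min(range(len(W)),
               key=lambda h: sum((x_i - w_i) ** 2 for x_i, w_i in zip(x, W[h])))


def lattice_distance(p, q):
    """Manhattan distance between two grid positions."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def som_step(x, W, grid, eta=0.1, sigma=1.0):
    """One SOM update: w_i <- w_i + eta * h_i(j) * (x - w_i) for every node i."""
    j = winner(x, W)
    for i in range(len(W)):
        h = math.exp(-lattice_distance(grid[i], grid[j]) ** 2 / (2 * sigma ** 2))
        W[i] = [w_i + eta * h * (x_i - w_i) for x_i, w_i in zip(x, W[i])]
    return j


# Hypothetical map with two nodes on a 1 x 2 grid and 2-D prototypes.
W = [[0.0, 0.0], [1.0, 1.0]]
grid = [(0, 0), (0, 1)]
j = som_step([0.8, 0.8], W, grid, eta=0.5, sigma=1.0)
```

In a full training run `eta` and `sigma` would be reduced over time, moving from the self-organization phase to the convergence phase described above.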

Page 50: Aims of the course (An Engineering Approach) The pattern recognition problem

Weights updating by gradient learning

$w_i$ ($i = 1, \dots, M$): prototype vector of node i

Error function (winning node j):

$E_j(W) = \frac{1}{2}\sum_{i=1}^{M} h_i(j)\,\|x - w_i\|^2$

Computation of the gradient:

$\Delta E_j(w_i) = \mathrm{grad}(E_j(w_i)) \cdot \Delta w_i = (\partial E_j(W)/\partial w_i) \cdot \Delta w_i$

Weight updating: $\Delta w_i = -\eta\,\partial E_j(W)/\partial w_i = \eta\,h_i(j)\,(x - w_i)$

[Figure: grid neighbourhoods under the Manhattan distance and under the Euclidean distance.]

Page 51: Aims of the course (An Engineering Approach) The pattern recognition problem

Supervised SOM

[Figure: the pattern vector x = (x_i, i = 1, ..., N) enters the input layer (nodes i = 1, ..., N); the weights w_ji lead to the hidden competitive layer (nodes j = 1, ..., M), marked SOM; a class layer (nodes i = 1, ..., K, outputs y_i), marked PERCEPTRON, is trained against the desired class Y*.]

Page 52: Aims of the course (An Engineering Approach) The pattern recognition problem

Fig. 14c) Adaptive Learning Vector Quantization (ALVQ)

[Figure: the sample x = (x_i, i = 1, ..., N) enters the input layer; the weights w_ji lead to the quantisation layer q = (q_j; j = 1, ..., M), trained by SOM learning and marked SOM; a stage marked PERCEPTRON outputs the quantized vector x_q = (x_qi, i = 1, ..., N).]

Page 53: Aims of the course (An Engineering Approach) The pattern recognition problem

Training of supervised SOMs: Learning Vector Quantizer (LVQ)

Learning data: (x)

a) SOM learning (with x only)

b1) Output layer learning on the pairs (x, c), using q and x; (x, c) is equivalent to (q, c)
b2) Learning with labelling
b3) Learning and labelling of the hidden layer: $\Delta w_c = \pm\eta\,(x - w_c)$, with + if x belongs to class C and - if it does not
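The labelled update of step b3 can be sketched as follows. The sketch assumes that the updated prototype is the one nearest to x and that each prototype already carries a class label; the prototypes, labels and `eta` are hypothetical.

```python
def nearest(x, prototypes):
    """Index of the prototype closest to x in Euclidean distance."""
    return min(range(len(prototypes)),
               key=lambda c: sum((x_i - w_i) ** 2 for x_i, w_i in zip(x, prototypes[c])))


def lvq_update(x, label, prototypes, labels, eta=0.1):
    """Move the nearest prototype towards x if its label matches, away otherwise."""
    c = nearest(x, prototypes)
    sign = 1.0 if labels[c] == label else -1.0
    prototypes[c] = [w_i + sign * eta * (x_i - w_i) for x_i, w_i in zip(x, prototypes[c])]
    return c


prototypes = [[0.0, 0.0], [1.0, 1.0]]
labels = ["A", "B"]
lvq_update([0.2, 0.0], "A", prototypes, labels, eta=0.5)   # attracts prototype 0
lvq_update([0.9, 1.0], "A", prototypes, labels, eta=0.5)   # repels prototype 1
```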

Page 54: Aims of the course (An Engineering Approach) The pattern recognition problem

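Statistical Inference of the ANN

[Figure: the ANN receives a sample x of class c_k and produces the outputs y_1(x) ... y_m(x) ... y_k(x) ... y_M(x); the desired outputs are y*_m(x) = δ_m(x) = 0 for m ≠ k and y*_k(x) = δ_k(x) = 1.]

$c_k = (\delta_1(x), \dots, \delta_k(x), \dots, \delta_M(x))$

$E^2 = \sum_X P(x)\,\big(\sum_k P(c_k/x)\,\sum_m [y_m(x) - y^*_m(x)]^2\big)$

$E^2 = \sum_X P(x)\,\big(\sum_m \{\sum_k P(c_k/x)\,[y_m(x) - \delta_m(x)]^2\}\big)$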

Page 55: Aims of the course (An Engineering Approach) The pattern recognition problem

$E^2 = \sum_X P(x)\,\big(\sum_m \{\sum_k [y_m(x) - \delta_m(x)]^2\,P(c_k/x)\}\big)$

As $\delta_m(x) = 1$ only for $k = m$ and $\sum_k P(c_k/x) = 1$:

$\sum_k [y_m(x) - \delta_m(x)]^2\,P(c_k/x) = y_m^2(x) - 2\,y_m(x)\,P(c_m/x) + P(c_m/x)$

Adding and subtracting $P^2(c_m/x)$ we get:

$[y_m^2(x) - 2\,y_m(x)\,P(c_m/x) + P^2(c_m/x)] + [P(c_m/x) - P^2(c_m/x)] = [y_m(x) - P(c_m/x)]^2 + P(c_m/x)\,[1 - P(c_m/x)]$

Only the first term depends on the ANN, so if the ANN has been correctly trained the minimum value of $E^2$ is obtained when:

$y_m(x) = P(c_m/x)$
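The decomposition can be verified numerically for a single sample x. The posterior probabilities below are hypothetical; the code only checks that the expected squared error of output m equals the sum of the two terms and that it is smallest when the output equals the posterior.

```python
def expected_squared_error(y_m, m, posterior):
    """sum_k P(c_k/x) * (y_m - delta_m)^2, where delta_m = 1 only when k == m."""
    return sum(p_k * (y_m - (1.0 if k == m else 0.0)) ** 2
               for k, p_k in enumerate(posterior))


def decomposed_error(y_m, m, posterior):
    """(y_m - P(c_m/x))^2 + P(c_m/x) * (1 - P(c_m/x))."""
    p_m = posterior[m]
    return (y_m - p_m) ** 2 + p_m * (1.0 - p_m)


posterior = [0.2, 0.5, 0.3]      # hypothetical P(c_k/x) for three classes
m = 1
candidates = [i / 100.0 for i in range(101)]
best_output = min(candidates, key=lambda y: expected_squared_error(y, m, posterior))
```

Here `best_output` coincides with `posterior[m]`, in line with the result that a correctly trained network output estimates the posterior probability of its class.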

Page 56: Aims of the course (An Engineering Approach) The pattern recognition problem

Adaptive Neural Networks (Adaptive Resonance Theory, ART)

Paradigm of psychophysiological adaptation to the environment:
1) Selective attention: search for a situation in the knowledge domain
2) Resonance: if selective attention detects a known situation
3) Orientation: search for, or creation of, new knowledge

Advantages: plasticity and stability are compatible
Disadvantages: complexity of the structure and of the learning algorithm

Page 57: Aims of the course (An Engineering Approach) The pattern recognition problem

Plasticity and Stability

• A training algorithm is plastic if it has the potential to adapt to new vectors indefinitely

• A training algorithm is stable if it preserves previously learned knowledge

[Figure: legend: + category representation, w prototype representation, together with the input pattern representation.]

Selection is based on the input-prototype distance.
Classification is based on the input-category distance.

Page 58: Aims of the course (An Engineering Approach) The pattern recognition problem

Learning paradigm
- Activation of the recognition (output) layer by SOM competition (selective attention)
- Feedback to the comparison layer and evaluation of the resonance with the activated pattern
- Implementation of a new neuron if no resonance is possible (orientation)

[Figure: comparison layer (nodes i = 1, ..., N, inputs x_1 ... x_i ... x_N) and category layer (nodes j = 1, ..., P, plus a possible new node P+1), connected by the weights W_j and Z_j.]

Page 59: Aims of the course (An Engineering Approach) The pattern recognition problem

[Figure: comparison layer (nodes i = 1, ..., N, inputs x_1 ... x_i ... x_N) and recognition layer (nodes j = 1, ..., P, P+1), linked by the weights w_ji and z_ij.]

Selective attention: $j = \arg\max_h [x^T w_h,\ h = 1, \dots, P]$

ρ: resonance coefficient

Resonance: if $x^T z_j > \rho$, $w_j$ and $z_j$ are adapted

Orientation: if $x^T z_j < \rho$, search for a node $h \ne j$ with $x^T z_h > \rho$

If $x^T z_h < \rho$ for each $h = 1, \dots, P$, a new node P+1 with $w_{P+1} = x$ is implemented

Page 60: Aims of the course (An Engineering Approach) The pattern recognition problem

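ART1: for binary input patterns

[Figure: inputs x_1 ... x_i ... x_N, bottom-up weights b_ji towards the category nodes 1 ... h ... j ... P (plus P+1) with outputs y_1 ... y_h ... y_j ... y_P, and top-down weights t_hi; the match $t_j^T x$ is compared with $\rho\,\|x\|$.]

If $t_j^T x < \rho\,\|x\|$ for all j, then generate node P+1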

Page 61: Aims of the course (An Engineering Approach) The pattern recognition problem

Learning of ART1 (Pao model)

Initialization: $t_{ji}^0 = 1$ and $b_{ji}^0 = 1/(1 + N)$

Competition phase: $y_j = b_j^T x$; $j = \arg\max_p [y_p;\ p = 1, \dots, P]$

Selective attention (verification of the resonance): if $t_j^T x > \rho\,\|x\|$, resonance is satisfied; then

weight updating: $t_{ji}^{k+1} = t_{ji}^k\,x_i$ and $b_{ji}^{k+1} = t_{ji}^k\,x_i / (0.5 + t_j^{kT} x)$

else (orientation): a new node is implemented, with $t_{ji}^0 = 1$ and $b_{ji}^0 = 1/(1 + N)$
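The ART1 learning loop can be sketched for binary patterns as below. Several points are assumptions: the resonance threshold is written `rho`, the norm of a binary x is taken as the number of its ones, the existing nodes are tried in decreasing order of y_j before a new node is created, and a newly created node is immediately updated with the current pattern.

```python
def art1_train(patterns, rho=0.7):
    """Cluster binary patterns; return the top-down templates and the node chosen per pattern."""
    T = []            # top-down templates t_j (binary)
    B = []            # bottom-up weights b_j
    assignments = []
    for x in patterns:
        n = len(x)
        norm_x = sum(x)                                   # number of ones in x
        # Competition: rank the existing nodes by y_j = b_j . x (largest first).
        order = sorted(range(len(T)),
                       key=lambda j: -sum(b * x_i for b, x_i in zip(B[j], x)))
        chosen = None
        for j in order:                                   # resonance test, then orientation
            if sum(t * x_i for t, x_i in zip(T[j], x)) > rho * norm_x:
                chosen = j
                break
        if chosen is None:                                # no resonance: implement a new node
            T.append([1] * n)
            B.append([1.0 / (1 + n)] * n)
            chosen = len(T) - 1
        match = sum(t * x_i for t, x_i in zip(T[chosen], x))
        T[chosen] = [t * x_i for t, x_i in zip(T[chosen], x)]   # t <- t * x
        B[chosen] = [t / (0.5 + match) for t in T[chosen]]      # b <- t * x / (0.5 + t.x)
        assignments.append(chosen)
    return T, assignments


patterns = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [1, 1, 1, 0]]
templates, nodes = art1_train(patterns, rho=0.7)
```

Lowering `rho` makes the resonance test easier to pass, so fewer categories are created for the same patterns.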

Page 62: Aims of the course (An Engineering Approach) The pattern recognition problem

Basic ART Structure

F2: field of category nodes; its STM is the representation of the extracted category

LTM: representation of the learned and stored information (in F1 and F2)

F1: field of comparison nodes; its STM is the filtered representation of the input and category patterns

STM: Short Term Memory (node activities)
LTM: Long Term Memory (connection weights)

Page 63: Aims of the course (An Engineering Approach) The pattern recognition problem

A: control node
Input I generates the activity pattern X, nonspecifically activates A and extracts category Y

Page 64: Aims of the course (An Engineering Approach) The pattern recognition problem

Category pattern V generates activity X* and deactivates A

Page 65: Aims of the course (An Engineering Approach) The pattern recognition problem

Because of mismatch a new category is searched

Page 66: Aims of the course (An Engineering Approach) The pattern recognition problem

A new category is extracted

Page 67: Aims of the course (An Engineering Approach) The pattern recognition problem

A new comparison cycle is started !!

Page 68: Aims of the course (An Engineering Approach) The pattern recognition problem

ART2

[Figure: ART2 architecture. F1: patterns layer; F2: category layer; J is the selected category.]

$w = i + a u$; $x = w/|w|$

$v = f(x) + b f(q)$; $u = v/|v|$

$p = u + g(y_J)\,z_J$; $q = p/|p|$

$r = u + c p$

Reset if $\rho/|r| > 1$

Page 69: Aims of the course (An Engineering Approach) The pattern recognition problem

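Category selection

F1 loop-processing:
$p = u$; $u = v/|v|$; $v = b f(\theta, q) + f(\theta, x)$; $q = p/|p|$; $x = w/|w|$; $w = a u + i$

Then: $T_h = p \cdot z^B_h$; $J = \arg\max_h [T_h,\ h = 1, \dots, P]$

Parameters: a, b, θ
Nonlinear filter: $f(\theta, x) = 0$ if $x < \theta$, else $f(\theta, x) = x$

Resonance evaluation

F2 top-down and F1 loop-processing:
$p = u + d\,z^T_J$; $u = v/|v|$; $v = b f(\theta, q) + f(\theta, x)$; $q = p/|p|$; $x = w/|w|$; $w = a u + i$

Then: $r = (u + c p)/(|u| + c|p|)$

Resonance condition: $\rho/|r| < 1$

Parameters: d, c, ρ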
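The F1 loop can be sketched as a fixed number of sweeps through these equations. The order of evaluation inside a sweep, the zero initial state, the number of iterations and the parameter values (`a`, `b`, `theta`, `d`, `c`) are all assumptions made for illustration; the slides give only the equations themselves.

```python
import math


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def normalize(v):
    """v / |v|; an all-zero vector is returned unchanged."""
    n = norm(v)
    return [c / n for c in v] if n > 0 else list(v)


def f(theta, v):
    """Nonlinear filter applied componentwise: 0 below theta, identity otherwise."""
    return [c if c >= theta else 0.0 for c in v]


def f1_loop(i, a=10.0, b=10.0, theta=0.1, d=0.9, z_top=None, iterations=20):
    """Iterate w, x, v, u, p, q; z_top is the top-down vector of category J, if any."""
    u = [0.0] * len(i)
    q = [0.0] * len(i)
    p = [0.0] * len(i)
    for _ in range(iterations):
        w = [i_k + a * u_k for i_k, u_k in zip(i, u)]
        x = normalize(w)
        v = [fx + b * fq for fx, fq in zip(f(theta, x), f(theta, q))]
        u = normalize(v)
        p = list(u) if z_top is None else [u_k + d * z_k for u_k, z_k in zip(u, z_top)]
        q = normalize(p)
    return u, p


def resonance_vector(u, p, c=0.1):
    """r = (u + c p) / (|u| + c |p|); resonance holds when rho / |r| < 1."""
    denom = norm(u) + c * norm(p)
    return [(u_k + c * p_k) / denom for u_k, p_k in zip(u, p)]


u, p = f1_loop([0.2, 0.7, 0.1, 0.5], theta=0.2)   # the small third component is filtered out
r = resonance_vector(u, p)
```

Without a top-down vector p equals u, so |r| is 1 and any ρ below 1 satisfies the resonance condition; a mismatching top-down pattern lowers |r| and can trigger the reset.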

Page 70: Aims of the course (An Engineering Approach) The pattern recognition problem

If resonance, ART learning for category J: updating of the F1-F2 connection weights

F1 → F2: $\Delta z^B_J = d\,u - d(1 - d)\,z^B_J$

F2 → F1: $\Delta z^T_J = d\,u - d(1 - d)\,z^T_J$

Else, reset and orientation: selection of another category (the next lower $T_h$)
If no resonance: implementation of a new category

Page 71: Aims of the course (An Engineering Approach) The pattern recognition problem

ART2 characteristics

a. Stability/Plasticity Trade-Off
b. Search/Direct-Access Trade-Off
c. Match/Reset Trade-Off
d. STM Invariance under Read-Out of Matched LTM
e. Coexistence of LTM Read-Out and STM Normalization
f. No LTM Recoding by Superset Inputs
g. Stable Choice until Reset
h. Contrast Enhancement, Noise Suppression and Mismatch Attenuation by Nonlinear Filtering
i. Rapid Self-Stabilization
j. Normalization
k. Local Computation

Page 72: Aims of the course (An Engineering Approach) The pattern recognition problem

[Figure: two panels, a) and b).]

ART classification: (a) low threshold, (b) high threshold. From: G. A. Carpenter and S. Grossberg, Applied Optics, 1987, Vol. 26, p. 4920, 49221

Page 73: Aims of the course (An Engineering Approach) The pattern recognition problem

[Figure: input pattern x = (x1, x2) in the (x1, x2) plane.]

Computer experiment: apply ART2 to category recognition