aims of the course (an engineering approach) the pattern recognition problem

PATTERN RECOGNITION:

A COMPREHENSIVE APPROACH USING ARTIFICIAL NEURAL NETWORK OR/AND FUZZY LOGIC

Sergio C. BROFFERIO, email: [email protected]

• Aims of the course (An Engineering Approach)

• The pattern recognition problem

• Deterministic and statistical methods: models

• Neural and Behavioural models

• How to pass the exam? Paper review or Project


REFERENCES FOR ARTIFICIAL NEURAL NETWORKS (ANN)

a) Basic textbooks

C. M. Bishop: “Neural Networks for Pattern Recognition”, Clarendon Press, Oxford (1995). Basic for engineers.

S. Haykin: “Neural Networks”, Prentice Hall (1999). Complete text for static and dynamic ANN.

S. Theodoridis, K. Koutroumbas: “Pattern Recognition”, 4. ed., Elsevier Academic Press, 2003. ISBN: 0126858756

Y.-H. Pao: “Adaptive Pattern Recognition and Neural Networks”, Addison-Wesley Publishing Company, Inc. (1989). Very clear and good text.

R. Hecht-Nielsen: “Neurocomputing”, Addison-Wesley Publishing Co. (1990).

G.A. Carpenter, S. Grossberg: “ART 2: self-organization of stable category recognition codes for analog input patterns”, Applied Optics, Vol. 26, 1987

b) Applications

F.-L. Luo, R. Unbehauen: “Applied Neural Networks for Signal Processing” Cambridge University Press (1997).

R. Hecht-Nielsen: “Nearest Matched Filter Classification of Spatiotemporal Patterns”, Applied Optics, Vol. 26, n. 10 (1987), pp. 1892-1898

Y. Bengio, M. Gori: “Learning the dynamic nature of speech with back-propagation for sequences”, Pattern Recognition Letters, n. 13, pp. 375-85, North Holland (1992)

A. Waibel et al.: “Phoneme Recognition Using Time Delay Neural Networks”, IEEE Trans. on Acoustics, Speech and Signal Processing, Vol. 37, n. 3, 1989

P. J. Werbos: “Backpropagation through time: what it does and how to do it”, Proceedings of the IEEE, vol. 78, 1990

REFERENCES FOR FUZZY LOGIC

Y.-H. Pao: “Adaptive Pattern Recognition and Neural Networks”, Addison-Wesley Publishing Company, Inc. (1989)

B. Kosko: “Neural Networks and Fuzzy Logic”, Prentice Hall (1992)

G.J. Klir, U.H. St. Clair, B. Yuan: “Fuzzy Set Theory: Foundations and Applications”, Prentice Hall PTR (1997)

J.-S. Roger Jang: “ANFIS: Adaptive-Network-Based Fuzzy Inference System”, IEEE Trans. on Systems, Man, and Cybernetics, Vol. 23, No. 3, 1993

Historical evolution of Pattern Recognition (evolution of the automation of recognition methods)

[Figure: three block diagrams built from the blocks "data/observations", "expert", "computer" and "class", showing recognition by the expert alone, by the expert assisted by a computer, and by the computer alone.]

Hierarchical organization of Pattern Recognition (layered organization of the processing for automatic recognition)

[Figure: three processing levels. "Physical" transformation: signals from the sensor, signals to the actuator. Recognition: samples/patterns (features) mapped to symbols. Semantic processing: semantic information.]

Sample to Class Mapping

Recognition as a mapping of the sample space onto the class (or symbol) space.

[Figure: a sample (pattern) x in the sample space (possibly continuous) is mapped to one of the classes C1, C2, C3 of the discrete class space.]

[Figure: feature plane (x1, x2) partitioned into the regions C1, C2, C3; the discriminant $d_{31}(x) = 0$ separates C3 from C1. Labels: feature, sample (pattern), class (symbol).]

Space Partitioning for Pattern Recognition

Recognition as a partition of the sample space.

Decision function: $D_i(x)$ with $i = 1 \dots K$

Discriminant: $d_{ij}(x) = D_i(x) - D_j(x)$ with $i, j = 1 \dots K$

[Figure: sample space with the regions $D_1(x) > 0$ and $D_3(x) > 0$; axis label: feature.]
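The decision-function view above can be sketched in a few lines. This is only an illustration: the linear form of each $D_i(x)$ and the three prototype points are assumptions made for the example, not part of the slides, and classes are indexed from 0 rather than 1.

```python
import numpy as np

def classify(x, decision_functions):
    """Assign x to the class whose decision function D_i(x) is largest."""
    scores = np.array([D(x) for D in decision_functions])
    return int(np.argmax(scores)), scores

def discriminant(x, D_i, D_j):
    """d_ij(x) = D_i(x) - D_j(x); the boundary between i and j is d_ij(x) = 0."""
    return D_i(x) - D_j(x)

# Hypothetical linear decision functions built from three prototype points.
prototypes = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
decision_functions = [
    (lambda x, p=p: float(x @ p - 0.5 * p @ p)) for p in prototypes
]

x = np.array([1.8, 0.1])
label, scores = classify(x, decision_functions)
d_10 = discriminant(x, decision_functions[1], decision_functions[0])
```

With these assumed functions the sample falls in the region of class index 1, and $d_{10}(x)$ is positive there.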

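Pattern classification types

[Figure: two block diagrams. (1) Area computation algorithm producing the area value S, followed by classification of the area value (S) or of its quantization (Sq). (2) Speech recognizer mapping the formants F1 [Hz] and F2 [Hz] to a vowel (E, A, O, U).]

Example of pattern recognition (Vowel Recognition) using Fuzzy Logic

[Figure: vowels I, U, O, A, E in the formant plane F1 [Hz], F2 [Hz]; speech recognizer with fuzzy rule table over the linguistic values of F1 and F2, output: vowel.]

V = {I, U, O, A, E}; F1 = {MP, P, M, G} (very small, small, medium, large); F2 = {B, A} (low, high)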

The neuron: cell body, dendrites, axon, synaptic connections

Our Brain and its neurons

- Main characteristics
  Neurons: ~$10^{11}$
  Connections: ~$10^{15}$, ~$10^{4}$ connections/neuron
  Switching time: ~1 ms (10 ps in computers)
  Switching energy: ~$10^{-6}$ joule/cycle
- Learning and adaptation paradigm: from neurology and psychology
- Historical and functional approaches

ANN characteristics

- non-linearity
- supervised or unsupervised learning (with or without a teacher)
- adaptability: plasticity and stability
- probative recognition
- contextual information
- fault tolerance
- neurobiological analogies
- VLSI implementations
- uniformity of analysis and design

[Fig. 34: classification error (err %) versus the number of training sessions, for the training set and for the validation set; $n_{ott}$ is marked on the horizontal axis.]

Stability is the capability of recognition in the presence of noise.

Overfitting produces a loss of plasticity when the number of training sessions exceeds $n_{ott}$.

Components of the Artificial Neural Network (ANN)

[Figure: neuron $j$ with its receptive field, local induced field and neuron activity $y_j$; connection from node $i$ with synaptic weight $w_{ji}$.]

Layered structure of an ANN

[Figure: input vector X (components $x_i$), hidden layer, output layer, output vector Y (components $y_h$); connections with weights $w_{ji}$, possibly with delay.]

$y(t) = f(x(t), W, t)$

Static and dynamic ANNs for the recognition of static and dynamic samples

Sample  | Static ANN                                             | Dynamic ANN
static  | Multilayer perceptron (MLP); self-organizing map (SOM) | Dynamic autoassociative memories
dynamic | Time-delay network (TDNN); non-linear FIR              | Spatio-temporal network; non-linear IIR

Learning through interactions of an ANN with its environment (stimuli and, possibly, the desired response)

[Figure: the environment supplies the stimulus (sample) x and, when available, the desired response y*; the ANN with weights W produces the response y; an "adapter" block updates W.]

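Hebb's law

[Figure: nodes $i$ and $j$ with activities $x_i$ and $x_j$, connected by the weight $w_{ji}$.]

If two neurons are active, the weight of their connection is increased; otherwise their connection weight is decreased.

$\Delta w_{ji} = \eta\, x_i x_j$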
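A minimal sketch of the Hebbian update follows. It assumes bipolar activities (+1 active, -1 inactive), so that the product $x_i x_j$ can be negative and the "otherwise decreased" branch of the rule is visible; the learning rate value and the inclusion of self-connections on the diagonal are also assumptions of the example.

```python
import numpy as np

def hebb_update(W, x, eta=0.1):
    """Return W + eta * x x^T, i.e. delta w_ji = eta * x_j * x_i for every pair."""
    return W + eta * np.outer(x, x)

# Bipolar activities (assumption): +1 = active, -1 = inactive.
x = np.array([1.0, -1.0, 1.0])
W = hebb_update(np.zeros((3, 3)), x)
```

Nodes 0 and 2 are both active, so their weight grows; nodes 0 and 1 disagree, so their weight shrinks.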

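ANN ON-OFF or “sigmoidal” node structure

[Figure: node $j$ with inputs $x_1 \dots x_i \dots x_N$ and a constant input 1, weights $w_{j1} \dots w_{ji} \dots w_{jN}, w_{j(N+1)}$, an adder producing $s$, the activation $f(s)$ and the output $y_j$.]

inputs: $x = (x_i,\ i = 1 \dots N;\ x_{N+1} = 1)$

weights: $w_j = (w_{ji},\ i = 1 \dots N+1)$

local induced field: $s = \sum_{i=1}^{N+1} w_{ji} x_i$

activation functions: $y = f(s) = u(s)$ (step); $y = f(s) = \sigma(s) = 1/(1+\exp(-s))$ (sigmoid); $y = f(s) = \mathrm{Th}(s)$ (hyperbolic tangent)

Activation function of a sigmoidal neuron

[Figure: $f(s)$ versus $s$, rising from 0 to 1 with $f(0) = 0.5$.]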
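The node structure can be written down directly. The sketch below is illustrative only: the convention $u(0) = 1$ for the step function and the example weights are assumptions, and the bias is handled as in the slide by appending the constant input $x_{N+1} = 1$.

```python
import numpy as np

def step(s):
    """ON-OFF activation u(s); u(0) = 1 is an assumed convention."""
    return np.where(s >= 0, 1.0, 0.0)

def sigmoid(s):
    """Sigmoidal activation 1 / (1 + exp(-s))."""
    return 1.0 / (1.0 + np.exp(-s))

def neuron(x, w, f=sigmoid):
    """Output y = f(s) with s = sum_i w_i x_i over the inputs plus the constant input 1."""
    x_ext = np.append(x, 1.0)   # x_{N+1} = 1 carries the bias weight w_{N+1}
    s = w @ x_ext               # local induced field
    return f(s)

y_sig = neuron(np.array([0.0, 0.0]), np.array([1.0, 1.0, 0.0]))            # s = 0
y_step = neuron(np.array([1.0, 1.0]), np.array([1.0, 1.0, -1.5]), f=step)  # s = 0.5
y_tanh = neuron(np.array([0.0, 0.0]), np.array([1.0, 1.0, 0.0]), f=np.tanh)
```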

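Linear discrimination

[Figure: in the $(x_1, x_2)$ plane the line $s(x) = 0$, where $f(s) = f(0)$, separates the half-plane $s > 0$ from the half-plane $s < 0$; $n$ is the normal to the line, $o$ the origin and $d$ the distance of the sample $x$ from the line. Node with inputs $x_1$, $x_2$, 1, weights $w_1$, $w_2$, $w_3$, adder, $f(s)$ and output $y$.]

$s = w_1 x_1 + w_2 x_2 + w_3$

$d = (w_1 x_1 + w_2 x_2 + w_3)\,(w_1^2 + w_2^2)^{-1/2}$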
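As a quick check of the distance formula, the following sketch computes the signed distance $d$ of a sample from the line $s(x) = 0$. The weight values are hypothetical and chosen only so that the numbers are easy to verify by hand.

```python
import math

def signed_distance(x1, x2, w1, w2, w3):
    """d = (w1*x1 + w2*x2 + w3) / sqrt(w1^2 + w2^2); the sign tells the side of the line."""
    s = w1 * x1 + w2 * x2 + w3
    return s / math.hypot(w1, w2)

# Hypothetical line 3*x1 + 4*x2 - 25 = 0.
d_on_line = signed_distance(3.0, 4.0, 3.0, 4.0, -25.0)   # the point lies on the line
d_origin = signed_distance(0.0, 0.0, 3.0, 4.0, -25.0)    # the origin is on the s < 0 side
```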

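Resonant (Selective, Radial Basis) Artificial Neuron

[Figure: node $j$ with inputs $x_1 \dots x_i \dots x_N$, weights $w_{j1} \dots w_{ji} \dots w_{jN}$, a distance block $|\delta(x, w_j)|$ producing $d^2$, the activation $\exp(-d^2/d_0^2)$ and the output $y_j$.]

inputs: $x = (x_i,\ i = 1 \dots N)$

weights: $w_j = (w_{ji},\ i = 1 \dots N)$

distance: $d^2 = [\delta(x, w_j)]^2 = \sum_i (x_i - w_{ji})^2$

or weighted distance: $d^2 = [\delta(x, w_j)]^2 = \sum_i c_i (x_i - w_{ji})^2$

activation function: $y = f(d) = \exp(-d^2/d_0^2)$

Radial Basis Function (RBF)

[Fig. 5b: radial activation function $y = f(d) = \exp[-(d/d_0)^2]$ versus $d$; the maximum 1 is at $d = 0$ and the value is $1/e \approx 0.37$ at $d = \pm d_0$.]

Two-component radial basis function

[Figure: in the $(x_1, x_2)$ plane, the activity depends only on the distance $d$ of the sample $x$ from the weight vector $w_j$.]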
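The resonant neuron can be sketched as follows; the function name, the default $d_0 = 1$ and the optional weighting vector `c` are illustrative choices, not taken from the slides.

```python
import math
import numpy as np

def rbf_neuron(x, w, d0=1.0, c=None):
    """y = exp(-d^2 / d0^2), with d^2 = sum_i c_i (x_i - w_i)^2 (c_i = 1 if c is None)."""
    diff2 = (x - w) ** 2
    d2 = np.sum(diff2 if c is None else c * diff2)
    return float(np.exp(-d2 / d0 ** 2))

w = np.array([1.0, 2.0])
y_center = rbf_neuron(w, w, d0=0.5)                        # sample on the prototype
y_at_d0 = rbf_neuron(w + np.array([0.5, 0.0]), w, d0=0.5)  # sample at distance d0
```

The activity is 1 when the sample coincides with the weight vector and falls to $1/e$ at distance $d_0$.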

ANN learning methods

Supervised learning (Multi Layer Perceptron): sample-class pairs (X, Y*) are applied
a) The ANN structure is defined
b) Only the rule for belonging to the same class is defined (Adaptive ANN)

Unsupervised learning (Self Organising Maps, SOM): only the sample X is applied
a) The number of classes K is defined
b) Only the rule for belonging to the same class is defined (Adaptive ANN)

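The Perceptron

[Figure: inputs $x_i$ ($i = 1 \dots N$) plus a constant input 1 at node $N+1$, weights $w_i$, output $y$; the error $e = y^* - y$ drives the update of the weights $w_i$.]

Inputs: $x_i$; local induced field: $s = \sum_i w_i x_i$; output: $y = \sigma(s)$

Training data: sample-class pair $(x, y^*)$; error: $e = y^* - y$

Weight update: $\Delta w_i = \eta\, e\, \sigma'(s)\, x_i$, with $\sigma'(s) = y(1-y)$ if $y = \sigma(s) = 1/(1+\exp(-s))$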
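The update rule $\Delta w_i = \eta\, e\, \sigma'(s)\, x_i$ can be tried on a small linearly separable problem. The sketch below is an assumption-laden illustration: the AND data set, zero initial weights, $\eta = 1$ and 2000 sequential epochs are all choices made for the example.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def train_perceptron(X, targets, eta=1.0, epochs=2000):
    """Sequential training of a single sigmoidal perceptron."""
    X_ext = np.hstack([X, np.ones((len(X), 1))])   # constant input 1 for the bias weight
    w = np.zeros(X_ext.shape[1])
    for _ in range(epochs):
        for x, y_star in zip(X_ext, targets):
            y = sigmoid(w @ x)
            e = y_star - y                          # error e = y* - y
            w += eta * e * y * (1.0 - y) * x        # sigma'(s) = y (1 - y)
    return w

# Hypothetical training set: logical AND of two binary inputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
targets = np.array([0, 0, 0, 1], dtype=float)
w = train_perceptron(X, targets)
outputs = sigmoid(np.hstack([X, np.ones((4, 1))]) @ w)
predicted = (outputs > 0.5).astype(int).tolist()
```

Thresholding the trained outputs at 0.5 should reproduce the AND truth table, since a single linear discriminant is enough for this problem.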


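Perceptron learning

$y = \sigma(s)$; $s = w^T x$; $E(w) = \frac{1}{2}(d - y)^2 = \frac{1}{2}e^2$; training pair $(x, d)$

$\Delta E = \frac{dE}{dw} \cdot \Delta w = \frac{dE}{dw} \cdot \left(-\eta \frac{dE}{dw}\right) = -\eta \left(\frac{dE}{dw}\right)^2 \le 0$

$\Delta w = -\eta \frac{dE}{dw} = -\eta \frac{\partial E}{\partial s} \frac{\partial s}{\partial w} = -\eta\, \delta(s)\, x$

$\delta(s) = \partial E/\partial s$ is called the local gradient with respect to node 1 or $s$:

$\delta(s) = \frac{\partial E}{\partial s} = -e\, \sigma'(s)$

$\Delta w_i = -\eta \frac{dE}{dw_i} = -\eta \frac{\partial E}{\partial s} \frac{\partial s}{\partial w_i} = -\eta\, \delta(s)\, x_i$

Gradient learning

[Figure: node with input $x_i$, weight $w_i$ and local gradient $\delta(s)$; $\Delta w_i = -\eta\, \delta(s)\, x_i$.]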

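The partitioning of the sample space by the MLP

[Figure: MLP with inputs $x_1$, $x_2$, 1, hidden nodes a, b, c and output y; in the $(x_1, x_2)$ plane each hidden node defines a linear discriminant (a, b, c) and the output node combines them to delimit the regions of the classes A and B. Training pairs: (x, c/c*).]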

The Multilayer Perceptron (MLP)

[Figure: input vector X $= (x_1 \dots x_k \dots x_M)$, input layer, hidden layer H1 (nodes $i$, weights $w_{ik}$, outputs $y_i$), hidden layer H2 (nodes $j$, weights $w_{ji}$, outputs $y_j$), output layer (nodes $h$, weights $v_{hj}$), output vector Y $= (y_1 \dots y_h \dots y_K)$.]

$E(W) = \frac{1}{2}\sum_{h=1}^{K}(d_h - y_h)^2$

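Sequential learning: Multi Layer Perceptron

$y = \sigma(s_2)$; $s_2 = v^T y_1$; $y_1 = \sigma(s_1)$; $s_1 = w^T x$; $E = \frac{1}{2}(d - y)^2 = \frac{1}{2}e^2$

Training pair $(x, d)$

$\Delta w = -\eta \frac{dE}{dw} = -\eta \frac{\partial E}{\partial s_1} \frac{\partial s_1}{\partial w} = -\eta\, \delta(s_1)\, x$

$\delta(s_1) = \partial E/\partial s_1$ is the local gradient with respect to node 1 or $s_1$:

$\delta(s_1) = \frac{\partial E}{\partial s_2} \cdot \frac{ds_2}{dy_1} \cdot \frac{dy_1}{ds_1} = \delta(s_2)\, v_1\, \sigma'(s_1) = e_1\, \sigma'(s_1)$

$e_1 = \delta(s_2)\, v_1$ is the backpropagated error, with $\delta(s_2) = \partial E/\partial s_2 = -e\, \sigma'(s_2)$

Detailed notation: $\Delta w = -\eta\, e_1\, \sigma'(s_1)\, x = \eta\, e\, \sigma'(s_2)\, v_1\, \sigma'(s_1)\, x$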

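Linear ANN for error back propagation

[Figure: the forward network and the linear error backpropagation network side by side, with layers I (nodes $k = 1 \dots N$, inputs $x_1 \dots x_k \dots x_N$), H1 (nodes $i$), H2 (nodes $j$) and O (nodes $h = 1 \dots M$, outputs $y_1 \dots y_h \dots y_M$).]

Forward step: $s_j = \sum_i w_{ji} y_i$; $y_j = \sigma(s_j)$

Backpropagation step, with the local gradient $\delta = \partial E/\partial s$:

output layer O: $e_h = y^*_h - y_h$; $\delta_h = -e_h\, \sigma'(s_h)$; $\Delta w_{hj} = -\eta\, \delta_h\, y_j$

hidden layer H2: $e_j = \sum_h \delta_h w_{hj}$; $\delta_j = e_j\, \sigma'(s_j)$; $\Delta w_{ji} = -\eta\, \delta_j\, y_i$

hidden layer H1: $e_i = \sum_j \delta_j w_{ji}$; $\delta_i = e_i\, \sigma'(s_i)$; $\Delta w_{ik} = -\eta\, \delta_i\, x_k$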

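Sequential weights learning

Training set: $(x^k, y^{*k})$, $k = 1 \dots Q$

Desired output vector: $y^{*k} = (y^{*k}_m,\ m = 1 \dots M)$

Output vector: $y^k = (y^k_m,\ m = 1 \dots M)$, produced by $x^k = (x^k_i,\ i = 1 \dots N)$

Error function: $E(W) = \frac{1}{2}\sum_m (y^{*k}_m - y^k_m)^2 = \frac{1}{2}\sum_m (e^k_m)^2$

Update formula: $\Delta w_{ji} = -\eta\, \frac{dE}{dw_{ji}} = -\eta\, \delta_j\, y_i$, where $\delta_j = \partial E/\partial s_j$ is the local gradient of node $j$ and $y_i$ is the input of the connection.

Learning expressions (for each pair $x^k, y^{*k}$; the superscript $k$ has been dropped, and $w_{mj}$ denotes the weight of the connection from node $j$ to node $m$):

output layer O: $y_m = \sigma(s_m)$; $e_m = y^*_m - y_m$; $\delta_m = -e_m\, \sigma'(s_m)$; $\Delta w_{mj} = -\eta\, \delta_m\, y_j$

hidden layer H2: $e_j = \sum_m \delta_m w_{mj}$; $\delta_j = e_j\, \sigma'(s_j)$; $\Delta w_{jk} = -\eta\, \delta_j\, y_k$

hidden layer H1: $e_k = \sum_j \delta_j w_{jk}$; $\delta_k = e_k\, \sigma'(s_k)$; $\Delta w_{ki} = -\eta\, \delta_k\, x_i$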
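The layer-by-layer expressions can be checked numerically. The sketch below uses a single hidden layer and no bias inputs to stay short (both are simplifications, not the three-layer network of the slide), takes $\delta = \partial E/\partial s$ as the local gradient, and uses arbitrary random weights; all names are illustrative.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def forward(x, W1, W2):
    y1 = sigmoid(W1 @ x)      # hidden layer activities
    y2 = sigmoid(W2 @ y1)     # output layer activities
    return y1, y2

def error(x, target, W1, W2):
    return 0.5 * np.sum((target - forward(x, W1, W2)[1]) ** 2)

def backprop(x, target, W1, W2):
    """Return dE/dW1 and dE/dW2 for one training pair."""
    y1, y2 = forward(x, W1, W2)
    e_out = target - y2                       # e_m = y*_m - y_m
    delta_out = -e_out * y2 * (1.0 - y2)      # delta_m = dE/ds_m at the output
    e_hid = W2.T @ delta_out                  # backpropagated error e_j
    delta_hid = e_hid * y1 * (1.0 - y1)       # delta_j = e_j * sigma'(s_j)
    return np.outer(delta_hid, x), np.outer(delta_out, y1)

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(3, 2))
W2 = rng.normal(scale=0.5, size=(2, 3))
x, target = np.array([0.2, -0.4]), np.array([1.0, 0.0])

dW1, dW2 = backprop(x, target, W1, W2)
eta = 0.1
W1_new, W2_new = W1 - eta * dW1, W2 - eta * dW2   # delta w = -eta * delta * input
```

Comparing `dW1` and `dW2` with finite differences of `error` is a simple way to confirm the signs of the update formulas.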

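Global synaptic weights learning

Training set: $(x^k, y^{*k})$, $k = 1 \dots Q$

Desired output vector: $y^{*k} = (y^{*k}_m,\ m = 1 \dots M)$

Output vector produced by $x^k = (x^k_i,\ i = 1 \dots N)$: $y^k = (y^k_m,\ m = 1 \dots M)$

Global error function: $E_g(W) = \frac{1}{2}\sum_k \sum_m (y^{*k}_m - y^k_m)^2 = \frac{1}{2}\sum_k \sum_m (e^k_m)^2$

Error backpropagation (for each pair $x^k, y^{*k}$; the superscript $k$ has been dropped, $\delta = \partial E/\partial s$ and $w_{mj}$ is the weight from node $j$ to node $m$):

output layer O: $y_m = \sigma(s_m)$; $e_m = y^*_m - y_m$; $\delta_m = -e_m\, \sigma'(s_m)$

hidden layer H2: $e_j = \sum_m \delta_m w_{mj}$; $\delta_j = e_j\, \sigma'(s_j)$

hidden layer H1: $e_k = \sum_j \delta_j w_{jk}$; $\delta_k = e_k\, \sigma'(s_k)$

Expressions for global learning:

$\Delta w_{ji} = -\eta\, \frac{dE_g}{dw_{ji}} = -\eta \sum_k \delta^k_j\, y^k_i = -\eta \sum_k \sigma'(s^k_j)\, e^k_j\, y^k_i$, where $e^k_j = \sum_h w_{hj}\, \delta^k_h$ and $\delta^k_j = \sigma'(s^k_j)\, e^k_j$ for a hidden node $j$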

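MLP for XOR

[Figure: MLP with inputs $x_1$, $x_2$, bias inputs 1 and output y; in the $(x_1, x_2)$ plane the corners (0,0) and (1,1) give y = 0, the corners (0,1) and (1,0) give y = 1.]

x1 x2 | y
0  0  | 0
0  1  | 1
1  0  | 1
1  1  | 0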
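One way to see why a hidden layer is enough for XOR is to fix the weights by hand. The sketch below is not the network of the slide: the ON-OFF units, the OR/AND hidden nodes and all weight values are assumptions chosen so that the result can be verified directly.

```python
def u(s):
    """ON-OFF activation: 1 if s > 0, else 0."""
    return 1 if s > 0 else 0

def xor_mlp(x1, x2):
    """Two hidden step units (OR and AND) followed by one output unit."""
    h_or = u(x1 + x2 - 0.5)        # active when at least one input is 1
    h_and = u(x1 + x2 - 1.5)       # active only when both inputs are 1
    return u(h_or - h_and - 0.5)   # OR and not AND

truth_table = [(x1, x2, xor_mlp(x1, x2)) for x1 in (0, 1) for x2 in (0, 1)]
```

The two hidden discriminants are parallel lines in the $(x_1, x_2)$ plane, and the output node selects the strip between them.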

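Dead zone to improve the classification reliability

[Figure: two schemes in the $(x_1, x_2)$ plane. First scheme: an MLP with separate outputs $y_A$ and $y_{A^*}$, with the boundaries $y_A = f_A(s) = 0.5$ and $y_{A^*} = f_{A^*}(s) = 0.5$. Second scheme: an MLP with a single output $z = f(s)$ followed by the thresholds $y_A = u(z - T)$ and $y_{A^*} = u(-z - T)$; the band I between the curves $z = f(s) = -T$ and $z = f(s) = T$ is the dead zone separating the regions A and A*.]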

MLP for the recognition of two classes with Gaussian p.d.f. (Haykin, Ch. 4.8)

[Figure: classes A and B in the $(x_1, x_2)$ plane, with the optimal Bayesian decision region (radius $r_A$) and the MLP discriminant; MLP with inputs $x_1$, $x_2$, 1 and outputs $y_A$, $y_B$.]

Error probability: MLP $P_e = 0.196$; Bayesian $P_e = 0.185$

Training parameters: $\eta = 0.1$, $\alpha = 0.5$

Notes

a) Momentum method: $\Delta w_{ij}(n) = \alpha\, \Delta w_{ij}(n-1) - \eta\, \delta_i(n)\, x_j(n)$, with $\alpha < 1$ and $\delta_i = \partial E/\partial s_i$

b) Suggested partitioning of the data into training and validation sets

[Figure: four sessions (1 to 4), each splitting the data into a training part (add.) and a validation part (val.).]

c) Normalization: to the mean value and to the eigenvalues

d) Initialization: small random weights (operation in the linear zone), $\eta = 0.1$, $\alpha \approx 0.9$
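The momentum method of note a) can be sketched on a one-dimensional example. The quadratic error $E(w) = \frac{1}{2}w^2$, whose gradient is simply $w$, and the starting point are assumptions made for illustration; `grad` stands for the product $\delta_i x_j$ of the note.

```python
def momentum_step(w, dw_prev, grad, eta=0.1, alpha=0.9):
    """delta w(n) = alpha * delta w(n-1) - eta * grad(n); returns (new w, delta w)."""
    dw = alpha * dw_prev - eta * grad
    return w + dw, dw

# Hypothetical error E(w) = 0.5 * w**2, so that dE/dw = w.
w, dw = 1.0, 0.0
history = []
for _ in range(2):
    w, dw = momentum_step(w, dw, grad=w)
    history.append(w)
```

After the first step the previous increment keeps contributing, so the second step is larger than plain gradient descent would give.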

SELF ORGANIZING MAPS (SOM)

a) The number of classes (clusters) is predefined
b) Predefined classification paradigm: likelihood in the statistical distribution

- model: arrangement of the neurons on the cerebral cortex
- learning model: excitatory/inhibitory neuron interactions
- geometrical representation: Voronoi tessellation

[Figure: Von der Malsburg model (input nodes $1 \dots i \dots N$, output nodes $1 \dots j \dots N$, bidirectional interactions) and Kohonen model (input x, output nodes $1 \dots j \dots M$ with weight vectors $w_1 \dots w_j \dots w_M$ and outputs $y_1 \dots y_j \dots y_M$).]

Dimensionality reduction (neurons on a grid)

[Figure: pattern space (large, continuous dimensionality) with the sample x and the weight vectors $w_i$, $w_j$; output space (two-dimensional, discrete) with the nodes $i$, $j$.]

SOM structure

[Figure: input layer (N nodes) with input vector $x = (x_1 \dots x_i \dots x_N)$; two-dimensional output layer (M nodes) with output vector y; weights $w_{ji}$, $i = 1 \dots N$, $j = 1 \dots M$.]

$j = \arg\min[\delta(x, w_h);\ h = 1 \dots M]$

$y_j = 1$; $y_h = 0$ for $h \neq j$

Learning paradigm

- competition (for the selection and activation of the output neuron corresponding to maximum activity)
- cooperation (for weight modification)
- synaptic adaptation: excitatory/inhibitory

Turing, 1952: a global structure can be obtained through local interactions only. The structure is implemented by local neural interconnections.

Principle 1: Interconnections are mainly excitatory

Principle 2: The limitation of resources favours specific activities

Principle 3: Weight modifications tend to be cooperative

Principle 4: A self-organizing system has to be redundant

Competition

winning neuron: $j = \arg\min[\|x - w_h\|;\ h = 1 \dots M]$ or $j = \arg\max[x^T w_h;\ h = 1 \dots M]$

Cooperation

lattice (Manhattan) distance $d(i, j)$ of nodes $i$ and $j$

neighbourhood functions:

Excitatory only: $h_i(j) = \exp[-d(i,j)^2 / 2\sigma^2]$

Mexican hat: $h_i(j) = a \exp[-d(i,j)^2 / 2\sigma_e^2] - b \exp[-d(i,j)^2 / 2\sigma_i^2]$

Synaptic updating: $\Delta w_i = \eta\, h_i(j)\, (x - w_i)$

$\eta$ and $\sigma$ decrease during learning.

Self-organisation: $\eta = 0.1 - 0.01$

Statistical convergence: $\eta = 0.01$, $1 \ge d(i,j) \ge 0$

[Figure: nodes $i$ and $j$ on the grid with $d(i,j) = 5$.]

Weights updating by gradient learning

$w_i$ ($i = 1 \dots M$): prototype vector of node $i$

Error function (winning node $j$):

$E_j(W) = \frac{1}{2}\sum_{i=1}^{M} h_i(j)\, \|x - w_i\|^2$

Computation of the gradient:

$\Delta E_j(w_i) = \mathrm{grad}(E_j(w_i)) \cdot \Delta w_i = \frac{\partial E_j(W)}{\partial w_i} \cdot \Delta w_i$

Weight updating: $\Delta w_i = -\eta\, \frac{\partial E_j(W)}{\partial w_i} = \eta\, h_i(j)\, (x - w_i)$

[Figure: neighbourhoods on the grid with Manhattan distance and with Euclidean distance.]
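The three phases (competition, cooperation, synaptic adaptation) fit in one short function. The sketch below assumes the excitatory-only Gaussian neighbourhood with Manhattan lattice distance; the grid layout, the prototype values and the parameters $\eta$ and $\sigma$ are illustrative only.

```python
import numpy as np

def som_step(W, grid, x, eta=0.1, sigma=1.0):
    """One SOM update; W holds one prototype per row, grid the node coordinates."""
    j = int(np.argmin(np.linalg.norm(W - x, axis=1)))   # competition: winning node
    d = np.abs(grid - grid[j]).sum(axis=1)               # Manhattan lattice distance d(i, j)
    h = np.exp(-d ** 2 / (2.0 * sigma ** 2))              # cooperation: h_i(j)
    W_new = W + eta * h[:, None] * (x - W)                # adaptation: eta * h_i(j) * (x - w_i)
    return W_new, j

# Three nodes on a line of the grid, with hypothetical prototypes.
grid = np.array([[0, 0], [0, 1], [0, 2]])
W = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
x = np.array([1.0, 1.0])
W_new, winner = som_step(W, grid, x)
```

Every prototype moves towards the sample, by an amount that shrinks with the lattice distance from the winner.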

Supervised SOM

[Figure: pattern vector $x = (x_i,\ i = 1 \dots N)$, input layer, hidden competitive layer (SOM, nodes $j = 1 \dots M$, weights $w_{ji}$), class layer (perceptron, nodes $i = 1 \dots K$, outputs $y_i$), desired class Y*.]

Fig. 14c) Adaptive Learning Vector Quantization (ALVQ)

[Figure: sample $x = (x_i,\ i = 1 \dots N)$, input layer, quantisation layer trained by SOM learning with $q = (q_j;\ j = 1 \dots M)$, perceptron layer producing the quantized vector $x^q = (x^q_i,\ i = 1 \dots N)$.]

Training of supervised SOMs: Learning Vector Quantizer (LVQ)

learning data: (x, c)

a) SOM learning (with x only)

b1) Output layer learning with the pairs (x, c), equivalent to (q, c)
b2) Learning with labelling
b3) Learning and labelling of the hidden layer: $\Delta w_c = \pm\eta\, (x - w_c)$, with + if x belongs to class C and − if it does not
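The update of step b3) can be sketched as below. This is a minimal LVQ-style illustration under stated assumptions: each hidden node already carries a class label, the winner is chosen by Euclidean distance, and the prototypes, labels and $\eta$ are hypothetical.

```python
import numpy as np

def lvq_update(W, labels, x, c, eta=0.1):
    """Move the winning prototype towards x if its label equals c, away otherwise."""
    j = int(np.argmin(np.linalg.norm(W - x, axis=1)))
    sign = 1.0 if labels[j] == c else -1.0
    W_new = W.copy()
    W_new[j] += sign * eta * (x - W[j])
    return W_new, j

W = np.array([[0.0, 0.0], [2.0, 2.0]])
labels = [0, 1]
x = np.array([0.5, 0.5])
W_same, j_same = lvq_update(W, labels, x, c=0)    # correct class: attraction
W_other, j_other = lvq_update(W, labels, x, c=1)  # wrong class: repulsion
```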

Statistical Inference of the ANN

[Figure: the ANN receives the pair $(x, c_k)$ and produces the outputs $y_1(x) \dots y_m(x) \dots y_k(x) \dots y_M(x)$; for a sample of class $c_k$ the desired outputs are $y^*_k(x) = 1$ and $y^*_m(x) = 0$ for $m \neq k$.]

Target coding of class $c_k$: $y^*_m(x) = \delta_{mk}$, i.e. $c_k = (\delta_{1k}, \dots, \delta_{kk}, \dots, \delta_{Mk})$ with $\delta_{kk} = 1$ and $\delta_{mk} = 0$ for $m \neq k$

$E^2 = \sum_x P(x) \left( \sum_k P(c_k/x) \sum_m [y_m(x) - y^*_m(x)]^2 \right)$

$E^2 = \sum_x P(x) \left( \sum_m \left\{ \sum_k P(c_k/x)\, [y_m(x) - \delta_{mk}]^2 \right\} \right)$

As $\delta_{mk} = 1$ only for $k = m$ and $\sum_k P(c_k/x) = 1$:

$\sum_k [y_m(x) - \delta_{mk}]^2 P(c_k/x) = y_m^2(x) - 2 y_m(x) P(c_m/x) + P(c_m/x)$

Adding and subtracting $P^2(c_m/x)$ we get:

$[y_m^2(x) - 2 y_m(x) P(c_m/x) + P^2(c_m/x)] + [P(c_m/x) - P^2(c_m/x)] = [y_m(x) - P(c_m/x)]^2 + P(c_m/x)\,[1 - P(c_m/x)]$

Only the first term depends on the ANN, so if the ANN has been correctly trained the minimum value of $E^2$ is obtained when:

$y_m(x) = P(c_m/x)$
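The result can be checked numerically for a single sample x. The sketch below assumes three classes with made-up posterior probabilities and searches, on a grid of candidate output values, for the $y_m$ that minimizes the inner sum $\sum_k P(c_k/x)\,[y_m - \delta_{mk}]^2$.

```python
import numpy as np

# Hypothetical posteriors P(c_k / x) for one fixed sample x.
posteriors = np.array([0.2, 0.5, 0.3])
candidates = np.linspace(0.0, 1.0, 1001)   # candidate values for the output y_m(x)

def expected_cost(y, m, P):
    """sum_k P(c_k/x) * (y - delta_mk)^2, with delta_mk the 0/1 target of output m."""
    targets = (np.arange(len(P)) == m).astype(float)
    return float(np.sum(P * (y - targets) ** 2))

best_outputs = np.array([
    candidates[np.argmin([expected_cost(y, m, posteriors) for y in candidates])]
    for m in range(len(posteriors))
])
```

The minimizing outputs coincide with the posteriors, and the residual cost at the minimum is $P(c_m/x)\,[1 - P(c_m/x)]$.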

Adaptive Neural Networks (Adaptive Resonance Theory, ART)

Paradigm of psychophysiological adaptation to the environment:
1) Selective attention: search for a situation in the knowledge domain
2) Resonance: if selective attention detects a known situation
3) Orientation: search for, or creation of, new knowledge

Advantages: plasticity and stability are compatible
Disadvantages: complexity of the structure and of the learning algorithm

Plasticity and Stability

• A training algorithm is plastic if it has the potential to adapt to new vectors indefinitely

• A training algorithm is stable if it preserves previously learned knowledge

[Figure: input pattern representation, with + marking the category representation and w the prototype representation. Selection is based on the input-prototype distance; classification is based on the input-category distance.]

Learning paradigm
- Activation of the recognition (output) layer by SOM competition (selective attention)
- Feedback to the comparison layer and evaluation of the resonance with the activated pattern
- Implementation of a new neuron if no resonance is possible (orientation)

[Figure: comparison layer (nodes $1 \dots i \dots N$, inputs $x_1 \dots x_i \dots x_N$) and category (recognition) layer (nodes $1 \dots j \dots P$, plus the new node $P+1$); bottom-up weights $W_j$ (components $w_{ji}$) and top-down weights $Z_j$ (components $z_{ij}$).]

Selective attention: $j = \arg\max[x^T w_h,\ h = 1 \dots P]$

$\rho$: resonance coefficient

$x^T z_j > \rho$: resonance; adaptation of $w_j$ and $z_j$

$x^T z_j < \rho$: orientation, if $x^T z_h > \rho$ for some $h \neq j$

If $x^T z_h < \rho$ for each $h = 1 \dots P$, a new node $P+1$ with $w_{P+1} = x$ is implemented

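ART1: for binary input patterns

[Figure: input nodes $x_1 \dots x_i \dots x_N$ and category nodes $1 \dots h \dots j \dots P$ (plus the new node $P+1$) with outputs $y_1 \dots y_h \dots y_j \dots y_P$; bottom-up weights $b_{ji}$ and top-down weights $t_{hi}$; resonance test comparing $t_j \cdot x$ with $\rho\,\|x\|$.]

If $t_j \cdot x < \rho\,\|x\|$ for all $j$, then generate node $P+1$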

Learning of ART1 (Pao model)

Initialization: $t^0_{ji} = 1$ and $b^0_{ji} = 1/(1+N)$

Competition phase: $y_j = b_j^T x$; $j = \arg\max[y_p;\ p = 1 \dots P]$

Selective attention (verification of the resonance): if $t_j^T x > \rho\,\|x\|$, resonance is satisfied, then

weight updating: $t^{k+1}_{ji} = t^k_{ji}\, x_i$ and $b^{k+1}_{ji} = t^k_{ji}\, x_i / (0.5 + t^k_j \cdot x)$

else (orientation): a new node is implemented with $t^0_{ji} = 1$ and $b^0_{ji} = 1/(1+N)$
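The ART1 steps above can be sketched as a small class. Several details are assumptions of this illustration rather than statements of the slides: the network starts with no category nodes, $\|x\|$ is taken as the number of ones in the binary input, candidates are tested in order of decreasing activity before a new node is created, and a newly created node is immediately updated with the current input. The class and method names are hypothetical.

```python
import numpy as np

class ART1:
    def __init__(self, n_inputs, rho=0.7):
        self.n, self.rho = n_inputs, rho
        self.t = []   # top-down weight vectors t_j
        self.b = []   # bottom-up weight vectors b_j

    def _new_node(self):
        self.t.append(np.ones(self.n))                        # t_ji = 1
        self.b.append(np.full(self.n, 1.0 / (1.0 + self.n)))  # b_ji = 1/(1+N)
        return len(self.t) - 1

    def _update(self, j, x):
        tx = self.t[j] @ x                         # t_j . x with the old weights
        self.b[j] = self.t[j] * x / (0.5 + tx)
        self.t[j] = self.t[j] * x

    def present(self, x):
        """Present a binary pattern and return the index of the resonating category."""
        x = np.asarray(x, dtype=float)
        y = np.array([b @ x for b in self.b])      # competition: y_j = b_j^T x
        for j in np.argsort(-y):                   # assumed search order
            if self.t[j] @ x > self.rho * x.sum(): # resonance test
                self._update(int(j), x)
                return int(j)
        j = self._new_node()                       # orientation: new category
        self._update(j, x)
        return j

net = ART1(n_inputs=4, rho=0.7)
patterns = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 0, 0]]
categories = [net.present(p) for p in patterns]
```

With these patterns the repeated input reuses its category, the disjoint pattern opens a new one, and the last pattern resonates with the first category and prunes its top-down template.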

Basic ART Structure

F2: field of category nodes; STM representation of the extracted category

F1: field of comparison nodes; STM representation of the filtered input and category patterns

LTM: representation of the learned and stored information (in F1 and F2)

STM: Short Term Memory (node activities); LTM: Long Term Memory (connection weights)

A: control node

- Input I generates the activity pattern X, non-specifically activates A and extracts the category Y
- The category pattern V generates the activity X* and deactivates A
- Because of the mismatch, a new category is searched
- A new category is extracted
- A new comparison cycle is started

ART2

[Figure: F1 patterns layer with the loop $w = i + a u$, $x = w/|w|$, $v = f(x) + b f(q)$, $u = v/|v|$, $p = u + g(y_J) z_J$, $q = p/|p|$, $r = u + c p$; F2 category layer, where J is the selected category. Reset if $\rho/|r| > 1$.]

Category selection: F1 loop-processing

$w = a u + i$; $x = w/|w|$; $v = b f(\theta, q) + f(\theta, x)$; $u = v/|v|$; $p = u$; $q = p/|p|$

Then: $T_h = p \cdot z^B_h$; $J = \arg\max[T_h,\ h = 1 \dots P]$

Parameters: $a$, $b$, $\theta$

Non-linear filter: $f(\theta, x) = 0$ if $x < \theta$, else $f(\theta, x) = x$

Resonance evaluation: F2 top-down and F1 loop-processing

$p = u + d z^T_J$; $w = a u + i$; $x = w/|w|$; $v = b f(\theta, q) + f(\theta, x)$; $u = v/|v|$; $q = p/|p|$

Then: $r = (u + c p)/(|u| + c|p|)$

Resonance condition: $\rho/|r| < 1$

Parameters: $d$, $c$, $\rho$

If resonance: ART learning for category J (updating of the F1-F2 connection weights)

F1 → F2: $\Delta z^B_J = d u - d(1-d) z^B_J$

F2 → F1: $\Delta z^T_J = d u - d(1-d) z^T_J$

else Reset and Orientation: selection of another category (next lower $T_h$)

If no category resonates: implementation of a new category

ART2 characteristics

a. Stability/Plasticity Trade-Off
b. Search/Direct-Access Trade-Off
c. Match/Reset Trade-Off
d. STM Invariance under Read-Out of Matched LTM
e. Coexistence of LTM Read-Out and STM Normalization
f. No LTM Recoding by Superset Inputs
g. Stable Choice until Reset
h. Contrast Enhancement, Noise Suppression and Mismatch Attenuation by Non-Linear Filtering
i. Rapid Self-Stabilization
j. Normalization
k. Local Computation

[Figure: ART classification of samples $x = (x_1, x_2)$: (a) low threshold, (b) high threshold. From: G.A. Carpenter and S. Grossberg, Applied Optics, 1987, Vol. 26, p. 4920, 49221]

Computer experiment: apply ART2 to category recognition
