
Neural Networks - Lecture 8

Page 1: Unsupervised competitive learning

• Particularities of unsupervised learning

• Data clustering

• Neural networks for clustering

• Vectorial quantization

• Topological mapping and Kohonen networks

Page 2: Particularities of unsupervised learning

Supervised learning:

• The training set contains both inputs and correct answers
• Example: classification in predefined classes for which labeled examples are known
• It amounts to optimizing an error function which measures the difference between the correct answers and the answers given by the network

Unsupervised learning:

• The training set contains just input data
• Example: grouping data in categories based on the similarities between them
• Relies on the statistical properties of the data
• Does not use an error concept but a clustering-quality concept, which should be maximized

Page 3: Data clustering

Data clustering = identifying natural groups (clusters) in the data set such that:
– Data in the same cluster are highly similar
– Data belonging to different clusters are dissimilar enough

Remark: there is no a priori labeling of the data (unlike supervised classification), and sometimes even the number of clusters is unknown

Applications:
• Identification of user profiles (e-commerce systems, e-learning systems, etc.)
• Identification of homogeneous regions in digital images (image segmentation)
• Electronic document categorization
• Biological data analysis

Page 4: Data clustering

Example: image segmentation = identification of the homogeneous regions in the image = reduction of the number of labels (colours) in the image in order to support image analysis

Remark: satellite image obtained by combining three spectral bands

Page 5: Data clustering

A key element in data clustering is the similarity/dissimilarity measure.

The choice of an appropriate similarity/dissimilarity measure is influenced by the nature of the attributes:

• Numerical attributes
• Categorical attributes
• Mixed attributes

Page 6: Data clustering

• Measures adequate for numerical data:

Similarity measure (cos): S(X,Y) = \frac{X^T Y}{\|X\| \, \|Y\|}

Dissimilarity measure (Euclidean distance): d(X,Y) = \|X - Y\|

For normalized vectors: d^2(X,Y) = 2(1 - S(X,Y))
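As a quick sketch, the two measures and the stated relation for normalized vectors can be checked in a few lines of Python (function names are illustrative):

```python
import math

def cos_similarity(x, y):
    # S(X, Y) = X^T Y / (||X|| ||Y||)
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return dot / (nx * ny)

def euclidean(x, y):
    # d(X, Y) = ||X - Y||
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# For normalized (unit) vectors, d^2(X, Y) = 2 * (1 - S(X, Y))
x = [3.0 / 5.0, 4.0 / 5.0]   # unit vector
y = [1.0, 0.0]               # unit vector
assert abs(euclidean(x, y) ** 2 - 2 * (1 - cos_similarity(x, y))) < 1e-12
```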

Page 7: Data clustering

Clustering methods:

• Partitional methods:
– Lead to a partition of the data in disjoint or partially overlapping clusters
– Each cluster is characterized by one or several prototypes

• Hierarchical methods:
– Lead to a hierarchy of partitions which can be visualized as a tree-like structure called a dendrogram

Page 8: Data clustering

The prototypes can be:
• The average of the data in the class
• The median of the data in the class

The data are assigned to clusters based on the nearest prototype (nearest-neighbour criterion)
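The nearest-prototype assignment can be sketched as follows (names are illustrative):

```python
def assign_cluster(x, prototypes):
    # return the index of the prototype closest to x (nearest-neighbour criterion)
    def d2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(range(len(prototypes)), key=lambda k: d2(x, prototypes[k]))

prototypes = [[0.0, 0.0], [1.0, 1.0]]
assert assign_cluster([0.1, 0.2], prototypes) == 0
assert assign_cluster([0.9, 0.8], prototypes) == 1
```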

Page 9: Neural networks for clustering

Problem: unsupervised classification of N-dimensional data in K clusters

Architecture:
• One input layer with N units
• One linear layer with K output units
• Full connectivity between the input and the output layers (the weight matrix W contains on row k the prototype corresponding to class k)

Functioning:
For an input X, compute the distances between X and all prototypes and find the closest one. This corresponds to the cluster to which X belongs.

The unit having the weight vector closest to X is called the winning unit.


Page 10: Neural networks for clustering

Training:

Training set: {X1,X2,…, XL}

Algorithms:
• The number of clusters (output units) is known:
– “Winner Takes All” algorithm (WTA)
• The number of clusters (output units) is unknown:
– “Adaptive Resonance Theory” (ART algorithms)

Particularities of these algorithms:
• Only the weights corresponding to the winning unit are adjusted
• The learning rate is decreasing

Page 11: Neural networks for clustering

Example: WTA algorithm

Initialization:
  W_k := randomly selected element from the training set, k = 1..K
  t := 1
REPEAT
  FOR l := 1, L DO
    find k* such that d(W_k*, X_l) <= d(W_k, X_l) for all k = 1..K
    W_k* := W_k* + eta(t) * (X_l - W_k*)    // prototype adjustment
  ENDFOR
  t := t + 1
UNTIL (t > t_max)
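A minimal Python sketch of the WTA scheme above; the initialization from the training set, the 1/t learning-rate schedule, and the fixed iteration budget are simplifying assumptions:

```python
import random

def wta(data, K, t_max=50, eta0=0.5, seed=0):
    """Winner Takes All: only the winning prototype moves toward each input."""
    rng = random.Random(seed)
    # initialize prototypes with randomly selected elements of the training set
    W = [list(x) for x in rng.sample(data, K)]
    for t in range(1, t_max + 1):
        eta = eta0 / t  # decreasing learning rate
        for x in data:
            # find the winning unit k* (closest prototype)
            k_star = min(range(K),
                         key=lambda k: sum((wi - xi) ** 2 for wi, xi in zip(W[k], x)))
            # adjust only the winner: W_k* := W_k* + eta * (x - W_k*)
            W[k_star] = [wi + eta * (xi - wi) for wi, xi in zip(W[k_star], x)]
    return W

data = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
W = wta(data, K=2)
```

Note that the per-sample update touches a single row of W, which is what distinguishes WTA from, e.g., batch k-means.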

Page 12: Neural networks for clustering

Remarks:
• Decreasing learning rate

The learning rate \eta(t) should satisfy:

\lim_{t \to \infty} \eta(t) = 0, \quad \sum_{t \ge 1} \eta(t) = \infty, \quad \sum_{t \ge 1} \eta^2(t) < \infty

Examples:

\eta(t) = \eta(0) / t^\alpha, \; \alpha \in [0.5, 1]

\eta(t) = \eta(0) / \ln(1 + t)
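The two example schedules can be written directly (the default values of η(0) and α are illustrative):

```python
import math

def eta_power(t, eta0=1.0, alpha=0.75):
    # eta(t) = eta(0) / t^alpha, with alpha in [0.5, 1]
    return eta0 / t ** alpha

def eta_log(t, eta0=1.0):
    # eta(t) = eta(0) / ln(1 + t)
    return eta0 / math.log(1 + t)

# both schedules decrease toward 0
assert eta_power(100) < eta_power(10) < eta_power(1)
assert eta_log(100) < eta_log(10) < eta_log(1)
```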

Page 13: Neural networks for clustering

Remarks:
• “Dead” units: units which are never winners

Cause: inappropriate initialization of prototypes

Solutions:
• Using vectors from the training set as initial prototypes

• Adjusting not only the winning units but also the other units (using a much smaller learning rate)

• Penalizing the units which frequently become winners

Page 14: Neural networks for clustering

• Penalizing the units which frequently become winners: change the criterion used to establish whether a unit is a winner

Unit k* is declared the winner if:

d(X, W_{k*}) - \theta_{k*} \le d(X, W_k) - \theta_k, \quad for all k

where \theta_k is a threshold which decreases as the number of situations when unit k is the winner increases.

Page 15: Neural networks for clustering

It is useful to normalize both the input data and the weights (prototypes):

• The normalization of the data from the training set is done before the training
• The weights are normalized during the training, e.g. W_k* := W_k* / \|W_k*\| after each adjustment

Page 16: Neural networks for clustering

Adaptive Resonance Theory: gives solutions for the following problems arising in the design of unsupervised classification systems:

• Adaptability
– Refers to the capacity of the system to assimilate new data and to identify new clusters (this usually means a variable number of clusters)

• Stability
– Refers to the capacity of the system to preserve the cluster structure, such that during the adaptation process the system does not radically change its output for a given input

Page 17: Neural networks for clustering

Example: ART2 algorithm

Initialization:
  Choose the initial number of prototypes K (K < L)
  W_k := randomly selected element from the training set, k = 1..K
  t := 1
REPEAT
  FOR l := 1, L DO
    find k* such that d(W_k*, X_l) <= d(W_k, X_l) for all k = 1..K
    IF d(W_k*, X_l) > rho AND K < K_max THEN
      K := K + 1; W_K := X_l              // create a new cluster
    ELSE
      W_k* := (W_k* * card(k*) + X_l) / (card(k*) + 1)   // prototype adjustment
    ENDIF
  ENDFOR
  t := t + 1
UNTIL (t > t_max)
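A minimal single-pass sketch of this ART2-style scheme (names and the `K_max` cap are illustrative; the update is the running average of the inputs assigned so far, matching the card(k*) formula above):

```python
import math

def art_cluster(data, rho, K_max=10):
    """Create a new prototype when no existing one is within distance rho."""
    prototypes = []   # W_k
    counts = []       # card(k): number of inputs assigned to cluster k
    for x in data:
        if prototypes:
            k_star = min(range(len(prototypes)),
                         key=lambda k: math.dist(prototypes[k], x))
            d = math.dist(prototypes[k_star], x)
        if not prototypes or (d > rho and len(prototypes) < K_max):
            prototypes.append(list(x))   # K := K + 1; W_K := X_l
            counts.append(1)
        else:
            # W_k* := (W_k* * card(k*) + X_l) / (card(k*) + 1)
            c = counts[k_star]
            prototypes[k_star] = [(wi * c + xi) / (c + 1)
                                  for wi, xi in zip(prototypes[k_star], x)]
            counts[k_star] = c + 1
    return prototypes

data = [[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.95, 1.05]]
protos = art_cluster(data, rho=0.5)
assert len(protos) == 2
```

Running the loop over `data` in a different order can change the result, which is exactly the drawback mentioned in the remarks on the next slide.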

Page 18: Neural networks for clustering

Remarks:

• The value of ρ influences the number of output units (clusters):
– A small value of ρ leads to a large number of clusters
– A large value of ρ leads to a small number of clusters

• Main drawback: the presentation order of the training data influences the training process

• The main difference between this algorithm and the one used to find the centers of an RBF network is just the adjustment equation

• There are also versions for binary data (the ART1 algorithm)

Page 19: Vectorial quantization

Aim of vectorial quantization:

• Mapping a region of R^N to a finite set of prototypes

• Allows the partitioning of an N-dimensional region in a finite number of subregions, such that each subregion is identified by a prototype

• The quantization process allows replacing an N-dimensional vector with the index of the region which contains it, leading to a very simple lossy compression method. The aim is to minimize the loss of information.

• The number of prototypes should be large in dense regions and small in less dense regions

Page 20: Vectorial quantization

Example:

[Figure: two-dimensional data in the unit square, partitioned into regions identified by prototypes]

Page 21: Vectorial quantization

• If the number of regions is predefined then one can use the WTA algorithm

• There is also a supervised variant of vectorial quantization (LVQ - Learning Vector Quantization)

Training set: {(X1,d1),…,(XL,dL)}

LVQ algorithm:
1. Initialize the prototypes by applying a WTA algorithm to the set {X1,…,XL}
2. Identify the clusters based on the nearest-neighbour criterion
3. Establish the label of each class by using the labels from the training set: each class is assigned the most frequent label among d1,…,dL

Page 22: Vectorial quantization

• Iteratively adjust the prototypes by applying an algorithm similar to that of the perceptron. At each iteration one applies:

FOR l := 1, L DO
  find k* such that d(X_l, W_k*) <= d(X_l, W_k) for all k
  IF d_l = c(k*) THEN
    W_k* := W_k* + eta(t) * (X_l - W_k*)
  ELSE
    W_k* := W_k* - eta(t) * (X_l - W_k*)
  ENDIF
ENDFOR

where c(k*) denotes the class label assigned to prototype k*.
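One such LVQ adjustment step can be sketched as follows (names are illustrative; c(k*) is represented by a `labels` list):

```python
def lvq_step(W, labels, x, d_l, eta):
    """Move the winning prototype toward x if its class label matches d_l,
    away from x otherwise (LVQ1-style update)."""
    # find the winning prototype k* (nearest-neighbour criterion)
    k_star = min(range(len(W)),
                 key=lambda k: sum((wi - xi) ** 2 for wi, xi in zip(W[k], x)))
    sign = 1.0 if labels[k_star] == d_l else -1.0
    # W_k* := W_k* +/- eta * (x - W_k*)
    W[k_star] = [wi + sign * eta * (xi - wi) for wi, xi in zip(W[k_star], x)]
    return W

W = [[0.0, 0.0], [1.0, 1.0]]
labels = ["a", "b"]
W = lvq_step(W, labels, [0.2, 0.0], "a", eta=0.5)   # correct label: winner moves closer
```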

Page 23: Topological mapping

• This is a variant of vectorial quantization which ensures the preservation of the neighbourhood relationship:

– Similar input data belong either to the same cluster or to neighbouring clusters

– It is necessary to define a neighbourhood relation on the prototypes and on the output units of the network

– The networks for topological mapping have the output units placed on the nodes of a geometrical grid (one-, two-, or three-dimensional)

– Networks having such an architecture are called Kohonen networks

Page 24: Kohonen networks

• Architecture

Two inputs and a two-dimensional grid

Square grid Hexagonal grid

Page 25: Kohonen networks

• Functioning: similar to all competitive networks – the winning unit is found by the nearest-neighbour criterion

• In order to define a topology on the output layer one needs:
– A position vector r_p for each output unit
– A distance between position vectors

d_1(r_p, r_q) = \sqrt{\sum_{i=1}^{n} (r_p^i - r_q^i)^2}   (Euclidean distance)

d_2(r_p, r_q) = \sum_{i=1}^{n} |r_p^i - r_q^i|   (Manhattan distance)

d_3(r_p, r_q) = \max_{i=1..n} |r_p^i - r_q^i|
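The three distances can be written directly; for diagonal neighbours on a square grid they already disagree, which is why the radius-1 neighbourhoods on the next slide differ (function names are illustrative):

```python
def d_euclid(p, q):
    # d1: Euclidean distance between position vectors
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q)) ** 0.5

def d_manhattan(p, q):
    # d2: Manhattan distance
    return sum(abs(pi - qi) for pi, qi in zip(p, q))

def d_max(p, q):
    # d3: maximum (Chebyshev) distance
    return max(abs(pi - qi) for pi, qi in zip(p, q))

# diagonal neighbours on a square grid
p, q = (0, 0), (1, 1)
assert d_manhattan(p, q) == 2
assert d_max(p, q) == 1
assert abs(d_euclid(p, q) - 2 ** 0.5) < 1e-12
```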

Page 26: Kohonen networks

• The neighbourhood of radius s of unit p is: V_s(p) = { q : d(r_p, r_q) <= s }

• Example: for a square grid, the neighbourhoods of radius 1 defined by the previous three distances are: