cluster algorithms adriano joaquim de o cruz ©2006 ufrj [email protected]

Cluster Algorithms

Adriano Joaquim de O Cruz ©2006

UFRJ

[email protected]

K-means


K-means algorithm

Based on the Euclidean distances among elements of the cluster.

The centre of the cluster is the mean value of the objects in the cluster.

Classifies objects in a hard way: each object belongs to a single cluster.


K-means algorithm

Consider $n$ objects, $X = \{x_1, x_2, \ldots, x_n\}$, and $k$ clusters.

Each object $x_i$ is defined by $l$ characteristics, $x_i = (x_{i1}, x_{i2}, \ldots, x_{il})$.

Consider $A$, a set of $k$ clusters, $A = \{A_1, A_2, \ldots, A_k\}$.


K-means properties

The union of all clusters makes the universe:

$$\bigcup_{i=1}^{k} A_i = X$$

No element belongs to more than one cluster:

$$A_i \cap A_j = \emptyset, \quad i \neq j$$

There is no empty cluster:

$$\emptyset \subset A_i \subset X, \quad \forall i$$


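Membership function

$$\chi_{A_i}(x_e) = \chi_{ie} = \begin{cases} 1, & x_e \in A_i \\ 0, & x_e \notin A_i \end{cases}$$

$$\sum_{i=1}^{k} \chi_{ie} = 1, \quad \forall e$$

$$\chi_{ie} \wedge \chi_{je} = 0, \quad \forall e,\; i \neq j$$

$$0 < \sum_{e=1}^{n} \chi_{ie} < n, \quad \forall i$$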


Membership matrix U

Matrix containing the values of inclusion of each element into each cluster (0 or 1).

The matrix has $c$ (clusters) rows and $n$ (elements) columns.

The sum of all elements in a column must be equal to one (each element belongs to only one cluster).

The sum of each row must be less than $n$ and greater than 0: there is no empty cluster and no cluster containing all elements.


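Matrix examples

[Figure: six elements arranged in two rows, X1 X2 X3 above X4 X5 X6.]

Two examples of clustering. What do the clusters represent?

$$U_1 = \begin{bmatrix} 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 \end{bmatrix}$$

$$U_2 = \begin{bmatrix} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \end{bmatrix}$$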


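Matrix examples cont.

[Figure: six elements arranged in two rows, X1 X2 X3 above X4 X5 X6.]

$$U_1 = \begin{bmatrix} 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 \end{bmatrix}$$

$$U_2 = \begin{bmatrix} 0 & 1 & 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 \end{bmatrix}$$

U1 and U2 describe the same partition: only the order of the rows (the cluster labels) is swapped.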


How many clusters?

The cardinality of any hard k-partition of n elements is

$$N(n, k) = \frac{1}{k!} \sum_{i=1}^{k} \binom{k}{i} (-1)^{k-i}\, i^{n}$$


How many clusters (example)?

Consider the matrix U2 (k=3, n=6):

$$\frac{1}{3!}\left[\binom{3}{1}(-1)^{2}\,1^{6} + \binom{3}{2}(-1)^{1}\,2^{6} + \binom{3}{3}(-1)^{0}\,3^{6}\right] = \frac{3 - 192 + 729}{6} = 90$$
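The count above can be checked numerically. The following is a minimal sketch of the partition-counting formula; the function name `count_hard_partitions` is an illustrative choice, not something taken from the slides.

```python
from math import comb, factorial

def count_hard_partitions(n: int, k: int) -> int:
    """Number of ways to split n objects into k non-empty hard clusters."""
    # Sum over i = 1..k of C(k, i) * (-1)^(k - i) * i^n, then divide by k!.
    total = sum(comb(k, i) * (-1) ** (k - i) * i ** n for i in range(1, k + 1))
    return total // factorial(k)

print(count_hard_partitions(6, 3))  # 90, the value of the worked example
```

The number grows very quickly with n, which is why exhaustive search over all partitions is not practical.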


K-means inputs and outputs

Inputs: the number of clusters $c$ and a database containing $n$ objects with $l$ characteristics each.

Output: a set of $c$ clusters that minimises the square-error criterion.


Number of Clusters

[Figure: log of the number of partitions plotted against the number of clusters, with series labelled No. 5, No. 10, No. 15 and No. 20.]


K-means algorithm v1

Arbitrarily assign each object to a cluster (matrix U).

Repeat
  Update the cluster centres;
  Reassign objects to the clusters to which the objects are most similar;
Until no change;


K-means algorithm v2

Arbitrarily choose $c$ objects as the initial cluster centres.

Repeat
  Reassign objects to the clusters to which the objects are most similar.
  Update the cluster centres.
Until no change
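The v2 outline can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `k_means`, the `seed` and `max_iter` parameters, the rule of keeping the old centre when a cluster becomes empty, and the sample data are all hypothetical choices, not part of the slides.

```python
import math
import random

def k_means(points, c, seed=0, max_iter=100):
    """Hard k-means: pick c objects as centres, then alternate
    reassignment and centre update until no assignment changes."""
    rng = random.Random(seed)
    centres = rng.sample(points, c)          # arbitrary initial centres
    labels = None
    for _ in range(max_iter):
        # Reassign every object to its nearest centre (Euclidean distance).
        new_labels = [min(range(c), key=lambda i: math.dist(p, centres[i]))
                      for p in points]
        if new_labels == labels:             # "until no change"
            break
        labels = new_labels
        # Update each centre as the mean of the objects assigned to it.
        for i in range(c):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:                      # assumption: keep old centre if empty
                centres[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centres

data = [(1, 1), (2, 2), (1, 2), (2, 1), (5, 5), (5, 6), (6, 6), (6, 5)]
labels, centres = k_means(data, c=2)
print(labels, centres)
```

Because the result depends on the initial centres, it is common to run such a routine several times with different seeds and keep the partition with the lowest square error.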


Algorithm details

The algorithm tries to minimise the function

$$J(U, v) = \sum_{e=1}^{n} \sum_{i=1}^{c} \chi_{ie}\,(d_{ie})^{2}$$

$d_{ie}$ is the distance between the element $x_e$ ($l$ characteristics) and the centre of the cluster $i$ ($v_i$):

$$d_{ie} = d(x_e - v_i) = \lVert x_e - v_i \rVert = \left[\sum_{j=1}^{l} (x_{ej} - v_{ij})^{2}\right]^{1/2}$$


Cluster Centre

The centre of the cluster $i$ ($v_i$) is a vector of $l$ characteristics.

The $j$th co-ordinate is calculated as

$$v_{ij} = \frac{\sum_{e=1}^{n} \chi_{ie}\, x_{ej}}{\sum_{e=1}^{n} \chi_{ie}}$$


Detailed Algorithm

Choose $c$ (number of clusters).
Set error ($\varepsilon > 0$) and step ($r = 0$).
Arbitrarily set matrix $U^{(r)}$. Do not forget: each element belongs to a single cluster, there is no empty cluster and no cluster has all elements.


Detailed Algorithm cont.

Repeat
  Calculate the centres of the clusters $v_i^{(r)}$
  Calculate the distance $d_{ie}^{(r)}$ of each point to the centre of the clusters
  Generate $U^{(r+1)}$ recalculating all characteristic functions using the equation

  $$\chi_{ie}^{(r+1)} = \begin{cases} 1, & d_{ie}^{(r)} = \min_{j}\, d_{je}^{(r)}, \quad j = 1, \ldots, k \\ 0, & \text{otherwise} \end{cases}$$

Until $\lVert U^{(r+1)} - U^{(r)} \rVert < \varepsilon$


Matrix norms

Consider a matrix $A = (a_{ij})$ of $n$ rows and $n$ columns:

Column norm

$$\lVert A \rVert_{1} = \max_{1 \le j \le n} \sum_{i=1}^{n} \lvert a_{ij} \rvert$$

Row (line) norm

$$\lVert A \rVert_{\infty} = \max_{1 \le i \le n} \sum_{j=1}^{n} \lvert a_{ij} \rvert$$
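As a small illustration, both norms can be computed directly from their definitions. The function names and the sample matrix below are hypothetical; a matrix is assumed to be a list of rows.

```python
def column_norm(a):
    """Column norm: largest sum of absolute values taken down a column."""
    return max(sum(abs(row[j]) for row in a) for j in range(len(a[0])))

def line_norm(a):
    """Line (row) norm: largest sum of absolute values taken along a row."""
    return max(sum(abs(x) for x in row) for row in a)

A = [[1, -2], [3, 4]]
print(column_norm(A), line_norm(A))  # 6 7
```

Either norm can serve in a stopping test such as the one used by the detailed algorithm, applied to the difference of two successive membership matrices.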


K-means problems?

Suitable when clusters are compact, well-separated clouds.

Scalable because the computational complexity is O(nkr), where r is the number of iterations.

The need to choose $c$ in advance is a disadvantage.
Not suitable for nonconvex shapes.
It is sensitive to noise and outliers because they influence the means.
The result depends on the initial allocation.


Examples of results

[Figure: scatter plot with a horizontal axis from 0 to 4 and a vertical axis from 0 to 6.]


K-means: Actual Data


K-means: Results


K-medoids


K-medoids methods

K-means is sensitive to outliers since an object with an extremely large value may distort the distribution of data.

Instead of taking the mean value, the most centrally located object (medoid) is used as the reference point.

The algorithm minimizes the sum of dissimilarities between each object and the medoid (similar to k-means).


K-medoids strategies

Find k medoids arbitrarily.
Each remaining object is clustered with the medoid to which it is the most similar.
Then iteratively replace one of the medoids by a non-medoid as long as the quality of the clustering is improved.

The quality is measured using a cost function that measures the average dissimilarity between the objects and the medoid of their cluster.


Reassignment costs

Each time a reassignment occurs, a difference in square-error $J$ is contributed.

The cost function $J$ calculates the total cost of replacing a current medoid by a non-medoid.

If the total cost is negative then $m_j$ is replaced by $m_{random}$, otherwise the replacement is not accepted.


Replacing medoids case 1

Object $p$ belongs to medoid $m_j$. If $m_j$ is replaced by $m_{random}$ and $p$ is closest to one of the $m_i$ ($i \neq j$), then $p$ is reassigned to $m_i$.

[Figure: positions of $m_i$, $m_j$, $m_{random}$ and $p$.]


Replacing medoids case 2

Object $p$ belongs to medoid $m_j$. If $m_j$ is replaced by $m_{random}$ and $p$ is closest to $m_{random}$, then $p$ is reassigned to $m_{random}$.

[Figure: positions of $m_i$, $m_j$, $m_{random}$ and $p$.]


Replacing medoids case 3

Object $p$ belongs to medoid $m_i$ ($i \neq j$). If $m_j$ is replaced by $m_{random}$ and $p$ is still closest to $m_i$, then nothing changes.

[Figure: positions of $m_i$, $m_j$, $m_{random}$ and $p$.]


Replacing medoids case 4

Object $p$ belongs to medoid $m_i$ ($i \neq j$). If $m_j$ is replaced by $m_{random}$ and $p$ is closest to $m_{random}$, then $p$ is reassigned to $m_{random}$.

[Figure: positions of $m_i$, $m_j$, $m_{random}$ and $p$.]


K-medoid algorithm

Arbitrarily choose $k$ objects as the initial medoids.

Repeat
  Assign each remaining object to the cluster with the nearest medoid;
  Randomly select a non-medoid object, $m_{random}$;
  Compute the total cost $J$ of swapping $m_j$ with $m_{random}$;
  If $J < 0$ then swap $m_j$ with $m_{random}$;
Until no change;
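The swap test at the heart of the loop can be sketched as follows. This is a hedged illustration, not the author's implementation: it assumes the dissimilarity is the Euclidean distance and that the cost of a configuration is the sum of distances from each object to its nearest medoid. The names `total_cost` and `swap_cost` and the one-dimensional sample data are hypothetical.

```python
import math

def total_cost(points, medoids):
    """Sum of the distances from every object to its nearest medoid."""
    return sum(min(math.dist(p, m) for m in medoids) for p in points)

def swap_cost(points, medoids, m_j, m_random):
    """J: change in total cost if medoid m_j is replaced by m_random."""
    swapped = [m_random if m == m_j else m for m in medoids]
    return total_cost(points, swapped) - total_cost(points, medoids)

points = [(1,), (2,), (3,), (10,), (11,), (12,)]
medoids = [(1,), (10,)]

J = swap_cost(points, medoids, m_j=(1,), m_random=(2,))
if J < 0:                                   # accept only improving swaps
    medoids = [(2,) if m == (1,) else m for m in medoids]
print(J, medoids)  # -1.0 [(2,), (10,)]
```

Recomputing the full cost for every candidate swap is what makes k-medoids more expensive than k-means, where a centre update is a simple mean.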



Comparisons?

K-medoids is more robust than k-means in the presence of noise and outliers.

K-means is less costly in terms of processing time.


Fuzzy C-means


Fuzzy C-means

Fuzzy version of K-means.
Elements may belong to more than one cluster.
Values of the characteristic function range from 0 to 1.
A value is interpreted as the degree of membership of an element in a cluster relative to all other clusters.


Fuzzy C-means setup

Consider $n$ objects, $X = \{x_1, x_2, \ldots, x_n\}$, and $c$ clusters.

Each object $x_i$ is defined by $l$ characteristics, $x_i = (x_{i1}, x_{i2}, \ldots, x_{il})$.

Consider $A$, a set of $c$ clusters, $A = \{A_1, A_2, \ldots, A_c\}$.


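Fuzzy C-means properties

The union of all clusters makes the universe:

$$\bigcup_{i=1}^{c} A_i = X$$

There is no empty cluster:

$$\emptyset \subset A_i \subset X, \quad \forall i$$

Clusters may overlap, so $A_i \cap A_j$ need not be empty for $i \neq j$.

The membership function takes values in the unit interval:

$$\mu_{A_i}(x_e) = \mu_{ie} \in [0, 1]$$

$$\sum_{i=1}^{c} \mu_{ie} = 1, \quad \forall e$$

$$0 < \sum_{e=1}^{n} \mu_{ie} < n, \quad \forall i$$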


Problems of probabilistic clusters

Points representing circle lines (C1 and C2).
Due to normalization, strange results may emerge.

[Figure: two circular clusters C1 and C2 with sample points annotated with the memberships {c1=0.5, c2=0.5}, {c1=0.5, c2=0.5}, {c1=0.7, c2=0.3} and {c1=0.3, c2=0.7}.]


Membership matrix U

Matrix containing the values of inclusion of each element into each cluster, in [0,1].

The matrix has $c$ (clusters) rows and $n$ (elements) columns.

The sum of all elements in a column must be equal to one.

The sum of each row must be less than $n$ and greater than 0: there is no empty cluster and no cluster containing all elements.


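Matrix examples

[Figure: six elements arranged in two rows, X1 X2 X3 above X4 X5 X6.]

Two examples of clustering. What do the clusters represent?

$$U_1 = \begin{bmatrix} 0.5 & 1 & 0 & 0.9 & 0.3 & 0.2 \\ 0.5 & 0 & 1 & 0.1 & 0.7 & 0.8 \end{bmatrix}$$

$$U_2 = \begin{bmatrix} 1 & 0.5 & 0 & 1 & 0.5 & 0 \\ 0 & 0.5 & 1 & 0 & 0.2 & 0 \\ 0 & 0.2 & 0 & 0 & 0.5 & 1 \end{bmatrix}$$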


Fuzzy C-means algorithm v1

Arbitrarily assign each object to a cluster (matrix U).

Repeat
  Update the cluster centres;
  Reassign objects to the clusters to which the objects are most similar;
Until no change;


Fuzzy C-means algorithm v2

Arbitrarily choose $c$ objects as the initial cluster centres.

Repeat
  Reassign objects to the clusters to which the objects are most similar.
  Update the cluster centres.
Until no change




The algorithm tries to minimise the function $J$, where $m$ is the nebulisation factor (fuzzifier):

$$J(U, v) = \sum_{e=1}^{n} \sum_{i=1}^{c} (\mu_{ie})^{m}\,(d_{ie})^{2}$$

$d_{ie}$ is the distance between the element $x_e$ ($l$ characteristics) and the centre of the cluster $i$ ($v_i$):

$$d_{ie} = d(x_e - v_i) = \lVert x_e - v_i \rVert = \left[\sum_{j=1}^{l} (x_{ej} - v_{ij})^{2}\right]^{1/2}$$


Nebulisation factor

$m$ is the nebulisation factor.
This value has a range $[1, \infty)$.
If $m = 1$ the system is crisp.
If $m \to \infty$ then all the membership values tend to $1/c$.
The most common values are 1.25 and 2.0.


Cluster Centre

The centre of the cluster $i$ ($v_i$) is a vector of $l$ characteristics.

The $j$th co-ordinate is calculated as

$$v_{ij} = \frac{\sum_{e=1}^{n} (\mu_{ie})^{m}\, x_{ej}}{\sum_{e=1}^{n} (\mu_{ie})^{m}}$$



Choose $c$ (number of clusters).
Set error ($\varepsilon > 0$), nebulisation factor ($m$) and step ($r = 0$).

Arbitrarily set matrix $U^{(r)}$. Do not forget: the memberships of each element must sum to one, there is no empty cluster and no cluster has all elements.



Repeat
  Calculate the centres of the clusters $v_i^{(r)}$
  Calculate the distance $d_{ie}^{(r)}$ of each point to the centre of the clusters
  Generate $U^{(r+1)}$ recalculating all characteristic functions (How?)
Until $\lVert U^{(r+1)} - U^{(r)} \rVert < \varepsilon$



How to recalculate?

$$\mu_{ie} = \begin{cases} \left[\sum_{k=1}^{c} \left(\dfrac{d_{ie}}{d_{ke}}\right)^{\frac{2}{m-1}}\right]^{-1}, & \text{if } d_{ie} > 0 \;\; \forall i \in [1..c] \\ 1, & \text{else if } d_{ie} = 0 \\ 0, & \text{else} \end{cases}$$

If all distances are greater than zero, the membership grade is computed from the ratios of the distance to this centre and the distances to all centres.

Otherwise the element coincides with a cluster centre: it belongs to that cluster and to no other.
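A minimal sketch of this recalculation for a single element is given below. It assumes Euclidean distances and a nebulisation factor m greater than 1 (for m = 1 the exponent is undefined); the function name `fcm_memberships` and the sample centres are hypothetical.

```python
import math

def fcm_memberships(x, centres, m=2.0):
    """Membership grades of one element x in every cluster."""
    d = [math.dist(x, v) for v in centres]
    if all(di > 0 for di in d):
        expo = 2.0 / (m - 1.0)
        # mu_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)); the grades sum to one.
        return [1.0 / sum((d[i] / d[k]) ** expo for k in range(len(d)))
                for i in range(len(d))]
    # x coincides with a centre: full membership there, zero elsewhere.
    return [1.0 if di == 0 else 0.0 for di in d]

print(fcm_memberships((0, 0), [(-1, 0), (1, 0)]))  # equidistant: [0.5, 0.5]
print(fcm_memberships((1, 0), [(1, 0), (4, 4)]))   # on a centre: [1.0, 0.0]
```

With m = 2 the grades are inversely proportional to the squared distances, so an element three times farther from one centre than from the other receives memberships of about 0.1 and 0.9.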


Example of clustering result

[Figure: scatter plot of the points listed below, with a horizontal axis from 0 to 8 and a vertical axis from 0 to 7.]

Point   x     y     C0       C1
0       1     1     0.9993   0.0007
1       2     2     0.9999   0.0001
2       1     2     0.9996   0.0004
3       2     1     0.9996   0.0004
4       5     5     0.0000   1.0000
5       5     6     0.0005   0.9995
6       6     6     0.0010   0.9990
7       6     5     0.0005   0.9995
8       3.5   3.5   0.4559   0.5441


Example of clustering result


Possibilistic Clustering



The membership degree is the representativity or typicality of the datum $x_e$ for the cluster $i$.

$$\mu_{A_i}(x_e) = \mu_{ie} \in [0, 1]$$

$$\sum_{i=1}^{c} \mu_{ie} > 0, \quad \forall e$$



The algorithm tries to minimise the function $J$, where $m$ is the nebulisation factor.

The first sum is the usual one and the second rewards high memberships.

$$J(U, v) = \sum_{e=1}^{n} \sum_{i=1}^{c} (\mu_{ie})^{m}\,(d_{ie})^{2} + \sum_{i=1}^{c} \eta_i \sum_{e=1}^{n} (1 - \mu_{ie})^{m}$$


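How to calculate?

$$\mu_{ie} = \frac{1}{1 + \left(\dfrac{d_{ie}^{2}}{\eta_i}\right)^{\frac{1}{m-1}}}$$

$$\eta_i = \frac{\sum_{e=1}^{n} (\mu_{ie})^{m}\, d_{ie}^{2}}{\sum_{e=1}^{n} (\mu_{ie})^{m}}$$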
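The two quantities can be sketched as below, assuming the usual possibilistic c-means form in which a membership is 1 / (1 + (d² / η)^(1/(m-1))) and η for a cluster is the membership-weighted mean of the squared distances to its centre. The function names are hypothetical, and m is assumed to be greater than 1.

```python
def pcm_membership(d_sq, eta, m=2.0):
    """Typicality of an element whose squared distance to the centre is d_sq."""
    return 1.0 / (1.0 + (d_sq / eta) ** (1.0 / (m - 1.0)))

def estimate_eta(memberships, d_sq_list, m=2.0):
    """eta for one cluster: weighted mean of squared distances to its centre."""
    weights = [u ** m for u in memberships]
    return sum(w * d for w, d in zip(weights, d_sq_list)) / sum(weights)

print(pcm_membership(0.0, 2.0))                  # on the centre: 1.0
print(pcm_membership(2.0, 2.0))                  # d^2 equal to eta: 0.5
print(estimate_eta([1.0, 1.0], [1.0, 3.0]))      # 2.0
```

Under these assumptions η acts as the squared distance at which the membership drops to 0.5, and memberships of one element in different clusters no longer have to sum to one.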



Choose $c$ (number of clusters) and $m$.

Set error ($\varepsilon > 0$) and step ($r = 0$).

Execute FCM.



For 2 times
  Initialize $U^{(0)}$ and the centres of the clusters $v_i^{(0)}$ with the previous results
  Initialize $\eta_i$ and $r = 0$
  Repeat
    Calculate the distance $d_{ie}^{(r)}$ of each point to the centre of the clusters
    Generate $U^{(r+1)}$ recalculating all characteristic functions using the equations
  Until $\lVert U^{(r+1)} - U^{(r)} \rVert < \varepsilon$
End For


Gustafson-Kessel Algorithm


Gustafson-Kessel method

This method (GK) is a fuzzy clustering method similar to the Fuzzy C-means (FCM).

The difference is the way the distance is calculated.

FCM uses Euclidean distances.
GK uses Mahalanobis distances.



The Mahalanobis distance is calculated as

$$d_{ik}^{2} = (x_k - v_i)^{T} A_i\, (x_k - v_i)$$

The matrices $A_i$ are given by

$$A_i = \sqrt[p]{\det(S_i)}\; S_i^{-1}$$

where $p$ is the number of dimensions of the data.



The Fuzzy Covariance Matrix is

$$S_i = \frac{\sum_{j=1}^{n} (\mu_{ij})^{m}\,(x_j - v_i)(x_j - v_i)^{T}}{\sum_{j=1}^{n} (\mu_{ij})^{m}}$$
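For two-dimensional data the fuzzy covariance matrix and the resulting GK distance can be written out by hand, which avoids any linear-algebra library. The sketch below assumes p = 2, so that the scaling factor is the square root of det(S); the function names and sample points are hypothetical.

```python
import math

def fuzzy_covariance_2d(points, memberships, centre, m=2.0):
    """S_i for 2-D data: membership-weighted scatter around the centre."""
    w = [u ** m for u in memberships]
    s = [[0.0, 0.0], [0.0, 0.0]]
    for wj, (x, y) in zip(w, points):
        dx, dy = x - centre[0], y - centre[1]
        s[0][0] += wj * dx * dx
        s[0][1] += wj * dx * dy
        s[1][0] += wj * dy * dx
        s[1][1] += wj * dy * dy
    total = sum(w)
    return [[v / total for v in row] for row in s]

def gk_distance_sq(x, centre, s):
    """d^2 = (x - v)^T A (x - v) with A = det(S)^(1/p) * S^-1 and p = 2."""
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    scale = math.sqrt(det)                       # det(S)^(1/2)
    inv = [[s[1][1] / det, -s[0][1] / det],      # closed-form 2x2 inverse
           [-s[1][0] / det, s[0][0] / det]]
    dx, dy = x[0] - centre[0], x[1] - centre[1]
    return scale * (dx * (inv[0][0] * dx + inv[0][1] * dy)
                    + dy * (inv[1][0] * dx + inv[1][1] * dy))

pts = [(1, 0), (-1, 0), (0, 1), (0, -1)]
S = fuzzy_covariance_2d(pts, [1.0] * 4, (0, 0))
print(S, gk_distance_sq((3, 4), (0, 0), S))      # round cluster: plain 25.0
```

For a round cluster the result equals the squared Euclidean distance. For an elongated one, such as S = [[4, 0], [0, 1]], a point along the long axis is measured as closer than a point at the same Euclidean distance along the short axis.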


GK comments

The clusters are hyperellipsoids in $\mathbb{R}^{l}$.

The hyperellipsoids have approximately the same size.

For $S^{-1}$ to be computable, the number of samples $n$ must be at least equal to the number of dimensions $l$ plus 1.


Results of GK


Gath-Geva method

It is also known as Gaussian Mixture Decomposition.

It is similar to the FCM method.
The Gauss distance is used instead of the Euclidean distance.
The clusters do not have a definite shape anymore and have various sizes.


Gath-Geva Method

The Gauss distance is given by

$$d_{ik}^{2} = \frac{\sqrt{\det(S_i)}}{P_i}\, \exp\!\left(\frac{1}{2}\,(x_k - v_i)^{T} A_i\, (x_k - v_i)\right)$$

with $A_i = S_i^{-1}$.


Gath-Geva Method

The term $P_i$ is the probability of a sample belonging to cluster $i$.

$$P_i = \frac{\sum_{j=1}^{n} (\mu_{ij})^{m}}{\sum_{j=1}^{n} \sum_{k=1}^{c} (\mu_{kj})^{m}}$$


Gath-Geva Comments

$P_i$ is a parameter that influences the size of a cluster.

Bigger clusters attract more elements.
The exponential term makes it more difficult to avoid local minima.
Usually another clustering method is used to initialise the partition matrix $U$.


GG Results – Random Centers


GG Results – Centers FCM


Clustering based on Equivalence Relations

A crisp relation $R$ on a universe $X$ can be thought of as a relation from $X$ to $X$.

$R$ is an equivalence relation if it has the following three properties:
Reflexivity: $(x_i, x_i) \in R$
Symmetry: $(x_i, x_j) \in R \rightarrow (x_j, x_i) \in R$
Transitivity: $(x_i, x_j) \in R$ and $(x_j, x_k) \in R \rightarrow (x_i, x_k) \in R$


Crisp tolerance relation

$R$ is a tolerance relation if it has the following two properties:

Reflexivity: $(x_i, x_i) \in R$

Symmetry: $(x_i, x_j) \in R \rightarrow (x_j, x_i) \in R$



Composition of Relations

[Figure: relation R maps X to Y and relation S maps Y to Z; the composition T = R°S maps X to Z.]


Composition of Crisp Relations

$$\chi_{R \circ S}(x, z) = \bigvee_{y \in Y} \left[\chi_R(x, y) \wedge \chi_S(y, z)\right]$$

Here $\vee$ is the max operator and $\wedge$ is the min (or product) operator.

The operation ° is similar to matrix multiplication.


Transforming Relations

A tolerance relation can be transformed into an equivalence relation by at most ($n-1$) compositions with itself.

$n$ is the cardinality of the set on which $R$ is defined.

$$R_1^{\,n-1} = R_1 \circ R_1 \circ \cdots \circ R_1 = R$$


Example of crisp classification

Let $X = \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}$.
Let $R$ be defined as the relation "has the identical remainder after division by 3".

This relation is an equivalence relation.


Relation Matrix

       1  2  3  4  5  6  7  8  9 10
  1    1  0  0  1  0  0  1  0  0  1
  2    0  1  0  0  1  0  0  1  0  0
  3    0  0  1  0  0  1  0  0  1  0
  4    1  0  0  1  0  0  1  0  0  1
  5    0  1  0  0  1  0  0  1  0  0
  6    0  0  1  0  0  1  0  0  1  0
  7    1  0  0  1  0  0  1  0  0  1
  8    0  1  0  0  1  0  0  1  0  0
  9    0  0  1  0  0  1  0  0  1  0
 10    1  0  0  1  0  0  1  0  0  1


Crisp Classification

Consider equivalent columns.
It is possible to group the elements in the following classes:
$R_0 = \{3, 6, 9\}$
$R_1 = \{1, 4, 7, 10\}$
$R_2 = \{2, 5, 8\}$
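The grouping by identical columns can be reproduced with a short script. This is a minimal sketch; the variable names are illustrative, and rows are used in place of columns, which is equivalent here because the relation matrix is symmetric.

```python
X = list(range(1, 11))

# Relation matrix: R[a][b] is 1 when a and b leave the same remainder mod 3.
R = [[1 if a % 3 == b % 3 else 0 for b in X] for a in X]

# Elements whose rows are identical belong to the same equivalence class.
classes = {}
for a, row in zip(X, R):
    classes.setdefault(tuple(row), []).append(a)

for members in classes.values():
    print(f"R{members[0] % 3} = {members}")
```

The script lists the three classes of the example, one per remainder.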


Clustering and Fuzzy Equivalence Relations

A fuzzy relation $R$ on a universe $X$ can be thought of as a relation from $X$ to $X$.

$R$ is an equivalence relation if it has the following three properties:
Reflexivity: $(x_i, x_i) \in R$, or $\mu_R(x_i, x_i) = 1$
Symmetry: $(x_i, x_j) \in R \rightarrow (x_j, x_i) \in R$, or $\mu_R(x_i, x_j) = \mu_R(x_j, x_i)$
Transitivity: $(x_i, x_j) \in R$ and $(x_j, x_k) \in R \rightarrow (x_i, x_k) \in R$, or if $\mu_R(x_i, x_j) = \lambda_1$ and $\mu_R(x_j, x_k) = \lambda_2$ then $\mu_R(x_i, x_k) = \lambda$ and $\lambda \ge \min(\lambda_1, \lambda_2)$


Fuzzy tolerance relation

$R$ is a tolerance relation if it has the following two properties:

Reflexivity: $\mu_R(x_i, x_i) = 1$

Symmetry: $\mu_R(x_i, x_j) = \mu_R(x_j, x_i)$



Composition of Fuzzy Relations

$$\mu_{R \circ S}(x, z) = \bigvee_{y \in Y} \left[\mu_R(x, y) \wedge \mu_S(y, z)\right]$$

Here $\vee$ is the max operator and $\wedge$ is the min (or product) operator.

The operation ° is similar to matrix multiplication.


Distance Relation

Let $X$ be a set of data in $\mathbb{R}^{l}$.
The distance function is a tolerance relation that can be transformed into an equivalence relation.
The relation $R$ can be defined by the Minkowski distance formula.

$$R(x_i, x_k) = 1 - \delta \left(\sum_{j=1}^{l} \lvert x_{ij} - x_{kj} \rvert^{q}\right)^{1/q}$$

$\delta$ is a constant that ensures that $R \in [0,1]$ and is equal to the inverse of the largest distance in $X$.


Example of Fuzzy classification

Let $X = \{(0,0), (1,1), (2,3), (3,1), (4,0)\}$ be a set of points in $\mathbb{R}^{2}$.

Set $q = 2$, Euclidean distances.
The largest distance is 4 ($x_1$, $x_5$), so $\delta = 0.25$.
The relation $R$ can be calculated by the equation

$$R(x_i, x_k) = 1 - 0.25 \left(\sum_{j=1}^{2} (x_{ij} - x_{kj})^{2}\right)^{1/2}$$


Points to be classified

[Figure: scatter plot of the five points, with a horizontal axis from 0 to 5 and a vertical axis from 0 to 3.5.]


Tolerance matrix

The matrix calculated by the equation is

$$R = \begin{bmatrix} 1 & .65 & .1 & .21 & 0 \\ .65 & 1 & .44 & .5 & .21 \\ .1 & .44 & 1 & .44 & .1 \\ .21 & .5 & .44 & 1 & .65 \\ 0 & .21 & .1 & .65 & 1 \end{bmatrix}$$

This is a tolerance relation that needs to be transformed into an equivalence relation.


Equivalence matrix

The transformed matrix is

$$R = \begin{bmatrix} 1 & .65 & .44 & .5 & .5 \\ .65 & 1 & .44 & .5 & .5 \\ .44 & .44 & 1 & .44 & .44 \\ .5 & .5 & .44 & 1 & .65 \\ .5 & .5 & .44 & .65 & 1 \end{bmatrix}$$


Results of clustering

Taking $\alpha$-cuts of the fuzzy equivalence relation at the values $\alpha$ = 0.44, 0.5, 0.65 and 1.0 we get the following classes:

$R_{.44} = [\{x_1, x_2, x_3, x_4, x_5\}]$
$R_{.5} = [\{x_1, x_2, x_4, x_5\}, \{x_3\}]$
$R_{.65} = [\{x_1, x_2\}, \{x_3\}, \{x_4, x_5\}]$
$R_{1.0} = [\{x_1\}, \{x_2\}, \{x_3\}, \{x_4\}, \{x_5\}]$
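The whole fuzzy example can be retraced in code: build the tolerance relation from the five points, compose it with itself (max-min) until it stops changing, and take cuts. This is a sketch under stated assumptions: relation values are rounded to two decimals, as in the matrices of the example, and the function names are hypothetical. The closure is obtained by repeated squaring, which reaches the same fixed point as successive compositions of a reflexive relation.

```python
import math

points = [(0, 0), (1, 1), (2, 3), (3, 1), (4, 0)]
delta = 0.25                      # inverse of the largest distance (4)

# Tolerance relation R = 1 - delta * distance, rounded to two decimals.
R = [[round(1 - delta * math.dist(p, q), 2) for q in points] for p in points]

def compose(a, b):
    """Max-min composition of two square fuzzy relations."""
    n = len(a)
    return [[max(min(a[i][j], b[j][k]) for j in range(n)) for k in range(n)]
            for i in range(n)]

def to_equivalence(r):
    """Square r repeatedly until it stops changing (n - 1 rounds suffice)."""
    for _ in range(len(r) - 1):
        nxt = compose(r, r)
        if nxt == r:
            break
        r = nxt
    return r

def alpha_cut_classes(r, alpha):
    """Group 1-based element indices related with degree >= alpha."""
    classes, seen = [], set()
    for i in range(len(r)):
        if i not in seen:
            cls = [j for j in range(len(r)) if r[i][j] >= alpha]
            seen.update(cls)
            classes.append([j + 1 for j in cls])
    return classes

E = to_equivalence(R)
for alpha in (0.44, 0.5, 0.65, 1.0):
    print(alpha, alpha_cut_classes(E, alpha))
```

Raising the cut level splits the data into progressively finer classes, which gives a hierarchy of clusterings from a single equivalence matrix.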
