Cluster Algorithms. Adriano Joaquim de O. Cruz, ©2006 UFRJ, [email protected]
K-means
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 3
K-means algorithm
Based on the Euclidean distances among the elements of the cluster.
The centre of the cluster is the mean value of the objects in the cluster.
Classifies objects in a hard way: each object belongs to a single cluster.
K-means algorithm
Consider n objects (X = {x_1, x_2, ..., x_n}) and k clusters.
Each object x_i is defined by l characteristics, x_i = (x_{i1}, x_{i2}, ..., x_{il}).
Consider A a set of k clusters (A = {A_1, A_2, ..., A_k}).
K-means propertiesK-means properties
The union of all clusters makes the The union of all clusters makes the UniverseUniverse
No element belongs to more than one No element belongs to more than one clustercluster
There is no empty clusterThere is no empty cluster
iXA
AA
XA
i
ji
k
ii
1
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 6
Membership function

\chi_{A_i}(x_e) = 1 \text{ if } x_e \in A_i, \quad 0 \text{ if } x_e \notin A_i
\sum_{i=1}^{k} \chi_{ie} = 1, \quad \forall e
0 < \sum_{e=1}^{n} \chi_{ie} < n, \quad \forall i
Membership matrix Membership matrix UU
Matrix containing the values of inclusion of Matrix containing the values of inclusion of each element into each cluster (0 or 1).each element into each cluster (0 or 1).
Matrix has Matrix has cc (clusters) lines and (clusters) lines and nn (elements) (elements) columns.columns.
The sum of all elements in the column must be The sum of all elements in the column must be equal to one (element belongs only to one equal to one (element belongs only to one clustercluster
The sum of each line must be less than The sum of each line must be less than nn e e grater than 0. No empty cluster, or cluster grater than 0. No empty cluster, or cluster containing all elements.containing all elements.
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 8
Matrix examples
Two examples of clustering of the elements X1 ... X6. What do the clusters represent?

U1 = | 1 0 1 0 1 0 |
     | 0 1 0 1 0 1 |

U2 = | 1 0 0 1 0 0 |
     | 0 1 0 0 1 0 |
     | 0 0 1 0 0 1 |
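The three constraints on a hard membership matrix can be checked mechanically. A minimal sketch in plain Python (the function name is ours, not from the slides):

```python
def is_hard_partition(U):
    """Check the constraints on a hard membership matrix U (c lines, n columns):
    entries are 0 or 1, every column sums to one (each element belongs to
    exactly one cluster), and every line sum is strictly between 0 and n."""
    n = len(U[0])
    if any(v not in (0, 1) for row in U for v in row):
        return False
    if any(sum(col) != 1 for col in zip(*U)):
        return False
    return all(0 < sum(row) < n for row in U)

U1 = [[1, 0, 1, 0, 1, 0],
      [0, 1, 0, 1, 0, 1]]
print(is_hard_partition(U1))  # True
```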
Matrix examples cont.

U1 = | 1 0 1 0 1 0 |
     | 0 1 0 1 0 1 |

U2 = | 0 1 0 1 0 1 |
     | 1 0 1 0 1 0 |

U1 and U2 represent the same clustering: only the order of the cluster lines differs.
How many clusters?
The cardinality of the set of all hard k-partitions of n elements is

\frac{1}{k!} \sum_{i=1}^{k} \binom{k}{i} (-1)^{k-i} \, i^n
How many clusters (example)?
Consider the matrix U2 (k = 3, n = 6):

\frac{1}{3!} \left[ \binom{3}{1} (-1)^2 1^6 + \binom{3}{2} (-1)^1 2^6 + \binom{3}{3} (-1)^0 3^6 \right] = \frac{3 - 192 + 729}{6} = 90
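The count above can be reproduced directly from the formula; a quick sketch (the function name is ours):

```python
from math import comb, factorial

def n_hard_partitions(k, n):
    """Number of distinct hard k-partitions of n elements:
    (1/k!) * sum_{i=1..k} C(k, i) * (-1)^(k-i) * i^n."""
    total = sum(comb(k, i) * (-1) ** (k - i) * i ** n for i in range(1, k + 1))
    return total // factorial(k)

print(n_hard_partitions(3, 6))  # 90
```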
K-means inputs and outputs
Inputs: the number of clusters c and a database containing n objects with l characteristics each.
Output: a set of c clusters that minimises the square-error criterion.
Number of Clusters
[Chart: log of the number of hard partitions versus the number of clusters (1 to 16), plotted for n = 5, 10, 15 and 20.]
K-means algorithm v1K-means algorithm v1
Arbitrarily assigns each object Arbitrarily assigns each object to a cluster (matrix to a cluster (matrix UU).).
RepeatRepeat Update the cluster centres;Update the cluster centres;
Reassign objects to the clusters Reassign objects to the clusters to which the objects are most to which the objects are most similar;similar;
UntilUntil no change; no change;
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 15
K-means algorithm v2K-means algorithm v2
Arbitrarily choose Arbitrarily choose cc objects as objects as the initial cluster centres.the initial cluster centres.
RepeatRepeat Reassign objects to the clusters Reassign objects to the clusters to which the objects are most to which the objects are most similar.similar.
Update the cluster centres.Update the cluster centres.
Until no changeUntil no change
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 16
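Version 2 can be sketched as follows. This is our own illustration, not the author's code; it uses squared Euclidean distances and keeps an old centre if a cluster empties out:

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """K-means v2: pick k objects as initial centres, then alternate
    assignment and centre update until the assignments stop changing."""
    rnd = random.Random(seed)
    centres = [list(p) for p in rnd.sample(points, k)]
    assign = [None] * len(points)
    for _ in range(max_iter):
        # assign each point to the nearest centre (squared Euclidean distance)
        new_assign = [min(range(k),
                          key=lambda i: sum((a - b) ** 2
                                            for a, b in zip(p, centres[i])))
                      for p in points]
        if new_assign == assign:
            break
        assign = new_assign
        # move each centre to the mean of its members
        for i in range(k):
            members = [p for p, a in zip(points, assign) if a == i]
            if members:
                centres[i] = [sum(c) / len(members) for c in zip(*members)]
    return centres, assign

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centres, labels = kmeans(pts, 2)
```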
Algorithm details
The algorithm tries to minimise the function

J(U, v) = \sum_{e=1}^{n} \sum_{i=1}^{c} (d_{ie})^2

d_{ie} is the distance between the element x_e (l characteristics) and the centre of the cluster i (v_i):

d_{ie} = d(x_e, v_i) = \| x_e - v_i \| = \left[ \sum_{j=1}^{l} (x_{ej} - v_{ij})^2 \right]^{1/2}
Cluster Centre
The centre of the cluster i (v_i) is a vector of l characteristics.
The j-th co-ordinate is calculated as

v_{ij} = \frac{ \sum_{e=1}^{n} \chi_{ie} \, x_{ej} }{ \sum_{e=1}^{n} \chi_{ie} }
Detailed Algorithm
Choose c (number of clusters). Set the error (\epsilon > 0) and the step (r = 0). Arbitrarily set the matrix U^{(r)}. Do not forget: each element belongs to a single cluster, no cluster is empty and no cluster has all the elements.
Detailed Algorithm cont.Detailed Algorithm cont.
RepeatRepeat Calculate the centre of the clusters Calculate the centre of the clusters vvii
(r)(r)
Calculate the distance Calculate the distance ddii(r)(r) of each of each
point to the centre of the clusterspoint to the centre of the clustersGenerate Generate UU(r+1)(r+1) recalculating all recalculating all characteristic functions using the characteristic functions using the equationsequations
UntilUntil ||||UU(r+1)(r+1)--UU(r)(r)|| < || <
0
)min(1 )()(
)1(kjdd r
je
r
ier
ie
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 20
Matrix norms
Consider a matrix U of n lines and n columns.
Column norm:

\| A \|_1 = \max_{1 \le j \le n} \sum_{i=1}^{n} | a_{ij} |

Line norm:

\| A \|_\infty = \max_{1 \le i \le n} \sum_{j=1}^{n} | a_{ij} |
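Both norms are one-liners over the matrix entries; a small sketch (the function names are ours):

```python
def column_norm(A):
    """||A||_1: maximum absolute column sum."""
    return max(sum(abs(A[i][j]) for i in range(len(A)))
               for j in range(len(A[0])))

def line_norm(A):
    """||A||_inf: maximum absolute line (row) sum."""
    return max(sum(abs(v) for v in row) for row in A)

A = [[1, -2], [3, 4]]
print(column_norm(A), line_norm(A))  # 6 7
```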
K-means problems?K-means problems?
Suitable when clusters are compact clouds Suitable when clusters are compact clouds well separated.well separated.
Scalable because computational complexity Scalable because computational complexity is is O(nkr)O(nkr)..
Necessity of choosing Necessity of choosing cc is disadvantage. is disadvantage. Not suitable for nonconvex shapes.Not suitable for nonconvex shapes. It is sensitive to noise and outliers because It is sensitive to noise and outliers because
they influence the means.they influence the means. Depends on the initial allocation.Depends on the initial allocation.
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 22
Examples of results
[Scatter plot: example of a k-means result on a small two-dimensional data set.]
K-means: Actual Data
[Figure: the actual data set.]
K-means: Results
[Figure: the clusters found by k-means.]
K-medoids
K-medoids methods
K-means is sensitive to outliers, since an object with an extremely large value may distort the distribution of the data.
Instead of taking the mean value, the most centrally located object (the medoid) is used as the reference point.
The algorithm minimises the sum of the dissimilarities between each object and its medoid (similar to k-means).
K-medoids strategiesK-medoids strategies
Find k-medoids arbitrarily.Find k-medoids arbitrarily. Each remaining object is clustered with the Each remaining object is clustered with the
medoid to which is the most similar.medoid to which is the most similar. Then iteratively replaces one of the medoids Then iteratively replaces one of the medoids
by a non-medoid as long as the quality of the by a non-medoid as long as the quality of the clustering is improved.clustering is improved.
The quality is measured using a cost function The quality is measured using a cost function that measures the average dissimilarity that measures the average dissimilarity between the objects and the medoid of its between the objects and the medoid of its cluster.cluster.
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 28
Reassignment costsReassignment costs
Each time a reassignment occurs a difference Each time a reassignment occurs a difference in square-error in square-error JJ is contributed. is contributed.
The cost function The cost function JJ calculates the total cost of calculates the total cost of replacing a current medoid by a non-medoid.replacing a current medoid by a non-medoid.
If the total cost is negative then If the total cost is negative then mmjj is replaced is replaced
by by mmrandomrandom, otherwise the replacement is not , otherwise the replacement is not
accepted. accepted.
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 29
Replacing medoids, case 1
Object p belongs to medoid m_j. If m_j is replaced by m_random and p is closest to one of the m_i (i ≠ j), then p is reassigned to m_i.
[Figure: p moves from m_j to m_i.]
Replacing medoids, case 2
Object p belongs to medoid m_j. If m_j is replaced by m_random and p is closest to m_random, then p is reassigned to m_random.
[Figure: p moves from m_j to m_random.]
Replacing medoids, case 3
Object p belongs to medoid m_i (i ≠ j). If m_j is replaced by m_random and p is still closest to m_i, then nothing changes.
[Figure: p stays with m_i.]
Replacing medoids, case 4
Object p belongs to medoid m_i (i ≠ j). If m_j is replaced by m_random and p is closest to m_random, then p is reassigned to m_random.
[Figure: p moves from m_i to m_random.]
K-medoid algorithmK-medoid algorithm
Arbitrarily choose Arbitrarily choose kk objects as the objects as the initial medoids.initial medoids.
RepeatRepeatAssign each remaining object to the Assign each remaining object to the cluster with the nearest medoid;cluster with the nearest medoid;
Randomly select a nonmedoid object, Randomly select a nonmedoid object, mmrandomrandom;;
Compute the total cost Compute the total cost JJ of swapping of swapping mmjj with with mmrandomrandom;;
If If J<0J<0 then swap then swap oojj with with oorandomrandom;;
Until no changeUntil no change
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 34
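The loop above can be sketched in a PAM style; this is our own illustration, not the author's code. A random medoid/non-medoid swap is accepted only when it lowers the total cost J:

```python
import random

def kmedoids(points, k, dist, max_iter=100, seed=0):
    """K-medoids sketch: start from k arbitrary medoids (as indices into
    points) and accept random swaps whenever the total cost J decreases."""
    rnd = random.Random(seed)
    medoids = rnd.sample(range(len(points)), k)

    def cost(meds):
        # total dissimilarity of every object to its nearest medoid
        return sum(min(dist(p, points[m]) for m in meds) for p in points)

    J = cost(medoids)
    for _ in range(max_iter):
        m_j = rnd.choice(medoids)
        m_random = rnd.choice([i for i in range(len(points))
                               if i not in medoids])
        candidate = [m_random if m == m_j else m for m in medoids]
        J_new = cost(candidate)
        if J_new - J < 0:          # negative total cost: accept the swap
            medoids, J = candidate, J_new
    labels = [min(medoids, key=lambda m: dist(p, points[m])) for p in points]
    return medoids, labels

d = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (100, 100)]
meds, labels = kmedoids(pts, 2, d)
```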
Comparisons?
K-medoids is more robust than k-means in the presence of noise and outliers.
K-means is less costly in terms of processing time.
Fuzzy C-means
Fuzzy C-means
Fuzzy version of K-means.
Elements may belong to more than one cluster.
The values of the characteristic function range from 0 to 1.
The value is interpreted as the degree of membership of an element to a cluster, relative to all the other clusters.
Fuzzy C-means setupFuzzy C-means setup
Consider Consider nn ( (XX={={xx11, x, x22, ..., x, ..., x
nn})}) objects and objects and cc clusters. clusters.
Each object Each object xxii is defined by is defined by ll
characteristics characteristics xxii=(=(xxi1i1, x, xi2i2, ..., x, ..., x
ilil).).
Consider Consider AA a set of a set of kk clusters clusters ((AA={={AA11, A, A
22, ..., A, ..., Ak k }).}).
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 38
Fuzzy C-means properties
The union of all clusters makes the Universe.
There is no empty cluster.
Clusters may overlap: A_i \cap A_j \ne \emptyset is allowed for i \ne j.

\bigcup_{i=1}^{c} A_i = X
\emptyset \ne A_i \ne X, \quad 1 \le i \le c
Membership function

\mu_{A_i}(x_e) \in [0, 1]
\sum_{i=1}^{c} \mu_{ie} = 1, \quad \forall e
0 < \sum_{e=1}^{n} \mu_{ie} < n, \quad \forall i
Problems of probabilistic clustersProblems of probabilistic clusters
Points representing circle lines (C1 e C2)Points representing circle lines (C1 e C2) Due to normalization strange results may Due to normalization strange results may
emergeemerge
C1
C2{c1=0.5,c2= 0.5}
C1 C2
{c1=0.5,c2= 0.5}
{c1=0.7,c2= 0.3} {c1=0.3,c2= 0.7}
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 41
Membership matrix U
Matrix containing the values of inclusion of each element into each cluster, in [0,1].
The matrix has c (clusters) lines and n (elements) columns.
The sum of the elements in each column must be equal to one.
The sum of each line must be less than n and greater than 0: no empty cluster and no cluster containing all the elements.
Matrix examples
Two examples of clustering of the elements X1 ... X6. What do the clusters represent?

U1 = | 0.5 0   1   0.1 0.7 0.8 |
     | 0.5 1   0   0.9 0.3 0.2 |

U2 = | 0   0.2 0   0   0.5 1   |
     | 0   0.5 1   0   0.2 0   |
     | 1   0.5 0   1   0.5 0   |
Fuzzy C-means algorithm v1Fuzzy C-means algorithm v1
Arbitrarily assigns each object Arbitrarily assigns each object to a cluster (matrix to a cluster (matrix UU).).
RepeatRepeat Update the cluster centres;Update the cluster centres;
Reassign objects to the clusters Reassign objects to the clusters to which the objects are most to which the objects are most similar;similar;
UntilUntil no change; no change;
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 44
Fuzzy C-means algorithm v2Fuzzy C-means algorithm v2
Arbitrarily choose Arbitrarily choose cc objects as objects as the initial cluster centres.the initial cluster centres.
RepeatRepeat Reassign objects to the clusters Reassign objects to the clusters to which the objects are most to which the objects are most similar.similar.
Update the cluster centres.Update the cluster centres.
Until no changeUntil no change
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 45
Algorithm detailsAlgorithm details
The algorithm tries to minimise the The algorithm tries to minimise the function, function, mm is the nebulisation factor. is the nebulisation factor.
ddieie is the distance between the element is the distance between the element xxee ((ll characteristics) and the centre of the characteristics) and the centre of the cluster cluster ii ( (vvii))
n
e
c
iie
mie dJ
1 1
2)()(),( vU
21
1
2)(
)(
m
jijejie
ieieie
vxd
dd vxvx
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 46
Nebulisation factorNebulisation factor
mm is the nebulisation factor. is the nebulisation factor. This value has a range [1,This value has a range [1,)) If If mm=1 the the system is crisp.=1 the the system is crisp. If If mm the all the membership values the all the membership values
tend to 1/tend to 1/cc The most common values are 1.25 and The most common values are 1.25 and
2.02.0
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 47
Cluster CentreCluster Centre
The centre of the cluster The centre of the cluster ii ( (vvii) is a ) is a ll characteristics vector.characteristics vector.
The jth co-ordinate is calculated asThe jth co-ordinate is calculated as
n
e
mie
n
eej
mie
ij
xv
1
1
)(
)(
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 48
Detailed AlgorithmDetailed Algorithm
Choose Choose cc (number of clusters). (number of clusters). Set error (Set error ( > 0), nebulisation > 0), nebulisation factor (factor (mm) and step () and step (rr=0).=0).
Arbitrarily set matrix Arbitrarily set matrix UU(r)(r). Do not . Do not forget, each element belongs to a forget, each element belongs to a single cluster, no empty cluster single cluster, no empty cluster and no cluster has all elements.and no cluster has all elements.
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 49
Detailed Algorithm cont.Detailed Algorithm cont.
RepeatRepeat Calculate the centre of the Calculate the centre of the clusters clusters vvii
(r)(r)
Calculate the distance Calculate the distance ddii(r)(r) of of
each point to the centre of the each point to the centre of the clustersclusters
Generate Generate UU(r+1)(r+1) recalculating all recalculating all characteristic functions(characteristic functions(How?How?))
UntilUntil ||||UU(r+1)(r+1)--UU(r)(r)|| < || <
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 50
How to recalculate?How to recalculate?
0else
1then0ifelse
then
]..1[,0if1
1
1
2
ie
ieie
c
l
m
le
ieie
ik
d
d
d
cid
If there is any If there is any distance greater than distance greater than zero then membership zero then membership grade is the weighted grade is the weighted average of the average of the distances to all centers.distances to all centers.
else the element else the element belongs to this cluster belongs to this cluster and no one else.and no one else.
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 51
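Putting the centre update and the membership update above together gives a compact fuzzy c-means sketch. This is our own illustration in plain Python, not the author's code:

```python
import random

def fcm(points, c, m=2.0, eps=1e-4, max_iter=100, seed=0):
    """Fuzzy c-means: random membership matrix U, then alternate centre
    and membership updates until ||U(r+1) - U(r)|| < eps (max norm)."""
    rnd = random.Random(seed)
    n, dims = len(points), len(points[0])
    U = [[rnd.random() for _ in range(n)] for _ in range(c)]
    for e in range(n):                        # normalise each column to 1
        s = sum(U[i][e] for i in range(c))
        for i in range(c):
            U[i][e] /= s
    for _ in range(max_iter):
        centres = []
        for i in range(c):                    # weighted means, weights u^m
            w = [U[i][e] ** m for e in range(n)]
            centres.append([sum(w[e] * points[e][j] for e in range(n)) / sum(w)
                            for j in range(dims)])
        newU = [[0.0] * n for _ in range(c)]
        for e in range(n):
            d2 = [sum((a - b) ** 2 for a, b in zip(points[e], centres[i]))
                  for i in range(c)]
            if min(d2) == 0:                  # point sits exactly on a centre
                for i in range(c):
                    newU[i][e] = 1.0 if d2[i] == 0 else 0.0
            else:                             # (d_ie/d_le)^(2/(m-1)) update
                for i in range(c):
                    newU[i][e] = 1.0 / sum((d2[i] / d2[l]) ** (1.0 / (m - 1))
                                           for l in range(c))
        diff = max(abs(newU[i][e] - U[i][e]) for i in range(c) for e in range(n))
        U = newU
        if diff < eps:
            break
    return U, centres

pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
U, centres = fcm(pts, 2)
```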
Example of clustering result

Point  x    y    C0      C1
0      1    1    0.9993  0.0007
1      2    2    0.9999  0.0001
2      1    2    0.9996  0.0004
3      2    1    0.9996  0.0004
4      5    5    0.0000  1.0000
5      5    6    0.0005  0.9995
6      6    6    0.0010  0.9990
7      6    5    0.0005  0.9995
8      3.5  3.5  0.4559  0.5441

[Scatter plot of the nine points.]
Example of clustering result
[Figure: clustering result.]
Possibilistic Clustering
Membership function
The membership degree is the representativity, or typicality, of the datum x for the cluster i.

\mu_{A_i}(x_e) \in [0, 1]
\sum_{i=1}^{c} \mu_{ie} > 0, \quad \forall e

The column sums are no longer required to be equal to one.
Algorithm detailsAlgorithm details
The algorithm tries to minimise the The algorithm tries to minimise the function, function, mm is the nebulisation factor. is the nebulisation factor.
The first sum is the usual and the second The first sum is the usual and the second rewards high memberships.rewards high memberships.
c
i
n
e
miei
n
e
c
iie
mie dJ
1 11 1
2 )1()()(),( vU
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 57
How to calculate?How to calculate?
1
12
1
1
m
k
ie
ik
d
n
e
mie
n
eie
mie
k
d
1
1
)(
)(
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 58
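The two formulas above translate directly into code; a short sketch (function names ours, with K = 1 by default):

```python
def possibilistic_memberships(d2, eta, m=2.0):
    """Possibilistic update for one cluster i: typicality of each point,
    given the squared distances d2[e] and the cluster scale eta_i."""
    return [1.0 / (1.0 + (d / eta) ** (1.0 / (m - 1))) for d in d2]

def eta_from_fcm(u, d2, m=2.0, K=1.0):
    """eta_i = K * sum_e u_ie^m d_ie^2 / sum_e u_ie^m, from an FCM run."""
    num = sum(ui ** m * d for ui, d in zip(u, d2))
    return K * num / sum(ui ** m for ui in u)

# a point whose squared distance equals eta gets typicality exactly 0.5
print(possibilistic_memberships([4.0], eta=4.0, m=2.0))  # [0.5]
```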
Detailed AlgorithmDetailed Algorithm
Choose Choose cc (number of clusters) (number of clusters) and and mm..
Set error (Set error ( > 0) and step > 0) and step ((rr=0).=0).
Execute FCMExecute FCM
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 59
Detailed Algorithm cont.Detailed Algorithm cont.
ForFor 2 2 timestimesInitialize Initialize UU(0) (0) and the centre of the and the centre of the clusters clusters vvii
(0)(0) with previous results with previous results
InitializeInitialize kk and and r=0r=0
RepeatRepeat Calculate the distance Calculate the distance ddii
(r)(r) of each point of each point to the centre of the clustersto the centre of the clustersGenerate Generate UU(r+1)(r+1) recalculating all recalculating all characteristic functions using the characteristic functions using the equationsequations
UntilUntil ||||UU(r+1)(r+1)--UU(r)(r)|| < || <
End FOREnd FOR
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 60
Gustafson-Kessel Algorithm
Gustafson-Kessel methodGustafson-Kessel method
This method (GK) is fuzzy clustering This method (GK) is fuzzy clustering method similar to the Fuzzy C-means method similar to the Fuzzy C-means (FCM).(FCM).
The difference is the way the distance is The difference is the way the distance is calculated.calculated.
FCM uses Euclidean distancesFCM uses Euclidean distances GK uses Mahalanobis distancesGK uses Mahalanobis distances
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 62
Gustafson-Kessel methodGustafson-Kessel method
Mahalobis distance is calculated asMahalobis distance is calculated as
The matrices The matrices AAii are given by are given by)()(2
ikiT
ikikd vxAvx
1)det( ip
ii SSA
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 63
Gustafson-Kessel methodGustafson-Kessel method
The Fuzzy Covariance Matrix is The Fuzzy Covariance Matrix is
n
j
mij
n
j
Tjjij
mij
i
1
1))((
vxvxS
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 64
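The GK distance is easy to check in the two-dimensional case, where S can be inverted by hand. A sketch in plain Python (the function name is ours; with S equal to the identity, the GK distance reduces to the squared Euclidean one):

```python
def gk_distance2(x, v, S):
    """Squared GK distance in 2-D:
    d2 = (x - v)^T A (x - v),  A = det(S)^(1/p) * S^-1,  p = 2."""
    (a, b), (c, d) = S
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]    # 2x2 inverse of S
    A = [[det ** 0.5 * inv[r][s] for s in range(2)] for r in range(2)]
    dx = [x[0] - v[0], x[1] - v[1]]
    return sum(dx[r] * A[r][s] * dx[s] for r in range(2) for s in range(2))

# identity covariance: det = 1, so d2 is the plain squared Euclidean distance
print(gk_distance2((3.0, 4.0), (0.0, 0.0), [[1.0, 0.0], [0.0, 1.0]]))  # 25.0
```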
GK commentsGK comments
The clusters are hyperellipsoids on the The clusters are hyperellipsoids on the ll..
The hyperellipsoids have aproximately The hyperellipsoids have aproximately the same size.the same size.
In order to be possible to calculate In order to be possible to calculate SS-1-1 the number of samples the number of samples nn must be at must be at least equal to the number of dimensions least equal to the number of dimensions ll plus 1. plus 1.
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 65
Results of GK
[Figure: clusters found by Gustafson-Kessel.]
Gath-Geva methodGath-Geva method
It is also known as Gaussian Mixture It is also known as Gaussian Mixture Decomposition.Decomposition.
It is similar to the FCM methodIt is similar to the FCM method The Gauss distance is used instead of The Gauss distance is used instead of
Euclidean distance.Euclidean distance. The clusters do not have a definite The clusters do not have a definite
shape anymore and have various sizes.shape anymore and have various sizes.
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 67
Gath-Geva MethodGath-Geva Method
Gauss distance is given byGauss distance is given by
AAii=S=Sii-1-1
)()(
21
2 )det( ikiT
ik
i
iik e
Pd
vxAvxS
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 68
Gath-Geva MethodGath-Geva Method
The term The term PPii is the probability of a is the probability of a sample belong to a cluster.sample belong to a cluster.
n
j
c
k
mij
n
j
mij
iP
1 1
1
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 69
Gath-Geva CommentsGath-Geva Comments
Pi is a parameter that influences the Pi is a parameter that influences the size of a cluster.size of a cluster.
Bigger clusters attract more elements.Bigger clusters attract more elements. The exponential term makes more The exponential term makes more
difficult to avoid local minima.difficult to avoid local minima. Usually another clustering method is Usually another clustering method is
used to initialise the partition matrix used to initialise the partition matrix UU..
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 70
GG Results – Random Centers
[Figure: Gath-Geva results with random initial centres.]
GG Results – Centers FCM
[Figure: Gath-Geva results initialised with FCM centres.]
Clustering based on Equivalence Relations
A crisp relation R on a universe X can be thought of as a relation from X to X.
R is an equivalence relation if it has the following three properties:
Reflexivity: (x_i, x_i) \in R
Symmetry: (x_i, x_j) \in R \Rightarrow (x_j, x_i) \in R
Transitivity: (x_i, x_j) \in R and (x_j, x_k) \in R \Rightarrow (x_i, x_k) \in R
Crisp tolerance relationCrisp tolerance relation
R is a R is a tolerance relationtolerance relation if it has the if it has the following two properties:following two properties:
Reflexivity (Reflexivity (xxii, , xxii) ) RR
Symmetry (Symmetry (xxii, x, xjj) ) RR ( (xxjj, x, xii) ) RR
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 74
Composition of RelationsComposition of Relations
X Y Z
R S
T=R°S
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 75
Composition of Crisp RelationsComposition of Crisp Relations
YX
SRy
yxyxSR )],(),([
productou
min
max
The operation ° is similar to matrix The operation ° is similar to matrix multiplication.multiplication.
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 76
Transforming RelationsTransforming Relations
A tolerance relation can be transformed A tolerance relation can be transformed into a equivalence relation by at most into a equivalence relation by at most ((nn-1) compositions with itself.-1) compositions with itself.
nn is the cardinality of the set is the cardinality of the set RR..
1111
1 RRRRn
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 77
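The repeated composition above can be sketched directly: compose the relation with itself (max-min) until it stops changing, which takes at most n-1 steps. Function names are ours:

```python
def maxmin_compose(R, S):
    """Max-min composition: (R o S)[i][j] = max_y min(R[i][y], S[y][j])."""
    n = len(R)
    return [[max(min(R[i][y], S[y][j]) for y in range(n)) for j in range(n)]
            for i in range(n)]

def transitive_closure(R):
    """Compose a tolerance relation with itself until it stops changing,
    yielding a (max-min) transitive fuzzy equivalence relation."""
    while True:
        R2 = maxmin_compose(R, R)
        if R2 == R:
            return R
        R = R2

R = [[1.0, 0.8, 0.0],
     [0.8, 1.0, 0.4],
     [0.0, 0.4, 1.0]]
print(transitive_closure(R))  # [[1.0, 0.8, 0.4], [0.8, 1.0, 0.4], [0.4, 0.4, 1.0]]
```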
Example of crisp classificationExample of crisp classification
Let Let XX={1,2,3,4,5,6,7,8,9,10}={1,2,3,4,5,6,7,8,9,10} Let Let RR be defined as the relation “for the be defined as the relation “for the
identical remainder after dividing each identical remainder after dividing each element by 3”.element by 3”.
This relation is an equivalence relationThis relation is an equivalence relation
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 78
Relation MatrixRelation Matrix
100100100110
01001001009
00100100108
10010010017
01001001006
00100100105
10010010014
01001001003
00100100102
10010010011
10987654321
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 79
Crisp ClassificationCrisp Classification
Consider equivalent columns.Consider equivalent columns. It is possible to group the elements in It is possible to group the elements in
the following classesthe following classes RR00 = {3, 6, 9} = {3, 6, 9}
RR11 = {1, 4, 7, 10} = {1, 4, 7, 10}
RR22 = {2, 5, 8} = {2, 5, 8}
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 80
Clustering and Fuzzy Equivalence RelationsClustering and Fuzzy Equivalence Relations
A relation fuzzy A relation fuzzy RR on a universe on a universe XX can be thought can be thought as a relation from as a relation from XX to to XX
R is an R is an equivalence relationequivalence relation if it has the following if it has the following three properties:three properties: Reflexivity:Reflexivity: ( (xxii, , xxii) ) R R oror ((xxii, , xxii)) = 1= 1 Symmetry: Symmetry: ((xxii, x, xjj) ) RR ( (xxjj, x, xii) ) R R oror
((xxii, , xxjj) =) = ((xxjj, , xxii)) Transitivity:Transitivity: ( (xxii, x, xjj) and () and (xxjj, x, xkk) ) RR ( (xxii, x, xkk) ) R R
or ifor if ((xxii, , xxjj) =) = 1 1 andand ((xxjj, , xxkk) = ) = 2 2 then then ((xxii, , xxkk) = ) = andand >=min>=min((11, , 22))
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 81
Fuzzy tolerance relationFuzzy tolerance relation
R is a R is a tolerance relationtolerance relation if it has the if it has the following two properties:following two properties:
Reflexivity (Reflexivity (xxii, , xxii) ) RR
Symmetry (Symmetry (xxii, x, xjj) ) RR ( (xxjj, x, xii) ) RR
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 82
Composition of Fuzzy RelationsComposition of Fuzzy Relations
YX
SRy
yxyxSR )],(),([
productou
min
max
The operation ° is similar to matrix The operation ° is similar to matrix multiplication.multiplication.
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 83
Distance RelationDistance Relation
Let Let XX be a set of data on be a set of data on ll.. The distance function is a tolerance relation The distance function is a tolerance relation
that can be transformed into a equivalence.that can be transformed into a equivalence. The relation The relation RR can be defined by the can be defined by the
Minkowski distance formula.Minkowski distance formula.
is a constant that ensures that is a constant that ensures that RR[0,1] and [0,1] and is equal to the inverse of the largest distance is equal to the inverse of the largest distance in in XX. .
l
j
kjijki xxR1
1)(1),( xx
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 84
Example of Fuzzy classificationExample of Fuzzy classification
Let Let XX={(0,0),(1,1),(2,3),(3,1),(4,0)} be a set ={(0,0),(1,1),(2,3),(3,1),(4,0)} be a set of points in of points in 22..
Set Set qq=2, Euclidean distances.=2, Euclidean distances. The largest distance is 4 (The largest distance is 4 (xx11,,xx55), so ), so =0.25.=0.25. The relation The relation RR can be calculated by the can be calculated by the
equation equation
l
jkjijki xxR
1
212)(25.01),( xx
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 85
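The relation matrix of the example can be built directly from the formula. A sketch (the function name is ours; for the five points above it yields \delta = 0.25 and, e.g., R(x_1, x_2) = 1 - 0.25·√2 ≈ 0.65):

```python
def distance_relation(points, q=2):
    """R(x_i, x_k) = 1 - delta * Minkowski_q distance, where delta is the
    inverse of the largest distance in X, so that R stays in [0, 1]."""
    def dist(a, b):
        return sum(abs(u - v) ** q for u, v in zip(a, b)) ** (1.0 / q)
    delta = 1.0 / max(dist(a, b) for a in points for b in points)
    return [[1.0 - delta * dist(a, b) for b in points] for a in points]

X = [(0, 0), (1, 1), (2, 3), (3, 1), (4, 0)]
R = distance_relation(X)
```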
Points to be classifiedPoints to be classified
0
0,5
1
1,52
2,5
3
3,5
0 1 2 3 4 5
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 86
Tolerance matrixTolerance matrix
The matrix calculated by the equation isThe matrix calculated by the equation is
The is a tolerance relation that needs to be The is a tolerance relation that needs to be transformed into a equivalence relationtransformed into a equivalence relation
165.1.21.0
65.144.5.21.
1.44.144.1.
21.5.44.165.
021.165.1
R
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 87
Equivalence matrixEquivalence matrix
The matrix transformed isThe matrix transformed is
165.44.5.5.
65.144.5.5.
44.44.144.44.
5.5.44.165.
5.5.44.65.1
R
*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 88
Results of clusteringResults of clustering
Taking Taking -cuts of fuzzy equivalent -cuts of fuzzy equivalent relation at various values of relation at various values of =0.44, 0.5, =0.44, 0.5, 0.65 and 1.0 we get the following 0.65 and 1.0 we get the following classes:classes:
RR.44.44=[{=[{xx11,,xx22,,xx33,,xx44,,xx55}]}] RR.55.55=[{=[{xx11,,xx22,,xx44,,xx55}{}{xx33}]}] RR.65.65=[{=[{xx11,,xx22},{},{xx33},{},{xx44,,xx55}]}] RR1.01.0=[{=[{xx11},{},{xx22},{},{xx33},{},{xx44},{},{xx55}]}]
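Extracting the classes from an \alpha-cut is mechanical: two elements share a class exactly when their relation value is at least \alpha. A sketch over the equivalence matrix above (function name ours; indices 0..4 stand for x_1..x_5):

```python
def alpha_cut_classes(R, alpha):
    """Partition induced by the alpha-cut of a fuzzy equivalence relation:
    i and j share a class iff R[i][j] >= alpha."""
    n = len(R)
    classes, seen = [], set()
    for i in range(n):
        if i not in seen:
            cls = {j for j in range(n) if R[i][j] >= alpha}
            classes.append(sorted(cls))
            seen |= cls
    return classes

# equivalence matrix from the slide (rows/columns x1..x5)
R = [[1.0, 0.65, 0.44, 0.5, 0.5],
     [0.65, 1.0, 0.44, 0.5, 0.5],
     [0.44, 0.44, 1.0, 0.44, 0.44],
     [0.5, 0.5, 0.44, 1.0, 0.65],
     [0.5, 0.5, 0.44, 0.65, 1.0]]
print(alpha_cut_classes(R, 0.65))  # [[0, 1], [2], [3, 4]]
```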