Clustering Aggregation
Clustering Aggregation. Nir Geffen 021537980, Yotam Margolin 039719729. Supervisor: Professor Zeev Volkovitch. ORT BRAUDE COLLEGE – SE DEPT. 9.12.2011.
![Page 1: Clustering Aggregation](https://reader036.vdocument.in/reader036/viewer/2022081801/5681554b550346895dc31955/html5/thumbnails/1.jpg)
Nir Geffen 021537980
Yotam Margolin 039719729
Supervisor Professor Zeev Volkovitch
Clustering Aggregation
ORT BRAUDE COLLEGE – SE DEPT.
9.12.2011
![Page 2: Clustering Aggregation](https://reader036.vdocument.in/reader036/viewer/2022081801/5681554b550346895dc31955/html5/thumbnails/2.jpg)
Introduction
◦ Goals
◦ Clustering
◦ Spectral Clustering
◦ Cluster Ensembles
◦ Consensus
Spectral Clustering Ensembles
◦ Abstract
◦ Steps
◦ Pseudo
Clustering Aggregation via Self Learning Approach - CASLA
◦ Abstract
◦ Steps
◦ Pseudo
SE Documents
Table of Contents
![Page 3: Clustering Aggregation](https://reader036.vdocument.in/reader036/viewer/2022081801/5681554b550346895dc31955/html5/thumbnails/3.jpg)
Our goal is to investigate the results of different clustering ensemble techniques and to show how the various cluster ensemble methods differ from clustering aggregation via self learning.
Introduction – Goals
![Page 4: Clustering Aggregation](https://reader036.vdocument.in/reader036/viewer/2022081801/5681554b550346895dc31955/html5/thumbnails/4.jpg)
Clustering is a method of unsupervised learning, aimed at partitioning a given data set into subsets named clusters, so that items belonging to the same cluster are similar to each other while items belonging to different clusters are not similar.
Introduction – Clustering
![Page 5: Clustering Aggregation](https://reader036.vdocument.in/reader036/viewer/2022081801/5681554b550346895dc31955/html5/thumbnails/5.jpg)
While classic clustering methods give solid results, they also need elaborate similarity functions and pre-configuration.
To make things easier, spectral clustering approaches the clustering problem from a different angle. Instead of clustering the data as-is, we project it onto a space to which most noise will be perpendicular (orthogonal).
Finally, we cluster the projected data using a classic algorithm to obtain the final partition.
Introduction – Spectral Clustering
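The project-then-cluster idea can be sketched in a few lines. This is a minimal illustration, not the method used in the project: it assumes a Gaussian affinity with a hypothetical width `sigma`, the normalized affinity matrix, and a small hand-written k-means as the "classic algorithm". All function names and the toy data are made up for the example.

```python
import numpy as np


def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point seeding."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen center.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def spectral_clustering(X, k, sigma=1.0):
    """Embed the rows of X with the top-k eigenvectors, then run k-means."""
    # Gaussian affinity between all pairs of points (assumed similarity).
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Normalized affinity D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    M = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order, so take the last k columns.
    _, vecs = np.linalg.eigh(M)
    U = vecs[:, -k:]
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    return kmeans(U, k)


# Toy data: two well-separated groups of five points each.
blob = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.5, 0.5]])
X_demo = np.vstack([blob, blob + 10.0])
demo_labels = spectral_clustering(X_demo, k=2)
print(demo_labels)
```

The choice of `sigma` matters in practice; the spectral ensemble approach later in the deck is motivated by exactly this sensitivity to the kernel parameter.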
![Page 6: Clustering Aggregation](https://reader036.vdocument.in/reader036/viewer/2022081801/5681554b550346895dc31955/html5/thumbnails/6.jpg)
As no clustering algorithm is agreed to be superior for every data set, a common practice is to obtain several cluster partitions of the same data set.
The next step is to use a consensus function to combine the resulting partitions into a new one, thereby increasing the robustness of the clustering process.
Introduction – Cluster Ensembles
![Page 7: Clustering Aggregation](https://reader036.vdocument.in/reader036/viewer/2022081801/5681554b550346895dc31955/html5/thumbnails/7.jpg)
There are three main algorithms for joining partitions (or clusterings). Because of the long computing time, we will use only greedy algorithms.
These algorithms, also known as consensus functions, mostly rely on graph theory.
CSPA: considered the brute-force approach, with $O(n^2)$ time and space complexity in the number of data items $n$.
HGPA: stable, but not always optimal.
MCLA: a high-end solution that yields solid results and is a worthy competitor to HGPA.
Introduction - Consensus
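To make the graph view concrete, here is a minimal sketch of the pairwise similarity that CSPA starts from: the fraction of partitions in which two items share a cluster. CSPA then partitions the induced similarity graph with a graph partitioner; the thresholded connected-components step below is only a simple stand-in for that partitioner, and the function names and input partitions are hypothetical.

```python
def co_association(partitions):
    """S[a][b] = fraction of partitions that place items a and b together."""
    r = len(partitions)
    n = len(partitions[0])
    return [
        [sum(p[a] == p[b] for p in partitions) / r for b in range(n)]
        for a in range(n)
    ]


def consensus_by_threshold(S, threshold=0.5):
    """Stand-in for graph partitioning: connected components of the graph
    that keeps only edges with similarity above the threshold."""
    n = len(S)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            a = stack.pop()
            for b in range(n):
                if labels[b] == -1 and S[a][b] > threshold:
                    labels[b] = current
                    stack.append(b)
        current += 1
    return labels


# Three made-up partitions of six items (label names differ between them).
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 2, 2],
]
S = co_association(partitions)
consensus = consensus_by_threshold(S)
print(consensus)
```

The similarity matrix has one entry per pair of items, which is where the quadratic cost of this family of methods comes from.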
![Page 8: Clustering Aggregation](https://reader036.vdocument.in/reader036/viewer/2022081801/5681554b550346895dc31955/html5/thumbnails/8.jpg)
To make full use of the information included in a dataset, a multiway spectral clustering algorithm with a joint model is applied to image segmentation.
To overcome the sensitivity of the joint-model-based multiway spectral clustering to the kernel parameter, and to produce robust and stable segmentation results, a spectral clustering ensemble algorithm is used.
Spectral Ensembles - Abstract
![Page 9: Clustering Aggregation](https://reader036.vdocument.in/reader036/viewer/2022081801/5681554b550346895dc31955/html5/thumbnails/9.jpg)
1. Produce $r$ individual spectral partitions.
2. Use MCLA to obtain $S_c^{\mathrm{MCLA}}(x_i)$.
3. Use HGPA to obtain $S_c^{\mathrm{HGPA}}(x_i)$.
4. By the ANMI criterion, get the final decision $S_c^{*}(x_i)$ from $S_c^{\mathrm{MCLA}}(x_i)$ and $S_c^{\mathrm{HGPA}}(x_i)$.
Spectral Ensembles - Steps
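The final selection step can be sketched as follows. This assumes the usual definition of normalized mutual information, $I(a,b)/\sqrt{H(a)H(b)}$, averaged over the $r$ individual partitions. The candidate labelings standing in for the MCLA and HGPA outputs are made up, and the function names are hypothetical.

```python
from collections import Counter
from math import log, sqrt


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    # p_uv * log(p_uv / (p_u * p_v)), written with raw counts.
    return sum(
        (c / n) * log(c * n / (count_a[u] * count_b[v]))
        for (u, v), c in joint.items()
    )


def nmi(a, b):
    denom = sqrt(entropy(a) * entropy(b))
    return mutual_information(a, b) / denom if denom > 0 else 0.0


def anmi(candidate, partitions):
    """Average NMI between one candidate and every individual partition."""
    return sum(nmi(candidate, p) for p in partitions) / len(partitions)


def select_by_anmi(candidates, partitions):
    """Return the name of the candidate with the highest ANMI."""
    return max(candidates, key=lambda name: anmi(candidates[name], partitions))


# Made-up individual partitions and made-up consensus candidates.
individual = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 2, 2],
]
candidates = {
    "MCLA": [0, 0, 0, 1, 1, 1],
    "HGPA": [0, 1, 0, 1, 0, 1],
}
best = select_by_anmi(candidates, individual)
print(best)
```

NMI is invariant to renaming cluster labels, which is why the first two individual partitions count as identical here.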
![Page 10: Clustering Aggregation](https://reader036.vdocument.in/reader036/viewer/2022081801/5681554b550346895dc31955/html5/thumbnails/10.jpg)
Because clustering is a central task in many research fields, numerous clustering algorithms have been developed and analyzed.
However, no clustering algorithm is agreed to be superior for every data set.
The performance of a clustering algorithm depends greatly on characteristics of the given data set and on parameters used by the algorithm, such as the desired number of clusters in a partition.
CASLA - Motivation
![Page 11: Clustering Aggregation](https://reader036.vdocument.in/reader036/viewer/2022081801/5681554b550346895dc31955/html5/thumbnails/11.jpg)
Use various partitions of the same data set in order to define a new metric on the data set.
Using the new metric as an enhanced input for a clustering algorithm will produce better and more robust partitions.
This process can be done repeatedly, where in each step the metric is updated using the original data as well as the new cluster partition.
CASLA - Abstract
![Page 12: Clustering Aggregation](https://reader036.vdocument.in/reader036/viewer/2022081801/5681554b550346895dc31955/html5/thumbnails/12.jpg)
1. Let $R$ be an $n \times n$ distance matrix based on $X$ (e.g., $R = (X^T X)^{1/2}$ for the Euclidean distance).
2. Determine $C$, the desired number of clusters.
3. Create cluster $C$-partitions $\Pi_1, \dots, \Pi_m$ using $m$ clustering methods, with $R$ as the metric.
4. Compute $\mu^i_j$ and $\Sigma^i_j$ for any cluster $\pi^i_j$ in any $\Pi_i$.
5. Recompute $A$ using Equation (8).
6. Set $R = X^T A X$.
7. Repeat until $R$ converges:
   8. Create a cluster partition $\Pi$ of $X$ using some clustering method, with $R$ as the metric.
   9. Compute $\mu_j$ and $\Sigma_j$ for any cluster $\pi_j$ in $\Pi$.
   10. Recompute $A$ using Equation (8) (for $m = 1$).
   11. Set $R = X^T A X$.
12. Output $\Pi$.
CASLA – Steps (exterior metric update)
![Page 13: Clustering Aggregation](https://reader036.vdocument.in/reader036/viewer/2022081801/5681554b550346895dc31955/html5/thumbnails/13.jpg)
1. Let $R$ be an $n \times n$ distance matrix based on $X$.
2. Determine $C$, the desired number of clusters.
3. Initialize $C$ random clusters.
4. Compute the cluster centroids $c_1, \dots, c_C$.
5. Repeat until $R$ converges:
   6. Assign each data element $x_r$ to the cluster $\pi_j$ such that $\lVert x_r - c_j \rVert_R$ is minimized.
   7. Compute $\mu_j$ and $\Sigma_j$ for any cluster $\pi_j$ in $\Pi$.
   8. Recompute $A$ using Equation (8) (for $m = 1$).
   9. Set $R = X^T A X$.
10. Output $\Pi$.
CASLA – Steps (interior metric update)
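The loop structure of the interior update can be sketched as below, under loudly stated assumptions. Equation (8) is not reproduced in these slides, so `pooled_inverse_covariance` is only a hypothetical stand-in (the regularized inverse of the pooled within-cluster covariance), not the actual CASLA formula. The sketch also stores data points as rows, so the slide's $X^T A X$ becomes `X @ A @ X.T`, and it reads the assignment norm as $\sqrt{(x - c)^T A (x - c)}$.

```python
import numpy as np


def pooled_inverse_covariance(X, labels, n_clusters, ridge=1e-6):
    # ASSUMPTION: placeholder for Equation (8), which is not shown in the deck.
    d = X.shape[1]
    scatter = np.zeros((d, d))
    for j in range(n_clusters):
        members = X[labels == j]
        if len(members) > 0:
            centered = members - members.mean(axis=0)
            scatter += centered.T @ centered
    cov = scatter / len(X) + ridge * np.eye(d)
    return np.linalg.inv(cov)


def casla_interior_sketch(X, n_clusters, max_iter=100, tol=1e-9, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    labels = rng.integers(0, n_clusters, size=n)  # random initial clusters
    A = np.eye(d)                                 # start from the Euclidean metric
    R = X @ A @ X.T
    for _ in range(max_iter):
        # Centroids of the current clusters (an empty cluster gets a random point).
        centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else X[rng.integers(n)]
            for j in range(n_clusters)
        ])
        # Assign each element to the centroid nearest under the current metric.
        diff = X[:, None, :] - centroids[None, :, :]
        dist2 = np.einsum("ncd,de,nce->nc", diff, A, diff)
        labels = np.argmin(dist2, axis=1)
        # Update the metric from the new partition and test convergence of R.
        A = pooled_inverse_covariance(X, labels, n_clusters)
        R_new = X @ A @ X.T
        if np.linalg.norm(R_new - R) <= tol * max(1.0, np.linalg.norm(R)):
            break
        R = R_new
    return labels, A


# Toy data: two groups of five points each.
blob = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.5, 0.5]])
X_demo = np.vstack([blob, blob + 10.0])
demo_labels, demo_A = casla_interior_sketch(X_demo, n_clusters=2)
print(demo_labels)
```

Because the initial clusters are random, the result can depend on `seed`; swapping in the real Equation (8) only requires replacing the placeholder function.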
![Page 14: Clustering Aggregation](https://reader036.vdocument.in/reader036/viewer/2022081801/5681554b550346895dc31955/html5/thumbnails/14.jpg)
Use Case
![Page 15: Clustering Aggregation](https://reader036.vdocument.in/reader036/viewer/2022081801/5681554b550346895dc31955/html5/thumbnails/15.jpg)
SE Documents
![Page 16: Clustering Aggregation](https://reader036.vdocument.in/reader036/viewer/2022081801/5681554b550346895dc31955/html5/thumbnails/16.jpg)
Choose an input file.
Run the CASLA clustering (Zeev) and the spectral clustering ensemble (Xiuli et al.) in different threads.
Show the ANMI criterion for each.
Show a colored graph for each.
Show statistics: time evaluation per round, the different ANMI values for the different methods, and the standard deviation of cluster size.
History tab for previous results.
GUI - TODO
![Page 17: Clustering Aggregation](https://reader036.vdocument.in/reader036/viewer/2022081801/5681554b550346895dc31955/html5/thumbnails/17.jpg)
[1] Zeev article draft *
[2] Spectral Clustering Ensemble for Image Segmentation, Xiuli, Wanggen & Licheng.
[3] Eyal David
[4] Dhilon
[5] Sterhl
References
![Page 18: Clustering Aggregation](https://reader036.vdocument.in/reader036/viewer/2022081801/5681554b550346895dc31955/html5/thumbnails/18.jpg)
THE END!