Unsupervised Learning and Clustering
k-means clustering
Sum-of-Squared Errors
Competitive Learning SOM
Pre-processing and Post-processing techniques
K-means clustering
This is an elementary but very popular clustering method.
Our goal is to find the k mean vectors or “cluster centers”.
Initialize k and the means m1, m2, …, mk
Repeat
  Classify each sample according to its nearest mi
  Recompute each mi as the mean of the samples assigned to it
Until there is no change in any mi
Return m1, m2, …, mk
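The loop above can be sketched directly; this is a minimal plain-Python version (function name and the empty-cluster handling are my own choices, not from the slides):

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means on a list of d-dimensional points given as tuples."""
    rng = random.Random(seed)
    means = rng.sample(points, k)          # initialize the k means
    for _ in range(max_iter):
        # Classify each sample by its nearest mean (squared Euclidean distance)
        clusters = [[] for _ in range(k)]
        for x in points:
            i = min(range(k),
                    key=lambda j: sum((a - b) ** 2 for a, b in zip(x, means[j])))
            clusters[i].append(x)
        # Recompute each mean; keep the old mean if a cluster went empty
        new_means = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else means[i]
            for i, cl in enumerate(clusters)
        ]
        if new_means == means:             # no change in any mi: converged
            break
        means = new_means
    return means
```

Running it on two well-separated groups returns one mean per group.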
Complexity
The computational complexity of the algorithm is defined as follows:
O(n d c T)
where n is the number of examples, d the number of features, c the number of clusters, and T the number of iterations.
The number of iterations is normally much less than the number of examples.
Figure 10.3
K-means clustering
Disadvantage 1: Prone to fall into local minima.
This can be mitigated with more computational power by running the algorithm many times with different initial means.
Disadvantage 2: Susceptible to outliers.
One solution is to replace the mean with the median.
K-means clustering
Hugo Steinhaus
Born on January 14, 1887, in Austria-Hungary.
Professor at the University of Wroclaw, Notre Dame, and Sussex.
Authored over 170 works in mathematics.
First to propose the idea behind k-means clustering.
Unsupervised Learning and Clustering
k-means clustering
Sum-of-Squared Errors
Competitive Learning SOM
Pre-processing and Post-processing techniques
The Sum-of-Squared Error
We can now define the goal of clustering:
Goal: To divide a dataset of examples into c disjoint subsets D1, D2, …, Dc, so that the distance between examples within the same partition is small compared to the distance between examples in different partitions.
To achieve this, we define the c means by looking to minimize a metric.
Metric
Let mi be the mean of examples on partition Di:
mi = (1 / ni) Σ x (for all x in Di)
Then the metric to minimize is the sum-of-squared errors:
Je = Σi Σx∈Di || x – mi ||²
where the inner sum runs over all x in Di and the outer index i runs over the c clusters.
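The criterion Je can be computed directly from a partition; a minimal sketch (the function name and input convention are my own):

```python
def sum_squared_error(clusters):
    """Je = sum over clusters i, sum over x in Di, of ||x - mi||^2.

    `clusters` is a list of clusters; each cluster is a list of points (tuples).
    """
    total = 0.0
    for cluster in clusters:
        n = len(cluster)
        if n == 0:
            continue
        # mi: the mean of the examples in this cluster
        mean = tuple(sum(c) / n for c in zip(*cluster))
        # add the squared distance of every example to its cluster mean
        total += sum(sum((a - b) ** 2 for a, b in zip(x, mean))
                     for x in cluster)
    return total
```

k-means can be read as a greedy local minimizer of this quantity.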
Figure 10.10
Others
Hierarchical clustering: clusters have subclusters, which in turn have subclusters, and so on.
Online clustering: as time goes on, new information may call for restructuring the clusters (plasticity), but we don't want this to happen very often (stability).
Figure 10.11
Unsupervised Learning and Clustering
k-means clustering
Sum-of-Squared Errors
Competitive Learning SOM
Pre-processing and Post-processing techniques
Vector Quantisation
The data are represented by a small set of prototype vectors.
Feature Mapping
(Diagram: input nodes feeding the map of neurons.)
Feature Mapping
(Diagram: the input nodes carry a vector [ x1, x2, x3, x4 ]T; each map neuron holds one weight per input, forming the weight vector [ w1, w2, w3, w4 ]T.)
Feature Mapping
Each weight vector [ w1, w2, w3, w4 ]T can be mapped into the same feature space as the input vectors [ x1, x2, x3, x4 ]T.
SOM Algorithm
Initialization
Select the number of neurons in the map
Choose random values for all weights
Learning
Repeat
  For each example x, find the neuron closest to the point:
min || x - w ||
SOM Algorithm
Winner takes all
Update the weights of the winner only (and of its neighbors)
SOM Algorithm
Update Weights
Update weights for the closest neuron and neighbors:
wt+1 = wt + η A(x, w) (x – w)
where η is the learning rate
A is a neighborhood function centered on the winning neuron.
SOM Algorithm
The neighboring function A:
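The slides give A only as a picture; a common concrete choice (an assumption here, not stated in the text) is a Gaussian of the grid distance between each neuron and the winner, with a width sigma that shrinks over training. A minimal 1-D SOM sketch under that assumption:

```python
import math
import random

def train_som(data, n_neurons, epochs=20, eta=0.5, sigma0=2.0, seed=0):
    """Train a 1-D SOM: weights[j] is the scalar weight of map neuron j."""
    rng = random.Random(seed)
    weights = [rng.random() for _ in range(n_neurons)]
    for t in range(epochs):
        # Shrinking neighborhood width: large at first, small at the end
        sigma = sigma0 * (1.0 - t / epochs) + 0.01
        for x in data:
            # Winner: the neuron whose weight is closest to x  (min ||x - w||)
            win = min(range(n_neurons), key=lambda j: abs(x - weights[j]))
            for j in range(n_neurons):
                # Gaussian neighborhood A of the grid distance to the winner
                A = math.exp(-((j - win) ** 2) / (2.0 * sigma ** 2))
                # Update rule: w_{t+1} = w_t + eta * A * (x - w_t)
                weights[j] += eta * A * (x - weights[j])
    return weights
```

With data in [0, 1] and initial weights in [0, 1], every update is a convex step toward an example, so the weights stay inside the data range.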
SOM Algorithm
Usage
For every test point
Select the closest neuron using minimum Euclidean distance:
min || x - w ||
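The usage step reduces to a nearest-neighbor lookup over the weight vectors; a minimal sketch (scalar weights for brevity, function name my own):

```python
def best_matching_unit(x, weights):
    """Return the index of the neuron closest to x, i.e. argmin_j ||x - w_j||."""
    return min(range(len(weights)), key=lambda j: abs(x - weights[j]))
```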
Mapping a Grid to a Grid
SOM Algorithm
Comments
Neighborhoods should be large at the beginning but shrink as the nodes settle into a specific ordering
Global ordering comes naturally (complexity theory)
Architecture of the map:
  Few nodes: underfitting
  Many nodes: overfitting
Teuvo Kohonen
• Born in 1934, Finland
• Author of several books and over 300 papers
• Most famous for his work on Self-Organizing Maps
• Member of the Academy of Finland

Awards:
• IEEE Neural Networks Council Pioneer Award, 1991
• IEEE Technical Achievement Award, 1995
• Frank Rosenblatt Technical Field Award, 2008
Unsupervised Learning and Clustering
k-means clustering
Sum-of-Squared Errors
Competitive Learning SOM
Pre-processing and Post-processing techniques
Cluster Tendency
Cluster tendency is a preprocessing step that indicates whether the data exhibit any clustering structure at all; it advises against clustering when the data appear to be randomly generated from a uniform distribution over a sampling window of interest in the attribute space.
Example Cluster Tendency
Clustering captures inherent data groups.
Clustering does not capture groups; the results come from random variation.
Example Cluster Tendency
Problem: How do we choose the sampling window?
Rule of thumb: Create a window centered at the mean that captures half the total number of examples.
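The rule of thumb can be sketched in one dimension as follows; the function name and the tie-breaking details are assumptions, since the slide leaves them open:

```python
def half_sample_window(values):
    """An interval centered at the mean that captures half of the examples
    (a 1-D sketch of the slide's rule of thumb)."""
    n = len(values)
    mean = sum(values) / n
    # Sort the examples by distance to the mean and keep the closest half
    by_dist = sorted(values, key=lambda v: abs(v - mean))
    kept = by_dist[: max(1, n // 2)]
    half_width = max(abs(v - mean) for v in kept)
    return (mean - half_width, mean + half_width)
```

Points falling inside the window would then be compared against a uniform sample over the same window to judge tendency.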
Cluster Validation
Cluster validation is used to assess the value of the output of a clustering algorithm.
Internal: statistics are devised to capture the quality of the induced clusters using only the available data objects.
External: the induced clusters are compared, via statistics, against an external and independent classification of the objects.
Example Cluster Validation
Metrics Cluster Validation
One family of statistical metrics is defined in terms of a 2 × 2 table in which each entry counts the object pairs that agree or disagree on class and on cluster membership:

                  Same cluster    Different cluster
Same class            E11               E12
Different class       E21               E22
Examples Metrics Cluster Validation
Rand:
[ E11 + E22 ] / [ E11 + E12 + E21 + E22 ]
Jaccard:
E11 / [ E11 + E12 + E21 ]
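Given a class labeling and a cluster labeling, the four pair counts and both metrics can be computed directly; this sketch assumes the E11/E12/E21/E22 convention above, with each pair counted once:

```python
from itertools import combinations

def pair_counts(classes, clusters):
    """E11: same class & same cluster; E12: same class & different cluster;
    E21: different class & same cluster; E22: different class & cluster."""
    e11 = e12 = e21 = e22 = 0
    for i, j in combinations(range(len(classes)), 2):
        same_class = classes[i] == classes[j]
        same_cluster = clusters[i] == clusters[j]
        if same_class and same_cluster:
            e11 += 1
        elif same_class:
            e12 += 1
        elif same_cluster:
            e21 += 1
        else:
            e22 += 1
    return e11, e12, e21, e22

def rand_index(classes, clusters):
    e11, e12, e21, e22 = pair_counts(classes, clusters)
    return (e11 + e22) / (e11 + e12 + e21 + e22)

def jaccard_index(classes, clusters):
    e11, e12, e21, e22 = pair_counts(classes, clusters)
    return e11 / (e11 + e12 + e21)
```

A perfect match of clusters to classes gives 1.0 on both metrics; Jaccard ignores the E22 pairs, so it is stricter than Rand.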