Data Set used

Upload: kylee-moxham

Post on 31-Mar-2015

K Means

K Means Clusters

1. K Means begins with a user-specified number of clusters, K.

2. Randomly places the K centroids on the data set.

3. Finds all the points closest to each centroid and makes them clusters.

4. Changes the centroid of each cluster to the mean of the subset of points.

5. Repeats steps 3 and 4 until the change of the centroids is minimal.
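
The five steps can be sketched in plain Python as follows. This is a minimal illustration, not the presenters' Matlab code: the function names, the `tolerance` and `max_iterations` parameters, the fixed `seed`, and the choice to start the centroids on K randomly sampled data points are all assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(point, centroids):
    """Index of the centroid closest to the point (ties go to the lowest index)."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def k_means(points, k, tolerance=1e-6, max_iterations=100, seed=0):
    rng = random.Random(seed)
    # Step 2: place the K centroids; here they start on K sampled data points
    # (an assumption, the slides only say "randomly").
    centroids = [tuple(p) for p in rng.sample(points, k)]
    for _ in range(max_iterations):
        # Step 3: group every point with its closest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        # Step 4: move each centroid to the mean of its points.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(tuple(
                    sum(m[d] for m in members) / len(members)
                    for d in range(len(old))))
            else:
                new_centroids.append(old)  # empty cluster: leave it in place
        # Step 5: stop once no centroid moved by more than the tolerance.
        shift = max(squared_distance(o, n)
                    for o, n in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tolerance:
            break
    labels = [nearest_centroid(p, centroids) for p in points]
    return centroids, labels


points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = k_means(points, k=2)
```

The `max_iterations` cap is a guard against runs where the centroids never settle; the label numbers themselves are arbitrary, only the grouping matters.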

K Means Implementation Issues

• If K is too small, the algorithm did not converge (no stable clusters)

  – Further investigation of this is needed

• If K is too small, some clusters were null
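
One common way to deal with null clusters is to detect them after the assignment step and move the unused centroid onto a data point. The sketch below is an assumption about how this could be handled, not something the slides describe: the helper names `assign` and `reseed_empty_clusters` are hypothetical, and picking the point farthest from every current centroid is just one of several possible reseeding rules.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Group the points by their closest centroid; some groups may be empty."""
    clusters = [[] for _ in centroids]
    for p in points:
        i = min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
        clusters[i].append(p)
    return clusters


def reseed_empty_clusters(points, centroids):
    """Return new centroids where each null cluster is restarted on the
    point that is currently farthest from its nearest centroid."""
    centroids = list(centroids)
    clusters = assign(points, centroids)
    for i, members in enumerate(clusters):
        if not members:
            centroids[i] = max(
                points,
                key=lambda p: min(squared_distance(p, c) for c in centroids))
    return centroids


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0), (4.0, 4.0)]
# The third centroid is far from all the data, so it attracts no points.
centroids = [(0.0, 0.5), (10.0, 10.5), (100.0, 100.0)]

empty_before = [i for i, c in enumerate(assign(points, centroids)) if not c]
fixed = reseed_empty_clusters(points, centroids)
empty_after = [i for i, c in enumerate(assign(points, fixed)) if not c]
```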

K-Means Matlab code

Ease of Doing Business vs Paying Taxes

Interesting case

• The border points are clearly defined by distance not density

• We ask for each point “What is the closest centroid?”

Why we like it

• It is relatively straightforward in concept and implementation

• Good for globular data

• We can specify the number of clusters

Why we don’t like it

• Subject to initialization problems and heterogeneous results.

• Not good for non-globular data (but can find clusters given a large enough K)

• Sensitive to outliers (cleaning the data set helps)

• Data must have the notion of a “center”

Variations

• Bisecting K-means

• K-median

• K-medoid

• Several others
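
The difference between these variations is mostly in how the "center" of a cluster is computed. A small one-dimensional illustration (the data values and the `medoid` helper are made up for this example) shows how the mean reacts to one outlier compared with the median and the medoid:

```python
from statistics import mean, median


def medoid(values):
    """The member of the cluster with the smallest total distance to the others."""
    return min(values, key=lambda v: sum(abs(v - u) for u in values))


cluster = [1.0, 2.0, 3.0, 4.0, 100.0]  # 100.0 is an outlier

center_mean = mean(cluster)      # K-means style center, pulled to 22.0
center_median = median(cluster)  # K-median style center, stays at 3.0
center_medoid = medoid(cluster)  # K-medoid style center, an actual data point: 3.0
```

The medoid is always one of the data points, which is why it can be used even when an average of the data is not meaningful.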

DBSCAN Algo

Pick a point P and find the distance Dist from P to every other point P'.

If (Dist < K_Factor), P' is in the same cluster as P.

Else if (Dist = K_Factor), P' is a border point.

Else, allot P' a new cluster.
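
One possible reading of this procedure is sketched below. It is a simplified distance-threshold scheme that follows the slide, not textbook DBSCAN, which additionally uses a minimum-points parameter to tell core points from noise. The function name `threshold_clusters` is hypothetical, and keeping border points (Dist = K_Factor) inside the cluster is an assumption.

```python
import math


def threshold_clusters(points, k_factor):
    """Give the same label to points that are linked by a chain of
    neighbours, each step no longer than k_factor."""
    labels = [None] * len(points)
    next_label = 0
    for start in range(len(points)):
        if labels[start] is not None:
            continue  # already reached from an earlier point
        labels[start] = next_label
        frontier = [start]
        while frontier:
            p = frontier.pop()
            for q in range(len(points)):
                # "<=" keeps border points (distance exactly k_factor) in the cluster.
                if labels[q] is None and math.dist(points[p], points[q]) <= k_factor:
                    labels[q] = next_label
                    frontier.append(q)
        next_label += 1  # anything not reached starts a new cluster
    return labels


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0)]

labels_medium = threshold_clusters(points, k_factor=1.5)  # [0, 0, 0, 1, 1]
labels_small = threshold_clusters(points, k_factor=0.5)   # [0, 1, 2, 3, 4]
labels_large = threshold_clusters(points, k_factor=20.0)  # [0, 0, 0, 0, 0]
```

The three calls show how strongly the result depends on the threshold: a small K_Factor leaves every point alone, a large one puts everything in a single cluster.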

SNAPSHOTS

For K_Factor = 20

For K_Factor = 10

For K_Factor = 120

Calculation of K-Factor

Issues faced

When adding a new point P' to the present cluster, the whole cluster of P' has to be merged with the present cluster.

No lower bound on number of clusters.

Choice of K Factor
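
The merging issue is a natural fit for a disjoint-set (union-find) structure, where joining two points joins their whole clusters in one step. The sketch below is a suggestion, not the approach used in the presentation; the class name `DisjointSet` and the point indices are made up for the example.

```python
class DisjointSet:
    """Tracks which cluster each point index belongs to."""

    def __init__(self, n):
        # Every point starts as its own single-member cluster.
        self.parent = list(range(n))

    def find(self, i):
        """Return the representative of the cluster containing i."""
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # shorten the path
            i = self.parent[i]
        return i

    def union(self, i, j):
        """Merge the whole cluster of j into the cluster of i."""
        root_i, root_j = self.find(i), self.find(j)
        if root_i != root_j:
            self.parent[root_j] = root_i


clusters = DisjointSet(5)
clusters.union(0, 1)  # cluster {0, 1}
clusters.union(3, 4)  # cluster {3, 4}
clusters.union(1, 3)  # linking points 1 and 3 merges both whole clusters

merged = clusters.find(0) == clusters.find(4)
untouched = clusters.find(2) != clusters.find(0)
```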

Further Enhancements

• Calculation for K-Factor and clustering could be integrated together.

• Dynamic programming could be used, since many computations are repeated.

• Static vs Dynamic data
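
As a rough illustration of reusing repeated computations, pairwise distances can be cached so that each pair is computed only once. This is plain memoization rather than a full dynamic-programming scheme, and it assumes static data: the names `distance` and `_cached_distance` are hypothetical, and the cache would have to be cleared whenever the points change.

```python
import math
from functools import lru_cache

points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]


@lru_cache(maxsize=None)
def _cached_distance(i, j):
    """Distance between points i and j, stored after the first call."""
    return math.dist(points[i], points[j])


def distance(i, j):
    """Symmetric lookup: (i, j) and (j, i) share one cache entry."""
    return _cached_distance(min(i, j), max(i, j))


d_first = distance(0, 1)   # computed
d_second = distance(1, 0)  # served from the cache
info = _cached_distance.cache_info()
```

If the data set is dynamic, calling `_cached_distance.cache_clear()` after each change keeps the cached values consistent with the points.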
