introduction to machine learning - amazon s3 · 2016-05-04 · introduction to machine learning 1....
TRANSCRIPT
![Page 1: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/1.jpg)
INTRODUCTION TO MACHINE LEARNING
Clustering with k-means
![Page 2: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/2.jpg)
Introduction to Machine Learning
Clustering, what?
● Similar within cluster
● Dissimilar between clusters
● Clustering: grouping objects in clusters
● No labels: unsupervised classification
● Plenty possible clusterings
● Cluster: collection of objects
![Page 3: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/3.jpg)
Introduction to Machine Learning
Clustering, why?● Pa!ern Analysis
● Visualise Data
● pre-Processing Step
● Outlier Detection
● …
● Targeted Marketing Programs
● Student Segmentations
● Data Mining
● …
![Page 4: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/4.jpg)
Introduction to Machine Learning
Clustering, how?● Measure of Similarity: d( …, …)
• Numerical variables Metrics: Euclidean, Manhattan, …
• Categorical variables Construct your own distance
• Clustering Methods
• k-means
• Hierarchical
• …
Many variations
![Page 5: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/5.jpg)
Introduction to Machine Learning
Cluster CentroidObjectCluster
#Clusters
Compactness and Separation● Within Cluster Sums of Squares (WSS):
Measure of compactness
● Between Cluster Sums of Squares (BSS):
Measure of separation
Minimise WSS
Maximise BSS
Cluster Centroid#Clusters
#Objects in ClusterSample Mean
![Page 6: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/6.jpg)
Introduction to Machine Learning
Goal: Partition data in k disjoint subsets
k-Means Algorithm
0 5 10−5
05
x
y
Let’s take k = 3
![Page 7: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/7.jpg)
Introduction to Machine Learning
1. Randomly assign k centroids
0 5 10−5
05
x
y
k-Means Algorithmk = 3Goal: Partition data in k disjoint subsets
![Page 8: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/8.jpg)
Introduction to Machine Learning
1. Randomly assign k centroids
2. Assign data to closest centroid
0 5 10−5
05
x
y
k-Means Algorithmk = 3Goal: Partition data in k disjoint subsets
![Page 9: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/9.jpg)
Introduction to Machine Learning
1. Randomly assign k centroids
2. Assign data to closest centroid
3. Moves centroids to average location
0 5 10−5
05
x
y
k-Means Algorithmk = 3Goal: Partition data in k disjoint subsets
![Page 10: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/10.jpg)
Introduction to Machine Learning
1. Randomly assign k centroids
2. Assign data to closest centroid
3. Moves centroids to average location
4. Repeat step 2 and 3
0 5 10−5
05
x
y
k-Means Algorithmk = 3Goal: Partition data in k disjoint subsets
![Page 11: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/11.jpg)
Introduction to Machine Learning
1. Randomly assign k centroids
2. Assign data to closest centroid
3. Moves centroids to average location
4. Repeat step 2 and 3
0 5 10−5
05
x
y
k-Means Algorithm
The algorithm has converged!
k = 3Goal: Partition data in k disjoint subsets
![Page 12: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/12.jpg)
Introduction to Machine Learning
● Problem: WSS keeps decreasing as k increases!
● Solution: WSS starts decreasing slowly
Choosing k● Goal: Find k that minimizes WSS
WSS / TSS < 0.2 Fix k}
![Page 13: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/13.jpg)
Introduction to Machine Learning
Choosing k: Scree PlotScree Plot: Visualizing the ratio WSS / TSS as function of k
1 2 3 4 5 6 7
0.2
0.4
0.6
0.8
1.0
k
WSS
/ TS
S
Look for the elbow in the plot
Choose k = 3
![Page 14: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/14.jpg)
Introduction to Machine Learning
k-Means in R
● centers: Starting centroid or #clusters
● nstart: #times R restarts with different centroids
> my_km <- kmeans(data, centers, nstart)
Distance: Euclidean metric
> my_km$tot.withinss
> my_km$betweenss
WSS
BSS
![Page 15: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/15.jpg)
INTRODUCTION TO MACHINE LEARNING
Let’s practice!
![Page 16: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/16.jpg)
INTRODUCTION TO MACHINE LEARNING
Performance and Scaling
![Page 17: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/17.jpg)
Introduction to Machine Learning
Cluster EvaluationNot trivial! There is no truth
● No true labels
● No true response
Evaluation methods? Depends on the goal
Goal: Compact and Separated Measurable!
![Page 18: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/18.jpg)
Introduction to Machine Learning
Cluster MeasuresWSS and BSS:
Underlying idea:
● Variance within clusters
● Separation between clusters } Compare
Alternative:
● Diameter
● Intercluster Distance
Good indication
![Page 19: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/19.jpg)
Introduction to Machine Learning
Diameter
0 5 10 15−5
05
x
y
: Objects : Cluster
: Distance (objects)
Measure of Compactness
![Page 20: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/20.jpg)
Introduction to Machine Learning
0 5 10 15
−50
5
x
y
Intercluster Distance
: Objects : Clusters: Distance (objects)
Measure of Separation
![Page 21: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/21.jpg)
Introduction to Machine Learning
0 5 10 15
−50
5
x
y
Dunn’s Index
![Page 22: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/22.jpg)
Introduction to Machine Learning
Notes:
● High computational cost
● Worst case indicator
Dunn’s IndexHigher Dunn Be!er separated / more compact
![Page 23: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/23.jpg)
Introduction to Machine Learning
Alternative measures● Internal Validation: based on intrinsic knowledge
● BIC Index
● Silhoue!e’s Index
● External Validation: based on previous knowledge
● Hulbert’s Correlation
● Jaccard’s Coefficient
![Page 24: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/24.jpg)
Introduction to Machine Learning
Evaluating in RLibraries: cluster and clValid
> dunn(clusters = my_km, Data = ...)
Dunn’s Index:
● clusters: cluster partitioning vector
● Data: original dataset
![Page 25: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/25.jpg)
Introduction to Machine Learning
Scale IssuesMetrics are o!en scale dependent!
Which pair is most similar? ( Age, Income, IQ )
● X1 = (28, 72000, 120)
● X2 = (56, 73000, 80)
● X3 = (29, 74500, 118)
● Intuition: (X1, X3)
● Euclidean: (X1, X2)
Solution: Rescale income / 1000$
![Page 26: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/26.jpg)
Introduction to Machine Learning
StandardizingProblem: Multiple variables on different scales
Solution: Standardize your data
1. Subtract the mean
2. Divide by the standard deviation
> scale(data)
Note: Standardizing Different interpretation
![Page 27: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/27.jpg)
INTRODUCTION TO MACHINE LEARNING
Let’s practice!
![Page 28: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/28.jpg)
INTRODUCTION TO MACHINE LEARNING
Hierarchical Clustering
![Page 29: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/29.jpg)
Introduction to Machine Learning
Hierarchical ClusteringHierarchy:
● Which objects cluster first?
● Which cluster pairs merge? When?
Bo!om-up:
● Starts from the objects
● Builds a hierarchy of clusters
![Page 30: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/30.jpg)
Introduction to Machine Learning
Bo!om-Up: AlgorithmPre: Calculate distances between objects
Objects
![Page 31: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/31.jpg)
Introduction to Machine Learning
Bo!om-Up: AlgorithmPre: Calculate distances between objects
Distance
Objects
![Page 32: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/32.jpg)
Introduction to Machine Learning
Bo!om-Up: Algorithm1. Put every object in its own cluster
![Page 33: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/33.jpg)
Introduction to Machine Learning
Bo!om-Up: Algorithm2. Find the closest pair of clusters Merge them
![Page 34: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/34.jpg)
Introduction to Machine Learning
Bo!om-Up: Algorithm3. Compute distances between new cluster and old ones
![Page 35: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/35.jpg)
Introduction to Machine Learning
Bo!om-Up: Algorithm4. Repeat steps two and three
![Page 36: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/36.jpg)
Introduction to Machine Learning
Bo!om-Up: Algorithm4. Repeat steps two and three
![Page 37: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/37.jpg)
Introduction to Machine Learning
Bo!om-Up: Algorithm4. Repeat steps two and three
![Page 38: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/38.jpg)
Introduction to Machine Learning
Bo!om-Up: Algorithm4. Repeat steps two and three One cluster
![Page 39: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/39.jpg)
Introduction to Machine Learning
Linkage-Methods● Simple-Linkage: minimal distance between clusters
● Complete-Linkage: maximal distance between clusters
● Average-Linkage: average distance between clusters
Different Clusterings
![Page 40: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/40.jpg)
Introduction to Machine Learning
Simple-LinkageMinimal distance between objects in each clusters
![Page 41: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/41.jpg)
Introduction to Machine Learning
Complete-LinkageMaximal distance between objects in each cluster
![Page 42: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/42.jpg)
Introduction to Machine Learning
Single-Linkage: Chaining
● O!en undesired
![Page 43: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/43.jpg)
Introduction to Machine Learning
Single-Linkage: Chaining
● O!en undesired
![Page 44: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/44.jpg)
Introduction to Machine Learning
Single-Linkage: Chaining
● O!en undesired
![Page 45: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/45.jpg)
Introduction to Machine Learning
Single-Linkage: Chaining
● Can be great outlier detector
● O!en undesired
![Page 46: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/46.jpg)
Introduction to Machine Learning
DendrogramHe
ight
Leaves / Objects
Cut
Merge
Merge
![Page 47: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/47.jpg)
Introduction to Machine Learning
Hierarchical Clustering in R
> dist(x, method)
Library: stats
> hclust(d, method)
● x: dataset ● method: distance
● d: distance matrix ● method: linkage
![Page 48: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/48.jpg)
Introduction to Machine Learning
Hierarchical: Pro and Cons● Pros
● In-depth analysis
● Linkage-methods
● Cons
● High computational cost
● Can never undo merges
Different pa"ern
![Page 49: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/49.jpg)
Introduction to Machine Learning
k-Means: Pro and Cons● Pros
● Can undo merges
● Fast computations
● Cons
● Fixed #Clusters
● Dependent on starting centroids
![Page 50: INTRODUCTION TO MACHINE LEARNING - Amazon S3 · 2016-05-04 · Introduction to Machine Learning 1. Randomly assign k centroids 2. Assign data to closest centroid 3. Moves centroids](https://reader033.vdocument.in/reader033/viewer/2022042807/5f7cacbcc6b64e726b184fb5/html5/thumbnails/50.jpg)
INTRODUCTION TO MACHINE LEARNING
Let’s practice!