Advanced Machine Learning
TRANSCRIPT
-
7/29/2019 Advanced Machine Learning
1/36
CS 221: Artificial Intelligence
Lecture 6:
Advanced Machine Learning
Sebastian Thrun and Peter Norvig
Slide credit: Mark Pollefeys, Dan Klein, Chris Manning
-
Outline
Clustering
K-Means
EM
Spectral Clustering
Dimensionality Reduction
-
The unsupervised learning problem
Many data points, no labels
-
Unsupervised Learning?
Google Street View
http://maps.google.com/
-
K-Means
Many data points, no labels
-
K-Means
Choose a fixed number of clusters.
Choose cluster centers and point-cluster allocations to minimize error. We can't do this by exhaustive search, because there are too many possible allocations.
Algorithm:
fix cluster centers; allocate points to the closest cluster
fix allocation; compute best cluster centers
x could be any set of features for which we can compute a distance (careful about scaling).
\[
\sum_{i \in \text{clusters}} \;\; \sum_{j \in \text{elements of } i\text{-th cluster}} \lVert x_j - \mu_i \rVert^2
\]
* From Marc Pollefeys COMP 256 2003
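The two alternating steps can be sketched directly in NumPy. This is a hypothetical helper, not code from the lecture; the function name, random initialization, and stopping rule are my own choices.

```python
import numpy as np

def kmeans(x, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # initialize centers at k distinct data points
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # fix cluster centers; allocate points to the closest cluster
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # fix allocation; compute best cluster centers (the cluster means)
        new_centers = np.array([x[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```

Each iteration can only decrease the objective above, so the loop converges, though (as with EM later) only to a local minimum that depends on the initialization.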
-
K-Means
-
K-Means
* From Marc Pollefeys COMP 256 2003
-
Results of K-Means Clustering:
K-means clustering using intensity alone and color alone
[Figure: Image | Clusters on intensity | Clusters on color]
* From Marc Pollefeys COMP 256 2003
-
K-Means
Is an approximation to EM
Model (hypothesis space): mixture of N Gaussians
Latent variables: correspondence of data and Gaussians
We notice:
Given the mixture model, it's easy to calculate the correspondence
Given the correspondence, it's easy to estimate the mixture models
-
Expectation Maximization: Idea
Data generated from mixture of Gaussians
Latent variables: correspondence between data items and Gaussians
-
Generalized K-Means (EM)
-
Gaussians
\[
p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
\]
\[
p(x) = (2\pi)^{-N/2}\, |\Sigma|^{-1/2} \exp\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right)
\]
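The two densities can be evaluated directly in NumPy. A small sketch (mine, not the slides'), using `solve` rather than an explicit matrix inverse for the quadratic form:

```python
import numpy as np

def gauss_1d(x, mu, sigma):
    # p(x) = 1/(sqrt(2*pi)*sigma) * exp(-(x-mu)^2 / (2*sigma^2))
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)

def gauss_nd(x, mu, Sigma):
    # p(x) = (2*pi)^(-N/2) |Sigma|^(-1/2) exp(-1/2 (x-mu)^T Sigma^(-1) (x-mu))
    N = len(x)
    diff = x - mu
    quad = diff @ np.linalg.solve(Sigma, diff)
    return (2 * np.pi) ** (-N / 2) * np.linalg.det(Sigma) ** -0.5 * np.exp(-0.5 * quad)
```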
-
ML Fitting Gaussians
\[
\mu = \frac{1}{M} \sum_{i=1}^{M} x_i
\qquad
\Sigma = \frac{1}{M} \sum_{i=1}^{M} (x_i - \mu)(x_i - \mu)^T
\]
\[
p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
\qquad
p(x) = (2\pi)^{-N/2}\, |\Sigma|^{-1/2} \exp\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right)
\]
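The ML estimates above are just the sample mean and the (biased, 1/M) scatter matrix. A minimal sketch, with names of my choosing:

```python
import numpy as np

def fit_gaussian(X):
    M = len(X)
    mu = X.sum(axis=0) / M        # mu = (1/M) sum_i x_i
    D = X - mu
    Sigma = (D.T @ D) / M         # Sigma = (1/M) sum_i (x_i - mu)(x_i - mu)^T
    return mu, Sigma
```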
-
Learning a Gaussian Mixture (with known covariance)

E-Step:
\[
E[z_{ij}] = \frac{p(x_i \mid \mu_j)}{\sum_{n=1}^{k} p(x_i \mid \mu_n)}
          = \frac{\exp\left(-\frac{1}{2\sigma^2}(x_i - \mu_j)^2\right)}
                 {\sum_{n=1}^{k} \exp\left(-\frac{1}{2\sigma^2}(x_i - \mu_n)^2\right)}
\]

M-Step:
\[
\mu_j = \frac{1}{n_j} \sum_{i=1}^{M} E[z_{ij}]\, x_i,
\qquad
\Sigma_j = \frac{1}{n_j} \sum_{i=1}^{M} E[z_{ij}]\, (x_i - \mu_j)(x_i - \mu_j)^T,
\qquad
n_j = \sum_{i=1}^{M} E[z_{ij}]
\]
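One combined E/M step for a 1-D mixture with a known, shared sigma can be sketched as follows. This is an illustrative helper of mine, not the slide's code; it returns the updated means and the responsibilities.

```python
import numpy as np

def em_step(x, mu, sigma):
    # E-step: responsibilities E[z_ij] proportional to exp(-(x_i - mu_j)^2 / (2 sigma^2))
    logp = -(x[:, None] - mu[None, :]) ** 2 / (2 * sigma ** 2)
    z = np.exp(logp)
    z /= z.sum(axis=1, keepdims=True)
    # M-step: mu_j = (1/n_j) sum_i E[z_ij] x_i  with  n_j = sum_i E[z_ij]
    mu_new = (z * x[:, None]).sum(axis=0) / z.sum(axis=0)
    return mu_new, z
```

Iterating `em_step` until the means stop moving is the whole algorithm; compare with k-means, which is the limit where each responsibility is rounded to 0 or 1.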
-
Expectation Maximization
Converges!
Proof [Neal/Hinton, McLachlan/Krishnan]:
E/M step does not decrease data likelihood
Converges at local minimum or saddle point
But subject to local minima
-
EM Clustering: Results
http://www.ece.neu.edu/groups/rpl/kmeans/
-
Practical EM
Number of clusters unknown
Suffers (badly) from local minima
Algorithm: start a new cluster center if many points are unexplained
Kill a cluster center that doesn't contribute
(Use the AIC/BIC criterion for all this, if you want to be formal)
-
Spectral Clustering
-
Spectral Clustering
-
The Two Spiral Problem
-
Spectral Clustering: Overview
Data Similarities Block-Detection
* Slides from Dan Klein, Sep Kamvar, Chris Manning, Natural Language Group Stanford University
-
Eigenvectors and Blocks
Block matrices have block eigenvectors:
Near-block matrices have near-block eigenvectors: [Ng et al., NIPS 02]
    | 1 1 0 0 |                    | .71 |   |  0  |
    | 1 1 0 0 |  --eigensolver-->  | .71 |   |  0  |   lambda_1 = 2, lambda_2 = 2,
    | 0 0 1 1 |                    |  0  |   | .71 |   lambda_3 = 0, lambda_4 = 0
    | 0 0 1 1 |                    |  0  |   | .71 |

    | 1   1  .2   0 |                    | .71 |   |  0   |
    | 1   1   0 -.2 |  --eigensolver-->  | .69 |   | -.14 |   lambda_1 = 2.02, lambda_2 = 2.02,
    |.2   0   1   1 |                    | .14 |   |  .69 |   lambda_3 = -0.02, lambda_4 = -0.02
    | 0 -.2   1   1 |                    |  0  |   |  .71 |
* Slides from Dan Klein, Sep Kamvar, Chris Manning, Natural Language Group Stanford University
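The block-matrix claim is easy to verify numerically. A quick check (mine, not the slides') with NumPy's symmetric eigensolver:

```python
import numpy as np

# the block matrix from the slide: two disconnected 2x2 blocks of ones
A = np.array([[1., 1., 0., 0.],
              [1., 1., 0., 0.],
              [0., 0., 1., 1.],
              [0., 0., 1., 1.]])

vals, vecs = np.linalg.eigh(A)   # eigenvalues in ascending order
# eigenvalues are {0, 0, 2, 2}; each lambda=2 eigenspace is spanned by
# vectors supported on a single block, e.g. (.71, .71, 0, 0)
```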
-
Spectral Space
Can put items into blocks by eigenvectors:

    A = | 1   1  .2   0 |    e1 = (.71, .69, .14, 0)
        | 1   1   0 -.2 |    e2 = (0, -.14, .69, .71)
        |.2   0   1   1 |
        | 0 -.2   1   1 |

Resulting clusters independent of row ordering:

    A' = | 1  .2   1   0 |   e1 = (.71, .14, .69, 0)
         |.2   1   0   1 |   e2 = (0, .69, -.14, .71)
         | 1   0   1 -.2 |
         | 0   1  -.2   1 |

[Figure: in both cases, plotting each item at its (e1, e2) coordinates separates the two clusters]
* Slides from Dan Klein, Sep Kamvar, Chris Manning, Natural Language Group Stanford University
-
The Spectral Advantage
The key advantage of spectral clustering is the spectral space representation:
* Slides from Dan Klein, Sep Kamvar, Chris Manning, Natural Language Group Stanford University
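The whole pipeline (affinities, then eigenvectors, then grouping in spectral space) can be sketched in a few lines. This is a stripped-down illustration of mine: Ng et al. additionally normalize the affinity matrix and the rows of the embedding, which this version omits.

```python
import numpy as np

def spectral_embed(X, sigma=1.0, dims=2):
    # pairwise Gaussian affinities
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    A = np.exp(-d2 / (2 * sigma ** 2))
    # top eigenvectors give each point coordinates in spectral space
    vals, vecs = np.linalg.eigh(A)      # ascending eigenvalues
    return vecs[:, -dims:]
```

Points in the same block of the affinity matrix land near each other in spectral space, so an ordinary k-means pass on the embedded coordinates finishes the clustering.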
-
Measuring Affinity
Intensity:
\[
\mathrm{aff}(x,y) = \exp\left(-\frac{1}{2\sigma_I^2}\,\lVert I(x) - I(y) \rVert^2\right)
\]
Distance:
\[
\mathrm{aff}(x,y) = \exp\left(-\frac{1}{2\sigma_d^2}\,\lVert x - y \rVert^2\right)
\]
Texture:
\[
\mathrm{aff}(x,y) = \exp\left(-\frac{1}{2\sigma_t^2}\,\lVert c(x) - c(y) \rVert^2\right)
\]
* From Marc Pollefeys COMP 256 2003
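A small numeric check (mine, not the slides') of the distance affinity: the scale sigma_d decides which distances still count as "similar".

```python
import numpy as np

def aff(x, y, sigma):
    # aff(x, y) = exp(-||x - y||^2 / (2 sigma^2))
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

x, y = np.array([0.0]), np.array([1.0])
small = aff(x, y, sigma=0.1)   # distance 1 looks enormous at this scale
large = aff(x, y, sigma=10.0)  # distance 1 is negligible at this scale
```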
-
Scale affects affinity
* From Marc Pollefeys COMP 256 2003
-
* From Marc Pollefeys COMP 256 2003
-
Dimensionality Reduction
-
The Space of Digits (in 2D)
M. Brand, MERL
-
Dimensionality Reduction with PCA
-
Linear: Principal Components
Fit multivariate Gaussian
Compute eigenvectors of covariance
Project onto eigenvectors with largest eigenvalues
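The three steps can be sketched as follows (a minimal illustration of mine, not the lecture's code):

```python
import numpy as np

def pca_project(X, k):
    # fit multivariate Gaussian: mean and covariance
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    # compute eigenvectors of covariance (ascending eigenvalues)
    vals, vecs = np.linalg.eigh(Sigma)
    # project onto eigenvectors with the k largest eigenvalues
    W = vecs[:, -k:]
    return (X - mu) @ W
```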
-
Other examples of unsupervised learning
Mean face (after alignment)
Slide credit: Santiago Serrano
-
Eigenfaces
Slide credit: Santiago Serrano
-
Non-Linear Techniques
Isomap
Locally Linear Embedding
Isomap: M. Balasubramanian and E. Schwartz, Science
-
SCAPE (Drago Anguelov et al.)