![Page 1: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/1.jpg)
GRASP
Learning a Kernel Matrix for Nonlinear Dimensionality Reduction
Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul
ICML’04
Department of Computer and Information Science
![Page 2: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/2.jpg)
The Big Picture
Given high-dimensional data sampled from a low-dimensional manifold,
how can we compute a faithful embedding?
![Page 3: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/3.jpg)
Outline
• Part I: Kernel PCA
• Part II: Manifold Learning
• Part III: Algorithm
• Part IV: Experimental Results
![Page 4: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/4.jpg)
Part I. Kernel PCA
![Page 5: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/5.jpg)
Problem:
Input: $x_i \in \mathbb{R}^D$
Output: $y_i \in \mathbb{R}^d$
Embedding: Nearby points remain nearby, distant points remain distant. Estimate d.
![Page 6: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/6.jpg)
Subspaces
D=3, d=2
D=2, d=1
![Page 7: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/7.jpg)
Principal Component Analysis
Project data into subspace of maximum variance:
Can be solved as eigenvalue problem:
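The equations on this slide did not survive extraction. As a minimal sketch of the standard technique (not necessarily the exact notation the slide used), PCA can be computed by eigendecomposing the covariance matrix of the centered data. The function name `pca` and the toy data are illustrative assumptions.

```python
import numpy as np

def pca(X, d):
    """Project the rows of X (N x D) onto the d directions of maximum variance."""
    Xc = X - X.mean(axis=0)                 # center the data
    C = Xc.T @ Xc / len(X)                  # D x D covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)    # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:d]   # indices of the d largest eigenvalues
    return Xc @ eigvecs[:, order], eigvals[order]

# Toy data (assumed): points close to a line in the plane, so D=2 and d=1.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, 2.0 * t]) + 0.01 * rng.normal(size=(200, 2))
Y, top = pca(X, d=1)
```

Each returned eigenvalue equals the variance of the data along the corresponding projected coordinate.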
![Page 8: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/8.jpg)
Using the kernel trick
Do PCA in a higher-dimensional feature space.
The feature map can be defined implicitly through the kernel matrix $K_{ij} = \Phi(x_i) \cdot \Phi(x_j)$.
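A minimal sketch of kernel PCA driven only by a kernel matrix, assuming the usual recipe of centering the matrix and scaling its top eigenvectors by the square roots of their eigenvalues. The name `kernel_pca` and the random test data are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def kernel_pca(K, d):
    """Embed N points in d dimensions from an N x N kernel matrix K."""
    N = K.shape[0]
    H = np.eye(N) - np.ones((N, N)) / N       # centering matrix
    Kc = H @ K @ H                            # kernel of the centered feature vectors
    eigvals, eigvecs = np.linalg.eigh(Kc)     # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:d]     # indices of the d largest eigenvalues
    lam = np.clip(eigvals[order], 0.0, None)  # guard against tiny negative round-off
    return eigvecs[:, order] * np.sqrt(lam)   # row i is the embedding of point i

# With the linear kernel K = X X^T this reduces to ordinary PCA (up to sign).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
Y = kernel_pca(X @ X.T, d=2)
```

Only inner products between feature vectors are needed, which is why the feature map itself never has to be computed.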
![Page 9: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/9.jpg)
Common Kernels
• Linear
• Gaussian
• Polynomial
Do very well for classification. How about manifold learning?
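The kernel formulas on this slide were lost in extraction. The sketch below uses common textbook parametrizations, which are assumptions: $x_i \cdot x_j$ for the linear kernel, $(1 + x_i \cdot x_j)^p$ for the polynomial kernel, and $\exp(-\|x_i - x_j\|^2 / 2\sigma^2)$ for the Gaussian kernel. The function names and default parameters are hypothetical.

```python
import numpy as np

def linear_kernel(X):
    """K_ij = x_i . x_j"""
    return X @ X.T

def polynomial_kernel(X, p=2):
    """K_ij = (1 + x_i . x_j)^p, with degree p an assumed parameter."""
    return (1.0 + X @ X.T) ** p

def gaussian_kernel(X, sigma=1.0):
    """K_ij = exp(-|x_i - x_j|^2 / (2 sigma^2)), with width sigma an assumed parameter."""
    sq = np.sum(X**2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared pairwise distances
    D2 = np.maximum(D2, 0.0)                         # remove negative round-off
    return np.exp(-D2 / (2.0 * sigma**2))

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 4))
K_lin, K_poly, K_rbf = linear_kernel(X), polynomial_kernel(X), gaussian_kernel(X)
```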
![Page 10: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/10.jpg)
Linear Kernel
![Page 11: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/11.jpg)
Gaussian Kernels
![Page 12: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/12.jpg)
Gaussian Kernels
Feature vectors span as many dimensions as the number of spheres of radius σ needed to enclose the input vectors.
![Page 13: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/13.jpg)
Polynomial Kernels
![Page 14: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/14.jpg)
Part II. Manifold Learning via Semidefinite Programming
![Page 15: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/15.jpg)
Local Isometry
A smooth, invertible mapping that preserves distances and looks locally like a rotation plus translation.
![Page 17: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/17.jpg)
Discretized manifolds
Neighborhood graph: connect each point to its k nearest neighbors.
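A minimal sketch of building the neighborhood graph with brute-force distances. Symmetrizing the relation (i and j are linked if either one lists the other among its k nearest neighbors) is an assumption made here for illustration; the name `knn_indicator` is hypothetical.

```python
import numpy as np

def knn_indicator(X, k):
    """Boolean N x N matrix eta: eta[i, j] is True if i and j are linked in the k-NN graph."""
    sq = np.sum(X**2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared pairwise distances
    np.fill_diagonal(D2, np.inf)                     # a point is not its own neighbor
    nn = np.argsort(D2, axis=1)[:, :k]               # indices of the k nearest neighbors
    eta = np.zeros(D2.shape, dtype=bool)
    rows = np.repeat(np.arange(len(X)), k)
    eta[rows, nn.ravel()] = True
    return eta | eta.T                               # symmetrize (assumption)

# Twelve evenly spaced points on a circle: with k=2 each point links to its two neighbors.
theta = 2.0 * np.pi * np.arange(12) / 12
X = np.column_stack([np.cos(theta), np.sin(theta)])
eta = knn_indicator(X, k=2)
```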
![Page 18: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/18.jpg)
Preserve local distances
Approximation of local isometry:
Constraint
Neighborhood indicator
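The constraint equation itself is missing from the transcript. One way to read it, sketched here as an assumption, is that squared distances between neighboring points must be the same before and after the embedding, where a squared distance can be written in terms of a Gram matrix as $K_{ii} + K_{jj} - 2K_{ij}$. The helper names and the chain-shaped indicator `eta` are hypothetical.

```python
import numpy as np

def sq_dists_from_gram(K):
    """Squared distances |v_i - v_j|^2 = K_ii + K_jj - 2 K_ij from a Gram matrix K."""
    diag = np.diag(K)
    return diag[:, None] + diag[None, :] - 2.0 * K

def preserves_local_distances(X, Y, eta, tol=1e-8):
    """True if |y_i - y_j|^2 == |x_i - x_j|^2 wherever eta[i, j] is True."""
    DX = sq_dists_from_gram(X @ X.T)
    DY = sq_dists_from_gram(Y @ Y.T)
    return bool(np.all(np.abs(DX - DY)[eta] <= tol))

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
eta = np.zeros((6, 6), dtype=bool)
for i in range(5):                      # assumed neighborhood: a simple chain 0-1-2-3-4-5
    eta[i, i + 1] = eta[i + 1, i] = True

c, s = np.cos(0.7), np.sin(0.7)
R = np.array([[c, -s], [s, c]])
rotated = X @ R.T + np.array([3.0, -1.0])   # rotation plus translation
scaled = 2.0 * X                            # uniform scaling changes distances

ok_rotated = preserves_local_distances(X, rotated, eta)
ok_scaled = preserves_local_distances(X, scaled, eta)
```

A rotation plus translation satisfies the constraint on every edge, while a rescaling violates it.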
![Page 19: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/19.jpg)
Objective Function?
• Goal: Find minimum rank kernel matrix
• Problem: Computationally hard
• Heuristic: Maximize pairwise distances
![Page 20: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/20.jpg)
Objective Function? (Cont’d)
What happens if we maximize the pairwise distances?
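One useful fact behind this heuristic: for outputs centered on the origin, the sum of all pairwise squared distances equals $2N$ times the trace of their Gram matrix, so maximizing pairwise distances amounts to maximizing the trace. The snippet below is a small numerical check of that identity on random data (the data and variable names are assumptions for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.normal(size=(20, 2))
Y -= Y.mean(axis=0)                        # center the outputs on the origin
K = Y @ Y.T                                # Gram matrix of the outputs
N = len(Y)

diff = Y[:, None, :] - Y[None, :, :]       # all pairwise difference vectors
sum_sq_dists = np.sum(diff**2)             # sum over all ordered pairs (i, j)
trace_form = 2.0 * N * np.trace(K)         # same quantity expressed through the trace
```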
![Page 21: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/21.jpg)
Semidefinite Programming Problem:
Maximize: trace of the kernel matrix (unfold manifold)
subject to:
• Preserve local neighborhoods
• Center output
• Positive semidefinite
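The formulas on this slide were lost in extraction. A plausible reconstruction, consistent with the slide labels and with the later description of a centered, locally isometric dot-product matrix with maximal trace, is given below. The symbols $K$ (kernel matrix), $x_i$ (inputs) and $\eta_{ij}$ (neighborhood indicator) are assumed names.

```latex
\begin{aligned}
\text{Maximize:}\quad   & \operatorname{Tr}(K)
                        && \text{(unfold manifold)} \\
\text{subject to:}\quad & K_{ii} + K_{jj} - 2K_{ij} = \|x_i - x_j\|^2
                          \quad \text{for all } i, j \text{ with } \eta_{ij} = 1
                        && \text{(preserve local neighborhoods)} \\
                        & \sum_{ij} K_{ij} = 0
                        && \text{(center output)} \\
                        & K \succeq 0
                        && \text{(positive semidefinite)}
\end{aligned}
```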
![Page 22: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/22.jpg)
Part III. Semidefinite Embedding in three easy steps
(Also known as “Maximum Variance Unfolding” [Sun, Boyd, Xiao, Diaconis])
![Page 23: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/23.jpg)
Step 1: k-Nearest Neighbors
Compute the nearest neighbors and the Gram matrix for each neighborhood.
![Page 24: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/24.jpg)
Step 2: Semidefinite Programming
Compute the centered, locally isometric dot-product matrix with maximal trace.
![Page 25: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/25.jpg)
Step 3: Kernel PCA
Estimate d from the eigenvalue spectrum. The top eigenvectors give the embedding.
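A minimal sketch of this last step, assuming the kernel matrix is already centered and positive semidefinite. Choosing d as the smallest number of eigenvalues that captures a fixed share of the variance is an assumed rule for illustration (the slides only say that d is estimated from the spectrum); `embed_from_kernel` and `var_threshold` are hypothetical names.

```python
import numpy as np

def embed_from_kernel(K, var_threshold=0.95):
    """Estimate d from the eigenvalue spectrum of K and return the d-dimensional embedding."""
    eigvals, eigvecs = np.linalg.eigh(K)          # ascending eigenvalues
    eigvals = np.clip(eigvals[::-1], 0.0, None)   # descending order, non-negative
    eigvecs = eigvecs[:, ::-1]
    frac = eigvals / eigvals.sum()                # share of variance per eigenvalue
    d = int(np.searchsorted(np.cumsum(frac), var_threshold) + 1)
    Y = eigvecs[:, :d] * np.sqrt(eigvals[:d])     # top eigenvectors give the embedding
    return Y, d, frac

# Toy kernel (assumed): Gram matrix of a centered 5 x 6 grid, which has rank 2.
grid = np.array([(i, j) for i in range(5) for j in range(6)], dtype=float)
grid -= grid.mean(axis=0)
K = grid @ grid.T
Y, d, frac = embed_from_kernel(K)
```

On this toy kernel only two eigenvalues are non-zero, so the spectrum reveals the dimensionality of the grid.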
![Page 26: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/26.jpg)
Part IV. Experimental Results
![Page 27: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/27.jpg)
Trefoil Knot
N=539, k=4, D=3, d=2
![Page 28: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/28.jpg)
Trefoil Knot
N=539, k=4, D=3, d=2
[Figure: bar chart of % variance (0% to 100%) for the RBF, polynomial, linear and SDE kernels]
![Page 29: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/29.jpg)
Teapot (full rotation)
N=400, k=4, D=23028, d=2
[Figure: bar chart of % variance (0% to 100%), before and after]
![Page 30: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/30.jpg)
Teapot (half rotation)
N=200, k=4, D=23028, d=2
[Figure: bar chart of % variance (0% to 100%), before and after]
![Page 31: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/31.jpg)
Faces
N=1000, k=4, D=540, d=2
[Figure: bar chart of % variance (0% to 100%), before and after]
![Page 32: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/32.jpg)
Twos vs. Threes
N=953, k=3, D=256, d=2
[Figure: bar chart (scale 0.0 to 1.0) for the RBF, polynomial, linear and SDE kernels]
![Page 33: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/33.jpg)
Part V. Supervised Experimental Results
![Page 34: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/34.jpg)
Large Margin Classification
• SDE Kernel used in SVM
• Task: Binary Digit Classification
• Input: USPS Data Set
• Training / Testing set: 810/90
• Neighborhood Size: k=4
![Page 35: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/35.jpg)
SVM Kernel
[Figure: bar chart (scale 0 to 12) comparing the Linear, Polynomial, Gaussian and SDE kernels on 1 vs 2, 1 vs 3, 2 vs 8 and 8 vs 9]
SDE is not well-suited for SVMs
![Page 36: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/36.jpg)
SVM Kernel (cont’d)
[Figure panels: non-linear decision boundary, linear decision boundary]
Unfolding does not necessarily help classification
• Reducing the dimensionality is counter-intuitive.
• Needs linear decision boundary on manifold.
![Page 37: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/37.jpg)
Part VI. Conclusion
![Page 38: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/38.jpg)
Previous Work
Isomap and LLE can both be seen from a kernel view [Jihun Ham et al., ICML’04]
![Page 39: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/39.jpg)
Previous Work (Isomap)
Isomap and LLE can both be seen from a kernel view [Jihun Ham et al., ICML’04]
Matrix not necessarily positive semidefinite
[Figure panels: SDE, Isomap]
![Page 41: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/41.jpg)
Previous Work (LLE)
Isomap and LLE can both be seen from a kernel view [Jihun Ham et al., ICML’04]
Eigenvalues do not reveal true dimensionality
[Figure panels: SDE, LLE]
![Page 42: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/42.jpg)
Conclusion
Semidefinite Embedding (SDE)
+ extends kernel PCA to do manifold learning
+ uses semidefinite programming
+ has a guaranteed unique solution
- not well suited for support vector machines
- exact solution (so far) limited to N=2000
![Page 43: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/43.jpg)
![Page 44: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/44.jpg)
Semidefinite Programming Problem:
Maximize: trace of the kernel matrix (unfold manifold)
subject to:
• Preserve local neighborhoods
• Center output
• Positive semidefinite
![Page 45: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/45.jpg)
Semidefinite Programming Problem:
Maximize: trace of the kernel matrix (unfold manifold)
subject to:
• Preserve local neighborhoods
• Center output
• Positive semidefinite
Introduce slack
![Page 46: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/46.jpg)
Swiss Roll
N=800, k=4, D=3, d=2
[Figures: two bar charts of % variance (0% to 100%); the first compares before and after]
![Page 47: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/47.jpg)
Applications
• Visualization of Data
• Natural Language Processing
![Page 48: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/48.jpg)
Trefoil Knot
N=539, k=4, D=3, d=2
[Figure: bar chart of % variance (0% to 100%) for the RBF, polynomial, linear and SDE kernels]
[Figure panels: RBF, Polynomial, SDE]
![Page 49: GRASP Learning a Kernel Matrix for Nonlinear Dimensionality Reduction Kilian Q. Weinberger, Fei Sha and Lawrence K. Saul ICML’04 Department of Computer](https://reader035.vdocument.in/reader035/viewer/2022070404/56649f355503460f94c54012/html5/thumbnails/49.jpg)
Motivation
• Similar vectorized pictures lie on a non-linear manifold
• Linear methods don’t work here