Non-linear dimension-reduction methods
Olga Sorkine, January 2006
Overview
Dimensionality reduction of high-dimensional data
Good for learning, visualization and … parameterization
Dimension reduction
Input: points in some D-dimensional space (D is large)
– Images
– Physical measurements
– Statistical data
– etc…
We want to discover some structure/correlation in the input data. Hopefully, the data lives on a d-dimensional surface (d << D).
– Discover the real dimensionality d
– Find a mapping from $\mathbb{R}^D$ to $\mathbb{R}^d$ that preserves something about the data
• Today we'll talk about preserving variance/distances
Discovering linear structures
PCA – finds linear subspaces that best preserve the variance of the data points
Linear is sometimes not enough
When our data points sit on a non-linear manifold
– We won't find a good linear mapping from the data points to a plane, because there isn't any
Today
Two methods to discover such non-linear manifolds:
– Isomap (a descendant of Multidimensional Scaling)
– Locally Linear Embedding
Notations
Input data points: columns of $X \in \mathbb{R}^{D \times n}$
Assume that the center of mass of the points is the origin
$$X = \begin{pmatrix} | & | & & | \\ x_1 & x_2 & \cdots & x_n \\ | & | & & | \end{pmatrix}$$
Reminder about PCA
PCA finds a linear d-dimensional subspace of $\mathbb{R}^D$ along which the variance of the data is largest
Denote by $\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_n$ the data points projected onto the d-dimensional subspace. PCA finds the subspace that attains
$$\max \sum_{i,j} \|\tilde{x}_i - \tilde{x}_j\|^2$$
When we apply parallel projection to the data points, the distances between them can only get smaller, so finding the subspace that attains the maximum scatter means the distances are, in a sense, preserved.
Reminder about PCA
To find the principal axes:
– Compute the scatter matrix $S \in \mathbb{R}^{D \times D}$:
$$S = XX^T$$
– Diagonalize $S$:
$$S = \begin{pmatrix} | & & | \\ v_1 & \cdots & v_D \\ | & & | \end{pmatrix} \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_D \end{pmatrix} \begin{pmatrix} | & & | \\ v_1 & \cdots & v_D \\ | & & | \end{pmatrix}^T$$
The eigenvectors of $S$ are the principal directions. The eigenvalues are sorted in descending order: $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_D$.
Take the first d eigenvectors as the "principal subspace" and project the data points onto it.
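As a concrete illustration, here is a minimal NumPy sketch of this procedure (the helper name `pca` and the toy data are ours, not from the slides; points are columns of $X$, as in the Notation slide):

```python
import numpy as np

def pca(X, d):
    """PCA on data whose points are the *columns* of X (D x n).

    Returns the d principal directions and the projected coordinates.
    """
    # Center the data: the slides assume the center of mass is at the origin.
    Xc = X - X.mean(axis=1, keepdims=True)

    # Scatter matrix S = X X^T (D x D).
    S = Xc @ Xc.T

    # Diagonalize S; eigh returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]      # sort descending
    V = eigvecs[:, order[:d]]              # first d principal directions (D x d)

    # Project the points onto the principal subspace (d x n).
    return V, V.T @ Xc

# Example: noisy points near a 1-D line in R^3.
rng = np.random.default_rng(0)
t = rng.standard_normal(100)
X = np.outer([1.0, 2.0, 3.0], t) + 0.01 * rng.standard_normal((3, 100))
V, Y = pca(X, d=1)
```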
Why does this work?
The eigenvectors $v_i$ are the maxima of the following quadratic form:
$$f(v) = v^T S v = \langle Sv, v \rangle$$
In fact, we get directions of maximal variance:
$$f(v) = v^T S v = v^T XX^T v = (X^T v)^T (X^T v) = \|X^T v\|^2 = \left\| \begin{pmatrix} \langle x_1, v \rangle \\ \vdots \\ \langle x_n, v \rangle \end{pmatrix} \right\|^2 = \sum_{i=1}^{n} \langle x_i, v \rangle^2$$
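A quick numeric sanity check of this chain of identities (an illustrative snippet we added; not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 20))        # columns are the data points x_i
X -= X.mean(axis=1, keepdims=True)      # center, as assumed in the slides
S = X @ X.T                             # scatter matrix

v = rng.standard_normal(5)
v /= np.linalg.norm(v)                  # unit direction

lhs = v @ S @ v                         # quadratic form f(v) = v^T S v
rhs = np.sum((X.T @ v) ** 2)            # sum_i <x_i, v>^2
assert np.isclose(lhs, rhs)
```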
Multidimensional Scaling
J. Tenenbaum, V. de Silva, J. C. Langford, Science, December 2000
Multidimensional scaling (MDS)
The idea: compute the pairwise squared distances between the input points:
$$M \in \mathbb{R}^{n \times n}, \qquad M_{ij} = \operatorname{dist}^2(x_i, x_j)$$
Now, find $n$ points in a low-dimensional space $\mathbb{R}^d$ so that their distance matrix is as close as possible to $M$.
MDS – the math details
We look for
$$X' = \begin{pmatrix} | & & | \\ x_1' & \cdots & x_n' \\ | & & | \end{pmatrix} \in \mathbb{R}^{d \times n}$$
such that $\|M' - M\|$ is as small as possible, where $M'$ is the Euclidean squared-distance matrix for the points $x_i'$:
$$M' \in \mathbb{R}^{n \times n}, \qquad M'_{ij} = \operatorname{dist}^2(x_i', x_j') = \|x_i' - x_j'\|^2$$
MDS – the math details
Ideally, we want $M' = M$:
$$\|x_i' - x_j'\|^2 = M_{ij}$$
$$\|x_i'\|^2 + \|x_j'\|^2 - 2\langle x_i', x_j' \rangle = M_{ij}$$
In matrix form:
$$\begin{pmatrix} \|x_1'\|^2 & \cdots & \|x_1'\|^2 \\ \vdots & & \vdots \\ \|x_n'\|^2 & \cdots & \|x_n'\|^2 \end{pmatrix} + \begin{pmatrix} \|x_1'\|^2 & \cdots & \|x_n'\|^2 \\ \vdots & & \vdots \\ \|x_1'\|^2 & \cdots & \|x_n'\|^2 \end{pmatrix} - 2\,X'^T X' = M$$
where
$$X'^T X' = \begin{pmatrix} - & x_1'^T & - \\ & \vdots & \\ - & x_n'^T & - \end{pmatrix} \begin{pmatrix} | & & | \\ x_1' & \cdots & x_n' \\ | & & | \end{pmatrix}$$
We want to get rid of the two matrices of squared norms and keep only $X'^T X'$.
MDS – the math details
Trick: use the "magic matrix" $J$ (the centering matrix):
$$J = I - \frac{1}{n}\mathbf{1}\mathbf{1}^T = \begin{pmatrix} 1 - \frac{1}{n} & -\frac{1}{n} & \cdots & -\frac{1}{n} \\ -\frac{1}{n} & 1 - \frac{1}{n} & \cdots & -\frac{1}{n} \\ \vdots & & \ddots & \vdots \\ -\frac{1}{n} & -\frac{1}{n} & \cdots & 1 - \frac{1}{n} \end{pmatrix}$$
$J$ annihilates constant rows and columns:
$$\begin{pmatrix} a & a & \cdots & a \end{pmatrix} J = \begin{pmatrix} 0 & 0 & \cdots & 0 \end{pmatrix}, \qquad J \begin{pmatrix} b \\ b \\ \vdots \\ b \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}$$
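A tiny check of these two properties (our illustration, assuming $J = I - \frac{1}{n}\mathbf{1}\mathbf{1}^T$ as written above):

```python
import numpy as np

n = 4
J = np.eye(n) - np.ones((n, n)) / n   # the "magic" (centering) matrix

a = 7.0 * np.ones(n)                  # constant row vector (a a ... a)
b = -3.0 * np.ones(n)                 # constant column vector (b ... b)^T

assert np.allclose(a @ J, 0.0)        # constant rows are killed from the right
assert np.allclose(J @ b, 0.0)        # constant columns are killed from the left
```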
MDS – the math details
Cleaning the system: multiply by $J$ on both sides. The matrix with constant rows is annihilated by $J$ from the right, the matrix with constant columns by $J$ from the left, and $X'^T X'$ survives unchanged (the points are centered at the origin):
$$J \left( \begin{pmatrix} \|x_1'\|^2 & \cdots & \|x_1'\|^2 \\ \vdots & & \vdots \\ \|x_n'\|^2 & \cdots & \|x_n'\|^2 \end{pmatrix} + \begin{pmatrix} \|x_1'\|^2 & \cdots & \|x_n'\|^2 \\ \vdots & & \vdots \\ \|x_1'\|^2 & \cdots & \|x_n'\|^2 \end{pmatrix} - 2\,X'^T X' \right) J = JMJ$$
$$-2\,X'^T X' = JMJ$$
$$X'^T X' = -\tfrac{1}{2} JMJ =: B$$
How to find X’
We will use the spectral decomposition of $B$:
$$X'^T X' = B = \begin{pmatrix} | & & | \\ v_1 & \cdots & v_n \\ | & & | \end{pmatrix} \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} \begin{pmatrix} | & & | \\ v_1 & \cdots & v_n \\ | & & | \end{pmatrix}^T$$
Keeping only the $d$ largest eigenvalues:
$$X'^T X' \approx \underbrace{\begin{pmatrix} | & & | \\ v_1 & \cdots & v_d \\ | & & | \end{pmatrix} \Lambda_d^{1/2}}_{X'^T \;(n \times d)} \; \underbrace{\Lambda_d^{1/2} \begin{pmatrix} | & & | \\ v_1 & \cdots & v_d \\ | & & | \end{pmatrix}^T}_{X' \;(d \times n)}$$
How to find X’
So we find $X'$ by throwing away the last $n - d$ eigenvalues:
$$X' = \begin{pmatrix} \sqrt{\lambda_1} & & \\ & \ddots & \\ & & \sqrt{\lambda_d} \end{pmatrix} \begin{pmatrix} - & v_1^T & - \\ & \vdots & \\ - & v_d^T & - \end{pmatrix} \in \mathbb{R}^{d \times n}$$
This choice is optimal in the least-squares sense:
$$X' = \arg\min_{X'} \left\| X'^T X' - B \right\|_L^2, \qquad \text{where } \|A\|_L^2 = \sum_{i,j} A_{ij}^2$$
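Putting the whole derivation together, here is a minimal NumPy sketch (the helper name `classical_mds` is ours; it assumes $M$ already holds squared distances):

```python
import numpy as np

def classical_mds(M, d):
    """Classical MDS: from the n x n matrix of *squared* distances M,
    recover d-dimensional coordinates X' (d x n), as derived above."""
    n = M.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # the centering matrix J
    B = -0.5 * J @ M @ J                     # B = X'^T X'

    eigvals, eigvecs = np.linalg.eigh(B)     # ascending eigenvalues
    idx = np.argsort(eigvals)[::-1][:d]      # keep the d largest
    lam = np.clip(eigvals[idx], 0.0, None)   # guard against tiny negatives
    V = eigvecs[:, idx]                      # n x d

    return np.sqrt(lam)[:, None] * V.T       # X' = Lambda_d^{1/2} V_d^T (d x n)

# Sanity check: four collinear points are recovered (up to sign/translation).
x = np.array([[0.0, 1.0, 3.0, 6.0]])         # 1 x 4
M = (x.T - x) ** 2                           # squared pairwise distances
print(classical_mds(M, d=1))
```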
Isomap
The idea of Tenenbaum et al.: estimate geodesic distances between the data points (instead of Euclidean ones)
Use K nearest neighbors or ε-balls to define neighborhood graphs
Approximate the geodesics by shortest paths on the graph.
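A bare-bones sketch of this pipeline (our illustration; it reuses the `classical_mds` helper from the MDS sketch above, and `k` is a tunable neighborhood size):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import cdist

def isomap(X, d, k=8):
    """Bare-bones Isomap: X holds one point per column (D x n)."""
    n = X.shape[1]
    dists = cdist(X.T, X.T)                    # n x n Euclidean distances

    # 1. k-nearest-neighbor graph with Euclidean edge weights.
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(dists[i])[1:k + 1]   # skip the point itself
        W[i, nbrs] = dists[i, nbrs]
    W = np.maximum(W, W.T)                     # symmetrize (undirected graph)

    # 2. Geodesic distances approximated by shortest paths (Dijkstra).
    #    Assumes the graph is connected; otherwise G contains infinities.
    G = shortest_path(csr_matrix(W), method="D", directed=False)

    # 3. Classical MDS on the squared geodesic distances.
    return classical_mds(G ** 2, d)
```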
Inducing a graph
Defining neighborhood and weights
The weight of edge $(i, j)$ is the Euclidean distance between the neighbors: $w_{ij} = \|x_i - x_j\|$
Finding geodesic paths
Compute weighted shortest paths on the graph (Dijkstra)
Locating new points in the Isomap embedding
Suppose we have a new data point $p \in \mathbb{R}^D$. We want to find where it belongs in the $\mathbb{R}^d$ embedding. Compute the squared distances from $p$ to all other points:
$$u = \begin{pmatrix} \operatorname{dist}^2(p, x_1) \\ \operatorname{dist}^2(p, x_2) \\ \vdots \\ \operatorname{dist}^2(p, x_n) \end{pmatrix}$$
and embed $p$ using the eigenvalues $\Lambda_d$ and eigenvectors $V_d$ of $B$ computed before:
$$p' = \tfrac{1}{2}\, \Lambda_d^{-1/2} V_d^T (\bar{u} - u)$$
where $\bar{u}$ is the mean of the columns of $M$.
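A hedged sketch of this step, using the landmark-MDS out-of-sample formula of de Silva and Tenenbaum as written above (our assumption of the intended computation; the function name and argument names are ours):

```python
import numpy as np

def embed_new_point(u, lam_d, V_d, M):
    """Embed a new point from its squared distances to the training points.

    u     : length-n vector, u[i] = dist^2(p, x_i) (geodesic for Isomap)
    lam_d : the d largest eigenvalues of B = -1/2 J M J
    V_d   : the corresponding eigenvectors, n x d
    M     : the n x n squared-distance matrix of the training points
    """
    u_bar = M.mean(axis=0)                       # mean squared distances
    # Landmark-MDS formula: p' = 1/2 * Lambda_d^{-1/2} V_d^T (u_bar - u)
    return 0.5 * (V_d.T @ (u_bar - u)) / np.sqrt(lam_d)
```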
Some results
Morph in Isomap space
Flattening results (Zigelman et al.)
Locally Linear Embedding
S. T. Roweis and L. K. Saul, Science, December 2000
The idea
Define neighborhood relations between points:
– K nearest neighbors or ε-balls
Find weights that reconstruct each data point from its neighbors:
$$\min_{w} \sum_{i} \Big\| x_i - \sum_{j \in N(i)} w_{ij}\, x_j \Big\|^2 \qquad \text{s.t. } \sum_{j \in N(i)} w_{ij} = 1$$
Find low-dimensional coordinates $x_1', \ldots, x_n' \in \mathbb{R}^d$ so that the same weights hold:
$$\min_{x_1', \ldots, x_n'} \sum_{i} \Big\| x_i' - \sum_{j \in N(i)} w_{ij}\, x_j' \Big\|^2$$
Local information reconstructs the global structure
The weights $w_{ij}$ capture the local shape:
– Invariant to translation, rotation and scale of the neighborhood
– If the neighborhood lies on a manifold, the local mapping from the global coordinates ($\mathbb{R}^D$) to the surface coordinates ($\mathbb{R}^d$) is almost linear
– Thus, the weights $w_{ij}$ should also hold in the manifold ($\mathbb{R}^d$) coordinate system, which is why the same $w_{ij}$ appear in both minimizations above!
Solving the minimizations
The weights $w_{ij}$ are found by linear least squares (using Lagrange multipliers for the constraint $\sum_j w_{ij} = 1$).
To find the coordinates $x_1', \ldots, x_n' \in \mathbb{R}^d$ that minimize
$$\sum_{i} \Big\| x_i' - \sum_{j \in N(i)} w_{ij}\, x_j' \Big\|^2,$$
a sparse eigenproblem is solved. Additional constraints are added for conditioning:
$$\sum_i x_i' = 0, \qquad \frac{1}{n} \sum_i x_i' x_i'^T = I$$
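A compact sketch of both steps (our illustration; `k` and the regularization `reg` are tunable, and we use a dense eigensolver for brevity where the slides note the problem is sparse):

```python
import numpy as np
from scipy.spatial.distance import cdist

def lle(X, d, k=8, reg=1e-3):
    """Locally Linear Embedding sketch: X holds one point per column (D x n)."""
    n = X.shape[1]
    dists = cdist(X.T, X.T)
    W = np.zeros((n, n))

    # Step 1: reconstruction weights, one constrained least-squares per point.
    for i in range(n):
        nbrs = np.argsort(dists[i])[1:k + 1]   # k nearest neighbors of x_i
        Z = X[:, nbrs] - X[:, [i]]             # neighbors centered on x_i
        C = Z.T @ Z                            # local Gram matrix (k x k)
        C += reg * np.trace(C) * np.eye(k)     # regularize for conditioning
        w = np.linalg.solve(C, np.ones(k))
        W[i, nbrs] = w / w.sum()               # enforce sum_j w_ij = 1

    # Step 2: coordinates from the bottom eigenvectors of (I - W)^T (I - W),
    # skipping the constant eigenvector (eigenvalue ~ 0); the orthonormal
    # eigenvectors give centered, well-conditioned coordinates (up to scale).
    E = (np.eye(n) - W).T @ (np.eye(n) - W)
    eigvals, eigvecs = np.linalg.eigh(E)       # ascending
    return eigvecs[:, 1:d + 1].T               # d x n embedding
```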
Some results
The Swiss roll
The end