Dimension Reduction CSE 6242 A / CS 4803 DVA Feb 14, 2013 Guest Lecturer: Jaegul Choo

Post on 15-Jul-2020


Page 1: Dimension Reduction - Polo Club of Data Sciencepoloclub.gatech.edu/cse6242/2013spring/lectures/CSE6242...2013/02/14  · Dimension Reduction in Action Handwritten Digit Data Visualization

Dimension Reduction

CSE 6242 A / CS 4803 DVA
Feb 14, 2013

Guest Lecturer: Jaegul Choo

Today’s Lecture

• Advanced methods
• Interactive visualization
• Practitioners’ guide
• Research trend

Manifold Learning: Swiss Roll Data

Swiss roll data
• Originally in 3D

What is the intrinsic dimensionality? (allowing flattening)
• → 2D
• intrinsic ≈ semantic

What if your data has low intrinsic dimensionality but resides in high-dimensional space?
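
To make the Swiss roll concrete, here is a minimal sketch of how such data can be generated: every 3D point is computed from just two parameters, which is why the intrinsic dimensionality is 2. The function name, constants, and sample size are illustrative assumptions, not taken from the lecture.

```python
import numpy as np

def make_swiss_roll(n_points=500, seed=0):
    """Sample a Swiss roll: 3-D points generated from only two parameters."""
    rng = np.random.default_rng(seed)
    t = 1.5 * np.pi * (1.0 + 2.0 * rng.random(n_points))  # position along the roll
    h = 20.0 * rng.random(n_points)                        # position across the roll
    # The 3-D embedding: the flat (t, h) sheet rolled up around the y-axis.
    X = np.column_stack((t * np.cos(t), h, t * np.sin(t)))
    intrinsic = np.column_stack((t, h))
    return X, intrinsic

X, intrinsic = make_swiss_roll()
print(X.shape, intrinsic.shape)  # observed data is 3-D, intrinsic coordinates are 2-D
```

"Flattening" the roll amounts to recovering something like `intrinsic` from `X` alone.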

Manifold Learning: Goal and Approach

Manifold
• “Curvilinear” low-dimensional structure of your data, based on intrinsic dimensionality

Manifold learning
• Match intrinsic dimensions to axes of the dimension-reduced output space

How?
• Each piece of the manifold is approx. linear
• Utilize local neighborhood information
  • e.g., for a particular point: Who are my neighbors? How closely am I related to my neighbors?

Demo available at http://www.math.ucla.edu/~wittman/mani/
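
The two neighborhood questions above can be sketched in a few lines of NumPy. This is a brute-force illustration under the assumption of Euclidean distance; the function name and the toy points are hypothetical.

```python
import numpy as np

def knn_neighbors(X, k):
    """Return (indices, distances) of the k nearest neighbors of every row of X."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))      # pairwise Euclidean distances
    np.fill_diagonal(D, np.inf)                # a point is not its own neighbor
    idx = np.argsort(D, axis=1)[:, :k]         # "Who are my neighbors?"
    dist = np.take_along_axis(D, idx, axis=1)  # "How closely am I related to them?"
    return idx, dist

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]])
idx, dist = knn_neighbors(X, k=2)
print(idx[0], dist[0])  # neighbors of the first point and their distances
```

The brute-force distance matrix costs O(n^2) memory, so this form is only meant for small examples.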

Isomap (Isometric Feature Mapping)

• Let’s preserve pairwise geodesic distance (along manifold)
• Compute geodesic distance as the shortest path length from the k-nearest neighbor (k-NN) graph
• Do eigen-decomposition* on the pairwise geodesic distance matrix to obtain the embedding that best preserves the given distances

* Recall eigen-decomposition is the main algorithm of PCA
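
The steps above can be sketched as follows. This is a small illustration, not the lecture's implementation: it assumes Euclidean input distances, uses a vectorized Floyd-Warshall for the shortest paths, and performs the eigen-decomposition in the classical-MDS form (on the double-centered squared geodesic distances). The function name and the half-circle test data are hypothetical.

```python
import numpy as np

def isomap(X, k, n_components=2):
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    # Step 1: k-NN graph; non-neighbors get infinite edge length.
    G = np.full((n, n), np.inf)
    nearest = np.argsort(D, axis=1)[:, 1:k + 1]   # column 0 is the point itself
    rows = np.repeat(np.arange(n), k)
    G[rows, nearest.ravel()] = D[rows, nearest.ravel()]
    G = np.minimum(G, G.T)                         # make the graph undirected
    np.fill_diagonal(G, 0.0)

    # Step 2: geodesic distances as all-pairs shortest paths (Floyd-Warshall).
    for m in range(n):
        G = np.minimum(G, G[:, m:m + 1] + G[m:m + 1, :])
    if np.isinf(G).any():
        raise ValueError("k-NN graph is disconnected; try a larger k")

    # Step 3: eigen-decompose the double-centered squared geodesic distances.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    top = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))

# Points on a half circle: a 1-D manifold living in 2-D.
theta = np.linspace(0.0, np.pi, 30)
arc = np.column_stack((np.cos(theta), np.sin(theta)))
Y = isomap(arc, k=2, n_components=1)
print(Y.max() - Y.min())  # expected to be near the arc length (about pi), not the chord length 2
```

The Floyd-Warshall loop is O(n^3), which matches the "slow (shortest path)" caveat for larger data sets.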

Isomap (Isometric Feature Mapping)

• Algorithm: all-pair shortest path computation + eigen-decomposition
• Pros: performs well in general
• Cons: slow (shortest path), sensitive to parameters

• Nonlinear
• Unsupervised
• Global: all pairwise distances are considered
• Feature vectors

Isomap: Facial Data Example

[Figure: Isomap embeddings of facial image data for k=8, k=22, and k=49 (k is the value in the k-NN graph); annotated with cluster structure, angle, and person]

Which one do you think is the best?

t-SNE (t-distributed Stochastic Neighbor Embedding)

• Made specifically for visualization! (in very low dimension)
• Can reveal clusters without any supervision
  • e.g., spoken letter data

[Figure: PCA vs. t-SNE embeddings of the spoken letter data]

Official website: http://homepage.tudelft.nl/19j49/t-SNE.html

t-SNE (t-distributed Stochastic Neighbor Embedding)

How it works
• Converts distance into probability
  • Farther distance gets lower probability
• Then, minimizes the difference in probability distributions between the high- and low-dimensional spaces
  • KL divergence naturally focuses on neighborhood relationships
• Instead of the Gaussian used in SNE, a t-distribution is used in the low-dimensional space
  • Better handles the crowding problem in very low dimension
  • Curse of dimensionality, …
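
The distance-to-probability conversion and the KL objective can be sketched as below. This is a simplified illustration: it assumes one fixed Gaussian bandwidth `sigma` for all points (the actual method tunes a bandwidth per point) and omits the gradient-descent optimization entirely. All function names and the random data are hypothetical.

```python
import numpy as np

def pairwise_sq_dists(Z):
    diff = Z[:, None, :] - Z[None, :, :]
    return (diff ** 2).sum(axis=-1)

def gaussian_affinities(X, sigma=1.0):
    """High-dimensional side: Gaussian kernel, normalized over all pairs."""
    P = np.exp(-pairwise_sq_dists(X) / (2.0 * sigma ** 2))
    np.fill_diagonal(P, 0.0)
    return P / P.sum()

def student_t_affinities(Y):
    """Low-dimensional side: heavy-tailed Student t with one degree of freedom."""
    Q = 1.0 / (1.0 + pairwise_sq_dists(Y))
    np.fill_diagonal(Q, 0.0)
    return Q / Q.sum()

def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q); pairs with large P (close neighbors) dominate the sum."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / (Q[mask] + eps))))

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 10))   # hypothetical high-dimensional data
Y = rng.normal(size=(30, 2))    # a random (bad) 2-D layout
P = gaussian_affinities(X, sigma=2.0)
Q = student_t_affinities(Y)
print(kl_divergence(P, Q))      # the quantity that gradient descent would reduce
```

Because each term is weighted by P, mismatches between close neighbors cost far more than mismatches between distant pairs, which is the "focuses on neighborhood relationships" point above.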

t-SNE (t-distributed Stochastic Neighbor Embedding)

• Algorithm: gradient-descent type
• Pros: works surprisingly well in 2D/3D visualization
• Cons: very slow

• Nonlinear
• Unsupervised
• Local
• Similarity input

LLE (Locally Linear Embedding): Classical Method

• Let’s preserve linear reconstruction weights from neighbors

Optional topic

LLE (Locally Linear Embedding): Classical Method

• Algorithm: least squares + eigen-decomposition
• Pros: fast
• Cons: often outperformed by other methods

• Nonlinear
• Unsupervised
• Local
• Feature vectors

Optional topic
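
The least-squares half of LLE can be sketched for a single point as below: find weights, summing to one, that best reconstruct the point from its neighbors. The small regularization term is an assumption added here for numerical stability, and the eigen-decomposition stage that produces the final embedding is omitted. The function name and toy data are hypothetical.

```python
import numpy as np

def reconstruction_weights(x, neighbors, reg=1e-3):
    """Weights w (summing to 1) that minimize ||x - sum_j w_j * neighbors[j]||^2."""
    k = len(neighbors)
    Z = neighbors - x                          # shift so that x is the origin
    C = Z @ Z.T                                # local Gram matrix (k x k)
    C = C + reg * np.trace(C) * np.eye(k)      # assumed regularizer; C may be singular
    w = np.linalg.solve(C, np.ones(k))
    return w / w.sum()

x = np.array([1.0, 1.0])
neighbors = np.array([[0.0, 0.0], [2.0, 2.0], [1.0, 3.0]])
w = reconstruction_weights(x, neighbors)
print(w, w @ neighbors)  # weights and the reconstructed point
```

Here `x` lies midway between the first two neighbors, so nearly all of the weight should fall on them and the reconstruction should land close to `x`.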

Laplacian Eigenmaps: Another Classical Method

Criterion
  $\min \sum_{ij} \|y_i - y_j\|^2 W_{ij}$, where
  $W_{ij} = \exp(-\|x_i - x_j\| / t)$ for neighbors, $W_{ij} = 0$ otherwise
  • $x_i$: high-dimensional vector, $y_i$: low-dimensional data vector
  • $t$: user-specified parameter

Main idea
• Basically, fit all the distances to zero
• But, closer distances -> higher weight $W_{ij}$ -> fit to zero more strongly

c.f.) MDS criterion: fits them to given ideal distances
  $\min \sum_{ij} (\|y_i - y_j\| - d_{ij})^2$

Optional topic

Laplacian Eigenmaps: Another Classical Method

• Nicely backed by graph theory and algorithms
• Algorithm: generalized eigen-decomposition
• Pros: really fast
• Cons: has two parameters that are difficult to determine

• Nonlinear
• Unsupervised
• Local
• Similarity input

Optional topic
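
A minimal sketch of Laplacian Eigenmaps, assuming a k-NN graph with the weight formula as written on the criterion slide. The generalized eigenproblem is solved here through the symmetric normalized Laplacian, which is one standard way to do it with plain NumPy. The function name and test data are hypothetical, and `k` and `t` are presumably the two hard-to-choose parameters mentioned above.

```python
import numpy as np

def laplacian_eigenmaps(X, k=5, t=1.0, n_components=2):
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    # Weights: exp(-||xi - xj|| / t) for neighbors, 0 otherwise.
    W = np.zeros((n, n))
    nearest = np.argsort(D, axis=1)[:, 1:k + 1]   # column 0 is the point itself
    rows = np.repeat(np.arange(n), k)
    W[rows, nearest.ravel()] = np.exp(-D[rows, nearest.ravel()] / t)
    W = np.maximum(W, W.T)                         # symmetrize the neighbor relation

    deg = W.sum(axis=1)
    L = np.diag(deg) - W                           # graph Laplacian
    # Generalized problem L y = lambda * Deg y, solved via Deg^{-1/2} L Deg^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)       # eigenvalues in ascending order
    # Skip the trivial constant solution (eigenvalue near 0); assumes a connected graph.
    return d_inv_sqrt[:, None] * eigvecs[:, 1:n_components + 1]

theta = np.linspace(0.0, np.pi, 30)
arc = np.column_stack((np.cos(theta), np.sin(theta)))
Y = laplacian_eigenmaps(arc, k=2, t=1.0, n_components=1)
print(Y[0, 0], Y[-1, 0])  # the two ends of the arc should get opposite signs
```

Only one sparse-structured eigenproblem is solved, with no shortest-path step, which is consistent with the "really fast" remark.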

DR in Interactive Visualization

• What can you do from visualization via dimension reduction?
  • e.g., multidimensional scaling applied to document data

DR in Interactive Visualization

• As more data items are involved, it’s harder to analyze
• For n data items, users are given O(n^2) relations spatially encoded in the visualization
  • Too many to work with in general

DR in Interactive Visualization: What to Look at First?

Thus, people tend to look for a small number of objects that perceptually/visually stand out, e.g.,
• Outliers (if any)

More commonly,
• Subgroups/clusters
• However, it is hard to expect DR to always reveal clusters

DR in Interactive Visualization: What to Look at First?

• What if DR cannot reveal subgroups/clusters clearly?
• Or even worse, what if our data do not originally have any?

Often, pre-defined grouping information is injected and color-coded. Such grouping information is usually obtained as
• Pre-given labels along with data
• Computed labels by clustering

Dimension Reduction in Action: Handwritten Digit Data Visualization

[Figure: visualization of handwritten digit data, with panels for Subcluster #1 in ‘5’, Subcluster #2 in ‘5’, Major data in ‘7’, Minor group #1 in ‘7’, and Minor group #2 in ‘7’]

Now we can obtain
• Cluster/data relationship
• Subcluster/outlier

• Treating two subclusters in digit ‘5’ as separate clusters
• Classification accuracy improved from 89% to 93% (LDA + k-NN)

Dimension Reduction in Action: Visual Analytics for Documents

VisIRR: Visual Information Retrieval and Recommendation

Practitioner’s Guide: Caveats

Can you trust dimension reduction results?
• Expect significant distortion/information loss in 2D/3D
• What the algorithm thinks is the best may not be what we think is the best, e.g., PCA visualization of facial image data

[Figure: PCA of facial image data, (1, 2)-dimension vs. (3, 4)-dimension]
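
For PCA specifically, one rough way to gauge how much information a 2D view discards is the fraction of total variance that the top two components retain. The sketch below assumes that measure; the function name and the random data are hypothetical and are not the facial image data from the slide.

```python
import numpy as np

def variance_retained(X, n_components=2):
    """Fraction of total variance kept by the top PCA components."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, in descending order
    var = s ** 2
    return float(var[:n_components].sum() / var.sum())

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))   # hypothetical data with no low-dimensional structure
print(variance_retained(X, 2))   # a small fraction: most of the variance is lost in 2-D
```

A high retained fraction still does not guarantee that the view shows what matters to the analyst, which is the second caveat above.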

Practitioner’s Guide: Caveats

How would you determine the best method and its parameters for your needs?
• Unlike typical data mining problems where only one shot is allowed, you can freely try out different methods with different parameters
• Basic understanding of the methods will greatly help in applying them properly
  • What is a particular method trying to achieve? And how suitable is it to your needs?
  • What are the effects of increasing/decreasing parameters?

Practitioner’s Guide: General Recommendation

Want something simple and fast to visualize data?
• PCA, force-directed layout

Want to first try some manifold learning methods?
• Isomap
• If it doesn’t show anything good, probably neither will anything else.

Have cluster labels to use? (pre-given or computed)
• LDA (supervised)
• A supervised approach is sometimes the only viable option when your data do not have clearly separable clusters

No labels, but still want some clusters to be revealed? Or simply, want some state-of-the-art method for visualization?
• t-SNE (but, may be slow)

Practitioner’s Guide: Results Still Not Good?

Pre-process data properly as needed
• Data centering
  • Subtract the global mean from each vector
• Normalization
  • Make each vector have unit Euclidean norm
  • Otherwise, a few outliers can affect dimension reduction significantly
• Application-specific pre-processing
  • Document: TF-IDF weighting, remove too rare and/or short terms
  • Image: histogram normalization
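
The pre-processing steps listed above can be sketched as three small helpers. These are illustrative assumptions rather than the lecture's code: data is assumed to be a NumPy array with one item per row, and the TF-IDF formula shown is just one common variant.

```python
import numpy as np

def center(X):
    """Subtract the global mean vector from each row."""
    return X - X.mean(axis=0)

def unit_normalize(X, eps=1e-12):
    """Scale each row to unit Euclidean norm (eps guards against all-zero rows)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, eps)

def tfidf(counts):
    """counts: documents x terms matrix of raw term counts."""
    n_docs = counts.shape[0]
    df = (counts > 0).sum(axis=0)              # number of documents containing each term
    idf = np.log(n_docs / np.maximum(df, 1))   # one common IDF variant (an assumption)
    return counts * idf

X = np.array([[3.0, 4.0], [0.0, 2.0], [6.0, 0.0]])
print(center(X).mean(axis=0))                     # column means become (numerically) zero
print(np.linalg.norm(unit_normalize(X), axis=1))  # every row now has norm 1

counts = np.array([[2, 1, 0], [1, 0, 3]])
print(tfidf(counts))                              # a term present in every document gets weight 0
```

Note that centering after unit-normalizing changes the row norms again, so the order in which the steps are applied matters.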

Practitioner’s Guide: Too Slow?

Apply PCA to reduce to an intermediate dimension before the main dimension reduction step
• t-SNE does it by default
• The results may even be improved due to noise removed by PCA

See if there is any approximate but faster version
• Landmarked versions (only using a subset of data items)
  • e.g., landmarked Isomap
• Linearized versions (the same criterion, but only allow linear mapping)
  • e.g., Laplacian Eigenmaps → Locality preserving projection
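
The PCA pre-reduction step can be sketched with a single SVD. The intermediate dimension of 50 and the random data below are hypothetical choices for illustration; the slower main method would then run on `X_mid` instead of `X`.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project centered data onto its top principal directions."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 500))   # hypothetical data: 300 items in 500 dimensions
X_mid = pca_reduce(X, 50)         # intermediate representation for the main DR step
print(X_mid.shape)                # (300, 50)
```

Pairwise distances are then computed in 50 dimensions rather than 500, which is where most neighbor-based methods spend their time.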

Practitioner’s Guide: Still Need More?

Be creative! And feel free to tweak dimension reduction
• Play with its algorithm, convergence criteria, etc.
• See if you can impose label information

[Figure: original t-SNE vs. t-SNE with a simple modification]

• Restrict the number of iterations to whatever you can afford.

The raison d’être of DR is to serve us in exploring data and solving complicated real-world problems

Or…

Start your own research to make

Gold Mine for Researchers

Compared to recent advances in dimension reduction, its application in visualization is highly under-explored

That is, dimension reduction for visualization still needs to
• Be faster
• Be more interpretable
• Give more semantically meaningful results
• Be more interactive and responsive to users

Research Example: Visualize It-Wise*

Motivation
• Force-directed layout revisited

http://prefuse.org/gallery/graphview/

Why not make other methods like this?

*Joint work with Changhyun Lee and Haesun Park

Thank you for your attention!

Kernel PCA: Generalization of Isomap

What is a kernel function?
• Given two vectors, a kernel gives their pairwise similarity
  • Gaussian, polynomial, exponential, etc.

Kernel PCA: nonlinear PCA using a kernel
• Construct a pairwise similarity matrix computed by the kernel
• Do eigen-decomposition on this matrix to obtain the embedding that best preserves the given pairwise relations

Isomap can be viewed as kernel PCA with a particular kernel using geodesic distance. Many other methods can also be viewed as kernel PCA.
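
A minimal kernel PCA sketch: build a kernel matrix, center it, and eigen-decompose it. The function names, the Gaussian kernel parameter `gamma`, and the random data are assumptions for illustration only.

```python
import numpy as np

def gaussian_kernel(X, gamma=1.0):
    """Pairwise similarities exp(-gamma * ||xi - xj||^2)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.exp(-gamma * (diff ** 2).sum(axis=-1))

def kernel_pca(K, n_components=2):
    """Embed items from an n x n kernel (similarity) matrix K."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                              # center the kernel matrix
    eigvals, eigvecs = np.linalg.eigh(Kc)
    top = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))                    # hypothetical data
Y_lin = kernel_pca(X @ X.T, n_components=2)     # linear kernel: ordinary PCA up to sign
Y_rbf = kernel_pca(gaussian_kernel(X, gamma=0.1), n_components=2)
print(Y_lin.shape, Y_rbf.shape)
```

With the linear kernel this reduces to ordinary PCA. Passing `-0.5 * G**2`, where `G` holds geodesic distances, as `K` gives the Isomap view described above, since the centering step is the same double centering used in classical MDS.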