Computational Topology - Mapper

Jiaqi Ni

Eindhoven University of Technology

June 14, 2018

Outline

Introduction

Mapper in the continuous setting

Mapper in practice

Parameters of Mapper in practice

Applications

Summary

Introduction

I Mapper is a computational method for extracting simple descriptions of high dimensional data sets in the form of simplicial complexes.

Recap about Reeb Graph

Definition: The Reeb graph of f is the set of contours R(f), where a contour is a connected component of a level set of f.

With Mapper we can obtain a result similar to the Reeb graph.

Mapper can also produce results that differ from the Reeb graph.

Cover of space

If X is a topological space, then a cover C of X is a collection of subsets U of X whose union is the whole space X. In this case we say that C covers X, or that the sets U cover X.

[Figure: topological space X; cover of space X]

If Y is a subset of X, then a cover of Y is a collection of subsets of X whose union contains Y,

i.e., C = {U_α}_{α ∈ A} is a cover of Y if Y ⊆ ⋃_{α ∈ A} U_α

Cover refinement

I A refinement of a cover C of a topological space X is a new cover D of X such that every set in D is contained in some set in C.

I Formally: D = {V_β}_{β ∈ B} is a refinement of C = {U_α}_{α ∈ A} when ∀β ∈ B ∃α ∈ A : V_β ⊆ U_α

[Figure: space X; cover of space X; refinement of cover]
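For finite sets, the two definitions above can be checked directly. The following is a minimal sketch, not part of the original slides; the helper names `is_cover` and `is_refinement` and the example sets X, C, D are assumptions chosen for illustration.

```python
def is_cover(cover, space):
    """True if the union of the sets in `cover` contains every point of `space`."""
    return set(space) <= set().union(*cover)


def is_refinement(fine, coarse):
    """True if every set in `fine` is contained in some set of `coarse`."""
    return all(any(set(v) <= set(u) for u in coarse) for v in fine)


# Hypothetical example: X = {0, ..., 5}, a cover C and a candidate refinement D.
X = set(range(6))
C = [{0, 1, 2, 3}, {2, 3, 4, 5}]
D = [{0, 1}, {2, 3}, {3, 4, 5}]

print(is_cover(C, X), is_cover(D, X))   # both collections cover X
print(is_refinement(D, C))              # every set of D sits inside a set of C
print(is_refinement(C, D))              # the converse fails
```

Note that refinement is not symmetric: C is coarser than D, so no set of C fits inside a set of D.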

Mapper in the continuous setting

Input:

I Continuous function (filter) f : X → R

I Cover C of im(f) by open intervals: im(f) ⊆ ⋃_{c ∈ C} c

Method:

I Compute the pullback cover 𝒰 of X: 𝒰 = {f^{-1}(c)}_{c ∈ C}

I Refine 𝒰 by separating each of its elements into its connected components → connected cover 𝒱

I The Mapper is the nerve of 𝒱:
  I 1 vertex per element V ∈ 𝒱
  I 1 edge per intersection V ∩ V′ ≠ ∅, V, V′ ∈ 𝒱
  I 1 k-simplex per (k + 1)-fold intersection ⋂_{i=0}^{k} V_i ≠ ∅, V_0, V_1, ..., V_k ∈ 𝒱
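The construction above can be traced by hand on a small case. The following worked example is an illustration added here, not taken from the slides: it assumes X is the unit circle, f is the height function, and the cover consists of three overlapping intervals chosen for convenience.

```latex
% Worked example under stated assumptions: X = S^1, f = height function.
\begin{align*}
X &= S^1 = \{(x,y) \in \mathbb{R}^2 : x^2 + y^2 = 1\}, \qquad f(x,y) = y, \qquad \operatorname{im}(f) = [-1,1] \\
C &= \{c_1, c_2, c_3\}, \qquad c_1 = (-1.1,\,-0.3),\quad c_2 = (-0.5,\,0.5),\quad c_3 = (0.3,\,1.1) \\
f^{-1}(c_1) &= B \ \text{(bottom arc)}, \qquad f^{-1}(c_3) = T \ \text{(top arc)}, \qquad f^{-1}(c_2) = L \sqcup R \ \text{(left and right arcs)} \\
\mathcal{V} &= \{B,\, L,\, R,\, T\} \\
B \cap L &\neq \emptyset, \quad B \cap R \neq \emptyset, \quad T \cap L \neq \emptyset, \quad T \cap R \neq \emptyset, \qquad B \cap T = \emptyset, \quad L \cap R = \emptyset
\end{align*}
% Every triple of elements contains either {B, T} or {L, R}, so there are no 2-simplices.
% The nerve has 4 vertices and 4 edges: a 4-cycle, which has one loop, like S^1 itself.
```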

Example of Mapper in the continuous setting

Mapper in practice

Input:

I Point cloud P with distance matrix

I Continuous function (filter) f : P → R

I Cover C of im(f) by open intervals: im(f) ⊆ ⋃_{c ∈ C} c

Method:

I Compute the pullback cover 𝒰 of P: 𝒰 = {f^{-1}(c)}_{c ∈ C}

I Refine 𝒰 by applying a clustering algorithm (with distance threshold δ) → connected cover 𝒱

I The Mapper is the nerve of 𝒱:
  I 1 vertex per element V ∈ 𝒱
  I 1 edge per intersection V ∩ V′ ≠ ∅, V, V′ ∈ 𝒱
  I 1 k-simplex per (k + 1)-fold intersection ⋂_{i=0}^{k} V_i ≠ ∅, V_0, V_1, ..., V_k ∈ 𝒱
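The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the slides: it only builds the 1-skeleton (vertices and edges) of the nerve, it clusters by taking connected components of the graph with edges d ≤ δ (which matches single-linkage cut at δ), and the function names `threshold_clusters` and `mapper_graph` as well as the circle data are hypothetical.

```python
import itertools
import numpy as np


def threshold_clusters(dist, delta):
    """Single-linkage clusters at scale delta: components of the graph d(i, j) <= delta."""
    n = len(dist)
    labels = [-1] * n
    n_clusters = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = n_clusters
        stack = [start]
        while stack:  # depth-first search over points within distance delta
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and dist[i, j] <= delta:
                    labels[j] = n_clusters
                    stack.append(j)
        n_clusters += 1
    return labels


def mapper_graph(points, filter_values, intervals, delta):
    """Return (vertices, edges); each vertex is a frozenset of point indices."""
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    vertices = []
    for a, b in intervals:
        # Pullback: indices of points whose filter value lies in the open interval (a, b).
        idx = np.flatnonzero((filter_values > a) & (filter_values < b))
        if idx.size == 0:
            continue
        labels = threshold_clusters(dist[np.ix_(idx, idx)], delta)
        for lab in sorted(set(labels)):
            members = (int(idx[k]) for k, l in enumerate(labels) if l == lab)
            vertices.append(frozenset(members))
    # 1-skeleton of the nerve: an edge whenever two clusters share a point.
    edges = [(i, j) for i, j in itertools.combinations(range(len(vertices)), 2)
             if vertices[i] & vertices[j]]
    return vertices, edges


# Hypothetical data: 40 evenly spaced points on the unit circle, filter = y-coordinate.
theta = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
points = np.column_stack([np.cos(theta), np.sin(theta)])
intervals = [(-1.1, -0.3), (-0.5, 0.5), (0.3, 1.1)]
vertices, edges = mapper_graph(points, points[:, 1], intervals, delta=0.2)
print(len(vertices), len(edges))
```

On this data the middle interval splits into a left and a right arc, so the result is a cycle on four vertices, which recovers the loop of the circle.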

Example of Mapper in practice

Parameters of Mapper in practice

I Filter f : P → R

I Cover C of im(f) by open intervals

I Clustering algorithm

Parameters of Mapper in practice - Filter functions

I The outcome of Mapper is highly dependent on the function chosen to partition (filter) the data set, and the choice of function depends mostly on the data set.

I Possible functions:
  I Density
  I Eccentricity
  I Graph Laplacians
  I sum/average/max/min
  I x/y-axis projection
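A few of the filters listed above can be computed from a point cloud with NumPy alone. The sketch below is illustrative and makes assumptions not stated in the slides: eccentricity is taken as the mean distance to all points, and density as an unnormalized Gaussian kernel sum with a bandwidth parameter `eps`; the function names and the four sample points are hypothetical.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix of an (n, d) array of points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def eccentricity(dist):
    """Mean distance from each point to all points; large for points far from the bulk."""
    return dist.mean(axis=1)


def gaussian_density(dist, eps):
    """Unnormalized Gaussian kernel density: sum over y of exp(-d(x, y)^2 / eps)."""
    return np.exp(-dist ** 2 / eps).sum(axis=1)


def axis_projection(points, axis=0):
    """Projection onto one coordinate axis (axis=0 for x, axis=1 for y)."""
    return points[:, axis]


# Hypothetical data: three nearby points and one outlier.
points = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [10.0, 0.0]])
dist = pairwise_distances(points)
ecc = eccentricity(dist)
dens = gaussian_density(dist, eps=1.0)
print(ecc)                          # the outlier has the largest eccentricity
print(dens)                         # the outlier has the smallest density
print(axis_projection(points, 0))   # x-coordinates
```

Each function returns one real value per point, which is the shape a filter f : P → R needs.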

Filter function examples

Parameters of Mapper in practice - Cover

I Uniform cover I with two parameters:
  I resolution / granularity: r (diameter of intervals)
  I gain: g (percentage of overlap)

I Example:

I Changing r and g can strongly affect the result.
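One way to build such a uniform cover is shown below. This is a sketch under assumptions: r is read as the interval length and g as the fraction of each interval shared with its successor, so consecutive left endpoints are r·(1 − g) apart; the function name `uniform_cover` is hypothetical, and interval endpoints are returned as plain (left, right) pairs.

```python
def uniform_cover(lo, hi, r, g):
    """Intervals of length r covering [lo, hi]; consecutive intervals overlap by g * r."""
    if not (r > 0 and 0 <= g < 1):
        raise ValueError("need r > 0 and 0 <= g < 1")
    step = r * (1.0 - g)          # distance between consecutive left endpoints
    intervals = []
    i = 0
    while True:
        a = lo + i * step
        intervals.append((a, a + r))
        if a + r >= hi:           # the last interval reaches the top of the range
            break
        i += 1
    return intervals


cover = uniform_cover(0.0, 1.0, r=0.5, g=0.5)
print(cover)
print(len(uniform_cover(0.0, 1.0, r=0.25, g=0.5)))  # smaller r gives more intervals
```

With open intervals, the two extreme filter values would sit exactly on an endpoint, so in practice the range is usually padded slightly.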

Cover examples

Mapper for Y-shape point cloud data

Parameters of uniform Cover

Parameter r:

I Small r: fine cover, Mapper close to Reeb graph, but sensitive to δ.

I Large r: rough cover, less sensitive to δ, but Mapper far from Reeb graph.

Parameter g:

I Large g (close to 1): more points inside intersections, less sensitive to δ but far from Reeb graph.

I Small g (close to 0): controlled Mapper dimension, close to Reeb graph.

Parameters of Mapper in practice - Clustering algorithm

Single-linkage clustering is one of several methods of hierarchical clustering.

I Based on grouping clusters in bottom-up fashion (agglomerative clustering).

I At each step, combine the two clusters that contain the closest pair of elements not yet belonging to the same cluster.
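The merging rule above can be sketched with a union-find structure. This is an illustrative version, not the code behind the slides: it assumes the hierarchy is cut at a distance threshold `delta` (merging stops once the closest remaining pair is farther apart than `delta`), and the function name `single_linkage` and the sample points are hypothetical.

```python
def single_linkage(dist, delta):
    """Cluster labels from single-linkage merging, stopped at distance threshold delta."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Follow parent pointers to the cluster representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process pairs from closest to farthest; each merge joins the two clusters
    # containing the closest pair of points not yet in the same cluster.
    pairs = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    for d, i, j in pairs:
        if d > delta:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    # Relabel representatives as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]


# Hypothetical data: five points on a line.
xs = [0.0, 0.4, 0.9, 3.0, 3.3]
dist = [[abs(a - b) for b in xs] for a in xs]
print(single_linkage(dist, delta=0.6))
print(single_linkage(dist, delta=0.35))
```

Lowering `delta` leaves more, smaller clusters, which is the effect the parameter δ has on the number of Mapper nodes.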

Example of Single-linkage clustering

Example of Clustering algorithm with different parameters

Parameters of graph neighborhood size

Parameter δ:

I Large δ: fewer nodes, clean Mapper but far from Reeb graph (more straight lines).

I Small δ: topological structure is present, but there are many nodes (noisy).

Higher Dimensional Parameter Spaces

I We use 1 function and let R be our 1-dimensional parameter space.

I We can use M functions and let R^M be our M-dimensional parameter space. It then remains to find a covering of an M-dimensional hypercube defined by the ranges of the M functions.

Example of parameter space R^2

I Assume we have a 2-dimensional point cloud data set P as follows.

I Assume we have two filter functions f : P → R and g : P → R, with f = f^{-1} and g = g^{-1}.

I Moreover, assume we have the following cover C, which is also a cover of P since f = f^{-1} and g = g^{-1}.

I Assume the clustering algorithm groups all points in each rectangle into one cluster.

I Whenever the clusters corresponding to any n vertices have a nonempty intersection, add a corresponding (n-1)-simplex.

I Intersection of two clusters = 1 edge.

I Intersection of three clusters = 1 triangle.

I Intersection of four clusters = 1 tetrahedron.

I Final simplicial complex.
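The rule for adding simplices can be written as a brute-force enumeration over subsets of clusters. The sketch below is an illustration, not the slides' implementation; the function name `nerve`, the `max_dim` cutoff and the five example clusters are assumptions, and the enumeration is only practical for a small number of clusters.

```python
from itertools import combinations


def nerve(clusters, max_dim=3):
    """Simplices (tuples of cluster indices) of the nerve, up to dimension max_dim."""
    simplices = []
    for k in range(1, max_dim + 2):  # k clusters span a (k - 1)-simplex
        for combo in combinations(range(len(clusters)), k):
            # Add the simplex when the k clusters have a point in common.
            if set.intersection(*(set(clusters[i]) for i in combo)):
                simplices.append(combo)
    return simplices


# Hypothetical clusters: the first four share point 5, the last one only meets cluster 3.
clusters = [{1, 5}, {2, 5}, {3, 5}, {4, 5}, {4, 6}]
simplices = nerve(clusters)
counts = {dim: sum(1 for s in simplices if len(s) == dim + 1) for dim in range(4)}
print(counts)  # number of vertices, edges, triangles, tetrahedra
```

The four clusters sharing a point produce one tetrahedron together with all of its faces, as in the four-cluster case above.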

Higher Dimensional Parameter Spaces

Mapper can be extended to the parameter space R^M in a similar fashion (by finding a covering of an M-dimensional hypercube defined by the ranges of the M functions).
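One simple covering of such a hypercube is the Cartesian product of M one-dimensional interval covers. The sketch below assumes that choice; the function names `box_cover` and `pullback`, the interval endpoints and the three sample points are hypothetical, and intervals are treated as closed for simplicity.

```python
from itertools import product

import numpy as np


def box_cover(interval_covers):
    """Cartesian product of M one-dimensional covers: a list of M-dimensional boxes."""
    return list(product(*interval_covers))


def pullback(filter_values, box):
    """Indices of points whose M filter values all fall inside the given box."""
    inside = np.ones(len(filter_values), dtype=bool)
    for m, (a, b) in enumerate(box):
        inside &= (filter_values[:, m] >= a) & (filter_values[:, m] <= b)
    return np.flatnonzero(inside)


# Hypothetical setting with M = 2 filters, each covered by two overlapping intervals.
covers = [[(0.0, 0.6), (0.4, 1.0)], [(0.0, 0.6), (0.4, 1.0)]]
boxes = box_cover(covers)

# Each row holds the two filter values of one point.
values = np.array([[0.5, 0.5], [0.1, 0.9], [0.9, 0.1]])
for box in boxes:
    print(box, pullback(values, box))
```

The point with filter values (0.5, 0.5) lies in all four boxes, so the four resulting clusters share a point and span a tetrahedron in the nerve.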

Mapper in Applications

Most commonly used in:

I Clustering

I Feature selection (flares, loops)

Applications to Medical science data

The data set covers 145 patients who had diabetes. For each patient, six quantities were measured:

I Age

I Relative weight

I Fasting plasma glucose

I Area under the plasma glucose curve for the three-hour glucose tolerance test (OGTT)

I Area under the plasma insulin curve for the OGTT

I Steady state plasma glucose response

This creates a 6 dimensional data set.

Applications to Medical science data

I Projection pursuit methods are applied to obtain a projection into three-dimensional Euclidean space.

We want to use Mapper as an automatic tool for detecting such flares in the data.

I Left: 3 intervals, 50% overlap.

I Right: 4 intervals, 50% overlap.

I For each output:
  I Left flare: adult onset. Right flare: juvenile onset.
  I Distance function: L2 distance
  I Filter function: density kernel with ε = 130,000

Mapper in Applications

I Innate and adaptive T cells in asthmatic patients: Relationship to severity and disease mechanisms, Hinks et al., J. Allergy Clinical Immunology, 2015

I Topological Data Analysis for Discovery in Preclinical Spinal Cord Injury and Traumatic Brain Injury, Nielson et al., Nature, 2015

I Using Topological Data Analysis for Diagnosis Pulmonary Embolism, Rucco et al., arXiv preprint, 2014

I CD8 T-cell reactivity to islet antigens is unique to type 1 while CD4 T-cell reactivity exists in both type 1 and type 2 diabetes, Sarikonda et al., J. Autoimmunity, 2013

I Extracting insights from the shape of complex data using topology, Lum et al., Nature, 2013

I Topological Methods for Exploring Low-density States in Biomolecular Folding Pathways, Yao et al., J. Chemical Physics, 2009

Summary

I Mapper: a computational method which retrieves a higher-level understanding of the structure of data.

I Mapper in continuous setting.

I Mapper in practice

I Parameters of Mapper in practice
  I filter function.
  I covering algorithm.
  I clustering algorithm.

I Applications

Sources

I [SMG07] G. Singh, F. Mémoli, G. Carlsson, Topological Methods for the Analysis of High Dimensional Data Sets and 3D Object Recognition, Eurographics Symposium on Point-Based Graphics 2007.

I Examples and images from Tutorial of topological data analysis part 3 (Mapper algorithm): https://www.slideshare.net/Eniod/tutorial-of-topological-data-analysis-part-3mapper-algorithm

I Examples and images from Introduction to Topological Data Analysis: https://www.slideshare.net/hendrikarisma/introduction-to-topological-data-analysis-59759836

I Examples and images from KeplerMapper: https://mlwave.github.io/kepler-mapper/
