topology and data - stanford universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf ·...

173
Topology and Data Gunnar Carlsson 1 Department of Mathematics Stanford University http://comptop.stanford.edu/ June 27, 2008 1 Research supported in part by DARPA and NSF

Upload: hoangngoc

Post on 18-Feb-2018

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Topology and Data

Gunnar Carlsson 1

Department of MathematicsStanford University

http://comptop.stanford.edu/

June 27, 2008

1Research supported in part by DARPA and NSF

Page 2: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Introduction

I General area of geometric data analysis attempts to giveinsight into data by imposing a geometry on it

I Sometimes very natural (physics), sometimes less so(genomics)

I Value of geometry is that it allows us to organize and viewdata more effectively, for better understanding

I Can obtain an idea of a reasonable layout or overview of thedata

I Sometimes all that is required is a qualitative overview

Page 3: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Introduction

I General area of geometric data analysis attempts to giveinsight into data by imposing a geometry on it

I Sometimes very natural (physics), sometimes less so(genomics)

I Value of geometry is that it allows us to organize and viewdata more effectively, for better understanding

I Can obtain an idea of a reasonable layout or overview of thedata

I Sometimes all that is required is a qualitative overview

Page 4: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Introduction

I General area of geometric data analysis attempts to giveinsight into data by imposing a geometry on it

I Sometimes very natural (physics), sometimes less so(genomics)

I Value of geometry is that it allows us to organize and viewdata more effectively, for better understanding

I Can obtain an idea of a reasonable layout or overview of thedata

I Sometimes all that is required is a qualitative overview

Page 5: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Introduction

I General area of geometric data analysis attempts to giveinsight into data by imposing a geometry on it

I Sometimes very natural (physics), sometimes less so(genomics)

I Value of geometry is that it allows us to organize and viewdata more effectively, for better understanding

I Can obtain an idea of a reasonable layout or overview of thedata

I Sometimes all that is required is a qualitative overview

Page 6: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Introduction

I General area of geometric data analysis attempts to giveinsight into data by imposing a geometry on it

I Sometimes very natural (physics), sometimes less so(genomics)

I Value of geometry is that it allows us to organize and viewdata more effectively, for better understanding

I Can obtain an idea of a reasonable layout or overview of thedata

I Sometimes all that is required is a qualitative overview

Page 7: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Methods for Imposing a Geometry

Define a metric

Page 8: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Methods for Imposing a Geometry

Define a graph or network structure

Page 9: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Methods for Imposing a Geometry

Cluster the data

Page 10: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Methods for Summarizing or Visualizing a Geometry

Linear projections

Page 11: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Methods for Summarizing or Visualizing a Geometry

Multidimensional scaling, ISOMAP, LLE

Page 12: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Methods for Summarizing or Visualizing a Geometry

Project to a tree

Page 13: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Properties of Data Geometries

We Don’t Trust Large Distances

I In physics, distances have strong theoretical backing, andshould be viewed as reliable

I In biology or social sciences, distances are constructed using anotion of similarity, but have no theoretical backing (e.g.Jukes-Cantor distance between sequences)

I Means that small distances still represent similarity, butcomparison of long distances makes little sense

Page 14: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Properties of Data Geometries

We Don’t Trust Large Distances

I In physics, distances have strong theoretical backing, andshould be viewed as reliable

I In biology or social sciences, distances are constructed using anotion of similarity, but have no theoretical backing (e.g.Jukes-Cantor distance between sequences)

I Means that small distances still represent similarity, butcomparison of long distances makes little sense

Page 15: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Properties of Data Geometries

We Don’t Trust Large Distances

I In physics, distances have strong theoretical backing, andshould be viewed as reliable

I In biology or social sciences, distances are constructed using anotion of similarity, but have no theoretical backing (e.g.Jukes-Cantor distance between sequences)

I Means that small distances still represent similarity, butcomparison of long distances makes little sense

Page 16: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Properties of Data Geometries

We Don’t Trust Large Distances

I In physics, distances have strong theoretical backing, andshould be viewed as reliable

I In biology or social sciences, distances are constructed using anotion of similarity, but have no theoretical backing (e.g.Jukes-Cantor distance between sequences)

I Means that small distances still represent similarity, butcomparison of long distances makes little sense

Page 17: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Properties of Data Geometries

We Only Trust Small Distances a Bit

1.5 1.8

I Both pairs are regarded as similar, but the strength of thesimilarity as encoded by the distance may not be so significant

I Similarity more like a 0/1-valued quantity than R-valued

Page 18: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Properties of Data Geometries

We Only Trust Small Distances a Bit

1.5 1.8

I Both pairs are regarded as similar, but the strength of thesimilarity as encoded by the distance may not be so significant

I Similarity more like a 0/1-valued quantity than R-valued

Page 19: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Properties of Data Geometries

We Only Trust Small Distances a Bit

1.5 1.8

I Both pairs are regarded as similar, but the strength of thesimilarity as encoded by the distance may not be so significant

I Similarity more like a 0/1-valued quantity than R-valued

Page 20: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Properties of Data Geometries

We Only Trust Small Distances a Bit

1.5 1.8

I Both pairs are regarded as similar, but the strength of thesimilarity as encoded by the distance may not be so significant

I Similarity more like a 0/1-valued quantity than R-valued

Page 21: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Properties of Data Geometries

Connections are Noisy

I Distance measurements are noisy, as are the connections inmany graph models

I Requires stochastic geometric methods for study

I Methods of Coifman et al and others relevant here

Page 22: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Properties of Data Geometries

Connections are Noisy

I Distance measurements are noisy, as are the connections inmany graph models

I Requires stochastic geometric methods for study

I Methods of Coifman et al and others relevant here

Page 23: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Properties of Data Geometries

Connections are Noisy

I Distance measurements are noisy, as are the connections inmany graph models

I Requires stochastic geometric methods for study

I Methods of Coifman et al and others relevant here

Page 24: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Properties of Data Geometries

Connections are Noisy

I Distance measurements are noisy, as are the connections inmany graph models

I Requires stochastic geometric methods for study

I Methods of Coifman et al and others relevant here

Page 25: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Topology

Homeomorphic

Page 26: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Topology

Homeomorphic

Page 27: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Topology

I To see that these pairs are “same” requires distortion ofdistances, i.e. stretching and shrinking

I We do not permit “tearing”, i.e. distorting distances in adiscontinuous way

I How to make this precise?

Page 28: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Topology

I To see that these pairs are “same” requires distortion ofdistances, i.e. stretching and shrinking

I We do not permit “tearing”, i.e. distorting distances in adiscontinuous way

I How to make this precise?

Page 29: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Topology

I To see that these pairs are “same” requires distortion ofdistances, i.e. stretching and shrinking

I We do not permit “tearing”, i.e. distorting distances in adiscontinuous way

I How to make this precise?

Page 30: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Topology

I One would like to say that all non-zero distances in a metricspace are the same

I But, d(x , y) = 0 means x = y

I Idea: consider instead distances from points to subsets. Canbe zero.

This accomplishes the intuitive idea of permitting arbitraryrescalings of distances while leaving “infinite nearness”intact.

Page 31: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Topology

I One would like to say that all non-zero distances in a metricspace are the same

I But, d(x , y) = 0 means x = y

I Idea: consider instead distances from points to subsets. Canbe zero.

This accomplishes the intuitive idea of permitting arbitraryrescalings of distances while leaving “infinite nearness”intact.

Page 32: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Topology

I One would like to say that all non-zero distances in a metricspace are the same

I But, d(x , y) = 0 means x = y

I Idea: consider instead distances from points to subsets. Canbe zero.

This accomplishes the intuitive idea of permitting arbitraryrescalings of distances while leaving “infinite nearness”intact.

Page 33: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Topology

I One would like to say that all non-zero distances in a metricspace are the same

I But, d(x , y) = 0 means x = y

I Idea: consider instead distances from points to subsets. Canbe zero.

This accomplishes the intuitive idea of permitting arbitraryrescalings of distances while leaving “infinite nearness”intact.

Page 34: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Topology

I Topology is the idealized form of what we want in dealingwith data, namely permitting arbitrary rescalings which varyover the space

I Now must make versions of topological methods which are“less idealized”

I Means in particular finding ways of tracking or summarizingbehavior as metrics are deformed or other parameters arechanged

I Ultimately means building in noise and uncertainty. This is inthe future - “statistical topology”.

Page 35: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Topology

I Topology is the idealized form of what we want in dealingwith data, namely permitting arbitrary rescalings which varyover the space

I Now must make versions of topological methods which are“less idealized”

I Means in particular finding ways of tracking or summarizingbehavior as metrics are deformed or other parameters arechanged

I Ultimately means building in noise and uncertainty. This is inthe future - “statistical topology”.

Page 36: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Topology

I Topology is the idealized form of what we want in dealingwith data, namely permitting arbitrary rescalings which varyover the space

I Now must make versions of topological methods which are“less idealized”

I Means in particular finding ways of tracking or summarizingbehavior as metrics are deformed or other parameters arechanged

I Ultimately means building in noise and uncertainty. This is inthe future - “statistical topology”.

Page 37: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Topology

I Topology is the idealized form of what we want in dealingwith data, namely permitting arbitrary rescalings which varyover the space

I Now must make versions of topological methods which are“less idealized”

I Means in particular finding ways of tracking or summarizingbehavior as metrics are deformed or other parameters arechanged

I Ultimately means building in noise and uncertainty. This is inthe future - “statistical topology”.

Page 38: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Outline

1. Homology as signature for shape identification

2. Image processing example

3. Topological “imaging” of data

4. Signatures for significance of structural invariants

Page 39: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Outline

1. Homology as signature for shape identification

2. Image processing example

3. Topological “imaging” of data

4. Signatures for significance of structural invariants

Page 40: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Outline

1. Homology as signature for shape identification

2. Image processing example

3. Topological “imaging” of data

4. Signatures for significance of structural invariants

Page 41: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Outline

1. Homology as signature for shape identification

2. Image processing example

3. Topological “imaging” of data

4. Signatures for significance of structural invariants

Page 42: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology

I Homology: crudest measure of topological properties

I For every space X and dimension k, constructs a vector spaceHk(X ) whose dimension (the k-th Betti number βk) is amathematically precise version of the intuitive notion ofcounting “k-dimensional holes”

I Computed using linear algebraic methods, basically Smithnormal form

I β0 is a count of the number of connected components

I βi ’s form a signature which encodes topological informationabout the shape

Page 43: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology

I Homology: crudest measure of topological properties

I For every space X and dimension k, constructs a vector spaceHk(X ) whose dimension (the k-th Betti number βk) is amathematically precise version of the intuitive notion ofcounting “k-dimensional holes”

I Computed using linear algebraic methods, basically Smithnormal form

I β0 is a count of the number of connected components

I βi ’s form a signature which encodes topological informationabout the shape

Page 44: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology

I Homology: crudest measure of topological properties

I For every space X and dimension k, constructs a vector spaceHk(X ) whose dimension (the k-th Betti number βk) is amathematically precise version of the intuitive notion ofcounting “k-dimensional holes”

I Computed using linear algebraic methods, basically Smithnormal form

I β0 is a count of the number of connected components

I βi ’s form a signature which encodes topological informationabout the shape

Page 45: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology

I Homology: crudest measure of topological properties

I For every space X and dimension k, constructs a vector spaceHk(X ) whose dimension (the k-th Betti number βk) is amathematically precise version of the intuitive notion ofcounting “k-dimensional holes”

I Computed using linear algebraic methods, basically Smithnormal form

I β0 is a count of the number of connected components

I βi ’s form a signature which encodes topological informationabout the shape

Page 46: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology

I Homology: crudest measure of topological properties

I For every space X and dimension k, constructs a vector spaceHk(X ) whose dimension (the k-th Betti number βk) is amathematically precise version of the intuitive notion ofcounting “k-dimensional holes”

I Computed using linear algebraic methods, basically Smithnormal form

I β0 is a count of the number of connected components

I βi ’s form a signature which encodes topological informationabout the shape

Page 47: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology

β0 = 1, β1 = 1, and βi = 0 for i ≥ 2

Page 48: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology

β0 = 1, β1 = 0, β2 = 0, and βk = 0 for k ≥ 3

Page 49: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology

β0 = 1, β1 = 2, β2 = 1, and βk = 0 for k ≥ 3

Page 50: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology

Question: For a point cloud X , can one infer the Bettinumbers of the space X from which it is sampled?

Page 51: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology - Cech Complex

Page 52: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology - Cech Complex

C(X , ε) - involves a choice of a parameter ε (radius of the balls)

Points are connected if balls of radius ε around them overlap

Complex grows with ε

Page 53: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology - Cech Complex

C(X , ε) - involves a choice of a parameter ε (radius of the balls)

Points are connected if balls of radius ε around them overlap

Complex grows with ε

Page 54: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology - Cech Complex

C(X , ε) - involves a choice of a parameter ε (radius of the balls)

Points are connected if balls of radius ε around them overlap

Complex grows with ε

Page 55: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology

Page 56: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology

Page 57: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology

1=3

Page 58: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology

1=2

Page 59: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology

I Obtain a diagram of vector spaces

· · · → Hi (C(X , ε1))→ Hi (C(X , ε2))→ Hi (C(X , ε3))→ · · ·

when ε1 ≤ ε2 ≤ ε3 etc.

I Called persistence vector spaces

I Such diagrams can be classified by bar codes

I Analogue of dimension for ordinary vector spaces

Page 60: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology

I Obtain a diagram of vector spaces

· · · → Hi (C(X , ε1))→ Hi (C(X , ε2))→ Hi (C(X , ε3))→ · · ·

when ε1 ≤ ε2 ≤ ε3 etc.

I Called persistence vector spaces

I Such diagrams can be classified by bar codes

I Analogue of dimension for ordinary vector spaces

Page 61: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology

I Obtain a diagram of vector spaces

· · · → Hi (C(X , ε1))→ Hi (C(X , ε2))→ Hi (C(X , ε3))→ · · ·

when ε1 ≤ ε2 ≤ ε3 etc.

I Called persistence vector spaces

I Such diagrams can be classified by bar codes

I Analogue of dimension for ordinary vector spaces

Page 62: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology

I Obtain a diagram of vector spaces

· · · → Hi (C(X , ε1))→ Hi (C(X , ε2))→ Hi (C(X , ε3))→ · · ·

when ε1 ≤ ε2 ≤ ε3 etc.

I Called persistence vector spaces

I Such diagrams can be classified by bar codes

I Analogue of dimension for ordinary vector spaces

Page 63: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology - Bar Codes

A segment indicates a basis element “born” at the left handendpoint and which dies at the right hand endpoint

Geometrically, means a loop which begins to exist (i.e. becomesclosed) at the left hand point and is filled in at the right handendpoint.

Page 64: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology - Bar Codes

A segment indicates a basis element “born” at the left handendpoint and which dies at the right hand endpoint

Geometrically, means a loop which begins to exist (i.e. becomesclosed) at the left hand point and is filled in at the right handendpoint.

Page 65: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology - Bar Codes

Interpretation:

Long segments correspond to “honest” geometric features in thepoint cloud

Short segments correspond to “noise”

Look at an example.

Page 66: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology - Bar Codes

Interpretation:

Long segments correspond to “honest” geometric features in thepoint cloud

Short segments correspond to “noise”

Look at an example.

Page 67: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology - Bar Codes

Interpretation:

Long segments correspond to “honest” geometric features in thepoint cloud

Short segments correspond to “noise”

Look at an example.

Page 68: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Persistent Homology - Bar Codes

Interpretation:

Long segments correspond to “honest” geometric features in thepoint cloud

Short segments correspond to “noise”

Look at an example.

Page 69: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

I Joint with V. de Silva, T. Ishkanov, A. Zomorodian

I An image taken by black and white digital camera can beviewed as a vector, with one coordinate for each pixel

I Each pixel has a “gray scale” value, can be thought of as areal number (in reality, takes one of 255 values)

I Typical camera uses tens of thousands of pixels, so images liein a very high dimensional space, call it pixel space, P

Page 70: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

I Joint with V. de Silva, T. Ishkanov, A. Zomorodian

I An image taken by black and white digital camera can beviewed as a vector, with one coordinate for each pixel

I Each pixel has a “gray scale” value, can be thought of as areal number (in reality, takes one of 255 values)

I Typical camera uses tens of thousands of pixels, so images liein a very high dimensional space, call it pixel space, P

Page 71: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

I Joint with V. de Silva, T. Ishkanov, A. Zomorodian

I An image taken by black and white digital camera can beviewed as a vector, with one coordinate for each pixel

I Each pixel has a “gray scale” value, can be thought of as areal number (in reality, takes one of 255 values)

I Typical camera uses tens of thousands of pixels, so images liein a very high dimensional space, call it pixel space, P

Page 72: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

I Joint with V. de Silva, T. Ishkanov, A. Zomorodian

I An image taken by black and white digital camera can beviewed as a vector, with one coordinate for each pixel

I Each pixel has a “gray scale” value, can be thought of as areal number (in reality, takes one of 255 values)

I Typical camera uses tens of thousands of pixels, so images liein a very high dimensional space, call it pixel space, P

Page 73: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

D. Mumford: What can be said about the set of images I ⊆ Pone obtains when one takes many images with a digital camera?

(Lee, Mumford, Pedersen): Useful to study local structure ofimages statistically

Page 74: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

D. Mumford: What can be said about the set of images I ⊆ Pone obtains when one takes many images with a digital camera?

(Lee, Mumford, Pedersen): Useful to study local structure ofimages statistically

Page 75: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

D. Mumford: What can be said about the set of images I ⊆ Pone obtains when one takes many images with a digital camera?

(Lee, Mumford, Pedersen): Useful to study local structure ofimages statistically

Page 76: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

3× 3 patches in images

Page 77: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

Observations:

1. Each patch gives a vector in R9

2. Most patches will be nearly constant, or low contrast, becauseof the presence of regions of solid shading in most images

3. Low contrast will dominate statistics, not interesting

Page 78: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

Observations:

1. Each patch gives a vector in R9

2. Most patches will be nearly constant, or low contrast, becauseof the presence of regions of solid shading in most images

3. Low contrast will dominate statistics, not interesting

Page 79: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

Observations:

1. Each patch gives a vector in R9

2. Most patches will be nearly constant, or low contrast, becauseof the presence of regions of solid shading in most images

3. Low contrast will dominate statistics, not interesting

Page 80: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

Observations:

1. Each patch gives a vector in R9

2. Most patches will be nearly constant, or low contrast, becauseof the presence of regions of solid shading in most images

3. Low contrast will dominate statistics, not interesting

Page 81: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

I Lee-Mumford-Pedersen [LMP] study only high contrastpatches

I Collect c:a 4.5× 106 high contrast patches from a collectionof images obtained by van Hateren and van der Schaaf

I Normalize mean intensity by subtracting mean from each pixelvalue to obtain patches with mean intensity = 0

I Puts data on an 8-dimensional hyperplane, ∼= R8

Page 82: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

I Lee-Mumford-Pedersen [LMP] study only high contrastpatches

I Collect c:a 4.5× 106 high contrast patches from a collectionof images obtained by van Hateren and van der Schaaf

I Normalize mean intensity by subtracting mean from each pixelvalue to obtain patches with mean intensity = 0

I Puts data on an 8-dimensional hyperplane, ∼= R8

Page 83: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

I Lee-Mumford-Pedersen [LMP] study only high contrastpatches

I Collect c:a 4.5× 106 high contrast patches from a collectionof images obtained by van Hateren and van der Schaaf

I Normalize mean intensity by subtracting mean from each pixelvalue to obtain patches with mean intensity = 0

I Puts data on an 8-dimensional hyperplane, ∼= R8

Page 84: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

I Lee-Mumford-Pedersen [LMP] study only high contrastpatches

I Collect c:a 4.5× 106 high contrast patches from a collectionof images obtained by van Hateren and van der Schaaf

I Normalize mean intensity by subtracting mean from each pixelvalue to obtain patches with mean intensity = 0

I Puts data on an 8-dimensional hyperplane, ∼= R8

Page 85: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

I Normalize contrast by dividing by the norm, so obtain patcheswith norm = 1

I Means that data now lies on a 7-D ellipsoid, ∼= S7

Page 86: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

I Normalize contrast by dividing by the norm, so obtain patcheswith norm = 1

I Means that data now lies on a 7-D ellipsoid, ∼= S7

Page 87: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

Result: Point cloud data M lying on a sphere in R8

We wish to analyze it with persistent homology to understand itqualitatively

Page 88: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

Result: Point cloud data M lying on a sphere in R8

We wish to analyze it with persistent homology to understand itqualitatively

Page 89: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

First Observation: The points fill out S7 in the sense that everypoint in S7 is “close” to a point in M

However, density of points varies a great deal from region to region

How to analyze?

Page 90: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

First Observation: The points fill out S7 in the sense that everypoint in S7 is “close” to a point in M

However, density of points varies a great deal from region to region

How to analyze?

Page 91: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

First Observation: The points fill out S7 in the sense that everypoint in S7 is “close” to a point in M

However, density of points varies a great deal from region to region

How to analyze?

Page 92: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

Threshholding M

Define M[T ] ⊆M by

M[T ] = {x |x is in T -th percentile of densest points}

What is the persistent homology of these M[T ]’s?

Page 93: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

Threshholding M

Define M[T ] ⊆M by

M[T ] = {x |x is in T -th percentile of densest points}

What is the persistent homology of these M[T ]’s?

Page 94: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

Threshholding M

Define M[T ] ⊆M by

M[T ] = {x |x is in T -th percentile of densest points}

What is the persistent homology of these M[T ]’s?

Page 95: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

5× 104 points, T = 25

One-dimensional barcode, suggests β1 = 5

Page 96: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

THREE CIRCLE MODEL

Page 97: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

THREE CIRCLE MODEL

Page 98: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Three Circle Model

Red and green circles do not touch, each touches black circle

Page 99: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

Does the data fit with this model?

Page 100: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

PRIMARY

SECONDARY SECONDARY

Page 101: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

IS THERE A TWO DIMENSIONAL SURFACE IN WHICHTHIS PICTURE FITS?

Page 102: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

4.5× 106 points, T = 10

Betti 0 = 1

Betti 1 = 2

Betti 2 = 1

Betti 0 = 1

Betti 1 = 2

Betti 2 = 1

Page 103: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

K - KLEIN BOTTLE

Page 104: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

P

P Q

Q

R

R

S

S

Identification Space Model

Page 105: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

Three circles fit naturally inside K?

Page 106: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

PRIMARY CIRCLE

P

PQ

Q

Page 107: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

SECONDARY CIRCLES

R

R

S

S

Page 108: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

Page 109: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

Page 110: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Example: Natural Image Statistics

Page 111: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Natural Image Statistics

Klein bottle makes sense in quadratic polynomials in two variables,as polynomials which can be written as

f = q(λ(x))

where

1. q is single variable quadratic

2. λ is a linear functional

3.∫D f = 0

4.∫D f 2 = 1

Page 112: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper

Algebraic topology can produce signatures which can help inmapping out a data set.

Can one obtain flexible topological mapping methods, withcombinatorial simplicial complex images?

Yes, joint work with G. Singh and F. Memoli.

Page 113: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper

Algebraic topology can produce signatures which can help inmapping out a data set.

Can one obtain flexible topological mapping methods, withcombinatorial simplicial complex images?

Yes, joint work with G. Singh and F. Memoli.

Page 114: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper

Algebraic topology can produce signatures which can help inmapping out a data set.

Can one obtain flexible topological mapping methods, withcombinatorial simplicial complex images?

Yes, joint work with G. Singh and F. Memoli.

Page 115: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Mayer-Vietoris Blowup

X a space, U = {Uα}α∈A a covering of X .

∆ is the simplex with vertex set A

∅ 6= S ⊆ A,X (S) =⋂

s∈S Us and ∆[S ] = face spanned by S

Let XU ⊆ X ×∆, XU =⋃

S X (S)×∆[S ]

Page 116: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Mayer-Vietoris Blowup

X a space, U = {Uα}α∈A a covering of X .

∆ is the simplex with vertex set A

∅ 6= S ⊆ A,X (S) =⋂

s∈S Us and ∆[S ] = face spanned by S

Let XU ⊆ X ×∆, XU =⋃

S X (S)×∆[S ]

Page 117: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Mayer-Vietoris Blowup

X a space, U = {Uα}α∈A a covering of X .

∆ is the simplex with vertex set A

∅ 6= S ⊆ A,X (S) =⋂

s∈S Us and ∆[S ] = face spanned by S

Let XU ⊆ X ×∆, XU =⋃

S X (S)×∆[S ]

Page 118: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Mayer-Vietoris Blowup

X a space, U = {Uα}α∈A a covering of X .

∆ is the simplex with vertex set A

∅ 6= S ⊆ A,X (S) =⋂

s∈S Us and ∆[S ] = face spanned by S

Let XU ⊆ X ×∆, XU =⋃

S X (S)×∆[S ]

Page 119: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Mayer-Vietoris Blowup

Page 120: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Mayer-Vietoris Blowup

Exists a map πX : XU → X , which is a homotopy equivalence withmild hypotheses

N(U) =⋃{S|X (S)6=∅}∆[S ]

Exists a second map π∆ : XU → N(U)

π∆ is equivalence if all X (S)’s are empty or contractible

Page 121: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Mayer-Vietoris Blowup

Exists a map πX : XU → X , which is a homotopy equivalence withmild hypotheses

N(U) =⋃{S|X (S)6=∅}∆[S ]

Exists a second map π∆ : XU → N(U)

π∆ is equivalence if all X (S)’s are empty or contractible

Page 122: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Mayer-Vietoris Blowup

Exists a map πX : XU → X , which is a homotopy equivalence withmild hypotheses

N(U) =⋃{S|X (S)6=∅}∆[S ]

Exists a second map π∆ : XU → N(U)

π∆ is equivalence if all X (S)’s are empty or contractible

Page 123: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Mayer-Vietoris Blowup

Exists a map πX : XU → X , which is a homotopy equivalence withmild hypotheses

N(U) =⋃{S|X (S)6=∅}∆[S ]

Exists a second map π∆ : XU → N(U)

π∆ is equivalence if all X (S)’s are empty or contractible

Page 124: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Mayer-Vietoris Blowup

Intermediate construction M(X ,U)

M(X ,U) =∐S

π0(X (S))×∆[S ]/ '

π0(X (S))×∆[S ]φ←− π0(X (T ))×∆[S ]

ψ−→ π0(X (T ))×∆[T ]

φ(x , ζ) ' ψ(x , ζ)

Page 125: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Mayer-Vietoris Blowup

Intermediate construction M(X ,U)

M(X ,U) =∐S

π0(X (S))×∆[S ]/ '

π0(X (S))×∆[S ]φ←− π0(X (T ))×∆[S ]

ψ−→ π0(X (T ))×∆[T ]

φ(x , ζ) ' ψ(x , ζ)

Page 126: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Mayer-Vietoris Blowup

Intermediate construction M(X ,U)

M(X ,U) =∐S

π0(X (S))×∆[S ]/ '

π0(X (S))×∆[S ]φ←− π0(X (T ))×∆[S ]

ψ−→ π0(X (T ))×∆[T ]

φ(x , ζ) ' ψ(x , ζ)

Page 127: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Mayer-Vietoris Blowup

Intermediate construction M(X ,U)

M(X ,U) =∐S

π0(X (S))×∆[S ]/ '

π0(X (S))×∆[S ]φ←− π0(X (T ))×∆[S ]

ψ−→ π0(X (T ))×∆[T ]

φ(x , ζ) ' ψ(x , ζ)

Page 128: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Mayer-Vietoris Blowup

Page 129: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Statistical Version

Now given point cloud data set X, and a covering U .

Build simplicial complex same way, but π0 operation replaced bysingle linkage clustering with fixed error parameter ε.

Critical that clustering operation be functorial.

Partition of unity subordinate to U gives map from X to M(X,U).

Page 130: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Statistical Version

Now given point cloud data set X, and a covering U .

Build simplicial complex same way, but π0 operation replaced bysingle linkage clustering with fixed error parameter ε.

Critical that clustering operation be functorial.

Partition of unity subordinate to U gives map from X to M(X,U).

Page 131: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Statistical Version

Now given point cloud data set X, and a covering U .

Build simplicial complex same way, but π0 operation replaced bysingle linkage clustering with fixed error parameter ε.

Critical that clustering operation be functorial.

Partition of unity subordinate to U gives map from X to M(X,U).

Page 132: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Statistical Version

Now given point cloud data set X, and a covering U .

Build simplicial complex same way, but π0 operation replaced bysingle linkage clustering with fixed error parameter ε.

Critical that clustering operation be functorial.

Partition of unity subordinate to U gives map from X to M(X,U).

Page 133: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Statistical Version

How to choose coverings?

Given a reference map (or filter) f : X→ Z , where Z is a metricspace, and a covering U of Z , can consider the covering{f −1Uα}α∈A of X. Typical choices of Z - R, R2, S1.

Construction gives an image complex of the data set which canreflect interesting properties of X.

Page 134: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Statistical Version

How to choose coverings?

Given a reference map (or filter) f : X→ Z , where Z is a metricspace, and a covering U of Z , can consider the covering{f −1Uα}α∈A of X. Typical choices of Z - R, R2, S1.

Construction gives an image complex of the data set which canreflect interesting properties of X.

Page 135: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Statistical Version

How to choose coverings?

Given a reference map (or filter) f : X→ Z , where Z is a metricspace, and a covering U of Z , can consider the covering{f −1Uα}α∈A of X. Typical choices of Z - R, R2, S1.

Construction gives an image complex of the data set which canreflect interesting properties of X.

Page 136: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Statistical Version

Typical one dimensional filters:

I Density estimators

I “Eccentricity” :∑

x ′∈X d(x , x ′)2

I Eigenfunctions of graph Laplacian for Vietoris-Rips graph

I User defined, data dependent filter functions

Page 137: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Statistical Version

Typical one dimensional filters:

I Density estimators

I “Eccentricity” :∑

x ′∈X d(x , x ′)2

I Eigenfunctions of graph Laplacian for Vietoris-Rips graph

I User defined, data dependent filter functions

Page 138: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Statistical Version

Typical one dimensional filters:

I Density estimators

I “Eccentricity” :∑

x ′∈X d(x , x ′)2

I Eigenfunctions of graph Laplacian for Vietoris-Rips graph

I User defined, data dependent filter functions

Page 139: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Statistical Version

Typical one dimensional filters:

I Density estimators

I “Eccentricity” :∑

x ′∈X d(x , x ′)2

I Eigenfunctions of graph Laplacian for Vietoris-Rips graph

I User defined, data dependent filter functions

Page 140: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Statistical Version

Miller-Reaven Diabetes Study, 1976

Page 141: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Statistical Version

Cell Cycle Microarray Data

Joint with M. Nicolau, Nagarajan, G. Singh

Page 142: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Scale Space

How to choose the parameter ε in the single linkage clustering?

Can one allow ε to vary with α?

Important question: too many parameter choices makes toolunusable, and choosing one ε for the entire space is too restrictive.

Page 143: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Scale Space

How to choose the parameter ε in the single linkage clustering?

Can one allow ε to vary with α?

Important question: too many parameter choices makes toolunusable, and choosing one ε for the entire space is too restrictive.

Page 144: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Scale Space

How to choose the parameter ε in the single linkage clustering?

Can one allow ε to vary with α?

Important question: too many parameter choices makes toolunusable, and choosing one ε for the entire space is too restrictive.

Page 145: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Scale Space

Construct a new space with reference map to Z .

For each α, we construct the zero dimensional persistence diagramfor f −1Uα.

Consider the set of all endpoints of intervals in the persistencediagram. Provides a decomposition of the real line in which ε isvarying into intervals. Call these intervals S-intervals.

Page 146: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Scale Space

Construct a new space with reference map to Z .

For each α, we construct the zero dimensional persistence diagramfor f −1Uα.

Consider the set of all endpoints of intervals in the persistencediagram. Provides a decomposition of the real line in which ε isvarying into intervals. Call these intervals S-intervals.

Page 147: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Scale Space

Construct a new space with reference map to Z .

For each α, we construct the zero dimensional persistence diagramfor f −1Uα.

Consider the set of all endpoints of intervals in the persistencediagram. Provides a decomposition of the real line in which ε isvarying into intervals. Call these intervals S-intervals.

Page 148: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Scale Space

Page 149: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Scale Space

I Vertex set of SS(X ,U) consists of a pair (α, I ), where α ∈ Aand I is an S-interval for the zero dimensional persistencediagram for f −1(Uα).

I We connect (α, I ) and (β, J) with an edge if (a) Uα ∩ Uβ 6= ∅and (b) I ∩ J 6= ∅.

I SS(X ) is equipped with a reference map π : SS(X ,U)→ NUgiven on vertices by (α, I )→ α

Page 150: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Scale Space

I Vertex set of SS(X ,U) consists of a pair (α, I ), where α ∈ Aand I is an S-interval for the zero dimensional persistencediagram for f −1(Uα).

I We connect (α, I ) and (β, J) with an edge if (a) Uα ∩ Uβ 6= ∅and (b) I ∩ J 6= ∅.

I SS(X ) is equipped with a reference map π : SS(X ,U)→ NUgiven on vertices by (α, I )→ α

Page 151: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Scale Space

I Vertex set of SS(X ,U) consists of a pair (α, I ), where α ∈ Aand I is an S-interval for the zero dimensional persistencediagram for f −1(Uα).

I We connect (α, I ) and (β, J) with an edge if (a) Uα ∩ Uβ 6= ∅and (b) I ∩ J 6= ∅.

I SS(X ) is equipped with a reference map π : SS(X ,U)→ NUgiven on vertices by (α, I )→ α

Page 152: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Scale Space

A varying choice of scale is now determined by a section of π, i.e amap

σ : NU −→ SS(X ,U)

so that πσ = idNU .

Sections can be given an weighting depending on the length of Ifor the vertices and depending on the length of I ∩ J for the edges.

Finding the high weight sections in the case of 1-D filters iscomputationally tractable.

Page 153: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Scale Space

A varying choice of scale is now determined by a section of π, i.e amap

σ : NU −→ SS(X ,U)

so that πσ = idNU .

Sections can be given an weighting depending on the length of Ifor the vertices and depending on the length of I ∩ J for the edges.

Finding the high weight sections in the case of 1-D filters iscomputationally tractable.

Page 154: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Mapper - Scale Space

A varying choice of scale is now determined by a section of π, i.e amap

σ : NU −→ SS(X ,U)

so that πσ = idNU .

Sections can be given an weighting depending on the length of Ifor the vertices and depending on the length of I ∩ J for the edges.

Finding the high weight sections in the case of 1-D filters iscomputationally tractable.

Page 155: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Variants on Persistence: Zig-Zags

Bootstrap - B. Efron

I Studies statistics of measures of central tendency acrossdifferent samples within a data set

I Can give assessment of reliability of conclusions to be drawnfrom the statistics of the data set

I How can one adapt the technique to apply to qualitativeinformation, such as presence of loops or decompositions intoclusters?

Page 156: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Variants on Persistence: Zig-Zags

Bootstrap - B. Efron

I Studies statistics of measures of central tendency acrossdifferent samples within a data set

I Can give assessment of reliability of conclusions to be drawnfrom the statistics of the data set

I How can one adapt the technique to apply to qualitativeinformation, such as presence of loops or decompositions intoclusters?

Page 157: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Variants on Persistence: Zig-Zags

Bootstrap - B. Efron

I Studies statistics of measures of central tendency acrossdifferent samples within a data set

I Can give assessment of reliability of conclusions to be drawnfrom the statistics of the data set

I How can one adapt the technique to apply to qualitativeinformation, such as presence of loops or decompositions intoclusters?

Page 158: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Variants on Persistence: Zig-Zags

How to distinguish?

Page 159: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Variants on Persistence: Zig-Zags

I Family of samples S1,S2, . . . ,Sk from point cloud data X

I Construct new samples Si ∪ Si+1

I Fit together into a diagram

· · · Si−1 ∪ Si Si ∪ Si+1 · · ·

Si−1 Si Si+1

QQQk

���3

QQQk

���3 Q

QQk���3

I Apply Hk to VR-complexes on each of these, get a diagram ofvector spaces of same shape

I If a family of homology classes “matches up” under inducedmaps, then they are stable across samples

Page 160: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Variants on Persistence: Zig-Zags

I Family of samples S1,S2, . . . ,Sk from point cloud data XI Construct new samples Si ∪ Si+1

I Fit together into a diagram

· · · Si−1 ∪ Si Si ∪ Si+1 · · ·

Si−1 Si Si+1

QQQk

���3

QQQk

���3 Q

QQk���3

I Apply Hk to VR-complexes on each of these, get a diagram ofvector spaces of same shape

I If a family of homology classes “matches up” under inducedmaps, then they are stable across samples

Page 161: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Variants on Persistence: Zig-Zags

I Family of samples S1,S2, . . . ,Sk from point cloud data XI Construct new samples Si ∪ Si+1

I Fit together into a diagram

· · · Si−1 ∪ Si Si ∪ Si+1 · · ·

Si−1 Si Si+1

QQQk

���3

QQQk

���3 Q

QQk���3

I Apply Hk to VR-complexes on each of these, get a diagram ofvector spaces of same shape

I If a family of homology classes “matches up” under inducedmaps, then they are stable across samples

Page 162: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Variants on Persistence: Zig-Zags

I Family of samples S1,S2, . . . ,Sk from point cloud data XI Construct new samples Si ∪ Si+1

I Fit together into a diagram

· · · Si−1 ∪ Si Si ∪ Si+1 · · ·

Si−1 Si Si+1

QQQk

���3

QQQk

���3 Q

QQk���3

I Apply Hk to VR-complexes on each of these, get a diagram ofvector spaces of same shape

I If a family of homology classes “matches up” under inducedmaps, then they are stable across samples

Page 163: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Variants on Persistence: Zig-Zags

I Family of samples S1,S2, . . . ,Sk from point cloud data XI Construct new samples Si ∪ Si+1

I Fit together into a diagram

· · · Si−1 ∪ Si Si ∪ Si+1 · · ·

Si−1 Si Si+1

QQQk

���3

QQQk

���3 Q

QQk���3

I Apply Hk to VR-complexes on each of these, get a diagram ofvector spaces of same shape

I If a family of homology classes “matches up” under inducedmaps, then they are stable across samples

Page 164: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Variants on Persistence: Zig-Zags

To carry out analysis, one needs a classification of diagrams ofvector spaces of shape of upper row. Second row is shape forordinary persistence.

Page 165: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Variants on Persistence: Zig-Zags

Classification exists, due to P. Gabriel

Every diagram of this shape has a decomposition into a direct sumof cyclic diagrams, i.e. diagrams which consist of either aone-dimensional or a zero dimensional vector space.

Can therefore parametrize isomorphism classes by barcodes, just asin the case of ordinary persistence.

Long intervals correspond to elements stable across samples, othersare artifacts.

Page 166: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Variants on Persistence: Zig-Zags

Classification exists, due to P. Gabriel

Every diagram of this shape has a decomposition into a direct sumof cyclic diagrams, i.e. diagrams which consist of either aone-dimensional or a zero dimensional vector space.

Can therefore parametrize isomorphism classes by barcodes, just asin the case of ordinary persistence.

Long intervals correspond to elements stable across samples, othersare artifacts.

Page 167: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Variants on Persistence: Zig-Zags

Classification exists, due to P. Gabriel

Every diagram of this shape has a decomposition into a direct sumof cyclic diagrams, i.e. diagrams which consist of either aone-dimensional or a zero dimensional vector space.

Can therefore parametrize isomorphism classes by barcodes, just asin the case of ordinary persistence.

Long intervals correspond to elements stable across samples, othersare artifacts.

Page 168: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Variants on Persistence: Zig-Zags

Classification exists, due to P. Gabriel

Every diagram of this shape has a decomposition into a direct sumof cyclic diagrams, i.e. diagrams which consist of either aone-dimensional or a zero dimensional vector space.

Can therefore parametrize isomorphism classes by barcodes, just asin the case of ordinary persistence.

Long intervals correspond to elements stable across samples, othersare artifacts.

Page 169: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Variants on Persistence: Zig-Zags

Results have value in other situations:

I Analysis of time varying data

I Analysis of behavior of data under varying choice of densityestimators

I Analysis of behavior of witness complexes under varyingchoices of landmarks

This analysis is relevant and interesting even in zero dimensionalcase, i.e. clustering.

Page 170: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Variants on Persistence: Zig-Zags

Results have value in other situations:

I Analysis of time varying data

I Analysis of behavior of data under varying choice of densityestimators

I Analysis of behavior of witness complexes under varyingchoices of landmarks

This analysis is relevant and interesting even in zero dimensionalcase, i.e. clustering.

Page 171: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Variants on Persistence: Zig-Zags

Results have value in other situations:

I Analysis of time varying data

I Analysis of behavior of data under varying choice of densityestimators

I Analysis of behavior of witness complexes under varyingchoices of landmarks

This analysis is relevant and interesting even in zero dimensionalcase, i.e. clustering.

Page 172: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Variants on Persistence: Zig-Zags

Results have value in other situations:

I Analysis of time varying data

I Analysis of behavior of data under varying choice of densityestimators

I Analysis of behavior of witness complexes under varyingchoices of landmarks

This analysis is relevant and interesting even in zero dimensionalcase, i.e. clustering.

Page 173: Topology and Data - Stanford Universityweb.stanford.edu/group/mmds/slides2008/carlsson.pdf · Topology and Data Gunnar Carlsson 1 ... June 27, 2008 1Research supported in part by

Variants on Persistence: Zig-Zags

Results have value in other situations:

I Analysis of time varying data

I Analysis of behavior of data under varying choice of densityestimators

I Analysis of behavior of witness complexes under varyingchoices of landmarks

This analysis is relevant and interesting even in zero dimensionalcase, i.e. clustering.