![Page 1: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/1.jpg)
CS 6965, Fall 2019
Prof. Bei Wang Phillips, University of Utah
Advanced Data Visualization
Lecture 04
![Page 2: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/2.jpg)
Beyond PCA & t-SNE: Visual Interactions with DR
HD
![Page 3: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/3.jpg)
Visual Interactions with DR
[SachaZhangSedlmair2017]
![Page 4: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/4.jpg)
Visual Interactions with DR
[SachaZhangSedlmair2017]
![Page 5: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/5.jpg)
Visual Interactions with DR
[SachaZhangSedlmair2017]
![Page 6: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/6.jpg)
7 Guiding Scenarios for DR Interaction
[SachaZhangSedlmair2017]
![Page 7: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/7.jpg)
Visual Interactions with DR
[SachaZhangSedlmair2017]
![Page 8: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/8.jpg)
iPCA and DR Interaction
[SachaZhangSedlmair2017]
![Page 9: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/9.jpg)
Some more math on PCA and t-SNE
See the whiteboard for more math
![Page 10: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/10.jpg)
t-SNE Revisited
[vanderMaatenHinton2008, vanderMaaten2014, CaoWang2017]
![Page 11: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/11.jpg)
SNE/t-SNE Revisited
[vanderMaatenHinton2008, vanderMaaten2014, CaoWang2017]
![Page 12: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/12.jpg)
SNE/t-SNE Revisited
https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html#sklearn.manifold.TSNE
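As a hedged illustration of what the perplexity parameter controls, the sketch below uses the standard definition from the t-SNE literature: the perplexity of the conditional distribution P_i is 2 raised to its Shannon entropy, and the Gaussian bandwidth sigma_i is chosen per point so that this value matches a user-set target. The function names, the search bounds, and the toy distances are assumptions made for this example; this is not the scikit-learn implementation.

```python
import math

def conditional_probs(sq_dists, sigma):
    """p_{j|i} for one point i, from its squared distances to the other points."""
    d_min = min(sq_dists)  # shifting by the minimum avoids underflow; it cancels on normalization
    weights = [math.exp(-(d - d_min) / (2.0 * sigma ** 2)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]

def perplexity(probs):
    """Perp(P_i) = 2 ** H(P_i), with H the Shannon entropy in bits."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
    return 2.0 ** entropy

def sigma_for_perplexity(sq_dists, target, lo=1e-3, hi=1e3, iters=100):
    """Bisection on a log scale; perplexity grows as sigma grows."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if perplexity(conditional_probs(sq_dists, mid)) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

# Hypothetical squared distances from one point to five neighbors.
sq_dists = [1.0, 4.0, 9.0, 16.0, 25.0]
sigma = sigma_for_perplexity(sq_dists, target=3.0)
achieved = perplexity(conditional_probs(sq_dists, sigma))
```

Read this way, perplexity acts as a smooth count of effective neighbors: a uniform distribution over four neighbors has perplexity 4.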
![Page 13: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/13.jpg)
Perplexity auto-tuning
[CaoWang2017]
![Page 14: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/14.jpg)
Further reading
[NguyenHolmes2019]
![Page 15: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/15.jpg)
Mapper, Clustering & Beyond
HD
![Page 16: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/16.jpg)
The Mapper Algorithm: History and Overview
A tool for high-dimensional data analysis and visualization
![Page 17: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/17.jpg)
History of the Mapper algorithm
At the core of at least two data analysis startups:
Ayasdi: topological data analysis, machine learning and visualization
https://www.ayasdi.com/
Alpine Data: (topological) data analysis at scale, http://alpinedata.com/
![Page 18: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/18.jpg)
Ayasdi
https://www.youtube.com/watch?v=XfWibrh6stw
![Page 19: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/19.jpg)
Ayasdi: Fraud detection
https://www.youtube.com/watch?v=L8o4an5nh4E
![Page 20: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/20.jpg)
Ayasdi: Patient Stratification
https://www.youtube.com/watch?v=FmfIJ3-UuaI
![Page 21: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/21.jpg)
Alpine Data
https://databricks.com/session/enterprise-scale-topological-data-analysis-using-spark
![Page 22: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/22.jpg)
Mapper Algorithm and Visualization
A qualitative understanding of high-dimensional point cloud data through direct visualization
Combining DR with graph visualization
Desirable properties of visualization for high-dimensional data:
Insensitive to metric (approximation to similarity): robust to small changes to the metric
Understanding sensitivity to parameter changes: provide a useful summary of behavior under all choices of parameters (exploratory)
Multi-scale representation: views at various levels of resolution, for comparison
“Features which are seen at multiple scales will be viewed as more likely to be actual features as opposed to more transient features which could be viewed as artifacts of the imaging method.”
[Carlsson2009]
![Page 23: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/23.jpg)
Covering a circle by sets
[Carlsson2009]
X
The abstraction of set relations based on overlaps of sets
![Page 24: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/24.jpg)
Covering a circle by sets
[Carlsson2009]
X
The abstraction of set relations based on overlaps among connected components of the sets (the nerve, roughly speaking)
![Page 25: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/25.jpg)
Covering a circle by sets
![Page 26: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/26.jpg)
Point cloud data: soft clustering
![Page 27: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/27.jpg)
Change of scale
![Page 28: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/28.jpg)
Covering of a point cloud
Given point cloud data and a covering…
Taking the nerve of the covering can sometimes capture the shape of the data at the right scale(s)
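The nerve construction can be sketched in a few lines. In this minimal example, each set of the cover becomes a vertex, two vertices are joined by an edge when their sets share a point, and three vertices span a triangle when all three sets share a point. The 12 labeled points and the three arcs are hypothetical stand-ins for samples around a circle.

```python
from itertools import combinations

def nerve(cover, max_dim=2):
    """Simplices of the nerve up to max_dim: index tuples whose sets intersect."""
    simplices = []
    for size in range(1, max_dim + 2):
        for idx in combinations(range(len(cover)), size):
            if set.intersection(*(cover[i] for i in idx)):
                simplices.append(idx)
    return simplices

# Hypothetical data: points 0..11 around a circle, covered by three overlapping arcs.
arcs = [set(range(0, 5)), set(range(4, 9)), {8, 9, 10, 11, 0}]
simplices = nerve(arcs)
edges = [s for s in simplices if len(s) == 2]
triangles = [s for s in simplices if len(s) == 3]
```

Here the three arcs overlap pairwise but have no common point, so the nerve is a hollow triangle: a loop, matching the shape of the circle.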
![Page 29: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/29.jpg)
Mapper Algorithm at a Glance
Qualitative analysis, simplification and visualization of high-dimensional data sets and functions on these data sets:
Data summarization/skeletonization: extracting simple descriptions of high-dimensional data sets in the form of simplicial complexes or graphs
Function-induced clustering: partial clustering of the data guided by a set of functions defined on the data
Flexibility: any clustering algorithm may be used with Mapper
Exploratory and multi-scale: explore parameters at all scales if possible
[SinghMemoliCarlsson2007]
![Page 30: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/30.jpg)
Mapper Algorithm: Core
A main algorithm
![Page 31: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/31.jpg)
Mapper I/O, implementation
Input:
Point cloud data, distance metric on the point cloud
Functions on the point cloud: filter function/lens
Output:
(Interactive) visualization of a summary of the data as a graph or a simplicial complex based on function-induced clustering
Potentially interface with statistics and machine learning algorithms
Parameters:
Parameters related to the chosen clustering algorithms
Filter functions
Number of intervals
Amount of interval overlap
Color functions, etc.
[SinghMemoliCarlsson2007]
![Page 32: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/32.jpg)
Mapper Algorithm by example
[SinghMemoliCarlsson2007]
1. Input: a point cloud with a filter function, e.g., a height function. Also assume that there is a distance (metric) defined between any two points in the point cloud.
2. Cover the range of the function with intervals, using the number of intervals and the percentage of overlap as parameters. E.g., # of intervals = 5, overlap = 25%.
![Page 33: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/33.jpg)
[SinghMemoliCarlsson2007]
3. Look at the points in the domain that fall into each interval (i.e., follow the inverse map), and apply clustering to these points.
4. Obtain the nerve of all clusters (a covering) in the domain. E.g., here it is a graph representation that summarizes the data.
Such a graph can interface with machine learning and interactive visualization…
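Steps 1 to 4 can be sketched end to end in plain Python. This is a minimal illustration under stated assumptions, not the reference implementation from [SinghMemoliCarlsson2007]: the overlap convention in `interval_cover`, the threshold clustering with parameter `eps`, and the 24-point circle with a height filter are all choices made for this example.

```python
import math
from itertools import combinations

def interval_cover(lo, hi, k, p):
    """k equal-length intervals over [lo, hi]; neighbors overlap by a fraction p."""
    length = (hi - lo) / (1 + (k - 1) * (1 - p))
    step = length * (1 - p)
    cover = [(lo + i * step, lo + i * step + length) for i in range(k)]
    cover[-1] = (cover[-1][0], hi)  # guard against round-off at the top end
    return cover

def threshold_clusters(indices, points, eps):
    """Connected components of the graph linking points at distance <= eps."""
    unvisited = set(indices)
    clusters = []
    while unvisited:
        stack = [unvisited.pop()]
        component = set(stack)
        while stack:
            i = stack.pop()
            near = {j for j in unvisited if math.dist(points[i], points[j]) <= eps}
            unvisited -= near
            component |= near
            stack.extend(near)
        clusters.append(component)
    return clusters

def mapper_graph(points, filter_values, k, p, eps):
    nodes = []
    # Step 2: cover the range of the filter with overlapping intervals.
    for a, b in interval_cover(min(filter_values), max(filter_values), k, p):
        # Step 3: cluster the points whose filter value falls in the interval.
        preimage = [i for i, v in enumerate(filter_values) if a <= v <= b]
        nodes.extend(threshold_clusters(preimage, points, eps))
    # Step 4: nerve (1-skeleton): connect clusters that share points.
    edges = [(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]]
    return nodes, edges

# Step 1 (hypothetical input): 24 points on a unit circle, height as the filter.
angles = [2 * math.pi * (i + 0.5) / 24 for i in range(24)]
circle = [(math.cos(t), math.sin(t)) for t in angles]
height = [y for _, y in circle]
nodes, edges = mapper_graph(circle, height, k=3, p=0.5, eps=0.3)
```

With these settings the bottom and top intervals each yield one cluster and the middle interval yields two (left and right arcs), so the resulting graph has four nodes joined in a single loop.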
![Page 34: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/34.jpg)
Clustering
[SinghMemoliCarlsson2007]
Almost any clustering algorithm can be used
Assume there is a notion of distance (metric) between a pair of points in the data domain (distance can be computed or provided)
Clustering is equivalent to a notion of connected component in the point cloud setting
Commonly used clustering algorithms:
Density-based spatial clustering of applications with noise (DBSCAN)
Single-linkage clustering
K-means, etc.
Desirable properties:
Not restricted to Euclidean distance; can take distance matrix input
Do not require specifying the number of clusters beforehand
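To make the "connected component" view concrete, here is a minimal sketch of single-linkage clustering cut at a fixed distance threshold. It takes a distance matrix as input and does not need the number of clusters in advance, matching the two desirable properties above. The threshold `eps` and the toy 1D points are assumptions for this example.

```python
def threshold_clusters(dist, eps):
    """Clusters = connected components of the graph with an edge when dist <= eps."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= eps:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical data: five points on a line, given only through their distance matrix.
xs = [0.0, 1.0, 2.0, 10.0, 11.0]
dist = [[abs(a - b) for b in xs] for a in xs]
clusters = threshold_clusters(dist, eps=1.5)
```

Any metric can be substituted, since only the matrix `dist` is used.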
![Page 35: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/35.jpg)
Parameters for the covering
Number of intervals: k
Increasing k will increase the # of clusters we observe
May create more empty clusters (small number of points per cluster)
Finer features of the data
If density varies, pick up clusters with high density
Percentage of overlap: p
Increasing p will increase the connectivity among the clusters
Sometimes robust in dealing with noise
[SinghMemoliCarlsson2007]
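The two covering parameters can be turned into explicit intervals as follows. This is a sketch under one possible convention, assumed here: all k intervals have equal length, and p is the fraction of an interval shared with its neighbor. Implementations differ on how overlap is defined, so the numbers may not match a given library.

```python
def interval_cover(lo, hi, k, p):
    """Cover [lo, hi] with k equal-length intervals; neighbors overlap by a fraction p."""
    if k < 1 or not 0.0 <= p < 1.0:
        raise ValueError("need k >= 1 and 0 <= p < 1")
    # k intervals of this length, shifted by `step`, exactly span [lo, hi].
    length = (hi - lo) / (1 + (k - 1) * (1 - p))
    step = length * (1 - p)
    return [(lo + i * step, lo + i * step + length) for i in range(k)]

# The example from the slides: 5 intervals with 25% overlap, here on the range [0, 1].
cover = interval_cover(0.0, 1.0, k=5, p=0.25)
```

Under this convention each interval has length 0.25 and consecutive intervals start 0.1875 apart. Raising p lengthens the intervals, so more points fall into two intervals at once and more clusters end up connected.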
![Page 36: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/36.jpg)
Filter functions
A filter function can be given a priori, e.g., car purchasing price
It can also be derived from the properties of the point cloud itself:
Density estimation
Eccentricity
Distance to a point in the data
Graph Laplacians
[SinghMemoliCarlsson2007]
![Page 37: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/37.jpg)
Filter functions
[SinghMemoliCarlsson2007]
Density:
Eccentricity
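The formulas on this slide were figures and did not survive in the transcript. As a hedged reconstruction, the sketch below uses the forms usually attributed to [SinghMemoliCarlsson2007]: a Gaussian kernel density, f(x) proportional to the sum over y of exp(-d(x, y)^2 / eps), and the eccentricity E_p(x) = (sum over y of d(x, y)^p / N)^(1/p). The normalizing constant of the density is omitted, and the toy points are assumptions for this example.

```python
import math

def density(dist, eps):
    """Unnormalized Gaussian kernel density: sum_y exp(-d(x, y)^2 / eps)."""
    return [sum(math.exp(-d * d / eps) for d in row) for row in dist]

def eccentricity(dist, p=1.0):
    """E_p(x) = (sum_y d(x, y)^p / N)^(1/p); p = inf gives the largest distance."""
    if math.isinf(p):
        return [max(row) for row in dist]
    n = len(dist)
    return [(sum(d ** p for d in row) / n) ** (1.0 / p) for row in dist]

# Hypothetical data: four points on a line, one of them far from the rest.
xs = [0.0, 1.0, 2.0, 10.0]
dist = [[abs(a - b) for b in xs] for a in xs]
dens = density(dist, eps=1.0)
ecc = eccentricity(dist, p=1.0)
```

The isolated point at 10 gets the lowest density and the highest eccentricity, which is the intuition behind both filters: density highlights the core of the data, eccentricity highlights points far from the center.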
![Page 38: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/38.jpg)
1D vs 2D Mapper
[SinghMemoliCarlsson2007]
1D Mapper: a single filter function
2D Mapper: 2 filter functions
The covering of the range of the functions is no longer by intervals
Instead, it is by rectangles or other geometric shapes, etc.
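One simple way to obtain a rectangle cover is to take the product of two interval covers, one per filter function. The sketch below assumes that construction and the same overlap convention for both axes; the names `rectangle_cover` and `points_in_rectangle` are hypothetical helpers for this example.

```python
from itertools import product

def interval_cover(lo, hi, k, p):
    """k equal-length intervals over [lo, hi]; neighbors overlap by a fraction p."""
    length = (hi - lo) / (1 + (k - 1) * (1 - p))
    step = length * (1 - p)
    return [(lo + i * step, lo + i * step + length) for i in range(k)]

def rectangle_cover(f_range, g_range, k, p):
    """All k * k rectangles formed by pairing an f-interval with a g-interval."""
    return list(product(interval_cover(*f_range, k, p),
                        interval_cover(*g_range, k, p)))

def points_in_rectangle(f_vals, g_vals, rect):
    """Indices of points whose pair of filter values (f, g) lies in the rectangle."""
    (f_lo, f_hi), (g_lo, g_hi) = rect
    return [i for i, (f, g) in enumerate(zip(f_vals, g_vals))
            if f_lo <= f <= f_hi and g_lo <= g <= g_hi]

rects = rectangle_cover((0.0, 1.0), (0.0, 1.0), k=3, p=0.5)
# A single hypothetical point with filter values f = 0.3, g = 0.3.
hits = [r for r in rects if points_in_rectangle([0.3], [0.3], r)]
```

Because a point can now lie in up to four overlapping rectangles, the nerve can contain triangles and tetrahedra, not only edges.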
![Page 39: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/39.jpg)
2D Mapper
f
g
[SinghMemoliCarlsson2007]
Count the number of connected components per “2D interval” (square in the range)
https://en.wikipedia.org/wiki/Sphere
![Page 40: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/40.jpg)
Mapper Algorithm: Applications
A few examples
![Page 41: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/41.jpg)
Shape skeletonization & classification
[SinghMemoliCarlsson2007]
Also see Kepler Mapper demo examples: cat, lion, horse…
![Page 42: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/42.jpg)
Breast Cancer dataset
[NicolauLevineCarlsson2011]
![Page 43: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/43.jpg)
Breast Cancer dataset 2
[LumSinghLehman2012]
![Page 44: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/44.jpg)
Politics
[LumSinghLehman2012]
![Page 45: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/45.jpg)
Sports
[LumSinghLehman2012]
![Page 46: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/46.jpg)
Discussions
Future directions…
![Page 47: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/47.jpg)
Limitations of Mapper
How to choose the stable range of parameters (k, p)
How to choose the clustering algorithms
How to choose the filter functions
How to obtain insights with the right color function…
![Page 48: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/48.jpg)
Future directions
Better automatic parameter tuning
Multi-scale Mapper
2D Mapper: theoretical understanding
What are the other possible variations? (Discussion)
![Page 49: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/49.jpg)
KeplerMapper
A Demo
![Page 50: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/50.jpg)
Open-source implementations
Python Mapper: http://danifold.net/mapper/index.html
R implementation (TDAmapper): https://cran.r-project.org/web/packages/TDAmapper/index.html
Spark Mapper: https://github.com/log0ymxm/spark-mapper
![Page 51: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/51.jpg)
Kepler-Mapper Demo
The example with one circle
The example with two circles
Digits example
Breast cancer example
![Page 52: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/52.jpg)
Project 1 Explained
A simple, first application of HD analysis
![Page 53: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/53.jpg)
Limitations of KeplerMapper
KeplerMapper is still under active development:
The visualization capabilities are limited (one color function)
There is not much interactive visualization
No integration with other machine learning algorithms
![Page 54: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/54.jpg)
Project 1 tips
The questions on KeplerMapper are rather fundamental; the goal is for you to understand the inner workings of the code. Try to read and understand kmapper.py as much as possible
Start your project early; start it today
Read the paper
![Page 55: Advanced Data Visualizationbeiwang/teaching/cs6965-fall-2019/Lecture04-Mapper.pdfClustering is equivalent to a notion of connected component in the point cloud setting Commonly used](https://reader033.vdocument.in/reader033/viewer/2022043020/5f3c7a09822e9e55bf6c046e/html5/thumbnails/55.jpg)
Final Project Idea
A non-trivial extension to an open-source Mapper implementation
E.g., enhance the visualization capabilities of KeplerMapper
Scalable solution using Spark Mapper
A non-trivial application of Mapper algorithms to a real-world data set
Needs to solve or give insight into a real-world problem