Introduction and Overview of the Foundation of Data Science 2019 Summer School at the Georgia Institute of Technology August 5, 2019 Xiaoming Huo, Georgia Tech


Introduction and Overview

of the Foundation of Data Science 2019 Summer School

at the Georgia Institute of Technology

August 5, 2019

Xiaoming Huo, Georgia Tech


Agenda

I. Overview of this summer school
II. Foundation of data science: Convex relaxation
III. Fast algorithms in statistical computing
IV. Conclusion

overview foundation case 1 case 2 theory simulations conclude


TRIAD: Transdisciplinary Research Institute for Advancing Data Science

triad.gatech.edu

August 5, 2019 TRIAD summer school at GeorgiaTech 3


Speakers

Arkadi Nemirovski


Mark Davenport

Polo Chau

Vladimir Koltchinskii

Yao Xie


Speakers (2)

• Arkadi Nemirovski: Convex optimization for statistical inference

• Mark Davenport: Sparse recovery, matrix completion, and applications

• Polo Chau: Data visualization

• Vladimir Koltchinskii: Probabilistic tools for high-dimensional statistics

• Xiaoming Huo: Overview of FDS summer school

• Yao Xie: Sequential data analysis and change detection

• (Industry Guest Speaker) Huan Yan: Data Science at Wells Fargo


Schedule

• Location: the ISyE Main Building, room 228

• Monday, August 5

Time | Event | Topic
9:30-10:30 | Lecture | Xiaoming Huo – ISyE: Introduction and overview of FDS summer school
10:30-11:00 | Break |
11:00-12:00 | Lecture | Mark Davenport – ECE: Sparse recovery, matrix completion, and applications
12:00-1:30 | Lunch |
1:30-2:30 | Lecture | Mark Davenport – ECE: Sparse recovery, matrix completion, and applications
2:30-3:00 | Break |
3:00-4:00 | Lecture | Vladimir I Koltchinskii – Math: Probabilistic tools for high-dimensional statistics
4:00-6:00 | Poster Session / Reception |
6:00 | Dinner |

• Visit our web site


Schedule (2)

• Tuesday, August 6

Time | Event | Topic
9:30-10:30 | Lecture | Arkadi S Nemirovski – ISyE: Convex optimization for statistical inference
10:30-11:00 | Break |
11:00-12:00 | Lecture | Arkadi S Nemirovski – ISyE
12:00-1:30 | Lunch |
1:30-2:30 | Lecture | Vladimir I Koltchinskii – Math: Probabilistic tools for high-dimensional statistics
2:30-3:00 | Break |
3:00-4:00 | Lecture | Vladimir I Koltchinskii – Math
4:00-5:00 | Lab |
6:00 | Dinner |

• Wednesday, August 7

Time | Event | Topic
9:30-10:30 | Lecture | Arkadi S Nemirovski – ISyE: Convex optimization for statistical inference
10:30-11 | Break |
11:00-12:00 | Lecture | Arkadi S Nemirovski – ISyE: Convex optimization for statistical inference
12:00-1: | Lunch |
1:00-1:30 | Lecture | Huan Yan – Wells Fargo: Industry Guest Speaker
1:30-2:30 | Lecture | Mark Davenport – ECE: Sparse recovery, matrix completion, and applications
2:30-3 | Break |
3:00-4:00 | Lecture | Mark Davenport – ECE: Sparse recovery, matrix completion, and appli…
4:00-5:00 | Tour of Georgia Tech IRIM (Institute for Robotics and Intelligent Machines) |
6:00 | Dinner |


Schedule (3)

• Thursday, August 8

Time | Event | Topic
9:30-10:30 | Lecture | Yao Xie – ISyE: Sequential data analysis and change detection
10:30-11:00 | Break |
11:00-12:00 | Lecture | Polo Chau – CSE: Data visualization
12:00-12:15 | Closing remarks | Yao Xie and Xiaoming Huo
12:15-1:30 | Lunch |


The Surrounding


Safety!!!

• http://ipat.gatech.edu/news/data-driven-policing


Agenda

I. Overview of this summer school
II. Foundation of data science: Convex relaxation
III. Fast algorithms in statistical computing
IV. Conclusion

Foundation of Data Science


Foundation of Data Science

Programming, databases

Foundation of Data Science

Statistics, mathematics, theoretical computer science (TCS)

Foundation of Data Science


Domain expertise


L1 relaxation in statistics

• Statistical inference: $X \Rightarrow Y$
• Modeling: $\hat{Y} = f(X) \approx Y$
• Regression: $Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p + e$
• Least squares: $\min_{\beta} \|Y - X\beta\|_2^2$
• Difficulty: $Y \in \mathbb{R}^{n}$, $X \in \mathbb{R}^{n \times p}$

Q: What if $p \gg n$?
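To make the question concrete, here is a minimal sketch, assuming NumPy and made-up random data (the sizes `n = 5`, `p = 20` and all variable names are illustrative, not from the slides). When $p > n$, least squares no longer pins down a unique $\beta$: many coefficient vectors fit $Y$ exactly.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen only for reproducibility
n, p = 5, 20                     # hypothetical sizes with p > n
X = rng.standard_normal((n, p))
Y = rng.standard_normal(n)

# Minimum-norm least-squares solution; with p > n it fits Y exactly.
beta_min, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Any null-space direction of X can be added without changing the fit.
_, _, Vt = np.linalg.svd(X)      # the last p - n rows of Vt span the null space
beta_other = beta_min + 3.0 * Vt[-1]

resid_min = np.linalg.norm(Y - X @ beta_min)
resid_other = np.linalg.norm(Y - X @ beta_other)
print(resid_min, resid_other)    # both are numerically zero
```

Two different coefficient vectors with the same zero residual is exactly why some extra criterion, such as sparsity, is needed to choose among them.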


L1 relaxation in statistics (2)

• Statistical theory (60’s, 70’s…): AIC, BIC, …
• Subset $\Omega \subset \{1, 2, \ldots, p\}$
• E.g., $\mathrm{AIC}(\Omega) = \|Y - X\beta\|_2^2 + C\,|\Omega|$
• Problem: there are $2^p$ of $\Omega$’s

$p$ | $2^p$ (approx.)
10 | $10^3$
20 | $10^6$
30 | $10^9$ = 1 billion
60 | $10^{18}$ = impossible
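The growth in the table can be checked directly. The sketch below uses only the Python standard library; the helper name `all_subsets` is hypothetical. It enumerates every candidate subset for a small $p$ and prints the exact counts for the values of $p$ listed above.

```python
from itertools import combinations

def all_subsets(p):
    """Yield every subset of {1, ..., p} as a tuple, smallest subsets first."""
    indices = range(1, p + 1)
    for size in range(p + 1):
        yield from combinations(indices, size)

# Exhaustive enumeration is fine for a handful of variables.
subsets_p4 = list(all_subsets(4))
print(len(subsets_p4))   # 16, i.e. 2**4

# The count doubles with every added variable.
for p in (10, 20, 30, 60):
    print(p, 2 ** p)
```

Scoring each subset with a criterion such as AIC would need one model fit per subset, so the number of fits grows at the same rate.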


L1 relaxation in statistics (3)

• 1996, Lasso, Basis Pursuit, …
• $\min_{\beta} \|Y - X\beta\|_2^2 + \lambda \|\beta\|_1$
• Where $\|\beta\|_1 = \sum_{j=1}^{p} |\beta_j|$
• Convex relaxation
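A minimal sketch of how the $\ell_1$-penalized problem can be solved numerically, assuming a proximal-gradient (ISTA) iteration. The solver choice, the function names, and the toy data are illustrative assumptions and do not come from the slides.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (componentwise shrinkage toward zero)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_objective(X, Y, beta, lam):
    """Value of ||Y - X beta||_2^2 + lam * ||beta||_1."""
    return np.sum((Y - X @ beta) ** 2) + lam * np.sum(np.abs(beta))

def lasso_ista(X, Y, lam, n_iter=2000):
    """Minimize the Lasso objective with ISTA (illustrative, not tuned)."""
    # Lipschitz constant of the gradient of ||Y - X beta||_2^2.
    lipschitz = 2.0 * np.linalg.norm(X, 2) ** 2
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ beta - Y)
        beta = soft_threshold(beta - grad / lipschitz, lam / lipschitz)
    return beta

# Toy problem with p > n and a sparse true coefficient vector (made-up data).
rng = np.random.default_rng(0)
n, p = 30, 100
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -3.0, 1.5]
Y = X @ beta_true

beta_hat = lasso_ista(X, Y, lam=1.0)
print(np.count_nonzero(beta_hat), "nonzero coefficients out of", p)
```

A quick sanity check on the sketch: when $X$ is the identity matrix, one iteration returns the soft-thresholded $Y$ with threshold $\lambda/2$, which is the known closed-form solution in that case.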


L1 relaxation in statistics (4)

• Convex ⇒ polynomial time algorithm (vs NP hard)
• Linear programming
• Many existing good solvers
• 2001, Donoho and Huo, IEEE Transactions on Information Theory
• Under certain conditions, the relaxation delivers the same solution as the original formulation (which is potentially NP-hard).
• Compressive sensing…
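The link to linear programming can be made explicit. As a sketch, assume the noiseless basis pursuit form with an equality constraint (the slide does not spell this form out), and introduce hypothetical auxiliary variables $u, v \ge 0$ with $\beta = u - v$:

```latex
\begin{aligned}
\min_{\beta \in \mathbb{R}^p} \ \|\beta\|_1 \quad \text{s.t. } X\beta = Y
\qquad \Longleftrightarrow \qquad
\min_{u, v \in \mathbb{R}^p} \ \mathbf{1}^\top (u + v) \quad \text{s.t. } X(u - v) = Y,\ u \ge 0,\ v \ge 0 .
\end{aligned}
```

At an optimum $u_j v_j = 0$ for every $j$, so $u_j + v_j = |\beta_j|$; the objective and constraints on the right are all linear, which is what lets standard LP solvers be used.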


Foundation of Data Science

• Foundation is powerful and critical
• Interdisciplinary approach (borrow the strengths)
• Multi-stage → coherent


Agenda

I. Overview of this summer school
II. Foundation of data science: Convex relaxation
III. Fast algorithms in statistical computing
IV. Conclusion

Distance Covariance

• Let’s take a look at a statistical example…


Linear Dependency (Pearson’s)

Pearson’s linear correlation coefficient:

$\mathrm{Corr}(X, Y) = \dfrac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$

Karl Pearson (1895)

[Figure: five scatter plots of Y against X, with Corr = 1.0, 1.0, 0.8, 0.4, and 0.0]


Nonlinearity

Dependency could be complicated

[Figure: scatter plot of Y against X]


Pearson’s corr. not effective

[Figure: nine scatter plots labeled with their Pearson correlations; apart from two panels (0.8 and 1.0), every label is within 0.1 of zero]


How to measure statistical dependence?

Independence: $f(x, y) = f(x) f(y)$
◦ The joint density is the product of the two marginal densities

Hope:
◦ X and Y are independent if and only if corr(X,Y) = 0
◦ If $X = c_1 \cdot Y + c_2$, then corr(X,Y) = 1

Pearson’s correlation coefficient is not effective
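A small worked example of why the first "hope" fails for Pearson’s coefficient. This is a sketch in plain Python; the helper `pearson` and the five data points are made up for illustration. A linear relationship gives correlation 1, while $Y = X^2$ on symmetric data gives correlation 0 even though $Y$ is a deterministic function of $X$.

```python
from math import sqrt

def pearson(x, y):
    """Sample Pearson correlation; the 1/n factors cancel in the ratio."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y_linear = [3.0 * a + 1.0 for a in x]   # exact linear dependence
y_square = [a * a for a in x]           # exact nonlinear dependence

print(pearson(x, y_linear))   # 1.0
print(pearson(x, y_square))   # 0.0
```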


Distance covariance

Gabor J. Szekely, 2005, 2007 (AoS), 2009 (AoAS), 2012 (SPL), 2014 (AoS)

Distance covariance (population version):

$\mathcal{V}^2(X, Y) = \left\| \phi_{X,Y}(t, s) - \phi_X(t)\,\phi_Y(s) \right\|_w^2 := \int_{\mathbb{R}^{p+q}} \left| \phi_{X,Y}(t, s) - \phi_X(t)\,\phi_Y(s) \right|^2 w(t, s)\, dt\, ds$

where $\phi_{X,Y}$, $\phi_X$, and $\phi_Y$ are characteristic functions.

Weight $w(t, s) = \left( |t|_p^{1+p}\, |s|_q^{1+q} \right)^{-1}$ to ensure the above integral is well defined…


Sample Distance Covariance

Pairwise distances: $a_{ij} = |X_i - X_j|$, $1 \le i, j \le n$

Similarly, $b_{ij} = |Y_i - Y_j|$

Centered matrix:

$A_{ij} = \begin{cases} a_{ij} - \dfrac{\sum_{\ell=1}^{n} a_{i\ell}}{n-2} - \dfrac{\sum_{k=1}^{n} a_{kj}}{n-2} + \dfrac{\sum_{k,\ell=1}^{n} a_{k\ell}}{(n-1)(n-2)}, & i \ne j; \\ 0, & i = j \end{cases}$

Similarly, $B_{ij}$.

An unbiased estimator of $\mathcal{V}^2(X, Y)$:

$(A \cdot B) = \dfrac{\sum_{i \ne j} A_{ij} B_{ij}}{n(n-3)}$
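The formulas on this slide translate almost line by line into code. The following is a direct $O(n^2)$ sketch for univariate samples, assuming NumPy; the function names `u_center` and `dcov2_unbiased` and the simulated data are illustrative, and this is not the fast algorithm discussed later.

```python
import numpy as np

def u_center(d):
    """Apply the centering on this slide to a pairwise distance matrix d."""
    n = d.shape[0]
    row = d.sum(axis=1, keepdims=True) / (n - 2)    # sum over l of a_il
    col = d.sum(axis=0, keepdims=True) / (n - 2)    # sum over k of a_kj
    total = d.sum() / ((n - 1) * (n - 2))           # sum over k, l of a_kl
    centered = d - row - col + total
    np.fill_diagonal(centered, 0.0)                 # A_ii = 0 by definition
    return centered

def dcov2_unbiased(x, y):
    """Unbiased estimator of squared distance covariance for 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    if n <= 3:
        raise ValueError("the estimator needs n > 3")
    a = np.abs(x[:, None] - x[None, :])             # a_ij = |X_i - X_j|
    b = np.abs(y[:, None] - y[None, :])             # b_ij = |Y_i - Y_j|
    A, B = u_center(a), u_center(b)
    # Diagonals are zero, so the full sum equals the sum over i != j.
    return (A * B).sum() / (n * (n - 3))

rng = np.random.default_rng(0)                      # made-up data
x = rng.standard_normal(200)
dependent = dcov2_unbiased(x, x ** 2)               # Y is a function of X
independent = dcov2_unbiased(x, rng.standard_normal(200))
print(dependent, independent)
```

Because the estimator is unbiased rather than nonnegative, it can come out slightly negative, especially for very small samples or independent data.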


Comparison

Comparison between dependence measures

Name of Coeff. | Comp. cost
Pearson’s $\rho$ | $n$
Spearman’s $\rho$ | $n \log n$
Kendall’s $\tau$ | $n \log n$
CCA | $n$
KCCA | $n^3$
ACE | $n$
MIC | $2^n$
MMD | $n^2$
CMMD | $n^2$
RDC | $n \log n$
dCor | $n^2 \to n \log n$


Main ideas towards an O(n log n) algorithm

We designed a dyadic updating scheme to compute

$\sum_{i \ne j} a_{ij} b_{ij} = \sum_{i \ne j} |x_i - x_j| \cdot |y_i - y_j|$

An O(n log n) algorithm


Fast Method (3)

For each $i$, we need to compute

$\sum_{j:\, j < i,\ y_j < y_i} c_j$

A dyadic partitioning/updating scheme:

[Figure: dyadic partitioning diagram with the indices i and j marked]
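One way to get a feel for how such partial sums can be maintained in $O(n \log n)$ total time is a Fenwick tree (binary indexed tree). This is a standard data structure swapped in here for illustration; it is similar in spirit because it also stores sums over dyadic blocks, but it is not claimed to be the scheme on the slide. All names are hypothetical.

```python
from bisect import bisect_left

def sums_over_smaller(y, c):
    """For each i, return the sum of c[j] over j < i with y[j] < y[i]."""
    sorted_vals = sorted(set(y))
    n_ranks = len(sorted_vals)
    tree = [0.0] * (n_ranks + 1)          # 1-based Fenwick tree over ranks of y

    def add(pos, value):
        # Add value at rank pos; touches O(log n) dyadic blocks.
        while pos <= n_ranks:
            tree[pos] += value
            pos += pos & -pos

    def query(pos):
        # Sum of all values stored at ranks 1..pos.
        total = 0.0
        while pos > 0:
            total += tree[pos]
            pos -= pos & -pos
        return total

    out = []
    for y_i, c_i in zip(y, c):
        rank = bisect_left(sorted_vals, y_i) + 1   # 1-based rank of y[i]
        out.append(query(rank - 1))                # strictly smaller y only
        add(rank, c_i)                             # make c[i] visible to later i
    return out

def sums_brute_force(y, c):
    """Quadratic-time reference implementation of the same sums."""
    return [sum(c[j] for j in range(i) if y[j] < y[i]) for i in range(len(y))]

y = [3, 1, 4, 1, 5]
c = [1.0, 2.0, 3.0, 4.0, 5.0]
print(sums_over_smaller(y, c))   # [0.0, 0.0, 3.0, 0.0, 10.0]
```

Each index costs one query and one update, both logarithmic, so the whole pass is $O(n \log n)$ after sorting.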


Several sets of (x, y) points, with the Pearson correlation coefficient


Several sets of points, with the distance correlation coefficient


Main message(s)

● Foundational research is key to data science
● Computing, statistics, theoretical computer science, math, … play important roles ⟹ Interdisciplinary
● New paradigm in data science technologies
● This summer school: Foundation of Data Science – future activities…

Thank you! Email: [email protected]


Acknowledgment

• Kathy Huggins
