Data-Powered Algorithms

Bernard Chazelle, Princeton University

Post on 18-Jan-2016


Page 1: Data-Powered Algorithms

Data-Powered Algorithms

Bernard Chazelle

Princeton University

Page 2: Data-Powered Algorithms
Page 3: Data-Powered Algorithms

Linear Programming

Page 4: Data-Powered Algorithms
Page 5: Data-Powered Algorithms
Page 6: Data-Powered Algorithms
Page 7: Data-Powered Algorithms
Page 8: Data-Powered Algorithms
Page 9: Data-Powered Algorithms
Page 10: Data-Powered Algorithms
Page 11: Data-Powered Algorithms
Page 12: Data-Powered Algorithms

N constraints and d variables

Page 13: Data-Powered Algorithms

N constraints and d variables
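For reference, a linear program with N constraints and d variables can be written in the standard form sketched here. The symbols $A$, $b$, $w$, and $x$ are notation assumed for this illustration and do not appear on the slides.

```latex
\[
\max_{x \in \mathbb{R}^{d}} \; w^{\top} x
\quad \text{subject to} \quad
A x \le b,
\qquad
A \in \mathbb{R}^{N \times d},\;
b \in \mathbb{R}^{N},\;
w \in \mathbb{R}^{d}.
\]
```

Each of the N rows of $A$ is one constraint, and each of the d coordinates of $x$ is one variable.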

Page 14: Data-Powered Algorithms

Dimension Reduction

10000 dimensions reduced to 25

Images (face recognition)

Signals (voice recognition)

Text (NLP)

. . .

Nearest neighbor searching

Clustering

. . .

Page 15: Data-Powered Algorithms

Dimension reduction

All pairwise distances nearly preserved
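In the usual statement of the Johnson-Lindenstrauss lemma, "nearly preserved" can be made precise as follows. This is the textbook formulation, given as background rather than quoted from the slide; $P$ (a set of $n$ points in $\mathbb{R}^d$), $\Phi$ (the linear map), $k$ (the target dimension), and $\varepsilon$ (the allowed distortion) are notation assumed here.

```latex
\[
(1-\varepsilon)\,\|u-v\|_2^2
\;\le\;
\|\Phi u-\Phi v\|_2^2
\;\le\;
(1+\varepsilon)\,\|u-v\|_2^2
\qquad \text{for all } u, v \in P,
\]
\[
\Phi : \mathbb{R}^{d} \to \mathbb{R}^{k},
\qquad
k = c \,\frac{\log n}{\varepsilon^{2}} .
\]
```

The target dimension $k$ depends on the number of points and the distortion, but not on the original dimension $d$.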

Page 16: Data-Powered Algorithms

Johnson-Lindenstrauss Transform (JLT)

Random orthogonal matrix with $c \log n / \varepsilon^2$ rows and $d$ columns

Applied to a vector $v$ of dimension $d$

Page 17: Data-Powered Algorithms

Friendly JLT

Matrix with $c \log n / \varepsilon^2$ rows and $d$ columns

Every entry drawn from N(0,1)
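A minimal sketch of the Gaussian variant, using only the Python standard library. The sizes `d`, `k`, `n`, the seed, and the helper names are hypothetical choices for illustration, and the `1/sqrt(k)` scaling is the usual normalization that makes squared lengths correct in expectation; none of these come from the slide.

```python
import math
import random


def gaussian_matrix(d, k, rng):
    """Return a k x d matrix of N(0,1) entries, scaled by 1/sqrt(k)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def apply_matrix(matrix, v):
    """Plain matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]


def dist(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


rng = random.Random(0)
d, k, n = 200, 100, 8  # hypothetical sizes
points = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
phi = gaussian_matrix(d, k, rng)
images = [apply_matrix(phi, p) for p in points]

# Ratio of projected distance to original distance, for every pair.
ratios = [
    dist(images[i], images[j]) / dist(points[i], points[j])
    for i in range(n)
    for j in range(i + 1, n)
]
print(min(ratios), max(ratios))
```

All ratios should land close to 1, and they tighten as `k` grows.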

Page 18: Data-Powered Algorithms

Friendlier JLT

Matrix with $c \log n / \varepsilon^2$ rows and $d$ columns

Every entry is +1 or -1

Cost per vector = $d \log n / \varepsilon^2$
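The +1/-1 variant can be sketched the same way. The point of the illustration is that each output coordinate needs only additions and subtractions plus one final scaling, while the work is still proportional to the number of matrix entries. The sizes, seed, and function names below are assumptions made for this example.

```python
import math
import random

rng = random.Random(1)
d, k = 256, 128  # hypothetical sizes

# k x d matrix of independent +1/-1 entries.
signs = [[rng.choice((-1, 1)) for _ in range(d)] for _ in range(k)]


def project(v):
    """Map v to k dimensions: signed sums, then one scaling by 1/sqrt(k)."""
    return [
        sum(x if s > 0 else -x for s, x in zip(row, v)) / math.sqrt(k)
        for row in signs
    ]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


v = [rng.uniform(-1, 1) for _ in range(d)]
ratio = norm(project(v)) / norm(v)
entries_touched = k * d  # every matrix entry is used once per vector
print(ratio, entries_touched)
```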

Page 19: Data-Powered Algorithms

Sparse JLT?

Matrix with $c \log n / \varepsilon^2$ rows and $d$ columns

Entries are +1, -1, or 0, with an o(1)-fraction of non-zeros

Input vector of dimension $d$: (1, 0, 0, ..., 0)
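One way to read the question mark on this slide: a mostly-zero matrix fails on a vector such as (1, 0, ..., 0), because the image of that vector is just the first column of the matrix, and a sparse column is often entirely zero. The sketch below simulates only that first column. The density `q`, the size `k`, the scaling `1/sqrt(k*q)`, and the helper name are assumptions chosen for illustration.

```python
import math
import random


def image_of_first_basis_vector(k, q, rng):
    """First column of a sparse k x d sign matrix.

    Each entry is nonzero with probability q, and a nonzero entry is
    +1 or -1 scaled by 1/sqrt(k*q), so squared length is 1 on average.
    """
    scale = 1.0 / math.sqrt(k * q)
    return [
        rng.choice((-1, 1)) * scale if rng.random() < q else 0.0
        for _ in range(k)
    ]


rng = random.Random(2)
k, q, trials = 64, 1 / 256, 2000  # hypothetical sizes
sq_norms = [
    sum(x * x for x in image_of_first_basis_vector(k, q, rng))
    for _ in range(trials)
]
mean_sq = sum(sq_norms) / trials
zero_fraction = sum(1 for s in sq_norms if s == 0.0) / trials
print(mean_sq, zero_fraction)
```

The squared length is right on average, yet in most trials the vector is mapped to zero, so its distance to the origin is not preserved at all. A sparse vector therefore has to be spread out before a sparse matrix can be applied safely.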

Page 20: Data-Powered Algorithms

Main Tool: Uncertainty Principle

Time

Frequency

Heisenberg

Page 21: Data-Powered Algorithms

Fast Johnson-Lindenstrauss Transform (FJLT)

Product of three matrices:

Diagonal matrix of +1/-1 entries ($d \times d$)

Discrete Fourier Transform ($d \times d$)

Sparse matrix with $c \log n / \varepsilon^2$ rows and $d$ columns, entries 0 or N(0,1)

Cost per vector = $O(d \log d + \log^3 n / \varepsilon^2)$

Optimal??
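A hedged sketch of the three-stage pipeline: random signs, a fast orthogonal transform, then a sparse Gaussian matrix. Two substitutions should be noted plainly. The code uses a Walsh-Hadamard transform instead of the Discrete Fourier Transform named on the slide, because it is real-valued and short to write. The density `q`, the sizes, the variance `1/q` of the nonzero entries, and all function names are assumptions for this example, not values from the slide.

```python
import math
import random


def fwht(v):
    """Normalized fast Walsh-Hadamard transform in O(d log d).

    len(v) must be a power of two.
    """
    a = list(v)
    n = len(a)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                x, y = a[i], a[i + h]
                a[i], a[i + h] = x + y, x - y
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [x * scale for x in a]


def make_fjlt(d, k, q, rng):
    """Build a transform v -> P(H(Dv)) as a closure."""
    signs = [rng.choice((-1.0, 1.0)) for _ in range(d)]
    sigma = 1.0 / math.sqrt(q)
    # Sparse rows stored as (column, value) pairs; an entry is kept
    # with probability q and then drawn from N(0, 1/q).
    rows = [
        [(j, rng.gauss(0.0, sigma)) for j in range(d) if rng.random() < q]
        for _ in range(k)
    ]

    def transform(v):
        mixed = fwht([s * x for s, x in zip(signs, v)])
        return [
            sum(val * mixed[j] for j, val in row) / math.sqrt(k)
            for row in rows
        ]

    return transform


rng = random.Random(3)
d, k, q = 256, 64, 0.25  # hypothetical sizes and density
fjlt = make_fjlt(d, k, q, rng)
spike = [1.0] + [0.0] * (d - 1)  # the vector (1, 0, ..., 0), norm 1
image = fjlt(spike)
ratio = math.sqrt(sum(x * x for x in image))
print(ratio)
```

The sign flip and transform spread the single spike evenly over all `d` coordinates, which is what lets the sparse matrix that follows sample it reliably.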

Page 22: Data-Powered Algorithms
Page 23: Data-Powered Algorithms

theory experimentation

Page 24: Data-Powered Algorithms

computation

theory experimentation

Page 25: Data-Powered Algorithms

computation

theory experimentation

Page 26: Data-Powered Algorithms

input output

Most interesting problems are too hard!!

Page 27: Data-Powered Algorithms

input output

randomization

approximation

So, we change the model…

Page 28: Data-Powered Algorithms

input output

randomization

approximation

PTAS for ETSP

Page 29: Data-Powered Algorithms

input output

randomization

approximation

Impossible to approximate chromatic number within a factor of…

Page 30: Data-Powered Algorithms

input output

randomization

approximation

Property Testing [RS’96, GGR’96]

Berkeley “school” (program checking & probabilistic proofs)

Page 31: Data-Powered Algorithms
Page 32: Data-Powered Algorithms

Distance is 3

Page 33: Data-Powered Algorithms

Distance is 4

Page 34: Data-Powered Algorithms

no

yes

bipartite

Page 35: Data-Powered Algorithms

no

yes

bipartite

anything

[GR’97]
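For contrast with a property tester, here is the exact decision procedure: breadth-first 2-coloring, which reads the whole graph and always answers correctly. A tester in the sense of the slide only has to say "yes" on bipartite graphs and "no" on graphs far from bipartite, and may answer anything in between, which is what allows it to look at far less of the input. The adjacency-list representation and the function name are assumptions for this sketch.

```python
from collections import deque


def is_bipartite(adj):
    """Exact test by BFS 2-coloring.

    adj maps each vertex to an iterable of its neighbours (undirected).
    """
    color = {}
    for source in adj:
        if source in color:
            continue
        color[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in color:
                    color[w] = 1 - color[u]
                    queue.append(w)
                elif color[w] == color[u]:
                    return False  # an odd cycle exists
    return True


even_cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
odd_cycle = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
print(is_bipartite(even_cycle), is_bipartite(odd_cycle))  # True False
```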

Page 36: Data-Powered Algorithms
Page 37: Data-Powered Algorithms
Page 38: Data-Powered Algorithms

Birthday paradox

62

18

7

polylog cycles

17

Mixing case
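The birthday paradox itself is easy to compute. In the usual account of the bipartiteness tester of [GR’97], which is an interpretation and not something spelled out on the slide, it explains why roughly the square root of n random-walk endpoints are enough to see two walks meet at the same vertex when the walk mixes well. The function name and the sample values are assumptions for this illustration.

```python
def collision_probability(m, n):
    """Probability that m uniform samples from n items contain a repeat."""
    p_all_distinct = 1.0
    for i in range(m):
        p_all_distinct *= (n - i) / n
    return 1.0 - p_all_distinct


# Classic instance: 23 people, 365 possible birthdays.
print(round(collision_probability(23, 365), 3))  # 0.507

# About 1.5 * sqrt(n) samples already give a likely collision.
print(collision_probability(1500, 10**6) > 0.6)  # True
```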

Page 39: Data-Powered Algorithms

[M’89]

Nonmixing implies small cuts

Non-mixing case

Page 40: Data-Powered Algorithms

Dense graphs

[GGR98, AK99]

Hofstadter. Godel, Escher, Bach.

Is graph k-colorable?

1010001

0101011

1101100

1010011

1101101

0010110

1011001
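In the dense-graph model the input is an adjacency matrix that the algorithm may query entry by entry. A sample-based tester for k-colorability can be sketched as follows: pick a small random set of vertices, query only the entries between them, and accept exactly when the induced subgraph is k-colorable. This is a simplified illustration in the spirit of [GGR98, AK99], not their algorithm as stated. The sample size, the oracle `adjacent(u, v)`, and the function names are assumptions.

```python
import random


def k_colorable(vertices, adjacent, k):
    """Backtracking k-coloring check; exponential, for small samples only."""
    vertices = list(vertices)
    colors = {}

    def extend(i):
        if i == len(vertices):
            return True
        v = vertices[i]
        for c in range(k):
            # Color c is allowed if no already-colored neighbour uses it.
            if all(colors[u] != c for u in vertices[:i] if adjacent(u, v)):
                colors[v] = c
                if extend(i + 1):
                    return True
        return False

    return extend(0)


def sample_tester(n, adjacent, k, sample_size, rng):
    """Accept iff the subgraph induced by a random sample is k-colorable."""
    sample = rng.sample(range(n), min(sample_size, n))
    return k_colorable(sample, adjacent, k)


rng = random.Random(4)
n = 40


def complete_bipartite(u, v):
    return u % 2 != v % 2  # 2-colorable by parity


def complete_graph(u, v):
    return u != v  # any 8 sampled vertices need 8 colors


print(sample_tester(n, complete_bipartite, 2, 8, rng))  # True
print(sample_tester(n, complete_graph, 3, 8, rng))      # False
```

The tester has one-sided error: a k-colorable graph is always accepted, because every induced subgraph of it is also k-colorable. Rejection relies on a graph that is far from k-colorable containing many small witnesses.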

Page 41: Data-Powered Algorithms

Main tool

Szemerédi’s Regularity Lemma

Far from k-colorable

Lots of witnesses

Page 42: Data-Powered Algorithms

Property Testing

Graph algorithms: connectivity, acyclicity, k-way cuts, clique

Distributions: independence, entropy, monotonicity, distances

Geometry: convexity, disjointness, Delaunay, plane EMST

http://www.cs.princeton.edu/~chazelle/