composablecore-sets for determinant maximization: a simple ...mahabadi/slides/det-icml.pdf ·...

105
Composable Core-sets for Determinant Maximization: A Simple Near-Optimal Algorithm Sepideh Mahabadi TTIC Piotr Indyk MIT Shayan Oveis Gharan U. of Washington Alireza Rezaei U. of Washington

Upload: others

Post on 07-Aug-2021

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Composable Core-sets for Determinant Maximization:

A Simple Near-Optimal Algorithm

Sepideh Mahabadi

TTIC

Piotr IndykMIT

Shayan Oveis GharanU. of Washington

Alireza RezaeiU. of Washington

Page 2: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Volume (Determinant) Maximization Problem

Input: a set of 𝑛𝑛 vectors V ∈ ℝ𝑑𝑑 and a parameter 𝑘𝑘 ≤ 𝑑𝑑,

𝒌𝒌 = 𝟐𝟐

Page 3: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Volume (Determinant) Maximization Problem

Input: a set of 𝑛𝑛 vectors V ∈ ℝ𝑑𝑑 and a parameter 𝑘𝑘 ≤ 𝑑𝑑,Output: a subset 𝑆𝑆 ⊂ 𝑉𝑉 of size 𝑘𝑘 with the maximum volume

• Parallelepiped spanned by the points in 𝑆𝑆

𝒌𝒌 = 𝟐𝟐

Page 4: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Volume (Determinant) Maximization Problem

Input: a set of 𝑛𝑛 vectors V ∈ ℝ𝑑𝑑 and a parameter 𝑘𝑘 ≤ 𝑑𝑑,Output: a subset 𝑆𝑆 ⊂ 𝑉𝑉 of size 𝑘𝑘 with the maximum volume

• Parallelepiped spanned by the points in 𝑆𝑆

𝒌𝒌 = 𝟐𝟐

Page 5: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Determinant (Volume) Maximization Problem

Input: a set of 𝑛𝑛 vectors V ∈ ℝ𝑑𝑑 and a parameter 𝑘𝑘 ≤ 𝑑𝑑,Output: a subset 𝑆𝑆 ⊂ 𝑉𝑉 of size 𝑘𝑘 with the maximum volume

• Parallelepiped spanned by the points in 𝑆𝑆

Equivalent Formulation: Reuse 𝑉𝑉 to denote the matrix where its columns are the vectors in 𝑉𝑉

𝑣𝑣1 𝑣𝑣2 …𝑣𝑣𝑛𝑛

Page 6: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Determinant (Volume) Maximization Problem

Input: a set of 𝑛𝑛 vectors V ∈ ℝ𝑑𝑑 and a parameter 𝑘𝑘 ≤ 𝑑𝑑,Output: a subset 𝑆𝑆 ⊂ 𝑉𝑉 of size 𝑘𝑘 with the maximum volume

• Parallelepiped spanned by the points in 𝑆𝑆

Equivalent Formulation: Reuse 𝑉𝑉 to denote the matrix where its columns are the vectors in 𝑉𝑉• Let 𝑀𝑀 be the gram matrix 𝑉𝑉𝑇𝑇𝑉𝑉 𝑀𝑀𝑖𝑖,𝑗𝑗 = 𝑣𝑣𝑖𝑖 · 𝑣𝑣𝑗𝑗

𝑣𝑣1𝑣𝑣2…𝑣𝑣𝑛𝑛

𝑣𝑣1 𝑣𝑣2 …𝑣𝑣𝑛𝑛 𝑣𝑣𝑖𝑖 ⋅ 𝑣𝑣𝑗𝑗𝑖𝑖

𝑗𝑗

Page 7: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Determinant (Volume) Maximization Problem

Input: a set of 𝑛𝑛 vectors V ∈ ℝ𝑑𝑑 and a parameter 𝑘𝑘 ≤ 𝑑𝑑,Output: a subset 𝑆𝑆 ⊂ 𝑉𝑉 of size 𝑘𝑘 with the maximum volume

• Parallelepiped spanned by the points in 𝑆𝑆

Equivalent Formulation: Reuse 𝑉𝑉 to denote the matrix where its columns are the vectors in 𝑉𝑉• Let 𝑀𝑀 be the gram matrix 𝑉𝑉𝑇𝑇𝑉𝑉• Choose 𝑆𝑆 such that det(𝑀𝑀𝑆𝑆,𝑆𝑆) is maximized

𝑀𝑀𝑖𝑖,𝑗𝑗 = 𝑣𝑣𝑖𝑖 · 𝑣𝑣𝑗𝑗

det 𝑀𝑀𝑆𝑆,𝑆𝑆 = 𝑉𝑉𝑉𝑉𝑉𝑉 𝑆𝑆 2

𝑣𝑣1𝑣𝑣2…𝑣𝑣𝑛𝑛

𝑣𝑣1 𝑣𝑣2 …𝑣𝑣𝑛𝑛

Page 8: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

What is known?• Hard to approximate within a factor of 2𝑐𝑐𝑐𝑐 [CMI’13]

Page 9: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

What is known?• Hard to approximate within a factor of 2𝑐𝑐𝑐𝑐 [CMI’13]• Best algorithm: 𝑒𝑒𝑐𝑐-approximation [Nik’15]

Page 10: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

What is known?• Hard to approximate within a factor of 2𝑐𝑐𝑐𝑐 [CMI’13]• Best algorithm: 𝑒𝑒𝑐𝑐-approximation [Nik’15]• Greedy is a popular algorithm: achieves approximation factor 𝑘𝑘! 𝑈𝑈 ← ∅ For 𝑘𝑘 iterations, Add to 𝑈𝑈 the farthest point from the subspace spanned by 𝑈𝑈

𝒌𝒌 = 𝟐𝟐

Page 11: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

What is known?• Hard to approximate within a factor of 2𝑐𝑐𝑐𝑐 [CMI’13]• Best algorithm: 𝑒𝑒𝑐𝑐-approximation [Nik’15]• Greedy is a popular algorithm: achieves approximation factor 𝑘𝑘! 𝑈𝑈 ← ∅ For 𝑘𝑘 iterations, Add to 𝑈𝑈 the farthest point from the subspace spanned by 𝑈𝑈

𝒌𝒌 = 𝟐𝟐

Page 12: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

What is known?• Hard to approximate within a factor of 2𝑐𝑐𝑐𝑐 [CMI’13]• Best algorithm: 𝑒𝑒𝑐𝑐-approximation [Nik’15]• Greedy is a popular algorithm: achieves approximation factor 𝑘𝑘! 𝑈𝑈 ← ∅ For 𝑘𝑘 iterations, Add to 𝑈𝑈 the farthest point from the subspace spanned by 𝑈𝑈

𝒌𝒌 = 𝟐𝟐

Page 13: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

What is known?• Hard to approximate within a factor of 2𝑐𝑐𝑐𝑐 [CMI’13]• Best algorithm: 𝑒𝑒𝑐𝑐-approximation [Nik’15]• Greedy is a popular algorithm: achieves approximation factor 𝑘𝑘! 𝑈𝑈 ← ∅ For 𝑘𝑘 iterations, Add to 𝑈𝑈 the farthest point from the subspace spanned by 𝑈𝑈

𝒌𝒌 = 𝟐𝟐

Page 14: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

What is known?• Hard to approximate within a factor of 2𝑐𝑐𝑐𝑐 [CMI’13]• Best algorithm: 𝑒𝑒𝑐𝑐-approximation [Nik’15]• Greedy is a popular algorithm: achieves approximation factor 𝑘𝑘! 𝑈𝑈 ← ∅ For 𝑘𝑘 iterations, Add to 𝑈𝑈 the farthest point from the subspace spanned by 𝑈𝑈

𝒌𝒌 = 𝟐𝟐

Page 15: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

What is known?• Hard to approximate within a factor of 2𝑐𝑐𝑐𝑐 [CMI’13]• Best algorithm: 𝑒𝑒𝑐𝑐-approximation [Nik’15]• Greedy is a popular algorithm: achieves approximation factor 𝑘𝑘! 𝑈𝑈 ← ∅ For 𝑘𝑘 iterations, Add to 𝑈𝑈 the farthest point from the subspace spanned by 𝑈𝑈

• Greedy performs very well in practice

𝒌𝒌 = 𝟐𝟐

Page 16: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Determinantal Point Processes (DPP)DPP: Very popular probabilistic model, where given a set of vectors 𝑉𝑉, samples any 𝑘𝑘-subset 𝑆𝑆 with

probability proportional to this determinant.

• Maximum a posteriori (MAP) decoding is determinant maximization

• Volume/determinant is a notion of diversity

• NeurIPS’18 Tutorial, Negative Dependence, Stable Polynomials, and All That, Jegelka, Sra• ICML’19 Workshop, Negative Dependence: Theory and Applications in Machine Learning, Gartrell,

Gillenwater, Kulesza, Mariet

Page 17: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Application: Diversity Maximization

Given a set of objects, how to pick a few of them while maximizing diversity?

Page 18: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Application: Diversity Maximization

• Searching

Given a set of objects, how to pick a few of them while maximizing diversity?

Page 19: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Application: Diversity Maximization

• Searching

Given a set of objects, how to pick a few of them while maximizing diversity?

Page 20: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Application: Diversity Maximization

Points in a high dimensional space

Objects (documents, images, etc)

Feature Vectors

Page 21: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Application: Diversity Maximization

Input: a set of 𝑛𝑛 vectors V ⊂ ℝ𝑑𝑑 and a parameter 𝑘𝑘,

Goal: pick 𝒌𝒌 points while maximizing “diversity”.

𝒌𝒌 = 𝟑𝟑

Page 22: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Determinantal Point Processes (DPP)DPP: Very popular probabilistic model, where given a set of vectors 𝑉𝑉, samples any 𝑘𝑘-subset 𝑆𝑆 with

probability proportional to this determinant.

• Maximum a posteriori (MAP) decoding is determinant maximization

• Volume/determinant is a notion of diversity

[MJK’17,GCGS’14] Video summarization[KT+’12, CGGS’15,KT’11] Document summarization[YFZ+’16] Tweet generation[LCYO’16] Object detection….

Applications

Page 23: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Determinantal Point Processes (DPP)

• Most applications deal with massive data• Lots of effort for solving the problem in massive

data models of computation [MJK’17, WIB’14, PJG+’14, MKSK’13, MKBK’15, MZ’15, MZ’15, BENW’15]

• e.g. streaming, distributed, parallel

DPP: Very popular probabilistic model, where given a set of vectors 𝑉𝑉, samples any 𝑘𝑘-subset 𝑆𝑆 with

probability proportional to this determinant.

• Maximum a posteriori (MAP) decoding is determinant maximization

• Volume/determinant is a notion of diversity

[MJK’17,GCGS’14] Video summarization[KT+’12, CGGS’15,KT’11] Document summarization[YFZ+’16] Tweet generation[LCYO’16] Object detection….

Applications

Page 24: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Determinantal Point Processes (DPP)

• Most applications deal with massive data• Lots of effort for solving the problem in massive

data models of computation [MJK’17, WIB’14, PJG+’14, MKSK’13, MKBK’15, MZ’15, MZ’15, BENW’15]

• e.g. streaming, distributed, parallel

DPP: Very popular probabilistic model, where given a set of vectors 𝑉𝑉, samples any 𝑘𝑘-subset 𝑆𝑆 with

probability proportional to this determinant.

• Maximum a posteriori (MAP) decoding is determinant maximization

• Volume/determinant is a notion of diversity

[MJK’17,GCGS’14] Video summarization[KT+’12, CGGS’15,KT’11] Document summarization[YFZ+’16] Tweet generation[LCYO’16] Object detection….

Applications

Composable Core-sets

Page 25: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Core-sets

Solving the problem over 𝑼𝑼 gives a good approximation of solving the problem over 𝑽𝑽

Core-sets [AHV’05]: a subset 𝑼𝑼 of the data 𝑽𝑽 that represents it well

Page 26: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Composable Core-setsCore-sets [AHV’05]: a subset 𝑼𝑼 of the data 𝑽𝑽 that represents it well

Composable Core-sets [AAIMV’13 and IMMM’14]:

The union of coresets is a coreset for the union

Page 27: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Composable Core-setsCore-sets [AHV’05]: a subset 𝑼𝑼 of the data 𝑽𝑽 that represents it well

Composable Core-sets [AAIMV’13 and IMMM’14]:

The union of coresets is a coreset for the union

• Let 𝒇𝒇 be an optimization function

o E.g. 𝒇𝒇(𝑽𝑽): solution to 𝑘𝑘 determinant maximization

Page 28: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Composable Core-setsCore-sets [AHV’05]: a subset 𝑼𝑼 of the data 𝑽𝑽 that represents it well

Composable Core-sets [AAIMV’13 and IMMM’14]:

The union of coresets is a coreset for the union

• Let 𝒇𝒇 be an optimization function

o E.g. 𝒇𝒇(𝑽𝑽): solution to 𝑘𝑘 determinant maximization

• Multiple data sets 𝑽𝑽𝟏𝟏,⋯ ,𝑽𝑽𝒎𝒎 and their coresets 𝑼𝑼𝟏𝟏 ⊂ 𝑽𝑽𝟏𝟏,⋯ ,𝑼𝑼𝒎𝒎 ⊂ 𝑽𝑽𝒎𝒎,

o 𝒇𝒇 𝑼𝑼𝟏𝟏 ∪⋯∪ 𝑼𝑼𝒎𝒎 approximates 𝒇𝒇 𝑽𝑽𝟏𝟏 ∪⋯∪ 𝑽𝑽𝒎𝒎 by a factor 𝜶𝜶

Page 29: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Composable Core-setsCore-sets [AHV’05]: a subset 𝑼𝑼 of the data 𝑽𝑽 that represents it well

Composable Core-sets [AAIMV’13 and IMMM’14]:

The union of coresets is a coreset for the union

• Let 𝒇𝒇 be an optimization function

o E.g. 𝒇𝒇(𝑽𝑽): solution to 𝑘𝑘 determinant maximization

• Multiple data sets 𝑽𝑽𝟏𝟏,⋯ ,𝑽𝑽𝒎𝒎 and their coresets 𝑼𝑼𝟏𝟏 ⊂ 𝑽𝑽𝟏𝟏,⋯ ,𝑼𝑼𝒎𝒎 ⊂ 𝑽𝑽𝒎𝒎,

o 𝒇𝒇 𝑼𝑼𝟏𝟏 ∪⋯∪ 𝑼𝑼𝒎𝒎 approximates 𝒇𝒇 𝑽𝑽𝟏𝟏 ∪⋯∪ 𝑽𝑽𝒎𝒎 by a factor 𝜶𝜶

Page 30: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Composable Core-setsCore-sets [AHV’05]: a subset 𝑼𝑼 of the data 𝑽𝑽 that represents it well

Composable Core-sets [AAIMV’13 and IMMM’14]:

The union of coresets is a coreset for the union

• Let 𝒇𝒇 be an optimization function

o E.g. 𝒇𝒇(𝑽𝑽): solution to 𝑘𝑘 determinant maximization

• Multiple data sets 𝑽𝑽𝟏𝟏,⋯ ,𝑽𝑽𝒎𝒎 and their coresets 𝑼𝑼𝟏𝟏 ⊂ 𝑽𝑽𝟏𝟏,⋯ ,𝑼𝑼𝒎𝒎 ⊂ 𝑽𝑽𝒎𝒎,

o 𝒇𝒇 𝑼𝑼𝟏𝟏 ∪⋯∪ 𝑼𝑼𝒎𝒎 approximates 𝒇𝒇 𝑽𝑽𝟏𝟏 ∪⋯∪ 𝑽𝑽𝒎𝒎 by a factor 𝜶𝜶

Page 31: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Composable Core-sets

𝒇𝒇

Core-sets [AHV’05]: a subset 𝑼𝑼 of the data 𝑽𝑽 that represents it well

Composable Core-sets [AAIMV’13 and IMMM’14]:

The union of coresets is a coreset for the union

• Let 𝒇𝒇 be an optimization function

o E.g. 𝒇𝒇(𝑽𝑽): solution to 𝑘𝑘 determinant maximization

• Multiple data sets 𝑽𝑽𝟏𝟏,⋯ ,𝑽𝑽𝒎𝒎 and their coresets 𝑼𝑼𝟏𝟏 ⊂ 𝑽𝑽𝟏𝟏,⋯ ,𝑼𝑼𝒎𝒎 ⊂ 𝑽𝑽𝒎𝒎,

o 𝒇𝒇 𝑼𝑼𝟏𝟏 ∪⋯∪ 𝑼𝑼𝒎𝒎 approximates 𝒇𝒇 𝑽𝑽𝟏𝟏 ∪⋯∪ 𝑽𝑽𝒎𝒎 by a factor 𝜶𝜶

Page 32: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Composable Core-setsCore-sets [AHV’05]: a subset 𝑼𝑼 of the data 𝑽𝑽 that represents it well

𝒇𝒇

Composable Core-sets [AAIMV’13 and IMMM’14]:

The union of coresets is a coreset for the union

• Let 𝒇𝒇 be an optimization function

o E.g. 𝒇𝒇(𝑽𝑽): solution to 𝑘𝑘 determinant maximization

• Multiple data sets 𝑽𝑽𝟏𝟏,⋯ ,𝑽𝑽𝒎𝒎 and their coresets 𝑼𝑼𝟏𝟏 ⊂ 𝑽𝑽𝟏𝟏,⋯ ,𝑼𝑼𝒎𝒎 ⊂ 𝑽𝑽𝒎𝒎,

o 𝒇𝒇 𝑼𝑼𝟏𝟏 ∪⋯∪ 𝑼𝑼𝒎𝒎 approximates 𝒇𝒇 𝑽𝑽𝟏𝟏 ∪⋯∪ 𝑽𝑽𝒎𝒎 by a factor 𝜶𝜶

Page 33: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Composable Core-setsCore-sets [AHV’05]: a subset 𝑼𝑼 of the data 𝑽𝑽 that represents it well

Composable Core-sets have been studied for the diversity Maximization problems, for other

notions of diversity: minimum pairwise distance, sum of pairwise distances, etc.

Determinant maximization is a “higher order” notion of diversity

Composable Core-sets [AAIMV’13 and IMMM’14]:

The union of coresets is a coreset for the union

• Let 𝒇𝒇 be an optimization function

o E.g. 𝒇𝒇(𝑽𝑽): solution to 𝑘𝑘 determinant maximization

• Multiple data sets 𝑽𝑽𝟏𝟏,⋯ ,𝑽𝑽𝒎𝒎 and their coresets 𝑼𝑼𝟏𝟏 ⊂ 𝑽𝑽𝟏𝟏,⋯ ,𝑼𝑼𝒎𝒎 ⊂ 𝑽𝑽𝒎𝒎,

o 𝒇𝒇 𝑼𝑼𝟏𝟏 ∪⋯∪ 𝑼𝑼𝒎𝒎 approximates 𝒇𝒇 𝑽𝑽𝟏𝟏 ∪⋯∪ 𝑽𝑽𝒎𝒎 by a factor 𝜶𝜶

Page 34: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Applications: Streaming Computation• Streaming Computation:

• Processing sequence of 𝑛𝑛 data elements “on the fly”• limited Storage

Page 35: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Applications: Streaming Computation• Streaming Computation:

• Processing sequence of 𝑛𝑛 data elements “on the fly”• limited Storage

• Composable Core-set• Divide into chunks

𝒏𝒏 𝒏𝒏

Page 36: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Applications: Streaming Computation• Streaming Computation:

• Processing sequence of 𝑛𝑛 data elements “on the fly”• limited Storage

• Composable Core-set• Divide into chunks• Compute Core-set for each chunk as it arrives

𝒏𝒏 𝒏𝒏

Core-set

Page 37: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Applications: Streaming Computation• Streaming Computation:

• Processing sequence of 𝑛𝑛 data elements “on the fly”• limited Storage

• Composable Core-set • Divide into chunks• Compute Core-set for each chunk as it arrives

𝒏𝒏 𝒏𝒏

Core-set Core-set

Page 38: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Applications: Streaming Computation• Streaming Computation:

• Processing sequence of 𝑛𝑛 data elements “on the fly”• limited Storage

• Composable Core-set • Divide into chunks• Compute Core-set for each chunk as it arrives• Space goes down from 𝑛𝑛 to 𝑛𝑛

𝒏𝒏 𝒏𝒏

Core-set Core-set

Page 39: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Applications: Distributed Computation• Streaming Computation• Distributed System:

• Each machine holds a block of data.• A composable core-set is computed and sent to the server

Core-set

Data

Data

Data

Mapper

Mapper

Mapper

Reducer Solution

Page 40: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Applications: Improving Runtime• Streaming Computation• Distributed System• Similar framework for improving the runtime

Page 41: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Can we get a composable core-set of small size for the determinant maximization problem?

Page 42: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Composable Core-sets for Volume Maximization

LP-based Optimal Approximation Algorithm of [IMOR’18]:

There exists a polynomial time algorithm for computing an �𝑶𝑶 𝒌𝒌 𝒌𝒌/𝟐𝟐 -composable

core-set of size �𝑶𝑶(𝒌𝒌) for the volume maximization problem.

[IMOR’18]Approximation �𝑶𝑶 𝒌𝒌 𝒌𝒌/𝟐𝟐

Core-set Size �𝑶𝑶(𝒌𝒌)Simple? ×

Page 43: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Composable Core-sets for Volume Maximization

Lower bound [IMOR’18]:

Any composable core-set of size 𝒌𝒌𝑶𝑶(𝟏𝟏) for the volume maximization problem must

have an approximation factor of 𝛀𝛀(𝒌𝒌)𝒌𝒌𝟐𝟐(𝟏𝟏−𝒐𝒐 𝟏𝟏 ).

Lower Bound [IMOR’18]Approximation 𝛀𝛀 𝒌𝒌

𝒌𝒌𝟐𝟐−𝒐𝒐 𝒌𝒌 �𝑶𝑶 𝒌𝒌

𝒌𝒌𝟐𝟐

Core-set Size 𝒌𝒌𝑶𝑶 𝟏𝟏 �𝑶𝑶(𝒌𝒌)Simple? ×

Page 44: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Our Results

Lower Bound [IMOR’18] Greedy Approximation 𝛀𝛀 𝒌𝒌

𝒌𝒌𝟐𝟐−𝒐𝒐 𝒌𝒌 �𝑶𝑶 𝒌𝒌

𝒌𝒌𝟐𝟐 𝑶𝑶(𝑪𝑪𝒌𝒌𝟐𝟐)

Core-set Size 𝒌𝒌𝑶𝑶 𝟏𝟏 �𝑶𝑶(𝒌𝒌) 𝒌𝒌Simple? × ✔

The widely used Greedy algorithm produces a composable core-set of size 𝑘𝑘 with

approximation factor 𝑂𝑂(𝐶𝐶𝑐𝑐2).

Page 45: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Our Results

The Local Search Algorithm produces a composable core-set of size 𝑘𝑘 with

approximation factor 𝑂𝑂 𝑘𝑘 2𝑐𝑐.

Lower Bound [IMOR’18] Greedy Local SearchApproximation 𝛀𝛀 𝒌𝒌

𝒌𝒌𝟐𝟐−𝒐𝒐 𝒌𝒌 �𝑶𝑶 𝒌𝒌

𝒌𝒌𝟐𝟐 𝑶𝑶(𝑪𝑪𝒌𝒌𝟐𝟐) 𝑶𝑶 𝒌𝒌𝒌𝒌

Core-set Size 𝒌𝒌𝑶𝑶 𝟏𝟏 �𝑶𝑶(𝒌𝒌) 𝒌𝒌 𝒌𝒌Simple? × ✔ ✔

Page 46: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

This Talk

The Local Search Algorithm produces a composable core-set of size 𝑘𝑘 with

approximation factor 𝑂𝑂 𝑘𝑘 𝑐𝑐 for the volume maximization problem.

Page 47: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

This Talk

The Local Search Algorithm produces a composable core-set of size 𝑘𝑘 with

approximation factor 𝑂𝑂 𝑘𝑘 𝑐𝑐 for the volume maximization problem.

In comparison to the optimal core-set algorithm Approximation O 𝑘𝑘 𝒌𝒌 as opposed to 𝑂𝑂 𝑘𝑘 log 𝑘𝑘 𝒌𝒌/𝟐𝟐

Smaller Size 𝑘𝑘 as opposed to 𝑂𝑂 𝑘𝑘 log 𝑘𝑘 Simpler to implement (similar to Greedy)

Better performance in practice

Page 48: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

The Local Search Algorithm

Input: a set 𝑉𝑉 of 𝑛𝑛 points and a parameter 𝑘𝑘

1. Start with an arbitrary subset of 𝑘𝑘 points 𝑆𝑆 ⊆ 𝑉𝑉

2. While there exists a point 𝑝𝑝 ∈ 𝑉𝑉 ∖ 𝑆𝑆 and 𝑞𝑞 ∈ 𝑆𝑆 s.t. replacing 𝑞𝑞 with 𝑝𝑝 increases the volume, then swap them, i.e., 𝑆𝑆 = 𝑆𝑆 ∪ 𝑝𝑝 ∖ {𝑞𝑞}

Page 49: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

The Local Search Algorithm

𝒌𝒌 = 𝟑𝟑

Input: a set 𝑉𝑉 of 𝑛𝑛 points and a parameter 𝑘𝑘

1. Start with an arbitrary subset of 𝑘𝑘 points 𝑆𝑆 ⊆ 𝑉𝑉

2. While there exists a point 𝑝𝑝 ∈ 𝑉𝑉 ∖ 𝑆𝑆 and 𝑞𝑞 ∈ 𝑆𝑆 s.t. replacing 𝑞𝑞 with 𝑝𝑝 increases the volume, then swap them, i.e., 𝑆𝑆 = 𝑆𝑆 ∪ 𝑝𝑝 ∖ {𝑞𝑞}

Page 50: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

The Local Search Algorithm

𝒌𝒌 = 𝟑𝟑

𝑞𝑞

𝑝𝑝

Input: a set 𝑉𝑉 of 𝑛𝑛 points and a parameter 𝑘𝑘

1. Start with an arbitrary subset of 𝑘𝑘 points 𝑆𝑆 ⊆ 𝑉𝑉

2. While there exists a point 𝑝𝑝 ∈ 𝑉𝑉 ∖ 𝑆𝑆 and 𝑞𝑞 ∈ 𝑆𝑆 s.t. replacing 𝑞𝑞 with 𝑝𝑝 increases the volume, then swap them, i.e., 𝑆𝑆 = 𝑆𝑆 ∪ 𝑝𝑝 ∖ {𝑞𝑞}

Page 51: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

The Local Search Algorithm

𝒌𝒌 = 𝟑𝟑

Input: a set 𝑉𝑉 of 𝑛𝑛 points and a parameter 𝑘𝑘

1. Start with an arbitrary subset of 𝑘𝑘 points 𝑆𝑆 ⊆ 𝑉𝑉

2. While there exists a point 𝑝𝑝 ∈ 𝑉𝑉 ∖ 𝑆𝑆 and 𝑞𝑞 ∈ 𝑆𝑆 s.t. replacing 𝑞𝑞 with 𝑝𝑝 increases the volume, then swap them, i.e., 𝑆𝑆 = 𝑆𝑆 ∪ 𝑝𝑝 ∖ {𝑞𝑞}

Page 52: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

The Local Search Algorithm

𝒌𝒌 = 𝟑𝟑

𝑞𝑞

𝑝𝑝

Input: a set 𝑉𝑉 of 𝑛𝑛 points and a parameter 𝑘𝑘

1. Start with an arbitrary subset of 𝑘𝑘 points 𝑆𝑆 ⊆ 𝑉𝑉

2. While there exists a point 𝑝𝑝 ∈ 𝑉𝑉 ∖ 𝑆𝑆 and 𝑞𝑞 ∈ 𝑆𝑆 s.t. replacing 𝑞𝑞 with 𝑝𝑝 increases the volume, then swap them, i.e., 𝑆𝑆 = 𝑆𝑆 ∪ 𝑝𝑝 ∖ {𝑞𝑞}

Page 53: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

The Local Search Algorithm

𝒌𝒌 = 𝟑𝟑

Input: a set 𝑉𝑉 of 𝑛𝑛 points and a parameter 𝑘𝑘

1. Start with an arbitrary subset of 𝑘𝑘 points 𝑆𝑆 ⊆ 𝑉𝑉

2. While there exists a point 𝑝𝑝 ∈ 𝑉𝑉 ∖ 𝑆𝑆 and 𝑞𝑞 ∈ 𝑆𝑆 s.t. replacing 𝑞𝑞 with 𝑝𝑝 increases the volume, then swap them, i.e., 𝑆𝑆 = 𝑆𝑆 ∪ 𝑝𝑝 ∖ {𝑞𝑞}

Page 54: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

To bound the run time

Input: a set 𝑉𝑉 of 𝑛𝑛 points and a parameter 𝑘𝑘

1. Start with an arbitrary subset of 𝑘𝑘 points 𝑆𝑆 ⊆ 𝑉𝑉

2. While there exists a point 𝑝𝑝 ∈ 𝑉𝑉 ∖ 𝑆𝑆 and 𝑞𝑞 ∈ 𝑆𝑆 s.t. replacing 𝑞𝑞 with 𝑝𝑝 increases the volume, then swap them, i.e., 𝑆𝑆 = 𝑆𝑆 ∪ 𝑝𝑝 ∖ {𝑞𝑞}

Start with a crude approximation (Greedy algorithm)

If it increases by at least a factor of (1 + 𝜖𝜖)

Page 55: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Checking the condition

𝑞𝑞𝑝𝑝

Input: a set 𝑉𝑉 of 𝑛𝑛 points and a parameter 𝑘𝑘

1. Start with an arbitrary subset of 𝑘𝑘 points 𝑆𝑆 ⊆ 𝑉𝑉

2. While there exists a point 𝑝𝑝 ∈ 𝑉𝑉 ∖ 𝑆𝑆 and 𝑞𝑞 ∈ 𝑆𝑆 s.t. replacing 𝑞𝑞 with 𝑝𝑝 increases the volume, then swap them, i.e., 𝑆𝑆 = 𝑆𝑆 ∪ 𝑝𝑝 ∖ {𝑞𝑞}

𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒑𝒑,𝑯𝑯𝑺𝑺∖ 𝒒𝒒 > 𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅(𝒒𝒒,𝑯𝑯𝑺𝑺∖ 𝒒𝒒 )

(𝒌𝒌 − 𝟏𝟏)-dimensional Subspace

Page 56: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Main Lemma [informal]: Local Search preserves maximum distance to “all” subspaces of dimension 𝒌𝒌 − 𝟏𝟏

Page 57: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

𝑽𝑽 is the point set𝑺𝑺 = 𝐿𝐿𝑆𝑆 𝑉𝑉 is the core-set produced by local search

Main Lemma [informal]: Local Search preserves maximum distance to “all” subspaces of dimension 𝒌𝒌 − 𝟏𝟏

Page 58: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Main Lemma [formal]:For any (𝑘𝑘 − 1)-dimensional subspace 𝐺𝐺, the maximum distance of the point set to 𝐺𝐺 is approximately preserved

max𝑞𝑞∈𝑆𝑆

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑞𝑞,𝐺𝐺 ≥1

2𝑘𝑘⋅ max𝑝𝑝∈𝑉𝑉

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑(𝑝𝑝,𝐺𝐺)

𝑽𝑽 is the point set𝑺𝑺 = 𝐿𝐿𝑆𝑆 𝑉𝑉 is the core-set produced by local search

Main Lemma [informal]: Local Search preserves maximum distance to “all” subspaces of dimension 𝒌𝒌 − 𝟏𝟏

Page 59: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Proof.

• Let 𝑝𝑝 ∈ 𝑉𝑉 be a point

𝑝𝑝

Main Lemma [formal]:For any (𝑘𝑘 − 1)-dimensional subspace 𝐺𝐺, the maximum distance of the point set to 𝐺𝐺 is approximately preserved

max𝑠𝑠∈𝑆𝑆

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑞𝑞,𝐺𝐺 ≥1

2𝑘𝑘⋅ max𝑝𝑝∈𝑉𝑉

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑(𝑝𝑝,𝐺𝐺)

Page 60: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Proof.

• Let 𝑝𝑝 ∈ 𝑉𝑉 be a point

• Let 𝐺𝐺 be a (𝑘𝑘 − 1)-dimensional subspace.

𝑝𝑝

𝑮𝑮

Main Lemma [formal]:For any (𝑘𝑘 − 1)-dimensional subspace 𝐺𝐺, the maximum distance of the point set to 𝐺𝐺 is approximately preserved

max𝑠𝑠∈𝑆𝑆

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑞𝑞,𝐺𝐺 ≥1

2𝑘𝑘⋅ max𝑝𝑝∈𝑉𝑉

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑(𝑝𝑝,𝐺𝐺)

Page 61: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Proof.

• Let 𝑝𝑝 ∈ 𝑉𝑉 be a point

• Let 𝐺𝐺 be a (𝑘𝑘 − 1)-dimensional subspace.

• Assume for any 𝑞𝑞 ∈ 𝑆𝑆, 𝑑𝑑 𝑞𝑞,𝐺𝐺 ≤ 𝑥𝑥

𝑝𝑝

𝑮𝑮

Main Lemma [formal]:For any (𝑘𝑘 − 1)-dimensional subspace 𝐺𝐺, the maximum distance of the point set to 𝐺𝐺 is approximately preserved

max𝑠𝑠∈𝑆𝑆

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑞𝑞,𝐺𝐺 ≥1

2𝑘𝑘⋅ max𝑝𝑝∈𝑉𝑉

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑(𝑝𝑝,𝐺𝐺)

Page 62: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Proof.

• Let 𝑝𝑝 ∈ 𝑉𝑉 be a point

• Let 𝐺𝐺 be a (𝑘𝑘 − 1)-dimensional subspace.

• Assume for any 𝑞𝑞 ∈ 𝑆𝑆, 𝑑𝑑 𝑞𝑞,𝐺𝐺 ≤ 𝑥𝑥

𝑝𝑝

𝑮𝑮

≤ 𝑥𝑥≤ 𝑥𝑥

≤ 𝑥𝑥

Main Lemma [formal]:For any (𝑘𝑘 − 1)-dimensional subspace 𝐺𝐺, the maximum distance of the point set to 𝐺𝐺 is approximately preserved

max𝑠𝑠∈𝑆𝑆

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑞𝑞,𝐺𝐺 ≥1

2𝑘𝑘⋅ max𝑝𝑝∈𝑉𝑉

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑(𝑝𝑝,𝐺𝐺)

Page 63: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Proof.

• Let 𝑝𝑝 ∈ 𝑉𝑉 be a point

• Let 𝐺𝐺 be a (𝑘𝑘 − 1)-dimensional subspace.

• Assume for any 𝑞𝑞 ∈ 𝑆𝑆, 𝑑𝑑 𝑞𝑞,𝐺𝐺 ≤ 𝑥𝑥

• Goal: 𝒅𝒅 𝒑𝒑,𝑮𝑮 ≤ 𝟐𝟐𝒌𝒌𝟐𝟐

𝑝𝑝

𝑮𝑮

≤ 2𝑘𝑘𝑥𝑥≤ 𝑥𝑥

≤ 𝑥𝑥≤ 𝑥𝑥

Main Lemma [formal]:For any (𝑘𝑘 − 1)-dimensional subspace 𝐺𝐺, the maximum distance of the point set to 𝐺𝐺 is approximately preserved

max𝑠𝑠∈𝑆𝑆

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑞𝑞,𝐺𝐺 ≥1

2𝑘𝑘⋅ max𝑝𝑝∈𝑉𝑉

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑(𝑝𝑝,𝐺𝐺)

Page 64: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Proof.

• Let 𝑝𝑝 ∈ 𝑉𝑉 be a point

• Let 𝐺𝐺 be a (𝑘𝑘 − 1)-dimensional subspace.

• Assume for any 𝑞𝑞 ∈ 𝑆𝑆, 𝑑𝑑 𝑞𝑞,𝐺𝐺 ≤ 𝑥𝑥

• Goal:

• 𝐻𝐻 ≔ 𝐻𝐻𝑆𝑆 be the subspace passing through 𝑆𝑆

𝑯𝑯𝑝𝑝

𝑮𝑮

𝒅𝒅 𝒑𝒑,𝑮𝑮 ≤ 𝟐𝟐𝒌𝒌𝟐𝟐

Main Lemma [formal]:For any (𝑘𝑘 − 1)-dimensional subspace 𝐺𝐺, the maximum distance of the point set to 𝐺𝐺 is approximately preserved

max𝑠𝑠∈𝑆𝑆

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑞𝑞,𝐺𝐺 ≥1

2𝑘𝑘⋅ max𝑝𝑝∈𝑉𝑉

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑(𝑝𝑝,𝐺𝐺)

Page 65: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Proof.

• Let 𝑝𝑝 ∈ 𝑉𝑉 be a point

• Let 𝐺𝐺 be a (𝑘𝑘 − 1)-dimensional subspace.

• Assume for any 𝑞𝑞 ∈ 𝑆𝑆, 𝑑𝑑 𝑞𝑞,𝐺𝐺 ≤ 𝑥𝑥

• Goal:

• 𝐻𝐻 ≔ 𝐻𝐻𝑆𝑆 be the subspace passing through 𝑆𝑆

• Let 𝑝𝑝𝐻𝐻 be the projection of 𝑝𝑝 onto 𝐺𝐺

𝑯𝑯𝑝𝑝

𝑝𝑝𝐻𝐻

𝑮𝑮

𝒅𝒅 𝒑𝒑,𝑮𝑮 ≤ 𝟐𝟐𝒌𝒌𝟐𝟐

Main Lemma [formal]:For any (𝑘𝑘 − 1)-dimensional subspace 𝐺𝐺, the maximum distance of the point set to 𝐺𝐺 is approximately preserved

max𝑠𝑠∈𝑆𝑆

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑞𝑞,𝐺𝐺 ≥1

2𝑘𝑘⋅ max𝑝𝑝∈𝑉𝑉

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑(𝑝𝑝,𝐺𝐺)

Page 66: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Proof.

• Let 𝑝𝑝 ∈ 𝑉𝑉 be a point

• Let 𝐺𝐺 be a (𝑘𝑘 − 1)-dimensional subspace.

• Assume for any 𝑞𝑞 ∈ 𝑆𝑆, 𝑑𝑑 𝑞𝑞,𝐺𝐺 ≤ 𝑥𝑥

• Goal:

• 𝐻𝐻 ≔ 𝐻𝐻𝑆𝑆 be the subspace passing through 𝑆𝑆

• Let 𝑝𝑝𝐻𝐻 be the projection of 𝑝𝑝 onto 𝐺𝐺

𝑯𝑯𝑝𝑝

𝑝𝑝𝐻𝐻

𝑮𝑮

≤ 𝑘𝑘𝑥𝑥

≤ 𝑘𝑘𝑥𝑥

Lemma 1: 𝒅𝒅 𝒑𝒑,𝒑𝒑𝑯𝑯 ≤ 𝒌𝒌𝟐𝟐

Lemma 2: 𝒅𝒅 𝒑𝒑𝑯𝑯,𝑮𝑮 ≤ 𝒌𝒌𝟐𝟐

𝒅𝒅 𝒑𝒑,𝑮𝑮 ≤ 𝟐𝟐𝒌𝒌𝟐𝟐

Main Lemma [formal]:For any (𝑘𝑘 − 1)-dimensional subspace 𝐺𝐺, the maximum distance of the point set to 𝐺𝐺 is approximately preserved

max𝑠𝑠∈𝑆𝑆

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑞𝑞,𝐺𝐺 ≥1

2𝑘𝑘⋅ max𝑝𝑝∈𝑉𝑉

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑(𝑝𝑝,𝐺𝐺)

Page 67: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Proof.

• Let 𝑝𝑝 ∈ 𝑉𝑉 be a point

• Let 𝐺𝐺 be a (𝑘𝑘 − 1)-dimensional subspace.

• Assume for any 𝑞𝑞 ∈ 𝑆𝑆, 𝑑𝑑 𝑞𝑞,𝐺𝐺 ≤ 𝑥𝑥

• Goal:

• 𝐻𝐻 ≔ 𝐻𝐻𝑆𝑆 be the subspace passing through 𝑆𝑆

• Let 𝑝𝑝𝐻𝐻 be the projection of 𝑝𝑝 onto 𝐺𝐺

𝑯𝑯𝑝𝑝

𝑝𝑝𝐻𝐻

𝑮𝑮

≤ 2𝑘𝑘𝑥𝑥

Lemma 1: 𝒅𝒅 𝒑𝒑,𝒑𝒑𝑯𝑯 ≤ 𝒌𝒌𝟐𝟐

Lemma 2: 𝒅𝒅 𝒑𝒑𝑯𝑯,𝑮𝑮 ≤ 𝒌𝒌𝟐𝟐

𝒅𝒅 𝒑𝒑,𝑮𝑮 ≤ 𝟐𝟐𝒌𝒌𝟐𝟐

Main Lemma [formal]:For any (𝑘𝑘 − 1)-dimensional subspace 𝐺𝐺, the maximum distance of the point set to 𝐺𝐺 is approximately preserved

max𝑠𝑠∈𝑆𝑆

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑞𝑞,𝐺𝐺 ≥1

2𝑘𝑘⋅ max𝑝𝑝∈𝑉𝑉

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑(𝑝𝑝,𝐺𝐺)

Page 68: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Lemma 2: 𝒅𝒅 𝒑𝒑𝑯𝑯,𝑮𝑮 ≤ 𝒌𝒌𝟐𝟐

𝑯𝑯𝑝𝑝

𝑝𝑝𝐻𝐻

Claim:We can write 𝑝𝑝𝐻𝐻 = ∑𝑖𝑖=1𝑐𝑐 𝛼𝛼𝑖𝑖𝑞𝑞𝑖𝑖 s.t. all 𝛼𝛼𝑖𝑖 ≤ 1

Page 69: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Proof.

• Since 𝐻𝐻 has dimension 𝑘𝑘, we can write 𝒑𝒑𝑯𝑯 = ∑𝒅𝒅=𝟏𝟏𝒌𝒌 𝜶𝜶𝒅𝒅𝒒𝒒𝒅𝒅

Lemma 2: 𝒅𝒅 𝒑𝒑𝑯𝑯,𝑮𝑮 ≤ 𝒌𝒌𝟐𝟐

𝑯𝑯𝑝𝑝

𝑝𝑝𝐻𝐻

Claim:We can write 𝑝𝑝𝐻𝐻 = ∑𝑖𝑖=1𝑐𝑐 𝛼𝛼𝑖𝑖𝑞𝑞𝑖𝑖 s.t. all 𝛼𝛼𝑖𝑖 ≤ 1

Page 70: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Proof.

• Since 𝐻𝐻 has dimension 𝑘𝑘, we can write 𝒑𝒑𝑯𝑯 = ∑𝒅𝒅=𝟏𝟏𝒌𝒌 𝜶𝜶𝒅𝒅𝒒𝒒𝒅𝒅

• Let 𝑯𝑯−𝒅𝒅 ≔ 𝑯𝑯𝑺𝑺∖ 𝒒𝒒𝒅𝒅

Lemma 2: 𝒅𝒅 𝒑𝒑𝑯𝑯,𝑮𝑮 ≤ 𝒌𝒌𝟐𝟐

𝑯𝑯𝑝𝑝

𝑝𝑝𝐻𝐻

Claim:We can write 𝑝𝑝𝐻𝐻 = ∑𝑖𝑖=1𝑐𝑐 𝛼𝛼𝑖𝑖𝑞𝑞𝑖𝑖 s.t. all 𝛼𝛼𝑖𝑖 ≤ 1

𝑯𝑯−𝒅𝒅

𝑞𝑞𝑖𝑖

Page 71: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Proof.

• Since 𝐻𝐻 has dimension 𝑘𝑘, we can write 𝒑𝒑𝑯𝑯 = ∑𝒅𝒅=𝟏𝟏𝒌𝒌 𝜶𝜶𝒅𝒅𝒒𝒒𝒅𝒅

• Let 𝑯𝑯−𝒅𝒅 ≔ 𝑯𝑯𝑺𝑺∖ 𝒒𝒒𝒅𝒅

• Since we did not choose 𝑝𝑝 in LS, 𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒑𝒑,𝑯𝑯−𝒅𝒅 ≤ 𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒒𝒒𝒅𝒅,𝑯𝑯−𝒅𝒅

Lemma 2: 𝒅𝒅 𝒑𝒑𝑯𝑯,𝑮𝑮 ≤ 𝒌𝒌𝟐𝟐

𝑯𝑯𝑝𝑝

𝑝𝑝𝐻𝐻

Claim:We can write 𝑝𝑝𝐻𝐻 = ∑𝑖𝑖=1𝑐𝑐 𝛼𝛼𝑖𝑖𝑞𝑞𝑖𝑖 s.t. all 𝛼𝛼𝑖𝑖 ≤ 1

𝑯𝑯−𝒅𝒅

𝑞𝑞𝑖𝑖

Page 72: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Proof.

• Since 𝐻𝐻 has dimension 𝑘𝑘, we can write 𝒑𝒑𝑯𝑯 = ∑𝒅𝒅=𝟏𝟏𝒌𝒌 𝜶𝜶𝒅𝒅𝒒𝒒𝒅𝒅

• Let 𝑯𝑯−𝒅𝒅 ≔ 𝑯𝑯𝑺𝑺∖ 𝒒𝒒𝒅𝒅

• Since we did not choose 𝑝𝑝 in LS, 𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒑𝒑,𝑯𝑯−𝒅𝒅 ≤ 𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒒𝒒𝒅𝒅,𝑯𝑯−𝒅𝒅

• Since 𝑝𝑝𝐻𝐻 is a projection of 𝑝𝑝 onto 𝐻𝐻, 𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒑𝒑𝑯𝑯,𝑯𝑯−𝒅𝒅 ≤ 𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒑𝒑,𝑯𝑯−𝒅𝒅

Lemma 2: 𝒅𝒅 𝒑𝒑𝑯𝑯,𝑮𝑮 ≤ 𝒌𝒌𝟐𝟐

𝑯𝑯𝑝𝑝

𝑝𝑝𝐻𝐻

Claim:We can write 𝑝𝑝𝐻𝐻 = ∑𝑖𝑖=1𝑐𝑐 𝛼𝛼𝑖𝑖𝑞𝑞𝑖𝑖 s.t. all 𝛼𝛼𝑖𝑖 ≤ 1

𝑯𝑯−𝒅𝒅

𝑞𝑞𝑖𝑖

Page 73: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Proof.

• Since 𝐻𝐻 has dimension 𝑘𝑘, we can write 𝒑𝒑𝑯𝑯 = ∑𝒅𝒅=𝟏𝟏𝒌𝒌 𝜶𝜶𝒅𝒅𝒒𝒒𝒅𝒅

• Let 𝑯𝑯−𝒅𝒅 ≔ 𝑯𝑯𝑺𝑺∖ 𝒒𝒒𝒅𝒅

• Since we did not choose 𝑝𝑝 in LS, 𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒑𝒑,𝑯𝑯−𝒅𝒅 ≤ 𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒒𝒒𝒅𝒅,𝑯𝑯−𝒅𝒅

• Since 𝑝𝑝𝐻𝐻 is a projection of 𝑝𝑝 onto 𝐻𝐻, 𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒑𝒑𝑯𝑯,𝑯𝑯−𝒅𝒅 ≤ 𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅(𝒑𝒑,𝑯𝑯−𝒅𝒅)

• Thus 𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒑𝒑𝑯𝑯,𝑯𝑯−𝒅𝒅 ≤ 𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒒𝒒𝒅𝒅,𝑯𝑯−𝒅𝒅

Lemma 2: 𝒅𝒅 𝒑𝒑𝑯𝑯,𝑮𝑮 ≤ 𝒌𝒌𝟐𝟐

𝑯𝑯𝑝𝑝𝐻𝐻

Claim:We can write 𝑝𝑝𝐻𝐻 = ∑𝑖𝑖=1𝑐𝑐 𝛼𝛼𝑖𝑖𝑞𝑞𝑖𝑖 s.t. all 𝛼𝛼𝑖𝑖 ≤ 1

𝑯𝑯−𝒅𝒅

𝑞𝑞𝑖𝑖

Page 74: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Proof.

• Since 𝐻𝐻 has dimension 𝑘𝑘, we can write 𝒑𝒑𝑯𝑯 = ∑𝒅𝒅=𝟏𝟏𝒌𝒌 𝜶𝜶𝒅𝒅𝒒𝒒𝒅𝒅

• Let 𝑯𝑯−𝒅𝒅 ≔ 𝑯𝑯𝑺𝑺∖ 𝒒𝒒𝒅𝒅

• Since we did not choose 𝑝𝑝 in LS, 𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒑𝒑,𝑯𝑯−𝒅𝒅 ≤ 𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒒𝒒𝒅𝒅,𝑯𝑯−𝒅𝒅

• Since 𝑝𝑝𝐻𝐻 is a projection of 𝑝𝑝 onto 𝐻𝐻, 𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒑𝒑𝑯𝑯,𝑯𝑯−𝒅𝒅 ≤ 𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅(𝒑𝒑,𝑯𝑯−𝒅𝒅)

• Thus 𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒑𝒑𝑯𝑯,𝑯𝑯−𝒅𝒅 ≤ 𝒅𝒅𝒅𝒅𝒅𝒅𝒅𝒅 𝒒𝒒𝒅𝒅,𝑯𝑯−𝒅𝒅

• Thus |𝜶𝜶𝒅𝒅| ≤ 𝟏𝟏

Lemma 2: 𝒅𝒅 𝒑𝒑𝑯𝑯,𝑮𝑮 ≤ 𝒌𝒌𝟐𝟐

𝑯𝑯𝑝𝑝𝐻𝐻

Claim:We can write 𝑝𝑝𝐻𝐻 = ∑𝑖𝑖=1𝑐𝑐 𝛼𝛼𝑖𝑖𝑞𝑞𝑖𝑖 s.t. all 𝛼𝛼𝑖𝑖 ≤ 1

𝑯𝑯−𝒅𝒅

𝑞𝑞𝑖𝑖

Page 75: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Claim:We can write 𝑝𝑝𝐻𝐻 = ∑𝑖𝑖=1𝑐𝑐 𝛼𝛼𝑖𝑖𝑞𝑞𝑖𝑖 s.t. all 𝛼𝛼𝑖𝑖 ≤ 1

Page 76: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Claim:We can write 𝑝𝑝𝐻𝐻 = ∑𝑖𝑖=1𝑐𝑐 𝛼𝛼𝑖𝑖𝑞𝑞𝑖𝑖 s.t. all 𝛼𝛼𝑖𝑖 ≤ 1

Assumption: 𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑞𝑞𝑖𝑖 ,𝐺𝐺 ≤ 𝑥𝑥

Page 77: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Claim:We can write 𝑝𝑝𝐻𝐻 = ∑𝑖𝑖=1𝑐𝑐 𝛼𝛼𝑖𝑖𝑞𝑞𝑖𝑖 s.t. all 𝛼𝛼𝑖𝑖 ≤ 1

Assumption: 𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑞𝑞𝑖𝑖 ,𝐺𝐺 ≤ 𝑥𝑥

Lemma2: 𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑝𝑝𝐻𝐻 ,𝐺𝐺 ≤ ∑𝑖𝑖=1𝑐𝑐 𝛼𝛼𝑖𝑖𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑(𝑞𝑞𝑖𝑖 ,𝐺𝐺) ≤ 𝑘𝑘 ⋅ 𝑥𝑥 ≤ 𝑘𝑘𝑥𝑥

Page 78: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Claim:We can write 𝑝𝑝𝐻𝐻 = ∑𝑖𝑖=1𝑐𝑐 𝛼𝛼𝑖𝑖𝑞𝑞𝑖𝑖 s.t. all 𝛼𝛼𝑖𝑖 ≤ 1

Assumption: 𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑞𝑞𝑖𝑖 ,𝐺𝐺 ≤ 𝑥𝑥

Lemma2: 𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑝𝑝𝐻𝐻 ,𝐺𝐺 ≤ ∑𝑖𝑖=1𝑐𝑐 𝛼𝛼𝑖𝑖𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑(𝑞𝑞𝑖𝑖 ,𝐺𝐺) ≤ 𝑘𝑘 ⋅ 𝑥𝑥 ≤ 𝑘𝑘𝑥𝑥

Lemma 1: 𝑑𝑑 𝑝𝑝,𝑝𝑝𝐻𝐻 ≤ 𝑘𝑘𝑥𝑥

Page 79: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Claim:We can write 𝑝𝑝𝐻𝐻 = ∑𝑖𝑖=1𝑐𝑐 𝛼𝛼𝑖𝑖𝑞𝑞𝑖𝑖 s.t. all 𝛼𝛼𝑖𝑖 ≤ 1

Assumption: 𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑞𝑞𝑖𝑖 ,𝐺𝐺 ≤ 𝑥𝑥

Lemma2: 𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑝𝑝𝐻𝐻 ,𝐺𝐺 ≤ ∑𝑖𝑖=1𝑐𝑐 𝛼𝛼𝑖𝑖𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑(𝑞𝑞𝑖𝑖 ,𝐺𝐺) ≤ 𝑘𝑘 ⋅ 𝑥𝑥 ≤ 𝑘𝑘𝑥𝑥

Lemma 1: 𝑑𝑑 𝑝𝑝,𝑝𝑝𝐻𝐻 ≤ 𝑘𝑘𝑥𝑥 Goal: 𝑑𝑑 𝑝𝑝,𝐺𝐺 ≤ 2𝑘𝑘𝑥𝑥

Page 80: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Claim:We can write 𝑝𝑝𝐻𝐻 = ∑𝑖𝑖=1𝑐𝑐 𝛼𝛼𝑖𝑖𝑞𝑞𝑖𝑖 s.t. all 𝛼𝛼𝑖𝑖 ≤ 1

Assumption: 𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑞𝑞𝑖𝑖 ,𝐺𝐺 ≤ 𝑥𝑥

Lemma2: 𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑝𝑝𝐻𝐻 ,𝐺𝐺 ≤ ∑𝑖𝑖=1𝑐𝑐 𝛼𝛼𝑖𝑖𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑(𝑞𝑞𝑖𝑖 ,𝐺𝐺) ≤ 𝑘𝑘 ⋅ 𝑥𝑥 ≤ 𝑘𝑘𝑥𝑥

Lemma 1: 𝑑𝑑 𝑝𝑝,𝑝𝑝𝐻𝐻 ≤ 𝑘𝑘𝑥𝑥 Goal: 𝑑𝑑 𝑝𝑝,𝐺𝐺 ≤ 2𝑘𝑘𝑥𝑥

Main Lemma [formal]:For any (𝑘𝑘 − 1)-dimensional subspace 𝐺𝐺, the maximum distance of the point set to 𝐺𝐺 is approximately preserved

max𝑠𝑠∈𝑆𝑆

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑 𝑞𝑞,𝐺𝐺 ≥1

2𝑘𝑘⋅ max𝑝𝑝∈𝑉𝑉

𝑑𝑑𝑖𝑖𝑑𝑑𝑑𝑑(𝑝𝑝,𝐺𝐺)

Page 81: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Let 𝑽𝑽 = ⋃𝑖𝑖 𝑉𝑉𝑖𝑖 be the union of the point sets

Main TheoremLocal Search produces a core-set for volume maximization

Page 82: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Let 𝑽𝑽 = ⋃𝑖𝑖 𝑉𝑉𝑖𝑖 be the union of the point sets

Let 𝑺𝑺 = ⋃𝑖𝑖 𝑆𝑆𝑖𝑖 be the union of core-sets

Main TheoremLocal Search produces a core-set for volume maximization

Page 83: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Let 𝑽𝑽 = ⋃𝑖𝑖 𝑉𝑉𝑖𝑖 be the union of the point sets

Let 𝑺𝑺 = ⋃𝑖𝑖 𝑆𝑆𝑖𝑖 be the union of core-sets

Let 𝑶𝑶𝒑𝒑𝒅𝒅 = 𝒐𝒐𝟏𝟏, … ,𝒐𝒐𝒌𝒌 ⊂ 𝑉𝑉 be the optimal subset of points maximizing the volume

Main TheoremLocal Search produces a core-set for volume maximization

𝑉𝑉2

𝑉𝑉3

𝑉𝑉1

Page 84: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Let 𝑽𝑽 = ⋃𝑖𝑖 𝑉𝑉𝑖𝑖 be the union of the point sets

Let 𝑺𝑺 = ⋃𝑖𝑖 𝑆𝑆𝑖𝑖 be the union of core-sets

Let 𝑶𝑶𝒑𝒑𝒅𝒅 = 𝒐𝒐𝟏𝟏, … ,𝒐𝒐𝒌𝒌 ⊂ 𝑉𝑉 be the optimal subset of points maximizing the volume

Main TheoremLocal Search produces a core-set for volume maximization

𝑉𝑉2

𝑉𝑉3

𝑉𝑉1

𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑂𝑂𝑝𝑝𝑑𝑑

For 𝑖𝑖 = 1 𝑑𝑑𝑉𝑉 𝑘𝑘

• Let 𝑞𝑞𝑖𝑖 ∈ 𝑆𝑆 be the point that is farthest away from 𝐻𝐻𝑆𝑆𝑆𝑆𝑆𝑆∖ 𝑆𝑆𝑖𝑖

• 𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑆𝑆𝑉𝑉𝑉𝑉 ∪ 𝑞𝑞𝑖𝑖 ∖ {𝑉𝑉𝑖𝑖}

Page 85: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Let 𝑽𝑽 = ⋃𝑖𝑖 𝑉𝑉𝑖𝑖 be the union of the point sets

Let 𝑺𝑺 = ⋃𝑖𝑖 𝑆𝑆𝑖𝑖 be the union of core-sets

Let 𝑶𝑶𝒑𝒑𝒅𝒅 = 𝒐𝒐𝟏𝟏, … ,𝒐𝒐𝒌𝒌 ⊂ 𝑉𝑉 be the optimal subset of points maximizing the volume

Main TheoremLocal Search produces a core-set for volume maximization

𝒐𝒐𝟐𝟐

𝒐𝒐𝟑𝟑

𝒐𝒐𝟏𝟏

𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑂𝑂𝑝𝑝𝑑𝑑

For 𝑖𝑖 = 1 𝑑𝑑𝑉𝑉 𝑘𝑘

• Let 𝑞𝑞𝑖𝑖 ∈ 𝑆𝑆 be the point that is farthest away from 𝐻𝐻𝑆𝑆𝑆𝑆𝑆𝑆∖ 𝑆𝑆𝑖𝑖

• 𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑆𝑆𝑉𝑉𝑉𝑉 ∪ 𝑞𝑞𝑖𝑖 ∖ {𝑉𝑉𝑖𝑖}

Page 86: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Let 𝑽𝑽 = ⋃𝑖𝑖 𝑉𝑉𝑖𝑖 be the union of the point sets

Let 𝑺𝑺 = ⋃𝑖𝑖 𝑆𝑆𝑖𝑖 be the union of core-sets

Let 𝑶𝑶𝒑𝒑𝒅𝒅 = 𝒐𝒐𝟏𝟏, … ,𝒐𝒐𝒌𝒌 ⊂ 𝑉𝑉 be the optimal subset of points maximizing the volume

Main TheoremLocal Search produces a core-set for volume maximization

𝒐𝒐𝟐𝟐

𝒐𝒐𝟑𝟑

𝒐𝒐𝟏𝟏

𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑂𝑂𝑝𝑝𝑑𝑑

For 𝑖𝑖 = 1 𝑑𝑑𝑉𝑉 𝑘𝑘

• Let 𝑞𝑞𝑖𝑖 ∈ 𝑆𝑆 be the point that is farthest away from 𝐻𝐻𝑆𝑆𝑆𝑆𝑆𝑆∖ 𝑆𝑆𝑖𝑖

• 𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑆𝑆𝑉𝑉𝑉𝑉 ∪ 𝑞𝑞𝑖𝑖 ∖ {𝑉𝑉𝑖𝑖}

Page 87: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Let 𝑽𝑽 = ⋃𝑖𝑖 𝑉𝑉𝑖𝑖 be the union of the point sets

Let 𝑺𝑺 = ⋃𝑖𝑖 𝑆𝑆𝑖𝑖 be the union of core-sets

Let 𝑶𝑶𝒑𝒑𝒅𝒅 = 𝒐𝒐𝟏𝟏, … ,𝒐𝒐𝒌𝒌 ⊂ 𝑉𝑉 be the optimal subset of points maximizing the volume

Main TheoremLocal Search produces a core-set for volume maximization

𝒐𝒐𝟐𝟐

𝒐𝒐𝟑𝟑𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑂𝑂𝑝𝑝𝑑𝑑

For 𝑖𝑖 = 1 𝑑𝑑𝑉𝑉 𝑘𝑘

• Let 𝑞𝑞𝑖𝑖 ∈ 𝑆𝑆 be the point that is farthest away from 𝐻𝐻𝑆𝑆𝑆𝑆𝑆𝑆∖ 𝑆𝑆𝑖𝑖

• 𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑆𝑆𝑉𝑉𝑉𝑉 ∪ 𝑞𝑞𝑖𝑖 ∖ {𝑉𝑉𝑖𝑖}

Page 88: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Let 𝑽𝑽 = ⋃𝑖𝑖 𝑉𝑉𝑖𝑖 be the union of the point sets

Let 𝑺𝑺 = ⋃𝑖𝑖 𝑆𝑆𝑖𝑖 be the union of core-sets

Let 𝑶𝑶𝒑𝒑𝒅𝒅 = 𝒐𝒐𝟏𝟏, … ,𝒐𝒐𝒌𝒌 ⊂ 𝑉𝑉 be the optimal subset of points maximizing the volume

Main TheoremLocal Search produces a core-set for volume maximization

𝒐𝒐𝟐𝟐

𝒐𝒐𝟑𝟑

𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑂𝑂𝑝𝑝𝑑𝑑

For 𝑖𝑖 = 1 𝑑𝑑𝑉𝑉 𝑘𝑘

• Let 𝑞𝑞𝑖𝑖 ∈ 𝑆𝑆 be the point that is farthest away from 𝐻𝐻𝑆𝑆𝑆𝑆𝑆𝑆∖ 𝑆𝑆𝑖𝑖

• 𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑆𝑆𝑉𝑉𝑉𝑉 ∪ 𝑞𝑞𝑖𝑖 ∖ {𝑉𝑉𝑖𝑖}

Page 89: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Let 𝑽𝑽 = ⋃𝑖𝑖 𝑉𝑉𝑖𝑖 be the union of the point sets

Let 𝑺𝑺 = ⋃𝑖𝑖 𝑆𝑆𝑖𝑖 be the union of core-sets

Let 𝑶𝑶𝒑𝒑𝒅𝒅 = 𝒐𝒐𝟏𝟏, … ,𝒐𝒐𝒌𝒌 ⊂ 𝑉𝑉 be the optimal subset of points maximizing the volume

Main TheoremLocal Search produces a core-set for volume maximization

𝒐𝒐𝟐𝟐

𝒐𝒐𝟑𝟑

𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑂𝑂𝑝𝑝𝑑𝑑

For 𝑖𝑖 = 1 𝑑𝑑𝑉𝑉 𝑘𝑘

• Let 𝑞𝑞𝑖𝑖 ∈ 𝑆𝑆 be the point that is farthest away from 𝐻𝐻𝑆𝑆𝑆𝑆𝑆𝑆∖ 𝑆𝑆𝑖𝑖

• 𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑆𝑆𝑉𝑉𝑉𝑉 ∪ 𝑞𝑞𝑖𝑖 ∖ {𝑉𝑉𝑖𝑖}

Page 90: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Let 𝑽𝑽 = ⋃𝑖𝑖 𝑉𝑉𝑖𝑖 be the union of the point sets

Let 𝑺𝑺 = ⋃𝑖𝑖 𝑆𝑆𝑖𝑖 be the union of core-sets

Let 𝑶𝑶𝒑𝒑𝒅𝒅 = 𝒐𝒐𝟏𝟏, … ,𝒐𝒐𝒌𝒌 ⊂ 𝑉𝑉 be the optimal subset of points maximizing the volume

Main TheoremLocal Search produces a core-set for volume maximization

𝒐𝒐𝟑𝟑

𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑂𝑂𝑝𝑝𝑑𝑑

For 𝑖𝑖 = 1 𝑑𝑑𝑉𝑉 𝑘𝑘

• Let 𝑞𝑞𝑖𝑖 ∈ 𝑆𝑆 be the point that is farthest away from 𝐻𝐻𝑆𝑆𝑆𝑆𝑆𝑆∖ 𝑆𝑆𝑖𝑖

• 𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑆𝑆𝑉𝑉𝑉𝑉 ∪ 𝑞𝑞𝑖𝑖 ∖ {𝑉𝑉𝑖𝑖}

Page 91: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Let 𝑽𝑽 = ⋃𝑖𝑖 𝑉𝑉𝑖𝑖 be the union of the point sets

Let 𝑺𝑺 = ⋃𝑖𝑖 𝑆𝑆𝑖𝑖 be the union of core-sets

Let 𝑶𝑶𝒑𝒑𝒅𝒅 = 𝒐𝒐𝟏𝟏, … ,𝒐𝒐𝒌𝒌 ⊂ 𝑉𝑉 be the optimal subset of points maximizing the volume

Main TheoremLocal Search produces a core-set for volume maximization

𝒐𝒐𝟑𝟑

𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑂𝑂𝑝𝑝𝑑𝑑

For 𝑖𝑖 = 1 𝑑𝑑𝑉𝑉 𝑘𝑘

• Let 𝑞𝑞𝑖𝑖 ∈ 𝑆𝑆 be the point that is farthest away from 𝐻𝐻𝑆𝑆𝑆𝑆𝑆𝑆∖ 𝑆𝑆𝑖𝑖

• 𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑆𝑆𝑉𝑉𝑉𝑉 ∪ 𝑞𝑞𝑖𝑖 ∖ {𝑉𝑉𝑖𝑖}

Page 92: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Let 𝑽𝑽 = ⋃𝑖𝑖 𝑉𝑉𝑖𝑖 be the union of the point sets

Let 𝑺𝑺 = ⋃𝑖𝑖 𝑆𝑆𝑖𝑖 be the union of core-sets

Let 𝑶𝑶𝒑𝒑𝒅𝒅 = 𝒐𝒐𝟏𝟏, … ,𝒐𝒐𝒌𝒌 ⊂ 𝑉𝑉 be the optimal subset of points maximizing the volume

Main TheoremLocal Search produces a core-set for volume maximization

𝒐𝒐𝟑𝟑

𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑂𝑂𝑝𝑝𝑑𝑑

For 𝑖𝑖 = 1 𝑑𝑑𝑉𝑉 𝑘𝑘

• Let 𝑞𝑞𝑖𝑖 ∈ 𝑆𝑆 be the point that is farthest away from 𝐻𝐻𝑆𝑆𝑆𝑆𝑆𝑆∖ 𝑆𝑆𝑖𝑖

• 𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑆𝑆𝑉𝑉𝑉𝑉 ∪ 𝑞𝑞𝑖𝑖 ∖ {𝑉𝑉𝑖𝑖}

Page 93: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Let 𝑽𝑽 = ⋃𝑖𝑖 𝑉𝑉𝑖𝑖 be the union of the point sets

Let 𝑺𝑺 = ⋃𝑖𝑖 𝑆𝑆𝑖𝑖 be the union of core-sets

Let 𝑶𝑶𝒑𝒑𝒅𝒅 = 𝒐𝒐𝟏𝟏, … ,𝒐𝒐𝒌𝒌 ⊂ 𝑉𝑉 be the optimal subset of points maximizing the volume

Main TheoremLocal Search produces a core-set for volume maximization

𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑂𝑂𝑝𝑝𝑑𝑑

For 𝑖𝑖 = 1 𝑑𝑑𝑉𝑉 𝑘𝑘

• Let 𝑞𝑞𝑖𝑖 ∈ 𝑆𝑆 be the point that is farthest away from 𝐻𝐻𝑆𝑆𝑆𝑆𝑆𝑆∖ 𝑆𝑆𝑖𝑖

• 𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑆𝑆𝑉𝑉𝑉𝑉 ∪ 𝑞𝑞𝑖𝑖 ∖ {𝑉𝑉𝑖𝑖}

Page 94: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Let 𝑽𝑽 = ⋃𝑖𝑖 𝑉𝑉𝑖𝑖 be the union of the point sets

Let 𝑺𝑺 = ⋃𝑖𝑖 𝑆𝑆𝑖𝑖 be the union of core-sets

Let 𝑶𝑶𝒑𝒑𝒅𝒅 = 𝒐𝒐𝟏𝟏, … ,𝒐𝒐𝒌𝒌 ⊂ 𝑉𝑉 be the optimal subset of points maximizing the volume

Main TheoremLocal Search produces a core-set for volume maximization

𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑂𝑂𝑝𝑝𝑑𝑑

For 𝑖𝑖 = 1 𝑑𝑑𝑉𝑉 𝑘𝑘

• Let 𝑞𝑞𝑖𝑖 ∈ 𝑆𝑆 be the point that is farthest away from 𝐻𝐻𝑆𝑆𝑆𝑆𝑆𝑆∖ 𝑆𝑆𝑖𝑖

• 𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑆𝑆𝑉𝑉𝑉𝑉 ∪ 𝑞𝑞𝑖𝑖 ∖ {𝑉𝑉𝑖𝑖}

Page 95: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Let 𝑽𝑽 = ⋃𝑖𝑖 𝑉𝑉𝑖𝑖 be the union of the point sets

Let 𝑺𝑺 = ⋃𝑖𝑖 𝑆𝑆𝑖𝑖 be the union of core-sets

Let 𝑶𝑶𝒑𝒑𝒅𝒅 = 𝒐𝒐𝟏𝟏, … ,𝒐𝒐𝒌𝒌 ⊂ 𝑉𝑉 be the optimal subset of points maximizing the volume

Main TheoremLocal Search produces a core-set for volume maximization

𝑉𝑉2

𝑉𝑉3

𝑉𝑉1

𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑂𝑂𝑝𝑝𝑑𝑑

For 𝑖𝑖 = 1 𝑑𝑑𝑉𝑉 𝑘𝑘

• Let 𝑞𝑞𝑖𝑖 ∈ 𝑆𝑆 be the point that is farthest away from 𝐻𝐻𝑆𝑆𝑆𝑆𝑆𝑆∖ 𝑆𝑆𝑖𝑖

• 𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑆𝑆𝑉𝑉𝑉𝑉 ∪ 𝑞𝑞𝑖𝑖 ∖ {𝑉𝑉𝑖𝑖}

Page 96: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Let 𝑽𝑽 = ⋃𝑖𝑖 𝑉𝑉𝑖𝑖 be the union of the point sets

Let 𝑺𝑺 = ⋃𝑖𝑖 𝑆𝑆𝑖𝑖 be the union of core-sets

Let 𝑶𝑶𝒑𝒑𝒅𝒅 = 𝒐𝒐𝟏𝟏, … ,𝒐𝒐𝒌𝒌 ⊂ 𝑉𝑉 be the optimal subset of points maximizing the volume

Main TheoremLocal Search produces a core-set for volume maximization

𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑂𝑂𝑝𝑝𝑑𝑑

For 𝑖𝑖 = 1 𝑑𝑑𝑉𝑉 𝑘𝑘

• Let 𝑞𝑞𝑖𝑖 ∈ 𝑆𝑆 be the point that is farthest away from 𝐻𝐻𝑆𝑆𝑆𝑆𝑆𝑆∖ 𝑆𝑆𝑖𝑖

• 𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑆𝑆𝑉𝑉𝑉𝑉 ∪ 𝑞𝑞𝑖𝑖 ∖ {𝑉𝑉𝑖𝑖}

Page 97: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Lose a factor of at most 2𝑘𝑘 at each iteration

Main TheoremLocal Search produces a core-set for volume maximization

𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑂𝑂𝑝𝑝𝑑𝑑

For 𝑖𝑖 = 1 𝑑𝑑𝑉𝑉 𝑘𝑘

• Let 𝑞𝑞𝑖𝑖 ∈ 𝑆𝑆 be the point that is farthest away from 𝐻𝐻𝑆𝑆𝑆𝑆𝑆𝑆∖ 𝑆𝑆𝑖𝑖

• 𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑆𝑆𝑉𝑉𝑉𝑉 ∪ 𝑞𝑞𝑖𝑖 ∖ {𝑉𝑉𝑖𝑖}Since local search preserve

maximum distances to subspaces

Let 𝑉𝑉 = ⋃𝑖𝑖 𝑉𝑉𝑖𝑖 be the union of the point sets

Let 𝑆𝑆 = ⋃𝑖𝑖 𝑆𝑆𝑖𝑖 be the union of core-sets

Let 𝑂𝑂𝑝𝑝𝑑𝑑 = 𝑉𝑉1, … , 𝑉𝑉𝑐𝑐 ⊂ 𝑉𝑉 be the optimal subset of points maximizing the volume

Page 98: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Let 𝑉𝑉 = ⋃𝑖𝑖 𝑉𝑉𝑖𝑖 be the union of the point sets

Let 𝑆𝑆 = ⋃𝑖𝑖 𝑆𝑆𝑖𝑖 be the union of core-sets

Let 𝑂𝑂𝑝𝑝𝑑𝑑 = 𝑉𝑉1, … , 𝑉𝑉𝑐𝑐 ⊂ 𝑉𝑉 be the optimal subset of points maximizing the volume

𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑂𝑂𝑝𝑝𝑑𝑑

For 𝑖𝑖 = 1 𝑑𝑑𝑉𝑉 𝑘𝑘

• Let 𝑞𝑞𝑖𝑖 ∈ 𝑆𝑆 be the point that is farthest away from 𝐻𝐻𝑆𝑆𝑆𝑆𝑆𝑆∖ 𝑆𝑆𝑖𝑖

• 𝑆𝑆𝑉𝑉𝑉𝑉 ← 𝑆𝑆𝑉𝑉𝑉𝑉 ∪ 𝑞𝑞𝑖𝑖 ∖ {𝑉𝑉𝑖𝑖}

Lose a factor of at most 2𝑘𝑘 at each iteration

Total approximation factor 2𝑘𝑘 𝑐𝑐

Main TheoremLocal Search produces a core-set for volume maximization

Page 99: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Empirical Results

Data Set• MNIST, with number of parts = 10• MNIST, with number of parts = 50• GENES, with number of parts = 10

Process- Partition the data set randomly into parts- Compute a core-set using one of the algorithms: Greedy, Local

Search, LP-Based algorithm of [IMOR’18]- Use greedy on the union of the coresets

Page 100: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Local Search vs Greedy

Improvement of the solution of Local Search over Greedy On average, 1.2%, 2.5%, and 9.6%

improvement Some cases up to 58% improvement

Ratio of runtime of Local Search over Greedy On average, 6 times slower

Page 101: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

Local Search vs. LP-based Algorithm of [IMOR’18]

Improvement of the solution of Local Search over [IMOR’18] On average, 1.4%, 1.8%, and 7.3%

improvement Some cases up to 63% improvement

Ratio of runtime of Local Search over [IMOR’18] For lower values of k, Local Search is up to

50 times faster.

Page 102: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

[IMOR’18] Greedy Local SearchApproximation 𝑂𝑂 𝑘𝑘 log𝑘𝑘 𝑐𝑐/2 𝑂𝑂(𝐶𝐶𝑐𝑐2) 𝑂𝑂 𝑘𝑘𝑐𝑐

Core-set Size 𝑂𝑂(𝑘𝑘 log𝑘𝑘) 𝑘𝑘 𝑘𝑘Simple? × ✔ ✔

Empirical Approximation Performs Best

Empirical Runtime Slowest Fastest 4 times slower than Greedy.

Summary• Volume/Determinant Maximization Problem• Notion of composable core-sets• Algorithms that find composable core-sets for volume/determinant maximization

Page 103: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

[IMOR’18] Greedy Local SearchApproximation 𝑂𝑂 𝑘𝑘 log𝑘𝑘 𝑐𝑐/2 𝑂𝑂(𝐶𝐶𝑐𝑐2) 𝑂𝑂 𝑘𝑘𝑐𝑐

Core-set Size 𝑂𝑂(𝑘𝑘 log𝑘𝑘) 𝑘𝑘 𝑘𝑘Simple? × ✔ ✔

Empirical Approximation Performs Best

Empirical Runtime Slowest Fastest 4 times slower than Greedy.

Conclusion• Local Search might be the right algorithm to use in massive data models of computation.

Summary• Volume/Determinant Maximization Problem• Notion of composable core-sets• Algorithms that find composable core-sets for volume/determinant maximization

Page 104: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

[IMOR’18] Greedy Local SearchApproximation 𝑂𝑂 𝑘𝑘 log𝑘𝑘 𝑐𝑐/2 𝑂𝑂(𝐶𝐶𝑐𝑐2) 𝑂𝑂 𝑘𝑘𝑐𝑐

Core-set Size 𝑂𝑂(𝑘𝑘 log𝑘𝑘) 𝑘𝑘 𝑘𝑘Simple? × ✔ ✔

Empirical Approximation Performs Best

Empirical Runtime Slowest Fastest 4 times slower than Greedy.

Conclusion• Local Search might be the right algorithm to use in massive data models of computation.

Open Problem• Tight analysis of Greedy: does it also provide approximation 𝑘𝑘O(k)?

Summary• Volume/Determinant Maximization Problem• Notion of composable core-sets• Algorithms that find composable core-sets for volume/determinant maximization

Page 105: ComposableCore-sets for Determinant Maximization: A Simple ...mahabadi/slides/Det-ICML.pdf · DeterminantalPoint Processes (DPP) DPP: Very popular probabilistic model, where given

[IMOR’18] Greedy Local SearchApproximation 𝑂𝑂 𝑘𝑘 log𝑘𝑘 𝑐𝑐/2 𝑂𝑂(𝐶𝐶𝑐𝑐2) 𝑂𝑂 𝑘𝑘𝑐𝑐

Core-set Size 𝑂𝑂(𝑘𝑘 log𝑘𝑘) 𝑘𝑘 𝑘𝑘Simple? × ✔ ✔

Empirical Approximation Performs Best

Empirical Runtime Slowest Fastest 4 times slower than Greedy.

Summary• Volume/Determinant Maximization Problem• Notion of composable core-sets• Algorithms that find composable core-sets for volume/determinant maximization

Conclusion• Local Search might be the right algorithm to use in massive data models of computation.

Open Problem• Tight analysis of Greedy: does it also provide approximation 𝑘𝑘O(k)?

Thank you!