Online Submodular Optimization Problems Menglong Li April 26, 2018 Menglong Li (UIUC) Short title April 26, 2018 1 / 24


Online Submodular Optimization Problems

Menglong Li

April 26, 2018


Overview

1 The Reason

2 Online convex optimization (OCO)

3 Online submodular set function minimization

4 Online Continuous Submodular Maximization

5 Online submodular minimization in real space
    Quadratic submodular functions
    General continuous submodular functions


Why consider submodular functions

Question: why consider submodular functions?

Submodular functions often appear as objective functions in machine learning tasks such as sensor placement, document summarization, and active learning.

Submodular functions can model the valuation functions of agents with diminishing returns.

Submodular functions are in general neither convex nor concave, which leads to very hard optimization problems.


Why consider online problems

Question: why consider online problems?

Online problems model multi-period decision problems in which the cost function of each period is revealed only after the decision for that period has been made.


Online convex optimization

Model setting: Consider a multi-period decision-making problem in which a decision maker makes a decision in each period so as to minimize the "regret".

For t = 1, ..., T:

At iteration t, the decision maker chooses $x_t \in K$.

After the decision maker has committed to this choice, a convex cost function $f_t \in \mathcal{F}$, $f_t : K \to \mathbb{R}$, is revealed.

Then go to the next period.

The decision maker wants to minimize the regret

$$R_T((x_t)) = \sum_{t=1}^{T} f_t(x_t) - \min_{x \in K} \sum_{t=1}^{T} f_t(x).$$


OCO

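Projected gradient algorithm:

$$x_{t+1} = \Pi_K\bigl(x_t - \alpha_t \nabla f_t(x_t)\bigr),$$

where $\Pi_K$ denotes the Euclidean projection onto K. Let D denote the diameter of K.

1 If each $f_t$ is L-Lipschitz continuous and $\alpha_t = \frac{D}{L\sqrt{2T}}$, then

$$R_T((x_t)) \le DL\sqrt{2T}.$$

2 If each $f_t$ is L-Lipschitz continuous and μ-strongly convex, and $\alpha_t = \frac{1}{\mu t}$, then

$$R_T((x_t)) \le \frac{L^2}{2\mu}(1 + \log T).$$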
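The projected gradient update and the regret it incurs can be illustrated with a small sketch. Everything below is an assumption chosen for illustration and is not part of the talk: the decision set is K = [-1, 1] (so D = 2), the losses are f_t(x) = |x - z_t| (1-Lipschitz, using a subgradient where the loss is not differentiable), and the names `project_interval`, `online_gradient_descent`, and `regret` are hypothetical.

```python
import math


def project_interval(x, lo, hi):
    """Euclidean projection of a scalar onto K = [lo, hi]."""
    return min(max(x, lo), hi)


def online_gradient_descent(targets, lo=-1.0, hi=1.0, lipschitz=1.0):
    """Play x_t, then observe f_t(x) = |x - z_t| and take a projected step."""
    horizon = len(targets)
    diameter = hi - lo
    alpha = diameter / (lipschitz * math.sqrt(2 * horizon))  # constant step size
    x = project_interval(0.0, lo, hi)  # x_1: an arbitrary point of K
    plays = []
    for z in targets:
        plays.append(x)  # commit to x_t before f_t is revealed
        # Subgradient of |x - z| at the played point.
        if x > z:
            g = 1.0
        elif x < z:
            g = -1.0
        else:
            g = 0.0
        x = project_interval(x - alpha * g, lo, hi)
    return plays


def regret(plays, targets):
    """Cumulative loss of the plays minus that of the best fixed decision."""
    incurred = sum(abs(x - z) for x, z in zip(plays, targets))
    # A minimizer of sum_t |c - z_t| is a median, so trying each z_t suffices.
    best_fixed = min(sum(abs(c - z) for z in targets) for c in targets)
    return incurred - best_fixed


T = 200
targets = [0.8 if t % 3 else -0.5 for t in range(T)]  # illustrative loss sequence
plays = online_gradient_descent(targets)
print(regret(plays, targets), "<=", 2.0 * 1.0 * math.sqrt(2 * T))
```

The final line prints the realized regret next to the bound $DL\sqrt{2T}$ for this particular sequence.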


Submodular set function

Let $[n] = \{1, ..., n\}$. A set function $f : 2^{[n]} \to \mathbb{R}$ is called submodular if for all sets $S, T \subseteq [n]$ such that $T \subseteq S$, and for all elements $i \in [n] \setminus S$, we have

$$f(T \cup \{i\}) - f(T) \ge f(S \cup \{i\}) - f(S).$$

Or equivalently, f is submodular if and only if for all $S, T \subseteq [n]$,

$$f(S \cup T) + f(S \cap T) \le f(S) + f(T).$$

An important theorem: For any set function $f : 2^{[n]} \to \mathbb{R}$, there is an extension $f^L : [0, 1]^n \to \mathbb{R}$ (called the Lovász extension) such that f is submodular if and only if $f^L$ is convex. In addition, the minimum of $f^L$ over $[0, 1]^n$ equals the minimum of f, and a minimizer of one can be obtained from a minimizer of the other.
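Both definitions can be checked by brute force on a small ground set, and the Lovász extension can be evaluated by sorting the coordinates. The sketch below is illustrative only: the example function (the cut function of a 4-cycle) and the names `subsets`, `is_submodular`, `lovasz_extension`, `cut`, and `EDGES` are assumptions, not taken from the talk. The extension is computed as $f^L(x) = f(\emptyset) + \sum_k x_{\pi(k)} \bigl(f(S_k) - f(S_{k-1})\bigr)$, where $\pi$ sorts x in decreasing order and $S_k = \{\pi(1), ..., \pi(k)\}$.

```python
from itertools import combinations


def subsets(n):
    """All subsets of {0, ..., n-1} as frozensets."""
    for k in range(n + 1):
        for chosen in combinations(range(n), k):
            yield frozenset(chosen)


def is_submodular(f, n, tol=1e-12):
    """Brute-force check of f(S | T) + f(S & T) <= f(S) + f(T)."""
    all_sets = list(subsets(n))
    return all(
        f(S | T) + f(S & T) <= f(S) + f(T) + tol
        for S in all_sets
        for T in all_sets
    )


def lovasz_extension(f, x):
    """Evaluate the Lovasz extension of f at a point x in [0, 1]^n."""
    order = sorted(range(len(x)), key=lambda i: -x[i])  # decreasing coordinates
    prefix = set()
    previous = f(frozenset())
    value = previous
    for i in order:
        prefix.add(i)
        current = f(frozenset(prefix))
        value += x[i] * (current - previous)  # weight times marginal gain
        previous = current
    return value


EDGES = [(0, 1), (1, 2), (2, 3), (0, 3)]  # a 4-cycle, purely illustrative


def cut(S):
    """Number of edges with exactly one endpoint in S (a submodular function)."""
    return float(sum((u in S) != (v in S) for u, v in EDGES))


print(is_submodular(cut, 4))                       # cut functions are submodular
print(is_submodular(lambda S: len(S) ** 2, 4))     # increasing marginal gains
print(lovasz_extension(cut, [0.5, 0.5, 0.0, 0.0]))
```

On indicator vectors the extension agrees with the set function itself, which is what makes it an extension.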


Results

The model setting is similar to OCO:

Assume each $f_t : 2^{[n]} \to [-M, M]$ is submodular.

Suppose that in each period t, the decision maker has unlimited access to the value oracles of the previously seen cost functions $f_1, f_2, ..., f_{t-1}$.

Theorem (Hazan, E., & Kale, S. (2012).)

There is an online subgradient descent algorithm with step size $\alpha_t = \sqrt{\frac{n}{16 M^2 T}}$ such that

$$\mathbb{E}[\mathrm{Regret}_T] \le 4M\sqrt{nT}.$$
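One way such an algorithm can be organized is sketched below: keep a point x in $[0,1]^n$, play the random level set $\{i : x_i > \tau\}$ with $\tau$ uniform on [0, 1], and take a projected subgradient step on the Lovász extension of the revealed cost. This is a hedged illustration rather than a transcription of Hazan and Kale's algorithm. The step size $\sqrt{n}/(4M\sqrt{T})$ is an assumption chosen to be consistent with the stated bound, and the cost functions (`make_cost`) and all function names are hypothetical.

```python
import math
import random
from itertools import combinations


def lovasz_value_and_subgradient(f, x):
    """Value and a subgradient of the Lovasz extension of f at x."""
    order = sorted(range(len(x)), key=lambda i: -x[i])
    grad = [0.0] * len(x)
    prefix = set()
    previous = f(frozenset())
    value = previous
    for i in order:
        prefix.add(i)
        current = f(frozenset(prefix))
        grad[i] = current - previous  # marginal gain along the sorted chain
        value += x[i] * grad[i]
        previous = current
    return value, grad


def online_submodular_descent(costs, n, M, seed=0):
    """Return the sets played and the sum of Lovasz values f_t^L(x_t)."""
    rng = random.Random(seed)
    eta = math.sqrt(n) / (4 * M * math.sqrt(len(costs)))  # assumed step size
    x = [0.5] * n
    played, expected_cost = [], 0.0
    for f in costs:
        tau = rng.random()
        played.append(frozenset(i for i in range(n) if x[i] > tau))
        value, grad = lovasz_value_and_subgradient(f, x)  # f_t revealed after play
        expected_cost += value  # equals the expected cost of the random set
        # Projection onto the cube [0, 1]^n is coordinatewise clipping.
        x = [min(1.0, max(0.0, xi - eta * gi)) for xi, gi in zip(x, grad)]
    return played, expected_cost


def make_cost(t, n):
    """Illustrative submodular cost: concave-of-cardinality minus a modular term."""
    w = [((3 * t + 5 * i) % 7) / 6.0 for i in range(n)]
    return lambda S: math.sqrt(len(S)) - sum(w[i] for i in S)


n, M, T = 4, 2.0, 100  # with n = 4 the costs above take values in [-2, 2]
costs = [make_cost(t, n) for t in range(T)]
played, expected_cost = online_submodular_descent(costs, n, M)
all_sets = [frozenset(c) for k in range(n + 1) for c in combinations(range(n), k)]
best_fixed = min(sum(f(S) for f in costs) for S in all_sets)
print(expected_cost - best_fixed, "<=", 4 * M * math.sqrt(n * T))
```

The printed quantity is the regret measured through the Lovász extension, that is, the regret in expectation over the random thresholds; the realized regret of the sampled sets fluctuates around it.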


Submodular function in real space

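Let $\mathcal{L}$ be a lattice. A function $f : \mathcal{L} \to \mathbb{R} \cup \{+\infty\}$ is submodular if for all $x, y \in \mathcal{L}$,

$$f(x) + f(y) \ge f(x \wedge y) + f(x \vee y),$$

where, for $\mathcal{L} \subseteq \mathbb{R}^n$, $\wedge$ and $\vee$ denote the componentwise minimum and maximum.

When f is twice differentiable and $\mathcal{L}$ is a box, f is submodular if and only if

$$\frac{\partial^2 f(x)}{\partial x_i \partial x_j} \le 0, \quad \forall i \ne j, \; x \in \mathcal{L}.$$

A twice differentiable function f is called DR-submodular if

$$\frac{\partial^2 f(x)}{\partial x_i \partial x_j} \le 0, \quad \forall i, j, \; x \in \mathcal{L}.$$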
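The difference between the two conditions is only whether the diagonal second derivatives are also required to be nonpositive. The sketch below illustrates this on the unit square with three hypothetical functions (`f_sub`, `f_dr`, `f_not`), checking the lattice inequality on a grid and estimating second partial derivatives by finite differences. The grid, the step `h`, and all names are illustrative assumptions.

```python
from itertools import product


def meet(x, y):
    """Componentwise minimum."""
    return tuple(map(min, x, y))


def join(x, y):
    """Componentwise maximum."""
    return tuple(map(max, x, y))


def is_submodular_on_grid(f, grid, tol=1e-9):
    """Check f(x) + f(y) >= f(x ^ y) + f(x v y) for all grid pairs."""
    return all(
        f(x) + f(y) + tol >= f(meet(x, y)) + f(join(x, y))
        for x in grid
        for y in grid
    )


def second_difference(f, x, i, j, h=1e-3):
    """Forward finite-difference estimate of d^2 f / (dx_i dx_j) at x."""
    def shift(p, k):
        q = list(p)
        q[k] += h
        return tuple(q)

    return (f(shift(shift(x, i), j)) - f(shift(x, i)) - f(shift(x, j)) + f(x)) / (h * h)


def f_sub(x):
    """Submodular but not DR-submodular: the x_0 diagonal derivative is +2."""
    return x[0] ** 2 - x[0] * x[1]


def f_dr(x):
    """DR-submodular: every second partial derivative equals -2."""
    return -(x[0] + x[1]) ** 2


def f_not(x):
    """Not submodular: the mixed partial derivative is +1."""
    return x[0] * x[1]


GRID = list(product([0.0, 0.25, 0.5, 0.75, 1.0], repeat=2))
for name, f in [("f_sub", f_sub), ("f_dr", f_dr), ("f_not", f_not)]:
    print(name, is_submodular_on_grid(f, GRID), second_difference(f, (0.3, 0.6), 0, 0))
```

A grid check can only refute submodularity, never prove it; for these quadratic examples the derivative conditions settle the question.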


The performance of stationary points

Theorem (Hassani et al. 2017)

Let f be monotone and DR-submodular, and let $K \subseteq \mathcal{L}$ be a convex set. Then:

(i) If x is a stationary point of f in K, then $f(x) \ge \frac{1}{2}\mathrm{OPT}$, where OPT is the maximum of f over K.

(ii) Furthermore, if f is L-smooth, gradient ascent with a step size smaller than 1/L converges to a stationary point.

The lower bound in (i) is tight.


Online gradient ascent

Online Gradient Ascent

Input: convex set K, horizon T, initial point $x_1 \in K$, step sizes $\{\alpha_t\}$
Output: $\{x_t : 1 \le t \le T\}$
for t ← 1, 2, 3, ..., T do
    Play $x_t$ and receive reward $f_t(x_t)$.
    $x_{t+1} = \Pi_K(x_t + \alpha_t \nabla f_t(x_t))$
end for

Theorem (Chen et al. 2018)

Assume that the functions $f_t : \mathcal{L} \to \mathbb{R}_+$ are monotone and DR-submodular for t = 1, 2, 3, ..., T. With step size $\alpha_t = \frac{D}{G\sqrt{t}}$,

$$\frac{1}{2} \max_{x \in K} \sum_{t=1}^{T} f_t(x) - \sum_{t=1}^{T} f_t(x_t) \le \frac{3}{4} DG\sqrt{T}.$$

Here, $D = \mathrm{diam}(K)$ and $G = \sup_{1 \le t \le T,\, x \in K} \|\nabla f_t(x)\|$.
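A minimal runnable sketch of online gradient ascent follows, under assumptions that are not from the talk: the rewards are $f_t(x) = \langle b_t, x \rangle - \frac{1}{2} x^\top H x$ with an entrywise nonnegative H (so the Hessian $-H$ is entrywise nonpositive), the vectors $b_t$ are large enough that each $f_t$ is monotone and nonnegative on $[0,1]^2$, and $K = \{x \ge 0,\ x_0 + x_1 \le 1\}$, whose diameter is $\sqrt{2}$. These example rewards happen to be concave as well, so the sketch shows the mechanics of the update rather than the hardness of the general case. All names are hypothetical.

```python
import math

H = [[1.0, 0.5], [0.5, 1.0]]  # entrywise nonnegative, illustrative


def reward(b, x):
    """f(x) = <b, x> - 0.5 * x^T H x, whose Hessian -H is entrywise <= 0."""
    quad = sum(H[i][j] * x[i] * x[j] for i in range(2) for j in range(2))
    return b[0] * x[0] + b[1] * x[1] - 0.5 * quad


def gradient(b, x):
    return [b[i] - sum(H[i][j] * x[j] for j in range(2)) for i in range(2)]


def project(v):
    """Euclidean projection onto K = {x >= 0, x_0 + x_1 <= 1}."""
    clipped = [max(vi, 0.0) for vi in v]
    if sum(clipped) <= 1.0:
        return clipped
    # Otherwise the projection lies on the face x_0 + x_1 = 1 (sort-based rule).
    cumulative, theta = 0.0, 0.0
    for k, uk in enumerate(sorted(v, reverse=True), start=1):
        cumulative += uk
        if uk - (cumulative - 1.0) / k > 0:
            theta = (cumulative - 1.0) / k
    return [max(vi - theta, 0.0) for vi in v]


def online_gradient_ascent(bs, diameter, grad_bound):
    x = [0.0, 0.0]  # x_1 in K
    plays = []
    for t, b in enumerate(bs, start=1):
        plays.append(x)  # play x_t, then the reward function is revealed
        alpha = diameter / (grad_bound * math.sqrt(t))
        g = gradient(b, x)
        x = project([x[i] + alpha * g[i] for i in range(2)])
    return plays


T = 100
bs = [(1.5 + 0.5 * (t % 3), 1.5 + 0.25 * (t % 4)) for t in range(T)]
D = math.sqrt(2.0)
G = max(math.hypot(*b) for b in bs)  # the gradient norm is largest at x = 0
plays = online_gradient_ascent(bs, D, G)
earned = sum(reward(b, x) for b, x in zip(bs, plays))
grid = [(i / 20, j / 20) for i in range(21) for j in range(21 - i)]
best_fixed = max(sum(reward(b, x) for b in bs) for x in grid)  # grid estimate
print(0.5 * best_fixed - earned, "<=", 0.75 * D * G * math.sqrt(T))
```

The benchmark `best_fixed` is only a grid estimate of the best fixed decision in hindsight, which is enough to compare the two sides of the bound numerically.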


Online Quadratic submodular function minimization

Why consider quadratic functions?

Broad applications: Quadratic optimization arises in a broad range of fields, such as combinatorial optimization, numerical partial differential equations from engineering, control and finance, and general nonlinear programming problems.

NP-hard: Nonconvex quadratic optimization problems are known to be NP-hard.


Online Quadratic submodular function minimization

Model setting: In each period t = 1, ..., T, $f_t$ is a quadratic function, i.e., $f_t(x) = x^\top A_t x$. Let the box $K \subset \mathbb{R}^n$ be the decision space. The decision maker wants to minimize the regret

$$R_T((x_t)) = \sum_{t=1}^{T} f_t(x_t) - \min_{x \in K} \sum_{t=1}^{T} f_t(x).$$

(Note that a quadratic function $x^\top A x$ with symmetric A is submodular if and only if all the off-diagonal entries of A are nonpositive.)


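SDP relaxation of the quadratic submodular function minimization problem

Consider the quadratic optimization problem

$$\mathrm{OPT}_{QP} = \min \; x^\top Q x \quad \text{s.t.} \quad x^2 \in F \qquad \text{(QP)}$$

and its semidefinite programming (SDP) relaxation

$$\mathrm{OPT}_{SDP} = \min \; \langle Q, X \rangle \quad \text{s.t.} \quad \mathrm{diag}(X) \in F, \; X \in S^n_+. \qquad \text{(SDP)}$$

Here, $F \subseteq \mathbb{R}^n$ is a closed convex set, $x^2$ denotes the componentwise square $(x_1^2, ..., x_n^2)$, and $S^n$ and $S^n_+$ are the set of n × n symmetric matrices and the set of n × n positive semidefinite symmetric matrices, respectively.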


When is QP = SDP?

Theorem (Zhang, S. (2000))

If $Q = [q_{ij}]_{n \times n}$ satisfies $q_{ij} \le 0$ for all $i \ne j$, then $\mathrm{OPT}_{QP} = \mathrm{OPT}_{SDP}$. Moreover, if $X^*$ is an optimal solution of (SDP), then $\sqrt{\mathrm{diag}(X^*)}$ is an optimal solution of (QP).


Regret bound

Back to our online quadratic submodular optimization problem: For t = 1, ..., T, $f_t(x) = x^\top A_t x$ is submodular. The decision maker successively chooses $x_t$ to minimize the regret

$$R_T((x_t)) = \sum_{t=1}^{T} f_t(x_t) - \min_{x \in K} \sum_{t=1}^{T} f_t(x).$$


Regret bound

Motivated by the SDP relaxation theorem, we first solve the online SDP problem: For t = 1, ..., T, choose $X_t \in S^n_+(K)$, i.e., $X_t \in S^n_+$ with $\mathrm{diag}(X_t) \in K^2$, to minimize the regret

$$R_T((X_t)) = \sum_{t=1}^{T} \langle A_t, X_t \rangle - \min_{X \in S^n_+(K)} \sum_{t=1}^{T} \langle A_t, X \rangle.$$

Here, $S^n_+(K) = \{X \in S^n_+ \mid \mathrm{diag}(X) \in K^2\}$.


Algorithm

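For the online SDP problem, we use the following algorithm: Initialize $X_1 \in S^n_+(K)$. For t = 1, ..., T − 1, let

$$X_{t+1} = \Pi_{S^n_+(K)}(X_t - \alpha_t A_t).$$

Theorem

If the $A_t$ are all submodular and $\alpha_t = \frac{D}{G\sqrt{T}}$, then the regret satisfies

$$R_T((X_t)) \le DG\sqrt{T}.$$

Here, $D = \mathrm{diam}(S^n_+(K))$ and $G = \max_{1 \le t \le T} \|A_t\|$, where $\|\cdot\|$ is the norm induced by the inner product $\langle \cdot, \cdot \rangle$ (the Frobenius norm).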


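Proof.

Let $Y_{t+1} = X_t - \alpha A_t$ with $\alpha = \frac{D}{G\sqrt{T}}$, and let $X^*$ be a minimizer of $\sum_{t=1}^{T} \langle A_t, X \rangle$ over $S^n_+(K)$. Since projection onto a convex set does not increase the distance to any point of that set,

$$\|X_{t+1} - X^*\|^2 \le \|Y_{t+1} - X^*\|^2 = \|X_t - X^*\|^2 + \alpha^2 \|A_t\|^2 - 2\alpha \langle A_t, X_t - X^* \rangle.$$

This implies

$$\langle A_t, X_t - X^* \rangle \le \frac{1}{2\alpha}\bigl(\|X_t - X^*\|^2 - \|X_{t+1} - X^*\|^2\bigr) + \frac{\alpha}{2}\|A_t\|^2.$$

Summing over t = 1, ..., T,

$$\sum_{t=1}^{T} \langle A_t, X_t - X^* \rangle \le \frac{1}{2\alpha}\|X_1 - X^*\|^2 + \frac{\alpha}{2}\sum_{t=1}^{T} \|A_t\|^2 \le \frac{D^2}{2\alpha} + \frac{\alpha T G^2}{2} = DG\sqrt{T}.$$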


Regret bound for original problem

Theorem

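Suppose the $X_t$ are the matrices selected in the online SDP problem. Let $x_t = \sqrt{\mathrm{diag}(X_t)} \in K' = K \cup (-K)$. Then

$$R_T((x_t)) \le R_T((X_t)) \le DG\sqrt{T}.$$

Proof.

Denote by $x_{ij}$ and $a_{ij}$ the (i, j)-th entries of $X_t$ and $A_t$, respectively. Then $x_t = (\sqrt{x_{11}}, ..., \sqrt{x_{nn}})$. Since $X_t$ is positive semidefinite, we have $x_{ij}^2 \le x_{ii} x_{jj}$. Because $a_{ij} \le 0$ for $i \ne j$,

$$\begin{aligned}
x_t^\top A_t x_t &= \sum_{i,j} a_{ij}\sqrt{x_{ii}}\sqrt{x_{jj}} = \sum_{i=1}^{n} a_{ii} x_{ii} + \sum_{i \ne j} a_{ij}\sqrt{x_{ii}}\sqrt{x_{jj}} \\
&\le \sum_{i=1}^{n} a_{ii} x_{ii} + \sum_{i \ne j} a_{ij} |x_{ij}| \le \sum_{i=1}^{n} a_{ii} x_{ii} + \sum_{i \ne j} a_{ij} x_{ij} = \langle A_t, X_t \rangle.
\end{aligned}$$

Therefore,

$$R_T((x_t)) = \sum_{t=1}^{T} x_t^\top A_t x_t - \min_{x \in K'} \sum_{t=1}^{T} x^\top A_t x \le \sum_{t=1}^{T} \langle A_t, X_t \rangle - \min_{X \in S^n_+(K')} \sum_{t=1}^{T} \langle A_t, X \rangle \le DG\sqrt{T},$$

where the two minima coincide by the theorem of Zhang (2000) applied to $Q = \sum_{t=1}^{T} A_t$, whose off-diagonal entries are nonpositive.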
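The key inequality of this proof, $x^\top A x \le \langle A, X \rangle$ for $x = \sqrt{\mathrm{diag}(X)}$, can be checked numerically on random instances. The sketch below is an illustration under stated assumptions: positive semidefinite matrices are generated as $B B^\top$, the matrices A are symmetric with nonpositive off-diagonal entries, and the helper names (`random_psd`, `random_submodular_matrix`, `gap`) are hypothetical.

```python
import math
import random


def random_psd(n, rng):
    """A positive semidefinite matrix of the form B B^T."""
    B = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    return [[sum(B[i][k] * B[j][k] for k in range(n)) for j in range(n)] for i in range(n)]


def random_submodular_matrix(n, rng):
    """Symmetric matrix with nonpositive off-diagonal entries."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = rng.uniform(-1.0, 1.0)
        for j in range(i + 1, n):
            A[i][j] = A[j][i] = -rng.uniform(0.0, 1.0)
    return A


def inner(A, X):
    """Matrix inner product <A, X> = sum_ij a_ij x_ij."""
    n = len(A)
    return sum(A[i][j] * X[i][j] for i in range(n) for j in range(n))


def quadratic_form(A, x):
    n = len(A)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def gap(A, X):
    """<A, X> - x^T A x for x = sqrt(diag(X)); nonnegative per the proof."""
    x = [math.sqrt(X[i][i]) for i in range(len(X))]
    return inner(A, X) - quadratic_form(A, x)


rng = random.Random(0)
gaps = [gap(random_submodular_matrix(4, rng), random_psd(4, rng)) for _ in range(100)]
print(min(gaps) >= -1e-9)

# For a rank-one lift X = x x^T with x >= 0, the two sides coincide.
x = [0.2, 0.7, 0.0, 1.0]
lift = [[xi * xj for xj in x] for xi in x]
print(abs(gap(random_submodular_matrix(4, rng), lift)) < 1e-9)
```

The rank-one case shows why (SDP) is a relaxation of (QP): every feasible x gives a feasible lift with the same objective value.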


How about general continuous submodular functions?

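Intuition: Generalize the Lovász extension of submodular set functions.

Let $\mathcal{L} = \prod_{i=1}^{n} X_i$, where each $X_i \subset \mathbb{R}$ is compact. Define $P(X_i)$ as the convex hull of all the one-point distributions on $X_i$, i.e.,

$$P(X_i) = \mathrm{conv}\{\delta_{x_i} : x_i \in X_i\}.$$

Let $P(\mathcal{L}) = \prod_{i=1}^{n} P(X_i)$. There are two extensions of $H : \mathcal{L} \to \mathbb{R}$:

For all $\mu \in P(\mathcal{L})$,

$$h_1(\mu_1, ..., \mu_n) = \int_0^1 H\bigl(F_{\mu_1}^{-1}(t), ..., F_{\mu_n}^{-1}(t)\bigr)\,dt,$$

where $F_{\mu_i}^{-1}$ is the (generalized) inverse cumulative distribution function of $\mu_i$.

Convex closure: the largest lower semi-continuous convex function $h_c$ such that $h_c(\delta_x) \le H(x)$,

$$h_c(\mu_1, ..., \mu_n) = \inf_{\gamma} \int_{\mathcal{L}} H(x)\,d\gamma(x),$$

where the infimum is taken over probability measures $\gamma$ on $\mathcal{L}$ whose marginals are $\mu_1, ..., \mu_n$.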
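For measures with finite support, the integral defining $h_1$ is a finite sum, because every quantile function is piecewise constant. The sketch below evaluates $h_1$ this way; it is a hedged illustration in which each measure is represented as a list of (point, mass) pairs, and the names `quantile`, `h1`, `H_sub`, `H_super`, and `mixed` are hypothetical. It uses $H(x_1, x_2) = -x_1 x_2$ (submodular) and $H(x_1, x_2) = x_1 x_2$ (not submodular) on $\{0, 1\}^2$.

```python
def quantile(measure, t):
    """Generalized inverse CDF of a finite measure [(point, mass), ...]."""
    cumulative = 0.0
    for point, mass in sorted(measure):
        cumulative += mass
        if t <= cumulative + 1e-12:
            return point
    return max(point for point, _ in measure)


def h1(H, measures):
    """Integrate H(F_1^{-1}(t), ..., F_n^{-1}(t)) over t in [0, 1]."""
    cuts = {0.0, 1.0}
    for measure in measures:
        cumulative = 0.0
        for _, mass in sorted(measure):
            cumulative += mass
            cuts.add(min(cumulative, 1.0))  # breakpoints of the quantile function
    cuts = sorted(cuts)
    total = 0.0
    for a, b in zip(cuts, cuts[1:]):
        midpoint = 0.5 * (a + b)  # all quantile functions are constant on (a, b)
        total += (b - a) * H(*(quantile(m, midpoint) for m in measures))
    return total


def H_sub(x1, x2):
    return -x1 * x2  # mixed second derivative -1: submodular


def H_super(x1, x2):
    return x1 * x2  # mixed second derivative +1: not submodular


dirac_10 = [[(1, 1.0)], [(0, 1.0)]]  # (delta_1, delta_0)
dirac_01 = [[(0, 1.0)], [(1, 1.0)]]  # (delta_0, delta_1)
mixed = [[(0, 0.5), (1, 0.5)], [(0, 0.5), (1, 0.5)]]  # midpoint of the two above

print(h1(H_sub, dirac_10), h1(H_sub, dirac_01), h1(H_sub, mixed))
print(h1(H_super, dirac_10), h1(H_super, dirac_01), h1(H_super, mixed))
```

On Dirac measures $h_1$ reproduces H. At the midpoint measure, the value for `H_sub` lies below the average of the endpoint values, as convexity requires, while the value for `H_super` lies above it.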


Theorem (Bach, F. (2015))

h1 is convex if and only if H is submodular.

If H is submodular, then the two extensions are equal, i.e., h1 = hc .

Minimizing $h_c$ on $P(\mathcal{L})$ and minimizing H on $\mathcal{L}$ are equivalent, that is, the two optimal values are equal, and one may find minimizers of one problem given minimizers of the other.


References

Chen, L., Hassani, H., & Karbasi, A. (2018). Online Continuous Submodular Maximization. arXiv preprint arXiv:1802.06052.

Hassani, H., Soltanolkotabi, M., & Karbasi, A. (2017). Gradient methods for submodular maximization. In Advances in Neural Information Processing Systems (pp. 5843-5853).

Hazan, E., & Kale, S. (2012). Online submodular minimization. Journal of Machine Learning Research, 13(Oct), 2903-2922.

Bach, F. (2015). Submodular functions: from discrete to continuous domains. arXiv preprint arXiv:1511.00394.

Zhang, S. (2000). Quadratic maximization and semidefinite relaxation. Mathematical Programming, 87(3), 453-465.


The End
