Lecture 7: Second-Order Cone Programming and Applications
Gan Zheng
University of Luxembourg
SnT Course: Convex Optimization with Applications
Fall 2012
Second-order cone programming
$$\min_x\; c^T x \quad \text{s.t.}\;\; \|A_i x + b_i\| \le f_i^T x + d_i,\ \forall i, \qquad Fx = g.$$
• More general than LP and QP.
• When $A_i = 0$ for all $i$, it reduces to an LP.
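As an aside (not in the original slides), the standard form above can be typed almost verbatim into a conic modeling tool. Below is a minimal CVXPY sketch; the sizes, the random data, and the extra norm bound on $x$ (added so the linear objective stays bounded) are illustrative assumptions.

```python
import numpy as np
import cvxpy as cp

# Minimal SOCP in the standard form of this slide; data are illustrative.
np.random.seed(0)
n, m, p, k = 5, 3, 2, 4                      # sizes (assumed for the example)
A = [np.random.randn(k, n) for _ in range(m)]
b = [np.random.randn(k) for _ in range(m)]
f = [np.random.randn(n) for _ in range(m)]
d = [5.0 + abs(np.random.randn()) for _ in range(m)]   # keeps x = 0 feasible
c = np.random.randn(n)
F = np.random.randn(p, n)
g = np.zeros(p)                              # Fx = g holds at x = 0

x = cp.Variable(n)
constraints = [cp.norm(A[i] @ x + b[i]) <= f[i] @ x + d[i] for i in range(m)]
constraints += [F @ x == g,
                cp.norm(x) <= 10]            # extra SOC constraint: bounds c^T x
prob = cp.Problem(cp.Minimize(c @ x), constraints)
prob.solve()
print(prob.status, prob.value)
```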
Consider a QCQP in the form
$$\min_x\; \|A_0 x + b_0\|^2 \quad \text{s.t.}\;\; \|A_i x + b_i\|^2 \le r_i,\ i = 1, \cdots, N.$$
This problem can be reformulated as
$$\min_{x,t}\; t \quad \text{s.t.}\;\; \|A_0 x + b_0\| \le t, \quad \|A_i x + b_i\| \le \sqrt{r_i},\ i = 1, \cdots, N,$$
which is an SOCP: taking square roots turns each quadratic constraint into a second-order cone constraint, and the minimizer is unchanged (the optimal value becomes the square root of the original one).
Example 1: Worst-case robust linear programming
Standard LP:
$$\min_x\; c^T x \quad \text{s.t.}\;\; a_i^T x \le b_i,\ \forall i.$$
Consider uncertainty in $a_i$:
$$a_i \in \{\bar a_i + P_i u \mid \|u\| \le 1\} \triangleq \mathcal{U}_i,$$
where we only know $\bar a_i$ and $P_i$.
Robust LP formulation:
$$\min_x\; c^T x \quad \text{s.t.}\;\; a_i^T x \le b_i,\ \forall i,\ \forall a_i \in \mathcal{U}_i.$$
Notice that, since $\sup_{\|u\| \le 1} u^T P_i^T x = \|P_i^T x\|$,
$$a_i^T x \le b_i,\ \forall a_i = \bar a_i + P_i u,\ \|u\| \le 1 \;\iff\; \bar a_i^T x + \|P_i^T x\| \le b_i,$$
so the robust LP problem is equivalent to
$$\min_x\; c^T x \quad \text{s.t.}\;\; \bar a_i^T x + \|P_i^T x\| \le b_i,\ \forall i,$$
which is an SOCP.
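A minimal numerical sketch of this robust counterpart in CVXPY; all problem data below are illustrative assumptions, and a norm ball on $x$ is added so the linear objective stays bounded.

```python
import numpy as np
import cvxpy as cp

# Robust LP with ellipsoidal uncertainty, in its SOCP form; data are assumed.
np.random.seed(1)
n, m = 4, 6
abar = np.random.randn(m, n)                 # nominal rows \bar a_i
P = [0.1 * np.random.randn(n, n) for _ in range(m)]
b = np.ones(m)                               # x = 0 is strictly feasible
c = np.random.randn(n)

x = cp.Variable(n)
cons = [abar[i] @ x + cp.norm(P[i].T @ x) <= b[i] for i in range(m)]
cons += [cp.norm(x) <= 5]                    # keeps c^T x bounded below
prob = cp.Problem(cp.Minimize(c @ x), cons)
prob.solve()
print(prob.status, prob.value)
```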
Example 2: Statistical robust linear programming
Standard LP:
$$\min_x\; c^T x \quad \text{s.t.}\;\; a_i^T x \le b_i,\ \forall i.$$
Consider uncertainty in $a_i$:
$$\mathcal{U}_i \triangleq \{a_i \sim \mathcal{N}(\bar a_i, \Sigma_i)\},$$
where we only know $\bar a_i$ and $\Sigma_i$. The robust constraint is
$$\mathrm{Prob}(a_i^T x \le b_i) \ge \eta,$$
and we will reformulate it.
Define $u = a_i^T x$, with mean $\bar u = \bar a_i^T x$ and variance $\sigma = x^T \Sigma_i x$. Then the constraint becomes
$$\mathrm{Prob}(a_i^T x \le b_i) \ge \eta \;\iff\; \mathrm{Prob}\!\left(\frac{u - \bar u}{\sqrt{\sigma}} \le \frac{b_i - \bar u}{\sqrt{\sigma}}\right) \ge \eta.$$
Now $\frac{u - \bar u}{\sqrt{\sigma}}$ is a zero-mean, unit-variance Gaussian variable, so the probability above is simply $\Phi\!\left(\frac{b_i - \bar u}{\sqrt{\sigma}}\right)$, where
$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\, dt$$
is the CDF of a zero-mean, unit-variance Gaussian random variable $\mathcal{N}(0,1)$. Thus the probability constraint can be expressed as
$$\frac{b_i - \bar u}{\sqrt{\sigma}} \ge \Phi^{-1}(\eta) \;\Rightarrow\; \bar u + \Phi^{-1}(\eta)\sqrt{\sigma} \le b_i \;\Rightarrow\; \bar a_i^T x + \Phi^{-1}(\eta)\, \|\Sigma_i^{1/2} x\| \le b_i.$$
Provided $\eta \ge \frac{1}{2}$, so that $\Phi^{-1}(\eta) \ge 0$, this is an SOC constraint. What if $\eta < \frac{1}{2}$?
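A sketch of the resulting chance-constrained LP, assuming $\eta = 0.95$ and random illustrative data; $\Phi^{-1}(\eta)$ comes from scipy.stats.norm.ppf, and the random matrices below simply stand in for $\Sigma_i^{1/2}$.

```python
import numpy as np
import cvxpy as cp
from scipy.stats import norm

# Chance-constrained LP; eta >= 1/2 so Phi^{-1}(eta) >= 0. Data are assumed.
np.random.seed(2)
n, m, eta = 4, 5, 0.95
abar = np.random.randn(m, n)
Sig_half = [0.2 * np.random.randn(n, n) for _ in range(m)]  # stands in for Sigma_i^{1/2}
b = np.ones(m)
c = np.random.randn(n)
kappa = norm.ppf(eta)                        # Phi^{-1}(eta)

x = cp.Variable(n)
cons = [abar[i] @ x + kappa * cp.norm(Sig_half[i] @ x) <= b[i] for i in range(m)]
cons += [cp.norm(x) <= 5]                    # keeps the objective bounded
cp.Problem(cp.Minimize(c @ x), cons).solve()
```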
Example 3: Robust Least Squares
Standard LS:
$$\min_x\; \|Ax - b\|_2.$$
Consider uncertainty in $A$:
$$A \in \{\bar A + U \mid \|U\|_2 \le \alpha\} \triangleq \mathcal{A},$$
and we only know $\bar A$ and $\alpha$.
Worst-case robust LS formulation:
$$\min_x\; \sup_{A \in \mathcal{A}}\; \|Ax - b\|_2.$$
For $A = \bar A + U$ with $\|U\|_2 \le \alpha$,
$$\|Ax - b\| = \|\bar A x - b + Ux\| \le \|\bar A x - b\| + \|Ux\| \le \|\bar A x - b\| + \alpha\|x\|.$$
The equality is shown to be achievable for some $\|U\|_2 \le \alpha$ (a suitably aligned rank-one $U$), so $\sup_{A \in \mathcal{A}} \|Ax - b\| = \|\bar A x - b\| + \alpha\|x\|$. The robust LS then becomes
$$\min_x\; \|\bar A x - b\| + \alpha\|x\|,$$
which is equivalent to the SOCP below:
$$\min_{t_1, t_2, x}\; t_1 + \alpha t_2 \quad \text{s.t.}\;\; \|\bar A x - b\| \le t_1,\ \|x\| \le t_2.$$
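The robust LS objective is directly expressible in CVXPY; the data and $\alpha$ below are illustrative assumptions.

```python
import numpy as np
import cvxpy as cp

# Worst-case robust LS: min ||Abar x - b|| + alpha ||x||; data are assumed.
np.random.seed(3)
m, n, alpha = 20, 5, 0.5
Abar = np.random.randn(m, n)
b = np.random.randn(m)

x = cp.Variable(n)
prob = cp.Problem(cp.Minimize(cp.norm(Abar @ x - b) + alpha * cp.norm(x)))
prob.solve()
# alpha -> 0 recovers ordinary least squares; larger alpha shrinks x.
```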
Example 4: Robust Minimum Variance Beamforming
Recall the average energy minimization design:
$$\min_w\; w^\dagger P w \quad \text{s.t.}\;\; w^\dagger a(\theta_{\mathrm{des}}) = 1,$$
where $P = \sum_i a(\theta_i) a^\dagger(\theta_i)$ and the $\theta_i$ are directions that are not of interest.
• Uncertainty in $\theta_{\mathrm{des}}$, or in the desired steering vector $a(\theta_{\mathrm{des}})$.
• The uncertainty effect can be modelled as
$$a = \bar a + u,$$
where $a$ is the real steering vector, $\bar a$ is the presumed one, and $u$ is the uncertainty.
• The solution is very sensitive to $u$.
Robust minimum variance beamforming
• We consider the robust beamforming problem:
$$\min_w\; w^\dagger R w \quad \text{s.t.}\;\; |w^\dagger(\bar a + u)|^2 \ge 1,\ \forall \|u\| \le \varepsilon.$$
• We can rewrite it as
$$\min_w\; w^\dagger R w \quad \text{s.t.}\;\; \inf_{\|u\| \le \varepsilon} |w^\dagger(\bar a + u)|^2 \ge 1.$$
• It is a nonconvex problem.
• Using the triangle inequality,
$$|w^\dagger(\bar a + u)| \ge |w^\dagger \bar a| - |w^\dagger u| \ge |w^\dagger \bar a| - \varepsilon\|w\|,\ \forall \|u\| \le \varepsilon,$$
since $|w^\dagger u| \le \|w\|\,\|u\| \le \varepsilon\|w\|$ by Cauchy–Schwarz.
• The equality is achieved for some $u$. Thus
$$\inf_{\|u\| \le \varepsilon} |w^\dagger(\bar a + u)| = |w^\dagger \bar a| - \varepsilon\|w\|.$$
• We have assumed $|w^\dagger \bar a| \ge \varepsilon\|w\|$. What if $|w^\dagger \bar a| < \varepsilon\|w\|$?
• The robust beamforming problem can be rewritten as
$$\min_w\; w^\dagger R w \quad \text{s.t.}\;\; |w^\dagger \bar a| - \varepsilon\|w\| \ge 1,$$
which is still not convex.
• A fact: if $w^*$ is a solution, then $w^* e^{j\theta}$ is also a solution for arbitrary $\theta$.
• Without losing optimality, we can safely add the additional constraints
$$\mathrm{Re}(w^\dagger \bar a) \ge 0, \quad \mathrm{Im}(w^\dagger \bar a) = 0.$$
• We then reach
$$\min_w\; w^\dagger R w \quad \text{s.t.}\;\; w^\dagger \bar a \ge 1 + \varepsilon\|w\|,\ \mathrm{Im}(w^\dagger \bar a) = 0.$$
• Or, formally, an SOCP:
$$\min_{t,w}\; t \quad \text{s.t.}\;\; \|R^{1/2} w\| \le t, \quad \varepsilon\|w\| \le w^\dagger \bar a - 1, \quad \mathrm{Im}(w^\dagger \bar a) = 0.$$
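A sketch of this SOCP with complex variables in CVXPY; the nominal steering vector, the covariance $R$, and $\varepsilon$ are illustrative assumptions. Since $w^\dagger R w = \|L^\dagger w\|^2$ for any factor $R = L L^\dagger$, minimizing $\|L^\dagger w\|$ is equivalent to minimizing $w^\dagger R w$.

```python
import numpy as np
import cvxpy as cp

# Robust minimum variance beamforming SOCP with complex variables.
np.random.seed(4)
n, eps = 8, 0.1
abar = np.exp(1j * np.pi * np.arange(n) * np.sin(0.3))  # nominal ULA steering vector (assumed)
Xd = (np.random.randn(n, n) + 1j * np.random.randn(n, n)) / np.sqrt(2)
R = Xd @ Xd.conj().T / n + np.eye(n)                    # Hermitian positive definite (assumed)
L = np.linalg.cholesky(R)                               # R = L L^H, so w^H R w = ||L^H w||^2

w = cp.Variable(n, complex=True)
cons = [eps * cp.norm(w) <= cp.real(abar.conj() @ w) - 1,
        cp.imag(abar.conj() @ w) == 0]
prob = cp.Problem(cp.Minimize(cp.norm(L.conj().T @ w)), cons)
prob.solve()
```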
Example 5: Multiuser transmit beamforming
[Figure: multiuser transmit beamforming block diagram. Data streams $s_1, \dots, s_M$ are weighted by beamformers $w_1, \dots, w_M$, summed at the transmit antennas, and reach users $U_1, \dots, U_M$ through channels $h_1, \dots, h_M$.]
The received signal $y_m$ at receiver $m$ is
$$y_m = h_m^\dagger w_m s_m + \sum_{n=1, n \ne m}^{M} h_m^\dagger w_n s_n + n_m.$$
• The received signal-to-interference-plus-noise ratio (SINR) $\Gamma_m$ is defined as follows:
$$\Gamma_m = \frac{|h_m^\dagger w_m|^2}{\sum_{n=1, n \ne m}^{M} |h_m^\dagger w_n|^2 + 1}.$$
• Power minimization problem with QoS constraints $\{\gamma_m\}$:
$$\min_{\{w_m\}}\; \sum_{m=1}^{M} \|w_m\|^2 \quad \text{s.t.}\;\; \Gamma_m = \frac{|h_m^\dagger w_m|^2}{\sum_{n=1, n \ne m}^{M} |h_m^\dagger w_n|^2 + 1} \ge \gamma_m,\ \forall m.$$
• The constraint $\dfrac{|h_m^\dagger w_m|^2}{\sum_{n=1, n \ne m}^{M} |h_m^\dagger w_n|^2 + 1} \ge \gamma_m$ is quadratic and originally nonconvex.
• Add the additional constraints $\mathrm{Re}(h_m^\dagger w_m) > 0$, $\mathrm{Im}(h_m^\dagger w_m) = 0$ (without losing optimality, by the same phase-rotation argument as before).
• The constraint can then be made convex in the following way:
$$\sqrt{\sum_{n=1, n \ne m}^{M} |h_m^\dagger w_n|^2 + 1} \;\le\; \frac{1}{\sqrt{\gamma_m}}\, h_m^\dagger w_m.$$
• SOCP formulation:
$$\min_{\{w_m\}, t}\; t \quad \text{s.t.}\;\; \left\| \begin{bmatrix} w_1 \\ \vdots \\ w_M \end{bmatrix} \right\| \le t, \qquad \left\| \begin{bmatrix} h_m^\dagger w_1 \\ \vdots \\ h_m^\dagger w_{m-1} \\ h_m^\dagger w_{m+1} \\ \vdots \\ h_m^\dagger w_M \\ 1 \end{bmatrix} \right\| \le \frac{1}{\sqrt{\gamma_m}}\, h_m^\dagger w_m, \quad \forall m.$$
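A CVXPY sketch of this SOCP with randomly drawn channels (illustrative assumptions). It minimizes the total power $\sum_m \|w_m\|^2$ directly, which has the same minimizer as the epigraph form above.

```python
import numpy as np
import cvxpy as cp

# Downlink power minimization with SINR targets as an SOCP; channels assumed.
np.random.seed(5)
M, n = 3, 4                                  # users, transmit antennas
h = [(np.random.randn(n) + 1j * np.random.randn(n)) / np.sqrt(2) for _ in range(M)]
gamma = [1.0, 1.0, 2.0]                      # SINR targets (assumed)

W = [cp.Variable(n, complex=True) for _ in range(M)]
cons = []
for m in range(M):
    # Stack the interference terms and the unit noise term.
    interf = cp.hstack([h[m].conj() @ W[j] for j in range(M) if j != m] + [1.0])
    # W.l.o.g. (phase rotation) h_m^H w_m can be taken real and positive.
    cons += [cp.norm(interf) <= cp.real(h[m].conj() @ W[m]) / np.sqrt(gamma[m]),
             cp.imag(h[m].conj() @ W[m]) == 0]
prob = cp.Problem(cp.Minimize(sum(cp.sum_squares(w) for w in W)), cons)
prob.solve()
```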
• We consider the problem of maximizing the worst user's SINR, given a power limit $P_0$:
$$\max_{\{w_m\}}\; \min_m\; \Gamma_m \quad \text{s.t.}\;\; \Gamma_m = \frac{|h_m^\dagger w_m|^2}{\sum_{n=1, n \ne m}^{M} |h_m^\dagger w_n|^2 + 1},\ \forall m, \qquad \sum_{m=1}^{M} \|w_m\|^2 \le P_0.$$
• Originally nonconvex.
• It can be solved via quasi-convex optimization: bisect on a common SINR target, where each feasibility check is an SOCP like the one above (see the sketch after this list).
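A sketch of the bisection procedure, reusing the feasibility SOCP from the previous slide; the channel data, the power limit, and the initial bracket are illustrative assumptions.

```python
import numpy as np
import cvxpy as cp

# Bisection on a common SINR target gamma: the max-min optimum is the largest
# gamma for which the power-constrained SOCP below is feasible.
def feasible(gamma, h, P0, n, M):
    W = [cp.Variable(n, complex=True) for _ in range(M)]
    cons = [sum(cp.sum_squares(w) for w in W) <= P0]
    for m in range(M):
        interf = cp.hstack([h[m].conj() @ W[j] for j in range(M) if j != m] + [1.0])
        cons += [cp.norm(interf) <= cp.real(h[m].conj() @ W[m]) / np.sqrt(gamma),
                 cp.imag(h[m].conj() @ W[m]) == 0]
    prob = cp.Problem(cp.Minimize(0), cons)
    prob.solve()
    return prob.status == cp.OPTIMAL

np.random.seed(5)
M, n, P0 = 3, 4, 10.0                        # users, antennas, power limit (assumed)
h = [(np.random.randn(n) + 1j * np.random.randn(n)) / np.sqrt(2) for _ in range(M)]
lo, hi = 1e-3, 100.0                         # bracket for the optimal worst-user SINR
for _ in range(30):                          # each step halves the bracket
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if feasible(mid, h, P0, n, M) else (lo, mid)
print("max-min SINR ~", lo)
```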
Example 6: Hyperbolic Constraints
• Scalar case:
$$w^2 \le xy,\ x \ge 0,\ y \ge 0 \;\iff\; \left\| \begin{bmatrix} 2w \\ x - y \end{bmatrix} \right\| \le x + y.$$
• Vector case:
$$w^T w \le xy,\ x \ge 0,\ y \ge 0 \;\iff\; \left\| \begin{bmatrix} 2w \\ x - y \end{bmatrix} \right\| \le x + y.$$
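To see why the equivalence holds, use the identity $(x+y)^2 - (x-y)^2 = 4xy$:
$$\left\| \begin{bmatrix} 2w \\ x - y \end{bmatrix} \right\|^2 = 4w^2 + (x - y)^2 \le (x + y)^2 \;\iff\; 4w^2 \le (x + y)^2 - (x - y)^2 = 4xy,$$
i.e., $w^2 \le xy$; the vector case is identical with $4w^2$ replaced by $4 w^T w$. The SOC constraint also forces $x + y \ge 0$, which together with $xy \ge w^2 \ge 0$ gives $x \ge 0$ and $y \ge 0$.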
• Consider the convex problem below:
$$\min_x\; \sum_{i=1}^{N} \frac{1}{a_i^T x + b_i} \quad \text{s.t.}\;\; a_i^T x + b_i > 0,\ c_i^T x + d_i \ge 0,\ \forall i.$$
• Equivalently,
$$\min_{x, t_i}\; \sum_{i=1}^{N} t_i \quad \text{s.t.}\;\; t_i (a_i^T x + b_i) \ge 1,\ c_i^T x + d_i \ge 0,\ \forall i.$$
• SOCP formulation (via the hyperbolic constraint above):
$$\min_{x, t_i}\; \sum_{i=1}^{N} t_i \quad \text{s.t.}\;\; \left\| \begin{bmatrix} 2 \\ a_i^T x + b_i - t_i \end{bmatrix} \right\| \le a_i^T x + b_i + t_i,\ c_i^T x + d_i \ge 0,\ \forall i.$$
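A CVXPY sketch of this SOCP; the data are illustrative assumptions, and the linear constraints $c_i^T x + d_i \ge 0$ are replaced by a simple norm ball for brevity. (CVXPY could also model the sum of inverses directly with cp.inv_pos.)

```python
import numpy as np
import cvxpy as cp

# Sum of inverses via the hyperbolic constraint:
# t_i (a_i^T x + b_i) >= 1  <=>  ||(2, a_i^T x + b_i - t_i)|| <= a_i^T x + b_i + t_i.
np.random.seed(6)
n, N = 3, 4
a = np.random.randn(N, n)
b = 3.0 + np.abs(np.random.randn(N))        # a_i^T x + b_i > 0 near x = 0
x, t = cp.Variable(n), cp.Variable(N)
cons = [cp.norm(cp.hstack([2.0, a[i] @ x + b[i] - t[i]])) <= a[i] @ x + b[i] + t[i]
        for i in range(N)]
cons += [cp.norm(x) <= 1]                   # stand-in for c_i^T x + d_i >= 0
prob = cp.Problem(cp.Minimize(cp.sum(t)), cons)
prob.solve()
```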
Extension to quadratic/linear fractional problems
• A convex problem:
$$\min_x\; \sum_i \frac{\|F_i x + g_i\|^2}{a_i^T x + b_i} \quad \text{s.t.}\;\; a_i^T x + b_i > 0,\ \forall i.$$
• First express it as
$$\min_{x, t_i}\; \sum_i t_i \quad \text{s.t.}\;\; (F_i x + g_i)^T (F_i x + g_i) \le t_i (a_i^T x + b_i), \quad a_i^T x + b_i > 0,\ \forall i.$$
• Then, as an SOCP (the vector case of the hyperbolic constraint):
$$\min_{x, t_i}\; \sum_i t_i \quad \text{s.t.}\;\; \left\| \begin{bmatrix} 2(F_i x + g_i) \\ t_i - (a_i^T x + b_i) \end{bmatrix} \right\| \le t_i + (a_i^T x + b_i), \quad a_i^T x + b_i > 0,\ \forall i.$$
Example 7: Robust Classifiers
• For linearly separable datasets, we can find a hyperplane
$$w^T x + b = 0$$
to separate the two classes. The maximum-margin hyperplane $(w, b)$ can be found by solving
$$\min_{w, b}\; \|w\|^2 \quad \text{s.t.}\;\; y_i (w^T x_i + b) \ge 1,\ \forall i.$$
• When such separation is impossible or there are outliers, we relax the constraints and use a soft margin:
$$\min_{w, b, \xi_i}\; \|w\|^2 + C \sum_i \xi_i \quad \text{s.t.}\;\; y_i (w^T x_i + b) \ge 1 - \xi_i,\ \xi_i \ge 0,\ \forall i,$$
where $C\xi_i$ is the penalty for the margin violation of sample $i$.
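A sketch of the soft-margin problem on synthetic two-class data (the data and $C$ are illustrative assumptions).

```python
import numpy as np
import cvxpy as cp

# Soft-margin SVM on synthetic, roughly separable 2-D data (assumed).
np.random.seed(7)
m, C = 40, 1.0
X = np.vstack([np.random.randn(m // 2, 2) + 2.0,
               np.random.randn(m // 2, 2) - 2.0])
y = np.hstack([np.ones(m // 2), -np.ones(m // 2)])

w, b, xi = cp.Variable(2), cp.Variable(), cp.Variable(m)
cons = [cp.multiply(y, X @ w + b) >= 1 - xi, xi >= 0]
prob = cp.Problem(cp.Minimize(cp.sum_squares(w) + C * cp.sum(xi)), cons)
prob.solve()
```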
Robust Classifiers
• Uncertainty in the observations: for each $x_i$, we only know its mean $\bar x_i$ and covariance $\Sigma_i$, without knowing its distribution, i.e., $x_i \sim (\bar x_i, \Sigma_i)$.
• We want to classify $x_i$ correctly with a high probability $1 - \varepsilon_i$ even for the worst distribution, i.e.,
$$\inf_{x_i \sim (\bar x_i, \Sigma_i)} \mathrm{Prob}(y_i (w^T x_i + b) \ge 1 - \xi_i) \ge 1 - \varepsilon_i.$$
• We have seen a similar result, but for the Gaussian distribution $x_i \sim \mathcal{N}(\bar x_i, \Sigma_i)$; now we want to solve
$$\min_{w, b, \xi}\; \|w\|^2 + C \sum_i \xi_i \quad \text{s.t.}\;\; \inf_{x_i \sim (\bar x_i, \Sigma_i)} \mathrm{Prob}(y_i (w^T x_i + b) \ge 1 - \xi_i) \ge 1 - \varepsilon_i,\ \forall i.$$
Multivariate Chebyshev inequality
• Single variable: if $x \sim (\bar x, \sigma)$ with mean $\bar x$ and variance $\sigma$, then we have
$$\mathrm{Prob}(|x - \bar x| \ge 1) \le \sigma.$$
• Multivariate case, $x_i \sim (\bar x_i, \Sigma_i)$:
$$\sup_{x \sim (\bar x_i, \Sigma_i)} \mathrm{Prob}(x \in S) = (1 + d^2)^{-1},$$
where $S$ is a convex set and $d^2 = \inf_{x \in S} (x - \bar x_i)^T \Sigma_i^{-1} (x - \bar x_i)$.
See: G. R. G. Lanckriet, L. El Ghaoui, C. Bhattacharyya, and M. I. Jordan, "A robust minimax approach to classification," Journal of Machine Learning Research, 3:555–582, 2002.
• Rewrite $\inf_{x_i \sim (\bar x_i, \Sigma_i)} \mathrm{Prob}(y_i (w^T x_i + b) \ge 1 - \xi_i) \ge 1 - \varepsilon_i$ as
$$\sup_{x_i \sim (\bar x_i, \Sigma_i)} \mathrm{Prob}(y_i (w^T x_i + b) \le 1 - \xi_i) \le \varepsilon_i.$$
• By the multivariate Chebyshev inequality,
$$\sup_{x_i \sim (\bar x_i, \Sigma_i)} \mathrm{Prob}(y_i (w^T x_i + b) \le 1 - \xi_i) = (1 + d^2)^{-1},$$
where $d^2 = \inf_{x :\, y_i (w^T x + b) \le 1 - \xi_i} (x - \bar x_i)^T \Sigma_i^{-1} (x - \bar x_i)$.
• Optimization of $d^2$: if $y_i (w^T \bar x_i + b) \le 1 - \xi_i$, then $d^2 = 0$; otherwise
$$d = \frac{y_i (w^T \bar x_i + b) - 1 + \xi_i}{\sqrt{w^T \Sigma_i w}}.$$
• Requiring $(1 + d^2)^{-1} \le \varepsilon_i$, i.e., $d \ge \sqrt{(1 - \varepsilon_i)/\varepsilon_i}$, the robust constraint is equivalent to
$$y_i (w^T \bar x_i + b) \ge 1 - \xi_i + \gamma_i \|\Sigma_i^{1/2} w\|, \quad \text{where } \gamma_i = \sqrt{\frac{1 - \varepsilon_i}{\varepsilon_i}}.$$
To sum up,
$$\min_{w, b, \xi}\; \|w\|^2 + C \sum_i \xi_i \quad \text{s.t.}\;\; \inf_{x_i \sim (\bar x_i, \Sigma_i)} \mathrm{Prob}(y_i (w^T x_i + b) \ge 1 - \xi_i) \ge 1 - \varepsilon_i,\ \forall i,$$
is equivalent to
$$\min_{w, b, \xi}\; \|w\|^2 + C \sum_i \xi_i \quad \text{s.t.}\;\; y_i (w^T \bar x_i + b) \ge 1 - \xi_i + \gamma_i \|\Sigma_i^{1/2} w\|,\ \forall i,$$
whose constraints are second-order cone constraints.
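Putting it together, a sketch of the distributionally robust classifier on synthetic data; the means, the shared covariance factor, and $\varepsilon_i$ are illustrative assumptions.

```python
import numpy as np
import cvxpy as cp

# Distributionally robust soft-margin classifier: each x_i is known only
# through its mean and covariance; gamma_i = sqrt((1 - eps_i) / eps_i).
np.random.seed(8)
m, C, eps_i = 40, 1.0, 0.1
Xbar = np.vstack([np.random.randn(m // 2, 2) + 2.0,
                  np.random.randn(m // 2, 2) - 2.0])
y = np.hstack([np.ones(m // 2), -np.ones(m // 2)])
Sig_half = 0.3 * np.eye(2)                  # shared Sigma_i^{1/2} (assumed)
gamma_i = np.sqrt((1 - eps_i) / eps_i)

w, b, xi = cp.Variable(2), cp.Variable(), cp.Variable(m)
cons = [cp.multiply(y, Xbar @ w + b) >= 1 - xi + gamma_i * cp.norm(Sig_half @ w),
        xi >= 0]
prob = cp.Problem(cp.Minimize(cp.sum_squares(w) + C * cp.sum(xi)), cons)
prob.solve()
```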