Lecture 7: Second-Order Cone Programming and Applications
Gan Zheng
University of Luxembourg
SnT Course: Convex Optimization with Applications
Fall 2012
Second-order cone programming
$$\min_x\; c^T x \quad \text{s.t.}\;\; \|A_i x + b_i\| \le f_i^T x + d_i,\ \forall i, \qquad Fx = g.$$
• More general than LP and QP.
• When $A_i = 0$ for all $i$, it reduces to an LP.
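As an aside (not in the original slides), the standard form above can be typed almost verbatim into a conic modeling tool. Below is a minimal CVXPY sketch; the sizes, the random data, and the extra norm bound on $x$ (added so the linear objective stays bounded) are illustrative assumptions.

```python
import numpy as np
import cvxpy as cp

# Minimal SOCP in the standard form of this slide; data are illustrative.
np.random.seed(0)
n, m, p, k = 5, 3, 2, 4                      # sizes (assumed for the example)
A = [np.random.randn(k, n) for _ in range(m)]
b = [np.random.randn(k) for _ in range(m)]
f = [np.random.randn(n) for _ in range(m)]
d = [5.0 + abs(np.random.randn()) for _ in range(m)]   # keeps x = 0 feasible
c = np.random.randn(n)
F = np.random.randn(p, n)
g = np.zeros(p)                              # Fx = g holds at x = 0

x = cp.Variable(n)
constraints = [cp.norm(A[i] @ x + b[i]) <= f[i] @ x + d[i] for i in range(m)]
constraints += [F @ x == g,
                cp.norm(x) <= 10]            # extra SOC constraint: bounds c^T x
prob = cp.Problem(cp.Minimize(c @ x), constraints)
prob.solve()
print(prob.status, prob.value)
```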
Consider a QCQP in the form
$$\min_x\; \|A_0 x + b_0\|^2 \quad \text{s.t.}\;\; \|A_i x + b_i\|^2 \le r_i,\ i = 1, \cdots, N.$$
This problem can be reformulated as
$$\min_{x,t}\; t \quad \text{s.t.}\;\; \|A_0 x + b_0\| \le t, \quad \|A_i x + b_i\| \le \sqrt{r_i},\ i = 1, \cdots, N,$$
which is an SOCP: taking square roots turns each quadratic constraint into a second-order cone constraint, and the minimizer is unchanged (the optimal value becomes the square root of the original one).
Example 1: Worst-case robust linear programming
Standard LP:
$$\min_x\; c^T x \quad \text{s.t.}\;\; a_i^T x \le b_i,\ \forall i.$$
Consider uncertainty in $a_i$:
$$a_i \in \{\bar a_i + P_i u \mid \|u\| \le 1\} \triangleq \mathcal{U}_i,$$
where we only know $\bar a_i$ and $P_i$.
Robust LP formulation:
$$\min_x\; c^T x \quad \text{s.t.}\;\; a_i^T x \le b_i,\ \forall i,\ \forall a_i \in \mathcal{U}_i.$$
Notice that, since $\sup_{\|u\| \le 1} u^T P_i^T x = \|P_i^T x\|$,
$$a_i^T x \le b_i,\ \forall a_i = \bar a_i + P_i u,\ \|u\| \le 1 \;\iff\; \bar a_i^T x + \|P_i^T x\| \le b_i,$$
so the robust LP problem is equivalent to
$$\min_x\; c^T x \quad \text{s.t.}\;\; \bar a_i^T x + \|P_i^T x\| \le b_i,\ \forall i,$$
which is an SOCP.
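A minimal numerical sketch of this robust counterpart in CVXPY; all problem data below are illustrative assumptions, and a norm ball on $x$ is added so the linear objective stays bounded.

```python
import numpy as np
import cvxpy as cp

# Robust LP with ellipsoidal uncertainty, in its SOCP form; data are assumed.
np.random.seed(1)
n, m = 4, 6
abar = np.random.randn(m, n)                 # nominal rows \bar a_i
P = [0.1 * np.random.randn(n, n) for _ in range(m)]
b = np.ones(m)                               # x = 0 is strictly feasible
c = np.random.randn(n)

x = cp.Variable(n)
cons = [abar[i] @ x + cp.norm(P[i].T @ x) <= b[i] for i in range(m)]
cons += [cp.norm(x) <= 5]                    # keeps c^T x bounded below
prob = cp.Problem(cp.Minimize(c @ x), cons)
prob.solve()
print(prob.status, prob.value)
```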
Example 2: Statistical robust linear programming
Standard LP:
$$\min_x\; c^T x \quad \text{s.t.}\;\; a_i^T x \le b_i,\ \forall i.$$
Consider uncertainty in $a_i$:
$$\mathcal{U}_i \triangleq \{a_i \sim \mathcal{N}(\bar a_i, \Sigma_i)\},$$
where we only know $\bar a_i$ and $\Sigma_i$. The robust constraint is
$$\mathrm{Prob}(a_i^T x \le b_i) \ge \eta,$$
and we will reformulate it.
Define $u = a_i^T x$, with mean $\bar u = \bar a_i^T x$ and variance $\sigma = x^T \Sigma_i x$. Then the constraint becomes
$$\mathrm{Prob}(a_i^T x \le b_i) \ge \eta \;\iff\; \mathrm{Prob}\!\left(\frac{u - \bar u}{\sqrt{\sigma}} \le \frac{b_i - \bar u}{\sqrt{\sigma}}\right) \ge \eta.$$
Now $\frac{u - \bar u}{\sqrt{\sigma}}$ is a zero-mean, unit-variance Gaussian variable, so the probability above is simply $\Phi\!\left(\frac{b_i - \bar u}{\sqrt{\sigma}}\right)$, where
$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\, dt$$
is the CDF of a zero-mean, unit-variance Gaussian random variable $\mathcal{N}(0,1)$. Thus the probability constraint can be expressed as
$$\frac{b_i - \bar u}{\sqrt{\sigma}} \ge \Phi^{-1}(\eta) \;\Rightarrow\; \bar u + \Phi^{-1}(\eta)\sqrt{\sigma} \le b_i \;\Rightarrow\; \bar a_i^T x + \Phi^{-1}(\eta)\, \|\Sigma_i^{1/2} x\| \le b_i.$$
Provided $\eta \ge \frac{1}{2}$, so that $\Phi^{-1}(\eta) \ge 0$, this is an SOC constraint. What if $\eta < \frac{1}{2}$?
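A sketch of the resulting chance-constrained LP, assuming $\eta = 0.95$ and random illustrative data; $\Phi^{-1}(\eta)$ comes from scipy.stats.norm.ppf, and the random matrices below simply stand in for $\Sigma_i^{1/2}$.

```python
import numpy as np
import cvxpy as cp
from scipy.stats import norm

# Chance-constrained LP; eta >= 1/2 so Phi^{-1}(eta) >= 0. Data are assumed.
np.random.seed(2)
n, m, eta = 4, 5, 0.95
abar = np.random.randn(m, n)
Sig_half = [0.2 * np.random.randn(n, n) for _ in range(m)]  # stands in for Sigma_i^{1/2}
b = np.ones(m)
c = np.random.randn(n)
kappa = norm.ppf(eta)                        # Phi^{-1}(eta)

x = cp.Variable(n)
cons = [abar[i] @ x + kappa * cp.norm(Sig_half[i] @ x) <= b[i] for i in range(m)]
cons += [cp.norm(x) <= 5]                    # keeps the objective bounded
cp.Problem(cp.Minimize(c @ x), cons).solve()
```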
Example 3: Robust Least Squares
Standard LS:
$$\min_x\; \|Ax - b\|_2.$$
Consider uncertainty in $A$:
$$A \in \{\bar A + U \mid \|U\|_2 \le \alpha\} \triangleq \mathcal{A},$$
and we only know $\bar A$ and $\alpha$.
Worst-case robust LS formulation:
$$\min_x\; \sup_{A \in \mathcal{A}}\; \|Ax - b\|_2.$$
For $A = \bar A + U$ with $\|U\|_2 \le \alpha$,
$$\|Ax - b\| = \|\bar A x - b + Ux\| \le \|\bar A x - b\| + \|Ux\| \le \|\bar A x - b\| + \alpha\|x\|.$$
The equality is shown to be achievable for some $\|U\|_2 \le \alpha$ (a suitably aligned rank-one $U$), so $\sup_{A \in \mathcal{A}} \|Ax - b\| = \|\bar A x - b\| + \alpha\|x\|$. The robust LS then becomes
$$\min_x\; \|\bar A x - b\| + \alpha\|x\|,$$
which is equivalent to the SOCP below:
$$\min_{t_1, t_2, x}\; t_1 + \alpha t_2 \quad \text{s.t.}\;\; \|\bar A x - b\| \le t_1,\ \|x\| \le t_2.$$
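The robust LS objective is directly expressible in CVXPY; the data and $\alpha$ below are illustrative assumptions.

```python
import numpy as np
import cvxpy as cp

# Worst-case robust LS: min ||Abar x - b|| + alpha ||x||; data are assumed.
np.random.seed(3)
m, n, alpha = 20, 5, 0.5
Abar = np.random.randn(m, n)
b = np.random.randn(m)

x = cp.Variable(n)
prob = cp.Problem(cp.Minimize(cp.norm(Abar @ x - b) + alpha * cp.norm(x)))
prob.solve()
# alpha -> 0 recovers ordinary least squares; larger alpha shrinks x.
```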
Example 4: Robust Minimum Variance Beamforming
Recall the average energy minimization design:
$$\min_w\; w^\dagger P w \quad \text{s.t.}\;\; w^\dagger a(\theta_{\mathrm{des}}) = 1,$$
where $P = \sum_i a(\theta_i) a^\dagger(\theta_i)$ and the $\theta_i$ are directions that are not of interest.
• Uncertainty in $\theta_{\mathrm{des}}$, or in the desired steering vector $a(\theta_{\mathrm{des}})$.
• The uncertainty effect can be modelled as
$$a = \bar a + u,$$
where $a$ is the real steering vector, $\bar a$ is the presumed one, and $u$ is the uncertainty.
• The solution is very sensitive to $u$.
Robust minimum variance beamforming
• We consider the robust beamforming problem:
$$\min_w\; w^\dagger R w \quad \text{s.t.}\;\; |w^\dagger(\bar a + u)|^2 \ge 1,\ \forall \|u\| \le \varepsilon.$$
• We can rewrite it as
$$\min_w\; w^\dagger R w \quad \text{s.t.}\;\; \inf_{\|u\| \le \varepsilon} |w^\dagger(\bar a + u)|^2 \ge 1.$$
• It is a nonconvex problem.
• Using the triangle inequality,
$$|w^\dagger(\bar a + u)| \ge |w^\dagger \bar a| - |w^\dagger u| \ge |w^\dagger \bar a| - \varepsilon\|w\|,\ \forall \|u\| \le \varepsilon,$$
since $|w^\dagger u| \le \|w\|\,\|u\| \le \varepsilon\|w\|$ by Cauchy–Schwarz.
• The equality is achieved for some $u$. Thus
$$\inf_{\|u\| \le \varepsilon} |w^\dagger(\bar a + u)| = |w^\dagger \bar a| - \varepsilon\|w\|.$$
• We have assumed $|w^\dagger \bar a| \ge \varepsilon\|w\|$. What if $|w^\dagger \bar a| < \varepsilon\|w\|$?
• The robust beamforming problem can be rewritten as
$$\min_w\; w^\dagger R w \quad \text{s.t.}\;\; |w^\dagger \bar a| - \varepsilon\|w\| \ge 1,$$
which is still not convex.
• A fact: if $w^*$ is a solution, then $w^* e^{j\theta}$ is also a solution for arbitrary $\theta$.
• Without losing optimality, we can safely add the additional constraints
$$\mathrm{Re}(w^\dagger \bar a) \ge 0, \quad \mathrm{Im}(w^\dagger \bar a) = 0.$$
• We then reach
$$\min_w\; w^\dagger R w \quad \text{s.t.}\;\; w^\dagger \bar a \ge 1 + \varepsilon\|w\|,\ \mathrm{Im}(w^\dagger \bar a) = 0.$$
• Or, formally, an SOCP:
$$\min_{t,w}\; t \quad \text{s.t.}\;\; \|R^{1/2} w\| \le t, \quad \varepsilon\|w\| \le w^\dagger \bar a - 1, \quad \mathrm{Im}(w^\dagger \bar a) = 0.$$
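A sketch of this SOCP with complex variables in CVXPY; the nominal steering vector, the covariance $R$, and $\varepsilon$ are illustrative assumptions. Since $w^\dagger R w = \|L^\dagger w\|^2$ for any factor $R = L L^\dagger$, minimizing $\|L^\dagger w\|$ is equivalent to minimizing $w^\dagger R w$.

```python
import numpy as np
import cvxpy as cp

# Robust minimum variance beamforming SOCP with complex variables.
np.random.seed(4)
n, eps = 8, 0.1
abar = np.exp(1j * np.pi * np.arange(n) * np.sin(0.3))  # nominal ULA steering vector (assumed)
Xd = (np.random.randn(n, n) + 1j * np.random.randn(n, n)) / np.sqrt(2)
R = Xd @ Xd.conj().T / n + np.eye(n)                    # Hermitian positive definite (assumed)
L = np.linalg.cholesky(R)                               # R = L L^H, so w^H R w = ||L^H w||^2

w = cp.Variable(n, complex=True)
cons = [eps * cp.norm(w) <= cp.real(abar.conj() @ w) - 1,
        cp.imag(abar.conj() @ w) == 0]
prob = cp.Problem(cp.Minimize(cp.norm(L.conj().T @ w)), cons)
prob.solve()
```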
Example 5: Multiuser transmit beamforming
[Figure: multiuser transmit beamforming block diagram. Data streams $s_1, \dots, s_M$ are weighted by beamformers $w_1, \dots, w_M$, summed at the transmit antennas, and reach users $U_1, \dots, U_M$ through channels $h_1, \dots, h_M$.]
The received signal $y_m$ at receiver $m$ is
$$y_m = h_m^\dagger w_m s_m + \sum_{n=1, n \ne m}^{M} h_m^\dagger w_n s_n + n_m.$$
• The received signal-to-interference-plus-noise ratio (SINR) $\Gamma_m$ is defined as follows:
$$\Gamma_m = \frac{|h_m^\dagger w_m|^2}{\sum_{n=1, n \ne m}^{M} |h_m^\dagger w_n|^2 + 1}.$$
• Power minimization problem with QoS constraints $\{\gamma_m\}$:
$$\min_{\{w_m\}}\; \sum_{m=1}^{M} \|w_m\|^2 \quad \text{s.t.}\;\; \Gamma_m = \frac{|h_m^\dagger w_m|^2}{\sum_{n=1, n \ne m}^{M} |h_m^\dagger w_n|^2 + 1} \ge \gamma_m,\ \forall m.$$
• The constraint $\dfrac{|h_m^\dagger w_m|^2}{\sum_{n=1, n \ne m}^{M} |h_m^\dagger w_n|^2 + 1} \ge \gamma_m$ is quadratic and originally nonconvex.
• Add the additional constraints $\mathrm{Re}(h_m^\dagger w_m) > 0$, $\mathrm{Im}(h_m^\dagger w_m) = 0$ (without losing optimality, by the same phase-rotation argument as before).
• The constraint can then be made convex in the following way:
$$\sqrt{\sum_{n=1, n \ne m}^{M} |h_m^\dagger w_n|^2 + 1} \;\le\; \frac{1}{\sqrt{\gamma_m}}\, h_m^\dagger w_m.$$
• SOCP formulation:
$$\min_{\{w_m\}, t}\; t \quad \text{s.t.}\;\; \left\| \begin{bmatrix} w_1 \\ \vdots \\ w_M \end{bmatrix} \right\| \le t, \qquad \left\| \begin{bmatrix} h_m^\dagger w_1 \\ \vdots \\ h_m^\dagger w_{m-1} \\ h_m^\dagger w_{m+1} \\ \vdots \\ h_m^\dagger w_M \\ 1 \end{bmatrix} \right\| \le \frac{1}{\sqrt{\gamma_m}}\, h_m^\dagger w_m, \quad \forall m.$$
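A CVXPY sketch of this SOCP with randomly drawn channels (illustrative assumptions). It minimizes the total power $\sum_m \|w_m\|^2$ directly, which has the same minimizer as the epigraph form above.

```python
import numpy as np
import cvxpy as cp

# Downlink power minimization with SINR targets as an SOCP; channels assumed.
np.random.seed(5)
M, n = 3, 4                                  # users, transmit antennas
h = [(np.random.randn(n) + 1j * np.random.randn(n)) / np.sqrt(2) for _ in range(M)]
gamma = [1.0, 1.0, 2.0]                      # SINR targets (assumed)

W = [cp.Variable(n, complex=True) for _ in range(M)]
cons = []
for m in range(M):
    # Stack the interference terms and the unit noise term.
    interf = cp.hstack([h[m].conj() @ W[j] for j in range(M) if j != m] + [1.0])
    # W.l.o.g. (phase rotation) h_m^H w_m can be taken real and positive.
    cons += [cp.norm(interf) <= cp.real(h[m].conj() @ W[m]) / np.sqrt(gamma[m]),
             cp.imag(h[m].conj() @ W[m]) == 0]
prob = cp.Problem(cp.Minimize(sum(cp.sum_squares(w) for w in W)), cons)
prob.solve()
```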
• We consider the problem of maximizing the worst user's SINR, given a power limit $P_0$:
$$\max_{\{w_m\}}\; \min_m\; \Gamma_m \quad \text{s.t.}\;\; \Gamma_m = \frac{|h_m^\dagger w_m|^2}{\sum_{n=1, n \ne m}^{M} |h_m^\dagger w_n|^2 + 1},\ \forall m, \qquad \sum_{m=1}^{M} \|w_m\|^2 \le P_0.$$
• Originally nonconvex.
• It can be solved via quasi-convex optimization: bisect on a common SINR target, where each feasibility check is an SOCP like the one above (see the sketch after this list).
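A sketch of the bisection procedure, reusing the feasibility SOCP from the previous slide; the channel data, the power limit, and the initial bracket are illustrative assumptions.

```python
import numpy as np
import cvxpy as cp

# Bisection on a common SINR target gamma: the max-min optimum is the largest
# gamma for which the power-constrained SOCP below is feasible.
def feasible(gamma, h, P0, n, M):
    W = [cp.Variable(n, complex=True) for _ in range(M)]
    cons = [sum(cp.sum_squares(w) for w in W) <= P0]
    for m in range(M):
        interf = cp.hstack([h[m].conj() @ W[j] for j in range(M) if j != m] + [1.0])
        cons += [cp.norm(interf) <= cp.real(h[m].conj() @ W[m]) / np.sqrt(gamma),
                 cp.imag(h[m].conj() @ W[m]) == 0]
    prob = cp.Problem(cp.Minimize(0), cons)
    prob.solve()
    return prob.status == cp.OPTIMAL

np.random.seed(5)
M, n, P0 = 3, 4, 10.0                        # users, antennas, power limit (assumed)
h = [(np.random.randn(n) + 1j * np.random.randn(n)) / np.sqrt(2) for _ in range(M)]
lo, hi = 1e-3, 100.0                         # bracket for the optimal worst-user SINR
for _ in range(30):                          # each step halves the bracket
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if feasible(mid, h, P0, n, M) else (lo, mid)
print("max-min SINR ~", lo)
```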
Example 6: Hyperbolic Constraints
• Scalar case:
$$w^2 \le xy,\ x \ge 0,\ y \ge 0 \;\iff\; \left\| \begin{bmatrix} 2w \\ x - y \end{bmatrix} \right\| \le x + y.$$
• Vector case:
$$w^T w \le xy,\ x \ge 0,\ y \ge 0 \;\iff\; \left\| \begin{bmatrix} 2w \\ x - y \end{bmatrix} \right\| \le x + y.$$
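To see why the equivalence holds, use the identity $(x+y)^2 - (x-y)^2 = 4xy$:
$$\left\| \begin{bmatrix} 2w \\ x - y \end{bmatrix} \right\|^2 = 4w^2 + (x - y)^2 \le (x + y)^2 \;\iff\; 4w^2 \le (x + y)^2 - (x - y)^2 = 4xy,$$
i.e., $w^2 \le xy$; the vector case is identical with $4w^2$ replaced by $4 w^T w$. The SOC constraint also forces $x + y \ge 0$, which together with $xy \ge w^2 \ge 0$ gives $x \ge 0$ and $y \ge 0$.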
• Consider the convex problem below:
$$\min_x\; \sum_{i=1}^{N} \frac{1}{a_i^T x + b_i} \quad \text{s.t.}\;\; a_i^T x + b_i > 0,\ c_i^T x + d_i \ge 0,\ \forall i.$$
• Equivalently,
$$\min_{x, t_i}\; \sum_{i=1}^{N} t_i \quad \text{s.t.}\;\; t_i (a_i^T x + b_i) \ge 1,\ c_i^T x + d_i \ge 0,\ \forall i.$$
• SOCP formulation (via the hyperbolic constraint above):
$$\min_{x, t_i}\; \sum_{i=1}^{N} t_i \quad \text{s.t.}\;\; \left\| \begin{bmatrix} 2 \\ a_i^T x + b_i - t_i \end{bmatrix} \right\| \le a_i^T x + b_i + t_i,\ c_i^T x + d_i \ge 0,\ \forall i.$$
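A CVXPY sketch of this SOCP; the data are illustrative assumptions, and the linear constraints $c_i^T x + d_i \ge 0$ are replaced by a simple norm ball for brevity. (CVXPY could also model the sum of inverses directly with cp.inv_pos.)

```python
import numpy as np
import cvxpy as cp

# Sum of inverses via the hyperbolic constraint:
# t_i (a_i^T x + b_i) >= 1  <=>  ||(2, a_i^T x + b_i - t_i)|| <= a_i^T x + b_i + t_i.
np.random.seed(6)
n, N = 3, 4
a = np.random.randn(N, n)
b = 3.0 + np.abs(np.random.randn(N))        # a_i^T x + b_i > 0 near x = 0
x, t = cp.Variable(n), cp.Variable(N)
cons = [cp.norm(cp.hstack([2.0, a[i] @ x + b[i] - t[i]])) <= a[i] @ x + b[i] + t[i]
        for i in range(N)]
cons += [cp.norm(x) <= 1]                   # stand-in for c_i^T x + d_i >= 0
prob = cp.Problem(cp.Minimize(cp.sum(t)), cons)
prob.solve()
```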
Extension to quadratic/linear fractional problems
• A convex problem:
$$\min_x\; \sum_i \frac{\|F_i x + g_i\|^2}{a_i^T x + b_i} \quad \text{s.t.}\;\; a_i^T x + b_i > 0,\ \forall i.$$
• First express it as
$$\min_{x, t_i}\; \sum_i t_i \quad \text{s.t.}\;\; (F_i x + g_i)^T (F_i x + g_i) \le t_i (a_i^T x + b_i), \quad a_i^T x + b_i > 0,\ \forall i.$$
• Then, as an SOCP (the vector case of the hyperbolic constraint):
$$\min_{x, t_i}\; \sum_i t_i \quad \text{s.t.}\;\; \left\| \begin{bmatrix} 2(F_i x + g_i) \\ t_i - (a_i^T x + b_i) \end{bmatrix} \right\| \le t_i + (a_i^T x + b_i), \quad a_i^T x + b_i > 0,\ \forall i.$$
Example 7: Robust Classifiers
• For linearly separable datasets, we can find a hyperplane
$$w^T x + b = 0$$
to separate the two classes. The maximum-margin hyperplane $(w, b)$ can be found by solving
$$\min_{w, b}\; \|w\|^2 \quad \text{s.t.}\;\; y_i (w^T x_i + b) \ge 1,\ \forall i.$$
• When such separation is impossible or there are outliers, we relax the constraints and use a soft margin:
$$\min_{w, b, \xi_i}\; \|w\|^2 + C \sum_i \xi_i \quad \text{s.t.}\;\; y_i (w^T x_i + b) \ge 1 - \xi_i,\ \xi_i \ge 0,\ \forall i,$$
where $C\xi_i$ is the penalty for the margin violation of sample $i$.
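A sketch of the soft-margin problem on synthetic two-class data (the data and $C$ are illustrative assumptions).

```python
import numpy as np
import cvxpy as cp

# Soft-margin SVM on synthetic, roughly separable 2-D data (assumed).
np.random.seed(7)
m, C = 40, 1.0
X = np.vstack([np.random.randn(m // 2, 2) + 2.0,
               np.random.randn(m // 2, 2) - 2.0])
y = np.hstack([np.ones(m // 2), -np.ones(m // 2)])

w, b, xi = cp.Variable(2), cp.Variable(), cp.Variable(m)
cons = [cp.multiply(y, X @ w + b) >= 1 - xi, xi >= 0]
prob = cp.Problem(cp.Minimize(cp.sum_squares(w) + C * cp.sum(xi)), cons)
prob.solve()
```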
Robust Classifiers
• Uncertainty in the observations: for each $x_i$, we only know its mean $\bar x_i$ and covariance $\Sigma_i$, without knowing its distribution, i.e., $x_i \sim (\bar x_i, \Sigma_i)$.
• We want to classify $x_i$ correctly with a high probability $1 - \varepsilon_i$ even for the worst distribution, i.e.,
$$\inf_{x_i \sim (\bar x_i, \Sigma_i)} \mathrm{Prob}(y_i (w^T x_i + b) \ge 1 - \xi_i) \ge 1 - \varepsilon_i.$$
• We have seen a similar result, but for the Gaussian distribution $x_i \sim \mathcal{N}(\bar x_i, \Sigma_i)$; now we want to solve
$$\min_{w, b, \xi}\; \|w\|^2 + C \sum_i \xi_i \quad \text{s.t.}\;\; \inf_{x_i \sim (\bar x_i, \Sigma_i)} \mathrm{Prob}(y_i (w^T x_i + b) \ge 1 - \xi_i) \ge 1 - \varepsilon_i,\ \forall i.$$
Multivariate Chebyshev inequality
• Single variable: if $x \sim (\bar x, \sigma)$ with mean $\bar x$ and variance $\sigma$, then we have
$$\mathrm{Prob}(|x - \bar x| \ge 1) \le \sigma.$$
• Multivariate case, $x_i \sim (\bar x_i, \Sigma_i)$:
$$\sup_{x \sim (\bar x_i, \Sigma_i)} \mathrm{Prob}(x \in S) = (1 + d^2)^{-1},$$
where $S$ is a convex set and $d^2 = \inf_{x \in S} (x - \bar x_i)^T \Sigma_i^{-1} (x - \bar x_i)$.
See: G. R. G. Lanckriet, L. El Ghaoui, C. Bhattacharyya, and M. I. Jordan, "A robust minimax approach to classification," Journal of Machine Learning Research, 3:555–582, 2002.
• Rewrite $\inf_{x_i \sim (\bar x_i, \Sigma_i)} \mathrm{Prob}(y_i (w^T x_i + b) \ge 1 - \xi_i) \ge 1 - \varepsilon_i$ as
$$\sup_{x_i \sim (\bar x_i, \Sigma_i)} \mathrm{Prob}(y_i (w^T x_i + b) \le 1 - \xi_i) \le \varepsilon_i.$$
• By the multivariate Chebyshev inequality,
$$\sup_{x_i \sim (\bar x_i, \Sigma_i)} \mathrm{Prob}(y_i (w^T x_i + b) \le 1 - \xi_i) = (1 + d^2)^{-1},$$
where $d^2 = \inf_{x :\, y_i (w^T x + b) \le 1 - \xi_i} (x - \bar x_i)^T \Sigma_i^{-1} (x - \bar x_i)$.
• Optimization of $d^2$: if $y_i (w^T \bar x_i + b) \le 1 - \xi_i$, then $d^2 = 0$; otherwise
$$d = \frac{y_i (w^T \bar x_i + b) - 1 + \xi_i}{\sqrt{w^T \Sigma_i w}}.$$
• Requiring $(1 + d^2)^{-1} \le \varepsilon_i$, i.e., $d \ge \sqrt{(1 - \varepsilon_i)/\varepsilon_i}$, the robust constraint is equivalent to
$$y_i (w^T \bar x_i + b) \ge 1 - \xi_i + \gamma_i \|\Sigma_i^{1/2} w\|, \quad \text{where } \gamma_i = \sqrt{\frac{1 - \varepsilon_i}{\varepsilon_i}}.$$
To sum up,
$$\min_{w, b, \xi}\; \|w\|^2 + C \sum_i \xi_i \quad \text{s.t.}\;\; \inf_{x_i \sim (\bar x_i, \Sigma_i)} \mathrm{Prob}(y_i (w^T x_i + b) \ge 1 - \xi_i) \ge 1 - \varepsilon_i,\ \forall i,$$
is equivalent to
$$\min_{w, b, \xi}\; \|w\|^2 + C \sum_i \xi_i \quad \text{s.t.}\;\; y_i (w^T \bar x_i + b) \ge 1 - \xi_i + \gamma_i \|\Sigma_i^{1/2} w\|,\ \forall i,$$
whose constraints are second-order cone constraints.
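Putting it together, a sketch of the distributionally robust classifier on synthetic data; the means, the shared covariance factor, and $\varepsilon_i$ are illustrative assumptions.

```python
import numpy as np
import cvxpy as cp

# Distributionally robust soft-margin classifier: each x_i is known only
# through its mean and covariance; gamma_i = sqrt((1 - eps_i) / eps_i).
np.random.seed(8)
m, C, eps_i = 40, 1.0, 0.1
Xbar = np.vstack([np.random.randn(m // 2, 2) + 2.0,
                  np.random.randn(m // 2, 2) - 2.0])
y = np.hstack([np.ones(m // 2), -np.ones(m // 2)])
Sig_half = 0.3 * np.eye(2)                  # shared Sigma_i^{1/2} (assumed)
gamma_i = np.sqrt((1 - eps_i) / eps_i)

w, b, xi = cp.Variable(2), cp.Variable(), cp.Variable(m)
cons = [cp.multiply(y, Xbar @ w + b) >= 1 - xi + gamma_i * cp.norm(Sig_half @ w),
        xi >= 0]
prob = cp.Problem(cp.Minimize(cp.sum_squares(w) + C * cp.sum(xi)), cons)
prob.solve()
```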