portfolio optimization for influence spread (www'17)
TRANSCRIPT
Portfolio Optimization
for Influence SpreadNaoto Ohsaka (UTokyo)
Yuichi Yoshida (NII & PFI)
2017/4/6 WWW'17@Perth, Australia
๐๏ผ 0.2 + 0.3 + 0.5a
b
a
c
d
e
Kawarabayashi Large Graph Project
Influence maximization
2
max๐ด: ๐ด =๐
๐[cascade size trigger by ๐ด]
0.8
a
b
d
fe
c
0.6 0.1
0.30.4
0.2 0.5
Is maximizing expectation enough?
Discrete optimization problem under
stochastic models
Find the most influential group from a social networkMotivated by viral marketing [Domingos-Richardson. KDD'01]
Greedy strategy is 1 โ eโ1 โ 63%-approx.[Nemhauser-Wolsey-Fisher. Math. Program.'78]
Due to monotonicity & submodularity[Kempe-Kleinberg-Tardos. KDD'03]
# vertices influenced
[Kempe-Kleinberg-Tardos. KDD'03]
Risk of having small cascades
3
๐ = ๐
High risk
Cascade size
Pro
bab
ilit
y
Low risk
Need a good risk measure for
finding a low-risk strategy
Q. Which is better, or ?
Popular downside risk measuresin finance economics & actuarial science
Value at Risk at ฮฑ (VaRฮฑ) = ฮฑ-percentile
Conditional Value at Risk at ฮฑ (CVaRฮฑ)
โ expectation in the worst ฮฑ-fraction of cases
ฮฑ : significance level (typically 0.01 or 0.05)
4
CVaR is appropriate in practice & theorycoherence, convex/concave, continuous
ฮฑ
Pro
bab
ilit
y
Cascade size
VaRCVaR
๐
Optimizing CVaR
5
Much effort in continuous optimization[Rockafellar-Uryasev. J. Risk'00] [Rockafellar-Uryasev. J. Bank. Financ.'02]
Solve max๐ด: ๐ด =๐
CVaR๐ผ[cascade size triggered by ๐ด]?
NOT submodular
Poly-time approximation is impossibleunder some assumption [Maehara. Oper. Res. Lett.'15]
Our approach is portfolio optimization
6
๐๏ผ 0.2 + 0.3 + 0.5a
b
a
c
d
e
Return = 0.2 โ 5 + 0.3 โ 4 + 0.5 โ 6 = 5.2
Sample
value 0.2
b c
d
e
a
We found poly-time approximation is possible!!
Sample
value 0.3
Sample
value 0.5
54
6
investment asset
Our contributions
โ Formulation
Portfolio optimization for maximizing CVaR
โก Algorithm
Constant additive error in poly-time
โข Experiments
Our portfolio outperforms baselines
7
Related work
[Zhang-Chen-Sun-Wang-Zhang. KDD'14]
โถFind smallest ๐ด s.t. VaR โฅ thld.
[Deng-Du-Jia-Ye. WASA'15]
โถFind ๐ด s.t. (# vertices influenced by ๐ด w.h.p.) โฅ thld.
โถNo approx. guarantee
Robust influence maximization
[Chen-Lin-Tan-Zhao-Zhou. KDD'16] [He-Kempe. KDD'16]
โถModel parameters are noisy or uncertain
โถRobustness is measured by ๐[cascade size]
8These are single set selection problems
Formulation
Diffusion process [Goldenberg-Libai-Muller. Market. Lett.'01]
9
Graph ๐บ = ๐, ๐ธEdge prob. ๐: ๐ธ โ 0,1
uv lives w.p. ๐uv2 ๐ธ outcomes
๐ด influences v
in the diffusion process
๐ด can reach v
in the random graphโ๐๐ด = r.v. for # vertices reachable from ๐ด in the random graph
a
cd
b0.5
0.50.5
0.5
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
Formulation
๐-vertex portfolio, ๐ โ and CVaR๐ผ โ
10
๐๏ผ ๐๐
-dim. vectors.t. ๐ 1 = 1
E.g., ๐ = 1, ๐a = ๐b = 0.5
๐ ๐a๐a + ๐b๐b = 2.25
CVaR0.25 ๐a๐a + ๐b๐b = 1.25 worst 0.25-fraction
a
cd
b0.5
0.50.5
0.5
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
buv lives w.p. ๐uv2 ๐ธ outcomes
Dist. of ๐, ๐ = ๐a๐a + ๐b๐b
Formulation
๐-vertex portfolio, ๐ โ and CVaR๐ผ โ
11
๐๏ผ ๐๐
-dim. vectors.t. ๐ 1 = 1
E.g., ๐ = 1, ๐a = ๐b = 0.5
๐ ๐a๐a + ๐b๐b = 2.25
CVaR0.25 ๐a๐a + ๐b๐b = 1.25 worst 0.25-fraction
a
cd
b0.5
0.50.5
0.5
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
b
a
cd
buv lives w.p. ๐uv2 ๐ธ outcomes
Dist. of ๐, ๐ = ๐a๐a + ๐b๐b
0.5โ 3 + 0.5โ 2 = 2.5
a
cd
b
๐a = 3
a
cd
b
๐b = 2
1 1.5 1 2
1.5 2.5 1.5 3.5
1.5 2 2 3
2 2.5 2.5 3.5
Formulation
๐-vertex portfolio, ๐ โ and CVaR๐ผ โ
12
๐๏ผ ๐๐
-dim. vectors.t. ๐ 1 = 1
E.g., ๐ = 1, ๐a = ๐b = 0.5
๐ ๐a๐a + ๐b๐b = 2.25
CVaR0.25 ๐a๐a + ๐b๐b = 1.25 worst 0.25-fraction
a
cd
b0.5
0.50.5
0.5
uv lives w.p. ๐uv2 ๐ธ outcomes
Dist. of ๐, ๐ = ๐a๐a + ๐b๐b
Formulation
Our problem definition
13
max๐
CVaR๐ผ โจ๐, ๐โฉ
๐ด: ๐ด =๐
๐๐ด๐๐ด
โ ๐ & ๐ are ๐๐
-dimensionalExisting methods cannot be applied
# vertices reachable from ๐ดin the random graph
Input : ๐บ = (๐, ๐ธ), ๐, integer ๐, significance level ๐ผ
Output : ๐-vertex portfolio ๐
Our algorithm
Main idea
14
CVaR optimization โ BIG linear programming
Multiplicative weights algorithm [Arora-Hazan-Kale. '12]
Used in optimization, machine learning, game theory, โฆ
E.g., our group's application [Hatano-Yoshida. AAAI'15]
Requires an oracle for
a convex combination of the constraints
Standard approximation
Greedy strategy solves approximately!
โ
Our algorithm
Standard approximation as a first step
max๐,๐
๐ โ1
๐ผ๐
1โค๐โค๐
max ๐ โ ๐, ๐๐ , 0
Sampling ๐1, โฆ , ๐๐ from the dist. of ๐
Solve the feasibility problem โ โฅ ๐พ?โ
to perform the bisection search on ๐พ
max๐
CVaR๐ผ ๐, ๐
Write CVaR as an optimization problem[Rockafellar-Uryasev. J. Risk'00]
15
Our algorithm
Difficulty of checking โ โฅ ๐พ?โ
16
โ? ๐ฑ โ ๐๐พ ๐๐ฑ โฅ ๐
auxiliary variable required for expressing CVaR
auxiliary variables for removing max functions
๐-vertex portfolio๐ฑ =
๐๐ฒ๐
โ Multiple submodular functions
exceed a threshold simultaneously โ
โ โฅ ๐พ?โ ๏ผ Feasibility of a BIG linear programming
# variables โ ๐๐
# constraints = ๐
โ
HARD to satisfy!!
Our algorithm
Our solution for checking the BIG LP
Multiplicative weights algorithm [Arora-Hazan-Kale. '12]
โ Solve the convex combination โก for ๐ฉ = ๐ฉ1, โฆ , ๐ฉ๐โก The average solution โ A solution of โก
17
โ? ๐ฑ โ ๐๐พ ๐ฉ, ๐๐ฑ โฅ ๐ฉ, ๐
โ A single submodular function
exceeds a threshold โ
We can assume ๐ is sparse โ Greedy works!!Constant additive error โeโ1 in poly-time
Still looks difficult, but โฆ
โ
Our algorithm
Summary : Time & Quality
18
Exact CVaR optimization
Empirical CVaR optimization
BIG linear programming
Convex combination Greedy works!
Multiplicative weights!
Error is bounded
Bisection search
โ
CVaR๐ผ ๐, ๐ โฅ max๐โ
CVaRฮฑ ๐โ, ๐ โ ๐ eโ1 โ ๐ ๐
O ๐โ6๐2 ๐ ๐ธ time O ๐ = O ๐ log๐ ๐
additive erroroptimum value
Experiments
Settings
โถ Test data (see our paper for results on other data)
Physicians friend network ( ๐ = 117 & ๐ธ = 542)from [Koblenz Network Collection]
๐uv = outโdegree of u โ1
โถ Parameter : ๐ = 0.4
โถ Baselines : produce a single vertex set
19
Greedy a standard greedy algorithm for influence maximization
Degree select ๐ vertices in degree decreasing order
Random select ๐ vertices uniformly at random
โถ Environment : Intel Xeon E5-2670 2.60GHz CPU, 512GB RAM
1/3 1/3
1/3
1/4 1/4
1/4
1/4
Experiments
Results for ๐ = 10 & ๐ผ = 0.01
๐ด ๐๐ด25,42,67,81,85,94,103,106,111,112 3/47
7,20,21,48,75,98,104,111,112,113 2/47
0,29,43,52,92,97,107,108,113,116 2/47
25,38,69,71,81,103,105,110,112,116 2/47
โฎ โฎ
๐ is extremely sparse!!
(# non-zeros in ๐) = 42
dim. of ๐ = ๐10
โ 1014
CVaR at ๐ถ Expectation Runtime
This work 23.6 37.7 157.2s
Greedy 16.9 38.2 0.1s
Degree 14.2 30.8 0.2ms
Random 15.1 31.0 0.2ms
Comparable
Concentrated
Conclusion
Future studyโ : Speed-up
O ๐โ6๐2 ๐ ๐ธ time is still expensive
Can we solve the BIG LP in a different way?
Future studyโ ก: Exact computation of CVaR
Is it possible to extend [Maehara-Suzuki-Ishihata. WWW'17] ?
Future studyโ ข: Other risk measures
Value at Risk, Lower partial moment, โฆ21
Succeed to obtain a low-risk strategy
by a portfolio optimization approach