Mathematical Optimization in Information Visualization Dolores Romero Morales Copenhagen Business School, Frederiksberg, Denmark BCAM, Bilbao, April 25, 2017 DRM MO in Information Visualization 1 / 48


Page 1: Mathematical Optimization in Information Visualization. Carrizosa (U. Sevilla), V. Guerrero (U. Sevilla), D. Hardt (CBS), B. Mart n-Barrag an (U. Edinburgh), A. Nogales-G omez (Huawei),

Mathematical Optimization in Information Visualization

Dolores Romero Morales

Copenhagen Business School, Frederiksberg, Denmark

BCAM, Bilbao, April 25, 2017

Data Science

Extract and represent knowledge from complex data [Baesens, 2014, Fortunato, 2010, Hand et al., 2001, Provost and Fawcett, 2013]

Mathematical Optimization in Data Science

Mathematical Optimization has contributed significantly to the development of this area [Bertsimas et al., 2016, Bottou et al., 2016, Carrizosa and Romero Morales, 2013, Carrizosa et al., 2017b, Duarte Silva, 2017, Hansen and Jaumard, 1997, Le Thi and Pham Dinh, 2001, Olafsson et al., 2008, Speckmann et al., 2006, Wright, 2016]

Data Science

My interests

Interpretability is desirable [Freitas, 2014, Ridgeway, 2013], and required in, for instance, credit scoring [Baesens et al., 2003] and medical diagnosis [Ustun and Rudin, 2016]

Information Visualization tools are critical to enable analysts to observe and interact with data [Heer et al., 2010, Liu et al., 2014, Marron and Alonso, 2014, Thomas and Wong, 2004]

Close to 20 years of joint work with:

E. Carrizosa (U. Sevilla), V. Guerrero (U. Sevilla), D. Hardt (CBS), B. Martín-Barragán (U. Edinburgh), A. Nogales-Gómez (Huawei), J. Wang (Deutsche Bank)

Today

E. Carrizosa (U. Sevilla), V. Guerrero (U. Sevilla), D. Hardt (CBS)


Outline

1 Introduction

2 Visualizing magnitudes and relations with a convex body

3 Visualizing magnitudes and relations with multiple convex bodies

4 Visualizing frequencies and relations with rectangles

5 Visualizing frequencies and relations with box-connected portions

6 Concluding remarks

1 Introduction

Information Visualization: our framework

$V = \{v_1, v_2, \dots, v_N\}$: set of $N$ individuals

$\omega = (\omega_1, \omega_2, \dots, \omega_N)$: magnitudes

$\delta$: relations, e.g., dissimilarities or adjacencies,

\[
\delta = \begin{pmatrix}
0 & \delta_{12} & \cdots & \delta_{1N} \\
\delta_{21} & 0 & \cdots & \delta_{2N} \\
\vdots & \vdots & \ddots & \vdots \\
\delta_{N1} & \delta_{N2} & \cdots & 0
\end{pmatrix}
\]

$\Omega \subset \mathbb{R}^n$: visualization region

Today

New Mixed Integer NonLinear Programming (MINLP) models and numerical optimization solution approaches to build visualization maps, in which individuals in V are depicted in Ω as objects whose volumes represent the magnitudes ω and which are located according to the relations δ


2 Visualizing magnitudes and relations with a convex body

Information Visualization: our framework

$V = \{v_1, v_2, \dots, v_N\}$: set of $N$ individuals

$\omega = (\omega_1, \omega_2, \dots, \omega_N)$: magnitudes

$\delta$: dissimilarities,

\[
\delta = \begin{pmatrix}
0 & \delta_{12} & \cdots & \delta_{1N} \\
\delta_{21} & 0 & \cdots & \delta_{2N} \\
\vdots & \vdots & \ddots & \vdots \\
\delta_{N1} & \delta_{N2} & \cdots & 0
\end{pmatrix}
\]

$\Omega \subset \mathbb{R}^n$: visualization region

Carrizosa, Guerrero, and Romero Morales [2017b]

New models and solution approaches to build a visualization map, in which individuals in V are depicted as convex bodies in Ω, whose volumes are proportional to their magnitudes ω and which are located according to the dissimilarities δ.

In the literature

MultiDimensional Scaling

Ding and Qi [2016], Hubert et al. [1992], Kruskal [1964], Leung and Lau [2004], Torgerson [1958], Trosset [2002], Trosset and Mathar [1997], Zilinskas and Podlipskyte [2003], Zilinskas and Zilinskas [2009]

[Figure: MultiDimensional Scaling plot at t = 31, with points labelled brus, cbs, dax, dj, ftse, hs, madrid, milan, nikkei, sing, sp, taiwan, vec.]

Problem Statement: the model

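$\Omega$: visualization region, compact subset of $\mathbb{R}^n$,

$B$: reference object, compact convex body in $\mathbb{R}^n$, $0 \in B$, symmetric with respect to $0$,

$v_i \mapsto c_i + \tau r_i B$,

$r_i$: such that $\mathrm{vol}(r_i B) \propto \omega_i$,

$\tau$: a common rescaling for all objects.

[Figure: the region Ω and the reference object B; individuals $v_i$, $v_j$, $v_k$ are mapped to the objects $c_i + \tau r_i B$, $c_j + \tau r_j B$, $c_k + \tau r_k B$ with centers $c_i$, $c_j$, $c_k$.]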
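Because vol(rB) = r^n vol(B) for any scaling factor r > 0, the condition vol(r_i B) ∝ ω_i fixes the scaling factors up to a common constant: r_i ∝ ω_i^(1/n). The sketch below assumes that constant is 1 (the common rescaling τ can absorb it); the function name `scaling_factors` is illustrative and does not come from the slides.

```python
def scaling_factors(omega, n=2):
    """Return r_i = omega_i ** (1/n), so that vol(r_i * B) = r_i**n * vol(B)
    is proportional to omega_i (proportionality constant assumed to be 1)."""
    if any(w < 0 for w in omega):
        raise ValueError("magnitudes must be nonnegative")
    return [w ** (1.0 / n) for w in omega]


# Planar objects (n = 2): areas scale with r**2.
omega = [1.0, 4.0, 9.0]
r = scaling_factors(omega, n=2)
```

With discs in the plane, an individual with four times the magnitude gets twice the radius.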


Problem Statement: distance between objects

Distance function between objects

Let $d$ be a function, defined on pairs of compact convex sets of $\mathbb{R}^n$, which satisfies for any $A_1$, $A_2$:

1. $d \geq 0$ and $d$ is symmetric.

2. $d(A_1, A_2) = d(A_1 + z, A_2 + z)$, $\forall z \in \mathbb{R}^n$.

3. The function $d_z : z \in \mathbb{R}^n \mapsto d(z + A_1, A_2)$ is convex and satisfies, for all $\theta > 0$, that $d_z(\theta A_1, \theta A_2) = \theta\, d_{\frac{1}{\theta} z}(A_1, A_2)$.

Examples

Infimum: $d(A_1, A_2) = \inf\{\|a_1 - a_2\| : a_1 \in A_1,\ a_2 \in A_2\}$

Supremum: $d(A_1, A_2) = \sup\{\|a_1 - a_2\| : a_1 \in A_1,\ a_2 \in A_2\}$

Average: $d(A_1, A_2) = \int \|a_1 - a_2\| \, d\mu(a_1)\, d\nu(a_2)$, with $\mu$, $\nu$ probability distributions with support $A_1$ and $A_2$

Hausdorff: $d(A_1, A_2) = \max\Bigl\{ \sup_{a_1 \in A_1} \inf_{a_2 \in A_2} \|a_1 - a_2\|,\ \sup_{a_2 \in A_2} \inf_{a_1 \in A_1} \|a_1 - a_2\| \Bigr\}$
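For the infimum distance, the case of two discs in the plane has a closed form: the distance between the centers minus the sum of the two radii, truncated at zero when the discs overlap. The sketch below assumes B is the Euclidean unit disc; the name `inf_distance_discs` is hypothetical.

```python
import math


def inf_distance_discs(ci, cj, ri, rj, tau):
    """Infimum distance between the discs ci + tau*ri*B and cj + tau*rj*B,
    where B is assumed to be the Euclidean unit disc."""
    gap = math.dist(ci, cj) - tau * (ri + rj)
    return max(0.0, gap)  # overlapping or touching discs are at distance 0


# Centers 5 apart, radii 1 and 2: the gap between the discs is 2.
d_apart = inf_distance_discs((0.0, 0.0), (3.0, 4.0), 1.0, 2.0, 1.0)
# Centers 1 apart, radii 1 and 1: the discs overlap.
d_overlap = inf_distance_discs((0.0, 0.0), (1.0, 0.0), 1.0, 1.0, 1.0)
```

The function depends on the centers only through their difference, so it is symmetric and invariant under a common translation, as required in properties 1 and 2.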

Problem Statement: objectives

Biobjective optimization problem:

Distances between objects resemble dissimilarities

Objects are spread out in the visualization region Ω

Distances resemble dissimilarities

\[
F_1 : \mathbb{R}^n \times \dots \times \mathbb{R}^n \times \mathbb{R}_+ \times \mathbb{R}_+ \longrightarrow \mathbb{R}_+, \qquad
(c_1, \dots, c_N, \tau, \kappa) \longmapsto \sum_{\substack{i,j=1,\dots,N \\ i \neq j}} \bigl[ d(c_i + \tau r_i B,\ c_j + \tau r_j B) - \kappa \delta_{ij} \bigr]^2,
\]

where $\kappa$ is a common rescaling for all dissimilarities

Spread: separate the objects

\[
F_2 : \mathbb{R}^n \times \dots \times \mathbb{R}^n \times \mathbb{R}_+ \longrightarrow \mathbb{R}_+, \qquad
(c_1, \dots, c_N, \tau) \longmapsto - \sum_{\substack{i,j=1,\dots,N \\ i \neq j}} d^2(c_i + \tau r_i B,\ c_j + \tau r_j B).
\]
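The two objectives are straightforward to evaluate once a distance d is chosen. The sketch below assumes discs in the plane and the infimum distance (center distance minus radii, truncated at zero); the names `inf_dist` and `objectives` are illustrative only.

```python
import math
from itertools import permutations


def inf_dist(ci, cj, ri, rj, tau):
    """Infimum distance between two discs (B assumed to be the unit disc)."""
    return max(0.0, math.dist(ci, cj) - tau * (ri + rj))


def objectives(c, r, delta, tau, kappa):
    """Return (F1, F2), both summed over ordered pairs (i, j) with i != j."""
    f1 = 0.0
    f2 = 0.0
    for i, j in permutations(range(len(c)), 2):
        d = inf_dist(c[i], c[j], r[i], r[j], tau)
        f1 += (d - kappa * delta[i][j]) ** 2  # fit to rescaled dissimilarities
        f2 -= d ** 2                          # more negative = more spread out
    return f1, f2


c = [(0.0, 0.0), (1.0, 0.0)]
r = [0.1, 0.1]
delta = [[0.0, 1.0], [1.0, 0.0]]
f1, f2 = objectives(c, r, delta, tau=1.0, kappa=0.8)
```

In this toy instance the gap between the two discs is 0.8, which equals κδ₁₂, so F1 vanishes (up to rounding) while F2 is −2 · 0.8² = −1.28.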

Problem Statement: Visualization Map problem

The Visualization Map (VizMap) problem

\[
\begin{aligned}
\min_{c_1,\dots,c_N,\tau,\kappa} \quad & F(c_1, \dots, c_N, \tau, \kappa) \\
\text{s.t.} \quad & c_i + \tau r_i B \subseteq \Omega, \quad i = 1, \dots, N, \\
& \tau \in T, \\
& \kappa \in K,
\end{aligned}
\qquad \text{(VizMap)}
\]

where $F = \lambda F_1 + (1 - \lambda) F_2$, $\lambda \in [0, 1]$, and $T, K \subset \mathbb{R}_+$

Difference of Convex (DC) decomposition

DC decomposition

Proposition. The function $F$, where $d$ is the infimum distance, can be expressed as a DC function, $F = u - (u - F)$, where $u$ is a quadratic separable convex function given by

\[
u = \max\{3\lambda - 1, 0\} \cdot \Bigl( \sum_{i=1,\dots,N} 8(N-1)\|c_i\|^2 + \tau^2 \sum_{\substack{i,j=1,\dots,N \\ i \neq j}} \beta_{ij} \Bigr) + 2\lambda\kappa^2 \sum_{\substack{i,j=1,\dots,N \\ i \neq j}} \delta_{ij}^2,
\]

where $\beta_{ij}$ satisfies $\beta_{ij} \geq 2\|r_i b_i - r_j b_j\|^2$ for all $b_i, b_j \in B$.

Difference of Convex Algorithm (DCA)

$\min \{ f(x) = u(x) - v(x) : x \in X \}$

Algorithm DCA scheme [Le Thi and Pham Dinh, 2005]

Input: $x^0 \in \mathbb{R}^n$.
1: $t \leftarrow 0$
2: repeat
3:    $y^t \in \partial v(x^t)$;
4:    $x^{t+1} \in \arg\min \{ u(x) - (v(x^t) + \langle x - x^t, y^t \rangle) : x \in X \}$;
5:    $t \leftarrow t + 1$;
6: until stop condition is met.
Output: $x^t$

[Figure: graphical illustration of the DC decomposition and of the DCA iterates $x^0$ and $x^1$.]
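The DCA scheme can be sketched on a one-dimensional toy problem. The example below is not from the slides: it assumes f(x) = x² − |x| on X = [−1, 1], with u(x) = x² and v(x) = |x|, so that the convex subproblem in line 4 has a closed-form solution. All function names are illustrative.

```python
def dca(x0, subgrad_v, argmin_linearized, max_iter=100, tol=1e-9):
    """Generic DCA loop: linearize v at the current point, then minimize
    the resulting convex surrogate u(x) - <x, y> over X."""
    x = x0
    for _ in range(max_iter):
        y = subgrad_v(x)                # line 3: y in the subdifferential of v at x
        x_new = argmin_linearized(y)    # line 4: convex subproblem
        if abs(x_new - x) <= tol:       # stop condition (an assumption of this sketch)
            return x_new
        x = x_new
    return x


def subgrad_v(x):
    """A subgradient of v(x) = |x| (the value +1 is chosen at x = 0)."""
    return 1.0 if x >= 0 else -1.0


def argmin_linearized(y):
    """Minimize x**2 - y*x over [-1, 1]: the unconstrained minimizer y/2, clipped."""
    return min(1.0, max(-1.0, y / 2.0))


x_right = dca(0.3, subgrad_v, argmin_linearized)
x_left = dca(-0.3, subgrad_v, argmin_linearized)
```

Starting to the right of zero the iterates settle at 0.5, starting to the left at −0.5: DCA converges to a critical point that depends on the starting point, which is why the initial solution matters in practice.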

DCA for (VizMap)

(VizMapRelaxed): optimization problem in line 4 of DCA

$\min \{ u(x) - (v(x^t) + \langle x - x^t, y^t \rangle) : x \in X \}$

(VizMapRelaxed) has the form:

\[
\begin{aligned}
\min_{c_1,\dots,c_N,\tau,\kappa} \quad & \sum_{i=1,\dots,N} M_{c_i}\|c_i\|^2 + M_\kappa \kappa^2 + M_\tau \tau^2 + \sum_{i=1,\dots,N} c_i^\top q_{c_i} + p_\kappa \kappa + p_\tau \tau \\
\text{s.t.} \quad & c_i + \tau r_i B \subseteq \Omega, \quad i = 1, \dots, N, \\
& \tau \in T, \\
& \kappa \in K,
\end{aligned}
\]

which separates as

\[
\min_{\kappa \in K} \bigl\{ M_\kappa \kappa^2 + p_\kappa \kappa \bigr\}
\; + \;
\min_{\substack{c_i + \tau r_i B \subseteq \Omega \\ \tau \in T}} \Bigl\{ \sum_{i=1,\dots,N} \bigl( M_{c_i}\|c_i\|^2 + c_i^\top q_{c_i} \bigr) + M_\tau \tau^2 + p_\tau \tau \Bigr\},
\]

for scalars $M_{c_i}, M_\kappa, M_\tau \geq 0$, vectors $q_{c_i}$ and scalars $p_\kappa$ and $p_\tau$.

First problem: convex problem in one variable.

Second problem: separable in the variables $c_i$ if $\tau$ is fixed. Alternating scheme:

Optimization of $\tau$ for $c_1, \dots, c_N$ fixed.

For fixed $\tau$ and $i$, optimize $c_i$ with (VizMapRelaxedSub)

\[
\begin{aligned}
\min_{c_i} \quad & M_{c_i}\|c_i\|^2 + c_i^\top q_{c_i} \\
\text{s.t.} \quad & c_i \in \Omega - \tau r_i B.
\end{aligned}
\]
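For simple shapes, (VizMapRelaxedSub) can be solved in closed form. The sketch below assumes Ω is the box [0, 1]^n, B is the Euclidean unit ball and M > 0; these are assumptions of the example, not requirements of the model. The object fits in Ω exactly when every coordinate of c_i lies in [τr_i, 1 − τr_i], and the objective separates by coordinate, so the solution is the unconstrained minimizer −q/(2M) clipped to that interval. The name `solve_relaxed_sub` is hypothetical.

```python
def solve_relaxed_sub(M, q, tau, r_i, lower=0.0, upper=1.0):
    """Minimize M*||c||^2 + c^T q subject to c + tau*r_i*B inside [lower, upper]^n,
    where B is assumed to be the Euclidean unit ball and M > 0."""
    if M <= 0:
        raise ValueError("this sketch assumes M > 0")
    margin = tau * r_i                  # radius of the rescaled object
    lo, hi = lower + margin, upper - margin
    if lo > hi:
        raise ValueError("object does not fit in the region")
    # Coordinate-wise: clip the unconstrained minimizer -q_k / (2M) to [lo, hi].
    return [min(hi, max(lo, -qk / (2.0 * M))) for qk in q]


# Object of radius 0.5 * 0.2 = 0.1 inside the unit square.
c_opt = solve_relaxed_sub(M=1.0, q=[-1.0, 3.0], tau=0.5, r_i=0.2)
```

Here the unconstrained minimizer is (0.5, −1.5); the first coordinate is feasible and the second is pushed to the boundary value 0.1.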

DCA for (VizMap)

Algorithm DCA for (VizMap)

Input: $c_1^0, \dots, c_N^0 \in \Omega$, $\kappa^0 \in K$, $\tau^0 \in T$.
1: $s \leftarrow 0$;
2: repeat
3:    $t \leftarrow 0$;
4:    repeat
5:       Compute $M_{c_i^t}$ and $q_{c_i^t}$, $i = 1, \dots, N$;
6:       Compute $c_i^{t+1}$ by solving (VizMapRelaxedSub) for $\tau$ fixed at $\tau^s$, $i = 1, \dots, N$;
7:       $t \leftarrow t + 1$;
8:    until stop condition is met.
9:    Compute $M_{\kappa^s}$ and $p_{\kappa^s}$;
10:   Compute $\kappa^{s+1}$ by solving the first optimization problem in (VizMapRelaxed);
11:   Compute $M_{\tau^s}$ and $p_{\tau^s}$;
12:   Compute $\tau^{s+1}$ by solving the second optimization problem in (VizMapRelaxed) for $c_1, \dots, c_N$ fixed at $c_1^t, \dots, c_N^t$;
13:   $s \leftarrow s + 1$;
14: until stop condition is met.
Output: $c_1^t, \dots, c_N^t$, $\kappa^s$, $\tau^s$

Visualizing financial markets

V: 11 financial markets across Europe and Asia;

ω: importance with respect to the world market portfolio [Flavin et al., 2002];

δ: correlation between markets [Borg and Groenen, 2005];

B: disc centered at the origin with radius equal to one;

Ω = [0, 1]^2;

λ = 0.9.

[Figure: visualization map of the 11 markets (brus, cbs, dax, ftse, hs, madrid, milan, nikkei, sing, taiwan, vec) as discs in Ω = [0, 1]^2; both axes range from 0.00 to 1.00.]

Visualizing a social network

V: 200 musicians;

ω: degree of influence, Dörk et al. [2012];

δ: shortest path in the network;

B: disc centered at the origin with radius equal to one;

Ω = disc centered at the origin with radius equal to one;

λ = 0.9.

[Figure: visualization map of the 200 musicians as discs inside the unit disc Ω; both axes range from −1.0 to 1.0.]

3 Visualizing magnitudes and relations with multiple convex bodies

Information Visualization: our framework

Dynamic data across T snapshots

$V(t)$: individuals in snapshot $t$, with magnitudes $\omega_t$ and dissimilarities $\delta_t$

Carrizosa, Guerrero, and Romero Morales [2017c]

New models and solution approaches to build a dynamic visualization map, in which, for each snapshot t, individuals in V(t) are depicted as convex bodies in Ω, whose volumes are proportional to their magnitudes ω_t and which are located according to the dissimilarities δ_t, and the transitions from snapshot t to t + 1 are smooth.

Multiple Reference Objects

$\Omega$: visualization region, compact subset of $\mathbb{R}^n$,

$\mathcal{B}_i = \{B_i^1, \dots, B_i^{s_i}\}$: a catalogue of reference objects,

$B_i^p$: compact convex body in $\mathbb{R}^n$, $0 \in B_i^p$, symmetric with respect to $0$,

$v_{i,t} \mapsto c_{i,t} + \tau r_{i,t}^p B_i^p$,

$r_{i,t}^p$: such that $\mathrm{vol}(r_{i,t}^p B_i^p) \propto \omega_{i,t}$,

$\tau$: a common rescaling for all objects,

$x_i^p$: 1 if individual $v_i$ is represented by $B_i^p \in \mathcal{B}_i$, 0 otherwise.

Problem Statement: objectives

Triobjective optimization problem:

Distances between objects resemble dissimilarities,

\[
F_1(c_{1,1}, \dots, c_{N,T}, x) = \sum_{t=1}^{T} \sum_{i,j \in V(t)} \sum_{\substack{p=1,\dots,s_i \\ q=1,\dots,s_j}} \bigl[ d(c_{i,t} + \tau r_{i,t}^p B_i^p,\ c_{j,t} + \tau r_{j,t}^q B_j^q) - \kappa \delta_{ij,t} \bigr]^2 x_i^p x_j^q.
\]

Objects are spread out in Ω,

\[
F_2(c_{1,1}, \dots, c_{N,T}, x) = - \sum_{t=1}^{T} \sum_{i,j \in V(t)} \sum_{\substack{p=1,\dots,s_i \\ q=1,\dots,s_j}} d^2(c_{i,t} + \tau r_{i,t}^p B_i^p,\ c_{j,t} + \tau r_{j,t}^q B_j^q)\, x_i^p x_j^q.
\]

Smooth transitions from a snapshot to the next,

\[
F_3(c_{1,1}, \dots, c_{N,T}) = \sum_{t=1}^{T-1} \sum_{i=1,\dots,N} \|c_{i,t} - c_{i,t+1}\|^2.
\]
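The smoothness term F3 only involves the centers, so it is easy to evaluate on its own. The sketch below assumes every individual is present in every snapshot and stores the centers as `c[t][i]`, a layout chosen here for illustration; the name `smoothness` is hypothetical.

```python
def smoothness(c):
    """F3: sum over consecutive snapshots and individuals of the squared
    Euclidean displacement of each center. c[t][i] is the center of
    individual i at snapshot t, given as a tuple of coordinates."""
    total = 0.0
    for now, nxt in zip(c, c[1:]):          # snapshots t and t + 1
        for ci, cn in zip(now, nxt):        # same individual in both snapshots
            total += sum((a - b) ** 2 for a, b in zip(ci, cn))
    return total


centers = [
    [(0.0, 0.0), (1.0, 1.0)],   # snapshot 1
    [(0.0, 1.0), (1.0, 1.0)],   # snapshot 2: individual 0 moves by 1
    [(0.0, 1.0), (3.0, 1.0)],   # snapshot 3: individual 1 moves by 2
]
f3 = smoothness(centers)
```

The two moves contribute 1² and 2², giving F3 = 5; a map in which nothing moves between snapshots has F3 = 0.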

The Dynamic Visualization Map problem

The Dynamic Visualization Map (DyVizMap) problem

\[
\begin{aligned}
\min_{c_{1,1},\dots,c_{N,T},\,x} \quad & F(c_{1,1}, \dots, c_{N,T}, x) \\
\text{s.t.} \quad & \sum_{p=1,\dots,s_i} x_i^p = 1, \quad i = 1, \dots, N, \\
& c_{i,t} + \tau r_{i,t}^p x_i^p B_i^p \subseteq \Omega, \quad i = 1, \dots, N,\ p = 1, \dots, s_i,\ t = 1, \dots, T, \\
& c_{i,t} \in \mathbb{R}^n, \quad i = 1, \dots, N,\ t = 1, \dots, T, \\
& x_i^p \in \{0, 1\}, \quad i = 1, \dots, N,\ p = 1, \dots, s_i,
\end{aligned}
\qquad \text{(DyVizMap)}
\]

where $F = \lambda_1 F_1 + \lambda_2 F_2 + \lambda_3 F_3$, $\lambda_1, \lambda_2, \lambda_3 \in [0, 1]$ and $\lambda_1 + \lambda_2 + \lambda_3 = 1$

(DyVizMap) for fixed x and for fixed c

(DyVizMap) for fixed x

(DyVizMap)$_x$ can be solved with DCA, as for (VizMap).

(DyVizMap) for fixed c

(DyVizMap)$_c$ is a nonconvex 0–1 quadratic optimization problem with assignment constraints.

When there are at most two reference objects, the problem can be rewritten as a convex quadratic 0–1 problem, and solved through a convexification of the objective function, Billionnet and Elloumi [2007].

An alternating scheme for (DyVizMap)

Algorithm Alternating scheme for (DyVizMap)

Input: $x^{ini} = \bigl((x^{ini})_i^p\bigr) \in \{0, 1\}^S$, where $S = \sum_{k=1}^{N} s_k$, and $c^{ini} = (c^{ini}_{1,1}, \dots, c^{ini}_{N,T})$, such that $c^{ini}_{i,t} + \tau r_{i,t}^p (x^{ini})_i^p B_i^p \subseteq \Omega$, $i = 1, \dots, N$, $p = 1, \dots, s_i$, $t = 1, \dots, T$.
1: $c \leftarrow c^{ini}$;
2: $x \leftarrow x^{ini}$;
3: repeat
4:    $c \leftarrow$ solve (DyVizMap)$_x$ with DCA for (VizMap);
5:    $x \leftarrow$ solve (DyVizMap)$_c$;
6: until stop condition is met.
Output: $c = (c_{1,1}, \dots, c_{N,T})$, $x = (x_i^p)$, $i = 1, \dots, N$, $p = 1, \dots, s_i$

Visualizing news topics

Carrizosa, Guerrero, Hardt, Romero Morales, and Valverde Martínez [2016]

V: 203 words across 21 snapshots (1995–2015) around the topic of immigration;

ω: tf-idf;

δ: cosine distance;

$\mathcal{B}_i = \{R_i^1, R_i^2\}$: $R_i^p$ are rectangles parallel to the coordinate axes, $p = 1, 2$;

Ω = [0, 1]^2;

λ1 = 0.7, λ2 = 0.2, λ3 = 0.1.

4 Visualizing frequencies and relations with rectangles

Information Visualization: our framework

$V = \{v_1, v_2, \dots, v_N\}$: set of $N$ individuals

$\omega = (\omega_1, \omega_2, \dots, \omega_N)$: frequencies

$$
\delta = \begin{pmatrix}
0 & \delta_{12} & \cdots & \delta_{1N} \\
\delta_{21} & 0 & \cdots & \delta_{2N} \\
\vdots & \vdots & \ddots & \vdots \\
\delta_{N1} & \delta_{N2} & \cdots & 0
\end{pmatrix}
$$: adjacencies

$\Omega$: visualization region $\subset \mathbb{R}^n$

Carrizosa, Guerrero, and Romero Morales [2015]

New models and solution approaches to build a $(K, L)$-rectangular map, a partition $P = \{P_i\}$ of $\Omega$ in which the individuals $v_i$ are depicted as rectangles $P_i$ whose areas are as close as possible to the frequencies $\omega$, and whose adjacencies (non-adjacencies, respectively) are as close as possible to the adjacencies (non-adjacencies, respectively) in $\delta$.


In the literature

[Figure: four maps of the contiguous U.S. states, labelled by state abbreviation. (a) The U.S. map. (b) Recmap for the U.S., Heilmann et al. [2004]. (c) Grid map for the U.S. built in Eppstein et al. [2015]. (d) Grid map for the U.S. built with the ECPA methodology.]


Problem Statement: decision variables

$x_{rij} = 1$ if cell $(i, j)$ belongs to $P_r$, and $0$ otherwise.

$z_{rs} = 1$ if $P_s$ is adjacent to $P_r$, and $0$ otherwise.

$u^l_{rsij} = 1$ if $P_s$ is adjacent to $P_r$ at cell $(i, j)$ from side $l$, where $l = 1$ (above), $l = 2$ (below), $l = 3$ (right), $l = 4$ (left), and $0$ otherwise.

$\varphi_r, \psi_r \ge 0$.
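To make the meaning of these variables concrete, the following sketch derives the adjacency variables from a given assignment of cells to portions. It is an illustration under stated assumptions, not part of the original model: the function name `adjacency_variables`, the 0-based indices and the encoding `grid[i][j] = r` (meaning that cell $(i, j)$ belongs to $P_r$) are introduced here, and row index $i - 1$ is taken to be "above".

```python
def adjacency_variables(grid):
    """Return (z, u) induced by a cell labelling grid[i][j] = r.

    z holds the pairs (r, s) with z_rs = 1; u holds the tuples
    (l, r, s, i, j) with u^l_rsij = 1.
    """
    K, L = len(grid), len(grid[0])
    # neighbour offsets for l = 1 (above), 2 (below), 3 (right), 4 (left)
    offsets = {1: (-1, 0), 2: (1, 0), 3: (0, 1), 4: (0, -1)}
    z, u = set(), set()
    for i in range(K):
        for j in range(L):
            r = grid[i][j]
            for l, (di, dj) in offsets.items():
                ni, nj = i + di, j + dj
                if 0 <= ni < K and 0 <= nj < L:
                    s = grid[ni][nj]
                    if s != r:
                        u.add((l, r, s, i, j))  # P_s touches P_r at (i, j) from side l
                        z.add((r, s))           # hence P_s is adjacent to P_r
    return z, u


# Three vertical strips on a 2 x 3 grid: portions 0 and 2 never touch.
grid = [[0, 1, 2],
        [0, 1, 2]]
z, u = adjacency_variables(grid)
```

In this example `z` contains only the pairs between neighbouring strips, and each of those adjacencies is witnessed by two cells, one per row.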


Problem Statement: objectives

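Triobjective optimization problem:

Adjacencies are similar: $F_1(x, z, u, \varphi, \psi) = \sum_{\substack{r,s = 1, \dots, N \\ (r,s) \in E}} z_{rs}$

Non-adjacencies are similar: $F_2(x, z, u, \varphi, \psi) = -\sum_{\substack{r,s = 1, \dots, N \\ (r,s) \notin E}} z_{rs}$

Areas are similar to frequencies: $F_3(x, z, u, \varphi, \psi) = -\sum_{r = 1, \dots, N} (\varphi_r + \psi_r)$, where

$$\frac{1}{KL} \sum_{\substack{i = 1, \dots, K \\ j = 1, \dots, L}} x_{rij} - \omega_r = \varphi_r - \psi_r$$

$F = \lambda_1 F_1 + \lambda_2 F_2 + \lambda_3 F_3$, with $\lambda_1, \lambda_2, \lambda_3 \in [0, 1]$ and $\lambda_1 + \lambda_2 + \lambda_3 = 1$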
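The three objectives can be evaluated for a given map as in the sketch below. Several choices are assumptions made for illustration only: the function name `objectives`, the encoding `grid[i][j] = r`, the representation of $E$ as a symmetric set of ordered pairs, reading $F_2$ as a sum over pairs that are not in $E$, and replacing $\varphi_r + \psi_r$ by the absolute area deviation (the value the sum takes when it is penalised).

```python
from itertools import product


def objectives(grid, E, omega, lambdas):
    """Return (F1, F2, F3, F) for a cell labelling grid[i][j] = r."""
    K, L = len(grid), len(grid[0])
    N = len(omega)
    # z_rs = 1 iff some cell of P_r shares a side with some cell of P_s
    z = set()
    for i, j in product(range(K), range(L)):
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < K and nj < L and grid[i][j] != grid[ni][nj]:
                z.add((grid[i][j], grid[ni][nj]))
                z.add((grid[ni][nj], grid[i][j]))
    pairs = [(r, s) for r in range(N) for s in range(N) if r != s]
    F1 = sum(1 for p in pairs if p in E and p in z)       # reproduced adjacencies
    F2 = -sum(1 for p in pairs if p not in E and p in z)  # spurious adjacencies
    area = [sum(row.count(r) for row in grid) / (K * L) for r in range(N)]
    F3 = -sum(abs(area[r] - omega[r]) for r in range(N))  # area deviation
    l1, l2, l3 = lambdas
    return F1, F2, F3, l1 * F1 + l2 * F2 + l3 * F3


# Illustrative data: individual 0 is adjacent to 1 and to 2 in the data.
grid = [[0, 1, 2],
        [0, 1, 2]]
E = {(0, 1), (1, 0), (0, 2), (2, 0)}
omega = [0.5, 0.25, 0.25]
F1, F2, F3, F = objectives(grid, E, omega, (0.7, 0.2, 0.1))
```

Here the map reproduces the adjacency between 0 and 1, misses the one between 0 and 2, and adds a spurious adjacency between 1 and 2, while every portion covers one third of the grid.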


The Rectangular Map (RecVizMap) problem

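$$
\begin{aligned}
\max_{x, z, u, \varphi, \psi}\quad & F(x, z, u, \varphi, \psi) \\
\text{s.t.}\quad & \sum_{r = 1, \dots, N} x_{rij} = 1, && i = 1, \dots, K,\ j = 1, \dots, L, \\
& \sum_{\substack{i = 1, \dots, K \\ j = 1, \dots, L}} x_{rij} \ge 1, && r = 1, \dots, N, \\
& \sum_{\substack{\min\{i, i'\} \le i'' \le \max\{i, i'\} \\ \min\{j, j'\} \le j'' \le \max\{j, j'\}}} x_{r i'' j''} \ge (|i - i'| + 1) \cdot (|j - j'| + 1) \cdot (x_{rij} + x_{r i' j'} - 1), && r = 1, \dots, N,\ i, i' = 1, \dots, K,\ j, j' = 1, \dots, L, \\
& \sum_{\substack{i = 2, \dots, K \\ j = 1, \dots, L}} u^1_{rsij} + \sum_{\substack{i = 1, \dots, K - 1 \\ j = 1, \dots, L}} u^2_{rsij} + \sum_{\substack{i = 1, \dots, K \\ j = 1, \dots, L - 1}} u^3_{rsij} + \sum_{\substack{i = 1, \dots, K \\ j = 2, \dots, L}} u^4_{rsij} \ge z_{rs}, && r, s = 1, \dots, N,\ r \neq s, \\
& \text{... additional constraints ensuring that the variables are well defined ...} \\
& \frac{1}{KL} \sum_{\substack{i = 1, \dots, K \\ j = 1, \dots, L}} x_{rij} - \omega_r = \varphi_r - \psi_r, && r = 1, \dots, N, \\
& x_{rij},\ z_{rs},\ u^l_{rsij} \in \{0, 1\}, && r, s = 1, \dots, N,\ r \neq s,\ i = 1, \dots, K,\ j = 1, \dots, L,\ l = 1, \dots, 4, \\
& \varphi_r,\ \psi_r \ge 0, && r = 1, \dots, N.
\end{aligned}
\tag{RecVizMap}
$$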
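The third constraint is what forces every portion to be a rectangle: whenever two cells belong to $P_r$, all cells of the box they span must belong to $P_r$ as well. A brute-force check of that constraint for a given labelling might look as follows; the function name and the encoding `grid[i][j] = r` are assumptions made for this sketch.

```python
from itertools import product


def satisfies_rectangle_constraint(grid, r):
    """Check the rectangularity constraint of portion r on a labelled grid."""
    K, L = len(grid), len(grid[0])
    cells = [(i, j) for i, j in product(range(K), range(L)) if grid[i][j] == r]
    # Pairs with a cell outside P_r give a non-positive right-hand side,
    # so only pairs of cells inside P_r need to be examined.
    for (i, j), (i2, j2) in product(cells, repeat=2):
        rows = range(min(i, i2), max(i, i2) + 1)
        cols = range(min(j, j2), max(j, j2) + 1)
        covered = sum(grid[a][b] == r for a, b in product(rows, cols))
        if covered < (abs(i - i2) + 1) * (abs(j - j2) + 1):
            return False
    return True


rect_ok = satisfies_rectangle_constraint([[0, 0, 1], [0, 0, 1]], 0)
l_shape_ok = satisfies_rectangle_constraint([[0, 0], [0, 1]], 0)
```

The first portion is a 2 x 2 block and passes; the second is L-shaped, so the box spanned by its two extreme cells contains a cell of another portion and the check fails.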


(RecVizMap) (continued)

For all $r, s = 1, \dots, N$ with $r \neq s$:

$$
\begin{aligned}
x_{rij} + x_{s,i-1,j} &\le z_{rs} + 1, && i = 2, \dots, K,\ j = 1, \dots, L, \\
x_{rij} + x_{s,i+1,j} &\le z_{rs} + 1, && i = 1, \dots, K - 1,\ j = 1, \dots, L, \\
x_{rij} + x_{s,i,j+1} &\le z_{rs} + 1, && i = 1, \dots, K,\ j = 1, \dots, L - 1, \\
x_{rij} + x_{s,i,j-1} &\le z_{rs} + 1, && i = 1, \dots, K,\ j = 2, \dots, L, \\
u^1_{rsij} &\le x_{rij}, && i = 1, \dots, K,\ j = 1, \dots, L, \\
u^1_{rsij} &\le x_{s,i-1,j}, && i = 2, \dots, K,\ j = 1, \dots, L, \\
x_{rij} + x_{s,i-1,j} &\le u^1_{rsij} + 1, && i = 2, \dots, K,\ j = 1, \dots, L, \\
u^2_{rsij} &\le x_{rij}, && i = 1, \dots, K,\ j = 1, \dots, L, \\
u^2_{rsij} &\le x_{s,i+1,j}, && i = 1, \dots, K - 1,\ j = 1, \dots, L, \\
x_{rij} + x_{s,i+1,j} &\le u^2_{rsij} + 1, && i = 1, \dots, K - 1,\ j = 1, \dots, L, \\
u^3_{rsij} &\le x_{rij}, && i = 1, \dots, K,\ j = 1, \dots, L, \\
u^3_{rsij} &\le x_{s,i,j+1}, && i = 1, \dots, K,\ j = 1, \dots, L - 1, \\
x_{rij} + x_{s,i,j+1} &\le u^3_{rsij} + 1, && i = 1, \dots, K,\ j = 1, \dots, L - 1, \\
u^4_{rsij} &\le x_{rij}, && i = 1, \dots, K,\ j = 1, \dots, L, \\
u^4_{rsij} &\le x_{s,i,j-1}, && i = 1, \dots, K,\ j = 2, \dots, L, \\
x_{rij} + x_{s,i,j-1} &\le u^4_{rsij} + 1, && i = 1, \dots, K,\ j = 2, \dots, L.
\end{aligned}
$$


Embedded Cell Perturbing Algorithm for (RecVizMap)

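Algorithm: Embedded Cell Perturbing Algorithm (ECPA)

Input: The number of levels in the hierarchy, $T$. A set of embedded grids $\{(K_t, L_t)\}_{t = 1, \dots, T}$. The set of locating cells arising from locating points obtained with MDS on the $(K_1, L_1)$-grid, $C^{\mathrm{MDS}}_{(K_1, L_1)}$. Perturb and subdivide procedures, perturb(·) and subdivide(·).

1: $\big(C^*_{(K_1, L_1)},\ \mathrm{RecVizMap}_{\lambda, C^*_{(K_1, L_1)}}\big) \leftarrow \mathrm{CPA}\big(C^{\mathrm{MDS}}_{(K_1, L_1)},\ \mathrm{perturb}(\cdot)\big)$
2: for $t \leftarrow 2$ to $T$ do
3:   $C^*_{(K_t, L_t)} \leftarrow \mathrm{subdivide}\big(C^*_{(K_{t-1}, L_{t-1})}\big)$;
4:   $\big(C^*_{(K_t, L_t)},\ \mathrm{RecVizMap}_{\lambda, C^*_{(K_t, L_t)}}\big) \leftarrow \mathrm{CPA}\big(C^*_{(K_{t-1}, L_{t-1})},\ \mathrm{perturb}(\cdot)\big)$;
5: end for

Output: $C^*_{(K_T, L_T)}$, $\mathrm{RecVizMap}_{\lambda, C^*_{(K_T, L_T)}}$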



Visualizing population rates

$V$: 12 provinces of The Netherlands;
$\omega$: (normalized) population, Statistics Netherlands [2013];
$\delta$: geographical adjacencies;
$K = L = 20$;
$\lambda_1 = \frac{1}{|E|}$, $\lambda_2 = \frac{1}{|E|}$, $\lambda_3 = 1$.

[Figure: two maps whose regions are labelled with the province codes GR, FR, DR, NH, FL, OV, ZH, UT, GE, ZE, NB, LI.]




Information Visualization: our framework

$V = \{v_1, v_2, \dots, v_N\}$: set of $N$ individuals

$\omega = (\omega_1, \omega_2, \dots, \omega_N)$: frequencies

$$
\delta = \begin{pmatrix}
0 & \delta_{12} & \cdots & \delta_{1N} \\
\delta_{21} & 0 & \cdots & \delta_{2N} \\
\vdots & \vdots & \ddots & \vdots \\
\delta_{N1} & \delta_{N2} & \cdots & 0
\end{pmatrix}
$$: dissimilarities

$\Omega$: visualization region $\subset \mathbb{R}^n$

Carrizosa, Guerrero, and Romero Morales [2017a]

New models and solution approaches to build a $(K, L)$-Box-Connected map, in which the individuals in $V$ are depicted as box-connected portions in $\Omega$ whose areas are as close as possible to the frequencies $\omega$ and whose distances are as close as possible to $\delta$.


Box-Connectivity

[Figure: four 4 × 4 grids, with rows and columns numbered 1 to 4, illustrating box-connectivity.]


The Box-Connected Map (BoxVizMap) problem

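$$
\begin{aligned}
\min_{x, \kappa}\quad & \sum_{\substack{r,s = 1, \dots, N \\ r \neq s}} \big| \mathrm{distance}(P_r(x), P_s(x)) - \kappa \delta_{rs} \big| \\
\text{s.t.}\quad & \sum_{r = 1, \dots, N} x_{rij} = 1, && i = 1, \dots, K,\ j = 1, \dots, L, \\
& \sum_{\substack{i = 1, \dots, K \\ j = 1, \dots, L}} x_{rij} \ge 1, && r = 1, \dots, N, \\
& x_{rij} \in \{0, 1\}, && r = 1, \dots, N,\ i = 1, \dots, K,\ j = 1, \dots, L, \\
& \kappa \ge 0, \\
& \sum_{\substack{(i'', j'') \in B((i,j),(i',j')) \\ (i'', j'') \neq (i, j) \\ (i'', j'') \neq (i', j')}} x_{r i'' j''} \ge x_{rij} + x_{r i' j'} - 1, && r = 1, \dots, N,\ i, i' = 1, \dots, K,\ j, j' = 1, \dots, L, \\
& && \text{such that cells } (i, j) \text{ and } (i', j') \text{ are non-adjacent}, \\
& \sum_{r = 1, \dots, N} \Bigg| \frac{1}{KL} \sum_{\substack{i = 1, \dots, K \\ j = 1, \dots, L}} x_{rij} - \omega_r \Bigg| \le \alpha.
\end{aligned}
\tag{BoxVizMap}
$$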
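The box-connectivity constraint states that any two non-adjacent cells of $P_r$ must have a third cell of $P_r$ inside the box $B$ they span. The sketch below lists the pairs of cells that violate it for a given labelling. It rests on assumptions that the slide does not spell out: two cells are taken to be adjacent when they share a side, $B((i,j),(i',j'))$ is taken to be the smallest axis-parallel block of cells containing both, and the function name and the encoding `grid[i][j] = r` are introduced here.

```python
from itertools import combinations, product


def box_constraint_violations(grid, r):
    """Return the pairs of cells of portion r that violate box-connectivity."""
    K, L = len(grid), len(grid[0])
    cells = [(i, j) for i, j in product(range(K), range(L)) if grid[i][j] == r]
    violations = []
    for (i, j), (i2, j2) in combinations(cells, 2):
        if abs(i - i2) + abs(j - j2) == 1:
            continue  # cells sharing a side are adjacent: no constraint applies
        rows = range(min(i, i2), max(i, i2) + 1)
        cols = range(min(j, j2), max(j, j2) + 1)
        others = sum(
            grid[a][b] == r
            for a, b in product(rows, cols)
            if (a, b) not in ((i, j), (i2, j2))
        )
        if others < 1:  # no third cell of P_r inside the box
            violations.append(((i, j), (i2, j2)))
    return violations


checkerboard = box_constraint_violations([[0, 1], [1, 0]], 0)
l_shape = box_constraint_violations([[0, 0], [1, 0]], 0)
split_row = box_constraint_violations([[0, 1, 0]], 0)
```

Under these assumptions an L-shaped portion is accepted, which is the extra freedom gained over rectangles, whereas two diagonal cells or two cells separated by another portion are rejected.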


A Large Neighborhood Search (LNS) approach

Algorithm: LNS pseudocode, Pisinger and Ropke [2010]

Input: A feasible solution $x$, an objective function $f$, destroy and repair procedures

1: $x^* \leftarrow x$
2: repeat
3:   $x^t \leftarrow \mathrm{repair}(\mathrm{destroy}(x^*))$;
4:   if $f(x^t) < f(x^*)$ then
5:     $x^* \leftarrow x^t$;
6:   end if
7: until stop condition is met

Output: $x^*$
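The pseudocode translates almost line by line into code. The sketch below applies it to a toy ordering problem rather than to (BoxVizMap): the objective, the destroy and repair procedures, the fixed iteration budget used as the stop condition, and all function names are assumptions chosen only to keep the example self-contained.

```python
import random


def lns(x, f, destroy, repair, iterations=200, seed=0):
    """Large neighborhood search with the accept-if-better rule."""
    rng = random.Random(seed)
    best = x
    for _ in range(iterations):  # stop condition: fixed budget (assumption)
        candidate = repair(destroy(best, rng))
        if f(candidate) < f(best):
            best = candidate
    return best


# Hypothetical toy problem: order numbers to minimise the sum of consecutive gaps.
def f(x):
    return sum(abs(a - b) for a, b in zip(x, x[1:]))


def destroy(x, rng, q=2):
    removed = set(rng.sample(range(len(x)), q))  # positions to free
    kept = [v for k, v in enumerate(x) if k not in removed]
    freed = [x[k] for k in sorted(removed)]
    return kept, freed


def repair(partial):
    kept, freed = partial
    kept = list(kept)
    for v in freed:  # greedy reinsertion at the cheapest position
        pos = min(range(len(kept) + 1), key=lambda p: f(kept[:p] + [v] + kept[p:]))
        kept.insert(pos, v)
    return kept


start = [5, 1, 4, 2, 3, 0]
result = lns(start, f, destroy, repair)
```

Because a candidate is accepted only when it strictly improves the objective, the returned solution is never worse than the starting one, and the repair step keeps every candidate a rearrangement of the same elements.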


Visualizing signal confusions

$V$: 26 letters of the English alphabet;
$\omega$: relative frequency of letters, Lewand [2000];
$\delta$: confusion rate of Morse signals.

[Figure: map whose cells are filled with the letters a to z, one portion per letter, with the legend abcdefghijklmnopqrstuvwxyz.]




Concluding remarks

Summary

New mathematical optimization models to visualize simultaneously magnitudes and relations attached to a set of individuals.

Problems formulated as Mixed Integer Nonlinear Programs and solved with alternating schemes.

Numerical illustrations for datasets of diverse nature.

We can handle medium-large size instances.


References I

B. Baesens. Analytics in a Big Data World: The Essential Guide to Data Science and its Applications. Wiley and SAS Business Series. Wiley, 2014.

B. Baesens, R. Setiono, C. Mues, and J. Vanthienen. Using neural network rule extraction and decision tables for credit-risk evaluation. Management Science, 49(3):312–329, 2003.

D. Bertsimas, A. O’Hair, S. Relyea, and J. Silberholz. An analytics approach to designing combination chemotherapy regimens for cancer. Management Science, 62(5):1511–1531, 2016.

A. Billionnet and S. Elloumi. Using a mixed integer quadratic programming solver for the unconstrained quadratic 0–1 problem. Mathematical Programming, 109(1):55–68, 2007.

I. Borg and P.J.F. Groenen. Modern Multidimensional Scaling: Theory and Applications. Springer Science & Business Media, 2005.

L. Bottou, F.E. Curtis, and J. Nocedal. Optimization methods for large-scale machine learning. arXiv preprint arXiv:1606.04838, 2016.

E. Carrizosa and D. Romero Morales. Supervised classification and mathematical optimization. Computers and Operations Research, 40(1):150–165, 2013.

E. Carrizosa, V. Guerrero, and D. Romero Morales. A multi-objective approach to visualize adjacencies in weighted graphs by rectangular maps. Technical report, IMUS, Sevilla, Spain, 2015.

E. Carrizosa, V. Guerrero, D. Hardt, D. Romero Morales, and J.M. Valverde Martínez. On building dynamic visualization maps for news story topic data. Technical report, Copenhagen Business School, 2016.

E. Carrizosa, V. Guerrero, and D. Romero Morales. Visualizing proportions and dissimilarities by space-filling maps: a large neighborhood search approach. Computers & Operations Research, 78:369–380, 2017a.

E. Carrizosa, V. Guerrero, and D. Romero Morales. Visualizing data as objects by dc (difference of convex) optimization. Forthcoming in Mathematical Programming, Series B, 2017b.

E. Carrizosa, V. Guerrero, and D. Romero Morales. Visualization of complex dynamic datasets by means of mathematical optimization. Technical report, IMUS, Sevilla, Spain, 2017c.

C. Ding and H.-D. Qi. Convex optimization learning of faithful Euclidean distance representations in nonlinear dimensionality reduction. Forthcoming in Mathematical Programming, 2016.

M. Dörk, S. Carpendale, and C. Williamson. Visualizing explicit and implicit relations of complex information spaces. Information Visualization, 11(1):5–21, 2012.


References II

A.P. Duarte Silva. Optimization approaches to supervised classification. European Journal of Operational Research, 261(2):772–788, 2017.

D. Eppstein, M. van Kreveld, B. Speckmann, and F. Staals. Improved grid map layout by point set matching. International Journal of Computational Geometry & Applications, 25(02):101–122, 2015.

T. Flavin, M. Hurley, and F. Rousseau. Explaining stock market correlation: A gravity model approach. The Manchester School, 70:87–106, 2002.

S. Fortunato. Community detection in graphs. Physics Reports, 486(3):75–174, 2010.

A.A. Freitas. Comprehensible classification models: a position paper. ACM SIGKDD Explorations Newsletter, 15(1):1–10, 2014.

D. Hand, H. Mannila, and P. Smyth. Principles of Data Mining. MIT Press, 2001.

P. Hansen and B. Jaumard. Cluster analysis and mathematical programming. Mathematical Programming, 79(1-3):191–215, 1997.

J. Heer, M. Bostock, and V. Ogievetsky. A tour through the visualization zoo. Communications of the ACM, 53:59–67, 2010.

R. Heilmann, D. A. Keim, C. Panse, and M. Sips. Recmap: Rectangular map approximations. In Proceedings of the IEEE Symposium on Information Visualization, pages 33–40. IEEE Computer Society, 2004.

L. Hubert, P. Arabie, and M. Hesson-Mcinnis. Multidimensional scaling in the city-block metric: A combinatorial approach. Journal of Classification, 9(2):211–236, 1992.

J.B. Kruskal. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika, 29(1):1–27, 1964.

H.A. Le Thi and T. Pham Dinh. D.C. Programming Approach to the Multidimensional Scaling Problem. In A. Migdalas, P.M. Pardalos, and P. Värbrand, editors, From Local to Global Optimization, volume 53 of Nonconvex Optimization and Its Applications, pages 231–276. Springer, 2001.

H.A. Le Thi and T. Pham Dinh. The DC (difference of convex functions) programming and DCA revisited with DC models of real world nonconvex optimization problems. Annals of Operations Research, 133(1-4):23–46, 2005.

P.L. Leung and K. Lau. Estimating the city-block two-dimensional scaling model with simulated annealing. European Journal of Operational Research, 158(2):518–524, 2004.

R.E. Lewand. Cryptological Mathematics. The Mathematical Association of America, Washington, D.C., 2000.


References III

S. Liu, W. Cui, Y. Wu, and M. Liu. A survey on information visualization: recent advances and challenges. The Visual Computer, 30(12):1373–1393, 2014.

J.S. Marron and A.M. Alonso. Overview of object oriented data analysis. Biometrical Journal, 56(5):732–753, 2014.

S. Olafsson, X. Li, and S. Wu. Operations research and data mining. European Journal of Operational Research, 187(3):1429–1448, 2008.

D. Pisinger and S. Ropke. Large neighborhood search. In M. Gendreau and J. Y. Potvin, editors, Handbook of Metaheuristics, volume 146, chapter 13, pages 399–419. Springer US, 2010.

F. Provost and T. Fawcett. Data Science for Business: What you need to know about data mining and data-analytic thinking. O’Reilly Media, Inc., 2013.

G. Ridgeway. The pitfalls of prediction. National Institute of Justice Journal, 271:34–40, 2013.

B. Speckmann, M. van Kreveld, and S. Florisson. A linear programming approach to rectangular cartograms. In Proc. 12th International Symposium on Spatial Data Handling, pages 527–546. Springer, 2006.

Statistics Netherlands. Population; gender, age, marital status and region, January 1. www.cbs.nl, 2013. Retrieved on: 2013-10-31.

J. Thomas and P.C. Wong. Visual Analytics. IEEE Computer Graphics and Applications, 24(5):20–21, 2004.

W.S. Torgerson. Theory and Methods of Scaling. Wiley, 1958.

M. Trosset and R. Mathar. On the existence of nonglobal minimizers of the stress criterion for metric multidimensional scaling. In Proceedings of the Statistical Computing Section, pages 158–162. American Statistical Association, 1997.

M.W. Trosset. Extensions of classical multidimensional scaling via variable reduction. Computational Statistics, 17:147–163, 2002.

B. Ustun and C. Rudin. Supersparse linear integer models for optimized medical scoring systems. Machine Learning, 102(3):349–391, 2016.

S.J. Wright. Optimization algorithms for data analysis. Technical report, Optimization Online, 2016. http://www.optimization-online.org/DB_HTML/2016/12/5748.html.

A. Žilinskas and A. Podlipskytė. On multimodality of the sstress criterion for metric multidimensional scaling. Informatica, 14(1):121–130, 2003.

A. Žilinskas and J. Žilinskas. Branch and bound algorithm for multidimensional scaling with city-block metric. Journal of Global Optimization, 43(2):357–372, 2009.
