some comments on gd and igd and relations to the …taemo.gforge.inria.fr/2010/slides/04_lara.pdf1...

1 O. Schütze Some Comments on GD and IGD and Relations to the Hausdorff Distance O. Schütze, X. Esquivel, A. Lara, C. Coello CINVESTAV-IPN Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional. Mexico City, Mexico


Some Comments on GD and IGD and Relations to the Hausdorff Distance

O. Schütze, X. Esquivel, A. Lara, C. Coello

CINVESTAV-IPN

Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional

Mexico City, Mexico


Outline

Introduction and Background
• Trade off for the design of indicators for the evaluation of MOEAs
• Metric / Hausdorff distance

Investigation of the Indicators
• GD
• IGD

A ‘New’ Indicator
• Metric properties
• Extension to continuous models


Multi-Objective Optimization

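Multi-Objective Optimization Problem

\[
\min F = \begin{cases}
f_1 : Q \subset \mathbb{R}^n \to \mathbb{R} \\
\quad\vdots \\
f_k : Q \subset \mathbb{R}^n \to \mathbb{R}
\end{cases}
\qquad \text{(MOP)}
\]

P_Q = set of optimal solutions (Pareto set)

F(P_Q) = the image of P_Q (Pareto front)

[Figure: Pareto set (f1, f2 plotted over x) and Pareto front (f1 vs. f2)]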

First we consider discrete (or discretized) models, i.e., |Q|<∞.


Outliers in Stochastic Search Algorithms

Example: Consider the MOP

\[
F : [0,1]^n \to \mathbb{R}^k, \qquad F(x) = \begin{pmatrix} x_1 \\ g(x) \end{pmatrix},
\]

where g: [0,1]^n → R^{k-1} (→ Okabe, ZDT).

Assume a point x = (ε, z), z ∈ [0,1]^{n-1}, is a member of the archive/population. Further, assume that new candidate solutions are chosen uniformly at random from the domain.

Then the probability of finding a point that dominates x is less than ε (→ objective 1: a dominating point y must satisfy y_1 ≤ ε). The distance of x to P_Q can be ‘large’.
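The bound can be checked empirically. The following is a minimal sketch and not part of the original slides: it assumes k=2, uses a ZDT1-like second objective as a stand-in for g (the slide does not fix g), and the names `F`, `dominates` and `estimate` are hypothetical.

```python
import math
import random

def F(x):
    """Objective map F(x) = (x_1, g(x)); g is a ZDT1-like stand-in (assumption)."""
    h = 1.0 + 9.0 * sum(x[1:]) / (len(x) - 1)
    return (x[0], h * (1.0 - math.sqrt(x[0] / h)))

def dominates(fa, fb):
    """True if objective vector fa Pareto-dominates fb (minimization)."""
    return all(a <= b for a, b in zip(fa, fb)) and any(a < b for a, b in zip(fa, fb))

def estimate(eps, n=5, samples=20000, seed=0):
    """Return the fraction of uniform samples y with y_1 <= eps and the
    fraction of samples that dominate the archive member x = (eps, z)."""
    rng = random.Random(seed)
    x = [eps] + [0.5] * (n - 1)      # z = (0.5, ..., 0.5): far from the Pareto set of this stand-in
    fx = F(x)
    first_ok = dominating = 0
    for _ in range(samples):
        y = [rng.random() for _ in range(n)]
        first_ok += y[0] <= eps
        dominating += dominates(F(y), fx)
    return first_ok / samples, dominating / samples

frac_first, frac_dom = estimate(eps=0.05)
print(f"P(y_1 <= eps) ~ {frac_first:.4f}, P(y dominates x) ~ {frac_dom:.4f}")
```

Every dominating sample must also satisfy y_1 ≤ ε, so the second fraction can never exceed the first, which itself is close to ε.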


Example

P hypothetical Pareto front

X1 perfect approximation of P, except one outlier

X2 none of the elements is ‘near’ to P

Question: Which approximation is ‘better’?

Extreme situations:

-- pessimistic view (Hausdorff distance): dH(X1,P)=9, dH(X2,P)=2.83

-- averaged result (Generational distance): GD(X1,P)=0.81, GD(X2,P)=2.83
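The two views can be compared on concrete data. The sets below are hypothetical (the slide's actual point sets are not given in the transcript); they are chosen so that X1 has a single outlier at distance 9 and every point of X2 lies 2√2 from P. GD is taken here with p=1 (plain mean), which is an assumption.

```python
import math

def dist_point(u, A):
    """dist(u, A): smallest Euclidean distance from point u to the finite set A."""
    return min(math.dist(u, v) for v in A)

def hausdorff(A, B):
    """d_H(A, B) = max(dist(A, B), dist(B, A)) for finite sets."""
    return max(max(dist_point(a, B) for a in A),
               max(dist_point(b, A) for b in B))

def gd_mean(X, P):
    """Generational distance with p = 1: mean distance from the points of X to P."""
    return sum(dist_point(x, P) for x in X) / len(X)

# Hypothetical front: 11 points on the line f1 + f2 = 10.
P = [(float(i), float(10 - i)) for i in range(11)]
s = 9.0 / math.sqrt(2.0)
# X1: copy of P, except that (5, 5) is pushed 9 units away from the front.
X1 = [p for p in P if p != (5.0, 5.0)] + [(5.0 + s, 5.0 + s)]
# X2: every point shifted by (2, 2), i.e. 2*sqrt(2) away from the front.
X2 = [(a + 2.0, b + 2.0) for a, b in P]

for name, X in (("X1", X1), ("X2", X2)):
    print(name, "d_H =", round(hausdorff(X, P), 2), "GD =", round(gd_mean(X, P), 2))
```

Under these assumptions the Hausdorff distance ranks X2 ahead of X1, while the averaged value ranks X1 ahead of X2.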


Outlier Trade Off

Trade off for the indicator D when measuring results of MOEAs (the design of MOEAs is influenced by D):

Use of a Metric

+ Greedy search = shortest path to the set of interest (→ triangle inequality)

-- Penalization of single outliers of the candidate set

Averaging the Results

+ Single outliers do not have a major influence on the result

-- The greedy search is not necessarily the shortest path to the set of interest


Metric

Definition: Suppose X is a set and d: X×X → R is a function. Then d is called a metric on X if, and only if, for each a,b,c ∈ X:

\[
\begin{aligned}
&\text{(a)} \quad d(a,b) \ge 0 \ \text{and} \ d(a,b) = 0 \Leftrightarrow a = b && \text{(Positive Property)} \\
&\text{(b)} \quad d(a,b) = d(b,a) && \text{(Symmetric Property)} \\
&\text{(c)} \quad d(a,c) \le d(a,b) + d(b,c) && \text{(Triangle Inequality)}
\end{aligned}
\]

Variants:

-- d is called a semi-metric if properties (a) and (b) are satisfied

-- A pseudo-metric is a semi-metric that satisfies the relaxed triangle inequality:

\[
d(a,c) \le \sigma \, \bigl( d(a,b) + d(b,c) \bigr), \qquad \sigma \ge 1
\]


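Hausdorff Distance

Definition: Let u,v ∈ R^n, A,B ⊂ R^n, and ||.|| be a vector norm. The Hausdorff distance d_H is defined as follows:

\[
\begin{aligned}
&\text{(a)} \quad \operatorname{dist}(u,A) := \inf_{v \in A} \|u - v\| \\
&\text{(b)} \quad \operatorname{dist}(B,A) := \sup_{u \in B} \operatorname{dist}(u,A) \\
&\text{(c)} \quad d_H(A,B) := \max\bigl( \operatorname{dist}(A,B), \operatorname{dist}(B,A) \bigr)
\end{aligned}
\]

[Figure: a point u and the sets A and B]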

Remarks:

(i) dist(A,B) is not symmetric: if B is a proper subset of A, then dist(B,A)=0 and dist(A,B)>0.

(ii) d_H is a metric on the set of discrete sets. It can also be used for continuous spaces. In that case, d_H(A,B)=0 ⇔ clos(A)=clos(B).
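Remark (i) can be seen on a tiny example. This is an illustrative sketch for finite sets with the Euclidean norm (so inf and sup become min and max); the sets and the names `dist_set` and `hausdorff` are made up for the illustration.

```python
import math

def dist_point(u, A):
    """(a) dist(u, A) = min over v in A of ||u - v||."""
    return min(math.dist(u, v) for v in A)

def dist_set(B, A):
    """(b) dist(B, A) = max over u in B of dist(u, A); not symmetric in general."""
    return max(dist_point(u, A) for u in B)

def hausdorff(A, B):
    """(c) d_H(A, B) = max(dist(A, B), dist(B, A))."""
    return max(dist_set(A, B), dist_set(B, A))

A = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
B = A[:2]                                   # B is a proper subset of A

print("dist(B, A) =", dist_set(B, A))       # every point of B is also in A
print("dist(A, B) =", dist_set(A, B))       # (4, 0) has no close partner in B
print("d_H(A, B)  =", hausdorff(A, B))
```

Taking the maximum of the two one-sided values is what restores symmetry in d_H.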


Discussion of GD (1)

GD as proposed by Van Veldhuizen, applied to general finite sets X, Y ⊂ R^k using dist:

\[
GD(X,Y) = \frac{1}{|X|} \left( \sum_{i=1}^{|X|} \operatorname{dist}(x_i, Y)^p \right)^{1/p}
\]

Metric properties:

-- positive property: NO, since GD(X,Y)=0 ⇔ X⊂Y (X can be a proper subset of Y (*))

-- symmetric property: NO, since in case (*) GD(X,Y)=0 but GD(Y,X)>0

-- triangle inequality: NO (→ next slide)


Discussion of GD (2)

1.) Normalization strategy of GD: Let A_1={a} with dist(F(a),F(P_Q))=1, i.e., GD(F(A_1),F(P_Q))=1.

Now let A_n be the multiset consisting of n copies of a, A_n={a,…,a}, then

\[
GD(F(A_n),F(P_Q)) = \frac{\|(1,\dots,1)^T\|_p}{n} = \frac{\sqrt[p]{n}}{n} \to 0 \quad \text{for } n \to \infty \ (p > 1).
\]

2.) Investigate the (relaxed) triangle inequality: let X,Z ⊂ R^k s.t. GD(X,Z)>0. Let

rhs(Y) := GD(X,Y)+GD(Y,Z)

and define Y_n := X ∪ {y_1,y_2,…,y_n}

such that Σ_i dist(y_i,Z) < ∞. Then GD(X,Y_n)=0 and GD(Y_n,Z) → 0 for n → ∞ ⇒ GD does not satisfy any relaxed triangle inequality since rhs(Y_n) → 0.

Note: for p>1, any set {y_1,..,y_n} ⊂ F(Q) (if compact) can be taken!!
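Both effects can be reproduced numerically. The sketch below is an illustration under assumed data, not the authors' experiment: the front is a single point, p=2, and the points y_i are chosen with dist(y_i, Z) = 2^-i so that their distances are summable.

```python
import math

def dist_point(u, A):
    """dist(u, A): smallest Euclidean distance from u to the finite set A."""
    return min(math.dist(u, v) for v in A)

def gd(X, Y, p=2):
    """Original GD: (1/|X|) * (sum_i dist(x_i, Y)^p)^(1/p)."""
    return sum(dist_point(x, Y) ** p for x in X) ** (1.0 / p) / len(X)

# 1.) Normalization: n copies of a point at distance 1 from the front.
front = [(0.0, 0.0)]                 # stand-in for F(P_Q)
a = (1.0, 0.0)
for n in (1, 10, 100, 1000):
    print(n, gd([a] * n, front))     # n**(1/p) / n, shrinks as n grows for p > 1

# 2.) Triangle inequality: Y_n is X united with n points that approach Z.
X, Z = [(0.0, 1.0)], [(0.0, 0.0)]

def rhs(n, p=2):
    """rhs(Y_n) = GD(X, Y_n) + GD(Y_n, Z) for the Y_n described above."""
    Y_n = X + [(0.0, -2.0 ** -i) for i in range(1, n + 1)]   # dist(y_i, Z) = 2^-i
    return gd(X, Y_n, p) + gd(Y_n, Z, p)

for n in (1, 10, 100):
    print(n, gd(X, Z), rhs(n))       # left side stays fixed, right side shrinks
```

Since GD(X,Z) stays fixed while rhs(Y_n) shrinks with n, no constant σ can make σ·rhs(Y_n) bound GD(X,Z) for all n.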


New Variant of GD

Straightforward modification: take the power mean of the distances:

\[
GD_p(X,Y) = \left( \frac{1}{|X|} \sum_{i=1}^{|X|} \operatorname{dist}(x_i, Y)^p \right)^{1/p}
= \frac{1}{\sqrt[p]{|X|}} \left( \sum_{i=1}^{|X|} \operatorname{dist}(x_i, Y)^p \right)^{1/p}
\]

-- same (poor) metric properties, but

-- better averaging: GD_p(F(A_n),F(P_Q))=1 for all n∈N

-- (needed for the upcoming indicator)
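As a quick check of the averaging claim (a worked instance added for illustration), take the multiset A_n of n copies of a point a with dist(F(a),F(P_Q))=1 and compare the power-mean variant with the original normalization:

\[
GD_p(F(A_n),F(P_Q)) = \left( \frac{1}{n} \sum_{i=1}^{n} 1^p \right)^{1/p} = 1,
\qquad \text{whereas} \qquad
GD(F(A_n),F(P_Q)) = \frac{1}{n} \left( \sum_{i=1}^{n} 1^p \right)^{1/p} = n^{1/p - 1}.
\]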


Discussion IGD

IGD as proposed by Coello & Cruz, applied to general finite sets X, Y ⊂ R^k using dist:

\[
IGD(X,Y) = \frac{1}{|Y|} \left( \sum_{i=1}^{|Y|} \operatorname{dist}(y_i, X)^p \right)^{1/p}
\]

-- same metric properties as GD since IGD(A,B) = GD(B,A)

-- same modification: take the power mean of the distances:

\[
IGD_p(X,Y) = \left( \frac{1}{|Y|} \sum_{i=1}^{|Y|} \operatorname{dist}(y_i, X)^p \right)^{1/p}
= \frac{1}{\sqrt[p]{|Y|}} \left( \sum_{i=1}^{|Y|} \operatorname{dist}(y_i, X)^p \right)^{1/p}
\]


A “New” Indicator

Observation: GD(X,Y) is an ‘averaged version’ of dist(X,Y), same for IGD ⇒ combine GD and IGD as for d_H:

\[
\Delta_p(X,Y) = \max\bigl( GD_p(X,Y), \, IGD_p(X,Y) \bigr)
\]

Proposition 1: ∆p is a semi-metric for 1≤p<∞ and a metric for p=∞

Remark: for p=∞ the indicator ∆p coincides with d_H

Proposition 2: let |X|,|Y|,|Z|≤N, then

\[
\Delta_p(X,Z) \le \sqrt[p]{N} \, \bigl( \Delta_p(X,Y) + \Delta_p(Y,Z) \bigr)
\]
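A direct implementation of the combined indicator for finite sets might look as follows. This is a minimal sketch with the Euclidean norm; the function names and the two small test sets are hypothetical, and p=∞ is handled as the limit of the power mean (the maximum).

```python
import math

def dist_point(u, A):
    """dist(u, A): smallest Euclidean distance from u to the finite set A."""
    return min(math.dist(u, v) for v in A)

def gd_p(X, Y, p):
    """GD_p(X, Y): power mean (exponent p) of the distances dist(x_i, Y)."""
    d = [dist_point(x, Y) for x in X]
    if math.isinf(p):
        return max(d)                      # limit of the power mean for p -> infinity
    return (sum(t ** p for t in d) / len(d)) ** (1.0 / p)

def igd_p(X, Y, p):
    """IGD_p(X, Y) = GD_p(Y, X)."""
    return gd_p(Y, X, p)

def delta_p(X, Y, p):
    """Delta_p(X, Y) = max(GD_p(X, Y), IGD_p(X, Y))."""
    return max(gd_p(X, Y, p), igd_p(X, Y, p))

X = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
Y = [(0.0, 1.0), (1.0, 1.0), (5.0, 1.0)]   # (5, 1) plays the role of an outlier

for p in (1, 2, 10, math.inf):
    print("p =", p, " Delta_p =", round(delta_p(X, Y, p), 4))
```

The value grows with p and reaches the Hausdorff distance at p=∞, which is how p steers the weight given to the outlier.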


Interpretation of p for the Trade Off

Table: Percentage of triangle violations (σ=1) for different values of p. Here, we have taken 100,000 different sets A,B,C with |A|,|B|,|C|=N, k=2, each entry randomly chosen within [0,1].

         p=1     p=2     p=5     p=10    p=∞
N=2      0.541   0.15    0.026   0.008   0
N=4      0.249   0.06    0.019   0.009   0
N=6      0.105   0.033   0.008   0.003   0
N=10     0.02    0.004   0.002   0.001   0
N=100    0       0       0       0       0

The larger the value of p, the ‘nearer’ Δp is to a metric (but: how to choose p? what is the influence of N?)
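An experiment of this kind can be sketched as below. It is an illustration only: the sampling details, the tested ordering Δp(A,C) ≤ Δp(A,B)+Δp(B,C), the floating-point tolerance and the smaller number of trials are assumptions, so the output is not expected to reproduce the table entries exactly.

```python
import math
import random

def power_mean_dist(X, Y, p):
    """Power mean (exponent p) of the distances from each x in X to the set Y."""
    d = [min(math.dist(x, y) for y in Y) for x in X]
    if math.isinf(p):
        return max(d)
    return (sum(t ** p for t in d) / len(d)) ** (1.0 / p)

def delta_p(X, Y, p):
    """Delta_p(X, Y) = max(GD_p(X, Y), IGD_p(X, Y))."""
    return max(power_mean_dist(X, Y, p), power_mean_dist(Y, X, p))

def violation_rate(N, p, trials=2000, k=2, seed=0):
    """Percentage of random triples (A, B, C) that violate the triangle inequality."""
    rng = random.Random(seed)

    def rand_set():
        return [tuple(rng.random() for _ in range(k)) for _ in range(N)]

    bad = 0
    for _ in range(trials):
        A, B, C = rand_set(), rand_set(), rand_set()
        # small tolerance so that rounding errors are not counted as violations
        if delta_p(A, C, p) > delta_p(A, B, p) + delta_p(B, C, p) + 1e-12:
            bad += 1
    return 100.0 * bad / trials

for p in (1, 2, 5, 10, math.inf):
    print("N = 2, p =", p, "->", violation_rate(2, p), "%")
```

For p=∞ the indicator is the Hausdorff distance, so no violations can occur; for finite p the rate depends on the sampling assumptions above.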


Example

P hypothetical Pareto front

X1 perfect approximation of P, except one outlier

X2 none of the elements is ‘near’ to P

Question: Which approximation is ‘better’?

            p=1      p=2     p=5     p=10    p=∞
∆p(P,X1)    0.8182   2.714   4.047   5.571   9
∆p(P,X2)    2.828    2.828   2.828   2.828   2.828


Extension to Continuous Models

Now consider continuous models

[Figure: Pareto front in objective space (f1 vs. f2) with bounds m1, M1 and m2, M2, parametrized by γ: [m1, M1] → R^2]

In general: k objectives ⇒ P_Q is (k-1)-dimensional

GD_p: A finite, P_Q compact ⇒ GD turns into a continuous SOP

IGD_p: P_Q continuous ⇒ the power mean of IGD_p turns into an integral.

Example: k=2, F(P_Q) connected, then

\[
IGD_p(F(A), F(P_Q)) = \left( \frac{1}{M_1 - m_1} \int_{m_1}^{M_1} \operatorname{dist}(\gamma(t), F(A))^p \, dt \right)^{1/p}
\]
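For k=2 the integral form, IGD_p(F(A),F(P_Q)) = (1/(M1−m1) ∫ dist(γ(t),F(A))^p dt)^(1/p) over [m1, M1], can be approximated by numerical quadrature. The sketch below uses a plain midpoint rule; the curve γ(t) = (t, 1 − √t) (the ZDT1 front) and the archive image `FA` are assumed examples, and `igd_p_continuous` is a hypothetical name.

```python
import math

def dist_point(u, A):
    """dist(u, A): smallest Euclidean distance from u to the finite set A."""
    return min(math.dist(u, v) for v in A)

def igd_p_continuous(gamma, m1, M1, FA, p=2, steps=2000):
    """Midpoint-rule approximation of
    (1/(M1 - m1) * integral over [m1, M1] of dist(gamma(t), FA)^p dt)^(1/p)."""
    h = (M1 - m1) / steps
    total = sum(dist_point(gamma(m1 + (j + 0.5) * h), FA) ** p for j in range(steps))
    return (total * h / (M1 - m1)) ** (1.0 / p)

def gamma(t):
    """Assumed parametrization of a connected front: ZDT1, f2 = 1 - sqrt(f1)."""
    return (t, 1.0 - math.sqrt(t))

# Hypothetical archive image: 11 points placed on the front itself.
FA = [gamma(j / 10) for j in range(11)]
print(igd_p_continuous(gamma, 0.0, 1.0, FA, p=2))
```

Even though every archive point lies on the front, the value is positive because the finite archive leaves gaps along the curve.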


Discretization of F(PQ)

Task: PQ given analytically, compute an approximation Y of F(PQ) with dH(Y,F(PQ))<δ (a priori defined approximation quality)

For k=2: use continuation-like methods: select the step size t such that ||F(x+tv)−F(x)||∞ ≈ Θδ, where Θ<1 is a safety factor (selection of t based on Lipschitz estimations)
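The step-size rule can be sketched on a toy problem. This is a simplified illustration, not the authors' method: it assumes a one-dimensional Pareto set [0,1] with direction v=1 and F(x) = (x, 1 − √x), and it finds t by bisection instead of Lipschitz estimates. The names `sup_norm_step` and `discretize` are hypothetical.

```python
import math

def F(x):
    """Objective map restricted to a 1-D Pareto set (assumed example curve)."""
    return (x, 1.0 - math.sqrt(x))

def sup_norm_step(x, t):
    """||F(x + t) - F(x)||_inf for a step of size t along the Pareto set."""
    return max(abs(a - b) for a, b in zip(F(x), F(x + t)))

def discretize(delta, theta=0.9, x_end=1.0):
    """March from 0 to x_end; each step t is chosen by bisection so that
    ||F(x + t) - F(x)||_inf is at most (and close to) theta * delta."""
    target = theta * delta
    xs = [0.0]
    while True:
        x = xs[-1]
        hi = x_end - x
        if sup_norm_step(x, hi) <= target:   # the end point is within reach
            xs.append(x_end)
            break
        lo = 0.0
        for _ in range(60):                  # bisection on the step size
            mid = 0.5 * (lo + hi)
            if sup_norm_step(x, mid) > target:
                hi = mid
            else:
                lo = mid
        xs.append(x + lo)
    return [F(x) for x in xs]

for delta in (0.4, 0.01):
    print("delta =", delta, "->", len(discretize(delta)), "points")
```

Both objectives are monotone along this curve, so bisection is valid here and consecutive samples differ by at most Θδ in the ∞-norm; a smaller δ therefore yields a denser discretization.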

[Figure: OKA2, discretizations of the Pareto front (PF) in objective space (f1 vs. f2) for δ=0.01 and δ=0.4]


Numerical Example

NSGA-II applied to ZDT1

[Figure: populations pop1, …, pop7 and the Pareto front]

Y = F(P_Q), Y_i = F(pop_i)

∆2(Y1,Y) = 3.03
∆2(Y2,Y) = 2.71
∆2(Y3,Y) = 1.43
∆2(Y4,Y) = 0.77
∆2(Y5,Y) = 0.31
∆2(Y6,Y) = 0.12
∆2(Y7,Y) = 0.007


Discussion

Conclusions
• New indicator ∆p proposed for the evaluation of MOEAs.
• ∆p is a semi-metric, and a pseudo-metric for bounded archive sizes
• p can (in principle) be used to handle the ‘outlier trade off’

Open Questions
• How to choose p?
• How to measure the distance to a metric?
• How to adapt the selection mechanisms in order to improve ∆p? (∆p is NOT compliant with the dominance relation!)


Thank you for your attention!