Source: …taemo.gforge.inria.fr/2010/slides/04_lara.pdf
1
O. Schütze
Some Comments on GD and IGD and Relations to the Hausdorff Distance
O. Schütze, X. Esquivel, A. Lara, C. Coello
CINVESTAV-IPN, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional
Mexico City, Mexico
2
Outline
Introduction and Background
• Trade-off for the design of indicators for the evaluation of MOEAs
• Metric / Hausdorff distance
Investigation of the Indicators
• GD
• IGD
A ‘New’ Indicator
• Metric properties
• Extension to continuous models
3
Multi-Objective Optimization
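\[
\min_{x \in Q} F(x), \qquad F: Q \subset \mathbb{R}^n \to \mathbb{R}^k, \quad F = (f_1, \dots, f_k), \quad f_i: Q \subset \mathbb{R}^n \to \mathbb{R} \qquad \text{(MOP)}
\]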
Multi-Objective Optimization Problem
P_Q = set of optimal solutions (Pareto set)
F(P_Q) = the image of P_Q (Pareto front)
[Figure: Pareto set in decision space (x) and Pareto front in objective space (f1, f2)]
First we consider discrete (or discretized) models, i.e., |Q|<∞.
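For a finite Q, the Pareto set can be found by pairwise dominance checks. The following is a minimal sketch and not taken from the slides: the function names and the toy problem F(x) = (x^2, (x-2)^2) are illustrative assumptions.

```python
def dominates(u, v):
    """True if objective vector u dominates v (minimization)."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def pareto_set(Q, F):
    """Return the points of the finite set Q whose images are non-dominated."""
    images = [F(x) for x in Q]
    return [x for x, fx in zip(Q, images)
            if not any(dominates(fy, fx) for fy in images)]


# Hypothetical toy problem on a grid: Q = {-2.0, -1.5, ..., 4.0}.
Q = [i / 2 for i in range(-4, 9)]
F = lambda x: (x ** 2, (x - 2) ** 2)

P_Q = pareto_set(Q, F)            # Pareto set (decision space)
front = [F(x) for x in P_Q]       # Pareto front (objective space)
```

The quadratic cost in |Q| is acceptable for small discretized models; larger sets would call for a sorting-based filter.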
4
Outliers in Stochastic Search Algorithms
Example: Consider the MOP
\[
F: [0,1]^n \to \mathbb{R}^k, \qquad F(x) = \begin{pmatrix} x_1 \\ g(x) \end{pmatrix},
\]
where g: [0,1]^n → R^(k-1) (→ Okabe, ZDT).
Assume a point x=(ε,z), z∈[0,1]^(n-1), is a member of the archive/population. Further, assume that new candidate solutions are chosen uniformly at random from the domain.
Then the probability of finding a point that dominates x is less than ε (→ objective 1: a dominating point must have a first component of at most ε). Nevertheless, the distance of x to P_Q can be ‘large’.
[Figure: the point (ε, x2)]
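The bound can be checked empirically. The sketch below uses ZDT1 as one concrete instance of the form F(x) = (x_1, g(x)); the choices n = 5, z = (0.5, …, 0.5), the sample size and the seed are assumptions made only for illustration.

```python
import math
import random


def zdt1(x):
    """ZDT1 objectives (f1, f2); f1 = x[0], as in F(x) = (x_1, g(x))."""
    g = 1.0 + 9.0 * sum(x[1:]) / (len(x) - 1)
    return (x[0], g * (1.0 - math.sqrt(x[0] / g)))


def dominates(u, v):
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def estimate(eps, n=5, trials=20000, seed=0):
    """Estimate P(random y dominates x) and P(y_1 <= eps) by uniform sampling."""
    rng = random.Random(seed)
    x = [eps] + [0.5] * (n - 1)          # hypothetical outlier x = (eps, z)
    fx = zdt1(x)
    hits = small_first = 0
    for _ in range(trials):
        y = [rng.random() for _ in range(n)]
        small_first += y[0] <= eps       # necessary condition for dominance
        hits += dominates(zdt1(y), fx)
    return hits / trials, small_first / trials


p_dom, p_first = estimate(eps=0.01)
```

Every dominating sample must satisfy y_1 ≤ ε, so `p_dom` can never exceed `p_first`, which itself is close to ε. For ZDT1 the Pareto set is given by x_2 = … = x_n = 0, so this x lies at decision-space distance 1 from P_Q.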
5
Example
P: hypothetical Pareto front
X1: perfect approximation of P, except for one outlier
X2: none of the elements is ‘near’ P
Question: Which approximation is ‘better’?
Extreme situations:
-- pessimistic view (Hausdorff distance): dH(X1,P)=9, dH(X2,P)=2.83
-- averaged result (Generational distance): GD(X1,P)=0.81, GD(X2,P)=2.83
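The two views can be compared on a small constructed data set. The coordinates below are hypothetical (the figure is not available in the transcript); they are chosen so that X1 matches 10 of 11 front points and has one outlier at distance 9, and X2 is the front shifted by (2, 2). The averaged view is computed as the plain mean of the distances.

```python
import math


def dist_point(u, A):
    """Distance from point u to the finite set A (Euclidean norm)."""
    return min(math.dist(u, v) for v in A)


def dist_set(B, A):
    """One-sided distance: largest distance from a point of B to the set A."""
    return max(dist_point(u, A) for u in B)


def hausdorff(A, B):
    return max(dist_set(A, B), dist_set(B, A))


def mean_dist(X, Y):
    """Arithmetic mean of the distances from the points of X to Y."""
    return sum(dist_point(x, Y) for x in X) / len(X)


# Hypothetical front: 11 points on the line f1 + f2 = 10.
P = [(float(i), 10.0 - i) for i in range(11)]
a = 9.0 / math.sqrt(2.0)
X1 = [q for q in P if q != (5.0, 5.0)] + [(5.0 + a, 5.0 + a)]  # one outlier
X2 = [(x + 2.0, y + 2.0) for x, y in P]                        # uniformly shifted

pessimistic = (hausdorff(X1, P), hausdorff(X2, P))   # about (9, 2.83)
averaged = (mean_dist(X1, P), mean_dist(X2, P))      # about (0.82, 2.83)
```

Under these assumptions the Hausdorff distance prefers X2, while the averaged distance prefers X1, which is the trade-off raised by the question above.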
6
Outlier Trade Off
Trade-off for the indicator D when measuring results of MOEAs (the design of MOEAs is influenced by D):
Use of a Metric
+ Greedy search = shortest path to the set of interest (→ triangle inequality)
-- Penalization of single outliers of the candidate set
Averaging the Results
+ Single outliers do not have a major influence on the result
-- The greedy search is not necessarily the shortest path to the set of interest
7
Metric
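Definition: Suppose X is a set and d: X×X → R is a function. Then d is called a metric on X if, and only if, for each a,b,c∈X:
\[
\begin{aligned}
&\text{(a)}\quad d(a,b) \ge 0 \ \text{and}\ d(a,b) = 0 \Leftrightarrow a = b && \text{(Positive Property)}\\
&\text{(b)}\quad d(a,b) = d(b,a) && \text{(Symmetric Property)}\\
&\text{(c)}\quad d(a,c) \le d(a,b) + d(b,c) && \text{(Triangle Inequality)}
\end{aligned}
\]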
Variants:
-- d is called a semi-metric if properties (a) and (b) are satisfied
-- A pseudo-metric is a semi-metric that satisfies the relaxed triangle inequality:
\[
d(a,c) \le \sigma \, \big( d(a,b) + d(b,c) \big), \qquad \sigma \ge 1
\]
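The three properties can be tested on a finite sample of points. The checker below is an illustrative sketch (the name `check_axioms` is an assumption); passing on a sample does not prove a property, but a failure disproves it. The squared distance on the real line serves as an example that fails the triangle inequality for σ=1 yet satisfies the relaxed version with σ=2.

```python
from itertools import product


def check_axioms(d, points, sigma=1.0):
    """Check properties (a), (b) and the (relaxed) triangle inequality on a sample."""
    pairs = list(product(points, repeat=2))
    positive = all(d(a, b) >= 0 and ((d(a, b) == 0) == (a == b)) for a, b in pairs)
    symmetric = all(d(a, b) == d(b, a) for a, b in pairs)
    triangle = all(d(a, c) <= sigma * (d(a, b) + d(b, c))
                   for a, b, c in product(points, repeat=3))
    return {"positive": positive, "symmetric": symmetric, "triangle": triangle}


sq = lambda a, b: (a - b) ** 2        # squared distance on the real line
pts = [0, 1, 2, 5]

strict = check_axioms(sq, pts)             # sigma = 1: triangle inequality fails
relaxed = check_axioms(sq, pts, sigma=2)   # sigma = 2: relaxed inequality holds
```

In the terminology used here, the squared distance is therefore a semi-metric, and the sample is consistent with it being a pseudo-metric for σ=2.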
8
Hausdorff Distance
Definition: Let u,v∈R^n, A,B⊂R^n, and ||.|| be a vector norm. The Hausdorff distance dH is defined as follows:
\[
\begin{aligned}
&\text{(a)}\quad dist(u,A) := \inf_{v \in A} \|u - v\|\\
&\text{(b)}\quad dist(B,A) := \sup_{u \in B} dist(u,A)\\
&\text{(c)}\quad d_H(A,B) := \max\big( dist(A,B),\, dist(B,A) \big)
\end{aligned}
\]
Remarks:
(i) dist(A,B) is not symmetric: if B is a proper subset of A, then dist(B,A)=0 and dist(A,B)>0.
(ii) dH is a metric on the set of discrete sets. It can also be used for continuous spaces. In that case, dH(A,B)=0 ⇔ clos(A)=clos(B).
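For finite sets, inf and sup become min and max, and definitions (a) to (c) translate directly into code. This is a minimal sketch assuming the Euclidean norm; the function names and sample sets are illustrative.

```python
import math


def dist_point(u, A):
    """(a) dist(u, A): smallest distance from u to a point of the finite set A."""
    return min(math.dist(u, v) for v in A)


def dist_set(B, A):
    """(b) dist(B, A): largest value of dist(u, A) over all u in B."""
    return max(dist_point(u, A) for u in B)


def hausdorff(A, B):
    """(c) d_H(A, B) = max(dist(A, B), dist(B, A))."""
    return max(dist_set(A, B), dist_set(B, A))


A = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
B = [(0.0, 0.0), (1.0, 0.0)]          # proper subset of A

one_sided = (dist_set(B, A), dist_set(A, B))   # (0.0, 3.0): not symmetric
d_h = hausdorff(A, B)                          # 3.0
```

The pair `one_sided` illustrates remark (i): the subset B is at distance 0 from A, while the point (4, 0) of A is at distance 3 from B.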
9
Discussion of GD (1)
GD as proposed by Van Veldhuizen, applied to general finite sets X, Y⊂R^k using dist:
\[
GD(X,Y) = \frac{1}{|X|} \left( \sum_{i=1}^{|X|} dist(x_i, Y)^p \right)^{1/p}
\]
Metric properties:
-- positive property: NO. GD(X,Y)=0 ⇔ X⊂Y, so X can be a proper subset of Y (*)
-- symmetric property: NO. In case (*), GD(X,Y)=0 but GD(Y,X)>0
-- triangle inequality: NO (→ next slide)
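The failure of the positive and symmetric properties can be seen on a proper subset. The sketch below implements GD with the factor 1/|X| outside the p-th root; the sample sets and the Euclidean norm are assumptions for illustration.

```python
import math


def dist_point(u, A):
    return min(math.dist(u, v) for v in A)


def gd(X, Y, p=2):
    """GD(X, Y) = (1/|X|) * (sum_i dist(x_i, Y)^p)^(1/p)."""
    return sum(dist_point(x, Y) ** p for x in X) ** (1.0 / p) / len(X)


Y = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
X = Y[:2]                      # proper subset of Y, case (*)

forward = gd(X, Y)             # 0.0 although X != Y
backward = gd(Y, X)            # 1.0, so GD(X, Y) != GD(Y, X)
```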
10
Discussion of GD (2)
1.) Normalization strategy of GD: Let A_1={a} with dist(F(a),F(P_Q))=1, i.e., GD(F(A_1),F(P_Q))=1.
Now let A_n be the multiset consisting of n copies of a, A_n={a,…,a}. Then, for p>1,
\[
GD(F(A_n), F(P_Q)) = \frac{\|(1,\dots,1)^T\|_p}{n} = \frac{\sqrt[p]{n}}{n} \to 0 \quad \text{for } n \to \infty.
\]
2.) Investigate the (relaxed) triangle inequality: let X,Z⊂R^k such that GD(X,Z)>0. Let
rhs(Y) := GD(X,Y) + GD(Y,Z)
and define Y_n := X ∪ {y_1,y_2,…,y_n}
such that Σ_i dist(y_i,Z) < ∞. Then GD(X,Y_n)=0 and GD(Y_n,Z) → 0 for n → ∞.
⇒ GD does not satisfy any relaxed triangle inequality, since rhs(Y_n) → 0 while GD(X,Z)>0 stays fixed.
Note: for p>1, any set {y_1,…,y_n}⊂F(Q) (if compact) can be taken!
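Both effects can be reproduced numerically. In the sketch below, Z = {(0,0)} stands in for F(P_Q), X consists of one point at distance 1, and the added points y_i are placed at distance 2^-i from Z so that their distances are summable; all of these choices are assumptions for illustration.

```python
import math


def dist_point(u, A):
    return min(math.dist(u, v) for v in A)


def gd(X, Y, p):
    """GD(X, Y) = (1/|X|) * (sum_i dist(x_i, Y)^p)^(1/p)."""
    return sum(dist_point(x, Y) ** p for x in X) ** (1.0 / p) / len(X)


Z = [(0.0, 0.0)]               # stands in for F(P_Q)
X = [(1.0, 0.0)]               # a single point at distance 1 from Z

# 1.) n copies of the same point: GD = n^(1/p) / n, here with p = 2.
gd_copies = {n: gd(X * n, Z, p=2) for n in (1, 10, 100)}


# 2.) Y_n = X united with {y_1, ..., y_n}, where dist(y_i, Z) = 2^-i.
def rhs(n, p=1):
    Y_n = X + [(2.0 ** -i, 0.0) for i in range(1, n + 1)]
    return gd(X, Y_n, p) + gd(Y_n, Z, p)


rhs_values = {n: rhs(n) for n in (10, 100, 1000)}
```

`gd_copies` shrinks like 1/sqrt(n) although every copy is equally far from Z, and `rhs_values` shrinks roughly like 2/(n+1) while GD(X,Z) stays at 1, so no fixed σ can rescue the triangle inequality.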
11
New Variant of GD
Obvious modification: take the power mean of the distances:
\[
GD_p(X,Y) = \left( \frac{1}{|X|} \sum_{i=1}^{|X|} dist(x_i, Y)^p \right)^{1/p} = \frac{1}{\sqrt[p]{|X|}} \left( \sum_{i=1}^{|X|} dist(x_i, Y)^p \right)^{1/p}
\]
-- same (poor) metric properties, but
-- better averaging: GD_p(F(A_n),F(P_Q))=1 for all n∈N
-- (needed for the upcoming indicator)
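The difference between the two normalizations shows up on a multiset of n identical points at distance 1. This is a minimal sketch with p = 2; the reference set Z and the point a are hypothetical.

```python
import math


def dist_point(u, A):
    return min(math.dist(u, v) for v in A)


def gd(X, Y, p):
    """Original GD: factor 1/|X| outside the p-th root."""
    return sum(dist_point(x, Y) ** p for x in X) ** (1.0 / p) / len(X)


def gd_p(X, Y, p):
    """Power-mean variant GD_p: factor 1/|X| inside the p-th root."""
    return (sum(dist_point(x, Y) ** p for x in X) / len(X)) ** (1.0 / p)


Z = [(0.0, 0.0)]               # stands in for F(P_Q)
a = (1.0, 0.0)                 # dist(a, Z) = 1

# One row per n: (n, GD of n copies, GD_p of n copies)
rows = [(n, gd([a] * n, Z, 2), gd_p([a] * n, Z, 2)) for n in (1, 10, 100)]
```

The second column decays with n, while the third stays at 1 for every n.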
12
Discussion IGD
IGD as proposed by Coello & Cruz, applied to general finite sets X, Y⊂R^k using dist:
\[
IGD(X,Y) = \frac{1}{|Y|} \left( \sum_{i=1}^{|Y|} dist(y_i, X)^p \right)^{1/p}
\]
-- same metric properties as GD since IGD(A,B) = GD(B,A)
-- same modification: take power mean of the distances:
\[
IGD_p(X,Y) = \left( \frac{1}{|Y|} \sum_{i=1}^{|Y|} dist(y_i, X)^p \right)^{1/p} = \frac{1}{\sqrt[p]{|Y|}} \left( \sum_{i=1}^{|Y|} dist(y_i, X)^p \right)^{1/p}
\]
13
A “New” Indicator
Observation: GD(X,Y) is an ‘averaged version’ of dist(X,Y), and the same holds for IGD → combine GD and IGD as for dH:
\[
\Delta_p(X,Y) = \max\big( GD_p(X,Y),\, IGD_p(X,Y) \big)
\]
Proposition 1: ∆p is a semi-metric for 1≤p<∞ and a metric for p=∞
Remark: for p=∞ the indicator ∆p coincides with dH
Proposition 2: let |X|,|Y|,|Z|≤N, then
\[
\Delta_p(X,Z) \le \sqrt[p]{N} \, \big( \Delta_p(X,Y) + \Delta_p(Y,Z) \big)
\]
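A compact sketch of the indicator for finite sets follows. It assumes the Euclidean norm, treats p=∞ as the maximum (the limit of the power mean), and uses hypothetical sample sets; the function names are illustrative.

```python
import math


def dist_point(u, A):
    return min(math.dist(u, v) for v in A)


def power_mean(values, p):
    """Power mean of non-negative values; p = inf gives the maximum."""
    if math.isinf(p):
        return max(values)
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


def gd_p(X, Y, p):
    return power_mean([dist_point(x, Y) for x in X], p)


def igd_p(X, Y, p):
    return gd_p(Y, X, p)


def delta_p(X, Y, p):
    return max(gd_p(X, Y, p), igd_p(X, Y, p))


X = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
Y = [(0.0, 0.0), (1.0, 0.0)]

values = {p: delta_p(X, Y, p) for p in (1, 2, math.inf)}
```

For these sets the only nonzero distance is 3 (from the point (4, 0) to Y), so the indicator moves from the mean 1.0 at p=1 to the Hausdorff distance 3.0 at p=∞, with larger p giving more weight to the outlier.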
14
Interpretation of p for the Trade Off
         p=1     p=2     p=5     p=10    p=∞
N=2      0.541   0.15    0.026   0.008   0
N=4      0.249   0.06    0.019   0.009   0
N=6      0.105   0.033   0.008   0.003   0
N=10     0.02    0.004   0.002   0.001   0
N=100    0       0       0       0       0
Table: Percentage of triangle inequality violations (σ=1) for different values of p, based on 100,000 different sets A,B,C with |A|,|B|,|C|=N and k=2, each entry chosen randomly within [0,1].
The larger the value of p, the ‘nearer’ ∆p is to a metric (but: how to choose p? what is the influence of N?)
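An experiment of this kind can be sketched as follows. The sampling scheme follows the table caption (random sets in [0,1]^2, σ=1), but the trial count, the seed and the floating-point tolerance are assumptions, and the function returns a fraction of trials, so its numbers are not expected to reproduce the table entries.

```python
import math
import random


def dist_point(u, A):
    return min(math.dist(u, v) for v in A)


def power_mean(values, p):
    if math.isinf(p):
        return max(values)
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


def delta_p(X, Y, p):
    gd = power_mean([dist_point(x, Y) for x in X], p)
    igd = power_mean([dist_point(y, X) for y in Y], p)
    return max(gd, igd)


def violation_rate(N, p, trials=2000, seed=0, tol=1e-12):
    """Fraction of random triples (A, B, C) violating the triangle inequality."""
    rng = random.Random(seed)
    rand_set = lambda: [(rng.random(), rng.random()) for _ in range(N)]
    violations = 0
    for _ in range(trials):
        A, B, C = rand_set(), rand_set(), rand_set()
        if delta_p(A, C, p) > delta_p(A, B, p) + delta_p(B, C, p) + tol:
            violations += 1
    return violations / trials


rates = {p: violation_rate(4, p, trials=500) for p in (1, 2, math.inf)}
```

For p=∞ the indicator is the Hausdorff distance, a metric, so that rate is zero up to the tolerance.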
15
Example
P: hypothetical Pareto front
X1: perfect approximation of P, except for one outlier
X2: none of the elements is ‘near’ P
Question: Which approximation is ‘better’?
            p=1      p=2     p=5     p=10    p=∞
∆p(P,X1)    0.8182   2.714   4.047   5.571   9
∆p(P,X2)    2.828    2.828   2.828   2.828   2.828
16
Extension to Continuous Models
Now consider continuous models
In general: k objectives → P_Q is (k-1)-dimensional
GD_p: A finite, P_Q compact → GD turns into a continuous SOP
IGD_p: P_Q continuous → the power mean of IGD_p turns into an integral.
Example: k=2, F(P_Q) connected, with a parametrization γ: [m_1, M_1] → R^2, then
\[
IGD_p(F(A), F(P_Q)) = \left( \frac{1}{M_1 - m_1} \int_{m_1}^{M_1} dist(\gamma(t), F(A))^p \, dt \right)^{1/p}
\]
[Figure: Pareto front in (f1, f2) space with the bounds m1, M1 on f1 and m2, M2 on f2]
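For k=2 the integral form of IGD_p can be approximated by numerical quadrature. The sketch below uses the midpoint rule, which is a choice made here and not prescribed by the slides; the linear front γ(t) = (t, 1-t) and the two-point archive image are hypothetical.

```python
import math


def dist_point(u, A):
    return min(math.dist(u, v) for v in A)


def igd_p_continuous(gamma, m1, M1, FA, p=2, steps=1000):
    """Midpoint-rule approximation of
    ( 1/(M1 - m1) * integral over [m1, M1] of dist(gamma(t), FA)^p dt )^(1/p)."""
    h = (M1 - m1) / steps
    total = sum(dist_point(gamma(m1 + (j + 0.5) * h), FA) ** p
                for j in range(steps))
    return (total * h / (M1 - m1)) ** (1.0 / p)


gamma = lambda t: (t, 1.0 - t)            # hypothetical connected front
FA = [(0.0, 1.0), (1.0, 0.0)]             # image of a two-point archive

value = igd_p_continuous(gamma, 0.0, 1.0, FA, p=1)
```

For this front and p=1 the integral can be evaluated by hand: each half of the front contributes sqrt(2)/8, giving sqrt(2)/4 ≈ 0.3536, which the quadrature matches.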
17
Discretization of F(PQ)
Task: P_Q given analytically; compute an approximation Y of F(P_Q) with dH(Y,F(P_Q))<δ (a priori defined approximation quality)
For k=2: use continuation-like methods: select the step size t such that ||F(x+tv)-F(x)||∞ ≈ Θδ, where Θ<1 is a safety factor (selection of t based on Lipschitz estimations)
[Figure: discretized Pareto front (PF) of OKA2 in (f1, f2) space for δ=0.01 and δ=0.4]
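The step-size idea can be sketched on a one-dimensional Pareto set. Note the substitutions: instead of Lipschitz estimates, the step t is simply halved until ||F(x+t)-F(x)||∞ ≤ Θδ, and the map F(x) = (x, 1 - sqrt(x)) on [0,1] is a hypothetical stand-in, not OKA2. The names `discretize`, `theta` and `t0` are illustrative.

```python
import math


def F(x):
    """Hypothetical bi-objective map on the Pareto set x in [0, 1]."""
    return (x, 1.0 - math.sqrt(x))


def sup_norm_diff(a, b):
    return max(abs(ai - bi) for ai, bi in zip(a, b))


def discretize(delta, theta=0.5, t0=0.1):
    """Walk along [0, 1], halving the step until the image moves by at most theta*delta."""
    target = theta * delta
    x, points = 0.0, [F(0.0)]
    while x < 1.0:
        t = min(t0, 1.0 - x)                       # never step past the end point
        while sup_norm_diff(F(x + t), F(x)) > target:
            t *= 0.5                               # step too large: halve it
        x += t
        points.append(F(x))
    return points


coarse = discretize(delta=0.4)
fine = discretize(delta=0.01)
```

Smaller δ yields more points, most densely near x=0 where the front is steepest. The sketch only bounds the gap between consecutive images; it does not by itself certify dH(Y,F(P_Q))<δ.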
18
Numerical Example
NSGA-II applied to ZDT1
Y = F(P_Q), Y_i = F(pop_i)
[Figure: populations pop1 to pop7 and the Pareto front]
∆2(Y1,Y) = 3.03
∆2(Y2,Y) = 2.71
∆2(Y3,Y) = 1.43
∆2(Y4,Y) = 0.77
∆2(Y5,Y) = 0.31
∆2(Y6,Y) = 0.12
∆2(Y7,Y) = 0.007
19
Discussion
Conclusions
• New indicator ∆p proposed for the evaluation of MOEAs.
• ∆p is a semi-metric, and a pseudo-metric for bounded archive sizes
• p can (in principle) be used to handle the ‘outlier trade off’
Open Questions
• How to choose p?
• How to measure the distance to a metric?
• How to adapt the selection mechanisms in order to improve ∆p? (∆p is NOT compliant with the dominance relation!)
20
Thank you for your attention!