Advances in Metric Embedding Theory

DESCRIPTION
Advances in Metric Embedding Theory. Ofer Neiman, Ittai Abraham, Yair Bartal, Hebrew University. Talk outline. Current results: new method of embedding; new partition techniques; constant average distortion; extended notions of distortion; optimal results for scaling embeddings.

TRANSCRIPT
Advances in Metric Embedding Theory
Ofer Neiman
Ittai Abraham, Yair Bartal
Hebrew University
Talk Outline
Current results:
- New method of embedding.
- New partition techniques.
- Constant average distortion.
- Extended notions of distortion.
- Optimal results for scaling embeddings.
- Tradeoff between distortion and dimension.
Work in progress:
- Low-dimension embeddings for doubling metrics.
- Scaling distortion into a single tree.
- Nearest-neighbor-preserving embeddings.
Embedding Metric Spaces
Metric spaces (X,d_X), (Y,d_Y).
An embedding is a function f: X → Y.
For a non-contracting embedding f, given u,v in X let
  dist_f(u,v) = d_Y(f(u),f(v)) / d_X(u,v).
f has distortion c if max_{u,v ∈ X} dist_f(u,v) ≤ c.
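As a sanity check of the definition, here is a small Python sketch (not from the talk) that computes dist_f and the worst-case distortion of a non-contracting map between two toy finite metric spaces:

```python
import itertools

def distortion(points, d_X, d_Y, f):
    # Worst-case distortion of a non-contracting embedding f:
    # dist_f(u, v) = d_Y(f(u), f(v)) / d_X(u, v), maximized over pairs.
    return max(d_Y(f(u), f(v)) / d_X(u, v)
               for u, v in itertools.combinations(points, 2))

# Toy example: embed three points on a line into the plane via u -> (u, u);
# every distance is scaled by exactly sqrt(2), so the distortion is sqrt(2).
points = [0, 1, 3]
d_X = lambda u, v: abs(u - v)
d_Y = lambda p, q: ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
f = lambda u: (u, u)
print(distortion(points, d_X, d_Y, f))  # -> 1.414... (sqrt(2))
```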
Low-Dimension Embeddings into L_p
For an arbitrary metric space on n points:
[Bourgain 85]: distortion O(log n).
[LLR 95]: distortion Θ(log n), dimension O(log² n).
Can the dimension be reduced? For p=2, yes: using [JL], to dimension O(log n).
Theorem: embedding into L_p with distortion O(log n) and dimension O(log n), for any p.
Theorem: distortion O(log^{1+θ} n), dimension Θ(log n / (θ loglog n)).
Average Distortion Embeddings
In many practical settings the quality of an embedding is measured by its average distortion:
- Network embedding
- Multi-dimensional scaling
- Biology
- Vision
Theorem: Every n-point metric space can be embedded into L_p with average distortion O(1), worst-case distortion O(log n), and dimension O(log n).
  avgdist(f) = (1 / (n choose 2)) · Σ_{{u,v} ⊆ X} dist_f(u,v), where dist_f(u,v) = d_Y(f(u),f(v)) / d_X(u,v).
A variation on distortion: the L_q-distortion of an embedding.
Given a non-contracting embedding f from (X,d_X) to (Y,d_Y), define its L_q-distortion:
  dist_q(f) = ( (1 / (n choose 2)) · Σ_{{u,v} ⊆ X} dist_f(u,v)^q )^{1/q}, where dist_f(u,v) = d_Y(f(u),f(v)) / d_X(u,v).
For q = ∞ this is the worst-case distortion, dist_∞(f) = max_{u,v ∈ X} dist_f(u,v); q = 1 gives the average distortion, and q = 2 the L_2-distortion.
Thm: The L_q-distortion is bounded by O(q).
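A direct computation of the L_q-distortion from its definition, in Python (a toy illustration, not the talk's construction); q=1 recovers the average distortion, and large q approaches the worst case:

```python
import itertools

def lq_distortion(points, d_X, d_Y, f, q):
    # L_q-distortion: q-th moment of dist_f(u, v) over uniform pairs.
    pairs = list(itertools.combinations(points, 2))
    ratios = [d_Y(f(u), f(v)) / d_X(u, v) for u, v in pairs]
    return (sum(r ** q for r in ratios) / len(pairs)) ** (1.0 / q)

# Toy embedding of a line metric that stretches far pairs more;
# the six pairwise ratios are 1, 2, 3, 4, 5, 6.
points = [0.0, 1.0, 2.0, 4.0]
d = lambda u, v: abs(u - v)
f = lambda u: u * u
print(lq_distortion(points, d, d, f, q=1))   # average distortion -> 3.5
print(lq_distortion(points, d, d, f, q=64))  # approaches the max ratio 6
```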
Partial & Scaling Distortion
Definition: A (1-ε)-partial embedding has distortion D(ε) if at least a 1-ε fraction of the pairs satisfy dist_f(u,v) ≤ D(ε).
Definition: An embedding has scaling distortion D(·) if it is a (1-ε)-partial embedding with distortion D(ε), for all ε > 0 simultaneously.
[KSW 04]: Introduced the problem in the context of network embeddings; initial results.
[A+ 05]: Partial distortion and dimension O(log(1/ε)) for all metrics. Scaling distortion O(log(1/ε)) for doubling metrics.
Thm: Scaling distortion O(log(1/ε)) for all metrics.
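The (1-ε)-partial distortion of a given map can be read off by sorting the pairwise ratios and discarding the worst ε fraction; a small Python sketch (the example points and map are hypothetical):

```python
import itertools

def partial_distortion(points, d_X, d_Y, f, eps):
    # Smallest D such that f is a (1-eps)-partial embedding with
    # distortion D: keep the best (1-eps) fraction of pairs, take the max.
    ratios = sorted(d_Y(f(u), f(v)) / d_X(u, v)
                    for u, v in itertools.combinations(points, 2))
    keep = max(1, int((1 - eps) * len(ratios)))
    return ratios[keep - 1]

points = [0.0, 1.0, 2.0, 4.0]
d = lambda u, v: abs(u - v)
f = lambda u: u * u            # pairwise ratios are 1, 2, 3, 4, 5, 6
print(partial_distortion(points, d, d, f, eps=0.5))  # -> 3.0
```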
L_q-Distortion vs. Scaling Distortion
An upper bound of O(log 1/ε) on scaling distortion implies:
- L_q-distortion = O(min{q, log n}).
- Average distortion = O(1).
- Distortion = O(log n).
For any metric:
- ½ of the pairs have distortion ≤ c log 2 = c
- +¼ of the pairs have distortion ≤ c log 4 = 2c
- +⅛ of the pairs have distortion ≤ c log 8 = 3c
- …
- +1/n² of the pairs have distortion ≤ 2c log n
For ε < 1/n², no pairs are ignored, so
  avgdist(f) ≤ Σ_{i≥1} i·c·2^{-i} = 2c = O(1)·c.
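The averaging step above is the standard identity Σ_{i≥1} i/2^i = 2; a quick numeric check in Python:

```python
# Each term: a 2^{-i} fraction of pairs contributes distortion at most i*c,
# so avgdist <= sum_{i>=1} i*c / 2^i = 2c = O(1) * c.
c = 1.0
total = sum(i * c / 2 ** i for i in range(1, 200))
print(total)  # -> 2.0 (up to negligible truncation error)
```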
A lower bound of Ω(log 1/ε) on partial distortion implies:
L_q-distortion = Ω(min{q, log n}).
Probabilistic Partitions
P = {S_1, S_2, …, S_t} is a partition of X if the S_i are pairwise disjoint and ∪_i S_i = X.
P(x) is the cluster containing x. P is Δ-bounded if diam(S_i) ≤ Δ for all i.
A probabilistic partition P is a distribution over a set of partitions.
P is η-padded if Pr[B(x, ηΔ) ⊆ P(x)] ≥ ½.
Let Δ_i = 4^i be the scales.
For each scale i, create a probabilistic Δ_i-bounded partition P_i that is η-padded.
For each cluster choose σ_i(S) ~ Ber(½) i.i.d.
  f_i(x) = σ_i(P_i(x)) · d(x, X \ P_i(x))
Repeat O(log n) times.
Distortion: O(η^{-1} · log^{1/p} Δ). Dimension: O(log n · log Δ).
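A much-simplified sketch of one scale of this construction in Python: a Δ-bounded random ball partition (a basic carving scheme standing in for the talk's padded partitions) and the corresponding coordinate f(x) = σ(P(x)) · d(x, X \ P(x)). All names here are illustrative, not from the paper.

```python
import random

def ball_partition(points, d, delta, rng):
    # Delta-bounded random partition: carve balls of a random radius
    # r in [delta/4, delta/2] around points taken in random order.
    # Each cluster has diameter at most 2r <= delta.
    order = points[:]
    rng.shuffle(order)
    r = rng.uniform(delta / 4, delta / 2)
    cluster = {}
    for c in order:
        for x in points:
            if x not in cluster and d(x, c) <= r:
                cluster[x] = c  # label each cluster by its center
    return cluster

def coordinate(points, d, cluster, sigma):
    # One embedding coordinate: f(x) = sigma(P(x)) * d(x, X \ P(x)).
    f = {}
    for x in points:
        outside = [d(x, y) for y in points if cluster[y] != cluster[x]]
        f[x] = sigma[cluster[x]] * (min(outside) if outside else 0.0)
    return f

rng = random.Random(0)
pts = [0.0, 1.0, 5.0, 6.0]
d = lambda u, v: abs(u - v)
P = ball_partition(pts, d, delta=4.0, rng=rng)
sigma = {c: rng.randint(0, 1) for c in set(P.values())}  # Ber(1/2) per cluster
f = coordinate(pts, d, P, sigma)
```

Repeating this for every scale Δ_i = 4^i and concatenating O(log n) independent copies per scale gives the overall shape of the embedding.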
Partitions and Embedding
  f(x) = Σ_{i≥0} f_i(x)
[Figure: scales from the diameter of X (= Δ) down through the Δ_i; the coordinate of x is d(x, X \ P(x)).]
Upper Bound
  f_i(x) = σ_i(P_i(x)) · d(x, X \ P_i(x))
For all x,y ∈ X:
- P_i(x) ≠ P_i(y) implies d(x, X \ P_i(x)) ≤ d(x,y).
- P_i(x) = P_i(y) implies |d(x,A) - d(y,A)| ≤ d(x,y) for A = X \ P_i(x).
Hence |f_i(x) - f_i(y)| ≤ d(x,y) at every scale, and summing over the O(log Δ) scales:
  ||f(x) - f(y)||_p^p = Σ_{i≥0} |f_i(x) - f_i(y)|^p ≤ O(log Δ) · d(x,y)^p.
Lower Bound
Take a scale i such that Δ_i ≈ d(x,y)/4. It must be that P_i(x) ≠ P_i(y).
With probability ½: d(x, X \ P_i(x)) ≥ ηΔ_i.
With probability ¼: σ_i(P_i(x)) = 1 and σ_i(P_i(y)) = 0, so
  |f_i(x) - f_i(y)| ≥ ηΔ_i = Ω(η · d(x,y)).
η-padded Partitions
The parameter η determines the quality of the embedding.
[Bartal 96]: η = Ω(1/log n) for any metric space.
[Rao 99]: η = Ω(1), used to embed planar metrics into L_2.
[CKR01+FRT03]: Improved partitions with η(x) = log^{-1} ρ(x,Δ).
[KLMN 03]: Used to embed general + doubling metrics into L_p: distortion O(η^{-(1-1/p)} log^{1/p} n), dimension O(log² n).
The local growth rate of x at radius r is:
  ρ(x,r) = |B(x,4r)| / |B(x,r/4)|.
Uniform Probabilistic Partitions
In a uniform probabilistic partition, η: X → [0,1] and all points in a cluster have the same padding parameter.
Uniform partition lemma: There exists a uniform probabilistic Δ-bounded partition such that for any cluster C, η(x) = log^{-1} ρ(v,Δ) for all x ∈ C, where v = argmin_{u∈C} ρ(u,Δ).
[Figure: clusters C_1, C_2 with padding parameters η(C_1), η(C_2).]
Let Δ_i = 4^i.
For each scale i, create uniformly padded probabilistic Δ_i-bounded partitions P_i.
For each cluster choose σ_i(S) ~ Ber(½) i.i.d.
  f_i(x) = σ_i(P_i(x)) · η_i^{-1}(x) · d(x, X \ P_i(x)),  f(x) = Σ_{i≥0} f_i(x)
Upper bound: |f(x) - f(y)| ≤ O(log n) · d(x,y).
Lower bound: E[|f(x) - f(y)|] ≥ Ω(d(x,y)).
Replicate D = Θ(log n) times to get high probability.
Embedding into One Dimension
Upper Bound: |f(x) - f(y)| ≤ O(log n) · d(x,y)
  f_i(x) = σ_i(P_i(x)) · η_i^{-1}(x) · d(x, X \ P_i(x))
For all x,y ∈ X:
- P_i(x) ≠ P_i(y) implies f_i(x) ≤ η_i^{-1}(x) · d(x,y).
- P_i(x) = P_i(y) implies |f_i(x) - f_i(y)| ≤ η_i^{-1}(x) · d(x,y), using the uniform padding in the cluster.
Summing over scales, the sum telescopes:
  |f(x) - f(y)| ≤ Σ_{i≥0} |f_i(x) - f_i(y)| ≤ d(x,y) · Σ_{i≥0} η_i^{-1}(x) = d(x,y) · Σ_{i≥0} log( |B(x,4Δ_i)| / |B(x,Δ_i/4)| ) ≤ O(log n) · d(x,y).
Lower Bound: E[|f(x) - f(y)|] ≥ Ω(d(x,y))
Take a scale i such that Δ_i ≈ d(x,y)/4. It must be that P_i(x) ≠ P_i(y).
With probability ½: f_i(x) = η_i^{-1}(x) · d(x, X \ P_i(x)) ≥ Δ_i.
Let R = |Σ_{j<i} (f_j(x) - f_j(y))|. Two cases:
1. R < Δ_i/2: with prob. ⅛, σ_i(P_i(x)) = 1 and σ_i(P_i(y)) = 0; then f_i(x) ≥ Δ_i, f_i(y) = 0, so |f(x) - f(y)| ≥ Δ_i/2 = Ω(d(x,y)).
2. R ≥ Δ_i/2: with prob. ¼, σ_i(P_i(x)) = 0 and σ_i(P_i(y)) = 0; then f_i(x) = f_i(y) = 0, so |f(x) - f(y)| ≥ Δ_i/2 = Ω(d(x,y)).
Coarse Scaling Embedding into L_p
Definition: For u ∈ X, r_ε(u) is the minimal radius such that |B(u, r_ε(u))| ≥ εn.
Coarse scaling embedding: for each u ∈ X, preserves distances outside B(u, r_ε(u)).
[Figure: balls B(u, r_ε(u)), B(v, r_ε(v)), B(w, r_ε(w)).]
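Computing r_ε(u) on a finite metric is a one-liner: sort the distances from u and take the ⌈εn⌉-th smallest. A Python sketch (toy example):

```python
import math

def r_eps(points, d, u, eps):
    # Minimal radius r with |B(u, r)| >= eps * n (closed ball, includes u).
    n = len(points)
    dists = sorted(d(u, v) for v in points)  # d(u, u) = 0 is included
    k = max(1, math.ceil(eps * n))
    return dists[k - 1]

pts = list(range(10))
d = lambda u, v: abs(u - v)
print(r_eps(pts, d, 0, 0.5))  # -> 4: B(0, 4) = {0,...,4}, 5 of 10 points
```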
Scaling Distortion
Claim: If d(x,y) > r_ε(x) then 1 ≤ dist_f(x,y) ≤ O(log 1/ε).
Let l be the scale with d(x,y) ≤ Δ_l < 4·d(x,y).
Lower bound: E[|f(x) - f(y)|] ≥ Ω(d(x,y)).
Upper bound for high-diameter terms: Σ_{i≥l} |f_i(x) - f_i(y)| ≤ O(log 1/ε) · d(x,y).
Upper bound for low-diameter terms: Σ_{i<l} |f_i(x) - f_i(y)| ≤ O(1) · d(x,y).
Replicate D = Θ(log n) times to get high probability.
Upper Bound for high-diameter terms: Σ_{i≥l} |f_i(x) - f_i(y)| ≤ O(log 1/ε) · d(x,y)
Take the scale l such that r_ε(x) ≤ d(x,y) ≤ Δ_l < 4·d(x,y). With
  f_i(x) = σ_i(P_i(x)) · η_i^{-1}(x) · d(x, X \ P_i(x)),
  Σ_{i≥l} |f_i(x) - f_i(y)| ≤ d(x,y) · Σ_{i≥l} η_i^{-1}(x) = d(x,y) · Σ_{i≥l} log( |B(x,4Δ_i)| / |B(x,Δ_i/4)| ).
The sum telescopes to O( log( n / |B(x, d(x,y))| ) ); since d(x,y) ≥ r_ε(x) we have |B(x, d(x,y))| ≥ εn, so the total is O(log 1/ε) · d(x,y).
Upper Bound for low-diameter terms: Σ_{i<l} |f_i(x) - f_i(y)| ≤ O(1) · d(x,y)
Take the scale l such that d(x,y) ≤ Δ_l < 4·d(x,y).
All lower levels i < l are bounded by Δ_i, using the truncated coordinates
  f_i(x) = σ_i(P_i(x)) · min{ η_i^{-1}(x) · d(x, X \ P_i(x)), Δ_i }.
Hence Σ_{i<l} |f_i(x) - f_i(y)| ≤ Σ_{i<l} Δ_i = O(Δ_l) = O(1) · d(x,y).
Altogether: Σ_{i≥0} |f_i(x) - f_i(y)| ≤ (O(log 1/ε) + O(1)) · d(x,y).
Embedding into L_p
A partition P is (η,δ)-padded if Pr[B(x, ηΔ) ⊆ P(x)] ≥ δ.
Lemma: There exist (η,δ)-padded partitions with η(x) = log^{-1} ρ(v,Δ) · log(1/δ), where v = argmin_{u∈P(x)} ρ(u,Δ).
Hierarchical partition: every cluster in level i is a refinement of a cluster in level i+1.
Theorem: Every n-point metric space can be embedded into L_p with dimension O(e^p log n), and for every q:
  dist_q(f) = O(min{q, log n}).
Embedding into L_p
Embedding into L_p with scaling distortion:
- Use partitions with small probability of padding: δ = e^{-p}.
- Hierarchical uniform partitions.
- Combination with Matousek's sampling techniques.
Low Dimension Embeddings
Embedding with distortion O(log^{1+θ} n), dimension Θ(log n / (θ loglog n)).
Optimal trade-off between distortion and dimension.
Use partitions with high probability of padding: δ = 1 - log^{-θ} n.
Additional Results: Weighted Averages
Embedding with weighted average distortion O(log Ψ) for weights with aspect ratio Ψ.
Algorithmic applications: sparsest cut, uncapacitated quadratic assignment, multiple sequence alignment.
Low Dimension Embeddings for Doubling Metrics
Definition: A metric space has doubling constant λ if any ball with radius r > 0 can be covered by λ balls of half the radius. Doubling dimension = log λ.
[GKL03]: Embedding doubling metrics, with tight distortion.
Thm: Embedding arbitrary metrics into L_p with distortion O(log^{1+θ} n), dimension O(log λ). Same embedding, with similar techniques; uses nets and the Lovász Local Lemma.
Thm: Embedding arbitrary metrics into L_p with distortion O(log^{1-1/p} λ · log^{1/p} n), dimension Õ(log n · log λ). Uses hierarchical partitions as well.
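On a finite metric, the doubling constant can be estimated by greedily covering a ball with half-radius balls; a rough Python sketch (a greedy net, so an upper-bound estimate rather than the exact λ):

```python
def half_radius_cover(points, d, center, r):
    # Greedily pick r/2-separated centers inside B(center, r); by the
    # net property every point of the ball is within r/2 of some center,
    # so the count witnesses a valid (not necessarily minimal) cover.
    ball = [x for x in points if d(x, center) <= r]
    centers = []
    for x in ball:
        if all(d(x, c) > r / 2 for c in centers):
            centers.append(x)
    return len(centers)

# On a path metric, any ball is covered by a constant number of half-balls.
pts = list(range(8))
d = lambda u, v: abs(u - v)
print(half_radius_cover(pts, d, 0, 4))  # -> 2 (centers 0 and 3)
```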
Scaling Distortion into Trees
[A+ 05]: Probabilistic embedding into a distribution of ultrametrics with scaling distortion O(log(1/ε)).
Thm: Embedding into an ultrametric with scaling distortion O(√(1/ε)).
Thm: Every graph contains a spanning tree with scaling distortion O(√(1/ε)).
These imply: average distortion = O(1); L_2-distortion = O(√log n).
Can be viewed as a network design objective.
Thm: Probabilistic embedding into a distribution of spanning trees with scaling distortion Õ(log²(1/ε)).
New Results: Nearest-Neighbor-Preserving Embeddings
Definition: x,y are k-nearest neighbors if |B(x, d(x,y))| ≤ k.
Thm: Embedding into L_p with distortion Õ(log k) on k-nearest neighbors, for all k simultaneously, and dimension O(log n).
Thm: For fixed k, embedding into L_p with distortion O(log k) and dimension O(log k).
Practically the same embedding: every level is scaled down, higher levels more aggressively; uses the Lovász Local Lemma.
Nearest-Neighbor-Preserving Embeddings
Thm: Probabilistic embedding into a distribution of ultrametrics with distortion Õ(log k) for all k-nearest neighbors.
Thm: Embedding into an ultrametric with distortion k-1 for all k-nearest neighbors.
Applications: sparsest cut with "neighboring" demand pairs; approximate ranking / k-nearest-neighbor search.
Conclusions
A unified framework for embedding arbitrary metrics.
New measures of distortion.
Embeddings with improved properties:
- Optimal scaling distortion.
- Constant average distortion.
- Tight distortion-dimension tradeoff.
- Embedding metrics in their doubling dimension.
- Nearest-neighbor-preserving embeddings.
- Constant average distortion spanning trees.