mds embedding - math.tau.ac.ildcor/graphics/cg-slides/sgp-rmds.pdf · 30. precision and recall. 31....
TRANSCRIPT
![Page 1: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/1.jpg)
MDS EmbeddingMDS takes as input a distance matrix D, containing allN × N pair of distances between elements xi, and embedthe elements in N dimensional space such that the inter distances Dij are preserved as much as possible by ||xi−xj|| in the embedded space.
1
![Page 2: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/2.jpg)
2
![Page 3: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/3.jpg)
3
![Page 4: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/4.jpg)
4
![Page 5: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/5.jpg)
128 dim space visualized by t-SNE
Joint Embeddings of Shapes and Images
![Page 6: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/6.jpg)
![Page 7: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/7.jpg)
Image based Shape Retrieval
![Page 8: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/8.jpg)
Shape based Image Retrieval
![Page 9: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/9.jpg)
Cross-View Image Retrieval
![Page 10: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/10.jpg)
MDS Embedding
11
![Page 11: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/11.jpg)
Common MDS do not handle outliers
input SMACOF Sammon
12
![Page 12: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/12.jpg)
Two outlier distances lead to significant distortion in the embedding
In many real-world scenarios, input distances may be noisy or contain outliers, due to malicious acts, system faults, or erroneous measures.
13
![Page 13: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/13.jpg)
Two outlier distances lead to significant distortion in the embedding
In many real-world scenarios, input distances may be noisy or contain outliers, due to malicious acts, system faults, or erroneous measures.
14
![Page 14: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/14.jpg)
Least square fitting
15
![Page 15: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/15.jpg)
Least square fitting
16
![Page 16: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/16.jpg)
RANSAC- Generate Lines using Pairs of Points.- Count number of points within ε of line.- Pick the best line.
17
![Page 17: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/17.jpg)
RANSACSadly can’t be applied to MDS – a lot of data is needed for generating an embedding. Almost every sample will still have outliers.
18
![Page 18: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/18.jpg)
Forero and Giannakis method
Lasso regression parameter (when bigger there are less outliers)
The non-zero entries represent the outlier pairs
● Tuning the regularization parameter is not a simple task.● There are NxN unknowns instead of just dxN, thus it is significantly harder to solve
accuratly and thus very sensitive to the initial guess.
19
![Page 19: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/19.jpg)
Different λ applied to the same dataset withthe same initial guess, leads to different embedding qualities.
20
![Page 20: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/20.jpg)
Same λ applied to the same datasets with different initial guesses, yields different embedding qualities.
21
![Page 21: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/21.jpg)
This graph presents the number of non-zero elements in O (which represent outliers) as a function of λ.
The three plots were generated using different initial guesses that were uniformly sampled.
FG12 method is overly sensitive to the initial guess.
22
![Page 22: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/22.jpg)
Embed and remove pairs which are overly stressed…
Sadly, the overly stressed edges are not necessarily outliers.(for example long edge that became a short one can cause a lot of short edges to deform in the embedding).
Also other stress weighting has their shortcomings – we tested that method for a while.
23
![Page 23: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/23.jpg)
Geometric Reasoning
An outlier distance tends to break many triangles.
We detect those outliers and filter them.
24
![Page 24: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/24.jpg)
Broken Triangles
d3
d1
d2
For triangle with edge length
If
d3
d1
d2
then the triangle is broken
25
![Page 25: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/25.jpg)
Broken Triangles
An edge in a broken triangle is not necessarily an outlier
26
Not every outlier edge necessarily breaks a triangle
![Page 26: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/26.jpg)
Histogram of Broken Triangles
27
![Page 27: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/27.jpg)
Histogram of Broken Triangles
We set ф to be the smallest value that satisfies the following two requirements:
28
![Page 28: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/28.jpg)
Shepard Diagram
Each point represents a distance. The X-axis represents the input distances and the Y-axis represents the distance in the embedding result.
29
![Page 29: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/29.jpg)
The Red dots are the distances classified as outliers. Some of the are on diagonal – those are the false positives.
30
![Page 30: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/30.jpg)
Precision and Recall
31
![Page 31: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/31.jpg)
Threshold Performance�The outlier detection rate as a function ofthe shrinkage enlargement of the outliers relative to theground-truth value. Edges that are strongly deformed (either squeezed or enlarged) are likely to be detected.
Note: the X-axis is logarithmic: log2(Dout /DGT ).
32
![Page 32: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/32.jpg)
Qualitative ComparisonA comparison between SMACOF and our method as a function of outlier rate. Up to 22% our method has better performance.
ij
jiij
jiij D
XXSSScore
||||log,
−==∑
≠
33
![Page 33: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/33.jpg)
The embedding of a ’PLUS’ shaped dataset with 10% outliers, and a ’SPIRAL’ shaped dataset with 15% outliers.
(a,c) SMACOF (b,d) Our technique.
35
![Page 34: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/34.jpg)
128 US CitiesTwo-dimensional embedding of SGB128 distances with 10% outliers. �The green dots are the ground-truth locations and the magenta dots represent the embedded points.(a) SMACOF (b) Our Filtering technique.
36
![Page 35: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/35.jpg)
Protein DatasetAverage cluster index value of 10 executions.�The embedding dimension is set to 6, since for lower dimensionsSMACOF fails due to co-located points.
37
![Page 36: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/36.jpg)
Outlier Detection for Robust Multi-dimensional Scaling
Thank You
40
![Page 37: MDS Embedding - math.tau.ac.ildcor/Graphics/cg-slides/sgp-RMDS.pdf · 30. Precision and Recall. 31. Threshold Performance The outlier detection rate as a function of the shrinkage](https://reader033.vdocument.in/reader033/viewer/2022060602/60563aef7e89d336f041e229/html5/thumbnails/37.jpg)
Outlier Detection for Robust Multi-dimensional Scaling
Leonid Blouvshtein and Daniel Cohen-Or
41