![Page 1: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/1.jpg)
SimRank : A Measure of Structural-Context Similarity
Advisor : Dr. Hsu
Graduate : Sheng-Hsuan Wang
Author : Glen Jeh, Jennifer Widom
![Page 2: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/2.jpg)
Outline
Motivation
Objective
Introduction
Basic Graph Model
SimRank
Random Surfer-Pair Model
Future Work
Personal Opinion
![Page 3: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/3.jpg)
Motivation
The problem of measuring “similarity” of objects arises in many applications.
![Page 4: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/4.jpg)
Objective
The approach is applicable in any domain with object-to-object relationships.
Two objects are similar if they are related to similar objects.
![Page 5: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/5.jpg)
Introduction
![Page 6: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/6.jpg)
Basic Graph Model
We model objects and relationships as a directed graph G=(V,E).
For a node v in the graph, we denote by I(v) and O(v) the sets of in-neighbors and out-neighbors of v, respectively.
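The graph model above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the helper name `build_neighbors` and the toy edge list are assumptions chosen for the example.

```python
from collections import defaultdict


def build_neighbors(edges):
    """Return (I, O): in-neighbor and out-neighbor sets for every node.

    `edges` is an iterable of (u, v) pairs, each meaning a directed edge u -> v.
    """
    in_nbrs = defaultdict(set)   # I(v)
    out_nbrs = defaultdict(set)  # O(v)
    for u, v in edges:
        out_nbrs[u].add(v)
        in_nbrs[v].add(u)
    return in_nbrs, out_nbrs


# Hypothetical toy graph, used only for illustration.
edges = [
    ("Univ", "ProfA"),
    ("Univ", "ProfB"),
    ("ProfA", "StudentA"),
    ("ProfB", "StudentB"),
    ("StudentB", "Univ"),
]
in_nbrs, out_nbrs = build_neighbors(edges)
print(sorted(out_nbrs["Univ"]))  # out-neighbors O(Univ)
print(sorted(in_nbrs["Univ"]))   # in-neighbors I(Univ)
```

Storing I(v) and O(v) as sets keeps neighbor lookups cheap, which matters because the SimRank equations sum over pairs of in-neighbors.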
![Page 7: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/7.jpg)
SimRank
Basic SimRank Equation
If a=b then s(a,b) is defined to be 1. Otherwise,

$$ s(a,b) = \frac{C}{|I(a)|\,|I(b)|} \sum_{i=1}^{|I(a)|} \sum_{j=1}^{|I(b)|} s(I_i(a), I_j(b)) \qquad (1) $$

where C is a constant between 0 and 1. Set s(a,b)=0 when $I(a) = \emptyset$ or $I(b) = \emptyset$.
![Page 8: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/8.jpg)
Bipartite SimRank
Two types of objects.
Example: shopping graph G.
![Page 9: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/9.jpg)
![Page 10: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/10.jpg)
Let s(A,B) denote the similarity between persons A and B, for $A \neq B$:

$$ s(A,B) = \frac{C_1}{|O(A)|\,|O(B)|} \sum_{i=1}^{|O(A)|} \sum_{j=1}^{|O(B)|} s(O_i(A), O_j(B)) \qquad (2) $$

Let s(c,d) denote the similarity between items c and d, for $c \neq d$:

$$ s(c,d) = \frac{C_2}{|I(c)|\,|I(d)|} \sum_{i=1}^{|I(c)|} \sum_{j=1}^{|I(d)|} s(I_i(c), I_j(d)) \qquad (3) $$
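Equations (2) and (3) refer to each other, so one way to evaluate them is to iterate both together. The sketch below is an illustration under stated assumptions, not the authors' implementation: the function name `bipartite_simrank`, the `purchases` mapping (person to set of items), the default constants, and the iteration count K are all hypothetical.

```python
def bipartite_simrank(purchases, C1=0.8, C2=0.8, K=10):
    """Iterate equations (2) and (3) K times on a person -> items mapping."""
    persons = sorted(purchases)
    items = sorted({c for bought in purchases.values() for c in bought})
    # I(c): the persons who purchased item c. O(A) is purchases[A].
    buyers = {c: {p for p in persons if c in purchases[p]} for c in items}

    # Start from 1 on the diagonal and 0 everywhere else.
    s_person = {(A, B): float(A == B) for A in persons for B in persons}
    s_item = {(c, d): float(c == d) for c in items for d in items}

    def average(scores, xs, ys):
        """Mean score over all pairs (x, y); 0 if either side is empty."""
        if not xs or not ys:
            return 0.0
        return sum(scores[(x, y)] for x in xs for y in ys) / (len(xs) * len(ys))

    for _ in range(K):
        # Both updates read the scores of the previous iteration.
        next_person = {
            (A, B): 1.0 if A == B else C1 * average(s_item, purchases[A], purchases[B])
            for A in persons for B in persons
        }
        next_item = {
            (c, d): 1.0 if c == d else C2 * average(s_person, buyers[c], buyers[d])
            for c in items for d in items
        }
        s_person, s_item = next_person, next_item
    return s_person, s_item


# Hypothetical shopping data, used only for illustration.
purchases = {
    "A": {"sugar", "frosting", "eggs"},
    "B": {"frosting", "eggs", "flour"},
}
s_person, s_item = bipartite_simrank(purchases)
print(round(s_person[("A", "B")], 3))
print(round(s_item[("sugar", "flour")], 3))
```

Person scores are computed from item scores and item scores from person scores, which mirrors the mutual reinforcement in the two equations.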
![Page 11: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/11.jpg)
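Computing SimRank - Naive Method
$R_0(a,b)$ is a lower bound on the SimRank score $s(a,b)$:

$$ R_0(a,b) = \begin{cases} 0 & \text{if } a \neq b \\ 1 & \text{if } a = b \end{cases} $$

To compute $R_{k+1}(a,b)$ from $R_k(*,*)$:

$$ R_{k+1}(a,b) = \frac{C}{|I(a)|\,|I(b)|} \sum_{i=1}^{|I(a)|} \sum_{j=1}^{|I(b)|} R_k(I_i(a), I_j(b)) \qquad (4) $$

for $a \neq b$, and $R_{k+1}(a,b) = 1$ for $a = b$.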
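The naive iteration can be sketched as follows. This is a minimal illustration assuming the graph is given as a dictionary of in-neighbor sets; the function name `simrank_naive`, the default values of C and K, and the toy graph are assumptions made for the example.

```python
def simrank_naive(in_nbrs, nodes, C=0.8, K=5):
    """Run K rounds of the naive SimRank iteration, equation (4).

    in_nbrs maps a node to the set I(node); nodes without an entry
    are treated as having no in-neighbors.
    """
    # R_0(a, b) = 1 if a == b, else 0.
    R = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(K):
        R_next = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    R_next[(a, b)] = 1.0
                    continue
                Ia, Ib = in_nbrs.get(a, ()), in_nbrs.get(b, ())
                if not Ia or not Ib:
                    # The score is 0 when either node has no in-neighbors.
                    R_next[(a, b)] = 0.0
                    continue
                total = sum(R[(i, j)] for i in Ia for j in Ib)
                R_next[(a, b)] = C * total / (len(Ia) * len(Ib))
        R = R_next
    return R


# Hypothetical toy graph: Univ -> ProfA, Univ -> ProfB,
# ProfA -> StudentA, ProfB -> StudentB.
nodes = ["Univ", "ProfA", "ProfB", "StudentA", "StudentB"]
in_nbrs = {
    "ProfA": {"Univ"},
    "ProfB": {"Univ"},
    "StudentA": {"ProfA"},
    "StudentB": {"ProfB"},
}
R = simrank_naive(in_nbrs, nodes)
print(R[("ProfA", "ProfB")], R[("StudentA", "StudentB")])
```

Each round builds a fresh table from the previous one, so every $R_{k+1}$ value depends only on $R_k$ values, as in equation (4).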
![Page 12: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/12.jpg)
The space required is simply $O(n^2)$ to store the results $R_k$.
The time required is $O(Kn^2 d_2)$.
K: the number of iterations.
$d_2$: the average of |I(a)||I(b)| over all node pairs (a,b).
![Page 13: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/13.jpg)
Computing SimRank - Pruning
Set the similarity between two nodes far apart to be 0.
Consider node-pairs only for nodes which are near each other.
![Page 14: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/14.jpg)
Consider only node-pairs within radius r of each other. If each node has on average $d_r$ such neighbors, then there will be $n d_r$ node-pairs.
The time and space complexities become $O(K n d_r d_2)$ and $O(n d_r)$ respectively.
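A rough sketch of the pruning idea is given below. It assumes that "radius" means hop distance in the underlying undirected graph, which the slides do not spell out; the function names, defaults, and toy edge list are likewise hypothetical. Pairs outside the radius are never stored and count as 0.

```python
from collections import deque


def neighbors_within(adj, source, r):
    """Nodes at undirected hop distance 1..r from source (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == r:
            continue  # do not expand past the radius
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    del dist[source]
    return set(dist)


def simrank_pruned(edges, C=0.8, K=5, r=2):
    """SimRank iteration restricted to node-pairs within radius r."""
    in_nbrs, adj, nodes = {}, {}, set()
    for u, v in edges:
        in_nbrs.setdefault(v, set()).add(u)
        adj.setdefault(u, set()).add(v)  # undirected adjacency, assumed
        adj.setdefault(v, set()).add(u)  # to define the radius
        nodes.update((u, v))

    pairs = {(a, b) for a in nodes for b in neighbors_within(adj, a, r)}

    def score(R, a, b):
        # Pairs that are not stored (pruned or not yet computed) count as 0.
        return 1.0 if a == b else R.get((a, b), 0.0)

    R = {}
    for _ in range(K):
        R_next = {}
        for a, b in pairs:
            Ia, Ib = in_nbrs.get(a, ()), in_nbrs.get(b, ())
            if Ia and Ib:
                total = sum(score(R, i, j) for i in Ia for j in Ib)
                R_next[(a, b)] = C * total / (len(Ia) * len(Ib))
        R = R_next
    return R, pairs


# Hypothetical toy graph, used only for illustration.
edges = [("Univ", "ProfA"), ("Univ", "ProfB"),
         ("ProfA", "StudentA"), ("ProfB", "StudentB")]
R, pairs = simrank_pruned(edges, r=2)
print(R[("ProfA", "ProfB")])
print(("StudentA", "StudentB") in pairs)  # four hops apart, so pruned at r=2
```

Only the selected pairs are stored and updated, which is where the reduction from $n^2$ to roughly $n d_r$ pairs comes from.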
![Page 15: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/15.jpg)
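Random Surfer-Pair Model
Expected Distance
Let H be any strongly connected graph.
Let u,v be any two nodes in H. We define the expected distance d(u,v) from u to v as

$$ d(u,v) = \sum_{t:\, u \rightsquigarrow v} P[t]\, l(t) \qquad (5) $$

The sum runs over all tours t that start at u and end at v without touching v before the end; P[t] is the probability of traveling t and l(t) is its length.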
![Page 16: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/16.jpg)
Expected Meeting Distance (EMD).

$$ m(a,b) = \sum_{t:\, (a,b) \rightsquigarrow (x,x)} P[t]\, l(t) \qquad (6) $$
![Page 17: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/17.jpg)
Expected-f Meeting Distance
To circumvent the “infinite EMD” problem.
To map all distances to a finite interval.
Exponential function $f(z) = c^z$, where $c \in (0,1)$ is a constant.

$$ s(a,b) = \sum_{t:\, (a,b) \rightsquigarrow (x,x)} P[t]\, c^{l(t)} \qquad (7) $$
![Page 18: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/18.jpg)
Equivalence to SimRank
![Page 19: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/19.jpg)
Theorem. The SimRank score, with parameter C, between two nodes is their expected-f meeting distance traveling back-edges, for $f(z) = C^z$.
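The theorem suggests a simple simulation check: let two surfers walk backwards along edges in lockstep and average $C^{l}$ over the runs in which they meet, where l is the number of steps taken until the first meeting. The sketch below is an illustration only; the function name, the trial count, the step cap `max_steps`, and the toy graph are assumptions. Runs in which a surfer reaches a node with no in-neighbors, or which exceed the cap, contribute 0.

```python
import random


def estimate_simrank(in_nbrs, a, b, C=0.8, trials=20000, max_steps=50, seed=0):
    """Monte Carlo estimate of the expected-f meeting distance, f(z) = C**z.

    in_nbrs maps a node to a list of its in-neighbors (back-edges).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        x, y, steps = a, b, 0
        while x != y and steps < max_steps:
            Ix, Iy = in_nbrs.get(x), in_nbrs.get(y)
            if not Ix or not Iy:
                break  # a surfer is stuck, so this run never meets
            # Both surfers step backwards at the same time, independently.
            x = rng.choice(Ix)
            y = rng.choice(Iy)
            steps += 1
        if x == y:
            total += C ** steps
    return total / trials


# Hypothetical toy graph: items c and d were both reached from p and q.
in_nbrs = {"c": ["p", "q"], "d": ["p", "q"]}
print(estimate_simrank(in_nbrs, "c", "d"))
```

In this toy graph, equation (1) gives s(c,d) = 0.8 · (1 + 0 + 0 + 1) / 4 = 0.4, because p and q have no in-neighbors, so the estimate should land near that value.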
![Page 20: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/20.jpg)
Future Work
Divide, conquer, and merge: divide a corpus into chunks…
Ternary (or more) relationships.
![Page 21: SimRank : A Measure of Structural- Context Similarity Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Glen Jeh Jennifer Widom](https://reader036.vdocument.in/reader036/viewer/2022062304/56649f3f5503460f94c60407/html5/thumbnails/21.jpg)
Personal Opinion
We believe that the intuition behind SimRank can be applied in many domains that are based on object-to-object relationships.