Neighborhood-Privacy Protected Shortest Distance Computing in Cloud
Jun Gao, Jeffrey Yu Xu, Ruoming Jin,
Jiashuai Zhou, Tengjiao Wang, Dongqing Yang
14 Jun 2011, Greece, SIGMOD 2011
Outline
Motivation
Related work
Our solution
• 1-neighborhood-d-radius graph
• Graph transformation with exact answer
• Graph transformation with approximate answer
Experiment
Conclusion & Future work
Graph data management in cloud
[Figure: coauthor network, from manyeyes.alphaworks.ibm.com]
Graph data applications
• Social network, knowledge network...
Time consuming graph operations
• Shortest distance computation takes O(n^2)
• The breadth-first-search requires O(n+m)
• ......
Cloud Computing
Advantage of cloud computing
• High computational power
• Easy maintenance
• Easy re-provisioning of resources
• ……
Can we use a cloud server to manage graph data, for example to answer shortest-distance queries?
Security issues in graph outsourcing
Attacks on outsourced graph
• Structural Pattern Attack
- Use sub-graph to re-identify the target part
• Reconstruction Attack
- Recover the original graph from outsourced one.
Security leakage
• Regulations on sensitive data are violated
• Untrusted answers produced by cloud server
We have to strike a balance between security and the computational cost saved by using the cloud server
Framework of graph outsourcing
[Figure: outsourcing framework. Client side: original graph, graph transformation, link graph, query, query rewriting, result combination, results. Cloud server: outsourced graph, query evaluation. The labels (1), (2), (3) in the figure mark the three requirements below.]
(1) A reasonable security model for the outsourced graph
(2) An efficient method to transform the original graph into the outsourced graph
(3) An approach to rewrite the query and combine the results
Structural Anonymization
• Structural anonymization in publishing
- 1-neighborhood [icde 08], k-degree [sigmod08], k-automorphism [vldb 09], k-isomorphism [sigmod10], etc
- Using the least amount of modifications of the original graph
[Figure: three panels: original graph, 4-isomorphism graph, and attacker's query. The attacker's query finds 4 sub-graphs.]
No shortest distance preservation
No consideration of edge weight
Feature preservation graph transformation
Eigenvalue preservation [sdm 08]
• Random add/remove/switch edges
• Theoretically prove that the eigenvalue can be preserved.
Shortest path preservation [icde 10]
• Express the shortest path preservation by inequality rules
• Use linear programming to find a solution to such rules
• Requires O(dn^2) rules to preserve all shortest paths
No support for exact distance computing
No explicit security guarantee
Shortest distance index
Multiple-level index [tkde98]
• Select nodes to build a higher level graph
• Exploit the shortest paths at a higher level graph to guide the path searching at a lower level
Landmark index [cikm 09, jacm 09]
• Select landmark nodes and build the shortest path
• Exploit the triangle inequality rules to estimate the distance
2-HOP index [soda 02]
• Annotate incoming and outgoing labels on each node
• Compute the distance between two nodes from the intersection of their labels
No security consideration
1-Neighborhood-d-Radius Graph
Intuition
• Protect the neighborhood information and the close relationship between nodes.
Privacy protection
• Any query pattern returns no meaningful results
(1-neighborhood): for any node pair u and v ∈ Vo, (u, v) ∉ E
(d-radius): for any node pair u and v ∈ Vo, δG(u, v) ≥ d.
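The two conditions can be checked mechanically. The following is a minimal sketch, assuming an undirected weighted graph stored as a dictionary of adjacency dictionaries; the function names `dijkstra` and `is_1_neighborhood_d_radius` and the toy graph are illustrative and do not come from the slides.

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source; graph is {node: {neighbor: weight}}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d_u, u = heapq.heappop(heap)
        if d_u > dist[u]:
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d_u + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def is_1_neighborhood_d_radius(graph, outsourced_nodes, d):
    """Check both conditions for every pair of outsourced nodes.

    Assumes an undirected graph, so each pair is checked once.
    """
    nodes = sorted(outsourced_nodes)
    for i, u in enumerate(nodes):
        dist = dijkstra(graph, u)
        for v in nodes[i + 1:]:
            if v in graph.get(u, {}):
                return False  # 1-neighborhood violated: (u, v) is an edge
            if dist.get(v, float("inf")) < d:
                return False  # d-radius violated: u and v are too close
    return True

# Hypothetical path graph a - b - c - d with weights 1, 2, 3.
toy = {
    "a": {"b": 1},
    "b": {"a": 1, "c": 2},
    "c": {"b": 2, "d": 3},
    "d": {"c": 3},
}
print(is_1_neighborhood_d_radius(toy, {"a", "d"}, 5))
print(is_1_neighborhood_d_radius(toy, {"a", "c"}, 4))
```

In the toy graph, a and d are 6 apart and not adjacent, so they pass for d = 5, while a and c are only 3 apart and fail for d = 4.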
[Figure: three panels: original graph, attacker's query, and 2-radius graph.]
1-Neighborhood-d-Radius Graph too strong?
Can we hide only the neighbors and the relationships with distance less than d, and add direct edges among the other node pairs?
No: using the triangle inequality, an attacker can find the “hidden” edges
• Reconstruction Attack
[Figure: two panels: original graph and non-2-radius graph.]
Utilization: Shortest Distance Computation
[Figure: original graph, link graph, and outsourced graph, with a path from u to v.]
Given a node pair u and v, the shortest distance can be computed as
$\delta_G(u, v) = \min_{x, y} \left( w(u, x) + \delta_{G_o}(x, y) + w(y, v) \right)$
Here w denotes the weight of a link-graph edge and $\delta_{G_o}$ the shortest distance in the outsourced graph $G_o$.
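The client-side combination step can be sketched as follows, assuming the link graph maps each original node to the outsourced nodes it is linked to (an outsourced node would link to itself with weight 0) and that distances inside the outsourced graph are returned by the cloud server. Here that server call is simulated locally with Dijkstra's algorithm; all names and the example data are hypothetical.

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source; graph is {node: {neighbor: weight}}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d_u, u = heapq.heappop(heap)
        if d_u > dist[u]:
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d_u + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def shortest_distance(link_graph, outsourced_graph, u, v):
    """Minimum over linked pairs (x, y) of w(u, x) + dist_Go(x, y) + w(y, v)."""
    best = float("inf")
    v_links = link_graph.get(v, {})
    for x, w_ux in link_graph.get(u, {}).items():
        # Stand-in for the query evaluated on the cloud server.
        dist_from_x = dijkstra(outsourced_graph, x)
        for y, w_yv in v_links.items():
            d_xy = dist_from_x.get(y, float("inf"))
            best = min(best, w_ux + d_xy + w_yv)
    return best

# Hypothetical link graph (client side) and outsourced graph (cloud side).
link = {"u": {"b": 2, "f": 5}, "v": {"f": 1, "h": 4}}
outsourced = {"b": {"f": 3}, "f": {"b": 3, "h": 6}, "h": {"f": 6}}
print(shortest_distance(link, outsourced, "u", "v"))
```

With this data the best combination goes u to b (2), b to f (3), f to v (1), giving 6. The client only needs the link graph and the distances returned for outsourced node pairs.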
Graph Transformation Problem
Given a graph G = (V, E) and d, the graph transformation produces outsourced graphs Go = {G1, ..., Gj} and a local link graph Gl, which achieve the following objectives:
• Security
- Each outsourced graph is a 1-neighborhood-d-radius graph;
• Utility
- The union of Go and Gl can answer the shortest distance in the original graph;
• Local computational cost
- The space cost of Gl and the cost of the shortest distance computation on the client side are minimized.
Naive Method
Steps
• Enumerate different forms of the candidate solutions
- One local link graph and outsourced graphs.
• Find the one with the minimal space cost of local graph.
Searching space
• The nodes in an outsourced graph are a subset of those in the original graph, so there can be O(2^n) different outsourced graphs
• The brute-force strategy leads to exponential time cost
Greedy Method
Basic idea
• Generate more “expressive” outsourced graphs that can answer more shortest-path queries.
- Edges in the link graph can be reused, so the space cost of the link graph is reduced
Challenges
• How to find “expressive” outsourced nodes?
• How to build a d-radius graph from the selected nodes?
Steps
1. Enumerate all shortest paths, find possible candidate outsourced nodes, and assign a benefit to each node
2. Generate outsourced graphs according to node benefit
Step 1: Enumerate shortest path and benefit assignment
Candidate outsourced node pair
• A node pair (x, y) can be used to answer the shortest distance between (u, v)
• (x, y) must satisfy the d-radius constraint
• x is close to u, y is close to v
Benefit function
• Record how often a node (or node pair) appears as an outsourcing candidate
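One possible reading of this step is sketched below. It assumes each shortest path is given as a node list with its edge weights, interprets "close" as "within `max_hops` hops of the endpoint" (a hypothetical parameter), and only checks non-adjacency along the path itself. The function names and the two sample paths are illustrative.

```python
from collections import Counter

def candidate_pairs(path, weights, d, max_hops=1):
    """Candidate (x, y) pairs on one shortest path from u to v.

    path: nodes from u to v; weights[i] is the weight of edge path[i]-path[i+1].
    Because the path is a shortest path, the distance along it between two of
    its nodes equals their shortest distance, so the d-radius test is exact.
    """
    prefix = [0]
    for w in weights:
        prefix.append(prefix[-1] + w)
    n = len(path)
    pairs = []
    for i in range(0, min(max_hops, n - 1) + 1):            # x near u
        for j in range(max(n - 1 - max_hops, i + 1), n):     # y near v
            if j - i >= 2 and prefix[j] - prefix[i] >= d:
                pairs.append((path[i], path[j]))
    return pairs

def assign_benefit(paths, d, max_hops=1):
    """Count how often each node and node pair is a candidate."""
    node_benefit, pair_benefit = Counter(), Counter()
    for path, weights in paths:
        for x, y in candidate_pairs(path, weights, d, max_hops):
            pair_benefit[(x, y)] += 1
            node_benefit[x] += 1
            node_benefit[y] += 1
    return node_benefit, pair_benefit

# Two hypothetical sampled shortest paths.
paths = [
    (["a", "b", "c", "f", "g"], [1, 1, 2, 3]),
    (["d", "b", "c", "f", "e"], [2, 1, 2, 7]),
]
node_benefit, pair_benefit = assign_benefit(paths, d=3)
print(pair_benefit.most_common(1))
```

In this example the pair (b, f) is a candidate on both paths, so it receives the highest pair benefit, which illustrates why one outsourced pair can serve several queries.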
[Figure: (a) sampling shortest paths; (b) outsourced node pairs.]
Step 2: Generate one outsourced graph
Node selection
• Select the node with the next-highest benefit that is not yet in any cluster
• Build a d-radius cluster for the selected node
Edge building
• The edge weight is the shortest distance between cluster centers
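The selection loop can be sketched as below, under several assumptions that are not stated on the slide: the graph is undirected, a cluster consists of all nodes closer than d to the selected center plus its direct neighbors, and every pair of centers is connected by an edge. The names `build_outsourced_graph` and `benefit` and the sample data are hypothetical.

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source; graph is {node: {neighbor: weight}}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d_u, u = heapq.heappop(heap)
        if d_u > dist[u]:
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d_u + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def build_outsourced_graph(graph, benefit, d):
    """Greedily pick cluster centers by benefit and connect them."""
    clustered, centers, dist_from = set(), [], {}
    # Highest benefit first; ties broken by node name for determinism.
    for node in sorted(benefit, key=lambda n: (-benefit[n], n)):
        if node in clustered:
            continue  # already covered by an earlier center's cluster
        dist = dijkstra(graph, node)
        cluster = {v for v, dv in dist.items() if dv < d}
        cluster |= set(graph.get(node, {}))  # also exclude direct neighbors
        clustered |= cluster
        centers.append(node)
        dist_from[node] = dist
    outsourced = {c: {} for c in centers}
    for i, x in enumerate(centers):
        for y in centers[i + 1:]:
            if y in dist_from[x]:
                # Edge weight is the shortest distance between the centers.
                outsourced[x][y] = dist_from[x][y]
                outsourced[y][x] = dist_from[x][y]
    return outsourced

# Hypothetical path graph a - b - c - f - g with weights 1, 1, 2, 3.
graph = {
    "a": {"b": 1},
    "b": {"a": 1, "c": 1},
    "c": {"b": 1, "f": 2},
    "f": {"c": 2, "g": 3},
    "g": {"f": 3},
}
benefit = {"b": 4, "f": 4, "a": 2, "g": 2}
print(build_outsourced_graph(graph, benefit, d=3))
```

Because a later center is never inside an earlier cluster, any two centers are at least d apart and not adjacent, so the result satisfies the 1-neighborhood-d-radius conditions by construction. Here the centers are b and f, joined by an edge of weight 3.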
[Figure: (a) outsourced node pairs; (b) d-radius cluster; (c) outsourced graphs (part).]
Graph transformation with approximate answer
Graph transformation with exact answers requires, at a minimum, enumerating all shortest paths.
Approximate distances are acceptable in many domains.
The quality of approximate distances can be measured by the average additive error over a query set Q:
$\beta = \frac{1}{|Q|} \sum_{q \in Q} \left( \mathrm{len}(p_q) - \delta_G(u, v) \right)$
where p_q is the path returned for the query q = (u, v).
Basic idea
• Can we transform the graph to achieve α = 1 and a given average additive error β?
Main steps
• Construct the outsourced graph in a relaxed way
• Estimate the average additive error
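The average additive error can be measured on a sample of queries. The sketch below assumes the exact distance is available locally through Dijkstra's algorithm and that the approximate answer comes from some callable `approx_distance`; both the callable and the sample data are hypothetical stand-ins.

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source; graph is {node: {neighbor: weight}}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d_u, u = heapq.heappop(heap)
        if d_u > dist[u]:
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d_u + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def average_additive_error(graph, approx_distance, queries):
    """Mean of (approximate distance - exact distance) over the queries."""
    total = 0.0
    for u, v in queries:
        exact = dijkstra(graph, u)[v]
        total += approx_distance(u, v) - exact
    return total / len(queries)

# Hypothetical path graph a - b - c - f - g with weights 1, 1, 2, 3.
graph = {
    "a": {"b": 1},
    "b": {"a": 1, "c": 1},
    "c": {"b": 1, "f": 2},
    "f": {"c": 2, "g": 3},
    "g": {"f": 3},
}
# Hypothetical approximate answers: one query overestimates by 1.
answers = {("a", "g"): 8, ("b", "f"): 3}
beta = average_additive_error(graph, lambda u, v: answers[(u, v)], list(answers))
print(beta)
```

The exact distances are 7 and 3, so the errors are 1 and 0 and the average is 0.5.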
Relaxed outsourced graph construction
Select outsourced nodes randomly.
Relax edge weight assignment
• Build k shortest path trees
• In each tree, link each outsourced node to its lowest outsourced ancestor with an edge.
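A minimal sketch of the relaxed edge construction follows. It assumes "lowest ancestor" means the nearest ancestor in the tree that is itself outsourced, that the tree roots are supplied by the caller, and that the graph is undirected; the function names are hypothetical.

```python
import heapq

def shortest_path_tree(graph, root):
    """Dijkstra from root, returning distances and tree parents."""
    dist, parent = {root: 0}, {root: None}
    heap = [(0, root)]
    while heap:
        d_u, u = heapq.heappop(heap)
        if d_u > dist[u]:
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d_u + w
            if nd < dist.get(v, float("inf")):
                dist[v], parent[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return dist, parent

def relaxed_outsourced_graph(graph, outsourced_nodes, roots):
    """Link each outsourced node to its nearest outsourced tree ancestor."""
    outsourced = {x: {} for x in outsourced_nodes}
    for root in roots:
        dist, parent = shortest_path_tree(graph, root)
        for x in outsourced_nodes:
            if x not in dist:
                continue  # x is unreachable from this root
            anc = parent[x]
            while anc is not None and anc not in outsourced_nodes:
                anc = parent[anc]
            if anc is None:
                continue  # no outsourced ancestor in this tree
            # Tree paths from the root are shortest paths, so this
            # difference is the distance between anc and x along the tree.
            w = dist[x] - dist[anc]
            if w < outsourced[x].get(anc, float("inf")):
                outsourced[x][anc] = w
                outsourced[anc][x] = w
    return outsourced

# Hypothetical path graph a - b - c - f - g with weights 1, 1, 2, 3.
graph = {
    "a": {"b": 1},
    "b": {"a": 1, "c": 1},
    "c": {"b": 1, "f": 2},
    "f": {"c": 2, "g": 3},
    "g": {"f": 3},
}
print(relaxed_outsourced_graph(graph, {"b", "f"}, roots=["a"]))
```

Only k trees are built instead of enumerating all shortest paths, which is where the saving comes from. Pairs of outsourced nodes that are never in an ancestor relation in any tree get no direct edge, so their distance in the outsourced graph may be overestimated.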
Estimation of average additive error
The error for a distance query (u, v) varies according to whether u and v have been outsourced.
β can be computed as follows:
$\beta = pct_{Q_0} \cdot l_{Q_0} + pct_{Q_1} \cdot l_{Q_1} + pct_{Q_2} \cdot l_{Q_2}$
• We estimate the percentage of each category with the random node selection assumption
• The average additive error can be estimated by sampling
[Figure: three panels: (a) Q0 query, (b) Q1 query, (c) Q2 query.]
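The weighted sum can be illustrated with a short sketch. It assumes that category Q_i contains the queries with exactly i outsourced endpoints (the figure is not recoverable, so this indexing is a guess) and that each node is outsourced independently with probability r, which is one way to read the random node selection assumption. The per-category errors would come from sampling; the values used here are made up.

```python
def category_percentages(r):
    """Share of queries with 0, 1, or 2 outsourced endpoints.

    Assumes each endpoint is outsourced independently with probability r.
    """
    return [(1 - r) ** 2, 2 * r * (1 - r), r ** 2]

def estimate_beta(r, avg_error_per_category):
    """Weighted sum of the per-category average additive errors."""
    pcts = category_percentages(r)
    return sum(p * l for p, l in zip(pcts, avg_error_per_category))

# Hypothetical sampled errors for Q0, Q1, Q2 when half the nodes are outsourced.
beta_estimate = estimate_beta(0.5, [4.0, 2.0, 0.0])
print(beta_estimate)
```

With r = 0.5 the three categories have shares 0.25, 0.5, and 0.25, so the estimate is 0.25 * 4 + 0.5 * 2 + 0.25 * 0 = 2.0.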
Heuristic outsourced node selection
Single outsourced graph
• Degree based construction
- First select the node with the higher degree
• Cluster size based construction
- First select the node with more nodes in its cluster
Multiple outsourced graphs
• Avoid outsourcing the same graph.
Experiment
Measures:
• transformation time cost
• space cost of link graph
• average additive error
• local overhead ratio = (time cost with cloud server) / (time cost without cloud server)
Competitor
• LP-based edge weight anonymization, ICDE 2010
Datasets:
Results related with exact answers
Scalability
• Better than LP based method
Impact of increasing d
• Strengthens the security of outsourced graphs
• Increases the transformation time cost and the space cost of the link graph
Results related with exact answers (cont.)
Benefit function
• Vertex pair based method works better
Local overhead ratio
• Very low
• Goes down with the increase of graph size
Results related with approximate answers
Scalability
• Supports large graphs
Impact of increasing the error bound
• Decreases the space cost and time cost of outsourcing
Results related with approximate answers (cont.)
Additive error bound
• Achieves the given additive error quite well
Local overhead ratio
• Declines with the increase of nodes
Conclusion & Future work
Conclusion:
• A 1-neighborhood-d-radius security model
• A greedy method to transform graph with exact answer
• A method to transform graph with approximate answer
• Extensive experimental results on real and synthetic data
Future work:
• More graph operations.
• Stronger security model
• Incremental graph outsourcing over dynamic graphs