Efficient Evaluation of k-Range Nearest Neighbor Queries in Road Networks
Jie Bao, Chi-Yin Chow, Mohamed F. Mokbel
Department of Computer Science and Engineering
University of Minnesota – Twin Cities
Wei-Shinn Ku
Department of Computer Science and Software Engineering
Auburn University
What Are Range NN Queries?
• k-Range NN queries in Euclidean space
– Given a spatial region, find the k nearest objects to every point within the region
– E.g., find the nearest hotel to a shopping mall
• k-Range NN queries in road networks
– Given a set of road segments, find the k nearest objects to every point on the road segments
Uses of Range NN Queries
• Uncertain locations
– Measurement imprecision: due to the limitations of the underlying positioning techniques, e.g., 2G/3G and Wi-Fi
– Sampling imprecision: due to continuous motion, network delays, and location update frequency
• Privacy-preserving queries
– Users do not want to reveal their exact location information to service providers
– Their locations are blurred into spatial areas
[Figures: iPhone's 3G positioning; 5-anonymous area]
Related Work for k-RNN Queries
• k-Nearest Neighbor in road networks
– Query processing without pre-computed information: Incremental Network Expansion (INE), a best-first expansion over the road network [Papadias et al., VLDB 2003]
– Query processing with pre-computed information: use extra pre-computed quad-tree indexes to calculate the distances [Samet et al., SIGMOD 2008]
• k-Range Nearest Neighbor in Euclidean space
– Pre-computed Voronoi diagrams [Chow et al., SSTD 2009]
• k-Range Nearest Neighbor in road networks
– Range query + INE for every boundary node [Wang and Liu, PVLDB 2009]
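As a rough illustration of the best-first expansion that INE is described as performing, the sketch below runs a Dijkstra-style expansion from a query node and stops once k objects have been found. It is a simplified sketch, not the algorithm as published in [Papadias et al., VLDB 2003]: the adjacency-list layout, the function name `ine_knn`, and the assumption that objects sit on nodes rather than along edges are all illustrative choices.

```python
import heapq

def ine_knn(graph, objects_at, source, k):
    """Best-first (Dijkstra-style) expansion from `source`.

    graph: dict node -> list of (neighbor, edge_length)   (assumed layout)
    objects_at: dict node -> list of object ids at that node (assumption)
    Returns up to k (distance, object_id) pairs in nondecreasing distance order.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    settled = set()
    found = []
    while heap and len(found) < k:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue  # stale heap entry, node already expanded
        settled.add(u)
        # Nodes are settled in nondecreasing distance, so objects arrive in order.
        for obj in objects_at.get(u, []):
            found.append((d, obj))
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return found[:k]
```

The baseline on this slide would issue one such search per boundary node of the query region, which is where the repeated work comes from.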
Motivating Example
• Computational redundancy in the existing solution
– Range query + multiple kNN queries [Wang and Liu, PVLDB 2009]
– Total number of road segments searched: 3 + 2 + 5 + 6 = 17
– Total number of road segments in the map: 6
– Redundancy ratio: (17 - 6) / 6 = 183% (worse with more boundary points)
• Can we provide the results without the computational redundancy?
[Figure panels: Range Search; k-NN for D; k-NN for B; k-NN for F]
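The redundancy ratio on this slide can be written out as follows. The symbols S_searched (segments searched by the baseline) and S_map (segments in the map) are names introduced here, not taken from the slides. The computation uses the stated total of 17; the per-search counts as transcribed, 3 + 2 + 5 + 6, sum to 16, so one of the transcribed figures may be off.

```latex
\[
\text{redundancy ratio}
  = \frac{S_{\text{searched}} - S_{\text{map}}}{S_{\text{map}}}
  = \frac{17 - 6}{6} \approx 1.83 = 183\%
\]
```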
Problem Definition
• Given:
– An undirected graph G = (V, E) representing the road network
– A set of objects O
– A query region R (a set of road segments)
– A value K
• Find:
– An answer set A from O such that A contains the K nearest objects of every point in R, based on the network distance in G
• Objective:
– Provide A without computational redundancy
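One way to state the answer set formally, using notation introduced here rather than taken from the slides: let d_G(p, o) be the network distance in G between a point p and an object o, and let kNN_G(p) be the K objects of O nearest to p under d_G.

```latex
\[
A \;\supseteq\; \bigcup_{p \in R} \mathrm{kNN}_G(p),
\qquad
\mathrm{kNN}_G(p) = \text{the } K \text{ objects } o \in O
  \text{ with the smallest } d_G(p, o)
\]
```

The smallest set satisfying this is the union itself. Ties in distance would need a tie-breaking rule, which the slides do not specify.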
Efficient k-RNN Query Processing
• Step 1: Inside query step
• Step 2: Outside network expansion step
– Multiple searching queues
– Stop after the closest node is searched
– Switch to the queue with the smallest searched distance
– Termination condition: a queue stops once its searched distance covers the distance of its kth object
Example: 2-RNN
– 1st iteration: search from A; answer set: P1, P2
– 2nd iteration: search from B; answer set: P1, P2
– 3rd iteration: search from C; answer set: P1, P2
– 4th iteration: search from C; answer set: P1, P2, P3
– 5th iteration: search from B; answer set: P1, P2, P3
[Figure: road segment set (range) with nodes A, B, C and objects P1, P2, P3]
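The two steps can be sketched roughly as below. This is a simplified illustration under several assumptions that are not in the slides: objects sit on nodes, the region is given as a set of nodes plus its boundary nodes, and the names `k_rnn`, `region_nodes`, and `boundary_nodes` are hypothetical. It keeps one queue per boundary node, always advances the queue with the smallest searched distance, and closes a queue once it has reached its kth object. It does not reproduce whatever sharing between queues the authors use to remove redundant expansion.

```python
import heapq

def k_rnn(graph, objects_at, region_nodes, boundary_nodes, k):
    """Simplified k-RNN sketch with one search queue per boundary node.

    graph: dict node -> list of (neighbor, edge_length)        (assumed layout)
    objects_at: dict node -> list of object ids at that node   (assumption)
    region_nodes: nodes covered by the query region R          (assumption)
    boundary_nodes: nodes of R where the outside expansion starts
    """
    # Step 1 (inside query step): every object inside R belongs to the answer.
    answer = {o for n in region_nodes for o in objects_at.get(n, [])}

    # Step 2 (outside network expansion step): one queue per boundary node.
    queues = {b: [(0.0, b)] for b in boundary_nodes}
    settled = {b: set() for b in boundary_nodes}
    found = {b: 0 for b in boundary_nodes}  # objects reached so far per queue

    while queues:
        # Switch to the queue whose next node has the smallest searched distance.
        b = min(queues, key=lambda q: queues[q][0][0])
        d, u = heapq.heappop(queues[b])
        if u not in settled[b]:
            settled[b].add(u)
            for obj in objects_at.get(u, []):
                answer.add(obj)
                found[b] += 1
            for v, w in graph.get(u, []):
                if v not in settled[b]:
                    heapq.heappush(queues[b], (d + w, v))
        # Close the queue once it has reached its kth object or is exhausted.
        if found[b] >= k or not queues[b]:
            del queues[b]
    return answer
```

Under these assumptions the result is the set of objects inside R together with the k nearest objects of each boundary node, which is the same combination the baseline on the earlier slide computes with a range query plus one kNN search per boundary node.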
Distance Calculation
• Case 1: By a pre-computed shortest path table
– Fast but more storage
• Case 2: Calculation on the fly
– Keep the distance information as the searching expands
• Tradeoff between storage and speed
Distance table as the search expands, in successive states (- = no value yet):
      A  B  E
A     0  1  2
B     1  0  3
E     2  3  0
C     -  -  -
D     -  -  -
P1    -  -  -
P2    -  -  -

      A  B  E
A     0  1  2
B     1  0  3
E     2  3  0
C     3  4  5
D     -  -  -
P1    -  -  -
P2    -  -  -

      A  B  E
A     0  1  2
B     1  0  3
E     2  3  0
C     3  2  5
D     -  -  -
P1    2  1  4
P2    -  -  -
Search collision!

      A  B  E
A     0  1  2
B     1  0  3
E     2  3  0
C     3  2  5
D     5  4  6
P1    2  1  4
P2    4  3  5
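The two cases can be combined in a single lookup structure, sketched below under assumptions of my own: the class name `DistanceTable`, its methods, and the adjacency-list layout are hypothetical, and the authors' actual data structure is not shown in the slides. A lookup is answered from the stored table when the entry exists (Case 1); otherwise a Dijkstra expansion runs from the source and records every distance it settles (Case 2).

```python
import heapq

class DistanceTable:
    """Shortest-distance lookups backed by a table that fills in as searches run."""

    def __init__(self, graph, precomputed=None):
        self.graph = graph                    # dict node -> list of (neighbor, length)
        self.table = dict(precomputed or {})  # (source, target) -> network distance
        self.computed_sources = set()         # sources already fully expanded

    def distance(self, source, target):
        key = (source, target)
        if key in self.table:                 # Case 1: answer from the stored table
            return self.table[key]
        if source not in self.computed_sources:
            self._expand(source)              # Case 2: compute on the fly
        return self.table.get(key, float("inf"))

    def _expand(self, source):
        heap, settled = [(0.0, source)], set()
        while heap:
            d, u = heapq.heappop(heap)
            if u in settled:
                continue
            settled.add(u)
            self.table[(source, u)] = d       # keep the distance as the search expands
            for v, w in self.graph.get(u, []):
                if v not in settled:
                    heapq.heappush(heap, (d + w, v))
        self.computed_sources.add(source)
```

Passing a larger `precomputed` dictionary trades memory for fewer on-the-fly expansions, which is the storage versus speed tradeoff named on the slide.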
Experimental Results
Parameter settings
Parameter                                    Default value   Range
K value                                      10              1 to 20
Number of objects                            600             200 to 1000
Query region size (ratio over total space)   0.018           0.002 to 0.050
• Evaluate our algorithm without pre-computed results (KRNN-E) and with pre-computed results (KRNN-F)
• Baseline algorithm: [Wang and Liu, PVLDB 2009]
• Road network (Hennepin County, Minnesota, US)
– 39,513 nodes and 54,444 road segments
Comparison with Baseline (1/2)
a) Impact of different k values
b) Impact of different total objects on the map
c) Impact of different query region size
Comparison with Baseline (2/2)
• Impact of different distributions of the data objects
– Uniform distribution
– Normal distribution
• SD is the standard deviation, used to simulate hot-spot locations such as downtown areas
[Figure: Different POI distributions. x-axis: Uniform, SD=1, SD=0.1, SD=0.01, SD=0.001; y-axis: query processing time (s), 0 to 80000; series: Baseline, KRNN-F, KRNN-E]
Tradeoff between Storage and Performance
• Tuning parameter P
– The percentage of the full shortest distance table that is stored
– Warm-up process with 1000 k-RNN queries
– Full size of the table is 980 MB
Conclusion
• An efficient algorithm for k-Range Nearest Neighbor (k-RNN) queries in road networks without computational redundancy
• Experimental evaluation
– Our solution outperforms the baseline algorithm
– Tuning parameter P achieves a tradeoff between storage and performance
• Applications
– Privacy-preserving applications
– Uncertain locations
Q&A