Efficient Evaluation of k-Range Nearest Neighbor Queries in Road Networks
Jie Bao, Chi-Yin Chow, Mohamed F. Mokbel
Department of Computer Science and Engineering
University of Minnesota – Twin Cities
Wei-Shinn Ku
Department of Computer Science and Software Engineering
Auburn University

What Are Range NN Queries?
• k-Range NN Queries in Euclidean Space
– Given a spatial region, find the k nearest objects to every point within the region
– E.g., find the nearest hotel to a shopping mall
• k-Range NN Queries in Road Networks
– Given a set of road segments, find the k nearest objects to every point on the road segments

Uses of Range NN Queries
• Uncertain locations
– Measurement imprecision: due to the limitations of the underlying positioning techniques, e.g., 2G/3G and Wi-Fi
– Sampling imprecision: due to continuous motion, network delays, and location update frequency
• Privacy-preserving queries
– Users do not want to reveal their exact location information to service providers
– Their locations are blurred into spatial areas
Figures: iPhone's 3G positioning; 5-anonymous area

Related Work for k-RNN Queries
• k-Nearest Neighbor in Road Networks
– Query processing without pre-computed information: Incremental Network Expansion (INE), a best-first expansion over the road network [Papadias et al., VLDB 2003]
– Query processing with pre-computed information: extra pre-computed quad-tree indexes are used to calculate the distances [Samet et al., SIGMOD 2008]
• k-Range Nearest Neighbor in Euclidean Space
– Pre-computed Voronoi diagrams [Chow et al., SSTD 2009]
• k-Range Nearest Neighbor in Road Networks
– Range query + INE for every boundary node [Wang and Liu, PVLDB 2009]
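As a rough illustration of the best-first expansion idea behind INE, the sketch below runs a Dijkstra-style search from a query node and stops once k objects have been reached. It is a simplification, not the published algorithm: it assumes objects sit exactly on nodes (not along segments), and the function name `ine_knn`, the adjacency-list layout, and the sample data are all hypothetical.

```python
import heapq

def ine_knn(graph, objects_at, source, k):
    """Return up to k (distance, object) pairs nearest to `source` by network distance.

    graph: {node: [(neighbor, segment_length), ...]} (undirected, both directions listed)
    objects_at: {node: [object ids located at that node]}  (assumption: objects on nodes)
    """
    heap = [(0.0, source)]   # best-first queue ordered by distance from the source
    settled = set()
    result = []
    while heap and len(result) < k:
        dist, node = heapq.heappop(heap)
        if node in settled:
            continue             # stale queue entry, node already expanded
        settled.add(node)
        for obj in objects_at.get(node, []):
            if len(result) < k:
                result.append((dist, obj))
        for nxt, length in graph.get(node, []):
            if nxt not in settled:
                heapq.heappush(heap, (dist + length, nxt))
    return result

# Hypothetical three-node network with two hotels.
graph = {
    "Q": [("A", 1.0), ("B", 4.0)],
    "A": [("Q", 1.0), ("B", 2.0)],
    "B": [("Q", 4.0), ("A", 2.0)],
}
objects_at = {"A": ["hotel_1"], "B": ["hotel_2"]}
print(ine_knn(graph, objects_at, "Q", 2))  # [(1.0, 'hotel_1'), (3.0, 'hotel_2')]
```

Because nodes are expanded in non-decreasing distance order, the search can stop as soon as the k-th object is reported.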

Motivating Example
• Computational redundancy in the existing solution
– Range query + multiple kNN queries [Wang and Liu, PVLDB 2009]
Total number of road segments searched: 3 + 2 + 5 + 6 = 17
Total number of road segments in the map: 6
Redundancy ratio: (17 - 6) / 6 ≈ 183% (worse with more boundary points)
• Can we provide the results without the computational redundancy?
Figure panels: Range Search; k-NN for D; k-NN for B; k-NN for F
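The redundancy ratio is the extra search work divided by the work that is strictly needed. A minimal sketch of that arithmetic, using the totals stated on the slide (17 segment visits, 6 distinct segments); the function name is hypothetical:

```python
def redundancy_ratio(segments_searched: int, segments_in_map: int) -> float:
    """Extra segment visits relative to visiting every segment exactly once."""
    return (segments_searched - segments_in_map) / segments_in_map

ratio = redundancy_ratio(17, 6)
print(f"{ratio:.0%}")  # 183%
```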

Problem Definition
• Given:
– An undirected graph G = (V, E) representing the road network
– A set of objects O
– A query region R (a set of road segments)
– A value k
• Find:
– An answer set A from O such that A contains the k nearest objects of every point in R, based on the network distance in G
• Objective:
– Provide A without computational redundancy

Efficient k-RNN Query Processing
• Step 1: Inside Query Step
• Step 2: Outside Network Expansion Step
– Multiple search queues
– Stop after the closest node is searched
– Switch to the queue with the smallest searched distance
– Termination condition: a queue's search covers the distance of its kth object
Example: 2-RNN
Iteration   Search from   Answer set
1st         A             P1, P2
2nd         B             P1, P2
3rd         C             P1, P2
4th         C             P1, P2, P3
5th         B             P1, P2, P3
Figure labels: A, B, C; P1, P2, P3; Road Segment Set (Range)
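The queue-switching behaviour of the expansion step can be sketched as below. This is an illustration under several assumptions that do not come from the slides: objects sit on nodes, the inside query step is reduced to a given set `inside_objects`, each boundary node gets its own best-first queue, and a queue stops once it has reached k objects. It shows only the interleaving and the termination test; it does not share work between queues, so it does not remove the redundancy that the full algorithm targets. All names and the sample network are hypothetical.

```python
import heapq

def build_graph(edges):
    """Build an undirected adjacency list from (u, v, length) triples."""
    graph = {}
    for u, v, length in edges:
        graph.setdefault(u, []).append((v, length))
        graph.setdefault(v, []).append((u, length))
    return graph

def k_rnn_sketch(graph, objects_at, boundary_nodes, inside_objects, k):
    answer = set(inside_objects)                      # Step 1: objects inside the region
    queues = {b: [(0.0, b)] for b in boundary_nodes}  # Step 2: one queue per boundary node
    settled = {b: set() for b in boundary_nodes}
    found = {b: 0 for b in boundary_nodes}
    active = list(boundary_nodes)
    while active:
        # Switch to the queue whose next node has the smallest searched distance.
        b = min(active, key=lambda q: queues[q][0][0])
        dist, node = heapq.heappop(queues[b])         # expand only the closest node
        if node not in settled[b]:
            settled[b].add(node)
            for obj in objects_at.get(node, []):
                answer.add(obj)
                found[b] += 1
            for nxt, length in graph.get(node, []):
                if nxt not in settled[b]:
                    heapq.heappush(queues[b], (dist + length, nxt))
        # Terminate this queue once its k-th object is covered (or nothing is left).
        if found[b] >= k or not queues[b]:
            active.remove(b)
    return answer

# Hypothetical network: region segments A-B and B-C, three objects outside the region.
edges = [("A", "B", 2.0), ("B", "C", 2.0), ("A", "X1", 1.0),
         ("X1", "X2", 1.0), ("C", "Y1", 1.0)]
graph = build_graph(edges)
objects_at = {"X1": ["P1"], "X2": ["P2"], "Y1": ["P3"]}
print(sorted(k_rnn_sketch(graph, objects_at, ["A", "B", "C"], set(), k=2)))
# ['P1', 'P2', 'P3']
```

Each queue expands nodes in non-decreasing distance order, so counting k reached objects is enough to know that the distance of the k-th object has been covered.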

Distance Calculation
• Case 1: By a pre-computed shortest path table
– Fast but requires more storage
• Case 2: Calculation on the fly
– Keep the distance information as the search expands
• Tradeoff between storage and speed
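Distance table as the search expands ("-" = no entry yet)

(1)
      A   B   E
A     0   1   2
B     1   0   3
E     2   3   0
C     -   -   -
D     -   -   -
P1    -   -   -
P2    -   -   -

(2)
      A   B   E
A     0   1   2
B     1   0   3
E     2   3   0
C     3   4   5
D     -   -   -
P1    -   -   -
P2    -   -   -

(3) Search collision!
      A   B   E
A     0   1   2
B     1   0   3
E     2   3   0
C     3   2   5
D     -   -   -
P1    2   1   4
P2    -   -   -

(4)
      A   B   E
A     0   1   2
B     1   0   3
E     2   3   0
C     3   2   5
D     5   4   6
P1    2   1   4
P2    4   3   5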
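One reading of the tables above is that, when the search from one column node reaches a new node, the entries for the other columns are filled in by going through the known distances between A, B, and E, and an entry is lowered when another search later reaches the same node by a shorter route (the "search collision"). The sketch below follows that reading; the interpretation, the class name, and the method name are assumptions, and entries filled in this way are upper bounds until a search confirms them.

```python
INF = float("inf")

class DistanceTable:
    """On-the-fly distance table keyed by node, one column per search source."""

    def __init__(self, source_dist):
        # source_dist[s][t]: known network distance between search sources s and t
        self.source_dist = source_dist
        self.sources = list(source_dist)
        self.rows = {s: dict(source_dist[s]) for s in self.sources}

    def record(self, source, node, dist):
        """Record that the search from `source` reached `node` at distance `dist`."""
        row = self.rows.setdefault(node, {s: INF for s in self.sources})
        for other in self.sources:
            # Route from `other` to `node` that passes through `source`.
            via = self.source_dist[other][source] + dist
            row[other] = min(row[other], via)   # keep the shorter value on a collision
        return row

table = DistanceTable({
    "A": {"A": 0, "B": 1, "E": 2},
    "B": {"A": 1, "B": 0, "E": 3},
    "E": {"A": 2, "B": 3, "E": 0},
})
print(table.record("A", "C", 3))   # {'A': 3, 'B': 4, 'E': 5}
print(table.record("B", "C", 2))   # {'A': 3, 'B': 2, 'E': 5}  (collision lowers B's entry)
print(table.record("B", "P1", 1))  # {'A': 2, 'B': 1, 'E': 4}
```

The three calls reproduce rows C and P1 of tables (2) and (3).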

Experimental Results
Parameter settings
Parameter                                    Default value   Range
k value                                      10              1 to 20
Number of objects                            600             200 to 1000
Query region size (ratio over total space)   0.018           0.002 to 0.050
• Evaluate our algorithm without pre-computed results (KRNN-E) and with pre-computed results (KRNN-F)
• Baseline algorithm: [Wang and Liu, PVLDB 2009]
• Road network: Hennepin County, Minnesota, US
– 39,513 nodes and 54,444 road segments

Comparison with Baseline (1/2)
a) Impact of different k values
b) Impact of the total number of objects on the map
c) Impact of different query region sizes

Comparison with Baseline (2/2)
• Impact of different distributions of the data objects
– Uniform distribution
– Normal distribution
• SD is the standard deviation, used to simulate hot spot locations such as a downtown area
Figure: Query processing time (s) for different POI distributions (Uniform, SD=1, SD=0.1, SD=0.01, SD=0.001); series: Baseline, KRNN-F, KRNN-E; y-axis from 0 to 80000

Tradeoff between Storage and Performance
• Tuning parameter P
– The percentage of the shortest distance table
– Warm-up process with 1000 k-RNN queries
– Full size of the table is 980 MB
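One way to picture the role of P is a distance table with a storage budget: entries are filled while warm-up queries run, up to P percent of the full table, and every other lookup falls back to on-the-fly computation. The slides do not say how the retained entries are chosen, so the fill-until-full policy, the class name, and the `compute` callback below are all assumptions made for illustration.

```python
class PartialDistanceTable:
    """Keeps at most `percent` of a full table's entries; misses are computed on the fly."""

    def __init__(self, full_entries, percent, compute):
        self.capacity = int(full_entries * percent / 100)
        self.compute = compute      # hypothetical on-the-fly distance function
        self.table = {}
        self.hits = 0
        self.misses = 0

    def distance(self, u, v):
        key = (u, v) if u <= v else (v, u)   # undirected network: one entry per pair
        if key in self.table:
            self.hits += 1
            return self.table[key]
        self.misses += 1
        d = self.compute(u, v)
        if len(self.table) < self.capacity:  # store only while the budget allows
            self.table[key] = d
        return d

# Toy stand-in for a network distance, with integer node ids.
cache = PartialDistanceTable(full_entries=4, percent=50, compute=lambda u, v: abs(u - v))
for u, v in [(1, 3), (3, 1), (2, 5), (4, 9), (9, 4)]:
    cache.distance(u, v)
print(cache.hits, cache.misses, len(cache.table))  # 1 4 2
```

A larger P stores more entries and turns more lookups into hits, at the cost of more memory; with the slide's full table size of 980 MB, P = 50 would correspond to roughly 490 MB.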

Conclusion
• An efficient algorithm for k-Range Nearest Neighbor (k-RNN) queries in road networks without computational redundancy
• Experimental evaluation
– Our solution outperforms the baseline algorithm
– Tuning parameter P achieves a tradeoff between storage and performance
Applications: privacy-preserving applications; uncertain locations