spatial range querying for gaussian-based imprecise query objects yoshiharu ishikawa, yuichi iijima...
![Page 1: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/1.jpg)
Spatial Range Querying for Gaussian-Based Imprecise Query Objects
Yoshiharu Ishikawa, Yuichi Iijima
Nagoya University
Jeffrey Xu Yu
The Chinese University of Hong Kong
![Page 2: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/2.jpg)
Outline
• Background and Problem Formulation
• Related Work
• Query Processing Strategies
• Experimental Results
• Conclusions
![Page 3: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/3.jpg)
Imprecise Location Information
• Sensor Environments
  – Frequent updates may not be possible
    • GPS-based positioning consumes batteries
• Robotics
  – Localization using sensing and movement histories
  – Probabilistic approach has vagueness
• Privacy
  – Location Anonymity
![Page 4: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/4.jpg)
Location-based Range Queries
• Location-based Range Queries
  – Example: Find hotels located within 2 km from Yuyuan Garden
  – Traditional problem in spatial databases
    • Efficient query processing using spatial indices
    • Extensible to multi-dimensional cases (e.g., image retrieval)
• What happens if the location of the query object q is uncertain?
![Page 5: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/5.jpg)
Probabilistic Range Query (PRQ) (1)
• Assumptions
  – Location of query object q is specified as a Gaussian distribution
  – Target data: static points
• Gaussian Distribution
  $$p_q(x) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left[-\frac{1}{2} (x - q)^t \Sigma^{-1} (x - q)\right]$$
  – Σ: Covariance matrix
![Page 6: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/6.jpg)
Probabilistic Range Query (PRQ) (2)
• Probabilistic Range Query (PRQ)
  $$PRQ(q, \delta, \theta) = \{\, o \mid o \in O,\ \Pr(\|x - o\|^2 \le \delta^2) \ge \theta \,\}$$
• Find objects such that the probabilities that their distances from q are less than δ are greater than θ
![Page 7: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/7.jpg)
Probabilistic Range Query (PRQ) (3)
• Is the distance between q and p within δ?
  – Numerical integration is required
[Figure: surface plot of the pdf of q (Gaussian distribution), with the target object p marked]
![Page 8: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/8.jpg)
Naïve Approach for Query Processing
• Exchanging roles
  – Pr[p is within δ from q] = Pr[q is within δ from p]
• Naïve approach
  – For each object p, integrate the pdf over the sphere region R
  – R: sphere with center p and radius δ
  – If the result is at least θ, p is qualified
• Quite costly!
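The naïve approach can be sketched in a few lines. This is only an illustration under stated assumptions: it is restricted to 2D, it estimates each integral by Monte Carlo sampling from the Gaussian of q, and the names `naive_prq`, `n_samples`, and `seed` are hypothetical rather than taken from the slides.

```python
import math
import random

def naive_prq(q, sigma, objects, delta, theta, n_samples=20000, seed=0):
    """Return the objects p with Pr[q is within delta of p] >= theta (2D only)."""
    # Hand-written Cholesky factor of the 2x2 covariance matrix sigma.
    l11 = math.sqrt(sigma[0][0])
    l21 = sigma[1][0] / l11
    l22 = math.sqrt(sigma[1][1] - l21 * l21)

    # Draw samples of the uncertain query location once and reuse them.
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        samples.append((q[0] + l11 * z1, q[1] + l21 * z1 + l22 * z2))

    answers = []
    for p in objects:
        # Fraction of samples that fall inside the sphere R around p.
        hits = sum(1 for x in samples if math.dist(x, p) <= delta)
        if hits / n_samples >= theta:
            answers.append(p)
    return answers
```

Every object costs a full pass over the samples, which is why the later slides concentrate on pruning objects before any integration is attempted.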
![Page 9: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/9.jpg)
![Page 10: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/10.jpg)
Related Work
• Query processing methods for uncertain (location) data
  – Cheng, Prabhakar, et al. (SIGMOD’03, VLDB’04, …)
  – Tao et al. (VLDB’05, TODS’07)
  – Parker, Subrahmanian, et al. (TKDE’07, ’09)
  – Consider arbitrary PDFs or uniform PDFs
  – Target objects may be uncertain
• Research related to Gaussian distribution
  – Gauss-tree [Böhm et al., ICDE’06]
  – Target objects are based on Gaussian distributions
![Page 11: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/11.jpg)
![Page 12: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/12.jpg)
Outline of Query Processing
• Generic query processing strategy consists of three phases
  1. Index-Based Search: Retrieve all candidate objects using a spatial index (R-tree)
  2. Filtering: Prune some candidates using several conditions
  3. Probability Computation: Perform numerical integration (Monte Carlo method) to evaluate the exact probability
• Phase 3 dominates processing cost
  – Filtering (phase 2) is important for efficiency
![Page 13: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/13.jpg)
Query Processing Strategies
• Three strategies
  1. Rectilinear-Region-Based Approach (RR)
  2. Oblique-Region-Based Approach (OR)
  3. Bounding-Function-Based Approach (BF)
• Combination of strategies is also possible
![Page 14: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/14.jpg)
Rectilinear-Region-Based (RR) (1)
• Use the concept of θ-region
  – Similar concepts are used in query processing for uncertain spatial databases
  – θ-region: Ellipsoidal region for which the result of the integration becomes 1 − 2θ:
    $$\int_{(x - q)^t \Sigma^{-1} (x - q) \le r_\theta^2} p_q(x)\, dx = 1 - 2\theta$$
• The ellipsoidal region
  $$(x - q)^t \Sigma^{-1} (x - q) \le r_\theta^2$$
  is the θ-region
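As a worked illustration that is not on the original slide, the radius rθ has a closed form in the 2D case. The substitution y = Σ^{-1/2}(x − q) maps the ellipsoidal region to a disc of radius r under the normal Gaussian, and in two dimensions the squared norm of y is exponentially distributed. The sample value θ = 0.01 is borrowed from the experiment slides; everything else follows from the definition above.

```latex
\Pr\left(\|y\| \le r\right) = 1 - e^{-r^2/2} \qquad (d = 2)

1 - e^{-r_\theta^2/2} = 1 - 2\theta
\;\Longrightarrow\;
r_\theta = \sqrt{-2 \ln(2\theta)}

\theta = 0.01:\quad r_\theta = \sqrt{-2 \ln 0.02} \approx 2.80
```

For other dimensions the same probability is the chi-square distribution function with d degrees of freedom evaluated at r², which is why a precomputed table indexed by θ is sufficient.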
![Page 15: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/15.jpg)
Rectilinear-Region-Based (RR) (2)
• Query processing
  – Given a query, the θ-region is computed: it suffices to have an rθ-table for the “normal” Gaussian pdf
    • “Normal” Gaussian: Σ = I, q = 0
    • Given θ, it returns the appropriate rθ
  – Derive the MBR of the θ-region and perform a Minkowski sum
  – Retrieve candidates, then perform numerical integration
[Figure: θ-region around q, its bounding rectangle before and after the Minkowski sum, and objects a, b, c]
![Page 16: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/16.jpg)
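Rectilinear-Region-Based (RR) (3)
• Geometry of bounding box
  $$w_i = r_\theta \sigma_i, \qquad \sigma_i = \sqrt{(\Sigma)_{ii}}$$
  where (Σ)ii is the (i, i) entry of Σ
[Figure: bounding box centered at q with half-widths wi and wj along the axes xi and xj]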
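A minimal sketch of the RR search box follows. It assumes that the Minkowski sum is taken with the query sphere of radius δ, so the axis-aligned box that bounds the result is the MBR of the θ-region grown by δ on every side. The function names `rr_search_box` and `in_box` are hypothetical, and rθ is passed in as if it had been read from the table.

```python
import math

def rr_search_box(q, sigma, r_theta, delta):
    """Axis-aligned search box: MBR of the theta-region grown by delta."""
    # Half-width along axis i is r_theta * sqrt((Sigma)_ii), plus delta.
    half = [r_theta * math.sqrt(sigma[i][i]) + delta for i in range(len(q))]
    lo = [qi - h for qi, h in zip(q, half)]
    hi = [qi + h for qi, h in zip(q, half)]
    return lo, hi

def in_box(p, lo, hi):
    """True if point p lies inside the closed box [lo, hi]."""
    return all(l <= x <= h for x, l, h in zip(p, lo, hi))
```

In a real system the box would be handed to the R-tree as a window query; `in_box` only stands in for that lookup here.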
![Page 17: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/17.jpg)
Oblique-Region-Based (OR) (1)
• Use of oblique rectangle
  – Query processing based on axis transformation
  – Not effective for phase 1 (index-based search): Only used for filtering (phase 2)
[Figure: two panels showing q with objects a, b, c, before and after the axis transformation]
![Page 18: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/18.jpg)
Oblique-Region-Based (OR) (2)
• Step 1: Rotate candidate objects
  – Based on the result of eigenvalue decomposition of Σ⁻¹
• Step 2: Check whether each object is inside the rectangle
  – λi: Eigenvalue of Σ⁻¹ for the i-th dimension
[Figure: oblique rectangle around q with half-widths $r_\theta (\lambda_i)^{-1/2}$ and $r_\theta (\lambda_j)^{-1/2}$]
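The two steps can be sketched for the 2D case, where the eigen decomposition of a symmetric 2×2 matrix has a closed form. Several things here are assumptions rather than slide content: the restriction to 2D, the growth of each half-width by δ (mirroring the Minkowski sum of the RR approach), and the name `or_filter_2d`.

```python
import math

def or_filter_2d(q, sigma, r_theta, delta, candidates):
    """Keep the candidates that fall inside the oblique rectangle around q."""
    a, b, c = sigma[0][0], sigma[0][1], sigma[1][1]
    # Closed-form eigen decomposition of the symmetric 2x2 matrix sigma.
    phi = 0.5 * math.atan2(2 * b, a - c)       # direction of the major axis
    mean, dev = (a + c) / 2, math.hypot((a - c) / 2, b)
    var_major, var_minor = mean + dev, mean - dev
    # Eigenvalues of the inverse matrix are reciprocals, so the half-width
    # r_theta * lambda_i^(-1/2) equals r_theta * sqrt(eigenvalue of sigma).
    half_u = r_theta * math.sqrt(var_major) + delta
    half_v = r_theta * math.sqrt(var_minor) + delta
    cos_p, sin_p = math.cos(phi), math.sin(phi)
    kept = []
    for p in candidates:
        dx, dy = p[0] - q[0], p[1] - q[1]
        u = cos_p * dx + sin_p * dy            # coordinate along the major axis
        v = -sin_p * dx + cos_p * dy           # coordinate along the minor axis
        if abs(u) <= half_u and abs(v) <= half_v:
            kept.append(p)
    return kept
```

With sigma = [[2, 1], [1, 2]], rθ = 1, and δ = 0, the point (1.0, -1.0) lies inside the rectilinear box of half-width √2 but outside the oblique rectangle, which shows why OR can prune candidates that RR lets through.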
![Page 19: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/19.jpg)
Bounding-Function-Based (BF) (1)
• Basic idea
  – Covariance matrix Σ = I (“normal” Gaussian pdf)
  – Isosurface of the pdf has a spherical shape
• Approach
  – Let α be the radius for which the integration result is θ
  – If dist(q, p) ≤ α then p satisfies the condition
  – Construct a table that gives (δ, θ) → α beforehand
[Figure: circle of radius α around q; Pr > θ inside the circle, Pr = θ on it, Pr < θ outside]
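One way to fill such a table is sketched below for the 2D normal Gaussian. The slides do not say how the table is built, so the midpoint-rule quadrature, the bisection search, and the names `prob_in_disc` and `alpha_for` are all assumptions made for illustration.

```python
import math

def prob_in_disc(alpha, delta, n=120):
    """Probability that a 2D standard normal point lies in the disc of radius
    delta centred at distance alpha from the origin (midpoint rule, polar form)."""
    total = 0.0
    d_rho, d_phi = delta / n, 2 * math.pi / n
    for i in range(n):
        rho = (i + 0.5) * d_rho
        for j in range(n):
            phi = (j + 0.5) * d_phi
            x = alpha + rho * math.cos(phi)
            y = rho * math.sin(phi)
            total += math.exp(-(x * x + y * y) / 2) * rho
    return total * d_rho * d_phi / (2 * math.pi)

def alpha_for(delta, theta, iters=40):
    """Largest centre distance whose disc still holds probability >= theta."""
    if prob_in_disc(0.0, delta) < theta:
        return None                  # even an object located exactly at q fails
    lo, hi = 0.0, delta + 10.0       # assumed search range; fine unless theta is extremely small
    for _ in range(iters):
        mid = (lo + hi) / 2
        if prob_in_disc(mid, delta) >= theta:
            lo = mid
        else:
            hi = mid
    return lo
```

The bisection relies on the probability shrinking as the disc moves away from the origin. Calling `alpha_for` over a grid of (δ, θ) pairs ahead of time would give the lookup table that the slide describes.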
![Page 20: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/20.jpg)
Bounding-Function-Based (BF) (2)
• General case
  – Isosurface has an ellipsoidal shape
• Approach
  – Use of upper- and lower-bounding functions for the pdf
    • They have spherical isosurfaces
    • Derived from the covariance matrix
![Page 21: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/21.jpg)
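Bounding Functions
• Original Gaussian pdf
  $$p_q(x) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left[-\frac{1}{2} (x - q)^t \Sigma^{-1} (x - q)\right]$$
• Upper- and lower-bounding functions (their isosurfaces have a spherical shape)
  $$p_q^\top(x) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left[-\frac{\lambda^\top}{2} \|x - q\|^2\right]$$
  $$p_q^\bot(x) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left[-\frac{\lambda^\bot}{2} \|x - q\|^2\right]$$
  – $p_q^\bot(x) \le p_q(x) \le p_q^\top(x)$ holds
  – Note: $\lambda^\top = \min\{\lambda_i\}$, $\lambda^\bot = \max\{\lambda_i\}$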
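The reason the bounds hold is a standard eigenvalue argument, spelled out here as a supplement to the slide. Write λ_min and λ_max for the smallest and largest eigenvalues of Σ⁻¹ (the minimum and maximum in the note above); the quadratic form in the exponent is then squeezed between two spherical terms.

```latex
\lambda_{\min} \|x - q\|^2
\;\le\; (x - q)^t \Sigma^{-1} (x - q)
\;\le\; \lambda_{\max} \|x - q\|^2

\Longrightarrow\quad
\exp\left[-\frac{\lambda_{\max}}{2} \|x - q\|^2\right]
\;\le\; \exp\left[-\frac{1}{2} (x - q)^t \Sigma^{-1} (x - q)\right]
\;\le\; \exp\left[-\frac{\lambda_{\min}}{2} \|x - q\|^2\right]
```

Multiplying all three terms by the common normalizing constant gives the lower-bounding function, the original pdf, and the upper-bounding function, in that order.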
![Page 22: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/22.jpg)
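Bounding-Function-Based (BF) (3)
• $\alpha^\top$ ($\alpha^\bot$): Radius with which the integration result of the upper- (lower-) bounding function is θ
[Figure: upper-bounding function, original pdf, and lower-bounding function plotted around q; the radii $\alpha^\top$ and $\alpha^\bot$ mark where the probability (integration result) equals θ]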
![Page 23: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/23.jpg)
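Bounding-Function-Based (BF) (4)
• Theoretical result
  – Let $S^\top$ be a spherical region with radius $\sqrt{\lambda^\top}\,\delta$ whose center is at distance $\beta^\top$ from the origin, and assume that $S^\top$ satisfies the following equation:
    $$\int_{x \in S^\top} p_{\mathrm{norm}}(x)\, dx = (\lambda^\top)^{d/2} |\Sigma|^{1/2}\, \theta$$
  – Using the table that gives (δ, θ) → α, we can get $\beta^\top$:
    $$\left(\sqrt{\lambda^\top}\,\delta,\ (\lambda^\top)^{d/2} |\Sigma|^{1/2}\, \theta\right) \to \beta^\top$$
  – Then we can get $\alpha^\top = \beta^\top / \sqrt{\lambda^\top}$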
![Page 24: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/24.jpg)
Bounding-Function-Based (BF) (5)
• Step 1: Use of R-tree
  – {b, c, d} are retrieved as candidates
• Step 2: Filtering using $\alpha^\top$
  – b is deleted
• Step 2’: Filtering using $\alpha^\bot$
  – We can determine d as an answer without numerical integration
• Step 3: Numerical integration
  – Performed on {c}
[Figure: objects a, b, c, d around q, with circles of radius $\alpha^\top$ and $\alpha^\bot$]
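Steps 2 and 2’ amount to a three-way classification by distance from q. In the sketch below the two radii are passed in directly as `alpha_top` (from the upper-bounding function) and `alpha_bot` (from the lower-bounding function); these parameter names, the function name `bf_filter`, and the sample coordinates in the usage note are hypothetical.

```python
import math

def bf_filter(q, candidates, alpha_top, alpha_bot):
    """Split candidates into pruned, accepted, and undecided by distance to q."""
    pruned, accepted, undecided = [], [], []
    for name, p in candidates.items():
        dist = math.dist(q, p)
        if dist > alpha_top:
            # Even the upper bound gives a probability below theta.
            pruned.append(name)
        elif alpha_bot is not None and dist <= alpha_bot:
            # Already the lower bound gives a probability of at least theta.
            accepted.append(name)
        else:
            # Only these objects need numerical integration.
            undecided.append(name)
    return pruned, accepted, undecided
```

Calling `bf_filter((0, 0), {"b": (5, 0), "c": (3, 0), "d": (1, 0)}, 4.0, 2.0)` reproduces the outcome on the slide: b is pruned, d is accepted, and only c is left for integration.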
![Page 25: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/25.jpg)
![Page 26: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/26.jpg)
Experiments on 2D Data (1)
• Map of Long Beach, CA
  – Normalized into [0, 1000] × [0, 1000]
• 50,747 entries
• Indexed by R-tree
• Covariance matrix
  $$\Sigma = \gamma \begin{bmatrix} 7 & 2\sqrt{3} \\ 2\sqrt{3} & 7 \end{bmatrix}$$
  – γ: Scaling parameter
  – Default: γ = 10
![Page 27: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/27.jpg)
Example Query
• Find objects within distance δ = 50 with probability threshold θ = 1%
![Page 28: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/28.jpg)
Experiments on 2D Data (2)
• Numerical integration dominates the total cost
• R-tree-based search is negligible
• ALL is the most effective strategy
[Figure: bar chart comparing the strategies RR, BF, RR+BF, RR+OR, BF+OR, and ALL for γ = 1, γ = 10, and γ = 100 (δ = 25, θ = 0.01); vertical axis from 0 to 200]
![Page 29: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/29.jpg)
Experiments on 2D Data (3)
• Filtering regions (δ = 25, θ = 0.01, γ = 10)
[Figure: filtering regions in the x-y plane for RR, OR, BF (upper), and BF (lower), together with the integration region for ALL]
![Page 30: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/30.jpg)
Experiments on 2D Data (4)
• Filtering regions for different uncertainty settings (δ = 25, θ = 0.01)
  – γ = 1: Nearly exact
  – γ = 10: Medium uncertainty
  – γ = 100: Uncertain
![Page 31: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/31.jpg)
Experiments on 9D Data (1)
• Motivating Scenario: Example-Based Image Retrieval
  – User specifies sample images
  – Image retrieval system estimates the user's interest as a Gaussian distribution
![Page 32: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/32.jpg)
Experiments on 9D Data (2)
• Data set: Corel Image Features data set
  – From UCI KDD Archive
  – Color Moments data
  – 68,040 9D vectors
  – Euclidean-distance based similarity
• Experimental Scenario: Pseudo-Feedback
  – Select a random query object, then retrieve its k nearest neighbors (k = 20) as sample images
  – Derive the covariance matrix from the samples
    $$\Sigma = \tilde{\Sigma} + \kappa I$$
    • $\tilde{\Sigma}$: Sample covariance matrix
    • κ: Normalization parameter
![Page 33: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/33.jpg)
Experiments on 9D Data (3)
• Parameters
  – δ = 0.7: For exact case, it retrieves 15.3 objects
  – θ = 40%
• Number of candidates (ANS: answer objs)
  – RR: 3713
  – BF: 3216
  – RR+BF: 2468
  – RR+OR: 1905
  – BF+OR: 1998
  – ALL: 1699
  – ANS: 3.9
• Too many candidates to retrieve only 3.9 objects!
![Page 34: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/34.jpg)
Experiments on 9D Data (4)
• Reason: Curse of dimensionality
• Plot shows existence probability for pnorm for different radii and dimensions
• Location of query object is too vague: in medium dimensions, it is quite far from its distribution center on average
  – Example: For the 9D case, the probability that the query object is within distance two is only 9%
[Figure: probability of existence (0 to 1) versus radius (0 to 5) for 2D, 3D, 5D, 9D, and 15D]
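The existence probability in the plot is the chance that a d-dimensional normal Gaussian point lies within a given radius of its center, which is the chi-square distribution function with d degrees of freedom evaluated at the squared radius. The sketch below uses the standard closed forms for even and odd d; the function name `prob_within_radius` is hypothetical.

```python
import math

def prob_within_radius(d, r):
    """Pr(||x|| <= r) for x drawn from the d-dimensional normal Gaussian."""
    x = r * r
    if d % 2 == 0:
        # Even d: 1 - exp(-x/2) * sum_{j < d/2} (x/2)^j / j!
        term, total = 1.0, 0.0
        for j in range(d // 2):
            total += term
            term *= (x / 2) / (j + 1)
        return 1.0 - math.exp(-x / 2) * total
    # Odd d: erf term minus a finite sum over odd double factorials.
    term, total = 1.0, 0.0
    for j in range((d - 1) // 2):
        total += term
        term *= x / (2 * j + 3)
    return (math.erf(math.sqrt(x / 2))
            - math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total)
```

`prob_within_radius(9, 2.0)` evaluates to about 0.089, in line with the 9% quoted on the slide, whereas `prob_within_radius(2, 2.0)` is about 0.86.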
![Page 35: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/35.jpg)
![Page 36: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/36.jpg)
Conclusions
• Spatial range query processing methods for imprecise query objects
  – Location of query object is represented by a Gaussian distribution
  – Three strategies and their combinations
  – Reduction of numerical integration is important
  – Problem is difficult for medium- and high-dimensional data
• Our related work
  – Probabilistic Nearest Neighbor Queries (MDM’09)
![Page 37: Spatial Range Querying for Gaussian-Based Imprecise Query Objects Yoshiharu Ishikawa, Yuichi Iijima Nagoya University Jeffrey Xu Yu The Chinese University](https://reader036.vdocument.in/reader036/viewer/2022062504/5a4d1b377f8b9ab05999d708/html5/thumbnails/37.jpg)