spatial indexing sams. spatial indexing point access methods can index only points. what about...
Post on 19-Dec-2015
231 views
TRANSCRIPT
![Page 1: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/1.jpg)
Spatial Indexing
SAMs
![Page 2: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/2.jpg)
Spatial Indexing Point Access Methods can index only
points. What about regions? Z-ordering and quadtrees Use the transformation technique and a
PAM New methods: Spatial Access Methods
SAMs R-tree and variations
![Page 3: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/3.jpg)
Problem Given a collection of geometric
objects (points, lines, polygons, ...) organize them on disk, to answer
spatial queries (range, nn, etc)
![Page 4: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/4.jpg)
Transformation Technique Map an d-dim MBR into a point: ex. [(xmin, xmax) (ymin, ymax)] =>
(xmin, xmax, ymin, ymax) Use a PAM to index the 2d points Given a range query, map the
query into the 2d space and use the PAM to answer it
![Page 5: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/5.jpg)
R-tree
[Guttman 84] Main idea: allow parents to overlap! => guaranteed 50% utilization => easier insertion/split algorithms. (only deal with Minimum Bounding
Rectangles - MBRs)
![Page 6: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/6.jpg)
R-tree A multi-way external memory tree Index nodes and data (leaf) nodes All leaf nodes appear on the same level Every node contains between m and M
entries The root node has at least 2 entries
(children)
![Page 7: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/7.jpg)
Example
eg., w/ fanout 4: group nearby rectangles to parent MBRs; each group -> disk page
A
B
C
DE
FG
H
J
I
![Page 8: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/8.jpg)
Example F=4
A
B
C
DE
FG
H
I
J
P1
P2
P3
P4
F GD E
H I JA B C
![Page 9: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/9.jpg)
Example F=4| m=2, M=4
A
B
C
DE
FG
H
I
J
P1
P2
P3
P4
P1 P2 P3 P4
F GD E
H I JA B C
P5 P6
P5
P6
![Page 10: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/10.jpg)
R-trees - format of nodes
{(MBR; obj_ptr)} for leaf nodes
P1 P2 P3 P4
A B C
x-low; x-highy-low; y-high
...
objptr ...
![Page 11: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/11.jpg)
R-trees - format of nodes
{(MBR; node_ptr)} for non-leaf nodes
P1 P2 P3 P4
A B C
x-low; x-highy-low; y-high
...nodeptr ...
![Page 12: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/12.jpg)
R-trees:Search
A
B
C
DE
FG
H
I
J
P1
P2
P3
P4
P1 P2 P3 P4
F GD E
H I JA B C
P5 P6
![Page 13: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/13.jpg)
R-trees:Search
P1 P2 P3 P4
F GD E
H I JA B C
P5 P6
A
B
C
DE
FG
H
I
J
P1
P2
P3
P4
![Page 14: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/14.jpg)
R-trees:Search
Main points: every parent node completely covers its
‘children’ a child MBR may be covered by more than
one parent - it is stored under ONLY ONE of them. (ie., no need for dup. elim.)
a point query may follow multiple branches. everything works for any(?) dimensionality
![Page 15: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/15.jpg)
R-trees:Insertion
A
B
C
DE
FG
H
I
J
P1
P2
P3
P4
P1 P2 P3 P4
F GD E
H I JA B CX
X
Insert X
![Page 16: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/16.jpg)
R-trees:Insertion
A
B
C
DE
FG
H
I
J
P1
P2
P3
P4
P1 P2 P3 P4
F GD E
H I JA B CY
Insert Y
![Page 17: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/17.jpg)
R-trees:Insertion
Extend the parent MBR
A
B
C
DE
FG
H
I
J
P1
P2
P3
P4
P1 P2 P3 P4
F GD E
H I JA B CY
Y
![Page 18: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/18.jpg)
R-trees:Insertion How to find the next node to insert
the new object? Using ChooseLeaf: Find the entry that
needs the least enlargement to include Y. Resolve ties using the area (smallest)
Other methods (later)
![Page 19: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/19.jpg)
R-trees:Insertion
If node is full then Split : ex. Insert w
A
B
C
DE
FG
H
I
J
P1
P2
P3
P4
P1 P2 P3 P4
F GD E
H I JA B C
W
K
K
![Page 20: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/20.jpg)
R-trees:Insertion
If node is full then Split : ex. Insert w
A
B
C
DE
FG
H
I
J
P1
P2
P3
P4
Q1 Q2
F G
D E
H I J
A BW
K
C K W
P5 P1 P5 P2 P3 P4
Q1Q2
![Page 21: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/21.jpg)
R-trees:Split
Split node P1: partition the MBRs into two groups.
A
B
CW
KP1
• (A1: plane sweep,
until 50% of rectangles)
• A2: ‘linear’ split
• A3: quadratic split
• A4: exponential split:
2M-1 choices
![Page 22: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/22.jpg)
R-trees:Split pick two rectangles as ‘seeds’; assign each rectangle ‘R’ to the ‘closest’ ‘seed’
seed1
seed2
R
![Page 23: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/23.jpg)
R-trees:Split pick two rectangles as ‘seeds’; assign each rectangle ‘R’ to the ‘closest’
‘seed’: ‘closest’: the smallest increase in area
seed1
seed2
R
![Page 24: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/24.jpg)
R-trees:Split How to pick Seeds:
Linear:Find the highest and lowest side in each dimension, normalize the separations, choose the pair with the greatest normalized separation
Quadratic: For each pair E1 and E2, calculate the rectangle J=MBR(E1, E2) and d= J-E1-E2. Choose the pair with the largest d
![Page 25: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/25.jpg)
R-trees:Insertion Use the ChooseLeaf to find the
leaf node to insert an entry E If leaf node is full, then Split,
otherwise insert there Propagate the split upwards, if
necessary Adjust parent nodes
![Page 26: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/26.jpg)
R-Trees:Deletion Find the leaf node that contains the entry E Remove E from this node If underflow:
Eliminate the node by removing the node entries and the parent entry
Reinsert the orphaned (other entries) into the tree using Insert
Other method (later)
![Page 27: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/27.jpg)
R-trees: Variations R+-tree: DO not allow overlapping, so
split the objects (similar to z-values) R*-tree: change the insertion, deletion
algorithms (minimize not only area but also perimeter, forced re-insertion )
Hilbert R-tree: use the Hilbert values to insert objects into the tree
![Page 28: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/28.jpg)
Spatial Access Methods PAMs
Grid File kd-tree based (LSD-, hB- trees) Z-ordering + B+-tree
R-tree Variations: R*-tree, Hilbert R-tree
![Page 29: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/29.jpg)
R-tree
A
B
C
DE
FG
H
I
J
P1
P2
P3
P4
P1 P2 P3 P4
F GD E
H I JA B C
Multi-way external memory structure, indexes MBRsDynamic structure
![Page 30: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/30.jpg)
R-tree The original R-tree tries to
minimize the area of each enclosing rectangle in the index nodes.
Is there any other property that can be optimized?
R*-tree Yes!
![Page 31: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/31.jpg)
R*-tree Optimization Criteria:
(O1) Area covered by an index MBR (O2) Overlap between directory MBRs (O3) Margin of a directory rectangle (O4) Storage utilization
Sometimes it is impossible to optimize all the above criteria at the same time!
![Page 32: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/32.jpg)
R*-tree
ChooseSubtree: If next node is a leaf node, choose the
node using the following criteria: Least overlap enlargement Least area enlargement Smaller area
Else Least area enlargement Smaller area
![Page 33: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/33.jpg)
R*-tree SplitNode
Choose the axis to split Choose the two groups along the chosen axis
ChooseSplitAxis Along each axis, sort rectangles and break
them into two groups (M-2m+2 possible ways where one group contains at least m rectangles). Compute the sum S of all margin-values (perimeters) of each pair of groups. Choose the one that minimizes S
ChooseSplitIndex Along the chosen axis, choose the grouping
that gives the minimum overlap-value
![Page 34: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/34.jpg)
R*-tree Forced Reinsert:
defer splits, by forced-reinsert, i.e.: instead of splitting, temporarily delete some entries, shrink overflowing MBR, and re-insert those entries
Which ones to re-insert? How many? A: 30%
![Page 35: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/35.jpg)
R-tree: variations What about static datasets?
(no ins/del) Hilbert What about other bounding
shapes?
![Page 36: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/36.jpg)
R-trees - variations
what about static datasets (no ins/del/upd)?
Q: Best way to pack points?
![Page 37: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/37.jpg)
R-trees - variations what about static datasets (no
ins/del/upd)? Q: Best way to pack points? A1: plane-sweep great for queries on ‘x’; terrible for ‘y’
![Page 38: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/38.jpg)
R-trees - variations what about static datasets (no
ins/del/upd)? Q: Best way to pack points? A1: plane-sweep great for queries on ‘x’; bad for ‘y’
![Page 39: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/39.jpg)
R-trees - variations what about static datasets (no
ins/del/upd)? Q: Best way to pack points? A1: plane-sweep great for queries on ‘x’; terrible for ‘y’ Q: how to improve?
![Page 40: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/40.jpg)
R-trees - variations A: plane-sweep on HILBERT curve!
![Page 41: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/41.jpg)
R-trees - variations
A: plane-sweep on HILBERT curve! In fact, it can be made dynamic
(how?), as well as to handle regions (how?)
![Page 42: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/42.jpg)
R-trees - variations Dynamic (‘Hilbert
R-tree): each point has an
‘h’-value (hilbert value)
insertions: like a B-tree on the h-value
but also store MBR, for searches
![Page 43: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/43.jpg)
Hilbert R-tree
Data structure of a node?
LHV x-low, ylowx-high, y-high
ptr
h-value >= LHV &MBRs: inside parent MBR
![Page 44: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/44.jpg)
R-trees - variations Data structure of a node?
LHV x-low, ylowx-high, y-high
ptr
h-value >= LHV &MBRs: inside parent MBR
~B-tree
![Page 45: Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation](https://reader036.vdocument.in/reader036/viewer/2022062407/56649d375503460f94a10040/html5/thumbnails/45.jpg)
R-trees - variations Data structure of a node?
LHV x-low, ylowx-high, y-high
ptr
h-value >= LHV &MBRs: inside parent MBR
~ R-tree