Indexing in Spatial Databases and Query Processing

Upload: leila-thornton

Post on 03-Jan-2016



TRANSCRIPT

Page 1: Indexing in Spatial Databases and Query Processing

Indexing in Spatial Databases and Query Processing

Page 2: Indexing in Spatial Databases and Query Processing

Query Processing
• Efficient algorithms to answer spatial queries
• Common strategy: filter and refine

• Filter Step: Query Region overlaps with MBRs (minimum bounding rectangles) of B, C and D
• Refine Step: Query Region overlaps with B and C

• Why filter first: it reduces computation time, because it is computationally cheaper to compute the intersection between a query region and a rectangle than between the query region and an arbitrary, irregularly shaped spatial object.

Fig 1.8
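The filter and refine steps above can be sketched in Python. This is only an illustration under stated assumptions: the objects are discs standing in for irregular shapes, the query region is an axis-aligned rectangle, and the names `Rect`, `Disc`, `filter_step` and `refine_step` as well as all coordinates are hypothetical, not taken from Fig 1.8.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Rect:
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def overlaps(self, other):
        # Cheap test: two axis-aligned rectangles overlap on both axes.
        return (self.xmin <= other.xmax and other.xmin <= self.xmax
                and self.ymin <= other.ymax and other.ymin <= self.ymax)


@dataclass(frozen=True)
class Disc:
    # Hypothetical stand-in for an arbitrary spatial object.
    name: str
    cx: float
    cy: float
    r: float

    def mbr(self):
        # Minimum bounding rectangle of the disc.
        return Rect(self.cx - self.r, self.cy - self.r,
                    self.cx + self.r, self.cy + self.r)

    def intersects(self, q):
        # Exact test: distance from the centre to the nearest point of q.
        nx = min(max(self.cx, q.xmin), q.xmax)
        ny = min(max(self.cy, q.ymin), q.ymax)
        return hypot(self.cx - nx, self.cy - ny) <= self.r


def filter_step(objects, query):
    # Keep every object whose MBR overlaps the query region.
    return [o for o in objects if o.mbr().overlaps(query)]


def refine_step(candidates, query):
    # Run the exact geometry test only on the surviving candidates.
    return [o for o in candidates if o.intersects(query)]


objects = [
    Disc("A", 20.0, 20.0, 2.0),
    Disc("B", 5.0, 5.0, 2.0),
    Disc("C", 11.0, 5.0, 2.0),
    Disc("D", 11.5, 11.5, 2.0),
]
query = Rect(0.0, 0.0, 10.0, 10.0)

candidates = filter_step(objects, query)
answers = refine_step(candidates, query)
print([o.name for o in candidates])  # ['B', 'C', 'D']
print([o.name for o in answers])     # ['B', 'C']
```

In this made-up layout, D passes the filter because its MBR touches a corner of the query region, but the refine step rejects it because the disc itself does not reach that corner.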

Page 3: Indexing in Spatial Databases and Query Processing

File Organization and Indices

• DBMSs have been designed to handle very large amounts of data
• There is a fundamental difference in how algorithms are designed for GIS data analysis and for a database environment
• The difference comes from the assumptions made by GIS and by an SDBMS

• GIS algorithms:
  • Main focus is minimizing the computation time
  • Assume that the entire dataset resides in main memory

• SDBMS: the dataset is on secondary storage, e.g. disk
  • Main focus is on I/O time
  • I/O time is the time required to transfer data from disk to main memory
  • Assumes finite main memory and infinite disk

• An SDBMS uses spatial indices to search large spatial datasets in the DB efficiently

Page 4: Indexing in Spatial Databases and Query Processing

File Organization and Indices

a. Programmer’s viewpoint: computation time
b. DBMS designer’s viewpoint: I/O time

[Figure labels: CPU-bound, computation time; I/O-bound, I/O time]

Page 5: Indexing in Spatial Databases and Query Processing

Indexing

Consider secondary storage as a book.
• The smallest unit of transfer between the disk and main memory is a page, and the records of a table are like structured lines of text on the page.
• At any time, some pages reside in main memory and some on disk.
• To accelerate search, the DB uses an index.
• Without an index, the DBMS can fetch all the pages spanned by a table and scan them line by line until the record is found.
• With an index, it can search the index for the desired keyword and go directly to the page specified in the index.
• Index entries in a book are sorted in alphabetical order. Similarly, if the index is built on numbers, such as the social security number, the entries can be ordered numerically.
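The book analogy can be made concrete with a small Python sketch that counts how many data pages are read by a full scan and by an index lookup. Everything here is an assumption for illustration: the page size, the record keys, and the helper names `scan` and `lookup`. The index is simply held in memory as a sorted list, which a real DBMS would not necessarily do.

```python
from bisect import bisect_left

PAGE_SIZE = 4  # records per page (hypothetical)

# Records are stored in arrival order, not in key order.
keys = (507, 112, 903, 331, 648, 275, 820, 459, 164, 736, 581, 29)
records = [(k, f"person-{k}") for k in keys]
pages = [records[i:i + PAGE_SIZE] for i in range(0, len(records), PAGE_SIZE)]


def scan(pages, key):
    # Fetch pages one by one and read them line by line.
    pages_read = 0
    for page in pages:
        pages_read += 1
        for k, value in page:
            if k == key:
                return value, pages_read
    return None, pages_read


# The index: (key, page number) entries sorted by key.
index = sorted((k, page_no) for page_no, page in enumerate(pages) for k, _ in page)
index_keys = [k for k, _ in index]


def lookup(pages, key):
    # Binary search in the sorted index, then fetch only the page it names.
    pos = bisect_left(index_keys, key)
    if pos == len(index_keys) or index_keys[pos] != key:
        return None, 0
    page_no = index[pos][1]
    for k, value in pages[page_no]:
        if k == key:
            return value, 1
    return None, 1


print(scan(pages, 581))    # ('person-581', 3)
print(lookup(pages, 581))  # ('person-581', 1)
```

The binary search only works because the index entries are sorted, which is the point of the last bullet above.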

Page 6: Indexing in Spatial Databases and Query Processing

Spatial Indexing: Search Data-Structures

R-tree, B-tree

• Choice of structure for spatial indexing: for query optimization
• A B-tree is a hierarchical collection of ranges of linear keys, e.g. numbers
• A B-tree index is used for efficient search of traditional data

• It depends crucially on the existence of an order in the indexing field.
• See the difference between a binary tree and a B-tree
• Each node represents a page

Page 7: Indexing in Spatial Databases and Query Processing

Spatial Indexing: Search Data-Structures

• Which nodes are index pages and which nodes are data pages?
• If a page holds m keys, then the height is O(log_m n)

What is n?
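To get a feel for the height bound, the sketch below reads n as the number of indexed keys (an assumption; the slide leaves it as a question) and computes the smallest number of levels h with m^h >= n. The function name `levels_needed` and the sample sizes are hypothetical.

```python
def levels_needed(n_keys, fanout):
    # Smallest h such that fanout ** h >= n_keys, using integers only
    # to avoid floating-point rounding in a logarithm.
    levels, capacity = 0, 1
    while capacity < n_keys:
        capacity *= fanout
        levels += 1
    return levels


print(levels_needed(1_000_000, 2))    # 20 (binary tree, one key per node)
print(levels_needed(1_000_000, 100))  # 3  (page holding 100 keys)
```

Since each node is a page, fewer levels means fewer page reads per search, which is the practical difference between a binary tree and a B-tree under these assumptions.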

WHY THE B-TREE IS NOT APPROPRIATE FOR INDEXING SPATIAL DATA

Page 8: Indexing in Spatial Databases and Query Processing

[Figure labels: Study Area; Minimum Bounding Rectangle; Minimum Bounding Rectangles]

Page 9: Indexing in Spatial Databases and Query Processing

R-tree for Spatial Data Indexing

• Because a natural order does not exist in multidimensional space, the B-tree cannot be used directly to create an index of spatial objects.

• The R-tree data structure was one of the first index structures specifically designed to handle multidimensional extended objects

• The R-tree nevertheless provides efficient search over spatial objects

• R-tree is a hierarchical collection of rectangles
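The idea of a hierarchical collection of rectangles can be sketched in Python with a tiny hand-built tree. This is only a search sketch under assumptions: the tree is assembled by hand rather than by R-tree insertion and node splitting, and the names `Node`, `make_parent`, `search` and all coordinates are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    mbr: tuple  # (xmin, ymin, xmax, ymax)
    children: list = field(default_factory=list)  # empty for a leaf entry


def overlaps(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def union(rects):
    # Smallest rectangle enclosing all the given rectangles.
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def make_parent(name, children):
    # A parent's rectangle is the MBR of its children's rectangles.
    return Node(name, union([c.mbr for c in children]), children)


def search(node, query, visited):
    # Descend only into subtrees whose rectangle overlaps the query.
    visited.append(node.name)
    if not overlaps(node.mbr, query):
        return []
    if not node.children:
        return [node.name]
    hits = []
    for child in node.children:
        hits.extend(search(child, query, visited))
    return hits


a = Node("a", (0, 0, 2, 2))
b = Node("b", (3, 1, 5, 3))
c = Node("c", (10, 10, 12, 12))
d = Node("d", (13, 11, 15, 14))
root = make_parent("root", [make_parent("R1", [a, b]), make_parent("R2", [c, d])])

visited = []
hits = search(root, (4, 2, 6, 4), visited)
print(hits)     # ['b']
print(visited)  # ['root', 'R1', 'a', 'b', 'R2']
```

The rectangle R2 is examined once and rejected, so its entries c and d are never visited. The hits are MBR matches only, so they correspond to the filter step and would still need an exact geometry test.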

Page 10: Indexing in Spatial Databases and Query Processing

R-tree: Example

Example of an R-tree index of polygons

Page 11: Indexing in Spatial Databases and Query Processing

Query Optimization

• Query Optimization
  • A spatial operation can be processed using different strategies
  • The computation cost of each strategy depends on many parameters

• Example Query:
  • Find the names of all female senators who own a business

SELECT S.name
FROM Senator S, Business B
WHERE S.soc-sec = B.soc-sec AND S.gender = 'Female'

Here, there is a selection query and a join query

• Optimization
  • Process (S.gender = 'Female') before (S.soc-sec = B.soc-sec)
  • Do not use an index for processing (S.gender = 'Female')
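The effect of processing the selection before the join can be illustrated with a small Python sketch that counts join comparisons. The data are invented for the illustration, the join is a naive nested loop, and names such as `nested_loop_join`, `cost_a` and `cost_b` are hypothetical; a real optimizer uses far richer cost models.

```python
# Hypothetical tables: 100 senators, 25 of them female, and 10 businesses.
senators = [
    {"name": f"S{i}", "soc_sec": i, "gender": "Female" if i % 4 == 0 else "Male"}
    for i in range(100)
]
businesses = [{"soc_sec": i, "business": f"B{i}"} for i in range(0, 100, 10)]


def nested_loop_join(left, right):
    # Compare every left row with every right row and count the comparisons.
    comparisons = 0
    out = []
    for l_row in left:
        for r_row in right:
            comparisons += 1
            if l_row["soc_sec"] == r_row["soc_sec"]:
                out.append((l_row, r_row))
    return out, comparisons


# Plan A: join first, then apply the selection on gender.
joined_a, cost_a = nested_loop_join(senators, businesses)
plan_a = sorted(s["name"] for s, _ in joined_a if s["gender"] == "Female")

# Plan B: apply the selection first with a plain scan (no index), then join.
females = [s for s in senators if s["gender"] == "Female"]
joined_b, cost_b = nested_loop_join(females, businesses)
plan_b = sorted(s["name"] for s, _ in joined_b)

print(plan_a == plan_b)  # True
print(cost_a, cost_b)    # 1000 250
```

Both plans return the same names, but selecting first shrinks the join input. The selection itself is a single pass over the table, which is one reading of why an index on gender is not worth using here.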

Page 12: Indexing in Spatial Databases and Query Processing

Multi-scan Query Example

• Find all senators who serve a district of area greater than 300 square miles and who own a business within the district.
• Spatial join example

SELECT S.name
FROM Senator S, Business B
WHERE S.district.Area() > 300 AND Within(B.location, S.district)

• Here, the query is a composition of two sub-queries: a range query and a spatial join
• Which one is first?
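One way to explore the ordering question is to count how many Within tests each order performs on a toy dataset. The sketch below follows the SQL as written; districts are rectangles, business locations are points, and every name and coordinate is hypothetical. It shows one made-up case and is not a general answer.

```python
# Hypothetical data: (senator name, district rectangle in miles).
senators = [
    ("Ames", (0, 0, 10, 10)),    # area 100
    ("Baker", (10, 0, 40, 20)),  # area 600
    ("Chen", (0, 10, 20, 30)),   # area 400
]
business_locations = [(5, 5), (50, 50), (10, 25), (20, 5)]


def area(rect):
    xmin, ymin, xmax, ymax = rect
    return (xmax - xmin) * (ymax - ymin)


def within(point, rect):
    x, y = point
    xmin, ymin, xmax, ymax = rect
    return xmin <= x <= xmax and ymin <= y <= ymax


def spatial_join(senator_rows, locations):
    # Test every (district, location) pair and count the Within tests.
    tests = 0
    pairs = []
    for name, district in senator_rows:
        for loc in locations:
            tests += 1
            if within(loc, district):
                pairs.append((name, district))
    return pairs, tests


# Plan A: spatial join first, then the range condition on area.
pairs_a, tests_a = spatial_join(senators, business_locations)
plan_a = sorted({name for name, district in pairs_a if area(district) > 300})

# Plan B: range condition on area first, then the spatial join.
large = [(name, district) for name, district in senators if area(district) > 300]
pairs_b, tests_b = spatial_join(large, business_locations)
plan_b = sorted({name for name, _ in pairs_b})

print(plan_a, plan_b)    # ['Baker', 'Chen'] ['Baker', 'Chen']
print(tests_a, tests_b)  # 12 8
```

On this data the range query first saves a third of the Within tests. With real polygons each test is much more expensive than an area comparison, and the balance depends on how selective each condition is.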