query execution section 15.1 shweta athalye cs257: database systems id: 118 section 1

Post on 04-Jan-2016


Query Execution, Section 15.1

Shweta Athalye, CS257: Database Systems

ID: 118, Section 1

Agenda

Query processor

Query compilation

Physical query plan operators

Scanning tables: table-scan and index-scan

Sorting while scanning tables

Model of computation for physical operators

Parameters for measuring cost

I/O cost for scan operators

Iterators

Query processor

The query processor is the group of components of a DBMS that turns user queries and data-modification commands into a sequence of database operations and executes those operations.

The query processor is responsible for supplying the details of how a query is to be executed.

The major parts of the query processor

Query compilation

Query compilation is itself a multi-step process consisting of:

Parsing: a parse tree representing the query and its structure is constructed.

Query rewrite: the parse tree is converted to an initial query plan.

Physical plan generation: the abstract query plan is turned into a physical query plan.

Outline of query compilation

Physical Query Plan Operators

Physical query plans are built from operators.

Each operator implements one step of the plan.

Physical operators can be implementations of the operators of relational algebra.

They can also be operators outside relational algebra, such as "scan", which scans tables.

Scanning Tables

Reading the entire contents of a relation R is one of the most basic things a physical query plan does.

It is necessary, for example, when we want to take the join or union of R with another relation.

Two basic approaches to locating the tuples of a relation R

Table-scan: Relation R is stored in secondary memory with its tuples arranged in blocks. The blocks can be read one by one. This operation is called table-scan.

Index-scan: If there is an index on any attribute of relation R, the index can be used to get all the tuples of R. This operation is called index-scan.
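The two approaches can be sketched in Python as follows. This is only an illustration, not how any particular DBMS does it: the relation is modeled as an in-memory list of blocks, and the index as a dictionary from attribute value to (block number, slot) pointers. The names R_BLOCKS, table_scan, build_index, and index_scan are hypothetical and chosen for this example.

```python
from collections import defaultdict

# Hypothetical stand-in for a relation R stored on disk:
# a list of blocks, each block a list of (id, name) tuples.
R_BLOCKS = [
    [(3, "carol"), (1, "alice")],
    [(2, "bob"), (5, "eve")],
    [(4, "dave")],
]

def table_scan(blocks):
    """Read the blocks one by one and yield every tuple in them."""
    for block in blocks:      # in a real system, one disk I/O per block
        for tup in block:
            yield tup

def build_index(blocks, attr):
    """Build a toy index: attribute value -> list of (block_no, slot)."""
    index = defaultdict(list)
    for block_no, block in enumerate(blocks):
        for slot, tup in enumerate(block):
            index[tup[attr]].append((block_no, slot))
    return index

def index_scan(blocks, index):
    """Follow the index pointers to reach all tuples of R, in key order."""
    for key in sorted(index):
        for block_no, slot in index[key]:
            yield blocks[block_no][slot]
```

Both functions produce the same set of tuples. The index-scan returns them in key order here only because this toy index is traversed in sorted order.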

Sorting While Scanning Tables

Why do we need sorting while scanning?

The query could include an ORDER BY clause, requiring that a relation be sorted.

Various algorithms for relational-algebra operations require one or both of their arguments to be sorted relations.

The physical-query-plan operator sort-scan takes a relation R and a specification of the attributes on which the sort is to be made, and produces R in that sorted order.
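A sort-scan can be sketched in a few lines of Python. The sketch assumes that R fits entirely in main memory and that attributes are identified by their position in the tuple. Both are simplifications, and the name sort_scan is hypothetical.

```python
from operator import itemgetter

def sort_scan(tuples, sort_attrs):
    """Return the tuples of R sorted on the given attribute positions.

    sort_attrs is a non-empty list of positions, most significant first.
    Assumes the whole relation fits in main memory.
    """
    return sorted(tuples, key=itemgetter(*sort_attrs))
```

For example, sort_scan(R, [1, 0]) orders the tuples by their second attribute and breaks ties on the first.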

Model of Computation for Physical Operators

Choosing physical plan operators wisely is essential for a good query processor.

The cost of an operation is measured in the number of disk I/O operations.

If an operator produces the final answer to a query and that answer must be written back to disk, the write-back cost depends on the length of the answer and is added to the total cost of the query.

Improvements in cost

Major improvements in the cost of physical operators can be achieved by avoiding or reducing disk I/O operations.

This can be achieved by passing the answer of one operator to the next in main memory, without writing it to disk.
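Python generators give a compact way to illustrate passing results between operators in main memory. In the sketch below, each operator pulls tuples from its input and hands them on one at a time, so no intermediate result is materialized. The operators scan, select, and project and the list-of-blocks model of a relation are assumptions made for the example.

```python
def scan(blocks):
    """Yield the tuples of a relation stored as a list of blocks."""
    for block in blocks:
        for tup in block:
            yield tup

def select(tuples, predicate):
    """Pass on only the tuples that satisfy the predicate."""
    for tup in tuples:
        if predicate(tup):
            yield tup

def project(tuples, positions):
    """Keep only the attributes at the given positions."""
    for tup in tuples:
        yield tuple(tup[i] for i in positions)
```

A plan is then a nesting of calls, for instance project(select(scan(blocks), lambda t: t[2] >= 20), [1]). Each tuple flows through all three operators before the next one is read.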

Parameters for Measuring Costs

Parameters that affect the performance of a query:

Buffer space available in main memory at the time the query is executed

Size of the input and size of the output generated

The size of a memory block on disk and in main memory

I/O Cost for Scan Operators

If relation R is clustered, i.e. its tuples are packed into approximately B blocks, then a table-scan requires about B disk I/Os.

If R is clustered but must be sorted with a two-phase multiway merge sort, the total number of disk I/Os is 3B.

However, if R is not clustered, the number of disk I/Os required is generally much higher, because its tuples may be spread over many more than B blocks.
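These cost figures can be written down as a small calculator. B is the number of blocks holding R, as above. T, the number of tuples of R, is a symbol introduced here for illustration, and using T as the cost of scanning an unclustered relation is a worst-case assumption (one disk I/O per tuple), not a figure stated on the slides. The function names are hypothetical.

```python
def table_scan_cost(B, T, clustered=True):
    """Estimated disk I/Os to read all of R.

    Clustered: about B, one I/O per block.
    Unclustered: up to T in the worst case (assumption: one I/O per tuple).
    """
    return B if clustered else T

def sort_scan_cost(B):
    """Estimated disk I/Os for a two-phase multiway merge sort of clustered R.

    Read R once to build sorted sublists (B), write the sublists (B),
    and read them again to merge (B).
    """
    return 3 * B
```

With B = 1000 and T = 20000, a clustered table-scan costs about 1000 I/Os, a two-phase sort-scan about 3000, and a worst-case unclustered scan about 20000.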

Iterators for Implementation of Physical Operators

Many physical operators can be implemented as an iterator.

An iterator is a group of three functions that allows a consumer of the result of the physical operator to get the result one tuple at a time.

Iterator

The three functions forming the iterator are:

Open: This function starts the process of getting tuples. It initializes any data structures needed to perform the operation.

GetNext: This function returns the next tuple in the result and adjusts data structures as necessary to allow subsequent tuples to be obtained. If there are no more tuples to return, GetNext returns a special value NotFound.

Close: This function ends the iteration after all tuples have been obtained. It calls Close on any arguments of the operator.
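The Open/GetNext/Close interface can be sketched in Python for two operators, a table-scan and a selection over it. The class names, the NOT_FOUND sentinel standing in for NotFound, and the run helper are assumptions made for this example. Having Select.open call open on its argument is also an assumption; the description above only says this for Close.

```python
NOT_FOUND = object()  # stand-in for the special value NotFound

class TableScan:
    """Iterator over a relation stored as a list of blocks (toy model)."""
    def __init__(self, blocks):
        self.blocks = blocks

    def open(self):
        # Initialize the position: first block, first tuple.
        self.block_no = 0
        self.slot = 0

    def get_next(self):
        # Skip past exhausted (or empty) blocks until a tuple is found.
        while self.block_no < len(self.blocks):
            block = self.blocks[self.block_no]
            if self.slot < len(block):
                tup = block[self.slot]
                self.slot += 1
                return tup
            self.block_no += 1
            self.slot = 0
        return NOT_FOUND

    def close(self):
        pass  # a leaf operator has no arguments to close

class Select:
    """Iterator that filters the tuples of its argument operator."""
    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def open(self):
        self.child.open()

    def get_next(self):
        # Pull from the argument until a tuple qualifies or input ends.
        while True:
            tup = self.child.get_next()
            if tup is NOT_FOUND or self.predicate(tup):
                return tup

    def close(self):
        self.child.close()  # Close is passed down to the argument

def run(op):
    """Consume an iterator one tuple at a time and collect the result."""
    op.open()
    result = []
    tup = op.get_next()
    while tup is not NOT_FOUND:
        result.append(tup)
        tup = op.get_next()
    op.close()
    return result
```

The consumer in run never sees blocks or predicates. It only calls the three functions, which is what lets operators be stacked freely in a plan.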
