query execution section 15.1 shweta athalye cs257: database systems id: 118 section 1

19
Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

Upload: caroline-wilkins

Post on 04-Jan-2016

214 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

Query ExecutionSection 15.1

Shweta AthalyeCS257: Database Systems

ID: 118Section 1

Page 2: Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

Agenda

Query Processor Query compilation Physical Query Plan Operators

Scanning Tables Table Scan Index scan

Sorting while scanning tables Model of computation for physical operators Parameters for measuring cost I/O cost for scan operators Iterators

Page 3: Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

Query processor

The query processor is the group of components of a DBMS that turns user queries and data-modification commands into a sequence of database operations and executes those operations

query processor is responsible for supplying details regarding how the query is to be executed

Page 4: Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

The major parts of the query processor

Page 5: Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

Query compilation

Query compilation itself is a multi-step process consisting of : Parsing: in which a parse tree representing

query and its structure is constructed Query rewrite: in which the parse tree is

converted to an initial query plan Physical plan generation: where the abstract

query plan is turned into a physical query plan

Page 6: Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

Outline of query compilation

Page 7: Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

Physical Query Plan Operators

Physical query plans are built from operators

Each of the operators implement one step of the plan.

Physical operators can be implementations of the operator of relational algebra.

They can also be non relational algebra operators like “scan” which scans tables.

Page 8: Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

Scanning Tables

One of the most basic things in a physical query plan.

Necessary when we want to perform join or union of a relation with another relation.

Page 9: Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

Two basic approaches to locating the tuples of a relation R Table-scan

Relation R is stored in secondary memory with its tuples arranged in blocks

it is possible to get the blocks one by one This operation is called Table Scan

Page 10: Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

Two basic approaches to locating the tuples of a relation R

Index-scan there is an

index on any attribute of Relation R

Use this index to get all the tuples of R

This operation is called Index Scan

Page 11: Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

Sorting While Scanning Tables

Why do we need sorting while scanning? the query could include an ORDER BY

clause. Requiring that a relation be sorted Various algorithms for relational-algebra

operations require one or both of their arguments to be sorted relation

physical-query-plan operator sort-scan takes a relation R and a specification of the attributes on which the sort is to be made, and produces R in that sorted order

Page 12: Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

Model of Computation for Physical Operators

Choosing physical plan operators wisely is an essential for a good query processor.

Cost for an operation is measured in number of disk i/o operations.

If an operator requires the final answer to a query to be written back to the disk, the total cost will depend on the length of the answer and will include the final write back cost to the total cost of the query.

Page 13: Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

Improvements in cost

Major improvements in cost of the physical operators can be achieved by avoiding or reducing the number of disk i/o operations

This can be achieved by passing the answer of one operator to the other in the main memory itself without writing it to the disk.

Page 14: Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

Parameters for Measuring Costs

Parameters that affect the performance of a query Buffer space availability in the main memory

at the time of execution of the query Size of input and the size of the output

generated The size of memory block on the disk and the

size in the main memory also affects the performance

Page 15: Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

I/0 Cost for Scan Operators

If Relation R is clustered, i.e. it is stored in approximately B blocks then the total number of disk operations required is B

If R is clustered but requires a two phase multi way merge sort then the total number of disk i/o required will be 3B.

However, if R is not clustered, then the number of required disk I/0's is generally much higher.

Page 16: Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

Iterators for Implementation of Physical Operators

Many physical operators can be implemented as an iterator

It is a group of three functions that allows a consumer of the result of the physical operator to get the result one tuple at a time

Page 17: Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

Iterator

The three functions forming the iterator are:

Open: This function starts the process of getting

tuples. It initializes any data structures needed to

perform the operation

Page 18: Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

Iterator

GetNext This function returns the next tuple in the

result Adjusts data structures as necessary to

allow subsequent tuples to be obtained If there are no more tuples to return,

GetNext returns a special value NotFound

Page 19: Query Execution Section 15.1 Shweta Athalye CS257: Database Systems ID: 118 Section 1

Iterator

Close This function ends the iteration after all

tuples it calls Close on any arguments of the

operator