03 chapter 15 algorithms for query processing optimization

35
ICS 424 - 01 (072) Query Processing and Optimization 1 Chapter 15 Algorithms for Query Processing and Optimization ICS 424 Advanced Database Systems  Dr. Muhammad Shafique

Upload: ashok-varadarajan

Post on 02-Jun-2018

246 views

Category:

Documents


1 download

TRANSCRIPT

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 1/35

ICS 424 - 01 (072) Query Processing andOptimization

Chapter 15

Algorithms for Query Processing

and Optimization

ICS 424 Advanced Database Systems 

Dr. Muhammad Shafique

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 2/35

ICS 424 - 01 (072) Query Processing andOptimization

Outline

• Introduction• Processing a query

• SQL queries and relational algebra

• Implementing basic query operations

• Heuristics-based query optimization

• Overview of query optimization in Oracle

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 3/35

ICS 424 - 01 (072) Query Processing andOptimization

Material Covered from Chapter 15

• Pages 537, 538, 539• Section 15.1

• Section 15.2

• Section 15.6

• Section 15.7

• Section 15.9

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 4/35

ICS 424 - 01 (072) Query Processing andOptimization

Introduction to Query Processing

• Query optimization

• The process of choosing a suitable execution strategy

for processing a query.

• Two internal representations of a query:

• Query Tree• Query Graph

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 5/35

ICS 424 - 01 (072) Query Processing andOptimization

Background Review

• DDL compiler• DML compiler

• Runtime

database

 processor• System catalog

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 6/35

ICS 424 - 01 (072) Query Processing andOptimization

Processing a Query

• Tasks in processing a high-level query1. Scanner scans the query and identifies the language tokens

2. Parser checks syntax of the query

3. The query is validated by checking that all attribute names and

relation names are valid

4. An intermediate internal representation for the query is created(query tree or query graph)

5. Query execution strategy is developed

6. Query optimizer produces an execution plan

7. Code generator generates the object code

8. Runtime database processor executes the code

• Query processing and query optimization

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 7/35

ICS 424 - 01 (072) Query Processing andOptimization

Processing a Query

• Typical steps in processing a high-level query

1. Query in a high-level query language like SQL

2. Scanning, parsing, and validation

3. Intermediate-form of query like query tree

4. Query optimizer5. Execution plan

6. Query code generator

7. Object-code for the query

8. Run-time database processor

9. Results of query

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 8/35

ICS 424 - 01 (072) Query Processing andOptimization

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 9/35

ICS 424 - 01 (072) Query Processing andOptimization

SQL Queries and Relational Algebra

• SQL query is translated into an equivalent extended  

relational algebra expression --- represented as a query tree

• In order to transform a given query into a query tree, the

query is decomposed into query blocks

• Query block :

• The basic unit that can be translated into the algebraic operators andoptimized.

• A query block contains a single SELECT-FROM-WHERE

expression, as well as GROUP BY and HAVING clause if these are

 part of the block. 

• The query optimizer chooses an execution plan for each

 block

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 10/35

ICS 424 - 01 (072) Query Processing andOptimization

10 

COMPANY Relational Database Schema (1)

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 11/35

ICS 424 - 01 (072) Query Processing andOptimization

11 

COMPANY Relational Database Schema (2)

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 12/35

ICS 424 - 01 (072) Query Processing andOptimization

12 

SQL Queries and Relational Algebra (1)

• Example

SELECT Lname, Fname

FROM EMPLOYEE

WHERE Salary > ( SELECT MAX(Salary)

FROM EMPLOYEE

WHERE Dno = 5 )

• Inner block and outer block

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 13/35

ICS 424 - 01 (072) Query Processing andOptimization

13 

Translating SQL Queries into Relational Algebra

SELECT  LNAME, FNAME

FROM  EMPLOYEE

WHERE  SALARY > ( SELECT  MAX (SALARY)

FROM  EMPLOYEE

WHERE  DNO = 5);

SELECT  MAX (SALARY)

FROM  EMPLOYEE

WHERE  DNO = 5

SELECT  LNAME, FNAME

FROM  EMPLOYEE

WHERE  SALARY > C

πLNAME, FNAME (σSALARY>C(EMPLOYEE))  ℱMAX SALARY (σDNO=5 (EMPLOYEE))

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 14/35

ICS 424 - 01 (072) Query Processing andOptimization

14 

SQL Queries and Relational Algebra (2)

• Uncorrelated nested queries Vs Correlated nested queries

• Example

Retrieve the name of each employee who works on all the projectscontrolled by department number 5.

SELECT FNAME, LNAMEFROM EMPLOYEEWHERE ( (SELECT PNO

FROM WORKS_ONWHERE SSN=ESSN)

CONTAINS(SELECT PNUMBERFROM PROJECTWHERE DNUM=5) )

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 15/35

ICS 424 - 01 (072) Query Processing andOptimization

15 

SQL Queries and Relational Algebra (3)

• Example

For every project located in ‘Stafford’, retrieve the project number,the controlling department number and the department manager’slast name, address and birthdate.

• SQL query:

SELECT P.NUMBER,P.DNUM,E.LNAME, E.ADDRESS, E.BDATE

FROM PROJECT AS P,DEPARTMENT AS D, EMPLOYEE AS E

WHERE P.DNUM=D.DNUMBER AND D.MGRSSN=E.SSN ANDP.PLOCATION=‘STAFFORD’; 

• Relation algebra:

PNUMBER, DNUM, LNAME, ADDRESS, BDATE ((( PLOCATION=‘STAFFORD’(PROJECT))

DNUM=DNUMBER  (DEPARTMENT)) MGRSSN=SSN (EMPLOYEE))

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 16/35

ICS 424 - 01 (072) Query Processing andOptimization

16 

SQL Queries and Relational Algebra (4)

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 17/35

ICS 424 - 01 (072) Query Processing andOptimization

17 

Implementing Basic Query Operations

• An RDBMS must provide implementation(s) for

all the required operations including relational

operators and more

• External sorting

• Sort-merge strategy• Sorting phase

•  Number of file blocks (b)

•  Number of available buffers (nB)

• Runs --- (b / nB)

• Merging phase --- passes

• Degree of merging --- the number of runs that are merged

together in each pass

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 18/35

ICS 424 - 01 (072) Query Processing andOptimization

18 

Algorithms for External Sorting (1)

• External sorting:

• Refers to sorting algorithms that are suitable for large files

of records stored on disk that do not fit entirely in main

memory, such as most database files.

• Sort-Merge strategy:• Starts by sorting small subfiles (runs) of the main file and

then merges the sorted runs, creating larger sorted subfiles

that are merged in turn.

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 19/35

ICS 424 - 01 (072) Query Processing andOptimization

19 

Algorithms

for

External

Sorting (2)

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 20/35

ICS 424 - 01 (072) Query Processing andOptimization

20 

Algorithms for External Sorting (3)

• Analysis

 Number of file blocks = b

 Number of initial runs = nR  

Available buffer space = nB 

Sorting phase: nR  = (b/nB) Degree of merging: dM = Min (nB-1, nR );

 Number of passes: nP = (logdM(nR )) 

 Number of block accesses: (2 * b) + (2 * b * (logdM(nR )))

• Example done in the class

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 21/35

ICS 424 - 01 (072) Query Processing andOptimization

21 

Implementing Basic Query Operations (cont.)

• Estimates of selectivity

• Selectivity is the ratio of the number of tuples that satisfy thecondition to the total number of tuples in the relation.

• SELECT (  ) operator implementation

1. Linear search

2. Binary search3. Using a primary index (or hash key)

4. Using primary index to retrieve multiple records

5. Using clustering index to retrieve multiple records

6. Using a secondary index on an equality comparison

7. Conjunctive selection using an individual index

8. Conjunctive selection using a composite index

9. Conjunctive selection by intersection of record pointers

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 22/35

ICS 424 - 01 (072) Query Processing andOptimization

22 

Implementing Basic Query Operations (cont.)

• JOIN operator implementation

1. Nested-loop join

2. Sort-merge join

3. Hash join

• Partition Hash join• Hybrid hash join

• PROJECT operator implementation

• Set operator implementation

• Implementing Aggregate operators/functions

• Implementing OUTER JOIN

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 23/35

ICS 424 - 01 (072) Query Processing andOptimization 23 

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 24/35

ICS 424 - 01 (072) Query Processing andOptimization 24 

Buffer Space and Join performance

In the nested-loop join, it makes a difference which file is chosen forthe outer loop and which for the inner loop. If EMPLOYEE is used for

the outer loop, each block of EMPLOYEE is read once, and the entireDEPARTMENT file (each of its blocks) is read once for each time weread in ( nB - 2) blocks of the EMPLOYEE file. We get the following: 

Total number of blocks accessed for outer file = bE 

 Number of times ( nB - 2) blocks of outer file are loaded =  bE/ nB  –  2  

Total number of blocks accessed for inner file = bD * bE/ nB  –  2

 

Hence, we get the following total number of block accesses:

 bE + ( bE/ nB  –  2  * bD) = 2000 + ( (2000/5)  * 10) = 6000 blocks

On the other hand, if we use the DEPARTMENT records in the outerloop, by symmetry we get the following total number of blockaccesses:

 bD + ( bD/ nB  –  2  * bE) = 10 + ((10/5)  * 2000) = 4010 blocks

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 25/35

ICS 424 - 01 (072) Query Processing andOptimization 25 

Implementing Basic Query Operations (cont.)

• Combining operations using pipelining

• Temporary files based processing

• Pipelining or stream-based processing

• Example: consider the execution of the following query

list of attributes( ( c1(R)  ( c2 (S))

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 26/35

ICS 424 - 01 (072) Query Processing andOptimization 26 

General Transformation Rules for

Relational Algebra Operations

1. Cascade of

: A conjunctive selection condition can be broken up into a cascade (that is, a sequence) ofindividual  operations: C1 AND C2 AND ….AND Cn (R) ≡  C1 (C2( …(Cn(R))…) 

2. Commutativity of  : The  operation is commutative:

C1(C2(R)) ≡ C2(C1(R))3. Cascade of : In a cascade (sequence) of  operations, all

 but the last one can be ignored

4. Commuting  with : If the selection condition cinvolves only those attributes A1, ..., An in the projectionlist, the two operations can be commuted

• And more …

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 27/35

ICS 424 - 01 (072) Query Processing andOptimization 27 

Heuristic-Based Query Optimization

• Outline of heuristic algebraic optimization algorithm

1. Break up SELECT operations with conjunctive conditions into a

cascade of SELECT operations2. Using the commutativity of SELECT with other operations, moveeach SELECT operation as far down the query tree as is permitted

 by the attributes involved in the select condition

3. Using commutativity and associativity of binary operations,rearrange the leaf nodes of the tree

4. Combine a CARTESIAN PRODUCT operation with asubsequent SELECT operation in the tree into a JOIN operation,if the condition represents a join condition

5. Using the cascading of PROJECT and the commuting ofPROJECT with other operations, break down and move lists of

 projection attributes down the tree as far as possible by creatingnew PROJECT operations as needed

6. Identify sub-trees that represent groups of operations that can beexecuted by a single algorithm

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 28/35

ICS 424 - 01 (072) Query Processing andOptimization 28 

Heuristic-Based Query Optimization:

Example

• Query

"Find the last names of employees born after 1957

who work on a project named ‘Aquarius’."

• SQLSELECT LNAME

FROM EMPLOYEE, WORKS_ON, PROJECT

WHERE PNAME=‘Aquarius’ AND PNUMBER=PNO

AND ESSN=SSN AND BDATE.‘1957-12-31’; 

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 29/35

ICS 424 - 01 (072) Query Processing andOptimization 29 

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 30/35

ICS 424 - 01 (072) Query Processing andOptimization 30 

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 31/35

ICS 424 - 01 (072) Query Processing andOptimization 31 

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 32/35

ICS 424 - 01 (072) Query Processing andOptimization 32 

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 33/35

ICS 424 - 01 (072) Query Processing andOptimization 33 

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 34/35

ICS 424 - 01 (072) Query Processing andOptimization 34 

Overview of Query Optimization in Oracle

• Rule-based query optimization: the optimizer chooses execution plans

 based on heuristically ranked operations.• May be phased out

• Cost-based query optimization: the optimizer examines alternative access paths and operator algorithms and chooses the execution plan with lowestestimate cost.

• The query cost is calculated based on the estimated usage of resources such as

I/O, CPU and memory needed.

• Application developers could specify hints to the ORACLE queryoptimizer.

• application developer might know more information about the data.

• SELECT /*+ ...hint... */ [rest of query] 

• SELECT /*+ index(t1 t1_abc) index(t2 t2_abc) */ COUNT(*)FROM t1, t2WHERE t1.col1 = t2.col1;

8/11/2019 03 Chapter 15 Algorithms for Query Processing Optimization

http://slidepdf.com/reader/full/03-chapter-15-algorithms-for-query-processing-optimization 35/35

ICS 424 01 (072) Query Processing andO ti i ti 35

Summary

• Background review

• Processing a query

• SQL queries and relational algebra

• Implementing basic query operations

• Heuristics-based query optimization

• Overview of query optimization in Oracle