query tuning. types of nested queries uncorrelated subqueries with aggregates in the nested query...

Query Tuning

Upload: alyson-spencer

Post on 18-Jan-2016

Types of Nested Queries

• Uncorrelated subqueries with aggregates in the nested query

SELECT ssnum FROM employee WHERE salary > (select avg(salary) from employee)

• Uncorrelated subqueries without aggregate in the nested query

SELECT ssnum FROM employee WHERE dept in (select dept from tech)

• Correlated subqueries with aggregates

SELECT ssnum
FROM employee e1
WHERE salary = (SELECT avg(e2.salary)
                FROM employee e2, tech
                WHERE e2.dept = e1.dept
                  AND e2.dept = tech.dept)

• Correlated subqueries without aggregates (unusual)

Rewriting of Uncorrelated Subqueries without Aggregates

1. Combine the arguments of the two FROM clauses

2. AND together the WHERE clauses, replacing IN by =

3. Retain the SELECT clause from the outer block

SELECT ssnum FROM employee WHERE dept in (select dept from tech)

becomes

SELECT ssnum
FROM employee, tech
WHERE employee.dept = tech.dept
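The three rewriting steps can be checked on a small example. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in DBMS; the column types and sample rows are hypothetical, and dept is assumed to be unique in tech so that the join cannot produce duplicates.

```python
import sqlite3


def compare_in_and_join():
    """Run the nested IN query and its flattened join on the same data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employee (ssnum INTEGER PRIMARY KEY, dept TEXT);
        -- assumption: dept is unique in tech, so the join cannot duplicate rows
        CREATE TABLE tech (dept TEXT PRIMARY KEY);
        INSERT INTO employee VALUES (1, 'db'), (2, 'os'), (3, 'sales'), (4, 'db');
        INSERT INTO tech VALUES ('db'), ('os');
    """)
    nested = conn.execute(
        "SELECT ssnum FROM employee "
        "WHERE dept IN (SELECT dept FROM tech) ORDER BY ssnum"
    ).fetchall()
    flattened = conn.execute(
        "SELECT ssnum FROM employee, tech "
        "WHERE employee.dept = tech.dept ORDER BY ssnum"
    ).fetchall()
    conn.close()
    return nested, flattened


nested, flattened = compare_in_and_join()
print(nested == flattened)
```

On this data both forms select the same employees; the equivalence relies on the uniqueness assumption stated above.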

Rewriting of Uncorrelated Subqueries without Aggregates

• Potential problem with duplicates
– SELECT avg(salary)
  FROM employee
  WHERE manager in (select manager from tech)
– SELECT avg(salary)
  FROM employee, tech
  WHERE employee.manager = tech.manager

• The rewritten query may include an employee record several times if that employee’s manager manages several departments.

• The solution is to create a temporary table (using DISTINCT to eliminate duplicates).
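The duplicate problem and the DISTINCT fix can be reproduced on a few rows. This is a sketch under assumptions: SQLite via Python's sqlite3 stands in for the DBMS, and the sample rows and the temporary table name tech_managers are hypothetical.

```python
import sqlite3


def average_salaries():
    """Return avg(salary) from the nested query, the naive join, and the DISTINCT fix."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employee (ssnum INTEGER PRIMARY KEY, manager TEXT, salary INTEGER);
        CREATE TABLE tech (dept TEXT PRIMARY KEY, manager TEXT);
        INSERT INTO employee VALUES (1, 'ann', 100), (2, 'bob', 200), (3, 'cid', 900);
        -- ann manages two technical departments, so she appears twice in tech
        INSERT INTO tech VALUES ('db', 'ann'), ('os', 'ann'), ('net', 'bob');
    """)
    nested = conn.execute(
        "SELECT avg(salary) FROM employee "
        "WHERE manager IN (SELECT manager FROM tech)"
    ).fetchone()[0]
    naive_join = conn.execute(
        "SELECT avg(salary) FROM employee, tech "
        "WHERE employee.manager = tech.manager"
    ).fetchone()[0]
    # Temporary table (hypothetical name) holding each technical manager once.
    conn.execute(
        "CREATE TEMP TABLE tech_managers AS SELECT DISTINCT manager FROM tech"
    )
    distinct_join = conn.execute(
        "SELECT avg(salary) FROM employee, tech_managers "
        "WHERE employee.manager = tech_managers.manager"
    ).fetchone()[0]
    conn.close()
    return nested, naive_join, distinct_join


nested_avg, naive_avg, distinct_avg = average_salaries()
print(nested_avg, naive_avg, distinct_avg)
```

In the naive join, employee 1 is counted twice because ann appears twice in tech, which pulls the average down; joining against the DISTINCT temporary table restores the nested query's answer.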

Rewriting of Correlated Subqueries

• Query: find the employees of tech departments who earn exactly the average salary in their department

SELECT ssnum
FROM employee e1
WHERE salary = (SELECT avg(e2.salary)
                FROM employee e2, tech
                WHERE e2.dept = e1.dept
                  AND e2.dept = tech.dept);

Rewriting of Correlated Subqueries

• INSERT INTO temp
  SELECT avg(salary) as avsalary, employee.dept
  FROM employee, tech
  WHERE employee.dept = tech.dept
  GROUP BY employee.dept;

• SELECT ssnum
  FROM employee, temp
  WHERE salary = avsalary
    AND employee.dept = temp.dept;
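The correlated query and its two-step rewriting can be compared side by side. This is a minimal sketch, assuming SQLite via Python's sqlite3; the sample rows are hypothetical, and the temporary table is named dept_avg here (the slides call it temp) and is created with CREATE TABLE ... AS rather than INSERT INTO.

```python
import sqlite3


def average_earners():
    """Return ssnums from the correlated query and from the temp-table rewrite."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employee (ssnum INTEGER PRIMARY KEY, dept TEXT, salary INTEGER);
        CREATE TABLE tech (dept TEXT PRIMARY KEY);
        INSERT INTO employee VALUES
            (1, 'db', 100), (2, 'db', 200), (3, 'db', 150),
            (4, 'os', 300), (5, 'sales', 500);
        INSERT INTO tech VALUES ('db'), ('os');
    """)
    correlated = conn.execute("""
        SELECT ssnum FROM employee e1
        WHERE salary = (SELECT avg(e2.salary)
                        FROM employee e2, tech
                        WHERE e2.dept = e1.dept AND e2.dept = tech.dept)
        ORDER BY ssnum
    """).fetchall()
    # Step 1: one average per technical department.
    conn.execute("""
        CREATE TEMP TABLE dept_avg AS
        SELECT avg(salary) AS avsalary, employee.dept AS dept
        FROM employee, tech
        WHERE employee.dept = tech.dept
        GROUP BY employee.dept
    """)
    # Step 2: join employees against the precomputed averages.
    rewritten = conn.execute("""
        SELECT ssnum FROM employee, dept_avg
        WHERE salary = avsalary AND employee.dept = dept_avg.dept
        ORDER BY ssnum
    """).fetchall()
    conn.close()
    return correlated, rewritten


correlated, rewritten = average_earners()
print(correlated, rewritten)
```

With avg, an employee outside the technical departments is compared against NULL in the correlated form and has no matching row in the rewritten form, so both queries reject that employee.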

Rewriting of Correlated Subqueries

• Query: find employees of technical departments whose number of friends equals the number of employees in their department.

SELECT ssnum
FROM employee e1
WHERE numfriends = (SELECT COUNT(e2.ssnum)
                    FROM employee e2, tech
                    WHERE e2.dept = tech.dept
                      AND e2.dept = e1.dept);

Rewriting of Correlated Subqueries

• INSERT INTO temp
  SELECT COUNT(ssnum) as numcolleagues, employee.dept
  FROM employee, tech
  WHERE employee.dept = tech.dept
  GROUP BY employee.dept;

• SELECT ssnum
  FROM employee, temp
  WHERE numfriends = numcolleagues
    AND employee.dept = temp.dept;

• Can you spot the infamous COUNT bug?

The Infamous COUNT Bug

• Let us consider Helene who is not in a technical department.

• In the original query, Helene’s number of friends would be compared to the count of an empty set, which is 0. If Helene has no friends, she would survive the selection.

• In the transformed query, Helene’s record would not appear in the result: her department has no row in the temporary table because it is not a technical department, so the join drops her.

• This is a limitation of the correlated subquery rewriting technique when COUNT is involved.
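The Helene case can be reproduced directly. This is a sketch under assumptions: SQLite via Python's sqlite3, hypothetical sample rows, an added name column for readability, and a temporary table named dept_count standing in for the slides' temp.

```python
import sqlite3


def count_bug():
    """Return names selected by the correlated COUNT query and by its rewrite."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employee (
            ssnum INTEGER PRIMARY KEY, name TEXT, dept TEXT, numfriends INTEGER);
        CREATE TABLE tech (dept TEXT PRIMARY KEY);
        -- helene works in a non-technical department and has no friends
        INSERT INTO employee VALUES
            (1, 'alice', 'db', 2), (2, 'bob', 'db', 5), (3, 'helene', 'sales', 0);
        INSERT INTO tech VALUES ('db');
    """)
    correlated = conn.execute("""
        SELECT name FROM employee e1
        WHERE numfriends = (SELECT COUNT(e2.ssnum)
                            FROM employee e2, tech
                            WHERE e2.dept = tech.dept AND e2.dept = e1.dept)
        ORDER BY name
    """).fetchall()
    conn.execute("""
        CREATE TEMP TABLE dept_count AS
        SELECT COUNT(ssnum) AS numcolleagues, employee.dept AS dept
        FROM employee, tech
        WHERE employee.dept = tech.dept
        GROUP BY employee.dept
    """)
    rewritten = conn.execute("""
        SELECT name FROM employee, dept_count
        WHERE numfriends = numcolleagues AND employee.dept = dept_count.dept
        ORDER BY name
    """).fetchall()
    conn.close()
    return correlated, rewritten


correlated, rewritten = count_bug()
print(correlated, rewritten)
```

The correlated form keeps helene because COUNT over an empty set is 0, which equals her number of friends; the rewritten form loses her because there is no 'sales' row to join with.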

Avoid Complicated Correlation Subqueries

• Search all of e2 for each e1 record!

SELECT ssnum
FROM Employee e1
WHERE salary = (SELECT MAX(salary)
                FROM Employee e2
                WHERE e2.dept = e1.dept)

becomes

SELECT MAX(salary) as bigsalary, dept INTO Temp
FROM Employee
GROUP BY dept

SELECT ssnum
FROM Employee, Temp
WHERE salary = bigsalary
  AND Employee.dept = Temp.dept

Avoid Complicated Correlation Subqueries

• SQL Server 2000 does a good job at handling correlated subqueries (a hash join is used as opposed to a nested loop between query blocks).
– The techniques implemented in SQL Server 2000 are described in “Orthogonal Optimization of Subqueries and Aggregates” by C. Galindo-Legaria and M. Joshi, SIGMOD 2001.

[Figure: throughput improvement (percent, axis from -10 to 80) for the correlated subquery experiment on SQL Server 2000, Oracle 8i, and DB2 V7.1; two bars are off the scale, labeled > 10000 and > 1000.]

Avoid Unnecessary Temp Tables

• Creating a temp table causes an update to the catalog
• Cannot use any index on the original table, e.g. an index on dept

SELECT * INTO Temp
FROM Employee
WHERE salary > 40000

SELECT ssnum
FROM Temp
WHERE Temp.dept = 'information systems'

should be written as

SELECT ssnum
FROM Employee
WHERE Employee.dept = 'information systems'
  AND salary > 40000
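The lost-index effect can be observed in a query plan. This is a sketch under assumptions: SQLite via Python's sqlite3 stands in for the DBMS of the slides, the index name idx_employee_dept, the temporary table name high_paid, and the sample rows are hypothetical, and plan text differs between systems and versions, so only the presence of the index name is checked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (ssnum INTEGER PRIMARY KEY, dept TEXT, salary INTEGER);
    CREATE INDEX idx_employee_dept ON employee(dept);
    INSERT INTO employee VALUES
        (1, 'information systems', 50000),
        (2, 'information systems', 30000),
        (3, 'sales', 60000);
    -- The copy carries the rows but none of the indexes of employee.
    CREATE TEMP TABLE high_paid AS SELECT * FROM employee WHERE salary > 40000;
""")

VIA_TEMP = "SELECT ssnum FROM high_paid WHERE dept = 'information systems'"
DIRECT = ("SELECT ssnum FROM employee "
          "WHERE dept = 'information systems' AND salary > 40000")


def plan(sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


def uses_dept_index(sql):
    """True if the plan for sql mentions the index on employee.dept."""
    return any("idx_employee_dept" in detail for detail in plan(sql))


print(conn.execute(VIA_TEMP).fetchall(), uses_dept_index(VIA_TEMP))
print(conn.execute(DIRECT).fetchall(), uses_dept_index(DIRECT))
```

Both queries return the same employee, but only the direct query can be answered through the index on dept; the temporary table has to be scanned.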

Join Conditions

• It is a good idea to express join conditions on clustering indexes.
– No sorting for sort-merge.
– Speed up for multipoint access using an indexed nested loop.
• It is a good idea to express join conditions on numerical attributes rather than on string attributes.

SELECT Employee.ssnum
FROM Employee, Student
WHERE Employee.name = Student.name

SELECT Employee.ssnum
FROM Employee, Student
WHERE Employee.ssnum = Student.ssnum

Join on Clustering and Integer Attributes

• Employee is clustered on ssnum
• ssnum is an integer

SELECT Employee.ssnum
FROM Employee, Student
WHERE Employee.name = Student.name

SELECT Employee.ssnum
FROM Employee, Student
WHERE Employee.ssnum = Student.ssnum

Avoid HAVING when WHERE is enough

• With HAVING on a non-aggregate condition, the system may first perform the grouping for all departments!

• HAVING should be reserved for aggregate properties of the groups.
– SELECT avg(salary) as avgsalary, dept
  FROM employee
  GROUP BY dept
  HAVING count(ssnum) > 100;

SELECT AVG(salary) as avgsalary, dept
FROM Employee
GROUP BY dept
HAVING dept = 'information systems'

SELECT AVG(salary) as avgsalary
FROM Employee
WHERE dept = 'information systems'
GROUP BY dept
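The HAVING and WHERE forms can be checked for equal answers on a small table. This is a minimal sketch, assuming SQLite via Python's sqlite3 and hypothetical sample rows; it shows only that the results agree, not how a given optimizer executes either form, and the aggregate threshold is lowered from 100 to 1 to fit the tiny data set.

```python
import sqlite3


def department_averages():
    """Return results of the HAVING form, the WHERE form, and an aggregate HAVING."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employee (ssnum INTEGER PRIMARY KEY, dept TEXT, salary INTEGER);
        INSERT INTO employee VALUES
            (1, 'information systems', 100),
            (2, 'information systems', 200),
            (3, 'sales', 300);
    """)
    # Filter applied after grouping: every department is grouped first.
    with_having = conn.execute("""
        SELECT AVG(salary) AS avgsalary, dept FROM employee
        GROUP BY dept HAVING dept = 'information systems'
    """).fetchall()
    # Filter applied before grouping: only matching rows are grouped.
    with_where = conn.execute("""
        SELECT AVG(salary) AS avgsalary FROM employee
        WHERE dept = 'information systems' GROUP BY dept
    """).fetchall()
    # A legitimate use of HAVING: the condition is on an aggregate.
    large_depts = conn.execute("""
        SELECT AVG(salary) AS avgsalary, dept FROM employee
        GROUP BY dept HAVING COUNT(ssnum) > 1
    """).fetchall()
    conn.close()
    return with_having, with_where, large_depts


with_having, with_where, large_depts = department_averages()
print(with_having, with_where, large_depts)
```

Since the condition on dept does not depend on any aggregate, moving it into WHERE leaves the answer unchanged while letting the system discard other departments before grouping.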

summary