SQL Top-N and Pagination Pattern Maxym Kharchenko

Upload: yuliana-alley

Post on 28-Mar-2015

Page 1: SQL Top-N and Pagination Pattern Maxym Kharchenko

SQL Top-N and Pagination Pattern

Maxym Kharchenko

What is top-N

• Give me the top 10 salaries in the “Sales” dept
• Give me the top 10 best selling books
• Give me the 10 latest orders

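Setup

SQL> @desc cities
 Name                       Null?    Type
 -------------------------- -------- ---------------
 NAME                       NOT NULL VARCHAR2(100)
 STATE                      NOT NULL VARCHAR2(100)
 POPULATION                 NOT NULL NUMBER

Table storage: PCTFREE 99 PCTUSED 1 (99% of each block is left free on insert, so the rows are spread over many blocks)

Data source: http://www.census.gov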

Naïve Top-N

Give me the top 5 cities by population

SELECT name, population
FROM cities
WHERE rownum <= 5
ORDER BY population DESC;

NAME                      Pop
---------------------- ------
Robertsdale city        5,276
Glen Allen town (pt.)     458
Boligee town              328
Riverview town            184
Altoona town (pt.)         30

Statistics 7 consistent gets

Naïve Top-N explained

----------------------------------------------------------------
| Id  | Operation           | Name   | Rows | Bytes | Time     |
----------------------------------------------------------------
|   0 | SELECT STATEMENT    |        |    5 |   110 | 00:00:01 |
|   1 |  SORT ORDER BY      |        |    5 |   110 | 00:00:01 |
|*  2 |   COUNT STOPKEY     |        |      |       |          |
|   3 |    TABLE ACCESS FULL| CITIES |   10 |   220 | 00:00:01 |
----------------------------------------------------------------

Correct top-N query

Oracle 12c and later:

SELECT name, population
FROM cities
ORDER BY population DESC
FETCH FIRST 5 ROWS ONLY;

Oracle 11g and earlier:

SELECT * FROM (
  SELECT name, population
  FROM cities
  ORDER BY population DESC
) WHERE rownum <= 5;

Correct top-N query: Execution

SELECT * FROM (
  SELECT name, population
  FROM cities
  ORDER BY population DESC
) WHERE rownum <= 5;

NAME                        Pop
-------------------- ----------
Los Angeles city      3,792,621
Chicago city (pt.)    2,695,598
Chicago city (pt.)    2,695,598
Chicago city          2,695,598
New York city (pt.)   2,504,700

Statistics 56024 consistent gets

Reading, filtering and sorting

---------------------------------------------------------------------
| Id  | Operation               | Name   | Rows  |TempSpc| Time     |
---------------------------------------------------------------------
|   0 | SELECT STATEMENT        |        |     5 |       | 00:01:58 |
|*  1 |  COUNT STOPKEY          |        |       |       |          |
|   2 |   VIEW                  |        | 56072 |       | 00:01:58 |
|*  3 |    SORT ORDER BY STOPKEY|        | 56072 |  1768K| 00:01:58 |
|   4 |     TABLE ACCESS FULL   | CITIES | 56072 |       | 00:01:54 |
---------------------------------------------------------------------

Proper data structure

CREATE INDEX i_pop ON cities(population);

Index ordered by: Population

--------------------------------------------------------------------
| Id  | Operation                      | Name   | Rows  | Time     |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT               |        |     5 | 00:00:01 |
|*  1 |  COUNT STOPKEY                 |        |       |          |
|   2 |   VIEW                         |        |    10 | 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID | CITIES | 56072 | 00:00:01 |
|   4 |     INDEX RANGE SCAN DESCENDING| I_POP  |    10 | 00:00:01 |
--------------------------------------------------------------------

Statistics 12 consistent gets

Why index works

• Colocation
• Can stop after reading N rows
• No Sort

Index ordered by: Population

CREATE INDEX i_pop ON cities(population);
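
One way to check that a top-N query really gets these benefits is to look at its execution plan. The sketch below assumes the cities table and the i_pop index from the earlier slides, and uses Oracle's EXPLAIN PLAN statement together with DBMS_XPLAN.DISPLAY. With the index in place, one would hope to see INDEX RANGE SCAN DESCENDING under COUNT STOPKEY and no SORT ORDER BY step, although the optimizer is free to choose a different plan.

```sql
-- Sketch only: assumes the CITIES table and the I_POP index shown on the slides.
EXPLAIN PLAN FOR
SELECT *
  FROM (SELECT name, population
          FROM cities
         ORDER BY population DESC)
 WHERE rownum <= 5;

-- Show the plan that was just explained.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

In SQL*Plus, figures such as consistent gets can be displayed with SET AUTOTRACE ON STATISTICS, provided the session has the privileges that autotrace requires.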

More elaborate top-N

Give me the top 5 cities by population in Florida

SELECT * FROM (
  SELECT name, population
  FROM cities
  WHERE state='Florida'
  ORDER BY population DESC
) WHERE rownum <= 5;

NAME                        Pop
-------------------- ----------
Jacksonville city       821,784
Miami city              399,457
Tampa city              335,709
St. Petersburg city     244,769
Orlando city            238,300

Statistics 264 consistent gets

Uncertain nature of filtering

Index ordered by: Population

  WHERE state='Florida'
  ORDER BY population DESC
) WHERE rownum <= 5;

Statistics 264 consistent gets

  WHERE state='Florida'
  ORDER BY population DESC
) WHERE rownum <= 200;

Statistics 19747 consistent gets

Multi column indexes

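CREATE INDEX i_state_pop ON cities(state, population);

[Diagram: index entries grouped by State (AK, AL, AZ, CO, FL, ... MA, WA), with Population listed inside each state.]

• The whole index is ordered by: State, then State + Population
• The whole index is not ordered by: Population
• Within WHERE state='FL', the entries are now ordered by: Population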

Multicolumn indexes

------------------------------------------------------------------------
| Id  | Operation                      | Name        | Rows | Time     |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |             |    5 | 00:00:01 |
|*  1 |  COUNT STOPKEY                 |             |      |          |
|   2 |   VIEW                         |             |   11 | 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID | CITIES      | 1099 | 00:00:01 |
|*  4 |     INDEX RANGE SCAN DESCENDING| I_STATE_POP |   11 | 00:00:01 |
------------------------------------------------------------------------

Predicate Information (identified by operation id):

   1 - filter(ROWNUM<=5)
   4 - access("STATE"='Florida')

Statistics 12 consistent gets

Trips to the table

------------------------------------------------------------------------
| Id  | Operation                      | Name        | Rows | Time     |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |             |    5 | 00:00:01 |
|*  1 |  COUNT STOPKEY                 |             |      |          |
|   2 |   VIEW                         |             |   11 | 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID | CITIES      | 1099 | 00:00:01 |
|*  4 |     INDEX RANGE SCAN DESCENDING| I_STATE_POP |   11 | 00:00:01 |
------------------------------------------------------------------------

Predicate Information (identified by operation id):

   1 - filter(ROWNUM<=5)
   4 - access("STATE"='Florida')

Statistics 12 consistent gets

Index range scan: cost math

Window: 500 records

4-5 logical reads

~ 5-10 logical reads

~ 10-500 logical reads

Covering index

Original index:

CREATE INDEX i_state_pop ON cities(state, population);

Statistics 12 consistent gets

Covering index (name added, so the plan no longer visits the table):

CREATE INDEX i_state_pop_c ON cities(state, population, name);

-------------------------------------------------------------------------
| Id  | Operation                     | Name          | Rows | Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |               |    5 | 00:00:01 |
|*  1 |  COUNT STOPKEY                |               |      |          |
|   2 |   VIEW                        |               |   10 | 00:00:01 |
|*  3 |    INDEX RANGE SCAN DESCENDING| I_STATE_POP_C |  506 | 00:00:01 |
-------------------------------------------------------------------------

Statistics 7 consistent gets

Ideal top-N

• Use the index
• Make the best index
• And read only from the index

Less than ideal top-N

• Effect of query conditions

• Effect of deletes and updates

• Technicalities

Condition better!

CREATE TABLE orders (
  …
  active char(1) NOT NULL CHECK (active IN ('Y', 'N'))
);

Inequality condition:

  WHERE active != 'N'
  ORDER BY order_date DESC
) WHERE rownum <= 10;

Statistics 12345 consistent gets

Equivalent equality condition:

  WHERE active = 'Y'
  ORDER BY order_date DESC
) WHERE rownum <= 10;

Statistics 10 consistent gets
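
The slide shows only the tails of the two queries and does not show the index behind them. A minimal sketch of the full comparison follows, assuming a hypothetical index orders_active_date_idx on (active, order_date) and using SELECT * because the column list of orders is elided in the original. Since active is NOT NULL and restricted to 'Y' and 'N', both conditions return the same rows, but only the equality pins the leading index column to a single value, which lets the scan walk order_date in index order and stop after 10 rows.

```sql
-- Hypothetical index: the name and column list are assumptions, not from the slides.
CREATE INDEX orders_active_date_idx ON orders(active, order_date);

-- Inequality: the leading column is not pinned to one value, so the
-- index order on order_date is unlikely to be usable without a sort.
SELECT *
  FROM (SELECT *
          FROM orders
         WHERE active != 'N'
         ORDER BY order_date DESC)
 WHERE rownum <= 10;

-- Equality: same rows under the CHECK constraint, but the scan can start
-- at the newest active order and stop after 10 rows.
SELECT *
  FROM (SELECT *
          FROM orders
         WHERE active = 'Y'
         ORDER BY order_date DESC)
 WHERE rownum <= 10;
```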

Trade WHERE for ORDER BY

CREATE INDEX t_idx ON t(a, b, c);

SELECT * FROM (
  SELECT * FROM t WHERE a=12 ORDER BY c
) WHERE rownum <= 10;

  WHERE a=12 ORDER BY c            Statistics 1200 consistent gets
  WHERE a=12 ORDER BY b, c         Statistics 12 consistent gets
  WHERE a=12 AND b=0 ORDER BY c    Statistics 12 consistent gets

Tolerate filtering

SELECT * FROM (
  SELECT name, population
  FROM cities
  WHERE state != 'Florida'
  ORDER BY population DESC
) WHERE rownum <= 10;

Statistics 28 consistent gets

Tolerate filtering

--------------------------------------------------------------------
| Id  | Operation                      | Name   | Rows  | Time     |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT               |        |    11 | 00:00:01 |
|*  1 |  COUNT STOPKEY                 |        |       |          |
|   2 |   VIEW                         |        |    11 | 00:00:01 |
|*  3 |    TABLE ACCESS BY INDEX ROWID | CITIES | 55566 | 00:00:01 |
|   4 |     INDEX RANGE SCAN DESCENDING| I_POP  |    12 | 00:00:01 |
--------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=10)
   3 - filter("STATE"<>'Florida')

Updates and Deletes

SQL> @desc cities2
 Name                   Null?    Type
 ---------------------- -------- ----------------
 NAME                   NOT NULL VARCHAR2(100)
 STATE                  NOT NULL VARCHAR2(100)
 POPULATION             NOT NULL NUMBER
 BUDGET_SURPLUS         NOT NULL VARCHAR2(1)

CREATE INDEX i2_pop ON cities2(budget_surplus, population, name);

Updates and Deletes

SELECT * FROM (
  SELECT name, population
  FROM cities2
  WHERE budget_surplus='Y'
  ORDER BY population DESC
) WHERE rownum <= 5;

-------------------------------------------------------------------
| Id  | Operation                     | Name   | Rows  | Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT              |        |     5 | 00:00:01 |
|*  1 |  COUNT STOPKEY                |        |       |          |
|   2 |   VIEW                        |        |    12 | 00:00:01 |
|*  3 |    INDEX RANGE SCAN DESCENDING| I2_POP | 56067 | 00:00:01 |
-------------------------------------------------------------------

Statistics 7 consistent gets

Updates and Deletes

UPDATE cities2 SET budget_surplus='N'
WHERE rowid IN (
  SELECT * FROM (
    SELECT rowid FROM cities2 ORDER BY population DESC
  ) WHERE rownum <= 200
);

The same top-5 query, run again after the update:

-------------------------------------------------------------------
| Id  | Operation                     | Name   | Rows  | Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT              |        |     5 | 00:00:01 |
|*  1 |  COUNT STOPKEY                |        |       |          |
|   2 |   VIEW                        |        |    12 | 00:00:01 |
|*  3 |    INDEX RANGE SCAN DESCENDING| I2_POP | 56067 | 00:00:01 |
-------------------------------------------------------------------

Statistics 207 consistent gets

Updates and Deletes

ALTER TABLE cities2 ADD (version number default 0 NOT NULL);

CREATE INDEX i2_vpop ON cities2(budget_surplus, version, population);

UPDATE cities2 SET version=1
WHERE budget_surplus='Y' AND version=0;

[Diagram: index on Budget_surplus, Version, Population, showing the 'Y' entries under version 0 and version 1.]

Updates and Deletes

SELECT * FROM (
  SELECT name, population
  FROM cities2
  WHERE budget_surplus='Y' AND version=1
  ORDER BY population DESC
) WHERE rownum <= 5;

-------------------------------------------------------------------
| Id  | Operation                     | Name    | Rows | Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT              |         |    1 | 00:00:01 |
|*  1 |  COUNT STOPKEY                |         |      |          |
|   2 |   VIEW                        |         |    1 | 00:00:01 |
|*  3 |    INDEX RANGE SCAN DESCENDING| I2_VPOP |    1 | 00:00:01 |
-------------------------------------------------------------------

Statistics 9 consistent gets

Pagination

Page 1 (rows 1-10):

SELECT * FROM (
  SELECT name, population
  FROM cities
  WHERE state='Florida'
  ORDER BY population DESC
) WHERE rownum <= 10;

Page 2 (rows 11-20), with rn assigned only after the rows have been ordered:

SELECT * FROM (
  SELECT name, population, rownum AS rn FROM (
    SELECT name, population
    FROM cities
    WHERE state='Florida'
    ORDER BY population DESC
  ) WHERE rownum <= 20
) WHERE rn > 10;
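
On Oracle 12c and later, the row limiting clause shown earlier for top-N also accepts an OFFSET, which gives a shorter way to ask for the second page. This is a minimal sketch against the same cities table. It expresses the same request as the nested ROWNUM query, and it should not be assumed to be cheaper, since the skipped rows still have to be read.

```sql
-- Rows 11-20 of Florida cities by population (Oracle 12c and later).
SELECT name, population
  FROM cities
 WHERE state = 'Florida'
 ORDER BY population DESC
OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY;
```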

Dumb Pagination

) WHERE rownum <= 20) WHERE rn > 10;

Statistics 22 consistent gets

) WHERE rownum <= 30) WHERE rn > 20;

Statistics 32 consistent gets

Smart pagination

Smart: continue below the last population value already shown.

SELECT * FROM (
  SELECT name, population
  FROM cities
  WHERE state='Florida'
    AND population < 154750
  ORDER BY population DESC
) WHERE rownum <= 10;

Statistics 12 consistent gets

Dumb: read the first 20 rows and discard the first 10.

SELECT * FROM (
  SELECT name, population, rownum AS rn FROM (
    SELECT name, population
    FROM cities
    WHERE state='Florida'
    ORDER BY population DESC
  ) WHERE rownum <= 20
) WHERE rn > 10;

Statistics 22 consistent gets
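
The smart query hard-codes 154750, presumably the population of the last city on the previous page. In an application that value would normally be passed as a bind variable, and because populations can repeat, a tie-breaker column is needed so that rows with the same population are neither skipped nor shown twice. The sketch below is one way to write this. The bind names :last_pop and :last_name are hypothetical, and it assumes that (population, name) is unique within a state, which the sample output with repeated rows suggests may not hold, so a unique key or rowid may be a safer tie-breaker.

```sql
-- Sketch of the next page after the row (:last_pop, :last_name).
-- Assumes a covering index such as i_state_pop_c (state, population, name).
SELECT *
  FROM (SELECT name, population
          FROM cities
         WHERE state = 'Florida'
           AND (population < :last_pop
                OR (population = :last_pop AND name < :last_name))
         ORDER BY population DESC, name DESC)
 WHERE rownum <= 10;
```

Whether the optimizer still chooses a short index range scan with the OR condition in place is worth checking in the execution plan before relying on it.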

Top-N with joins

SELECT * FROM (
  SELECT c.name as city, c.population, s.capital
  FROM cities c, states s
  WHERE c.state_id = s.id
    AND c.state='Florida'
  ORDER BY c.population DESC
) WHERE rownum <= 5
/

Use Nested Loops! Build indexes like this!

Driving table:
  state        Filter
  population   Order By
  state_id     Join
  name         Select

Joined to table:
  id           Join
  capital      Select
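
The column roles above can be turned into index definitions roughly as follows. This is a sketch under assumptions: the index names i_c and i_s are taken from the execution plans in this deck, but their column lists are not shown there, so the lists below are inferred from the filter, order by, join, select ordering. The LEADING and USE_NL hints are one way to ask for nested loops driven from cities, and the optimizer may pick that plan without them.

```sql
-- Driving table: filter column, order-by column, join column, selected column.
-- (Column lists are an assumption inferred from the diagram.)
CREATE INDEX i_c ON cities(state, population, state_id, name);

-- Joined-to table: join column, then the selected column.
CREATE INDEX i_s ON states(id, capital);

-- Same query as on the slide, with hints requesting nested loops from cities.
SELECT *
  FROM (SELECT /*+ LEADING(c) USE_NL(s) */
               c.name AS city, c.population, s.capital
          FROM cities c, states s
         WHERE c.state_id = s.id
           AND c.state = 'Florida'
         ORDER BY c.population DESC)
 WHERE rownum <= 5;
```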

Top-N with joins: Good

------------------------------------------------------
| Id  | Operation           | Name | Rows | Time     |
------------------------------------------------------
|   0 | SELECT STATEMENT    |      |    5 | 00:00:13 |
|*  1 |  COUNT STOPKEY      |      |      |          |
|   2 |   VIEW              |      |   10 | 00:00:13 |
|   3 |    NESTED LOOPS     |      |   10 | 00:00:13 |
|*  4 |     INDEX RANGE SCAN| I_C  |  506 | 00:00:07 |
|*  5 |     INDEX RANGE SCAN| I_S  |    1 | 00:00:01 |
------------------------------------------------------

Top-N with joins: Bad

----------------------------------------------------------
| Id  | Operation               | Name | Rows | Time     |
----------------------------------------------------------
|   0 | SELECT STATEMENT        |      |    5 | 00:00:07 |
|*  1 |  COUNT STOPKEY          |      |      |          |
|   2 |   VIEW                  |      |   10 | 00:00:07 |
|*  3 |    SORT ORDER BY STOPKEY|      |   10 | 00:00:07 |
|*  4 |     HASH JOIN           |      |   10 | 00:00:07 |
|*  5 |      INDEX RANGE SCAN   | I_C  |  506 | 00:00:07 |
|*  6 |      INDEX RANGE SCAN   | I_S  |    1 | 00:00:01 |
----------------------------------------------------------

Gotchas?

TMI: “Too many indexes”

Thank you!

Query conditions

[Diagram: index on State and Population (AK, AL, AZ, CO, FL, ... MA, WA), marking the entries matched by each condition below.]

• WHERE state = 'Florida'
• WHERE state != 'Florida'

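Watch out for DESC/ASC

CREATE INDEX i_s_pop ON cities(state, population);

[Diagram: index entries in state order (AK, AL, ... FL, GA, HI, ... MA, WA).]

Same direction on both columns, matches the index order (NO SORT):

  WHERE state >= 'Florida'
  ORDER BY state, population
) WHERE rownum <= 10

Statistics 12 consistent gets

Mixed ASC and DESC, does not match the index order (+ SORT):

  WHERE state >= 'Florida'
  ORDER BY state, population DESC
) WHERE rownum <= 10

Statistics 107408 consistent gets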
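
If the mixed ordering is what the application really needs, one option is an index whose column directions match the ORDER BY. Oracle accepts DESC on individual index columns. The sketch below uses a hypothetical index name, i_s_pop_desc, and assumes a select list of name and population, because the slide shows only the tail of the query. Whether the sort actually disappears should be confirmed in the execution plan.

```sql
-- Hypothetical index: directions match ORDER BY state, population DESC.
CREATE INDEX i_s_pop_desc ON cities(state, population DESC);

-- The select list is an assumption; the slide elides it.
SELECT *
  FROM (SELECT name, population
          FROM cities
         WHERE state >= 'Florida'
         ORDER BY state, population DESC)
 WHERE rownum <= 10;
```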