apollo: automatic detection and diagnosis of performance ... · automatically find sql queries...

44
APOLLO AUTOMATIC DETECTION AND DIAGNOSIS OF PERFORMANCE REGRESSIONS IN DATABASE SYSTEMS Jinho Jung , Hong Hu, Joy Arulraj, Taesoo Kim, Woonhak Kang* *

Upload: others

Post on 08-Oct-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

APOLLOAUTOMATIC DETECTION AND DIAGNOSIS OF

PERFORMANCE REGRESSIONS IN DATABASE SYSTEMS

Jinho Jung, Hong Hu, Joy Arulraj, Taesoo Kim, Woonhak Kang*

*

Page 2: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

APOLLO

• Holistic toolchain for debugging DBMS

2

1

JINHO JUNG ([email protected])

AUTOMATICALLY FIND SQL QUERIES EXHIBITINGPERFORMANCE REGRESSIONS

2 AUTOMATICALLY DIAGNOSE THE ROOT CAUSE OFPERFORMANCE REGRESSIONS

Page 3: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

MOTIVATION: DBMS COMPLEXITY

3

6.1

26.4

47.7

1.4 4.48.7

0

10

20

30

40

50

60

2000 2010 Present

Release Year

PostgreSQL SQLite

7x increase

CodeSize(MB)

JINHO JUNG ([email protected])

Page 4: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

MOTIVATION: PERFORMANCE REGRESSIONS

JINHO JUNG ([email protected]) 4

CHALLENGING TO BUILD SYSTEM WITH PREDICTABLE PERFORMANCE

Page 5: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

MOTIVATION: PERFORMANCE REGRESSIONS

• Scenario: User upgrades a DBMS installation▫ Query suddenly takes 10 times longer to execute▫ Due to unexpected interactions between different components▫ Refer to this behavior as a performance regression

• Performance regression can hurt user productivity▫ Can easily convert an interactive query to an overnight one

JINHO JUNG ([email protected]) 5

CHALLENGING TO BUILD SYSTEM WITH PREDICTABLE PERFORMANCE

Page 6: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

MOTIVATION: PERFORMANCE REGRESSIONS

6

SELECT R0.S_DIST_06FROM PUBLIC.STOCK AS R0WHERE (R0.S_W_ID < CAST(LEAST(0, 1) AS INT8))

JINHO JUNG ([email protected])

> 10,000x slowdown

LATEST VERSIONOF POSTGRESQL

• Due to a recent optimizer update▫ New policy for choosing the scan algorithm ▫ Resulted in over-estimating the number of rows in the table▫ Earlier version: Fast bitmap scan▫ Latest version: Slow sequential scan

Page 7: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

MOTIVATION: DETECTING REGRESSIONS

7

SELECT NO FROM ORDER AS R0WHERE EXISTS (SELECT CNT FROM SALES AS R1WHERE EXISTS (SELECT ID FROM HISTORY AS R2WHERE (R0.INFO IS NOT NULL));

Query runsslower on

latest version

SELECT NO FROM ORDER AS R0WHERE EXISTS (

SELECT CNT FROM SALES AS R1WHERE EXISTS (

SELECT ID FROM HISTORY AS R2WHERE (R0.INFO IS NOT NULL));

JINHO JUNG ([email protected])

1 HOW TO DISCOVER QUERIES EXHIBITING REGRESSIONS?

Page 8: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

MOTIVATION: REPORTING REGRESSIONS

8JINHO JUNG ([email protected])

SELECT NO FROM ORDER AS R0WHERE EXISTS (SELECT CNT FROM SALES AS R1WHERE EXISTS (SELECT ID FROM HISTORY AS R2WHERE (R0.INFO IS NOT NULL));

Query runsslower on

latest version

SELECT NO FROM ORDER AS R0WHERE EXISTS (

SELECT CNT FROM SALES AS R1WHERE EXISTS (

SELECT ID FROM HISTORY AS R2WHERE (R0.INFO IS NOT NULL));

2 HOW TO SIMPLIFY QUERIES FOR REPORTING REGRESSION?

Page 9: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

MOTIVATION: DIAGNOSING REGRESSIONS

9JINHO JUNG ([email protected])

SELECT NO FROM ORDER AS R0WHERE EXISTS (SELECT CNT FROM SALES AS R1WHERE EXISTS (SELECT ID FROM HISTORY AS R2WHERE (R0.INFO IS NOT NULL));

Query runsslower on

latest version

SELECT NO FROM ORDER AS R0WHERE EXISTS (

SELECT CNT FROM SALES AS R1WHERE EXISTS (

SELECT ID FROM HISTORY AS R2WHERE (R0.INFO IS NOT NULL));

3 HOW TO DIAGNOSE THE ROOT CAUSE OF THE REGRESSION?

Page 10: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

APOLLO TOOLCHAIN

10

OLDVERSION

NEWVERSION

SQLFUZZ SQLMIN SQLDEBUG

APOLLO TOOLCHAIN BUG REPORTS

- Query- Commit- File list- Function

JINHO JUNG ([email protected])

1 HOW TO DISCOVER QUERIES EXHIBITING REGRESSIONS?

SQLFUZZ: FEEDBACK-DRIVEN FUZZING

Page 11: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

APOLLO TOOLCHAIN

11JINHO JUNG ([email protected])

OLDVERSION

NEWVERSION

SQLFUZZ SQLMIN SQLDEBUG

APOLLO TOOLCHAIN BUG REPORTS

- Query- Commit- File list- Function

2 HOW TO SIMPLIFY QUERIES FOR REPORTING REGRESSION?

SQLMIN: BI-DIRECTIONAL QUERY REDUCTION ALGORITHMS

Page 12: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

APOLLO TOOLCHAIN

12JINHO JUNG ([email protected])

OLDVERSION

NEWVERSION

SQLFUZZ SQLMIN SQLDEBUG

APOLLO TOOLCHAIN BUG REPORTS

- Query- Commit- File list- Function

3 HOW TO DIAGNOSE THE ROOT CAUSE OF THE REGRESSION?

SQLDEBUG: STATISTICAL DEBUGGING + COMMIT BISECTION

Page 13: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

TALK OVERVIEW

13

OLDVERSION

NEWVERSION

SQLFUZZ SQLMIN SQLDEBUG

APOLLO TOOLCHAIN BUG REPORTS

- Query- Commit- File list- Function

JINHO JUNG ([email protected])

Page 14: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#1: SQLFUZZ — DETECTING REGRESSIONS

14

OLD VERSION

NEWVERSION

QueryGenerator

QueryExecutor

BugValidator

SQLFuzz

Randomqueries

Candidatequeries

Queriesexhibiting

performanceregression

Update SQL grammarprobability table

1 2 3

JINHO JUNG ([email protected])

Page 15: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#1: SQLFUZZ — DETECTING REGRESSIONS

15

QueryGenerator

Retrieveschema

SQL grammarprobability table

Validqueries

Checkcomplexity

Queriesfor fuzzing

SELECT 0.3 LEFT JOIN 0.3

LIMIT 0.2 CAST 0.2

JINHO JUNG ([email protected])

1 QUERY GENERATOR: RANDOM QUERY GENERATION

Page 16: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#1: SQLFUZZ — DETECTING REGRESSIONS

16

OLDVERSION

NEWVERSION

QueryExecutor

FoundRegression?

SELECT R0.S_DIST_06 FROM PUBLIC.STOCK AS R0WHERE (R0.S_W_ID <CAST (LEAST(0, 1) AS INT8))

Update table

CASE LEFT JOIN

LIMIT CAST +0.1

JINHO JUNG ([email protected])

2 QUERY EXECUTOR: FEEDBACK-DRIVEN FUZZING

SQL grammar probability table

Page 17: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#1: SQLFUZZ — DETECTING REGRESSIONS

17

1 Non-deterministic behavior?

2 Non-executed plan?

3 Usage of catalog statistics?

4 Enough memory?

5 Limit statement?

6 Query is too complex?

7 …

Developers

Updated filtering rules

RegressionQuery

Filtering rules

Report

JINHO JUNG ([email protected])

3 REGRESSION VALIDATOR: REDUCING FALSE POSITIVES

Page 18: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

TALK OVERVIEW

18

OLDVERSION

NEWVERSION

SQLFUZZ SQLMIN SQLDEBUG

APOLLO TOOLCHAIN BUG REPORTS

- Query- Commit- File list- Function

JINHO JUNG ([email protected])

Page 19: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#2: SQLMIN — REPORTING REGRESSIONS

• Bottom-up Query Reduction▫ Extract valid sub-query

• Top-down Query Reduction▫ Iteratively removes unnecessary expressions

19JINHO JUNG ([email protected])

Page 20: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#2: SQLMIN — REPORTING REGRESSIONS

20

SELECT S1.C2FROM (SELECTCASE WHEN EXISTS (

SELECT S0.C0FROM ORDER AS R1WHERE ((S0.C0 = 10) AND (S0.C1 IS NULL))

) THEN S0.C0 END AS C2,FROM (

SELECT R0.I_PRICE AS C0, R0.I_DATA AS C1,(SELECT ID FROM ITEM) AS C2

FROM ITEM AS R0WHERE R0.PRICE IS NOT NULL

OR (R0.PRICE IS NOT S1.C2)LIMIT 1000) AS S0) AS S1;

JINHO JUNG ([email protected])

Page 21: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#2: SQLMIN — REPORTING REGRESSIONS

21

BOTTOM-UPREDUCTIONEXTRACT SUB-QUERY

Removedependencies

JINHO JUNG ([email protected])

SELECT S1.C2FROM (SELECTCASE WHEN EXISTS (

SELECT S0.C0FROM ORDER AS R1WHERE ((S0.C0 = 10) AND (S0.C1 IS NULL))

) THEN S0.C0 END AS C2,FROM (

SELECT R0.I_PRICE AS C0, R0.I_DATA AS C1,(SELECT ID FROM ITEM) AS C2

FROM ITEM AS R0WHERE R0.PRICE IS NOT NULL

OR (R0.PRICE IS NOT S1.C2)LIMIT 1000) AS S0) AS S1;

Page 22: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#2: SQLMIN — REPORTING REGRESSIONS

22

Remove condition

Remove columnsRemove sub-queries

Remove clauseJINHO JUNG ([email protected])

SELECT S1.C2FROM (SELECTCASE WHEN EXISTS (

SELECT S0.C0FROM ORDER AS R1WHERE ((S0.C0 = 10) AND (S0.C1 IS NULL))

) THEN S0.C0 END AS C2,FROM (

SELECT R0.I_PRICE AS C0, R0.I_DATA AS C1,(SELECT ID FROM ITEM) AS C2

FROM ITEM AS R0WHERE R0.PRICE IS NOT NULL

OR (R0.PRICE IS NOT S1.C2)LIMIT 1000) AS S0) AS S1;

TOP-DOWNREDUCTIONREMOVE ELEMENTS

Page 23: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#2: SQLMIN — REPORTING REGRESSIONS

23JINHO JUNG ([email protected])

SELECTCASE WHEN EXISTS (

SELECT S0.C0FROM ORDER AS R1WHERE ((S0.C0 = 10))

) THEN S0.C0 END AS C2,FROM (

SELECT R0.I_PRICE AS C0,FROM ITEM AS R0WHERE R0.PRICE IS NOT NULL) AS S0)

AS S1;

Page 24: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

TALK OVERVIEW

24

OLDVERSION

NEWVERSION

SQLFUZZ SQLMIN SQLDEBUG

APOLLO TOOLCHAIN BUG REPORTS

- Query- Commit- File list- Function

JINHO JUNG ([email protected])

Page 25: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#3: SQLDEBUG — DIAGNOSING REGRESSIONS

25

Slow

Fast

DBMS

Commitbisection

SQLMIN

SQLDEBUG

Regressionquery

First commit exhibiting regression?

StatisticalDebugger

Control-flowGraphs(Traces)

PartiallyReducedqueries

JINHO JUNG ([email protected])

BUG REPORTS

- Query- Commit- File list- Function

Page 26: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#3: SQLDEBUG — DIAGNOSING REGRESSIONS

26

COMMIT 1

COMMIT 2

COMMIT 3

COMMIT 5 NEW VERSION (SLOW QUERY EXECUTION)

OLD VERSION (FAST QUERY EXECUTION)

PROBLEM BEGINS HERE!

JINHO JUNG ([email protected])

1 COMMIT BISECTION: FIND EARLIEST PROBLEMATIC COMMIT

Page 27: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#3: SQLDEBUG — DIAGNOSING REGRESSIONS

27

Partially reduced queries

Minimizedquery

Originalquery

SELECT NO FROMORDER AS R0 WHEREEXISTS (SELECTCNTFROM SALES AS R1WHERE EXISTS (SELECT ID FROM

SELECT CNTFROM SALESWHERE CNT > ID

JINHO JUNG ([email protected])

2 QUERY REDUCTION: PARTIALLY REDUCED QUERIES

Collect set of queries

Ready to use statistical debugging?

Page 28: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#3: SQLDEBUG — DIAGNOSING REGRESSIONS

28

Functions

int func(){if (cond1)

work;}

int func(){if (cond1)

work;}

JINHO JUNG ([email protected])

OLD VERSION

NEW VERSION

3 CONTROL-FLOW GRAPH COMPARISON: ALIGN TRACES

Page 29: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#3: SQLDEBUG — DIAGNOSING REGRESSIONS

29

Functions Traces

int func(){if (cond1)

work;}

int func(){if (cond1)

work;}

0x4000x420 TRUE

0x5000x520 FALSE

JINHO JUNG ([email protected])

OLD VERSION

NEW VERSION

3 CONTROL-FLOW GRAPH COMPARISON: ALIGN TRACES

Page 30: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#3: SQLDEBUG — DIAGNOSING REGRESSIONS

30

Functions Traces

int func(){if (cond1)

work;}

int func(){if (cond1)

work;}

0x4000x420 TRUE

0x5000x520 FALSE

Trace Alignment

func + 0x0func + 0x20 TRUE

func + 0x0func + 0x20 FALSE

JINHO JUNG ([email protected])

OLD VERSION

NEW VERSION

3 CONTROL-FLOW GRAPH COMPARISON: ALIGN TRACES

Page 31: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#3: SQLDEBUG — DIAGNOSING REGRESSIONS

31

Statisticalmodel

JINHO JUNG ([email protected])

Fast queryexecution traces

Slow queryexecution traces

PRED. RESULT

1 TAKEN

2 TAKEN

PRED. RESULT

1 TAKEN

2 NOTTAKEN

4 STATISTICAL DEBUGGING: FAST AND SLOW QUERY TRACES

Page 32: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#3: SQLDEBUG — DIAGNOSING REGRESSIONS

32

Fast queryexecution traces

Slow queryexecution traces

Final report

RANK FILE FUNCTION LINE

1 foo.c bar() 2

… … … …

PRED. RESULT

1 TAKEN

2 TAKEN

PRED. RESULT

1 TAKEN

2 NOTTAKEN

Statisticalmodel

JINHO JUNG ([email protected])

4 STATISTICAL DEBUGGING: FAST AND SLOW QUERY TRACES

Page 33: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

RECAP

33

OLDVERSION

NEWVERSION

SQLFUZZ SQLMIN SQLDEBUG

APOLLO TOOLCHAIN BUG REPORTS

- Query- Commit- File list- Function

JINHO JUNG ([email protected])

Page 34: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

EVALUATION

• Tested database systems▫ PostgreSQL, SQLite

• Binary instrumentation to get control flow graphs▫ DynamoRIO instrumentation tool

• Evaluation▫ Efficacy of SQLFuzz in detecting regressions?▫ Efficacy of SQLMin in reducing queries?▫ Accuracy of SQLDebug in diagnosing regressions?

34JINHO JUNG ([email protected])

Page 35: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#1: SQLFUZZ — DETECTING REGRESSIONS

35

218 201

0

50

100

150

200

250

PostgreSQL SQLite

200x performancedrop

MeanPerformance

Drop(Ratio)

Lower isBetter

JINHO JUNG ([email protected])

Discovered 10 previously unknown,unique performance regressions.

Page 36: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#1: SQLFUZZ — FALSE POSITIVES

36

99

0.00440.001

0.01

0.1

1

10

100

DiscoveredQueries

SQLFuzz

False Positives Queries(Percent)

JINHO JUNG ([email protected])

Lower isBetter

Filtering rules removealmost all false positives

Page 37: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#2: SQLMIN — REPORTING REGRESSIONS

37

1602

3800

500

1000

1500

2000

DiscoveredQueries

SQLMin

QuerySize

(Bytes)

JINHO JUNG ([email protected])

Lower isBetter

Significant reductionin query size

Page 38: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

#3: SQLDEBUG — DIAGNOSING REGRESSIONS

38

52

3 FIRST RANKED BRANCH

SECOND RANKED BRANCH

THIRD RANKED BRANCH

10 regressionsJINHO JUNG ([email protected])

Branch related to root causeCorrectly identified in all cases

(within top-3 ranked branches)

Page 39: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

CASE STUDY #1: OPTIMIZER UPDATE

39

SELECT COUNT (∗)FROM (SELECT R0.IDFROM CUSTOMER AS R0 LEFT JOIN STOCK AS R1ON (R0.STREET = R1.DIST)WHERE R1.DIST IS NOT NULL) AS S0

WHERE EXISTS (SELECT ID FROM CUSTOMER);

• Due to a bug fix (for a correctness bug)▫ Breaks query optimization▫ Optimizer no longer transforms the LEFT JOIN operator

JINHO JUNG ([email protected])

> 1000x slow down

LATEST VERSIONOF SQLITE

Page 40: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

CASE STUDY #2: EXECUTION ENGINE UPDATE

40

SELECT R0.ID FROM ORDER AS R0WHERE EXISTS (SELECT COUNT(∗)

FROM (SELECT DISTINCT R0.ENTRYFROM CUSTOMER AS R1WHERE (FALSE)) AS S1);

• Hashed aggregation executor update

▫ Resulted in redundantly building hash tables

JINHO JUNG ([email protected])

3x slow down

LATEST VERSIONOF POSTGRESQL

Page 41: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

CONCLUSION

41JINHO JUNG ([email protected])

• APOLLO (v1.0)▫ Toolchain for detecting & diagnosing regressions▫ Open-sourced: https://github.com/sslab-gatech/apollo

• Adding support for other types of bugs (v2.0)▫ Correctness bugs▫ Performance bugs▫ Database corruption

Page 42: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

CONCLUSION

42JINHO JUNG ([email protected])

• Interested in integrating APOLLO with more DBMSs▫ Discovered > 5 performance regressions in CockroachDB▫ Improve the toolchain based on developer feedback

• Automation will help reduce labor of developing DBMSs▫ Developers get to focus on more important problems

Page 43: APOLLO: Automatic Detection and Diagnosis of Performance ... · AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2. ... OF POSTGRESQL •Due to a recent optimizer

ACKNOWLEDGEMENTS

JINHO JUNG ([email protected]) 43

Supported by:

Developers: