Testing Challenges for Extending SQL Server's Query Processor: A Case Study
Torsten Grabs, Steve Herbert, Xin (Shin) Zhang
{torsteng; stevhe; xinzh}@microsoft.com
DBTest2008




Agenda

Motivation

Background
• Relational Data Warehousing (DW)
• SQL Server 2008 Starjoin improvement

Testing Challenge
• Extending Enterprise-class Commercial DBMS

Solution
• Iterative development process
• Multi-dimensional testing

Case Study Results

Conclusions


Motivation

Data warehouses are huge
• Billions of rows in fact tables
• Multi-terabyte database

Query response time requirements are strict
• Interactive response times desired: <5 sec
• Ideally: speed-of-thought response time

Plan choice is CRUCIAL for good performance

User requirements are challenging
• Large input space
• Zero administration overhead
• Do not break existing customer base


Background: Relational DW

Fact Table
• Sales (Date_Key, Product_Key, Qty_Sold, Dollars)

Dimension Tables
• Period (Date_Key, Quarter_Number, Year)
• Product (Product_Key, Product_ID, Product_Name, Category)

Business Question:

Give me total sales of SQL Server 2005 in second quarter of year 2006.

Example Star Query:

SELECT SUM(Dollars)
FROM Sales S
JOIN Product P ON P.Product_Key = S.Product_Key
JOIN Period Pe ON Pe.Date_Key = S.Date_Key
WHERE Product_Name = 'SQL Server 2005'
  AND Quarter_Number = 2
  AND Year = 2006
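To make the star query concrete, here is a minimal sketch that builds the slide's three tables in an in-memory SQLite database and runs the same query. SQLite stands in for SQL Server only for illustration; the column types and every sample row are assumptions, not data from the talk.

```python
import sqlite3


def run_star_query():
    """Build a toy star schema and run the slide's example star query."""
    conn = sqlite3.connect(":memory:")
    # Table and column names follow the slide; types and rows are made up.
    conn.executescript("""
        CREATE TABLE Period  (Date_Key INTEGER PRIMARY KEY,
                              Quarter_Number INTEGER, Year INTEGER);
        CREATE TABLE Product (Product_Key INTEGER PRIMARY KEY,
                              Product_ID TEXT, Product_Name TEXT, Category TEXT);
        CREATE TABLE Sales   (Date_Key INTEGER, Product_Key INTEGER,
                              Qty_Sold INTEGER, Dollars REAL);

        INSERT INTO Period VALUES (1, 1, 2006), (2, 2, 2006), (3, 2, 2007);
        INSERT INTO Product VALUES
            (10, 'P-10', 'SQL Server 2005', 'Database'),
            (20, 'P-20', 'Other Product', 'Misc');
        INSERT INTO Sales VALUES
            (1, 10, 1, 100.0),   -- wrong quarter
            (2, 10, 2, 200.0),   -- qualifies
            (2, 10, 1, 50.0),    -- qualifies
            (2, 20, 5, 999.0),   -- wrong product
            (3, 10, 1, 70.0);    -- wrong year
    """)
    (total,) = conn.execute("""
        SELECT SUM(Dollars)
        FROM Sales S
        JOIN Product P ON P.Product_Key = S.Product_Key
        JOIN Period Pe ON Pe.Date_Key = S.Date_Key
        WHERE Product_Name = 'SQL Server 2005'
          AND Quarter_Number = 2
          AND Year = 2006
    """).fetchone()
    conn.close()
    return total


if __name__ == "__main__":
    print(run_star_query())
```

Only the two fact rows that join to both qualifying dimension rows contribute to the sum, which is the pattern the star join optimization targets.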


Background: New Feature

Fact selectivity matters for plan choice
SQL Server 2008 improves medium selectivity queries

[Figure: plan choice along the fact table selectivity axis, from 0% of fact rows qualifying to 100% of fact rows qualifying]
• High selectivity queries: seek-based plans with nested loop joins
• Medium selectivity queries: scan-based plans with bitmap hash joins
• Low selectivity queries: scan-based plans with regular hash joins


DW-specific Extensions of the SQL Query Processor

[Diagram: SQL Server Query Optimizer]

Standard optimization
• Standard (join) query optimizations
• Alternative query plans
• Cost-based plan choice
• Final query plan

Optimization extension for DW
• Star query detection
• Selectivity analysis
• Schema detection
• Star query plans


Bitmap-based semi-join reduction

[Diagram: query plan with bitmap-based join reduction]
• Hash Join with Filter over the Product dimension table
• Hash Join with Filter over the Store dimension table
• Fact Table Scan with Join Reduction Processing
• Join Reduction Info 1 and Join Reduction Info 2 flow from the hash joins to the Join Reduction Processing step

Join reduction info
• Surrogate key values of rows qualifying the filter over the product dimension
• Surrogate key values of rows qualifying the filter over the store dimension
• Values shown: SK_D1 = D1_05, SK_D2 = D2_03

Rowset before join reduction

SK_D1  SK_D2  Meas1  Meas2
D1_05  D2_01  1      11
D1_01  D2_03  2      11
D1_05  D2_03  3      11
D1_05  D2_03  4      11
D1_07  D2_04  5      11

Rowset after join reduction

SK_D1  SK_D2  Meas1  Meas2
D1_05  D2_03  3      11
D1_05  D2_03  4      11
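The join reduction on this slide can be sketched in a few lines, using the slide's own rowset. This is an illustration only, not SQL Server's implementation: the bitmap here is an exact bit mask indexed by the numeric suffix of each surrogate key, and the helper names (`key_ordinal`, `build_bitmap`, `passes`, `reduce_fact_rows`) are hypothetical.

```python
def key_ordinal(key):
    """Map a surrogate key such as 'D1_05' to its numeric suffix (5)."""
    return int(key.split("_")[1])


def build_bitmap(qualifying_keys):
    """Set one bit per surrogate key that survived the dimension filter."""
    bitmap = 0
    for key in qualifying_keys:
        bitmap |= 1 << key_ordinal(key)
    return bitmap


def passes(bitmap, key):
    """Probe the bitmap for a fact row's surrogate key."""
    return bool((bitmap >> key_ordinal(key)) & 1)


def reduce_fact_rows(fact_rows, bitmap_d1, bitmap_d2):
    """Keep only fact rows whose keys are set in both dimension bitmaps."""
    return [
        row for row in fact_rows
        if passes(bitmap_d1, row[0]) and passes(bitmap_d2, row[1])
    ]


# Rowset before join reduction, copied from the slide:
# (SK_D1, SK_D2, Meas1, Meas2)
FACT_ROWS = [
    ("D1_05", "D2_01", 1, 11),
    ("D1_01", "D2_03", 2, 11),
    ("D1_05", "D2_03", 3, 11),
    ("D1_05", "D2_03", 4, 11),
    ("D1_07", "D2_04", 5, 11),
]

# Join reduction info: keys qualifying each dimension filter on the slide.
reduced = reduce_fact_rows(
    FACT_ROWS, build_bitmap(["D1_05"]), build_bitmap(["D2_03"])
)

if __name__ == "__main__":
    for row in reduced:
        print(row)
```

Applying both bitmaps during the fact table scan leaves the two rows shown in the slide's "after" rowset, so the hash joins above the scan see far fewer rows.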


Testing Challenge

Large input space of queries
• Full range of selectivity
• Mixed ad-hoc and parameterized queries
• Complex schema and workloads

Automatic feature
• Correct cost-based plan choice
• Smart plan pattern detection
• Accurate join selectivity estimation
• No knobs – no application changes required

Happy existing customers
• Significant improvements
• Negligible regressions


Agenda

Motivation

Background
• Relational Data Warehousing (DW)
• SQL Server 2008 Starjoin improvement

Testing Challenge
• Extending Enterprise-class Commercial Server

Solution
• Iterative development process
• Multi-dimensional testing

Case Study Results

Conclusions


Iterative Development Process

In-cycle validation of assumptions

Mitigates risk of major end-of-cycle issues
• Especially performance problems

Maximal parallelization of testing and development efforts

Quality


Multi-Dimensional Testing

Functional testing
• Target testing to ensure core functionality
• Model-based testing to ensure coverage

Performance testing
• Component
• Benchmark
• Customer Workloads


Functional: Target Testing

Query Results

Functional Correctness

Bitmap Filtering


Functional: Model-Based Testing

Large number of test dimensions
• 10+ test dimensions …
• If we assume 3 variations each …
• will generate 60K combinations!

Two abstract models covering key requirements
• Schema model: database schema and data
• Query model: star-join queries built on top of the schema model
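The "60K combinations" figure follows from simple counting: 10 test dimensions with 3 variations each give 3^10 = 59,049 combinations. A small sketch of that arithmetic, with hypothetical function names, both by formula and by explicit enumeration:

```python
from itertools import product


def count_combinations(num_dimensions, variations_per_dimension):
    """Closed-form size of the full cross product of test dimensions."""
    return variations_per_dimension ** num_dimensions


def enumerate_combinations(num_dimensions, variations_per_dimension):
    """Lazily yield every combination as a tuple of variation indexes."""
    return product(range(variations_per_dimension), repeat=num_dimensions)


if __name__ == "__main__":
    print(count_combinations(10, 3))                        # by formula
    print(sum(1 for _ in enumerate_combinations(10, 3)))    # by enumeration
```

Each extra dimension multiplies the count by the number of variations, which is why the deck moves to abstract models instead of hand-written cases.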


Functional: Schema Model

Schema Model
• Number and Classification of Tables
• Relationships Between Tables
• Cardinalities
• Data Distributions


Functional: Query Model

Query Model
• Number of Facts
• Number of Dimensions
• Dimension Selectivity
• Fact Aggregations
• Nested Subqueries
• Fact Selectivity


Model-based Test Example

Test scenario: testing selectivity estimation of a single-fact star schema

Schema model
• Number and classification of tables: fact 1, dimension 5
• Relationships between tables: star schema
• Cardinality: fact 100K rows, dimension 10 rows each
• Data distribution: uniform

Query model
• Number of facts: 1
• Number of dimensions: 10
• Dimension selectivity: 0.4~0.8 (5 choices)
• Fact aggregation: 1 aggregation (12 possible types)
• Nested subqueries: none
• Fact selectivity: 0.1~1.0

Single test covers 55*12 (37,836) test cases
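As a rough illustration of how one model-based test expands into many cases, the sketch below enumerates one possible reading of this example: 5 dimension tables, each taking one of 5 selectivity choices, combined with 12 aggregation types. The concrete selectivity values (0.4 to 0.8 in steps of 0.1), the aggregation indexes, and all names are assumptions. Under this reading the count is 5^5 × 12 = 37,500, which does not equal the 37,836 on the slide, so the slide evidently counts variations that the transcript does not show.

```python
from itertools import product

# Assumed values: the slide gives only the range 0.4~0.8 and "5 choices".
DIMENSION_SELECTIVITIES = [0.4, 0.5, 0.6, 0.7, 0.8]
NUM_DIMENSION_TABLES = 5      # from the schema model on the slide
NUM_AGGREGATION_TYPES = 12    # the slide does not name the 12 types


def generate_test_cases():
    """Yield (per-dimension selectivities, aggregation index) descriptors."""
    selectivity_combos = product(
        DIMENSION_SELECTIVITIES, repeat=NUM_DIMENSION_TABLES
    )
    for selectivities in selectivity_combos:
        for aggregation_index in range(NUM_AGGREGATION_TYPES):
            yield selectivities, aggregation_index


if __name__ == "__main__":
    cases = list(generate_test_cases())
    print(len(cases))
    print(cases[0])
```

A real generator would turn each descriptor into a star-join query against a schema built from the schema model; that step is omitted here.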


Performance Testing

Component
• Micro-benchmark
• Targeted Test

Workloads
• Microsoft Sales
• Retail Business
• …

Benchmarks
• TPC-H
• Decision Support


Case Study Results

~10 different workloads

3 representative results
• Decision support workload results
• Microsoft sales data warehouse results
• Retail workload results


Results: Decision Support Workload

Limited performance benefit for initial design
Lots of regressions initially
Good convergence over several iterations

Workload (schema, queries)
• 100GB data
• 70+ queries
• Typical DSS scenario

[Figure: Run 1 through Run 7. Axes: % of regressed queries (0% to 16%) and geomean query response time ratio (90% to 150%). Series: % of regressed queries in workload (comparing baseline vs. star join optimization); SQL Server 2008 with star join optimization.]
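The two quantities plotted in the results charts, the percentage of regressed queries and the geometric mean of the query response time ratio, can be computed from per-query timings roughly as follows. The direction of the ratio (candidate time divided by baseline time), the `tolerance` parameter, the function name, and the sample timings are all assumptions made for illustration; the slides do not define them.

```python
import math


def regression_metrics(baseline_secs, candidate_secs, tolerance=0.0):
    """Return (% of regressed queries, geomean response time ratio).

    A query counts as regressed when its candidate time exceeds its
    baseline time by more than `tolerance` (a fraction, e.g. 0.05).
    """
    if not baseline_secs or len(baseline_secs) != len(candidate_secs):
        raise ValueError("need two non-empty timing lists of equal length")
    # Assumed convention: ratio > 1.0 means the candidate build is slower.
    ratios = [c / b for b, c in zip(baseline_secs, candidate_secs)]
    regressed = sum(1 for r in ratios if r > 1.0 + tolerance)
    pct_regressed = 100.0 * regressed / len(ratios)
    # Geometric mean via logs, which avoids overflow on long query lists.
    geomean_ratio = math.exp(sum(math.log(r) for r in ratios) / len(ratios))
    return pct_regressed, geomean_ratio


if __name__ == "__main__":
    # Hypothetical timings in seconds for three queries.
    baseline = [10.0, 20.0, 40.0]
    candidate = [5.0, 20.0, 80.0]
    print(regression_metrics(baseline, candidate))
```

Tracking both numbers per run matters because a workload can improve on average while individual queries still regress, which is the trade-off the case study results show.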


Results: Microsoft Sales DW

Started with good design for performance
But: too many regressions with initial design
Converged to a good result over several iterations

Workload
• 750GB data
• 50 queries
• Complex queries: > 20 joins

[Figure: Run 1 through Run 7. Axes: % of regressed queries (0% to 16%) and geomean query response time (90% to 150%). Series: % of regressed queries in workload (comparing baseline vs. star join optimization); SQL Server 2008 with star join optimization.]


Results: Retail Workload

Several iterations to establish the “winning” design
Significant improvements after several iterations
Regressions limited to “2 wrongs make 1 right” (see Giakoumakis/Galindo-Legaria TKDE 2008)

Workload
• 100GB data
• 30 queries
• Complex physical design: indexes, partitioning

[Figure: No Run, No Run, Run 3 through Run 7. Axes: % of regressed queries (0% to 35%) and geomean query response time (90% to 125%). Series: % of regressed queries in workload (comparing baseline vs. star join optimization); SQL Server 2008 with star join optimization.]


Conclusions

Extension of SQL Server for relational DW
• New feature with zero administration overhead
• Widely deployed system

Identified testing challenges
• Balance performance improvement and regression risk

Solution
• Iterative development and testing cycles
• Multi-dimensional testing (functional, performance)

Iterative development and testing insights
• Supports learning and adjustment during development
• Delivers well-understood results
• Leads to high-quality features


© 2007 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.