

© Copyright 2009 EMC Corporation. All rights reserved.

Data Warehousing Features in SQL Server 2008

James Rowland-Jones @jrowlandjones


Official DW Feature Set in SQL 2008

  Build Manage Deliver Insight

SQL Server RDBMS

MERGE statement; Change data capture (CDC); Minimally logged INSERT

Backup compression

Star join performance; Faster parallel query on partitioned tables; GROUPING SETS

Resource governor; Data compression; Partition-aligned indexed views

Integration Services

Lookup performance; Pipeline performance

Analysis Services

Backup

MDX Query Performance: Block Computation; Query & Write-back Performance; Scalable Shared Database

Reporting Services

Reporting scalability; Server scalability


JRJ’s DW Feature Set in SQL 2008

  Build Manage Deliver Insight

SQL Server RDBMS

MERGE statement; Change data capture (CDC); Minimally logged INSERT & TF 610; NEW Data Types

Backup compression

Star join performance; Faster parallel query on partitioned tables; Few Outer Rows Parallelism; GROUPING SETS; ISOWEEK in DATEPART

Resource governor; Data compression; Partition-aligned indexed views; Partition index rebuilds; Filtered Indexes

Integration Services

Lookup performance; Pipeline performance; Data Profiling Task

Analysis Services

Backup

MDX Query Performance: Block Computation; Query & Writeback Performance; Scalable Shared Database

Reporting Services

Reporting scalability; Server scalability


What We’ll Focus On

  Build Manage Deliver Insight

SQL Server RDBMS

MERGE statement; Change data capture (CDC); Minimally logged INSERT & TF 610; NEW Data Types

Backup compression

Star join performance; Faster parallel query on partitioned tables; Few Outer Rows Parallelism; GROUPING SETS; ISOWEEK in DATEPART

Resource governor; Data compression; Partition-aligned indexed views; Partition index rebuilds; Filtered Indexes

Integration Services

Lookup performance; Pipeline performance; Data Profiling Task

Analysis Services

Backup

MDX Query Performance: Block Computation; Query & Writeback Performance; Scalable Shared Database

Reporting Services

Reporting scalability; Server scalability


Data Compression

Enterprise Edition Only

Row and Page Compression

Compression ratio of roughly 2:1 to 3:1 (about a 50% to 70% reduction in data)

Can be applied to a table, an index, or a subset of their partitions

Estimate savings: exec sp_estimate_data_compression_savings

Max row size plus compression overhead must not exceed 8060 bytes
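
A minimal sketch of estimating and then applying compression. It assumes a partitioned fact table named dbo.Tbl_Fact_Sales (the name is borrowed from the partitioning slides) with at least four partitions; the partition numbers and the split between PAGE and ROW are illustrative only.

```sql
-- Estimate the space PAGE compression would save for every index and partition.
EXEC sp_estimate_data_compression_savings
     @schema_name      = N'dbo',
     @object_name      = N'Tbl_Fact_Sales',  -- assumed table name
     @index_id         = NULL,               -- NULL = all indexes
     @partition_number = NULL,               -- NULL = all partitions
     @data_compression = N'PAGE';            -- N'ROW' and N'NONE' are also accepted

-- Apply compression to a subset of partitions (partition numbers are illustrative).
ALTER TABLE dbo.Tbl_Fact_Sales
REBUILD PARTITION = ALL
WITH (DATA_COMPRESSION = PAGE ON PARTITIONS (1 TO 3),
      DATA_COMPRESSION = ROW  ON PARTITIONS (4));
```

Running the estimate first is cheap compared with a rebuild, so it is a reasonable way to decide which partitions justify PAGE rather than ROW compression.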


Compression Alert

Compressed Table

UNCOMPRESSED TEXT


Monitoring Compression

SQL Server: Access Methods performance object
– Page compression attempts/sec
– Pages compressed/sec

Compression statistics for individual partitions
– Dynamic management function sys.dm_db_index_operational_stats
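
Both monitoring routes can be queried from T-SQL. The sketch below assumes the same hypothetical table name, dbo.Tbl_Fact_Sales, and runs in the database that owns it.

```sql
-- Per-partition page compression attempts and successes for one table.
SELECT OBJECT_NAME(s.object_id)         AS table_name,
       s.index_id,
       s.partition_number,
       s.page_compression_attempt_count,
       s.page_compression_success_count
FROM sys.dm_db_index_operational_stats(
         DB_ID(), OBJECT_ID(N'dbo.Tbl_Fact_Sales'), NULL, NULL) AS s
ORDER BY s.index_id, s.partition_number;

-- The two Access Methods counters, read from inside the engine.
SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE N'%Access Methods%'
  AND counter_name IN (N'Page compression attempts/sec',
                       N'Pages compressed/sec');
```

The cntr_value for a per-second counter is a running total in this view, so a rate has to be derived from two samples taken some seconds apart.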


DEMO TIME

Resource Governor (Quickly)

Data Compression


P & P

Partitioning Parallelism


Partitioning & Parallelism

Partition Table Parallelism

Few Outer Rows Parallelism

Partition-Aligned Indexed Views
– SQL 2005 behaviour: the indexed view needs to be dropped before a partition switch
– SQL 2008: switching a partition pulls the indexed view across

Rebuild index partition


What is a Partitioned Table?

P1 P2 P3 P4

SELECT
    SUM(Sales_Qty) as Sales_Qty,
    SUM(Sale_Amt) as Sales_Amount
FROM SalesDB.dbo.Tbl_Fact_Sales
WHERE date_id between '20050703' and '20050716'
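
For readers new to the feature, here is a minimal sketch of how a table like this might be partitioned on its date key. The partition function, scheme, demo table name, boundary values and the integer date key are all assumptions made for illustration; the slide does not show the real definition.

```sql
-- Three boundary values give four partitions (P1 to P4).
CREATE PARTITION FUNCTION pf_SalesDate (int)
AS RANGE RIGHT FOR VALUES (20050703, 20050710, 20050717);

-- Map every partition to one filegroup to keep the sketch simple.
CREATE PARTITION SCHEME ps_SalesDate
AS PARTITION pf_SalesDate ALL TO ([PRIMARY]);

-- The table is created on the scheme, partitioned by date_id.
CREATE TABLE dbo.Tbl_Fact_Sales_Demo (
    date_id   int   NOT NULL,
    Sales_Qty int   NOT NULL,
    Sale_Amt  money NOT NULL
) ON ps_SalesDate (date_id);

-- Ask which partition a given key value maps to.
SELECT $PARTITION.pf_SalesDate(20050705) AS partition_number;
```

With these boundaries, a query for 20050703 through 20050716 touches exactly two partitions (P2 and P3), which is the situation the following slides examine.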


The “Problem” in SQL 2005

Rows      Executes  StmtText

1         1         SELECT SUM([Sales_Qty]) [Sales_Qty],SUM([Sale_Amt]) [Sales_Amount] FROM [SalesDB].[dbo].[Tbl_Fact_Sales] WHERE [date_id]>=@1 AND [date_id]<=@2

0         0           |--Compute Scalar(DEFINE:([Expr1002]=CASE WHEN [globalagg1008]=(0) THEN NULL ELSE [globalagg1010] END, [Expr1003]=CASE WHEN [globalagg1012]=(0) THEN NULL ELSE [globalagg1014] END))

1         1             |--Stream Aggregate(DEFINE:([globalagg1008]=SUM([partialagg1007]), [globalagg1010]=SUM([partialagg1009]), [globalagg1012]=SUM([partialagg1011]), [globalagg1014]=SUM([partialagg1013])))

2         1               |--Parallelism(Gather Streams)

2         12                |--Stream Aggregate(DEFINE:([partialagg1007]=COUNT_BIG([SalesDB].[dbo].[Tbl_Fact_Sales].[Sales_Qty] as [ss].[Sales_Qty]), [partialagg1009]=SUM([SalesDB].[dbo].[Tbl_Fact_Sales].[Sales_Qty] as [ss].[Sales_Qty]), [partialagg1011]=COUNT_BIG([SalesDB].[dbo].[Tbl_Fact_Sales].[Sale_Amt] as [ss].[Sale_Amt]), [partialagg1013]=SUM([SalesDB].[dbo].[Tbl_Fact_Sales].[Sale_Amt] as [ss].[Sale_Amt])))

20577235  12                  |--Nested Loops(Inner Join, OUTER REFERENCES:([PtnIds1006]) PARTITION ID:([PtnIds1006]))

2         12                    |--Parallelism(Distribute Streams, Demand Partitioning)

2         1                     |    |--Constant Scan(VALUES:(((80)),((81))))

20577235  2                     |--Index Seek(OBJECT:([SalesDB].[dbo].[Tbl_Fact_Sales].[IX_Tbl_Fact_Sales_SKDteItmStrIDSalQtySalAmtDiscMkd] AS [ss]), SEEK:([ss].[SK_Date_ID] >= (20050703) AND [ss].[SK_Date_ID] <= (20050716)) ORDERED FORWARD PARTITION ID:([PtnIds1006]))


Partitioning & Parallelism Compared

[Diagram: partitions P1 to P4, shown once for SQL Server 2005 and once for SQL Server 2008]


Workaround for SQL Server 2005

Partition 4

Partition 3

SELECT SUM(Sales_Qty) as Sales_Qty, SUM(Sale_Amt) as Sales_Amount
FROM SalesDB.dbo.Tbl_Fact_Sales
WHERE date_id between '20050703' and '20050709'

UNION

SELECT SUM(Sales_Qty) as Sales_Qty, SUM(Sale_Amt) as Sales_Amount
FROM SalesDB.dbo.Tbl_Fact_Sales
WHERE date_id between '20050710' and '20050716'
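
The two per-partition queries on this slide each return their own partial sums. To get back the single total of the original query, one possible completion (a sketch, not necessarily what was demonstrated) wraps the branches in an outer aggregate and uses UNION ALL so that two identical partial rows are not collapsed into one.

```sql
SELECT SUM(p.Sales_Qty)    AS Sales_Qty,
       SUM(p.Sales_Amount) AS Sales_Amount
FROM (
    -- First week: intended to hit a single partition.
    SELECT SUM(Sales_Qty) AS Sales_Qty, SUM(Sale_Amt) AS Sales_Amount
    FROM SalesDB.dbo.Tbl_Fact_Sales
    WHERE date_id BETWEEN '20050703' AND '20050709'

    UNION ALL

    -- Second week: intended to hit the neighbouring partition.
    SELECT SUM(Sales_Qty) AS Sales_Qty, SUM(Sale_Amt) AS Sales_Amount
    FROM SalesDB.dbo.Tbl_Fact_Sales
    WHERE date_id BETWEEN '20050710' AND '20050716'
) AS p;
```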


Few Outer Rows Parallelism

SQL 2005: one thread is given per page of outer rows on a nested loop join

SQL 2008: one thread is given per outer row on a nested loop join

Good for joins to the Date dimension

Microsoft internal DW scale benchmark: performance increased by 30%

SELECT d.Date_Desc, SUM(f.Sale_Amt * f.Sales_Qty)
FROM Tbl_Fact_Store_Sales f
JOIN Tbl_Dim_Date d ON f.sk_date_id = d.sk_date_id
WHERE d.date_value between '10/1/2004' and '10/7/2004'
GROUP BY d.Date_Desc


Workarounds for SQL Server 2005

STUFF YOUR ROW
– Add a junk column to the Date dimension to force one row per page

CLUSTER ON A GUID
– Add a column and populate it with GUIDs to encourage rows onto separate pages
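
A rough sketch of the first workaround, assuming the date dimension is dbo.Tbl_Dim_Date as in the earlier query. The column name, constraint name and the 4100-byte width are hypothetical; the idea is simply that a fixed-width column wider than half of an 8 KB page leaves no room for a second row on the same page.

```sql
-- Pad every row past half a page so that only one row fits per data page.
ALTER TABLE dbo.Tbl_Dim_Date
ADD Junk_Pad char(4100) NOT NULL
    CONSTRAINT DF_Tbl_Dim_Date_Junk_Pad DEFAULT ('');

-- Rebuild so the existing rows are laid out cleanly after the widening.
ALTER INDEX ALL ON dbo.Tbl_Dim_Date REBUILD;
```

The cost is a much larger dimension table, which is tolerable for a date dimension of a few thousand rows but would not be for a large one.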


Partition Aligned Indexed Views

The big chore was “sliding” a partitioned table that had an indexed view on it.

In 2005 the indexed view needed to be dropped first

In 2008 it does not
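
For context, a sliding-window step usually looks something like the sketch below. The archive table, partition function and scheme names, and the boundary values, are all placeholders. The requirements on the target table (matching structure and filegroup, and a matching indexed view where one is needed) are not shown and should be checked against the SQL Server documentation.

```sql
-- 1. Switch the oldest partition out to an empty archive table (metadata only).
ALTER TABLE dbo.Tbl_Fact_Sales
SWITCH PARTITION 1 TO dbo.Tbl_Fact_Sales_Archive;

-- 2. Remove the boundary at the old end of the window.
ALTER PARTITION FUNCTION pf_SalesDate() MERGE RANGE (20050703);

-- 3. Nominate a filegroup and add a boundary at the new end of the window.
ALTER PARTITION SCHEME ps_SalesDate NEXT USED [PRIMARY];
ALTER PARTITION FUNCTION pf_SalesDate() SPLIT RANGE (20050724);
```

In SQL Server 2005 the indexed view on the fact table had to be dropped before step 1 and rebuilt afterwards; with a partition-aligned indexed view in 2008 the switch can run with the view in place.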


IT’S DEMO TIME

Sliding Window with Indexed View in Place

Rebuild Partitioned Index

Filtered Indexes
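
The demo scripts are not part of the transcript. As a stand-in, this sketch shows the general shape of the last two items; the index names, partition number and filter predicate are hypothetical.

```sql
-- Rebuild one partition of a partitioned index rather than the whole index.
ALTER INDEX IX_Tbl_Fact_Sales_Date ON dbo.Tbl_Fact_Sales
REBUILD PARTITION = 4;

-- Filtered index: only rows that satisfy the WHERE clause are indexed.
CREATE NONCLUSTERED INDEX IX_Tbl_Fact_Sales_Recent
ON dbo.Tbl_Fact_Sales (date_id)
INCLUDE (Sales_Qty, Sale_Amt)
WHERE date_id >= 20050703;
```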


STAR JOINS “Optimized” Bitmap Filters

What is a Bitmap filter
– In memory structure (no index overhead)
– Created dynamically
– Typically quite small in size

Bitmap Filter SQL 2005
– What it was in 2005...
– Hash or Merge JOIN

Optimised Bitmap Filter SQL 2008
– Enterprise Edition
– Parallel Query
– Hash JOIN only
– Fact table must have > 100 pages
– Single Column join (No PK FK relationship requirement) (integer needed for optimized)
– Dimension input cardinalities are smaller than fact input cardinalities
– Look for Bitmap warning event for missed opportunities to use Bitmap
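
To make the conditions above concrete, this is the kind of query shape that could qualify: a large fact table joined on single integer keys to small, filtered dimensions. Tbl_Dim_Store, sk_store_id, Store_Name and Region are hypothetical names added for the example; the other names come from the earlier slides.

```sql
SELECT d.Date_Desc,
       s.Store_Name,
       SUM(f.Sale_Amt) AS Sales_Amount
FROM dbo.Tbl_Fact_Store_Sales AS f
JOIN dbo.Tbl_Dim_Date  AS d ON f.sk_date_id  = d.sk_date_id   -- single-column integer join
JOIN dbo.Tbl_Dim_Store AS s ON f.sk_store_id = s.sk_store_id  -- hypothetical second dimension
WHERE d.date_value BETWEEN '20041001' AND '20041007'          -- selective dimension filters
  AND s.Region = 'North'
GROUP BY d.Date_Desc, s.Store_Name;
```

Whether the optimizer actually uses a bitmap filter depends on the plan it chooses. In a parallel hash-join plan, a bitmap name beginning with Opt_Bitmap in the probe predicate of the fact table scan is the usual sign that the optimised form was applied.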


Minimally Logged Inserts & TF 610
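
The slide carries only its title, so the following is a hedged sketch of the usual pattern rather than the material presented. It assumes the database uses the SIMPLE or BULK_LOGGED recovery model and that a staging table named dbo.Tbl_Stage_Sales exists; both table names and the column list are illustrative.

```sql
-- Trace flag 610 extends minimal logging to inserts into tables with indexes.
DBCC TRACEON (610, -1);   -- -1 applies the flag server-wide

-- INSERT ... SELECT with a table lock, the form that can be minimally logged.
INSERT INTO dbo.Tbl_Fact_Sales WITH (TABLOCK) (date_id, Sales_Qty, Sale_Amt)
SELECT date_id, Sales_Qty, Sale_Amt
FROM dbo.Tbl_Stage_Sales;

DBCC TRACEOFF (610, -1);
```

Whether a particular load is minimally logged also depends on the target (heap or index, empty or populated), so the log volume is worth measuring before relying on it.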


Bulk Load Methods Compared


FOR THE FINAL TIME

STAR JOINS

Minimally Logged INSERTS