microsoft - data warehousing with fasttrack pdw - teched australia 2010

Nicholas Dritsas
Principal Program Manager
SQL Server Customer Advisory Team
Microsoft

Data Warehousing with FastTrack and PDW

SESSION CODE: #DAT314

Agenda

• SQL Server Data Warehouse Offering Overview
• Fast Track Offering
  • Motivation
  • Balanced Architecture Approach for DW
  • Example FastTrack Reference Architectures
  • Optimizing Storage, Load and Maintenance
  • Case Studies
• Parallel Data Warehouse Offering Overview

Microsoft Data Warehousing - Products Positioning

Axes: Scale, Complexity, HA by default, SW-HW integration

1. SQL Server 2008 R2: Minimal HW tune up/optimization. Supports mixed workloads.
2. SQL Server 2008 R2 with Fast Track Reference Architecture: Balanced solution for mostly scan-centric workloads.
3. PDW: Max HW tune up for most DW scenarios.
4. PDW with Hub-and-spoke: Most flexible architecture for handling all DW scenarios.

New in SQL Server 2008: Data Warehousing Enablers

Build faster
• High speed Adapters
• MERGE SQL Statement
• Change Data Capture (CDC)
• Persistent Lookups
• Data Profiling
• …

Deliver insights
• Star Join Query Optimization
• Parallel Query Enhancements
• Scale-out Shared Databases
• Data Mining Improvements
• New - Report Builder 2.0
• …

Manage easily
• Data Compression
• Backup Compression
• Resource Governor
• Policy Based Administration
• Partition-Aligned Indexed Views
• …

Included at no charge! No fee-based options:
• Compression
• Partitioning
• Advanced Security
• Manageability
• ETL
• Business Intelligence


SQL Server Relational Data Warehouses Today

• Hundreds of deployments > 1 TB
• Dozens of deployments > 5 TB
• A wide variety of approaches
• Synergy with the SQL Server BI Stack
• Momentum!
  • Steady stream of enabling features: Resource Governor, Compression, Star Query, …
  • Next scale breakthrough coming with Parallel Data Warehouse this year

Some SQL Data Warehouses Today

• Big SAN
• Biggest 64-core Server
• Connected together!

What’s wrong with this picture???

System out of balance

• This server can consume 16 GB/Sec of IO, but the SAN can only deliver 2 GB/Sec
  • Even when the SAN is dedicated to the SQL Data Warehouse, which it often isn’t
  • Lots of disks for Random IOPS, BUT
    • Limited controllers
    • Limited IO bandwidth
• System is typically IO bound and queries are slow
  • Despite significant investment in both Server and Storage

The Alternative: A Balanced System

• Design a server + storage configuration that can deliver all the IO bandwidth that CPUs can consume when executing a SQL Relational DW workload
  • Avoid sharing storage devices among servers
  • Avoid overinvesting in disk drives
• Focus on scan performance, not IOPS
  • Lay out and manage data to maximize range scan performance and minimize fragmentation

What is FastTrack Data Warehouse?

• A method for designing a cost-effective, balanced system for Data Warehouse workloads
• Reference hardware configurations developed in conjunction with hardware partners using this method
• Best practices for data layout, loading and management

Relational Database only: not SSAS, IS, RS

Data Warehouse Workload Characteristics

Scan Intensive

Hash Joins

Aggregations

    SELECT L_RETURNFLAG,
           L_LINESTATUS,
           SUM(L_QUANTITY) AS SUM_QTY,
           SUM(L_EXTENDEDPRICE) AS SUM_BASE_PRICE,
           SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)) AS SUM_DISC_PRICE,
           SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)*(1+L_TAX)) AS SUM_CHARGE,
           AVG(L_QUANTITY) AS AVG_QTY,
           AVG(L_EXTENDEDPRICE) AS AVG_PRICE,
           AVG(L_DISCOUNT) AS AVG_DISC,
           COUNT(*) AS COUNT_ORDER
    FROM LINEITEM
    GROUP BY L_RETURNFLAG, L_LINESTATUS
    ORDER BY L_RETURNFLAG, L_LINESTATUS

Balanced Architecture Components


Balanced System - CPU

Determine your data consumption rate, per CPU core, for your query mix

Simple example: Assume TPCH query 2 is your average query

Run the query on a test server with data fully cached in memory

Execute parallel query using MAXDOP 4

Observe 100% CPU on 4 cores

Time the query and observe # pages read

(SET STATISTICS IO ON; SET STATISTICS TIME ON)

Per-core consumption rate = (# logical reads × 8 KB) / (CPU time)
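The formula above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample readings are hypothetical, and it assumes the CPU time reported for the query is the total across all cores used, so that the result is a per-core rate.

```python
PAGE_SIZE_BYTES = 8 * 1024  # one SQL Server data page (the "8K" in the formula)


def per_core_consumption_mb_per_sec(logical_reads: int, cpu_time_ms: float) -> float:
    """Per-core consumption = (logical reads * 8K) / CPU time, returned in MB/s."""
    if cpu_time_ms <= 0:
        raise ValueError("cpu_time_ms must be positive")
    bytes_read = logical_reads * PAGE_SIZE_BYTES
    cpu_seconds = cpu_time_ms / 1000.0
    return bytes_read / (1024 * 1024) / cpu_seconds


# Hypothetical readings, not measurements from the session:
# 1,280,000 logical reads and 50,000 ms of total CPU time.
example_rate = per_core_consumption_mb_per_sec(logical_reads=1_280_000, cpu_time_ms=50_000)
print(f"{example_rate:.0f} MB/s per core")
```

With these made-up inputs the query reads 10,000 MB in 50 CPU-seconds, which works out to 200 MB/s per core.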


You can get more sophisticated…

Queries that perform complex calculations, format conversions, multi-dimension hash joins, etc. will be more CPU-intensive than others

Complex queries will consume data at a slower per-core rate than simpler queries

Alternative: Measure per-core data consumption for a variety of queries, and take the weighted average

A standard approach to capacity planning
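A minimal sketch of the weighted-average approach, assuming each query has already been measured individually. The rates and weights below are hypothetical placeholders; the weights could be, for example, how often each query type runs.

```python
def weighted_average_rate(measurements: list[tuple[float, float]]) -> float:
    """Weighted average of per-core rates.

    measurements: (rate_mb_per_sec, weight) pairs, one per measured query.
    """
    total_weight = sum(weight for _, weight in measurements)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(rate * weight for rate, weight in measurements) / total_weight


# Hypothetical query mix: simple scans are fast per core, complex joins are slower.
query_mix = [
    (300.0, 5),  # simple scan query, runs most often
    (200.0, 3),  # moderate aggregation
    (100.0, 2),  # complex multi-join query
]
average_rate = weighted_average_rate(query_mix)
print(f"{average_rate:.0f} MB/s per core")
```

For this mix the result is (1500 + 600 + 200) / 10 = 230 MB/s per core.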

Or you can leave it to us…

• We’ve measured a mix of TPCH queries that reflect a ‘prototype’ Data Warehouse workload
• Concluded that SQL Server 2008 R2 on current x64 cores consumes ~200 MB/Sec per core on average for this workload
• We use this as a basis for the published reference architectures
• Your mileage will vary!
  • For precise system sizing, measure your own workload

Balanced System: Determine Storage Sizing

• CPU core count and consumption rate for the workload will determine the # of controllers and enclosures needed to provide aggregate throughput
• # of controllers will determine the minimum disk count for delivering the scan bandwidth
• Determine desired per-disk capacity based on expected data volume
  • Leave enough room for TempDB and for extra copies of the largest tables in the system, for maintenance activities
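The sizing steps above could be worked through roughly as follows. This is a sketch under stated assumptions, not the Fast Track sizing method itself: the 200 MB/s per-core and 130 MB/s per-RAID1-pair figures are quoted elsewhere in this deck, while the data volumes, the function names, and the idea of sizing directly in RAID1 pairs are illustrative assumptions.

```python
import math


def min_raid1_pairs(cores: int, mb_per_sec_per_core: float, mb_per_sec_per_pair: float) -> int:
    """Smallest number of RAID1 pairs whose combined scan rate covers the CPUs."""
    required_mb_per_sec = cores * mb_per_sec_per_core
    return math.ceil(required_mb_per_sec / mb_per_sec_per_pair)


def per_disk_capacity_gb(data_gb: float, tempdb_gb: float, largest_table_gb: float,
                         extra_copies: int, pairs: int) -> float:
    """Capacity needed per disk; a RAID1 pair offers the usable space of one disk."""
    total_gb = data_gb + tempdb_gb + largest_table_gb * extra_copies
    return total_gb / pairs


# 8 cores at 200 MB/s each, 130 MB/s per RAID1 pair (figures from the deck).
pairs = min_raid1_pairs(cores=8, mb_per_sec_per_core=200, mb_per_sec_per_pair=130)

# Hypothetical volumes: 2 TB of data, 400 GB TempDB, two spare copies of a 300 GB table.
capacity = per_disk_capacity_gb(data_gb=2000, tempdb_gb=400, largest_table_gb=300,
                                extra_copies=2, pairs=pairs)
print(pairs, round(capacity, 1))
```

Throughput fixes the disk count first (13 pairs here); data volume then only decides how large each disk needs to be.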

Balanced System: IO Stack

• Use a 2x quad-core server as a building block / starting point
• Ensure that the per-core data consumption rate can be delivered by all elements of the IO stack

[Figure: Maximum theoretical throughput for IO stack components sized for an 8 CPU core Fast Track system with two 4-core CPU sockets (assumes 200 MB/s per core)]

Balanced System: Determine Storage Sizing (2)

• Keep in mind theoretical maximums are just that: theoretical
• Some testing/validation may be needed

[Figure: Observed bandwidth realized on an 8 core Fast Track system with two 4-core CPU sockets, running SQLIO]

Balanced System - Scaling the IO Stack

[Figure: the IO stack scaled out. Server CPU sockets (4 cores each) connect through HBAs and a Fiber Switch to multiple Storage Enclosures, each with two Storage Processors and a set of RAID-1 disk pairs]


Using a Preconfigured FastTrack Reference Architecture

• Guesstimate of 200 MB/sec per core for an ‘average’ DW workload
• Equates to one 800 MB/Sec enclosure per quad-core CPU
• Estimate total bandwidth needed under query concurrency
  • Derives CPU count
  • Derives total Storage profile
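As a rough sketch of that derivation: start from the total bandwidth needed under concurrency, then work back to CPU and enclosure counts using the deck's 200 MB/s per core and one enclosure per quad-core CPU. The function name and the 3,000 MB/s target are hypothetical.

```python
import math


def size_from_bandwidth(target_mb_per_sec: float, mb_per_sec_per_core: float = 200,
                        cores_per_cpu: int = 4) -> tuple[int, int, int]:
    """Return (cpu_sockets, total_cores, enclosures) for a target scan bandwidth."""
    cores_needed = math.ceil(target_mb_per_sec / mb_per_sec_per_core)
    cpu_sockets = math.ceil(cores_needed / cores_per_cpu)
    enclosures = cpu_sockets  # one 800 MB/s enclosure per quad-core CPU
    return cpu_sockets, cpu_sockets * cores_per_cpu, enclosures


# Hypothetical target: 3,000 MB/s of aggregate scan bandwidth under concurrency.
sizing = size_from_bandwidth(3000)
print(sizing)
```

Rounding up to whole sockets gives 4 quad-core CPUs (16 cores) and 4 enclosures, i.e. 3,200 MB/s of nominal bandwidth for the 3,000 MB/s target.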

Published Reference Architectures: Balanced System Examples -- HP / Dell / IBM, 8 to 48 core

Optimizing Storage Layout for Scan Intensive Workloads

LUN configuration is based on RAID1 pairs

Optimal for scan type access patterns

Striping across storage is accomplished via SQL Server data files

Observed throughput for a single RAID pair >= 130 MB/s

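[Figure: enclosure layout with two storage processors, SP A and SP B]

• RAID GP01 (disks 01, 02): LUN1, LUN2
• RAID GP02 (disks 03, 04): LUN3, LUN4
• RAID GP03 (disks 05, 06): LUN5, LUN6
• RAID GP04 (disks 07, 08): LUN7, LUN8
• RAID GP05 (disks 09, 10): LUN0 (Logs)
• HS (hot spare)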

Storage Layout Implications for SQL Server

• Create a SQL data file per LUN, for every filegroup
• TempDB filegroups share same LUNs as other databases
• Log on separate disks, within each enclosure
  • Striped using SQL Striping
  • Log may share these LUNs with load files, backup targets

[Figure: data file layout across LUN 1 through LUN 16, plus Log LUN 1 and Local Drive 1]

• TempDB: TempDB.mdf (25GB), TempDB_02.ndf (25GB), TempDB_03.ndf (25GB), … TempDB_16.ndf (25GB)
• Permanent_DB, Permanent FG: Permanent_1.ndf, Permanent_2.ndf, Permanent_3.ndf, … Permanent_16.ndf
• Stage Database, Stage FG: Stage_1.ndf, Stage_2.ndf, Stage_3.ndf, … Stage_16.ndf
• Logs: Permanent DB Log, Stage DB Log

How Scans are Optimized

• SQL Server issues a large number of asynchronous read-ahead requests when performing scans
• Attempts to issue I/O at the rate needed to keep CPUs “busy”
• Size of I/O issued depends on the contiguity of the underlying data pages
  • I/O size can be any multiple of 8K up to 512K
• Average request size issued by read-ahead operations can be determined by looking at avg_fragment_size_in_pages exposed by sys.dm_db_index_physical_stats
  • Values >= 64 pages mean the I/O sizes issued by read-ahead should be at or near 512K
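The relationship between fragment size and read-ahead I/O size can be approximated as below. This is a simplified estimate, assuming the request size simply tracks the average fragment size in 8K pages and is capped at 512K; the function name is hypothetical and real request sizes will vary around this.

```python
PAGE_KB = 8      # size of one data page
MAX_IO_KB = 512  # largest read-ahead request mentioned on the slide (64 pages)


def expected_read_ahead_io_kb(avg_fragment_size_in_pages: float) -> float:
    """Rough expected read-ahead request size for a given average fragment size."""
    pages = max(1.0, avg_fragment_size_in_pages)  # a request covers at least one page
    return min(pages * PAGE_KB, MAX_IO_KB)


for fragment_pages in (8, 64, 200):
    print(fragment_pages, expected_read_ahead_io_kb(fragment_pages))
```

A heavily fragmented index averaging 8 pages per fragment would see requests of about 64K, while anything at or above 64 pages reaches the 512K ceiling.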

Read-Ahead in Action

Clustered index: Key Order
1. Next range of page requests is determined by looking at the B-Tree for the next range of key values
2. Pages for the range are sorted
3. I/O issued for each contiguous range of pages (up to 64 pages in a single request)

Heap: Allocation Order
• Scan GAM pages to determine next range of pages
• I/O issued for each contiguous range of pages (up to 64 pages in a single request)

Techniques to Maximize Scan Throughput

• -E startup parameter
• Minimize use of NonClustered indexes on Fact Tables
• Load techniques to avoid fragmentation
  • Load in Clustered Index order (e.g. date) when possible
• Index Creation always MAXDOP 1, SORT_IN_TEMPDB
• Isolate volatile tables in separate filegroup
• Isolate staging tables in separate filegroup or DB
• Periodic maintenance

Conventional data loads lead to fragmentation

Bulk Inserts into Clustered Index using a moderate ‘batchsize’ parameter

Each ‘batch’ is sorted independently

Overlapping batches lead to page splits

[Figure: index pages 1:31 through 1:40 laid out against the Key Order of Index, with pages out of physical order after overlapping batches]
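The overlap problem can be illustrated with a small model. It does not simulate pages or splits; it only flags batches whose lowest key falls at or below keys already loaded, which is the situation where rows must be inserted into the middle of the existing index. The function name and key values are hypothetical.

```python
def overlapping_batches(batches: list[list[int]]) -> list[int]:
    """Return indexes of batches that overlap the key range already loaded."""
    overlaps = []
    highest_loaded = None
    for index, batch in enumerate(batches):
        keys = sorted(batch)  # each batch is sorted independently of the others
        if not keys:
            continue
        if highest_loaded is not None and keys[0] <= highest_loaded:
            overlaps.append(index)  # rows land inside the existing key range
        highest_loaded = keys[-1] if highest_loaded is None else max(highest_loaded, keys[-1])
    return overlaps


# Hypothetical clustered-index keys arriving in three batches.
batches = [[1, 4, 7], [2, 5, 8], [9, 10, 11]]
print(overlapping_batches(batches))
```

Here the second batch interleaves with the first and is flagged, while the third batch only appends to the end of the index.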

Alternatives for loading

• Use a heap
  • Practical if queries need to scan whole partitions

or…

• Use a batchsize = 0
  • Fine if no parallelism is needed during load

or…

• Use a Two-Step Load
  1. Load to a Staging Table (heap)
  2. INSERT-SELECT from Staging Table into Target CI
  • Resulting rows are not fragmented
  • Can use Parallelism in step 1, which is essential for large data volumes

Two-Step Load Variations

To achieve high parallelism during historical load

• Typically into a partitioned table
• Use a Staging Table (heap) that is partitioned identically to the Target Table
• Use multiple concurrent streams to load the Staging Table with moderate batchsize (SSIS, Bulk Insert, etc.)
• INSERT-SELECT separate partitions into the Target Table, potentially in parallel
  • Use ALTER TABLE … SET (LOCK_ESCALATION = AUTO)

Note: If memory is limited, TempDB could be heavily used for sorting

Two-Step Load Variations (cont.)

To avoid most TempDB space and TempDB IO during load

• Use a partitioned Staging Table that is also indexed identically to the Target Table
• Load Staging Table using moderate batchsize (< 1M rows)
• Final INSERT-SELECTs will avoid any sort!
• However, the staging loads will be logged

Note: Parallelism will be limited if load batches overlap

Other fragmentation best practices

• Avoid Autogrow of filegroups
  • Pre-allocate filegroups to desired long-term size
  • Manually grow in large increments when necessary
• Keep volatile tables in a separate filegroup
  • Tables that are frequently rebuilt or loaded in small increments
• If historical partitions are loaded in parallel, consider separate filegroups for separate partitions to avoid extent fragmentation

Sometimes fragmentation can’t be avoided

• If incremental loads overlap data already present in the Clustered Index, page splits will occur anyway
• Periodic table maintenance can reduce the fragmentation
• Partitioning on history (date key) can help minimize needed maintenance operations

Maintenance considerations

• Use ALTER INDEX … REBUILD … WITH (MAXDOP = 1, SORT_IN_TEMPDB = ON)
  • Single threaded -- avoids creating new extent fragmentation
  • Can rebuild just the “current” partition
• Avoid ALTER INDEX … REORGANIZE
  • Pages will become physically ordered, but significant extent fragmentation may occur

Handling long-term accumulation of fragmentation

Sometimes it may be best to “start fresh”:

• Create a new filegroup to replace the old
• Create a new copy of the table in new filegroup
  • With matching Partitions and Clustered Index
• INSERT-SELECT from old to new (avoids a sort)
• Build secondary indexes
• Drop original table and rename the new
• All but final step can be performed online

Case 1: Insurance Claims -- High-volume loads in a short load window

• Example: Load and enrich 50 GB of incremental data in less than 1 hour
• Only possible with a highly parallel load design
• Use partitioned destination table
  • Partitioned by equal ranges of “customer key”
  • But a Clustered Index on Date
  • # partitions = # cores
• Parallel loading to staging table first
• Separate filegroups per partition prevent interleaving during load

System Design

[Figure: MSA2000 DAE drive layout]

• Primary Storage: 8 drives (4 RAID1 pairs): Pri_A, Pri_B, Pri_C, Pri_D
• Logs: 2 drives (1 RAID1 pair)
• Spares: 2 drives (hot spares)

Results

Loading - Subject Area 1
  Existing Appliance: 5:10:21 total time
  SQL Server Fast Track DW: 51:31 total time
  Comparison: SQL Server 6x faster

Loading - Subject Area 2
  Existing Appliance: 4:36:08 total time
  SQL Server Fast Track DW: 1:50:01 total time
  Comparison: SQL Server 2.5x faster

Query times - Subject Area 1
  3:03 avg query time (using 9 benchmark queries)
  Comparison: SQL Server 12x faster

Query times - Subject Area 2
  Comparison: SQL Server 7x faster

Price per TB (8TB) - Cal: $22K / TB
Price per TB (16TB) - Cal: $13K / TB
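The quoted load speedups can be cross-checked from the reported times. The sketch below assumes the times are in h:mm:ss or mm:ss form and reads the second Fast Track load time as 1 hour 50 minutes 1 second, which is an interpretation of the slide rather than something it states outright.

```python
def to_seconds(duration: str) -> int:
    """Convert 'h:mm:ss' or 'mm:ss' into seconds."""
    seconds = 0
    for part in duration.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def speedup(before: str, after: str) -> float:
    """How many times faster 'after' is compared with 'before'."""
    return to_seconds(before) / to_seconds(after)


area1 = speedup("5:10:21", "51:31")    # Loading, Subject Area 1
area2 = speedup("4:36:08", "1:50:01")  # Loading, Subject Area 2
print(round(area1, 1), round(area2, 1))
```

Both ratios round to the figures on the slide: about 6x for Subject Area 1 and about 2.5x for Subject Area 2.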


Case 2: Telecom--Initial Data Load

• Load 400 GB to new Clustered Index on an 8-core server in under 7 hours
• Target table designed with 8 partitions of evenly spaced historical ranges
• 3-step load process leveraging partitioning
  • Load, Index, Switch
  • All steps use parallelism
  • Minimal logging

Details:

• Data Size: 400G (50G * 8)
• Bulk Insert 8 files to match core count, and partition the final table according to core count
• 1 Heap Table per destination partition, and final table is assumed to be empty
• Create Clustered Index on the Heap Tables, and 1:1 switch each into the final Partitioned Table
• SSIS Package Attributes/MaxConcurrentExecutables: 8
• Use MAXDOP=1: minimal fragmentation

Steps:
1. Bulk Insert
2. Create Clustered Index
3. Switch


SQL Server Parallel Data Warehouse: A data warehouse appliance with massive scalability

• Massive Scale-Out of SQL Server through a Massively Parallel Processing (MPP) system: 10s TB ► 100s TB ► PB
• Choice of hardware vendor - Reference Architectures from HP, Bull, EMC, Dell, IBM
• Low cost of ownership through industry standard hardware
• Simplified deployment & maintenance via appliance model
• Integration with existing SQL Server 2008 data warehouses via Hub & Spoke Architecture
• Deep integration with Microsoft BI

Parallel Data Warehouse Appliance - Hardware Architecture

[Figure: appliance hardware architecture]

• Components: Control Nodes (Active / Passive), Management Servers, Landing Zone, Backup Node, Database Servers (each running SQL) with a Spare Database Server, Storage Nodes
• Interconnects: Dual Infiniband, Dual Fiber Channel
• External connections: Client Drivers, ETL Load Interface, Corporate Backup Solution, Data Center Monitoring
• Networks: Corporate Network, Private Network

Question & Answer Session

© 2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Resources

• Sessions On-Demand & Community: http://www.msteched.com/Australia
• Resources for IT Professionals: http://technet.microsoft.com/en-au
• Resources for Developers: http://msdn.microsoft.com/en-au
• Microsoft Certification & Training Resources: http://www.microsoft.com/australia/learning