Microsoft - Data Warehousing with FastTrack and PDW - TechEd Australia 2010
TRANSCRIPT
Nicholas Dritsas
Principal Program Manager
SQL Server Customer Advisory Team
Microsoft
Data Warehousing with FastTrack and PDW
SESSION CODE: #DAT314
3
Agenda
SQL Server Data Warehouse Offering Overview
Fast Track Offering
  Motivation
  Balanced Architecture Approach for DW
  Example FastTrack Reference Architectures
  Optimizing Storage, Load and Maintenance
  Case Studies
Parallel Data Warehouse Offering Overview
5
Microsoft Data Warehousing - Products Positioning
Axes: Scale, Complexity, HA by default, SW-HW integration
1. SQL Server 2008 R2: Minimal HW tune up/optimization. Supports mixed workloads.
2. SQL Server 2008 R2 with Fast Track Reference Architecture: Balanced solution for mostly scan centric workloads.
3. PDW: Max HW tune up for most DW scenarios.
4. PDW with Hub-and-spoke: Most flexible architecture for handling all DW scenarios.
6
New in SQL Server 2008: Data Warehousing Enablers
Build faster
  High speed Adapters
  MERGE SQL Statement
  Change Data Capture (CDC)
  Persistent Lookups
  Data Profiling
  …………
Deliver insights
  Star Join Query Optimization
  Parallel Query Enhancements
  Scale-out Shared Databases
  Data Mining Improvements
  New - Report Builder 2.0
  …………
Manage easily
  Data Compression
  Backup Compression
  Resource Governor
  Policy Based Administration
  Partition-Aligned Indexed Views
  …………
Included at no charge! No fee-based options:
  Compression
  Partitioning
  Advanced Security
  Manageability
  ETL
  Business Intelligence
9
SQL Server Relational Data Warehouses Today
Hundreds of deployments > 1 TB
Dozens of deployments > 5 TB
A wide variety of approaches
Synergy with the SQL Server BI Stack
Momentum!
Steady stream of enabling features: Resource Governor, Compression, Star Query, …
Next scale breakthrough coming with Parallel Data Warehouse this year
10
Some SQL Data Warehouses Today
Big SAN
Biggest 64-core Server
Connected together!
What’s wrong with this picture???
11
System out of balance
This server can consume 16 GB/Sec of IO, but the SAN can only deliver 2 GB/Sec
Even when the SAN is dedicated to the SQL Data Warehouse, which it often isn’t
Lots of disks for random IOPS, BUT
  Limited controllers
  Limited IO bandwidth
System is typically IO bound and queries are slow
Despite significant investment in both Server and Storage
12
The Alternative: A Balanced System
Design a server + storage configuration that can deliver all the IO bandwidth that CPUs can consume when executing a SQL Relational DW workload
Avoid sharing storage devices among servers
Avoid overinvesting in disk drives
Focus on scan performance, not IOPS
Lay out and manage data to maximize range scan performance and minimize fragmentation
13
What is FastTrack Data Warehouse?
A method for designing a cost-effective, balanced system for Data Warehouse workloads
Reference hardware configurations developed in conjunction with hardware partners using this method
Best practices for data layout, loading and management
Relational Database Only – Not SSAS, IS, RS
15
Data Warehouse Workload Characteristics
Scan Intensive
Hash Joins
Aggregations
SELECT L_RETURNFLAG,
       L_LINESTATUS,
       SUM(L_QUANTITY) AS SUM_QTY,
       SUM(L_EXTENDEDPRICE) AS SUM_BASE_PRICE,
       SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)) AS SUM_DISC_PRICE,
       SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)*(1+L_TAX)) AS SUM_CHARGE,
       AVG(L_QUANTITY) AS AVG_QTY,
       AVG(L_EXTENDEDPRICE) AS AVG_PRICE,
       AVG(L_DISCOUNT) AS AVG_DISC,
       COUNT(*) AS COUNT_ORDER
FROM LINEITEM
GROUP BY L_RETURNFLAG, L_LINESTATUS
ORDER BY L_RETURNFLAG, L_LINESTATUS
16
Balanced Architecture Components
17
Balanced System - CPU
Determine your data consumption rate, per CPU core, for your query mix
Simple example: Assume TPCH query 2 is your average query
Run the query on a test server with data fully cached in memory
Execute parallel query using MAXDOP 4
Observe 100% CPU on 4 cores
Time the query and observe # pages read
(SET STATISTICS IO ON; SET STATISTICS TIME ON)
Per Core Consumption = (# Logical Reads * 8 KB) / (CPU Time)
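The formula above can be worked through as a short calculation. This is a minimal sketch: the measurement values are hypothetical, and it assumes the CPU time reported by SET STATISTICS TIME is the total accumulated across all worker threads, which is why dividing by it yields a per-core rate.

```python
PAGE_BYTES = 8 * 1024  # SQL Server data page size (8 KB)

def per_core_mb_per_sec(logical_reads: int, cpu_time_ms: int) -> float:
    """Per-core consumption = (logical reads * 8 KB) / CPU time.

    Assumes cpu_time_ms is summed over every core the query used,
    so the quotient is already a per-core figure.
    """
    megabytes_read = logical_reads * PAGE_BYTES / (1024 * 1024)
    return megabytes_read / (cpu_time_ms / 1000.0)

# Hypothetical measurement: 2,560,000 logical reads and 100,000 ms of CPU time
rate = per_core_mb_per_sec(2_560_000, 100_000)
print(f"{rate:.0f} MB/s per core")
```

With these made-up inputs the result is 200 MB/s per core; real measurements will differ from query to query.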
18
You can get more sophisticated…
Realize that queries performing complex calculations, format conversions, multi-dimension hash joins, etc. will be more CPU-intensive than others
Complex queries will consume data at a slower per-core rate than simpler queries
Alternative: Measure per-core data consumption for a variety of queries, and take the weighted average
A standard approach to capacity planning
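The weighted average mentioned above might be computed as follows. This is only an illustration: the per-core rates and query counts are invented, and weighting by how often each query type runs is one reasonable choice among several.

```python
def weighted_consumption(measurements):
    """measurements: list of (mb_per_sec_per_core, weight) pairs.

    The weight can be any relative measure of how much a query type
    matters in the mix, for example how many times it runs per day.
    """
    total_weight = sum(weight for _, weight in measurements)
    return sum(rate * weight for rate, weight in measurements) / total_weight

# Hypothetical mix: simple scans, moderate joins, complex calculations
query_mix = [(300.0, 5), (150.0, 3), (50.0, 2)]
avg_rate = weighted_consumption(query_mix)
print(f"{avg_rate:.0f} MB/s per core (weighted)")
```

The complex queries drag the average below the rate of the simple scans, which is the effect the slide describes.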
19
Or you can leave it to us…
We’ve measured a mix of TPCH queries that reflect a ‘prototype’ Data Warehouse workload
Concluded that SQL Server 2008 R2 on current x64 cores consumes ~200 MB/sec per core on average for this workload
We use this as a basis for the published reference architectures
Your mileage will vary!
For precise system sizing, measure your own workload
20
Balanced System: Determine Storage Sizing
CPU core count and consumption rate for workload will determine # of controllers and enclosures needed to provide aggregate throughput
# of controllers will determine minimum disk count for delivering the scan bandwidth
Determine desired per-disk capacity based on expected data volume
Leave enough room for TempDB and for extra copies of the largest tables in the system, for maintenance activities
21
Balanced System: IO Stack
Use a 2x quad-core server as a building block / starting point
Ensure that the per-core data consumption rate can be delivered by all elements of the IO stack
Figure: Maximum theoretical throughput for IO stack components sized for an 8 CPU core Fast Track system with two 4-core CPU sockets (assumes 200 MB/s per core)
22
Balanced System: Determine Storage Sizing (2)
Keep in mind theoretical maximums are just that – theoretical
Some testing/validation may be needed
Figure: Observed bandwidth realized on 8 core Fast Track system (two 4-core CPU sockets) running SQLIO
23
Balanced System - Scaling the IO Stack
Figure: a server with 4-core CPU sockets and HBAs, connected through a fiber switch to storage enclosures. Each storage enclosure has two storage processors and five RAID-1 pairs.
25
Using a Preconfigured FastTrack Reference Architecture
Guesstimate of 200 MB/sec per core for an ‘average’ DW workload
Equates to an 800 MB/sec enclosure per quad-core CPU
Estimate total bandwidth needed under query concurrency
  Derives CPU count
  Derives total storage profile
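The sizing chain above (bandwidth, then CPU count, then storage profile) can be sketched as a calculation. The 200 MB/sec per core and the enclosure per quad-core CPU come from the slide; the 3,000 MB/sec target and the function name are hypothetical.

```python
import math

MB_PER_SEC_PER_CORE = 200   # guesstimate for an 'average' DW workload
CORES_PER_SOCKET = 4        # quad-core CPU building block

def size_system(required_mb_per_sec: int) -> dict:
    """Derive CPU and enclosure counts from a target scan bandwidth."""
    cores_needed = math.ceil(required_mb_per_sec / MB_PER_SEC_PER_CORE)
    sockets = math.ceil(cores_needed / CORES_PER_SOCKET)
    cores = sockets * CORES_PER_SOCKET
    return {
        "sockets": sockets,
        "cores": cores,
        "enclosures": sockets,  # one 800 MB/sec enclosure per quad-core CPU
        "bandwidth_mb_per_sec": cores * MB_PER_SEC_PER_CORE,
    }

# Hypothetical target: 3,000 MB/sec of aggregate scan bandwidth
plan = size_system(3000)
print(plan)
```

Rounding up to whole sockets means the configured bandwidth usually exceeds the target slightly; for precise sizing the per-core rate should come from a measured workload.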
26
Published Reference Architectures Balanced System Examples -- HP / Dell / IBM, 8 to 48 core
28
Optimizing Storage Layout for Scan Intensive Workloads
LUN configuration is based on RAID1 pairs
Optimal for scan type access patterns
Striping across storage is accomplished via SQL Server data files
Observed throughput for a single RAID pair >= 130 MB/s
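Figure: one enclosure with two storage processors (SP A, SP B)
RAID GP01 (disks 01, 02): LUN1, LUN2
RAID GP02 (disks 03, 04): LUN3, LUN4
RAID GP03 (disks 05, 06): LUN5, LUN6
RAID GP04 (disks 07, 08): LUN7, LUN8
RAID GP05 (disks 09, 10): LUN0 (Logs)
HS (hot spare)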
29
Storage Layout Implications for SQL Server
Create a SQL data file per LUN, for every filegroup
TempDB filegroups share same LUNs as other databases
Log on separate disks, within each enclosure
  Striped using SQL Striping
  Log may share these LUNs with load files, backup targets
Storage Layout Implications for SQL Server
Figure: data files spread across LUN 1, LUN 2, LUN 3, … LUN 16, with Log LUN 1 and Local Drive 1
TempDB: TempDB.mdf (25GB), TempDB_02.ndf (25GB), TempDB_03.ndf (25GB), … TempDB_16.ndf (25GB)
Permanent_DB, Permanent FG: Permanent_1.ndf, Permanent_2.ndf, Permanent_3.ndf, … Permanent_16.ndf
Stage Database, Stage FG: Stage_1.ndf, Stage_2.ndf, Stage_3.ndf, … Stage_16.ndf
Permanent DB Log
Stage DB Log
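The "one data file per LUN, for every filegroup" rule can be illustrated by enumerating the file names. This is a sketch only: the mount root and the LUNnn folder names are hypothetical, while the 16 LUNs and the Permanent/Stage file names follow the layout shown on the slide.

```python
LUN_COUNT = 16  # LUN 1 through LUN 16

def data_files(filegroup_prefix: str, mount_root: str = r"C:\FT") -> list:
    """Return one data file path per LUN for the given filegroup.

    mount_root and the LUNnn folders are placeholders; substitute the
    mount points used on the actual server.
    """
    return [
        "\\".join([mount_root, f"LUN{n:02d}", f"{filegroup_prefix}_{n}.ndf"])
        for n in range(1, LUN_COUNT + 1)
    ]

permanent_files = data_files("Permanent")
stage_files = data_files("Stage")
for path in permanent_files[:3]:
    print(path)
```

Because every filegroup has a file on every LUN, a scan of any table is spread across all of the RAID pairs.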
31
How Scans are Optimized
SQL Server issues a large number of asynchronous read-ahead requests when performing scans
Attempts to issue I/O at rate needed to keep CPUs “busy”
Size of I/O issued is dependent on contiguity of underlying data pages
  I/O size can be any multiple of 8K up to 512K
Average request size that will be issued by read-ahead operations can be determined by looking at avg_fragment_size_in_pages exposed by sys.dm_db_index_physical_stats
Values >= 64 pages will mean I/O sizes issued by read-ahead should be at or near 512K
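The relationship between fragment size and read-ahead I/O size can be written out as arithmetic. This is a rough estimate under the slide's numbers (8K pages, requests capped at 512K); the function name is hypothetical and actual request sizes depend on the physical layout.

```python
PAGE_KB = 8                  # one data page
MAX_READ_AHEAD_PAGES = 64    # 64 pages * 8 KB = 512 KB upper bound

def expected_read_ahead_kb(avg_fragment_size_in_pages: float) -> float:
    """Estimate the typical read-ahead request size in KB.

    A request cannot span more pages than are physically contiguous,
    and it is capped at 64 pages.
    """
    pages = min(avg_fragment_size_in_pages, MAX_READ_AHEAD_PAGES)
    return pages * PAGE_KB

for fragment_pages in (8, 32, 64, 200):
    print(fragment_pages, expected_read_ahead_kb(fragment_pages))
```

An index with 8-page fragments would be read in roughly 64K requests, one eighth of the size possible once fragments reach 64 pages.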
32
Read-Ahead in Action
Clustered index: Key Order
1. Next range of page requests is determined by looking at B-Tree for next range of key values
2. Pages for the range are sorted
3. I/O issued for each contiguous range of pages (up to 64 pages in a single request)
Heap: Allocation Order
Scan GAM pages to determine next range of pages
I/O issued for each contiguous range of pages (up to 64 pages in a single request)
33
Techniques to Maximize Scan Throughput
–E startup parameter
Minimize use of NonClustered indexes on Fact Tables
Load techniques to avoid fragmentation
  Load in Clustered Index order (e.g. date) when possible
Index Creation always MAXDOP 1, SORT_IN_TEMPDB
Isolate volatile tables in separate filegroup
Isolate staging tables in separate filegroup or DB
Periodic maintenance
34
Conventional data loads lead to fragmentation
Bulk Inserts into Clustered Index using a moderate ‘batchsize’ parameter
Each ‘batch’ is sorted independently
Overlapping batches lead to page splits
Figure: page IDs 1:31 through 1:40, shown against the key order of the index
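The effect of overlapping batches can be seen in a toy model of the leaf level of a clustered index. This is a simplified simulation, not SQL Server's actual allocator: pages hold four rows, a full page splits in half, and every new page is allocated at the end of the file. All names and sizes are hypothetical.

```python
import bisect

PAGE_CAPACITY = 4  # rows per leaf page (toy value)

def load(batches):
    """Insert batches (each sorted independently) into a toy clustered index.

    Returns the leaf pages in key order as [page_id, keys] lists.
    """
    pages = []      # leaf level, kept in logical (key) order
    next_id = 0     # next physical page id to allocate
    for batch in batches:
        for key in sorted(batch):
            if not pages:
                pages.append([next_id, [key]])
                next_id += 1
                continue
            first_keys = [page[1][0] for page in pages]
            i = max(bisect.bisect_right(first_keys, key) - 1, 0)
            page = pages[i]
            if len(page[1]) < PAGE_CAPACITY:
                bisect.insort(page[1], key)
            elif i == len(pages) - 1 and key > page[1][-1]:
                # Appending past the end of the index: new page, no split
                pages.append([next_id, [key]])
                next_id += 1
            else:
                # Page split: the upper half moves to a newly allocated page
                half = PAGE_CAPACITY // 2
                new_page = [next_id, page[1][half:]]
                next_id += 1
                page[1] = page[1][:half]
                pages.insert(i + 1, new_page)
                target = new_page if key >= new_page[1][0] else page
                bisect.insort(target[1], key)
    return pages

def fragments(pages):
    """Count places where the next page in key order is not physically next."""
    return sum(1 for a, b in zip(pages, pages[1:]) if b[0] != a[0] + 1)

ordered = fragments(load([range(32)]))
overlapping = fragments(load([range(0, 32, 2), range(1, 32, 2)]))
print(ordered, overlapping)
```

A single load in key order produces no out-of-sequence pages in this model, while two batches whose key ranges overlap force splits and scatter the pages.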
35
Alternatives for loading
Use a heap
  Practical if queries need to scan whole partitions
or…
Use a batchsize = 0
  Fine if no parallelism is needed during load
or…
Use a Two-Step Load
1. Load to a Staging Table (heap)
2. INSERT-SELECT from Staging Table into Target CI
  Resulting rows are not fragmented
  Can use parallelism in step 1 – essential for large data volumes
36
Two-Step Load Variations
To achieve high parallelism during historical load
  Typically into a partitioned table
Use a Staging Table (heap) that is partitioned identically to the Target Table
Use multiple concurrent streams to load the Staging Table with moderate batchsize (SSIS, Bulk Insert, etc.)
INSERT-SELECT separate partitions into the Target Table – potentially in parallel
  Use ALTER TABLE … SET (LOCK_ESCALATION = AUTO)
Note: If memory is limited, TempDB could be heavily used for sorting
37
Two-Step Load Variations (cont.)
To avoid most TempDB space and TempDB IO during load
Use a partitioned Staging Table that is also indexed identically to Target Table
Load Staging Table using moderate batchsize (< 1M rows)
Final INSERT-SELECTs will avoid any sort!
However the staging loads will be logged
Note: Parallelism will be limited if load batches overlap
38
Other fragmentation best practices
Avoid Autogrow of filegroups
  Pre-allocate filegroups to desired long-term size
  Manually grow in large increments when necessary
Keep volatile tables in a separate filegroup
  Tables that are frequently rebuilt or loaded in small increments
If historical partitions are loaded in parallel, consider separate filegroups for separate partitions to avoid extent fragmentation
39
Sometimes fragmentation can’t be avoided
If incremental loads overlap data already present in the Clustered Index, page splits will occur anyway
Periodic table maintenance can reduce the fragmentation
Partitioning on history (date key) can help minimize needed maintenance operations
40
Maintenance considerations
Use ALTER INDEX … REBUILD … WITH (MAXDOP = 1, SORT_IN_TEMPDB = ON)
  Single threaded -- avoids creating new extent fragmentation
  Can rebuild just the “current” partition
Avoid ALTER INDEX … REORGANIZE
  Pages will become physically ordered, but significant extent fragmentation may occur
41
Handling long-term accumulation of fragmentation
Sometimes it may be best to “start fresh”:
Create a new filegroup to replace the old
Create a new copy of the table in new filegroup
  With matching Partitions and Clustered Index
INSERT-SELECT from old to new (avoids a sort)
Build secondary indexes
Drop original table and rename the new
All but final step can be performed online
43
Case 1: Insurance Claims -- High-volume loads in a short load window
Example: Load and enrich 50 GB of incremental data in less than 1 hour
Only possible with a highly parallel load design
Use partitioned destination table
  Partitioned by equal ranges of “customer key”
  But a Clustered Index on Date
  # partitions = # cores
Parallel loading to staging table first
Separate filegroups per-partition prevents interleaving during load
44
System Design
MSA2000 DAE
Primary Storage: 8 drives (4 RAID1 pairs): Pri_A, Pri_B, Pri_C, Pri_D
Logs: 2 drives (1 RAID1 pair)
Spares: 2 drives (hot spares)
45
Results
 | Existing Appliance | SQL Server Fast Track DW | Comparison
Loading – Subject Area 1 | 5:10:21 total time | 51:31 total time | SQL Server 6x faster
Loading – Subject Area 2 | 4:36:08 total time | 1:50:01 total time | SQL Server 2.5x faster
Query times – Subject Area 1 | 3:03 avg query time (using 9 benchmark queries) | 0:15 avg query time (using 9 benchmark queries) | SQL Server 12x faster
Query times – Subject Area 2 | 56:44 avg query time (using 4 benchmark queries) | 8:09 avg query time (using 4 benchmark queries) | SQL Server 7x faster
Price per TB (8TB) – Cal: $22K / TB
Price per TB (16TB) – Cal: $13K / TB
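The quoted speedups can be checked against the reported times. This sketch uses only the figures from the results above and assumes the second Fast Track load time is read as 1 hour 50 minutes 1 second; the helper name is hypothetical.

```python
def to_seconds(clock: str) -> int:
    """Convert 'h:mm:ss' or 'm:ss' to seconds."""
    seconds = 0
    for part in clock.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds

# (existing appliance, SQL Server Fast Track DW)
results = {
    "Loading, Subject Area 1": ("5:10:21", "51:31"),
    "Loading, Subject Area 2": ("4:36:08", "1:50:01"),
    "Query times, Subject Area 1": ("3:03", "0:15"),
    "Query times, Subject Area 2": ("56:44", "8:09"),
}

speedups = {
    name: to_seconds(before) / to_seconds(after)
    for name, (before, after) in results.items()
}
for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x faster")
```

The computed ratios come out close to the rounded 6x, 2.5x, 12x and 7x figures on the slide.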
46
Case 2: Telecom--Initial Data Load
Load 400 GB to new Clustered Index on an 8-core server in under 7 hours
Target table designed with 8 partitions of evenly spaced historical ranges
3-step load process leveraging partitioning
  Load, Index, Switch
  All steps use parallelism
  Minimal logging
47
Case 2: Telecom -- Initial Data Load
Data Size: 400G (50G * 8)
Bulk Insert 8 files to match core count, and partition the final table according to core count
1 Heap Table per destination partition, and final table is assumed to be empty
Create Clustered Index on the Heap Tables, and 1:1 switch each into the final Partitioned Table
SSIS Package Attributes/MaxConcurrentExecutables: 8
Use MAXDOP=1: minimal fragmentation
1. Bulk Insert
2. Create Clustered Index
3. Switch
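The target for this case implies a minimum sustained rate, which can be worked out from the stated numbers (400 GB, 7 hours, 8 parallel streams). This is back-of-the-envelope arithmetic only; it treats the whole load, index and switch sequence as one window and assumes 1 GB = 1024 MB.

```python
TOTAL_GB = 400       # 8 files of 50 GB each
WINDOW_HOURS = 7     # upper bound on the load window
STREAMS = 8          # one stream per core / partition

total_mb_per_sec = TOTAL_GB * 1024 / (WINDOW_HOURS * 3600)
per_stream_mb_per_sec = total_mb_per_sec / STREAMS

print(f"aggregate: {total_mb_per_sec:.1f} MB/s, "
      f"per stream: {per_stream_mb_per_sec:.1f} MB/s")
```

Anything slower than this end-to-end rate would miss the 7 hour window.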
50
SQL Server Parallel Data Warehouse
A data warehouse appliance with massive scalability
Massive Scale-Out of SQL Server through Massively Parallel Processing (MPP) system: 10s TB ► 100s TB ► PB
Choice of hardware vendor - Reference Architectures from HP, Bull, EMC, Dell, IBM
Low cost of ownership through industry standard hardware
Simplified deployment & maintenance via appliance model
Integration with existing SQL Server 2008 data warehouses via Hub & Spoke Architecture
Deep integration with Microsoft BI
51
Parallel Data Warehouse Appliance - Hardware Architecture
Figure components:
Control Nodes (Active / Passive)
Management Servers
Landing Zone
Backup Node
Database Servers (SQL), plus Spare Database Server
Storage Nodes
Interconnects: Dual Infiniband, Dual Fiber Channel
Networks: Corporate Network, Private Network
External connections: Client Drivers, ETL Load Interface, Corporate Backup Solution, Data Center Monitoring
Question & Answer Session
54
Resources
www.msteched.com/Australia
  Sessions On-Demand & Community
http://technet.microsoft.com/en-au
  Resources for IT Professionals
http://msdn.microsoft.com/en-au
  Resources for Developers
www.microsoft.com/australia/learning
  Microsoft Certification & Training Resources