TRANSCRIPT
<Name><Title, Group>
SQL Server 2008 R2 Data Warehousing Overview
Agenda
Microsoft Data Warehousing Vision
SQL Server 2008 R2 DW enhancements
• High speed connectors
• Change Data Capture
• Star join enhancement features
• Partition table performance
• Data compression
• Backup compression
• Resource management
Summary
IT Systems Evolution
Islands of information: HR, Financial/Accounting, ERP, CRM and eCRM, Internet, Procurement, Call Center, Inventory
Enterprise Data Warehouse: HR, Financial/Accounting, ERP, CRM and eCRM, Internet, Procurement, Call Center, Inventory
Data Warehousing
Complete Data Warehouse Solution
Flexibility and Choice
Massive Scalability at a Low Cost
Microsoft Data Warehouse Vision
Make SQL Server the fastest and most affordable database for customers of all sizes
Simplified Data Warehouse Management
Microsoft Data Warehousing Offerings
Tier 1 Offerings
Tier 1 Services and Support

Enterprise
• Scalable and reliable platform for data warehousing on any hardware
• Ideal for data marts or small to mid-sized enterprise data warehouses (EDWs)
• Software only
• Scale up data warehousing
• 10s of terabytes

Datacenter
• Scalable and reliable platform for data warehousing on any hardware
• Ideal for data marts or small to mid-sized EDWs
• Software only
• Scale up data warehousing
• 10s of terabytes

Fast Track Data Warehouse
• Reference architectures offering best price performance for data warehousing
• Ideal for data marts or small to mid-sized data warehouses with scan-centric workloads
• Reference architecture (software and hardware)
• Scale up data warehousing
• 4–48 terabytes

Parallel Data Warehouse
• Appliance for high-end data warehousing requiring highest scalability, performance, or complexity
• Offers flexibility in hardware and architecture
• Data warehouse appliance (fully integrated software and hardware)
• Scale out data warehousing with massively parallel processing (MPP)
• 10s–100s of terabytes

Pricing (in the order listed on the slide)
• $28.8K per processor
• $9.9K per server; $162 per CAL
• $57.5K per processor only
• $107K–$683K (2–4 processors; includes hardware)
• $1.5–$1.7 million per rack (includes hardware)
• $13K–$44K per terabyte
• Integrated ETL and Reporting tools
• Simplified management
• Predictable response
• Lower storage costs
• Integrated Master Data Management tool
• Ability to scale up to 256 logical processors
• Continuous loading using StreamInsight
SQL Server 2008 R2
Provide a single version of the truth
Provide a centralized repository of consistent data
Provide up-to-date data to all employees to enable intelligent decision-making
Process large amounts of data in a fast, efficient, and affordable manner
Answer complex queries quickly
Guarantee predictable performance
Aggregate data from multiple sources
Store data efficiently
Guarantee predictable performance
High-speed connectors
Star join query optimizations
Data and backup compression
Change Data Capture
Policy based management
Table partitioning
Partitioned table parallelism
Resource Governor
Provide enterprise-scale integration
High speed connectors
• Attunity high speed connectors for Oracle and Teradata
• Deliver unparalleled throughput for extracting and loading data to and from Oracle and Teradata
Change data capture
• Enables tracking changes to the data in tables
• Provides relatively low impact on performance
• Speeds updates to data warehouses by capturing net changes
[Diagram: INSERT and UPDATE changes are captured and passed through ETL to the data warehouse]
Evidence
“The CDC feature gives us the information we need and frees us from the task of creating and testing triggers.”
- Gerald Schinagl, Project Manager and Systems Architect for the Sports Database, Austrian Broadcasting Corporation Radio & Television (ORF)
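The idea of "capturing net changes" can be illustrated with a small sketch. This is not how SQL Server implements Change Data Capture; it is a minimal Python model, assuming a hypothetical change log of (sequence, operation, key, row) tuples, that collapses every change to a key into the single operation an ETL job would need to apply.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical change-log entry: (sequence number, operation, key, row values).
Change = Tuple[int, str, int, Optional[dict]]


def net_changes(log: List[Change]) -> Dict[int, Tuple[str, Optional[dict]]]:
    """Collapse an ordered change log into one net operation per key."""
    net: Dict[int, Tuple[str, Optional[dict]]] = {}
    for _seq, op, key, row in sorted(log, key=lambda change: change[0]):
        previous = net.get(key)
        if op == "insert":
            # A delete followed by an insert means the row existed before
            # the window, so the net effect is an update.
            if previous and previous[0] == "delete":
                net[key] = ("update", row)
            else:
                net[key] = ("insert", row)
        elif op == "update":
            # Updating a row inserted in this window is still a net insert.
            kind = "insert" if previous and previous[0] == "insert" else "update"
            net[key] = (kind, row)
        elif op == "delete":
            if previous and previous[0] == "insert":
                # Inserted and deleted within the window: nothing to load.
                del net[key]
            else:
                net[key] = ("delete", None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return net


# Illustrative data only.
example_log = [
    (1, "insert", 1, {"qty": 5}),
    (2, "update", 1, {"qty": 7}),
    (3, "insert", 2, {"qty": 1}),
    (4, "delete", 2, None),
    (5, "update", 3, {"qty": 9}),
    (6, "delete", 4, None),
]
example_net = net_changes(example_log)
```

Six logged changes reduce to three operations here, which is the effect the slide describes: the warehouse load applies only the final state of each changed row.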
Increase performance for large queries
Star join optimizations
• Process more data in a shorter time by optimizing common join scenarios in a data warehouse
• Significantly reduce the amount of processing for star schema queries
• Faster join processing speeds up lookups during data load, which shortens load windows and enables more frequent updates for better reporting
[Diagram: a fact table joined to six dimension tables. Rows returned: 1,000,000 versus 623,194]
Evidence
“In addition to faster query processing, ORF has found an immediate improvement of 15 percent in data loading. We consider that a great advantage when you can get 15 percent faster data loading without having to change a line of our own code.”
- Gerald Schinagl, Project Manager and Systems Architect, ORF
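One way to picture why a star join optimization reduces processing is to filter fact rows against the qualifying dimension keys before joining. The sketch below is an illustration of that general idea in Python, not SQL Server's internal algorithm; the function name, the table layout, and the use of plain sets as the filter structure are all assumptions.

```python
def star_join(fact_rows, dimensions, predicates):
    """Join fact rows to dimensions, discarding non-matching facts early.

    fact_rows:  list of dicts, each holding foreign keys plus measures
    dimensions: {foreign_key_column: {key: attribute_dict}}
    predicates: {foreign_key_column: function(attribute_dict) -> bool}
    """
    # Step 1: evaluate each dimension predicate once and keep the
    # qualifying keys (a set stands in for the engine's filter structure).
    qualifying = {}
    for fk, dim_rows in dimensions.items():
        keep = predicates.get(fk, lambda attrs: True)
        qualifying[fk] = {key for key, attrs in dim_rows.items() if keep(attrs)}

    # Step 2: drop fact rows that fail any filter, then join the survivors.
    joined = []
    for fact in fact_rows:
        if all(fact[fk] in keys for fk, keys in qualifying.items()):
            row = dict(fact)
            for fk, dim_rows in dimensions.items():
                row.update(dim_rows[fact[fk]])
            joined.append(row)
    return joined


# Illustrative data only.
example_dimensions = {
    "date_id": {1: {"year": 2008}, 2: {"year": 2009}},
    "store_id": {10: {"region": "EU"}, 11: {"region": "US"}},
}
example_facts = [
    {"date_id": 1, "store_id": 10, "amount": 5},
    {"date_id": 2, "store_id": 10, "amount": 7},
    {"date_id": 2, "store_id": 11, "amount": 9},
]
example_predicates = {
    "date_id": lambda attrs: attrs["year"] == 2009,
    "store_id": lambda attrs: attrs["region"] == "EU",
}
example_result = star_join(example_facts, example_dimensions, example_predicates)
```

Only one of the three fact rows survives both filters, so only one row pays the cost of the dimension lookups.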
Divide and Manage Large Tables
Table Partitioning
• Manage and access subsets of data quickly and efficiently
• Reduce time spent troubleshooting storage allocation issues
• Speed data load and maintenance operations
• Take advantage of all CPUs in the machine to complete operations more quickly
Evidence
“Enhancements in partition query dramatically reduce the effects of lock escalation on systems that have to process hundreds and thousands of transactions per second, improving availability and improv[ing] query response time.”
- Randy Dyess, SQL Server Mentor, TechNet Article
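As a rough sketch of how range partitioning lets a query touch only a subset of the data, the Python below maps a value to a partition number and lists the partitions a range query must read. The boundary values and function names are hypothetical; each boundary is assumed to start a new partition.

```python
from bisect import bisect_right

# Hypothetical boundaries (dates as yyyymmdd integers): three boundaries
# give four partitions, and each boundary value starts a new partition.
BOUNDARIES = [20080101, 20090101, 20100101]


def partition_number(value, boundaries=BOUNDARIES):
    """Return the 1-based partition that holds the given value."""
    return bisect_right(boundaries, value) + 1


def partitions_for_range(low, high, boundaries=BOUNDARIES):
    """Return the partitions a query on low <= value <= high must read."""
    first = partition_number(low, boundaries)
    last = partition_number(high, boundaries)
    return list(range(first, last + 1))
```

With these boundaries, a query restricted to March through September 2009 reads a single partition out of four, and loading or switching out one year of data affects only that year's partition.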
Increase Query Performance
Partitioned Tables Parallelism
• Reduce access times for large amounts of data by querying all partitions in parallel
• Take advantage of all CPUs in the machine to give results more quickly
Evidence
“Enhancements in partition query dramatically reduce the effects of lock escalation on systems that have to process hundreds and thousands of transactions per second, improving availability and improv[ing] query response time.”
- Randy Dyess, SQL Server Mentor, TechNet Article
[Chart: partitioned query time, SQL Server 2005 SP2 versus SQL Server 2008 R2]
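Querying all partitions in parallel follows a scatter and gather pattern: each worker scans one partition and the partial results are combined. The sketch below shows only that structure, using Python threads and hypothetical names; it says nothing about how SQL Server schedules its workers, and Python threads will not show a real CPU speedup for this kind of work.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_partition(rows, predicate):
    """Sum the 'amount' of the qualifying rows in one partition."""
    return sum(row["amount"] for row in rows if predicate(row))


def parallel_total(partitions, predicate, max_workers=4):
    """Scan every partition on its own worker and combine the partial sums."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = pool.map(lambda rows: scan_partition(rows, predicate), partitions)
        return sum(partials)


# Illustrative data only: three partitions, one of them empty.
example_partitions = [
    [{"amount": 1}, {"amount": 2}],
    [{"amount": 3}],
    [],
]
example_total = parallel_total(example_partitions, lambda row: row["amount"] > 1)
```

Because each partition is scanned independently, the work divides cleanly across however many workers are available.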
Lower storage costs
Data compression
• 20% to 60% compression ratios [1]
• Save disk storage
• Provides more room to store more data, which allows more instances to share disk resources
• Reduced data size can increase performance
[1] Stated percentages are typical but not guaranteed
Evidence
“Our initial testing shows we’ll see 50 percent to 60 percent data compression using SQL Server 2008...we will also benefit from faster query performance.”
- Mazal Tuchler, BI Manager, Clalit Health Services
Lower storage costs
Backup compression
• 50% to 90% compression ratios [1]
• Reduce the cost for disks and tapes used to back up data
• Smaller backups can be taken offsite more easily to protect data
• Increase administrator productivity
• Smaller backups usually increase backup and restore speed, resulting in higher availability
[1] Stated percentages are typical but not guaranteed
Evidence
“We’re anticipating an 80 percent reduction in our backup file sizes using backup compression on SQL Server 2008.”
- Peter Hammond, President, CyberSavvy
[Chart: backup size, backup time, and restore time with no compression versus compression]
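The percentage ranges quoted on the two compression slides can be turned into a quick sizing estimate. The sketch below assumes the slides' "compression ratios" mean the share of space saved, which the deck does not state explicitly; the function name and the 1,000 GB figure are illustrative only, and the slides themselves note that the percentages are typical, not guaranteed.

```python
def size_after_compression(size_gb, low_saving_pct, high_saving_pct):
    """Return (largest, smallest) expected size in GB for a savings range.

    Assumes a "compression ratio" of N% means N% of the space is saved.
    """
    if not 0 <= low_saving_pct <= high_saving_pct <= 100:
        raise ValueError("savings must satisfy 0 <= low <= high <= 100")
    largest = size_gb * (100 - low_saving_pct) / 100
    smallest = size_gb * (100 - high_saving_pct) / 100
    return largest, smallest


# Illustrative 1,000 GB database using the ranges quoted on the slides.
data_range = size_after_compression(1000, 20, 60)    # data compression
backup_range = size_after_compression(1000, 50, 90)  # backup compression
```

Under that reading, 1,000 GB of data would occupy between 400 GB and 800 GB after data compression, and a 1,000 GB backup would shrink to between 100 GB and 500 GB.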
Manage a large number of servers
Central management servers
• Manage Relational Databases, Analysis Services, Reporting Services, Notification Services, and SQL Server Mobile Edition using one management tool
• Simplify maintenance by executing commands simultaneously on multiple servers
• See integrated results when you query data from a group of servers
Policy-based management
• Easily create policies that control security, database options, object naming conventions, and other settings at a highly granular level
Evidence
“Policy-based Management gives us the ability to enforce naming standards, security settings, memory settings, and other elements to simplify database management.”
- Glenn Berry, Database Architect, NewsGator Technologies
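The concept behind policy-based management, declaring conditions once and evaluating them against many objects, can be sketched in a few lines. The policies, object fields, and the `tbl_` prefix below are hypothetical and do not reflect the actual SQL Server policy facets or API.

```python
# Hypothetical policies: each check returns True when the object complies.
# The object fields ("type", "name", "auto_close") are assumptions.
POLICIES = {
    "table_name_prefix": lambda obj: obj["type"] != "table"
    or obj["name"].startswith("tbl_"),
    "auto_close_off": lambda obj: obj["type"] != "database"
    or not obj.get("auto_close", False),
}


def evaluate(objects, policies=POLICIES):
    """Return (object name, policy name) for every failed check."""
    return [
        (obj["name"], name)
        for obj in objects
        for name, check in policies.items()
        if not check(obj)
    ]


# Illustrative inventory only.
example_objects = [
    {"type": "table", "name": "tbl_sales"},
    {"type": "table", "name": "Orders"},
    {"type": "database", "name": "dw", "auto_close": True},
]
example_violations = evaluate(example_objects)
```

Keeping the rules as data means the same set can be evaluated against every server in a group, which is the maintenance saving the slide points to.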
Help ensure predictable performance
Resource Governor
• Prevent runaway queries that hold resources for extended periods of time
• Allow OLTP and data warehouse workloads on the same server while limiting the impact of large data warehouse queries on OLTP
• Provide consistent user experience, which can result in fewer service calls about slow systems
[Diagram: applications and business logic are routed to resource pools labeled POOL 1, POOL 2, and POOL 0; limits shown are 50%, 30%, and 20%; loads shown are 25%, 45%, and 15%]
Evidence
“We deal with a lot of large data feeds—both coming from manufacturers as data updates, and going out to our subscribers. Resource Governor allows us to control the percent[age] of total resources any operation can consume so that they don’t adversely impact our real-time data access.”
- Michael Steineke, Vice President, Information Technology, Edgenet
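The pool limits in the Resource Governor diagram can be modeled as simple caps on each workload's share of resources. The sketch below is a deliberate simplification in Python: it treats every limit as an always-on hard cap, the pool names are hypothetical, and the pairing of the diagram's limits (50%, 30%, 20%) with its loads (25%, 45%, 15%) is an assumption, because the flattened slide does not show which load belongs to which pool.

```python
def allocate(demand_pct, limit_pct):
    """Cap each pool's demanded share at that pool's configured limit."""
    if sum(limit_pct.values()) > 100:
        raise ValueError("pool limits must not exceed 100% in total")
    return {pool: min(demand, limit_pct[pool]) for pool, demand in demand_pct.items()}


# Hypothetical pools; the numbers are borrowed from the slide's diagram.
example_limits = {"pool_a": 50, "pool_b": 30, "pool_c": 20}
example_demand = {"pool_a": 25, "pool_b": 45, "pool_c": 15}
example_granted = allocate(example_demand, example_limits)
```

In this model the pool asking for 45% is held to its 30% limit while the other two run unthrottled, which is how a large data warehouse query can be kept from starving an OLTP workload on the same server.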
Scale Up to the Hardware Limit
Maximum number of processors
• Support for most powerful servers, scaling up to OS maximum
• Can execute parallel index and consistency check operations
• Query optimizer makes operations parallel when there is a benefit
• Standard only uses single CPU for index and consistency check operations
Evidence
“For most large queries SQL Server generally scales linearly or nearly linearly. For speed up, this means that if we double the number of CPUs, we see the response time drop in half.”
- Craig Freedman, Coauthor, Inside Microsoft SQL Server 2005: Query Tuning and Optimization
Standard edition: up to 4 CPUs only
Enterprise edition: up to 8 CPUs
Datacenter edition: up to HW limit
SQL Server Timeline
2008
• Enterprise ETL Services
• Star Join Query Optimizations
• Data Compression
• Partitioned table parallelism
2010
• Scale up to 256 logical processors
• Data compression for Unicode columns
• Master Data Management (Stratature Integration)
• Continuous Loading
Beyond (vNext)
• Data Quality Services (Zoomix)
• Enhanced ETL capabilities
Preliminary Information Subject to Change
Next Steps
Learn More:
• Visit the Microsoft Data Warehousing Portal
• Visit the Fast Track and Parallel Data Warehouse web pages
• Visit the SQL Server DW Portal on TechNet
Try Now:
• Download SQL Server 2008 R2
• Talk to your Microsoft representative about scheduling:
  • Data Warehouse roadmap service
  • SQL Server 2008 R2 POC
Summary
Microsoft SQL Server provides a comprehensive, scalable data warehouse platform that enables organizations to:
• Build data warehouses faster on the data integration platform.
• Manage growing data volumes with an enterprise-ready relational database.
• Deliver actionable, integrated insights with the Microsoft Business Intelligence platform.
© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
Microsoft’s ongoing investments in Data Warehousing

Heterogeneous Connectivity & Workloads
• 2005: Enterprise class ETL tool
• 2008: High Perf. Connectors (Oracle, Teradata, SAP BW)
• Futures: Mixed workload support; Continuous Loading

Data Integrity & Quality
• 2005: Data Cleansing (Fuzzy lookup/matching)
• 2008: Data Profiling
• Futures: Master Data Management (Stratature Integration); Integrated DQ Services (Zoomix)

Compliance & Security
• 2005: Data Protection & Tracing
• 2008: Policy based auditing
• Futures: Rights Management

Data Warehouse Scale
• 2005: Multi TB Warehouses; Enterprise scalability; DW Reference Architectures
• 2008: 10s of TB Warehouses; Parallel partitioning; Data compression; New Reference Architectures
• Futures: PB Warehouses; >64 Core Processing; Scale out through MPP

Data Warehouse Management
• 2005: Unified manageability
• 2008: Policy Based Admin.; DB Resource Governance
• Futures: Perf. Management Tools; BI Resource Governance; Improved Predictability