April 10-12, Chicago, IL PDW Architecture Gets Real: Customer Implementations Brian Walker | Microsoft Corporation PDW Center of Excellence Murshed Zaman | Microsoft Corporation SQL Customer Advisory Team


Page 1

April 10-12, Chicago, IL

PDW Architecture Gets Real: Customer Implementations

Brian Walker | Microsoft Corporation, PDW Center of Excellence

Murshed Zaman | Microsoft Corporation, SQL Customer Advisory Team

Page 2

April 10-12, Chicago, IL

Please silence cell phones

Page 3

Agenda

Demos: PDW xVelocity - Reporting Structured/Unstructured Data

Detail: Introduction to PDW and How it Works

Future: Highlight Current Customer Use Cases

Page 4

How does it work?

Page 5

Introducing Parallel Data Warehouse

Pre-Built Hardware + Software Appliance

• Co-engineered with HP and Dell

• Pre-built Hardware

• Pre-installed Software

• Appliance installed in 1-2 days

• Support - Microsoft provides first call support

• Hardware partner provides onsite break/fix support

Appliance Experience

Plug and Play Built-in Best Practices

Save Time

Page 6

The Power of PDW

Massively Parallel Processing (MPP)

Uses many separate CPUs running in parallel to execute a single query

Each CPU has its own memory

Dedicated InfiniBand network communications between servers

Symmetric Multi-Processing (SMP)

Multiple CPUs used to complete individual processes simultaneously

All CPUs share the same memory and disks

Network controllers share bandwidth
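
As a rough illustration of the MPP idea on this slide (not PDW code), the sketch below splits rows across a few hypothetical "nodes", lets each node aggregate only the rows it owns, and then merges the partial results. The node count, the sample rows, and every function name are assumptions made for the example, and threads merely stand in for separate servers.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

NODE_COUNT = 4  # assumed number of compute nodes, for illustration only


def distribute(rows, key, node_count=NODE_COUNT):
    """Assign each row to one node by key modulo node count (a simple stand-in for hash distribution)."""
    partitions = [[] for _ in range(node_count)]
    for row in rows:
        partitions[row[key] % node_count].append(row)
    return partitions


def local_sum(partition):
    """Work done on one node: aggregate only the rows that node owns."""
    totals = defaultdict(int)
    for row in partition:
        totals[row["store_id"]] += row["qty"]
    return dict(totals)


def parallel_sum(rows):
    """Run local_sum on every partition, then merge the partial results."""
    partitions = distribute(rows, key="store_id")
    # Threads stand in for separate servers; a real MPP system uses separate machines.
    with ThreadPoolExecutor(max_workers=NODE_COUNT) as pool:
        partials = list(pool.map(local_sum, partitions))
    merged = defaultdict(int)
    for partial in partials:
        for store_id, qty in partial.items():
            merged[store_id] += qty
    return dict(merged)


# Hypothetical sample data: 12 sales rows spread over 6 stores.
rows = [{"store_id": i % 6, "qty": i} for i in range(1, 13)]
totals = parallel_sum(rows)
```

The point of the design is that no node ever needs another node's rows to do its share of the work; only the small partial results travel to the final merge step.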

Page 7

The Basic Full Rack

1 RACK

InfiniBand & Ethernet

• 128 cores on 8 compute nodes

• 2 TB of RAM on compute

• Up to 168 TB of temp DB

• Up to 1 PB of user data

• Reduce hardware footprint by virtualizing the entire control server rack down to a few nodes

• 1.5x lower price/TB, providing one of the lowest prices per TB in the industry

• Save up to 70% of storage with up to ~15x compression via the xVelocity columnstore

• Resilient, scalable, and high-performance storage features in Windows Server 2012 replace SAN with high-density, low-cost SAS JBODs

• 70% more disk I/O bandwidth over SQL Server PDW 2008 R2

SQL Server PDW 2012
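
The rack figures above can be turned into per-node numbers with simple arithmetic. The sketch below does that, and also shows how a compression ratio relates to a storage saving. It assumes 1 TB = 1024 GB and an even split across nodes. The slide's "up to 70%" and "up to ~15x" are both upper bounds and need not describe the same data set, so the two helper functions only show how the two measures convert into each other.

```python
CORES = 128
COMPUTE_NODES = 8
RAM_TB = 2

# Even split across compute nodes (an assumption; the slide gives rack totals only).
cores_per_node = CORES // COMPUTE_NODES            # 16 cores
ram_gb_per_node = RAM_TB * 1024 / COMPUTE_NODES    # 256.0 GB, taking 1 TB = 1024 GB


def savings_from_ratio(ratio):
    """Fraction of storage saved when data shrinks by `ratio` (15 means 15x smaller)."""
    return 1 - 1 / ratio


def ratio_from_savings(savings):
    """Compression ratio needed to save the given fraction of storage."""
    return 1 / (1 - savings)


saving_at_15x = savings_from_ratio(15)       # about 0.933, i.e. roughly 93% saved
ratio_for_70_pct = ratio_from_savings(0.70)  # about 3.33x
```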

Page 8

Dimensional Model

Date Dim: Date Dim ID, Calendar Year, Calendar Qtr, Calendar Mo, Calendar Day

Store Dim: Store Dim ID, Store Name, Store Mgr, Store Size

Item Dim: Prod Dim ID, Prod Category, Prod Sub Cat, Prod Desc

Promo Dim: Mktg Camp ID, Camp Name, Camp Mgr, Camp Start, Camp End

Sls Fact: Date Dim ID, Store Dim ID, Prod Dim ID, Mktg Camp Id, Qty Sold, Dollars Sold

PDW Data Layout

[Figure: Compute Nodes 1 to 5, each holding the dimension tables D, S, I, P and one portion of the fact table, F1 to F5.]
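
The layout on this slide, where every compute node holds the dimension tables plus one slice of the fact table, can be sketched in a few lines. This is an illustration only: the sample rows, the choice of Prod Dim ID as the distribution column, and the function names are assumptions, and only the store dimension is modeled.

```python
NODES = 5  # the slide's figure shows five fact portions, F1 to F5

# Small dimension table: copied in full to every node (replicated).
store_dim = {1: "Downtown", 2: "Airport", 3: "Suburb"}

# Large fact table: each row lives on exactly one node (distributed).
sales_fact = [
    {"store_dim_id": 1, "prod_dim_id": 10, "qty_sold": 2},
    {"store_dim_id": 2, "prod_dim_id": 11, "qty_sold": 1},
    {"store_dim_id": 1, "prod_dim_id": 12, "qty_sold": 4},
    {"store_dim_id": 3, "prod_dim_id": 13, "qty_sold": 3},
    {"store_dim_id": 2, "prod_dim_id": 14, "qty_sold": 5},
    {"store_dim_id": 3, "prod_dim_id": 15, "qty_sold": 1},
]


def build_nodes(fact_rows, dim, node_count=NODES):
    """Give every node its own copy of the dimension and a slice of the fact rows."""
    nodes = [{"store_dim": dict(dim), "sales_fact": []} for _ in range(node_count)]
    for row in fact_rows:
        target = row["prod_dim_id"] % node_count  # assumed distribution column
        nodes[target]["sales_fact"].append(row)
    return nodes


def qty_by_store_name(nodes):
    """Join fact to dimension on each node, then add up the per-node results."""
    result = {}
    for node in nodes:
        # The join is local: the node already holds a full copy of store_dim.
        for row in node["sales_fact"]:
            name = node["store_dim"][row["store_dim_id"]]
            result[name] = result.get(name, 0) + row["qty_sold"]
    return result


nodes = build_nodes(sales_fact, store_dim)
report = qty_by_store_name(nodes)
```

Because the dimension is present on every node, the join never has to move fact rows between nodes, which is the usual motivation for replicating small tables and distributing large ones.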

Page 9

Seamlessly Add Capacity

Smallest (53TB) To Largest (6PB)

• Start small with a warehouse of a few terabytes

• Add capacity up to 6 petabytes

[Figure: capacity added in steps from 53 TB up to 6 PB, the largest warehouse.]

Start Small And Grow

Start Small, Linearly Scale Out

Page 10

Any Size : Next-Gen Performance

Columnstore Provides Dramatic Performance

• Updateable and clustered xVelocity columnstore

• Stores data in columnar format

• Memory-optimized for next-generation performance

• Updateable to support bulk and/or trickle loading

Up to 50x Faster

Up to 15x compression

Save Time and Costs

Batch Processing

xVelocity - Fast Data Query Processing

[Figure labels: Customer, Sales, Country, Supplier, Products]
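
To see why a columnar layout tends to compress well and scan quickly, the sketch below pivots a few rows into columns and applies run-length encoding to the repetitive ones. Run-length encoding is used here as one common columnar technique; the slides do not describe xVelocity's actual encodings, and the sample rows and column names are invented for the example.

```python
from itertools import groupby

# Hypothetical sales rows: (country, product, sales)
rows = [
    ("US", "Widget", 3),
    ("US", "Widget", 5),
    ("US", "Gadget", 2),
    ("DE", "Gadget", 7),
    ("DE", "Gadget", 1),
    ("DE", "Gadget", 4),
]
COLUMNS = ("country", "product", "sales")


def to_columns(rows):
    """Pivot row tuples into one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)}


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows)
encoded_country = run_length_encode(columns["country"])  # [("US", 3), ("DE", 3)]
encoded_product = run_length_encode(columns["product"])  # [("Widget", 2), ("Gadget", 4)]

# A query that needs only one column reads only that column.
total_sales = sum(columns["sales"])  # 22
```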

Page 11

Demo: xVelocity

The Power of Updatable Columnstore Indexing on PDW 2012

Page 12

Any Data: Hadoop Integration

• External Tables and full SQL query access to data stored in HDFS

• HDFS bridge for direct & fully parallelized access of data in HDFS

• Joining ‘on-the-fly’ PDW data with data from HDFS

• Parallel import of data from HDFS in PDW tables for persistent storage

• Parallel export of PDW data into HDFS including ‘round-tripping’ of data

Polybase Details

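[Figure: Regular T-SQL is submitted to the Enhanced PDW Query Engine in PDW 2012, which returns Results. Structured data is held in PDW; unstructured data on the HDFS Data Nodes is reached through the External Table and the HDFS Bridge.]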
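
The "join on the fly" idea above can be pictured with a small stand-in: rows are streamed from a delimited file, given a schema at read time, and joined to a table that already lives in the warehouse. This is not the Polybase API. The in-memory file, the table contents, and the function names are all hypothetical.

```python
import csv
import io

# Stand-in for a delimited file stored in HDFS (hypothetical contents).
hdfs_file = io.StringIO("customer_id,clicks\n1,40\n2,15\n4,9\n")

# Stand-in for a relational table already stored in the warehouse.
customers = {1: "Ann", 2: "Bob", 3: "Cy"}


def external_rows(file_obj):
    """Read the file lazily and apply a schema at read time, like an external table."""
    for record in csv.DictReader(file_obj):
        yield {
            "customer_id": int(record["customer_id"]),
            "clicks": int(record["clicks"]),
        }


def join_on_the_fly(table, external):
    """Inner join: warehouse rows matched against rows streamed from the file."""
    return [
        (table[row["customer_id"]], row["clicks"])
        for row in external
        if row["customer_id"] in table
    ]


joined = join_on_the_fly(customers, external_rows(hdfs_file))
```

Nothing from the file is copied into the warehouse table here; the file is read once while the join runs, which mirrors the slide's contrast between querying in place and importing for persistent storage.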

Page 13

Hadoop Data

Structured Data

Existing Excel Skillset With Big Data

Familiar Tools Analyse Big Data

• Native Microsoft BI Integration to PDW

• Structured and unstructured data in same spreadsheet

• Widely adopted and familiar user tools

No IT Intervention

Analyze All Data Types

High Adoption Of Excel

Familiar Tools To Analyze Structured/Unstructured Data

Page 14

Demo: PDW 2012 Polybase

Simultaneous Reporting from Structured and Unstructured Data

Page 15

Current Customer Use Cases

Page 16

A large US grocery store chain needed an MPP data warehouse to improve performance and scale, and to provide timely data to its executives and analysts

PDW will scale to meet future growth and support more functional areas at Hy-Vee

PDW offered a 100x query performance gain over conventional SQL Server, faster data loads, and more scale, holding 7 years of purchasing data instead of 2

Benefits

“…basic queries that previously took 20 minutes only took seconds using the SQL Server 2008 R2 Parallel Data Warehouse.” -Tom Settle, Assistant VP, Data Warehousing, Hy-Vee

Upgrading to PDW Gains 100x Improvement

Page 17

Business Objectives

Provide Broader Range of Critical Customer Purchasing Data - Current system only supported 2 years of data – Business required 7 years

Critical

Enable Self-Service Reporting - SSAS/SSRS/SharePoint/Excel

Save Time

Enable User Ad hoc Reporting - Leveraging Excel/SharePoint

Query

Improve Performance of Complex Transformations - Faster delivery of data within specified SLAs

Load Speed

Reduced IT Costs - Creating self-sufficient end users – Frees IT to focus on delivering new data

Save Costs

Provide solution that Scales to Meet Future Data Needs - Expansion of history, point of sale detail, and expansion into social media

Scale

Page 18

Shift from ETL to ELT

• Move their complex transformations and calculations to SQL Server Parallel Data Warehouse from ETL server

• PDW has allowed Hy-Vee to create an enterprise data warehouse centralizing data from many sources

• Archiving point of sale source files for later data extraction

Using the Power of MPP

Complex Transformations
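
The ETL-to-ELT shift described on this slide means loading source rows first and then transforming them with SQL inside the warehouse. The sketch below shows that order of operations, with SQLite standing in for the warehouse engine. The table names, columns, and sample rows are assumptions, and a single in-memory database obviously has none of PDW's parallelism.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for the warehouse engine

# Extract + Load: land the source rows untouched in a staging table.
conn.execute(
    "CREATE TABLE stage_sales (store_id INTEGER, sold_on TEXT, amount_cents INTEGER)"
)
raw_rows = [
    (1, "2013-04-10", 1250),
    (1, "2013-04-10", 750),
    (2, "2013-04-10", 500),
    (2, "2013-04-11", 1500),
]
conn.executemany("INSERT INTO stage_sales VALUES (?, ?, ?)", raw_rows)

# Transform: done with SQL inside the database, not on a separate ETL server.
conn.execute(
    """
    CREATE TABLE daily_sales AS
    SELECT store_id, sold_on, SUM(amount_cents) / 100.0 AS amount_dollars
    FROM stage_sales
    GROUP BY store_id, sold_on
    """
)

daily = conn.execute(
    "SELECT store_id, sold_on, amount_dollars FROM daily_sales "
    "ORDER BY store_id, sold_on"
).fetchall()
```

In an MPP warehouse the same "create a table from a query" step is where the parallel hardware does the heavy lifting, which is why moving transformations off the ETL server can pay off.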

Page 19

Upgrade to PDW 2012

• Improves their opportunity to further analyze social media data

• Query data without having to move it into a relational database

• Provides an alternative archive solution for point of sale data

Future Option

Page 20

Data Archive Challenge – Financial Customer

[Figure labels: Reporting Services, Archive Servers, Centralized EDW]

• Business actively analyzes only a rolling 12 months of data

• Regulations require that data stay online and accessible for an extended period

• Data older than 12 months is pushed to a farm of SQL Servers to meet regulatory requirements

Current Solution

Page 21

Data Archive Challenge – Financial Customer

[Figure labels: Reporting Services, Archive Servers, Centralized EDW, HDFS Data Nodes, Unstructured data, HDFS bridge]

• Replace archive farm with Hadoop cluster

• PDW provides single point of access

• Allows analysts to leverage existing SQL skills

• Much lower maintenance and administration

• Meets regulatory requirements

Future Solution
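
The rolling 12-month rule behind this archive design comes down to a date cutoff: rows newer than the cutoff stay in the warehouse, older rows go to the archive tier. A minimal sketch follows. The field name `txn_date`, the sample rows, and the choice to keep rows dated exactly on the cutoff are assumptions, since the slides do not define the boundary.

```python
import calendar
from datetime import date


def months_back(day, months=12):
    """Return the date `months` months before `day`, clamping to the month's last day."""
    index = day.year * 12 + (day.month - 1) - months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(day.day, last_day))


def route(rows, as_of):
    """Split rows into (active, archive) around the rolling 12-month cutoff."""
    cutoff = months_back(as_of)
    active = [r for r in rows if r["txn_date"] >= cutoff]  # stays in the warehouse
    archive = [r for r in rows if r["txn_date"] < cutoff]  # goes to the archive tier
    return active, archive


# Hypothetical transactions.
rows = [
    {"id": 1, "txn_date": date(2013, 1, 15)},
    {"id": 2, "txn_date": date(2012, 4, 10)},
    {"id": 3, "txn_date": date(2012, 4, 9)},
    {"id": 4, "txn_date": date(2010, 12, 31)},
]
active, archive = route(rows, as_of=date(2013, 4, 10))
```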

Page 22

AMD is also processing more reporting queries than it previously could—between 10,000 and 13,000 a day—with an average runtime of a few seconds and virtually no performance issues.

Because of the user complaints about the previous system, the data warehouse team had one employee devoted full time to addressing performance-related support tickets. With Parallel Data Warehouse, AMD has reduced support work to just a few hours a week.

AMD runs an average of 1,500 loads per day, and data loads to a given table range from four-minute to four-hour intervals. AMD averages about 500,000 file loads a day.


Benefits

“We used to worry about backlogs, but no more,”

- Rajarao Chitturi, Database and Applications Manager at AMD

AMD Boosts Performance with PDW

Page 23

AMD Business Challenges

• Only supported 6 month data retention

• Issues loading concurrently with high query volume

Obstacles With SMP Oracle

• Loading data always lagged behind by days

• Analysts couldn’t access recent data

• Continuous data loads throughout the day while users were querying the system

Load Demand

• Custom reporting tools hosted on Linux use JDBC and ODBC drivers

Linux Based Reporting

Page 24

Project Overview

Wafer Quality Assurance Data - 42 TB on PDW

Space Saving PDW Index Lite Approach - Oracle required excessive non-clustered indexes to get any performance

Improved Loading Speed - 660 GB/hr. throughput

10,000 – 13,000 Analytic Queries per Day - Most are scan intensive

Faster Backups – Complete in 1-2 hours per Database - Compared to a week on Oracle

Reduced Support Costs by 90% - No more chopping up queries to fit the data warehouse

Critical

Save Time

Query

Save Space

Load Speed

Save Costs
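
To put the quoted figures on a common footing, the sketch below converts them into hours and per-minute rates. It is unit conversion only. The slides do not say the 42 TB was ever loaded in one pass or that queries arrive evenly through the day; both are assumptions made here, as is 1 TB = 1000 GB.

```python
LOAD_RATE_GB_PER_HR = 660
DATA_TB = 42
QUERIES_PER_DAY = (10_000, 13_000)


def hours_to_load(data_tb, rate_gb_per_hr, gb_per_tb=1000):
    """Hours needed to move `data_tb` at a steady `rate_gb_per_hr`."""
    return data_tb * gb_per_tb / rate_gb_per_hr


def per_minute(per_day):
    """Average events per minute, assuming an even spread over 24 hours."""
    return per_day / (24 * 60)


full_load_hours = hours_to_load(DATA_TB, LOAD_RATE_GB_PER_HR)  # about 63.6 hours
low, high = (per_minute(q) for q in QUERIES_PER_DAY)           # about 6.9 to 9.0 per minute
```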

Page 25

Parallel Data Warehouse 2012

Page 27

Win a Microsoft Surface Pro!

Complete an online SESSION EVALUATION to be entered into the draw.

Draw closes April 12, 11:59pm CT

Winners will be announced on the PASS BA Conference website and on Twitter.

Go to passbaconference.com/evals or follow the QR code link displayed on session signage throughout the conference venue.

Your feedback is important and valuable. All feedback will be used to improve and select sessions for future events.

Page 28

April 10-12, Chicago, IL

Thank you!