
Competing in the Modern Data Warehouse workload

TechReady 18
© 2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
1/27/2014

Darwin Schweitzer
Senior Program Manager
Microsoft
DPCT303
MICROSOFT CONFIDENTIAL - INTERNAL ONLY

Session Objectives and Takeaways

Session Objective(s):
Session Objective 1: Educate you on key competitors in the rapidly emerging Modern Data Warehouse (MDW) workload
Session Objective 2: Get feedback from you on who and what you are competing with

Key Takeaway 1: Know the competitors
Key Takeaway 2: Share your experience in the space

Agenda
MDW Competitive Landscape
Hear from you: Voice of the Field
MDW Positioning
Demo of Data Refinery concept on Azure
Related Sessions
MDW Competitor (Cloud Players) Resources

MDW Competitive Landscape

Business Data Lake
Enterprise Data Hub
Data Refinery
Enterprise Data Warehouse

http://www.capgemini.com/big-data-analytics/business-data-lake
http://hortonworks.com/wp-content/uploads/2012/06/Apache-Hadoop-Big-Data-Refinery-WP.pdf
http://www.gopivotal.com/businessdatalake

Persistent Cluster Pattern
Transient Cluster Pattern

Terms shown on the slide: Analytics, MapReduce, Query, Insight, Hive, Pig, Hadoop, SQL, Business Intelligence, Predictive, Operational, Interactive, Visualization, Exploratory, Data Warehouse, Cloud Scale, Real-Time, Batch, Machine Learning, Self-Service, Dremel, BigQuery, HadoopDB, Unstructured, Reporting, Ad-Hoc, Pivot, Drill, Social, Data Mining, Text Analytics, Data Science

The Digital Shoebox
Driving Smarter Decisions with Microsoft Big Data, Tim Mallalieu at PASS BA Conference, April 10-12, 2013, Chicago, IL

Data Acquisition: Streaming / Trickle / Bulk Transfer
Data Refinement: Aggregation / Compression / Transformation / Extraction
Data Consumption: Analysis / Modeling / Query / Reporting / Visualization

Modern DW Capabilities
Capability areas: Big Data Storage, Big Data Movement, Big Data Processing, Big Data Query
Technologies shown: S3, Cloud Storage, Azure Storage, Real-time Streaming, Data Pipeline, Hive, HBase, Impala, Dremel, Redshift, Cloud SQL

Competitor view of MDW Analytics
IBM Smarter Analytics: Information Architecture for a New Era of Computing
http://www.redbooks.ibm.com/abstracts/redp5012.html?Open

1. Landing area zone
2. Shared analytics information zone
3. Deep data zone
4. Integrated warehouse and marts zone
5. Exploration zone
6. Shared operational information zone
7. Information delivery zone

Evolution to Modern Data Warehouse
Diagram labels: Pure Play Relational EDW; Pure Play Hadoop; Integrated Query; MapReduce; EMR; Redshift; HANA

Market preference towards Evolving MDW
Evolving MDW: Integrated and Interactive SQL Query; Data Processing with MapReduce; Analytics Ecosystem

Distro Driven: MapR Drill; Cloudera Impala; Hortonworks Stinger
EDW Vendor Driven: Teradata Aster SQL-H; Oracle BDC; IBM Big SQL
Other Players: Pivotal HAWQ; Hadapt; Shark
Cloud Driven: AWS (Redshift, Data Pipeline, EMR); SAP HANA One; Google (BigQuery, Hadoop on GCE)

Vendors in the Big Data/MDW Space
Hadoop Distros
Enterprise with Distros
Enterprise partnering with Distros
Cloud Players with Distros
Processors $$ with Distro

Let's Compare the Competitors

Competitor Cluster | Big Data Storage | Big Data Processing | Big Data Query
Hadoop Distros | HDFS | MapReduce, Hive, Pig | Cloudera Impala; MapR Drill (Dremel); HDP Stinger
Enterprise with Distros | HDFS; GPFS (IBM) | MapReduce, Hive, Pig | Big SQL (IBM); HAWQ (Pivotal)
Enterprise partnering with Distros | HDFS | MapReduce, Hive, Pig | Depends on the distro (see Hadoop Distros above)
Cloud Players with Distros | Cloud Storage; HDFS | MapReduce, Hive, Pig; AWS Data Pipeline | Amazon Redshift, RDS, DynamoDB (NoSQL), Impala; Google BigQuery, Cloud SQL, Cloud Datastore (NoSQL)

* MapR on AWS and Google and HDP on Azure (HDInsight)
** BigInsights on IBM SmartCloud and Pivotal multi-cloud PaaS

12 Things to Know
1. Hadoop Partners
2. DW Partners
3. Data Integration Partners
4. BI Partners
5. Predictive Analytics Partners
6. Customer Segments
7. Hardware Partners
8. Virtualization Partners
9. Hosting Partners
10. System Integrator Partners
11. Training
12. Search

What are you seeing in the field?

Microsoft MDW Positioning

Microsoft's Data Warehouse Offerings
On-premises | Cloud
Breadth | Premium
SQL EE & FastTrack
HDP

PDW & HDI region
HDInsight Service
IaaS: DW Optimized VM Image
Azure Modern Data Warehouse
Availability labels on the slide: PDW in market, HDI integration shipping very soon; In market; Under development; In market
Torsten Grabs: DP314 - SQL Server Azure VM Images Optimized for Data Warehousing, Tuesday January 28, 12:45 - 14:00

The traditional data warehouse
Data sources: OLTP, ERP, CRM, LOB
ETL
Data warehouse
BI and analytics: Dashboards, Reporting

1. Increasing data volumes
2. Real-time data
3. New data sources & types (non-relational data: Devices, Web, Sensors, Social)
4. Cloud-born data

The modern data warehouse
BI & ANALYTICS: Self-service, Collaboration, Corporate, Predictive, Mobile
DATA ENRICHMENT AND FEDERATED QUERY: Extract, transform, load; Single query model; Data quality; Master data management
DATA MANAGEMENT & PROCESSING: Non-relational, Relational, Analytical, Streaming, Internal & External
INFRASTRUCTURE
Data sources: OLTP, ERP, CRM, LOB; non-relational data (Devices, Web, Sensors, Social)

Exploring non-relational data
Big Data with simplicity
Manage non-relational data
100% Apache-based
Management simplicity of Windows
Bringing Hadoop to software, appliance, cloud
Diagram labels: Non-relational; LOB; Windows Azure; Parallel Data Warehouse; Hortonworks Data Platform
Hadoop cluster in HDP for Windows and HDInsight

Integrate relational + non-relational
Integrated query with PolyBase in SQL PDW
Query relational and Hadoop in parallel
Single query
No need to ETL Hadoop data into DW
Query Hadoop with existing T-SQL skills
Diagram labels: Query relational + non-relational; SQL; Result set; Relational data; PolyBase; Non-relational data

What to compete with

Primarily Deployed | Competitor Cluster | Products | Plays

On Premises * | Hadoop Distros

  Products: HDP on Windows; PDW AU1; SQL Server RDBMS; SQL BI
  Plays: Tendency: customer pro commodity HW; Partner with Hortonworks and surround Hadoop; Surround Hadoop with DW & BI

On Premises ** | Enterprise with Distros
  Products: PDW AU1; HDP on Windows; SQL Server RDBMS
  Plays: Tendency: appliance or commodity HW; Lead with PDW AU1 & HDI region; Surround Hadoop with DW & BI

On Premises | Enterprise partnering with Distros
  Products: PDW AU1; HDP on Windows; SQL Server RDBMS
  Plays: Tendency: customer pro appliance ***; Lead with PDW AU1 & HDI region; Surround Hadoop with DW & BI

Cloud | Cloud Players with Distros
  Products: HDInsight on Azure; SQL Server for DW on Azure
  Plays: Tendency: cloud first; Lead with HDInsight on Azure and SQL Server for DW on Azure

* MapR on AWS and Google and HDP on Azure (HDInsight)
** BigInsights on IBM SmartCloud and Pivotal multi-cloud PaaS
*** These competitors are just as likely to surround Hadoop

Know your customer's Pattern

Competitor Cluster / Competitor Detail | Pattern | Play

Hadoop Distros

  Cloudera | Enterprise Data Hub | Partner with Hortonworks with DW/BI
  MapR | Data Lake or Refinery | Partner with Hortonworks with DW/BI

Enterprise with Distros
  IBM | Data Lake or Refinery | Lead with PDW AU1 & HDI
  Pivotal | Data Lake or Refinery | Partner with Hortonworks with DW/BI

Enterprise partnering with Distros
  HP HAVEn | Data Refinery or Lake | Lead with PDW AU1 & HDI with BI
  Oracle | Data Refinery or Lake | Lead with PDW AU1 & HDI with BI
  SAP | Data Refinery or Lake | Lead with PDW AU1 & HDI with BI
  Teradata | Data Refinery or Lake | Lead with PDW AU1 & HDI with BI

Cloud Players with Distros
  Amazon | Data Refinery or Lake | Lead with HDInsight & SQL DW on Azure with Power BI
  Google | Data Refinery or Lake | Lead with HDInsight & SQL DW on Azure with Power BI

Demo: Data Refinery on Windows Azure
Darwin Schweitzer

Create a new HDInsight cluster (Thanks Murshed Zaman & Cindy Gross)

    # Create a new HDInsight cluster
    $subscriptionName = ""    # left blank on the slide
    $clusterName = "tr18hdi"
    $clusterNodes = "1"
    $location = "West US"
    $storageAccountName = "tr18hdi"
    $storageAccountKey = ""    # left blank on the slide
    $containerName = "data"

    # Prompt for the cluster admin credentials, then provision the cluster
    $hdCred = Get-Credential -Message "Provide Username & PW"
    New-AzureHDInsightCluster -Subscription $subscriptionName -Name $clusterName -Location $location `
        -DefaultStorageAccountName "${storageAccountName}.blob.core.windows.net" `
        -DefaultStorageAccountKey $storageAccountKey -DefaultStorageContainerName $containerName `
        -ClusterSizeInNodes $clusterNodes -Credential $hdCred

Data Processing with Hive (Thanks Murshed Zaman & Cindy Gross)

    # Provide Windows Azure subscription name, Azure Storage account, container, and HDI cluster
    $subscriptionName = ""    # left blank on the slide
    $storageAccountName = "tr18hdi"
    $containerName = "data"
    $clusterName = "tr18hdi"
    # Get-AzureSubscription $subscriptionName
    Get-AzureHDInsightCluster -Subscription $subscriptionName -Name $clusterName
    Use-AzureHDInsightCluster -Subscription $subscriptionName -Name $clusterName

    # Time the Hive statements that recreate the external table over the raw tweets
    $startTime = Get-Date
    Invoke-Hive "DROP TABLE IF EXISTS twitter_raw_ext;
    CREATE EXTERNAL TABLE IF NOT EXISTS twitter_raw_ext (
        json_response STRING
    )
    COMMENT 'This is a twitter sample'
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    STORED AS TEXTFILE
    LOCATION 'wasb:///twitter/input';"
    $endTime = Get-Date
    $elapsedTime = ($endTime - $startTime)
    Write-Host "Elapsed time: $elapsedTime"

Copy output files (Thanks Alexei Khalyako)

    $localContent = "C:\HadoopOutput"
    $storageAccountName = "tr18hdi"
    $storageAccountKey = ""    # left blank on the slide
    $container = "useroutput"

    # Create the local target folder (no error if it already exists) and load the Azure module
    New-Item -Path c:\HadoopOutput -ItemType Directory -Force | Out-Null
    Import-Module Azure

    # Storage context used for all blob operations below
    $blob_account = New-AzureStorageContext -StorageAccountName $storageAccountName `
        -StorageAccountKey $storageAccountKey -Protocol https

    # Download every blob in the container, recreating its folder structure locally
    Get-AzureStorageBlob -Container $container -Context $blob_account | ForEach-Object {
        $local_path = "$localContent\{0}\{1}" -f $container, $_.Name
        $local_dir = Split-Path $local_path
        if (!(Test-Path $local_dir)) {
            New-Item -Path $local_dir -ItemType Directory -Force
        }
        Get-AzureStorageBlobContent -Context $blob_account -Container $container `
            -Blob $_.Name -Destination $local_path -Force | Out-Null
    }

HDInsight Tasks for SSIS
SQL Server Integration Services Roadmap, Matt Masson and Wee Hyong Tok at PASS Summit 2013, October 15-18, 2013, Charlotte, NC
Look for the Incubator Project at http://hadoopsdk.codeplex.com/

Cluster Management: Validate, Create, Delete
Azure BLOB Storage: Upload File, Delete File
HDInsight Jobs: Hive, Pig, MapReduce, Streaming

Upload data to BLOB storage and provision the HDInsight cluster
Shape and query using Hive, Pig or MapReduce jobs
Export to SQL using Sqoop
Import for BI analysis, and remove the cluster

Some Use Case Comparative Numbers

Big Data Storage
Task | AWS | Azure | Google
Upload 2.5 GB .gz | 7 min | 15 min | 47 min

Big Data Processing
Task | AWS | Azure | Google
Provision 6-node Hadoop cluster | 5 min | 13 min | ? min
Parse and create twitter_temp | 93 min | 101 min | ? min
Terminate 6-node Hadoop cluster | 1-4 min | < 1 min | ? min

Big Data Query
Task | AWS | Azure | Google
Load BigQuery | NA | NA | 33 min * (rows loaded: 6,003,040)
Query BigQuery | NA | NA | 1.1 sec **

    SELECT user_location,
           COUNT(id) AS numTweets,
           SUM(retweet_count) AS sumRetweets,
           SUM(retweet_count)/COUNT(id) AS averageRetweetPerTweet
    FROM twitter.Olympics
    GROUP BY user_location
    ORDER BY 2 DESC

* Browser load in 10 steps
** 9 sec when not cached
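To make the aggregation in the BigQuery query above concrete, here is a minimal Python sketch of the same grouping logic. The function name `retweet_stats` and the three sample rows are invented for illustration and are not taken from the twitter.Olympics table; the sketch also assumes every row has a non-null `id`, so that COUNT(id) is simply the row count per location.

```python
from collections import defaultdict


def retweet_stats(rows):
    """Mimic the query: group by user_location, then count, sum, and average."""
    counts = defaultdict(int)    # COUNT(id) per location (assumes id is never null)
    retweets = defaultdict(int)  # SUM(retweet_count) per location
    for row in rows:
        location = row["user_location"]
        counts[location] += 1
        retweets[location] += row["retweet_count"]

    result = [
        {
            "user_location": location,
            "numTweets": counts[location],
            "sumRetweets": retweets[location],
            "averageRetweetPerTweet": retweets[location] / counts[location],
        }
        for location in counts
    ]
    # ORDER BY 2 DESC: sort on the second output column, numTweets
    result.sort(key=lambda r: r["numTweets"], reverse=True)
    return result


# Hypothetical sample rows, for illustration only
sample_rows = [
    {"id": 1, "user_location": "London", "retweet_count": 4},
    {"id": 2, "user_location": "London", "retweet_count": 0},
    {"id": 3, "user_location": "Sochi", "retweet_count": 3},
]

for line in retweet_stats(sample_rows):
    print(line)
```

The sketch holds everything in memory, which is only reasonable for a handful of rows; the point of the slide is that BigQuery ran the equivalent aggregation over roughly six million rows.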

In Review: Session Objectives and Takeaways

Session Objective(s):
Session Objective 1: Educate you on key competitors in the rapidly emerging MDW workload
Session Objective 2: Get feedback from you on who and what you are competing with

Key Takeaway 1: Know the competitors
Key Takeaway 2: Share your experience in the space

Related Content
Breakout Sessions/Chalk Talks
DP202 - Details of SQL Server 2012 Parallel Data Warehouse Appliance Update 1, Monday January 27, 15:00 - 16:15
DP331 - The Role of PolyBase in the Modern Data Warehouse, Tuesday January 28, 10:15 - 11:30
DP314 - SQL Server Azure VM Images Optimized for Data Warehousing, Tuesday January 28, 12:45 - 14:00
WSCT205 - WhiteSpace: COE - Cloud, Big Data, and PDW - Why is it so Hard to Move?, Wednesday January 29, 13:00 - 14:15
BINCT202 - Meet the Data Insights CoE - What you always wanted to know about our Data Insights, Analytics and Big Data offerings, Thursday January 30, 14:30 - 15:45

MDW Competitor Resources

Competitor News
AWS
  10/29 EMR goes to Hadoop 2.0 and supports MapR M7
  12/12 Support for Impala
Google
  12/2/2013 Google Compute Engine is Generally Available
  1/14/2014 Google Cloud Storage Connector for Hadoop
Cloudera
  10/29 The Enterprise Data Hub as part of Cloudera 5
  11/14 Cloudera and Udacity partner to address Big Data skills
Pivotal
  11/12 Pivotal One: Next-Gen Multi-Cloud Enterprise PaaS
  12/4 Strategic partnership with Capgemini formed to support co-innovation
IBM
  9/10 Announce IBM PureData System for Hadoop (BigInsights, BigSheets, Big SQL)
Oracle
  11/12 Announce new Big Data Appliance X4-2
Teradata
  10/21 Teradata Cloud: Data Warehouse as a Service, Discovery as a Service and Data Management as a Service

Amazon Web Services

Google Cloud Platform
Video: Get up and running with Hadoop and Compute Engine
Loading Data Into BigQuery

AWS Technical Whitepapers and Links
Getting Started with AWS: Analyzing Big Data
http://awsdocs.s3.amazonaws.com/gettingstarted/latest/awsgsg-emr.pdf
AWS Data Pipeline Documentation
http://aws.amazon.com/documentation/datapipeline/
Amazon Elastic MapReduce Documentation
http://aws.amazon.com/documentation/elasticmapreduce/
Amazon Kinesis Developer Resources
http://aws.amazon.com/kinesis/developer-resources/
Amazon Redshift Documentation
http://aws.amazon.com/documentation/redshift/

Google Technical Whitepapers and Links
An Inside Look at Google BigQuery
https://cloud.google.com/files/BigQueryTechnicalWP.pdf
Apache Hadoop, Hive, and Pig on Google Compute Engine
https://cloud.google.com/developers/articles/apache-hadoop-hive-and-pig-on-google-compute-engine
Managing Hadoop Clusters on Google Compute Engine
https://cloud.google.com/developers/articles/managing-hadoop-clusters-on-google-compute-engine
Video: Get up and running with Hadoop and Compute Engine in 10 minutes
https://www.youtube.com/embed/se9vV8eIZME?autoplay=1

Performance Analysis of Cloud Relational Database Services
http://courses.cs.washington.edu/courses/cse544/13sp/final-projects/p18-lijl.pdf
Pricing: Google BigQuery charges per unit of data processed; Amazon Redshift charges per hour per node.
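The two pricing models scale with different quantities: one with the volume of data a query processes, the other with cluster size and uptime. The Python sketch below illustrates that difference only. The function names are invented for this example, and every rate in it is a hypothetical placeholder, not an actual Google or Amazon price.

```python
def per_data_cost(tb_processed, price_per_tb):
    """Cost under a pay-per-data-processed model (BigQuery-style)."""
    return tb_processed * price_per_tb


def per_node_hour_cost(nodes, hours, price_per_node_hour):
    """Cost under a pay-per-node-hour model (Redshift-style)."""
    return nodes * hours * price_per_node_hour


def break_even_tb(nodes, hours, price_per_node_hour, price_per_tb):
    """Data volume at which both models cost the same for the given cluster."""
    return per_node_hour_cost(nodes, hours, price_per_node_hour) / price_per_tb


# Hypothetical placeholder rates and sizes, chosen only to make the arithmetic easy
PRICE_PER_TB = 10.0
PRICE_PER_NODE_HOUR = 1.0

cluster_cost = per_node_hour_cost(nodes=6, hours=24, price_per_node_hour=PRICE_PER_NODE_HOUR)
scan_cost = per_data_cost(tb_processed=5, price_per_tb=PRICE_PER_TB)
print("cluster for a day:", cluster_cost)
print("scanning 5 TB:", scan_cost)
print("break-even TB:", break_even_tb(6, 24, PRICE_PER_NODE_HOUR, PRICE_PER_TB))
```

Under a per-node-hour model an idle cluster still accrues cost, while under a per-data model cost follows query volume, so which is cheaper depends on how heavily and how continuously the warehouse is queried.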

Comparative Technical Whitepaper

Google BigQuery
Advantages:
  Very easy to set up and run queries
  Does not require any manual configuration of clusters
  Automatically scales up according to the dataset size
Disadvantages:
  User cannot tune the system according to needs
  Limited SQL language support
  Does not scale up well on complex queries involving multiple joins and nested subqueries

Amazon Redshift
Advantages:
  Almost fully SQL compatible
  All TPC-H queries run without any modification
Disadvantages:
  Requires setting up and managing the cluster on which queries are run