Big Data in Azure. Matthew Winter, Azure Global Black Belt. 31st August 2016

Upload: dataworks-summithadoop-summit

Post on 16-Apr-2017


TRANSCRIPT

Page 1: Big Data in Azure

Big Data in Azure
Matthew Winter
Azure Global Black Belt
31st August 2016

Page 2: Big Data in Azure

Big Data is Changing Traditional Data Warehousing

… data warehousing has reached the most significant tipping point since its inception. The biggest, possibly most elaborate data management system in IT is changing. – Gartner, “The State of Data Warehousing”*

* Donald Feinberg, Mark Beyer, Merv Adrian, Roxane Edjlali (Gartner), The State of Data Warehousing in 2012 (Stamford, CT.: Gartner, 2012)

Diagram: Data Sources (OLTP, ERP, CRM, LOB) → ETL → Data Warehouse → BI and Analytics

Page 3: Big Data in Azure

Big Data is Driving Transformative Changes

Traditional vs. Big Data

Data characteristics
  Traditional: Relational data with highly modeled schema
  Big Data: All data with schema agility

Costs
  Traditional: Specialized hardware
  Big Data: Commodity hardware

Culture
  Traditional: Operational reporting. Focus on rear-view analysis
  Big Data: Experimentation leading to intelligent action, with machine learning, graph, A/B testing
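To make "schema agility" concrete, here is a minimal Python sketch of applying a schema at read time rather than modeling it up front. The sample records and the `project` helper are hypothetical and are not taken from the talk; they only illustrate the idea that raw records can keep whatever fields they arrived with.

```python
import json

# Hypothetical raw events: each record keeps whatever fields its source emitted.
RAW_EVENTS = [
    '{"user": "a1", "channel": "web", "clicks": 3}',
    '{"user": "b2", "channel": "phone", "duration_sec": 140}',
    '{"user": "c3", "sentiment": "positive"}',
]


def project(lines, fields, default=None):
    """Apply a schema at read time: pick the requested fields from each record."""
    rows = []
    for line in lines:
        record = json.loads(line)
        # Missing fields fall back to `default` instead of failing the load.
        rows.append({name: record.get(name, default) for name in fields})
    return rows


# Two different "schemas" over the same raw data, chosen by the reader.
by_channel = project(RAW_EVENTS, ["user", "channel"])
by_sentiment = project(RAW_EVENTS, ["user", "sentiment"], default="unknown")
```

The stored data never changes; each analysis chooses the fields it needs, which is the contrast with a highly modeled relational schema drawn on this slide.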

Page 4: Big Data in Azure

Big Data Introduces a Culture of Experimentation
Tangerine instantly adapts to customer feedback to offer customers what they want, when they want it

“I can see us…creating predictive, context-aware financial services applications that give information based on the time and where the customer is.”

Billy Lo, Head of Enterprise Architecture

Scenario
• Lack of insight for targeted campaigns
• Inability to support data growth

Solution
Azure HDInsight (Hadoop-as-a-service) with the Analytics Platform System (APS) enables instant analysis of social sentiment and customer feedback across digital, face-to-face and phone.

Result
• Reduced time to customer insight
• Ability to make changes to campaigns or adjust product rollouts based on real-time customer reactions
• Ability to offer incentives and new services to retain and grow its customer base

Page 5: Big Data in Azure

However, there are challenges to Big Data…

Obtaining skills and capabilities

Determining how to get value

Integrating with existing IT investments

*Gartner: Survey Analysis – Hadoop Adoption Drivers and Challenges (Stamford, CT.: Gartner, 2015)

Page 6: Big Data in Azure

But, Microsoft has done it before
We needed to better leverage data and analytics to do more experimentation

So we:
• Designed a data lake for everyone to put their data into
• Built tools approachable by any developer
• Created machine learning tools for collaborating across large experiment models

Result:
• Across Microsoft, ten thousand developers doing experimentation leading to better insights
• Leading to growth in our Microsoft businesses:
  • Office productivity revenue (45% YoY)*
  • Intelligent Cloud (100% YoY)*
  • Bing search share doubles

[Chart: Growth of data @ Microsoft, 2010 to 2015, from petabytes to exabytes. Sources shown: Windows, SMSG, Live, Bing, CRM/Dynamics, Xbox Live, Office365, Malware Protection, Microsoft Stores, Commerce Risk, Skype, LCA, Exchange, Yammer]

* Microsoft. FY16 Q4 Results, URL: http://www.microsoft.com/en-us/Investor/earnings/FY-2016-Q4/press-release-webcast

Page 7: Big Data in Azure

Microsoft is now taking everything we’ve learned on this journey and bringing it to our customers

Technology. Cost. Culture.

Page 8: Big Data in Azure

Big Data as a Cornerstone of Cortana Intelligence

Diagram (flow: Data → Intelligence → Action)

Data Sources: Apps; Sensors and devices; Data

Information Management: Event Hubs; Data Catalog; Data Factory

Big Data Stores: SQL Data Warehouse; Data Lake Store

Machine Learning and Analytics: HDInsight (Hadoop / Spark); Stream Analytics; Data Lake Analytics; Machine Learning

Intelligence: Cortana; Bot Framework; Cognitive Services

Dashboards & Visualizations: Power BI

Action: People; Automated Systems; Apps (Web, Mobile, Bots)

Page 9: Big Data in Azure

Azure HDInsight
Hadoop and Spark as a Service on Azure

• Fully-managed Hadoop and Spark for the cloud
• 100% Open Source Hortonworks data platform
• Clusters up and running in minutes
• Managed, monitored and supported by Microsoft with the industry’s best SLA
• Familiar BI tools for analysis, or open source notebooks for interactive data science
• 63% lower TCO than deploying your own Hadoop on-premises*

* IDC study “The Business Value and TCO Advantage of Apache Hadoop in the Cloud with Microsoft Azure HDInsight”

Page 10: Big Data in Azure

Comprehensive Set of Managed Apache Big Data Projects

• Scale to petabytes on demand
• Process unstructured and semi-structured data
• Develop in Java, .NET, and more
• Skip buying and maintaining hardware
• Deploy in Windows or Linux
• Spin up an Apache Hadoop cluster in minutes
• Visualize your Hadoop data in Excel
• Easily integrate on-premises Hadoop clusters

Core Engine
• Batch: MapReduce
• Script: Pig
• SQL: Hive
• NoSQL: HBase
• Streaming: Storm
• In-Memory: Spark
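As a rough illustration of the batch model behind the MapReduce engine listed above, the following single-process Python sketch walks through the map, shuffle and reduce phases for a word count. It is not HDInsight or Hadoop code; the function names are hypothetical and the real engine distributes these phases across a cluster.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    """Run the three phases in sequence over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle_phase(pairs))


counts = word_count(["big data in Azure", "Big Data stores"])
print(counts)
```

Because each map call sees only one document and each reduce call sees only one key, both phases can run in parallel on separate machines, which is what lets the batch engine scale out.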

Page 11: Big Data in Azure

Azure Data Lake Store
A Hyper-Scale Repository for Big Data Analytics Workloads

• Hadoop File System (HDFS) for the cloud
• No limits to scale
• Store any data in its native format
• Enterprise-grade access control, encryption at rest
• Optimized for analytic workload performance

Page 12: Big Data in Azure

Azure Data Lake Store
• Distributed, parallel file system in the cloud
• Performance-tuned and optimized for analytics
• No fixed size limits
• Stores all data types
• Highly available with local & geo redundant storage
• WebHDFS REST API
• Supported by leading Hadoop distros
• Role-based security
• Low latency and high throughput workloads

Diagram
Data sources: Clickstream, Sensors, Video, Social, Web, Devices, Relational, Applications
Layers: Store; HDFS; YARN; HDInsight; Analytics Service (U-SQL)
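The WebHDFS REST API mentioned on this slide addresses files through URLs of the form `/webhdfs/v1/<path>?op=<OPERATION>`. The sketch below only builds such URLs and sends nothing over the network. The account name `examplestore` is hypothetical, the `azuredatalakestore.net` host suffix is an assumption about the service endpoint, and a real request would also need an authorization header.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(account, path, op, **params):
    """Build a WebHDFS-style request URL for a Data Lake Store account.

    `account` is a hypothetical store name; the host suffix is an assumption.
    """
    host = f"{account}.azuredatalakestore.net"
    # `op` selects the WebHDFS operation, e.g. LISTSTATUS or OPEN.
    query = urlencode({"op": op, **params})
    # quote() keeps "/" separators and escapes characters such as spaces.
    return f"https://{host}/webhdfs/v1/{quote(path.lstrip('/'))}?{query}"


list_url = webhdfs_url("examplestore", "/clickstream/2016/08", "LISTSTATUS")
read_url = webhdfs_url("examplestore", "/logs/a b.csv", "OPEN", offset=0)
print(list_url)
print(read_url)
```

Since the URL shape is the standard WebHDFS one, tools that already speak WebHDFS can in principle target the store by changing the host, which is the compatibility point the slide is making.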

Page 13: Big Data in Azure

Azure Data Lake Analytics
A new distributed analytics service

• Distributed analytics service built on Apache YARN
• Elastic scale per query lets users focus on business goals, not configuring hardware
• Includes U-SQL, a language that unifies the benefits of SQL with the expressive power of C#
• Integrates with Visual Studio to develop, debug, and tune code faster
• Federated query across Azure data sources
• Enterprise-grade role based access control

Page 14: Big Data in Azure

Typical Azure Big Data Architecture

Diagram components:
• Data sources: Apps; Sensors and devices
• Azure API Management
• Backend Services
• Event Hubs
• Stream Analytics
• Machine Learning
• HDInsight (Apache Spark)
• Storage
• SQL Data Warehouse
• Power BI
• Azure Data Factory & Azure Data Catalog

Page 15: Big Data in Azure

Highest availability guarantee in the industry for peace of mind

• Managed, monitored and supported by Microsoft

• Enterprise-leading SLA—99.9% uptime

• No IT resources needed for upgrades and patching

• Microsoft monitors your deployment so you don’t have to

*Applies to HDInsight only

99.9% SLA
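For a sense of what a 99.9% uptime figure permits, the arithmetic can be sketched as below. The 30-day month and 365-day year are assumptions chosen for illustration; the actual SLA terms define their own measurement period.

```python
def allowed_downtime_minutes(sla_percent, period_days=30):
    """Minutes of downtime permitted per period at a given uptime percentage."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - sla_percent) / 100


per_month = allowed_downtime_minutes(99.9)                  # 30-day month
per_year = allowed_downtime_minutes(99.9, period_days=365)  # 365-day year
print(round(per_month, 1), "minutes per month")
print(round(per_year / 60, 2), "hours per year")
```

At 99.9%, that works out to roughly 43.2 minutes in a 30-day month, or about 8.76 hours over a 365-day year.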

Page 16: Big Data in Azure

Runs in the Most Datacenters Worldwide

Azure doubling compute and storage every 6 months

* Applies to HDInsight only

Regions shown on the map:
• Central US: Iowa
• West US: California
• East US: Virginia
• North Central US: Illinois
• South Central US: Texas
• Brazil South: Sao Paulo State
• West Europe: Netherlands
• China North*: Beijing
• China South*: Shanghai
• Japan East: Tokyo, Saitama
• Japan West: Osaka
• East Asia: Hong Kong
• SE Asia: Singapore
• Australia South East: Victoria
• Australia East: New South Wales
• India Central: Pune
• North Europe: Ireland
• East US 2: Virginia
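The "doubling every 6 months" claim on this slide compounds quickly. As a back-of-the-envelope sketch (the helper name is hypothetical, and a constant doubling period is an assumption made purely for illustration):

```python
def growth_factor(months, doubling_period_months=6):
    """Capacity multiple after `months`, assuming a constant doubling period."""
    return 2 ** (months / doubling_period_months)


print(growth_factor(12))  # multiple after one year
print(growth_factor(36))  # multiple after three years
```

Under that assumption, capacity quadruples in a year and grows 64-fold in three years.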

Page 17: Big Data in Azure

Lower Total Cost of Ownership

• No hardware
• Hadoop support included with Azure support
• Pay only for what you use
• Independently scale storage and compute
• No need to hire specialized operations team
• 63% lower total cost of ownership than on-premises*

* IDC study “The Business Value and TCO Advantage of Apache Hadoop in the Cloud with Microsoft Azure HDInsight”

Page 18: Big Data in Azure

Recognized by Top Analysts

Forrester Wave for Big Data Hadoop Cloud
• Named industry leader by Forrester with the most comprehensive, scalable, and integrated platforms*
• Recognized for its cloud-first strategy that is paying off*

* The Forrester Wave™: Big Data Hadoop Cloud Solutions, Q2 2016.

Page 19: Big Data in Azure

Microsoft Data Science Summit
Get hands-on with the latest cutting-edge technologies in Big Data, Machine Learning and Open Source at the Microsoft Data Science Summit.

Hear from thought leaders, data scientists, engineers and customers solving real-world problems, and make expert connections to help you put these technologies to work for your business.

September 26-27, 2016
Atlanta, GA

Register Now! aka.ms/microsoftdatasciencesummit

Target audience
• Data Scientists
• Big Data Engineers
• Machine Learning Practitioners/Engineers
• Data Science/Engineering Managers

Why attend
• Readiness with architectural guidance & hands-on training to operationalize solutions at scale
• Real-world examples of how to apply machine learning & data science techniques to your business
• Networking with the experts and the community to bring your data to life

Page 20: Big Data in Azure

© 2016 Microsoft Corporation. All rights reserved.