
Upload: pivotal

Post on 13-Jul-2015


© Copyright 2014 Pivotal. All rights reserved.

Operationalizing Data Analytics

How big data analytics frameworks are evolving in the age of Hadoop

Jan 2015


Abstract

Hadoop is widely regarded as a key component of a Big Data infrastructure, but many companies have yet to reap expected benefits from the platform.

In this webinar, Brian Hopkins of Forrester and Greg Chase of Pivotal examine the following:

• Business cases driving Hadoop deployments
• Challenges in translating deployment into business benefits
• Other tools in a big data platform needed to realize insights
• Future evolution of big data infrastructure with in-memory processing
• How to achieve business value with data stored in Hadoop


Business Cases

What is driving adoption of Hadoop?


How is Hadoop being used in the enterprise?

Hadoop has been considered a key capability for implementing big data initiatives in the enterprise. What are the different ways Hadoop is being used?

• Hadoop adoption has been driven by large organizations with lots of data and complex needs for insight
• Early Hadoop adoption in large organizations was driven by the need to reduce the cost of data persistence for analytics
• Early use cases: fraud, IT security, ad tech, pricing optimization, offloading transactional analytics
• Emerging use cases: internet of things, customer journey path analytics, dynamic digital experiences

Percentage of firms that have implemented Hadoop

Percentage of firms with > 20,000 employees that have implemented Hadoop

Source: Forrester’s Business Technology Technographics Global Data And Analytics Survey, Q2 2014
Base: 1,658 business and technology management professionals with knowledge of data and analytics


Hadoop Use Cases for Pivotal Customers

Retail
• CRM – Customer Scoring
• Store Siting and Layout
• Fraud Detection / Prevention
• Supply Chain Optimization

Advertising & Public Relations
• Demand Signaling
• Ad Targeting
• Sentiment Analysis
• Customer Acquisition

Financial Services
• Algorithmic Trading
• Risk Analysis
• Fraud Detection
• Portfolio Analysis

Media & Telecommunications
• Network Optimization
• Customer Scoring
• Churn Prevention
• Fraud Prevention

Manufacturing
• Product Research
• Engineering Analytics
• Process & Quality Analysis
• Distribution Optimization

Energy
• Smart Grid
• Exploration

Government
• Market Governance
• Counter-Terrorism
• Econometrics
• Health Informatics

Healthcare & Life Sciences
• Pharmaco-Genomics
• Bio-Informatics
• Pharmaceutical Research
• Clinical Outcomes Research


Challenges

What are typical barriers to achieving business results with Hadoop?


How mature are data analytics on Hadoop, really?

Enterprises have invested in Hadoop, but their analytics capabilities are still at the initial stages of maturity.

• Enterprises are asking, “What do I need that I don’t have already?”
  – Especially the larger ones that have already been investing for years in analytics technology
• Firms are struggling with business cases
• Insight is elusive
  – Valuable insight is even more so
• Experiment and fail fast

Large enterprises say they have enough big data and are not expanding

Source: Forrester’s Business Technology Technographics Global Data And Analytics Survey, Q2 2014
Base: 1,658 business and technology management professionals with knowledge of data and analytics

Large enterprises say their business cases have a proven ROI


Common Hadoop Analytics Challenges

• Handling volatile streaming data
• Querying large datasets
• Applying advanced analytics
• Data consistency at scale (diagram labels: In-Memory, Web App)


Realize Insights

How to apply advanced analytics to data stored in Hadoop?


What are the key data access methods?

What methods do you see enterprises taking to access and work with big data?

• The basic access paradigm is still the key-value store (KVS)
• MapReduce is too limiting for many cases; graph, stream, and SQL are emerging
• SQL on Hadoop is hot, and for good reason
  – Recognize the different flavors of this
  – It’s not an apples-to-apples comparison
• Recognize the emerging need for speed: streaming and in-memory
  – Accelerate MapReduce, graph, and search
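To make the contrast between these access methods concrete, here is a minimal Python sketch that answers questions over the same toy dataset three ways: a key-value lookup, a MapReduce-style aggregation, and a SQL query. The data, the function names, and the use of an in-process SQLite database are all illustrative assumptions; none of them come from the webinar or from any Hadoop API.

```python
import sqlite3
from collections import defaultdict

# Hypothetical toy records: (event_id, customer, amount).
EVENTS = [
    ("e1", "alice", 30.0),
    ("e2", "bob", 12.5),
    ("e3", "alice", 7.5),
]

# 1. Key-value access: fetch exactly one record by its key.
kv_store = {event_id: (customer, amount) for event_id, customer, amount in EVENTS}


def kv_get(key):
    """Return the (customer, amount) pair stored under key."""
    return kv_store[key]


# 2. MapReduce-style aggregation: emit (key, value) pairs, group them
#    by key, then reduce each group to a single value.
def map_reduce_totals(events):
    grouped = defaultdict(list)
    for _, customer, amount in events:  # map and shuffle
        grouped[customer].append(amount)
    # reduce: collapse each customer's amounts into one total
    return {customer: sum(amounts) for customer, amounts in grouped.items()}


# 3. SQL: the same aggregate, stated declaratively.
def sql_totals(events):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (event_id TEXT, customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", events)
    rows = conn.execute(
        "SELECT customer, SUM(amount) FROM events GROUP BY customer"
    ).fetchall()
    conn.close()
    return dict(rows)
```

The key-value lookup can only answer "give me record X", the MapReduce version needs hand-written grouping logic, and the SQL version states the result it wants and leaves the execution plan to the engine. That difference is one plausible reading of why SQL on Hadoop attracts so much interest.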


A Business Data Lake Adds Analytic Insights

Architecture diagram (labels as shown on the slide):

Centralized Management
• System monitoring
• System management

Unified Data Management Tier
• Data mgmt. services
• MDM
• RDM
• Audit and policy mgmt.

Processing and storage
• Processing Tier
• Workflow Management
• Distillation Tier
• HDFS storage (unstructured and structured data)
• In-memory
• MPP database

Unified Sources
• Real-time ingestion
• Micro batch ingestion
• Batch ingestion

Flexible Actions
• Real-time insights
• Interactive insights
• Batch insights
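The three ingestion modes in the diagram differ mainly in how many records are handed downstream at once. As a rough illustration, the Python sketch below groups an incoming record stream into micro batches. The function name and the batch size are hypothetical choices for this example and are not part of any Pivotal product.

```python
from itertools import islice


def micro_batches(records, batch_size):
    """Yield lists of up to batch_size records from any iterable.

    A batch_size of 1 approximates real-time (per-record) ingestion;
    a very large batch_size approximates classic batch ingestion.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # source exhausted
        yield batch


if __name__ == "__main__":
    # Hypothetical sensor readings arriving as a stream.
    readings = range(5)
    for batch in micro_batches(readings, batch_size=2):
        print(batch)
```

Micro batching trades a little latency for fewer, larger writes, which is why it sits between the real-time and batch paths.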


In-Memory Computing

How will in-memory computing evolve big data platforms?


What is happening with in-memory?

There is a lot of momentum around in-memory processing with Spark and other technologies. What is really going on?

• It’s important to understand the flavors of in-memory
  – Pure DBs, DB acceleration, caches/grids, Hadoop/Spark
• Much of it is not all that new
  – Your DBs today are likely already accelerated with better in-memory caching
• What is new or interesting?
  – Spark (Spark SQL, GraphX, Spark Streaming, MLlib)
  – Spark will steal HDFS workloads
  – Streaming is a form of in-memory and comes in many flavors too
  – Caching
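Of the in-memory flavors listed, caching is the simplest to show in a few lines. The Python sketch below is a read-through cache with least-recently-used eviction placed in front of a slow lookup. The class name, the capacity, and the loader function are assumptions made for illustration; this is not how any particular database or data grid implements its cache.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Keep recently used values in memory in front of a slow loader."""

    def __init__(self, load, capacity):
        self._load = load              # slow lookup, e.g. a disk or network read
        self._capacity = capacity      # maximum number of entries held in memory
        self._entries = OrderedDict()  # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        value = self._load(key)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict the least recently used
        return value


if __name__ == "__main__":
    def slow_lookup(key):
        # Hypothetical stand-in for a query against disk-based storage.
        return key * 2

    cache = ReadThroughCache(slow_lookup, capacity=2)
    for key in (1, 1, 2, 3, 1):
        cache.get(key)
    print(cache.hits, cache.misses)
```

Repeated reads of a hot key are served from memory, while a key that has fallen out of the working set goes back to the slow source. This is the sense in which existing databases are already "accelerated" by in-memory caching.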


How In-Memory Evolves the Data Lake

• Current: In-memory distributed databases: SQL & NoSQL
• Future: Tachyon in-memory file system
  – Extending analytical data warehouses to in-memory OLAP
  – Convergence of in-memory OLAP and OLTP
  – Robust handling of Spark RDDs

Diagram labels, top to bottom:
• Engines: Tez, HAWQ, Spark, GemFire, GemFire XD
• Tachyon (in-memory polyglot file system)
• Storage: HDFS, NFS, S3, GlusterFS, Ceph


Business Value

How to achieve business value with analytics based on Hadoop


How do enterprises cross the chasm?

How can enterprises get over the chasm in rapidly implementing analytics (bottom up) and reaping business benefits (top down)?

• Look for shared architecture requirements as you run LOB-specific pilots
  – Focus investment on metrics a LOB exec cares about
  – Demonstrate shared architecture benefits
• Look for ways to automate and scale insights execution
  – Deliver insight at the point of decision
• Stop using security as a blanket cloud objection
  – You likely already have sensitive customer data in the cloud
  – Leverage cloud when data is “soupy”


The Journey to Data-Driven Innovation

• STORE: Business Data Lake (Store everything)
• ANALYZE: Big Data Analytics (Generate insights)
• BUILD: Data-Driven Applications (Operationalize)
• INNOVATE: Agile Enterprise (Iterate rapidly)

Supporting offerings shown on the slide: PDL Data Science, Pivotal Labs Agile, Pivotal CF Services, PDL Data Architecture, Agile Development, Big Data SQL-Based Analytics, Enterprise PaaS


World’s Leading Experts: Pivotal Labs – Pivotal Data Labs

• BATCH: Pivotal HD
• INTERACTIVE: HAWQ, Greenplum DB
• REAL-TIME: GemFire XD, GemFire

The Foundation for Data-Driven Enterprise


Find out more…

• Pivotal Big Data Suite
• Download Pivotal HD
• Enterprise SQL on Hadoop
• Big Data @ Pivotal blog
  – The Future Architecture of the Data Lake


Thank You

A NEW PLATFORM FOR A NEW ERA