Modernising Your Analytics Insights


Post on 20-May-2020


Page 1: Modernising Your Analytics Insights - CDP · Advanced Analytics • Machine Learning • Predictive DELIVERY FOCUS. SESSION FOCUS Batch Processing Real-time ... STREAMING USE CASES

Modernising Your Analytics Insights


Creating a culture of agile data innovation:

• speeding up data ingestion, preparation & validation

• driving self-service insights into all the metrics that matter

• streaming & processing data in real-time as events occur

• empowering next-generation predictive & machine learning models

Together, we’ll take a look at modern data engineering frameworks, and how our clients are effectively using the capabilities to transform and optimise their decision-making.

Modernising Your Analytics Insights

With Stream Processing & Learning Models


CHALLENGES FOR BUSINESSES

THE PROBLEM: • Struggle to utilise the value of data

• Businesses never stand still

• Limits to traditional Data Warehousing

THE OPPORTUNITY: • Use data more effectively

• Leverage data for maximum advantage


ARCHITECTURE LANDSCAPE


Why we should embrace the new architecture


TRADITIONAL vs NEW ARCHITECTURE

New Architecture

• Augmentation of existing environment

• Streamed data or Historical archiving

• Fast capture of a variety of information sources

• Model data on use

• Supports a wide variety of use cases not traditionally seen in DWH

• Driving the business forwards

Traditional

• Optimised for batch data processing

• Large development effort

• Model data on load

• Mostly structured data

• Monitoring and measuring the business
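The contrast between "model data on load" and "model data on use" is often described as schema-on-write versus schema-on-read. The following is a minimal Python sketch of that difference, not a description of any specific product. The JSON records and the field names (`customer`, `amount`, `channel`, `note`) are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical raw events as they might land in a data lake (illustrative only).
RAW_EVENTS = [
    '{"customer": "A", "amount": "12.50", "channel": "web"}',
    '{"customer": "B", "amount": "7.00"}',
    '{"customer": "C", "note": "no amount captured"}',
]


def load_with_schema(lines):
    """Model on load: enforce a fixed structure up front and reject what does not fit."""
    accepted, rejected = [], []
    for line in lines:
        record = json.loads(line)
        if "customer" in record and "amount" in record:
            accepted.append(
                {"customer": record["customer"], "amount": float(record["amount"])}
            )
        else:
            rejected.append(record)
    return accepted, rejected


def store_raw(lines):
    """Model on use: keep every record as it arrived and defer interpretation."""
    return [json.loads(line) for line in lines]


def total_amount(raw_records):
    """Apply structure only at query time, for the question being asked."""
    return sum(float(r["amount"]) for r in raw_records if "amount" in r)


accepted, rejected = load_with_schema(RAW_EVENTS)
raw = store_raw(RAW_EVENTS)
print(len(accepted), len(rejected), len(raw), total_amount(raw))
```

In the first approach the third record is rejected at load time; in the second it is retained, so a later question that needs the `note` field can still be answered.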


BIG DATA LANDSCAPE


ARCHITECTURE LANDSCAPE

[Diagram: architecture landscape. Components: Batch Processing, Real-time Applications, Data Visualization, Data Wrangling, Statistical / Predictive, Batch Data Ingestion (ETL), DataLog, DataFlow, Cluster Management, Data Lake, Streaming, Operational Database, Security, Machine Learning, SQL, Search, Data Governance]


Foundation

• Data Lake

• Distributed processing

• Security

• Batch Ingestion

• SQL-like access Interface

Data Discovery

• Data wrangling

• Data visualisation

Specialised Components

• Real-time application

• Search capability

• Operational database

• Time Series database

Streaming

• Real-time data ingestion

• Simple & Complex event processing

Advanced Analytics

• Machine Learning

• Predictive

DELIVERY FOCUS


SESSION FOCUS

[Diagram: architecture landscape, repeated]


1. Foundation

2. Data Discovery

3. Advanced Analytics

4. Streaming

5. Specialist Components

1. Foundation


FOUNDATION

[Diagram: architecture landscape, repeated, with the Data Lake labelled "Data (Lake) Storage"]


Data Lake

https://en.wikipedia.org/wiki/Comparison_of_distributed_file_systems

FOUNDATION

Many use cases amongst others

• An environment that ensures optimal use of structured and unstructured data to deliver business value

• Central repository of critical and sensitive data

• Data maintained over long duration

• Users can access and analyze data in new and different ways

• Conceptual not technical environment


CUSTOMER FOUNDATION USE CASES

• Analytical Data Lake

• Data ingestion layer


1. Foundation

2. Data Discovery

3. Advanced Analytics

4. Streaming

5. Specialist Components

2. Data Discovery


DATA DISCOVERY

[Diagram: architecture landscape, repeated]


DATA DISCOVERY

Data Wrangling/ Data Exploration/ Data Understanding

“Data wrangling is an iterative agile process for exploring, combining, cleaning and transforming raw data into curated datasets for data science, data discovery, business intelligence and analytics.”
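The four activities named in the definition above (exploring, combining, cleaning, transforming) can be illustrated with a small Python sketch. This is only an illustration under stated assumptions: the `sales` rows, the `regions` lookup, and the function names are hypothetical, and a real wrangling tool would do this interactively and at far larger scale.

```python
from collections import Counter

# Hypothetical raw extracts (illustrative data, not taken from the session).
sales = [
    {"store": " cape town ", "amount": "100.0"},
    {"store": "Durban", "amount": ""},  # missing value
    {"store": "durban", "amount": "40.5"},
]
regions = {"Cape Town": "Western Cape", "Durban": "KwaZulu-Natal"}


def explore(rows):
    """Explore: profile the raw data by counting missing and present amounts."""
    return Counter("missing" if not r["amount"] else "present" for r in rows)


def clean(rows):
    """Clean: normalise store names and drop rows without an amount."""
    return [
        {"store": r["store"].strip().title(), "amount": float(r["amount"])}
        for r in rows
        if r["amount"]
    ]


def combine(rows, lookup):
    """Combine: attach a region to each row from a second dataset."""
    return [{**r, "region": lookup.get(r["store"], "Unknown")} for r in rows]


def transform(rows):
    """Transform: aggregate into a curated total per region."""
    totals = {}
    for r in rows:
        totals[r["region"]] = totals.get(r["region"], 0.0) + r["amount"]
    return totals


profile = explore(sales)
curated = transform(combine(clean(sales), regions))
print(profile)
print(curated)
```

The process is iterative in practice: the profile from `explore` usually prompts another round of cleaning rules before the curated dataset is accepted.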

Data Exploration


DATA DISCOVERY USE CASES

• Data preparation and visualisation

• Data wrangling process, preparation for analytics


1. Foundation

2. Data Discovery

3. Advanced Analytics

4. Streaming

5. Specialist Components

3. Advanced Analytics


ADVANCED ANALYTICS

[Diagram: architecture landscape, repeated]


ADVANCED ANALYTICS

[Diagram labels: Edge Data; Edge Analytics; Data in Motion (cloud and on-premises); Data at Rest (cloud and on-premises); Closed Loop Analytics; Machine Learning; Deep Historical Analysis; Cloud]


ADVANCED ANALYTICS USE CASES

• Work pattern time series analysis

• Risk prediction


1. Foundation

2. Data Discovery

3. Advanced Analytics

4. Streaming

5. Specialist Components

4. Streaming


STREAMING

[Diagram: architecture landscape, repeated]


STREAMING


STREAMING USE CASES

• Retail real-time cash slip processing to detect anomalies and patterns

• IoT device data collection or transport patterns (late buses)
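As a rough illustration of the first use case, anomaly detection on a stream can be sketched as a rolling-window check applied to each value as it arrives. The sketch below is an assumption for illustration only, not the method used in the client work: the function name `detect_anomalies`, the window size, the threshold, and the sample slip amounts are all hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def detect_anomalies(amounts, window=5, threshold=3.0):
    """Yield (index, amount) for values far from the mean of the recent window.

    A value is flagged when it deviates from the rolling mean by more than
    `threshold` population standard deviations of the previous `window` values.
    """
    recent = deque(maxlen=window)
    for i, amount in enumerate(amounts):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            if sigma > 0 and abs(amount - mu) > threshold * sigma:
                yield i, amount
        # The value joins the window whether or not it was flagged.
        recent.append(amount)


# Hypothetical cash slip totals arriving one at a time.
slips = [20.0, 22.0, 19.0, 21.0, 20.0, 23.0, 500.0, 21.0, 20.0]
flagged = list(detect_anomalies(slips))
print(flagged)
```

Because the function is a generator, it processes one event at a time and keeps only the last `window` values in memory, which is the property that matters for streaming. A flagged value still enters the window here, so it inflates the spread for the next few events; excluding flagged values is a design choice a real pipeline would need to make.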


1. Foundation

2. Data Discovery

3. Advanced Analytics

4. Streaming

5. Specialist Components

5. Specialist Components


SPECIALIST COMPONENTS

[Diagram: architecture landscape, repeated]


SPECIALIST COMPONENTS


OPEN TECHNOLOGY LANDSCAPE

[Diagram: architecture landscape, repeated, with HDFS and YARN added]


Transition and Augmentation

HYBRID TECHNOLOGY LANDSCAPE

[Diagram: architecture landscape, repeated, with HDFS and YARN, alongside the labels "Existing DWH", "Existing operational" and "Existing ETL"]


Coffee break


CDP GROUP – WHO ARE WE

GOAL: Driving insight into your business

OUR FOCUS: • Design & develop analytics, planning & data engineering solutions

• Enhance your ability to make high-impact & factually informed decisions

• Reputation based on our deep level of expertise


IBM VALUE-ADD

Augmented Analytics with built-in AI

• Contextual visualisation recommendations

• Automated blend & join data-preparation

• Smart natural language

Data Science Platform for collaboration

• Incorporate all your R, Python, Spark & others

• Model & library management

• Version & deployment control

SQL Workload Enhancement over HIVE

• Rich & complex SQL with outstanding performance

• Re-use existing DW queries against Hadoop

• Federate DW in place augmented with Hadoop


HORTONWORKS OPEN-SOURCE

Modern data architecture

• Scalable, secure, hybrid data lakes

• Enables deep / machine learning

Real-time streaming platform

• Data pipelines, ingestion & routing

• High-volumes from sensors & devices


Hortonworks


HOW DO YOU BEGIN THE MODERN ANALYTICS JOURNEY?


PERFORMANCE PLAN

Our Performance Plan approach:

• identifies core business information requirements

• measures against current capability

• creates a plan to deliver.

Information, once captured:

• presented back as a templated plan

• road map of prioritised quick-win deliveries.

CDP is offering a first-steps Performance Plan engagement following your attendance today:

• 5 Days - $7.5k

Typical initial delivery

20 days or $30k


Thank you