big data analytics - kannan.files.wordpress.com · what is big data analytics •big data is so...


Post on 09-Aug-2020


Big Data Analytics

Why Enterprises are struggling and How Startups can step in

Manish Choudhary, Sunil Guttula, Susheel Kaushik, Raghu Mendu


“Big Data Is Less About Size, And More About Freedom”

―TechCrunch

“Findings: ‘Big Data’ Is More Extreme Than Volume”

― Gartner

“Big Data! It’s Real, It’s Real-time, and It’s Already Changing Your World”

―IDC

“Total data: ‘bigger’ than big data”

― 451 Group


Stop Talking, Let’s Get Started


Digital Shadow – it’s growing continuously


Storage Footprint is also Increasing

"The World’s Technological Capacity to Store, Communicate, and Compute Information", Martin Hilbert and Priscila López (2011), Science

[Chart: storage by year, log scale from 1 to 1,000,000; axis labeled in petabytes, chart titled "Storage in Exabytes"]

Years: 1986, 1993, 2000, 2007, 2012

Data labels: 281; 471; 2,200; 67,000; 667,000

Growth annotations: 10x in 5 yrs; 10x in 5 yrs; 2x in 10 yrs
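The growth annotations on the storage chart (10x in 5 years, 2x in 10 years) are easier to compare as annual rates. The sketch below converts a growth multiple over a period into a compound annual growth rate; the function name `annual_growth_rate` is illustrative and not from the slides.

```python
def annual_growth_rate(multiple: float, years: float) -> float:
    """Compound annual growth rate implied by growing `multiple`-fold in `years` years."""
    if multiple <= 0 or years <= 0:
        raise ValueError("multiple and years must be positive")
    # Solve (1 + r) ** years == multiple for r.
    return multiple ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # The two annotations taken from the chart.
    for label, multiple, years in [("10x in 5 yrs", 10, 5), ("2x in 10 yrs", 2, 10)]:
        print(f"{label}: {annual_growth_rate(multiple, years):.1%} per year")
    # 10x in 5 yrs: 58.5% per year
    # 2x in 10 yrs: 7.2% per year
```

Read this way, the chart's later periods imply storage growing by more than half each year, against roughly 7% a year in the earliest period.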


What is Big Data Analytics

• Big Data is data so large and complex that it becomes difficult to process using existing on-hand database management tools

– Walmart: over 1 million transactions an hour

– 2.5 petabytes = 167 times the information contained in all books

• Big Data Analytics is the application of advanced analytic techniques to very large and diverse data sets

• Volume: the amount of data is ever increasing

• Velocity: data creation speed and insight latency

• Variety: type of data, structured and unstructured


USE CASE: Predict Buyer Behavior to Increase Revenue

Big Data Analytics Enables Increased Per-Customer-Profit

[Chart: Customer Profit (LOW to HIGH) plotted against data leveraged (TRADITIONAL DATA LEVERAGED to BIG DATA LEVERAGED)]

• Legacy System – Agent “Best Guess”

• BI Reporting – Branch Level Reporting Enabling Profit-based Recommendations

• In-Database Analytics – Market Basket Analysis & Customer Lifetime Value Computations Enabling User-based Recommendations

• Big Data Analytics – Data Enriched with Unstructured Activity Logs To Identify At Risk Customers
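Market basket analysis, named on the use-case slide above, looks for products that tend to be bought together. The following is a minimal sketch of its two basic measures, support and confidence, in plain Python. The baskets and product names are made up for illustration, and the slides do not say which implementation the presenters had in mind.

```python
from collections import Counter
from itertools import combinations


def pair_support(baskets):
    """Fraction of baskets that contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # De-duplicate and sort so each pair has one canonical key.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items()}


def confidence(baskets, antecedent, consequent):
    """Of the baskets containing `antecedent`, the share that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(1 for b in with_antecedent if consequent in b) / len(with_antecedent)


if __name__ == "__main__":
    # Hypothetical transactions, not data from the presentation.
    baskets = [
        ["bread", "milk"],
        ["bread", "butter"],
        ["bread", "milk", "butter"],
        ["milk"],
    ]
    print(pair_support(baskets)[("bread", "milk")])  # 0.5
    print(confidence(baskets, "bread", "milk"))      # 2 of the 3 bread baskets
```

A recommendation rule such as "customers who bought bread are offered milk" would typically be kept only if both numbers clear thresholds chosen by the business.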


USE CASE: Reduce Risk With External Data

Big Data Analytics Enables Global Crisis Avoidance

[Chart: Underwriting Risk (LOW to HIGH) plotted against data leveraged (TRADITIONAL DATA LEVERAGED to BIG DATA LEVERAGED)]

• Legacy System – Monthly Risk Model Updates

• BI Reporting – Daily Risk Model Updates

• In-Database Analytics – K-Means Clustering & Decision Tree Scoring Improves Accuracy, Delivering In Minutes What Was Days

• Big Data Analytics – Unstructured Data Sources Enrich The Data
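The risk use case above names K-Means clustering as one of the in-database techniques. As a rough illustration of what that step does, here is a minimal sketch of the standard Lloyd's iteration in plain Python. The two-feature points and the starting centroids are hypothetical, and a production system would run a library or in-database implementation rather than this loop.

```python
from math import dist


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def kmeans(points, initial_centroids, max_iterations=20):
    """Lloyd's algorithm: assign each point to its nearest centroid, then recompute means."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        updated = []
        for old, members in zip(centroids, clusters):
            if members:
                # Mean of each coordinate across the cluster's members.
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)  # keep the centroid of an empty cluster
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


if __name__ == "__main__":
    # Hypothetical applicants described by two made-up risk features.
    points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
    centroids, labels = kmeans(points, [(0, 0), (10, 10)])
    print(labels)  # [0, 0, 0, 1, 1, 1]
```

The initial centroids are passed in explicitly so that the result is reproducible; random initialisation is more common in practice.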


Staying Competitive

• Earned Run Average (ERA) – Used to define pitchers’ “productivity” for almost 100 years in baseball

– Statistic that measures how good pitchers are at preventing runs

• Skill-Interactive Earned Run Average (SIERA)

SIERA = 6.145 - 16.986*(SO/PA) + 11.434*(BB/PA) - 1.858*((GB-FB-PU)/PA)
        + 7.653*((SO/PA)^2) +/- 6.664*(((GB-FB-PU)/PA)^2)
        + 0.130*(SO/PA)*((GB-FB-PU)/PA) - 5.195*(BB/PA)*((GB-FB-PU)/PA)

Source: New York Times
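To make the contrast between the two statistics concrete, the sketch below computes both. ERA uses its standard definition (earned runs per nine innings). The SIERA function copies the coefficients exactly as printed on the slide, so they should be checked against the published definition before any real use. The slide does not say how the "+/-" term is signed; the code assumes it is subtracted when (GB-FB-PU)/PA is positive and added otherwise, and that choice is an assumption.

```python
def era(earned_runs: float, innings_pitched: float) -> float:
    """Earned Run Average: earned runs allowed per nine innings pitched."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9.0 * earned_runs / innings_pitched


def siera(so: float, bb: float, gb: float, fb: float, pu: float, pa: float) -> float:
    """SIERA with the coefficients as printed on the slide.

    so = strikeouts, bb = walks, gb = ground balls, fb = fly balls,
    pu = pop-ups, pa = plate appearances.
    """
    if pa <= 0:
        raise ValueError("pa must be positive")
    k = so / pa              # strikeout rate
    w = bb / pa              # walk rate
    g = (gb - fb - pu) / pa  # net ground-ball rate
    # Assumption: the "+/-" term takes the sign opposite to g.
    sign = -1.0 if g > 0 else 1.0
    return (6.145 - 16.986 * k + 11.434 * w - 1.858 * g
            + 7.653 * k ** 2 + sign * 6.664 * g ** 2
            + 0.130 * k * g - 5.195 * w * g)


if __name__ == "__main__":
    print(era(50, 150))  # 3.0
    # Made-up season line: 20 SO, 8 BB, 40 GB, 25 FB, 5 PU in 100 PA.
    print(round(siera(20, 8, 40, 25, 5, 100), 3))
```

ERA needs only runs and innings, while SIERA draws on per-plate-appearance outcomes, which is why it depends on much finer-grained data.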


DATA SCIENCE TEAM

Roles: Data Scientist, Data Engineer, Data Analyst, BI Analyst, LOB User, Data Platform Admin

Layers:

• Data Manipulation, Analytics and Visualization

• Analytics Productivity Layer

• Massively Parallel Processing (SQL, NoSQL)

• Cloud, x86 Infrastructure, or Appliance


Data Scientist

• Extract meaning from data and create data products

• Leveraging: Math, Statistics, Machine Learning, Pattern Recognition, Advanced Computing, Uncertainty Modeling, Visualization…

Data Scientist Is The 'Sexiest Job Of The 21st Century' - Harvard Business Review


Big Data Analytics Reference Architecture

Stages: Ingestion, Assimilation, Store and Access, Analysis, Visualization

[Diagram; labels recovered by group]

• Data Sources – Structured Data Sources: POS, CRM, ERP, LOB Data; also Web/Social, Multimedia, Machine, Mobile, Documents

• Traditional Data Integration – ETL, MDM, Data Quality

• Traditional Data Warehousing – Federated Data Warehouse, Enterprise Data Warehouse, Data Marts (BU 1, BU 2, BU 3), SQL Stores

• BIG Data Analytics Platforms (MPP Databases, NoSQL Stores) – EMC Greenplum, Apache Hadoop, HP Vertica, IBM Netezza, Oracle ExaData, Teradata

• Traditional Analysis – Statistics, Data Mining, Operations Research, Neural Networks, OLAP, Genetic Algorithms

• Visualization – Alerts, Dashboard, Spreadsheet, Report

• Big Data Analytics Ramifications – BI as a Service, Automation, Mobile, Data Visualization


Early Adopters of Big Data Analytics Online Use Case

Online Companies

• Media – Ad Optimization, Article Categorization

• Retailers – Product Recommendation

Telco

• Churn Prediction

Banking

• Product Recommendation

Marketing

• Sentiment Analysis

Why they worked

• Everything at scale

• Actions were automated

• New business process

Data Scientists

• Built complex optimization and recommendation models


Enterprise Adoption of Big Data Analytics

[Chart: Technology Adoption Process curve with segments Innovators, Early Adopters, "The Chasm", Early Majority, Late Majority, Laggards]

Annotations on the curve:

• Data Scientists working with Business Users

• Business Users collaborating with Data Scientists

• Business Users

• Online Enterprises


Great Panel

• Manish Choudhary – MD at Pitney Bowes (Consumer)

• Sunil Guttula – CEO, Bizosys Technologies (Producer)

• Raghu Mendu – Co-Founder, Ventureast (Investor)

• Susheel Kaushik – Senior Director Product Management, EMC (All Rounder)


Backup Slides