Other brands and names are the property of their respective owners.
APAC Big Data & Cloud Summit 2013
Big Data Analytics & Hadoop Use Cases
Eddie Toh
Server Marketing Manager
21 August 2013
From the dawn of civilization until 2003, we humans
created 5 exabytes of information. Now we create
that same amount of information in two days! In
2012, the digital universe of data will expand to 2.72
zettabytes (ZB). Then it’s predicted to double every
two years.
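As a rough check of the doubling claim, the sketch below projects the 2.72 ZB figure forward. It assumes smooth exponential growth with a two-year doubling period; the function name and that smoothness assumption are illustrative and not from the slide.

```python
def projected_size_zb(year, base_year=2012, base_zb=2.72, doubling_years=2.0):
    """Project data volume in zettabytes, assuming it doubles every `doubling_years`.

    Assumption: growth is a smooth exponential anchored at 2.72 ZB in 2012.
    """
    return base_zb * 2 ** ((year - base_year) / doubling_years)


if __name__ == "__main__":
    for year in (2012, 2014, 2015, 2016, 2020):
        print(year, round(projected_size_zb(year), 2), "ZB")
```

Under this assumption 2015 comes out near 7.7 ZB, in the same range as the 7.9 ZB figure quoted on the next slide.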
Big Data – Volume, Velocity, Variety (& Value)
7.9 ZB by 2015
3x more bits in the digital universe than stars in the physical universe
450 Billion Business transactions per day by 2020 (IDC)
$600 Bn Potential value to US healthcare
90% of data in the world was created in the last 2 years.
100 years Worth of video uploaded to YouTube every 10 days
>5 Billion People calling, texting, tweeting & browsing on cell phones
How Will Businesses Manage a 50x Data Growth by 2020 in an Affordable Way?
“In God we trust, all others bring data” — NASA, Johnson Space Center
Therapies tailored to a person's genome
Decoding the human genome:
• From 10 years to hours
• On track to hit <$1000 per person
Explosive growth, 30 Tb/month billing data
Radical overhaul of customer service:
• Self service, realtime access
• 30x performance increase
BIG DATA
MACHINE GENERATED
HUMAN GENERATED
BUSINESS GENERATED
Edge
Scale Up
Distributed
REQUIRES DIFFERENT APPROACHES
Govt.
Healthcare
Retail
Big Data use cases across industries
Education
Financial Services
Democratize data analysis from edge to cloud…
Intel can deliver end-to-end analytics, from intelligent systems
at the edge to the datacenter/cloud
End-uses & DR
Distribution System
Transmission System
Energy Storage
Fuel Supply System
Fuel Source/Storage
Power Plants
Renewable Plants
Data Collection and Processing
Predictive Analytics
Sensors
Controllers
End-to-End Power Delivery Chain Operation
Monitoring, Ingestion, Modeling, Analysis, Coordination & Control
Will Big Data be the difference between success and failure of a political campaign?
“This was the first
presidential election
campaign where all of the
data that was coming into
the campaign was
successfully collected and
centralized.”
“The Obama campaign
did a successful job with
that; the Romney
campaign did not”
John Aristotle Phillips, Chief
Executive of Aristotle International
(WSJ 11/29/12)
Eric Dishman Video
From Intuition to Predictive Analytics: Big Data maturity framework
Big Data Adoption
Introduction
Deployment
Production
Ongoing
Application
Business Challenges
a) Cost reduction
b) Competitive differentiation and innovation
c) Revenue growth
• Revenue growth
• Competitive differentiation and innovation
• Cost reduction
• Obtain / maintain customer loyalty, targeted focus
• Competitive differentiation and innovation
• Revenue growth
• Stabilizing revenue generation
Intuition vs. Analytics
Future strategies, day-to-day operations
Enhanced analytics
A data-driven enterprise
Descriptive analytics
• Financial and operations management
• Sales and marketing
• Customer services
Prescriptive analytics
• Strategic business development
• Research and product development
• Enhanced customer services
Predictive analytics
• Risk management
• Real-time customer experience
• Automated resource allocation
• Value creation and brand management
Bus. Strategy
KPIs
LOB
Reporting
Visual
Structures
Machine Learning
Algorithm
Analysis
Integration
Query Performance
Transport
Transformation
Warehousing
Efficiency
Trust
Workload
Governance
Tools
Network
Compute
Storage
Big Data Functional Models “Where do I start?”
Big Data Infrastructure
Data Management
Data Usage
Hadoop Framework
Data Science
Domain Expertise
Value
Data Ingestion and Processing
Transform question to algorithm
Asking the right question
Driving Value from Big Data depends on Quality, Accuracy and Efficiency
Lastly… Intel® Distribution for Apache Hadoop*
Performance
Security Management
Intel Architecture
Backup
Intel® Distribution for Apache Hadoop*: What did we launch…
Focus on near real-time analytics w/ HBase & Hive enhancements
Access control, encryption, secure data movement
Job throughput efficiency for HDFS
Dynamic replication for HDFS & HBase
Intel optimized total solution architecture: distro, storage, network, compute
[Chart: Open Source 700 vs. Optimized Intel IA/Distro 3500, on an axis from 0 to 4000 (units not shown). 5X performance for real-time jobs.]
HBase as the data store. Query all CDRs in a month
− Inserting 10,000 records/second/server
− Read from disk: >400 queries/second/server
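To put the per-server rates above in daily terms, the sketch below does the arithmetic. It assumes the rates are sustained around the clock on a single server, which the slide does not state; the helper name is illustrative. The 700 and 3500 inputs are the two chart values, whose units are not given.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def daily_volume(rate_per_second_per_server, servers=1):
    """Operations per day, assuming the per-second rate is sustained all day."""
    return rate_per_second_per_server * SECONDS_PER_DAY * servers


# Rates quoted on the slide (">400" is treated as a lower bound of 400).
inserts_per_day = daily_volume(10_000)
queries_per_day = daily_volume(400)

# Ratio of the two chart values (optimized vs. open source).
speedup = 3500 / 700

if __name__ == "__main__":
    print(f"inserts/day/server: {inserts_per_day:,}")
    print(f"queries/day/server (lower bound): {queries_per_day:,}")
    print(f"chart ratio: {speedup:.0f}x")
```

The chart ratio of 3500 to 700 matches the "5X" label.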
Intel® Manager for Apache Hadoop software: Deployment, Configuration, Monitoring, Alerts, and Security
HDFS: Hadoop Distributed File System
YARN (MRv2): Distributed Processing Framework
HBase: Columnar Store
ZooKeeper: Coordination
Flume: Log Collector
Sqoop: Data Exchange
Pig: Scripting
Hive: SQL Query
Oozie: Workflow
Mahout: Machine Learning
R connectors: Statistics
Intel enhancements contributed back to open source
Open source components included without change
Intel unique
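The processing layer in the stack above, YARN (MRv2), runs MapReduce-style jobs. As a minimal single-process sketch of that programming model, the example below counts calls per caller. It is plain Python, not the Hadoop API, and the record fields (`caller`, `callee`) and function names are hypothetical.

```python
from collections import defaultdict


def map_phase(records):
    """Map: emit a (key, 1) pair for every call record."""
    for record in records:
        yield record["caller"], 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def count_calls(records):
    """Run map, shuffle, and reduce in sequence on in-memory records."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    # Hypothetical call records, for illustration only.
    sample = [
        {"caller": "A", "callee": "B"},
        {"caller": "B", "callee": "C"},
        {"caller": "A", "callee": "C"},
    ]
    print(count_calls(sample))
```

In a real cluster the map and reduce steps run in parallel across nodes and the shuffle moves data over the network; here all three run in one process to show the data flow only.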