SJSU Business School: Guest Lecture - Big Data in Business (Sept 28, 2015)
Post on 12-Apr-2017
1© Copyright 2013 Pivotal. All rights reserved. 1.
@krishdpi
The Foundation for Change
Big Data in Business
9/28/2015
SJSU
SK (Saravana Krishnamurthy)
Dir. of Product Management, Motorola Mobility
kriss@mba.berkeley.edu
@krishdpi
http://www.linkedin.com/in/kriss
What is “Big Data”?
“Big Data” refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze. This definition is intentionally subjective and incorporates a moving definition of how big a dataset needs to be in order to be considered big data—i.e., we don’t define big data in terms of being larger than a certain number of terabytes (thousands of gigabytes).
- McKinsey Global Institute, May 2011
“Big Data Is Less About Size, And More About Freedom”
- TechCrunch
“Findings: ‘Big Data’ Is More Extreme Than Volume”
- Gartner
“Big Data! It’s Real, It’s Real-time, and It’s Already Changing Your World”
- IDC
“Total data: ‘bigger’ than big data”
- 451 Group
THE ERA OF BIG DATA IS HERE
Data Volume Growing 44x
2020: 35.2 Zettabytes
2010: 1.2 Zettabytes
The Digital Universe 2010 - 2020
Source: IDC Digital Universe Study, sponsored by EMC, May 2010
Growth of Data
5 Exabytes of online data in 2002
281 Exabytes by 2009
56x growth over 7 years
Source: Marissa Mayer
• By 2015, Mobile data traffic is predicted to be 75 Exabytes annually – Cisco
• Healthcare (as of 2011) is calculated at 150 Exabytes – SAS
• The smallest and most conservative growth rate shows 100,000 Exabytes of data by 2020 – Digital Universe Study by IDC
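The growth figures above can be checked with a few lines of arithmetic. The sketch below uses only the 2002 and 2009 numbers quoted on the slide; the helper names `growth_multiple` and `annual_growth_rate` are illustrative, and the compound annual rate is a derived figure that does not appear in the original deck.

```python
def growth_multiple(start, end):
    """How many times larger `end` is than `start`."""
    return end / start


def annual_growth_rate(start, end, years):
    """Compound annual growth rate implied by going from `start` to `end`."""
    return (end / start) ** (1 / years) - 1


# Figures quoted on the slide, in exabytes.
online_2002, online_2009 = 5, 281

multiple = growth_multiple(online_2002, online_2009)
rate = annual_growth_rate(online_2002, online_2009, years=7)

print(f"{multiple:.1f}x over 7 years")
print(f"about {rate:.0%} compound growth per year")
```

Read this way, the 56x headline corresponds to the volume of online data growing by a bit more than three quarters every year across the period.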
[Chart: Big Data Platform Price/TB, 2008 to 2013, vertical axis from $0 to $80,000; series: Big Data DB and Hadoop]
Economics Have Changed the Game
Big Data Analytics: The Path to Business Value
IN THE BIG DATA ERA: ANALYTICS ARE THE KEY TO SUCCESS
Analytics Terms
Analytics: The practice of applying aggregations, statistics and models to large datasets to solve problems in business and industry.
Business intelligence: Another term for analytics, but often used to refer specifically to reporting, OLAP and other descriptive statistics.
Data mining: Extracting patterns and insights from large datasets using tools from statistics and machine learning.
Machine learning: Algorithms that allow computers to learn behaviors from data.
Analytics Evolution Desired by Customer
[Chart: business value (low to high) plotted against time (past to future)]
Then: Business Intelligence (Descriptive)
Now: Predictive Analytics and Data Mining
Some Definitions
Descriptive Analytics:
- Raw facts
- Summaries
- Nice charts
- Slice & Dice
- History, up to this moment
Predictive Analytics:
- Patterns from the past
- Statistically relevant
- Current conditions
- Events that are likely to happen
- Data Mining, Machine Learning
- 70% of those who bought A and B also bought C
- John bought A and B …
Prescriptive Analytics:
- Large number of options or possible actions
- Provides the best one
- Operations Research
- Store Assortment
- Shelf-Space Optimization
Perf. Mgmt. Analytics:
- Descriptive Analytics
- Plus Goals
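The predictive example on this slide (a share of customers who bought A and B also bought C) is what data mining calls the confidence of an association rule. A minimal sketch, assuming purchase histories are available as one set of items per customer; the `rule_confidence` helper and the basket data are hypothetical and not taken from the lecture.

```python
def rule_confidence(transactions, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_antecedent = [t for t in transactions if antecedent <= set(t)]
    if not with_antecedent:
        return 0.0
    with_both = [t for t in with_antecedent if consequent <= set(t)]
    return len(with_both) / len(with_antecedent)


# Hypothetical purchase histories, one basket per customer.
baskets = [
    {"A", "B", "C"},
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "C", "D"},
]

confidence = rule_confidence(baskets, {"A", "B"}, {"C"})
print(f"{confidence:.0%} of customers who bought A and B also bought C")
```

Counting how often the pattern held is the descriptive step. The predictive step is applying it to a customer like John, who has bought A and B but not yet C.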
Private/Hybrid Cloud Infrastructure or Appliance
Data Access & Query Layer
Tools & Services
Analytic Productivity Layer
Hadoop
Data Scientist
Data Engineer
Data Analyst
BI Analyst
LOB User
Database
Data Platform Admin
DATA SCIENCE TEAM
Visualization Layer
CxO/Decision Maker
Is this user likely to be interested in this ad? Conjugate Gradient, SVM
Which campaign is working better? Mann-Whitney U Test
Does this product appeal to some segments more than others? Log-likelihood
How do I do hyper-targeting of my high-value frequent visitors? Cohort analysis
How can I tell if certain advertisers are fraudulent? tf-idf and Cosine Similarity
Which features of a campaign result in user revisits? Regression
How do I segment users? K-means clustering
What are people saying about my new product launch? MapReduce, Sparse Vectors, K-Means
How do I optimize my SKUs? Genetic Algorithms
How do I promote increased usage of credit/loyalty cards? Decision Trees
Advanced Analytics
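To make one pairing from the list above concrete, the sketch below applies tf-idf weighting and cosine similarity to ad text. It assumes that one signal of a fraudulent advertiser is near-duplicate ad copy posted under different accounts; that reading, the sample ads, and the helper names `tfidf_vectors` and `cosine_similarity` are illustrative assumptions, not the method described in the lecture.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Turn tokenized documents into sparse tf-idf vectors (dicts of term -> weight)."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical ad copy from three advertisers, already lowercased and tokenized.
ads = [
    "win a free phone click now".split(),
    "click now to win a free phone".split(),
    "handmade oak furniture built to order".split(),
]

vectors = tfidf_vectors(ads)
near_duplicate = cosine_similarity(vectors[0], vectors[1])
unrelated = cosine_similarity(vectors[0], vectors[2])
print(f"ads 0 and 1: {near_duplicate:.2f}, ads 0 and 2: {unrelated:.2f}")
```

The vectors are stored as dictionaries holding only the terms that occur, which is the sparse-vector idea named elsewhere in the list. A score close to 1 flags the pair for review; where to set the threshold would depend on the data.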
Industries Are Broadly Embracing Big Data
Retail
• CRM – Customer Scoring
• Store Siting and Layout
• Fraud Detection / Prevention
• Supply Chain Optimization
Advertising & Public Relations
• Demand Signaling
• Ad Targeting
• Sentiment Analysis
• Customer Acquisition
Financial Services
• Algorithmic Trading
• Risk Analysis
• Fraud Detection
• Portfolio Analysis
Media & Telecommunications
• Network Optimization
• Customer Scoring
• Churn Prevention
• Fraud Prevention
Manufacturing
• Product Research
• Engineering Analytics
• Process & Quality Analysis
• Distribution Optimization
Energy
• Smart Grid
• Exploration
Government
• Market Governance
• Counter-Terrorism
• Econometrics
• Health Informatics
Healthcare & Life Sciences
• Pharmaco-Genomics
• Bio-Informatics
• Pharmaceutical Research
• Clinical Outcomes Research
Big Data Users
Big Data Ecosystem Enablers
Oracle, SQL Server, SAP HANA, Teradata, Greenplum
MS Excel, SAS, Business Objects, Pivotal
Platform Support: RedHat, Windows Server, Pivotal, VMware
Modern Big Data Architecture
Use Cases
Flight Test
Objective
• Optimize flight time
Problem
• Manual diagnostics
• A 4-hour test flight produces 2 TB of data
• 400,000 parameters recorded, only 4,000 widely used
Solution
• Real-time big data analytics
• Machine learning
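The figures in the problem statement give a feel for the scale. A rough sketch, assuming the 2 TB is a decimal terabyte and the data arrives evenly over the flight; the slide states neither, and the variable names are illustrative.

```python
TB = 10 ** 12  # decimal terabyte in bytes (assumption: the slide does not say TB vs TiB)

flight_hours = 4
flight_bytes = 2 * TB
recorded_parameters = 400_000
widely_used_parameters = 4_000

# Average data rate over the flight, assuming an even flow of data.
bytes_per_second = flight_bytes / (flight_hours * 3600)
# Fraction of the recorded parameters that the slide describes as widely used.
used_share = widely_used_parameters / recorded_parameters

print(f"average rate: {bytes_per_second / 10**6:.0f} MB/s")
print(f"share of parameters widely used: {used_share:.0%}")
```

A sustained rate on this order, with most recorded parameters rarely examined, is consistent with the slide's point that manual diagnostics do not keep up.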
Objective
• Improve patient care
Problem
• Scattered member data
• Frequent hospital visits
Solution
• Combine behavioral and contextual data
• Utilize member history and data science
• Provide accurate diagnostics
Physical Data Strategy
Data Flow Use Case
Extreme OLTP (Cassandra)
Streaming Data
Interactive Data
Operational (DB2, Oracle, Informix)
Landing (Hadoop)
Repository (Teradata)
OLTP (DB2, Oracle, Informix)
Repository (lower SLA) (Greenplum)
Batch
BI (Teradata, Oracle)
General BI
Perf Analytics (Greenplum)
Lab Analytics (Hadoop)
RL 2.0
Analytics Lab
Q & A