Big Data Insights, Part I
TRANSCRIPT
Government
• Street Bump mobile app – City of Boston initiative
• City gets real-time information on “bump” data
Car Insurance
• More granular pricing
• Address more in-depth questions
Recruiting
• How to hire better employees and retain them?
Simple solutions to Big problems
Source: Phil Simon, Too Big to Ignore: The Business Case for Big Data
Consumer-driven trends and technologies
Data Science
Infonomics Platforms
Sandy, Politics,
Social Media
Big Data Influence
Data Science
Ultimate Goal – Improve Decision making
principles
Frameworks
Data analytic thinking
• Extract patterns
• Mining for useful knowledge
• Models
• Process
• Stages
• Assess how data can improve performance
• Understand data science
• See data-oriented competitive threats
• Question
Big Data Advantage – Analytics and decision management
Decision making
Data deluge
Techniques Solutions
Rewards
Source: Forbes.com, Cloud Predictive Analytics most used to gain customer insight, 10/24/2013
Data types in a Big Data context
Big Data Use Cases – Financial Sector
Source – HP Sponsored white paper, Case for Big Data in the Financial Services sector, IDC Financial Insights opinion 2012
Investment banks - speed post-trade settlement, confirmation
Hedge funds - optimize price discovery
Retail banking - create merchant intelligence and assist in optimizing offers and pricing
Social media tracking for marketing campaigns
Investment research to understand traffic patterns, trends
Community banks - analyzing unstructured data to anticipate workloads in call centers
Social analytics towards product and service initiatives
Big Data Use Cases – Government
Source – IBM, “Accelerate Analytics and harness Big data within government”
Improve citizen and business services - smarter social services; fight fraud, abuse and errors
Manage resources effectively – tax compliance; fight fraud, abuse and errors
Strengthen public safety – crime prediction and prevention
Strengthen national security and defense - threat prediction and prevention, cyber security, video analytics
Big Data Capabilities – health care
Source: McKinsey Report titled Big Data Revolution in health care, exhibit 9
Risk stratification – patient identification for integrated care
Risk-adjusted benchmark/simulation of hospital productivity
Personal health records
Evaluation – identification of patients with negative drug-drug interactions, potential diseases
Systematic reporting of misuse of drugs, systematic identification of obsolete drug usage
Monitoring – identification of inappropriate medication
Data mining – Why did it happen?
Evaluation – clinical pathways, drug efficacy based on real-world data
Big Data Analytics capabilities – travel and transportation
Source: IBM, Big Data and Analytics in Travel and Transportation white paper, figure 4
Maintenance and Engineering – asset management data, Spec sheets, product data
Capacity and Pricing optimization
Call center – call logs, voice/audio, incidents, email, text
Images, videos, graphics – track, rail segment images and video
Geospatial and temporal – GIS, GPS, weather, Environmental
Sensors, Actuators and detectors – Equipment, wheels, engines, Tracks
Data Analytics for Software Assessment/ Evaluation
Source: http://blog.sei.cmu.edu/post.cfm/data-analytics-for-open-source-software-assessment
Context and challenges
Meeting milestones
Documentation
User base growth over time
Bugs
Developer involvement over time
Analytics - Explained
Source: Analytics 3.0 by Thomas H Davenport, HBR, Dec 2013
Analytics 1.0: Era of business intelligence; go beyond intuition to fact-based comprehension for decision making. Era of the enterprise data warehouse. Dominant for about 50 years.
Analytics 2.0: From about 2005 onwards, led by Internet-based and social network firms – Google, eBay, LinkedIn. Data is not only internal but also externally sourced: sensors, public data initiatives, multimedia recordings. Innovative technologies: NoSQL, Hadoop, machine learning. Requires computational and analytical skills.
Analytics 3.0: Data-enriched offerings for every industry. Driven by analytics, rooted in enormous amounts of data. Coexistence of traditional and new.
Information Providers Insight Providers
Companies Capitalizing on Analytics
Essence of Analytics 3.0:
“The resolve by a company’s management to compete on analytics not only in the traditional sense (by improving internal business decisions) but also by creating more valuable products and services”
Analytics 3.0 by Thomas H. Davenport, Dec 2013, Harvard Business Review
What is different from the past?
Ability to handle new varieties of data – voice, text, log files, images, video on a large scale
Sensors and operational data gathering devices in motion to optimize
Cost savings of storage – database to database appliance to a Hadoop cluster
Big companies always wrestled with the data volume issues. Bigness is not new! Variety is new!
Source: Big Data in Big companies, May 2013: Authored by: Thomas H. Davenport, Jill Dyché
Big Data Techniques – explained
Sources:
Data Science for Business, Chapter 2, Business Problems and Data Science Solutions
Too Big to Ignore by Philip Simon, Chapter 3, The Elements of Persuasion: Big Data Techniques
Techniques
Statistical – Regression, A/B Testing
Data visualization - Heat Maps, Time Series analysis
Automation – Machine learning, Sensors, Nanotechnology, RFID and NFC
Semantics – natural language processing, text analytics, sentiment analysis
Predictive analytics
Collaborative Filtering
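As an illustration of the A/B testing technique listed above, here is a minimal sketch of a two-proportion z-test using only the Python standard library. The function name and the visitor/conversion counts are hypothetical and chosen only for illustration; they do not come from the cited sources, and this is one common way to evaluate an A/B test, not the method those books prescribe.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variants A and B.

    Returns (z, p_value), where p_value is the two-sided p-value
    under the normal approximation with a pooled standard error.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 200 of 5000 visitors convert on A, 250 of 5000 on B.
z, p = two_proportion_z_test(200, 5000, 250, 5000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference in conversion rates is unlikely to be chance alone; the approximation assumes reasonably large samples in both groups.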
Business problems to Data Mining tasks
BI reporting
Visualization
Functional Applications
Industry Applications
Predictive Analytics
Content Analytics
Analytics solutions
Source: IBM Big Data Application layer
Source: InformationWeek, 16 top big data analytics platforms, 1/30/2014
Top 16 Big Data Analytics Platforms
Platform connections
Business platforms, Gang of four – Amazon, Apple, Google, Facebook
More businesses setting platform trends – industry-wide transformation: Netflix, LinkedIn
Third Platform – popularized by IDC for social, mobile, cloud, Big data/analytics and emerging markets
Mainframe, terminals – Level 1 platforms (’70s)
Tiered architectures (client-server, 2-tier) (’80s-’90s)
Multi-tiered architectures (2000+)
Social, mobile, cloud, Big data/analytics convergence (2010+)
Value shifts for the enterprise
Big Data Optimizations – Concept
Distributed optimization
Parallel optimization
Large scale optimizations
Optimizations and Challenges – know-how for handling bottlenecks
Industry level Research – SAS, IBM
Statistics – optimization challenges
Computational challenges
Myths and Overlaps
Not just another hype of data related decisions and insights – requires a new mindset
People – roles
Data Scientists
Statisticians
Business Analysis
• Find story in a data set
• Experimental, exploratory
• Data mining
• Statistical analysis
• Predictive model development
• Multi dimensional analysis
• Visual, data discovery
Reference: Davenport, T. H., & Patil, D. J. (2012). Harvard Business Review, October 2012, pp. 70-76
Big Data and Analytics technologies – supplementing RDBMS’s
Scalable MPP data warehouse
Hadoop
NewSQL
Graph database
NoSQL
Reference: WHITE PAPER Discovering the Value of a Data Discovery Platform, Sponsored by: Teradata, Dan Vesset, September 2013
Impact on management
New skills and new management style
Reference: McAfee, A., & Brynjolfsson, E. (2012). Big Data: The Management Revolution (cover story). Harvard Business Review, 90(10), 60-68.
Data driven companies, evidence based decisions
Look for opportunities based on Big data in every business function
Leadership, talent, technology, Organizational culture
Experimental and exploratory
Introductory EMC Videos – Animated
http://www.youtube.com/watch?v=eEpxN0htRKI#t=67
Big Ideas – Simplifying cluster architectures http://www.youtube.com/watch?v=4M3cROio9vU
Big Ideas - How big is Big data?
Big ideas – Why Big Data matters
http://www.youtube.com/watch?v=rTAn1bvy8vU
Big Ideas – Demystifying Hadoop http://www.youtube.com/watch?v=xJHv5t8jcM8
EMC – Big Ideas videos http://www.youtube.com/playlist?list=PLD298CBF8D0908E4C&feature=view_all