Big Data Insights, Part I

Big Data Raji Gogulapati


Post on 20-Aug-2015


Government
• Street Bump mobile app – City of Boston initiative
• City gets real-time information on “bump” data

Car Insurance
• More granular pricing
• Address more in-depth questions

Recruiting
• How to hire better employees and retain them?

Simple solutions to Big problems

Source: Phil Simon, Too Big to Ignore

Consumer-driven trends and technologies

Data Science

Infonomics Platforms

Sandy, Politics, Social Media

Big Data Influence

Describe Big Data

Incomplete

Fragmented

Not precise

Dynamic

External

Unmanageable

Democratic

???

Data Science

Ultimate Goal – Improve Decision making

Principles
• Extract patterns
• Mining for useful knowledge

Frameworks
• Models
• Process
• Stages

Data analytic thinking
• Assess how data can improve performance
• Understand data science
• See data-oriented competitive threats
• Question

Big Data Advantage – Analytics and decision management

Decision making

Data deluge

Techniques Solutions

Rewards

Techniques

Statistical

Visualization

Semantics

Automation

Predictive Analytics

Plunge Issues

Problems

Source: Forbes.com, Cloud Predictive Analytics most used to gain customer insight, 10/24/2013

Data types in a Big Data context

Big Data Use Cases – Financial Sector

Source – HP Sponsored white paper, Case for Big Data in the Financial Services sector, IDC Financial Insights opinion 2012

Investment banks - speed post-trade settlement, confirmation

Hedge funds - optimize price discovery

Retail banking - create merchant intelligence and assist in optimizing offers and pricing

Social media tracking for marketing campaigns

Investment research to understand traffic patterns, trends

Community banks

Analyzing unstructured data to anticipate workloads in call centers

Social analytics towards product and service initiatives

Big Data Use Cases – Government

Source – IBM, “Accelerate Analytics and harness Big data within government”

Improve citizen and business services – smarter social services; fight fraud, abuse and errors

Manage resources effectively – tax compliance

Fight fraud, abuse and errors

Strengthen public safety – crime prediction and prevention

Strengthen national security and defense – threat prediction and prevention, cyber security, video analytics

Big Data Capabilities – health care

Source: McKinsey Report titled Big Data Revolution in health care, exhibit 9

Risk stratification – patient identification for integrated care

Risk adjusted benchmark/ simulation of hospital productivity

Personal health records

Evaluation – identification of patients with negative drug-drug interactions, potential diseases

Systematic reporting of misuse of drugs; systematic identification of obsolete drug usage

Monitoring - Identification of inappropriate medication

Data mining – Why did it happen?

Evaluation – clinical pathways, drug efficacy based on real-world data

Big Data Analytics capabilities – travel and transportation

Source: IBM, Big Data and Analytics in Travel and Transportation white paper, figure 4

Maintenance and Engineering – asset management data, Spec sheets, product data

Capacity and Pricing optimization

Call center – call logs, voice/audio, incidents, email, text

Images, videos, graphics – track and rail segment images and video

Geospatial and temporal – GIS, GPS, weather, Environmental

Sensors, Actuators and detectors – Equipment, wheels, engines, Tracks

Data Analytics for Software Assessment/ Evaluation

Source: http://blog.sei.cmu.edu/post.cfm/data-analytics-for-open-source-software-assessment

Context and challenges

Meeting milestones

Documentation

User base growth over time

Bugs

Developer involvement over time

Analytics - Explained

Source: Analytics 3.0 by Thomas H Davenport, HBR, Dec 2013

Analytics 1.0: Era of business intelligence. Go beyond intuition to fact-based comprehension for decision making. Era of the enterprise data warehouse. Dominant for about 50 years.

Analytics 2.0: From about 2005 onwards. Internet-based and social network firms (Google, eBay, LinkedIn). Data not only internal but also externally sourced: sensors, public data initiatives, multimedia recordings. Innovative technologies: NoSQL, Hadoop, machine learning. Computational and analytical skills.

Analytics 3.0: Data-enriched offerings for every industry. Driven by analytics, rooted in enormous amounts of data. Coexistence of traditional and new.

Information Providers Insight Providers

Companies Capitalizing on Analytics

Essence of Analytics 3.0:

“The resolve by a company’s management to compete on analytics not only in the traditional sense (by improving internal business decisions) but also by creating more valuable products and services”

Analytics 3.0 by Thomas H. Davenport, Dec 2013, Harvard Business Review

What is different from the past?

Ability to handle new varieties of data – voice, text, log files, images, video on a large scale

Sensors and operational data gathering devices in motion to optimize

Cost savings of storage – database to database appliance to a Hadoop cluster

Big companies have always wrestled with data volume issues. Bigness is not new! Variety is new!


Source: Thomas H. Davenport and Jill Dyché, Big Data in Big Companies, May 2013

Big Data Techniques – explained

Sources:
• Data Science for Business, Chapter 2, Business Problems and Data Science Solutions
• Too Big to Ignore by Philip Simon, Chapter 3, The Elements of Persuasion: Big Data Techniques

Techniques

Statistical – Regression, A/B Testing

Data visualization - Heat Maps, Time Series analysis

Automation – Machine learning, Sensors, Nano technology, RFID and NFC

Semantics – natural language processing, text analytics, sentiment analysis

Predictive analytics

Collaborative Filtering
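
As an illustration of the last technique, collaborative filtering can be sketched in a few lines. This is a minimal user-based variant, assuming a tiny in-memory ratings table; the users, items, ratings and the function names cosine and recommend are all hypothetical and not taken from the cited sources.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}
ratings = {
    "ann": {"book_a": 5, "book_b": 4, "book_c": 1},
    "bob": {"book_a": 4, "book_b": 5, "book_d": 4},
    "cat": {"book_b": 1, "book_c": 5, "book_d": 2},
}


def cosine(u, v):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)


def recommend(user, ratings):
    """Rank items the user has not rated by similarity-weighted average rating."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue  # ignore users with no overlap
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)


print(recommend("ann", ratings))
```

Here recommend("ann", ratings) returns the one item ann has not rated, book_d, with a predicted rating between bob's 4 and cat's 2, and closer to bob's because his ratings overlap more with hers. Production recommenders usually add steps such as mean-centering and sparse-matrix handling, which this sketch leaves out.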

Business problems to Data Mining tasks

BI reporting

Visualization

Functional Applications

Industry Applications

Predictive Analytics

Content Analytics

Analytics solutions

Source: IBM Big Data Application layer

Source: InformationWeek, 16 Top Big Data Analytics Platforms, 1/30/2014

Top 16 Big Data Analytics Platforms

Platform connections

Business platforms, Gang of Four – Amazon, Apple, Google, Facebook

More businesses setting platform trends – industry-wide transformation: Netflix, LinkedIn

Third Platform – popularized by IDC for social, mobile, cloud, Big Data/analytics and emerging markets

Mainframe, terminals – Level 1 platforms (’70s)

Tiered architectures (client-server, 2 tier) (’80s–’90s)

Multi-tiered architectures (2000+)

Social, mobile, cloud, Big Data/analytics convergence (2010+)

Value shifts for the enterprise

Big Data Optimizations – Concept

Distributed optimization

Parallel optimization

Large scale optimizations

Optimizations and Challenges – know-how for handling bottlenecks

Industry level Research – SAS, IBM

Statistics – optimization challenges

Computational challenges
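
One common reading of distributed or parallel optimization is data parallelism: split the data into shards, compute a partial gradient on each shard, then combine the partials into a single update. The sketch below assumes a toy least-squares problem with made-up data; shard_gradient and fit are hypothetical names, and the shards are processed one after another here, whereas on a cluster each would run on its own worker.

```python
def shard_gradient(shard, w, b):
    """Summed squared-error gradient for y ~ w*x + b over one shard."""
    gw = gb = 0.0
    for x, y in shard:
        err = (w * x + b) - y
        gw += 2 * err * x
        gb += 2 * err
    return gw, gb


def fit(shards, steps=3000, lr=0.01):
    """Gradient descent where each step combines per-shard partial gradients."""
    n = sum(len(s) for s in shards)
    w = b = 0.0
    for _ in range(steps):
        # "Map": every shard computes its partial gradient independently.
        partials = [shard_gradient(s, w, b) for s in shards]
        # "Reduce": sum the partials and average over all data points.
        gw = sum(p[0] for p in partials) / n
        gb = sum(p[1] for p in partials) / n
        w -= lr * gw
        b -= lr * gb
    return w, b


# Made-up data on the line y = 3x + 1, split into three uneven shards.
data = [(x, 3.0 * x + 1.0) for x in range(10)]
shards = [data[0:4], data[4:7], data[7:10]]
w, b = fit(shards)
print(round(w, 3), round(b, 3))
```

Because the partial gradients simply add up, the result matches what a single machine would compute on the full data set. The bottleneck in practice is the combine step, which requires every worker to communicate on each iteration.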

Myths and Overlaps

Not just another hype of data related decisions and insights – requires a new mindset

People – roles

Data Scientists

Statisticians

Business Analysis

• Find story in a data set

• Experimental, exploratory

• Data mining
• Statistical analysis
• Predictive model development

• Multi dimensional analysis

• Visual, data discovery

Reference: Davenport, T. H., & Patil, D. J. (2012). Harvard Business Review, October 2012, pp. 70-76.

Big Data and Analytics technologies – supplementing RDBMSs

Scalable MPP data warehouse

Hadoop

NewSQL

Graph database

NoSQL

Reference: Dan Vesset, Discovering the Value of a Data Discovery Platform, white paper sponsored by Teradata, September 2013
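
Hadoop, listed above, is built around the MapReduce programming model. The sketch below imitates its three phases (map, shuffle, reduce) inside a single Python process to show the idea. It does not use Hadoop's API, and the function names and sample documents are made up for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse the values for one key into a single count."""
    return key, sum(values)


# Made-up input; on a cluster each document could live on a different node.
documents = ["big data is big", "data variety is new"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)
```

The map and reduce functions never share state, which is what lets a framework run them on many machines at once and rerun a failed piece without redoing the rest.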

Impact on management

New skills and new management style

Reference: McAfee, A., & Brynjolfsson, E. (2012). Big Data: The Management Revolution (cover story). Harvard Business Review, 90(10), 60-68.

Data driven companies, evidence based decisions

look for opportunities based on Big data in every business function

Leadership, talent, technology, Organizational culture

Experimental and exploratory

Introductory EMC Videos – Animated

http://www.youtube.com/watch?v=eEpxN0htRKI#t=67

Big Ideas – Simplifying cluster architectures http://www.youtube.com/watch?v=4M3cROio9vU

Big Ideas - How big is Big data?

Big ideas – Why Big Data matters

http://www.youtube.com/watch?v=rTAn1bvy8vU

Big Ideas – Demystifying Hadoop http://www.youtube.com/watch?v=xJHv5t8jcM8

EMC – Big Ideas videos http://www.youtube.com/playlist?list=PLD298CBF8D0908E4C&feature=view_all

Conclusions

Smart

Internet of things

Predictions Big data evolution

RFID, sensors, NFC

Standards ODaF
