big data, analytics and data science

Data, Big Data and Data Science. David Lambert

Upload: dlamb3244

Post on 20-Jul-2015


TRANSCRIPT

Page 1: Big Data, Analytics and Data Science

Data

Big Data and Data Science

David Lambert

Page 2: Big Data, Analytics and Data Science

Agenda

• Introduction to Big Data

• The V’s

• What Businesses Need to Know

• Market Research

• Analytics

• Modeling

• Data Science

Page 3: Big Data, Analytics and Data Science

Introduction

Census 2013: Internet and Smartphone use

Page 4: Big Data, Analytics and Data Science

Introduction

140,000-190,000 job shortfall by 2018

4.4 million Big Data related jobs by 2015 in US

Gartner 2012

Page 5: Big Data, Analytics and Data Science

Introduction

Page 6: Big Data, Analytics and Data Science

Introduction

Forbes 2014

Page 7: Big Data, Analytics and Data Science

What Is Data

Two Basic Types of Computer Data

• Structured

• Variable or metric information that is easily accessed and shared between computers and databases (time, date, location, user ID, file pathway, sensor).

• Unstructured

• Information that is difficult to quantify, such as text, emails, pictures, videos and other socially generated content, and that is beyond typical processing power.
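The difference between the two types can be sketched in a few lines of Python. This is only an illustration: the `SensorReading` record, its field names, and the sample values are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class SensorReading:
    """A structured record: every field has a fixed name and type."""
    user_id: int
    timestamp: datetime
    location: str
    value: float


# Structured data can be filtered or aggregated directly by field.
readings = [
    SensorReading(1, datetime(2015, 7, 20, 9, 0), "Portland", 21.5),
    SensorReading(2, datetime(2015, 7, 20, 9, 5), "Seattle", 19.0),
]
portland_values = [r.value for r in readings if r.location == "Portland"]

# Unstructured data is free text: it must be interpreted before it can be measured.
email_body = "Loved the new app, but the checkout page keeps freezing on my phone."

# A crude word count is one simple way to start turning text into numbers.
word_counts = {}
for word in email_body.lower().replace(",", "").replace(".", "").split():
    word_counts[word] = word_counts.get(word, 0) + 1
```

The structured records answer a question ("values in Portland") with a single filter, while the email needs a processing step before anything can be counted.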

Page 8: Big Data, Analytics and Data Science

What Is Data

Structured

Unstructured

Page 9: Big Data, Analytics and Data Science

What Is Data

Page 10: Big Data, Analytics and Data Science

Share of Global Internet Searches

1.2 trillion searches per year

40,000 searches per second

How Much Data

http://www.internetlivestats.com
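As a rough consistency check, the per-second figure can be scaled up to a year. The sketch below assumes a constant rate and a 365-day year, which the slide does not state.

```python
# Assumption: a constant 40,000 searches per second over a 365-day year.
SEARCHES_PER_SECOND = 40_000
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

searches_per_year = SEARCHES_PER_SECOND * SECONDS_PER_YEAR
print(f"{searches_per_year / 1e12:.2f} trillion searches per year")
```

The result is on the order of 1.2 to 1.3 trillion, in line with the yearly figure quoted on the slide.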

Page 11: Big Data, Analytics and Data Science

How Much Data

http://www.internetlivestats.com

Page 12: Big Data, Analytics and Data Science

How Much Data

64 KB for the 1969 moon launch

Page 13: Big Data, Analytics and Data Science

Why Now

Digital Storage Has Become Very Cheap

Price for CPU Performance is Cheap

Page 14: Big Data, Analytics and Data Science

• Information Connectivity

• Databases • Machines • Employees • Customers • Products

1

Basic Corporate Interaction with Clients

Experiences are ‘Pushed’ on consumers by companies. Companies advertise and market what they want the consumer to think, with little or no feedback.

2

Back and forth communication between ISOLATED users and producers.

Push-pull marketing efforts evolve to gauge consumer experience. Exclusive company/customer dialogue, costly surveys, and limited focus groups were typical of this type of corporate relationship.

3

Consumer community connections and corporate relationships. User-to-user connections grew exponentially with the rise of the internet and social media platforms like Facebook. Consumers shifted away from getting information from companies and toward gaining insight through advanced social webs. The general thought is that consumer-to-consumer interactions are less biased and convey more knowledge about an experience than an organization's messaging does. There is an increased feeling of trust in dealing with a non-partisan opinion.

4

Cloud and mobile system integration and infrastructure. Through the development of advanced computer infrastructures, information growth was so rapid that it is referred to as an explosion of Big Data. The transition from standard computer use to smartphone use caused more behavioral data to be captured by cloud services. Smartphones act as the most common gateway, or remote, to this advanced network of computers.

5

Internet of Things and user connectivity. Connecting a product to a cloud or internet service captures so much value relative to its cost that companies everywhere are integrating systems to utilize this capability. Interactions between consumers and their products, and product-to-product interactions, will become increasingly frequent and will bring a wave of new services to better integrate these systems into the consumer’s life.

Progression: Push, Pull, Web, Cloud, Internet of Things

Page 15: Big Data, Analytics and Data Science

Big Data

People are creating more Data

People, Products and Companies are creating ENORMOUS amounts of Data

Companies are creating more Data

Products are creating more Data

Page 16: Big Data, Analytics and Data Science

Big Data

The 3 Vs Chart

Innovation must be done more quickly and effectively due to this increase in competition, availability of services and dynamically changing technology.

Old business issues are still prevalent in Big Data, except that the speed, scope and tempo of business offerings have dramatically increased. Businesses MUST take more control over their service offerings.

The importance of maintaining a consistent and authentic experience throughout all operations is much more significant with the influence of Big Data.

Organizations must choose or transition to attractive channels for their objectives, customers, and culture.

Greatest Challenge: Simply put, Big Data is an increase in the relationships between Hardware & Software, Products & Services, Customers & Employees within an organization or business.

Page 17: Big Data, Analytics and Data Science

The V’s

Page 18: Big Data, Analytics and Data Science

The Infinite V’s

1. Volume

2. Velocity

3. Variety

4. Variability

5. Veracity

6. Visualisation

7. Value

Data Capture

Data Cleaning

Data Preparation and Reporting

Data Marketing

Process
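The first three process steps (capture, cleaning, preparation and reporting) can be sketched as a small pipeline. The rows, field names, and cleaning rules below are hypothetical and exist only to show how the steps chain together; the Data Marketing step is left out because the slides do not describe it.

```python
def capture():
    """Data Capture: return raw rows as they might arrive (hypothetical sample)."""
    return [
        {"user": "a01", "minutes": "12"},
        {"user": "a02", "minutes": ""},    # missing value
        {"user": "a01", "minutes": "30"},
        {"user": None, "minutes": "7"},    # missing user
    ]


def clean(rows):
    """Data Cleaning: drop incomplete rows and convert text to numbers."""
    cleaned = []
    for row in rows:
        if not row["user"] or not row["minutes"]:
            continue
        cleaned.append({"user": row["user"], "minutes": int(row["minutes"])})
    return cleaned


def prepare_report(rows):
    """Data Preparation and Reporting: total usage minutes per user."""
    totals = {}
    for row in rows:
        totals[row["user"]] = totals.get(row["user"], 0) + row["minutes"]
    return totals


report = prepare_report(clean(capture()))
```

Keeping each step as its own function makes it easy to inspect the data between stages, which is where most cleaning problems show up.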

Page 19: Big Data, Analytics and Data Science

Data Capture Data Cleaning Data Preparation Data Marketing

Process

What Businesses Should Know

Page 20: Big Data, Analytics and Data Science

What Businesses Should Know

https://infocus.emc.com/william_schmarzo/big-data-business-model-maturity-chart/

Bill Schmarzo

Big Data Author and Blogger at EMC

• Consultant at EMC Global Services

• Former Vice President of Advertising Analytics at Yahoo

*** Text is excerpted or paraphrased from several articles by Bill Schmarzo at his EMC blog.

Page 21: Big Data, Analytics and Data Science

What Businesses Should Know

Initial Big Data Focus: Optimize Internal Business Process

There are three big data capabilities that organizations can leverage to expand their business intelligence and data warehouse investments to optimize versus just monitor.

1. Integrate (structured) Meta-Data with detailed (unstructured) Behavioral Data to provide new metrics and new dimensions against which to monitor and optimize key business processes.

2. Deploy predictive analytics to uncover insights buried in the massive volumes of detailed structured and unstructured data. Having business users slice-and-dice the data to uncover insights does not work very well when dealing with terabytes and petabytes of data.

3. Leverage real-time (or low-latency) data feeds to accelerate organizational processes to identify and act upon business and market opportunities in a timely manner.

https://infocus.emc.com/william_schmarzo/the-4-ms-of-big-data/

Page 22: Big Data, Analytics and Data Science

What Businesses Should Know

Ultimate Big Data Opportunity: Monetize External Customer Insights

As organizations advance along the maturity index, three organizational transformations take place to create new monetization based upon the customer, product and market insights gleaned from the first three phases of the maturity index.

1. Organizations start to treat data as an asset, not a cost of business.

2. Organizations put formal processes in place to capture, inventory, refine, and protect their analytics (models, processes, etc.) as intellectual property (IP).

3. Organizational confidence in making decisions using data and analytics will grow. Organizational investments in data, analytics, people, processes, and technology will be used to justify decision making.

Page 23: Big Data, Analytics and Data Science

What Businesses Should Know

Page 24: Big Data, Analytics and Data Science

A SHIFT FROM SURVEYS TO INTERACTION

• Surveys make it difficult to…
  • Scale
  • Incentivize customers
  • Assess validity
  • Ask appropriate questions for meaningful insight

• Interaction monitoring makes it easier to…
  • Scale
  • Incentivize customers
  • Assess validity
  • Monitor appropriate behavior for meaningful insight

Interaction Collection = Data Mining

Market Research

Page 25: Big Data, Analytics and Data Science

Market Research

The platform or “skeleton” service initially starts out with very little information about that particular user and thus the experience for that user is typically unexciting or ambiguous.  

User

Fundamental service strategy: solutions selling, system integration

Interaction

Behavior

Knowledge

Consumer Interaction Platform

Page 26: Big Data, Analytics and Data Science

Market Research

As the user begins to interact with the platform, the user provides user-defined inputs for basic profile characteristics. User-defined variables should be limited to variables that are not easily gathered from general usage information. Age, gender, income or other basic characteristics may be identified.

Page 27: Big Data, Analytics and Data Science

Market Research

General activity and interaction with the service platform gives enough behavioral data to extrapolate a rough sketch of a consumer’s behavioral profile. General usage data can include activity hours per month, activity duration per use, and typical activities performed during use.

Page 28: Big Data, Analytics and Data Science

Market Research

Adoption of the service platform into the client's lifestyle gives a caricature-like view of a consumer, with exaggerated likes and dislikes. The consumer engages with the platform regularly and has customized it to their preferences and application.

Page 29: Big Data, Analytics and Data Science

Market Research

Immersion into the platform by the user creates a sophisticated profile that can be leveraged to gain unprecedented insight and predictability into consumer behavior. The dedicated use of a service platform by a single user details an almost life-like portrait of the user. If the user is engaged in multiple channels, the data becomes even more relevant as it is analyzed across different market segments and applications.

Page 30: Big Data, Analytics and Data Science

Market Research

Consumer Interaction Platform

Low Customization

Profile Activation

Low Account Activity: A picture, a couple of friends

Active Member: Moderate Photos, Postings, Friendship

Prolonged Use: Lots of Engagement

Interaction

Behavior

Knowledge

Consolidated User Profile with Behavior

Generate Value by analyzing behavior

Compile Similar Users into Segment Data Packages

Mobile Advertising and Proximity Analytics

Service Innovation and IT-Enablement through Behavioral Analytics. ExactTarget

Page 31: Big Data, Analytics and Data Science

Market Research

Consumer Interaction Platform

Interaction

Behavior

Knowledge

Consolidated User Profile with Behavior

Leverage Data (Generate Value)

Compile Similar Users into Segment Data Packages

Many companies reflect only a rough sketch of their customers

Prolonged Use: Lots of Engagement

Page 32: Big Data, Analytics and Data Science

Market Research

Consumer Interaction Platform

Interaction

Behavior

Consolidated User Profile with Behavior

Leverage Data

Compile Similar Users into Segment Data Packages

Many companies reflect only a rough sketch of their customers

Prolonged Use: Lots of Engagement

Marketing based on Behavior

It is important to note that this also applies to process.

Page 33: Big Data, Analytics and Data Science

Market Research

Page 34: Big Data, Analytics and Data Science

Analytics

Page 35: Big Data, Analytics and Data Science

Analytics

Defining a system of metrics or variables to be monitored, compared and contrasted within that system to determine the PREDICTIVE power of specific variables for specific outcomes.

Key Performance Indicator (KPI)

• ROI

• Net Profit/Profit Margin

• Customer Acquisition Cost
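The three KPIs listed above each reduce to a simple ratio. The sketch below uses common textbook definitions, which the slides do not spell out, and the function names and sample figures are hypothetical.

```python
def roi(gain, cost):
    """Return on investment: net gain relative to what was spent."""
    return (gain - cost) / cost


def profit_margin(net_profit, revenue):
    """Net profit as a share of revenue."""
    return net_profit / revenue


def customer_acquisition_cost(sales_and_marketing_spend, new_customers):
    """Average spend needed to win one new customer."""
    return sales_and_marketing_spend / new_customers


# Hypothetical figures for one campaign period.
campaign_roi = roi(gain=15_000, cost=10_000)
margin = profit_margin(net_profit=20_000, revenue=100_000)
cac = customer_acquisition_cost(sales_and_marketing_spend=5_000, new_customers=250)
```

Tracking these ratios over time, rather than as single numbers, is what lets them be compared against other variables.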

Page 36: Big Data, Analytics and Data Science

Analytics

Advertising/Sales/Marketing

• Customer Acquisition

• ROI

• Conversion Rate

• Social Media Engagement

Brand Culture

• Distinguishing Attributes or Characteristics

• Behavioral Rituals and Norms

• Attitudes

• Lexicon

• Positioning

Interaction/Usage Sentiment

• User Experience

• Where is the interaction

• Usage Time

Page 37: Big Data, Analytics and Data Science

Analytics

Product/Corporate Relationship

Advertising/Sales/Marketing

• Customer Acquisition

• ROM

• Conversion Rate

• Social Media Followers/Mentions

Brand Culture

• Distinguishing Attributes or Characteristics

• Behavioral Rituals and Norms

• Attitudes

• Lexicon

• Positioning

Interaction/Usage Sentiment

• User Experience

• Where is the interaction

• Usage Time

• ROI

• Net Profit/Profit Margin

• Customer Acquisition Cost

• Lifetime Value of Customer

Page 38: Big Data, Analytics and Data Science

Analytics

Page 39: Big Data, Analytics and Data Science

Analytics

Page 40: Big Data, Analytics and Data Science

Analytics

Page 41: Big Data, Analytics and Data Science

Analytics

Page 42: Big Data, Analytics and Data Science

Analytics

Page 43: Big Data, Analytics and Data Science

Analytics

Page 44: Big Data, Analytics and Data Science

Analytics

Page 45: Big Data, Analytics and Data Science

Analytics

Page 46: Big Data, Analytics and Data Science

Analytics

How Google uses Analytics to drive Technology

Page 47: Big Data, Analytics and Data Science

Analytics

How Google uses Analytics to drive Technology

• YouTube was purchased by Google in 2006

• All music videos except #8

• Music has a lot of replay value and is sharable

• Psy and Katy Perry appear twice in the top 15

http://en.wikipedia.org/wiki/List_of_most_viewed_YouTube_videos

Page 48: Big Data, Analytics and Data Science

How Google has Used Analytics to drive Technology

Analytics

Page 49: Big Data, Analytics and Data Science

Analytic Modeling

Google AdWords • Sunshine Dairy

Metrics

• Media Buy
  • Geography
  • Click-Through Rate
  • Impressions
  • Conversion Rate
  • Cost Per Click
  • Ad Position
  • Size
  • Media (Vid/Pic/Audio)
  • Lifetime Value of Customer

• Creative
  • Style (Color/Mood/Message)
  • Short/Long Term Branding
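Several of the media-buy metrics above are ratios of raw campaign counts. The sketch below uses the usual definitions of click-through rate, conversion rate, and cost per click; the function name and the campaign numbers are hypothetical and are not Sunshine Dairy data.

```python
def campaign_metrics(impressions, clicks, conversions, cost):
    """Derive common ad metrics from raw campaign counts."""
    return {
        "click_through_rate": clicks / impressions,  # share of views that were clicked
        "conversion_rate": conversions / clicks,     # share of clicks that converted
        "cost_per_click": cost / clicks,             # average spend per click
    }


# Hypothetical campaign: 10,000 impressions, 200 clicks, 10 conversions, $150 spent.
metrics = campaign_metrics(impressions=10_000, clicks=200, conversions=10, cost=150.0)
```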

Page 50: Big Data, Analytics and Data Science

Analytic Modeling

Analytic Modeling uses the identified variables to forecast possible future outcomes. These models are compared to actual results to build better models that more accurately predict consumer influences and outcomes on the market.

A & B Testing

Forecast Method
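One common way to evaluate an A & B test on conversion rates is a two-proportion z-test. The slides do not say which test is used, so the sketch below is an assumption, and the visitor and conversion counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variants A and B.

    Returns the z statistic and a two-sided p-value under the
    normal approximation with a pooled conversion rate.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Hypothetical test: variant A converts 100 of 1,000 visitors, variant B 130 of 1,000.
z_score, p_value = two_proportion_z_test(100, 1000, 130, 1000)
```

A small p-value suggests the difference between the variants is unlikely to be chance alone, though the normal approximation is only reasonable when each variant has enough visitors and conversions.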

Page 51: Big Data, Analytics and Data Science

Analytic Modeling

A & B Testing

Forecast Method

Exponential Smoothing

Simple Regression

Multiple Regression

Moving Averages

Substitution Forecasting

Hybrid Forecasting
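Three of the forecast methods named above (moving averages, exponential smoothing, and simple regression) are small enough to sketch directly. These are minimal textbook versions that each forecast the next single period; the function names and the sample series are hypothetical.

```python
def moving_average_forecast(series, window):
    """Forecast the next value as the mean of the last `window` values."""
    return sum(series[-window:]) / window


def exponential_smoothing_forecast(series, alpha):
    """Simple exponential smoothing; alpha in (0, 1] weights recent values more."""
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def simple_regression_forecast(series):
    """Fit a least-squares line against the time index and extend it one step."""
    n = len(series)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(series) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, series))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return intercept + slope * n


# Hypothetical monthly sales with a steady upward trend.
sales = [10, 12, 14, 16]
ma = moving_average_forecast(sales, window=2)
es = exponential_smoothing_forecast(sales, alpha=0.5)
reg = simple_regression_forecast(sales)
```

On a trending series like this one, the two averaging methods lag behind the trend while the regression line follows it, which is one reason forecasts are compared against actual results before a method is chosen.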

Page 52: Big Data, Analytics and Data Science

Data Science

A Data Scientist is a generic term for someone who possesses the ability to do a combination of jobs, which include:

Data Development Large Data Statistics

Data Analyst

Page 53: Big Data, Analytics and Data Science

Data Science

T-Shaped Skills

Page 54: Big Data, Analytics and Data Science

Data Science

Page 55: Big Data, Analytics and Data Science

What To Learn

Learn to Code

Software Engineering

Algorithms & Data Structures

Visualization

Data Munging

Distributed Computing

Machine Learning

Supervised (SVM, Random Forest)

Unsupervised (K-means, LDA)

Validation, Model Comparison
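As a taste of the unsupervised methods listed above, K-means can be written in a few lines for one-dimensional data. This is a minimal sketch rather than a production implementation; the starting centers and the monthly usage hours are hypothetical.

```python
def k_means_1d(values, centers, iterations=10):
    """Group numbers around the nearest center, then move each center to its group mean."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical monthly usage hours for six users: light users and heavy users.
usage_hours = [1, 2, 3, 20, 22, 24]
centers, clusters = k_means_1d(usage_hours, centers=[0.0, 10.0])
```

No labels are supplied: the algorithm finds the light and heavy usage groups on its own, which is the same idea as compiling similar users into segments.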

Page 56: Big Data, Analytics and Data Science

What To Learn

Learn Math and Stats

Mathematics

Linear Algebra (Matrix Factorization)

Calculus (Integrals, Derivatives)

Statistics

Distribution (Binomial, Poisson)

Summary Statistics (Mean, Variance, Std Dev.)

Multivariate Analysis
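The summary statistics and distributions named on this slide can be tried out with Python's standard library alone. The sample data below is hypothetical, and the two probability functions are the standard binomial and Poisson formulas.

```python
import math
import statistics

# Hypothetical sample, treated as a full population.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)           # arithmetic mean
variance = statistics.pvariance(data)  # population variance
std_dev = statistics.pstdev(data)      # population standard deviation


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """Probability of exactly k events when the average rate is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)
```

For sample data rather than a full population, `statistics.variance` and `statistics.stdev` are the corresponding functions.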

Page 57: Big Data, Analytics and Data Science

Where To Learn