machine learning in big data

Machine Learning in Big Data - Look forward or be left behind V. William Porto Hadoop Summit Dublin 2016

Upload: hadoop-summit

Post on 08-Jan-2017

TRANSCRIPT

Page 1: Machine Learning in Big Data

Machine Learning in Big Data- Look forward or be left behind

V. William Porto Hadoop Summit Dublin 2016

Page 2: Machine Learning in Big Data

Overview of RedPoint Global

2 RedPoint Global Inc. 2016 Confidential

Launched in 2006

Founded and staffed by industry veterans

Headquarters: Wellesley, Massachusetts

Offices in US, UK, Australia, Philippines

Global customer base

Serves most major industries

Page 3: Machine Learning in Big Data

Overview of RedPoint Global

MAGIC QUADRANT: Data Quality

MAGIC QUADRANT: Integrated Marketing Management

MAGIC QUADRANT: Multichannel Campaign Management

MAGIC QUADRANT: Digital Marketing Hubs

FORRESTER WAVE™: Cross-channel Campaign Management

FORRESTER WAVE™: Data Quality Solutions

Page 4: Machine Learning in Big Data

With apologies to Gary Larson

Hadoop

Page 5: Machine Learning in Big Data

Machine Learning – why bother?

“If you have always done it that way, it is probably wrong” - Charles Kettering

Page 6: Machine Learning in Big Data

Machine Learning – keeping ahead of the curve

• Three basic tenets for success in today’s world

• Prediction - you need to learn and use what you’ve learned

• Optimization - the world is a dynamic place

• Automation - because people don’t scale well

Page 7: Machine Learning in Big Data

Machine Learning – what really is it all about?

• Learning vs. instruction

• Humans learn instinctively – computers not so much

• Intelligent Systems

• Memory

• Prediction (modeling)

• Assessment

• Feedback

• Adaptation

Page 8: Machine Learning in Big Data

Data Modeling – what, why, how

• Regression – what happened in the past

• Prediction – what will happen in the future

“Prediction is very difficult – especially if it’s about the future”

- Niels Bohr

Page 9: Machine Learning in Big Data

Data Modeling – what, why, how

The wide world of data modeling

• Supervised models

• you have historical data and known correlated outputs (truth)

• Unsupervised models

• you have historical data, but may not have (or trust) associated outputs

Page 10: Machine Learning in Big Data

Decision Trees

Major Assumption: the world is discrete

• fast, easy to understand, no linearity assumptions

• ‘human time’ required, unbalanced and/or large trees

Page 11: Machine Learning in Big Data

Standard Linear Models

Assumption: the world is linear

• the real world really isn’t linear

• not all errors are equal

• easy to get misleading results

[Figure: Which line is best?]
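A minimal sketch of how a least-squares line can mislead, assuming made-up data (the points below are hypothetical and not taken from the talk). Because the errors are squared, a single extreme point can dominate the fit and pull the "best" line away from the other five points.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: the first series lies exactly on y = x,
# the second is identical except for one extreme final point.
xs = [0, 1, 2, 3, 4, 5]
clean = [0, 1, 2, 3, 4, 5]
with_outlier = [0, 1, 2, 3, 4, 25]

slope_clean, intercept_clean = fit_line(xs, clean)
slope_out, intercept_out = fit_line(xs, with_outlier)

print(f"clean data:   slope={slope_clean:.2f}, intercept={intercept_clean:.2f}")
print(f"with outlier: slope={slope_out:.2f}, intercept={intercept_out:.2f}")
```

Both lines are "best" in the least-squares sense for their data, yet the second describes five of its six points poorly.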

Page 12: Machine Learning in Big Data

Generalized ‘Non-Linear’ Models

Assumptions

• underlying functional mapping is known

• all errors are equal

• data is ‘well-conditioned’

• ‘standard’ error distribution

• Polynomials

• Exponentials (e.g., Gaussian, Poisson)

• Piece-wise linear

Page 13: Machine Learning in Big Data

Non-Linear Models

Assumption: data is representative

• ‘universal’ modeling tools

• fast execution

• no linearity assumptions

• lots of parameters, many techniques

• difficult to explain

Artificial Neural Network

Page 14: Machine Learning in Big Data

User Story: Predict Retention / Attrition

Historical Behavioral Data

Customer Rating | Retention | Customer Name | Loyalty Member | Days Since Last Purchase | Immediate Relatives | Household Children | Customer ID | Latest Purchase Price | Latest Purchase Item ID | Region Code | Customer Capture Method | Customer Contact Code | Domicile

1 1 Allen, Geraldine yes 29 0 2 24160 211.39 B5 MW 2 6 St Louis, MO
1 1 Anderson, Harry no 48 0 3 19952 26.55 E12 NE 3 New York, NY
1 1 Andrews, Cynthia yes 63 1 0 13502 77.95 D7 NE 10 6 Hudson, NY
1 0 Andrews, Thomas Jr no 39 0 0 112050 0 A36 SW Los Angeles, CA
1 1 Appleton, Mary yes 53 2 3 11769 51.49 C101 NE D Bayside, Queens, NY
1 0 Ashbury, Jeffrey no 47 1 0 PC 17757 29.99 C62 C64 NE 124 New York, NY
1 1 Aston, Mrs. yes 18 1 0 PC 17757 29.99 C62 C64 NE 4 New York, NY
1 1 Barber, Ellen yes 26 0 2 19877 78.85 S 6
1 1 Barkley, Henry no 80 0 0 27042 30 A23 NE B Yorktown, PA
1 0 Baumann, David no 0 0 PC 17318 25.99 NE New York, NY
1 1 Bazzeno, Alice yes 32 0 1 11813 76.95 D15 C 8 34
1 0 Beattie, Mr. Samuel no 36 0 0 13050 75.29 C6 C A 11 Winnipeg, MN
1 1 Beckworth, June yes 47 1 1 11751 52.49 D35 NE 5 New York, NY
1 1 Behr, John no 26 0 0 111369 30 C148 NE 5 New York, NY
1 1 Biden, Roseanne yes 42 0 0 PC 17757 127.99 C 4
1 1 Bird, Ellen yes 29 0 0 PC 17483 18.95 C97 S 8
1 0 Birnbaum, Jason no 25 0 0 13905 26 C 148 San Francisco, CA

Page 15: Machine Learning in Big Data

User Story: Predict Customer Retention / Attrition

Machine Learning Processing Chain - Training

Page 16: Machine Learning in Big Data

User Story: Predict Retention / Attrition

Machine Learning Processing Chain - Prediction

Reward predicted ‘retainees’ with targeted product offerings

Give potential attrition customers special incentives to stay with the business

Page 17: Machine Learning in Big Data

User Story: Accurate vs. Useful Prediction

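Sparse data + Least-Squares (Linear) Classifier

• Task: predict chance of purchasing a sundry item

• Result: ‘best’ model always predicts “none”

• Analysis: LS algorithm assumes all errors are equal

Bread | Cake & Pie | Chocolate | Coffee | Cookie | Diesel | Juice & Smoothies | Lubricants | Milk | Other Bakery | Premium | Sandwich | Snack | Tea | Total Transaction | Total Revenue
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 3000
0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 2000
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6 | 1800
0 | 0 | 0 | 0 | 0 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6 | 4800
0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 100
0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1828
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 13 | 16460
0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1000
0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1500
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 7 | 4600
0 | 0 | 0 | 0 | 0 | 11 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 11 | 19381.5
0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1860
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 3000
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 18 | 9838.82
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 22 | 11000
0 | 0 | 0 | 0 | 0 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 19 | 18225
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 500
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 800
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 7 | 7990
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 3820
0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 55 | 43230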
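The gap between an accurate model and a useful one can be sketched with a few lines of Python. The labels and the two cost values below are hypothetical, chosen only to illustrate the point: on sparse data a model that always predicts "none" scores high accuracy while finding no buyers, and once a missed buyer is treated as more costly than a false alarm, that model is no longer the best choice.

```python
def accuracy(truth, predicted):
    """Fraction of samples where the prediction matches the truth."""
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)


def recall(truth, predicted):
    """Fraction of actual buyers (truth == 1) that were predicted as buyers."""
    hits = [p for t, p in zip(truth, predicted) if t == 1]
    return sum(hits) / len(hits)


def weighted_cost(truth, predicted, miss_cost=20.0, false_alarm_cost=1.0):
    """Total cost when the two error types are NOT treated as equal.

    miss_cost and false_alarm_cost are illustrative assumptions.
    """
    cost = 0.0
    for t, p in zip(truth, predicted):
        if t == 1 and p == 0:
            cost += miss_cost
        elif t == 0 and p == 1:
            cost += false_alarm_cost
    return cost


# Hypothetical sparse data: 5 buyers out of 100 customers.
truth = [1] * 5 + [0] * 95
always_none = [0] * 100
always_buy = [1] * 100

acc_none = accuracy(truth, always_none)
rec_none = recall(truth, always_none)
cost_none = weighted_cost(truth, always_none)
cost_buy = weighted_cost(truth, always_buy)

print(f"always 'none': accuracy={acc_none:.2f}, recall={rec_none:.2f}, cost={cost_none}")
print(f"always 'buy':  accuracy={accuracy(truth, always_buy):.2f}, cost={cost_buy}")
```

Under equal error weights the "none" model looks best; under the assumed unequal weights it is the more expensive of the two.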

Page 18: Machine Learning in Big Data

Clustering/Segmentation – group think

Collaborative Filtering

Relationship Matrix

Page 19: Machine Learning in Big Data

Personalization – not really

!=

Page 20: Machine Learning in Big Data

Clustering/Segmentation

Similarity?

Customer | Browser | Gender | Age Sector | Income Sector | Married | Children | Homeowner | Recent Baby Clothes Purchase
George | IE9 | M | 0 | A | N | 0 | 1 | N
Carol | Chrome | F | 1 | B | Y | 1 | 0 | Y
Mary | IE9 | F | 0 | A | N | 1 | 0 | Y

Dist(George,Carol) = 8

Dist(George,Mary) = 4

Dist(Carol,Mary) = 4

Can you afford to target (George,Mary) the same way as (Carol,Mary)?
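The three distances on this slide are consistent with a simple mismatch count (Hamming distance) over the eight attributes. That reading is an inference from the numbers, not something the slide states. A minimal sketch under that assumption also shows why equal distances can hide very different customers: the two pairs at distance 4 differ on entirely different attributes.

```python
ATTRIBUTES = (
    "Browser", "Gender", "Age Sector", "Income Sector",
    "Married", "Children", "Homeowner", "Recent Baby Clothes Purchase",
)

# Attribute values as listed on the slide.
customers = {
    "George": ("IE9", "M", 0, "A", "N", 0, 1, "N"),
    "Carol": ("Chrome", "F", 1, "B", "Y", 1, 0, "Y"),
    "Mary": ("IE9", "F", 0, "A", "N", 1, 0, "Y"),
}


def differing(a, b):
    """Names of the attributes on which customers a and b disagree."""
    return {
        name
        for name, x, y in zip(ATTRIBUTES, customers[a], customers[b])
        if x != y
    }


def dist(a, b):
    """Hamming distance: number of attributes that differ."""
    return len(differing(a, b))


print(dist("George", "Carol"), dist("George", "Mary"), dist("Carol", "Mary"))
print(sorted(differing("George", "Mary")))
print(sorted(differing("Carol", "Mary")))
```

George and Mary differ on gender, children, home ownership, and the recent purchase; Carol and Mary differ on browser, age sector, income sector, and marital status.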

Page 21: Machine Learning in Big Data

Clustering/Segmentation

Basic Question – which one describes the data the best?

Raw data

How many clusters are there?

Two Clusters

Four Clusters

Six Clusters

Page 22: Machine Learning in Big Data

Clustering/Segmentation with Statistics

• relatively simple

• data distribution assumptions

• initialization dependencies
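The initialization dependency can be illustrated with a small one-dimensional k-means (Lloyd's algorithm). This is a sketch on hypothetical data, not the clustering shown in the talk: the same nine points, split into two clusters, converge to different answers depending only on where the centroids start.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Plain Lloyd's algorithm in one dimension; returns the final centroids."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


# Hypothetical data: three natural groups, but we ask for two clusters.
points = [1, 2, 3, 11, 12, 13, 21, 22, 23]

from_low_start = kmeans_1d(points, [1, 2])
from_high_start = kmeans_1d(points, [22, 23])

print(from_low_start)   # the middle group is merged with the high group
print(from_high_start)  # the middle group is merged with the low group
```

Both results are stable, and neither run gives any signal that a different starting point would have grouped the middle points differently.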

[Figure: three scatter plots on 0 to 100 axes, titled Raw Data, Ellipsoidal Clustering, and K-Means Clustering]

Page 23: Machine Learning in Big Data

Clustering/Segmentation – data driven

• let the data speak for itself

• multiple data projection ‘views’

• important boundary relationships (“swing voters”)

Customer Demographics

Page 24: Machine Learning in Big Data

User Story: Clustering / Segmentation

ML Clustering – Training

ML Clustering – Processing New Data

Page 25: Machine Learning in Big Data

Model Selection – how to choose?

• Basic Model Type (prediction or segmentation)

• inputs + correlated outputs

• inputs only?

• Basic Questions:

• what to use for my problem?

• parameters?

• is this the best choice?

• could I do better, and how?

Page 26: Machine Learning in Big Data

Optimization – Evolving better solutions

• Simulated Evolution

• fast, efficient search

• always have a solution

• arbitrary ‘evaluation’ functions

• can start with existing solution(s)

• Variation – alter model type, parameters

• Assessment – how well does the model work?

• Selection – survival of the fittest
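The variation / assessment / selection cycle can be sketched as a small evolutionary loop. This is an illustrative toy, not the speaker's implementation: it evolves a single numeric parameter rather than a model type, and the `evolve` function, the `score` function, the starting population, and the mutation width are all assumptions made for the example. The evaluation function is deliberately discontinuous and non-differentiable, which a gradient-based method could not use directly.

```python
import random


def evolve(evaluate, population, generations=50, sigma=0.5, seed=0):
    """Minimise `evaluate` by simulated evolution; lower scores are better."""
    rng = random.Random(seed)
    history = []
    for _ in range(generations):
        # Variation: every parent produces one randomly perturbed offspring.
        offspring = [x + rng.gauss(0.0, sigma) for x in population]
        # Assessment: score parents and offspring with the same function.
        pool = sorted(population + offspring, key=evaluate)
        # Selection: only the best half survives to the next generation.
        population = pool[: len(population)]
        history.append(evaluate(population[0]))
    return population[0], history


def score(x):
    """Arbitrary evaluation: distance from 3, plus a penalty step below zero."""
    return abs(x - 3.0) + (1.0 if x < 0 else 0.0)


# The search can start from existing candidate solutions.
initial = [-5.0, -2.0, 8.0, 10.0]
best, history = evolve(score, initial)

print(f"best candidate: {best:.3f}, score: {score(best):.3f}")
```

Because parents compete with their offspring, the best score never gets worse from one generation to the next, so a usable solution is available at any point in the run.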

Page 27: Machine Learning in Big Data

Evolutionary Optimization – Evaluation Function

• can use any measurable data

• no continuity assumptions

• no differentiability assumptions

• no symmetry assumptions

[Figure: 2×2 evaluation matrix of Prediction vs. Reality (Truth) over Sunshine and Hurricane, with cell values 20, -1000, 5, 50]
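An asymmetric evaluation function of this kind can be sketched as a payoff table. The slide's matrix is garbled in this transcript, so the sketch assumes the four cells are 20, -1000, 5, and 50, and assumes which prediction/reality pair each belongs to; the 100-day weather record is hypothetical as well. Under those assumptions, the forecaster with the higher accuracy has the lower payoff.

```python
# ASSUMPTION: cell values and their placement are a reading of the slide,
# keyed here as (prediction, reality).
PAYOFF = {
    ("sunshine", "sunshine"): 20,
    ("sunshine", "hurricane"): -1000,  # missed hurricane: the costly error
    ("hurricane", "sunshine"): 5,      # false alarm: a mild error
    ("hurricane", "hurricane"): 50,
}


def total_payoff(predictions, reality):
    """Sum of payoffs over all days; no symmetry between error types."""
    return sum(PAYOFF[(p, r)] for p, r in zip(predictions, reality))


def accuracy(predictions, reality):
    """Fraction of days predicted correctly, all errors counted equally."""
    return sum(p == r for p, r in zip(predictions, reality)) / len(reality)


# Hypothetical record: 98 sunny days and 2 hurricanes.
reality = ["sunshine"] * 98 + ["hurricane"] * 2
always_sunshine = ["sunshine"] * 100
always_hurricane = ["hurricane"] * 100

for name, forecast in [("always sunshine", always_sunshine),
                       ("always hurricane", always_hurricane)]:
    print(name, accuracy(forecast, reality), total_payoff(forecast, reality))
```

Any measurable quantity could stand in for the payoff table, since the evaluation only has to return a number for each candidate.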

Page 28: Machine Learning in Big Data

User Story: Optimizing Classification Models

Task: Predict Retention/Attrition

[Chart: Model Performance Optimization. Performance (0 to 100) vs. Generation (0 to 6); series: Classification Accuracy and Test Set Error (RMS)]

Data labels, falling series: 34.8, 28.8, 24.5, 22.1, 20.9

Data labels, rising series: 62, 70.2, 72.3, 73.4, 75.2

17 Potential input features (customer demographics)

2 outputs (retention/attrition)

1300 Training Samples (50 – 50, A / B Split)

1300 Test Samples (naïve test data)

Page 29: Machine Learning in Big Data

Use Case – Fully Adaptive Feedback (Next Best Offer)

DB

Historical User Behavior

(stimulus/response)

Train / Update Model

Non-Adaptive (Fixed) Mode

Randomized A/B/C Offer Selection

Adaptive ML Mode

ML Prediction Offer Selection

Operation (Trigger)

Ad / Offer (stimulus)

Feedback Cycle
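One simple way to combine the two modes in this diagram is an epsilon-greedy selector: with a small probability it picks an offer at random (the non-adaptive mode), otherwise it picks the offer with the best observed response rate (the adaptive mode), and every response feeds back into the statistics. Epsilon-greedy is a stand-in chosen for illustration, not the method described in the talk, and the `OfferSelector` class and its parameters are hypothetical.

```python
import random


class OfferSelector:
    """Epsilon-greedy choice among offers, updated from user responses."""

    def __init__(self, offers, explore_rate=0.1, seed=0):
        self.shown = {offer: 0 for offer in offers}
        self.accepted = {offer: 0 for offer in offers}
        self.explore_rate = explore_rate
        self.rng = random.Random(seed)

    def rate(self, offer):
        """Observed acceptance rate; 0.0 for offers never shown."""
        if self.shown[offer] == 0:
            return 0.0
        return self.accepted[offer] / self.shown[offer]

    def choose(self):
        """Pick the next offer (stimulus) to present."""
        if self.rng.random() < self.explore_rate:
            # Non-adaptive mode: randomized A/B/C selection.
            return self.rng.choice(list(self.shown))
        # Adaptive mode: offer with the best observed response so far.
        return max(self.shown, key=self.rate)

    def feedback(self, offer, accepted):
        """Record the user's response, closing the feedback cycle."""
        self.shown[offer] += 1
        self.accepted[offer] += int(accepted)


selector = OfferSelector(["A", "B", "C"], explore_rate=0.0)
selector.feedback("A", False)
selector.feedback("B", True)
selector.feedback("C", False)
print(selector.choose())  # with exploration off, the best-performing offer
```

Keeping some random selection in the loop means offers that performed poorly early on still get occasional exposure, so the statistics can recover if user behavior changes.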

Page 30: Machine Learning in Big Data

Five Keys to Successful Machine Learning

• Let the data speak for itself – don’t force fit your models

• Remember, not all errors are equal – use this to your advantage

• True learning requires continual adaptation!

• Automate the process with feedback – remove the “man-in-the-loop”

• Trust the optimization process – it really works!

Page 31: Machine Learning in Big Data

Q&A

Contact Info

Visit: www.redpoint.net

Bill Porto

Sr. Engineering Analyst

RedPoint Global

[email protected]

Want More Information about this topic?

Fill out your card or go to redpoint.net/hadoopeurope