Post on 06-Mar-2018


It’s a Machine World

Predictive Analytics with Machine Learning

Greg Deckler

[email protected]

@GregDeckler


Greg Deckler

Fusion Alliance
Solution Director – Cloud Services
Columbus, OH United States

• Email: [email protected]
• LinkedIn: https://www.linkedin.com/in/gregdeckler
• Twitter: @GregDeckler
• PBI Community: smoupre
• ScoopIt: Business Intelligence Insights
• Founder of the Columbus Azure ML and Power BI User Group
• Author of Achieving Process Profitability, Building the IT Profit Center

Agenda

• What is Machine Learning?

• History of Machine Learning

• Why Machine Learning?

• Examples of Predictive Analytics

• Core Concepts

• Putting Theory into Practice

• Demo

• Common Issues in ML

• Operationalizing ML

• Resources

• Questions?

About Fusion Alliance

What is Machine Learning?

• Machine learning can be described as computing systems that improve with experience. It can also be described as a method of turning data into software. Whatever term is used, the results remain the same: data scientists have successfully developed methods of creating software “models” that are trained from huge volumes of data and then used to predict certain patterns, trends, and outcomes.

• Predictive analytics is the underlying technology behind Machine Learning, and it can be simply defined as a way to scientifically use the past to predict the future to help drive desired outcomes.

History

• Machine learning was born from the quest for artificial intelligence

• Antiquity has stories of artificial beings

• The study of formal or mechanical reasoning began with ancient philosophers

• But, where things really got moving was right around 1956…

History - 1956

• Dartmouth Summer Research Project on Artificial Intelligence

• John McCarthy, Marvin Minsky, Nathaniel Rochester, Claude Shannon

• Arthur Samuel, Allen Newell, Herbert Simon, Dr. Heintz Doofenshmirtz, Alloyse von Roddenstein, Dr. Diminutive, Dr. Killbot, Dr. Goatfish

History – Arthur Lee Samuel

• Coined the term “machine learning” in 1959

• The 1955 version of his checker playing program, the Samuel Checkers-playing Program, is arguably the first example of a self-learning program

History – The Rift

• By 1980, machine learning and neural networks had fallen out of favor within AI, displaced by expert systems

• Machine learning, reorganized as a separate field, started to flourish in the 1990s.

• Changed goal to solvable problems of a practical nature

• Shifted away from symbolic approaches to statistics and probability theory

Machine Learning Recap

• Evolved from pattern recognition and computational learning theory

• Explores the study and construction of algorithms that can learn and make predictions on data

• Closely related to, and overlapping with, computational statistics

• Has strong ties to mathematical optimization

• Sometimes conflated with data mining

• Used within the field of data analytics

• Tom M. Mitchell’s formal definition (1997): "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E."

• In short, turn data into programs to predict something

Why Machine Learning?

•Exponential data growth

•Cheap global digital storage

•Ubiquitous computing power

•The rise of big data analytics

Examples of Predictive Analytics

• Warranty reserve estimation

• Propensity to buy

• Demand forecasting

• Predictive inventory planning

• Recommendation engines

• Dynamic pricing

• Credit worthiness evaluation

• Smart grid management

• Energy supply and demand

• Carbon emissions and trading

• Patient triage optimization

• Spam/junk email filters

• Mortgage applications

• Various forms of pattern recognition

• Life insurance

• Medical insurance

• Liability/property insurance

• Credit card fraud detection

• Airline flight scheduling

• Web search page results

• Predictive maintenance

• Proactive health management

Core Concepts

•Data Preparation

•Types of Learning

•Approaches

•Outputs

•Questions

•Linearity

•Algorithms

•Training, Scoring and Evaluation

Data Preparation

•Relevant

•Connected

•Accurate

•Enough

•Access

Data Preparation - Relevant

Data Preparation - Connected

Data Preparation - Accurate

Data Preparation - Enough

Learning

•Supervised

•Unsupervised

•Reinforcement

Approaches

• Decision tree learning

• Association rule learning

• Artificial neural networks

• Deep learning

• Inductive logic programming

• Support vector machines

• Clustering

• Bayesian networks

• Reinforcement learning

• Representation learning

• Similarity and metric learning

• Sparse dictionary learning

• Genetic algorithms

• Rule-based machine learning

• Learning classifier systems

Approaches: Supervised learning

• AODE
• Artificial neural network: Backpropagation, Autoencoders, Hopfield networks, Boltzmann machines, Restricted Boltzmann Machines, Spiking neural networks
• Bayesian statistics: Bayesian network, Bayesian knowledge base
• Case-based reasoning
• Gaussian process regression
• Gene expression programming
• Group method of data handling
• Inductive logic programming
• Instance-based learning
• Lazy learning
• Learning Automata
• Learning Vector Quantization
• Logistic Model Tree
• Minimum message length
• Nearest Neighbor Algorithm
• Analogical modeling
• Probably approximately correct learning
• Ripple down rules
• Symbolic machine learning
• Support vector machines
• Random Forests
• Ensembles of classifiers: Bootstrap aggregating (bagging), Boosting (meta-algorithm)
• Ordinal classification
• Information fuzzy networks (IFN)
• Conditional Random Field
• ANOVA
• Hidden Markov models
• Linear classifiers: Fisher's linear discriminant, Linear regression, Logistic regression, Multinomial logistic regression, Naive Bayes classifier, Perceptron, Support vector machines
• Quadratic classifiers
• k-nearest neighbor
• Boosting
• Decision trees: C4.5, Random forests, ID3, CART, SLIQ, SPRINT
• Bayesian networks: Naive Bayes

Approaches: Unsupervised learning

• Expectation-maximization algorithm
• Vector Quantization
• Generative topographic map
• Information bottleneck method
• Artificial neural network: Self-organizing map
• Association rule learning: Apriori algorithm, Eclat algorithm, FP-growth algorithm
• Hierarchical clustering: Single-linkage clustering, Conceptual clustering
• Cluster analysis: K-means algorithm, Fuzzy clustering, DBSCAN, OPTICS algorithm
• Outlier Detection: Local Outlier Factor

Approaches: Semi-supervised learning

• Generative models
• Low-density separation
• Graph-based methods
• Co-training

Approaches: Deep learning

• Deep belief networks
• Deep Boltzmann machines
• Deep Convolutional neural networks
• Deep Recurrent neural networks
• Hierarchical temporal memory

Approaches: Reinforcement learning

• Temporal difference learning
• Q-learning
• Learning Automata
• SARSA

Outputs

•Classification

•Anomaly Detection

•Regression

•Clustering

•Density Estimation

•Dimensionality Reduction

Questions

•Is this A or B?

•Is this weird?

•How much, how many?

•How is this organized?

•What should I do next?

What Questions does ML Answer?

•Will this tire fail in the next 1,000 miles: Yes or no?

•Which brings in more customers: a $5 coupon or a 25% discount?

What Questions does ML Answer?

•If you have a car with pressure gauges, you might want to know: Is this pressure gauge reading normal?

•If you're monitoring the internet you’d want to know: Is this message from the internet typical?

What Questions does ML Answer?

•What will the temperature be next Tuesday?

•What will my fourth quarter sales be?

What Questions does ML Answer?

•Which viewers like the same types of movies?

•Which printer models fail the same way?

What Questions does ML Answer?

•If I'm a temperature control system for a house: Adjust the temperature or leave it where it is?

•If I'm a self-driving car: At a yellow light, brake or accelerate?

•For a robot vacuum: Keep vacuuming, or go back to the charging station?

Linearity

Algorithms

•Classification

•Anomaly Detection

•Regression

•Clustering

Algorithms

Support Vector Machine

Logistic Regression

Algorithms

Decision Trees

One-vs-All Multiclass Classifier

Algorithms

Neural Networks

Linear Regression

Algorithms

K-means

Principal Components Analysis

It’s Just Simple Math...

AODE

Naive Bayes Classifier

Fisher’s Linear Discriminant

Boltzmann Machines

Perceptron

Random Forests

k-Nearest Neighbors

Quadratic Classifiers

k-Means Clustering

Support Vector Machines

Training, Scoring and Evaluation

• Training

• Scoring

• Evaluation

• Cross Validation

• Confusion Matrix

• Accuracy (ACC)

• Precision (PPV)

• Recall (TPR)

• F1 Score (F1)

• Area Under Curve
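To make the evaluation terms on this slide concrete, here is a minimal Python sketch that builds a confusion matrix from actual and predicted labels and derives accuracy, precision, recall and F1 from it. The class name, function name and the sample labels are illustrative assumptions, not part of the original deck or of any particular ML platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # true positives: predicted positive, actually positive
    fp: int  # false positives: predicted positive, actually negative
    fn: int  # false negatives: predicted negative, actually positive
    tn: int  # true negatives: predicted negative, actually negative

    @property
    def accuracy(self) -> float:
        # ACC: share of all predictions that were correct
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    @property
    def precision(self) -> float:
        # PPV: share of predicted positives that were really positive
        predicted_positive = self.tp + self.fp
        return self.tp / predicted_positive if predicted_positive else 0.0

    @property
    def recall(self) -> float:
        # TPR: share of actual positives that the model found
        actual_positive = self.tp + self.fn
        return self.tp / actual_positive if actual_positive else 0.0

    @property
    def f1(self) -> float:
        # F1: harmonic mean of precision and recall
        p, r = self.precision, self.recall
        return 2 * p * r / (p + r) if (p + r) else 0.0


def confusion_matrix(actual, predicted, positive=1) -> ConfusionMatrix:
    """Count the four outcomes for a binary classifier."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return ConfusionMatrix(tp, fp, fn, tn)


# Hypothetical labels, for illustration only (1 = positive class).
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
cm = confusion_matrix(actual, predicted)
print(cm, cm.accuracy, cm.precision, cm.recall, cm.f1)
```

Area Under Curve is left out of the sketch because it needs predicted scores rather than hard labels.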

Break


Putting Theory into Practice

•Brainstorming – What questions could we answer with data?

•Ranking – What questions are most suitable for machine learning?

Putting Theory into Practice

Question | Value (1-5, Low to High) | Suitability (1-5, Low to High) | Data Available (1-5, Low to High) | Complexity (1-5, High to Low) | Score (1-5, Bad to Good) | Question Type
Predict when and why a customer becomes a "Leaver" | 3 | 5 | 5 | 4 | 4.25 | Is this A or B? / Classification
Route deliveries to ensure guaranteed timeframe is achieved | 3 | 2 | 1 | 1 | 1.75 | What should I do now? / Reinforcement
Price products for greater unit sales and profitability | 3 | 2 | 2 | 2 | 2.25 | How much, how many? / Regression
Analyze social media to understand customer personas | 3 | 3 | 3 | 2 | 2.75 | How is this organized? / Clustering
Forecast sales and labor staffing efficiently | 3 | 4 | 3 | 3 | 3.25 | How much, how many? / Regression
Determine media for best return on marketing investments | 3 | 3 | 4 | 5 | 3.75 | How much, how many? How is this organized? / Regression, Clustering
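The Score values in the ranking table are consistent with a plain average of the four ratings (for example, (3 + 5 + 5 + 4) / 4 = 4.25). The slides do not state the formula, so the averaging in this short Python sketch is an assumption inferred from the numbers; the variable and function names are illustrative.

```python
# Each row: (question, value, suitability, data_available, complexity),
# copied from the ranking table in the slides.
candidates = [
    ('Predict when and why a customer becomes a "Leaver"', 3, 5, 5, 4),
    ("Route deliveries to ensure guaranteed timeframe is achieved", 3, 2, 1, 1),
    ("Price products for greater unit sales and profitability", 3, 2, 2, 2),
    ("Analyze social media to understand customer personas", 3, 3, 3, 2),
    ("Forecast sales and labor staffing efficiently", 3, 4, 3, 3),
    ("Determine media for best return on marketing investments", 3, 3, 4, 5),
]


def score(ratings):
    """Assumed scoring rule: unweighted mean of the 1-5 ratings."""
    return sum(ratings) / len(ratings)


# Rank the candidate questions from most to least suitable.
ranked = sorted(candidates, key=lambda row: score(row[1:]), reverse=True)

for question, *ratings in ranked:
    print(f"{score(ratings):.2f}  {question}")
```

Under this assumed rule, the customer "Leaver" question ranks first and the delivery routing question ranks last, matching the scores in the table.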

DEMO

• Don’t Panic

Common Issues

•Bias

•Class Imbalance Problem

Bias

•Bias in the data

•Bias created by using ML

Class Imbalance Problem

• If there is a dataset consisting of 10000 genuine and 10 fraudulent transactions, the classifier will tend to classify fraudulent transactions as genuine transactions. The reason can be easily explained by the numbers. Suppose the machine learning algorithm has two possible outputs as follows:

• Model 1 classified 7 out of 10 fraudulent transactions as genuine transactions and 10 out of 10000 genuine transactions as fraudulent transactions.

• Model 2 classified 2 out of 10 fraudulent transactions as genuine transactions and 100 out of 10000 genuine transactions as fraudulent transactions.
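The arithmetic behind the two models can be worked through in a few lines of Python. This is only a sketch of the numbers quoted above; the function name and the choice of accuracy and fraud recall as the metrics to compare are illustrative assumptions.

```python
GENUINE = 10_000  # genuine transactions in the dataset
FRAUD = 10        # fraudulent transactions in the dataset


def summarize(missed_fraud, false_alarms):
    """Summarize a model from its two kinds of mistakes.

    missed_fraud: fraudulent transactions classified as genuine
    false_alarms: genuine transactions classified as fraudulent
    """
    caught_fraud = FRAUD - missed_fraud
    correct = caught_fraud + (GENUINE - false_alarms)
    return {
        "errors": missed_fraud + false_alarms,
        "accuracy": correct / (GENUINE + FRAUD),
        "fraud_recall": caught_fraud / FRAUD,
    }


model_1 = summarize(missed_fraud=7, false_alarms=10)
model_2 = summarize(missed_fraud=2, false_alarms=100)

for name, result in (("Model 1", model_1), ("Model 2", model_2)):
    print(
        f"{name}: {result['errors']} errors, "
        f"accuracy {result['accuracy']:.4f}, "
        f"fraud recall {result['fraud_recall']:.0%}"
    )
```

Model 1 makes 17 mistakes and Model 2 makes 102, so a learner that only minimizes the number of mistakes prefers Model 1, even though it catches 3 of the 10 fraudulent transactions while Model 2 catches 8.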


Class Imbalance Problem - Solutions

•Cost function based approaches

•Sampling

Cost Function Based Approaches

The intuition behind cost function based approaches is that if we think one false negative is worse than one false positive, we count that one false negative as, e.g., 100 false negatives instead. For example, if 1 false negative is as costly as 100 false positives, then the machine learning algorithm will try to make fewer false negatives than false positives (since a false positive is the cheaper mistake).
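As a rough illustration, the following Python sketch applies a 100-to-1 weighting to Model 1 and Model 2 from the class imbalance slide. The weight of 100 is the example figure from the paragraph above; the function name and its parameters are hypothetical.

```python
def weighted_cost(false_negatives, false_positives, fn_weight=100, fp_weight=1):
    """Total cost when one false negative counts as fn_weight false positives."""
    return fn_weight * false_negatives + fp_weight * false_positives


# Model 1: 7 missed frauds (false negatives), 10 false alarms (false positives).
cost_model_1 = weighted_cost(false_negatives=7, false_positives=10)

# Model 2: 2 missed frauds, 100 false alarms.
cost_model_2 = weighted_cost(false_negatives=2, false_positives=100)

print(f"Model 1 cost: {cost_model_1}")
print(f"Model 2 cost: {cost_model_2}")
```

With equal weights Model 1 looks better (17 mistakes against 102), but once missed fraud is weighted 100 times more heavily, Model 2 has the lower cost (300 against 710).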

Sampling - Undersampling

Sampling - Oversampling

Sampling - SMOTE
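The three sampling slides carry no text in this transcript, so the Python sketch below is only an assumed illustration of the general ideas: random undersampling drops majority rows, random oversampling repeats minority rows, and a SMOTE-style step creates synthetic minority points by interpolating between a minority point and one of its nearest minority neighbors. The function names, the toy data and the parameter `k` are hypothetical, and this is a simplification rather than a reference SMOTE implementation.

```python
import math
import random


def undersample(majority, minority, rng):
    """Randomly keep only as many majority rows as there are minority rows."""
    return rng.sample(majority, len(minority)), list(minority)


def oversample(majority, minority, rng):
    """Randomly repeat minority rows until both classes are the same size."""
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority), list(minority) + extra


def smote_like(minority, n_new, rng, k=2):
    """Create n_new synthetic points between minority points and near neighbors."""
    synthetic = []
    for _ in range(n_new):
        point = rng.choice(minority)
        # The k closest other minority points, by Euclidean distance.
        neighbors = sorted(
            (p for p in minority if p is not point),
            key=lambda p: math.dist(point, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from point to neighbor
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(point, neighbor)))
    return synthetic


# Toy two-feature data, for illustration only.
rng = random.Random(0)
majority = [(float(i), 0.0) for i in range(20)]
minority = [(0.0, 5.0), (1.0, 6.0), (2.0, 5.0)]

under_major, under_minor = undersample(majority, minority, rng)
over_major, over_minor = oversample(majority, minority, rng)
new_points = smote_like(minority, n_new=5, rng=rng)

print(len(under_major), len(under_minor))
print(len(over_major), len(over_minor))
print(new_points)
```

Undersampling discards data, oversampling duplicates it, and the SMOTE-style step adds new points that lie between existing minority examples.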

Operationalizing Machine Learning

Components: R, Gateway, Power BI, Power BI Service, Azure ML

1. Extract

2. Call web service

3. Return predictions

4. Publish

5. Schedule refresh

Platforms

• Automatic Business Modeler

• Algorithmia

• algorithms.io

• Amazon Machine Learning

• BigML

• DataRobot

• FICO Analytic Cloud

• Google Prediction API

• HPE Haven OnDemand

• IBM’s Watson Analytics

• Microsoft Azure Machine Learning

• MLJAR.com

• PurePredictive

• Yottamine

Questions?

•Try Machine Learning, what’s the worst that could happen…?

•[email protected]

•@GregDeckler
