Operationalizing Machine Learning

5/24/2017 Taylor Howard

Presenter

C. Taylor Howard

Data Analytics & Collaboration

[email protected]

Agenda
10 Minutes of Content & 5 Minutes for Questions

• Traditional software systems versus ML & AI Solutions
Let’s apply some definitions for clarity. Why would we want to invest in ML?

• What do I want for lunch?
A simple example demonstrating the benefits, challenges, and operational considerations of ML versus traditional software.

• Machine Learning in the field
What do we need to know about deploying a solution with ML?

• Where do I begin and what is this going to cost me?
Adding ML can be very cost effective when using trained models, but what about training bespoke models for custom needs?

Definitions

Artificial Intelligence
Encompasses all approaches to simulate human intelligence. General AI is the goal.

Machine Learning
Algorithmic approach to parse data, learn from it, and make predictions.

Deep Learning
Massive artificial neural networks targeting narrow AI.

Source – A great article by Michael Copeland
https://blogs.nvidia.com/blog/2016/07/29/whats-difference-artificial-intelligence-machine-learning-deep-learning-ai/

Why Machine Learning?

• Many problems do not require ML.

• Where there is overlap, ML offers generalization.

• To the far right are problem domains that traditional software cannot solve, e.g., speech recognition, computer vision, etc.

[Figure: Traditional Software and Machine Learning regions, ranging from Descriptive to Predictive]

Lunchtime Ordering App
Integrating machine learning with a traditional app

Version 1
Users can order lunch

Version 2
Order history remembered for quick reordering

Version 3
Provide recommendations based on past order history

Cuisine? American, Italian, Indian, French

Menu? Hamburger, Fries, Chicken

Lunchtime Ordering App
Machine learning versus conventional approaches

Application Development Methodology (Scrum)

Data Science Methodology (CRISP-DM)

=

Version 3
Provide recommendations based on past order history

Non-ML Approach – Considerations

• Recommend whatever the user orders most often

• Recommend whatever was ordered most recently

• Recommend based on price

• Mine reviews from Yelp and recommend based on user reviews

• Recommend based on location

?

ML Approach

We do not need to hand-code every relationship between the factors that influence a lunch decision. We will let machine learning determine these relationships for us, but we need to provide the inputs, which are called features.
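For contrast, the first non-ML rule listed above (recommend whatever the user orders most) can be sketched in a few lines. This is only an illustrative sketch: the function name and the sample order history are hypothetical, and the menu items are borrowed from the earlier slide.

```python
from collections import Counter


def recommend_most_ordered(order_history):
    """Return the item that appears most often in the order history.

    Returns None when the history is empty.
    """
    if not order_history:
        return None
    # most_common(1) yields a one-element list of (item, count).
    item, _count = Counter(order_history).most_common(1)[0]
    return item


# Hypothetical order history, for illustration only.
history = ["Hamburger", "Chicken", "Hamburger", "Fries", "Hamburger"]
print(recommend_most_ordered(history))
```

Each additional factor (price, recency, location, reviews) would need its own hand-written rule and a way to weigh it against the others, which is the burden the ML approach aims to remove.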

Lunchtime Ordering App
Machine learning observations, features & labels

A feature is an individual property being observed that we believe will have predictive power.

External features can have significance.

The label informs our algorithm of the correct result we seek to predict.
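One way to picture observations, features, and labels is as rows of named values where one column is set aside as the label. The sketch below is an assumption-laden illustration: the rows are taken from the sample table later in this deck, treating Satisfaction as the label is an assumption, and the helper name is hypothetical.

```python
# Each dictionary is one observation (one past lunch order).
observations = [
    {"day_of_week": "Thursday", "time_of_day": "11:00 AM", "satisfaction": "High"},
    {"day_of_week": "Friday", "time_of_day": "1:00 PM", "satisfaction": "Low"},
    {"day_of_week": "Friday", "time_of_day": "1:00 PM", "satisfaction": "Medium"},
]

# Assumption: satisfaction is the value we want the model to predict.
LABEL = "satisfaction"


def split_features_and_labels(rows, label=LABEL):
    """Separate each observation into its features and its label."""
    features = [{k: v for k, v in row.items() if k != label} for row in rows]
    labels = [row[label] for row in rows]
    return features, labels


features, labels = split_features_and_labels(observations)
print(features[0])
print(labels)
```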

Lunchtime Ordering App
Machine learning training & model selection

• Multiclass neural network: accuracy, long training times

• Multiclass logistic regression: fast training times, linear model

• Multiclass decision forest: accuracy, fast training times

• Multiclass decision jungle: accuracy, small memory footprint

A good article on model performance – Accuracy, Precision, Recall
https://blogs.msdn.microsoft.com/andreasderuiter/2015/02/09/performance-measures-in-azure-ml-accuracy-precision-recall-and-f1-score/
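The measures named above can be computed by hand for a multiclass problem by treating one class at a time as the "positive" class. The following is a minimal sketch in plain Python; the true and predicted labels are hypothetical and do not come from any model in this deck.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels, for illustration only.
y_true = ["High", "High", "Low", "Medium", "High", "Low"]
y_pred = ["High", "Low", "Low", "Medium", "High", "High"]

print(accuracy(y_true, y_pred))
print(precision_recall(y_true, y_pred, positive="High"))
```

Comparing candidate models on the same held-out observations with these measures is one way to weigh the accuracy and training-time trade-offs listed above.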

Lunchtime Ordering App
Deploying our model and predicting where to eat!

Multiclass decision forest: accuracy, fast training times

Observation | Day of Week | Time of Day | Ordered Recently | Distance | Vegan Option | Calories | Cuisine | Price | Yelp Rating | Weather
1 | Wednesday | 12:30 PM | No | 0 - 5 Miles | Yes | Medium | American | $$ | 4 | Clear
2 | Wednesday | 12:30 PM | No | 15 - 25 Miles | Yes | High | Indian | $ | 5 | Clear
3 | Wednesday | 12:30 PM | No | 6 - 10 Miles | Yes | Medium | American | $$$ | 4 | Clear
4 | Wednesday | 12:30 PM | No | 0 - 5 Miles | Yes | Medium | Italian | $ | 4 | Clear
5 | Wednesday | 12:30 PM | No | 15 - 25 Miles | Yes | High | American | $$ | 5 | Clear
6 | Wednesday | 12:30 PM | Yes | 6 - 10 Miles | Yes | High | Italian | $$$ | 5 | Clear
7 | Wednesday | 12:30 PM | No | 0 - 5 Miles | No | Medium | American | $ | 4 | Clear
8 | Wednesday | 12:30 PM | No | 15 - 25 Miles | Yes | Low | American | $$ | 4 | Clear

Observation | Scored Label | Scored Probability
1 | High | 0.92
2 | High | 0.80
3 | Medium | 0.94
4 | Medium | 0.70
5 | High | 0.80
6 | Low | 0.95
7 | Low | 0.80
8 | High | 0.50
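Turning scored output into a recommendation still takes a small amount of conventional code. The sketch below uses the scored values from the table above and assumes, without confirmation from the slides, that the scored label is a predicted satisfaction level and that the app should surface the "High" candidates with the largest scored probability. The function name and the limit of three are hypothetical.

```python
# (observation, scored label, scored probability), copied from the table above.
scored = [
    (1, "High", 0.92), (2, "High", 0.80), (3, "Medium", 0.94), (4, "Medium", 0.70),
    (5, "High", 0.80), (6, "Low", 0.95), (7, "Low", 0.80), (8, "High", 0.50),
]


def top_recommendations(scored_rows, wanted_label="High", limit=3):
    """Return observation ids with the wanted label, most probable first."""
    candidates = [row for row in scored_rows if row[1] == wanted_label]
    # sorted() is stable, so ties keep their original observation order.
    ranked = sorted(candidates, key=lambda row: row[2], reverse=True)
    return [obs for obs, _label, _prob in ranked[:limit]]


print(top_recommendations(scored))
```

Under these assumptions, observation 3 is not recommended despite its 0.94 probability, because that probability is the model's confidence in a "Medium" outcome.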

What does this cost?

Custom Model - ML

• Azure ML

• Microsoft R Server

• Google TensorFlow

• Amazon Machine Learning

• Big Data – Spark \ R

1-4 Months

Prebuilt Intelligence APIs

• Microsoft Azure Cognitive Services

• Google Cloud Prediction

• IBM Watson APIs

4-12+ Months

Deep Learning

• Microsoft Cognitive Toolkit (CNTK)

• Google TensorFlow

• Custom Algorithm \ Neural Network

6-18+ Months

https://azure.microsoft.com/en-us/services/cognitive-services/

https://cloud.google.com/prediction/docs/gallery

https://www.ibm.com/us-en/marketplace/cognitive-application-development

Lunchtime Ordering App
Bonus section – Unsupervised learning (clustering)

Observation | User | Age | Income | Gender | Day of Week | Time of Day | Satisfaction
1 | Jeff | 44 | 50 - 75k | Male | Thursday | 11:00 AM | High
2 | Jeff | 44 | 50 - 75k | Male | Friday | 1:00 PM | Low
3 | Jeff | 44 | 50 - 75k | Male | Friday | 1:00 PM | Medium
4 | Tony | 43 | 75 - 100k | Male | Monday | 12:30 PM | Medium
5 | Tony | 43 | 75 - 100k | Male | Tuesday | 12:30 PM | High
6 | Tony | 43 | 75 - 100k | Male | Friday | 12:00 PM | Low
7 | Jill | 28 | 75 - 100k | Female | Friday | 11:30 AM | High
8 | Jill | 28 | 75 - 100k | Female | Friday | 2:00 PM | High
…
N -

Imagine adding demographic features to our data set.

The label informs our algorithm of the correct result we seek to predict.

What type of clusters do we see for users that are highly satisfied?

{Female, 24-30, 75-100k}

Perhaps an ad campaign?
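The question above can be explored on the eight sample rows with a simple frequency count. Note that this is a stand-in for illustration, not a clustering algorithm: a real clustering method would discover groups on its own, whereas here the groups are fixed by hand. The age bands other than 24-30 are assumptions, as are the function names.

```python
from collections import Counter

# (user, age, income, gender, satisfaction), from the deck's sample table.
rows = [
    ("Jeff", 44, "50 - 75k", "Male", "High"),
    ("Jeff", 44, "50 - 75k", "Male", "Low"),
    ("Jeff", 44, "50 - 75k", "Male", "Medium"),
    ("Tony", 43, "75 - 100k", "Male", "Medium"),
    ("Tony", 43, "75 - 100k", "Male", "High"),
    ("Tony", 43, "75 - 100k", "Male", "Low"),
    ("Jill", 28, "75 - 100k", "Female", "High"),
    ("Jill", 28, "75 - 100k", "Female", "High"),
]


def age_band(age):
    """Map an age to a band. Only 24-30 appears on the slide; the rest are assumed."""
    if 24 <= age <= 30:
        return "24-30"
    if 31 <= age <= 40:
        return "31-40"
    if 41 <= age <= 50:
        return "41-50"
    return "other"


def satisfied_segments(rows, wanted="High"):
    """Count demographic groups among observations with the wanted satisfaction."""
    counts = Counter(
        (gender, age_band(age), income)
        for _user, age, income, gender, satisfaction in rows
        if satisfaction == wanted
    )
    return counts.most_common()


for segment, count in satisfied_segments(rows):
    print(segment, count)
```

On this tiny sample the largest highly satisfied group is Female, 24-30, 75 - 100k, matching the cluster shown on the slide, although both of those observations come from a single user.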

Thank You

www.agilethought.com linkedin.com/company/AgileThought @AgileThought

C. Taylor Howard

Director of Data Analytics & Collaboration

[email protected]

www.agilethought.com

727.248.2478

Data Analytics
CRISP Methodology

• Business Understanding
This initial phase focuses on understanding the project objectives and requirements from a business perspective, and then converting this knowledge into a data mining problem definition, and a preliminary plan designed to achieve the objectives.

• Data Understanding
The data understanding phase starts with an initial data collection and proceeds with activities in order to get familiar with the data, to identify data quality problems, to discover first insights into the data, or to detect interesting subsets to form hypotheses for hidden information.

• Data Preparation
The data preparation phase covers all activities to construct the final dataset. This data will be fed into the modeling tools from the initial raw data. Data preparation tasks are likely to be performed multiple times.

• Modeling
Modeling techniques are selected and applied, and their parameters are calibrated to optimal values. Typically, there are several techniques for the same data mining problem type. Stepping back to the data preparation phase is often needed.

Azure ML Algorithms

Azure ML Cheat Sheet
https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-algorithm-cheat-sheet

Additional Links

• Scrum Software Development
https://en.wikipedia.org/wiki/Scrum_(software_development)

• CRISP-DM, Cross Industry Standard Process for Data Mining
https://en.wikipedia.org/wiki/Cross_Industry_Standard_Process_for_Data_Mining
