![Page 1: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/1.jpg)
1
Recommendations for Building Machine Learning Software
Justin Basilico
Page Algorithms Engineering
November 13, 2015
@JustinBasilico
SF 2015
![Page 2: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/2.jpg)
2
Introduction
![Page 3: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/3.jpg)
3
Change of focus
2006 2015
![Page 4: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/4.jpg)
4
Netflix Scale
- More than 69M members
- More than 50 countries
- More than 1000 device types
- More than 3B hours/month
- 36.4% of peak US downstream traffic
![Page 5: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/5.jpg)
5
Goal
Help members find content to watch and enjoy to maximize member satisfaction and retention
![Page 6: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/6.jpg)
6
Everything is a Recommendation
Rows
Ranking
Over 80% of what people watch comes from our recommendations
Recommendations are driven by Machine Learning
![Page 7: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/7.jpg)
7
Machine Learning Approach
Problem
Data
Algorithm
Model
Metrics
![Page 8: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/8.jpg)
8
Models & Algorithms
- Regression (linear, logistic, elastic net)
- SVD and other Matrix Factorizations
- Factorization Machines
- Restricted Boltzmann Machines
- Deep Neural Networks
- Markov Models and Graph Algorithms
- Clustering
- Latent Dirichlet Allocation
- Gradient Boosted Decision Trees/Random Forests
- Gaussian Processes
- …
![Page 9: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/9.jpg)
9
Design Considerations
Recommendations
- Personal
- Accurate
- Diverse
- Novel
- Fresh

Software
- Scalable
- Responsive
- Resilient
- Efficient
- Flexible
![Page 10: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/10.jpg)
10
Software Stack
http://techblog.netflix.com
![Page 11: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/11.jpg)
11
Recommendations
![Page 12: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/12.jpg)
12
Be flexible about where and when computation happens
Recommendation 1
![Page 13: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/13.jpg)
13
System Architecture
- Offline: Process data
  - Batch learning
- Nearline: Process events
  - Model evaluation
  - Online learning
  - Asynchronous
- Online: Process requests
  - Real-time

More details on the Netflix Techblog
![Page 14: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/14.jpg)
14
Where to place components? Example: Matrix Factorization
- Offline:
  - Collect a sample of play data ($X$)
  - Run a batch learning algorithm like SGD to produce the factorization ($X \approx UV^T$)
  - Publish video factors ($V$)
- Nearline:
  - Solve user factors ($A u_i = b$)
  - Compute user-video dot products ($s_{ij} = u_i \cdot v_j$)
  - Store scores in cache ($s_{ij}$)
- Online:
  - Presentation-context filtering ($s_{ij} > t$)
  - Serve recommendations
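The split above can be sketched in plain Python. This is only an illustration under stated assumptions, not the production design: the function names, the ridge-regularized form of $A u_i = b$ (with $A = \sum_j v_j v_j^T + \lambda I$ and $b = \sum_j x_{ij} v_j$), and the toy data layout are all hypothetical.

```python
import random

def train_factors_sgd(plays, n_users, n_videos, k=2, lr=0.05, reg=0.01, epochs=200, seed=0):
    """Offline: fit X ≈ U V^T by SGD over observed (user, video, value) triples."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_videos)]
    for _ in range(epochs):
        for i, j, x in plays:
            err = x - sum(a * b for a, b in zip(U[i], V[j]))
            for f in range(k):
                u, v = U[i][f], V[j][f]
                U[i][f] += lr * (err * v - reg * u)
                V[j][f] += lr * (err * u - reg * v)
    return V  # only the video factors are published

def solve(A, b):
    """Solve A u = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[r]] for r, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for t in range(c, n + 1):
                M[r][t] -= m * M[c][t]
    u = [0.0] * n
    for r in reversed(range(n)):
        u[r] = (M[r][n] - sum(M[r][t] * u[t] for t in range(r + 1, n))) / M[r][r]
    return u

def user_factors(V, user_plays, reg=0.1):
    """Nearline: build A and b from one user's (video, value) plays, then solve A u_i = b."""
    k = len(V[0])
    A = [[reg if r == c else 0.0 for c in range(k)] for r in range(k)]
    b = [0.0] * k
    for j, x in user_plays:
        for r in range(k):
            b[r] += x * V[j][r]
            for c in range(k):
                A[r][c] += V[j][r] * V[j][c]
    return solve(A, b)

def score_all(V, u):
    """Nearline: s_ij = u_i · v_j for every video; the result would be cached."""
    return [sum(a * b for a, b in zip(u, v)) for v in V]

def serve(cached_scores, allowed, t=0.5):
    """Online: keep allowed videos with s_ij > t, highest score first."""
    hits = [(s, j) for j, s in enumerate(cached_scores) if j in allowed and s > t]
    return [j for s, j in sorted(hits, reverse=True)]
```

The expensive step (learning $V$) runs rarely and in batch, the per-user solve reacts to new events, and the request path only filters and sorts cached numbers.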
![Page 15: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/15.jpg)
15
Think about distribution starting from the outermost levels
Recommendation 2
![Page 16: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/16.jpg)
16
Three levels of Learning Distribution/Parallelization
1. For each subset of the population (e.g. region)
   - Want independently trained and tuned models
2. For each combination of (hyper)parameters
   - Simple: Grid search
   - Better: Bayesian optimization using Gaussian Processes
3. For each subset of the training data
   - Distribute over machines (e.g. ADMM)
   - Multi-core parallelism (e.g. HogWild)
   - Or… use GPUs
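A minimal sketch of parallelizing at the two outer levels first, assuming a toy one-dimensional ridge regression as the learner and a plain grid search. The function names and data layout are hypothetical, and `ThreadPoolExecutor` only stands in for whatever scheduler or cluster would run the independent jobs in practice.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def fit_ridge_1d(train, lam):
    """Innermost level stand-in: fit y ≈ w * x in closed form on one data subset."""
    sxy = sum(x * y for x, y in train)
    sxx = sum(x * x for x, _ in train)
    return sxy / (sxx + lam)

def validation_error(w, valid):
    """Mean squared error of the fitted weight on held-out pairs."""
    return sum((y - w * x) ** 2 for x, y in valid) / len(valid)

def run_trial(job):
    """One independent job: train and validate for a (region, lambda) pair."""
    region, lam, train, valid = job
    w = fit_ridge_1d(train, lam)
    return region, lam, w, validation_error(w, valid)

def tune_per_region(data_by_region, lams, max_workers=4):
    """Levels 1 and 2: fan out over every (region, hyperparameter) combination."""
    jobs = [(region, lam, train, valid)
            for (region, (train, valid)), lam in product(data_by_region.items(), lams)]
    best = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for region, lam, w, err in pool.map(run_trial, jobs):
            if region not in best or err < best[region][2]:
                best[region] = (lam, w, err)
    return best  # region -> (best lambda, weight, validation error)
```

Because the jobs share nothing, this outer fan-out needs no coordination beyond collecting results, which is why it is usually the easiest place to start.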
![Page 17: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/17.jpg)
17
Example: Training Neural Networks
- Level 1: Machines in different AWS regions
- Level 2: Machines in same AWS region
  - Spearmint or MOE for parameter optimization
  - Mesos, etc. for coordination
- Level 3: Highly optimized, parallel CUDA code on GPUs
![Page 18: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/18.jpg)
18
Design application software for experimentation
Recommendation 3
![Page 19: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/19.jpg)
19
Example development process
- Experimentation environment: Idea → Data → Offline Modeling (R, Python, MATLAB, …) → Iterate → Final model
- Production environment (A/B test): Implement in production system (Java, C++, …) → Actual output
- Issues flagged in the diagram: Data discrepancies, Code discrepancies, Missing post-processing logic, Performance issues
![Page 20: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/20.jpg)
20
Avoid dual implementations
- Instead of separate Experiment code and Production code
- Run Experiment and Production on a Shared Engine
  - Models
  - Features
  - Algorithms
  - …
![Page 21: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/21.jpg)
21
Solution: Share and lean towards production
- Developing machine learning is iterative
  - Need a short pipeline to rapidly try ideas
  - Want to see output of complete system
- So make the application easy to experiment with
  - Share components between online, nearline, and offline
  - Use the real code whenever possible
- Have well-defined interfaces and formats to allow you to go off-the-beaten-path
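As a small illustration of sharing components, the sketch below has one feature function and one ranking function that are called both by an offline experiment and by an online request handler. All names here (`PlayEvent`, `popularity_features`, and so on) are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PlayEvent:
    video_id: str
    minutes_watched: float

def popularity_features(events):
    """Shared component: total minutes watched per video."""
    totals = {}
    for e in events:
        totals[e.video_id] = totals.get(e.video_id, 0.0) + e.minutes_watched
    return totals

def rank_by_popularity(events, candidates):
    """Shared ranking logic built on the shared feature code (ties broken by id)."""
    pop = popularity_features(events)
    return sorted(candidates, key=lambda v: (-pop.get(v, 0.0), v))

def offline_experiment(logged_events, candidates, held_out_video):
    """Offline: run the real ranking code on logged data and report reciprocal rank."""
    ranking = rank_by_popularity(logged_events, candidates)
    return 1.0 / (ranking.index(held_out_video) + 1)

def online_request(recent_events, candidates, limit=2):
    """Online: the same ranking code answers a live request."""
    return rank_by_popularity(recent_events, candidates)[:limit]
```

Since both paths call `rank_by_popularity`, an offline metric is computed on the same logic that would serve members, which removes one source of code discrepancies.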
![Page 22: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/22.jpg)
22
Make algorithms extensible and modular
Recommendation 4
![Page 23: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/23.jpg)
23
Make algorithms and models extensible and modular
- Algorithms often need to be tailored for a specific application
  - Treating an algorithm as a black box is limiting
  - Better to make algorithms extensible and modular to allow for customization
- Separate models and algorithms
  - Many algorithms can learn the same model (e.g. linear binary classifier)
  - Many algorithms can be trained on the same types of data
- Support composing algorithms

Diagram: Data and Parameters feeding a single Model box, vs. Data and Parameters feeding an Algorithm that produces a Model
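One way to read "separate models and algorithms" is sketched below: a single `LinearBinaryClassifier` model class and two interchangeable learners that both produce it. The class names and the `learn` interface are assumptions made for this example, not an API from the talk.

```python
import math
from dataclasses import dataclass
from typing import List, Protocol, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (features, label in {-1, +1})

@dataclass
class LinearBinaryClassifier:
    """The model: knows how to predict, not how it was learned."""
    weights: List[float]
    bias: float = 0.0

    def score(self, x):
        return sum(w * xi for w, xi in zip(self.weights, x)) + self.bias

    def predict(self, x):
        return 1 if self.score(x) >= 0 else -1

class Learner(Protocol):
    """The algorithm interface: data in, model out."""
    def learn(self, data: Sequence[Example]) -> LinearBinaryClassifier: ...

@dataclass
class Perceptron:
    epochs: int = 20

    def learn(self, data):
        model = LinearBinaryClassifier([0.0] * len(data[0][0]))
        for _ in range(self.epochs):
            for x, y in data:
                if y * model.score(x) <= 0:  # mistake-driven update
                    model.weights = [w + y * xi for w, xi in zip(model.weights, x)]
                    model.bias += y
        return model

@dataclass
class LogisticRegressionSGD:
    epochs: int = 200
    learning_rate: float = 0.1

    def learn(self, data):
        model = LinearBinaryClassifier([0.0] * len(data[0][0]))
        for _ in range(self.epochs):
            for x, y in data:
                # d/ds log(1 + exp(-y * s)) = -y * g, with g as below
                g = 1.0 / (1.0 + math.exp(y * model.score(x)))
                step = self.learning_rate * y * g
                model.weights = [w + step * xi for w, xi in zip(model.weights, x)]
                model.bias += step
        return model
```

Code that consumes the model (scoring, serialization, evaluation) does not need to know which learner produced it, so a new algorithm can be swapped in without touching the rest of the application.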
![Page 24: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/24.jpg)
24
Provide building blocks
- Don’t start from scratch
  - Linear algebra: Vectors, Matrices, …
  - Statistics: Distributions, tests, …
  - Models, features, metrics, ensembles, …
  - Loss, distance, kernel, … functions
  - Optimization, inference, …
  - Layers, activation functions, …
  - Initializers, stopping criteria, …
  - …
  - Domain-specific components
- Build abstractions on familiar concepts
- Make the software put them together
![Page 25: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/25.jpg)
25
Example: Tailoring Random Forests
Using Cognitive Foundry: http://github.com/algorithmfoundry/Foundry
Use a custom tree split
Customize to run it for an hour
Report a custom metric each iteration
Inspect the ensemble
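The slide's own code (written against Cognitive Foundry, a Java library) did not survive extraction, and the sketch below does not reproduce its API. It is a hypothetical Python analogue that swaps in a much simpler learner, bagged one-feature decision stumps, to show the same four hooks: a pluggable split, a pluggable stopping rule, a per-iteration callback, and an inspectable ensemble.

```python
import random
import time

def best_threshold_split(sample):
    """Default split: threshold on feature 0 with the fewest training errors."""
    best = None
    for x, _ in sample:
        t = x[0]
        errors = sum((1 if xi[0] >= t else -1) != yi for xi, yi in sample)
        if best is None or errors < best[0]:
            best = (errors, t)
    return best[1]

class BaggedStumps:
    def __init__(self, split=best_threshold_split, should_stop=None,
                 on_iteration=None, seed=0):
        self.split = split                # hook: custom split rule
        self.should_stop = should_stop or (lambda n, elapsed: n >= 10)
        self.on_iteration = on_iteration  # hook: custom per-iteration reporting
        self.rng = random.Random(seed)
        self.members = []                 # learned thresholds, open for inspection

    def learn(self, data):
        start = time.monotonic()
        while not self.should_stop(len(self.members), time.monotonic() - start):
            sample = [self.rng.choice(data) for _ in data]  # bootstrap sample
            self.members.append(self.split(sample))
            if self.on_iteration:
                self.on_iteration(self)
        return self

    def predict(self, x):
        votes = sum(1 if x[0] >= t else -1 for t in self.members)
        return 1 if votes >= 0 else -1
```

Under these assumptions, running for an hour is a matter of passing `should_stop=lambda n, elapsed: elapsed >= 3600`, and reporting a custom metric each iteration is an `on_iteration` callback that evaluates the partially built ensemble.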
![Page 26: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/26.jpg)
26
Describe your input and output transformations with your model
Recommendation 5
![Page 27: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/27.jpg)
27
Putting learning in an application
Diagram: Application → Feature Encoding → Machine Learned Model ($\mathbb{R}^d \to \mathbb{R}^k$) → Output Decoding → Application
Application or model code?
![Page 28: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/28.jpg)
28
Example: Simple ranking system
High-level API: List<Video> rank(User u, List<Video> videos)
Example model description file:

    {
      "type": "ScoringRanker",
      "scorer": {
        "type": "FeatureScorer",
        "features": [
          {"type": "Popularity", "days": 10},
          {"type": "PredictedRating"}
        ],
        "function": {
          "type": "Linear",
          "bias": -0.5,
          "weights": {
            "popularity": 0.2,
            "predictedRating": 1.2,
            "predictedRating*popularity": 3.5
          }
        }
      }
    }

Callouts on the description: Ranker, Scorer, Features, Linear function, Feature transformations
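To make the idea concrete, here is a hedged sketch of how an application might turn such a description into a working ranker. The feature extractors, the dictionary layout of users and videos, and the reading of `"a*b"` as a product of two feature values are all assumptions for illustration; the talk does not specify them.

```python
import json

DESCRIPTION = """
{
  "type": "ScoringRanker",
  "scorer": {
    "type": "FeatureScorer",
    "features": [{"type": "Popularity", "days": 10}, {"type": "PredictedRating"}],
    "function": {
      "type": "Linear",
      "bias": -0.5,
      "weights": {"popularity": 0.2, "predictedRating": 1.2,
                  "predictedRating*popularity": 3.5}
    }
  }
}
"""

# Hypothetical feature extractors: each maps a feature spec to (name, function).
FEATURES = {
    "Popularity": lambda spec: (
        "popularity", lambda user, video: video["plays_by_days"][spec["days"]]),
    "PredictedRating": lambda spec: (
        "predictedRating", lambda user, video: user["predicted"][video["id"]]),
}

def build_scorer(spec):
    """Build score(user, video) from a FeatureScorer description."""
    assert spec["type"] == "FeatureScorer" and spec["function"]["type"] == "Linear"
    extractors = dict(FEATURES[f["type"]](f) for f in spec["features"])
    bias, weights = spec["function"]["bias"], spec["function"]["weights"]

    def score(user, video):
        values = {name: fn(user, video) for name, fn in extractors.items()}
        total = bias
        for term, weight in weights.items():
            product = 1.0
            for name in term.split("*"):  # assumed: "a*b" is an interaction term
                product *= values[name]
            total += weight * product
        return total
    return score

def build_ranker(spec):
    """Build rank(user, videos) from a ScoringRanker description."""
    assert spec["type"] == "ScoringRanker"
    score = build_scorer(spec["scorer"])
    return lambda user, videos: sorted(videos, key=lambda v: score(user, v), reverse=True)

rank = build_ranker(json.loads(DESCRIPTION))
```

Because feature encoding and the scoring function both live in the description, the same file can be loaded by experiment and production code without either side re-implementing the transformations.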
![Page 29: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/29.jpg)
29
Don’t just rely on metrics for testing
Recommendation 6
![Page 30: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/30.jpg)
30
Machine Learning and Testing
- Temptation: Use validation metrics to test software
  - When things work and metrics go up this seems great
  - When metrics don’t improve, was it the
    - code
    - data
    - metric
    - idea
    - …?
![Page 31: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/31.jpg)
31
Reality of Testing
- Machine learning code involves intricate math and logic
  - Rounding issues, corner cases, …
  - Is that a + or -? (The math or paper could be wrong.)
  - Solution: Unit test
- Testing of metric code is especially important
- Test the whole system: Just unit testing is not enough
  - At a minimum, compare output for unexpected changes across versions
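A minimal sketch of both kinds of test, using NDCG purely as an example metric (the talk does not name one) and a hypothetical `changed_outputs` helper for the cross-version comparison.

```python
import math
import unittest

def dcg(relevances):
    """Discounted cumulative gain with a log2(position + 1) discount."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances))

def ndcg(relevances):
    """DCG normalized by the DCG of the ideal ordering; 0.0 if nothing is relevant."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

def changed_outputs(old, new, tolerance=1e-9):
    """Whole-system check: keys whose output differs between two versions."""
    keys = set(old) | set(new)
    return sorted(k for k in keys
                  if k not in old or k not in new or abs(old[k] - new[k]) > tolerance)

class NdcgTest(unittest.TestCase):
    def test_perfect_ranking_scores_one(self):
        self.assertAlmostEqual(ndcg([3, 2, 1]), 1.0)

    def test_hand_computed_value(self):
        # DCG = 0/log2(2) + 1/log2(3); ideal DCG = 1/log2(2) = 1
        self.assertAlmostEqual(ndcg([0, 1]), 1 / math.log2(3))

    def test_corner_cases(self):
        self.assertEqual(ndcg([]), 0.0)      # empty list
        self.assertEqual(ndcg([0, 0]), 0.0)  # no relevant items: avoid 0/0
```

The unit tests pin the metric to values worked out by hand, while `changed_outputs` can be run over a fixed set of inputs scored by the previous and the current build so that any unexpected difference is surfaced before metrics are even consulted.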
![Page 32: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/32.jpg)
32
Conclusions
![Page 33: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/33.jpg)
33
Two ways to solve computational problems
- Software Development: Know solution → Write code → Compile code → Test code → Deploy code
- Machine Learning (steps may involve Software Development): Know relevant data → Develop algorithmic approach → Train model on data using algorithm → Validate model with metrics → Deploy model
![Page 34: Recommendations for Building Machine Learning Software](https://reader033.vdocument.in/reader033/viewer/2022061307/58a421aa1a28abec1a8b605b/html5/thumbnails/34.jpg)
34
Take-aways for building machine learning software
- Building machine learning is an iterative process
- Make experimentation easy
- Take a holistic view of application where you are placing learning
- Design your algorithms to be modular
- Look for the easy places to parallelize first
- Testing can be hard but is worthwhile