![Page 1: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/1.jpg)
Jenia Gorokhovsky - Algorithms Team Lead @ Taboola June 2018
Deep Model Serving: Scale and ergonomics
![Page 2: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/2.jpg)
Outline
• Problem setup
• What do we want from deep model serving?
– From the data science perspective
– From the engineering perspective
• Design considerations
• Implementation
![Page 3: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/3.jpg)
You’ve Seen Us Before
Enabling people to discover information at that moment when they’re likely to engage
![Page 4: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/4.jpg)
Algo Engineer = Software Engineer + Data Scientist
![Page 5: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/5.jpg)
Online vs. Offline Data Science
![Page 6: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/6.jpg)
Offline Data Science Cycle
Train model -> Look at performance on test set
![Page 7: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/7.jpg)
Why go for online experiments?
• Many models are not measured by test set accuracy
– Recommendation systems
– Digital assistants
– Art generation services
• Always useful to see how models are actually used
![Page 8: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/8.jpg)
Online Data Science Cycle
Train -> Deploy -> Analyze online metrics
![Page 9: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/9.jpg)
Serving for Data Scientists
![Page 10: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/10.jpg)
A Simple Recommender Flow
Train time-based models -> Rank items -> Report results -> Compare models
The model uses title and time of day to predict click probability
For example, P(click | <“Buy shaving razors”, 7:30am>)
![Page 11: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/11.jpg)
A Simple Recommender Flow
Items with the highest predicted click probability are displayed
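The ranking step can be sketched in a few lines of Python. This is only an illustration: `toy_model` is a hypothetical stand-in for a trained model, and its numbers are made up. The point is that the model scores each (title, time of day) pair and the top-scoring items are the ones shown.

```python
from typing import Callable, List


def rank_items(titles: List[str], hour: float,
               predict_click: Callable[[str, float], float],
               top_k: int) -> List[str]:
    """Score every title for the given hour and keep the top_k highest."""
    scored = [(predict_click(title, hour), title) for title in titles]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:top_k]]


def toy_model(title: str, hour: float) -> float:
    """Hypothetical stand-in for P(click | <title, time of day>)."""
    score = 0.05
    if "razor" in title.lower() and 6 <= hour < 9:
        score += 0.10  # made-up morning boost
    if "pizza" in title.lower() and hour >= 18:
        score += 0.10  # made-up evening boost
    return score


titles = ["Buy shaving razors", "Order pizza tonight", "Learn to knit"]
morning = rank_items(titles, 7.5, toy_model, top_k=2)
evening = rank_items(titles, 19.0, toy_model, top_k=2)
```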
![Page 12: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/12.jpg)
A Simple Recommender Flow
Report the time, the date, and how many of the items recommended by my model were clicked
![Page 13: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/13.jpg)
Non-trivial Experiment Design
• Timeframes matter a lot
– Model freshness
– Time of day / day of week / holidays
• Stickiness sometimes matters
• Traffic required to reach confidence
– Depends on variance/measurement noise
– Depends on independence assumptions
• Traffic for experiments might need to be fixed
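To make "traffic required to reach confidence" concrete, here is a rough sketch using the textbook sample-size formula for comparing two click-through rates with a two-sided z-test. It is not taken from the talk. It assumes impressions are independent (exactly the assumption the slide warns about), and the function name and default thresholds are illustrative.

```python
from math import ceil
from statistics import NormalDist


def impressions_per_variant(baseline_ctr: float, relative_lift: float,
                            alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate impressions needed per variant to detect a relative CTR lift.

    Uses n = (z_{1-alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p2 - p1)^2
    and assumes independent impressions.
    """
    p1 = baseline_ctr
    p2 = baseline_ctr * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


small_lift = impressions_per_variant(baseline_ctr=0.01, relative_lift=0.05)
large_lift = impressions_per_variant(baseline_ctr=0.01, relative_lift=0.10)
```

Halving the lift to detect roughly quadruples the traffic needed, which is why the number of concurrent experiments is limited by available traffic.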
![Page 14: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/14.jpg)
Scale for data science
Concurrency - how many different experiments are running?
Throughput - how much of the traffic is helping answer questions?
Visibility - can I know what happened with each experiment?
Reliability - how certain am I that the results are correct?
![Page 15: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/15.jpg)
Serving for Engineers
![Page 16: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/16.jpg)
Production Loop
Configure -> Deploy -> Monitor
![Page 17: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/17.jpg)
Deploying Deep Learning Models
• Single model resource usage is complex
– Changes with geography and time of day
– Depends on architecture properties
– Changes with inputs
– Mostly not linear
• Multiple experiments require complex deployments
– Could be coordinated
– Could be periodic
– Distinguish between default / experiment / debug models
![Page 18: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/18.jpg)
Scale for production systems
Latency - how quickly can I get an answer?
Throughput - how many requests can I serve on given hardware?
Reliability - how likely am I to get an answer?
Visibility - what is running? Where? With what performance?
![Page 19: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/19.jpg)
Having it all
![Page 20: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/20.jpg)
Data Science
• Most traffic used for experiments
• Flexibility in designing these experiments
• Ability to look at experiment results
• Reliability
Engineering
• As much traffic as hardware allows
• Responses return within latency budget
• Ability to monitor running services
• Reliability
![Page 21: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/21.jpg)
Need to prioritize
![Page 22: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/22.jpg)
Need to coordinate
![Page 23: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/23.jpg)
The “variant” concept
• Configuration for Deployment
• Configuration for Training
• Service name for Routing
• ID for Monitoring/Reporting
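One way to picture a variant is as a single record that ties these four things together. The sketch below is an assumption about shape only: the class, field names, and example values are hypothetical, not the platform's actual data model.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class Variant:
    """Hypothetical record bundling the four aspects of a variant."""
    variant_id: str                      # ID for monitoring/reporting
    service_name: str                    # service name used for routing
    training_config: Dict[str, Any] = field(default_factory=dict)
    deployment_config: Dict[str, Any] = field(default_factory=dict)

    def metric_labels(self) -> Dict[str, str]:
        """Labels to attach to every metric and report row for this variant."""
        return {"variant": self.variant_id, "service": self.service_name}


example = Variant(
    variant_id="time-of-day-v2",
    service_name="ctr-time-of-day-v2",
    training_config={"features": ["title", "time_of_day"]},
    deployment_config={"replicas": 4, "cpu_cores": 2},
)
```

Because data scientists and engineers refer to the same ID, an experiment result and a latency alert can be traced back to the same training and deployment configuration.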
![Page 24: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/24.jpg)
Our Example variant
• Recommends the same items every hour! Maybe we need to add features!
• Does better than the default model on weekends!
• Latency is >20ms (too high). We need more cores!
![Page 25: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/25.jpg)
But how? Designing a serving solution
![Page 26: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/26.jpg)
Microservice
• Stateless
• Horizontally scalable
• No external dependencies
• Compute only
Black box
• Hard to validate outputs
• No runtime config
• No internal state to monitor
• Complicated resource footprint
![Page 27: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/27.jpg)
Train -> Deploy Pipeline
Config -> Training -> Model -> Serving Docker -> Service
![Page 28: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/28.jpg)
Train -> Deploy Pipeline
Training code (over TensorFlow) takes configuration and trains the model
![Page 29: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/29.jpg)
Train -> Deploy Pipeline
The model is packaged into a Docker container running TensorFlow Serving
![Page 30: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/30.jpg)
Train -> Deploy Pipeline
A Kubernetes service runs as many Docker containers as needed
![Page 31: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/31.jpg)
Components
• Config
• Orchestrator
• Serving on Kubernetes (CPUs)
• Monitoring
• Reporting
• Training on Kubernetes (GPUs)
• Code + Model
![Page 32: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/32.jpg)
Nuts and Bolts: implementation
![Page 33: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/33.jpg)
Train -> Deploy Pipeline
Training code takes configuration and trains the model
![Page 34: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/34.jpg)
Building Graphs
• Use TensorFlow
– Compose pre-made or custom tf.Layer
– Stock or custom tf.Estimator
– Prefer high-level reusable components
• Use external configuration
• Save as SavedModel
![Page 35: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/35.jpg)
Solid configuration model
• Descriptive
– What is in the model?
– Who put it there?
– What models should go together?
• Hierarchical
– Often we only want to test a small thing
– Often makes sense to compare to “default”
• Thorough
– Shouldn’t be necessary to look at the code at any given point
• Points to the code
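The "hierarchical" property can be illustrated with a small merge function: an experiment states only what it changes, and everything else is inherited from the default. The keys and values below are invented for illustration and do not reflect the real configuration schema.

```python
from copy import deepcopy
from typing import Any, Dict


def resolve(default: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return default with overrides applied recursively; inputs are not mutated."""
    merged = deepcopy(default)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = resolve(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical default configuration: descriptive, and it points to the code.
DEFAULT = {
    "owner": "algo-team",
    "code_ref": "<git commit of the training code>",
    "features": ["title", "time_of_day"],
    "model": {"embedding_dim": 64, "hidden_layers": [256, 128]},
}

# The experiment only states the small thing it wants to test.
experiment = resolve(DEFAULT, {"owner": "jane", "model": {"embedding_dim": 128}})
```

Comparing an experiment to "default" then amounts to looking at its overrides.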
![Page 36: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/36.jpg)
Spring Cloud Config
• Uses git as backend
• Has REST interface (for Python!)
• Flexible
– Multiple profiles
– Inheritance
– Dynamic profile loading
– Work from a branch
• Reliable
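A rough sketch of how Python code might consume that REST interface. It relies on the Config Server's `/{application}/{profile}/{label}` endpoint and on the `propertySources` list in its JSON response, where earlier entries take precedence. The host, application name, profile names, and property keys are all made up, and the response is hard-coded so that no server is needed.

```python
from typing import Any, Dict, List


def config_url(base_url: str, application: str,
               profiles: List[str], label: str) -> str:
    """Build the config endpoint URL; slashes in a branch name become "(_)"."""
    return "{}/{}/{}/{}".format(
        base_url.rstrip("/"), application, ",".join(profiles),
        label.replace("/", "(_)"))


def flatten_sources(response: Dict[str, Any]) -> Dict[str, Any]:
    """Merge property sources so that earlier (higher-priority) entries win."""
    merged: Dict[str, Any] = {}
    for source in reversed(response["propertySources"]):
        merged.update(source["source"])
    return merged


url = config_url("http://config.example.com:8888", "ctr-model",
                 ["experiment-a"], "feature/bigger-embeddings")

# Hand-written example of the response shape (not real server output).
sample_response = {
    "name": "ctr-model",
    "profiles": ["experiment-a"],
    "label": "feature/bigger-embeddings",
    "propertySources": [
        {"name": "ctr-model-experiment-a.yml",
         "source": {"model.embedding_dim": 128}},
        {"name": "ctr-model.yml",
         "source": {"model.embedding_dim": 64, "model.hidden_layers[0]": 256}},
    ],
}
settings = flatten_sources(sample_response)
```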
![Page 37: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/37.jpg)
Train -> Deploy Pipeline
The model is packaged into a Docker container running TensorFlow Serving
![Page 38: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/38.jpg)
TensorFlow Serving
Serves TensorFlow models!
• Written in C++
• Uses gRPC for communications
– Built on protobuf
– Wire protocol is HTTP/2
– Good load balancer support
• Efficient
– Low overhead
– Good threading model
– Automatic batching
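For a feel of what a prediction call looks like, the sketch below builds (but does not send) a request for TensorFlow Serving's REST predict endpoint. The slides describe gRPC; REST is used here only because it can be shown with the standard library. The host, model name, and feature names are hypothetical, and port 8501 is an assumption about how the server was started.

```python
import json
from typing import Any, Dict, List
from urllib.request import Request


def build_predict_request(host: str, model_name: str,
                          instances: List[Dict[str, Any]],
                          port: int = 8501) -> Request:
    """Build a POST request for /v1/models/<name>:predict without sending it."""
    url = "http://{}:{}/v1/models/{}:predict".format(host, port, model_name)
    body = json.dumps({"instances": instances}).encode("utf-8")
    return Request(url, data=body,
                   headers={"Content-Type": "application/json"},
                   method="POST")


request = build_predict_request(
    "localhost", "ctr_model",
    [{"title": "Buy shaving razors", "hour": 7.5}],  # hypothetical features
)
# Passing `request` to urllib.request.urlopen would send it to a running server.
```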
![Page 39: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/39.jpg)
Train -> Deploy Pipeline
A Kubernetes service runs as many Docker containers as needed
![Page 40: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/40.jpg)
Containerized deployment
• Dynamic scaling over variants
– Let’s give these two 5% traffic
• Scaling over time
– The Japanese language model should get more traffic now
• Monitor at different granularities
– Individual variant (for optimizations)
– Individual host/data center
– The entire system (for sizing)
• Integration tests
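"Let's give these two 5% traffic" can be sketched as deterministic bucketing: hash a stable key into [0, 1) and map ranges of that interval to variants. This is an illustrative approach, not necessarily how the router described in the talk works. The salt, user IDs, and variant names are made up. Hashing the user ID also gives stickiness, since the same user always lands on the same variant.

```python
import hashlib
from typing import Dict, List, Tuple


def assign_variant(user_id: str, allocations: List[Tuple[str, float]],
                   default: str = "default", salt: str = "exp-2018-06") -> str:
    """Map a user to a variant according to traffic shares; the rest get default."""
    digest = hashlib.sha256((salt + ":" + user_id).encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") / 2 ** 64  # uniform in [0, 1)
    upper = 0.0
    for name, share in allocations:
        upper += share
        if point < upper:
            return name
    return default


allocations = [("variant-a", 0.05), ("variant-b", 0.05)]
counts: Dict[str, int] = {}
for i in range(20000):
    chosen = assign_variant("user-%d" % i, allocations)
    counts[chosen] = counts.get(chosen, 0) + 1
```

Changing the salt reshuffles users between experiments, and changing a share only moves the users near the boundary of that range.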
![Page 41: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/41.jpg)
Why Kubernetes?
• Out-of-the-box scaling
– Dynamic scaling
– Service discovery
– Load balancers (pluggable)
• Deployment and monitoring are easy
– Blue/green deployments
– Liveness and readiness tests
• Easy to use
– Great UI with dynamic editing
– IP address for every container
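As a sketch of what a per-variant deployment could look like, the function below generates a Kubernetes Deployment manifest as a plain dictionary, with the replica count for scaling and readiness/liveness probes. The labels, image name, and resource values are invented, and port 8500 is assumed to be the gRPC port the serving container listens on.

```python
import json
from typing import Any, Dict


def serving_deployment(variant_id: str, image: str,
                       replicas: int, cpu: str = "2") -> Dict[str, Any]:
    """Build a Deployment manifest for one variant's serving containers."""
    labels = {"app": "model-serving", "variant": variant_id}
    probe = {"tcpSocket": {"port": 8500}, "periodSeconds": 10}
    container = {
        "name": "tf-serving",
        "image": image,
        "ports": [{"containerPort": 8500}],
        "readinessProbe": probe,                               # ready for traffic?
        "livenessProbe": dict(probe, initialDelaySeconds=30),  # restart if stuck
        "resources": {"requests": {"cpu": cpu}},
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "serving-" + variant_id, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


manifest = serving_deployment("time-of-day-v2",
                              "registry.example.com/ctr-model:42", replicas=4)
manifest_json = json.dumps(manifest, indent=2)  # kubectl accepts JSON manifests
```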
![Page 42: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/42.jpg)
The Recipe
1. Train your model with TensorFlow
2. Put TF Serving + model in docker
3. Deploy on Kubernetes
4. Serve real traffic
5. Analyze online metrics
Most routine data science is done in #1 and #5 (and in Python)
![Page 43: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/43.jpg)
But what about... More tips
![Page 44: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/44.jpg)
Managing Schema
The one predictable part of the service
• Which features?
• What are the shapes?
• Is padding used?
• What is the vocabulary?
![Page 45: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/45.jpg)
Managing Schema
Only move forward!
• Features are added to all services
• All models “read” all features but only use some
• Deprecate and stop filling instead of removing
• Like protobuf
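The "only move forward" rule can be sketched as an append-only feature list: every request carries every feature, deprecated features stay in the schema but are no longer filled, and new features are added at the end. The feature names and defaults here are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Feature:
    name: str
    default: Any
    deprecated: bool = False


# Append-only schema: entries are never removed, only marked deprecated.
SCHEMA: List[Feature] = [
    Feature("title", ""),
    Feature("time_of_day", 0.0),
    Feature("legacy_category", "", deprecated=True),  # kept, but no longer filled
    Feature("day_of_week", 0),                        # newest feature goes last
]


def build_request(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Fill every schema feature; deprecated ones always get their default."""
    row = {}
    for feature in SCHEMA:
        if feature.deprecated:
            row[feature.name] = feature.default
        else:
            row[feature.name] = raw.get(feature.name, feature.default)
    return row


row = build_request({"title": "Buy shaving razors", "legacy_category": "news"})
```

Older models that still expect `legacy_category` keep receiving a well-formed input, while newer models simply ignore it.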
![Page 46: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/46.jpg)
Tracing
• Supported by TensorFlow Serving
• Can be enabled per request
![Page 47: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/47.jpg)
And does it work?
![Page 48: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing](https://reader033.vdocument.in/reader033/viewer/2022042410/5f27d0eb32638365c7751220/html5/thumbnails/48.jpg)
Deep Learning in Taboola
• 4 different algorithmic families
• 40 data scientists using the platform
• 60 models trained / day
• 5 data centers
• 150K+ requests/sec
• 20ms typical response time