deep model serving - s3-eu-west-1.amazonaws.com€¦ · the “variant” concept deep model...

48
Jenia Gorokhovsky - Algorithms Team Lead @ Taboola June 2018 Deep Model Serving Scale and ergonomics

Upload: others

Post on 07-Jul-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

Jenia Gorokhovsky - Algorithms Team Lead @ Taboola June 2018

Deep Model ServingScale and ergonomics

Page 2: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

2• Problem setup

• What do we want from deep model serving

– From the Data Science Perspective

– From the Engineering Perspective

• Design Considerations

• Implementation

OutlineDEEP MODEL SERVING

Page 3: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

You’ve Seen Us Before

Enabling people to discover information at that moment when they’re likely to engage

Page 4: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

Algo Engineer = Software Engineer + Data Scientist

Page 5: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

Online vs. Offline Data Science

Page 6: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

6Offline Data Science Cycle

Train model Look at performance on test set

DEEP MODEL SERVING

Page 7: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

7Why go for online experiments?

• Many models are not measured by test set accuracy– Recommendation systems

– Digital assistants

– Art generation services

• Always useful to see how models are actually used

DEEP MODEL SERVING

Page 8: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

8Online Data Science Cycle

Train DeployAnalyzeOnline Metrics

DEEP MODEL SERVING

Page 9: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

Serving for

Data Scientists

Page 10: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

10

Model uses title and time of day to predict click probability

For ex. P(click | <“Buy shaving razors”, 7:30am>)

A Simple Recommender FlowSERVING FOR DATA SCIENTISTS

Train time based models

Rank items Report results

Compare models

Page 11: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

11A Simple Recommender Flow

SERVING FOR DATA SCIENTISTS

Train time based models

Rank items Report results

Compare models

Items with highest probability are displayed

Page 12: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

12A Simple Recommender Flow

SERVING FOR DATA SCIENTISTS

Train time based models

Rank items Report results

Compare models

Report the time, date, and how many items were clicked on my model

Page 13: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

13Non-trivial Experiment Design

• Timeframes matter a lot– Model freshness

– Time of day / day of week / holidays

• Stickiness sometimes matters

• Traffic required to reach confidence

– Depends on variance/measurement noise

– Depends on independence assumptions

• Traffic for experiments might need to be fixed

SERVING FOR DATA SCIENTISTS

Page 14: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

14Scale for data science

SERVING FOR DATA SCIENTISTS

Concurrency - how many different experiments are running?

Throughput - how much of the traffic is helping answer questions?

Visibility - can I know what happened with each experiment?

Reliability - how certain I am that results are correct?

Page 15: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

Serving for

Engineers

Page 16: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

16Production Loop

Configure Deploy Monitor

SERVING FOR ENGINEERS

Page 17: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

17Deploying Deep Learning Models

SERVING FOR ENGINEERS

• Single model resource usage is complex– Changes with geography and time of day

– Depends on architecture properties

– Changes with inputs

– Mostly not linear

• Multiple experiments require complex deployments

– Could be coordinated

– Could be periodic

– Distinguish between default / experiment / debug models

Page 18: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

18Scale for production systems

SERVING FOR ENGINEERS

Latency - how quickly can I get an answer?

Throughput - how many requests I can serve on given hardware?

Reliability - how likely I am to get an answer?

Visibility - what is running? where? With what performance?

Page 19: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

Having it all

Page 20: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

As much traffic as hardware allows

Responses return within latency budget

Most traffic used for experiments

Flexibility in designing these experiments

Data Science Engineering

Ability to look at experiment results

Reliability

Ability to monitor running services

Reliability

Page 21: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

As much traffic as hardware allows

Responses return within latency budget

Most traffic used for experiments

Flexibility in designing these experiments

Data Science Engineering

Ability to look at experiment results

Reliability

Ability to monitor running services

ReliabilityNeed to prioritize

Page 22: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

As much traffic as hardware allows

Responses return within latency budget

Most traffic used for experiments

Flexibility in designing these experiments

Data Science Engineering

Ability to look at experiment results

Reliability

Ability to monitor running services

Reliability

Need to coordinate

Page 23: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

23The “variant” concept

DEEP MODEL SERVING

Configuration for Deployment

Configuration for Training

Service name for RoutingID for

Monitoring/Reporting

Page 24: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

Recommends the same items every hour! Maybe we

need to add features!

Our Example variantDEEP MODEL SERVING

Does better than the default model

on weekends!

Latency is >20ms (too

high). We need more cores!

Page 25: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

But how?Designing a serving solution

Page 26: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

Stateless

Horizontally scalable

No external dependencies

Compute only

Hard to validate outputs

No runtime config

No internal state to monitor

Complicated resource footprint

Micro service Black box

Page 27: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

27Train -> Deploy Pipeline

DESIGNING DEEP MODEL SERVING SOLUTION

Training Config Model Serving

DockerService

Page 28: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

28Train -> Deploy Pipeline

DESIGNING DEEP MODEL SERVING SOLUTION

Training code (over Tensorflow) takes configuration and trains the model

Training Config Model Serving

DockerService

Page 29: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

29Train -> Deploy Pipeline

DESIGNING DEEP MODEL SERVING SOLUTION

Training Config Model Serving

DockerService

Model is packaged into a docker running Tensorflow Serving

Page 30: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

30Train -> Deploy Pipeline

DESIGNING DEEP MODEL SERVING SOLUTION

Training Config Model Serving

DockerService

A Kubernetes service runs as many dockers as needed

Page 31: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

ComponentsDESIGNING DEEP MODEL SERVING SOLUTION

Config

Orchestrator

Serving on Kubernetes

(CPUs)

Monitoring Reporting

Training on Kubernetes

(GPUs)

Code + Model

Page 32: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

Nuts and Boltsimplementation

Page 33: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

33Train -> Deploy Pipeline

DEEP MODEL SERVING: NUTS AND BOLTS

Training code takes configuration and trains the model

Training Config Model Serving

DockerService

Page 34: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

34Building Graphs

• Use Tensorflow– Compose pre-made or custom tf.Layer

– Stock or custom tf.Estimator

– Prefer high level reusable components

• Use external configuration

• Save as SavedModel

DEEP MODEL SERVING: NUTS AND BOLTS

Page 35: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

35• Descriptive

– What is in the model?

– Who put it there?

– What models should go together?

• Hierarchical

– Often we only want to test a small thing

– Often makes sense to compare to “default”

• Thorough– Shouldn’t be necessary to look at code at given point

• Points to the code

Solid configuration modelDEEP MODEL SERVING: NUTS AND BOLTS

Page 36: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

36• Uses git as backend

• Has REST interface (for Python!)

• Flexible

– Multiple profiles

– Inheritance

– Dynamic profile loading

– Work from a branch

• Reliable

Spring cloud configDEEP MODEL SERVING: NUTS AND BOLTS

Page 37: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

37Train -> Deploy Pipeline

DEEP MODEL SERVING: NUTS AND BOLTS

Training Config Model Serving

DockerService

Model is packaged into a docker running Tensorflow Serving

Page 38: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

38Serves TensorFlow models!

• Written in C++

• Uses gRPC for communications

– A protobuf extension

– Wire protocol is HTTP/2.0

– Good load balancer support

• Efficient– Low overhead

– Good threading model

– Automatic batching

TensorFlow ServingDEEP MODEL SERVING: NUTS AND BOLTS

Page 39: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

39Train -> Deploy Pipeline

DESIGNING DEEP MODEL SERVING SOLUTION

Training Config Model Serving

DockerService

A Kubernetes service runs as many dockers as needed

Page 40: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

40• Dynamic scaling over variants

– Let’s give these two 5% traffic

• Scaling over time

– The Japanese language model should get more traffic now

• Monitor at different granularities

– Individual variant (for optimizations)

– Individual host/data centre

– The entire system (for sizing)

• Integration tests

Containerized deploymentDEEP MODEL SERVING: NUTS AND BOLTS

Page 41: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

41• Out-of-the-box scaling

– Dynamic scaling

– Service discovery

– Load balancers (pluggable)

• Deployment and monitoring is easy

– Blue/green deployments

– Liveness and readiness tests

• Easy to use– Great UI with dynamic editing

– IP address for every container

Why Kubernetes?DEEP MODEL SERVING: NUTS AND BOLTS

Page 42: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

421. Train your model with TensorFlow

2. Put TF Serving + model in docker

3. Deploy on Kubernetes

4. Serve real traffic

5. Analyze online metrics

The RecipeDEEP MODEL SERVING: NUTS AND BOLTS

Most routine data science is done in #1 and #5 (and in Python)

Page 43: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

But what about..more tips

Page 44: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

44The one predictable part of the service

• Which features?

• What are the shapes?

• Is padding used?

• What is the vocabulary?

Managing SchemaDEEP MODEL SERVING: MORE TIPS

Page 45: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

45Only move forward!

• Features are added to all services

• All models “read” all features but only use some

• Deprecate and stop filling instead of removing

• Like protobuf

Managing SchemaDEEP MODEL SERVING: MORE TIPS

Page 46: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

Supported by Tensorflow Serving

Can be enabled per request

Tracing DEEP MODEL SERVING: MORE TIPS

Page 47: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

And does it work?

Page 48: Deep Model Serving - s3-eu-west-1.amazonaws.com€¦ · The “variant” concept DEEP MODEL SERVING Configuration for Deployment Configuration for Training Service name for Routing

4848Deep Learning in Taboola

DEEP MODEL SERVING

60

4Different algorithmic

families

40Data Scientists

Using the platform

Models trained / day

20ms

5Data centers

150K+Requests/sec

Typical response time