machine learning in production

Running Machine Learning Applications in Production

Sam BESSALAH (@samklr)

Might work well for Kaggle! But Kaggle isn’t real-world machine learning!

In Real Life

- Trade-off: accuracy vs. interpretability vs. speed vs. infrastructure constraints

- Interpretability and speed often beat accuracy

- Most of the time Kaggle is a feature engineering contest

- Contest oriented vs Real Product Impact

But in real life … Things are less obvious

Data Engineers

Data Pipeline

Data Scientists / ML Engineers

Applications Developers

Innovation is often (wrongly?) thought to be here ...

http://www.slideshare.net/jssm1th/an-architecture-for-agile-machine-learning-in-realtime-applications

@josh_wills

Production Requirements:

- Flexibility and agility

- Scalability and performance

- Enable real-time decision making, sometimes at very high QPS with sub-second latency.

- Security

Machine Learning as a Software Problem

- Most ML development patterns lead to software design anti-patterns

- Dependencies in code creep in through the models' dependencies on data

- Wasteful use of data: most ML model selection requires multiple versions of the data, hence the instability of the data and of the prediction services

- Breaks system isolation, leading to unmaintainable stacks

In Production, Machine Learning is a Software and System Problem.

Treat it accordingly!

Deployment / Model Serving: The Missing Part in ML

- Model serving is often ignored, or left to back-end engineers to implement as they see fit.

- Most often it means exposing an API or a service that runs the predict function. But that is often not enough.

- Software scaling can become a problem for the accuracy of the model.

- How many models are you serving?

- Are you running something else?

- Are you updating your model in real time?
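To make the questions above concrete, here is a minimal sketch of a prediction service that serves several named models and lets one of them be replaced while requests keep coming in. The `PredictionService` class and its `deploy` and `predict` methods are hypothetical names invented for this illustration, and a model is assumed to be any callable that maps a feature vector to a score.

```python
import threading
from typing import Callable, Dict, Sequence

# Assumption: a "model" is any callable from a feature vector to a score.
Model = Callable[[Sequence[float]], float]


class PredictionService:
    """Hypothetical in-process service holding several named models."""

    def __init__(self) -> None:
        self._models: Dict[str, Model] = {}
        self._lock = threading.Lock()

    def deploy(self, name: str, model: Model) -> None:
        # Swapping the reference under a lock lets a model be updated
        # while other threads are still calling predict().
        with self._lock:
            self._models[name] = model

    def predict(self, name: str, features: Sequence[float]) -> float:
        # Hold the lock only long enough to fetch the current model,
        # so a slow prediction never blocks a deployment.
        with self._lock:
            model = self._models[name]
        return model(features)

    def served_models(self) -> int:
        with self._lock:
            return len(self._models)


service = PredictionService()
service.deploy("ranker", lambda x: sum(x))
service.deploy("fraud", lambda x: max(x))
first = service.predict("ranker", [1.0, 2.0])

# Update the ranker in place; callers keep using the same name.
service.deploy("ranker", lambda x: 2.0 * sum(x))
second = service.predict("ranker", [1.0, 2.0])
```

In a real system the same idea would sit behind an API or RPC layer; the sketch only shows the bookkeeping that the questions above point at.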

Example: Airbnb

- Trained Models are stored in PMML files

- They serve their models via Openscoring

- They do most of the experiments in a different pipeline.

PMML?

- Might be the solution for some (most?) cases

- Supports many models, but lacks support for many others

- Fails to capture the evolution of your modeling process: transformations, re-encoding, etc.

- Better suited to exporting models to other systems than to serving user-facing machine learning products.

- And … XML? Really?

Model Versioning - Packaging

- You usually serve not one model but many, especially when running experiments.

- You should strive to package your models in a versioned way.

- Git is awesome, but not appropriate for live model serving

- Build a model repository or a model index

- I usually use a fast KV store or a more advanced data store to save my models

- Build a service to manage your models (Model Manager), responsible for evaluating and updating them.
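A minimal sketch of the model repository and Model Manager idea, under loud assumptions: a plain dict stands in for the KV store, `pickle` stands in for a proper serialization format, and the `ModelManager` class, its method names, and the `name:version` key layout are all hypothetical.

```python
import pickle
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class ModelManager:
    """Hypothetical model manager on top of a key-value store.

    A dict plays the role of the KV store; a real deployment would use a
    networked store with equivalent get/put semantics.
    """

    store: Dict[str, bytes] = field(default_factory=dict)
    latest: Dict[str, int] = field(default_factory=dict)  # name -> highest version
    live: Dict[str, int] = field(default_factory=dict)    # name -> version served

    def register(self, name: str, model: Any) -> int:
        # Each registration gets the next version number and its own key.
        version = self.latest.get(name, 0) + 1
        # pickle is used for brevity only; never unpickle untrusted bytes.
        self.store[f"{name}:{version}"] = pickle.dumps(model)
        self.latest[name] = version
        return version

    def load(self, name: str, version: Optional[int] = None) -> Any:
        # Without an explicit version, return the one currently served.
        if version is None:
            version = self.live[name]
        return pickle.loads(self.store[f"{name}:{version}"])

    def promote_if_better(
        self, name: str, candidate: int, evaluate: Callable[[Any], float]
    ) -> bool:
        # Promote the candidate if nothing is live yet, or if it scores
        # strictly higher than the live version on the given metric.
        current = self.live.get(name)
        if current is None or evaluate(self.load(name, candidate)) > evaluate(
            self.load(name, current)
        ):
            self.live[name] = candidate
            return True
        return False


manager = ModelManager()
v1 = manager.register("churn", {"threshold": 0.9})
v2 = manager.register("churn", {"threshold": 0.5})


def toy_score(model: Dict[str, float]) -> float:
    # Toy metric for illustration: closer to 0.5 is "better".
    return 1.0 - abs(model["threshold"] - 0.5)


manager.promote_if_better("churn", v1, toy_score)
promoted = manager.promote_if_better("churn", v2, toy_score)
```

Older versions stay addressable by key, so rolling back is just a matter of pointing `live` at a previous version.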

TensorFlow Serving

Serialization

- Remember PMML?

- In Big Data, data has a schema and proper schema evolution

- Why not models?

- Lots to choose from: Protobuf, Avro

- Use a binary schema to represent and version your models
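To illustrate what a versioned binary schema buys you, here is a sketch that uses the standard-library `struct` module as a stand-in for Protobuf or Avro (it is not their API). The byte layout, the `MAGIC` tag, and the two schema versions are assumptions made up for this example: version 1 stores only the weights of a linear model, version 2 adds a bias, and the decoder reads both.

```python
import struct
from typing import List, Tuple

MAGIC = b"MDL0"       # hypothetical 4-byte tag identifying our format
HEADER = "<4sHI"      # magic, schema version (uint16), weight count (uint32)


def encode_v1(weights: List[float]) -> bytes:
    # Schema version 1: header followed by the weights as float64.
    header = struct.pack(HEADER, MAGIC, 1, len(weights))
    return header + struct.pack(f"<{len(weights)}d", *weights)


def encode_v2(weights: List[float], bias: float) -> bytes:
    # Schema version 2: same as version 1, plus a trailing float64 bias.
    header = struct.pack(HEADER, MAGIC, 2, len(weights))
    body = struct.pack(f"<{len(weights)}d", *weights)
    return header + body + struct.pack("<d", bias)


def decode(blob: bytes) -> Tuple[int, List[float], float]:
    magic, version, count = struct.unpack_from(HEADER, blob, 0)
    if magic != MAGIC:
        raise ValueError("not a model blob")
    offset = struct.calcsize(HEADER)
    weights = list(struct.unpack_from(f"<{count}d", blob, offset))
    offset += 8 * count
    if version == 1:
        bias = 0.0  # field absent in version 1, so fall back to a default
    elif version == 2:
        (bias,) = struct.unpack_from("<d", blob, offset)
    else:
        raise ValueError(f"unknown schema version {version}")
    return version, weights, bias


old_model = decode(encode_v1([1.0, 2.0]))
new_model = decode(encode_v2([0.5], -1.25))
```

The point is the version field: a serving system can keep reading models written by older training code, which is the kind of evolution that schema formats such as Protobuf and Avro are designed to support.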

Evaluation

- Business metrics often differ from core model metrics: there is a trade-off between long-term and short-term metrics.

- Hyperparameters

- A/B testing and the multi-armed bandit problem

Hyperparameters

Netflix

A/B Testing - Multi-armed Bandit

Dataiku
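As an illustration of the bandit framing, here is a sketch of epsilon-greedy, one simple multi-armed bandit strategy among many (the slides do not say which one is meant). Each arm would be a model variant; the conversion rates in `simulate` are made-up numbers and every class and function name is hypothetical.

```python
import random
from typing import List


class EpsilonGreedy:
    """Epsilon-greedy bandit: explore with probability epsilon, else exploit."""

    def __init__(self, n_arms: int, epsilon: float = 0.1, seed: int = 0) -> None:
        self.epsilon = epsilon
        self.counts: List[int] = [0] * n_arms      # pulls per arm
        self.values: List[float] = [0.0] * n_arms  # running mean reward per arm
        self.rng = random.Random(seed)

    def select(self) -> int:
        if self.rng.random() < self.epsilon:
            # Explore: pick an arm uniformly at random.
            return self.rng.randrange(len(self.counts))
        # Exploit: pick the arm with the best observed mean (ties go to the lowest index).
        return max(range(len(self.counts)), key=lambda arm: self.values[arm])

    def update(self, arm: int, reward: float) -> None:
        # Incremental mean: new_mean = old_mean + (reward - old_mean) / n
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(true_rates: List[float], steps: int = 5000, seed: int = 0) -> EpsilonGreedy:
    """Run the bandit against simulated Bernoulli rewards (illustrative only)."""
    bandit = EpsilonGreedy(len(true_rates), seed=seed)
    env = random.Random(seed + 1)
    for _ in range(steps):
        arm = bandit.select()
        reward = 1.0 if env.random() < true_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit


# Two hypothetical model variants with assumed conversion rates.
result = simulate([0.05, 0.5])
```

Unlike a fixed 50/50 A/B split, the bandit shifts traffic toward the variant that looks better while the experiment is still running, at the cost of noisier estimates for the losing arm.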

Experiments

Reproducibility

- How to keep track of the data used for training?

- Are notebooks enough?

- Jupyter Notebooks, Spark Notebooks, Zeppelin, etc.

- Need for an end-to-end solution: not perfect, but workable.
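One workable (not perfect) way to keep track of training data is to store a fingerprint of it next to every trained model. The sketch below assumes the training set fits in memory as a list of JSON-serializable dicts; `fingerprint_rows`, `run_manifest`, and the manifest fields are hypothetical names chosen for this example.

```python
import hashlib
import json
from typing import Any, Dict, List


def fingerprint_rows(rows: List[Dict[str, Any]]) -> str:
    """SHA-256 over a canonical JSON encoding of each row, in order."""
    digest = hashlib.sha256()
    for row in rows:
        # sort_keys makes the hash independent of dict key order;
        # row order still matters, which is intentional here.
        digest.update(json.dumps(row, sort_keys=True).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def run_manifest(
    rows: List[Dict[str, Any]], params: Dict[str, Any], code_version: str
) -> Dict[str, Any]:
    """Record what is needed to tell whether two training runs are comparable."""
    return {
        "data_sha256": fingerprint_rows(rows),
        "n_rows": len(rows),
        "params": params,
        "code_version": code_version,  # e.g. a commit hash, supplied by the caller
    }


train = [{"age": 31, "clicked": 1}, {"age": 45, "clicked": 0}]
manifest = run_manifest(train, {"max_depth": 3}, code_version="abc123")
```

Saving such a manifest alongside each model version makes it possible to check later whether two models were really trained on the same data with the same settings.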

I forgot many things

- Monitoring

- Pipeline tuning (one model is often fed to another one)

- RPC rather than REST for fast model serving?

- How to deal with heterogeneous systems?

- Do you really have to distribute your processing?

- Is more data better than smartly tuned algorithms?
