michelle casbon yow! brisbane december 4, 2018 kubeflow ... · run wherever k8s runs use k8s to...

Post on 09-Jul-2020

8 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Kubeflow Explained: NLP Architectureson Kubernetes

Michelle CasbonYOW!BrisbaneDecember 4, 2018

@texasmichelle

whoami

@texasmichelle

Agenda

Problems Goals What's inside Demo Future Direction

1 2 3 4 5

@texasmichelle

Move along

Is this a clearly defined

problem?

Can it be solved in a

deterministic way?

Do that

Dive in

No

No

Yes

YesML decision tree

Credit: David Andrzejewski

@texasmichelle

Counting things is still really hard.

MACHINELEARNING

@texasmichelle

https://github.com/kubeflow/examples/demos

@texasmichelle

A curated set of compatible tools and artifacts that lays a foundation for

running production ML appsEnables consistency across deployments by providing Kubernetes object

templates that bring together disparate components

@texasmichelle

Infrastructure

Application

Platform

GCP

Yelp Sentiment

Kubeflow

GCP

Sentiment

Kubeflow

@texasmichelle

2

Agenda

Problems Goals What's inside Demo Future Direction

1 3 4 5

@texasmichelle

Production code

@texasmichelle

Moving from local to production

Credit: Jörg Wagner and Stefan Prehn

PortabilityPackage infrastructure components together

GCP

Sentiment

Kubeflow

@texasmichelle

ComplexityGCP

Sentiment

Kubeflow

@texasmichelle

Perception

Credit: Hidden Technical Debt of Machine Learning Systems, D. Sculley, et al.

GCP

Sentiment

Kubeflow

@texasmichelle

Reality

Credit: Hidden Technical Debt of Machine Learning Systems, D. Sculley, et al.

GCP

Sentiment

Kubeflow

@texasmichelle

GCP

Sentiment

Kubeflow

Feature ExtractionData Ingestion

Data Exploration

Data Transformation

Data Validation

Data Analysis

Training Data Segmentation

Model Building

Model Validation

Model Versioning

Model Auditing

Distributed Training

Continuous Training

Process Management

Configuration

Resource Management

Monitoring

Logging

Continuous Delivery

Authentication/ Authorization

Serving Infrastructure

UI

Business Logic

Load Balancing

Data Featurization Training Application Platform

@texasmichelle

Complexity

ComposabilityLogical groupings

Reusable components

GCP

Sentiment

Kubeflow

@texasmichelle

Maintainability

● Error resolution, recovery, & prevention● Speed of iteration● Versioning Composability

Shorten the development lifecycle

Automation

GCP

Sentiment

Kubeflow

@texasmichelle

Capacity Planning

ScalabilityKubernetes

Autoprovisioning

● Usage patterns● Demand spikes● Efficient resource usage

GCP

Sentiment

Kubeflow

@texasmichelle

1 2

Agenda

3 4 5

Problems Goals What's inside Demo Future Direction

@texasmichelle

Make it easy for everyone to develop, deploy, and manage portable,

scalable ML everywhere

@texasmichelle

Kubeflow

Composability

Single, unified tool for common processes

Portability

Entire stack

Scalability

Native to k8s

Reduce variability between services & environments

Full product lifecycle

Support specialized hardware, like GPUs & TPUs

Reduce costs

Improve model performance

GCP

Sentiment

Kubeflow

@texasmichelle

KubeflowWho

Data scientists

ML researchers

Software engineers

Product managers

Why

Because building a platform is too big of a problem to tackle alone

What

Portable ML products on k8s

v0.3.4 release

https://github.com/kubeflow/kubeflow

GCP

Sentiment

Kubeflow

@texasmichelle

KubeflowKubernetes-native platform for ML

Run wherever k8s runs

Use k8s to manage ML tasks

CRDs for distributed training

Adopt k8s patterns

Microservices

Manage infra declaratively

Package infrastructure components together

Ksonnet

Move between local -> dev -> test -> prod -> onprem

Support multiple ML frameworks

Tensorflow

Pytorch

Scikit

Xgboost

Et al.

GCP

Sentiment

Kubeflow

@texasmichelle

2

Agenda

1 3 4 5

Problems Goals What's inside Demo Future Direction

@texasmichelle

But what is it?

@texasmichelle

@texasmichelle

A curated set of compatible tools and artifacts that lays a foundation for

running production ML appsEnables consistency across deployments by providing Kubernetes object

templates that bring together disparate components

@texasmichelle

GKEIngress

(e.g. Ambassador)

Pipelines Controllers

ArgoControllers

Katib HP Tuning Controllers

IAP

Central Dashboard

JupyterHub TF Job Dashboard

TF JobOperator

PipelinesDashboard

ArgoDashboard

Katib HP TuningDashboard

Pytorch Operator

What's Inside v0.3?GCP

Sentiment

Kubeflow

@texasmichelle

Click-to-deploy

GCP

Sentiment

Kubeflow

@texasmichelle

What's new in v0.3?● Deploy

○ Click-to-deploy

○ CLI (kfctl)

● Develop

○ Argo

○ Pytorch operator

○ Hyperparameter tuning StudyJob CRD

○ Kubeflow Pipelines

● Autoprovisioning

GCP

Sentiment

Kubeflow

@texasmichelle

Pipelines● End-to-end ML workflows● Orchestration● Service integration● Components & sharing● Job tracking, experimentation,

monitoring● Notebook integration

GCP

Sentiment

Kubeflow

@texasmichelle

2

Agenda

1 3 4 5

Problems Goals What's inside Demo Future Direction

@texasmichelle

https://github.com/kubeflow/examples/demos

GKEIngress

(e.g. Ambassador)

Pipelines Controllers

ArgoControllers

Katib HP Tuning Controllers

IAP

Central Dashboard

JupyterHub TF Job Dashboard

TF JobOperator

PipelinesDashboard

ArgoDashboard

Katib HP TuningDashboard

Pytorch Operator

TF Serving

Application UI

Notebook

TF Job

Parameter Server1

TensorFlow Master

TensorFlow

Workers1 2 3Pipeline

StudyJob

GCP

Sentiment

Kubeflow

@texasmichelle

Try it Yourself● deploy.kubeflow.cloud (Click-to-deploy)● codelabs.developers.google.com

○ Intro to Kubeflow on Google Kubernetes Engine: https://goo.gl/192bs7○ Kubeflow End-to-End: GitHub Issue Summarization: https://goo.gl/qLXUTG

● Qwiklabs: https://qwiklabs.com● Katacoda: https://www.katacoda.com/kubeflow● GitHub: https://github.com/kubeflow/examples/tree/master/github_issue_summarization● http://gh-demo.kubeflow.org

GCP

Sentiment

Kubeflow

@texasmichelle

2

Agenda

1 3 4 5

Problems Goals What's inside Demo Future Direction

@texasmichelle

Roadmap● 0.4 in December

● Enterprise readiness

○ 1.0 with hardened APIs

○ IAM/RBAC

○ Clean deployments, upgrades

● Better Jupyter Notebook integration

● Pipeline experiment comparison

● Model management

● Test release infrastructure

● You tell us! (Or better yet, help!)

GCP

Sentiment

Kubeflow

@texasmichelle

Just the Beginning● Kubeflow is open

○ Open community○ Open design○ Open source○ Open to ideas

● You tell us! Get involved○ github.com/kubeflow○ kubeflow.slack.com○ @kubeflow○ kubeflow-discuss@googlegroups.com○ Community call Tuesdays alternating

8:30am and 5:30pm Pacific○ Kubeflow Contributor Summit

■ Q1 2019

GCP

Sentiment

Kubeflow

@texasmichelle

Questions?

top related