women in big data x pinterest · journey to k8s the future agenda. motivation unify orchestration...


Post on 05-Jun-2020


Page 1: Women in Big Data x Pinterest · Journey to K8S The Future Agenda. Motivation Unify orchestration and big data infrastructure simplify tech stack and reduce operation overhead Trending

Women in Big Data x Pinterest


Welcome!

Regina Karson, WiBD Chapter Director
Tian-Ying Chang, Engineering Manager


Goku: Pinterest’s In-House Time-Series Database

Tian-Ying Chang
Sr. Staff Engineer Manager
Pinterest


Confidential

Pinterest

● Discover new ideas and find inspiration to do the things they love
○ 300M+ MAU, billions of pins
● Metrics for monitoring site health
○ Latency, QPS, CPU, memory
● Metrics about product quality
○ MAU, impressions, etc.
● Monitoring service needs to be fast, reliable, and scalable

Monitoring at Pinterest

● Graphite
○ Easy to set up at small scale
○ Downsampling supports long-range queries well
○ Hard to scale
○ Deprecated at Pinterest’s current scale
● OpenTSDB
○ Rich query and tagging support
○ Easy to scale horizontally with the underlying HBase cluster
○ Long latency for high-cardinality data
○ Long latency for queries over longer time ranges
■ No downsampling
○ Heavy GC, worsened by the combination of heavy write QPS and long-range scans

Why OpenTSDB is not a good fit

● HBase schema
○ Row key: <metric><timestamp>[<tagk1><tagv1><tagk2><tagv2>...] (metric and tag keys/values are each encoded in 3 bytes)
○ Column qualifier: <delta to row key timestamp (up to 4 bytes)>
● Unnecessary scan
○ Query: m1{rpc=delete} [t1 to t2]
○ <m1><t1><host=h1><rpc=delete>
○ <m1><t1><host=h1><rpc=get>
○ <m1><t1><host=h1><rpc=put>
○ <m1><t2><host=h2><rpc=delete>
● Data size
○ 20 bytes per data point
● Aggregation
○ Data is read onto a single OpenTSDB instance and aggregated there
○ Ex. ostrich.gauges.singer.processor.stuck_processors{host=*}
● Serialization
○ JSON: very slow when there are many data points to return

[Diagram labels: OpenTSDB, HBase RS (×3)]


Goku is here to save

[Diagram labels: Statsboard (Read Client), Ingestor (Write Client), Kafka, OpenTSDB (×2), HBase RS (×3), Write, Read]

● Read/write requests are sent to a randomly selected OpenTSDB box, then routed to the corresponding RS based on row key range
● Reads: raw data is read from the individual HBase RSes and sent to the OpenTSDB box, which aggregates it and sends the result to the client

Goku cluster

[Diagram labels: Statsboard (Read Client), Ingestor (Write Client), Kafka, Goku (×4), Write, Read]

● A Goku box is not only a storage engine, but also:
○ A proxy that routes requests
○ An aggregation engine
● A client can send requests to any Goku box, which will route them
○ Scatter and gather

Two-level sharding

● Group# is hashed from the metric name
○ E.g. tc.metrics.rpc_latency
● Shard# is hashed from the metric + set of tagk and tagv
○ E.g. tc.metrics.rpc_latency{rpc=put,host=m1}
● Controls read fanout while making it easy to scale out an individual group

[Diagram labels: Goku, Shard config, G1:S1, G1:S2, G2:S1, G2:S2, G3:S1, G1:S3, G4:S1, G4:S2, G3:S1, G1:S3, G3:S3]

1. Request is sent to a random Goku box
2. Compute sharding to G2:S1 and G2:S2, then look up the shard config
3. Route requests
4. Retrieve data and aggregate locally
5. Another aggregation
6. Return response
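The two hashing levels can be sketched as follows. This is only an illustration of the idea on the slide: the hash function (CRC32), the group and shard counts, and the names `group_of`, `shard_of`, and `locate` are assumptions, not Goku's actual implementation.

```python
import zlib

NUM_GROUPS = 4         # hypothetical number of groups
SHARDS_PER_GROUP = 3   # hypothetical number of shards per group


def stable_hash(text: str) -> int:
    """Deterministic hash; CRC32 is an assumption, the slides do not name one."""
    return zlib.crc32(text.encode("utf-8"))


def group_of(metric: str) -> int:
    """Level 1: the group depends only on the metric name."""
    return stable_hash(metric) % NUM_GROUPS + 1


def shard_of(metric: str, tags: dict) -> int:
    """Level 2: the shard depends on the metric plus its full tag set."""
    # Sort the tags so the same tag set always produces the same key.
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return stable_hash(metric + "{" + tag_part + "}") % SHARDS_PER_GROUP + 1


def locate(metric: str, tags: dict) -> str:
    """Return a label in the slide's G<group>:S<shard> notation."""
    return f"G{group_of(metric)}:S{shard_of(metric, tags)}"
```

Because the group is derived from the metric name alone, a query for one metric only needs to fan out to the shards of a single group, which is the fanout control the slide refers to.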


Goku#1. Time Series Database based on Beringei

Beringei

● In-memory key-value store
○ Key: string
○ Value: list of <timestamp, value> pairs
● Gorilla compression
○ Delta-of-delta encoding on timestamps
○ XOR-based delta encoding on values
● Stores the most recent 24 hours of data (configurable)
● One level of sharding to distribute data
● Data point size reduced
○ from 20 bytes to 1.37 bytes

[Diagram labels: Beringei shard, Write, Read, Gorilla Encode, Gorilla Decode, Bucket (×3), ts, Disk]
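The arithmetic behind delta-of-delta timestamp encoding can be sketched as below. This shows only the integer transform; the real Gorilla format additionally packs the results into variable-length bit fields, which is omitted here. The function names are illustrative.

```python
def encode_timestamps(timestamps):
    """Return [first_timestamp, first_delta, delta_of_delta, ...]."""
    if not timestamps:
        return []
    encoded = [timestamps[0]]
    prev_ts, prev_delta = timestamps[0], 0
    for ts in timestamps[1:]:
        delta = ts - prev_ts
        # Store how much the delta changed, not the delta itself.
        encoded.append(delta - prev_delta)
        prev_ts, prev_delta = ts, delta
    return encoded


def decode_timestamps(encoded):
    """Invert encode_timestamps."""
    if not encoded:
        return []
    timestamps = [encoded[0]]
    delta = 0
    for delta_of_delta in encoded[1:]:
        delta += delta_of_delta
        timestamps.append(timestamps[-1] + delta)
    return timestamps
```

For metrics reported at a fixed interval, almost every delta-of-delta is zero, which is why the encoded stream compresses so well.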


Goku#2. Query Engine -- Inverted Index

Inverted Index

● A map from each search term to its bitset
● Built while processing incoming data points
● Fast lookup when serving queries
● Supports query filters
○ ExactMatch: metricname{host=h1,api=get} => intersect bitsets of metricname, host=h1, api=get
○ Or: metricname{host=h1|h2} => union bitsets of host=h1 and host=h2, then intersect with bitset of metricname
○ Nor: metricname{host=not_literal_or(h1|h2)} => remove bitsets of host=h1 and host=h2 from bitset of metricname
○ Wildcard: a. metricname{host=*} => intersect bitsets of metricname and host=*; b. metricname{host=h*} => convert to regex filter
○ Regex: metricname{host=h[1|2].*, api=get, az=us-east-1} => apply the other filters first, then build a regex pattern from the filter values and iterate over the full metric names of all ids that remain after the other filters

[Diagram labels: Goku Phase #1 shard, Write, Read, Inverted Index, Gorilla Encode, Gorilla Decode, Bucket (×3), ts, DISK]
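The bitset operations behind the ExactMatch, Or, Nor, and `host=*` filters can be sketched with Python integers as bitsets. The class and method names are hypothetical, and the regex path is left out; this is a minimal sketch of the technique, not Goku's code.

```python
from collections import defaultdict


class InvertedIndex:
    """Maps each search term to a bitset; bit i is set if series id i has the term."""

    def __init__(self):
        self.bitsets = defaultdict(int)  # term -> int used as a bitset
        self.num_series = 0

    def add_series(self, metric, tags):
        """Index one time series as its first data point arrives; return its id."""
        series_id = self.num_series
        self.num_series += 1
        bit = 1 << series_id
        self.bitsets[metric] |= bit
        for key, value in tags.items():
            self.bitsets[f"{key}={value}"] |= bit
            self.bitsets[f"{key}=*"] |= bit  # backs the host=* wildcard
        return series_id

    def exact(self, metric, tags):
        """ExactMatch: intersect the metric bitset with every tag bitset."""
        result = self.bitsets.get(metric, 0)
        for key, value in tags.items():
            result &= self.bitsets.get(f"{key}={value}", 0)
        return result

    def any_of(self, metric, key, values):
        """Or: union the value bitsets, then intersect with the metric."""
        union = 0
        for value in values:
            union |= self.bitsets.get(f"{key}={value}", 0)
        return self.bitsets.get(metric, 0) & union

    def none_of(self, metric, key, values):
        """Nor: remove the value bitsets from the metric bitset."""
        excluded = 0
        for value in values:
            excluded |= self.bitsets.get(f"{key}={value}", 0)
        return self.bitsets.get(metric, 0) & ~excluded

    @staticmethod
    def ids(bitset):
        """Expand a bitset into a sorted list of series ids."""
        return [i for i in range(bitset.bit_length()) if (bitset >> i) & 1]
```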


Goku#3. Query Engine -- Aggregation

Aggregation

● Post-processing step after retrieving all relevant time series
● Mimics OpenTSDB’s aggregation layer
● Supports basic aggregators, including SUM, AVG, MAX, MIN, COUNT, DEV, and downsampling
● Versus OpenTSDB
○ OpenTSDB aggregates on a single instance, since the HBase RSes don’t know how to aggregate
○ Goku aggregates in two phases: first on each leaf Goku node, then on the routing Goku node
○ This distributes the computation and reduces the data sent over the wire

[Diagram labels: Goku Phase #1 shard, Write, Read, Inverted Index, Aggregation, Gorilla Encode, Gorilla Decode, Bucket (×3), ts, DISK]
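A minimal sketch of two-phase aggregation, assuming each leaf reduces its local series to per-timestamp (sum, count) partials and the routing node merges them. The function names and the partial format are assumptions made for illustration; the slides do not describe Goku's wire format.

```python
def leaf_aggregate(series_list):
    """Phase 1 (leaf node): collapse local series into {timestamp: (sum, count)}."""
    partial = {}
    for series in series_list:
        for ts, value in series:
            total, count = partial.get(ts, (0.0, 0))
            partial[ts] = (total + value, count + 1)
    return partial


def root_aggregate(partials, aggregator="AVG"):
    """Phase 2 (routing node): merge leaf partials and finish the aggregation."""
    merged = {}
    for partial in partials:
        for ts, (total, count) in partial.items():
            m_total, m_count = merged.get(ts, (0.0, 0))
            merged[ts] = (m_total + total, m_count + count)
    if aggregator == "SUM":
        return {ts: total for ts, (total, _) in sorted(merged.items())}
    if aggregator == "COUNT":
        return {ts: count for ts, (_, count) in sorted(merged.items())}
    if aggregator == "AVG":
        return {ts: total / count for ts, (total, count) in sorted(merged.items())}
    raise ValueError(f"unsupported aggregator: {aggregator}")
```

Each leaf ships one (sum, count) pair per timestamp instead of every raw series, which is where the savings on the wire come from. AVG has to carry the count because an average of averages would be wrong when leaves hold different numbers of series.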


AWS EFS

● Stores log and data files for recovery
● POSIX compliant
● Data durability
● Operated asynchronously, so latency isn’t an issue
● Easy to move shards
● Easy to use on AWS

[Diagram labels: Goku Phase #1 shard, Write, Read, Inverted Index, Aggregation, Gorilla Encode, Gorilla Decode, Bucket (×3), ts, AWS EFS]


Phase #2 Disk based Goku

Goku Phase #2 -- Disk based

● A Hadoop job runs continuously to compact data onto disk with downsampling
● Data is stored in S3 for better availability and low cost
● RocksDB is used for serving data online

[Diagram labels: Goku Phase #2, Group, Shard, Write, Read, Inverted Index, Aggregation, Gorilla Encode, Gorilla Decode, Bucket (×3), ts, AWS EFS, Hadoop job, S3, Distributed KV store (RockStore)]
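Downsampling during compaction can be pictured as bucketing raw points into fixed intervals and reducing each bucket to one value. The sketch below is a generic illustration; the interval, the default averaging reducer, and the function name are assumptions, not details of the Hadoop job.

```python
from collections import defaultdict


def downsample(points, interval_s, reducer=None):
    """Reduce (timestamp, value) points to one point per interval_s-second bucket."""
    if reducer is None:
        reducer = lambda values: sum(values) / len(values)  # average by default
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % interval_s].append(value)  # align to bucket start
    return [(start, reducer(values)) for start, values in sorted(buckets.items())]
```

For example, three one-minute points inside the same hour collapse to a single hourly point, so long-range queries read far fewer data points.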

Next step for Goku

● Replication
○ Currently dual-writing to two clusters for fault tolerance
○ Replication to improve availability and consistency
● More query possibilities
○ TopK
○ Percentile
● Analytics use case
○ Another big consumer of time-series data


Thanks!


Scheduling Asynchronous Tasks at Pinterest

Isabel Tallam
Data (Core Services) Team
Pinterest


Why asynchronous tasks?

Asynchronous task processing service

Design considerations


Why asynchronous tasks?

[Illustration labels: SPAM, %$#*]


Pinlater
Asynchronous Task Processing Service

Pinlater features

- High throughput

- Easily create new tasks

- At-least-once guarantee

- Strict ack mechanism

- Metrics and debugging support

- Different task priorities

- Scheduling future tasks

- Python, Java support

Pinlater components

[Diagram labels: Clients (×3), Pinlater Servers (×3), Pinlater Workers (×3), Storage with Master and Slave (×3), insert, request / ack]


Pinlater Stats

~1000 different tasks defined

~8 billion task instances processed per day

~3000 Pinlater hosts


Storage Layer

[Diagram labels: Pinlater Servers (×3), Storage with Master and Slave (×3), Cache]

Handling failures in the system

[Diagram labels: Clients (×3), Pinlater Servers (×3), Pinlater Workers (×3), Storage with Master and Slave, timeout monitor, insert, request / ack]
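One plausible way an ack mechanism and a timeout monitor combine to give an at-least-once guarantee is sketched below. The slides only name these components, so the lease model, the class `TaskQueue`, and all of its methods are assumptions made for illustration, not Pinlater's API.

```python
class TaskQueue:
    """In-memory sketch: a task stays stored until a worker acks success."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.pending = []   # task ids ready to run, oldest first
        self.running = {}   # task id -> time at which its lease expires
        self.bodies = {}    # task id -> payload, kept until a successful ack
        self.next_id = 0

    def insert(self, body):
        """Client side: enqueue a new task and return its id."""
        task_id = self.next_id
        self.next_id += 1
        self.bodies[task_id] = body
        self.pending.append(task_id)
        return task_id

    def dequeue(self, now):
        """Worker side: claim the oldest pending task under a lease."""
        if not self.pending:
            return None
        task_id = self.pending.pop(0)
        self.running[task_id] = now + self.lease_seconds
        return task_id, self.bodies[task_id]

    def ack(self, task_id, success):
        """Worker side: report the outcome; failures are queued for retry."""
        if task_id not in self.running:
            return  # lease already expired; the task was re-queued
        del self.running[task_id]
        if success:
            del self.bodies[task_id]
        else:
            self.pending.append(task_id)

    def monitor_timeouts(self, now):
        """Timeout monitor: re-queue tasks whose worker never acked in time."""
        expired = [t for t, deadline in self.running.items() if deadline <= now]
        for task_id in expired:
            del self.running[task_id]
            self.pending.append(task_id)
        return expired
```

A task whose worker crashes is retried once its lease expires, so every task runs at least once; the trade-off is that a slow worker can cause a duplicate run, so task handlers should be idempotent.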


Thank You!


Experimentation at Pinterest

Lu Yang
Data (Data Analytics - Core Product Data) Team
Pinterest


Outline

1 Background

2 Platform

3 Architecture

What is an A/B experiment?

It is a method to compare two (or more) variations of something to determine which one performs better against your target metrics.


With Experiment Mindset

Idea → Feature Development → Release to small % of users → Measure impact → Release to 100% of users based on the impact of sample launch

A randomized, controlled trial with measurement

[Diagram: All Users split into "Existing code - CONTROL", "Changed code - ENABLED", and "Not in experiment"]
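A common way to split users into stable, pseudo-random groups is to hash the experiment name together with the user id. The slides do not say how Pinterest assigns users, so the sketch below is a generic illustration; `assign_group`, its parameters, and the 100-bucket scheme are assumptions.

```python
import hashlib


def assign_group(experiment, user_id, enabled_pct, control_pct):
    """Deterministically map a user to ENABLED, CONTROL, or NOT_IN_EXPERIMENT."""
    # Hashing experiment + user keeps assignments independent across experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # bucket in 0..99
    if bucket < enabled_pct:
        return "ENABLED"
    if bucket < enabled_pct + control_pct:
        return "CONTROL"
    return "NOT_IN_EXPERIMENT"
```

Because the assignment is a pure function of its inputs, a user sees the same variant on every request, and raising `enabled_pct` only moves users into the experiment without reshuffling those already in it.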


Number of Experiments Over Time

Experimentation by the Numbers

140+ new experiments/wk
7 languages and platforms
800+ different metrics
<5 min experiment setup
8.3/10 developer satisfaction score


Typical Experiment Timeline
Experiments vary, but here is a typical timeline

Idea

Form a hypothesis Analyze results Make Decision and Iterate

Experiment Setup and Launch

Iterate

Feature Development


Experimentor

Platform


Gatekeeper

Platform

Languages: Python, Java, Scala, Go, …

Platforms: Web, iOS, Android, …


Experiment Dashboard

Platform

Experiment Data Pipeline

[Diagram labels: Batch processing, Real-time analytics, HBase]

Airflow at Pinterest

Indy Prentice
Data (Big Data Compute) Team
Pinterest

What is a workflow?

Define: what to run, dependencies, etc.

Schedule: when to run it

Scheduled with Pinball

*check it out at https://github.com/pinterest/pinball!

[Diagram: a coordinator and two workers with fixed job slots. Jobs shown: train the visual search model, make the search index, rank the ads, download all pin images, count number of Pinterest users (get it?), find great recommendations, calculate experiment metrics. One slot is marked "free spot!"]


Fixed number of workers and slots

Unfortunately…

Fixed number of workers and slots

[Diagram: a coordinator and two workers. Only "train the visual search model" and "calculate experiment metrics" are running; the other six slots are each marked "free spot!"]

Fixed number of workers and slots

[Diagram: a coordinator and two workers with every slot occupied: train the visual search model, make the search index, rank the ads, find duplicate pin images, calculate experiment metrics, find great recommendations, download all pin images, count number of Pinterest users]


Fixed number of workers and slots

Shared environment

Unfortunately…

Shared host resources

[Diagram: a coordinator and two workers. Jobs shown: generate experiment metrics, index the pins, download all pin images, count number of Pinterest users, train the visual search model, make the search index, rank the ads. One slot is marked "free spot!"]

Shared codebase

[Diagram: a coordinator and two workers. Jobs shown: generate experiment metrics, index the pins, download all pin images, count number of Pinterest users, train the visual search model, make the search index, rank the ads. One slot is marked "free spot!"]


Fixed number of workers and slots

Shared environment

Implementation doesn’t scale

Unfortunately…

● Industry + community support
● Will it support our scale?
○ We think so
● Will it solve the problems with Pinball?
○ With the Kubernetes executor

Enter Spinner (get it???)

Jobs run in containers

Containers are scheduled by Kubernetes

Scheduler submits jobs to run on k8s

[Diagram: a physical machine (8 CPUs, 10 GBs of disk space) and jobs with their resource needs: "Train the visual search model" (4 CPUs, 4 GBs of disk space), "Download all images" (10 GBs of disk space), "Generate experiment metrics" (2 CPUs, 1 GB of disk space). Caption: "Can’t touch this"]
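One way to read the diagram is that each container declares the resources it needs, and a job is only placed on a machine that still has that much free. The sketch below uses the slide's numbers with a simple first-fit loop; it is not how the Kubernetes scheduler is implemented, the names `Resources` and `place` are hypothetical, and the CPU figure for the download job is an assumption because the slide gives none.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    cpus: int
    disk_gb: int


def fits(request, free):
    """True if the request can be satisfied from what is still free."""
    return request.cpus <= free.cpus and request.disk_gb <= free.disk_gb


def place(jobs, capacity):
    """First-fit placement on one machine; returns (placed, pending, free)."""
    free = Resources(capacity.cpus, capacity.disk_gb)
    placed, pending = [], []
    for name, request in jobs:
        if fits(request, free):
            # Reserve the resources so no other job can take them.
            free = Resources(free.cpus - request.cpus, free.disk_gb - request.disk_gb)
            placed.append(name)
        else:
            pending.append(name)  # would wait for another machine
    return placed, pending, free


machine = Resources(cpus=8, disk_gb=10)
jobs = [
    ("train the visual search model", Resources(4, 4)),
    ("download all images", Resources(1, 10)),  # 1 CPU is an assumed value
    ("generate experiment metrics", Resources(2, 1)),
]
placed, pending, free = place(jobs, machine)
```

With these numbers the training and metrics jobs share the machine, while the 10 GB download waits for a host with enough free disk instead of starving its neighbours.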

[Diagram: Tasks (train the visual search model, make the search index, rank the ads, generate experiment metrics, index the pins, write 10TB of data to disk!, count number of Pinterest users) mapped onto Servers]


Autoscaling happens at k8s level

Jobs run in isolated containers

Lightweight maintenance

On Airflow...

Airflow @ Pinterest
Adoption challenges

● Integration with Pinterest k8s infrastructure
● Scheduler scalability
● Integration with existing composers


Machine Learning and Big Data on K8S at Pinterest

June Liu
Infrastructure (Core Cloud - Cloud Management Platform) Team
Pinterest

Agenda

● Motivation
● Architecture
● Journey to K8S
● The Future

Motivation

● Unify orchestration and big data infrastructure
○ Simplify the tech stack and reduce operational overhead
● Growing community support for ML and BD workloads on K8S
● Better interfaces and richer features, including CNI, CSI, autoscaling, etc.


Architecture


Journey to Kubernetes

Sidecar
No good support for sidecar lifecycle of run-to-finish Pod

● A lot of sidecars to nanny workloads
○ Security
○ Metadata
○ Logs
○ Metrics
○ Traffic, service discovery
○ ...

● Kill the Pod?
○ The Pod is recreated by reconciliation
● Force-mark the Pod state?
○ Confuses the scheduler and Kubelet
● Docker kill the sidecar?
○ Conflicts with the restart policy
● Inter-container signals?
○ Not scalable operationally

● Write our own and contribute!
○ Main container obeys the restart policy; sidecars always restart
○ Kill sidecars after the main container quits
○ Pod phase is computed from the main container's exit code

https://github.com/zhan849/kubernetes/tree/pinterest-sidecar-1.14.5
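The three rules above can be paraphrased as two small decision functions. This is a sketch of the rules as stated on the slide, not the code in the linked branch; the function names and the `"main"`/`"sidecar"` role strings are hypothetical, while the restart policies and Pod phases are the standard Kubernetes values.

```python
def should_restart(role, exit_code, restart_policy, main_done):
    """Decide whether a container that just exited should be restarted."""
    if role == "sidecar":
        # Sidecars always restart, until the main container has finished.
        return not main_done
    # The main container obeys the Pod's restart policy.
    if restart_policy == "Always":
        return True
    if restart_policy == "OnFailure":
        return exit_code != 0
    return False  # "Never"


def pod_phase(main_exit_code):
    """Compute the Pod phase from the main container only (None = still running)."""
    if main_exit_code is None:
        return "Running"
    return "Succeeded" if main_exit_code == 0 else "Failed"
```

Under these rules a crashed logging sidecar is brought back while the job is still running, but once the main container exits the sidecar is no longer restarted and its exit code does not affect whether the Pod counts as succeeded.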

Volume
Is PVC really an option for serving models?

● Ideally
○ Flexibility to select storage medium
○ Data isolation
○ Dynamic provisioning saves money
○ ...

● Actually…
○ EBS provisioner is not efficient, nor is EC2
○ ~100% 500s with batch EBS provisioning

● DescribeInstance: by name -> by ID (#78140)
● Optimize EBS provisioner cloud provider calls (#78276)

Batch Size   Total Calls             Peak QPS
             Original   Optimized    Original   Optimized
50           1360       116          52         8
75           1464       139          70         10
100          2427       157          75         8
150          4384       209          93         11

AutoScaling
Node scales slower than Pod

● This is expected; otherwise containers would be meaningless
● Long-tail scheduling of Parameter Servers causes jobs to fail
● Use bogus pods with low priority as cluster buffers to scale in advance
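The buffer idea can be illustrated with a toy admission function: low-priority placeholder pods hold capacity, and a real pod that does not fit evicts just enough of them. In a real cluster this relies on pod priority and preemption, and the evicted placeholders going pending is what prompts the autoscaler to add nodes ahead of demand. The function `admit` and its parameters are hypothetical.

```python
def admit(need_cpus, free_cpus, buffer_pods, buffer_pod_cpus):
    """Try to schedule a real pod, preempting low-priority buffer pods if needed.

    Returns (scheduled, free_cpus, buffer_pods, evicted).
    """
    reclaimable = buffer_pods * buffer_pod_cpus
    if need_cpus > free_cpus + reclaimable:
        # Even evicting every buffer pod would not help; wait for a new node.
        return False, free_cpus, buffer_pods, 0
    evicted = 0
    while free_cpus < need_cpus:
        buffer_pods -= 1               # preempt one placeholder
        free_cpus += buffer_pod_cpus   # its reservation becomes free
        evicted += 1
    return True, free_cpus - need_cpus, buffer_pods, evicted
```

The real pod starts immediately on the reclaimed capacity, so it does not have to wait for a node to boot.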


Our Future

Future Works
Node scales slower than Pod

● Gang scheduling
● Federation layer takes care of data / network locality
● Fine-grained preemption, task queuing
● More caching storage options (PVC/EBS)
○ https://github.com/kubeflow/community/pull/263