getting the most out of your mesos david r. morrison, phd · 2019-02-26 · how yelp grew its soa...

39
David R. Morrison, PhD [email protected] Getting the Most out of Your Mesos

Upload: others

Post on 14-Aug-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

David R. Morrison, [email protected]

Getting the Most out of Your Mesos

Page 2: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Yelp’s MissionConnecting people with great

local businesses.

Page 3: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

How Yelp grew its SOA infrastructure on top of Apache Mesos

How Yelp has gained Observability into its Mesos clusters to increase reliability

How Yelp uses data about its infrastructure to move from Observability to Predictability

What can you learn from this talk?

Page 4: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

(A few) Examples of Mesos Applications at Yelp

TRONJO T

https://github.com/Yelp/paastahttps://github.com/Yelp/tronhttps://github.com/Yelp/task_processing

Page 5: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Mesos Adoption at Yelp

Page 6: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Why are we (still) using Mesos?

Page 7: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Why are we (still) using Mesos?

Page 8: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Observability: What are our Mesos clusters actually doing?

Page 9: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Metrics and Monitoring

Page 10: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Tracing and Logging

Page 11: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

We had a few problems in 2017

Page 12: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

How many services can we support?

Page 13: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

How many services can we support?

Page 14: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

How many services can we support?

Page 15: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

How many services can we support?

Page 16: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

How many services can we support?

Page 17: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

How many services can we support?

Page 18: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

How many services can we support?

Page 19: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

How many frameworks can we run?

Page 20: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

How many frameworks can we run?

Page 21: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Moving from Observability to Predictability (Part 1)

Page 22: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Load Testing Apache Mesos

Experimental Design

(YAML file)

Benchmarking Script

(Git Repo)

Real Live Mesos Cluster

Docker Image + Jupyter Notebook

Page 23: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Docker Containerizer vs UCR

time (seconds)

Doc

ker c

ount

s

Page 24: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Docker Containerizer vs UCR

time (seconds)

UC

R c

ount

s

Page 25: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Moving from Observability to Predictability (Part 2)

Page 26: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Clusterman: Yelp's Pluggable Autoscaling Engine for Apache Mesos

Page 27: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Clusterman: Yelp's Pluggable Autoscaling Engine Simulator for Apache Mesos

Page 28: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Clusterman Example: what happened

Page 29: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Clusterman Example: what happened

Page 30: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Clusterman Example: What will happen

Page 31: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Clusterman Example: What will happen

Page 32: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

What comes after Predictability?

Page 33: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Feedbackability

Page 34: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Feedbackability

Page 35: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Feedbackability

Page 36: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

Feedbackability

Page 37: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

We collect a lot of data about our infrastructure -- using that data intelligently can make us more reliable and save us money!

What did we learn?

Page 38: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

www.yelp.com/careers/

We're Hiring!

Page 39: Getting the Most out of Your Mesos David R. Morrison, PhD · 2019-02-26 · How Yelp grew its SOA infrastructure on top of Apache Mesos How Yelp has gained Observability into its

@YelpEngineering

fb.com/YelpEngineers

engineeringblog.yelp.com

github.com/yelp