Orchestration: fancy buzzword, or the inevitable fate of Docker containers?
TRANSCRIPT
Orchestration
Fancy buzzword, or the fate of containers?
© 2015 Mesosphere, Inc.
Connor Doyle
Software Engineer
Mesosphere, Inc.
[email protected]
@nor0101
Hi!
Agenda
• What problem are we solving?
• Prior art
• Axes of choice
• The allure of two-level scheduling
• To infinity and beyond
• Oversubscription
• Maintenance
The problem space
“Container orchestration” implies horizontal scalability.
Why you need scale varies, and your workload profile has bearing on how you should run your clusters (e.g. HPC/HTC needs are different from those of a consumer retail website).
Mo’ scale, mo’ problems: failure (and cascading failure), fault zones, maintaining SLOs, maintenance windows, monitoring/alerts.
We want:
- Stability
- Performance
- Flexibility
- Abstractions we can grasp and explain
Orchestration starts with a good scheduler.
We have options :)
• Centralized
  • Batch schedulers (HTCondor, Slurm, Torque)
  • Monolithic schedulers (Borg)
  • Process schedulers (systemd, fleet, Kubernetes)
  • Two-level schedulers (Mesos, Ω)
• Decentralized
  • Completely! Sparrow
  • Hybrid! Mercury
This is a HUGE opportunity
- To get the abstractions right
- To mitigate the next software crisis
- To do better!
Two-level scheduling is a nice model.
Two-level scheduling
Let the cluster manager:
- Keep track of resources
- Offer resources to applications fairly
- Implement low-level isolation
Let the application-specific scheduler:
- Track its own job queue
- Think about task constraints
- Define task semantics
- Choose appropriate containerization
- Respond to failures
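The split of responsibilities above can be sketched as a toy resource-offer loop (all names here are invented for illustration, not the actual Mesos API): the cluster manager tracks free resources and makes offers (level one), while a framework scheduler owns its job queue and decides placement (level two).

```python
# Hypothetical two-level scheduling sketch; not the real Mesos API.

class ClusterManager:
    """Level 1: tracks resources and offers them to frameworks."""

    def __init__(self, agents):
        # agents: dict of agent_id -> free CPUs on that agent
        self.free = dict(agents)

    def offer(self):
        # Offer whatever is currently free.
        return dict(self.free)

    def accept(self, agent_id, cpus):
        # Commit an accepted offer; reject if it is no longer valid.
        if self.free.get(agent_id, 0) < cpus:
            raise ValueError("offer no longer valid")
        self.free[agent_id] -= cpus


class FrameworkScheduler:
    """Level 2: owns its own job queue and task semantics."""

    def __init__(self, queue):
        self.queue = list(queue)  # list of per-task CPU demands
        self.launched = []

    def resource_offers(self, offers):
        # Greedily place queued tasks onto offered agents.
        for agent_id, cpus in offers.items():
            while self.queue and self.queue[0] <= cpus:
                need = self.queue.pop(0)
                cpus -= need
                self.launched.append((agent_id, need))
        return self.launched


manager = ClusterManager({"agent-1": 4, "agent-2": 2})
framework = FrameworkScheduler(queue=[2, 2, 1])
for agent_id, need in framework.resource_offers(manager.offer()):
    manager.accept(agent_id, need)
# All three tasks land; the manager's free pool shrinks accordingly.
```

The point of the split: the manager never needs to know what a "task" means to the framework, and the framework never needs a global view of the cluster, only the offers it receives.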
Hey, looks like a managed runtime!
These have been popular lately:
• JVM
• HHVM
• V8
• ...
Why? They allow high-level general-purpose programs to benefit from:
- Portable units of execution
- Architecture-dependent optimizations
- Dynamic (de)optimizations based on insights learned at execution time
…and it gets better over time, for free!
A goal: maximize utilization
...safely!
Jobs like to run on underutilized hardware!
Contention for shared resources can negatively impact other goals (such as tail latency or throughput).
Besides estimating oversubscribable resources, we need to revise the estimates over time!
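One way to revise estimates over time is an exponential moving average of CPU slack (allocated but unused CPU). This is a minimal sketch with invented names, not the Mesos resource-estimator API:

```python
# Hypothetical slack estimator: smooths oversubscribable-CPU estimates
# over time so they adapt when usage changes.

class SlackEstimator:
    def __init__(self, allocated_cpus, alpha=0.5):
        self.allocated = allocated_cpus
        self.alpha = alpha        # weight of the newest sample
        self.estimate = 0.0       # current oversubscribable-CPU estimate

    def observe(self, used_cpus):
        # Revise the estimate from a fresh usage sample.
        slack = max(self.allocated - used_cpus, 0.0)
        self.estimate = self.alpha * slack + (1 - self.alpha) * self.estimate
        return self.estimate


est = SlackEstimator(allocated_cpus=8.0)
for used in [2.0, 2.0, 6.0]:   # usage spikes in the last sample
    est.observe(used)
# The estimate shrinks toward the new, smaller slack after the spike,
# rather than jumping instantly: 3.0 -> 4.5 -> 3.25.
```

Smoothing matters for safety: a single quiet sample should not cause the cluster to hand out resources that the primary workload is about to reclaim.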
More challenges (read: opportunities)
Choose victims wisely!
Is killing the only option?
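Both questions can be made concrete with a toy victim-selection policy (entirely hypothetical, not an actual Mesos QoS controller): touch only revocable tasks, prefer throttling over killing, and take the smallest victims first to limit disruption.

```python
# Hypothetical revocation planner: choose victims wisely, and only
# kill when throttling is not an option.

def plan_revocation(tasks, cpus_to_reclaim):
    """tasks: list of dicts with 'name', 'cpus', 'revocable', 'throttleable'."""
    plan = []
    reclaimed = 0.0
    # Never touch non-revocable (latency-sensitive) tasks; among the
    # rest, evict smallest first.
    for t in sorted((t for t in tasks if t["revocable"]),
                    key=lambda t: t["cpus"]):
        if reclaimed >= cpus_to_reclaim:
            break
        action = "throttle" if t["throttleable"] else "kill"
        plan.append((action, t["name"]))
        reclaimed += t["cpus"]
    return plan


tasks = [
    {"name": "web",     "cpus": 2.0, "revocable": False, "throttleable": False},
    {"name": "batch-1", "cpus": 1.0, "revocable": True,  "throttleable": True},
    {"name": "batch-2", "cpus": 3.0, "revocable": True,  "throttleable": False},
]
plan = plan_revocation(tasks, cpus_to_reclaim=2.0)
# [('throttle', 'batch-1'), ('kill', 'batch-2')] -- the web task is untouched.
```

A real controller would also weigh priority, how long a task has run, and the specific contended resource, but the shape is the same: killing is the last resort, not the only option.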
Another goal: orderly downtime
“I’m removing this node from the cluster NOW.”
“I’m going to take this node offline in three hours.”
Tag resource offers with a time horizon
Give application schedulers a chance to relocate affected tasks
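The idea might look like the following sketch (field names are assumptions, loosely inspired by the maintenance-primitives proposal in MESOS-1474): offers carry the time at which the agent becomes unavailable, and an application scheduler filters out agents that will drain before a task could finish.

```python
# Hypothetical maintenance-aware offer filter.

def usable_offers(offers, now, task_runtime):
    """Keep offers whose agent stays up long enough to run the task.

    offers: list of dicts with 'agent' and optionally 'unavailable_at'
    (seconds since epoch); a missing/None value means no maintenance
    is scheduled for that agent.
    """
    ok = []
    for o in offers:
        horizon = o.get("unavailable_at")
        if horizon is None or horizon - now >= task_runtime:
            ok.append(o)
    return ok


offers = [
    {"agent": "a1", "unavailable_at": 300.0},   # drains before the task ends
    {"agent": "a2", "unavailable_at": 7200.0},  # plenty of time left
    {"agent": "a3"},                            # no maintenance scheduled
]
names = [o["agent"] for o in usable_offers(offers, now=0.0, task_runtime=600.0)]
# names == ["a2", "a3"]
```

The same horizon also tells the scheduler which already-running tasks sit on draining agents, so it can relocate them before the deadline instead of being surprised by a hard shutdown.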
References
1. Apollo: Scalable and Coordinated Scheduling for Cloud-Scale Computing
2. Distributed Computing in Practice: The Condor Experience
3. Heracles: Improving Resource Efficiency at Scale
4. Large-scale cluster management at Google with Borg
5. Mercury: Hybrid Centralized and Distributed Scheduling in Large Shared Clusters
6. Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
7. Mesos Oversubscription Design Document
8. MESOS-1474: Provide cluster maintenance primitives for operators
9. Omega: flexible, scalable schedulers for large compute clusters
10. Quasar: Resource-Efficient and QoS-Aware Cluster Management
11. Reliable Cron across the Planet
12. Sparrow: Distributed, Low Latency Scheduling