container world 2017 - characterizing and contrasting container orchestrators


Upload: lee-calcote

Post on 12-Apr-2017


Page 1: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Characterizing and Contrasting Container Orchestrators

Lee Calcote

February 2017

http://calcotestudios.com/talks

#CONTAINERWORLD

Container World

Delivered by KNect365 TMT

Page 2: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Lee Calcote

linkedin.com/in/leecalcote

@lcalcote

blog.gingergeek.com

[email protected]

clouds, containers, infrastructure, applications and their management

Page 3: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Show of Hands

Page 4: Container World 2017 - Characterizing and Contrasting Container Orchestrators

[kuh n-tey-ner] [awr-kuh-streyt-or] 

Definition:

@lcalcote

Page 5: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Fleet, Nomad, Swarm, Kubernetes, Mesos+Marathon

CaaS: Joyent Triton, Docker Datacenter, AWS ECS, Azure Container Service, Rackspace Carina

(Stay tuned for updates to presentation and book)

@lcalcote

Page 6: Container World 2017 - Characterizing and Contrasting Container Orchestrators

One size does not fit all.

A strict apples-to-apples comparison is inappropriate and not

the objective, hence characterizing and contrasting.

@lcalcote

Page 7: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Let's not go here today.

Container orchestrators may be intermixed.

@lcalcote

Page 8: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Categorically Speaking

Scheduling

Genesis & Purpose

Support & Momentum

Host & Service Discovery

Modularity & Extensibility

Updates & Maintenance

Health Monitoring

Networking & Load-Balancing

Secrets Management

High Availability & Scale

@lcalcote

Page 9: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Core Capabilities

Cluster Management

Host Discovery

Host Health Monitoring

Scheduling

Orchestrator Updates and Host Maintenance

Service Discovery

Networking and Load-Balancing

Stateful services

Multi-tenant, multi-region

Additional Key Capabilities

Application Health & Performance Monitoring

Application Deployments

Application Secrets

@lcalcote

Page 10: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Nomad

Page 11: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Genesis & Purpose

designed for both long-lived services and short-lived batch processing workloads.

cluster manager with declarative job specifications.

ensures constraints are satisfied and resource utilization is optimized by efficient task packing.

supports all major operating systems and virtualized, containerized or standalone workloads.

written in Go, following the Unix philosophy.

 @lcalcote

Page 12: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Support & Momentum

Project began June 2015 (19 months old)

Has 141 contributors

Current release v0.5.4

Nomad Enterprise offering aimed for first half of this year.

Supported and governed by HashiCorp

HashiConf US '15 had ~300 attendees

HashiConf EU '16 had ~320 attendees

HashiConf US '16 had ~500 attendees

@lcalcote

Page 13: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Nomad Architecture

@lcalcote

Page 14: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Host & Service Discovery

Host Discovery

Gossip protocol - Serf is used

Docker multi-host networking and Swarmkit use Serf, too

Servers advertise full set of Nomad servers to clients

heartbeats every 30 seconds

Creating federated clusters is simple

Service Discovery

Nomad integrates with Consul to provide service discovery and monitoring.

@lcalcote

Page 15: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Scheduling

two distinct phases: feasibility checking and ranking.

optimistically concurrent

enabling all servers to participate in scheduling decisions, which increases the total throughput and reduces latency

three scheduler types used when creating jobs: service, batch and system

`nomad plan` gives a point-in-time view of what Nomad will do

@lcalcote
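The two scheduling phases named on this slide (feasibility checking, then ranking) can be sketched as below. This is a minimal illustration, not Nomad's implementation: the `Node` and `Task` types, their fields, and the bin-packing score are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical node model: free and total CPU (MHz) and memory (MB).
    name: str
    cpu_free: int
    mem_free: int
    cpu_total: int
    mem_total: int
    os: str


@dataclass
class Task:
    cpu: int
    mem: int
    os: str


def feasible(node, task):
    """Phase 1: discard nodes that cannot run the task at all."""
    return (node.os == task.os
            and node.cpu_free >= task.cpu
            and node.mem_free >= task.mem)


def rank(node, task):
    """Phase 2: score a feasible node; a higher score means the node
    would be fuller after placement (a simple bin-packing preference)."""
    cpu_used = (node.cpu_total - node.cpu_free + task.cpu) / node.cpu_total
    mem_used = (node.mem_total - node.mem_free + task.mem) / node.mem_total
    return (cpu_used + mem_used) / 2


def place(nodes, task):
    """Run both phases and return the best node, or None if none is feasible."""
    candidates = [n for n in nodes if feasible(n, task)]
    if not candidates:
        return None
    return max(candidates, key=lambda n: rank(n, task))


nodes = [
    Node("a", cpu_free=4000, mem_free=8192, cpu_total=4000, mem_total=8192, os="linux"),
    Node("b", cpu_free=1000, mem_free=2048, cpu_total=4000, mem_total=8192, os="linux"),
    Node("c", cpu_free=4000, mem_free=8192, cpu_total=4000, mem_total=8192, os="windows"),
]
chosen = place(nodes, Task(cpu=500, mem=1024, os="linux"))
print(chosen.name)  # prints "b": the already-busy Linux node is packed first
```

Node "c" is dropped in the feasibility phase (wrong operating system); "b" outranks "a" because packing it leaves the empty node free for larger tasks.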

Page 16: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Modularity & Extensibility

Task drivers

Used by Nomad clients to execute a task and provide resource isolation.

Extensible task drivers are important for the flexibility to support a broad set of workloads (e.g. rkt, lxc).

Does not currently support pluggable task drivers; you have to implement the task driver interface and recompile the Nomad binary.

@lcalcote

Page 17: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Updates & Maintenance

Nodes

Drain allocations on a running node.

integrates with tools like Packer, Consul, and Terraform to support building artifacts, service discovery, monitoring and capacity management.

Applications

Log rotation (stderr and stdout)

no log forwarding support, yet

Rolling updates (via the `update` block in the job specification).

@lcalcote

Page 18: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Health Monitoring

Nodes

Node health monitoring is done via heartbeats, so Nomad can detect failed nodes and migrate the allocations to other healthy clients.

 Applications

currently http, tcp and script

In the future Nomad will add support for more Consul checks.

`nomad alloc-status` reports actual resource utilization
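Heartbeat-based node monitoring, as described for nodes on this slide, boils down to a timeout check followed by rescheduling. The sketch below assumes a hypothetical `ttl` (seconds of silence before a node counts as failed) and a naive round-robin migration; neither is taken from Nomad itself.

```python
def failed_nodes(last_heartbeat, now, ttl):
    """Return the nodes whose latest heartbeat is older than ttl seconds."""
    return sorted(n for n, t in last_heartbeat.items() if now - t > ttl)


def migrate(allocations, failed, healthy):
    """Move allocations off failed nodes onto healthy ones, round-robin.
    Allocations on healthy nodes are left where they are."""
    result = {}
    i = 0
    for alloc, node in allocations.items():
        if node in failed:
            result[alloc] = healthy[i % len(healthy)]
            i += 1
        else:
            result[alloc] = node
    return result


# Hypothetical cluster state: timestamps are seconds on an arbitrary clock.
last_heartbeat = {"n1": 100.0, "n2": 40.0, "n3": 95.0}
failed = failed_nodes(last_heartbeat, now=110.0, ttl=30.0)
allocations = {"web": "n2", "db": "n1", "cache": "n2"}
print(failed)                                      # ['n2']
print(migrate(allocations, failed, ["n1", "n3"]))  # web -> n1, db stays on n1, cache -> n3
```

A real scheduler would re-run placement for the displaced allocations rather than hand them out round-robin; the point here is only the detect-then-migrate flow.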

@lcalcote

Page 19: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Networking & Load-Balancing

Networking

Dynamic ports are allocated in a range from 20000 to 60000.

Shared IP address with Node.

Load-Balancing

Consul provides DNS-based load-balancing

@lcalcote

Page 20: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Secrets Management

Nomad agents provide secure integration with Vault for all tasks and containers it spins up

gives secure access to Vault secrets through a workflow which minimizes risk of secret exposure during bootstrapping.

@lcalcote

Page 21: Container World 2017 - Characterizing and Contrasting Container Orchestrators

High Availability & Scale

distributed and highly available, using both leader election and state replication to provide availability in the face of failures.

shared state optimistic scheduler

only open source implementation.

1,000,000 containers across 5,000 hosts, scheduled in 5 min.

 

Built for managing multiple clusters / cluster federation.

@lcalcote

Page 22: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Easy to use

Single binary for both clients and servers

Supports non-containerized tasks and multiple container runtimes

Arguably the most advanced scheduler design

Upfront consideration of federation / hybrid cloud

Broad OS support

Outside of scheduler, comparatively less sophisticated

Young project

Less relative momentum

Less relative adoption

Less extensible / pluggable

@lcalcote

Page 23: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Docker Swarm

Page 24: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Docker Swarm 1.12

aka Swarmkit or Swarm mode

@lcalcote

Page 25: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Genesis & Purpose

Swarm is simple and easy to set up.

Initially responsible for clustering and scheduling

Driving toward application's needs with services, secrets, etc.

Originally an imperative system, now declarative.

Swarm’s architecture is not as complex as those of Kubernetes and Mesos.

Written in Go, Swarm is lightweight, modular and somewhat extensible.

@lcalcote

Page 26: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Docker Swarm 1.11 (Standalone)

Docker Swarm Mode 1.12 (Swarmkit)

@lcalcote

Page 27: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Support & Momentum

Contributions:

Standalone: ~3,000 commits, 12 core maintainers (140 contributors)

Swarmkit: ~2,800 commits, ~12 core maintainers (70 contributors)

~289 Docker meetups worldwide

Disclaimer: I organize Docker Austin.

Production-ready:

Standalone announced ~15 months ago (Nov 2015)

Swarmkit announced ~7 months ago (July 2016)

@lcalcote

Page 28: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Host & Service Discovery

Host Discovery

Like Nomad, uses HashiCorp's goMemDB for storing cluster state

Pull model - where the worker checks in with the Manager

Rate control - the rate of check-ins with the Manager may be controlled at the Manager - add jitter

Workers don't need to know which Manager is active; Follower Managers will redirect Workers to the Leader

Service Discovery

Embedded DNS and round robin load-balancing

Services are a new concept
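The "add jitter" note on worker check-ins can be illustrated with a short sketch. The function name, the `period` and `jitter_fraction` parameters, and the uniform distribution are assumptions for illustration only; they are not Swarm's actual settings.

```python
import random


def next_checkin(period, jitter_fraction, rng):
    """Return the delay before a worker's next check-in: the base period
    plus or minus a random offset of up to jitter_fraction * period."""
    spread = period * jitter_fraction
    return period + rng.uniform(-spread, spread)


rng = random.Random(0)  # seeded only to make the example repeatable
delays = [next_checkin(5.0, 0.2, rng) for _ in range(100)]
print(min(delays) >= 4.0 and max(delays) <= 6.0)  # True
```

Spreading check-ins this way keeps a large set of workers from all contacting the Manager at the same instant.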

@lcalcote

Page 29: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Scheduling

Swarm’s scheduler is pluggable

Swarm scheduling is a combination of strategies and filters/constraints:

Strategies

Random

Spread*

Binpack

Filters

container constraints (affinity, dependency, port) are defined as environment variables in the specification file

node constraints (health, constraint) must be specified when starting the docker daemon and define which nodes a container may be scheduled on.
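The combination of filters and strategies listed above can be sketched as a filter pass followed by a strategy pick. The node dictionaries, the label-matching filter, and the container-count metric are simplifying assumptions; the actual strategies also weigh available CPU and memory.

```python
import random


def filter_nodes(nodes, constraints):
    """Filter step: keep nodes whose labels satisfy every constraint."""
    return [n for n in nodes
            if all(n["labels"].get(k) == v for k, v in constraints.items())]


def pick_node(nodes, strategy, rng=random):
    """Strategy step: choose one node from the filtered candidates."""
    if not nodes:
        return None
    if strategy == "random":
        return rng.choice(nodes)["name"]
    if strategy == "spread":   # fewest running containers wins
        return min(nodes, key=lambda n: n["containers"])["name"]
    if strategy == "binpack":  # most running containers wins
        return max(nodes, key=lambda n: n["containers"])["name"]
    raise ValueError("unknown strategy: " + strategy)


# Hypothetical cluster of three nodes.
cluster = [
    {"name": "n1", "labels": {"storage": "ssd"}, "containers": 5},
    {"name": "n2", "labels": {"storage": "ssd"}, "containers": 1},
    {"name": "n3", "labels": {"storage": "disk"}, "containers": 0},
]
ssd_nodes = filter_nodes(cluster, {"storage": "ssd"})
print(pick_node(ssd_nodes, "spread"))   # n2
print(pick_node(ssd_nodes, "binpack"))  # n1
```

Node "n3" is the emptiest, but the constraint filter removes it before either strategy runs, which is the interplay the slide describes.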

@lcalcote

Page 30: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Modularity & Extensibility

Ability to remove batteries is a strength for Swarm:

Pluggable scheduler

Pluggable network driver

Pluggable distributed K/V store

Docker container engine runtime-only

Pluggable authorization (in docker engine)*

@lcalcote

Page 31: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Updates & Maintenance

Nodes

Nodes may be Active, Drained and Paused

Manager weights are used to drain or pause Managers

Manual swarm manager and worker updates

Applications

Rolling updates now supported

--update-delay

--update-parallelism

--update-failure-action

@lcalcote
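How the three rolling-update flags interact can be sketched as follows. This is a simplified model, not Docker's implementation: it ignores how long each task takes to update, and `update_task` is a hypothetical callback that returns True on success.

```python
def rolling_update_plan(tasks, parallelism, delay):
    """Split tasks into batches of `parallelism`; batch k starts k * delay
    seconds after the first (time spent updating is ignored here)."""
    plan = []
    for i in range(0, len(tasks), parallelism):
        plan.append(((i // parallelism) * delay, tasks[i:i + parallelism]))
    return plan


def run_update(plan, update_task, failure_action="pause"):
    """Apply the plan. On a failed task, "pause" stops the rollout,
    while "continue" skips the failure and keeps going."""
    updated = []
    for _, batch in plan:
        for task in batch:
            if update_task(task):
                updated.append(task)
            elif failure_action == "pause":
                return updated, "paused"
    return updated, "completed"


tasks = ["t1", "t2", "t3", "t4", "t5"]
plan = rolling_update_plan(tasks, parallelism=2, delay=10)
print(plan)  # [(0, ['t1', 't2']), (10, ['t3', 't4']), (20, ['t5'])]
print(run_update(plan, lambda t: t != "t3"))              # (['t1', 't2'], 'paused')
print(run_update(plan, lambda t: t != "t3", "continue"))  # (['t1', 't2', 't4', 't5'], 'completed')
```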

Page 32: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Health Monitoring

Nodes

Swarm monitors the availability and resource usage of nodes within the cluster

 

Applications

One health check per container may be run

check container health by running a command inside the container

--interval=DURATION (default: 30s)

--timeout=DURATION (default: 30s)

--retries=N (default: 3)
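The role of `--retries` can be shown with a small state sketch: a container is reported unhealthy only after that many consecutive failed checks, and a single success resets the count. The function and status strings below are a simplified model written for illustration, not Docker's code.

```python
def health_status(results, retries=3):
    """Fold a sequence of check results (True = check passed) into a status.
    The status starts as "starting", becomes "healthy" on any success, and
    becomes "unhealthy" after `retries` consecutive failures."""
    status = "starting"
    consecutive_failures = 0
    for ok in results:
        if ok:
            consecutive_failures = 0
            status = "healthy"
        else:
            consecutive_failures += 1
            if consecutive_failures >= retries:
                status = "unhealthy"
    return status


print(health_status([True, False, False]))         # healthy (only 2 failures in a row)
print(health_status([True, False, False, False]))  # unhealthy
```

With the defaults on the slide (30s interval, 3 retries), a container that stops responding would be flagged roughly a minute and a half later, depending on how long each failing check takes to time out.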

@lcalcote

Page 33: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Networking & Load-Balancing

Swarm and multi-host networking are simpatico

provides for user-defined overlay networks that are micro-segmentable

uses Hashicorp's Serf gossip protocol for quick convergence of neighbor table

facilitates container name resolution via embedded DNS server (previously via /etc/hosts)

 

Load-balancing based on IPVS

expose Service's port externally

L4 load-balancer; cluster-wide port publishing

 

Mesh routing

send a request to any one of the nodes and it will be routed automatically

send a request to any one of the nodes and it will be internally load balanced

@lcalcote

Page 34: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Secrets Management

Landed in 1.13

encrypted and kept in Raft store

managed by Swarm Managers

retrieved by Swarm Services (not containers)

via mounted in-memory filesystem on the node

@lcalcote

Page 35: Container World 2017 - Characterizing and Contrasting Container Orchestrators

High Availability & Scale

Managers may be deployed in a highly-available configuration

Active/Standby - only one active Leader at a time

Maintain odd number of managers

Rescheduling upon node failure

No rebalancing upon node addition to the cluster

Does not support multiple failure isolation regions or federation

although, with caveats, federation is possible.

@lcalcote
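The advice to maintain an odd number of managers follows from majority-quorum arithmetic, which the short sketch below works through (the function names are just for this example).

```python
def quorum(managers):
    """Smallest majority of a manager set."""
    return managers // 2 + 1


def fault_tolerance(managers):
    """How many managers can fail while a majority remains."""
    return managers - quorum(managers)


for n in (3, 4, 5, 6, 7):
    print(n, quorum(n), fault_tolerance(n))
# 3 managers tolerate 1 failure, and so do 4;
# 5 tolerate 2, and so do 6; 7 tolerate 3.
```

Adding a manager to an odd-sized set raises the quorum without raising the number of failures tolerated, so the even-sized configuration costs more and protects no better.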

Page 36: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Scaling Swarm to 1,000 AWS nodes and 50,000 containers

@lcalcote

Page 37: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Suitable for orchestrating a combination of infrastructure containers

Has only recently added capabilities falling into the application bucket

Swarmkit is a young project

advanced features forthcoming

natural expectation of caveats in functionality

No rebalancing, autoscaling or monitoring, yet

Only schedules Docker containers, not containers using other specifications.

Does not schedule VMs or non-containerized processes

Does not provide support for batch jobs

Need separate load-balancer for overlapping ingress ports

While dependency and affinity filters are available, Swarm does not provide the ability to enforce scheduling of two containers onto the same host or not at all.

Filters facilitate sidecar pattern. No “pod” concept.

Swarm works. Swarm is simple and easy to deploy.

1.12 eliminated the need for much, but not all, third-party software

Facilitates earlier stages of adoption by organizations viewing containers as faster VMs

now with built-in functionality for applications

Swarm is easy to extend: if you already know the Docker APIs, you can customize Swarm

Still modular, but has stepped back here.

Moving very fast; eliminating gaps quickly.

Page 38: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Kubernetes

Page 39: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Genesis & Purpose

an opinionated framework for building distributed systems

"an open source system for automating deployment, scaling, and operations of applications."

Written in Go, Kubernetes is lightweight, modular and extensible

considered a third generation container orchestrator led by Google, Red Hat and others.

Declarative and opinionated, with many key features included

bakes in load-balancing, scale, volumes, deployments, secret management and cross-cluster federated services among other features.

  @lcalcote

Page 40: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Kubernetes Architecture

@lcalcote

Page 41: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Support & Momentum

Kubernetes is ~2 yrs. 8 months old (June 2014)

Announced as production-ready 19 months ago (July 2015)

Project has over 1,000 commits per month (~44,000 total)

reached 1,000 committers (~100 core) Kubernauts in Dec. 2016

~5,000 commits made in each release (1.5 is latest)

~244 Kubernetes meetups worldwide.

Disclaimer: I organize Microservices and Containers Austin.

Under the governance of the Cloud Native Computing Foundation

KubeCon earlier this year capped at 1,000 attendees

@lcalcote

Page 42: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Host & Service Discovery

Host Discovery

by default, the node agent (kubelet) is configured to register itself with the master (API server)

automating the joining of new hosts to the cluster

Service Discovery

Two primary modes of finding a Service

DNS

SkyDNS is deployed as a cluster add-on

environment variables

environment variables are used as a simple way of providing compatibility with Docker links-style networking

@lcalcote

Page 43: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Scheduling

By default, scheduling is handled by kube-scheduler (pluggable).

 

Selection criteria used by kube-scheduler to identify the best-fit node are defined by policy:

Predicates (node resources and characteristics):

PodFitsPorts, PodFitsResources, NoDiskConflict, MatchNodeSelector, HostName, ServiceAffinity, LabelsPresence

Priorities (weighted strategies used to identify “best fit” node):

LeastRequestedPriority, BalancedResourceAllocation, ServiceSpreadingPriority, EqualPriority
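The predicate-then-priority flow can be sketched with one predicate and two priorities. The scoring formulas below are simplified assumptions modeled on how LeastRequestedPriority and BalancedResourceAllocation are commonly described (0 to 10 scales); the node dictionaries and equal weighting are hypothetical, and this is not kube-scheduler code.

```python
def pod_fits_resources(node, pod):
    """Predicate: the node must have enough unrequested CPU and memory."""
    return (node["cpu_cap"] - node["cpu_req"] >= pod["cpu"]
            and node["mem_cap"] - node["mem_req"] >= pod["mem"])


def least_requested(node, pod):
    """Priority: favor nodes with the most free capacity after placement."""
    cpu = (node["cpu_cap"] - node["cpu_req"] - pod["cpu"]) * 10 / node["cpu_cap"]
    mem = (node["mem_cap"] - node["mem_req"] - pod["mem"]) * 10 / node["mem_cap"]
    return (cpu + mem) / 2


def balanced_allocation(node, pod):
    """Priority: favor nodes whose CPU and memory usage stay in proportion."""
    cpu_frac = (node["cpu_req"] + pod["cpu"]) / node["cpu_cap"]
    mem_frac = (node["mem_req"] + pod["mem"]) / node["mem_cap"]
    return 10 - abs(cpu_frac - mem_frac) * 10


def schedule(nodes, pod):
    """Filter with the predicate, then pick the highest summed priority."""
    fit = [n for n in nodes if pod_fits_resources(n, pod)]
    if not fit:
        return None
    best = max(fit, key=lambda n: least_requested(n, pod) + balanced_allocation(n, pod))
    return best["name"]


# Hypothetical nodes: CPU in millicores, memory in MB.
k8s_nodes = [
    {"name": "n1", "cpu_cap": 4000, "cpu_req": 3000, "mem_cap": 8000, "mem_req": 2000},
    {"name": "n2", "cpu_cap": 4000, "cpu_req": 1000, "mem_cap": 8000, "mem_req": 2000},
    {"name": "n3", "cpu_cap": 4000, "cpu_req": 3900, "mem_cap": 8000, "mem_req": 1000},
]
pod = {"cpu": 500, "mem": 1000}
print(schedule(k8s_nodes, pod))  # n2
```

Here "n3" fails the predicate (only 100 millicores free), and "n2" beats "n1" on both priorities: it has more headroom and its CPU and memory fractions end up equal.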

@lcalcote

Page 44: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Modularity & Extensibility

One of Kubernetes' strengths is its pluggable architecture and its being an extensible platform

Choice of:

database for service discovery or network driver

container runtime - may choose to run Docker or rkt containers

Cluster add-ons

optional system components that implement a cluster feature (e.g. DNS, logging, etc.)

shipped with the Kubernetes binaries and are considered an inherent part of the Kubernetes clusters

 

@lcalcote

Page 45: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Updates & Maintenance

Applications

`Deployment` objects automate deploying applications and rolling out updates to them.

Support for rolling back deployments

Kubernetes Components

Consistently backwards compatible

Upgrading the Kubernetes components and hosts is done via shell script

Host maintenance - mark the node as unschedulable.

existing pods are vacated from the node

prevents new pods from being scheduled on the node

@lcalcote

Page 46: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Health Monitoring

Nodes

Failures - actively monitors the health of nodes within the cluster via Node Controller

Resources - usage monitoring leverages a combination of open source components:

cAdvisor, Heapster, InfluxDB, Grafana, Prometheus

Applications

three types of user-defined application health-checks, with the Kubelet agent acting as the health check monitor

HTTP Health Checks, Container Exec, TCP Socket

Cluster-level Logging

collect logs which persist beyond the lifetime of the pod’s container images or the lifetime of the pod or even cluster

standard output and standard error output of each container can be ingested using a Fluentd agent running on each node

Page 47: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Networking & Load-Balancing

…enter the Pod

atomic unit of scheduling

flat networking with each pod receiving an IP address

no NAT required, port conflicts localized

intra-pod communication via localhost

Load-Balancing

Services provide inherent load-balancing via kube-proxy:

runs on each node of a Kubernetes cluster

reflects services as defined in the Kubernetes API

supports simple TCP/UDP forwarding and round-robin and Docker-links-based service IP:PORT mapping.

@lcalcote

Page 48: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Secrets Management

encrypted and stored in etcd

used by containers in a pod either:

1. mounted as data volumes

2. exposed as environment variables

None of the pod’s containers will start until all of the pod's volumes are mounted.

Individual secrets are limited to 1MB in size.

Secrets are created and accessible within a given namespace, not cross-namespace.

@lcalcote

Page 49: Container World 2017 - Characterizing and Contrasting Container Orchestrators

High Availability & Scale

Each master component may be deployed in a highly-available configuration.

Active/Standby configuration

Federated clusters / multi-region deployments

Scale

v1.2 supports 1,000 node clusters

v1.3 supports 2,000 node clusters

Horizontal Pod Autoscaling (via Replication Controllers).

Cluster Autoscaling (if you're running on GCE; AWS support is coming soon).
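Horizontal Pod Autoscaling is usually explained with a proportional rule: scale the replica count by the ratio of observed to target utilization. The sketch below assumes that rule plus hypothetical `min_replicas` and `max_replicas` bounds; the real autoscaler adds tolerances and cooldowns that are left out here.

```python
import math


def desired_replicas(current, observed_util, target_util, min_replicas=1, max_replicas=10):
    """Proportional scaling: replicas grow or shrink with the ratio of
    observed to target utilization, clamped to the configured bounds."""
    wanted = math.ceil(current * observed_util / target_util)
    return max(min_replicas, min(max_replicas, wanted))


print(desired_replicas(4, observed_util=90, target_util=60))                  # 6
print(desired_replicas(3, observed_util=20, target_util=60, min_replicas=2))  # 2
```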

@lcalcote

Page 50: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Only runs containerized applications

For those familiar with Docker-only, Kubernetes requires understanding of new concepts

Powerful frameworks with more moving pieces beget complicated cluster deployment and management.

Lightweight graphical user interface

Does not provide as sophisticated techniques for resource utilization as Mesos

Kubernetes can schedule docker or rkt containers

Inherently opinionated w/functionality built-in.

relatively easy to change its opinion

little to no third-party software needed

builds in many application-level concepts and services (petsets, jobsets, daemonsets, application packages / charts, etc.)

advanced storage/volume management

project has most momentum

project is arguably most extensible

thorough project documentation

Supports multi-tenancy

Multi-master, cross-cluster federation, robust logging & metrics aggregation

 @lcalcote

Page 51: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Mesos+Marathon

Page 52: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Genesis & Purpose

Mesos is a distributed systems kernel

stitches together many different machines into a logical computer

Mesos has been around the longest (launched in 2009) and is arguably the most stable, with highest (proven) scale currently

Mesos is written mostly in C++, with Java, Python and C++ APIs

Marathon as a Framework

Marathon is one of a number of frameworks (Chronos and Aurora are other examples) that may be run on top of Mesos

Frameworks have a scheduler and executor. Schedulers get resource offers. Executors run tasks.

Marathon is written in Scala

@lcalcote

Page 53: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Mesos Architecture

@lcalcote

Page 54: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Support & Momentum

MesosCon 2016 in Denver had ? attendees

MesosCon 2015 in Seattle had 700 attendees, up from 262 attendees in 2014

Mesos has 224 contributors

Marathon has 227 contributors

Mesos under the governance of Apache Foundation

Marathon under governance of Mesosphere

Mesos is used by Twitter, AirBnb, eBay, Apple, Cisco, Yodle

Marathon is used by Verizon and Samsung

@lcalcote

Page 55: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Host & Service Discovery

Mesos-DNS generates an SRV record for each Mesos task

including Marathon application instances

Marathon will ensure that all dynamically assigned service ports are unique

Mesos-DNS is particularly useful when:

apps are launched through multiple frameworks (not just Marathon)

you are using an IP-per-container solution like Project Calico

you use random host port assignments in Marathon
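To make the per-task SRV record concrete, the sketch below builds a record name and looks up its targets. The `_task._protocol.framework.domain` pattern and the default `mesos` domain reflect my understanding of the Mesos-DNS naming convention and should be treated as assumptions; the record table is invented for the example and no real DNS query is made.

```python
def srv_name(task, framework, protocol="tcp", domain="mesos"):
    """Build the SRV record name assumed for a task launched by a framework."""
    return "_{}._{}.{}.{}".format(task, protocol, framework, domain)


def resolve(records, name):
    """Look up (host, port) targets in an in-memory stand-in for DNS."""
    return records.get(name, [])


# Hypothetical records for a Marathon app named "search" with two instances
# on dynamically assigned host ports.
records = {
    "_search._tcp.marathon.mesos": [("10.0.0.5", 31045), ("10.0.0.9", 31872)],
}
name = srv_name("search", "marathon")
print(name)                    # _search._tcp.marathon.mesos
print(resolve(records, name))  # [('10.0.0.5', 31045), ('10.0.0.9', 31872)]
```

Because an SRV answer carries the port as well as the host, clients can find instances even when Marathon assigns random host ports, which is one of the cases the slide calls out.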

@lcalcote

Page 56: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Scheduling

Two-level scheduler

First-level scheduling happens at the Mesos master based on allocation policy, which decides which framework gets resources.

Second-level scheduling happens at the Framework scheduler, which decides what tasks to execute.

Provides reservations, over-subscription and preemption.
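The two scheduling levels can be sketched as two small functions: one standing in for the master's allocation policy, one for a framework's own task selection. The "least current usage" policy is a deliberately simple stand-in (it is not the master's actual allocation algorithm), and the CPU-only resource model and names are assumptions.

```python
def master_pick_framework(usage):
    """Level 1 (master): decide which framework receives the next resource
    offer. Stand-in policy: the framework currently using the least CPU."""
    return min(sorted(usage), key=lambda f: usage[f])


def framework_schedule(offer_cpus, pending):
    """Level 2 (framework scheduler): decide which pending tasks to launch
    within the offer; anything unused is returned to the master."""
    launched = []
    remaining = offer_cpus
    for name, cpus in pending:
        if cpus <= remaining:
            launched.append(name)
            remaining -= cpus
    return launched, remaining


usage = {"marathon": 6, "chronos": 2}
print(master_pick_framework(usage))  # chronos
pending = [("job-a", 3), ("job-b", 2), ("job-c", 1)]
print(framework_schedule(4, pending))  # (['job-a', 'job-c'], 0)
```

The split keeps the master generic: it only decides who gets resources, while each framework applies its own logic for what to run on them.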

@lcalcote

Page 57: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Modularity & Extensibility

Frameworks

multiple available

may run multiple frameworks concurrently

Modules

extend inner-workings of Mesos by creating and using shared libraries that are loaded on demand

many types of Modules

Replacement, Isolator, Allocator, Authentication, Hook, Anonymous

@lcalcote

Page 58: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Updates & Maintenance

Nodes

- Mesos has maintenance mode.

- Marathon does not.

Mesos API backwards compatible from v1.0 forward

Applications

Marathon can be instructed to deploy containers based on that component using a blue/green strategy

where old and new versions co-exist for a time.

@lcalcote

Page 59: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Health Monitoring

Nodes

Master tracks a set of statistics and metrics to monitor resource usage

Applications

support for health checks (HTTP and TCP)

an event stream that can be integrated with load-balancers or for analyzing metrics

@lcalcote

Page 60: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Networking & Load-Balancing

Networking

An IP per Container

No longer share the node's IP

Helps remove port conflicts

Enables 3rd party network drivers

Container Network Interface (CNI) isolator with MesosContainerizer

Load-Balancing

Marathon offers two TCP/HTTP proxies

A simple shell script and a more complex one called `marathon-lb` that has more features.

Pluggable (e.g. Traefik for load-balancing)

@lcalcote

Page 61: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Secrets Management

Not yet. 

Only supported by Enterprise DC/OS

 

Stored in ZooKeeper, exposed as ENV variables in Marathon

Secrets shorter than eight characters may not be accepted by Marathon.

By default, you cannot store a secret larger than 1MB.

@lcalcote

Page 62: Container World 2017 - Characterizing and Contrasting Container Orchestrators

High Availability & Scale

A strength of Mesos’s architecture

requires masters to form a quorum using ZooKeeper (point of failure)

only one Active (Leader) master at a time in Mesos and Marathon

Scale is a strong suit for Mesos. TBD for Marathon.

Autoscale

`marathon-autoscale.py` - autoscales application based on the utilization metrics from Mesos

`marathon-lb-autoscale` - request rate-based autoscaling with Marathon.

Great at short-lived jobs. High availability built-in.

Referred to as the “golden standard” by Solomon Hykes, Docker CTO.

Page 63: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Still needs 3rd party tools

Marathon interface could be more Docker friendly (hard to get at volumes and registry)

May need a dedicated infrastructure IT team

an overly complex solution for small deployments

Universal Containerizer

abstract away from docker, rkt, kurma?, lxc?

Can run multiple frameworks, including Kubernetes and Swarm.

Supports multi-tenancy.

Good for Big Data shops and job / task-oriented workloads.

Good for mixed workloads and with data-locality policies

Mesos is powerful and scalable, battle-tested

Good for multiple large things you need to do on a 10,000+ node cluster system

Marathon UI is young, but promising.

@lcalcote

Page 64: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Summary

Page 65: Container World 2017 - Characterizing and Contrasting Container Orchestrators

A high-level perspective of the container orchestrator spectrum.

@lcalcote

Page 66: Container World 2017 - Characterizing and Contrasting Container Orchestrators

Lee Calcote

linkedin.com/in/leecalcote

@lcalcote

blog.gingergeek.com

[email protected]

Thank you. Questions?

clouds, containers, infrastructure, applications and their management

http://calcotestudios.com/talks