Kubernetes competitive analysis

k8s, Mesos, Swarm, Fleet

Kumar Gaurav, 27th Aug ’16

(presented at the Kubernetes meetup, Bangalore)

Requirements of a cluster scheduler
• Goals
  • High resource utilization
  • User-supplied placement constraints
  • Rapid decision making
  • “Fairness” and business priority
  • Robust and always available

Types of schedulers
• Monolithic: uses a single, centralized scheduler for all jobs
• Two-level: a single active resource manager offers compute resources to multiple parallel, independent “scheduler frameworks”
  • Static partition: may lead to fragmentation and sub-optimal utilization
  • In Mesos, a centralized resource allocator dynamically partitions a cluster
  • Mesos achieves fairness by alternately offering all available cluster resources to different schedulers
• Shared state: uses lock-free optimistic concurrency control
  • Grants each scheduler full access to the entire cluster; schedulers compete in a free-for-all
  • Once a scheduler makes a placement decision, it updates the shared copy of cell state in an atomic commit
  • Schedulers can have different policies. Fairness is not guaranteed
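The shared-state approach can be illustrated with a minimal sketch. It assumes a single version counter on the cell state and a single-threaded setting. The names `CellState`, `try_commit` and `schedule` are hypothetical and do not come from any real scheduler. A production system would likely detect conflicts at a finer granularity than one global version.

```python
from dataclasses import dataclass


@dataclass
class CellState:
    """Shared view of the cluster: free CPUs per machine plus a version counter."""
    free_cpus: dict
    version: int = 0

    def snapshot(self):
        # Each scheduler works on a private copy and remembers the version it saw.
        return dict(self.free_cpus), self.version

    def try_commit(self, machine, cpus, seen_version):
        # Reject the claim if another scheduler committed since the snapshot
        # was taken, or if the machine no longer has enough capacity.
        if seen_version != self.version or self.free_cpus[machine] < cpus:
            return False
        self.free_cpus[machine] -= cpus
        self.version += 1
        return True


def schedule(cell, cpus, pick, max_retries=5):
    """Place one task; `pick` is the calling scheduler's own placement policy."""
    for _ in range(max_retries):
        view, version = cell.snapshot()
        candidates = [m for m, free in view.items() if free >= cpus]
        if not candidates:
            return None
        machine = pick(candidates, view)
        if cell.try_commit(machine, cpus, version):
            return machine
        # Conflict: another scheduler won the race, so re-read and retry.
    return None
```

Two schedulers can pass different `pick` functions (for example, most-free versus least-free machine) against the same `CellState`. Nothing in the sketch arbitrates between them, which is why fairness is not guaranteed.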

Marathon, from Mesosphere
• Designed to scale to very large clusters involving hundreds or thousands of hosts
• Mesos supports diverse workloads from multiple tenants; one user’s Docker containers may be running next to another user’s Hadoop tasks
• Built for high availability and resiliency
• Components:
  • Agent nodes = run tasks
  • Master = resource aggregator
  • ZooKeeper = master election
  • Frameworks = task schedulers (Marathon)
• Marathon is designed to start, monitor and scale long-running applications
  • Supports various affinity and constraint rules
  • Supports health checks and an event stream (for LB)
  • Clients interact with Marathon through a REST API
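As a rough illustration of the REST interaction, the sketch below builds an app definition with a constraint rule and a health check, then prepares a POST to `/v2/apps` without sending it. The host name, app id, and resource values are made up. The field names follow the Marathon v2 app definition as commonly documented and should be checked against the Marathon version in use.

```python
import json
import urllib.request

MARATHON_URL = "http://marathon.example.com:8080"  # hypothetical address


def build_app(app_id, cmd, instances):
    """Return an app definition for a long-running service (illustrative values)."""
    return {
        "id": app_id,
        "cmd": cmd,
        "cpus": 0.25,
        "mem": 64,
        "instances": instances,
        # Constraint rule: at most one instance per host.
        "constraints": [["hostname", "UNIQUE"]],
        # HTTP health check against the first port of each task.
        "healthChecks": [{
            "protocol": "HTTP",
            "path": "/",
            "portIndex": 0,
            "intervalSeconds": 10,
            "maxConsecutiveFailures": 3,
        }],
    }


def create_app_request(app):
    """Prepare (but do not send) the POST that would create the app."""
    return urllib.request.Request(
        url=MARATHON_URL + "/v2/apps",
        data=json.dumps(app).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the prepared request to `urllib.request.urlopen` would submit it to a reachable Marathon instance. Scaling is then a matter of updating `instances` on the same app.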

Why? You have a 10,000+ node cluster

Swarm, from Docker
• Swarm is the native clustering tool for Docker
• Swarm uses the standard Docker API
  • Swarm will take care of selecting an appropriate host to run the container on
• Each host runs a Swarm agent and one host runs a Swarm manager
• HA mode is available, using one of etcd, Consul or ZooKeeper
• Different methods for how hosts are found and added to a cluster, which is known as discovery (default: token)
• Scheduling:
  • filters (health, affinity, dependency)
  • strategy (spread, binpack, random)
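The filter-then-strategy scheduling above can be sketched as two steps. This is a simplified model, not Swarm's implementation. The `Node` fields and the functions `apply_filters` and `pick_node` are hypothetical. Only a health filter, a memory check, and a label constraint are modeled.

```python
import random
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    healthy: bool
    total_mem: int
    used_mem: int
    containers: int
    labels: dict


def apply_filters(nodes, mem, constraints):
    """Filter step: drop unhealthy nodes, full nodes, and label mismatches."""
    return [
        n for n in nodes
        if n.healthy
        and n.total_mem - n.used_mem >= mem
        and all(n.labels.get(k) == v for k, v in constraints.items())
    ]


def pick_node(nodes, strategy, rng=random):
    """Strategy step: choose among the nodes that survived filtering."""
    if not nodes:
        return None
    if strategy == "spread":
        # Prefer the node running the fewest containers.
        return min(nodes, key=lambda n: n.containers)
    if strategy == "binpack":
        # Prefer the most utilized node that still fits the container.
        return max(nodes, key=lambda n: n.used_mem / n.total_mem)
    if strategy == "random":
        return rng.choice(nodes)
    raise ValueError(f"unknown strategy: {strategy}")
```

With the same filtered set, `spread` and `binpack` typically pick opposite ends of the utilization range. That is the trade-off between resilience and packing density.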

Why? You like using the Docker CLI and ecosystem tools

Fleet, from CoreOS
• It builds on top of systemd
  • systemd provides system and service initialization for a single machine
  • meant for starting, stopping and managing processes
  • fleet extends this to a cluster of machines
• Each machine runs an engine and an agent. Only one engine is active in the cluster at any time
• Fault-tolerant; if a machine dies, any units scheduled on that machine will be restarted on new hosts
• etcd is used to enable communication between machines and to store the status of the cluster and units
• Fleet supports various scheduling hints and constraints
• A “low-level cluster engine”, meaning that it is expected to form a “foundation layer” for higher-level solutions such as Mesos
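The fault-tolerance behavior can be pictured with a toy rescheduling pass. The least-loaded placement policy and the function name `reschedule` are assumptions made for illustration. The slides do not describe how fleet's engine actually chooses the new host.

```python
def reschedule(assignments, machines, dead):
    """Move every unit from the dead machine to the least-loaded live machine.

    `assignments` maps unit name -> machine name; a new mapping is returned.
    """
    live = [m for m in machines if m != dead]
    if not live:
        raise RuntimeError("no live machines left")
    new = {u: m for u, m in assignments.items() if m != dead}
    orphans = sorted(u for u, m in assignments.items() if m == dead)
    for unit in orphans:
        # Recount load after each placement so orphans spread out.
        load = {m: 0 for m in live}
        for m in new.values():
            load[m] += 1
        new[unit] = min(live, key=lambda m: load[m])
    return new
```

In fleet the cluster state lives in etcd. Here the plain dictionary stands in for that store.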

https://wiki.freedesktop.org/www/Software/systemd/

Kubernetes, from Google
• Enforces several concepts around how containers are organized and networked (opinionated)
  • Pods are groups of containers that are deployed and scheduled together
  • k8s bridge: all pods can talk to each other without any NAT
  • Labels are key-value pairs attached to objects in Kubernetes
  • A Service will automatically round-robin requests between the pods
  • Registry (etcd) is a highly available key-value store used for persistent storage of all of its REST API objects
  • Replication controller controls and monitors the number of running pods (called replicas) for a service
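How labels, replication and round-robin dispatch fit together can be shown with a small in-memory model. This sketch does not use the Kubernetes API. `Pod`, `select`, `reconcile` and `round_robin` are hypothetical names, and the single reconcile pass only approximates what a replication controller does continuously.

```python
import itertools
from dataclasses import dataclass, field

_ids = itertools.count(1)


@dataclass
class Pod:
    labels: dict
    name: str = field(default_factory=lambda: f"pod-{next(_ids)}")


def select(pods, selector):
    """Label selector: pods whose labels contain every key-value pair."""
    return [p for p in pods
            if all(p.labels.get(k) == v for k, v in selector.items())]


def reconcile(pods, selector, replicas):
    """One pass of a replication-controller-style loop: converge on `replicas`."""
    matching = select(pods, selector)
    if len(matching) < replicas:
        # Too few: start new pods carrying the selector's labels.
        missing = replicas - len(matching)
        pods = pods + [Pod(dict(selector)) for _ in range(missing)]
    elif len(matching) > replicas:
        # Too many: drop the surplus matching pods, leave others untouched.
        surplus = {id(p) for p in matching[replicas:]}
        pods = [p for p in pods if id(p) not in surplus]
    return pods


def round_robin(pods, selector):
    """Service-style dispatch: cycle through the currently matching pods."""
    return itertools.cycle(select(pods, selector))
```

The declarative flavor shows in `reconcile`: the caller states the desired replica count and the loop works out whether to add or remove pods.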

Why? You prefer a declarative system. Above all, you are a Google lover

Conclusion

            Size      Time to install  Maturity  Popularity*  RBAC                      Multi-Master
Kubernetes  10s-100s  Medium           High      1.1M         Full                      Limited. HA purpose only
Swarm       1-100s    Low              Medium    300K         None (use Orca instead)   Limited. HA purpose only
Fleet       1-10s     Low              Low       50K          None                      No
Marathon    10-1000s  High             High      3.4M         With LDAP integration     Fully supported

* Google search result count
