Container Networking: the Gotchas (Mesos London Meetup, 11 May 2016)
@projectcalico Project Calico is sponsored by Tigera, Inc. | www.tigera.io
Networking in a Containerized Data Center: the Gotchas!
Mesos London Meetup
Andy Randall | @andrew_randall | May 11, 2016
Calico’s Adventures in Containerland
Run anywhere · Simple · Lightweight · Standard · Speed · Cloud · Efficient
The original “container approach” to networking
Gotcha #1: all containers on a machine share the same IP address
[Diagram: two web servers, WWW1 and WWW2, each listening on port 80 inside its container, reached through a proxy that maps them to host ports 8080 and 8081.]
Most container deployments still use this method!
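To make the gotcha concrete, here is a minimal sketch (names and ports are illustrative, not from any real proxy) of the bookkeeping a port-mapping layer has to do when every container shares the host's IP: each container-side service port must be mapped to a unique host port.

```python
# Toy port-mapping bookkeeping for a shared-IP host.
# Illustrative only: class and method names are hypothetical.

class PortMapper:
    def __init__(self, base_port=8080):
        self.next_port = base_port
        self.mappings = {}  # (container, container_port) -> host_port

    def expose(self, container, container_port):
        """Allocate a unique host port for one container's service port."""
        key = (container, container_port)
        if key not in self.mappings:
            self.mappings[key] = self.next_port
            self.next_port += 1
        return self.mappings[key]

mapper = PortMapper()
# Both web servers listen on 80 inside their containers, but must be
# reachable on different host ports, as in the slide's diagram.
print(mapper.expose("www1", 80))  # 8080
print(mapper.expose("www2", 80))  # 8081
```

The applications (and their clients, DNS entries, firewall rules) now have to know about the remapped ports, which is exactly the friction "IP per container" removes.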
The world is moving to “IP per container”
- Container Network Interface (CNI)
- Container Network Model (libnetwork, Docker 1.9)
- net-modules (Mesos 0.26); future: CNI?
We’ve solved “IP per VM” before…
[Diagram: VM1, VM2, and VM3 on each host, attached to a virtual switch.]
Consequences for containers (gotcha #2): scale
From hundreds of servers with low churn to millions of containers with high churn.
Consequences for containers (gotcha #3): layering
Packets are double-encapsulated!
[Diagram: containers A–C (veth0–veth2) in VM1 on pHost 1 and containers D–F in VM2 on pHost 2. A virtual switch encapsulates once inside each VM (veth → vNIC) and again on each physical host (vNIC → pNIC) before traffic crosses the physical switch.]
Consequences for containers (gotcha #4): walled gardens
[Diagram: containers A–C (veth0–veth2) behind a virtual switch in VM1 on pHost 1; a legacy app sits outside the encapsulated overlay, across the physical switch, and cannot easily be reached from inside it.]
“Any intelligent fool can make things bigger, more complex… It takes a touch of genius – and a lot of courage – to move in the opposite direction.”
A saner approach: just route IP from the container
[Diagram: containers A–C (veth0–veth2) in VM1 on pHost 1 and containers D–F in VM2 on pHost 2, connected by Linux kernel routing with no encapsulation, over a virtual underlay on each host and a physical underlay between hosts.]
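The routed approach above can be modeled in a few lines: each container gets its own IP, and plain longest-prefix-match routing (what the Linux kernel routing table does) delivers packets with no encapsulation. Addresses and host names below are illustrative.

```python
import ipaddress

# Toy model of "just route IP": per-host prefixes, ordinary
# longest-prefix-match forwarding, no tunnels. Illustrative only.

routes = {
    ipaddress.ip_network("10.0.1.0/24"): "pHost1",  # containers A-C
    ipaddress.ip_network("10.0.2.0/24"): "pHost2",  # containers D-F
}

def route(dst_ip):
    """Longest-prefix match, as the kernel routing table would do."""
    dst = ipaddress.ip_address(dst_ip)
    matches = [net for net in routes if dst in net]
    best = max(matches, key=lambda n: n.prefixlen)
    return routes[best]

print(route("10.0.1.7"))   # pHost1
print(route("10.0.2.42"))  # pHost2
```

Because forwarding is just routing, the same table can carry traffic to containers, VMs, and legacy hosts alike, which is what dissolves gotchas #3 and #4.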
Variant: one VM per host, no virtual underlay, straight-up IP
[Diagram: the same topology with a single VM per physical host; Linux kernel routing carries container traffic directly over the physical underlay, with no virtual underlay layer.]
Results: bare metal performance from virtual networks
[Charts: throughput in Gbps (0–10) and CPU % per Gbps (0–120) for bare metal, Calico, and OVS+VXLAN.]
Source: https://www.projectcalico.org/calico-dataplane-performance/
Gotcha #5: IP per container is not yet universally supported
Some container frameworks still assume port mapping, e.g. the Marathon load balancer service (but this is being fixed…).
Some PaaSes don’t yet support IP per container, but several are moving to build on Kubernetes and will likely pick it up.
Gotcha #6: running on public cloud
It’s easy to get your configuration wrong and end up with sub-optimal performance, e.g.:
- select the wrong Flannel back-end for your fabric
- forget to turn off AWS src-dest IP checks
- get the MTU size wrong for the underlay…
Consequences of MTU size…
[Chart: qperf bandwidth (0–300) on t2.micro and m4.xlarge instances, comparing bare metal, Calico with MTU=1440, and Calico with MTU=8980.]
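The arithmetic behind the MTU gotcha: every encapsulation header shrinks the payload a tunneled packet can carry, and an inner MTU that doesn't account for it causes fragmentation or drops. The header sizes below are the standard ones (20 bytes for IP-in-IP, 50 for VXLAN over Ethernet); the link MTUs are standard Ethernet and AWS jumbo frames.

```python
# Sketch of MTU-under-encapsulation arithmetic. Overheads are
# standard header sizes; values are illustrative of the trade-off.

IPIP_OVERHEAD = 20    # one extra IPv4 header, bytes
VXLAN_OVERHEAD = 50   # outer Ethernet + IP + UDP + VXLAN headers, bytes

def inner_mtu(link_mtu, overhead):
    """Largest packet a tunneled workload can send without fragmentation."""
    return link_mtu - overhead

print(inner_mtu(1500, IPIP_OVERHEAD))   # 1480 on standard Ethernet
print(inner_mtu(9001, IPIP_OVERHEAD))   # 8981 with AWS jumbo frames
print(inner_mtu(1500, VXLAN_OVERHEAD))  # 1450 under VXLAN
```

Setting the workload MTU too low wastes bandwidth on headers; setting it too high for the underlay silently fragments or black-holes large packets, which is what the chart's low-MTU bars show.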
Gotcha #7: IP addresses aren’t infinite
Suppose we assign a /24 per Kubernetes node (=> 254 pods), run 10 VMs per server (each a Kubernetes node), 40 servers per rack, 20 racks per data center, and 4 data centers. We now need a /15 per rack, a /10 per data center, and the entire 10/8 RFC 1918 range to cover the 4 data centers… and hope your business doesn’t expand to need a 5th data center!
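The slide's address-space arithmetic can be checked directly: aggregating a /24 per node up through racks and data centers shortens the prefix by ceil(log2(count)) at each level.

```python
import math

# Reproducing the slide's subnetting math: a /24 per node,
# aggregated per rack, per data center, and across 4 data centers.

def aggregate(prefix_len, count):
    """Prefix length needed to hold `count` blocks of size /prefix_len."""
    return prefix_len - math.ceil(math.log2(count))

node = 24                        # /24 per node => 254 usable pod IPs
rack = aggregate(node, 10 * 40)  # 10 VMs/server * 40 servers = 400 nodes
dc   = aggregate(rack, 20)       # 20 racks per data center
all4 = aggregate(dc, 4)          # 4 data centers

print(rack, dc, all4)  # 15 10 8 -- exactly exhausting the 10/8 range
```

A 5th data center would need a 5th /10, and there are only four /10s in 10/8, which is the slide's punchline.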
Gotcha #8: orchestration platform support is still evolving
DC/OS / Mesos offers multiple ways to network your container:
- net-modules, but it only supports the Mesos containerizer
- Docker networking, but then it’s not fully integrated, e.g. into MesosDNS
- CNI, a possible future, but not here today
- roll-your-own orchestrator/network co-ordination, the approach some of our users have taken
Kubernetes: CNI is fairly stable; fine-grained policy is being added and will move from alpha (annotation-based) to beta (first-class citizen API) in 1.3.
Docker Swarm / Docker Datacenter: still early; libnetwork evolution? policy?
Gotcha #9: Docker libnetwork is “special”
Docker libnetwork provides limited functionality and visibility to plug-ins; e.g. the network name you specify as a user is NOT passed to the underlying SDN.
Consequences:
- diagnostics are hard to correlate
- it’s hard to enable “side-loaded” commands referring to networks created on the Docker command line (e.g. Calico advanced policy)
- it’s hard to network between the Docker virtual network domain and non-containerized workloads
Gotcha #10: at cloud scale, nothing ever converges
“Can you write a function that tells me when all nodes have caught up to the global state?”
Sure…

function is_converged():
    return false
The Future of Cloud Networking
- Flat routed IP networking with fine-grained policy
- A broad set of overlay options
- A de facto industry standard for policy-driven networking for cloud-native applications
Check it out – Calico is in the Mesosphere Universe!
https://www.projectcalico.org/calico-dcos-demo-security-speed-and-no-more-port-forwarding/