An Exploration of Linux Container Network Monitoring and Visualization
![Page 1: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/1.jpg)
Alban Crequy
Exploration of Linux Container Network Monitoring and Visualization
ContainerCon Europe - October 2016
https://goo.gl/iDL8te
![Page 2: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/2.jpg)
Alban Crequy
∘ Worked on the rkt container run-time
∘ Contributed to systemd
https://github.com/alban
![Page 3: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/3.jpg)
Berlin-based software company building foundational Linux technologies
Some examples of what we work on...
OSTree: git for operating system binaries
![Page 4: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/4.jpg)
Find out more about us…
Blog: https://kinvolk.io/blog
Github: https://github.com/kinvolk
Twitter: https://twitter.com/kinvolkio
Email: [email protected]
![Page 5: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/5.jpg)
Plan
∘ First use-case: visualizing TCP connections
  ∘ Microservices application with containers: Weave Socks
  ∘ CoreOS Linux, Kubernetes, Weave Scope
∘ Using /proc & conntrack
  ∘ Limitations
  ∘ proc connector, eBPF & kprobes
∘ Next use cases:
  ∘ L7, HTTP: eBPF & kprobes
  ∘ Simulating degraded networks with traffic control
![Page 6: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/6.jpg)
The demo application
![Page 7: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/7.jpg)
microservices-demo
https://github.com/microservices-demo/microservices-demo
![Page 8: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/8.jpg)
Some micro-services
[Diagram: Firefox, front-end, catalogue, orders, orders-db, payment]
![Page 9: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/9.jpg)
![Page 10: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/10.jpg)
Orchestrating containers with Kubernetes
![Page 11: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/11.jpg)
Kubernetes Replica Sets
[Diagram: Kubernetes nodes 1, 2 and 3 running replicas of front-end, orders and catalogue]
![Page 12: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/12.jpg)
Kubernetes Services
[Diagram: Kubernetes nodes 1, 2 and 3 running front-end and orders replicas, plus the orders service]
![Page 13: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/13.jpg)
Kubernetes Services
Proxying the traffic from the virtual service IP to a Kubernetes pod
Several implementations possible:
- Userspace proxy in kube-proxy
- Iptables rules (Destination NAT) installed by kube-proxy
- Cilium implements a load balancer based on eBPF (tc level)
![Page 14: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/14.jpg)
Weave Scope
![Page 15: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/15.jpg)
Weave Scope
![Page 16: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/16.jpg)
Weave Scope
demo
![Page 17: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/17.jpg)
procfs
![Page 18: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/18.jpg)
procfs files
- /proc/$PID
- /proc/$PID/ns/net: network namespace
- /proc/$PID/fd/: file descriptors
- /proc/$PID/net/tcp: TCP connections
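The connection table in /proc/$PID/net/tcp is plain text with hexadecimal fields. Below is a minimal sketch of decoding one of its lines, assuming IPv4 and a little-endian host such as x86_64. The function names are made up for this illustration.

```python
import socket
import struct

# Subset of the kernel's TCP state numbers, enough for the illustration.
TCP_STATES = {0x01: "ESTABLISHED", 0x0A: "LISTEN"}


def decode_endpoint(field):
    """Decode 'HEXADDR:HEXPORT' (assumes a little-endian host)."""
    addr_hex, port_hex = field.split(":")
    addr = socket.inet_ntoa(struct.pack("<I", int(addr_hex, 16)))
    return addr, int(port_hex, 16)


def parse_tcp_line(line):
    """Parse one non-header line of /proc/$PID/net/tcp."""
    fields = line.split()
    state = int(fields[3], 16)
    return {
        "local": decode_endpoint(fields[1]),
        "remote": decode_endpoint(fields[2]),
        "state": TCP_STATES.get(state, hex(state)),
        # The inode can be matched against the socket:[inode] links
        # found in /proc/$PID/fd/ to attribute the socket to a process.
        "inode": int(fields[9]),
    }
```

Every poll has to run a parser like this for each network namespace, which is where the cost discussed on the next slides comes from.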
![Page 19: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/19.jpg)
procfs files
![Page 20: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/20.jpg)
procfs limitations
- No notifications
- Need to read procfs for
  - new processes
  - new network namespaces
  - new sockets
  - every second?
- CPU intensive for systems with a high number of processes
- Missing short-lived connections
- Issues with packet modifications (e.g. DNAT)
![Page 21: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/21.jpg)
Packet modifications
[Diagram: path of a packet between a local process on Kubernetes node 1 and a local process on Kubernetes node 2. Stages shown: socket lookup, protocol layer, network layer, NAT, link layer, traffic control (egress and ingress)]
![Page 22: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/22.jpg)
Netlink
![Page 23: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/23.jpg)
Netlink sockets
socket(AF_NETLINK, SOCK_RAW, NETLINK_...);
Several Netlink sockets:
- NETLINK_ROUTE
- NETLINK_INET_DIAG
- NETLINK_SELINUX
- NETLINK_CONNECTOR
- NETLINK_NETFILTER
- ...
![Page 24: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/24.jpg)
conntrack
![Page 25: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/25.jpg)
conntrack -E
- Use NETLINK_NETFILTER sockets to subscribe to Conntrack events from the kernel
- Is aware of NAT rewritings
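Each event printed by conntrack -E carries two tuples, one for the original direction and one for the reply direction, and this is how NAT rewrites show up. Below is a small sketch of parsing such a line and spotting a destination rewrite. The sample addresses in the usage note are invented and the helper names are hypothetical.

```python
import re

EVENT_RE = re.compile(r"^\s*\[(?P<type>\w+)\]\s+(?P<proto>\w+)")
PAIR_RE = re.compile(r"(src|dst|sport|dport)=(\S+)")


def parse_conntrack_event(line):
    """Split a 'conntrack -E' line into original and reply tuples."""
    header = EVENT_RE.match(line)
    pairs = PAIR_RE.findall(line)
    if header is None or len(pairs) < 8:
        return None
    return {
        "type": header.group("type"),
        "proto": header.group("proto"),
        "original": dict(pairs[:4]),   # as seen by the connecting process
        "reply": dict(pairs[4:8]),     # as seen after any NAT rewrite
    }


def destination_rewritten(event):
    """True when the reply does not come from the address that was dialed."""
    original, reply = event["original"], event["reply"]
    return (original["dst"], original["dport"]) != (reply["src"], reply["sport"])
```

For example, an event line with `dst=10.0.0.100 dport=80` in the original tuple and `src=10.32.0.9 sport=8080` in the reply tuple would be reported as rewritten, which matches what a Kubernetes Service implemented with DNAT would produce.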
![Page 26: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/26.jpg)
conntrack limitations
- Conntrack events don’t include:
  - Process ID
  - Network namespace ID
- Conntrack zones are included but not necessarily used by container run-times
- So harvesting procfs regularly is still necessary
![Page 27: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/27.jpg)
Other kind of Netlink sockets?
![Page 28: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/28.jpg)
NETLINK_INET_DIAG
socket(AF_NETLINK, SOCK_RAW, NETLINK_INET_DIAG);
- Fetch information about sockets
- Used by ss (“another utility to investigate sockets”)
- Basic bytecode to filter the sockets (e.g. “INET_DIAG_BC_JMP”)
- But no notification mechanism
- Patch “sock_diag: notify packet socket creation/deletion” (2013) rejected
![Page 29: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/29.jpg)
Kernel Connector
socket(AF_NETLINK, SOCK_RAW, NETLINK_CONNECTOR);
Several Kernel Connector agents:
- Device mapper
- HyperV
- Proc connector
![Page 30: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/30.jpg)
Proc connector
bind(sockfd, ...CN_IDX_PROC...);
sendmsg(sockfd, ...PROC_CN_MCAST_LISTEN...)
- Since Linux v2.6.15 (January 2006)
Notifications for:
- fork
- exec
- exit
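The subscription message and the fork/exec/exit notifications are small fixed-layout structures. The sketch below builds the PROC_CN_MCAST_LISTEN message and decodes a received datagram. The layouts are written from the kernel headers (linux/netlink.h, linux/connector.h, linux/cn_proc.h) as I understand them, so treat them as assumptions to check against your kernel. The function names are hypothetical and no socket is opened here.

```python
import struct

CN_IDX_PROC = 1
CN_VAL_PROC = 1
PROC_CN_MCAST_LISTEN = 1
NLMSG_DONE = 3

PROC_EVENT_FORK = 0x00000001
PROC_EVENT_EXEC = 0x00000002
PROC_EVENT_EXIT = 0x80000000

NLMSGHDR = struct.Struct("=IHHII")    # len, type, flags, seq, pid
CN_MSG = struct.Struct("=IIIIHH")     # idx, val, seq, ack, len, flags
EVENT_HDR = struct.Struct("=IIQ")     # what, cpu, timestamp_ns


def build_listen_message(pid):
    """Bytes to send on the bound socket to start receiving events."""
    op = struct.pack("=I", PROC_CN_MCAST_LISTEN)
    cn = CN_MSG.pack(CN_IDX_PROC, CN_VAL_PROC, 0, 0, len(op), 0)
    total = NLMSGHDR.size + len(cn) + len(op)
    return NLMSGHDR.pack(total, NLMSG_DONE, 0, 0, pid) + cn + op


def decode_proc_event(datagram):
    """Decode one datagram into ('fork'|'exec'|'exit', ...) or ('other', what)."""
    offset = NLMSGHDR.size + CN_MSG.size
    what, _cpu, _timestamp = EVENT_HDR.unpack_from(datagram, offset)
    body = offset + EVENT_HDR.size
    if what == PROC_EVENT_FORK:
        _ppid, parent_tgid, _cpid, child_tgid = struct.unpack_from("=iiii", datagram, body)
        return ("fork", parent_tgid, child_tgid)
    if what == PROC_EVENT_EXEC:
        _pid, tgid = struct.unpack_from("=ii", datagram, body)
        return ("exec", tgid)
    if what == PROC_EVENT_EXIT:
        _pid, tgid, exit_code, _signal = struct.unpack_from("=iiII", datagram, body)
        return ("exit", tgid, exit_code)
    return ("other", what)
```

In a real listener the message would be sent on an AF_NETLINK socket of protocol NETLINK_CONNECTOR bound to the CN_IDX_PROC group, which normally requires elevated privileges.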
![Page 31: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/31.jpg)
Proc connector
Missing:
- network namespace
  - RFC patch “proc connector: add namespace events” last month: https://lkml.org/lkml/2016/9/8/588
- Sockets
So harvesting procfs regularly still necessary
![Page 32: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/32.jpg)
Proc connector
demo
![Page 33: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/33.jpg)
BPF
![Page 34: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/34.jpg)
Classic BPF (cBPF)
[Diagram: userspace attaches a BPF program (BPF_JMP..., BPF_LD..., BPF_RET...) to a socket in the kernel]
setsockopt(sockfd, SOL_SOCKET, SO_ATTACH_FILTER, &bpf, sizeof(bpf));
recvfrom()
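To make the BPF_LD / BPF_JMP / BPF_RET instructions concrete, here is a sketch of a four-instruction classic BPF program that accepts only IPv4 frames. It comes with an encoder for the struct sock_filter array that SO_ATTACH_FILTER expects and a toy interpreter covering just these opcodes. It assumes the filter sees Ethernet frames, as on a packet socket. The interpreter is for illustration only and is not how the kernel runs filters.

```python
import struct

# Opcode pieces from <linux/bpf_common.h>.
BPF_LD, BPF_JMP, BPF_RET = 0x00, 0x05, 0x06
BPF_H, BPF_ABS = 0x08, 0x20
BPF_JEQ, BPF_K = 0x10, 0x00

# Each instruction is (code, jt, jf, k).
IPV4_ONLY = [
    (BPF_LD | BPF_H | BPF_ABS, 0, 0, 12),        # A = 16-bit EtherType at offset 12
    (BPF_JMP | BPF_JEQ | BPF_K, 0, 1, 0x0800),   # IPv4? fall through : skip one
    (BPF_RET | BPF_K, 0, 0, 0xFFFF),             # accept up to 65535 bytes
    (BPF_RET | BPF_K, 0, 0, 0),                  # drop
]


def encode(program):
    """Pack as an array of struct sock_filter {u16 code; u8 jt; u8 jf; u32 k;}."""
    return b"".join(struct.pack("=HBBI", *ins) for ins in program)


def run(program, packet):
    """Toy interpreter for the three opcodes used above."""
    accumulator = 0
    pc = 0
    while pc < len(program):
        code, jt, jf, k = program[pc]
        pc += 1
        if code == BPF_LD | BPF_H | BPF_ABS:
            if k + 2 > len(packet):
                return 0  # out-of-bounds loads reject the packet
            accumulator = int.from_bytes(packet[k:k + 2], "big")
        elif code == BPF_JMP | BPF_JEQ | BPF_K:
            pc += jt if accumulator == k else jf
        elif code == BPF_RET | BPF_K:
            return k
        else:
            raise ValueError("opcode not covered by this sketch")
    return 0
```

The return value is the number of bytes of the packet to keep, so zero means the packet is filtered out before recvfrom() ever sees it.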
![Page 35: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/35.jpg)
Extended BPF (or eBPF)
Program type:
- BPF_PROG_TYPE_SOCKET_FILTER
- BPF_PROG_TYPE_KPROBE
- BPF_PROG_TYPE_SCHED_CLS
- BPF_PROG_TYPE_SCHED_ACT
- BPF_PROG_TYPE_TRACEPOINT (Linux >= 4.7)
- BPF_PROG_TYPE_XDP
![Page 36: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/36.jpg)
eBPF classifier for qdiscs
[Diagram: build pipeline for a classifier attached to eth0, from userspace to kernel]
- C source: if (skb->protocol…) return TC_H_MAKE(TC_H_ROOT, mark);
- compilation: clang... -march=bpf
- BPF bytecode: BPF_JMP... BPF_LD... BPF_RET...
- upload in the kernel:
  - bpf()
  - Netlink
- JIT compilation: x86_64 code
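The TC_H_MAKE(TC_H_ROOT, mark) expression in the snippet above packs a 16-bit major and a 16-bit minor number into one 32-bit traffic-control handle. Below is a sketch of that arithmetic, with the masks transcribed from linux/pkt_sched.h. The formatting helper is hypothetical and only mirrors the major:minor hexadecimal notation that tc prints.

```python
TC_H_MAJ_MASK = 0xFFFF0000
TC_H_MIN_MASK = 0x0000FFFF
TC_H_ROOT = 0xFFFFFFFF


def tc_h_make(major, minor):
    """Python equivalent of the TC_H_MAKE(maj, min) macro."""
    return (major & TC_H_MAJ_MASK) | (minor & TC_H_MIN_MASK)


def format_handle(handle):
    """Render a handle in tc's 'major:minor' hexadecimal notation."""
    return "%x:%x" % (handle >> 16, handle & TC_H_MIN_MASK)
```

With a mark of 0x10, `format_handle(tc_h_make(TC_H_ROOT, 0x10))` gives `ffff:10`: the major half comes from TC_H_ROOT and the minor half from the mark.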
![Page 37: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/37.jpg)
eBPF maps
[Diagram: eBPF maps shared between the x86_64 code in the kernel and a userspace program]
∘ Keep context between calls
∘ Report statistics to userspace
![Page 38: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/38.jpg)
Tracepoints with eBPF
- BPF_PROG_TYPE_TRACEPOINT since Linux 4.7
- Find the list of tracepoints in /sys/kernel/debug/tracing/events
- Stable API
- But limited tracepoints
![Page 39: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/39.jpg)
kprobes with eBPF
- BPF_PROG_TYPE_KPROBE since Linux 4.1
- No ABI guarantees
- Probe any kernel function
![Page 40: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/40.jpg)
Socket events with kprobe / eBPF
- BPF Compiler Collection (BCC)
  - bcc/examples/tracing/tcpv4connect.py
- Iago’s tcp4tracer (WIP)
  - Get connection tuple, pid, netns
  - tcp_v4_connect
  - tcp_close
  - inet_csk_accept
![Page 41: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/41.jpg)
Packet modifications
[Diagram: path of a packet between a local process on Kubernetes node 1 and a local process on Kubernetes node 2. Stages shown: socket lookup, protocol layer, network layer, NAT, link layer, traffic control (egress and ingress)]
![Page 42: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/42.jpg)
tcp4tracer & NAT
- The connection tuple from the process’ point of view is not enough
  - NAT
  - Kubernetes Services
- Iago’s tcp4tracer (WIP)
  - nf_nat_ipv4_manip_pkt
  - nf_nat_tcp_manip_pkt
![Page 43: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/43.jpg)
More metrics
![Page 44: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/44.jpg)
Weave Scope architecture
[Diagram: Firefox, the Scope App, and one Scope Probe on each of Kubernetes node 1 and Kubernetes node 2]
![Page 45: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/45.jpg)
Weave Scope plugins
[Diagram: Firefox, the Scope App, and one Scope Probe on each of Kubernetes node 1 and Kubernetes node 2, with plugins attached to the probes]
![Page 46: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/46.jpg)
![Page 47: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/47.jpg)
HTTP requests plugin
- Number of HTTP requests per second
- Without instrumenting the application
- eBPF kprobe on skb_copy_datagram_iter
[Diagram: the HTTP client in userspace calls sendmsg() with "GET / HTTP/1.1"; the HTTP server calls recvfrom(); in the kernel, skb_copy_datagram_iter() copies the skb into the iovec]
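Counting requests without instrumenting the application comes down to recognizing a request line at the start of the data copied to the receiver. The sketch below shows that check in userspace, assuming the probe hands over a timestamp and the first bytes of each payload. The method list and function names are illustrative choices and are not taken from the plugin.

```python
import collections

# Request methods to recognise; an illustrative subset.
HTTP_METHODS = (b"GET ", b"HEAD ", b"POST ", b"PUT ", b"DELETE ", b"OPTIONS ", b"PATCH ")


def looks_like_http_request(payload):
    """True when the payload starts with a known HTTP request method."""
    return payload.startswith(HTTP_METHODS)


def requests_per_second(events):
    """events: iterable of (timestamp_in_seconds, payload_prefix) pairs."""
    counts = collections.Counter()
    for timestamp, payload in events:
        if looks_like_http_request(payload):
            counts[int(timestamp)] += 1
    return dict(counts)
```

Only a short prefix of each payload is needed, which keeps the amount of data inspected per call small.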
![Page 48: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/48.jpg)
HTTP responses plugin
- Number of HTTP responses by category (404, etc.)
- Without instrumenting the application
- eBPF kprobe on skb_copy_datagram_from_iter
- Using an eBPF map to track the context between kprobe & kretprobe
[Diagram: the HTTP server in userspace calls sendmsg() with "HTTP/1.0 200 OK"; the HTTP client calls recvfrom(); in the kernel, skb_copy_datagram_from_iter() copies the iovec into the skb]
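The kprobe/kretprobe pairing can be pictured as two handlers sharing a map keyed by thread: the entry handler remembers what is being copied, and the return handler looks it up and classifies it once the copy has succeeded. The following is a userspace simulation of that idea. A plain dict stands in for the eBPF map, and treating a return value of 0 as success is an assumption about skb_copy_datagram_from_iter. All names are hypothetical.

```python
import collections

in_flight = {}                          # stands in for the eBPF hash map
status_counts = collections.Counter()   # responses per category


def status_class(payload):
    """Map b'HTTP/1.0 200 OK' to '2xx'; None if it is not a status line."""
    parts = payload.split(b" ", 2)
    if len(parts) < 2 or not parts[0].startswith(b"HTTP/"):
        return None
    code = parts[1][:3]
    if len(code) != 3 or not code.isdigit():
        return None
    return "%sxx" % chr(code[0])


def on_entry(thread_id, payload):
    """kprobe side: stash the context until the function returns."""
    in_flight[thread_id] = payload


def on_return(thread_id, retval):
    """kretprobe side: recover the context and count the response."""
    payload = in_flight.pop(thread_id, None)
    if payload is None or retval != 0:
        return
    category = status_class(payload)
    if category is not None:
        status_counts[category] += 1
```

Keying by thread works because a thread cannot enter the probed function a second time before its first call returns, so at most one entry per thread is pending.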
![Page 49: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/49.jpg)
Testing degraded networks
![Page 50: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/50.jpg)
Traffic control, why?
[Diagram: a web server and several clients connected through the Internet]
∘ fair distribution of bandwidth
∘ reserve bandwidth to specific applications
∘ avoid bufferbloat
![Page 51: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/51.jpg)
Queuing disciplines (qdisc)
∘ Network scheduling algorithm
  ∘ which packet to emit next?
  ∘ when?
∘ Configurable at run-time:
  ∘ /sbin/tc
  ∘ Netlink
∘ Default on new network interfaces: sysctl net.core.default_qdisc
[Diagram: a qdisc sits between eth0 and the Internet]
![Page 52: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/52.jpg)
Stochastic Fairness Queueing (sfq)
[Diagram: packets are spread over FIFO 0, FIFO 1, ..., FIFO n and served round robin on eth0 towards the Internet]
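The picture of several FIFOs drained round robin can be simulated in a few lines. This sketch hashes each flow to a bucket and serves the buckets in turn, so a busy flow cannot starve a quiet one. It is a simplification: the bucket count, the hash function, and the class name are arbitrary choices for illustration, and the periodic hash perturbation that the real sfq qdisc performs is left out.

```python
import collections
import zlib


class SfqSketch:
    """Toy model of stochastic fairness queueing."""

    def __init__(self, buckets=4, bucket_of=None):
        self.queues = [collections.deque() for _ in range(buckets)]
        self.next_bucket = 0
        # Default hash is arbitrary; real sfq uses its own flow hash.
        self.bucket_of = bucket_of or (lambda flow: zlib.crc32(repr(flow).encode()))

    def enqueue(self, flow, packet):
        """Append the packet to the FIFO selected by the flow's hash."""
        index = self.bucket_of(flow) % len(self.queues)
        self.queues[index].append(packet)

    def dequeue(self):
        """Serve the next non-empty FIFO in round-robin order."""
        for _ in range(len(self.queues)):
            queue = self.queues[self.next_bucket]
            self.next_bucket = (self.next_bucket + 1) % len(self.queues)
            if queue:
                return queue.popleft()
        return None
```

The fairness is "stochastic" because two flows that hash to the same bucket share one FIFO and therefore one turn.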
![Page 53: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/53.jpg)
Demo
Reproduce this demo yourself: https://github.com/kinvolk/demo
![Page 54: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/54.jpg)
Network emulator (netem)
[Diagram: netem sits between eth0 and the Internet; parameters: bandwidth, latency, packet loss, corrupt...]
![Page 55: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/55.jpg)
Testing with containers
[Diagram: a testing framework configures “netem” qdiscs (bandwidth, latency, packet drop...) on the eth0 of container 1 and container 2]
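A testing framework of this kind mostly needs to turn a desired degradation into a tc invocation inside the container's network namespace. Below is a sketch of that translation. The helper names and parameters are hypothetical, and the commands are only built as argument lists here, never executed.

```python
def netem_command(dev, delay_ms=None, loss_pct=None, rate_kbit=None):
    """Build a 'tc qdisc replace ... netem' command line as an argv list."""
    cmd = ["tc", "qdisc", "replace", "dev", dev, "root", "netem"]
    if delay_ms is not None:
        cmd += ["delay", "%dms" % delay_ms]
    if loss_pct is not None:
        cmd += ["loss", "%g%%" % loss_pct]
    if rate_kbit is not None:
        cmd += ["rate", "%dkbit" % rate_kbit]
    return cmd


def in_netns_of(pid, cmd):
    """Prefix a command so it runs in the network namespace of process pid."""
    return ["nsenter", "-t", str(pid), "-n"] + cmd
```

For example, `in_netns_of(4242, netem_command("eth0", delay_ms=100, loss_pct=1))` yields a command that a test harness could pass to `subprocess.run` with root privileges. The pid 4242 is a placeholder for a process inside the target container.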
![Page 56: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/56.jpg)
Add latency on a specific connection
[Diagram: the micro-services graph (Firefox, front-end, catalogue, orders, orders-db, payment) with latency=100ms added on one connection]
![Page 57: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/57.jpg)
How to define classes of traffic
[Diagram: a netem qdisc with latency=100ms on the eth0 interface; traffic split into dest_ip=10.0.4.*, dest_ip=10.0.5.* and other]
![Page 58: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/58.jpg)
u32: filter on content
[Diagram: qdisc tree on eth0, from top to bottom]
- interface: eth0
- root qdisc (type = HTB)
- root class (type = HTB)
- filters (type=u32): ip=10.0.4.*, ip=10.0.5.*, other
- leaf classes (type = HTB)
- leaf qdiscs (type = netem), e.g. latency=10ms
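The tree above maps onto a short sequence of tc commands: one HTB root qdisc, one root class, then a leaf class, a netem qdisc, and a u32 filter per destination subnet. Below is a sketch that generates such a sequence. The handle numbering, the catch-all class 1:2, and the 1gbit rate are assumptions picked for the example, not values from the talk.

```python
def htb_netem_commands(dev, delays_by_subnet, rate="1gbit"):
    """Return tc commands adding a per-subnet delay behind an HTB tree."""
    cmds = [
        # Root qdisc; unmatched traffic goes to class 1:2 ("other").
        "tc qdisc add dev %s root handle 1: htb default 2" % dev,
        # Root class and the catch-all leaf class.
        "tc class add dev %s parent 1: classid 1:1 htb rate %s" % (dev, rate),
        "tc class add dev %s parent 1:1 classid 1:2 htb rate %s" % (dev, rate),
    ]
    for index, (subnet, delay_ms) in enumerate(delays_by_subnet):
        minor = "%x" % (0x10 + index)  # tc reads class ids as hexadecimal
        cmds += [
            # Leaf class for this subnet.
            "tc class add dev %s parent 1:1 classid 1:%s htb rate %s" % (dev, minor, rate),
            # Leaf qdisc adding the latency.
            "tc qdisc add dev %s parent 1:%s handle %s: netem delay %dms"
            % (dev, minor, minor, delay_ms),
            # u32 filter steering matching packets into the leaf class.
            "tc filter add dev %s protocol ip parent 1: prio 1 u32 match ip dst %s flowid 1:%s"
            % (dev, subnet, minor),
        ]
    return cmds
```

Calling `htb_netem_commands("eth0", [("10.0.4.0/24", 10), ("10.0.5.0/24", 20)])` produces nine commands, which could be reviewed before being run as root on a test machine.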
![Page 59: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/59.jpg)
Filtering with cBPF/eBPF
[Diagram: a BPF classifier on eth0 in front of netem qdiscs, built from userspace to kernel]
- C source: if (skb->protocol…) return TC_H_MAKE(TC_H_ROOT, mark);
- compilation: clang... -march=bpf
- BPF bytecode: BPF_JMP... BPF_LD... BPF_RET...
- upload in the kernel:
  - bpf()
  - Netlink
- JIT compilation: x86_64 code
![Page 60: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/60.jpg)
eBPF maps
[Diagram: a BPF classifier on eth0 in front of netem qdiscs; the x86_64 code in the kernel and the userspace tool tc share an eBPF map]
![Page 61: An Exploration of Linux Container Network Monitoring and](https://reader036.vdocument.in/reader036/viewer/2022082213/58a2e9a21a28ab02228b9242/html5/thumbnails/61.jpg)
Questions?
The slides: https://goo.gl/iDL8te