make your containers faster: linux container performance tools
TRANSCRIPT
Linux Container Performance Analysis Sasha Goldshtein
CTO, Sela Group
goldshtn
goldshtn
The Plan
• This talk explains how to analyze Linux container performance and which challenges you’ll encounter along the way
• You’ll learn:
To retrieve basic resource utilization per-container
Which deployment strategies to use with container performance tools
To dig into high-CPU issues using perf
To make symbol resolution work across namespaces
To use BPF-based tools with containers
2
Container Resource Utilization
3
Containers Under The Hood in 30 Seconds
CONTROL GROUPS
• Restrict usage (place quotas)
• cpu,cpuacct: used to cap CPU usage and apply CPU shares • docker run --cpus --cpu-shares
• memory: used to cap user and kernel memory usage • docker run --memory --kernel-memory
• blkio: used to cap IOPS and throughput per block device, and to assign weights (shares)
NAMESPACES
• Restrict visibility
• PID: container gets its own PIDs
• mnt: container gets its own /
• net: container gets its own network interfaces
• user: container gets its own user and group ids
• Etc.
USE Checklist for Linux Systems http://www.brendangregg.com/USEmethod/use-linux.html
CPU
Core Core
LLC
RAM
Memory
controller GFX
I/O
controller SSD
E1000
FS
B
PCIe
U: perf
U: mpstat -P 0
S: vmstat 1
E: perf
U: sar -n DEV 1
S: ifconfig
E: ifconfig
U: free -m
S: sar -B
E: dmesg
U: iostat 1
S: iostat -xz 1
E: …/ioerr_cnt U: perf
Retrieving Container Resource Utilization
• High-level, Docker-specific: docker stats
• htop with cgroup column (highly unwieldy)
• systemd-cgtop • Execute a command in the container :
• nsenter -t $PID -p -m top • docker exec -it $CID top
• Control group metrics
• E.g. /sys/fs/cgroup/cpu,cpuacct/*/*/*
• Third-party monitoring solutions: cAdvisor, Intel Snap, PCP, collectd
6
But Beware Of:
• /proc/loadavg showing host statistics
• free showing systemwide memory usage
• iostat showing host block I/O
7
Tool Deployment Strategies
8
Tool Deployment: On The Host
9
User
Kernel
perf_events
node:6 openjdk:8 ubuntu:xenial # perf record -G …
-g -F 97
Tool Deployment: In Container
10
User
Kernel
perf_events
node:6 openjdk:8 ubuntu:xenial
# perf record -G … -g -F 97
☢
Problems
• On the host:
• Need privileged access to the host (most container orchestrators try to abstract this away from you)
• Tools need to understand container pid namespace, mount namespace, and other details (discussed later)
• In the container:
• Need to run the container with extra privileges (e.g. for perf: enable perf_event_open syscall, perf_event_paranoid sysctl)
• Bloats container images with performance tools that are not always used, and increases attack surface
• Performance tools may be throttled by quotas placed on the container
11
Tool Deployment: Sidecar Container
12
User
Kernel
perf_events
node:6 openjdk:8 ubuntu:xenial ubuntu:xenial
# perf record -G … -g -F 97
☢
Profiling Containers with perf
13
perf_events Architecture
14
User Kernel
tcp_msgsend
Page fault
LLC miss
sched_switch
pe
rf_e
ve
nts
libc:malloc
mmap buffer
mmap buffer
Real-time
analysis
perf script… perf.data file
Recording CPU Stacks With perf
• To find a CPU bottleneck, record stacks at timed intervals:
# system-wide perf record -ag -F 97
# specific process perf record -p 188 -g -F 97
# specific cgroup perf record -G docker-1ae… -g -F 97
Legend
-a all CPUs
-p specific process
-G specific cgroup
-g capture call stacks
-F frequency of samples (Hz)
-c # of events in each sample
Symbol Resolution Challenges
• Address to module and symbol resolution, dynamic instrumentation require access to debug information
• Because of mount namespace, container’s binaries and debuginfo are not visible to host (/lib64/libc.so.6 – what?)
• Need to enter the container’s namespace or share the binaries
16
Perf Maps: Containerized Java/Node/.NET…
• For interpreted or JIT-compiled languages, map files need to be generated at runtime $ cat /tmp/perf-1882.map 7f2cd1108880 1e8 Ljava/lang/System;::arraycopy 7f2cd1108c00 200 Ljava/lang/String;::hashCode 7f2cd1109120 2e0 Ljava/lang/String;::indexOf
• When profiling from the host: • PID namespace – always /tmp/perf-1.map in the container, not on the host
• Mount namespace – container’s /tmp/perf-1.map not visible to host
• perf handles this automatically in Linux 4.13+, as do the BCC tools (discussed next)
17
Tracing Containers with BPF
18
BPF Tracing kernel
BPF runtime kprobes
tracepoints
user
control program
BPF program
map
application
USDT
uprobes perf output
installs BPF program and attaches to events
events invoke the BPF program
perf_events
BPF program updates a map or pushes a
new event to a buffer shared with user-space
user-space program is invoked with
data from the shared buffer
user-space program reads statistics
from the map and clears it if necessary
control
program
19
Java/Node/…
Syscall interface
Block I/O Ethernet Scheduler Mem
Device drivers
Filesystem TCP/IP
CPU
Applications
System libraries
profile
llcstat
hardirqs
softirqs
ttysnoop
runqlat
cpudist
offcputime
offwaketime
cpuunclaimed
memleak
oomkill
slabratetop tcptop
tcplife
tcpconnect
tcpaccept
biotop
biolatency
biosnoop
bitesize
filetop
filelife
fileslower
vfscount
vfsstat
cachestat
cachetop
mountsnoop
*fsslower
*fsdist
dcstat
dcsnoop
mdflush
execsnoop
opensnoop
killsnoop
statsnoop
syncsnoop
setuidsnoop
mysqld_qslower
bashreadline
dbslower
dbstat
mysqlsniff
memleak
sslsniff
gethostlatency
deadlock_detector
ustat
ugc
ucalls
uthreads
uobjnew
uflow
argdist
trace
funccount
funclatency
stackcount
20
Summary
• We have seen:
To retrieve basic resource utilization per-container
Which deployment strategies to use with container performance tools
To dig into high-CPU issues using perf
To make symbol resolution work across namespaces
To use BPF-based tools with containers
21
References
• Container performance analysis
• https://www.slideshare.net/brendangregg/container-performance-analysis
• http://batey.info/docker-jvm-flamegraphs.html
• Performance tooling support
• https://lkml.org/lkml/2017/7/5/771
• https://github.com/iovisor/bcc/pull/1051
• Monitoring tools
• https://github.com/intelsdi-x/snap
• http://pcp.io/docs/guide.html
• https://github.com/bobrik/collectd-docker
• https://github.com/ivotron/docker-perf 22
Thank You!
Slides: https://s.sashag.net/ktlv1217
Demos & labs: https://github.com/goldshtn/linux-tracing-workshop
✁
Sasha Goldshtein
CTO, Sela Group
goldshtn
goldshtn