packet filtering & Linux today
from iptables to nftables, back and more
/me
Studied math and computer science
2018- iNNOVO Cloud, Cloud Gardener, OpenStack, k8s, edge computing
2016-2018 FHE3, Sysadmin, internal & external consultant
2012-2016 1&1, DNS Team, System Admin
About iNNOVO
IT as a Service Platform Provider
07.08.2019
● Bank-level compliance & security
● 2x Tier 3+ DCs in Frankfurt
● Modular & standardised edge datacenters
● 50+ employees
  ○ 80% tech engineers / admins
  ○ 20% business development + back office
● Developing & operating standardised, agile ITaaS cloud platforms
● Offices in Frankfurt and Berlin
iMKE: iNNOVO managed Kubernetes engine
NEW! Great team, exciting tasks and interesting technology!
Where are we?
● netfilter / iptables since 11/2002
● in transition to nftables - Migration?
How it works: hooks -> tables -> chain -> rules
very basic example
● iptables -P INPUT DROP
● iptables -A INPUT -p icmp -j ACCEPT
● be more precise … why?
● iptables -A INPUT -p icmp --icmp-type echo-request -j ACCEPT
Parts of a rule: POLICY (-P INPUT DROP), CHAIN (INPUT), MATCH (-p icmp --icmp-type echo-request), TARGET (-j ACCEPT)
Where is iptables used?
● Linux-based router with firewall
● host firewalling
● docker
● k8s
● application level filtering
● debugging
How is iptables used? Issues?
- long list of n rules
- origin?
  - shell script
  - framework
  - …
- O(n) in the worst case: a packet is compared against the rules in order until one matches
- Code duplication in userland and kernel
  - iptables, ip6tables, ebtables, arptables
Issues
● long lists
  → tracking which rule matched which packet
  → in kernel
  → high latencies
How to cope with that?
● ignoring
  ○ missing knowledge/awareness
  ○ issue in big deployments
● big deployment?
  ○ Linux-based routers with many interfaces
  ○ host firewalls for IP blocking (before IP sets)
  ○ k8s network policies
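A minimal sketch of the "IP sets" idea mentioned above: instead of n DROP rules, the addresses go into one set and a single rule references it, so the per-packet cost is a set lookup rather than a walk through n rules. The set name `blocklist`, the function name `gen_blocklist` and the addresses (documentation ranges) are assumptions for illustration. The function only prints an `ipset restore` script; nothing is changed on the host.

```shell
#!/bin/sh
# Print an `ipset restore` script for a set of addresses to block.
# "blocklist" is a hypothetical set name chosen for this sketch.
gen_blocklist() {
  echo "create blocklist hash:ip family inet"
  for ip in "$@"; do
    echo "add blocklist $ip"
  done
}

# Dry run: show what would be loaded.
gen_blocklist 192.0.2.1 198.51.100.7 203.0.113.9

# As root, the set would be loaded and matched by ONE rule:
#   gen_blocklist 192.0.2.1 198.51.100.7 | ipset restore
#   iptables -A INPUT -m set --match-set blocklist src -j DROP
```

Adding or removing an address later changes only the set, not the ruleset.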
Use Case - iptables performance for small rulesets
● enable services, simple stupid: SSH and HTTP(S) (DNS, or …), how hard can that be? -> Easy
● Naive solution, via conntrack
● Pitfalls?
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -p icmp -j ACCEPT
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -i intern -p tcp --dport 22 -j ACCEPT
iptables -A INPUT -p tcp --dport 80 -j ACCEPT
iptables -A INPUT -p tcp --dport 443 -j ACCEPT
Assume:
● Tables and chains empty
● iptables -P INPUT DROP
Use Case - iptables performance for small rulesets
● enable services, simple stupid: SSH and HTTP(S) (DNS, or …), how hard can that be? -> Easy
● Naive solution, via conntrack
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -p icmp -j ACCEPT
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -i intern -p tcp --dport 22 -j ACCEPT
iptables -A INPUT -p udp --dport 53 -j ACCEPT
iptables -A INPUT -p tcp --dport 53 -j ACCEPT
Assume:
● Tables and chains empty
● iptables -P INPUT DROP
Use Case - iptables performance for small rulesets
● enable simple services: SSH and HTTP(S) (DNS, or …), how hard can that be? -> Easy
● Naive solution, via conntrack
● Pitfalls?
  → Conntrack expectation tables might get exhausted
  → loss of control and service
# iptables -t raw -A PREROUTING -p udp --dport 53 -j NOTRACK
# iptables -t raw -A OUTPUT -p udp --sport 53 -j NOTRACK
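To notice the exhaustion pitfall before it causes an outage, the fill level of the conntrack table can be watched. A small sketch, assuming the nf_conntrack module is loaded so that the two /proc files below exist; the function name `conntrack_usage_percent` is made up for this example.

```shell
#!/bin/sh
# Integer percentage of conntrack table usage.
# $1 = current number of entries, $2 = table limit
conntrack_usage_percent() {
  echo $(( $1 * 100 / $2 ))
}

count_file=/proc/sys/net/netfilter/nf_conntrack_count
max_file=/proc/sys/net/netfilter/nf_conntrack_max

# Only report when conntrack is actually present on this host.
if [ -r "$count_file" ] && [ -r "$max_file" ]; then
  usage=$(conntrack_usage_percent "$(cat "$count_file")" "$(cat "$max_file")")
  echo "conntrack usage: ${usage}%"
fi
```

A value that approaches 100% under load is a hint that untracked rules (NOTRACK) or shorter timeouts are worth a look.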
iptables - performance for small rulesets
● enable SSH and DNS, how hard can that be? -> Easy
● DNS DDoS, near line rate 10G/1G, many locations
● could not be filtered properly on AS borders
One possible solution: u32 match
u32 filter: generate-netfilter-u32-dns-rule
iptables performance for small rulesets
u32 filter generate-netfilter-u32-dns-rule
# python generate-netfilter-u32-dns-rule.py --qname heise.de --qtype AAAA
0>>22&0x3C@20&0xFFDFDFDF=0x05484549&&0>>22&0x3C@24&0xDFDFFFDF=0x53450244&&0>>22&0x3C@28&0xDFFFFFFF=0x4500001C
# iptables [...] --match u32 --u32 "$rule" -j DROP
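How to read the rule above: `0>>22&0x3C` yields the IP header length in bytes, `@` jumps to that offset (the UDP header), and offset 20 skips the 8-byte UDP header plus the 12-byte DNS header, so the comparison starts at the query name. The mask byte 0xDF clears the ASCII case bit, which makes the letter comparison case-insensitive; length bytes and the query type use 0xFF. The sketch below is not the generate-netfilter-u32-dns-rule script, only a hypothetical helper (`qname_to_hex`) that prints the upper-cased wire format of a name, so the constants in the rule can be checked by eye.

```shell
#!/bin/sh
# Print the DNS wire format of a query name as hex, letters upper-cased.
# The function body runs in a subshell, so IFS and `set -f` do not leak.
qname_to_hex() (
  IFS=.
  set -f            # no globbing while the name is split on dots
  out=""
  for label in $1; do
    # Each label is encoded as one length byte followed by its bytes.
    len=$(printf '%02X' "${#label}")
    hex=$(printf '%s' "$label" | tr 'a-z' 'A-Z' | od -An -tx1 | tr -d ' \n' | tr 'a-f' 'A-F')
    out="$out$len$hex"
  done
  # The root label (a single zero byte) terminates the name.
  printf '%s00\n' "$out"
)

qname_to_hex heise.de
# → 05484549534502444500
```

Split into 32-bit words this gives 0x05484549 and 0x53450244, followed by 0x4500 and the query type 0x001C (AAAA), which are the three constants in the generated rule.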
tune iptables performance
● state might kill -> connection tracking
● tracking protocols that use UDP -> might be a bad idea
  * DNS, syslog, NTP ...
  → iptables -t raw -A … -m match … -j NOTRACK
● sysctl tunables for timeouts in the conntrack stack
  net.netfilter.nf_conntrack_tcp_timeout_established=7200
  net.netfilter.nf_conntrack_udp_timeout=60
  net.netfilter.nf_conntrack_udp_timeout_stream=180
  ...
Examples: other cool matches (-m)
● u32 - very flexible, but annoying to write
● bpf
● conntrack - use the state of connections
● cgroup
● statistic (--mode random --probability) - testing
● recent - port knocking without daemon
https://www.digitalocean.com/community/tutorials/how-to-configure-port-knocking-using-only-iptables-on-an-ubuntu-vps
Examples: other cool matches & targets (-j)
● REDIRECT - application level filtering, debugging aid
● MARK / CONNMARK
● LOG / ULOG - logging / structured & flexible logging
● TRACE - ruleset debugging helper, shows packet flow through the rulesets
iptables - nftables - Transition
e.g. Debian 10 Buster - iptables-nft is standard
# Warning: iptables-legacy tables present, use iptables-legacy-save to see them
● iptables-nft vs. iptables-legacy
● What’s in /etc/modules, ...?
  ○ iptables-legacy-save | iptables-nft-restore
  ○ remove old modules iptable_filter, ...
  ○ blacklist those modules
How it works: hooks -> tables -> chain -> rules
● dynamic tables and chain creation
● no default tables and chains
  → netfilter hooks
nftables
# nft list tables
# nft list table inet filter
# nft flush ruleset
# nft add table inet filter
### iptables compat
# nft add chain inet filter input { type filter hook input priority 0 \; policy drop \; }
# nft add chain inet filter forward { type filter hook forward priority 0 \; policy drop \; }
# nft add chain inet filter output { type filter hook output priority 0 \; policy accept \; }
#!/usr/sbin/nft -f
# nft add rule inet filter input ct state related,established accept
# nft add rule inet filter input iif lo accept
# nft add rule inet filter input tcp dport 22 accept
atomicity
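What atomicity means in practice: a ruleset file loaded with `nft -f` is applied as one transaction, so either the whole file takes effect or nothing changes; there is no window with a half-loaded ruleset as with a sequence of single iptables calls. A sketch under assumptions: the function name `print_ruleset` and the file name `ruleset.nft` are made up, and the rules mirror the SSH-only example used in this talk. The function only prints the file content.

```shell
#!/bin/sh
# Print a complete nftables ruleset file.
# `flush ruleset` at the top makes the load a full replacement.
print_ruleset() {
  cat <<'EOF'
flush ruleset
table inet filter {
  chain input {
    type filter hook input priority 0; policy drop;
    ct state established,related accept
    iif "lo" accept
    tcp dport 22 accept
  }
  chain forward {
    type filter hook forward priority 0; policy drop;
  }
  chain output {
    type filter hook output priority 0; policy accept;
  }
}
EOF
}

# Dry run: show the file.
print_ruleset

# As root, write it out and load it in one transaction:
#   print_ruleset > ruleset.nft
#   nft -f ruleset.nft
```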
nftables: ingress hook
● no conntrack, before any other tables
● Why is this useful? → veth, macvtap, containers
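A sketch of what an ingress chain can look like: it lives in the `netdev` family and is bound to one device. The device name, the function name `print_ingress_table` and the DNS drop rule are assumptions for illustration only; the function prints the table definition and does not touch the running ruleset.

```shell
#!/bin/sh
# Print a netdev table with an ingress chain bound to one device.
# $1 = device name (defaults to eth0, a placeholder)
print_ingress_table() {
  dev=${1:-eth0}
  cat <<EOF
table netdev filter {
  chain ingress {
    type filter hook ingress device "$dev" priority 0; policy accept;
    udp dport 53 drop
  }
}
EOF
}

# Dry run for a container-facing veth device (hypothetical name).
print_ingress_table veth0

# As root:
#   print_ingress_table veth0 > ingress.nft
#   nft -f ingress.nft
```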
What else do we have?
● iptables/ip6tables/ebtables/arptables
● nftables
● tc
● bpfilter
● XDP
tc and tcpdump
● tc → traffic control, strange syntax, but useful
  ○ QoS
  ○ Filtering
  ○ Mirroring
  ○ Network simulation
● tcpdump: pcap compiles a BPF fragment
  → loaded into kernel
  → attached to an interface
  → hands over matching packets/frames to tcpdump
● How to generate fragments?
# tcpdump -ddd
Note: use an interface with the same encapsulation
tc and tcpdump
# ip tuntap add dev tun0 mode tun; ip l set up tun0
# tcpdump -i tun0 -ddd icmp | tee filter.bpf
7
48 0 0 0
84 0 0 240
21 0 3 64
48 0 0 9
21 0 1 1
6 0 0 262144
6 0 0 0
Note: tun0 transports raw IP packets; the filter might look different on ethernet devices, which carry ethernet frames
tc and tcpdump
# tc qdisc add dev eth0 handle ffff: ingress
# tc filter add dev eth0 parent ffff: bpf bytecode-file filter.bpf action drop
# tc filter show dev eth0 parent ffff:
bpfilter, XDP
● similar to nftables ingress hook, attach fragments to interfaces
● BPF, in fact eBPF
● Hardware offloading possible!
see Cilium, good quick start tutorial, https://docs.cilium.io/en/v1.4/bpf/
Fun fact: loopless 6502 derivative, but with proper register sizes
Questions?
tc and tcpdump - syntax pogo edition!
# tc qdisc add dev eth0 handle ffff: ingress
# tc filter add dev eth0 parent ffff: bpf bytecode-file filter.bpf action drop
# tc filter show dev eth0 parent ffff:
How to delete?
tc filter del dev enp0s8 parent ffff:
local traffic redirection - debugging
# iptables -t nat -A OUTPUT -p tcp --dport 80 -j REDIRECT --to-ports 8080
# iptables -t nat -A OUTPUT -p tcp --dport 443 -j REDIRECT --to-ports 8080
# nc -l 0.0.0.0 8080
# mitmproxy --mode transparent --showhost -k
tc and the network emulator
Simulate delays or losses
tc qdisc add dev eth0 root netem loss 10%
https://wiki.linuxfoundation.org/networking/netem
iptables ... -m statistic --mode random --probability ...