How to run Docker Engine on nftables
The Linux world is moving firewall management from legacy iptables to the newer (and better) nftables. This can cause problems with Docker Engine, which still has no native support for nftables. The usual workarounds rely on the iptables-nft compatibility layer and on converting legacy settings to the nft format. That isn't ideal for people who prefer not to install anything iptables-related on their systems, and some distributions already ship without the compatibility layer, so here's my solution for running Docker on nftables.
What to expect?
- Docker containers can access the Internet using source NAT.
- Docker networks work and the host can access containers on a bridge network by their IP.
- Port forwarding works and we can bind a Docker service to a port on the main interface.
- Docker no longer creates automatic allow rules for ports exposed with port forwarding. I see it as a feature, because people unaware of the default behavior could open ports to the public Internet that were meant to be accessible only from the local network. With this solution, each port must be explicitly allowed in the nftables config to be accessible from outside.
Step 1 - Disable iptables handling in Docker
We must tell Docker Engine to disable its iptables integration. This means Docker will no longer manage firewall rules for us, but on a system without iptables there is nothing to manage anyway, and leaving the integration enabled could make Docker fail when starting a container.
Edit /etc/docker/daemon.json
{
"iptables": false
}
If the file doesn't exist, just create it.
Restart the Docker service to apply daemon config changes:
systemctl restart docker.service
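Before moving on, it can be worth confirming that the setting really is in the file. Here is a minimal sketch of such a check. The function name iptables_disabled is made up for this example, and the match is a plain line-based text search rather than a real JSON parser, so it will miss a key and value split across lines.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given daemon.json sets "iptables" to false.
# This is a line-based text match, not a JSON parser, so treat it as a sanity check only.
iptables_disabled() {
    grep -Eq '"iptables"[[:space:]]*:[[:space:]]*false' "$1" 2>/dev/null
}

# Path from this guide; pass another path as the first argument to override it.
conf="${1:-/etc/docker/daemon.json}"
if iptables_disabled "$conf"; then
    echo "iptables integration is disabled in $conf"
else
    echo "iptables integration is NOT disabled in $conf (or the file is missing)"
fi
```

If the file already contains other settings, add the "iptables" key to the existing object instead of replacing the whole file.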
Step 2 - Enable IP forwarding
IP forwarding must be enabled to forward packets between the host and the Docker interfaces. It turns your system into a router, so keep security in mind and don't expose the local network to the Internet by accident. Our nftables config will manage the firewall for forwarding.
Edit /etc/sysctl.conf and uncomment (or add) the following settings:
# Enable IPv4 forwarding
net.ipv4.ip_forward=1
# Enable IPv6 forwarding
# You can skip it, if you don't use IPv6 with Docker
net.ipv6.conf.all.forwarding=1
Reload the system config to apply the changes:
sysctl -p
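To double-check that the kernel picked up the values, you can read them back from /proc/sys, where each sysctl key appears as a file (dots become slashes). A small sketch, with forwarding_enabled as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given procfs file contains "1".
forwarding_enabled() {
    [ "$(cat "$1" 2>/dev/null)" = "1" ]
}

# net.ipv4.ip_forward and net.ipv6.conf.all.forwarding as files under /proc/sys
for f in /proc/sys/net/ipv4/ip_forward /proc/sys/net/ipv6/conf/all/forwarding; do
    if forwarding_enabled "$f"; then
        echo "enabled:  $f"
    else
        echo "disabled: $f (or not available)"
    fi
done
```

The IPv6 file may be reported as disabled or unavailable if you skipped the IPv6 setting, which is fine when Docker only uses IPv4.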
Step 3 - Configure nftables to allow Docker
Without iptables, Docker can't create the firewall rules for NAT or forwarding so we must configure them manually. The rules will be handled by nftables and here's an example config to make it work.
We will create the chains as follows:
- table inet nat chain postrouting - rules enabling source NAT from Docker interfaces to the WAN interface masquerading on the public IP.
- table inet filter chain input - rules filtering the input traffic from the Internet and on the local interfaces.
- table inet filter chain forward - rules to filter IP forwarding and allow it only between Docker interfaces and the WAN interface for source NAT.
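The config needs the name of the Internet-facing interface (WAN_NAT_IFACE). Besides looking through `ip addr`, one option is to take the interface of the default route. This is only a sketch and assumes the default route really points at your WAN interface, which may not hold on multi-homed or VPN setups. The function name default_route_iface is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: reads `ip route` output on stdin and prints the
# interface named after "dev" on the first default route it finds.
default_route_iface() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++) {
            if ($i == "dev") { print $(i + 1); exit }
        }
    }'
}

# Usage on a live system (requires iproute2):
#   ip -4 route show default | default_route_iface
```

The helper only parses text, so it can be tried on saved `ip route` output from another machine as well.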
Edit /etc/nftables.conf
#!/usr/sbin/nft -f
flush ruleset
# The name of network interface connected to the Internet for source NAT.
# You can find it using `ip addr` command and looking for an interface with
# your public IP. Usually it's eth0 or enpXsY depending on your distribution.
define WAN_NAT_IFACE = "enp1s0"
table inet nat {
chain prerouting {
type nat hook prerouting priority filter;
}
# Postrouting chain must set up source NAT for Docker interfaces
# to let them access the Internet from private IP space.
chain postrouting {
type nat hook postrouting priority filter;
# Enable source NAT masquerading on the Internet interface IP
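# iifname matches by name, so this rule does not require docker0 to exist at load time (e.g. at boot)
iifname "docker0" oif $WAN_NAT_IFACE masquerade comment "SNAT for Docker"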
iifname "br-*" oif $WAN_NAT_IFACE masquerade comment "SNAT for Docker"
}
}
table inet filter {
chain input {
type filter hook input priority filter; policy drop;
# Basic rules to accept loopback and established connections
iif lo accept comment "Accept loopback"
ct state invalid drop comment "Drop invalid"
ct state established,related accept comment "Accept traffic originated from server"
# Basic rules to accept ICMP and various network protocols for auto-configuration
meta l4proto icmp accept comment "Accept ICMP"
meta l4proto ipv6-icmp accept comment "Accept ICMPv6"
ip protocol igmp accept comment "Accept IGMP"
udp dport mdns ip6 daddr ff02::fb accept comment "Accept mDNS"
udp dport mdns ip daddr 224.0.0.251 accept comment "Accept mDNS"
udp sport bootpc udp dport bootps ip saddr 0.0.0.0 ip daddr 255.255.255.255 accept comment "Accept DHCPDISCOVER (for DHCP-Proxy)"
# Arbitrary rules to accept SSH and HTTP(S) traffic
tcp dport ssh accept comment "Accept SSH"
tcp dport { http, https } accept comment "Accept HTTP/HTTPS"
udp dport https accept comment "Accept HTTPS/QUIC"
# Accept any on local networks
# Please adapt these rules if the local traffic on your network is not fully trusted
# In general, we need to allow at least the subnets of Docker interfaces, which are
# created as /16 networks on 172.16.0.0/12 by default
ip daddr 10.0.0.0/8 accept comment "Accept local 10.0.0.0/8"
ip daddr 172.16.0.0/12 accept comment "Accept local 172.16.0.0/12"
ip daddr 192.168.0.0/16 accept comment "Accept local 192.168.0.0/16"
# Docker will NOT add automatic rules for exposed ports
# so if you want to expose another port to the Internet,
# you have to add an explicit rule:
#
# tcp dport 2137 accept comment "Accept TCP on :2137"
# udp dport 2137 accept comment "Accept UDP on :2137"
#
# After adding a rule, reload nftables with:
# /usr/sbin/nft -f /etc/nftables.conf
#
# Uncomment to log dropped packets for troubleshooting
# It's commented by default because real-world public network
# can drop A LOT of traffic.
#
# log drop;
}
# Forward chain must allow forwarding in and out of Docker interfaces
chain forward {
type filter hook forward priority filter; policy drop;
# Allow forwarding to WAN interface for source NAT
oif $WAN_NAT_IFACE accept comment "Allow source NAT"
# Allow forwarding in and out of Docker interfaces
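# iifname/oifname match by name, so these rules do not require docker0 to exist at load time
iifname "docker0" accept comment "Allow Docker"
oifname "docker0" accept comment "Allow Docker"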
iifname "br-*" accept comment "Allow Docker Networks"
oifname "br-*" accept comment "Allow Docker Networks"
# Log dropped packets for troubleshooting
log drop;
}
chain output {
type filter hook output priority filter; policy accept;
}
}
Reload nftables with:
/usr/sbin/nft -f /etc/nftables.conf
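Because the file starts with "flush ruleset", it is nice to catch a typo before loading it. nft has a check mode (`-c`, `--check`) that parses a file without applying it, so one possible approach is to validate first and load only on success. A sketch, where safe_reload is a hypothetical helper and the NFT variable exists only so the binary can be swapped out:

```shell
#!/bin/sh
# NFT can be overridden (e.g. for testing); it defaults to the path used in this guide.
NFT="${NFT:-/usr/sbin/nft}"

# Hypothetical helper: validate the file first, load it only if validation passes.
safe_reload() {
    conf="${1:-/etc/nftables.conf}"
    if "$NFT" -c -f "$conf"; then
        # Check passed, apply the ruleset for real
        "$NFT" -f "$conf"
    else
        echo "syntax check failed, ruleset not reloaded: $conf" >&2
        return 1
    fi
}

# Usage (as root):
#   safe_reload /etc/nftables.conf
```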
That's all. From now on, Docker networking should work, and containers should be accessible from the local network and through port forwarding rules (the latter are actually handled by docker-proxy, because that's not forwarding in the firewall sense).
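A quick way to see whether source NAT works is to ping something from a throwaway container. The sketch below is just one way to do it: the busybox image and the 1.1.1.1 target are example choices, container_can_reach is a hypothetical helper, and the DOCKER variable is there only so the command can be swapped out.

```shell
#!/bin/sh
# DOCKER can be overridden (e.g. for testing); it defaults to the docker CLI in PATH.
DOCKER="${DOCKER:-docker}"

# Hypothetical helper: start a temporary container and send one ping from it.
# Succeeds only if the container got a reply within 3 seconds.
container_can_reach() {
    "$DOCKER" run --rm busybox ping -c 1 -W 3 "${1:-1.1.1.1}" >/dev/null 2>&1
}

# Usage:
#   container_can_reach 1.1.1.1 && echo "SNAT works" || echo "no connectivity"
```

If the check fails, the forward chain log is the first place to look, since dropped forwarded packets are logged there.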
To make sure that nftables config is automatically loaded after reboots, enable the nftables service:
systemctl enable nftables.service
systemctl start nftables.service
Troubleshooting
If your traffic seems to get dropped, make sure a "log drop" rule sits at the end of the input and forward chains in nftables.conf (in the example config it is already active in the forward chain and commented out in the input chain), then reload the firewall. Every dropped packet will be written to the system log (e.g. /var/log/syslog), so you can see what needs to be added to nftables.conf.
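On a busy host the log fills up quickly, so it helps to condense it. Kernel netfilter log lines carry fields such as PROTO= and DPT=, and the sketch below counts dropped packets per protocol and destination port. The name summarize_drops is made up for this example, and lines without a port (ICMP, for instance) are simply ignored.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log lines on stdin and prints
# "<count> <PROTO>/<DPT>" for every protocol/port pair, most frequent first.
summarize_drops() {
    awk '{
        proto = ""; dpt = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^PROTO=/) proto = substr($i, 7)   # text after "PROTO="
            if ($i ~ /^DPT=/)   dpt   = substr($i, 5)   # text after "DPT="
        }
        if (proto != "" && dpt != "") count[proto "/" dpt]++
    }
    END { for (k in count) print count[k], k }' | sort -rn
}

# Usage on a live system, depending on where your kernel log ends up:
#   journalctl -k | summarize_drops
#   summarize_drops < /var/log/syslog
```

A port that shows up at the top of the list and belongs to one of your services is a good candidate for an explicit accept rule.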
You can also dump the current runtime firewall config using:
nft list ruleset
General recommendations
The example firewall config assumes full trust on the local network and does rather minimal filtering. It should be a reasonable default to copy-and-paste onto a cloud VPS and call it a day, but if your environment needs more security, review the rules and adapt them. Having the declarative config in a single file makes working with nftables a pleasure, so even with minimal knowledge of Linux netfilter you can express fairly complicated rules rather easily. Good sources for learning nftables are the official nftables wiki and the Arch Linux wiki (best for beginners).
If you are feeling fancy, then instead of allowing forwarding in and out of all Docker interfaces, you can create explicit rules for which interfaces can forward to which, and isolate different networks from each other or from other interfaces. Alternatively, it can be done by source/destination IP in the input filter chain.
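To write per-network rules you need the actual bridge interface names. As far as I know, Docker names the bridge of a user-defined network "br-" plus the first 12 characters of the network ID, unless the name was set explicitly when the network was created. The sketch below relies on that assumption, and bridge_iface_for_id is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: derive the bridge interface name from a Docker network ID.
# Assumption: default naming, "br-" followed by the first 12 characters of the ID.
bridge_iface_for_id() {
    printf 'br-%.12s\n' "$1"
}

# Usage on a live system. Note that the default network called "bridge"
# uses docker0, so ignore the derived name for that one:
#   docker network ls --filter driver=bridge --format '{{.ID}} {{.Name}}' |
#   while read -r id name; do
#       echo "$name -> $(bridge_iface_for_id "$id")"
#   done
```

With the names in hand, the broad "br-*" rules in the forward chain can be replaced by pairs of iifname/oifname rules naming only the bridges that should talk to each other. Compare the result with `ip link` before relying on it.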