不论是Kubernetes还是Docker本身,我们在一个节点上暴露端口对外提供服务的时候,总是少不了端口转发。例如以下方式:
docker run -d -p 8080:80 nginx:latest
就是用最简单的方法,在本机暴露8080端口,转发到容器内的80端口,这样一来外界可以通过本机IP+8080端口就可以访问容器内提供的nginx服务了。
大家有没有想过,这是怎么做到的呢?
其实不难,Docker实现本身就是高度依赖Linux内核的若干模块,这里的端口转发也不例外,也是用了Linux iptables的小把戏,接下来我们就来分析下这个小把戏。
既是知道Docker的端口转发与iptables有关,那么究竟有什么样的关系呢?我觉得从log入手,尝试记录下数据包通过iptables的轨迹,然后再作分析吧。
在不确定具体的涉及哪些iptables表的情况下,我想最好的方法是把所有的可能的表都加上适当的log,从而记录数据包在各个表和链之间的流转。
接下来,我们就来照此思路去探究iptables如何做到端口转发。
在CentOS7中打开 /etc/rsyslog.conf ,并添加如下配置:
kern.* /var/log/iptables.log
并重启rsyslog:
systemctl restart rsyslog.service
这样,就可以在 /var/log/iptables.log 中看到iptables的log了,当然,前提是你得有log输出。
在操作以前,首先保存原先的iptables配置,以免后续操作带有破坏性,也便于还原:
iptables-save > iptb_origin
然后,分别在所有可能的表的所有的可能涉及的链中增加对于源目端口为80或者8080的DEBUG log,可以用如下脚本方便进行:
#!/bin/bash
for table in raw filter nat mangle
do
for chain in PREROUTING OUTPUT FORWARD INPUT POSTROUTING
do
for port in 8080 80
do
iptables -t ${table} -A ${chain} -p tcp -m tcp --dport $port -j LOG --log-prefix "[debug]-${table}-${chain}:" --log-level 7 || true
iptables -t ${table} -A ${chain} -p tcp -m tcp --sport $port -j LOG --log-prefix "[debug]-${table}-${chain}:" --log-level 7 || true
done
done
done
可能某些表无法加入某些链,不过,我们暂且忽略,为了简化逻辑,直接等以上脚本执行完,而忽略其中的错误。
从外部地址 1.2.3.4 发起请求:
curl 172.20.242.183:8080
我们再看
1 Jan 21 18:09:52 centos7 kernel: [debug]-raw-PREROUTING:IN=eth0 OUT= MAC=00:16:3e:01:ae:15:ee:ff:ff:ff:ff:ff:08:00 SRC=1.2.3.4 DST=172.20.242.183 LEN=64 TOS=0x14 PREC=0x00 TTL=48 ID=0 DF PROTO=TCP SPT=10656 DPT=8080 WINDOW=65535 RES=0x00 SYN URGP=0 2 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-PREROUTING:IN=eth0 OUT= MAC=00:16:3e:01:ae:15:ee:ff:ff:ff:ff:ff:08:00 SRC=1.2.3.4 DST=172.20.242.183 LEN=64 TOS=0x14 PREC=0x00 TTL=48 ID=0 DF PROTO=TCP SPT=10656 DPT=8080 WINDOW=65535 RES=0x00 SYN URGP=0 3 Jan 21 18:09:52 centos7 kernel: [debug]-nat-PREROUTING:IN=eth0 OUT= MAC=00:16:3e:01:ae:15:ee:ff:ff:ff:ff:ff:08:00 SRC=1.2.3.4 DST=172.20.242.183 LEN=64 TOS=0x14 PREC=0x00 TTL=48 ID=0 DF PROTO=TCP SPT=10656 DPT=8080 WINDOW=65535 RES=0x00 SYN URGP=0 4 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-FORWARD:IN=eth0 OUT=docker0 MAC=00:16:3e:01:ae:15:ee:ff:ff:ff:ff:ff:08:00 SRC=1.2.3.4 DST=172.17.0.2 LEN=64 TOS=0x14 PREC=0x00 TTL=47 ID=0 DF PROTO=TCP SPT=10656 DPT=80 WINDOW=65535 RES=0x00 SYN URGP=0 5 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-POSTROUTING:IN= OUT=docker0 SRC=1.2.3.4 DST=172.17.0.2 LEN=64 TOS=0x14 PREC=0x00 TTL=47 ID=0 DF PROTO=TCP SPT=10656 DPT=80 WINDOW=65535 RES=0x00 SYN URGP=0 6 Jan 21 18:09:52 centos7 kernel: [debug]-nat-POSTROUTING:IN= OUT=docker0 SRC=1.2.3.4 DST=172.17.0.2 LEN=64 TOS=0x14 PREC=0x00 TTL=47 ID=0 DF PROTO=TCP SPT=10656 DPT=80 WINDOW=65535 RES=0x00 SYN URGP=0 7 Jan 21 18:09:52 centos7 kernel: [debug]-raw-PREROUTING:IN=docker0 OUT= PHYSIN=veth57b521b MAC=02:42:96:cd:15:f7:02:42:ac:11:00:02:08:00 SRC=172.17.0.2 DST=1.2.3.4 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=28960 RES=0x00 ACK SYN URGP=0 8 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-PREROUTING:IN=docker0 OUT= PHYSIN=veth57b521b MAC=02:42:96:cd:15:f7:02:42:ac:11:00:02:08:00 SRC=172.17.0.2 DST=1.2.3.4 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=28960 RES=0x00 ACK SYN URGP=0 9 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-FORWARD:IN=docker0 OUT=eth0 PHYSIN=veth57b521b MAC=02:42:96:cd:15:f7:02:42:ac:11:00:02:08:00 SRC=172.17.0.2 DST=1.2.3.4 LEN=60 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=28960 RES=0x00 ACK SYN URGP=0 10 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-POSTROUTING:IN= OUT=eth0 PHYSIN=veth57b521b SRC=172.17.0.2 DST=1.2.3.4 LEN=60 TOS=0x00 PREC=0x00 TTL=63 ID=0 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=28960 RES=0x00 ACK SYN URGP=0 11 Jan 21 18:09:52 centos7 kernel: [debug]-raw-PREROUTING:IN=eth0 OUT= MAC=00:16:3e:01:ae:15:ee:ff:ff:ff:ff:ff:08:00 SRC=1.2.3.4 DST=172.20.242.183 LEN=52 TOS=0x14 PREC=0x00 TTL=48 ID=0 DF PROTO=TCP SPT=10656 DPT=8080 WINDOW=2061 RES=0x00 ACK URGP=0 12 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-PREROUTING:IN=eth0 OUT= MAC=00:16:3e:01:ae:15:ee:ff:ff:ff:ff:ff:08:00 SRC=1.2.3.4 DST=172.20.242.183 LEN=52 TOS=0x14 PREC=0x00 TTL=48 ID=0 DF PROTO=TCP SPT=10656 DPT=8080 WINDOW=2061 RES=0x00 ACK URGP=0 13 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-FORWARD:IN=eth0 OUT=docker0 MAC=00:16:3e:01:ae:15:ee:ff:ff:ff:ff:ff:08:00 SRC=1.2.3.4 DST=172.17.0.2 LEN=52 TOS=0x14 PREC=0x00 TTL=47 ID=0 DF PROTO=TCP SPT=10656 DPT=80 WINDOW=2061 RES=0x00 ACK URGP=0 14 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-POSTROUTING:IN= OUT=docker0 SRC=1.2.3.4 DST=172.17.0.2 LEN=52 TOS=0x14 PREC=0x00 TTL=47 ID=0 DF PROTO=TCP SPT=10656 DPT=80 WINDOW=2061 RES=0x00 ACK URGP=0 15 Jan 21 18:09:52 centos7 kernel: [debug]-raw-PREROUTING:IN=eth0 OUT= MAC=00:16:3e:01:ae:15:ee:ff:ff:ff:ff:ff:08:00 SRC=1.2.3.4 DST=172.20.242.183 LEN=134 TOS=0x14 PREC=0x00 TTL=48 ID=0 DF PROTO=TCP SPT=10656 DPT=8080 WINDOW=2061 RES=0x00 ACK PSH URGP=0 16 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-PREROUTING:IN=eth0 OUT= MAC=00:16:3e:01:ae:15:ee:ff:ff:ff:ff:ff:08:00 SRC=1.2.3.4 DST=172.20.242.183 LEN=134 TOS=0x14 PREC=0x00 TTL=48 ID=0 DF PROTO=TCP SPT=10656 DPT=8080 WINDOW=2061 RES=0x00 ACK PSH URGP=0 17 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-FORWARD:IN=eth0 OUT=docker0 MAC=00:16:3e:01:ae:15:ee:ff:ff:ff:ff:ff:08:00 SRC=1.2.3.4 DST=172.17.0.2 LEN=134 TOS=0x14 PREC=0x00 TTL=47 ID=0 DF PROTO=TCP SPT=10656 DPT=80 WINDOW=2061 RES=0x00 ACK PSH URGP=0 18 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-POSTROUTING:IN= OUT=docker0 SRC=1.2.3.4 DST=172.17.0.2 LEN=134 TOS=0x14 PREC=0x00 TTL=47 ID=0 DF PROTO=TCP SPT=10656 DPT=80 WINDOW=2061 RES=0x00 ACK PSH URGP=0 19 Jan 21 18:09:52 centos7 kernel: [debug]-raw-PREROUTING:IN=docker0 OUT= PHYSIN=veth57b521b MAC=02:42:96:cd:15:f7:02:42:ac:11:00:02:08:00 SRC=172.17.0.2 DST=1.2.3.4 LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=22142 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=227 RES=0x00 ACK URGP=0 20 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-PREROUTING:IN=docker0 OUT= PHYSIN=veth57b521b MAC=02:42:96:cd:15:f7:02:42:ac:11:00:02:08:00 SRC=172.17.0.2 DST=1.2.3.4 LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=22142 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=227 RES=0x00 ACK URGP=0 21 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-FORWARD:IN=docker0 OUT=eth0 PHYSIN=veth57b521b MAC=02:42:96:cd:15:f7:02:42:ac:11:00:02:08:00 SRC=172.17.0.2 DST=1.2.3.4 LEN=52 TOS=0x00 PREC=0x00 TTL=63 ID=22142 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=227 RES=0x00 ACK URGP=0 22 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-POSTROUTING:IN= OUT=eth0 PHYSIN=veth57b521b SRC=172.17.0.2 DST=1.2.3.4 LEN=52 TOS=0x00 PREC=0x00 TTL=63 ID=22142 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=227 RES=0x00 ACK URGP=0 23 Jan 21 18:09:52 centos7 kernel: [debug]-raw-PREROUTING:IN=docker0 OUT= PHYSIN=veth57b521b MAC=02:42:96:cd:15:f7:02:42:ac:11:00:02:08:00 SRC=172.17.0.2 DST=1.2.3.4 LEN=290 TOS=0x00 PREC=0x00 TTL=64 ID=22143 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=227 RES=0x00 ACK PSH URGP=0 24 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-PREROUTING:IN=docker0 OUT= PHYSIN=veth57b521b MAC=02:42:96:cd:15:f7:02:42:ac:11:00:02:08:00 SRC=172.17.0.2 DST=1.2.3.4 LEN=290 TOS=0x00 PREC=0x00 TTL=64 ID=22143 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=227 RES=0x00 ACK PSH URGP=0 25 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-FORWARD:IN=docker0 OUT=eth0 PHYSIN=veth57b521b MAC=02:42:96:cd:15:f7:02:42:ac:11:00:02:08:00 SRC=172.17.0.2 DST=1.2.3.4 LEN=290 TOS=0x00 PREC=0x00 TTL=63 ID=22143 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=227 RES=0x00 ACK PSH URGP=0 26 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-POSTROUTING:IN= OUT=eth0 PHYSIN=veth57b521b SRC=172.17.0.2 DST=1.2.3.4 LEN=290 TOS=0x00 PREC=0x00 TTL=63 ID=22143 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=227 RES=0x00 ACK PSH URGP=0 27 Jan 21 18:09:52 centos7 kernel: [debug]-raw-PREROUTING:IN=docker0 OUT= PHYSIN=veth57b521b MAC=02:42:96:cd:15:f7:02:42:ac:11:00:02:08:00 SRC=172.17.0.2 DST=1.2.3.4 LEN=664 TOS=0x00 PREC=0x00 TTL=64 ID=22144 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=227 RES=0x00 ACK PSH URGP=0 28 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-PREROUTING:IN=docker0 OUT= PHYSIN=veth57b521b MAC=02:42:96:cd:15:f7:02:42:ac:11:00:02:08:00 SRC=172.17.0.2 DST=1.2.3.4 LEN=664 TOS=0x00 PREC=0x00 TTL=64 ID=22144 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=227 RES=0x00 ACK PSH URGP=0 29 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-FORWARD:IN=docker0 OUT=eth0 PHYSIN=veth57b521b MAC=02:42:96:cd:15:f7:02:42:ac:11:00:02:08:00 SRC=172.17.0.2 DST=1.2.3.4 LEN=664 TOS=0x00 PREC=0x00 TTL=63 ID=22144 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=227 RES=0x00 ACK PSH URGP=0 30 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-POSTROUTING:IN= OUT=eth0 PHYSIN=veth57b521b SRC=172.17.0.2 DST=1.2.3.4 LEN=664 TOS=0x00 PREC=0x00 TTL=63 ID=22144 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=227 RES=0x00 ACK PSH URGP=0 31 Jan 21 18:09:52 centos7 kernel: [debug]-raw-PREROUTING:IN=eth0 OUT= MAC=00:16:3e:01:ae:15:ee:ff:ff:ff:ff:ff:08:00 SRC=1.2.3.4 DST=172.20.242.183 LEN=52 TOS=0x14 PREC=0x00 TTL=48 ID=0 DF PROTO=TCP SPT=10656 DPT=8080 WINDOW=2047 RES=0x00 ACK URGP=0 32 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-PREROUTING:IN=eth0 OUT= MAC=00:16:3e:01:ae:15:ee:ff:ff:ff:ff:ff:08:00 SRC=1.2.3.4 DST=172.20.242.183 LEN=52 TOS=0x14 PREC=0x00 TTL=48 ID=0 DF PROTO=TCP SPT=10656 DPT=8080 WINDOW=2047 RES=0x00 ACK URGP=0 33 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-FORWARD:IN=eth0 OUT=docker0 MAC=00:16:3e:01:ae:15:ee:ff:ff:ff:ff:ff:08:00 SRC=1.2.3.4 DST=172.17.0.2 LEN=52 TOS=0x14 PREC=0x00 TTL=47 ID=0 DF PROTO=TCP SPT=10656 DPT=80 WINDOW=2047 RES=0x00 ACK URGP=0 34 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-POSTROUTING:IN= OUT=docker0 SRC=1.2.3.4 DST=172.17.0.2 LEN=52 TOS=0x14 PREC=0x00 TTL=47 ID=0 DF PROTO=TCP SPT=10656 DPT=80 WINDOW=2047 RES=0x00 ACK URGP=0 35 Jan 21 18:09:52 centos7 kernel: [debug]-raw-PREROUTING:IN=eth0 OUT= MAC=00:16:3e:01:ae:15:ee:ff:ff:ff:ff:ff:08:00 SRC=1.2.3.4 DST=172.20.242.183 LEN=52 TOS=0x14 PREC=0x00 TTL=48 ID=0 DF PROTO=TCP SPT=10656 DPT=8080 WINDOW=2048 RES=0x00 ACK FIN URGP=0 36 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-PREROUTING:IN=eth0 OUT= MAC=00:16:3e:01:ae:15:ee:ff:ff:ff:ff:ff:08:00 SRC=1.2.3.4 DST=172.20.242.183 LEN=52 TOS=0x14 PREC=0x00 TTL=48 ID=0 DF PROTO=TCP SPT=10656 DPT=8080 WINDOW=2048 RES=0x00 ACK FIN URGP=0 37 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-FORWARD:IN=eth0 OUT=docker0 MAC=00:16:3e:01:ae:15:ee:ff:ff:ff:ff:ff:08:00 SRC=1.2.3.4 DST=172.17.0.2 LEN=52 TOS=0x14 PREC=0x00 TTL=47 ID=0 DF PROTO=TCP SPT=10656 DPT=80 WINDOW=2048 RES=0x00 ACK FIN URGP=0 38 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-POSTROUTING:IN= OUT=docker0 SRC=1.2.3.4 DST=172.17.0.2 LEN=52 TOS=0x14 PREC=0x00 TTL=47 ID=0 DF PROTO=TCP SPT=10656 DPT=80 WINDOW=2048 RES=0x00 ACK FIN URGP=0 39 Jan 21 18:09:52 centos7 kernel: [debug]-raw-PREROUTING:IN=docker0 OUT= PHYSIN=veth57b521b MAC=02:42:96:cd:15:f7:02:42:ac:11:00:02:08:00 SRC=172.17.0.2 DST=1.2.3.4 LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=22145 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=227 RES=0x00 ACK FIN URGP=0 40 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-PREROUTING:IN=docker0 OUT= PHYSIN=veth57b521b MAC=02:42:96:cd:15:f7:02:42:ac:11:00:02:08:00 SRC=172.17.0.2 DST=1.2.3.4 LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=22145 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=227 RES=0x00 ACK FIN URGP=0 41 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-FORWARD:IN=docker0 OUT=eth0 PHYSIN=veth57b521b MAC=02:42:96:cd:15:f7:02:42:ac:11:00:02:08:00 SRC=172.17.0.2 DST=1.2.3.4 LEN=52 TOS=0x00 PREC=0x00 TTL=63 ID=22145 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=227 RES=0x00 ACK FIN URGP=0 42 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-POSTROUTING:IN= OUT=eth0 PHYSIN=veth57b521b SRC=172.17.0.2 DST=1.2.3.4 LEN=52 TOS=0x00 PREC=0x00 TTL=63 ID=22145 DF PROTO=TCP SPT=80 DPT=10656 WINDOW=227 RES=0x00 ACK FIN URGP=0 43 Jan 21 18:09:52 centos7 kernel: [debug]-raw-PREROUTING:IN=eth0 OUT= MAC=00:16:3e:01:ae:15:ee:ff:ff:ff:ff:ff:08:00 SRC=1.2.3.4 DST=172.20.242.183 LEN=52 TOS=0x14 PREC=0x00 TTL=48 ID=0 DF PROTO=TCP SPT=10656 DPT=8080 WINDOW=2048 RES=0x00 ACK URGP=0 44 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-PREROUTING:IN=eth0 OUT= MAC=00:16:3e:01:ae:15:ee:ff:ff:ff:ff:ff:08:00 SRC=1.2.3.4 DST=172.20.242.183 LEN=52 TOS=0x14 PREC=0x00 TTL=48 ID=0 DF PROTO=TCP SPT=10656 DPT=8080 WINDOW=2048 RES=0x00 ACK URGP=0 45 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-FORWARD:IN=eth0 OUT=docker0 MAC=00:16:3e:01:ae:15:ee:ff:ff:ff:ff:ff:08:00 SRC=1.2.3.4 DST=172.17.0.2 LEN=52 TOS=0x14 PREC=0x00 TTL=47 ID=0 DF PROTO=TCP SPT=10656 DPT=80 WINDOW=2048 RES=0x00 ACK URGP=0 46 Jan 21 18:09:52 centos7 kernel: [debug]-mangle-POSTROUTING:IN= OUT=docker0 SRC=1.2.3.4 DST=172.17.0.2 LEN=52 TOS=0x14 PREC=0x00 TTL=47 ID=0 DF PROTO=TCP SPT=10656 DPT=80 WINDOW=2048 RES=0x00 ACK URGP=0
以上涉及raw,mangle和nat几个iptables表,选取第1行到第22行的log,数据包的基本轨迹可以归纳为如下
以上分成5个色块,分别为如下表示:
从以上的log可以看到,从nat表的PREROUTING链开始后,目的端口就从8080变成了80,那这个会不会就是其端口转发的根源呢?我们接下来再分析下。
上面我们说到,数据包从NAT表出来后,目的端口发生了变化。我们将iptables回退到增加log之前的状态,看看这几个iptables的表都有什么变化。
与安装docker前相比,其他几个表都没有什么相关的变化,唯独nat表,看上去有些不一样:
[root@centos7 ~]# iptables -t nat -nvL Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination 60 3132 DOCKER all -- * * 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL Chain INPUT (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination 0 0 DOCKER all -- * * 0.0.0.0/0 !127.0.0.0/8 ADDRTYPE match dst-type LOCAL Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination 0 0 MASQUERADE all -- * !docker0 172.17.0.0/16 0.0.0.0/0 Chain DOCKER (2 references) pkts bytes target prot opt in out source destination 0 0 RETURN all -- docker0 * 0.0.0.0/0 0.0.0.0/0
我们可以看到,它在nat表中增加了一个自定义链DOCKER,而这个链被两处引用,一处是在PREROUTING,另一处在OUTPUT,为什么在这两处呢?
为了回答这个问题,我们有必要先谈谈iptables的基本知识。
简单地讲,Linux iptables一共有5条内置链,分别为PREROUTING,FORWARDING,POSTROUTING,INPUT和OUTPUT。
此外,iptables还有5个表,分别为filter,nat,mangle,raw和security。
数据包经过这些链是有顺序的,经过某个链时,如果链上有对应的表在作用,则去执行该表中对应的链里的rule,大致如下所示:
即数据进入网卡接口后,到达协议栈,首先经过的iptables链是PREROUTING,这里给数据包一些预处理的机会,譬如修改目的地址等,把所有的表都撸一遍,如果看到某个表有PREROUTING链的规则,则进去执行,如果没有就跳过,多说一句,不是每个表都有PREROUTING链,只有raw、mangle和nat表才有的,以下链也是的,后续也就不一一而足了。
数据包经过PREROUTING链的处理后,然后交给Routing决策,结合路由表看看该数据包是继续在协议栈中往下走进入本地进程,还是直接把本网卡所在的机器当路由,直接通过FOWARD出该网卡去往其他网卡,甚至其他机器。
到达了INPUT就表示肯定是去往本地的某个进程了,不过再给一次处理的机会再进入进程,于是乎,再查看下具有INPUT链的几张表(即mangle、nat和filter),看看其中有没有能够匹配的规则,有就执行,对数据包作送入进程socket前的最后一次处理。
一旦到达了进程socket,进入进程用户空间,这个数据包的生命就该到达终点了。
本来到了这里,故事也就结束了,然而实际使用的协议都是有来有回的,回包的起点就是本地进程了,因而出现了上图中的右边部分,即当本地进程发起一个数据包(这里可能是一个回包)时,首先就要交给OUTPUT链,在该链上先给个机会对进程发出的数据包做个处理,然后再经过路由决策决定发往哪张网卡。
经由上面的分析,Docker对于入向包在PREROUTING链中处理,而出向包在OUTPUT链中处理,也就顺理成章了吧。也就是说,要赶在让路由策略决定往哪里发前,先处理下,这样保证如向包能够顺利进入相应的进程,而出向包能够达到相应的网卡接口。
我们先来解读下以上的nat表rule的作用:
以上就是docker安装好的nat表,启动了以上nginx的端口转发后,
既然我们谈的是Docker的端口转发,那我们真的来一个转发看看,瞧瞧iptables中有什么变化:
docker run -d -p 8080:80 nginx:latest
看iptables变化:
[root@centos7 ~]# iptables -t nat -nvL Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination 60 3132 DOCKER all -- * * 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL Chain INPUT (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination 0 0 DOCKER all -- * * 0.0.0.0/0 !127.0.0.0/8 ADDRTYPE match dst-type LOCAL Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination 0 0 MASQUERADE all -- * !docker0 172.17.0.0/16 0.0.0.0/0 0 0 MASQUERADE tcp -- * * 172.17.0.2 172.17.0.2 tcp dpt:80 Chain DOCKER (2 references) pkts bytes target prot opt in out source destination 0 0 RETURN all -- docker0 * 0.0.0.0/0 0.0.0.0/0 0 0 DNAT tcp -- !docker0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8080 to:172.17.0.2:80
好家伙,偷偷在POSTROUTING和DOCKER中增加了两条rule:
看到这里,我们或许已经大体上有了一个包的基本流向的概念了,结合已经的iptables的log,我们来分析下具体的流向。
首先,查看了本地的路由:
[root@centos7 ~]# ip route default via 172.20.255.253 dev eth0 169.254.0.0/16 dev eth0 scope link metric 1002 172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 172.20.240.0/20 dev eth0 proto kernel scope link src 172.20.242.183
我们从这台机器去inspect这个nginx container,拿到部分IP信息如下:
"Networks": {
"bridge": {
"IPAMConfig": null,
"Links": null,
"Aliases": null,
"NetworkID": "a2d0aec96e15c5cb8ead04d0d95fd00d71b72b17a8214ce78b4af9a4c71b6248",
"EndpointID": "c1faabc8c50e3a703219900e029ebf813cfcd619030865c2d8991ac966f56b95",
"Gateway": "172.17.0.1",
"IPAddress": "172.17.0.2",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "02:42:ac:11:00:02"
}
}
并查看本地的所有的网卡接口的地址:
[root@centos7 ~]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 link/ether 00:16:3e:01:ae:15 brd ff:ff:ff:ff:ff:ff inet 172.20.242.183/20 brd 172.20.255.255 scope global dynamic eth0 valid_lft 315273420sec preferred_lft 315273420sec inet6 fe80::216:3eff:fe01:ae15/64 scope link valid_lft forever preferred_lft forever 3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default link/ether 02:42:c0:8e:53:0e brd ff:ff:ff:ff:ff:ff inet 172.17.0.1/16 scope global docker0 valid_lft forever preferred_lft forever inet6 fe80::42:c0ff:fe8e:530e/64 scope link valid_lft forever preferred_lft forever 11: veth02d2a0d@if10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default link/ether da:75:64:82:95:10 brd ff:ff:ff:ff:ff:ff link-netnsid 0 inet6 fe80::d875:64ff:fe82:9510/64 scope link valid_lft forever preferred_lft forever
那么从外网1.2.3.4发出的第一个SYN包(log总结中粉色部分)的走向是这样的:
那从容器里返回的SYN+ACK包该怎么办呢?结合上面的log总结中的橘色部分,我们可以作如下推演:
接下来的部分,就显得雷同了,大差不差,只是少了进入nat表了,原因之前说了,iptables是有状态的。那我们就不在这里赘述了。
[1] https://www.cnblogs.com/yum777/articles/8514636.html
[2] https://tonybai.com/2016/01/15/understanding-container-networking-on-single-host/
[3] https://www.rigacci.org/wiki/lib/exe/fetch.php/doc/appunti/linux/sa/iptables/conntrack.html
原文:https://www.cnblogs.com/lylex/p/14304318.html