Environment: CentOS 7.4, kernel version 3.10
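I have recently been looking at the kernel parameter tcp_tw_recycle (removed in kernel 4.12), which enables fast recycling of sockets in the TIME_WAIT state. The material I found says that enabling it together with tcp_timestamps can make client connections fail persistently in a NAT environment. In a packet capture, the client keeps sending SYN segments and the server never responds. Those articles explain how to fix the problem, but none of them shows how to reproduce it. What puzzled me most: the server is normally the passive closer and never enters TIME_WAIT, so how does the problem arise at all?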
The following topology is used to reproduce the scenario. 10.85.3.51 is the NAT server; 10.85.1.2 and 10.85.3.52 reach the server at 10.85.3.111:19090 through it.
+-------------+
|  10.85.1.2  +------------+
+-------------+            |
                     +-----+-------+         +----------------------+
                     | 10.85.3.51  +---------+  10.85.3.111:19090   |
                     +-----+-------+         +----------------------+
+-------------+            |
| 10.85.3.52  +------------+
+-------------+
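On 10.85.3.51, add the following iptables rules to forward TCP traffic between the clients and the server (net.ipv4.ip_forward must be enabled on 10.85.3.51). Clients connect to 10.85.3.51:29090, which is DNATed to 10.85.3.111:19090.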
# iptables -t nat -I PREROUTING -d 10.85.3.51 -p tcp -m tcp --dport 29090 -j DNAT --to 10.85.3.111:19090
# iptables -t nat -I POSTROUTING -d 10.85.3.111 -p tcp -m tcp --dport 19090 -j SNAT --to 10.85.3.51
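The forwarding requirement mentioned above can be met with sysctl. A minimal sketch, assuming a root shell on 10.85.3.51 and that a setting which lasts only until the next reboot is enough for this test:

```shell
# Run as root on the NAT host (10.85.3.51).
sysctl -w net.ipv4.ip_forward=1   # enable IPv4 forwarding until the next reboot
sysctl net.ipv4.ip_forward        # read the value back to confirm the change
```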
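Capture packets on 10.85.3.111 and start connections from 10.85.1.2 and 10.85.3.52. In the capture below, entries 4 and 7 are the TCP SYN segments of the two connections. The server answers both, and both connections are established normally.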
1 # tcpdump -i eth0 src port 19090 or dst port 19090
2 tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
3 listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
4 17:39:27.970358 IP 10.85.3.51.57104 > 10.85.3.111.19090: Flags [S], seq 2466985868, win 25200, options [mss 1260,sackOK,TS val 3075335984 ecr 0,nop,wscale 7], length 0
5 17:39:27.970417 IP 10.85.3.111.19090 > 10.85.3.51.57104: Flags [S.], seq 2846609535, ack 2466985869, win 24960, options [mss 1260,sackOK,TS val 2612548200 ecr 3075335984,nop,wscale 7], length 0
6 17:39:27.970783 IP 10.85.3.51.57104 > 10.85.3.111.19090: Flags [.], ack 1, win 197, options [nop,nop,TS val 3075335985 ecr 2612548200], length 0

7 17:39:29.059890 IP 10.85.3.51.34230 > 10.85.3.111.19090: Flags [S], seq 2892210420, win 25200, options [mss 1260,sackOK,TS val 1740811766 ecr 0,nop,wscale 7], length 0
8 17:39:29.059949 IP 10.85.3.111.19090 > 10.85.3.51.34230: Flags [S.], seq 3434079625, ack 2892210421, win 24960, options [mss 1260,sackOK,TS val 2612549289 ecr 1740811766,nop,wscale 7], length 0
9 17:39:29.060623 IP 10.85.3.51.34230 > 10.85.3.111.19090: Flags [.], ack 1, win 197, options [nop,nop,TS val 1740811767 ecr 2612549289], length 0
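After SNAT both SYNs arrive from the same source address, 10.85.3.51, yet each carries a TS val taken from its own client's clock, so the two values are unrelated. The sketch below pulls the source endpoint and TS val out of the pure SYN lines of a tcpdump text capture. The helper name syn_tsvals is made up for illustration, and the pattern assumes the tcpdump output format shown in this article.

```shell
# syn_tsvals: read tcpdump text output on stdin and print
# "<src ip.port> <TS val>" for every pure SYN (Flags [S], not [S.]).
syn_tsvals() {
    sed -n 's/.* IP \([0-9.]*\) > .*Flags \[S\],.*TS val \([0-9]*\) ecr.*/\1 \2/p'
}

# Example input: the two SYN segments from the capture above.
syn_tsvals <<'EOF'
17:39:27.970358 IP 10.85.3.51.57104 > 10.85.3.111.19090: Flags [S], seq 2466985868, win 25200, options [mss 1260,sackOK,TS val 3075335984 ecr 0,nop,wscale 7], length 0
17:39:29.059890 IP 10.85.3.51.34230 > 10.85.3.111.19090: Flags [S], seq 2892210420, win 25200, options [mss 1260,sackOK,TS val 1740811766 ecr 0,nop,wscale 7], length 0
EOF
```

The second SYN arrives about one second later but carries a much smaller TS val. From the server's point of view, the timestamp of the single peer 10.85.3.51 appears to jump backwards.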
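Enable tcp_tw_recycle on the server (10.85.3.111) and repeat the test. Even though the SYN of the second connection carries a smaller timestamp than the SYN of the first, both connections are still established normally and nothing fails.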
# tcpdump -i eth0 src port 19090 or dst port 19090
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
17:49:12.111152 IP 10.85.3.51.58164 > 10.85.3.111.19090: Flags [S], seq 2599215624, win 25200, options [mss 1260,sackOK,TS val 3075920126 ecr 0,nop,wscale 7], length 0
17:49:12.111221 IP 10.85.3.111.19090 > 10.85.3.51.58164: Flags [S.], seq 795235982, ack 2599215625, win 24960, options [mss 1260,sackOK,TS val 2613132341 ecr 3075920126,nop,wscale 7], length 0
17:49:12.111766 IP 10.85.3.51.58164 > 10.85.3.111.19090: Flags [.], ack 1, win 197, options [nop,nop,TS val 3075920127 ecr 2613132341], length 0

17:49:12.871092 IP 10.85.3.51.34234 > 10.85.3.111.19090: Flags [S], seq 3696139072, win 25200, options [mss 1260,sackOK,TS val 1741395578 ecr 0,nop,wscale 7], length 0
17:49:12.871149 IP 10.85.3.111.19090 > 10.85.3.51.34234: Flags [S.], seq 3928136503, ack 3696139073, win 24960, options [mss 1260,sackOK,TS val 2613133101 ecr 1741395578,nop,wscale 7], length 0
17:49:12.871697 IP 10.85.3.51.34234 > 10.85.3.111.19090: Flags [.], ack 1, win 197, options [nop,nop,TS val 1741395579 ecr 2613133101], length 0
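For reference, the settings used in this second run could be applied on the server roughly as follows. This is a sketch under assumptions: a root shell on 10.85.3.111 and a kernel older than 4.12, so that the tcp_tw_recycle knob still exists. The guard on the /proc path is my own addition and not part of the original test.

```shell
# Run as root on the server (10.85.3.111).
if [ -e /proc/sys/net/ipv4/tcp_tw_recycle ]; then
    sysctl -w net.ipv4.tcp_timestamps=1   # the per-peer check relies on TCP timestamps
    sysctl -w net.ipv4.tcp_tw_recycle=1   # fast TIME_WAIT recycling (kernels before 4.12)
else
    echo "tcp_tw_recycle is not available on kernel $(uname -r)"
fi
```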
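The test above does not trigger the problem because the server never closed a connection itself. It therefore never entered TIME_WAIT for a connection from 10.85.3.51 and never saved a timestamp for that address. So the scenario that reproduces the problem is: the server actively closes a connection with a client; if, within the following TCP_PAWS_MSL (60 s), a SYN arrives from that client address whose TSval is smaller than the timestamp the system saved from the previous connection, the SYN is dropped. The client sees a connection timeout or a very slow connect (it can connect normally once the 60 s have passed).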
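The drop rule described above can be modelled in a few lines of shell. This is a simplified sketch, not the kernel code: the function name paws_rejects is made up, and the constant TCP_PAWS_WINDOW=1 together with the signed 32-bit comparison reflects my reading of the 3.10 sources. The example call uses the two timestamps that appear in the captures of this article (3035582641 saved, 1759176699 in the new SYN, about 18 s after the close).

```shell
TCP_PAWS_MSL=60      # seconds during which the saved timestamp is trusted
TCP_PAWS_WINDOW=1    # tolerated backwards step of the timestamp

# paws_rejects SAVED_TS SYN_TS AGE_SECONDS
# Exit status 0 if the SYN would be dropped, 1 if it would be accepted.
paws_rejects() {
    paws_saved=$1; paws_syn=$2; paws_age=$3
    # Emulate a signed 32-bit subtraction so timestamp wraparound is handled.
    paws_diff=$(( (paws_saved - paws_syn) & 4294967295 ))
    if [ "$paws_diff" -ge 2147483648 ]; then
        paws_diff=$(( paws_diff - 4294967296 ))
    fi
    [ "$paws_age" -lt "$TCP_PAWS_MSL" ] && [ "$paws_diff" -gt "$TCP_PAWS_WINDOW" ]
}

if paws_rejects 3035582641 1759176699 18; then
    echo "SYN dropped"
else
    echo "SYN accepted"
fi
```

With an age of 61 seconds instead of 18 the same call reports the SYN as accepted, which matches the observation that connections work again after 60 s.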
To set this up, the server (10.85.3.111) opens an SSH connection to 10.85.3.51 and then closes it first. The server sends the first FIN, so it is the active closer:
# tcpdump -i eth0 src host 10.85.3.51 or dst host 10.85.3.51
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
22:45:10.015335 IP 10.85.3.111.49416 > 10.85.3.51.ssh: Flags [S], seq 1039611753, win 25200, options [mss 1260,sackOK,TS val 2630890245 ecr 0,nop,wscale 7], length 0
22:45:10.016055 IP 10.85.3.51.ssh > 10.85.3.111.49416: Flags [S.], seq 489573340, ack 1039611754, win 24960, options [mss 1260,sackOK,TS val 3035577005 ecr 2630890245,nop,wscale 7], length 0
22:45:10.016074 IP 10.85.3.111.49416 > 10.85.3.51.ssh: Flags [.], ack 1, win 197, options [nop,nop,TS val 2630890246 ecr 3035577005], length 0
22:45:10.023482 IP 10.85.3.51.ssh > 10.85.3.111.49416: Flags [P.], seq 1:22, ack 1, win 195, options [nop,nop,TS val 3035577013 ecr 2630890246], length 21
22:45:10.023507 IP 10.85.3.111.49416 > 10.85.3.51.ssh: Flags [.], ack 22, win 197, options [nop,nop,TS val 2630890253 ecr 3035577013], length 0

22:45:15.648562 IP 10.85.3.111.49416 > 10.85.3.51.ssh: Flags [F.], seq 1, ack 22, win 197, options [nop,nop,TS val 2630895878 ecr 3035577013], length 0
22:45:15.649128 IP 10.85.3.51.ssh > 10.85.3.111.49416: Flags [.], ack 2, win 195, options [nop,nop,TS val 3035582639 ecr 2630895878], length 0
22:45:15.651394 IP 10.85.3.51.ssh > 10.85.3.111.49416: Flags [F.], seq 22, ack 2, win 195, options [nop,nop,TS val 3035582641 ecr 2630895878], length 0
22:45:15.651411 IP 10.85.3.111.49416 > 10.85.3.51.ssh: Flags [.], ack 23, win 197, options [nop,nop,TS val 2630895881 ecr 3035582641], length 0
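Within TCP_PAWS_MSL of that close, start a connection from 10.85.1.2 to the server through NAT. The capture on the server shows that the TS val of its SYN, 1759176699, is far smaller than the saved timestamp 3035582641 (the TS val of the last segment received from 10.85.3.51). The server drops every SYN it receives, and the client's connection attempt times out.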
22:45:33.942378 IP 10.85.3.51.34264 > 10.85.3.111.19090: Flags [S], seq 668096838, win 25200, options [mss 1260,sackOK,TS val 1759176699 ecr 0,nop,wscale 7], length 0
22:45:34.942300 IP 10.85.3.51.34264 > 10.85.3.111.19090: Flags [S], seq 668096838, win 25200, options [mss 1260,sackOK,TS val 1759177700 ecr 0,nop,wscale 7], length 0
22:45:36.946320 IP 10.85.3.51.34264 > 10.85.3.111.19090: Flags [S], seq 668096838, win 25200, options [mss 1260,sackOK,TS val 1759179704 ecr 0,nop,wscale 7], length 0
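One way to check that these SYNs are dropped by the timestamp check, and not by a firewall rule or a full backlog, is to watch the kernel's PAWSPassive counter on the server before and after the failed attempt. The sketch below assumes that the counter appears under that name in the TcpExt rows of /proc/net/netstat, as I believe it does on 3.10-era kernels. The helper name paws_passive_count is made up for illustration.

```shell
# paws_passive_count [FILE]
# Print the PAWSPassive counter from a /proc/net/netstat style file.
# Prints nothing if the kernel does not expose that counter.
paws_passive_count() {
    awk '
        # First TcpExt line holds the column names: remember where PAWSPassive is.
        $1 == "TcpExt:" && !seen {
            for (i = 2; i <= NF; i++) if ($i == "PAWSPassive") col = i
            seen = 1
            next
        }
        # Second TcpExt line holds the values in the same column order.
        $1 == "TcpExt:" { if (col) print $col; exit }
    ' "${1:-/proc/net/netstat}"
}

# On the server: run once before and once after the client attempt and compare.
if [ -r /proc/net/netstat ]; then
    paws_passive_count
fi
```

If the value grows by the number of SYN retransmissions seen in the capture, the drops come from the per-peer timestamp check.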
Reference:
"linux开启tcp_timestamps和tcp_tw_recycle引发的问题研究" (A study of the problems caused by enabling tcp_timestamps and tcp_tw_recycle on Linux)
Original: https://www.cnblogs.com/charlieroro/p/11593410.html