Switching k8s kube-proxy from iptables to ipvs

An ordinary iptables rule

[root@k8s01 ~]# iptables -I INPUT -p tcp -m tcp --dport 22 -m comment --comment "allow SSH to this host from anywhere" -j ACCEPT  # -m comment attaches a comment to the rule
[root@k8s01 ~]# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:ssh /* allow SSH to this host from anywhere */

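ipvs uses a hash table, while iptables uses a list of rules evaluated one by one. The more Services a cluster has, the more iptables rules there are, and because iptables rules are matched from top to bottom, efficiency drops as the list grows. Once the number of Services reaches a certain scale, the speed advantage of hash lookups becomes apparent and Service performance improves.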
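The difference between the two lookup styles can be sketched in a few lines of bash. This is only an illustration of the idea, not how the kernel implements either one: the addresses, the backend names and the helper names linear_lookup and hash_lookup are all made up for the example, and the associative array needs bash 4 or later.

```shell
#!/usr/bin/env bash
# Illustrative only: addresses, backend names and helper names are made up.

# iptables-style: an ordered rule list that is scanned from the top.
rules=("10.96.0.1:443=apiserver" "10.96.0.10:53=coredns" "10.96.12.7:80=web")

linear_lookup() {
    local key=$1 rule checked=0
    for rule in "${rules[@]}"; do
        checked=$((checked + 1))
        # Compare the part before '=' with the address we are looking for
        if [[ ${rule%%=*} == "$key" ]]; then
            echo "${rule#*=} (rules checked: ${checked})"
            return 0
        fi
    done
    return 1
}

# ipvs-style: a hash table keyed by the virtual address (bash >= 4).
declare -A table=(["10.96.0.1:443"]=apiserver ["10.96.0.10:53"]=coredns ["10.96.12.7:80"]=web)

hash_lookup() {
    local key=$1
    # Fail if the key is absent, otherwise answer with a single lookup
    [[ -n ${table[$key]+set} ]] || return 1
    echo "${table[$key]} (single lookup)"
}

linear_lookup "10.96.12.7:80"   # prints: web (rules checked: 3)
hash_lookup "10.96.12.7:80"     # prints: web (single lookup)
```

With three entries the difference is invisible, but the linear scan grows with every rule added in front of the match, while the hash lookup stays a single step.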

Using ipvs

Install the packages

[root@k8s01 ~]# yum install -y ipset ipvsadm

Load the kernel modules

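[root@k8s01 ~]# cat << 'EOF' > /etc/sysconfig/modules/ipvs.modules
#!/bin/bash
# ipvs schedulers plus conntrack. On kernels 4.19 and later nf_conntrack_ipv4
# is merged into nf_conntrack, so both names are listed; absent ones are skipped.
ipvs_modules=(ip_vs ip_vs_lc ip_vs_wlc ip_vs_rr ip_vs_wrr ip_vs_lblc ip_vs_lblcr ip_vs_dh
ip_vs_sh ip_vs_fo ip_vs_nq ip_vs_sed ip_vs_ftp nf_conntrack_ipv4 nf_conntrack)
for kernel_module in "${ipvs_modules[@]}"; do
    # Load a module only if it exists for the running kernel
    if /sbin/modinfo -F filename "${kernel_module}" > /dev/null 2>&1; then
        /sbin/modprobe "${kernel_module}"
    fi
done
EOF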
[root@k8s01 ~]# chmod +x /etc/sysconfig/modules/ipvs.modules
[root@k8s01 ~]# /etc/sysconfig/modules/ipvs.modules
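To confirm that the script did its job, the loaded modules can be compared against the ones you expect. The sketch below is only an illustration: the helper name missing_modules is hypothetical, and it assumes the usual lsmod layout with a header line and the module name in the first column.

```shell
#!/usr/bin/env bash
# missing_modules (hypothetical helper): read `lsmod` output on stdin and
# print every module named in the arguments that is not listed there.
missing_modules() {
    local loaded mod
    # The first column of lsmod is the module name; skip the header line.
    loaded=$(awk 'NR > 1 { print $1 }')
    for mod in "$@"; do
        # -x: whole-line match, -F: treat the name as a fixed string
        printf '%s\n' "$loaded" | grep -qxF -- "$mod" || echo "$mod"
    done
}

# Typical use on a node (prints nothing when everything is loaded):
#   lsmod | missing_modules ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack
```

An empty result means every module you asked about is present; any name printed still needs to be loaded.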

Switch from iptables to ipvs

First, back up the current iptables rules:

iptables-save > my.ipt
# iptables-restore < my.ipt   # restore from the backup if needed

Flush the iptables rules

iptables -t filter -F; iptables -t filter -X; iptables -t nat -F; iptables -t nat -X

When kube-proxy runs in a container

Edit the ConfigMap (cm), then restart the kube-proxy pods

[root@k8s01 ~]# kubectl -n kube-system edit cm kube-proxy
......
mode: "ipvs"
......
[root@k8s01 ~]# kubectl -n kube-system delete pod -l k8s-app=kube-proxy
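It can be worth confirming that the edit was actually saved in the ConfigMap. A minimal sketch, assuming the setting is stored as a plain mode: line as shown above; the helper name proxy_mode is made up for this example.

```shell
#!/usr/bin/env bash
# proxy_mode (hypothetical helper): read kube-proxy configuration YAML on
# stdin and print the value of the first `mode:` key with quotes removed.
proxy_mode() {
    sed -n 's/^[[:space:]]*mode:[[:space:]]*//p' | head -n 1 | tr -d "\"'"
}

# Typical use:
#   kubectl -n kube-system get cm kube-proxy -o yaml | proxy_mode
```

An empty result means the mode is unset, in which case kube-proxy picks its own default rather than ipvs.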

When kube-proxy runs as a binary

Check kube-proxy.service

[root@hdp01 ~]# systemctl status kube-proxy.service
● kube-proxy.service - Kubernetes Kube-Proxy Server
   Loaded: loaded (/etc/systemd/system/kube-proxy.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2022-09-07 22:41:12 EDT; 4 days ago
     Docs: https://github.com/GoogleCloudPlatform/kubernetes
 Main PID: 1169 (kube-proxy)
    Tasks: 20
   Memory: 32.2M
   CGroup: /system.slice/kube-proxy.service
           └─1169 /opt/kube/bin/kube-proxy --config=/var/lib/kube-proxy/kube-proxy-config.yaml

Edit /var/lib/kube-proxy/kube-proxy-config.yaml and set:

mode: "ipvs"
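If the same change has to be made on several nodes, the edit can be scripted. The following is a sketch under the assumption that the file already contains a mode: line; the helper name set_proxy_mode and the temporary path in the usage comment are hypothetical. It writes to stdout instead of editing in place so the result can be reviewed first.

```shell
#!/usr/bin/env bash
# set_proxy_mode (hypothetical helper): rewrite the `mode:` line of a
# kube-proxy config read from stdin and print the result to stdout.
# The original indentation of the line is preserved.
set_proxy_mode() {
    sed "s/^\([[:space:]]*\)mode:.*/\1mode: \"$1\"/"
}

# Typical use (review the new file before copying it over the original):
#   set_proxy_mode ipvs < /var/lib/kube-proxy/kube-proxy-config.yaml > /tmp/kube-proxy-config.yaml
```

If the file has no mode: line at all, the output is identical to the input and the key has to be added by hand.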

Restart kube-proxy.service

systemctl restart kube-proxy.service
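After the restart, one way to check that ipvs is really in use is to look at the virtual services listed by ipvsadm -Ln. A rough sketch, assuming the usual ipvsadm listing where each virtual service line starts with its protocol; the helper name count_virtual_services is hypothetical.

```shell
#!/usr/bin/env bash
# count_virtual_services (hypothetical helper): read `ipvsadm -Ln` output on
# stdin and print how many virtual services (lines that start with TCP, UDP
# or SCTP) it lists. Real-server lines start with "->" and are not counted.
count_virtual_services() {
    # grep -c exits non-zero when the count is 0, so keep the status clean
    grep -cE '^(TCP|UDP|SCTP)[[:space:]]' || true
}

# Typical use on a node:
#   ipvsadm -Ln | count_virtual_services
# kube-proxy also reports its active mode over HTTP; the port below is the
# default metrics port and is an assumption about your setup:
#   curl -s 127.0.0.1:10249/proxyMode
```

A count of 0 on a cluster that has Services suggests kube-proxy is still running in its previous mode.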

IPVS

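In older versions, a Service IP could be pinged in ipvs mode, but in newer versions it cannot, because an iptables rule is added automatically. The rule can be deleted by hand: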

iptables -D KUBE-IPVS-FILTER -m conntrack --ctstate NEW -m set --match-set KUBE-IPVS-IPS dst -j REJECT --reject-with icmp-port-unreachable

After deleting it manually, the Service IP can be pinged. After a while, however, the rule is automatically added back:

iptables -A KUBE-IPVS-FILTER -m conntrack --ctstate NEW -m set --match-set KUBE-IPVS-IPS dst -j REJECT --reject-with icmp-port-unreachable

Output of iptables -L -n:

Chain KUBE-IPVS-FILTER (1 references)
target     prot opt source               destination
RETURN     all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOAD-BALANCER dst,dst
RETURN     all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-CLUSTER-IP dst,dst
RETURN     all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-EXTERNAL-IP dst,dst
REJECT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate NEW match-set KUBE-IPVS-IPS dst reject-with icmp-port-unreachable
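Because the rule comes back on its own, it helps to have a quick way to see whether it is currently present. A small sketch, assuming the rule is printed by iptables -S in the same form as the commands above; the helper name has_ipvs_reject_rule is hypothetical.

```shell
#!/usr/bin/env bash
# has_ipvs_reject_rule (hypothetical helper): read `iptables -S` output on
# stdin and succeed if the REJECT rule that matches the KUBE-IPVS-IPS set
# is present.
has_ipvs_reject_rule() {
    # "--" stops option parsing, since the pattern itself begins with dashes
    grep -q -- '--match-set KUBE-IPVS-IPS dst .*-j REJECT'
}

# Typical use on a node:
#   iptables -S KUBE-IPVS-FILTER | has_ipvs_reject_rule && echo "REJECT rule present"
```

Running the check before and a few minutes after the manual delete shows when the rule has been restored.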