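Introduction
etcd is a distributed key-value store written in Go. It uses the Raft consensus algorithm to keep data consistent across nodes, which addresses the data-consistency problem in distributed systems.
Core architecture of etcd
etcd Server: receives and handles client requests.
gRPC Server: handles communication and data synchronization between etcd nodes.
MVCC: multi-version concurrency control, etcd's storage module. Every operation on a key-value pair is recorded, and the data is persisted in the underlying BoltDB database.
Snapshot and WAL: the WAL (write-ahead log) records data before it is committed. A snapshot stores all etcd data at a point in time so that the WAL does not grow without bound. Together they let etcd store data reliably and recover from node failures.
Raft module: implements consistency across the distributed cluster.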
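Choosing a version
etcd currently has two major versions, V2.x and V3.x.
V3.x is the mainstream version.
Installation
Three-node deployment; the etcd and etcdctl binaries are placed in /usr/bin.
Prerequisites
Node IPs
172.20.40.173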
172.20.40.196
172.20.40.107
Generate the etcd server certificate
etcd is currently deployed alongside k8s and uses the k8s CA files, which are identical on every node and live in /etc/kubernetes/ssl/:
/etc/kubernetes/ssl/ca.pem
/etc/kubernetes/ssl/ca-key.pem
/etc/kubernetes/ssl/ca-config.json
Example /etc/kubernetes/ssl/ca-config.json
[root@test-173 test]# cat /etc/kubernetes/ssl/ca-config.json
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "kubernetes": {
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ],
        "expiry": "87600h"
      }
    }
  }
}
The etcd-csr.json used by etcd lives in /etc/etcd/ssl; its IPs need to be changed to match the nodes.
[root@test-173 ssl]# pwd
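The contents of etcd-csr.json are not shown here. As a rough sketch only, a CSR file for cfssl might look like the output of the helper below. The helper name render_etcd_csr, the CN value, and the choice to include 127.0.0.1 are assumptions for illustration; the rsa/2048 key settings match the "generating key: rsa-2048" log line in this post. A real file may also carry a "names" section.

```shell
#!/bin/sh
# Hypothetical helper: print an etcd-csr.json whose "hosts" list covers
# 127.0.0.1 plus every node IP passed as an argument.
render_etcd_csr() (
    hosts='"127.0.0.1"'
    for ip in "$@"; do
        hosts="$hosts, \"$ip\""
    done
    cat <<EOF
{
  "CN": "etcd",
  "hosts": [$hosts],
  "key": { "algo": "rsa", "size": 2048 }
}
EOF
)

# Print the file for the three nodes used in this post.
render_etcd_csr 172.20.40.173 172.20.40.196 172.20.40.107
```

Redirecting the output to /etc/etcd/ssl/etcd-csr.json would give cfssl a file to work with; listing every node IP under "hosts" lets the same certificate be used on all three nodes.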
In /etc/etcd/ssl on each node, run the following to generate etcd.csr, etcd-key.pem, and etcd.pem
[root@test-107 ssl]# pwd
/etc/etcd/ssl
[root@test-107 ssl]# ls
etcd-csr.json
[root@test-173 ssl]# cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem -ca-key=/etc/kubernetes/ssl/ca-key.pem -config=/etc/kubernetes/ssl/ca-config.json -profile=kubernetes /etc/etcd/ssl/etcd-csr.json | cfssljson -bare etcd
2021/04/06 13:48:22 [INFO] generate received request
2021/04/06 13:48:22 [INFO] received CSR
2021/04/06 13:48:22 [INFO] generating key: rsa-2048
2021/04/06 13:48:23 [INFO] encoded CSR
2021/04/06 13:48:23 [INFO] signed certificate with serial number 518875750718489889129533277175826954994664517958
2021/04/06 13:48:23 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
[root@test-107 ssl]# ls
etcd.csr etcd-csr.json etcd-key.pem etcd.pem
Create the etcd.service unit file
Create the file on each node and adjust the node-specific values.
For example, 172.20.40.107 is named etcd3:
[root@test-107 ssl]# cat /etc/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos
[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
ExecStart=/usr/bin/etcd \
--name=etcd3 \
--cert-file=/etc/etcd/ssl/etcd.pem \
--key-file=/etc/etcd/ssl/etcd-key.pem \
--peer-cert-file=/etc/etcd/ssl/etcd.pem \
--peer-key-file=/etc/etcd/ssl/etcd-key.pem \
--trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
--peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
--initial-advertise-peer-urls=https://172.20.40.107:2380 \
--listen-peer-urls=https://172.20.40.107:2380 \
--listen-client-urls=https://172.20.40.107:2379,http://127.0.0.1:2379 \
--advertise-client-urls=https://172.20.40.107:2379 \
--initial-cluster-token=etcd-cluster-0 \
--initial-cluster=etcd1=https://172.20.40.173:2380,etcd2=https://172.20.40.196:2380,etcd3=https://172.20.40.107:2380 \
--initial-cluster-state=new \
--data-dir=/mnt/etcd_data
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
[root@test-107 ssl]# systemctl daemon-reload && systemctl enable --now etcd.service
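Only the node name and IP differ between the three unit files, so the node-specific flags can be derived from a small node table. The following is a minimal sketch under that assumption; the NODES variable and the function names initial_cluster and node_flags are made up for illustration, and the names etcd1 to etcd3 follow the --initial-cluster value in the unit file.

```shell
#!/bin/sh
# Node table for the three-node layout in this post: "name ip" per line.
NODES='etcd1 172.20.40.173
etcd2 172.20.40.196
etcd3 172.20.40.107'

# Build the --initial-cluster value, which is identical on every node.
initial_cluster() (
    out=''
    while read -r name ip; do
        out="${out:+$out,}${name}=https://${ip}:2380"
    done <<EOF
$NODES
EOF
    printf '%s\n' "$out"
)

# Print the flags that differ per node for the node name given as $1.
node_flags() (
    while read -r name ip; do
        [ "$name" = "$1" ] || continue
        printf '%s\n' \
            "--name=${name}" \
            "--initial-advertise-peer-urls=https://${ip}:2380" \
            "--listen-peer-urls=https://${ip}:2380" \
            "--listen-client-urls=https://${ip}:2379,http://127.0.0.1:2379" \
            "--advertise-client-urls=https://${ip}:2379"
    done <<EOF
$NODES
EOF
)

node_flags etcd3
echo "--initial-cluster=$(initial_cluster)"
```

Running node_flags with etcd1 or etcd2 prints the values to substitute on the other two nodes; everything else in the unit file stays the same.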
Verification: the cluster is currently healthy
Client port: 2379
Peer port: 2380
Peer: another member of the same cluster
[root@test-107 ssl]# etcdctl --endpoints=http://localhost:2379 member list # list cluster members
d20c6ab8456f5fd, started, etcd1, https://172.20.40.173:2380, https://172.20.40.173:2379, false
71c9147302874bdf, started, etcd3, https://172.20.40.107:2380, https://172.20.40.107:2379, false
ce0a9eb4851e47ac, started, etcd2, https://172.20.40.196:2380, https://172.20.40.196:2379, false
[root@test-107 ssl]# ETCDCTL_API=3 /usr/bin/etcdctl --endpoints=https://172.20.40.196:2379,https://172.20.40.107:2379,https://172.20.40.173:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem endpoint health # check endpoint health
https://172.20.40.107:2379 is healthy: successfully committed proposal: took = 26.66478ms
https://172.20.40.173:2379 is healthy: successfully committed proposal: took = 30.31433ms
https://172.20.40.196:2379 is healthy: successfully committed proposal: took = 37.859764ms
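A Raft cluster stays writable only while a strict majority of its members is available, so it can be handy to compare the number of healthy endpoints against the quorum. The sketch below does this on the sample output above rather than calling etcdctl; the function names quorum and count_healthy are illustrative, and the parsing assumes the "is healthy:" wording shown in this post.

```shell
#!/bin/sh
# Quorum for an n-member Raft cluster is a strict majority: n/2 + 1.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Read `etcdctl endpoint health` output on stdin and count healthy endpoints.
count_healthy() {
    grep -c ' is healthy:'
}

# Sample output copied from the health check in this post.
sample='https://172.20.40.107:2379 is healthy: successfully committed proposal: took = 26.66478ms
https://172.20.40.173:2379 is healthy: successfully committed proposal: took = 30.31433ms
https://172.20.40.196:2379 is healthy: successfully committed proposal: took = 37.859764ms'

healthy=$(printf '%s\n' "$sample" | count_healthy)
echo "healthy=$healthy quorum=$(quorum 3)"
```

With three members the quorum is two, so the cluster tolerates the loss of one node.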
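Notes on selected etcd.service parameters
--name: the node's name within the etcd cluster. Any value works as long as it is easy to tell apart and unique, and it must match this node's entry in --initial-cluster.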
Upgrade
Only the etcd and etcdctl binaries need to be replaced.
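One way to do the replacement is node by node, so that the other two members keep the cluster available. The following dry-run sketch only prints the commands for a single node and executes nothing. The function name upgrade_plan, the .bak copies, and the directory holding the new binaries are assumptions, not part of the original procedure.

```shell
#!/bin/sh
# Dry run: print the commands for upgrading one member; nothing is executed.
# $1 is a hypothetical directory that holds the new etcd and etcdctl binaries.
upgrade_plan() (
    new_bin_dir=$1
    cat <<EOF
systemctl stop etcd.service
cp /usr/bin/etcd /usr/bin/etcd.bak
cp /usr/bin/etcdctl /usr/bin/etcdctl.bak
install -m 0755 ${new_bin_dir}/etcd /usr/bin/etcd
install -m 0755 ${new_bin_dir}/etcdctl /usr/bin/etcdctl
systemctl start etcd.service
EOF
)

upgrade_plan /tmp/etcd-new
```

Taking a snapshot beforehand and checking endpoint health after each node comes back, before moving on to the next, keeps the procedure reversible.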
Backup and restore
Run the following on any healthy node.
For example:
[root@test-173 test]# export ETCDCTL_API=3; /usr/bin/etcdctl --endpoints=172.20.40.107:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem snapshot save /root/backup/etcd/snapshot-20210406-143852.db
{"level":"info","ts":1617691138.681973,"caller":"snapshot/v3_snapshot.go:119","msg":"created temporary db file","path":"/root/backup/etcd/snapshot-20210406-143852.db.part"}
{"level":"info","ts":"2021-04-06T14:38:58.698+0800","caller":"clientv3/maintenance.go:200","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":1617691138.698508,"caller":"snapshot/v3_snapshot.go:127","msg":"fetching snapshot","endpoint":"172.20.40.107:2379"}
{"level":"info","ts":"2021-04-06T14:38:58.990+0800","caller":"clientv3/maintenance.go:208","msg":"completed snapshot read; closing"}
{"level":"info","ts":1617691138.9978282,"caller":"snapshot/v3_snapshot.go:142","msg":"fetched snapshot","endpoint":"172.20.40.107:2379","size":"10 MB","took":0.315710239}
{"level":"info","ts":1617691139.000513,"caller":"snapshot/v3_snapshot.go:152","msg":"saved","path":"/root/backup/etcd/snapshot-20210406-143852.db"}
Snapshot saved at /root/backup/etcd/snapshot-20210406-143852.db
## Restore
# Stop etcd first. --name must match this node's entry in --initial-cluster (etcd3 here).
systemctl stop etcd.service
cd /root/backup/etcd/ && \
ETCDCTL_API=3 /usr/bin/etcdctl snapshot restore snapshot-20210406-143852.db \
--name etcd3 \
--initial-cluster etcd1=https://172.20.40.173:2380,etcd2=https://172.20.40.196:2380,etcd3=https://172.20.40.107:2380 \
--initial-cluster-token etcd-cluster-0 \
--initial-advertise-peer-urls https://172.20.40.107:2380
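snapshot restore rejects a --name that does not appear in --initial-cluster, so a small preflight check can catch a mistyped name before the service is stopped. This is a minimal sketch; the function name name_in_cluster is made up for illustration, and the cluster string is the one used in this post.

```shell
#!/bin/sh
# Succeed if the name in $1 appears as a member name in the --initial-cluster
# string in $2, which has the form name1=url1,name2=url2,...
name_in_cluster() (
    set -f          # no glob expansion while splitting
    IFS=,
    for entry in $2; do
        [ "${entry%%=*}" = "$1" ] && exit 0
    done
    exit 1
)

CLUSTER='etcd1=https://172.20.40.173:2380,etcd2=https://172.20.40.196:2380,etcd3=https://172.20.40.107:2380'

for n in etcd3 etcd-3; do
    if name_in_cluster "$n" "$CLUSTER"; then
        echo "$n: ok"
    else
        echo "$n: not in --initial-cluster"
    fi
done
```

The function runs in a subshell, so the IFS and set -f changes do not leak into the calling script.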
For details, see the Ansible files written earlier.
For a cluster installed with kubeasz:
etcdctl --endpoints=https://172.20.19.14:2379,https://172.20.19.9:2379,https://172.20.19.17:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/kubernetes/ssl/etcd.pem --key=/etc/kubernetes/ssl/etcd-key.pem endpoint health
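etcd gateway and gRPC-Gateway
The etcd gateway usually serves as the entry point to an etcd cluster. It is a simple TCP proxy that forwards client requests to the cluster and hides the cluster's internal layout from clients; when a member fails or misbehaves, traffic can be switched over through the gateway.
gRPC-Gateway complements etcd's gRPC protocol. Some language clients do not support gRPC; they can use the HTTP API that gRPC-Gateway exposes instead and get the same functionality as the gRPC calls through plain HTTP requests.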
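As a small illustration of the HTTP API, keys and values travel as base64-encoded strings inside a JSON body. The sketch below only builds the body for a put request; the helper names b64 and put_body are made up, and the /v3/kv/put path and the plain-HTTP local endpoint in the commented curl line are assumptions that depend on the etcd version and listen configuration.

```shell
#!/bin/sh
# Base64-encode a string without a trailing newline or line wrapping.
b64() {
    printf '%s' "$1" | base64 | tr -d '\n'
}

# Build the JSON body for a put request: key in $1, value in $2.
put_body() {
    printf '{"key": "%s", "value": "%s"}\n' "$(b64 "$1")" "$(b64 "$2")"
}

put_body foo bar

# To send it (not executed here; endpoint and path are assumptions):
# curl -s http://127.0.0.1:2379/v3/kv/put -X POST -d "$(put_body foo bar)"
```

Responses use the same encoding, so values read back through the HTTP API need to be base64-decoded on the client side.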