k8s partial host migration
Overview / Background / Goals
A k8s cluster was previously set up on two hosts: one has a public IP with public DNS resolution, and the other provides storage via NFS.
k8s cluster information
Hostname | IP | OS | Container runtime |
---|---|---|---|
jingmin-kube-master1 | 192.168.1.1 | CentOS Stream release 8 (Linux jingmin-kube-master1 4.18.0-448.el8.x86_64 #1 SMP Wed Jan 18 15:02:46 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux) | docker-ce |
jingmin-kube-archlinux | 192.168.1.7 | Arch Linux (Linux jingmin-kube-archlinux 6.1.44-1-lts #1 SMP PREEMPT_DYNAMIC Tue, 08 Aug 2023 19:07:19 +0000 x86_64 GNU/Linux) | docker-ce |
The installed k8s version is 1.27.4.
The host jingmin-kube-archlinux runs Arch Linux, which is not stable enough (the OS gets frequent updates, and the NFS service it provides is also a single point of failure).
A Ceph distributed storage cluster has already been deployed on the internal network, so the plan is to switch k8s storage from NFS to the Ceph RBD service.
Ceph cluster information
Hostname | IP | OS | Container runtime |
---|---|---|---|
wangjm-B550M-K-1 | 192.168.1.8 | ubuntu 22 | docker.io |
wangjm-B550M-K-2 | 192.168.1.9 | ubuntu 22 | docker.io |
wangjm-B550M-K-3 | 192.168.1.10 | ubuntu 22 | podman-docker |
Goals:
- Join the three hosts of the Ceph cluster to the k8s cluster.
- Switch the k8s storage provisioner from NFS to Ceph RBD.
- Decommission the host jingmin-kube-archlinux.
Joining the three hosts to the k8s cluster
Load kernel modules
Enable kernel IP forwarding on Linux and load the modules required for forwarding and for containers.
Reference: https://kubernetes.io/zh-cn/docs/setup/production-environment/container-runtimes/#install-and-configure-prerequisites
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# Set the required sysctl parameters; they persist across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
# Apply the sysctl parameters without rebooting
sudo sysctl --system
Confirm that the br_netfilter and overlay modules are loaded by running:
lsmod | grep br_netfilter
lsmod | grep overlay
Confirm that the net.bridge.bridge-nf-call-iptables, net.bridge.bridge-nf-call-ip6tables, and net.ipv4.ip_forward system variables are set to 1 in your sysctl configuration by running:
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward
Disable swap
Check whether swap is currently in use:
free -m
Configure the system not to use swap:
echo "vm.swappiness = 0" >> /etc/sysctl.d/k8s.conf
sysctl --system
Turn off all swap (effective for the current boot only):
swapoff -a
Disable swap for future boots:
vim /etc/fstab
Many guides say it is enough to comment out the swap line here. That does not work on every system: systemd automatically scans GPT partitions and activates any swap partition it finds. Either delete the swap partition with fdisk, or change the options column of the swap entry in /etc/fstab from defaults to noauto so that it is not mounted automatically.
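For reference, a swap entry configured this way might look like the following line (the UUID is just a placeholder):
# /etc/fstab -- swap entry kept but not mounted automatically (placeholder UUID)
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  none  swap  noauto  0  0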
Add the k8s repository and install the base packages
Reference (official documentation): https://kubernetes.io/zh-cn/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#installing-kubeadm-kubelet-and-kubectl
Update the apt package index and install the packages needed to use the Kubernetes apt repository:
apt-get update
# apt-transport-https may be a dummy package; if so, you can skip installing it
apt-get install -y apt-transport-https ca-certificates curl gpg
Download the public signing key for the Kubernetes package repositories. The same signing key is used for all repositories, so you can ignore the version in the URL:
Since the existing k8s cluster is at version 1.27.4, the matching 1.27 repository is used here.
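If the /etc/apt/keyrings directory does not exist yet (it is present by default on Ubuntu 22.04, but may be missing on older releases), create it before running the curl command below:
sudo mkdir -p -m 755 /etc/apt/keyrings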
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.27/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
Add the Kubernetes apt repository. Note that this repository only contains packages for Kubernetes 1.27; for other Kubernetes minor versions, change the minor version in the URL to the one you need (and check that the installation documentation you are reading matches the Kubernetes version you plan to install).
# This overwrites any existing configuration in /etc/apt/sources.list.d/kubernetes.list.
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.27/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
Update the apt package index, then install kubelet, kubeadm, and kubectl, and pin their versions:
apt-get update
apt-get install -y kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl
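The commands above pull the newest 1.27.x packages from the repository. To match the existing cluster's 1.27.4 exactly, you can list the available package versions and install a specific one; the version string below is only an assumption, use whatever apt-cache actually reports:
apt-cache madison kubeadm
# assumed version string -- replace with one listed by apt-cache madison
apt-get install -y kubelet=1.27.4-1.1 kubeadm=1.27.4-1.1 kubectl=1.27.4-1.1
apt-mark hold kubelet kubeadm kubectl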
Enable the relevant services at boot
systemctl enable containerd
systemctl enable kubelet
#systemctl enable docker
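To double-check that both services are enabled:
systemctl is-enabled containerd kubelet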
Configure a proxy for containerd (optional)
While initializing k8s with kubeadm later, some images could not be pulled; in that case the proxy configuration below can be added.
vim /lib/systemd/system/containerd.service
Temporarily configure a proxy for containerd.service (the two Environment lines were added); comment them out again once they are no longer needed.
# Copyright The containerd Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target local-fs.target
[Service]
Environment="HTTP_PROXY=http://192.168.1.7:8889"
Environment="HTTPS_PROXY=http://192.168.1.7:8889"
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/bin/containerd
Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
LimitNOFILE=infinity
# Comment TasksMax if your systemd version does not supports it.
# Only systemd 226 and above support this version.
TasksMax=infinity
OOMScoreAdjust=-999
[Install]
WantedBy=multi-user.target
Reload the unit files and restart the service:
systemctl daemon-reload
systemctl restart containerd
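To confirm that containerd actually picked up the proxy settings after the restart, you can inspect the unit's environment; the output should contain the HTTP_PROXY/HTTPS_PROXY values configured above:
systemctl show containerd --property=Environment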
Join new nodes to the cluster //todo add control plane
On the existing k8s master:
[root@jingmin-kube-archlinux ~]# kubeadm token create --print-join-command
kubeadm join 192.168.1.1:6443 --token 9a9200.029fdpyenaxeki34 --discovery-token-ca-cert-hash sha256:bdc79fe69b3bb037f29068ce7a7f39a88e3c8d166317131121da42b4f2f6bf1b
On the node to be joined, run the command printed above:
kubeadm join 192.168.1.1:6443 --token 9a9200.029fdpyenaxeki34 --discovery-token-ca-cert-hash sha256:bdc79fe69b3bb037f29068ce7a7f39a88e3c8d166317131121da42b4f2f6bf1b
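Note that a token created with kubeadm token create expires after 24 hours by default; if the join fails with a token error, list the tokens on the control plane and generate a fresh join command:
kubeadm token list
kubeadm token create --print-join-command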
Check the result:
[root@jingmin-kube-archlinux ~]# kubectl get all -n kube-flannel
NAME READY STATUS RESTARTS AGE
pod/kube-flannel-ds-ftj95 1/1 Running 11 (73d ago) 233d
pod/kube-flannel-ds-m9df4 1/1 Running 2 (88s ago) 96s
pod/kube-flannel-ds-zb7xd 1/1 Running 34 (21d ago) 233d
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/kube-flannel-ds 3 3 3 3 3 <none> 258d
[root@jingmin-kube-archlinux ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
jingmin-kube-archlinux Ready <none> 258d v1.29.0
jingmin-kube-master1 Ready control-plane 258d v1.27.4
wangjm-b550m-k-1 Ready <none> 26m v1.27.13
//todo add control plane https://blog.slys.dev/adding-worker-and-control-plane-nodes-to-the-kubernetes-cluster/
//todo https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/
//todo https://www.reddit.com/r/kubernetes/comments/18xddaj/how_do_i_add_another_control_plane_node_using/
Run on an existing control plane node to update and upload the certs:
# In my case, the server IP in admin.conf needed to be changed to 192.168.1.1
[root@jingmin-kube-master1 kubernetes]# vim /etc/kubernetes/admin.conf
[root@jingmin-kube-master1 kubernetes]# kubeadm init phase upload-certs --upload-certs
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
be9c208662bfe1f0a6ad1cbe160c64a7a87b924c184b70dce0cb356573bd2d67
[root@jingmin-kube-master1 kubernetes]# kubeadm token create --print-join-command
kubeadm join 192.168.1.1:6443 --token u6nmrk.coo18mgamsesghy2 --discovery-token-ca-cert-hash sha256:bdc79fe69b3bb037f29068ce7a7f39a88e3c8d166317131121da42b4f2f6bf1b
On the new node:
# If the node was already joined as a worker earlier, it needs to be reset first
root@wangjm-B550M-K-1:~# kubeadm reset
root@wangjm-B550M-K-1:~# kubeadm join 192.168.1.1:6443 --token 9a9200.029fdpyenaxeki34 --discovery-token-ca-cert-hash sha256:bdc79fe69b3bb037f29068ce7a7f39a88e3c8d166317131121da42b4f2f6bf1b --control-plane --certificate-key be9c208662bfe1f0a6ad1cbe160c64a7a87b924c184b70dce0cb356573bd2d67
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
W0506 12:07:45.984967 107889 checks.go:835] detected that the sandbox image "registry.k8s.io/pause:3.8" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "registry.k8s.io/pause:3.9" as the CRI sandbox image.
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[download-certs] Saving the certificates to the folder: "/etc/kubernetes/pki"
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [jingmin-kube-master1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local ole12138.top wangjm-b550m-k-1] and IPs [172.31.0.1 192.168.1.8 192.168.1.1]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost wangjm-b550m-k-1] and IPs [192.168.1.8 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost wangjm-b550m-k-1] and IPs [192.168.1.8 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
The 'update-status' phase is deprecated and will be removed in a future release. Currently it performs no operation
[mark-control-plane] Marking the node wangjm-b550m-k-1 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node wangjm-b550m-k-1 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
This node has joined the cluster and a new control plane instance was created:
* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.
To start administering your cluster from this node, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Run 'kubectl get nodes' to see this node join the cluster.
As prompted above, run the following on the master node:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
After that, the kubectl command works on this node and can be used to view and manage all kinds of resources in the cluster.
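Alternatively, as root you can simply point KUBECONFIG at the admin config instead of copying it:
export KUBECONFIG=/etc/kubernetes/admin.conf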
The version commonly found online:
By default the master node does not schedule regular pods; if you want the master node to be able to run pods as well, run:
kubectl taint nodes --all node-role.kubernetes.io/master-
The trailing minus sign removes the corresponding taint.
Official reference: https://kubernetes.io/zh-cn/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#control-plane-node-isolation
If you want to be able to schedule Pods on control-plane nodes, for example in a single-machine Kubernetes cluster, run:
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
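To verify which taints are still present on a node afterwards, for example on the original master:
kubectl describe node jingmin-kube-master1 | grep Taints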
Appendix: kubelet failure recovery (failed, made a mess of it)
Accidentally deleted /etc/kubernetes/kubelet.conf on the control plane...
After that the kubelet service failed to start,
with the error: Kubeadm fails – kubelet fails to find /etc/kubernetes/bootstrap-kubelet.conf
Reinstalling kubelet: still failed.
Reference: https://github.com/kubernetes-sigs/kubespray/issues/3769
Reference: https://www.jianshu.com/p/4f7d977a2b91
Regenerating the certs
- Back up and regenerate the certificates
# cd /etc/kubernetes/pki/
# mkdir backup
# mv apiserver.crt apiserver-etcd-client.key apiserver-kubelet-client.crt front-proxy-ca.crt front-proxy-client.crt front-proxy-client.key front-proxy-ca.key apiserver-kubelet-client.key apiserver.key apiserver-etcd-client.crt backup
# kubeadm init phase certs all
In my case that was:
# kubeadm init phase certs all --apiserver-advertise-address=0.0.0.0 --apiserver-cert-extra-sans localhost --apiserver-cert-extra-sans ole12138.top
# kubeadm init phase certs all --apiserver-advertise-address=192.168.1.1 --apiserver-cert-extra-sans localhost --apiserver-cert-extra-sans ole12138.top --control-plane-endpoint=192.168.1.1
- Back up and regenerate the kubeconfig files
# cd /etc/kubernetes/
# mkdir backup
# mv admin.conf controller-manager.conf kubelet.conf scheduler.conf backup
# kubeadm init phase kubeconfig all
In my case, the server IP in /etc/kubernetes/admin.conf also had to be changed manually to 192.168.1.1 (see the quick check at the end of this appendix).
- Copy the user credentials file (usually not needed)
Usually not needed, unless your kubectl commands are failing.
# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
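After the certificates and kubeconfig files have been regenerated, a quick sanity check is to list the certificate expirations and confirm that the server address in admin.conf points at https://192.168.1.1:6443:
kubeadm certs check-expiration
grep 'server:' /etc/kubernetes/admin.conf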