Restoring etcd Consensus

If etcd loses consensus, stop k3s on all master nodes:

systemctl stop k3s.service

To clean up processes that might hang around, restart the nodes with k3s disabled:

systemctl disable --now k3s.service
systemctl reboot

On one node, run:

k3s server --cluster-reset

Follow the directions the command prints. Configuration changes may be needed (for example, removing the server key from /etc/rancher/k3s/config.yaml).
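A quick way to see whether the server key is set before restarting might look like the sketch below. The helper name has_server_key is made up for illustration, and it assumes the key sits at the top level of the YAML file, uncommented and unindented.

```shell
# Hypothetical helper: succeed if a k3s config file sets a top-level "server" key.
# Defaults to the config path named in the text; pass another path to override.
has_server_key() {
    config="${1:-/etc/rancher/k3s/config.yaml}"
    # Match "server:" at the start of a line; commented-out lines do not match.
    grep -Eq '^server[[:space:]]*:' "$config" 2>/dev/null
}

if has_server_key; then
    echo "server key present: review it before restarting k3s on this node"
else
    echo "no server key found (or config file missing)"
fi
```

This only reports; editing the file is still done by hand.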

On the same node, restart and enable k3s:

systemctl enable --now k3s.service

Check for errors and address any that appear.
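One way to look for errors is to filter the service's recent journal output. This is a rough sketch: the helper name filter_k3s_errors is hypothetical, and the patterns assume k3s's usual "level=error" style messages plus Kubernetes component lines that start with E or F followed by a date. Other error formats will slip through, so read the full log if something looks off.

```shell
# Hypothetical helper: read log lines on stdin and keep only those that look like errors.
# Assumed formats: level=error / level=fatal markers, and klog-style lines
# such as "E0102 03:04:05.000000 ...". Returns success even when nothing matches.
filter_k3s_errors() {
    grep -E 'level=(error|fatal)|(^|[[:space:]])[EF][0-9]{4} [0-9]{2}:[0-9]{2}:' || true
}

# Only query the journal where journalctl is available.
if command -v journalctl >/dev/null 2>&1; then
    journalctl -u k3s.service --since "10 minutes ago" --no-pager | filter_k3s_errors
fi
```

Empty output means nothing matched the patterns, not necessarily that the service is healthy.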

On the remaining master nodes, back up and delete the etcd data directory, adjust /etc/rancher/k3s/config.yaml’s server key if necessary, then enable and start k3s:

rsync -avz /var/lib/rancher/k3s/server/db/ /root/db-backup/
rm -rf /var/lib/rancher/k3s/server/db/
systemctl enable --now k3s.service

Check to make sure the node properly joins the cluster as a master.
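The join check can be sketched with the kubectl bundled in k3s. The helper name node_is_ready is hypothetical, and the usage assumes the node is registered under its hostname, which is the k3s default unless a node name was configured. The ROLES column in the full listing should also be inspected by eye, since the exact role labels vary between k3s versions.

```shell
# Hypothetical helper: read `kubectl get nodes --no-headers` output on stdin and
# succeed only if the named node is listed with STATUS exactly "Ready".
node_is_ready() {
    awk -v node="$1" '$1 == node && $2 == "Ready" { found = 1 } END { exit !found }'
}

# Only run the check where the k3s binary is installed.
if command -v k3s >/dev/null 2>&1; then
    # Show the full listing, including the ROLES column.
    k3s kubectl get nodes
    if k3s kubectl get nodes --no-headers | node_is_ready "$(hostname)"; then
        echo "$(hostname) is Ready"
    else
        echo "$(hostname) is not Ready yet"
    fi
fi
```

A node can take a little while to report Ready after k3s starts, so a "not Ready yet" result right after the restart may just mean it is worth checking again.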

Change your shirt and underwear before going out in public.