What's Actually Happening

The etcd cluster reports an unhealthy status. Because etcd is the backing store for Kubernetes, the control plane may be affected as well.

The Error You'll See

```bash
$ etcdctl endpoint health
http://10.0.0.1:2379 is unhealthy: failed to commit proposal: context deadline exceeded
```

Why This Happens

  1. Member down
  2. Quorum loss
  3. Network partition
  4. Disk issues
  5. Clock skew

Step 1: Check Cluster Health

```bash
etcdctl endpoint health --cluster
etcdctl endpoint status --cluster -w table
```
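
To narrow a large cluster down to the failing endpoints, the health output can be filtered. This is a minimal sketch: `unhealthy_endpoints` is a hypothetical helper name, and it assumes the plain-text output format shown in the error above, where each line starts with the endpoint URL.

```bash
# Hypothetical helper: read `etcdctl endpoint health` text output on stdin
# and print the endpoint (first field) of every line marked unhealthy.
unhealthy_endpoints() {
  awk '/ is unhealthy/ { print $1 }'
}

# Example (some etcdctl versions write health lines to stderr, hence 2>&1):
#   etcdctl endpoint health --cluster 2>&1 | unhealthy_endpoints
```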

Step 2: Check Member List

```bash
etcdctl member list -w table
```
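
A member that was added but never joined typically shows a status other than `started`. The sketch below flags such rows. It assumes etcdctl's default comma-separated output (not `-w table`), with the member ID in the first column and the status in the second; `members_not_started` is a hypothetical name.

```bash
# Hypothetical helper: read default `etcdctl member list` output on stdin
# and print "ID status" for every member whose status is not "started".
members_not_started() {
  awk -F', ' 'NF > 1 && $2 != "started" { print $1, $2 }'
}

# Example:
#   etcdctl member list | members_not_started
```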

Step 3: Check Logs

```bash
journalctl -u etcd -f
```
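
Following the log live is noisy, so it can help to filter for phrases that commonly accompany slow disks, missed heartbeats, or clock problems. This is only a sketch: `etcd_log_warnings` is a hypothetical helper, and the phrases in the pattern are assumptions, since the exact log wording differs between etcd versions.

```bash
# Hypothetical helper: keep only log lines that match a few warning phrases.
# The pattern list is an assumption; adjust it to what your etcd version logs.
etcd_log_warnings() {
  grep -E -i 'took too long|slow fdatasync|heartbeat on time|clock (difference|drift)'
}

# Example:
#   journalctl -u etcd --since "1 hour ago" --no-pager | etcd_log_warnings
```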

Step 4: Check Network

```bash
nc -zv etcd-1 2379
nc -zv etcd-1 2380
```
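
The same probe can be repeated for every member in one pass. The sketch below assumes the client and peer ports used above (2379 and 2380) and an `nc` that supports `-z` and `-w`; `check_etcd_ports` is a hypothetical helper, and any host names other than `etcd-1` are placeholders.

```bash
# Hypothetical helper: probe the client (2379) and peer (2380) ports on each
# host passed as an argument. Returns non-zero if any probe fails.
check_etcd_ports() {
  local host port failed
  failed=0
  for host in "$@"; do
    for port in 2379 2380; do
      # -z: only test the connection, -w 2: give up after 2 seconds
      if nc -z -w 2 "$host" "$port" >/dev/null 2>&1; then
        echo "ok    $host:$port"
      else
        echo "FAIL  $host:$port"
        failed=1
      fi
    done
  done
  return "$failed"
}

# Example (host names are placeholders):
#   check_etcd_ports etcd-1 etcd-2 etcd-3
```

Running it from each member in turn helps distinguish a one-way partition from a node that is unreachable by everyone.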

Step 5: Check Disk

```bash
df -h /var/lib/etcd
iostat -x 1 5
```
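
For a scripted check, the used-space percentage can be read from `df` and compared against a threshold. A minimal sketch, assuming POSIX `df -P` output; both function names are hypothetical and the 80% threshold in the example is an arbitrary value, not an etcd recommendation.

```bash
# Hypothetical helper: print the used-space percentage (without the % sign)
# of the filesystem that holds the given path.
disk_used_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: warn when usage is at or above the given threshold.
warn_if_full() {
  local pct
  pct=$(disk_used_pct "$1")
  [ -n "$pct" ] || return 2   # df produced no usable output
  if [ "$pct" -ge "$2" ]; then
    echo "WARNING: $1 is ${pct}% full"
    return 1
  fi
  echo "OK: $1 is ${pct}% full"
}

# Example:
#   warn_if_full /var/lib/etcd 80
```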

Step 6: Check Quorum

```bash
# A majority (quorum) of members must be up: floor(N/2) + 1
# For a 3-node cluster: 2 members (tolerates 1 failure)
# For a 5-node cluster: 3 members (tolerates 2 failures)
```
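
The quorum rule is easy to work out for any cluster size. The sketch below computes it with shell integer arithmetic; `quorum` and `fault_tolerance` are hypothetical helper names.

```bash
# Quorum size for an N-member cluster: floor(N/2) + 1.
# Shell arithmetic uses integer division, so N / 2 already rounds down.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while the cluster still has quorum.
fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5; do
  echo "members=$n quorum=$(quorum "$n") tolerates=$(fault_tolerance "$n")"
done
```

A 4-member cluster needs 3 members for quorum and tolerates only 1 failure, the same as a 3-member cluster, which is why odd cluster sizes are usually preferred.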

Step 7: Restart Member

```bash
systemctl restart etcd
```
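
After a restart, the member may need a few seconds to rejoin before it reports healthy. The following sketch polls the health check until it succeeds or a retry limit is reached. `wait_for_healthy` is a hypothetical helper, and the default of 10 attempts with a 3-second delay is an arbitrary choice.

```bash
# Hypothetical helper: poll `etcdctl endpoint health` until it succeeds.
# Arguments: number of attempts (default 10), delay in seconds (default 3).
wait_for_healthy() {
  local attempts delay i
  attempts=${1:-10}
  delay=${2:-3}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # etcdctl exits non-zero when the endpoint is unhealthy
    if etcdctl endpoint health >/dev/null 2>&1; then
      echo "healthy after $i check(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "still unhealthy after $attempts check(s)" >&2
  return 1
}

# Example:
#   systemctl restart etcd && wait_for_healthy
```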

Step 8: Remove Failed Member

```bash
etcdctl member remove <member-id>
```
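
The `<member-id>` is the hexadecimal ID in the first column of the member list. As a sketch, it can be looked up by member name; this assumes etcdctl's default comma-separated output with the name in the third column, and both `member_id_by_name` and the name `etcd-3` are hypothetical.

```bash
# Hypothetical helper: read default `etcdctl member list` output on stdin
# and print the ID of the member whose name matches the first argument.
member_id_by_name() {
  awk -F', ' -v name="$1" '$3 == name { print $1 }'
}

# Example (etcd-3 is a placeholder name; inspect the ID before removing):
#   id=$(etcdctl member list | member_id_by_name etcd-3)
#   echo "$id"
#   etcdctl member remove "$id"
```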

Step 9: Add New Member

```bash
etcdctl member add etcd-new --peer-urls=http://10.0.0.4:2380
```
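
`etcdctl member add` prints the environment settings that the new node should start with. The sketch below shows their general shape. The existing member names and addresses (`etcd-1` to `etcd-3` on 10.0.0.1 to 10.0.0.3) are hypothetical, so use the values printed by your own cluster. The important part is that the new member joins with cluster state `existing` and an empty data directory.

```bash
# Settings for the new member, as environment variables read by etcd.
# Names and addresses other than etcd-new are placeholders (assumptions).
ETCD_NAME="etcd-new"
ETCD_INITIAL_CLUSTER="etcd-1=http://10.0.0.1:2380,etcd-2=http://10.0.0.2:2380,etcd-3=http://10.0.0.3:2380,etcd-new=http://10.0.0.4:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://10.0.0.4:2380"
# "existing" makes etcd join the running cluster instead of bootstrapping a new one.
ETCD_INITIAL_CLUSTER_STATE="existing"
```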

Step 10: Defragment

```bash
# Compact old revisions first; defrag then releases the freed space on disk
etcdctl compact <revision>
etcdctl defrag
```

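
The `<revision>` value can be taken from the endpoint status. A minimal sketch, assuming the JSON output contains a `"revision":<number>` field in its header; `current_revision` is a hypothetical helper. Compaction usually runs before defragmentation, because defragmentation reclaims the space that compaction frees. Compacting to the current revision discards all older key history.

```bash
# Hypothetical helper: extract the first "revision" value from
# `etcdctl endpoint status -w json` output read on stdin.
current_revision() {
  grep -Eo '"revision":[0-9]+' | grep -Eo '[0-9]+' | head -n 1
}

# Example:
#   rev=$(etcdctl endpoint status -w json | current_revision)
#   etcdctl compact "$rev"
#   etcdctl defrag
```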