Introduction

When a Kubernetes node becomes NotReady, the control plane stops treating it as a healthy scheduling target. In many incidents, the underlying cause is kubelet failure or degraded node-level health rather than a problem in the Pods themselves. The right response is to inspect the kubelet, the node’s local resources, and its connectivity to the API server before trying to fix workloads on top of it.

Symptoms

  • kubectl get nodes shows the node as NotReady
  • New Pods stop scheduling to the node
  • Existing Pods become Unknown, are evicted, or stop receiving normal updates
  • Events point to KubeletNotReady, disk pressure, or connectivity issues
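
The first symptom can be checked quickly from any machine with cluster access. The sketch below is a minimal illustration: not_ready_nodes is a hypothetical helper name, and the node name my-node is a placeholder.

```shell
# Print the names of nodes whose STATUS column starts with NotReady.
# Reads `kubectl get nodes --no-headers` output on stdin.
# not_ready_nodes is a hypothetical helper, not a kubectl feature.
not_ready_nodes() {
  awk '$2 ~ /^NotReady/ { print $1 }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get nodes --no-headers | not_ready_nodes
#
# Recent events for one node (my-node is a placeholder):
#   kubectl get events --field-selector involvedObject.kind=Node,involvedObject.name=my-node
```

Matching on the start of the STATUS column also catches cordoned nodes that report NotReady,SchedulingDisabled.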

Common Causes

  • The kubelet service stopped or is crashing repeatedly
  • Disk pressure or other local resource exhaustion degraded node health
  • The node lost connectivity to the API server
  • Kubelet configuration or node certificates became invalid
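
The connectivity and certificate causes above can be checked from the node itself. This is a rough sketch under stated assumptions: the kubeconfig path /etc/kubernetes/kubelet.conf and the certificate path /var/lib/kubelet/pki/kubelet-client-current.pem are typical for kubeadm-provisioned nodes and may differ on your distribution, and api_server_url is a hypothetical helper.

```shell
# Print the first API server URL found in a kubeconfig file.
# api_server_url is a hypothetical helper; it does simple line matching,
# not full YAML parsing.
api_server_url() {
  awk '$1 == "server:" { print $2; exit }' "$1"
}

# Usage on the node (paths are assumptions, see above):
#   url=$(api_server_url /etc/kubernetes/kubelet.conf)
#
#   # Any HTTP response, even 401 or 403, shows the network path works.
#   curl -sk --max-time 5 "$url/healthz"
#
#   # Show when the kubelet client certificate expires.
#   sudo openssl x509 -noout -enddate -in /var/lib/kubelet/pki/kubelet-client-current.pem
```

A timeout or connection refusal from curl points toward networking, firewall, or load balancer problems, while a past expiry date points toward certificate rotation.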

Step-by-Step Fix

  1. Inspect node conditions and recent events
     Start from the cluster view so you know whether the issue is readiness, pressure, or communication.
bash
kubectl describe node my-node
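
To condense the describe output, the node's conditions can be printed one per line and filtered down to the unhealthy ones. This is a minimal sketch: unhealthy_conditions is a hypothetical helper, and it assumes the usual convention that Ready should be True while every other condition should be False.

```shell
# Keep only conditions that look unhealthy.
# Input: tab-separated lines of "type<TAB>status<TAB>reason".
# unhealthy_conditions is a hypothetical helper, not part of kubectl.
unhealthy_conditions() {
  awk -F '\t' '($1 == "Ready" && $2 != "True") || ($1 != "Ready" && $2 != "False")'
}

# Usage against a live cluster (my-node is a placeholder):
#   kubectl get node my-node \
#     -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.reason}{"\n"}{end}' \
#     | unhealthy_conditions
```

A status of Unknown on any condition is reported as well, which usually means the kubelet has stopped posting updates.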
  2. Check kubelet service health on the node
     If kubelet is down or crashing, no higher-level Kubernetes debugging will help until the node agent is stable again.
bash
sudo systemctl status kubelet
sudo journalctl -u kubelet --since "1 hour ago"
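
An hour of kubelet logs can be long, so it may help to narrow it to error lines. This sketch assumes the kubelet uses its default text (klog) log format, where error lines begin with E followed by the month and day; it does not apply if JSON logging is configured. kubelet_errors is a hypothetical helper.

```shell
# Keep only klog error lines (header like "E0312 10:15:02.000001 ...").
# Expects raw messages, e.g. from `journalctl -o cat`.
# kubelet_errors is a hypothetical helper.
kubelet_errors() {
  grep -E '^E[0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}'
}

# Usage on the node:
#   sudo journalctl -u kubelet --since "1 hour ago" -o cat --no-pager | kubelet_errors | tail -n 50
#
# Quick state checks (NRestarts needs a reasonably recent systemd):
#   systemctl is-active kubelet
#   systemctl show kubelet --property=NRestarts
```

A rising NRestarts value alongside repeated identical errors suggests a crash loop rather than a one-off failure.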
  3. Verify local disk and basic node health
     Full disks and corrupted local state are common reasons kubelet degrades or stops.
bash
df -h
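
To pick out the nearly full filesystems instead of reading the whole table, the df output can be filtered by usage percentage. This is a minimal sketch: over_threshold is a hypothetical helper, the 85 percent limit is an arbitrary example value rather than a kubelet default, and mount points containing spaces are not handled.

```shell
# Print "mountpoint capacity" for filesystems at or above a usage limit.
# Reads POSIX-format `df -P` output on stdin.
# usage: over_threshold <percent>
# over_threshold is a hypothetical helper.
over_threshold() {
  awk -v limit="$1" 'NR > 1 {
    use = $5
    sub(/%/, "", use)
    if (use + 0 >= limit) print $6, $5
  }'
}

# Usage on the node:
#   df -P | over_threshold 85
#
# Inode exhaustion can also cause trouble even when space looks free:
#   df -i
```

The same filter could feed a simple alert so that disk usage is noticed before the node reports disk pressure.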
  4. Restart kubelet only after checking why it failed
     A restart may restore service temporarily, but you still need to understand whether the root cause is configuration, connectivity, or resource pressure.
bash
sudo systemctl restart kubelet
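
After a restart, it is worth confirming that the service stays up and the node actually returns to Ready. The sketch below uses a small polling loop; wait_until is a hypothetical helper, and the attempt counts and timeouts are example values.

```shell
# Retry a command until it succeeds or the attempt budget runs out.
# usage: wait_until <attempts> <sleep_seconds> <command...>
# wait_until is a hypothetical helper.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while :; do
    "$@" && return 0
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Usage on the node: wait up to about a minute for the service.
#   wait_until 12 5 systemctl is-active --quiet kubelet
#
# Usage from a machine with cluster access (my-node is a placeholder):
#   kubectl wait --for=condition=Ready node/my-node --timeout=120s
```

If the service becomes active but the node never reaches Ready, the problem is more likely connectivity or certificates than the kubelet process itself.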

Prevention

  • Monitor kubelet service health and node conditions directly
  • Alert on disk pressure before nodes become NotReady
  • Keep node bootstrap and certificate rotation procedures documented
  • Treat repeated kubelet restarts as a real incident signal, not just noise