What's Actually Happening

A node in the NotReady state means the kubelet on that node is not reporting a healthy status to the Kubernetes API server. Pods already on the node may keep running, but new pods won't be scheduled there and the node is treated as unhealthy.

The Error You'll See

bash
$ kubectl get nodes
NAME       STATUS     ROLES           AGE   VERSION
node-1     Ready      control-plane   10d   v1.28.0
node-2     NotReady   <none>          10d   v1.28.0
node-3     Ready      <none>          10d   v1.28.0

The STATUS column shows NotReady for node-2.
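On a larger cluster it can help to filter that listing down to the affected nodes. The following is a minimal sketch, not part of kubectl: `not_ready_nodes` is a hypothetical helper name, and it assumes the default column layout shown above (name first, status second).

```bash
# Hypothetical helper: print the names of nodes whose STATUS does not
# start with "Ready". Reads `kubectl get nodes` output on stdin.
not_ready_nodes() {
  awk '
    $1 == "NAME" { next }               # skip the header row if present
    $2 !~ /^Ready/ { print $1 }         # keeps NotReady, drops Ready,SchedulingDisabled
  '
}

# Usage (assumes kubectl is configured for the cluster):
#   kubectl get nodes | not_ready_nodes
```

A cordoned but healthy node reports `Ready,SchedulingDisabled`, which is why the match is on the prefix rather than on an exact `Ready`.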

Why This Happens

  1. kubelet stopped - Kubelet service crashed or stopped
  2. Container runtime down - Docker/containerd not running
  3. Network issues - Node cannot reach API server
  4. Certificate expired - Kubelet client certificate expired
  5. Disk pressure - Node disk full, kubelet can't function
  6. Memory pressure - Node running out of memory
  7. Clock skew - Node time significantly different from master

Step 1: Describe the Node

bash
kubectl describe node node-2

Look at the Conditions section:

bash
Conditions:
  Type                 Status  LastHeartbeatTime                 Reason                       Message
  ----                 ------  -----------------                 -------                       -------
  MemoryPressure       False   Mon, 03 Apr 2026 10:00:00         KubeletHasSufficientMemory    kubelet has sufficient memory
  DiskPressure         False   Mon, 03 Apr 2026 10:00:00         KubeletHasNoDiskPressure      kubelet has no disk pressure
  PIDPressure          False   Mon, 03 Apr 2026 10:00:00         KubeletHasSufficientPID       kubelet has sufficient PID available
  Ready                False   Mon, 03 Apr 2026 10:00:00         KubeletNotReady               container runtime not running

The Ready condition shows False, and the Reason and Message columns say why (here, the container runtime is not running).
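To avoid reading the table by eye, the conditions can be filtered down to the ones in a bad state. This is a rough sketch that assumes the `kubectl describe node` layout shown above; `unhealthy_conditions` is a hypothetical helper name. It treats Ready as healthy when True and every other condition as healthy when False.

```bash
# Hypothetical helper: print node conditions that are in an unhealthy state.
# Reads `kubectl describe node <name>` output on stdin.
unhealthy_conditions() {
  awk '
    /^Conditions:/     { in_cond = 1; next }   # start of the conditions table
    in_cond && /^[^ ]/ { in_cond = 0 }         # next top-level section ends it
    in_cond && ($1 == "Type" || $1 ~ /^-+$/) { next }   # skip the two header rows
    in_cond && NF >= 2 {
      healthy = ($1 == "Ready") ? "True" : "False"
      if ($2 != healthy) print $1 "=" $2
    }
  '
}

# Usage (assumes kubectl is configured for the cluster):
#   kubectl describe node node-2 | unhealthy_conditions
```

A status of Unknown is reported too, which is typically what appears once the kubelet has stopped sending heartbeats.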

Step 2: SSH to Node and Check Kubelet

```bash
ssh node-2

# Check kubelet service
systemctl status kubelet
```

Output shows:

bash
kubelet.service - Kubernetes Kubelet
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled)
   Active: inactive (dead) since Mon 2026-04-03 10:00:00

If stopped:

bash
systemctl start kubelet
systemctl enable kubelet

Step 3: Check Container Runtime

```bash
ssh node-2

# For Docker
systemctl status docker

# For containerd
systemctl status containerd
```

If not running:

bash
systemctl start docker
# or
systemctl start containerd

Check logs for errors:

bash
journalctl -u docker -n 50
journalctl -u containerd -n 50

Step 4: Check Kubelet Logs

bash
ssh node-2
journalctl -u kubelet -n 100

Look for errors:

bash
E0310 10:00:00.123456  1234 kubelet.go:1234] "Unable to register node with API server" err="Post https://10.96.0.1:443/api/v1/nodes: dial tcp 10.96.0.1:443: connect: connection refused"

Common errors:

- connection refused - API server unreachable
- certificate signed by unknown authority - Cert issues
- disk pressure - Disk full
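Scanning a long log for these strings is easy to script. The sketch below maps the three error strings listed above to their likely cause; `classify_kubelet_errors` is a hypothetical helper name, and the patterns are plain substring matches rather than a complete catalogue of kubelet messages.

```bash
# Hypothetical helper: map kubelet log lines (stdin) to a likely cause.
# Matching is case-insensitive and based only on the three strings above.
classify_kubelet_errors() {
  awk '
    { line = tolower($0) }
    line ~ /connection refused/                      { refused = 1 }
    line ~ /certificate signed by unknown authority/ { cert = 1 }
    line ~ /disk ?pressure/                          { disk = 1 }
    END {
      if (refused) print "API server unreachable"
      if (cert)    print "Cert issues"
      if (disk)    print "Disk full"
    }
  '
}

# Usage (run on the node):
#   journalctl -u kubelet -n 100 | classify_kubelet_errors
```

No output means none of the three patterns matched, not that the log is clean.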

Step 5: Check Network Connectivity

```bash
ssh node-2

# Test API server connectivity
curl -k https://10.96.0.1:443/healthz
# Should return "ok"

# Check DNS (resolves only if the node's resolver points at the cluster DNS)
nslookup kubernetes.default

# Check if node can ping master
ping -c 3 node-1
```

Step 6: Check Certificate Issues

```bash
ssh node-2

# Check kubelet certificates (kubeadm keeps them under /var/lib/kubelet/pki/)
ls -la /var/lib/kubelet/pki/

# Check cert expiration
openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -text -noout | grep Not
```

If expired:

bash
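# Regenerate certs (kubeadm); run on a control-plane node
# Note: this renews control-plane certs; the kubelet client cert normally rotates automatically
kubeadm certs renew all
systemctl restart kubelet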

Step 7: Check Disk Space

bash
ssh node-2
df -h

Check:

bash
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       50G   48G  2G   96% /    # Problem: almost full
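The same check can be scripted so that only the nearly full filesystems are printed. This is a small sketch: `full_filesystems` is a hypothetical helper name, and the default threshold of 85% is an arbitrary choice, not a kubelet setting. It assumes the usual `df` column order (usage in column 5, mount point in column 6) and mount points without spaces.

```bash
# Hypothetical helper: print mount points at or above a usage threshold.
# Reads `df -P` (or `df -h`) output on stdin; threshold defaults to 85.
full_filesystems() {
  awk -v limit="${1:-85}" '
    NR > 1 {                               # skip the header row
      use = $5
      sub(/%$/, "", use)                   # "96%" -> "96"
      if (use + 0 >= limit + 0) print $6 " " $5
    }
  '
}

# Usage (run on the node):
#   df -P | full_filesystems 90
```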

Clean up if needed:

```bash
# Clean old Docker images
docker system prune -a

# Clean containerd
crictl rmi --prune

# Check kubelet reserved space
df -h /var/lib/kubelet
```

Step 8: Check Node Resources

bash
ssh node-2
free -h
top

If memory pressure:

```bash
# Check for memory hog
ps aux --sort=-%mem | head -10

# Evict pods if needed
kubectl drain node-2 --ignore-daemonsets --delete-emptydir-data
```

Step 9: Check Clock Sync

bash
ssh node-2
date
# Compare with master
ssh node-1 date

If time differs significantly:

bash
# Enable NTP sync
systemctl start systemd-timesyncd
# Or install chrony
apt-get install chrony
systemctl start chrony
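Comparing two `date` outputs by eye is error-prone, so the difference can be computed from Unix timestamps instead. A minimal sketch follows; `clock_skew_seconds` is a hypothetical helper name, and the SSH round trip adds a little latency to the measured value.

```bash
# Hypothetical helper: print the absolute difference in seconds
# between two Unix timestamps.
clock_skew_seconds() {
  diff=$(( $1 - $2 ))
  echo "${diff#-}"          # strip a leading minus sign, if any
}

# Usage (run on node-2; assumes SSH access to node-1):
#   clock_skew_seconds "$(date +%s)" "$(ssh node-1 date +%s)"
```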

Verify the Fix

After applying the fix on the node:

```bash
kubectl get nodes
# Should show Ready

NAME     STATUS   ROLES           AGE   VERSION
node-1   Ready    control-plane   10d   v1.28.0
node-2   Ready    <none>          10d   v1.28.0

# Check node conditions (-A8 so the Ready row below the pressure rows is included)
kubectl describe node node-2 | grep -A8 "Conditions:"
# Ready should be True
```

Prevention Tips

Monitor node health:

```bash
# Set up node monitoring
kubectl top nodes

# Monitor kubelet status
systemctl status kubelet

# Set up alerts for NotReady nodes
# Use Prometheus alert: KubeNodeNotReady

# Regular cert renewal
kubeadm certs check-expiration
```