What's Actually Happening
When a Kubernetes node runs low on disk space, it reports a DiskPressure condition. The kubelet starts evicting pods to free space, and the node is tainted so the scheduler stops placing new pods on it. This protects the node from running out of disk completely.
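The check behind this condition is simple percentage arithmetic. A minimal sketch of the comparison, assuming the kubelet's default `nodefs.available` hard threshold of 10% (`under_pressure` is a hypothetical helper for illustration, not a kubelet API):

```shell
# Sketch of the kubelet's nodefs.available check (assumed default threshold: 10%).
# under_pressure <avail_gb> <total_gb> <threshold_pct> prints "yes" if the
# available fraction has dropped below the threshold, else "no".
under_pressure() {
  local avail=$1 total=$2 threshold=$3
  local pct=$(( avail * 100 / total ))   # integer percent still available
  if [ "$pct" -lt "$threshold" ]; then echo yes; else echo no; fi
}

# 5G free of 50G = 10% available: exactly at, not below, the 10% threshold
under_pressure 5 50 10    # -> no
# 4G free of 50G = 8% available: below the threshold, eviction begins
under_pressure 4 50 10    # -> yes
```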
The Error You'll See
```bash
$ kubectl describe node node-1
Conditions:
  Type             Status  Reason                      Message
  ----             ------  ------                      -------
  MemoryPressure   False   KubeletHasSufficientMemory  kubelet has sufficient memory
  DiskPressure     True    KubeletHasDiskPressure      kubelet has disk pressure
  PIDPressure      False   KubeletHasSufficientPID     kubelet has sufficient PID available
  Ready            True    KubeletReady                kubelet is posting ready status
```

Pods may show eviction:

```bash
$ kubectl get pods -o wide
NAME       STATUS   RESTARTS  AGE  NODE
my-app-xx  Evicted  0         5m   node-1
```

Why This Happens
1. Container images filling disk - old images are never cleaned up
2. Container logs too large - applications logging excessively
3. EmptyDir volumes - pods using large emptyDir volumes
4. Downloaded layers - image pull caches filling disk
5. Old logs - system and container logs accumulating
6. Journald logs - the system journal growing unbounded
Step 1: Check Node Disk Usage
```bash
kubectl describe node node-1 | grep -A20 "Conditions:"
```

Shows:

```
DiskPressure   True   KubeletHasDiskPressure   kubelet has disk pressure
```

Get more details:

```bash
# SSH to node
ssh node-1
df -h
```

Shows disk usage:

```
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda1   50G   45G   5G     90%   /
/dev/sda2   100G  95G   5G     95%   /var/lib/docker
```

Check kubelet thresholds:

```bash
cat /var/lib/kubelet/config.yaml | grep -A5 eviction
```

Shows when eviction triggers:

```yaml
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"
  imagefs.available: "15%"
```

On the 50G root disk above, `nodefs.available: "10%"` means eviction starts once less than 5G is free.

Step 2: Clean Up Docker Resources
```bash
ssh node-1

# Remove unused images
docker image prune -a

# Remove stopped containers
docker container prune

# Remove unused volumes
docker volume prune

# Full cleanup
docker system prune -a --volumes
```
For containerd:
```bash
# List images
crictl images

# Remove unused images
crictl rmi --prune

# Remove a specific image
crictl rmi <image-id>
```
Step 3: Clean Up Container Logs
```bash
ssh node-1

# Check log sizes
du -sh /var/lib/docker/containers/*/*-json.log

# Truncate large logs
truncate -s 0 /var/lib/docker/containers/*/*-json.log
```
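Truncating rather than deleting matters here: the Docker daemon keeps each log file open, so `rm` only unlinks the name while the space stays allocated until the container restarts. A small local demonstration on a temp file (not a real log):

```shell
# Truncate leaves the file in place but drops its size to zero, which frees
# the space even while another process holds the file descriptor open.
f=$(mktemp)
head -c 4096 /dev/zero > "$f"   # simulate ~4 KiB of log data
truncate -s 0 "$f"
stat -c %s "$f"                 # prints 0
rm -f "$f"
```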
Configure log rotation in `/etc/docker/daemon.json`:

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
```

Restart Docker:

```bash
systemctl restart docker
```

Step 4: Clean Up System Logs
```bash
ssh node-1

# Check journal log size
journalctl --disk-usage

# Vacuum down to 100M
journalctl --vacuum-size=100M

# Or keep only the last 7 days
journalctl --vacuum-time=7d
```
Step 5: Find Large Files and Directories
```bash
ssh node-1

# Find large directories
du -sh /* 2>/dev/null | sort -h

# Find large files
find /var -type f -size +100M 2>/dev/null

# Check /var/lib
du -sh /var/lib/* | sort -h
```
Common large directories:
- /var/lib/docker - Container images and volumes
- /var/log - System logs
- /var/lib/kubelet - Kubelet data
- /var/lib/containerd - Containerd data
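Those directories can be triaged in one pass. A hedged sketch (`triage` is a hypothetical helper; the paths are typical defaults and vary by distro and container runtime):

```shell
# Report the sizes of the usual disk-pressure suspects under a root
# (default /var), smallest to largest; missing directories are skipped.
triage() {
  local root=${1:-/var}
  local d
  for d in "$root"/lib/docker "$root"/lib/containerd "$root"/log "$root"/lib/kubelet; do
    [ -d "$d" ] && du -sh "$d" 2>/dev/null
  done | sort -h
}

triage    # inspect /var on this node; or: triage /some/other/root
```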
Step 6: Remove Unused EmptyDir Volumes
```bash
# Find pods that use emptyDir volumes
kubectl get pods -A -o json | jq -r '.items[] | select(.spec.volumes[]?.emptyDir) | .metadata.name'

# Check volume usage for a suspect pod
kubectl describe pod <pod-name>
```
Step 7: Clean Up Old Pods and Resources
```bash
# List all pods on the node
kubectl get pods -A --field-selector spec.nodeName=node-1

# Delete evicted/failed pods
kubectl get pods -A --field-selector=status.phase=Failed -o json | kubectl delete -f -

# Clean up completed jobs
kubectl get jobs -A | grep Complete
kubectl delete job <completed-job> -n <namespace>
```
Step 8: Increase Node Disk Size
If disk is consistently full:
```bash
# For cloud VMs, resize the disk at the provider first
# (AWS: modify the EBS volume), then grow the filesystem:
ssh node-1
lsblk
growpart /dev/sda 1
resize2fs /dev/sda1

# Or add an additional disk for the container runtime and
# configure containerd/docker to use it
```
Step 9: Adjust Kubelet Thresholds
```yaml
# /var/lib/kubelet/config.yaml
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "5%"    # lowered from 10%
  imagefs.available: "10%"  # lowered from 15%
evictionSoft:
  nodefs.available: "8%"
  imagefs.available: "12%"
evictionSoftGracePeriod:
  nodefs.available: "1m30s"
  imagefs.available: "2m"
```

Restart kubelet:

```bash
systemctl restart kubelet
```

Verify the Fix
```bash
# Check node condition
kubectl describe node node-1 | grep -A5 "Conditions:"
# DiskPressure should be False; df -h on the node should show more available space

# Verify pods can schedule
kubectl get pods -A --field-selector spec.nodeName=node-1
```
Prevention Tips
Regular maintenance:
```bash
# Set up log rotation in /etc/docker/daemon.json:
#   { "log-opts": { "max-size": "10m", "max-file": "3" } }

# Set up periodic cleanup via cron (crontab -e):
#   0 2 * * * docker system prune -f
#   0 3 * * * journalctl --vacuum-time=7d

# Monitor disk usage with a Prometheus alert:
#   (1 - (node_filesystem_avail_bytes / node_filesystem_size_bytes)) > 0.85
```
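The same 85% rule can also be approximated with a tiny shell watchdog, as a sketch (the `over_threshold` helper name and the threshold value are illustrative assumptions):

```shell
# Read "pcent mountpoint" pairs, one per line (as produced by
# `df -P --output=pcent,target | tail -n +2`), and warn about any
# filesystem whose usage is above the threshold.
over_threshold() {
  local threshold=$1
  local pcent target
  while read -r pcent target; do
    if [ "${pcent%\%}" -gt "$threshold" ]; then
      echo "WARN: $target at $pcent used"
    fi
  done
}

# On a real node:
#   df -P --output=pcent,target | tail -n +2 | over_threshold 85
printf '90%% /var\n50%% /home\n' | over_threshold 85   # -> WARN: /var at 90% used
```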