You checked your deployment and found pods with status Evicted. They're not running, they're not restarting - they're just gone from their node. Pod eviction is Kubernetes' way of maintaining node stability when resources run low, but it can disrupt your applications if you don't understand why it's happening and how to prevent it.
Understanding Pod Eviction
Pods get evicted when a node experiences resource pressure - typically disk pressure, memory pressure, or PID pressure - or during node maintenance operations. The kubelet evicts pods to reclaim resources and keep the node healthy. The evicted pod object remains with status Evicted and is never restarted in place: a Deployment or ReplicaSet controller will schedule a replacement elsewhere, but a bare pod is simply gone (unlike a crash-looping pod, which restarts on the same node).
Eviction follows a ranking: the kubelet first targets pods whose usage of the starved resource exceeds their requests - that means BestEffort pods (no requests at all) and Burstable pods running over their requests - ordered by pod priority and then by how far usage exceeds the request. Guaranteed pods and Burstable pods using less than their requests are evicted last, and system-critical pods carry the highest priority classes, so they are effectively protected.
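This ranking can be illustrated with a tiny sketch. The pod names and numbers below are made up for illustration, not taken from a live cluster; the point is that any pod running over its request is a candidate, regardless of its QoS label:

```shell
#!/bin/sh
# Illustrative eviction-candidate check: pods whose memory usage exceeds
# their request are targeted first. Columns: name  qos  usage(Mi)  request(Mi)
# Prints the two pods running over their request.
awk '$3 > $4 { print $1, "-> eviction candidate (usage over request)" }' <<'EOF'
batch-worker BestEffort 150 0
web-frontend Burstable 300 256
api-backend Guaranteed 200 512
EOF
```

Note that the BestEffort pod qualifies automatically: with no request set, any usage at all exceeds it.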
Diagnosis Commands
Start by identifying evicted pods:
```bash
# Find all evicted pods in namespace
kubectl get pods -n namespace --field-selector=status.phase=Failed | grep Evicted

# Find evicted pods across all namespaces
kubectl get pods -A --field-selector=status.phase=Failed | grep Evicted

# Get detailed information about evicted pod
kubectl describe pod pod-name -n namespace
```
Check the eviction reason:
```bash
# Check pod status for eviction reason
kubectl get pod pod-name -n namespace -o jsonpath='{.status.reason}'
kubectl get pod pod-name -n namespace -o jsonpath='{.status.message}'

# Check node conditions that triggered eviction
kubectl describe node node-name | grep -A 10 Conditions
```
Examine node resource pressure:
```bash
# Check node pressure conditions (grep needs -E for alternation)
kubectl describe node node-name | grep -iE "DiskPressure|MemoryPressure|PIDPressure"

# Check node resource usage
kubectl top nodes
kubectl describe node node-name | grep -A 5 "Allocated resources"

# Check node events (node events typically land in the default namespace)
kubectl get events -A --field-selector involvedObject.kind=Node
```
Common Solutions
Solution 1: Fix Disk Pressure Evictions
Disk pressure is the most common eviction cause. The kubelet evicts pods when node filesystem runs low:
```bash
# Check disk pressure status
kubectl describe node node-name | grep DiskPressure

# Check node filesystem usage (SSH to node)
df -h /var/lib/kubelet
df -h /var/lib/docker
df -h /var/log

# Check what's using disk space
du -sh /var/lib/kubelet/*
du -sh /var/lib/docker/*
du -sh /var/log/*
```
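Before cleaning up, it helps to know how close the node is to its eviction threshold. A rough back-of-envelope sketch (the disk size and percentage here are illustrative; check your node's real numbers with `df -h` and your kubelet's eviction config):

```shell
#!/bin/sh
# Translate a percentage-based eviction threshold into absolute free space.
DISK_TOTAL_GB=100     # total size of the kubelet's filesystem (illustrative)
THRESHOLD_PCT=15      # e.g. an evictionHard nodefs.available of "15%"
TRIGGER_GB=$(( DISK_TOTAL_GB * THRESHOLD_PCT / 100 ))
echo "DiskPressure fires when free space drops below ${TRIGGER_GB}GB"
```

On a 100GB filesystem with a 15% threshold, eviction begins once less than 15GB is free, which is earlier than many people expect.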
Clean up disk space on the node:
```bash
# Clean up Docker resources (removes ALL unused images, containers, and volumes)
docker system prune -af --volumes

# Clean up old container logs
find /var/log -name "*.log" -mtime +7 -delete
truncate -s 0 /var/log/syslog

# Remove unused images (containerd/CRI-O nodes)
crictl rmi --prune

# Clean up old pods (if any stuck)
crictl pods
crictl rmp <pod-id>
```
Prevent disk pressure with proper logging configuration:
```yaml
# kubelet configuration: cap per-container log size so logs can't fill the disk
# (kubelet defaults: containerLogMaxSize "10Mi", containerLogMaxFiles 5)
containerLogMaxSize: "10Mi"
containerLogMaxFiles: 3
```

Configure kubelet to evict at higher thresholds:
```yaml
# kubelet configuration
evictionHard:
  memory.available: "500Mi"
  nodefs.available: "15%"
  nodefs.inodesFree: "10%"
  imagefs.available: "15%"
evictionSoft:
  memory.available: "750Mi"
  nodefs.available: "20%"
  imagefs.available: "20%"
evictionSoftGracePeriod:
  memory.available: "1m30s"
  nodefs.available: "2m"
  imagefs.available: "2m"
```

Solution 2: Fix Memory Pressure Evictions
When node memory runs low, the kubelet evicts pods to reclaim memory:
```bash
# Check memory pressure
kubectl describe node node-name | grep MemoryPressure

# Check node memory usage
kubectl top node node-name
free -m  # On the node

# Check which pods use most memory
kubectl top pods -A --sort-by=memory | head -20
```
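Memory pressure often appears even though every pod scheduled successfully. The reason: the scheduler packs pods onto nodes by their *requests*, but actual usage can grow toward their *limits*. A sketch with illustrative numbers:

```shell
#!/bin/sh
# Why a node gets overcommitted: scheduling is by request, usage grows to limit.
ALLOCATABLE_MI=8192   # node allocatable memory (illustrative)
REQUEST_MI=256        # per-pod memory request
LIMIT_MI=512          # per-pod memory limit
PODS=$(( ALLOCATABLE_MI / REQUEST_MI ))
WORST_CASE_MI=$(( PODS * LIMIT_MI ))
echo "${PODS} pods fit by request; worst-case usage ${WORST_CASE_MI}Mi vs ${ALLOCATABLE_MI}Mi allocatable"
```

With these numbers, 32 pods fit by request but could collectively use twice the node's allocatable memory, so keeping limits close to requests reduces eviction risk.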
Add more memory to nodes or optimize memory-heavy pods:
```yaml
# Reduce memory footprint
resources:
  requests:
    memory: "256Mi"
  limits:
    memory: "512Mi"
```

Solution 3: Handle Node Maintenance Evictions
During node drain or maintenance, pods are evicted deliberately:
```bash
# Check if node is being drained
kubectl get nodes
kubectl describe node node-name | grep -A 5 -E "Spec:|Taints:"

# Node may have maintenance taint; the trailing '-' removes it
kubectl taint nodes node-name node.kubernetes.io/unschedulable:NoSchedule-
```
If draining a node for maintenance, use proper drain command:
```bash
# Proper node drain with pod disruption budgets respected
kubectl drain node-name --ignore-daemonsets --delete-emptydir-data --grace-period=30

# After maintenance, uncordon
kubectl uncordon node-name
```
Solution 4: Clean Up Evicted Pods
Evicted pods don't automatically clean up and can clutter your namespace:
```bash
# Delete all evicted pods in namespace (jq -r strips quotes from names)
kubectl get pods -n namespace --field-selector=status.phase=Failed -o json | jq -r '.items[] | select(.status.reason=="Evicted") | .metadata.name' | xargs -r kubectl delete pod -n namespace

# Delete all evicted pods cluster-wide
kubectl get pods -A --field-selector=status.phase=Failed -o json | jq -r '.items[] | select(.status.reason=="Evicted") | "\(.metadata.namespace) \(.metadata.name)"' | while read ns pod; do kubectl delete pod "$pod" -n "$ns"; done

# Using grep instead of jq
kubectl get pods -A --field-selector=status.phase=Failed | grep Evicted | awk '{print $1, $2}' | xargs -n2 -r kubectl delete pod -n
```
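Before pointing a destructive pipeline at a live cluster, it's worth dry-running the filtering step on canned output. The sketch below feeds sample `kubectl get pods -A` output (made-up pod names) through the same grep/awk stage and confirms only the Evicted row is selected:

```shell
#!/bin/sh
# Dry-run of the grep/awk cleanup filter on canned output.
# Only the Evicted pod should survive the filter: prints "default web-abc".
cat <<'EOF' | grep Evicted | awk '{print $1, $2}'
NAMESPACE   NAME        READY   STATUS    RESTARTS   AGE
default     web-abc     0/1     Evicted   0          5m
kube-system api-xyz     1/1     Running   0          5m
EOF
```

Once the filter produces exactly the namespace/name pairs you expect, append the `xargs ... kubectl delete pod` stage.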
Create a cron job to clean up evicted pods:
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cleanup-evicted-pods
spec:
  schedule: "*/30 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: pod-cleaner
          containers:
          - name: kubectl
            image: bitnami/kubectl:latest
            command:
            - /bin/sh
            - -c
            - kubectl get pods -A --field-selector=status.phase=Failed -o json | jq -r '.items[] | select(.status.reason=="Evicted") | "\(.metadata.namespace) \(.metadata.name)"' | xargs -n2 -r kubectl delete pod -n
          restartPolicy: OnFailure
```

Solution 5: Use Pod Disruption Budgets
Prevent too many evictions during maintenance with PDB:
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
spec:
  minAvailable: 2  # Or use maxUnavailable: 1
  selector:
    matchLabels:
      app: myapp
```

Check PDB status:

```bash
kubectl get pdb -n namespace
kubectl describe pdb myapp-pdb -n namespace
```

Solution 6: Improve Pod Priority
Higher priority pods are less likely to be evicted:
```yaml
# Create priority class
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000
globalDefault: false
description: "High priority for critical pods"
---
# Use in pod spec
spec:
  priorityClassName: high-priority
```

Solution 7: Set Proper Resource Requests
Pods with resource requests are less likely to be evicted than BestEffort pods:
```yaml
# Avoid BestEffort pods (no resources)
# Bad - will be evicted first
containers:
- name: app
  image: myimage

# Good - has resource guarantees
containers:
- name: app
  image: myimage
  resources:
    requests:
      memory: "256Mi"
      cpu: "100m"
    limits:
      memory: "512Mi"
      cpu: "500m"
```
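To audit a cluster for BestEffort pods, the pod's QoS class is reported in `.status.qosClass`. The sketch below runs the filtering step over sample output (pod names are made up); in a real cluster you would feed it from `kubectl get pods -A --no-headers -o custom-columns=NAME:.metadata.name,QOS:.status.qosClass`:

```shell
#!/bin/sh
# Flag BestEffort pods from name/QoS-class listings (sample data below).
# Prints: batch-worker has no resource requests - evicted first
awk '$2 == "BestEffort" { print $1, "has no resource requests - evicted first" }' <<'EOF'
web-frontend Burstable
batch-worker BestEffort
api-backend Guaranteed
EOF
```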
Solution 8: Monitor and Alert on Evictions
Set up alerts for eviction events:
```yaml
# Prometheus alert rule (assumes kube-state-metrics v2+, which exposes
# kube_pod_status_reason; older releases used
# kube_pod_container_status_terminated_reason instead)
- alert: PodEvicted
  expr: kube_pod_status_reason{reason="Evicted"} > 0
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: "Pod {{ $labels.pod }} evicted in namespace {{ $labels.namespace }}"
    description: "Pod was evicted from node. Check node resource pressure."
```

Verification
After fixing eviction issues:
```bash
# Verify node is healthy
kubectl describe node node-name | grep -A 5 Conditions

# Check for evicted pods
kubectl get pods -n namespace --field-selector=status.phase=Failed

# Monitor for new evictions
kubectl get events -A --field-selector reason=Evicted -w

# Verify disk/memory pressure is resolved
kubectl describe node node-name | grep -E "DiskPressure|MemoryPressure"
```
Eviction Causes Summary
| Cause | Symptoms | Solution |
|---|---|---|
| Disk Pressure | DiskPressure=True, nodefs low | Clean up disk, configure log rotation |
| Memory Pressure | MemoryPressure=True | Add node memory, optimize pod memory |
| Node Drain | Unschedulable taint, intentional | Normal maintenance, PDB protects |
| PID Pressure | PIDPressure=True | Reduce pod density per node |
| Image FS Full | ImageGCFailed events | Clean up unused images |
Prevention Best Practices
Set appropriate resource requests and limits for all pods. Use Pod Disruption Budgets for critical workloads. Monitor node resources and alert before pressure reaches eviction thresholds. Configure log rotation and container image garbage collection. Use cluster autoscaling to add capacity when needed. Schedule regular maintenance to clean up unused images, volumes, and logs.
Eviction is a survival mechanism for your cluster - it's Kubernetes choosing to sacrifice some pods to keep the node running. The key is understanding why your pods were chosen for eviction and ensuring critical workloads have the protection they need.