Your StatefulSet update is stuck. Some pods have the new version, others are still on the old version, and the rollout isn't progressing. StatefulSets manage ordered, stable pods with persistent identities, but this ordering can cause updates to stall when a single pod fails to update.
Understanding StatefulSet Updates
StatefulSets update pods one at a time in reverse ordinal order (highest index first), waiting for each updated pod to become ready before moving to the next. This ordered approach ensures stability for stateful applications, but it means a single failed pod can block the entire rollout.
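This behavior is controlled by spec.updateStrategy. A minimal sketch of the defaults (the values shown are the Kubernetes defaults for a StatefulSet):

```yaml
# Default update strategy for a StatefulSet
spec:
  updateStrategy:
    type: RollingUpdate     # update pods one at a time, highest ordinal first
    rollingUpdate:
      partition: 0          # pods with ordinal >= partition are updated; 0 means all
```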
The update can be stuck because: a pod failed to start with the new version, the partition setting is blocking updates, readiness probes are failing, or resources are insufficient.
Diagnosis Commands
Start by checking StatefulSet status:
```bash
# Check StatefulSet status
kubectl get statefulset statefulset-name -n namespace

# Get detailed status
kubectl describe statefulset statefulset-name -n namespace

# Check update revision
kubectl get statefulset statefulset-name -n namespace -o jsonpath='{.status.updateRevision}'
kubectl get statefulset statefulset-name -n namespace -o jsonpath='{.status.currentRevision}'
```
Check pod status:
```bash
# Check all pods and their versions
kubectl get pods -n namespace -l app=statefulset-label -o wide

# Check which pods are updated
kubectl get pods -n namespace -l app=statefulset-label -o jsonpath='{.items[*].metadata.labels.controller-revision-hash}'

# Describe the stuck pod
kubectl describe pod statefulset-name-index -n namespace
```
Check update strategy:
```bash
# Check update strategy
kubectl get statefulset statefulset-name -n namespace -o yaml | grep -A 15 updateStrategy

# Check partition setting
kubectl get statefulset statefulset-name -n namespace -o jsonpath='{.spec.updateStrategy.rollingUpdate.partition}'
```
Common Solutions
Solution 1: Fix Pod Failing to Update
The pod at the current update position might be failing:
```bash
# Identify which pod is blocking the update
kubectl describe statefulset statefulset-name -n namespace | grep -A 5 "Pods"

# Check the failing pod (for example, pod index 2)
kubectl describe pod statefulset-name-2 -n namespace

# Check pod logs
kubectl logs statefulset-name-2 -n namespace

# Check events
kubectl get events -n namespace --field-selector involvedObject.name=statefulset-name-2
```
Fix the pod failure:
```bash
# Check if image is correct
kubectl get statefulset statefulset-name -n namespace -o jsonpath='{.spec.template.spec.containers[*].image}'

# If image is wrong, update it
kubectl set image statefulset/statefulset-name container-name=correct-image:tag -n namespace
```
If a pod is stuck in CrashLoopBackOff:
```bash
# Check pod restart count
kubectl get pod statefulset-name-2 -n namespace -o jsonpath='{.status.containerStatuses[*].restartCount}'

# Check previous logs
kubectl logs statefulset-name-2 -n namespace --previous

# Delete the pod to force recreation
kubectl delete pod statefulset-name-2 -n namespace
```
Solution 2: Fix Partition Blocking Updates
The partition setting limits which pods get updated:
```bash
# Check partition value
kubectl get statefulset statefulset-name -n namespace -o jsonpath='{.spec.updateStrategy.rollingUpdate.partition}'

# If partition is N, only pods with index >= N are updated; pods with index < N won't be
# Example: partition = 2 means pods statefulset-0 and statefulset-1 won't update
```
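To make the partition rule concrete, here is a small self-contained shell sketch (no cluster required) that simulates which pod ordinals a given partition value allows to update; replicas and partition are example values:

```shell
#!/bin/sh
# Illustrative only: simulates which pod ordinals a given partition value
# allows to update. replicas and partition are example values, not read
# from a live cluster.
replicas=5
partition=2
summary=""
i=$((replicas - 1))
while [ "$i" -ge 0 ]; do
  if [ "$i" -ge "$partition" ]; then
    echo "statefulset-name-$i: eligible for update"
    summary="$summary $i:update"
  else
    echo "statefulset-name-$i: held back by partition"
    summary="$summary $i:hold"
  fi
  i=$((i - 1))
done
```

With partition=2, pods 4, 3, and 2 update while pods 1 and 0 are held back, matching the example above.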
Remove or reduce partition:
```bash
# Remove partition to update all pods
kubectl patch statefulset statefulset-name -n namespace -p '{"spec":{"updateStrategy":{"rollingUpdate":{"partition":0}}}}'

# Or edit directly and set spec.updateStrategy.rollingUpdate.partition to 0
kubectl edit statefulset statefulset-name -n namespace
```
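To step the partition down one ordinal at a time, the patch commands can be generated in a loop. This sketch only prints each command rather than running it, so you can verify one step before issuing the next; statefulset-name and namespace are placeholders:

```shell
#!/bin/sh
# Generate (but do not run) one kubectl patch command per step,
# from partition=replicas down to partition=0.
replicas=5
cmds=$(
  p=$replicas
  while [ "$p" -ge 0 ]; do
    echo "kubectl patch statefulset statefulset-name -n namespace -p '{\"spec\":{\"updateStrategy\":{\"rollingUpdate\":{\"partition\":$p}}}}'"
    p=$((p - 1))
  done
)
echo "$cmds"
```

Run each printed command manually, waiting for the newly eligible pod to become ready before the next step.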
Gradual rollout using partition:
```yaml
# Gradual update: start with partition = replicas
# Then reduce to update pods one by one
spec:
  replicas: 5
  updateStrategy:
    rollingUpdate:
      partition: 5  # Start: no pods update
      # Then: partition: 4 -> pod-4 updates
      # Then: partition: 3 -> pod-3 updates
      # Continue until partition: 0 -> all pods updated
```

Solution 3: Fix Readiness Probe Failures
Pods must be ready before the next pod updates:
```bash
# Check pod readiness
kubectl get pods -n namespace -l app=statefulset-label

# Check probe failures
kubectl describe pod statefulset-name-index -n namespace | grep -A 10 Readiness
```
Fix readiness probe:
```yaml
spec:
  template:
    spec:
      containers:
        - name: app
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30  # Increase if the app takes time to start
            periodSeconds: 10
            failureThreshold: 3
```

Solution 4: Fix Resource Constraints
Pods might be pending due to resource issues:
```bash
# Check for pending pods
kubectl get pods -n namespace -l app=statefulset-label | grep Pending

# Describe pending pod
kubectl describe pod pending-pod -n namespace

# Check PVC status (StatefulSets use PVCs)
kubectl get pvc -n namespace
```
Fix resource requests:
```yaml
spec:
  template:
    spec:
      containers:
        - name: app
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
```

Solution 5: Fix PVC Issues
StatefulSet pods use persistent volumes:
```bash
# Check PVC binding
kubectl get pvc -n namespace

# Describe PVC for issues
kubectl describe pvc pvc-name -n namespace

# Check storage class
kubectl get storageclass
```
Fix storage issues:
```bash
# If PVC is stuck pending, check its events
kubectl describe pvc pvc-name -n namespace | grep -A 10 Events

# Create missing storage class if needed
kubectl apply -f storageclass.yaml
```
Solution 6: Fix OnDelete Update Strategy
If update strategy is OnDelete, pods only update when manually deleted:
```bash
# Check update strategy
kubectl get statefulset statefulset-name -n namespace -o jsonpath='{.spec.updateStrategy.type}'
```

Switch to RollingUpdate or delete pods manually:
```bash
# Switch to rolling update
kubectl patch statefulset statefulset-name -n namespace -p '{"spec":{"updateStrategy":{"type":"RollingUpdate"}}}'

# Or manually delete pods for OnDelete strategy (continue for all pods)
kubectl delete pod statefulset-name-0 -n namespace
kubectl delete pod statefulset-name-1 -n namespace
```
Solution 7: Force Update by Deleting Pods
Sometimes you need to force pod recreation:
```bash
# Delete stuck pod - StatefulSet will recreate it with the new spec
kubectl delete pod statefulset-name-index -n namespace

# Watch pod recreation
kubectl get pods -n namespace -l app=statefulset-label -w
```
Solution 8: Scale Down and Up
For major changes, scale to zero and back:
```bash
# Scale to zero
kubectl scale statefulset statefulset-name -n namespace --replicas=0

# Wait for pods to terminate
kubectl get pods -n namespace -l app=statefulset-label -w

# Scale back up
kubectl scale statefulset statefulset-name -n namespace --replicas=3
```
Note: This causes full downtime. Pods are recreated with the same identities and PVCs, but stopping every member at once can cause issues for quorum-based applications.
Solution 9: Check Pod Management Policy
The pod management policy controls whether pods are created and deleted sequentially or in parallel. Note that Parallel affects scaling operations only; rolling updates still proceed one pod at a time:

```yaml
spec:
  podManagementPolicy: Parallel  # Pods created/deleted in parallel during scaling
  # Default is OrderedReady (sequential)
```

Solution 10: Fix Headless Service Issues
StatefulSets require a headless service:
```bash
# Check if headless service exists
kubectl get service -n namespace

# Verify service is headless (clusterIP: None)
kubectl get service service-name -n namespace -o jsonpath='{.spec.clusterIP}'
```
Create headless service:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: statefulset-service
spec:
  clusterIP: None  # Headless
  selector:
    app: statefulset-app
  ports:
    - port: 8080
```

Verification
After fixing the issue:
```bash
# Check StatefulSet update status
kubectl get statefulset statefulset-name -n namespace

# Check all pods are on the new revision
kubectl get pods -n namespace -l app=statefulset-label -o jsonpath='{.items[*].metadata.labels.controller-revision-hash}'

# Monitor rollout
kubectl rollout status statefulset/statefulset-name -n namespace

# Verify pods are ready
kubectl get pods -n namespace -l app=statefulset-label
```
StatefulSet Update Debugging
```bash
# Comprehensive check
echo "=== StatefulSet Status ==="
kubectl get statefulset statefulset-name -n namespace -o yaml | grep -A 10 status

echo "=== Pod Status ==="
kubectl get pods -n namespace -l app=statefulset-label -o wide

echo "=== Update Strategy ==="
kubectl get statefulset statefulset-name -n namespace -o yaml | grep -A 15 updateStrategy

echo "=== Current vs Update Revision ==="
kubectl get statefulset statefulset-name -n namespace -o jsonpath='{.status.currentRevision}'
echo ""
kubectl get statefulset statefulset-name -n namespace -o jsonpath='{.status.updateRevision}'
```
StatefulSet Stuck Causes Summary
| Cause | Check | Solution |
|---|---|---|
| Pod failing to start | kubectl describe pod | Fix image, config, or delete pod |
| Partition blocking | kubectl get sts -o yaml | Set partition to 0 |
| Readiness probe failing | Pod not ready | Fix probe or increase delay |
| PVC issues | kubectl get pvc | Fix storage class or PVC |
| OnDelete strategy | kubectl get sts -o yaml | Change to RollingUpdate |
| Resource constraints | Pod pending | Fix resource requests |
| Headless service missing | kubectl get svc | Create headless service |
Prevention Best Practices
Use appropriate update strategy for your application. Set proper readiness probes with realistic delays. Use partition for controlled canary rollouts. Ensure headless service exists before creating StatefulSet. Test updates with small partitions first. Monitor pod readiness during updates. Have rollback plan ready before updates.
A stuck StatefulSet update usually comes down to a single pod failing to become ready: because the rollout is ordered, everything stops at that point. Check the highest-index pod that hasn't been updated yet, as that's where the problem lies.