What's Actually Happening
Helm release upgrade times out waiting for resources to become ready. The upgrade process fails but may leave resources in a partially updated state.
The Error You'll See
Upgrade timeout:
```bash
$ helm upgrade myrelease mychart
Error: UPGRADE FAILED: timed out waiting for the condition
```
Release in failed state:
```bash
$ helm status myrelease
STATUS: pending-upgrade
LAST DEPLOYED: Mon Jan 1 00:00:00 2024
NOTES: Error: release myrelease failed
```
Resource not ready:
```bash
Error: release myrelease failed, and has been rolled back due to atomic being set
```
Why This Happens
1. Slow startup - Pods take a long time to become ready
2. Image pull - Large images or slow registries
3. Resource constraints - Insufficient CPU or memory
4. Dependency issues - Dependent services are not ready
5. Hook timeout - Pre- or post-upgrade hooks take too long
6. Default timeout - Helm's 5-minute default is insufficient
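Most of these causes show up as a distinct container waiting reason on the pods, so they can be triaged mechanically. A minimal sketch, assuming a reason-to-cause mapping of our own (feed it a reason from `kubectl get pods -o jsonpath='{..waiting.reason}'`):

```shell
# Map a container waiting reason (or pod phase) to the likely cause.
# The mapping below is an assumption; adapt it to your workloads.
triage() {
  case $1 in
    ImagePullBackOff|ErrImagePull) echo "image pull: check registry and pull secrets" ;;
    CrashLoopBackOff)              echo "slow startup or crash: check logs and probes" ;;
    CreateContainerConfigError)    echo "dependency issue: missing ConfigMap/Secret" ;;
    Pending)                       echo "resource constraints: check quotas and requests" ;;
    *)                             echo "unknown: describe the pod for events" ;;
  esac
}

triage ImagePullBackOff   # → image pull: check registry and pull secrets
```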
Step 1: Check Release Status
```bash
# Get release status
helm status myrelease

# Get release history
helm history myrelease

# List releases in all states (including failed)
helm list --all-namespaces --all

# Get the manifest of a specific revision
helm get manifest myrelease --revision 5

# Get release notes
helm get notes myrelease

# Get release values
helm get values myrelease

# Inspect the release secrets Helm stores in Kubernetes
kubectl get secret -l owner=helm -A
kubectl get secret sh.helm.release.v1.myrelease.v5 -o jsonpath='{.data.release}' | base64 -d | base64 -d | gunzip | jq
```
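The double `base64 -d` in that last pipeline is not a typo: Kubernetes base64-encodes Secret data, and inside that Helm stores the release as gzipped, base64-encoded JSON. A local simulation of the round trip (the payload is made up):

```shell
# Simulate how a release is stored: JSON -> gzip -> base64 (Helm) -> base64 (Kubernetes)
payload='{"name":"myrelease","info":{"status":"pending-upgrade"}}'
stored=$(printf '%s' "$payload" | gzip | base64 | base64)

# Decoding mirrors the kubectl pipeline above
decoded=$(printf '%s\n' "$stored" | base64 -d | base64 -d | gunzip)
echo "$decoded"
```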
Step 2: Check Pending Resources
```bash
# Check pod status
kubectl get pods -l app.kubernetes.io/instance=myrelease

# Check for pending pods
kubectl get pods -l app.kubernetes.io/instance=myrelease --field-selector=status.phase=Pending

# Check for image pull errors
kubectl describe pods -l app.kubernetes.io/instance=myrelease | grep -A 5 "Events:"

# Check for CrashLoopBackOff
kubectl get pods -l app.kubernetes.io/instance=myrelease | grep CrashLoopBackOff

# Check deployment status
kubectl get deployments -l app.kubernetes.io/instance=myrelease

# Check events
kubectl get events --field-selector involvedObject.name=myrelease-app

# Check resource quota
kubectl describe resourcequota -n mynamespace
```
Step 3: Increase Helm Timeout
```bash
# The default timeout is 5 minutes (300s)

# Increase the timeout for an upgrade
helm upgrade myrelease mychart --timeout 15m

# Or with --atomic (auto-rollback on failure)
helm upgrade myrelease mychart --timeout 15m --atomic

# Wait for all resources (and jobs) to become ready
helm upgrade myrelease mychart --timeout 15m --wait --wait-for-jobs

# Note: --timeout is a Helm flag, not a chart value. Passing
# --set timeout=900 only has an effect if the chart's templates
# actually consume a .Values.timeout value.
```
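When comparing `--timeout` values against probe and hook budgets, it helps to normalize durations to seconds. A small helper, purely illustrative, that handles only single `h`/`m`/`s` suffixes:

```shell
# Convert a Helm-style duration (15m, 2h, 300s, or bare seconds) to seconds.
to_seconds() {
  case $1 in
    *h) echo $(( ${1%h} * 3600 )) ;;
    *m) echo $(( ${1%m} * 60 )) ;;
    *s) echo "${1%s}" ;;
    *)  echo "$1" ;;
  esac
}

to_seconds 15m   # → 900, vs. the 300s default
```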
Step 4: Check Deployment Health
```bash
# Check deployment progress
kubectl rollout status deployment/myrelease-app

# Check deployment conditions
kubectl describe deployment myrelease-app | grep -A 10 "Conditions:"

# Check the ReplicaSet
kubectl get replicasets -l app.kubernetes.io/instance=myrelease

# Check for image issues (-E enables the | alternation)
kubectl describe pods -l app.kubernetes.io/instance=myrelease | grep -iE "image|pull"

# Check resource limits
kubectl describe pods -l app.kubernetes.io/instance=myrelease | grep -A 5 "Limits:"

# Check startup probes
kubectl describe pods -l app.kubernetes.io/instance=myrelease | grep -i "probe"

# Relax probes in values.yaml if they are too aggressive:
# livenessProbe:
#   initialDelaySeconds: 60   # increase for slow startup
#   periodSeconds: 10
# readinessProbe:
#   initialDelaySeconds: 30
#   periodSeconds: 5
```
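Whether a probe is "too aggressive" can be estimated: a pod has roughly `initialDelaySeconds + periodSeconds * failureThreshold` seconds before the probe gives up, and the Helm `--timeout` must comfortably cover that across all pods. A back-of-the-envelope sketch using the example values above (the default `failureThreshold` is 3):

```shell
# Worst-case seconds before a probe declares failure:
# initialDelaySeconds + periodSeconds * failureThreshold
probe_budget() {
  echo $(( $1 + $2 * $3 ))
}

probe_budget 60 10 3   # liveness example above: 60 + 10*3 = 90
probe_budget 30 5 3    # readiness example above: 30 + 5*3 = 45
```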
Step 5: Check Helm Hooks
```bash
# List hooks in the release
helm get hooks myrelease

# Common hook events:
# pre-install, post-install
# pre-upgrade, post-upgrade
# pre-rollback, post-rollback
# pre-delete, post-delete

# Check hook jobs/pods
kubectl get jobs -l app.kubernetes.io/instance=myrelease
kubectl get pods -l app.kubernetes.io/instance=myrelease,helm.sh/hook

# Check hook status
kubectl describe job myrelease-pre-upgrade-hook

# Hooks are configured with annotations in the hook template, e.g.:
# metadata:
#   annotations:
#     "helm.sh/hook": pre-upgrade
#     "helm.sh/hook-weight": "1"
#     "helm.sh/hook-delete-policy": hook-succeeded
#
# There is no per-hook timeout annotation: hooks run under the same
# --timeout as the release. To bound a hook Job directly, set
# spec.activeDeadlineSeconds on the Job itself.

# Check hook logs
kubectl logs job/myrelease-pre-upgrade-hook
```
Step 6: Fix Stuck Release
```bash
# Release stuck in pending-upgrade:

# Option 1: Roll back to the previous revision
helm rollback myrelease

# Option 2: Force the upgrade again
helm upgrade myrelease mychart --force

# Option 3: Delete and reinstall (data loss risk)
helm uninstall myrelease
helm install myrelease mychart

# Option 4: Fix the release secret manually
# Get the latest release secret name:
REVISION=$(kubectl get secret -l owner=helm,name=myrelease --sort-by=.metadata.creationTimestamp -o json | jq -r '.items[-1].metadata.name')

# Check its status label:
kubectl get secret $REVISION -o jsonpath='{.metadata.labels.status}'

# To unstick without Helm, export the secret, edit the status
# label to "deployed", and reapply:
kubectl get secret sh.helm.release.v1.myrelease.v5 -o yaml > release.yaml

# Using the helm-nuke plugin:
helm plugin install https://github.com/adamreese/helm-nuke
helm nuke myrelease
```
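The secret names themselves encode the revision (`sh.helm.release.v1.<name>.v<rev>`), so the latest revision can also be picked by sorting the names numerically rather than by creation timestamp. A sketch with made-up sample names (in a cluster you would feed it `kubectl get secret -l owner=helm,name=myrelease -o name`):

```shell
# Pick the highest-numbered release secret (note v10 must sort after v5,
# so a plain lexical sort is not enough).
secrets='sh.helm.release.v1.myrelease.v3
sh.helm.release.v1.myrelease.v10
sh.helm.release.v1.myrelease.v5'

# Split on ".", sort numerically on field 6 starting at its 2nd char (skips the "v")
latest=$(printf '%s\n' "$secrets" | sort -t. -k6.2 -n | tail -n1)
echo "$latest"   # → sh.helm.release.v1.myrelease.v10
```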
Step 7: Optimize Chart for Upgrades
```yaml
# In deployment.yaml:
spec:
  replicas: {{ .Values.replicaCount }}
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0   # wait for new pods before removing old ones

# Faster rollout:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%

# Adjust probes:
livenessProbe:
  initialDelaySeconds: {{ .Values.livenessProbe.initialDelaySeconds }}
  timeoutSeconds: 5
  periodSeconds: 10
  failureThreshold: 3

readinessProbe:
  initialDelaySeconds: {{ .Values.readinessProbe.initialDelaySeconds }}
  timeoutSeconds: 5
  periodSeconds: 5
  failureThreshold: 3

# Resource requests:
resources:
  requests:
    memory: "256Mi"
    cpu: "100m"
  limits:
    memory: "512Mi"
    cpu: "500m"
```
Step 8: Check Image Availability
```bash
# Check the image exists
docker pull myimage:latest

# Check from inside the cluster
kubectl run test-image --image=myimage:latest --rm -it --restart=Never -- /bin/sh

# Check image pull secrets
kubectl get secrets -l app.kubernetes.io/instance=myrelease

# Verify an image pull secret is valid
kubectl get secret myregistry-key -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d | jq

# Create an image pull secret
kubectl create secret docker-registry myregistry-key \
  --docker-server=myregistry.com \
  --docker-username=user \
  --docker-password=password

# Reference it in values.yaml:
# imagePullSecrets:
#   - name: myregistry-key
```
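The `auth` field inside `.dockerconfigjson` is just base64 of `user:password`, so decoding it locally is a quick sanity check that a pull secret was built correctly. A sketch with made-up credentials (uses `sed` so it works even without `jq`):

```shell
# Build a .dockerconfigjson the way `kubectl create secret docker-registry` does
auth=$(printf '%s' 'user:password' | base64)
cfg='{"auths":{"myregistry.com":{"auth":"'"$auth"'"}}}'

# Extract and decode the auth field; it should round-trip to user:password
printf '%s\n' "$cfg" | sed -n 's/.*"auth":"\([^"]*\)".*/\1/p' | base64 -d
```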
Step 9: Use Atomic for Auto-Rollback
```bash
# Use --atomic for automatic rollback on failure
helm upgrade myrelease mychart --atomic --timeout 15m

# With cleanup of newly created resources on failure
helm upgrade myrelease mychart --atomic --cleanup-on-fail --timeout 15m

# Atomic behavior:
# 1. The upgrade starts
# 2. On timeout or failure, Helm rolls back automatically
# 3. The previous release is restored

# Confirm the rollback happened
helm history myrelease   # shows the upgrade attempt and the rollback

# Without --atomic, a timed-out upgrade leaves the release stuck in pending-upgrade:
helm upgrade myrelease mychart --timeout 5m

# and you must roll back manually:
helm rollback myrelease
```
Step 10: Monitor Upgrade Progress
```bash
# Create a monitoring script
cat << 'EOF' > /usr/local/bin/monitor-helm-upgrade.sh
#!/bin/bash
RELEASE=$1
NAMESPACE=${2:-default}

echo "=== Release Status ==="
helm status $RELEASE -n $NAMESPACE

echo ""
echo "=== Pods Status ==="
kubectl get pods -l app.kubernetes.io/instance=$RELEASE -n $NAMESPACE

echo ""
echo "=== Events ==="
kubectl get events -n $NAMESPACE --sort-by='.lastTimestamp' | tail -20

echo ""
echo "=== Deployment Status ==="
kubectl get deployments -l app.kubernetes.io/instance=$RELEASE -n $NAMESPACE -o wide

echo ""
echo "=== Recent Logs ==="
kubectl logs -l app.kubernetes.io/instance=$RELEASE -n $NAMESPACE --tail=20
EOF
chmod +x /usr/local/bin/monitor-helm-upgrade.sh

# Watch upgrade progress
watch -n 5 "helm status myrelease"

# Monitor pods
watch -n 2 "kubectl get pods -l app.kubernetes.io/instance=myrelease"

# Prometheus metrics (if using a Helm operator):
# helm_release_status
# helm_upgrade_duration_seconds
```
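Instead of watching by hand, the upgrade can be polled from CI with a generic wait loop. `wait_until` is a hypothetical helper, and the `helm status` example at the end assumes the release name used throughout this article:

```shell
# Poll a command until it succeeds or the deadline (first arg, seconds) passes.
wait_until() {
  _deadline=$(( $(date +%s) + $1 )); shift
  until "$@"; do
    [ "$(date +%s)" -ge "$_deadline" ] && return 1
    sleep 1
  done
}

# Example (requires a cluster):
# wait_until 900 sh -c 'helm status myrelease | grep -q "STATUS: deployed"'
```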
Helm Upgrade Timeout Checklist
| Check | Command | Expected |
|---|---|---|
| Release status | helm status | Deployed |
| Timeout | --timeout | Adequate |
| Pods ready | kubectl get pods | Running |
| Hooks | helm get hooks | Completed |
| Images | docker pull | Available |
| Resources | describe pods | Sufficient |
Verify the Fix
```bash
# After fixing the upgrade timeout

# 1. Run the upgrade with a longer timeout
helm upgrade myrelease mychart --timeout 15m --atomic
# Upgrade succeeds

# 2. Check release status
helm status myrelease
# STATUS: deployed

# 3. Verify pods are running
kubectl get pods -l app.kubernetes.io/instance=myrelease
# All pods Running/Ready

# 4. Check history
helm history myrelease
# Latest revision deployed

# 5. Test rollback
helm rollback myrelease 1
# Rollback works

# 6. Verify the application
kubectl port-forward svc/myrelease-app 8080:80
curl http://localhost:8080/health
# Application healthy
```
Related Issues
- [Fix Helm Release Failed](/articles/fix-helm-release-failed)
- [Fix Helm Chart Template Render Failed](/articles/fix-helm-chart-template-render-failed)
- [Fix ArgoCD Sync Timeout](/articles/fix-argocd-sync-timeout)