## Introduction

`CrashLoopBackOff` on Azure Kubernetes Service (AKS) means a container in the pod keeps crashing and the kubelet waits an increasing delay before each restart. The delay starts at 10 seconds, doubles after each failed restart, and is capped at 5 minutes.
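
To make the restart timing concrete, here is a minimal shell sketch of the delay sequence, assuming the kubelet's documented behavior of starting at 10 seconds, doubling after each failure, and capping at 300 seconds. The function name `backoff_schedule` is hypothetical; the sketch only illustrates the arithmetic and does not query a cluster.

```bash
#!/usr/bin/env bash
# Print the first N restart delays (in seconds) for a crash-looping container,
# assuming the delay starts at 10s, doubles each time, and is capped at 300s.
backoff_schedule() {
  local count="$1" delay=10 cap=300 out=""
  local i
  for ((i = 0; i < count; i++)); do
    # Append the current delay, space-separated.
    out+="${out:+ }${delay}"
    delay=$((delay * 2))
    if ((delay > cap)); then delay=$cap; fi
  done
  echo "$out"
}
```

For example, `backoff_schedule 7` prints `10 20 40 80 160 300 300`, which is why a pod that has been crashing for a while appears to sit idle for minutes between restart attempts.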

## Symptoms

- `kubectl get pods` shows a `STATUS` of `CrashLoopBackOff` with a high `RESTARTS` count
- Pod events show: "Back-off restarting failed container"
- Container terminates within seconds of starting
- Liveness probe reports: "Liveness probe failed: connection refused"
- Exit code 137 (container killed with SIGKILL, typically OOMKilled) or 1 (application error)
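
Exit codes above 128 conventionally mean the process was terminated by a signal numbered `code - 128`, which is how 137 maps to SIGKILL (signal 9). The following sketch decodes an exit code on that assumption; `describe_exit_code` is a hypothetical helper name, and the messages it prints are illustrative rather than anything Kubernetes emits.

```bash
#!/usr/bin/env bash
# Describe a container exit code. Codes above 128 conventionally mean
# "terminated by signal (code - 128)"; other non-zero codes come from the app.
describe_exit_code() {
  local code="$1"
  if ((code > 128)); then
    local sig=$((code - 128))
    # `kill -l <number>` prints the signal name without the SIG prefix.
    echo "killed by signal ${sig} ($(kill -l "$sig"))"
  elif ((code == 0)); then
    echo "exited normally"
  else
    echo "application exited with error code ${code}"
  fi
}

# The exit code of the last terminated container can be read with:
#   kubectl get pod <pod-name> -n <namespace> \
#     -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
```

A SIGKILL on its own does not prove an out-of-memory kill, so it is worth confirming the termination reason as well.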

## Common Causes

- Container OOMKilled because memory limits are too low
- Liveness probe failing before the application finishes startup
- Missing ConfigMap or Secret referenced in the pod spec
- Application crashes due to missing environment variables
- Init container failures blocking the main container from starting

## Step-by-Step Fix

1. **Check pod status and events**:
   ```bash
   kubectl describe pod <pod-name> -n <namespace>
   ```
   Look at the Events section at the bottom for the root cause.

2. **Check container logs**:
   ```bash
   # Logs from the previous (crashed) container instance
   kubectl logs <pod-name> -n <namespace> --previous
   # Last 100 lines from the current container instance
   kubectl logs <pod-name> -n <namespace> --tail=100
   ```
3. **Check for OOMKilled**:
   ```bash
   kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
   ```
   If the output is `OOMKilled`, increase the memory limits in the deployment spec. Note that `containerStatuses[0]` refers to the first container in the pod.
4. **Fix liveness probe timing with a startupProbe**:
   ```yaml
   livenessProbe:
     httpGet: { path: /healthz, port: 8080 }
     initialDelaySeconds: 30
     periodSeconds: 10
     failureThreshold: 3
   startupProbe:
     httpGet: { path: /healthz, port: 8080 }
     failureThreshold: 30
     periodSeconds: 10
   ```
   A startupProbe prevents the livenessProbe from killing slow-starting containers: liveness checks are held off until the startupProbe succeeds. With these values the container gets up to 300 seconds (30 failures × 10 seconds) to start.
5. **Verify ConfigMaps and Secrets exist**:
   ```bash
   kubectl get configmap <config-name> -n <namespace>
   kubectl get secret <secret-name> -n <namespace>
   ```
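
The diagnostic commands from the steps can be chained into one pass. The sketch below is one possible way to do that, not an official tool: `crashloop_triage` is a hypothetical function name, it inspects only the first container in the pod, and it assumes `kubectl` is already pointed at the right AKS cluster.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run the describe, logs, and OOMKilled checks in sequence.
# Usage: crashloop_triage <pod-name> <namespace>
crashloop_triage() {
  local pod="$1" ns="$2" reason

  echo "== Events (tail of describe output) =="
  kubectl describe pod "$pod" -n "$ns" | tail -n 20

  echo "== Previous container logs =="
  kubectl logs "$pod" -n "$ns" --previous --tail=100

  # Termination reason of the first container's last run (empty if none).
  reason="$(kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}')"
  echo "== Last termination reason: ${reason:-unknown} =="

  if [ "$reason" = "OOMKilled" ]; then
    echo "Hint: increase the memory limit in the deployment spec."
  fi
}
```

Calling `crashloop_triage <pod-name> <namespace>` prints the events, the crashed container's logs, and the termination reason together, which usually narrows the cause to one of the cases listed under Common Causes.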

## Prevention

- Always configure a startupProbe for applications with long initialization
- Set memory requests and limits based on actual usage profiling
- Use Azure Monitor Container Insights for pod health monitoring
- Implement readiness probes to prevent traffic from reaching unready pods
- Use PodDisruptionBudgets to control voluntary disruptions
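
As a rough illustration of sizing memory limits from observed usage, the sketch below adds a headroom percentage to a measured peak. The helper name `suggest_memory_limit`, the 25% default headroom, and the `<deployment-name>` placeholder are all assumptions made for this example; the right margin depends on the workload's actual memory profile.

```bash
#!/usr/bin/env bash
# Suggest a memory limit in Mi from an observed peak (Mi) plus a headroom
# percentage. Both the function and the 25% default are illustrative assumptions.
suggest_memory_limit() {
  local peak_mi="$1" headroom_pct="${2:-25}"
  echo "$(( peak_mi * (100 + headroom_pct) / 100 ))Mi"
}

# Example workflow (requires the metrics API to be available in the cluster):
#   kubectl top pod <pod-name> -n <namespace> --containers
#   kubectl set resources deployment <deployment-name> -n <namespace> \
#     --limits=memory="$(suggest_memory_limit 400)"
```

With an observed peak of 400 Mi, `suggest_memory_limit 400` prints `500Mi`. A single `kubectl top` reading is only a snapshot, so a peak taken from longer-term monitoring is a better input.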