Introduction
An OOM-killed container does not fail gracefully. The kernel enforces the container’s memory boundary and terminates the process to protect the host. The fix is not simply to raise the limit. You need to understand whether the limit is unrealistically low, the application is misconfigured for the container size, or the workload is leaking or spiking memory in a way the platform cannot sustain.
Symptoms
- The container exits with status 137
- `docker inspect` shows `OOMKilled: true`
- The service restarts repeatedly under load
- Application logs stop abruptly with no graceful shutdown path
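A quick way to interpret that exit status: 137 is 128 plus the signal number, and signal 9 is SIGKILL, which is what the OOM killer delivers. A minimal check:

```shell
# Exit status 137 = 128 + signal number; signal 9 is SIGKILL,
# the signal the kernel OOM killer sends.
sig=$(( 137 - 128 ))
kill -l "$sig"   # prints the signal name
```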
Common Causes
- The hard memory limit is lower than the workload’s real runtime requirements
- The application runtime is not configured to respect container memory constraints
- Temporary build or startup spikes exceed the limit even if steady-state usage is lower
- A memory leak or unbounded cache eventually consumes the whole limit
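To illustrate the runtime-misconfiguration cause, a container can be started with both a hard limit and matching runtime flags. These are hedged examples, not a prescription: the image names, limit, and sizes below are placeholders you would replace with measured values.

```shell
# JVM: size the heap relative to the cgroup limit instead of host RAM.
# (Image name "my-java-app" and the 512m limit are placeholders.)
docker run --memory 512m my-java-app \
  java -XX:MaxRAMPercentage=75.0 -jar app.jar

# Node.js: cap the V8 old-space heap below the container limit.
docker run --memory 512m -e NODE_OPTIONS="--max-old-space-size=384" my-node-app
```

The point of both flags is the same: the runtime’s default sizing heuristics may look at host memory, not the container limit, so the cap must be stated explicitly.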
Step-by-Step Fix
1. Confirm the container was actually OOM-killed. Verify the failure mode first so you do not confuse it with a normal crash or manual stop.

   ```shell
   docker inspect my-container --format "{{.State.OOMKilled}}"
   docker stats --no-stream
   ```

2. Check the configured memory limit against real workload use. Many incidents happen because the limit was copied from another service instead of measured for this one.
3. Tune the application runtime inside the container. JVM, Node.js, and other runtimes often need explicit memory settings to behave well within container limits.
4. Increase headroom only after understanding the usage pattern. If the workload has legitimate spikes, raise the limit. If it has a leak, raising the limit only delays the next kill.
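One way to make the headroom step concrete is to derive the limit from measured peak usage plus an explicit margin, rather than guessing. The numbers here are hypothetical; the peak would come from observing `docker stats` under realistic load.

```shell
# Hypothetical sizing: measured peak RSS plus ~30% headroom, rounded up.
peak_mib=410       # assumed measured peak, in MiB
headroom_pct=30    # assumed safety margin
limit_mib=$(( (peak_mib * (100 + headroom_pct) + 99) / 100 ))
echo "${limit_mib}m"   # value to pass to: docker run --memory "${limit_mib}m" ...
```

Writing the calculation down, even informally, keeps the limit tied to evidence and makes it easy to revisit when the workload changes.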
Prevention
- Set container memory limits based on measured behavior, not guesswork
- Tune application runtimes to respect container memory boundaries
- Monitor memory growth trends, not just restart counts
- Distinguish startup spikes from true steady-state memory needs
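Monitoring growth trends can start as simply as periodic sampling. This sketch assumes a running Docker daemon and a container named `my-container` (a placeholder); a real setup would feed the same data into a metrics system instead of a log file.

```shell
# Sketch: sample container memory usage every 60 s so slow growth
# (a leak) is distinguishable from startup spikes in the trend.
while true; do
  printf '%s %s\n' "$(date -u +%FT%TZ)" \
    "$(docker stats my-container --no-stream --format '{{.MemUsage}}')" \
    >> mem-trend.log
  sleep 60
done
```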