Your server suddenly becomes unresponsive. You check the logs and see messages like "Out of memory: Kill process 1234 (java) score 567 or sacrifice child." The OOM killer has struck again, terminating processes to keep the system running.
Understanding the Problem
The Linux kernel's OOM (Out-Of-Memory) killer activates when the system exhausts available memory. When no more memory can be allocated and no process will voluntarily release memory, the kernel selects a process to terminate based on its OOM score.
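The kernel's choice is visible per process under /proc: each task exposes the badness score the OOM killer consults, plus an adjustment knob. A quick look at the interface, using the current shell as the example process:

```shell
# Inspect the OOM bookkeeping for the current process.
# oom_score is the kernel-computed badness (higher = killed first);
# oom_score_adj is the administrator bias, ranging from -1000 to 1000.
cat /proc/self/oom_score
cat /proc/self/oom_score_adj
```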
Typical Error Messages
```
Out of memory: Kill process 1842 (mysqld) score 527 or sacrifice child
Killed process 1842 (mysqld) total-vm:8388608kB, anon-rss:4194304kB, file-rss:0kB
Memory cgroup out of memory: Kill process 1842 (mysqld) score 527 or sacrifice child
```

You might also notice:
- Services crashing unexpectedly
- System becoming extremely slow before recovering
- SSH connections dropping
- Applications reporting "Cannot allocate memory" errors
Diagnosing the Issue
First, verify that OOM killer was indeed the culprit:
```bash
# Check for OOM killer activity in the kernel ring buffer
dmesg | grep -i "out of memory"
dmesg | grep -i "killed process"

# Search system logs for OOM events
journalctl -k --since "1 hour ago" | grep -i oom
grep -i "out of memory" /var/log/syslog
grep -i "oom-killer" /var/log/messages
```
Examine current memory usage to understand the scope:
```bash
# Quick overview of memory
free -h

# Detailed memory breakdown
head -20 /proc/meminfo

# Top memory-consuming processes
ps aux --sort=-%mem | head -15

# Alternative: use smem if installed
smem -t -k -s rss | tail -20
```
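Raw totals can mislead once caches are counted; the kernel's MemAvailable field estimates how much memory is obtainable without swapping, which makes it a better pressure signal than plain free memory. A one-liner sketch that turns it into a percentage:

```shell
# Estimate headroom as MemAvailable / MemTotal from /proc/meminfo.
# MemAvailable accounts for reclaimable caches, so it reflects real
# pressure better than the "free" column alone.
awk '/^MemTotal/ {t=$2} /^MemAvailable/ {a=$2}
     END {printf "%.1f%% of RAM available\n", a / t * 100}' /proc/meminfo
```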
For a more detailed analysis, check the OOM scores of running processes:
```bash
# List processes with their OOM scores (higher = more likely to be killed)
for f in /proc/[0-9]*/oom_score; do
    pid=${f#/proc/}; pid=${pid%/oom_score}
    echo "$pid $(cat "$f") $(cat "/proc/$pid/comm" 2>/dev/null)"
done | sort -k2 -rn | head -20
```

Solutions
Immediate Relief: Free Up Memory
If the system is still responsive but under memory pressure:
```bash
# Clear the page cache (safe: only frees cached data)
sync && echo 1 > /proc/sys/vm/drop_caches

# Clear dentries and inodes
echo 2 > /proc/sys/vm/drop_caches

# Clear all caches (use cautiously)
echo 3 > /proc/sys/vm/drop_caches

# For PostgreSQL, trigger a checkpoint to flush dirty buffers
sudo -u postgres psql -c "CHECKPOINT;"

# Restart memory-hungry services during low traffic
systemctl restart application-name
```
Protect Critical Processes
Prevent essential services from being killed by adjusting their OOM score:
```bash
# Lower the OOM score adjustment (-1000 to 1000; lower = less likely to be killed)
# -1000 disables OOM killing for the process entirely
echo -1000 > /proc/$(pidof -s sshd)/oom_score_adj

# For systemd services, add to the service file:
# [Service]
# OOMScoreAdjust=-500

# Verify the adjustment
cat /proc/$(pidof -s sshd)/oom_score_adj
```
For a systemd-managed service like PostgreSQL:
```bash
# Create an override
systemctl edit postgresql

# Add:
# [Service]
# OOMScoreAdjust=-500

# Reload and restart
systemctl daemon-reload
systemctl restart postgresql
```
Tune Virtual Memory Parameters
Adjust kernel parameters to handle memory pressure better:
```bash
# View current settings
sysctl vm.swappiness vm.vfs_cache_pressure vm.overcommit_memory

# Reduce swappiness (default 60; lower = less swap usage)
sysctl -w vm.swappiness=10

# Increase cache pressure to reclaim inode/dentry caches more aggressively
sysctl -w vm.vfs_cache_pressure=200

# Control overcommit behavior:
#   0: heuristic overcommit (default)
#   1: always overcommit
#   2: never overcommit; strict accounting
sysctl -w vm.overcommit_memory=2
sysctl -w vm.overcommit_ratio=80
```
Make these changes persistent by adding to /etc/sysctl.conf or creating a file in /etc/sysctl.d/:
```bash
# Create persistent configuration
cat > /etc/sysctl.d/99-memory-tuning.conf << 'EOF'
vm.swappiness = 10
vm.vfs_cache_pressure = 200
vm.overcommit_memory = 2
vm.overcommit_ratio = 80
EOF

# Apply changes
sysctl -p /etc/sysctl.d/99-memory-tuning.conf
```
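With vm.overcommit_memory=2 the kernel enforces a hard commit ceiling of roughly swap plus overcommit_ratio percent of RAM; allocations beyond it fail with "Cannot allocate memory" instead of triggering the OOM killer. Before enabling strict mode, it is worth comparing that ceiling against what is already committed:

```shell
# CommitLimit ~= swap + RAM * vm.overcommit_ratio / 100
# Committed_AS is the total address space already promised to processes;
# if it exceeds the would-be limit, strict accounting will break allocations.
grep -E '^(CommitLimit|Committed_AS)' /proc/meminfo
```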
Add Swap Space
If you lack sufficient swap, create an additional swap file:
```bash
# Check current swap
swapon --show

# Create a 4 GB swap file
sudo fallocate -l 4G /swapfile
# (if fallocate is unsupported on your filesystem, use:
#  sudo dd if=/dev/zero of=/swapfile bs=1M count=4096)
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

# Verify
swapon --show
free -h

# Make permanent
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
```
Monitor and Alert
Set up monitoring to catch memory issues early:
```bash
# Quick monitoring script
cat > /usr/local/bin/memory-check.sh << 'EOF'
#!/bin/bash
THRESHOLD=90
MEM_USAGE=$(free | awk '/Mem:/ {printf "%.0f", $3/$2 * 100}')
if [ "$MEM_USAGE" -gt "$THRESHOLD" ]; then
    echo "WARNING: Memory usage at ${MEM_USAGE}%"
    ps aux --sort=-%mem | head -10 | mail -s "Memory Alert" admin@example.com
fi
EOF
chmod +x /usr/local/bin/memory-check.sh

# Add to crontab for periodic checks
# (note: "crontab -" replaces the current crontab; append instead if you have entries)
echo '*/5 * * * * /usr/local/bin/memory-check.sh' | crontab -
```
Verification
After implementing fixes, verify the system is stable:
```bash
# Monitor memory in real time
watch -n 1 free -h

# Check for recent OOM events
dmesg -T | grep -i "out of memory" | tail -5

# Verify OOM score adjustments
cat /proc/$(pidof -s sshd)/oom_score_adj

# Confirm swap is active
swapon --show

# Check that sysctl settings are applied
sysctl vm.swappiness vm.overcommit_memory
```
Prevention Tips
- Set appropriate ulimit values for applications to prevent runaway memory usage
- Use containerization (Docker, Podman) with memory limits to isolate memory-hungry applications
- Monitor memory trends over time with tools like Prometheus, Grafana, or Zabbix
- Review application configurations for memory-related settings (JVM heap size, PHP memory_limit, etc.)
- Consider upgrading physical RAM if consistently near capacity during normal operations
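The same isolation idea works without containers: on cgroup-v2 systems, a systemd drop-in can cap a service's memory so a leak takes down only that service rather than the whole host. A sketch, assuming a hypothetical myapp.service and limits chosen purely for illustration:

```ini
# /etc/systemd/system/myapp.service.d/memory.conf (hypothetical unit name)
[Service]
# Start reclaiming and throttling the service before the hard limit is hit
MemoryHigh=384M
# Hard ceiling: the service's cgroup is OOM-killed if it exceeds this
MemoryMax=512M
```

After dropping the file in place, run systemctl daemon-reload and restart the service for the limits to take effect.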