Introduction

Redis OOM (Out Of Memory) command failed errors occur when Redis reaches its configured maxmemory limit and cannot allocate memory for new write operations. When this happens, Redis returns OOM command not allowed when used memory > maxmemory errors to clients, causing application failures, cache misses, session loss, and cascading service disruptions. Unlike database systems that can use disk for overflow, Redis is an in-memory data store - when memory is full, writes fail immediately. The fix requires understanding Redis memory architecture, eviction policies, key expiration strategies, memory profiling, and application-level error handling. This guide provides production-proven troubleshooting for Redis OOM scenarios including memory limit configuration, eviction policy selection, big key detection and removal, memory fragmentation resolution, and monitoring strategies.

Symptoms

  • Application logs show ERR command not allowed when used memory > maxmemory
  • Redis CLI returns OOM command not allowed when used memory > maxmemory
  • Write operations (SET, LPUSH, HSET, etc.) fail while reads succeed
  • Cache misses increase as new entries cannot be stored
  • User sessions lost when session storage Redis reaches memory limit
  • Application slows down or returns errors during peak traffic
  • INFO memory shows used_memory at or above maxmemory
  • Redis mem_fragmentation_ratio above 1.5 indicates memory inefficiency

Common Causes

  • maxmemory limit set too low for dataset size and workload
  • No eviction policy configured (noeviction is the default in open-source Redis)
  • Eviction policy inappropriate for workload (e.g., allkeys-lru for mixed workloads)
  • Big keys consuming disproportionate memory (Hash/List/Set with millions of elements)
  • Memory fragmentation from frequent allocations/deallocations
  • Key TTL not set, causing memory to fill with stale data
  • Traffic spike causing temporary memory surge above limit
  • Redis dataset growth outpacing memory capacity planning
  • Multiple applications sharing Redis instance without resource quotas
  • Redis running on 32-bit system with 3GB memory ceiling

Step-by-Step Fix

### 1. Confirm OOM diagnosis

Check current memory status:

```bash
# Connect to Redis
redis-cli

# Check memory info
INFO memory

# Key metrics:
# used_memory: Actual memory used by Redis data
# used_memory_human: Human-readable (e.g., 1.95G)
# used_memory_rss: Memory from OS perspective (includes fragmentation)
# used_memory_peak: Maximum memory reached
# maxmemory: Configured limit (0 = unlimited)
# maxmemory_human: Human-readable limit
# maxmemory_policy: Current eviction policy

# Quick memory status
redis-cli INFO memory | grep -E "used_memory:|maxmemory:|maxmemory_policy"

# Example output:
# used_memory:2147483648
# used_memory_human:2.00G
# maxmemory:2147483648
# maxmemory_human:2.00G
# maxmemory_policy:noeviction
```
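To act on these metrics from a script, the `INFO memory` reply can be parsed into a dict. A minimal sketch, assuming you already have the raw INFO text (the helper names and the sample text below are illustrative, not part of redis-py):

```python
def parse_info_memory(info_text):
    """Parse the key:value lines of an INFO memory reply into a dict."""
    metrics = {}
    for line in info_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip section headers and blank lines
        key, _, value = line.partition(":")
        metrics[key] = value
    return metrics

def memory_usage_percent(metrics):
    """Return used_memory as a percentage of maxmemory (0 if unlimited)."""
    used = int(metrics["used_memory"])
    maxmem = int(metrics["maxmemory"])
    return used * 100 // maxmem if maxmem > 0 else 0

sample = """# Memory
used_memory:2147483648
maxmemory:2147483648
maxmemory_policy:noeviction"""

m = parse_info_memory(sample)
print(memory_usage_percent(m))  # 100 -> at the limit, OOM errors expected
```

A usage of 100% together with `maxmemory_policy:noeviction` is exactly the combination that produces the OOM error.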

Test write operation:

```bash
# Try a simple SET command
redis-cli SET test_key "test_value"

# If OOM, returns:
# (error) OOM command not allowed when used memory > maxmemory

# Check which commands are failing
redis-cli LPUSH mylist "item"
redis-cli HSET myhash field value
redis-cli SADD myset member
```

### 2. Check current eviction policy

Eviction policy determines how Redis handles memory pressure:

```bash
# Get current eviction policy
redis-cli CONFIG GET maxmemory-policy

# Policy options:
# noeviction      - Return OOM errors (default, for data safety)
# allkeys-lru     - Evict least recently used keys from all keys
# volatile-lru    - Evict LRU keys among keys with a TTL set
# allkeys-lfu     - Evict least frequently used keys
# volatile-lfu    - Evict LFU keys among keys with a TTL set
# allkeys-random  - Evict random keys
# volatile-random - Evict random keys with TTL
# volatile-ttl    - Evict keys with shortest TTL first

# Check if policy is appropriate for your workload
# - Cache workloads: allkeys-lru or allkeys-lfu
# - Session storage: volatile-lru with TTL
# - Mixed workloads: volatile-lru or volatile-lfu
# - Time-series: volatile-ttl
```

Eviction policy selection guide:

| Use Case | Recommended Policy | Rationale |
|----------|-------------------|-----------|
| Pure cache | allkeys-lru | Evict coldest cached items |
| Cache with popularity | allkeys-lfu | Evict least frequently accessed |
| Session storage | volatile-lru | Only evict sessions with TTL |
| Time-series data | volatile-ttl | Evict oldest data first |
| Mixed (cache + persistent) | volatile-lru | Protect keys without TTL |
| Development/testing | allkeys-random | Simple, predictable |
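The table above can be expressed as a small lookup helper so deployment tooling picks a consistent policy per workload. A hedged sketch — the use-case names and function are illustrative conventions, not a Redis API:

```python
# Map a workload description to the eviction policy this guide recommends.
# The use-case keys are arbitrary labels chosen for this example.
RECOMMENDED_POLICY = {
    "pure_cache": "allkeys-lru",
    "cache_with_popularity": "allkeys-lfu",
    "session_storage": "volatile-lru",
    "time_series": "volatile-ttl",
    "mixed": "volatile-lru",
    "dev_testing": "allkeys-random",
}

def recommend_policy(use_case):
    # Fall back to noeviction (Redis's safe default) for unknown workloads
    return RECOMMENDED_POLICY.get(use_case, "noeviction")

print(recommend_policy("session_storage"))  # volatile-lru
```

The returned string can be passed directly to `CONFIG SET maxmemory-policy`.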

### 3. Increase maxmemory limit

If memory is available on the host, increase the limit:

```bash
# Check current limit
redis-cli CONFIG GET maxmemory

# Set new limit (example: 4GB)
redis-cli CONFIG SET maxmemory 4gb

# Make persistent (add to redis.conf)
# Edit /etc/redis/redis.conf or /etc/redis.conf:
# maxmemory 4gb

# Restart Redis to apply config file changes
sudo systemctl restart redis

# Verify new limit
redis-cli INFO memory | grep maxmemory

# Calculate an appropriate limit based on host memory
# Rule: Redis should use 50-75% of host RAM
# Leave room for: OS, other processes, Redis fork operations

# Example: 8GB host
# - Reserve 2GB for OS (25%)
# - Redis maxmemory: 6GB (75%)
```

Memory limit guidelines:

| Host RAM | Recommended maxmemory | Notes |
|----------|----------------------|-------|
| 2GB | 1-1.5GB | Leave 25-50% for OS |
| 4GB | 2-3GB | Suitable for small datasets |
| 8GB | 5-6GB | Good for medium workloads |
| 16GB | 10-12GB | Handle large datasets |
| 32GB+ | 20-24GB | Consider Redis Cluster for larger |
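The 50-75% rule above is easy to automate when provisioning hosts. A minimal sketch (function names are illustrative; adjust the fraction for your workload):

```python
def recommended_maxmemory_bytes(host_ram_gb, redis_fraction=0.75):
    """Apply the 50-75% rule from the table above (default 75%)."""
    return int(host_ram_gb * (1024 ** 3) * redis_fraction)

def as_redis_config(host_ram_gb):
    """Render the value as a redis.conf maxmemory directive."""
    gb = recommended_maxmemory_bytes(host_ram_gb) / (1024 ** 3)
    return f"maxmemory {gb:g}gb"

print(as_redis_config(8))  # maxmemory 6gb
```

For hosts that also run BGSAVE-heavy persistence, a lower fraction (0.5) leaves more headroom for the fork.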

### 4. Change eviction policy

Switch to an appropriate eviction policy:

```bash
# Change eviction policy (runtime)
redis-cli CONFIG SET maxmemory-policy allkeys-lru

# Make persistent in redis.conf
# Edit /etc/redis/redis.conf:
# maxmemory-policy allkeys-lru

# Verify policy change
redis-cli CONFIG GET maxmemory-policy

# Monitor eviction activity
redis-cli INFO stats | grep evicted

# Output:
# evicted_keys:1234
# This shows how many keys have been evicted since startup
```

Configure LRU/LFU sampling:

```bash
# LRU/LFU algorithms sample a subset of keys for performance
# Higher samples = more accurate eviction, more CPU

# Check current sample size
redis-cli CONFIG GET maxmemory-samples

# Default is 5; increase to 10 for better accuracy
redis-cli CONFIG SET maxmemory-samples 10

# For LFU, configure decay time and log factor
redis-cli CONFIG GET lfu-decay-time
redis-cli CONFIG GET lfu-log-factor

# lfu-decay-time: Minutes to decay the access counter by 1 (default 1)
# lfu-log-factor: Counter saturation factor (default 10)
```

### 5. Detect and remove big keys

Big keys consume disproportionate memory and block Redis during deletion:

```bash
# Find big keys using --bigkeys scan
redis-cli --bigkeys

# Output example:
# Scanning the entire keyspace to find bigger keys...
# -------- output -------
# Sampled 100000 keys in the keyspace.
# Peak key found: 10485760 bytes (10.00 MB)
#
# Hash keys sampled: 50000
# Biggest hash key found: 10485760 bytes with 1000000 fields
#
# List keys sampled: 1000
# Biggest list key found: 5242880 bytes with 500000 elements
#
# Set keys sampled: 2000
# Biggest set key found: 2097152 bytes with 100000 members
#
# Sorted Set keys sampled: 500
# Biggest sorted set key found: 3145728 bytes with 150000 members

# More detailed analysis with redis-rdb-tools
pip install rdbtools
rdb -c memory /path/to/dump.rdb

# Or use Redis 4.0+ MEMORY command
redis-cli MEMORY USAGE mykey

# Scan for keys over 1MB
redis-cli --scan --pattern '*' | while read key; do
  size=$(redis-cli MEMORY USAGE "$key")
  if [ "$size" -gt 1048576 ]; then
    echo "$key: $((size / 1024 / 1024))MB"
  fi
done
```

Remove big keys safely:

```bash
# WRONG: DEL on big key blocks Redis
DEL big_hash_with_million_fields  # Blocks for seconds!

# CORRECT: Use UNLINK (non-blocking async delete, Redis 4.0+)
UNLINK big_hash_with_million_fields

# For Hash: delete fields in batches
redis-cli HLEN big_hash
# Returns: 1000000

# Batch delete ~1000 fields at a time using HSCAN
# (HSCAN walks the hash incrementally instead of loading all fields)
python3 << 'EOF'
import redis

r = redis.Redis()
key = "big_hash"
cursor = 0
while True:
    cursor, fields = r.hscan(key, cursor, count=1000)
    if fields:
        r.hdel(key, *fields.keys())
        print(f"Deleted {len(fields)} fields")
    if cursor == 0:
        break
EOF

# For List: trim in batches
redis-cli LLEN mylist
# Returns: 500000

# Drop 10000 elements at a time
while [ "$(redis-cli LLEN mylist)" -gt 0 ]; do
  redis-cli LTRIM mylist 10000 -1
done

# For Set/Sorted Set: use SPOP/ZPOPMIN in a loop
redis-cli SCARD big_set
# Returns: 100000

# Pop 1000 members at a time
while [ "$(redis-cli SCARD big_set)" -gt 0 ]; do
  redis-cli SPOP big_set 1000 > /dev/null
done
```

### 6. Set key TTL for automatic expiration

Keys with TTL are automatically evicted when memory is full:

```bash
# Set key with expiration (seconds)
redis-cli SET session:user123 "{\"user_id\": 123}" EX 3600

# Set key with expiration (milliseconds)
redis-cli SET cache:item "value" PX 60000

# Add TTL to existing key
redis-cli EXPIRE cache:item 3600

# Set TTL on Hash (the TTL applies to the whole key)
redis-cli HSET user:123 name "John" email "john@example.com"
redis-cli EXPIRE user:123 7200

# Check TTL
redis-cli TTL session:user123
# Returns: seconds until expiration, -1 (no TTL), -2 (key does not exist)
```

Set a default TTL in application code:

```python
# Python example
import redis

r = redis.Redis()
r.setex("cache:key", 3600, "value")  # Set with 1 hour TTL
```

```java
// Java Spring example
@Autowired
private RedisTemplate<String, String> redisTemplate;

public void cacheWithTtl(String key, String value) {
    redisTemplate.opsForValue().set(key, value, 1, TimeUnit.HOURS);
}
```

TTL guidelines by data type:

| Data Type | Recommended TTL | Rationale |
|-----------|----------------|-----------|
| User sessions | 1-24 hours | Match business requirements |
| API cache | 5-60 minutes | Balance freshness vs. hits |
| Computed results | 15-60 minutes | Avoid recomputation |
| Rate limit counters | 1-60 seconds | Rolling window |
| Lock keys | 5-30 seconds | Prevent deadlocks |
| Temporary data | 5-60 minutes | Auto-cleanup |
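One way to enforce these guidelines application-wide is a TTL lookup keyed by naming convention. A hedged sketch — the key prefixes (`session:`, `cache:`, `lock:`, ...) are assumptions about your naming scheme, not a Redis feature:

```python
# Default TTLs (seconds) by key prefix, following the table above.
TTL_BY_PREFIX = {
    "session:":   24 * 3600,  # user sessions: up to 24 hours
    "cache:":     15 * 60,    # API cache: minutes
    "result:":    30 * 60,    # computed results
    "ratelimit:": 60,         # rolling-window counters
    "lock:":      30,         # short TTL to prevent deadlocks
}

def ttl_for_key(key, default=3600):
    """Pick a TTL from the key's prefix, falling back to one hour."""
    for prefix, ttl in TTL_BY_PREFIX.items():
        if key.startswith(prefix):
            return ttl
    return default

print(ttl_for_key("lock:order:42"))  # 30
```

The value can then be passed as the TTL argument to `setex`/`expire`, so no write lands without an expiration.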

### 7. Enable memory-efficient data structures

Optimize data structure encoding:

```bash
# Check encoding of keys
redis-cli OBJECT ENCODING mykey

# Output:
# embstr    - Embedded string (small strings, efficient)
# raw       - Raw string (larger strings)
# int       - Integer encoded (most efficient)
# hashtable - Standard hash table
# ziplist   - Compressed list (memory efficient)
# quicklist - Linked list of ziplists (default for lists)
# intset    - Integer set (memory efficient)
# skiplist  - Sorted set internal structure

# Note: Redis 7.0+ renamed the ziplist settings to listpack
# (e.g., hash-max-listpack-entries); the old names remain as aliases

# Optimize Hash encoding
# Ziplist encoding is more memory efficient for small hashes
redis-cli CONFIG SET hash-max-ziplist-entries 512
redis-cli CONFIG SET hash-max-ziplist-value 64

# Optimize List encoding
redis-cli CONFIG SET list-max-ziplist-size -2
# -1: 4KB, -2: 8KB, -3: 16KB, -4: 32KB, -5: 64KB per ziplist node

# Optimize Set encoding
redis-cli CONFIG SET set-max-intset-entries 512
# Sets containing only integers use intset (more efficient)

# Optimize Sorted Set encoding
redis-cli CONFIG SET zset-max-ziplist-entries 128
redis-cli CONFIG SET zset-max-ziplist-value 64
```

Use memory-efficient patterns:

```bash
# WRONG: Individual keys for related data
SET user:1:name "John"
SET user:1:email "john@example.com"
SET user:1:phone "123-456-7890"
# Overhead: 3 separate keys, 3x per-key memory overhead

# CORRECT: Use Hash for related data
HSET user:1 name "John" email "john@example.com" phone "123-456-7890"
# Single key, shared overhead, better compression

# WRONG: List for unique items
LPUSH tags "tag1"
LPUSH tags "tag1"  # Duplicate!
LPUSH tags "tag2"

# CORRECT: Use Set for unique items
SADD tags "tag1" "tag2"

# WRONG: Store full serialized objects
SET user:123 "{\"id\":123,\"name\":\"John\",\"email\":\"john@example.com\",\"active\":true,...}"

# CORRECT: Use Hash with selective fields
HSET user:123 name "John" email "john@example.com" active 1
```

### 8. Clear memory fragmentation

Memory fragmentation wastes RAM without storing data:

```bash
# Check fragmentation ratio
redis-cli INFO memory | grep fragmentation

# Output:
# mem_fragmentation_ratio:1.45
# mem_fragmentation_bytes:524288000

# Fragmentation ratio:
# - ~1.0 = Ideal (no fragmentation)
# - 1.0-1.5 = Acceptable
# - 1.5-2.0 = Moderate fragmentation
# - >2.0 = Severe fragmentation
# (<1.0 usually means the OS has swapped Redis memory to disk)

# Enable automatic defragmentation (Redis 4.0+, requires jemalloc)
redis-cli CONFIG SET activedefrag yes
redis-cli CONFIG SET active-defrag-threshold-lower 10
redis-cli CONFIG SET active-defrag-threshold-upper 100
redis-cli CONFIG SET active-defrag-cycle-min 1
redis-cli CONFIG SET active-defrag-cycle-max 25

# Parameters:
# threshold-lower: Start defrag at 10% fragmentation
# threshold-upper: Maximum effort at 100% fragmentation
# cycle-min: Minimum CPU % for defrag
# cycle-max: Maximum CPU % for defrag

# Last resort: a restart clears fragmentation because the dataset
# is reloaded into freshly allocated memory
sudo systemctl restart redis

# Check fragmentation after defrag
redis-cli INFO memory | grep fragmentation
```
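For alerting scripts, the ratio thresholds above can be bucketed into a severity label. A minimal sketch (the bucket names are illustrative):

```python
def classify_fragmentation(ratio):
    """Bucket mem_fragmentation_ratio per the thresholds above."""
    if ratio < 1.0:
        return "swapping"    # RSS below used_memory usually means OS swap
    if ratio <= 1.5:
        return "acceptable"
    if ratio <= 2.0:
        return "moderate"    # consider enabling activedefrag
    return "severe"          # defrag or restart during a maintenance window

print(classify_fragmentation(1.45))  # acceptable
```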

### 9. Implement client-side error handling

Applications should handle Redis OOM errors gracefully:

```python
# Python with error handling and fallback
import logging

import redis
from redis.exceptions import ResponseError

logger = logging.getLogger(__name__)

class CacheWithFallback:
    def __init__(self):
        self.redis = redis.Redis(host='localhost', port=6379)
        self._local_cache = {}  # in-process fallback cache

    def set_with_handling(self, key, value, ttl=3600):
        try:
            return self.redis.setex(key, ttl, value)
        except ResponseError as e:
            if 'OOM' in str(e):
                # Log OOM, optionally clear some cache
                logger.warning("Redis OOM: %s", key)

                # Option 1: Delete some cache to make room
                self._evict_some_keys()

                # Option 2: Use local cache as fallback
                self._local_cache[key] = value

                # Option 3: Silently drop (acceptable for cache)
                return False
            raise

    def _evict_some_keys(self):
        # Evict 10% of keys with shortest TTL
        pass

    def get_with_fallback(self, key):
        try:
            return self.redis.get(key)
        except ResponseError as e:
            if 'OOM' in str(e):
                # Try local cache
                return self._local_cache.get(key)
            raise
```

Java with Spring:

```java
@Service
public class CacheService {

    @Autowired
    private RedisTemplate<String, String> redisTemplate;

    private static final Logger log = LoggerFactory.getLogger(CacheService.class);

    public void cacheWithOomHandling(String key, String value, long ttl) {
        try {
            redisTemplate.opsForValue().set(key, value, ttl, TimeUnit.SECONDS);
        } catch (RedisSystemException e) {
            if (e.getRootCause() != null
                    && e.getRootCause().getMessage().contains("OOM")) {
                log.warn("Redis OOM detected for key: {}", key);
                // Option 1: Log and skip (cache miss acceptable)
                // Option 2: Use Caffeine local cache as L2
                // Option 3: Trigger manual eviction
            } else {
                throw e;
            }
        }
    }

    @Cacheable(value = "fallback", unless = "#result == null")
    public String getWithFallback(String key) {
        try {
            return redisTemplate.opsForValue().get(key);
        } catch (RedisSystemException e) {
            if (e.getRootCause() != null
                    && e.getRootCause().getMessage().contains("OOM")) {
                log.warn("Redis OOM on get, using fallback: {}", key);
                return null; // Will trigger cache miss handling
            }
            throw e;
        }
    }
}
```

### 10. Monitor memory usage proactively

Set up memory monitoring and alerting:

```bash
# Monitor memory in real-time
watch -n 1 'redis-cli INFO memory | grep -E "used_memory_human|maxmemory_human|fragmentation"'

# Create monitoring script
cat > /usr/local/bin/redis-memory-monitor.sh << 'EOF'
#!/bin/bash

MEMORY_INFO=$(redis-cli INFO memory)
USED=$(echo "$MEMORY_INFO" | grep "used_memory:" | cut -d: -f2 | tr -d '\r')
MAX=$(echo "$MEMORY_INFO" | grep "maxmemory:" | cut -d: -f2 | tr -d '\r')
FRAG=$(echo "$MEMORY_INFO" | grep "mem_fragmentation_ratio:" | cut -d: -f2 | tr -d '\r')

if [ "$MAX" -gt 0 ]; then
  USAGE_PCT=$((USED * 100 / MAX))
else
  USAGE_PCT=0
fi

echo "Redis Memory Usage: ${USAGE_PCT}%"
echo "Fragmentation Ratio: ${FRAG}"

if [ "$USAGE_PCT" -gt 90 ]; then
  echo "CRITICAL: Memory usage above 90%"
  exit 2
elif [ "$USAGE_PCT" -gt 80 ]; then
  echo "WARNING: Memory usage above 80%"
  exit 1
else
  echo "OK: Memory usage normal"
  exit 0
fi
EOF

chmod +x /usr/local/bin/redis-memory-monitor.sh
```

Prometheus metrics with redis_exporter:

```yaml
# docker-compose.yml
version: '3'
services:
  redis_exporter:
    image: oliver006/redis_exporter:latest
    ports:
      - "9121:9121"
    command:
      - --redis.addr=redis:6379
      - --check-keys=cache:*,session:*

  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
```

Prometheus alerting rules:

```yaml
# alerting_rules.yml
groups:
  - name: redis_memory
    rules:
      - alert: RedisMemoryHigh
        expr: redis_memory_used_bytes / redis_memory_max_bytes > 0.85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Redis memory usage above 85%"
          description: "Redis {{ $labels.instance }} memory usage is {{ $value | humanizePercentage }}"

      - alert: RedisMemoryCritical
        expr: redis_memory_used_bytes / redis_memory_max_bytes > 0.95
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Redis memory usage above 95%"
          description: "Redis {{ $labels.instance }} memory usage is {{ $value | humanizePercentage }} - OOM imminent"

      - alert: RedisFragmentationHigh
        expr: redis_memory_fragmentation_ratio > 1.5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Redis memory fragmentation high"
          description: "Redis {{ $labels.instance }} fragmentation ratio is {{ $value }}"
```

Grafana dashboard panels:

```json
{
  "panels": [
    {
      "title": "Redis Memory Usage",
      "targets": [
        { "expr": "redis_memory_used_bytes", "legendFormat": "Used Memory" },
        { "expr": "redis_memory_max_bytes", "legendFormat": "Max Memory" }
      ]
    },
    {
      "title": "Memory Fragmentation Ratio",
      "targets": [{ "expr": "redis_memory_fragmentation_ratio" }]
    },
    {
      "title": "Evicted Keys Rate",
      "targets": [{ "expr": "rate(redis_evicted_keys_total[5m])" }]
    }
  ]
}
```

### 11. Handle Redis Cluster memory

For Redis Cluster, manage memory per shard:

```bash
# Check memory for each node
redis-cli -c -p 7000 CLUSTER NODES | awk '{print $1, $2}' | while read id addr; do
  addr=${addr%@*}  # strip the cluster bus port (addr looks like ip:port@cport)
  echo "=== Node $id ($addr) ==="
  redis-cli -h "${addr%:*}" -p "${addr#*:}" INFO memory | grep -E "used_memory_human|maxmemory_human"
done

# Check memory distribution across slots
redis-cli -c -p 7000 CLUSTER SLOTS

# Rebalance if memory uneven
redis-cli --cluster rebalance 127.0.0.1:7000 --cluster-use-empty-masters

# Migrate keys from a high-memory node
redis-cli --cluster reshard 127.0.0.1:7000
```

### 12. Configure Redis for production

Production-ready redis.conf settings:

```conf
# Memory management
maxmemory 4gb
maxmemory-policy allkeys-lru
maxmemory-samples 10

# Defragmentation
activedefrag yes
active-defrag-threshold-lower 10
active-defrag-threshold-upper 100
active-defrag-cycle-min 1
active-defrag-cycle-max 25

# Persistence (the fork during BGSAVE temporarily increases memory pressure)
save 900 1
save 300 10
save 60 10000

# Disable THP for better latency (set at OS level):
# echo never > /sys/kernel/mm/transparent_hugepage/enabled

# Client limits
maxclients 10000
timeout 300
tcp-keepalive 60

# Slow log for debugging
slowlog-log-slower-than 10000
slowlog-max-len 128

# Keyspace notifications for expired-key events (Redis 2.8+)
notify-keyspace-events Ex
```

Prevention

  • Set maxmemory to 75% of available RAM
  • Always use allkeys-lru or volatile-lru eviction policy
  • Set TTL on all cache keys (sessions, computed results, API responses)
  • Monitor memory usage with alerts at 80% and 95%
  • Scan for big keys weekly and remove or restructure
  • Enable active defragmentation for long-running instances
  • Use Hash instead of individual keys for related data
  • Implement client-side OOM error handling with fallbacks
  • Plan capacity based on dataset growth projections
  • Use Redis Cluster for datasets exceeding single-node memory
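The capacity-planning bullets above can be made concrete with a rough projection of when the dataset will hit maxmemory. A minimal sketch assuming linear growth (the function name and growth model are illustrative):

```python
def months_until_full(current_bytes, maxmemory_bytes, monthly_growth_bytes):
    """Rough linear projection of when the dataset reaches maxmemory."""
    if monthly_growth_bytes <= 0:
        return None  # dataset is not growing
    headroom = maxmemory_bytes - current_bytes
    if headroom <= 0:
        return 0  # already at or over the limit
    return headroom / monthly_growth_bytes

GB = 1024 ** 3
# 3GB used, 6GB limit, growing 0.5GB/month -> ~6 months of headroom
print(round(months_until_full(3 * GB, 6 * GB, 0.5 * GB), 1))  # 6.0
```

Real growth is rarely linear, so treat the result as an early-warning signal, not a schedule.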

Related Error Messages

  • **ERR maxmemory limit reached**: Memory limit exceeded
  • **OOM command not allowed**: Write operations blocked
  • **MISCONF Redis is configured to save RDB**: Persistence error during memory pressure
  • **Can't save in background, fork failed**: Not enough memory for BGSAVE fork
  • **Background save error**: RDB/AOF persistence failing