What's Actually Happening

A ClickHouse query exceeded its configured memory limit and was terminated. Large aggregations, big JOINs, and queries that process many rows are the usual triggers.

The Error You'll See

Memory limit exceeded:

```bash
$ clickhouse-client --query "SELECT count() FROM large_table GROUP BY column"

Received exception from server (version 24.1):
Code: 241. DB::Exception: Memory limit exceeded (used: 2.5GiB, limit: 2GiB)
Would use 2.5 GiB at peak for aggregation.
```

Memory limit for query:

```bash
Code: 241. DB::Exception: Memory limit (for query) exceeded: would use 3GiB > 2GiB (max_memory_usage)
```

Overflow during aggregation:

```bash
Code: 241. DB::Exception: Memory limit exceeded during aggregation.
Would overflow with current memory limit.
```

Why This Happens

  1. Large GROUP BY - many unique grouping keys
  2. Big JOINs - joining large tables without pre-filtering or the right join algorithm
  3. Memory limit too low - max_memory_usage set below what the query needs
  4. No spill to disk - external aggregation and external sort are disabled by default
  5. Query not optimized - inefficient query structure
  6. High-cardinality columns - many unique values per key

Step 1: Check Memory Settings

```bash
# Check current memory limits:
clickhouse-client --query "SELECT name, value FROM system.settings WHERE name LIKE '%memory%'"

# Key settings to look at:
# max_memory_usage                   - per-query limit
# max_bytes_before_external_group_by - 0 means GROUP BY spill to disk is disabled
# max_bytes_before_external_sort     - 0 means ORDER BY spill to disk is disabled

# Check server memory metrics:
clickhouse-client --query "SELECT * FROM system.asynchronous_metrics WHERE metric LIKE '%Memory%'"

# Check memory used by currently running queries:
clickhouse-client --query "SELECT query, memory_usage FROM system.processes"

# Check total memory usage on the host:
free -h
ps aux | grep clickhouse
```

Step 2: Increase Memory Limits

```bash
# Increase the per-query memory limit for a session:
clickhouse-client --query "SET max_memory_usage = 4000000000"  # ~4 GB
# Note: with --query the session ends immediately, so prefer SETTINGS
# on the statement itself:
clickhouse-client --query "
SELECT count() FROM large_table GROUP BY column
SETTINGS max_memory_usage = 4000000000"

# Persistent per-query limits belong in a users.xml profile;
# the server-wide cap is max_server_memory_usage in config.xml:
# <max_memory_usage>4000000000</max_memory_usage>
# <max_server_memory_usage>8000000000</max_server_memory_usage>

# Or apply a settings profile:
clickhouse-client --query "SET profile = 'large_memory'"

# Profile defined in users.xml:
# <profiles>
#   <large_memory>
#     <max_memory_usage>10000000000</max_memory_usage>
#   </large_memory>
# </profiles>
```
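These settings take raw byte counts, while the error messages talk in GiB. A small Python helper (hypothetical, not part of any ClickHouse tooling) for converting human-readable sizes into the numbers a SETTINGS clause expects:

```python
# Hypothetical helper: convert sizes like "4GiB" into byte counts
# suitable for max_memory_usage and friends.
import re

_UNITS = {"": 1, "B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

def to_bytes(size: str) -> int:
    """Parse e.g. '4GiB' or '2.5GiB' into an integer byte count."""
    m = re.fullmatch(r"([\d.]+)\s*([KMGT]iB|B)?", size.strip())
    if not m:
        raise ValueError(f"unrecognized size: {size!r}")
    value, unit = m.groups()
    return int(float(value) * _UNITS[unit or ""])

print(to_bytes("4GiB"))  # 4294967296
```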

Step 3: Enable External Aggregation

```bash
# Enable spill to disk for GROUP BY (spills once state exceeds ~1 GB):
clickhouse-client --query "
SELECT count() FROM large_table GROUP BY column
SETTINGS max_bytes_before_external_group_by = 1000000000"

# Enable external sort for ORDER BY:
clickhouse-client --query "
SELECT * FROM large_table ORDER BY column
SETTINGS max_bytes_before_external_sort = 1000000000"

# JOINs spill via the join algorithm rather than a 'bytes_before'
# setting - see Step 5 (join_algorithm = 'partial_merge' or 'grace_hash')

# Make it permanent in a users.xml profile:
# <max_bytes_before_external_group_by>1000000000</max_bytes_before_external_group_by>
# <max_bytes_before_external_sort>1000000000</max_bytes_before_external_sort>

# External aggregation is slower but completes instead of failing;
# a common rule of thumb is to set it to about half of max_memory_usage
```
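The mechanism behind external aggregation can be sketched in plain Python (illustrative only, not ClickHouse internals): accumulate partial aggregates in memory, flush them to disk when the state grows too large, then merge the spilled partials at the end.

```python
# Sketch of spill-to-disk aggregation: a Counter plays the role of the
# in-memory hash table, and JSON files on disk play the role of spilled
# partial aggregation states.
import json, os, tempfile
from collections import Counter

def external_group_by_count(rows, max_keys_in_memory=2):
    spill_files, counts = [], Counter()
    for key in rows:
        counts[key] += 1
        if len(counts) > max_keys_in_memory:      # "memory limit" reached
            fd, path = tempfile.mkstemp(suffix=".json")
            with os.fdopen(fd, "w") as f:
                json.dump(counts, f)              # spill partial state to disk
            spill_files.append(path)
            counts = Counter()
    for path in spill_files:                      # merge phase: combine partials
        with open(path) as f:
            counts.update(json.load(f))
        os.remove(path)
    return dict(counts)

print(external_group_by_count(["a", "b", "a", "c", "b", "a"]))
```

The merge phase is why spilling is slower: every spilled state is re-read and combined, trading extra I/O for a bounded working set.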

Step 4: Optimize GROUP BY

```sql
-- Use -If combinators instead of a filtered subquery:
SELECT
    sumIf(value, condition) AS conditional_sum,
    countIf(condition) AS conditional_count
FROM large_table
GROUP BY simple_key;

-- Pre-aggregate in a subquery, then combine the partial results:
SELECT sum(cnt)
FROM (
    SELECT count() AS cnt
    FROM large_table
    GROUP BY column
    SETTINGS max_bytes_before_external_group_by = 1000000000
);

-- Reduce the number of grouping keys where possible:
SELECT count() FROM large_table GROUP BY column1;
-- instead of GROUP BY column1, column2, column3

-- Use approximate aggregation:
SELECT uniq(column) FROM large_table;
-- instead of count(DISTINCT column): uniq() keeps a fixed-size sketch,
-- so its memory does not grow with cardinality
```
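To judge whether a GROUP BY will fit, a rough back-of-envelope estimate helps: memory grows with the number of unique keys times the per-key cost. The multiplier below is an assumption for hash-table overhead, not a ClickHouse formula.

```python
# Rough estimator (assumed model, not an official formula): in-memory
# GROUP BY needs roughly one hash-table entry per unique key.
def estimate_group_by_bytes(unique_keys, key_bytes, state_bytes, overhead=1.5):
    """Estimate peak hash-table memory for an in-memory aggregation."""
    return int(unique_keys * (key_bytes + state_bytes) * overhead)

# 100M unique String keys (~32 bytes each) with a sum() state (8 bytes):
est = estimate_group_by_bytes(100_000_000, 32, 8)
print(f"{est / 1024**3:.1f} GiB")  # ~5.6 GiB - well above a 2 GiB limit
```

Numbers like this explain why reducing grouping keys or switching to `uniq()` matters: both shrink the per-key term or the key count directly.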

Step 5: Optimize JOINs

```sql
-- ANY JOIN keeps only one match per key, which shrinks the hash table:
SELECT a.*, b.value
FROM table_a AS a
LEFT ANY JOIN table_b AS b ON a.key = b.key;

-- Put the smaller table on the right: the right-hand side is what
-- gets loaded into memory for a hash join.

-- Filter the join input before joining to reduce its size:
SELECT a.*, b.value
FROM (SELECT * FROM table_a WHERE condition) AS a
JOIN table_b AS b ON a.key = b.key;

-- Use a disk-friendly join algorithm for very large joins:
SELECT *
FROM large_table_a AS a
JOIN large_table_b AS b ON a.key = b.key
SETTINGS join_algorithm = 'grace_hash';

-- 'partial_merge' is another algorithm that can spill to disk:
-- SETTINGS join_algorithm = 'partial_merge'
```
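The grace hash join mentioned above keeps memory bounded by partitioning both inputs by hash, so only one bucket pair needs to be in memory at a time. A minimal Python sketch of the idea (illustrative, not ClickHouse's implementation):

```python
# Grace hash join sketch: partition both sides into buckets by key hash,
# then build-and-probe one bucket pair at a time.
from collections import defaultdict

def grace_hash_join(left, right, n_buckets=4):
    """Equi-join two lists of (key, value) pairs, bucket by bucket."""
    lbuckets, rbuckets = defaultdict(list), defaultdict(list)
    for k, v in left:                       # partition phase: in ClickHouse
        lbuckets[hash(k) % n_buckets].append((k, v))   # buckets go to disk
    for k, v in right:
        rbuckets[hash(k) % n_buckets].append((k, v))
    out = []
    for b in range(n_buckets):              # only one bucket pair in memory
        index = defaultdict(list)
        for k, v in rbuckets[b]:            # build hash index on right bucket
            index[k].append(v)
        for k, v in lbuckets[b]:            # probe with left bucket
            for rv in index[k]:
                out.append((k, v, rv))
    return out
```

Matching keys always hash into the same bucket, so joining bucket pairs independently produces the same result as one big hash join.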

Step 6: Use Projection

```sql
-- Create a projection for a common aggregation:
ALTER TABLE large_table ADD PROJECTION proj_by_column
(
    SELECT count(), sum(value) GROUP BY column
);

-- Materialize it for existing parts:
ALTER TABLE large_table MATERIALIZE PROJECTION proj_by_column;

-- Matching queries use the projection automatically:
SELECT count(), sum(value) FROM large_table GROUP BY column;
-- reads the pre-aggregated projection instead of raw data

-- Inspect projections (recent versions; older releases expose
-- system.projection_parts instead):
SELECT * FROM system.projections WHERE table = 'large_table';

-- Drop unused projections (they cost storage and merge time):
ALTER TABLE large_table DROP PROJECTION proj_by_column;
```

Step 7: Partition Query

```sql
-- Process one partition at a time:
SELECT count() FROM large_table
WHERE partition_key = 'partition_1'
GROUP BY column;

-- ...then the next partition:
SELECT count() FROM large_table
WHERE partition_key = 'partition_2'
GROUP BY column;

-- Or combine per-partition results (keep the grouping key so the
-- partial counts are merged per group, not summed globally):
SELECT column, sum(cnt)
FROM (
    SELECT column, count() AS cnt FROM large_table
    WHERE partition_key = 'p1' GROUP BY column
    UNION ALL
    SELECT column, count() AS cnt FROM large_table
    WHERE partition_key = 'p2' GROUP BY column
)
GROUP BY column;

-- Or query a smaller window of data:
SELECT count() FROM large_table
WHERE date BETWEEN '2024-01-01' AND '2024-01-31'
GROUP BY column;
-- smaller data window = less memory
```
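Driving the windowed approach from a script is straightforward. A hypothetical helper that splits a date range into month-sized chunks, each of which becomes one `WHERE date BETWEEN ...` query:

```python
# Hypothetical helper for the window-of-data approach: split a date
# range into month-sized chunks so each query touches less data.
from datetime import date, timedelta

def month_windows(start: date, end: date):
    """Yield (first_day, last_day) pairs covering [start, end] by month."""
    cur = start
    while cur <= end:
        # first day of the following month
        nxt = (cur.replace(day=1) + timedelta(days=32)).replace(day=1)
        yield cur, min(nxt - timedelta(days=1), end)
        cur = nxt

for lo, hi in month_windows(date(2024, 1, 15), date(2024, 3, 10)):
    print(f"WHERE date BETWEEN '{lo}' AND '{hi}'")
```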

Step 8: Check Query Profile

```bash
# Make sure query logging is enabled (it is by default):
clickhouse-client --query "SET log_queries = 1"

# Run the problematic query, then inspect the query log:
clickhouse-client --query "
SELECT query, memory_usage, read_rows
FROM system.query_log
WHERE query LIKE '%large_table%' AND type = 'QueryFinish'
ORDER BY event_time DESC LIMIT 5"

# Find the most memory-hungry queries
# (memory_usage in system.query_log records the query's peak):
clickhouse-client --query "
SELECT query, memory_usage
FROM system.query_log
WHERE type = 'QueryFinish' AND memory_usage > 1000000000
ORDER BY memory_usage DESC LIMIT 10"

# For a breakdown, check ProfileEvents in system.query_log:
# events such as ExternalAggregationWritePart show when an
# aggregation actually spilled to disk
```

Step 9: Optimize Table Structure

```sql
-- Check the table structure:
SHOW CREATE TABLE large_table;

-- Make ORDER BY match common GROUP BY keys so rows arrive
-- pre-sorted by the grouping key:
CREATE TABLE large_table
(
    date Date,
    key String,
    value UInt64
)
ENGINE = MergeTree
ORDER BY (date, key);

-- PRIMARY KEY is a sparse index kept in memory; it must be a
-- prefix of ORDER BY, so keep it short:
CREATE TABLE large_table
(
    date Date,
    key String,
    value UInt64
)
ENGINE = MergeTree
PRIMARY KEY (date, key)
ORDER BY (date, key, value);

-- Use per-column compression codecs; better compression means
-- less data read per query:
CREATE TABLE large_table
(
    date Date,
    key String,
    value UInt64 CODEC(ZSTD(3))
)
ENGINE = MergeTree
ORDER BY (date, key);

-- Merge parts for better query performance:
OPTIMIZE TABLE large_table FINAL;
```
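Why keep the primary key short? The sparse index stores one mark per `index_granularity` rows, and every mark holds the primary-key columns in memory. A back-of-envelope estimate (an assumed model, not an official formula):

```python
# Rough estimate of sparse primary-index memory: one mark every
# index_granularity rows, each holding the primary-key column bytes.
def sparse_index_bytes(total_rows, key_bytes, index_granularity=8192):
    marks = -(-total_rows // index_granularity)  # ceiling division
    return marks * key_bytes

# 10 billion rows with a ~34-byte key (Date + short String):
print(sparse_index_bytes(10_000_000_000, 34))  # ~41.5 MB resident
```

Small at this scale, but the figure grows linearly with both row count and key width, which is why wide String-heavy primary keys are worth avoiding.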

Step 10: Monitor Memory Usage

```bash
# Create a monitoring script:
cat << 'EOF' > /usr/local/bin/monitor-clickhouse.sh
#!/bin/bash

echo "=== Current Memory Usage ==="
clickhouse-client --query "
SELECT metric, value / 1024 / 1024 AS value_mb
FROM system.asynchronous_metrics
WHERE metric LIKE '%Memory%'"

echo ""
echo "=== Active Queries Memory ==="
clickhouse-client --query "
SELECT query_id, substring(query, 1, 50) AS query,
       memory_usage / 1024 / 1024 AS memory_mb
FROM system.processes
WHERE memory_usage > 100000000"

echo ""
echo "=== Memory Settings ==="
clickhouse-client --query "
SELECT name, value FROM system.settings
WHERE name LIKE '%memory%' OR name LIKE '%bytes_before_external%'"
EOF

chmod +x /usr/local/bin/monitor-clickhouse.sh

# Prometheus metrics (if the Prometheus endpoint is enabled):
curl http://localhost:9363/metrics | grep -i memory

# Key metrics:
# ClickHouseMetrics_MemoryTracking - memory tracked by the server

# Example alert rule (metric names may vary by version/exporter):
# - alert: ClickHouseMemoryLimit
#   expr: ClickHouseMetrics_MemoryTracking > 0.9 * ClickHouseMetrics_MaxMemoryUsage
#   for: 5m
#   labels:
#     severity: warning
#   annotations:
#     summary: "ClickHouse approaching memory limit"
```
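The alert's logic, written out in plain Python against Prometheus-style text output (the metric names are taken from the alert above and may differ in your exporter):

```python
# Sketch of the alert condition: parse "name value" metric lines and
# warn when tracked memory exceeds 90% of the limit metric.
def memory_alert(metrics_text, ratio=0.9):
    values = {}
    for line in metrics_text.splitlines():
        if line and not line.startswith("#"):   # skip HELP/TYPE comments
            name, _, value = line.partition(" ")
            values[name] = float(value)
    tracked = values.get("ClickHouseMetrics_MemoryTracking", 0.0)
    limit = values.get("ClickHouseMetrics_MaxMemoryUsage", float("inf"))
    return tracked > ratio * limit
```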

ClickHouse Memory Limit Checklist

| Check | Command | Expected |
|---|---|---|
| Memory limit | `system.settings` | Adequate for workload |
| External aggregation | `max_bytes_before_external_group_by` | Enabled (> 0) |
| Query memory | `system.processes` | Within limit |
| GROUP BY keys | query analysis | Minimized |
| JOIN size | query analysis | Filtered before join |
| Projections | `system.projections` | Available for hot queries |

Verify the Fix

```bash
# After adjusting memory settings

# 1. Run the problematic query
clickhouse-client --query "
SELECT count() FROM large_table GROUP BY column
SETTINGS max_memory_usage = 4000000000"
# Query completes successfully

# 2. Check the memory it used
clickhouse-client --query "
SELECT query, memory_usage FROM system.query_log
WHERE query LIKE '%large_table%' AND type = 'QueryFinish'
ORDER BY event_time DESC LIMIT 1"
# Memory within limit

# 3. Test external aggregation
clickhouse-client --query "
SELECT count() FROM large_table GROUP BY column
SETTINGS max_bytes_before_external_group_by = 1000000000"
# Spills to disk instead of failing

# 4. Verify projections are used
clickhouse-client --query "
SELECT count(), sum(value) FROM large_table GROUP BY column"
# Uses the projection, runs faster

# 5. Monitor ongoing queries
clickhouse-client --query "SELECT query, memory_usage FROM system.processes"
# No queries near the limit

# 6. Check server stability
free -h
# Sufficient memory available
```

  • [Fix Elasticsearch Query Taking Too Long](/articles/fix-elasticsearch-query-taking-too-long)
  • [Fix MongoDB Aggregation Memory Limit](/articles/fix-mongodb-aggregation-memory-limit)
  • [Fix Redis Memory OOM](/articles/fix-redis-memory-oom)