What's Actually Happening
ClickHouse enforces per-query and server-wide memory limits. When a query would exceed its limit, ClickHouse aborts it with a memory allocation error rather than letting the server run out of memory.
The Error You'll See
```sql
SELECT * FROM large_table GROUP BY user_id;
```

```
Error: Memory limit (total) exceeded: would use 12.0 GiB (attempt to allocate chunk of 42.00 MiB), maximum: 10.0 GiB
```

Memory tracking error:

```
Error: Memory limit (for query) exceeded: would use 2.0 GiB, maximum: 1.0 GiB
```

OOM during aggregation:

```
Error: Allocator: Cannot mmap 128.00 MiB., errno: 12
```

Query killed:

```
Error: Query was cancelled
```

Why This Happens
1. Large GROUP BY - many unique keys means a large in-memory hash table
2. JOIN without optimization - the right-hand table is loaded into memory
3. Sorting - ORDER BY on a large result set
4. Window functions - memory-intensive operations
5. Subqueries - intermediate results are held in memory
6. Memory limit too low - insufficient allocation for the workload
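For intuition on the first cause: a GROUP BY keeps one hash-table entry per unique key, so memory scales with cardinality. A back-of-envelope sketch (the per-key byte count is an assumed ballpark covering key, aggregate state, and hash-table overhead, not a ClickHouse constant):

```python
def groupby_memory_estimate(unique_keys: int, bytes_per_key: int = 100) -> float:
    """Rough hash-table memory for an in-memory aggregation, in GiB.

    bytes_per_key is an assumed ballpark, not a measured ClickHouse figure.
    """
    return unique_keys * bytes_per_key / 1024**3

# 500 million unique user_ids at ~100 bytes each:
print(f"{groupby_memory_estimate(500_000_000):.1f} GiB")  # well past a 10 GiB limit
```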
Step 1: Check Current Memory Limits
```sql
-- Check server memory settings:
SELECT * FROM system.settings
WHERE name LIKE '%memory%' OR name LIKE '%max_bytes%';

-- Key settings:
--   max_memory_usage                 - per-query limit (default 10GB)
--   max_memory_usage_for_all_queries - total limit across all queries
--   max_bytes_before_external_group_by
--   max_bytes_before_external_sort

-- Check current memory usage:
SELECT metric, value FROM system.metrics WHERE metric LIKE '%Memory%';

-- Check query memory consumption:
SELECT query, memory_usage, read_rows, read_bytes
FROM system.query_log
WHERE type = 'QueryFinish'
ORDER BY memory_usage DESC
LIMIT 10;

-- Check currently running queries:
SELECT query_id, query, memory_usage, elapsed
FROM system.processes
ORDER BY memory_usage DESC;
```
Step 2: Adjust Memory Limits
```sql
-- Increase the per-query memory limit for the current session:
SET max_memory_usage = 20000000000; -- 20GB

-- Set it for a single query:
SELECT * FROM large_table GROUP BY user_id
SETTINGS max_memory_usage = 20000000000;

-- Unlimited (dangerous -- can OOM the server):
SET max_memory_usage = 0;

-- Check the setting was applied:
SELECT name, value FROM system.settings WHERE name = 'max_memory_usage';
```

To make the limit permanent, set it in a settings profile (profiles live in /etc/clickhouse-server/users.xml, not config.xml):

```xml
<profiles>
    <default>
        <max_memory_usage>20000000000</max_memory_usage>
        <max_memory_usage_for_all_queries>50000000000</max_memory_usage_for_all_queries>
    </default>
</profiles>
```
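The raw byte literals above are easy to get wrong by an order of magnitude. A small helper for generating them (a hypothetical convenience function, not part of any ClickHouse client library):

```python
UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9,
         "KIB": 2**10, "MIB": 2**20, "GIB": 2**30}

def to_bytes(size: str) -> int:
    """Convert '20GB' or '10GiB' into the byte count SET expects."""
    num = size.rstrip("BbIiKkMmGg")        # strip the unit suffix
    unit = size[len(num):].upper()
    return int(float(num) * UNITS[unit])

print(f"SET max_memory_usage = {to_bytes('20GB')};")   # 20000000000
print(f"SET max_memory_usage = {to_bytes('10GiB')};")  # 10737418240
```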
Step 3: Enable External Group By
```sql
-- Spill GROUP BY state to disk once it crosses a threshold:
SET max_bytes_before_external_group_by = 10000000000; -- 10GB
SET max_memory_usage = 15000000000;                   -- 15GB, headroom for merging spills

-- Run the query with external GROUP BY:
SELECT user_id, COUNT(*) AS cnt
FROM large_table
GROUP BY user_id
SETTINGS max_bytes_before_external_group_by = 10000000000;

-- Memory-efficient distributed aggregation:
SET distributed_aggregation_memory_efficient = 1;
SET aggregation_memory_efficient_merge_threads = 4;

-- Check the temp directory used for spills:
SELECT * FROM system.settings WHERE name = 'tmp_path';
-- Default: /var/lib/clickhouse/tmp/
-- Make sure the disk backing it has enough free space for spills.
```
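The two settings interact: the spill threshold has to leave headroom under max_memory_usage for merging spilled chunks. A common rule of thumb (a tuning heuristic, not a hard requirement) is to spill at roughly half the per-query limit:

```python
def spill_threshold(max_memory_usage: int) -> int:
    """Suggested max_bytes_before_external_group_by: about half of
    max_memory_usage, leaving headroom to merge spilled data.
    A rule-of-thumb starting point -- tune for your workload."""
    return max_memory_usage // 2

limit = 20_000_000_000
print(f"SET max_memory_usage = {limit};")
print(f"SET max_bytes_before_external_group_by = {spill_threshold(limit)};")
```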
Step 4: Optimize GROUP BY
```sql
-- Reduce the number of GROUP BY keys.
-- BAD: three high-cardinality keys at once
SELECT user_id, session_id, event_id, COUNT(*)
FROM events
GROUP BY user_id, session_id, event_id;

-- GOOD: pre-aggregate in stages
SELECT user_id, COUNT(*) AS events
FROM (
    SELECT user_id, session_id, COUNT(*) AS session_events
    FROM events
    GROUP BY user_id, session_id
    SETTINGS max_bytes_before_external_group_by = 10000000000
)
GROUP BY user_id;

-- Use -State aggregate functions for incremental aggregation:
SELECT user_id, uniqState(event_id) AS events_state
FROM events
GROUP BY user_id;

-- Then merge the states later:
SELECT user_id, uniqMerge(events_state)
FROM state_table
GROUP BY user_id;

-- Prefer approximate aggregation where exact counts aren't required:
SELECT user_id,
    uniqExact(event_id) AS exact_count, -- exact, memory-intensive
    uniq(event_id)      AS approx_count -- approximate, far less memory
FROM events
GROUP BY user_id;
```
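The uniqExact-versus-uniq trade-off is exact-set counting versus a fixed-size sketch. A minimal Python illustration (the byte-array "sketch" is a simplified stand-in; ClickHouse's uniq uses its own adaptive sampling structure):

```python
import sys

# Exact distinct count: container memory grows with cardinality
# (like uniqExact), and that's before counting the keys themselves.
exact = {f"event_{i}" for i in range(100_000)}
print(f"exact set: {sys.getsizeof(exact) / 2**20:.1f} MiB of table for {len(exact)} keys")

# Sketch-style counter: fixed memory regardless of cardinality (like uniq).
# 2^14 one-byte registers, HyperLogLog-style -- a simplified stand-in.
registers = bytearray(2**14)
print(f"sketch: {len(registers) / 2**10:.0f} KiB, independent of input size")
```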
Step 5: Optimize JOINs
```sql
-- ClickHouse builds the right-hand table of a JOIN in memory,
-- so keep the right table small.

-- BAD: large table on the right
SELECT l.* FROM large_table l
JOIN large_table r ON l.id = r.id;

-- GOOD: small table on the right
SELECT l.* FROM large_table l
JOIN small_table r ON l.id = r.id;

-- Use ANY LEFT JOIN if one match per key is acceptable:
SELECT l.* FROM large_table l
ANY LEFT JOIN small_table r ON l.id = r.id;

-- Switch to a disk-friendly join algorithm:
SET join_algorithm = 'grace_hash';
SET grace_hash_join_initial_buckets = 8;

-- Raise the join memory limit:
SET max_bytes_in_join = 5000000000; -- 5GB

-- Use IN instead of JOIN when only filtering:
SELECT * FROM large_table
WHERE user_id IN (SELECT user_id FROM active_users);
```
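Why the right table must be small becomes obvious from a hash-join sketch: the build side is materialized entirely in a hash map, while the probe side streams through (illustrative toy data, not ClickHouse internals):

```python
# Hash-join sketch: the build side (right table) is held entirely in a
# hash map, which is why ClickHouse wants the smaller table on the right.
left = [(1, "click"), (2, "view"), (1, "buy")]   # large side, streamed row by row
right = [(1, "alice"), (2, "bob")]               # small side, built in memory

build = {user_id: name for user_id, name in right}   # memory ~ size of right table
joined = [(uid, ev, build[uid]) for uid, ev in left if uid in build]
print(joined)  # [(1, 'click', 'alice'), (2, 'view', 'bob'), (1, 'buy', 'alice')]
```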
Step 6: Optimize ORDER BY
```sql
-- Enable external sort (spills to disk past the threshold):
SET max_bytes_before_external_sort = 10000000000; -- 10GB

-- Avoid sorting a full large result set.
-- BAD:
SELECT * FROM large_table ORDER BY created_at;

-- GOOD: ORDER BY with LIMIT needs far less memory
SELECT * FROM large_table
ORDER BY created_at
LIMIT 10000;

-- Pre-filter in a subquery before the expensive sort:
SELECT * FROM (
    SELECT * FROM large_table
    ORDER BY user_id
    LIMIT 1000000
)
ORDER BY created_at
LIMIT 100;
```
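The reason ORDER BY with LIMIT is so much cheaper: a top-n query only needs a bounded heap of n rows, not a full sort of the result set. A sketch of that mechanism (toy data; this illustrates the general top-n technique, not ClickHouse's exact implementation):

```python
import heapq
import random

# ORDER BY ... LIMIT 10 only needs memory for ~10 rows, however large
# the input: a bounded heap replaces a full in-memory sort.
random.seed(42)
rows = ((random.random(), i) for i in range(1_000_000))  # streamed, never materialized
top10 = heapq.nsmallest(10, rows)                        # memory ~ 10 rows
print(len(top10))  # 10
```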
Step 7: Use Projection and Materialized Views
```sql
-- Create a projection for common query patterns:
ALTER TABLE events ADD PROJECTION events_by_user (
    SELECT * ORDER BY user_id, event_time
);

-- Build the projection for existing data:
ALTER TABLE events MATERIALIZE PROJECTION events_by_user;

-- Use a materialized view for pre-aggregation:
CREATE MATERIALIZED VIEW events_daily_mv
ENGINE = SummingMergeTree()
ORDER BY (event_date, user_id)
AS SELECT
    toDate(event_time) AS event_date,
    user_id,
    COUNT(*) AS event_count
FROM events
GROUP BY event_date, user_id;

-- Query the pre-aggregated data:
SELECT event_date, SUM(event_count) AS total
FROM events_daily_mv
GROUP BY event_date;

-- Much less memory than scanning the raw table:
SELECT toDate(event_time) AS event_date, COUNT(*) AS total
FROM events
GROUP BY event_date;
```
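The memory win of the materialized view comes from rolling raw events up to one row per (day, user) at insert time, so later aggregations touch far fewer rows. A toy sketch of that rollup (hypothetical data):

```python
from collections import Counter
from datetime import date

# Materialized-view effect: roll raw events up to one row per (day, user).
raw_events = [
    (date(2024, 1, 1), 101), (date(2024, 1, 1), 101),
    (date(2024, 1, 1), 202), (date(2024, 1, 2), 101),
]
daily = Counter((d, u) for d, u in raw_events)   # the "events_daily_mv" rollup

# A daily-total query now aggregates len(daily) rows, not len(raw_events).
print(len(daily), "rollup rows vs", len(raw_events), "raw rows")
```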
Step 8: Optimize Table Structure
```sql
-- Check the table engine and sorting key:
SHOW CREATE TABLE large_table;

-- Use an appropriate ORDER BY (sorting) key:
-- data is stored sorted by it, so queries filtering on it are efficient.

-- Partition large tables:
CREATE TABLE events (
    event_time DateTime,
    user_id UInt64,
    event_type String
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_time)
ORDER BY (event_time, user_id);

-- Queries with a partition-key filter only scan the matching partitions:
SELECT * FROM events
WHERE event_time BETWEEN '2024-01-01' AND '2024-01-31';
-- Only scans the January 2024 partition

-- Use the narrowest data types that fit:
-- smaller types = less memory
-- UInt32 vs UInt64, LowCardinality(String) vs String
```
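The data-type point is simple arithmetic: a 32-bit column costs half the memory of a 64-bit one for every row it touches. A quick check with Python's fixed-width arrays as stand-ins for columns:

```python
from array import array

# Narrower column types roughly halve memory per column:
n = 1_000_000
u32 = array("I", range(n))   # UInt32-style column (4 bytes/value on common platforms)
u64 = array("Q", range(n))   # UInt64-style column (8 bytes/value)
print(u32.itemsize * n, "vs", u64.itemsize * n, "bytes")  # typically 4 MB vs 8 MB
```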
Step 9: Monitor and Debug Memory
```sql
-- Memory usage of finished queries:
SELECT query_id,
    substring(query, 1, 100) AS query_short,
    memory_usage,
    peak_memory_usage,
    read_rows,
    read_bytes
FROM system.query_log
WHERE type = 'QueryFinish' AND event_date = today()
ORDER BY memory_usage DESC
LIMIT 10;

-- Memory by user:
SELECT user,
    sum(memory_usage) AS total_memory,
    count() AS query_count
FROM system.query_log
WHERE type = 'QueryFinish'
GROUP BY user
ORDER BY total_memory DESC;

-- Server-level memory metrics:
SELECT metric, value FROM system.asynchronous_metrics
WHERE metric LIKE '%Memory%';

-- Currently running queries:
SELECT query_id, substring(query, 1, 50), memory_usage, elapsed
FROM system.processes
ORDER BY memory_usage DESC;

-- Kill a memory-heavy query:
KILL QUERY WHERE query_id = 'xxx';
```
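When triaging, it helps to flag queries running close to the limit before they fail. A small sketch over (query_id, memory_usage) rows as you might pull them from system.query_log (the data and the 90% threshold are illustrative assumptions):

```python
# Triage sketch: flag queries near the memory limit as kill/rewrite candidates.
rows = [("q1", 9_500_000_000), ("q2", 120_000_000), ("q3", 4_800_000_000)]
limit = 5_000_000_000  # e.g. the per-query max_memory_usage

offenders = sorted((r for r in rows if r[1] > limit * 0.9),
                   key=lambda r: r[1], reverse=True)
print(offenders)  # [('q1', 9500000000), ('q3', 4800000000)]
```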
Step 10: ClickHouse Memory Verification Script
```bash
# Create the verification script:
cat << 'EOF' > /usr/local/bin/check-clickhouse-memory.sh
#!/bin/bash
HOST=${1:-"localhost"}
PORT=${2:-"9000"}

echo "=== Memory Settings ==="
clickhouse-client -h "$HOST" --port "$PORT" -q "
    SELECT name, value FROM system.settings
    WHERE name LIKE '%memory%' OR name LIKE '%max_bytes%'
    ORDER BY name"

echo ""
echo "=== Memory Metrics ==="
clickhouse-client -h "$HOST" --port "$PORT" -q "
    SELECT metric, value FROM system.metrics
    WHERE metric LIKE '%Memory%'"

echo ""
echo "=== Top Memory Queries ==="
clickhouse-client -h "$HOST" --port "$PORT" -q "
    SELECT query_id, substring(query, 1, 60) AS query, memory_usage, read_rows
    FROM system.query_log
    WHERE type = 'QueryFinish' AND event_date = today()
    ORDER BY memory_usage DESC
    LIMIT 5"

echo ""
echo "=== Active Queries ==="
clickhouse-client -h "$HOST" --port "$PORT" -q "
    SELECT query_id, substring(query, 1, 50) AS query, memory_usage, elapsed
    FROM system.processes
    ORDER BY memory_usage DESC"

echo ""
echo "=== Disk Space ==="
df -h /var/lib/clickhouse

echo ""
echo "=== Recommendations ==="
echo "1. Increase max_memory_usage if needed"
echo "2. Enable external GROUP BY with max_bytes_before_external_group_by"
echo "3. Enable external sort with max_bytes_before_external_sort"
echo "4. Use an appropriate JOIN order (small table on the right)"
echo "5. Add LIMIT to ORDER BY queries"
echo "6. Use materialized views for pre-aggregation"
echo "7. Check that partition pruning is working"
EOF

chmod +x /usr/local/bin/check-clickhouse-memory.sh

# Usage:
/usr/local/bin/check-clickhouse-memory.sh localhost 9000
```
ClickHouse Memory Checklist
| Check | Expected |
|---|---|
| Memory limit | Adequate for queries |
| External GROUP BY | Enabled for large ops |
| JOIN order | Small table right |
| ORDER BY | Has LIMIT |
| Partitions | Query hits specific partition |
| Aggregates | Using uniq() instead of uniqExact() |
| Projections | Created for common queries |
Verify the Fix
After fixing ClickHouse memory issues:

```sql
-- 1. Confirm the setting was applied:
SELECT name, value FROM system.settings WHERE name = 'max_memory_usage';
-- Shows the increased value

-- 2. Re-run the previously failing query:
SELECT user_id, COUNT(*) FROM large_table GROUP BY user_id;
-- Completes successfully

-- 3. Check its memory usage:
SELECT query_id, memory_usage FROM system.query_log
WHERE type = 'QueryFinish'
ORDER BY event_time DESC
LIMIT 1;
-- memory_usage stays within the limit

-- 4. Verify external aggregation/sort kicked in:
-- the server log records spilling to disk when it happens

-- 5. Monitor ongoing memory use:
SELECT sum(memory_usage) FROM system.processes;
-- Within an acceptable range

-- 6. Check query performance:
SELECT query, elapsed FROM system.query_log
WHERE type = 'QueryFinish'
ORDER BY event_time DESC
LIMIT 5;
-- Queries complete faster
```
Related Issues
- [Fix PostgreSQL Connection Pool Exhausted](/articles/fix-postgresql-connection-pool-exhausted)
- [Fix MySQL Slow Query](/articles/fix-mysql-slow-query)
- [Fix Elasticsearch Query Timeout](/articles/fix-elasticsearch-query-timeout)