# How to Fix Elasticsearch Heap Size Issues
Your Elasticsearch nodes are either crashing with OutOfMemoryError or performing poorly due to heap misconfiguration. Getting the JVM heap size right is crucial for stability and performance.
## Recognizing Heap Size Problems

### Signs of Undersized Heap

You'll see these symptoms when heap is too small:

OutOfMemoryError in logs:

```
java.lang.OutOfMemoryError: Java heap space
[ERROR][o.e.b.HierarchyCircuitBreakerService] circuit breaker triggered
```

Circuit breaker trips frequently:

```bash
curl -X GET "localhost:9200/_nodes/stats/breaker?pretty"
```

```json
{
  "breakers" : {
    "parent" : {
      "tripped" : 150
    }
  }
}
```

Constantly high heap usage:

```bash
curl -X GET "localhost:9200/_nodes/stats/jvm?pretty"
```

```json
{
  "jvm" : {
    "mem" : {
      "heap_used_percent" : 95
    }
  }
}
```

### Signs of Oversized Heap
Counterintuitively, too much heap also causes problems:
Long garbage collection pauses:

```
[WARN ][o.e.m.j.JvmMonitorService] [node-1] detected slow gc [G1 Young Generation] duration [1.5s]
[WARN ][o.e.m.j.JvmMonitorService] [node-1] detected slow gc [G1 Old Generation] duration [15s]
```

Node becomes unresponsive during GC (note the cumulative old-generation collection time in the stats output, abridged here):

```bash
curl -X GET "localhost:9200/_nodes/stats/jvm?pretty"
```

```json
{
  "gc" : {
    "collectors" : {
      "old" : {
        "collection_time_in_millis" : 450000
      }
    }
  }
}
```

Queries time out intermittently.
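A rough way to judge whether that collection time is excessive is to compare it against JVM uptime (`jvm.uptime_in_millis` comes back in the same stats response). A minimal sketch, with the percentage computed in plain shell integer arithmetic; sustained values above a few percent suggest the heap or GC configuration needs attention:

```shell
#!/bin/sh
# Rough GC-overhead estimate: percentage of JVM uptime spent in
# old-generation collections. Inputs are the two millisecond counters
# from _nodes/stats/jvm.
gc_overhead_pct() {
  gc_ms=$1; uptime_ms=$2
  # Integer math: multiply first to avoid truncating to zero.
  echo $(( gc_ms * 100 / uptime_ms ))
}

# Using the numbers above: 450000 ms of old GC over 5 hours of uptime.
gc_overhead_pct 450000 18000000   # -> 2
```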
## Understanding JVM Heap
The JVM heap stores:
- Lucene index segments (in-memory structures)
- Field data cache for aggregations
- Query cache for repeated searches
- Request memory for active operations
- Cluster state metadata
Lucene also uses off-heap memory: index segments on disk are memory-mapped and served from the operating system's file cache. This is important: the heap is not the only memory Elasticsearch uses.
## Checking Current Heap Configuration

```bash
curl -X GET "localhost:9200/_nodes/stats/jvm?pretty"
```

```json
{
  "nodes" : {
    "node-1" : {
      "jvm" : {
        "mem" : {
          "heap_used_in_bytes" : 8000000000,
          "heap_max_in_bytes" : 16000000000,
          "heap_used_percent" : 50
        }
      }
    }
  }
}
```

Check the JVM options file:

```bash
cat /etc/elasticsearch/jvm.options | grep "Xm"
```

```
-Xms8g
-Xmx8g
```

Or in newer versions:

```bash
cat /etc/elasticsearch/jvm.options.d/heap.options
```

## The Heap Size Rule
Elasticsearch recommends:

1. Set minimum and maximum to the same value (`-Xms` = `-Xmx`)
2. Don't exceed 50% of physical RAM
3. Don't exceed 31GB (compressed OOPs threshold)
The 50% rule leaves memory for:
- Lucene off-heap segment memory
- Operating system file cache
- Network buffers
- Other processes
The 31GB limit preserves compressed ordinary object pointers (OOPs). Below roughly 31GB, the JVM uses 32-bit compressed pointers, saving memory. Above that threshold it switches to 64-bit pointers, so a 32GB heap can actually hold less data than a 31GB one.
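You don't have to guess where the threshold falls on your JVM: `-XX:+PrintFlagsFinal` is a standard HotSpot flag that shows the effective value of `UseCompressedOops` at a given `-Xmx`. A sketch that assumes a JDK may or may not be on the PATH and degrades gracefully:

```shell
#!/bin/sh
# Print whether HotSpot enables compressed OOPs at a given -Xmx.
# Try 31g vs 32g to watch the flag flip from true to false.
check_oops() {
  if command -v java >/dev/null 2>&1; then
    java -Xmx"$1" -XX:+PrintFlagsFinal -version 2>/dev/null \
      | grep -w UseCompressedOops
  else
    echo "java not found; run this on the Elasticsearch host"
  fi
}

check_oops 31g
```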
## Solution 1: Increase Heap Size

For undersized heap, increase it:

```bash
# Edit /etc/elasticsearch/jvm.options
# Change:
-Xms4g
-Xmx4g
# To:
-Xms8g
-Xmx8g
```

For Elasticsearch 7.7 and later, prefer a custom options file in `jvm.options.d` (the `ES_JAVA_OPTS` environment variable also works):

```bash
# Create a custom options file
echo '-Xms8g' > /etc/elasticsearch/jvm.options.d/heap.options
echo '-Xmx8g' >> /etc/elasticsearch/jvm.options.d/heap.options
```

Restart Elasticsearch:

```bash
systemctl restart elasticsearch
```

Verify the change:

```bash
curl -X GET "localhost:9200/_nodes/stats/jvm?filter_path=nodes.*.jvm.mem.heap_max_in_bytes&pretty"
```

## Solution 2: Decrease Heap Size
For oversized heap (causing GC pauses), reduce it:
```
# From 32GB down to 24GB
-Xms24g
-Xmx24g
```

This keeps you under the 31GB compressed OOPs threshold and leaves more RAM for the OS file cache.

Monitor GC after the change:

```bash
curl -X GET "localhost:9200/_nodes/stats/jvm?filter_path=nodes.*.jvm.gc&pretty"
```

## Solution 3: Change Garbage Collector
For large heaps, G1GC is the default and usually best. But for specific cases, you might adjust:
```
# In jvm.options
-XX:+UseG1GC
-XX:G1HeapRegionSize=32m
-XX:InitiatingHeapOccupancyPercent=30
```

`InitiatingHeapOccupancyPercent` controls when G1 starts a concurrent marking cycle. Lower values trigger marking earlier, reducing the chance of long old-generation pauses.

Do NOT use the legacy CMS collector with modern Elasticsearch (it was removed entirely in JDK 14):

```
# Don't use this
# -XX:+UseConcMarkSweepGC
```

## Solution 4: Configure Heap Dump on OOM
To diagnose OOM errors, enable heap dumps:
```
# Add to jvm.options
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/var/log/elasticsearch/heapdump.hprof
```

Analyze the dump with tools like Eclipse Memory Analyzer (MAT) or VisualVM.
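You also don't have to wait for an OOM: if a JDK is installed on the node, `jmap` can capture a dump on demand. A sketch; the process-name pattern and dump path are assumptions you should adjust for your install:

```shell
#!/bin/sh
# Capture an on-demand heap dump from a running Elasticsearch JVM.
# Assumes pgrep and a JDK (for jmap) are available on the node.
dump_es_heap() {
  pid=$(pgrep -f org.elasticsearch.bootstrap.Elasticsearch | head -n 1)
  if [ -z "$pid" ]; then
    echo "no Elasticsearch process found"
  elif ! command -v jmap >/dev/null 2>&1; then
    echo "jmap not found; install a JDK on this node"
  else
    # live: dump only reachable objects (triggers a full GC first)
    jmap -dump:live,format=b,file=/tmp/es-heap.hprof "$pid"
  fi
}

dump_es_heap
```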
Solution 5: Reduce Memory Usage
If you can't increase heap, reduce memory consumption:
Clear caches:
curl -X POST "localhost:9200/_cache/clear"Reduce field data cache:
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
"transient": {
"indices.fielddata.cache.size": "20%"
}
}
'Reduce query cache:
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
"transient": {
"indices.queries.cache.size": "5%"
}
}
'Use doc_values instead of fielddata:
Doc values are stored on disk, not in heap:
curl -X PUT "localhost:9200/your-index/_mapping" -H 'Content-Type: application/json' -d'
{
"properties": {
"category": {
"type": "keyword",
"doc_values": true
}
}
}
'Solution 6: Scale Horizontally
If you're hitting heap limits constantly, add nodes:
```bash
curl -X GET "localhost:9200/_cat/nodes?v&h=name,heap.percent"
```

```
name   heap.percent
node-1 85
node-2 90
node-3 82
```

New nodes will receive shards, distributing the load.
## Heap Size Calculator

Use this formula:

```
Recommended Heap = min(50% of RAM, 31GB)
```

Examples:

- 16GB RAM server: Heap = 8GB
- 32GB RAM server: Heap = 16GB (not 32GB; leave the rest for the OS)
- 64GB RAM server: Heap = 31GB (compressed OOPs limit)
- 128GB RAM server: Heap = 31GB (still capped at 31GB)
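The formula can be sketched as a small shell helper (the function name is mine, purely illustrative):

```shell
#!/bin/sh
# Recommended heap in GB for a given amount of physical RAM (GB):
# min(50% of RAM, 31) -- capped at the compressed-OOPs ceiling.
recommended_heap_gb() {
  half=$(( $1 / 2 ))
  if [ "$half" -gt 31 ]; then
    echo 31
  else
    echo "$half"
  fi
}

recommended_heap_gb 16    # -> 8
recommended_heap_gb 64    # -> 31
```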
For memory-heavy workloads (many aggregations, large fielddata):
Heap = 40-50% of RAM (up to 31GB)For search-heavy workloads (Lucene dominant):
Heap = 20-30% of RAM (more for OS file cache)Verification Steps
After changing heap size:
1. Check that the JVM settings applied:

   ```bash
   curl -X GET "localhost:9200/_nodes/stats/jvm?filter_path=nodes.*.jvm.mem&pretty"
   ```

2. Monitor heap usage over time:

   ```bash
   watch -n 10 'curl -s localhost:9200/_nodes/stats/jvm?filter_path=nodes.*.jvm.mem.heap_used_percent | jq'
   ```

3. Check GC behavior:

   ```bash
   curl -X GET "localhost:9200/_nodes/stats/jvm?filter_path=nodes.*.jvm.gc&pretty"
   ```

4. Run a workload test with your real queries. (There is no built-in `_bench` API in current Elasticsearch; use a benchmarking tool such as Rally or replay production traffic.)

## Common Heap Size Mistakes
| Mistake | Problem | Fix |
|---|---|---|
| Setting max > min | Heap resize overhead during runtime | Set -Xms = -Xmx |
| Exceeding 31GB | Lost compressed OOPs, slower GC | Cap at 31GB or slightly below |
| Using 100% of RAM | No room for OS/Lucene | Use 50% max |
| Different sizes per node | Inconsistent behavior | Standardize config |
| Ignoring physical RAM | Server-specific issues | Configure per server |
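The first mistake in the table is easy to catch mechanically. A sketch of such a check; the inline options text stands in for the contents of `/etc/elasticsearch/jvm.options`:

```shell
#!/bin/sh
# Flag a jvm.options snippet where -Xms and -Xmx differ,
# which causes heap-resize overhead at runtime.
check_heap_flags() {
  xms=$(printf '%s\n' "$1" | sed -n 's/^-Xms//p')
  xmx=$(printf '%s\n' "$1" | sed -n 's/^-Xmx//p')
  if [ "$xms" = "$xmx" ]; then
    echo "OK: -Xms and -Xmx both $xms"
  else
    echo "WARN: -Xms ($xms) != -Xmx ($xmx) -- heap will resize at runtime"
  fi
}

opts='-Xms4g
-Xmx8g'
check_heap_flags "$opts"   # prints a WARN line
```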
Heap Monitoring Dashboard
Create a monitoring script:
```bash
#!/bin/bash
echo "=== Elasticsearch Heap Monitor ==="
while true; do
  echo "$(date)"
  curl -s "localhost:9200/_nodes/stats/jvm" | jq '
    .nodes | to_entries[] | {
      node: .value.name,
      heap_percent: .value.jvm.mem.heap_used_percent,
      heap_max_gb: (.value.jvm.mem.heap_max_in_bytes / 1073741824 | floor),
      gc_old_time_s: (.value.jvm.gc.collectors.old.collection_time_in_millis / 1000 | floor)
    }
  '
  echo ""
  sleep 30
done
```

## Summary
Configure Elasticsearch heap correctly by:
1. Setting -Xms and -Xmx to the same value
2. Using at most 50% of physical RAM
3. Not exceeding 31GB, to preserve compressed OOPs
4. Monitoring heap usage and GC behavior
5. Adjusting based on workload type
6. Scaling horizontally when you keep hitting heap limits
Proper heap configuration prevents both OOM crashes and GC-induced latency. Monitor continuously and adjust as your workload evolves.