Introduction

Java GC Overhead Limit Exceeded (java.lang.OutOfMemoryError: GC overhead limit exceeded) occurs when the JVM spends more than 98% of its time performing garbage collection while reclaiming less than 2% of the heap. The error is a protective mechanism - the JVM detects that continued execution would be futile, since nearly all CPU time is consumed by GC with no meaningful work being done. Unlike simple heap exhaustion, this error indicates a fundamental mismatch between memory allocation patterns and the available heap, often caused by memory leaks, aggressive object creation, or a severely undersized heap.
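The 98%/2% thresholds are HotSpot tunables rather than fixed constants. The flags below (shown with their defaults) govern the check for the throughput collectors; treat exact behavior as version- and collector-dependent:

```shell
# Flags governing the GC overhead limit check (HotSpot defaults shown)
-XX:+UseGCOverheadLimit   # The check is enabled by default
-XX:GCTimeLimit=98        # Error if more than 98% of time goes to GC...
-XX:GCHeapFreeLimit=2     # ...while less than 2% of the heap is freed
```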

Symptoms

  • Application throws OutOfMemoryError: GC overhead limit exceeded
  • Application becomes extremely slow before crashing (GC thrashing)
  • CPU usage near 100% but throughput drops to near zero
  • GC logs show Full GC running every few seconds
  • Heap usage returns to near-maximum immediately after GC completes
  • Response times increase 10-100x before crash
  • Issue appears gradually as workload increases over time

Common Causes

  • Memory leak causing heap to fill faster than GC can reclaim
  • Heap size too small for working set
  • Creating large numbers of short-lived objects (allocation storm)
  • Large object graph preventing efficient GC
  • Finalizer queue backlog (objects with finalizers)
  • Weak/Soft references not being cleared fast enough
  • JNI references preventing object collection

Step-by-Step Fix

### 1. Confirm GC overhead diagnosis

Distinguish from other OOM errors:

```bash
# Check error message in logs
grep -E "GC overhead|OutOfMemoryError" /var/log/app/*.log

# Expected output:
# java.lang.OutOfMemoryError: GC overhead limit exceeded

# This is different from:
# - Java heap space (simple heap exhaustion)
# - Metaspace (class metadata exhausted)
# - Unable to create new native thread (OS thread limit)
# - Requested array size exceeds VM limit (array allocation larger than the VM allows)

# Verify with jstat before the crash
jstat -gcutil <pid> 1000

# Watch for the GC overhead pattern (time columns are cumulative seconds):
#  YGC   YGCT   FGC   FGCT    GCT
# 1000    1.5   500   98.5  100.0

# FGCT / GCT * 100 = share of GC time spent in Full GC
# The overhead limit itself compares GC time to wall-clock time:
# if GC consumes > 98% of elapsed time while freeing < 2% of the heap,
# the error is thrown
```
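To quantify the overhead directly, take two `jstat` samples a known interval apart and compare the growth of the cumulative GCT column to wall-clock time. A minimal sketch (the sample values are hypothetical):

```python
def gc_overhead_percent(gct_start: float, gct_end: float,
                        wall_seconds: float) -> float:
    """Share of wall-clock time spent in GC between two jstat samples.

    gct_start / gct_end are readings of the cumulative GCT column
    (total GC seconds) from `jstat -gcutil <pid>`, taken
    wall_seconds apart.
    """
    return (gct_end - gct_start) / wall_seconds * 100

# Hypothetical samples: GCT grew from 100.0s to 159.0s over a 60s window
overhead = gc_overhead_percent(100.0, 159.0, 60.0)
print(f"GC overhead: {overhead:.1f}%")  # GC overhead: 98.3%
```

A sustained reading above 98% means the overhead limit is about to trigger.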

### 2. Analyze GC logs for root cause

Extract GC patterns from logs:

```bash
# Java 8 GC log example
# Parse GC frequency and efficiency
# (match() with an array argument is a gawk extension)

grep "Full GC" gc.log | gawk '{
    # Extract heap before and after GC
    match($0, /([0-9]+)K->([0-9]+)K/, arr);
    before = arr[1];
    after = arr[2];
    freed = before - after;
    percent = (freed / before) * 100;

    print $1, $2, "Before:", before, "After:", after,
          "Freed:", freed, "Efficiency:", percent "%";
}'

# Healthy GC pattern:
# 10:00:00 Full GC 2048000K->512000K Freed: 1536000K Efficiency: 75%
# 10:05:00 Full GC 2100000K->520000K Freed: 1580000K Efficiency: 75%

# GC overhead pattern:
# 10:00:00 Full GC 3900000K->3800000K Freed: 100000K Efficiency: 2.5%
# 10:00:05 Full GC 3950000K->3850000K Freed: 100000K Efficiency: 2.5%
# 10:00:10 Full GC 3980000K->3880000K Freed: 100000K Efficiency: 2.5%
# Notice: very little memory freed, GC running constantly

# Check GC pause times (adjust the regex to your log's time unit)
grep "Full GC" gc.log | gawk '{
    match($0, /([0-9.]+)ms/, arr);
    print "Pause:", arr[1], "ms";
}' | sort -k2 -n | tail -20

# Long GC pauses (>10 seconds) indicate heap issues
```
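The efficiency numbers in the log excerpts above are simple to compute; a small sketch using those same sample values:

```python
def gc_efficiency(heap_before_kb: int, heap_after_kb: int) -> float:
    """Percentage of occupied heap reclaimed by one Full GC."""
    return (heap_before_kb - heap_after_kb) / heap_before_kb * 100

# Healthy pattern: 2048000K -> 512000K
print(gc_efficiency(2048000, 512000))   # 75.0

# GC overhead pattern: 3900000K -> 3800000K
print(gc_efficiency(3900000, 3800000))  # ~2.6 - almost nothing reclaimed
```

Sustained single-digit efficiency across consecutive Full GCs is the signature of GC thrashing.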

Use GC analysis tools:

```bash
# gceasy.io - upload gc.log for detailed analysis
# https://gceasy.io/

# Key metrics to check:
# - GC Frequency: how often GC runs
# - GC Efficiency: memory freed per GC
# - GC Pause Time: application downtime
# - Stop The World time: total application pause

# Or use local tools
# Install a GC log parser (third-party; package name and API may vary)
pip install gclog-parser

# Analyze with Python
python3 -c "
import gclogparser
with open('gc.log') as f:
    events = gclogparser.parse(f)
for event in events:
    if event.type == 'Full GC':
        print(f'{event.timestamp}: {event.heap_before} -> {event.heap_after}')
"
```

### 3. Identify memory leak with heap dump

Capture heap before crash:

```bash
# Auto-generate heap dump on OOM
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/var/log/app/heapdump.hprof
-XX:OnOutOfMemoryError="jcmd %p GC.heap_dump /var/log/app/forced.hprof"

# Or manually trigger when you see GC thrashing
# Warning: the application pauses while the dump is written
jcmd <pid> GC.heap_dump /tmp/heap.hprof

# For large heaps (>8GB), use the compressed format (JDK 15+)
jcmd <pid> GC.heap_dump -gz=1 /tmp/heap.hprof.gz

# Kubernetes pod
kubectl exec <pod-name> -- jcmd 1 GC.heap_dump /tmp/heap.hprof
kubectl cp <pod-name>:/tmp/heap.hprof ./heap.hprof
```

Analyze heap dump for leak patterns:

```bash
# Open in Eclipse MAT and run queries
# (OQL below is illustrative - MAT's dialect differs from VisualVM's,
# so adapt the syntax to your tool)

# 1. Find memory leak suspects
# Right-click heap dump > Leak Suspects Report

# 2. Find largest objects
SELECT * FROM java.lang.Object[] ORDER BY used_heap DESC LIMIT 50

# 3. Find objects with many instances
SELECT toString(class), COUNT(*) AS count
FROM java.lang.Object
GROUP BY toString(class)
ORDER BY count DESC LIMIT 50

# 4. Find GC roots retaining large objects
# Right-click object > Path to GC Roots > Exclude all phantom/weak references

# 5. Find duplicate strings (common leak)
SELECT toString(s) AS value, COUNT(*) AS count
FROM java.lang.String s
GROUP BY toString(s)
HAVING COUNT(*) > 100
ORDER BY count DESC
```

Common leak patterns in heap dump:

```
Leak Pattern 1: Growing ArrayList/HashMap
java.util.ArrayList (2.1 GB / 45% of heap)
  - internal array: Object[] (2.0 GB)
  - Retained by: com.example.CacheManager.cache
  - 50,000,000 items, never cleared

Leak Pattern 2: Unclosed resources
java.io.FileInputStream (500,000 instances)
  - Each holding a file descriptor
  - Retained by: ThreadLocal in RequestProcessor

Leak Pattern 3: Event listeners
java.util.ArrayList (1.5 GB)
  - Contains: com.example.EventListener[]
  - Listeners registered but never unregistered
  - Each listener holds a reference to the entire UI component tree
```

### 4. Tune GC algorithm for workload

Select appropriate GC algorithm:

```bash
# G1 GC (recommended for most applications)
# Best for: large heaps (>4GB), predictable pause times

-XX:+UseG1GC
-XX:MaxGCPauseMillis=200               # Target max pause time
-XX:G1HeapRegionSize=16m               # Region size (1-32MB)
-XX:G1ReservePercent=10                # Reserve 10% for evacuation
-XX:G1NewSizePercent=30                # Young gen as % of heap
-XX:G1MaxNewSizePercent=60             # Max young gen size
-XX:ParallelGCThreads=8                # Parallel GC threads
-XX:ConcGCThreads=2                    # Concurrent GC threads
-XX:InitiatingHeapOccupancyPercent=45  # Start concurrent marking at 45%

# For GC overhead specifically:
-XX:G1HeapWastePercent=10              # Allow 10% wasted space before mixed GC
-XX:G1MixedGCCountTarget=8             # Number of mixed GCs in a cycle
-XX:G1MixedGCLiveThresholdPercent=85   # Include regions with <85% live data

# Parallel GC (throughput-focused, older JVMs)
# Best for: batch processing, scientific computing

-XX:+UseParallelGC
-XX:ParallelGCThreads=8
-XX:MaxGCPauseMillis=100
-XX:GCTimeRatio=99                     # Target 99% throughput (1% GC time)
-XX:AdaptiveSizePolicyWeight=90

# A GCTimeRatio set too high can itself contribute to GC overhead;
# reduce it to 90 if you are experiencing GC overhead

# ZGC (Java 15+, low latency)
# Best for: low-latency applications, large heaps

-XX:+UseZGC
-XX:ZCollectionInterval=5              # Minimum time between GCs (seconds)
-XX:ZAllocationSpikeTolerance=2.0      # Handle allocation spikes
-XX:ConcGCThreads=4                    # Concurrent threads
# (no pause-time target needed - ZGC pauses are sub-millisecond by design)

# For GC overhead with ZGC:
# ZGC rarely hits the GC overhead limit due to its concurrent operation
# If it does, the heap is severely undersized
```

### 5. Adjust heap configuration

Size heap appropriately:

```bash
# Rule of thumb: heap should be 2-4x the working set size

# Calculate the working set from GC logs:
# average heap usage after Full GC = working set

# If the working set is 2GB:
# Minimum heap: 4GB (2x working set)
# Recommended heap: 6-8GB (3-4x working set)

# Production configuration
-Xms6g   # Initial heap = max heap (avoid resizing)
-Xmx6g   # Max heap

# For containerized deployments
# Kubernetes with an 8GB limit:
-XX:InitialRAMPercentage=75.0   # 6GB initial
-XX:MaxRAMPercentage=75.0       # 6GB max
-XX:+UseContainerSupport        # Respect container limits (Java 8u191+)

# Don't set the heap too close to the container limit
# Leave room for: Metaspace, Code Cache, thread stacks, direct buffers

# Container memory = heap + non-heap
# Non-heap is typically 500MB-1GB

# For an 8GB container:
# Heap:     6GB (75%)
# Non-heap: ~1GB
# Headroom: ~1GB
```

Disable GC overhead limit (not recommended):

```bash
# ONLY for debugging, NOT for production!
# This allows the JVM to continue running despite GC overhead
# The application will be extremely slow but won't crash immediately

-XX:-UseGCOverheadLimit

# Use this to:
# - Capture better diagnostics before the eventual crash
# - Allow the application to drain requests gracefully

# NEVER use this as a "fix" - it masks the underlying problem
```

### 6. Fix memory allocation patterns

Reduce object allocation rate:

```java
// WRONG: Creating millions of temporary objects
public List<String> processData(List<String> input) {
    List<String> result = new ArrayList<>();

    for (String item : input) {
        // Creates a new StringBuilder for each iteration
        String processed = new StringBuilder(item)
            .append("-processed")
            .toString();
        result.add(processed);
    }

    return result; // All intermediate objects go to GC
}

// CORRECT: Reuse StringBuilder
public List<String> processData(List<String> input) {
    List<String> result = new ArrayList<>(input.size());
    StringBuilder sb = new StringBuilder(64); // Reuse the same builder

    for (String item : input) {
        sb.setLength(0); // Clear builder
        sb.append(item).append("-processed");
        result.add(sb.toString());
    }

    return result;
}

// WRONG: Boxed values in hot loops
public long sumList(List<Integer> numbers) {
    long sum = 0;
    for (Integer n : numbers) { // Each element is a separate Integer object
        sum += n;               // Unboxed on every read
    }
    return sum;
}

// CORRECT: Use primitive arrays
public long sumArray(long[] numbers) {
    long sum = 0;
    for (long n : numbers) {
        sum += n; // No boxing
    }
    return sum;
}

// WRONG: String concatenation in a loop
public String buildMessage(List<String> parts) {
    String message = "";
    for (String part : parts) {
        message += part + ", "; // Creates a new String each iteration
    }
    return message;
}

// CORRECT: Use StringBuilder
public String buildMessage(List<String> parts) {
    StringBuilder sb = new StringBuilder(parts.size() * 20);
    for (String part : parts) {
        sb.append(part).append(", ");
    }
    return sb.toString();
}
```

Use object pooling for high-allocation scenarios:

```java
// For objects created/destroyed frequently
// Use Apache Commons Pool

public class ConnectionPool {
    private final GenericObjectPool<Connection> pool;

    public ConnectionPool() {
        GenericObjectPoolConfig<Connection> config = new GenericObjectPoolConfig<>();
        config.setMaxTotal(50);
        config.setMaxIdle(25);
        config.setMinIdle(5);
        config.setBlockWhenExhausted(true);

        pool = new GenericObjectPool<>(new ConnectionFactory(), config);
    }

    public Connection borrow() throws Exception {
        return pool.borrowObject();
    }

    public void returnConnection(Connection conn) {
        pool.returnObject(conn);
    }
}

// For buffers, use Netty's Recycler
// (the pooled object must carry its own recycler handle)
public class PooledBuffer {
    private static final Recycler<PooledBuffer> RECYCLER = new Recycler<PooledBuffer>() {
        @Override
        protected PooledBuffer newObject(Handle<PooledBuffer> handle) {
            return new PooledBuffer(handle);
        }
    };

    private final Recycler.Handle<PooledBuffer> handle;
    public final byte[] data = new byte[8192];

    private PooledBuffer(Recycler.Handle<PooledBuffer> handle) {
        this.handle = handle;
    }

    public static PooledBuffer acquire() {
        return RECYCLER.get();
    }

    public void release() {
        handle.recycle(this);
    }
}
```

### 7. Clear finalizer backlog

Objects with finalizers can block GC:

```bash
# Check the finalizer queue
jcmd <pid> GC.finalizer_info

# Or with JMX:
# jconsole <pid>, then MBeans > java.lang > Memory > ObjectPendingFinalizationCount

# If the queue size keeps growing, finalizers can't keep up
```

Fix finalizer issues:

```java
// WRONG: Relying on finalizers for cleanup
public class LeakyResource {
    private final FileInputStream stream;

    public LeakyResource(String path) throws FileNotFoundException {
        this.stream = new FileInputStream(path);
        // No explicit close - relies on the finalizer
    }

    @Override
    protected void finalize() {
        // The finalizer may not run for hours;
        // objects accumulate in the finalizer queue
        try {
            stream.close();
        } catch (IOException ignored) {
        }
    }
}

// CORRECT: Use try-with-resources
public class SafeResource implements AutoCloseable {
    private final FileInputStream stream;

    public SafeResource(String path) throws FileNotFoundException {
        this.stream = new FileInputStream(path);
    }

    @Override
    public void close() throws IOException {
        stream.close();
    }
}

// Usage
try (SafeResource resource = new SafeResource("file.txt")) {
    // Use resource
} // Automatically closed

// As a safety net, avoid raw PhantomReference cleanup -
// use Cleaner instead (Java 9+).
// Note: the cleanup action must NOT reference the registered object,
// or it will never become unreachable; use a static nested class
public class SafeCleanup implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // Holds only the native state, never a reference to SafeCleanup itself
    private static final class State implements Runnable {
        @Override
        public void run() {
            releaseNativeResource(); // Runs at most once, on clean() or GC
        }

        private static void releaseNativeResource() {
            // Release native resources
        }
    }

    private final Cleaner.Cleanable cleanable;

    public SafeCleanup() {
        cleanable = CLEANER.register(this, new State());
    }

    @Override
    public void close() {
        cleanable.clean(); // Explicit cleanup
    }
}
```

### 8. Implement circuit breaker for memory pressure

Prevent memory exhaustion from cascading:

```java
@Component
public class MemoryCircuitBreaker {
    // Assumes an SLF4J logger, e.g. via Lombok @Slf4j

    private final MemoryMXBean memoryMXBean;
    private volatile boolean circuitOpen = false;
    private volatile long lastCheckTime = 0;

    public MemoryCircuitBreaker() {
        this.memoryMXBean = ManagementFactory.getMemoryMXBean();
    }

    public boolean canAcceptRequest() {
        // Re-evaluate at most every 5 seconds
        long now = System.currentTimeMillis();
        if (now - lastCheckTime < 5000) {
            return !circuitOpen;
        }

        lastCheckTime = now;
        MemoryUsage usage = memoryMXBean.getHeapMemoryUsage();
        double usageRatio = (double) usage.getUsed() / usage.getMax();

        if (usageRatio > 0.90) {
            circuitOpen = true;
            log.warn("Memory circuit OPEN - heap at {}%",
                String.format("%.1f", usageRatio * 100));
            return false;
        } else if (usageRatio < 0.70) {
            circuitOpen = false;
            log.info("Memory circuit CLOSED - heap at {}%",
                String.format("%.1f", usageRatio * 100));
            return true;
        }

        return !circuitOpen;
    }

    @EventListener
    public void handleRequest(RequestEvent event) {
        if (!canAcceptRequest()) {
            event.reject("Service temporarily unavailable - memory pressure");
            return;
        }

        // Process request
    }
}

// Spring Cloud Circuit Breaker integration
@Configuration
public class MemoryCircuitBreakerConfig {

    @Bean
    public Customizer<Resilience4JCircuitBreakerFactory> memoryCircuitBreaker() {
        return factory -> factory.configureDefault(id -> id
            .with(() -> CircuitBreakerConfig.custom()
                .failureRateThreshold(50)
                .slidingWindowSize(10)
                .minimumNumberOfCalls(5)
                // GC overhead surfaces as OutOfMemoryError -
                // there is no separate exception class for it
                .recordExceptions(OutOfMemoryError.class)
                .waitDurationInOpenState(Duration.ofMinutes(5))
                .build()));
    }
}
```
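The open-at-90% / close-at-70% hysteresis above is deliberate: a single threshold would make the circuit flap as GC briefly frees memory. The policy in isolation, independent of the Spring wiring (a sketch with the same thresholds):

```python
class MemoryCircuit:
    """Hysteresis circuit: opens above open_at, closes only below close_at."""

    def __init__(self, open_at: float = 0.90, close_at: float = 0.70):
        self.open_at = open_at
        self.close_at = close_at
        self.open = False

    def can_accept(self, heap_usage_ratio: float) -> bool:
        if heap_usage_ratio > self.open_at:
            self.open = True
        elif heap_usage_ratio < self.close_at:
            self.open = False
        # Between the thresholds, keep the previous state
        return not self.open

circuit = MemoryCircuit()
print(circuit.can_accept(0.95))  # False - opens above 90%
print(circuit.can_accept(0.80))  # False - stays open between thresholds
print(circuit.can_accept(0.60))  # True  - closes below 70%
```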

### 9. Monitor for early detection

Set up proactive monitoring:

```java
// Scheduled memory health check
@Component
public class MemoryHealthMonitor {
    // Assumes an SLF4J logger, e.g. via Lombok @Slf4j

    private final MemoryMXBean memoryMXBean = ManagementFactory.getMemoryMXBean();
    private final ApplicationEventPublisher publisher;

    public MemoryHealthMonitor(ApplicationEventPublisher publisher) {
        this.publisher = publisher;
    }

    @Scheduled(fixedRate = 30000) // Every 30 seconds
    public void checkMemoryHealth() {
        MemoryUsage usage = memoryMXBean.getHeapMemoryUsage();
        double usedPercent = (double) usage.getUsed() / usage.getMax() * 100;

        if (usedPercent > 85) {
            log.warn("Heap usage above 85%: {}MB / {}MB",
                usage.getUsed() / 1024 / 1024,
                usage.getMax() / 1024 / 1024);
            publisher.publishEvent(new MemoryWarningEvent(this, usedPercent));
        }

        if (usedPercent > 95) {
            log.error("Heap usage CRITICAL: {}MB / {}MB",
                usage.getUsed() / 1024 / 1024,
                usage.getMax() / 1024 / 1024);
            publisher.publishEvent(new MemoryCriticalEvent(this, usedPercent));
        }
    }
}
```

Prometheus alerting rules:

```yaml
groups:
  - name: java_gc
    rules:
      - alert: JavaGCHighFrequency
        expr: rate(jvm_gc_collection_seconds_count[5m]) > 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Java GC running more than once per second"
          description: "GC frequency is {{ $value | humanize }} per second"

      - alert: JavaGCHighOverhead
        # rate() of the seconds counter is GC-seconds per second,
        # i.e. the fraction of time spent in GC
        expr: rate(jvm_gc_collection_seconds_sum[5m]) > 0.5
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Java GC consuming > 50% of CPU time"
          description: "GC overhead is {{ $value | humanizePercentage }}"

      - alert: JavaHeapHigh
        expr: jvm_memory_used_bytes{area="heap"} / jvm_memory_max_bytes{area="heap"} > 0.85
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Java heap usage above 85%"

      - alert: JavaHeapNotReclaiming
        # Heap still growing despite more than 60s of GC in 10 minutes
        expr: |
          delta(jvm_memory_used_bytes{area="heap"}[10m]) > 0
          and
          increase(jvm_gc_collection_seconds_sum[10m]) > 60
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Java heap not being reclaimed despite GC"
          description: "Possible memory leak - GC running but heap not decreasing"
```

### 10. Implement graceful degradation

Handle memory pressure without crashing:

```java
// Graceful shutdown on memory pressure
@Component
public class GracefulMemoryShutdown {
    // Assumes an SLF4J logger, e.g. via Lombok @Slf4j

    private final MemoryMXBean memoryMXBean;
    private final ApplicationEventPublisher publisher;
    private volatile boolean shutdownInitiated = false;

    public GracefulMemoryShutdown(ApplicationEventPublisher publisher) {
        this.memoryMXBean = ManagementFactory.getMemoryMXBean();
        this.publisher = publisher;
    }

    @Scheduled(fixedRate = 10000)
    public void monitorMemory() {
        if (shutdownInitiated) return;

        MemoryUsage usage = memoryMXBean.getHeapMemoryUsage();
        double usageRatio = (double) usage.getUsed() / usage.getMax();

        if (usageRatio > 0.95) {
            shutdownInitiated = true;
            log.error("Critical memory pressure ({}%) - initiating graceful shutdown",
                String.format("%.1f", usageRatio * 100));

            // Publish a custom application event so components can drain
            // (ContextClosedEvent cannot be constructed here - its
            // constructor requires the ApplicationContext)
            publisher.publishEvent(new MemoryShutdownEvent(this));

            // Then exit after a grace period
            new Thread(() -> {
                try {
                    // Give in-flight requests time to complete
                    Thread.sleep(30000);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                System.exit(1);
            }, "memory-shutdown").start();
        }
    }
}

// Drop low-priority work under memory pressure
@Component
public class AdaptiveWorkloadManager {
    // Assumes an SLF4J logger, e.g. via Lombok @Slf4j

    private volatile WorkloadPriority currentPriority = WorkloadPriority.ALL;
    private final Queue<WorkItem> lowPriorityQueue = new ConcurrentLinkedQueue<>();

    @EventListener
    public void handleMemoryWarning(MemoryWarningEvent event) {
        if (event.getUsagePercent() > 90) {
            currentPriority = WorkloadPriority.HIGH_ONLY;
            log.info("Switching to HIGH_ONLY workload mode");
        }
    }

    public void processWorkItem(WorkItem item) {
        if (currentPriority == WorkloadPriority.HIGH_ONLY
                && item.getPriority() != WorkItemPriority.HIGH) {
            // Queue low-priority work for later
            lowPriorityQueue.add(item);
            return;
        }

        // Process high-priority work
        item.process();
    }
}
```

Prevention

  • Set heap to 2-4x working set size based on load testing
  • Monitor GC overhead metric continuously
  • Set up alerts for GC frequency and pause times
  • Conduct regular load tests with memory profiling
  • Use object pooling for high-allocation scenarios
  • Avoid finalizers - use AutoCloseable instead
  • Implement memory-aware circuit breakers
  • Document heap requirements for each service

Related Errors

  • **OutOfMemoryError: Java heap space**: Simple heap exhaustion
  • **OutOfMemoryError: Metaspace**: Class metadata exhausted
  • **OutOfMemoryError: Unable to create new native thread**: OS thread limit reached
  • **OutOfMemoryError: Direct buffer memory**: NIO direct buffer pool exhausted