## Introduction
Memory leaks and performance issues in Go applications manifest as gradually increasing memory usage, high CPU utilization, slow response times, or an exploding goroutine count. Unlike languages with explicit memory management, Go's garbage collector automatically reclaims unused memory, but leaks still occur through goroutine leaks, unbounded caches, growing global variables, channel deadlocks, and unintended references that prevent collection. Common causes include goroutines blocked on channels, background workers that ignore context cancellation, timer churn from `time.After` in loops, `sync.Mutex` contention causing lock convoys, slices and maps retaining large underlying arrays, and finalizer misuse. Fixing these issues requires understanding Go runtime behavior, the pprof profiling tools, GC tuning parameters, and sound concurrency patterns. This guide provides production-proven troubleshooting steps for Go performance issues across microservices, CLI tools, and long-running server applications.
## Symptoms
- RSS memory continuously increasing over hours/days
- OOM killer terminates Go application
- Goroutine count growing unbounded (thousands)
- GC pause times exceeding 100ms+
- CPU spikes to 100% on single core
- Request latency increasing over time
- Channel send/receive blocking indefinitely
- "fatal error: all goroutines are asleep - deadlock!" runtime crash
- Application slow to respond after deployment
- Memory usage doesn't drop after traffic spike
- High mutex wait time in profiling
- Context cancellation not propagating to child goroutines
## Common Causes
- Goroutine blocked forever on channel send/receive
- Worker goroutine not checking context.Done()
- time.After() called in loop without cleanup
- sync.WaitGroup not decremented (missing wg.Done())
- Map/slice growing without bounds
- Global cache without eviction policy
- Circular references preventing GC
- Finalizer preventing object collection
- sync.Mutex contention causing lock convoy
- Buffered channel never drained
- HTTP response body not closed
- Database connection not returned to pool
- Event listener/handler registered multiple times
## Step-by-Step Fix
### 1. Profile memory and goroutines
Enable pprof in application:
```go
// Add to main.go
import (
	"net/http"
	_ "net/http/pprof" // Registers /debug/pprof handlers on the default mux
)

func main() {
	// Start pprof server
	go func() {
		http.ListenAndServe("localhost:6060", nil)
	}()

	// Your application code
	// ...
}

// Access profiles at:
// http://localhost:6060/debug/pprof/
```
Collect heap profile:
```bash
# View current heap allocation
go tool pprof http://localhost:6060/debug/pprof/heap

# Interactive commands:
#   top           - Show top memory consumers
#   top10         - Show top 10
#   list FuncName - Show allocations in function
#   web           - Generate SVG call graph

# Save profile for analysis
curl -o heap.pprof http://localhost:6060/debug/pprof/heap

# Analyze saved profile
go tool pprof heap.pprof
go tool pprof -top heap.pprof
go tool pprof -svg heap.pprof > heap.svg

# Get a baseline, then compare after traffic
curl -o heap1.pprof http://localhost:6060/debug/pprof/heap
# ... wait for traffic ...
curl -o heap2.pprof http://localhost:6060/debug/pprof/heap

# Compare profiles (find new allocations)
go tool pprof -base heap1.pprof heap2.pprof
```
Collect goroutine profile:
```bash
# View all goroutines with full stacks
curl "http://localhost:6060/debug/pprof/goroutine?debug=2"

# Analyze with pprof
go tool pprof http://localhost:6060/debug/pprof/goroutine

# Count goroutines over time
watch -n 1 'curl -s "http://localhost:6060/debug/pprof/goroutine?debug=1" | grep -c "goroutine "'

# Or from application metrics
# runtime.NumGoroutine() exported via /metrics
```
Collect CPU profile:
```bash
# 30-second CPU profile
curl -o cpu.pprof "http://localhost:6060/debug/pprof/profile?seconds=30"

# Analyze
go tool pprof cpu.pprof
go tool pprof -top cpu.pprof

# Find hot spots
go tool pprof -text cpu.pprof | head -20
```
### 2. Fix goroutine leaks
Detect goroutine leaks:
```go
// Use go.uber.org/goleak in tests
import (
	"testing"

	"go.uber.org/goleak"
)

func TestMain(m *testing.M) {
	goleak.VerifyTestMain(m)
}

// Or in specific tests
func TestMyFunction(t *testing.T) {
	defer goleak.VerifyNone(t)

	// Your test code that might leak
	go someGoroutine()
}
```
Common goroutine leak patterns:
```go
// LEAK 1: Goroutine blocked on channel send
func process(data []int) int {
	results := make(chan int) // Unbuffered channel

	for _, d := range data {
		go func(val int) {
			result := heavyComputation(val)
			results <- result // BLOCKS if no one receives
		}(d)
	}

	// Only receives the first result; the remaining senders leak
	first := <-results
	return first
}

// FIX: Buffer the channel and drain every result
func process(data []int) int {
	results := make(chan int, len(data)) // Buffered: sends never block
	var wg sync.WaitGroup

	for _, d := range data {
		wg.Add(1)
		go func(val int) {
			defer wg.Done()
			results <- heavyComputation(val)
		}(d)
	}

	// Close the channel once all workers are done
	go func() {
		wg.Wait()
		close(results)
	}()

	// Drain all results, keeping the first
	first, got := 0, false
	for result := range results {
		if !got {
			first, got = result, true
		}
	}
	return first
}

// LEAK 2: Goroutine not respecting context cancellation
func startWorker(ctx context.Context) {
	go func() {
		for {
			// Never checks ctx.Done()
			doWork()
			time.Sleep(100 * time.Millisecond)
		}
	}()
}

// FIX: Check the context in the loop
func startWorker(ctx context.Context) {
	go func() {
		for {
			select {
			case <-ctx.Done():
				return // Exit on cancellation
			default:
				doWork()
				time.Sleep(100 * time.Millisecond)
			}
		}
	}()
}

// LEAK 3: time.After in a loop
func pollServer() {
	for {
		select {
		case <-time.After(1 * time.Second): // New timer every iteration!
			checkServer()
		}
	}
}

// FIX: Use a single time.Ticker and stop it on exit
func pollServer(ctx context.Context) {
	ticker := time.NewTicker(1 * time.Second)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			checkServer()
		}
	}
}

// LEAK 4: HTTP response body not closed
func fetchData(url string) ([]byte, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	// Missing: defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}

// FIX: Always close the response body
func fetchData(url string) ([]byte, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}

// LEAK 5: WaitGroup never decremented
func launchWorkers(n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			// Missing: defer wg.Done()
			doWork()
		}()
	}
	wg.Wait() // Waits forever
}

// FIX: Always call Done
func launchWorkers(n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done() // CRITICAL
			doWork()
		}()
	}
	wg.Wait()
}
```
### 3. Fix memory leaks
Detect growing memory:
```go
// Export runtime metrics
import (
	"encoding/json"
	"net/http"
	"runtime"
)

func memoryHandler(w http.ResponseWriter, r *http.Request) {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)

	stats := map[string]interface{}{
		"Alloc":      m.Alloc,        // Current heap allocation
		"TotalAlloc": m.TotalAlloc,   // Cumulative allocations
		"Sys":        m.Sys,          // Total memory obtained from the OS
		"NumGC":      m.NumGC,        // GC cycles completed
		"PauseTotal": m.PauseTotalNs, // Cumulative GC pause time (ns)
	}

	json.NewEncoder(w).Encode(stats)
}

func main() {
	http.HandleFunc("/debug/memory", memoryHandler)
	// ...
}
```
Common memory leak patterns:
```go
// LEAK 1: Unbounded cache growth
var cache = make(map[string][]byte)

func cacheData(key string, data []byte) {
	cache[key] = data // Grows forever
}

// FIX: Use an LRU cache with a size limit
import lru "github.com/hashicorp/golang-lru/v2"

var cache *lru.Cache[string, []byte]

func init() {
	cache, _ = lru.New[string, []byte](1000) // Max 1000 items
}

func cacheData(key string, data []byte) {
	cache.Add(key, data) // Evicts the oldest entry when full
}

// LEAK 2: Slice retaining a large underlying array
func loadData() ([]byte, error) {
	data := make([]byte, 100<<20) // 100MB
	// Read file into data
	return data[:100], nil // Only 100 bytes needed, but the slice pins all 100MB
}

// FIX: Copy the small slice to a new allocation
func loadData() ([]byte, error) {
	data := make([]byte, 100<<20)
	// Read file into data

	// Copy to a new small slice
	result := make([]byte, 100)
	copy(result, data[:100])
	return result, nil // The original 100MB can now be GC'd
}

// LEAK 3: Growing global slice
var events []Event

func recordEvent(e Event) {
	events = append(events, e) // Grows forever
}

// FIX: Use a ring buffer or cap the size
var (
	events []Event
	mu     sync.Mutex
)

func recordEvent(e Event) {
	mu.Lock()
	defer mu.Unlock()

	events = append(events, e)

	// Keep only the last 10000 events
	// (note: the trimmed slice still shares the old backing array;
	// copy into a fresh slice if that memory must be released)
	if len(events) > 10000 {
		events = events[len(events)-10000:]
	}
}

// LEAK 4: Map entries never deleted
var connections = make(map[string]*Connection)

func addConnection(id string, conn *Connection) {
	connections[id] = conn // Never deleted when the connection closes
}

// FIX: Delete on close (guarded by a mutex)
var connMu sync.Mutex

func addConnection(id string, conn *Connection) {
	connMu.Lock()
	connections[id] = conn
	connMu.Unlock()

	go func() {
		<-conn.closed // Wait for close
		connMu.Lock()
		delete(connections, id)
		connMu.Unlock()
	}()
}

// LEAK 5: Finalizer delaying collection
type Resource struct {
	data []byte
}

func NewResource() *Resource {
	r := &Resource{data: make([]byte, 1<<20)}
	runtime.SetFinalizer(r, func(r *Resource) {
		// Objects with finalizers need at least one extra GC cycle to be freed
		log.Println("Finalizing")
	})
	return r
}

// FIX: Avoid finalizers; use an explicit Close
type Resource struct {
	data   []byte
	closed bool
}

func (r *Resource) Close() error {
	if r.closed {
		return errors.New("already closed")
	}
	r.closed = true
	r.data = nil // Allow GC
	return nil
}

// Usage with defer
func process() error {
	r := NewResource()
	defer r.Close()
	// Use r
	return nil
}
```
### 4. Fix mutex contention
Detect lock contention:
```bash
# Collect mutex profile (profiling must be enabled in the app first; see below)
curl -o mutex.pprof "http://localhost:6060/debug/pprof/mutex"

# Analyze
go tool pprof mutex.pprof
go tool pprof -text mutex.pprof
```

Mutex profiling is disabled by default and must be enabled programmatically:

```go
import (
	_ "net/http/pprof"

	"runtime"
)

func init() {
	runtime.SetMutexProfileFraction(1) // Report every mutex contention event
}
```
Fix mutex contention:
```go
// PROBLEM: Coarse-grained locking
type Cache struct {
	mu   sync.Mutex
	data map[string]string
}

func (c *Cache) Get(key string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.data[key] // Read operations block each other
}

func (c *Cache) Set(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[key] = value
}

// FIX 1: Use RWMutex for read-heavy workloads
type Cache struct {
	mu   sync.RWMutex
	data map[string]string
}

func (c *Cache) Get(key string) string {
	c.mu.RLock() // Read lock: multiple readers allowed
	defer c.mu.RUnlock()
	return c.data[key]
}

func (c *Cache) Set(key, value string) {
	c.mu.Lock() // Write lock: exclusive
	defer c.mu.Unlock()
	c.data[key] = value
}

// FIX 2: Use sync.Map for concurrent map access
type Cache struct {
	data sync.Map // Thread-safe map
}

func (c *Cache) Get(key string) (string, bool) {
	v, ok := c.data.Load(key)
	if !ok {
		return "", false
	}
	return v.(string), true
}

func (c *Cache) Set(key, value string) {
	c.data.Store(key, value)
}

// FIX 3: Sharded locking for fine-grained access
import "hash/fnv"

type ShardedCache struct {
	shards [256]struct {
		mu   sync.RWMutex
		data map[string]string
	}
}

func NewShardedCache() *ShardedCache {
	c := &ShardedCache{}
	for i := range c.shards {
		c.shards[i].data = make(map[string]string)
	}
	return c
}

func (c *ShardedCache) getShard(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % 256)
}

func (c *ShardedCache) Get(key string) string {
	shard := &c.shards[c.getShard(key)]
	shard.mu.RLock()
	defer shard.mu.RUnlock()
	return shard.data[key]
}

func (c *ShardedCache) Set(key, value string) {
	shard := &c.shards[c.getShard(key)]
	shard.mu.Lock()
	defer shard.mu.Unlock()
	shard.data[key] = value
}
```
Avoid lock convoy:
```go
// PROBLEM: Holding lock during slow operation
func (c *Cache) Update(key string) string {
	c.mu.Lock()
	defer c.mu.Unlock()

	// Slow operation while holding lock
	result := slowExternalCall() // Network call, disk I/O
	c.data[key] = result
	return result
}

// FIX: Minimize the critical section
func (c *Cache) Update(key string) string {
	// Do slow work outside the lock
	result := slowExternalCall()

	// Only lock for the actual data modification
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[key] = result
	return result
}

// PROBLEM: Inconsistent lock ordering causing deadlock
var mu1, mu2 sync.Mutex

func transaction1() {
	mu1.Lock()
	time.Sleep(10 * time.Millisecond)
	// mu2 might be held by transaction2
	mu2.Lock()
	// ...
	mu2.Unlock()
	mu1.Unlock()
}

func transaction2() {
	mu2.Lock()
	time.Sleep(10 * time.Millisecond)
	// mu1 might be held by transaction1
	mu1.Lock()
	// ...
	mu1.Unlock()
	mu2.Unlock()
}

// FIX: Always acquire locks in a consistent order
func transaction1() {
	mu1.Lock()
	defer mu1.Unlock()
	mu2.Lock()
	defer mu2.Unlock()
	// ...
}

func transaction2() {
	mu1.Lock() // Same order as transaction1
	defer mu1.Unlock()
	mu2.Lock()
	defer mu2.Unlock()
	// ...
}
```
### 5. Tune garbage collector
Understand GC behavior:
```go
// Monitor GC stats
import (
	"fmt"
	"runtime"
	"time"
)

func monitorGC() {
	var m runtime.MemStats
	ticker := time.NewTicker(1 * time.Second)
	defer ticker.Stop()

	for range ticker.C {
		runtime.ReadMemStats(&m)
		fmt.Printf("Alloc: %d MB, NumGC: %d, PauseTotal: %d ms\n",
			m.Alloc/1024/1024, m.NumGC, m.PauseTotalNs/1000/1000)
	}
}
```
GC tuning parameters:
```bash
# Log GC events to stderr
export GODEBUG=gctrace=1

# Set GOGC via environment (default: 100)
# Lower  = more frequent GC, less memory
# Higher = less frequent GC, more memory
export GOGC=50   # Collect when the heap grows 50% (more aggressive)
export GOGC=200  # Collect when the heap grows 200% (less aggressive)
```

Or set it in code:

```go
import "runtime/debug"

func init() {
	// Equivalent to GOGC=50: trigger GC when the heap grows 50% (more aggressive)
	debug.SetGCPercent(50)
}
```
GC tuning guidelines:
```
# When to tune GOGC:

# Lower GOGC (25-50):
# - Memory-constrained environments (containers with tight limits)
# - Latency-sensitive applications (want shorter pauses)
# - Many short-lived objects

# Higher GOGC (200-500):
# - Memory is abundant
# - Throughput more important than latency
# - Many long-lived objects

# Monitor GC pause times:
# - < 1ms:    Excellent
# - 1-10ms:   Good for most applications
# - 10-100ms: May need tuning
# - > 100ms:  Problem, investigate

# gctrace output interpretation:
# gc 123 @1234.5s 2.3%: 0.23+12.3+0.45 ms clock, 4.5/0.23/12/5.6+0.89 ms cpu, 234->245->123 MB
# Fields:
# - gc 123:         GC cycle number
# - @1234.5s:       Time since program start
# - 2.3%:           CPU time spent in GC since start
# - 0.23+12.3+0.45: Wall-clock phases: STW sweep termination + concurrent mark + STW mark termination
# - 234->245->123:  Heap size at GC start -> heap size at GC end -> live heap after GC
```
### 6. Fix channel deadlocks
Detect channel issues:
```go
// Runtime deadlock detection:
// the Go runtime detects the case where *every* goroutine is blocked:
//   fatal error: all goroutines are asleep - deadlock!

// For partial deadlocks and lock-ordering bugs, use a tool such as go-deadlock:
//   go get github.com/sasha-s/go-deadlock

import "github.com/sasha-s/go-deadlock"

// Drop-in replacements for sync.Mutex / sync.RWMutex
var mu deadlock.Mutex
var rwmu deadlock.RWMutex

// These report a potential deadlock (with stack traces) when a lock
// is held longer than the configured timeout, instead of hanging silently
```
Fix channel deadlock patterns:
```go
// DEADLOCK 1: Sending to an unbuffered channel with no receiver
func deadlock1() {
	ch := make(chan int)
	ch <- 42 // Blocks forever: no goroutine is receiving
}

// FIX: Use a buffered channel, or start the receiver first
func fix1() {
	ch := make(chan int, 1) // Buffered
	ch <- 42                // Won't block

	ch2 := make(chan int) // Unbuffered, but the receiver is ready
	go func() {
		<-ch2
	}()
	ch2 <- 42
}

// DEADLOCK 2: Circular channel dependency
func deadlock2() {
	ch1 := make(chan int)
	ch2 := make(chan int)

	go func() {
		<-ch1    // Waits for the other goroutine to send on ch1...
		ch2 <- 1
	}()

	go func() {
		<-ch2    // ...which waits for this goroutine's send on ch1,
		ch1 <- 1 // which never runs until the receive above completes
	}()
}

// FIX: Use select with a timeout to break the cycle
func fix2() {
	ch1 := make(chan int)
	ch2 := make(chan int)

	go func() {
		select {
		case <-ch1:
			ch2 <- 1
		case <-time.After(1 * time.Second):
			// Timeout prevents waiting forever
		}
	}()
}

// DEADLOCK 3: Overfilling a buffered channel
func deadlock3() {
	ch := make(chan int, 10)

	// Fill the channel
	for i := 0; i < 10; i++ {
		ch <- i
	}

	// This blocks: the channel is full
	ch <- 11
}

// FIX: Use select with default to drop instead of block
func fix3() {
	ch := make(chan int, 10)

	for i := 0; i < 20; i++ {
		select {
		case ch <- i:
			// Sent
		default:
			// Channel full: drop or handle
			log.Printf("Dropped: %d", i)
		}
	}
}

// DEADLOCK 4: Channel never drained
func deadlock4() {
	done := make(chan bool)

	go func() {
		doWork()
		done <- true // Send when done
	}()

	// Forgot to receive:
	// <-done
	// The worker goroutine now blocks forever on the send
}

// FIX: Drain the channel, or buffer it so the send never blocks
func fix4() {
	done := make(chan bool, 1) // Buffered: the worker can always send and exit

	go func() {
		doWork()
		done <- true
	}()

	<-done // Wait for completion
}
```
### 7. Production monitoring
Export metrics for Prometheus:
```go
import (
	"net/http"
	"runtime"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

var (
	goroutines = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "go_goroutines",
		Help: "Number of goroutines",
	})

	heapAlloc = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "go_heap_alloc_bytes",
		Help: "Current heap allocation",
	})

	gcPauses = prometheus.NewHistogram(prometheus.HistogramOpts{
		Name:    "go_gc_pause_seconds",
		Help:    "GC pause duration",
		Buckets: prometheus.ExponentialBuckets(0.0001, 2, 20),
	})
)

func init() {
	prometheus.MustRegister(goroutines, heapAlloc, gcPauses)
}

func collectMetrics() {
	ticker := time.NewTicker(1 * time.Second)
	defer ticker.Stop()

	for range ticker.C {
		var m runtime.MemStats
		runtime.ReadMemStats(&m)

		goroutines.Set(float64(runtime.NumGoroutine()))
		heapAlloc.Set(float64(m.Alloc))

		if m.NumGC > 0 {
			// PauseNs is a circular buffer; (NumGC+255)%256 indexes the most recent pause
			pause := float64(m.PauseNs[(m.NumGC+255)%256]) / 1e9
			gcPauses.Observe(pause)
		}
	}
}

func main() {
	go collectMetrics()

	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}
```
Grafana dashboard panels:
```yaml
# Goroutine count over time
expr: go_goroutines
thresholds:
  - value: 100
    color: yellow
  - value: 1000
    color: red

# Heap allocation
expr: go_heap_alloc_bytes
unit: bytes
thresholds:
  - value: 1073741824   # 1GB
    color: yellow
  - value: 4294967296   # 4GB
    color: red

# GC pause (99th percentile)
expr: histogram_quantile(0.99, rate(go_gc_pause_seconds_bucket[5m]))
unit: seconds
thresholds:
  - value: 0.01   # 10ms
    color: yellow
  - value: 0.1    # 100ms
    color: red

# Average GC pause duration (from the default Go collector)
expr: rate(go_gc_duration_seconds_sum[5m]) / rate(go_gc_duration_seconds_count[5m])
unit: seconds
```
Alert rules:
```yaml
groups:
  - name: go_health
    rules:
      - alert: GoGoroutineLeak
        expr: increase(go_goroutines[1h]) > 100
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Possible goroutine leak"
          description: "Goroutine count increased by {{ $value }} in 1 hour"

      - alert: GoMemoryHigh
        expr: go_heap_alloc_bytes > 4 * 1024 * 1024 * 1024
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Go heap memory high"
          description: "Heap allocation at {{ $value | humanize1024 }}B"

      - alert: GoGCPauseHigh
        expr: histogram_quantile(0.99, rate(go_gc_pause_seconds_bucket[5m])) > 0.1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Go GC pause time high"
          description: "99th percentile GC pause is {{ $value | humanizeDuration }}"
```
## Prevention
- Use goroutine leak detection in tests (goleak)
- Always close response bodies, files, and connections with defer
- Use context.WithTimeout for all long-running operations
- Set reasonable buffer sizes for channels
- Use LRU caches with size limits
- Monitor goroutine count in production
- Profile memory before major releases
- Set container memory limits with headroom
- Use sync.Pool for frequently allocated objects
- Document ownership for channels (who sends, who receives, who closes)
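The `sync.Pool` suggestion above can be sketched as follows. This is a minimal, illustrative example (the `bufPool` variable and `render` function are hypothetical names, not from any library): pooled `bytes.Buffer` values are reused across calls instead of allocated fresh each time, which reduces GC pressure on hot paths.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable bytes.Buffer values. New is called
// only when the pool is empty; returned buffers are reused.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()            // Pooled objects keep old contents; always reset first
	defer bufPool.Put(buf) // Return to the pool for reuse

	fmt.Fprintf(buf, "hello, %s", name)
	return buf.String()
}

func main() {
	fmt.Println(render("world")) // prints "hello, world"
}
```

Note that the GC may empty the pool at any time, so `sync.Pool` is a cache for allocation reuse, not a place to store state that must survive.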
## Related Errors
- **panic: send on closed channel**: Sending to already-closed channel
- **panic: close of closed channel**: Closing channel twice
- **panic: sync.WaitGroup misuse**: Add after Wait or negative counter
- **context deadline exceeded**: Operation timed out
- **broken pipe**: Writing to closed network connection
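The first two panics usually trace back to unclear close ownership. One mitigation, sketched here under the assumption that dropping sends after close is acceptable (`SafeChan` is a hypothetical wrapper, not a standard type): guard the channel with a mutex and a closed flag, so a second `Close` is a no-op and a late `Send` fails gracefully instead of panicking.

```go
package main

import (
	"fmt"
	"sync"
)

// SafeChan makes Close idempotent and Send refuse to write after close,
// avoiding both "send on closed channel" and "close of closed channel".
type SafeChan struct {
	mu     sync.Mutex
	ch     chan int
	closed bool
}

func NewSafeChan(size int) *SafeChan {
	return &SafeChan{ch: make(chan int, size)}
}

// Send reports whether the value was accepted. It holds the lock across
// the send, so it may block if the buffer is full -- fine for a sketch,
// but a production version would need a non-blocking or timed variant.
func (s *SafeChan) Send(v int) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.closed {
		return false // Dropped instead of panicking
	}
	s.ch <- v
	return true
}

func (s *SafeChan) Close() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.closed {
		return // Second Close is a no-op, not a panic
	}
	s.closed = true
	close(s.ch)
}

func main() {
	s := NewSafeChan(1)
	fmt.Println(s.Send(1)) // prints "true"
	s.Close()
	s.Close()              // Safe: no "close of closed channel"
	fmt.Println(s.Send(2)) // prints "false": no "send on closed channel"
}
```

The simpler alternative, when the code allows it, is the ownership rule from the Prevention list: exactly one goroutine sends and closes, everyone else only receives.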