Introduction

A Go channel deadlock occurs when goroutines block on channel operations that can never complete, hanging the program or crashing it with the runtime error "fatal error: all goroutines are asleep - deadlock!". Deadlocks arise when circular dependencies form between goroutines waiting on channels, when sends block forever with no receiver, when receives block forever with no sender, when channels are never closed to signal completion, or when a select statement has no ready case and no default. Common causes include sending on an unbuffered channel with no ready receiver, receiving from a channel with no sender, using an unbuffered channel where a buffered one is needed, never closing a channel so that range blocks, circular goroutine communication patterns, a select with no default that blocks indefinitely, contexts without cancellation leaving goroutines stranded, WaitGroup counters that are never decremented so Wait blocks forever, and mutexes held while waiting on a channel. Fixing these problems requires understanding Go's channel semantics, proper select patterns, buffer sizing, close conventions, and the runtime's deadlock detector. This guide provides production-proven techniques for preventing and fixing Go channel deadlocks.

Symptoms

  • Program hangs with no output
  • "fatal error: all goroutines are asleep - deadlock!" panic
  • Goroutine count grows then stops
  • Application stops processing but no error logged
  • Specific operation hangs indefinitely
  • Goroutine dump shows all goroutines blocked in chan send/receive
  • select statement with no cases ready
  • Range over channel never completes

Common Causes

  • Send on unbuffered channel without receiver ready
  • Receive from channel with no sender
  • Channel not closed, range blocks forever
  • Select with no ready cases and no default
  • Circular goroutine dependencies
  • Mutex held while waiting on channel
  • WaitGroup.Add without matching Done
  • Context not cancelled, goroutines wait forever
  • Channel buffer too small for workload pattern

Step-by-Step Fix

### 1. Diagnose deadlock

Enable deadlock detector:

```bash
# The Go runtime has a built-in deadlock detector: it aborts the
# program whenever every goroutine is blocked
# ("fatal error: all goroutines are asleep - deadlock!")

# The race detector does not find deadlocks directly, but the data
# races it reports often accompany synchronization bugs
go run -race main.go
go test -race ./...

# Get a goroutine stack trace from a hung process:
# send SIGQUIT to the running process
kill -SIGQUIT <pid>

# The output dumps every goroutine's stack.
# Look for goroutines blocked on chan send / chan receive.
```

Analyze goroutine dump:

```bash
# Capture a goroutine profile (requires net/http/pprof)
go tool pprof http://localhost:6060/debug/pprof/goroutine

# Or save to a file first
curl http://localhost:6060/debug/pprof/goroutine > goroutine.prof
go tool pprof goroutine.prof

# Look for blocked goroutines:
# - blocked on chan send    (no receiver)
# - blocked on chan receive (no sender)
# - blocked on select       (no case ready)

# Print scheduler state every second to watch for stalled progress
GODEBUG=schedtrace=1000 ./app
```

Add simple deadlock detection to code:

```go
import (
	"fmt"
	"runtime"
)

func detectDeadlock() {
	// Count goroutines; if the count stays constant while the
	// program makes no progress, a deadlock is likely
	n := runtime.NumGoroutine()
	fmt.Printf("Goroutines: %d\n", n)
}
```

Identify blocked goroutines:

```go
// Deadlock detection helper: warn when the goroutine count stops changing
package main

import (
	"fmt"
	"runtime"
	"time"
)

func watchGoroutines() {
	prev := runtime.NumGoroutine()

	for range time.Tick(5 * time.Second) {
		curr := runtime.NumGoroutine()

		if curr == prev {
			fmt.Printf("WARNING: goroutine count stable at %d - possible deadlock\n", curr)
			// Print all goroutine stack traces
			buf := make([]byte, 1<<20)
			n := runtime.Stack(buf, true)
			fmt.Printf("%s\n", buf[:n])
		}

		prev = curr
	}
}

func main() {
	go watchGoroutines()
	// ... rest of application ...
}
```

### 2. Fix unbuffered channel deadlocks

Basic send/receive deadlock:

```go
// BAD: send with no receiver - deadlock!
func deadlock1() {
	ch := make(chan int) // Unbuffered
	ch <- 42             // Blocks forever - no receiver
	// Program crashes: all goroutines are asleep - deadlock!
}

// BAD: receive with no sender - deadlock!
func deadlock2() {
	ch := make(chan int)
	<-ch // Blocks forever - no sender
}

// CORRECT: use a goroutine for concurrent send/receive
func fixed1() {
	ch := make(chan int)
	go func() {
		ch <- 42 // Send in goroutine
	}()
	val := <-ch // Receive in main goroutine
	fmt.Println(val)
}

// CORRECT: use a buffered channel if an async send is needed
func fixed2() {
	ch := make(chan int, 1) // Buffered
	ch <- 42                // Non-blocking (buffer has room)
	val := <-ch
	fmt.Println(val)
}
```

Worker pool deadlock:

```go
// BAD: worker pool with insufficient capacity
func workerPoolDeadlock(jobs []int) {
	jobsCh := make(chan int)    // Unbuffered!
	resultsCh := make(chan int) // Unbuffered!

	// Start workers
	for i := 0; i < 3; i++ {
		go func() {
			for job := range jobsCh {
				resultsCh <- job * 2 // Blocks: no one is receiving yet!
			}
		}()
	}

	// Send jobs
	for _, job := range jobs {
		jobsCh <- job // BUG: blocks once all workers are stuck on resultsCh
	}
	close(jobsCh)

	// Collect results - but too late: workers are blocked sending on
	// resultsCh while main is blocked sending jobs. DEADLOCK!
	for i := 0; i < len(jobs); i++ {
		<-resultsCh
	}
}

// CORRECT: buffer the jobs and results
func workerPoolFixed(jobs []int) []int {
	jobsCh := make(chan int, len(jobs))    // Buffer all jobs
	resultsCh := make(chan int, len(jobs)) // Buffer all results

	// Start workers
	for i := 0; i < 3; i++ {
		go func() {
			for job := range jobsCh {
				resultsCh <- job * 2
			}
		}()
	}

	// Send jobs
	for _, job := range jobs {
		jobsCh <- job
	}
	close(jobsCh)

	// Collect results
	results := make([]int, len(jobs))
	for i := 0; i < len(jobs); i++ {
		results[i] = <-resultsCh
	}

	return results
}

// CORRECT: use errgroup for cleaner worker management
func workerPoolErrGroup(jobs []int) ([]int, error) {
	g, ctx := errgroup.WithContext(context.Background())
	results := make([]int, len(jobs))

	for i, job := range jobs {
		i, job := i, job // capture loop variables (pre-Go 1.22)
		g.Go(func() error {
			select {
			case <-ctx.Done():
				return ctx.Err()
			default:
				results[i] = job * 2
				return nil
			}
		})
	}

	return results, g.Wait()
}
```

### 3. Fix select deadlocks

Select with no ready cases:

```go
// BAD: select with no ready cases and no default - blocks forever
func selectDeadlock() {
	ch1 := make(chan int)
	ch2 := make(chan int)

	// Neither channel has a sender, so this select blocks forever
	select {
	case <-ch1:
		fmt.Println("ch1")
	case <-ch2:
		fmt.Println("ch2")
	}
}

// CORRECT: add a default case for non-blocking behavior
func selectNonBlocking() {
	ch1 := make(chan int)
	ch2 := make(chan int)

	select {
	case <-ch1:
		fmt.Println("ch1")
	case <-ch2:
		fmt.Println("ch2")
	default:
		// No channel ready, continue without blocking
		fmt.Println("no channel ready")
	}
}

// CORRECT: add a timeout to prevent an indefinite block
func selectWithTimeout() {
	ch := make(chan int)

	go func() {
		time.Sleep(100 * time.Millisecond)
		ch <- 42
	}()

	select {
	case val := <-ch:
		fmt.Println(val)
	case <-time.After(500 * time.Millisecond):
		// Timeout after 500ms
		fmt.Println("timeout")
	}
}
```

Select in loop:

```go
// BAD: select in a loop misbehaves when the channel closes
func selectLoopDeadlock(ch <-chan int) {
	for {
		select {
		case val := <-ch:
			process(val)
			// Missing check for a closed channel!
		}
	}
	// Once ch closes, receives return the zero value immediately,
	// so the loop spins forever processing zeros
}

// CORRECT: handle the closed channel
func selectLoopFixed(ch <-chan int) {
	for {
		select {
		case val, ok := <-ch:
			if !ok {
				// Channel closed
				return
			}
			process(val)
		}
	}
}

// Or use range, which handles close automatically
func selectLoopWithRange(ch <-chan int) {
	for val := range ch {
		process(val)
	}
	// Loop exits when the channel closes
}
```

### 4. Fix channel close issues

Range on unclosed channel:

```go
// BAD: range over a channel that is never closed
func rangeDeadlock(ch chan int) {
	// This blocks forever if ch is never closed
	for val := range ch {
		process(val)
	}
}

func producer() {
	ch := make(chan int)
	go rangeDeadlock(ch)

	ch <- 1
	ch <- 2
	ch <- 3
	// Missing: close(ch)
	// rangeDeadlock blocks forever waiting for more values
}

// CORRECT: close the channel when done
func producerFixed() {
	ch := make(chan int)
	go rangeDeadlock(ch)

	ch <- 1
	ch <- 2
	ch <- 3
	close(ch) // Signal completion
}

// CORRECT: use defer to guarantee the close
func producerWithDefer() {
	ch := make(chan int)
	defer close(ch) // Always closed, even on panic

	go rangeDeadlock(ch)

	// ... produce values ...
}
```

Close direction matters:

```go
// Only close a channel from the sender side.
// BAD: receiver closing the channel
func badClose() {
	ch := make(chan int)

	go func() {
		for val := range ch {
			process(val)
		}
	}()

	go func() {
		<-ch
		close(ch) // WRONG - the receiver shouldn't close!
	}()

	ch <- 1
	ch <- 2 // May panic: send on closed channel
}

// CORRECT: the sender closes
func correctClose() {
	ch := make(chan int)

	go func() {
		defer close(ch) // Sender closes when done
		ch <- 1
		ch <- 2
	}()

	for val := range ch {
		process(val)
	}
}

// A send-only channel parameter documents which side may close
func producer(ch chan<- int) {
	ch <- 1
	ch <- 2
	close(ch) // OK - we're the sender
}

func consumer(ch <-chan int) {
	// ch <- 1   // compile error - can't send on a receive-only channel
	// close(ch) // compile error - can't close a receive-only channel
	for val := range ch {
		process(val)
	}
}
```

### 5. Fix context cancellation deadlocks

Context without cancellation:

```go
// BAD: goroutine waits forever if the work never completes
func contextDeadlock() {
	ctx := context.Background() // No cancellation!

	go func() {
		// This goroutine blocks forever if doWork never returns
		doWork(ctx)
	}()

	// Main continues and may exit with the goroutine still running
}

// CORRECT: use a context with cancellation
func contextFixed() error {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel() // Ensure cleanup

	// Buffer of 1 so the worker can send its result and exit
	// even if we stopped listening after the timeout
	done := make(chan error, 1)

	go func() {
		done <- doWork(ctx)
	}()

	select {
	case err := <-done:
		return err
	case <-ctx.Done():
		return ctx.Err() // Timed out or cancelled
	}
}
```

Multiple goroutines with context:

```go
// CORRECT: fan-out with context cancellation
func fanOut(ctx context.Context, inputs []int) ([]int, error) {
	// Worker context tied to the caller's context
	workerCtx, cancel := context.WithCancel(ctx)
	defer cancel() // Cancel all workers on exit

	results := make([]int, len(inputs))
	errors := make(chan error, len(inputs))
	var wg sync.WaitGroup

	for i, input := range inputs {
		wg.Add(1)
		go func(i, input int) {
			defer wg.Done()

			select {
			case <-workerCtx.Done():
				errors <- workerCtx.Err()
			default:
				results[i] = input * 2
				errors <- nil
			}
		}(i, input)
	}

	// Close the errors channel once all workers finish
	go func() {
		wg.Wait()
		close(errors)
	}()

	// Collect results or return the first error
	for err := range errors {
		if err != nil {
			return nil, err
		}
	}

	return results, nil
}
```

### 6. Fix mutex and channel deadlocks

Mutex held while waiting on channel:

```go
// BAD: holding a mutex while waiting on a channel - deadlock!
type BadStruct struct {
	mu   sync.Mutex
	data map[string]int
}

// This version is actually fine: the lock is released
// before the channel operation
func (b *BadStruct) updateAndNotify(ch chan struct{}) {
	b.mu.Lock()
	b.data["key"] = 42
	b.mu.Unlock()

	ch <- struct{}{} // May block, but no lock is held
}

// The real problem:
func (b *BadStruct) reallyBad(ch chan struct{}) {
	b.mu.Lock()
	defer b.mu.Unlock()

	// DEADLOCK: waiting on the channel while holding the lock;
	// other goroutines can't acquire the lock to make progress
	<-ch

	b.data["key"] = 42
}

// CORRECT: don't hold the mutex while waiting on a channel
type GoodStruct struct {
	mu   sync.Mutex
	data map[string]int
}

func (g *GoodStruct) updateAndNotify(ch chan struct{}) {
	// Prepare data with the lock held
	g.mu.Lock()
	g.data["key"] = 42
	g.mu.Unlock()

	// Notify without holding the lock
	select {
	case ch <- struct{}{}:
	default:
		// No receiver ready, skip the notification
	}
}

// CORRECT: use a channel for synchronization instead of a mutex
func useChannels() {
	updateCh := make(chan func(map[string]int))

	// A single goroutine owns the data
	go func() {
		data := make(map[string]int)
		for update := range updateCh {
			update(data) // Apply update
		}
	}()

	// Send updates through the channel (no mutex needed)
	updateCh <- func(d map[string]int) { d["key"] = 42 }
}
```

### 7. Fix WaitGroup deadlocks

WaitGroup without Done:

```go
// BAD: WaitGroup.Add without a matching Done - deadlock!
func wgDeadlock() {
	var wg sync.WaitGroup

	wg.Add(1)
	go func() {
		// Forgot wg.Done()!
		doWork()
	}()

	wg.Wait() // Blocks forever - Done is never called
}

// CORRECT: always call Done with defer
func wgFixed() {
	var wg sync.WaitGroup

	wg.Add(1)
	go func() {
		defer wg.Done() // Runs even if doWork panics
		doWork()
	}()

	wg.Wait()
}

// CORRECT: a pattern that makes Done hard to forget
func wgPattern() {
	var wg sync.WaitGroup

	// Add matches the number of goroutines exactly
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			doWork()
		}()
	}

	wg.Wait()
}
```

### 8. Debug with tools

Use Go's deadlock detector:

```bash
# The runtime deadlock detector is always on: it aborts the program
# when every goroutine is blocked. Build with the race detector to
# also catch data races that often accompany deadlock-prone code
go build -race main.go

# Run the instrumented binary
./main

# The race detector reports data races, not deadlocks; treat its
# warnings as signs of broader synchronization bugs
```

Use pprof for goroutine analysis:

```go
import (
	"net/http"
	_ "net/http/pprof" // Registers the /debug/pprof handlers
	"os"
	"runtime/pprof"
)

func main() {
	// Start the pprof server
	go http.ListenAndServe("localhost:6060", nil)

	// Or write a goroutine profile to a file
	f, _ := os.Create("goroutine.prof")
	pprof.Lookup("goroutine").WriteTo(f, 0)

	// ... application code ...
}

// Analyze:
//   go tool pprof goroutine.prof
//   (pprof) web   # visualize goroutine stacks
```

Prevention

  • Use buffered channels when send shouldn't block
  • Always close channels from sender side when done
  • Use select with default or timeout for non-blocking operations
  • Cancel contexts to unblock waiting goroutines
  • Call WaitGroup.Done with defer to ensure it's called
  • Don't hold mutex while waiting on channels
  • Use errgroup for cleaner goroutine coordination
  • Test concurrent code under load to expose deadlocks
  • Use race detector in CI/CD pipeline
  • Add goroutine monitoring to production applications

Related Errors

  • **panic: send on closed channel**: Sending after the channel was closed
  • **panic: close of closed channel**: Closing an already-closed channel
  • **context deadline exceeded**: Operation timed out before completing
  • **sync: negative WaitGroup counter**: More Done calls than Add calls