Recognizing Deadlocks

A goroutine deadlock produces this distinctive error message:

```
fatal error: all goroutines are asleep - deadlock!

goroutine 1 [chan receive]:
main.main()
    /app/main.go:15 +0x85

goroutine 6 [chan send]:
main.processData.func1()
    /app/main.go:25 +0x4b
```

This error means every goroutine in the program is blocked waiting on another - none can ever proceed, so the runtime aborts. Note that the detector fires only when *all* goroutines are stuck; a partial deadlock, where a few blocked goroutines leak while the rest of the program runs, goes unreported.

Common Deadlock Patterns

Pattern 1: Unbuffered Channel Without Receiver

Problem code:

```go
func main() {
    ch := make(chan int)
    ch <- 42 // Blocks forever - no receiver!
    fmt.Println("Sent")
}
```

Solution 1 - Add a receiver:

```go
func main() {
    ch := make(chan int)
    go func() {
        fmt.Println("Received:", <-ch)
    }()
    ch <- 42
    time.Sleep(100 * time.Millisecond) // Crude: gives the receiver time to print
}
```

Solution 2 - Use a buffered channel:

```go
func main() {
    ch := make(chan int, 1) // Buffer size 1
    ch <- 42                // Doesn't block - buffer has space
    fmt.Println("Sent")
}
```

Pattern 2: Circular Wait

Problem code:

```go
func main() {
    ch1 := make(chan int)
    ch2 := make(chan int)

    go func() {
        <-ch2    // Waits on ch2 first...
        ch1 <- 1 // ...then would send to ch1
    }()

    <-ch1    // Main waits on ch1 first - circular wait, deadlock!
    ch2 <- 1 // Never reached
}
```

Each side is blocked waiting for the other to act first, so neither ever proceeds.

Solution - Restructure communication:

```go
func main() {
    ch1 := make(chan int, 1)
    ch2 := make(chan int, 1)

    go func() { ch1 <- 1 }()
    go func() { ch2 <- 2 }()

    // Use select to handle multiple channels
    for i := 0; i < 2; i++ {
        select {
        case v := <-ch1:
            fmt.Println("ch1:", v)
        case v := <-ch2:
            fmt.Println("ch2:", v)
        }
    }
}
```

Pattern 3: Mutex Lock Ordering

Problem code:

```go
var mu1, mu2 sync.Mutex

func goroutineA() {
    mu1.Lock()
    time.Sleep(100 * time.Millisecond)
    mu2.Lock() // Deadlock potential!
    mu2.Unlock()
    mu1.Unlock()
}

func goroutineB() {
    mu2.Lock()
    time.Sleep(100 * time.Millisecond)
    mu1.Lock() // Deadlock! A holds mu1 and wants mu2; B holds mu2 and wants mu1
    mu1.Unlock()
    mu2.Unlock()
}
```

Solution - Always acquire locks in the same order:

```go
var mu1, mu2 sync.Mutex

func goroutineA() {
    mu1.Lock()
    defer mu1.Unlock()
    time.Sleep(100 * time.Millisecond)
    mu2.Lock()
    defer mu2.Unlock()
    // work...
}

func goroutineB() {
    mu1.Lock() // Same order as A!
    defer mu1.Unlock()
    time.Sleep(100 * time.Millisecond)
    mu2.Lock()
    defer mu2.Unlock()
    // work...
}
```

Pattern 4: WaitGroup Misuse

Problem code:

```go
func main() {
    var wg sync.WaitGroup

    for i := 0; i < 3; i++ {
        go func() {
            wg.Add(1) // Wrong! Add called inside goroutine
            defer wg.Done()
            fmt.Println(i)
        }()
    }

    wg.Wait() // May return before goroutines start!
}
```

Solution - Add before starting the goroutine:

```go
func main() {
    var wg sync.WaitGroup

    for i := 0; i < 3; i++ {
        wg.Add(1) // Correct! Add before goroutine starts
        go func(n int) {
            defer wg.Done()
            fmt.Println(n)
        }(i)
    }

    wg.Wait()
}
```

Detection Tools

1. Use the Race Detector

The race detector finds data races rather than deadlocks, but races frequently point at the same broken synchronization:

```bash
go run -race main.go
go test -race ./...
```

2. Debug with GODEBUG

```bash
# Print scheduler state every 1000 ms
GODEBUG=schedtrace=1000 go run main.go

# Add per-goroutine detail to the traces
GODEBUG=schedtrace=1000,scheddetail=1 go run main.go
```

The deadlock detector itself needs no flag: it is built into the runtime, always on, and fires when every goroutine is blocked.

3. Use Pprof for Goroutine Analysis

```go
import (
    "net/http"
    _ "net/http/pprof" // Registers the /debug/pprof handlers
)

func main() {
    go func() {
        http.ListenAndServe(":6060", nil)
    }()
    // Your program...
}
```

Then inspect goroutines:

```bash
# View goroutine count and stacks
curl "http://localhost:6060/debug/pprof/goroutine?debug=1"

# Save goroutine profile
curl http://localhost:6060/debug/pprof/goroutine -o goroutine.out

# Analyze with pprof
go tool pprof goroutine.out
```

4. Add Timeout to Channel Operations

```go
func main() {
    ch := make(chan int)

    select {
    case v := <-ch:
        fmt.Println("Received:", v)
    case <-time.After(5 * time.Second):
        fmt.Println("Timeout waiting for channel")
    }
}
```

Or with context:

```go
func main() {
    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()

    ch := make(chan int)

    select {
    case v := <-ch:
        fmt.Println("Received:", v)
    case <-ctx.Done():
        fmt.Println("Context cancelled:", ctx.Err())
    }
}
```

Prevention Strategies

1. Use Context for Cancellation

```go
func worker(ctx context.Context, jobs <-chan int, results chan<- int) {
    for {
        select {
        case job, ok := <-jobs:
            if !ok {
                return
            }
            results <- process(job)
        case <-ctx.Done():
            return
        }
    }
}
```

2. Use Buffered Channels Judiciously

```go
// For producer-consumer with known batch size
jobs := make(chan int, 100) // Buffer to handle bursts
```

3. Use sync.Map for Concurrent Access

```go
var m sync.Map

// Safe concurrent access
m.Store("key", "value")
v, ok := m.Load("key")
```

4. Implement Graceful Shutdown

```go
func main() {
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()

    // Handle signals
    sigCh := make(chan os.Signal, 1)
    signal.Notify(sigCh, os.Interrupt, syscall.SIGTERM)

    go func() {
        <-sigCh
        cancel()
    }()

    // Workers respect context
    var wg sync.WaitGroup
    for i := 0; i < 5; i++ {
        wg.Add(1)
        go worker(ctx, &wg)
    }

    wg.Wait()
}
```

Testing for Deadlocks

```go
func TestNoDeadlock(t *testing.T) {
    done := make(chan bool, 1) // Buffered so the goroutine never leaks on timeout

    go func() {
        // Code that might deadlock
        processThatMightDeadlock()
        done <- true
    }()

    select {
    case <-done:
        // Success
    case <-time.After(5 * time.Second):
        t.Fatal("Potential deadlock detected")
    }
}
```
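The test runner also has a built-in safety net: if the whole test binary exceeds its timeout (10 minutes by default), it panics and prints every goroutine's stack, which usually points straight at the blocked operations.

```bash
# Fail the test binary after 30s and dump all goroutine stacks
go test -timeout 30s ./...
```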