Introduction

A positive finding from the Go race detector indicates a data race: two or more goroutines access the same memory location concurrently, and at least one of the accesses is a write. Data races cause undefined behavior, including corrupted data, panics, incorrect results, and intermittent failures that are difficult to reproduce. The race detector (enabled with the -race flag) is built on ThreadSanitizer and finds races at runtime by tracking memory accesses and synchronization operations.

Common causes include unprotected access to shared variables, reads or writes performed without holding the relevant mutex, channels used incorrectly for synchronization, slices or maps accessed concurrently without protection, singletons initialized without sync.Once, closures capturing a loop variable that is modified concurrently, global variables accessed from multiple goroutines, and pointer aliasing that creates unintended shared state.

Fixing a finding requires understanding the Go memory model, correct use of synchronization primitives (mutexes, channels, atomics), and systematic identification of shared state. This guide provides production-proven techniques for fixing race detector findings across goroutine patterns, concurrent data structures, and distributed systems.
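As a minimal illustration (a hypothetical two-goroutine program, not taken from any real report), the following is the smallest shape of code that `go run -race` flags: two goroutines performing an unsynchronized read-modify-write on the same variable.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is shared state with deliberately no synchronization:
// `go run -race` reports a DATA RACE on the increments below.
var counter int

func runRacy() int {
	counter = 0
	var wg sync.WaitGroup
	for g := 0; g < 2; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				counter++ // DATA RACE: unprotected read-modify-write
			}
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	// Without -race this usually runs to completion, but the result
	// may be anything up to 2000 because increments can be lost.
	fmt.Println(runRacy())
}
```

Running the same program with `go run -race` prints a WARNING: DATA RACE report of the shape shown in step 1.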

Symptoms

  • WARNING: DATA RACE in test output with -race flag
  • Paired "Read at" / "Previous write at" stack traces shown in the race report
  • Intermittent test failures that pass when run individually
  • Production panics with concurrent map read and map write
  • Corrupted data structures under concurrent load
  • Incorrect calculation results that vary between runs
  • fatal error: concurrent map writes panic
  • Race-enabled builds run noticeably slower (typically 2-20x) and use more memory
  • Race only occurs under specific timing conditions

Common Causes

  • Shared variable accessed without mutex protection
  • Map read/write from multiple goroutines concurrently
  • Slice append while another goroutine reads
  • Global configuration modified during runtime access
  • Singleton pattern without sync.Once or proper locking
  • Cache accessed without synchronization
  • Counter incremented without atomic operations
  • Channel closed while another goroutine sends
  • WaitGroup used incorrectly (Add called after Wait)
  • Context cancelled while goroutine still accessing shared state

Step-by-Step Fix

### 1. Understand race detector output

Read race report:

```
WARNING: DATA RACE
Read at 0x00c000123456 by goroutine 10:
  main.counter.Inc()
      /app/counter.go:15 +0x45
  main.worker()
      /app/worker.go:25 +0x67

Previous write at 0x00c000123456 by goroutine 8:
  main.counter.Inc()
      /app/counter.go:15 +0x5a
  main.worker()
      /app/worker.go:25 +0x67

Goroutine 10 (running) created at:
  main.main()
      /app/main.go:10 +0x45

Goroutine 8 (running) created at:
  main.main()
      /app/main.go:10 +0x45
```

Key information in the race report:

  • Memory address involved (0x00c000123456)
  • Operation type (Read vs Write)
  • Goroutine IDs (8 and 10)
  • File and line number of each access
  • Goroutine creation stack

Enable race detector:

```bash
# Run tests with race detector
go test -race ./...

# Run application with race detector
go run -race main.go
go build -race -o app main.go && ./app

# Race detector options (via the GORACE environment variable)
export GORACE="log_path=/tmp/race"  # Write reports to /tmp/race.<pid>
export GORACE="halt_on_error=1"     # Exit after the first race report
export GORACE="history_size=7"      # Keep more per-goroutine access history
```

### 2. Fix unprotected shared variables

Add mutex protection:

```go
// WRONG: Unprotected shared counter
type Counter struct {
	value int
}

func (c *Counter) Inc() {
	c.value++ // DATA RACE: read-modify-write is not atomic
}

func (c *Counter) Value() int {
	return c.value // DATA RACE: may read during a write
}

// CORRECT: With mutex
type Counter struct {
	mu    sync.Mutex
	value int
}

func (c *Counter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.value++
}

func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.value
}

// BETTER: Use atomic for simple counters
import "sync/atomic"

type Counter struct {
	value int64 // Must be int64 for 64-bit atomics on 32-bit systems
}

func (c *Counter) Inc() {
	atomic.AddInt64(&c.value, 1)
}

func (c *Counter) Value() int64 {
	return atomic.LoadInt64(&c.value)
}
```

Protect map access:

```go
// WRONG: Concurrent map access
type Cache struct {
	data map[string]string
}

func (c *Cache) Get(key string) string {
	return c.data[key] // DATA RACE: concurrent read
}

func (c *Cache) Set(key, value string) {
	c.data[key] = value // DATA RACE: concurrent write
}

// CORRECT: With RWMutex
type Cache struct {
	mu   sync.RWMutex
	data map[string]string
}

func (c *Cache) Get(key string) string {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.data[key]
}

func (c *Cache) Set(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[key] = value
}

// BETTER for read-heavy workloads with stable keys: sync.Map
type Cache struct {
	data sync.Map // Built-in concurrent map
}

func (c *Cache) Get(key string) (string, bool) {
	val, ok := c.data.Load(key)
	if !ok {
		return "", false
	}
	return val.(string), true
}

func (c *Cache) Set(key, value string) {
	c.data.Store(key, value)
}
```
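To confirm a fix like the RWMutex cache actually holds up, a small concurrent exercise is enough for `go test -race` or `go run -race` to verify. This is an illustrative sketch (the `exercise` helper and key names are made up for the example):

```go
package main

import (
	"fmt"
	"sync"
)

type Cache struct {
	mu   sync.RWMutex
	data map[string]string
}

func NewCache() *Cache { return &Cache{data: make(map[string]string)} }

func (c *Cache) Get(key string) string {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.data[key]
}

func (c *Cache) Set(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[key] = value
}

// exercise hammers two keys from eight goroutines; with the mutex in
// place the race detector stays silent.
func exercise() *Cache {
	c := NewCache()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			key := fmt.Sprintf("k%d", n%2)
			c.Set(key, "v")
			_ = c.Get(key)
		}(i)
	}
	wg.Wait()
	return c
}

func main() {
	c := exercise()
	fmt.Println(c.Get("k0"), c.Get("k1")) // prints "v v"
}
```

Removing the Lock/RLock calls and re-running under `-race` reproduces the report from step 1, which makes this a useful regression check.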

### 3. Fix slice concurrent access

```go
// WRONG: Concurrent slice append
var results []string

go func() {
	results = append(results, "item1") // DATA RACE
}()
go func() {
	results = append(results, "item2") // DATA RACE
}()

// CORRECT: Use a mutex
var (
	mu      sync.Mutex
	results []string
)

go func() {
	mu.Lock()
	results = append(results, "item1")
	mu.Unlock()
}()

// BETTER: Use a channel to collect results
resultsCh := make(chan string, 2)

go func() { resultsCh <- "item1" }()
go func() { resultsCh <- "item2" }()

// Collect in the main goroutine (no race)
var results []string
for i := 0; i < 2; i++ {
	results = append(results, <-resultsCh)
}

// BEST: Pre-allocate and use index-based access
results := make([]string, 2)

var wg sync.WaitGroup
for i := 0; i < 2; i++ {
	wg.Add(1)
	go func(idx int) {
		defer wg.Done()
		results[idx] = fmt.Sprintf("item%d", idx) // No race: distinct indices
	}(i)
}
wg.Wait()
```

### 4. Fix singleton initialization

```go
// WRONG: Race in singleton initialization
var instance *Singleton

func GetInstance() *Singleton {
	if instance == nil {
		instance = &Singleton{} // DATA RACE: multiple goroutines may create
	}
	return instance
}

// CORRECT: Use sync.Once
var (
	instance *Singleton
	once     sync.Once
)

func GetInstance() *Singleton {
	once.Do(func() {
		instance = &Singleton{}
	})
	return instance
}

// CORRECT: Use package-level initialization
var instance = &Singleton{} // Initialized before main()

func GetInstance() *Singleton {
	return instance
}
```
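On Go 1.21 and later, sync.OnceValue collapses the once-plus-variable pair into a single function. A sketch, assuming a Go 1.21+ toolchain (the Singleton field is hypothetical):

```go
package main

import (
	"fmt"
	"sync"
)

type Singleton struct {
	name string
}

// GetInstance is built with sync.OnceValue (Go 1.21+): the constructor
// runs exactly once, even when called from many goroutines, and the
// cached value is returned on every subsequent call.
var GetInstance = sync.OnceValue(func() *Singleton {
	return &Singleton{name: "only one"} // hypothetical field
})

func main() {
	a, b := GetInstance(), GetInstance()
	fmt.Println(a == b) // prints "true": same pointer both times
}
```

This avoids keeping the `instance`/`once` pair in sync by hand, at the cost of requiring a newer toolchain.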

### 5. Fix channel race conditions

Close channel safely:

```go
// WRONG: Channel may be closed multiple times
ch := make(chan int)

go func() {
	// ... produce values
	close(ch) // RACE: may close while another goroutine also closes
}()

go func() {
	// ... also may close
	close(ch) // PANIC: close of closed channel
}()

// CORRECT: A single goroutine is responsible for the close
ch := make(chan int)

go func() {
	defer close(ch) // Only this goroutine closes
	// ... produce values
}()

// Receivers never close
for val := range ch {
	process(val)
}

// CORRECT: Guard the close with sync.Once
var closeOnce sync.Once
ch := make(chan int)

closeCh := func() {
	closeOnce.Do(func() { close(ch) })
}
```

Select with channels:

```go
// WRONG: Blocking send with no receiver guarantee
ch := make(chan int) // Unbuffered

go func() {
	ch <- 42 // May block forever if nobody receives (goroutine leak)
}()

// CORRECT: Use select with default for a non-blocking send
ch := make(chan int, 1) // Buffered

select {
case ch <- 42:
	// Send succeeded
default:
	// Channel full; handle gracefully
}

// CORRECT: Use context for cancellation
ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
defer cancel()

select {
case ch <- 42:
case <-ctx.Done():
	// Timed out; don't block forever
}
```

### 6. Fix closure loop variable capture

```go
// WRONG (Go 1.21 and earlier): Closure captures the shared loop variable
for i := 0; i < 10; i++ {
	go func() {
		fmt.Println(i) // DATA RACE: i shared across all goroutines
	}()
}
// All goroutines may print the same value (often 10)

// CORRECT: Pass the loop variable as a parameter
for i := 0; i < 10; i++ {
	go func(val int) {
		fmt.Println(val) // Each goroutine has its own copy
	}(i)
}

// CORRECT: Create a new variable in loop scope
for i := 0; i < 10; i++ {
	i := i // Shadow the loop variable
	go func() {
		fmt.Println(i)
	}()
}

// NOTE: Go 1.22+ gives each iteration its own loop variable, so the
// original form is no longer a race when built with go >= 1.22.
```

### 7. Fix WaitGroup races

```go
// WRONG: Add called after Wait started
var wg sync.WaitGroup

go func() {
	for i := 0; i < 10; i++ {
		wg.Add(1) // RACE: may call Add while Wait is already waiting
		go func() {
			defer wg.Done()
			doWork()
		}()
	}
}()

wg.Wait()

// CORRECT: All Add calls happen before any Wait
var wg sync.WaitGroup

for i := 0; i < 10; i++ {
	wg.Add(1)
	go func() {
		defer wg.Done()
		doWork()
	}()
}

wg.Wait()

// CORRECT: Semaphore pattern for bounded dynamic work
sem := make(chan struct{}, 10) // Limit to 10 concurrent workers

for i := 0; i < 100; i++ {
	sem <- struct{}{} // Acquire
	go func() {
		defer func() { <-sem }() // Release
		doWork()
	}()
}

// Wait for all to complete: once the semaphore can be refilled to
// capacity, no worker still holds a slot
for i := 0; i < 10; i++ {
	sem <- struct{}{}
}
```

### 8. Production race detection strategies

Enable race detection in CI:

```yaml
# GitHub Actions
name: Test
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Go
        uses: actions/setup-go@v4
        with:
          go-version: 'stable'
      - name: Run tests with race detector
        run: go test -race -v ./...
      - name: Run benchmarks with race detector
        run: go test -race -bench=. -run=^$ ./...
```

Periodic production sampling:

```go
// Enable race detection only during specific test windows:
// build with the -race flag and run periodically in staging.

// Canary test with race detection
func TestProductionWithRace(t *testing.T) {
	if testing.Short() {
		t.Skip("skipping race detection test")
	}

	// Run a production-like workload
	runWorkload()

	// The race detector reports any races it observes
}

// Run periodically:
// go test -race -run TestProductionWithRace -count=10 ./...
```

Stress testing for races:

```bash
# Run tests multiple times to catch intermittent races
go test -race -count=100 ./... 2>&1 | grep -A 10 "DATA RACE"

# Use the stress tool (build the test binary with -race,
# then run it repeatedly; stress itself has no -race flag)
go install golang.org/x/tools/cmd/stress@latest
go test -race -c -o package.test .
stress -p=8 ./package.test -test.run=TestConcurrent

# Parallel stress test
stress -p=$(nproc) ./package.test -test.run=TestConcurrent
```

Prevention

  • Run go test -race in CI/CD pipeline for every commit
  • Use go test -race -count=N for flaky test investigation
  • Document shared state and its synchronization in code comments
  • Prefer channels for goroutine communication over shared memory
  • Use sync.Once for singleton initialization
  • Use sync/atomic for simple counters and flags
  • Use sync.Map for concurrent map access patterns
  • Pre-allocate slices for concurrent index-based access
  • Avoid global variables for mutable state
  • Use the shadow analyzer (a separate vet tool, not a go vet flag): go install golang.org/x/tools/go/analysis/passes/shadow/cmd/shadow@latest, then go vet -vettool=$(which shadow) ./...
Related Issues

  • **Go channel deadlock**: Goroutines blocked waiting for channel operations
  • **Go channel send on closed channel**: Sending after channel closed
  • **Go goroutine leak**: Goroutines not terminating properly
  • **Go context deadline exceeded**: Operation timeout
  • **Go nil pointer dereference**: Accessing nil pointer