Introduction

Race conditions in Go tests cause flaky tests that pass most of the time but fail unpredictably, especially under CI load or on machines with different CPU counts. The Go race detector (go test -race) identifies data races at runtime, but many teams do not run it in CI or ignore its warnings. Race conditions in tests are usually caused by shared mutable state between test goroutines, improper use of time.Sleep for synchronization, or goroutines outliving their test function. Fixing them requires understanding Go's memory model and using proper synchronization primitives.

Symptoms

The race detector reports:

```
WARNING: DATA RACE
Write at 0x00c0000160a0 by goroutine 9:
  myapp/pkg/cache.(*Cache).Set()
      /app/cache/cache.go:25 +0x45
  myapp/pkg/cache.TestConcurrentSet.func1()
      /app/cache/cache_test.go:42 +0x89

Previous read at 0x00c0000160a0 by goroutine 8:
  myapp/pkg/cache.(*Cache).Get()
      /app/cache/cache.go:35 +0x3e
  myapp/pkg/cache.TestConcurrentSet.func2()
      /app/cache/cache_test.go:52 +0x89
```

Or the test fails intermittently:

```bash
=== RUN   TestConcurrentUpdates
    cache_test.go:67: expected count 100, got 97
--- FAIL: TestConcurrentUpdates (0.01s)
```

Running the same test again passes:

```bash
=== RUN   TestConcurrentUpdates
--- PASS: TestConcurrentUpdates (0.01s)
```

Common Causes

  • Shared mutable map without mutex: Concurrent map reads and writes cause races
  • Counter updated by multiple goroutines: count++ is not atomic (it is read-modify-write)
  • time.Sleep used for synchronization: Sleep does not guarantee the goroutine has finished
  • Test globals modified by parallel tests: t.Parallel() tests sharing package-level state
  • Goroutine outliving the test: Background goroutine still running when test function returns
  • Benchmark with shared state: Benchmarks using b.RunParallel run iterations concurrently and race on shared resources

Step-by-Step Fix

Step 1: Use sync.Mutex for shared map access

```go
// WRONG - concurrent map access
func TestConcurrentSet(t *testing.T) {
    cache := make(map[string]string)

    var wg sync.WaitGroup
    for i := 0; i < 100; i++ {
        wg.Add(1)
        go func(i int) {
            defer wg.Done()
            cache[fmt.Sprintf("key-%d", i)] = fmt.Sprintf("value-%d", i) // RACE
        }(i)
    }
    wg.Wait()

    if len(cache) != 100 {
        t.Errorf("expected 100 entries, got %d", len(cache))
    }
}

// CORRECT - mutex-protected map
func TestConcurrentSet(t *testing.T) {
    var mu sync.Mutex
    cache := make(map[string]string)

    var wg sync.WaitGroup
    for i := 0; i < 100; i++ {
        wg.Add(1)
        go func(i int) {
            defer wg.Done()
            mu.Lock()
            cache[fmt.Sprintf("key-%d", i)] = fmt.Sprintf("value-%d", i)
            mu.Unlock()
        }(i)
    }
    wg.Wait()

    mu.Lock()
    defer mu.Unlock()
    if len(cache) != 100 {
        t.Errorf("expected 100 entries, got %d", len(cache))
    }
}
```

Step 2: Use sync/atomic for counters

```go
import "sync/atomic"

func TestConcurrentIncrement(t *testing.T) {
    var counter atomic.Int64

    var wg sync.WaitGroup
    for i := 0; i < 1000; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            counter.Add(1) // Atomic - no race
        }()
    }
    wg.Wait()

    if got := counter.Load(); got != 1000 {
        t.Errorf("expected 1000, got %d", got)
    }
}
```

Step 3: Never use time.Sleep for synchronization

```go
// WRONG - sleep does not guarantee completion
func TestAsyncOperation(t *testing.T) {
    var result string
    go func() {
        result = fetchData()
    }()
    time.Sleep(100 * time.Millisecond) // Fragile!
    if result == "" {
        t.Error("expected result")
    }
}

// CORRECT - use channels or WaitGroup
func TestAsyncOperation(t *testing.T) {
    done := make(chan string, 1)
    go func() {
        done <- fetchData()
    }()

    select {
    case result := <-done:
        if result == "" {
            t.Error("expected result")
        }
    case <-time.After(5 * time.Second):
        t.Fatal("timed out waiting for result")
    }
}
```

Step 4: Run race detector in CI

```bash
# In CI pipeline
go test -race -count=1 ./...

# Run tests multiple times to catch flakiness
for i in $(seq 1 10); do
  go test -race -count=1 ./... || exit 1
done
```

Prevention

  • Always run go test -race in CI -- it slows tests down (typically 2-10x CPU and memory) but catches real races
  • Use sync.Map for read-heavy concurrent maps (write-heavy still needs sync.Mutex)
  • Prefer channels over shared memory for communication between goroutines
  • Use t.Cleanup() to ensure background goroutines are stopped when tests end
  • Never use time.Sleep to wait for goroutines -- use sync.WaitGroup or channels
  • Add go test -race -shuffle=on to randomize test execution order and expose hidden races