Introduction
A GitHub Actions matrix strategy runs a job across multiple combinations of variables (OS, Node version, etc.). By default, `fail-fast` is set to `true`, meaning that if any matrix combination fails, GitHub cancels all in-progress and queued matrix jobs. While this saves resources, it can hide issues in other combinations and prevent useful test results from being collected.
Symptoms
- Matrix job shows `Cancelled` status for most combinations after one fails
- Only one failure is visible while other combinations never run
- Intermittent failures in one combination block results from all other combinations
- Test results are incomplete because most matrix jobs were cancelled
- Status check shows `failure` even though 9 of 12 matrix combinations would have passed
Common Causes
- `fail-fast: true` (default) cancelling all other matrix jobs on first failure
- Flaky test in one specific OS/version combination
- Environment-specific issue (e.g., macOS-specific build failure)
- Resource contention on the runner affecting one combination disproportionately
- Dependency incompatibility with a specific version in the matrix
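To tell these causes apart, it can help to log the environment of every combination so that a failing job can be compared against a passing one. The workflow below is a minimal sketch, not a prescribed setup: the workflow name, the step name, and the use of `npm install` and `npm test` are assumptions for a typical Node project.

```yaml
name: matrix-diagnostics   # hypothetical workflow name
on: [push, pull_request]
jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
        node: [16, 18, 20]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
      # Log what this combination actually runs on before anything can fail
      - name: Print environment details
        shell: bash   # bash is available on all three hosted runner types
        run: |
          echo "Runner: ${{ runner.os }} (${{ runner.arch }})"
          echo "Matrix: os=${{ matrix.os }} node=${{ matrix.node }}"
          node --version
          npm --version
      - run: npm install
      - run: npm test
```

Comparing this output between a passing and a failing job often narrows the cause to the OS, the Node version, or an installed dependency.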
Step-by-Step Fix
1. Disable fail-fast so that all matrix combinations run to completion and every result is visible.
   ```yaml
   jobs:
     test:
       runs-on: ${{ matrix.os }}
       strategy:
         fail-fast: false # Let all combinations run even if one fails
         matrix:
           os: [ubuntu-latest, windows-latest, macos-latest]
           node: [16, 18, 20]
   ```
2. Identify the specific failing combination: check which matrix variables caused the failure.
   ```
   # In the workflow run, click on each matrix combination
   # Check the matrix variables in the job name:
   # "test (ubuntu-latest, 18)" failed
   ```
3. Fix the issue in the failing combination: address the specific failure.
   ```yaml
   # If the issue is OS-specific, add conditional steps
   - name: Install dependencies
     shell: bash # Windows runners default to PowerShell, which cannot parse this script
     run: |
       if [ "${{ runner.os }}" = "macOS" ]; then
         brew install specific-package # placeholder for the macOS-only dependency
       fi
       npm install # the Node dependencies are needed on every OS
   ```
4. Use `continue-on-error` for known flaky combinations: allow the workflow to proceed when they fail.
   ```yaml
   jobs:
     test:
       runs-on: ${{ matrix.os }}
       strategy:
         fail-fast: false
         matrix:
           os: [ubuntu-latest, macos-latest]
       # A failure on macOS no longer fails the workflow run
       continue-on-error: ${{ matrix.os == 'macos-latest' }}
   ```
5. Set up required status checks for only the critical combinations: don't block PRs on non-critical matrix jobs.
   ```
   # In branch protection rules, only require:
   # - test (ubuntu-latest, 18)
   # - test (ubuntu-latest, 20)
   # Not every single matrix combination
   ```
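The `continue-on-error` example above compares against a single OS name. A variant that scales to several combinations is to carry a flag in the matrix itself and add the tolerated combinations through `include`. The sketch below assumes a matrix key called `experimental`; that name is arbitrary, and the choice of macOS with Node 20 as the tolerated combination is purely illustrative.

```yaml
name: matrix-with-experimental-flag   # hypothetical workflow name
on: [push, pull_request]
jobs:
  test:
    runs-on: ${{ matrix.os }}
    # Only combinations flagged as experimental are allowed to fail
    continue-on-error: ${{ matrix.experimental }}
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, windows-latest]
        node: [18, 20]
        experimental: [false]   # every generated combination is critical
        include:
          # Extra combination added on top of the four generated ones
          - os: macos-latest
            node: 20
            experimental: true
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
      - run: npm install
      - run: npm test
```

With this layout, the four generated combinations are natural candidates for required status checks, while the flagged combination stays informational.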
Prevention
- Set `fail-fast: false` for test matrices to collect complete results
- Use `continue-on-error` for experimental or known-flaky matrix combinations
- Separate critical matrix combinations (required for PR merge) from exploratory ones
- Monitor matrix failure patterns to identify environment-specific issues
- Implement retry logic for flaky tests in specific matrix combinations
- Document which matrix combinations are required vs. informational for PR checks
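Retry logic for flaky tests can be sketched directly in a `run` step with a small shell loop. The example below is one possible approach under stated assumptions: the limit of three attempts, the workflow name, and the use of `npm test` as the test command are all illustrative.

```yaml
name: test-with-retry   # hypothetical workflow name
on: [push, pull_request]
jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, macos-latest]
        node: [18, 20]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
      - run: npm install
      - name: Run tests, retrying up to 3 times
        shell: bash
        run: |
          # Stop at the first passing attempt; fail the step if all three fail
          for attempt in 1 2 3; do
            if npm test; then
              exit 0
            fi
            echo "Attempt ${attempt} of 3 failed"
          done
          exit 1
```

Keeping the attempt count low and logging each failed attempt keeps flaky tests visible instead of silently masking them.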