# GitHub Actions Workflow Failed: Complete Troubleshooting Guide
You pushed code to GitHub, the Actions workflow started, and then it failed. Now you're looking at a sea of red X's and need to figure out what went wrong.
GitHub Actions failures can happen at any stage—from checkout to deployment. Let me walk through the most common failure patterns and how to fix them.
## Reading the Failure Logs
Before anything else, understand how to read GitHub Actions logs:
1. Go to your repository's Actions tab
2. Click on the failed workflow run
3. Click on the failed job (marked with a red X)
4. Click on the failed step to expand its logs

Pro tip: Use Ctrl+F to search for "Error", "FAILED", "exception", or "fatal". The actual error is often buried in verbose output.
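
When the useful output lives in a file rather than in the step log (npm, for example, writes its full debug logs under `~/.npm/_logs`), it can help to attach those files to the run. The step below is a minimal sketch that assumes a Node project; the artifact name `npm-debug-logs` and the retention period are illustrative choices, not requirements:

```yaml
# Place after the steps that may fail
- name: Upload npm debug logs
  if: failure()                     # Only runs when an earlier step failed
  uses: actions/upload-artifact@v4
  with:
    name: npm-debug-logs            # Hypothetical artifact name
    path: ~/.npm/_logs/
    include-hidden-files: true      # The logs sit under a dot-directory
    if-no-files-found: ignore       # Don't fail again if no logs were written
    retention-days: 7
```

The uploaded archive then shows up in the Artifacts section of the run's summary page.
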
## Fix 1: Dependency Installation Failures
The most common failure happens during dependency installation:
```
npm ERR! 404 Not Found - GET https://registry.npmjs.org/@scope/package/-/package-1.0.0.tgz
npm ERR! 404 '@scope/package@1.0.0' is not in the npm registry.
```

Or for Python:

```
ERROR: Could not find a version that satisfies the requirement package==1.0.0
```

Diagnosis:

Check if the dependency exists and your package.json or requirements.txt is correct:

```yaml
# Add a debug step before your install step
- name: Debug dependencies
  run: |
    cat package.json
    npm ls --depth=0 || true
```

Solution for private packages:

If you're using private npm packages:
```yaml
- name: Setup Node.js
  uses: actions/setup-node@v4
  with:
    node-version: '20'
    registry-url: 'https://npm.pkg.github.com'
    scope: '@your-org'

- name: Install dependencies
  run: npm ci
  env:
    NODE_AUTH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```
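
If the install still returns 401 or 404 for a package hosted on GitHub Packages, the token's permissions are a likely cause. The sketch below places the same steps in a full job and grants `GITHUB_TOKEN` read access to packages explicitly. The job name `build` is a placeholder, and it assumes the package lives in, or has granted access to, the repository running the workflow:

```yaml
jobs:
  build:                            # Hypothetical job name
    runs-on: ubuntu-latest
    permissions:
      contents: read                # Needed by actions/checkout
      packages: read                # Lets GITHUB_TOKEN download from GitHub Packages
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          registry-url: 'https://npm.pkg.github.com'
          scope: '@your-org'
      - run: npm ci
        env:
          NODE_AUTH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

Once a `permissions:` block is present, any scope not listed is set to none, which is why `contents: read` appears alongside `packages: read`. For packages the repository has no access to, a personal access token with the `read:packages` scope, stored as a secret, would replace `GITHUB_TOKEN` here.
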
For Python private packages:
```yaml
- name: Install dependencies
  run: pip install -r requirements.txt
  env:
    PIP_EXTRA_INDEX_URL: https://${{ secrets.PYPI_TOKEN }}@pypi.example.com/simple/
```

Solution for lockfile issues:

```
npm ERR! `npm ci` requires a clean package-lock.json
```

Regenerate your lockfile:

```bash
# Local
rm package-lock.json
npm install
git add package-lock.json
git commit -m "Regenerate package-lock.json"
git push
```

## Fix 2: Permission Denied Errors
You might see errors like:
```
Error: EACCES: permission denied, open '/home/runner/.npm/_logs/...'
```

Or:

```
fatal: could not create work tree dir 'repo-name': Permission denied
```

Solution:

GitHub-hosted Linux runners run steps as the non-root `runner` user, which has passwordless `sudo`. If you need elevated permissions:

```yaml
- name: Fix permissions
  run: sudo chown -R "$(whoami)" /path/to/directory

# Or run specific commands as root
- name: Run as root
  run: |
    sudo apt-get update
    sudo apt-get install -y package-name
```

For Git operations, ensure your token has correct permissions:
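```yaml
- name: Checkout
  uses: actions/checkout@v4
  with:
    token: ${{ secrets.GITHUB_TOKEN }}  # Has limited permissions
```

If you need to push changes, keep in mind that `git push` reuses the credentials stored by `actions/checkout`, so the token passed to the checkout step is the one that matters. With `GITHUB_TOKEN`, the job also needs `contents: write` under `permissions:`. Pass a PAT to the checkout step if you need more permissions than that:

```yaml
- name: Push changes
  run: |
    git config user.name "github-actions[bot]"
    git config user.email "github-actions[bot]@users.noreply.github.com"
    git push
```

## Fix 3: Step Timeout Failures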
Some steps take longer than expected:
```
Error: The action 'Build' has timed out after 360 minutes
```

Diagnosis:

Your step or job might be hanging. Add timeout and debug steps:
```yaml
jobs:
  build:
    timeout-minutes: 30  # Job-level timeout
    steps:
      - name: Long running step
        timeout-minutes: 10  # Step-level timeout
        run: ./build.sh
```

Solution:

For genuinely long builds, increase the timeout:
```yaml
jobs:
  build:
    timeout-minutes: 90  # Raised from the 30 set earlier; the default is 360
```

For stuck processes, add process monitoring:

```yaml
- name: Debug running processes
  if: always()  # Runs even if previous steps fail
  run: |
    ps aux
    docker ps -a
    df -h
```

## Fix 4: Environment Variable Issues
Missing or incorrect environment variables:
```
Error: API_KEY is not defined
```

Diagnosis:

Add a debug step to check your environment:
```yaml
- name: Debug environment
  run: |
    echo "Node version: $(node --version)"
    echo "NPM version: $(npm --version)"
    echo "Working directory: $(pwd)"
    # Prints "yes" only when the variable is non-empty; don't echo actual secret values!
    echo "GITHUB_TOKEN present: ${GITHUB_TOKEN:+yes}"
```

Solution:

Set environment variables correctly:
```yaml
jobs:
  build:
    env:
      NODE_ENV: test
      API_URL: https://api.example.com
    steps:
      - name: Use environment
        run: echo "API URL is $API_URL"
        env:
          PER_STEP_VAR: value
```

For secrets:

```yaml
- name: Deploy
  run: ./deploy.sh
  env:
    AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
    AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
```

Important: Never echo secret values. They're masked in logs, but still don't print them.
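
A missing secret expands to an empty string rather than raising an error, so the failure usually surfaces later and in a less obvious place. One way to catch it early is a guard step that checks for presence without printing the value. This is a sketch that assumes a secret named `API_KEY`, matching the error message earlier in this section:

```yaml
- name: Check required variables
  env:
    API_KEY: ${{ secrets.API_KEY }}   # Assumes a secret named API_KEY
  run: |
    if [ -z "$API_KEY" ]; then
      # ::error:: turns the message into an annotation on the run summary
      echo "::error::API_KEY is empty. Check the secret name and whether this run has access to secrets."
      exit 1
    fi
    echo "API_KEY is set"
```

An empty value is expected in some runs: apart from `GITHUB_TOKEN`, secrets are not passed to workflows triggered by pull requests from forks.
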
## Fix 5: Docker Build Failures
Docker-related errors are common:
```
Error: denied: permission_denied: write access to repository
```

Or:

```
Error: no space left on device
```

Solution for registry permission:

```yaml
- name: Login to Docker Hub
  uses: docker/login-action@v3
  with:
    username: ${{ secrets.DOCKERHUB_USERNAME }}
    password: ${{ secrets.DOCKERHUB_TOKEN }}

- name: Build and push
  uses: docker/build-push-action@v5
  with:
    context: .
    push: true
    tags: user/image:latest
```

For GitHub Container Registry:
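```yaml
- name: Login to GHCR
  uses: docker/login-action@v3
  with:
    registry: ghcr.io
    username: ${{ github.actor }}
    password: ${{ secrets.GITHUB_TOKEN }}
```

Pushing to GHCR with `GITHUB_TOKEN` also requires the job to grant `packages: write` under `permissions:`.

Solution for disk space:
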
Add disk cleanup:
```yaml
- name: Free disk space
  run: |
    sudo rm -rf /usr/share/dotnet
    sudo rm -rf /usr/local/lib/android
    sudo rm -rf /opt/ghc
    df -h
```

Or use a specialized action:

```yaml
- name: Free disk space
  uses: jlumbroso/free-disk-space@main
  with:
    tool-cache: false
```

## Fix 6: Matrix Build Failures
Matrix builds fail for one configuration but pass others:
```yaml
strategy:
  matrix:
    node: [16, 18, 20]
```

Diagnosis:

Check which matrix combination failed:
```yaml
- name: Debug matrix
  run: |
    echo "Testing with Node ${{ matrix.node }}"
    echo "Running on ${{ runner.os }}"
```

Solution:

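Let the other matrix jobs finish instead of being cancelled when one fails, so you can see exactly which combinations are broken:

```yaml
strategy:
  fail-fast: false  # Continue other matrix jobs even if one fails
  matrix:
    node: [16, 18, 20]
```

Or exclude problematic combinations:
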
```yaml
strategy:
  matrix:
    os: [ubuntu-latest, windows-latest]
    node: [16, 18, 20]
    exclude:
      - os: windows-latest
        node: 16  # Skip this combination
```

## Fix 7: Cache Issues
Sometimes cache causes problems:
```
Error: Unable to restore cache
```

Solution:

Add cache invalidation:
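```yaml
- name: Cache node modules
  uses: actions/cache@v4
  with:
    path: ~/.npm
    key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      ${{ runner.os }}-node-
```

To force a cache refresh, add a version to the key prefix and change `restore-keys` to the same `-node-v2-` prefix. A version appended after the hash would still let the old cache be restored through the prefix match:

```yaml
key: ${{ runner.os }}-node-v2-${{ hashFiles('**/package-lock.json') }}
```

## Debugging Workflow Files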
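When the workflow file itself has issues, turn on debug logging. GitHub reads these two settings from repository secrets or variables (Settings > Secrets and variables > Actions), not from the workflow's `env:` block. You can also re-run a failed job with "Enable debug logging" checked:

```yaml
# Define these as repository secrets or variables for maximum debug output
ACTIONS_STEP_DEBUG: true    # Verbose step logs
ACTIONS_RUNNER_DEBUG: true  # Runner diagnostic logs
```

Or use the runner debug shell:
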
```yaml
- name: Debug with tmate
  if: failure()
  uses: mxschmitt/action-tmate@v3
  timeout-minutes: 15
```

This gives you SSH access to the runner for interactive debugging.
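
The tmate connection string is printed in the job log, so anyone who can read the log can reach the session unless it is restricted. The variant below is a sketch of one way to tighten this. It starts the session only on runs where debug logging is enabled (the `runner.debug` context is `1` in that case) and uses the action's `limit-access-to-actor` input, which limits connections to SSH keys registered on the GitHub account that triggered the run:

```yaml
- name: Debug with tmate (debug runs only)
  if: failure() && runner.debug == '1'   # Skipped on normal runs
  uses: mxschmitt/action-tmate@v3
  timeout-minutes: 15
  with:
    limit-access-to-actor: true          # Requires an SSH key on your GitHub profile
```

Gating on `runner.debug` means the step can stay in the workflow permanently and only activates when someone deliberately re-runs the job with debug logging.
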
## Quick Reference: Common Failure Patterns
| Error Pattern | Cause | Solution |
|---|---|---|
| `npm ERR! 404` | Missing package | Check package name, registry |
| `EACCES` | Permission denied | Fix file/directory permissions |
| `ETIMEDOUT` | Network timeout | Check external services |
| `command not found` | Tool not installed | Add setup step |
| `ENOMEM` | Out of memory | Optimize build, reduce parallelism, or use a larger runner |
| `SIGTERM` | Process killed | Check for timeout, resource limits |

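
For the `ETIMEDOUT` row, the external service is often only briefly unavailable, and retrying the step is a reasonable first response. A minimal sketch using `npm ci` as the flaky command; the three attempts and the 10-second pause are arbitrary values to adjust:

```yaml
- name: Install dependencies (with retries)
  run: |
    for attempt in 1 2 3; do
      # Stop with success as soon as one attempt works
      npm ci && exit 0
      echo "npm ci failed (attempt $attempt), retrying in 10s"
      sleep 10
    done
    # All attempts failed, so fail the step
    exit 1
```

If every attempt fails the same way, the cause is more likely a wrong registry URL or a blocked network path than a transient outage.
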
## Preventing Failures
Add a linting step to catch issues early:
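```yaml
name: CI

on: push

jobs:
  lint-workflow:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate workflow
        run: |
          # Install actionlint
          go install github.com/rhysd/actionlint/cmd/actionlint@latest
          # Call it via GOPATH/bin in case that directory is not on PATH;
          # with no arguments it checks every workflow file in the repository
          "$(go env GOPATH)/bin/actionlint"
```
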
This catches syntax errors and common misconfigurations before they cause failures in your actual workflows.
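
Linting on every push can be narrowed so the job only runs when a workflow file actually changes. The trigger below is a sketch that would replace `on: push` in a dedicated lint workflow; it assumes the default branch is named `main`:

```yaml
on:
  pull_request:
    paths:
      - '.github/workflows/**'   # Only when workflow files change
  push:
    branches: [main]             # Assumed default branch name
    paths:
      - '.github/workflows/**'
```

Keeping the lint job in its own workflow file means the path filter does not affect when your regular CI jobs run.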