# GitHub Actions Workflow Failed: Complete Troubleshooting Guide

You pushed code to GitHub, the Actions workflow started, and then it failed. Now you're looking at a sea of red X's and need to figure out what went wrong.

GitHub Actions failures can happen at any stage—from checkout to deployment. Let me walk through the most common failure patterns and how to fix them.

## Reading the Failure Logs

Before anything else, understand how to read GitHub Actions logs:

  1. Go to your repository's Actions tab
  2. Click on the failed workflow run
  3. Click on the failed job (marked with a red X)
  4. Click on the failed step to expand its logs

Pro tip: Use Ctrl+F to search for "Error", "FAILED", "exception", or "fatal". The actual error is often buried in verbose output.
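If you prefer to search a downloaded log from a terminal, the same keyword search can be sketched as a small shell function. This is only a sketch: `find_errors` is a hypothetical helper name, and the keyword list is just the one from the tip above, so extend it for your own tools.

```shell
# find_errors LOGFILE
# Prints every line (prefixed with its line number) that contains one of
# the keywords from the tip above, matched case-insensitively.
find_errors() {
  grep -n -i -E 'error|failed|exception|fatal' "$1"
}
```

Assuming the GitHub CLI is installed, something like `gh run view <run-id> --log-failed > failed.log` should give you a file to pass to `find_errors`.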

## Fix 1: Dependency Installation Failures

One of the most common failures happens during dependency installation:

bash
npm ERR! 404 Not Found - GET https://registry.npmjs.org/@scope/package/-/package-1.0.0.tgz
npm ERR! 404  '@scope/package@1.0.0' is not in the npm registry.

Or for Python:

bash
ERROR: Could not find a version that satisfies the requirement package==1.0.0

Diagnosis:

Check if the dependency exists and your package.json or requirements.txt is correct:

yaml
# Add a debug step before your install step
- name: Debug dependencies
  run: |
    cat package.json
    npm ls --depth=0 || true

Solution for private packages:

If you're using private npm packages:

```yaml
- name: Setup Node.js
  uses: actions/setup-node@v4
  with:
    node-version: '20'
    registry-url: 'https://npm.pkg.github.com'
    scope: '@your-org'

- name: Install dependencies
  run: npm ci
  env:
    NODE_AUTH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

For Python private packages:

yaml
- name: Install dependencies
  run: pip install -r requirements.txt
  env:
    PIP_EXTRA_INDEX_URL: https://${{ secrets.PYPI_TOKEN }}@pypi.example.com/simple/

Solution for lockfile issues:

bash
npm ERR! `npm ci` requires a clean package-lock.json

Regenerate your lockfile:

bash
# Local
rm package-lock.json
npm install
git add package-lock.json
git commit -m "Regenerate package-lock.json"
git push

## Fix 2: Permission Denied Errors

You might see errors like:

bash
Error: EACCES: permission denied, open '/home/runner/.npm/_logs/...'

Or:

bash
fatal: could not create work tree dir 'repo-name': Permission denied

Solution:

On GitHub-hosted Linux runners, steps run as the non-root `runner` user, which has passwordless `sudo`. If you need elevated permissions:

```yaml
- name: Fix permissions
  run: sudo chown -R $(whoami) /path/to/directory

# Or run the step as root
- name: Run as root
  run: |
    sudo apt-get update
    sudo apt-get install -y package-name
```
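Before changing ownership, it can help to confirm that the directory really is the problem. Here is a minimal sketch of such a check; `check_writable` is a hypothetical helper name, not something GitHub provides.

```shell
# check_writable DIR
# Returns 0 if DIR exists and the current user can write to it.
# Otherwise prints who we are and who owns DIR, then returns 1.
check_writable() {
  if [ -d "$1" ] && [ -w "$1" ]; then
    return 0
  fi
  echo "Cannot write to '$1' as user $(id -un)" >&2
  # Show owner and mode when the path exists at all.
  if [ -e "$1" ]; then
    ls -ld "$1" >&2
  fi
  return 1
}
```

In a `run:` step you might call `check_writable "$HOME/.npm" || sudo chown -R "$(whoami)" "$HOME/.npm"`, so ownership is only changed when the check fails.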

For Git operations, ensure your token has correct permissions:

yaml
- name: Checkout
  uses: actions/checkout@v4
  with:
    token: ${{ secrets.GITHUB_TOKEN }}  # Has limited permissions

If you need to push changes:

yaml
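- name: Push changes
  run: |
    git config user.name "github-actions[bot]"
    git config user.email "github-actions[bot]@users.noreply.github.com"
    git push
  # git push reuses the token that actions/checkout stored in the repo config.
  # The job needs `permissions: contents: write`; for broader access, pass a
  # PAT to the checkout step's `token` input instead.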

## Fix 3: Step Timeout Failures

Some steps take longer than expected:

bash
Error: The action 'Build' has timed out after 360 minutes

Diagnosis:

Your step or job might be hanging. Add timeout and debug steps:

yaml
jobs:
  build:
    timeout-minutes: 30  # Job-level timeout
    steps:
      - name: Long running step
        timeout-minutes: 10  # Step-level timeout
        run: ./build.sh

Solution:

For genuinely long builds, raise the limit, but keep it well below the 360-minute default so a hung job still fails early:

yaml
jobs:
  build:
    timeout-minutes: 90  # Raised from 30; the default is 360

For stuck processes, add process monitoring:

yaml
- name: Debug running processes
  if: always()  # Runs even if previous steps fail
  run: |
    ps aux
    docker ps -a
    df -h
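Inside a single `run:` script you can also put a limit on one command, so a hang shows up with a clear message instead of a silent wait. The sketch below assumes GNU coreutils `timeout` is available (it is on typical Linux runners, but not on macOS by default); `run_with_limit` is a hypothetical helper name.

```shell
# run_with_limit SECONDS COMMAND [ARGS...]
# Runs COMMAND under coreutils `timeout`. `timeout` exits with status 124
# when it had to stop the command, so we can report that case explicitly.
run_with_limit() {
  limit=$1
  shift
  rc=0
  timeout "$limit" "$@" || rc=$?
  if [ "$rc" -eq 124 ]; then
    echo "Command timed out after ${limit}s: $*" >&2
  fi
  return "$rc"
}
```

For example, `run_with_limit 600 ./build.sh` would stop the build script after ten minutes and return 124.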

## Fix 4: Environment Variable Issues

Missing or incorrect environment variables:

bash
Error: API_KEY is not defined

Diagnosis:

Add a debug step to check your environment:

yaml
- name: Debug environment
  run: |
    echo "Node version: $(node --version)"
    echo "NPM version: $(npm --version)"
    echo "Working directory: $(pwd)"
    echo "GITHUB_TOKEN present: ${GITHUB_TOKEN:+yes}"  # Prints "yes" only if set; never echo the value itself

Solution:

Set environment variables correctly:

yaml
jobs:
  build:
    env:
      NODE_ENV: test
      API_URL: https://api.example.com
    steps:
      - name: Use environment
        run: echo "API URL is $API_URL"
        env:
          PER_STEP_VAR: value

For secrets:

yaml
- name: Deploy
  run: ./deploy.sh
  env:
    AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
    AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

Important: Never echo secret values. GitHub masks exact matches in logs, but a transformed value (for example, base64-encoded) is not masked, so don't print secrets in any form.
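To check that required variables are present without printing their values, you could fail fast at the top of a script. This is a minimal sketch; `require_env` is a hypothetical helper name, and it only sees variables that are exported to the process environment, which is how `env:` entries reach a step.

```shell
# require_env NAME [NAME...]
# Reports every listed variable that is missing or empty in the environment.
# Only names are printed, never values, so it is safe to use with secrets.
require_env() {
  missing=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "Missing required environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A deploy script might start with `require_env AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY || exit 1`.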

## Fix 5: Docker Build Failures

Docker-related errors are common:

bash
Error: denied: permission_denied: write access to repository

Or:

bash
Error: no space left on device

Solution for registry permission:

```yaml
- name: Login to Docker Hub
  uses: docker/login-action@v3
  with:
    username: ${{ secrets.DOCKERHUB_USERNAME }}
    password: ${{ secrets.DOCKERHUB_TOKEN }}

- name: Build and push
  uses: docker/build-push-action@v5
  with:
    context: .
    push: true
    tags: user/image:latest
```

For GitHub Container Registry (the job needs `packages: write` permission to push with `GITHUB_TOKEN`):

yaml
- name: Login to GHCR
  uses: docker/login-action@v3
  with:
    registry: ghcr.io
    username: ${{ github.actor }}
    password: ${{ secrets.GITHUB_TOKEN }}

Solution for disk space:

Add disk cleanup:

yaml
- name: Free disk space
  run: |
    sudo rm -rf /usr/share/dotnet
    sudo rm -rf /usr/local/lib/android
    sudo rm -rf /opt/ghc
    df -h
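Rather than reading the `df -h` output by eye, you could make the step fail when free space is below what the build needs. The sketch below is one way to do that; `check_free_space` is a hypothetical helper name, and the threshold is whatever you decide your build requires.

```shell
# check_free_space PATH MIN_KB
# Returns 0 if the filesystem holding PATH has at least MIN_KB kilobytes free.
check_free_space() {
  # -P forces one line per filesystem; -k reports sizes in 1024-byte blocks.
  # Column 4 of the second line is the available space.
  free_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
  echo "Free space on $1: ${free_kb} KB (need ${2} KB)"
  [ "$free_kb" -ge "$2" ]
}
```

For example, `check_free_space / 10485760` succeeds only when at least 10 GB is free on the root filesystem.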

Or use a specialized action:

yaml
- name: Free disk space
  uses: jlumbroso/free-disk-space@main
  with:
    tool-cache: false

## Fix 6: Matrix Build Failures

Matrix builds fail for one configuration but pass others:

yaml
strategy:
  matrix:
    node: [16, 18, 20]

Diagnosis:

Check which matrix combination failed:

yaml
- name: Debug matrix
  run: |
    echo "Testing with Node ${{ matrix.node }}"
    echo "Running on ${{ runner.os }}"

Solution:

Let the other matrix jobs finish so you can see every failing combination:

yaml
strategy:
  fail-fast: false  # Continue other matrix jobs even if one fails
  matrix:
    node: [16, 18, 20]

Or exclude problematic combinations:

yaml
strategy:
  matrix:
    os: [ubuntu-latest, windows-latest]
    node: [16, 18, 20]
    exclude:
      - os: windows-latest
        node: 16  # Skip this combination

## Fix 7: Cache Issues

Sometimes cache causes problems:

bash
Error: Unable to restore cache

Solution:

Key the cache on the lockfile hash so it is invalidated whenever your dependencies change:

yaml
- name: Cache node modules
  uses: actions/cache@v4
  with:
    path: ~/.npm
    key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      ${{ runner.os }}-node-

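To force a cache refresh, change the key. Put the version marker in the prefix and update `restore-keys` to match, because a suffix alone still lets the old prefix restore the stale cache:

yaml
key: ${{ runner.os }}-node-v2-${{ hashFiles('**/package-lock.json') }}
restore-keys: ${{ runner.os }}-node-v2-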
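To get a feel for why a lockfile-based key changes when dependencies change, you can imitate the idea locally. This sketch is not the algorithm GitHub's `hashFiles` uses; it only shows the same principle with `sha256sum` (GNU coreutils, assumed to be installed), and `cache_key` is a hypothetical helper name.

```shell
# cache_key PREFIX FILE
# Prints PREFIX followed by the first 16 hex digits of FILE's SHA-256 digest.
# Identical file contents give the same key; any change gives a new one.
cache_key() {
  digest=$(sha256sum "$2" | cut -c1-16)
  echo "$1-$digest"
}
```

Running `cache_key Linux-node package-lock.json` before and after a dependency update should print two different keys.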

## Debugging Workflow Files

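When the default logs aren't detailed enough, enable debug logging. GitHub reads these flags from repository secrets or variables rather than from the workflow's `env:` block:

bash
# Run once per repository with the GitHub CLI
gh secret set ACTIONS_STEP_DEBUG --body true
gh secret set ACTIONS_RUNNER_DEBUG --body true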

Or open an interactive shell on the runner with the third-party tmate action:

yaml
- name: Debug with tmate
  if: failure()
  uses: mxschmitt/action-tmate@v3
  timeout-minutes: 15

This gives you SSH access to the runner for interactive debugging.

## Quick Reference: Common Failure Patterns

| Error Pattern | Cause | Solution |
| --- | --- | --- |
| `npm ERR! 404` | Missing package | Check package name, registry |
| `EACCES` | Permission denied | Fix file/directory permissions |
| `ETIMEDOUT` | Network timeout | Check external services |
| `command not found` | Tool not installed | Add setup step |
| `ENOMEM` | Out of memory | Optimize build, reduce memory use, or use a larger runner |
| `SIGTERM` | Process killed | Check for timeout, resource limits |
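The lookup above can also be sketched as a shell function that maps a log line to its likely cause. Treat it as an illustration only: `classify_error` is a hypothetical helper name, and the patterns are just the ones listed in the table.

```shell
# classify_error "LOG LINE"
# Prints the likely cause for a log line, using the patterns from the table.
classify_error() {
  case "$1" in
    *"npm ERR! 404"*)      echo "Missing package: check package name and registry" ;;
    *EACCES*)              echo "Permission denied: fix file or directory permissions" ;;
    *ETIMEDOUT*)           echo "Network timeout: check external services" ;;
    *"command not found"*) echo "Tool not installed: add a setup step" ;;
    *ENOMEM*)              echo "Out of memory: optimize the build" ;;
    *SIGTERM*)             echo "Process killed: check for timeouts and resource limits" ;;
    *)                     echo "Unknown pattern: read the surrounding log lines" ;;
  esac
}
```

The first matching pattern wins, so a line that contains several of these strings is reported under the earliest entry.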

## Preventing Failures

Add a linting step to catch issues early:

```yaml
name: CI

on: push

jobs:
  lint-workflow:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate workflow
        run: |
          # Install actionlint
          go install github.com/rhysd/actionlint/cmd/actionlint@latest
          # go install places the binary in GOPATH/bin, which may not be on PATH
          "$(go env GOPATH)/bin/actionlint" .github/workflows/*.yml
```

This catches syntax errors and common misconfigurations before they cause failures in your actual workflows.