# Concourse Pipeline Error: Complete Troubleshooting Guide
Concourse CI is a pipeline-based CI/CD system that emphasizes containers and resource management. When Concourse pipelines fail, it's typically due to task configuration, resource issues, or container execution problems.
Concourse's unique architecture means troubleshooting requires understanding resources, tasks, and the worker-container relationship.
## Understanding Concourse Architecture
Concourse has three main components:
1. Web: the UI and API server
2. Worker: executes tasks in containers
3. Resource: external systems such as Git, S3, or a Docker registry
When builds fail, check which component is causing the issue.
## Fix 1: Pipeline YAML Syntax Errors
Concourse rejects a pipeline whose YAML is malformed or missing required fields.
Symptoms:
- "pipeline config is invalid"
- Pipeline doesn't load
- Task definition errors
Diagnosis:
```bash
# Validate pipeline locally
fly validate-pipeline -c pipeline.yml

# Or with strict validation (warnings are treated as errors)
fly validate-pipeline -c pipeline.yml --strict
```
Solution A: Fix common YAML issues:
```yaml
# WRONG - `source` is never declared under resources, and the task has no run section
jobs:
  - name: build
    plan:
      - get: source
      - task: build
        config:
          platform: linux
          image_resource:
            type: docker-image
            source:
              repository: node # Missing tag can cause issues
```
```yaml
# CORRECT - complete resource and task definition
resources:
  - name: source
    type: git
    source:
      uri: https://github.com/org/repo
      branch: main

jobs:
  - name: build
    plan:
      - get: source
      - task: build
        config:
          platform: linux
          image_resource:
            type: docker-image
            source:
              repository: node
              tag: "20"
          inputs:
            - name: source
          run:
            path: sh
            args: ["-c", "cd source && npm install && npm run build"]
```
Solution B: Validate task definitions:
```yaml
# Each task config needs a platform, a run section, and (on Linux) an image
- task: test
  config:
    platform: linux
    image_resource:
      type: docker-image
      source:
        repository: node
        tag: "20-alpine"
    run:
      path: npm
      args: ["test"]
    inputs:
      - name: source
```
Solution C: Use fly to set pipeline:
```bash
# Set pipeline with validation
fly -t main set-pipeline -p my-pipeline -c pipeline.yml -l vars.yml

# Check for errors in the output:
# Concourse shows validation errors before setting the pipeline
```
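The validate-then-set flow above can be wrapped in a small helper so that a config failing strict validation never reaches the server. This is only a sketch: the function name `safe_set_pipeline` is hypothetical, and it assumes `fly` is on the `PATH` and already logged in to the given target.

```bash
# Hypothetical helper: validate strictly, then set the pipeline only on success.
# Usage: safe_set_pipeline <target> <pipeline-name> <config-file> [extra set-pipeline args...]
safe_set_pipeline() {
  if [ "$#" -lt 3 ]; then
    echo "usage: safe_set_pipeline <target> <pipeline> <config> [args...]" >&2
    return 2
  fi
  local target="$1" name="$2" config="$3"
  shift 3

  # Stop early if the config file is missing.
  if [ ! -f "$config" ]; then
    echo "config file not found: $config" >&2
    return 2
  fi

  # --strict makes validation warnings fail as well.
  if ! fly validate-pipeline -c "$config" --strict; then
    echo "validation failed; pipeline not set" >&2
    return 1
  fi

  # Remaining arguments (for example -l vars.yml) are passed through unchanged.
  fly -t "$target" set-pipeline -p "$name" -c "$config" "$@"
}
```

For example, `safe_set_pipeline main my-pipeline pipeline.yml -l vars.yml` runs the same two commands shown above, in order, and stops after the first one if validation fails.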
## Fix 2: Task Execution Failures
Tasks fail inside their containers.
Symptoms:
- Task shows "errored" or "failed"
- Exit code non-zero
- Container exits prematurely
Diagnosis:
```bash
# Check task execution details
fly -t main watch -j my-pipeline/build

# Get task container info
fly -t main containers

# Hijack into container for debugging
fly -t main hijack -j my-pipeline/build -s build
```
Solution A: Fix task run path:
```yaml
# WRONG - path might not exist
- task: build
  config:
    run:
      path: ./build.sh # Relative path can fail

# CORRECT - absolute path or use sh
- task: build
  config:
    run:
      path: sh
      args:
        - -c
        - |
          cd source
          npm install
          npm run build
```
Solution B: Fix task inputs:
```yaml
# Task needs inputs from previous steps
- task: test
  config:
    inputs:
      - name: source # Must match get name
    run:
      path: npm
      args: ["test"]
      dir: source # Working directory
```
Solution C: Debug inside container:
```bash
# Hijack into failed task container
fly -t main hijack -j my-pipeline/build -s build-step

# Inside container:
ls -la
pwd
cat /etc/environment
npm --version
```
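When a task fails only inside the pipeline, it can also help to run the same task config against a local checkout with `fly execute`, which uploads a local directory as the task's input. The sketch below assumes the task lives in a standalone file; the helper names, the file contents, and the `source` input name are illustrative rather than taken from a real pipeline.

```bash
# Write an illustrative standalone task config to the given path.
write_example_task() {
  local path="${1:-task.yml}"
  cat > "$path" <<'EOF'
platform: linux
image_resource:
  type: docker-image
  source:
    repository: node
    tag: "20"
inputs:
  - name: source
run:
  path: sh
  args:
    - -c
    - |
      cd source
      npm install
      npm run build
EOF
}

# Hypothetical wrapper: run the task once, feeding a local directory as the `source` input.
# Usage: run_task_locally <target> <task-file> [checkout-dir]
run_task_locally() {
  local target="$1" task_file="$2" checkout_dir="${3:-.}"
  fly -t "$target" execute -c "$task_file" -i "source=${checkout_dir}"
}
```

A typical session would be `write_example_task task.yml` followed by `run_task_locally main task.yml .` from the repository root; the build output streams to the terminal just as it does with `fly watch`.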
## Fix 3: Resource Check Failures
Resources fail to check or get.
Symptoms:
- "check errored"
- Resource version not updating
- get step fails
Solution A: Fix Git resource:
```yaml
resources:
  - name: source
    type: git
    source:
      uri: https://github.com/org/repo
      branch: main
      # For private repos, add credentials
      username: ((git_username))
      password: ((git_password))
```
```bash
# Check resource status
fly -t main check-resource -r my-pipeline/source

# Re-check starting from a specific version (-f takes a version, here a git ref)
fly -t main check-resource -r my-pipeline/source -f ref:COMMIT_SHA
```
Solution B: Fix S3 resource:
```yaml
resources:
  - name: artifacts
    type: s3
    source:
      bucket: my-bucket
      access_key_id: ((aws_access_key))
      secret_access_key: ((aws_secret_key))
      region_name: us-east-1
      regexp: artifacts/build-(.*).tar.gz
```
Solution C: Fix Docker registry resource:
```yaml
resources:
  - name: docker-image
    type: docker-image
    source:
      repository: org/image
      username: ((docker_username))
      password: ((docker_password))
      tag: latest
```
Solution D: Use resource types correctly:
```yaml
# Custom resource types
resource_types:
  - name: slack-notification
    type: docker-image
    source:
      repository: cfcommunity/slack-notification-resource

resources:
  - name: notify
    type: slack-notification
    source:
      url: ((slack_webhook))
```
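After changing credentials or source settings, it is often quicker to re-check every affected resource in one pass than to wait for the next scheduled check. The helper below is a sketch: `check_resources` is a hypothetical name, and it relies on `fly check-resource` returning a non-zero exit status when a check fails, which is worth confirming on your fly version.

```bash
# Hypothetical helper: run check-resource for several resources of one pipeline
# and report which ones fail.
# Usage: check_resources <target> <pipeline> <resource>...
check_resources() {
  local target="$1" pipeline="$2"
  shift 2
  local failed=0 resource
  for resource in "$@"; do
    if fly -t "$target" check-resource -r "${pipeline}/${resource}"; then
      echo "ok: ${resource}"
    else
      echo "check errored: ${resource}" >&2
      failed=$((failed + 1))
    fi
  done
  # Exit status is non-zero when at least one check failed.
  [ "$failed" -eq 0 ]
}
```

With the resources defined in this section, a call would look like `check_resources main my-pipeline source artifacts docker-image notify`.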
## Fix 4: Container Resource Issues
Containers fail to start or run.
Symptoms:
- "failed to create container"
- "container create timeout"
- Worker resource exhaustion
Diagnosis:
```bash
# Check worker status
fly -t main workers

# Check worker containers
fly -t main containers

# Check worker volumes
fly -t main volumes
```
Solution A: Prune stalled workers:
```bash
# Remove a stalled worker so its containers and volumes can be reclaimed
fly -t main prune-worker -w worker-name

# Or prune every stalled worker at once
fly -t main prune-worker --all-stalled
```
Solution B: Check worker health:
```bash
# On worker machine
systemctl status concourse-worker

# Check worker logs
journalctl -u concourse-worker -n 100

# Verify worker registration
fly -t main workers
```
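Solution C: Increase worker resources:
```bash
# Worker configuration
# Container limit: the variable depends on the runtime
# (containerd shown; Guardian uses CONCOURSE_GARDEN_MAX_CONTAINERS)
CONCOURSE_CONTAINERD_MAX_CONTAINERS=100
CONCOURSE_BAGGAGECLAIM_DRIVER=btrfs

# Web node configuration (placement is decided by the web node, not the worker)
CONCOURSE_CONTAINER_PLACEMENT_STRATEGY=volume-locality
```
## Fix 5: Volume and Cache Issues
Cache volumes cause failures.
Symptoms:
- Cache not persisting
- Volume mount errors
- Disk space issues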
Solution A: Configure task caches:
```yaml
- task: build
  config:
    caches:
      - path: node_modules # Cache directory
    run:
      path: npm
      args: ["install"]
```
The cache path is relative to the task's working directory.
Solution B: Use volume caching:
```yaml
# Task with persistent cache across builds
- task: test
  config:
    caches:
      - path: .npm-cache
    run:
      path: sh
      args:
        - -c
        - |
          npm config set cache .npm-cache
          npm ci
          npm test
```
Solution C: Clean volumes:
```bash
# List volumes
fly -t main volumes

# Clear the cache of a specific task step
fly -t main clear-task-cache -j my-pipeline/build -s build

# Prune a stalled worker so its orphaned volumes are garbage-collected
fly -t main prune-worker -w worker-name
```
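Because disk pressure on the worker is one of the listed symptoms, a quick check of the filesystem that holds the worker's volumes can rule it in or out. The following is a sketch to run on the worker machine; the function names and the 85% default threshold are assumptions, and the directory to pass is whatever your worker's `CONCOURSE_WORK_DIR` points to.

```bash
# Print the used-space percentage (integer, no % sign) of the filesystem holding a path.
disk_usage_pct() {
  # -P forces the portable format with one line per filesystem.
  df -P "$1" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

# Hypothetical check: warn when the work directory's filesystem is nearly full.
# Usage: check_work_dir_space <work-dir> [threshold-percent]
check_work_dir_space() {
  local work_dir="$1" threshold="${2:-85}"
  local used
  used="$(disk_usage_pct "$work_dir")"
  if [ -z "$used" ]; then
    echo "could not read disk usage for ${work_dir}" >&2
    return 2
  fi
  if [ "$used" -ge "$threshold" ]; then
    echo "WARNING: ${work_dir} is ${used}% full (threshold ${threshold}%)" >&2
    return 1
  fi
  echo "OK: ${work_dir} is ${used}% full"
}
```

Running `check_work_dir_space "$CONCOURSE_WORK_DIR"` before and after clearing caches gives a rough sense of how much space the cleanup recovered.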
## Fix 6: Credential Management Errors
Credentials are not resolved correctly.
Symptoms:
- ((credential)) not substituted
- Credential provider errors
- Secrets not accessible
Solution A: Use credential interpolation:
```yaml
resources:
  - name: docker-image
    type: docker-image
    source:
      username: ((docker.username)) # Nested syntax: field of the docker var
      password: ((docker.password))
```
Solution B: Configure credential manager:
For Vault:
```bash
# In Concourse web configuration
CONCOURSE_VAULT_URL=https://vault.company.com
CONCOURSE_VAULT_PATH_PREFIX=/concourse
CONCOURSE_VAULT_CLIENT_TOKEN=your-token
```
For Credhub:
```bash
CONCOURSE_CREDHUB_URL=https://credhub.company.com
CONCOURSE_CREDHUB_CLIENT_ID=client
CONCOURSE_CREDHUB_CLIENT_SECRET=secret
```
Solution C: Pass credentials via fly:
```bash
# Use -v for single values
fly -t main set-pipeline -p my-pipeline -c pipeline.yml \
  -v docker.username=myuser \
  -v docker.password=mypass

# Or use -l for variable files
fly -t main set-pipeline -p my-pipeline -c pipeline.yml \
  -l credentials.yml
```
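For the `-l` form, the variable file mirrors the nested `((docker.username))` syntax: a top-level key with its fields underneath. The sketch below writes such a file with placeholder values; the helper name is hypothetical and the values are the same made-up ones used in the `-v` example.

```bash
# Hypothetical helper: create an illustrative vars file.
# Values are placeholders, not real credentials.
write_credentials_file() {
  local path="${1:-credentials.yml}"
  # The subshell keeps the tighter umask from leaking into the caller's shell.
  (
    umask 077
    cat > "$path" <<'EOF'
docker:
  username: myuser
  password: mypass
EOF
  )
}
```

After `write_credentials_file credentials.yml`, the second `set-pipeline` command above resolves `((docker.username))` and `((docker.password))` from the file. A file like this is best kept out of version control.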
## Fix 7: Timeout Errors
Tasks exceed timeout limits.
Symptoms:
- "task timed out"
- Task terminated early
Solution A: Increase task timeout:
```yaml
- task: long-build
  timeout: 2h # No timeout applies to a step unless one is configured
  config:
    run:
      path: sh
      args:
        - -c
        - |
          npm run build:production
```
Solution B: Add task progress output:
```yaml
- task: build
  config:
    run:
      path: sh
      args:
        - -c
        - |
          echo "Starting build..."
          npm install
          echo "Dependencies installed"
          npm run build
          echo "Build complete"
```
## Fix 8: Input Output Issues
Task outputs not passed correctly.
Symptoms:
- Next step can't find input
- put step fails
- Outputs missing
Solution A: Configure task outputs:
```yaml
- task: build
  config:
    outputs:
      - name: build-output
    run:
      path: sh
      args:
        - -c
        - |
          mkdir -p build-output
          npm run build
          # Package the build so the put step's *.tar.gz pattern matches (version is illustrative)
          tar -czf build-output/build-1.0.0.tar.gz -C dist .

- put: artifacts
  params:
    file: build-output/*.tar.gz # Reference output name
```
Solution B: Map inputs to outputs:
```yaml
- task: test
  config:
    inputs:
      - name: build-output # From previous task
    run:
      path: sh
      args:
        - -c
        - |
          cd build-output
          npm test
```
Solution C: Use file path patterns:
```yaml
- put: docker-image
  params:
    build: source # Use input name
    dockerfile: source/Dockerfile
```
## Fix 9: Serial Pipeline Execution
Parallel builds interfere with each other.
Symptoms:
- Builds queued unnecessarily
- Resource contention
- Concurrent execution issues
Solution A: Configure serial execution:
```yaml
jobs:
  - name: deploy
    serial: true # Only one build at a time
    plan:
      - get: source
      - task: deploy
```
Solution B: Use serial groups:
```yaml
jobs:
  - name: deploy-prod
    serial_groups: [production]

  - name: deploy-staging
    serial_groups: [production] # Same group - serial execution

  - name: test
    serial_groups: [test] # Different group - parallel with deploy
```
Solution C: Configure max in flight:
```yaml
jobs:
  - name: test
    max_in_flight: 3 # Allow up to 3 concurrent builds
```
## Fix 10: Worker Network Issues
Workers can't reach web or resources.
Symptoms:
- Worker shows "stalled"
- Tasks fail to start
- Network errors in logs
Solution A: Check worker-web connectivity:
```bash
# From worker machine
curl -I http://concourse-web:8080

# Check worker registration
fly -t main workers

# Restart worker
systemctl restart concourse-worker
```
Solution B: Configure worker networking:
```bash
# Worker configuration
CONCOURSE_WORKER_GARDEN_NETWORK=10.254.0.0/16
CONCOURSE_WORKER_GARDEN_NETWORK_POOL=10.254.100.0/24
CONCOURSE_TSA_HOST=concourse-web:2222
```
Solution C: Debug worker registration:
```bash
# On worker
journalctl -u concourse-worker -f

# Look for:
# - "failed to register worker"
# - "connection refused"
# - "TSA handshake failed"
```
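Messages such as "connection refused" usually mean the worker cannot open a TCP connection to the address in `CONCOURSE_TSA_HOST`. A quick way to confirm this from the worker machine is sketched below; `tsa_reachable` is a hypothetical name, and the sketch assumes bash with `/dev/tcp` support and the coreutils `timeout` command (so a Linux worker).

```bash
# Hypothetical check: can this machine open a TCP connection to the TSA?
# Usage: tsa_reachable <host> [port] [wait-seconds]
tsa_reachable() {
  local host="$1" port="${2:-2222}" wait_seconds="${3:-3}"
  # bash opens the socket through its built-in /dev/tcp path;
  # timeout aborts the attempt if the network silently drops packets.
  timeout "$wait_seconds" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

For example, `tsa_reachable concourse-web 2222 && echo "TSA reachable" || echo "cannot reach TSA"` separates a network or firewall problem from a key or registration problem: if the port is reachable but registration still fails, the worker logs are the next place to look.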
## Quick Reference: Concourse Errors
| Error | Cause | Solution |
|---|---|---|
| Pipeline invalid | YAML syntax | Use fly validate-pipeline |
| Task errored | Script failure | Check run path, hijack container |
| Check errored | Resource auth | Fix credentials, verify source |
| Container timeout | Worker issues | Prune, check worker resources |
| Credential empty | Provider issue | Configure Vault/Credhub |
| Task timeout | Build too long | Increase timeout, add progress |
| Output missing | Wrong output name | Configure outputs, map inputs |
| Worker stalled | Network issue | Check connectivity, restart worker |
## Debugging Commands
```bash
# Validate pipeline
fly -t main validate-pipeline -c pipeline.yml --strict

# Watch job execution
fly -t main watch -j my-pipeline/job-name

# Hijack into container
fly -t main hijack -j my-pipeline/job-name -s step-name

# Check resources
fly -t main check-resource -r my-pipeline/resource-name

# List workers
fly -t main workers

# List containers
fly -t main containers

# Prune a stalled worker
fly -t main prune-worker -w worker-name

# View job history
fly -t main builds -p my-pipeline

# Get build logs
fly -t main watch -b build-id
```