What's Actually Happening
A Spinnaker pipeline execution fails at one or more stages, which blocks the application deployment or infrastructure change the pipeline was meant to deliver.
The Error You'll See
Stage failed:
{
  "status": "TERMINAL",
  "stage": "Deploy",
  "message": "Failed to create server group",
  "exceptions": ["No providers found for account 'my-k8s'"]
}
Pipeline stopped:
{
  "status": "CANCELED",
  "reason": "Stage 'Bake' failed with exit code 1"
}
Trigger error:
{
  "status": "NOT_RUNNING",
  "message": "No matching artifact found for trigger"
}
Why This Happens
1. Account misconfigured - Kubernetes or cloud account not set up
2. Missing permissions - Spinnaker lacks access to cluster/account
3. Stage configuration - Incorrect stage parameters
4. Trigger issues - Webhook or artifact trigger not working
5. Docker image - Image not found or pull errors
6. Infrastructure issues - Cluster or cloud provider problems
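As a rough triage aid, the causes above can be matched against the error text of a failed stage. The sketch below is a hypothetical helper: `classify_failure` is not part of Spinnaker, and its patterns are assumptions drawn from the sample errors in this article plus a few common Kubernetes messages. Extend them for the messages you actually see.

```bash
#!/usr/bin/env bash
# Hypothetical triage helper: map an error message to the most likely cause
# from the list above. The patterns are assumptions, not an exhaustive
# catalogue of Spinnaker error messages.
classify_failure() {
  local msg
  # Lowercase the message so matching is case-insensitive.
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *"no providers found"*|*"account"*"not found"*)             echo "account" ;;
    *"forbidden"*|*"denied"*|*"unauthorized"*)                  echo "permissions" ;;
    *"no matching artifact"*|*"trigger"*)                       echo "trigger" ;;
    *"imagepullbackoff"*|*"errimagepull"*|*"manifest unknown"*) echo "image" ;;
    *)                                                          echo "unknown" ;;
  esac
}

classify_failure "No providers found for account 'my-k8s'"   # prints: account
classify_failure "No matching artifact found for trigger"    # prints: trigger
```

A result of `unknown` simply means none of the assumed patterns matched, in which case the steps below are still worth walking through in order.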
Step 1: Check Pipeline Status
```bash
# Check Spinnaker UI:
# Applications > myapp > Pipelines > pipeline-name

# Get pipeline via API:
curl -s http://spinnaker:8080/pipelines/myapp/my-pipeline | jq .

# Check execution history:
curl -s http://spinnaker:8080/pipelines/myapp/my-pipeline/executions | jq '.[] | {id, status, startTime}'

# Get specific execution (replace the placeholder with a real execution ID):
EXECUTION_ID="<execution-id>"
curl -s "http://spinnaker:8080/executions/${EXECUTION_ID}" | jq '.stages[] | {name, status, outputs}'

# Check execution details:
curl -s "http://spinnaker:8080/executions/${EXECUTION_ID}/details" | jq .

# Check Spinnaker logs:
kubectl logs -n spinnaker deploy/spin-orca | grep -i error
kubectl logs -n spinnaker deploy/spin-clouddriver | grep -i error
```
Step 2: Check Stage Configuration
```bash
# In Spinnaker Deck UI:
# Pipelines > Configure > Edit as JSON

# Check stage JSON:
# {
#   "name": "Deploy",
#   "type": "deployManifest",
#   "account": "my-k8s-account",
#   "cloudProvider": "kubernetes",
#   "manifests": [
#     {"kind": "Deployment", ...}
#   ]
# }

# Common issues:
# 1. Wrong account name:
#    "account": "wrong-account"   (account doesn't exist)

# 2. Missing required fields:
#    "namespace": "default"       (required for Kubernetes)

# 3. Invalid manifest:
#    "manifests": [{"kind": "Deployment", "name": null}]   (missing metadata.name)

# Validate stage config:
# Check account exists:
curl -s http://spinnaker:8080/credentials | jq '.[] | .name'
```
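The account check above can be turned into a pass/fail test. This is a minimal sketch, assuming the account names arrive one per line (for example from `jq -r '.[].name'`); `account_exists` is a hypothetical helper name, and the account list here is hard-coded so the example runs without a Spinnaker instance.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given account name appears in a list of
# account names read from stdin (one name per line).
account_exists() {
  local wanted="$1"
  # -F: fixed string, -x: match the whole line, -q: exit status only
  grep -Fxq -- "$wanted"
}

# Hard-coded list for illustration; in practice it could come from
#   curl -s http://spinnaker:8080/credentials | jq -r '.[].name'
if printf '%s\n' "my-k8s-account" "staging-k8s" | account_exists "my-k8s"; then
  echo "account found"
else
  echo "account missing: check the stage's \"account\" field"
fi
```

Whole-line matching is deliberate: a stage that references `my-k8s` should not pass just because `my-k8s-account` exists, which is the kind of mismatch behind a "No providers found for account" error.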
Step 3: Check Cloud Accounts
```bash
# List configured accounts:
curl -s http://spinnaker:8080/credentials | jq '.[] | {name, type, environment}'

# Check Kubernetes accounts (URL quoted so the shell does not treat "?" as a glob):
curl -s "http://spinnaker:8080/credentials?accountType=kubernetes" | jq .

# Check specific account:
curl -s http://spinnaker:8080/credentials/my-k8s-account | jq .

# In Spinnaker config:
cat ~/.hal/default/service-settings/clouddriver.yml

# Check clouddriver config:
kubectl exec -n spinnaker deploy/spin-clouddriver -- cat /opt/spinnaker/config/clouddriver.yml | grep -A 20 "accounts:"

# Verify account has access:
kubectl exec -n spinnaker deploy/spin-clouddriver -- kubectl get pods --kubeconfig=/path/to/kubeconfig
```
Step 4: Check Permissions
```bash
# Check Fiat (authz) logs:
kubectl logs -n spinnaker deploy/spin-fiat | grep -i denied

# Check user permissions:
curl -s http://spinnaker:8080/auth/user | jq .

# Check service account:
kubectl get serviceaccounts -n spinnaker spinnaker-service-account -o yaml

# Check RBAC:
kubectl get clusterrolebinding spinnaker-role-binding -o yaml

# Required permissions for Kubernetes (written to spinnaker-rbac.yaml):
cat << 'EOF' > spinnaker-rbac.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: spinnaker-role
rules:
  - apiGroups: [""]
    resources: ["*"]
    verbs: ["*"]
  - apiGroups: ["apps"]
    resources: ["*"]
    verbs: ["*"]
  - apiGroups: ["extensions"]
    resources: ["*"]
    verbs: ["*"]
EOF

# Apply missing permissions:
kubectl apply -f spinnaker-rbac.yaml
```
Step 5: Check Triggers
```bash
# Check trigger configuration in pipeline JSON:
# "triggers": [
#   {
#     "type": "jenkins",
#     "master": "my-jenkins",
#     "job": "build-job",
#     "propertyFile": "build.properties"
#   }
# ]

# For Git triggers:
# "triggers": [
#   {
#     "type": "git",
#     "source": "github",
#     "project": "org/repo",
#     "branch": "main"
#   }
# ]

# Check Jenkins connection:
curl -s http://spinnaker:8080/jenkins/masters | jq .

# Check Igor (CI integrations):
kubectl logs -n spinnaker deploy/spin-igor | grep -i error

# Test webhook trigger:
curl -X POST http://spinnaker:8080/webhooks/webhook/my-webhook \
  -H "Content-Type: application/json" \
  -d '{"repository": "org/repo", "branch": "main"}'
```
Step 6: Check Docker Images
```bash
# Check Docker registry accounts:
curl -s "http://spinnaker:8080/credentials?accountType=dockerRegistry" | jq .

# Verify image exists:
docker pull myregistry.com/myimage:v1.0.0

# Check image in registry:
curl -s -u user:password https://myregistry.com/v2/myimage/tags/list | jq .

# Check image tag in pipeline:
# "image": "myregistry.com/myimage:${imageTag}"

# Check image pull secrets:
kubectl get secrets -n default docker-registry-secret -o yaml

# Add Docker registry to Spinnaker (hal prompts for the password on stdin):
hal config provider docker-registry account add my-registry \
  --address myregistry.com \
  --repositories myimage \
  --username user \
  --password

hal config provider docker-registry enable
hal deploy apply
```
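When comparing the image a stage references against what the registry actually holds, it can help to split the reference into registry, repository, and tag first. The following is a sketch under stated assumptions: `parse_image` is a hypothetical helper, the first path segment is taken to be the registry host, and the reference is assumed to use a tag rather than a digest.

```bash
#!/usr/bin/env bash
# Hypothetical helper: split "registry/repository:tag" into its three parts
# with shell parameter expansion. Assumes the first path segment is the
# registry host and that the reference carries a tag, not a digest.
parse_image() {
  local ref="$1" registry rest repository tag
  registry="${ref%%/*}"          # text before the first "/"
  rest="${ref#*/}"               # text after the first "/"
  case "$rest" in
    *:*)
      repository="${rest%:*}"    # text before the last ":"
      tag="${rest##*:}"          # text after the last ":"
      ;;
    *)
      repository="$rest"
      tag="latest"               # Docker's default when no tag is given
      ;;
  esac
  printf '%s %s %s\n' "$registry" "$repository" "$tag"
}

parse_image "myregistry.com/myimage:v1.0.0"   # prints: myregistry.com myimage v1.0.0
```

The three fields map directly onto the registry query used above: `https://<registry>/v2/<repository>/tags/list` should list `<tag>` if the image exists.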
Step 7: Check Bake Stage
```bash
# For Packer-based bake stages:

# Check Rosco (baker) logs:
kubectl logs -n spinnaker deploy/spin-rosco | grep -i error

# Check Packer template:
curl -s http://spinnaker:8080/rosco/v1/pipelineTemplates | jq .

# Check bake configuration:
# {
#   "type": "bake",
#   "cloudProviderType": "aws",
#   "amiName": "myapp-{{timestamp}}",
#   "baseOs": "ubuntu",
#   "baseLabel": "release",
#   "package": "myapp",
#   "storeType": "ebs"
# }

# Verify Packer base image exists:
# In AWS: Check AMI
# In GCP: Check source image

# Check S3/image storage:
aws s3 ls s3://my-spinnaker-bucket/
```
Step 8: Check Deploy Stage
```bash
# For Kubernetes deploy:

# Check namespace exists:
kubectl get namespace myapp

# Check deployment created:
kubectl get deployments -n myapp -l app=myapp

# Check events:
kubectl get events -n myapp --sort-by='.lastTimestamp'

# For container errors:
kubectl describe pods -n myapp -l app=myapp

# Check pod logs:
kubectl logs -n myapp deployment/myapp

# Common deploy issues:
# 1. Namespace missing:
kubectl create namespace myapp

# 2. Resource quota exceeded:
kubectl describe resourcequota -n myapp

# 3. Image pull error:
kubectl describe pods -n myapp | grep -A 5 "Events:"

# Check Clouddriver cache:
curl -s http://spinnaker:8080/cache/kubernetes/my-k8s-account/namespaces | jq .
```
Step 9: Check Orca Execution
```bash
# Orca is the orchestration engine:

# Check Orca logs (-E enables the "|" alternation):
kubectl logs -n spinnaker deploy/spin-orca | grep -iE "execution|failed|error"

# Check execution queue:
curl -s http://spinnaker:8080/orca/admin/queue | jq .

# Check running executions:
curl -s http://spinnaker:8080/orca/executions/running | jq 'length'

# Cancel stuck execution (replace the placeholder with a real execution ID):
EXECUTION_ID="<execution-id>"
curl -X DELETE "http://spinnaker:8080/orca/executions/${EXECUTION_ID}/cancel"

# Restart Orca:
kubectl rollout restart deployment/spin-orca -n spinnaker

# Check Orca database (assumes Orca's SQL backend, where pipeline executions
# are stored in the "pipelines" table, and a mysql client in the container):
kubectl exec -n spinnaker deploy/spin-orca -- mysql -e "SELECT COUNT(*) FROM pipelines;"
```
Step 10: Monitor Pipelines
```bash
# Create monitoring script:
cat << 'EOF' > /usr/local/bin/monitor-spinnaker.sh
#!/bin/bash

echo "=== Running Pipelines ==="
curl -s http://spinnaker:8080/orca/executions/running | jq '.[] | {id, application, pipelineConfigId}'

echo ""
echo "=== Recent Failures ==="
curl -s "http://spinnaker:8080/orca/pipelines?limit=20" | jq '.[] | select(.status == "TERMINAL") | {id, application, name}'

echo ""
echo "=== Service Health ==="
curl -s http://spinnaker:8080/health | jq '.status'

echo ""
echo "=== Clouddriver Cache ==="
curl -s http://spinnaker:8080/cache/kubernetes | jq '.[] | {name, lastRefresh}'

echo ""
echo "=== Queue Depth ==="
curl -s http://spinnaker:8080/orca/admin/queue | jq '.size'
EOF

chmod +x /usr/local/bin/monitor-spinnaker.sh

# Prometheus metrics (-E enables the "|" alternation):
curl -s http://spinnaker:8080/metrics | grep -iE "orca|pipeline|execution"

# Key metrics:
# orca_pipeline_executions_total
# orca_pipeline_execution_duration_seconds
# orca_stage_failures_total

# Alerts (Prometheus alerting rule, YAML):
# - alert: SpinnakerPipelineFailures
#   expr: rate(orca_stage_failures_total[5m]) > 0
#   for: 5m
#   labels:
#     severity: warning
#   annotations:
#     summary: "Spinnaker pipeline stage failures detected"
```
Spinnaker Pipeline Failure Checklist
| Check | Command | Expected |
|---|---|---|
| Pipeline status | API/UI | Shows failure stage |
| Account config | credentials | Valid |
| Permissions | Fiat logs | Authorized |
| Trigger | Igor logs | Triggering |
| Docker image | registry | Exists |
| Cluster | kubectl | Healthy |
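The checklist can be run as a small script that prints one PASS or FAIL line per row. This is only a sketch: `run_check` is a hypothetical helper, and the two checks shown use `true` and `false` as stand-ins so the example runs anywhere. Replace them with the real commands from the steps above.

```bash
#!/usr/bin/env bash
# Hypothetical checklist runner: run each check command quietly, print PASS
# or FAIL next to its label, and count the failures in $failures.
failures=0

run_check() {
  local label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    printf 'PASS  %s\n' "$label"
  else
    printf 'FAIL  %s\n' "$label"
    failures=$((failures + 1))
  fi
}

# Stand-in commands for illustration; a real check could look like
#   run_check "Cluster" kubectl get nodes
run_check "Account config" true
run_check "Docker image" false

echo "failed checks: $failures"
```

Keeping each check as a plain command means the exit status decides the result, so any command from the earlier steps that fails with a non-zero status can be dropped in unchanged.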
Verify the Fix
```bash
# After fixing pipeline issue

# 1. Re-run pipeline
# In Spinnaker UI: Pipelines > Start Manual Execution
# Expected: pipeline starts

# 2. Monitor execution
# Check each stage passes
# Expected: all stages green

# 3. Check deployment
kubectl get pods -n myapp -l app=myapp
# Expected: pods running

# 4. Verify application (port-forward runs in the background)
kubectl port-forward svc/myapp -n myapp 8080:80 &
PF_PID=$!
sleep 2   # give the tunnel a moment to open
curl http://localhost:8080/health
kill "$PF_PID"
# Expected: application healthy

# 5. Check pipeline history
# Pipelines > History
# Expected: successful executions

# 6. Monitor ongoing
/usr/local/bin/monitor-spinnaker.sh
# Expected: no failures
```
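A freshly deployed application may need a few seconds before its health endpoint answers, so a single `curl` right after the rollout can fail spuriously. One way to handle that is a small retry wrapper. The sketch below assumes a hypothetical `retry` helper; the attempt count and delay are arbitrary illustrative values.

```bash
#!/usr/bin/env bash
# Hypothetical retry helper: run a command until it succeeds or the attempt
# budget is used up.
# Usage: retry <attempts> <delay-seconds> <command...>
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $i/$attempts failed" >&2
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (assumes a port-forward to the service is already active):
#   retry 10 3 curl -fsS http://localhost:8080/health
```

With `curl -f`, an HTTP error status counts as a failed attempt, so the loop keeps polling until the endpoint returns a success response or the attempts run out.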
Related Issues
- [Fix Spinnaker Pipeline Stuck](/articles/fix-spinnaker-pipeline-stuck)
- [Fix Jenkins Pipeline SCM Checkout Failed](/articles/fix-jenkins-pipeline-scm-checkout-failed)
- [Fix ArgoCD Sync Timeout](/articles/fix-argocd-sync-timeout)