# Spinnaker Deployment Error: Complete Troubleshooting Guide
Spinnaker is a multi-cloud continuous delivery platform that orchestrates deployments across Kubernetes, AWS, GCP, Azure, and other platforms. When deployments fail, debugging requires understanding Spinnaker's pipeline structure, provider configurations, and execution model.
Let me walk through the most common Spinnaker deployment errors and how to fix each one.
## Understanding Spinnaker Components
Spinnaker consists of several microservices:
- Deck - Web UI
- Gate - API gateway
- Orca - Pipeline orchestration
- Clouddriver - Cloud provider interactions
- Echo - Notifications
- Rosco - Baking (image building)
- Kayenta - Canary analysis
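- Front50 - Persistence
- Fiat - Authorization
- Igor - CI system integration
Pipeline failures most often surface in Orca (orchestration) or Clouddriver (cloud provider calls), so start with those logs.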
## Fix 1: Pipeline Execution Failures
Pipelines fail to start or complete.
Symptoms:
- Pipeline shows "FAILED" status
- Stages don't execute
- "Pipeline execution error"
Diagnosis:
```bash
# Check Orca logs for pipeline execution
kubectl logs -n spinnaker deployment/orca | grep -i "pipeline"

# Check pipeline execution details
# In Spinnaker UI: Pipelines → [Pipeline] → Execution Details

# Or via API
curl -H "Authorization: Bearer $TOKEN" \
  "https://spinnaker.example.com/pipelines/executions/PIPELINE_ID"
```
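Once the execution JSON is saved locally, a few lines of Python can pull out the stages that failed. This is a minimal sketch, assuming the execution document carries a `stages` list whose entries have `name`, `type`, and `status` fields; the set of statuses treated as failures and the sample document are illustrative assumptions, not an exhaustive model of Orca's output.

```python
import json

# Statuses treated as failures in this sketch (an assumption; adjust as needed).
FAILED_STATUSES = {"TERMINAL", "FAILED_CONTINUE", "STOPPED", "CANCELED"}


def failed_stages(execution):
    """Return (name, type, status) for every stage whose status looks like a failure."""
    return [
        (stage.get("name"), stage.get("type"), stage.get("status"))
        for stage in execution.get("stages", [])
        if stage.get("status") in FAILED_STATUSES
    ]


# Hypothetical, trimmed-down execution document used only for illustration.
sample_execution = json.loads("""
{
  "status": "TERMINAL",
  "stages": [
    {"name": "Bake", "type": "bake", "status": "SUCCEEDED"},
    {"name": "Deploy", "type": "deployManifest", "status": "TERMINAL"},
    {"name": "Notify", "type": "webhook", "status": "NOT_STARTED"}
  ]
}
""")

for name, stage_type, status in failed_stages(sample_execution):
    print(f"{name} ({stage_type}): {status}")
```

The first failing stage is usually the one to open in Execution Details; later stages that never started are a consequence, not a cause.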
Solution A: Check stage configuration:
In Spinnaker Deck UI:
1. Go to Pipelines → Configure → [Your Pipeline].
2. Click the failing stage.
3. Check the stage configuration:
   - Is the account name correct?
   - Does the cluster/app name match?
   - Are all required parameters provided?
Solution B: Fix pipeline triggers:
If the pipeline doesn't start automatically, check its trigger configuration:
{
"triggers": [
{
"type": "jenkins",
"master": "Jenkins",
"job": "build-job",
"propertyFile": "build.properties"
},
{
"type": "docker",
"account": "docker-hub",
"registry": "https://index.docker.io",
"repository": "org/image"
}
]
}
Solution C: Verify pipeline JSON:
```bash
# Export pipeline JSON to check for errors
curl -H "Authorization: Bearer $TOKEN" \
  "https://spinnaker.example.com/applications/appName/pipelineConfigs/pipelineName" \
  > pipeline.json

# Validate JSON structure
jq . pipeline.json
```
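`jq` only confirms that the file parses. A structural check can catch problems that are valid JSON but still break a pipeline. The sketch below assumes the exported definition uses `stages` with `refId` and `requisiteStageRefIds`, and `triggers` with an `enabled` flag; the function name and the sample pipeline are hypothetical, and a trigger without `enabled` is treated as disabled here.

```python
def check_pipeline(pipeline):
    """Return a list of human-readable problems found in a pipeline definition."""
    problems = []
    stages = pipeline.get("stages", [])
    known_ref_ids = {stage["refId"] for stage in stages if "refId" in stage}

    for stage in stages:
        label = stage.get("name", "<unnamed>")
        if "type" not in stage:
            problems.append(f"stage '{label}' has no type")
        if "refId" not in stage:
            problems.append(f"stage '{label}' has no refId")
        # Every upstream reference must point at a stage that exists.
        for parent in stage.get("requisiteStageRefIds", []):
            if parent not in known_ref_ids:
                problems.append(f"stage '{label}' depends on unknown refId '{parent}'")

    for trigger in pipeline.get("triggers", []):
        if not trigger.get("enabled", False):
            problems.append(f"trigger '{trigger.get('type')}' is disabled")

    return problems


# Hypothetical pipeline with a dangling stage reference and a disabled trigger.
sample_pipeline = {
    "triggers": [{"type": "jenkins", "master": "Jenkins", "job": "build-job"}],
    "stages": [
        {"refId": "1", "type": "bake", "name": "Bake", "requisiteStageRefIds": []},
        {"refId": "2", "type": "deploy", "name": "Deploy", "requisiteStageRefIds": ["3"]},
    ],
}

for problem in check_pipeline(sample_pipeline):
    print(problem)
```

In practice the input would come from `json.load(open("pipeline.json"))` rather than an inline dictionary.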
## Fix 2: Kubernetes Deployment Failures
Deployments to Kubernetes clusters fail.
Symptoms:
- "Failed to deploy to Kubernetes"
- "Server group not found"
- "Namespace doesn't exist"
Diagnosis:
```bash
# Check Clouddriver logs
kubectl logs -n spinnaker deployment/clouddriver | grep -i kubernetes

# Check Kubernetes provider is registered
curl -H "Authorization: Bearer $TOKEN" \
  "https://spinnaker.example.com/credentials" | jq '.[] | select(.type=="kubernetes")'

# Check cluster connectivity
kubectl cluster-info --context my-cluster-context
```
Solution A: Fix Kubernetes account configuration:
In clouddriver.yml or via Halyard:
```bash
# Configure Kubernetes account
hal config provider kubernetes account add my-k8s \
  --context my-cluster-context \
  --service-account true \
  --provider-version v2

# Apply configuration
hal deploy apply
```
Verify account:
```bash
# List accounts
hal config provider kubernetes account list

# Verify connection
kubectl --context my-cluster-context get namespaces
```
Solution B: Check namespace exists:
```bash
# Create namespace if missing
kubectl create namespace my-app-namespace

# Or configure Spinnaker to create namespace
# In Deploy stage: "Create namespace if it doesn't exist"
```
Solution C: Fix RBAC permissions:
Spinnaker needs Kubernetes permissions:
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: spinnaker-service-account
  namespace: spinnaker
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: spinnaker-role
rules:
  # patch covers apply-style updates; watch lets Spinnaker observe resource changes
  - apiGroups: [""]
    resources: [namespaces, pods, services, replicationcontrollers]
    verbs: [create, delete, get, list, patch, update, watch]
  - apiGroups: [apps]
    resources: [deployments, statefulsets, daemonsets, replicasets]
    verbs: [create, delete, get, list, patch, update, watch]
  - apiGroups: [extensions, networking.k8s.io]
    resources: [ingresses]
    verbs: [create, delete, get, list, patch, update, watch]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: spinnaker-binding
subjects:
  - kind: ServiceAccount
    name: spinnaker-service-account
    namespace: spinnaker
roleRef:
  kind: ClusterRole
  name: spinnaker-role
  apiGroup: rbac.authorization.k8s.io
```
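A common cause of "forbidden" errors is a role that grants most verbs but misses one. The sketch below checks a list of RBAC rules (already loaded into Python dictionaries) for missing verbs on a given resource. The required verb set, the function name, and the sample rules are assumptions for illustration; in particular, the sketch assumes manifests are applied with patch requests, so `patch` is treated as required.

```python
# Verbs this sketch treats as required for deploying a resource (an assumption).
REQUIRED_VERBS = {"get", "list", "create", "update", "patch", "delete"}


def missing_permissions(rules, api_group, resource, required=REQUIRED_VERBS):
    """Return the required verbs that no rule grants for api_group/resource."""
    granted = set()
    for rule in rules:
        groups = rule.get("apiGroups", [])
        resources = rule.get("resources", [])
        group_matches = api_group in groups or "*" in groups
        resource_matches = resource in resources or "*" in resources
        if group_matches and resource_matches:
            granted.update(rule.get("verbs", []))
    if "*" in granted:
        return set()
    return set(required) - granted


# Hypothetical rule set that omits the patch verb.
sample_rules = [
    {"apiGroups": [""], "resources": ["namespaces", "pods", "services"],
     "verbs": ["create", "delete", "get", "list", "update"]},
    {"apiGroups": ["apps"], "resources": ["deployments", "replicasets"],
     "verbs": ["create", "delete", "get", "list", "update"]},
]

print(sorted(missing_permissions(sample_rules, "apps", "deployments")))
```

For a live cluster, `kubectl auth can-i patch deployments --as=system:serviceaccount:spinnaker:spinnaker-service-account` answers the same question directly.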
## Fix 3: AWS Deployment Issues
Deployments to AWS fail.
Symptoms:
- "Failed to create ASG"
- "EC2 instance launch failed"
- "S3 bucket not found"
Diagnosis:
```bash
# Check Clouddriver AWS logs
kubectl logs -n spinnaker deployment/clouddriver | grep -i aws

# Verify AWS account
hal config provider aws account list

# Check AWS credentials
aws sts get-caller-identity
```
Solution A: Configure AWS account:
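```bash
# Add AWS account via Halyard (the account ID below is a placeholder)
hal config provider aws account add my-aws \
  --account-id 123456789012 \
  --assume-role role/spinnaker-managed \
  --default-key-pair my-keypair \
  --edda http://edda:8080 \
  --regions us-east-1

# Static credentials are set at the provider level, not per account;
# Halyard prompts for the secret key on standard input
hal config provider aws edit \
  --access-key-id $AWS_ACCESS_KEY_ID \
  --secret-access-key

# Or rely on an instance profile and change only the assumed role
hal config provider aws account edit my-aws \
  --assume-role role/spinnaker-managed
```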
Solution B: Fix IAM permissions:
Spinnaker needs AWS permissions:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"autoscaling:*",
"ec2:*",
"elasticloadbalancing:*",
"iam:PassRole",
"iam:GetRole",
"s3:GetObject",
"s3:PutObject",
"s3:ListBucket"
],
"Resource": "*"
}
]
}
Solution C: Check security groups:
```bash
# Verify security group exists
aws ec2 describe-security-groups --group-names my-security-group

# Create if missing
aws ec2 create-security-group \
  --group-name my-security-group \
  --description "Spinnaker deployment group"
```
## Fix 4: Image Bake Failures
Baking (building) images fails.
Symptoms:
- "Failed to bake image"
- "Packaging failed"
- "Packer execution error"
Diagnosis:
```bash
# Check Rosco logs
kubectl logs -n spinnaker deployment/rosco | grep -i bake

# Check Packer execution
# Rosco runs Packer for baking
```
Solution A: Fix packer template:
In bake stage configuration:
{
"bake": {
"baseLabel": "base",
"baseOs": "ubuntu",
"cloudProviderType": "aws",
"amiName": "my-app-${trigger.buildInfo.number}",
"package": "my-app.deb",
"user": "appuser"
}
}
Verify package exists:
# Check package is available
curl -I https://repo.company.com/my-app.deb
Solution B: Configure bake options:
```bash
# Configure default bake options
# (confirm the exact subcommand and flags with `hal config provider aws --help`)
hal config provider aws bake edit \
  --base-amis ubuntu=ami-12345678 \
  --default-ami-name my-app

# Apply
hal deploy apply
```
Solution C: Check Rosco configuration:
# In rosco.yml
packer:
template:
aws:
templateFile: aws-ebs.json
baseAmi: ami-12345678
amiName: spinnaker-${{timestamp}}
## Fix 5: Canary Analysis Failures
Canary deployments fail analysis.
Symptoms:
- "Canary score below threshold"
- "Kayenta analysis failed"
- Metric source unavailable
Diagnosis:
```bash
# Check Kayenta logs
kubectl logs -n spinnaker deployment/kayenta | grep -i canary

# List canary configs through Gate (confirms Gate can reach Kayenta)
curl -H "Authorization: Bearer $TOKEN" \
  "https://spinnaker.example.com/v2/canaryConfig"
```
Solution A: Configure canary judge:
{
"canary": {
"judge": "NetflixACAJudge-v1.0",
"metricsAccount": "prometheus-account",
"storageAccount": "s3-account",
"thresholds": {
"pass": 95,
"marginal": 75,
"fail": 0
}
}
}
Solution B: Fix metric source:
For Prometheus:
```bash
# Add Prometheus account
hal config canary prometheus account add my-prometheus \
  --base-url http://prometheus.company.com:9090

# Enable canary analysis and the Prometheus integration
hal config canary enable
hal config canary prometheus enable
```
For Datadog:
```bash
hal config canary datadog account add my-datadog \
  --application-key $DATADOG_APP_KEY \
  --api-key $DATADOG_API_KEY

hal config canary datadog enable
```
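Before adjusting thresholds, it can help to see how a score maps onto the `pass` and `marginal` values shown in Solution A. The sketch below is a simplified model, assuming a single score between 0 and 100: at or above `pass` succeeds, between `marginal` and `pass` is marginal, and anything below `marginal` fails. The function name is hypothetical, and how a pipeline reacts to a marginal result depends on the canary stage configuration.

```python
def classify_canary_score(score, pass_threshold=95, marginal_threshold=75):
    """Classify a canary score as 'pass', 'marginal', or 'fail'."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if marginal_threshold > pass_threshold:
        raise ValueError("marginal threshold must not exceed pass threshold")
    if score >= pass_threshold:
        return "pass"
    if score >= marginal_threshold:
        return "marginal"
    return "fail"


# The same score of 85 under strict and relaxed thresholds.
print(classify_canary_score(85))
print(classify_canary_score(85, pass_threshold=80, marginal_threshold=60))
```

A score that lands in the marginal band under the default thresholds can pass once they are relaxed, which is why lowered thresholds should be a temporary testing aid.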
Solution C: Adjust thresholds:
Lower the thresholds temporarily for testing (here the pass threshold drops from 95 to 80):
{
"thresholds": {
"pass": 80,
"marginal": 60,
"fail": 0
}
}
## Fix 6: Load Balancer Configuration Issues
Load balancers don't configure correctly.
Symptoms:
- "Load balancer not found"
- "ELB configuration failed"
- "Ingress creation failed"
Solution A: For AWS ELB:
```bash
# Verify ELB exists
aws elb describe-load-balancers --load-balancer-names my-elb

# Check Spinnaker load balancer config
# In pipeline: Create Load Balancer stage
```
Solution B: For Kubernetes Ingress:
# In Deploy stage, add load balancers
loadBalancers:
- name: my-service
namespace: default
type: service
port: 80
targetPort: 8080
Solution C: Configure health checks:
{
"loadBalancer": {
"name": "my-lb",
"healthCheck": {
"protocol": "HTTP",
"port": 8080,
"path": "/health",
"intervalSeconds": 30,
"timeoutSeconds": 5,
"healthyThreshold": 2,
"unhealthyThreshold": 5
}
}
}
## Fix 7: Approval Gate Failures
Manual approvals don't work correctly.
Symptoms:
- "Waiting for approval"
- Approval notification not sent
- Can't approve deployment
Solution A: Configure approval stage:
{
"type": "manualJudgment",
"name": "Approve Deployment",
"instructions": "Please review before deploying to production",
"judgmentInputs": [
{
"value": "approve",
"description": "Approve deployment"
},
{
"value": "reject",
"description": "Reject deployment"
}
]
}
Solution B: Set notification:
```bash
# Enable Slack notifications, then set the bot token
# (Halyard prompts for the token on standard input)
hal config notification slack enable
hal config notification slack edit --token
```
Add to the pipeline JSON:
{
"notifications": [
{
"type": "slack",
"address": "#deployments"
}
]
}
Solution C: Check permissions:
# Users need permission to approve
# Enable Fiat (the authorization service)
hal config security authz enable
## Fix 8: Rollback Failures
Rollback operations don't complete.
Symptoms:
- "Rollback failed"
- "Previous version not found"
- "Unable to undo deployment"
Solution A: Verify previous version exists:
```bash
# Check previous server groups in Spinnaker UI
# Infrastructure → Clusters → [App] → [Cluster]

# Or via API
curl -H "Authorization: Bearer $TOKEN" \
  "https://spinnaker.example.com/applications/appName/serverGroups"
```
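Given the list of server group names returned by the API, finding the rollback target is a matter of picking the highest version below the current one. This sketch assumes names follow the `<cluster>-vNNN` convention used by providers such as AWS; Kubernetes manifests name ReplicaSets differently, so it does not apply there. The function name and sample names are hypothetical.

```python
import re

# Matches names such as "my-app-prod-v003" (cluster "my-app-prod", sequence 3).
VERSION_SUFFIX = re.compile(r"^(?P<cluster>.+)-v(?P<seq>\d{3,})$")


def previous_server_group(names, current):
    """Return the server group in the same cluster with the next-lower version, or None."""
    match = VERSION_SUFFIX.match(current)
    if not match:
        return None
    cluster, current_seq = match["cluster"], int(match["seq"])

    candidates = []
    for name in names:
        other = VERSION_SUFFIX.match(name)
        if other and other["cluster"] == cluster and int(other["seq"]) < current_seq:
            candidates.append((int(other["seq"]), name))
    # max() compares the sequence number first, so the newest older group wins.
    return max(candidates)[1] if candidates else None


sample_names = ["my-app-prod-v001", "my-app-prod-v002", "my-app-prod-v003", "other-app-v009"]
print(previous_server_group(sample_names, "my-app-prod-v003"))
```

If the function returns `None`, there is nothing to roll back to, which matches the "Previous version not found" symptom.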
Solution B: Configure rollback stage:
{
"type": "undoRolloutManifest",
"account": "my-k8s-account",
"cloudProvider": "kubernetes",
"location": "my-namespace",
"manifestName": "deployment my-app",
"numRevisionsBack": 1
}
Solution C: Manual rollback:
```bash
# If automatic rollback fails, use Kubernetes
kubectl rollout undo deployment/my-app -n my-namespace

# Or deploy previous image
kubectl set image deployment/my-app my-app=org/image:v1.0.0 -n my-namespace
```
## Fix 9: Webhook Integration Failures
External webhook triggers fail.
Symptoms:
- "Webhook trigger failed"
- "Invalid webhook payload"
- Pipeline doesn't trigger
Solution A: Configure webhook trigger:
{
"triggers": [
{
"type": "webhook",
"source": "github-webhook",
"payloadConstraints": {
"ref": "refs/heads/main"
}
}
]
}
Solution B: Test webhook:
# Send test webhook
curl -X POST \
-H "Content-Type: application/json" \
-d '{"ref":"refs/heads/main","repository":{"name":"my-repo"}}' \
"https://spinnaker.example.com/webhooks/webhook/github-webhook"
Solution C: Check Echo configuration:
# Check Echo (webhook handler) logs
kubectl logs -n spinnaker deployment/echo | grep webhook
## Fix 10: Pipeline Expression Errors
Spinnaker expressions evaluate incorrectly.
Symptoms:
- ${...} expression returns null
- Expression evaluation error
- Wrong value substituted
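A null result usually means one key along the path is missing. The Python sketch below is only an analogy for what safe navigation (`?.`) combined with a default (`?:`) achieves in an expression: walk the path, and fall back to a default as soon as a step is missing or null. The helper name and sample trigger are hypothetical, and this is not how Spinnaker evaluates expressions internally.

```python
def safe_get(data, path, default=None):
    """Walk a dotted path through nested dicts; return default if any step is missing or None."""
    current = data
    for key in path.split("."):
        if not isinstance(current, dict) or current.get(key) is None:
            return default
        current = current[key]
    return current


# Hypothetical trigger context: 'environment' is present, 'value' is not.
sample_trigger = {"payload": {"environment": "staging"}}

print(safe_get(sample_trigger, "payload.environment", "dev"))
print(safe_get(sample_trigger, "payload.value", "default"))
```

Checking the trigger payload this way before a run shows which expressions will fall back to their defaults.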
Solution A: Use correct expression syntax:
Spinnaker pipeline expressions use Spring Expression Language (SpEL) wrapped in ${...}:
{
"name": "${execution.application}",
"ami": "myapp-${trigger.buildInfo.number}",
"env": "${trigger.payload.environment ?: 'dev'}"
}
Solution B: Debug expressions:
Add a webhook stage to echo expression values:
{
"type": "webhook",
"url": "https://echo.company.com/debug",
"payload": {
"app": "${execution.application}",
"build": "${trigger.buildInfo.number}",
"trigger": "${trigger}"
}
}
Solution C: Use safe expressions:
Use the safe navigation operator (?.) to guard against null intermediate values and the Elvis operator (?:) to supply a default:
{
"value": "${trigger.payload?.value ?: 'default'}"
}
## Quick Reference: Spinnaker Errors
| Error | Component | Solution |
|---|---|---|
| Pipeline failed | Orca | Check stage config, JSON |
| Kubernetes deploy failed | Clouddriver | Fix account, RBAC |
| AWS deploy failed | Clouddriver | Configure AWS, IAM |
| Bake failed | Rosco | Fix Packer template |
| Canary failed | Kayenta | Configure metrics, thresholds |
| Load balancer failed | Clouddriver | Create LB, health checks |
| Approval stuck | Echo, Fiat | Configure notifications, auth |
| Rollback failed | Orca | Verify previous version |
| Webhook failed | Echo | Fix trigger config |
| Expression error | Orca | Fix expression syntax, add null-safe defaults |
## Debugging Commands
```bash
# Check Spinnaker services
kubectl get pods -n spinnaker

# Check Orca logs (pipelines)
kubectl logs -n spinnaker deployment/orca -f

# Check Clouddriver logs (cloud operations)
kubectl logs -n spinnaker deployment/clouddriver -f

# Check Echo logs (notifications, webhooks)
kubectl logs -n spinnaker deployment/echo -f

# Check Kayenta logs (canary)
kubectl logs -n spinnaker deployment/kayenta -f

# Check Front50 logs (persistence)
kubectl logs -n spinnaker deployment/front50 -f

# Halyard commands
hal status
hal config list
hal deploy apply

# API calls
curl -H "Authorization: Bearer $TOKEN" \
  "https://spinnaker.example.com/pipelines/executions"
```
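The quick reference table can also be encoded as a small lookup that prints the right log command for a given kind of failure. This is a sketch: the category keys and function name are hypothetical, and the deployment names follow the ones used in this guide. Some installs prefix them (for example `spin-orca`), which the `prefix` parameter allows for.

```python
# Failure category -> Spinnaker services to inspect, following the quick reference table.
COMPONENTS_FOR_ERROR = {
    "pipeline": ["orca"],
    "kubernetes": ["clouddriver"],
    "aws": ["clouddriver"],
    "bake": ["rosco"],
    "canary": ["kayenta"],
    "load-balancer": ["clouddriver"],
    "approval": ["echo", "fiat"],
    "rollback": ["orca"],
    "webhook": ["echo"],
}


def log_commands(error_kind, namespace="spinnaker", prefix="", tail=200):
    """Return the kubectl commands that show recent logs for the relevant services."""
    if error_kind not in COMPONENTS_FOR_ERROR:
        known = ", ".join(sorted(COMPONENTS_FOR_ERROR))
        raise KeyError(f"unknown error kind '{error_kind}'; expected one of: {known}")
    return [
        f"kubectl logs -n {namespace} deployment/{prefix}{service} --tail={tail}"
        for service in COMPONENTS_FOR_ERROR[error_kind]
    ]


for command in log_commands("approval"):
    print(command)
```

Returning command strings rather than running them keeps the helper safe to use on a machine without cluster access.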