# Kubernetes Deployment Failed in CI
## Common Error Patterns

Kubernetes deployment failures typically show:

```text
Error from server (Forbidden): deployments.apps is forbidden
Deployment "myapp" failed: ReplicaSet "myapp-xxx" has timed out
Failed to pull image "myapp:latest": rpc error: code = Unknown
Failed to create pod: pod "myapp-xxx" is forbidden: exceeded quota
0/3 nodes are available: 3 Insufficient cpu
```

## Root Causes and Solutions
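As a rough triage aid, each error pattern listed above can be matched to one of the numbered causes in this section. The sketch below does that with plain shell pattern matching. The function name `classify_deploy_error` and the category labels are illustrative assumptions, not part of kubectl, and the patterns cover only the messages shown in this article plus the `ImagePullBackOff` and `ErrImagePull` pod states and the "exceeded its progress deadline" rollout message.

```bash
# classify_deploy_error LINE
# Maps one line of deploy output to a likely cause (hypothetical helper).
# The quota pattern is tested first because quota errors also contain
# the words "is forbidden".
classify_deploy_error() {
  case "$1" in
    *"is forbidden: exceeded quota"*)
      echo "quota-exceeded" ;;
    *"(Forbidden)"*|*"is forbidden"*)
      echo "rbac-denied" ;;
    *"Failed to pull image"*|*ImagePullBackOff*|*ErrImagePull*)
      echo "image-pull" ;;
    *"nodes are available"*)
      echo "insufficient-nodes" ;;
    *"has timed out"*|*"exceeded its progress deadline"*)
      echo "rollout-timeout" ;;
    *)
      echo "unknown" ;;
  esac
}
```

In a pipeline, the function could be given the last error line of a failed deploy step, for example `classify_deploy_error "$(tail -n 1 deploy.log)"`, where `deploy.log` is a hypothetical log file.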
### 1. RBAC Permission Denied

The CI service account lacks the Kubernetes permissions required to manage deployments.

Solution:

Create a ServiceAccount, a Role, and a RoleBinding for CI:
```yaml
# Service account for CI
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ci-deployer
  namespace: production
---
# Role with deployment permissions
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deployer-role
  namespace: production
rules:
  # Deployments and ReplicaSets live in the "apps" API group
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  # Pods live in the core ("") API group, so they need their own rule
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: [""]
    resources: ["secrets", "configmaps", "services"]
    verbs: ["get", "list", "create", "update", "patch"]
---
# Bind role to service account
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: deployer-binding
  namespace: production
subjects:
  - kind: ServiceAccount
    name: ci-deployer
    namespace: production
roleRef:
  kind: Role
  name: deployer-role
  apiGroup: rbac.authorization.k8s.io
```
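Apply the RBAC configuration:

```bash
kubectl apply -f rbac.yaml
```

Get a token for CI:

```bash
# Create a time-limited token (requires Kubernetes 1.24+)
kubectl create token ci-deployer --duration=24h -n production

# Or create a long-lived token backed by a Secret
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: ci-deployer-token
  namespace: production
  annotations:
    kubernetes.io/service-account.name: ci-deployer
type: kubernetes.io/service-account-token
EOF

# Read the token once the control plane has populated the Secret
kubectl get secret ci-deployer-token -n production -o jsonpath='{.data.token}' | base64 -d
```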
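Once the Role and RoleBinding are applied, it can be worth confirming that the service account really has the intended permissions before wiring the token into CI. A minimal sketch using `kubectl auth can-i` with impersonation follows. It assumes the caller is allowed to impersonate service accounts (cluster admins usually are), and `check_ci_permissions` is a hypothetical helper name.

```bash
# check_ci_permissions NAMESPACE SERVICE_ACCOUNT
# Prints each verb the service account may NOT perform on deployments and
# returns non-zero if any verb is denied. Relies on `kubectl auth can-i`
# exiting non-zero when the answer is "no".
check_ci_permissions() {
  local ns="$1" sa="$2" denied=0 verb
  for verb in get list watch create update patch delete; do
    if ! kubectl auth can-i "$verb" deployments.apps \
        --as="system:serviceaccount:${ns}:${sa}" -n "$ns" >/dev/null 2>&1; then
      echo "denied: $verb deployments"
      denied=1
    fi
  done
  return "$denied"
}
```

Running `check_ci_permissions production ci-deployer` prints nothing and returns zero when all seven verbs are allowed.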
### 2. Manifest Validation Errors

The manifest contains invalid YAML syntax or an invalid Kubernetes configuration.

Solution:

Validate manifests before applying them:
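```bash
# Validate with kubectl (client-side)
kubectl apply --dry-run=client -f deployment.yaml

# Validate server-side (the API server processes the request without persisting it)
kubectl apply --dry-run=server -f deployment.yaml

# Use kubeval (no longer maintained; kubeconform is the usual replacement)
kubeval deployment.yaml

# Use kubeconform
kubeconform -schema-location default deployment.yaml
```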
In CI pipeline:
```yaml
# GitHub Actions (steps excerpt)
- name: Validate manifests
  run: |
    kubectl apply --dry-run=client -f k8s/

- name: Validate with kubeconform
  uses: instrumenta/kubeconform-action@v0.1.0
  with:
    manifests: 'k8s/*.yaml'
```
### 3. Image Pull Failures

The cluster cannot pull the container image from the registry, typically because registry credentials are missing or invalid.

Solution:

Create an image pull secret:
```bash
kubectl create secret docker-registry regcred \
  --docker-server=<registry-server> \
  --docker-username=<username> \
  --docker-password=<password> \
  --docker-email=<email> \
  -n production
```

Reference the secret in the deployment:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      imagePullSecrets:
        - name: regcred
      containers:
        - name: myapp
          image: registry/myapp:v1
```

For AWS ECR:
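```bash
# Create a pull secret from a temporary ECR login token
kubectl create secret docker-registry ecr-cred \
  --docker-server=123456789012.dkr.ecr.us-east-1.amazonaws.com \
  --docker-username=AWS \
  --docker-password="$(aws ecr get-login-password --region us-east-1)" \
  -n production
```

ECR login tokens expire after 12 hours, so recreate this secret on each pipeline run or on a schedule.

### 4. Resource Quota Exceeded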
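The namespace's ResourceQuota does not leave enough room for the resources the deployment requests.

Solution:

Check the quota:

```bash
kubectl get quota -n production
kubectl describe quota production-quota -n production
```

Request appropriate resources. When a quota constrains CPU or memory requests and limits, every container must declare those values (or inherit them from a LimitRange); otherwise pod creation is rejected: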
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myapp:v1
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
```

Or increase the quota:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 10Gi
    limits.cpu: "20"
    limits.memory: 20Gi
    pods: "50"
```

### 5. Insufficient Node Resources
No node has enough unallocated CPU or memory to schedule the pods.

Solution:

Check node resources:

```bash
kubectl describe nodes
kubectl top nodes   # requires metrics-server
```

View the resources already allocated on a node:

```bash
kubectl describe node node-1 | grep -A 5 "Allocated resources"
```

Options:

- Add nodes to the cluster
- Reduce resource requests
- Use node autoscaling:
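```yaml
# Cluster Autoscaler configuration (excerpt; a full installation also needs
# a service account, RBAC rules, and cloud provider settings)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
spec:
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      containers:
        - name: cluster-autoscaler
          image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.25.0
          command:
            - ./cluster-autoscaler
            - --scale-down-delay-after-add=10m
            - --scale-down-unneeded-time=10m
            # Node group bounds use the form <min>:<max>:<node-group-name>
            - --nodes=2:10:<node-group-name>
```

### 6. Deployment Rollout Timeout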
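The deployment does not become ready before its progress deadline or before the CI job's own timeout.

Solution:

Increase the progress deadline and make sure the readiness probe allows for the application's startup time: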
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  progressDeadlineSeconds: 900  # raised from the default of 600 seconds
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myapp:v1
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
```

Wait for the rollout in CI:
```bash
# Wait for deployment (the command fails after 300s, even if the progress deadline is longer)
kubectl rollout status deployment/myapp -n production --timeout=300s

# Check rollout history
kubectl rollout history deployment/myapp -n production

# View deployment status
kubectl describe deployment myapp -n production
```
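`kubectl rollout status` only reports the failure; the failed revision stays in place until someone intervenes. One possible pattern, sketched here under the assumption that the previous revision is known to be good, is to undo the rollout when the wait fails. `deploy_with_rollback` is a hypothetical helper, not a kubectl command.

```bash
# deploy_with_rollback NAMESPACE DEPLOYMENT [TIMEOUT]
# Waits for the rollout. If it fails or times out, reverts to the previous
# revision and still returns non-zero so the CI job is marked as failed.
deploy_with_rollback() {
  local ns="$1" name="$2" timeout="${3:-300s}"
  if kubectl rollout status "deployment/${name}" -n "$ns" --timeout="$timeout"; then
    echo "rollout of ${name} succeeded"
    return 0
  fi
  echo "rollout of ${name} failed, rolling back" >&2
  kubectl rollout undo "deployment/${name}" -n "$ns"
  return 1
}
```

A deploy step could then run `kubectl apply -f k8s/ -n production && deploy_with_rollback production myapp 300s`.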
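### 7. Health Check Failures

The pod fails its readiness or liveness probe. A failing readiness probe keeps the pod out of Service endpoints; a failing liveness probe restarts the container.

Solution:

Configure probes that match the application's endpoints and startup time: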
```yaml
# Excerpt from the pod template (spec.template.spec) of the Deployment
containers:
  - name: myapp
    image: myapp:v1
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 5
      timeoutSeconds: 3
      failureThreshold: 3
      successThreshold: 1
    livenessProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 30
      periodSeconds: 10
      timeoutSeconds: 5
      failureThreshold: 3
```

Check pod status:
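```bash
kubectl describe pod myapp-xxx -n production
kubectl logs myapp-xxx -n production
# After a liveness restart, read the logs of the previous container instance
kubectl logs myapp-xxx -n production --previous
```

### 8. ConfigMap/Secret Missing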
The pod references a ConfigMap or Secret that does not exist in its namespace, so its containers cannot start.

Solution:

Verify that the referenced resources exist:
```bash
kubectl get configmap myapp-config -n production
kubectl get secret myapp-secret -n production
```

Create the missing resources:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: myapp-config
  namespace: production
data:
  config.yaml: |
    key: value
---
apiVersion: v1
kind: Secret
metadata:
  name: myapp-secret
  namespace: production
type: Opaque
stringData:
  password: secretvalue
```
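The same existence checks can run as a pre-flight step, so the pipeline stops before applying a deployment whose configuration is missing. A small sketch follows, with the hypothetical helper name `require_resources`, using the `kind/name` form that `kubectl get` accepts.

```bash
# require_resources NAMESPACE KIND/NAME...
# Reports every listed resource that does not exist in the namespace and
# returns non-zero if at least one is missing.
require_resources() {
  local ns="$1" missing=0 ref
  shift
  for ref in "$@"; do
    if ! kubectl get "$ref" -n "$ns" >/dev/null 2>&1; then
      echo "missing: $ref" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_resources production configmap/myapp-config secret/myapp-secret && kubectl apply -f k8s/ -n production` applies the manifests only when both resources are present.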
## CI Pipeline Configuration

### GitHub Actions Deploy
```yaml
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up kubectl
        uses: azure/setup-kubectl@v3

      - name: Configure kubeconfig
        run: |
          echo "${{ secrets.KUBECONFIG }}" | base64 -d > kubeconfig
          # A plain `export` is lost when this step ends; GITHUB_ENV persists it
          echo "KUBECONFIG=$PWD/kubeconfig" >> "$GITHUB_ENV"

      - name: Validate manifests
        run: kubectl apply --dry-run=client -f k8s/

      - name: Deploy
        run: |
          kubectl apply -f k8s/ -n production
          kubectl rollout status deployment/myapp -n production --timeout=300s
```
### GitLab CI Deploy
```yaml
deploy:
  stage: deploy
  image:
    name: bitnami/kubectl:latest
    # The image's default entrypoint is kubectl; clear it so GitLab can run a shell
    entrypoint: [""]
  script:
    - kubectl config set-cluster k8s --server="$KUBE_URL"
    - kubectl config set-credentials admin --token="$KUBE_TOKEN"
    - kubectl config set-context default --cluster=k8s --user=admin
    - kubectl config use-context default
    - kubectl apply --dry-run=client -f k8s/
    - kubectl apply -f k8s/ -n production
    - kubectl rollout status deployment/myapp -n production --timeout=300s
```

## Debugging Commands
```bash
# Check deployment status
kubectl get deployments -n production
kubectl describe deployment myapp -n production

# Check pods
kubectl get pods -n production -l app=myapp
kubectl describe pod myapp-xxx -n production

# Check events
kubectl get events -n production --sort-by=.metadata.creationTimestamp

# Check logs
kubectl logs deployment/myapp -n production --all-containers

# Debug pod
kubectl debug pod/myapp-xxx -n production -it --image=busybox

# Check rollout
kubectl rollout status deployment/myapp -n production
kubectl rollout history deployment/myapp -n production
kubectl rollout undo deployment/myapp -n production
```
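A CI job has no interactive shell, so it may help to gather the most useful output from the commands above automatically when a deploy step fails. The following is one way to do that. `dump_diagnostics` is a hypothetical helper, and the 20-event and 50-line limits are arbitrary choices to keep the job log short.

```bash
# dump_diagnostics NAMESPACE DEPLOYMENT
# Prints deployment state, the most recent events, and container logs.
# Each command is allowed to fail so that one error does not hide the rest.
dump_diagnostics() {
  local ns="$1" name="$2"
  echo "== deployment =="
  kubectl describe deployment "$name" -n "$ns" || true
  echo "== events =="
  kubectl get events -n "$ns" --sort-by=.metadata.creationTimestamp | tail -n 20 || true
  echo "== logs =="
  kubectl logs "deployment/${name}" -n "$ns" --all-containers --tail=50 || true
}
```

For example, a deploy step could end with `kubectl rollout status deployment/myapp -n production --timeout=300s || { dump_diagnostics production myapp; exit 1; }`.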
## Quick Reference

| Error | Command/Solution |
|---|---|
| RBAC denied | Create Role and RoleBinding |
| Image pull fail | Create imagePullSecret |
| Quota exceeded | Check and adjust ResourceQuota |
| No nodes available | Add nodes or reduce requests |
| Rollout timeout | Increase progressDeadlineSeconds |
| Probe failure | Configure proper health checks |
## Prevention Tips

1. Validate manifests before applying
2. Use `--dry-run=server` to catch server-side errors
3. Set proper resource requests and limits
4. Configure readiness probes
5. Use deployment strategies (rolling update)
6. Set up proper RBAC for the CI service account