Introduction
The Azure Kubernetes Service (AKS) cluster autoscaler can fail to add nodes when the node pool configuration or a subscription quota prevents scaling. This guide walks through diagnosing and resolving the problem step by step.
Symptoms
Typical error output:
Warning: FailedScaleOut
Cluster autoscaler could not add node: quota exceeded for 'standardDSv5Family' in region 'eastus'
Nodepool 'agentpool' cannot scale beyond 10 nodes
Common Causes
1. Subscription quota limit exceeded for the node VM family
2. Autoscaler max node count reached
3. Subnet IP address exhaustion
4. Nodepool configuration prevents scaling
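As a rough aid for triage, the causes above can be matched against the autoscaler's failure message. The sketch below is only illustrative: `classify_scale_failure` is a hypothetical helper, and its match patterns are assumptions based on the sample output under Symptoms, so real messages may be worded differently.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map an autoscaler failure message to a likely cause.
# The patterns are assumptions derived from the sample messages in Symptoms.
classify_scale_failure() {
  local msg
  # Lowercase the message so matching is case-insensitive.
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *"quota exceeded"*)       echo "quota" ;;      # cause 1
    *"cannot scale beyond"*)  echo "max-count" ;;  # cause 2
    *subnet*|*"ip address"*)  echo "subnet" ;;     # cause 3
    *)                        echo "other" ;;      # check nodepool config (cause 4)
  esac
}
```

Anything classified as `other` still needs a manual look at the nodepool configuration.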
Step-by-Step Fix
Step 1: Check Current State
az aks show --resource-group MyRG --name MyAKS --query agentPoolProfiles
az aks command invoke --resource-group MyRG --name MyAKS --command "kubectl get nodes"
kubectl describe nodes | grep -A5 "Capacity"
Step 2: Identify Root Cause
az monitor activity-log list --resource-group MyRG --status Failed
Step 3: Apply Primary Fix
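```bash
# Update nodepool autoscaler limits
# (--update-cluster-autoscaler is needed to change min/max on a pool that already has the autoscaler enabled)
az aks nodepool update --resource-group MyRG --cluster-name MyAKS --name agentpool --update-cluster-autoscaler --min-count 3 --max-count 20

# Request quota increase for VM family (uses the Azure CLI "quota" extension;
# the scope must include the resource provider and region)
az quota update --scope /subscriptions/<sub-id>/providers/Microsoft.Compute/locations/eastus --resource-name standardDSv5Family --resource-type dedicated --limit-object value=50
```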
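Before settling on a quota value, it can help to check whether the VM family quota covers the pool at its new maximum size. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the vCPU count per node depends on the VM size (which this guide does not state), and `family_usage` relies on `az vm list-usage`, which needs a logged-in Azure CLI.

```bash
#!/usr/bin/env bash
# vcpus_required NODES VCPUS_PER_NODE
# Prints the total vCPUs the pool consumes at the given node count.
vcpus_required() { echo $(( $1 * $2 )); }

# quota_fits CURRENT_USAGE LIMIT EXTRA
# Succeeds (exit 0) when CURRENT_USAGE + EXTRA stays within LIMIT.
quota_fits() { [ $(( $1 + $3 )) -le "$2" ]; }

# family_usage LOCATION FAMILY
# Prints "<currentValue><TAB><limit>" for one VM family in a region.
# Not called automatically; requires Azure CLI credentials.
family_usage() {
  az vm list-usage --location "$1" \
    --query "[?name.value=='$2'].[currentValue, limit]" -o tsv
}

# Example (commented out so the sketch has no side effects):
# read -r used limit < <(family_usage eastus standardDSv5Family)
# quota_fits "$used" "$limit" "$(vcpus_required 10 4)" || echo "quota too low"
```

For instance, with an assumed 4-vCPU node size, 20 nodes need 80 vCPUs, so a family limit of 50 would still block scaling to the new maximum.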
Step 4: Apply Alternative Fix
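```bash
# Alternative fix: check configuration
# (replace <resource-type>, <key>, and <value> with real values before running)
az resource show --resource-group MyRG --name MyResource --resource-type <resource-type> -o yaml

# Update specific properties
az resource update --resource-group MyRG --name MyResource --resource-type <resource-type> --set properties.<key>=<value>

# Verify the fix
az resource show --resource-group MyRG --name MyResource --resource-type <resource-type> --query "properties.<key>"
```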
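If the cause is subnet IP exhaustion rather than quota, a quick capacity estimate shows whether the subnet can hold the target node count. This sketch assumes an IPv4 subnet and uses hypothetical helper names; Azure reserves 5 addresses in every subnet, and the number of IPs each node consumes depends on the network plugin, so it is passed in as a parameter.

```bash
#!/usr/bin/env bash
# prefix_len_of CIDR
# Prints the prefix length of a CIDR block, e.g. 10.240.0.0/24 -> 24.
prefix_len_of() { echo "${1##*/}"; }

# subnet_usable_ips PREFIX_LEN
# Prints the usable addresses in an Azure IPv4 subnet
# (total addresses minus the 5 that Azure reserves).
subnet_usable_ips() {
  local prefix_len=$1
  echo $(( (1 << (32 - prefix_len)) - 5 ))
}

# max_nodes_for_subnet USABLE_IPS IPS_PER_NODE
# Prints how many nodes fit, rounding down.
max_nodes_for_subnet() { echo $(( $1 / $2 )); }

# Example (commented out; <vnet> and <subnet> are placeholders):
# cidr=$(az network vnet subnet show --resource-group MyRG \
#          --vnet-name <vnet> --name <subnet> --query addressPrefix -o tsv)
# max_nodes_for_subnet "$(subnet_usable_ips "$(prefix_len_of "$cidr")")" 31
```

The value 31 in the example assumes one IP for the node plus 30 pod IPs, as with traditional Azure CNI at 30 pods per node; with other plugins the per-node figure is smaller.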
Step 5: Verify the Fix
az aks show --resource-group MyRG --name MyAKS --query "agentPoolProfiles[0].count"
kubectl get nodes --no-headers | wc -l
Common Pitfalls
- Forgetting to check quota limits before resize operations
- Not waiting for asynchronous operations to complete before starting the next step
- Missing RBAC permissions for Azure resource operations
- Confusing subscription-level and resource-level quotas
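To avoid racing ahead of an asynchronous operation, a script can poll the provisioning state until it settles. The helper below is a hypothetical sketch, not an official pattern: it runs any command that prints a state and treats `Succeeded`, `Failed`, and `Canceled` as terminal.

```bash
#!/usr/bin/env bash
# wait_for_succeeded MAX_ATTEMPTS DELAY_SECONDS CMD [ARGS...]
# Runs CMD repeatedly until it prints "Succeeded".
# Returns 0 on success, 1 on a terminal failure state, 2 on timeout.
wait_for_succeeded() {
  local attempts=$1 delay=$2 state i
  shift 2
  for (( i = 1; i <= attempts; i++ )); do
    # A failing command is treated as "state not known yet".
    state=$("$@" 2>/dev/null) || state=""
    case "$state" in
      Succeeded) return 0 ;;
      Failed|Canceled)
        echo "operation ended in state: $state" >&2
        return 1 ;;
    esac
    sleep "$delay"
  done
  echo "timed out waiting for Succeeded" >&2
  return 2
}

# Example (commented out; requires Azure CLI credentials):
# wait_for_succeeded 40 15 az aks nodepool show --resource-group MyRG \
#   --cluster-name MyAKS --name agentpool --query provisioningState -o tsv
```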
Best Practices
- Always check quota before provisioning new resources
- Use Azure Resource Health for monitoring
- Implement proper error handling in Azure CLI scripts
- Enable diagnostic settings for all critical resources
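One possible shape for error handling in Azure CLI scripts is a small retry wrapper with exponential backoff, so that transient failures do not abort a longer workflow. `retry` here is a hypothetical helper written for illustration; the attempt count and delays are arbitrary starting points.

```bash
#!/usr/bin/env bash
# retry MAX_ATTEMPTS INITIAL_DELAY_SECONDS CMD [ARGS...]
# Runs CMD until it exits 0, doubling the delay after each failure.
# Returns 1 once MAX_ATTEMPTS have failed.
retry() {
  local max=$1 delay=$2 n=1
  shift 2
  until "$@"; do
    if (( n >= max )); then
      echo "failed after $n attempts: $*" >&2
      return 1
    fi
    sleep "$delay"
    delay=$(( delay * 2 ))
    n=$(( n + 1 ))
  done
}

# Example (commented out; requires Azure CLI credentials):
# retry 4 5 az aks show --resource-group MyRG --name MyAKS --query provisioningState
```

Retrying suits read-only or idempotent commands; a failed quota or permission check will not succeed on a second attempt and should be surfaced instead.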
Related Issues
- Azure Subscription Quota Exceeded
- Azure Resource Deployment Failed
- Azure Network Connectivity Issues
- Azure RBAC Permission Denied