Your Horizontal Pod Autoscaler (HPA) is configured, but pods aren't scaling. CPU or memory usage is high, you set the thresholds, yet the replica count stays static. HPA is designed to scale your workload automatically based on demand, but several issues can prevent it from working correctly.
## Understanding HPA Scaling
HPA monitors metrics (CPU, memory, or custom metrics) and adjusts replica count to meet targets. It reads metrics from the Metrics Server or custom metrics APIs, calculates desired replicas based on current metrics versus target, and instructs the deployment/replicaset controller to scale.
HPA won't scale if: the metrics server isn't working, resource requests aren't set (CPU utilization metrics need requests), scaling is blocked by min/max replica limits, or stabilization windows are delaying rapid scaling.
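At its core, the HPA controller computes desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue). A minimal sketch of that formula (the numbers are illustrative, not from a real cluster):

```python
import math

def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """HPA core formula: ceil(currentReplicas * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_metric / target_metric)

# 4 pods averaging 90% CPU against a 60% target -> scale up to 6
print(desired_replicas(4, 90, 60))  # 6

# 4 pods averaging 30% CPU against a 60% target -> scale down to 2
print(desired_replicas(4, 30, 60))  # 2
```

Note that currentMetricValue for CPU utilization is measured against the pod's resource *requests*, which is why missing requests break scaling entirely.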
## Diagnosis Commands
Start by checking HPA status:
```bash
# Check HPA status
kubectl get hpa hpa-name -n namespace

# Get detailed HPA information
kubectl describe hpa hpa-name -n namespace

# Check HPA metrics status
kubectl get hpa hpa-name -n namespace -o yaml | grep -A 20 metrics
```
Check metrics availability:
```bash
# Check if metrics server is running
kubectl get pods -n kube-system -l k8s-app=metrics-server

# Check metrics server deployment
kubectl get deployment metrics-server -n kube-system

# Verify metrics are available
kubectl top pods -n namespace
kubectl top nodes
```
Check deployment configuration:
```bash
# Check deployment has resource requests
kubectl get deployment deployment-name -n namespace -o yaml | grep -A 10 resources

# Check current replicas
kubectl get deployment deployment-name -n namespace
```
## Common Solutions

### Solution 1: Fix Missing Metrics Server
HPA needs the metrics server to get CPU/memory metrics:
```bash
# Check if metrics server exists
kubectl get pods -n kube-system -l k8s-app=metrics-server

# If missing, install metrics server
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# For Kubernetes < 1.22, may need:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.5.0/components.yaml
```
Verify metrics server is working:
```bash
# Check metrics server logs
kubectl logs -n kube-system deployment/metrics-server

# Test metrics API
kubectl top pods -n namespace

# If you see "Metrics API not available", check API registration
kubectl get apiservices v1beta1.metrics.k8s.io
kubectl describe apiservices v1beta1.metrics.k8s.io
```
### Solution 2: Fix Missing Resource Requests
HPA CPU scaling requires resource requests to be set:
```bash
# Check pod resource requests
kubectl get deployment deployment-name -n namespace -o jsonpath='{.spec.template.spec.containers[*].resources.requests}'
```
Add resource requests:
```yaml
spec:
  template:
    spec:
      containers:
      - name: app
        resources:
          requests:
            cpu: "100m"      # Required for HPA CPU scaling
            memory: "128Mi"
          limits:
            cpu: "500m"
            memory: "512Mi"
```
Apply the update:
```bash
kubectl apply -f deployment.yaml

# Verify HPA can now see metrics
kubectl describe hpa hpa-name -n namespace
```
### Solution 3: Fix Min/Max Replica Limits

HPA won't scale beyond its configured bounds:
```bash
# Check HPA limits (-E enables the | alternation in grep)
kubectl get hpa hpa-name -n namespace -o yaml | grep -E -A 5 "minReplicas|maxReplicas"

# Check current replicas vs limits
kubectl describe hpa hpa-name -n namespace
```
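These bounds clamp whatever the metrics math produces; a quick illustration with hypothetical numbers:

```python
import math

def clamped_replicas(current: int, metric: float, target: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Desired replicas from the HPA ratio, clamped to [minReplicas, maxReplicas]."""
    desired = math.ceil(current * metric / target)
    return max(min_replicas, min(desired, max_replicas))

# The formula says 12 replicas (6 * 140 / 70), but maxReplicas=10 caps the result
print(clamped_replicas(6, 140, 70, 2, 10))  # 10
```

If the clamped value differs from the raw desired count, `kubectl describe hpa` reports it in the conditions.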
Adjust replica limits:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  minReplicas: 2
  maxReplicas: 20  # Increase max if needed
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```

### Solution 4: Fix Metrics Not Available Error
If HPA shows "metrics not available":
```bash
# Check HPA events
kubectl describe hpa hpa-name -n namespace | grep -A 10 Events

# Common causes:
# - Metrics server not installed
# - Metrics server not healthy
# - Pod has no resource requests
```
Check metrics server health:
```bash
# Check metrics server pod
kubectl describe pod metrics-server-xxx -n kube-system

# Check metrics server service
kubectl get svc metrics-server -n kube-system

# Verify API service
kubectl get apiservices | grep metrics
kubectl describe apiservice v1beta1.metrics.k8s.io
```
Fix metrics server issues:
```yaml
# Metrics server might need insecure kubelet TLS for some clusters
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: metrics-server
        args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls  # Add if kubelet certs aren't trusted
```

### Solution 5: Fix Custom Metrics Issues
For custom metrics HPA, verify the custom metrics API:
```bash
# Check custom metrics APIs
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .

# Check specific metric
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/metrics/http_requests" | jq .
```
Install Prometheus adapter for custom metrics:
```bash
# Install prometheus-adapter
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/prometheus-adapter/main/deploy/manifests.yaml

# Or use Helm (the old "stable" repo is deprecated)
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus-adapter prometheus-community/prometheus-adapter -n monitoring
```
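For a Pods metric with an AverageValue target (as in the HPA below), the ratio formula reduces to ceil(metric total / target average). A rough sketch with made-up per-pod request rates:

```python
import math

def desired_from_pods_metric(per_pod_values: list[float], target_average: float) -> int:
    """Pods metric with an AverageValue target: ceil(metric total / target average)."""
    return math.ceil(sum(per_pod_values) / target_average)

# 3 pods serving 150, 180, and 170 req/s against a 100 req/s target -> 5 replicas
print(desired_from_pods_metric([150, 180, 170], 100))  # 5
```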
Configure custom metrics HPA:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: custom-metrics-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "100"
```

### Solution 6: Fix Scaling Behavior Issues
HPA has default stabilization windows that delay scaling:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # Wait 5 min before scaling down
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0    # No delay for scaling up
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
      selectPolicy: Max  # Use the policy that scales most
```

### Solution 7: Check for Deployment Issues
The target deployment might have issues:
```bash
# Check deployment status
kubectl get deployment deployment-name -n namespace

# Check if deployment can scale
kubectl scale deployment deployment-name -n namespace --replicas=5

# If manual scaling works, HPA should also work
kubectl describe hpa hpa-name -n namespace
```
### Solution 8: Fix Multiple HPA Conflict

Multiple HPAs targeting the same resource can conflict:
```bash
# Check for multiple HPAs
kubectl get hpa -n namespace -o wide | grep deployment-name

# Delete conflicting HPA
kubectl delete hpa conflicting-hpa -n namespace
```
### Solution 9: Verify Metrics Are Being Collected
Check if pods are actually using CPU/memory:
```bash
# Check current pod metrics
kubectl top pods -n namespace

# Check pod resource usage vs requests
kubectl describe hpa hpa-name -n namespace

# Calculate utilization
# Example: CPU 50m used / 100m requested = 50% utilization
# If the target is 70%, HPA won't scale up
```
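The utilization arithmetic from the comments above, as a small sketch:

```python
def cpu_utilization_percent(usage_millicores: float, request_millicores: float) -> float:
    """CPU utilization is measured against the request, not the limit."""
    return 100.0 * usage_millicores / request_millicores

util = cpu_utilization_percent(50, 100)  # 50m used / 100m requested
print(f"{util:.0f}% utilization")        # below a 70% target, so no scale-up
```

This is also why lowering a pod's CPU request makes the same absolute usage look "hotter" to the HPA.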
### Solution 10: Test HPA Manually
Force scaling to verify HPA can control deployment:
```bash
# Check HPA can control deployment
kubectl describe hpa hpa-name -n namespace

# Trigger scaling by generating load
kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://deployment-service:8080; done"

# Watch HPA response
kubectl get hpa hpa-name -n namespace -w
```
## Verification
After fixing HPA issues:
```bash
# Check HPA is receiving metrics
kubectl describe hpa hpa-name -n namespace

# Watch HPA scaling behavior
kubectl get hpa hpa-name -n namespace -w

# Check deployment replicas
kubectl get deployment deployment-name -n namespace

# Verify metrics are available
kubectl top pods -n namespace
```
## HPA Status Messages
| Status | Meaning | Solution |
|---|---|---|
| "metrics not available" | Can't get metrics | Install/fix metrics server |
| "missing request for cpu" | Pod has no CPU request | Add resource requests |
| "the HPA was unable to compute the replica count" | Metrics calculation failed | Check metrics API |
| "Desired replica count is less than minReplicas" | Below minimum | Increase load or lower min |
| "Desired replica count is greater than maxReplicas" | Above maximum | Increase maxReplicas |
| "New replica count matched current" | No scaling needed | Normal if metrics within target |
## HPA Not Scaling Causes Summary
| Cause | Check Command | Solution |
|---|---|---|
| Metrics server missing | `kubectl top pods` | Install metrics server |
| No resource requests | `kubectl get deploy -o yaml` | Add CPU/memory requests |
| minReplicas too high | `kubectl describe hpa` | Lower minReplicas |
| maxReplicas too low | `kubectl describe hpa` | Increase maxReplicas |
| Metrics API unhealthy | `kubectl get apiservices` | Fix metrics server |
| Custom metrics missing | `kubectl get --raw` | Install prometheus-adapter |
| Stabilization window | `kubectl describe hpa` | Adjust behavior config |
| Multiple HPA conflict | `kubectl get hpa` | Delete duplicate HPA |
## Prevention Best Practices

- Always set resource requests for workloads using HPA.
- Install the metrics server before creating an HPA.
- Set appropriate min/max replica bounds.
- Use custom metrics for application-specific scaling.
- Configure scaling behavior for smooth scale-up/down.
- Monitor HPA status regularly.
- Test scaling behavior under realistic load.
HPA not scaling usually comes down to missing metrics or missing resource requests. The `kubectl describe hpa` command tells you exactly why HPA isn't scaling; read the events and conditions carefully to find the root cause.