Your Horizontal Pod Autoscaler (HPA) is configured, but pods aren't scaling. CPU or memory usage is high, you set the thresholds, yet the replica count stays static. HPA is designed to scale your workload automatically based on demand, but several issues can prevent it from working correctly.
## Understanding HPA Scaling
HPA monitors metrics (CPU, memory, or custom metrics) and adjusts replica count to meet targets. It reads metrics from the Metrics Server or custom metrics APIs, calculates desired replicas based on current metrics versus target, and instructs the deployment/replicaset controller to scale.
HPA won't scale if: the metrics server isn't working, resource requests aren't set (CPU utilization metrics need requests), scaling is blocked by min/max replica limits, or stabilization windows are delaying rapid scaling.
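At its core, the HPA controller computes desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue). A minimal sketch of that formula (the numbers are illustrative, not from a real cluster):

```python
import math

def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """HPA core formula: ceil(currentReplicas * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_metric / target_metric)

# 4 pods averaging 90% CPU against a 60% target -> scale up to 6
print(desired_replicas(4, 90, 60))  # 6

# 4 pods averaging 30% CPU against a 60% target -> scale down to 2
print(desired_replicas(4, 30, 60))  # 2
```

Note that currentMetricValue for CPU utilization is measured against the pod's resource *requests*, which is why missing requests break scaling entirely.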
## Diagnosis Commands
Start by checking HPA status:
```bash
# Check HPA status
kubectl get hpa hpa-name -n namespace

# Get detailed HPA information
kubectl describe hpa hpa-name -n namespace

# Check HPA metrics status
kubectl get hpa hpa-name -n namespace -o yaml | grep -A 20 metrics
```
Check metrics availability:
```bash
# Check if metrics server is running
kubectl get pods -n kube-system -l k8s-app=metrics-server

# Check metrics server deployment
kubectl get deployment metrics-server -n kube-system

# Verify metrics are available
kubectl top pods -n namespace
kubectl top nodes
```
Check deployment configuration:
```bash
# Check deployment has resource requests
kubectl get deployment deployment-name -n namespace -o yaml | grep -A 10 resources

# Check current replicas
kubectl get deployment deployment-name -n namespace
```
## Common Solutions

### Solution 1: Fix Missing Metrics Server
HPA needs the metrics server to get CPU/memory metrics:
```bash
# Check if metrics server exists
kubectl get pods -n kube-system -l k8s-app=metrics-server

# If missing, install metrics server
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# For Kubernetes < 1.22, may need:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.5.0/components.yaml
```
Verify metrics server is working:
```bash
# Check metrics server logs
kubectl logs -n kube-system deployment/metrics-server

# Test metrics API
kubectl top pods -n namespace

# If you see "Metrics API not available", check API registration
kubectl get apiservices v1beta1.metrics.k8s.io
kubectl describe apiservices v1beta1.metrics.k8s.io
```
### Solution 2: Fix Missing Resource Requests
HPA CPU scaling requires resource requests to be set:
```bash
# Check pod resource requests
kubectl get deployment deployment-name -n namespace -o jsonpath='{.spec.template.spec.containers[*].resources.requests}'
```
Add resource requests:
```yaml
spec:
  template:
    spec:
      containers:
      - name: app
        resources:
          requests:
            cpu: "100m"      # Required for HPA CPU scaling
            memory: "128Mi"
          limits:
            cpu: "500m"
            memory: "512Mi"
```
Apply the update:
```bash
kubectl apply -f deployment.yaml

# Verify HPA can now see metrics
kubectl describe hpa hpa-name -n namespace
```
### Solution 3: Fix Min/Max Replica Limits

HPA won't scale beyond its configured bounds:
```bash
# Check HPA limits (-E enables the | alternation in grep)
kubectl get hpa hpa-name -n namespace -o yaml | grep -E -A 5 "minReplicas|maxReplicas"

# Check current replicas vs limits
kubectl describe hpa hpa-name -n namespace
```
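These bounds clamp whatever the metrics math produces; a quick illustration with hypothetical numbers:

```python
import math

def clamped_replicas(current: int, metric: float, target: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Desired replicas from the HPA ratio, clamped to [minReplicas, maxReplicas]."""
    desired = math.ceil(current * metric / target)
    return max(min_replicas, min(desired, max_replicas))

# The formula says 12 replicas (6 * 140 / 70), but maxReplicas=10 caps the result
print(clamped_replicas(6, 140, 70, 2, 10))  # 10
```

If the clamped value differs from the raw desired count, `kubectl describe hpa` reports it in the conditions.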
Adjust replica limits:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  minReplicas: 2
  maxReplicas: 20  # Increase max if needed
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```

### Solution 4: Fix Metrics Not Available Error
If HPA shows "metrics not available":
```bash
# Check HPA events
kubectl describe hpa hpa-name -n namespace | grep -A 10 Events

# Common causes:
# - Metrics server not installed
# - Metrics server not healthy
# - Pod has no resource requests
```
Check metrics server health:
```bash
# Check metrics server pod
kubectl describe pod metrics-server-xxx -n kube-system

# Check metrics server service
kubectl get svc metrics-server -n kube-system

# Verify API service
kubectl get apiservices | grep metrics
kubectl describe apiservice v1beta1.metrics.k8s.io
```
Fix metrics server issues:
```yaml
# Metrics server might need insecure kubelet TLS for some clusters
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: metrics-server
        args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls  # Add if kubelet certs aren't trusted
```

### Solution 5: Fix Custom Metrics Issues
For custom metrics HPA, verify the custom metrics API:
```bash
# Check custom metrics APIs
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .

# Check specific metric
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/metrics/http_requests" | jq .
```
Install Prometheus adapter for custom metrics:
```bash
# Install prometheus-adapter
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/prometheus-adapter/main/deploy/manifests.yaml

# Or use Helm (the old "stable" repo is deprecated)
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus-adapter prometheus-community/prometheus-adapter -n monitoring
```
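For a Pods metric with an AverageValue target (as in the HPA below), the ratio formula reduces to ceil(metric total / target average). A rough sketch with made-up per-pod request rates:

```python
import math

def desired_from_pods_metric(per_pod_values: list[float], target_average: float) -> int:
    """Pods metric with an AverageValue target: ceil(metric total / target average)."""
    return math.ceil(sum(per_pod_values) / target_average)

# 3 pods serving 150, 180, and 170 req/s against a 100 req/s target -> 5 replicas
print(desired_from_pods_metric([150, 180, 170], 100))  # 5
```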
Configure custom metrics HPA:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: custom-metrics-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "100"
```

### Solution 6: Fix Scaling Behavior Issues
HPA has default stabilization windows that delay scaling:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # Wait 5 min before scaling down
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0    # No delay for scaling up
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
      selectPolicy: Max  # Use the policy that scales most
```

### Solution 7: Check for Deployment Issues
The target deployment might have issues:
```bash
# Check deployment status
kubectl get deployment deployment-name -n namespace

# Check if deployment can scale
kubectl scale deployment deployment-name -n namespace --replicas=5

# If manual scaling works, HPA should also work
kubectl describe hpa hpa-name -n namespace
```
### Solution 8: Fix Multiple HPA Conflict

Multiple HPAs targeting the same resource can conflict:
```bash
# Check for multiple HPAs
kubectl get hpa -n namespace -o wide | grep deployment-name

# Delete conflicting HPA
kubectl delete hpa conflicting-hpa -n namespace
```
### Solution 9: Verify Metrics Are Being Collected
Check if pods are actually using CPU/memory:
```bash
# Check current pod metrics
kubectl top pods -n namespace

# Check pod resource usage vs requests
kubectl describe hpa hpa-name -n namespace

# Calculate utilization
# Example: CPU 50m used / 100m requested = 50% utilization
# If the target is 70%, HPA won't scale up
```
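The utilization arithmetic from the comments above, as a small sketch:

```python
def cpu_utilization_percent(usage_millicores: float, request_millicores: float) -> float:
    """CPU utilization is measured against the request, not the limit."""
    return 100.0 * usage_millicores / request_millicores

util = cpu_utilization_percent(50, 100)  # 50m used / 100m requested
print(f"{util:.0f}% utilization")        # below a 70% target, so no scale-up
```

This is also why lowering a pod's CPU request makes the same absolute usage look "hotter" to the HPA.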
### Solution 10: Test HPA Manually
Force scaling to verify HPA can control deployment:
```bash
# Check HPA can control deployment
kubectl describe hpa hpa-name -n namespace

# Trigger scaling by generating load
kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://deployment-service:8080; done"

# Watch HPA response
kubectl get hpa hpa-name -n namespace -w
```
## Verification
After fixing HPA issues:
```bash
# Check HPA is receiving metrics
kubectl describe hpa hpa-name -n namespace

# Watch HPA scaling behavior
kubectl get hpa hpa-name -n namespace -w

# Check deployment replicas
kubectl get deployment deployment-name -n namespace

# Verify metrics are available
kubectl top pods -n namespace
```
## HPA Status Messages
| Status | Meaning | Solution |
|---|---|---|
| "metrics not available" | Can't get metrics | Install/fix metrics server |
| "missing request for cpu" | Pod has no CPU request | Add resource requests |
| "the HPA was unable to compute the replica count" | Metrics calculation failed | Check metrics API |
| "Desired replica count is less than minReplicas" | Below minimum | Increase load or lower min |
| "Desired replica count is greater than maxReplicas" | Above maximum | Increase maxReplicas |
| "New replica count matched current" | No scaling needed | Normal if metrics within target |
## HPA Not Scaling Causes Summary
| Cause | Check Command | Solution |
|---|---|---|
| Metrics server missing | `kubectl top pods` | Install metrics server |
| No resource requests | `kubectl get deploy -o yaml` | Add CPU/memory requests |
| minReplicas too high | `kubectl describe hpa` | Lower minReplicas |
| maxReplicas too low | `kubectl describe hpa` | Increase maxReplicas |
| Metrics API unhealthy | `kubectl get apiservices` | Fix metrics server |
| Custom metrics missing | `kubectl get --raw` | Install prometheus-adapter |
| Stabilization window | `kubectl describe hpa` | Adjust behavior config |
| Multiple HPA conflict | `kubectl get hpa` | Delete duplicate HPA |
## Prevention Best Practices

- Always set resource requests for workloads using HPA.
- Install the metrics server before creating an HPA.
- Set appropriate min/max replica bounds.
- Use custom metrics for application-specific scaling.
- Configure scaling behavior for smooth scale-up/down.
- Monitor HPA status regularly.
- Test scaling behavior under realistic load.
HPA not scaling usually comes down to missing metrics or missing resource requests. The `kubectl describe hpa` command tells you exactly why HPA isn't scaling; read the events and conditions carefully to find the root cause.