Your Horizontal Pod Autoscaler (HPA) is configured, but pods aren't scaling. CPU or memory usage is high, you set the thresholds, yet the replica count stays static. HPA is designed to automatically scale your workload based on demand, but several issues can prevent it from working correctly.

Understanding HPA Scaling

HPA monitors metrics (CPU, memory, or custom metrics) and adjusts replica count to meet targets. It reads metrics from the Metrics Server or custom metrics APIs, calculates desired replicas based on current metrics versus target, and instructs the deployment/replicaset controller to scale.
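The replica calculation itself is simple: desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue), clamped to the configured bounds. A minimal shell sketch of that arithmetic (the replica counts and utilization numbers here are illustrative, not read from a cluster):

```shell
#!/bin/sh
# Sketch of the HPA algorithm:
#   desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
# then clamped to [minReplicas, maxReplicas].
current_replicas=4
current_cpu=90      # average utilization across pods, in percent
target_cpu=50       # averageUtilization target from the HPA spec
min_replicas=2
max_replicas=10

# Ceiling division via awk
desired=$(awk -v r="$current_replicas" -v c="$current_cpu" -v t="$target_cpu" \
  'BEGIN { d = r * c / t; i = int(d); if (d > i) i++; print i }')

# Clamp to the configured bounds
if [ "$desired" -lt "$min_replicas" ]; then desired=$min_replicas; fi
if [ "$desired" -gt "$max_replicas" ]; then desired=$max_replicas; fi

echo "desired replicas: $desired"   # 4 * 90/50 = 7.2 -> ceil -> 8
```

This is why a workload sitting exactly at its target never moves: the ratio is 1.0 and the desired count equals the current count.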

HPA won't scale if: the metrics server isn't working, resource requests aren't set (CPU utilization metrics need requests), scaling is blocked by min/max replica limits, or cooldown periods are preventing rapid scaling.

Diagnosis Commands

Start by checking HPA status:

```bash
# Check HPA status
kubectl get hpa hpa-name -n namespace

# Get detailed HPA information
kubectl describe hpa hpa-name -n namespace

# Check HPA metrics status
kubectl get hpa hpa-name -n namespace -o yaml | grep -A 20 metrics
```

Check metrics availability:

```bash
# Check if the metrics server is running
kubectl get pods -n kube-system -l k8s-app=metrics-server

# Check the metrics server deployment
kubectl get deployment metrics-server -n kube-system

# Verify metrics are available
kubectl top pods -n namespace
kubectl top nodes
```

Check deployment configuration:

```bash
# Check the deployment has resource requests
kubectl get deployment deployment-name -n namespace -o yaml | grep -A 10 resources

# Check current replicas
kubectl get deployment deployment-name -n namespace
```

Common Solutions

Solution 1: Fix Missing Metrics Server

HPA needs the metrics server to get CPU/memory metrics:

```bash
# Check if the metrics server exists
kubectl get pods -n kube-system -l k8s-app=metrics-server

# If missing, install the metrics server
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# For Kubernetes < 1.22, you may need an older release:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.5.0/components.yaml
```

Verify metrics server is working:

```bash
# Check the metrics server logs
kubectl logs -n kube-system deployment/metrics-server

# Test the metrics API
kubectl top pods -n namespace

# If you see "Metrics API not available", check the API registration
kubectl get apiservices v1beta1.metrics.k8s.io
kubectl describe apiservices v1beta1.metrics.k8s.io
```

Solution 2: Fix Missing Resource Requests

HPA CPU scaling requires resource requests to be set:

```bash
# Check pod resource requests
kubectl get deployment deployment-name -n namespace -o jsonpath='{.spec.template.spec.containers[*].resources.requests}'
```

Add resource requests:

```yaml
spec:
  template:
    spec:
      containers:
      - name: app
        resources:
          requests:
            cpu: "100m"    # Required for HPA CPU scaling
            memory: "128Mi"
          limits:
            cpu: "500m"
            memory: "512Mi"
```

Apply update:

```bash
kubectl apply -f deployment.yaml

# Verify HPA can now see metrics
kubectl describe hpa hpa-name -n namespace
```

Solution 3: Fix Min/Max Replica Limits

HPA won't scale beyond its configured bounds:

```bash
# Check HPA limits (-E enables the | alternation in the pattern)
kubectl get hpa hpa-name -n namespace -o yaml | grep -E -A 5 "minReplicas|maxReplicas"

# Check current replicas vs limits
kubectl describe hpa hpa-name -n namespace
```

Adjust replica limits:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  minReplicas: 2
  maxReplicas: 20  # Increase max if needed
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```

Solution 4: Fix Metrics Not Available Error

If HPA shows "metrics not available":

```bash
# Check HPA events
kubectl describe hpa hpa-name -n namespace | grep -A 10 Events

# Common causes:
# - Metrics server not installed
# - Metrics server not healthy
# - Pod has no resource requests
```

Check metrics server health:

```bash
# Check the metrics server pod
kubectl describe pod metrics-server-xxx -n kube-system

# Check the metrics server service
kubectl get svc metrics-server -n kube-system

# Verify the API service
kubectl get apiservices | grep metrics
kubectl describe apiservice v1beta1.metrics.k8s.io
```

Fix metrics server issues:

```yaml
# The metrics server may need insecure kubelet TLS on some clusters
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: metrics-server
        args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls  # Add if kubelet certs aren't trusted
```

Solution 5: Fix Custom Metrics Issues

For custom metrics HPA, verify the custom metrics API:

```bash
# Check the custom metrics API
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .

# Check a specific metric
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/metrics/http_requests" | jq .
```

Install Prometheus adapter for custom metrics:

```bash
# Install prometheus-adapter
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/prometheus-adapter/main/deploy/manifests.yaml

# Or use Helm (the deprecated "stable" repo no longer hosts this chart)
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus-adapter prometheus-community/prometheus-adapter -n monitoring
```
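For the adapter to serve a metric such as http_requests_per_second, it needs a discovery rule mapping a Prometheus series onto the custom metrics API. A minimal sketch of such a rule, assuming your app exports a counter named http_requests_total with namespace and pod labels (the series and label names are assumptions, not taken from this article):

```yaml
# Hypothetical prometheus-adapter rule: exposes the 2m rate of
# http_requests_total as http_requests_per_second, per pod.
rules:
- seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
  resources:
    overrides:
      namespace: {resource: "namespace"}
      pod: {resource: "pod"}
  name:
    matches: "^http_requests_total$"
    as: "http_requests_per_second"
  metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
```

The rule goes in the adapter's config (typically a ConfigMap mounted into the deployment); restart the adapter after changing it.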

Configure custom metrics HPA:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: custom-metrics-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "100"
```

Solution 6: Fix Scaling Behavior Issues

HPA has default stabilization windows that delay scaling:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # Wait 5 min before scaling down
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0  # No delay for scaling up
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
      selectPolicy: Max  # Use the policy that scales most
```
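To see how the two scaleUp policies above combine, here is a small shell sketch of the selection logic with selectPolicy: Max (the current replica count is illustrative, not read from a live cluster):

```shell
#!/bin/sh
# With selectPolicy: Max, HPA takes whichever policy permits the
# larger change within its period.
current=10

# Percent policy: up to 100% more pods per 15s period
percent_limit=$((current * 100 / 100))
# Pods policy: up to 4 more pods per 15s period
pods_limit=4

if [ "$percent_limit" -gt "$pods_limit" ]; then
  allowed=$percent_limit
else
  allowed=$pods_limit
fi

echo "max pods addable this period: $allowed"  # max(10, 4) = 10
```

At low replica counts the Pods policy dominates (4 > 100% of 2 or 3 pods), which is exactly why HPA pairs the two: the fixed-count policy gives small deployments a reasonable step size.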

Solution 7: Check for Deployment Issues

The target deployment might have issues:

```bash
# Check deployment status
kubectl get deployment deployment-name -n namespace

# Check if the deployment can scale manually
kubectl scale deployment deployment-name -n namespace --replicas=5

# If manual scaling works, HPA should also work
kubectl describe hpa hpa-name -n namespace
```

Solution 8: Fix Multiple HPA Conflict

Multiple HPAs targeting the same resource can conflict:

```bash
# Check for multiple HPAs
kubectl get hpa -n namespace -o wide | grep deployment-name

# Delete the conflicting HPA
kubectl delete hpa conflicting-hpa -n namespace
```

Solution 9: Verify Metrics Are Being Collected

Check if pods are actually using CPU/memory:

```bash
# Check current pod metrics
kubectl top pods -n namespace

# Check pod resource usage vs requests
kubectl describe hpa hpa-name -n namespace

# Calculate utilization
# Example: CPU 50m used / 100m requested = 50% utilization
# If the target is 70%, HPA won't scale up
```
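That utilization arithmetic can be checked with a short shell sketch (the usage, request, and target values below are illustrative):

```shell
#!/bin/sh
# Utilization = usage / request * 100. HPA compares the average
# across pods against the target to decide scaling direction.
usage_m=50     # current CPU usage in millicores (from kubectl top)
request_m=100  # CPU request in millicores (from the pod spec)
target_pct=70  # averageUtilization target in the HPA spec

util=$((usage_m * 100 / request_m))
echo "utilization: ${util}%"
if [ "$util" -le "$target_pct" ]; then
  echo "at or below target: no scale-up"
else
  echo "above target: scale up"
fi
```

Note that limits play no role here: only requests enter the utilization denominator, so a pod with generous limits but tiny requests can show very high utilization.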

Solution 10: Test HPA Manually

Force scaling to verify HPA can control deployment:

```bash
# Check HPA can control the deployment
kubectl describe hpa hpa-name -n namespace

# Trigger scaling by generating load
kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://deployment-service:8080; done"

# Watch the HPA response
kubectl get hpa hpa-name -n namespace -w
```

Verification

After fixing HPA issues:

```bash
# Check HPA is receiving metrics
kubectl describe hpa hpa-name -n namespace

# Watch HPA scaling behavior
kubectl get hpa hpa-name -n namespace -w

# Check deployment replicas
kubectl get deployment deployment-name -n namespace

# Verify metrics are available
kubectl top pods -n namespace
```

HPA Status Messages

| Status | Meaning | Solution |
|---|---|---|
| "metrics not available" | Can't get metrics | Install/fix metrics server |
| "missing request for cpu" | Pod has no CPU request | Add resource requests |
| "the HPA was unable to compute the replica count" | Metrics calculation failed | Check metrics API |
| "Desired replica count is less than minReplicas" | Below minimum | Increase load or lower min |
| "Desired replica count is greater than maxReplicas" | Above maximum | Increase maxReplicas |
| "New replica count matched current" | No scaling needed | Normal if metrics within target |

HPA Not Scaling Causes Summary

| Cause | Check Command | Solution |
|---|---|---|
| Metrics server missing | `kubectl top pods` | Install metrics server |
| No resource requests | `kubectl get deploy -o yaml` | Add CPU/memory requests |
| minReplicas too high | `kubectl describe hpa` | Lower minReplicas |
| maxReplicas too low | `kubectl describe hpa` | Increase maxReplicas |
| Metrics API unhealthy | `kubectl get apiservices` | Fix metrics server |
| Custom metrics missing | `kubectl get --raw` | Install prometheus-adapter |
| Stabilization window | `kubectl describe hpa` | Adjust behavior config |
| Multiple HPA conflict | `kubectl get hpa` | Delete duplicate HPA |

Prevention Best Practices

- Always set resource requests for workloads using HPA.
- Install the metrics server before creating an HPA.
- Set appropriate min/max replica bounds.
- Use custom metrics for application-specific scaling.
- Configure scaling behavior for smooth scale-up/down.
- Monitor HPA status regularly.
- Test scaling behavior under realistic load.

HPA not scaling usually comes down to missing metrics or missing resource requests. The `kubectl describe hpa` command tells you exactly why HPA isn't scaling: read the events and conditions carefully to find the root cause.