The Problem
Prometheus logs show sample limit exceeded errors, and metrics are being dropped:
```
level=warn ts=2026-04-04T06:45:22.123Z caller=scrape.go:1456 component="scrape manager" scrape_pool=kubernetes-pods target=http://10.0.0.5:8080/metrics msg="Scrape failed" err="sample_limit exceeded (10000 > 5000)"
```

You might also see in the UI:

```
Error: sample_limit (5000) exceeded
```

This occurs when a single scrape returns more samples than the configured limit, a safeguard that protects Prometheus from cardinality explosions.
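The limit in the error (5000 here) comes from a `sample_limit` set on the scrape job. As an illustrative fragment (job name assumed to match the error above):

```yaml
# prometheus.yml (illustrative)
scrape_configs:
  - job_name: kubernetes-pods
    sample_limit: 5000   # the limit the failing scrape is hitting
```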
Diagnosis
Check Current Sample Counts
```promql
# Samples scraped per target
scrape_samples_scraped

# Targets exceeding sample limits
scrape_samples_scraped > 5000

# Top targets by sample count
topk(10, scrape_samples_scraped)

# Samples per job
sum by (job) (scrape_samples_scraped)
```
Identify High Cardinality Metrics
```promql
# Count series per metric name
count by (__name__) ({__name__=~".+"})

# Top 20 highest-cardinality metrics
topk(20, count by (__name__) ({__name__=~".+"}))

# Cardinality growth over the last hour (note the subquery syntax [1h:])
delta(count by (__name__) ({__name__=~".+"})[1h:])
```
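Instead of PromQL, head-block cardinality can also be read from the `/api/v1/status/tsdb` endpoint (available since Prometheus 2.15). A minimal parsing sketch; the JSON below is a hand-written, abridged example of the documented response shape, not real data:

```python
import json

# Abridged example of the shape returned by GET /api/v1/status/tsdb.
# In practice you would fetch this from the Prometheus HTTP API.
payload = json.loads("""
{
  "status": "success",
  "data": {
    "seriesCountByMetricName": [
      {"name": "http_request_duration_seconds_bucket", "value": 85000},
      {"name": "http_requests_total", "value": 12000}
    ]
  }
}
""")

# Print the highest-cardinality metric names first
for entry in payload["data"]["seriesCountByMetricName"]:
    print(f'{entry["name"]}: {entry["value"]} series')
```

This is handy when the `{__name__=~".+"}` queries above are themselves too expensive to run.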
Find Problem Labels
```promql
# Number of distinct values for a given label (here: pod)
count(count by (pod) ({__name__=~".+"}))

# Top series counts per (pod, container) pair
topk(10, count by (pod, container) ({__name__=~".+"}))
```
Solutions
1. Increase Sample Limit
Quick fix for legitimate high-cardinality targets:
```yaml
# prometheus.yml
scrape_configs:
  - job_name: 'high-cardinality-app'
    sample_limit: 50000  # raise the limit for this job; 0 disables it entirely (not recommended)
    static_configs:
      - targets: ['app:9090']
```

Apply and reload:
```bash
# Reload the config (requires --web.enable-lifecycle)
curl -X POST http://localhost:9090/-/reload

# Or restart
systemctl restart prometheus
```
2. Reduce Metric Cardinality
Drop unnecessary labels at scrape time:
```yaml
scrape_configs:
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # __meta_* labels exist only at target-relabel time (they are dropped
      # before metric relabeling), so copy them here, not below
      - source_labels: [__meta_kubernetes_pod_label_version]
        target_label: version
        action: replace
    metric_relabel_configs:
      # Drop high-cardinality labels
      - action: labeldrop
        regex: '(pod_template_hash|deployment_kubernetes_io|pod_template_generation)'

      # Keep only the metrics you need
      - action: keep
        source_labels: [__name__]
        regex: '(http_requests_total|http_request_duration_seconds|process_.+)'
```
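As a rough mental model, a `labeldrop` rule removes any label whose name fully matches the regex (Prometheus anchors relabel regexes). A plain-Python sketch, not Prometheus code:

```python
import re

def labeldrop(labels: dict, regex: str) -> dict:
    """Remove labels whose NAME fully matches the regex (values untouched)."""
    pattern = re.compile(regex)
    return {k: v for k, v in labels.items() if not pattern.fullmatch(k)}

series = {"__name__": "http_requests_total", "pod": "web-1",
          "pod_template_hash": "abc123"}
cleaned = labeldrop(series, "pod_template_hash|pod_template_generation")
print(cleaned)  # pod_template_hash is gone; __name__ and pod remain
```

Note that dropping a label can merge previously distinct series, which is exactly how it reduces cardinality.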
3. Drop Unwanted Metrics
Exclude metrics you don't need:
```yaml
scrape_configs:
  - job_name: 'myapp'
    metric_relabel_configs:
      # Drop all metrics with an unwanted prefix
      - action: drop
        source_labels: [__name__]
        regex: 'unwanted_metric_.+'

      # Drop specific high-cardinality metrics
      - action: drop
        source_labels: [__name__]
        regex: '(http_request_duration_seconds_bucket|grpc_server_handled_total)'
```
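For intuition, the `drop` action discards a whole series when the source label values (here just `__name__`) fully match the regex. An illustrative Python sketch, not Prometheus internals:

```python
import re

def keep_series(labels: dict, drop_regex: str) -> bool:
    """Return False when the series would be dropped by the rule."""
    return re.fullmatch(drop_regex, labels.get("__name__", "")) is None

series = [
    {"__name__": "unwanted_metric_foo"},
    {"__name__": "http_requests_total"},
]
kept = [s for s in series if keep_series(s, "unwanted_metric_.+")]
print(kept)  # only http_requests_total survives
```

Unlike `labeldrop` (which matches label names), `drop` matches label values and removes the entire sample.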
4. Aggregate High Cardinality Metrics
Use recording rules to pre-aggregate:
```yaml
# recording_rules.yml
groups:
  - name: cardinality_reduction
    interval: 30s
    rules:
      # Aggregate away high-cardinality labels
      - record: http_requests_total:by_method
        expr: sum without (pod, container, endpoint) (http_requests_total)

      # Bucket aggregation
      - record: http_request_duration_seconds:bucket:by_service
        expr: sum without (pod, instance) (http_request_duration_seconds_bucket)
```
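What `sum without (...)` does to the series set can be sketched as follows (illustrative Python; the sample values are made up):

```python
from collections import defaultdict

def sum_without(series, dropped):
    """Series that differ only in the dropped labels collapse into one sum."""
    totals = defaultdict(float)
    for labels, value in series:
        key = tuple(sorted((k, v) for k, v in labels.items() if k not in dropped))
        totals[key] += value
    return dict(totals)

samples = [
    ({"method": "GET", "pod": "web-1"}, 5.0),
    ({"method": "GET", "pod": "web-2"}, 3.0),
    ({"method": "POST", "pod": "web-1"}, 2.0),
]
result = sum_without(samples, {"pod", "container", "endpoint"})
print(result)  # two output series: one per method
```

Three input series become two output series; queries then hit the cheap recorded series instead of the raw high-cardinality ones.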
5. Configure Label Limits
Limit labels per sample:
```yaml
scrape_configs:
  - job_name: 'myapp'
    label_limit: 20
    label_name_length_limit: 64
    label_value_length_limit: 128
    static_configs:
      - targets: ['app:9090']
```

6. Use Histogram Buckets Wisely
Reduce histogram cardinality:
```yaml
# In your application's metrics configuration
histogramBuckets: [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10]
# Instead of:
# histogramBuckets: [0.001, 0.002, 0.003, ... many buckets]
```

Each bucket adds one series per label combination: even the eleven buckets above yield 14 series (11 buckets + `+Inf` + `_sum` + `_count`) for every label set.

Verification
Monitor sample counts after changes:
```promql
# Should be below the limit
scrape_samples_scraped{job="myapp"} < 50000

# Scrapes rejected for exceeding the sample limit
rate(prometheus_target_scrapes_exceeded_sample_limit_total[5m])
```

Verify no limit errors remain in the logs:

```bash
journalctl -u prometheus --since "1 hour ago" | grep "sample_limit"
```
Prevention
Add alerts for cardinality issues:
```yaml
groups:
  - name: cardinality_alerts
    rules:
      - alert: HighSampleCount
        expr: scrape_samples_scraped > 20000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Target {{ $labels.instance }} exposing many samples"
          description: "Target is exposing {{ $value }} samples, consider reducing cardinality"

      - alert: SampleLimitApproaching
        expr: scrape_samples_scraped / 50000 > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Sample limit approaching for {{ $labels.instance }}"

      - alert: HighCardinalityMetric
        expr: count by (__name__) ({__name__=~".+"}) > 10000
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Metric {{ $labels.__name__ }} has high cardinality"
```
Monitor cardinality growth:
```promql
# Cardinality growth over the last hour (subquery syntax)
delta(count by (__name__) ({__name__=~".+"})[1h:]) > 1000

# Total series count across all targets
sum(scrape_samples_scraped)
```