The Problem

Prometheus logs show sample limit exceeded errors, and metrics are being dropped:

```bash
level=warn ts=2026-04-04T06:45:22.123Z caller=scrape.go:1456 component="scrape manager" scrape_pool=kubernetes-pods target=http://10.0.0.5:8080/metrics msg="Scrape failed" err="sample_limit exceeded (10000 > 5000)"
```

You might also see in the UI:

```bash
Error: sample_limit (5000) exceeded
```

This occurs when a single scrape returns more samples than the configured limit. The limit protects Prometheus from cardinality explosions, but the cost is steep: when it is exceeded, the entire scrape is discarded, so every sample from that target is lost.
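Before digging into PromQL, you can gauge how many samples a target exposes straight from its `/metrics` endpoint: every non-comment line of the exposition format is one sample. A minimal sketch, using a `printf` payload to stand in for `curl -s http://10.0.0.5:8080/metrics` (the target from the log above):

```shell
# Each non-comment line (# HELP / # TYPE are comments) is one sample;
# in practice pipe `curl -s http://<target>/metrics` into the same grep.
printf '%s\n' \
  '# HELP http_requests_total Total requests' \
  '# TYPE http_requests_total counter' \
  'http_requests_total{path="/a"} 5' \
  'http_requests_total{path="/b"} 3' |
  grep -vc '^#'
# → 2
```

If this count is near or above your `sample_limit`, the target will start failing scrapes.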

Diagnosis

Check Current Sample Counts

```promql
# Samples scraped per target
scrape_samples_scraped

# Targets exceeding sample limits
scrape_samples_scraped > 5000

# Top targets by sample count
topk(10, scrape_samples_scraped)

# Samples per job
sum by (job) (scrape_samples_scraped)
```

Identify High Cardinality Metrics

```promql
# Count series per metric name
count by (__name__) ({__name__=~".+"})

# Top 20 highest cardinality metrics
topk(20, count by (__name__) ({__name__=~".+"}))

# Cardinality growth over the last hour (note the [1h:] subquery syntax;
# delta() needs a range vector, and this expression is an instant vector)
delta(count by (__name__) ({__name__=~".+"})[1h:])
```

Find Problem Labels

```promql
# Distinct values of a suspect label (here: pod)
count(count by (pod) ({__name__=~".+"}))

# Top series counts grouped by suspect labels
topk(10, count by (pod, container) ({__name__=~".+"}))
```

Solutions

1. Increase Sample Limit

Quick fix for legitimate high-cardinality targets:

```yaml
# prometheus.yml
scrape_configs:
  - job_name: 'high-cardinality-app'
    sample_limit: 50000  # raise the per-scrape cap (default 0 = unlimited, not recommended)
    static_configs:
      - targets: ['app:9090']
```

Apply and reload:

```bash
# Reload config (requires Prometheus to run with --web.enable-lifecycle)
curl -X POST http://localhost:9090/-/reload

# Or restart
systemctl restart prometheus
```

2. Reduce Metric Cardinality

Drop unnecessary labels at scrape time:

```yaml
scrape_configs:
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # __meta_* labels exist only at target-relabeling time, so map
      # them to stable labels here, not in metric_relabel_configs
      - source_labels: [__meta_kubernetes_pod_label_version]
        target_label: version
        action: replace
    metric_relabel_configs:
      # Drop high-cardinality labels
      - action: labeldrop
        regex: '(pod_template_hash|deployment_kubernetes_io|pod_template_generation)'
      # Keep only needed metrics
      - action: keep
        source_labels: [__name__]
        regex: '(http_requests_total|http_request_duration_seconds|process_.+)'
```
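Once the relabeling is live, a quick sanity check is to query for the dropped labels on newly ingested series (the label name here matches the `labeldrop` rule; substitute your own):

```promql
# Should return no results once relabeled samples arrive
count({pod_template_hash!=""})
```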

3. Drop Unwanted Metrics

Exclude metrics you don't need:

```yaml
scrape_configs:
  - job_name: 'myapp'
    metric_relabel_configs:
      # Drop all metrics starting with an unwanted prefix
      - action: drop
        source_labels: [__name__]
        regex: 'unwanted_metric_.+'
      # Drop specific high-cardinality metrics
      - action: drop
        source_labels: [__name__]
        regex: '(http_request_duration_seconds_bucket|grpc_server_handled_total)'
```

4. Aggregate High Cardinality Metrics

Use recording rules to pre-aggregate:

```yaml
# recording_rules.yml
groups:
  - name: cardinality_reduction
    interval: 30s
    rules:
      # Aggregate away high-cardinality labels
      - record: http_requests_total:by_method
        expr: sum without (pod, container, endpoint) (http_requests_total)

      # Bucket aggregation
      - record: http_request_duration_seconds:bucket:by_service
        expr: sum without (pod, instance) (http_request_duration_seconds_bucket)
```
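Dashboards and alerts can then query the pre-aggregated series instead of the raw metric, for example:

```promql
# Same data, far fewer series to scan
topk(10, http_requests_total:by_method)
```

Note that recording rules alone do not reduce scrape samples; to actually lower ingestion, pair them with a `drop` relabel rule on the raw metric once nothing else queries it.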

5. Configure Label Limits

Limit labels per sample:

```yaml
scrape_configs:
  - job_name: 'myapp'
    label_limit: 20
    label_name_length_limit: 64
    label_value_length_limit: 128
    static_configs:
      - targets: ['app:9090']
```

6. Use Histogram Buckets Wisely

Reduce histogram cardinality:

```yaml
# In your application's metrics configuration
histogramBuckets: [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10]
# Instead of many fine-grained buckets:
# histogramBuckets: [0.001, 0.002, 0.003, ...]
```
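Each bucket adds one series per label combination, so histograms multiply cardinality quickly. This query shows how many series each histogram's buckets consume (it matches any series carrying an `le` label):

```promql
# Bucket series per histogram metric
count by (__name__) ({le!=""})
```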

Verification

Monitor sample counts after changes:

```promql
# Should be below limit
scrape_samples_scraped{job="myapp"} < 50000

# Scrapes rejected for exceeding the sample limit (should be 0)
rate(prometheus_target_scrapes_exceeded_sample_limit_total[5m])
```

Also confirm no limit errors remain in the logs: `journalctl -u prometheus --since "1 hour ago" | grep "sample_limit"`

Prevention

Add alerts for cardinality issues:

```yaml
groups:
  - name: cardinality_alerts
    rules:
      - alert: HighSampleCount
        expr: scrape_samples_scraped > 20000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Target {{ $labels.instance }} exposing many samples"
          description: "Target is exposing {{ $value }} samples, consider reducing cardinality"

      - alert: SampleLimitApproaching
        expr: scrape_samples_scraped / 50000 > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Sample limit approaching for {{ $labels.instance }}"

      - alert: HighCardinalityMetric
        expr: count by (__name__)({__name__=~".+"}) > 10000
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Metric {{ $labels.__name__ }} has high cardinality"
```

Monitor cardinality growth:

```promql
# Cardinality growth over the last hour (subquery syntax)
delta(count by (__name__) ({__name__=~".+"})[1h:]) > 1000

# Approximate total active series (samples ingested per scrape interval)
sum(scrape_samples_scraped)
```
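Prometheus also reports its series count directly, which is far cheaper to evaluate than counting over every series with a regex matcher:

```promql
# Number of active series in the TSDB head block
prometheus_tsdb_head_series
```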