## Introduction
Prometheus high-cardinality errors occur when the number of unique time series grows beyond manageable limits, causing memory exhaustion, slow queries, compaction failures, and potentially Prometheus server crashes. Cardinality in Prometheus is the number of unique label combinations (time series) for a metric.

High cardinality typically manifests as "out of bounds" errors, "sample limit exceeded" warnings, compaction taking too long, failures persisting TSDB head chunks, remote write backlog growth, and OOM kills. Common causes include high-cardinality labels (user IDs, email addresses, IP addresses, request IDs, pod names with unique identifiers), unbounded label values (timestamps, random strings, UUIDs), metric joins that multiply series, service discovery creating ephemeral labels, container or instance IDs used as labels, query parameters exported as labels, and client libraries building dynamic labels from request attributes.

The fix requires identifying high-cardinality metrics, implementing cardinality limits, relabeling to drop expensive labels, aggregating metrics at the source, and configuring appropriate resource limits. This guide provides production-proven strategies for managing Prometheus cardinality across single-server and federated deployments.
## Symptoms
- Prometheus logs show "cardinality limit exceeded" or "series limit exceeded"
- Memory usage grows continuously until OOM kill
- TSDB compaction takes excessively long (>30 minutes)
- Remote write backlog grows unbounded
- Queries timeout or return "query execution cancelled"
- `prometheus_tsdb_head_series` metric shows rapid growth
- `prometheus_tsdb_storage_blocks_bytes` grows faster than expected
- Scrape targets show "context deadline exceeded"
- `prometheus_target_scrapes_exceeded_sample_limit_total` increases
- Prometheus becomes unresponsive during compaction
- `rate()` and `histogram_quantile()` queries are extremely slow
## Common Causes
- High-cardinality labels (user_id, email, ip_address, session_id)
- Unbounded label values (UUIDs, timestamps, random strings)
- Container ID or pod UID as label without aggregation
- Request ID or trace ID as metric label
- Query parameters exported as labels
- Joining metrics causing series explosion (many-to-many)
- Service discovery adding dynamic labels per scrape
- Client library auto-instrumentation capturing request details
- Histogram buckets too granular for label combinations
- Summary quantiles with many label combinations
- Metric relabeling creating new high-cardinality labels
- Federation aggregating without label deduplication
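The explosion is multiplicative: a metric's series count is the product of the distinct values of each label. A quick sketch (the label-value counts below are illustrative assumptions, not measurements):

```python
from math import prod

# Series count for a metric is the product of the number of
# distinct values per label. Counts here are made-up examples.
def series_count(label_value_counts: dict[str, int]) -> int:
    return prod(label_value_counts.values())

bounded = {"method": 8, "status": 10, "endpoint": 50}
print(series_count(bounded))  # 4000

# Add one unbounded label (e.g. user_id) and the count multiplies
unbounded = dict(bounded, user_id=100_000)
print(series_count(unbounded))  # 400000000
```

One unbounded label turns a harmless 4,000-series metric into 400 million potential series, which is why the fixes below focus on labels rather than metrics.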
## Step-by-Step Fix
### 1. Diagnose high cardinality metrics
Check current cardinality:
```bash
# Connect to the Prometheus API
PROMETHEUS_URL="http://localhost:9090"

# Top metrics by series count (TSDB status endpoint)
curl -s "$PROMETHEUS_URL/api/v1/status/tsdb" | \
  jq '.data.seriesCountByMetricName'

# Labels with the most unique values
curl -s "$PROMETHEUS_URL/api/v1/status/tsdb" | \
  jq '.data.labelValueCountByLabelName'

# Or use promtool for local analysis
promtool tsdb analyze /path/to/prometheus/data

# Check series count per metric (-g disables curl URL globbing for match[])
curl -s "$PROMETHEUS_URL/api/v1/label/__name__/values" | jq -r '.data[]' | \
  while read -r metric; do
    count=$(curl -sg "$PROMETHEUS_URL/api/v1/series?match[]=$metric" | jq '.data | length')
    echo "$count $metric"
  done | sort -rn | head -20

# Query series count directly (expensive on large datasets)
# Run in the Prometheus UI or with Thanos Query:
#   count by (__name__) ({__name__=~".+"})
```
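When the endpoint output gets long, it can help to post-process a saved response. A minimal sketch that ranks metrics from a captured `/api/v1/status/tsdb` payload (field names match the real API; the values in the sample are made up):

```python
import json

# Offline analysis of a saved TSDB status response, e.g.:
#   curl -s "$PROMETHEUS_URL/api/v1/status/tsdb" > tsdb_status.json
# Sample payload below mirrors the endpoint's shape; values are fake.
SAMPLE = json.loads("""
{
  "status": "success",
  "data": {
    "seriesCountByMetricName": [
      {"name": "node_cpu_seconds_total", "value": 4000},
      {"name": "http_requests_total", "value": 120000},
      {"name": "up", "value": 350}
    ]
  }
}
""")

def top_metrics(status: dict, n: int = 10) -> list[tuple[str, int]]:
    """Return the n metrics with the highest series counts."""
    entries = status["data"]["seriesCountByMetricName"]
    ranked = sorted(entries, key=lambda e: e["value"], reverse=True)
    return [(e["name"], e["value"]) for e in ranked[:n]]

print(top_metrics(SAMPLE, 2))
# [('http_requests_total', 120000), ('node_cpu_seconds_total', 4000)]
```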
Check memory usage:
```bash
# Prometheus memory-related metrics
curl -sG "$PROMETHEUS_URL/api/v1/query" --data-urlencode 'query=prometheus_tsdb_head_series' | jq
curl -sG "$PROMETHEUS_URL/api/v1/query" --data-urlencode 'query=prometheus_tsdb_head_chunks' | jq
curl -sG "$PROMETHEUS_URL/api/v1/query" --data-urlencode 'query=prometheus_tsdb_head_chunks_storage_size_bytes' | jq
curl -sG "$PROMETHEUS_URL/api/v1/query" --data-urlencode 'query=go_memstats_alloc_bytes{job="prometheus"}' | jq

# Memory per series (rough rule of thumb): on the order of a few
# kilobytes of head memory per active series, i.e. roughly 1-3 GB
# per million series depending on label sizes and churn

# Check Prometheus process memory (RSS in MB, then command)
ps aux | grep '[p]rometheus' | awk '{print $6/1024, $11}'

# Or with systemd
systemctl status prometheus | grep Memory
```
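To turn the rule of thumb above into capacity numbers, a small helper is enough. The bytes-per-series figure is an assumption (a commonly cited range is roughly 1-3 KiB per active series); validate it against your own server's `process_resident_memory_bytes` before planning around it:

```python
# Order-of-magnitude head-memory estimate from an active-series count.
# bytes_per_series is an assumed rule-of-thumb value, not a constant
# guaranteed by Prometheus - measure your own deployment.

def estimate_head_memory_bytes(active_series: int, bytes_per_series: int = 2048) -> int:
    """Return a rough head memory estimate in bytes."""
    return active_series * bytes_per_series

if __name__ == "__main__":
    for series in (100_000, 1_000_000, 5_000_000):
        gib = estimate_head_memory_bytes(series) / 2**30
        print(f"{series:>9} series -> about {gib:.2f} GiB")
```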
Identify problematic labels:
```promql
# Find labels with many unique values. Note: label_values() is a
# Grafana templating function, not PromQL - run the queries below
# in the Prometheus UI or via the API instead.

# Number of unique values for a label on a metric
count(count by (method) (http_requests_total))
count(count by (status) (http_requests_total))
count(count by (endpoint) (http_requests_total))

# Total series for one metric
count(http_requests_total)

# Series count per label value (spot which values explode)
count by (endpoint) (http_requests_total)
```
Use promtool for TSDB analysis:
```bash
# Analyze the most recent TSDB block
promtool tsdb analyze /var/lib/prometheus/data

# Analyze a specific block (the block ID is a positional argument)
promtool tsdb analyze /var/lib/prometheus/data <block_id>

# Output includes:
# - highest cardinality labels
# - highest cardinality metric names
# - label names with highest cumulative label value length
# - most common label pairs

# Show more entries per section
promtool tsdb analyze --limit 30 /var/lib/prometheus/data

# Check for label explosion patterns
promtool tsdb analyze /var/lib/prometheus/data | grep -A 20 "Highest cardinality labels"
```
### 2. Configure cardinality limits
Set series limits:
```yaml
# prometheus.yml - global limits (supported in the global block
# since Prometheus 2.45; older versions only support them per job)
global:
  # Maximum samples accepted per scrape (0 = no limit)
  sample_limit: 10000

  # Maximum number of targets per scrape config
  target_limit: 5000

# Per-job limits override the global values
scrape_configs:
  - job_name: 'api-service'
    # Limit samples per scrape for this job
    sample_limit: 5000

    # Maximum number of labels per series
    label_limit: 128

    # Maximum length of a label name
    label_name_length_limit: 64

    # Maximum length of a label value
    label_value_length_limit: 256
```
Configure TSDB limits:
```yaml
# Command-line flags for the prometheus server
--storage.tsdb.retention.time=15d        # Reduce retention if needed
--storage.tsdb.retention.size=50GB       # Size-based retention
--storage.tsdb.min-block-duration=2h     # Smaller blocks = faster compaction
--storage.tsdb.max-block-duration=2h
--storage.tsdb.head-chunks-write-queue-size=10000
--storage.tsdb.max-block-chunk-segment-size=256MB

# Out-of-order ingestion is configured in prometheus.yml, not a flag:
# storage:
#   tsdb:
#     out_of_order_time_window: 5m

# For Thanos deployments, point the sidecar at the same data directory
# thanos sidecar --tsdb.path /var/lib/prometheus/data
```
Configure scrape timeout and interval:
```yaml
scrape_configs:
  - job_name: 'kubernetes-pods'
    # Reduce scrape frequency for high-cardinality targets
    scrape_interval: 60s

    # Timeout must be less than or equal to the interval
    scrape_timeout: 30s

# Split large scrape configs into separate files
# (top-level option, not per-job)
scrape_config_files:
  - /etc/prometheus/jobs/*.yml
```
### 3. Implement metric relabeling
Drop high-cardinality labels:
```yaml
scrape_configs:
  - job_name: 'api-service'
    static_configs:
      - targets: ['api:8080']

    metric_relabel_configs:
      # Drop labels that cause cardinality explosion. labeldrop
      # matches label NAMES against regex; it takes no source_labels.
      - regex: 'user_id'
        action: labeldrop

      # Drop labels matching a pattern
      - regex: 'request_id|session_id'
        action: labeldrop

      # Drop an entire metric when it carries an expensive label
      - source_labels: [__name__, endpoint]
        regex: '(http_requests_total|http_response_time).*;.+'
        action: drop

      # Keep only the labels you need. labelkeep drops everything
      # else - list job and instance or they disappear too.
      - regex: '__name__|job|instance|method|status|le'
        action: labelkeep

      # Replace a high-cardinality label with an aggregated value
      - source_labels: [instance]
        regex: '(.+)-[a-f0-9-]+(.*)'   # Match pod-uuid pattern
        target_label: instance
        replacement: '${1}${2}'        # Removes UUID from instance name
```
Aggregate labels at scrape time:
```yaml
scrape_configs:
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod

    metric_relabel_configs:
      # Aggregate pod names (strip ReplicaSet/pod random suffixes)
      - source_labels: [pod]
        regex: '(.+)-[a-z0-9]+(-[a-z0-9]+)?'
        target_label: pod
        replacement: '${1}'

      # Truncate container IDs to the first 12 characters
      - source_labels: [container_id]
        regex: '(.{12}).*'
        target_label: container_id
        replacement: '${1}'

      # Drop ephemeral labels
      - regex: 'revision|pod_template_hash|controller_revision_hash'
        action: labeldrop

      # Normalize endpoint paths: /api/v1/users/123 -> /api/v1/users/{id}
      - source_labels: [endpoint]
        regex: '/api/v1/users/[^/]+'
        target_label: endpoint
        replacement: '/api/v1/users/{id}'
```
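The same path normalization is cheaper if done in the application before the path ever becomes a label value. A minimal sketch (the route patterns are illustrative assumptions; extend them for your own API):

```python
import re

# Collapse unbounded URL segments into placeholders BEFORE using the
# path as a metric label. Patterns below are examples, not a complete
# route table - add one per parameterized route in your service.
_PATTERNS = [
    (re.compile(r"/users/[^/]+"), "/users/{id}"),
    (re.compile(r"/orders/\d+"), "/orders/{id}"),
    (re.compile(
        r"/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
    ), "/{uuid}"),
]

def normalize_path(path: str) -> str:
    """Return a bounded-cardinality version of a request path."""
    for pattern, replacement in _PATTERNS:
        path = pattern.sub(replacement, path)
    return path

print(normalize_path("/api/v1/users/123"))  # /api/v1/users/{id}
print(normalize_path("/orders/42/items"))   # /orders/{id}/items
```

Normalizing in the client keeps the raw series from ever being created, whereas scrape-time relabeling only hides them from storage while the client library still holds one child per raw path.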
Drop entire metrics:
```yaml
scrape_configs:
  - job_name: 'api-service'

    metric_relabel_configs:
      # Drop verbose metrics
      - source_labels: [__name__]
        regex: 'go_.*'
        action: drop     # Drop all Go runtime metrics if not needed

      # Drop histogram buckets you don't use
      - source_labels: [__name__, le]
        regex: 'http_request_duration_seconds_bucket;(0\.001|0\.005)'
        action: drop     # Drop sub-10ms buckets if not analyzed

      # Drop high quantiles from summaries (summary samples carry the
      # base metric name plus a quantile label)
      - source_labels: [__name__, quantile]
        regex: 'http_request_duration_seconds;(0\.99|0\.999)'
        action: drop
```
### 4. Fix application-level cardinality
Instrument code correctly:
```go
// BAD: high cardinality - user_id as label
var httpRequests = prometheus.NewCounterVec(
    prometheus.CounterOpts{
        Name: "http_requests_total",
        Help: "Total HTTP requests",
    },
    []string{"method", "status", "user_id"}, // user_id is unbounded!
)

// GOOD: fixed cardinality - only bounded labels
var httpRequests = prometheus.NewCounterVec(
    prometheus.CounterOpts{
        Name: "http_requests_total",
        Help: "Total HTTP requests",
    },
    []string{"method", "status", "endpoint"}, // endpoint is bounded
)

// For user-specific tracking, use a separate approach:
// - structured logging
// - an external analytics system
// - sampled, pre-aggregated metrics

// BAD: request ID as label
labels["request_id"] = request.ID // UUID - millions of values

// GOOD: put the request ID in logs, not metrics
log.WithField("request_id", request.ID).Info("request completed")

// For per-request detail, use distributed tracing
// (Jaeger, Zipkin, OpenTelemetry), not Prometheus metrics
```
Python application fixes:
```python
import logging

from prometheus_client import Counter

# BAD: dynamic labels taken from request attributes
request_counter = Counter(
    'http_requests_total', 'HTTP Requests',
    ['method', 'status', 'user_agent', 'ip_address'],  # High cardinality!
)

def handle_request(request, response):
    # This creates a new series per user agent and per client IP
    request_counter.labels(
        method=request.method,
        status=response.status,
        user_agent=request.headers.get('User-Agent', 'unknown'),
        ip_address=request.remote_addr,
    ).inc()

# GOOD: bounded labels only
request_counter = Counter(
    'http_requests_total', 'HTTP Requests',
    ['method', 'status', 'service'],
)

def handle_request(request, response):
    request_counter.labels(
        method=request.method,
        status=response.status,
        service='api',
    ).inc()
    # Log high-cardinality data separately
    logging.info(
        "Request from %s with %s",
        request.remote_addr,
        request.headers.get('User-Agent'),
    )
```
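As a defensive layer, the application can also cap how many distinct values a label is ever allowed to take. This `LabelGuard` is an illustrative sketch, not part of `prometheus_client`: once the cap is reached, new values collapse into a single `"other"` bucket.

```python
# Client-side cardinality guard (hypothetical helper, assumed design):
# bound the number of distinct values a label may take; overflow
# values all map to "other" so the series count stays fixed.

class LabelGuard:
    def __init__(self, max_values: int = 100):
        self.max_values = max_values
        self.seen: set[str] = set()

    def bound(self, value: str) -> str:
        """Return value if within budget, else the overflow bucket."""
        if value in self.seen:
            return value
        if len(self.seen) < self.max_values:
            self.seen.add(value)
            return value
        return "other"

endpoint_guard = LabelGuard(max_values=2)
print(endpoint_guard.bound("/health"))  # /health
print(endpoint_guard.bound("/login"))   # /login
print(endpoint_guard.bound("/x9f3a"))   # other
```

Use it at the call site, e.g. `request_counter.labels(endpoint=endpoint_guard.bound(path), ...)`, so a buggy or malicious client generating random paths cannot mint unbounded series.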
Fix histogram bucket explosion:
```go
// BAD: many buckets combined with many labels
var requestDuration = prometheus.NewHistogramVec(
    prometheus.HistogramOpts{
        Name:    "http_request_duration_seconds",
        Help:    "HTTP request duration",
        Buckets: prometheus.DefBuckets,
        // DefBuckets: [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10]
        // 11 buckets + the implicit +Inf bucket + _sum + _count
        // = 14 series per label combination
    },
    []string{"method", "endpoint", "status", "service", "team"}, // 5 labels
)
// Cardinality: 14 series * 10 methods * 100 endpoints * 10 statuses
// * 5 services * 5 teams = 3.5 MILLION series

// GOOD: fewer buckets, fewer labels
var requestDuration = prometheus.NewHistogramVec(
    prometheus.HistogramOpts{
        Name:    "http_request_duration_seconds",
        Help:    "HTTP request duration",
        Buckets: []float64{0.1, 0.5, 1, 2.5, 5, 10}, // 6 buckets
    },
    []string{"method", "status"}, // Only essential labels
)
// Cardinality: (6 buckets + +Inf + _sum + _count) = 9 series
// * 10 methods * 10 statuses = 900 series
```
### 5. Configure remote write and federation
Remote write cardinality control:
```yaml
remote_write:
  - url: "https://cortex.example.com/api/v1/push"

    # Queue configuration
    queue_config:
      capacity: 10000
      max_shards: 50
      max_samples_per_send: 5000
      batch_send_deadline: 5s
      min_backoff: 100ms
      max_backoff: 5s

    # Relabel before sending to reduce cardinality
    write_relabel_configs:
      # Drop high-cardinality metrics
      - source_labels: [__name__]
        regex: 'expensive_metric_.*'
        action: drop

      # Keep only specific metrics
      - source_labels: [__name__]
        regex: 'http_requests_total|node_.*|container_.*'
        action: keep

      # Drop specific labels
      - regex: 'pod_template_hash|controller_revision_hash'
        action: labeldrop

      # Aggregate instance labels
      - source_labels: [instance]
        regex: '(.+)-[a-f0-9-]+'
        target_label: instance
        replacement: '${1}'
```
Federation configuration:
```yaml
# Federating Prometheus - scrape aggregate series from other servers
scrape_configs:
  - job_name: 'federate'
    honor_labels: true   # Preserve labels from the source
    metrics_path: '/federate'
    params:
      'match[]':
        - '{job="api-service"}'
        - '{__name__=~"node_.*"}'
    static_configs:
      - targets:
          - 'prometheus-1:9090'
          - 'prometheus-2:9090'

    # Relabel to prevent cardinality explosion
    metric_relabel_configs:
      # Collapse per-source instance labels
      - source_labels: [instance, prometheus_instance]
        regex: '(.+);(.+)'
        target_label: instance
        replacement: '${2}'
```
### 6. Monitor cardinality
Set up cardinality monitoring:
```yaml
# Alert rules for cardinality monitoring
# /etc/prometheus/rules/cardinality-alerts.yml

groups:
  - name: prometheus-cardinality
    rules:
      - alert: PrometheusHighCardinality
        expr: prometheus_tsdb_head_series > 1000000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Prometheus series count high"
          description: "{{ $value }} series in TSDB head"

      - alert: PrometheusVeryHighCardinality
        expr: prometheus_tsdb_head_series > 5000000
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Prometheus series count critical"
          description: "{{ $value }} series - investigate immediately"

      - alert: PrometheusMetricHighCardinality
        expr: |
          count by (__name__) ({__name__=~".+"}) > 100000
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High cardinality metric {{ $labels.__name__ }}"
          description: "Metric has {{ $value }} series"

      - alert: PrometheusScrapeSampleLimitExceeded
        expr: |
          rate(prometheus_target_scrapes_exceeded_sample_limit_total[5m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Scrape sample limit exceeded"
          description: "Target {{ $labels.job }} exceeding sample limit"

      - alert: PrometheusRemoteWriteBacklog
        expr: |
          prometheus_remote_storage_highest_timestamp_in_seconds -
            prometheus_remote_storage_queue_highest_sent_timestamp_seconds > 300
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Remote write backlog growing"
          description: "Backlog of {{ $value }} seconds"

      - alert: PrometheusMemoryHigh
        expr: |
          process_resident_memory_bytes{job="prometheus"} > 8 * 1024 * 1024 * 1024
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Prometheus memory usage high"
          description: "Using {{ $value | humanize1024 }}B of memory"
```
Cardinality dashboard queries:
```promql
# Series count over time
prometheus_tsdb_head_series

# Samples ingested per second
rate(prometheus_tsdb_head_samples_appended_total[5m])

# Memory usage
process_resident_memory_bytes{job="prometheus"}

# Compaction duration
prometheus_tsdb_compaction_duration_seconds

# Top 10 metrics by series count
topk(10, count by (__name__) ({__name__=~".+"}))

# Unique values of one label on a metric (that label's cardinality)
count(count by (endpoint) (http_requests_total))

# Series growth rate
deriv(prometheus_tsdb_head_series[1h])
```
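The last query, `deriv()`, is a per-second least-squares slope. The same calculation is easy to reproduce offline against exported samples, e.g. to flag sustained growth in a report script (the sample data below is made up):

```python
# Least-squares slope of series count over time - the same idea as
# PromQL's deriv() - converted to series per hour.

def series_growth_per_hour(samples: list[tuple[float, float]]) -> float:
    """samples: (unix_seconds, series_count) pairs; returns series/hour."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    return (num / den) * 3600  # per-second slope -> per-hour

# One hour of samples growing by 100k series
samples = [(0, 1_000_000), (1800, 1_050_000), (3600, 1_100_000)]
print(round(series_growth_per_hour(samples)))  # 100000
```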
## Prevention
- Design metrics with bounded labels only (no user IDs, emails, IPs)
- Implement cardinality limits in CI/CD before deploying new metrics
- Use metric relabeling to drop unnecessary labels at scrape time
- Aggregate high-cardinality data in application before export
- Monitor series count and set up alerting at 50%, 75%, 90% of limit
- Document cardinality budget per service (e.g., 1000 series per service)
- Use exemplars for tracing instead of labels for request IDs
- Implement metric naming and labeling standards
- Review new metrics in code review with cardinality checklist
- Use Prometheus cardinality profiler in staging environment
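The CI/CD check from the list above can start very small. This is a hypothetical lint sketch, not a standard tool: it scans Go source for `CounterVec`-style label slices and rejects label names banned by policy.

```python
import re

# CI-style lint sketch (illustrative, not an existing tool): flag
# metric definitions whose label lists contain banned label names.

BANNED_LABELS = {"user_id", "email", "ip_address", "session_id",
                 "request_id", "trace_id", "uuid"}

# Matches Go label slices like []string{"method", "status", "user_id"}
LABEL_LIST = re.compile(r'\[\]string\{([^}]*)\}')

def find_banned_labels(source: str) -> set[str]:
    """Return banned label names found in Go metric label slices."""
    found = set()
    for match in LABEL_LIST.finditer(source):
        labels = {s.strip().strip('"') for s in match.group(1).split(",")}
        found |= labels & BANNED_LABELS
    return found

snippet = '[]string{"method", "status", "user_id"}'
print(find_banned_labels(snippet))  # {'user_id'}
```

Wired into CI (fail the build when the returned set is non-empty), this catches the most common cardinality mistakes before they reach production.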
## Related Errors
- **Prometheus TSDB compaction failed**: Block compaction taking too long
- **Prometheus remote write backlog**: Remote storage cannot keep up
- **Prometheus OOM killed**: Memory exhausted from high cardinality
- **Prometheus scrape timeout**: Too many samples per scrape
- **Prometheus query timeout**: Queries too slow due to series count