What's Actually Happening

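Prometheus is failing to deliver samples over remote write, so data is not reaching long-term storage such as Cortex, Thanos, or VictoriaMetrics. Failed batches are retried, and pending samples pile up in the remote write queue while the problem persists.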

The Error You'll See

```bash
# Prometheus logs:
level=error ts=2026-04-08T10:00:00.000Z caller=queue_manager.go:XXX component="remote queue" remote_name=default url=http://remote:9090/api/v1/write msg="Failed to send batch, retrying" err="Post: context deadline exceeded"
```

```bash
# Metrics show failures (before Prometheus 2.23 these were named
# prometheus_remote_storage_failed_samples_total and prometheus_remote_storage_pending_samples):
prometheus_remote_storage_samples_failed_total{remote_name="default"} 1000
prometheus_remote_storage_samples_pending{remote_name="default"} 50000
```
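When the log is noisy, it helps to pull out just the `err="..."` field and count how often each distinct error occurs. The sketch below is one way to do that; `extract_err` is a made-up helper name, and it assumes the error text contains no escaped double quotes.

```bash
#!/usr/bin/env bash
# extract_err (hypothetical helper): read logfmt lines on stdin and print
# the value of the err="..." field for every line that has one.
# Assumption: the err value contains no escaped double quotes.
extract_err() {
  sed -n 's/.*err="\([^"]*\)".*/\1/p'
}

# Example with a shortened version of the log line shown above:
line='level=error component="remote queue" msg="Failed to send batch, retrying" err="Post: context deadline exceeded"'
printf '%s\n' "$line" | extract_err
```

Against a live container, something like `docker logs prometheus 2>&1 | extract_err | sort | uniq -c | sort -rn` gives a ranked list of error messages.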

Why This Happens

  1. Endpoint unreachable - Network connectivity issues
  2. Authentication failed - Wrong credentials or tokens
  3. Rate limited - Too many writes per second
  4. Data size too large - Batch exceeds limit
  5. TLS certificate - Invalid or expired certificate
  6. Sharding issues - Wrong number of shards
  7. Memory pressure - OOM during send
  8. Endpoint overloaded - Remote storage cannot keep up

Step 1: Check Remote Write Configuration

```yaml
# prometheus.yml:
remote_write:
  - url: "http://remote-storage:9090/api/v1/write"
    queue_config:
      max_samples_per_send: 1000
      max_shards: 200
      capacity: 2500
    send_exemplars: true
```

```bash
# Verify configuration:
promtool check config prometheus.yml

# Check logs:
docker logs prometheus 2>&1 | grep remote

# Check metrics (URL quoted so the shell does not interpret the "?"):
curl 'http://localhost:9090/api/v1/query?query=prometheus_remote_storage_samples_failed_total'
```
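The queue settings are easier to sanity-check with a quick calculation. Prometheus's remote write tuning guidance suggests keeping `capacity` at roughly 3 to 10 times `max_samples_per_send`. The sketch below applies that rule of thumb; the function name `check_capacity_ratio` and its output labels are invented for illustration, and the thresholds are guidance rather than hard limits.

```bash
#!/usr/bin/env bash
# check_capacity_ratio (hypothetical helper):
#   $1 = max_samples_per_send, $2 = capacity (both integers)
# Prints "low", "ok", or "high" relative to the 3x-10x rule of thumb.
check_capacity_ratio() {
  local max_samples_per_send=$1 capacity=$2
  if [ "$capacity" -lt $((3 * max_samples_per_send)) ]; then
    echo "low"
  elif [ "$capacity" -gt $((10 * max_samples_per_send)) ]; then
    echo "high"
  else
    echo "ok"
  fi
}

# Values from the example configuration in this step:
check_capacity_ratio 1000 2500   # prints "low" (2500 < 3 * 1000)
```

A "low" result means a shard can buffer fewer than three batches, so a slow send is more likely to block reading from the WAL; "high" mostly costs memory.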

Step 2: Test Endpoint Connectivity

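```bash
# Test network connectivity (a plain GET will likely be rejected,
# but any HTTP response shows the endpoint is reachable):
curl -v http://remote-storage:9090/api/v1/write

# Test with sample data. The body must be a snappy-compressed protobuf
# WriteRequest; a raw WAL segment is not a valid payload.
# sample.pb.snappy is a placeholder file name.
curl -X POST http://remote-storage:9090/api/v1/write \
  -H "Content-Type: application/x-protobuf" \
  -H "Content-Encoding: snappy" \
  -H "X-Prometheus-Remote-Write-Version: 0.1.0" \
  --data-binary @sample.pb.snappy

# Check DNS:
nslookup remote-storage

# Check port:
nc -zv remote-storage 9090
```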
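The status code returned by the probe usually narrows the cause down quickly. The sketch below maps a code to a likely explanation using general HTTP semantics; `interpret_write_status` is a made-up name, and how a particular backend (Cortex, Thanos, VictoriaMetrics) actually responds is an assumption you should confirm against its documentation.

```bash
#!/usr/bin/env bash
# interpret_write_status (hypothetical helper): $1 = HTTP status code
# as printed by curl's -w '%{http_code}'.
interpret_write_status() {
  case "$1" in
    000)     echo "no response: connection, DNS, or TLS failure" ;;  # curl reports 000 when it got no HTTP response
    2??)     echo "ok: endpoint accepted the request" ;;
    400)     echo "bad request: endpoint reachable, payload rejected" ;;
    401|403) echo "auth: check credentials or token" ;;
    404|405) echo "wrong path or method: check the remote write URL" ;;
    413)     echo "payload too large: lower max_samples_per_send" ;;
    429)     echo "rate limited: reduce shards or batch size" ;;
    5??)     echo "server error: remote storage failing or overloaded" ;;
    *)       echo "unexpected status: inspect the response body" ;;
  esac
}

# Usage against the endpoint from this step (not executed here):
#   code=$(curl -s -o /dev/null -w '%{http_code}' -X POST http://remote-storage:9090/api/v1/write)
#   interpret_write_status "$code"
interpret_write_status 429
```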

Step 3: Fix Authentication

```yaml
# Basic auth:
remote_write:
  - url: "http://remote-storage:9090/api/v1/write"
    basic_auth:
      username: admin
      password: secret

# Bearer token:
remote_write:
  - url: "http://remote-storage:9090/api/v1/write"
    bearer_token: "your-token"

# TLS configuration:
remote_write:
  - url: "https://remote-storage:9090/api/v1/write"
    tls_config:
      cert_file: /etc/prometheus/cert.pem
      key_file: /etc/prometheus/key.pem
      insecure_skip_verify: false
```
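To confirm that the credentials themselves are accepted, you can send the same `Authorization` header from the command line. The sketch below builds an HTTP Basic header from a username and password; `basic_auth_header` is a made-up helper, and `admin` / `secret` are just the placeholder values from the configuration above.

```bash
#!/usr/bin/env bash
# basic_auth_header (hypothetical helper): $1 = username, $2 = password.
# HTTP Basic auth sends base64("username:password") in the Authorization header.
basic_auth_header() {
  printf 'Authorization: Basic %s\n' "$(printf '%s:%s' "$1" "$2" | base64)"
}

basic_auth_header admin secret
# prints: Authorization: Basic YWRtaW46c2VjcmV0
```

Passing the result to `curl -H "$(basic_auth_header admin secret)"` and getting a 401 or 403 back points at the credentials rather than the network. `curl -u admin:secret` produces the same header if you do not need to see it.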

Step 4: Tune Queue Configuration

```yaml
remote_write:
  - url: "http://remote-storage:9090/api/v1/write"
    queue_config:
      # Maximum number of samples per batch
      max_samples_per_send: 500
      # Maximum number of concurrent shards
      max_shards: 100
      # Minimum number of shards
      min_shards: 1
      # Samples buffered per shard (not for the queue as a whole)
      capacity: 10000
      # Maximum time a sample waits in the buffer before a partial batch is sent
      batch_send_deadline: 5s
      # Initial retry delay after a failed send
      min_backoff: 30ms
      # Upper limit for the retry delay
      max_backoff: 100ms
```
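A rough ceiling on throughput helps decide whether these numbers can keep up with your ingest rate. The sketch below assumes each shard sends one batch at a time, so the upper bound is `max_shards * max_samples_per_send / batch latency`. It ignores retries and partially filled batches, and `max_throughput` is a made-up helper name.

```bash
#!/usr/bin/env bash
# max_throughput (hypothetical helper):
#   $1 = max_shards, $2 = max_samples_per_send, $3 = batch latency in milliseconds
# Prints an upper bound in samples per second, using integer arithmetic.
# Assumption: each shard has at most one batch in flight.
max_throughput() {
  echo $(( $1 * $2 * 1000 / $3 ))
}

# 100 shards, 500 samples per batch, 200 ms per send:
max_throughput 100 500 200   # prints 250000
```

If the result is below the rate at which Prometheus ingests samples, the queue cannot drain no matter how large `capacity` is; the remote end has to get faster or the batch size has to grow.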

Step 5: Handle Rate Limiting

```yaml
remote_write:
  - url: "http://remote-storage:9090/api/v1/write"
    queue_config:
      max_samples_per_send: 500
      max_shards: 50
    # Metadata configuration
    metadata_config:
      send: true
      send_interval: 1m
      max_samples_per_send: 500
```
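When the endpoint is rate limiting, how quickly Prometheus retries matters as much as batch size. The sketch below prints the retry delays for a given `min_backoff` and `max_backoff`, assuming the delay doubles after each failed attempt and is capped at `max_backoff`. `backoff_schedule` is a made-up helper, and the model ignores any delay the server itself may request.

```bash
#!/usr/bin/env bash
# backoff_schedule (hypothetical helper):
#   $1 = min_backoff in ms, $2 = max_backoff in ms, $3 = number of retries
# Prints one delay per line. Assumption: doubling, capped at max_backoff.
backoff_schedule() {
  local delay=$1 max=$2 retries=$3 i
  for ((i = 0; i < retries; i++)); do
    echo "$delay"
    delay=$((delay * 2))
    if [ "$delay" -gt "$max" ]; then
      delay=$max
    fi
  done
}

# min_backoff: 30ms, max_backoff: 100ms, first four retries:
backoff_schedule 30 100 4
# prints 30, 60, 100, 100 (one per line)
```

With a 100 ms cap, a failing shard retries about ten times per second, which can keep a rate-limited endpoint under pressure. Raising `max_backoff` to a few seconds is a reasonable thing to try in that case.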

Step 6: Monitor Remote Write Metrics

```promql
# Check failed samples (prometheus_remote_storage_failed_samples_total before Prometheus 2.23):
sum(rate(prometheus_remote_storage_samples_failed_total[5m]))

# Check pending samples (prometheus_remote_storage_pending_samples before Prometheus 2.23):
prometheus_remote_storage_samples_pending

# Check per-shard queue capacity:
prometheus_remote_storage_shard_capacity

# Check shards:
prometheus_remote_storage_shards_max
prometheus_remote_storage_shards_min
prometheus_remote_storage_shards_desired

# Check latency:
histogram_quantile(0.99, rate(prometheus_remote_storage_sent_batch_duration_seconds_bucket[5m]))
```
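The shard metrics are most useful when compared with each other: if `shards_desired` reaches `shards_max`, Prometheus wants more parallelism than the configuration allows. The sketch below makes that comparison; `shard_headroom` is a made-up helper, and it assumes both values have already been read from the metrics as whole numbers.

```bash
#!/usr/bin/env bash
# shard_headroom (hypothetical helper):
#   $1 = value of prometheus_remote_storage_shards_desired
#   $2 = value of prometheus_remote_storage_shards_max
# Assumption: both arguments are integers.
shard_headroom() {
  local desired=$1 max=$2
  if [ "$desired" -ge "$max" ]; then
    echo "saturated: raise max_shards or speed up the remote endpoint"
  else
    echo "ok: $((max - desired)) shards of headroom"
  fi
}

shard_headroom 12 50   # prints "ok: 38 shards of headroom"
shard_headroom 50 50   # prints the "saturated" message
```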

Prometheus Remote Write Checklist

| Check | Command | Expected |
|---|---|---|
| Endpoint reachable | curl | HTTP 2xx on a valid write |
| Auth configured | config | Credentials set |
| Queue healthy | metrics | Low pending |
| No failures | metrics | Failed samples rate = 0 |
| Shards adequate | metrics | desired < max |

Verify the Fix

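```bash
# After fixing:

# 1. Check that failures have stopped. The counter itself keeps its old total
#    until Prometheus restarts, so query its rate instead of the raw value.
curl -G http://localhost:9090/api/v1/query \
  --data-urlencode 'query=rate(prometheus_remote_storage_samples_failed_total[5m])'
# Output: value 0

# 2. Check pending decreasing
curl 'http://localhost:9090/api/v1/query?query=prometheus_remote_storage_samples_pending'
# Output: a low value that falls over repeated runs

# 3. Verify recent logs are clean (older failures stay in the log history)
docker logs --since 10m prometheus 2>&1 | grep -i "failed to send"
# Output: empty
```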
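Because the failure metric is a counter, a single reading says little; two readings taken a minute apart are what show whether failures have stopped. The sketch below extracts the value from a query API response and compares two readings. Both helper names are made up, the `sed` pattern assumes a response containing a single series, and a JSON tool such as `jq` is the more robust choice if you have one installed.

```bash
#!/usr/bin/env bash
# extract_value (hypothetical helper): read a /api/v1/query JSON response
# on stdin and print the sample value. Assumption: exactly one series.
extract_value() {
  sed -n 's/.*"value":\[[^,]*,"\([^"]*\)"].*/\1/p'
}

# failures_stopped (hypothetical helper): $1 = earlier reading, $2 = later reading.
# Assumption: both readings are whole numbers.
failures_stopped() {
  if [ "$2" -gt "$1" ]; then
    echo "still failing: +$(( $2 - $1 )) samples"
  else
    echo "stable"
  fi
}

# Illustrative response body (not captured from a real server):
resp='{"status":"success","data":{"resultType":"vector","result":[{"metric":{"remote_name":"default"},"value":[1712570400,"1000"]}]}}'
before=$(printf '%s\n' "$resp" | extract_value)
failures_stopped "$before" 1000
```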

  • [Fix Prometheus Target Down](/articles/fix-prometheus-target-down)
  • [Fix Prometheus Scraping Failed](/articles/fix-prometheus-scraping-failed)