The Problem

Prometheus fails to discover targets via Consul, or discovered targets are showing errors:

```text
level=error ts=2026-04-04T21:35:22.123Z caller=consul.go:234 msg="Consul service discovery failed" err="Get \"http://consul:8500/v1/catalog/services\": dial tcp: connection refused"
level=error ts=2026-04-04T21:35:23.234Z caller=consul.go:235 msg="Consul refresh failed" err="Permission denied: ACL token invalid"
level=warn ts=2026-04-04T21:35:24.345Z caller=consul.go:236 msg="No services found in Consul" services=""
```

Consul service discovery errors prevent dynamic target discovery, requiring manual configuration updates.

Diagnosis

Check Consul Connectivity

```bash
# Test Consul API directly
curl -s http://consul:8500/v1/catalog/services | jq .

# Check Consul health (the cluster should report a leader)
curl -s http://consul:8500/v1/status/leader | jq .

# Check specific service
curl -s http://consul:8500/v1/catalog/service/myapp | jq .
```
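The failure modes shown in the log excerpt (refused connection, rejected token, empty catalog) can often be told apart from the HTTP status alone. The sketch below is one way to do that. `classify_consul_response` is a hypothetical helper name, and the mapping assumes that curl reports `000` when it received no HTTP response and that Consul answers ACL failures with `403`.

```bash
# Hypothetical helper: map the HTTP status code and response body of
# /v1/catalog/services to a likely cause.
classify_consul_response() {
  local code="$1" body="$2"
  case "$code" in
    000) echo "unreachable" ;;   # curl got no HTTP response at all
    403) echo "acl-denied" ;;    # token missing, invalid, or lacking read access
    200)
      if [ "$body" = "{}" ]; then
        echo "no-services"       # catalog reachable, but nothing visible to this token
      else
        echo "ok"
      fi
      ;;
    *) echo "unexpected-$code" ;;
  esac
}
```

Against a live server, the status code can be captured with `curl -s -o /dev/null -w '%{http_code}' http://consul:8500/v1/catalog/services` and the body with a second plain `curl -s` call, then both passed to the function.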

Check Prometheus Consul SD Status

```bash
# Check discovered targets
curl -s http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | select(.labels.job | contains("consul"))'

# Check Prometheus logs for Consul errors
journalctl -u prometheus --since "1 hour ago" | grep -i consul
```

Check Consul ACL Token

```bash
# Verify ACL token
curl -s -H "X-Consul-Token: your-token" http://consul:8500/v1/acl/token/self | jq .

# Check token permissions
consul acl token read -id your-token-id
```

Check Service Tags

```bash
# List services with tags
curl -s http://consul:8500/v1/catalog/services | jq .

# Get detailed service info
curl -s "http://consul:8500/v1/catalog/service/myapp?tag=prometheus" | jq .
```

Solutions

1. Fix Basic Consul Configuration

Incorrect Consul SD configuration:

```yaml
# prometheus.yml
scrape_configs:
  # WRONG: Missing required configuration
  # - job_name: 'consul-services'
  #   consul_sd_configs: []

  # CORRECT: Proper Consul SD configuration
  - job_name: 'consul-services'
    consul_sd_configs:
      - server: 'consul:8500'
        services: []  # All services
        # Or specific services
        # services: ['myapp', 'node-exporter']
        tags: ['prometheus']  # Filter by tag
        node_meta:
          environment: 'production'
        refresh_interval: 30s
```

2. Fix ACL Token Configuration

When ACLs are enabled, Consul requires a token for API access:

```yaml
scrape_configs:
  - job_name: 'consul-services'
    consul_sd_configs:
      - server: 'consul:8500'
        services: ['myapp']
        token: 'your-consul-acl-token'
        # Or, instead of token, read it from a file (sent as a bearer token)
        # authorization:
        #   credentials_file: /etc/prometheus/consul-token
```

Create a read-only ACL policy and a token for Prometheus:

```bash
# Create ACL policy for Prometheus
consul acl policy create -name "prometheus-read" \
  -rules 'service_prefix "" { policy = "read" } node_prefix "" { policy = "read" }'

# Create token (-secret is optional; Consul generates one if it is omitted)
consul acl token create -description "prometheus-token" \
  -policy-name "prometheus-read" \
  -secret "your-secret-token"
```

Token file approach:

```bash
# Store token securely
echo "your-consul-acl-token" > /etc/prometheus/consul-token
chmod 600 /etc/prometheus/consul-token
```
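Writing the file first and tightening its mode afterwards leaves a brief window in which other users can read the token. A minimal sketch of one alternative is to set a restrictive umask in a subshell before creating the file. `write_token_file` is a hypothetical helper name, and the umask only affects files that do not already exist.

```bash
# Hypothetical helper: create the token file with mode 600 from the start.
write_token_file() {
  local token="$1" path="$2"
  # The subshell keeps the umask change from leaking into the caller.
  (
    umask 077
    printf '%s' "$token" > "$path"
  )
}
```

A call such as `write_token_file "your-consul-acl-token" /etc/prometheus/consul-token` then replaces the two commands above.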

3. Fix Service Tag Filtering

Services not discovered due to tag mismatch:

```yaml
scrape_configs:
  - job_name: 'consul-services'
    consul_sd_configs:
      - server: 'consul:8500'
        tags: ['prometheus']  # Only discover services with this tag
```

Register services with correct tags:

```json
{
  "Service": {
    "Name": "myapp",
    "Tags": ["prometheus", "production"],
    "Address": "10.0.0.5",
    "Port": 9090,
    "Meta": {
      "prometheus_scrape_path": "/metrics",
      "prometheus_scrape_port": "9090"
    }
  }
}
```

Register service via Consul API:

```bash
curl -X PUT http://consul:8500/v1/agent/service/register -d '{
  "Name": "myapp",
  "Tags": ["prometheus"],
  "Address": "10.0.0.5",
  "Port": 9090,
  "Meta": {"prometheus_scrape_path": "/metrics"}
}'
```

4. Configure Relabeling

Use service metadata for configuration:

```yaml
scrape_configs:
  - job_name: 'consul-services'
    consul_sd_configs:
      - server: 'consul:8500'
        services: []
        tags: ['prometheus']
    relabel_configs:
      # Keep the host from Consul but force the scrape port to 9090
      - source_labels: [__address__]
        target_label: __address__
        regex: '([^:]+):\d+'
        replacement: '${1}:9090'

      # Use custom scrape path from metadata
      - source_labels: [__meta_consul_service_metadata_prometheus_scrape_path]
        target_label: __metrics_path__
        regex: '(.+)'
        replacement: '${1}'
        action: replace

      # Use custom port from metadata (host from __address__, port from metadata)
      - source_labels: [__address__, __meta_consul_service_metadata_prometheus_scrape_port]
        target_label: __address__
        regex: '([^:]+)(?::\d+)?;(\d+)'
        replacement: '${1}:${2}'

      # Keep only healthy services
      - source_labels: [__meta_consul_health]
        regex: 'passing'
        action: keep

      # Set job from service name (queries and alerts on job="consul-services"
      # must then match the service name instead)
      - source_labels: [__meta_consul_service]
        target_label: job
```
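Relabel regexes are easy to get wrong, and a mistake only shows up after a reload. Prometheus joins the values of `source_labels` with `;` and matches the regex against the whole joined string, so a host-plus-port rewrite can be approximated offline. The sketch below uses `sed -E` as a stand-in for Prometheus's RE2 engine, so it is an approximation and not the same implementation. `rewrite_address` is a hypothetical helper, and the regex is one common pattern, not necessarily the one in your configuration.

```bash
# Hypothetical helper: emulate a "replace" relabel rule that combines the
# host from __address__ with a port taken from service metadata.
#   $1 = __address__ value, $2 = port from metadata (may be empty)
rewrite_address() {
  # Join the two values with ';' the way Prometheus joins source_labels.
  printf '%s;%s\n' "$1" "$2" |
    sed -E \
      -e 's/^([^:;]+)(:[0-9]+)?;([0-9]+)$/\1:\3/' \
      -e 's/;.*$//'   # no match: drop the joined suffix, leaving the address unchanged
}
```

For example, `rewrite_address 10.0.0.5:8080 9102` prints `10.0.0.5:9102`, while an empty or non-numeric port leaves `10.0.0.5:8080` as it was, which mirrors how a non-matching `replace` rule leaves the target label untouched.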

5. Fix Network Connectivity

Consul unreachable from Prometheus:

```bash
# Check network connectivity
ping -c 3 consul
nc -zv consul 8500

# From Prometheus container
kubectl exec -it prometheus-pod -- curl -s http://consul:8500/v1/catalog/services
```

Then point Prometheus at a server address that resolves from where it runs:

```yaml
# Option A: Kubernetes DNS
scrape_configs:
  - job_name: 'consul-services'
    consul_sd_configs:
      - server: 'consul.default.svc.cluster.local:8500'

# Option B: External Consul (use instead of option A, not alongside it)
# scrape_configs:
#   - job_name: 'consul-services'
#     consul_sd_configs:
#       - server: 'consul.example.com:8500'
#         # Add authentication
#         basic_auth:
#           username: prometheus
#           password: secret
```

6. Handle TLS Configuration

Secure Consul connections:

```yaml
scrape_configs:
  - job_name: 'consul-services'
    consul_sd_configs:
      - server: 'consul:8500'  # host:port only; the protocol goes in scheme
        scheme: https
        token: 'your-acl-token'
        tls_config:
          ca_file: /etc/prometheus/consul/ca.crt
          cert_file: /etc/prometheus/consul/client.crt
          key_file: /etc/prometheus/consul/client.key
          insecure_skip_verify: false
```
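A wrong or unreadable path in `tls_config` can make the whole TLS setup fail. As a small pre-flight check, the sketch below verifies that each file exists and is readable by the current user. `check_tls_files` is a hypothetical helper; it does not validate the certificate contents, and it should be run as the user Prometheus runs as.

```bash
# Hypothetical helper: report every path that is missing or unreadable.
# Returns 0 only if all given files are readable.
check_tls_files() {
  local status=0 f
  for f in "$@"; do
    if [ ! -r "$f" ]; then
      echo "missing or unreadable: $f" >&2
      status=1
    fi
  done
  return "$status"
}
```

A call might look like `check_tls_files /etc/prometheus/consul/ca.crt /etc/prometheus/consul/client.crt /etc/prometheus/consul/client.key`.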

7. Fix Watch Issues

Consul watch not updating:

```yaml
scrape_configs:
  - job_name: 'consul-services'
    consul_sd_configs:
      - server: 'consul:8500'
        refresh_interval: 15s  # Check every 15 seconds
```

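Force a configuration reload via the API:

```bash
# Reload the configuration to re-apply the service discovery settings
# (requires Prometheus to be started with --web.enable-lifecycle)
curl -X POST http://localhost:9090/-/reload
```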

Verification

Check Discovered Targets

```bash
# List Consul-discovered targets
curl -s http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | select(.discoveredLabels.job == "consul-services") | {job: .labels.job, instance: .labels.instance}'
```
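Targets may take a moment to appear after a service is registered or the configuration is reloaded, so a single query can give a false negative. One option is to poll. The sketch below is a generic retry helper (`retry` is a hypothetical name) that runs a command up to a given number of times with a fixed delay between attempts.

```bash
# Hypothetical helper.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    # Sleep only if another attempt follows.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}
```

For instance, `retry 10 3 sh -c 'curl -sf http://localhost:9090/api/v1/targets | grep -q consul-services'` checks up to ten times, three seconds apart, whether the job shows up in the targets API.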

Verify Services

```promql
# Count discovered targets per job
count by (job) (up{job="consul-services"})

# Discovered targets that are currently failing scrapes
up{job="consul-services"} == 0
```

Test Consul SD

```bash
# Manual query of Consul
curl -s "http://consul:8500/v1/catalog/services" | jq 'keys'

# Check specific service instances
curl -s "http://consul:8500/v1/catalog/service/myapp" | jq '.[] | {Address: .ServiceAddress, Port: .ServicePort, Tags: .ServiceTags}'
```

Prevention

Add monitoring for Consul SD:

```yaml
groups:
  - name: consul_sd_alerts
    rules:
      # Assumes a separate scrape job named "consul" that scrapes the Consul servers
      - alert: ConsulSDDown
        expr: up{job="consul"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Consul service discovery is down"
          description: "Cannot connect to Consul server at {{ $labels.instance }}"

      # absent() fires when no series exist; count() of an empty result never equals 0
      - alert: ConsulSDNoServices
        expr: absent(up{job="consul-services"})
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "No services discovered from Consul"
          description: "No targets found via Consul service discovery"

      - alert: ConsulSDRefreshFailed
        expr: increase(prometheus_sd_consul_rpc_failures_total[5m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Consul SD refresh failed"
          description: "{{ $value }} Consul RPC failures in last 5 minutes"

      - alert: ConsulServiceUnhealthy
        expr: up{job="consul-services"} == 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Consul-discovered service is down"
          description: "Service {{ $labels.instance }} discovered via Consul is not responding"
```

Consul Service Registration Template

Register services properly for Prometheus:

```json
{
  "Service": {
    "Name": "myapp",
    "ID": "myapp-10.0.0.5-9090",
    "Tags": ["prometheus", "app", "production"],
    "Address": "10.0.0.5",
    "Port": 9090,
    "Meta": {
      "prometheus_scrape_path": "/metrics",
      "prometheus_scrape_interval": "30s",
      "environment": "production"
    },
    "Check": {
      "HTTP": "http://10.0.0.5:9090/health",
      "Interval": "10s",
      "Timeout": "5s"
    }
  }
}
```