Grafana Tempo returns "Trace not found" when searching for traces, or your distributed tracing dashboard shows gaps. Without traces, you cannot debug request flows across services. Let's diagnose why traces are missing and fix the issue.

Understanding Trace Not Found

Tempo stores distributed traces from OpenTelemetry, Jaeger, Zipkin, and other tracing systems. "Trace not found" errors occur when:

  • Trace never reached Tempo (ingestion failure)
  • Trace is in wrong tenant/database
  • Query time range is incorrect
  • Trace ID is invalid or truncated
  • Storage backend issues

Error patterns:

```
Trace ID not found: 1234567890abcdef
No spans found for trace query
tempo: trace search returned empty
```

Initial Diagnosis

Check Tempo status and ingestion:

```bash
# Check Tempo service status
kubectl get pods -l app=tempo -n monitoring

# Check Tempo logs
kubectl logs -l app=tempo -n monitoring | grep -iE "error|fail|trace"

# Check Tempo health
# (recent Tempo releases serve the HTTP API on 3200; older ones used 3100)
curl -s http://localhost:3200/ready

# Check ingester metrics
curl -s http://localhost:3200/metrics | grep tempo_ingester_spans_received_total

# Check querier metrics
curl -s http://localhost:3200/metrics | grep tempo_querier_requests_total

# Broad check of ingestion and query metrics
curl -s http://localhost:3200/metrics | grep -E "tempo_ingester|tempo_query"

# Test the trace lookup API
curl -s "http://localhost:3200/api/traces/1234567890abcdef" | jq '.'
```

Common Cause 1: Trace Not Being Ingested

Traces are being generated but not reaching Tempo.

Error pattern: `Trace ID not found`

Diagnosis:

```bash
# Check the ingester is receiving traces
curl -s http://localhost:3200/metrics | grep tempo_ingester_traces_received_total

# Check trace push errors
curl -s http://localhost:3200/metrics | grep tempo_ingester_push_errors_total

# Check distributor metrics
curl -s http://localhost:3200/metrics | grep tempo_distributor

# Check from the OTel collector
kubectl logs -l app=otel-collector -n monitoring | grep -iE "tempo|trace|export"

# Check the OTLP HTTP receiver is reachable (if enabled on :4318)
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:4318/v1/traces

# Check network connectivity from collectors (OTLP gRPC receiver port)
nc -zv tempo-distributor 4317
```

Solution:

Fix trace ingestion pipeline:

```yaml
# OpenTelemetry Collector configuration
exporters:
  otlp/tempo:
    endpoint: tempo:4317
    tls:
      insecure: true
    sending_queue:
      enabled: true
      num_consumers: 10
      queue_size: 1000
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s

# Make sure the pipeline actually exports to Tempo
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/tempo]
```

Verify Jaeger/Zipkin ingestion:

```yaml
# Tempo configuration for multiple formats
distributor:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: 0.0.0.0:4317
        http:
          endpoint: 0.0.0.0:4318
    jaeger:
      protocols:
        grpc:
          endpoint: 0.0.0.0:14250
        thrift_binary:
          endpoint: 0.0.0.0:6832
        thrift_http:
          endpoint: 0.0.0.0:14268
    zipkin:
      endpoint: 0.0.0.0:9411
```

Common Cause 2: Trace In Wrong Tenant

In a multi-tenant Tempo deployment, the trace may have been ingested under a different tenant than the one you are querying.

Error pattern: `Trace not found in tenant X`

Diagnosis:

```bash
# Check whether multitenancy is enabled
curl -s http://localhost:3200/status/config | grep multitenancy_enabled

# Check which tenant the trace is in:
# query with the X-Scope-OrgID header
curl -s "http://localhost:3200/api/traces/traceid" \
  -H "X-Scope-OrgID: tenant-1"

# Try different tenants
curl -s "http://localhost:3200/api/traces/traceid" \
  -H "X-Scope-OrgID: tenant-2"

# Per-tenant limits are defined in the overrides config (see below)
```

Solution:

Configure correct tenant:

```yaml
# tempo-config.yaml
multitenancy_enabled: true

# Configure tenant-specific settings
overrides:
  per_tenant:
    tenant-1:
      ingestion:
        max_bytes_per_trace: 50000
      block_retention: 14d
    tenant-2:
      ingestion:
        max_bytes_per_trace: 100000

# Use the correct tenant header in queries.
# In Grafana, configure the Tempo datasource with the X-Scope-OrgID header.
```
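When it is unclear which tenant holds a trace, a short script can probe each tenant by setting `X-Scope-OrgID` per request. A sketch using only the Python standard library; the Tempo URL and tenant names are placeholders for your environment:

```python
import urllib.request
from typing import List, Optional

TEMPO_URL = "http://localhost:3200"  # placeholder; adjust for your cluster

def trace_request(trace_id: str, tenant: str) -> urllib.request.Request:
    """Build a per-tenant trace lookup; X-Scope-OrgID selects the tenant."""
    return urllib.request.Request(
        f"{TEMPO_URL}/api/traces/{trace_id}",
        headers={"X-Scope-OrgID": tenant},
    )

def find_tenant(trace_id: str, tenants: List[str]) -> Optional[str]:
    """Return the first tenant whose lookup succeeds, or None."""
    for tenant in tenants:
        try:
            with urllib.request.urlopen(trace_request(trace_id, tenant), timeout=5) as resp:
                if resp.status == 200:
                    return tenant
        except OSError:
            continue  # 404 or connection error: try the next tenant
    return None
```

For example, `find_tenant("1234567890abcdef", ["tenant-1", "tenant-2"])` returns the first tenant whose lookup returns the trace.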

Common Cause 3: Trace ID Format Issues

Trace ID is malformed or truncated.

Error pattern: `Invalid trace ID format`

Diagnosis:

```bash
# Valid trace IDs are 16- or 32-character hex strings

# Verify trace ID length
echo "Trace ID: $TRACE_ID"
echo "Length: ${#TRACE_ID}"

# Check for encoding issues
curl -s "http://localhost:3200/api/traces/$TRACE_ID"

# Try with URL encoding
curl -s "http://localhost:3200/api/traces/$(printf '%s' "$TRACE_ID" | jq -sRr @uri)"

# Check the trace ID in application logs
grep -iE "trace.*id|traceid" /var/log/app.log | tail -10

# Compare trace ID formats across services:
# some systems use 16-char (64-bit) IDs, others 32-char (128-bit)
```

Solution:

Fix trace ID handling:

```bash
# Ensure a consistent trace ID format:
# OpenTelemetry uses 32-character hex (16 bytes);
# Jaeger clients may emit 16-character (8-byte) IDs

# Normalize a 16-char ID by left-padding with zeros to 32 chars
TRACE_ID_32=$(printf '%032s' "$TRACE_ID" | tr ' ' '0')
# Example: 1234567890abcdef -> 00000000000000001234567890abcdef

# In the OpenTelemetry Python SDK, the default RandomIdGenerator
# already produces proper 32-char trace IDs:
#   tracer_provider = TracerProvider(id_generator=RandomIdGenerator())

# For Jaeger clients, enable 128-bit trace IDs
export JAEGER_TRACEID_128BIT=true
```
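The same normalization can be scripted in Python. A small helper (hypothetical, not part of any SDK) that validates a trace ID and left-pads 16-character IDs to the 32-character form:

```python
import string

def normalize_trace_id(trace_id: str) -> str:
    """Validate a hex trace ID and left-pad 16-char IDs to 32 chars."""
    tid = trace_id.strip().lower()
    if len(tid) not in (16, 32):
        raise ValueError(f"trace ID must be 16 or 32 hex chars, got {len(tid)}")
    if any(c not in string.hexdigits for c in tid):
        raise ValueError(f"trace ID is not valid hex: {trace_id!r}")
    return tid.zfill(32)  # left-pad 64-bit IDs with zeros

print(normalize_trace_id("1234567890abcdef"))
# 00000000000000001234567890abcdef
```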

Common Cause 4: Time Range Query Issues

Trace exists but query time range doesn't include it.

Error pattern: `Trace not found in queried time range`

Diagnosis:

```bash
# Check live (not yet flushed) traces held by the ingester
curl -s http://localhost:3200/metrics | grep tempo_ingester_live_traces

# Search with a broader time range (start/end are unix epoch seconds)
curl -s "http://localhost:3200/api/search?tags=service.name%3Dmy-service&start=1640000000&end=1640003600&limit=100"

# Check the configured block retention
curl -s http://localhost:3200/status/config | grep -A 3 block_retention

# Use trace search instead of a direct ID lookup (TraceQL)
curl -sG "http://localhost:3200/api/search" \
  --data-urlencode 'q={resource.service.name="my-service"}' | jq '.traces'
```

Solution:

Query with correct time range:

```bash
# Use a broader time range when searching.
# The Tempo search API takes start/end as unix epoch seconds.
curl -s "http://localhost:3200/api/search?tags=service.name%3Dmy-service&start=1640000000&end=1640086400&limit=100"

# In Grafana, ensure the time picker covers the trace:
# Explore > Tempo > adjust the time range
```

Traces older than the retention window are deleted, so check the configured retention and increase it if needed:

```yaml
# tempo-config.yaml
compactor:
  compaction:
    block_retention: 336h   # 14 days; traces older than this are deleted
    # Increase if needed, e.g. 720h for 30 days
```
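Since `start` and `end` are unix epoch seconds, it is easier to compute the window programmatically than to hard-code timestamps. A minimal sketch; the URL and service name are illustrative:

```python
import time

def search_window(hours_back: float = 1.0):
    """Return (start, end) unix timestamps covering the last hours_back hours."""
    end = int(time.time())
    start = end - int(hours_back * 3600)
    return start, end

start, end = search_window(24)
url = (
    "http://localhost:3200/api/search"
    f"?tags=service.name%3Dmy-service&start={start}&end={end}&limit=100"
)
print(url)
```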

Common Cause 5: Ingester Not Flushing

Ingester holds traces but doesn't flush to storage.

Error pattern: `Trace in ingester but not queryable`

Diagnosis:

```bash
# Check ingester flush metrics
curl -s http://localhost:3200/metrics | grep tempo_ingester_flush

# Check block creation
curl -s http://localhost:3200/metrics | grep tempo_ingester_blocks

# Check flush queue length
curl -s http://localhost:3200/metrics | grep tempo_ingester_flush_queue_length

# Check the ingester ring status (HTML status page)
curl -s http://localhost:3200/ingester/ring

# Monitor flush failures and duration
curl -s http://localhost:3200/metrics | \
  grep -E "tempo_ingester_failed_flushes|tempo_ingester_flush_duration"
```

Solution:

Configure proper flushing:

```yaml
# tempo-config.yaml
ingester:
  lifecycler:
    ring:
      kvstore:
        store: memberlist
    heartbeat_period: 5s
  trace_idle_period: 10s       # how long before flushing a complete trace
  max_block_duration: 30m      # max time before forcing a flush
  max_block_bytes: 104857600   # ~100MB
  flush_check_period: 10s

compactor:
  compaction:
    block_retention: 336h            # 14 days
    chunk_size_bytes: 10485760       # 10MB chunks
    flush_size_bytes: 1048576        # 1MB flush size
```

```bash
# Restart the ingester if it is stuck
kubectl rollout restart deployment/tempo-ingester -n monitoring
```

Common Cause 6: Storage Backend Issues

Backend storage (S3, GCS, local) problems.

Error pattern: `Trace backend error: object storage unavailable`

Diagnosis:

```bash
# Check storage backend configuration
curl -s http://localhost:3200/status/config | grep -A 20 "storage:"

# Test storage connectivity
# For S3
aws s3 ls s3://tempo-traces/
aws s3api head-object --bucket tempo-traces --key some-block-meta.json

# For GCS
gsutil ls gs://tempo-traces/

# Check storage metrics
curl -s http://localhost:3200/metrics | grep tempo_backend

# Check for storage errors in logs
kubectl logs -l app=tempo -n monitoring | grep -iE "storage|s3|bucket"
```

Solution:

Fix storage configuration:

```yaml
# tempo-config.yaml for S3
storage:
  trace:
    backend: s3
    s3:
      bucket: tempo-traces
      endpoint: s3.amazonaws.com
      region: us-east-1
      access_key: AKIAIOSFODNN7EXAMPLE
      secret_key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

# For GCS
storage:
  trace:
    backend: gcs
    gcs:
      bucket_name: tempo-traces
      chunk_buffer_size: 10485760

# For local storage
storage:
  trace:
    backend: local
    local:
      path: /var/tempo/traces

# Check bucket permissions.
# Required for S3: s3:GetObject, s3:PutObject, s3:DeleteObject, s3:ListBucket
```

Common Cause 7: Querier Configuration

Querier cannot access ingesters or storage.

Error pattern: `Querier failed to search: connection refused`

Diagnosis:

```bash
# Check querier error and request metrics
curl -s http://localhost:3200/metrics | grep tempo_querier

# Check querier-to-ingester connections via the ring status page
curl -s http://localhost:3200/ingester/ring

# Test the querier directly
curl -s "http://localhost:3200/api/traces/test-id"

# Check querier replicas
kubectl get pods -l app=tempo-querier -n monitoring

# Check querier configuration
kubectl describe pod -l app=tempo-querier -n monitoring | grep -A 20 "Args"
```

Solution:

Configure querier properly:

```yaml
# tempo-config.yaml
querier:
  max_concurrent_queries: 20
  search:
    query_timeout: 30s
    query_ingesters_until: 30m   # how far back to query ingesters for unflushed traces
  frontend_worker:
    frontend_address: tempo-query-frontend:9095   # query-frontend gRPC port

query_frontend:
  max_outstanding_per_tenant: 100

# Ensure the querier can reach ingesters via the ring
memberlist:
  join_members:
    - tempo-ingester-0:7946
    - tempo-ingester-1:7946
    - tempo-ingester-2:7946
```

Common Cause 8: Service Name Missing

Traces exist but service name filtering doesn't match.

Error pattern: `No traces found for service X`

Diagnosis:

```bash
# List values for the service.name tag
curl -s "http://localhost:3200/api/search/tag/service.name/values" | jq '.'

# List available tag names
curl -s "http://localhost:3200/api/search/tags" | jq '.'

# Search without a service filter (start/end are epoch seconds)
curl -s "http://localhost:3200/api/search?limit=100&start=$(date -d '1 hour ago' +%s)&end=$(date +%s)"

# Check the service name in application configuration
grep -rE "service.name|service_name" application-config/

# Check the OpenTelemetry service name in the pod environment
kubectl exec -it app-pod -- env | grep OTEL_SERVICE_NAME
```

Solution:

Configure proper service names:

```python
# OpenTelemetry SDK configuration in application code:
# service.name is required for proper trace grouping
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider

resource = Resource.create({"service.name": "my-service"})
tracer_provider = TracerProvider(resource=resource)
```

```bash
# Or via environment variable
export OTEL_SERVICE_NAME=my-service

# For Jaeger clients
export JAEGER_SERVICE_NAME=my-service

# Search with the correct service name
curl -s "http://localhost:3200/api/search?tags=service.name%3Dmy-service&limit=100"
```
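Tag values must be URL-encoded when building search queries by hand; Tempo's `tags` parameter takes space-separated `key=value` pairs. A small helper sketch (the base URL is a placeholder):

```python
from urllib.parse import urlencode

def build_search_url(base: str, tags: dict, limit: int = 100) -> str:
    """Build a Tempo search URL from a dict of tag filters."""
    tag_expr = " ".join(f"{k}={v}" for k, v in tags.items())
    query = urlencode({"tags": tag_expr, "limit": limit})
    return f"{base}/api/search?{query}"

print(build_search_url("http://localhost:3200", {"service.name": "my-service"}))
```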

Verification

After fixing, verify traces are searchable:

```bash
# Check Tempo is healthy
curl -s http://localhost:3200/ready

# Generate a test trace through your normal pipeline (OTLP/Jaeger/Zipkin)
# and note its trace ID from the application logs

# Wait for the ingester to make it queryable
sleep 60

# Look up the test trace by ID
curl -s "http://localhost:3200/api/traces/$TRACE_ID" | jq '.'

# Check ingestion metrics are increasing
curl -s http://localhost:3200/metrics | grep tempo_ingester_traces_received_total

# Check for recent traces (start/end are epoch seconds)
curl -s "http://localhost:3200/api/search?start=$(date -d '1 hour ago' +%s)&end=$(date +%s)&limit=10" | jq '.traces'

# Verify in Grafana:
# Navigate to Explore > Tempo datasource
# Search for traces in the correct time range
```

Prevention

Monitor Tempo health:

```yaml
groups:
  - name: tempo_health
    rules:
      - alert: TempoTraceIngestionStopped
        expr: rate(tempo_ingester_traces_received_total[5m]) == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Tempo not receiving traces"

      - alert: TempoIngestionErrors
        expr: rate(tempo_ingester_push_errors_total[5m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Tempo trace ingestion errors"

      - alert: TempoFlushLagging
        expr: tempo_ingester_flush_queue_length > 10
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Tempo ingester flush queue lagging"

      - alert: TempoQuerierErrors
        expr: rate(tempo_querier_errors_total[5m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Tempo querier experiencing errors"
```

Tempo trace not found issues typically stem from ingestion problems, tenant configuration, or query parameters. Verify traces are being ingested, check tenant settings, and ensure query time ranges and trace ID formats are correct.