Introduction

Elasticsearch cluster read-only blocks occur when the cluster or individual indices become read-only, rejecting all write operations with ClusterBlockException: index [name] blocked by: [TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark] or similar errors. This protective mechanism prevents data loss when cluster resources are constrained. Common causes include disk usage exceeding flood-stage watermark (95% by default), disk usage exceeding high watermark (90%) with shard allocation disabled, index explicitly set to read-only mode, license expiration for paid features, cluster state corruption requiring recovery, node disconnection causing split-brain protection, insufficient disk space for shard relocation, JVM heap pressure triggering circuit breakers, and repository snapshot operations locking indices. The fix requires freeing disk space, adjusting watermark thresholds, clearing index blocks, and restoring normal shard allocation. This guide provides production-proven troubleshooting for read-only blocks across Elasticsearch versions 6.x through 8.x and managed services (Elastic Cloud, AWS OpenSearch).

Symptoms

  • ClusterBlockException: index [name] blocked by: [TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark]
  • Write operations fail with 429 Too Many Requests
  • Bulk indexing returns write_rejected errors
  • Index settings show index.blocks.read_only_allow_delete: true
  • Cluster health yellow or red with unassigned shards
  • Disk usage shown as >90% in _cat/allocation
  • Shard allocation fails with disk_threshold_decider
  • Cluster state updates rejected
  • Snapshot operations hang or fail
  • Kibana shows "Cluster is read-only" banner

Common Causes

  • Disk usage exceeds flood-stage watermark (default 95%)
  • Disk usage exceeds high watermark (default 90%) for extended period
  • Shard allocation disabled manually and not re-enabled
  • Index created with read_only_allow_delete flag
  • License expired (for X-Pack features)
  • Cluster split-brain detection triggered
  • Multiple master-eligible nodes with network partition
  • JVM heap pressure >95% triggering circuit breakers
  • File descriptor exhaustion on data nodes
  • Repository snapshot locking indices during backup

Step-by-Step Fix

### 1. Diagnose cluster block state

Check cluster health and blocks:

```bash
# Check cluster health
curl -X GET "localhost:9200/_cluster/health?pretty"

# Output shows:
# {
#   "cluster_name": "my-cluster",
#   "status": "yellow",          # or red
#   "timed_out": false,
#   "number_of_nodes": 3,
#   "number_of_data_nodes": 2,
#   "active_primary_shards": 10,
#   "active_shards": 20,
#   "relocating_shards": 0,
#   "initializing_shards": 0,
#   "unassigned_shards": 5
# }

# Check specific index blocks
curl -X GET "localhost:9200/_all/_settings/index.blocks?pretty"

# Look for:
# "index.blocks.read_only_allow_delete": "true"
# "index.blocks.read_only": "true"

# Check cluster settings for blocks
curl -X GET "localhost:9200/_cluster/settings?include_defaults=true&pretty" \
  | grep -A5 "cluster.blocks"

# Check disk usage and watermarks
curl -X GET "localhost:9200/_cat/allocation?v"

# Output:
# shards disk.indices disk.used disk.avail disk.total disk.percent host        ip       node
#    121         45gb      89gb        6gb       95gb           94 192.168.1.1 10.0.1.1 node-1
#     98         42gb      88gb        7gb       95gb           93 192.168.1.2 10.0.1.2 node-2

# Check which indices are blocked
curl -X GET "localhost:9200/_cat/indices?v&health=red"
curl -X GET "localhost:9200/_cat/indices?v&health=yellow"
```
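
To script the disk check above, a small helper can pull the node name and `disk.percent` out of a `_cat/allocation` row. This is a hypothetical convenience function, not part of Elasticsearch; it assumes the default `_cat/allocation` column order shown earlier:

```shell
# Hypothetical helper: flag a node whose disk.percent exceeds a threshold.
# Assumes the default _cat/allocation column order:
# shards disk.indices disk.used disk.avail disk.total disk.percent host ip node
check_allocation_row() {
  local threshold=$1 row=$2
  # split the row into whitespace-separated fields
  set -- $row
  local percent=$6 node=$9
  if [ "$percent" -gt "$threshold" ]; then
    echo "$node $percent"
  fi
}

# Example, using the sample row from the output above:
check_allocation_row 90 "121 45gb 89gb 6gb 95gb 94 192.168.1.1 10.0.1.1 node-1"
# → node-1 94
```

In practice you would feed it live data, e.g. `curl -s "localhost:9200/_cat/allocation" | while read -r row; do check_allocation_row 90 "$row"; done`.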

Check shard allocation status:

```bash
# Check unassigned shards and reasons
curl -X GET "localhost:9200/_cat/shards?v&h=index,shard,prirep,state,node,unassigned.reason"

# Look for UNASSIGNED shards with reasons such as:
# CLUSTER_RECOVERED
# INDEX_CREATED
# NODE_LEFT
# REPLICA_ADDED
# ALLOCATION_FAILED
# DANGLING_INDEX_IMPORTED

# Check allocation explanation for a specific shard
curl -X GET "localhost:9200/_cluster/allocation/explain?pretty" -H 'Content-Type: application/json' -d '
{
  "index": "my-index",
  "shard": 0,
  "primary": true
}'

# Output shows why the shard cannot be allocated:
# {
#   "index": "my-index",
#   "shard": 0,
#   "primary": true,
#   "current_state": "unassigned",
#   "unassigned_info": {
#     "reason": "ALLOCATION_FAILED",
#     "allocation_status": "no_attempt",
#     "allocation_explanation": "this shard cannot be allocated to any node"
#   },
#   "can_allocate": "no",
#   "allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
#   "node_allocation_decisions": [
#     {
#       "node_id": "abc123",
#       "node_name": "node-1",
#       "transport_address": "10.0.1.1:9300",
#       "node_decision": "no",
#       "deciders": [
#         {
#           "decider": "disk_threshold",
#           "decision": "NO",
#           "explanation": "the node is above the high watermark cluster setting [cluster.routing.allocation.disk.watermark.high=90%], having less than the minimum required [9.5gb] free space, actual free: [6gb]"
#         }
#       ]
#     }
#   ]
# }
```
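
Each `unassigned.reason` code above suggests a different first remediation step. A hypothetical triage helper makes the mapping explicit (the mapping reflects this guide's recommendations, not an Elasticsearch API):

```shell
# Hypothetical triage helper: map an unassigned.reason code from _cat/shards
# to the usual first remediation step described in this guide.
triage_unassigned() {
  case "$1" in
    ALLOCATION_FAILED) echo "POST /_cluster/reroute?retry_failed=true" ;;
    NODE_LEFT)         echo "restore node or wait for delayed allocation timeout" ;;
    REPLICA_ADDED|INDEX_CREATED|CLUSTER_RECOVERED)
                       echo "usually transient; check disk watermarks if persistent" ;;
    *)                 echo "run /_cluster/allocation/explain for details" ;;
  esac
}

triage_unassigned ALLOCATION_FAILED
# → POST /_cluster/reroute?retry_failed=true
```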

### 2. Clear read-only blocks

Remove index read-only blocks:

```bash
# Method 1: Clear read_only_allow_delete for a specific index
curl -X PUT "localhost:9200/my-index/_settings?pretty" -H 'Content-Type: application/json' -d '
{
  "index.blocks.read_only_allow_delete": null
}'

# Method 2: Clear for all indices
curl -X PUT "localhost:9200/_all/_settings?pretty" -H 'Content-Type: application/json' -d '
{
  "index.blocks.read_only_allow_delete": null
}'

# Method 3: Clear all block settings
curl -X PUT "localhost:9200/_all/_settings?pretty" -H 'Content-Type: application/json' -d '
{
  "index.blocks.read_only": null,
  "index.blocks.read_only_allow_delete": null,
  "index.blocks.metadata": null,
  "index.blocks.write": null
}'

# Verify blocks cleared
curl -X GET "localhost:9200/_all/_settings/index.blocks?pretty"
```

Clear cluster-level blocks:

```bash
# Check for cluster-wide blocks
curl -X GET "localhost:9200/_cluster/settings?pretty" | grep -A10 "blocks"

# Clear cluster blocks (if any)
curl -X PUT "localhost:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d '
{
  "persistent": {
    "cluster.blocks.read_only": null,
    "cluster.blocks.read_only_allow_delete": null
  }
}'
```

### 3. Fix disk watermark issues

Understand Elasticsearch disk watermarks:

```bash
# Default watermarks (percentage of disk used):
# - low: 85%         - ES stops allocating new shards to the node
# - high: 90%        - ES relocates shards away from the node
# - flood_stage: 95% - ES sets indices on the node to read-only

# Check current watermark settings
curl -X GET "localhost:9200/_cluster/settings?include_defaults=true&pretty" \
  | grep -A2 "watermark"

# Output:
# "cluster.routing.allocation.disk.watermark.low": "85%",
# "cluster.routing.allocation.disk.watermark.high": "90%",
# "cluster.routing.allocation.disk.watermark.flood_stage": "95%"

# Each watermark setting also accepts an absolute free-space value
# (e.g. "50gb") instead of a percentage.
```
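
The three thresholds form a ladder. A small sketch of which tier a node falls into, given the integer `disk.percent` from `_cat/allocation` (default percentages assumed; this helper is illustrative, not an ES API):

```shell
# Illustrative ladder of the default disk watermarks.
watermark_tier() {
  local disk_pct=$1
  if   [ "$disk_pct" -ge 95 ]; then echo "flood_stage: indices forced read-only"
  elif [ "$disk_pct" -ge 90 ]; then echo "high: shards relocated away from node"
  elif [ "$disk_pct" -ge 85 ]; then echo "low: no new shards allocated to node"
  else echo "ok"
  fi
}

watermark_tier 94   # → high: shards relocated away from node
```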

Adjust watermark thresholds:

```bash
# Option 1: Temporarily raise the flood stage to allow writes
# Use with caution - only if you need immediate write access
curl -X PUT "localhost:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d '
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.flood_stage": "97%"
  }
}'

# Option 2: Set watermarks as absolute free space (not percentage)
# More predictable behavior across different disk sizes
curl -X PUT "localhost:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d '
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "50gb",
    "cluster.routing.allocation.disk.watermark.high": "30gb",
    "cluster.routing.allocation.disk.watermark.flood_stage": "10gb"
  }
}'

# Option 3: Disable disk-based allocation decisions (NOT recommended for production)
curl -X PUT "localhost:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d '
{
  "persistent": {
    "cluster.routing.allocation.disk.threshold_enabled": false
  }
}'
```

Free disk space:

```bash
# Check disk usage per index
curl -X GET "localhost:9200/_cat/indices?v&s=store.size:desc"

# Check shard size distribution
curl -X GET "localhost:9200/_cat/shards?v&s=store:desc"

# Delete old indices (if safe)
curl -X DELETE "localhost:9200/logs-2024.01.*?pretty"

# Shrink large indices (the source index must be read-only with a copy of
# every shard on one node before shrinking)
curl -X POST "localhost:9200/large-index/_shrink/shrunk-index?pretty"

# Force merge to reclaim space held by deleted documents
# (only_expunge_deletes cannot be combined with max_num_segments)
curl -X POST "localhost:9200/_all/_forcemerge?only_expunge_deletes=true"

# Delete old documents by query
curl -X POST "localhost:9200/logs/_delete_by_query?pretty" -H 'Content-Type: application/json' -d '
{
  "query": {
    "range": {
      "@timestamp": {
        "lte": "now-30d"
      }
    }
  }
}'

# After freeing space, verify disk usage has dropped
# (ES polls disk stats periodically; see cluster.info.update.interval, default 30s)
curl -X GET "localhost:9200/_cat/allocation?v"
```
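
As a rough capacity check after cleanup, the arithmetic is simple: free space minus the flood-stage reserve, divided by daily ingest, gives days of headroom. A sketch (the 10 GB reserve is an assumed value for illustration, not an ES setting):

```shell
# Back-of-the-envelope headroom estimate (illustrative arithmetic only):
# whole days until free space shrinks to an assumed 10 GB flood-stage reserve.
days_until_flood() {
  local free_gb=$1 daily_gb=$2 reserve_gb=10
  echo $(( (free_gb - reserve_gb) / daily_gb ))
}

days_until_flood 60 5   # → 10
```

If the result is small, schedule index deletion or node expansion before the watermark is hit again.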

### 4. Re-enable shard allocation

Restore allocation after maintenance:

```bash
# Check current allocation settings
curl -X GET "localhost:9200/_cluster/settings?pretty" | grep allocation

# Common allocation settings:
# cluster.routing.allocation.enable: none          (all allocation disabled)
# cluster.routing.allocation.enable: primaries     (only primary shards)
# cluster.routing.allocation.enable: new_primaries (only new primary shards)
# cluster.routing.allocation.enable: all           (default - full allocation)

# Re-enable all shard allocation
curl -X PUT "localhost:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d '
{
  "persistent": {
    "cluster.routing.allocation.enable": "all"
  }
}'

# Retry failed shard allocations
curl -X POST "localhost:9200/_cluster/reroute?retry_failed=true&pretty"

# Force specific shard allocation (advanced)
curl -X POST "localhost:9200/_cluster/reroute?pretty" -H 'Content-Type: application/json' -d '
{
  "commands": [{
    "allocate_replica": {
      "index": "my-index",
      "shard": 0,
      "node": "node-1"
    }
  }]
}'
```

### 5. Fix cluster state issues

Recover cluster state:

```bash
# Check cluster state size in bytes (a very large state can cause issues)
curl -X GET "localhost:9200/_cluster/state/metadata" | wc -c

# If cluster state is very large (>100MB), consider cleanup

# Restart master-eligible nodes one at a time
# This forces a cluster state rebuild

# Stop master node (gracefully)
systemctl stop elasticsearch

# Wait for new master election
curl -X GET "localhost:9200/_cat/master?v"

# Start node again
systemctl start elasticsearch

# For severe corruption, use cluster state restoration
# WARNING: This is a last resort - back up first!

# Stop all nodes, then remove gateway state on each node:
rm -rf /path/to/elasticsearch/data/nodes/0/_state

# Start nodes one at a time
# Cluster will rebuild state from index metadata
```

Handle split-brain scenarios:

```bash
# Check for multiple masters (split-brain)
curl -X GET "http://node1:9200/_cat/master?v"
curl -X GET "http://node2:9200/_cat/master?v"

# If the nodes report different masters, you have split-brain

# Fix:
# 1. Stop all nodes
# 2. Determine which node has the correct data (most recent writes)
# 3. Start that node first
# 4. Update discovery settings to prevent future split-brain

# ES 6.x: configure minimum_master_nodes as (master_eligible_nodes / 2) + 1
# discovery.zen.minimum_master_nodes: 2

# ES 7+: the voting configuration handles quorum automatically.
# To exclude a node from voting before decommissioning it:
# POST /_cluster/voting_config_exclusions?node_names=<node_name>
# Then clear the exclusions once maintenance is done:
# DELETE /_cluster/voting_config_exclusions
```
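
The ES 6.x quorum formula above is simple integer arithmetic; a trivial helper for illustration (ES 7+ computes this automatically, so this is only relevant when maintaining 6.x clusters):

```shell
# Quorum formula for discovery.zen.minimum_master_nodes (ES 6.x and earlier):
# (master_eligible_nodes / 2) + 1, using integer division.
minimum_master_nodes() {
  local master_eligible=$1
  echo $(( master_eligible / 2 + 1 ))
}

minimum_master_nodes 3   # → 2
minimum_master_nodes 5   # → 3
```

Note that a 4-node master-eligible set yields a quorum of 3, which is why odd numbers of master-eligible nodes are preferred.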

### 6. Fix circuit breaker issues

Check circuit breaker status:

```bash
# Check circuit breaker stats
curl -X GET "localhost:9200/_nodes/stats/breaker?pretty"

# Output shows:
# "breakers": {
#   "request": {
#     "limit_size_in_bytes": 6442450944,
#     "limit_size": "6gb",
#     "estimated_size_in_bytes": 0,
#     "estimated_size": "0b",
#     "overhead": 1.0,
#     "tripped": 5          # Times circuit breaker tripped
#   },
#   "fielddata": {
#     "limit_size_in_bytes": 4294967296,
#     "limit_size": "4gb",
#     "estimated_size_in_bytes": 0,
#     "estimated_size": "0b",
#     "overhead": 1.03,
#     "tripped": 2
#   },
#   "in_flight_requests": {
#     "limit_size_in_bytes": 10737418240,
#     "limit_size": "10gb",
#     "estimated_size_in_bytes": 0,
#     "estimated_size": "0b",
#     "overhead": 2.0,
#     "tripped": 0
#   },
#   "parent": {
#     "limit_size_in_bytes": 7516192768,
#     "limit_size": "7gb",
#     "estimated_size_in_bytes": 0,
#     "estimated_size": "0b",
#     "overhead": 1.0,
#     "tripped": 10         # Parent breaker tripped = heap pressure
#   }
# }

# If the tripped counts keep rising, circuit breakers are blocking operations
```
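
To see how close a breaker is to tripping, divide `estimated_size_in_bytes` by `limit_size_in_bytes` from the stats above. A small illustrative helper (not an ES API; integer percent, rounded down):

```shell
# Illustrative: percent of a circuit breaker's limit currently in use.
breaker_used_pct() {
  local estimated=$1 limit=$2
  echo $(( estimated * 100 / limit ))
}

# Example: 5 GB estimated against the 7 GB parent limit shown above
breaker_used_pct 5368709120 7516192768   # → 71
```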

Adjust circuit breaker limits:

```bash
# Temporarily increase the parent circuit breaker
curl -X PUT "localhost:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d '
{
  "persistent": {
    "indices.breaker.total.limit": "80%"
  }
}'

# Adjust the request circuit breaker
curl -X PUT "localhost:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d '
{
  "persistent": {
    "indices.breaker.request.limit": "70%"
  }
}'

# Note: Increasing breaker limits risks OOM errors
# Better solution: Add more heap or optimize queries

# Check JVM heap usage
curl -X GET "localhost:9200/_nodes/stats/jvm?pretty" | grep -A10 "mem"

# If heap is consistently >75%, add more heap or more nodes
```

### 7. Prevent future read-only blocks

Implement ILM (Index Lifecycle Management):

```bash
# Create ILM policy to manage index lifecycle
curl -X PUT "localhost:9200/_ilm/policy/logs-policy?pretty" -H 'Content-Type: application/json' -d '
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": { "max_size": "50gb", "max_age": "7d" },
          "set_priority": { "priority": 100 }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "set_priority": { "priority": 50 },
          "shrink": { "number_of_shards": 1 },
          "forcemerge": { "max_num_segments": 1 }
        }
      },
      "cold": {
        "min_age": "30d",
        "actions": {
          "set_priority": { "priority": 0 },
          "freeze": {}
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": { "delete": {} }
      }
    }
  }
}'

# Apply policy to index template
curl -X PUT "localhost:9200/_index_template/logs-template?pretty" -H 'Content-Type: application/json' -d '
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "logs-policy",
      "index.lifecycle.rollover_alias": "logs"
    }
  }
}'
```
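
For a quick sanity check of the policy above, the phase an index would land in follows from its age and the `min_age` boundaries. This helper is illustrative only; ILM evaluates phases itself:

```shell
# Illustrative: which phase of the logs-policy above an index of a given
# age (in days) would be in, per the 7d/30d/90d min_age boundaries.
ilm_phase() {
  local age_days=$1
  if   [ "$age_days" -ge 90 ]; then echo "delete"
  elif [ "$age_days" -ge 30 ]; then echo "cold"
  elif [ "$age_days" -ge 7  ]; then echo "warm"
  else echo "hot"
  fi
}

ilm_phase 45   # → cold
```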

Set up monitoring and alerting:

```bash
# Monitor disk usage via _cat API
curl -X GET "localhost:9200/_cat/allocation?format=json" \
  | jq '.[] | select(."disk.percent" | tonumber > 85)'

# Create an alert for disk usage >85% using Elastic Watcher (illustrative sketch):
PUT _watcher/watch/disk-usage-alert
{
  "trigger": { "schedule": { "interval": "5m" } },
  "input": {
    "http": {
      "request": {
        "host": "localhost",
        "port": 9200,
        "path": "/_cat/allocation",
        "params": { "format": "json" }
      }
    }
  },
  "condition": {
    "script": {
      "source": "ctx.payload._value.stream().anyMatch(n -> Integer.parseInt(n['disk.percent']) > 85)"
    }
  },
  "actions": {
    "email": {
      "email": {
        "to": "ops@example.com",
        "subject": "Elasticsearch Disk Usage Alert",
        "body": "Disk usage exceeded 85% threshold"
      }
    }
  }
}

# Or use Prometheus + elasticsearch_exporter
# Alert rule:
# - alert: ElasticsearchDiskUsageHigh
#   expr: 1 - (elasticsearch_filesystem_data_available_bytes / elasticsearch_filesystem_data_size_bytes) > 0.85
#   for: 5m
#   labels:
#     severity: warning
#   annotations:
#     summary: "Elasticsearch disk usage high"
```

Prevention

  • Monitor disk usage with alerts at 70%, 80%, 85% thresholds
  • Implement ILM policies for automatic index rollover and deletion
  • Size cluster with 30%+ headroom for growth and shard relocation
  • Use hot-warm-cold architecture for cost-effective scaling
  • Configure appropriate shard sizes (10-50GB typical)
  • Set up circuit breaker monitoring and alerting
  • Regular cluster state optimization (delete unused indices)
  • Document runbook for read-only block recovery
  • Test recovery procedures in staging environment
  • Consider managed Elasticsearch (Elastic Cloud, AWS OpenSearch) for automatic management

Related Errors

  • **Yellow cluster status**: Replica shards unassigned
  • **Red cluster status**: Primary shards unassigned
  • **Circuit breaker tripped**: Memory limit exceeded
  • **Shard allocation failed**: No eligible nodes
  • **Cluster state too large**: Too many indices/mappings