Introduction

A red cluster status means at least one primary shard is unallocated, making some data completely unavailable. This is more severe than yellow (missing replicas) and requires immediate action. Common causes include permanent node loss, corrupted shard data, or allocation rules preventing shard assignment.
Symptoms

- `GET /_cluster/health` returns `"status": "red"`
- `GET /_cat/shards?v` lists primary shards (`prirep` column `p`) in state `UNASSIGNED`
- Search queries on affected indices return partial or no results
- `GET /_cluster/allocation/explain` shows that a primary cannot be allocated
- Some indices show `UNASSIGNED` with `store` size of 0 bytes
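The shard-level symptom can be checked more precisely than with a plain `grep`. The sketch below assumes the `_cat/shards` output was requested with the columns `index,shard,prirep,state,unassigned.reason`; the function name `unassigned_primaries` is illustrative and not part of any Elasticsearch tooling.

```shell
#!/usr/bin/env bash
# Reads `_cat/shards` rows (index shard prirep state reason) on stdin and
# prints each unassigned primary as "index shard reason".
# Matching on columns avoids false hits on index names that contain "p".
unassigned_primaries() {
  awk '$3 == "p" && $4 == "UNASSIGNED" { print $1, $2, ($5 == "" ? "-" : $5) }'
}
```

Against a live cluster this could be fed with `curl -s 'localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason' | unassigned_primaries`. An empty result suggests that no primary is currently unassigned.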
Common Causes

- Node with primary shard data permanently lost (disk failure, instance terminated)
- No surviving replica to promote: the index had no replicas, or every node holding a copy failed at the same time (Elasticsearch never places a replica on the same node as its primary)
- Index corruption preventing shard loading
- Allocation rules (`require`, `exclude`) preventing shard placement
- Snapshot restore interrupted, leaving index in incomplete state
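As a first triage step, the `unassigned.reason` code reported by `_cat/shards` can hint at which of these causes applies. The mapping below is a rough heuristic under that assumption, not an official classification, and the function name `likely_cause` is hypothetical; the allocation explain output remains the authoritative source.

```shell
#!/usr/bin/env bash
# Maps an `unassigned.reason` code to the likely cause from the list above.
# Heuristic only: confirm with the allocation explain API before acting.
likely_cause() {
  case "$1" in
    NODE_LEFT)
      echo "node lost: the node holding the shard left the cluster" ;;
    ALLOCATION_FAILED)
      echo "allocation failed: possible corruption, check allocation explain" ;;
    NEW_INDEX_RESTORED|EXISTING_INDEX_RESTORED)
      echo "snapshot restore: restore did not complete for this shard" ;;
    *)
      echo "unclear: check allocation explain for allocation rules or other deciders" ;;
  esac
}
```

For example, `likely_cause NODE_LEFT` points at node loss, while codes outside the three handled cases fall through to the generic advice.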
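Step-by-Step Fix

1. **Identify the unassigned primary shards**:
   ```bash
   # Default columns: index shard prirep state docs store ip node
   # Match on the prirep and state columns; a bare `grep p` also matches index names containing "p"
   curl -s 'localhost:9200/_cat/shards?v' | awk 'NR == 1 || ($3 == "p" && $4 == "UNASSIGNED")'
   ```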
2. **Get detailed allocation explanation**:
   ```bash
   # With no request body, this explains an arbitrary unassigned shard
   curl -s 'localhost:9200/_cluster/allocation/explain?pretty'
   # Look for the specific reason the primary cannot be allocated
   ```
3. **If data is permanently lost, allocate empty primaries**:
   ```bash
   # Irreversible: the shard is recreated empty and any data it held is discarded.
   # If a usable snapshot exists, restore it (step 5) instead.
   curl -X POST 'localhost:9200/_cluster/reroute' -H 'Content-Type: application/json' -d '{
     "commands": [{
       "allocate_empty_primary": {
         "index": "logs-2026.04.01",
         "shard": 0,
         "node": "data-node-1",
         "accept_data_loss": true
       }
     }]
   }'
   ```
4. **For all affected indices at once, use the Reroute API**:
   ```bash
   # Target node: set NODE to a data node with free disk; defaults to the first node listed
   node="${NODE:-$(curl -s 'localhost:9200/_cat/nodes?h=name' | head -1)}"
   # Get all unassigned primaries (column 3 = prirep, column 4 = state)
   curl -s 'localhost:9200/_cat/shards' | awk '$3 == "p" && $4 == "UNASSIGNED" {print $1, $2}' | \
   while read -r index shard; do
     # Each call discards whatever data the shard held
     curl -X POST 'localhost:9200/_cluster/reroute' -H 'Content-Type: application/json' -d "{
       \"commands\": [{
         \"allocate_empty_primary\": {
           \"index\": \"$index\",
           \"shard\": $shard,
           \"node\": \"$node\",
           \"accept_data_loss\": true
         }
       }]
     }"
   done
   ```
5. **Recover from snapshot if data is critical**:
   ```bash
   # List available snapshots
   curl -s 'localhost:9200/_snapshot/my_repo/_all?pretty'

   # Close the broken index
   curl -X POST 'localhost:9200/logs-2026.04.01/_close'

   # Restore from snapshot
   curl -X POST 'localhost:9200/_snapshot/my_repo/snapshot_20260401/_restore' -H 'Content-Type: application/json' -d '{
     "indices": "logs-2026.04.01",
     "ignore_unavailable": true
   }'
   ```
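Because `allocate_empty_primary` cannot be undone, it may be worth previewing the bulk reroute from step 4 before running it. The sketch below builds a single request body from `index shard` pairs so it can be sent with the reroute API's `dry_run=true` query parameter. The function name `build_reroute_body` and the node name `data-node-1` are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Builds one _cluster/reroute body from "index shard" pairs on stdin.
# $1 is the target node name (assumed to be a data node with free disk).
build_reroute_body() {
  local node="$1"
  awk -v node="$node" '
    BEGIN { printf "{\"commands\":[" }
    NF >= 2 {
      # Separate commands with commas after the first one
      if (n++) printf ","
      printf "{\"allocate_empty_primary\":{\"index\":\"%s\",\"shard\":%d,\"node\":\"%s\",\"accept_data_loss\":true}}", $1, $2, node
    }
    END { print "]}" }
  '
}
```

One way to use it is to pipe the index and shard pairs from step 4 into `build_reroute_body data-node-1` and pass the result to `curl -s -X POST 'localhost:9200/_cluster/reroute?dry_run=true' -H 'Content-Type: application/json' -d @-`. With `dry_run=true` the cluster reports the resulting state without applying it; dropping the parameter performs the allocation for real.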