# How to Fix Elasticsearch Snapshot Restore Failures

Restoring from a snapshot should be straightforward, but various issues can cause failures. Whether you're dealing with repository problems, version incompatibilities, or shard allocation errors, here's how to diagnose and fix snapshot restore issues.

## Recognizing Restore Failures

When a restore fails, you'll see errors like:

```json
{
  "error": {
    "root_cause": [
      {
        "type": "snapshot_restore_exception",
        "reason": "[backup-repo:snapshot-2024-01-15] cannot restore index [logs-2024-01] because an open index with same name already exists in the cluster"
      }
    ],
    "type": "snapshot_restore_exception",
    "reason": "[backup-repo:snapshot-2024-01-15] cannot restore index [logs-2024-01]"
  },
  "status": 400
}
```

A partial failure instead returns a response that reports the failed shards:

```json
{
  "snapshot" : "snapshot-2024-01-15",
  "indices" : [ "logs-2024-01", "logs-2024-02" ],
  "shards" : {
    "total" : 10,
    "failed" : 3,
    "successful" : 7
  }
}
```
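When a restore response is captured in a script, the failed-shard count can be pulled out and turned into an exit status. The following is a minimal sketch that uses only grep; `failed_shards` and `restore_ok` are hypothetical helper names, and the parsing assumes the response contains a single "failed" field shaped like the example above.

```bash
# Print the "failed" shard count from a restore response read on stdin.
# Assumption: the body contains one "failed" field, as in the sample response.
failed_shards() {
  grep -o '"failed"[[:space:]]*:[[:space:]]*[0-9][0-9]*' | head -n 1 | grep -o '[0-9][0-9]*$'
}

# Succeed only when the response reports zero failed shards.
# A response without a "failed" field is treated as a failure.
restore_ok() {
  local failed
  failed=$(failed_shards)
  [ "${failed:-1}" -eq 0 ]
}

# Possible usage, assuming the response was saved to restore-response.json:
#   restore_ok < restore-response.json || echo "some shards failed to restore"
```

Treating a missing field as a failure is a deliberate choice here, so that an error response never passes the check by accident.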

## Pre-Restore Checklist

Before attempting a restore, verify your snapshot repository:

```bash
# List registered repositories
curl -X GET "localhost:9200/_snapshot?pretty"

# Verify repository is accessible
curl -X POST "localhost:9200/_snapshot/backup-repo/_verify?pretty"

# List available snapshots
curl -X GET "localhost:9200/_snapshot/backup-repo/_all?pretty"
```

Check snapshot details:

```bash
curl -X GET "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15?pretty"
```

```json
{
  "snapshots" : [
    {
      "snapshot" : "snapshot-2024-01-15",
      "uuid" : "abc123...",
      "version_id" : 8010199,
      "version" : "8.1.1",
      "indices" : [ "logs-2024-01", "products", "users" ],
      "data_streams" : [ ],
      "include_global_state" : true,
      "state" : "SUCCESS",
      "start_time" : "2024-01-15T00:00:00.000Z",
      "end_time" : "2024-01-15T00:30:00.000Z"
    }
  ]
}
```

## Problem 1: Index Already Exists

The most common error is trying to restore an index while an open index with the same name already exists:

```json
{
  "error": {
    "type": "snapshot_restore_exception",
    "reason": "cannot restore index [logs-2024-01] because an open index with same name already exists"
  }
}
```

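### Solution A: Close the Existing Index

A restore into a closed index overwrites its contents, and it only works if the closed index has the same number of shards as the index in the snapshot.

```bash
# Close the existing index so the restore can overwrite it
curl -X POST "localhost:9200/logs-2024-01/_close"

curl -X POST "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_restore" -H 'Content-Type: application/json' -d'
{
  "indices": "logs-2024-01"
}
'
```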

### Solution B: Delete the Existing Index

```bash
# Warning: This deletes all data in the index
curl -X DELETE "localhost:9200/logs-2024-01"

curl -X POST "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_restore" -H 'Content-Type: application/json' -d'
{
  "indices": "logs-2024-01"
}
'
```

### Solution C: Restore to a Different Index Name

```bash
curl -X POST "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_restore" -H 'Content-Type: application/json' -d'
{
  "indices": "logs-2024-01",
  "rename_pattern": "(.+)",
  "rename_replacement": "$1-restored"
}
'
```

This restores logs-2024-01 as logs-2024-01-restored.
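Before running a renamed restore, it can help to preview the name a pattern would produce. The sketch below approximates the rename locally with sed; `preview_rename` is a hypothetical helper, not part of Elasticsearch. It assumes the pattern is simple enough to behave the same in sed's extended regex syntax as in the Java regex syntax Elasticsearch uses, and that it contains no `/` characters.

```bash
# Approximate the effect of rename_pattern / rename_replacement on one index name.
# Arguments: index name, rename_pattern, rename_replacement (Java style, e.g. '$1-restored').
preview_rename() {
  local index=$1 pattern=$2 replacement=$3
  local sed_replacement
  # Translate Java-style group references ($1) into sed's form (\1).
  sed_replacement=$(printf '%s' "$replacement" | sed -E 's/\$([0-9])/\\\1/g')
  # Replace every match, mirroring a replace-all on the index name.
  printf '%s\n' "$index" | sed -E "s/${pattern}/${sed_replacement}/g"
}

# Possible usage (single quotes keep the shell from expanding $1):
#   preview_rename 'logs-2024-01' '(.+)' '$1-restored'
```

Patterns that rely on Java-only regex features, such as lookaheads, will not preview correctly with this approach.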

## Problem 2: Version Incompatibility

A snapshot whose version the cluster cannot read fails with an error like:

```json
{
  "error": {
    "type": "snapshot_restore_exception",
    "reason": "the snapshot was created with Elasticsearch version [6.8.0] which is not compatible with the current version [8.5.0]"
  }
}
```

### Solution: Upgrade Path

A snapshot cannot be restored into an older Elasticsearch version, and an index generally restores only into the same major version or the next one (for example 7.x into 8.x, but not 6.x into 8.x). When the gap is larger, the options are:

1. Restore to a cluster on a compatible major version, then reindex into a new index on the target version
2. Use the upgrade assistant for version migration
3. Export and import the data as JSON for cross-version migration

```bash
# Check current version
curl -X GET "localhost:9200"

# For minor version differences, upgrade the cluster first,
# then snapshot and restore
```

## Problem 3: Repository Not Accessible

Repository verification fails:

```bash
curl -X POST "localhost:9200/_snapshot/backup-repo/_verify?pretty"
```

```json
{
  "error": {
    "root_cause": [
      {
        "type": "repository_verification_exception",
        "reason": "[backup-repo] path is not accessible"
      }
    ],
    "type": "repository_verification_exception",
    "reason": "[backup-repo] path is not accessible"
  },
  "status": 500
}
```

### Solution: Fix Repository Configuration

Check the repository configuration:

```bash
curl -X GET "localhost:9200/_snapshot/backup-repo?pretty"
```

For file system repositories, verify that the path is listed under path.repo and is accessible on every master and data node:

```bash
# Check elasticsearch.yml for the path.repo setting
grep path.repo /etc/elasticsearch/elasticsearch.yml

# Verify the directory exists and is writable
ls -la /mnt/backups/elasticsearch

# Fix ownership if the elasticsearch user cannot write to it
sudo chown -R elasticsearch:elasticsearch /mnt/backups/elasticsearch
```

Re-register the repository:

```bash
curl -X PUT "localhost:9200/_snapshot/backup-repo" -H 'Content-Type: application/json' -d'
{
  "type": "fs",
  "settings": {
    "location": "/mnt/backups/elasticsearch",
    "compress": true
  }
}
'
```

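For S3 repositories, check that valid credentials are stored in the Elasticsearch keystore on every node (the s3.client.default.access_key and s3.client.default.secret_key secure settings), then re-register the repository:

```bash
curl -X PUT "localhost:9200/_snapshot/s3-backups" -H 'Content-Type: application/json' -d'
{
  "type": "s3",
  "settings": {
    "bucket": "my-elasticsearch-backups",
    "region": "us-east-1",
    "base_path": "snapshots"
  }
}
'
```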

## Problem 4: Corrupted Snapshot

A damaged or incomplete snapshot typically fails with a generic error:

```json
{
  "error": {
    "type": "snapshot_restore_exception",
    "reason": "[backup-repo:snapshot-2024-01-15] failed to restore snapshot"
  }
}
```

Check the snapshot status to see which shards were captured successfully:

```bash
curl -X GET "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_status?pretty"
```

### Solution: Restore Specific Indices

If only some indices are corrupted, restore only the good ones:

```bash
# List indices in the snapshot
curl -X GET "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15?pretty"

# Restore specific indices
curl -X POST "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_restore" -H 'Content-Type: application/json' -d'
{
  "indices": "products,users",
  "ignore_unavailable": true
}
'
```

## Problem 5: Insufficient Disk Space

Restore fails due to disk space:

```json
{
  "error": {
    "type": "illegal_state_exception",
    "reason": "not enough disk space to restore snapshot"
  }
}
```

### Solution: Free Space or Add Nodes

```bash
# Check current disk usage
curl -X GET "localhost:9200/_cat/allocation?v"

# Delete unnecessary indices (wildcard deletes are rejected while
# action.destructive_requires_name is true, the default in 8.x)
curl -X DELETE "localhost:9200/old-logs-*"

# Or add nodes for more capacity
```

As a stopgap, temporarily raise the disk watermarks above their defaults (85%, 90%, and 95%), and reset them once the restore completes:

```bash
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "90%",
    "cluster.routing.allocation.disk.watermark.high": "95%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "98%"
  }
}
'
```
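To see which nodes are actually at or past a watermark, the allocation output can be filtered by disk percentage. This is a minimal sketch: `nodes_over_watermark` is a hypothetical helper that expects two whitespace-separated columns on stdin (node name, then disk usage percent), which is the shape `_cat/allocation` returns when called with `h=node,disk.percent`.

```bash
# Print nodes whose disk usage is at or above a threshold (default 85, the
# default low watermark). Rows without a numeric percentage, such as the
# UNASSIGNED row, are skipped.
nodes_over_watermark() {
  local threshold=${1:-85}
  awk -v limit="$threshold" \
    '$2 ~ /^[0-9]+$/ && $2 + 0 >= limit + 0 { print $1 " " $2 "%" }'
}

# Possible usage against a live cluster:
#   curl -s "localhost:9200/_cat/allocation?h=node,disk.percent" | nodes_over_watermark 85
```

An empty result suggests that disk watermarks are not what is blocking the restore.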

## Problem 6: Shard Allocation Failures

Shards can fail to allocate during a restore. List the ones that have not started:

```bash
curl -X GET "localhost:9200/_cat/shards?v&s=state" | grep -v STARTED
```

### Solution: Check Allocation Settings

```bash
# Check allocation settings
curl -X GET "localhost:9200/_cluster/settings?include_defaults=true&flat_settings=true&pretty" | grep allocation

# Ensure allocation is enabled
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.enable": "all"
  }
}
'
```

Check for cluster-level shard allocation filters (include, exclude, and require rules):

```bash
curl -X GET "localhost:9200/_cluster/settings?include_defaults=true&flat_settings=true&pretty" | grep -E "allocation\.(include|exclude|require)"
```
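When many shards are involved, a per-state count is easier to read than the raw shard list. The sketch below is one way to summarize it with awk; `summarize_unstarted` is a hypothetical helper that assumes four whitespace-separated columns on stdin (index, shard, prirep, state), as returned by `_cat/shards` with `h=index,shard,prirep,state`.

```bash
# Count shards per state, ignoring those that are already STARTED.
summarize_unstarted() {
  awk '$4 != "" && $4 != "STARTED" { count[$4]++ }
       END { for (state in count) print state, count[state] }' | sort
}

# Possible usage against a live cluster:
#   curl -s "localhost:9200/_cat/shards?h=index,shard,prirep,state" | summarize_unstarted
```

A steady UNASSIGNED count that never drops usually points at allocation settings or disk limits rather than a slow restore.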

## Problem 7: Global State Conflicts

Restoring global state (templates, ILM policies) can conflict:

```json
{
  "error": {
    "type": "snapshot_restore_exception",
    "reason": "cannot restore global state because it already exists"
  }
}
```

### Solution: Exclude or Overwrite Global State

Exclude global state:

```bash
curl -X POST "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_restore" -H 'Content-Type: application/json' -d'
{
  "indices": "*",
  "include_global_state": false
}
'
```

Or delete existing conflicting objects first:

```bash
# List templates
curl -X GET "localhost:9200/_cat/templates?v"

# Delete a specific legacy template (composable templates live under _index_template)
curl -X DELETE "localhost:9200/_template/my-template"
```

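## Monitoring Restore Progress

A restore runs through the shard recovery process, so track it with the recovery APIs. The snapshot status endpoint (_snapshot/backup-repo/snapshot-2024-01-15/_status) describes how the snapshot itself was created, not how a restore is progressing.

```bash
# Shards that are still recovering; restored shards show recovery type "snapshot"
curl -X GET "localhost:9200/_cat/recovery?v&active_only=true"

# Per-shard detail for one restored index
curl -X GET "localhost:9200/logs-2024-01/_recovery?pretty"
```

To block until the restore completes, pass wait_for_completion on the restore request itself:

```bash
curl -X POST "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_restore?wait_for_completion=true" -H 'Content-Type: application/json' -d'
{
  "indices": "logs-2024-01"
}
'
```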
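Because a restore runs through the shard recovery process, a script can decide whether it has finished from `_cat/recovery` output. The sketch below is one way to do that; `restore_finished` is a hypothetical helper that assumes four whitespace-separated columns on stdin (index, shard, type, stage), as returned by `_cat/recovery` with `h=index,shard,type,stage`.

```bash
# Succeed only when every snapshot-type recovery has reached the "done" stage.
# Rows for other recovery types (peer, existing store) are ignored, and input
# with no snapshot-type rows counts as finished.
restore_finished() {
  awk 'tolower($3) == "snapshot" && tolower($4) != "done" { pending++ }
       END { exit (pending > 0) ? 1 : 0 }'
}

# Possible polling loop against a live cluster:
#   until curl -s "localhost:9200/_cat/recovery?h=index,shard,type,stage" | restore_finished; do
#     sleep 10
#   done
```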

## Best Practices for Successful Restores

### 1. Test Your Backups

Regularly test restore procedures:

```bash
# Create a test restore
curl -X POST "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_restore" -H 'Content-Type: application/json' -d'
{
  "indices": "logs-2024-01",
  "rename_pattern": "(.+)",
  "rename_replacement": "restored_test_$1"
}
'

# Verify data
curl -X GET "localhost:9200/restored_test_logs-2024-01/_count"

# Clean up (named explicitly, since wildcard deletes are rejected by default in 8.x)
curl -X DELETE "localhost:9200/restored_test_logs-2024-01"
```

### 2. Use a Restore Configuration File

For complex restores, spell out every option in one request body. Saved to a file, the same body can be passed to curl with -d @restore.json instead of inline:

```bash
curl -X POST "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_restore" -H 'Content-Type: application/json' -d'
{
  "indices": "logs-*,products,users",
  "ignore_unavailable": true,
  "include_global_state": false,
  "rename_pattern": "(.+)",
  "rename_replacement": "restored_$1",
  "include_aliases": false
}
'
```

### 3. Verify Index Settings After Restore

```bash
curl -X GET "localhost:9200/restored_logs-2024-01/_settings?pretty"

# Adjust replica count if needed
curl -X PUT "localhost:9200/restored_logs-2024-01/_settings" -H 'Content-Type: application/json' -d'
{
  "index": {
    "number_of_replicas": 1
  }
}
'
```

## Troubleshooting Checklist

When a restore fails, check these in order:

1. Repository accessible? POST _snapshot/repo/_verify
2. Snapshot valid? GET _snapshot/repo/snapshot
3. Index exists? GET _cat/indices
4. Disk space available? GET _cat/allocation
5. Allocation enabled? GET _cluster/settings
6. Version compatible? Check snapshot vs cluster version
7. Shards allocating? GET _cat/shards?v&s=state

## Summary

Snapshot restore failures are usually caused by:

1. Existing indices with the same name
2. Version incompatibility
3. Repository access issues
4. Corrupted snapshots
5. Disk space limitations
6. Shard allocation settings
7. Global state conflicts

Resolve by closing or renaming indices, ensuring repository access, freeing disk space, and using proper restore options. Always test your backup restore process before you need it in an emergency.