# How to Fix Elasticsearch Snapshot Restore Failures
Restoring from a snapshot should be straightforward, but various issues can cause failures. Whether you're dealing with repository problems, version incompatibilities, or shard allocation errors, here's how to diagnose and fix snapshot restore issues.

## Recognizing Restore Failures

When a restore fails, you'll see errors like:

```json
{
  "error": {
    "root_cause": [
      {
        "type": "snapshot_restore_exception",
        "reason": "[backup-repo:snapshot-2024-01-15] cannot restore index [logs-2024-01] because an open index with same name already exists in the cluster"
      }
    ],
    "type": "snapshot_restore_exception",
    "reason": "[backup-repo:snapshot-2024-01-15] cannot restore index [logs-2024-01]"
  },
  "status": 400
}
```

Or during partial failure:

```json
{
  "snapshot" : "snapshot-2024-01-15",
  "indices" : [ "logs-2024-01", "logs-2024-02" ],
  "shards" : {
    "total" : 10,
    "failed" : 3,
    "successful" : 7
  }
}
```

## Pre-Restore Checklist

Before attempting a restore, verify your snapshot repository:

```bash
# List registered repositories
curl -X GET "localhost:9200/_snapshot?pretty"

# Verify repository is accessible
curl -X POST "localhost:9200/_snapshot/backup-repo/_verify?pretty"

# List available snapshots
curl -X GET "localhost:9200/_snapshot/backup-repo/_all?pretty"
```

Check snapshot details:

```bash
curl -X GET "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15?pretty"
```

```json
{
  "snapshots" : [
    {
      "snapshot" : "snapshot-2024-01-15",
      "uuid" : "abc123...",
      "version_id" : 8010199,
      "version" : "8.1.1",
      "indices" : [ "logs-2024-01", "products", "users" ],
      "data_streams" : [ ],
      "include_global_state" : true,
      "state" : "SUCCESS",
      "start_time" : "2024-01-15T00:00:00.000Z",
      "end_time" : "2024-01-15T00:30:00.000Z"
    }
  ]
}
```

## Problem 1: Index Already Exists

The most common error is trying to restore an index while an open index with the same name already exists:

```json
{
  "error": {
    "type": "snapshot_restore_exception",
    "reason": "cannot restore index [logs-2024-01] because an open index with same name already exists"
  }
}
```

### Solution A: Close the Existing Index

```bash
# Close the existing index so the restore can overwrite it
curl -X POST "localhost:9200/logs-2024-01/_close"

# Restore the index from the snapshot
curl -X POST "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_restore" -H 'Content-Type: application/json' -d'
{
  "indices": "logs-2024-01"
}
'
```

### Solution B: Delete the Existing Index

```bash
# Warning: This deletes all data in the index
curl -X DELETE "localhost:9200/logs-2024-01"

curl -X POST "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_restore" -H 'Content-Type: application/json' -d'
{
  "indices": "logs-2024-01"
}
'
```

### Solution C: Restore to a Different Index Name

```bash
curl -X POST "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_restore" -H 'Content-Type: application/json' -d'
{
  "indices": "logs-2024-01",
  "rename_pattern": "(.+)",
  "rename_replacement": "$1-restored"
}
'
```

This restores `logs-2024-01` as `logs-2024-01-restored`.
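
Because the rename applies to every index matched by `indices`, it can help to preview the resulting names before sending the request. The following is a minimal sketch that approximates the substitution locally with `sed`. `preview_rename` is a hypothetical helper, not part of Elasticsearch, and it assumes the pattern means the same thing in POSIX extended regex as it does in the Java regex that Elasticsearch applies.

```bash
# Approximate rename_pattern / rename_replacement locally.
# Assumption: the pattern is simple enough to behave identically in sed -E
# and contains no "/" (index names cannot contain one anyway).
preview_rename() {
  local index="$1" pattern="$2" replacement="$3"
  local sed_replacement
  # Elasticsearch writes capture groups as $1; sed expects \1.
  sed_replacement=$(printf '%s' "$replacement" | sed -E 's/\$([0-9])/\\\1/g')
  printf '%s\n' "$index" | sed -E "s/${pattern}/${sed_replacement}/g"
}

preview_rename "logs-2024-01" "(.+)" '$1-restored'   # prints: logs-2024-01-restored
```

Note the single quotes around the replacement: in double quotes the shell would expand `$1` before `sed` ever sees it.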

## Problem 2: Version Incompatibility

Snapshot version mismatch errors:

```json
{
  "error": {
    "type": "snapshot_restore_exception",
    "reason": "the snapshot was created with Elasticsearch version [7.10.0] which is not compatible with the current version [8.5.0]"
  }
}
```

### Solution: Upgrade Path

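Elasticsearch cannot restore a snapshot into a cluster running an older version than the one that created it, and it cannot directly restore indices created more than one major version before the target cluster. Outside that window, the options are:

1. Restore into a cluster on a compatible major version, then reindex into a new index on the target version.
2. Use the Upgrade Assistant in Kibana to prepare indices before a major version upgrade.
3. Export the data as JSON and import it into the target cluster.
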
```bash
# Check current version
curl -X GET "localhost:9200"

# For minor version differences, upgrade the cluster first
# Then snapshot and restore
```
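
To see at a glance how far apart the snapshot and the cluster are, the two version strings can be compared directly. The sketch below is one way to do that without extra tooling. `json_string_field` and `major_gap` are hypothetical helpers, and the field extraction assumes pretty-printed output with one `"field" : "value"` pair per line, as in the snapshot details shown earlier.

```bash
# Print the first string value of a JSON field read from stdin.
# Assumption: pretty-printed JSON, one field per line.
json_string_field() {
  sed -n "s/.*\"$1\" *: *\"\([^\"]*\)\".*/\1/p" | head -n 1
}

# Print how many major versions the cluster is ahead of the snapshot
# (0 = same major, negative = snapshot is newer than the cluster).
major_gap() {
  local snapshot_version="$1" cluster_version="$2"
  echo $(( ${cluster_version%%.*} - ${snapshot_version%%.*} ))
}

# Usage (needs a reachable cluster, so it is left commented out):
# snap=$(curl -s "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15?pretty" | json_string_field version)
# node=$(curl -s "localhost:9200/?pretty" | json_string_field number)
# major_gap "$snap" "$node"
```

A result of `0` means both sides share a major version; anything else is a cue to check the compatibility rules for the exact versions involved.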

## Problem 3: Repository Not Accessible

Repository verification fails:

```bash
curl -X POST "localhost:9200/_snapshot/backup-repo/_verify?pretty"
```

```json
{
  "error": {
    "root_cause": [
      {
        "type": "repository_verification_exception",
        "reason": "[backup-repo] path is not accessible"
      }
    ],
    "type": "repository_verification_exception",
    "reason": "[backup-repo] path is not accessible"
  },
  "status": 500
}
```

### Solution: Fix Repository Configuration

Check the repository configuration:

```bash
curl -X GET "localhost:9200/_snapshot/backup-repo?pretty"
```

For file system repositories, verify the path is accessible:

```bash
# Check elasticsearch.yml for path.repo setting
grep path.repo /etc/elasticsearch/elasticsearch.yml

# Verify the directory exists and is writable
ls -la /mnt/backups/elasticsearch

# Fix ownership so the elasticsearch user can read and write the directory
sudo chown -R elasticsearch:elasticsearch /mnt/backups/elasticsearch
```
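
The directory checks above can be folded into a single test that reports a clear result. This is a sketch only: `check_repo_path` is a hypothetical helper, and it tests access for whichever user runs it, so it is only meaningful when run as the user that runs Elasticsearch.

```bash
# Report whether a snapshot directory is usable by the current user.
check_repo_path() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    return 1
  fi
  # The directory must be readable, writable, and traversable.
  if [ ! -r "$dir" ] || [ ! -w "$dir" ] || [ ! -x "$dir" ]; then
    echo "not accessible: $dir"
    return 1
  fi
  echo "ok: $dir"
}

# Usage, assuming the service account is named elasticsearch:
# sudo -u elasticsearch bash -c "$(declare -f check_repo_path); check_repo_path /mnt/backups/elasticsearch"
```

On a multi-node cluster the same check has to pass on every node, since each node accesses the shared path itself.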
Re-register the repository:

```bash
curl -X PUT "localhost:9200/_snapshot/backup-repo" -H 'Content-Type: application/json' -d'
{
  "type": "fs",
  "settings": {
    "location": "/mnt/backups/elasticsearch",
    "compress": true
  }
}
'
```

For S3 repositories, check credentials. They are stored in the Elasticsearch keystore (`s3.client.default.access_key` and `s3.client.default.secret_key`), not in the repository settings:

```bash
curl -X PUT "localhost:9200/_snapshot/s3-backups" -H 'Content-Type: application/json' -d'
{
  "type": "s3",
  "settings": {
    "bucket": "my-elasticsearch-backups",
    "region": "us-east-1",
    "base_path": "snapshots"
  }
}
'
```

## Problem 4: Corrupted Snapshot

Snapshot integrity issues:

```json
{
  "error": {
    "type": "snapshot_restore_exception",
    "reason": "[backup-repo:snapshot-2024-01-15] failed to restore snapshot"
  }
}
```

Check snapshot status:

```bash
curl -X GET "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_status?pretty"
```

### Solution: Restore Specific Indices

If only some indices are corrupted, restore only the good ones:

```bash
# List indices in the snapshot
curl -X GET "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15?pretty"

# Restore specific indices
curl -X POST "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_restore" -H 'Content-Type: application/json' -d'
{
  "indices": "products,users",
  "ignore_unavailable": true
}
'
```
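
When the list of healthy indices changes between attempts, building the request body from arguments avoids hand-editing JSON each time. A minimal sketch, assuming index names contain no characters that need JSON escaping; `restore_body` is a hypothetical helper.

```bash
# Build a restore request body for an explicit list of indices.
restore_body() {
  local IFS=,
  # "$*" joins the arguments with the first character of IFS (a comma).
  printf '{"indices":"%s","ignore_unavailable":true}\n' "$*"
}

restore_body products users
# prints: {"indices":"products,users","ignore_unavailable":true}

# Usage (needs a reachable cluster, so it is left commented out):
# curl -X POST "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_restore" \
#   -H 'Content-Type: application/json' -d "$(restore_body products users)"
```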

## Problem 5: Insufficient Disk Space

Restore fails due to disk space:

```json
{
  "error": {
    "type": "illegal_state_exception",
    "reason": "not enough disk space to restore snapshot"
  }
}
```

### Solution: Free Space or Add Nodes

```bash
# Check current disk usage
curl -X GET "localhost:9200/_cat/allocation?v"

# Delete unnecessary indices
# (on 8.x, wildcard deletes are rejected unless action.destructive_requires_name is false)
curl -X DELETE "localhost:9200/old-logs-*"

# Or add nodes for more capacity
```
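
To spot the nodes that are actually short on space, the allocation output can be filtered by disk usage. The sketch below assumes the cat API is asked for just two columns with `h=disk.percent,node`; `nodes_over` is a hypothetical helper, and 85 is used in the usage line because it is the default low watermark.

```bash
# Read "disk.percent node" rows from stdin and print the nodes
# at or above the given usage percentage.
nodes_over() {
  awk -v limit="$1" '$1 ~ /^[0-9]+$/ && $1 + 0 >= limit + 0 { print $2 }'
}

# Usage (needs a reachable cluster, so it is left commented out):
# curl -s "localhost:9200/_cat/allocation?h=disk.percent,node" | nodes_over 85
```

Rows without a numeric percentage, such as the unassigned-shards row, are skipped by the numeric check.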

Temporarily raise the disk watermarks above their defaults (85%, 90%, and 95%), then reset them once the restore completes:

```bash
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "90%",
    "cluster.routing.allocation.disk.watermark.high": "95%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "98%"
  }
}
'
```

## Problem 6: Shard Allocation Failures

Shards fail to allocate during restore:

```bash
curl -X GET "localhost:9200/_cat/shards?v&s=state" | grep -v STARTED
```

### Solution: Check Allocation Settings

```bash
# Check allocation settings
curl -X GET "localhost:9200/_cluster/settings?include_defaults=true&flat_settings=true&pretty" | grep allocation

# Ensure allocation is enabled
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.enable": "all"
  }
}
'
```
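
After changing allocation settings, it is worth confirming that the restored shards actually start. The sketch below filters on the state column instead of grepping the whole line, which avoids accidental matches on index names. It assumes the cat API is asked for exactly four columns with `h=index,shard,prirep,state`; `unstarted_shards` and `state_counts` are hypothetical helpers.

```bash
# Read "index shard prirep state" rows and print those not yet STARTED.
unstarted_shards() {
  awk '$4 != "STARTED"'
}

# Summarize how many shards are in each state, sorted by state name.
state_counts() {
  awk '{ n[$4]++ } END { for (s in n) print s, n[s] }' | sort
}

# Usage (needs a reachable cluster, so it is left commented out):
# curl -s "localhost:9200/_cat/shards?h=index,shard,prirep,state" | unstarted_shards
# curl -s "localhost:9200/_cat/shards?h=index,shard,prirep,state" | state_counts
```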
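
Check for cluster-level shard allocation filtering (index-level filters live in each index's own settings):

```bash
curl -X GET "localhost:9200/_cluster/settings?include_defaults=true&flat_settings=true&pretty" | grep -E "allocation\.(include|exclude|require)"
```

## Problem 7: Global State Conflicts
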
Restoring global state (templates, ILM policies) can conflict:

```json
{
  "error": {
    "type": "snapshot_restore_exception",
    "reason": "cannot restore global state because it already exists"
  }
}
```

### Solution: Exclude or Overwrite Global State

Exclude global state:

```bash
curl -X POST "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_restore" -H 'Content-Type: application/json' -d'
{
  "indices": "*",
  "include_global_state": false
}
'
```

Or delete existing conflicting objects first:

```bash
# List templates
curl -X GET "localhost:9200/_cat/templates?v"

# Delete specific template (legacy API; composable templates use _index_template)
curl -X DELETE "localhost:9200/_template/my-template"
```
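
## Monitoring Restore Progress

The snapshot status API reports the state of the snapshot itself, shard by shard. It is mainly useful while a snapshot is being created. Shards of an index that is being restored appear in the recovery API (`GET _cat/recovery?v&active_only=true`) with type `snapshot`.

```bash
curl -X GET "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_status?pretty"
```

```json
{
  "snapshots" : [
    {
      "snapshot" : "snapshot-2024-01-15",
      "repository" : "backup-repo",
      "state" : "IN_PROGRESS",
      "shards_stats" : {
        "initializing" : 0,
        "started" : 5,
        "finalizing" : 2,
        "done" : 3,
        "failed" : 0,
        "total" : 10
      }
    }
  ]
}
```

To block until the restore finishes, pass `wait_for_completion=true` on the restore request itself:

```bash
curl -X POST "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_restore?wait_for_completion=true" -H 'Content-Type: application/json' -d '{"indices": "logs-2024-01"}'
```

## Best Practices for Successful Restores
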
### 1. Test Your Backups

Regularly test restore procedures:

```bash
# Create a test restore
curl -X POST "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_restore" -H 'Content-Type: application/json' -d'
{
  "indices": "logs-2024-01",
  "rename_pattern": "(.+)",
  "rename_replacement": "restored_test_$1"
}
'

# Verify data
curl -X GET "localhost:9200/restored_test_logs-2024-01/_count"

# Clean up (by explicit name; 8.x rejects wildcard deletes by default)
curl -X DELETE "localhost:9200/restored_test_logs-2024-01"
```
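
A document count only says something when it is compared against an expected value. The sketch below extracts the count from a `_count` response and compares two of them. `count_of` and `counts_match` are hypothetical helpers, and comparing against the live index assumes it has not changed since the snapshot was taken.

```bash
# Extract the numeric "count" from a _count API response on stdin.
count_of() {
  sed -n 's/.*"count" *: *\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeed only when both counts are present and equal.
counts_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage (needs a reachable cluster, so it is left commented out):
# original=$(curl -s "localhost:9200/logs-2024-01/_count" | count_of)
# restored=$(curl -s "localhost:9200/restored_test_logs-2024-01/_count" | count_of)
# counts_match "$original" "$restored" && echo "counts match" || echo "counts differ"
```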

### 2. Use a Restore Configuration File

Create a restore configuration for complex restores:

```bash
curl -X POST "localhost:9200/_snapshot/backup-repo/snapshot-2024-01-15/_restore" -H 'Content-Type: application/json' -d'
{
  "indices": "logs-*,products,users",
  "ignore_unavailable": true,
  "include_global_state": false,
  "rename_pattern": "(.+)",
  "rename_replacement": "restored_$1",
  "include_aliases": false
}
'
```

### 3. Verify Index Settings After Restore

```bash
curl -X GET "localhost:9200/restored_logs-2024-01/_settings?pretty"

# Adjust replica count if needed
curl -X PUT "localhost:9200/restored_logs-2024-01/_settings" -H 'Content-Type: application/json' -d'
{
  "index": {
    "number_of_replicas": 1
  }
}
'
```

## Troubleshooting Checklist

When a restore fails, check these in order:

1. Repository accessible? `POST _snapshot/repo/_verify`
2. Snapshot valid? `GET _snapshot/repo/snapshot`
3. Index exists? `GET _cat/indices`
4. Disk space available? `GET _cat/allocation`
5. Allocation enabled? `GET _cluster/settings`
6. Version compatible? Check snapshot vs cluster version
7. Shards allocating? `GET _cat/shards?v&s=state`
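
The checklist can also be run as a single pass. The sketch below only prints the requests in order, so it can be reviewed before anything is sent; `checklist_requests` is a hypothetical helper, the repository and snapshot names are passed in as arguments, and verification is sent as a `POST`, as in the pre-restore checklist.

```bash
# Print the checklist requests in order as "METHOD path" lines.
checklist_requests() {
  local repo="$1" snapshot="$2"
  cat <<EOF
POST _snapshot/${repo}/_verify
GET _snapshot/${repo}/${snapshot}
GET _cat/indices?v
GET _cat/allocation?v
GET _cluster/settings?flat_settings=true
GET /
GET _cat/shards?v&s=state
EOF
}

# Usage (needs a reachable cluster, so it is left commented out):
# checklist_requests backup-repo snapshot-2024-01-15 | while read -r method path; do
#   echo "== $method $path"
#   curl -s -X "$method" "localhost:9200/${path#/}"
#   echo
# done
```

The `GET /` line covers the version check: its response carries the cluster version to compare against the snapshot's `version` field.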

## Summary

Snapshot restore failures are usually caused by:

1. Existing indices with same name
2. Version incompatibility
3. Repository access issues
4. Corrupted snapshots
5. Disk space limitations
6. Shard allocation settings
7. Global state conflicts

Resolve by closing or renaming indices, ensuring repository access, freeing disk space, and using proper restore options. Always test your backup restore process before you need it in an emergency.