Introduction
Elasticsearch nodes cannot form a cluster when discovery or network settings are misconfigured. This guide walks through diagnosing and resolving the problem step by step.
Symptoms
Typical error output:
ERROR: Cluster formation failed
Node "es-node-1" cannot join cluster: discovery settings invalid
bootstrap checks failed: seed hosts not resolved
Common Causes
1. discovery.seed_hosts is not configured correctly
2. cluster.initial_master_nodes is missing or incorrect
3. Network binding or a firewall is blocking node communication
4. A node has an incompatible version or configuration
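The first two causes can be checked directly in each node's configuration file. The following is a minimal sketch rather than an official tool: the function names are hypothetical, the default path is an assumption (typical for package installs), and it only recognizes the flat `key: value` form of the settings, not nested YAML. Note that cluster.initial_master_nodes is only needed when bootstrapping a brand-new cluster, so a "missing" result is not always an error.

```bash
# Sketch: report which discovery settings are set (uncommented) in a config file.

has_setting() {
    # $1 = setting name, $2 = config file
    # Escape the dots so they match literally, then look for the key at the
    # start of a line. Commented-out lines (starting with #) do not match.
    pattern=$(printf '%s' "$1" | sed 's/\./\\./g')
    grep -Eq "^[[:space:]]*${pattern}[[:space:]]*:" "$2"
}

check_discovery_settings() {
    # $1 = path to elasticsearch.yml (the default below is an assumption)
    config="${1:-/etc/elasticsearch/elasticsearch.yml}"
    missing=0
    for key in discovery.seed_hosts cluster.initial_master_nodes; do
        if has_setting "$key" "$config"; then
            echo "ok: $key"
        else
            echo "missing: $key"
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_discovery_settings /path/to/elasticsearch.yml` on every node gives a quick view of which nodes lack a setting; the function returns a non-zero status if either key is absent.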
Step-by-Step Fix
Step 1: Check Current State
curl -XGET 'localhost:9200/_cluster/health?pretty'
curl -XGET 'localhost:9200/_nodes/_all/stats?pretty'
curl -XGET 'localhost:9200/_cat/nodes?v'
Step 2: Identify Root Cause
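A quick way to rule out the network cause is to probe each seed host on its transport port. The following is a minimal sketch, not an official tool: the function names are hypothetical, it assumes plain `host` or `host:port` entries (no bracketed IPv6 addresses), and the probe relies on bash and the GNU `timeout` command being installed.

```bash
# Sketch: split a seed host entry into host and port, then probe the port.

seed_host_port() {
    # Prints "<host> <port>"; falls back to 9300, the default transport port.
    case "$1" in
        *:*) echo "${1%:*} ${1##*:}" ;;
        *)   echo "$1 9300" ;;
    esac
}

port_open() {
    # $1 = host, $2 = port. Opens a TCP connection through bash's /dev/tcp
    # pseudo-device; timeout caps the wait at 3 seconds.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

check_seed_hosts() {
    # Usage: check_seed_hosts es-node-1:9300 es-node-2:9300
    for entry in "$@"; do
        hp=$(seed_host_port "$entry")
        host=${hp% *}
        port=${hp#* }
        if port_open "$host" "$port"; then
            echo "reachable: $host:$port"
        else
            echo "unreachable: $host:$port"
        fi
    done
}
```

An "unreachable" result from one node but not another usually points at a firewall rule or at the transport interface being bound to the wrong address on the target node.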
curl -XGET 'localhost:9200/_cluster/health?pretty'
curl -XGET 'localhost:9200/_nodes/stats?pretty'
Step 3: Apply Primary Fix
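```bash
# Update discovery settings.
# discovery.seed_hosts and cluster.initial_master_nodes are static settings:
# they cannot be changed through PUT _cluster/settings. Set them in
# elasticsearch.yml on every node, then restart the node:
#
#   discovery.seed_hosts: ["es-node-1:9300", "es-node-2:9300"]
#   cluster.initial_master_nodes: ["es-node-1", "es-node-2"]
#
# cluster.initial_master_nodes is only used to bootstrap a brand-new cluster;
# remove it once the cluster has formed.

# Verify cluster formation
curl -XGET 'localhost:9200/_cluster/health?wait_for_status=yellow&timeout=30s&pretty'
```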
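Both discovery settings are read from each node's elasticsearch.yml at startup, so every node should carry the same values. One way to keep them identical is to generate the lines from a single list of node names. This is a minimal sketch under stated assumptions: `render_discovery_yaml` is a hypothetical helper, each node's node.name is assumed to equal its resolvable hostname, and every node is assumed to be master-eligible and to use transport port 9300.

```bash
# Sketch: print the two discovery settings for a list of node names.

render_discovery_yaml() {
    # Usage: render_discovery_yaml es-node-1 es-node-2
    seeds=""
    masters=""
    for node in "$@"; do
        # Append a quoted entry, adding ", " only between entries.
        seeds="${seeds:+$seeds, }\"${node}:9300\""
        masters="${masters:+$masters, }\"${node}\""
    done
    printf 'discovery.seed_hosts: [%s]\n' "$seeds"
    printf 'cluster.initial_master_nodes: [%s]\n' "$masters"
}
```

For example, `render_discovery_yaml es-node-1 es-node-2` prints the two lines for a two-node cluster, ready to be pasted into elasticsearch.yml on each node before a restart.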
Step 4: Apply Alternative Fix
```bash
# Alternative fix: check node stats
curl -XGET 'localhost:9200/_nodes/stats?pretty'

# Update specific index settings
curl -XPUT 'localhost:9200/my-index/_settings' -H 'Content-Type: application/json' -d '
{
  "index": {
    "refresh_interval": "30s"
  }
}'

# Verify the fix
curl -XGET 'localhost:9200/_cat/indices/my-index?v'
```
Step 5: Verify the Fix
curl -XGET 'localhost:9200/_cluster/health?pretty'
curl -XGET 'localhost:9200/_cat/nodes?v'
# Status should be green or yellow
Common Pitfalls
- Not waiting for cluster state propagation after settings change
- Using a text field for aggregations instead of a keyword field
- Setting circuit breaker limits too low for production workload
- Ignoring disk watermark warnings until cluster blocks
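The first pitfall, not waiting for the cluster state to propagate, can be avoided by polling until the cluster reports a usable status instead of checking once. The sketch below is illustrative only: `wait_until` and `cluster_is_up` are hypothetical names, and the endpoint assumes an unsecured node listening on localhost:9200 as in the commands above.

```bash
# Sketch: retry a check until it succeeds or the attempts run out.

wait_until() {
    # $1 = max attempts, $2 = seconds between attempts, rest = command to run
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only if another attempt will follow.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

cluster_is_up() {
    # Succeeds when the health response reports green or yellow.
    curl -s 'localhost:9200/_cluster/health' |
        grep -Eq '"status"[[:space:]]*:[[:space:]]*"(green|yellow)"'
}

# Example: try up to 12 times, 5 seconds apart (about one minute in total).
#   wait_until 12 5 cluster_is_up && echo "cluster formed"
```

Keeping the retry loop separate from the check makes it easy to reuse the same loop for other conditions, such as waiting for an expected node count.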
Best Practices
- Monitor cluster health regularly with the _cluster/health API
- Use keyword fields for aggregations to avoid fielddata
- Set circuit breaker limits based on heap size
- Configure ILM policies for automated index management
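Regular health monitoring, the first practice above, can be as simple as a scheduled script that maps the cluster status to an exit code. The following is a rough sketch: the function names and the exit-code convention are assumptions modeled on common monitoring plugins, and the sed-based parsing is kept dependency-free on purpose (a JSON tool such as jq would be more robust).

```bash
# Sketch: turn a _cluster/health JSON response into a monitoring exit code.

health_status() {
    # Reads health JSON on stdin and prints the value of "status".
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

health_exit_code() {
    # 0 = green, 1 = yellow, 2 = red, 3 = unknown or no response.
    case "$(health_status)" in
        green)  return 0 ;;
        yellow) return 1 ;;
        red)    return 2 ;;
        *)      return 3 ;;
    esac
}

# Example for a cron job or monitoring agent:
#   curl -s 'localhost:9200/_cluster/health' | health_exit_code
```

An exit code of 3 covers the case this guide is about: a node that has not joined a cluster often returns an error or no response at all, which should be treated as an alert rather than as a healthy state.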
Related Issues
- Elasticsearch Cluster Red Status
- Elasticsearch Index Not Found
- Elasticsearch Query Timeout
- Elasticsearch Node High CPU