Introduction

Elasticsearch encrypts inter-node communication with TLS on the transport layer. When TLS certificates are misconfigured, expired, or missing from the trust store, nodes cannot establish connections, resulting in a split cluster where nodes cannot discover or communicate with each other.

Symptoms

- Nodes cannot join the cluster, each showing as a separate single-node cluster
- Elasticsearch logs show `javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_unknown`
- `GET /_cluster/health` shows only one node in the cluster
- `GET /_cat/nodes?v` shows fewer nodes than expected
- Transport layer errors: `handshake failed for [{node_name}{node_id}{ip}{ip:9300}]`
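
The node-count symptom can be checked from a script. The following is a minimal sketch, assuming the input is one node name per line (the shape returned by `GET /_cat/nodes?h=name`); the function name `check_node_count` and the expected count are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: compare the number of nodes the cluster reports with the number
# you expect. Reads node names on stdin, one per line.
# `check_node_count` is a hypothetical helper, not part of Elasticsearch.

check_node_count() {
  local expected="$1"
  local actual
  # Count non-empty lines; grep exits non-zero on a count of 0, so ignore that.
  actual=$(grep -c . || true)
  if [ "$actual" -lt "$expected" ]; then
    echo "MISSING: cluster reports $actual of $expected nodes"
    return 1
  fi
  echo "OK: cluster reports $actual of $expected nodes"
}

# Example (assumes a three-node cluster and the credentials used on this page):
#   curl -s -k -u elastic:password "https://localhost:9200/_cat/nodes?h=name" \
#     | check_node_count 3
```

Running the check against every node in turn is more informative than running it once, because in a split cluster each node reports only itself.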

Common Causes

- Node certificates signed by different CAs not in the trust store
- Certificate Common Name (CN) or Subject Alternative Name (SAN) does not match the node hostname
- Expired transport certificates on one or more nodes
- `elasticsearch.yml` transport TLS configuration pointing to wrong certificate paths
- Mixed-mode cluster where some nodes have TLS enabled and others do not
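
The SAN mismatch cause can be tested mechanically. The sketch below assumes the input is the text printed by `openssl x509 -noout -ext subjectAltName` (comma-separated `DNS:` and `IP Address:` entries); `san_contains` is a hypothetical helper, and it does exact matching only, so wildcard entries are not handled.

```bash
#!/usr/bin/env bash
# Sketch: check whether a hostname or IP appears in a certificate's SAN list.
# Reads the output of `openssl x509 -noout -ext subjectAltName` on stdin.
# `san_contains` is a hypothetical helper; wildcard SANs are not evaluated.

san_contains() {
  local name="$1"
  # Put each comma-separated entry on its own line, trim surrounding
  # whitespace, then require an exact DNS or IP entry for the given name.
  tr ',' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | grep -xF -e "DNS:$name" -e "IP Address:$name" >/dev/null
}

# Example (assumes a PEM certificate at a path of your choosing):
#   openssl x509 -in transport.crt -noout -ext subjectAltName | san_contains node2 \
#     || echo "node2 is not listed in the SAN"
```

Note that with `verification_mode: certificate`, as configured in the fix below, Elasticsearch validates the certificate chain but does not verify the hostname, so a SAN mismatch matters mainly when `verification_mode: full` is in use.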

Step-by-Step Fix

1. **Verify transport TLS configuration on each node**:
   ```yaml
   # /etc/elasticsearch/elasticsearch.yml
   xpack.security.transport.ssl.enabled: true
   xpack.security.transport.ssl.verification_mode: certificate
   xpack.security.transport.ssl.keystore.path: certs/transport.p12
   xpack.security.transport.ssl.truststore.path: certs/transport.p12
   ```

2. **Check certificate details**:
   ```bash
   # Inspect the transport certificate (openssl prompts for the PKCS#12 password).
   # -clcerts limits the output to the node certificate, excluding the CA.
   openssl pkcs12 -in /etc/elasticsearch/certs/transport.p12 -clcerts -nokeys -nodes | \
     openssl x509 -noout -dates -subject -issuer -ext subjectAltName

   # Verify the certificate chain (ca.crt is the CA certificate in PEM format)
   openssl pkcs12 -in /etc/elasticsearch/certs/transport.p12 -clcerts -nokeys -nodes | \
     openssl verify -CAfile /etc/elasticsearch/certs/ca.crt
   ```

3. **Regenerate transport certificates using elasticsearch-certutil**:
   ```bash
   # Generate CA (if not already done)
   /usr/share/elasticsearch/bin/elasticsearch-certutil ca \
     --out /etc/elasticsearch/certs/ca.p12 \
     --pass ""

   # Generate transport certificates for all nodes
   /usr/share/elasticsearch/bin/elasticsearch-certutil cert \
     --ca /etc/elasticsearch/certs/ca.p12 \
     --ca-pass "" \
     --out /etc/elasticsearch/certs/transport.p12 \
     --dns node1,node2,node3 \
     --ip 10.0.1.10,10.0.1.11,10.0.1.12 \
     --pass ""

   # Distribute to all nodes and set permissions
   sudo chown elasticsearch:elasticsearch /etc/elasticsearch/certs/*.p12
   sudo chmod 640 /etc/elasticsearch/certs/*.p12
   ```

4. **Verify inter-node connectivity after certificate update**:
   ```bash
   # Test TLS connection between nodes.
   # Requires PEM copies of the node certificate, key, and CA at these paths.
   openssl s_client -connect node2:9300 \
     -cert /etc/elasticsearch/certs/transport.crt \
     -key /etc/elasticsearch/certs/transport.key \
     -CAfile /etc/elasticsearch/certs/ca.crt

   # Check cluster formation (URL quoted so the shell does not interpret "?")
   curl -k -u elastic:password "https://localhost:9200/_cat/nodes?v"
   ```
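
If nodes still fail to join after the steps above, it can help to confirm that every node's configuration actually enables transport TLS, since a mixed-mode cluster produces the same handshake failures. The following is a minimal sketch, assuming the settings are written as flat dotted keys as in step 1 (nested YAML is not handled); `check_transport_tls_config` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch: print the transport TLS settings from an elasticsearch.yml and
# fail unless transport TLS is explicitly enabled.
# Assumes flat dotted keys; `check_transport_tls_config` is hypothetical.

check_transport_tls_config() {
  local config="$1"
  # Show every transport SSL setting so paths and verification_mode can be
  # compared between nodes. An empty result is not an error at this point.
  grep -E '^[[:space:]]*xpack\.security\.transport\.ssl\.' "$config" || true
  # Succeed only if the "enabled" setting is present and set to true.
  grep -Eq '^[[:space:]]*xpack\.security\.transport\.ssl\.enabled:[[:space:]]*true[[:space:]]*$' "$config"
}

# Example:
#   check_transport_tls_config /etc/elasticsearch/elasticsearch.yml \
#     || echo "transport TLS is not enabled on this node"
```

Comparing the printed settings across nodes also surfaces keystore or truststore paths that differ from the rest of the cluster.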

Prevention

- Use `elasticsearch-certutil` to generate all certificates from a common CA
- Include all node hostnames and IPs in the certificate SAN
- Set up certificate expiry monitoring for transport certificates
- Use automated certificate rotation with cert-manager in Kubernetes
- Test TLS configuration with `openssl s_client` after any certificate change
- Document the certificate generation and distribution process
- Keep a backup of the CA key for generating new node certificates
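
Expiry monitoring can start as a small script run from cron. The sketch below assumes GNU `date` (for `date -d`) and takes the `notAfter=...` line printed by `openssl x509 -noout -enddate`; the function name `days_until_expiry`, its optional second argument, and the 30-day threshold in the example are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: days remaining before a certificate expires.
# Assumes GNU date. `days_until_expiry` is a hypothetical helper.
#   $1: the "notAfter=..." line from `openssl x509 -noout -enddate`
#   $2: optional current time in epoch seconds (defaults to now)

days_until_expiry() {
  local not_after="${1#notAfter=}"   # strip the prefix printed by openssl
  local now="${2:-$(date +%s)}"
  local end
  # Convert the expiry timestamp to epoch seconds; fail if it cannot be parsed.
  end=$(date -u -d "$not_after" +%s) || return 2
  # Whole days between now and expiry (negative once the certificate has expired).
  echo $(( (end - now) / 86400 ))
}

# Example (assumes an empty PKCS#12 password, as in the fix above):
#   end_line=$(openssl pkcs12 -in /etc/elasticsearch/certs/transport.p12 \
#                -clcerts -nokeys -nodes -passin pass: \
#              | openssl x509 -noout -enddate)
#   days=$(days_until_expiry "$end_line")
#   [ "$days" -lt 30 ] && echo "WARNING: transport certificate expires in $days days"
```

Passing the current time as an argument keeps the calculation testable; in normal use the second argument is omitted.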