Introduction
Select correct key usage bits for RSA key transport vs ECDH key agreement scenarios requires systematic diagnosis across multiple technical layers. This guide provides enterprise-grade troubleshooting procedures with deep technical analysis suitable for complex production environments.
Symptoms and Impact Assessment
### Primary Indicators - TLS/SSL handshake failures reported by clients or load balancers - Certificate validation errors appear in browser or application logs - HTTPS connections fail intermittently or consistently depending on root cause - Related services depending on secure channels may exhibit cascading failures
### Business Impact Analysis - Customer-facing services unavailable over HTTPS causing revenue loss - API integrations broken when certificate trust cannot be established - Compliance violations for PCI-DSS, HIPAA, or SOC2 encryption requirements - Security exposure if fallback to unencrypted connections is permitted
Technical Background
### TLS Protocol Context Understanding TLS handshake mechanics, certificate chain validation, and cipher suite negotiation is essential for effective diagnosis of SSL/TLS failures.
### Certificate Authority Trust Model The PKI trust hierarchy, certificate transparency requirements, and revocation checking mechanisms all affect whether clients will accept presented certificates.
Root Cause Analysis Framework
### Diagnostic Methodology
- **Capture TLS handshake** - Use openssl s_client, Wireshark, or SSL Labs to examine certificate presentation
- **Validate certificate chain** - Verify complete chain from leaf to trusted root with openssl verify
- **Check revocation status** - Test OCSP and CRL accessibility and response validity
- **Analyze cipher compatibility** - Compare server offerings with client requirements
- **Review configuration drift** - Compare current TLS settings against known-good baseline
### Common Root Cause Categories
| Category | Typical Indicators | Investigation Priority | |----------|-------------------|----------------------| | Certificate expiration | Sudden complete failure, date-correlated | Critical | | Chain incomplete | Browser warnings, mobile client failures | High | | Cipher mismatch | Protocol-level handshake failures | High | | Revocation check timeout | Slow connections, intermittent failures | Medium | | SNI misconfiguration | Wrong certificate served for hostname | Medium | | CT policy violation | Chrome-specific rejections | Low |
Step-by-Step Remediation
### Phase 1: Immediate Triage (0-30 minutes)
- **Verify certificate validity** - Check expiration dates and subject names match expected values.
- **Test from multiple clients** - Determine if failure is universal or client-specific to narrow root cause.
- **Capture current certificate** - Save certificate and chain for analysis before any replacement.
- **Enable fallback if available** - Consider temporary CDN or load balancer certificate to restore service.
### Phase 2: Systematic Diagnosis (30-120 minutes)
- **Run SSL Labs test** - Use Qualys SSL Labs for comprehensive certificate and configuration analysis.
- **Trace certificate chain** - Use openssl s_client -showcerts to capture full chain presentation.
- **Check revocation infrastructure** - Verify OCSP responder and CRL distribution point accessibility.
- **Validate private key match** - Confirm certificate and private key are cryptographically paired.
### Phase 3: Targeted Resolution (2-8 hours)
- **Replace or renew certificate** - Install valid certificate with complete chain and matching private key.
- **Update intermediate certificates** - Ensure all required intermediate CA certificates are installed.
- **Configure cipher suites** - Enable compatible cipher suites while maintaining security requirements.
- **Verify SNI configuration** - Confirm correct certificate selection for each hostname on shared IPs.
### Phase 4: Prevention and Hardening (Post-Incident)
- **Implement certificate monitoring** - Deploy automated expiration alerts with 30-day advance warning.
- **Automate renewal** - Where possible, implement ACME-based automatic certificate renewal.
- **Document certificate inventory** - Maintain current record of all certificates, locations, and renewal dates.
- **Test renewal procedures** - Practice certificate replacement in non-production to validate runbooks.
Technical Deep Dive
### Advanced Diagnostics
For complex cases requiring deeper analysis:
- Decode certificate with openssl x509 -text to examine all extensions
- Compare certificate fingerprints across environments
- Analyze TLS session resumption behavior
- Test with openssl s_client using various protocol versions
### Common Pitfalls
Avoid these counterproductive actions:
- Replacing certificates without capturing current state for comparison
- Installing certificates without verifying private key match
- Disabling security features (like CT validation) as permanent workaround
- Skipping validation of complete chain including intermediates
Monitoring and Alerting Strategy
| Metric Category | Specific Metrics | Alert Threshold | Data Source | |----------------|------------------|-----------------|-------------| | Certificate validity | Days until expiration | <30 days warning | Certificate scanner | | Chain completeness | Chain validation status | Any validation error | Load balancer | | Cipher strength | Weak cipher usage | Any export/TLS1.0 | SSL Labs | | OCSP health | OCSP response time | >5 seconds | Synthetic monitoring |
Related References
- RFC 5280: Internet X.509 Public Key Infrastructure
- RFC 8446: TLS Protocol Version 1.3
- CA/Browser Forum Baseline Requirements
- Vendor-specific certificate deployment guides
Conclusion
Systematic SSL/TLS troubleshooting following this methodology enables efficient resolution while building organizational capability. The key principles are: capture current state before changes, validate certificate chain completely, test from multiple client perspectives, and implement proactive monitoring to prevent recurrence.