Introduction

Spring Boot Actuator's /actuator/health endpoint aggregates the status of all registered health indicators. By default, this includes database connectivity, disk space, and any custom HealthIndicator beans. If any indicator performs a slow operation (a database query that waits on the connection pool, an external HTTP call without a timeout, a filesystem check on an NFS mount), the entire health endpoint becomes slow. Kubernetes readiness probes then fail, load balancer health checks time out, and deployments roll back.
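To make the aggregation effect concrete, here is a minimal plain-Java sketch, not Actuator code. It assumes the checks run one after another on the request thread, so the total response time is roughly the sum of the individual delays. The class name, the check names, and the delays are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

public class SequentialHealthSketch {

    /** Runs every check in order and returns the total elapsed time in milliseconds. */
    public static long runAll(Map<String, Supplier<String>> checks, Map<String, String> results) {
        long start = System.nanoTime();
        for (Map.Entry<String, Supplier<String>> check : checks.entrySet()) {
            results.put(check.getKey(), check.getValue().get());
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    /** Builds a hypothetical check that sleeps for delayMillis, then reports UP. */
    public static Supplier<String> slowCheck(long delayMillis) {
        return () -> {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return "DOWN";
            }
            return "UP";
        };
    }

    public static void main(String[] args) {
        Map<String, Supplier<String>> checks = new LinkedHashMap<>();
        checks.put("db", slowCheck(50));
        checks.put("diskSpace", slowCheck(10));
        checks.put("externalApi", slowCheck(300)); // the slowest check dominates the total

        Map<String, String> results = new LinkedHashMap<>();
        long elapsed = runAll(checks, results);
        System.out.println(results + " in " + elapsed + " ms");
    }
}
```

Every check reports UP, yet the caller waits for all of them, which is why a single slow indicator is enough to exceed a probe timeout.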

Symptoms

  • /actuator/health takes 5-30 seconds to respond
  • Kubernetes pod marked as NotReady due to probe timeout
  • Load balancer removes instance from rotation
  • Health endpoint slow only during high load or database contention
  • HealthContributor timeouts not configured

```bash
# Slow health check response
$ time curl http://localhost:8080/actuator/health
{"status":"UP","components":{"db":{"status":"UP"},...}}
real    0m12.345s  # Should be under 1 second!
```

Common Causes

  • Database health indicator waiting for connection from exhausted pool
  • External service health checks without timeout
  • DiskSpaceHealthIndicator checking slow network filesystems
  • Custom HealthIndicator doing expensive operations
  • RabbitMQ/Kafka health indicators connecting to unavailable brokers
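
The first cause above, a health check waiting on an exhausted pool, can be illustrated with a small simulation. This is a plain-Java sketch under stated assumptions: a Semaphore stands in for the connection pool, and timeoutMillis plays the role of a connection timeout. PoolWaitSketch is a hypothetical class, not part of HikariCP or Spring.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class PoolWaitSketch {

    private final Semaphore permits;   // stands in for the pool's free connections
    private final long timeoutMillis;  // plays the role of a connection timeout

    public PoolWaitSketch(int poolSize, long timeoutMillis) {
        this.permits = new Semaphore(poolSize);
        this.timeoutMillis = timeoutMillis;
    }

    /** Tries to borrow a connection, waiting up to the timeout; false if none became free. */
    public boolean borrow() {
        try {
            return permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Returns a borrowed connection to the pool. */
    public void release() {
        permits.release();
    }

    /** Health check: borrows a connection, returns it, and reports the outcome. */
    public String health() {
        if (!borrow()) {
            return "DOWN";
        }
        release();
        return "UP";
    }

    public static void main(String[] args) {
        PoolWaitSketch pool = new PoolWaitSketch(1, 200);
        pool.borrow();                  // application traffic holds the only connection
        long start = System.nanoTime();
        String status = pool.health();  // waits for the full timeout before giving up
        long elapsed = (System.nanoTime() - start) / 1_000_000;
        System.out.println(status + " after " + elapsed + " ms");
    }
}
```

In this model the health check is only as fast as the pool's wait limit, so a long connection timeout translates directly into a slow health response under contention.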

Step-by-Step Fix

  1. Configure health indicator timeouts:

```properties
# Maximum wait (ms) for a pooled connection; this bounds the db health check
spring.datasource.hikari.connection-timeout=5000

# Enable or disable specific health indicators
management.health.db.enabled=true
management.health.diskspace.enabled=false
management.health.rabbit.enabled=false

# Show health details and components only to authorized users
management.endpoint.health.show-details=when-authorized
management.endpoint.health.show-components=when-authorized
```

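  2. Add timeout to custom health indicators:

```java
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.http.ResponseEntity;
import org.springframework.http.client.SimpleClientHttpRequestFactory;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

@Component
public class ExternalServiceHealthIndicator implements HealthIndicator {

    private final RestTemplate restTemplate;

    public ExternalServiceHealthIndicator() {
        // Set explicit timeouts so a hung dependency cannot stall the health endpoint
        SimpleClientHttpRequestFactory factory = new SimpleClientHttpRequestFactory();
        factory.setConnectTimeout(2000); // milliseconds
        factory.setReadTimeout(5000);    // milliseconds
        this.restTemplate = new RestTemplate(factory);
    }

    @Override
    public Health health() {
        try {
            ResponseEntity<String> response =
                    restTemplate.getForEntity("https://api.example.com/health", String.class);
            if (response.getStatusCode().is2xxSuccessful()) {
                return Health.up().build();
            }
            return Health.down()
                    .withDetail("status", response.getStatusCode().value())
                    .build();
        } catch (Exception e) {
            // Timeouts and 4xx/5xx responses surface here as exceptions
            return Health.down()
                    .withDetail("error", String.valueOf(e.getMessage()))
                    .withDetail("timeout", "5s")
                    .build();
        }
    }
}
```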
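
When a client library offers no timeout setting, one option is to run the check on a separate thread and give up after a deadline. The sketch below is plain Java rather than Spring code. TimeBoundedCheck and its run method are hypothetical names, and the "UP"/"DOWN" strings stand in for Health objects.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

public class TimeBoundedCheck {

    // Daemon threads so a stuck check never blocks JVM shutdown.
    private static final ExecutorService EXECUTOR = Executors.newCachedThreadPool(runnable -> {
        Thread thread = new Thread(runnable, "health-check");
        thread.setDaemon(true);
        return thread;
    });

    /** Runs the check and returns its status, or "DOWN" if it fails or exceeds the limit. */
    public static String run(Supplier<String> check, Duration limit) {
        Callable<String> task = check::get;
        Future<String> future = EXECUTOR.submit(task);
        try {
            return future.get(limit.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so it can stop early
            return "DOWN";
        } catch (ExecutionException e) {
            return "DOWN";       // the check itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "DOWN";
        }
    }

    public static void main(String[] args) {
        String fast = run(() -> "UP", Duration.ofMillis(500));
        String slow = run(() -> {
            try {
                Thread.sleep(5_000); // simulates a hung dependency
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "UP";
        }, Duration.ofMillis(200));
        System.out.println(fast + " " + slow); // prints "UP DOWN"
    }
}
```

Inside a HealthIndicator, the same wrapper could surround the slow call, with the timeout branch mapped to Health.down(). Cancelling only interrupts the worker thread; a check that ignores interruption keeps running in the background, so a native timeout on the client is still preferable where one exists.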

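  3. Separate liveness and readiness probes:

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

// Kubernetes liveness - is the application alive?
@Component
public class LivenessStateHealthIndicator implements HealthIndicator {
    @Override
    public Health health() {
        return Health.up().build(); // Simple - just check if process is alive
    }
}

// Kubernetes readiness - is the application ready to serve?
// (Lives in its own file: ReadinessStateHealthIndicator.java)
@Component
public class ReadinessStateHealthIndicator implements HealthIndicator {

    private final DataSource dataSource;

    public ReadinessStateHealthIndicator(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public Health health() {
        // Quick check - just borrow and return a connection, don't query
        try (Connection connection = dataSource.getConnection()) {
            return Health.up().build();
        } catch (SQLException e) {
            return Health.down().withDetail("db", String.valueOf(e.getMessage())).build();
        }
    }
}
```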

  4. Configure separate health endpoint groups:

```properties
# Liveness probe - minimal checks
management.health.livenessState.enabled=true
management.endpoint.health.probes.enabled=true

# Readiness probe - include database
management.endpoint.health.group.readiness.include=db,redis

# Custom group for monitoring
management.endpoint.health.group.custom.include=db,redis,diskSpace
management.endpoint.health.group.custom.show-details=always
```

Prevention

  • Keep health checks under 1 second total response time
  • Set connection timeouts on all health indicator dependencies
  • Use Kubernetes readiness/liveness probe separation
  • Monitor health endpoint response times in production
  • Add circuit breakers to external service health checks
  • Use @ConditionalOnProperty to enable/disable health indicators per environment
  • Test health endpoint under load before deploying
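
The circuit breaker item above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not a replacement for a library such as Resilience4j: HealthCircuitBreaker is a hypothetical class, the thresholds are arbitrary, and the current time is passed in explicitly so the behavior is easy to follow.

```java
import java.util.function.Supplier;

public class HealthCircuitBreaker {

    private final int failureThreshold; // consecutive failures before the circuit opens
    private final long openMillis;      // how long to skip the real check once open
    private int consecutiveFailures = 0;
    private long openedAt = -1;         // -1 means the circuit is closed

    public HealthCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    /** Returns the check's status, or "DOWN" immediately while the circuit is open. */
    public synchronized String check(Supplier<String> realCheck, long nowMillis) {
        if (openedAt >= 0 && nowMillis - openedAt < openMillis) {
            return "DOWN"; // fail fast without touching the dependency
        }
        String status;
        try {
            status = realCheck.get();
        } catch (RuntimeException e) {
            status = "DOWN";
        }
        if ("UP".equals(status)) {
            consecutiveFailures = 0;
            openedAt = -1;          // close the circuit again
        } else if (++consecutiveFailures >= failureThreshold) {
            openedAt = nowMillis;   // open (or re-open) the circuit
        }
        return status;
    }

    public static void main(String[] args) {
        HealthCircuitBreaker breaker = new HealthCircuitBreaker(2, 30_000);
        Supplier<String> failing = () -> {
            throw new IllegalStateException("broker unavailable");
        };
        long now = System.currentTimeMillis();
        System.out.println(breaker.check(failing, now));     // real check runs and fails
        System.out.println(breaker.check(failing, now + 1)); // second failure opens the circuit
        System.out.println(breaker.check(failing, now + 2)); // answered without calling the check
    }
}
```

While the circuit is open, the health endpoint answers immediately instead of waiting on a dependency that is already known to be failing, which keeps probe latency low during an outage.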