Introduction
HashiCorp Vault secret rotation failures occur when automated credential rotation cannot complete successfully, leaving applications with expired or soon-to-expire credentials. Vault provides dynamic secrets with automatic rotation for databases, cloud providers, PKI certificates, and custom backends. When rotation fails, applications receive authentication errors, database connections fail, or services lose access to cloud resources. Common causes include lease renewal errors, backend system unavailability, permission changes, authentication method expiration, or network connectivity issues between Vault and the target system.
Symptoms
- Application logs show
403 Permission Deniedorlease expirederrors - Vault audit logs show
lease renewal failedorsecret rotation failed - Database connections fail with
password authentication failed - Cloud API calls return
InvalidClientTokenIdorSignatureDoesNotMatch - Certificates expired with no renewal
- Issue appears after backend credential change, Vault upgrade, network policy update, or TTL configuration change
- Vault CLI shows
Error looking up secretorlease cannot be renewed
Common Causes
- Dynamic secret backend (database, AWS, PKI) misconfigured or unreachable
- Lease TTL too short for application renewal cycle
- Root credentials for dynamic backend expired or changed
- AppRole authentication expired or secret_id consumed
- Database user permissions changed, rotation script lacks privileges
- Network connectivity lost between Vault and target system
- Vault seal status preventing secret generation
- Audit device failure blocking secret operations
Step-by-Step Fix
### 1. Check Vault health and seal status
Verify Vault is operational:
```bash # Check Vault status vault status
# Expected output: # Key Value # --- ----- # Seal Type shamir # Sealed false # Auto-Unseal aws-kms # Started true # Version 1.15.4 # Cluster Name vault-cluster # Cluster ID abc123... # HA Enabled true # HA Cluster https://10.0.1.10:8201
# If Sealed = true, unseal Vault vault operator unseal <unseal-key>
# For auto-unseal, check cloud provider connectivity vault status -format json | jq '.sealed' ```
Check HA cluster health:
```bash # Check if node is active leader vault operator raft list-peers
# Check cluster health vault operator raft health
# For Consul HA backend consul kv get vault/sys/leader
# Expected: {"index":123,"value":"active"} ```
### 2. Verify lease status and renewal
Check specific lease health:
```bash # List all leases vault list sys/leases/lookup
# Lookup specific lease vault lease lookup <lease-path>
# Example: vault lease lookup database/creds/myapp-role-abc123
# Output: # Key Value # --- ----- # Lease ID database/creds/myapp-role-abc123/xyz789 # Lease Duration 1h # Lease Start 2026-03-30T10:00:00Z # Lease End 2026-03-30T11:00:00Z # Renewable true # TTL 3600 # Max TTL 24h
# Check if lease is expired vault lease lookup <lease-id> | grep -E "TTL|Lease End"
# If TTL is negative or Lease End is in past, lease expired ```
Renew lease manually:
```bash # Renew single lease vault lease renew <lease-id>
# Renew with specific increment (seconds) vault lease renew -increment=3600 <lease-id>
# Renew all leases under path vault lease renew -force database/creds/myapp-role
# Check renewable status vault lease lookup <lease-id> | grep Renewable # If false, lease cannot be renewed - must get new credentials ```
### 3. Check dynamic secrets backend configuration
Verify database secrets engine:
```bash # List secrets engines vault secrets list
# Check database configuration vault read database/config/<db-name>
# Expected: # Key Value # --- ----- # allowed_roles [myapp-role, readonly-role] # connection_details map[connection_url:postgresql://vault:***@db.example.com:5432/mydb?sslmode=disable max_open_connections:10] # plugin_name postgresql-database-plugin
# Test database connection vault write -force database/rotate-root/<db-name>
# Output should show: # Success! Data written to: database/rotate-root/mydb # New root password generated and stored ```
If rotation fails:
```bash # Check Vault has correct permissions vault read database/config/<db-name> -format json | \ jq '.data.connection_details.connection_url'
# Verify Vault database user has ROTATE permission # PostgreSQL: ALTER ROLE, CREATE ROLE, DROP ROLE # MySQL: CREATE USER, DROP USER, ALTER USER # SQL Server: ALTER LOGIN, CREATE LOGIN, DROP LOGIN
# Manually test connection from Vault server psql "postgresql://vault:***@db.example.com:5432/mydb" -c "SELECT 1" ```
### 4. Check role configuration and TTL settings
Verify role TTL configuration:
```bash # Read role configuration vault read database/roles/<role-name>
# Output: # Key Value # --- ----- # Creation Statements [CREATE ROLE "{{name}}" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; GRANT SELECT ON ALL TABLES IN SCHEMA public TO "{{name}}";] # Default TTL 1h # Max TTL 24h # Renew Statements [] # Rollback Statements [] # DB Name mydb
# Check TTL is appropriate for workload # Default TTL: How long credentials are valid # Max TTL: Maximum renewal period # If Default TTL too short, applications can't renew in time ```
Update role TTL:
```bash # Increase TTL for long-running jobs vault write database/roles/<role-name> \ db_name=<db-name> \ creation_statements=@creation.sql \ default_ttl=4h \ max_ttl=72h
# Or with explicit renewal statements vault write database/roles/<role-name> \ db_name=<db-name> \ creation_statements=@creation.sql \ renewal_statements=@renewal.sql \ default_ttl=1h \ max_ttl=24h ```
### 5. Check authentication method status
Verify AppRole authentication:
```bash # List auth methods vault auth list
# Check AppRole configuration vault read auth/approle/role/<role-name>
# Output: # Key Value # --- ----- # bind_secret_id true # secret_id_bound_cidrs [10.0.0.0/8] # secret_id_num_uses 0 # secret_id_ttl 24h # token_ttl 1h # token_max_ttl 24h
# Generate new secret_id vault write -f auth/approle/role/<role-name>/secret-id
# Output: # Key Value # --- ----- # secret_id abc123-xyz789 # secret_id_accessor def456-uvw012
# Check if secret_id is expired vault read auth/approle/role/<role-name>/secret-id/lookup/<secret-id>
# If TTL expired, generate new secret_id ```
Re-authenticate:
```bash # Login with AppRole vault write auth/approle/login \ role_id=<role-id> \ secret_id=<secret-id>
# Output: # Key Value # --- ----- # token s.abc123... # token_accessor A.abc123... # token_duration 1h # token_renewable true
# Store token for application export VAULT_TOKEN=s.abc123... ```
### 6. Check AWS secrets engine rotation
For AWS dynamic credentials:
```bash # Read AWS backend configuration vault read aws/config/root
# Output: # Key Value # --- ----- # access_key AKIA... # max_retries 5 # region us-east-1 # secret_type iam_user
# Check IAM user permissions aws iam get-user --user-name <vault-created-user>
# Verify Vault has IAM permissions to create users aws iam list-attached-user-policies --user-name <vault-created-user>
# Expected policies: # - IAMUserChangePassword # - IAMUserCreateAccessKey ```
Rotate AWS credentials:
```bash # Force root credential rotation vault write -force aws/rotate-root
# Generate new IAM user credentials vault read aws/creds/<role-name>
# Output: # Key Value # --- ----- # access_key AKIA... # secret_key *** # security_token <if STS> # ttl 1h
# Check IAM user isn't at limit (max 2 access keys per user) aws iam list-access-keys --user-name <vault-created-user> ```
### 7. Check PKI certificate rotation
For PKI secrets engine:
```bash # Check CA certificate expiration vault read pki/cert/ca | openssl x509 -noout -dates
# Check role configuration vault read pki/roles/<role-name>
# Output: # Key Value # --- ----- # allow_bare_domains false # allow_ip_sans true # allow_subdomains true # allow_token_displayname false # allowed_domains [example.com] # allowed_uri_sans [] # client_flag true # code_signing_flag false # email_protection_flag false # key_bits 2048 # key_type rsa # max_ttl 720h # not_after n/a # require_cn true # server_flag true # ttl 24h # use_csr_common_name true # use_csr_sans true
# Generate new certificate vault write pki/issue/<role-name> \ common_name=app.example.com \ ttl=24h
# Check CA chain is valid vault read pki/cert/ca vault read pki/cert/crl ```
Certificate renewal:
```bash # Renew certificate (if allowed by role) vault write pki/renew/<cert-serial>
# Or re-issue with same parameters vault write pki/issue/<role-name> \ common_name=app.example.com \ ttl=24h ```
### 8. Check network connectivity between Vault and backends
Verify Vault can reach target systems:
```bash # From Vault server, test database connectivity nc -zv db.example.com 5432
# Test AWS endpoint connectivity curl -s https://iam.amazonaws.com -o /dev/null -w "%{http_code}"
# Check Vault server firewall rules sudo iptables -L -n | grep -E "5432|443"
# Check security group (if in cloud) aws ec2 describe-security-groups \ --filters "Name=group-id,Values=<vault-sg-id>" \ --query 'SecurityGroups[0].IpPermissionsEgress'
# Verify VPC routing (for AWS) aws ec2 describe-route-tables \ --filters "Name=association.subnet-id,Values=<vault-subnet-id>" ```
### 9. Check audit logs for rotation failures
Enable and review audit logs:
```bash # Check audit device status vault audit list
# If not enabled, enable file audit vault audit enable file file_path=/var/log/vault/audit.log
# Or enable syslog audit vault audit enable syslog tag=vault-audit
# Search for rotation failures grep -E "rotation failed|lease renewal|secret rotation" /var/log/vault/audit.log | tail -50
# Parse JSON audit logs jq 'select(.request.operation == "lease-renew" and .response.error != null)' /var/log/vault/audit.log
# Check for permission denied errors grep "permission denied" /var/log/vault/audit.log | tail -20 ```
Audit log analysis:
```bash # Count errors by type jq -r '.response.error' /var/log/vault/audit.log | sort | uniq -c | sort -rn
# Find failed authentication jq 'select(.request.path == "auth/approle/login" and .response.auth == null)' /var/log/vault/audit.log
# Check secret access patterns jq 'select(.request.path | startswith("database/creds"))' /var/log/vault/audit.log | \ jq -r '.request.path' | sort | uniq -c ```
### 10. Set up secret rotation monitoring
Monitor lease expiration:
```bash # Create monitoring script #!/bin/bash
# Check leases expiring soon vault list sys/leases/lookup -format json | \ jq -r '.[]' | while read lease_path; do vault lease lookup "$lease_path" -format json | \ jq -r ' select(.ttl < 300) | "WARNING: Lease \(.id) expires in \(.ttl) seconds" ' done
# Check for expired leases vault list sys/leases/lookup -format json | \ jq -r '.[]' | while read lease_path; do vault lease lookup "$lease_path" -format json | \ jq -r ' select(.ttl < 0) | "CRITICAL: Lease \(.id) has expired" ' done ```
Prometheus metrics:
```yaml # vault_exporter configuration scrape_configs: - job_name: vault static_configs: - targets: ['localhost:9091']
# Key metrics to alert: # vault_lease_count - Total leases # vault_lease_expire_soon - Leases expiring within threshold # vault_secret_ttl - Time until secret expiration # vault_database_connection_count - DB connections in pool ```
Alert thresholds:
- vault_lease_expire_soon > 10: Warning
- vault_secret_ttl < 300: Critical (5 minutes)
- vault_database_connection_count = 0: Backend unreachable
Prevention
- Set TTL values appropriate for application renewal cycles
- Configure automatic lease renewal in application SDK
- Monitor lease expiration with Prometheus/Grafana
- Set up alerts for backend connectivity failures
- Rotate root credentials on schedule before expiration
- Test secret rotation in staging before production changes
- Document rollback procedures for each secrets backend
- Use response wrapping for sensitive secret delivery
Related Errors
- **Permission Denied**: Token lacks required capabilities
- **Lease Expired**: Credential no longer valid
- **Connection Refused**: Backend system unreachable
- **Invalid Token**: Authentication token expired or revoked