Introduction

TTL (Time to Live) controls how long DNS resolvers cache your records. It's the fundamental mechanism that balances DNS efficiency with update speed. TTL issues manifest as stale records persisting too long, or as excessive query load from too-short TTLs. Understanding and managing TTL is critical for planned changes, troubleshooting propagation, and optimizing DNS performance.

Symptoms

  • Old IP addresses returned after changing DNS records
  • Changes propagate slower than expected
  • Inconsistent resolution across different networks/users
  • DNS queries return old records despite zone file updates
  • Propagation takes days instead of hours
  • After a website migration, users still see the old server
  • High DNS query volume from short TTLs

Common Causes

  • TTL set too high before making changes
  • Negative caching (NXDOMAIN) with long TTL
  • Recursive resolvers ignoring TTL and caching longer
  • Not lowering TTL before planned changes
  • Multiple layers of caching (browser, OS, resolver, ISP)
  • TTL not respected by some DNS providers
  • SOA minimum TTL affecting negative responses

Step-by-Step Fix

  1. Check current TTL values on your records.

```bash
# Check TTL on A record
dig example.com A

# Look at the TTL column in the answer section:
# example.com.    3600    IN    A    192.0.2.1
#                 ^^^^
#                 TTL in seconds

# Get just the TTL
# +short doesn't show the TTL, so print only the answer section
# and take its second column:
dig +noall +answer example.com A | awk 'NR==1 {print $2}'

# Check TTL on specific record type
dig www.example.com A
dig example.com MX
dig example.com NS

# Compare authoritative vs cached
echo "=== Authoritative TTL ==="
dig @ns1.yourprovider.com +noall +answer example.com A | head -1

echo "=== Cached TTL (resolver) ==="
dig @8.8.8.8 +noall +answer example.com A | head -1
```
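The TTL extraction can also be wrapped in a small helper. The sketch below is illustrative only: the function name `answer_ttl` is hypothetical, and it assumes dig's usual answer layout (name, TTL, class, type, data). It reads dig output from stdin and skips comment lines, so dig's banner and question lines, which also contain the domain name, are never mistaken for the answer.

```bash
# answer_ttl NAME TYPE: print the TTL of the first matching record on stdin.
# Expects dig-style lines: name ttl class type rdata
answer_ttl() {
  local name="${1%.}." type="$2"   # normalize NAME to end with a single dot
  awk -v n="$name" -v t="$type" '
    /^;/ || NF < 5 { next }        # skip dig comment lines and short lines
    tolower($1) == tolower(n) && $4 == t { print $2; exit }
  '
}

# Illustrative saved output (not real query results)
sample='; <<>> DiG <<>> example.com A
;; ANSWER SECTION:
example.com. 3600 IN A 192.0.2.1'

printf '%s\n' "$sample" | answer_ttl example.com A

# With live data the call would look like:
#   dig example.com A | answer_ttl example.com A
```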

  2. Understand TTL hierarchy and caching layers.

```bash
# Multiple layers of caching affect propagation:
# 1. Browser DNS cache
# 2. OS DNS cache
# 3. Local resolver/cache
# 4. ISP recursive resolver
# 5. Other public resolvers

# Check OS cache
# Linux (systemd-resolved):
resolvectl statistics | grep -i cache   # older systems: systemd-resolve --statistics

# macOS:
dscacheutil -statistics

# Windows:
ipconfig /displaydns

# Browser cache (Chrome):
# chrome://net-internals/#dns

# TTL flows down from authoritative to all caches
# Each cache honors TTL independently
# Worst-case wait = the full TTL (a cache that fetched the record just before the change)
```

  3. Check negative cache TTL for non-existent records.

```bash
# NXDOMAIN responses are cached too
# The negative cache TTL is the lower of the SOA record's own TTL
# and the SOA minimum field (RFC 2308)

# Check SOA minimum (negative cache TTL)
dig example.com SOA +short
# Output: ns1 admin serial refresh retry expire MINIMUM
#                                              ^^^^^^^
#                                  Last field is negative cache TTL

# Example:
# example.com SOA ns1.example.com admin.example.com 2026040401 3600 600 86400 3600
# Last 3600 = 1 hour negative cache

# Test non-existent subdomain
dig nonexistent123.example.com A

# Look at SOA in authority section for negative TTL:
# ;; AUTHORITY SECTION:
# example.com. 3600 IN SOA ...

# Negative caching affects:
# - Typos in hostnames
# - Records that don't exist yet
# - Deleted and re-created records that still return NXDOMAIN
```
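How long an NXDOMAIN answer may be cached can be worked out from a single SOA line. This is a minimal sketch, assuming the RFC 2308 rule that resolvers use the lower of the SOA record's own TTL and its MINIMUM field. The function name `negative_ttl` and the sample record are illustrative, not taken from a real zone.

```bash
# negative_ttl: read one full SOA record line from stdin
# (name ttl class SOA mname rname serial refresh retry expire minimum)
# and print min(record TTL, MINIMUM field).
negative_ttl() {
  awk '$4 == "SOA" && NF >= 11 {
    ttl = $2 + 0
    minimum = $11 + 0
    print (ttl < minimum ? ttl : minimum)
    exit
  }'
}

# Illustrative record: the SOA itself has TTL 86400, the MINIMUM field is 3600
echo 'example.com. 86400 IN SOA ns1.example.com. admin.example.com. 2026040401 3600 600 86400 3600' \
  | negative_ttl

# With live data the input could come from the authority section of an NXDOMAIN reply:
#   dig +noall +authority nonexistent123.example.com A | negative_ttl
```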

  4. Flush DNS caches at different levels.

```bash
# Flush local system cache

# Linux (systemd-resolved):
sudo resolvectl flush-caches   # older systems: sudo systemd-resolve --flush-caches
sudo systemctl restart systemd-resolved

# Linux (nscd):
sudo systemctl restart nscd

# Linux (dnsmasq):
sudo systemctl restart dnsmasq

# macOS:
sudo dscacheutil -flushcache
sudo killall -HUP mDNSResponder

# Windows:
ipconfig /flushdns

# Browser DNS cache:
# Chrome: chrome://net-internals/#dns -> Clear host cache
# Firefox: about:networking#dns -> Clear DNS Cache

# Note: You cannot flush ISP or public resolver caches
# You must wait for TTL to expire
```

  5. Prepare for planned DNS changes by lowering TTL.

```bash
# Best practice: Lower TTL 24-48 hours before planned change

# Original TTL (e.g., 1 day):
example.com. 86400 IN A 192.0.2.1

# Step 1: Lower TTL to 300-600 seconds (24-48 hours before change)
example.com. 300 IN A 192.0.2.1

# Step 2: Make the change (TTL is now low, quick propagation)
example.com. 300 IN A 192.0.2.2

# Step 3: Raise TTL back (after propagation confirmed)
example.com. 3600 IN A 192.0.2.2

# Calculate timing:
# 1. Lower TTL to 300 seconds
# 2. Wait for old TTL (86400) to expire = 24 hours
# 3. All caches now have 300 second TTL
# 4. Make change
# 5. Maximum propagation = 300 seconds (5 minutes)
# 6. Verify change
# 7. Raise TTL back to 3600 or 86400
```
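The timing arithmetic above can be sketched as a small function. The name `plan_ttl_change` is hypothetical. It assumes well-behaved caches, so a cache filled just before the TTL was lowered holds the record for the full old TTL, and after the change no cache holds the old answer longer than the new TTL.

```bash
# plan_ttl_change OLD_TTL NEW_TTL: print how long to wait after lowering
# the TTL, and the worst-case propagation time once the record is changed.
plan_ttl_change() {
  local old_ttl="$1" new_ttl="$2"
  # Caches filled just before the TTL was lowered keep the old TTL in full
  echo "wait_before_change=${old_ttl}s ($((old_ttl / 3600))h $((old_ttl % 3600 / 60))m)"
  # After the change, a cache can hold the old answer for at most the new TTL
  echo "max_propagation=${new_ttl}s ($((new_ttl / 60))m $((new_ttl % 60))s)"
}

plan_ttl_change 86400 300
# wait_before_change=86400s (24h 0m)
# max_propagation=300s (5m 0s)
```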

  6. Check TTL on records across authoritative servers.

```bash
# Verify TTL is consistent across all authoritative servers

for ns in $(dig example.com NS +short); do
  echo "=== ${ns%.} ==="
  dig @"${ns%.}" example.com A | grep -E "^example.com|^www.example"
done

# All servers should show same TTL
# If different, they may have different zone file versions

# Check serial numbers match too (the serial is the third SOA field)
for ns in $(dig example.com NS +short); do
  echo -n "${ns%.}: "
  dig @"${ns%.}" example.com SOA +short | awk '{print $3}'
done
```
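Comparing serials by eye gets error-prone with more than a couple of name servers. The following sketch automates the comparison. The function name `serials_consistent` is hypothetical, and it assumes its input is one "nameserver serial" pair per line. The sample pairs are illustrative, not real query results.

```bash
# serials_consistent: read "nameserver serial" pairs from stdin.
# Exit 0 if all serials match, 1 on mismatch, 2 if there was no input.
serials_consistent() {
  awk '
    NF >= 2 { seen[$2] = seen[$2] " " $1; n++ }
    END {
      if (n == 0) { print "NO DATA"; exit 2 }
      distinct = 0
      for (s in seen) distinct++
      if (distinct == 1) { for (s in seen) print "OK: serial " s; exit 0 }
      for (s in seen) print "MISMATCH: serial " s " on" seen[s]
      exit 1
    }'
}

# Illustrative input
printf '%s\n' 'ns1.example.net 2026040401' 'ns2.example.net 2026040401' | serials_consistent

# With live data the pairs could be produced like this:
#   for ns in $(dig example.com NS +short); do
#     echo "${ns%.} $(dig @"${ns%.}" example.com SOA +short | awk '{print $3}')"
#   done | serials_consistent
```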

  7. Monitor TTL countdown on cached records.

```bash
# Watch TTL decrease on cached record
# This shows TTL counting down

domain="example.com"
resolver="8.8.8.8"

# Check authoritative TTL first (the loop below runs until interrupted)
echo "Authoritative TTL:"
dig @ns1.yourprovider.com +noall +answer "$domain" A | head -1

echo "Watching TTL countdown for $domain via $resolver"
echo "Press Ctrl+C to stop"

while true; do
  ttl=$(dig @"$resolver" +noall +answer "$domain" A | awk 'NR==1 {print $2}')
  echo "$(date '+%H:%M:%S') TTL: $ttl seconds"
  sleep 5
done

# If TTL doesn't decrease, record might be re-queried
# If TTL stays high, resolver might be ignoring TTL
# Large public resolvers run many independent caches, so the TTL can jump between queries
```

  8. Diagnose resolvers ignoring TTL.

```bash
# Some resolvers cache longer than TTL (violates RFC)

# Test by checking TTL over time
# Query same resolver repeatedly:
for i in {1..10}; do
  echo -n "Query $i: "
  dig @8.8.8.8 +noall +answer example.com A | awk 'NR==1 {print $2}'
  sleep 60
done

# If TTL stays at same high value, resolver may be:
# 1. Serving from cache with extended TTL
# 2. Re-fetching and resetting TTL
# 3. Ignoring your authoritative TTL

# Compare with authoritative:
echo "Authoritative TTL:"
dig @ns1.yourprovider.com +noall +answer example.com A | head -1

# Test multiple resolvers
for resolver in 8.8.8.8 1.1.1.1 9.9.9.9; do
  echo -n "$resolver: "
  dig @"$resolver" +noall +answer example.com A | awk 'NR==1 {print $2}'
done
```
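Reading a column of sampled TTLs by eye is tedious, so the interpretation can be roughed out in code. This is a sketch under stated assumptions: the function name `classify_ttl_samples` and its four verdict labels are hypothetical, and the input is one TTL per line in the order the samples were taken. A "reset" verdict only says that the TTL went up at some point; it cannot tell a re-fetch from an answer served by a different backend cache.

```bash
# classify_ttl_samples: read one TTL per line from stdin and print a verdict:
#   insufficient  - fewer than two samples
#   reset         - the TTL increased at least once
#   counting-down - the TTL decreased and never increased
#   static        - the TTL never changed
classify_ttl_samples() {
  awk '
    NR > 1 && $1 + 0 > prev { up++ }
    NR > 1 && $1 + 0 < prev { down++ }
    { prev = $1 + 0 }
    END {
      if (NR < 2)        { print "insufficient" }
      else if (up > 0)   { print "reset" }
      else if (down > 0) { print "counting-down" }
      else               { print "static" }
    }'
}

# Illustrative samples taken 60 seconds apart
printf '%s\n' 3600 3540 3480 | classify_ttl_samples
```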

  9. Handle TTL for specific record types.

```bash
# Different record types may have different TTL needs

# High traffic records (longer TTL):
# - A/AAAA records for stable IPs: 3600-86400
# - MX records: 3600-14400
# - NS records: 86400+

# Frequently changed records (shorter TTL):
# - Development/staging environments: 60-300
# - Load balanced with failover: 30-60
# - CDN/proxy endpoints: 60-300

# Check current TTLs for all record types
echo "=== All TTLs for example.com ==="
for type in A AAAA MX NS TXT SRV CNAME; do
  # TTL of the first answer line of the requested type, if any
  ttl=$(dig @ns1.yourprovider.com +noall +answer example.com "$type" \
    | awk -v t="$type" '$4 == t {print $2; exit}')
  if [ -n "$ttl" ]; then
    echo "$type: TTL=$ttl"
  fi
done

# BIND zone file with mixed TTLs:
example.com.      86400  IN  SOA    ...                    ; SOA long TTL
example.com.      3600   IN  A      192.0.2.1              ; A medium TTL
www.example.com.  300    IN  CNAME  example.com.           ; www short TTL for changes
example.com.      14400  IN  MX     10 mail.example.com.   ; MX medium TTL
```
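Zone entries can be checked against a chosen TTL range with a few lines of awk. This is an illustrative sketch only: the function name `flag_ttls` is hypothetical, it assumes every record line carries an explicit TTL in the second column, and it does not handle `$TTL` defaults or multi-line records.

```bash
# flag_ttls MIN MAX: read zone-style lines (name ttl class type rdata)
# from stdin and print each record whose TTL is outside MIN..MAX seconds.
flag_ttls() {
  awk -v min="$1" -v max="$2" '
    $1 ~ /^;/ || NF < 4 { next }     # skip comments and short lines
    $2 ~ /^[0-9]+$/ && ($2 + 0 < min + 0 || $2 + 0 > max + 0) {
      print $1, $4, "TTL=" $2
    }'
}

# Illustrative records checked against a 300-86400 second range
printf '%s\n' \
  'example.com. 3600 IN A 192.0.2.1' \
  'www.example.com. 30 IN CNAME example.com.' \
  'example.com. 14400 IN MX 10 mail.example.com.' | flag_ttls 300 86400
# www.example.com. CNAME TTL=30
```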

  10. Fix propagation issues with long TTL records.

```bash
# When you already changed a record but old TTL is still cached

# Option 1: Wait for TTL to expire
# Maximum wait = old TTL at time of change

# Option 2: Use a new hostname
# Create new record with new name
new.example.com. 300 IN A 192.0.2.2

# Option 3: Communicate with users
# Tell them to flush DNS or wait

# Option 4: Check if authoritative shows correct value
dig @ns1.yourprovider.com example.com A +short

# If authoritative is correct, issue is cached resolvers
# Wait for TTL to expire globally

# Estimate propagation:
old_ttl=86400        # 1 day
time_elapsed=36000   # 10 hours
remaining=$((old_ttl - time_elapsed))
echo "Estimated remaining propagation time: $remaining seconds ($((remaining / 3600)) hours)"

# Use online propagation checkers:
# - https://dnschecker.org
# - https://www.whatsmydns.net
# These check from multiple global locations
```

Verification

Complete TTL verification checklist:

```bash
# 1. Check authoritative TTL
echo "=== Authoritative TTL ==="
dig @ns1.yourprovider.com +noall +answer example.com A | head -1

# 2. Check cached TTL at multiple resolvers
echo -e "\n=== Cached TTLs ==="
for resolver in 8.8.8.8 1.1.1.1; do
  echo -n "$resolver: "
  dig @"$resolver" +noall +answer example.com A | awk 'NR==1 {print "TTL="$2", IP="$5}'
done

# 3. Check negative cache TTL
echo -e "\n=== Negative Cache TTL (SOA minimum) ==="
dig @ns1.yourprovider.com example.com SOA +short | awk '{print "Minimum="$7" seconds"}'

# 4. Check all record TTLs
echo -e "\n=== All Record TTLs ==="
for type in A MX NS; do
  ttl=$(dig @ns1.yourprovider.com +noall +answer example.com "$type" | awk 'NR==1 {print $2}')
  echo "$type: $ttl seconds"
done

# 5. Verify TTL is appropriate for use case
echo -e "\n=== TTL Recommendations ==="
echo "Static websites: 3600-86400"
echo "Frequent changes: 300-1800"
echo "Load balanced: 30-60"
echo "Development: 60-300"
```

TTL Cheat Sheet

```bash
# TTL Quick Reference:
# 60     = 1 minute
# 300    = 5 minutes (good for changes)
# 600    = 10 minutes
# 1800   = 30 minutes
# 3600   = 1 hour
# 7200   = 2 hours
# 14400  = 4 hours
# 28800  = 8 hours
# 43200  = 12 hours
# 86400  = 1 day
# 604800 = 1 week

# Change Process:
# 1. [T-48h] Lower TTL to 300
# 2. [T-0h]  Make change
# 3. [T+5m]  Verify change at multiple resolvers
# 4. [T+1h]  Raise TTL to 3600
# 5. [T+24h] Raise TTL to 86400 (optional)
```
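For TTL values that are not in the table, a small conversion helper can save mental arithmetic. This is a minimal sketch. The function name `ttl_human` is hypothetical, and the compact d/h/m/s output format is just one possible choice.

```bash
# ttl_human SECONDS: print a TTL as days/hours/minutes/seconds, e.g. 5400 -> 1h30m
ttl_human() {
  local s="$1" out=""
  local d=$((s / 86400)) h=$((s % 86400 / 3600)) m=$((s % 3600 / 60)) r=$((s % 60))
  if [ "$d" -gt 0 ]; then out+="${d}d"; fi
  if [ "$h" -gt 0 ]; then out+="${h}h"; fi
  if [ "$m" -gt 0 ]; then out+="${m}m"; fi
  # Show seconds when non-zero, or when nothing else was printed (TTL 0)
  if [ "$r" -gt 0 ] || [ -z "$out" ]; then out+="${r}s"; fi
  echo "$out"
}

for ttl in 60 300 3600 86400 604800; do
  printf '%-7s = %s\n' "$ttl" "$(ttl_human "$ttl")"
done
```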

Remember: resolvers are expected to honor TTL, but in practice not all of them do. Some clamp it to their own minimum or maximum, so a record may be cached longer or shorter than specified.