What's Actually Happening
DNS resolution intermittently times out, causing applications to fail connecting to services. Some queries succeed while others fail without a clear pattern, affecting system reliability.
The Error You'll See
Intermittent DNS failure:
```bash $ curl https://example.com
curl: (6) Could not resolve host: example.com
# Try again: $ curl https://example.com # Works fine ```
nslookup timeout:
```bash $ nslookup example.com
;; connection timed out; no servers could be reached
# Second attempt: $ nslookup example.com Server: 8.8.8.8 Address: 8.8.8.8#53
Non-authoritative answer: Name: example.com Address: 93.184.216.34 ```
Application errors:
# In application logs:
java.net.UnknownHostException: api.example.com: Name or service not known
# or
getaddrinfo failed: Temporary failure in name resolutionWhy This Happens
- 1.DNS server overload - Server cannot handle query volume
- 2.Network packet loss - UDP packets dropped
- 3.Firewall rules - Intermittently blocking DNS
- 4.DNS server unreachable - Temporary connectivity issues
- 5.Resolver cache issues - Local resolver problems
- 6.Multiple DNS servers misconfigured - Failover not working
Step 1: Check Current DNS Configuration
```bash # Check resolver configuration cat /etc/resolv.conf
# Check for multiple nameservers grep nameserver /etc/resolv.conf
# Check search domains grep search /etc/resolv.conf
# Check DNS options grep options /etc/resolv.conf
# Check systemd-resolved status systemctl status systemd-resolved
# Check NetworkManager DNS nmcli dev show | grep DNS
# Check if using local DNS cache ps aux | grep -E "dnsmasq|unbound|systemd-resolve" ```
Step 2: Test DNS Resolution
```bash # Test with dig (shows timing) dig example.com
# Test specific DNS server dig @8.8.8.8 example.com
# Test with timing info dig +stats example.com
# Query statistics: ;; Query time: 50 msec ;; SERVER: 8.8.8.8#53(8.8.8.8) ;; WHEN: Wed Apr 16 00:59:00 UTC 2026 ;; MSG SIZE rcvd: 56
# Test with nslookup nslookup example.com 8.8.8.8
# Test with host host example.com 8.8.8.8
# Test UDP connectivity nc -uzv 8.8.8.8 53
# Test TCP DNS (for large responses) dig +tcp @8.8.8.8 example.com ```
Step 3: Run Continuous DNS Tests
```bash # Continuous DNS monitoring while true; do echo "$(date): $(dig +short @8.8.8.8 example.com 2>&1)" sleep 1 done
# Track success/failure rate cat << 'EOF' > dns_test.sh #!/bin/bash SERVER=8.8.8.8 DOMAIN=example.com COUNT=100 SUCCESS=0 FAIL=0
for i in $(seq 1 $COUNT); do RESULT=$(dig +short +time=2 @$SERVER $DOMAIN 2>&1) if [ -n "$RESULT" ]; then ((SUCCESS++)) echo -n "." else ((FAIL++)) echo -n "X" fi done
echo "" echo "Success: $SUCCESS/$COUNT" echo "Failed: $FAIL/$COUNT" echo "Success rate: $(echo "scale=2; $SUCCESS*100/$COUNT" | bc)%" EOF
chmod +x dns_test.sh ./dns_test.sh
# Test all configured nameservers for ns in $(grep nameserver /etc/resolv.conf | awk '{print $2}'); do echo "Testing $ns..." for i in {1..10}; do dig +short +time=2 @$ns google.com >/dev/null && echo -n "." || echo -n "X" done echo "" done ```
Step 4: Check DNS Server Health
```bash # Check if DNS servers are responding for ns in 8.8.8.8 8.8.4.4; do echo "Testing $ns:" ping -c 5 $ns echo "" done
# Check DNS server latency for ns in 8.8.8.8 8.8.4.4; do echo "Latency to $ns:" for i in {1..10}; do time dig +short @$ns example.com >/dev/null 2>&1 done done
# Check if using IPv6 DNS (may have issues) dig @2001:4860:4860::8888 example.com
# Test public vs private DNS time dig @8.8.8.8 example.com time dig @192.168.1.1 example.com # Local DNS ```
Step 5: Check Network for Packet Loss
```bash # Check for packet loss to DNS servers mtr -c 100 -r 8.8.8.8
# Check interface errors ip -s link show eth0
# Look for: # RX errors, dropped, overruns # TX errors, carrier
# Check for UDP packet drops netstat -su | grep -A 5 Udp
# Check for dropped packets cat /proc/net/snmp | grep Udp
# Check firewall drops iptables -L -n -v | grep -i drop
# Monitor packet loss in real-time watch -n 1 'netstat -su | grep -E "packet receive errors|buffer errors"' ```
Step 6: Configure DNS Timeouts and Retries
```bash # Edit /etc/resolv.conf to add options:
# Increase timeout (default 5 seconds per attempt) options timeout:10
# Increase attempts (default 2) options attempts:5
# Combine options options timeout:10 attempts:5
# Rotate between nameservers options rotate
# Use EDNS0 for larger responses options edns0
# Single-request (avoid parallel queries) options single-request
# Single-request-reopen (new socket per query) options single-request-reopen
# Complete example: nameserver 8.8.8.8 nameserver 8.8.4.4 nameserver 1.1.1.1 options timeout:5 attempts:3 rotate
# For systemd-resolved, edit: # /etc/systemd/resolved.conf [Resolve] DNS=8.8.8.8 8.8.4.4 1.1.1.1 FallbackDNS=1.0.0.1 DNSStubListener=yes ReadEtcHosts=yes
# Restart systemd-resolved systemctl restart systemd-resolved ```
Step 7: Add Backup DNS Servers
```bash # Add multiple DNS servers for redundancy
# In /etc/resolv.conf: nameserver 8.8.8.8 # Google primary nameserver 8.8.4.4 # Google secondary nameserver 1.1.1.1 # Cloudflare primary nameserver 1.0.0.1 # Cloudflare secondary
# With rotation for load balancing: options rotate timeout:2 attempts:2
# For NetworkManager: nmcli con mod eth0 ipv4.dns "8.8.8.8 8.8.4.4 1.1.1.1" nmcli con up eth0
# Verify changes cat /etc/resolv.conf ```
Step 8: Set Up Local DNS Cache
```bash # Install dnsmasq for local caching apt-get install dnsmasq # Ubuntu/Debian yum install dnsmasq # CentOS/RHEL
# Configure dnsmasq cat << 'EOF' > /etc/dnsmasq.conf # Use upstream DNS servers server=8.8.8.8 server=8.8.4.4 server=1.1.1.1
# Cache size cache-size=1000
# Cache TTL min-cache-ttl=60 max-cache-ttl=3600
# Listen on localhost listen-address=127.0.0.1 bind-interfaces
# Log queries (for debugging) # log-queries EOF
# Start dnsmasq systemctl enable dnsmasq systemctl start dnsmasq
# Update resolv.conf to use local cache echo "nameserver 127.0.0.1" > /etc/resolv.conf echo "options timeout:2 attempts:2" >> /etc/resolv.conf
# Test cache dig @127.0.0.1 example.com # First query: ~50ms dig @127.0.0.1 example.com # Second query: ~1ms (cached) ```
Step 9: Check Application DNS Settings
```bash # Java applications # Add to JVM options: -Dsun.net.inetaddr.ttl=30 -Dsun.net.inetaddr.negative.ttl=10
# Or in java.security: networkaddress.cache.ttl=30 networkaddress.cache.negative.ttl=10
# Python applications # Use dnspython for explicit DNS: import dns.resolver resolver = dns.resolver.Resolver() resolver.nameservers = ['8.8.8.8', '1.1.1.1'] resolver.timeout = 5 resolver.lifetime = 10 answer = resolver.resolve('example.com', 'A')
# Node.js applications # Set DNS servers: const dns = require('dns'); dns.setServers(['8.8.8.8', '8.8.4.4']);
# glibc resolver options # In /etc/resolv.conf: options timeout:5 attempts:3 rotate ```
Step 10: Monitor DNS Resolution
```bash # Create monitoring script cat << 'EOF' > monitor_dns.sh #!/bin/bash
echo "=== DNS Configuration ===" cat /etc/resolv.conf
echo "" echo "=== DNS Server Latency ===" for ns in $(grep nameserver /etc/resolv.conf | awk '{print $2}'); do echo -n "$ns: " avg=$(for i in {1..5}; do dig +short @$ns google.com 2>/dev/null | head -1 && echo "ok" || echo "fail"; done | grep -c ok) echo "$avg/5 successful" done
echo "" echo "=== Recent DNS Failures ===" journalctl -u systemd-resolved --since "1 hour ago" | grep -i "fail|error|timeout" | tail -10
echo "" echo "=== DNS Cache Stats (dnsmasq) ===" if pgrep dnsmasq >/dev/null; then echo "Cache size: $(sudo dnsmasq --dump-cache 2>/dev/null | wc -l)" sudo kill -USR1 $(cat /var/run/dnsmasq/dnsmasq.pid) fi
echo "" echo "=== UDP Statistics ===" cat /proc/net/snmp | grep Udp EOF
chmod +x monitor_dns.sh
# Set up continuous monitoring watch -n 10 ./monitor_dns.sh
# Prometheus node exporter DNS monitoring # Add to prometheus alerts: - alert: DNSResolutionFailure expr: probe_success{job="dns_probe"} == 0 for: 2m labels: severity: critical annotations: summary: "DNS resolution failing" EOF ```
DNS Resolution Timeout Checklist
| Check | Command | Expected | |
|---|---|---|---|
| Resolver config | cat /etc/resolv.conf | Valid nameservers | |
| DNS response | dig @server domain | Returns IP | |
| Latency | dig +stats | < 100ms typical | |
| Packet loss | mtr dns-server | < 1% | |
| Multiple servers | grep nameserver | 2+ servers | |
| Timeout options | grep options | timeout >= 5 | |
| Local cache | ps aux \ | grep dnsmasq | Running |
Verify the Fix
```bash # After adjusting DNS configuration
# 1. Test consistent resolution for i in {1..20}; do dig +short google.com; done // All queries return same IP
# 2. Measure response times for i in {1..20}; do time dig +short google.com; done 2>&1 | grep real // Consistent low latency
# 3. Test all configured servers for ns in $(grep nameserver /etc/resolv.conf | awk '{print $2}'); do echo "Testing $ns:" for i in {1..5}; do dig +short @$ns google.com || echo "FAIL"; done done // All servers respond
# 4. Monitor for failures timeout 60 bash -c 'while true; do dig +short google.com || echo "FAIL at $(date)"; sleep 1; done' // No failures for 60 seconds
# 5. Check application connectivity curl -I https://google.com // Connects successfully ```
Related Issues
- [Fix DNS Resolution Failed](/articles/fix-dns-resolution-failed)
- [Fix DNS Server Not Responding](/articles/fix-dns-server-not-responding)
- [Fix DNS Propagation Delay](/articles/fix-dns-propagation-delay)