What's Actually Happening

Kubernetes pods cannot resolve DNS names. Services cannot be discovered by name, breaking cluster-internal communication and external connectivity.

The Error You'll See

DNS resolution failure:

```bash $ kubectl exec mypod -- nslookup kubernetes

;; connection timed out; no servers could be reached ```

Service discovery failed:

```bash $ kubectl exec mypod -- curl http://api-service:8080

curl: (6) Could not resolve host: api-service ```

CoreDNS error:

```bash $ kubectl logs -n kube-system coredns-xxx

[ERROR] plugin/errors: 2 kubernetes.default.svc.cluster.local. A: dns: no such host ```

Why This Happens

  1. 1.CoreDNS down - CoreDNS pods not running
  2. 2.ConfigMap misconfigured - CoreDNS configuration errors
  3. 3.Network policies - Blocking DNS traffic
  4. 4.resolv.conf wrong - Incorrect DNS configuration in pods
  5. 5.Node DNS issues - Host DNS not working
  6. 6.Cluster IP conflict - DNS service IP conflict

Step 1: Check CoreDNS Status

```bash # Check CoreDNS pods: kubectl get pods -n kube-system -l k8s-app=kube-dns

# Should show Running state

# Check CoreDNS service: kubectl get svc -n kube-system kube-dns

# Check endpoints: kubectl get endpoints -n kube-system kube-dns

# Check CoreDNS logs: kubectl logs -n kube-system -l k8s-app=kube-dns

# Check CoreDNS metrics: kubectl exec -n kube-system coredns-xxx -- curl localhost:9153/metrics ```

Step 2: Test DNS Resolution

```bash # Test DNS from pod: kubectl exec mypod -- nslookup kubernetes kubectl exec mypod -- nslookup kubernetes.default.svc.cluster.local

# Test external DNS: kubectl exec mypod -- nslookup google.com

# Test with dig: kubectl exec mypod -- dig @10.96.0.10 kubernetes.default.svc.cluster.local

# Test with curl: kubectl exec mypod -- curl -v http://kubernetes/api/v1

# Create test pod: kubectl run dns-test --rm -it --image=busybox -- nslookup kubernetes

# Check resolv.conf in pod: kubectl exec mypod -- cat /etc/resolv.conf

# Expected: # nameserver 10.96.0.10 # search default.svc.cluster.local svc.cluster.local cluster.local # options ndots:5 ```

Step 3: Check CoreDNS Configuration

```bash # Get CoreDNS ConfigMap: kubectl get configmap coredns -n kube-system -o yaml

# Corefile should have: apiVersion: v1 kind: ConfigMap metadata: name: coredns namespace: kube-system data: Corefile: | .:53 { errors health { lameduck 5s } ready kubernetes cluster.local in-addr.arpa ip6.arpa { pods insecure fallthrough in-addr.arpa ip6.arpa ttl 30 } prometheus :9153 forward . /etc/resolv.conf { max_concurrent 1000 } cache 30 loop reload loadbalance }

# Check for syntax errors: kubectl exec -n kube-system coredns-xxx -- coredns -conf /etc/coredns/Corefile -validate ```

Step 4: Check DNS Service IP

```bash # Check DNS service ClusterIP: kubectl get svc kube-dns -n kube-system -o jsonpath='{.spec.clusterIP}'

# Usually 10.96.0.10 or similar

# Verify no IP conflict: kubectl get svc -A -o wide | grep 10.96.0.10

# Check DNS service port: kubectl get svc kube-dns -n kube-system -o jsonpath='{.spec.ports}'

# Should be 53/UDP and 53/TCP

# Test connectivity to DNS service: kubectl run test --rm -it --image=busybox -- nc -zuv 10.96.0.10 53

# Check kubelet DNS configuration: ps aux | grep kubelet | grep cluster-dns ```

Step 5: Check Network Policies

```bash # List network policies: kubectl get networkpolicy -A

# Check if policy blocks DNS: kubectl describe networkpolicy -n myapp deny-all

# Allow DNS traffic: apiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: name: allow-dns spec: podSelector: {} policyTypes: - Egress egress: - to: - namespaceSelector: {} podSelector: matchLabels: k8s-app: kube-dns ports: - protocol: UDP port: 53 - protocol: TCP port: 53

# Apply policy: kubectl apply -f allow-dns.yaml ```

Step 6: Check Node DNS

```bash # Check node resolv.conf: cat /etc/resolv.conf

# Test DNS on node: nslookup google.com dig google.com

# Check if node can reach external DNS: ping 8.8.8.8

# Check node DNS service: systemctl status systemd-resolved

# Check if DNS ports blocked: iptables -L -n | grep 53

# Allow DNS: iptables -I INPUT -p udp --dport 53 -j ACCEPT iptables -I INPUT -p tcp --dport 53 -j ACCEPT

# Check for DNS cache issues: systemctl restart systemd-resolved ```

Step 7: Check Pod Networking

```bash # Check pod network: kubectl exec mypod -- ip route kubectl exec mypod -- ip addr

# Check pod can reach DNS service: kubectl exec mypod -- ping -c 3 10.96.0.10

# Check CNI plugin: ls /etc/cni/net.d/

# For Calico: kubectl get pods -n kube-system -l k8s-app=calico-node

# For Flannel: kubectl get pods -n kube-system -l app=flannel

# For Weave: kubectl get pods -n kube-system -l name=weave-net

# Check kube-proxy: kubectl get pods -n kube-system -l k8s-app=kube-proxy kubectl logs -n kube-system -l k8s-app=kube-proxy ```

Step 8: Check DNS Cache

```bash # CoreDNS cache statistics: kubectl exec -n kube-system coredns-xxx -- curl localhost:9153/metrics | grep coredns_cache

# Check cache hits/misses: # coredns_cache_hits_total # coredns_cache_misses_total

# Clear DNS cache (restart CoreDNS): kubectl rollout restart deployment coredns -n kube-system

# Check cache size: kubectl exec -n kube-system coredns-xxx -- curl localhost:9153/metrics | grep coredns_cache_size

# Adjust cache in Corefile: cache 30 { success 9984 30 denial 9984 5 } ```

Step 9: Fix Specific DNS Issues

```bash # Issue: External DNS not resolving # Fix: Check forward plugin in Corefile forward . /etc/resolv.conf { max_concurrent 1000 }

# Change to specific DNS servers: forward . 8.8.8.8 8.8.4.4 { max_concurrent 1000 }

# Issue: DNS loop detected # Fix: Check for loop in resolv.conf # Remove 127.0.0.1 from node resolv.conf

# Issue: DNS timeout # Fix: Increase timeout or check network forward . /etc/resolv.conf { max_concurrent 1000 policy sequential health_check 10s }

# Apply changes: kubectl rollout restart deployment coredns -n kube-system ```

Step 10: Monitor DNS Health

```bash # Create monitoring script: cat << 'EOF' > /usr/local/bin/monitor-k8s-dns.sh #!/bin/bash

echo "=== CoreDNS Pods ===" kubectl get pods -n kube-system -l k8s-app=kube-dns

echo "" echo "=== DNS Service ===" kubectl get svc -n kube-system kube-dns

echo "" echo "=== DNS Endpoints ===" kubectl get endpoints -n kube-system kube-dns

echo "" echo "=== Test Internal DNS ===" kubectl run dns-test --rm -it --image=busybox --restart=Never -- nslookup kubernetes 2>/dev/null

echo "" echo "=== Test External DNS ===" kubectl run dns-test-ext --rm -it --image=busybox --restart=Never -- nslookup google.com 2>/dev/null

echo "" echo "=== CoreDNS Metrics ===" kubectl exec -n kube-system $(kubectl get pods -n kube-system -l k8s-app=kube-dns -o jsonpath='{.items[0].metadata.name}') -- curl -s localhost:9153/metrics | grep -E "coredns_(cache|dns|forward)" EOF

chmod +x /usr/local/bin/monitor-k8s-dns.sh

# Prometheus metrics: # coredns_dns_request_duration_seconds # coredns_dns_requests_total # coredns_cache_hits_total # coredns_forward_request_duration_seconds

# Alerts: - alert: CoreDNSDown expr: kube_deployment_status_replicas_available{deployment="coredns", namespace="kube-system"} == 0 for: 2m labels: severity: critical annotations: summary: "CoreDNS deployment has no available replicas"

  • alert: DNSResolutionFailing
  • expr: rate(coredns_dns_response_rcode_count_total{rcode="SERVFAIL"}[5m]) > 0
  • for: 5m
  • labels:
  • severity: warning
  • annotations:
  • summary: "CoreDNS returning SERVFAIL errors"
  • `

Kubernetes DNS Checklist

CheckCommandExpected
CoreDNS podsget podsRunning
DNS serviceget svc kube-dnsClusterIP assigned
Endpointsget endpointsAddresses listed
Internal DNSnslookup kubernetesResolved
External DNSnslookup google.comResolved
Network policyget networkpolicyAllows DNS

Verify the Fix

```bash # After fixing DNS issues

# 1. Check CoreDNS running kubectl get pods -n kube-system -l k8s-app=kube-dns // All Running

# 2. Test internal resolution kubectl exec mypod -- nslookup kubernetes // Server: 10.96.0.10 // Address: 10.96.0.10#53 // Name: kubernetes.default.svc.cluster.local

# 3. Test external resolution kubectl exec mypod -- nslookup google.com // Resolved successfully

# 4. Test service discovery kubectl exec mypod -- curl http://api-service:8080/health // Connected

# 5. Check CoreDNS metrics kubectl exec -n kube-system coredns-xxx -- curl localhost:9153/metrics | grep coredns_dns_requests // Requests processed

# 6. Monitor ongoing /usr/local/bin/monitor-k8s-dns.sh // All tests pass ```

  • [Fix DNS Resolution Timeout Randomly](/articles/fix-dns-resolution-timeout-randomly)
  • [Fix Kubernetes Pod Not Ready](/articles/fix-kubernetes-pod-not-ready)
  • [Fix Kubernetes Service Not Accessible](/articles/fix-kubernetes-service-not-accessible)