Introduction

If it is not optimized, DNS resolution in a service mesh can add latency to every inter-service call. The overhead accumulates along a chain of services: a request that crosses several hops pays the DNS cost at each one, which raises end-to-end latency.

Symptoms

- Service-to-service calls taking longer than expected
- DNS resolution time contributing significantly to latency
- High CoreDNS CPU usage
- Service mesh proxy resolving DNS for every request
- Intermittent DNS timeouts

Common Causes

- Envoy proxy not caching DNS resolution
- CoreDNS overwhelmed by resolution requests
- DNS search domains adding resolution overhead
- TTL too short, causing frequent re-resolution
- DNS resolution happening in the request path
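
To illustrate the search-domain overhead listed above, here is a minimal sketch of how a stub resolver expands a name. It is a simplified model of glibc-style `resolv.conf` behavior, not the resolver's actual code; other resolvers (musl, for example) differ in detail. The function name `expand_search_names` is hypothetical, and the search list and `ndots:5` used in the usage note are the values Kubernetes typically writes into a pod's `/etc/resolv.conf`, assumed here rather than taken from the original text.

```shell
# Print, in order, the names a stub resolver would try for NAME.
# Usage: expand_search_names NAME NDOTS SEARCH_DOMAIN...
# The body runs in a subshell ( ... ) so its variables stay local.
expand_search_names() (
  name=$1
  ndots=$2
  shift 2
  case $name in
    *.)
      # A trailing dot marks the name as fully qualified: no search expansion.
      printf '%s\n' "$name"
      ;;
    *)
      # Count the dots in the name.
      dots=$(( $(printf '%s' "$name" | tr -cd '.' | wc -c) ))
      # With at least NDOTS dots, the literal name is tried first.
      if [ "$dots" -ge "$ndots" ]; then
        printf '%s.\n' "$name"
      fi
      # Each search domain is appended in turn.
      for domain in "$@"; do
        printf '%s.%s.\n' "$name" "$domain"
      done
      # With fewer dots, the literal name is tried only after the search list.
      if [ "$dots" -lt "$ndots" ]; then
        printf '%s.\n' "$name"
      fi
      ;;
  esac
)
```

Calling `expand_search_names my-service.my-namespace.svc.cluster.local 5 my-namespace.svc.cluster.local svc.cluster.local cluster.local` lists three search-suffixed candidates before the literal name, because the name has only four dots. Each candidate that misses costs a round trip to CoreDNS. Under this model, appending a trailing dot to the name skips the expansion entirely.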

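Step-by-Step Fix

1. **Check DNS resolution time**:
   ```bash
   # Requires `time` and `nslookup` to be available in the container image
   kubectl exec <pod> -- time nslookup my-service.my-namespace.svc.cluster.local
   ```

2. **Enable DNS caching in Envoy**:
   ```yaml
   # DestinationRule fragment (these field names match Istio's API)
   trafficPolicy:
     connectionPool:
       http:
         h2UpgradePolicy: DEFAULT
   # Assuming Istio: dnsRefreshRate is a mesh-wide MeshConfig field, not part of trafficPolicy
   meshConfig:
     dnsRefreshRate: 60s # Re-resolve DNS at most every 60 seconds
   ```

3. **Optimize CoreDNS**:
   ```bash
   kubectl edit configmap coredns -n kube-system
   # In the Corefile, find the `cache 30` line. The number is the maximum TTL
   # (in seconds) for cached answers, not a cache size; raise it to cache longer.
   ```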
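
Before editing the ConfigMap, it can help to read the current cache setting non-interactively. The following is a rough sketch: the helper name `corefile_cache_ttl` is hypothetical, and it assumes the Corefile uses the common `cache <seconds>` form on a line of its own.

```shell
# Read a Corefile on stdin and report the TTL given to each `cache` directive.
# Prints the number for `cache 30`, "default" when `cache` has no numeric
# argument, and "none" when the Corefile has no `cache` directive at all.
corefile_cache_ttl() {
  awk '
    $1 == "cache" {
      found = 1
      if ($2 ~ /^[0-9]+$/) {
        print $2
      } else {
        print "default"
      }
    }
    END {
      if (!found) print "none"
    }
  '
}
```

One way to use it is `kubectl get configmap coredns -n kube-system -o jsonpath='{.data.Corefile}' | corefile_cache_ttl`. A result of `none` would suggest that CoreDNS forwards or recomputes every answer, which fits the high CPU usage described under Symptoms.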

Prevention

- Use IP-based routing instead of DNS where possible
- Enable DNS caching in the service mesh proxy
- Monitor DNS resolution latency
- Optimize CoreDNS configuration for mesh scale
- Use service mesh native service discovery
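
When monitoring DNS resolution latency, a high percentile is usually more telling than the mean, because intermittent timeouts hide in the tail. As a minimal sketch, the hypothetical helper `percentile` below applies the nearest-rank method to latency samples. It assumes the samples have already been collected by some other means, one number per line, for example from repeated timed lookups.

```shell
# Print the P-th percentile (nearest-rank method) of numeric samples.
# Usage: percentile P   (P is an integer from 1 to 100; samples on stdin)
percentile() {
  sort -n | awk -v p="$1" '
    { v[NR] = $1 }
    END {
      # No samples: signal failure instead of printing a misleading value.
      if (NR == 0) exit 1
      # Nearest rank is ceil(p * N / 100), computed with integer arithmetic.
      rank = int((p * NR + 99) / 100)
      if (rank < 1) rank = 1
      print v[rank]
    }
  '
}
```

For instance, `printf '%s\n' 12 3 5 40 7 | percentile 95` selects the largest sample, while `percentile 50` selects the median of the five.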