Introduction Service mesh DNS resolution can add latency to every inter-service call if not properly optimized. This accumulates across service chains, significantly impacting end-to-end latency.
Symptoms - Service-to-service calls taking longer than expected - DNS resolution time contributing significantly to latency - CoreDNS CPU usage high - Service mesh proxy resolving DNS for every request - Intermittent DNS timeouts
Common Causes - Envoy proxy not caching DNS resolution - CoreDNS overwhelmed by resolution requests - DNS search domains adding resolution overhead - TTL too short causing frequent re-resolution - DNS resolution happening in request path
Step-by-Step Fix 1. **Check DNS resolution time': ```bash kubectl exec <pod> -- time nslookup my-service.my-namespace.svc.cluster.local ```
- 1.**Enable DNS caching in Envoy':
- 2.```yaml
- 3.trafficPolicy:
- 4.connectionPool:
- 5.http:
- 6.h2UpgradePolicy: DEFAULT
- 7.dnsRefreshRate: 60s # Cache DNS for 60 seconds
- 8.
` - 9.**Optimize CoreDNS':
- 10.```bash
- 11.kubectl edit configmap coredns -n kube-system
- 12.# Increase cache size
- 13.# cache 30
- 14.
`