Introduction Cloud DNS resolution failures cause service discovery issues, broken internal communication, and failed deployments. When DNS records in managed zones cannot be resolved, applications cannot find services by name.
Symptoms - `nslookup my-service.internal` returns NXDOMAIN or SERVFAIL - Applications fail with "could not resolve host" errors - DNS queries timeout after several seconds - Only specific VPCs affected while others resolve correctly
Common Causes - Private zone not bound to the querying VPC network - DNS peering zone not configured correctly - Inbound/outbound forwarding rules misconfigured - Record set propagation delay after creation/update - Conflicting DNS policies
Step-by-Step Fix 1. **Check if private zone is bound to the VPC**: ```bash gcloud dns managed-zones describe <zone-name> --format="value(dnsName,visibility)" ``` If the querying VPC is not in the networks list, add it: ```bash gcloud dns managed-zones update <zone-name> --networks default,vpc-a ```
- 1.Check VPC DNS policy:
- 2.```bash
- 3.gcloud dns policies list --format="value(name,enableInboundForwarding)"
- 4.
` - 5.Verify record sets exist:
- 6.```bash
- 7.gcloud dns record-sets list --zone <zone-name> --format="table(name,type,ttl,rrdatas)"
- 8.
` - 9.Test DNS resolution from a VM in the VPC:
- 10.```bash
- 11.gcloud compute ssh <vm-name> --zone <zone> --command "nslookup my-service.internal"
- 12.
`