Introduction
A platform migration can move workloads successfully while the service mesh still sends east-west traffic to the old service. Requests look healthy at the application level, yet data comes from the previous cluster, one version keeps receiving internal calls, or failover misbehaves because virtual routing inside the mesh was never fully updated during the migration.
Treat this as an internal traffic-policy problem instead of a generic microservice bug. Start by checking which destination the sidecars or mesh proxy actually choose at runtime, because migrations often replace deployments and DNS while virtual services, destination rules, or mesh discovery still favor the retired target.
Symptoms
- Internal service-to-service calls still reach the old environment after migration
- One namespace or workload works correctly while another still depends on the previous service instance
- Mesh metrics show traffic to a destination that should already be retired
- New deployments are healthy, but dependent services still read stale data or hit old APIs
- Canary or failover behavior does not match the intended post-migration routing
- The issue started after moving clusters, mesh control planes, or service discovery boundaries
Common Causes
- A virtual service or traffic policy still routes to the old host, subset, or cluster-local service name
- Destination rules, service entries, or mesh gateways still reference the previous endpoint
- Sidecars or proxies are running stale xDS or route configuration and never picked up the new destination
- Multi-cluster or cross-network discovery still advertises the old service first
- mTLS, SNI, or mesh gateway settings make the new target unusable, so traffic falls back to the legacy route
- Validation checked pod readiness and DNS resolution but not the live mesh path between services
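Most of these causes come down to one stale hostname surviving somewhere in the routing objects. A recursive scan over the exported resources (for example, from `kubectl get virtualservices,destinationrules,serviceentries -A -o json`) can find them; the resource shapes below are trimmed illustrations with made-up names, not complete Istio specs:

```python
def stale_references(resources: list[dict], retired_hosts: set[str]) -> list[str]:
    """Return the path of every field that still names a retired host."""
    hits: list[str] = []

    def walk(obj, path: str) -> None:
        if isinstance(obj, dict):
            for key, val in obj.items():
                walk(val, f"{path}.{key}")
        elif isinstance(obj, list):
            for i, val in enumerate(obj):
                walk(val, f"{path}[{i}]")
        elif isinstance(obj, str) and obj in retired_hosts:
            hits.append(path)

    for res in resources:
        name = f'{res.get("kind")}/{res.get("metadata", {}).get("name")}'
        walk(res.get("spec", {}), name)
    return hits

# Trimmed examples: a VirtualService already migrated, a DestinationRule that was missed.
resources = [
    {"kind": "VirtualService", "metadata": {"name": "orders"},
     "spec": {"hosts": ["orders.prod.svc.cluster.local"],
              "http": [{"route": [{"destination": {"host": "orders.prod.svc.cluster.local"}}]}]}},
    {"kind": "DestinationRule", "metadata": {"name": "orders-legacy"},
     "spec": {"host": "orders.legacy.svc.cluster.local"}},
]
print(stale_references(resources, {"orders.legacy.svc.cluster.local"}))
# -> ['DestinationRule/orders-legacy.host']
```

Running a scan like this against every routing resource type at once matters because, as noted above, mesh routing usually spans several objects.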
Step-by-Step Fix
- Capture one affected internal request and identify which destination service, cluster, or subset the mesh actually selected, because the runtime route is what determines where traffic goes.
- Compare that live destination with the intended post-migration service target, because one stale virtual route can keep every dependent workload attached to the old environment.
- Review virtual services, destination rules, service entries, mesh gateways, and multi-cluster discovery objects for any remaining reference to the retired service, because mesh routing usually spans several resources.
- Check sidecar or proxy config on affected workloads and confirm they received current route data from the control plane, because stale xDS state can preserve old paths after manifests were updated.
- Update the authoritative routing objects, and restart sidecars or workloads only if they do not pick up the change, because control-plane updates alone do not always clear old proxy state immediately.
- Generate a controlled internal request and confirm telemetry shows the intended service, namespace, and cluster receiving it, because success at the HTTP layer does not prove the mesh chose the right backend.
- Verify the old service stops receiving east-west traffic from migrated workloads, because dual routing inside a mesh can stay hidden longer than edge traffic problems.
- Review retries, failover, circuit breaking, and outlier detection policies if requests still behave unpredictably, because those features can continue steering traffic in unexpected ways after the primary route changes.
- Document who owns mesh routing, multi-cluster discovery, and cutover validation so future migrations test the actual service graph instead of only workload health.
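The xDS-freshness step above can be sketched as a scan of `istioctl proxy-status` output. The column layout here is a simplified assumption (real output has more columns and states such as NOT SENT), so treat this as a sketch rather than a robust parser:

```python
def stale_proxies(proxy_status_output: str) -> list[str]:
    """Flag proxies whose xDS columns are not all SYNCED (simplified layout assumed)."""
    stale = []
    for line in proxy_status_output.strip().splitlines()[1:]:  # skip header row
        fields = line.split()
        name, xds_states = fields[0], fields[2:6]  # CDS, LDS, EDS, RDS columns
        if any(state != "SYNCED" for state in xds_states):
            stale.append(name)
    return stale

# Simplified sample modeled on `istioctl proxy-status` (columns trimmed, names invented).
sample = """\
NAME                CLUSTER   CDS      LDS      EDS      RDS
orders-7d9f.prod    primary   SYNCED   SYNCED   SYNCED   SYNCED
billing-5c2a.prod   primary   SYNCED   SYNCED   STALE    SYNCED
"""
print(stale_proxies(sample))
# -> ['billing-5c2a.prod']
```

Any proxy flagged here is a candidate for a restart or a forced config push before trusting the new routes.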
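The telemetry-verification step, confirming the old service stops receiving east-west traffic, can likewise be sketched as a scan over scraped mesh metrics. `istio_requests_total` and its `destination_service` label are part of Istio's standard metrics, but the sample lines below are fabricated for illustration:

```python
import re

def destinations_with_traffic(metric_lines: list[str]) -> dict[str, float]:
    """Sum request counts per destination_service from Prometheus-format lines."""
    totals: dict[str, float] = {}
    pattern = re.compile(r'destination_service="([^"]+)"[^}]*\}\s+([\d.]+)')
    for line in metric_lines:
        m = pattern.search(line)
        if m:
            dest, value = m.group(1), float(m.group(2))
            totals[dest] = totals.get(dest, 0.0) + value
    return totals

# Fabricated sample scrape: migrated traffic plus a lingering legacy flow.
lines = [
    'istio_requests_total{destination_service="orders.prod.svc.cluster.local",response_code="200"} 1842',
    'istio_requests_total{destination_service="orders.legacy.svc.cluster.local",response_code="200"} 37',
]
print(destinations_with_traffic(lines).get("orders.legacy.svc.cluster.local", 0.0))
# -> 37.0
```

A nonzero count against the retired destination after cutover is exactly the hidden dual routing described above; the check is only done when that number stops growing.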