Introduction
Validating admission webhooks sit directly in the request path for Kubernetes creates and updates. That means a broken webhook can turn a local service problem into a cluster-wide control-plane incident. If the webhook is unreachable, too slow, or scoped too broadly with failurePolicy: Fail, normal deployment and recovery operations can grind to a halt.
Symptoms
- kubectl apply or kubectl create fails with webhook call errors
- Many unrelated resources across namespaces are rejected
- Existing workloads continue running, but new changes cannot be applied
- Operators see timeout or TLS errors pointing at the webhook service
Common Causes
- The webhook Service or backing Pods are down
- failurePolicy: Fail blocks requests while the webhook is unhealthy
- Timeout settings are too short for normal webhook response time
- The webhook matches more namespaces or resource types than intended
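To see which of these causes applies, it can help to list every validating webhook configuration alongside its failure policy and timeout. The sketch below assumes kubectl access to the affected cluster; the function name and column headers are illustrative, not part of any standard tooling.

```shell
# Column spec for kubectl's custom-columns output.
# The header names (NAME, WEBHOOKS, ...) are arbitrary labels chosen for this sketch.
WEBHOOK_COLUMNS='NAME:.metadata.name,WEBHOOKS:.webhooks[*].name,POLICY:.webhooks[*].failurePolicy,TIMEOUT:.webhooks[*].timeoutSeconds'

# List every ValidatingWebhookConfiguration with the failure policy and
# timeout of each webhook entry it contains.
list_webhook_policies() {
  kubectl get validatingwebhookconfigurations -o "custom-columns=${WEBHOOK_COLUMNS}"
}
```

Run list_webhook_policies and look for entries that combine Fail with a short timeout. In admissionregistration.k8s.io/v1, an omitted failurePolicy defaults to Fail and an omitted timeoutSeconds defaults to 10 seconds, so a configuration that sets neither is still able to block requests.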
Step-by-Step Fix
1. Inspect the ValidatingWebhookConfiguration
   Confirm failure policy, timeout, and matching scope before changing anything blindly.
   kubectl get validatingwebhookconfigurations
   kubectl describe validatingwebhookconfigurations my-webhook
2. Check webhook service and Pod health
   If the webhook backend is unavailable, the admission layer cannot succeed consistently.
   kubectl get svc -n webhook-namespace
   kubectl get pods -n webhook-namespace
3. Temporarily reduce blast radius if needed
   In an outage, moving failurePolicy to Ignore or narrowing scope may be the fastest way to restore cluster operations while you fix the backend.
4. Validate webhook certificates and response timing
   TLS errors and slow handlers are common root causes of “everything is blocked” incidents.
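The blast-radius reduction and the certificate check described above can be sketched with kubectl as follows. This is a minimal sketch, not a definitive procedure: the function names are hypothetical, my-webhook is the placeholder name used earlier, and the certificate helper assumes that caBundle is set on the first webhook entry and that openssl and a base64 supporting -d are installed.

```shell
# Build a JSON Patch that sets failurePolicy on one webhook entry.
# $1 = 0-based index into .webhooks, $2 = Fail or Ignore
failure_policy_patch() {
  printf '[{"op":"replace","path":"/webhooks/%s/failurePolicy","value":"%s"}]' "$1" "$2"
}

# Apply that patch to a named configuration.
# $1 = configuration name, $2 = webhook index, $3 = Fail or Ignore
set_failure_policy() {
  kubectl patch validatingwebhookconfiguration "$1" --type=json \
    -p "$(failure_policy_patch "$2" "$3")"
}

# Print the expiry date of the CA bundle the API server uses to verify
# the webhook's serving certificate. Only the first certificate in the
# bundle is inspected.
# $1 = configuration name
ca_bundle_expiry() {
  kubectl get validatingwebhookconfiguration "$1" \
    -o jsonpath='{.webhooks[0].clientConfig.caBundle}' \
    | base64 -d | openssl x509 -noout -enddate
}
```

For example, set_failure_policy my-webhook 0 Ignore relaxes the first webhook entry, and ca_bundle_expiry my-webhook shows when its CA bundle expires. If the configuration is managed by a Helm chart or an operator, a manual patch may be reverted on the next reconcile, so record the change and restore Fail once the backend is healthy. The CA bundle is only half of the TLS picture; the serving certificate usually lives in a Secret whose name depends on how the webhook was deployed.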
Prevention
- Keep validating webhooks narrowly scoped
- Use cautious failure policies during development and rollout
- Monitor webhook latency and availability as production dependencies
- Exclude critical system namespaces unless there is a strong reason not to
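Two of these practices, excluding a system namespace and watching webhook latency, can be sketched with kubectl. The example assumes the cluster sets the kubernetes.io/metadata.name label on namespaces (automatic in recent Kubernetes releases) and that your account may read the API server's /metrics endpoint. The function names are hypothetical.

```shell
# Build a JSON Patch that makes one webhook entry skip a namespace.
# $1 = 0-based index into .webhooks, $2 = namespace to exclude
# Note: "add" on an existing namespaceSelector overwrites it.
namespace_exclusion_patch() {
  printf '[{"op":"add","path":"/webhooks/%s/namespaceSelector","value":{"matchExpressions":[{"key":"kubernetes.io/metadata.name","operator":"NotIn","values":["%s"]}]}}]' "$1" "$2"
}

# Apply the exclusion to a named configuration.
# $1 = configuration name, $2 = webhook index, $3 = namespace
exclude_namespace() {
  kubectl patch validatingwebhookconfiguration "$1" --type=json \
    -p "$(namespace_exclusion_patch "$2" "$3")"
}

# Show the API server's admission webhook latency histogram.
webhook_latency_metrics() {
  kubectl get --raw /metrics \
    | grep '^apiserver_admission_webhook_admission_duration_seconds'
}
```

A call such as exclude_namespace my-webhook 0 kube-system keeps the first webhook entry from intercepting requests in kube-system. Because the patch replaces any selector already present, merge existing expressions by hand before applying it. In production, the latency metric would normally be scraped and alerted on by the monitoring stack rather than read ad hoc.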