Introduction

Sometimes the origin is healthy again, but Cloudflare keeps serving an old error response from cache or from a stale edge state created during the incident. That makes the outage look unresolved even though the backend is already working. The fix is to confirm whether the edge is still reusing a cached failure and then narrow which rule, TTL, or purge gap is allowing that stale response to persist.

Symptoms

  • The site still shows an error page after the origin has recovered
  • Bypassing Cloudflare or hitting the origin directly returns a normal response
  • The problem affects some paths or regions more than others, typically because each edge location holds its own copy of the cached object
  • A manual refresh does not clear the issue for visitors
  • The outage began during a brief backend failure but outlasted it significantly
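
The first two symptoms can be checked together by requesting the same path through Cloudflare and through a route that bypasses it. The following is a minimal sketch, not a definitive tool: the function names fetch_status and verdict are hypothetical, and it assumes you have some URL that reaches the origin without passing through the edge (for example an internal hostname).

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a GET request to url."""
    req = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is what we want here.
        return err.code


def verdict(edge_status: int, origin_status: int) -> str:
    """Decide which layer is responsible for the error the visitor sees."""
    if origin_status >= 400:
        return "origin still failing"
    if edge_status >= 400:
        return "edge serving stale error"
    return "edge and origin both healthy"
```

Calling verdict(fetch_status(edge_url), fetch_status(origin_url)) with your own two URLs gives a quick first answer. Only the "edge serving stale error" result points at the cache rather than the backend.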

Common Causes

  • A cache rule or page rule cached an unexpected error response
  • Edge cache TTL settings kept the stale object alive longer than expected
  • A purge request targeted the wrong hostname, path, or cache key variant
  • Custom cache key behavior caused multiple stale variants to remain active
  • The origin still returns headers that encourage reuse of a degraded response, such as a long Cache-Control max-age sent with an error status
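
Most of these causes leave a trace in the response headers. As a rough sketch, the check below reads the CF-Cache-Status, Age, and Cache-Control headers of a captured response and reports whether an error appears to have come out of the edge cache. The names CacheDiagnosis and diagnose are hypothetical, and the set of statuses treated as "served from cache" is an assumption worth checking against current Cloudflare documentation.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class CacheDiagnosis:
    served_from_cache: bool
    reason: str


# Assumption: these CF-Cache-Status values mean the body came from edge cache.
CACHED_STATUSES = {"HIT", "STALE", "UPDATING", "REVALIDATED"}


def diagnose(status_code: int, headers: Mapping[str, str]) -> CacheDiagnosis:
    """Classify a response as a cached error, an uncached error, or not an error."""
    h = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    cf_status = h.get("cf-cache-status", "").upper()
    age = h.get("age")
    cache_control = h.get("cache-control", "")

    if status_code < 400:
        return CacheDiagnosis(False, "response is not an error")

    if cf_status in CACHED_STATUSES:
        detail = f"CF-Cache-Status={cf_status}"
        if age is not None:
            detail += f", Age={age}s"
        if cache_control:
            detail += f", Cache-Control={cache_control!r}"
        return CacheDiagnosis(True, f"error served from edge cache ({detail})")

    shown = cf_status or "absent"
    return CacheDiagnosis(False, f"error not served from cache (CF-Cache-Status={shown})")
```

For example, diagnose(502, {"CF-Cache-Status": "HIT", "Age": "840"}) flags a cached error that is already 840 seconds old, which suggests a rule or TTL problem rather than a failing backend.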

Step-by-Step Fix

  1. Compare the response from Cloudflare with a direct origin response so you can confirm the edge is the layer still serving the stale error.
  2. Review recent cache rules, page rules, and custom cache behavior for the affected hostname and path.
  3. Inspect response headers such as CF-Cache-Status, Age, and Cache-Control to see whether the error was cached intentionally, revalidated poorly, or preserved under stale-serving behavior.
  4. Purge the exact failing path first, then broaden to a hostname or full cache purge only if the problem is clearly wider.
  5. Confirm the origin now returns the expected success response and sensible cache headers once the backend is healthy again.
  6. Check whether query strings, device variants, cookies, or custom cache keys created multiple stale copies of the same page.
  7. Retest from more than one network or region so you know the stale object is gone across the edge rather than only in one location.
  8. Tighten cache rules so dynamic, error-prone paths are not cached like static assets during future incidents.
  9. Keep purge procedures aligned with your cache key design so recovery actions remove every affected variant quickly.
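
Steps 4, 6, and 9 come down to purging every variant of the failing path without resorting to a full purge. The sketch below, under stated assumptions, expands hostnames, paths, and query strings into a URL list and batches them into request bodies for Cloudflare's purge_cache endpoint. The helper names are hypothetical, the batch size of 30 is an assumption to check against the limit for your plan, and variants keyed on device type, cookies, or other custom cache key fields are not covered here.

```python
import json
import urllib.request
from itertools import product
from typing import Iterator, List, Sequence

API = "https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache"


def variant_urls(hosts: Sequence[str], paths: Sequence[str],
                 queries: Sequence[str] = ("",)) -> List[str]:
    """Expand every host/path/query combination into a full URL."""
    urls = []
    for host, path, query in product(hosts, paths, queries):
        url = f"https://{host}{path}"
        if query:
            url += "?" + query
        urls.append(url)
    return urls


def purge_payloads(urls: Sequence[str], batch_size: int = 30) -> Iterator[dict]:
    """Yield purge-by-URL request bodies, batch_size URLs at a time."""
    for i in range(0, len(urls), batch_size):
        yield {"files": list(urls[i:i + batch_size])}


def send_purge(zone_id: str, api_token: str, payload: dict) -> int:
    """POST one purge body; zone_id and api_token are supplied by the caller."""
    req = urllib.request.Request(
        API.format(zone_id=zone_id),
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

Building the list first makes the purge scope reviewable before anything is sent: variant_urls(["example.com", "www.example.com"], ["/checkout"], ["", "ref=mail"]) yields four URLs, and each body from purge_payloads can then be passed to send_purge. After purging, repeat the retest in step 7 from more than one network.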