Introduction

SSH access can look healthy until a restart, rollout, or token refresh changes the identity context, and then requests start failing with permission denied. The usual cause is stale identity state: the runtime is still reading an old secret, using the wrong user, or holding a cached credential that no longer matches the intended access path.

Symptoms

  • Operations that used to work now return authorization or permission errors
  • The issue appeared after a user, role, token, or mount change
  • One fresh worker behaves differently from older warm workers
  • Manual access with an administrator account still works while the runtime path fails

Common Causes

  • The process is using an old token, secret, or credential file
  • A base image or mount change reset file ownership or runtime user assumptions
  • Scope or audience changed in one layer but the service kept the old cache
  • The new identity exists in config, but long-lived workers never reloaded it
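
One way to test for the last cause is to compare when a worker started with when its credential file last changed. The sketch below assumes Linux with GNU stat and procps-ng ps (for the etimes field). The function names, the PID, and the path are illustrative placeholders, not part of any existing tool.

```shell
#!/bin/sh
# Sketch: flag a worker that started before its credential file last changed.
# Assumes Linux with GNU stat and procps-ng ps (which provides "etimes").

# Print the epoch second at which process $1 started.
process_start_epoch() {
    elapsed=$(ps -o etimes= -p "$1" | tr -d ' ')
    [ -n "$elapsed" ] || return 1
    echo $(( $(date +%s) - elapsed ))
}

# Print the epoch second at which file $1 was last modified.
file_mtime_epoch() {
    stat -c %Y "$1"
}

# Succeed (exit 0) when the credential changed after the process started.
# $1 = process start epoch, $2 = credential mtime epoch
credential_is_newer() {
    [ "$2" -gt "$1" ]
}

# Example usage (PID 1234 and the path are placeholders):
#   start=$(process_start_epoch 1234)
#   mtime=$(file_mtime_epoch /run/secrets/token)
#   credential_is_newer "$start" "$mtime" && echo "worker 1234 predates the credential"
```

A worker that predates the credential is only a candidate for staleness: a process that re-reads the file on every request would pass through this check and still be fine.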

Step-by-Step Fix

  1. Inspect the live state first
     Capture the active runtime path before changing anything so you know whether the process is stale, partially rolled, or reading the wrong dependency.
bash
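# Record the snapshot time in UTC so it can be lined up with log timestamps
date -u
# Sorted environment of the current shell (may include secrets; redact before sharing)
printenv | sort | head -80
# Most recent error, warning, timeout, retry, and version lines under the current directory
grep -R "error\|warn\|timeout\|retry\|version" . 2>/dev/null | tail -80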
  2. Compare the active configuration with the intended one
     Look for drift between the live process and the deployment or configuration files it should be following.
bash
# List settings that commonly drift (timeouts, paths, secrets, caches) across files under the current directory
grep -R "timeout\|retry\|path\|secret\|buffer\|cache\|lease\|schedule" . 2>/dev/null | head -120
  3. Apply one explicit fix path
     Prefer one clear configuration change over several partial tweaks so every instance converges on the same behavior.
yaml
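# Assumes a Kubernetes Pod spec; the container name "app" is a placeholder
spec:
  serviceAccountName: runtime-access
  securityContext:
    runAsUser: 1001
  containers:
    - name: app
      envFrom:
        - secretRef:
            name: runtime-credentials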
  4. Verify the full request or worker path end to end
     Retest the same path that was failing rather than assuming a green deployment log means the runtime has recovered.
bash
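# Check the response status and headers on the health endpoint
curl -I https://internal.example.com/health
# Exercise the path that was failing and show the first lines of the response
curl -s https://internal.example.com/check | head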
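
Since a base image or mount change can reset file ownership, it may also help to confirm that the credential file on disk matches the runtime user before retesting. The sketch below assumes GNU stat on Linux. The function name and the key path are hypothetical, and the mode check reflects the fact that OpenSSH refuses a private key that group or other users can access.

```shell
#!/bin/sh
# Sketch: check that a credential file is owned by the expected UID and
# grants no access to group or others. Assumes GNU stat (Linux).

# $1 = path to the credential file, $2 = expected numeric owner UID
credential_file_ok() {
    owner=$(stat -c %u "$1") || return 1
    mode=$(stat -c %a "$1") || return 1
    if [ "$owner" != "$2" ]; then
        echo "owner is UID $owner, expected $2" >&2
        return 1
    fi
    case "$mode" in
        ?00) return 0 ;;   # e.g. 600 or 400: owner-only access
        *)   echo "mode $mode is not owner-only" >&2
             return 1 ;;
    esac
}

# Example usage (the path is a placeholder; 1001 is the runAsUser value from step 3):
#   credential_file_ok /home/app/.ssh/id_ed25519 1001 && echo "ownership and mode look right"
```

Running the check inside the affected container, as the runtime user, tests the same filesystem view the failing process sees.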

Prevention

  • Publish active version, config, and runtime identity in one observable place
  • Verify the real traffic path after every rollout instead of relying on one green health log
  • Treat caches, workers, and background consumers as part of the same production system
  • Keep one source of truth for credentials, timeouts, routing, and cleanup rules
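
The first item can be approximated by having each worker log one line about its identity at startup and after every credential reload. The sketch below is a minimal version that assumes sha256sum from GNU coreutils. The function names and the output format are illustrative, and the fingerprint is a truncated hash so the secret itself is never printed.

```shell
#!/bin/sh
# Sketch: print one log-friendly line describing the runtime identity
# without exposing the secret. Assumes sha256sum (GNU coreutils).

# Short fingerprint of a credential file: first 12 hex characters of its SHA-256.
credential_fingerprint() {
    sha256sum "$1" | cut -c1-12
}

# $1 = path to the credential file
identity_summary() {
    fp=$(credential_fingerprint "$1")
    [ -n "$fp" ] || return 1
    echo "uid=$(id -u) gid=$(id -g) cred_sha256_prefix=$fp"
}

# Example usage (the path is a placeholder):
#   identity_summary /run/secrets/token
```

Two workers that print different prefixes are reading different credentials, which makes a fresh-versus-warm mismatch visible in the logs. A truncated hash is reasonable for high-entropy tokens and keys; for short or guessable secrets it is safer to log only the file modification time.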