Introduction

Servers use file descriptors for more than just files. Network sockets, logs, pipes, temporary files, and service internals all consume them. When a process or the host hits its descriptor limit, services can start refusing connections, failing to open log files, or returning application errors that seem unrelated to the real cause. The fix starts with determining whether the limit is too low, the workload is too high, or one process is leaking descriptors over time.

Symptoms

  • Services log "Too many open files" errors (EMFILE when a per-process limit is hit, ENFILE when the system-wide limit is hit)
  • Requests start failing with connection resets or 500 errors, and log entries go missing
  • The issue gets worse under sustained traffic or after the process has been running for a long time
  • Restarting the affected service temporarily clears the problem
  • Several services on the same host fail at about the same time, which can point to the system-wide limit rather than a per-process one
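
To make the first symptom easier to spot, here is a minimal sketch that scans log lines for the two standard Linux error strings. It assumes the service logs the unmodified system message: "Too many open files" for EMFILE (per-process limit) and "Too many open files in system" for ENFILE (system-wide limit). The function name and the sample log lines are hypothetical, and real services may wrap or reword the message.

```python
from collections import Counter

# Standard Linux strerror() texts; applications may wrap or reword them.
ENFILE_TEXT = "too many open files in system"  # system-wide file table is full
EMFILE_TEXT = "too many open files"            # per-process limit reached


def classify_fd_errors(lines):
    """Count log lines that mention per-process vs system-wide exhaustion."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        # Check the longer ENFILE text first because it contains the EMFILE text.
        if ENFILE_TEXT in lowered:
            counts["system_wide"] += 1
        elif EMFILE_TEXT in lowered:
            counts["per_process"] += 1
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only.
    sample = [
        "[crit] accept4() failed (24: Too many open files)",
        "socket: Too many open files in system",
        "GET /index.html 200",
    ]
    print(classify_fd_errors(sample))
```

A count under "system_wide" suggests looking at the host-level ceiling first, while "per_process" hits suggest starting with the limit of the individual service.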

Common Causes

  • Per-process or system-wide descriptor limits are too low for current traffic patterns
  • A web server, proxy, or application opens many sockets and holds them too long
  • A process leaks file descriptors because resources are not being released properly
  • Log rotation or background jobs open more descriptors than expected
  • Capacity settings were never updated as the service grew
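
To tell these causes apart, it can help to see what kind of descriptors a process is actually holding. The sketch below is one way to do that on Linux, assuming /proc is mounted and you have permission to read the target process's fd directory. It groups descriptors by the symlink targets under /proc/<pid>/fd. The function names are hypothetical, and the categories are deliberately coarse.

```python
import os
from collections import Counter


def classify_target(target):
    """Map a /proc/<pid>/fd symlink target to a coarse descriptor type."""
    if target.startswith("socket:"):
        return "socket"
    if target.startswith("pipe:"):
        return "pipe"
    if target.startswith("anon_inode:"):
        return "anon_inode"    # epoll, eventfd, timerfd and similar
    if target.endswith(" (deleted)"):
        return "deleted_file"  # e.g. a rotated log the process still holds open
    return "file"


def summarize_fds(pid):
    """Count a process's open descriptors by type (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed between listdir() and readlink()
        counts[classify_target(target)] += 1
    return dict(counts)


if __name__ == "__main__" and os.path.isdir(f"/proc/{os.getpid()}/fd"):
    # Inspect this script's own process as a harmless demonstration.
    print(summarize_fds(os.getpid()))
```

A count dominated by sockets points toward connection handling, while a growing number of deleted files often points toward log rotation or temporary files that are never closed.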

Step-by-Step Fix

  1. Identify which process is hitting the descriptor limit and correlate the error with traffic or background workload.
  2. Check both per-process and system-wide open-file limits so you know where the actual ceiling is.
  3. Review current open file and socket usage for the affected service to see whether usage climbs steadily regardless of load (a leak) or rises and falls with traffic (normal high concurrency).
  4. Inspect connection handling, keep-alive behavior, and long-lived streams that may hold descriptors longer than expected.
  5. Confirm log files, temporary files, and background jobs are not contributing unexpected descriptor growth.
  6. Raise limits only after understanding the workload and verifying the service can safely use more open files.
  7. Restart or reload the affected process if needed so the corrected limit takes effect; a new limit usually applies only to processes started after the change.
  8. Re-test under realistic traffic and confirm the service no longer exhausts descriptors.
  9. Keep descriptor usage monitoring in place so growing connection patterns or leaks are visible before they trigger outages.
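
Steps 2, 3, and 9 can be sketched in a few lines of Python. This is a minimal illustration, not a monitoring tool. It assumes a Linux host where /proc/<pid>/limits, /proc/<pid>/fd, and /proc/sys/fs/file-nr are readable. All function names are hypothetical, and the growth check is a crude heuristic that only makes sense for samples taken under comparable traffic.

```python
import os


def parse_max_open_files(limits_text):
    """Return (soft, hard) from /proc/<pid>/limits text; None means unlimited."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            soft, hard = line.split()[3:5]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' line found")


def parse_file_nr(file_nr_text):
    """Return (allocated, maximum) from /proc/sys/fs/file-nr text."""
    allocated, _unused, maximum = file_nr_text.split()
    return int(allocated), int(maximum)


def usage_ratio(used, limit):
    """Fraction of the limit in use, or 0.0 when the limit is unlimited."""
    return 0.0 if not limit else used / limit


def is_steadily_growing(samples, min_rise=1):
    """Crude leak heuristic: every sample is at least min_rise above the last."""
    if len(samples) < 3:
        return False
    return all(b - a >= min_rise for a, b in zip(samples, samples[1:]))


def report(pid):
    """Print per-process and system-wide descriptor usage for one process."""
    with open(f"/proc/{pid}/limits") as f:
        soft, hard = parse_max_open_files(f.read())
    with open("/proc/sys/fs/file-nr") as f:
        allocated, maximum = parse_file_nr(f.read())
    used = len(os.listdir(f"/proc/{pid}/fd"))
    print(f"pid {pid}: {used} open, soft limit {soft}, hard limit {hard} "
          f"({usage_ratio(used, soft):.0%} of soft limit)")
    print(f"system: {allocated} allocated of {maximum} "
          f"({usage_ratio(allocated, maximum):.0%})")


if __name__ == "__main__" and os.path.exists("/proc/sys/fs/file-nr"):
    # Report on this script's own process as a harmless demonstration.
    report(os.getpid())
```

Calling report() with the PID of the affected service shows which ceiling is closer. Feeding periodic descriptor counts into is_steadily_growing() gives a first hint of a leak, though counts that simply track traffic still need a human look.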