Introduction

A fork bomb in Python multiprocessing occurs when worker processes inadvertently spawn child processes recursively, consuming all system resources until the host crashes or hangs. It happens most often on Linux, where the default start method (through Python 3.13) is fork, which copies the entire parent memory space, including the state of any running threads. If a thread holds a lock at the moment of the fork, the child inherits a copy of that lock frozen in the locked state, with no thread left to release it. The resulting deadlocks trigger watchdog restarts, which spawn more processes in an exponential cascade.

Symptoms

System monitoring shows exponential process growth:

```bash
$ ps aux | grep python | wc -l
4
# Ten seconds later
$ ps aux | grep python | wc -l
128
# Thirty seconds later - system becomes unresponsive
$ ps aux | grep python | wc -l
2048
```

The kernel log shows OOM kills:

```text
[12345.678901] python invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0
[12345.678902] Out of memory: Killed process 5678 (python) total-vm:4521832kB, anon-rss:1234567kB
```

Or the process table fills up:

```bash
bash: fork: retry: Resource temporarily unavailable
bash: fork: Cannot allocate memory
```

Common Causes

  • **Using the fork start method with threaded code**: the multiprocessing default on Linux (through Python 3.13) forks the process, duplicating the state of every thread. If a thread holds the GIL or a mutex at fork time, children can deadlock
  • **Missing if __name__ == "__main__" guard**: module-level code runs again in every child process, which spawns grandchildren, and so on
  • **Unbounded process pools**: creating Pool() without a processes argument spawns os.cpu_count() workers, each of which may spawn pools of its own
  • **Nested multiprocessing in workers**: a pool worker that creates another pool triggers recursive process spawning
  • **Signal delivery gaps**: children that end up outside the parent's process group may never receive the SIGTERM sent at shutdown, so they are restarted instead of cleaned up

Step-by-Step Fix

Step 1: Always guard entry point with `if __name__ == "__main__"`

```python
# WRONG - this spawns processes at import time in every child
from multiprocessing import Pool

def worker(x):
    return x * x

pool = Pool(4)  # BAD: runs in every forked child too
results = pool.map(worker, range(10))
```

```python
# CORRECT - only the main process creates the pool
from multiprocessing import Pool

def worker(x):
    return x * x

if __name__ == "__main__":
    pool = Pool(4)
    results = pool.map(worker, range(10))
    pool.close()
    pool.join()
```

Step 2: Use `spawn` start method instead of `fork`

The spawn start method starts a fresh Python interpreter, avoiding all fork-related thread-safety issues:

```python
import multiprocessing as mp

def process_data(items):
    with mp.Pool(processes=4) as pool:
        return pool.map(worker_function, items)

if __name__ == "__main__":
    # Call once, in the guarded entry point, before creating any
    # pools or processes
    mp.set_start_method("spawn", force=True)
```

You can also use a multiprocessing context (available since Python 3.4), which scopes the start method to a single pool instead of changing the global default:

```python
import multiprocessing as mp

if __name__ == "__main__":
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=4) as pool:
        results = pool.map(worker_function, range(100))
```

Step 3: Enforce process limits with cgroups

In production, use cgroups to prevent runaway processes from consuming all resources:

```bash
# Limit to 2 CPU cores and 4GB memory for the Python process group
sudo systemd-run --scope \
  -p CPUQuota=200% \
  -p MemoryMax=4G \
  -p TasksMax=64 \
  python3 worker.py
```

Or in Docker:

```bash
docker run --cpus=2 --memory=4g --pids-limit=100 \
  -v "$PWD":/app -w /app python:3.11 python worker.py
```

Step 4: Prevent nested multiprocessing in workers

If your worker needs to call library code that uses multiprocessing internally, set the process count explicitly:

```python
import os

# Set these BEFORE importing the libraries that read them - many
# (joblib, NumPy/MKL, OpenMP runtimes) only check at import time
os.environ["JOBLIB_MULTIPROCESSING"] = "0"
os.environ["OMP_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"

def worker(batch):
    # Library calls in here now stay single-process, single-threaded
    return process(batch)
```
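When you control the worker code, you can also detect nesting at runtime: pool workers are daemonic, so checking `multiprocessing.current_process().daemon` lets shared code parallelize only at the top level. `maybe_parallel` below is a hypothetical helper for illustration, not a library API:

```python
import multiprocessing as mp

def square(x):
    return x * x

def maybe_parallel(items, fn):
    # Hypothetical helper: parallelize at the top level only. Inside a
    # pool worker (a daemonic process), fall back to a serial loop
    # instead of trying to spawn a nested pool.
    if mp.current_process().daemon:
        return [fn(x) for x in items]
    with mp.Pool(processes=2) as pool:
        return pool.map(fn, items)

if __name__ == "__main__":
    print(maybe_parallel(range(5), square))
```

The same check works inside library shims, so one code path serves both the driver process and its workers.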

Prevention

  • Always use if __name__ == "__main__" guard in any file using multiprocessing
  • Prefer spawn or forkserver start methods over fork in production
  • Use concurrent.futures.ProcessPoolExecutor; as of Python 3.14 the default start method is no longer fork on Linux (forkserver there, spawn on macOS and Windows)
  • Set TasksMax in systemd or --pids-limit in Docker to hard-cap process count
  • Never create a Pool inside a Pool worker
  • Add psutil monitoring to detect abnormal process counts and trigger circuit breakers
  • Use resource.setrlimit(resource.RLIMIT_NPROC, ...) to set per-user process limits
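The last bullet can be sketched with the stdlib resource module (Unix-only; the function name cap_process_count and the 4096 default are illustrative choices, not a standard recipe):

```python
import resource

def cap_process_count(max_procs=4096):
    # Lower the soft RLIMIT_NPROC (a per-user process count on Linux)
    # so a runaway spawn loop hits "Resource temporarily unavailable"
    # instead of exhausting the host. The hard limit is left untouched
    # so a privileged process can still raise the cap later.
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    if hard != resource.RLIM_INFINITY:
        max_procs = min(max_procs, hard)
    resource.setrlimit(resource.RLIMIT_NPROC, (max_procs, hard))
    return resource.getrlimit(resource.RLIMIT_NPROC)[0]

if __name__ == "__main__":
    print("soft NPROC limit:", cap_process_count())
```

Because the limit counts all of the user's processes, pick a cap comfortably above normal load so legitimate forks are not refused.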