Introduction
A fork bomb in Python multiprocessing occurs when worker processes inadvertently spawn child processes recursively, consuming all system resources and crashing the host. This happens most commonly on Linux, where the default start method (fork, through Python 3.13) copies the parent's entire memory space but recreates only the thread that called fork. If another thread held a lock at the moment of the fork, the child inherits that lock in its locked state with no thread left to release it. The resulting deadlocks trigger watchdog restarts, which create more processes in an exponential cascade.
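A minimal sketch of that failure mode (Linux-only, since it calls `os.fork()` directly; the sleeps are only there to control timing — Python 3.12+ even emits a DeprecationWarning when fork is used in a multi-threaded process):

```python
import os
import threading
import time

lock = threading.Lock()

def hold_lock():
    with lock:
        time.sleep(5)  # hold the lock across the fork

threading.Thread(target=hold_lock).start()
time.sleep(0.5)  # give the thread time to acquire the lock

pid = os.fork()
if pid == 0:
    # Child: fork copies memory (including the locked lock) but not
    # the thread holding it, so nothing here will ever release it
    acquired = lock.acquire(timeout=2)
    print(f"child acquired lock: {acquired}")  # prints False
    os._exit(0)
else:
    os.waitpid(pid, 0)
```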
Symptoms
System monitoring shows exponential process growth:
```
$ ps aux | grep python | wc -l
4
# Ten seconds later
$ ps aux | grep python | wc -l
128
# Thirty seconds later - system becomes unresponsive
$ ps aux | grep python | wc -l
2048
```

The kernel log shows OOM kills:
```
[12345.678901] python invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0
[12345.678902] Out of memory: Killed process 5678 (python) total-vm:4521832kB, anon-rss:1234567kB
```

Or the process table fills up:

```
bash: fork: retry: Resource temporarily unavailable
bash: fork: Cannot allocate memory
```

Common Causes
- **Using the `fork` start method with threaded code**: The `multiprocessing` default on Linux forks the process; locks held by other threads are copied in their locked state, but the threads themselves are not. If a thread held the GIL or a mutex during the fork, child processes deadlock
- **Missing `if __name__ == "__main__"` guard**: Code at module level runs in every child process, which then runs the same code again, spawning grandchildren
- **Unbounded process pool**: Creating `Pool()` without a `processes` argument spawns `os.cpu_count()` workers, which may each spawn their own pools
- **Worker code itself uses multiprocessing**: A pool worker that creates another pool causes recursive process spawning (see the sketch after this list)
- **Signal handlers in the parent not inherited properly**: Forked children may not receive SIGTERM, preventing graceful shutdown
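The recursive-spawning pattern is worth seeing concretely. `multiprocessing.Pool` itself refuses nested pools (its workers are daemonic, and daemonic processes are not allowed to have children), but code that spawns raw `Process` objects, or libraries that manage their own pools, can recurse freely. A sketch with a safety cap; without the depth check, the process count doubles at every level:

```python
import multiprocessing as mp

def worker(depth):
    # Safety cap for demonstration; remove it and this is a fork bomb
    if depth >= 5:
        return
    # Each worker spawns two more workers: 2**depth processes per level
    children = [mp.Process(target=worker, args=(depth + 1,)) for _ in range(2)]
    for p in children:
        p.start()
    for p in children:
        p.join()

if __name__ == "__main__":
    worker(0)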
Step-by-Step Fix
Step 1: Always guard the entry point with `if __name__ == "__main__"`
```python
# WRONG - this spawns processes at import time in every child
from multiprocessing import Pool

def worker(x):
    return x * x

pool = Pool(4)  # BAD: runs again in every worker that imports this module
results = pool.map(worker, range(10))
```

```python
# CORRECT - only the main process creates the pool
from multiprocessing import Pool

def worker(x):
    return x * x

if __name__ == "__main__":
    pool = Pool(4)
    results = pool.map(worker, range(10))
    pool.close()
    pool.join()
```
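The guard matters most under the spawn and forkserver start methods, where each worker re-imports the main module: any unguarded module-level code runs again in every worker, and Python aborts the bootstrap with a RuntimeError rather than letting it spawn indefinitely.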
Step 2: Use `spawn` start method instead of `fork`
The spawn start method launches a fresh Python interpreter for each worker, so children inherit nothing from the parent's threads or locks:
```python
import multiprocessing as mp

# Worker functions must be defined at module level so spawn
# processes can import and pickle them
def worker_function(x):
    return x * x

def process_data(items):
    with mp.Pool(processes=4) as pool:
        return pool.map(worker_function, items)

if __name__ == "__main__":
    mp.set_start_method("spawn", force=True)
    results = process_data(range(100))
```
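Note that `set_start_method` can only be called once per process; `force=True` replaces a previously set method instead of raising `RuntimeError`.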
You can also use a context object (available since Python 3.4), which applies the start method only to pools created from that context instead of changing it process-wide:
```python
import multiprocessing as mp

def worker_function(x):
    return x * x

if __name__ == "__main__":
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=4) as pool:
        results = pool.map(worker_function, range(100))
```
Step 3: Enforce process limits with cgroups
In production, use cgroups to prevent runaway processes from consuming all resources:
```
# Limit to 2 CPU cores and 4GB memory for the Python process group
sudo systemd-run --scope \
  -p CPUQuota=200% \
  -p MemoryMax=4G \
  -p TasksMax=64 \
  python3 worker.py
```

Or in Docker:

```
docker run --cpus=2 --memory=4g --pids-limit=100 python:3.11 worker.py
```

Step 4: Prevent nested multiprocessing in workers
If your worker calls library code that parallelizes internally (joblib, OpenMP, MKL), pin that parallelism explicitly so the libraries cannot spawn their own pools:
```python
def worker(batch):
    # Prevent libraries like joblib from spawning their own processes;
    # these must be set before the libraries that read them initialize
    import os
    os.environ["JOBLIB_MULTIPROCESSING"] = "0"
    os.environ["OMP_NUM_THREADS"] = "1"
    os.environ["MKL_NUM_THREADS"] = "1"

    # Process the batch
    return process(batch)
```
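A worker can also refuse outright to create a nested pool. The `safe_pool` helper below is a hypothetical sketch, not part of multiprocessing; it relies on the fact that Pool workers run as daemonic processes:

```python
import multiprocessing as mp

def safe_pool(n):
    # Pool workers run with daemon=True, so a daemonic current process
    # means this code is already executing inside a worker
    if mp.current_process().daemon:
        raise RuntimeError("refusing to create a nested pool inside a worker")
    return mp.Pool(processes=n)
```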
Prevention
- Always use an `if __name__ == "__main__"` guard in any file using multiprocessing
- Prefer the `spawn` or `forkserver` start methods over `fork` in production
- Use `concurrent.futures.ProcessPoolExecutor`; as of Python 3.14 it no longer defaults to `fork` on Linux (the default start method there is now `forkserver`)
- Set `TasksMax` in systemd or `--pids-limit` in Docker to hard-cap the process count
- Never create a Pool inside a Pool worker
- Add `psutil` monitoring to detect abnormal process counts and trigger circuit breakers (see the sketch after this list)
- Use `resource.setrlimit(resource.RLIMIT_NPROC, ...)` to set per-user process limits
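A minimal sketch combining the last two items, assuming `psutil` is installed; the 256 and 64 thresholds are illustrative, not recommendations:

```python
import resource

import psutil

# Hard-cap processes for this user (RLIMIT_NPROC counts processes
# per real user ID on Linux)
soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
new_soft = 256 if hard == resource.RLIM_INFINITY else min(256, hard)
resource.setrlimit(resource.RLIMIT_NPROC, (new_soft, hard))

def too_many_workers(limit=64):
    """Circuit breaker: True when the python process count looks abnormal."""
    count = sum(
        1
        for p in psutil.process_iter(["name"])
        if p.info["name"] and "python" in p.info["name"]
    )
    return count > limit

if too_many_workers():
    raise SystemExit("abnormal process count; refusing to spawn workers")
```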