Introduction

PM2's cluster mode forks multiple Node.js processes that share the same server port, distributing incoming requests across workers. When a worker gets stuck in the "forking" or "launching" state and never reaches "online", the application is partially available -- some workers handle requests while the stuck workers consume resources but serve nothing. This commonly happens due to port conflicts, native addon incompatibilities with cluster mode, or application code that blocks the event loop during startup.

Symptoms

PM2 list shows stuck workers:

```bash
$ pm2 list
┌────┬────────┬─────────┬───────────┬──────┬────────┬──────────┐
│ id │ name   │ mode    │ status    │ cpu  │ memory │ watching │
├────┼────────┼─────────┼───────────┼──────┼────────┼──────────┤
│ 0  │ myapp  │ cluster │ online    │ 0%   │ 85.2mb │ disabled │
│ 1  │ myapp  │ cluster │ online    │ 0%   │ 84.8mb │ disabled │
│ 2  │ myapp  │ cluster │ launching │ 0%   │ 45.1mb │ disabled │
│ 3  │ myapp  │ cluster │ errored   │ 0%   │ 0mb    │ disabled │
└────┴────────┴─────────┴───────────┴──────┴────────┴──────────┘
```

PM2 logs show the issue:

```bash
$ pm2 logs myapp --lines 50
0|myapp  | Error: listen EADDRINUSE: address already in use :::3000
1|myapp  | Server listening on port 3000
2|myapp  | (stuck - no output)
3|myapp  | Error: Cannot find module './build/Release/addon.node'
```

Common Causes

  • Port already in use: Another process is bound to the same port
  • Native addons not compiled for cluster mode: Some native modules do not work with cluster.fork()
  • Application code blocks startup: Synchronous file I/O or database migration blocks the fork
  • PM2 max memory restart loop: Worker exceeds memory limit, restarts, exceeds again
  • Missing environment variables: Forked process does not inherit required environment
  • File descriptor limit: Too many open files prevent the fork from creating sockets
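
Several of these causes, missing environment variables in particular, produce a worker that hangs rather than crashes. A fail-fast guard at the top of the entry script turns a silent hang into an immediate, visible error. This is a sketch; the function name and variable list are illustrative, not part of PM2:

```javascript
// Hypothetical startup guard: throw immediately if required configuration
// is missing, so a misconfigured worker lands in "errored" (with a clear
// log line) instead of hanging in "launching".
function checkRequiredEnv(env, required) {
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
}

// Call this before the server starts listening, e.g.:
// checkRequiredEnv(process.env, ['NODE_ENV', 'PORT', 'DATABASE_URL']);
```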

Step-by-Step Fix

Step 1: Check for port conflicts

```bash
# Find what is using the port
lsof -i :3000
# OR
ss -tlnp | grep 3000

# Kill the conflicting process
kill -9 $(lsof -t -i:3000)

# Restart PM2
pm2 restart myapp
```

Step 2: Use PM2 ecosystem file with proper configuration

```javascript
// ecosystem.config.js
module.exports = {
  apps: [{
    name: 'myapp',
    script: 'server.js',
    instances: 4,
    exec_mode: 'cluster',

    // Environment variables for all workers
    env: {
      NODE_ENV: 'production',
      PORT: 3000,
    },

    // Restart configuration
    max_memory_restart: '500M',
    restart_delay: 3000,
    max_restarts: 10,

    // Logging
    error_file: '/var/log/pm2/myapp-error.log',
    out_file: '/var/log/pm2/myapp-out.log',
    merge_logs: true,

    // Worker timeouts
    kill_timeout: 5000,
    listen_timeout: 8000, // How long to wait for the 'listening' event
  }],
};
```

Step 3: Fix native addon compatibility

If the app uses native addons that misbehave under cluster.fork(), run it under PM2 in fork mode (exec_mode: 'fork') and manage the cluster yourself, so addons are loaded only in worker processes:

```javascript
// server.js
const cluster = require('cluster');

if (cluster.isPrimary) {
  // Primary process - do not load native addons here
  const numCPUs = require('os').cpus().length;

  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} died. Restarting...`);
    cluster.fork();
  });
} else {
  // Worker process - load native addons here
  const nativeAddon = require('./build/Release/addon.node');
  const app = require('./app');
  app.listen(process.env.PORT || 3000);
}
```
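
The "Cannot find module './build/Release/addon.node'" error from the symptoms can also be softened at require time, assuming a pure-JS fallback for the addon exists. A hedged sketch; requireWithFallback and the paths are illustrative:

```javascript
// Sketch: fall back to a JS implementation when the native build is
// missing, so workers come online instead of crash-looping.
function requireWithFallback(modulePath, fallbackFactory) {
  try {
    return require(modulePath);
  } catch (err) {
    if (err.code === 'MODULE_NOT_FOUND') {
      console.warn(`${modulePath} not found, using JS fallback`);
      return fallbackFactory();
    }
    throw err; // real load failures should still crash loudly
  }
}

// Hypothetical usage:
// const addon = requireWithFallback('./build/Release/addon.node', () => require('./addon-js'));
```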

Step 4: Debug stuck workers

```bash
# Get detailed info on a stuck worker
pm2 describe myapp

# Check worker logs
pm2 logs myapp --raw

# Monitor worker memory
pm2 monit

# If stuck, delete and recreate
pm2 delete myapp
pm2 start ecosystem.config.js
```
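
For scripted checks, pm2 jlist prints the full process table as JSON, which can be filtered for workers that never reached "online". The filter below assumes the jlist shape (pm_id, name, pm2_env.status); the sample record in the test is hand-written, not real pm2 output:

```javascript
// Sketch: given the parsed JSON from `pm2 jlist`, return the workers whose
// status is anything other than "online" (launching, errored, stopped, ...).
function findStuckWorkers(processList) {
  return processList
    .filter((proc) => proc.pm2_env.status !== 'online')
    .map((proc) => ({
      id: proc.pm_id,
      name: proc.name,
      status: proc.pm2_env.status,
    }));
}
```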

Prevention

  • Use PM2 ecosystem files instead of command-line arguments for reproducible configuration
  • Set listen_timeout explicitly so PM2 marks workers that never emit the listening event as errored instead of leaving them in launching
  • Monitor worker restart rate with pm2 monit and alert on frequent restarts
  • Avoid native addons in cluster mode, or load them only in worker processes
  • Ensure the application emits the listening event on the server object
  • Use merge_logs: true to combine logs from all workers for easier debugging
  • Set max_restarts to prevent infinite restart loops on broken deployments