Introduction

The Node.js cluster module allows multiple worker processes to share the same server port: the primary (master) process listens on the port and distributes incoming connections to workers round-robin (the default on all platforms except Windows). Port contention occurs when workers attempt to bind to the same port independently (i.e., without using the cluster module correctly), when a new worker starts before the old worker has released the port, or when cluster mode is mixed with direct listen() calls. The result is EADDRINUSE errors, dropped connections during worker restarts, and uneven load distribution across workers.

Symptoms

```bash
Error: listen EADDRINUSE: address already in use :::3000
    at Server.setupListenHandle [as _listen2] (node:net:1463:16)
    at listenInCluster (node:net:1511:12)
    at Server.listen (node:net:1599:7)
    at Object.<anonymous> (/app/server.js:15:5)
```

Or uneven load distribution:

```bash
Worker 1 handled 9500 requests
Worker 2 handled 300 requests
Worker 3 handled 200 requests
Worker 4 handled 0 requests  <-- Not receiving any connections
```

Or during rolling restart:

```bash
Worker 12345 shutting down
Worker 12346 starting
Worker 12346 error: EADDRINUSE
Worker 12345 still holding port
```

Common Causes

  • Workers calling listen() directly: Each worker tries to bind to the same port
  • Not using cluster.isPrimary check: Same code runs in both master and worker
  • SO_REUSEPORT not enabled: Multiple processes cannot bind to the same port
  • Graceful restart not implemented: The new worker tries to bind before the old worker has released the port
  • Mixed cluster and standalone mode: Some instances run in cluster, some standalone
  • Port conflict with other services: Another process (Docker, proxy) already using the port

Step-by-Step Fix

Step 1: Correct cluster setup with master-worker separation

```javascript
const cluster = require('cluster');
const os = require('os');

if (cluster.isPrimary) {
  console.log(`Master ${process.pid} is running`);

  // Fork workers
  const numCPUs = os.cpus().length;
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} died. Forking new worker.`);
    cluster.fork();
  });
} else {
  // Worker processes - this is where the server listens
  const app = require('./app');
  const PORT = process.env.PORT || 3000;

  // Only the worker listens - the master distributes connections
  const server = app.listen(PORT, () => {
    console.log(`Worker ${process.pid} listening on port ${PORT}`);
  });

  // Graceful shutdown
  process.on('SIGTERM', () => {
    console.log(`Worker ${process.pid} shutting down`);
    server.close(() => {
      console.log(`Worker ${process.pid} closed connections`);
      process.exit(0);
    });
  });
}
```

Step 2: Handle rolling restarts gracefully

```javascript
if (cluster.isPrimary) {
  let isRestarting = false;

  function rollingRestart() {
    if (isRestarting) return;
    isRestarting = true;

    const workers = Object.values(cluster.workers);
    let currentIndex = 0;

    function restartNext() {
      if (currentIndex >= workers.length) {
        isRestarting = false;
        console.log('Rolling restart complete');
        return;
      }

      const worker = workers[currentIndex];
      currentIndex++;

      // Fork new worker before killing the old one
      const newWorker = cluster.fork();

      newWorker.on('listening', () => {
        // New worker is ready - tell the old one to shut down
        worker.send('shutdown');
      });

      // Wait for old worker to exit, then continue
      worker.on('exit', () => {
        setTimeout(restartNext, 1000);
      });

      // Force kill if old worker does not exit in time
      setTimeout(() => {
        if (!worker.isDead()) worker.kill('SIGKILL');
      }, 10000);
    }

    restartNext();
  }

  // Trigger rolling restart
  process.on('SIGUSR2', rollingRestart);
}
```

Step 3: Handle EADDRINUSE gracefully

```javascript
const server = app.listen(PORT, () => {
  console.log(`Worker ${process.pid} listening on port ${PORT}`);
});

server.on('error', (err) => {
  if (err.code === 'EADDRINUSE') {
    console.error(`Port ${PORT} is already in use. Waiting...`);

    // Wait and retry
    setTimeout(() => {
      server.listen(PORT);
    }, 1000);
  } else {
    console.error('Server error:', err);
    process.exit(1);
  }
});
```

Prevention

  • Always check cluster.isPrimary before forking workers
  • Only call app.listen() in worker processes, never in the master
  • Use SIGUSR2 signal to trigger graceful rolling restarts
  • Implement proper shutdown sequence: stop accepting -> drain -> exit
  • Monitor load distribution across workers to detect port contention issues
  • Use cluster.schedulingPolicy to control the load balancing strategy
  • Add SO_REUSEPORT socket option for platforms that support it