Introduction

Node.js Worker Threads each have their own V8 isolate and heap, but they all run inside a single Node.js process. When that process's memory use exceeds what the kernel will allow, the Linux OOM (Out of Memory) killer terminates the entire process with SIGKILL; in Docker or Kubernetes, this happens when the container memory limit is exceeded. A worker can also die abruptly when it exhausts its own heap, in which case the main thread sees only an 'exit' event with a non-zero code (no error, no stack trace), and the task fails silently.

Symptoms

  • Worker thread disappears without error
  • Worker exited unexpectedly with no stack trace
  • dmesg shows: Out of memory: Killed process 12345 (node) score 850
  • Tasks assigned to workers never complete
  • Container restarts due to OOMKilled
  • Works locally but fails in Docker/Kubernetes with memory limits

```
# dmesg output
[12345.678901] Out of memory: Killed process 12345 (node) total-vm:2097152kB, anon-rss:1048576kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:2048kB oom_score_adj:0

# Node.js logs show:
# Worker started
# Processing large dataset...
# (nothing - worker was killed)
```

Common Causes

  • Worker processing large files or datasets without memory bounds
  • Container memory limit too low for the workload
  • Memory leak in worker thread accumulating across tasks
  • Worker creating large buffers or arrays
  • No memory monitoring in worker threads

Step-by-Step Fix

  1. Detect worker death and handle gracefully:

```javascript
const { Worker } = require('worker_threads');

function runWorker(script, workerData) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(script, { workerData });

    worker.on('message', (result) => resolve(result));
    worker.on('error', reject);
    worker.on('exit', (code) => {
      if (code !== 0) {
        // Check if OOM killed
        reject(new Error(`Worker stopped with exit code ${code} (possible OOM)`));
      }
    });
  });
}

// Usage with retry
async function runWithRetry(script, data, maxRetries = 2) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await runWorker(script, data);
    } catch (err) {
      if (i === maxRetries - 1) throw err;
      console.log(`Worker failed, retry ${i + 1}/${maxRetries}`);
      await new Promise((r) => setTimeout(r, 2000));
    }
  }
}
```

  2. Monitor worker memory usage:

```javascript
const { isMainThread, parentPort } = require('worker_threads');

// In the worker thread
if (!isMainThread) {
  function checkMemory() {
    const mem = process.memoryUsage();
    const heapUsedMB = mem.heapUsed / 1024 / 1024;

    if (heapUsedMB > 400) { // Warning threshold
      parentPort.postMessage({
        type: 'memory_warning',
        heapUsedMB: Math.round(heapUsedMB),
      });
    }

    if (heapUsedMB > 450) { // Critical - stop processing
      parentPort.postMessage({
        type: 'memory_critical',
        heapUsedMB: Math.round(heapUsedMB),
      });
      process.exit(1);
    }
  }

  setInterval(checkMemory, 5000);
}
```

  3. Process data in chunks within workers:

```javascript
// worker.js
const { parentPort, workerData } = require('worker_threads');
const fs = require('fs');

async function processLargeFile(filePath) {
  const stream = fs.createReadStream(filePath, { highWaterMark: 64 * 1024 });
  let buffer = '';
  let results = [];

  for await (const chunk of stream) {
    buffer += chunk.toString();

    // Process complete lines
    let lineEnd;
    while ((lineEnd = buffer.indexOf('\n')) !== -1) {
      const line = buffer.slice(0, lineEnd);
      buffer = buffer.slice(lineEnd + 1);

      // processLine is your per-line handler (not shown here)
      results.push(processLine(line));

      // Flush results periodically to free memory
      if (results.length >= 10000) {
        parentPort.postMessage({ type: 'batch', data: results });
        results = [];
        global.gc && global.gc(); // Force GC if exposed via --expose-gc
      }
    }
  }

  // Send remaining
  if (results.length > 0) {
    parentPort.postMessage({ type: 'batch', data: results });
  }
  parentPort.postMessage({ type: 'done' });
}

processLargeFile(workerData.filePath);
```

  4. Set container memory limits correctly:

```yaml
# docker-compose.yml
services:
  app:
    build: .
    deploy:
      resources:
        limits:
          memory: 1G          # Container limit
        reservations:
          memory: 512M        # Guaranteed memory
    environment:
      - NODE_OPTIONS=--max-old-space-size=768  # 75% of container limit
```

Prevention

  • Set --max-old-space-size to 75% of container memory limit
  • Monitor worker memory with process.memoryUsage()
  • Process large data in streams, not bulk loads
  • Add OOM monitoring to Kubernetes/infrastructure alerts
  • Use worker.terminate() to kill workers exceeding memory thresholds
  • In Kubernetes, set resources.limits.memory and monitor container OOM events (e.g. cAdvisor's container_oom_events_total metric)
  • Test with production-like memory constraints in CI