Introduction
Node.js worker threads run in isolated V8 instances with their own heaps and memory limits. When a worker exhausts available memory, the Linux OOM killer sends SIGKILL and the worker is terminated without a chance to clean up. The main thread sees the worker exit with code 137 (128 + 9, the SIGKILL signal number) but gets no error details about the cause.
This issue affects applications that use worker threads for CPU-intensive tasks like image processing, data transformation, or report generation.
Symptoms
- Worker exits with code 137 (SIGKILL) without any error message
- dmesg shows "Out of memory: Killed process XXXX (node)"
- Main thread receives 'exit' event with code 137 but no error details
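Exit code 137 follows the shell convention for signal deaths: 128 plus the signal number, and SIGKILL is signal 9. The mapping can be confirmed in isolation:

```shell
# Exit status for a process killed by signal N is 128 + N.
# SIGKILL is signal 9, so an OOM-killed process reports 128 + 9 = 137.
bash -c 'kill -9 $$'    # simulate a process being SIGKILLed
echo "exit code: $?"    # prints "exit code: 137"
```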
Common Causes
- Worker processes data that exceeds the per-thread memory limit
- Memory leak in worker code that accumulates over multiple tasks
- System-level OOM killer terminates the process when total memory is exhausted
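The second cause is easy to reproduce: module-level state in a worker survives across tasks, so anything appended per task and never cleared accumulates until the heap limit is hit. A minimal sketch (the `cache` and `handleTask` names are illustrative):

```javascript
// worker.js (sketch): module-level state persists across messages,
// so this cache grows with every task the worker handles.
const cache = [];

function handleTask(task) {
  const result = Buffer.alloc(1024 * 1024); // stand-in for real work output
  cache.push(result); // LEAK: nothing ever removes entries
  return result.length;
}

// Each call leaves another 1MB pinned in the worker's heap.
handleTask({});
handleTask({});
console.log(cache.length); // 2 buffers (~2MB) retained
```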
Step-by-Step Fix
1. Set worker resource limits and detect memory issues: Monitor worker memory usage.

```javascript
const { Worker } = require('worker_threads');

const worker = new Worker('./worker.js', {
  resourceLimits: {
    maxOldGenerationSizeMb: 512,
    maxYoungGenerationSizeMb: 128,
  },
});

worker.on('exit', (code) => {
  if (code === 137) {
    console.error('Worker was killed (likely OOM)');
  } else if (code !== 0) {
    console.error(`Worker exited with code ${code}`);
  }
});

worker.on('error', (err) => {
  console.error('Worker error:', err);
});
```
2. Process data in chunks within the worker: Avoid loading all data at once.

```javascript
// worker.js
const fs = require('fs');
const { parentPort } = require('worker_threads');

parentPort.on('message', async (task) => {
  try {
    // BAD: load entire file into memory
    // const data = fs.readFileSync(task.filePath);
    // process(data);

    // GOOD: process in chunks
    const stream = fs.createReadStream(task.filePath, {
      highWaterMark: 64 * 1024, // 64KB chunks
    });

    let processed = 0;
    for await (const chunk of stream) {
      processChunk(chunk);
      processed += chunk.length;

      // Report progress and give GC a chance between chunks
      if (processed % (1024 * 1024) === 0) {
        parentPort.postMessage({ type: 'progress', bytes: processed });
        await new Promise((resolve) => setImmediate(resolve));
      }
    }

    parentPort.postMessage({ type: 'done' });
  } catch (err) {
    parentPort.postMessage({ type: 'error', message: err.message });
  }
});
```
3. Implement worker restart with task queuing: Handle worker crashes gracefully.

```javascript
const { Worker } = require('worker_threads');

class WorkerPool {
  constructor(script, maxWorkers = 4) {
    this.script = script;
    this.maxWorkers = maxWorkers;
    this.taskQueue = [];
    this.activeWorkers = 0;
  }

  async submit(task) {
    return new Promise((resolve, reject) => {
      this.taskQueue.push({ task, resolve, reject });
      this._dispatch();
    });
  }

  _dispatch() {
    while (this.activeWorkers < this.maxWorkers && this.taskQueue.length > 0) {
      const { task, resolve, reject } = this.taskQueue.shift();
      this._runTask(task).then(resolve).catch(reject);
    }
  }

  async _runTask(task, retries = 2) {
    const worker = new Worker(this.script);
    this.activeWorkers++;
    let settled = false;

    return new Promise((resolve, reject) => {
      worker.on('message', (result) => {
        if (result.type === 'error') {
          settled = true;
          reject(new Error(result.message));
          worker.terminate();
        } else if (result.type === 'done') {
          settled = true;
          resolve(result);
          worker.terminate(); // free the pool slot
        }
        // Ignore intermediate messages such as { type: 'progress' }
      });

      worker.on('exit', (code) => {
        this.activeWorkers--;
        this._dispatch();
        if (settled) return;

        if (code === 137 && retries > 0) {
          // Worker was OOM killed; retry the task
          console.warn(`Worker OOM killed, retrying (${retries} left)`);
          resolve(this._runTask(task, retries - 1));
        } else {
          reject(new Error(`Worker exited with code ${code}`));
        }
      });

      worker.postMessage(task);
    });
  }
}
```
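The retry-on-OOM logic inside `_runTask` generalizes to any promise-returning task. A self-contained sketch of the pattern (the `retryOnOom` name and the default predicate are illustrative, not part of the worker_threads API):

```javascript
// Retry an async operation when a predicate says the failure is retryable
// (here: an OOM-style crash), up to a fixed number of extra attempts.
async function retryOnOom(
  run,
  retries = 2,
  isRetryable = (err) => /code 137/.test(err.message)
) {
  try {
    return await run();
  } catch (err) {
    if (retries > 0 && isRetryable(err)) {
      console.warn(`Retryable failure, ${retries} attempts left`);
      return retryOnOom(run, retries - 1, isRetryable);
    }
    throw err; // non-retryable, or retries exhausted
  }
}

// Usage: a task that "OOMs" twice before succeeding on the third attempt.
let attempts = 0;
const flaky = async () => {
  attempts++;
  if (attempts < 3) throw new Error('Worker exited with code 137');
  return 'done';
};

retryOnOom(flaky).then((result) => {
  console.log(result, attempts); // 'done' 3
});
```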
Prevention
- Set explicit resourceLimits on workers to fail fast instead of SIGKILL
- Process data in chunks within workers to keep memory usage bounded
- Implement worker restart logic with task retry for crash recovery
- Monitor worker memory usage and alert before hitting OOM thresholds
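The last bullet can be built on `process.memoryUsage()` inside the worker. A minimal sketch, where the threshold ratio and polling interval are illustrative choices rather than Node defaults:

```javascript
// Returns current heap usage and whether it crossed warnRatio of the limit.
function checkHeap(limitBytes, warnRatio = 0.8) {
  const { heapUsed } = process.memoryUsage();
  return { heapUsed, warn: heapUsed > limitBytes * warnRatio };
}

// In a worker, poll periodically and alert the main thread before OOM:
// const { parentPort } = require('worker_threads');
// setInterval(() => {
//   const status = checkHeap(512 * 1024 * 1024); // match maxOldGenerationSizeMb
//   if (status.warn) parentPort.postMessage({ type: 'memory-warning', ...status });
// }, 5000).unref();

console.log(checkHeap(1).warn);                       // true: any heap beats a 1-byte limit
console.log(checkHeap(Number.MAX_SAFE_INTEGER).warn); // false
```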