Introduction

Ruby's Open3 library spawns child processes to run external commands. Without proper timeout handling, a hung subprocess can hold system resources indefinitely, and children that exit without being waited on linger as zombie processes that can eventually exhaust the process table. In production applications that process files, run external tools, or execute system commands, unmanaged subprocesses are a common cause of gradual resource degradation.

Symptoms

  • Application memory grows as subprocesses accumulate
  • ps aux shows zombie (defunct) Ruby child processes
  • System reports fork: Cannot allocate memory under load
  • External command execution hangs indefinitely
  • File descriptors exhausted from unclosed subprocess pipes

Check for zombie processes:

```bash
ps aux | grep defunct
# Output:
# user 12345 0.0 0.0 0 0 ? Z 10:00 0:00 [ruby] <defunct>

# Count zombie processes (the STAT column may read Z, Z+, Zs, ...)
ps aux | awk '$8 ~ /^Z/ { print }' | wc -l
```

Common Causes

  • Open3.capture3 called without timeout on unreliable external commands
  • Subprocess output not fully consumed, causing pipe deadlock
  • Child process not waited on after termination
  • Standard error pipe buffer full causing subprocess to block
  • Parent process killed before child, leaving orphan
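
The third cause in the list, a child that exits but is never waited on, is what produces the defunct entries. A minimal sketch of the two usual ways to reap a child started with `Process.spawn` follows. The helper names `run_and_reap` and `spawn_detached` are hypothetical and exist only for illustration.

```ruby
# Blocking reap: wait for the child and collect its exit status.
def run_and_reap(*argv)
  pid = Process.spawn(*argv)
  _, status = Process.wait2(pid) # removes the child's process-table entry
  status
end

# Non-blocking reap: Process.detach starts a thread that waits on the child,
# so it cannot linger as a zombie even if the caller never reads the status.
def spawn_detached(*argv)
  pid = Process.spawn(*argv)
  Process.detach(pid) # waiter thread; its #value is the Process::Status
end
```

Open3 uses the second approach internally: the `wait_thr` it yields is the thread returned by `Process.detach`.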

Step-by-Step Fix

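  1. Add a timeout to Open3 calls and kill the child when it expires:

```ruby
require 'open3'
require 'timeout'

# WRONG - no timeout, can hang forever
stdout, stderr, status = Open3.capture3("external-tool --process large-file.csv")

# CORRECT - with timeout and proper error handling.
# Wrapping Open3.capture3 in Timeout.timeout is not enough: Open3's block form
# waits for the child during cleanup, so the child has to be killed explicitly.
def run_with_timeout(command, timeout_sec: 30)
  Open3.popen3(command) do |stdin, stdout, stderr, wait_thr|
    stdin.close
    # Drain both pipes in background threads so the child cannot block on them
    readers = [stdout, stderr].map { |io| Thread.new { io.read } }
    begin
      Timeout.timeout(timeout_sec) { wait_thr.join }
    rescue Timeout::Error
      Process.kill("KILL", wait_thr.pid) if wait_thr.alive?
      readers.each(&:kill)
      raise "Command timed out after #{timeout_sec}s: #{command}"
    end
    [*readers.map(&:value), wait_thr.value]
  end
rescue Errno::ENOENT
  raise "Command not found: #{command}"
end

stdout, stderr, status = run_with_timeout(
  "external-tool --process large-file.csv",
  timeout_sec: 60
)
```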
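
When a timeout fires, the child process still has to be stopped. One common pattern, sketched here as an assumption rather than something Open3 provides, is to send TERM first so the tool can clean up, and escalate to KILL only after a grace period. `terminate_child` and `grace_sec` are hypothetical names.

```ruby
require 'open3'

# Stop a child started via Open3.popen3: TERM first, KILL after grace_sec.
# wait_thr is the waiter thread that popen3 provides as its last value.
def terminate_child(wait_thr, grace_sec: 5)
  Process.kill("TERM", wait_thr.pid)
  # Thread#join returns nil if the thread is still running after the limit
  unless wait_thr.join(grace_sec)
    Process.kill("KILL", wait_thr.pid)
    wait_thr.join
  end
  wait_thr.value
rescue Errno::ESRCH
  # The child exited on its own before the signal could be delivered
  wait_thr.value
end
```

The returned `Process::Status` reports `signaled?` as true when the child was stopped by one of the two signals.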

  2. Handle large output without pipe deadlock:

```ruby
require 'open3'
require 'timeout'

# For commands that produce large output, use popen3 with streaming
def run_with_streaming_output(command, timeout_sec: 30)
  Open3.popen3(command) do |stdin, stdout, stderr, wait_thr|
    stdin.close # Close stdin if not writing

    # Read stdout and stderr simultaneously to prevent pipe buffer deadlock
    stdout_thread = Thread.new { stdout.read }
    stderr_thread = Thread.new { stderr.read }

    begin
      Timeout.timeout(timeout_sec) do
        output = stdout_thread.value
        error = stderr_thread.value
        exit_status = wait_thr.value
        [output, error, exit_status]
      end
    rescue Timeout::Error
      # Kill the child first: popen3 waits for it when the block exits
      Process.kill("KILL", wait_thr.pid) if wait_thr.alive?
      [stdout_thread, stderr_thread].each(&:kill)
      raise
    end
  end
end
```
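
Calling `read` on a pipe still buffers the entire output in memory. Where the output is too large for that, a variant that hands each line to a block as it arrives may be preferable. The following is a sketch only: `stream_lines` is a hypothetical helper, stderr is merged into stdout with `Open3.popen2e` so that a single reader drains everything, and timeout handling is left out for brevity.

```ruby
require 'open3'

# Yield each output line as it is produced instead of buffering everything.
# popen2e merges stderr into stdout, so one reader drains both streams and
# neither pipe can fill up unread.
def stream_lines(*command)
  Open3.popen2e(*command) do |stdin, out_and_err, wait_thr|
    stdin.close
    out_and_err.each_line { |line| yield line.chomp }
    wait_thr.value # Process::Status of the finished child
  end
end
```

A caller would pass the command as separate arguments and process lines in the block, for example `stream_lines("external-tool", "--process", "large-file.csv") { |line| puts line }`.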

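  3. Properly clean up subprocess on parent exit:

```ruby
require 'open3'
require 'timeout'

# Set up signal handlers to kill child processes
class SubprocessRunner
  def self.run(command, timeout_sec: 30)
    pid = nil

    # Kill child on parent exit
    previous_handler = trap("TERM") do
      Process.kill("TERM", pid) if pid && process_alive?(pid)
      exit 1
    end

    Open3.popen3(command) do |stdin, stdout, stderr, wait_thr|
      pid = wait_thr.pid
      stdin.close

      # Read both pipes concurrently; reading them one after the other can deadlock
      stdout_thread = Thread.new { stdout.read }
      stderr_thread = Thread.new { stderr.read }

      begin
        Timeout.timeout(timeout_sec) { wait_thr.join }
      rescue Timeout::Error
        # Stop the child so popen3's cleanup does not wait on it forever
        Process.kill("KILL", pid) if process_alive?(pid)
        [stdout_thread, stderr_thread].each(&:kill)
        raise
      end

      [stdout_thread.value, stderr_thread.value, wait_thr.value]
    end
  ensure
    # Restore the TERM handler that was active before this call
    trap("TERM", previous_handler) if previous_handler
  end

  def self.process_alive?(pid)
    Process.kill(0, pid)
    true
  rescue Errno::ESRCH
    false
  end
end
```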
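
When the command runs through a shell or starts helpers of its own, signalling only the direct child can leave grandchildren behind. A possible approach, assuming a POSIX system, is to start the child in its own process group and signal the whole group. `run_in_group` is a hypothetical name; `pgroup: true` is a standard `Process.spawn` option that Open3 passes through.

```ruby
require 'open3'
require 'timeout'

# Run a command in its own process group and kill the whole group on timeout.
def run_in_group(command, timeout_sec: 30)
  Open3.popen3(command, pgroup: true) do |stdin, stdout, stderr, wait_thr|
    stdin.close
    readers = [stdout, stderr].map { |io| Thread.new { io.read } }
    begin
      Timeout.timeout(timeout_sec) { wait_thr.join }
    rescue Timeout::Error
      begin
        # A leading minus on the signal name targets the process group;
        # with pgroup: true the group ID equals the child's PID.
        Process.kill("-KILL", wait_thr.pid)
      rescue Errno::ESRCH
        # The group is already gone
      end
      readers.each(&:kill)
      raise
    end
    [*readers.map(&:value), wait_thr.value]
  end
end
```

The trade-off is that a child in its own group no longer receives terminal-generated signals such as Ctrl-C aimed at the parent's group, so the parent becomes responsible for stopping it.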

  4. Monitor subprocess resource usage:

```ruby
# Helper to track child processes of the current Ruby process
class SubprocessMonitor
  def self.active_children
    # Linux: this file lists the children started by the main thread
    path = "/proc/#{Process.pid}/task/#{Process.pid}/children"
    if File.exist?(path)
      File.read(path).split.map(&:to_i)
    else
      # Fallback: parse ps output (the short-lived ps helper is itself a child
      # at that moment, so it appears in the list)
      `ps --ppid #{Process.pid} -o pid= 2>/dev/null`.split.map(&:to_i)
    end
  end

  def self.log_if_excessive(max_children: 10)
    children = active_children
    if children.size > max_children
      Rails.logger.warn(
        "Excessive child processes: #{children.size} (PIDs: #{children.join(', ')})"
      )
    end
  end
end
```
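
The `/proc` file is Linux-specific, and `--ppid` is not available in every `ps` implementation. Where neither can be relied on, one alternative is to track the children the application itself starts. The sketch below assumes all subprocesses are launched through a single registry object; `ChildRegistry`, `start`, and `active_pids` are hypothetical names.

```ruby
# Tracks children started through this object, without /proc or ps.
class ChildRegistry
  def initialize
    @mutex = Mutex.new
    @waiters = {} # pid => waiter thread returned by Process.detach
  end

  # Start a command and remember the thread that will reap it.
  def start(*argv)
    pid = Process.spawn(*argv)
    waiter = Process.detach(pid)
    @mutex.synchronize { @waiters[pid] = waiter }
    pid
  end

  # PIDs of children that have not been reaped yet.
  def active_pids
    @mutex.synchronize do
      @waiters.delete_if { |_pid, waiter| !waiter.alive? }
      @waiters.keys
    end
  end
end
```

A health check could then compare `active_pids.size` against a threshold in the same way as `log_if_excessive`.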

Prevention

  • Always enforce a timeout on Open3 calls with reasonable limits, and kill the child process when it expires; Timeout.timeout on its own does not stop the child
  • Use streaming output (popen3) for commands that produce large output
  • Set up signal handlers to clean up child processes on parent exit
  • Monitor child process count in production health checks
  • Prefer native Ruby libraries over shell commands when available
  • Add subprocess count alerts to monitoring dashboards
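
As an illustration of preferring native Ruby over shell commands, tasks such as counting lines or compressing a file need no subprocess at all. The sketch below uses only the standard library; `count_lines` and `gzip_file` are hypothetical helper names.

```ruby
require 'zlib'

# Instead of shelling out to `wc -l`: count lines in-process.
# Unlike wc, this also counts a final line that lacks a trailing newline.
def count_lines(path)
  File.foreach(path).count
end

# Instead of shelling out to `gzip -c src > dest`: compress in-process.
def gzip_file(src, dest)
  Zlib::GzipWriter.open(dest) do |gz|
    File.open(src, "rb") { |file| IO.copy_stream(file, gz) }
  end
  dest
end
```

With no child process involved, there is nothing to time out, reap, or kill.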