Introduction

When a Sidekiq job fails repeatedly, it exhausts its retry budget and moves to the Dead set. By default, Sidekiq performs 25 retries over roughly 20 days using exponential backoff. If the underlying issue is never fixed, the job is pruned from the dead set after 6 months (or sooner, once the set exceeds its 10,000-job cap) and is permanently lost. For payment processing, email delivery, and data synchronization jobs, this is a critical production failure mode.
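The ~20-day window follows from Sidekiq's documented default backoff, roughly `(retry_count ** 4) + 15` seconds plus random jitter. Summing it (jitter omitted) shows why 25 retries span about three weeks:

```ruby
# Approximate Sidekiq default retry delays: (count ** 4) + 15 seconds, jitter omitted.
delays = (0...25).map { |count| (count ** 4) + 15 }

total_days = delays.sum / 86_400.0
puts "Retry window: ~#{total_days.round(1)} days"
```

The first few retries land within seconds to minutes; the last few are days apart, which is why a permanently failing job lingers for weeks before dying.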

Symptoms

  • Sidekiq Web UI shows jobs in the "Dead" tab
  • Sidekiq::DeadSet contains jobs that should have been processed
  • No alerts fired when jobs moved to dead set
  • Important business logic (emails, payments) silently failed
  • Dead set grows continuously indicating systemic issues

Example error in logs:

```
2026-04-09T10:15:00.000Z pid=1234 tid=abc123 WARN: {"context":"Job raised exception","job":{"class":"ProcessPaymentJob","args":[12345],"retry":25,"queue":"default"},"error_class":"Stripe::InvalidRequestError","error_message":"No such customer: cus_abc123","failed_at":1712500000,"retry_count":25}
```

Common Causes

  • External service returns permanent error (404, invalid data)
  • Retry count too low for transient failure patterns
  • No alerting configured for dead jobs
  • Job arguments reference deleted records
  • Infinite retry loop: bug causes same failure every time
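The deleted-record cause is worth guarding against directly: a job that raises `RecordNotFound` will retry for days over data that can never come back. A minimal sketch of the guard-clause pattern (`SendInvoiceJob`, the `store` argument, and the return symbols are illustrative stand-ins for a real model lookup):

```ruby
# Hypothetical job: the store hash stands in for the database
# (e.g. Invoice.find_by(id: invoice_id) in a real app).
class SendInvoiceJob
  # include Sidekiq::Job  # in a real app
  def perform(invoice_id, store)
    invoice = store[invoice_id]
    return :skipped if invoice.nil? # record was deleted: exit cleanly, no retry
    # ... deliver the invoice here ...
    :delivered
  end
end
```

Returning instead of raising means the job completes successfully and never reaches the retry machinery at all.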

Step-by-Step Fix

1. Check and recover dead jobs:

```ruby
# In Rails console
dead_set = Sidekiq::DeadSet.new
dead_set.size # Number of dead jobs

# Find specific failed jobs
dead_set.select { |job| job.klass == 'ProcessPaymentJob' }.each do |job|
  puts "Args: #{job.args.inspect}, Error: #{job['error_message']}, Failed: #{Time.at(job['failed_at'])}"
end

# Retry all dead jobs of a specific class
dead_set.select { |job| job.klass == 'ProcessPaymentJob' }.each(&:retry)
```

2. Configure per-job retry strategies:

```ruby
class ProcessPaymentJob
  include Sidekiq::Job
  # dead: false discards exhausted jobs instead of sending them to the Dead set
  sidekiq_options queue: :payments, retry: 10, dead: false

  # Or use custom retry logic per exception type
  sidekiq_retry_in do |count, exception|
    case exception
    when Stripe::RateLimitError
      60 * count # Linear backoff for rate limits
    when Stripe::APIConnectionError
      5 * (2 ** count) # Exponential backoff for connection errors
    when Stripe::InvalidRequestError
      :discard # Don't retry invalid requests (Sidekiq 6.5+; return :kill to send straight to the Dead set instead)
    else
      15 * (2 ** count) # Default exponential backoff
    end
  end
end
```

3. Add alerting for dead jobs:

```ruby
# config/initializers/sidekiq.rb
Sidekiq.configure_server do |config|
  config.death_handlers << lambda do |job, ex|
    # Send to error tracking service
    Sentry.with_scope do |scope|
      scope.set_tags(job_class: job['class'], job_id: job['jid'])
      scope.set_context('sidekiq_job', { args: job['args'] })
      Sentry.capture_exception(ex)
    end

    # Or send a Slack alert for critical job classes
    if %w[ProcessPaymentJob SendInvoiceJob].include?(job['class'])
      Slack.notify("#alerts", "CRITICAL: #{job['class']} dead after #{job['retry_count']} retries")
    end
  end
end
```
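Because a death handler is just a callable taking `(job_hash, exception)`, its routing logic can be unit-tested in plain Ruby without Sidekiq, Sentry, or Slack running. A hedged sketch (the `CRITICAL` list and message format are assumptions for illustration, not Sidekiq API):

```ruby
# Which job classes warrant an alert (illustrative list).
CRITICAL = %w[ProcessPaymentJob SendInvoiceJob].freeze

# Same shape Sidekiq passes to death_handlers: a job hash and the exception.
build_alert = lambda do |job, ex|
  return nil unless CRITICAL.include?(job['class'])
  "CRITICAL: #{job['class']} dead after #{job['retry_count']} retries (#{ex.class})"
end

msg = build_alert.call({ 'class' => 'ProcessPaymentJob', 'retry_count' => 25 },
                       RuntimeError.new('boom'))
```

Extracting the message-building lambda this way lets you assert on its output in a spec, then register the same object in `config.death_handlers`.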

4. Implement dead letter queue monitoring:

```ruby
# Rake task to monitor the dead queue
namespace :sidekiq do
  desc "Report on dead jobs"
  task dead_report: :environment do
    dead_set = Sidekiq::DeadSet.new
    puts "Dead jobs: #{dead_set.size}"

    dead_set.each do |job|
      puts "  #{job['class']}: #{job['error_message']} (#{Time.at(job['failed_at'])})"
    end
  end
end
```
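To turn a raw dead-job listing into an audit signal, tally entries by class: one class dominating the set usually points at a single systemic bug rather than many independent failures. A sketch over plain job hashes (in production you would iterate `Sidekiq::DeadSet.new` instead of the sample array):

```ruby
# Sample job hashes standing in for Sidekiq::DeadSet entries.
jobs = [
  { 'class' => 'ProcessPaymentJob' },
  { 'class' => 'ProcessPaymentJob' },
  { 'class' => 'SendInvoiceJob' }
]

# Count dead jobs per class, most frequent first.
by_class = jobs.map { |j| j['class'] }.tally.sort_by { |_, count| -count }
by_class.each { |klass, count| puts "#{count}x #{klass}" }
```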

Prevention

  • Configure dead: false for jobs that must never occupy the Dead set; note that exhausted retries are then discarded outright, so use with caution
  • Set up monitoring and alerting on dead queue size
  • Use sidekiq_retry_in to implement smart retry strategies per exception type
  • Add idempotency keys to jobs so retries are safe
  • Regularly audit dead jobs to identify systemic issues
  • Keep dead-job retention reasonable: the dead_timeout_in_seconds and dead_max_jobs options (defaults: 6 months and 10,000 jobs) control how long and how many dead jobs Sidekiq keeps
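The idempotency-key bullet can be sketched as follows. The in-memory Set here is a stand-in for shared storage; a real implementation would use Redis (e.g. `SET key NX EX ttl`) so the guard holds across processes and restarts:

```ruby
require 'set'

# Hypothetical processor: records every idempotency key it has handled
# so a retried job cannot charge the same payment twice.
class PaymentProcessor
  def initialize
    @processed = Set.new # stand-in for Redis in a real app
  end

  def charge(idempotency_key, amount_cents)
    # Set#add? returns nil if the key was already present.
    return :duplicate unless @processed.add?(idempotency_key)
    # ... call the payment gateway here ...
    :charged
  end
end
```

With this guard in place, Sidekiq's at-least-once delivery becomes safe: a retry of an already-completed charge is a no-op rather than a double bill.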