Background job retry strategies

class ProcessPaymentWorker
  include Sidekiq::Worker

  sidekiq_options queue: :critical, retry: 10

  sidekiq_retry_in do |count, exception|
    case exception
    when PaymentGatewayTimeoutError
      # Exponential backoff for transient errors
      (count ** 4) + 15
    when PaymentValidationError
      # Fast fail for validation errors
      :kill
    else
      # Any other error: gentler backoff than for gateway timeouts
      (count ** 2) + 15
    end
  end

  def perform(payment_id)
    payment = Payment.find(payment_id)

    # Idempotency check
    return if payment.processed?

    gateway = PaymentGateway.new
    result = gateway.process(payment)

    payment.update!(
      status: 'processed',
      processed_at: Time.current,
      gateway_transaction_id: result.transaction_id
    )
  rescue PaymentGatewayTimeoutError
    Rails.logger.warn("Payment gateway timeout for payment #{payment_id}, will retry")
    raise
  rescue StandardError => e
    Rails.logger.error("Payment processing failed for #{payment_id}: #{e.message}")
    # payment is nil if Payment.find itself raised, so guard the update
    payment&.update(status: 'failed', error_message: e.message)
    raise
  end
end
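
To see what the two backoff formulas in the worker mean in practice, here is a small plain-Ruby sketch that reproduces the same arithmetic outside Sidekiq. The `retry_delay` helper and the empty error classes are stand-ins written for this illustration, and it assumes the `count` passed to `sidekiq_retry_in` starts at 0 for the first retry.

```ruby
# Stand-in error classes for illustration; the real ones live in the app.
class PaymentGatewayTimeoutError < StandardError; end
class PaymentValidationError < StandardError; end

# Hypothetical helper mirroring the sidekiq_retry_in block above.
# Returns the delay in seconds, or :kill to stop retrying.
def retry_delay(count, exception)
  case exception
  when PaymentGatewayTimeoutError then (count**4) + 15
  when PaymentValidationError then :kill
  else (count**2) + 15
  end
end

# Delays for the 10 retries configured with `retry: 10` (count 0 through 9).
timeout_delays = (0...10).map { |count| retry_delay(count, PaymentGatewayTimeoutError.new) }
default_delays = (0...10).map { |count| retry_delay(count, RuntimeError.new) }

puts "timeout delays: #{timeout_delays.inspect}"
puts "timeout total:  #{timeout_delays.sum} seconds"
puts "default delays: #{default_delays.inspect}"
puts "default total:  #{default_delays.sum} seconds"
puts "validation:     #{retry_delay(0, PaymentValidationError.new).inspect}"
```

Under those assumptions, a gateway timeout is retried over roughly 4.3 hours in total, while any other error uses up its retries in about 7 minutes. That difference is worth checking against how long a gateway outage is expected to last.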

Not all background job failures should be retried the same way. Transient failures such as network timeouts benefit from exponential backoff, while bugs in code or invalid data will not fix themselves and should fail fast, either immediately or after a few attempts. Sidekiq lets you configure retries per worker class through sidekiq_options, and I tune that configuration to the job's characteristics. For critical jobs like payment processing, I set a high retry count and notify engineers through exception tracking when retries are exhausted. For best-effort jobs like sending analytics events, I might disable retries entirely. The sidekiq_retry_in hook computes the backoff dynamically from the retry count and the exception, as in the worker above. I also add idempotency checks so that a retry does not duplicate side effects when the job partially succeeded before failing.
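
The partial-success case is the one a status check alone does not cover: if the gateway call succeeds but the database write after it fails, the payment still looks unprocessed on the next attempt. One common way to guard against that is to send the gateway a stable idempotency key. The sketch below simulates this in plain Ruby. `FakeGateway`, its `idempotency_key:` parameter, the `Payment` struct and `process_payment` are all made up for this illustration and are not the API of any real gateway or of the worker above.

```ruby
# Hypothetical in-memory gateway that deduplicates charges by idempotency key.
class FakeGateway
  attr_reader :charges

  def initialize
    @charges = {}
  end

  # Charges once per key; a repeated key returns the original transaction id.
  def process(amount_cents:, idempotency_key:)
    @charges[idempotency_key] ||= {
      amount_cents: amount_cents,
      transaction_id: "txn_#{@charges.size + 1}"
    }
    @charges[idempotency_key][:transaction_id]
  end
end

# Minimal stand-in for the Payment model.
Payment = Struct.new(:id, :amount_cents, :status, :gateway_transaction_id)

def process_payment(payment, gateway, fail_after_charge: false)
  # Cheap check first: nothing to do if a previous attempt fully succeeded.
  return payment.gateway_transaction_id if payment.status == "processed"

  # The key is derived from the payment id, so every retry sends the same key.
  txn_id = gateway.process(
    amount_cents: payment.amount_cents,
    idempotency_key: "payment-#{payment.id}"
  )

  # Simulated partial failure: charged, but the result was never recorded.
  raise "database write failed" if fail_after_charge

  payment.status = "processed"
  payment.gateway_transaction_id = txn_id
  txn_id
end

gateway = FakeGateway.new
payment = Payment.new(42, 1999, "pending", nil)

begin
  process_payment(payment, gateway, fail_after_charge: true)
rescue RuntimeError => e
  puts "first attempt failed: #{e.message}"
end

# The retry reuses the key, so the gateway does not charge a second time.
process_payment(payment, gateway)
puts "charges recorded by gateway: #{gateway.charges.size}"
puts "payment status: #{payment.status}"
```

Whether this works in a real system depends on the gateway actually supporting idempotency keys. Without that, the job would need some other record of the attempt written before the charge is made.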

Alex Kumar
