Scaling up cron jobs with Rails, whenever and Sidekiq

Scaling up application servers has quite a few standard techniques like Auto-Scaling etc., but we recently faced a problem where we wanted to run hundreds of scheduled/cron jobs.

We needed these for user communications , miscellaneous timely reports, caching jobs etc. and they have to run at precise timings, can’t be skipped, can’t be mistakenly executed twice. We were able to scale this up pretty well using whenever , and having a single dedicated server spin-up to execute these crons using a very easy to maintain scheduling configuration provided by whenever.

But as more and more scheduled tasks were required, server sometimes got shut down due to some bad cron / multiple crons running several at a time and we wanted to scale this as seamlessly as auto-scaling. But auto-scaling could not be used because multiple servers might execute the same job.

We already had Sidekiq in our stack, for several asynchronous tasks, and the idea of crons running as separate sidekiq job looked just perfect for solving our key problems:

a) Scaling

b) Limiting the effect of one bad job to that one only.

Neither do I wanted to lose ‘whenever’ as it provides very easy-to-maintain configuration which is quite developer-friendly.

The design followed this way : ‘To read jobs directly from whenever gem and push it to sidekiq’. This way all our existing jobs could be moved to Sidekiq in an abstracted manner, without moving each and every job individually – which in itself could have been a huge erroneous task.

Specifically:

  1. Create a single sidekiq job that will execute a rake task programmatically using  the task name.
  2. Create a single task to run every day which will read the job list from whenever and push the above job accordingly as per its frequency and scheduled times.

Did some digging around and monkeypatched some of whenever gem’s methods to access them:


require 'whenever'
module Whenever
  class JobList
    def getjobs
       @jobs
    end
  end
end

module Whenever
  class Job
    def getoptions
      @options
    end
  end
end

Now you can use them as:


jl = Whenever::JobList.new({update: true, file: 'config/schedule.rb'})
jobs = jl.getjobs

Simply iterate on each job, identify at what time to ‘perform_at’ using the key of job and ‘at’ option, push it to a common sidekiq worker. Note that the key here is integer value representing frequency of job execution or the precise time, all of these cases we can handle programmatically to enqueue all next 24 hours jobs as:

Rake::Task[job.getoptions[:task]].execute

Using above we scaled well over 10k different scheduled tasks over a day, and all complex logic of scheduler has now been abstracted over to above jobs in terms of logic and independent auto-scaled servers running Sidekiq in terms of scalability.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s