Working with Postgresql generate_series in Rails and ActiveRecord

ActiveRecord provides a lot of advantages as an ORM dealing with queries alongwith complex business logics. However things get sometimes difficult with complex queries.

Recently I faced a similar issue, when I had to use generate_series over one of the table which contained date ranges and a value, to get a grouped date by value, the raw sql of which would go something like:


select sum(amount) as sum_amount, custom_date
from my_table,
generate_series(my_table.start_date, my_table.end_date, interval '1 day') custom_date
group by custom_date;

The above looks out of way to write with ActiveRecord as it performs a cross join of original table with generate_series() and none of the AR methods seamlessly allow you to do so.

The other option was writing raw sql, which I wanted to avoid completely as it would take out one of key feature of AR i.e. chaining of conditions, which was must as the code involved a lot of where conditions over the parameters to build the final query.

I worked back to how some of the AR method works, and played with joins() to get desired sql being built successfully as (note the comma in the joins method to get the desired sql🙂 ):

MyModel.
  joins(", generate_series(my_table.start_date,my_table.end_date,interval '1 day') custom_date").
  group("custom_date").
  select("sum(amount) as sum_amount, custom_date ")

 

The above query became very helpful in writing complex SQLs involving date ranges using Rails!

Scaling up cron jobs with Rails, whenever and Sidekiq

Scaling up application servers has quite a few standard techniques like Auto-Scaling etc., but we recently faced a problem where we wanted to run hundreds of scheduled/cron jobs.

We needed these for user communications , miscellaneous timely reports, caching jobs etc. and they have to run at precise timings, can’t be skipped, can’t be mistakenly executed twice. We were able to scale this up pretty well using whenever , and having a single dedicated server spin-up to execute these crons using a very easy to maintain scheduling configuration provided by whenever.

But as more and more scheduled tasks were required, server sometimes got shut down due to some bad cron / multiple crons running several at a time and we wanted to scale this as seamlessly as auto-scaling. But auto-scaling could not be used because multiple servers might execute the same job.

We already had Sidekiq in our stack, for several asynchronous tasks, and the idea of crons running as separate sidekiq job looked just perfect for solving our key problems:

a) Scaling

b) Limiting the effect of one bad job to that one only.

Neither do I wanted to lose ‘whenever’ as it provides very easy-to-maintain configuration which is quite developer-friendly.

The design followed this way : ‘To read jobs directly from whenever gem and push it to sidekiq’. This way all our existing jobs could be moved to Sidekiq in an abstracted manner, without moving each and every job individually – which in itself could have been a huge erroneous task.

Specifically:

  1. Create a single sidekiq job that will execute a rake task programmatically using  the task name.
  2. Create a single task to run every day which will read the job list from whenever and push the above job accordingly as per its frequency and scheduled times.

Did some digging around and monkeypatched some of whenever gem’s methods to access them:


require 'whenever'
module Whenever
  class JobList
    def getjobs
       @jobs
    end
  end
end

module Whenever
  class Job
    def getoptions
      @options
    end
  end
end

Now you can use them as:


jl = Whenever::JobList.new({update: true, file: 'config/schedule.rb'})
jobs = jl.getjobs

Simply iterate on each job, identify at what time to ‘perform_at’ using the key of job and ‘at’ option, push it to a common sidekiq worker. Note that the key here is integer value representing frequency of job execution or the precise time, all of these cases we can handle programmatically to enqueue all next 24 hours jobs as:

Rake::Task[job.getoptions[:task]].execute

Using above we scaled well over 10k different scheduled tasks over a day, and all complex logic of scheduler has now been abstracted over to above jobs in terms of logic and independent auto-scaled servers running Sidekiq in terms of scalability.