
Deploying and Monitoring Delayed Job

I recently set up Delayed Job (v2.0.3) for my Rails app. It was nice and easy to use in my code, but I found managing the background worker process(es) in production tricky.

In development you can run the workers in the foreground with rake jobs:work. Easy.

In production the workers have to be daemonised, so you use:

script/delayed_job start|stop|restart

In theory, also easy. However, you have to make sure you use the right version of daemons, the third-party gem which does the daemonising, and even then you have to be careful that the old workers have finished before you start new ones.

This is my Capistrano deploy.rb before I added monitoring:

namespace :delayed_job do

  task :stop, :roles => :app do
    run "cd #{current_path}; RAILS_ENV=#{rails_env} script/delayed_job stop"
  end

  task :start, :roles => :app do
    run "cd #{current_path}; RAILS_ENV=#{rails_env} script/delayed_job start"
  end

  # Override normal restart to force wait for job-in-progress to finish.
  # http://gist.github.com/178397
  # http://github.com/collectiveidea/delayed_job/issues#issue/3
  desc "Restart the delayed_job process"
  task :restart, :roles => :app do
    stop
    wait_for_process_to_end('delayed_job')
    start
  end

end

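# Poll the server every 2 seconds until no process matching process_name remains.
# The grep -v filters drop the ps and grep commands themselves from the count.
def wait_for_process_to_end(process_name)
  run "COUNT=1; until [ $COUNT -eq 0 ]; do COUNT=`ps -ef | grep -v 'ps -ef' | grep -v 'grep' | grep -i '#{process_name}'|wc -l` ; echo 'waiting for #{process_name} to end' ; sleep 2 ; done"
end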

after "deploy:stop",    "delayed_job:stop"
after "deploy:start",   "delayed_job:start"
after "deploy:restart", "delayed_job:restart"

The :start and :stop tasks shouldn’t be necessary, because Delayed Job ships its own Capistrano tasks which should load from the gem. But I couldn’t get that to work for me. The important point is that :restart waits for the old workers to exit before starting new ones.
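
One weakness of the wait loop in this recipe is that it polls forever if a worker never exits. Here is a minimal sketch of a variant that gives up after a while. The helper name wait_for_process_command and the max_wait and interval parameters are my own inventions for illustration, not part of Delayed Job or Capistrano. It only builds the shell command as a string, so it can be inspected before being handed to run.

```ruby
# Hypothetical helper: builds a shell loop that waits for a process to exit,
# but exits with status 1 after roughly max_wait seconds instead of
# polling forever. Returns the command as a string; it runs nothing itself.
def wait_for_process_command(process_name, max_wait = 60, interval = 2)
  # Same counting pipeline as the recipe: list processes, drop the ps and
  # grep commands themselves, then count the lines matching the name.
  count = "ps -ef | grep -v 'ps -ef' | grep -v 'grep' | grep -i '#{process_name}' | wc -l"

  # Bail out once we have waited long enough.
  give_up = "if [ $WAITED -ge #{max_wait} ]; then " \
            "echo 'gave up waiting for #{process_name}'; exit 1; fi"

  "WAITED=0; while [ `#{count}` -gt 0 ]; do #{give_up}; " \
    "echo 'waiting for #{process_name} to end'; sleep #{interval}; " \
    "WAITED=$((WAITED + #{interval})); done"
end

# Print the generated command so it can be checked by eye.
puts wait_for_process_command('delayed_job', 30, 5)
```

In the recipe this would be used as run wait_for_process_command('delayed_job'). As far as I know, Capistrano's run fails the task when the remote command exits non-zero, which would abort the restart rather than start new workers alongside a stuck one, but treat that as an assumption to check against your Capistrano version.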

Monitoring with Monit

The deployment recipe above worked well for me. But I needed to make it work with Monit, the tool I use to monitor my services and processes.

First things first. My Capistrano start and stop tasks became:

  task :stop, :roles => :app do
    sudo "monit -c /etc/monit.conf stop delayed_job"
  end

  task :start, :roles => :app do
    sudo "monit -c /etc/monit.conf start delayed_job"
  end

The final step was getting Monit to start and stop Delayed Job. Delayed Job bundles a sample Monit configuration but its start and stop commands didn’t work for me. It took me several hours, but this is my solution:

start program = "/bin/su - deploy -c 'cd /var/www/apps/myapp/current; RAILS_ENV=production script/delayed_job start'"
stop program = "/bin/su - deploy -c 'cd /var/www/apps/myapp/current; RAILS_ENV=production script/delayed_job stop'"

This runs as the correct user (deploy in my case) and changes to the correct directory before executing the delayed_job script. Without both of these I couldn’t get Monit to manage the workers.
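
For context, here is a rough sketch of how those two commands might sit inside a complete Monit entry, generated from Ruby so the user and path are defined in one place. The helper name delayed_job_monit_stanza is hypothetical. The pidfile location (shared/pids/delayed_job.pid) is an assumption based on the usual Capistrano layout, so check where your workers actually write their pid.

```ruby
# Hypothetical helper, not part of Delayed Job or Monit: renders a Monit
# "check process" entry that wraps the start and stop commands in
# "su - <user>" and a cd into the current release.
def delayed_job_monit_stanza(user, app_root, rails_env = "production")
  # Build the start or stop command line for the given action.
  command = lambda do |action|
    "/bin/su - #{user} -c 'cd #{app_root}/current; " \
      "RAILS_ENV=#{rails_env} script/delayed_job #{action}'"
  end

  # ASSUMPTION: the pidfile lives in shared/pids; adjust to your setup.
  <<-MONIT
check process delayed_job
  with pidfile #{app_root}/shared/pids/delayed_job.pid
  start program = "#{command.call('start')}"
  stop program = "#{command.call('stop')}"
  MONIT
end

puts delayed_job_monit_stanza("deploy", "/var/www/apps/myapp")
```

The entry is named delayed_job so that it matches the name used in the monit start and stop calls in the Capistrano tasks.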

Monit alert – Execution failed

As an aside, Monit was somewhat taciturn when it came to explaining why it couldn’t start or stop my workers. The trick was to quit Monit and start it again with -Iv (-I keeps it in the foreground, -v makes it verbose). It then writes everything to the console, including the errors that were preventing it from executing my start and stop commands.

The Result

If my Delayed Job workers ever go down, Monit will restart them for me.

When I deploy a new release of my app, the workers are stopped through Monit, the restart task waits for all the background tasks to finish, and then Monit starts the workers up again. If I stopped them myself as in my first recipe, Monit would try to restart them in the middle of the deployment.

And I can bounce the workers with a quick cap delayed_job:restart.

Andrew Stewart • 4 November 2010 • Deployment