I recently set up Delayed Job (v2.0.3) for my Rails app. It was nice and easy to use in my code, but I found managing the background worker process(es) in production tricky.
In development you can run the workers in the foreground with rake jobs:work. Easy.
In production the workers have to be daemonised, so you use:
script/delayed_job start|stop|restart
In theory, also easy. However, you have to make sure you use the right version of the third-party gem that does the daemonising, and even then you have to be careful that the old workers have finished before you start new ones.
This is my Capistrano deploy.rb before I added monitoring:
namespace :delayed_job do
  task :stop, :roles => :app do
    run "cd #{current_path}; RAILS_ENV=#{rails_env} script/delayed_job stop"
  end
  task :start, :roles => :app do
    run "cd #{current_path}; RAILS_ENV=#{rails_env} script/delayed_job start"
  end
  # Override normal restart to force wait for job-in-progress to finish.
  # http://gist.github.com/178397
  # http://github.com/collectiveidea/delayed_job/issues#issue/3
  desc "Restart the delayed_job process"
  task :restart, :roles => :app do
    stop
    wait_for_process_to_end('delayed_job')
    start
  end
end
def wait_for_process_to_end(process_name)
  run "COUNT=1; until [ $COUNT -eq 0 ]; do COUNT=`ps -ef | grep -v 'ps -ef' | grep -v 'grep' | grep -i '#{process_name}'|wc -l` ; echo 'waiting for #{process_name} to end' ; sleep 2 ; done"
end
after "deploy:stop", "delayed_job:stop"
after "deploy:start", "delayed_job:start"
after "deploy:restart", "delayed_job:restart"
The :start and :stop tasks shouldn’t be necessary, because Delayed Job should load them from the gem, but I couldn’t get that to work for me. The important point is that :restart waits for the workers to finish.
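One weakness of the wait loop above is that it never gives up: if a worker hangs, so does the deploy. Here is a minimal sketch of a bounded version. The helper name wait_command and its max_tries and interval parameters are my own inventions for illustration, not part of Capistrano or Delayed Job. It only builds the shell command string, using the same ps and grep pipeline as the original.

```ruby
# Hypothetical helper (not part of Capistrano or Delayed Job): builds a shell
# loop like the one in wait_for_process_to_end, but gives up after max_tries
# checks instead of polling forever.
def wait_command(process_name, max_tries = 30, interval = 2)
  # Same ps/grep pipeline as the original recipe: counts matching processes.
  count = "ps -ef | grep -v 'ps -ef' | grep -v grep | grep -i '#{process_name}' | wc -l"
  "TRIES=0; " \
  "while [ $(#{count}) -gt 0 ]; do " \
  "TRIES=$((TRIES+1)); " \
  "if [ $TRIES -ge #{max_tries} ]; then echo 'gave up waiting for #{process_name}'; exit 1; fi; " \
  "echo 'waiting for #{process_name} to end'; sleep #{interval}; " \
  "done"
end

puts wait_command('delayed_job')
```

Inside deploy.rb this would be used as run wait_command('delayed_job') in place of the original loop. The non-zero exit status should make the Capistrano run fail, so a stuck worker aborts the deploy rather than blocking it indefinitely; whether that is what you want depends on your setup.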
The deployment recipe above worked well for me. But I needed to make it work with Monit, the tool I use to monitor my services and processes.
First things first. My Capistrano start and stop tasks became:
task :stop, :roles => :app do
  sudo "monit -c /etc/monit.conf stop delayed_job"
end
task :start, :roles => :app do
  sudo "monit -c /etc/monit.conf start delayed_job"
end
The final step was getting Monit to start and stop Delayed Job. Delayed Job bundles a sample Monit configuration, but its start and stop commands didn’t work for me. It took me several hours to arrive at this:
start program = "/bin/su - deploy -c 'cd /var/www/apps/myapp/current; RAILS_ENV=production script/delayed_job start'"
stop program = "/bin/su - deploy -c 'cd /var/www/apps/myapp/current; RAILS_ENV=production script/delayed_job stop'"
This runs the delayed_job script as the correct user (deploy in my case) and changes to the correct directory first. Without both of these I couldn’t get Monit to manage the workers.
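Those two lines live inside a Monit check process block. As a rough sketch, here is a small Ruby helper that renders a complete stanza, so the user and app path are only written once. The helper name monit_stanza is hypothetical, and the pid file location (shared/pids/delayed_job.pid) is an assumption; check where your workers actually write theirs.

```ruby
# Hypothetical helper, not part of Delayed Job or Monit: renders a Monit
# stanza so the user and app path are written in one place only.
def monit_stanza(app_root, user, rails_env = "production")
  # Assumption: the worker writes its pid file to shared/pids/delayed_job.pid.
  pidfile = "#{app_root}/shared/pids/delayed_job.pid"
  # Builds the su command for a given action ("start" or "stop").
  command = lambda do |action|
    "/bin/su - #{user} -c 'cd #{app_root}/current; RAILS_ENV=#{rails_env} script/delayed_job #{action}'"
  end
  <<-CONF
check process delayed_job with pidfile #{pidfile}
  start program = "#{command.call('start')}"
  stop program = "#{command.call('stop')}"
  CONF
end

puts monit_stanza("/var/www/apps/myapp", "deploy")
```

The process name delayed_job in the check line is what the monit stop delayed_job and monit start delayed_job commands in the Capistrano tasks refer to, so the two have to match.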
As an aside, Monit was somewhat taciturn when it came to explaining why it couldn’t start or stop my workers. The trick was to quit Monit and start it with -Iv: -I keeps it in the foreground and -v makes it verbose. It then writes everything out on the console, including the errors that were preventing it from executing my start and stop commands.
If my Delayed Job workers ever go down, Monit will restart them for me.
When I deploy a new release of my app, the restart task asks Monit to stop the workers (rather than stopping them directly as in my first recipe, which would leave Monit trying to restart them in the middle of the deployment), waits for any job in progress to finish, then asks Monit to start them up again.
And I can bounce the workers with a quick cap delayed_job:restart.