Following on yesterday’s post about Virtualenv Tips, I will be talking about celery tips. Yesterday I talked about how to run celery with upstart easily, and today I’ll be expanding on that below as well as talking about how to set it up using supervisord.
Note: You may also be interested in the Big list of django tips I wrote back in 2008, which still has a lot of good information.
When you run celery in production, you should be using a queue on the backend. However, when you’re running celery in development, it’s nice to execute the code paths without actually needing a queue. This is where the CELERY_ALWAYS_EAGER setting comes in handy. It makes celery run each task synchronously in the calling process instead of sending it to a queue, so your task code paths still get exercised.
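As a rough sketch, a development settings file might turn this on as shown below. Tying the value to DEBUG is my own assumption for illustration, and CELERY_EAGER_PROPAGATES_EXCEPTIONS is an optional companion setting, not something you need for eager mode to work.

```python
# settings.py (development): a minimal sketch, not a complete settings file.
DEBUG = True  # assumption: eager mode follows DEBUG, purely for illustration

# Run tasks synchronously in the calling process instead of sending them to a queue.
CELERY_ALWAYS_EAGER = DEBUG

# Optional: re-raise exceptions from eagerly executed tasks in the caller,
# so failures show up right away in development.
CELERY_EAGER_PROPAGATES_EXCEPTIONS = DEBUG
```

Your production settings would leave CELERY_ALWAYS_EAGER off so that tasks go through the real queue.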
I talk about this and more in my djangocon talk.
On ReadTheDocs I would run into problems with celery tasks never returning. Luckily, celery has a way to handle this. The CELERYD_TASK_TIME_LIMIT setting lets you set the number of seconds that a task can run before it is killed. This makes sure that a runaway task won’t take down all your backend processing.
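For example, a time limit could be set like this. The 15-minute value is an arbitrary number picked for the sketch, not a recommendation; choose something comfortably longer than your slowest legitimate task.

```python
# settings.py: a minimal sketch of a hard task time limit.
# Assumption: 15 minutes is an arbitrary example value.
CELERYD_TASK_TIME_LIMIT = 15 * 60  # seconds a task may run before it is killed
```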
This allows you to use another language to put a message that looks like a celery task in the queue, and it should just work.
When you run celery, the number of worker processes defaults to the number of cores the machine has. If you are running multiple queue workers on the same machine, it is a good idea to use fewer. You can set this with the CELERYD_CONCURRENCY setting, or by passing -c<num> on the command line.
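One way to pick a value, sketched below, is to split the cores between the daemons sharing the machine. The DAEMONS_ON_THIS_MACHINE name and the even split are assumptions for illustration; a fixed number works just as well.

```python
# settings.py: a minimal sketch of capping worker concurrency.
import multiprocessing

# Hypothetical: how many celery daemons share this machine.
DAEMONS_ON_THIS_MACHINE = 2

# Split the cores evenly between the daemons, but never go below one worker.
CELERYD_CONCURRENCY = max(1, multiprocessing.cpu_count() // DAEMONS_ON_THIS_MACHINE)
```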
At work we run a bunch of different sites on multiple databases. When we were figuring out how to deploy celery, we needed a good way to make sure that celeryd was always running, and we needed multiple celery daemons, one for each of our databases. We have written our tasks to run against multiple sites on the same database in order to reduce the number of daemons we have to run.
Celery ships with a couple of daemon configurations out of the box: support for init.d style init scripts, and support for supervisord. We first looked at the init.d approach, but there didn’t seem to be a good way to have it run multiple settings files without creating multiple scripts, which seemed hacky. So we went with supervisord for the task. Below is our configuration, if you are curious.
By default, the conf file is in the top-level /etc/ directory. We kept it this way, but I kind of wish it was in its own directory. This is basically the exact script that celery ships with:
[unix_http_server]
file=/tmp/supervisor.sock ; path to your socket file

[supervisord]
logfile=/var/log/supervisord/supervisord.log ; supervisord log file
logfile_maxbytes=50MB ; maximum size of logfile before rotation
logfile_backups=10 ; number of backed up logfiles
loglevel=info ; info, debug, warn, trace
pidfile=/var/run/supervisord.pid ; pidfile location
nodaemon=false ; run supervisord as a daemon
minfds=1024 ; number of startup file descriptors
minprocs=200 ; number of process descriptors
user=root ; default user
childlogdir=/var/log/supervisord/ ; where child log files will live

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use the unix:// scheme for a unix socket

[include]
files = supervisord/celeryd.conf
Then we created a supervisord directory, which we included in the above file (in the last line), that contains the celery-specific configuration. On this machine the only thing that supervisord is watching is celery, so that has kept our configuration simple.
Inside our celeryd-specific configuration we went with mostly stock options, except for how we set DJANGO_SETTINGS_MODULE. Each program gets its own environment, so that each celery daemon runs against the correct database.
[program:celery-cms]
environment = PYTHONPATH='/home/code',DJANGO_SETTINGS_MODULE='ljworld.standard'
command=/home/code/django/bin/django-admin.py celeryd --loglevel DEBUG -c2
user=nobody
numprocs=1
stdout_logfile=/var/log/celery/cms_supervisord.log
stderr_logfile=/var/log/celery/cms_supervisord.err
autostart=true
autorestart=true
startsecs=10

[program:celery-weeklies]
environment = PYTHONPATH='/home/code',DJANGO_SETTINGS_MODULE='desotoexplorer.settings'
command=/home/code/django/bin/django-admin.py celeryd --loglevel DEBUG -c2
user=nobody
numprocs=1
stdout_logfile=/var/log/celery/weeklies_supervisord.log
stderr_logfile=/var/log/celery/weeklies_supervisord.err
autostart=true
autorestart=true
startsecs=10
The really nice part about using supervisord is that our fabric script for deploying changes to celery is just deploying the code and then running supervisorctl restart celery-cms.
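To give a feel for how little there is to it, here is a sketch of that deploy flow using only the standard library rather than Fabric. It is not our actual fabric script: the restart_command and deploy names, and the deploy_code callback, are hypothetical stand-ins, and it assumes supervisorctl is on the PATH of the machine it runs on.

```python
import subprocess


def restart_command(program):
    """Build the supervisorctl command that restarts one supervised program."""
    return ["supervisorctl", "restart", program]


def deploy(program, deploy_code, runner=subprocess.check_call):
    """Push the code first, then restart the matching celery daemon.

    deploy_code is a hypothetical callable that gets the new code onto the box.
    runner executes the command; it is injectable so the flow can be dry-run.
    """
    deploy_code()
    runner(restart_command(program))
```

Calling deploy("celery-cms", deploy_code=push_code), with push_code being whatever function ships your code, would then run supervisorctl restart celery-cms once the code is in place.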
I hope today’s post was useful, and I’m again curious for any other awesome celery tips!