Deferred Tasks and Scheduled Jobs with Celery 3.1, Django 1.7 and Redis

Setting up Celery with Django can be a pain, but it doesn't have to be. In this video, learn what it takes to set up Celery for deferred tasks and as your cron replacement. We will use Celery 3.1 and Django 1.7, both of which introduce changes you need to be aware of.

resources

cli

pip install celery[redis]
pip install django-celery

celery -A demo worker -l debug
celery -A demo beat -l debug --max-interval=10

celery.py

from __future__ import absolute_import

import os
import django

from celery import Celery
from django.conf import settings

# Point Celery at the project settings before bootstrapping Django.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'demo.settings')

# Required on Django 1.7+ to populate the app registry; not needed on 1.6 and below.
django.setup()

app = Celery('demo')

# Read the Celery configuration from the Django settings module.
app.config_from_object('django.conf:settings')

# Look for a tasks.py in every app listed in INSTALLED_APPS.
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)

settings.py

INSTALLED_APPS = (
    # ... your existing apps ...
    'djcelery',
)

BROKER_URL = 'redis://127.0.0.1:6379/0'
BROKER_TRANSPORT = 'redis'
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'

tasks.py

from demo.celery import app

@app.task
def hello_world():
    print('Hello World')

views.py

from django.shortcuts import render
from django.views.generic import TemplateView

from .tasks import hello_world

class IndexView(TemplateView):
    template_name = 'home/index.html'

    def get_context_data(self, **kwargs):
        context = super(IndexView, self).get_context_data(**kwargs)
        hello_world.delay()
        return context

In this video, we're going to go over setting up Celery. Celery is used for running distributed commands and for deferred processing of tasks that don't need to happen in the normal request/response life-cycle. An example might be sending an email, which doesn't necessarily need to happen during the request and the response. Another example: if somebody's uploading images, you would have the original image saved to the model, and then you might have a bunch of different versions of the image at different sizes saved off as related objects. You could have that generation done in a task outside of the request/response process so that your application keeps running smoothly and quickly. And because those tasks can be resource intensive, you can have a Celery worker running on a beefier computer than, say, your web server.
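
As a quick illustration of the email example, here's a hedged sketch of what such a task could look like; it assumes the same app object created in celery.py above, and send_welcome_email and the addresses are hypothetical names for illustration.

from django.core.mail import send_mail

from demo.celery import app

@app.task
def send_welcome_email(address):
    # Runs in the worker, outside the request/response cycle.
    send_mail('Welcome!', 'Thanks for signing up.',
              'noreply@example.com', [address])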

To get started, let's install Celery with a pip install celery. It goes through a normal pip install and installs all the dependencies. Today, we're actually going to set up Celery with Redis as our queue instead of RabbitMQ, which is Celery's default. To do that, we can do a pip install celery[redis], with redis in brackets, and it'll install the dependencies for using Redis with Celery. Now that we have our dependencies installed, we can do our settings. It's very simple: go into our settings.py, and at the very bottom we'll add our broker URL, which is redis://127.0.0.1, using the standard Redis port of 6379 and the first database, zero. We also set our broker transport to Redis. A broker is the location where all of the tasks that need to be executed are stored in a queue. Every time we say, "Hey, I want to run this command or this set of code," the task actually gets sent to Redis and sits there until a worker can come and pop it off the queue.
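
For reference, here are those two settings lines again with the URL parts annotated; the host, port, and database number are the Redis defaults used in the video.

BROKER_URL = 'redis://127.0.0.1:6379/0'  # redis://host:port/database
BROKER_TRANSPORT = 'redis'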

So with that done, let's go ahead and create our celery.py. In it, I'm going to do the imports, and we'll see how they work with each other here in a second. We're going to set an environment variable default for DJANGO_SETTINGS_MODULE, and that's going to be demo.settings: demo being our project name, and .settings referencing the settings file in our demo project. This is important, because we're bootstrapping Django so that we can use Django. The next thing we do is django.setup(). This is very important in Django 1.7 and up because of the new app registry system; in 1.6 and below you don't need to run this line. Next, we need to actually instantiate our Celery object so that we have an app running on Celery. Celery runs as its own application, a separate process that bootstraps Django inside of it. Here we are creating a new Celery app, naming it demo, and assigning it to the app variable by convention. Next, we call app.config_from_object and tell it to use django.conf:settings. We use our project settings as Celery's settings file, instead of a separate file, so all of our configuration can live in one place. The final thing we do is call autodiscover_tasks: it looks in every app in INSTALLED_APPS for a tasks.py file containing tasks we need to be able to run. And that's it; that's all we need to do to set up Celery.

To get an idea of how to use Celery tasks, let's go ahead and write one. In home/tasks.py we import our app, the one we just created, from demo.celery. Then we use app.task as a decorator on a function. We name our function hello_world, and all it does is print out "Hello World". At the end of the day, this is a function, and we can call it like a normal function, or we can call it as a Celery task.

To demonstrate this, we open up the view in our home app and import hello_world from .tasks. Then, in get_context_data, we just call hello_world() as a plain function call. If we run our server, load up localhost:9000, and go back, we see Hello World written out to the console. If we go back to our file and change the call to hello_world.delay(), that says, "Hey, we want to run this function, but we want to run it using Celery." Now when we refresh the page and look at the console output, we don't see Hello World anymore, because the execution has been pushed off into our Redis queue for a worker to pick up.

So with that in mind, let's go ahead and start up our worker. We use the celery command, tell it we want the project demo, which is also our Django project (notice we're in the project root of our folder), and that we want to start a worker. A worker is what actually pops things off of the queue and executes them. Finally, we set a log level of debug so that we get a lot of extra output and can see a little more of what's going on. We start this, and if you look at the very bottom, third line from the bottom, you'll see Hello World printed out. That means it actually popped the task we told it to run off of our queue and ran it. If we jump back into our browser, refresh the page, and come back to our output again, now you'll see two Hello Worlds printed out. That shows our tasks are executing. They're ready to do any kind of deferred task you want, so you can really speed up your website: find the things that can run outside of your normal request/response cycle and have them execute as delayed tasks, so your users are happier with the speed of your site.
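
To make the difference between the two call styles concrete, here is a minimal sketch. The countdown option is standard Celery, shown only as an extra; it isn't used in the video.

from home.tasks import hello_world

hello_world()                           # plain call: runs synchronously, in-process
hello_world.delay()                     # queued to Redis for a worker to pick up
hello_world.apply_async(countdown=10)   # like delay(), but runs ~10 seconds later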

The next thing we want to cover is periodic tasks. Sometimes cron can be annoying to configure and reconfigure all the time; it's easier to set something up through a web interface and have it execute on schedule than to fiddle with getting the same thing working in cron. To do that, Celery has a process called celerybeat. At a given interval, it polls a celerybeat schedule location that lists all of the things you want to run and when you want them to run. In our case, that location is going to be our Django project. So with that in mind, we need to install django-celery. This used to be a required package to even get Celery working with Django, but since Celery 3.1 a lot of the Django integration is built in, so it's no longer required; now it's just a complement if you want other specific things. In this case, we want to configure celerybeat through our Django database so that we can go into the admin and set things up.

With that installed, let's do a couple of settings real quick. We add djcelery to our INSTALLED_APPS, then jump to the bottom where our Celery configuration lives and set our celerybeat scheduler to djcelery.schedulers.DatabaseScheduler. This tells celerybeat to access our database, pull out all the tasks we need to run, and send them to our queue at the specified times. Now that we've made configuration changes, we need to restart our Celery process. It's good to note that whenever you do a deployment and you change code, you need to restart your Celery workers. Preferably, stop your Celery workers before you start the code deploy, do your deploy, and then restart them. This won't have any effect on tasks not running, because they'll just continue to queue up, so as soon as you start the workers back up, they'll start churning through the queue. With everything restarted, we can run our migrate management command, since we now have djcelery in our INSTALLED_APPS and it needs its tables.
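
If you'd rather create the schedule in code than click through the admin, a minimal sketch using the djcelery models might look like the following; it mirrors the crontab and periodic task set up in the admin below.

from djcelery.models import CrontabSchedule, PeriodicTask

# Run at minute 31 of every hour, like the crontab created in the admin.
crontab, _ = CrontabSchedule.objects.get_or_create(minute='31', hour='*')
PeriodicTask.objects.get_or_create(
    name='Hello',
    task='home.tasks.hello_world',
    crontab=crontab,
)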

Now we actually want to start our celerybeat process. We do it almost exactly the same as the worker: celery with the project demo, except this time we want the beat process instead of a worker. We set the log level to debug, and finally we set a max interval of ten seconds. I'm doing that for demo purposes, so that every ten seconds it actually polls the database and looks for new things to throw onto our queue. We start that up, and you'll see a bunch of queries where it's querying our database for djcelery periodic tasks. With that in mind, let's go into our admin section and create a new crontab. We'll set it to run every hour at minute 31, then go back and create a new periodic task. We give it the task location home.tasks.hello_world, since that's its Python path in our project, name it Hello, and set it to our new crontab, and with that, it's ready to run. Jump back over to our celerybeat output and sit here for a few seconds: the last poll was at 8:30 and 54 seconds, which means the next one will be at 8:31 and 4 seconds. There we go. It pulsed, and you'll see a whole bunch of other stuff it did. Basically, celerybeat has said, "Hey, there's actually something scheduled for this time; we need to add this process to our task queue." And if we jump over to the output for our Celery worker, we see that at 8:31 it ran Hello World. And that's it. That's all you need to do to have periodic tasks run in a cron-like manner, and that's all you need to do to set up tasks to run in the background. One thing I did not go over, but which is fairly easy to set up if you dive through the documentation, is setting a results back-end in our database, which gives you a few more options for seeing what is happening. I've not run into a lot of instances where I needed it, so I didn't show it, for the sake of time and because, from talking to other people, many don't actually need it: they use Celery to send a lot of emails or update specific data, and they'll get notified if it doesn't work, so they never go in and check the worker results.
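
If you do want the database results backend mentioned above, a minimal sketch, per the django-celery documentation, is the setting below; you would run migrate again afterwards so its result tables exist.

CELERY_RESULT_BACKEND = 'djcelery.backends.database:DatabaseBackend'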
