Celery with python

Date post: 10-May-2015
Upload: alexandre-gonzalez-rodriguez
Description:
Òscar Vilaplana's talk at the monthly PyGrunn #2 (Feb '12) about using Celery with Python
Transcript
Page 1: Celery with python
Page 2: Celery with python

Celery

Òscar Vilaplana

February 28 2012

Page 3: Celery with python

Outline

self.__dict__

Use task queues

Celery and RabbitMQ

Getting started with RabbitMQ

Getting started with Celery

Periodic tasks

Examples

Page 4: Celery with python

self.__dict__

{
    'name': 'Òscar Vilaplana',
    'origin': 'Catalonia',
    'company': 'Paylogic',
    'tags': ['developer', 'architect', 'geek'],
    'email': '[email protected]',
}

Page 5: Celery with python

Proposal

- Take a slow task.
- Decouple it from your system.
- Call it asynchronously.

Page 6: Celery with python

Separate projects

Separate projects allow us to:

- Divide your system into sections
  - e.g. frontend, backend, mailing, report generator...
- Tackle them individually
- Conquer them, i.e. declare them Done:
  - Clean code
  - Clean interface
  - Unit tested
  - Maintainable

(but this is not only for Celery tasks)

Page 7: Celery with python

Coupled Tasks

In some cases, it may not be possible to decouple a task. Then we can:

- Have some workers in your system's network
  - with access to the code of your system
  - with access to the system's database
- They handle messages from certain queues, e.g. internal.#

Page 8: Celery with python

Candidates

Processes that:

- Need a lot of memory.
- Are slow.
- Depend on external systems.
- Need a limited amount of data to work (easy to decouple).
- Need to be scalable.

Examples:

- Render complex reports.
- Import big files.
- Send e-mails.

Page 9: Celery with python

Example: sending complex emails

Create an independent project: yourappmail

- Generator of complex e-mails.
  - It needs the templates, images...
  - It doesn't need access to your system's database.
- Deploy it on servers of our own, or on Amazon servers
  - We can add/remove servers as we need them
  - On startup:
    - Join the RabbitMQ cluster
    - Start celeryd
- Normal operation: 1 server is enough
- On high load: start as many servers as needed (tps_peak / tps_server)
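The sizing rule in the last bullet reads as peak load divided by per-server capacity, rounded up. A minimal sketch, assuming "tps" is a throughput figure such as tasks per second; the function name and the numbers are illustrative and not from the talk:

```python
import math


def servers_needed(tps_peak, tps_server):
    """Return how many servers are needed to absorb the peak load."""
    if tps_server <= 0:
        raise ValueError("tps_server must be positive")
    # Round up: a fractional server still means one more machine.
    # Never go below 1, since normal operation already needs one server.
    return max(1, math.ceil(tps_peak / tps_server))


# Hypothetical load: 450 tasks/s at peak, 100 tasks/s per server.
print(servers_needed(tps_peak=450, tps_server=100))  # 5
```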

Page 10: Celery with python

yourappmail

A decoupled email generator:

- Has a clean API
- Decoupled from your system's db: it needs to receive all information
  - Customer information
  - Custom data
  - Contents of the email
- Can be deployed to as many servers as we need
  - Scalable
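"Receives all information" can be pictured as a function whose only inputs are its arguments. The sketch below is an assumption about what such an API might look like: the template text, the function name and the field names are all hypothetical, and only the standard library is used.

```python
from string import Template

# Hypothetical template; in yourappmail it would ship with the project itself.
TICKET_TEMPLATE = Template("Dear $name,\n\n$body\n\nOrder reference: $reference\n")


def render_email(customer, custom_data, contents):
    """Build an e-mail body from plain data only; no database lookups."""
    return TICKET_TEMPLATE.substitute(
        name=customer["name"],
        reference=custom_data["reference"],
        body=contents,
    )


message = render_email(
    customer={"name": "Ada"},
    custom_data={"reference": "A-123"},
    contents="Your tickets are attached.",
)
print(message)
```

Because nothing is read from the caller's database, the same function can run on any server that has the templates.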

Page 11: Celery with python

Not for everything

- Task queues are not a magic wand to make things faster
  - They can be used as such (like a cache).
  - But that hides the real problem.

Page 12: Celery with python

Celery

- Asynchronous distributed task queue
- Based on distributed message passing.
- Mostly for real-time queuing
- Can do scheduling too.
- REST: you can query status and results via URLs.
- Written in Python
- Celery: Message Brokers and Result Storage

Page 13: Celery with python

Celery's tasks

- Tasks can be async or sync
- Low latency
- Rate limiting
- Retries
- Each task has a UUID: you can ask for the result back if you know the task UUID.
- RabbitMQ
  - Messaging system
  - Protocol: AMQP
  - Open standard for messaging middleware
  - Written in Erlang
    - Easy to cluster!

Page 14: Celery with python

Install the packages from the RabbitMQ website

- RabbitMQ Server
- Management Plugin (nice HTML interface)
  - rabbitmq-plugins enable rabbitmq_management
- Go to http://localhost:55672/cli/ and download the cli.
- HTML interface at http://localhost:55672/

Page 15: Celery with python

Set up a cluster

rabbit1$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1]}]},{running_nodes,[rabbit@rabbit1]}]
...done.
rabbit2$ rabbitmqctl stop_app
Stopping node rabbit@rabbit2 ...done.
rabbit2$ rabbitmqctl reset
Resetting node rabbit@rabbit2 ...done.
rabbit2$ rabbitmqctl cluster rabbit@rabbit1
Clustering node rabbit@rabbit2 with [rabbit@rabbit1] ...done.
rabbit2$ rabbitmqctl start_app
Starting node rabbit@rabbit2 ...done.

Page 16: Celery with python

Notes

- Automatic configuration
  - Use a .config file to describe the cluster.
- Change the type of the node
  - RAM node
  - Disk node

Page 17: Celery with python

Install Celery

- Just pip install celery

Page 18: Celery with python

Define a task

Example tasks.py

from celery.task import task

@task
def add(x, y):
    print("I received the task to add {} and {}".format(x, y))
    return x + y

Page 19: Celery with python

Configure username, vhost, permissions

$ rabbitmqctl add_user myuser mypassword
$ rabbitmqctl add_vhost myvhost
$ rabbitmqctl set_permissions -p myvhost myuser ".*" ".*" ".*"

Page 20: Celery with python

Configuration file

Write celeryconfig.py

BROKER_HOST = "localhost"
BROKER_PORT = 5672
BROKER_USER = "myuser"  # the user created with rabbitmqctl add_user
BROKER_PASSWORD = "mypassword"
BROKER_VHOST = "myvhost"
CELERY_RESULT_BACKEND = "amqp"
CELERY_IMPORTS = ("tasks", )
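Celery also accepts the same connection details as a single BROKER_URL setting of the form amqp://user:password@host:port/vhost. The sketch below only assembles that string from the values used on this slide; it assumes the user is the one created with rabbitmqctl add_user, and the percent-encoding step is a precaution for credentials that contain special characters.

```python
from urllib.parse import quote

# Same values as in celeryconfig.py.
BROKER_HOST = "localhost"
BROKER_PORT = 5672
BROKER_USER = "myuser"
BROKER_PASSWORD = "mypassword"
BROKER_VHOST = "myvhost"

# Percent-encode each part so characters such as "@" or "/" cannot break the URL.
BROKER_URL = "amqp://{user}:{password}@{host}:{port}/{vhost}".format(
    user=quote(BROKER_USER, safe=""),
    password=quote(BROKER_PASSWORD, safe=""),
    host=BROKER_HOST,
    port=BROKER_PORT,
    vhost=quote(BROKER_VHOST, safe=""),
)
print(BROKER_URL)  # amqp://myuser:mypassword@localhost:5672/myvhost
```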

Page 21: Celery with python

Launch daemon

celeryd -I tasks # import the tasks module

Page 22: Celery with python

Schedule tasks

from tasks import add

# Schedule the task
result = add.delay(1, 2)

value = result.get() # value == 3

Page 23: Celery with python

Schedule tasks by name

Sometimes the tasks module is not available on the clients.

from celery.execute import send_task

# Schedule the task by its registered name; no import of tasks needed
result = send_task("tasks.add", [1, 2])

value = result.get() # value == 3
print(value)

Page 24: Celery with python

Schedule the tasks better: apply_async

task.apply_async has more options:

- countdown=n: the task will run no sooner than n seconds from now.
- eta=datetime: the task will run no earlier than datetime.
- expires=n or expires=datetime: the task will be revoked in n seconds or at datetime
  - It will be marked as REVOKED
  - result.get will raise a TaskRevokedError
- serializer
  - pickle: default, unless CELERY_TASK_SERIALIZER says otherwise.
  - alternative: json, yaml, msgpack
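The choice of serializer limits which argument types survive the trip to the worker. The sketch below uses the standard-library pickle and json modules as stand-ins to show the difference; Celery's own serializer layer may convert more types than plain json does, so treat this as an illustration of the general constraint rather than of Celery's exact behaviour.

```python
import json
import pickle
from datetime import datetime

args = ((1, 2), datetime(2012, 2, 28))

# pickle round-trips Python objects unchanged, tuples and datetimes included.
via_pickle = pickle.loads(pickle.dumps(args))

# Plain json has no datetime type, so encoding these arguments fails.
try:
    json.dumps(args)
    json_accepts_datetime = True
except TypeError:
    json_accepts_datetime = False

# With json, dates must be sent as strings, and tuples come back as lists.
via_json = json.loads(json.dumps([(1, 2), "2012-02-28T00:00:00"]))

print(via_pickle == args)     # True
print(json_accepts_datetime)  # False
print(via_json)               # [[1, 2], '2012-02-28T00:00:00']
```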

Page 25: Celery with python

Result

A result has some useful operations:

- successful: True if task succeeded
- ready: True if the result is ready
- revoke: cancel the task.
- result: if the task has been executed, this contains the result; if it raised an exception, it contains the exception instance
- state:
  - PENDING
  - STARTED
  - RETRY
  - FAILURE
  - SUCCESS

Page 26: Celery with python

TaskSet

Run several tasks at once. The result keeps the order.

from celery.task.sets import TaskSet
from tasks import add

job = TaskSet(tasks=[
    add.subtask((4, 4)),
    add.subtask((8, 8)),
    add.subtask((16, 16)),
    add.subtask((32, 32)),
])
result = job.apply_async()
result.ready() # True once all subtasks have completed
result.successful() # True if all subtasks were successful
values = result.join() # [8, 16, 32, 64]
print(values)

Page 27: Celery with python

TaskSetResult

The TaskSetResult has some interesting properties:

- successful: if all of the subtasks finished successfully (no Exception)
- failed: if any of the subtasks failed.
- waiting: if any of the subtasks is not ready yet.
- ready: if all of the subtasks are ready.
- completed_count: number of completed subtasks.
- revoke: revoke all subtasks.
- iterate: iterate over the return values of the subtasks once they finish (sorted by finish order).
- join: gather the results of the subtasks and return them in a list (sorted by the order in which they were called).
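The difference between finish order (iterate) and call order (join) is easy to see with a small experiment. The sketch below does not use Celery: it swaps in concurrent.futures from the standard library as a stand-in, and the slow_double function and its delays are made up for the demonstration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def slow_double(x, delay):
    """Pretend subtask: wait for `delay` seconds, then return 2 * x."""
    time.sleep(delay)
    return 2 * x


# (argument, delay) pairs; the first job submitted is the slowest.
jobs = [(4, 0.30), (8, 0.05), (16, 0.15)]

with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
    futures = [pool.submit(slow_double, x, delay) for x, delay in jobs]
    # Comparable to iterate: values arrive as the subtasks finish.
    by_finish = [f.result() for f in as_completed(futures)]
    # Comparable to join: values listed in the order the subtasks were called.
    by_call = [f.result() for f in futures]

print(by_finish)
print(by_call)
```

The slowest job was submitted first, so its value comes last in by_finish but first in by_call.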

Page 28: Celery with python

Retrying tasks

If the task fails, you can retry it by calling retry()

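from celery.task import task

@task
def send_twitter_status(oauth, tweet):
    try:
        twitter = Twitter(oauth)
        twitter.update_status(tweet)
    except (Twitter.FailWhaleError, Twitter.LoginError) as exc:
        # Re-queue this task; Celery gives up after max_retries attempts
        send_twitter_status.retry(exc=exc)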

To limit the number of retries set task.max_retries.
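The counting behind max_retries can be sketched in plain Python. This is not how Celery implements it (Celery re-queues the task through the broker rather than looping in-process); the helper, the exception class and the flaky function below are all hypothetical and only show that a limit of n retries allows n + 1 attempts in total.

```python
import time


class TemporaryFailure(Exception):
    """Hypothetical error standing in for FailWhaleError / LoginError."""


def run_with_retries(func, max_retries=3, delay=0.0):
    """Call func; on TemporaryFailure try again, at most max_retries times."""
    attempt = 0
    while True:
        try:
            return func()
        except TemporaryFailure:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            attempt += 1
            time.sleep(delay)


calls = {"count": 0}


def flaky():
    """Fail on the first two calls, succeed on the third."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TemporaryFailure("service unavailable")
    return "posted"


outcome = run_with_retries(flaky, max_retries=3)
print(outcome, calls["count"])  # posted 3
```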

Page 29: Celery with python

Routing

apply_async accepts a routing parameter (routing_key).

- Create some RabbitMQ queues (queue name: binding pattern)

pdf: ticket.#
import_files: import.#

- Schedule the task to the appropriate queue

import_vouchers.apply_async(args=[filename],
                            routing_key="import.vouchers")

generate_ticket.apply_async(args=barcodes,
                            routing_key="ticket.generate")
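The patterns ticket.# and import.# follow AMQP topic-exchange rules: keys are dot-separated words, "*" matches exactly one word and "#" matches zero or more. The matcher below is a from-scratch sketch of those rules for illustration (RabbitMQ does this inside the broker); the function names are hypothetical, and the bindings are the two queues named on this slide.

```python
def topic_matches(pattern, routing_key):
    """Return True if routing_key matches an AMQP-style topic pattern."""

    def match(p, k):
        if not p:
            return not k  # pattern used up: key must be used up too
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may absorb zero or more words of the key
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if head == "*" or head == k[0]:
            return match(rest, k[1:])
        return False

    return match(pattern.split("."), routing_key.split("."))


# Queue name -> binding pattern, as listed on the slide.
BINDINGS = {"pdf": "ticket.#", "import_files": "import.#"}


def queues_for(routing_key):
    """List the queues whose binding pattern matches routing_key."""
    return sorted(q for q, pattern in BINDINGS.items()
                  if topic_matches(pattern, routing_key))


print(queues_for("import.vouchers"))  # ['import_files']
print(queues_for("ticket.generate"))  # ['pdf']
print(queues_for("mail.send"))        # []
```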

Page 30: Celery with python

celerybeat

from celery.schedules import crontab

CELERYBEAT_SCHEDULE = {
    # Executes every Monday morning at 7:30 A.M.
    "every-monday-morning": {
        "task": "tasks.add",
        "schedule": crontab(hour=7, minute=30, day_of_week=1),
        "args": (16, 16),
    },
}
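To check what crontab(hour=7, minute=30, day_of_week=1) means in practice, the next run time can be worked out by hand. The sketch below is plain datetime arithmetic, not celerybeat's scheduler; the function name is hypothetical, and the start date is simply the date of the talk.

```python
from datetime import datetime, timedelta


def next_monday_0730(now):
    """Return the first Monday 07:30 strictly after `now`."""
    # datetime.weekday() counts Monday as 0, whereas crontab's
    # day_of_week counts Sunday as 0 and Monday as 1.
    days_ahead = (0 - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=7, minute=30, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=7)  # already past 07:30 this Monday
    return candidate


# 28 February 2012 was a Tuesday, so the next run is Monday 5 March.
print(next_monday_0730(datetime(2012, 2, 28, 12, 0)))  # 2012-03-05 07:30:00
```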

Page 31: Celery with python

There can be only one celerybeat running

- But we can have two machines that check on each other.

Page 32: Celery with python

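Import a big file:

tasks.py

from celery.task import task

@task
def import_bigfile(server, filename):
    with create_temp_file() as tmp:
        fetch_bigfile(tmp, server, filename)
        # Hypothetical name for the real import routine; the slide reuses
        # import_bigfile here, which would call the task itself.
        run_import(tmp)
        report_result(...) # e.g. send confirmation e-mail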

Page 33: Celery with python

Import big file: Admin interface, server-side

import tasks

def import_bigfile(filename):
    result = tasks.import_bigfile.delay(filename)
    return result.task_id

class ImportBigfile(View):
    def post_ajax(self, request):
        filename = request.get('big_file')
        task_id = import_bigfile(filename)
        return task_id

Page 34: Celery with python

Import big file: Admin interface, client-side

- Post the file asynchronously
- Get the task_id back
- Put some "working..." message.
- Periodically ask Celery if the task is ready and change "working..." into "done!"
- No need to call Paylogic code: just ask Celery directly
- Improvements:
  - Send the username to the task.
  - Have the task call back the Admin interface when it's done.
  - The Backoffice can send an e-mail to the user when the task is done.
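The flow above (start the job, get an id back, poll until ready) can be tried without a broker. The sketch below swaps Celery for concurrent.futures and an in-memory dictionary standing in for the result backend; every name in it, including the file name, is hypothetical.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=2)
tasks_by_id = {}  # stand-in for Celery's result backend


def slow_import(filename):
    """Pretend to import a big file."""
    time.sleep(0.2)
    return "imported " + filename


def start_import(filename):
    """Start the slow job and hand back an id the client can poll with."""
    task_id = str(uuid.uuid4())
    tasks_by_id[task_id] = pool.submit(slow_import, filename)
    return task_id


def status(task_id):
    """Message the page shows for this task."""
    return "done!" if tasks_by_id[task_id].done() else "working..."


task_id = start_import("vouchers.csv")
first = status(task_id)
while status(task_id) != "done!":
    time.sleep(0.05)  # the client would poll on a timer instead
final = status(task_id)
pool.shutdown()
print(first, "->", final)
```

With Celery the lookup by id is done against the result backend, so the polling code does not need access to the application that started the task.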

Page 35: Celery with python

Do a time-consuming task.

from tasks import do_difficult_thing

# ... stuff ...
# I have all data necessary to do the difficult thing
difficult_result = do_difficult_thing.delay(some, values)
# I don't need the result just yet, I can keep myself busy
# ... stuff ...
# Now I really need the result
difficult_value = difficult_result.get()

