Guide
Celery fundamentals explained
A customer clicks “Place order” on your storefront. The HTTP handler must return in under 200 ms, but fulfillment needs to charge the card, reserve inventory, generate a shipping label, and send email — steps that can fail independently and take seconds each. Celery is the de facto distributed task queue for Python: your web app enqueues a message to a broker, separate worker processes pull jobs and execute them, and results land in a backend store if you need them. Unlike FastAPI background tasks (which die when the process restarts) or raw RabbitMQ consumers you wire by hand, Celery gives you retries, routing, periodic schedules, and workflow primitives out of the box. This guide covers architecture and brokers, defining and calling tasks, worker concurrency and prefetch, acknowledgments and idempotency, Celery Beat for cron jobs, canvas chains and chords, monitoring with Flower, a Harbor Commerce order-fulfillment worked example, a task-queue decision table, common pitfalls, and a production checklist alongside our Python fundamentals guide and Redis fundamentals guide.
What Celery is (and what it is not)
Celery is a task execution framework, not a message broker. It sits between your application code and a broker such as Redis or RabbitMQ, serializing Python function calls into messages workers deserialize and run. The moving parts:
- Producer — your Django, Flask, or FastAPI app calls
send_email.delay(order_id). - Broker — holds the queue until a worker is ready (Redis list, RabbitMQ queue).
- Worker — long-running process that imports your task module and executes jobs.
- Result backend (optional) — stores return values so callers can poll
AsyncResult. - Beat scheduler (optional) — separate process that enqueues periodic tasks on a cron-like timetable.
Celery is not a replacement for Kafka event streaming (no durable replay log for analytics), nor for synchronous gRPC RPC. It excels at fire-and-forget side effects, competing consumers on a work queue, and scheduled maintenance — thumbnail generation, report exports, webhook retries, nightly reconciliation.
Installation and minimal app
Install Celery with a broker client. Redis is the simplest starting point for small and medium deployments:
pip install "celery[redis]"
# tasks.py
from celery import Celery
app = Celery("harbor", broker="redis://localhost:6379/0",
backend="redis://localhost:6379/1")
@app.task(bind=True, max_retries=3, default_retry_delay=60)
def charge_and_fulfill(self, order_id: str):
try:
payment = charge_card(order_id)
create_label(order_id, payment)
except TransientPaymentError as exc:
raise self.retry(exc=exc)
# Start a worker
# celery -A tasks worker --loglevel=info --concurrency=4
The @app.task decorator registers the function name and module
path in the message so any worker with the same code can execute it. Keep task
modules importable without side effects — workers import them at startup.
Broker choice
- Redis — fast, simple ops, good for queues under ~100k msgs/day; use separate DB indexes for broker vs result backend.
- RabbitMQ — stronger AMQP routing, publisher confirms, quorum queues; better when you already run Rabbit for other services.
- Amazon SQS — managed, pay-per-message; higher latency but zero broker patching.
Avoid using the result backend for high-volume fire-and-forget tasks — writing every return value to Redis adds write amplification you rarely read.
Calling tasks: delay, apply_async, and routing
task.delay(*args, **kwargs) is shorthand for
apply_async. Use the full form when you need control:
charge_and_fulfill.apply_async(
args=["ord_9f2a"],
queue="payments",
countdown=30, # delay 30 seconds
expires=3600, # drop if not started within 1 hour
priority=5,
)
Define multiple queues in config and route heavy jobs away from quick notifications:
app.conf.task_routes = {
"tasks.charge_and_fulfill": {"queue": "payments"},
"tasks.send_receipt_email": {"queue": "email"},
}
Run specialized workers per queue:
celery -A tasks worker -Q payments --concurrency=2 so a stuck
PDF render cannot starve payment retries.
Workers, concurrency, and prefetch
A Celery worker process can run tasks with different concurrency models:
- prefork (default) — forked child processes; best for CPU-bound Python; sidesteps the GIL.
- threads — lighter for I/O-bound HTTP calls; watch thread safety of shared clients.
- gevent / eventlet — greenlets for massive I/O concurrency; requires monkey-patching and compatible libraries.
- solo — one task at a time; useful for debugging.
Prefetch controls how many messages a worker reserves before
finishing current jobs. High prefetch on long tasks causes unfair distribution
— one worker hoards messages while others sit idle. Rule of thumb:
worker_prefetch_multiplier=1 for tasks longer than a few seconds;
default 4 is fine for sub-second email sends.
Scale horizontally by adding worker containers, not inflating
--concurrency on a single machine until CPU is saturated.
Pair with
Docker Compose
or
Kubernetes
Deployments where each pod runs one worker command.
Acknowledgments, retries, and idempotency
Celery acknowledges broker messages when a task finishes (or fails permanently). If a worker is killed mid-task, the message can be redelivered — tasks must be idempotent or guarded with deduplication keys.
acks_late=True— ack after success, not at start; safer but requires idempotency on crash.reject_on_worker_lost=True— requeue if worker disappears.max_retries+self.retry(exc=...)— exponential backoff for transient failures.task_time_limit/task_soft_time_limit— kill runaway jobs.
Store a processing_status column or Redis SETNX lock keyed by
order_id before charging cards. On retry, short-circuit if already
charged. Failed tasks after max retries should land in a
dead-letter queue
or alert channel — Celery’s built-in task_reject_on_failure
and broker DLX wiring vary by backend; many teams log to Sentry and move poison
messages manually.
Celery Beat: periodic and cron schedules
Celery Beat is a scheduler process that enqueues tasks on an interval or crontab. Run exactly one Beat instance per schedule namespace (or use a distributed lock) to avoid duplicate cron fires:
from celery.schedules import crontab
app.conf.beat_schedule = {
"nightly-inventory-reconcile": {
"task": "tasks.reconcile_inventory",
"schedule": crontab(hour=2, minute=15),
},
"poll-carrier-tracking": {
"task": "tasks.refresh_shipment_status",
"schedule": 300.0, # every 5 minutes
},
}
Beat stores last-run times in a local shelve file by default — ephemeral containers lose state on restart and may re-fire tasks. Production setups use django-celery-beat (database scheduler) or Redis-backed redbeat so schedule state survives deploys.
Canvas: chains, groups, and chords
Celery’s canvas primitives compose multi-step workflows:
- chain — sequential:
(validate.s(order_id) | charge.s() | ship.s())() - group — parallel fan-out:
group(send_email.s(id) for id in ids) - chord — parallel then callback:
chord(group(...), finalize.s())
Canvas results require the result backend. Chords are notoriously fragile on Redis (lost callback headers under broker restarts) — prefer RabbitMQ or orchestrate sagas explicitly in code with status rows for mission-critical pipelines. For most order flows, a single task with internal steps and structured logging is simpler than a deep chain.
Monitoring, configuration, and deployment
Operate Celery like any production service:
- Flower — web UI for worker heartbeats, active tasks, revoke; protect with auth in production.
- Structured logs — include
task_idand business keys in every log line; correlate with HTTP request IDs. - Metrics — export queue depth, task runtime histograms, and failure rates to Prometheus.
- Health checks — workers respond to
celery inspect ping; orchestrators should restart unresponsive pods. - Config — centralize in
celeryconfig.pyor env vars; never hardcode broker URLs.
Deploy workers independently of web processes. Rolling out new task code requires draining or versioning: old workers cannot deserialize new task signatures. Blue/green worker pools or feature flags inside tasks reduce mixed-version crashes during deploys.
Harbor Commerce order fulfillment: worked example
Harbor Commerce’s checkout API returns 202 Accepted with an
order ID while Celery handles side effects. June 2026 architecture (illustrative):
- POST /orders — FastAPI validates cart, writes
ordersrow aspending, enqueuesfulfill_order.delay(order_id), returns JSON immediately. - Broker — Redis DB 0; separate
paymentsandnotificationsqueues with routed tasks. - fulfill_order — idempotent lock on
order_id; calls Stripe; on success updates row topaid, enqueuescreate_labelandsend_receiptin parallel via group. - Retries — payment task
max_retries=5with exponential backoff; inventory release on permanent failure. - Beat —
reconcile_stuck_ordersevery 15 minutes findspendingolder than 1 hour and alerts ops. - Observability — OpenTelemetry trace context injected into task headers; Flower read-only behind SSO.
- Verdict — p95 checkout latency 85 ms; fulfillment p95 4.2 s async; zero duplicate charges after idempotency keys shipped in Q1.
The pattern decouples user-perceived speed from partner API slowness — the same reason teams pair Celery with transactional outbox when the enqueue must survive database commit.
Task-queue decision table
| Need | Best fit | Why |
|---|---|---|
| Python background jobs with retries and cron | Celery + Redis/RabbitMQ | Mature ecosystem, routing, Beat, canvas workflows. |
| Minimal Redis-only queue, tiny codebase | RQ (Redis Queue) | Simpler than Celery; fewer features, easier onboarding. |
| Alternative Python worker with sane defaults | Dramatiq | Smaller API surface; good middleware story. |
| Work inside one FastAPI process, best-effort | FastAPI BackgroundTasks | No broker ops; tasks lost on restart; fine for non-critical hooks. |
| Polyglot consumers, complex routing | RabbitMQ directly or via Celery | AMQP exchanges; Celery adds Python ergonomics on top. |
| High-volume event log, replay, stream processing | Kafka | Durable partitioned log; not a task queue replacement. |
| Serverless, no workers to patch | SQS + Lambda | Ops-free at cost of cold starts and vendor coupling. |
| Scheduled k8s batch without app code | CronJob | Shell scripts and ETL; not for per-user request fan-out. |
Common pitfalls
- Passing non-serializable objects — ORM models and open file handles break JSON serialization; pass IDs only.
- Import cycles — tasks defined in views.py that import views cause worker startup failures; isolate tasks module.
- Running Beat twice — duplicate cron enqueues; use one Beat or distributed scheduler lock.
- Ignoring idempotency — late acks and retries double-charge customers without dedup keys.
- Monolithic concurrency — one queue for 30s PDF jobs and 200 ms emails starves the fast path.
- Result backend on everything — Redis fills with stale AsyncResult blobs nobody polls.
- Chords on Redis in production — lost chord callbacks under restarts; use RabbitMQ or avoid chords.
- Deploying code before workers — web enqueues new task names old workers cannot find; deploy workers first or use versioned task names.
Production checklist
- Pin Celery and broker client versions; test upgrades in staging with real queue depth.
- Separate broker and result backend Redis databases (or disable results for fire-and-forget).
- Configure
task_routesfor at least payments vs notifications vs heavy batch. - Set
acks_late=Truewith idempotent tasks for anything financial. - Define
task_time_limitandmax_retrieson every external API call. - Run Beat with persistent schedule store (database or redbeat), not default shelve.
- Export queue depth and task failure metrics; alert on DLQ growth.
- Protect Flower or replace with CLI inspect + Grafana dashboards.
- Document worker deploy order relative to web deploys.
- Load-test broker under 2× peak enqueue rate before Black Friday.
Key takeaways
- Celery executes Python functions asynchronously via a broker — it is not the broker itself.
- Workers scale horizontally; tune concurrency model and prefetch per task duration.
- Retries and late acks demand idempotent task bodies and deduplication keys.
- Celery Beat handles cron; persist schedule state in production.
- Choose RQ or BackgroundTasks for trivial queues; Kafka when you need event replay, not Celery.
Related reading
- RabbitMQ explained — AMQP broker Celery can use for routing and durability
- Redis fundamentals explained — the most common Celery broker and result backend
- FastAPI fundamentals explained — HTTP layer that enqueues Celery tasks after validation
- Message queues explained — delivery guarantees and broker comparison overview