madelyneriksen / dramatiq-batch-jobs-python

Companion Repo for a Blog Post

Batch Processing in Python using Dramatiq

This repo is a companion to a blog post on my website about a simple batch processing technique in Python using Dramatiq, Docker, and Redis. The technique resembles a simple MapReduce style of batch processing.

Run The Project

You will need the following installed:

  • GNU Make (as a task runner)
  • Docker
  • Docker Compose

Building the container is as simple as running make:

# That's all folks.
make

Then you can launch the stack (again with make):

make compose

Alternatively, you can run the equivalent Docker commands directly.

About the Project

The script in app.py downloads Moby Dick and tries to extract all person entities from the text using spaCy.

Because Dramatiq and Redis serve as the backend for processing the documents, the work can be scaled out horizontally by adding workers, as far as required, which can greatly reduce execution time.
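
To make the MapReduce-style flow concrete, here is a minimal sketch that uses only the Python standard library. It is not the code in app.py: a thread pool stands in for Dramatiq workers, and a small regular expression stands in for spaCy's person-entity recognition. The names `split_into_chunks`, `extract_names`, `merge_counts`, `count_names`, and `SAMPLE` are all hypothetical and chosen for illustration.

```python
import re
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Assumption: a fixed list of names stands in for spaCy's PERSON entities.
# It is only here to show how data moves through the pipeline.
NAME_PATTERN = re.compile(r"\b(Ishmael|Ahab|Queequeg|Starbuck)\b")

# Assumption: a tiny hand-written sample replaces the downloaded novel.
SAMPLE = """Call me Ishmael.
Captain Ahab stood on the deck.
Queequeg sharpened his harpoon.
Ahab spoke to Starbuck."""


def split_into_chunks(text, lines_per_chunk=2):
    """Split the text into small batches that can be processed independently."""
    lines = [line for line in text.splitlines() if line.strip()]
    return [
        "\n".join(lines[i:i + lines_per_chunk])
        for i in range(0, len(lines), lines_per_chunk)
    ]


def extract_names(chunk):
    """Map step: count name mentions in a single chunk."""
    return Counter(NAME_PATTERN.findall(chunk))


def merge_counts(partial_counts):
    """Reduce step: combine the per-chunk counts into one total."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


def count_names(text, workers=4):
    """Fan the chunks out to a pool of workers, then merge their results."""
    chunks = split_into_chunks(text)
    # Assumption: threads stand in for separate Dramatiq worker processes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(extract_names, chunks))
    return merge_counts(partials)


if __name__ == "__main__":
    for name, count in count_names(SAMPLE).most_common():
        print(name, count)
```

Each chunk is processed without reference to any other chunk, which is what lets the map step be spread across as many workers as are available. Only the final merge needs to see every partial result.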

License

MIT (c) 2019 Madelyn Eriksen
