Newstory Scraper
A framework built with Docker, Kafka, and MSSQL for scraping profiles and tags from Instagram.
Quick Start
git clone repo
cd repo
make up-d
docker-compose exec worker newstory produce newstory.tasks.scrape.tag fun {\"tag\":\"fun\"}
docker-compose logs -f worker
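To queue several tags at once, the produce command above can be wrapped in a small loop. This is a minimal sketch, not part of the project: the helper names `tag_payload` and `enqueue_tags` and the `DRY_RUN` switch are hypothetical. It assumes that the positional argument after the task name mirrors the tag, as in the example above, and that tag names contain no quotes or backslashes.

```shell
# Build the JSON payload for one tag, e.g. {"tag":"fun"}.
# Assumption: tag names contain no quotes or backslashes.
tag_payload() {
    printf '{"tag":"%s"}' "$1"
}

# Queue one scrape task per tag with the Quick Start command.
# Set DRY_RUN=1 to print the commands instead of running them.
enqueue_tags() {
    for tag in "$@"; do
        payload="$(tag_payload "$tag")"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            printf '%s\n' "docker-compose exec worker newstory produce newstory.tasks.scrape.tag $tag $payload"
        else
            docker-compose exec worker newstory produce newstory.tasks.scrape.tag "$tag" "$payload"
        fi
    done
}

# Usage: enqueue_tags fun travel food
```

Running with DRY_RUN=1 first is a cheap way to check the generated payloads before any task reaches the worker.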
Installation
git clone https://github.com/mattkohl/docker-flask-celery-redis
Local
pip install -r requirements.txt
python entrypoint.py
Build & Launch
docker-compose up -d --build
This exposes the Flask application's endpoints on port 5001 and a Flower server for monitoring workers on port 5555.
To add more workers:
docker-compose up -d --scale worker=5 --no-recreate
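After scaling, it can help to confirm how many worker containers Compose actually lists. A minimal sketch, assuming the service is named worker as in the command above; `count_ids` is a hypothetical helper that counts the non-empty lines it reads, and `docker-compose ps -q` prints one container ID per line.

```shell
# Count non-empty lines on stdin (one container ID per line).
# grep -c exits non-zero when the count is 0, so "|| true" keeps
# the helper usable under "set -e".
count_ids() {
    grep -c . || true
}

# Usage: docker-compose ps -q worker | count_ids
```

Depending on the Compose version, the listing may include stopped containers, so treat the number as a quick check and not as a health probe.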
To shut down:
docker-compose down
To change the endpoints, edit api/app.py.
To change the tasks, edit queue/tasks.py.
Adapted from https://github.com/itsrifat/flask-celery-docker-scale
Troubleshooting
Keycloak fails to load with mixed-content errors behind a reverse proxy: see https://keycloak.discourse.group/t/keycloak-in-docker-behind-reverse-proxy/1195/13