How does the Catholic media portal kath.ch work on a daily basis? A quantitative analysis.
Windows
python -m venv env
env\Scripts\activate
pip install -r requirements.txt
Mac/Linux
python -m venv env
source env/bin/activate
pip install -r requirements.txt
Run src/1_daily.py as often as you wish (for example, daily). It pushes new articles to the git repo, so you need to set up SSH keys on your server.
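The commit-and-push step could look roughly like the sketch below. It is not taken from src/1_daily.py: the function names `git_commands` and `push_new_articles`, the commit message, and the assumption that the script simply stages everything in the repo are all hypothetical. It relies on `git` being on the PATH and on SSH keys already being configured for the remote.

```python
import subprocess
from datetime import date


def git_commands(message):
    """Return the git invocations used to publish newly downloaded files."""
    return [
        ["git", "add", "--all"],
        ["git", "commit", "-m", message],
        ["git", "push"],
    ]


def push_new_articles(repo_dir, today=None):
    """Commit and push everything new in repo_dir; skip if nothing changed."""
    today = today or date.today()
    # `git status --porcelain` prints nothing when the working tree is clean.
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    if not status.stdout.strip():
        return False  # nothing new to commit
    # Hypothetical commit message; the real script may use a different one.
    for cmd in git_commands(f"Add articles for {today.isoformat()}"):
        subprocess.run(cmd, cwd=repo_dir, check=True)
    return True
```

Checking the status first avoids a failing `git commit` on days when no new articles were found, which matters if the script runs unattended.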
Run src/2_extract_files.ipynb. This generates JSON files from the HTML articles.
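As an illustration of the HTML-to-JSON step, here is a minimal sketch using only the Python standard library. It is not the notebook's actual code: the class `ArticleParser`, the function `html_to_record`, and the output fields `title` and `text` are assumptions, and real kath.ch pages will likely need more specific selectors.

```python
import json
from html.parser import HTMLParser


class ArticleParser(HTMLParser):
    """Collects the <title> text and the text of all <p> elements."""

    def __init__(self):
        super().__init__()
        self._current = None  # tag currently being captured, if any
        self.title = ""
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._current = "title"
        elif tag == "p":
            self._current = "p"
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        # Inline tags such as <b> inside a <p> do not end the capture.
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "p":
            self.paragraphs[-1] += data


def html_to_record(html):
    """Turn one HTML document into a JSON-serialisable dict (assumed fields)."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "text": "\n".join(p.strip() for p in parser.paragraphs if p.strip()),
    }


if __name__ == "__main__":
    sample = "<html><head><title>Test</title></head><body><p>Hello <b>world</b>.</p></body></html>"
    # ensure_ascii=False keeps umlauts readable in the JSON output.
    print(json.dumps(html_to_record(sample), ensure_ascii=False))
```

Writing one JSON file per article keeps the later analysis independent of the HTML layout.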
See 3_analyse.ipynb for the analysis.