asc-csa / ckan-gov-canada-harvester-master

🌾 This repository contains a script that harvests the datasets of one CKAN instance and uploads them to another CKAN instance.

Implement an automated task that runs the harvester

emilinecsa opened this issue

It would be useful to have an automated task that runs the harvester in the production environment. That way, whenever TBS publishes a new dataset, it is added to our data portal with no manual step.

Note: The data pusher is active in the production environment.

(see @nfee006 for more information)

This can be done by setting up a cron task that runs the harvester once a day, at night (e.g. 2 AM). The cron task should call app.py, the Python script that imports the datasets. The script imports the new datasets and ignores the existing ones.
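Because the script runs every night, the "import new, ignore existing" step needs to be safe to repeat. A minimal sketch of that idea follows. It is not the code in app.py: the function name `select_new_datasets` and the sample identifiers are hypothetical, and it assumes datasets can be compared by an identifier.

```python
def select_new_datasets(source_ids, existing_ids):
    """Return the source dataset IDs that the target portal does not have yet.

    Both arguments are iterables of dataset identifiers (hypothetical inputs;
    how app.py obtains them is not shown in this issue).
    """
    existing = set(existing_ids)  # set membership keeps each lookup cheap
    # Preserve the source order so imports happen in a predictable sequence.
    return [dataset_id for dataset_id in source_ids if dataset_id not in existing]


source = ["dataset-a", "dataset-b", "dataset-c"]
already_imported = ["dataset-b"]
print(select_new_datasets(source, already_imported))  # ['dataset-a', 'dataset-c']
```

Running the same selection again after the import returns an empty list, which is what makes a daily schedule harmless on days when TBS has added nothing.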

This web page gives the basic cron information.

I set up the cron task properly in this task. We need to implement something similar in order to call app.py.

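# Debugging aid: every minute, dump the environment that cron jobs run with.
* * * * *   /usr/bin/env > /home/efadmin/tmp/cron.log
# Once a day (@daily fires at midnight; use "0 2 * * *" for 2 AM), activate the CKAN virtualenv and run the harvester.
@daily . /usr/lib/ckan/default/bin/activate && python3 /usr/lib/ckan/default/src/ckan-gov-canada-harvester-master/app.py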
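The harvester entry above does not redirect its output, so one option is to let cron call a small wrapper that records each run. The sketch below is only an illustration under assumptions: the function name `run_and_log` and any log path passed to it are hypothetical, and nothing here is part of the repository.

```python
import subprocess
from datetime import datetime, timezone


def run_and_log(command, log_path):
    """Run `command`, append its output and exit code to `log_path`,
    and return the exit code."""
    started = datetime.now(timezone.utc).isoformat()
    # capture_output collects stdout and stderr instead of letting cron mail them.
    result = subprocess.run(command, capture_output=True, text=True)
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{started} exit={result.returncode}\n")
        log.write(result.stdout)
        log.write(result.stderr)
    return result.returncode
```

Saved under a hypothetical name such as run_harvester.py, the wrapper could be called as `run_and_log(["python3", "/usr/lib/ckan/default/src/ckan-gov-canada-harvester-master/app.py"], "/home/efadmin/tmp/harvester.log")`, where the log path is an assumption. A non-zero exit code in the log then points to the night on which a run failed.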

I configured the cron task on the production server today. It works well, as the log file shows.