OCHA-DAP / hdx-scraper-whosonfirst

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Collector for Who's On First Datasets

Build Status Coverage Status

This script connects to the Who's On First site and extracts data links and metadata creating one dataset per country in HDX. It makes one read from the data inventory and around 250 read/writes (API calls) to HDX in a one-hour period. It creates one temporary file and is run every month.

Usage

python run.py

For the script to run, you will need to have a file called .hdx_configuration.yaml in your home directory containing your HDX key e.g.

hdx_key: "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"
hdx_read_only: false
hdx_site: prod

You will also need to supply the universal .useragents.yaml file in your home directory as specified in the parameter user_agent_config_yaml passed to facade in run.py. The collector reads the key hdx-scraper-whosonfirst as specified in the parameter user_agent_lookup.

Alternatively, you can set up environment variables: USER_AGENT, HDX_KEY, HDX_SITE, TEMP_DIR, LOG_FILE_ONLY

About

License:MIT License


Languages

Language:Python 100.0%