thinkingmachines / project-cchain

The Project CCHAIN dataset is a validated, open-sourced linked dataset measuring 20 years (2003-2022) of health, climate, environment, and socioeconomic variables at the barangay (village) level across 12 Philippine cities.

Home Page: https://thinkingmachines.github.io/project-cchain/


Lacuna fund climate health

Python Code style: black



📜 Description

The Lacuna Fund Climate x Health project aims to develop a dataset spanning 20 years that links health data to climate, environmental conditions, and socioeconomic vulnerabilities, along with a baseline ML model, for 12 target cities in the Philippines.

We aim to open source the linked dataset and the baseline ML model so that more research can be done on the impact of environmental and societal conditions on health, and so that better policies can be created for communities.

🗄 File Organization

Data Directory

  1. 01-admin-bounds - official administrative boundaries for the target areas
  2. 02-raw - subdivided further per partner and data source
  3. 03-processed - subdivided further per partner and data source
  4. 04-output - final tables that would be used for the linked dataset
  5. 05-gis - map plots
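
As a minimal sketch, the layout above could be checked or scaffolded with a few lines of Python. The `data` root folder name and the `ensure_data_dirs` helper are assumptions made for illustration; the repository may name its data root differently or create these folders some other way.

```python
from pathlib import Path

# Subdirectory names taken from the list above.
DATA_SUBDIRS = [
    "01-admin-bounds",
    "02-raw",
    "03-processed",
    "04-output",
    "05-gis",
]


def ensure_data_dirs(root):
    """Create any missing data subdirectories under root and return all their paths."""
    root = Path(root)
    paths = []
    for name in DATA_SUBDIRS:
        path = root / name
        # exist_ok=True makes the call safe to repeat on an existing layout.
        path.mkdir(parents=True, exist_ok=True)
        paths.append(path)
    return paths


if __name__ == "__main__":
    # "data" is a hypothetical root folder name, not confirmed by the README.
    for p in ensure_data_dirs("data"):
        print(p)
```

Because `mkdir` is called with `exist_ok=True`, running the helper on an already populated directory leaves existing files untouched.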

Notebooks

The notebooks directory is divided by organization/partner in this project.



βš™οΈ Local Setup for Development

This repo assumes the use of conda for simplicity in installing GDAL.

Requirements

  1. Python 3.9
  2. make
  3. conda
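
Before running the set-up steps, it can help to confirm that these requirements are available. The following is a rough sketch, not part of the repo: the `check_prerequisites` helper is hypothetical, and treating "Python 3.9" as an exact major.minor match is an assumption.

```python
import shutil
import sys

REQUIRED_TOOLS = ["make", "conda"]
REQUIRED_PYTHON = (3, 9)  # assumption: exact major.minor match is intended


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a dict mapping each requirement name to True (found) or False."""
    results = {"python": tuple(version_info[:2]) == REQUIRED_PYTHON}
    for tool in REQUIRED_TOOLS:
        # shutil.which returns None when the executable is not on PATH.
        results[tool] = which(tool) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing or wrong version'}")
```

The `version_info` and `which` parameters exist only so the check can be exercised without depending on the machine it runs on.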

🐍 One-time Set-up

Run these steps the first time you set up the project on a machine. They create a local Python environment for this project.

  1. Install miniconda for your environment if you don't have it yet.

         wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
         bash Miniconda3-latest-Linux-x86_64.sh

  2. Create a local conda env and activate it. This will create a conda env folder in your project directory.

         make conda-env
         conda activate <env name>

  3. Run the one-time set-up make command.

         make setup

🐍 Testing

To run automated tests, simply run make test.

📦 Dependencies

Over the course of development, you will likely introduce new library dependencies. This repo uses pip-tools to manage the python dependencies.

There are two main files involved:

  • requirements.in - contains high level requirements; this is what we should edit when adding/removing libraries
  • requirements.txt - contains the exact list of python libraries (including dependencies of the main libraries) your environment needs to run the repo code; compiled from requirements.in

When you add new python libs, please do the following:

  1. Add the library to the requirements.in file. You may optionally pin the version if you need a particular version of the library.

  2. Run make requirements to compile a new version of the requirements.txt file and update your python env.

  3. Commit both the requirements.in and requirements.txt files so other devs can get the updated list of project requirements.

Note: To update your python env after other devs change the libraries (reflected in an updated requirements.txt file), simply run pip-sync requirements.txt.
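
As an illustrative sanity check, and not something pip-tools itself provides, the sketch below compares the package names in requirements.in against those pinned in requirements.txt, to catch a forgotten recompile before committing. The helper names `package_names` and `missing_from_compiled` are hypothetical, and the parsing assumes simple `name` or `name==version` style lines (no direct URLs).

```python
import re

# Matches the leading package name of a requirement line.
_NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def package_names(requirements_text):
    """Extract normalized package names from requirements-style text."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and options such as -r
            continue
        match = _NAME_RE.match(line)
        if match:
            # PEP 503 style normalization: lowercase, collapse -, _, . into "-".
            names.add(re.sub(r"[-_.]+", "-", match.group(0)).lower())
    return names


def missing_from_compiled(in_text, txt_text):
    """Return names listed in requirements.in but absent from requirements.txt."""
    return sorted(package_names(in_text) - package_names(txt_text))


if __name__ == "__main__":
    with open("requirements.in") as f_in, open("requirements.txt") as f_txt:
        missing = missing_from_compiled(f_in.read(), f_txt.read())
    print("Missing from requirements.txt:", missing or "none")
```

An empty result only means every top-level library has some pin in requirements.txt; it does not verify that the pinned versions are current.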

About

License: Apache License 2.0


Languages

Jupyter Notebook 99.6%, Python 0.4%, Makefile 0.0%