markusritschel / bayes-climsim-eval

Bayesian evaluation of the CMIP5 data

Bayesian evaluation of climate simulations

License MIT license

Evaluation of climate model simulations (e.g. from CMIP) with a Bayesian approach

Preparation

Cloning the project to your local machine

To reproduce the project, clone this repository to your machine:

git clone https://github.com/markusritschel/bayes-climsim-eval

Setting up a dedicated virtual environment

As a next step, although optional, I'd highly recommend that you create a new virtual environment.

Note: I recommend Conda as the package manager. For a performance boost, consider using Mamba.

You can simply use the Makefile command make setup-conda-env from inside the cloned directory (cd bayes-climsim-eval/). This is probably the easiest way to get set up, especially if you're not familiar with virtual environments. This will install Mamba, create a new conda environment with the same name as the project directory, install the packages as they are listed in the environment.yml, and activate the environment.

Installing requirements

Then, in the directory you just cloned, run either python setup.py install or simply make src-available to make the source code in src available as a package. From then on, you can use import src in any Python context within your conda environment.

👉 Note: If you intend to make changes to the code and want them reflected in the installed instance, replace the install in the previous command with develop:

python setup.py develop

The command make src-available will actually use the develop option. [See here for an explanation]
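Once the package is installed, a quick way to confirm that src can be imported from the active environment can be sketched as follows. This is a minimal sketch using only the standard library; the helper name `is_importable` is hypothetical and not part of this project.

```python
import importlib.util


def is_importable(package_name):
    """Return True if a top-level package can be found in the current environment."""
    # find_spec returns None for a top-level name that is not installed.
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Expected to be True after `python setup.py develop` or `make src-available`
    # has been run inside the activated environment.
    print("src importable:", is_importable("src"))
```

If this reports False, the environment that runs the check is probably not the one the package was installed into.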

If you don't want to use conda for any reason, you can also install the required packages via pip only:

pip install -r requirements.txt

Note: If something is not working (e.g. creating the documentation via make docs), try updating all packages via mamba update --all. This might solve the problem.

Make raw data available

Next, make the raw data available or accessible under data/ (see project structure below). If the project is dealing with large amounts of data that reside somewhere outside your home directory, I would suggest that you link the respective subdirectories inside data/ accordingly. The python scripts should be able to follow symlinks.
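The linking step can also be done from Python. The following is a minimal sketch, assuming the external data live in a directory you have read access to; the function name `link_data_dir` and all paths in the demonstration are hypothetical. On Windows, creating symlinks may require additional privileges.

```python
import tempfile
from pathlib import Path


def link_data_dir(external_dir, project_data_dir, name):
    """Create project_data_dir/name as a symlink pointing to external_dir."""
    external_dir = Path(external_dir)
    project_data_dir = Path(project_data_dir)
    project_data_dir.mkdir(parents=True, exist_ok=True)
    link = project_data_dir / name
    # Refuse to overwrite an existing directory, file, or (possibly broken) link.
    if link.is_symlink() or link.exists():
        raise FileExistsError(f"{link} already exists; remove it before linking")
    link.symlink_to(external_dir.resolve(), target_is_directory=True)
    return link


if __name__ == "__main__":
    # Demonstration inside a temporary directory, so no real data are touched.
    with tempfile.TemporaryDirectory() as tmp:
        external = Path(tmp) / "storage" / "cmip_raw"  # hypothetical external location
        external.mkdir(parents=True)
        (external / "example.nc").write_text("placeholder")

        link = link_data_dir(external, Path(tmp) / "project" / "data", "raw")
        # Files are reachable through the link as if they lived in data/raw.
        print(sorted(p.name for p in link.iterdir()))
```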

High-level & Low-level Code

All high-level code (i.e. the code that the user is directly interacting with) resides in the scripts/ and the notebooks/ directory. High-level code is, for example, code that produces a figure, a report, or similar.
Both the scripts and the notebooks should be named in a self-explanatory way that indicates their order of execution and their purpose.

Code residing in src/ is exclusively source code or low-level code and is not actively executed.
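To illustrate how a numbered naming scheme encodes the order of execution, here is a minimal sketch that sorts file names by their leading number. It assumes names start with digits followed by `-` or `_`; the function `execution_order` and all file names except `01-test.py` are hypothetical.

```python
import re
from pathlib import Path

# Leading digits followed by "-" or "_", e.g. "01-" or "02_"
PREFIX = re.compile(r"^(\d+)[-_]")


def execution_order(names):
    """Return the names sorted by their leading number; names without one are skipped."""
    numbered = []
    for name in names:
        match = PREFIX.match(Path(name).name)
        if match:
            numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]


if __name__ == "__main__":
    files = ["02-fit-model.py", "10-make-figures.py", "01-test.py", "helpers.py"]
    print(execution_order(files))
```

Sorting by the integer value rather than by the raw string keeps `10-...` after `02-...` even if someone forgets the zero padding.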

A recommendation for long-running tasks:
Some tasks like data processing will need a long time. It is highly recommended that you use a detachable terminal environment like screen or tmux. This way you can detach from the session (even close your terminal) without losing or ending the process. Alternatively, if you work on a high-performance computer, make use of the queuing system to submit jobs.

Testing

To test your code, run make tests in the root directory. This will execute both the unit tests and docstring examples using pytest.
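As an illustration of the two kinds of tests mentioned above, the sketch below shows a function with a docstring example next to a pytest-style unit test. The function `anomaly` is hypothetical and not part of this project; how exactly make tests invokes pytest is defined in the Makefile and not shown here.

```python
import doctest


def anomaly(values, reference):
    """Subtract a reference value from each entry.

    >>> anomaly([1.5, 2.0, 3.0], 1.0)
    [0.5, 1.0, 2.0]
    """
    return [v - reference for v in values]


def test_anomaly_is_zero_at_reference():
    # A pytest-style unit test: a plain function whose name starts with `test_`.
    assert anomaly([2.0], 2.0) == [0.0]


if __name__ == "__main__":
    # Run the docstring examples directly, without pytest.
    print(doctest.testmod())
```

pytest can collect docstring examples like the one above when called with its `--doctest-modules` option.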

Project Structure

├── assets             <- A place for assets like shapefiles or config files
│
├── data               <- Contains all data used for the analyses in this project.
│   │                     The sub-directories can be links to the actual location of your data.
│   │                     However, they should never be under version control! (-> .gitignore)
│   ├── interim        <- Intermediate data that have been transformed from the raw data
│   ├── processed      <- The final, processed data used for the actual analyses
│   └── raw            <- The original, immutable(!) data
│
├── docsrc             <- The technical documentation (default engine: Sphinx; but feel free to use
│                         MkDocs, Jupyter-Book or anything similar).
│                         This should contain only documentation of the code and the assets.
│                         A report of the actual project should be placed in `reports/book`.
│
├── logs               <- Storage location for the log files being generated by scripts (tip: use `lnav` to view)
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│   │                     and a short `-` or `_` delimited description, e.g. `01-initial-analyses`
│   ├── _paired        <- Optional location for your paired jupyter notebook files
│   ├── exploratory    <- Notebooks for exploratory tasks
│   └── reports        <- Notebooks generating reports and figures
│
├── references         <- Data descriptions, manuals, and all other explanatory materials
│
├── reports            <- Generated reports (e.g. HTML, PDF, LaTeX, etc.)
│   ├── book           <- A Jupyter-Book describing the project
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── setup.py           <- makes project pip installable (pip install -e .) so src can be imported
├── scripts            <- High-level scripts that use (low-level) source code from `src/`
├── src                <- Source code (and only source code!) for use in this project
│   ├── tests          <- Contains tests for the code in `src/`
│   └── __init__.py    <- Makes src a Python module and provides some standard variables
│
├── .env               <- In this file, specify all your custom environment variables
│                         Keep this out of version control!
├── CHANGELOG.md       <- All major changes should go in there
├── jupytext.toml      <- Configuration file for jupytext
├── LICENSE            <- The license used for this project
├── Makefile           <- A self-documenting Makefile for standard CLI tasks
├── README.md          <- The top-level README of this project
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
└── setup.py           <- Setup python file to install your source code in your (virtual) python environment
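The `.env` file listed in the tree above holds custom environment variables. How the project itself reads this file is not stated here; the following is a minimal sketch, assuming plain `KEY=VALUE` lines, that parses such a file with the standard library only. The function `read_env_file` and the variable name `DATA_ROOT` are hypothetical.

```python
import tempfile
from pathlib import Path


def read_env_file(path):
    """Parse KEY=VALUE lines, ignoring blank lines and `#` comments."""
    variables = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        variables[key.strip()] = value.strip().strip("'\"")
    return variables


if __name__ == "__main__":
    # Demonstration with a temporary file instead of a real .env
    with tempfile.TemporaryDirectory() as tmp:
        env_file = Path(tmp) / ".env"
        env_file.write_text("# custom settings\nDATA_ROOT='/path/to/data'\n")
        print(read_env_file(env_file))
```

A dedicated package for `.env` handling may be preferable in practice; this sketch only shows the file format assumed above.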

Dummy files

The following files are for demonstration purposes only and can be safely deleted if not needed:

├── notebooks/01-minimal-example.ipynb
├── docsrc/source/*
├── reports/book/*
├── scripts/01-test.py
└── src
    ├── tests/*
    └── submodule.py

Maintainer

Contact & Issues

For any questions or issues, please contact me via git@markusritschel.de or open an issue.


© Markus Ritschel, 2024
