Cookiecutter Data Science

A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.

Project homepage

I forked from this homepage to modify the project layout to my liking.

Requirements to use the cookiecutter template:

Python 3
Cookiecutter Python package >= 1.4.0: This can be installed with pip by or conda depending on how you manage your Python packages:

$ pip install cookiecutter

or

$ conda config --add channels conda-forge
$ conda install cookiecutter

To start a new project, run:

cookiecutter https://github.com/michael-hoffman/cookiecutter-data-science

TODO: create own cast for this particular project, but it is close enough!

The resulting directory structure

The directory structure of your new project looks like this:

├── LICENSE
├── Makefile
├── README.md
├── data
│   ├── external
│   ├── interim
│   ├── processed
│   └── raw
├── docs
│   ├── Makefile
│   ├── commands.rst
│   ├── conf.py
│   ├── getting-started.rst
│   ├── index.rst
│   └── make.bat
├── references
├── requirements.txt
├── results
│   └── figures
├── setup.py
├── src
│   ├── __init__.py
│   ├── build_features.py
│   ├── make_dataset.py
│   ├── predict_model.py
│   ├── train_model.py
│   └── visualize.py
└── test_environment.py

Contributing

We welcome contributions! See the docs for guidelines.

Installing development requirements

pip install -r requirements.txt

Running the tests

py.test tests

About

A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.

http://drivendata.github.io/cookiecutter-data-science/

MIT License

Languages

Language:Python 48.1%Language:Makefile 34.7%Language:Batchfile 17.2%