vvssttkk / dst

yet another custom data science template via cookiecutter

Home Page: https://vvssttkk.github.io/dst/

data science template

this repo contains a default template for ds/ml/dl and similar projects

how to use

  • before creating a new project from this template, install the required dependencies (the command below needs `cookiecutter`)

  • then go to the directory where you want to create your project and run

    cookiecutter gh:vvssttkk/dst
  • follow the prompts
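
the steps above can also be scripted. a minimal sketch, assuming the `cookiecutter` cli is installed and on `PATH`; `build_command` and `create_project` are hypothetical helper names, not part of this repo. `--output-dir` and `--no-input` are standard cookiecutter options.

```python
import shlex
import subprocess

TEMPLATE = "gh:vvssttkk/dst"  # template reference taken from the command above


def build_command(output_dir, no_input=False):
    """Return the cookiecutter command line as a list of arguments."""
    cmd = ["cookiecutter", TEMPLATE, "--output-dir", str(output_dir)]
    if no_input:
        # --no-input accepts the template defaults instead of prompting
        cmd.append("--no-input")
    return cmd


def create_project(output_dir, no_input=False):
    """Run cookiecutter; needs the cookiecutter cli and network access."""
    subprocess.run(build_command(output_dir, no_input), check=True)


if __name__ == "__main__":
    # only print the command here; call create_project() to actually run it
    print(shlex.join(build_command("projects")))
```

calling `create_project("projects")` would generate the project inside `projects/` and still ask the usual questions; pass `no_input=True` to take the defaults.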

the template uses the following project structure

├── .github/                           <- some actions
│   ├── workflows/  
│   │   ├── ci.yml  
│   │   └── dependency-review.yml  
│   └── dependabot.yml  
│  
├── config/                            <- usually yaml files with parameters
│  
├── data/  
│   ├── external/                      <- data from third party sources
│   ├── interim/                       <- intermediate data that has been transformed
│   ├── processed/                     <- the final, canonical data sets for modeling
│   ├── raw/                           <- the original, immutable data dump
│   ├── features/                      <- another
│   └── README.md  
│  
├── docs/                              <- a default sphinx project (see sphinx-doc.org for details)
│  
├── experiments/                       <- for any experiments
│   └── README.md  
│  
├── models/                            <- trained & serialized models, model predictions, or model summaries
│   └── README.md  
│  
├── notebooks/                         <- notebooks for research; naming convention is a number (for ordering), the creator's initials,
│                                         and a short `-` delimited description, e.g. `1.0-jqp-initial-data-exploration`
│  
├── references/                        <- data dictionaries, manuals, and all other explanatory materials
│   └── README.md  
│  
├── tests/                             <- tests for the project
│   └── __init__.py
│  
├── {{ cookiecutter.project_name }}/   <- source code
│   ├── __init__.py                    <- can be generated with `mkinit`
│   ├── data/                          <- scripts to download or generate data
│   ├── models/                        <- scripts to train models and then use trained models to make predictions
│   └── visualization/                 <- scripts to create exploratory and results oriented visualizations
│  
├── .gitignore                         <- default for python
├── .pre-commit-config.yaml            <- custom pre-commit config with `reorder_python_imports`, `black`, `flake8`, `pyright`, `mypy`, `pre-commit-hooks`, etc.
├── LICENSE                            <- created only if you choose a license
├── README.md
└── requirements.txt                   <- can be generated with `pipreqs`
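
the notebook naming convention from the tree above can be checked automatically. a minimal sketch, assuming lowercase initials, a lowercase description and an optional `.ipynb` extension; the exact pattern is an assumption and `parse_notebook_name` is a hypothetical helper, not part of the template.

```python
import re

# assumed pattern: <number>-<initials>-<dash-delimited-description>
NOTEBOOK_NAME = re.compile(
    r"^(?P<order>\d+(?:\.\d+)*)"                 # number used for ordering, e.g. 1.0
    r"-(?P<initials>[a-z]+)"                     # creator's initials, e.g. jqp
    r"-(?P<description>[a-z0-9]+(?:-[a-z0-9]+)*)$"  # short description
)


def parse_notebook_name(filename):
    """Return the parts of a notebook name, or None if it breaks the convention."""
    suffix = ".ipynb"
    # strip the extension by hand: the name itself contains dots (1.0-...)
    name = filename[: -len(suffix)] if filename.endswith(suffix) else filename
    match = NOTEBOOK_NAME.match(name)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_notebook_name("1.0-jqp-initial-data-exploration.ipynb"))
```

a check like this could run over `notebooks/` in ci or in a local pre-commit hook; names that return `None` do not follow the convention.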

other similar templates

suggested tools