zyablitsev / sample_nb_code_project

An example of a project for doing data work in Python using notebooks but also placing code in Python files and testing them

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Sample Data Project

Briefly describe what this project is about.

Description

Provide a more detailed description of the project, its main features, and its use cases.

Table of Contents

Installation

  1. Clone the Repository

    git clone <repository-url>
    cd <repository-name>
    
  2. Setup Project Directories

    make setup-directories
  3. Setup the Virtual Environment

    Before setting up the virtual environment, ensure you have python3 installed on your machine.

    make setup_venv

    This will create a new virtual environment and install all necessary packages listed in requirements.txt.

  4. Docker (Optional)

    If you intend to run the application inside a Docker container, ensure you have Docker installed and running.

Usage

  • Running the Dash App Locally

    make run_dash_app
  • Working with Docker

    • Build the Docker image:

      make docker_build
    • Deploy the Docker container:

      make docker_deploy
    • To stop and remove the Docker container:

      make docker_stop

Running in Codespaces

If you are using GitHub Codespaces, you launch Jupyter or run the dash application from Codespaces. Just run the make commands from the terminal in Codespaces.

Development Commands

Here are some of the main Makefile commands you might use during development:

  • make help: Display a list of available commands.
  • make setup-directories: Set up the directory structure for the project.
  • make setup_venv: Set up the Python virtual environment and install dependencies.
  • make test: Run tests using pytest.
  • make docker_build: Build a Docker image for the project.
  • make docker_deploy: Deploy the application using Docker.
  • make ci: Run a full CI/CD sequence, which includes tests, Docker image building, and Docker deployment.
  • make docker_stop: Stop and remove the Docker container.
  • make clean: Clean up the development environment (remove virtual environment and Docker image).

Directory Structure

proj/
│
├── data/                  # Data directory for storing all project data
│   ├── raw/               # Raw data, unmodified from its original state
│   ├── processed/         # Data that has been processed and ready for analysis
│   └── external/          # External data, like third-party datasets or exports
│
├── notebooks/             # Jupyter notebooks directory
│   ├── exploratory/       # Notebooks for initial data exploration and experimentation
│   └── report/            # Final notebooks used for reporting and presentation
│
├── venv/                  # Python virtual environment (not committed to version control)
|
├── src/                   # Source code directory
│
├── tests/                 # Tests directory for unit tests, integration tests, etc.
│
├── requirements.txt       # Python dependencies
│
├── .github/workflows/     # GitHub Actions workflows
│
├── .gitignore             # Specifies intentionally untracked files to ignore by Git
│
├── README.md              # Project description, usage, and other details
│
└── Makefile               # Contains automation commands for the project setup and management

Directory Descriptions:

  • data/: This is the primary directory where all the data related to the project resides. Data is further categorized into raw, processed, and external to maintain clarity and separation of concerns.

  • notebooks/: Contains Jupyter notebooks used throughout the project. exploratory/ contains initial data exploration and experimentation notebooks, while report/ has the finalized notebooks for presentation and reporting purposes.

  • src/: The source code of the project resides here. This might include modules, scripts, and other necessary code files.

  • tests/: This directory is dedicated to testing. It contains test scripts, fixtures, and other testing-related files to ensure the codebase's functionality and robustness.

  • Makefile: This is a simple way to manage project tasks. It provides a set of commands for setting up the environment, running tests, building Docker images, and more. It also serves as a form of documentation for the project. If you are running on Windows, you can refer to the Makefile commands and run the equivalent commands in the command prompt.


You can add this section to your README.md to give readers a clear overview of your project's structure and organization.

Alternative Layouts

Contributing

If you'd like to contribute to this project, please follow the usual fork-and-pull request workflow. Ensure you run tests and all checks pass before making a pull request.

License

This code serves as a template and is licensed under the MIT License


Make sure to update placeholders like <repository-url> and <repository-name> with the actual information. Additionally, flesh out sections like "Description" with more detailed info about your project.

About

An example of a project for doing data work in Python using notebooks but also placing code in Python files and testing them


Languages

Language:Jupyter Notebook 99.5%Language:Python 0.4%Language:Makefile 0.1%Language:Dockerfile 0.0%