deepcodebase / classification

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Classification Codebase

This repo implements a simple PyTorch codebase for training classification models with powerful tools including Docker, PyTorchLightning, and Hydra.

Requirements

  • nvidia-docker
  • docker-compose

Setup

Build the environment

We use docker to run all experiemnts. Before running any codes, you should check docker-compose.yml first. The defualt setting is shown as below:

version: "3.9"
services:
    playground:
        container_name: playground
        build:
            context: docker/
            dockerfile: Dockerfile.local
            args:
                - USER_ID=${UID}
                - GROUP_ID=${GID}
                - USER_NAME=${USER_NAME}
        image: pytorch_local
        environment:
            - TZ=Asia/Shanghai
            - TORCH_HOME=/data/torch_model
        ipc: host
        hostname: docker
        working_dir: /code
        command: ['sleep', 'infinity']
        volumes:
            - .:/code
            - /data1/data:/data
            - /data2/data/train_log/outputs:/outputs

You should change the volumes to:

  • mount your dataset folders to /data,
  • and mount a folder for /outputs (training logs will be written to this folder)

Next, simply run:

python core.py env prepare

This command will first build an image based on /docker/Dockerfile.local and then luanch a container based on this image.

Enter the environment

Simply run:

python core.py env

The default user is the same as the host to avoid permission issues. And of course you can enter the container with root:

python core.py env --root

Change the environment

Basiclly, there are four config files:

  • /docker/Dockerfile.pytorch defines basic environments including cuda, cudnn, nccl, conda, torch, etc. This image has been build at deepbase/pytorch. By default, you don't need to change this.
  • /docker/Dockerfile.local defines the logic of building the local image. For example, install packages defined in requirements.txt.
  • /docker/requirements.txt defines the python packages you want to install.
  • /docker-compose.yml defines the setting of running the container. For example, the volumes, timezone, etc.

After changing the settings as you want, you can rebuild the local image by running:

python core.py env prepare --build

Training

Enter the environment and run:

python train.py

Suggestions

Reading the offical documents of Hydra and PyTorchLightning to know more:

  • Hydra: Very powerful and convenient configuration system and more.
  • PyTorchLightning: You almost only need to write codes for models and data. Say goodbye to codes for pipelines, mixed precision, logging, etc.

About


Languages

Language:Python 100.0%