ML project template
Authors:
- Gal Suchetzky (galsuchetzky@gmail.com)
This is a template for a general deep learning project.
It provides a base architecture for the project as well as some useful functions and code,
such as:
- Logging
- Base environment
- Base training loop structure
- etc...
In each file, you will find documentation for the code and templates already existing in the file,
as well as comments, TODOs and recommendations for what logic might be implemented in that part.
Note that the provided code is PyTorch-based, but it can be translated to TensorFlow if necessary.
Files Description
environment.yml
This is a YAML file for setting up an environment with the Anaconda package manager (conda).
It already contains a template for the file's structure and some dependencies.
Update this file with your dependencies as the project evolves, so that setting up a new
environment stays simple.
Creating the environment:
- Open the Anaconda command line.
- Run
conda env create -f environment.yml
You now have a new conda environment with the name specified in the yml file.
In the case of the template environment, the name of the new environment is MLFramework.
Activating the environment:
Each time you want to work in the environment (from a terminal or an editor that does not remember the environment),
run
activate MLFramework
to activate it. On recent conda versions the equivalent command is
conda activate MLFramework
todo: add anaconda installation details, basic commands, package installation details.
Defaults.py
This file contains the list of default values, as well as all the project constants.
Usage:
- Add to this file all the default values for the command-line arguments (see Args.py below) and any project constants.
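As a rough illustration of the usage above, a defaults module might look like the sketch below. Every name and value here (`DATA_DIR`, `DEFAULT_LEARNING_RATE`, and so on) is a hypothetical placeholder, not something taken from the template's actual Defaults.py.

```python
# Hypothetical contents of a Defaults.py-style module.
# All names and values are illustrative assumptions.
from pathlib import Path

# Project constants
PROJECT_ROOT = Path(".")
DATA_DIR = PROJECT_ROOT / "data"
SAVE_DIR = PROJECT_ROOT / "save"

# Default values for command-line arguments
DEFAULT_LEARNING_RATE = 1e-3
DEFAULT_BATCH_SIZE = 32
DEFAULT_NUM_EPOCHS = 10
DEFAULT_SEED = 42
```

Keeping defaults in one module means the argument definitions can reference them by name instead of repeating literal values across scripts.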
Args.py
This file defines the command-line arguments for all the scripts in the project,
for example the arguments that specify the dataset sources, the learning rate and the data
paths.
The file contains basic arguments and many more examples of possible arguments, which you may change
as needed.
The argument parsing is based on the argparse library.
Usage:
The file itself contains example arguments, along with documentation explaining how to add new
arguments.
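For readers unfamiliar with argparse, the following is a minimal sketch of how such arguments could be defined. The function name `get_train_args` and the argument names and defaults are assumptions made for illustration; the actual Args.py may be organized differently.

```python
import argparse


def get_train_args(argv=None):
    """Build the parser and parse argv (argparse reads sys.argv[1:] when argv is None)."""
    parser = argparse.ArgumentParser(description="Train a model (illustrative sketch)")
    # Hypothetical arguments; names and defaults are assumptions for illustration.
    parser.add_argument("--data_path", type=str, default="./data",
                        help="Directory containing the dataset.")
    parser.add_argument("--lr", type=float, default=1e-3,
                        help="Learning rate.")
    parser.add_argument("--batch_size", type=int, default=32,
                        help="Number of samples per batch.")
    parser.add_argument("--use_gpu", action="store_true",
                        help="Train on a GPU if this flag is given.")
    return parser.parse_args(argv)


# An explicit list stands in for what would be typed on the command line.
args = get_train_args(["--lr", "0.01", "--use_gpu"])
print(args.lr, args.batch_size, args.use_gpu)
```

Accepting an optional `argv` list keeps the parser easy to exercise from tests or notebooks without touching the real command line.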
Config.py
This file contains the configuration classes for the project.
Each config class takes care of parsing the command-line arguments that are relevant to it.
Currently, there are 3 config classes in this file: SetupConfig, TrainConfig, TestConfig.
Usage:
- If necessary, add more config classes.
- Edit the SetupConfig, TrainConfig and TestConfig classes so that each one parses all the arguments relevant to it.
- Make sure to include processing for any additional arguments that you have added to the Args.py file.
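One possible shape for such a config class is sketched below. The class name TrainConfig comes from the list above, but its attributes (`lr`, `batch_size`, `num_epochs`) and the validation rules are assumptions; the template's own classes may differ.

```python
import argparse


class TrainConfig:
    """Keeps only the training-related arguments and validates them."""

    def __init__(self, args):
        # Attribute names are hypothetical; mirror whatever Args.py defines.
        self.lr = args.lr
        self.batch_size = args.batch_size
        self.num_epochs = args.num_epochs
        if self.lr <= 0:
            raise ValueError(f"lr must be positive, got {self.lr}")
        if self.batch_size < 1:
            raise ValueError(f"batch_size must be at least 1, got {self.batch_size}")

    def __repr__(self):
        return (f"TrainConfig(lr={self.lr}, batch_size={self.batch_size}, "
                f"num_epochs={self.num_epochs})")


# A Namespace stands in for the parsed command-line arguments.
# dataset_url is not relevant to training, so TrainConfig ignores it.
args = argparse.Namespace(lr=0.001, batch_size=32, num_epochs=10,
                          dataset_url="https://example.com/data.zip")
config = TrainConfig(args)
print(config)
```

Validating in the constructor surfaces bad arguments before any training starts, rather than partway through a run.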
Setup.py
This file is the setup script for the project.
It downloads all the resources the project requires, including the dataset,
and preprocesses the dataset.
Note that this script is meant to be run only once.
Usage:
- Edit Args.py to include arguments for the resource parameters that are needed.
- Edit the download function to correctly download the resources (more details in the function).
- Edit the pre_process function to preprocess your downloaded dataset.
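The two functions named above could be filled in along the following lines. This is only a sketch: the signatures, the skip-if-present check, and the toy text cleaning are assumptions for illustration, and the template's own download and pre_process functions may take different parameters.

```python
import urllib.request
from pathlib import Path


def download(url, target_path):
    """Download url to target_path unless the file is already present."""
    target_path = Path(target_path)
    if target_path.exists():
        # Already downloaded; re-running the setup script costs nothing.
        return target_path
    target_path.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(url, str(target_path))
    return target_path


def pre_process(raw_path, processed_path):
    """Toy preprocessing: strip whitespace, lowercase, drop empty lines.

    Returns the number of lines written.
    """
    raw_path, processed_path = Path(raw_path), Path(processed_path)
    lines = raw_path.read_text(encoding="utf-8").splitlines()
    cleaned = [line.strip().lower() for line in lines if line.strip()]
    processed_path.parent.mkdir(parents=True, exist_ok=True)
    processed_path.write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return len(cleaned)
```

Skipping files that already exist makes the script safe to run again if a later step fails.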
Models.py
This file will contain the architecture of the models used in the project.
The template is for PyTorch-based models; for more information about PyTorch, see https://pytorch.org/docs/stable/
Usage:
- Implement your models in this file using the supplied PyTorch template.
Layers.py
This file will contain all the custom layers of your models.
Use this file to maintain readability in the Models.py file.
Usage:
- Implement your custom layers in this file using the supplied PyTorch template.
Trainers.py
todo: add explanation and usage details
Train.py
todo: add explanation and usage details
Test.py
todo: add explanation and usage details
Utils.py
todo: add explanation and usage details
Framework usage
todo: Add explanation and recommendations for how one might go about implementing a project based on this template implementation - from which files to start, how to run training, how to test, how to set up the environment...
Setup
- Make sure you have Miniconda installed
  - Conda is a package manager that sandboxes your project’s dependencies in a virtual environment
  - Miniconda contains Conda and its dependencies with no extra packages by default (as opposed to Anaconda, which installs some extra packages)
- cd into src, run
conda env create -f environment.yml
  - This creates a Conda environment called MLFramework
- Run
source activate MLFramework
  - This activates the MLFramework environment
  - Do this each time you want to write/test your code