This project is an Active Learning Framework for Drug Discovery. It is designed to be extensible and supports the integration of different regression models for determining the potency of chemical compounds.
- A Python environment manager such as conda (for example, installed via Miniconda)
- A code editor; Visual Studio Code or JupyterLab preferred
- Clone the repository into a folder of your choice via
git clone https://gitlab.com/Baldur10/drug-discovery-al
- Enter the root folder of the repository via
cd drug-discovery-al
- Create the conda environment from the YAML file via
conda env create --name dd-al --file environ_al.yml
- Activate the conda environment via
conda activate dd-al
- Gaussian Process Regressor (Scikit-Learn GPR)
- Random Forest Regressor (Intel(R) Extension for Scikit-Learn and Scikit-Learn RFR)
- Neural Network Regressor (SKORCH)
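As a rough illustration of how several regressors can sit behind one interface, here is a minimal sketch of a name-to-factory registry. It is not taken from this repository: `MODEL_FACTORIES`, `build_model`, the kernel choice, and the synthetic data are all assumptions made for the example, and plain scikit-learn stands in for the Intel(R) extension. Because SKORCH regressors follow the scikit-learn estimator API, a neural network could presumably be registered the same way.

```python
from typing import Callable, Dict

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Hypothetical registry: each entry builds a fresh, unfitted regressor.
MODEL_FACTORIES: Dict[str, Callable[[], object]] = {
    "gpr": lambda: GaussianProcessRegressor(
        kernel=RBF() + WhiteKernel(), normalize_y=True, random_state=0
    ),
    "rfr": lambda: RandomForestRegressor(n_estimators=100, random_state=0),
}


def build_model(name: str):
    """Return a new regressor for the given registry key."""
    try:
        return MODEL_FACTORIES[name]()
    except KeyError:
        raise ValueError(
            f"Unknown model {name!r}; choose from {sorted(MODEL_FACTORIES)}"
        ) from None


# Synthetic stand-in for compound descriptors (X) and potency values (y).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=40)

# Every registered model is trained and queried through the same fit/predict calls.
predictions = {}
for name in MODEL_FACTORIES:
    model = build_model(name).fit(X[:30], y[:30])
    predictions[name] = model.predict(X[30:])
```

Adding a new model type under this scheme would only require one more entry in the dictionary, which is one way the "extensible" goal could be met.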
Pretrained Models for the default assays are available at:
Storage | Link |
---|---|
OneDrive | FYP Models |
- Open the required model training script inside
/scripts
- Taking the Gaussian Process Regressor model as an example, the appropriate file is
/scripts/test_gpr.ipynb
- Open the file in the code editor and run all cells. If given the option, select
dd-al
as the Python interpreter
- The variable
assay_limit
can be set to any integer 'n' to train models for only the first 'n' assays
- After the training loop completes, the models can be found under
/models
and the data under
/data/data_results
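The role of `assay_limit` in the steps above can be sketched as a loop over the first 'n' assays. This is only an illustration under stated assumptions, not the notebook's actual code: `train_first_assays`, the `train_fn` callback, the assay identifiers, and the `.pkl` file naming are all hypothetical.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Callable, List, Sequence


def train_first_assays(
    assay_ids: Sequence[str],
    assay_limit: int,
    train_fn: Callable[[str], object],
    models_dir: Path,
) -> List[Path]:
    """Train one model per assay for the first `assay_limit` assays and pickle each."""
    models_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for assay_id in assay_ids[:assay_limit]:  # only the first n assays are used
        model = train_fn(assay_id)  # hypothetical per-assay training routine
        out_path = models_dir / f"{assay_id}.pkl"  # file naming is an assumption
        with out_path.open("wb") as fh:
            pickle.dump(model, fh)
        saved.append(out_path)
    return saved


# Demo with a dummy "model" (a dict) written to a temporary models folder.
with tempfile.TemporaryDirectory() as tmp:
    paths = train_first_assays(
        ["assay_a", "assay_b", "assay_c"],
        assay_limit=2,
        train_fn=lambda assay_id: {"assay": assay_id},
        models_dir=Path(tmp) / "models",
    )
    saved_names = [p.name for p in paths]
    all_exist = all(p.exists() for p in paths)
```

With `assay_limit=2`, the third assay is skipped entirely, so a small value is a cheap way to smoke-test the pipeline before a full run.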
- Before running Flask, set the FLASK_APP environment variable (Windows Command Prompt syntax shown; on macOS or Linux, use export instead of set) using
set FLASK_APP=app.py
- Run the Flask app via
flask run
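Since the syntax for setting environment variables differs between shells, one platform-independent alternative is to prepare the variable and the command from Python. The sketch below is an assumption-laden illustration rather than part of this repository: `flask_launch_spec` is a hypothetical helper, and it only builds the command and environment without starting the server.

```python
import os
import sys
from typing import Dict, List, Tuple


def flask_launch_spec(app_file: str = "app.py") -> Tuple[List[str], Dict[str, str]]:
    """Return the command and environment for `flask run` with FLASK_APP set."""
    env = dict(os.environ)  # copy, so the current process environment is untouched
    env["FLASK_APP"] = app_file
    # `python -m flask run` is equivalent to the `flask run` console command.
    cmd = [sys.executable, "-m", "flask", "run"]
    return cmd, env


cmd, env = flask_launch_spec()
# To actually start the server from the repo root (requires Flask to be
# installed in the active environment):
#   import subprocess
#   subprocess.run(cmd, env=env, check=True)
```

Using `sys.executable` keeps the launch tied to whichever interpreter is active, which should be the `dd-al` environment if it has been activated first.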
Contact me at rmohan2-c@my.cityu.edu.hk