gifford-lab / FactorizableLibrary

Code and data for factorizable libraries

FactorizableLibrary

Folder contents

model_training/:

  • data/: contains datasets for all targets presented in the manuscript
    • *.csv: sequence data with target enrichment, labeled by train/test split
    • *.pkl: easy-to-load pickle files with train/test splits for training
  • weights/: output folder for trained weights. Contains all trained weights for models used in Dai & Saksena et al. 2022.
  • TrainModel.ipynb: notebook containing functions for training scoring models for later use in SAPS
  • encoding.pkl: sequence encoding used for training

library_generation/:

  • GenerateLibrary.ipynb: notebook containing functions to generate SAPS-designed library

Instructions for running notebooks

GPU:

  • The following instructions require access to a GPU with CUDA support. If you do not have access to one, use the Google Colab notebooks provided below to run SAPS.
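
Before installing anything, it can help to confirm that the machine actually exposes a CUDA-capable GPU. The following is a minimal sketch, not part of the repository: it assumes the NVIDIA driver's `nvidia-smi` tool is on the PATH, and the helper name `cuda_gpu_visible` is hypothetical. It does not check whether the deep learning framework used by the notebooks can see the GPU.

```python
import shutil
import subprocess


def cuda_gpu_visible() -> bool:
    """Return True if nvidia-smi is installed and lists at least one GPU.

    Hypothetical helper for illustration; not part of this repository.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # No NVIDIA driver tools on PATH, so assume no usable GPU.
        return False
    try:
        # "nvidia-smi -L" prints one line per GPU, e.g. "GPU 0: ...".
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.SubprocessError):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


if __name__ == "__main__":
    print("CUDA GPU visible:", cuda_gpu_visible())
```

If this prints `False`, the Google Colab notebooks are likely the easier route.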

Install dependencies:

Using conda:

conda create -n factorizable
conda activate factorizable
conda install --file requirements.txt

Using pip:

pip install -r requirements.txt

Train model

After installing dependencies, open Jupyter notebook:

cd FactorizableLibrary/
jupyter notebook model_training/TrainModel.ipynb

Follow instructions in the notebook to run each cell and train a model. You can use one of the datasets provided in data/, or you can generate your own *.pkl data file and replace the path in the execution portion of the notebook.
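
As a rough illustration of building your own *.pkl file, the sketch below groups rows of a CSV by a split column and pickles the result. Everything about the layout is an assumption: the column names `sequence`, `enrichment`, and `split`, the helper name `csv_to_split_pickle`, and the dictionary structure are all hypothetical. Check the data-loading cell in TrainModel.ipynb and one of the provided *.pkl files for the structure the notebook actually expects, and adapt accordingly.

```python
import csv
import pickle


def csv_to_split_pickle(csv_path, pkl_path, seq_col="sequence",
                        label_col="enrichment", split_col="split"):
    """Group CSV rows by split and save them as a pickle.

    Column names and the output layout are assumptions for illustration,
    not the format used by this repository.
    """
    splits = {}
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            # One bucket per split value, e.g. "train" and "test".
            bucket = splits.setdefault(
                row[split_col], {"sequences": [], "labels": []}
            )
            bucket["sequences"].append(row[seq_col])
            bucket["labels"].append(float(row[label_col]))
    with open(pkl_path, "wb") as handle:
        pickle.dump(splits, handle)
    return splits
```

Calling `csv_to_split_pickle("my_target.csv", "my_target.pkl")` would then produce a file whose path can be substituted in the execution portion of the notebook, provided the layout is adjusted to match what the notebook reads.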

Generate library

Once you have trained a model, its weights should be stored in model_training/weights/. Next, open the notebook for library generation:

jupyter notebook library_generation/GenerateLibrary.ipynb

Follow instructions in the notebook to run each cell and generate a SAPS library. You can change the path to different model weights files to change the optimization objective of the library. You can also adjust the entropy parameter as specified in the notebook.
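
To build intuition for what an entropy setting can control, the sketch below computes the Shannon entropy of a library described by independent per-position residue distributions. This is an assumption about what "factorizable" and "entropy" refer to here, not a description of the notebook's implementation; the function names are hypothetical, and the notebook may define or scale its entropy parameter differently. Under the independence assumption, the library's entropy is simply the sum of the per-position entropies, so higher entropy corresponds to a more diverse library.

```python
import math


def position_entropy(probs):
    """Shannon entropy (in nats) of one position's residue distribution."""
    # Zero-probability residues contribute nothing to the sum.
    return -sum(p * math.log(p) for p in probs if p > 0)


def library_entropy(position_probs):
    """Entropy of a library with independent positions.

    For independent positions, the joint entropy is the sum of the
    per-position entropies. Illustrative only; not the notebook's code.
    """
    return sum(position_entropy(p) for p in position_probs)


if __name__ == "__main__":
    # Toy library: one fully random 4-letter position, one fixed position.
    toy = [[0.25, 0.25, 0.25, 0.25], [1.0, 0.0, 0.0, 0.0]]
    print(library_entropy(toy))
```

In this toy case only the first position contributes, since a fixed position carries no diversity.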

Google Colab demos

We have provided demo notebooks in which example models can be trained using publicly available Google GPUs for illustrative purposes. These can be accessed here:
