judicaelclair / cu_mlfg_project

MLFG Project: Predicting ATAC-seq accessibility from DNA methylation

The code has been tested on Ubuntu 21.10 with Python 3.9.7. Throughout this README, we assume that your current working directory is the root directory of this repository.

First make sure you have the necessary packages installed:

python3 -m pip install -r requirements.txt

Please download the folder named data found here and place it in the parent directory of this repository's root directory, so that it sits alongside the repository. That is, the file structure should be:

  • <parent_directory>/cu_mlfg_project
  • <parent_directory>/data

If you would like to change the data directory path, please modify the variable DATA_DIR, which is defined in the file src/common_util.py. Furthermore, if you would like to change the device used by PyTorch, please modify the variable torch_device, which is also defined in the file src/common_util.py.
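As a rough illustration of the two settings described above, the following minimal sketch shows one way such values could be derived. The helper names default_data_dir and pick_device are hypothetical and are not taken from src/common_util.py; only the variable names DATA_DIR and torch_device come from this README. The sketch deliberately avoids importing PyTorch, so the device is represented as a plain string.

```python
from pathlib import Path


def default_data_dir(repo_root: Path) -> Path:
    """Return the data folder that sits next to the repository root.

    Hypothetical helper: mirrors the layout described in this README,
    where `data` and `cu_mlfg_project` share the same parent directory.
    """
    return repo_root.parent / "data"


def pick_device(cuda_available: bool) -> str:
    """Return a PyTorch device string.

    Hypothetical helper: in real code the flag would typically come
    from torch.cuda.is_available().
    """
    return "cuda" if cuda_available else "cpu"


# Assumption: the current working directory is the repository root,
# as stated at the top of this README.
REPO_ROOT = Path.cwd()
DATA_DIR = default_data_dir(REPO_ROOT)

# Assumption: default to the CPU; switch the flag to use a GPU.
torch_device = pick_device(cuda_available=False)
```

To point the code at a different location, DATA_DIR could simply be assigned another path, for example Path("/mnt/storage/data") (an illustrative path only).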

The code for the models, along with utilities for training and evaluating them, can be found in the src folder. Depending on which model you want to run, execute one of the following:

  • Variational Autoencoder (VAE): python3 src/conv1d_vae.py
  • Vanilla Grouping-based RNN: python3 src/per_sample_rnn.py
  • Range Grouping-based RNN: python3 src/per_range_rnn.py
  • Chromosome Grouping-based RNN: python3 src/multi_range_rnn.py
  • Tissue Grouping-based RNN: python3 src/per_tissue_rnn.py
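If you prefer to launch the models from Python rather than typing each command, the list above can be wrapped in a small launcher. This is only a sketch: the short keys in MODEL_SCRIPTS and the functions build_command and run_model are hypothetical and not part of this repository; only the script paths come from the list above. It assumes it is run from the repository root.

```python
import subprocess
import sys

# Script paths are taken from this README; the short keys are
# hypothetical labels chosen for this sketch.
MODEL_SCRIPTS = {
    "vae": "src/conv1d_vae.py",
    "vanilla_rnn": "src/per_sample_rnn.py",
    "range_rnn": "src/per_range_rnn.py",
    "chromosome_rnn": "src/multi_range_rnn.py",
    "tissue_rnn": "src/per_tissue_rnn.py",
}


def build_command(model: str) -> list[str]:
    """Build the command that runs the script for the given model key."""
    if model not in MODEL_SCRIPTS:
        known = ", ".join(sorted(MODEL_SCRIPTS))
        raise ValueError(f"Unknown model {model!r}; expected one of: {known}")
    # Use the current interpreter so the same environment is reused.
    return [sys.executable, MODEL_SCRIPTS[model]]


def run_model(model: str) -> None:
    """Run the selected model script; raises if the script exits non-zero."""
    subprocess.run(build_command(model), check=True)
```

For example, calling run_model("vae") from the repository root would be equivalent to running the VAE command listed above with the current interpreter.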

The code for data processing is located in the data_processing folder.
