APARENT-Perturb

This repository contains the code for training APARENT-Perturb, a multi-task neural network that takes 3' UTR polyadenylation signal sequences as input (alongside APARENT2 baseline scores) to predict the impact of single-cell Perturb-seq perturbations on polyadenylation site usage.

Contact jlinder2 (at) stanford.edu for any questions about the model or data.

APARENT-Perturb does not require installation. Just clone or fork the GitHub repository:

git clone https://github.com/johli/aparent-perturb.git

APARENT-Perturb requires the following packages to be installed:

Python >= 3.6
Tensorflow == 1.13.1
Keras == 2.2.4
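
Because the TensorFlow and Keras versions above are pinned exactly, it can help to verify the environment before opening the notebooks. The following is a minimal sketch and not part of the repository: the helper name `check_environment` is hypothetical, and it assumes each package exposes a `__version__` attribute (as TensorFlow and Keras do).

```python
import importlib
import sys

# Versions taken from the requirements list above.
REQUIRED = {"tensorflow": "1.13.1", "keras": "2.2.4"}


def check_environment(required=REQUIRED, min_python=(3, 6)):
    """Return a list of human-readable problems (empty if all checks pass).

    Hypothetical helper for illustration; it is not shipped with the repository.
    """
    problems = []
    if sys.version_info < min_python:
        problems.append(
            "Python %d.%d found, need >= %d.%d"
            % (sys.version_info[0], sys.version_info[1], min_python[0], min_python[1])
        )
    for name, wanted in required.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            problems.append("%s is not installed (need %s)" % (name, wanted))
            continue
        found = getattr(module, "__version__", "unknown")
        if found != wanted:
            problems.append("%s %s found, need %s" % (name, found, wanted))
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["Environment looks OK"]:
        print(line)
```

An empty list means the pinned versions were found; otherwise each entry names the missing or mismatched package.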

Data Availability

The processed data features (e.g. one-hot-coded sequence matrices and pseudo-bulked APA isoform proportions) are available at the link below. The link also houses prediction and interpretation results (e.g. ISM matrices) for all modelled perturbations.
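
To illustrate the two feature types mentioned above, here is a minimal NumPy sketch of one-hot encoding a sequence and pseudo-bulking per-cell isoform counts into proportions. The function names, the A/C/G/T column order, and the input layout (cells by isoforms, with one perturbation label per cell) are assumptions made for this example; the processing notebooks may use a different layout.

```python
import numpy as np

NUCLEOTIDES = "ACGT"  # assumed column order; the repository may use another


def one_hot_encode(seq):
    """Encode a DNA sequence as a (length, 4) matrix; unknown bases stay all-zero."""
    mat = np.zeros((len(seq), 4), dtype=np.float32)
    for i, nt in enumerate(seq.upper()):
        j = NUCLEOTIDES.find(nt)
        if j >= 0:
            mat[i, j] = 1.0
    return mat


def pseudo_bulk_proportions(counts, groups):
    """Sum per-cell isoform counts within each group, then normalize to proportions.

    counts: (n_cells, n_isoforms) read counts for one gene (assumed layout).
    groups: (n_cells,) perturbation label for each cell.
    Returns a dict mapping each label to a vector of isoform proportions.
    """
    counts = np.asarray(counts, dtype=np.float64)
    groups = np.asarray(groups)
    result = {}
    for g in np.unique(groups):
        total = counts[groups == g].sum(axis=0)
        denom = total.sum()
        # Groups with no reads get NaN rather than a misleading zero.
        result[g.item()] = total / denom if denom > 0 else np.full(total.shape, np.nan)
    return result
```

For example, cells with (proximal, distal) counts `[[2, 2], [1, 3]]` in one group pool to `[3, 5]`, giving proportions `[0.375, 0.625]`.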

Processed Data Repository

Notebooks

The following notebooks contain code for processing the data, training models, and applying interpretation methods to them.

Notebook 0a: Process Data
Notebook 0b: Process Data (3' UTR only)
Notebook 0c: Predict Non-targeting Controls (with APARENT2)

Notebook 1a: Train APARENT-Perturb
Notebook 1b: Cross-Validate APARENT-Perturb

Notebook 2a: Interpret APARENT-Perturb (Windowed ISM)
Notebook 2b: Interpret APARENT-Perturb (Epistasis)

Notebook 3a: Predict PAF Perturbation (Intronic)
Notebook 3b: Predict PAF Perturbation (3' UTR; Control)
Notebook 3c: Intronic Site Strength vs. Distance
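
For readers unfamiliar with the interpretation methods named in Notebooks 2a and 2b, the sketch below shows plain single-nucleotide in-silico saturation mutagenesis (ISM) and a pairwise epistasis score. It is an illustration only, not the notebooks' implementation: the windowed ISM variant used there is not reproduced, `toy_score` is a hypothetical stand-in for a trained model's prediction, and all function names are assumptions.

```python
import numpy as np

NUCLEOTIDES = "ACGT"


def ism_matrix(seq, score_fn):
    """Return a (length, 4) matrix of score changes for every single substitution."""
    ref = score_fn(seq)
    scores = np.zeros((len(seq), 4))
    for pos in range(len(seq)):
        for j, nt in enumerate(NUCLEOTIDES):
            if nt == seq[pos]:
                continue  # the reference base has zero effect by definition
            mutant = seq[:pos] + nt + seq[pos + 1:]
            scores[pos, j] = score_fn(mutant) - ref
    return scores


def pairwise_epistasis(seq, score_fn, mut_a, mut_b):
    """Double-mutant effect minus the sum of the two single-mutant effects.

    Each mutation is a (position, nucleotide) tuple.
    """
    def apply(muts):
        chars = list(seq)
        for pos, nt in muts:
            chars[pos] = nt
        return "".join(chars)

    wt = score_fn(seq)
    effect_a = score_fn(apply([mut_a])) - wt
    effect_b = score_fn(apply([mut_b])) - wt
    effect_ab = score_fn(apply([mut_a, mut_b])) - wt
    return effect_ab - (effect_a + effect_b)


def toy_score(seq):
    """Hypothetical scorer: counts canonical AATAAA hexamers in the sequence."""
    return float(seq.count("AATAAA"))


if __name__ == "__main__":
    seq = "GGAATAAAGG"
    print(ism_matrix(seq, toy_score))
    print(pairwise_epistasis(seq, toy_score, (2, "C"), (7, "C")))
```

With the toy scorer, every substitution inside the hexamer scores -1 and every substitution outside it scores 0. The two example mutations hit the same motif, so their combined effect (-1) is smaller in magnitude than the sum of their individual effects (-2), giving an epistasis score of 1.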

MIT License