elihalpern / Splice-Junction-Identify

Identifying Splice Junctions in primates using Time Series Classification

Identifying Splice Junctions in Primate DNA using Time Series Classification

This project aims to develop a method for classifying whether a given section of primate DNA contains a splice junction and, if it does, whether the junction is exon-intron or intron-exon. The classification uses a Time Series Classification model based on Dynamic Time Warping.
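As a rough illustration of the idea, the sketch below computes a Dynamic Time Warping distance between two numerically encoded DNA strings and assigns a label with a 1-nearest-neighbor rule. It is a minimal sketch under stated assumptions, not the notebook's actual code: the nucleotide-to-number mapping, the function names (encode, dtw_distance, classify_1nn), the label strings ("EI", "IE", "N"), and the toy sequences are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical numeric encoding of nucleotides (an assumption; the project's
# actual encoding is not described in this README).
ENCODING = {"A": 0.0, "C": 1.0, "G": 2.0, "T": 3.0}


def encode(seq: str) -> List[float]:
    """Turn a DNA string into a numeric series so it can be treated as a time series."""
    return [ENCODING[base] for base in seq.upper()]


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic O(len(a) * len(b)) DTW with absolute difference as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # prev[j] holds the best cumulative cost of aligning a[:i-1] with b[:j].
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def classify_1nn(query: str, training: Sequence[Tuple[str, str]]) -> str:
    """Return the label of the training sequence closest to `query` under DTW."""
    q = encode(query)
    best_seq, best_label = min(
        training, key=lambda item: dtw_distance(q, encode(item[0]))
    )
    return best_label


if __name__ == "__main__":
    # Toy placeholder sequences and labels, not real splice-junction data.
    toy_training = [("AAAA", "N"), ("GGTT", "EI"), ("TTCC", "IE")]
    print(classify_1nn("GGGTT", toy_training))
```

A nearest-neighbor rule is used here only because it is the simplest way to turn a distance into a classifier; the model in the notebook may differ.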

Getting Started

To run the notebook, make sure you have Python 3.7.2, Jupyter Notebook, and (optionally) Anaconda installed.

Prerequisites

Most of the prerequisites, DTW excepted, can be installed automatically using the requirements.txt file:

pip install -r requirements.txt

Or, if you're using Anaconda, create a new environment from the same file:

conda create --name splice_junction --file requirements.txt

Running

Navigate to the directory containing the .ipynb file, then launch Jupyter:

jupyter notebook

Authors

Eli Halpern, Abolfazl Saghafi

License

This project is licensed under the GNU General Public License; see the LICENSE.md file for details.

Acknowledgments

  • Dr. Abolfazl Saghafi, my research advisor who guided me through this project
  • Dr. Zhijun Li, head of the Bioinformatics department at USciences

License: GNU General Public License v3.0

