xypan1232 / PredcircRNA

predicting circularRNA from other long non-coding RNA using machine learning

PredcircRNA: computational classification of circular RNA from other long non-coding RNA using hybrid features

PredcircRNA distinguishes circular RNA (circRNA) from other long non-coding RNAs (lncRNAs) using multiple kernel learning. First, we extract discriminative features from several sources for each transcript: graph features, conservation information, sequence composition, ALU and tandem repeats, SNP density, and open reading frame (ORF) features. Second, to better integrate features from these different sources, we propose a computational approach based on a multiple kernel learning framework that fuses the heterogeneous features.
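
In multiple kernel learning, one kernel is typically computed per feature source and the kernels are then combined, often as a weighted sum whose weights are learned together with the classifier. The sketch below only illustrates that combination step in plain Python. It is not the PredcircRNA implementation and does not use SHOGUN: the function names, the toy feature values and the fixed weights are all assumptions made for illustration.

```python
def linear_kernel(rows):
    """Gram matrix of dot products between feature vectors."""
    return [[sum(a * b for a, b in zip(x, y)) for y in rows] for x in rows]


def combine_kernels(kernels, weights):
    """Weighted sum of equally sized kernel matrices."""
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    n = len(kernels[0])
    combined = [[0.0] * n for _ in range(n)]
    for kernel, weight in zip(kernels, weights):
        for i in range(n):
            for j in range(n):
                combined[i][j] += weight * kernel[i][j]
    return combined


# Made-up feature values for three transcripts from two feature sources.
sequence_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
conservation_features = [[0.5], [0.1], [0.4]]

kernels = [linear_kernel(sequence_features), linear_kernel(conservation_features)]
weights = [0.7, 0.3]  # fixed by hand here; an MKL solver would learn these
fused = combine_kernels(kernels, weights)
```

The fused matrix can be passed to any kernel classifier that accepts a precomputed kernel. Learning the weights is the part that a multiple kernel learning solver adds on top of this step.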

Dependencies:

  1. GraphProt: http://www.bioinf.uni-freiburg.de/Software/GraphProt/
  2. SHOGUN: http://www.shogun-toolbox.org/
  3. txCdsPredict: http://hgdownload.cse.ucsc.edu/admin/
  4. Tandem Repeats Finder (trf): http://tandem.bu.edu/trf/trf.download.html

Input BED file format (for example, test_bed), with columns for chromosome, start, end, strand and transcript name:
chr2 69304539 69318051 + gene1
chr7 138593736 138597206 - gene2
chr22 39134591 39137055 - gene3

NOTICE: the last column must contain a unique name for each transcript (here gene1, gene2, ...).
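
Because duplicate names are easy to miss in a large file, it can help to check the input before running the tool. The following is a minimal sketch of such a check, assuming whitespace-separated lines with the five columns shown in the example. The function check_bed_lines is hypothetical and is not part of PredcircRNA.

```python
from collections import Counter


def check_bed_lines(lines):
    """Return a list of problems found in whitespace-separated BED-like lines."""
    problems = []
    names = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 5:
            problems.append(f"line {number}: expected 5 columns, found {len(fields)}")
            continue
        chrom, start, end, strand, name = fields
        if not (start.isdigit() and end.isdigit() and int(start) < int(end)):
            problems.append(f"line {number}: start/end must be integers with start < end")
        if strand not in ("+", "-"):
            problems.append(f"line {number}: strand must be + or -")
        names.append(name)
    # The last column must be unique across all transcripts.
    for name, count in Counter(names).items():
        if count > 1:
            problems.append(f"name {name!r} appears {count} times")
    return problems


example = [
    "chr2 69304539 69318051 + gene1",
    "chr7 138593736 138597206 - gene2",
    "chr22 39134591 39137055 - gene3",
]
print(check_bed_lines(example))  # prints [] because the example has no problems
```

To check a real file, pass an open file handle instead of the list, since the function only iterates over lines.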

To run the tool, use the following command:
python PredcircRNA.py --inputfile=test_bed --outputfile=test_bed_out

The output file gives the corresponding lncRNA type for each transcript in the last column.
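
A quick way to summarize a run is to count the values in that last column. The sketch below assumes the output is whitespace-separated with one transcript per line. The helper count_predicted_types and the sample lines are hypothetical, and the labels "circRNA" and "lncRNA" are placeholders, since the exact label strings written by the tool are not documented here.

```python
from collections import Counter


def count_predicted_types(lines):
    """Count how often each value occurs in the last column."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[-1]] += 1
    return counts


# Hypothetical output lines; the real labels may differ.
sample_output = [
    "chr2 69304539 69318051 + gene1 circRNA",
    "chr7 138593736 138597206 - gene2 lncRNA",
    "chr22 39134591 39137055 - gene3 circRNA",
]
summary = count_predicted_types(sample_output)
```

For a real run, open test_bed_out and pass the file handle to the function in place of sample_output.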

Webserver:
http://rth.dk/resources/webcircrna
You can also use our updated webserver to predict the circRNA potential of coding and non-coding RNAs.

Reference
Xiaoyong Pan, Kai Xiong. PredcircRNA: computational classification of circular RNA from other long non-coding RNA using hybrid features. Mol. BioSyst., 2015, 11, 2219-2226
