arendsee / fagin

Classify genes using a syntenic filter

Fagin

A pipeline for the classification of orphans into origin classes using a syntenic filter.

Funding

This work is funded by the National Science Foundation grant:

NSF-IOS 1546858 Orphan Genes: An Untapped Genetic Reservoir of Novel Traits

Installation

# install.packages("devtools")  # uncomment if devtools is not yet installed
devtools::install_github("arendsee/fagin")
library(fagin)

Dependencies

Currently fagin has no dependencies outside of R. It makes heavy use of Bioconductor packages (e.g. Biostrings, GenomicRanges, and GenomicFeatures). It also uses the rather experimental packages 'synder' and 'rmonad'.
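
Before installing, it can help to see which of the Bioconductor packages named above are already present. The following is a minimal sketch, not part of fagin: `missing_packages` is a hypothetical helper, and the package list is only the three examples mentioned above, not a complete dependency list.

```r
# Hypothetical helper (not part of fagin): return the packages from `pkgs`
# that cannot be loaded in the current R library.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Only the Bioconductor packages named in this README; the full list may differ.
bioc_pkgs <- c("Biostrings", "GenomicRanges", "GenomicFeatures")
to_install <- missing_packages(bioc_pkgs)
print(to_install)

# Anything missing can then be installed with BiocManager, for example:
# install.packages("BiocManager")
# BiocManager::install(to_install)
```

The install calls are left commented out so that the check itself runs without network access.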

Input

The following inputs are required:

  • Phylogeny for all included species
  • Name of the focal species
  • Synteny map for the focal species versus each other species
  • For each species
    • GFF file (must at least include gene models)
    • Full genome (GFF reference)

Documentation

Working case studies that you can adapt for your own projects are available.

There is also an (under construction) wiki.

Configuration

To configure and run fagin, you need to set the paths to your data in a configuration object. The default configuration can be generated with:

con <- config()

The default configuration will need to be tailored to your specific needs. To run the full fagin analysis, call:

# Where con is your configuration object
run_fagin(con)
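
Because the fields of the configuration object are not listed here, one way to find out what needs tailoring is to inspect the default object. The sketch below uses only `config()` from this README plus base R's `str()`; it assumes fagin is installed and otherwise does nothing.

```r
# Guarded so the snippet is harmless when fagin is not installed.
fagin_installed <- requireNamespace("fagin", quietly = TRUE)

if (fagin_installed) {
  con <- fagin::config()
  # Show the top-level structure of the default configuration so you can
  # see which entries hold paths that must point at your own data.
  str(con, max.level = 1)
  # After editing `con`, the full analysis is started with:
  # fagin::run_fagin(con)
}
```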

Pipeline

  • Identify target genes that overlap the search space.
  • Search the query protein against the overlapping target gene's ORFs.
  • Search the query gene DNA against the search interval DNA sequences.
  • Predict ancestor states.
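
The first step above, finding target genes that overlap a search interval, can be illustrated in base R. This is a toy sketch and not fagin's implementation: the data frames, column names, and `overlapping_genes` are hypothetical, and coordinates are assumed to be closed, 1-based intervals.

```r
# Toy target gene models (hypothetical data).
genes <- data.frame(
  gene  = c("g1", "g2", "g3"),
  chr   = c("chr1", "chr1", "chr2"),
  start = c(100, 500, 100),
  end   = c(200, 600, 300),
  stringsAsFactors = FALSE
)

# Toy search intervals on the target genome, one per query gene.
search_intervals <- data.frame(
  query = c("q1", "q2"),
  chr   = c("chr1", "chr2"),
  start = c(150, 400),
  end   = c(250, 800),
  stringsAsFactors = FALSE
)

# Pair every interval with every gene on the same chromosome, then keep
# the pairs whose coordinate ranges intersect.
overlapping_genes <- function(genes, intervals) {
  pairs <- merge(intervals, genes, by = "chr",
                 suffixes = c(".interval", ".gene"))
  hit <- pairs$start.gene <= pairs$end.interval &
         pairs$end.gene >= pairs$start.interval
  pairs[hit, c("query", "gene")]
}

hits <- overlapping_genes(genes, search_intervals)
print(hits)
```

Here only gene g1 overlaps the search interval of q1; q2's interval contains no gene, so it would move on to the DNA-level search. For real genomes, an interval-tree based method such as `GenomicRanges::findOverlaps` scales far better than this all-pairs merge.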

License: GNU General Public License v3.0