Ulthran / GenusFinder

Given a bacterial 16S gene, infer the genus by placing it on a tree of similar sequences

GenusFinder

Given a bacterial 16S gene, infer the genus by placing it on a tree of similar sequences

Installation

GenusFinder should be run on a computer with at least XX RAM and XX CPUs. The release version is not published yet.

To install the development version of GenusFinder:

git clone https://github.com/Ulthran/GenusFinder.git
cd GenusFinder
conda env create --file genusfinder_env.yaml
conda activate genusfinder
pip install .

Running

idgenus --seq ATCGATCGATCGATCG...GCTACTATACGA

Using an NCBI API key speeds up creation of the 16S database. You can get one for free from NCBI.

idgenus --seq ATCGATCGATCGATCG...GCTACTATACGA --ncbi_api_key XXXXXXXXXXXXXXXXXXXXX
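
When idgenus is called from a script, the two invocations above can be assembled programmatically. The following is a minimal sketch and not part of GenusFinder: build_idgenus_command and run_idgenus are hypothetical helpers, and the nucleotide check is an assumption about acceptable input (whether idgenus itself accepts N or other ambiguity codes is not stated here). Only the --seq and --ncbi_api_key flags shown above are used.

```python
import subprocess

# Assumption: a query is plain DNA (A, C, G, T) plus N for unknown bases.
VALID_BASES = set("ACGTN")


def build_idgenus_command(seq, ncbi_api_key=None):
    """Return the idgenus argument list for a query sequence (hypothetical helper)."""
    seq = seq.strip().upper()
    if not seq or set(seq) - VALID_BASES:
        raise ValueError("query must be a non-empty DNA sequence")
    command = ["idgenus", "--seq", seq]
    if ncbi_api_key:
        # The key is optional; it only speeds up building the 16S database.
        command += ["--ncbi_api_key", ncbi_api_key]
    return command


def run_idgenus(seq, ncbi_api_key=None):
    """Run idgenus and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_idgenus_command(seq, ncbi_api_key), check=True)
```

Passing the arguments as a list avoids shell quoting problems with long sequences and keeps the API key out of any shell string interpolation.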

Note: The LTP alignment file (used in the full tree method only) takes up

Steps

NEW

  • Fetch LTP tree and alignment
  • Align query sequence with LTP alignment
  • Add the query sequence to the LTP tree using the combined alignment
  • List the genera near the query and train curves on those genera
  • Apply the curves to the tree to get the probability that the query belongs to each of those genera
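
The "list nearby genera" step can be illustrated on a toy tree. This is a sketch under stated assumptions, not GenusFinder's implementation: the edge-list tree format, the leaf names, the max_distance cutoff, and the functions leaf_distances and nearby_genera are all hypothetical, and "nearby" is taken to mean patristic distance (the sum of branch lengths along the path between two leaves). The curve training and probability steps are not covered because their form is not specified here.

```python
from collections import defaultdict


def leaf_distances(edges, start):
    """Sum branch lengths along the unique tree path from `start` to every node."""
    adjacency = defaultdict(dict)
    for a, b, length in edges:
        adjacency[a][b] = length
        adjacency[b][a] = length
    distances = {start: 0.0}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor, length in adjacency[node].items():
            if neighbor not in distances:
                distances[neighbor] = distances[node] + length
                stack.append(neighbor)
    return distances


def nearby_genera(edges, leaf_genus, query, max_distance):
    """Return (genus, closest distance) pairs within max_distance, nearest first."""
    distances = leaf_distances(edges, query)
    nearest = {}
    for leaf, genus in leaf_genus.items():
        d = distances.get(leaf)
        if d is not None and d <= max_distance:
            nearest[genus] = min(d, nearest.get(genus, d))
    return sorted(nearest.items(), key=lambda item: item[1])


# Toy tree as (node, node, branch length) edges; all values are made up.
edges = [
    ("query", "n1", 0.01),
    ("Escherichia_coli", "n1", 0.02),
    ("n1", "n2", 0.03),
    ("Shigella_flexneri", "n2", 0.02),
    ("n2", "root", 0.10),
    ("Bacillus_subtilis", "root", 0.20),
]
leaf_genus = {
    "Escherichia_coli": "Escherichia",
    "Shigella_flexneri": "Shigella",
    "Bacillus_subtilis": "Bacillus",
}
result = nearby_genera(edges, leaf_genus, "query", max_distance=0.1)
print([genus for genus, _ in result])  # ['Escherichia', 'Shigella']
```

In a real run the tree would come from the LTP tree with the query added, and the cutoff would need to be chosen from the data rather than fixed by hand.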

OLD

  • Fetch all type species 16S sequences from Tree of Life
  • Search this database for XX similar sequences
  • Build tree including the query
  • Examine nearest neighbors (subtrees snipped at high confidence intervals) for common genus
  • Examine nearest neighbors (non-topological distance between nodes) for common genus
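
The last step of the old method, finding a common genus among the nearest neighbors, could look roughly like the following. This is a hedged sketch only: consensus_genus, the choice of k, and the simple majority vote are assumptions made for illustration, and the neighbor distances are invented.

```python
from collections import Counter


def consensus_genus(neighbors, k=5):
    """Return (genus, vote fraction) for the most common genus among the k closest neighbors.

    `neighbors` is an iterable of (genus, distance) pairs.
    """
    closest = sorted(neighbors, key=lambda pair: pair[1])[:k]
    if not closest:
        return None, 0.0
    counts = Counter(genus for genus, _ in closest)
    genus, votes = counts.most_common(1)[0]
    return genus, votes / len(closest)


# Invented distances from the query to reference sequences.
neighbors = [
    ("Escherichia", 0.02),
    ("Escherichia", 0.03),
    ("Shigella", 0.04),
    ("Escherichia", 0.05),
    ("Salmonella", 0.09),
    ("Bacillus", 0.30),
]
print(consensus_genus(neighbors, k=5))  # ('Escherichia', 0.6)
```

A low vote fraction signals that the neighbors disagree, in which case a genus call from this step alone would be weak.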
