HGTector2: Genome-wide prediction of horizontal gene transfer based on distribution of sequence homology patterns.

HGTector2

Development of HGTector has moved to qiyunlab. Versions starting from 2.0b3 are released from the new repository. Please access HGTector using the new URL: https://github.com/qiyunlab/HGTector.

HGTector2 is a completely re-engineered software tool. It features a fully automated analytical pipeline with smart determination of parameters that requires minimal human involvement, a redesigned command-line interface that facilitates standardized scientific computing, and a high-quality Python 3 codebase.

HGTector is a computational pipeline for genome-wide detection of putative horizontal gene transfer (HGT) events based on sequence homology search hit distribution statistics.

Documentation

What's New

Installation

Tutorials

References

Quick start

Set up a Conda environment and install dependencies:

conda create -n hgtector python=3 pyyaml pandas matplotlib scikit-learn bioconda::diamond
conda activate hgtector

Install HGTector2:

pip install git+https://github.com/qiyunlab/HGTector.git

Build a reference database using the default protocol:

hgtector database -o db_dir --default

This will retrieve the latest genomic data from NCBI. If this does not work (e.g., due to network issues), or you need some customization, please read the database page.

Prepare input file(s). They should be multi-FASTA files of amino acid sequences (.faa). Each file represents the whole protein set of a complete or partial genome.
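Because the input must contain amino acid rather than nucleotide sequences, a quick sanity check before searching can save a wasted run. The following is a minimal sketch, not part of HGTector: the helper names (`read_fasta`, `looks_like_protein`, `check_faa`) and the 0.9 threshold are hypothetical choices made for illustration. It flags records whose residues are almost entirely A/C/G/T/U/N.

```python
# Hypothetical helper, not part of HGTector: a heuristic check that a
# multi-FASTA input holds protein rather than nucleotide sequences.

NUCLEOTIDE_CHARS = set("ACGTUN")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def looks_like_protein(seq, max_nucl_fraction=0.9):
    """Return False if the sequence is empty or almost purely A/C/G/T/U/N."""
    seq = seq.upper().rstrip("*")  # ignore a trailing stop-codon marker
    if not seq:
        return False
    nucl = sum(1 for c in seq if c in NUCLEOTIDE_CHARS)
    return nucl / len(seq) < max_nucl_fraction


def check_faa(lines):
    """Return (record count, headers of records that look like nucleotides)."""
    count, suspicious = 0, []
    for header, seq in read_fasta(lines):
        count += 1
        if not looks_like_protein(seq):
            suspicious.append(header)
    return count, suspicious


example = [
    ">prot1 hypothetical protein",
    "MKTAYIAKQR",
    "QISFVKSHFS",
    ">gene2",
    "ATGCGTACGTTAGC",
]
print(check_faa(example))  # (2, ['gene2'])
```

For a real file, pass an open file handle instead of a list, e.g. `check_faa(open("input.faa"))`. The check is only a heuristic: short or low-complexity proteins could in principle be flagged, so treat the result as a hint rather than a verdict.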

Perform homology search:

hgtector search -i input.faa -o search_dir -m diamond -p 16 -d db_dir/diamond/db -t db_dir/taxdump

Perform HGT prediction:

hgtector analyze -i search_dir -o analyze_dir -t db_dir/taxdump

Examine the prediction results under the analyze_dir directory.
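When processing several genomes, the search and analyze steps above can be scripted. The sketch below is one possible way to do so and is not part of HGTector: `build_commands` and `run_pipeline` are hypothetical names, and it assumes that both steps use the `taxdump` folder inside the database directory built earlier. It only uses the command-line options shown in the commands above.

```python
# Hypothetical wrapper, not part of HGTector: assemble the two quick-start
# commands from a single database directory so the paths stay consistent.
import shlex
import subprocess


def build_commands(faa, db_dir, search_dir="search_dir",
                   analyze_dir="analyze_dir", threads=16):
    """Return (search_cmd, analyze_cmd) as argument lists."""
    db_dir = str(db_dir).rstrip("/")
    # Assumption: the taxonomy dump lives under the database directory.
    taxdump = f"{db_dir}/taxdump"
    search = [
        "hgtector", "search",
        "-i", str(faa),
        "-o", search_dir,
        "-m", "diamond",
        "-p", str(threads),
        "-d", f"{db_dir}/diamond/db",
        "-t", taxdump,
    ]
    analyze = [
        "hgtector", "analyze",
        "-i", search_dir,
        "-o", analyze_dir,
        "-t", taxdump,
    ]
    return search, analyze


def run_pipeline(faa, db_dir, dry_run=True, **kwargs):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands(faa, db_dir, **kwargs):
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline if a step exits with an error.
            subprocess.run(cmd, check=True)


# Dry run: shows the commands without requiring HGTector to be installed.
run_pipeline("input.faa", "db_dir")
```

Deriving both paths from one `db_dir` argument avoids pointing the two steps at different taxonomy directories. Set `dry_run=False` inside the activated `hgtector` Conda environment to execute the commands.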

It is recommended that you read the first run, second run and real runs pages to get familiar with the pipeline, the underlying methodology, and the customization options.

License

Copyright (c) 2013-2020, Qiyun Zhu and Katharina Dittmar. Licensed under BSD 3-clause. See full license statement.

Citation

Zhu Q, Kosoy M, Dittmar K. HGTector: an automated method facilitating genome-wide discovery of putative horizontal gene transfers. BMC Genomics. 2014. 15:717.
