danknights / SHOGUN

SHallow shOtGUN profiler

Shallow shotgun sequencing

Shallow seq pipeline for optimal shotgun data usage

Installation

These installation instructions are written for Linux and macOS. SHOGUN can also be installed on Windows with a few minor tweaks to this tutorial. The package requires Anaconda, a system-agnostic package and virtual environment manager. Follow the installation instructions for your system at http://conda.pydata.org/miniconda.html.

The Easy Way

Once anaconda is installed, get the environment file:

wget https://raw.githubusercontent.com/knights-lab/SHOGUN/master/environment.yml

Then install the requirements into the environment 'shogun':

conda env create -f environment.yml

The Harder Way

Once anaconda is installed, create an environment:

conda create -n shogun python=3

Now activate the environment.

# OSX, Linux
source activate shogun
# On conda 4.4 or later, the equivalent is:
# conda activate shogun

With the shogun environment activated, install the development version of the SHOGUN toolchain.

# If you want to use bowtie2
conda install -c bioconda bowtie2

# NINJA-utils
pip install git+https://github.com/knights-lab/NINJA-utils.git --no-cache-dir --upgrade

# DOJO
pip install git+https://github.com/knights-lab/DOJO.git --no-cache-dir --upgrade

# SHOGUN
pip install git+https://github.com/knights-lab/SHOGUN.git --no-cache-dir --upgrade

Because of the --no-cache-dir and --upgrade flags passed to pip, you can re-run any of these commands to redo the installation if a previous attempt failed.
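One way to confirm that the installation worked is to check that the SHOGUN commands used later in this README are on your PATH. The following is a minimal sketch, assuming each command is installed as a console script inside the activated shogun environment; the helper name missing_commands is hypothetical and not part of SHOGUN.

```shell
# Hypothetical helper: print the names of any commands that are not on PATH.
missing_commands() {
    missing=""
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || missing="$missing $cmd"
    done
    # Trim the leading space before printing.
    echo "${missing# }"
}

# Command names are taken from the examples in this README.
not_found=$(missing_commands shogun_bt2_db shogun_bt2_lca \
    shogun_utree_db shogun_utree_lca shogun_functional)

if [ -z "$not_found" ]; then
    echo "All SHOGUN commands found."
else
    echo "Not found on PATH: $not_found"
fi
```

If any command is reported as missing, make sure the shogun environment is activated and re-run the matching pip command above.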

If you are installing SHOGUN for BugBase, you are done. The database is provided for you.

Building a Database

Next, to test the installation, download the test data.

wget -O shogun_test_files.zip 'https://www.dropbox.com/s/b5w4xe08x7snm93/shogun_test_files.zip?dl=1'

Extract the folder using your favorite extraction utility.

7z x <downloaded file>
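Before building the database, it can help to confirm that the extracted files are where the next commands expect them. This is a sketch only: it assumes the archive contains test.hmp_species.fna and a mock_communities directory (the paths used in the commands below) and that you run the check from the directory holding them. The function name check_test_data and the TEST_DATA_DIR variable are hypothetical.

```shell
# Hypothetical helper: verify the inputs referenced by the commands in this README.
check_test_data() {
    dir="${1:-.}"
    rc=0
    if [ ! -f "$dir/test.hmp_species.fna" ]; then
        echo "missing file: $dir/test.hmp_species.fna"
        rc=1
    fi
    if [ ! -d "$dir/mock_communities" ]; then
        echo "missing directory: $dir/mock_communities"
        rc=1
    fi
    return "$rc"
}

# TEST_DATA_DIR is an illustrative variable; it defaults to the current directory.
if check_test_data "${TEST_DATA_DIR:-.}"; then
    echo "Test data looks complete."
else
    echo "Extract the archive here (or set TEST_DATA_DIR) before building the database."
fi
```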

Next, create the database.

shogun_bt2_db -i ./test.hmp_species.fna -x '>, '

This will take some time because the DOJO software lazily loads the NCBI Taxonomy. Once the database is built, run LCA with bowtie2:

shogun_bt2_lca -i ./mock_communities -b ./annotated/bt2/test.hmp_species

The taxonomy counts will be written to taxon_counts.csv 🐱‍👤
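To take a quick look at the result, you can preview the first few lines of the file and count its lines. This is a minimal sketch that makes no assumption about the column layout of taxon_counts.csv; it does assume the file is in the current directory, which may not match where SHOGUN writes it. The helper name summarize_csv is hypothetical.

```shell
# Hypothetical helper: print the first N lines of a CSV and its total line count.
summarize_csv() {
    file="$1"
    lines="${2:-5}"
    if [ ! -f "$file" ]; then
        echo "No such file: $file" >&2
        return 1
    fi
    head -n "$lines" "$file"
    # tr strips the padding that some wc implementations add.
    echo "total lines: $(wc -l < "$file" | tr -d ' ')"
}

# Adjust the path if taxon_counts.csv was written somewhere else.
summarize_csv taxon_counts.csv || echo "Run shogun_bt2_lca first to produce taxon_counts.csv."
```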

To run it with UTree, first build the UTree database:

shogun_utree_db -i ./test.hmp_species.fna -x '>, '

Then run LCA:

shogun_utree_lca -i ./mock_communities -u ./annotated/utree/test.hmp_species.ctr

Introduction to Functional Profiling

As of 1/10/17, functional profiling is supported only through bowtie2 and the IMG database.

# Align reads to IMG
# Input directory has one FASTA (.fna extension) file per sample
shogun_functional -i <input directory> -o <output directory> -l False -b /project/flatiron/tonya/img_bowtie_builds/img.gene.bacteria.bowtie

# Input is a folder filled with SAM files, one SAM file per sample
kegg_parse_img_ids -i <output folder from shogun_functional> -o <location of the kegg.csv file>

# Input file is the KEGG csv file from the kegg_parse_img_ids
# -m is the mapping file from IMG gene IDs to KOs (img-gene-ko-map.txt)
kegg_predictions -i <kegg.csv file> -o <final output> --algorithm intersection -m /project/flatiron/data/img/img-gene-ko-map.txt
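Since shogun_functional expects one FASTA file with the .fna extension per sample, a quick pre-flight check on the input directory can catch misnamed files before a long alignment run. The following sketch only counts files and does not call SHOGUN; the helper name count_fna and the INPUT_DIR variable are hypothetical.

```shell
# Hypothetical helper: count regular files ending in .fna directly inside a directory.
count_fna() {
    n=0
    for f in "$1"/*.fna; do
        # The -f test also skips the unexpanded pattern when nothing matches.
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# INPUT_DIR is an illustrative variable; it defaults to the current directory.
input_dir="${INPUT_DIR:-.}"
echo "Found $(count_fna "$input_dir") .fna file(s) in $input_dir"
```

A count of zero usually means the samples use a different extension (for example .fa or .fasta) and need renaming before they are passed to -i.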

License: GNU Affero General Public License v3.0