lch14forever / shotgunmetagenomics-nf

Shotgun Metagenomics Pipeline

Description

This is a Nextflow re-implementation of the original pipeline used by Computational and Systems Biology Group 5 (CSB5) at the Genome Institute of Singapore (GIS).

Chinese documentation

Development plan

  • Add customized HUMAnN2 to a conda channel
  • Add nf-core style documentation

Features

  • The new DSL2 syntax for pipeline modularity and reusability
  • Dockerfile for each software (all containers can be found at DockerHub)
  • Conda recipe for each software/step
  • Configuration for local execution (server), GIS HPC (using the SGE scheduler), AWS Batch and an AWS auto-scaling cluster

Dependencies

Main pipeline

  • Nextflow
  • Java Runtime Environment >= 1.8
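
Before launching the pipeline, it can help to confirm that both tools are on the PATH. The following is a minimal sketch and not part of the repository: the helper name check_dep is hypothetical, and it only tests that the commands exist, not that Java meets the 1.8 minimum.

```shell
# check_dep is a hypothetical helper: it returns 0 if the command is on PATH.
check_dep() {
    command -v "$1" >/dev/null 2>&1
}

missing=0
for tool in nextflow java; do
    if check_dep "$tool"; then
        echo "found: $tool"
    else
        echo "missing: $tool" >&2
        missing=$((missing + 1))
    fi
done

# Presence only; confirm the Java version (>= 1.8) by hand with `java -version`.
echo "missing dependencies: $missing"
```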

Quality control and host DNA decontamination

  • Fastp (>=0.20.0): Adapter trimming, low-quality base trimming
  • BWA (>=0.7.17): Host DNA removal
  • Samtools (>=1.7): Host DNA removal

Reference-based analysis

Setup and configuration

Usage

Run with docker

$ shotgunmetagenomics-nf/main.nf -profile docker --read_path PATH_TO_READS

Run on AWS Batch (AWS Batch configuration tutorial)

  • IAM configuration (set environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION)
  • Batch compute environment & job queue
  • Customized AMI (AWS ECS-optimized Linux with awscli installed via miniconda)
$ shotgunmetagenomics-nf/main.nf -profile awsbatch --awsqueue AWSBATCH_QUEUE --awsregion AWS_REGION --bucket-dir S3_BUCKET --outdir S3_BUCKET 
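
The IAM step in the list above amounts to exporting three environment variables before the awsbatch command is run. A minimal sketch, assuming placeholder credentials and an example region (ap-southeast-1 is an assumption, not a value taken from the pipeline):

```shell
# Placeholder values: replace with your own credentials and region.
export AWS_ACCESS_KEY_ID="YOUR_ACCESS_KEY_ID"
export AWS_SECRET_ACCESS_KEY="YOUR_SECRET_ACCESS_KEY"
export AWS_DEFAULT_REGION="ap-southeast-1"   # assumed example region

# Report any variable that is unset or empty before starting a run.
aws_env_ok=1
for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_DEFAULT_REGION; do
    if [ -z "$(printenv "$var")" ]; then
        echo "Missing required variable: $var" >&2
        aws_env_ok=0
    fi
done

if [ "$aws_env_ok" -eq 1 ]; then
    echo "AWS environment variables are set"
fi
```

In practice the credentials would usually come from a shell profile or a secrets manager rather than being typed into a script.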

You can specify multiple profiles separated by commas, e.g. -profile docker,test.

Run multiple profilers

$ shotgunmetagenomics-nf/main.nf -profile gis --profilers kraken2,metaphlan2 --read_path PATH_TO_READS

Use cases

  • Chng et al. Whole metagenome profiling reveals skin microbiome dependent susceptibility to atopic dermatitis flares. Nature Microbiology (2016)
  • Nandi et al. Gut microbiome recovery after antibiotic usage is mediated by specific bacterial species. bioRxiv (2018)
  • Chng et al. Cartography of opportunistic pathogens and antibiotic resistance genes in a tertiary hospital environment. bioRxiv (2019)

Adding a module

  1. Write a module and put it into modules/
  2. Add to the main script main.nf
  3. Modify the configuration file conf/base.config to add the required resources (GIS users should also modify conf/gis.config to set the specific conda environment)
  4. Add conda and docker files for the new module
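
Step 1 of the list above can be sketched as a small scaffold script. This is an illustration under stated assumptions, not the repository's own tooling: the module name example_tool, the process name EXAMPLE_TOOL, the command it calls and its output file name are all hypothetical, and the process body is a generic DSL2 stub rather than a copy of an existing module.

```shell
# REPO_DIR should point at a checkout of the pipeline; it defaults to a
# scratch directory so the sketch can be tried without touching real files.
REPO_DIR="${REPO_DIR:-$(mktemp -d)}"
MODULE_NAME="example_tool"   # hypothetical module name

mkdir -p "$REPO_DIR/modules"

# Step 1: write a DSL2 process stub into modules/.
# The quoted 'EOF' keeps the shell from expanding the Nextflow variables.
cat > "$REPO_DIR/modules/$MODULE_NAME.nf" <<'EOF'
// Hypothetical module: example_tool and its output name are placeholders.
process EXAMPLE_TOOL {
    input:
    tuple val(sample_id), path(reads)

    output:
    tuple val(sample_id), path("${sample_id}.example.txt")

    script:
    """
    example_tool ${reads} > ${sample_id}.example.txt
    """
}
EOF

# Steps 2-4 remain manual edits.
echo "created $REPO_DIR/modules/$MODULE_NAME.nf"
echo "next: include it in main.nf, add resources in conf/base.config,"
echo "      and add conda and docker files for the module"
```

Set REPO_DIR to the repository root before running it to create the stub inside a real checkout.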

Contact

Chenhao Li: lich@gis.a-star.edu.sg, lichenhao.sg@gmail.com

About

License: MIT License

Languages: Python 72.5%, Nextflow 25.5%, Shell 1.7%, Dockerfile 0.4%