SNP-Calling

GATK Variant calling pipeline for genomic data using Nextflow

Quickstart

Install Nextflow using the following command:

$ curl -s https://get.nextflow.io | bash

Index reference genome:

$ bwa index /path/to/reference/genome.fa

$ samtools faidx /path/to/reference/genome.fa

$ gatk CreateSequenceDictionary -R /path/to/reference/genome.fa -O /path/to/reference/genome.dict
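
Before launching the pipeline, it can help to confirm that the indexing commands above produced their files. The following is a minimal sketch, not part of the pipeline: `missing_index_files` is a hypothetical helper, and it assumes the sequence dictionary sits beside the FASTA file as `genome.dict`.

```shell
# Hypothetical helper: print every index file that is missing for a reference.
missing_index_files() {
    local ref="$1"
    local ext f
    # Files written by `bwa index` (.amb .ann .bwt .pac .sa)
    # and by `samtools faidx` (.fai)
    for ext in amb ann bwt pac sa fai; do
        f="${ref}.${ext}"
        [ -e "$f" ] || printf '%s\n' "$f"
    done
    # Sequence dictionary; ASSUMED to sit beside the FASTA as <basename>.dict
    f="${ref%.*}.dict"
    [ -e "$f" ] || printf '%s\n' "$f"
}
```

Running `missing_index_files /path/to/reference/genome.fa` prints nothing when every expected file is present.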

Launch the pipeline execution with the following command:

$ nextflow run jdetras/snp-calling -r main -profile docker

Pipeline Description

The variant calling pipeline follows the practices recommended by GATK. Input reads are aligned to a reference genome using BWA, the alignment files are processed using Picard Tools, and variants are called using samtools and GATK.

Input files

The input files required to run the pipeline:

  • Paired-end genomic sequence reads, *_{1,2}.fq.gz
  • Reference genome, *.fa
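
Because the reads are expected in pairs, a quick check for samples that lack a mate file can save a failed run. This is a sketch only: `unpaired_samples` is a hypothetical helper, and it assumes all read files share one directory and follow the `_1.fq.gz` / `_2.fq.gz` naming shown above. It reports only `_1` files without a matching `_2` file, not the reverse.

```shell
# Hypothetical helper: list sample IDs whose *_1.fq.gz has no *_2.fq.gz mate.
unpaired_samples() {
    local dir="$1"
    local r1 sample
    for r1 in "$dir"/*_1.fq.gz; do
        # If the glob matched nothing, the literal pattern is left; skip it
        [ -e "$r1" ] || continue
        # Strip the directory and the _1.fq.gz suffix to get the sample ID
        sample="$(basename "$r1" _1.fq.gz)"
        [ -e "$dir/${sample}_2.fq.gz" ] || printf '%s\n' "$sample"
    done
}
```

For example, `unpaired_samples /path/to/reads` prints one sample ID per line and prints nothing when every forward read has its mate.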

Pipeline parameters

Usage

Usage: nextflow run jdetras/snp-calling -profile docker [options]

Options:

  • --reads
  • --genome
  • --output

Example: $ nextflow run jdetras/snp-calling -profile docker --reads '/path/to/reads/*_{1,2}.fq.gz' --genome '/path/to/reference/genome.fa' --output '/path/to/output'

--reads

  • The path to the FASTQ read files.
  • Wildcards (*, ?) can be used to match multiple read files. Enclose the path in single quotes whenever it contains wildcards.
  • Default parameter: $projectDir/data/reads/*_{1,2}.fq.gz
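
The reason for the single quotes is that an unquoted pattern is expanded by the shell before the command ever sees it, so --reads would receive only the first matching path. The sketch below illustrates this with a hypothetical `count_args` helper and two made-up file names in a temporary directory; it uses the `?` wildcard so that it behaves the same in any POSIX shell.

```shell
# Hypothetical helper: report how many arguments a command actually receives.
count_args() {
    printf '%s\n' "$#"
}

# Made-up demo files in a scratch directory
demo_dir="$(mktemp -d)"
touch "$demo_dir/s1_1.fq.gz" "$demo_dir/s1_2.fq.gz"
cd "$demo_dir"

count_args *_?.fq.gz      # unquoted: the shell expands the pattern; prints 2
count_args '*_?.fq.gz'    # quoted: the pattern arrives as one string; prints 1
```

With the quotes in place, the pattern reaches the pipeline intact, and the matching of read files is left to the pipeline rather than to the shell.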

Example: $ nextflow run jdetras/snp-calling -profile docker --reads '/path/to/reads/*_{1,2}.fq.gz'

--genome

  • The path to the reference genome file in FASTA format.
  • The extension is .fa.
  • Default parameter: $projectDir/data/reference/genome.fa

Example: $ nextflow run jdetras/snp-calling -profile docker --genome '/path/to/reference/genome.fa'

--output

  • The path to the directory for the output files.
  • Default parameter: $projectDir/output

Software

License

MIT License