SRAtac

A pipeline for ATAC-seq data analysis built on SRAlign.

Introduction

SRAtac is a Nextflow pipeline for processing ATAC-seq data. It is designed to be highly flexible, and its goal is end-to-end processing of ATAC-seq samples with extensive QC at every step.

Pipeline overview

1. Trim reads
2. QC of reads
    1. Raw reads FastQC
    2. Trimmed reads FastQC
    3. Summary MultiQC
3. Align reads
    1. Align to reference genome/transcriptome
    2. Check contamination
4. Preprocess alignments
    1. Mark duplicates
    2. Compress SAM to BAM
    3. Index BAM
5. QC of alignments
    1. Samtools stats
    2. Samtools index stats
    3. Percent duplicates
    4. Percent aligned to contamination reference
    5. Summary MultiQC
6. Library complexity and reproducibility
    1. Preseq library complexity
    2. deepTools correlation
    3. deepTools PCA
7. Full pipeline MultiQC
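
To make the alignment preprocessing and QC steps more concrete, the sketch below prints the kind of samtools commands that correspond to SAM-to-BAM compression, BAM indexing, and the two samtools statistics steps. It is only an illustration, not the commands SRAtac actually runs: the function name `alignment_qc_cmds` and the file naming scheme are hypothetical, and the coordinate sort is an assumption added here because `samtools index` requires a coordinate-sorted BAM.

```shell
#!/usr/bin/env bash
# Illustrative dry run: print (do not execute) samtools commands for one sample.
# The function name and the <sample>.* file names are hypothetical.
alignment_qc_cmds() {
    local sample="$1"
    # Compress SAM to BAM; sorting by coordinate is assumed so the BAM can be indexed.
    echo "samtools sort -O bam -o ${sample}.bam ${sample}.sam"
    # Index the sorted BAM.
    echo "samtools index ${sample}.bam"
    # General alignment statistics.
    echo "samtools stats ${sample}.bam > ${sample}.stats.txt"
    # Per-reference read counts taken from the BAM index.
    echo "samtools idxstats ${sample}.bam > ${sample}.idxstats.txt"
}

alignment_qc_cmds sampleA
```

Printing the commands rather than running them keeps the sketch usable without samtools installed; piping the output to `bash` would execute them on real files.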

Quick start

Install Nextflow
Install Docker
Download SRAtac:
```
nextflow pull trev-f/SRAtac
```

Run SRAtac in test mode:

```
nextflow run trev-f/SRAtac -profile test
```

Run your analysis:

```
nextflow run trev-f/SRAtac -profile docker --input <input.csv> --genome <valid genome key>
```
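
Before running any of the commands above, it can help to confirm that Nextflow and Docker are both available on `PATH`. The following is a minimal sketch of such a check; the helper name `missing_tools` is hypothetical and is not part of SRAtac.

```shell
#!/usr/bin/env bash
# Print the name of each listed tool that cannot be found on PATH.
# `missing_tools` is a hypothetical helper, not something SRAtac provides.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing="$(missing_tools nextflow docker)"
if [ -n "$missing" ]; then
    # Unquoted on purpose so multiple names print on one line.
    echo "Missing prerequisites:" $missing
else
    echo "All prerequisites found."
fi
```

This only checks that the executables exist; it does not verify versions or that the Docker daemon is running.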

Detailed documentation can be found in docs and usage.

About

A pipeline for ATAC-seq preprocessing, alignment, peak calling, and read counting. Built on the SRAlign pipeline.

MIT License
