snakemake-workflows / docs

Documentation of the Snakemake-Workflows project

single-end, paired-end, singletons and merged

SilasK opened this issue

Hey, I'm currently developing the metagenomic pipeline ATLAS. I'm trying to make it more compliant with your guidelines, and maybe pack some parts into wrappers. One problem I'm facing is that you don't know in advance whether the user is using single-end or paired-end reads.
In addition, through quality filtering you might end up with reads that lost their mate (singletons). If you don't want to lose them, you will have three files instead of the initial two. If you merge paired-end reads, you also end up with additional read files that don't have the same length distribution.
If you want to keep them separate, you might end up with 4 files for the same reads.

It seems that most wrappers are not made to handle a varying number of read files, nor to distinguish between the different kinds. Any idea how to solve this issue?

In the ATLAS pipeline, we solved the issue by checking at the beginning whether the sample is single-end or paired-end and then using input functions.
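To illustrate the idea, here is a minimal sketch of such an input function in plain Python. It is not the actual ATLAS code: the `SAMPLES` table, the file paths, and the names `is_paired` and `get_raw_reads` are made up for this example. The only Snakemake convention it relies on is that an input function receives a `wildcards` object with attribute access.

```python
# Hypothetical sample table: sample name -> list of raw FASTQ files.
# In a real workflow this would typically come from a config or sample sheet.
SAMPLES = {
    "sampleA": ["raw/sampleA_R1.fastq.gz", "raw/sampleA_R2.fastq.gz"],
    "sampleB": ["raw/sampleB.fastq.gz"],
}


def is_paired(sample):
    """Return True for paired-end samples, False for single-end ones."""
    n_files = len(SAMPLES[sample])
    if n_files not in (1, 2):
        raise ValueError(
            f"sample {sample!r} has {n_files} files, expected 1 or 2"
        )
    return n_files == 2


def get_raw_reads(wildcards):
    """Input function: return one or two files depending on the sample."""
    is_paired(wildcards.sample)  # validates the file count up front
    return SAMPLES[wildcards.sample]


# Assumed usage inside a Snakefile (rule name is illustrative):
#
# rule quality_filter:
#     input:
#         get_raw_reads
```

The rule itself then stays the same for both layouts, and only the function decides how many files it gets.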

Testing the number of input files inside the wrapper is a good solution; see, for example:
https://snakemake-wrappers.readthedocs.io/en/stable/wrappers/bwa/mem.html
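A rough sketch of what such a check could look like is below. This is not the code of the linked wrapper, just an illustration under the assumption that the wrapper gets its reads as either a single path or a list of paths; the function name `classify_reads` is hypothetical.

```python
def classify_reads(reads):
    """Decide between single-end and paired-end from the number of files.

    Returns a (mode, files) tuple, where mode is "single-end" or "paired-end".
    """
    # A single input file may arrive as a bare string rather than a list.
    if isinstance(reads, str):
        reads = [reads]
    reads = list(reads)

    if len(reads) == 1:
        return "single-end", reads
    if len(reads) == 2:
        return "paired-end", reads
    raise ValueError(
        f"expected 1 (single-end) or 2 (paired-end) read files, got {len(reads)}"
    )


# Inside a wrapper script this would presumably be called on the rule's
# input, e.g. classify_reads(snakemake.input.reads), assuming the rule
# names its input "reads".
```

Note that counting files alone cannot tell singletons or merged reads apart from ordinary single-end reads, so those cases would still need to be distinguished some other way, for example by separate named inputs.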