jdidion / atropos

An NGS read trimming tool that is specific, sensitive, and speedy. (production)

Support UMIs from a third input file

chapmanb opened this issue

Douglas and John,
Thanks for the recent work on adding UMI extraction support to atropos (#61). I'm looking forward to using this along with trimming. Would it be possible to support reading UMIs from a third file? We often have tagging strategies where bcl2fastq produces R1/R2/R3 with the UMI barcode in R2. In this strategy we'd extract the barcode from the R2 file and write R1 and R3 as the first and second read, respectively, with the UMI added to the read name.
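
To make the request concrete, here is a minimal Python sketch of the behavior described above. It is not atropos code and does not reflect how atropos implements this. The names `FastqRecord`, `read_fastq`, `add_umi_to_name`, and `tag_pairs_with_umi` are hypothetical, and the `:` separator and the choice to append the UMI to the read ID (the part of the header before the first space) are assumptions made for illustration.

```python
from itertools import islice, zip_longest
from typing import Iterable, Iterator, NamedTuple, TextIO, Tuple


class FastqRecord(NamedTuple):
    name: str        # header line without the leading "@"
    sequence: str
    qualities: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a plain four-lines-per-record FASTQ stream."""
    while True:
        lines = [line.rstrip("\n") for line in islice(handle, 4)]
        if not lines:
            return
        if len(lines) < 4 or not lines[0].startswith("@"):
            raise ValueError("malformed FASTQ record")
        yield FastqRecord(lines[0][1:], lines[1], lines[3])


def add_umi_to_name(name: str, umi: str, sep: str = ":") -> str:
    """Append the UMI to the read ID, keeping any header comment intact.

    The separator is an assumption; downstream tools may expect another one.
    """
    read_id, _, comment = name.partition(" ")
    tagged = f"{read_id}{sep}{umi}"
    return f"{tagged} {comment}" if comment else tagged


def tag_pairs_with_umi(
    r1: Iterable[FastqRecord],
    umi_reads: Iterable[FastqRecord],
    r3: Iterable[FastqRecord],
    sep: str = ":",
) -> Iterator[Tuple[FastqRecord, FastqRecord]]:
    """Yield (first read, second read) pairs with the UMI added to both names.

    r1 and r3 hold the biological reads; umi_reads holds the UMI-only reads
    (the R2 file in the layout described above).
    """
    for rec1, rec_umi, rec3 in zip_longest(r1, umi_reads, r3):
        if rec1 is None or rec_umi is None or rec3 is None:
            raise ValueError("input files have different numbers of records")
        umi = rec_umi.sequence
        yield (
            rec1._replace(name=add_umi_to_name(rec1.name, umi, sep)),
            rec3._replace(name=add_umi_to_name(rec3.name, umi, sep)),
        )
```

In this sketch each output pair could then go through paired-end trimming as usual, since R1 and R3 stay together and only their names change.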

We've worked around this with fastp, using separate tagging runs for R1/R2 and R3/R2, but then we lose the ability to do paired-end trimming in the same run. Ideally we'd like to be able to do quality trimming, polyG trimming, and UMI tagging in a single run.

Thanks much for considering this.

Thanks, Brad. One question: the intention from v1.2 onward is to require Python 3.6, because I'm looking to make some other improvements as well (use of type annotations, use of xphyle for file management). UMI features are slated for 1.2, but if the Python 3.6 requirement makes this untenable for you, we can look at backporting those features to the 1.1.x branch. We definitely want to support use of Atropos in bcbio.

John, thanks for considering this. Moving to Python 3.6 only isn't a problem at all. bcbio still isn't Python 3 compatible (I know, boo), but we install atropos in a separate anaconda environment with Python 3, and that has 3.6.2 now, so it would work without any issues. Thanks again.

Now implemented in the develop branch.