DReichLab/adna

aDNA is a set of tools to process ancient DNA data, in particular data produced with the ReichLab protocol.

aDNA currently consists of two tools, adna-trim and adna-ldup. adna-trim trims sequencing adapters, checks inline barcodes, and merges overlapping read ends in a single pass. A typical adna-trim command line is:

seqtk mergepe R1.fq.gz R2.fq.gz | adna-trim -p out-pe -b barcode.txt - | gzip -1 > out-se.fq.gz

where seqtk mergepe produces an interleaved FASTQ stream and barcode.txt lists the read 1 and read 2 barcodes (see example.bc for an example). Read pairs that cannot be merged unambiguously are written to out-pe.R1.fq.gz and out-pe.R2.fq.gz; merged reads are written to standard output. Inline barcodes are appended to read names.
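To make the idea of merging overlapping ends concrete, here is a minimal Python sketch. It is not adna-trim's algorithm: it assumes adapters are already removed, requires an exact match, and treats a pair as mergeable only when exactly one overlap length fits. The names `merge_pair` and `min_overlap` are hypothetical and chosen for illustration.

```python
# Illustrative only: a toy overlap merger, NOT adna-trim's implementation.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(read1, read2, min_overlap=10):
    """Merge a read pair if exactly one exact overlap exists.

    read2 is reverse-complemented so both reads are on the same strand.
    Returns the merged sequence, or None when there is no overlap of at
    least min_overlap bases or when several overlap lengths fit
    (an ambiguous merge). min_overlap=10 is an arbitrary assumption.
    """
    rc2 = reverse_complement(read2)
    max_len = min(len(read1), len(rc2))
    hits = [
        k
        for k in range(min_overlap, max_len + 1)
        if read1[-k:] == rc2[:k]  # suffix of read1 equals prefix of rc2
    ]
    if len(hits) != 1:
        return None
    k = hits[0]
    return read1 + rc2[k:]
```

In this sketch a `None` result corresponds to a pair that would stay paired-end, while a returned string corresponds to a merged single-end read. A production merger would also need to tolerate mismatches and use base qualities.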

adna-ldup marks potential PCR duplicates. It works for both single-end and paired-end reads. Unlike typical single-end duplicate markers, adna-ldup checks both start and end positions and is barcode-aware. To use adna-ldup:

adna-ldup sorted.bam > sorted-marked.bam
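The sketch below illustrates why checking both ends and the barcode matters. It is a conceptual Python example, not adna-ldup's code: the `Read` record, its fields, and the rule of keeping the first read seen are all assumptions made for illustration, as is the inclusion of strand in the key.

```python
# Illustrative only: a toy duplicate marker, NOT adna-ldup's implementation.
from collections import namedtuple

# Hypothetical minimal alignment record; real data would come from a BAM file.
Read = namedtuple("Read", "name chrom start end strand barcode")


def mark_duplicates(reads):
    """Return {read name: True if marked as duplicate}.

    Two reads count as duplicates only if they share chromosome, start,
    end, strand, and barcode. The first read seen with a given key is
    kept; later ones are marked (an arbitrary choice for this sketch).
    """
    seen = set()
    flags = {}
    for r in reads:
        key = (r.chrom, r.start, r.end, r.strand, r.barcode)
        flags[r.name] = key in seen
        seen.add(key)
    return flags


reads = [
    Read("a", "chr1", 100, 160, "+", "ACGT"),
    Read("b", "chr1", 100, 160, "+", "ACGT"),  # same ends, same barcode
    Read("c", "chr1", 100, 175, "+", "ACGT"),  # same start, different end
    Read("d", "chr1", 100, 160, "+", "TTGA"),  # same ends, different barcode
]
flags = mark_duplicates(reads)
```

A marker keyed on start position alone would flag `c` as a duplicate of `a`, and one that ignores barcodes would flag `d`; with the fuller key only `b` is marked.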

About

Processing WGS aDNA data using the ReichLab protocol
