fastq-prep

Prepares sequencing data for parallelized variant calling. Takes BAM, SAM, CRAM, or FASTQ files containing paired-end reads and produces split, compressed FASTQ files. The two mates of a pair appear on the same lines of their respective FASTQ files.
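
As a rough illustration of what "split and compressed" output with synchronized mates can look like, here is a minimal sketch. It is not taken from fastq_prep.py: the function name `write_chunks`, the `chunk_size` parameter, and the `PREFIX.NNNN_R1.fq.gz` naming scheme are all assumptions made for the example.

```python
import gzip
import os
import tempfile


def write_chunks(pairs, prefix, chunk_size):
    """Write (R1, R2) record pairs to numbered, gzip-compressed chunk files.

    Each record is a 4-line FASTQ string ending in a newline. Both files of a
    chunk are written in lockstep, so mates share the same line numbers.
    Returns the list of (R1 path, R2 path) tuples that were created.
    """
    paths = []
    out1 = out2 = None
    try:
        for i, (rec1, rec2) in enumerate(pairs):
            if i % chunk_size == 0:
                # Start a new chunk: close the previous pair of files first.
                if out1 is not None:
                    out1.close()
                    out2.close()
                n = i // chunk_size
                # Hypothetical naming scheme, not necessarily the tool's own.
                p1 = f"{prefix}.{n:04d}_R1.fq.gz"
                p2 = f"{prefix}.{n:04d}_R2.fq.gz"
                out1 = gzip.open(p1, "wt")
                out2 = gzip.open(p2, "wt")
                paths.append((p1, p2))
            out1.write(rec1)
            out2.write(rec2)
    finally:
        if out1 is not None:
            out1.close()
            out2.close()
    return paths


def fastq(name, seq):
    """Build a 4-line FASTQ record with a constant dummy quality string."""
    return f"@{name}\n{seq}\n+\n{'I' * len(seq)}\n"


# Three read pairs with two pairs per chunk give two chunks.
with tempfile.TemporaryDirectory() as tmp:
    pairs = [(fastq(f"read{i}/1", "ACGT"), fastq(f"read{i}/2", "TTGA"))
             for i in range(3)]
    chunks = write_chunks(pairs, os.path.join(tmp, "sample"), chunk_size=2)
    with gzip.open(chunks[1][0], "rt") as fh:
        last_r1_header = fh.readline().strip()
    with gzip.open(chunks[1][1], "rt") as fh:
        last_r2_header = fh.readline().strip()
```

Writing both files of a chunk from a single loop is what keeps mates aligned line for line, so a downstream aligner can consume each chunk pair independently.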

Currently filters out reads whose SAM flags indicate a non-primary alignment or a quality control failure. Also filters out reads flagged as "first in pair" or "second in pair" when a read with the same name and the same flag has already been encountered (i.e., two reads have the same name and are both reported as first in pair). Filtering options are specified in the header of the Python file.
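
The filtering rules above can be sketched with plain bit masks on the SAM FLAG field. This is an illustrative sketch, not the code in fastq_prep.py: the function name `filter_reads` and the `(name, flag)` tuple representation are assumptions, and only the flag bit values come from the SAM format.

```python
# SAM FLAG bits (values defined by the SAM format).
READ1 = 0x40       # first in pair
READ2 = 0x80       # second in pair
SECONDARY = 0x100  # not primary alignment
QC_FAIL = 0x200    # failed quality control


def filter_reads(records):
    """Yield the (name, flag) records that pass the filters described above.

    A record is dropped if it is a non-primary alignment, if it failed QC,
    or if a read with the same name and the same first/second-in-pair bits
    has already been seen.
    """
    seen = set()
    for name, flag in records:
        if flag & (SECONDARY | QC_FAIL):
            continue
        mate_bits = flag & (READ1 | READ2)
        if mate_bits:
            key = (name, mate_bits)
            if key in seen:
                continue  # duplicate "first in pair" or "second in pair"
            seen.add(key)
        yield name, flag


records = [
    ("r1", 0x41),   # paired, first in pair: kept
    ("r1", 0x81),   # paired, second in pair: kept
    ("r1", 0x41),   # same name, first in pair again: dropped
    ("r2", 0x141),  # non-primary alignment: dropped
    ("r3", 0x241),  # QC failure: dropped
]
kept = list(filter_reads(records))
```

Note that the `seen` set in this sketch grows with the number of distinct reads, so memory use scales with input size.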

usage

python fastq_prep.py [output.prefix] [input.bam or input.fq]
OR python fastq_prep.py [output.prefix] [input_R1.fq,input_R2.fq]
OR [write SAM records to stdout] | python fastq_prep.py [output.prefix]
Separate multiple input files with commas and no spaces.
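
To make the comma rule concrete, here is one way the arguments shown above could be interpreted. This is a hypothetical sketch: `parse_inputs` is not a function from fastq_prep.py, and treating a missing input argument as "read SAM records from stdin" is inferred from the third usage form.

```python
def parse_inputs(argv):
    """Split the arguments that follow the script name.

    Returns (prefix, inputs), where inputs is a list of file paths. An empty
    list means no input file was given, i.e. records are expected on stdin.
    """
    if not argv:
        raise ValueError("an output prefix is required")
    prefix = argv[0]
    # A space inside the list would make the shell pass two arguments,
    # which is why the file list must be comma-separated with no spaces.
    inputs = argv[1].split(",") if len(argv) > 1 else []
    return prefix, inputs


single = parse_inputs(["out", "input.bam"])
paired = parse_inputs(["out", "input_R1.fq,input_R2.fq"])
piped = parse_inputs(["out"])
```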

requirements

pysam

todo

Support generation of interleaved FASTQ chunks.
