RabbitQC is a tool that provides high-speed, scalable quality control for sequencing data and takes full advantage of modern hardware. It includes a variety of function modules and supports different sequencing technologies (Illumina, Oxford Nanopore, PacBio). RabbitQC achieves speedups of one to two orders of magnitude over other state-of-the-art tools.
- For single end data (not compressed)
rabbit_qc -w nthreads -i in.fq -o out.fq
- For paired end data (gzip compressed)
rabbit_qc -w nthreads -i in.R1.fq.gz -I in.R2.fq.gz -o out.R1.fq.gz -O out.R2.fq.gz
- For long read data
rabbit_qc -w nthreads -D -i in.fq
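When many paired-end samples need processing, the paired-end command above can be generated per sample. The following is a minimal sketch, not part of RabbitQC itself: the helper name paired_cmd, the <sample>.R1.fq.gz / <sample>.R2.fq.gz naming scheme, and the .out. output names are assumptions made for illustration. It only prints the commands, so they can be reviewed before being run.

```shell
# Print (do not run) a paired-end RabbitQC command for one R1 file.
# Assumption: mates are named <sample>.R1.fq.gz and <sample>.R2.fq.gz.
paired_cmd() {
  r1="$1"
  nthreads="$2"
  # Derive the mate file name by swapping the R1 suffix for R2.
  r2=$(printf '%s\n' "$r1" | sed 's/\.R1\.fq\.gz$/.R2.fq.gz/')
  # Sample name without directory and suffix, used for output names.
  sample=$(basename "$r1" .R1.fq.gz)
  printf 'rabbit_qc -w %s -i %s -I %s -o %s.out.R1.fq.gz -O %s.out.R2.fq.gz\n' \
    "$nthreads" "$r1" "$r2" "$sample" "$sample"
}

# List the commands for every R1 file in the current directory.
for r1 in ./*.R1.fq.gz; do
  [ -e "$r1" ] || continue   # skip when the glob matches nothing
  paired_cmd "$r1" 8
done
```

Piping the printed lines to sh would execute them one after another; keeping generation and execution separate makes it easier to spot a wrong file pairing first.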
A more efficient strategy for processing large gzip-compressed FASTQ files is to decompress them with pugz first and then process them with RabbitQC. Pugz has been integrated into the RabbitQC project.
cd RabbitQC/pugz && make asserts=0
./gunzip -t nthreads in.fq.gz
For more help information, please refer to rabbit_qc -h.
If the -w option is not specified, RabbitQC sets the number of working threads to the total number of CPU cores minus 2.
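To make the thread count explicit in a script rather than relying on the default, the same rule (total CPU cores minus 2) can be computed by hand. This is a rough sketch: the helper name default_threads is hypothetical, and the lower bound of 1 thread on small machines is an assumption of this example, not documented RabbitQC behavior.

```shell
# Compute "total CPU cores - 2" for a given core count.
# Assumption: never go below 1 thread (not stated by RabbitQC).
default_threads() {
  cores="$1"
  n=$((cores - 2))
  if [ "$n" -lt 1 ]; then
    n=1
  fi
  echo "$n"
}

# Query the number of online cores and print the resulting command.
cores=$(getconf _NPROCESSORS_ONLN)
echo "rabbit_qc -w $(default_threads "$cores") -i in.fq -o out.fq"
```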
By default, the HTML report is saved to RabbitQC.html (can be specified with the -h option), and the JSON report is saved to RabbitQC.json (can be specified with the -j option).
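Because every run writes to the same default report names, runs on several samples in one directory would overwrite each other's reports. One way to avoid that is to pass per-sample names with -h and -j. The sketch below assumes single-end input files ending in .fq; the helper name report_cmd and the output naming are illustrative only. It prints the command instead of running it.

```shell
# Print a single-end RabbitQC command with per-sample report names.
# Assumption: input files end in .fq; output names are illustrative.
report_cmd() {
  in_fq="$1"
  sample=$(basename "$in_fq" .fq)
  printf 'rabbit_qc -i %s -o %s.out.fq -h %s.html -j %s.json\n' \
    "$in_fq" "$sample" "$sample" "$sample"
}

report_cmd sampleA.fq
```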
RabbitQC supports all fastp options for short read quality control and all NanoQC options for long read quality control. For details, please refer to fastp and NanoQC.
RabbitQC creates reports in both HTML and JSON format.
cd RabbitQC && make
The RabbitQC paper is currently under review.