RabbitQC

RabbitQC is a tool for high-speed, scalable quality control of sequencing data that takes full advantage of modern hardware. It includes a variety of function modules and supports several sequencing technologies (Illumina, Oxford Nanopore, PacBio). RabbitQC achieves speedups of one to two orders of magnitude over other state-of-the-art tools.

Simple usage

For short reads

For single-end data (uncompressed)

rabbit_qc -w nthreads -i in.fq -o out.fq

For paired-end data (gzip compressed)

rabbit_qc -w nthreads -i in.R1.fq.gz -I in.R2.fq.gz -o out.R1.fq.gz -O out.R2.fq.gz
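When many paired-end samples sit in one directory, the command above can be generated per sample. The following is a minimal sketch, not part of RabbitQC: the helper name paired_cmd, the *.R1.fq.gz / *.R2.fq.gz naming convention, the ".clean" output suffix, and the thread count of 8 are all assumptions made for illustration. It only prints the commands so they can be reviewed before running.

```shell
# Hypothetical helper: prints (does not run) a rabbit_qc command for one R1 file.
# ASSUMPTION: mates are named <sample>.R1.fq.gz and <sample>.R2.fq.gz.
# ASSUMPTION: outputs get a ".clean" infix; pick any naming you prefer.
paired_cmd() {
  r1=$1
  threads=$2
  prefix=${r1%.R1.fq.gz}          # strip the R1 suffix to get the sample prefix
  echo "rabbit_qc -w $threads -i $r1 -I $prefix.R2.fq.gz -o $prefix.clean.R1.fq.gz -O $prefix.clean.R2.fq.gz"
}

# Dry run over every R1 file in the current directory.
for r1 in ./*.R1.fq.gz; do
  [ -e "$r1" ] || continue        # skip the literal pattern when nothing matches
  paired_cmd "$r1" 8
done
```

Piping the printed lines into sh would execute them one sample at a time.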

For long reads

rabbit_qc -w nthreads -D -i in.fq

For large gz files

A more efficient way to process large gzip-compressed FASTQ files is to decompress them with pugz first and then process the decompressed files with RabbitQC. Pugz has been integrated into the RabbitQC project.

cd RabbitQC/pugz && make asserts=0
./gunzip -t nthreads in.fq.gz
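The two steps (decompress with pugz, then run RabbitQC on the result) can be sketched as a small dry-run helper. This is an illustration under stated assumptions, not the project's documented workflow: the function name large_gz_plan and the ".clean.fq" output name are hypothetical, and the redirect assumes that pugz's gunzip writes decompressed data to standard output, which should be checked against the pugz documentation before use.

```shell
# Hypothetical helper: prints the two-step plan for one gzip-compressed FASTQ file.
# ASSUMPTION: pugz's gunzip writes the decompressed data to standard output.
# ASSUMPTION: the ".clean.fq" output name is illustrative only.
large_gz_plan() {
  gz=$1
  threads=$2
  fq=${gz%.gz}                    # in.fq.gz -> in.fq
  echo "./gunzip -t $threads $gz > $fq"
  echo "rabbit_qc -w $threads -i $fq -o ${fq%.fq}.clean.fq"
}

# Print the plan for the example file using 8 threads.
large_gz_plan in.fq.gz 8
```

The helper only prints the commands, so the decompression step can be adjusted before anything is executed.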

Options

For more help information, please refer to rabbit_qc -h.

If the -w option is not specified, RabbitQC sets the number of working threads to the total number of CPU cores minus 2. By default, the HTML report is saved to RabbitQC.html (the path can be changed with the -h option), and the JSON report is saved to RabbitQC.json (the path can be changed with the -j option).
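To make the default explicit in a script, the same "total CPU cores minus 2" value can be computed and passed to -w. This is a minimal sketch: the helper name default_threads is hypothetical, and the lower bound of 1 thread on machines with two or fewer cores is an assumption, since the behaviour of RabbitQC in that case is not stated here.

```shell
# Hypothetical helper: mirrors the documented default of (total CPU cores - 2).
# ASSUMPTION: clamp to at least 1 thread; RabbitQC's own handling may differ.
default_threads() {
  cores=$1
  n=$((cores - 2))
  if [ "$n" -lt 1 ]; then
    n=1
  fi
  echo "$n"
}

# Detect the number of online cores (nproc on Linux, getconf as a fallback).
cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
nthreads=$(default_threads "$cores")

# Print the resulting command rather than running it.
echo "rabbit_qc -w $nthreads -i in.fq -o out.fq"
```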

RabbitQC supports all fastp options for short-read quality control and all NanoQC options for long-read quality control. For details, please refer to the fastp and NanoQC documentation.

Example reports

RabbitQC creates reports in both HTML and JSON format.

Build

cd RabbitQC && make

Citation

The RabbitQC paper is currently under review.
