shenwei356 / seqkit

A cross-platform and ultrafast toolkit for FASTA/Q file manipulation

Home Page: https://bioinf.shenwei.me/seqkit

Repository from Github https://github.com/shenwei356/seqkit

The calculation of average quality score appears to be lower than it actually is

bolangshali opened this issue

Hi, we used seqkit to calculate statistics for a batch of data and found that the reported average quality score seems too low. As in the screenshot below, the Q30 ratio is 86%, but the average quality is only 15.45. Even if only these 85% of bases are counted at a quality value of 30 and the remaining bases at a quality of 0, the average quality should be more than 24. Please check for a possible problem.
(screenshot: 1711707498050)

#297 (comment)

You can’t just take a simple arithmetic mean of all the Q scores, because the result would not represent the mean error rate.

https://gigabaseorgigabyte.wordpress.com/2017/06/26/averaging-basecall-quality-scores-the-right-way/
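
To make the difference concrete, here is a minimal sketch of the averaging described in the linked post: convert each Phred score to an error probability, average the probabilities, then convert back. This is not seqkit's actual code; the function names and the example read (86 bases at Q30, 14 bases at Q7) are made up for illustration and are not taken from the screenshot.

```python
import math


def arithmetic_mean(qscores):
    """Naive average: plain mean of the Phred scores."""
    return sum(qscores) / len(qscores)


def mean_quality(qscores):
    """Average Phred scores through their error probabilities."""
    # Phred score Q corresponds to error probability P = 10^(-Q/10).
    probs = [10 ** (-q / 10) for q in qscores]
    mean_p = sum(probs) / len(probs)
    # Convert the mean error probability back to a Phred score.
    return -10 * math.log10(mean_p)


# Hypothetical read (illustrative values only): 86% of bases at Q30,
# the remaining 14% at Q7.
quals = [30] * 86 + [7] * 14

naive = arithmetic_mean(quals)
phred_mean = mean_quality(quals)

print(f"arithmetic mean of Q scores: {naive:.2f}")
print(f"error-rate-based mean:       {phred_mean:.2f}")
```

With this made-up input, the arithmetic mean lands in the high 20s, while the error-rate-based mean comes out near 15, because the few low-quality bases dominate the mean error probability. So a Q30 ratio of 86% and an average quality around 15 are not contradictory under this definition.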