gymrek-lab / LongTR

Tandem repeat genotyping with long reads

Low base quality score

wdecoster opened this issue

Hi,

Most of our (ONT R9) reads overlapping a locus get dropped because of a low base quality score, and we are a bit confused about how to fix that. I naively first tried setting --min-sum-qual to 10, then to 1 and 0, and even to -1000. Only the last value seemed to change anything in the filtering, and even then, for a few remaining loci, most reads were still dropped. Did I misunderstand this option, or could this be a bug?

Thanks,
Wouter

Dear Wouter,

Thank you for your interest in LongTR. The --min-sum-qual threshold is compared against the sum of the log values of the quality scores across all bases in a read, so the value is a negative number, and for longer reads, a large negative one. Please use a very large negative number (like -1e10) to keep all reads. I will change the default value for now, and we will work on a better quality measure for longer reads. Thank you for raising this issue.

Best,
Helia
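
To illustrate why such a sum becomes a large negative number on long reads, here is a minimal sketch. It is not LongTR's code: it assumes the "log value" of a base is log10 of the probability that the base call is correct, derived from its Phred score, and the function name `sum_log_qual` and the clamp at Q1 are made up for this example.

```python
import math


def sum_log_qual(phred_scores):
    """Sum of log10 P(base is correct) over a read.

    Assumption for this sketch: P(correct) = 1 - 10^(-Q/10).
    Scores are clamped to Q >= 1 only to avoid log10(0) at Q0.
    """
    total = 0.0
    for q in phred_scores:
        q = max(q, 1)
        p_error = 10.0 ** (-q / 10.0)
        total += math.log10(1.0 - p_error)
    return total


# Hypothetical reads with a uniform base quality of Q10.
short_read = [10] * 1_000
long_read = [10] * 50_000

# Every term is negative, so the sum grows more negative with read length.
print(sum_log_qual(short_read))
print(sum_log_qual(long_read))
```

Under these assumptions each base contributes a small negative term, so a positive or zero threshold rejects every read, and a fixed negative threshold such as -1000 is still exceeded once reads get long enough.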

Ah! The documentation made me think it would be on the Phred scale. Thanks for the quick reply.