aquaskyline / Skyhawk

An Artificial Neural Network-based discriminator for validating clinically significant genomic variants

acceptable coverage

przemekl opened this issue · comments

What is an acceptable sequencing depth?
Almost 75% of my variants are reported as "S: No read cover".

I've not tested depth below 30x. What's the average depth of your sample?

About 30x (counted with 'samtools mpileup') or 27x (counted with 'samtools mpileup -aa').
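The gap between the two figures is what you would expect if the plain mpileup run skips positions with no reads while -aa also reports zero-depth positions. A minimal sketch of that calculation, assuming tab-separated mpileup text with the depth in the fourth column; the function name mean_depth and the toy input are hypothetical and not part of Skyhawk or samtools:

```python
def mean_depth(pileup_lines, include_uncovered=True):
    """Average the depth column (4th field) of mpileup-style lines.

    include_uncovered=False drops zero-depth rows, which mimics an
    mpileup run that does not report uncovered positions.
    """
    total = 0
    positions = 0
    for line in pileup_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip malformed or empty lines
        depth = int(fields[3])
        if depth == 0 and not include_uncovered:
            continue
        total += depth
        positions += 1
    return total / positions if positions else 0.0


# Toy input (made up): nine positions at depth 30 and one uncovered position.
example = [f"chr1\t{pos}\tN\t30\t*\t*" for pos in range(1, 10)]
example.append("chr1\t10\tN\t0\t*\t*")

covered_only = mean_depth(example, include_uncovered=False)
all_positions = mean_depth(example, include_uncovered=True)
print(covered_only, all_positions)
```

With this toy input, one uncovered position in ten is enough to pull the average from 30 down to 27, so the lower figure is the one that reflects coverage across the whole reference.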

I've tried three different depths (10x, 25x and 50x) on the GIAB HG001 sample, which has ~3.6M known variants. The number of "S: No read cover" calls remains low at all three depths (627, 315 and 273). The number of "X: Mismatch" calls at 25x is 1.5x higher than at 50x, whereas at 10x it is ~10x higher than at 50x. In your case, having 75% of the variants reported as "S" is likely the result of inappropriate inputs. I've updated the code to include some validity checks on the inputs. Please try again and let me know if that doesn't help. Thanks.
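To put those counts in proportion, here is a small sketch that converts them to percentages of the known variants. It assumes the three counts are listed in the same order as the depths and takes "~3.6M" as exactly 3,600,000; the helper name percent is hypothetical:

```python
def percent(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


KNOWN_VARIANTS = 3_600_000  # assumption: "~3.6M" rounded to 3.6 million

# Assumption: counts are paired with depths in the order given in the thread.
no_read_cover = {"10x": 627, "25x": 315, "50x": 273}

shares = {depth: percent(n, KNOWN_VARIANTS) for depth, n in no_read_cover.items()}
for depth, share in shares.items():
    print(f"{depth}: {share:.4f}% of known variants have no read cover")
```

Under these assumptions every share is below 0.02%, several orders of magnitude under the 75% reported above, which is why the inputs are the first thing to check.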

After your recent fixes, the share of variants reported as "S: No read cover" has dropped to 12.5%, which is acceptable. Please close this issue.