haploid genotypes cause an error in dataPrepScripts/GetTruth.py
AndrewCarroll opened this issue
Lines 68-69 of dataPrepScripts/GetTruth.py:
varType = last.split(":")[0].replace("/","|").replace(".","0").split("|")
p1, p2 = varType
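These lines fail if the variant site is haploid: a genotype with a single allele (for example "1") splits into a one-element list, so unpacking it into p1 and p2 raises a ValueError. It seems that this will cause Skyhawk to fail for certain variant callers, even if the majority of the sites are diploid and only a few are haploid.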
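A minimal reproduction of the failure, assuming `last` holds the final sample column of a VCF record. The function name `parse_genotype` and the sample strings are made up for illustration; they are not taken from Skyhawk or from any real caller's output.

```python
def parse_genotype(last):
    """Mirror the two quoted lines from GetTruth.py (hypothetical wrapper)."""
    varType = last.split(":")[0].replace("/", "|").replace(".", "0").split("|")
    p1, p2 = varType  # requires exactly two alleles
    return p1, p2


# Diploid genotype: two alleles, so the unpacking succeeds.
diploid = parse_genotype("0/1:35")

# Haploid genotype: only one allele, so the unpacking raises ValueError.
try:
    parse_genotype("1:35")
    haploid_error = None
except ValueError as exc:
    haploid_error = exc
```

A single haploid record is enough to stop the script, since the exception is raised as soon as that record is parsed.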
There are some good reasons that a variant caller might decide to not write a diploid call (chrX or chrY come to mind).
Here is a snippet of a Strelka2 VCF (HG001 on hg38) with a haploid call on chromosome 1 (maybe it thinks there is a deletion here?) that causes Skyhawk to fail. I'm not sure how you want to handle haploid sites, but I thought you would like to know, as this does seem to limit Skyhawk to certain callers.
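One possible way to tolerate haploid sites, sketched under assumptions: the helper `split_genotype` is hypothetical, and treating a haploid call like a homozygous diploid one is only one option (skipping such sites would be another). This is not necessarily what the eventual fix does.

```python
def split_genotype(sample_field):
    """Return (p1, p2) allele strings from a VCF sample field.

    Hypothetical helper, not part of Skyhawk. A haploid call is
    duplicated, so "1" is treated like "1|1". Missing alleles (".")
    become "0", matching the replace(".", "0") in the quoted lines.
    """
    gt = sample_field.split(":")[0].replace("/", "|").replace(".", "0")
    alleles = gt.split("|")
    if len(alleles) == 1:
        # Assumption: a haploid call is equivalent to a homozygous one.
        alleles = alleles * 2
    if len(alleles) != 2:
        # Higher ploidy is left unsupported in this sketch.
        raise ValueError("unsupported ploidy in genotype: %r" % gt)
    return alleles[0], alleles[1]
```

With this sketch, `split_genotype("1:35")` gives the same result as `split_genotype("1/1:35")`, while diploid records are parsed exactly as before.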
fixed in 63e7e9d