brentp / somalier

fast sample-swap and relatedness checks on BAMs/CRAMs/VCFs/GVCFs... "like damn that is one smart wine guy"

chrX sites count

AinaMontalban opened this issue

Hello,

I am using somalier to perform sex QC on WES samples, and I have a question regarding the number of sites on chrX.

I created a VCF file containing 203 chrX sites using our target BED file and the somalier find-sites command. However, when I ran somalier extract and somalier relate on a sample, somalier reported X_n as 199 rather than the expected 203:

X_depth_mean X_n X_hom_ref X_het X_hom_alt
81.56 199 103 0 96

Why are we missing 4 sites?

Initially, I thought the -d (MIN_DEPTH) parameter might be the reason, so I ran HaplotypeCaller at base-pair resolution to check the allele depth at these sites. None of the sites had a depth below 7. However, 4 sites had GQ < 30:

chrX	69670203	.	C	<NON_REF>	.	.	.	GT:AD:DP:GQ:PL	0/0:60,19:79:**0**:0,0,1333
chrX	119934456	.	G	<NON_REF>	.	.	.	GT:AD:DP:GQ:PL	0/0:19,2:21:**1**:0,1,642
chrX	136491943	.	A	<NON_REF>	.	.	.	GT:AD:DP:GQ:PL	0/0:36,10:46:**0**:0,0,923
chrX	151397088	.	G	<NON_REF>	.	.	.	GT:AD:DP:GQ:PL	0/0:10,0:10:**24**:0,24,360
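
For reference, the per-site values in the four records above can be tabulated with a short Python sketch. It only parses the FORMAT and sample columns and reports DP, GQ, and the fraction of non-reference reads from AD. It does not reproduce somalier's genotyping logic, and whether somalier uses any of these quantities is exactly the open question. The names RECORDS, summarize, and SUMMARIES are hypothetical, and the thresholds 7 and 30 are simply the ones mentioned in this post.

```python
# The four gVCF records quoted above, with the bold markers around GQ removed.
RECORDS = """\
chrX\t69670203\t.\tC\t<NON_REF>\t.\t.\t.\tGT:AD:DP:GQ:PL\t0/0:60,19:79:0:0,0,1333
chrX\t119934456\t.\tG\t<NON_REF>\t.\t.\t.\tGT:AD:DP:GQ:PL\t0/0:19,2:21:1:0,1,642
chrX\t136491943\t.\tA\t<NON_REF>\t.\t.\t.\tGT:AD:DP:GQ:PL\t0/0:36,10:46:0:0,0,923
chrX\t151397088\t.\tG\t<NON_REF>\t.\t.\t.\tGT:AD:DP:GQ:PL\t0/0:10,0:10:24:0,24,360
"""

MIN_DEPTH = 7   # depth threshold discussed in this post (assumption for the sketch)
MIN_GQ = 30     # GQ cutoff used in this post to flag the four sites


def summarize(line):
    """Parse one single-sample VCF record into DP, GQ and non-ref read fraction."""
    fields = line.rstrip("\n").split("\t")
    keys = fields[8].split(":")      # FORMAT column, e.g. GT:AD:DP:GQ:PL
    values = fields[9].split(":")    # sample column
    sample = dict(zip(keys, values))
    ref_depth, alt_depth = (int(x) for x in sample["AD"].split(","))
    total = ref_depth + alt_depth
    return {
        "chrom": fields[0],
        "pos": int(fields[1]),
        "dp": int(sample["DP"]),
        "gq": int(sample["GQ"]),
        # Fraction of reads not supporting the reference allele.
        "alt_fraction": alt_depth / total if total else 0.0,
    }


SUMMARIES = [summarize(line) for line in RECORDS.splitlines() if line.strip()]

for s in SUMMARIES:
    print(
        f"{s['chrom']}:{s['pos']}\tDP={s['dp']}\tGQ={s['gq']}\t"
        f"alt_fraction={s['alt_fraction']:.3f}\t"
        f"low_depth={s['dp'] < MIN_DEPTH}\tlow_gq={s['gq'] < MIN_GQ}"
    )
```

All four records pass the depth threshold and fall below the GQ cutoff, and three of the four carry some non-reference reads despite the 0/0 call.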

Does somalier exclude genotypes with low quality? Or is it something else?