brentp / somalier

fast sample-swap and relatedness checks on BAMs/CRAMs/VCFs/GVCFs... "like damn that is one smart wine guy"

full results from relate output

rdmorin opened this issue

I get the following output when I run somalier relate on about 1000 exomes. Even exomes that are likely from the same individual are being flagged as "unrelated" (possibly because most of the variants in the VCF are not in the exome). Is it possible to force the relatedness results for all sample pairs (or at least the most highly related ones) to be written to the outputs?

somalier version: 0.2.11
[somalier] starting read of 999 samples
[somalier] time to read files and get per-sample stats for 999 samples: 1.44
[somalier] time to get expected relatedness from pedigree graph: 0.00
[somalier] html and text output will have unrelated sample-pairs subset to 20.04% of points
[somalier] time to calculate all vs all relatedness for all 498501 combinations: 2.04

As you note, in cases like this, somalier outputs only a subset of the pairs. However, it always outputs any pair with a relatedness greater than 0.05, so any pair that is absent from the text file (or HTML) has a relatedness less than 0.05.
I could make an option to force output, but this is mostly not what you want.
If you want to verify the relatedness of a couple of samples, just run those two samples through somalier relate.
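
A minimal sketch of how the rule above could be applied when reading the text output: look up a pair, and treat a missing pair as having relatedness below 0.05. The column names (sample_a, sample_b, relatedness), the tab-separated layout, and the helper names load_pairs and describe are assumptions for illustration, as are the sample rows; check them against the header of your own output file.

```python
import csv
import io

# Threshold stated in this thread: pairs above it are always written out.
ALWAYS_REPORTED = 0.05


def load_pairs(handle):
    """Map frozenset({a, b}) -> relatedness from a tab-separated pairs table.

    Assumes a header row with columns named sample_a, sample_b and
    relatedness (a leading '#' on the header is tolerated).
    """
    reader = csv.reader(handle, delimiter="\t")
    header = [name.lstrip("#") for name in next(reader)]
    a, b, r = (header.index(c) for c in ("sample_a", "sample_b", "relatedness"))
    return {frozenset((row[a], row[b])): float(row[r]) for row in reader if row}


def describe(pairs, s1, s2):
    """Report the relatedness of a pair, in either order."""
    value = pairs.get(frozenset((s1, s2)))
    if value is None:
        return f"{s1} vs {s2}: not in output, so relatedness < {ALWAYS_REPORTED}"
    return f"{s1} vs {s2}: relatedness {value:.3f}"


# Made-up rows for illustration only.
example = "#sample_a\tsample_b\trelatedness\nS1\tS2\t0.998\nS1\tS3\t0.012\n"
pairs = load_pairs(io.StringIO(example))
print(describe(pairs, "S2", "S1"))
print(describe(pairs, "S2", "S3"))
```

To use it on a real file, pass an open file handle instead of the in-memory example.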

Thanks for clarifying. I was concerned that the potentially related cases might be filtered because they are all labeled in the plot as "unrelated" even when the relatedness is 1. Do you know what might be causing this?

The labels (like "identical", "parent-child", etc.) in the plot come from the pedigree file or groups file. The positions (and calculated relatedness) are determined from the genotypes.

So if you have samples for which you have specified a relatedness of 1 (with a groups file) but they appear as unrelated in the output file, then that would be a bug.
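
One way to check for that situation is to compare the expected groups against the calculated values. The sketch below assumes a groups file with one comma-separated list of sample IDs per line and a tab-separated pairs table whose first three columns are sample_a, sample_b and relatedness. The 0.9 cutoff, the helper names and the example data are all hypothetical choices for illustration, not anything defined by somalier.

```python
import io
from itertools import combinations


def read_groups(handle):
    """Each non-empty line: comma-separated IDs expected to be one individual."""
    return [[s.strip() for s in line.split(",") if s.strip()]
            for line in handle if line.strip()]


def read_relatedness(handle):
    """Tab-separated rows of sample_a, sample_b, relatedness; '#' lines are skipped."""
    out = {}
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        a, b, rel = line.rstrip("\n").split("\t")[:3]
        out[frozenset((a, b))] = float(rel)
    return out


def suspicious_pairs(groups, relatedness, min_rel=0.9):
    """Expected-identical pairs that are absent from the table or below min_rel."""
    flagged = []
    for group in groups:
        for a, b in combinations(group, 2):
            rel = relatedness.get(frozenset((a, b)))
            if rel is None or rel < min_rel:
                flagged.append((a, b, rel))
    return flagged


# Made-up data for illustration only.
groups = read_groups(io.StringIO("N1,T1a,T1b\nN2,T2\n"))
rel = read_relatedness(io.StringIO(
    "#sample_a\tsample_b\trelatedness\n"
    "N1\tT1a\t0.99\nN1\tT1b\t0.98\nT1a\tT1b\t0.97\n"
))
print(suspicious_pairs(groups, rel))
```

Any pair this flags is worth re-running on its own through somalier relate before reporting it as a bug.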

> I could make an option to force output, but this is mostly not what you want.

It would be great if this option were provided.