sites.vcf for chm13v2 (T2T) reference
kpalin opened this issue
Attached is a sites.chm13v2.vcf.gz file for the chm13v2 reference, which is approximately compatible with the provided sites files. It will likely produce 5 conflicts, for the following variants whose location is not obvious (i.e. liftover fails) in the new reference:
chr1 248522418 chr1_248522418_A_T
chrX 104019658 chrX_104019658_G_A
chrX 149715688 chrX_149715688_C_T
chrY 6246522 chrY_6246522_A_G
chrY 22513968 chrY_22513968_T_C
Hi Kimmo, thanks for creating this. What is in those 5 sites now? Are they included or removed?
Since only 1 of them is on an autosome, and chrX and chrY are only used for depth, I think this will be fine.
They are "nearby" sites with the appropriate reference allele. The new coordinates are below.
chr1 247970375 chr1_248522418_A_T
chrX 102460255 chrX_104019658_G_A
chrX 147953346 chrX_149715688_C_T
chrY 10226948 chrY_6246522_A_G
chrY 22907941 chrY_22513968_T_C
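To make the size of the relocation explicit, here is a minimal sketch that parses the five lines above and reports how far each site moved relative to the position encoded in its ID. It assumes the ID has the form `chrom_pos_ref_alt`, as in the listing; the names `Site`, `parse_site` and `offset` are made up for this illustration and are not part of any tool.

```python
from typing import NamedTuple


class Site(NamedTuple):
    chrom: str      # chromosome in the chm13v2 sites file
    pos: int        # position in the chm13v2 sites file
    id_chrom: str   # chromosome encoded in the site ID
    id_pos: int     # position encoded in the site ID
    ref: str
    alt: str

    @property
    def offset(self) -> int:
        """Signed distance between the new position and the one in the ID."""
        return self.pos - self.id_pos


def parse_site(line: str) -> Site:
    """Parse a 'chrom pos chrom_pos_ref_alt' line as listed in this thread."""
    chrom, pos, site_id = line.split()
    id_chrom, id_pos, ref, alt = site_id.split("_")
    return Site(chrom, int(pos), id_chrom, int(id_pos), ref, alt)


LISTING = """\
chr1 247970375 chr1_248522418_A_T
chrX 102460255 chrX_104019658_G_A
chrX 147953346 chrX_149715688_C_T
chrY 10226948 chrY_6246522_A_G
chrY 22907941 chrY_22513968_T_C
"""

sites = [parse_site(line) for line in LISTING.splitlines()]

for s in sites:
    print(f"{s.id_chrom}_{s.id_pos}_{s.ref}_{s.alt}\t{s.chrom}:{s.pos}\toffset={s.offset:+d}")
```

The IDs stay unchanged while only the coordinates differ, which is what keeps the site set recognisable as the same one.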
I got the following with 89 Gbp of Nanopore WGS after re-basecalling and comparing the alignments to chm13v2 and GRCh38:
#sample_a sample_b relatedness hom_concordance hets_a hets_b shared_hets hom_alts_a hom_alts_b shared_hom_alts ibs0 ibs2 n x_ibs0 x_ibs2 expected_relatedness
My6606T4_19_1323 GRCh38:My6606T4_19_1323 0.941 0.856 6543 6638 6170 3243 4213 2787 5 11237 11345 0 161 -1.0
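As a sanity check on that row, the sketch below recomputes the two summary columns from the counts. It assumes the formulas as I understand them from the somalier documentation: relatedness = (shared_hets - 2 * ibs0) / min(hets_a, hets_b), and the analogous expression with hom-alt counts for hom_concordance. Treat the formulas as an assumption; the function names are made up for this example.

```python
def relatedness(shared_hets: int, ibs0: int, hets_a: int, hets_b: int) -> float:
    # Assumed formula: (shared_hets - 2 * ibs0) / min(hets_a, hets_b)
    return (shared_hets - 2 * ibs0) / min(hets_a, hets_b)


def hom_concordance(shared_hom_alts: int, ibs0: int,
                    hom_alts_a: int, hom_alts_b: int) -> float:
    # Assumed formula: (shared_hom_alts - 2 * ibs0) / min(hom_alts_a, hom_alts_b)
    return (shared_hom_alts - 2 * ibs0) / min(hom_alts_a, hom_alts_b)


# Counts copied from the row above.
rel = relatedness(shared_hets=6170, ibs0=5, hets_a=6543, hets_b=6638)
hom = hom_concordance(shared_hom_alts=2787, ibs0=5, hom_alts_a=3243, hom_alts_b=4213)

print(f"relatedness={rel:.3f} hom_concordance={hom:.3f}")
```

Under these assumptions the counts reproduce the reported 0.941 and 0.856, so the gap from 1.0 comes from the hets and hom-alts that are not shared between the two alignments of the same sample, plus the 5 ibs0 sites.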
Is it somehow possible to compare digests extracted with a different set (or a subset) of sites?
It is not. But I'd like to make your sites file available. I am thinking about updating the sites files to exclude these 5 variants, and including your set in the downloads.
Or, we could simply include your set with the knowledge that it's close enough.
Perhaps a better way to do it would be to have the 5 sites in your file point to non-variant sites (but match the reference to avoid an error message). Then it will just ignore them instead of having e.g. het sites that would match between samples, but do not.
Feel free to add "my" sites file to the downloads. If you decide to remove those 5 sites, do make sure to version the new sites files clearly. My motivation for using "wrong" positions was to retain backward compatibility with my >4000 old digests.
Yes, I think I'll leave in those 5 sites for exactly the reason you state. Thanks very much!