SIGSEGV error for somalier ancestry
nswh opened this issue
I get the following error with both the Docker image and the static build:
somalier ancestry --labels ancestry-labels-1kg.tsv 1kg-somalier/*.somalier ++ query/*.somalier
somalier version: 0.2.13
SIGSEGV: Illegal storage access. (Attempt to read from nil?)
query/*.somalier was generated successfully from the VCF with the following command:
somalier extract -d query/ --sites sites.hg38.vcf.gz -f GRCh38Decoy_genome.fa query.vcf.gz
somalier version: 0.2.13
[somalier] found 17608 sites
A small portion of the VCF is attached here:
query.vcf.gz
Can you share, for example, 1 or 2 of your somalier files?
This actually works for me with the following output. Do you see the issue with the VCF you sent? It could be that your machine does not support the CPU instructions needed, but that would usually produce an illegal instruction signal (SIGILL), not a SIGSEGV.
$ somalier extract --sites ~/src/somalier/sites.hg38.vcf.gz -f /data/human/Homo_sapiens_assembly38.fasta query.vcf.gz
somalier version: 0.2.13
[somalier] found 138 sites
$ somalier ancestry --labels scripts/ancestry-labels-1kg.tsv 1kg-somalier/*.somalier ++ ~/Downloads/NA12878_566_20210429_A00712.somalier
somalier version: 0.2.13
[somalier] subset from 17384 to 123 high call-rate sites (removed 99.29%)
[somalier] time for dimensionality reduction to shape [2496, 5]: 0.25 seconds
[somalier] Epoch:0. loss: 1.10086. accuracy on unseen data: 0.594. total-time: 0.00
[somalier] Epoch:500. loss: 0.48224. accuracy on unseen data: 0.822. total-time: 1.82
[somalier] Epoch:1000. loss: 0.52429. accuracy on unseen data: 0.772. total-time: 3.68
[somalier] Epoch:1500. loss: 0.47193. accuracy on unseen data: 0.832. total-time: 5.56
[somalier] Epoch:2000. loss: 0.39615. accuracy on unseen data: 0.842. total-time: 7.89
[somalier] Epoch:2500. loss: 0.55570. accuracy on unseen data: 0.792. total-time: 10.05
[somalier] Epoch:3000. loss: 0.47538. accuracy on unseen data: 0.802. total-time: 12.01
[somalier] Epoch:3500. loss: 0.45755. accuracy on unseen data: 0.832. total-time: 14.17
[somalier] Epoch:4000. loss: 0.54101. accuracy on unseen data: 0.802. total-time: 16.69
[somalier] Epoch:4500. loss: 0.42020. accuracy on unseen data: 0.822. total-time: 19.09
[somalier] Epoch:5000. loss: 0.58802. accuracy on unseen data: 0.792. total-time: 21.47
[somalier] Epoch:5500. loss: 0.48210. accuracy on unseen data: 0.812. total-time: 23.76
[somalier] Epoch:6000. loss: 0.51912. accuracy on unseen data: 0.792. total-time: 26.20
[somalier] Epoch:6500. loss: 0.46640. accuracy on unseen data: 0.832. total-time: 29.54
[somalier] Epoch:7000. loss: 0.45954. accuracy on unseen data: 0.822. total-time: 31.81
[somalier] Epoch:7500. loss: 0.46691. accuracy on unseen data: 0.772. total-time: 34.08
[somalier] Epoch:8000. loss: 0.47362. accuracy on unseen data: 0.822. total-time: 36.42
[somalier] Epoch:8500. loss: 0.45352. accuracy on unseen data: 0.822. total-time: 38.97
[somalier] Epoch:9000. loss: 0.49716. accuracy on unseen data: 0.782. total-time: 41.30
[somalier] Epoch:9500. loss: 0.50170. accuracy on unseen data: 0.812. total-time: 43.58
[somalier] Epoch:10000. loss: 0.52464. accuracy on unseen data: 0.772. total-time: 45.74
[somalier] reduced query set to: [1, 5]
[somalier] wrote text file to somalier-ancestry.somalier-ancestry.tsv
[somalier] wrote html file to somalier-ancestry.somalier-ancestry.html
Same error, unfortunately. The machine is an Amazon Ubuntu instance with an Intel(R) Xeon(R) Platinum CPU. Could that be an issue? The storage is an Amazon EBS volume attached to the instance.
somalier ancestry --labels ancestry-labels-1kg.tsv 1kg-somalier/*.somalier ++ NA12878_566_20210429_A00712.somalier
somalier version: 0.2.13
SIGSEGV: Illegal storage access. (Attempt to read from nil?)
Attached is the full somalier file I generated from the full VCF file.
NA12878_566_20210429_A00712.somalier.zip
I have also attached the full VCF file here.
NA12878_566_20210429_A00712.hard-filtered.vcf.gz
Can you show the output of head ancestry-labels-1kg.tsv?
Also, here is a debug build of somalier. It's unchanged from the release, but it should give more information about where it's crashing. Can you give it a try and let me know the result?
somalier_debug.gz
Hmm, it was a problem with ancestry-labels-1kg.tsv. Now it works. Thanks.
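For anyone landing here with the same SIGSEGV: the thread does not say what exactly was wrong with the labels file, so the following is only a rough sketch of a generic sanity check to run before `somalier ancestry`. It assumes the labels file is tab-separated with at least two columns per line, which is not confirmed above. The function name `check_labels_file` and the `min_columns` parameter are made up for illustration and are not part of somalier.

```python
import os


def check_labels_file(path, min_columns=2):
    """Return a list of problems found in a labels TSV (empty list = looks OK).

    Assumption (not confirmed in this thread): every non-blank line should
    have at least `min_columns` tab-separated fields.
    """
    if not os.path.isfile(path):
        return [f"{path}: file not found"]
    if os.path.getsize(path) == 0:
        return [f"{path}: file is empty"]

    with open(path, encoding="utf-8", errors="replace") as handle:
        lines = handle.read().splitlines()

    # Saving a web page instead of the raw file is an easy download mistake.
    first = lines[0].lstrip().lower() if lines else ""
    if first.startswith("<!doctype") or first.startswith("<html"):
        return [f"{path}: looks like HTML, not a TSV"]

    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines are ignored
        fields = line.split("\t")
        if len(fields) < min_columns:
            problems.append(
                f"{path}:{number}: expected at least {min_columns} "
                f"tab-separated columns, found {len(fields)}"
            )
    return problems
```

Calling `check_labels_file("ancestry-labels-1kg.tsv")` and printing the returned list gives a quick first look, alongside the `head` output requested above, before reaching for the debug build.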