brentp / somalier

fast sample-swap and relatedness checks on BAMs/CRAMs/VCFs/GVCFs... "like damn that is one smart wine guy"

Ancestry Error: unhandled exception: couldn't open sketch file

bopohdr opened this issue · comments

Hi !

I am using Somalier via Docker for relatedness checks. First I run:

docker run -v /Users/Somalier:/Somalier brentp/somalier somalier extract -d Somalier/extracted --sites /Somalier/sites.hg38.vcf.gz -f /Somalier/GRCh38.no_alt_analysis_set.fa /Somalier/GLnexus.vcf.gz

and afterwards

docker run -v /Users/Somalier:/Somalier brentp/somalier somalier relate -i -o /Somalier/output_100_samples /Somalier/extracted/'*.somalier'

However, I cannot get the ancestry estimation to work. When running:

docker run -v /Users/Somalier:/Somalier brentp/somalier somalier ancestry --labels /Somalier/ancestry-labels-1kg.tsv /Somalier/1kg-somalier/'*.somalier' ++ /Somalier/extracted/'*.somalier'

I get :
somalier version: 0.2.11
depthview.nim(145) read_extracted
Error: unhandled exception: couldn't open sketch file:/Somalier/1kg-somalier/*.somalier [IOError]

I thought there was a problem with how the wildcard is recognized, but it works with somalier relate...

Thank you !

did you download the thousand genomes files and untar them into the 1kg-somalier directory?

Yes.
I tried to provide a specific file just to see if it works:

docker run -v /Users/Somalier:/Somalier brentp/somalier somalier ancestry --labels /Somalier/ancestry-labels-1kg.tsv /Somalier/1kg-somalier/NA21130.somalier ++ /Somalier/extracted/'*.somalier'

and then it cannot find files for the samples of interest:

somalier version: 0.2.11
depthview.nim(145) read_extracted
Error: unhandled exception: couldn't open sketch file:/Somalier/extracted/*.somalier [IOError]

can you try: ... ++ '/Somalier/extracted/*.somalier'
so the quotes are around the full path? this is a problem of when the glob is getting expanded.

This does not help.

  • For the 1kg files & samples of interest

docker run -v /Users/Somalier:/Somalier brentp/somalier somalier ancestry --labels /Somalier/ancestry-labels-1kg.tsv '/Somalier/1kg-somalier/*.somalier' ++ '/Somalier/extracted/*.somalier'

somalier version: 0.2.11
depthview.nim(145) read_extracted
Error: unhandled exception: couldn't open sketch file:/Somalier/1kg-somalier/*.somalier [IOError]

  • If I provide a single 1kg sample & the samples of interest

docker run -v /Users/Somalier:/Somalier brentp/somalier somalier ancestry --labels /Somalier/ancestry-labels-1kg.tsv /Somalier/1kg-somalier/NA21130.somalier ++ '/Somalier/extracted/*.somalier'

somalier version: 0.2.11
depthview.nim(145) read_extracted
Error: unhandled exception: couldn't open sketch file:/Somalier/extracted/*.somalier [IOError]

maybe try putting the command in a shell script, e.g.

in run.sh:

#!/bin/sh
somalier ancestry --labels /Somalier/ancestry-labels-1kg.tsv '/Somalier/1kg-somalier/*.somalier' ++ '/Somalier/extracted/*.somalier'

then for docker:

docker run -v /Users/Somalier:/Somalier brentp/somalier run.sh

same results:

docker run -v /Users/Somalier:/Somalier brentp/somalier /Somalier/somalier.sh
somalier version: 0.2.11
depthview.nim(145) read_extracted
Error: unhandled exception: couldn't open sketch file:/Somalier/1kg-somalier/*.somalier [IOError]

I also ran it in a bash terminal (previously zsh):

  1. without quotes
    docker run -v /Users/Somalier:/Somalier brentp/somalier somalier ancestry --labels /Somalier/ancestry-labels-1kg.tsv /Somalier/1kg-somalier/*.somalier ++ /Somalier/extracted/*.somalier
    somalier version: 0.2.11
    depthview.nim(145) read_extracted
    Error: unhandled exception: couldn't open sketch file:/Somalier/1kg-somalier/*.somalier [IOError]

  2. with quotes
    docker run -v /Users/Somalier:/Somalier brentp/somalier somalier ancestry --labels /Somalier/ancestry-labels-1kg.tsv '/Somalier/1kg-somalier/*.somalier' ++ '/Somalier/extracted/*.somalier'
    somalier version: 0.2.11
    depthview.nim(145) read_extracted
    Error: unhandled exception: couldn't open sketch file:/Somalier/1kg-somalier/*.somalier [IOError]

I just noticed that I haven't implemented the glob handling in the ancestry code. I'll fix this for the next release. Meanwhile, you'll have to let your shell expand the glob (don't use quotes around it). If that makes the command line too long, then wait for the next release; I should have it out next week.
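
A note on why the unquoted attempt above may still have failed: with Docker, the unquoted glob is expanded by the host shell, where /Somalier/... probably does not exist, so bash passes the pattern through unchanged. The expansion would need to happen in a shell inside the container. The sketch below illustrates the difference without Docker; `count_args` is a hypothetical stand-in for a program that does no glob handling of its own, and the temporary directory and file names are made up for the demonstration.

```shell
#!/bin/sh
# Demo: a quoted glob reaches the program as literal text, while an
# unquoted glob is expanded by the shell before the program starts.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.somalier" "$demo_dir/b.somalier" "$demo_dir/c.somalier"

# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Quoted: the pattern is passed through as one literal argument.
quoted_count=$(count_args "$demo_dir/*.somalier")

# Unquoted: the current shell expands the pattern to the matching files.
unquoted_count=$(count_args "$demo_dir"/*.somalier)

# Expansion done by an inner shell, which is what `sh -c` does when it
# runs inside a container.
inner_count=$(sh -c 'set -- "$1"/*.somalier; echo "$#"' sh "$demo_dir")

echo "quoted:   $quoted_count argument(s)"
echo "unquoted: $unquoted_count argument(s)"
echo "inner sh: $inner_count argument(s)"

rm -rf "$demo_dir"

# Untested sketch for the Docker case, assuming the image provides `sh`
# (the run.sh attempt above suggests it does). The single quotes stop the
# host shell from touching the globs; the container's sh expands them:
#
#   docker run -v /Users/Somalier:/Somalier brentp/somalier sh -c \
#     'somalier ancestry --labels /Somalier/ancestry-labels-1kg.tsv /Somalier/1kg-somalier/*.somalier ++ /Somalier/extracted/*.somalier'
```

The same idea should apply to a script run inside the container: the globs in it would have to be left unquoted so that the container's shell expands them.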