aquaskyline / Clairvoyante

Clairvoyante: a multi-task convolutional deep neural network for variant calling in Single Molecule Sequencing

Is a docker image available?

PhilPalmer opened this issue

Hi,

I'm having problems running Clairvoyante in parallel. I'm using the conda installation with one of the nanopore models on a BAM file generated with minimap2.

I ran the following command:

clairvoyante.py callVarBamParallel \
    --chkpnt_fn learningRate1e-3.epoch999 \
    --ref_fn chr20.fa \
    --bam_fn giab.hg002.2D.bam \
    --sampleName giab.hg002.2D \
    --output_prefix giab.hg002.2D \
    --threshold 0.125 \
    --minCoverage 4 \
    --tensorflowThreads 4 \
    > commands.sh
export CUDA_VISIBLE_DEVICES=""
cat commands.sh | parallel -j4

However, commands.sh is empty, and the following error was repeated many times:

Delay 6 seconds before starting variant calling ...
  Loading model ...
  samtools: error while loading shared libraries: libcrypto.so.1.0.0: cannot open shared object file: No such file or directory
  Failed to load reference seqeunce. Please check if the provided reference fasta chr20.fa and the ctgName chr20 are correct.
  samtools: error while loading shared libraries: libcrypto.so.1.0.0: cannot open shared object file: No such file or directory
  Failed to load reference seqeunce.
  Traceback (most recent call last):
    File "/opt/conda/bin/clairvoyante/callVarBam.py", line 225, in <module>
      main()
    File "/opt/conda/bin/clairvoyante/callVarBam.py", line 221, in main
      Run(args)
    File "/opt/conda/bin/clairvoyante/callVarBam.py", line 138, in Run
      c.CVInstance.wait()
    File "/opt/conda/lib/python2.7/subprocess.py", line 1099, in wait
      pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
    File "/opt/conda/lib/python2.7/subprocess.py", line 125, in _eintr_retry_call
      return func(*args)
    File "/opt/conda/bin/clairvoyante/callVarBam.py", line 30, in CheckRtCode
      c.CTInstance.kill(); c.CVInstance.kill()
    File "/opt/conda/lib/python2.7/subprocess.py", line 1279, in kill
      self.send_signal(signal.SIGKILL)
    File "/opt/conda/lib/python2.7/subprocess.py", line 1269, in send_signal
      os.kill(self.pid, sig)
  OSError: [Errno 3] No such process

I am running this in a docker container I built, lifebitai/clairvoyante:latest.

Dockerfile:

FROM continuumio/miniconda:4.5.4

# Install procps so that Nextflow can poll CPU usage
RUN apt-get update && apt-get install -y procps gcc && apt-get clean -y 
RUN conda install conda=4.6.7

RUN pip install tensorflow==1.9.0 && \
    pip install blosc && \
    pip install intervaltree==2.1.0 && \
    pip install numpy

RUN conda config --add channels conda-forge && \
    conda install -c conda-forge pypy2.7==5.10.0 && \
    conda install -c conda-forge python-blosc==1.8.1 && \
    conda install -c conda-forge intervaltree==2.1.0

RUN wget https://bootstrap.pypa.io/get-pip.py && \
    pypy get-pip.py && \
    pypy -m pip install --no-cache-dir intervaltree==2.1.0

RUN conda config --add channels bioconda && \
    conda install -c bioconda clairvoyante && \
    clairvoyante.py

RUN apt-get install parallel -y

RUN conda install -c bioconda samtools openssl=1.0 && \
    conda install -c bioconda htslib && \
    conda install -c bioconda vcflib

To generate the BAM file, I downloaded s3://giab/data/AshkenazimTrio/HG002_NA24385_son/CORNELL_Oxford_Nanopore/giab.hg002.2D.fastq and aligned it to the hg19 reference genome using minimap2. I then sorted and indexed the alignments before marking duplicates.

The FASTA file is chr20 from hg19. The problem seems to be related to either the ctgName or one of the samtools libraries.

Do you know what the issue is & how it can be resolved?

Thanks in advance. Any help would be much appreciated.

I'm not familiar with the docker image you mentioned, but samtools clearly isn't working in it: the libcrypto.so.1.0.0 shared library it needs is missing.
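One way to confirm this kind of failure is to ask the dynamic loader which shared libraries a binary cannot resolve. The sketch below assumes a Linux container where `ldd` is available; `missing_libs` is a hypothetical helper name, not part of Clairvoyante or samtools.

```shell
#!/bin/sh
# List shared libraries that a binary needs but the dynamic loader cannot find.
# Usage: missing_libs /path/to/binary
# (missing_libs is a hypothetical helper name used for illustration.)
missing_libs() {
    # ldd prints "libfoo.so.N => not found" for each unresolved dependency;
    # keep only the library name in the first column.
    ldd "$1" 2>/dev/null | awk '/not found/ {print $1}'
}

# Check whichever samtools is first on PATH, if there is one.
if command -v samtools >/dev/null 2>&1; then
    missing_libs "$(command -v samtools)"
fi
```

Run inside the affected container, this would be expected to print libcrypto.so.1.0.0, matching the error in the log above. Empty output means every dependency resolved.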

I will build an official docker image and I will let you know when it's done.

A fast workaround is to get a working samtools binary into your environment and point Clairvoyante to it using the --samtools option.
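A minimal sketch of that workaround, reusing the command from the original report. The `--samtools` option is the one named in this comment; check `clairvoyante.py callVarBamParallel --help` for its exact form in your version. The path in `SAMTOOLS_BIN` and the `samtools_works` helper are assumptions made for illustration.

```shell
#!/bin/sh
# Succeed only if the given samtools binary exists and actually starts.
# (samtools_works is a hypothetical helper name.)
samtools_works() {
    [ -x "$1" ] && "$1" --version >/dev/null 2>&1
}

# Assumed location of a known-good samtools; adjust for your system.
SAMTOOLS_BIN="${SAMTOOLS_BIN:-/usr/local/bin/samtools}"

if samtools_works "$SAMTOOLS_BIN"; then
    # Same command as in the report, plus --samtools pointing at the working binary.
    clairvoyante.py callVarBamParallel \
        --chkpnt_fn learningRate1e-3.epoch999 \
        --ref_fn chr20.fa \
        --bam_fn giab.hg002.2D.bam \
        --sampleName giab.hg002.2D \
        --output_prefix giab.hg002.2D \
        --threshold 0.125 \
        --minCoverage 4 \
        --tensorflowThreads 4 \
        --samtools "$SAMTOOLS_BIN" \
        > commands.sh
else
    echo "samtools at $SAMTOOLS_BIN does not run; fix it before calling variants" >&2
fi
```

Checking the binary first avoids generating and running commands against a samtools that fails to load, which is what caused the repeated errors in the report.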

Hi @aquaskyline,

Thanks for your help & prompt response.

Yeah, you were right. Samtools wasn't installed in the container properly. I have now fixed the docker image lifebitai/clairvoyante:latest & edited my earlier comment so that the Dockerfile is now correct.

Everything is now working correctly, thanks for a great tool 😃