aquaskyline / Clairvoyante

Clairvoyante: a multi-task convolutional deep neural network for variant calling in Single Molecule Sequencing

problem in

sinamajidian opened this issue · comments

Hi
I need a program for calling variants on PacBio reads. FreeBayes is very slow (10 hours for a 1 Mbp reference genome) and misses most of the variants at a coverage of 50. I hope that your package can help me.

I installed it using these lines (I do not have curl):

git clone --depth=1 https://github.com/aquaskyline/Clairvoyante.git
cd Clairvoyante
wget http://www.bio8.cs.hku.hk/trainedModels.tbz
tar -jxf trainedModels.tbz

pip install tensorflow --user
pip install blosc --user
pip install intervaltree --user
pip install numpy --user

wget 'http://www.bio8.cs.hku.hk/training.tar'
tar -xf training.tar

First, "Quick Start with Variant Calling" is a bit unclear to me. After I download the files from the "I need some results now" part, what should I do next to get some demo variant calls?
The next part, titled "Call variants from at known variant sites using a BAM file and a trained model", needs the testingData folder, which I did not download. So I skipped it and ran the part after that.

I ran

python ../clairvoyante/callVar.py --chkpnt_fn ../trainedModels/fullv3-illumina-novoalign-hg001+hg002-hg38/learningRate1e-3.epoch500 --tensor_fn tensor_can_chr21 --call_fn tensor_can_chr21.vcf

and got this:

Loading model ...
From /mnt/scratch/majid001/installed/Clairvoyante/clairvoyante/clairvoyante_v3.py:60: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.conv2d instead.
From /home/majid001/.local/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
From /mnt/scratch/majid001/installed/Clairvoyante/clairvoyante/clairvoyante_v3.py:66: max_pooling2d (from tensorflow.python.layers.pooling) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.max_pooling2d instead.
From /mnt/scratch/majid001/installed/Clairvoyante/clairvoyante/clairvoyante_v3.py:108: dense (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.dense instead.
From /home/majid001/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Restoring parameters from /mnt/scratch/majid001/installed/Clairvoyante/trainedModels/fullv3-illumina-novoalign-hg001+hg002-hg38/learningRate1e-3.epoch500
Traceback (most recent call last):
  File "../clairvoyante/callVar.py", line 266, in <module>
    main()
  File "../clairvoyante/callVar.py", line 262, in main
    Run(args)
  File "../clairvoyante/callVar.py", line 47, in Run
    Test(args, m, utils)
  File "../clairvoyante/callVar.py", line 183, in Test
    PrintVCFHeader(args, call_fh)
  File "../clairvoyante/callVar.py", line 157, in PrintVCFHeader
    print >> call_fh, '##fileformat=VCFv4.1'
TypeError: unsupported operand type(s) for >>: 'builtin_function_or_method' and '_io.TextIOWrapper'. Did you mean "print(<message>, file=<output_stream>)"?

Versions and files in the folder:

python -c 'import tensorflow as tf; print(tf.__version__)'
1.13.1
python --version
Python 3.6.7
/mnt/scratch/majid001/installed/Clairvoyante$ ls
LICENSE.md  clairvoyante     dataPrepScripts  port23.py              python_requirements.txt  training
README.md   clairvoyante.py  jupyter_nb       pypy_requirements.txt  trainedModels

Would you please help me?

Please use python 2.7.
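
For context, the traceback above comes from the Python 2 statement `print >> call_fh, ...`, which Python 3 parses as a right-shift between the built-in `print` function and the file object. The sketch below only illustrates that difference. It targets Python 3, and `print_vcf_header` is a hypothetical stand-in, not the actual `PrintVCFHeader` from `callVar.py`.

```python
import sys


def print_vcf_header(call_fh):
    """Write the VCF file-format line to an open text file handle.

    Hypothetical helper for illustration only; the real function in
    callVar.py is written in Python 2 syntax and may write more lines.
    """
    # Python 2 statement form (a TypeError at runtime under Python 3):
    #     print >> call_fh, '##fileformat=VCFv4.1'
    # Python 3 function form:
    print('##fileformat=VCFv4.1', file=call_fh)


if __name__ == '__main__':
    # Report the interpreter version, then write the header to stdout.
    print('Running under Python %d.%d' % sys.version_info[:2])
    print_vcf_header(sys.stdout)
```

This explains the error but is not a port of the tool; running the scripts under Python 2.7, as advised above, remains the route the thread confirms as working.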

Sorry for asking basic questions. I should have read the instructions carefully. Thank you for your quick reply. It is working now.
Then I put my reference genome and aligned reads in the folder mydata and ran this:

python2 clairvoyante/callVarBamParallel.py \
       --chkpnt_fn trainedModels/fullv3-pacbio-ngmlr-hg001-hg19/learningRate1e-3.epoch999 \
       --ref_fn mydata/ref.fasta \
       --bam_fn mydata/reads.bam \
       --sampleName a \
       --output_prefix b \
       --threshold 0.125 \
       --minCoverage 4 \
       --tensorflowThreads 4 \
       > commands.sh

and got this error: pypy executable not found.

I should install pypy beforehand.

How about callVarBam.py? It seems that it also needs pypy.
Is there a version that works without pypy?

Please install pypy following the README. It doesn't require admin privileges and is not hard.
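
After installing, it can help to confirm that a `pypy` executable is actually visible on `PATH` before re-running the scripts. The sketch below assumes the scripts locate `pypy` through `PATH`, which the "pypy executable not found" message suggests but which has not been checked against the Clairvoyante source. `find_on_path` is a hypothetical helper, not part of the package.

```python
import os


def find_on_path(name, search_path=None):
    """Return the full path of the first executable called `name`, or None.

    `search_path` defaults to the PATH environment variable; passing it
    explicitly makes the function easy to try on a single directory.
    """
    if search_path is None:
        search_path = os.environ.get('PATH', '')
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == '__main__':
    location = find_on_path('pypy')
    if location is None:
        print('pypy not found on PATH; install it as described in the README')
    else:
        print('pypy found at %s' % location)
```

If the check reports that pypy is missing after a user-level install, the install directory most likely still needs to be added to `PATH` in the current shell.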

Thanks, I installed pypy. #10 and #11 are also helpful for me. callVarBam.py works well.