rderelle / Broccoli

orthology assignment using phylogenetic and network analyses

can't run example dataset test

ESDeutekom opened this issue

Hi Romain,
I'm very curious to start using Broccoli. I tried running it but ran into trouble right away with the example dataset. I can't see what the problem is; maybe you can help. Thanks in advance!

:$ python broccoli.py -dir example_dataset

            Broccoli v1.0


 --- STEP 1: kmer clustering

 # parameters
 proteomes dir : example_dataset/
 kmer size     : 100
 min size seq  : 10
 min nb aa     : 15

 # check input files
 6 input files
 879 sequences

 # kmer clustering
 6 proteomes on 1 threads
 -> 868 proteins saved for the next step


 --- STEP 2: phylomes

 # parameters
 e_value  : 0.001
 nb_hit   : 6
 gap      : 0.7
 threads  : 1

 # check input files
 6 input fasta files
 868 sequences

 # build phylomes ... be patient
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/eva/eva2/Programs/miniconda3/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/eva/eva2/Programs/miniconda3/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/hosts/linuxhome/scarab/eva2/Programs/Broccoli-master/scripts/broccoli_step2.py", line 217, in process_file
    --compress 1 --more-sensitive -e ' + str(evalue) + ' -o ./dir_step2/' + index + '/' + search_output + ' --outfmt 6 qseqid sseqid qstart qend qseq_gapped sseq_gapped 2>&1', shell=True)
  File "/home/eva/eva2/Programs/miniconda3/lib/python3.7/subprocess.py", line 395, in check_output
    **kwargs).stdout
  File "/home/eva/eva2/Programs/miniconda3/lib/python3.7/subprocess.py", line 487, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'diamond blastp --quiet --threads 1 --db ./dir_step2/0.db --max-target-seqs 6 --query ./dir_step1/0.fas                  --compress 1 --more-sensitive -e 0.001 -o ./dir_step2/0/0_0.gz --outfmt 6 qseqid sseqid qstart qend qseq_gapped sseq_gapped 2>&1' returned non-zero exit status 1.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "broccoli.py", line 157, in <module>
    broccoli_step2.step2_phylomes(evalue, max_per_species, path_diamond, path_fasttree, trim_thres, nb_threads)
  File "/hosts/linuxhome/scarab/eva2/Programs/Broccoli-master/scripts/broccoli_step2.py", line 67, in step2_phylomes
    multithread_process_file(list_files, nb_threads)
  File "/hosts/linuxhome/scarab/eva2/Programs/Broccoli-master/scripts/broccoli_step2.py", line 141, in multithread_process_file
    results_2 = tmp_res.get()
  File "/home/eva/eva2/Programs/miniconda3/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
subprocess.CalledProcessError: Command 'diamond blastp --quiet --threads 1 --db ./dir_step2/0.db --max-target-seqs 6 --query ./dir_step1/0.fas                  --compress 1 --more-sensitive -e 0.001 -o ./dir_step2/0/0_0.gz --outfmt 6 qseqid sseqid qstart qend qseq_gapped sseq_gapped 2>&1' returned non-zero exit status 1.
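The command in the traceback ends with `2>&1`, so whatever DIAMOND wrote before exiting was captured by `subprocess.check_output` and stored on the exception instead of being shown in the terminal. A minimal sketch of how that message could be surfaced is below. The helper name `run_and_report` is hypothetical and not part of Broccoli; it only illustrates reading `CalledProcessError.output`.

```python
import subprocess
import sys


def run_and_report(command):
    """Run a shell command and return its combined stdout/stderr as text.

    Hypothetical helper (not part of Broccoli). On a non-zero exit status,
    the captured output of the tool is printed before the error is re-raised,
    so the tool's own message is visible next to the Python traceback.
    """
    try:
        return subprocess.check_output(
            command,
            shell=True,
            stderr=subprocess.STDOUT,   # merge stderr into the captured output
            universal_newlines=True,    # decode bytes to str
        )
    except subprocess.CalledProcessError as err:
        print("command failed with exit status", err.returncode, file=sys.stderr)
        print(err.output, file=sys.stderr)
        raise
```

Pasting the `diamond blastp ...` command from the traceback directly into a shell, without `--quiet`, is an equally simple way to see what DIAMOND itself reports.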

Hi,

Could you please check that you are using the correct version of DIAMOND?
It should be version 0.9.25 or above.

My best guess is that you are using an older version that does not have the qseq_gapped and sseq_gapped output fields.

Romain
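The version check suggested above can be scripted. The sketch below is not part of Broccoli: the function names are hypothetical, and it assumes that `diamond version` prints a line containing a dotted version number such as `0.9.24`. The minimum of 0.9.25 is the one stated in this thread.

```python
import re
import subprocess

MIN_DIAMOND = (0, 9, 25)  # minimum version stated in this thread


def parse_version(text):
    """Return the first dotted version number found in text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version number found in: " + repr(text))
    return tuple(int(part) for part in match.groups())


def diamond_is_recent_enough(path_diamond="diamond"):
    """Check the installed DIAMOND against MIN_DIAMOND.

    Assumption: `diamond version` prints something like
    'diamond version 0.9.24'; only the number is used.
    """
    output = subprocess.check_output(
        [path_diamond, "version"],
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_version(output) >= MIN_DIAMOND
```

Comparing tuples of integers rather than strings avoids the pitfall where `"0.9.9"` sorts after `"0.9.25"` lexicographically.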

Hi Romain,

That is probably it. Conda has been giving me grief when updating to the latest DIAMOND, and apparently the update never actually happened.
Thanks a lot for the info,
Eva

Hi Eva,

I will have a look at the most recent version of DIAMOND available in conda.
For now, the best option is probably to install the right version of DIAMOND locally from its source code.

I'll close this issue.
Please don't hesitate to open a new issue if this problem persists (it shouldn't though) or if you have additional problems/questions.

Romain