can't run example dataset test
ESDeutekom opened this issue · comments
Hi Romain,
I'm very curious to start using Broccoli. I tried running it but already ran into trouble with the example dataset. I can't see the problem; maybe you can help. Thanks in advance!
:$ python broccoli.py -dir example_dataset
Broccoli v1.0
--- STEP 1: kmer clustering
# parameters
proteomes dir : example_dataset/
kmer size : 100
min size seq : 10
min nb aa : 15
# check input files
6 input files
879 sequences
# kmer clustering
6 proteomes on 1 threads
-> 868 proteins saved for the next step
--- STEP 2: phylomes
# parameters
e_value : 0.001
nb_hit : 6
gap : 0.7
threads : 1
# check input files
6 input fasta files
868 sequences
# build phylomes ... be patient
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/eva/eva2/Programs/miniconda3/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/home/eva/eva2/Programs/miniconda3/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/hosts/linuxhome/scarab/eva2/Programs/Broccoli-master/scripts/broccoli_step2.py", line 217, in process_file
--compress 1 --more-sensitive -e ' + str(evalue) + ' -o ./dir_step2/' + index + '/' + search_output + ' --outfmt 6 qseqid sseqid qstart qend qseq_gapped sseq_gapped 2>&1', shell=True)
File "/home/eva/eva2/Programs/miniconda3/lib/python3.7/subprocess.py", line 395, in check_output
**kwargs).stdout
File "/home/eva/eva2/Programs/miniconda3/lib/python3.7/subprocess.py", line 487, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'diamond blastp --quiet --threads 1 --db ./dir_step2/0.db --max-target-seqs 6 --query ./dir_step1/0.fas --compress 1 --more-sensitive -e 0.001 -o ./dir_step2/0/0_0.gz --outfmt 6 qseqid sseqid qstart qend qseq_gapped sseq_gapped 2>&1' returned non-zero exit status 1.
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "broccoli.py", line 157, in <module>
broccoli_step2.step2_phylomes(evalue, max_per_species, path_diamond, path_fasttree, trim_thres, nb_threads)
File "/hosts/linuxhome/scarab/eva2/Programs/Broccoli-master/scripts/broccoli_step2.py", line 67, in step2_phylomes
multithread_process_file(list_files, nb_threads)
File "/hosts/linuxhome/scarab/eva2/Programs/Broccoli-master/scripts/broccoli_step2.py", line 141, in multithread_process_file
results_2 = tmp_res.get()
File "/home/eva/eva2/Programs/miniconda3/lib/python3.7/multiprocessing/pool.py", line 657, in get
raise self._value
subprocess.CalledProcessError: Command 'diamond blastp --quiet --threads 1 --db ./dir_step2/0.db --max-target-seqs 6 --query ./dir_step1/0.fas --compress 1 --more-sensitive -e 0.001 -o ./dir_step2/0/0_0.gz --outfmt 6 qseqid sseqid qstart qend qseq_gapped sseq_gapped 2>&1' returned non-zero exit status 1.
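The traceback reports only the exit status. Because the command ends with `2>&1` and is run through `subprocess.check_output`, whatever DIAMOND printed should be captured in the exception's `output` attribute. A minimal sketch for re-running such a command and reading that message follows; `run_and_capture` is a hypothetical helper for illustration and is not part of Broccoli.

```python
import subprocess


def run_and_capture(command):
    """Run a shell command and return (exit_status, combined stdout/stderr).

    Hypothetical helper, not part of Broccoli.
    """
    try:
        output = subprocess.check_output(
            command, shell=True, stderr=subprocess.STDOUT, text=True
        )
        return 0, output
    except subprocess.CalledProcessError as error:
        # check_output stores the captured text on the exception object
        return error.returncode, error.output
```

Passing the `diamond blastp ...` command string from the traceback to this helper (from the same working directory) would return the exit status together with the text DIAMOND printed before exiting.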
Hi,
Could you please check that you are using the correct version of DIAMOND?
It should be version 0.9.25 or above.
My best guess is that you are using an older version that does not have the qseq_gapped and sseq_gapped output fields.
Romain
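The version check suggested above can be scripted. This is a rough sketch under two assumptions: that `diamond version` prints a line containing a dotted version number (such as `diamond version 0.9.25`), and that `diamond` is on the `PATH`. The names `parse_version` and `diamond_is_recent_enough` are illustrative and are not part of Broccoli.

```python
import re
import subprocess

MIN_DIAMOND = (0, 9, 25)  # minimum version stated in this thread


def parse_version(text):
    """Return the first dotted version number in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def diamond_is_recent_enough(path_diamond="diamond"):
    """Compare the installed DIAMOND version against MIN_DIAMOND.

    Assumption: `diamond version` prints something like
    "diamond version 0.9.25".
    """
    out = subprocess.check_output(
        [path_diamond, "version"], stderr=subprocess.STDOUT, text=True
    )
    return parse_version(out) >= MIN_DIAMOND
```

Calling `diamond_is_recent_enough()` returns `False` for an installation older than 0.9.25, which would be consistent with the failure on the `qseq_gapped` and `sseq_gapped` fields.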
Hi Romain,
That will probably be it. My conda is giving me grief when updating to the latest DIAMOND, and apparently the update never happened.
Thanks a lot for the info,
Eva
Hi Eva,
I will have a look at the most recent version of DIAMOND available in conda.
For now, the best option is probably to install the right version of DIAMOND locally from its source code.
I'll close this issue.
Please don't hesitate to open a new issue if this problem persists (it shouldn't though) or if you have additional problems/questions.
Romain