jamiemcg / BUSCO_phylogenomics

BUSCO Phylogenomics | Utility script to construct species phylogenies using BUSCO proteins

Error reading BUSCO files

ohudson1 opened this issue

When I run the BUSCO_phylogenomics.py script, it appears to treat every subdirectory inside each sample directory (run_*) as a separate run and expects each one to contain a folder called busco_sequences.
My input is:
python BUSCO_phylogenomics.py -i busco_runs/ -o BUSCO_tree_results --supermatrix --threads 8
Inside busco_runs/ there are 50 directories named run_samplename (for example, run_FV-NE-02), and each contains the normal BUSCO outputs: hmmer_output, busco_sequences, metaeuk_output, and full_table.tsv.

The log lists each run as three different runs, one per subdirectory:

run_FV-NE-02 busco_runs/run_FV-NE-02/hmmer_output
run_FV-NE-02 busco_runs/run_FV-NE-02/busco_sequences
run_FV-NE-02 busco_runs/run_FV-NE-02/metaeuk_output
No such file or directory: 'busco_runs/run_FV-NE-02/hmmer_output/busco_sequences/single_copy_busco_sequences'

I think the script should look for single_copy_busco_sequences only in "busco_runs/run_FV-NE-02/busco_sequences", not in the other subdirectories. As it stands, instead of recognizing my 50 genome directories, it reports 150.
How can I get it to recognize the busco_sequences folder as the appropriate one?
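
The layout described above can be checked before running the script. The sketch below is not the script's own code: `find_busco_runs` is a hypothetical helper that looks only at the immediate children of the input directory and keeps those that contain busco_sequences/single_copy_busco_sequences.

```python
from pathlib import Path


def find_busco_runs(input_dir):
    """Return (sample_name, sequences_dir) for each direct child of input_dir
    that contains busco_sequences/single_copy_busco_sequences.

    Hypothetical helper for illustration; not part of BUSCO_phylogenomics.py.
    """
    runs = []
    for child in sorted(Path(input_dir).iterdir()):
        if not child.is_dir():
            continue  # skip stray files such as logs
        seq_dir = child / "busco_sequences" / "single_copy_busco_sequences"
        if seq_dir.is_dir():
            runs.append((child.name, seq_dir))
    return runs
```

With the layout from this issue, `find_busco_runs("busco_runs/")` would be expected to return 50 entries; a different count would point to a missing or misplaced busco_sequences folder.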

As in the most recent issue raised, I was able to partially work around this error by adding an extra directory layer.
I moved "run_FV-NE-02" into a new directory that I named FV-NE-02_busco, inside a parent directory test_busco/.

This let the .py script start, but it failed to create the alignment files. The error output was:
ERROR: Alignment not loaded: "/BUSCO_tree_results/supermatrix/alignments/50715at5125.aln" Check the file's content.
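
One way to narrow down an error like this is to check whether the alignment files exist and actually contain sequence records. A minimal sketch, assuming the alignments are FASTA-formatted files with an .aln extension in a single directory; the helper name `find_bad_alignments` is hypothetical and not part of the script.

```python
from pathlib import Path


def find_bad_alignments(aln_dir, suffix=".aln"):
    """Return names of alignment files that are empty or do not start
    with a FASTA header line ('>').

    Assumes FASTA-formatted alignments; hypothetical helper for illustration.
    """
    bad = []
    for path in sorted(Path(aln_dir).glob(f"*{suffix}")):
        with path.open() as handle:
            first_line = handle.readline()
        if not first_line.startswith(">"):
            bad.append(path.name)
    return bad
```

If every file in the alignments directory is reported, the aligner itself is the more likely culprit than any single BUSCO family.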

Please respond or comment if you have corrected a similar problem.
Thanks!

Resolved - I needed an updated MUSCLE version. Previous versions of the .py script also include an option for old MUSCLE versions.
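
MUSCLE 5 changed its command-line interface (for example, `-align`/`-output` replace the `-in`/`-out` options of MUSCLE 3), so it can help to confirm which major version is on the PATH. A rough sketch: the helper names are hypothetical, and the banner formats are assumptions based on typical `muscle -version` output.

```python
import re
import shutil
import subprocess


def muscle_major_version(banner):
    """Extract the major version from a banner such as
    'MUSCLE v3.8.31 ...' or 'muscle 5.1...'. Returns None if not found."""
    match = re.search(r"muscle\s+v?(\d+)", banner, flags=re.IGNORECASE)
    return int(match.group(1)) if match else None


def installed_muscle_major_version(executable="muscle"):
    """Return the major version of the muscle binary on PATH, or None
    if the binary is missing or its banner cannot be parsed."""
    if shutil.which(executable) is None:
        return None
    # Assumption: the binary prints its version banner for '-version'.
    result = subprocess.run(
        [executable, "-version"], capture_output=True, text=True
    )
    return muscle_major_version(result.stdout + result.stderr)
```

Calling `installed_muscle_major_version()` and comparing the result against the version the script expects is a quick check before starting a long run.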