CAT_DB process has hardcoded CAT subdirectory names
maxibor opened this issue · comments
Maxime Borry commented
Description of the bug
In the CAT_DB process, the subdirectory names are hardcoded (to database and taxonomy), which is problematic because in newer versions of the CAT database these directories are renamed to db and tax.
Furthermore, symlinking these subdirectories in the process might cause problems when running with Singularity.
ERROR ~ Error executing process > 'NFCORE_MAG:MAG:CAT_DB (20231120_CAT_nr)'
Caused by:
Missing output file(s) `database/*` expected by process `NFCORE_MAG:MAG:CAT_DB (20231120_CAT_nr)`
Command executed:
if [[ 20231120_CAT_nr != *.tar.gz ]]; then
ln -sr `find 20231120_CAT_nr/ -type d -name "*taxonomy*"` taxonomy
ln -sr `find 20231120_CAT_nr/ -type d -name "*database*"` database
else
mkdir catDB
tar -xf 20231120_CAT_nr -C catDB
mv `find catDB/ -type d -name "*taxonomy*"` taxonomy/
mv `find catDB/ -type d -name "*database*"` database/
fi
cat <<-END_VERSIONS > versions.yml
"NFCORE_MAG:MAG:CAT_DB":
tar: $(tar --version 2>&1 | sed -n 1p | sed 's/tar (GNU tar) //')
END_VERSIONS
Command exit status:
0
Command output:
(empty)
Work dir:
/home/lucia_winkler/nf-temp/26/59a97e2b1eb9ced31d84c2abe2d7d9
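The failing branch above could be made tolerant of both naming schemes. What follows is only a minimal sketch, assuming the newer databases use exactly db and tax as the directory names, as described above. find_cat_subdir is a hypothetical helper, not something that exists in nf-core/mag, and whether a symlink or a copy works better under Singularity is left open here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of nf-core/mag): print the first
# subdirectory below $1 whose name matches one of the remaining
# arguments. Patterns are tried in the order given, so the old
# names can be listed before the new ones.
find_cat_subdir() {
    local root="$1"
    shift
    local pattern hit
    for pattern in "$@"; do
        # -mindepth 1 keeps the root directory itself from matching;
        # -print -quit stops at the first hit.
        hit="$(find "$root" -mindepth 1 -type d -name "$pattern" -print -quit)"
        if [[ -n "$hit" ]]; then
            printf '%s\n' "$hit"
            return 0
        fi
    done
    echo "No subdirectory of $root matches any of: $*" >&2
    return 1
}

# Possible use inside the process script (assumed, untested there):
#   ln -sr "$(find_cat_subdir 20231120_CAT_nr "*taxonomy*" "tax")" taxonomy
#   ln -sr "$(find_cat_subdir 20231120_CAT_nr "*database*" "db")" database
```

Because the helper returns a non-zero status when nothing matches, the task would stop at the lookup itself rather than exit with status 0 and only fail later on the missing database/* output.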
Command used and terminal output
nextflow run nf-core/mag -r 2.5.4 \
-profile eva,archgen \
--input /home/lucia_winkler/speleothem/pilot_sequences/2024-04-16_samplesheet.csv \
--outdir results \
--reads_minlength 30 \
--bbnorm \
--igenomes_base "/home/maxime_borry/SDAG_old/04_genomes/" \
--host_genome GRCh38 \
--skip_spades \
--refine_bins_dastool \
--ancient_dna \
--skip_prokka \
--binning_map_mode own \
--busco_db "/r1/people/maxime_borry/02_db/busco_downloads" \
--run_gunc \
--gunc_db /r1/people/maxime_borry/02_db/gunc/gunc_db_progenomes2.1.dmnd \
--postbinning_input both \
--gtdb_db /home/maxime_borry/02_db/gtdb/r207/gtdbtk_r207_v2_data.tar.gz \
--cat_db "/home/maxime_borry/02_db/cat/20231120_CAT_nr" \
-resume \
-with-tower
Relevant files
No response
System information
No response
James A. Fellows Yates commented
Agreed, that module is very old and rather fragile.
We should replace the CAT modules entirely with official ones, I think based on https://github.com/MGXlab/CAT_pack,
which looks MUCH better (although it is not yet on bioconda), as it also describes how to make custom databases etc.