nf-core / mag

Assembly and binning of metagenomes

Home Page: https://nf-co.re/mag

Error with classification

feixiang1209 opened this issue

Description of the bug

Could you please look at the error below, which occurs every time I re-run the pipeline? I was going to send the log file, but there are too many, as I have repeated this run many times. Could you please advise on the cause of this error?

Thanks

-[nf-core/mag] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_MAG:MAG:GTDBTK:GTDBTK_CLASSIFYWF (MEGAHIT-MaxBin2-unclassified-unrefined-M-23-6561_3-022umB_QIA-UDI041-QIA-UDI041_L002_R)'

Caused by:
Process NFCORE_MAG:MAG:GTDBTK:GTDBTK_CLASSIFYWF (MEGAHIT-MaxBin2-unclassified-unrefined-M-23-6561_3-022umB_QIA-UDI041-QIA-UDI041_L002_R) terminated with an error exit status (1)

Command executed:

export GTDBTK_DATA_PATH="${PWD}/database"
if [ --scratch_dir pplacer_tmp != "" ] ; then
mkdir pplacer_tmp
fi

gtdbtk classify_wf
--extension fa
--genome_dir bins
--prefix "gtdbtk.MEGAHIT-MaxBin2-unclassified-unrefined-M-23-6561_3-022umB_QIA-UDI041-QIA-UDI041_L002_R"
--out_dir "${PWD}"
--cpus 10
--skip_ani_screen
--scratch_dir pplacer_tmp
--min_perc_aa 10
--min_af 0.65

mv classify/* .

mv identify/* .

mv align/* .
mv gtdbtk.log "gtdbtk.MEGAHIT-MaxBin2-unclassified-unrefined-M-23-6561_3-022umB_QIA-UDI041-QIA-UDI041_L002_R.log"

mv gtdbtk.warnings.log "gtdbtk.MEGAHIT-MaxBin2-unclassified-unrefined-M-23-6561_3-022umB_QIA-UDI041-QIA-UDI041_L002_R.warnings.log"

find -name gtdbtk.MEGAHIT-MaxBin2-unclassified-unrefined-M-23-6561_3-022umB_QIA-UDI041-QIA-UDI041_L002_R.*.classify.tree | xargs -r gzip # do not fail if .tree is missing

cat <<-END_VERSIONS > versions.yml
"NFCORE_MAG:MAG:GTDBTK:GTDBTK_CLASSIFYWF":
gtdbtk: $(echo $(gtdbtk --version -v 2>&1) | sed "s/gtdbtk: version //; s/ Copyright.*//")
END_VERSIONS

Command exit status:
1

Command output:
[2024-01-08 20:00:33] INFO: Creating concatenated alignment for 80,801 bacterial GTDB and user genomes.
[2024-01-08 20:00:58] INFO: Creating concatenated alignment for 12 bacterial user genomes.
[2024-01-08 20:00:58] INFO: Processing 2 genomes identified as archaeal.
[2024-01-08 20:01:00] INFO: Read concatenated alignment for 4,416 GTDB genomes.
[2024-01-08 20:01:01] TASK: Generating concatenated alignment for each marker.
[2024-01-08 20:01:02] INFO: Completed 2 genomes in 0.04 seconds (46.29 genomes/second).
[2024-01-08 20:01:03] TASK: Aligning 51 identified markers using hmmalign 3.3.2 (Nov 2020).
[2024-01-08 20:01:07] INFO: Completed 51 markers in 2.65 seconds (19.22 markers/second).
[2024-01-08 20:01:07] TASK: Masking columns of archaeal multiple sequence alignment using canonical mask.
[2024-01-08 20:01:12] INFO: Completed 4,418 sequences in 5.24 seconds (843.51 sequences/second).
[2024-01-08 20:01:12] INFO: Masked archaeal alignment from 13,540 to 10,135 AAs.
[2024-01-08 20:01:12] INFO: 0 archaeal user genomes have amino acids in <10.0% of columns in filtered MSA.
[2024-01-08 20:01:12] INFO: Creating concatenated alignment for 4,418 archaeal GTDB and user genomes.
[2024-01-08 20:01:15] INFO: Creating concatenated alignment for 2 archaeal user genomes.
[2024-01-08 20:01:15] INFO: Done.
[2024-01-08 20:01:16] INFO: Using a scratch file for pplacer allocations. This decreases memory usage and performance.
[2024-01-08 20:01:16] TASK: Placing 2 archaeal genomes into reference tree with pplacer using 10 CPUs (be patient).
[2024-01-08 20:01:16] INFO: pplacer version: v1.1.alpha19-0-g807f6f3
[2024-01-08 20:07:11] INFO: Calculating RED values based on reference tree.
[2024-01-08 20:07:12] TASK: Traversing tree to determine classification method.
[2024-01-08 20:07:12] INFO: Completed 2 genomes in 0.00 seconds (7,884.03 genomes/second).
[2024-01-08 20:07:12] TASK: Calculating average nucleotide identity using FastANI (v1.32).
[2024-01-08 20:07:13] INFO: Completed 8 comparisons in 0.97 seconds (8.23 comparisons/second).
[2024-01-08 20:07:14] INFO: 0 genome(s) have been classified using FastANI and pplacer.
[2024-01-08 20:07:14] INFO: Using a scratch file for pplacer allocations. This decreases memory usage and performance.
[2024-01-08 20:07:14] TASK: Placing 12 bacterial genomes into backbone reference tree with pplacer using 10 CPUs (be patient).
[2024-01-08 20:07:14] INFO: pplacer version: v1.1.alpha19-0-g807f6f3
[2024-01-08 20:10:08] INFO: Calculating RED values based on reference tree.
[2024-01-08 20:10:09] INFO: 12 out of 12 have an class assignments. Those genomes will be reclassified.
[2024-01-08 20:10:09] INFO: Using a scratch file for pplacer allocations. This decreases memory usage and performance.
[2024-01-08 20:10:09] TASK: Placing 10 bacterial genomes into class-level reference tree 7 (1/2) with pplacer using 10 CPUs (be patient).
[2024-01-08 20:15:17] INFO: Calculating RED values based on reference tree.
[2024-01-08 20:15:20] TASK: Traversing tree to determine classification method.
[2024-01-08 20:15:20] INFO: Completed 10 genomes in 0.00 seconds (5,375.93 genomes/second).
[2024-01-08 20:15:20] TASK: Calculating average nucleotide identity using FastANI (v1.32).
[2024-01-08 20:15:21] INFO: Completed 18 comparisons in 1.18 seconds (15.21 comparisons/second).
[2024-01-08 20:15:22] INFO: 2 genome(s) have been classified using FastANI and pplacer.
[2024-01-08 20:15:22] INFO: Using a scratch file for pplacer allocations. This decreases memory usage and performance.
[2024-01-08 20:15:22] TASK: Placing 2 bacterial genomes into class-level reference tree 6 (2/2) with pplacer using 10 CPUs (be patient).
[2024-01-08 20:21:21] INFO: Calculating RED values based on reference tree.
[2024-01-08 20:21:24] TASK: Traversing tree to determine classification method.
[2024-01-08 20:21:24] INFO: Completed 2 genomes in 0.00 seconds (5,171.77 genomes/second).
[2024-01-08 20:21:24] TASK: Calculating average nucleotide identity using FastANI (v1.32).
[2024-01-08 20:21:28] INFO: Completed 44 comparisons in 4.00 seconds (11.01 comparisons/second).
[2024-01-08 20:21:28] INFO: 0 genome(s) have been classified using FastANI and pplacer.
[2024-01-08 20:21:29] INFO: Note that Tk classification mode is insufficient for publication of new taxonomic designations. New designations should be based on one or more de novo trees, an example of which can be produced by Tk in de novo mode.
[2024-01-08 20:21:29] INFO: Done.
[2024-01-08 20:21:29] INFO: Removing intermediate files.
[2024-01-08 20:21:29] INFO: Intermediate files removed.
[2024-01-08 20:21:29] INFO: Done.

Command error:

Search for files and perform actions on them.
First failed action stops processing of current file.
Defaults: PATH is current directory, action is '-print'

    -L,-follow  Follow symlinks
    -H          ...on command line only
    -xdev       Don't descend directories on other filesystems
    -maxdepth N Descend at most N levels. -maxdepth 0 applies
                actions to command line arguments only
    -mindepth N Don't act on first N levels
    -depth            Act on directory after traversing it

Actions:
    ( ACTIONS ) Group actions for -o / -a
    ! ACT       Invert ACT's success/failure
    ACT1 [-a] ACT2    If ACT1 fails, stop, else do ACT2
    ACT1 -o ACT2      If ACT1 succeeds, stop, else do ACT2
                Note: -a has higher priority than -o
    -name PATTERN     Match file name (w/o directory name) to PATTERN
    -iname PATTERN    Case insensitive -name
    -path PATTERN     Match path to PATTERN
    -ipath PATTERN    Case insensitive -path
    -regex PATTERN    Match path to regex PATTERN
    -type X           File type is X (one of: f,d,l,b,c,s,p)
    -executable File is executable
    -perm MASK  At least one mask bit (+MASK), all bits (-MASK),
                or exactly MASK bits are set in file's mode
    -mtime DAYS mtime is greater than (+N), less than (-N),
                or exactly N days in the past
    -mmin MINS  mtime is greater than (+N), less than (-N),
                or exactly N minutes in the past
    -newer FILE mtime is more recent than FILE's
    -inum N           File has inode number N
    -user NAME/ID     File is owned by given user
    -group NAME/ID    File is owned by given group
    -size N[bck]      File size is N (c:bytes,k:kbytes,b:512 bytes(def.))
                +/-N: file size is bigger/smaller than N
    -links N    Number of links is greater than (+N), less than (-N),
                or exactly N
    -empty            Match empty file/directory
    -prune            If current file is directory, don't descend into it
If none of the following actions is specified, -print is assumed
    -print            Print file name
    -print0           Print file name, NUL terminated
    -exec CMD ARG ;   Run CMD with all instances of {} replaced by
                file name. Fails if CMD exits with nonzero
    -exec CMD ARG + Run CMD with {} replaced by list of file names
    -delete           Delete current file/directory. Turns on -depth option
    -quit       Exit

Work dir:
/home/diazrur/Documents/MAG_test/work/19/a837a93fc87b6d7b434ba4fb75476e

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh

-- Check '.nextflow.log' file for details


From: Ruben Diaz Rua <ruben.diazrua@kaust.edu.sa>
Sent: Tuesday, January 9, 2024 9:43 AM
To: Xiang Zhao <xiang.zhao@kaust.edu.sa>
Subject: Re: error

nextflow run nf-core/mag --input "/home/diazrur/Documents/MAG_test/M-23-6561_3-022umB_QIA-UDI041-QIA-UDI041_L002_R{1,2}.fastq.gz" --outdir output -profile docker --skip_krona TRUE --cat_db /home/diazrur/Documents/metagenomics_DB/CAT_prepare_20210107.tar.gz --gtdb_db /home/diazrur/Documents//metagenomics_DB/gtdbtk_r214_data.tar.gz --binqc_tool checkm --skip_spades True --skip_spadeshybrid True --skip_concoct True --ancient_dna False --skip_metaeuk True --refine_bins_dastool True --run_gunc True --gtdbtk_pplacer_cpus 40 -c custom.txt -resume


From: Xiang Zhao <xiang.zhao@kaust.edu.sa>
Sent: Tuesday, January 9, 2024 9:39 AM
To: Ruben Diaz Rua <ruben.diazrua@kaust.edu.sa>
Subject: RE: error

From: Ruben Diaz Rua <ruben.diazrua@kaust.edu.sa>
Sent: Thursday, January 4, 2024 10:07 AM
To: Xiang Zhao <xiang.zhao@kaust.edu.sa>
Subject: error


                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
nf-core/mag v2.5.1-ge728900

Core Nextflow options
revision : master
runName : curious_bell
containerEngine : docker
launchDir : /home/diazrur/Documents/Aramco_metagenomes
workDir : /home/diazrur/Documents/Aramco_metagenomes/work
projectDir : /home/diazrur/.nextflow/assets/nf-core/mag
userName : diazrur
profile : docker
configFiles :

Input/output options
input : metadata.csv
outdir : output

Quality control for short reads options
phix_reference : /home/diazrur/.nextflow/assets/nf-core/mag/assets/data/GCA_002596845.1_ASM259684v1_genomic.fna.gz

Quality control for long reads options
lambda_reference : /home/diazrur/.nextflow/assets/nf-core/mag/assets/data/GCA_000840245.1_ViralProj14204_genomic.fna.gz

Taxonomic profiling options
gtdbtk_min_perc_aa : 10
gtdbtk_pplacer_cpus: 40

Assembly options
skip_spades : true
skip_spadeshybrid : true

Gene prediction and annotation options
skip_metaeuk : true

Binning options
skip_concoct : true

Bin quality check options
refine_bins_dastool: true
run_gunc : true

###################

-- Check '.nextflow.log' file for details
(env_nf) diazrur@KW60867:/Documents/Aramco_metagenomes$ vi custom.conf
(env_nf) diazrur@KW60867:/Documents/Aramco_metagenomes$ nextflow run nf-core/mag --input metadata.csv --outdir output -profile docker --skip_spades True --skip_spadeshybrid True --skip_concoct True --ancient_dna False --skip_metaeuk True --refine_bins_dastool True --run_gunc True --gtdbtk_pplacer_cpus 40 -c custom.conf -resume
N E X T F L O W ~ version 23.10.0
Launching https://github.com/nf-core/mag [curious_bell] DSL2 - revision: e728900 [master]

Command used and terminal output

No response

Relevant files

No response

System information

No response

This is the same issue as #547.
Note the unquoted '*' in the pattern passed to the find command.
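
For anyone landing here, a minimal sketch of why the unquoted `*` matters. It assumes (the log suggests this, but it is not confirmed in this thread) that more than one `.classify.tree` file sits in the work dir after the `mv classify/* .` step, since both archaeal and bacterial genomes were placed. The `demo.*` file names and the `count_args` helper are made up for illustration and are not part of the pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical reproduction; file names and count_args are illustrative only.
workdir=$(mktemp -d)
cd "$workdir"

# Two tree files, mimicking one archaeal and one bacterial output.
touch demo.ar53.classify.tree demo.bac120.classify.tree

# Helper that reports how many arguments a command would receive.
count_args() { echo "$#"; }

# Unquoted: the shell expands the glob before the command runs,
# so "-name" is followed by two file names instead of one pattern.
unquoted_count=$(count_args -name demo.*.classify.tree)

# Quoted: the pattern reaches the command untouched as a single argument.
quoted_count=$(count_args -name "demo.*.classify.tree")

echo "unquoted: $unquoted_count args, quoted: $quoted_count args"

# With quoting, find does the matching itself and sees both files.
found_count=$(find . -name "demo.*.classify.tree" | wc -l | tr -d ' ')
echo "find matched $found_count files"
```

A `find` that receives extra file names after `-name` cannot parse them as an expression, which would explain why the usage text is printed under "Command error" even though GTDB-Tk itself finished with "Done.".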

Thanks @muniheart, I will close this in favour of your issue (as the older one). I will address it in the next couple of weeks, as I'm now back from parental leave.