nf-core / mag

Description of the bug

Running the nf-core/mag pipeline fails with a MissingMethodException: Nextflow tries to call getFileSystem() on a java.util.LinkedList.

Steps to Reproduce

Command used to run the pipeline:
nextflow run nf-core/mag -profile docker --input 'fastq_files/*.fastq.gz' --outdir mag-results

Program output

N E X T F L O W  ~  version 23.10.0
Launching `https://github.com/nf-core/mag` [mighty_carlsson] DSL2 - revision: ba72349594 [master]
ERROR ~ Unknown method invocation `getFileSystem` on LinkedList type

 -- Check '.nextflow/assets/nf-core/mag/main.nf' at line: 44 or '.nextflow.log' for more details

Extract from .nextflow.log:

Nov-07 03:45:27.972 [main] DEBUG nextflow.Session - Session aborted -- Cause: No signature of method: java.util.LinkedList.getFileSystem() is applicable for argument types: () values: []
Nov-07 03:45:27.986 [main] ERROR nextflow.cli.Launcher - @unknown
groovy.lang.MissingMethodException: No signature of method: java.util.LinkedList.getFileSystem() is applicable for argument types: () values: []

Running with the docker and test profiles worked:

nextflow run nf-core/mag -profile docker,test --outdir mag-test

Environment

Java Version

openjdk version "11.0.20" 2023-07-18 LTS
OpenJDK Runtime Environment (Red_Hat-11.0.20.0.8-1.amzn2.0.1) (build 11.0.20+8-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-11.0.20.0.8-1.amzn2.0.1) (build 11.0.20+8-LTS, mixed mode, sharing)

Nextflow version

version 23.10.0 build 5889
created 15-10-2023 15:07 UTC

System information

Linux ip-172-31-37-92.ec2.internal 5.10.198-187.748.amzn2.x86_64 #1 SMP Tue Oct 24 19:49:54 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Thanks for the report!

Edit: What I wrote initially (that direct raw-read input is no longer accepted) seems to be untrue, so here is my attempt at a correction:

I was initially pointing out that this code block doesn't accept FASTQ data, but actually it doesn't lead anywhere and should be removed:

mag/workflows/mag.nf

Lines 18 to 25 in ba72349

    
    if(hasExtension(params.input, "csv")){
        Channel
            .from(file(params.input))
            .splitCsv(header: true)
            .map { row ->
                    if (row.long_reads) hybrid = true
                }
    }

This part does seem problematic: it says that a CSV is expected, but that does not seem to be checked properly:

mag/nextflow_schema.json

Lines 15 to 24 in ba72349

    
    "input": {
        "type": "string",
        "format": "file-path",
        "exists": true,
        "mimetype": "text/csv",
        "pattern": "^\\S+\\.csv$",
        "description": "Input FastQ files or CSV samplesheet file containing information about the samples in the experiment.",
        "help_text": "Use this to specify the location of your input FastQ files. For example:\n\n```bash\n--input 'path/to/data/sample_*_{1,2}.fastq.gz'\n``` \n\nAlternatively, to assign different groups or to include long reads for hybrid assembly with metaSPAdes, you can specify a CSV samplesheet input file with 5 columns and the following header: sample,group,short_reads_1,short_reads_2,long_reads. See [usage docs](https://nf-co.re/mag/usage#input-specifications).",
        "fa_icon": "fas fa-file-csv"
    },
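
To make the mismatch concrete, the `pattern` from the schema can be tried in isolation. This is only a minimal Python sketch: it assumes the regex is applied to the raw `--input` string, which may not be how the pipeline's validation actually handles the value, and the helper name `matches_schema_pattern` is hypothetical.

```python
import re

# The "pattern" value from nextflow_schema.json, after JSON unescaping.
INPUT_PATTERN = r"^\S+\.csv$"


def matches_schema_pattern(value: str) -> bool:
    """Return True if the string satisfies the schema's `pattern` regex.

    Hypothetical helper for illustration only; it is not part of the pipeline.
    """
    return re.search(INPUT_PATTERN, value) is not None


# A samplesheet path satisfies the pattern ...
print(matches_schema_pattern("samplesheet.csv"))
# ... but the glob from the bug report does not, although the
# description and help_text say FastQ globs are accepted.
print(matches_schema_pattern("fastq_files/*.fastq.gz"))
```

Under that assumption, the schema's `pattern` only admits `.csv` paths, while its `description` and `help_text` also advertise FastQ globs.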

I am not certain what the problem is at this point.
An immediate workaround is to create a samplesheet as described in the docs; the pipeline should then work just fine.
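
For anyone who wants to script that workaround, here is a rough Python sketch that builds a samplesheet with the five-column header quoted in the schema's help text. Several things are assumptions rather than facts from the pipeline: the `<sample>_1.fastq.gz` / `<sample>_2.fastq.gz` naming convention, the placeholder group value `"0"`, the empty `long_reads` column, and the function name `build_samplesheet`. Please check the usage docs for the values the pipeline actually accepts.

```python
import csv
import io
import re
from collections import defaultdict

# Header taken from the help_text in nextflow_schema.json.
HEADER = ["sample", "group", "short_reads_1", "short_reads_2", "long_reads"]

# ASSUMPTION: paired files are named <sample>_1.fastq.gz and <sample>_2.fastq.gz.
PAIR_RE = re.compile(r"^(?P<sample>.+)_(?P<mate>[12])\.fastq\.gz$")


def build_samplesheet(paths, group="0"):
    """Return samplesheet CSV text for the given FASTQ paths.

    Hypothetical helper; `group` is a placeholder value, not a pipeline default.
    """
    pairs = defaultdict(dict)
    for path in paths:
        name = path.rsplit("/", 1)[-1]
        match = PAIR_RE.match(name)
        if match is None:
            continue  # skip files that do not follow the assumed naming
        pairs[match["sample"]][match["mate"]] = path

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(HEADER)
    for sample in sorted(pairs):
        mates = pairs[sample]
        # long_reads is left empty: this sketch covers short reads only.
        writer.writerow([sample, group, mates.get("1", ""), mates.get("2", ""), ""])
    return out.getvalue()


if __name__ == "__main__":
    import glob

    with open("samplesheet.csv", "w") as handle:
        handle.write(build_samplesheet(sorted(glob.glob("fastq_files/*.fastq.gz"))))
```

The resulting file would then be passed as `--input samplesheet.csv` in place of the glob in the original command.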

Thanks to a team effort by @d4straub, @mahesh-panchal and @nvnieuwk, we identified the cause as simply a mis-specified Nextflow schema. So 'direct' input of FASTQ files is indeed still supported 👍

Fix incoming :)

Nice work, but why did you keep the following code in #537:

mag/workflows/mag.nf

Lines 18 to 25 in ba72349

    
    if(hasExtension(params.input, "csv")){
        Channel
            .from(file(params.input))
            .splitCsv(header: true)
            .map { row ->
                    if (row.long_reads) hybrid = true
                }
    }

I thought it's not used; did I miss something?
What about

mag/workflows/mag.nf

Lines 13 to 16 in ba72349

    
    // Check already if long reads are provided
    def hasExtension(it, extension) {
        it.toString().toLowerCase().endsWith(extension.toLowerCase())
    }

That also seems to be dead code?

Good point! I'll investigate

So indeed hasExtension is not necessary anymore. The former still appears to be important, as that value might be used by the initialise function, which has a bunch of QC checks. But I would need more time to investigate that, so I'm a bit loath to do it right now in my limited time.

However, we should review whether we can strip out any of those QC checks, as some should indeed be covered by nextflow_schema.json.

I will open a new issue for this, and a PR removing hasExtension.


Pipeline Execution Failure: MissingMethodException for LinkedList.getFileSystem in Nextflow 23.10.0
