common-workflow-language / schema_salad

Semantic Annotations for Linked Avro Data

Home Page: https://www.commonwl.org/v1.2/SchemaSalad.html

secondaryFiles with pattern cannot be properly loaded when given in workflow inputs

YuxinShi0423 opened this issue

https://github.com/common-workflow-lab/cwljava/issues/81
Parsing a packed CWL document whose workflow inputs specify secondaryFiles raises the error below:

org.w3id.cwl.cwl1_2.utils.ValidationException: Failed to match union type
  Trying 'RecordField'
    the `class` field is not valid because:
      Expected one of [Ljava.lang.String;@7c28c1
  Trying 'RecordField'
    the `class` field is not valid because:
      Expected one of [Ljava.lang.String;@75b3673
    the `inputs` field is not valid because:
      Failed to match union type
        Expected object with Java type of java.util.List but got java.util.LinkedHashMap
        Trying 'RecordField'
          the `secondaryFiles` field is not valid because:
            Missing 'pattern' in secondaryFiles specification entry.
    the `expression` field is not valid because:
      Expected a string.

Here secondaryFiles is given in the standard schema form:

                        "secondaryFiles": [
                            {
                                "pattern": "^.bai",
                                "required": true
                            }
                        ], 

However, reformatting it as a list of strings works:

                         "secondaryFiles": [ "^.bai" ], 

The issue might result from https://github.com/common-workflow-lab/cwljava/blob/fd2d2bb0652e9d57ecee8c380157cc0f3186d636/src/main/java/org/w3id/cwl/cwl1_2/utils/SecondaryFilesDslLoader.java#L38-L44

There the pattern and required entries are removed from the map, which modifies the source doc in place. As a result, when unionloader.java iterates over its candidate loaders, the later candidates can no longer find those entries.
For example, when testing I noticed that the CommandInputParameter loader processed the inputs first and removed pattern and required from each secondaryFiles entry. When the workflow input parameter loader later tried to parse the same doc, it found only an empty map.
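
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not cwljava's actual code: the class UnionMutationSketch and all of its methods are hypothetical names invented for illustration, and it assumes only what the report states, namely that a loader consumes pattern and required by removing them from the map it is given. It shows a union loader failing when candidates share one map, and succeeding when each candidate receives its own copy.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only; none of these names exist in cwljava.
public class UnionMutationSketch {

    // Reads a secondaryFiles entry destructively: the keys are removed
    // from the caller's map, mimicking the in-place modification.
    static String destructiveLoad(Map<String, Object> entry) {
        Object pattern = entry.remove("pattern");
        entry.remove("required");
        if (pattern == null) {
            throw new IllegalArgumentException(
                "Missing 'pattern' in secondaryFiles specification entry.");
        }
        return (String) pattern;
    }

    // First union candidate: consumes the entry, then rejects the document
    // for an unrelated reason (for example, a different `class` value).
    static String firstCandidate(Map<String, Object> entry) {
        destructiveLoad(entry);
        throw new IllegalArgumentException("the `class` field is not valid");
    }

    // Second union candidate: the one that should have matched.
    static String secondCandidate(Map<String, Object> entry) {
        return destructiveLoad(entry);
    }

    // Tries each candidate in order. With copyPerCandidate set, every
    // candidate gets its own shallow copy, so removals cannot leak.
    static String loadUnion(Map<String, Object> entry, boolean copyPerCandidate) {
        List<Function<Map<String, Object>, String>> candidates = new ArrayList<>();
        candidates.add(UnionMutationSketch::firstCandidate);
        candidates.add(UnionMutationSketch::secondCandidate);

        List<String> errors = new ArrayList<>();
        for (Function<Map<String, Object>, String> candidate : candidates) {
            Map<String, Object> input =
                copyPerCandidate ? new LinkedHashMap<String, Object>(entry) : entry;
            try {
                return candidate.apply(input);
            } catch (IllegalArgumentException ex) {
                errors.add(ex.getMessage());
            }
        }
        throw new IllegalStateException("Failed to match union type: " + errors);
    }

    public static void main(String[] args) {
        Map<String, Object> entry = new LinkedHashMap<>();
        entry.put("pattern", ".also");
        entry.put("required", true);

        // Copying per candidate: the second candidate still sees the entry.
        System.out.println(loadUnion(entry, true));

        // Sharing one map: the first candidate empties it, so the union fails.
        try {
            loadUnion(entry, false);
        } catch (IllegalStateException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```

A shallow copy is enough in this sketch because the entry is a flat map. A real document is nested, so an actual fix would likely need either a deep copy per candidate or a loader that reads the entries without removing them.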

secondaryFiles on tool inputs seem to work, as do those in the short format at both the workflow and tool level.

The test cwl.json:

{
    "cwlVersion": "v1.2",
    "$graph": [
        {
            "id": "main",
            "class": "Workflow",
            "inputs": [
                {
                    "id": "command",
                    "type": "string"
                },
                {
                    "id": "wf_file_input",
                    "type": "File",
                    "secondaryFiles": [
                        {
                            "pattern": ".also",
                            "required": true
                        }
                    ]
                },
                {
                    "id": "wf_file_input_array",
                    "type": {
                        "type": "array",
                        "items": "File"
                    },
                    "secondaryFiles": [
                        {
                            "pattern": ".also",
                            "required": true
                        }
                    ]
                }
            ],
            "outputs": [
                {
                    "id": "the_answer",
                    "type": "string",
                    "outputSource": "run_tool/the_answer"
                }
            ],
            "steps": [
                {
                    "id": "run_tool",
                    "run": "#cwl_secondary_files_workflow_tool",
                    "in": {
                        "command": "command",
                        "f": "wf_file_input",
                        "fs": "wf_file_input_array"
                    },
                    "out": [
                        {
                            "id": "the_answer"
                        }
                    ]
                }
            ]
        },
        {
            "id": "cwl_secondary_files_workflow_tool",
            "class": "CommandLineTool",
            "requirements": [
                {
                    "class": "InlineJavascriptRequirement"
                }
            ],
            "hints": {
                "DockerRequirement": {
                    "dockerPull": "debian:stretch-slim"
                },
                "NetworkAccess": {
                    "networkAccess": true
                },
                "LoadListingRequirement": {
                    "loadListing": "deep_listing"
                }
            },
            "inputs": [
                {
                    "id": "command",
                    "type": "string"
                },
                {
                    "id": "f",
                    "type": "File",
                    "inputBinding": {
                        "position": 2
                    }
                },
                {
                    "id": "fs",
                    "type": {
                        "type": "array",
                        "items": "File",
                        "inputBinding": {
                            "position": 3
                        }
                    }
                }
            ],
            "outputs": {
                "the_answer": {
                    "type": "string",
                    "outputBinding": {
                        "outputEval": "${ return \"\\$(\" + 42 + \")\"; }"
                    }
                }
            },
            "baseCommand": [],
            "arguments": [
                "bash",
                "-c",
                "$(inputs.command)"
            ]
        }
    ]
}

Please let me know if there is an estimated time for it to be resolved. Thanks!

Hi @YuxinShi0423, there is currently no estimated time for it to be fixed. We invite DNANexus to submit a pull request with a fix. I also understand that a community member who has worked with DNANexus in the past has reached out about possibly fixing it under contract.