common-workflow-language / cwljava

Java SDK for the Common Workflow Language standards

secondaryFiles with pattern cannot be parsed when given in workflow inputs

YuxinShi0423 opened this issue

Parsing a packed CWL document that has secondaryFiles given in the workflow inputs raises the error below:

org.w3id.cwl.cwl1_2.utils.ValidationException: Failed to match union type
  Trying 'RecordField'
    the `class` field is not valid because:
      Expected one of [Ljava.lang.String;@7c28c1
  Trying 'RecordField'
    the `class` field is not valid because:
      Expected one of [Ljava.lang.String;@75b3673
    the `inputs` field is not valid because:
      Failed to match union type
        Expected object with Java type of java.util.List but got java.util.LinkedHashMap
        Trying 'RecordField'
          the `secondaryFiles` field is not valid because:
            Missing 'pattern' in secondaryFiles specification entry.
    the `expression` field is not valid because:
      Expected a string.

Here the secondaryFiles entry is given in the standard schema form, as an object with pattern and required:

                        "secondaryFiles": [
                            {
                                "pattern": "^.bai",
                                "required": true
                            }
                        ], 

However, reformatting it as a list of strings works:

                         "secondaryFiles": [ "^.bai" ], 
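
As a stopgap, the rewrite to the string form could be done on the parsed JSON before handing it to cwljava. The following is only a sketch of that idea: the class and method names are hypothetical and not part of cwljava. It assumes the secondaryFiles DSL convention that a trailing `?` on the string means `required: false`, and that a plain string on an input parameter is treated as required. Entries whose `required` is an expression, or whose pattern already ends in `?`, are left untouched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of cwljava: rewrites object-form
// secondaryFiles entries of an INPUT parameter into the string form.
final class SecondaryFilesShortForm {

    /** Returns a new list; the input list and its maps are not modified. */
    static List<Object> toShortForm(List<?> secondaryFiles) {
        List<Object> out = new ArrayList<>();
        for (Object item : secondaryFiles) {
            out.add(shorten(item));
        }
        return out;
    }

    static Object shorten(Object item) {
        if (!(item instanceof Map)) {
            return item; // already a string (or something we do not understand)
        }
        Map<?, ?> entry = (Map<?, ?>) item;
        Object pattern = entry.get("pattern");
        Object required = entry.get("required");
        if (!(pattern instanceof String) || ((String) pattern).endsWith("?")) {
            return item; // a trailing '?' would be misread in the string form
        }
        if (required == null || Boolean.TRUE.equals(required)) {
            return pattern; // assumption: inputs default to required
        }
        if (Boolean.FALSE.equals(required)) {
            return pattern + "?"; // DSL suffix for required: false
        }
        return item; // expression-valued 'required' has no string form
    }
}
```

Applied to the entry above, `toShortForm` yields a list containing just the string "^.bai".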

The issue might result from https://github.com/common-workflow-lab/cwljava/blob/fd2d2bb0652e9d57ecee8c380157cc0f3186d636/src/main/java/org/w3id/cwl/cwl1_2/utils/SecondaryFilesDslLoader.java#L38-L44

There, the pattern and required entries are removed from the map, which modifies the source doc. When unionloader.java then iterates over the candidate loaders, the later candidates can no longer find those entries.
For example, when testing I noticed that the CommandInputParameter loader processed the inputs first and removed pattern and required from the secondaryFiles entry. When the workflow input parameter loader later tried to parse the same doc, it found only an empty map.
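
The suspected mechanism can be reproduced in isolation. This is a minimal sketch, not the cwljava or schema-salad code: `UnionMutationDemo`, `destructiveLoad`, and `loadUnion` are hypothetical names, and the first candidate is made to reject the document artificially so that the union falls through to the second one. It compares sharing one map across candidates with giving each candidate its own shallow copy.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical reproduction of the suspected bug; none of these names exist in cwljava.
final class UnionMutationDemo {

    /** Reads "pattern" and "required" by REMOVING them from the map it is given. */
    static String destructiveLoad(Map<String, Object> entry) {
        Object pattern = entry.remove("pattern");
        entry.remove("required");
        if (pattern == null) {
            throw new IllegalArgumentException(
                "Missing 'pattern' in secondaryFiles specification entry.");
        }
        return (String) pattern;
    }

    /** Tries two candidates in order, like a union loader would. */
    static String loadUnion(Map<String, Object> entry, boolean copyPerCandidate) {
        try {
            // Candidate 1 parses the entry, then rejects the document for an
            // unrelated reason (standing in for a non-matching type).
            destructiveLoad(copyPerCandidate ? new LinkedHashMap<>(entry) : entry);
            throw new IllegalArgumentException("candidate 1 does not match");
        } catch (IllegalArgumentException e) {
            // Fall through to the next candidate.
        }
        // Candidate 2 would match, provided the entry is still intact.
        return destructiveLoad(copyPerCandidate ? new LinkedHashMap<>(entry) : entry);
    }

    static Map<String, Object> sampleEntry() {
        Map<String, Object> entry = new LinkedHashMap<>();
        entry.put("pattern", ".also");
        entry.put("required", true);
        return entry;
    }

    public static void main(String[] args) {
        try {
            loadUnion(sampleEntry(), false);
        } catch (IllegalArgumentException e) {
            System.out.println("shared map: " + e.getMessage());
        }
        System.out.println("copied map: " + loadUnion(sampleEntry(), true));
    }
}
```

With the shared map, the second candidate fails with the same "Missing 'pattern'" message as in the trace above. With a copy per candidate it returns the pattern. Whether copying, or reading without removing, is the right fix for the generated loaders is a separate question.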

secondaryFiles given as tool inputs seem to work, as do the ones in the short (string) format at both the workflow and tool level.

The test cwl.json:

{
    "cwlVersion": "v1.2",
    "$graph": [
        {
            "id": "main",
            "class": "Workflow",
            "inputs": [
                {
                    "id": "command",
                    "type": "string"
                },
                {
                    "id": "wf_file_input",
                    "type": "File",
                    "secondaryFiles": [
                        {
                            "pattern": ".also",
                            "required": true
                        }
                    ]
                },
                {
                    "id": "wf_file_input_array",
                    "type": {
                        "type": "array",
                        "items": "File"
                    },
                    "secondaryFiles": [
                        {
                            "pattern": ".also",
                            "required": true
                        }
                    ]
                }
            ],
            "outputs": [
                {
                    "id": "the_answer",
                    "type": "string",
                    "outputSource": "run_tool/the_answer"
                }
            ],
            "steps": [
                {
                    "id": "run_tool",
                    "run": "#cwl_secondary_files_workflow_tool",
                    "in": {
                        "command": "command",
                        "f": "wf_file_input",
                        "fs": "wf_file_input_array"
                    },
                    "out": [
                        {
                            "id": "the_answer"
                        }
                    ]
                }
            ]
        },
        {
            "id": "cwl_secondary_files_workflow_tool",
            "class": "CommandLineTool",
            "requirements": [
                {
                    "class": "InlineJavascriptRequirement"
                }
            ],
            "hints": {
                "DockerRequirement": {
                    "dockerPull": "debian:stretch-slim"
                },
                "NetworkAccess": {
                    "networkAccess": true
                },
                "LoadListingRequirement": {
                    "loadListing": "deep_listing"
                }
            },
            "inputs": [
                {
                    "id": "command",
                    "type": "string"
                },
                {
                    "id": "f",
                    "type": "File",
                    "inputBinding": {
                        "position": 2
                    }
                },
                {
                    "id": "fs",
                    "type": {
                        "type": "array",
                        "items": "File",
                        "inputBinding": {
                            "position": 3
                        }
                    }
                }
            ],
            "outputs": {
                "the_answer": {
                    "type": "string",
                    "outputBinding": {
                        "outputEval": "${ return \"\\$(\" + 42 + \")\"; }"
                    }
                }
            },
            "baseCommand": [],
            "arguments": [
                "bash",
                "-c",
                "$(inputs.command)"
            ]
        }
    ]
}

Please let me know if there is an estimated time for it to be fixed. Thanks!

Hi @YuxinShi0423

The cwljava package is the output of the schema-salad code generator from the CWL specification.

https://github.com/common-workflow-language/schema_salad/blob/main/schema_salad/java_codegen.py

So if there is a bug, it would probably need to be fixed upstream in the code generator, and the cwljava code regenerated.

Thanks! @tetron
I filed a new issue in schema_salad and will close this one.
common-workflow-language/schema_salad#525