common-workflow-language / cwljava

Java SDK for the Common Workflow Language standards

parsing CWL v1.0 & v1.1

svonworl opened this issue · comments

Recently, we integrated cwljava v1.0 into Dockstore, and during testing, we found some workflows that cause the parser to throw. In our webservice, we use our own preprocessor to combine the various component CWL files into one big CWL represented by Maps/Lists, and parse it with cwljava here:
https://github.com/dockstore/dockstore/blob/67f4547e771864cafacdc1c92fa7bd47261e32da/dockstore-webservice/src/main/java/io/dockstore/webservice/languages/CWLHandler.java#L364

The following workflows cause loadDocument to throw a ValidationException:

https://github.com/ICGC-TCGA-PanCancer/OxoG-Dockstore-Tools/tree/master
primary descriptor: /oxog_varbam_annotate_wf.cwl

https://github.com/h3abionet/h3agatk/tree/1.0.1
primary descriptor: /workflows/GATK/GATK-complete-WES-Workflow-h3abionet.cwl

The first workflow contains a SchemaDefRequirement, and the parser appears to have trouble parsing the type references (TumourType.yaml#TumourType, etc.). When I change the type references to int, the workflow parses successfully.

Judging from the exception message, the second workflow seems to be failing for a different reason, but I haven't pinpointed what, exactly. It is possible that the workflow is not valid, but a cursory inspection didn't turn up any problems.

The exception messages are pretty big, so I put them and some stack trace info in the comments.

Please let me know if you need any more info. Thanks!

The first workflow's exception message and stack trace are more than 600 lines long. Below is an abbreviated version:

! org.w3id.cwl.cwl1_2.utils.ValidationException: Failed to match union type
!   Trying 'RecordField'
!     the `class` field is not valid because:
!       Expected one of [Ljava.lang.String;@270f831a
!     the `inputs` field is not valid because:
!       Failed to match union type
!         Expected object with Java type of java.util.List but got java.util.HashMap
!         Trying 'RecordField'
!           the `type` field is not valid because:
!             Failed to match union type
!               Expected object with Java type of java.lang.String but got java.util.LinkedHashMap
!               Expected object with Java type of java.lang.String but got java.util.LinkedHashMap
!               Trying 'RecordField'
!                 the `type` field is not valid because:
!                   Expected one of [Ljava.lang.String;@65553746
!               Trying 'RecordField'
!                 the `symbols` field is not valid because:
!                   Expected object with Java type of java.util.List but got null
!                 the `type` field is not valid because:
!                   Expected one of [Ljava.lang.String;@61bee517
!               Trying 'RecordField'

[  many lines deleted ]

!               Trying 'RecordField'
!                 the `symbols` field is not valid because:
!                   Expected object with Java type of java.util.List but got null
!                 the `type` field is not valid because:
!                   Expected one of [Ljava.lang.String;@61bee517
!               Trying 'RecordField'
!                 the `items` field is not valid because:
!                   TumourType.yaml#TumourType
!               Expected object with Java type of java.lang.String but got java.util.LinkedHashMap
!               Expected object with Java type of java.util.List but got java.util.LinkedHashMap
!   Expected object with Java type of java.util.List but got java.util.LinkedHashMap
! at org.w3id.cwl.cwl1_2.utils.UnionLoader.load(UnionLoader.java:31)
! at org.w3id.cwl.cwl1_2.utils.Loader.documentLoad(Loader.java:41)
! at org.w3id.cwl.cwl1_2.utils.RootLoader.loadDocument(RootLoader.java:18)
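A side note on reading the trace above: fragments such as `[Ljava.lang.String;@270f831a` are what Java prints when a `String[]` is concatenated into a message directly, so the list of accepted values is not visible. The sketch below is only an illustration of that behaviour, not cwljava code; the class name, method names, and the sample values are hypothetical.

```java
import java.util.Arrays;

public class ArrayMessageDemo {
    // Concatenating an array uses Object.toString(), which yields
    // a type tag plus an identity hash, e.g. "[Ljava.lang.String;@270f831a".
    static String defaultMessage(String[] allowed) {
        return "Expected one of " + allowed;
    }

    // Arrays.toString() renders the elements themselves.
    static String readableMessage(String[] allowed) {
        return "Expected one of " + Arrays.toString(allowed);
    }

    public static void main(String[] args) {
        // Hypothetical sample values, chosen only for illustration.
        String[] allowed = {"CommandLineTool", "Workflow"};
        System.out.println(defaultMessage(allowed));
        System.out.println(readableMessage(allowed));
    }
}
```

The hash after the `@` does identify which array is meant, so lines that share the same hash in the trace refer to the same set of expected values.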

The second workflow's exception message and stack trace:

! org.w3id.cwl.cwl1_2.utils.ValidationException: Failed to match union type
!   Trying 'RecordField'
!     the `class` field is not valid because:
!       Expected one of [Ljava.lang.String;@176252b7
!   Trying 'RecordField'
!     the `class` field is not valid because:
!       Expected one of [Ljava.lang.String;@6668dafa
!     the `expression` field is not valid because:
!       Expected a string.
!   Trying 'RecordField'
!     the `steps` field is not valid because:
!       Failed to match union type
!         Expected object with Java type of java.util.List but got java.util.HashMap
!         Trying 'RecordField'
!           the `run` field is not valid because:
!             Failed to match union type
!               Expected object with Java type of java.lang.String but got java.util.LinkedHashMap
!               Trying 'RecordField'
!                 the `class` field is not valid because:
!                   Expected one of [Ljava.lang.String;@176252b7
!               Trying 'RecordField'
!                 the `class` field is not valid because:
!                   Expected one of [Ljava.lang.String;@6668dafa
!                 the `expression` field is not valid because:
!                   Expected a string.
!               Trying 'RecordField'
!                 the `steps` field is not valid because:
!                   Failed to match union type
!                     Expected object with Java type of java.util.List but got java.util.HashMap
!                     Trying 'RecordField'
!                       the `run` field is not valid because:
!                         Failed to match union type
!                           Expected object with Java type of java.lang.String but got java.util.LinkedHashMap
!                           Trying 'RecordField'
!                             the `cwlVersion` field is not valid because:
!                               Expected one of [Ljava.lang.String;@5871ac54
!                           Trying 'RecordField'
!                             the `class` field is not valid because:
!                               Expected one of [Ljava.lang.String;@6668dafa
!                             the `cwlVersion` field is not valid because:
!                               Expected one of [Ljava.lang.String;@5871ac54
!                             the `expression` field is not valid because:
!                               Expected a string.
!                           Trying 'RecordField'
!                             the `class` field is not valid because:
!                               Expected one of [Ljava.lang.String;@13467c17
!                             the `cwlVersion` field is not valid because:
!                               Expected one of [Ljava.lang.String;@5871ac54
!                             the `steps` field is not valid because:
!                               Expected object with Java type of java.util.List but got null
!                           Trying 'RecordField'
!                             the `class` field is not valid because:
!                               Expected one of [Ljava.lang.String;@1f20d341
!                             the `cwlVersion` field is not valid because:
!                               Expected one of [Ljava.lang.String;@5871ac54
!               Trying 'RecordField'
!                 the `class` field is not valid because:
!                   Expected one of [Ljava.lang.String;@1f20d341
!   Trying 'RecordField'
!     the `class` field is not valid because:
!       Expected one of [Ljava.lang.String;@1f20d341
!   Expected object with Java type of java.util.List but got java.util.LinkedHashMap
! at org.w3id.cwl.cwl1_2.utils.UnionLoader.load(UnionLoader.java:31)
! at org.w3id.cwl.cwl1_2.utils.Loader.documentLoad(Loader.java:41)
! at org.w3id.cwl.cwl1_2.utils.RootLoader.loadDocument(RootLoader.java:18)

Hello @svonworl; cwljava currently supports only CWL v1.2. If you'd like to parse CWL v1.1 and CWL v1.0 documents, support for those versions will have to be added as separate packages. You'll likely also want code that dispatches to the correct loader based on the cwlVersion:

https://github.com/ICGC-TCGA-PanCancer/OxoG-Dockstore-Tools/blob/b38a8a4785746b8267913ea5389e21ae6dc921a3/oxog_varbam_annotate_wf.cwl#L3
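A minimal sketch of such a dispatch step, assuming the document has already been preprocessed into a `Map` as described in the issue. The class `CwlVersionDispatcher` and its `register`/`load` methods are hypothetical and not part of cwljava; the per-version loaders are passed in as plain functions so that nothing here depends on a particular cwljava signature.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class CwlVersionDispatcher {
    // Hypothetical registry: cwlVersion string -> loader for that version.
    private final Map<String, Function<Map<String, Object>, Object>> loaders = new HashMap<>();

    // Register a loader for one cwlVersion value, e.g. "v1.0".
    public void register(String cwlVersion, Function<Map<String, Object>, Object> loader) {
        loaders.put(cwlVersion, loader);
    }

    // Read the top-level cwlVersion and hand the document to the matching loader.
    public Object load(Map<String, Object> document) {
        Object version = document.get("cwlVersion");
        if (!(version instanceof String)) {
            throw new IllegalArgumentException("Missing or non-string cwlVersion");
        }
        Function<Map<String, Object>, Object> loader = loaders.get(version);
        if (loader == null) {
            throw new IllegalArgumentException("Unsupported cwlVersion: " + version);
        }
        return loader.apply(document);
    }
}
```

In practice each registered function would wrap the `RootLoader.loadDocument` call of the package generated for that CWL version. Failing early on an unknown version gives a short, direct error instead of a long union-type mismatch.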

@mr-c, are you planning to add support for 1.1 and 1.0? How much work would it be if we wanted to contribute that support ourselves (I haven't really looked at the codebase)?

@coverbeck I took a quick stab at CWL v1.1 in #105 (including a bit of documentation).

Would you like to take over the branch and continue with adding CWL v1.0 as well?

@mr-c, at this point we're still trying to figure out how/if we want to integrate cwljava with Dockstore. We can't commit to any work, at least not yet.

414 files in the PR! :) Presumably most of those are generated? Otherwise it seems like a massive undertaking. :)

It is almost entirely automated, yes 😅

See https://github.com/common-workflow-language/cwljava/pull/105/files

Support for CWL v1.1 was added in #105; I can do the same for CWL v1.0 if someone requests it.