Crash when reading ML prediction workflow
simleo opened this issue · comments
Simone Leo commented
Trying to read the DeepHealth tissue/tumor prediction workflow leads to a crash:
[simleo@neuron:tmp]$ python3 -m venv venv
[simleo@neuron:tmp]$ source venv/bin/activate
(venv) [simleo@neuron:tmp]$ pip install --upgrade pip
Collecting pip
Using cached pip-22.3-py3-none-any.whl (2.1 MB)
Installing collected packages: pip
Attempting uninstall: pip
Found existing installation: pip 20.0.2
Uninstalling pip-20.0.2:
Successfully uninstalled pip-20.0.2
Successfully installed pip-22.3
(venv) [simleo@neuron:tmp]$ pip install wheel
Collecting wheel
Using cached wheel-0.37.1-py2.py3-none-any.whl (35 kB)
Installing collected packages: wheel
Successfully installed wheel-0.37.1
(venv) [simleo@neuron:tmp]$ pip install cwl-utils
Collecting cwl-utils
Using cached cwl_utils-0.20-py3-none-any.whl (282 kB)
Collecting packaging
Using cached packaging-21.3-py3-none-any.whl (40 kB)
Collecting requests
Using cached requests-2.28.1-py3-none-any.whl (62 kB)
Collecting rdflib
Using cached rdflib-6.2.0-py3-none-any.whl (500 kB)
Collecting CacheControl
Using cached CacheControl-0.12.11-py2.py3-none-any.whl (21 kB)
Collecting cwl-upgrader>=1.2.3
Using cached cwl_upgrader-1.2.4-py3-none-any.whl (24 kB)
Collecting schema-salad<9,>=8.3.20220825114525
Using cached schema_salad-8.3.20221016151607-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl (1.1 MB)
Requirement already satisfied: setuptools in ./venv/lib/python3.8/site-packages (from cwl-upgrader>=1.2.3->cwl-utils) (44.0.0)
Collecting ruamel.yaml<0.17.22,>=0.15.71
Using cached ruamel.yaml-0.17.21-py3-none-any.whl (109 kB)
Collecting mistune<0.9,>=0.8.1
Using cached mistune-0.8.4-py2.py3-none-any.whl (16 kB)
Collecting pyparsing
Using cached pyparsing-3.0.9-py3-none-any.whl (98 kB)
Collecting isodate
Using cached isodate-0.6.1-py2.py3-none-any.whl (41 kB)
Collecting certifi>=2017.4.17
Using cached certifi-2022.9.24-py3-none-any.whl (161 kB)
Collecting urllib3<1.27,>=1.21.1
Using cached urllib3-1.26.12-py2.py3-none-any.whl (140 kB)
Collecting charset-normalizer<3,>=2
Using cached charset_normalizer-2.1.1-py3-none-any.whl (39 kB)
Collecting idna<4,>=2.5
Using cached idna-3.4-py3-none-any.whl (61 kB)
Collecting msgpack>=0.5.2
Using cached msgpack-1.0.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (322 kB)
Collecting lockfile>=0.9
Using cached lockfile-0.12.2-py2.py3-none-any.whl (13 kB)
Collecting ruamel.yaml.clib>=0.2.6
Using cached ruamel.yaml.clib-0.2.6-cp38-cp38-manylinux1_x86_64.whl (570 kB)
Collecting six
Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: msgpack, mistune, lockfile, urllib3, six, ruamel.yaml.clib, pyparsing, idna, charset-normalizer, certifi, ruamel.yaml, requests, packaging, isodate, rdflib, CacheControl, schema-salad, cwl-upgrader, cwl-utils
Successfully installed CacheControl-0.12.11 certifi-2022.9.24 charset-normalizer-2.1.1 cwl-upgrader-1.2.4 cwl-utils-0.20 idna-3.4 isodate-0.6.1 lockfile-0.12.2 mistune-0.8.4 msgpack-1.0.4 packaging-21.3 pyparsing-3.0.9 rdflib-6.2.0 requests-2.28.1 ruamel.yaml-0.17.21 ruamel.yaml.clib-0.2.6 schema-salad-8.3.20221016151607 six-1.16.0 urllib3-1.26.12
(venv) [simleo@neuron:tmp]$ python
Python 3.8.10 (default, Jun 22 2022, 20:18:18)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from cwl_utils.parser import load_document_by_uri
>>> wf_def = load_document_by_uri("predictions.cwl")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/tmp/venv/lib/python3.8/site-packages/cwl_utils/parser/__init__.py", line 90, in load_document_by_uri
return load_document_by_string(doc, baseuri, loadingOptions, id_)
File "/tmp/venv/lib/python3.8/site-packages/cwl_utils/parser/__init__.py", line 116, in load_document_by_string
return load_document_by_yaml(result, uri, loadingOptions, id_)
File "/tmp/venv/lib/python3.8/site-packages/cwl_utils/parser/__init__.py", line 135, in load_document_by_yaml
result = cwl_v1_1.load_document_by_yaml(
File "/tmp/venv/lib/python3.8/site-packages/cwl_utils/parser/cwl_v1_1.py", line 15685, in load_document_by_yaml
result, metadata = _document_load(
File "/tmp/venv/lib/python3.8/site-packages/cwl_utils/parser/cwl_v1_1.py", line 722, in _document_load
loader.load(doc, baseuri, loadingOptions, docRoot=baseuri),
File "/tmp/venv/lib/python3.8/site-packages/cwl_utils/parser/cwl_v1_1.py", line 534, in load
raise ValidationException("", None, errors, "-")
schema_salad.exceptions.ValidationException: - tried CommandLineTool but
Not a CommandLineTool
- tried ExpressionTool but
Not a ExpressionTool
- tried Workflow but
Trying 'Workflow'
the `steps` field is not valid because:
tried array<WorkflowStep> but
- tried array<WorkflowStep> but
Expected a list, was <class 'ruamel.yaml.comments.CommentedMap'>
- tried WorkflowStep but
Trying 'WorkflowStep'
predictions.cwl:35:5: the `run` field is not valid because:
- tried <class 'str'> but
Expected a type but got CommentedMap
- tried CommandLineTool but
Trying 'CommandLineTool'
predictions.cwl:46:7: the `inputs` field is not valid because:
- tried array<CommandInputParameter> but
Expected a list, was <class
'ruamel.yaml.comments.CommentedMap'>
- tried CommandInputParameter but
Trying 'CommandInputParameter'
predictions.cwl:51:11: the `secondaryFiles` field is not valid because:
Missing pattern in secondaryFiles specification
entry: ordereddict()
predictions.cwl:35:5: - tried ExpressionTool but
Not a ExpressionTool
- tried Workflow but
Not a Workflow
- tried array<CommandLineTool | ExpressionTool | Workflow> but
Expected a list, was <class 'dict'>
It also crashes with the packed version generated by cwltool.
Michael R. Crusoe commented
Thanks @simleo
We don't use YAML anchors & aliases in CWL. If you'd like to re-use a CommandLineTool in multiple locations, either use $graph or use separate files.
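To illustrate the $graph suggestion, here is a minimal sketch in Python that builds a CWL document as plain data and serializes it to JSON, which CWL accepts alongside YAML. The tool is defined once with an id and each step refers to it as "#predict". The tool body, the step names, and the input names are all hypothetical placeholders, not taken from the DeepHealth workflow, and this sketch makes no claim about whether such a layout avoids the crash reported above.

```python
import json

# Hypothetical tool, defined once and given an id so steps can refer to it.
predict_tool = {
    "class": "CommandLineTool",
    "id": "predict",
    "baseCommand": "echo",  # placeholder command, not the real prediction tool
    "inputs": {"src": {"type": "File", "inputBinding": {"position": 1}}},
    "outputs": {"out": {"type": "stdout"}},
}


def step(source):
    """Build a workflow step that runs the shared tool by reference."""
    return {"run": "#predict", "in": {"src": source}, "out": ["out"]}


# Hypothetical two-step workflow that reuses the same tool.
workflow = {
    "class": "Workflow",
    "id": "main",
    "inputs": {"slide": "File"},
    "outputs": {
        "tissue": {"type": "File", "outputSource": "extract_tissue/out"},
        "tumor": {"type": "File", "outputSource": "classify_tumor/out"},
    },
    "steps": {
        "extract_tissue": step("slide"),
        "classify_tumor": step("extract_tissue/out"),
    },
}

document = {"cwlVersion": "v1.1", "$graph": [predict_tool, workflow]}

# The reference is an ordinary string, so no YAML anchors or aliases are involved.
text = json.dumps(document, indent=2)
print(text)
```

The separate-files alternative works the same way, except that `run` would hold a relative path to the tool's own file instead of a "#predict" fragment.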
Michael R. Crusoe commented
@simleo I can fix this with common-workflow-language/schema_salad#611 (and then apply that fix in cwl-utils); we shouldn't be modifying the source as we process it anyhow.
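The remark about modifying the source suggests one plausible mechanism, sketched below under stated assumptions: a YAML alias typically loads as the same Python object at every place it is used, so a loader that consumes keys in place would find an emptied mapping on the second visit. That would be consistent with the empty `entry: ordereddict()` in the traceback, but the thread does not confirm it. `load_secondary_file` is a hypothetical stand-in, not schema-salad's or cwl-utils' actual code.

```python
import copy


def load_secondary_file(entry):
    """Hypothetical loader that consumes the 'pattern' key from its input in place."""
    if "pattern" not in entry:
        raise ValueError("Missing pattern in secondaryFiles specification")
    return {"pattern": entry.pop("pattern")}


# Simulate an aliased node: both steps hold the very same dict object.
shared = {"pattern": ".idx"}
doc = {"step_a": shared, "step_b": shared}

first = load_secondary_file(doc["step_a"])  # succeeds, but empties the shared dict
try:
    load_secondary_file(doc["step_b"])
    second_failed = False
except ValueError:
    second_failed = True  # the second use sees an empty mapping

# Processing a copy leaves the parsed source untouched, so reuse is safe.
safe_doc = {"step_a": {"pattern": ".idx"}}
safe_doc["step_b"] = safe_doc["step_a"]
results = [
    load_secondary_file(copy.deepcopy(safe_doc[key]))
    for key in ("step_a", "step_b")
]
```

Copying before processing is only one way to avoid the side effect; a loader that reads keys without removing them would have the same result.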