common-workflow-language / cwl-utils

Python utilities for CWL

Home Page: https://cwl-utils.readthedocs.io/

docker_extract.py is not handling relative directories very well.

golharam opened this issue · comments

Hi - I'm curious how up to date and stable this package is. Parsing a CWL 1.2 workflow seems to work in some ways but not others. I can parse a workflow when it's stored unpacked, but not when it's packed. I can use load_document_by_uri but can't use cwl_utils.parser.cwl_v1_2.load_document.

Depending on how a document is stored/loaded, I'm running into different issues traversing the workflow steps. I'm trying to use the docker_extract.py script as an example. I'm wondering if there is a right way of using this package and/or a wrong way.

I think I know what's happening. The relative directories are not being handled correctly. My directory structure is as follows:

- base project (working) folder (/)
  - CWL files (CWL/): the workflow
    - tool files (CWL/tools/myworkflow.steps/): my individual tools/steps
  - Dockerfiles (Dockerfiles/)

I'm calling:

python scripts/docker_extract.py Dockerfiles/ CWL/myworkflow.cwl

When cwl.load_document is initially called, args.input is "CWL/myworkflow.cwl".
This gets loaded without error.

As the script traverses the workflow, it tries to load the individual step CWL files. At this point it's not correctly constructing the relative paths to the actual CWL files.

In get_process_from_step(), cwl.load_document is called with the value of step.run.
step.run points to the step/tool CWL file RELATIVE to the workflow CWL file, NOT to the working directory the script is invoked from, e.g. tools/myworkflow.steps/a_tool.cwl. This means that when load_document is called, it looks for the CWL file in "./tools/myworkflow.steps/a_tool.cwl" INSTEAD of "./CWL/tools/myworkflow.steps/a_tool.cwl".
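
To illustrate the path handling I would expect, here is a minimal sketch using only the standard library. The helper name resolve_step_run is hypothetical and is not part of cwl-utils; it assumes step.run is a plain filesystem path (not a URI or a packed-workflow fragment) and resolves it against the directory of the workflow file rather than the current working directory.

```python
from pathlib import Path


def resolve_step_run(workflow_path: str, run: str) -> Path:
    """Hypothetical helper (not a cwl-utils function).

    Resolve a step's `run` reference relative to the directory that
    contains the workflow file, instead of the current working directory.
    Assumes `run` is a plain filesystem path.
    """
    run_path = Path(run)
    if run_path.is_absolute():
        # Absolute references need no adjustment.
        return run_path
    # Join the relative reference onto the workflow file's parent directory.
    return Path(workflow_path).parent / run_path


resolved = resolve_step_run(
    "CWL/myworkflow.cwl", "tools/myworkflow.steps/a_tool.cwl"
)
print(resolved.as_posix())  # CWL/tools/myworkflow.steps/a_tool.cwl
```

With my layout, the resolved path is the one that actually exists on disk, so passing it to the loader instead of the raw step.run value should avoid the lookup in the wrong directory.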